If you clicked on the link to this topic: Thank you!
I have the following setup:
6x 500 GB HDDs
1x 32 GB NVMe SSD (Intel Optane)
I used bcache to set up the SSD as the caching device and the six HDDs
as backing devices. After all that was in place, I formatted the
six HDDs with
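(The exact mkfs invocation is cut off above. Purely as an illustrative sketch, not
necessarily what was actually run: formatting six bcache backing devices as a single
btrfs filesystem, assuming they show up as /dev/bcache0 through /dev/bcache5, could
look like
# mkfs.btrfs -d raid5 -m raid1 /dev/bcache0 /dev/bcache1 /dev/bcache2 \
    /dev/bcache3 /dev/bcache4 /dev/bcache5
where -d sets the data profile and -m the metadata profile.)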
Hi All,
the aim of this patch set is to provide support for a BTRFS raid5/6
filesystem in GRUB.
The first patch implements the basic support for raid5/6, i.e. this works when
all the disks are present.
The next 5 patches are preparatory ones.
The 7th patch implements the raid5 recovery
>> commit 8c80a1b7c913faf50f95c5c76b4666ed17685666
>> Author: Goffredo Baroncelli <kreij...@inwind.it>
>> Date: Tue Apr 17 21:40:31 2018 +0200
>>
>> Add initial support for btrfs raid5/6 chunk
>>
>> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
>> index be195448d..4c5632acb
> commit 8c80a1b7c913faf50f95c5c76b4666ed17685666
> Author: Goffredo Baroncelli <kreij...@inwind.it>
> Date: Tue Apr 17 21:40:31 2018 +0200
>
> Add initial support for btrfs raid5/6 chunk
>
> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
> index be195448d..4c5632acb 1
Comments are welcome.
BR
G.Baroncelli
---
commit 8c80a1b7c913faf50f95c5c76b4666ed17685666
Author: Goffredo Baroncelli <kreij...@inwind.it>
Date: Tue Apr 17 21:40:31 2018 +0200
Add initial support for btrfs raid5/6 chunk
diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index be195448d..4c5632acb
g. To be fair
having a pair of SSDs (md raid1) caching three spindles (btrfs raid5)
may not be an ideal configuration. If I had three SSDs, one for each
drive, then it may have performed better? I also have ~980 snapshots
spread over a year's time, so I don't know how much that impacts
things. I did
ng seemed to provide the performance I was hoping. To be fair
having a pair of SSDs (md raid1) caching three spindles (btrfs raid5)
may not be an ideal configuration. If I had three SSDs, one for each
drive, then it may have performed better? I also have ~980 snapshots
spread over a year's time, so I
For some reason, it seemed that the btrfs RAID5 setup required one of
the drives, but I thought I had data with RAID5 and metadata with 2
copies. Was I missing something else that prevented mounting with that
specific drive? I don't want to get into a situation where one drive
dies and I can't get to any data
putting the root and roots for my LXC containers on the SSDs (btrfs
> RAID1) and the bulk stuff on the three spindle drives (btrfs RAID1).
> For some reason, it seemed that the btrfs RAID5 setup required one of
> the drives, but I thought I had data with RAID5 and metadata with 2
>
that the btrfs RAID5 setup required one of
the drives, but I thought I had data with RAID5 and metadata with 2
copies. Was I missing something else that prevented mounting with that
specific drive? I don't want to get into a situation where one drive
dies and I can't get to any data.
Thank you
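(A hedged illustration of how to check which profiles are actually in use and how a
degraded mount is normally attempted; the mount point /mnt and the device /dev/sdb
are hypothetical:
# btrfs filesystem df /mnt
shows the Data/Metadata/System profiles, e.g. Data: RAID5, Metadata: RAID1, and with
one device missing the filesystem normally has to be mounted explicitly with
# mount -o degraded /dev/sdb /mnt
since a plain mount will refuse.)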
On Mon, Aug 21, 2017 at 10:31 AM, Robert LeBlanc wrote:
> Qu,
>
> Sorry, I'm not on the list (I was for a few years about three years ago).
>
> I looked at the backup roots like you mentioned.
>
> # ./btrfs inspect dump-super -f /dev/bcache0
> superblock: bytenr=65536,
Qu,
Sorry, I'm not on the list (I was for a few years about three years ago).
I looked at the backup roots like you mentioned.
# ./btrfs inspect dump-super -f /dev/bcache0
superblock: bytenr=65536, device=/dev/bcache0
csum_type
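(For reference, a sketch of inspecting the superblock and trying the backup roots,
assuming a reasonably recent kernel and btrfs-progs; the device name follows the one
above:
# btrfs inspect-internal dump-super -f /dev/bcache0
# mount -o ro,usebackuproot /dev/bcache0 /mnt
dump-super -f prints the full superblock including the backup root pointers, and
usebackuproot asks the kernel to fall back to one of them at mount time.)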
I lost enough Btrfs m=d=s=RAID5 filesystems in past experiments (I
didn't try using RAID5 for metadata and system chunks in the last few
years) to faulty SATA cables and hotplug-enabled SATA controllers (where
a disk could disappear and reappear "as the wind blew"). Since then, I
made a habit of
and didn't think twice
about what could go wrong; hey, I set it up in RAID5 so it will be
fine. Well, it wasn't...
Well, Btrfs RAID5 is not that safe.
I would recommend using RAID1 at least for metadata.
(And in your case, your metadata is damaged, so I really recommend
using a better profile for your
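(A minimal sketch of that recommendation, assuming the filesystem is mountable
read-write at /mnt with all devices present:
# btrfs balance start -mconvert=raid1 /mnt
converts only the metadata chunks to RAID1; it needs enough unallocated space on at
least two devices to rewrite them.)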
I've been running btrfs in a raid5 for about a year now with bcache in
front of it. Yesterday, one of my drives was acting really slow, so I
was going to move it to a different port. I guess I got too
comfortable hot-plugging drives at work and didn't think twice
about what could go wrong; hey
It seems like I accidentally managed to break my Btrfs/RAID5
filesystem, yet again, in a similar fashion.
This time around, I ran into some random libata driver issue (?)
instead of a faulty hardware part, but the end result is quite similar.
I issued the command (replacing X with valid letters
On Wed, Jul 6, 2016 at 1:15 PM, Austin S. Hemmelgarn
wrote:
> On 2016-07-06 14:45, Chris Murphy wrote:
>> I think it's statistically 0 people changing this from default. It's
>> people with drives that have no SCT ERC support, used in raid1+, who
>> happen to stumble upon
On 2016-07-06 14:45, Chris Murphy wrote:
On Wed, Jul 6, 2016 at 11:18 AM, Austin S. Hemmelgarn
wrote:
On 2016-07-06 12:43, Chris Murphy wrote:
So does it make sense to just set the default to 180? Or is there a
smarter way to do this? I don't know.
Just thinking
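(To make the discussion concrete, a commonly cited sketch, with /dev/sdX as a
placeholder: on drives that support SCT ERC, cap the drive's own error recovery,
e.g.
# smartctl -l scterc,70,70 /dev/sdX
(70 is in tenths of a second, i.e. 7 seconds); on drives that don't, raise the
kernel's SCSI command timer instead, e.g.
# echo 180 > /sys/block/sdX/device/timeout
so the drive gets a chance to give up on a bad sector before the kernel resets the
link.)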
On Wed, Jul 6, 2016 at 11:18 AM, Austin S. Hemmelgarn
wrote:
> On 2016-07-06 12:43, Chris Murphy wrote:
>> So does it make sense to just set the default to 180? Or is there a
>> smarter way to do this? I don't know.
>
> Just thinking about this:
> 1. People who are setting
On 2016-07-06 12:43, Chris Murphy wrote:
On Wed, Jul 6, 2016 at 5:51 AM, Austin S. Hemmelgarn
wrote:
On 2016-07-05 19:05, Chris Murphy wrote:
Related:
http://www.spinics.net/lists/raid/msg52880.html
Looks like there is some traction to figuring out what to do about
On Wed, Jul 6, 2016 at 5:51 AM, Austin S. Hemmelgarn
wrote:
> On 2016-07-05 19:05, Chris Murphy wrote:
>>
>> Related:
>> http://www.spinics.net/lists/raid/msg52880.html
>>
>> Looks like there is some traction to figuring out what to do about
>> this, whether it's a udev rule
On 2016-07-05 19:05, Chris Murphy wrote:
Related:
http://www.spinics.net/lists/raid/msg52880.html
Looks like there is some traction to figuring out what to do about
this, whether it's a udev rule or something that happens in the kernel
itself. Pretty much the only hardware setup unaffected by
Related:
http://www.spinics.net/lists/raid/msg52880.html
Looks like there is some traction to figuring out what to do about
this, whether it's a udev rule or something that happens in the kernel
itself. Pretty much the only hardware setup unaffected by this are
those with enterprise or NAS
On 29/06/16 04:01, Chris Murphy wrote:
> Just wiping the slate clean to summarize:
>
>
> 1. We have a consistent ~1 in 3 maybe 1 in 2, reproducible corruption
> of *data extent* parity during a scrub with raid5. Goffredo and I have
> both reproduced it. It's a big bug. It might still be useful
Just wiping the slate clean to summarize:
1. We have a consistent ~1 in 3 maybe 1 in 2, reproducible corruption
of *data extent* parity during a scrub with raid5. Goffredo and I have
both reproduced it. It's a big bug. It might still be useful if
someone else can reproduce it too.
Goffredo, can
On 28/06/16 22:25, Austin S. Hemmelgarn wrote:
> On 2016-06-28 08:14, Steven Haigh wrote:
>> On 28/06/16 22:05, Austin S. Hemmelgarn wrote:
>>> On 2016-06-27 17:57, Zygo Blaxell wrote:
On Mon, Jun 27, 2016 at 10:17:04AM -0600, Chris Murphy wrote:
> On Mon, Jun 27, 2016 at 5:21 AM, Austin
On 2016-06-28 08:14, Steven Haigh wrote:
On 28/06/16 22:05, Austin S. Hemmelgarn wrote:
On 2016-06-27 17:57, Zygo Blaxell wrote:
On Mon, Jun 27, 2016 at 10:17:04AM -0600, Chris Murphy wrote:
On Mon, Jun 27, 2016 at 5:21 AM, Austin S. Hemmelgarn
wrote:
On 2016-06-25
On 28/06/16 22:05, Austin S. Hemmelgarn wrote:
> On 2016-06-27 17:57, Zygo Blaxell wrote:
>> On Mon, Jun 27, 2016 at 10:17:04AM -0600, Chris Murphy wrote:
>>> On Mon, Jun 27, 2016 at 5:21 AM, Austin S. Hemmelgarn
>>> wrote:
On 2016-06-25 12:44, Chris Murphy wrote:
>
On 2016-06-27 17:57, Zygo Blaxell wrote:
On Mon, Jun 27, 2016 at 10:17:04AM -0600, Chris Murphy wrote:
On Mon, Jun 27, 2016 at 5:21 AM, Austin S. Hemmelgarn
wrote:
On 2016-06-25 12:44, Chris Murphy wrote:
On Fri, Jun 24, 2016 at 12:19 PM, Austin S. Hemmelgarn
On 2016-06-27 23:17, Zygo Blaxell wrote:
On Mon, Jun 27, 2016 at 08:39:21PM -0600, Chris Murphy wrote:
On Mon, Jun 27, 2016 at 7:52 PM, Zygo Blaxell
wrote:
On Mon, Jun 27, 2016 at 04:30:23PM -0600, Chris Murphy wrote:
Btrfs does have something of a work around
On Mon, Jun 27, 2016 at 08:39:21PM -0600, Chris Murphy wrote:
> On Mon, Jun 27, 2016 at 7:52 PM, Zygo Blaxell
> wrote:
> > On Mon, Jun 27, 2016 at 04:30:23PM -0600, Chris Murphy wrote:
> >> Btrfs does have something of a work around for when things get slow,
> >>
It's a crude form of "resilvering" as ZFS calls it.
In what manner is it crude?
> If btrfs sees EIO from a lower block layer it will try to reconstruct the
> missing data (but not repair it). If that happens during a scrub,
> it will also attempt to rewrite the missing data over the
If that happens during a scrub,
it will also attempt to rewrite the missing data over the original
offending sectors. This happens every few months in my server pool,
and seems to be working even on btrfs raid5.
Last time I checked all the RAID implementations on Linux (ok, so that's
pretty much ju
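(For context, a scrub of a mounted filesystem looks roughly like this, with /mnt
hypothetical:
# btrfs scrub start /mnt
# btrfs scrub status /mnt
where the status output includes counters for checksum errors found and for errors
that could be corrected from the redundant copy or parity.)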
On Mon, Jun 27, 2016 at 3:57 PM, Zygo Blaxell
wrote:
> On Mon, Jun 27, 2016 at 10:17:04AM -0600, Chris Murphy wrote:
>
>> It just came up again in a thread over the weekend on linux-raid@. I'm
>> going to ask while people are paying attention if a patch to change
On Mon, Jun 27, 2016 at 10:17:04AM -0600, Chris Murphy wrote:
> On Mon, Jun 27, 2016 at 5:21 AM, Austin S. Hemmelgarn
> wrote:
> > On 2016-06-25 12:44, Chris Murphy wrote:
> >> On Fri, Jun 24, 2016 at 12:19 PM, Austin S. Hemmelgarn
> >> wrote:
> >>
>
>> that to check parity, we can safely speed up the common case of near zero
>> errors during a scrub by a pretty significant factor.
>
> OK I'm in favor of that. Although somehow md gets away with this by
> computing and checking parity for its scrubs, and still manages to
> keep
For what it's worth, I found btrfs-map-logical can produce a mapping for
raid5 (I didn't test raid6) by specifying the extent block length. If
that's omitted, it only shows the device+mapping for the first stripe.
This example is a 3 disk raid5, with a 128KiB file all in a single extent.
[root@f24s ~]#
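(The actual invocation is cut off above; a sketch of the kind of command meant,
with the logical address and device as placeholders, would be
# btrfs-map-logical -l <logical> -b 131072 /dev/sdX
where -b 131072 asks it to map the full 128KiB extent rather than just the first
stripe element.)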
On Mon, Jun 27, 2016 at 5:21 AM, Austin S. Hemmelgarn
wrote:
> On 2016-06-25 12:44, Chris Murphy wrote:
>>
>> On Fri, Jun 24, 2016 at 12:19 PM, Austin S. Hemmelgarn
>> wrote:
>>
>>> Well, the obvious major advantage that comes to mind for me to
>>>
On 2016-06-25 12:44, Chris Murphy wrote:
On Fri, Jun 24, 2016 at 12:19 PM, Austin S. Hemmelgarn
wrote:
Well, the obvious major advantage that comes to mind for me of checksumming
parity is that it would let us scrub the parity data itself and verify it.
OK but hold on.
On Sun, Jun 26, 2016 at 1:54 AM, Andrei Borzenkov wrote:
> 26.06.2016 00:52, Chris Murphy wrote:
>> Interestingly enough, so far I'm finding with full stripe writes, i.e.
>> 3x raid5, exactly 128KiB data writes, devid 3 is always parity. This
>> is raid4.
>
> That's not what
Andrei Borzenkov posted on Sun, 26 Jun 2016 10:54:16 +0300 as excerpted:
> P.S. usage of "stripe" to mean "stripe element" actually adds to
> confusion when reading code :)
... and posts (including patches, which I guess are code as well, just
not applied yet). I've been noticing that in the
26.06.2016 00:52, Chris Murphy wrote:
> Interestingly enough, so far I'm finding with full stripe writes, i.e.
> 3x raid5, exactly 128KiB data writes, devid 3 is always parity. This
> is raid4.
That's not what code suggests and what I see in practice - parity seems
to be distributed across all
Interestingly enough, so far I'm finding with full stripe writes, i.e.
3x raid5, exactly 128KiB data writes, devid 3 is always parity. This
is raid4. So...I wonder if some of these slow cases end up with a
bunch of stripes that are effectively raid4-like, and have a lot of
parity overwrites, which
On Fri, Jun 24, 2016 at 12:19 PM, Austin S. Hemmelgarn
wrote:
> Well, the obvious major advantage that comes to mind for me of checksumming
> parity is that it would let us scrub the parity data itself and verify it.
OK but hold on. During scrub, it should read data,
On 2016-06-24 13:52, Chris Murphy wrote:
On Fri, Jun 24, 2016 at 11:21 AM, Andrei Borzenkov wrote:
24.06.2016 20:06, Chris Murphy wrote:
On Fri, Jun 24, 2016 at 3:52 AM, Andrei Borzenkov wrote:
On Fri, Jun 24, 2016 at 11:50 AM, Hugo Mills
ms (except for the minor wart that 'scrub status -d' counts the errors
randomly against every device, while 'dev stats' counts all the errors
on the disk that was corrupted).
Disk-side data corruption is a thing I have to deal with a few times each
year, so I tested the btrfs raid5 implementation for that
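(The two views being compared, as a sketch with /mnt hypothetical:
# btrfs scrub status -d /mnt
# btrfs device stats /mnt
The first prints per-device scrub counters, which is where the misattribution
described above shows up; the second prints the per-device error counters the
kernel keeps, which the poster found to match the actually corrupted disk.)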
On Fri, Jun 24, 2016 at 11:21 AM, Andrei Borzenkov wrote:
> 24.06.2016 20:06, Chris Murphy wrote:
>> On Fri, Jun 24, 2016 at 3:52 AM, Andrei Borzenkov
>> wrote:
>>> On Fri, Jun 24, 2016 at 11:50 AM, Hugo Mills wrote:
>>> eta)data
On Fri, Jun 24, 2016 at 4:16 AM, Hugo Mills wrote:
> On Fri, Jun 24, 2016 at 12:52:21PM +0300, Andrei Borzenkov wrote:
>> Yes, that is what I wrote below. But that means that RAID5 with one
>> degraded disk won't be able to reconstruct data on this degraded disk
>> because
On Fri, Jun 24, 2016 at 4:16 AM, Andrei Borzenkov wrote:
> On Fri, Jun 24, 2016 at 8:20 AM, Chris Murphy wrote:
>
>> [root@f24s ~]# filefrag -v /mnt/5/*
>> Filesystem type is: 9123683e
>> File size of /mnt/5/a.txt is 16383 (4 blocks of 4096 bytes)
>>
24.06.2016 20:06, Chris Murphy wrote:
> On Fri, Jun 24, 2016 at 3:52 AM, Andrei Borzenkov wrote:
>> On Fri, Jun 24, 2016 at 11:50 AM, Hugo Mills wrote:
>> eta)data and RAID56 parity is not data.
>>>
>>>Checksums are not parity, correct. However, every
On Fri, Jun 24, 2016 at 3:52 AM, Andrei Borzenkov wrote:
> On Fri, Jun 24, 2016 at 11:50 AM, Hugo Mills wrote:
>eta)data and RAID56 parity is not data.
>>
>>Checksums are not parity, correct. However, every data block
>> (including, I think, the
On Fri, Jun 24, 2016 at 2:50 AM, Hugo Mills wrote:
>Checksums are not parity, correct. However, every data block
> (including, I think, the parity) is checksummed and put into the csum
> tree.
I don't see how parity is checksummed. It definitely is not in the
csum tree.
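(One way to check is to dump the checksum tree directly; a sketch with a placeholder
device, using the tooling of that era:
# btrfs-debug-tree -t 7 /dev/sdX
where tree 7 is the csum tree; newer btrfs-progs expose the same thing through
btrfs inspect-internal dump-tree.)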
On Fri, Jun 24, 2016 at 10:52:53AM -0600, Chris Murphy wrote:
> On Fri, Jun 24, 2016 at 2:50 AM, Hugo Mills wrote:
>
> >Checksums are not parity, correct. However, every data block
> > (including, I think, the parity) is checksummed and put into the csum
> > tree.
>
> I
On Fri, Jun 24, 2016 at 07:02:34AM +0300, Andrei Borzenkov wrote:
> >> I don't read code well enough, but I'd be surprised if Btrfs
> >> reconstructs from parity and doesn't then check the resulting
> >> reconstructed data to its EXTENT_CSUM.
> >
> > I wouldn't be surprised if both things happen
On Thu, Jun 23, 2016 at 11:20:40PM -0600, Chris Murphy wrote:
> [root@f24s ~]# filefrag -v /mnt/5/*
> Filesystem type is: 9123683e
> File size of /mnt/5/a.txt is 16383 (4 blocks of 4096 bytes)
> ext: logical_offset:physical_offset: length: expected: flags:
>0:0..
On 2016-06-24 06:59, Hugo Mills wrote:
On Fri, Jun 24, 2016 at 01:19:30PM +0300, Andrei Borzenkov wrote:
On Fri, Jun 24, 2016 at 1:16 PM, Hugo Mills wrote:
On Fri, Jun 24, 2016 at 12:52:21PM +0300, Andrei Borzenkov wrote:
On Fri, Jun 24, 2016 at 11:50 AM, Hugo Mills
On 2016-06-24 01:20, Chris Murphy wrote:
On Thu, Jun 23, 2016 at 8:07 PM, Zygo Blaxell
wrote:
With simple files changing one character with vi and gedit,
I get completely different logical and physical numbers with each
change, so it's clearly cowing the entire
On Fri, Jun 24, 2016 at 01:19:30PM +0300, Andrei Borzenkov wrote:
> On Fri, Jun 24, 2016 at 1:16 PM, Hugo Mills wrote:
> > On Fri, Jun 24, 2016 at 12:52:21PM +0300, Andrei Borzenkov wrote:
> >> On Fri, Jun 24, 2016 at 11:50 AM, Hugo Mills wrote:
> >> > On
On Fri, Jun 24, 2016 at 1:16 PM, Hugo Mills wrote:
> On Fri, Jun 24, 2016 at 12:52:21PM +0300, Andrei Borzenkov wrote:
>> On Fri, Jun 24, 2016 at 11:50 AM, Hugo Mills wrote:
>> > On Fri, Jun 24, 2016 at 07:02:34AM +0300, Andrei Borzenkov wrote:
>> >>
On Fri, Jun 24, 2016 at 12:52:21PM +0300, Andrei Borzenkov wrote:
> On Fri, Jun 24, 2016 at 11:50 AM, Hugo Mills wrote:
> > On Fri, Jun 24, 2016 at 07:02:34AM +0300, Andrei Borzenkov wrote:
> >> 24.06.2016 04:47, Zygo Blaxell wrote:
> >> > On Thu, Jun 23, 2016 at 06:26:22PM
On Fri, Jun 24, 2016 at 8:20 AM, Chris Murphy wrote:
> [root@f24s ~]# filefrag -v /mnt/5/*
> Filesystem type is: 9123683e
> File size of /mnt/5/a.txt is 16383 (4 blocks of 4096 bytes)
> ext: logical_offset:physical_offset: length: expected: flags:
>0:
On Fri, Jun 24, 2016 at 11:50 AM, Hugo Mills wrote:
> On Fri, Jun 24, 2016 at 07:02:34AM +0300, Andrei Borzenkov wrote:
>> 24.06.2016 04:47, Zygo Blaxell wrote:
>> > On Thu, Jun 23, 2016 at 06:26:22PM -0600, Chris Murphy wrote:
>> >> On Thu, Jun 23, 2016 at 1:32 PM, Goffredo
On Fri, Jun 24, 2016 at 07:02:34AM +0300, Andrei Borzenkov wrote:
> 24.06.2016 04:47, Zygo Blaxell wrote:
> > On Thu, Jun 23, 2016 at 06:26:22PM -0600, Chris Murphy wrote:
> >> On Thu, Jun 23, 2016 at 1:32 PM, Goffredo Baroncelli
> >> wrote:
> >>> The raid5 write hole is
On Thu, Jun 23, 2016 at 8:07 PM, Zygo Blaxell
wrote:
>> With simple files changing one character with vi and gedit,
>> I get completely different logical and physical numbers with each
>> change, so it's clearly cowing the entire stripe (192KiB in my 3 dev
>>
24.06.2016 04:47, Zygo Blaxell wrote:
> On Thu, Jun 23, 2016 at 06:26:22PM -0600, Chris Murphy wrote:
>> On Thu, Jun 23, 2016 at 1:32 PM, Goffredo Baroncelli
>> wrote:
>>> The raid5 write hole is avoided in BTRFS (and in ZFS) thanks to the
>>> checksum.
>>
>> Yeah I'm kinda
On Thu, Jun 23, 2016 at 05:37:09PM -0600, Chris Murphy wrote:
> > I expect that parity is in this data block group, and therefore is
> > checksummed the same as any other data in that block group.
>
> This appears to be wrong. Comparing the same file, one file only, one
> two new Btrfs volumes,
On Thu, Jun 23, 2016 at 05:37:09PM -0600, Chris Murphy wrote:
> > So in your example of degraded writes, no matter what the on disk
> > format makes it discoverable there is a problem:
> >
> > A. The "updating" is still always COW so there is no overwriting.
>
> There is RMW code in
On Thu, Jun 23, 2016 at 06:26:22PM -0600, Chris Murphy wrote:
> On Thu, Jun 23, 2016 at 1:32 PM, Goffredo Baroncelli
> wrote:
> > The raid5 write hole is avoided in BTRFS (and in ZFS) thanks to the
> > checksum.
>
> Yeah I'm kinda confused on this point.
>
>
any point between step 1 and 4 with no data loss.
Before step 3 the data and parity blocks are not part of the extent
tree so their contents are irrelevant. After step 3 (assuming each
step is completed in order) data block 1 is part of the extent tree and
can be reconstructed if any one disk fails
On Thu, Jun 23, 2016 at 1:32 PM, Goffredo Baroncelli wrote:
>
> The raid5 write hole is avoided in BTRFS (and in ZFS) thanks to the checksum.
Yeah I'm kinda confused on this point.
https://btrfs.wiki.kernel.org/index.php/RAID56
It says there is a write hole for Btrfs. But
On Wed, Jun 22, 2016 at 11:14 AM, Chris Murphy wrote:
>
> However, from btrfs-debug-tree from a 3 device raid5 volume:
>
> item 5 key (FIRST_CHUNK_TREE CHUNK_ITEM 1103101952) itemoff 15621 itemsize 144
> chunk length 2147483648 owner 2 stripe_len 65536
> type DATA|RAID5
On 2016-06-22 22:35, Zygo Blaxell wrote:
>> I do not know the exact nature of the Btrfs raid56 write hole. Maybe a
>> > dev or someone who knows can explain it.
> If you have 3 raid5 devices, they might be laid out on disk like this
> (e.g. with a 16K stripe width):
>
> Address: 0..16K
tentionally corrupted that disk. Literal reading, you corrupted the
> entire disk, but that's impractical. The fs is expected to behave
> differently depending on what's been corrupted and how much.
The first round of testing I did (a year ago, when deciding whether
btrfs raid5 was m
> were mostly wrong. In such cases a scrub could repair the data.
I don't often use the -Bd options, so I haven't tested it thoroughly,
but what you're describing sounds like a bug in the user-space tools. I've
found it reflects the same information as btrfs dev stats, and dev
stats have been reliable
TL;DR:
Kernel 4.6.2 causes a world of pain. Use 4.5.7 instead.
'btrfs dev stat' doesn't seem to count "csum failed"
(i.e. corruption) errors in compressed extents.
On Sun, Jun 19, 2016 at 11:44:27PM -0400, Zygo Blaxell wrote:
> Not so long ago, I had a disk fail in a
On Mon, Jun 20, 2016 at 09:55:59PM -0400, Zygo Blaxell wrote:
> In this current case, I'm getting things like this:
>
> [12008.243867] BTRFS info (device vdc): csum failed ino 4420604 extent
> 26805825306624 csum 4105596028 wanted 787343232 mirror 0
[...]
> The other weird thing here is
matched the offsets producing EIO on read(), but
the statistics reported by scrub about which disk had been corrupted
were mostly wrong. In such cases a scrub could repair the data.
A different thing happens if there is a crash. In that case, scrub cannot
repair the errors. Every btrfs raid5 fi
On Mon, Jun 20, 2016 at 2:40 PM, Zygo Blaxell
wrote:
> On Mon, Jun 20, 2016 at 01:30:11PM -0600, Chris Murphy wrote:
>> For me the critical question is what does "some corrupted sectors" mean?
>
> On other raid5 arrays, I would observe a small amount of corruption
On Mon, Jun 20, 2016 at 01:30:11PM -0600, Chris Murphy wrote:
> On Mon, Jun 20, 2016 at 1:11 PM, Zygo Blaxell
> wrote:
> > On Mon, Jun 20, 2016 at 11:13:51PM +0500, Roman Mamedov wrote:
> >> On Sun, 19 Jun 2016 23:44:27 -0400
> Seems difficult at best due to this:
standpoint, [aside from not using Btrfs RAID5], you'd be
>> better off shutting down the system, booting a rescue OS, copying the content
>> of the failing disk to the replacement one using 'ddrescue', then removing
>> the
>> bad disk, and after boot up your main system woul
On Mon, Jun 20, 2016 at 11:13:51PM +0500, Roman Mamedov wrote:
> On Sun, 19 Jun 2016 23:44:27 -0400
> Zygo Blaxell <ce3g8...@umail.furryterror.org> wrote:
> From a practical standpoint, [aside from not using Btrfs RAID5], you'd be
> better off shutting down the system, booting a
k so far,
> and this is week four of this particular project.
From a practical standpoint, [aside from not using Btrfs RAID5], you'd be
better off shutting down the system, booting a rescue OS, copying the content
of the failing disk to the replacement one using 'ddrescue', then removing the
b
Not so long ago, I had a disk fail in a btrfs filesystem with raid1
metadata and raid5 data. I mounted the filesystem readonly, replaced
the failing disk, and attempted to recover by adding the new disk and
deleting the missing disk.
It's not going well so far. Pay attention, there are at least
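(For reference, the usual shape of that procedure, with hypothetical device names;
note the filesystem has to be mounted writable, degraded if a member is gone:
# mount -o degraded /dev/sda /mnt
# btrfs device add /dev/sdnew /mnt
# btrfs device delete missing /mnt
'delete missing' is what triggers rebuilding the data that lived on the absent
device onto the remaining ones.)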
>> > Do you think there is still a chance to recover those files?
>>
>> You can use btrfs restore to get files off a damaged fs.
>
> This however does work - thank you!
> Now since I'm a bit short on disc space, can I remove the disc that
> previously disappeared (and thus doesn't have all the
>
Henk Slager writes:
> You could use the one-time mount option clear_cache, then mount normally and
> the cache will be rebuilt automatically (but also corrected if you don't
> clear it)
This didn't help, gave me
[ 316.111596] BTRFS info (device sda): force clearing of disk cache
[
g (device sda): csum failed ino 171545 off
> 2269569024 csum 2566472073 expected csum 212160686
>
> when trying to read the file.
You could use the one-time mount option clear_cache, then mount normally and
the cache will be rebuilt automatically (but also corrected if you don't
clear it)
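(As a sketch, with hypothetical paths: the option is passed once at mount time,
# mount -o clear_cache /dev/sda /mnt
and later mounts go back to the normal options; the free-space cache is then
rebuilt in the background.)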
am I mistaken to believe that btrfs-raid5 would continue to
function when one disc fails?
If you need any more info I'm happy to provide that - here is some
information about the system:
Linux nashorn 4.4.0-2-generic #16-Ubuntu SMP Thu Jan 28 15:44:21 UTC 2016
x86_64 x86_64 x86_64 GNU/Linux
On 6 November 2015 at 10:03, Janos Toth F. wrote:
>
> Although I updated the firmware of the drives. (I found an IMPORTANT
> update when I went there to download SeaTools, although there was no
> change log to tell me why this was important). This might have changed the
> error
I created a fresh RAID-5 mode Btrfs on the same 3 disks (including the
faulty one which is still producing numerous random read errors) and
Btrfs now seems to work exactly as I would anticipate.
I copied some data and verified the checksum. The data is readable and
correct regardless of the
On 2015-11-04 23:06, Duncan wrote:
(Tho I should mention, while not on zfs, I've actually had my own
problems with ECC RAM too. In my case, the RAM was certified to run at
speeds faster than it was actually reliable at, such that actually stored
data, what the ECC protects, was fine, the data
Duncan wrote:
Austin S Hemmelgarn posted on Wed, 04 Nov 2015 13:45:37 -0500 as
excerpted:
On 2015-11-04 13:01, Janos Toth F. wrote:
But the worst part is that there are some ISO files which were
seemingly copied without errors but their external checksums (the one
which I can calculate with
Austin S Hemmelgarn posted on Wed, 04 Nov 2015 13:45:37 -0500 as
excerpted:
> On 2015-11-04 13:01, Janos Toth F. wrote:
>> But the worst part is that there are some ISO files which were
>> seemingly copied without errors but their external checksums (the one
>> which I can calculate with md5sum
Well. Now I am really confused about Btrfs RAID-5!
So, I replaced all SATA cables (which are explicitly marked as being
aimed at SATA3 speeds) and all the 3x2TB WD Red 2.0 drives with 3x4TB
Seagate Constellation ES.3 drives and started from scratch. I
secure-erased every drive, created an empty
On 2015-11-04 13:01, Janos Toth F. wrote:
But the worst part is that there are some ISO files which were
seemingly copied without errors but their external checksums (the one
which I can calculate with md5sum and compare to the one supplied by
the publisher of the ISO file) don't match!
Well...
I went through all the recovery options I could find (starting from
read-only to "extraordinarily dangerous"). Nothing seemed to work.
A Windows-based proprietary recovery tool (ReclaiMe) could scratch
the surface, but only that (it showed me the whole original folder
structure after a few
If it is for mostly archival storage, I would suggest you take a look
at snapraid.
On Wed, Oct 21, 2015 at 9:09 AM, Janos Toth F. wrote:
> I went through all the recovery options I could find (starting from
> read-only to "extraordinarily dangerous"). Nothing seemed to
Maybe hold off erasing the drives a little in case someone wants to
collect some extra data for diagnosing how/why the filesystem got into
this unrecoverable state.
A single device having issues should not cause the whole filesystem to
become unrecoverable.
On Wed, Oct 21, 2015 at 9:09 AM, Janos
I am afraid the filesystem right now is really damaged regardless of
its state upon the unexpected cable failure because I tried some
dangerous options after read-only restore/recovery methods all failed
(including zero-log, followed by init-csum-tree and even
chunk-recovery -> all of them just
https://btrfs.wiki.kernel.org/index.php/Restore
This should still be possible even with a degraded/unmounted raid5. It
is a bit tedious to figure out how to use it, but if you've got some
things you want off the volume, it's not so difficult as to prevent
trying it.
Chris Murphy
--
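(A sketch of that approach, with the target directory hypothetical: btrfs restore
copies files out of an unmountable filesystem onto other storage, e.g.
# btrfs restore -v /dev/sdb /mnt/some-other-disk/recovered
and it works on the raw device without mounting.)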
I tried several things, including the degraded mount option. One example:
# mount /dev/sdb /data -o ro,degraded,nodatasum,notreelog
mount: wrong fs type, bad option, bad superblock on /dev/sdb,
missing codepage or helper program, or other error
In some cases useful info is found in