Re: RAID1 storage server won't boot with one disk missing

2015-09-22 Thread Austin S Hemmelgarn

On 2015-09-22 14:35, Chris Murphy wrote:

On Tue, Sep 22, 2015 at 7:21 AM, Austin S Hemmelgarn
 wrote:


It's not a bad idea, except that it changes established usage, and there are
probably some people out there who depend on the current behavior. If we do
go that way, mount needs to spit out a big obnoxious warning (as in, not
through dmesg or mount options, but directly on stderr) if the filesystem
gets mounted degraded automatically.


Definitely there needs to be a user-space message. Even now, when
trying to mount a volume with a missing device without -o degraded, the
generic "fs type unrecognized" message is misleading.
I agree that it's confusing for many users, but from the point of view 
of someone who's been doing system administration for a few years, the 
'or incorrect mount options' bit does kind of make sense given the context.






Re: RAID1 storage server won't boot with one disk missing

2015-09-22 Thread Austin S Hemmelgarn

On 2015-09-21 16:35, Erkki Seppala wrote:

Gareth Pye  writes:


People tend to be looking at BTRFS for a guarantee that data doesn't
die when hardware does. Defaults that defeat that shouldn't be used.


However, data is no more in danger at startup than it is at the moment
when btrfs notices a drive dropping, yet it permits IO to proceed. Is
there not a contradiction?

Personally I don't see why system startup should be a special case, in 
particular as it can be a very stressful situation to recover from, when 
RAID is there precisely to avoid that kind of immediate reaction when 
hardware breaks, and when in practice you can do the recovery while the 
system is running, on systems where short service interruptions matter.

The difference is that we have code to detect a device not being present 
at mount; we don't have code (yet) to detect it dropping off a mounted 
filesystem.  Why proper detection of a device disappearing does not 
appear to be a priority, I have no idea, but that is a separate issue 
from mount behavior.






Re: RAID1 storage server won't boot with one disk missing

2015-09-22 Thread Qu Wenruo



On 2015-09-22 19:32, Austin S Hemmelgarn wrote:

On 2015-09-21 16:35, Erkki Seppala wrote:

Gareth Pye  writes:


People tend to be looking at BTRFS for a guarantee that data doesn't
die when hardware does. Defaults that defeat that shouldn't be used.


However, data is no more in danger at startup than it is at the moment
when btrfs notices a drive dropping, yet it permits IO to proceed. Is
there not a contradiction?

Personally I don't see why system startup should be a special case, in
particular as it can be a very stressful situation to recover from, when
RAID is there precisely to avoid that kind of immediate reaction when
hardware breaks, and when in practice you can do the recovery while the
system is running, on systems where short service interruptions matter.


The difference is that we have code to detect a device not being present
at mount; we don't have code (yet) to detect it dropping off a mounted
filesystem.  Why proper detection of a device disappearing does not
appear to be a priority, I have no idea, but that is a separate issue
from mount behavior.



Sorry for jumping in so suddenly, but I submitted a patchset which is 
somewhat related to this degraded case.


[PATCH 0/5] Btrfs: Per-chunk degradable check

With that patchset, btrfs can do a fairly good degradability check at 
mount, at remount, and even at runtime on a mounted filesystem.
(The return value of btrfs_check_degradable() indicates whether all 
devices are OK, whether some are missing but the filesystem is still 
degradable, or whether it is not writable even when degraded.)


And with that patchset, it's quite easy to add support for btrfs to 
silently switch to the degraded mount option when possible.

(Along with other improvements, of course)


As for the original feedback from the end user: personally, I understand 
users who don't want to add the degraded mount option manually when a 
device fails, nor want to add a permanent mount option to fstab just in 
case of a failure.


So personally, I'd like to add a new mount option, "nodegraded", to allow 
the user to forbid a degraded mount.
And if neither "degraded" nor "nodegraded" is given, let btrfs 
automatically switch to degraded if possible.
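
For illustration, the admin-facing behaviour under this proposal could look
roughly like the following (a sketch only: "nodegraded" does not exist yet,
and the device name is a placeholder):

mount /dev/sda2 /mnt                 # would silently degrade if a device is
                                     # missing but every chunk still has a copy
mount -o nodegraded /dev/sda2 /mnt   # proposed option: fail instead of degrading
mount -o degraded /dev/sda2 /mnt     # today's explicit opt-in, unchanged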


And it should also still be able to inform the user, either through the 
easily ignored dmesg or, more noticeably, through the changed mount options.


How about this method to solve the problem?

Thanks,
Qu


Re: RAID1 storage server won't boot with one disk missing

2015-09-22 Thread Austin S Hemmelgarn

On 2015-09-22 08:51, Qu Wenruo wrote:



On 2015-09-22 19:32, Austin S Hemmelgarn wrote:

On 2015-09-21 16:35, Erkki Seppala wrote:

Gareth Pye  writes:


People tend to be looking at BTRFS for a guarantee that data doesn't
die when hardware does. Defaults that defeat that shouldn't be used.


However, data is no more in danger at startup than it is at the moment
when btrfs notices a drive dropping, yet it permits IO to proceed. Is
there not a contradiction?

Personally I don't see why system startup should be a special case, in
particular as it can be a very stressful situation to recover from, when
RAID is there precisely to avoid that kind of immediate reaction when
hardware breaks, and when in practice you can do the recovery while the
system is running, on systems where short service interruptions matter.


The difference is that we have code to detect a device not being present
at mount; we don't have code (yet) to detect it dropping off a mounted
filesystem.  Why proper detection of a device disappearing does not
appear to be a priority, I have no idea, but that is a separate issue
from mount behavior.



Sorry for jumping in so suddenly, but I submitted a patchset which is
somewhat related to this degraded case.

[PATCH 0/5] Btrfs: Per-chunk degradable check

With that patchset, btrfs can do a fairly good degradability check at
mount, at remount, and even at runtime on a mounted filesystem.
(The return value of btrfs_check_degradable() indicates whether all
devices are OK, whether some are missing but the filesystem is still
degradable, or whether it is not writable even when degraded.)

And with that patchset, it's quite easy to add support for btrfs to
silently switch to the degraded mount option when possible.
(Along with other improvements, of course)


As for the original feedback from the end user: personally, I understand
users who don't want to add the degraded mount option manually when a
device fails, nor want to add a permanent mount option to fstab just in
case of a failure.

So personally, I'd like to add a new mount option, "nodegraded", to allow
the user to forbid a degraded mount.
And if neither "degraded" nor "nodegraded" is given, let btrfs
automatically switch to degraded if possible.

And it should also still be able to inform the user, either through the
easily ignored dmesg or, more noticeably, through the changed mount options.

How about this method to solve the problem?
It's not a bad idea, except that it changes established usage, and there 
are probably some people out there who depend on the current behavior. 
If we do go that way, mount needs to spit out a big obnoxious warning 
(as in, not through dmesg or mount options, but directly on stderr) if 
the filesystem gets mounted degraded automatically.  A better option 
might be to add a compat feature bit, and if that bit is set, then use 
the above logic, otherwise use the current logic.
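
As a rough illustration of the kind of loud warning meant here, a wrapper
around mount could do something like the following (a sketch only; it assumes
the automatic degraded state would show up in the visible mount options, which
is itself part of the proposal):

mount /dev/sda2 /mnt || exit 1
if findmnt -no OPTIONS /mnt | grep -qw degraded; then
    echo "WARNING: /mnt was mounted DEGRADED; a device is missing" >&2
fi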







Re: RAID1 storage server won't boot with one disk missing

2015-09-21 Thread Duncan
Erkki Seppala posted on Mon, 21 Sep 2015 23:35:39 +0300 as excerpted:

> Gareth Pye  writes:
> 
>> People tend to be looking at BTRFS for a guarantee that data doesn't
>> die when hardware does. Defaults that defeat that shouldn't be used.
> 
> However, data is no more in danger at startup than it is at the moment
> when btrfs notices a drive dropping, yet it permits IO to proceed. Is
> there not a contradiction?

The problem at runtime is that btrfs _doesn't_ really notice a device 
dropping.  It simply continues writing to the existing devices, and 
buffering the data for the now missing device.  The block device 
management parts of the kernel know it's missing (the device node will 
disappear from devtmpfs, etc), but the btrfs part carries on, oblivious.

At mount, however, btrfs notices (since it must as it's trying to 
assemble the filesystem at that point), and refuses to mount without the 
degraded option if there's too many devices missing.

I'd argue that noticing the problem and requiring admin intervention to 
avoid risk to the data is a feature, not a misfeature, and that the 
runtime behavior is therefore the lacking feature, ultimately a bug which 
should be fixed.  You seem to be arguing the opposite: that carrying on 
oblivious is the feature, and that requiring admin intervention when 
there's a risk to data is a misfeature, ultimately a bug that should be 
fixed.

=:^\

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: RAID1 storage server won't boot with one disk missing

2015-09-21 Thread Erkki Seppala
Gareth Pye  writes:

> People tend to be looking at BTRFS for a guarantee that data doesn't
> die when hardware does. Defaults that defeat that shouldn't be used.

However, data is no more in danger at startup than it is at the moment
when btrfs notices a drive dropping, yet it permits IO to proceed. Is
there not a contradiction?

Personally I don't see why system startup should be a special case, in
particular as it can be a very stressful situation to recover from, when
RAID is there precisely to avoid that kind of immediate reaction when
hardware breaks, and when in practice you can do the recovery while the
system is running, on systems where short service interruptions matter.

-- 
http://www.modeemi.fi/~flux/  @modeemi.fi



Re: RAID1 storage server won't boot with one disk missing

2015-09-21 Thread Erkki Seppala
Goffredo Baroncelli  writes:

> Hi Anand,
>
>
> On 2015-09-17 17:18, Anand Jain wrote:
>>  it looks like -o degraded is going to be a very obvious feature,
>>  I have plans of making it a default feature, and provide -o
>>  nodegraded feature instead. Thanks for comments if any.
>> 
>> Thanks, Anand
>
> I am not sure if there is a "good" default for this kind of problem; there
> are several aspects:
>
> - remote machine:
> for a remote machine, I think that the root filesystem should be mounted
> anyway. For a secondary filesystem (/home?), maybe user intervention would
> be better (but without /home, how could a user log in?).

However, if the basis for requiring user intervention is that going
forward automatically with the situation as-is would result in risk to
the data, how can the default of going forward during runtime, when one
of the disks drops out, be rationalized?

Most certainly the risk is already there when you no longer have redundancy
(a mirror or parity device) for RAID1/RAID5, so wouldn't the prudent action
be to remount the filesystem read-only immediately, instead of going on,
which is what btrfs now does, just waiting for another device to die?

Of course, I think few people would agree with that, as it would stop
the service (the parts requiring write access), when in fact the whole
point of RAID is to keep serving the clients when a device dies. So why
is the startup a special case?

I suppose the thinking is that the current default forces the administrator
to consider setting up a monitoring system before adding 'degraded' to
the root mount options, but in the outlined scenario there could
easily be data loss when the second device dies, and the user/admin
would be none the wiser until it's too late, even with the current
defaults.

-- 
http://www.modeemi.fi/~flux/  @modeemi.fi



Re: RAID1 storage server won't boot with one disk missing

2015-09-18 Thread Austin S Hemmelgarn

On 2015-09-17 16:18, Chris Murphy wrote:

On Thu, Sep 17, 2015 at 1:02 PM, Roman Mamedov  wrote:

On Thu, 17 Sep 2015 19:00:08 +0200
Goffredo Baroncelli  wrote:


On 2015-09-17 17:18, Anand Jain wrote:

  it looks like -o degraded is going to be a very obvious feature,
  I have plans of making it a default feature, and provide -o
  nodegraded feature instead. Thanks for comments if any.


I am not sure if there is a "good" default for this kind of problem


Yes there is. It is whatever people came to expect from using other RAID
systems and/or generally expect from RAID as a concept.

Both mdadm software RAID, and I believe virtually any hardware RAID controller
out there will let you successfully boot up and give read-write(!) access
to a RAID in a non-critical failure state, because that's kind of the whole
point of a RAID, to eliminate downtime. If the removed disk is later re-added,
then it is automatically resynced. Mdadm can also make use of its 'write
intent bitmap' to resync only those areas of the array which were in any way
touched during the absence of the newly re-added disk.

If you're concerned that the user "misses" the fact that they have a disk
down, then solve *that*, make some sort of a notify daemon, e.g. mdadm has a
built-in "monitor" mode which sends E-Mail on critical events with any of the
arrays.


Given the current state: no proposal and no work done yet, I think
it's premature to change the default.

It's an open question what a modern monitoring and notification
mechanism should look like. At the moment it'd be a unique Btrfs thing
because the mdadm and LVM methods aren't abstracted enough to reuse. I
wonder if the storaged and/or openlmi folks have some input on what
this would look like. Feedback from KDE and GNOME also, who rely on at
least mdadm in order to present user space notifications. I think
udisks2 is on the way out and storaged is on the way in, there's just
too much stuff that udisks2 doesn't do and is getting confused about,
including LVM thinly provisioned volumes, not just Btrfs stuff.


The problem with that is that storaged (from what I understand) is 
systemd dependent, and there are too many people out there who don't 
want systemd.  udisks2 will almost certainly live on (just like 
consolekit has).  And if it's something systemd integrated, I can 
already tell you it will look like the OS X solution.  Now, what I think 
it should look like is a different story; I'd say that:

1. It should give the option to either:
a. Refuse to boot degraded.
b. Ask the operator if he wants to boot degraded.
c. Just automatically boot degraded, and probably send a notification about it.
2. Provide some service (sadly, probably dbus-based) to schedule 
scrub/balance/re-sync operations and get info about ENOSPC/sync 
failure/parity mismatch/device failure/SMART status failure (a rough 
polling sketch follows after this list).
3. Provide a consistent interface to such operations on hardware RAID 
controllers that support them.
4. Provide the ability to notify via arbitrary means on any of the above 
mentioned issues.

5. Have the ability to turn anything not needed off on a given system.
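
The polling sketch mentioned in item 2 could start out as something as simple
as this (the mount point /srv, the local mail(1) command, and the five-minute
interval are assumptions, not anything from this thread):

#!/bin/sh
# Poll the btrfs per-device error counters and mail root if any are nonzero.
while sleep 300; do
    errs=$(btrfs device stats /srv | awk '$NF != 0')
    if [ -n "$errs" ]; then
        printf 'btrfs errors on /srv:\n%s\n' "$errs" | mail -s 'btrfs alert' root
    fi
done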





Re: RAID1 storage server won't boot with one disk missing

2015-09-17 Thread Anand Jain


Thanks for the report.

 There is a bug where a raid1 with one disk missing will fail when you 
try to mount it a 2nd time.  I am not too sure whether the boot process 
does a mount and then a remount/second mount; if yes, then it is 
potentially hitting the problem addressed in the patch below.


  Btrfs: allow -o rw,degraded for single group profile

 you may want to give this patch a try.

 more below..

On 09/17/2015 07:56 AM, erp...@gmail.com wrote:

Good afternoon,

Earlier today, I tried to set up a storage server using btrfs but ran
into some problems. The goal was to use two disks (4.0TB each) in a
raid1 configuration.

What I did:
1. Attached a single disk to a regular PC configured to boot with UEFI.
2. Booted from a thumb drive that had been made from an Ubuntu 14.04
Server x64 installation DVD.
3. Ran the installation procedure. When it came time to partition the
disk, I chose the guided partitioning option. The partitioning scheme
it suggested was:

* A 500MB EFI System Partition.
* An ext4 root partition of nearly 4 TB in size.
* A 4GB swap partition.

4. Changed the type of the middle partition from ext4 to btrfs, but
left everything else the same.
5. Finalized the partitioning scheme, allowing changes to be written to disk.
6. Continued the installation procedure until it finished. I was able
to boot into a working server from the single disk.
7. Attached the second disk.
8. Used parted to create a GPT label on the second disk and a btrfs
partition that was the same size as the btrfs partition on the first
disk.

# parted /dev/sdb
(parted) mklabel gpt
(parted) mkpart primary btrfs #s ##s
(parted) quit

9. Ran "btrfs device add /dev/sdb1 /" to add the second device to the
filesystem.
10. Ran "btrfs balance start -dconvert=raid1 -mconvert=raid1 /" and
waited for it to finish. It reported that it finished successfully.
11. Rebooted the system. At this point, everything appeared to be working.
12. Shut down the system, temporarily disconnected the second disk
(/dev/sdb) from the motherboard, and powered it back up.

What I expected to happen:
I expected that the system would either start as if nothing were
wrong, or would warn me that one half of the mirror was missing and
ask if I really wanted to start the system with the root array in a
degraded state.


 as of now it would/should start normally only when there is an entry 
-o degraded


 it looks like -o degraded is going to be a very obvious feature,
 I have plans of making it a default feature, and provide -o
 nodegraded feature instead. Thanks for comments if any.

Thanks, Anand



What actually happened:
During the boot process, a kernel message appeared indicating that the
"system array" could not be found for the root filesystem (as
identified by a UUID). It then dumped me to an initramfs prompt.
Powering down the system, reattaching the second disk, and powering it
on allowed me to boot successfully. Running "btrfs fi df /" showed
that all System data was stored as RAID1.

If I want to have a storage server where one of two drives can fail at
any time without causing much down time, am I on the right track? If
so, what should I try next to get the behavior I'm looking for?

Thanks,
Eric


Re: RAID1 storage server won't boot with one disk missing

2015-09-17 Thread Chris Murphy
On Wed, Sep 16, 2015 at 5:56 PM, erp...@gmail.com  wrote:

> What I expected to happen:
> I expected that the system would either start as if nothing were
> wrong, or would warn me that one half of the mirror was missing and
> ask if I really wanted to start the system with the root array in a
> degraded state.

It's not this sophisticated yet. Btrfs does not "assemble" degraded by
default like mdadm and LVM based RAID. You need to manually mount it
with -o degraded and then continue the boot process, or use the boot
parameter rootflags=degraded. Yet there is still some interaction
between btrfs dev scan and udev (?) that I don't understand precisely,
but what happens is that when any device is missing, the Btrfs volume UUID
doesn't appear, and therefore the filesystem still can't be mounted degraded
if the volume UUID is used (e.g. via the boot parameter
root=UUID=), so that needs to be changed to a
/dev/sdXY type of notation, and you have to hope that you guess it correctly.
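
Concretely, the recovery currently looks something like this (a sketch; the
device name is an example, and whether exiting the initramfs shell resumes
the boot depends on the distro's initramfs):

# from the initramfs emergency shell:
mount -o degraded /dev/sda2 /root
exit            # often resumes the normal boot once root is mounted

# or as a one-off from the bootloader, on the kernel command line:
#   root=/dev/sda2 rootflags=degraded ro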



>
> What actually happened:
> During the boot process, a kernel message appeared indicating that the
> "system array" could not be found for the root filesystem (as
> identified by a UUID). It then dumped me to an initramfs prompt.
> Powering down the system, reattaching the second disk, and powering it
> on allowed me to boot successfully. Running "btrfs fi df /" showed
> that all System data was stored as RAID1.

Just an FYI: be really careful about degraded rw mounts. There is no
automatic resync to catch up the previously missing device with the
device that was degraded,rw mounted. You have to scrub or balance;
there's no optimization yet for Btrfs to effectively just "diff"
between the devices' generations and get them all in sync quickly.

Much worse is if you don't scrub or balance, and then redo the test
reversing which device is made missing. Now you have multiple devices
that were rw,degraded mounted, and putting them back together again
will corrupt the whole file system irreparably. Fixing the first
problem would (almost always) avoid the second problem.
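
For reference, catching the returned device up today means something along
these lines, run against the mounted filesystem (sketch):

btrfs scrub start -Bd /      # -B: stay in the foreground, -d: per-device stats
# or the heavier-handed route:
btrfs balance start /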

> If I want to have a storage server where one of two drives can fail at
> any time without causing much down time, am I on the right track? If
> so, what should I try next to get the behavior I'm looking for?

It's totally not there yet if you want to obviate manual checks and
intervention for failure cases. Both mdadm and LVM integrated RAID
have monitoring and notification, which Btrfs lacks entirely. So that
means you have to check it yourself or create scripts to check it. What often
tends to happen is that Btrfs just keeps retrying rather than ignoring a
bad device, so you'll see piles of retries in dmesg. But Btrfs
doesn't kick out the bad device like the md driver would. This could
go on for hours, or days. So if you aren't checking for it, you could
unwittingly have a degraded array already.
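
Until such monitoring exists, the manual checks boil down to something like
the following (a sketch; adjust the mount point to taste):

btrfs fi show                            # missing members are reported here
btrfs device stats / | awk '$NF != 0'    # any nonzero error counter is bad news
dmesg | grep -i btrfs | tail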



-- 
Chris Murphy


Re: RAID1 storage server won't boot with one disk missing

2015-09-17 Thread Chris Murphy
On Thu, Sep 17, 2015 at 9:18 AM, Anand Jain  wrote:
>
>  as of now it would/should start normally only when there is an entry -o
> degraded
>
>  it looks like -o degraded is going to be a very obvious feature,
>  I have plans of making it a default feature, and provide -o
>  nodegraded feature instead. Thanks for comments if any.


If degraded mounts happen by default, what happens when dev 1 goes
missing temporarily and dev 2 is mounted degraded,rw and then dev 1
reappears? Is there an automatic way to a.) catch up dev 1 with dev 2?
and then b.) automatically make the array no longer degraded?

I think it's a problem to have automatic degraded mounts when there's
no monitoring or notification system for problems. We can get silent
degraded mounts by default with no notification at all that there's a
problem with a Btrfs volume.

So offhand, my comment is that I think other work is needed before
degraded mounting becomes the default behavior.



-- 
Chris Murphy


Re: RAID1 storage server won't boot with one disk missing

2015-09-17 Thread Goffredo Baroncelli
Hi Anand,


On 2015-09-17 17:18, Anand Jain wrote:
>  it looks like -o degraded is going to be a very obvious feature,
>  I have plans of making it a default feature, and provide -o
>  nodegraded feature instead. Thanks for comments if any.
> 
> Thanks, Anand

I am not sure if there is a "good" default for this kind of problem; there are 
several aspects:

- remote machine:
for a remote machine, I think that the root filesystem should be mounted 
anyway. For a secondary filesystem (/home?), maybe user intervention would 
be better (but without /home, how could a user log in?).

- spare:
in case of a degraded filesystem, the system could insert a spare disk; or a 
reshaping could be started (raid5->raid1, raid6->raid5)

- initramfs:
this is the most complicated part: currently most initramfs implementations 
don't mount the filesystem unless all the devices are available. Allowing a 
degraded root filesystem means:
a) wait for the disks until a timeout
b) if the timeout expires, mount in degraded mode (inserting a spare 
disk if available?)
c) otherwise mount the filesystem as usual

- degraded:
I think that there are different levels of degradation. For example, in the 
case of raid6 a missing device could be acceptable; however, in the case of 
raid5 this should not be allowed, and user intervention may be preferred.


In the past I suggested the use of a helper, mount.btrfs [1], which could 
handle all these cases better without kernel intervention (a rough sketch 
follows after this list):
- wait for the devices to appear
- verify whether all the needed devices are present
- mount the filesystem, passing
 - all the devices to the kernel (without relying on udev and btrfs dev 
scan...)
 - whether to allow degraded mode or not (policy)
 - whether to start insertion of a spare (policy)
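
Such a helper could start life as little more than a shell wrapper; a very
rough sketch of the idea (not the actual code from [1]; the timeout and the
degraded fallback policy are placeholders):

#!/bin/sh
# usage: mount.btrfs <member-device> <mountpoint>
dev=$1; mnt=$2; timeout=30

# wait for the other member devices to appear
i=0
while [ "$i" -lt "$timeout" ] && ! btrfs device ready "$dev"; do
    sleep 1; i=$((i + 1))
done

if btrfs device ready "$dev"; then
    mount -t btrfs "$dev" "$mnt"
else
    echo "mount.btrfs: not all devices present, mounting degraded" >&2
    mount -t btrfs -o degraded "$dev" "$mnt"    # policy decision
fi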

G.Baroncelli

[1] http://www.spinics.net/lists/linux-btrfs/msg39706.html


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli 
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


Re: RAID1 storage server won't boot with one disk missing

2015-09-17 Thread Roman Mamedov
On Thu, 17 Sep 2015 19:00:08 +0200
Goffredo Baroncelli  wrote:

> On 2015-09-17 17:18, Anand Jain wrote:
> >  it looks like -o degraded is going to be a very obvious feature,
> >  I have plans of making it a default feature, and provide -o
> >  nodegraded feature instead. Thanks for comments if any.
> > 
> I am not sure if there is a "good" default for this kind of problem

Yes there is. It is whatever people came to expect from using other RAID
systems and/or generally expect from RAID as a concept.

Both mdadm software RAID, and I believe virtually any hardware RAID controller
out there will let you successfully boot up and give read-write(!) access
to a RAID in a non-critical failure state, because that's kind of the whole
point of a RAID, to eliminate downtime. If the removed disk is later re-added,
then it is automatically resynced. Mdadm can also make use of its 'write
intent bitmap' to resync only those areas of the array which were in any way
touched during the absence of the newly re-added disk.

If you're concerned that the user "misses" the fact that they have a disk
down, then solve *that*, make some sort of a notify daemon, e.g. mdadm has a
built-in "monitor" mode which sends E-Mail on critical events with any of the
arrays.
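
For comparison, the mdadm facilities mentioned above amount to roughly this
(the mail address and md device are examples):

mdadm --monitor --scan --daemonise --mail=root   # e-mail on failure/degraded events
mdadm --grow --bitmap=internal /dev/md0          # add a write-intent bitmap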

-- 
With respect,
Roman




Re: RAID1 storage server won't boot with one disk missing

2015-09-17 Thread Chris Murphy
On Thu, Sep 17, 2015 at 1:02 PM, Roman Mamedov  wrote:
> On Thu, 17 Sep 2015 19:00:08 +0200
> Goffredo Baroncelli  wrote:
>
>> On 2015-09-17 17:18, Anand Jain wrote:
>> >  it looks like -o degraded is going to be a very obvious feature,
>> >  I have plans of making it a default feature, and provide -o
>> >  nodegraded feature instead. Thanks for comments if any.
>> >
>> I am not sure if there is a "good" default for this kind of problem
>
> Yes there is. It is whatever people came to expect from using other RAID
> systems and/or generally expect from RAID as a concept.
>
> Both mdadm software RAID, and I believe virtually any hardware RAID controller
> out there will let you successfully boot up and give read-write(!) access
> to a RAID in a non-critical failure state, because that's kind of the whole
> point of a RAID, to eliminate downtime. If the removed disk is later re-added,
> then it is automatically resynced. Mdadm can also make use of its 'write
> intent bitmap' to resync only those areas of the array which were in any way
> touched during the absence of the newly re-added disk.
>
> If you're concerned that the user "misses" the fact that they have a disk
> down, then solve *that*, make some sort of a notify daemon, e.g. mdadm has a
> built-in "monitor" mode which sends E-Mail on critical events with any of the
> arrays.

Given the current state: no proposal and no work done yet, I think
it's premature to change the default.

It's an open question what a modern monitoring and notification
mechanism should look like. At the moment it'd be a unique Btrfs thing
because the mdadm and LVM methods aren't abstracted enough to reuse. I
wonder if the storaged and/or openlmi folks have some input on what
this would look like. Feedback from KDE and GNOME also, who rely on at
least mdadm in order to present user space notifications. I think
udisks2 is on the way out and storaged is on the way in, there's just
too much stuff that udisks2 doesn't do and is getting confused about,
including LVM thinly provisioned volumes, not just Btrfs stuff.


-- 
Chris Murphy


Re: RAID1 storage server won't boot with one disk missing

2015-09-17 Thread Duncan
Anand Jain posted on Thu, 17 Sep 2015 23:18:36 +0800 as excerpted:

>> What I expected to happen:
>> I expected that the [btrfs raid1 data/metadata] system would either
>> start as if nothing were wrong, or would warn me that one half of the
>> mirror was missing and ask if I really wanted to start the system with
>> the root array in a degraded state.
> 
> as of now it would/should start normally only when there is an entry
> -o degraded
> 
> it looks like -o degraded is going to be a very obvious feature,
> I have plans of making it a default feature, and provide -o nodegraded
> feature instead. Thanks for comments if any.

Like Chris Murphy, I have my doubts about this, and think it's likely to 
cause as many unhappy users as it prevents.

I'd definitely put -o nodegraded in my default options here, so it's not 
about me, but about all those others that would end up running a silently 
degraded system and have no idea until it's too late, as further devices 
have failed or the one single other available copy of something important 
(remember, still raid1 without N-mirrors option, unfortunately, so if a 
device drops out, that's now data/metadata with only a single valid copy 
regardless of the number of devices, and if it goes invalid...) fails 
checksum for whatever reason.

And since it only /allows/ degraded, not forcing it, if admins or distros 
want it as the default, -o degraded can be added now.  Nothing's stopping 
them except lack of knowledge of the option, the *same* lack of knowledge 
that would potentially cause so much harm if the default were switched.
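
For example, an admin who wants that behaviour today can opt in per
filesystem in /etc/fstab (a sketch; the UUID and mount point are
placeholders, and degraded only permits a degraded mount, it never forces
one):

UUID=<fs-uuid>  /data  btrfs  defaults,degraded  0  0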

Put it this way.  With the current default, if it fails and people have 
to ask about the unexpected failure here, no harm to existing data done, 
just add -o degraded and get on with things.  If -o degraded were made 
the default, failure mode would be *MUCH* worse, potential loss of the 
entire filesystem due to silent and thus uncorrected device loss and 
degraded mounting.

So despite the inconvenience of less knowledgeable people losing the 
availability of the filesystem until they can read the wiki or ask about 
it here, I don't believe changing the default to -o degraded is wise, at 
all.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: RAID1 storage server won't boot with one disk missing

2015-09-17 Thread Gareth Pye
I think you have stated that in a very polite and friendly way. I'm
pretty sure I'd phrase it less politely :)

We should follow mdadm's example of providing an easy option to allow
degraded mounting, but that shouldn't be the default. Anyone with the
expertise to set that option can be expected to implement a way of knowing
that the mount is degraded.

People tend to be looking at BTRFS for a guarantee that data doesn't
die when hardware does. Defaults that defeat that shouldn't be used.

On Fri, Sep 18, 2015 at 11:36 AM, Duncan <1i5t5.dun...@cox.net> wrote:
> Anand Jain posted on Thu, 17 Sep 2015 23:18:36 +0800 as excerpted:
>
>>> What I expected to happen:
>>> I expected that the [btrfs raid1 data/metadata] system would either
>>> start as if nothing were wrong, or would warn me that one half of the
>>> mirror was missing and ask if I really wanted to start the system with
>>> the root array in a degraded state.
>>
>> as of now it would/should start normally only when there is an entry
>> -o degraded
>>
>> it looks like -o degraded is going to be a very obvious feature,
>> I have plans of making it a default feature, and provide -o nodegraded
>> feature instead. Thanks for comments if any.
>
> Like Chris Murphy, I have my doubts about this, and think it's likely to
> cause as many unhappy users as it prevents.
>
> I'd definitely put -o nodegraded in my default options here, so it's not
> about me, but about all those others that would end up running a silently
> degraded system and have no idea until it's too late, as further devices
> have failed or the one single other available copy of something important
> (remember, still raid1 without N-mirrors option, unfortunately, so if a
> device drops out, that's now data/metadata with only a single valid copy
> regardless of the number of devices, and if it goes invalid...) fails
> checksum for whatever reason.
>
> And since it only /allows/ degraded, not forcing it, if admins or distros
> want it as the default, -o degraded can be added now.  Nothing's stopping
> them except lack of knowledge of the option, the *same* lack of knowledge
> that would potentially cause so much harm if the default were switched.
>
> Put it this way.  With the current default, if it fails and people have
> to ask about the unexpected failure here, no harm to existing data done,
> just add -o degraded and get on with things.  If -o degraded were made
> the default, failure mode would be *MUCH* worse, potential loss of the
> entire filesystem due to silent and thus uncorrected device loss and
> degraded mounting.
>
> So despite the inconvenience of less knowledgeable people losing the
> availability of the filesystem until they can read the wiki or ask about
> it here, I don't believe changing the default to -o degraded is wise, at
> all.
>
> --
> Duncan - List replies preferred.   No HTML msgs.
> "Every nonfree program has a lord, a master --
> and if you use the program, he is your master."  Richard Stallman
>



-- 
Gareth Pye - blog.cerberos.id.au
Level 2 MTG Judge, Melbourne, Australia
"Dear God, I would like to file a bug report"


RAID1 storage server won't boot with one disk missing

2015-09-16 Thread erp...@gmail.com
Good afternoon,

Earlier today, I tried to set up a storage server using btrfs but ran
into some problems. The goal was to use two disks (4.0TB each) in a
raid1 configuration.

What I did:
1. Attached a single disk to a regular PC configured to boot with UEFI.
2. Booted from a thumb drive that had been made from an Ubuntu 14.04
Server x64 installation DVD.
3. Ran the installation procedure. When it came time to partition the
disk, I chose the guided partitioning option. The partitioning scheme
it suggested was:

* A 500MB EFI System Partition.
* An ext4 root partition of nearly 4 TB in size.
* A 4GB swap partition.

4. Changed the type of the middle partition from ext4 to btrfs, but
left everything else the same.
5. Finalized the partitioning scheme, allowing changes to be written to disk.
6. Continued the installation procedure until it finished. I was able
to boot into a working server from the single disk.
7. Attached the second disk.
8. Used parted to create a GPT label on the second disk and a btrfs
partition that was the same size as the btrfs partition on the first
disk.

# parted /dev/sdb
(parted) mklabel gpt
(parted) mkpart primary btrfs #s ##s
(parted) quit

9. Ran "btrfs device add /dev/sdb1 /" to add the second device to the
filesystem.
10. Ran "btrfs balance start -dconvert=raid1 -mconvert=raid1 /" and
waited for it to finish. It reported that it finished successfully.
11. Rebooted the system. At this point, everything appeared to be working.
12. Shut down the system, temporarily disconnected the second disk
(/dev/sdb) from the motherboard, and powered it back up.
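
Incidentally, the conversion in step 10 can be double-checked before pulling
a disk with something like:

btrfs filesystem show        # both devices should be listed
btrfs filesystem df /        # Data, Metadata and System should all report RAID1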

What I expected to happen:
I expected that the system would either start as if nothing were
wrong, or would warn me that one half of the mirror was missing and
ask if I really wanted to start the system with the root array in a
degraded state.

What actually happened:
During the boot process, a kernel message appeared indicating that the
"system array" could not be found for the root filesystem (as
identified by a UUID). It then dumped me to an initramfs prompt.
Powering down the system, reattaching the second disk, and powering it
on allowed me to boot successfully. Running "btrfs fi df /" showed
that all System data was stored as RAID1.

If I want to have a storage server where one of two drives can fail at
any time without causing much down time, am I on the right track? If
so, what should I try next to get the behavior I'm looking for?

Thanks,
Eric