Re: How to erase a RAID1 (+++)?

2018-08-31 Thread Alberto Bursi

On 8/31/2018 8:53 AM, Pierre Couderc wrote:
>
> OK, I have understood the message... I was planning to do that 
> "semi-routinely", as you said, but I understand btrfs is not ready for 
> that yet, and I am very far from being a specialist like you.
> So, I shall mount my RAID1 in a very standard way, and I shall expect 
> the disaster, hoping it does not occur.
> Now, I shall try to absorb all that...
>
> Thank you very much !
>

I just keep around a USB drive with a full Linux system on it, to act as 
"recovery". If the btrfs raid fails, I boot into that and can do 
maintenance with a full graphical interface and internet access, so I can 
google things.

Of course, on a home server you can't do that without some automation 
that switches the boot device after a certain number of failed boots of 
the main OS.

If your server has a BMC (lights-out management), you can instead switch 
the boot device through that.
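
As a rough sketch (assuming the BMC speaks IPMI and you have ipmitool on 
another machine; which bootdev keyword maps to your USB recovery drive is 
vendor-specific):

   # ask the BMC to boot from another device on the next reset only
   ipmitool -H <bmc-address> -U <user> -P <password> chassis bootdev cdrom
   ipmitool -H <bmc-address> -U <user> -P <password> chassis power reset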

-Alberto



Re: How to erase a RAID1 (+++)?

2018-08-30 Thread Alberto Bursi

On 8/30/2018 11:13 AM, Pierre Couderc wrote:
> Trying to install a RAID1 on Debian stretch, I made some mistake and 
> got this, after installing on disk 1 and trying to add the second disk:
>
>
> root@server:~# fdisk -l
> Disk /dev/sda: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
> Units: sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disklabel type: dos
> Disk identifier: 0x2a799300
>
> Device Boot Start    End    Sectors  Size Id Type
> /dev/sda1  * 2048 3907028991 3907026944  1.8T 83 Linux
>
>
> Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
> Units: sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disklabel type: dos
> Disk identifier: 0x9770f6fa
>
> Device Boot Start    End    Sectors  Size Id Type
> /dev/sdb1  * 2048 3907029167 3907027120  1.8T  5 Extended
>
>
> And :
>
> root@server:~# btrfs fi show
> Label: none  uuid: eed65d24-6501-4991-94bd-6c3baf2af1ed
>     Total devices 2 FS bytes used 1.10GiB
>     devid    1 size 1.82TiB used 4.02GiB path /dev/sda1
>     devid    2 size 1.00KiB used 0.00B path /dev/sdb1
>
> ...
>
> My purpose is a simple RAID1 main fs, with the bootable flag on the 2 
> disks in order to be able to start in degraded mode.
> How do I get out of that...?
>
> Thanks
> PC


sdb1 is an extended partition; you cannot format an extended partition.

Change sdb1 into a primary partition, or add a logical partition inside it.
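
As a rough sketch of how to get out of it (assuming the filesystem is 
mounted at /mnt and you want sdb1 recreated as a primary partition 
spanning the disk), something like this should get you to a clean 
two-device RAID1:

   # drop the bogus 1 KiB device (devid 2) from the filesystem
   btrfs device remove /dev/sdb1 /mnt
   # recreate sdb1 as a primary partition and mark it bootable
   parted /dev/sdb rm 1
   parted /dev/sdb mkpart primary 1MiB 100%
   parted /dev/sdb set 1 boot on
   # add it back and convert data and metadata to RAID1 across both disks
   btrfs device add /dev/sdb1 /mnt
   btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt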


-Alberto



Re: [PATCH RFC] btrfs: Do extra device generation check at mount time

2018-06-28 Thread Alberto Bursi


On 28/06/2018 09:04, Qu Wenruo wrote:
> Despite his incorrect expectation, btrfs indeed doesn't handle device
> generation mismatches well.
>
> This means that if one device went missing and re-appeared, even though
> its generation no longer matches the rest of the device pool, btrfs does
> nothing about it, but treats it as a normal good device.
>
> At least let's detect such a generation mismatch and avoid mounting the
> fs.
> Currently there is no automatic rebuild yet, which means that if users
> see the device generation mismatch error message, they can only mount
> the fs using the "device" and "degraded" mount options (if possible),
> then replace the offending device to manually "rebuild" the fs.
>

Yes. This is a long-standing issue, and handling it this way is similar 
to what mdadm (software RAID) does.

Please get this merged fast; don't get bogged down too much in 
integrating it with Anand Jain's branch, as this is a big issue and 
should get at least this basic mitigation asap.
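
For anyone who hits this, the manual recovery flow the patch message 
describes looks roughly like the following (the device names, the devid 
and the mount point are all assumptions):

   # mount degraded, leaving out the device with the stale generation
   mount -o degraded /dev/sda1 /mnt
   # replace the stale device (devid 2 here) with a new disk; -f overwrites
   # any old filesystem signature that may be on the target
   btrfs replace start -f 2 /dev/sdc1 /mnt
   btrfs replace status /mnt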

-Alberto


Re: RAID56 - 6 parity raid

2018-05-03 Thread Alberto Bursi


On 01/05/2018 23:57, Gandalf Corvotempesta wrote:
> Hi to all
> I've found some patches from Andrea Mazzoleni that add support for up to
> 6-parity raid.
> Why weren't these merged?
> With modern disk sizes, having something greater than 2 parities would be
> great.

His patch was about a generic library for multi-parity RAID; it wasn't 
directly for btrfs.

To actually use it for btrfs, someone would have to port btrfs over to 
that library.

-Alberto


Re: btrfs balance problems

2017-12-23 Thread Alberto Bursi


On 12/23/2017 12:19 PM, James Courtier-Dutton wrote:
> Hi,
>
> During a btrfs balance, the process hogs all CPU.
> Or, to be exact, any other program that wishes to use the SSD during a
> btrfs balance is blocked for long periods. Long periods being more
> than 5 seconds.
> Is there any way to multiplex SSD access while btrfs balance is
> operating, so that other applications can still access the SSD with
> relatively low latency?
>
> My guess is that btrfs is doing a transaction with a large number of
> SSD blocks at a time, and thus blocking other applications.
>
> This makes for atrocious user interactivity, as well as applications
> failing because they cannot access the disk in a relatively low-latency
> manner.
> For, example, this is causing a High Definition network CCTV
> application to fail.
>
> What I would really like, is for some way to limit SSD bandwidths to
> applications.
> For example the CCTV app always gets the bandwidth it needs, and all
> other applications can still access the SSD, but are rate limited.
> This would fix my particular problem.
> We have rate limiting for network applications, why not disk access also?
>
> Kind Regards
>
> James
>

For most I/O-intensive programs on Linux you can use the "ionice" tool to 
change the disk access priority of a process. [1]
This allows me to run I/O-intensive background scripts on servers without 
the users noticing slowdowns or lag; of course, it means the process doing 
heavy I/O will run more slowly, or get outright paused if higher-priority 
processes need a lot of access to the disk.

It works for btrfs balance too; see the commandline example in [2].
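
As a minimal sketch (the mount point /mnt and the usage filter are 
assumptions):

   # lowest CPU priority, idle I/O class; only balance data chunks that
   # are less than 75% full
   nice -n 19 ionice -c 3 btrfs balance start -dusage=75 /mnt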

If you don't start the process with ionice as in [2], you can always 
change the priority later once you have the process ID. I use iotop [3], 
which also supports commandline arguments to integrate its output into 
scripts.
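
Something like this (with a made-up PID) re-tunes a balance that is 
already running:

   # show only processes currently doing I/O, once, in batch mode
   iotop -o -b -n 1
   # then move the balance PID to the idle I/O class and lowest CPU priority
   ionice -c 3 -p 12345
   renice -n 19 -p 12345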

For btrfs scrub it is possible to pass the ionice class and priority 
directly (the -c and -n options of "btrfs scrub start"), while btrfs 
balance does not seem to have an equivalent (it would be nice to add one, 
imho). [4]

For the sake of completeness, there is also the "nice" tool for CPU 
priority (I also use it in my server scripts to keep them from hogging 
the CPU, since they are just background processes; it appears in the [2] 
commandline too). [5]

1. http://man7.org/linux/man-pages/man1/ionice.1.html
2. 
https://unix.stackexchange.com/questions/390480/nice-and-ionice-which-one-should-come-first
3. http://man7.org/linux/man-pages/man8/iotop.8.html
4. https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs-scrub
5. http://man7.org/linux/man-pages/man1/nice.1.html

-Alberto

dead links in wiki about why not using lz4

2017-09-12 Thread Alberto Bursi
Since zstd was added to btrfs, there are people wondering why lz4 was not 
added too.

The wiki paragraph [1] cites a couple of dead links as the explanation of 
why it's not a good idea to have lz4 in btrfs.

I think it would be nice if someone could write that information back 
into the wiki, or at the very least post it in a mailing list message 
that can be linked from there.


-Alberto


1. https://btrfs.wiki.kernel.org/index.php/FAQ#Will_btrfs_support_LZ4.3F




Re: Copy BTRFS volume to another BTRFS volume including subvolumes and snapshots

2016-10-14 Thread Alberto Bursi


On 10/15/2016 12:17 AM, Chris Murphy wrote:
> It should be that -e can accept a listing of all the subvolumes you want to
> send at once. And possibly an -r flag, if it existed, could
> automatically populate -e. But the last time I tested -e I just got
> errors.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=111221
>
>

Not a problem (for me anyway): I can already send all the subvolumes with 
my script (one after another, but still automatically).

What I can't do with btrfs commands is send over the contents of a ro 
snapshot of /, called for example "oldRootSnapshot", directly into 
"/tmp/newroot" (which is where I have mounted the other drive/volume).

The only thing I can do is send the subvolume over as a subvolume.
So I end up with /tmp/newroot/oldRootSnapshot, and inside oldRootSnapshot 
I get my root, which is not what I wanted.

The only way I have found so far is to use rsync to move the contents of 
oldRootSnapshot into /tmp/newroot, with an exclusion list for all the 
subvolumes, and then run a deduplication pass with duperemove.
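
Roughly, that workaround looks like this (the paths are assumptions, and 
the exclude file is a hand-made list of subvolume paths taken from 
"btrfs subvolume list /"):

   # copy the snapshot's contents, skipping the paths in the exclude file
   rsync -aHAX --exclude-from=/tmp/subvol-excludes /mnt/oldRootSnapshot/ /tmp/newroot/
   # then recover the space shared with the subvolumes sent separately
   duperemove -rdh /tmp/newroot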

So, is there something I have missed that would let me do that?

-Alberto

Re: Copy BTRFS volume to another BTRFS volume including subvolumes and snapshots

2016-10-14 Thread Alberto Bursi


On 10/14/2016 01:38 PM, Austin S. Hemmelgarn wrote:
> On 2016-10-13 17:21, Alberto Bursi wrote:
>> Hi, I'm using OpenSUSE on a btrfs volume spanning 2 disks (set as raid1
>> for both metadata and data), no separate /home partition.
>> -
>> I'd like to be able to clone verbatim the whole volume to another
>> volume, for backup purposes.
>>
>> Now, I think I can do that with a brutal partition clone from my
>> "recovery" (a debian system installed in another drive) and then doing
>> the procedures to deal with a lost drive.
>>
>> But I'd prefer a clone done on a live system, and without doing a
>> brutal clone, as that will keep the same UUIDs.
>>
>> I can (will) script this, so even if it is a tedious process or it
>> involves writing a huge line on the commandline, it's not an issue.
> I'm not sure there is any way to do this on a live system.  You
> essentially need to either do a block level copy and change the UUID's
> (which recent versions of btrfstune can do), or use send with some
> manual setup to get the subvolumes across.  Both options require the
> filesystem to be effectively read-only, which is not something that any
> modern Linux distro can reliably handle for more than a few minutes.
>
> If I had to do this, I'd go with the block level copy, since it requires
> a lot less effort, just make sure to use btrfstune to change the UUID's
> when the copy is done (that may take a while itself though, since it has
> to rewrite a lot of metadata).

Heh, one of the reasons I migrated to btrfs was that I wanted to be able 
to do things like this on a live system.

With my script I can already generate any necessary folder structure and 
send over all the subvolumes with a series of btrfs send | btrfs receive 
pipes, in about 5 lines of script.
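
The core of it is just a loop along these lines (the snapshot directory 
and the target mount point are assumptions):

   # send every read-only snapshot under /backup-snaps to the new volume
   # (snapshots must be read-only for btrfs send to accept them)
   for snap in /backup-snaps/*; do
       btrfs send "$snap" | btrfs receive /tmp/newroot/
   done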

I was hoping there was some neat trick with btrfs send | btrfs receive 
that would allow me to send a snapshot of / to the / of the new volume.

With rsync (from a ro snapshot of /) it should be possible to use the 
subvolume list as an exclusion list, but I would rather have used a btrfs 
feature instead.

-Alberto


Copy BTRFS volume to another BTRFS volume including subvolumes and snapshots

2016-10-13 Thread Alberto Bursi
Hi, I'm using OpenSUSE on a btrfs volume spanning 2 disks (set as raid1 
for both metadata and data), no separate /home partition.
The distro loves to create dozens of subvolumes for various things and 
makes snapshots, see:
alby@openSUSE-xeon:~> sudo btrfs subvolume list /
ID 257 gen 394 top level 5 path @
ID 258 gen 293390 top level 257 path @/.snapshots
ID 259 gen 293607 top level 258 path @/.snapshots/1/snapshot
ID 260 gen 107012 top level 257 path @/boot/grub2/i386-pc
ID 261 gen 107012 top level 257 path @/boot/grub2/x86_64-efi
ID 262 gen 293610 top level 257 path @/home
ID 263 gen 292439 top level 257 path @/opt
ID 264 gen 288726 top level 257 path @/srv
ID 265 gen 293610 top level 257 path @/tmp
ID 266 gen 292657 top level 257 path @/usr/local
ID 267 gen 104612 top level 257 path @/var/crash
ID 268 gen 133454 top level 257 path @/var/lib/libvirt/images
ID 269 gen 104612 top level 257 path @/var/lib/mailman
ID 270 gen 104612 top level 257 path @/var/lib/mariadb
ID 271 gen 292441 top level 257 path @/var/lib/mysql
ID 272 gen 104612 top level 257 path @/var/lib/named
ID 273 gen 104612 top level 257 path @/var/lib/pgsql
ID 274 gen 293608 top level 257 path @/var/log
ID 275 gen 104612 top level 257 path @/var/opt
ID 276 gen 293610 top level 257 path @/var/spool
ID 277 gen 293606 top level 257 path @/var/tmp
ID 362 gen 228259 top level 258 path @/.snapshots/56/snapshot
ID 364 gen 228259 top level 258 path @/.snapshots/57/snapshot
ID 528 gen 228259 top level 258 path @/.snapshots/110/snapshot
ID 529 gen 228259 top level 258 path @/.snapshots/111/snapshot
ID 670 gen 228259 top level 258 path @/.snapshots/240/snapshot
ID 671 gen 228259 top level 258 path @/.snapshots/241/snapshot
ID 894 gen 228283 top level 258 path @/.snapshots/438/snapshot
ID 895 gen 228283 top level 258 path @/.snapshots/439/snapshot
ID 896 gen 228283 top level 258 path @/.snapshots/440/snapshot
ID 897 gen 228283 top level 258 path @/.snapshots/441/snapshot
ID 1033 gen 288965 top level 258 path @/.snapshots/554/snapshot
ID 1034 gen 289531 top level 258 path @/.snapshots/555/snapshot
ID 1035 gen 289726 top level 258 path @/.snapshots/556/snapshot
ID 1036 gen 289729 top level 258 path @/.snapshots/557/snapshot
ID 1037 gen 290297 top level 258 path @/.snapshots/558/snapshot
ID 1038 gen 290301 top level 258 path @/.snapshots/559/snapshot
ID 1039 gen 290336 top level 258 path @/.snapshots/560/snapshot
ID 1041 gen 290338 top level 258 path @/.snapshots/562/snapshot
ID 1043 gen 292047 top level 258 path @/.snapshots/563/snapshot
ID 1044 gen 292051 top level 258 path @/.snapshots/564/snapshot
ID 1045 gen 292531 top level 258 path @/.snapshots/565/snapshot
ID 1046 gen 293153 top level 258 path @/.snapshots/566/snapshot

I'd like to be able to clone verbatim the whole volume to another 
volume, for backup purposes.

Now, I think I can do that with a brutal partition clone from my 
"recovery" (a debian system installed on another drive), and then going 
through the procedures for dealing with a lost drive.

But I'd prefer a clone done on a live system, and without doing a brutal 
clone, as that will keep the same UUIDs.

I can (will) script this, so even if it is a tedious process or it 
involves writing a huge line on the commandline, it's not an issue.

Thanks for any input.

-Alberto

Re: check if hardware checksumming works or not

2016-06-08 Thread Alberto Bursi

I was hoping for a more advanced method than just blacklisting drivers. :)

Anyway, it seems that on the 1st of June there was a patch


   [PATCH] btrfs: advertise which crc32c implementation is being used
   at module load

that is getting merged lately, so in the near future I will be able to 
see which driver is in use when the btrfs module loads.
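
In the meantime, a rough way to see what the crypto layer has registered 
(the highest-priority crc32c entry in /proc/crypto is the one that 
actually gets used):

   # list the registered crc32c implementations and their priorities
   grep -A5 'name.*crc32c' /proc/crypto
   # and once the patch above is merged, the chosen backend is also printed
   # when the btrfs module loads:
   dmesg | grep -i crc32c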



On 05/06/2016 22:33, Nicholas D Steeves wrote:

Hi Alberto,

On 5 June 2016 at 15:37, Alberto Bursi <alberto.bu...@outlook.it> wrote:

Hi, I'm running Debian ARM on a Marvell Kirkwood-based 2-disk NAS.

Kirkwood SoCs have a XOR engine that can hardware-accelerate crc32c
checksumming, and from what I see in kernel mailing lists it seems to have a
linux driver and should be supported.

I wanted to ask if there is a way to test if it is working at all.

How do I force btrfs to use software checksumming for testing purposes?

Is there a mv_xor.ko module you can blacklist?  I'm not familiar with
the platform, but I imagine you'll have to blacklist it and reboot,
because I'm guessing the module can't be removed once it's loaded.

'just a guess,
Nicholas


check if hardware checksumming works or not

2016-06-05 Thread Alberto Bursi


Hi, I'm running Debian ARM on a Marvell Kirkwood-based 2-disk NAS.

Kirkwood SoCs have a XOR engine that can hardware-accelerate crc32c 
checksumming, and from what I see on the kernel mailing lists it seems to 
have a Linux driver and should be supported.


I wanted to ask if there is a way to test if it is working at all.

How do I force btrfs to use software checksumming for testing purposes?


Thanks


-Alberto