Re: [Gluster-users] Replica 3 scale out and ZFS bricks

2020-09-16 Thread Strahil Nikolov







On Wednesday, 16 September 2020 at 11:54:57 GMT+3, Alexander Iliev wrote:





From what I understood, in order to be able to scale it one node at a 
time, I need to set up the initial nodes with a number of bricks that is 
a multiple of 3 (e.g., 3, 6, 9, etc. bricks). The initial cluster will 
be able to export a volume as large as the storage of a single node and 
adding one more node will grow the volume by 1/3 (assuming homogeneous 
nodes.)

You can't add a single node to a replica 3 volume, so no - you won't grow the
volume by 1/3 with that extra node.
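
For reference, growing a replica 3 volume means adding bricks in sets of three
(one complete replica set at a time). A rough sketch with made-up host and
brick names:

# Bricks must be added in multiples of the replica count (3 here)
$ sudo gluster volume add-brick myvol node4:/data/glusterfs/myvol/brick1/brick node5:/data/glusterfs/myvol/brick1/brick node6:/data/glusterfs/myvol/brick1/brick
# Rebalance so existing data spreads onto the new replica set
$ sudo gluster volume rebalance myvol start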

My plan is to use ZFS as the underlying system for the bricks. Now I'm 
wondering - if I join the disks on each node in a, say, RAIDZ2 pool and 
then create a dataset within the pool for each brick, the GlusterFS 
volume would report the volume size 3x$brick_size, because each brick 
shares the same pool and the size/free space is reported according to 
the ZFS pool size/free space.

I'm not sure about ZFS (never played with it on Linux), but on my systems I
set up a thin pool spanning all HDDs in a striped fashion (when no hardware
RAID controller is available) and then create thin LVs for each brick.
With thin LVM you can define a virtual size, and that size is what gets
reported as the volume size (assuming all bricks are the same size). If you
have 1 RAIDZ2 pool per Gluster TSP node, then that pool's size is the maximum
size of your volume. If you plan to use snapshots, then you should set a quota
on the volume to control the usage.
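
A rough sketch of that LVM layout (device names, VG/LV names and sizes below
are placeholders, not a recommendation):

# Striped thin pool spanning all data disks, then one thin LV per brick
$ sudo vgcreate vg_bricks /dev/sdb /dev/sdc /dev/sdd
$ sudo lvcreate --type thin-pool --stripes 3 -L 20T -n tp_bricks vg_bricks
# The virtual size (-V) is what ends up being reported as the brick size
$ sudo lvcreate --thin vg_bricks/tp_bricks -V 10T -n brick1
$ sudo mkfs.xfs -i size=512 /dev/vg_bricks/brick1
$ sudo mount /dev/vg_bricks/brick1 /data/glusterfs/myvol/brick1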

How should I go about this? Should I create a ZFS pool per brick (this 
seems to have a negative impact on performance)? Should I set a quota 
for each dataset?

I would go with 1 RAIDZ2 pool with 1 dataset of type 'filesystem' per Gluster
node. Quota is always good to have.
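
As a rough ZFS sketch of that (pool/dataset names, disks and the quota figure
are placeholders; xattr=sa is a commonly suggested tuning for Gluster bricks
on ZFS):

# One RAIDZ2 pool per node with a single 'filesystem' dataset used as the brick
$ sudo zpool create tank raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg
$ sudo zfs create -o mountpoint=/data/glusterfs/myvol/brick1 -o xattr=sa tank/brick1
$ sudo zfs set quota=20T tank/brick1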


P.S.: Any reason to use ZFS? It uses a lot of memory.

Best Regards,
Strahil Nikolov




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Correct way to migrate brick to new server (Gluster 6.10)

2020-09-16 Thread Strahil Nikolov
Actually, I have used 'replace-brick' several times and had no issues.
I guess you can also 'remove-brick replica <count> old_brick' and later
'add-brick replica <count> new_brick' ...
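
One way to spell out the remove/add route for a 2 x 2 (replica 2) volume is to
work a whole replica pair at a time, adding the new pair first and draining the
old one via remove-brick start/status/commit (host and path names below are
placeholders):

# Add the new replica pair (bricks must be added in multiples of the replica count)
$ sudo gluster volume add-brick myvol new1:/data/glusterfs/myvol/brick1/brick new2:/data/glusterfs/myvol/brick1/brick
# Migrate data off the old pair, watch progress, then commit the removal
$ sudo gluster volume remove-brick myvol old1:/data/glusterfs/myvol/brick1/brick old2:/data/glusterfs/myvol/brick1/brick start
$ sudo gluster volume remove-brick myvol old1:/data/glusterfs/myvol/brick1/brick old2:/data/glusterfs/myvol/brick1/brick status
$ sudo gluster volume remove-brick myvol old1:/data/glusterfs/myvol/brick1/brick old2:/data/glusterfs/myvol/brick1/brick commit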

Best Regards,
Strahil Nikolov





On Wednesday, 16 September 2020 at 08:41:29 GMT+3, Alex Wakefield wrote:





Hi all,

We have a distributed-replicate Gluster volume running Gluster 6.10 on Ubuntu
18.04 machines. It's a 2 x 2 brick setup (2 bricks, 2 replicas).

We need to migrate the existing bricks to new hardware without downtime and are
at a loss as to the proper way to do it. I've found this post [1] which suggests
that we can use a replace-brick command to move a brick to the new server without
downtime, but this [2] mailing list thread suggests that isn't the correct way to
do it anymore?

The gluster docs [3] have information on replacing _faulty_ bricks, but our
bricks aren't faulty; we just need to move them to new hardware. We've tried the
method mentioned in the docs in the past but found that the volume gets into
weird states where files go read-only or have their permissions set to
root:root. It basically plays havoc with the fs mount that the clients use.

Any help would be greatly appreciated. Apologies if I've left any information 
out.

[1]: https://joejulian.name/post/replacing-a-glusterfs-server-best-practice/
[2]: https://lists.gluster.org/pipermail/gluster-users/2012-October/011502.html
[3]: 
https://docs.gluster.org/en/latest/Administrator%20Guide/Managing%20Volumes/#replace-faulty-brick

Cheers,
Alex




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Correct way to migrate brick to new server (Gluster 6.10)

2020-09-16 Thread Ronny Adsetts
Alex Wakefield wrote on 16/09/2020 06:33:
> 
> We have a distributed-replicate Gluster volume running Gluster 6.10 on
> Ubuntu 18.04 machines. It's a 2 x 2 brick setup (2 bricks, 2
> replicas).
> 
> We need to migrate the existing bricks to new hardware without
> downtime and are at a loss as to the proper way to do it. I've found
> this post [1] which suggests that we can use a replace-brick command
> to move a brick to the new server without downtime, but this [2]
> mailing list thread suggests that isn't the correct way to do it
> anymore?
> 
> The gluster docs [3] have information for replacing _faulty_ bricks
> but our bricks aren't faulty, we just need to move them to new
> hardware. We've tried using this method mentioned in the docs in the
> past but have found that the volume gets into weird states where
> files go into read-only mode or have their permissions set to
> root:root. It basically plays havoc on the fs mount that the clients
> use.
> 
> Any help would be greatly appreciated. Apologies if I've left any
> information out.

Hi,

We had this same quandary in March [1]. We first tested using
add-brick/remove-brick, which resulted in permissions/ownership mayhem. After
some head scratching, I took the replace-brick approach, which worked fine.

Something like so for each brick, waiting for all heals to complete between 
brick replacements:

$ sudo gluster volume replace-brick volname \
    stor-old-1:/data/glusterfs/volname/brick1/brick \
    stor-new-1:/data/glusterfs/volname/brick1/brick \
    commit force
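
To confirm heals are done before moving on to the next brick, the heal info
output should show no pending entries, e.g.:

# No entries should remain before replacing the next brick
$ sudo gluster volume heal volname info
# A condensed view (available on recent Gluster releases, including 6.x)
$ sudo gluster volume heal volname info summary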

I did this on a live volume starting with the data I cared about least first. 
Nerves were properly on edge for the first volume I can tell you! :-).

I would, if feasible, recommend doing a test migration on a small volume and
checksumming the data before and after.
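
A simple way to do that checksum from a client mount (mount point and output
path are placeholders):

# Run from a client mount before and after the migration, then diff the two files
$ cd /mnt/volname
$ find . -type f -print0 | sort -z | xargs -0 sha256sum > /tmp/volname-checksums-before.txt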

Thanks.

Ronny

[1] https://lists.gluster.org/pipermail/gluster-users/2020-March/037786.html

-- 
Ronny Adsetts
Technical Director
Amazing Internet Ltd, London
t: +44 20 8977 8943
w: www.amazinginternet.com

Registered office: 85 Waldegrave Park, Twickenham, TW1 4TJ
Registered in England. Company No. 4042957








Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Replica 3 scale out and ZFS bricks

2020-09-16 Thread Alexander Iliev

Hi list,

I am in the process of planning a 3-node replica 3 setup and I have a 
question about scaling it out.


From what I understood, in order to be able to scale it one node at a 
time, I need to set up the initial nodes with a number of bricks that is 
a multiple of 3 (e.g., 3, 6, 9, etc. bricks). The initial cluster will 
be able to export a volume as large as the storage of a single node and 
adding one more node will grow the volume by 1/3 (assuming homogeneous 
nodes.)


Please let me know if my understanding is correct.

My plan is to use ZFS as the underlying system for the bricks. Now I'm 
wondering - if I join the disks on each node in a, say, RAIDZ2 pool and 
then create a dataset within the pool for each brick, the GlusterFS 
volume would report the volume size 3x$brick_size, because each brick 
shares the same pool and the size/free space is reported according to 
the ZFS pool size/free space.


How should I go about this? Should I create a ZFS pool per brick (this 
seems to have a negative impact on performance)? Should I set a quota 
for each dataset?
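
For what it's worth, capping each per-brick dataset is a one-liner per brick if
you go that way (pool/dataset names and the size are placeholders):

# A refquota on the dataset also caps what df reports for that brick
$ sudo zfs set refquota=10T tank/brick1
$ sudo zfs set refquota=10T tank/brick2
$ sudo zfs set refquota=10T tank/brick3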


Does my plan even make sense?

Thank you!

Best regards,
--
alexander iliev




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users