Re: [Gluster-users] gluster volume tier option missing
It was deprecated quite a long time ago, like RDMA support :(

Diego

On 21/02/2024 18:43, garcetto wrote:
> Good evening, I am new to GlusterFS and read about the tiering option, but I cannot find it in the current v10 packages on Ubuntu 22 LTS. My fault? Thank you.
> https://www.gluster.org/automated-tiering-in-gluster-2/

-- 
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786
Re: [Gluster-users] Geo-replication status is getting Faulty after a few seconds
Have you tried setting up gluster geo-replication with a dedicated non-root user?

Best Regards,
Strahil Nikolov

On Tue, Feb 6, 2024 at 16:38, Anant Saraswat <anant.saras...@techblue.co.uk> wrote:
> [...]
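For reference, a condensed sketch of the documented non-root ("mountbroker") geo-replication setup that the question above refers to. The geoaccount/geogroup names, volume names and hosts are placeholders; the geo-replication admin guide has the full procedure.

# On the secondary (slave) nodes:
groupadd geogroup
useradd -G geogroup geoaccount
gluster-mountbroker setup /var/mountbroker-root geogroup
gluster-mountbroker add slavevol geoaccount
systemctl restart glusterd

# On the primary (master) node:
gluster system:: execute gsec_create
gluster volume geo-replication mastervol geoaccount@slavehost::slavevol create push-pem
# (the guide's remaining step, run once on the secondary, distributes the pem keys:
#  /usr/libexec/glusterfs/set_geo_rep_pem_keys.sh geoaccount mastervol slavevol)
gluster volume geo-replication mastervol geoaccount@slavehost::slavevol start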
Re: [Gluster-users] Challenges with Replicated Gluster volume after stopping Gluster on any node.
Re: [Gluster-users] Upgrade 10.4 -> 11.1 making problems
I don't want to hijack the thread. And in my case setting logs to debug would fill my /var partitions in no time. Maybe the OP can.

Diego

On 18/01/2024 22:58, Strahil Nikolov wrote:
Are you able to set the logs to debug level? It might provide a clue what is going on.

Best Regards,
Strahil Nikolov

On Thu, Jan 18, 2024 at 13:08, Diego Zuccato wrote:
That's the same kind of errors I keep seeing on my 2 clusters, regenerated some months ago. Seems a pseudo-split-brain that should be impossible on a replica 3 cluster, but it keeps happening. Sadly going to ditch Gluster ASAP.

Diego

On 18/01/2024 07:11, Hu Bert wrote:
> Good morning,
> heal still not running. Pending heals now sum up to 60K per brick.
> Heal was starting instantly e.g. after a server reboot with version
> 10.4, but doesn't with version 11. What could be wrong?
>
> I only see these errors on one of the "good" servers in glustershd.log:
>
> [2024-01-18 06:08:57.328480 +0000] W [MSGID: 114031]
> [client-rpc-fops_v2.c:2561:client4_0_lookup_cbk] 0-workdata-client-0:
> remote operation failed. [{path=}, {gfid=cb39a1e4-2a4c-4727-861d-3ed9ef00681b}, {errno=2}, {error=No such file or directory}]
> [2024-01-18 06:08:57.594051 +0000] W [MSGID: 114031]
> [client-rpc-fops_v2.c:2561:client4_0_lookup_cbk] 0-workdata-client-1:
> remote operation failed. [{path=}, {gfid=3e9b178c-ae1f-4d85-ae47-fc539d94dd11}, {errno=2}, {error=No such file or directory}]
>
> About 7K today. Any ideas? Someone?
>
> Best regards,
> Hubert
>
> On Wed, 17 Jan 2024 at 11:24, Hu Bert <revi...@googlemail.com> wrote:
>>
>> OK, finally managed to get all servers, volumes etc. running, but it took
>> a couple of restarts, cksum checks etc.
>>
>> One problem: a volume doesn't heal automatically, or doesn't heal at all.
>>
>> gluster volume status
>> Status of volume: workdata
>> Gluster process                            TCP Port  RDMA Port  Online  Pid
>> ------------------------------------------------------------------------------
>> Brick glusterpub1:/gluster/md3/workdata    58832     0          Y       3436
>> Brick glusterpub2:/gluster/md3/workdata    59315     0          Y       1526
>> Brick glusterpub3:/gluster/md3/workdata    56917     0          Y       1952
>> Brick glusterpub1:/gluster/md4/workdata    59688     0          Y       3755
>> Brick glusterpub2:/gluster/md4/workdata    60271     0          Y       2271
>> Brick glusterpub3:/gluster/md4/workdata    49461     0          Y       2399
>> Brick glusterpub1:/gluster/md5/workdata    54651     0          Y       4208
>> Brick glusterpub2:/gluster/md5/workdata    49685     0          Y       2751
>> Brick glusterpub3:/gluster/md5/workdata    59202     0          Y       2803
>> Brick glusterpub1:/gluster/md6/workdata    55829     0          Y       4583
>> Brick glusterpub2:/gluster/md6/workdata    50455     0          Y       3296
>> Brick glusterpub3:/gluster/md6/workdata    50262     0          Y       3237
>> Brick glusterpub1:/gluster/md7/workdata    52238     0          Y       5014
>> Brick glusterpub2:/gluster/md7/workdata    52474     0          Y       3673
>> Brick glusterpub3:/gluster/md7/workdata    57966     0          Y       3653
>> Self-heal Daemon on localhost              N/A       N/A        Y       4141
>> Self-heal Daemon on glusterpub1            N/A       N/A        Y       5570
>> Self-heal Daemon on glusterpub2            N/A       N/A        Y       4139
>>
>> "gluster volume heal workdata info" lists a lot of files per brick.
>> "gluster volume heal workdata statistics heal-count" shows thousands
>> of files per brick.
>> "gluster volume heal workdata enable" has no effect.
>>
>> gluster volume heal workdata full
>> Launching heal operation to perform full self heal on volume workdata
>> has been successful
>> Use heal info commands to check status.
>>
>> -> not doing anything at all. And nothing happening on the 2 "good"
>> servers in e.g. glustershd.log. Heal was working as expected on
>> version 10.4, but here... silence. Someone has an idea?
>>
>> Best regards,
>> Hubert
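A sketch of what raising the log level for the workdata volume could look like (the option names match the ones shown elsewhere in this thread); remember to lower it again, since DEBUG output can fill /var quickly:

# raise verbosity only while reproducing the stuck heal
gluster volume set workdata diagnostics.client-log-level DEBUG
gluster volume set workdata diagnostics.brick-log-level DEBUG
# ...watch glustershd.log and the brick logs, then revert
gluster volume set workdata diagnostics.client-log-level INFO
gluster volume set workdata diagnostics.brick-log-level INFO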
Re: [Gluster-users] Upgrade 10.4 -> 11.1 making problems
Since glusterd does not consider it a split brain, you can't solve it with the standard split-brain tools. I've found no way to resolve it except by manually handling one file at a time: completely unmanageable with thousands of files, and having to juggle between the actual path on the brick and the metadata files!

Previously I "fixed" it by:
1) moving all the data from the volume to a temp space
2) recovering from the bricks what was inaccessible from the mountpoint, keeping different file revisions for the conflicting ones (see the sketch below)
3) destroying and recreating the volume
4) copying back the data from the backup

When gluster gets used because you need lots of space (we had more than 400TB on 3 nodes with 30x12TB SAS disks in "replica 3 arbiter 1"), where do you park the data? Is the official solution "just have a second cluster idle for when you need to fix errors"? It took more than a month of downtime this summer, and after less than 6 months I'd have to repeat it? Users are rightly quite upset...

Diego

On 18/01/2024 09:17, Hu Bert wrote:
> Were you able to solve the problem? Can it be treated like a "normal"
> split brain? 'gluster peer status' and 'gluster volume status' are ok,
> so it kinda looks like "pseudo"...
>
> Hubert
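A minimal sketch of step 2 above (pulling what is still readable straight off a brick), with placeholder paths; syncing each brick into its own subdirectory keeps conflicting revisions of the same file apart so they can be merged by hand later:

# skip Gluster's internal metadata directory on the brick
rsync -aHX --exclude='.glusterfs' /srv/bricks/00/d/ /mnt/tempspace/clustor00-brick00/
# repeat for every brick, each into its own destination directory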
Re: [Gluster-users] Upgrade 10.4 -> 11.1 making problems
[glusterd-handler.c:2546:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: b71401c3-512a-47cb-ac18-473c4ba7776e
[2024-01-15 08:02:23.608349 +0000] E [MSGID: 106010] [glusterd-utils.c:3824:glusterd_compare_friend_volume] 0-management: Version of Cksums sourceimages differ. local cksum = 2204642525, remote cksum = 1931483801 on peer gluster190
[2024-01-15 08:02:23.608584 +0000] I [MSGID: 106493] [glusterd-handler.c:3819:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to gluster190 (0), ret: 0, op_ret: -1
[2024-01-15 08:02:23.613553 +0000] I [MSGID: 106493] [glusterd-rpc-ops.c:467:__glusterd_friend_add_cbk] 0-glusterd: Received RJT from uuid: b71401c3-512a-47cb-ac18-473c4ba7776e, host: gluster190, port: 0

peer status from the rebooted node:

root@gluster190 ~ # gluster peer status
Number of Peers: 2

Hostname: gluster189
Uuid: 50dc8288-aa49-4ea8-9c6c-9a9a926c67a7
State: Peer Rejected (Connected)

Hostname: gluster188
Uuid: e15a33fe-e2f7-47cf-ac53-a3b34136555d
State: Peer Rejected (Connected)

So the rebooted gluster190 is not accepted anymore, and thus does not appear in "gluster volume status". I then followed this guide:

https://gluster-documentations.readthedocs.io/en/latest/Administrator%20Guide/Resolving%20Peer%20Rejected/

Remove everything under /var/lib/glusterd/ (except glusterd.info), restart the glusterd service etc. Data gets copied from the other nodes and 'gluster peer status' is ok again - but the volume info is missing: /var/lib/glusterd/vols is empty. After syncing that dir from another node, the volume is available again, heals start etc.

Well, and just to be sure that everything's working as it should, I rebooted that node again - and the rebooted node is kicked out again, and you have to go through bringing it back again.

Sorry, but did I miss anything? Has someone experienced similar problems? I'll probably downgrade to 10.4 again, that version was working...

Thx,
Hubert
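For reference, a sketch of the procedure described above, run on the rejected node (gluster190 here); check the linked guide before using it, since it wipes glusterd's local state:

systemctl stop glusterd
cd /var/lib/glusterd
find . -mindepth 1 -maxdepth 1 ! -name glusterd.info -exec rm -rf {} +
systemctl start glusterd
gluster peer probe gluster189      # re-probe a healthy peer
systemctl restart glusterd
# if /var/lib/glusterd/vols stays empty afterwards (as reported above),
# copy the volume definitions from a healthy node and restart once more:
rsync -a gluster189:/var/lib/glusterd/vols/ /var/lib/glusterd/vols/
systemctl restart glusterd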
Re: [Gluster-users] Gluster -> Ceph
On 17/12/2023 14:52, Joe Julian wrote:
>> From what I've been told (by experts) it's really hard to make it happen. More so if proper redundancy of MON and MDS daemons is implemented on quality HW.
> LSI isn't exactly crap hardware. But when a flaw causes it to drop drives under heavy load, the rebalance from dropped drives can cause that heavy load, causing a cascading failure. When the journal is never idle long enough to checkpoint, it fills the partition and ends up corrupted and unrecoverable.

Good to know. Better to add a monitoring service that stops everything when the log is too full. That also applies to Gluster, BTW, even if with less severe consequences: sometimes "peer files" got lost due to /var filling up, and glusterd wouldn't come up after a reboot.

>> Neither Gluster nor Ceph is a "backup solution", so if the data is not easily replaceable it's better to have it elsewhere. Better if offline.
> It's a nice idea, but when you're dealing in petabytes of data, streaming in as fast as your storage will allow, it's just not physically possible.

Well, it will have to stop sometime, or you'd need infinite storage, no? :) Usually data from experiments comes in bursts, with (often large) intervals when you can process/archive it.

-- 
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786
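A minimal sketch of such a watchdog, run e.g. from cron every few minutes; the 90% threshold and the "stop glusterd" action are assumptions to adapt, not a recommendation from the thread:

#!/bin/sh
usage=$(df --output=pcent /var | tail -n1 | tr -dc '0-9')
if [ "$usage" -ge 90 ]; then
    logger -t var-watchdog "/var at ${usage}%, stopping glusterd before its state gets corrupted"
    systemctl stop glusterd
fi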
Re: [Gluster-users] Gluster -> Ceph
On 14/12/2023 16:08, Joe Julian wrote:
> With Ceph, if the placement database is corrupted, all your data is lost (happened to my employer, once, losing 5PB of customer data).

From what I've been told (by experts) it's really hard to make that happen. More so if proper redundancy of MON and MDS daemons is implemented on quality HW.

> With Gluster, it's just files on disks, easily recovered.

I've already had to do it twice in a year, and the coming third time will be the "definitive migration". The first time there were too many little files; the second time it seemed 192GB of RAM were not enough to handle 30 bricks per server; and now that I've reduced to just 6 bricks per server (creating RAIDs) and created a brand new volume in August, I already find lots of FUSE-inaccessible files that don't heal. That should be impossible, since I'm using "replica 3 arbiter 1" over IPoIB with the three servers talking directly via the switch. But it keeps happening. I really trusted Gluster's promises, but currently what I (and, worse, the users) see is 60-70% availability.

Neither Gluster nor Ceph is a "backup solution", so if the data is not easily replaceable it's better to have it elsewhere. Better if offline.

-- 
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786
Re: [Gluster-users] Gluster 9.3 not distributing data evenly across bricks
/dev/bcache3  9.9T  8.7T  1.2T  89% /data3

Node 8:
/dev/bcache0  9.7T  8.7T  994G  90% /data
/dev/bcache1  3.9T  3.3T  645G  84% /data1
/dev/bcache2  3.9T  3.4T  519G  87% /data2
/dev/bcache3  9.9T  9.0T  868G  92% /data3

Node 9:
/dev/bcache0  10T   8.6T  1.4T  87% /data
/dev/bcache1  8.0T  6.7T  1.4T  84% /data1
/dev/bcache2  8.0T  6.8T  1.3T  85% /data2

Node 10:
/dev/bcache0  10T   8.8T  1.3T  88% /data
/dev/bcache1  8.0T  6.6T  1.4T  83% /data1
/dev/bcache2  8.0T  7.0T  990G  88% /data2

Node 11:
/dev/bcache0  10T   8.1T  1.9T  82% /data
/dev/bcache1  10T   8.5T  1.5T  86% /data1
/dev/bcache2  10T   8.4T  1.6T  85% /data2

Node 12:
/dev/bcache0  10T   8.4T  1.6T  85% /data
/dev/bcache1  10T   8.4T  1.6T  85% /data1
/dev/bcache2  10T   8.2T  1.8T  83% /data2

Node 13:
/dev/bcache1  10T   8.7T  1.3T  88% /data1
/dev/bcache2  10T   8.8T  1.2T  88% /data2
/dev/bcache0  10T   8.6T  1.5T  86% /data

-- 
Regards,
Shreyansh Shah
AlphaGrep Securities Pvt. Ltd.
Re: [Gluster-users] Rebuilding a failed cluster
Much depends on the original volume layout. For replica volumes you'll find multiple copies of the same file on different bricks. And sometimes 0-byte files that are placeholders for renamed files: do not overwrite a good file with its empty version!

If the old volume is still online, it's better if you copy from its FUSE mount point to the new one. But since it's a temporary "backup", there's no need to use another Gluster volume as the destination: just use a USB drive directly connected to the old nodes (one at a time), or to a machine that can still FUSE-mount the old volume. Once you have a backup, write-protect it and experiment freely :)

Diego

On 29/11/2023 19:17, Richard Betel wrote:
> Ok, it's been a while, but I'm getting back to this "project". I was unable to get gluster for the platform: the machines are ARM-based, and there are no ARM binaries in the gluster package repo. I tried building it instead, but the version of gluster I was running was quite old, and I couldn't get all the right package versions to do a successful build.
>
> As a result, it sounds like my best option is to follow your alternate suggestion: "The other option is to setup a new cluster and volume and then mount the volume via FUSE and copy the data from one of the bricks."
>
> I want to be sure I understand what you're saying, though. Here's my plan:
> - create 3 VMs on amd64 processors (*)
> - give each a 100G brick
> - set up the 3 bricks as disperse
> - mount the new gluster volume on my workstation
> - copy directories from one of the old bricks to the mounted new GFS volume
> - copy the fully restored data from the new GFS volume to my workstation or whatever permanent setup I go with
>
> Is that right? Or do I want the GFS system to be offline while I copy the contents of the old brick to the new brick?
>
> (*) I'm not planning to keep my GFS on VMs in the cloud, I just want something temporary to work with so I don't blow up anything else.
>
> On Sat, 12 Aug 2023 at 09:20, Strahil Nikolov <hunter86...@yahoo.com> wrote:
>> If you preserved the gluster structure in /etc/ and /var/lib, you should be able to run the cluster again. First install the same gluster version on all nodes and then overwrite the structure in /etc and in /var/lib. Once you mount the bricks, start glusterd and check the situation.
>>
>> The other option is to setup a new cluster and volume and then mount the volume via FUSE and copy the data from one of the bricks.
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> On Saturday, August 12, 2023, 7:46 AM, Richard Betel <emte...@gmail.com> wrote:
>>> I had a small cluster with a disperse 3 volume. 2 nodes had hardware failures and no longer boot, and I don't have replacement hardware for them (it's an old board called a PC-duino). However, I do have their intact root filesystems and the disks the bricks are on. So I need to rebuild the cluster on all new host hardware. Does anyone have any suggestions on how to go about doing this?
>>>
>>> I've built 3 VMs to be a new test cluster, but if I copy over a file from the 3 nodes and try to read it, I can't, and I get errors in /var/log/glusterfs/foo.log:
>>>
>>> [2023-08-12 03:50:47.638134 +0000] W [MSGID: 114031] [client-rpc-fops_v2.c:2561:client4_0_lookup_cbk] 0-gv-client-0: remote operation failed. [{path=/helmetpart.scad}, {gfid=----}, {errno=61}, {error=No data available}]
>>> [2023-08-12 03:50:49.834859 +0000] E [MSGID: 122066] [ec-common.c:1301:ec_prepare_update_cbk] 0-gv-disperse-0: Unable to get config xattr. FOP : 'FXATTROP' failed on gfid 076a511d-3721-4231-ba3b-5c4cbdbd7f5d.
>>> Parent FOP: READ [No data available]
>>> [2023-08-12 03:50:49.834930 +0000] W [fuse-bridge.c:2994:fuse_readv_cbk] 0-glusterfs-fuse: 39: READ => -1 gfid=076a511d-3721-4231-ba3b-5c4cbdbd7f5d fd=0x7fbc9c001a98 (No data available)
>>>
>>> So obviously I need to copy over more stuff from the original cluster. If I force the 3 nodes and the volume to have the same uuids, will that be enough?
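A sketch of the "copy via FUSE" approach discussed above, assuming a replica/distribute layout where bricks hold whole files (on a disperse volume each brick only holds fragments, so copying from a single brick would not reassemble anything); all host names and paths are made up:

# mount the new volume and, if it still works, the old one too
mount -t glusterfs newnode1:/newvol /mnt/newvol
rsync -aHX /mnt/oldvol/ /mnt/newvol/
# fallback: copy from a readable brick, skipping Gluster's internal metadata;
# check any 0-byte sticky-bit files by hand so they don't overwrite good copies
rsync -aHX --exclude='.glusterfs' /srv/old-brick/ /mnt/newvol/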
Re: [Gluster-users] Verify limit-objects from clients in Gluster9 ?
Since Gluster allows setting "project quotas" on directories both for size (limit) and for inodes (limit-objects), and it allows checking the size quota from clients (via df), I expected it to do the same for inodes (tentatively via "df -i"). But it seems that's not the case, so I was asking if I'm missing something...

Diego

On 21/11/2023 19:06, Strahil Nikolov wrote:
> What do you mean by dir? Usually the inode max value is per file system.
>
> Best Regards,
> Strahil Nikolov
>
> On Mon, Nov 6, 2023 at 12:58, difa.csi wrote:
>> Hello all.
>> Is there a way to check the inode limit from clients?
>> df -i /path/to/dir seems to report values for the whole volume, not just the dir.
>>
>> For space it works as expected:
>>
>> # gluster v quota cluster_data list
>>     Path      Hard-limit   Soft-limit      Used    Available  Soft-limit exceeded?  Hard-limit exceeded?
>> ----------------------------------------------------------------------------------------------------------
>> /astro        20.0TB       80%(16.0TB)     18.8TB  1.2TB      Yes                   No
>>
>> # df /mnt/scratch/astro
>> Filesystem               1K-blocks        Used    Available Use% Mounted on
>> clustor00:cluster_data 21474836480 20169918036   1304918444  94% /mnt/scratch
>>
>> For inodes, instead:
>>
>> # gluster v quota cluster_data list-objects
>>     Path      Hard-limit   Soft-limit      Files   Dirs  Available  Soft-limit exceeded?  Hard-limit exceeded?
>> ----------------------------------------------------------------------------------------------------------------
>> /astro        100000       80%(80000)      99897   103   0          Yes                   Yes
>>
>> # df -i /mnt/scratch/astro
>> Filesystem                 Inodes  IUsed      IFree IUse% Mounted on
>> clustor00:cluster_data 4687500480 122689 4687377791    1% /mnt/scratch
>>
>> It should report 100% use for "hard quota exceeded", IMO.
>> That's on Gluster 9.6.
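For completeness, a sketch of how the object quota in the listing above would have been set and how it can be inspected, using the limit shown there; note that only the size quota is mapped into client-side df (via quota-deem-statfs), while df -i keeps reporting volume-wide inode counts:

gluster volume quota cluster_data limit-objects /astro 100000
gluster volume quota cluster_data list-objects /astro
df -i /mnt/scratch/astro    # still shows the whole volume, not the directory's object quota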
Re: [Gluster-users] State of the gluster project
Maybe a bit OT...

I'm no expert on either, but the concepts are quite similar. Both require "extra" nodes (metadata and monitor), but those can be virtual machines, or you can host the services on the OSD machines. We don't use snapshots, so I can't comment on that. My experience with Ceph is limited to having it working on Proxmox. No experience yet with CephFS.

BeeGFS is more like a "freemium" FS: the base functionality is free, but if you need "enterprise" features (quota, replication...) you have to pay (quite a lot... probably so as not to undercut lucrative GPFS licensing).

We also saw more than 30 minutes for an ls on a Gluster directory containing about 50 files when we had many millions of files on the fs (with one disk per brick, which also led to many memory issues). After the last rebuild I created 5-disk RAID5 bricks (about 44TB each) and memory pressure went down drastically, but desyncs still happen even though the nodes are connected via IPoIB links that are really rock-solid (and in the worst case they could fall back to 1Gbps Ethernet connectivity).

Diego

On 27/10/2023 10:30, Marcus Pedersén wrote:
> Hi Diego,
> I have had a look at BeeGFS and it seems more similar to Ceph than to Gluster. It requires extra management nodes similar to Ceph, right? Second, there are no snapshots in BeeGFS, as I understand it. I know Ceph has snapshots, so for us it seems a better alternative. What is your experience with Ceph?
>
> I am sorry to hear about your problems with Gluster. From my experience we had quite some issues with Gluster when it was "young"; I think the first version we installed was 3.5 or so. It was also extremely slow, an ls took forever. But later versions have been "kind" to us and worked quite well, and file access has become really comfortable.
>
> Best regards
> Marcus
Re: [Gluster-users] State of the gluster project
Hi.

I'm also migrating to BeeGFS and CephFS (depending on usage).

What I liked most about Gluster was that files were easily recoverable from bricks even in case of disaster, and that it said it supported RDMA. But I soon found that RDMA was being phased out, and I keep finding entries that are not healing after a couple of months of (not really heavy) use, directories that can't be removed because not all files have been deleted from all the bricks, and files or directories that become inaccessible for no apparent reason.

Given that I currently have 3 nodes with 30 12TB disks each in replica 3 arbiter 1, it's become a major showstopper: I can't stop production, back up everything and restart from scratch every 3-4 months. And there are no tools helping, just log digging :( Even at version 9.6 it seems it's not really "production ready"... More like v0.9.6 IMVHO. And now it being EOLed makes it way worse.

Diego

On 27/10/2023 09:40, Zakhar Kirpichenko wrote:
> Hi,
>
> Red Hat Gluster Storage is EOL, Red Hat moved Gluster devs to other projects, so Gluster doesn't get much attention. From my experience, it has deteriorated since about version 9.0, and we're migrating to alternatives.
>
> /Z
>
> On Fri, 27 Oct 2023 at 10:29, Marcus Pedersén <marcus.peder...@slu.se> wrote:
>> Hi all,
>> I just have a general thought about the gluster project. I have got the feeling that things have slowed down in the gluster project. I have had a look at github, and to me the project seems to be slowing down: for gluster version 11 there have been no minor releases, we are still on 11.0, and I have not found any references to 11.1. There is a milestone called 12 but it seems to be stale.
>>
>> I have hit the issue https://github.com/gluster/glusterfs/issues/4085 that seems to have no solution. I noticed when version 11 was released that you could not bump the op-version to 11 and reported this, but this is still not available.
>>
>> I am just wondering if I am missing something here? We have been using gluster for many years in production and I think that gluster is great!! It has served us well over the years, and we have seen some great improvements in stability and speed. So is there something going on, or have I got the wrong impression (and feeling)?
>>
>> Best regards
>> Marcus

-- 
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786
Re: [Gluster-users] Questionmark in permission and Owner
Seen something similar when the FUSE client died, but it marked the whole mountpoint, not just some files. Might it be a desync or a communication loss between the nodes?

Diego

On 05/06/2023 11:23, Stefan Kania wrote:
> Hello,
>
> I have a strange problem on a gluster volume. If I do an "ls -l" in a directory inside a mounted gluster volume I see, only for some files, question marks for the permissions, the owner, the size and the date. Looking at the same directory on the brick itself, everything is ok. After rebooting the nodes everything is back to normal.
>
> System is Debian 11 and Gluster is version 9. The filesystem is LVM2 thin-provisioned and formatted with XFS. But as I said, the brick is ok, only the mounted volume is having the problem.
>
> Any hint what it could be?
>
> Thanks
>
> Stefan
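A few hedged checks for the next time it happens, before rebooting anything (volume name and paths are placeholders): compare what a brick and the FUSE mount report, and look for pending heals on the affected entries.

stat /srv/brick1/somedir/somefile        # directly on a brick: reported OK
stat /mnt/volume/somedir/somefile        # via FUSE: the entry showing "?" in ls -l
gluster volume heal myvol info           # are the same entries pending heal?
# if only one client's view is broken, remounting that client may be enough
umount /mnt/volume && mount -t glusterfs node1:/myvol /mnt/volume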
Re: [Gluster-users] How to configure?
After a lot of tests and unsuccessful searching, I decided to start from scratch: I'm going to ditch the old volume and create a new one.

I have 3 servers with 30 12TB disks each. Since I'm going to start a new volume, could it be better to group the disks into 10 3-disk (or 6 5-disk) RAID-0 volumes to reduce the number of bricks? Redundancy would be given by replica 2 (still undecided about arbiter vs thin-arbiter...).

Current configuration is:

root@str957-clustor00:~# gluster v info cluster_data
Volume Name: cluster_data
Type: Distributed-Replicate
Volume ID: a8caaa90-d161-45bb-a68c-278263a8531a
Status: Started
Snapshot Count: 0
Number of Bricks: 45 x (2 + 1) = 135
Transport-type: tcp
Bricks:
Brick1: clustor00:/srv/bricks/00/d
Brick2: clustor01:/srv/bricks/00/d
Brick3: clustor02:/srv/bricks/00/q (arbiter)
...
Brick133: clustor01:/srv/bricks/29/d
Brick134: clustor02:/srv/bricks/29/d
Brick135: clustor00:/srv/bricks/14/q (arbiter)
Options Reconfigured:
cluster.background-self-heal-count: 256
cluster.heal-wait-queue-length: 1
performance.quick-read: off
cluster.entry-self-heal: on
cluster.data-self-heal-algorithm: full
cluster.metadata-self-heal: on
cluster.shd-max-threads: 2
network.inode-lru-limit: 50
performance.md-cache-timeout: 600
performance.cache-invalidation: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
features.quota-deem-statfs: on
performance.readdir-ahead: on
cluster.granular-entry-heal: enable
features.scrub: Active
features.bitrot: on
cluster.lookup-optimize: on
performance.stat-prefetch: on
performance.cache-refresh-timeout: 60
performance.parallel-readdir: on
performance.write-behind-window-size: 128MB
cluster.self-heal-daemon: enable
features.inode-quota: on
features.quota: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
client.event-threads: 1
features.scrub-throttle: normal
diagnostics.brick-log-level: ERROR
diagnostics.client-log-level: ERROR
config.brick-threads: 0
cluster.lookup-unhashed: on
config.client-threads: 1
cluster.use-anonymous-inode: off
diagnostics.brick-sys-log-level: CRITICAL
features.scrub-freq: monthly
cluster.data-self-heal: on
cluster.brick-multiplex: on
cluster.daemon-log-level: ERROR

Each node is a dual-Xeon 4210 (for a total of 20 cores, 40 threads) equipped with 192GB RAM (which got exhausted quite often before enabling brick-multiplex).

Diego

On 24/03/2023 19:21, Strahil Nikolov wrote:
> Try finding if any of them is missing on one of the systems.
>
> Best Regards,
> Strahil Nikolov
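A rough sketch of the two layouts being weighed above, with only the first brick set shown and placeholder brick paths on the same three servers (a real create would list every set):

# full arbiter: one small arbiter brick per replica set
gluster volume create cluster_data replica 3 arbiter 1 \
  clustor00:/srv/raid00/brick clustor01:/srv/raid00/brick clustor02:/srv/arb00/brick

# thin arbiter: a single lightweight arbiter brick for the whole volume
gluster volume create cluster_data replica 2 thin-arbiter 1 \
  clustor00:/srv/raid00/brick clustor01:/srv/raid00/brick clustor02:/srv/thin-arbiter/brick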
Re: [Gluster-users] Performance: lots of small files, hdd, nvme etc.
Well, you have *way* more files than we do... :)

On 30/03/2023 11:26, Hu Bert wrote:
> Just an observation: is there a performance difference between a sw raid10 (10 disks -> one brick) or 5x raid1 (each raid1 a brick)

Err... RAID10 is not 10 disks unless you stripe 5 mirrors of 2 disks.

> with the same disks (10TB hdd)? The heal processes in the 5x-raid1 scenario seem faster. Just out of curiosity...

It should be, since the bricks are smaller. But given that you're using replica 3, I don't understand why you're also using RAID1: for each 10TB of user-facing capacity you're keeping 60TB of data on disks. I'd ditch the local RAIDs to double the available space, unless you desperately need the extra read performance.

> Options Reconfigured: [...]

I'll have a look at the options you use. Maybe something can be useful in our case. Tks :)

-- 
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786
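Spelling out the arithmetic behind that 60TB figure: replica 3 stores three copies of every file, and each copy lands on a RAID1 pair, so 10TB of user data costs 10TB x 3 x 2 = 60TB of raw disk. Dropping the local RAID1 leaves 10TB x 3 = 30TB, i.e. twice the usable space from the same drives.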
Re: [Gluster-users] How to configure?
There are 285 files in /var/lib/glusterd/vols/cluster_data ... including many files with names related to quorum bricks already moved to a different path (like cluster_data.client.clustor02.srv-quorum-00-d.vol, which should already have been replaced by cluster_data.clustor02.srv-bricks-00-q.vol -- and both vol files exist). Is there something I should check inside the volfiles?

Diego

On 24/03/2023 13:05, Strahil Nikolov wrote:
> Can you check your volume file contents?
> Maybe it really can't find (or access) a specific volfile?
>
> Best Regards,
> Strahil Nikolov
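A quick sketch for spotting a volfile that is missing or different on one node (host names as in the volume above): compare the list of .vol files and glusterd's own checksum file across the peers.

for h in clustor00 clustor01 clustor02; do
    echo "== $h"
    ssh "$h" 'cd /var/lib/glusterd/vols/cluster_data && ls *.vol | sort | md5sum && cat cksum'
done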
Re: [Gluster-users] How to configure?
In glfsheal-Connection.log I see many lines like:

[2023-03-13 23:04:40.241481 +0000] E [MSGID: 104021] [glfs-mgmt.c:586:glfs_mgmt_getspec_cbk] 0-gfapi: failed to get the volume file [{from server}, {errno=2}, {error=File o directory non esistente}]

And *lots* of gfid-mismatch errors in glustershd.log.

Couldn't find anything that would prevent heal from starting. :(

Diego

On 21/03/2023 20:39, Strahil Nikolov wrote:
> I have no clue. Have you checked for errors in the logs? Maybe you might find something useful.
>
> Best Regards,
> Strahil Nikolov
Re: [Gluster-users] How to configure?
Killed glfsheal; after a day there were 218 processes, then they got killed by the OOM killer during the weekend. Now there are no processes active.

Trying to run "heal info" reports lots of files quite quickly but does not spawn any glfsheal process. And neither does restarting glusterd. Is there some way to selectively run glfsheal to fix one brick at a time?

Diego

On 21/03/2023 01:21, Strahil Nikolov wrote:
> Theoretically it might help. If possible, try to resolve any pending heals.
>
> Best Regards,
> Strahil Nikolov
Re: [Gluster-users] How to configure?
In Debian, stopping glusterd does not stop the brick processes: to stop everything (and free the memory) I have to

systemctl stop glusterd
killall glusterfs{,d}
killall glfsheal
systemctl start glusterd

[this behaviour hangs a simple reboot of a machine running glusterd... not nice]

For now I just restarted glusterd without killing the bricks:

root@str957-clustor00:~# ps aux|grep glfsheal|wc -l ; systemctl restart glusterd ; ps aux|grep glfsheal|wc -l
618
618

No change, neither in glfsheal processes nor in free memory :(
Should I "killall glfsheal" before OOM kicks in?

Diego

On 16/03/2023 12:37, Strahil Nikolov wrote:
> Can you restart the glusterd service (first check that it was not modified to kill the bricks)?
>
> Best Regards,
> Strahil Nikolov
> Current volume info: > -8<-- > Volume Name: cluster_data > Type: Distributed-Replicate > Volume ID: a8caaa90-d161-45bb-a68c-278263a8531a > Status: Started > Snapshot Count: 0 > Number of Bricks: 45 x (2 + 1) = 135 > Transport-type: tcp > Bricks: > Brick1: clustor00:/srv/bricks/00/d > Brick2: clustor01:/srv/bricks/00/d > Brick3: clustor02:/srv/bricks/00/q (arbiter) > [...] > Brick133: clustor01:/srv/bricks/29/d > Brick134: clustor02:/srv/bricks/29/d > Brick135: clustor00:/srv/bricks/14/q (arbiter) > Options Reconfigured: > performance.quick-read: off > cluster.entry-self-heal: on > cluster.data-self-heal-algorithm: full > cluster.metadata-self-heal: on >
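A minimal sketch of the "full stop" sequence described at the top of this message (assuming Debian with systemd; the brick and glfsheal processes are killed explicitly because stopping glusterd alone leaves them running):
-8<--
#!/bin/sh
# Full stop of Gluster on one node (Debian): stopping glusterd alone
# leaves brick, client and glfsheal processes running, so kill them too.
systemctl stop glusterd
killall glusterfs glusterfsd    # same as killall glusterfs{,d}
killall glfsheal                # leftover heal-info helpers, if any
# ...maintenance / reboot here...
systemctl start glusterd        # glusterd respawns the brick processes
-8<--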
Re: [Gluster-users] How to configure?
OOM is just a matter of time. Today mem use is up to 177G/187 and: # ps aux|grep glfsheal|wc -l 551 (well, one is actually the grep process, so "only" 550 glfsheal processes). I'll take the last 5: root 3266352 0.5 0.0 600292 93044 ? Sl 06:55 0:07 /usr/libexec/glusterfs/glfsheal cluster_data info-summary --xml root 3267220 0.7 0.0 600292 91964 ? Sl 07:00 0:07 /usr/libexec/glusterfs/glfsheal cluster_data info-summary --xml root 3268076 1.0 0.0 600160 88216 ? Sl 07:05 0:08 /usr/libexec/glusterfs/glfsheal cluster_data info-summary --xml root 3269492 1.6 0.0 600292 91248 ? Sl 07:10 0:07 /usr/libexec/glusterfs/glfsheal cluster_data info-summary --xml root 3270354 4.4 0.0 600292 93260 ? Sl 07:15 0:07 /usr/libexec/glusterfs/glfsheal cluster_data info-summary --xml -8<-- root@str957-clustor00:~# ps -o ppid= 3266352 3266345 root@str957-clustor00:~# ps -o ppid= 3267220 3267213 root@str957-clustor00:~# ps -o ppid= 3268076 3268069 root@str957-clustor00:~# ps -o ppid= 3269492 3269485 root@str957-clustor00:~# ps -o ppid= 3270354 3270347 root@str957-clustor00:~# ps aux|grep 3266345 root 3266345 0.0 0.0 430536 10764 ? Sl 06:55 0:00 gluster volume heal cluster_data info summary --xml root 3271532 0.0 0.0 6260 2500 pts/1 S+ 07:21 0:00 grep 3266345 root@str957-clustor00:~# ps aux|grep 3267213 root 3267213 0.0 0.0 430536 10644 ? Sl 07:00 0:00 gluster volume heal cluster_data info summary --xml root 3271599 0.0 0.0 6260 2480 pts/1 S+ 07:22 0:00 grep 3267213 root@str957-clustor00:~# ps aux|grep 3268069 root 3268069 0.0 0.0 430536 10704 ? Sl 07:05 0:00 gluster volume heal cluster_data info summary --xml root 3271626 0.0 0.0 6260 2516 pts/1 S+ 07:22 0:00 grep 3268069 root@str957-clustor00:~# ps aux|grep 3269485 root 3269485 0.0 0.0 430536 10756 ? Sl 07:10 0:00 gluster volume heal cluster_data info summary --xml root 3271647 0.0 0.0 6260 2480 pts/1 S+ 07:22 0:00 grep 3269485 root@str957-clustor00:~# ps aux|grep 3270347 root 3270347 0.0 0.0 430536 10672 ? Sl 07:15 0:00 gluster volume heal cluster_data info summary --xml root 3271666 0.0 0.0 6260 2568 pts/1 S+ 07:22 0:00 grep 3270347 -8<-- Seems glfsheal is spawning more processes. I can't rule out a metadata corruption (or at least a desync), but it shouldn't happen... Diego Il 15/03/2023 20:11, Strahil Nikolov ha scritto: If you don't experience any OOM, you can focus on the heals. 284 processes of glfsheal seems odd. Can you check the ppid for 2-3 randomly picked? ps -o ppid= Best Regards, Strahil Nikolov On Wed, Mar 15, 2023 at 9:54, Diego Zuccato wrote: I enabled it yesterday and that greatly reduced memory pressure. Current volume info: -8<-- Volume Name: cluster_data Type: Distributed-Replicate Volume ID: a8caaa90-d161-45bb-a68c-278263a8531a Status: Started Snapshot Count: 0 Number of Bricks: 45 x (2 + 1) = 135 Transport-type: tcp Bricks: Brick1: clustor00:/srv/bricks/00/d Brick2: clustor01:/srv/bricks/00/d Brick3: clustor02:/srv/bricks/00/q (arbiter) [...]
Brick133: clustor01:/srv/bricks/29/d Brick134: clustor02:/srv/bricks/29/d Brick135: clustor00:/srv/bricks/14/q (arbiter) Options Reconfigured: performance.quick-read: off cluster.entry-self-heal: on cluster.data-self-heal-algorithm: full cluster.metadata-self-heal: on cluster.shd-max-threads: 2 network.inode-lru-limit: 50 performance.md-cache-timeout: 600 performance.cache-invalidation: on features.cache-invalidation-timeout: 600 features.cache-invalidation: on features.quota-deem-statfs: on performance.readdir-ahead: on cluster.granular-entry-heal: enable features.scrub: Active features.bitrot: on cluster.lookup-optimize: on performance.stat-prefetch: on performance.cache-refresh-timeout: 60 performance.parallel-readdir: on performance.write-behind-window-size: 128MB cluster.self-heal-daemon: enable features.inode-quota: on features.quota: on transport.address-family: inet nfs.disable: on performance.client-io-threads: off client.event-threads: 1 features.scrub-throttle: normal diagnostics.brick-log-level: ERROR diagnostics.client-log-level: ERROR config.brick-threads: 0 cluster.lookup-unhashed: on config.client-threads: 1 cluster.use-anonymous-inode: off diagnostics.brick-sys-log-level: CRITICAL features.scrub-freq: monthly cluster.data-self-heal: on cluster.brick-multiplex: on cluster.daemon-log-level: ERROR -8<-- htop reports that memory usage is up to 143G, there are 602 tasks and 5232 threads (~20 running) on clustor00, 117G/49 tasks/1565 threads on
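A compact form of the checks above, counting the glfsheal helpers and showing which process spawned a few of them (just a sketch; the PIDs are whatever ps reports on the node):
-8<--
# Count the glfsheal helpers (the [g] trick keeps grep from matching itself):
ps aux | grep '[g]lfsheal' | wc -l

# For a few of them, show which process spawned them:
for pid in $(pgrep -f glfsheal | tail -n 3); do
    ppid=$(ps -o ppid= -p "$pid" | tr -d ' ')
    echo "glfsheal $pid spawned by:"
    ps -o pid=,args= -p "$ppid"
done
-8<--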
Re: [Gluster-users] How to configure?
I enabled it yesterday and that greatly reduced memory pressure. Current volume info: -8<-- Volume Name: cluster_data Type: Distributed-Replicate Volume ID: a8caaa90-d161-45bb-a68c-278263a8531a Status: Started Snapshot Count: 0 Number of Bricks: 45 x (2 + 1) = 135 Transport-type: tcp Bricks: Brick1: clustor00:/srv/bricks/00/d Brick2: clustor01:/srv/bricks/00/d Brick3: clustor02:/srv/bricks/00/q (arbiter) [...] Brick133: clustor01:/srv/bricks/29/d Brick134: clustor02:/srv/bricks/29/d Brick135: clustor00:/srv/bricks/14/q (arbiter) Options Reconfigured: performance.quick-read: off cluster.entry-self-heal: on cluster.data-self-heal-algorithm: full cluster.metadata-self-heal: on cluster.shd-max-threads: 2 network.inode-lru-limit: 50 performance.md-cache-timeout: 600 performance.cache-invalidation: on features.cache-invalidation-timeout: 600 features.cache-invalidation: on features.quota-deem-statfs: on performance.readdir-ahead: on cluster.granular-entry-heal: enable features.scrub: Active features.bitrot: on cluster.lookup-optimize: on performance.stat-prefetch: on performance.cache-refresh-timeout: 60 performance.parallel-readdir: on performance.write-behind-window-size: 128MB cluster.self-heal-daemon: enable features.inode-quota: on features.quota: on transport.address-family: inet nfs.disable: on performance.client-io-threads: off client.event-threads: 1 features.scrub-throttle: normal diagnostics.brick-log-level: ERROR diagnostics.client-log-level: ERROR config.brick-threads: 0 cluster.lookup-unhashed: on config.client-threads: 1 cluster.use-anonymous-inode: off diagnostics.brick-sys-log-level: CRITICAL features.scrub-freq: monthly cluster.data-self-heal: on cluster.brick-multiplex: on cluster.daemon-log-level: ERROR -8<-- htop reports that memory usage is up to 143G, there are 602 tasks and 5232 threads (~20 running) on clustor00, 117G/49 tasks/1565 threads on clustor01 and 126G/45 tasks/1574 threads on clustor02. I see quite a lot (284!) of glfsheal processes running on clustor00 (a "gluster v heal cluster_data info summary" is running on clustor02 since yesterday, still no output). Shouldn't be just one per brick? Diego Il 15/03/2023 08:30, Strahil Nikolov ha scritto: Do you use brick multiplexing ? Best Regards, Strahil Nikolov On Tue, Mar 14, 2023 at 16:44, Diego Zuccato wrote: Hello all. Our Gluster 9.6 cluster is showing increasing problems. Currently it's composed of 3 servers (2x Intel Xeon 4210 [20 cores dual thread, total 40 threads], 192GB RAM, 30x HGST HUH721212AL5200 [12TB]), configured in replica 3 arbiter 1. Using Debian packages from Gluster 9.x latest repository. Seems 192G RAM are not enough to handle 30 data bricks + 15 arbiters and I often had to reload glusterfsd because glusterfs processed got killed for OOM. On top of that, performance have been quite bad, especially when we reached about 20M files. On top of that, one of the servers have had mobo issues that resulted in memory errors that corrupted some bricks fs (XFS, it required "xfs_reparir -L" to fix). Now I'm getting lots of "stale file handle" errors and other errors (like directories that seem empty from the client but still containing files in some bricks) and auto healing seems unable to complete. Since I can't keep up continuing to manually fix all the issues, I'm thinking about backup+destroy+recreate strategy. I think that if I reduce the number of bricks per server to just 5 (RAID1 of 6x12TB disks) I might resolve RAM issues - at the cost of longer heal times in case a disk fails. 
Am I right or it's useless? Other recommendations? Servers have space for another 6 disks. Maybe those could be used for some SSDs to speed up access? TIA. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk <https://meet.google.com/cpu-eiue-hvk> Gluster-users mailing list Gluster-users@gluster.org <mailto:Gluster-users@gluster.org> https://lists.gluster.org/mailman/listinfo/gluster-users <https://lists.gluster.org/mailman/listinfo/gluster-users> -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] How to configure?
Hello all. Our Gluster 9.6 cluster is showing increasing problems. Currently it's composed of 3 servers (2x Intel Xeon 4210 [20 cores dual thread, total 40 threads], 192GB RAM, 30x HGST HUH721212AL5200 [12TB]), configured in replica 3 arbiter 1. Using Debian packages from Gluster 9.x latest repository. Seems 192GB RAM is not enough to handle 30 data bricks + 15 arbiters, and I often had to reload glusterfsd because glusterfs processes got killed for OOM. On top of that, performance has been quite bad, especially when we reached about 20M files. Moreover, one of the servers has had mobo issues that resulted in memory errors that corrupted some bricks' filesystems (XFS, it required "xfs_repair -L" to fix). Now I'm getting lots of "stale file handle" errors and other errors (like directories that seem empty from the client but still contain files on some bricks) and auto healing seems unable to complete. Since I can't keep up with manually fixing all the issues, I'm thinking about a backup+destroy+recreate strategy. I think that if I reduce the number of bricks per server to just 5 (RAID1 of 6x12TB disks) I might resolve the RAM issues - at the cost of longer heal times in case a disk fails. Am I right, or is it useless? Other recommendations? Servers have space for another 6 disks. Maybe those could be used for some SSDs to speed up access? TIA. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Quick way to fix stale gfids?
My volume is replica 3 arbiter 1, maybe that makes a difference? Bricks processes tend to die quite often (I have to restart glusterd at least once a day because "gluster v info | grep ' N '" reports at least one missing brick; sometimes even if all bricks are reported up I have to kill all glusterfs[d] processes and restart glusterd). The 3 servers have 192GB RAM (that should be way more than enough!), 30 data bricks and 15 arbiters (the arbiters share a single SSD). And I noticed that some "stale file handle" are not reported by heal info. root@str957-cluster:/# ls -l /scratch/extra/m**/PNG/PNGQuijote/ModGrav/fNL40/ ls: cannot access '/scratch/extra/m**/PNG/PNGQuijote/ModGrav/fNL40/output_21': Stale file handle total 40 d? ? ?? ?? output_21 ... but "gluster v heal cluster_data info |grep output_21" returns nothing. :( Seems the other stale handles either got corrected by subsequent 'stat's or became I/O errors. Diego. Il 12/02/2023 21:34, Strahil Nikolov ha scritto: The 2-nd error indicates conflicts between the nodes. The only way that could happen on replica 3 is gfid conflict (file/dir was renamed or recreated). Are you sure that all bricks are online? Usually 'Transport endpoint is not connected' indicates a brick down situation. First start with all stale file handles: check md5sum on all bricks. If it differs somewhere, delete the gfid and move the file away from the brick and check in FUSE. If it's fine , touch it and the FUSE client will "heal" it. Best Regards, Strahil Nikolov On Tue, Feb 7, 2023 at 16:33, Diego Zuccato wrote: The contents do not match exactly, but the only difference is the "option shared-brick-count" line that sometimes is 0 and sometimes 1. The command you gave could be useful for the files that still needs healing with the source still present, but the files related to the stale gfids have been deleted, so "find -samefile" won't find anything. For the other files reported by heal info, I saved the output to 'healinfo', then: for T in $(grep '^/' healinfo |sort|uniq); do stat /mnt/scratch$T > /dev/null; done but I still see a lot of 'Transport endpoint is not connected' and 'Stale file handle' errors :( And many 'No such file or directory'... I don't understand the first two errors, since /mnt/scratch have been freshly mounted after enabling client healing, and gluster v info does not highlight unconnected/down bricks. Diego Il 06/02/2023 22:46, Strahil Nikolov ha scritto: > I'm not sure if the md5sum has to match , but at least the content > should do. > In modern versions of GlusterFS the client side healing is disabled , > but it's worth trying. > You will need to enable cluster.metadata-self-heal, > cluster.data-self-heal and cluster.entry-self-heal and then create a > small one-liner that identifies the names of the files/dirs from the > volume heal ,so you can stat them through the FUSE. > > Something like this: > > > for i in $(gluster volume heal info | awk -F '' '/gfid:/ > {print $2}'); do find /PATH/TO/BRICK/ -samefile > /PATH/TO/BRICK/.glusterfs/${i:0:2}/${i:2:2}/$i | awk '!/.glusterfs/ > {gsub("/PATH/TO/BRICK", "stat /MY/FUSE/MOUNTPOINT", $0); print $0}' ; done > > Then Just copy paste the output and you will trigger the client side > heal only on the affected gfids. > > Best Regards, > Strahil Nikolov > В понеделник, 6 февруари 2023 г., 10:19:02 ч. Гринуич+2, Diego Zuccato > mailto:diego.zucc...@unibo.it>> написа: > > > Ops... 
Reincluding the list that got excluded in my previous answer :( > > I generated md5sums of all files in vols/ on clustor02 and compared to > the other nodes (clustor00 and clustor01). > There are differences in volfiles (shouldn't it always be 1, since every > data brick is on its own fs? quorum bricks, OTOH, share a single > partition on SSD and should always be 15, but in both cases sometimes > it's 0). > > I nearly got a stroke when I saw diff output for 'info' files, but once > I sorted 'em their contents matched. Pfhew! > > Diego > > Il 03/02/2023 19:01, Strahil Nikolov ha scritto: > > This one doesn't look good: > > > > > > [2023-02-03 07:45:46.896924 +] E [MSGID: 114079] > > [client-handshake.c:1253:client_query_portmap] 0-cluster_data-client-48: > > remote-subvolume not set in volfile [] > > > > > >
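A whitespace-safe variant of the stat loop quoted above; a sketch assuming the volume is mounted at /mnt/scratch and the heal info output was saved to a file named healinfo (it only touches entries reported by path, not bare gfid entries):
-8<--
# Re-stat every path reported by "gluster v heal cluster_data info" through
# the FUSE mount, handling blanks in file names; errors end up in a log.
grep '^/' healinfo | sort -u | while IFS= read -r p; do
    stat "/mnt/scratch$p" > /dev/null 2>> stat-errors.log
done
wc -l stat-errors.log   # how many entries still fail (ESTALE, ENOENT, ...)
-8<--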
Re: [Gluster-users] Quick way to fix stale gfids?
The contents do not match exactly, but the only difference is the "option shared-brick-count" line that sometimes is 0 and sometimes 1. The command you gave could be useful for the files that still needs healing with the source still present, but the files related to the stale gfids have been deleted, so "find -samefile" won't find anything. For the other files reported by heal info, I saved the output to 'healinfo', then: for T in $(grep '^/' healinfo |sort|uniq); do stat /mnt/scratch$T > /dev/null; done but I still see a lot of 'Transport endpoint is not connected' and 'Stale file handle' errors :( And many 'No such file or directory'... I don't understand the first two errors, since /mnt/scratch have been freshly mounted after enabling client healing, and gluster v info does not highlight unconnected/down bricks. Diego Il 06/02/2023 22:46, Strahil Nikolov ha scritto: I'm not sure if the md5sum has to match , but at least the content should do. In modern versions of GlusterFS the client side healing is disabled , but it's worth trying. You will need to enable cluster.metadata-self-heal, cluster.data-self-heal and cluster.entry-self-heal and then create a small one-liner that identifies the names of the files/dirs from the volume heal ,so you can stat them through the FUSE. Something like this: for i in $(gluster volume heal info | awk -F '' '/gfid:/ {print $2}'); do find /PATH/TO/BRICK/ -samefile /PATH/TO/BRICK/.glusterfs/${i:0:2}/${i:2:2}/$i | awk '!/.glusterfs/ {gsub("/PATH/TO/BRICK", "stat /MY/FUSE/MOUNTPOINT", $0); print $0}' ; done Then Just copy paste the output and you will trigger the client side heal only on the affected gfids. Best Regards, Strahil Nikolov В понеделник, 6 февруари 2023 г., 10:19:02 ч. Гринуич+2, Diego Zuccato написа: Ops... Reincluding the list that got excluded in my previous answer :( I generated md5sums of all files in vols/ on clustor02 and compared to the other nodes (clustor00 and clustor01). There are differences in volfiles (shouldn't it always be 1, since every data brick is on its own fs? quorum bricks, OTOH, share a single partition on SSD and should always be 15, but in both cases sometimes it's 0). I nearly got a stroke when I saw diff output for 'info' files, but once I sorted 'em their contents matched. Pfhew! Diego Il 03/02/2023 19:01, Strahil Nikolov ha scritto: > This one doesn't look good: > > > [2023-02-03 07:45:46.896924 +] E [MSGID: 114079] > [client-handshake.c:1253:client_query_portmap] 0-cluster_data-client-48: > remote-subvolume not set in volfile [] > > > Can you compare all vol files in /var/lib/glusterd/vols/ between the nodes ? > I have the suspicioun that there is a vol file mismatch (maybe > /var/lib/glusterd/vols//*-shd.vol). > > Best Regards, > Strahil Nikolov > > On Fri, Feb 3, 2023 at 12:20, Diego Zuccato > mailto:diego.zucc...@unibo.it>> wrote: > Can't see anything relevant in glfsheal log, just messages related to > the crash of one of the nodes (the one that had the mobo replaced... I > fear some on-disk structures could have been silently damaged by RAM > errors and that makes gluster processes crash, or it's just an issue > with enabling brick-multiplex). 
> -8<-- > [2023-02-03 07:45:46.896924 +] E [MSGID: 114079] > [client-handshake.c:1253:client_query_portmap] > 0-cluster_data-client-48: > remote-subvolume not set in volfile [] > [2023-02-03 07:45:46.897282 +] E > [rpc-clnt.c:331:saved_frames_unwind] (--> > /lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x195)[0x7fce0c867b95] > (--> /lib/x86_64-linux-gnu/libgfrpc.so.0(+0x72fc)[0x7fce0c0ca2fc] (--> > /lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x109)[0x7fce0c0d2419] > (--> /lib/x86_64-linux-gnu/libgfrpc.so.0(+0x10308)[0x7fce0c0d3308] (--> > /lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_notify+0x26)[0x7fce0c0ce7e6] > ) 0-cluster_data-client-48: forced unwinding frame type(GF-DUMP) > op(NULL(2)) called at 2023-02-03 07:45:46.891054 + (xid=0x13) > -8<-- > > Well, actually I *KNOW* the files outside .glusterfs have been deleted > (by me :) ). That's why I call those 'stale' gfids. > Affected entries under .glusterfs have usually link count = 1 => > nothing > 'find' can find. > Since I already recovered those files (before deleting from bricks), > can > .glusterfs entries be deleted too or should I check something else? > Maybe I should create a script that finds all files/dirs (not symlinks, > IIUC) in .glusterfs on all bricks/arbiters and moves 'em to a temp dir? > > Diego > > Il 02/02/2023
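A sketch of the cleanup script mentioned above (brick path and destination are placeholders; it only selects regular files whose link count dropped to 1, and it is worth dry-running with ls before switching to mv):
-8<--
BRICK=/srv/bricks/00/d                    # placeholder: repeat per brick/arbiter
DEST=/root/stale-gfids/$(hostname)-00     # placeholder destination
mkdir -p "$DEST"
# Regular gfid entries normally have link count >= 2 (gfid path + named path);
# link count 1 means the named file was removed directly from the brick.
# Dry-run first by replacing "mv -t ..." with "ls -l".
find "$BRICK/.glusterfs" -regextype posix-extended \
     -regex '.*/[0-9a-f]{2}/[0-9a-f]{2}/[0-9a-f-]{36}' \
     -type f -links 1 -print0 | xargs -0 -r mv -t "$DEST"
-8<--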
Re: [Gluster-users] Quick way to fix stale gfids?
Ops... Reincluding the list that got excluded in my previous answer :( I generated md5sums of all files in vols/ on clustor02 and compared to the other nodes (clustor00 and clustor01). There are differences in volfiles (shouldn't it always be 1, since every data brick is on its own fs? quorum bricks, OTOH, share a single partition on SSD and should always be 15, but in both cases sometimes it's 0). I nearly got a stroke when I saw diff output for 'info' files, but once I sorted 'em their contents matched. Pfhew! Diego Il 03/02/2023 19:01, Strahil Nikolov ha scritto: This one doesn't look good: [2023-02-03 07:45:46.896924 +] E [MSGID: 114079] [client-handshake.c:1253:client_query_portmap] 0-cluster_data-client-48: remote-subvolume not set in volfile [] Can you compare all vol files in /var/lib/glusterd/vols/ between the nodes ? I have the suspicioun that there is a vol file mismatch (maybe /var/lib/glusterd/vols//*-shd.vol). Best Regards, Strahil Nikolov On Fri, Feb 3, 2023 at 12:20, Diego Zuccato wrote: Can't see anything relevant in glfsheal log, just messages related to the crash of one of the nodes (the one that had the mobo replaced... I fear some on-disk structures could have been silently damaged by RAM errors and that makes gluster processes crash, or it's just an issue with enabling brick-multiplex). -8<-- [2023-02-03 07:45:46.896924 +] E [MSGID: 114079] [client-handshake.c:1253:client_query_portmap] 0-cluster_data-client-48: remote-subvolume not set in volfile [] [2023-02-03 07:45:46.897282 +] E [rpc-clnt.c:331:saved_frames_unwind] (--> /lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x195)[0x7fce0c867b95] (--> /lib/x86_64-linux-gnu/libgfrpc.so.0(+0x72fc)[0x7fce0c0ca2fc] (--> /lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x109)[0x7fce0c0d2419] (--> /lib/x86_64-linux-gnu/libgfrpc.so.0(+0x10308)[0x7fce0c0d3308] (--> /lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_notify+0x26)[0x7fce0c0ce7e6] ) 0-cluster_data-client-48: forced unwinding frame type(GF-DUMP) op(NULL(2)) called at 2023-02-03 07:45:46.891054 + (xid=0x13) -8<-- Well, actually I *KNOW* the files outside .glusterfs have been deleted (by me :) ). That's why I call those 'stale' gfids. Affected entries under .glusterfs have usually link count = 1 => nothing 'find' can find. Since I already recovered those files (before deleting from bricks), can .glusterfs entries be deleted too or should I check something else? Maybe I should create a script that finds all files/dirs (not symlinks, IIUC) in .glusterfs on all bricks/arbiters and moves 'em to a temp dir? Diego Il 02/02/2023 23:35, Strahil Nikolov ha scritto: > Any issues reported in /var/log/glusterfs/glfsheal-*.log ? > > The easiest way to identify the affected entries is to run: > find /FULL/PATH/TO/BRICK/ -samefile > /FULL/PATH/TO/BRICK/.glusterfs/57/e4/57e428c7-6bed-4eb3-b9bd-02ca4c46657a > > > Best Regards, > Strahil Nikolov > > > В вторник, 31 януари 2023 г., 11:58:24 ч. Гринуич+2, Diego Zuccato > mailto:diego.zucc...@unibo.it>> написа: > > > Hello all. > > I've had one of the 3 nodes serving a "replica 3 arbiter 1" down for > some days (apparently RAM issues, but actually failing mobo). > The other nodes have had some issues (RAM exhaustion, old problem > already ticketed but still no solution) and some brick processes > coredumped. Restarting the processes allowed the cluster to continue > working. Mostly. 
> > After the third server got fixed I started a heal, but files didn't get > healed and count (by "ls -l > /srv/bricks/*/d/.glusterfs/indices/xattrop/|grep ^-|wc -l") did not > decrease over 2 days. So, to recover I copied files from bricks to temp > storage (keeping both copies of conflicting files with different > contents), removed files on bricks and arbiters, and finally copied back > from temp storage to the volume. > > Now the files are accessible but I still see lots of entries like > > > IIUC that's due to a mismatch between .glusterfs/ contents and normal > hierarchy. Is there some tool to speed up the cleanup? > > Tks. > > -- > Diego Zuccato > DIFA - Dip. di Fisica e Astronomia > Servizi Informatici > Alma Mater Studiorum - Università di Bologna > V.le Berti-Pichat 6/2 - 40127 Bologna - Italy > tel.: +39 051 20 95786 > > > > > Community Meeting Calendar: > &g
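The volfile comparison described above can be scripted roughly like this (hostnames as used in this thread; assumes root SSH between the nodes):
-8<--
VOLDIR=/var/lib/glusterd/vols/cluster_data
cd "$VOLDIR" || exit 1
find . -type f -exec md5sum {} + | sort -k2 > /tmp/vols.local
for h in clustor00 clustor01; do
    ssh "$h" "cd $VOLDIR && find . -type f -exec md5sum {} + | sort -k2" > /tmp/vols.$h
    echo "=== differences vs $h ==="
    diff /tmp/vols.local "/tmp/vols.$h"
done
-8<--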
Re: [Gluster-users] [Gluster-devel] Regarding Glusterfs file locking
w21v6MqByaqQXxNXfIu_8nDGQD8EEStnhIl-Z9rpRbcbOmmg9ZOkU1ATnFJWyzPFNRdREsAw2g-BW2quWfglxYjdcUYrf63ntrYgrg8ZEDOgMzp8pV0psisEjmHR57IuTgPjs7iZWes9nG_yBsP6yBmLPtWSKfIGj4Diu01fwJfIG3EKXlE4xtia9TqEAj7nTcAMx1_dqKyjCgDU7ZhN-S8XQ9RWlp7OVKQ0GEPM-CSJozOXukVWlM00zAGfmPVfQAI_DmCap5bB6BXhAiIB9LXqWWDi8nrR5/https%3A%2F%2Flists.gluster.org%2Fmailman%2Flistinfo%2Fgluster-devel> NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference. Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] Quick way to fix stale gfids?
Hello all. I've had one of the 3 nodes serving a "replica 3 arbiter 1" down for some days (apparently RAM issues, but actually failing mobo). The other nodes have had some issues (RAM exhaustion, old problem already ticketed but still no solution) and some brick processes coredumped. Restarting the processes allowed the cluster to continue working. Mostly. After the third server got fixed I started a heal, but files didn't get healed and count (by "ls -l /srv/bricks/*/d/.glusterfs/indices/xattrop/|grep ^-|wc -l") did not decrease over 2 days. So, to recover I copied files from bricks to temp storage (keeping both copies of conflicting files with different contents), removed files on bricks and arbiters, and finally copied back from temp storage to the volume. Now the files are accessible but I still see lots of entries like IIUC that's due to a mismatch between .glusterfs/ contents and normal hierarchy. Is there some tool to speed up the cleanup? Tks. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
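The same backlog count can be printed per brick, which makes it easier to see whether heals are progressing at all (a sketch, brick layout as in this thread):
-8<--
# Per-brick heal backlog: regular files under indices/xattrop are (roughly)
# the entries still waiting to be healed on that brick.
for d in /srv/bricks/*/d; do
    n=$(ls -l "$d/.glusterfs/indices/xattrop/" 2>/dev/null | grep -c '^-')
    printf '%-25s %s\n' "$d" "$n"
done
-8<--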
Re: [Gluster-users] Doubts re: remove-brick
Il 20/11/2022 09:39, Strahil Nikolov ha scritto: Have you checked https://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Features/rebalance/ ? I know it's old but it might provide some clarity. The files are removed from the source subvolume to the new subvolume. Ok, Tks! RH's numbering is really confusing. Why can't they simply use the official release number? :( And having lots of docs paywalled doesn't help either. Removed bricks do not get any writes, as during the preparation - a rebalance is issued which notifies the clients to use the new DHT subvolume. IIUC there's a mismatch between what you say and the warning I get when starting a remove-brick operation: "It is recommended that remove-brick be run with cluster.force-migration option disabled to prevent possible data corruption. Doing so will ensure that files that receive writes during migration will not be migrated and will need to be manually copied after the remove-brick commit operation. Please check the value of the option and update accordingly. Do you want to continue with your current cluster.force-migration settings? (y/n)" If files being migrated don't receive writes (I assume "on the original brick"), then why is that note needed? Most probably I'm missing some vital piece of information. [BTW my cluster.force-migration is already off... that warning is a long standing issue that seems is not easily fixable] -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
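For completeness, a sketch of the remove-brick sequence being discussed (volume and brick names are placeholders; on a replicated volume whole replica sets must be removed together, and the force-migration check mirrors the warning quoted above):
-8<--
VOL=cluster_data                                        # placeholder
BRICKS="srv1:/srv/bricks/29/d srv2:/srv/bricks/29/d"    # placeholders (whole replica set)
# Keep force-migration off, as the warning suggests:
gluster volume get $VOL cluster.force-migration
gluster volume set $VOL cluster.force-migration off
# Drain the bricks, watch progress, commit only when status says completed:
gluster volume remove-brick $VOL $BRICKS start
gluster volume remove-brick $VOL $BRICKS status
gluster volume remove-brick $VOL $BRICKS commit
-8<--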
[Gluster-users] Doubts re: remove-brick
Hello all. I need to reorganize the bricks (making RAID1 on the backing devices to reduce memory used by Gluster processes) and I have a couple of doubts: - do moved (rebalanced) files get removed from source bricks so at the end I only have the files that received writes? - do bricks being removed continue getting writes for new files? Tks. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Gluster 5.10 rebalance stuck
I think I've been in a similar situation. "Solved" by creating a new volume on a new set of bricks on the same disks and moving data to new volume. Then just deleted old volume and relative bricks. Quite sure there's a better way, but that was nearly-static data and the move was a faster fix. Diego Il 02/11/2022 08:05, Shreyansh Shah ha scritto: Hi, I Would really appreciate it if someone would be able to help on the above issue. We are stuck as we cannot run rebalance due to this and thus are not able to extract peak performance from the setup due to unbalanced data. Adding gluster info (without the bricks) below. Please let me know if any other details/logs are needed. Volume Name: data Type: Distribute Volume ID: 75410231-bb25-4f14-bcde-caf18fce1d31 Status: Started Snapshot Count: 0 Number of Bricks: 41 Transport-type: tcp Options Reconfigured: server.event-threads: 4 network.ping-timeout: 90 client.keepalive-time: 60 server.keepalive-time: 60 storage.health-check-interval: 60 performance.client-io-threads: on nfs.disable: on transport.address-family: inet performance.cache-size: 8GB performance.cache-refresh-timeout: 60 cluster.min-free-disk: 3% client.event-threads: 4 performance.io-thread-count: 16 On Fri, Oct 28, 2022 at 11:40 AM Shreyansh Shah mailto:shreyansh.s...@alpha-grep.com>> wrote: Hi, We are running glusterfs 5.10 server volume. Recently we added a few new bricks and started a rebalance operation. After a couple of days the rebalance operation was just stuck, with one of the peers showing In-Progress with no file being read/transferred and the rest showing Failed/Completed, so we stopped it using "gluster volume rebalance data stop". Now when we are trying to start it again, we get the below error. Any assistance would be appreciated root@gluster-11:~# gluster volume rebalance data status volume rebalance: data: failed: Rebalance not started for volume data. root@gluster-11:~# gluster volume rebalance data start volume rebalance: data: failed: Rebalance on data is already started root@gluster-11:~# gluster volume rebalance data stop volume rebalance: data: failed: Rebalance not started for volume data. -- Regards, Shreyansh Shah AlphaGrep* Securities Pvt. Ltd.* -- Regards, Shreyansh Shah AlphaGrep* Securities Pvt. Ltd.* Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] bitd.log and quotad.log flooding /var
Seems it's accumulating again. ATM it's like this: root 2134553 2.1 11.2 23071940 22091644 ? Ssl set23 1059:58 /usr/sbin/glusterfs -s localhost --volfile-id gluster/quotad -p /var/run/gluster/quotad/quotad.pid -l /var/log/glusterfs/quotad.log -S /var/run/gluster/321cad6822171c64.socket --process-name quotad Uptime is 77d. The other 2 nodes are in the same situation. Gluster is 9.5-1 amd64. Is it latest enough or should I plan a migration to 10? Hints? Diego Il 12/08/2022 22:18, Strahil Nikolov ha scritto: 75GB -> that's definately a memory leak. What version do you use ? If latest - open a github issue. Best Regards, Strahil Nikolov On Thu, Aug 11, 2022 at 10:06, Diego Zuccato wrote: Yup. Seems the /etc/sysconfig/glusterd setting got finally applied and I now have a process like this: root 4107315 0.0 0.0 529244 40124 ? Ssl ago08 2:44 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level ERROR but bitd still spits out (some) 'I' lines [2022-08-11 07:02:21.072943 +] I [MSGID: 118016] [bit-rot.c:1052:bitd_oneshot_crawl] 0-cluster_data-bit-rot-0: Triggering signing [{path=/extra/some/other/dirs/file.dat}, {gfid=3e35b158-35a6-4e63-adbd-41075a11022e}, {Brick-path=/srv/bricks/00/d}] Moreover I've had to disable quota, since quota processes were eating more than *75GB* RAM on each storage node! :( Il 11/08/2022 07:12, Strahil Nikolov ha scritto: > Have you decreased glusterd log level via: > glusterd --log-level WARNING|ERROR > > It seems that bitrot doesn't have it's own log level. > > As a workaround, you can configure syslog to send the logs only remotely > and thus preventing the overfill of the /var . > > > Best Regards, > Strahil Nikolov > > On Wed, Aug 10, 2022 at 7:52, Diego Zuccato > mailto:diego.zucc...@unibo.it>> wrote: > Hi Strahil. > > Sure. Luckily I didn't delete 'em all :) > > From bitd.log: > -8<-- > [2022-08-09 05:58:12.075999 +] I [MSGID: 118016] > [bit-rot.c:1052:bitd_oneshot_crawl] 0-cluster_data-bit-rot-0: > Triggering > signing [{path=/astro/...omisis.../file.dat}, > {gfid=5956af24-5efc-496c-8d7e-ea6656f298de}, > {Brick-path=/srv/bricks/10/d}] > [2022-08-09 05:58:12.082264 +] I [MSGID: 118016] > [bit-rot.c:1052:bitd_oneshot_crawl] 0-cluster_data-bit-rot-0: > Triggering > signing [{path=/astro/...omisis.../file.txt}, > {gfid=afb75c03-0d29-414e-917a-ff718982c849}, > {Brick-path=/srv/bricks/13/d}] > [2022-08-09 05:58:12.082267 +] I [MSGID: 118016] > [bit-rot.c:1052:bitd_oneshot_crawl] 0-cluster_data-bit-rot-0: > Triggering > signing [{path=/astro/...omisis.../file.dat}, > {gfid=982bc7a8-d4ba-45d7-9104-044e5d446802}, > {Brick-path=/srv/bricks/06/d}] > [2022-08-09 05:58:12.084960 +] I [MSGID: 118016] > [bit-rot.c:1052:bitd_oneshot_crawl] 0-cluster_data-bit-rot-0: > Triggering > signing [{path=/atmos/...omisis.../file}, > {gfid=17e4dfb0-1f64-47a3-9aa8-b3fa05b7cd4e}, > {Brick-path=/srv/bricks/15/d}] > [2022-08-09 05:58:12.089357 +] I [MSGID: 118016] > [bit-rot.c:1052:bitd_oneshot_crawl] 0-cluster_data-bit-rot-0: > Triggering > signing [{path=/astro/...omisis.../file.txt}, > {gfid=e70bf289-5aeb-43c2-aadd-d18979cf62b5}, > {Brick-path=/srv/bricks/00/d}] > [2022-08-09 05:58:12.094440 +] I [MSGID: 100011] > [glusterfsd.c:1511:reincarnate] 0-glusterfsd: Fetching the volume file > from server... 
[] > [2022-08-09 05:58:12.096299 +] I > [glusterfsd-mgmt.c:2170:mgmt_getspec_cbk] 0-glusterfs: Received list of > available volfile servers: clustor00:24007 clustor02:24007 > [2022-08-09 05:58:12.096653 +] I [MSGID: 101221] > [common-utils.c:3851:gf_set_volfile_server_common] 0-gluster: duplicate > entry for volfile-server [{errno=17}, {error=File già esistente}] > [2022-08-09 05:58:12.096853 +] I > [glusterfsd-mgmt.c:2203:mgmt_getspec_cbk] 0-glusterfs: No change in > volfile,continuing > [2022-08-09 05:58:12.096702 +] I [MSGID: 101221] > [common-utils.c:3851:gf_set_volfile_server_common] 0-gluster: duplicate > entry for volfile-server [{errno=17}, {error=File già esistente}] > [2022-08-09 05:58:12.102176 +] I [MSGID: 118016] > [bit-rot.c:1052:bitd_oneshot_crawl] 0-cluster_data-bit-rot-0: > Triggering > signin
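To put numbers on the leak, something like this can log the resident memory of the gluster daemons over time (a sketch; interval and log path are arbitrary):
-8<--
# Log a timestamped RSS (KiB) snapshot of glusterd, quotad and bitd every 5 minutes.
while sleep 300; do
    {
        date '+%F %T'
        ps -C glusterd -o pid=,rss=,comm=
        ps -eo pid=,rss=,args= | grep -E -- '--process-name (quotad|bitd)'
    } >> /var/log/gluster-rss.log
done
-8<--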
Re: [Gluster-users] bitd.log and quotad.log flooding /var
Yup. Seems the /etc/sysconfig/glusterd setting got finally applied and I now have a process like this: root 4107315 0.0 0.0 529244 40124 ?Ssl ago08 2:44 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level ERROR but bitd still spits out (some) 'I' lines [2022-08-11 07:02:21.072943 +] I [MSGID: 118016] [bit-rot.c:1052:bitd_oneshot_crawl] 0-cluster_data-bit-rot-0: Triggering signing [{path=/extra/some/other/dirs/file.dat}, {gfid=3e35b158-35a6-4e63-adbd-41075a11022e}, {Brick-path=/srv/bricks/00/d}] Moreover I've had to disable quota, since quota processes were eating more than *75GB* RAM on each storage node! :( Il 11/08/2022 07:12, Strahil Nikolov ha scritto: Have you decreased glusterd log level via: glusterd --log-level WARNING|ERROR It seems that bitrot doesn't have it's own log level. As a workaround, you can configure syslog to send the logs only remotely and thus preventing the overfill of the /var . Best Regards, Strahil Nikolov On Wed, Aug 10, 2022 at 7:52, Diego Zuccato wrote: Hi Strahil. Sure. Luckily I didn't delete 'em all :) From bitd.log: -8<-- [2022-08-09 05:58:12.075999 +] I [MSGID: 118016] [bit-rot.c:1052:bitd_oneshot_crawl] 0-cluster_data-bit-rot-0: Triggering signing [{path=/astro/...omisis.../file.dat}, {gfid=5956af24-5efc-496c-8d7e-ea6656f298de}, {Brick-path=/srv/bricks/10/d}] [2022-08-09 05:58:12.082264 +] I [MSGID: 118016] [bit-rot.c:1052:bitd_oneshot_crawl] 0-cluster_data-bit-rot-0: Triggering signing [{path=/astro/...omisis.../file.txt}, {gfid=afb75c03-0d29-414e-917a-ff718982c849}, {Brick-path=/srv/bricks/13/d}] [2022-08-09 05:58:12.082267 +] I [MSGID: 118016] [bit-rot.c:1052:bitd_oneshot_crawl] 0-cluster_data-bit-rot-0: Triggering signing [{path=/astro/...omisis.../file.dat}, {gfid=982bc7a8-d4ba-45d7-9104-044e5d446802}, {Brick-path=/srv/bricks/06/d}] [2022-08-09 05:58:12.084960 +] I [MSGID: 118016] [bit-rot.c:1052:bitd_oneshot_crawl] 0-cluster_data-bit-rot-0: Triggering signing [{path=/atmos/...omisis.../file}, {gfid=17e4dfb0-1f64-47a3-9aa8-b3fa05b7cd4e}, {Brick-path=/srv/bricks/15/d}] [2022-08-09 05:58:12.089357 +] I [MSGID: 118016] [bit-rot.c:1052:bitd_oneshot_crawl] 0-cluster_data-bit-rot-0: Triggering signing [{path=/astro/...omisis.../file.txt}, {gfid=e70bf289-5aeb-43c2-aadd-d18979cf62b5}, {Brick-path=/srv/bricks/00/d}] [2022-08-09 05:58:12.094440 +] I [MSGID: 100011] [glusterfsd.c:1511:reincarnate] 0-glusterfsd: Fetching the volume file from server... 
[] [2022-08-09 05:58:12.096299 +] I [glusterfsd-mgmt.c:2170:mgmt_getspec_cbk] 0-glusterfs: Received list of available volfile servers: clustor00:24007 clustor02:24007 [2022-08-09 05:58:12.096653 +] I [MSGID: 101221] [common-utils.c:3851:gf_set_volfile_server_common] 0-gluster: duplicate entry for volfile-server [{errno=17}, {error=File già esistente}] [2022-08-09 05:58:12.096853 +] I [glusterfsd-mgmt.c:2203:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing [2022-08-09 05:58:12.096702 +] I [MSGID: 101221] [common-utils.c:3851:gf_set_volfile_server_common] 0-gluster: duplicate entry for volfile-server [{errno=17}, {error=File già esistente}] [2022-08-09 05:58:12.102176 +] I [MSGID: 118016] [bit-rot.c:1052:bitd_oneshot_crawl] 0-cluster_data-bit-rot-0: Triggering signing [{path=/astro/...omisis.../file.dat}, {gfid=45f59e3f-eef4-4ccf-baac-bc8bf10c5ced}, {Brick-path=/srv/bricks/09/d}] [2022-08-09 05:58:12.106120 +] I [MSGID: 118016] [bit-rot.c:1052:bitd_oneshot_crawl] 0-cluster_data-bit-rot-0: Triggering signing [{path=/astro/...omisis.../file.txt}, {gfid=216832dd-0a1c-4593-8a9e-f54d70efc637}, {Brick-path=/srv/bricks/13/d}] -8<-- And from quotad.log: -<-- [2022-08-09 05:58:12.291030 +] I [glusterfsd-mgmt.c:2170:mgmt_getspec_cbk] 0-glusterfs: Received list of available volfile servers: clustor00:24007 clustor02:24007 [2022-08-09 05:58:12.291143 +] I [MSGID: 101221] [common-utils.c:3851:gf_set_volfile_server_common] 0-gluster: duplicate entry for volfile-server [{errno=17}, {error=File già esistente}] [2022-08-09 05:58:12.291653 +] I [glusterfsd-mgmt.c:2203:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing [2022-08-09 05:58:12.292990 +] I [glusterfsd-mgmt.c:2170:mgmt_getspec_cbk] 0-glusterfs: Received list of available volfile servers: clustor00:24007 clustor02:24007 [2022-08-09 05:58:12.293204 +] I [glusterfsd-mgmt.c:2170:mgmt_getspec_cbk] 0-glusterfs: Received list of available volfile servers: clustor00:24007 clustor02:24007 [2022-08-09 05:58:12.293500 +] I [glusterfsd-mgmt.c:2203:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing [2
Re: [Gluster-users] bitd.log and quotad.log flooding /var
] 0-gluster: duplicate entry for volfile-server [{errno=17}, {error=File già esistente}] [2022-08-09 22:00:07.364719 +] I [MSGID: 100011] [glusterfsd.c:1511:reincarnate] 0-glusterfsd: Fetching the volume file from server... [] [2022-08-09 22:00:07.374040 +] I [glusterfsd-mgmt.c:2170:mgmt_getspec_cbk] 0-glusterfs: Received list of available volfile servers: clustor00:24007 clustor02:24007 [2022-08-09 22:00:07.374099 +] I [MSGID: 101221] [common-utils.c:3851:gf_set_volfile_server_common] 0-gluster: duplicate entry for volfile-server [{errno=17}, {error=File già esistente}] [2022-08-09 22:00:07.374569 +] I [glusterfsd-mgmt.c:2203:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing [2022-08-09 22:00:07.385610 +] I [glusterfsd-mgmt.c:2170:mgmt_getspec_cbk] 0-glusterfs: Received list of available volfile servers: clustor00:24007 clustor02:24007 [2022-08-09 22:00:07.386119 +] I [glusterfsd-mgmt.c:2203:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing -8<-- I've now used gluster v set cluster_data diagnostics.brick-sys-log-level CRITICAL and rate of filling decreased, but I still see many 'I' lines :( Using Gluster 9.5 packages from deb [arch=amd64] https://download.gluster.org/pub/gluster/glusterfs/9/LATEST/Debian/bullseye/amd64/apt bullseye main Tks, Diego Il 09/08/2022 22:08, Strahil Nikolov ha scritto: Hey Diego, can you show a sample of such Info entries ? Best Regards, Strahil Nikolov On Mon, Aug 8, 2022 at 15:59, Diego Zuccato wrote: Hello all. Lately, I noticed some hickups in our Gluster volume. It's a "replica 3 arbiter 1" with many bricks (currently 90 data bricks over 3 servers). I tried to reduce log level by setting diagnostics.brick-log-level: ERROR diagnostics.client-log-level: ERROR and creating /etc/default/glusterd containing "LOG_LEVEL=ERROR". But I still see a lot of 'I' lines in the logs and have to manually run logrotate way too often or /var gets too full. Any hints? What did I forget? Tks. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk <https://meet.google.com/cpu-eiue-hvk> Gluster-users mailing list Gluster-users@gluster.org <mailto:Gluster-users@gluster.org> https://lists.gluster.org/mailman/listinfo/gluster-users <https://lists.gluster.org/mailman/listinfo/gluster-users> -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] bitd.log and quotad.log flooding /var
Hello all. Lately, I noticed some hiccups in our Gluster volume. It's a "replica 3 arbiter 1" with many bricks (currently 90 data bricks over 3 servers). I tried to reduce log level by setting diagnostics.brick-log-level: ERROR diagnostics.client-log-level: ERROR and creating /etc/default/glusterd containing "LOG_LEVEL=ERROR". But I still see a lot of 'I' lines in the logs and have to manually run logrotate way too often or /var gets too full. Any hints? What did I forget? Tks. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
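Collecting in one place the knobs that came up in this thread (a sketch; volume name as used above, and the exact logrotate config file name depends on the installed packages):
-8<--
VOL=cluster_data
gluster volume set $VOL diagnostics.brick-log-level ERROR
gluster volume set $VOL diagnostics.client-log-level ERROR
gluster volume set $VOL diagnostics.brick-sys-log-level CRITICAL
# glusterd itself reads LOG_LEVEL from its defaults file on Debian:
echo 'LOG_LEVEL=ERROR' > /etc/default/glusterd
systemctl restart glusterd
# If /var still fills up, rotate the gluster logs by hand (check the actual
# config file name shipped by your packages):
logrotate --force /etc/logrotate.d/glusterfs-common
-8<--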
[Gluster-users] Debian repository instructions outdated
Hello all. The instructions given f.e. at [1] do not follow the Debian instructions for 3rdparty repositories [2] . Mostly it boils down to changing the first step to: mkdir /etc/apt/keyrings curl https://download.gluster.org/pub/gluster/glusterfs/9/rsa.pub | gpg --dearmor > /etc/apt/keyrings/gluster-archive-keyring.gpg and then add 'signed-by=/etc/apt/keyrings/gluster-archive-keyring.gpg' between '[' and 'arch=amd64'. HIH, Diego [1] https://download.gluster.org/pub/gluster/glusterfs/9/9.4/Debian/ [2] https://wiki.debian.org/DebianRepository/UseThirdParty -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
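Putting [1] and [2] together, the whole setup would look roughly like this (the sources.list.d file name is arbitrary; the repository URL is the bullseye/amd64 one already used elsewhere on this list):
-8<--
mkdir -p /etc/apt/keyrings
curl -s https://download.gluster.org/pub/gluster/glusterfs/9/rsa.pub \
    | gpg --dearmor > /etc/apt/keyrings/gluster-archive-keyring.gpg
cat > /etc/apt/sources.list.d/gluster.list <<'EOF'
deb [signed-by=/etc/apt/keyrings/gluster-archive-keyring.gpg arch=amd64] https://download.gluster.org/pub/gluster/glusterfs/9/LATEST/Debian/bullseye/amd64/apt bullseye main
EOF
apt update
-8<--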
[Gluster-users] Flood of "SSL support for MGMT is ENABLED"
Hello all. I have a Gluster 9.2 volume in "replica 3 arbiter 1": -8<-- Volume Name: cluster_data Type: Distributed-Replicate Volume ID: a8caaa90-d161-45bb-a68c-278263a8531a Status: Started Snapshot Count: 0 Number of Bricks: 45 x (2 + 1) = 135 Transport-type: tcp Bricks: Brick1: clustor00:/srv/bricks/00/d Brick2: clustor01:/srv/bricks/00/d Brick3: clustor02:/srv/quorum/00/d (arbiter) [...] Brick133: clustor01:/srv/bricks/29/d Brick134: clustor02:/srv/bricks/29/d Brick135: clustor00:/srv/quorum/14/d (arbiter) Options Reconfigured: cluster.granular-entry-heal: disable features.scrub-throttle: normal performance.parallel-readdir: on performance.write-behind-window-size: 128MB cluster.self-heal-daemon: enable features.default-soft-limit: 90 features.quota-deem-statfs: on features.inode-quota: on features.quota: on transport.address-family: inet nfs.disable: on performance.client-io-threads: on client.event-threads: 8 performance.cache-refresh-timeout: 60 performance.stat-prefetch: on cluster.lookup-optimize: on features.bitrot: on features.scrub: Active diagnostics.brick-log-level: WARNING diagnostics.client-log-level: WARNING config.brick-threads: 0 cluster.lookup-unhashed: on config.client-threads: 36 -8<-- I've had to reboot clustor02 and now that I'm trying to restart it I get the log flooded by lines like: [2022-06-06 10:18:01.639007 +] I [socket.c:4279:ssl_setup_connection_params] 0-socket.management: SSL support for MGMT is ENABLED IO path is ENABLED certificate depth is 1 for peer 192.168.253.79:48962 [2022-06-06 10:18:01.641246 +] I [socket.c:4279:ssl_setup_connection_params] 0-socket.management: SSL support for MGMT is ENABLED IO path is ENABLED certificate depth is 1 for peer 192.168.253.73:48951 To have a working node I've had to create /etc/sysconfig/glusterd file containing LOG_LEVEL=WARNING But that just hides the messages... Is that normal behaviour? -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Usage not updating in quotas
Hello Amar. Maybe I missed something, but does that mean that if I upgrade from 9.5 to 10 I lose the quota? Seems too strange to be true... Tks, Diego Il 28/04/2022 07:12, Amar Tumballi ha scritto: Hi Alan, Strahil, On Thu, Apr 28, 2022 at 3:50 AM Strahil Nikolov <mailto:hunter86...@yahoo.com>> wrote: @Amar, did the quota feature reach Gluster v10 ? It got merged in the development branch (ie, confirmed on v11). As per our release policy, it wouldn't make it to v10 as its a feature. Anyone wanting to test it, experiment with it should build from nightly or development branch for now. Regards, Amar On Tue, Apr 26, 2022 at 12:09, Alan Orth mailto:alan.o...@gmail.com>> wrote: Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] Outdated docs?
Hello all. I think there's something wrong in the docs: the page https://docs.gluster.org/en/main/Administrator-Guide/Handling-of-users-with-many-groups/ says "The FUSE client gets the groups of the process that does the I/O by reading the information from /proc/$pid/status. This file only contains up to 32 groups." I checked on my system and status files report way more than 32 groups (when the user does have 'em, obv). It could probably just be outdated info: I think it got 'fixed' 9y ago by this patch: https://linux-kernel.vger.kernel.narkive.com/KDWSnAMn/patch-proc-pid-status-show-all-supplementary-groups -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
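The current behaviour is easy to verify on a running system (a sketch; 'someuser' is a placeholder for an account with more than 32 groups):
-8<--
# Groups listed in /proc for the current shell (subtract 1 for the label):
grep '^Groups:' /proc/self/status | wc -w
# Supplementary groups of a given account:
id -G someuser | wc -w          # 'someuser' is a placeholder
-8<--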
Re: [Gluster-users] Indexing/Importing existing files on disk
Il 03/04/2022 16:36, Strahil Nikolov ha scritto: Using relatime/noatime mount option reduces the I/O to the brick device. IMVHO this sentence could cause misunderstandings. :) It could be read like "noatime slows down your brick" while, IIUC, it really means it *improves* the brick's performance by reducing the number of "housekeeping" IOs. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
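For a brick filesystem that translates into something like this (a sketch; device and mount point are placeholders):
-8<--
# Remount an existing XFS brick with noatime (placeholder mount point):
mount -o remount,noatime /srv/bricks/00
# ...and make it persistent in /etc/fstab:
# /dev/sdb1  /srv/bricks/00  xfs  defaults,noatime  0  0
-8<--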
Re: [Gluster-users] Interpreting gluster volume top read/write output
Hi Hubert. Didn't notice the 'top' command, so I cannot answer your doubts. But I tried and noticed that read reports some entries like 2880 2761 Hope that's not symptom of a problem... Il 22/02/2022 09:41, Hu Bert ha scritto: Hello @ll, we're just doing some "research" on our own replica 3 volume (hostnames gserver1-3, gluster 9.4), and there are a few questions regarding the output of 'gluster volume top $volname read/write'. 1) gluster volume top workdata write Brick: gserver2:/gluster/md7/workdata Count filename === 203 /images/504/013/50401355/de.mp4 195 /images/396/910/39691058/de.mp4 167 /themes/oad-high-scardus-trail/media/220202Hstimageslider1.mp4 does this mean that these files have been written 203/195/... times? lifetime writes? 2) gluster volume top workdata read Brick: gserver1:/gluster/md3/workdata Count filename === 1794/images/441/297/44129755/de.mp4 275 /images/275/806/27580686/default.jpg 258 /images/269/844/26984442/default.jpg 256 /images/269/845/26984597/default.jpg gserver1 was rebooted yesterday; does this mean these files have been read 1794/275/... times? lifetime reads or since reboot? We're just a bit ... curious if these are real read/write stats, lifetime, since reboot. thx, Hubert Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
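FWIW, the top counters can also be narrowed to a single brick and reset; IIRC they live in the brick process, so they start from zero whenever the brick (re)starts. A sketch, assuming the syntax of recent releases and using the volume/brick names from Hubert's mail:
-8<--
VOL=workdata
BRICK=gserver1:/gluster/md3/workdata
gluster volume top $VOL read  brick $BRICK list-cnt 10
gluster volume top $VOL write brick $BRICK list-cnt 10
# Reset the counters so the next run clearly shows "reads/writes since now":
gluster volume top $VOL clear brick $BRICK
-8<--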
Re: [Gluster-users] Experimenting with thin-arbiter
Not there. It's not one of the defined services :( Maybe Debian does not support it? Il 16/02/2022 13:26, Strahil Nikolov ha scritto: My bad, it should be /gluster-ta-volume.service/ On Wed, Feb 16, 2022 at 7:45, Diego Zuccato wrote: No such process is defined. Just the standard glusterd.service and glustereventsd.service. Using Debian stable. Il 15/02/2022 15:41, Strahil Nikolov ha scritto: > Any errors in gluster-ta.service on the arbiter node ? > Best Regards, > Strahil Nikolov
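A few harmless checks to see whether the thin-arbiter bits are present at all on a Debian host; the package name glusterfs-server is an assumption, and a glusterd crash would normally leave a "signal received" trace in its log:
systemctl list-unit-files 'gluster*'
dpkg -L glusterfs-server | grep -i -e thin -e 'ta-volume'
grep -iE 'signal received|crash' /var/log/glusterfs/glusterd.log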
Re: [Gluster-users] Experimenting with thin-arbiter
No such process is defined. Just the standard glusterd.service and glustereventsd.service. Using Debian stable. Il 15/02/2022 15:41, Strahil Nikolov ha scritto: Any errors in gluster-ta.service on the arbiter node ? Best Regards, Strahil Nikolov -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786
[Gluster-users] Experimenting with thin-arbiter
Hello all. I'm experimenting with thin-arbiter and getting disappointing results.
I have 3 hosts in the trusted pool:
root@nas1:~# gluster --version
glusterfs 9.2
[...]
root@nas1:~# gluster pool list
UUID                                  Hostname   State
d4791fed-3e6d-4f8f-bdb6-4e0043610ead  nas3       Connected
bff398f0-9d1d-4bd0-8a47-0bf481d1d593  nas2       Connected
4607034c-919d-4675-b5fc-14e1cad90214  localhost  Connected
When I try to create a new volume, the first initialization succeeds:
root@nas1:~# gluster v create Bck replica 2 thin-arbiter 1 nas{1,3}:/bricks/00/Bck nas2:/bricks/arbiter/Bck
volume create: Bck: success: please start the volume to access data
But adding a second brick segfaults the daemon:
root@nas1:~# gluster v add-brick Bck nas{1,3}:/bricks/01/Bck
Connection failed. Please check if gluster daemon is operational.
After erroring out, systemctl status glusterd reports the daemon in "restarting" state and it eventually restarts. But the new brick is not added to the volume, even if trying to re-add it yields a "brick is already part of a volume" error. Seems glusterd crashes between marking the brick dir as used and recording its data in the config.
If I try to add all the bricks during the creation, glusterd does not die but the volume doesn't get created:
root@nas1:~# rm -rf /bricks/{00..07}/Bck && mkdir /bricks/{00..07}/Bck
root@nas1:~# gluster v create Bck replica 2 thin-arbiter 1 nas{1,3}:/bricks/00/Bck nas{1,3}:/bricks/01/Bck nas{1,3}:/bricks/02/Bck nas{1,3}:/bricks/03/Bck nas{1,3}:/bricks/04/Bck nas{1,3}:/bricks/05/Bck nas{1,3}:/bricks/06/Bck nas{1,3}:/bricks/07/Bck nas2:/bricks/arbiter/Bck
volume create: Bck: failed: Commit failed on localhost. Please check the log file for more details.
Couldn't find anything useful in the logs :(
If I create a "replica 3 arbiter 1" over the same brick directories (just adding some directories to keep arbiters separated), it succeeds:
root@nas1:~# gluster v create Bck replica 3 arbiter 1 nas{1,3}:/bricks/00/Bck nas2:/bricks/arbiter/Bck/00
volume create: Bck: success: please start the volume to access data
root@nas1:~# for T in {01..07}; do gluster v add-brick Bck nas{1,3}:/bricks/$T/Bck nas2:/bricks/arbiter/Bck/$T ; done
volume add-brick: success
volume add-brick: success
volume add-brick: success
volume add-brick: success
volume add-brick: success
volume add-brick: success
volume add-brick: success
root@nas1:~# gluster v start Bck
volume start: Bck: success
root@nas1:~# gluster v info Bck
Volume Name: Bck
Type: Distributed-Replicate
Volume ID: 4786e747-8203-42bf-abe8-107a50b238ee
Status: Started
Snapshot Count: 0
Number of Bricks: 8 x (2 + 1) = 24
Transport-type: tcp
Bricks:
Brick1: nas1:/bricks/00/Bck
Brick2: nas3:/bricks/00/Bck
Brick3: nas2:/bricks/arbiter/Bck/00 (arbiter)
Brick4: nas1:/bricks/01/Bck
Brick5: nas3:/bricks/01/Bck
Brick6: nas2:/bricks/arbiter/Bck/01 (arbiter)
Brick7: nas1:/bricks/02/Bck
Brick8: nas3:/bricks/02/Bck
Brick9: nas2:/bricks/arbiter/Bck/02 (arbiter)
Brick10: nas1:/bricks/03/Bck
Brick11: nas3:/bricks/03/Bck
Brick12: nas2:/bricks/arbiter/Bck/03 (arbiter)
Brick13: nas1:/bricks/04/Bck
Brick14: nas3:/bricks/04/Bck
Brick15: nas2:/bricks/arbiter/Bck/04 (arbiter)
Brick16: nas1:/bricks/05/Bck
Brick17: nas3:/bricks/05/Bck
Brick18: nas2:/bricks/arbiter/Bck/05 (arbiter)
Brick19: nas1:/bricks/06/Bck
Brick20: nas3:/bricks/06/Bck
Brick21: nas2:/bricks/arbiter/Bck/06 (arbiter)
Brick22: nas1:/bricks/07/Bck
Brick23: nas3:/bricks/07/Bck
Brick24: nas2:/bricks/arbiter/Bck/07 (arbiter)
Options Reconfigured:
cluster.granular-entry-heal: on
storage.fips-mode-rchecksum: on
transport.address-family: inet nfs.disable: on performance.client-io-threads: off Does thin arbiter support just one replica of bricks? -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Arbiter
Il 08/02/2022 12:17, Karthik Subrahmanya ha scritto: Since there are 4 nodes available here, and based on the configuration of the available volumes (requested volume info for the same) I was thinking whether the arbiter brick can be hosted on one of those nodes itself, or a new node is required.
We're using replica 3 arbiter 1, with quorum balanced between the 3 servers. No need for an extra server. When we add a 4th server, there'll be a lot of brick juggling (luckily they're connected by IB100 :) ). The simplest thing you can do to balance load across 4 servers is laying down data as:
S1  S2  S3  S4
0a  0b  0q  1a
1b  1q  2a  2b
2q  3a  3b  3q
... and so on: it requires adding 8 disks at a time, 2 per server -- as long as you have enough blocks *and inodes* available on an SSD for the metadata (quorum) bricks. Hope the layout is clear: Xa and Xb are the replicated bricks, Xq is the quorum brick for bricks Xa and Xb. For a 3-server setup the layout we're using is:
S1  S2  S3
0a  0b  0q
1a  1q  1b
2q  2a  2b
HIH. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
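For illustration, this is roughly how the 3-server rotating layout above would translate into a volume create command; hostnames, volume name and brick paths are only placeholders, and the arbiter brick is always the third one of each triplet:
gluster volume create demo replica 3 arbiter 1 \
  s1:/bricks/0a/demo s2:/bricks/0b/demo s3:/bricks/0q/demo \
  s1:/bricks/1a/demo s3:/bricks/1b/demo s2:/bricks/1q/demo \
  s2:/bricks/2a/demo s3:/bricks/2b/demo s1:/bricks/2q/demo
gluster volume start demo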
Re: [Gluster-users] Arbiter
IIUC it always requires 3 servers. Lightweight arbiter is just to avoid split brain (a client needs to reach two servers out of three to be able to write data). "Full" arbiter is a third replica of metadata while there are only two copies of the data. Il 08/02/2022 11:58, Gilberto Ferreira ha scritto: Forgive me if I am wrong, but AFAIK, arbiter is for a two-node configuration, isn't it? --- Gilberto Nunes Ferreira (47) 99676-7530 - Whatsapp / Telegram Em ter., 8 de fev. de 2022 às 07:17, Karthik Subrahmanya mailto:ksubr...@redhat.com>> escreveu: Hi Andre, Striped volumes are deprecated long back, see [1] & [2]. Seems like you are using a very old version. May I know which version of gluster you are running and the gluster volume info please? Release schedule and the maintained branches can be found at [3]. [1] https://docs.gluster.org/en/latest/release-notes/6.0/ <https://docs.gluster.org/en/latest/release-notes/6.0/> [2] https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html <https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html> [3] https://www.gluster.org/release-schedule/ <https://www.gluster.org/release-schedule/> Regards, Karthik On Mon, Feb 7, 2022 at 9:43 PM Andre Probst mailto:andrefpro...@gmail.com>> wrote: I have a striped and replicated volume with 4 nodes. How do I add an arbiter to this volume? -- André Probst Consultor de Tecnologia 43 99617 8765 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk <https://meet.google.com/cpu-eiue-hvk> Gluster-users mailing list Gluster-users@gluster.org <mailto:Gluster-users@gluster.org> https://lists.gluster.org/mailman/listinfo/gluster-users <https://lists.gluster.org/mailman/listinfo/gluster-users> Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk <https://meet.google.com/cpu-eiue-hvk> Gluster-users mailing list Gluster-users@gluster.org <mailto:Gluster-users@gluster.org> https://lists.gluster.org/mailman/listinfo/gluster-users <https://lists.gluster.org/mailman/listinfo/gluster-users> Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
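To make the "full" arbiter case concrete: a plain replica 2 volume (not striped) can usually be converted by adding one arbiter brick per replica pair; a 1x2 volume needs exactly one arbiter brick, a distributed N x 2 volume needs N of them in the same command. Names below are placeholders:
gluster volume add-brick VOLNAME replica 3 arbiter 1 arbiterhost:/bricks/arbiter/VOLNAME
gluster volume heal VOLNAME info summary   # the arbiter brick is then populated (metadata only) by self-heal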
Re: [Gluster-users] Swapping brick mounts/nodes
Il 01/02/2022 20:08, Fox ha scritto: Basically I'm asking if the bricks are mountpoint and node agnostic.
Nope. They aren't :( (unless something changed in the latest releases). Some days ago I asked basically the same question (how to move a volume to a new server). -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
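If the goal is simply to retire a brick path or host on a replicated volume and let Gluster repopulate the data itself, the usual tool is replace-brick; hostnames, volume name and paths below are only placeholders:
gluster volume replace-brick VOLNAME oldhost:/bricks/b1/VOLNAME newhost:/bricks/b1/VOLNAME commit force
gluster volume heal VOLNAME info summary   # the new (empty) brick is then filled by self-heal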
Re: [Gluster-users] Move a volume to a new server
Il 18/01/2022 18:27, Strahil Nikolov ha scritto:
> If you manage to get that server and can setup a replica -> then the migration can be transparent for the clients.
But I'll be beaten by the network team :) OK as a last resort and for safety.
> Another option is to move both OS+Gluster disks, rebuild the initramfs and thus you will change only the decommissioned hardware.
That's more problematic... The same server serves another volume, that must stay there. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Move a volume to a new server
Tks. I'll do it. Worst case: I'll have to create a new volume and move files from the old bricks to it. The safety is what I like most in Gluster. The 'funny' thing is that a replica server is planned but not yet operative :(
Il 17/01/2022 18:24, Strahil Nikolov ha scritto: Verify that you can install the same version of gluster. If not, plan to update to a version that is available to both old and new servers' OS. Once you migrate (check all release notes in advance) to a common version, you can do something like this:
- Install the gluster software on the new host
- Setup the firewall to match the old server
- Stop the gluster volumes and any geo-rep sessions
- Shutdown glusterd service
- Umount the bricks
- Disable LVM LVs/VGs that host your bricks (if you share the same VG with other software, you will have to use vgsplit)
- Remove the multipath devices (multipath -f)
- Remove the block devices that are part of those multipath devices
- Backup /etc/glusterfs
- Backup /var/lib/glusterd
- Unmap the LUNs
- Present the LUNs on the new host
- Verify that the multipath devices are there
- Rescan the LVM stack (pvscan --cache, vgscan, lvscan)
- Activate the VGs/LVs
- Mount the bricks and ensure mounting on boot (autofs, systemd's '.mount/.automount' units, fstab)
- Restore /etc/glusterfs & /var/lib/glusterd
- Start the glusterd service
- Start the volumes
- Mount via FUSE to verify the situation
- Start the geo-replications (if any)
Note, if you use VDO - disable the volume on the old system and backup the config (/etc/vdoconf.yml) -> restore on the new host. Check your tuned profile and if needed transfer the configuration file to the new system and activate it. I might have missed something (like custom entries in /etc/hosts), so do a short test on a test system in advance. Edit: You didn't mention your FS type, so I assume XFS. Best Regards, Strahil Nikolov
-- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
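A rough sketch of the configuration backup/restore steps from the checklist above, assuming the new machine keeps the old server's hostname/IP and runs the same gluster version (peers and clients identify it by those, and glusterd by the UUID stored in /var/lib/glusterd/glusterd.info); the volume name is a placeholder:
# on the old server, once the volumes are stopped
systemctl stop glusterd
tar czf /root/gluster-config.tgz /etc/glusterfs /var/lib/glusterd
# on the new server, after the LUNs are visible and the bricks are mounted at the same paths
tar xzf /root/gluster-config.tgz -C /
systemctl start glusterd
gluster volume start VOLNAME
gluster volume status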
[Gluster-users] Move a volume to a new server
Hello all. I have a Gluster volume that I'd need to move to a different server. The volume is 4x10TB bricks accessed via FC (different LUNs) on an old CX3-80. I have no extra space to create a copy of all the data, so I'd need to hide the LUNs from the old server and make 'em visible to the new ("move the disks"), w/o copying data. Can I just do something like this? - stop volume - umount bricks - copy volume state files to new server (which ones?) - map LUNs to new server - mount bricks on new server (maintaining the same path they had on old server) - start glusterd on new server - start volume Tks! -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] force realignement after downing a node?
Done just that :) Today I upgraded the second node, with a cleaner shutdown. Strangely, at reboot it worked for about half an hour with all the cores at 100% (but low mem use) and "gluster v heal cluster_data info" apparently hanging. Had lunch and now all cores are back to normal (20-60%), memory use is higher and gluster is responding again. Still no files in heal pending. I'll skip tomorrow, then upgrade the last server. Hope it all goes smoothly again. Tks. Il 30/11/2021 13:17, Strahil Nikolov ha scritto: Than, take a beer/tee/coffee/ and enjoy the rest of the day ;) Best Regards, Strahil Nikolov -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786
Re: [Gluster-users] force realignement after downing a node?
Here it is. Seems gluster thinks there's nothing to be done... -8<-- root@str957-clustor00:~# gluster v heal cluster_data info summary Brick clustor00:/srv/bricks/00/d Status: Connected Total Number of entries: 0 Number of entries in heal pending: 0 Number of entries in split-brain: 0 Number of entries possibly healing: 0 Brick clustor01:/srv/bricks/00/d Status: Connected Total Number of entries: 0 Number of entries in heal pending: 0 Number of entries in split-brain: 0 Number of entries possibly healing: 0 Brick clustor02:/srv/quorum/00/d Status: Connected Total Number of entries: 0 Number of entries in heal pending: 0 Number of entries in split-brain: 0 Number of entries possibly healing: 0 Brick clustor02:/srv/bricks/00/d Status: Connected Total Number of entries: 0 Number of entries in heal pending: 0 Number of entries in split-brain: 0 Number of entries possibly healing: 0 Brick clustor00:/srv/bricks/01/d Status: Connected Total Number of entries: 0 Number of entries in heal pending: 0 Number of entries in split-brain: 0 Number of entries possibly healing: 0 Brick clustor01:/srv/quorum/00/d Status: Connected Total Number of entries: 0 Number of entries in heal pending: 0 Number of entries in split-brain: 0 Number of entries possibly healing: 0 [... snip: everything reports 0...] Brick clustor01:/srv/bricks/29/d Status: Connected Total Number of entries: 0 Number of entries in heal pending: 0 Number of entries in split-brain: 0 Number of entries possibly healing: 0 Brick clustor02:/srv/bricks/29/d Status: Connected Total Number of entries: 0 Number of entries in heal pending: 0 Number of entries in split-brain: 0 Number of entries possibly healing: 0 Brick clustor00:/srv/quorum/14/d Status: Connected Total Number of entries: 0 Number of entries in heal pending: 0 Number of entries in split-brain: 0 Number of entries possibly healing: 0 -8<-- Il 29/11/2021 12:02, Strahil Nikolov ha scritto: What is the output of 'gluster volume heal VOLUME info summary' ? Best Regards, Strahil Nikolov On Mon, Nov 29, 2021 at 10:33, Diego Zuccato wrote: Hello all. I just brought offline a node (in a replica 3 arbiter 1 volume) to install more RAM. The other two nodes kept being used, so I expected to see some resync at power on. But I saw nothing unusual: seems it's just serving files as usual. Is it normal or should I force a resync? If so, how? Regards. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk <https://meet.google.com/cpu-eiue-hvk> Gluster-users mailing list Gluster-users@gluster.org <mailto:Gluster-users@gluster.org> https://lists.gluster.org/mailman/listinfo/gluster-users <https://lists.gluster.org/mailman/listinfo/gluster-users> -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] force realignement after downing a node?
Hello all. I just brought offline a node (in a replica 3 arbiter 1 volume) to install more RAM. The other two nodes kept being used, so I expected to see some resync at power on. But I saw nothing unusual: seems it's just serving files as usual. Is it normal or should I force a resync? If so, how? Regards. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
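For reference, a resync can also be triggered by hand; a minimal sketch (the volume name is a placeholder):
gluster volume heal VOLNAME              # index heal: process the entries already queued on the bricks
gluster volume heal VOLNAME full         # full heal: crawl the bricks and heal anything that differs
gluster volume heal VOLNAME info summary # check that the pending counters go back to zero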
Re: [Gluster-users] Re-Add Distributed Volume Volume
Hi. If you can still move files from the broken *volume*, you don't have to touch the .glusterfs folder: it's managed by gluster itself and that's the preferred way to recover (more like a transfer between two volumes that only accidentally share the filesystem on the bricks). But if the vol was really broken (wouldn't start at all), the only way to recover would be to read the files *from the bricks*. Those are quite different scenarios that require different recovery methods. Recovering from the bricks is slightly more complicated, since you have to manually handle duplicate files, checking if they're actually identical or if there are differences. Il 16/11/2021 11:14, Taste-Of-IT ha scritto: Hi Diego, I noticed that when I move files from the broken volume to the newly mounted GlusterFS volume, the folders and files are also deleted from the .glusterfs directory. That's ok, right? Because those are the hard links of the files, right? You wrote that you deleted the .glusterfs folder first and then moved. I didn't try it because I am afraid of losing all files, if these are the hard links. I also noticed that if I moved files and the folders and files were also deleted in .glusterfs, the disk size didn't change. => I read about hard links and it seems that there is a remaining part of them, that's why the free space rises, right? So I have to delete the .glusterfs directory, right, and no "real" files are deleted. That's what I understand now by reading about hard links. What do you think? thx -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Re-Add Distributed Volume Volume
Il 15/11/2021 06:45, Strahil Nikolov ha scritto:
> Gluster uses hard links (2 entries pointing to the same inode) and until the hard links are deleted, the data will still be there. [...] The hard links are in the .glusterfs directory and after a successful move you can delete them.
When I've had to move from a "broken" volume to a newly created one, I first deleted the .glusterfs folders from the roots of the old bricks (the volume was already broken, after all) and then moved the other folders to their new home (the new volume, mounted on every node). I just avoided overwriting existing files. This way the bricks didn't overflow. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
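For reference, a quick way to see the hard-link relationship directly on a brick; paths here are only examples. For regular files gluster keeps a hard link under .glusterfs/<first two hex digits of the gfid>/<next two>/<full gfid>:
stat -c '%h  %n' /srv/bricks/00/vol/some/dir/file.bin      # a link count of 2+ means the .glusterfs twin still exists
getfattr -n trusted.gfid -e hex /srv/bricks/00/vol/some/dir/file.bin
# e.g. a gfid of 0xd2e7... means the hard link lives at .glusterfs/d2/e7/d2e7...-... on the same brick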
Re: [Gluster-users] Gluster extra large file on brick
Il 06/07/2021 18:28, Dan Thomson ha scritto: Hi. Maybe you're hitting the "reserved space for root" (usually 5%): when you try to write from the server directly to the brick, you're mos probably doing it from root and you use the reserved space. When you try writing from a client you're likely using a normal user and get the "no space left". Another possible issue to watch out for, is exhaustion of inodes (I've been bitten by it for arbiter bricks partition). HIH, Diego Hi gluster users, I'm having an issue that I'm hoping to get some help with on a dispersed volume (EC: 2x(4+2)) that's causing me some headaches. This is on a cluster running Gluster 6.9 on CentOS 7. At some point in the last week, writes to one of my bricks have started failing due to an "No Space Left on Device" error: [2021-07-06 16:08:57.261307] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-gluster-01-server: 1853436561: WRITEV -2 (f2d6f2f8-4fd7-4692-bd60-23124897be54), client: CTX_ID:648a7383-46c8-4ed7-a921-acafc90bec1a-GRAPH_ID:4-PID:19471-HOST:rhevh08.mgmt.triumf.ca-PC_NAME:gluster-01-client-5-RECON_NO:-5, error-xlator: gluster-01-posix [No space left on device] The disk is quite full (listed as 100% on the server), but does have some writable room left: /dev/mapper/vg--brick1-brick1 11T 11T 97G 100% /data/glusterfs/gluster-01/brick1 however, I'm not sure if the amount of disk space used on the physical drive is the true cause of the "No Space Left on Device" errors anyway. I can still manually write to this brick outside of Gluster, so it seems like the operating system isn't preventing the writes from happening. During my investigation, I noticed that one .glusterfs paths on the problem server is using up much more space than it is on the other servers. I can't quite figure out why that might be, or how that happened. I'm wondering if there's any advice on what the cause might've been. I had done some package updates on this server with the issue and not on the other servers. This included the kernel version, but didn't include the Gluster packages. So possibly this, or the reboot to load the new kernel may have caused a problem. I have scripts on my gluster machines to nicely kill all of the brick processes before rebooting, so I'm not leaning towards an abrupt shutdown being the cause, but it's a possibility. I'm also looking for advice on how to safely remove the problem file and rebuild it from the other Gluster peers. I've seen some documentation on this, but I'm a little nervous about corrupting the volume if I misunderstand the process. I'm not free to take the volume or cluster down and do maintenance at this point, but that might be something I'll have to consider if it's my only option. For reference, here's the comparison of the same path that seems to be taking up extra space on one of the hosts: 1: 26G /data/gluster-01/brick1/vol/.glusterfs/99/56 2: 26G /data/gluster-01/brick1/vol/.glusterfs/99/56 3: 26G /data/gluster-01/brick1/vol/.glusterfs/99/56 4: 26G /data/gluster-01/brick1/vol/.glusterfs/99/56 5: 26G /data/gluster-01/brick1/vol/.glusterfs/99/56 6: 3.0T /data/gluster-01/brick1/vol/.glusterfs/99/56 Any and all advice is appreciated. Thanks! 
-- Daniel Thomson DevOps Engineer t +1 604 222 7428 dthom...@triumf.ca TRIUMF Canada's particle accelerator centre www.triumf.ca @TRIUMFLab 4004 Wesbrook Mall Vancouver BC V6T 2A3 Canada Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
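A few quick checks related to the reserved-space and inode hypotheses above; the device and volume names are taken from the error messages and may not match the real setup, tune2fs only applies if the brick is ext4 (XFS has no root reserve), and storage.reserve is gluster's own reserved percentage, if the running version supports it:
df -i /data/glusterfs/gluster-01/brick1
tune2fs -l /dev/mapper/vg--brick1-brick1 | grep -i 'reserved block count'
gluster volume get gluster-01 storage.reserve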
Re: [Gluster-users] File not found errors
Il 24/06/21 03:45, cfel...@rocketmail.com ha scritto: Probably related to the problem I reported some time ago (apr, 22nd), thread "quantum directories?". Got no answer :( Hi, I've got an interesting issue with files not being found when accessed directly. When accessing a file like so: #ls /mnt/path/to/some/file/file.json ls: cannot access '/mnt/path/to/some/file/file.json': No such file or directory I can wait, try again, and same result. I could put this in a loop with a delay between read attempts, same result. However, if I do an "ls" of the directory first then everything works. That is: # ls /mnt/path/to/some/file/file.json ls: cannot access '/mnt/path/to/some/file/file.json': No such file or directory # ls /mnt/path/to/some/file/ dir1 dir2 file1.txt file2.txt file3.txt file.json At this point I can successfully "ls" the file: # ls /mnt/path/to/some/file/file.json /mnt/path/to/some/file/file.json It is curious, but seems to be isolated to certain directories/mount points. Looking at the mount logs there is absolutely nothing printed to the mnt log on the client with regards to the failure or the eventual success. The configuration: Distributed-Replicate 4 x 2 = 8 Gluster version 9.2 Clients are using native FUSE mounts (also version 9.2) I did recently add the 4th brick, and the system is currently undergoing a rebalance (the fix-layout rebalance already completed). I am throttling the full rebalance as 'lazy'. I looked at the bricks on the server and the files still exists on the older bricks, so I don't think the files in question got rebalanced (yet). The system load isn't too high, and this doesn't seem to be a consistent error as I don't have this issue in other directories, but I consistently have the issue on this directory as well as a few others. Adding some additional background/history: Cluster was setup a couple of years ago on gluster 6.x. Did an expansion (two replica pairs to three replica pairs) and rebalance while on 6.x, no issues. Upgraded to gluster 7.x -> 8.x -> 9.x Ops version has been upgraded to 9 Doing another expansion (three replica pairs to four replica pairs) and rebalance now, and seeing some interesting issues, including this one. There is 0 entries under "heal info" or "info split-brain". I'm just wondering if anyone on this list has seen anything similar, or has any suggestions. Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
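Since a rebalance is in progress in the report above, one thing worth ruling out is a DHT layout/lookup issue; a couple of harmless checks (the volume name is a placeholder), and lookup-optimize can be switched off temporarily while the rebalance completes if in doubt:
gluster volume rebalance VOLNAME status
gluster volume get VOLNAME cluster.lookup-optimize
gluster volume set VOLNAME cluster.lookup-optimize off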
Re: [Gluster-users] Replica bricks fungible?
Il 05/06/2021 14:36, Zenon Panoussis ha scritto:
> What I'm really asking is: can I physically move a brick from one server to another such as
> I can now answer my own question: yes, replica bricks are identical and can be physically moved or copied from one server to another. I have now done it a few times without any problems, though I made sure no healing was pending before the moves.
Well, if it's officially supported, that could be a really interesting option to quickly scale big storage systems. I'm thinking about our scenario: 3 servers, 36 12TB disks each. When adding a new server (or another pair of servers, to keep an odd number) it will require quite a lot of time to rebalance, with heavy implications both on the IB network and on latency for the users. If we could simply swap around some disks it could be a lot faster. Have you documented the procedure you followed? -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Replica bricks fungible?
Il 23/04/21 13:30, Zenon Panoussis ha scritto: > Are all replica (non-arbiter) bricks identical to each > other? If not, what do they differ in? No. At least meta-metadata is different, IIUC. > What I'm really asking is: can I physically move a brick > from one server to another such as [...] > and then remove node2 from the volume, add node4 to > it and be back up and running without the need of any > synchronisation?I'm no expert, but I think you can't. It might be an > interesting feature, tho. Could be very useful to quickly scale a cluster w/o moving terabytes of data via network: move some (carefully-chosen) bricks from old nodes to the new one, replace 'em with empty disks and expand. Something like MD-RAID metadata. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] Quantum directories?
Hello all. I just noticed a really inexplicable (for me) behaviour: root@str957-cluster:/scratch/.resv2# mkdir test root@str957-cluster:/scratch/.resv2# ls -la total 4 drwxr-xr-x 16 root root 4096 Apr 22 11:40 .. root@str957-cluster:/scratch/.resv2# mkdir test mkdir: cannot create directory 'test': File exists root@str957-cluster:/scratch/.resv2# ls -la total 4 drwxr-xr-x 16 root root 4096 Apr 22 11:40 .. root@str957-cluster:/scratch/.resv2# cd test root@str957-cluster:/scratch/.resv2/test# pwd /scratch/.resv2/test The directory both exists and doesn't exist at the same time??? The volume is: Volume Name: cluster_data Type: Distributed-Replicate Volume ID: a8caaa90-d161-45bb-a68c-278263a8531a Status: Started Snapshot Count: 0 Number of Bricks: 21 x (2 + 1) = 63 Transport-type: tcp Bricks: Brick1: clustor00:/srv/bricks/00/d Brick2: clustor01:/srv/bricks/00/d Brick3: clustor02:/srv/quorum/00/d (arbiter) [...snip...] Brick61: clustor01:/srv/bricks/13/d Brick62: clustor02:/srv/bricks/13/d Brick63: clustor00:/srv/quorum/06/d (arbiter) Options Reconfigured: features.scrub: Active features.bitrot: on cluster.lookup-optimize: on performance.stat-prefetch: on performance.cache-refresh-timeout: 60 client.event-threads: 8 performance.client-io-threads: on nfs.disable: on transport.address-family: inet features.quota: on features.inode-quota: on features.quota-deem-statfs: on features.default-soft-limit: 90 cluster.self-heal-daemon: enable performance.write-behind-window-size: 128MB performance.parallel-readdir: on Disabling performance.parallel-readdir seems to "fix" it, but IIUC it shouldn't happen even with parallel-readdir turned on, right? PS: sometimes, access to the volume becomes quite slow (3-4s for a ls of a dozen files). Any hints about options I could enable or change? The 3 servers currently have only 96GB RAM (already asked to double it), and should host up to 36 bricks + 18 quorums). There are about 50 clients. Tks. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
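Regarding options to try for the slow listings, a few commonly suggested knobs, using the volume name from the post; treat these as things to test, not a guaranteed fix, and 'group metadata-cache' only works if the bundled group file is shipped under /var/lib/glusterd/groups/:
gluster volume set cluster_data performance.parallel-readdir off
gluster volume set cluster_data group metadata-cache      # bundled md-cache settings
gluster volume set cluster_data network.inode-lru-limit 200000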
Re: [Gluster-users] OOM kills gluster process
Il 21/04/21 12:20, Strahil Nikolov ha scritto: Tks for answering. > You will need ro create a script to identify the pids and then protect them. > Then , you can add that script in the gluster's service file as > ExecStartPost=myscript Well, using pidof or the runfile contents it should be doable... Probably the runfile is the best option. I'll try it. > Another approach is to use cgroups and limit everything in the userspace. Tried that, but have had to revert the change: SLURM is propagating ulimits to the nodes... Going to ask in the SLURM list ... -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
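A minimal sketch of the ExecStartPost-style protection discussed above, assuming a client-only node where every glusterfs process is a FUSE mount helper; an oom_score_adj of -1000 exempts a process from the OOM killer:
#!/bin/sh
# protect-gluster-mounts.sh: lower the OOM score of all gluster FUSE clients (run as root)
for pid in $(pidof glusterfs); do
    echo -1000 > /proc/"$pid"/oom_score_adj
done
The same lines could run from a cron @reboot entry or be hooked as ExecStartPost= in whatever unit performs the mount.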
Re: [Gluster-users] [EXT] OOM kills gluster process
Il 21/04/21 13:04, Stefan Solbrig ha scritto: Tks for answering. > You could also consider disabling overcommitting memory: > /etc/sysctl.d/: > vm.overcommit_memory = 2 > vm.overcommit_ratio = 100 > (See https://www.kernel.org/doc/Documentation/vm/overcommit-accounting) Interesting idea, but a bit of swapping is not too bad. > This way, If users allocate too much memory, they get an error upon > allocation. > This should limit the cases where the oom killer needs to take action. > however, it has other side effects, like killing user programs that overcommit > by default. (Or user programs that fork() a lot.) Actually the fork()-intensive programs are the ones that most likely are behaving badly... I'll have to dig deeper. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] OOM kills gluster process
Hello all. I have a somewhat undersized cluster frontend node where users (way too often) use too much RAM. Too bad that the first process selected for killing is the one handling the gluster mount! Is there a way to permanently make it "unkillable"? I already tried altering oom_adj, but the PID changes at every boot... -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Gluster usage scenarios in HPC cluster management
Il 22/03/21 16:54, Erik Jacobson ha scritto: > So if you had 24 leaders like HLRS, there would be 8 replica-3 at the > bottom layer, and then distributed across. (replicated/distributed > volumes) I still have to grasp the "leader node" concept. Weren't gluster nodes "peers"? Or by "leader" you mean that it's mentioned in the fstab entry like /l1,l2,l3:gv0 /mnt/gv0 glusterfs defaults 0 0 while the peer list includes l1,l2,l3 and a bunch of other nodes? > So we would have 24 leader nodes, each leader would have a disk serving > 4 bricks (one of which is simply a lock FS for CTDB, one is sharded, > one is for logs, and one is heavily optimized for non-object expanded > tree NFS). The term "disk" is loose. That's a system way bigger than ours (3 nodes, replica3arbiter1, up to 36 bricks per node). > Specs of a leader node at a customer site: > * 256G RAM Glip! 256G for 4 bricks... No wonder I have had troubles running 26 bricks in 64GB RAM... :) -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
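On the fstab question: as far as the gluster native client is concerned, peers are equal; the servers named at mount time are only used to fetch the volfile, after which the client talks to all bricks directly. A typical entry looks like this (hostnames and volume name are placeholders):
l1:/gv0  /mnt/gv0  glusterfs  defaults,_netdev,backup-volfile-servers=l2:l3  0 0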
Re: [Gluster-users] Gluster usage scenarios in HPC cluster management
Il 22/03/21 14:45, Erik Jacobson ha scritto: > The stuff I work on doesn't use containers much (unlike a different > system also at HPE). By "pods" I meant "glusterd instance", a server hosting a collection of bricks. > I don't have a recipe, they've just always been beefy enough for > gluster. Sorry I don't have a more scientific answer. Seems that 64GB RAM are not enough for a pod with 26 glusterfsd instances and no other services (except sshd for management). What do you mean by "beefy enough"? 128GB RAM or 1TB? -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Gluster usage scenarios in HPC cluster management
Il 19/03/2021 16:03, Erik Jacobson ha scritto: A while back I was asked to make a blog or something similar to discuss the use cases the team I work on (HPCM cluster management) at HPE. Tks for the article. I just miss a bit of information: how are you sizing CPU/RAM for pods? -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Volume not healing
Il 19/03/21 18:06, Strahil Nikolov ha scritto: > Are you running it against the fuse mountpoint ? Yup. > You are not supposed to see 'no such file or directory' ... Maybe > something more serious is going on. Between that and the duplicated files, that's for sure. But I don't know where to look to at least diagnose (if not fix) this :( As I said, probably part of the issue is due to the multiple failures caused by OOM and the multiple tries to remove a brick. I'm currently emptying the volume, then I'll recreate it from scratch, hoping for the best. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Volume not healing
Il 20/03/21 15:21, Zenon Panoussis ha scritto: > When you have 0 files that need healing, > gluster volume heal BigVol granular-entry-heal enable > I have tested with and without granular and, empirically, > without any hard statistics, I find granular considerably > faster. Tks for the hint, but it's already set. I usually do it as soon as I create the volume :) I don't understand why it's not the default :) -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Volume not healing
Il 19/03/21 13:17, Strahil Nikolov ha scritto: > find /FUSE/mountpoint -exec stat {} \; Running it now (redirecting stdout to /dev/null). It's finding quite a lot of "no such file or directory" errors. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Volume not healing
Il 19/03/21 11:06, Diego Zuccato ha scritto: > I tried to run "gluster v heal BigVol info summary" and got quite a high > count of entries to be healed on some bricks: > # gluster v heal BigVol info summary|grep pending|grep -v ' 0$' > Number of entries in heal pending: 41 > Number of entries in heal pending: 2971 > Number of entries in heal pending: 20 > Number of entries in heal pending: 2393 > > Too bad that those numbers aren't decreasing with time. Slight correction. Seems the numbers are *slowly* decreasing. After one hour I see: # gluster v heal BigVol info summary|grep pending|grep -v ' 0$' Number of entries in heal pending: 41 Number of entries in heal pending: 2955 Number of entries in heal pending: 20 Number of entries in heal pending: 2384 Is it possible to speed it up? Nodes are nearly idle... -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
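On speeding up self-heal, the knobs usually mentioned are the self-heal daemon thread count and queue length (higher values trade client latency for heal throughput); the volume name is taken from the thread:
gluster volume set BigVol cluster.shd-max-threads 4
gluster volume set BigVol cluster.shd-wait-qlength 10000
gluster volume heal BigVol        # kick another index heal pass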
[Gluster-users] Volume not healing
Hello all. I have a "problematic" volume. It was Rep3a1 with a dedicated VM for the arbiters. Too bad I underestimated RAM needs and the arbiters VM crashed frequently due to OOM (it had just 8GB allocated). The other two nodes sometimes crashed too, during a remove-brick operation (other thread). So I've had to stop & re-run the remove-brick multiple times, even rebooting the nodes, but it never completed. Now, I decided to move all the files to a temporary storage to rebuild the volume from scratch, but I find directories with duplicated files (two identical files, same name, size and contents), probably the two replicas. I tried to run "gluster v heal BigVol info summary" and got quite a high count of entries to be healed on some bricks:
# gluster v heal BigVol info summary|grep pending|grep -v ' 0$'
Number of entries in heal pending: 41
Number of entries in heal pending: 2971
Number of entries in heal pending: 20
Number of entries in heal pending: 2393
Too bad that those numbers aren't decreasing with time. Seems no entries are considered in split-brain condition (all counts for "gluster v heal BigVol info split-brain" are 0). Is there something I can do to convince Gluster to heal those entries w/o going entry-by-entry manually? Thanks. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] Suggested setup for VM images
Hello all. What server config would you suggest for hosting live VM images? I'd have to replace a Dell MD3200 and a Dell MD3800i that are getting too old and I'd like to have a distributed architecture to avoid SPOF. What are the recommended RAM/CPU/#of disks per server? We're currently using 3 servers configured with 2 * Intel(R) Xeon(R) Silver 4210, 96GB RAM (6x16G -- probably it could be better to increase it), up to 36 12TB spinning disks. Volume is Distributed-Replicate with 2 data copies + 1 arbiter. But they're serving normal (small-medium) files, not big images (and sometimes an ls takes 3s... uhm...) so the workload is quite different... -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
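On the VM-image side, independent of the hardware sizing, the usual starting point is the bundled 'virt' option group, which on recent releases turns on sharding plus the eager-lock/remote-dio settings used by oVirt; the volume name is a placeholder, and the resulting values can be verified with 'gluster volume get':
gluster volume set VMVOL group virt
gluster volume get VMVOL features.shard
gluster volume get VMVOL features.shard-block-size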
Re: [Gluster-users] Proper procedure to reduce an active volume
Il 04/02/21 19:28, Nag Pavan Chilakam ha scritto: > What is the proper procedure to reduce a "replica 3 arbiter 1" volume? > Can you kindly elaborate the volume configuration. Is this a plain > arbiter volume or is it a distributed arbiter volume? > Please share the volume info so that we can help you better Sure. Here it is. Shortened a bit :) -8<-- # gluster v info Volume Name: BigVol Type: Distributed-Replicate Volume ID: c51926bd-6715-46b2-8bb3-8c915ec47e28 Status: Started Snapshot Count: 0 Number of Bricks: 28 x (2 + 1) = 84 Transport-type: tcp Bricks: Brick1: str957-biostor2:/srv/bricks/00/BigVol Brick2: str957-biostor:/srv/bricks/00/BigVol Brick3: str957-biostq:/srv/arbiters/00/BigVol (arbiter) Brick4: str957-biostor2:/srv/bricks/01/BigVol Brick5: str957-biostor:/srv/bricks/01/BigVol Brick6: str957-biostq:/srv/arbiters/01/BigVol (arbiter) [...] Brick79: str957-biostor:/srv/bricks/26/BigVol Brick80: str957-biostor2:/srv/bricks/26/BigVol Brick81: str957-biostq:/srv/arbiters/26/BigVol (arbiter) Brick82: str957-biostor:/srv/bricks/27/BigVol Brick83: str957-biostor2:/srv/bricks/27/BigVol Brick84: str957-biostq:/srv/arbiters/27/BigVol (arbiter) Options Reconfigured: features.scrub-throttle: aggressive server.manage-gids: on features.quota-deem-statfs: on features.inode-quota: on features.quota: on cluster.self-heal-daemon: enable ssl.certificate-depth: 1 auth.ssl-allow: str957-bio* features.scrub-freq: biweekly features.scrub: Active features.bitrot: on transport.address-family: inet performance.readdir-ahead: on nfs.disable: on client.ssl: on server.ssl: on server.event-threads: 8 client.event-threads: 8 cluster.granular-entry-heal: enable -8<-- > The procedure I've found is: > 1) # gluster volume remove-brick VOLNAME BRICK start > (repeat for each brick to be removed, but being a r3a1 should I remove > both bricks and the arbiter in a single command or multiple ones?) > No , you can mention bricks of a distributed subvolume in one command. > If you are having a 1x(2+1a) volume , then you should mention only one > brick. Start by removing the arbiter brick Ok. > 2) # gluster volume remove-brick VOLNAME BRICK status > (to monitor migration) > 3) # gluster volume remove-brick VOLNAME BRICK commit > (to finalize the removal) > 4) umount and reformat the freed (now unused) bricks > Is this safe? > What is the actual need to remove bricks? I need to move a couple of disks to a new server, to keep it all well balanced and increase the available space. > If you feel this volume is not needed anymore , then just delete the > volume, instead of going through each brick deletion Nono, the volume is needed and is currently hosting data I cannot lose... But I haven't space to copy it elsewhere... -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
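For reference, the start/status/commit flow for one whole replica set of this volume would look roughly like this (a sketch; subvolume 27 is used as the example, brick names are taken from the gluster v info above, and the replica count stays 3 because the entire subvolume is removed):
-8<--
gluster volume remove-brick BigVol str957-biostor:/srv/bricks/27/BigVol str957-biostor2:/srv/bricks/27/BigVol str957-biostq:/srv/arbiters/27/BigVol start
gluster volume remove-brick BigVol str957-biostor:/srv/bricks/27/BigVol str957-biostor2:/srv/bricks/27/BigVol str957-biostq:/srv/arbiters/27/BigVol status
# only when status reports "completed" for all bricks:
gluster volume remove-brick BigVol str957-biostor:/srv/bricks/27/BigVol str957-biostor2:/srv/bricks/27/BigVol str957-biostq:/srv/arbiters/27/BigVol commit
-8<--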
Re: [Gluster-users] Proper procedure to reduce an active volume
On 03/02/21 18:15, Strahil Nikolov wrote: Tks for the fast answer. > Replica volumes do not require the 'start + commit' - it's needed only > for distributed replicated volumes and other types of volumes. > Yet, I'm not sure if removing a data brick (and keeping the arbiter) > makes any sense. Usually, I just remove 1 data copy + the arbiter to > reshape the volume. Well, actually I need to remove both data bricks and the arbiters w/o losing the data. Probably that wasn't clear, sorry. The current pods have 28x10TB disks and all the arbiters are on a VM. The new pod has only 26 disks. What I want to do is remove one disk from each of the current pods, move one of the freed disks to the new pod (this way each pod will have 27 disks and I'll have a cold spare to quickly replace a failed disk) and distribute the arbiters between the three pods to dismiss the VM. If possible, I'd prefer to keep redundancy (hence not going to replica 1 in an intermediate step). > Keep in mind that as you remove a brick you need to specify the new > replica count. > For example you have 'replica 3 arbiter 1' and you want to remove the > second copy and the arbiter: > gluster volume remove-brick replica 1 server2:/path/to/brick > arbiter:/path/to/brick force That's what I want to avoid :) I need to migrate data out of s1:/bricks/27, s2:/bricks/27 and s3:/arbiters/27, redistributing it to the remaining bricks. BTW, isn't replica count an attribute of the whole volume? > If you wish to reuse block devices, don't forget to rebuild the FS (as > it's fastest way to cleanup)! Yup. Already been bitten by EAs (extended attributes) :) > When you increase the count (add second data brick and maybe arbiter), > you should run: > gluster volume add-brick replica 3 arbiter 1 > server4:/path/to/brick arbiter2:/path/to/brick > gluster volume heal full That will be useful when more disks are added. After removing the last bricks (isn't there a term for "all the components of a replica set"? slice?) I thought I could move the remaining bricks with replace-brick and keep a "rotating" distribution:
slice | s1  | s2  | s3
  00  | b00 | b00 | a00  (vm.a00 -> s2.a00)
  01  | a00 | b01 | b00  (s1.b01 -> s3.b00, vm.a01 -> s1.a00)
  02  | b01 | a00 | b01  (s1.b02 -> s1.b01, s2.b02 -> s3.b01, vm.a02 -> s2.a00)
[and so on]
That will take quite a long time (IIUC I cannot move data to a brick that is itself being moved to another... or at least it doesn't seem wise :) ). It's probably faster to first move arbiters and then the data. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
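Each of those per-slice moves would be a replace-brick followed by a heal, something like the sketch below (hostnames follow the shorthand of the table above; on replicated volumes only "commit force" is supported, and self-heal then repopulates the new brick):
-8<--
# move slice 00's arbiter from the VM to s2
gluster volume replace-brick BigVol vm:/srv/arbiters/00/BigVol s2:/srv/arbiters/00/BigVol commit force
# wait until nothing is pending before starting the next move
gluster volume heal BigVol info summary
-8<--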
[Gluster-users] Proper procedure to reduce an active volume
Hello all. What is the proper procedure to reduce a "replica 3 arbiter 1" volume? The procedure I've found is: 1) # gluster volume remove-brick VOLNAME BRICK start (repeat for each brick to be removed, but being a r3a1 should I remove both bricks and the arbiter in a single command or multiple ones?) 2) # gluster volume remove-brick VOLNAME BRICK status (to monitor migration) 3) # gluster volume remove-brick VOLNAME BRICK commit (to finalize the removal) 4) umount and reformat the freed (now unused) bricks Is this safe? And once the bricks are removed I'll have to distribute arbiters across the current two data servers and a new one (currently I'm using a dedicated VM just for the arbiters). But that's another pie :) -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] Very slow 'ls' ?
Hello all. I have a volume configured as: -8<-- root@str957-clustor00:~# gluster v info cluster_data Volume Name: cluster_data Type: Distributed-Replicate Volume ID: a8caaa90-d161-45bb-a68c-278263a8531a Status: Started Snapshot Count: 0 Number of Bricks: 21 x (2 + 1) = 63 Transport-type: tcp Bricks: Brick1: clustor00:/srv/bricks/00/d Brick2: clustor01:/srv/bricks/00/d Brick3: clustor02:/srv/quorum/00/d (arbiter) [...] Brick61: clustor01:/srv/bricks/13/d Brick62: clustor02:/srv/bricks/13/d Brick63: clustor00:/srv/quorum/06/d (arbiter) Options Reconfigured: client.event-threads: 2 performance.client-io-threads: off nfs.disable: on transport.address-family: inet features.quota: on features.inode-quota: on features.quota-deem-statfs: on features.default-soft-limit: 90 cluster.self-heal-daemon: enable -8<-- Connection between client and server is via InfiniBand (40G from the client, 100G between storage nodes), using ipoib (IIUC RDMA is deprecated and unmaintained). A simple "ls -ln" ('n' to avoid delays due to lookups) for a folder with just 7 entries takes more than 4s on the first run, ~1s on the next one and a reasonable 0.1s on the third (if I'm fast enough). I tried enabling client-io-threads, but seems it didn't change anything. Any hints? TIA! -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
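Directory listings on wide distributed-replicate volumes usually benefit from parallel readdir and client-side metadata caching. A sketch with the options most often suggested for this (option names as in the Gluster 7.x docs; values are illustrative, and parallel-readdir needs readdir-ahead enabled):
-8<--
gluster volume set cluster_data performance.readdir-ahead on
gluster volume set cluster_data performance.parallel-readdir on
# cache metadata/xattrs on the client so repeated ls calls stay local
gluster volume set cluster_data features.cache-invalidation on
gluster volume set cluster_data features.cache-invalidation-timeout 600
gluster volume set cluster_data performance.stat-prefetch on
gluster volume set cluster_data performance.md-cache-timeout 600
# a couple more event threads can also help on fast (IPoIB) links
gluster volume set cluster_data client.event-threads 4
-8<--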
Re: [Gluster-users] Replication logic
On 28/12/20 22:14, Zenon Panoussis wrote: > Is that so, or am imagining impossible acrobatics? Given the slow link, probably snail mail is faster. Configure a new node near the fast ones, add it to the pool, replace thin arbiters with full replicas on the new node, let it rebuild (fast, since it's "local"), then put it offline and send it to the final location. Once you turn it on again it will have to sync only the latest changes. Should take less than 3 weeks :) -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Replica 3 volume with forced quorum 1 fault tolerance and recovery
Il 01/12/20 15:23, Dmitry Antipov ha scritto: > At least I can imagine the volume option to specify "let's assume that > the only live brick contains the > most recent (and so hopefully valid) data, so newly (re)started ones are > pleased to heal from it" behavior. Too dangerous and prone to byzantine desync. Say only node 1 survives, and a file gets written to it. Then, while node 2 returns to activity, node 1 dies before being able to tell node2 what changed. Another client writes to the "same" file a different content. Now node 1 returns active and you have split-brain: no version of the file is "better" than the other. A returning node 3 can't know (in an automated way) which copy of the file should be replicated. That's why you should always have a quorum of N/2+1 when data integrity is important. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
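In volume-option terms, that N/2+1 rule is the combination of client-side and server-side quorum shown below (a sketch; VOLNAME is a placeholder, and 'auto' means writes need a majority of each replica set, i.e. 2 of 3 for replica 3):
-8<--
# client-side quorum: writes are refused without a majority of bricks
gluster volume set VOLNAME cluster.quorum-type auto
# server-side quorum: bricks are stopped when the trusted pool loses majority
gluster volume set VOLNAME cluster.server-quorum-type server
gluster volume set all cluster.server-quorum-ratio 51%
-8<--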
Re: [Gluster-users] Geo-replication status Faulty
On 27/10/20 13:15, Gilberto Nunes wrote: > I have applied this parameters to the 2-node gluster: > gluster vol set VMS cluster.heal-timeout 10 > gluster volume heal VMS enable > gluster vol set VMS cluster.quorum-reads false > gluster vol set VMS cluster.quorum-count 1 Urgh! IIUC you're begging for split-brain ... I think you should leave quorum-count=2 for safe writes. If a node is down, obviously the volume becomes readonly. But if you planned the downtime you can reduce quorum-count just before shutting it down. You'll have to bring it back to 2 before re-enabling the downed server, then wait for heal to complete before being able to down the second server. > Then I mount the gluster volume putting this line in the fstab file: > In gluster01 > gluster01:VMS /vms glusterfs > defaults,_netdev,x-systemd.automount,backupvolfile-server=gluster02 0 0 > In gluster02 > gluster02:VMS /vms glusterfs > defaults,_netdev,x-systemd.automount,backupvolfile-server=gluster01 0 0 Isn't it preferable to use the 'hostlist' syntax? gluster01,gluster02:VMS /vms glusterfs defaults,_netdev 0 0 A / at the beginning is optional, but can be useful if you're trying to use the diamond freespace collector (w/o the initial slash, it ignores glusterfs mountpoints). -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
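The "planned downtime" sequence described above maps to something like this (a sketch; VMS is the volume from this thread, and cluster.quorum-count only takes effect together with cluster.quorum-type fixed):
-8<--
# before taking one node down for maintenance:
gluster volume set VMS cluster.quorum-type fixed
gluster volume set VMS cluster.quorum-count 1
# ... maintenance, node comes back ...
gluster volume set VMS cluster.quorum-count 2
# wait for heal to finish before touching the other node
gluster volume heal VMS info summary
-8<--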
Re: [Gluster-users] Upgrade from 6.9 to 7.7 stuck (peer is rejected)
Il 27/10/20 07:40, mabi ha scritto: > First to answer your question how this first happened, I reached that issue > first by simply rebooting my arbiter node yesterday morning in order to due > some maintenance which I do on a regular basis and was never a problem before > GlusterFS 7.8. In my case the problem originated from the daemon being reaped by OOM killer, but the result was the same. You're in the same rat hole I've been into... IIRC you have to probe *a working node from the detached node* . I followed these instructions: https://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Administrator%20Guide/Resolving%20Peer%20Rejected/ Yes, they're for an ancient version, but it worked... -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
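The linked procedure boils down to roughly the steps below, run on the rejected peer (a sketch from those docs, not re-verified on 7.x; the hostname is a placeholder, /var/lib/glusterd is the usual location, and it's worth backing that directory up first):
-8<--
systemctl stop glusterd
cd /var/lib/glusterd
# keep the node's own UUID, drop the out-of-sync volume/peer metadata
find . -mindepth 1 -maxdepth 1 ! -name glusterd.info -exec rm -rf {} +
systemctl start glusterd
# probe a healthy node *from* the rejected one, then restart once more
gluster peer probe good-node
systemctl restart glusterd
gluster peer status
-8<--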
Re: [Gluster-users] Upgrade from 6.9 to 7.7 stuck (peer is rejected)
Il 26/10/20 15:09, mabi ha scritto: > Right, seen liked that this sounds reasonable. Do you actually remember the > exact command you ran in order to remove the brick? I was thinking this > should be it: > gluster volume remove-brick force > but should I use "force" or "start"? Memory does not serve me well (there are 28 disks, not 26!), but bash history does :) # gluster volume remove-brick BigVol replica 2 str957-biostq:/srv/arbiters/{00..27}/BigVol force # gluster peer detach str957-biostq # gluster peer probe str957-biostq # gluster volume add-brick BigVol replica 3 arbiter 1 str957-biostq:/srv/arbiters/{00..27}/BigVol You obviously have to wait for remove-brick to complete before detaching arbiter. >> IIRC it took about 3 days, but the arbiters are on a VM (8CPU, 8GB RAM) >> that uses an iSCSI disk. More than 80% continuous load on both CPUs and RAM. > That's quite long I must say and I am in the same case as you, my arbiter is > a VM. Give all the CPU and RAM you can. Less than 8GB RAM is asking for troubles (in my case). -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Upgrade from 6.9 to 7.7 stuck (peer is rejected)
On 26/10/20 14:46, mabi wrote: >> I solved it by "degrading" the volume to replica 2, then cleared the >> arbiter bricks and upgraded again to replica 3 arbiter 1. > Thanks Diego for pointing out this workaround. How much data do you have on > that volume in terms of TB and files? Because I have around 3TB of data in 10 > million files. So I am a bit worried of taking such drastic measures. The volume is built from 26 10TB disks w/ genetic data. I currently don't have exact numbers, but it's still at the beginning, so there is a bit less than 10TB actually used. But you're only removing the arbiters; you always have two copies of your files. The worst that can happen is a split-brain condition (avoidable by requiring a two-node quorum; in that case the worst is that the volume goes readonly). > How bad was the load after on your volume when re-adding the arbiter brick? > and how long did it take to sync/heal? IIRC it took about 3 days, but the arbiters are on a VM (8CPU, 8GB RAM) that uses an iSCSI disk. More than 80% continuous load on both CPUs and RAM. > Would another workaround such as turning off quotas on that problematic > volume work? That sounds much less scary but I don't know if that would > work... I don't know, sorry. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Upgrade from 6.9 to 7.7 stuck (peer is rejected)
Il 26/10/20 07:40, mabi ha scritto: > Thanks to this fix I could successfully upgrade from GlusterFS 6.9 to > 7.8 but now, 1 week later after the upgrade, I have rebooted my third > node (arbiter node) and unfortunately the bricks do not want to come up > on that node. I get the same following error message: IIRC it's the same issue I had some time ago. I solved it by "degrading" the volume to replica 2, then cleared the arbiter bricks and upgraded again to replica 3 arbiter 1. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] gluster replica 3 with third less powerful machine
On 20/10/20 15:53, Gilberto Nunes wrote: > I have 3 servers but the third one is a very low machine compared to the > 2 others servers. How much RAM does it have? > How could I create a replica 3 in order to prevent split-brain, but tell > the gluster to not use the third node too much??? You could have it host just arbiters in a "replica 3 arbiter 1" volume. I currently use a VM in this role, but it needs at least 8GB RAM to avoid OOM (it handles 26 arbiters, so you can probably get away with less if you have fewer bricks). My VM also has 8 CPUs to reduce the time needed for resync. Remember that backing filesystems for arbiters should be tweaked to allow a lot of inodes. I formatted my XFS volumes with mkfs.xfs -i size=512,maxpct=90 /dev/sdXn to allow up to 90% for inodes (instead of the usual 5%) => a single fs can handle multiple arbiter bricks. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
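Creating a volume laid out that way, with the light node holding only arbiter bricks, would look roughly like this (a sketch; host and path names are placeholders, and every third brick in the list becomes the arbiter of its replica set):
-8<--
gluster volume create myvol replica 3 arbiter 1 big1:/srv/bricks/00/myvol big2:/srv/bricks/00/myvol small:/srv/arbiters/00/myvol big1:/srv/bricks/01/myvol big2:/srv/bricks/01/myvol small:/srv/arbiters/01/myvol
gluster volume start myvol
-8<--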
Re: [Gluster-users] Low cost, extendable, failure tolerant home cloud storage question
On 04/10/20 21:29, Strahil Nikolov wrote: > In order to be safe you need 'replica 3' or a disperse volumes. At work I'm using "replica 3 arbiter 1" to balance storage overhead and data security. > In both cases extending by 1 brick (brick is not equal to node) is not > possible in most cases. For example in 'replica 3' you need to add 3 more > bricks (brick is a combination of 'server + directory' and it is recommended > to be on separate systems or it's a potential single point of failure). > Dispersed volumes also need to be extended in numbers of , so > if you have 4+2 (4 bricks , 2 are the maximum you can loose without dataloss > ) - you need to add another 6 bricks to extend. To extend a replica 3 arbiter 1 you only have to add two disks. And have enough inodes available on the third server. Don't underestimate inode use, especially if you're using a single partition for all the arbiters! >> - cheap nodes (with 1-2GB of RAM) able to handle the task (like Rpi, >Odroid >> XU4 or even HP T610) > You need a little bit more ram for daily usage and most probably more cores > as healing of data in replica is demanding (dispersed volumes are like raid's > parity and require some cpu). From my experience 8GB is the minimum during healing. Less than that and you'll get OOM kills and many problems. I'd recommend not less than 16G for an "arbiter only" server, and 32G for a replica server. These figures are for a volume with 26 10TB disks (two physical servers w/ 26 disks each plus the arbiter-only in a VM). > The idea with the ITX boards is not so bad. You can get 2 small systems and > create your erasure coding. Isn't EC a tad overkill with only 2 systems? BTW I noticed that too small systems are not practical: you have a lot of (nearly) fixed costs (motherboard, enclosure, power supply) that only manage 2-3 disks. Then, much depends on how much you think you'll expand your storage. An old theorem is that "the time required to fill a disk is constant" :) > Yet, I would prefer the 'replica 3 arbiter 1' approach as it doesn't take so > much space and extending will require only 2 data disks . And you won't have split-brain issues that are a mess to fix! -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
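In command terms, that "only two data disks plus an arbiter directory" extension of a distributed replica-3-arbiter-1 volume is a single add-brick (a sketch; names are placeholders, and the replica/arbiter keywords are not needed because the counts don't change):
-8<--
gluster volume add-brick myvol node1:/srv/bricks/01/myvol node2:/srv/bricks/01/myvol arb:/srv/arbiters/01/myvol
gluster volume rebalance myvol start
gluster volume rebalance myvol status
-8<--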
Re: [Gluster-users] Fwd: New GlusterFS deployment, doubts on 1 brick per host vs 1 brick per drive.
Il 09/09/20 15:30, Miguel Mascarenhas Filipe ha scritto: I'm a noob, but IIUC this is the option giving the best performance: > 2. 1 brick per drive, Gluster "distributed replicated" volumes, no > internal redundancy Clients can write to both servers in parallel and read scattered (read performance using multiple files ~ 16x vs 2x with a single disk per host). Moreover it's easier to extend. But why ZFS instead of XFS ? In my experience it's heavier. PS: add a third host ASAP, at least for arbiter volumes (replica 3 arbiter 1). Split brain can be a real pain to fix! -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
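Once a third host is available, an existing "1 brick per drive" replica 2 layout can be converted to "replica 3 arbiter 1" by adding one arbiter brick per replica pair (a sketch; hostnames/paths are placeholders, and the arbiter bricks must be listed in the same order as the existing subvolumes):
-8<--
# two subvolumes shown; repeat one arbiter path per existing replica pair
gluster volume add-brick myvol replica 3 arbiter 1 host3:/srv/arbiters/00/myvol host3:/srv/arbiters/01/myvol
gluster volume heal myvol info summary   # the arbiters populate via self-heal
-8<--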
Re: [Gluster-users] How to fix I/O error ? (resend)
On 28/08/20 10:31, Felix Kölzow wrote: > I faced a directory were a simple ls leads to input/output error. I saw something similar, but the directory was OK, except some files that reported "??" (IIRC in the size field). That got healed automatically. > I cd into the corresponding directory on the brick and I did a ls > command and it works. Well, you have to check all the bricks of a replica to be sure to get all the files. > # while read item > # do > # rm -rf $item > # done < /tmp/mylist Before this I'd have saved the files outside of the bricks :) > Thats it. Afterwards, I copied the deleted files back from our backup. Ah, you had a backup! :) > Please give me a hint if this procedure also works for you. Different situation, but it could probably work. Except for the fact we don't have a backup of those files :( Our volume is mostly used for archiving, so writes are rare. I know really well redundancy is no substitute for a backup (with redundancy only, if a file gets deleted, it's lost -- for this, a WORM translator could be useful :) ). BTW, in my case I noticed that having the two replicas online and bringing down the arbiters brought the files back online, so I completely removed the arbiter bricks (degrading to replica 2) and I'm now slowly re-adding 'em to have "replica 3 arbiter 1" again (see "node sizing" thread). -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] Node sizing
Hello all. I just noticed that rebuilding arbiter bricks is using lots of CPU and RAM. I thought it was quite a lightweight op so I installed the arbiter node in a VM, but 8CPUs and 16GB RAM are maxed out (and a bit of swap gets used, too). The volume is 28*(2+1) 10TB bricks. Gluster v 5.5 . Is there some rule of thumb for sizing nodes? I couldn't find anything... TIA. -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] How to fix I/O error ? (resend)
Il 25/08/20 15:27, Amar Tumballi ha scritto: > I am not aware of any data layout changes we did between current latest > (7.7) and 3.8.8. But due to some issues, 'online' migration is not > possible, even the clients needs to be updated, so you have to umount > the volume once. Tks for the info. Actually the issue is less bad than I thought: I checked on a client that (somehow) still used Debian oldstable. Current stable uses 5.5, still old but not prehistoric :) Too bad the original issue still persists, even after removing the file and its hardlink from .gluster dir :( Maybe the upgrade can fix it? Or I risk breaking it even more? -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] How to fix I/O error ? (resend)
Il 24/08/20 15:23, Diego Zuccato ha scritto: > I'm now completely out of ideas :( Actually I have one last idea. My nodes are installed from standard Debian "stable" repos. That means they're version 3.8.8 ! I understand it's an ancient version. What's the recommended upgrade path to a current version? Possibly keeping the data safe: I have nowhere to move all those TBs to... -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] How to fix I/O error ? (resend)
Il 21/08/20 13:56, Diego Zuccato ha scritto: Hello again. I also tried disabling bitrot (and re-enabling it afterwards) and the procedure for recovery from split-brain[*] removing the file and its link from one of the nodes, but no luck. I'm now completely out of ideas :( How can I resync those gfids ? Tks! Diego [*] even if "gluster volume heal BigVol info split-brain" reports 0 for every brick. > Hello all. > > I have a volume setup as: > -8<-- > root@str957-biostor:~# gluster v info BigVol > > Volume Name: BigVol > Type: Distributed-Replicate > Volume ID: c51926bd-6715-46b2-8bb3-8c915ec47e28 > Status: Started > Snapshot Count: 0 > Number of Bricks: 28 x (2 + 1) = 84 > Transport-type: tcp > Bricks: > Brick1: str957-biostor2:/srv/bricks/00/BigVol > Brick2: str957-biostor:/srv/bricks/00/BigVol > Brick3: str957-biostq:/srv/arbiters/00/BigVol (arbiter) > [...] > Options Reconfigured: > cluster.granular-entry-heal: enable > client.event-threads: 8 > server.event-threads: 8 > server.ssl: on > client.ssl: on > nfs.disable: on > performance.readdir-ahead: on > transport.address-family: inet > features.bitrot: on > features.scrub: Active > features.scrub-freq: biweekly > auth.ssl-allow: str957-bio* > ssl.certificate-depth: 1 > cluster.self-heal-daemon: enable > features.quota: on > features.inode-quota: on > features.quota-deem-statfs: on > server.manage-gids: on > features.scrub-throttle: aggressive > -8<-- > > After a couple failures (a disk on biostor2 went "missing", and glusterd > on biostq got killed by OOM) I noticed that some files can't be accessed > from the clients: > -8<-- > $ ls -lh 1_germline_CGTACTAG_L005_R* > -rwxr-xr-x 1 e.f domain^users 2,0G apr 24 2015 > 1_germline_CGTACTAG_L005_R1_001.fastq.gz > -rwxr-xr-x 1 e.f domain^users 2,0G apr 24 2015 > 1_germline_CGTACTAG_L005_R2_001.fastq.gz > $ ls -lh 1_germline_CGTACTAG_L005_R1_001.fastq.gz > ls: cannot access '1_germline_CGTACTAG_L005_R1_001.fastq.gz': > Input/output error > -8<-- > (note that if I request ls for more files, it works...). > > The files have exactly the same contents (verified via md5sum). The only > difference is in getfattr: trusted.bit-rot.version is > 0x17005f3f9e670002ad5b on a node and > 0x12005f3ce7af000dccad on the other. > > On the client, the log reports: > -8<- > [2020-08-21 11:32:52.208809] W [MSGID: 108008] > [afr-self-heal-name.c:354:afr_selfheal_name_gfid_mismatch_check] > 4-BigVol-replicate-13: GFID mismatch for > /1_germline_CGTACTAG_L005_R1_001.fastq.gz > d70a4a6d-05fc-4988-8041-5e7f62155fe5 on BigVol-client-55 and > f249f88a-909f-489d-8d1d-d428e842ee96 on BigVol-client-34 > [2020-08-21 11:32:52.209768] W [fuse-bridge.c:471:fuse_entry_cbk] > 0-glusterfs-fuse: 233606: LOOKUP() > /[...]/1_germline_CGTACTAG_L005_R1_001.fastq.gz => -1 (Errore di > input/output) > -8<-- > > As suggested on IRC, I tested the RAM, but the only thing I got have > been a "Peer rejected" status due to another OOM kill. No problem, I've > been able to resolve it, but the original problem still remains. > > What else can I do? > > TIA! > > -- > Diego Zuccato > DIFA - Dip. di Fisica e Astronomia > Servizi Informatici > Alma Mater Studiorum - Università di Bologna > V.le Berti-Pichat 6/2 - 40127 Bologna - Italy > tel.: +39 051 20 95786 > > > > > Community Meeting Calendar: > > Schedule - > Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC > Bridge: https://bluejeans.com/441850968 > > Gluster-users mailing list > Gluster-users@gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > -- Diego Zuccato DIFA - Dip. 
di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
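When the mismatch is on the data bricks, the usual manual fix is to pick the copy to discard on one brick and remove both the file and its hard link under .glusterfs, then let self-heal recreate it from the good copy. A sketch only; the gfid comes from the log above, while the brick path and file path are placeholders that must be verified on the actual bricks first (and it's wise to save a copy of the file elsewhere beforehand):
-8<--
# on the brick whose copy is to be discarded (say the one carrying gfid
# f249f88a-909f-489d-8d1d-d428e842ee96):
BRICK=/srv/bricks/NN/BigVol
F="$BRICK/path/to/1_germline_CGTACTAG_L005_R1_001.fastq.gz"
getfattr -n trusted.gfid -e hex "$F"   # confirm which gfid this copy carries
rm "$F"
rm "$BRICK/.glusterfs/f2/49/f249f88a-909f-489d-8d1d-d428e842ee96"
# then stat the file from a client mount and run:
gluster volume heal BigVol
-8<--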
[Gluster-users] How to fix I/O error ? (resend)
Hello all. I have a volume setup as: -8<-- root@str957-biostor:~# gluster v info BigVol Volume Name: BigVol Type: Distributed-Replicate Volume ID: c51926bd-6715-46b2-8bb3-8c915ec47e28 Status: Started Snapshot Count: 0 Number of Bricks: 28 x (2 + 1) = 84 Transport-type: tcp Bricks: Brick1: str957-biostor2:/srv/bricks/00/BigVol Brick2: str957-biostor:/srv/bricks/00/BigVol Brick3: str957-biostq:/srv/arbiters/00/BigVol (arbiter) [...] Options Reconfigured: cluster.granular-entry-heal: enable client.event-threads: 8 server.event-threads: 8 server.ssl: on client.ssl: on nfs.disable: on performance.readdir-ahead: on transport.address-family: inet features.bitrot: on features.scrub: Active features.scrub-freq: biweekly auth.ssl-allow: str957-bio* ssl.certificate-depth: 1 cluster.self-heal-daemon: enable features.quota: on features.inode-quota: on features.quota-deem-statfs: on server.manage-gids: on features.scrub-throttle: aggressive -8<-- After a couple failures (a disk on biostor2 went "missing", and glusterd on biostq got killed by OOM) I noticed that some files can't be accessed from the clients: -8<-- $ ls -lh 1_germline_CGTACTAG_L005_R* -rwxr-xr-x 1 e.f domain^users 2,0G apr 24 2015 1_germline_CGTACTAG_L005_R1_001.fastq.gz -rwxr-xr-x 1 e.f domain^users 2,0G apr 24 2015 1_germline_CGTACTAG_L005_R2_001.fastq.gz $ ls -lh 1_germline_CGTACTAG_L005_R1_001.fastq.gz ls: cannot access '1_germline_CGTACTAG_L005_R1_001.fastq.gz': Input/output error -8<-- (note that if I request ls for more files, it works...). The files have exactly the same contents (verified via md5sum). The only difference is in getfattr: trusted.bit-rot.version is 0x17005f3f9e670002ad5b on a node and 0x12005f3ce7af000dccad on the other. On the client, the log reports: -8<- [2020-08-21 11:32:52.208809] W [MSGID: 108008] [afr-self-heal-name.c:354:afr_selfheal_name_gfid_mismatch_check] 4-BigVol-replicate-13: GFID mismatch for /1_germline_CGTACTAG_L005_R1_001.fastq.gz d70a4a6d-05fc-4988-8041-5e7f62155fe5 on BigVol-client-55 and f249f88a-909f-489d-8d1d-d428e842ee96 on BigVol-client-34 [2020-08-21 11:32:52.209768] W [fuse-bridge.c:471:fuse_entry_cbk] 0-glusterfs-fuse: 233606: LOOKUP() /[...]/1_germline_CGTACTAG_L005_R1_001.fastq.gz => -1 (Errore di input/output) -8<-- As suggested on IRC, I tested the RAM, but the only thing I got have been a "Peer rejected" status due to another OOM kill. No problem, I've been able to resolve it, but the original problem still remains. What else can I do? TIA! -- Diego Zuccato DIFA - Dip. di Fisica e Astronomia Servizi Informatici Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
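Since bitrot is enabled on this volume, the scrubber's view of those files may also be worth a look; as far as I can tell, a differing trusted.bit-rot.version xattr is per-brick signing metadata and not, by itself, a sign of corruption. A sketch (the scrub status subcommand exists in recent releases; not sure it is available in 3.8.x):
-8<--
gluster volume bitrot BigVol scrub status
# the per-node output lists any corrupted objects (by gfid) found by the scrubber
-8<--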
Re: [Gluster-users] Replicate over WAN?
On 05/05/2010 20:18, Vikas Gorur wrote: 2) readdir (ls) is always sent to the first subvolume. This is necessary to ensure consistent inode numbers. Uhm... Couldn't the same result be achieved by storing a virtual inode number in an attribute? So that it gets replicated with the rest of the data and makes it possible to have the first subvolume always local... I understand that it could lead to possible problems (like how do I generate an inode number if the master node is missing), but it could open the door to the "replicate writes, local reads" behaviour that many people are requesting... The scenario to think about is a firm w/ a remote office connected via a VPN -- if you can cut nearly all the read traffic from the VPN, then you see a great boost in performance. Or maybe I missed something... -- Diego Zuccato Servizi Informatici Dip. di Astronomia - Università di Bologna Via Ranzani, 1 - 40126 Bologna - Italy tel.: +39 051 20 95786 mail: diego.zucc...@unibo.it ___ Gluster-users mailing list Gluster-users@gluster.org http://gluster.org/cgi-bin/mailman/listinfo/gluster-users