[ovirt-users] Re: storage healing question

2018-11-12 Thread Ravishankar N

Hi,

Can you restart the self-heal daemon by running `gluster volume start 
bgl-vms-gfs force` and then launch the heal again? If you are seeing 
different entries and counts each time you run heal info, there is 
likely a network issue (a disconnect) between the (gluster fuse?) mount 
and the bricks of the volume, leading to pending heals.
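
A minimal sketch of that sequence (the force start only (re)starts volume 
processes that are down; it does not touch data):

gluster volume start bgl-vms-gfs force    # restart any brick/self-heal daemon processes that are down
gluster volume heal bgl-vms-gfs           # trigger an index heal of the pending entries
gluster volume heal bgl-vms-gfs info      # re-check the per-brick entry counts

If the counts keep fluctuating, also grep the fuse mount logs on the 
hypervisors (under /var/log/glusterfs/) for "disconnected" messages from 
the bgl-vms-gfs-client-* translators to confirm or rule out a disconnect.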


Also, there was a bug in arbiter volumes[1] that got fixed in glusterfs 
3.12.15. It can cause VMs to pause when you reboot the arbiter node, so 
it is recommended to upgrade to this gluster version.
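
A rough sketch of checking and updating the packages, assuming RPM-based 
hosts pulling glusterfs from the usual CentOS Storage SIG repositories and 
upgrading one node at a time after pending heals complete:

rpm -q glusterfs-server       # currently installed version
yum update 'glusterfs*'       # update to >= 3.12.15 from the configured repo
gluster --version | head -1   # confirm the version after the update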


HTH,
Ravi

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1637989

From: Dev Ops
Date: Mon, Nov 12, 2018 at 1:09 PM
Subject: [ovirt-users] Re: storage healing question
To:


Any help would be appreciated. I have since rebooted the 3rd gluster
node which is the arbiter. This doesn't seem to want to heal.

gluster volume heal bgl-vms-gfs info |grep Number
Number of entries: 68
Number of entries: 0
Number of entries: 68


[ovirt-users] Re: storage healing question

2018-11-11 Thread Dev Ops
Any help would be appreciated. I have since rebooted the 3rd gluster node which 
is the arbiter. This doesn't seem to want to heal. 

gluster volume heal bgl-vms-gfs info |grep Number
Number of entries: 68
Number of entries: 0
Number of entries: 68
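
Two follow-up checks that might help narrow this down (a sketch using the 
standard gluster CLI plus watch):

gluster volume heal bgl-vms-gfs info split-brain    # confirm whether any of the entries are actually in split-brain
watch -n 60 'gluster volume heal bgl-vms-gfs info | grep "Number of entries"'    # see whether the counts trend downwards over time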


[ovirt-users] Re: storage healing question

2018-11-09 Thread Dev Ops
Just a quick note: the volume in question is actually called bgl-vms-gfs. The 
original message is still valid.

[root@bgl-vms-gfs03 bricks]# gluster volume heal bgl-vms-gfs info
Brick 10.8.255.1:/gluster/bgl-vms-gfs01/brick
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.989
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.988
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.423
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.612
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.614
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.611
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.236
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.48
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.52
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.423
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.424
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.611
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.612
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.799
/.shard/18954415-3210-4d93-8591-0b3e1e5b3a16.498
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.1175
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.1551
/c71bb8b0-c669-4bf6-8348-14aafd4a805f/images/9dc54d22-7cb3-4e07-adbb-70f0ec5b7e6b/5f8515f7-3fae-4af6-adc4-d38426a9aa72
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.611
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.424
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.50
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.51
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.424
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.236
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.425
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.428
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.427
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.614
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.1363
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.238
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.428
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.612
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.423
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.614
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.987
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.429
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.429
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.241
/c71bb8b0-c669-4bf6-8348-14aafd4a805f/dom_md/ids
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.987
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.987
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.241
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.429
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.424
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.987
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.987
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.238
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.428
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.238
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.611
/.shard/18954415-3210-4d93-8591-0b3e1e5b3a16.504
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.238
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.428
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.612
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.614
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.241
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.429
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.236
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.989
/.shard/18954415-3210-4d93-8591-0b3e1e5b3a16.909
/__DIRECT_IO_TEST__
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.240
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.429
/.shard/18954415-3210-4d93-8591-0b3e1e5b3a16.497
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.238
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.428
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.241
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.991
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.236
/.shard/cca2d4d0-7254-49c5-9db0-c9aaeb34c479.990
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.48
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.52
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.424
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.425
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.799
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.1175
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.1551
/.shard/6792d5d0-1bd2-41cf-a48e-dbe015d3e9fd.1363
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.612
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.614
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.236
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.611
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.989
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.988
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.990
/.shard/dfe31381-6b91-4eb1-9050-0332182e424a.991
Status: Connected
Number of entries: 86

Brick 10.8.255.2:/gluster/bgl-vms-gfs02/brick
Status: Connected
Number of entries: 0

Brick 10.8.255.3:/gluster/bgl-vms-gfs03/brick
/.shard/bd0bf192-e0e1-4b72-85cb-fa3497c555be.236
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.48
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.52
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.423
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.424
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.611
/.shard/5bb5bc8b-abfb-4ab8-9f12-cbc020b3d50f.612

[ovirt-users] Re: storage healing question

2018-11-08 Thread Sahina Bose
On Fri, Nov 9, 2018 at 3:42 AM Dev Ops  wrote:
>
> The switches above our environment had some VPC issues and the port channels 
> went offline. The affected ports belonged to 2 of the gfs nodes in our 
> environment; we have 3 storage nodes total, with the 3rd being the arbiter. I 
> wound up rebooting the first 2 nodes and everything came back happy. After a 
> few hours I noticed that the storage was up but complaining about being out 
> of sync and needing healing. Within the hour a VM had paused itself 
> due to storage issues. This is a small environment, for now, with only 30 
> VMs. I am new to oVirt so this is uncharted territory for me. I am tailing 
> some logs, things look sort of normal, and Google is sending me down a 
> wormhole.
>
> If I run "gluster volume heal cps-vms-gfs info", the number of entries seems 
> to change pretty regularly. The logs are showing lots of entries like this:
>
> [2018-11-08 21:55:05.996675] I [MSGID: 114047] 
> [client-handshake.c:1242:client_setvolume_cbk] 0-cps-vms-gfs-client-1: Server 
> and Client lk-version numbers are not same, reopening the fds
> [2018-11-08 21:55:05.997693] I [MSGID: 108002] [afr-common.c:5312:afr_notify] 
> 0-cps-vms-gfs-replicate-0: Client-quorum is met
> [2018-11-08 21:55:05.997717] I [MSGID: 114035] 
> [client-handshake.c:202:client_set_lk_version_cbk] 0-cps-vms-gfs-client-1: 
> Server lk version = 1
>
> I guess I am curious what else I should be looking for. Is this just taking 
> forever to heal? Is there something else I can run, or should do, to verify 
> things are actually getting better? I ran an actual heal command and it 
> cleared everything for a few seconds, but then the entries started to populate 
> again when I ran the info command.
>
> [root@cps-vms-gfs01 glusterfs]# gluster volume status
> Status of volume: cps-vms-gfs
> Gluster process                                TCP Port  RDMA Port  Online  Pid
> -------------------------------------------------------------------------------
> Brick 10.8.255.1:/gluster/cps-vms-gfs01/brick  49152     0          Y       4054
> Brick 10.8.255.2:/gluster/cps-vms-gfs02/brick  49152     0          Y       4144
> Brick 10.8.255.3:/gluster/cps-vms-gfs03/brick  49152     0          Y       4294
> Self-heal Daemon on localhost                  N/A       N/A        Y       4279
> Self-heal Daemon on cps-vms-gfs02.cisco.com    N/A       N/A        Y       5185
> Self-heal Daemon on 10.196.152.145             N/A       N/A        Y       50948
>
> Task Status of Volume cps-vms-gfs
> -------------------------------------------------------------------------------
> There are no active volume tasks
>
> I am running oVirt 4.2.5 and Gluster 3.12.11.

Can you provide the output of `gluster volume heal cps-vms-gfs info`, the log 
from /var/log/glusterfs/glfsheal-cps-vms-gfs.log, and the brick logs from 
/var/log/glusterfs/bricks for this volume?
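
One way to collect these, run on each gluster node (a sketch; the paths are 
the defaults mentioned above, and the heal info output is captured alongside 
the logs):

gluster volume heal cps-vms-gfs info > /tmp/heal-info-$(hostname).txt
tar czf /tmp/cps-vms-gfs-logs-$(hostname).tar.gz \
    /tmp/heal-info-$(hostname).txt \
    /var/log/glusterfs/glfsheal-cps-vms-gfs.log \
    /var/log/glusterfs/bricks/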



>
> Thanks!