Re: [Gluster-users] [ovirt-users] ovirt 4.1 hosted engine hyper converged on glusterfs 3.8.10 : "engine" storage domain alway complain about "unsynced" elements

2017-07-24 Thread Kasturi Narra
These errors are because glusternw is not assigned to the correct
interface. Once you attach it, these errors should go away. This has
nothing to do with the problem you are seeing.
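
As a quick sanity check before mapping the gluster network role in the engine,
you can confirm which hostnames the bricks are registered with and that those
names resolve to the dedicated gluster subnet (a rough sketch; volume, host
names and subnet are the ones from this thread):

    # on any gluster node: which hostnames are the bricks registered with?
    gluster volume info engine | grep Brick
    # those hostnames should resolve onto the dedicated gluster network (10.10.10.0/24 here)
    getent hosts gdnode01 gdnode02 gdnode04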

Sahina, any idea about the engine not showing the correct volume info?

On Mon, Jul 24, 2017 at 7:30 PM, yayo (j)  wrote:

> Hi,
>
> UI refreshed but the problem still remains ...
>
> No specific error; I have only these errors, but I've read that there is
> no problem if I have this kind of error:
>
>
> 2017-07-24 15:53:59,823+02 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler2) [b7590c4] START, GlusterServersListVDSCommand(HostName = node01.localdomain.local, VdsIdVDSCommandParametersBase:{runAsync='true', hostId='4c89baa5-e8f7-4132-a4b3-af332247570c'}), log id: 29a62417
> 2017-07-24 15:54:01,066+02 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler2) [b7590c4] FINISH, GlusterServersListVDSCommand, return: [10.10.20.80/24:CONNECTED, node02.localdomain.local:CONNECTED, gdnode04:CONNECTED], log id: 29a62417
> 2017-07-24 15:54:01,076+02 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler2) [b7590c4] START, GlusterVolumesListVDSCommand(HostName = node01.localdomain.local, GlusterVolumesListVDSParameters:{runAsync='true', hostId='4c89baa5-e8f7-4132-a4b3-af332247570c'}), log id: 7fce25d3
> 2017-07-24 15:54:02,209+02 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode01:/gluster/engine/brick' of volume 'd19c19e3-910d-437b-8ba7-4f2a23d17515' with correct network as no gluster network found in cluster '0002-0002-0002-0002-017a'
> 2017-07-24 15:54:02,212+02 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode02:/gluster/engine/brick' of volume 'd19c19e3-910d-437b-8ba7-4f2a23d17515' with correct network as no gluster network found in cluster '0002-0002-0002-0002-017a'
> 2017-07-24 15:54:02,215+02 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode04:/gluster/engine/brick' of volume 'd19c19e3-910d-437b-8ba7-4f2a23d17515' with correct network as no gluster network found in cluster '0002-0002-0002-0002-017a'
> 2017-07-24 15:54:02,218+02 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode01:/gluster/data/brick' of volume 'c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d' with correct network as no gluster network found in cluster '0002-0002-0002-0002-017a'
> 2017-07-24 15:54:02,221+02 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode02:/gluster/data/brick' of volume 'c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d' with correct network as no gluster network found in cluster '0002-0002-0002-0002-017a'
> 2017-07-24 15:54:02,224+02 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode04:/gluster/data/brick' of volume 'c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d' with correct network as no gluster network found in cluster '0002-0002-0002-0002-017a'
> 2017-07-24 15:54:02,224+02 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler2) [b7590c4] FINISH, GlusterVolumesListVDSCommand, return: {d19c19e3-910d-437b-8ba7-4f2a23d17515=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@fdc91062, c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@999a6f23}, log id: 7fce25d3
>
>
> Thank you
>
>
> 2017-07-24 8:12 GMT+02:00 Kasturi Narra :
>
>> Hi,
>>
>> Regarding the UI showing incorrect information about the engine and data
>> volumes, can you please refresh the UI and see if the issue persists, and
>> check for any errors in the engine.log files?
>>
>> Thanks
>> kasturi
>>
>> On Sat, Jul 22, 2017 at 11:43 AM, Ravishankar N 
>> wrote:
>>
>>>
>>> On 07/21/2017 11:41 PM, yayo (j) wrote:
>>>
>>> Hi,
>>>
>>> Sorry to follow up again, but, checking the oVirt interface, I've found
>>> that oVirt reports the "engine" volume as an "arbiter" configuration and the
>>> "data" volume as a fully replicated volume. See these screenshots:
>>>
>>>
>>> This is probably some refresh bug in the UI, Sahina might be able to
>>> tell you.
>>>
>>>
>>> https://drive.google.com/drive/folders/0ByUV7xQtP1gCTE8tUTFf
>>> VmR5aDQ?usp=sharing
>>>
>>> But the "gluster volume info" command 

Re: [Gluster-users] set owner:group on root of volume

2017-07-24 Thread mabi
I can now also answer your question 3): I just did a stop and start of the
volume, and yes, the owner and group of the root directory of my volume get
set correctly again to UID/GID 1000. The problem is that it is now just a
matter of time before it somehow gets reset back to root:root...
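
For anyone following along, the check above amounts to something like this
(a minimal sketch; "myvol" and the mount point are the names used elsewhere
in this thread):

    gluster volume stop myvol
    gluster volume start myvol
    # verify ownership of the volume root from a fuse mount
    stat -c '%u:%g' /mnt/myglustervolume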

>  Original Message 
> Subject: Re: [Gluster-users] set owner:group on root of volume
> Local Time: July 23, 2017 8:15 PM
> UTC Time: July 23, 2017 6:15 PM
> From: vbel...@redhat.com
> To: mabi , Gluster Users 
> On 07/20/2017 03:13 PM, mabi wrote:
>> Anyone has an idea? or shall I open a bug for that?
> This is an interesting problem. A few questions:
> 1. Is there any chance that one of your applications does a chown on the
> root?
> 2. Do you notice any logs related to metadata self-heal on "/" in the
> gluster logs?
> 3. Does the ownership of all bricks reset to custom uid/gid after every
> restart of the volume?
> Thanks,
> Vijay
>>
>>
>>>  Original Message 
>>> Subject: Re: set owner:group on root of volume
>>> Local Time: July 18, 2017 3:46 PM
>>> UTC Time: July 18, 2017 1:46 PM
>>> From: m...@protonmail.ch
>>> To: Gluster Users 
>>>
>>> Unfortunately the root directory of my volume still gets its owner and
>>> group reset to root. Can someone explain why, or help with this
>>> issue? I need it to be set to UID/GID 1000 and stay that way.
>>>
>>> Thanks
>>>
>>>
>>>
  Original Message 
 Subject: Re: set owner:group on root of volume
 Local Time: July 11, 2017 9:33 PM
 UTC Time: July 11, 2017 7:33 PM
 From: m...@protonmail.ch
 To: Gluster Users 

 Just found out I needed to set the following two parameters:

 gluster volume set myvol storage.owner-uid 1000
 gluster volume set myvol storage.owner-gid 1000



 In case that helps anyone else :)

>  Original Message 
> Subject: set owner:group on root of volume
> Local Time: July 11, 2017 8:15 PM
> UTC Time: July 11, 2017 6:15 PM
> From: m...@protonmail.ch
> To: Gluster Users 
>
> Hi,
>
> By default the owner and group of a GlusterFS volume seem to be root:root.
> I changed this by first mounting my volume using glusterfs/fuse
> on a client and then running the following:
>
> chown 1000:1000 /mnt/myglustervolume
>
> This correctly changed the owner and group of my volume to UID/GID 1000,
> but about 1-2 hours later it was back to root:root. I tried
> again and the same thing happened.
>
> Am I doing something wrong here? I am using GlusterFS 3.8.11 on
> Debian 8.
>
> Regards,
> M.
>
>
>

>>>
>>
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] vol status detail - times out?

2017-07-24 Thread lejeczek
Here is what I found out (it is probably reproducible with the
version I use, and I'd imagine it is fairly critical).


There was a peer which had, I now realize, not been fully
detached. Three peers constituted a working cluster;
on these peers the fourth peer did not exist (it was detached),
but that detached peer (was it a network issue? hmm) still saw itself
(and the other three peers) as a member of the cluster.

On the "detached" peer I ran:
$ gluster peer status
and it still showed the three other peers. How exactly that happened
I cannot explain.
Nonetheless, that fourth, detached peer seems to have caused all
these problems. As soon as I stopped gluster there, I was
able to run successfully all the actions that had
previously failed.
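
For anyone hitting the same symptom, a rough cleanup sketch (hostnames are
placeholders; verify which peer is stale before touching anything):

    # compare the peer view on every node; a "ghost" peer is one only it knows about
    gluster peer status
    # on the stale node, stop glusterd so it can no longer interfere
    systemctl stop glusterd
    # from a healthy node, make sure the peer is really gone
    gluster peer detach <stale-hostname> force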


This, if true, is in my mind a serious problem: it will
cause a user/admin a headache when troubleshooting (like it
did me), especially if that admin forgot there was
another peer somewhere (like I did; it was a VM in OpenStack).


It would be great if a developer could investigate or confirm this;
hopefully new versions are, or will be, resilient to such cases
of "ghost" peers.


many thanks.
L

On 24/07/17 14:26, Atin Mukherjee wrote:
Yes, it could, as depending on the number of bricks there might
be too many brick ops involved. This is the reason we
introduced the --timeout option in the CLI, which can be used to
set a larger timeout value. However, this fix is only
available from release-3.9 onwards.


On Mon, Jul 24, 2017 at 3:54 PM, lejeczek 
> wrote:


hi fellas

would you know what could be the problem when 'vol
status detail' always times out?
After I did the above I had to restart glusterd on the
peer where the command was issued.
I run 3.8.14. Everything else seems to work OK.

many thanks
L.
___
Gluster-users mailing list
Gluster-users@gluster.org

http://lists.gluster.org/mailman/listinfo/gluster-users





___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [ovirt-users] ovirt 4.1 hosted engine hyper converged on glusterfs 3.8.10 : "engine" storage domain alway complain about "unsynced" elements

2017-07-24 Thread yayo (j)
>
> All these IPs are pingable and the hosts are resolvable across all 3 nodes,
>> but only the 10.10.10.0 network is the dedicated network for gluster (resolved
>> using the gdnode* host names) ... Do you think that removing the other entries
>> could fix the problem? And, sorry, but how can I remove the other entries?
>>
> I don't think having extra entries could be a problem. Did you check the
> fuse mount logs for disconnect messages that I referred to in the other
> email?
>



* tail -f
/var/log/glusterfs/rhev-data-center-mnt-glusterSD-dvirtgluster\:engine.log*

*NODE01:*


[2017-07-24 07:34:00.799347] E [glusterfsd-mgmt.c:1908:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: gdnode03 (Transport endpoint is not connected)
[2017-07-24 07:44:46.687334] I [glusterfsd-mgmt.c:1926:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2017-07-24 09:04:25.951350] E [glusterfsd-mgmt.c:1908:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: gdnode03 (Transport endpoint is not connected)
[2017-07-24 09:15:11.839357] I [glusterfsd-mgmt.c:1926:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2017-07-24 10:34:51.231353] E [glusterfsd-mgmt.c:1908:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: gdnode03 (Transport endpoint is not connected)
[2017-07-24 10:45:36.991321] I [glusterfsd-mgmt.c:1926:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2017-07-24 12:05:16.383323] E [glusterfsd-mgmt.c:1908:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: gdnode03 (Transport endpoint is not connected)
[2017-07-24 12:16:02.271320] I [glusterfsd-mgmt.c:1926:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2017-07-24 13:35:41.535308] E [glusterfsd-mgmt.c:1908:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: gdnode03 (Transport endpoint is not connected)
[2017-07-24 13:46:27.423304] I [glusterfsd-mgmt.c:1926:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers



Why gdnode03 again? It was removed from gluster! It was the arbiter node...
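
One way to check where the old node is still referenced is to look at how the
mount was started (a rough diagnostic sketch; the mount point and option names
may differ in your setup):

    # see which volfile server(s) the running engine mount was started with
    grep engine /proc/mounts
    ps aux | grep '[g]lusterfs.*engine'
    # if gdnode03 still shows up, remount (or fix /etc/fstab) so only live nodes are listed, e.g.:
    # mount -t glusterfs -o backup-volfile-servers=gdnode02:gdnode04 gdnode01:/engine /rhev/data-center/mnt/glusterSD/...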


*NODE02:*


[2017-07-24 14:08:18.709209] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed data selfheal on db56ac00-fd5b-4326-a879-326ff56181de. sources=0 [1]  sinks=2
[2017-07-24 14:08:38.746688] I [MSGID: 108026] [afr-self-heal-metadata.c:51:__afr_selfheal_metadata_do] 0-engine-replicate-0: performing metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81
[2017-07-24 14:08:38.749379] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81. sources=0 [1]  sinks=2
[2017-07-24 14:08:46.068001] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed data selfheal on db56ac00-fd5b-4326-a879-326ff56181de. sources=0 [1]  sinks=2
The message "I [MSGID: 108026] [afr-self-heal-metadata.c:51:__afr_selfheal_metadata_do] 0-engine-replicate-0: performing metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81" repeated 3 times between [2017-07-24 14:08:38.746688] and [2017-07-24 14:10:09.088625]
The message "I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81. sources=0 [1]  sinks=2 " repeated 3 times between [2017-07-24 14:08:38.749379] and [2017-07-24 14:10:09.091377]
[2017-07-24 14:10:19.384379] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed data selfheal on db56ac00-fd5b-4326-a879-326ff56181de. sources=0 [1]  sinks=2
[2017-07-24 14:10:39.433155] I [MSGID: 108026] [afr-self-heal-metadata.c:51:__afr_selfheal_metadata_do] 0-engine-replicate-0: performing metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81
[2017-07-24 14:10:39.435847] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81. sources=0 [1]  sinks=2



*NODE04:*


[2017-07-24 14:08:56.789598] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed data selfheal on e6dfd556-340b-4b76-b47b-7b6f5bd74327. sources=[0] 1  sinks=2
[2017-07-24 14:09:17.231987] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed data selfheal on db56ac00-fd5b-4326-a879-326ff56181de. sources=[0] 1  sinks=2
[2017-07-24 14:09:38.039541] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed data selfheal on e6dfd556-340b-4b76-b47b-7b6f5bd74327. sources=[0] 1  sinks=2
[2017-07-24 14:09:48.875602] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed data selfheal on db56ac00-fd5b-4326-a879-326ff56181de. sources=[0] 1  sinks=2
[2017-07-24 14:10:39.832068] I [MSGID: 108026] 

Re: [Gluster-users] [ovirt-users] ovirt 4.1 hosted engine hyper converged on glusterfs 3.8.10 : "engine" storage domain alway complain about "unsynced" elements

2017-07-24 Thread yayo (j)
Hi,

UI refreshed but the problem still remains ...

No specific error; I have only these errors, but I've read that there is
no problem if I have this kind of error:


2017-07-24 15:53:59,823+02 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler2) [b7590c4] START, GlusterServersListVDSCommand(HostName = node01.localdomain.local, VdsIdVDSCommandParametersBase:{runAsync='true', hostId='4c89baa5-e8f7-4132-a4b3-af332247570c'}), log id: 29a62417
2017-07-24 15:54:01,066+02 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler2) [b7590c4] FINISH, GlusterServersListVDSCommand, return: [10.10.20.80/24:CONNECTED, node02.localdomain.local:CONNECTED, gdnode04:CONNECTED], log id: 29a62417
2017-07-24 15:54:01,076+02 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler2) [b7590c4] START, GlusterVolumesListVDSCommand(HostName = node01.localdomain.local, GlusterVolumesListVDSParameters:{runAsync='true', hostId='4c89baa5-e8f7-4132-a4b3-af332247570c'}), log id: 7fce25d3
2017-07-24 15:54:02,209+02 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode01:/gluster/engine/brick' of volume 'd19c19e3-910d-437b-8ba7-4f2a23d17515' with correct network as no gluster network found in cluster '0002-0002-0002-0002-017a'
2017-07-24 15:54:02,212+02 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode02:/gluster/engine/brick' of volume 'd19c19e3-910d-437b-8ba7-4f2a23d17515' with correct network as no gluster network found in cluster '0002-0002-0002-0002-017a'
2017-07-24 15:54:02,215+02 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode04:/gluster/engine/brick' of volume 'd19c19e3-910d-437b-8ba7-4f2a23d17515' with correct network as no gluster network found in cluster '0002-0002-0002-0002-017a'
2017-07-24 15:54:02,218+02 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode01:/gluster/data/brick' of volume 'c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d' with correct network as no gluster network found in cluster '0002-0002-0002-0002-017a'
2017-07-24 15:54:02,221+02 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode02:/gluster/data/brick' of volume 'c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d' with correct network as no gluster network found in cluster '0002-0002-0002-0002-017a'
2017-07-24 15:54:02,224+02 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode04:/gluster/data/brick' of volume 'c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d' with correct network as no gluster network found in cluster '0002-0002-0002-0002-017a'
2017-07-24 15:54:02,224+02 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler2) [b7590c4] FINISH, GlusterVolumesListVDSCommand, return: {d19c19e3-910d-437b-8ba7-4f2a23d17515=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@fdc91062, c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@999a6f23}, log id: 7fce25d3


Thank you


2017-07-24 8:12 GMT+02:00 Kasturi Narra :

> Hi,
>
> Regarding the UI showing incorrect information about the engine and data
> volumes, can you please refresh the UI and see if the issue persists, and
> check for any errors in the engine.log files?
>
> Thanks
> kasturi
>
> On Sat, Jul 22, 2017 at 11:43 AM, Ravishankar N 
> wrote:
>
>>
>> On 07/21/2017 11:41 PM, yayo (j) wrote:
>>
>> Hi,
>>
>> Sorry to follow up again, but, checking the oVirt interface, I've found
>> that oVirt reports the "engine" volume as an "arbiter" configuration and the
>> "data" volume as a fully replicated volume. See these screenshots:
>>
>>
>> This is probably some refresh bug in the UI, Sahina might be able to tell
>> you.
>>
>>
>> https://drive.google.com/drive/folders/0ByUV7xQtP1gCTE8tUTFf
>> VmR5aDQ?usp=sharing
>>
>> But the "gluster volume info" command reports that both volumes are fully
>> replicated:
>>
>>
>> *Volume Name: data*
>> *Type: Replicate*
>> *Volume ID: c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d*
>> *Status: Started*
>> *Snapshot Count: 0*
>> *Number of Bricks: 1 x 3 = 3*
>> *Transport-type: tcp*
>> *Bricks:*
>> *Brick1: gdnode01:/gluster/data/brick*
>> *Brick2: gdnode02:/gluster/data/brick*
>> *Brick3: gdnode04:/gluster/data/brick*
>> *Options Reconfigured:*
>> *nfs.disable: on*
>> *performance.readdir-ahead: on*
>> 

Re: [Gluster-users] vol status detail - times out?

2017-07-24 Thread Atin Mukherjee
Yes, it could, as depending on the number of bricks there might be too many
brick ops involved. This is the reason we introduced the --timeout option in
the CLI, which can be used to set a larger timeout value. However, this fix
is only available from release-3.9 onwards.
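
As a rough example of using it (exact syntax assumed from the option name
above; check `gluster --help` on a release-3.9+ build, and "myvol" is a
placeholder volume name):

    # allow slower brick ops to finish instead of timing out
    gluster --timeout=300 volume status myvol detail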

On Mon, Jul 24, 2017 at 3:54 PM, lejeczek  wrote:

> hi fellas
>
> would you know what could be the problem when 'vol status detail' always
> times out?
> After I did the above I had to restart glusterd on the peer where the
> command was issued.
> I run 3.8.14. Everything else seems to work OK.
>
> many thanks
> L.
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] gluster-heketi-kubernetes

2017-07-24 Thread Vijay Bellur
Hi Bishoy,

Adding Talur who can help address your queries on Heketi.

@wattsteve's github  repo on glusterfs-kubernetes is a bit dated. You can
either refer to gluster/gluster-kubernetes or heketi/heketi for current
documentation and operational procedures.

Regards,
Vijay



On Fri, Jul 21, 2017 at 2:19 AM, Bishoy Mikhael 
wrote:

> Hi,
>
> I'm trying to deploy Gluster and Heketi on a Kubernetes cluster.
> I'm following the guide at https://github.com/gluster/gluster-kubernetes/
> but the video referenced on the page shows JSON files being used, while the
> git repo has only YAML files. They are quite similar, though, but Gluster is
> a Deployment, not a DaemonSet.
>
> I deployed the Gluster DaemonSet successfully, but heketi is giving me the
> following error:
>
> # kubectl logs deploy-heketi-930916695-np4hb
>
> Heketi v4.0.0-8-g9372c22-release-4
>
> [kubeexec] ERROR 2017/07/21 06:08:52 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:125: Namespace must be provided in configuration: File /var/run/secrets/kubernetes.io/serviceaccount/namespace not found
>
> [heketi] ERROR 2017/07/21 06:08:52 /src/github.com/heketi/heketi/apps/glusterfs/app.go:85: Namespace must be provided in configuration: File /var/run/secrets/kubernetes.io/serviceaccount/namespace not found
>
> ERROR: Unable to start application
>
> What am I doing wrong here?!
>
> I found more than one source of documentation about how to use Gluster as
> persistent storage for Kubernetes, some of them are:
> https://github.com/heketi/heketi/wiki/Kubernetes-Integration
> https://github.com/wattsteve/glusterfs-kubernetes
>
> Which one to follow?!
>
> Also, I've created a topology file as noted in one of the documents,
> but I don't know how to provide it to heketi.
>
> Regards,
> Bishoy Mikhael
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
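
Regarding the "Namespace must be provided in configuration" error quoted
above: heketi either reads the namespace from the pod's service account mount
or from its config file. A rough way to check which of the two is missing
(the paths come from the error message itself; the heketi.json key layout is
an assumption based on heketi's kubeexec executor docs):

    # does the heketi pod see a service account namespace file?
    kubectl exec deploy-heketi-930916695-np4hb -- cat /var/run/secrets/kubernetes.io/serviceaccount/namespace
    # if the file is missing, either run the deployment with a service account
    # mounted, or set the namespace explicitly in heketi.json (assumed key:
    # "glusterfs" -> "kubeexec" -> "namespace") and redeploy.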
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] vol status detail - times out?

2017-07-24 Thread lejeczek

hi fellas

would you know what could be the problem when 'vol status
detail' always times out?
After I did the above I had to restart glusterd on the peer
where the command was issued.

I run 3.8.14. Everything else seems to work OK.

many thanks
L.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Bug 1473150 - features/shard:Lookup on shard 18 failed. Base file gfid = b00f5de2-d811-44fe-80e5-1f382908a55a [No data available], the [No data available]

2017-07-24 Thread Krutika Dhananjay
+gluster-users ML

Hi,

I've responded to your bug report here -
https://bugzilla.redhat.com/show_bug.cgi?id=1473150#c3
Kindly let us know if the patch fixes your bug.


-Krutika

On Thu, Jul 20, 2017 at 3:12 PM, zhangjianwei1...@163.com <
zhangjianwei1...@163.com> wrote:

> Hi  Krutika Dhananjay, Pranith Kumar Karampuri,
>  Thanks for your reply!
>
>  I am awaiting your good news!
>
>  Thank you for your hard work!
>
> --
> zhangjianwei1...@163.com
>
>
> *From:* Krutika Dhananjay 
> *Date:* 2017-07-20 17:34
> *To:* Pranith Kumar Karampuri 
> *CC:* 张建伟 
> *Subject:* Re: Bug 1473150 - features/shard:Lookup on shard 18 failed.
> Base file gfid = b00f5de2-d811-44fe-80e5-1f382908a55a [No data
> available], the [No data available]
> Hi 张建伟,
>
> Thanks for your email. I am currently looking into a customer issue. I
> will get back to you as soon as I'm done with it.
>
> -Krutika
>
> On Thu, Jul 20, 2017 at 2:17 PM, Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
>
>> Krutika is working on a similar ENODATA bug with distribute xlator. This
>> bug looks similar to it. Krutika knows more details about this issue.
>>  This is lunch time in India. Expect some delay.
>>
>> On Thu, Jul 20, 2017 at 1:08 PM, 张建伟  wrote:
>>
>>> Hi,
>>> Nice to meet you!
>>> Recently, I have been testing the features/shard module and found some
>>> problems in it.
>>> I have reported the problem at
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1473150
>>> 
>>>
>>> The shard_glusterfs_log.tar.gz is my test results log.
>>> I hope you can help me!
>>> Thank you very much!
>>>
>>> Best regards!!!
>>>
>>>
>>>
>>>
>>
>>
>>
>> --
>> Pranith
>>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Volume mounts read-only

2017-07-24 Thread Pavel Szalbot
Hi, are you sure there are not two hosts with the same IP address?
Happened to me once and logs looked pretty similar (frequent
disconnects, all other clients seemed fine, gluster cluster reported
no issues).
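A quick way to check for a duplicate is ARP-probing the brick's address from
another host on the same subnet and seeing whether more than one MAC answers
(a rough sketch; assumes iputils arping and that the interface is eth0):

    arping -c 3 -I eth0 172.16.0.152
    # more than one distinct MAC in the replies means two hosts claim that IP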
-ps


On Fri, Jul 21, 2017 at 9:13 PM, Jamie Lawrence
 wrote:
> Hello Glusterites,
>
> I have a volume that will not mount read/write. V3.10.3 on Centos 7, this is 
> a replica-3 volume, mounted with the fuse client. This is in support of an 
> Ovirt installation, but I've isolated the problem to Gluster. `gluster peer 
> status` looks normal, as does a `gluster v status`. Server.allow-insecure is 
> set to on.
>
> Notable from the client volume logs are the 'connection refused' messages[1],
> which seem to match the gluster server logs[2], but I'm not exactly sure what
> that's telling me, or what I can do about it. Two other hosts currently have
> the volume mounted and can write just fine; for operational reasons, I
> have not tested what would happen if I unmount/remount it on the currently
> functioning hosts. I tried disabling iptables; no change.
>
> I'm not sure what else to experiment with, and couldn't find anything 
> relevant in the googles.  Thus this email.
>
> Has anyone seen this before? TIA,
>
> -j
>
>
> [1] rhev-data-center-mnt-glusterSD-sc5-gluster-10g-1:ovirt__engine.log
> [2017-07-21 18:19:07.210232] E [socket.c:2316:socket_connect_finish] 
> 0-ovirt_engine-client-1: connection to 172.16.0.152:49155 failed (Connection 
> refused); disconnecting socket
> [2017-07-21 18:19:08.280799] I [rpc-clnt.c:2001:rpc_clnt_reconfig] 
> 0-ovirt_engine-client-0: changing port to 49220 (from 0)
> [2017-07-21 18:19:08.286512] E [socket.c:2309:socket_connect_finish] 
> 0-ovirt_engine-client-0: connection to 172.16.0.151:49220 failed (Connection 
> refused)
> [2017-07-21 18:19:09.260107] W [fuse-bridge.c:1067:fuse_setattr_cbk] 
> 0-glusterfs-fuse: 1721: SETATTR() /__DIRECT_IO_TEST__ => -1 (Read-only file 
> system)
> [2017-07-21 18:19:10.286556] I [rpc-clnt.c:2001:rpc_clnt_reconfig] 
> 0-ovirt_engine-client-1: changing port to 49155 (from 0)
> [2017-07-21 18:19:10.292217] E [socket.c:2309:socket_connect_finish] 
> 0-ovirt_engine-client-1: connection to 172.16.0.152:49155 failed (Connection 
> refused)
> [2017-07-21 18:19:11.206829] I [rpc-clnt.c:2000:rpc_clnt_reconfig] 
> 0-ovirt_engine-client-0: changing port to 49220 (from 0)
> [2017-07-21 18:19:11.212077] I [rpc-clnt.c:2000:rpc_clnt_reconfig] 
> 0-ovirt_engine-client-1: changing port to 49155 (from 0)
> [2017-07-21 18:19:11.216493] E [socket.c:2316:socket_connect_finish] 
> 0-ovirt_engine-client-0: connection to 172.16.0.151:49220 failed (Connection 
> refused); disconnecting socket
> [2017-07-21 18:19:11.221265] E [socket.c:2316:socket_connect_finish] 
> 0-ovirt_engine-client-1: connection to 172.16.0.152:49155 failed (Connection 
> refused); disconnecting socket
> [2017-07-21 18:19:12.292228] I [rpc-clnt.c:2001:rpc_clnt_reconfig] 
> 0-ovirt_engine-client-0: changing port to 49220 (from 0)
> [2017-07-21 18:19:12.297970] E [socket.c:2309:socket_connect_finish] 
> 0-ovirt_engine-client-0: connection to 172.16.0.151:49220 failed (Connection 
> refused)
> [2017-07-21 18:19:14.297828] I [rpc-clnt.c:2001:rpc_clnt_reconfig] 
> 0-ovirt_engine-client-1: changing port to 49155 (from 0)
> [2017-07-21 18:19:14.303567] E [socket.c:2309:socket_connect_finish] 
> 0-ovirt_engine-client-1: connection to 172.16.0.152:49155 failed (Connection 
> refused)
> [2017-07-21 18:19:15.217792] I [rpc-clnt.c:2000:rpc_clnt_reconfig] 
> 0-ovirt_engine-client-0: changing port to 49220 (from 0)
> [2017-07-21 18:19:15.223013] I [rpc-clnt.c:2000:rpc_clnt_reconfig] 
> 0-ovirt_engine-client-1: changing port to 49155 (from 0)
> [2017-07-21 18:19:15.227406] E [socket.c:2316:socket_connect_finish] 
> 0-ovirt_engine-client-0: connection to 172.16.0.151:49220 failed (Connection 
> refused); disconnecting socket
> [2017-07-21 18:19:15.232193] E [socket.c:2316:socket_connect_finish] 
> 0-ovirt_engine-client-1: connection to 172.16.0.152:49155 failed (Connection 
> refused); disconnecting socket
>
>
> [2] glustershd.log:
> [2017-07-21 18:31:36.890242] I [rpc-clnt.c:2000:rpc_clnt_reconfig] 
> 0-ovirt_engine-client-0: changing port to 49220 (from 0)
> [2017-07-21 18:31:36.895707] E [socket.c:2316:socket_connect_finish] 
> 0-ovirt_engine-client-0: connection to 172.16.0.151:49220 failed (Connection 
> refused); disconnecting socket
> [2017-07-21 18:31:39.896196] I [rpc-clnt.c:2000:rpc_clnt_reconfig] 
> 0-ovirt_engine-client-1: changing port to 49155 (from 0)
> [2017-07-21 18:31:39.901850] E [socket.c:2316:socket_connect_finish] 
> 0-ovirt_engine-client-1: connection to 172.16.0.152:49155 failed (Connection 
> refused); disconnecting socket
> [2017-07-21 18:31:40.901569] I [rpc-clnt.c:2000:rpc_clnt_reconfig] 
> 0-ovirt_engine-client-0: changing port to 49220 (from 0)
> [2017-07-21 18:31:40.907006] E [socket.c:2316:socket_connect_finish] 
> 

Re: [Gluster-users] Community Meeting 2017-07-19

2017-07-24 Thread Kaushal M
On Wed, Jul 19, 2017 at 8:08 PM, Kaushal M  wrote:
> This is a (late) reminder about today's meeting. The meeting begins in
> ~20 minutes from now.
>
> The meeting notepad is at https://bit.ly/gluster-community-meetings
> and currently has no topics for discussion. If you have anything to be
> discussed please add it to the pad.
>
> ~kaushal

Apologies for the late update.

The last community meeting happened with good participation. I hope to
see the trend continuing.

The meeting minutes and logs are available at the links below.

The next meeting is scheduled for 2nd August. The meeting notepad is
at [4] for your updates and topics for discussion.

See you at the next meeting.

~kaushal

[1]: Minutes: 
https://meetbot.fedoraproject.org/gluster-meeting/2017-07-19/community_meeting_2017-07-19.2017-07-19-15.02.html
[2]: Minutes (text):
https://meetbot.fedoraproject.org/gluster-meeting/2017-07-19/community_meeting_2017-07-19.2017-07-19-15.02.txt
[3]: Log: 
https://meetbot.fedoraproject.org/gluster-meeting/2017-07-19/community_meeting_2017-07-19.2017-07-19-15.02.log.html
[4]: https://bit.ly/gluster-community-meetings

Meeting summary
---
* Should we build 3.12 packages for old distros  (kshlm, 15:06:23)
  * AGREED: 4.0 will drop support for EL6 and other old distros. Will
see what can be done if and when someone wants to do it anyway.
(kshlm, 15:21:48)

* Is 4.0 LTM or STM?  (kshlm, 15:24:04)
  * AGREED: 4.0 is STM. Will take call on 4.1 and beyond later.  (kshlm,
15:38:54)
  * ACTION: shyam will edit release pages and milestones to reflect 4.0
is STM.  (kshlm, 15:39:59)

Meeting ended at 16:02:12 UTC.




Action Items

* shyam will edit release pages and milestones to reflect 4.0 is STM.




Action Items, by person
---
* shyam
  * shyam will edit release pages and milestones to reflect 4.0 is STM.
* **UNASSIGNED**
  * (none)




People Present (lines said)
---
* kshlm (91)
* bulde (30)
* ndevos (27)
* amye (22)
* shyam (20)
* nigelb (17)
* Snowman (16)
* kkeithley (13)
* vbellur (8)
* zodbot (3)
* jstrunk (3)
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [ovirt-users] ovirt 4.1 hosted engine hyper converged on glusterfs 3.8.10 : "engine" storage domain alway complain about "unsynced" elements

2017-07-24 Thread Kasturi Narra
Hi,

   Regarding the UI showing incorrect information about the engine and data
volumes, can you please refresh the UI and see if the issue persists, and
check for any errors in the engine.log files?
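
If it helps, something along these lines on the engine host will surface
recent problems (the log path is the standard oVirt engine location):

    # show the most recent warnings/errors from the engine log
    grep -iE 'ERROR|WARN' /var/log/ovirt-engine/engine.log | tail -n 50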

Thanks
kasturi

On Sat, Jul 22, 2017 at 11:43 AM, Ravishankar N 
wrote:

>
> On 07/21/2017 11:41 PM, yayo (j) wrote:
>
> Hi,
>
> Sorry to follow up again, but, checking the oVirt interface, I've found
> that oVirt reports the "engine" volume as an "arbiter" configuration and the
> "data" volume as a fully replicated volume. See these screenshots:
>
>
> This is probably some refresh bug in the UI, Sahina might be able to tell
> you.
>
>
> https://drive.google.com/drive/folders/0ByUV7xQtP1gCTE8tUTFfVmR5aDQ?
> usp=sharing
>
> But the "gluster volume info" command reports that both volumes are fully
> replicated:
>
>
> *Volume Name: data*
> *Type: Replicate*
> *Volume ID: c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d*
> *Status: Started*
> *Snapshot Count: 0*
> *Number of Bricks: 1 x 3 = 3*
> *Transport-type: tcp*
> *Bricks:*
> *Brick1: gdnode01:/gluster/data/brick*
> *Brick2: gdnode02:/gluster/data/brick*
> *Brick3: gdnode04:/gluster/data/brick*
> *Options Reconfigured:*
> *nfs.disable: on*
> *performance.readdir-ahead: on*
> *transport.address-family: inet*
> *storage.owner-uid: 36*
> *performance.quick-read: off*
> *performance.read-ahead: off*
> *performance.io-cache: off*
> *performance.stat-prefetch: off*
> *performance.low-prio-threads: 32*
> *network.remote-dio: enable*
> *cluster.eager-lock: enable*
> *cluster.quorum-type: auto*
> *cluster.server-quorum-type: server*
> *cluster.data-self-heal-algorithm: full*
> *cluster.locking-scheme: granular*
> *cluster.shd-max-threads: 8*
> *cluster.shd-wait-qlength: 1*
> *features.shard: on*
> *user.cifs: off*
> *storage.owner-gid: 36*
> *features.shard-block-size: 512MB*
> *network.ping-timeout: 30*
> *performance.strict-o-direct: on*
> *cluster.granular-entry-heal: on*
> *auth.allow: **
> *server.allow-insecure: on*
>
>
>
>
>
> *Volume Name: engine*
> *Type: Replicate*
> *Volume ID: d19c19e3-910d-437b-8ba7-4f2a23d17515*
> *Status: Started*
> *Snapshot Count: 0*
> *Number of Bricks: 1 x 3 = 3*
> *Transport-type: tcp*
> *Bricks:*
> *Brick1: gdnode01:/gluster/engine/brick*
> *Brick2: gdnode02:/gluster/engine/brick*
> *Brick3: gdnode04:/gluster/engine/brick*
> *Options Reconfigured:*
> *nfs.disable: on*
> *performance.readdir-ahead: on*
> *transport.address-family: inet*
> *storage.owner-uid: 36*
> *performance.quick-read: off*
> *performance.read-ahead: off*
> *performance.io-cache: off*
> *performance.stat-prefetch: off*
> *performance.low-prio-threads: 32*
> *network.remote-dio: off*
> *cluster.eager-lock: enable*
> *cluster.quorum-type: auto*
> *cluster.server-quorum-type: server*
> *cluster.data-self-heal-algorithm: full*
> *cluster.locking-scheme: granular*
> *cluster.shd-max-threads: 8*
> *cluster.shd-wait-qlength: 1*
> *features.shard: on*
> *user.cifs: off*
> *storage.owner-gid: 36*
> *features.shard-block-size: 512MB*
> *network.ping-timeout: 30*
> *performance.strict-o-direct: on*
> *cluster.granular-entry-heal: on*
> *auth.allow: **
>
>   server.allow-insecure: on
>
>
> 2017-07-21 19:13 GMT+02:00 yayo (j) :
>
>> 2017-07-20 14:48 GMT+02:00 Ravishankar N :
>>
>>>
>>> But it does say something. All these gfids of completed heals in the
>>> log below are for the ones that you have given the getfattr output of.
>>> So what is likely happening is there is an intermittent connection problem
>>> between your mount and the brick process, leading to pending heals again
>>> after the heal gets completed, which is why the numbers are varying each
>>> time. You would need to check why that is the case.
>>> Hope this helps,
>>> Ravi
>>>
>>>
>>>
>>> *[2017-07-20 09:58:46.573079] I [MSGID: 108026]
>>> [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0:
>>> Completed data selfheal on e6dfd556-340b-4b76-b47b-7b6f5bd74327.
>>> sources=[0] 1  sinks=2*
>>> *[2017-07-20 09:59:22.995003] I [MSGID: 108026]
>>> [afr-self-heal-metadata.c:51:__afr_selfheal_metadata_do]
>>> 0-engine-replicate-0: performing metadata selfheal on
>>> f05b9742-2771-484a-85fc-5b6974bcef81*
>>> *[2017-07-20 09:59:22.999372] I [MSGID: 108026]
>>> [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0:
>>> Completed metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81.
>>> sources=[0] 1  sinks=2*
>>>
>>>
>>
>> Hi,
>>
>> following your suggestion, I've checked the peer status and I found
>> that there are too many names for the hosts; I don't know if this can be
>> the problem or part of it:
>>
>> *gluster peer status on NODE01:*
>> *Number of Peers: 2*
>>
>> *Hostname: dnode02.localdomain.local*
>> *Uuid: 7c0ebfa3-5676-4d3f-9bfa-7fff6afea0dd*
>> *State: Peer in Cluster (Connected)*
>> *Other names:*
>> *192.168.10.52*
>> *dnode02.localdomain.local*
>> *10.10.20.90*
>> *10.10.10.20*
>>
>>
>>
>>
>> *gluster peer