Hi,

Regarding the UI showing incorrect information about the engine and data volumes: can you please refresh the UI and see whether the issue persists, and also check for any errors in the engine.log files?
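A quick way to do the engine.log check suggested above (a sketch; `show_engine_errors` is a hypothetical helper, and the log path and line format are assumed from a default oVirt setup):

```shell
# show_engine_errors: read an oVirt engine.log on stdin and keep only
# ERROR-severity lines. engine.log lines typically look like
# "2017-07-22 11:43:00,123 ERROR [org.ovirt...] ..." (format assumed
# from a default oVirt installation).
show_engine_errors() {
    grep ' ERROR '
}

# Typical invocation on the engine host (path assumed):
#   show_engine_errors < /var/log/ovirt-engine/engine.log
```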
Thanks,
kasturi

On Sat, Jul 22, 2017 at 11:43 AM, Ravishankar N <ravishan...@redhat.com> wrote:
>
> On 07/21/2017 11:41 PM, yayo (j) wrote:
>
> Hi,
>
> Sorry to follow up again, but checking the oVirt interface I've found
> that oVirt reports the "engine" volume as an "arbiter" configuration
> and the "data" volume as a fully replicated volume. See these
> screenshots:
>
> This is probably some refresh bug in the UI; Sahina might be able to
> tell you.
>
> https://drive.google.com/drive/folders/0ByUV7xQtP1gCTE8tUTFfVmR5aDQ?usp=sharing
>
> But the "gluster volume info" command reports that both volumes are
> fully replicated:
>
> Volume Name: data
> Type: Replicate
> Volume ID: c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: gdnode01:/gluster/data/brick
> Brick2: gdnode02:/gluster/data/brick
> Brick3: gdnode04:/gluster/data/brick
> Options Reconfigured:
> nfs.disable: on
> performance.readdir-ahead: on
> transport.address-family: inet
> storage.owner-uid: 36
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> performance.low-prio-threads: 32
> network.remote-dio: enable
> cluster.eager-lock: enable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-max-threads: 8
> cluster.shd-wait-qlength: 10000
> features.shard: on
> user.cifs: off
> storage.owner-gid: 36
> features.shard-block-size: 512MB
> network.ping-timeout: 30
> performance.strict-o-direct: on
> cluster.granular-entry-heal: on
> auth.allow: *
> server.allow-insecure: on
>
> Volume Name: engine
> Type: Replicate
> Volume ID: d19c19e3-910d-437b-8ba7-4f2a23d17515
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: gdnode01:/gluster/engine/brick
> Brick2: gdnode02:/gluster/engine/brick
> Brick3: gdnode04:/gluster/engine/brick
> Options Reconfigured:
> nfs.disable: on
> performance.readdir-ahead: on
> transport.address-family: inet
> storage.owner-uid: 36
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> performance.low-prio-threads: 32
> network.remote-dio: off
> cluster.eager-lock: enable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-max-threads: 8
> cluster.shd-wait-qlength: 10000
> features.shard: on
> user.cifs: off
> storage.owner-gid: 36
> features.shard-block-size: 512MB
> network.ping-timeout: 30
> performance.strict-o-direct: on
> cluster.granular-entry-heal: on
> auth.allow: *
> server.allow-insecure: on
>
> 2017-07-21 19:13 GMT+02:00 yayo (j) <jag...@gmail.com>:
>
>> 2017-07-20 14:48 GMT+02:00 Ravishankar N <ravishan...@redhat.com>:
>>
>>> But it does say something. All these gfids of completed heals in the
>>> log below are for the ones that you have given the getfattr output
>>> of. So what is likely happening is that there is an intermittent
>>> connection problem between your mount and the brick process, leading
>>> to pending heals again after the heal gets completed, which is why
>>> the numbers are varying each time. You would need to check why that
>>> is the case.
>>> Hope this helps,
>>> Ravi
>>>
>>> [2017-07-20 09:58:46.573079] I [MSGID: 108026]
>>> [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0:
>>> Completed data selfheal on e6dfd556-340b-4b76-b47b-7b6f5bd74327.
>>> sources=[0] 1 sinks=2
>>> [2017-07-20 09:59:22.995003] I [MSGID: 108026]
>>> [afr-self-heal-metadata.c:51:__afr_selfheal_metadata_do]
>>> 0-engine-replicate-0: performing metadata selfheal on
>>> f05b9742-2771-484a-85fc-5b6974bcef81
>>> [2017-07-20 09:59:22.999372] I [MSGID: 108026]
>>> [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0:
>>> Completed metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81.
>>> sources=[0] 1 sinks=2
>>
>> Hi,
>>
>> Following your suggestion, I've checked the peer status and found
>> that there are too many names for the hosts; I don't know if this
>> can be the problem or part of it:
>>
>> gluster peer status on NODE01:
>> Number of Peers: 2
>>
>> Hostname: dnode02.localdomain.local
>> Uuid: 7c0ebfa3-5676-4d3f-9bfa-7fff6afea0dd
>> State: Peer in Cluster (Connected)
>> Other names:
>> 192.168.10.52
>> dnode02.localdomain.local
>> 10.10.20.90
>> 10.10.10.20
>>
>> gluster peer status on NODE02:
>> Number of Peers: 2
>>
>> Hostname: dnode01.localdomain.local
>> Uuid: a568bd60-b3e4-4432-a9bc-996c52eaaa12
>> State: Peer in Cluster (Connected)
>> Other names:
>> gdnode01
>> 10.10.10.10
>>
>> Hostname: gdnode04
>> Uuid: ce6e0f6b-12cf-4e40-8f01-d1609dfc5828
>> State: Peer in Cluster (Connected)
>> Other names:
>> 192.168.10.54
>> 10.10.10.40
>>
>> gluster peer status on NODE04:
>> Number of Peers: 2
>>
>> Hostname: dnode02.neridom.dom
>> Uuid: 7c0ebfa3-5676-4d3f-9bfa-7fff6afea0dd
>> State: Peer in Cluster (Connected)
>> Other names:
>> 10.10.20.90
>> gdnode02
>> 192.168.10.52
>> 10.10.10.20
>>
>> Hostname: dnode01.localdomain.local
>> Uuid: a568bd60-b3e4-4432-a9bc-996c52eaaa12
>> State: Peer in Cluster (Connected)
>> Other names:
>> gdnode01
>> 10.10.10.10
>>
>> All these IPs are pingable and the hosts are resolvable across all 3
>> nodes, but only the 10.10.10.0 network is the dedicated network for
>> gluster
>> (resolved using the gdnode* host names)... Do you think removing the
>> other entries could fix the problem? And sorry, but how can I remove
>> the other entries?
>>
> I don't think having extra entries could be a problem. Did you check
> the fuse mount logs for the disconnect messages that I referred to in
> the other email?
>
>> And what about selinux?
>>
> Not sure about this. See if there are disconnect messages in the mount
> logs first.
> -Ravi
>
>> Thank you
>>
>
> --
> Linux User: 369739 http://counter.li.org
>
> _______________________________________________
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
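Ravi's suggestion above, scanning the fuse mount logs for disconnect messages, can be sketched as a small filter (`find_disconnects` is a hypothetical helper; the log path and the exact "disconnected from" wording are assumptions based on typical glusterfs 3.x client logs):

```shell
# find_disconnects: read a glusterfs fuse mount log on stdin and print
# only the client disconnect events. "disconnected from" is the wording
# glusterfs 3.x clients typically log when a brick connection drops
# (assumption; adjust the pattern to match your client's log lines).
find_disconnects() {
    grep 'disconnected from'
}

# Example (path assumed; the fuse mount log under /var/log/glusterfs/
# is named after the mount point, e.g. for the engine storage domain):
#   find_disconnects < /var/log/glusterfs/rhev-data-center-mnt-glusterSD-gdnode01:engine.log
```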
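A footnote on the arbiter-vs-replicate discrepancy discussed in the thread: `gluster volume info` encodes an arbiter in its "Number of Bricks" line, so the UI's label can be cross-checked from the CLI. A minimal sketch (`check_arbiter` is a hypothetical helper; the volume name is assumed):

```shell
# check_arbiter: read `gluster volume info <vol>` output on stdin and
# report the replication layout. A plain replica 3 volume shows
#   Number of Bricks: 1 x 3 = 3
# while a replica 2 + arbiter volume shows
#   Number of Bricks: 1 x (2 + 1) = 3
check_arbiter() {
    if grep -q 'Number of Bricks:.*(2 + 1)'; then
        echo 'arbiter'
    else
        echo 'plain replicate'
    fi
}

# Example on one of the gluster nodes:
#   gluster volume info engine | check_arbiter
```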