Re: [Gluster-users] [Possibile SPAM] Re: Problem with Gluster 3.12.4, VM and sharding

2018-01-18 Thread Ing. Luca Lazzeroni - Trend Servizi Srl

Nope. If I enable write-behind the corruption happens every time.


On 19/01/2018 08:26, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:


After more tests (I'm trying to convince myself about Gluster 
reliability :-) I've found that with


performance.write-behind off

the VM works without problems. Now I'll try with write-behind on and 
flush-behind on too.
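
For reference, these options are toggled with commands of the following form 
(the volume name gvtest is taken from the volume info further down and is an 
assumption for this particular test):

  gluster volume set gvtest performance.write-behind off
  gluster volume set gvtest performance.flush-behind on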




On 18/01/2018 13:30, Krutika Dhananjay wrote:
Thanks for that input. Adding Niels since the issue is reproducible 
only with libgfapi.


-Krutika

On Thu, Jan 18, 2018 at 1:39 PM, Ing. Luca Lazzeroni - Trend Servizi 
Srl > wrote:


Another update.

I've set up a replica 3 volume without sharding and tried to
install a VM on a qcow2 volume on that device; however, the result
is the same and the VM image has been corrupted, exactly at the
same point.

Here's the volume info of the created volume:

Volume Name: gvtest
Type: Replicate
Volume ID: e2ddf694-ba46-4bc7-bc9c-e30803374e9d
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: gluster1:/bricks/brick1/gvtest
Brick2: gluster2:/bricks/brick1/gvtest
Brick3: gluster3:/bricks/brick1/gvtest
Options Reconfigured:
user.cifs: off
features.shard: off
cluster.shd-wait-qlength: 1
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: enable
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
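
For reference, a replica 3 volume with this brick layout is normally created
and started with something along these lines (a sketch, not necessarily the
exact commands used here):

  gluster volume create gvtest replica 3 gluster1:/bricks/brick1/gvtest gluster2:/bricks/brick1/gvtest gluster3:/bricks/brick1/gvtest
  gluster volume start gvtest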


On 17/01/2018 14:51, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:


Hi,

after our IRC chat I've rebuilt a virtual machine with a FUSE-based
virtual disk. Everything worked flawlessly.

Now I'm sending you the output of the requested getfattr command
on the disk image:

# file: TestFUSE-vda.qcow2
trusted.afr.dirty=0x
trusted.gfid=0x40ffafbbe987445692bb31295fa40105

trusted.gfid2path.dc9dde61f0b77eab=0x31326533323631662d373839332d346262302d383738632d3966623765306232336263652f54657374465553452d7664612e71636f7732
trusted.glusterfs.shard.block-size=0x0400

trusted.glusterfs.shard.file-size=0xc1530060be90

Hope this helps.
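
For reference, output in the form above is typically produced with a command
like the following, run directly on one of the bricks (the brick path here is
an assumption based on the volume layouts in this thread):

  getfattr -d -m . -e hex /bricks/brick1/gvtest/TestFUSE-vda.qcow2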



On 17/01/2018 11:37, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:


I actually use FUSE and it works. If I try to use the "libgfapi"
direct interface to gluster in qemu-kvm, the problem appears.
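
To illustrate the difference (paths and names here are placeholders, not the
exact ones used in my tests): with FUSE the disk is referenced through the
mount point, e.g.

  -drive file=/mnt/gvtest/TestFUSE-vda.qcow2,format=qcow2,if=virtio

while with libgfapi qemu-kvm talks to gluster directly via a gluster:// URI, e.g.

  -drive file=gluster://gluster1/gvtest/TestFUSE-vda.qcow2,format=qcow2,if=virtio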



On 17/01/2018 11:35, Krutika Dhananjay wrote:

Really? Then which protocol exactly do you see this issue
with? libgfapi? NFS?

-Krutika

On Wed, Jan 17, 2018 at 3:59 PM, Ing. Luca Lazzeroni - Trend
Servizi Srl > wrote:

Of course. Here's the full log. Please note that in FUSE
mode everything apparently works without problems. I've
installed 4 VMs and updated them without problems.



On 17/01/2018 11:00, Krutika Dhananjay wrote:



On Tue, Jan 16, 2018 at 10:47 PM, Ing. Luca Lazzeroni -
Trend Servizi Srl >
wrote:

I've made the test with the raw image format
(preallocated too) and the corruption problem is
still there (but without errors in the bricks' log files).

What does the "link" error in the bricks' log files mean?

I've looked at the source code for the lines where
it happens and it seems to be a warning (it doesn't imply a
failure).


Indeed, it only represents a transient state when the
shards are created for the first time and does not
indicate a failure.
Could you also get the logs of the gluster fuse mount
process? It should be under /var/log/glusterfs of your
client machine with the filename as a hyphenated mount
point path.

For example, if your volume was mounted at
/mnt/glusterfs, then your log file would be named
mnt-glusterfs.log.
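
For instance (a hypothetical mount, just to show the naming):

  mount -t glusterfs gluster1:/gvtest /mnt/glusterfs

would write its client log to /var/log/glusterfs/mnt-glusterfs.log.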

-Krutika



On 16/01/2018 17:39, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:


An update:

I've tried, for my tests, to create the vm volume as

qemu-img create -f qcow2 -o preallocation=full
gluster://gluster1/Test/Test-vda.img 20G

et voila !

No errors at all, neither in the bricks' log files (the
"link failed" message disappeared) nor in the VM
(no corruption, and it installed successfully).

I'll do another test with a fully preallocated raw image.
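
A fully preallocated raw image can be created the same way, e.g. (the image
name is a placeholder):

  qemu-img create -f raw -o preallocation=full gluster://gluster1/Test/Test-vda-raw.img 20G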

[Gluster-users] Backup Solutions for GlusterFS

2018-01-18 Thread Kadir
Hi,
What are the backup solutions for GlusterFS? Does GlusterFS support any backup 
solutions?
Sincerely,
Kadir
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Possibile SPAM] Re: Problem with Gluster 3.12.4, VM and sharding

2018-01-18 Thread Ing. Luca Lazzeroni - Trend Servizi Srl
After more tests (I'm trying to convince myself about Gluster reliability 
:-) I've found that with


performance.write-behind off

the VM works without problems. Now I'll try with write-behind on and 
flush-behind on too.




On 18/01/2018 13:30, Krutika Dhananjay wrote:
Thanks for that input. Adding Niels since the issue is reproducible 
only with libgfapi.


-Krutika

On Thu, Jan 18, 2018 at 1:39 PM, Ing. Luca Lazzeroni - Trend Servizi 
Srl > wrote:


Another update.

I've set up a replica 3 volume without sharding and tried to
install a VM on a qcow2 volume on that device; however, the result
is the same and the VM image has been corrupted, exactly at the
same point.

Here's the volume info of the created volume:

Volume Name: gvtest
Type: Replicate
Volume ID: e2ddf694-ba46-4bc7-bc9c-e30803374e9d
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: gluster1:/bricks/brick1/gvtest
Brick2: gluster2:/bricks/brick1/gvtest
Brick3: gluster3:/bricks/brick1/gvtest
Options Reconfigured:
user.cifs: off
features.shard: off
cluster.shd-wait-qlength: 1
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: enable
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off


On 17/01/2018 14:51, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:


Hi,

after our IRC chat I've rebuilt a virtual machine with a FUSE-based
virtual disk. Everything worked flawlessly.

Now I'm sending you the output of the requested getfattr command
on the disk image:

# file: TestFUSE-vda.qcow2
trusted.afr.dirty=0x
trusted.gfid=0x40ffafbbe987445692bb31295fa40105

trusted.gfid2path.dc9dde61f0b77eab=0x31326533323631662d373839332d346262302d383738632d3966623765306232336263652f54657374465553452d7664612e71636f7732
trusted.glusterfs.shard.block-size=0x0400

trusted.glusterfs.shard.file-size=0xc1530060be90

Hope this helps.



On 17/01/2018 11:37, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:


I actually use FUSE and it works. If I try to use the "libgfapi"
direct interface to gluster in qemu-kvm, the problem appears.



On 17/01/2018 11:35, Krutika Dhananjay wrote:

Really? Then which protocol exactly do you see this issue with?
libgfapi? NFS?

-Krutika

On Wed, Jan 17, 2018 at 3:59 PM, Ing. Luca Lazzeroni - Trend
Servizi Srl > wrote:

Of course. Here's the full log. Please note that in FUSE
mode everything apparently works without problems. I've
installed 4 VMs and updated them without problems.



On 17/01/2018 11:00, Krutika Dhananjay wrote:



On Tue, Jan 16, 2018 at 10:47 PM, Ing. Luca Lazzeroni -
Trend Servizi Srl >
wrote:

I've made the test with the raw image format (preallocated
too) and the corruption problem is still there (but
without errors in the bricks' log files).

What does the "link" error in the bricks' log files mean?

I've looked at the source code for the lines where
it happens and it seems to be a warning (it doesn't imply a
failure).


Indeed, it only represents a transient state when the
shards are created for the first time and does not
indicate a failure.
Could you also get the logs of the gluster fuse mount
process? It should be under /var/log/glusterfs of your
client machine with the filename as a hyphenated mount
point path.

For example, if your volume was mounted at /mnt/glusterfs,
then your log file would be named mnt-glusterfs.log.

-Krutika



On 16/01/2018 17:39, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:


An update:

I've tried, for my tests, to create the vm volume as

qemu-img create -f qcow2 -o preallocation=full
gluster://gluster1/Test/Test-vda.img 20G

et voila !

No errors at all, neither in the bricks' log files (the "link
failed" message disappeared) nor in the VM (no
corruption, and it installed successfully).

I'll do another test with a fully preallocated raw image.



On 16/01/2018 16:31, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:


I've just done all the steps to reproduce the problem.

[Gluster-users] no luck making read local ONLY local work

2018-01-18 Thread Vlad Kopylov
Standard FUSE mount - the mount and the brick are on the same server. 3.13 branch.
The volume was created and fstab-mounted via the server names from the hosts file
(vm1, vm2, vm3, ...) pointing to the interface IP.

trying:
cluster.nufa on
cluster.choose-local on
cluster.read-hash-mode: 0
cluster.choose-local on (by default)
trying to mount with xlator-option=cluster.read-subvolume-index=X
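
For reference, the settings above map to commands of roughly this form (volume
and mount point names are placeholders, and the exact xlator instance name
passed to xlator-option is an assumption that may need adjusting):

  gluster volume set myvol cluster.nufa on
  gluster volume set myvol cluster.choose-local on
  gluster volume set myvol cluster.read-hash-mode 0
  mount -t glusterfs -o xlator-option=myvol-replicate-0.read-subvolume-index=0 vm1:/myvol /mnt/myvol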

When a read happens, the network tops out whatever I try, making everything dead slow.
A copy from the brick takes 2 seconds; through the mount it takes minutes.

Maybe I can use the brick directly for reads? And maybe for writes?
Can I just block the FUSE mount ports so there will be no FUSE client
communication with other nodes from each server?

The only non-ancient doc on this I was able to find:
http://lists.gluster.org/pipermail/gluster-users/2015-June/022321.html
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Segfaults after upgrade to GlusterFS 3.10.9

2018-01-18 Thread Jiffin Tony Thottan

Hi Frank,

It will be much easier to debug if you have the core file. It looks like 
the crash is coming from the gfapi stack.


If there is a core file, can you please share the backtrace (bt) from it?
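
A rough sketch of how to get it (binary and core file paths are placeholders;
adjust to your distribution and core dump location):

  gdb /usr/bin/ganesha.nfsd /path/to/core
  (gdb) bt
  (gdb) thread apply all bt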

Regards,

Jiffin


On Thursday 18 January 2018 11:18 PM, Frank Wall wrote:

Hi,

after upgrading to 3.10.9 I'm seeing ganesha.nfsd segfaulting all the time:

[12407.918249] ganesha.nfsd[38104]: segfault at 0 ip 7f872425fb00 sp 
7f867cefe5d0 error 4 in libglusterfs.so.0.0.1[7f8724223000+f1000]
[12693.119259] ganesha.nfsd[3610]: segfault at 0 ip 7f716d8f5b00 sp 
7f71367e15d0 error 4 in libglusterfs.so.0.0.1[7f716d8b9000+f1000]
[14531.582667] ganesha.nfsd[17025]: segfault at 0 ip 7f7cb8fa8b00 sp 
7f7c5878d5d0 error 4 in libglusterfs.so.0.0.1[7f7cb8f6c000+f1000]

ganesha-gfapi.log shows the following errors:

[2018-01-18 17:24:00.146094] W [inode.c:1341:inode_parent] 
(-->/lib64/libgfapi.so.0(glfs_resolve_at+0x278) [0x7f7cb927f0b8] 
-->/lib64/libglusterfs.so.0(glusterfs_normalize_dentry+0x8e) [0x7f7cb8fa8aee] 
-->/lib64/libglusterfs.so.0(inode_parent+0xda) [0x7f7cb8fa670a] ) 0-gfapi: inode not 
found
[2018-01-18 17:24:00.146210] E [inode.c:2567:inode_parent_null_check] 
(-->/lib64/libgfapi.so.0(glfs_resolve_at+0x278) [0x7f7cb927f0b8] 
-->/lib64/libglusterfs.so.0(glusterfs_normalize_dentry+0xa0) [0x7f7cb8fa8b00] 
-->/lib64/libglusterfs.so.0(+0x398c4) [0x7f7cb8fa58c4] ) 0-inode: invalid argument: 
inode [Invalid argument]

This leads to serious availability issues.

Is this a known issue? Any workaround available?

FWIW, my GlusterFS volume looks like this:

Volume Name: gfsvol
Type: Distributed-Replicate
Volume ID: f7985bf3-67e1-49d6-90bf-16816536533b
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x 3 = 12
Transport-type: tcp
Bricks:
Brick1: AAA:/bricks/gfsvol/vol1/volume
Brick2: BBB:/bricks/gfsvol/vol1/volume
Brick3: CCC:/bricks/gfsvol/vol1/volume
Brick4: AAA:/bricks/gfsvol/vol2/volume
Brick5: BBB:/bricks/gfsvol/vol2/volume
Brick6: CCC:/bricks/gfsvol/vol2/volume
Brick7: AAA:/bricks/gfsvol/vol3/volume
Brick8: BBB:/bricks/gfsvol/vol3/volume
Brick9: CCC:/bricks/gfsvol/vol3/volume
Brick10: AAA:/bricks/gfsvol/vol4/volume
Brick11: BBB:/bricks/gfsvol/vol4/volume
Brick12: CCC:/bricks/gfsvol/vol4/volume
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
features.cache-invalidation: off
ganesha.enable: on
auth.allow: *
nfs.rpc-auth-allow: *
nfs-ganesha: enable
cluster.enable-shared-storage: enable


Thanks
- Frank
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] IMP: Release 4.0: CentOS 6 packages will not be made available

2018-01-18 Thread Amye Scavarda
On Thu, Jan 18, 2018 at 3:25 PM, Tomalak Geret'kal  wrote:

> On 11/01/2018 18:32, Shyam Ranganathan wrote:
> > Gluster Users,
> >
> > This is to inform you that from the 4.0 release onward, packages for
> > CentOS 6 will not be built by the gluster community. This also means
> > that the CentOS SIG will not receive updates for 4.0 gluster packages.
> >
> > Gluster release 3.12 and its predecessors will receive CentOS 6 updates
> > till Release 4.3 of gluster (which is slated around Dec, 2018).
> >
> > The decision is due to the following,
> > - Glusterd2 which is golang based meets its dependencies on CentOS 7
> > only, and is not built on CentOS 6 (yet)
> >
> > - Gluster community regression machines and runs are going to be CentOS
> > 7 based going forward, and so determinism of quality on CentOS 7 would
> > be better than on CentOS 6
> >
> > If you have questions, send to Amye Scavarda 
> >
> Hi Amye,
>
> How does this mesh with the fact that CentOS 6 is supposed
> to receive maintenance updates until November 2020?
> Particularly as this project is a Red Hat endeavour. Will we
> only get updates on RHEL6 or is that going to be
> discontinued around December 2018 as well?
>
> Cheers
>

Hi there!
Thanks for reaching out!
We did consider this as well, and we understand that different projects
have different lifecycles. That being said, in the interest of providing
the best possible experience for 4.0, we're aligning our community releases
with our development platform.
If someone is available to offer resources to continue supporting CentOS6,
I think we'd be willing to consider that.
-- amye

-- 
Amye Scavarda | a...@redhat.com | Gluster Community Lead
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] IMP: Release 4.0: CentOS 6 packages will not be made available

2018-01-18 Thread Tomalak Geret'kal
On 11/01/2018 18:32, Shyam Ranganathan wrote:
> Gluster Users,
>
> This is to inform you that from the 4.0 release onward, packages for
> CentOS 6 will not be built by the gluster community. This also means
> that the CentOS SIG will not receive updates for 4.0 gluster packages.
>
> Gluster release 3.12 and its predecessors will receive CentOS 6 updates
> till Release 4.3 of gluster (which is slated around Dec, 2018).
>
> The decision is due to the following,
> - Glusterd2 which is golang based meets its dependencies on CentOS 7
> only, and is not built on CentOS 6 (yet)
>
> - Gluster community regression machines and runs are going to be CentOS
> 7 based going forward, and so determinism of quality on CentOS 7 would
> be better than on CentOS 6
>
> If you have questions, send to Amye Scavarda 
>
Hi Amye,

How does this mesh with the fact that CentOS 6 is supposed
to receive maintenance updates until November 2020?
Particularly as this project is a Red Hat endeavour. Will we
only get updates on RHEL6 or is that going to be
discontinued around December 2018 as well?

Cheers

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Deploying geo-replication to local peer

2018-01-18 Thread Viktor Nosov
Hi Kotresh,

 

Thanks for response! 

 

After running more tests with this specific geo-replication configuration I 
realized that

the file extended attributes trusted.gfid and trusted.gfid2path.*** are synced as 
well during geo-replication.

I’m concern about attribute trusted.gfid because value of the attribute has to 
be unique for glusterfs cluster.

But this is not a case in my tests. File on master and slave volumes has the 
same trusted.gfid attribute.

To handle this issue, the geo-replication configuration option sync-xattrs = 
false was tested on glusterfs version 3.12.3.
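
Presumably this is done with a command of this form (master volume, slave host
and slave volume names are placeholders):

  gluster volume geo-replication mastervol slavehost::slavevol config sync-xattrs false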

After changing the option from true to false, geo-replication was stopped, 
the volume was stopped, glusterd was stopped and started again, the volume was 
started, and geo-replication was started again.

It had no effect on the syncing of trusted.gfid.

 

How critical is it to have duplicated gfids? Can volume data be corrupted in 
this case somehow?

 

Best regards,

 

Viktor Nosov 

 

From: Kotresh Hiremath Ravishankar [mailto:khire...@redhat.com] 
Sent: Tuesday, January 16, 2018 7:59 PM
To: Viktor Nosov
Cc: Gluster Users; jby...@stonefly.com
Subject: Re: [Gluster-users] Deploying geo-replication to local peer

 

Hi Viktor,

Answers inline

 

On Wed, Jan 17, 2018 at 3:46 AM, Viktor Nosov  wrote:

Hi,

I'm looking for a glusterfs feature that can be used to transform data between
volumes of different types provisioned on the same nodes.
It could be, for example, a transformation from a disperse to a distributed
volume.
A possible option is to invoke geo-replication between volumes. It seems
it works properly.
But I'm concerned about this requirement from the Administration Guide for Red Hat
Gluster Storage 3.3 (10.3.3. Prerequisites):

"Slave node must not be a peer of the any of the nodes of the Master trusted
storage pool."

This doesn't limit the geo-rep feature in any way. It's a recommendation. You

can go ahead and use it.

 

Is this restriction set to limit the usage of geo-replication to disaster
recovery scenarios only, or is there a problem with data synchronization
between
master and slave volumes?

Does anybody have experience with this issue?

Thanks for any information!

Viktor Nosov


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users




-- 

Thanks and Regards,

Kotresh H R

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] issues after botched update

2018-01-18 Thread Jorick Astrego
Hi,

A client has a glusterfs cluster that's behaving weirdly after some 
issues during upgrade.

They upgraded a glusterfs 2+1 cluster (replica with arbiter) from 3.10.9 
to 3.12.4 on CentOS and now have weird issues, with some files maybe being 
corrupted. They also switched from NFS-Ganesha, which crashed every couple 
of days, to glusterfs subdirectory mounting. Subdirectory mounting was 
the point of the upgrade.
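
For reference, a glusterfs 3.12 subdirectory mount takes roughly this form
(server and subdirectory names here are placeholders; the volume name home is
taken from the scrub status below):

  mount -t glusterfs gluster01:/home/somesubdir /mnt/somesubdir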

On a client, when I switch to a user and go to the home folder, I do a 
ls -lart and get a list of files; when I do it the second time I get 
"ls: cannot open directory .: Permission denied" until I log out and 
log in again. Then it works mostly one time only; sometimes it fails on 
the first try. Deploying something in this folder works sometimes, 
other times write permission is denied on /home/user.

I've enabled the bitrot daemon, but the scrub is still running:

Volume name : home

State of scrub: Active (In Progress)

Scrub impact: normal

Scrub frequency: biweekly

Bitrot error log location: /var/log/glusterfs/bitd.log

Scrubber error log location: /var/log/glusterfs/scrub.log


=

Node: localhost

Number of Scrubbed files: 152451

Number of Skipped files: 197

Last completed scrub time: Scrubber pending to complete.

Duration of last scrub (D:M:H:M:S): 0:0:0:0

Error count: 0


=

Node: gluster01

Number of Scrubbed files: 150198

Number of Skipped files: 190

Last completed scrub time: Scrubber pending to complete.

Duration of last scrub (D:M:H:M:S): 0:0:0:0

Error count: 0


=

Node: gluster02

Number of Scrubbed files: 0

Number of Skipped files: 153939

Last completed scrub time: Scrubber pending to complete.

Duration of last scrub (D:M:H:M:S): 0:0:0:0

Error count: 0

=

Gluster volume heal shows one failed entry.

Ending time of crawl: Thu Jan 18 11:49:04 2018

Type of crawl: INDEX
No. of entries healed: 0
No. of entries in split-brain: 0
No. of heal failed entries: 0

Starting time of crawl: Thu Jan 18 11:59:04 2018

Ending time of crawl: Thu Jan 18 11:59:06 2018

Type of crawl: INDEX
No. of entries healed: 0
No. of entries in split-brain: 0
*No. of heal failed entries: 1*

Starting time of crawl: Thu Jan 18 12:09:06 2018

Ending time of crawl: Thu Jan 18 12:09:07 2018

Type of crawl: INDEX
No. of entries healed: 0
No. of entries in split-brain: 0
No. of heal failed entries: 0

but I'm unable to trace which file, as "gluster volume heal  info 
heal-failed" doesn't exist anymore:


gluster volume heal home info heal-failed

Usage:
volume heal  [enable | disable | full |statistics
[heal-count [replica ]] |info [split-brain]
|split-brain {bigger-file  | latest-mtime  |source-brick
 []} |granular-entry-heal {enable | disable}]
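
On 3.12 the closest equivalents seem to be the plain info and statistics
subcommands listed in the usage above, e.g.:

  gluster volume heal home info
  gluster volume heal home statistics
  gluster volume heal home info split-brain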

The things I'm seeing in the logfiles on the client:

[2018-01-18 08:59:53.360864] W [MSGID: 114031] 
[client-rpc-fops.c:2151:client3_3_seek_cbk] 0-home-client-0: remote 
operation failed [No such device or address]
[2018-01-18 09:00:17.512636] W [MSGID: 114031] 
[client-rpc-fops.c:2151:client3_3_seek_cbk] 0-home-client-0: remote 
operation failed [No such device or address]
[2018-01-18 09:00:27.473702] W [MSGID: 114031] 
[client-rpc-fops.c:2151:client3_3_seek_cbk] 0-home-client-0: remote 
operation failed [No such device or address]
[2018-01-18 09:00:40.372756] W [MSGID: 114031] 
[client-rpc-fops.c:2151:client3_3_seek_cbk] 0-home-client-0: remote 
operation failed [No such device or address]
[2018-01-18 09:00:50.344597] W [MSGID: 114031] 
[client-rpc-fops.c:2151:client3_3_seek_cbk] 0-home-client-0: remote 
operation failed [No such device or address]

These ones worry me; there are multiple occurrences, but without any gfid:

  [MSGID: 109063] [dht-layout.c:716:dht_layout_normalize]
5-home-dht: Found anomalies in (null) (gfid =
----). Holes=1 overlaps=0

Also, it seems to disconnect and reconnect:

[2018-01-18 08:38:41.210848] I [MSGID: 114057]
[client-handshake.c:1478:select_server_supported_programs]
0-home-client-1: Using Program GlusterFS 3.3, Num (1298437), Version
(330)
[2018-01-18 08:38:41.214548] I [MSGID: 114057]
[client-handshake.c:1478:select_server_supported_programs]
0-home-client-2: Using Program GlusterFS 3.3, Num (1298437), Version
(330)
[2018-01-18 08:38:41.255458] I [MSGID: 114046]
[client-handshake.c:1231:client_setvolume_cbk] 0-home-client-0:
Connected to home-client-0, attached to remote volume
'/data/home/brick1'.
[2018-01-18 08:38:41.255505] I [MSGID: 114047]
[client-handshake.c:1242:client_setvolume_cbk] 

Re: [Gluster-users] [Possibile SPAM] Re: Strange messages in mnt-xxx.log

2018-01-18 Thread Ing. Luca Lazzeroni - Trend Servizi Srl

Here's the volume info:


Volume Name: gv2a2
Type: Replicate
Volume ID: 83c84774-2068-4bfc-b0b9-3e6b93705b9f
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: gluster1:/bricks/brick2/gv2a2
Brick2: gluster3:/bricks/brick3/gv2a2
Brick3: gluster2:/bricks/arbiter_brick_gv2a2/gv2a2 (arbiter)
Options Reconfigured:
storage.owner-gid: 107
storage.owner-uid: 107
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 1
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: enable
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
nfs.disable: off
performance.client-io-threads: off

The only client I'm using is the FUSE client used to mount the gluster 
volume. On the gluster volume there is a maximum of 15 files ("big" 
files, because they host VM images) accessed by qemu-kvm as normal files 
on the FUSE-mounted volume.


By inspecting the code I've found that the message is logged in 2 
situations:


1) A real "Hole" in DHT

2) A "virgin" file being created

I think this is the second situation, because that message appears only 
when I create a new qcow2 volume to host a VM image.




On 17/01/2018 04:54, Nithya Balachandran wrote:

Hi,


On 16 January 2018 at 18:56, Ing. Luca Lazzeroni - Trend Servizi Srl 
> wrote:


Hi,

I'm testing gluster 3.12.4 and, by inspecting log files
/var/log/glusterfs/mnt-gv0.log (gv0 is the volume name), I found
many lines saying:

[2018-01-15 09:45:41.066914] I [MSGID: 109063]
[dht-layout.c:716:dht_layout_normalize] 0-gv0-dht: Found anomalies
in (null) (gfid = ----). Holes=1
overlaps=0
[2018-01-15 09:45:45.755021] I [MSGID: 109063]
[dht-layout.c:716:dht_layout_normalize] 0-gv0-dht: Found anomalies
in (null) (gfid = ----). Holes=1
overlaps=0
[2018-01-15 14:02:29.171437] I [MSGID: 109063]
[dht-layout.c:716:dht_layout_normalize] 0-gv0-dht: Found anomalies
in (null) (gfid = ----). Holes=1
overlaps=0

What do they mean ? Is there any real problem ?


Please provide the following details:
gluster volume info
what clients you are using and what operations being performed
Any steps to reproduce this issue.

Thanks,
Nithya

Thank you,


-- 
Ing. Luca Lazzeroni

Responsabile Ricerca e Sviluppo
Trend Servizi Srl
Tel: 0376/631761
Web: https://www.trendservizi.it

___
Gluster-users mailing list
Gluster-users@gluster.org 
http://lists.gluster.org/mailman/listinfo/gluster-users





--
Ing. Luca Lazzeroni
Responsabile Ricerca e Sviluppo
Trend Servizi Srl
Tel: 0376/631761
Web: https://www.trendservizi.it

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Gluster endless heal

2018-01-18 Thread Mahdi Adnan
Hi,

I have an issue with Gluster 3.8.14.
The cluster is 4 nodes with replica count 2. One of the nodes went offline for 
around 15 minutes; when it came back online, self-heal triggered and it just 
did not stop afterward. It's been running for 3 days now, maxing out the bricks' 
utilization without actually healing anything.
The bricks are all SSDs, and the logs of the source node are spamming the 
following messages:

[2018-01-17 18:37:11.815247] I [MSGID: 108026] 
[afr-self-heal-common.c:1254:afr_log_selfheal] 0-ovirt_imgs-replicate-0: 
Completed data selfheal on 450fb07a-e95d-48ef-a229-48917557c278. sources=[0]  
sinks=1
[2018-01-17 18:37:12.830887] I [MSGID: 108026] 
[afr-self-heal-metadata.c:51:__afr_selfheal_metadata_do] 
0-ovirt_imgs-replicate-0: performing metadata selfheal on 
ce0f545d-635a-40c0-95eb-ccfc71971f78
[2018-01-17 18:37:12.845978] I [MSGID: 108026] 
[afr-self-heal-common.c:1254:afr_log_selfheal] 0-ovirt_imgs-replicate-0: 
Completed metadata selfheal on ce0f545d-635a-40c0-95eb-ccfc71971f78. 
sources=[0]  sinks=1

---

I tried restarting glusterd and rebooting the node after about 24 hours of 
healing, but it just did not help. I had several bricks doing heal, and 
after rebooting it's now only 4 bricks doing heal.

The volume is used for oVirt storage domain with sharding enabled.
No errors or warnings on both nodes, just info messages about afr healing.

Any idea what's going on, or where I should start looking?
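
For reference, the current heal state can be inspected with commands like the
following (the volume name is taken from the log prefix above; adjust as needed):

  gluster volume heal ovirt_imgs info
  gluster volume heal ovirt_imgs statistics heal-count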

--

Respectfully
Mahdi A. Mahdi

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Segfaults after upgrade to GlusterFS 3.10.9

2018-01-18 Thread Frank Wall
Hi,

after upgrading to 3.10.9 I'm seeing ganesha.nfsd segfaulting all the time:

[12407.918249] ganesha.nfsd[38104]: segfault at 0 ip 7f872425fb00 sp 
7f867cefe5d0 error 4 in libglusterfs.so.0.0.1[7f8724223000+f1000]
[12693.119259] ganesha.nfsd[3610]: segfault at 0 ip 7f716d8f5b00 sp 
7f71367e15d0 error 4 in libglusterfs.so.0.0.1[7f716d8b9000+f1000]
[14531.582667] ganesha.nfsd[17025]: segfault at 0 ip 7f7cb8fa8b00 sp 
7f7c5878d5d0 error 4 in libglusterfs.so.0.0.1[7f7cb8f6c000+f1000]

ganesha-gfapi.log shows the following errors:

[2018-01-18 17:24:00.146094] W [inode.c:1341:inode_parent] 
(-->/lib64/libgfapi.so.0(glfs_resolve_at+0x278) [0x7f7cb927f0b8] 
-->/lib64/libglusterfs.so.0(glusterfs_normalize_dentry+0x8e) [0x7f7cb8fa8aee] 
-->/lib64/libglusterfs.so.0(inode_parent+0xda) [0x7f7cb8fa670a] ) 0-gfapi: 
inode not found
[2018-01-18 17:24:00.146210] E [inode.c:2567:inode_parent_null_check] 
(-->/lib64/libgfapi.so.0(glfs_resolve_at+0x278) [0x7f7cb927f0b8] 
-->/lib64/libglusterfs.so.0(glusterfs_normalize_dentry+0xa0) [0x7f7cb8fa8b00] 
-->/lib64/libglusterfs.so.0(+0x398c4) [0x7f7cb8fa58c4] ) 0-inode: invalid 
argument: inode [Invalid argument]

This leads to serious availability issues.

Is this a known issue? Any workaround available?

FWIW, my GlusterFS volume looks like this:

Volume Name: gfsvol
Type: Distributed-Replicate
Volume ID: f7985bf3-67e1-49d6-90bf-16816536533b
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x 3 = 12
Transport-type: tcp
Bricks:
Brick1: AAA:/bricks/gfsvol/vol1/volume
Brick2: BBB:/bricks/gfsvol/vol1/volume
Brick3: CCC:/bricks/gfsvol/vol1/volume
Brick4: AAA:/bricks/gfsvol/vol2/volume
Brick5: BBB:/bricks/gfsvol/vol2/volume
Brick6: CCC:/bricks/gfsvol/vol2/volume
Brick7: AAA:/bricks/gfsvol/vol3/volume
Brick8: BBB:/bricks/gfsvol/vol3/volume
Brick9: CCC:/bricks/gfsvol/vol3/volume
Brick10: AAA:/bricks/gfsvol/vol4/volume
Brick11: BBB:/bricks/gfsvol/vol4/volume
Brick12: CCC:/bricks/gfsvol/vol4/volume
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
features.cache-invalidation: off
ganesha.enable: on
auth.allow: *
nfs.rpc-auth-allow: *
nfs-ganesha: enable
cluster.enable-shared-storage: enable


Thanks
- Frank
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] cluster/dht: restrict migration of opened files

2018-01-18 Thread Susant Palai
This does not restrict tiered migrations.

Susant

On 18 Jan 2018 8:18 pm, "Milind Changire"  wrote:

On Tue, Jan 16, 2018 at 2:52 PM, Raghavendra Gowdappa 
wrote:

> All,
>
> Patch [1] prevents migration of opened files during rebalance operation.
> If patch [1] affects you, please voice out your concerns. [1] is a stop-gap
> fix for the problem discussed in issues [2][3]
>
> [1] https://review.gluster.org/#/c/19202/
> [2] https://github.com/gluster/glusterfs/issues/308
> [3] https://github.com/gluster/glusterfs/issues/347
>
> regards,
> Raghavendra
>
> ___
> Gluster-devel mailing list
> gluster-de...@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>


Would this patch affect tiering as well ?
Do we need to worry about tiering anymore ?

--
Milind


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Blocking IO when hot tier promotion daemon runs

2018-01-18 Thread Tom Fite
Thanks for the info, Hari. Sorry about the bad gluster volume info, I
grabbed that from a file not realizing it was out of date. Here's a current
configuration showing the active hot tier:

[root@pod-sjc1-gluster1 ~]# gluster volume info

Volume Name: gv0
Type: Tier
Volume ID: d490a9ec-f9c8-4f10-a7f3-e1b6d3ced196
Status: Started
Snapshot Count: 13
Number of Bricks: 8
Transport-type: tcp
Hot Tier :
Hot Tier Type : Replicate
Number of Bricks: 1 x 2 = 2
Brick1: pod-sjc1-gluster2:/data/hot_tier/gv0
Brick2: pod-sjc1-gluster1:/data/hot_tier/gv0
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 3 x 2 = 6
Brick3: pod-sjc1-gluster1:/data/brick1/gv0
Brick4: pod-sjc1-gluster2:/data/brick1/gv0
Brick5: pod-sjc1-gluster1:/data/brick2/gv0
Brick6: pod-sjc1-gluster2:/data/brick2/gv0
Brick7: pod-sjc1-gluster1:/data/brick3/gv0
Brick8: pod-sjc1-gluster2:/data/brick3/gv0
Options Reconfigured:
performance.rda-low-wmark: 4KB
performance.rda-request-size: 128KB
storage.build-pgfid: on
cluster.watermark-low: 50
performance.readdir-ahead: off
cluster.tier-cold-compact-frequency: 86400
cluster.tier-hot-compact-frequency: 86400
features.ctr-sql-db-wal-autocheckpoint: 2500
cluster.tier-max-mb: 64000
cluster.tier-max-promote-file-size: 10485760
cluster.tier-max-files: 10
cluster.tier-demote-frequency: 3600
server.allow-insecure: on
performance.flush-behind: on
performance.rda-cache-limit: 128MB
network.tcp-window-size: 1048576
performance.nfs.io-threads: off
performance.write-behind-window-size: 512MB
performance.nfs.write-behind-window-size: 4MB
performance.io-cache: on
performance.quick-read: on
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.cache-invalidation: on
performance.md-cache-timeout: 600
network.inode-lru-limit: 9
performance.cache-size: 1GB
server.event-threads: 10
client.event-threads: 10
features.barrier: disable
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: on
cluster.lookup-optimize: on
server.outstanding-rpc-limit: 2056
performance.stat-prefetch: on
performance.cache-refresh-timeout: 60
features.ctr-enabled: on
cluster.tier-mode: cache
cluster.tier-compact: on
cluster.tier-pause: off
cluster.tier-promote-frequency: 1500
features.record-counters: on
cluster.write-freq-threshold: 2
cluster.read-freq-threshold: 5
features.ctr-sql-db-cachesize: 262144
cluster.watermark-hi: 95
auto-delete: enable

It will take some time to get the logs together, I need to strip out
potentially sensitive info, will update with them when I have them.

Any theories as to why the promotions / demotions only take place on one
box but not both?

-Tom

On Thu, Jan 18, 2018 at 5:12 AM, Hari Gowtham  wrote:

> Hi Tom,
>
> The volume info doesn't show the hot bricks. I think you took the
> volume info output before attaching the hot tier.
> Can you send the volume info of the current setup where you see this issue.
>
> The logs you sent are from a later point in time. The issue was hit
> earlier than what is available in the logs. I need the logs
> from an earlier time.
> And along with the entire tier logs, can you send the glusterd and
> brick logs too?
>
> Rest of the comments are inline
>
> On Wed, Jan 10, 2018 at 9:03 PM, Tom Fite  wrote:
> > I should add that additional testing has shown that only accessing files
> is
> > held up, IO is not interrupted for existing transfers. I think this
> points
> > to the heat metadata in the sqlite DB for the tier, is it possible that a
> > table is temporarily locked while the promotion daemon runs so the calls
> to
> > update the access count on files are blocked?
> >
> >
> > On Wed, Jan 10, 2018 at 10:17 AM, Tom Fite  wrote:
> >>
> >> The sizes of the files are extremely varied, there are millions of small
> >> (<1 MB) files and thousands of files larger than 1 GB.
>
> The tier use case is for bigger files; it is not the best for files of
> smaller size.
> That can end up hindering the IOs.
>
> >>
> >> Attached is the tier log for gluster1 and gluster2. These are full of
> >> "demotion failed" messages, which is also shown in the status:
> >>
> >> [root@pod-sjc1-gluster1 gv0]# gluster volume tier gv0 status
> >> Node Promoted files   Demoted filesStatus
> >> run time in h:m:s
> >> ----
> >> -
> >> localhost259400in
> progress
> >> 112:21:49
> >> pod-sjc1-gluster2 02917154  in progress
> >> 112:21:49
> >>
> >> Is it normal to have promotions and demotions only happen on each server
> >> but not both?
>
> No. its not normal.
>
> >>
> >> Volume info:
> >>
> >> [root@pod-sjc1-gluster1 ~]# gluster volume info
> >>
> >> Volume Name: gv0
> >> Type: Distributed-Replicate
> >> Volume ID: d490a9ec-f9c8-4f10-a7f3-e1b6d3ced196
> >> Status: Started
> >> Snapshot Count: 13

Re: [Gluster-users] [Gluster-devel] cluster/dht: restrict migration of opened files

2018-01-18 Thread Milind Changire
On Tue, Jan 16, 2018 at 2:52 PM, Raghavendra Gowdappa 
wrote:

> All,
>
> Patch [1] prevents migration of opened files during rebalance operation.
> If patch [1] affects you, please voice out your concerns. [1] is a stop-gap
> fix for the problem discussed in issues [2][3]
>
> [1] https://review.gluster.org/#/c/19202/
> [2] https://github.com/gluster/glusterfs/issues/308
> [3] https://github.com/gluster/glusterfs/issues/347
>
> regards,
> Raghavendra
>
> ___
> Gluster-devel mailing list
> gluster-de...@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel



Would this patch affect tiering as well ?
Do we need to worry about tiering anymore ?

--
Milind
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] cluster/dht: restrict migration of opened files

2018-01-18 Thread Pranith Kumar Karampuri
On Tue, Jan 16, 2018 at 2:52 PM, Raghavendra Gowdappa 
wrote:

> All,
>
> Patch [1] prevents migration of opened files during rebalance operation.
> If patch [1] affects you, please voice out your concerns. [1] is a stop-gap
> fix for the problem discussed in issues [2][3]
>

What is the impact on VM and gluster-block usecases after this patch? Will
it rebalance any data in these usecases?


>
> [1] https://review.gluster.org/#/c/19202/
> [2] https://github.com/gluster/glusterfs/issues/308
> [3] https://github.com/gluster/glusterfs/issues/347
>
> regards,
> Raghavendra
>
> ___
> Gluster-devel mailing list
> gluster-de...@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>



-- 
Pranith
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Possibile SPAM] Re: Problem with Gluster 3.12.4, VM and sharding

2018-01-18 Thread Krutika Dhananjay
Thanks for that input. Adding Niels since the issue is reproducible only
with libgfapi.

-Krutika

On Thu, Jan 18, 2018 at 1:39 PM, Ing. Luca Lazzeroni - Trend Servizi Srl <
l...@gvnet.it> wrote:

> Another update.
>
> I've set up a replica 3 volume without sharding and tried to install a VM
> on a qcow2 volume on that device; however, the result is the same and the VM
> image has been corrupted, exactly at the same point.
>
> Here's the volume info of the created volume:
>
> Volume Name: gvtest
> Type: Replicate
> Volume ID: e2ddf694-ba46-4bc7-bc9c-e30803374e9d
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: gluster1:/bricks/brick1/gvtest
> Brick2: gluster2:/bricks/brick1/gvtest
> Brick3: gluster3:/bricks/brick1/gvtest
> Options Reconfigured:
> user.cifs: off
> features.shard: off
> cluster.shd-wait-qlength: 1
> cluster.shd-max-threads: 8
> cluster.locking-scheme: granular
> cluster.data-self-heal-algorithm: full
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> cluster.eager-lock: enable
> network.remote-dio: enable
> performance.low-prio-threads: 32
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
>
>
> On 17/01/2018 14:51, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:
>
> Hi,
>
> after our IRC chat I've rebuilt a virtual machine with a FUSE-based virtual
> disk. Everything worked flawlessly.
>
> Now I'm sending you the output of the requested getfattr command on the
> disk image:
>
> # file: TestFUSE-vda.qcow2
> trusted.afr.dirty=0x
> trusted.gfid=0x40ffafbbe987445692bb31295fa40105
> trusted.gfid2path.dc9dde61f0b77eab=0x31326533323631662d373839332d
> 346262302d383738632d3966623765306232336263652f54657374465553
> 452d7664612e71636f7732
> trusted.glusterfs.shard.block-size=0x0400
> trusted.glusterfs.shard.file-size=0xc153
> 0060be90
>
> Hope this helps.
>
>
>
> On 17/01/2018 11:37, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:
>
> I actually use FUSE and it works. If I try to use the "libgfapi" direct
> interface to gluster in qemu-kvm, the problem appears.
>
>
>
> On 17/01/2018 11:35, Krutika Dhananjay wrote:
>
> Really? Then which protocol exactly do you see this issue with? libgfapi?
> NFS?
>
> -Krutika
>
> On Wed, Jan 17, 2018 at 3:59 PM, Ing. Luca Lazzeroni - Trend Servizi Srl <
> l...@gvnet.it> wrote:
>
>> Of course. Here's the full log. Please note that in FUSE mode everything
>> apparently works without problems. I've installed 4 VMs and updated them
>> without problems.
>>
>>
>>
>> On 17/01/2018 11:00, Krutika Dhananjay wrote:
>>
>>
>>
>> On Tue, Jan 16, 2018 at 10:47 PM, Ing. Luca Lazzeroni - Trend Servizi Srl
>>  wrote:
>>
>>> I've made the test with the raw image format (preallocated too) and the
>>> corruption problem is still there (but without errors in the bricks' log files).
>>>
>>> What does the "link" error in the bricks' log files mean?
>>>
>>> I've looked at the source code for the lines where it happens and it
>>> seems to be a warning (it doesn't imply a failure).
>>>
>>
>> Indeed, it only represents a transient state when the shards are created
>> for the first time and does not indicate a failure.
>> Could you also get the logs of the gluster fuse mount process? It should
>> be under /var/log/glusterfs of your client machine with the filename as a
>> hyphenated mount point path.
>>
>> For example, if your volume was mounted at /mnt/glusterfs, then your log
>> file would be named mnt-glusterfs.log.
>>
>> -Krutika
>>
>>
>>>
>>> On 16/01/2018 17:39, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:
>>>
>>> An update:
>>>
>>> I've tried, for my tests, to create the vm volume as
>>>
>>> qemu-img create -f qcow2 -o preallocation=full
>>> gluster://gluster1/Test/Test-vda.img 20G
>>>
>>> et voila !
>>>
>>> No errors at all, neither in the bricks' log files (the "link failed" message
>>> disappeared) nor in the VM (no corruption, and it installed successfully).
>>>
>>> I'll do another test with a fully preallocated raw image.
>>>
>>>
>>>
>>> On 16/01/2018 16:31, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:
>>>
>>> I've just done all the steps to reproduce the problem.
>>>
>>> The VM volume has been created via "qemu-img create -f qcow2
>>> Test-vda2.qcow2 20G" on the gluster volume mounted via FUSE. I've also
>>> tried to create the volume with preallocated metadata, which moves the
>>> problem a bit further away (in time). The volume is a replica 3 arbiter 1
>>> volume hosted on XFS bricks.
>>>
>>> Here are the informations:
>>>
>>> [root@ovh-ov1 bricks]# gluster volume info gv2a2
>>>
>>> Volume Name: gv2a2
>>> Type: Replicate
>>> Volume ID: 83c84774-2068-4bfc-b0b9-3e6b93705b9f
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x (2 + 1) = 3
>>> Transport-type: tcp
>>> 

Re: [Gluster-users] Blocking IO when hot tier promotion daemon runs

2018-01-18 Thread Hari Gowtham
Hi Tom,

The volume info doesn't show the hot bricks. I think you took the
volume info output before attaching the hot tier.
Can you send the volume info of the current setup where you see this issue.

The logs you sent are from a later point in time. The issue was hit
earlier than what is available in the logs. I need the logs
from an earlier time.
And along with the entire tier logs, can you send the glusterd and
brick logs too?

Rest of the comments are inline

On Wed, Jan 10, 2018 at 9:03 PM, Tom Fite  wrote:
> I should add that additional testing has shown that only accessing files is
> held up, IO is not interrupted for existing transfers. I think this points
> to the heat metadata in the sqlite DB for the tier, is it possible that a
> table is temporarily locked while the promotion daemon runs so the calls to
> update the access count on files are blocked?
>
>
> On Wed, Jan 10, 2018 at 10:17 AM, Tom Fite  wrote:
>>
>> The sizes of the files are extremely varied, there are millions of small
>> (<1 MB) files and thousands of files larger than 1 GB.

The tier use case is for bigger files; it is not the best for files of
smaller size.
That can end up hindering the IOs.

>>
>> Attached is the tier log for gluster1 and gluster2. These are full of
>> "demotion failed" messages, which is also shown in the status:
>>
>> [root@pod-sjc1-gluster1 gv0]# gluster volume tier gv0 status
>> Node Promoted files   Demoted filesStatus
>> run time in h:m:s
>> ----
>> -
>> localhost259400in progress
>> 112:21:49
>> pod-sjc1-gluster2 02917154  in progress
>> 112:21:49
>>
>> Is it normal to have promotions and demotions only happen on each server
>> but not both?

No. its not normal.

>>
>> Volume info:
>>
>> [root@pod-sjc1-gluster1 ~]# gluster volume info
>>
>> Volume Name: gv0
>> Type: Distributed-Replicate
>> Volume ID: d490a9ec-f9c8-4f10-a7f3-e1b6d3ced196
>> Status: Started
>> Snapshot Count: 13
>> Number of Bricks: 3 x 2 = 6
>> Transport-type: tcp
>> Bricks:
>> Brick1: pod-sjc1-gluster1:/data/brick1/gv0
>> Brick2: pod-sjc1-gluster2:/data/brick1/gv0
>> Brick3: pod-sjc1-gluster1:/data/brick2/gv0
>> Brick4: pod-sjc1-gluster2:/data/brick2/gv0
>> Brick5: pod-sjc1-gluster1:/data/brick3/gv0
>> Brick6: pod-sjc1-gluster2:/data/brick3/gv0
>> Options Reconfigured:
>> performance.cache-refresh-timeout: 60
>> performance.stat-prefetch: on
>> server.allow-insecure: on
>> performance.flush-behind: on
>> performance.rda-cache-limit: 32MB
>> network.tcp-window-size: 1048576
>> performance.nfs.io-threads: on
>> performance.write-behind-window-size: 4MB
>> performance.nfs.write-behind-window-size: 512MB
>> performance.io-cache: on
>> performance.quick-read: on
>> features.cache-invalidation: on
>> features.cache-invalidation-timeout: 600
>> performance.cache-invalidation: on
>> performance.md-cache-timeout: 600
>> network.inode-lru-limit: 9
>> performance.cache-size: 4GB
>> server.event-threads: 16
>> client.event-threads: 16
>> features.barrier: disable
>> transport.address-family: inet
>> nfs.disable: on
>> performance.client-io-threads: on
>> cluster.lookup-optimize: on
>> server.outstanding-rpc-limit: 1024
>> auto-delete: enable
>>
>>
>> # gluster volume status
>> Status of volume: gv0
>> Gluster process TCP Port  RDMA Port  Online
>> Pid
>>
>> --
>> Hot Bricks:
>> Brick pod-sjc1-gluster2:/data/
>> hot_tier/gv049219 0  Y
>> 26714
>> Brick pod-sjc1-gluster1:/data/
>> hot_tier/gv049199 0  Y
>> 21325
>> Cold Bricks:
>> Brick pod-sjc1-gluster1:/data/
>> brick1/gv0  49152 0  Y
>> 3178
>> Brick pod-sjc1-gluster2:/data/
>> brick1/gv0  49152 0  Y
>> 4818
>> Brick pod-sjc1-gluster1:/data/
>> brick2/gv0  49153 0  Y
>> 3186
>> Brick pod-sjc1-gluster2:/data/
>> brick2/gv0  49153 0  Y
>> 4829
>> Brick pod-sjc1-gluster1:/data/
>> brick3/gv0  49154 0  Y
>> 3194
>> Brick pod-sjc1-gluster2:/data/
>> brick3/gv0  49154 0  Y
>> 4840
>> Tier Daemon on localhostN/A   N/AY
>> 20313
>> Self-heal Daemon on localhost   N/A   N/AY
>> 32023
>> Tier Daemon on pod-sjc1-gluster1N/A   N/AY
>> 24758
>> Self-heal Daemon on pod-sjc1-gluster2   N/A   N/AY
>> 12349
>>
>> Task Status of Volume gv0
>>
>> --
>> There are 

Re: [Gluster-users] [Possibile SPAM] Re: Problem with Gluster 3.12.4, VM and sharding

2018-01-18 Thread Ing. Luca Lazzeroni - Trend Servizi Srl

Another update.

I've set up a replica 3 volume without sharding and tried to install a VM 
on a qcow2 volume on that device; however, the result is the same and the 
VM image has been corrupted, exactly at the same point.


Here's the volume info of the created volume:

Volume Name: gvtest
Type: Replicate
Volume ID: e2ddf694-ba46-4bc7-bc9c-e30803374e9d
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: gluster1:/bricks/brick1/gvtest
Brick2: gluster2:/bricks/brick1/gvtest
Brick3: gluster3:/bricks/brick1/gvtest
Options Reconfigured:
user.cifs: off
features.shard: off
cluster.shd-wait-qlength: 1
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: enable
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off


On 17/01/2018 14:51, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:


Hi,

after our IRC chat I've rebuilt a virtual machine with a FUSE-based 
virtual disk. Everything worked flawlessly.


Now I'm sending you the output of the requested getfattr command on 
the disk image:


# file: TestFUSE-vda.qcow2
trusted.afr.dirty=0x
trusted.gfid=0x40ffafbbe987445692bb31295fa40105
trusted.gfid2path.dc9dde61f0b77eab=0x31326533323631662d373839332d346262302d383738632d3966623765306232336263652f54657374465553452d7664612e71636f7732
trusted.glusterfs.shard.block-size=0x0400
trusted.glusterfs.shard.file-size=0xc1530060be90

Hope this helps.



On 17/01/2018 11:37, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:


I actually use FUSE and it works. If I try to use the "libgfapi" direct 
interface to gluster in qemu-kvm, the problem appears.




On 17/01/2018 11:35, Krutika Dhananjay wrote:
Really? Then which protocol exactly do you see this issue with? 
libgfapi? NFS?


-Krutika

On Wed, Jan 17, 2018 at 3:59 PM, Ing. Luca Lazzeroni - Trend Servizi 
Srl > wrote:


Of course. Here's the full log. Please note that in FUSE mode
everything apparently works without problems. I've installed 4
VMs and updated them without problems.



On 17/01/2018 11:00, Krutika Dhananjay wrote:



On Tue, Jan 16, 2018 at 10:47 PM, Ing. Luca Lazzeroni - Trend
Servizi Srl > wrote:

I've made the test with the raw image format (preallocated too)
and the corruption problem is still there (but without
errors in the bricks' log files).

What does the "link" error in the bricks' log files mean?

I've looked at the source code for the lines where it
happens and it seems to be a warning (it doesn't imply a failure).


Indeed, it only represents a transient state when the shards
are created for the first time and does not indicate a failure.
Could you also get the logs of the gluster fuse mount process?
It should be under /var/log/glusterfs of your client machine
with the filename as a hyphenated mount point path.

For example, if your volume was mounted at /mnt/glusterfs, then
your log file would be named mnt-glusterfs.log.

-Krutika



On 16/01/2018 17:39, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:


An update:

I've tried, for my tests, to create the vm volume as

qemu-img create -f qcow2 -o preallocation=full
gluster://gluster1/Test/Test-vda.img 20G

et voila !

No errors at all, neither in the bricks' log files (the "link
failed" message disappeared) nor in the VM (no corruption,
and it installed successfully).

I'll do another test with a fully preallocated raw image.



On 16/01/2018 16:31, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:


I've just done all the steps to reproduce the problem.

The VM volume has been created via "qemu-img create -f
qcow2 Test-vda2.qcow2 20G" on the gluster volume mounted
via FUSE. I've also tried to create the volume with
preallocated metadata, which moves the problem a bit further
away (in time). The volume is a replica 3 arbiter 1
volume hosted on XFS bricks.

Here are the informations:

[root@ovh-ov1 bricks]# gluster volume info gv2a2

Volume Name: gv2a2
Type: Replicate
Volume ID: 83c84774-2068-4bfc-b0b9-3e6b93705b9f
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: gluster1:/bricks/brick2/gv2a2
Brick2: gluster3:/bricks/brick3/gv2a2
Brick3: gluster2:/bricks/arbiter_brick_gv2a2/gv2a2 (arbiter)