Re: [Gluster-users] [Gluster-devel] Lot of EIO errors in disperse volume

2017-01-11 Thread Xavier Hernandez

Hi Ram,

On 12/01/17 02:36, Ankireddypalle Reddy wrote:

Xavi,
  I added some more logging information. The trusted.ec.size field 
values are in fact different.
   trusted.ec.size: l1 = 62719407423488, l2 = 0


That's very weird. Directories do not have this attribute. It's only 
present on regular files. But you said that the error happens while 
creating the file, so it doesn't make much sense because file creation 
always sets trusted.ec.size to 0.


Could you reproduce the problem with diagnostics.client-log-level set to 
TRACE and send the log to me? It will create a big log, but I'll have 
much more information about what's going on.


Do you have a mixed setup with nodes of different types? For example, 
mixed 32/64 bit architectures or different operating systems? I ask 
this because 62719407423488 in hex is 0x390B00000000, which has the 
lower 32 bits set to 0, but has garbage above that.
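
If you want to double-check this on the bricks themselves, a quick
sketch (the brick path below is only a placeholder for wherever the
affected file lives under each brick root):

    # the logged decimal value in hex -- prints 390b00000000
    printf '%x\n' 62719407423488
    # read the size xattr for the same file on every brick of the subvolume
    getfattr -n trusted.ec.size -e hex /path/to/brick/some/file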




   This is a fairly static setup with no brick/node failure. Please 
explain why a heal is being triggered and what could have actually 
caused these size xattrs to differ. This is causing random I/O failures 
and is impacting the backup schedules.


The launch of self-heal is normal because it has detected an 
inconsistency. The real problem is what originates that inconsistency.
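
While we debug this, you can watch what self-heal believes is pending
with the usual query (volume name again taken from your logs):

    # list entries the self-heal daemon currently flags as inconsistent
    gluster volume heal glusterfsProd info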


Xavi



[2017-01-12 01:19:18.256970] W [MSGID: 122056] 
[ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-8: Mismatching 
xdata in answers of 'LOOKUP'
[2017-01-12 01:19:18.257015] W [MSGID: 122053] 
[ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-8: Operation failed 
on some subvolumes (up=7, mask=7, remaining=0, good=3, bad=4)
[2017-01-12 01:19:18.257018] W [MSGID: 122002] [ec-common.c:71:ec_heal_report] 
0-glusterfsProd-disperse-8: Heal failed [Invalid argument]
[2017-01-12 01:19:21.002028] E [dict.c:197:key_value_cmp] 
0-glusterfsProd-disperse-4: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-12 01:19:21.002056] E [dict.c:166:log_value] 
0-glusterfsProd-disperse-4: trusted.ec.size [ l1 = 62719407423488 l2 = 0 i1 = 0 
i2 = 0 ]
[2017-01-12 01:19:21.002064] W [MSGID: 122056] 
[ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-4: Mismatching 
xdata in answers of 'LOOKUP'
[2017-01-12 01:19:21.209640] E [dict.c:197:key_value_cmp] 
0-glusterfsProd-disperse-4: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-12 01:19:21.209673] E [dict.c:166:log_value] 
0-glusterfsProd-disperse-4: trusted.ec.size [ l1 = 62719407423488 l2 = 0 i1 = 0 
i2 = 0 ]
[2017-01-12 01:19:21.209686] W [MSGID: 122056] 
[ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-4: Mismatching 
xdata in answers of 'LOOKUP'
[2017-01-12 01:19:21.209719] W [MSGID: 122053] 
[ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-4: Operation failed 
on some subvolumes (up=7, mask=7, remaining=0, good=6, bad=1)
[2017-01-12 01:19:21.209753] W [MSGID: 122002] [ec-common.c:71:ec_heal_report] 
0-glusterfsProd-disperse-4: Heal failed [Invalid argument]

Thanks and Regards,
Ram

-----Original Message-----
From: Ankireddypalle Reddy
Sent: Wednesday, January 11, 2017 9:29 AM
To: Ankireddypalle Reddy; Xavier Hernandez; Gluster Devel 
(gluster-de...@gluster.org); gluster-users@gluster.org
Subject: RE: [Gluster-users] [Gluster-devel] Lot of EIO errors in disperse 
volume

Xavi,
I built a debug binary to log more information. This is what is 
getting logged. It looks like the attribute trusted.ec.size is 
different among the bricks in a subvolume.

In glustershd.log :

[2017-01-11 14:19:45.023845] N [MSGID: 122029] 
[ec-generic.c:683:ec_combine_lookup] 0-glusterfsProd-disperse-8: Mismatching 
iatt in answers of 'GF_FOP_LOOKUP'
[2017-01-11 14:19:45.027718] E [dict.c:166:key_value_cmp] 
0-glusterfsProd-disperse-6: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:19:45.027736] W [MSGID: 122056] 
[ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-6: Mismatching 
xdata in answers of 'LOOKUP'
[2017-01-11 14:19:45.027763] E [dict.c:166:key_value_cmp] 
0-glusterfsProd-disperse-6: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:19:45.027781] W [MSGID: 122056] 
[ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-6: Mismatching 
xdata in answers of 'LOOKUP'
[2017-01-11 14:19:45.027793] W [MSGID: 122053] 
[ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-6: Operation failed 
on some subvolumes (up=7, mask=7, remaining=0, good=6, bad=1)
[2017-01-11 14:19:45.027815] W [MSGID: 122002] [ec-common.c:71:ec_heal_report] 
0-glusterfsProd-disperse-6: Heal failed [Invalid argument]
[2017-01-11 14:19:45.029035] E [dict.c:166:key_value_cmp] 
0-glusterfsProd-disperse-8: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:19:45.029057] W [MSGID: 122056] 
[ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-8: Mismatching 
xdata in answers of 'LOOKUP'
[2017-01-11 14:19:45.029089] E 


Re: [Gluster-users] NFS service dying

2017-01-11 Thread Giuseppe Ragusa
From: gluster-users-boun...@gluster.org on behalf of Paul Allen
Sent: Wednesday, January 11, 2017 19:58
To: gluster-users@gluster.org
Subject: [Gluster-users] NFS service dying

I'm running into an issue where the gluster nfs service keeps dying on a
new cluster I have setup recently. We've been using Gluster on several
other clusters now for about a year or so and I have never seen this
issue before, nor have I been able to find anything remotely similar to
it while searching on-line. I initially was using the latest version in
the Gluster Debian repository for Jessie, 3.9.0-1, and then I tried
using the next one down, 3.8.7-1. Both behave the same for me.

What I was seeing was after a while the nfs service on the NAS server
would suddenly die after a number of processes had run on the app server
I had connected to the new NAS servers for testing (we're upgrading the
NAS servers for this cluster to newer hardware and expanded storage, the
current production NAS servers are using nfs-kernel-server with no type
of clustering of the data). I checked the logs but all it showed me was
something that looked like a stack trace in the nfs.log and the
glustershd.log showed the nfs service disconnecting. I turned on
debugging but it didn't give me a whole lot more, and certainly nothing
that helps me identify the source of my issue. It is pretty consistent
in dying shortly after I mount the file system on the servers and start
testing, usually within 15-30 minutes. But if I have nothing using the
file system, mounted or no, the service stays running for days. I tried
mounting it using the gluster client, and it works fine, but I can't use
that due to the performance penalty; it slows the websites down by a few
seconds at a minimum.

Here is the output from the logs one of the times it died:

glustershd.log:

[2017-01-10 19:06:20.265918] W [socket.c:588:__socket_rwv] 0-nfs: readv
on /var/run/gluster/a921bec34928e8380280358a30865cee.socket failed (No
data available)
[2017-01-10 19:06:20.265964] I [MSGID: 106006]
[glusterd-svc-mgmt.c:327:glusterd_svc_common_rpc_notify] 0-management:
nfs has disconnected from glusterd.


nfs.log:

[2017-01-10 19:06:20.135430] D [name.c:168:client_fill_address_family]
0-NLM-client: address-family not specified, marking it as unspec for
getaddrinfo to resolve from (remote-host: 10.20.5.13)
[2017-01-10 19:06:20.135531] D [MSGID: 0]
[common-utils.c:335:gf_resolve_ip6] 0-resolver: returning ip-10.20.5.13
(port-48963) for hostname: 10.20.5.13 and port: 48963
[2017-01-10 19:06:20.136569] D [logging.c:1764:gf_log_flush_extra_msgs]
0-logging-infra: Log buffer size reduced. About to flush 5 extra log
messages
[2017-01-10 19:06:20.136630] D [logging.c:1767:gf_log_flush_extra_msgs]
0-logging-infra: Just flushed 5 extra log messages
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2017-01-10 19:06:20
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.9.0
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xac)[0x7f891f0846ac]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x324)[0x7f891f08dcc4]
/lib/x86_64-linux-gnu/libc.so.6(+0x350e0)[0x7f891db870e0]
/lib/x86_64-linux-gnu/libc.so.6(+0x91d8a)[0x7f891dbe3d8a]
/usr/lib/x86_64-linux-gnu/glusterfs/3.9.0/xlator/nfs/server.so(+0x3a352)[0x7f8918682352]
/usr/lib/x86_64-linux-gnu/glusterfs/3.9.0/xlator/nfs/server.so(+0x3cc15)[0x7f8918684c15]
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x2aa)[0x7f891ee4e4da]
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7f891ee4a7e3]
/usr/lib/x86_64-linux-gnu/glusterfs/3.9.0/rpc-transport/socket.so(+0x4b33)[0x7f8919eadb33]
/usr/lib/x86_64-linux-gnu/glusterfs/3.9.0/rpc-transport/socket.so(+0x8f07)[0x7f8919eb1f07]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x7e836)[0x7f891f0d9836]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4)[0x7f891e3010a4]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f891dc3a62d]
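
In case it helps whoever picks this up: the nfs/server.so frames above
can in principle be mapped back to source with addr2line, assuming the
matching debug symbol packages are installed (without them it only
prints ??):

    # resolve the two nfs xlator frames from the backtrace
    addr2line -f -e /usr/lib/x86_64-linux-gnu/glusterfs/3.9.0/xlator/nfs/server.so 0x3a352 0x3cc15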


The IP showing in the nfs.log is actually for a web server I was also
testing with, not the app server, but it doesn't appear to me that it
would be the cause of the nfs service dying. I'm at a loss as to what is
going on, and I need to try to get this fixed pretty quickly; I was
hoping to have this in production last Friday. If anyone has any ideas
I'd be very grateful.

---

Hi Paul,

I've experienced almost the same symptoms but I'm running Gluster 3.7.17

I've found a really similar bug and added my logs and information to it:

https://bugzilla.redhat.com/show_bug.cgi?id=1381970

but it got no signs of activity (apart from my comments/logs) in almost three months

I've also tried to catch the 

Re: [Gluster-users] Gluster 3.9 repository?

2017-01-11 Thread Amye Scavarda
On Wed, Jan 11, 2017 at 9:52 AM, Kaleb S. KEITHLEY wrote:

> On 01/11/2017 12:39 PM, Andrus, Brian Contractor wrote:
> > All,
> >
> > I notice on the main page, Gluster 3.9 is listed as
> >
> > */Gluster 3.9 is the latest major release as of November 2016./*
> >
> >
> >
> > Yet, if you click on the download link, there is no mention of Gluster
> > 3.9 on that page at all. It has:
> >
> > */GlusterFS version 3.8 is the latest version at the moment./*
> >
> >
> >
> > Is there going to be a 3.9 repository set up like the 3.8 has?
>
>
> https://download.gluster.org/pub/gluster/glusterfs/3.9/ has been there
> since November.
>
> > And could the webpage be updated for 3.9?
>
> I can't do anything about the web site. :-/
>
>
> --
>
> Kaleb

Hey there,
I've submitted a PR to update the website and am just waiting for it to run
through the various TravisCI checks.

Thanks for the ping!
- amye

-- 
Amye Scavarda | a...@redhat.com | Gluster Community Lead





[Gluster-users] Gluster 3.9 repository?

2017-01-11 Thread Andrus, Brian Contractor
All,
I notice on the main page, Gluster 3.9 is listed as
Gluster 3.9 is the latest major release as of November 2016.

Yet, if you click on the download link, there is no mention of Gluster 3.9 on 
that page at all. It has:
GlusterFS version 3.8 is the latest version at the moment.

Is there going to be a 3.9 repository set up like the 3.8 has? And could the 
webpage be updated for 3.9?

Thanks!

Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238


[Gluster-users] Gluster command as non root

2017-01-11 Thread Kevin Lemonnier
Hi,

I developed a telegraf plugin for gluster recently, to graph the read and write
bytes per second on our volumes in grafana. It works fine by getting those
values from the profiler; the problem is that I have to use sudo to make that
work, and I don't like it.

Is there a way, on Debian, to allow a non-root user to run
"/usr/sbin/gluster volume profile <volume> info cumulative"? Or any other way
for a user to get the total bytes read and written on a volume?
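
Right now the least-bad version I can think of is a narrow sudoers rule
(a sketch, assuming the plugin runs as a dedicated "telegraf" user):

    # /etc/sudoers.d/telegraf-gluster -- edit with: visudo -f /etc/sudoers.d/telegraf-gluster
    # allow only the read-only profile query, nothing else
    telegraf ALL=(root) NOPASSWD: /usr/sbin/gluster volume profile * info cumulative

That still goes through sudo, just with a much smaller surface, which is
why I'm asking if there is a cleaner way.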

Thanks

-- 
Kevin Lemonnier
PGP Fingerprint : 89A5 2283 04A0 E6E9 0111



Re: [Gluster-users] [Gluster-devel] Lot of EIO errors in disperse volume

2017-01-11 Thread Ankireddypalle Reddy
Xavi,
I built a debug binary to log more information. This is what is 
getting logged. It looks like the attribute trusted.ec.size is 
different among the bricks in a subvolume.

In glustershd.log :

[2017-01-11 14:19:45.023845] N [MSGID: 122029] 
[ec-generic.c:683:ec_combine_lookup] 0-glusterfsProd-disperse-8: Mismatching 
iatt in answers of 'GF_FOP_LOOKUP'
[2017-01-11 14:19:45.027718] E [dict.c:166:key_value_cmp] 
0-glusterfsProd-disperse-6: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:19:45.027736] W [MSGID: 122056] 
[ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-6: Mismatching 
xdata in answers of 'LOOKUP'
[2017-01-11 14:19:45.027763] E [dict.c:166:key_value_cmp] 
0-glusterfsProd-disperse-6: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:19:45.027781] W [MSGID: 122056] 
[ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-6: Mismatching 
xdata in answers of 'LOOKUP'
[2017-01-11 14:19:45.027793] W [MSGID: 122053] 
[ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-6: Operation failed 
on some subvolumes (up=7, mask=7, remaining=0, good=6, bad=1)
[2017-01-11 14:19:45.027815] W [MSGID: 122002] [ec-common.c:71:ec_heal_report] 
0-glusterfsProd-disperse-6: Heal failed [Invalid argument]
[2017-01-11 14:19:45.029035] E [dict.c:166:key_value_cmp] 
0-glusterfsProd-disperse-8: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:19:45.029057] W [MSGID: 122056] 
[ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-8: Mismatching 
xdata in answers of 'LOOKUP'
[2017-01-11 14:19:45.029089] E [dict.c:166:key_value_cmp] 
0-glusterfsProd-disperse-8: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:19:45.029105] W [MSGID: 122056] 
[ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-8: Mismatching 
xdata in answers of 'LOOKUP'
[2017-01-11 14:19:45.029121] W [MSGID: 122053] 
[ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-8: Operation failed 
on some subvolumes (up=7, mask=7, remaining=0, good=6, bad=1)
[2017-01-11 14:19:45.032566] E [dict.c:166:key_value_cmp] 
0-glusterfsProd-disperse-6: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:19:45.029138] W [MSGID: 122002] [ec-common.c:71:ec_heal_report] 
0-glusterfsProd-disperse-8: Heal failed [Invalid argument]
[2017-01-11 14:19:45.032585] W [MSGID: 122056] 
[ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-6: Mismatching 
xdata in answers of 'LOOKUP'
[2017-01-11 14:19:45.032614] E [dict.c:166:key_value_cmp] 
0-glusterfsProd-disperse-6: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:19:45.032631] W [MSGID: 122056] 
[ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-6: Mismatching 
xdata in answers of 'LOOKUP'
[2017-01-11 14:19:45.032638] W [MSGID: 122053] 
[ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-6: Operation failed 
on some subvolumes (up=7, mask=7, remaining=0, good=6, bad=1)
[2017-01-11 14:19:45.032654] W [MSGID: 122002] [ec-common.c:71:ec_heal_report] 
0-glusterfsProd-disperse-6: Heal failed [Invalid argument]
[2017-01-11 14:19:45.037514] E [dict.c:166:key_value_cmp] 
0-glusterfsProd-disperse-6: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:19:45.037536] W [MSGID: 122056] 
[ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-6: Mismatching 
xdata in answers of 'LOOKUP'
[2017-01-11 14:19:45.037553] E [dict.c:166:key_value_cmp] 
0-glusterfsProd-disperse-6: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:19:45.037573] W [MSGID: 122056] 
[ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-6: Mismatching 
xdata in answers of 'LOOKUP'
[2017-01-11 14:19:45.037582] W [MSGID: 122053] 
[ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-6: Operation failed 
on some subvolumes (up=7, mask=7, remaining=0, good=6, bad=1)
[2017-01-11 14:19:45.037599] W [MSGID: 122002] [ec-common.c:71:ec_heal_report] 
0-glusterfsProd-disperse-6: Heal failed [Invalid argument]
[2017-01-11 14:20:40.001401] E [dict.c:166:key_value_cmp] 
0-glusterfsProd-disperse-3: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:20:40.001387] E [dict.c:166:key_value_cmp] 
0-glusterfsProd-disperse-5: 'trusted.ec.size' is different in two dicts (8, 8)

In the mount daemon log:

[2017-01-11 14:20:17.806826] E [MSGID: 122001] 
[ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-0: Invalid or 
corrupted config [Invalid argument]
[2017-01-11 14:20:17.806847] E [MSGID: 122066] 
[ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-0: Invalid 
config xattr [Invalid argument]
[2017-01-11 14:20:17.807076] E [MSGID: 122001] 
[ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-1: Invalid or 
corrupted config [Invalid argument]
[2017-01-11 14:20:17.807099] E [MSGID: 122066] 
[ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-1: Invalid 
config xattr [Invalid argument]
[2017-01-11 14:20:17.807286] 
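
For the "Invalid config xattr" errors the on-disk value can be checked
directly, same idea as with trusted.ec.size (the brick path is a
placeholder):

    # compare the EC config xattr across bricks; every brick of a healthy
    # file should report the same value
    getfattr -n trusted.ec.config -e hex /path/to/brick/some/file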

Re: [Gluster-users] Weekly Community Meeting - 20170111

2017-01-11 Thread Kaushal M
A busy meeting today. Four topics were discussed: 3.7.20, archiving
meeting notes in the wiki, testing cleanup, and participation in the
meetings. More details can be found on the meeting page in the wiki[1]
and in the logs[2]. The notes have been pasted at the end for easy
reference as well.

I'll be hosting next week's meeting. The meeting pad[3] is ready for
new topics and updates. See everyone at 1200 UTC (or possibly later) on
18th Jan in #gluster-meeting.

~kaushal

[1] https://github.com/gluster/glusterfs/wiki/Community-Meeting-2017-01-11
[2] Minutes: 
https://meetbot.fedoraproject.org/gluster-meeting/2017-01-11/weekly_community_meeting_2017-01-11.2017-01-11-12.02.html
   Minutes (text):
https://meetbot.fedoraproject.org/gluster-meeting/2017-01-11/weekly_community_meeting_2017-01-11.2017-01-11-12.02.txt
   Log: 
https://meetbot.fedoraproject.org/gluster-meeting/2017-01-11/weekly_community_meeting_2017-01-11.2017-01-11-12.02.log.html
[3] https://bit.ly/gluster-community-meetings

# Meeting Notes & updates

## Topics of Discussion

The meeting is an open floor, open for discussion of any topic entered below.

- Discuss participation in the meetings
- We had low attendance and participation in meetings in Nov/Dec 2016.
- Is this still true now?
- [ndevos] Doesn't seem to have changed a lot
- Not a lot of waiting for updates and moving on.
- [shyam] new format more valuable than previous
- Anything we can do better?
- [nigelb] A slot for presentation
- Present new features/things being worked on
- [sankarshan] Optional slot at the end
- [ndevos] meetings already long enough, might not be
enough time for presentations
- [ndevos] need to change medium for presentations
- [shyam] presentations can help bring in technical
content to meeting, currently meetings are more community operations
oriented
- [ndevos] Why no maintainers participating?
- [sankarshan] Maintainers probably don't know
- [sankarshan] Why another meeting for maintainers only?
- [shyam] separate meeting to get maintainers to
participate and bubble up things here, if maintainers are here and
participating then possibly this forum can be leveraged
- Is 3.7.20 required? [kshlm]
- 3.10 is releasing next month, which means 3.7 is EOL. Does it
require another bug fix release?
- Agreed to do a 3.7.20 release to keep with schedules.
- Testing discussion update:
https://github.com/gluster/glusterfs/wiki/Test-Clean-Up
- https://www.gluster.org/pipermail/gluster-devel/2017-January/051859.html
- Comments need to be added to the mail thread
- [jdarcy] bail on first failure in a test file
- Do we need a wiki page for meeting minutes, given that we already
have meeting minutes logged by mail and available for ready reference?
That is, minutes are already available as mail and in the IRC logs of
#gluster-meeting, so why another wiki page update?
- archive to email (send updates to the lists after every meeting)
- Wiki is optional
- But not for the weekly meeting.

### Next week's meeting host

- kshlm

## Updates

> NOTE : Updates will not be discussed during meetings. Any important or 
> noteworthy update will be announced at the end of the meeting

### Action Items from last week

- Need to find out when 3.9.1 is happening
- Done by kkeithley
- 
https://www.gluster.org/pipermail/gluster-devel/2017-January/051814.html
- shyam will file a bug to get arequal included in glusterfs packages
- Done
- https://bugzilla.redhat.com/show_bug.cgi?id=1410100

### Releases

 GlusterFS 4.0

- Tracker bug :
https://bugzilla.redhat.com/showdependencytree.cgi?id=glusterfs-4.0
- Roadmap : https://www.gluster.org/community/roadmap/4.0/
- Updates:
- GD2: No updates this week

 GlusterFS 3.10

- Maintainers : shyam, kkeithley, rtalur
- Next release : 3.10.0
- Target date: February 14, 2017
- Release tracker : https://github.com/gluster/glusterfs/milestone/1
- Updates:
  - Review backlog for features to make the branching deadline:
http://bit.ly/2iXor01
  - Request developer community focus on the same!
  - Branching date set as 17th Jan 2017

 GlusterFS 3.9

- Maintainers : pranithk, aravindavk, dblack, (kkeithley)
- Current release : 3.9.0
- Next release : 3.9.1
  - Release date : 20 Jan 2017
- Tracker bug : https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.9.1
- Open bugs : 
https://bugzilla.redhat.com/showdependencytree.cgi?id=glusterfs-3.9.0&maxdepth=2&hide_resolved=1
- Roadmap : https://www.gluster.org/community/roadmap/3.9/
- Updates:
  - I will try to tag 3.9.1 today (20170111) or tomorrow. (delayed
from 20161220, or early for 20170120, depending on how you want to
look at it. --kkeithley)

 GlusterFS 3.8

- Maintainers : ndevos, jiffin
- Current release : 3.8.7
- Next release : 3.8.8
  - Release date : 10 Jan 2017
- Tracker b

Re: [Gluster-users] [Gluster-devel] GlusterFS 3.7.19 released

2017-01-11 Thread Lindsay Mathieson

On 11/01/2017 8:13 PM, Kaushal M wrote:

[1]: https://github.com/gluster/glusterfs/blob/release-3.7/doc/release-notes/3.7.18.md


Is there a reason that "performance.strict-o-direct=on" needs to be set 
for VM Hosting?


--
Lindsay Mathieson



Re: [Gluster-users] GlusterFS 3.7.19 released

2017-01-11 Thread Kaushal M
On Wed, Jan 11, 2017 at 3:43 PM, Kaushal M wrote:
> GlusterFS 3.7.19 is a regular bug fix release for GlusterFS-3.7. The
> release-notes for this release can be read here[1].
>
> The release tarball and community provided packages[2] can be obtained
> from download.gluster.org[3]. The CentOS Storage SIG[4] packages have
> been built and should be available soon from the centos-gluster37
> repository.
>
> A reminder to everyone, GlusterFS-3.7 is scheduled[5] to be EOLed with
> the release of GlusterFS-3.10, which should happen sometime in
> February 2017.
>
> ~kaushal
>

The links have been corrected. Thanks Niels for noticing this.

 [1]: 
https://github.com/gluster/glusterfs/blob/release-3.7/doc/release-notes/3.7.19.md
 [2]: https://gluster.readthedocs.io/en/latest/Install-Guide/Community_Packages/
 [3]: https://download.gluster.org/pub/gluster/glusterfs/3.7/3.7.19/
 [4]: https://wiki.centos.org/SpecialInterestGroup/Storage
 [5]: https://www.gluster.org/community/release-schedule/


Re: [Gluster-users] [Gluster-devel] decoupling network.ping-timeout and transport.tcp-user-timeout

2017-01-11 Thread Milind Changire

+gluster-users

Milind

On 01/11/2017 03:21 PM, Milind Changire wrote:

The management connection uses network.ping-timeout to time out and
retry the connection to a different server if the existing connection
end-point is unreachable from the client.
Due to the nature of the parameters involved in the TCP/IP network
stack, it becomes imperative to control the other network connections
using the socket level tunables:
* SO_KEEPALIVE
* TCP_KEEPIDLE
* TCP_KEEPINTVL
* TCP_KEEPCNT

So, I'd like to decouple network.ping-timeout and
transport.tcp-user-timeout since they are tunables for different
aspects of the gluster application. network.ping-timeout monitors
brick/node level responsiveness, and transport.tcp-user-timeout is one
of the attributes used to manage the state of the socket.

That said, we could do away with network.ping-timeout altogether and
stick with transport.tcp-user-timeout for all types of sockets. It is
becoming increasingly difficult to work with different tunables across
gluster.

I believe there have not been many cases in which the community has
found the existing defaults for socket timeout unusable. So we could
stick with the system defaults and add the following socket level
tunables and make them open for configuration:
* client.tcp-user-timeout
 which sets transport.tcp-user-timeout
* client.keepalive-time
 which sets transport.socket.keepalive-time
* client.keepalive-interval
 which sets transport.socket.keepalive-interval
* client.keepalive-count
 which sets transport.socket.keepalive-count
* server.tcp-user-timeout
 which sets transport.tcp-user-timeout
* server.keepalive-time
 which sets transport.socket.keepalive-time
* server.keepalive-interval
 which sets transport.socket.keepalive-interval
* server.keepalive-count
 which sets transport.socket.keepalive-count

However, these settings would affect all sockets in gluster.
In cases where aggressive timeouts are needed, the community can find
gluster options which have 1:1 mapping with socket level options as
documented in tcp(7).
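
For context, the kernel-wide defaults that these per-socket options
would override can be inspected on any node:

    # system defaults behind SO_KEEPALIVE / TCP_KEEPIDLE / TCP_KEEPINTVL / TCP_KEEPCNT
    sysctl net.ipv4.tcp_keepalive_time net.ipv4.tcp_keepalive_intvl \
        net.ipv4.tcp_keepalive_probes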

Please share your thoughts about the risks or effectiveness of the
decoupling.




[Gluster-users] GlusterFS 3.7.19 released

2017-01-11 Thread Kaushal M
GlusterFS 3.7.19 is a regular bug fix release for GlusterFS-3.7. The
release-notes for this release can be read here[1].

The release tarball and community provided packages[2] can be obtained
from download.gluster.org[3]. The CentOS Storage SIG[4] packages have
been built and should be available soon from the centos-gluster37
repository.

A reminder to everyone, GlusterFS-3.7 is scheduled[5] to be EOLed with
the release of GlusterFS-3.10, which should happen sometime in
February 2017.

~kaushal

[1]: 
https://github.com/gluster/glusterfs/blob/release-3.7/doc/release-notes/3.7.18.md
[2]: https://gluster.readthedocs.io/en/latest/Install-Guide/Community_Packages/
[3]: https://download.gluster.org/pub/gluster/glusterfs/3.7/3.7.18/
[4]: https://wiki.centos.org/SpecialInterestGroup/Storage
[5]: https://www.gluster.org/community/release-schedule/