Re: [Gluster-devel] Metrics: and how to get them out from gluster

2017-09-01 Thread Xavier Hernandez
Hi Amar, I haven't had time to review the changes in the experimental branch yet, but here are some comments about these ideas... On 01/09/17 07:27, Amar Tumballi wrote: Disclaimer: This email is long, and did take significant time to write. Do take time and read, review and give feedback, so we

Re: [Gluster-devel] GlusterFS v3.12 - Nearing deadline for branch out

2017-07-19 Thread Xavier Hernandez
Hi, On 17/07/17 17:30, Pranith Kumar Karampuri wrote: hi, Status of the following features targeted for 3.12: 1) Need a way to resolve split-brain (#135): Mostly will be merged in a day. 2) Halo Hybrid mode (#217): Unfortunately didn't get time to follow up on this, so will not make it

Re: [Gluster-devel] geo-rep regression because of node-uuid change

2017-07-07 Thread Xavier Hernandez
On 07/07/17 11:25, Pranith Kumar Karampuri wrote: On Fri, Jul 7, 2017 at 2:46 PM, Xavier Hernandez <xhernan...@datalab.es <mailto:xhernan...@datalab.es>> wrote: On 07/07/17 10:12, Pranith Kumar Karampuri wrote: On Fri, Jul 7, 2017 at 1:13 PM, Xavier Hernandez

Re: [Gluster-devel] geo-rep regression because of node-uuid change

2017-07-07 Thread Xavier Hernandez
Hi Pranith, On 05/07/17 12:28, Pranith Kumar Karampuri wrote: On Tue, Jul 4, 2017 at 2:26 PM, Xavier Hernandez <xhernan...@datalab.es <mailto:xhernan...@datalab.es>> wrote: Hi Pranith, On 03/07/17 08:33, Pranith Kumar Karampuri wrote: Xavi, Now tha

Re: [Gluster-devel] geo-rep regression because of node-uuid change

2017-07-04 Thread Xavier Hernandez
PM, Karthik Subrahmanya <ksubr...@redhat.com <mailto:ksubr...@redhat.com>> wrote: On Wed, Jun 21, 2017 at 1:56 PM, Xavier Hernandez <xhernan...@datalab.es <mailto:xhernan...@datalab.es>> wrote: That's ok. I'm currently unable to write a patch fo

Re: [Gluster-devel] Disperse volume : Sequential Writes

2017-07-04 Thread Xavier Hernandez
edhat.com <mailto:aspan...@redhat.com>> wrote: I think it should be done as we have agreement on basic design. *From: *"Pranith Kumar Karampuri" <pkara...@redhat.com <mailto:pkara...@re

Re: [Gluster-devel] geo-rep regression because of node-uuid change

2017-06-21 Thread Xavier Hernandez
That's ok. I'm currently unable to write a patch for this on ec. If no one can do it, I can try to do it in 6 - 7 hours... Xavi On Wednesday, June 21, 2017 09:48 CEST, Pranith Kumar Karampuri <pkara...@redhat.com> wrote:    On Wed, Jun 21, 2017 at 1:00 PM, Xavier Hernandez &l

Re: [Gluster-devel] geo-rep regression because of node-uuid change

2017-06-21 Thread Xavier Hernandez
oes these changes. If everyone is in agreement, we will leave it as is and add similar changes in ec as well. If we are not in agreement, then we will let the discussion progress :-)   Regards,Nithya-- Aravinda  Thanks to all of you guys for the discussions! On Tue, Jun 20, 2017 at 5:05 PM, Xavier

Re: [Gluster-devel] geo-rep regression because of node-uuid change

2017-06-20 Thread Xavier Hernandez
Aravinda VK On 06/20/2017 03:06 PM, Aravinda wrote: Hi Xavi, On 06/20/2017 02:51 PM, Xavier Hernandez wrote: Hi Aravinda, On 20/06/17 11:05, Pranith Kumar Karampuri wrote: Adding more people to get a consensus about this. On Tue, Jun 20, 2017 at 1:49 PM, Aravinda <avish...@redhat.com &l

Re: [Gluster-devel] geo-rep regression because of node-uuid change

2017-06-20 Thread Xavier Hernandez
Hi Aravinda, On 20/06/17 11:05, Pranith Kumar Karampuri wrote: Adding more people to get a consensus about this. On Tue, Jun 20, 2017 at 1:49 PM, Aravinda <avish...@redhat.com <mailto:avish...@redhat.com>> wrote: regards Aravinda VK On 06/20/2017 01:26 PM, Xavi

Re: [Gluster-devel] geo-rep regression because of node-uuid change

2017-06-20 Thread Xavier Hernandez
Hi Pranith, adding gluster-devel, Kotresh and Aravinda, On 20/06/17 09:45, Pranith Kumar Karampuri wrote: On Tue, Jun 20, 2017 at 1:12 PM, Xavier Hernandez <xhernan...@datalab.es <mailto:xhernan...@datalab.es>> wrote: On 20/06/17 09:31, Pranith Kumar Karampuri wrote:

Re: [Gluster-devel] Self-heal on read-only volumes

2017-06-20 Thread Xavier Hernandez
at.com>> wrote: I remember either Kotresh/Karthik recently sent patches to do something similar. Adding them to check if the know something about this On Fri, Jun 16, 2017 at 1:25 PM, Xavier Hernandez <xhernan...@datalab.es <mailto:xhernan...@datalab.es>

Re: [Gluster-devel] Performance experiments with io-stats translator

2017-06-07 Thread Xavier Hernandez
Hi Krutika, On 06/06/17 13:35, Krutika Dhananjay wrote: Hi, As part of identifying performance bottlenecks within gluster stack for VM image store use-case, I loaded io-stats at multiple points on the client and brick stack and ran randrd test using fio from within the hosted vms in parallel.

Re: [Gluster-devel] GFID2 - Proposal to add extra byte to existing GFID

2017-05-15 Thread Xavier Hernandez
Hi Amar, On May 15, 2017 2:15 PM, Amar Tumballi <atumb...@redhat.com> wrote: > > > > On Tue, Apr 11, 2017 at 2:59 PM, Amar Tumballi <ama...@gmail.com> wrote: >> >> Comments inline. >> >> On Mon, Dec 19, 2016 at 1:47 PM, Xavier Hernandez <xherna

Re: [Gluster-devel] [DHT] The myth of two hops for linkto file resolution

2017-05-04 Thread Xavier Hernandez
Hi, On 30/04/17 06:03, Raghavendra Gowdappa wrote: All, Its a common perception that the resolution of a file having linkto file on the hashed-subvol requires two hops: 1. client to hashed-subvol. 2. client to the subvol where file actually resides. While it is true that a fresh lookup
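
The two-hop pattern described in this thread can be reduced to a toy model (this is not GlusterFS code; the names and structures below are purely illustrative): the client first asks the subvolume the file name hashes to, and only when that subvolume holds a linkto pointer does a second hop go to the subvolume that actually stores the data.

    /* Toy model of the two-hop resolution described above; NOT GlusterFS code. */
    #include <stdio.h>

    #define NSUBVOLS 4

    /* Hypothetical per-subvolume view of one file name: the real file,
     * a linkto pointer to another subvolume, or nothing at all. */
    struct entry {
        int exists;
        int is_linkto;      /* zero-byte pointer file on the hashed subvol */
        int linkto_target;  /* subvolume that holds the real data */
    };

    /* Stand-in hash; the real DHT hash is different but plays the same role. */
    static unsigned hash_name(const char *name)
    {
        unsigned h = 5381;
        while (*name)
            h = h * 33 + (unsigned char)*name++;
        return h % NSUBVOLS;
    }

    /* Hop 1 goes to the hashed subvolume; if it only holds a linkto file,
     * hop 2 goes to the subvolume the linkto points at. */
    static int resolve(const char *name, struct entry vols[NSUBVOLS], int *hops)
    {
        unsigned hashed = hash_name(name);

        *hops = 1;
        if (vols[hashed].exists && !vols[hashed].is_linkto)
            return (int)hashed;                 /* common case: one hop */
        if (vols[hashed].exists && vols[hashed].is_linkto) {
            *hops = 2;
            return vols[hashed].linkto_target;  /* second hop */
        }
        return -1;   /* would fall back to a lookup on all subvolumes */
    }

    int main(void)
    {
        struct entry vols[NSUBVOLS] = { { 0 } };
        const char *name = "file.dat";
        unsigned hashed = hash_name(name);
        unsigned cached = (hashed + 1) % NSUBVOLS;  /* where the data really is */
        int hops = 0, where;

        /* Simulate a renamed/rebalanced file: the hashed subvolume only has
         * a linkto pointer, the data lives on another subvolume. */
        vols[hashed] = (struct entry){ 1, 1, (int)cached };
        vols[cached] = (struct entry){ 1, 0, -1 };

        where = resolve(name, vols, &hops);
        printf("data found on subvol %d after %d hop(s)\n", where, hops);
        return 0;
    }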

Re: [Gluster-devel] Pluggable interface for erasure coding?

2017-03-02 Thread Xavier Hernandez
Hi Niels, On 02/03/17 07:58, Niels de Vos wrote: Hi guys, I think this is a topic/question that has come up before, but I can not find any references or feature requests related to it. Because there are different libraries for Erasure Coding, it would be interesting to be able to select

Re: [Gluster-devel] release-3.10: Final call for release notes updates

2017-02-20 Thread Xavier Hernandez
Hi Shyam, I've added some comments [1] for the issue between disperse's dynamic code generator and SELinux. It assumes that [2] will be backported to 3.10. Xavi [1] https://review.gluster.org/16685 [2] https://review.gluster.org/16614 On 20/02/17 04:04, Shyam wrote: Hi, Please find the

Re: [Gluster-devel] https://review.gluster.org/#/c/16643/

2017-02-20 Thread Xavier Hernandez
Hi Nithya, I've merged it. However Vijay said in another email [1] that backports to 3.9 are not needed anymore. Xavi [1] http://lists.gluster.org/pipermail/gluster-devel/2017-February/052107.html On 20/02/17 09:19, Nithya Balachandran wrote: Hi, Can this be merged ? This is holding up

[Gluster-devel] Reviews needed

2017-02-16 Thread Xavier Hernandez
Hi everyone, I would need some reviews if you have some time: A memory leak fix in fuse: * Patch already merged in master and 3.10 * Backport to 3.9: https://review.gluster.org/16402 * Backport to 3.8: https://review.gluster.org/16403 A safe fallback for dynamic code generation in

Re: [Gluster-devel] Release 3.10: Request fix status for RC1 tagging

2017-02-16 Thread Xavier Hernandez
Hi Shyam, On 16/02/17 02:47, Shyam wrote: Hi, The 3.10 release tracker [1], shows 6 bugs needing a fix in 3.10. We need to get RC1 out so that we can start tracking the same for a potential release. Request folks on these bugs to provide a date by when we can expect a fix for these issues.

Re: [Gluster-devel] Creating new options for multiple gluster versions

2017-01-30 Thread Xavier Hernandez
Hi Atin, On 31/01/17 05:45, Atin Mukherjee wrote: On Mon, Jan 30, 2017 at 9:02 PM, Xavier Hernandez <xhernan...@datalab.es <mailto:xhernan...@datalab.es>> wrote: Hi Atin, On 30/01/17 15:25, Atin Mukherjee wrote: On Mon, Jan 30, 2017 at 7:30 PM, Xavi

Re: [Gluster-devel] Spurious regression failure? tests/basic/ec/ec-background-heals.t

2017-01-26 Thread Xavier Hernandez
Hi Atin, I don't clearly see what the problem is. Even if the truncate causes a dirty flag to be set, eventually it should be removed before the $HEAL_TIMEOUT value. For now I've marked the test as bad. Patch is: https://review.gluster.org/16470 Xavi On 25/01/17 17:24, Atin Mukherjee

Re: [Gluster-devel] [Gluster-users] Lot of EIO errors in disperse volume

2017-01-23 Thread Xavier Hernandez
from my iPhone On Jan 23, 2017, at 3:11 AM, Xavier Hernandez <xhernan...@datalab.es> wrote: Hi Ram, On 20/01/17 21:06, Ankireddypalle Reddy wrote: Attachments (2): 1 glustershd.log <https://imap.commvault.com/webconsole/embedded.do?url=https://imap.commvault.com/webconsole/

Re: [Gluster-devel] [Gluster-users] Lot of EIO errors in disperse volume

2017-01-23 Thread Xavier Hernandez
Please find attached the trace logs and heal info output. I'll examine the logs to see if there's something, but the previous patch will help a lot. Xavi Thanks and Regards, Ram -----Original Message- From: Xavier Hernandez [mailto:xhernan...@datalab.es] Sent: Friday, January 20, 2017 3:0

Re: [Gluster-devel] [Gluster-users] Lot of EIO errors in disperse volume

2017-01-20 Thread Xavier Hernandez
logged by ec_check_status() are not real problems. See patch http://review.gluster.org/16435/ for more info. Xavi Thanks and Regards, Ram -Original Message- From: Xavier Hernandez [mailto:xhernan...@datalab.es] Sent: Friday, January 20, 2017 2:41 AM To: Ankireddypalle Reddy; Ashish Pandey

Re: [Gluster-devel] [Gluster-users] Lot of EIO errors in disperse volume

2017-01-19 Thread Xavier Hernandez
,glusterfs5sds,glusterfs6sds diagnostics.client-log-level: INFO [root@glusterfs4 glusterfs]# Thanks and Regards, Ram *From:*Ashish Pandey [mailto:aspan...@redhat.com] *Sent:* Thursday, January 19, 2017 10:36 PM *To:* Ankireddypalle Reddy *Cc:* Xavier Hernandez; gluster-us...@gluster.org; Gluster

Re: [Gluster-devel] [Gluster-users] Lot of EIO errors in disperse volume

2017-01-16 Thread Xavier Hernandez
and writable. It's true that there's some problem here and it could result in EIO if one of the healthy bricks degrades, but at least this file shouldn't be giving EIO errors for now. Xavi Sent from my iPhone On Jan 16, 2017, at 6:23 AM, Xavier Hernandez <xhernan...@datalab.es> wrot

Re: [Gluster-devel] [Gluster-users] Lot of EIO errors in disperse volume

2017-01-16 Thread Xavier Hernandez
el-boun...@gluster.org [mailto:gluster-devel-boun...@gluster.org] On Behalf Of Ankireddypalle Reddy Sent: Friday, January 13, 2017 4:17 AM To: Xavier Hernandez Cc: gluster-us...@gluster.org; Gluster Devel (gluster-devel@gluster.org) Subject: Re: [Gluster-devel] [Gluster-users] Lot of EIO errors in

Re: [Gluster-devel] Question about EC locking

2017-01-13 Thread Xavier Hernandez
.@gmail.com <mailto:jayakrishnan...@gmail.com>> wrote: Thanks Xavier, for making it clear. Regards JK On Dec 13, 2016 3:52 PM, "Xavier Hernandez" <xhernan...@datalab.es <mailto:xhernan...@datalab.es>> wrote: Hi JK, On 12/13/20

Re: [Gluster-devel] [Gluster-users] Lot of EIO errors in disperse volume

2017-01-13 Thread Xavier Hernandez
----- From: Xavier Hernandez [mailto:xhernan...@datalab.es] Sent: Thursday, January 12, 2017 6:40 AM To: Ankireddypalle Reddy Cc: Gluster Devel (gluster-devel@gluster.org); gluster-us...@gluster.org Subject: Re: [Gluster-users] [Gluster-devel] Lot of EIO errors in disperse volume Hi Ram, On 12/0

Re: [Gluster-devel] [Gluster-users] Lot of EIO errors in disperse volume

2017-01-12 Thread Xavier Hernandez
a self-heal. Xavi Thanks and Regards, Ram Sent from my iPhone On Jan 12, 2017, at 2:25 AM, Xavier Hernandez <xhernan...@datalab.es> wrote: Hi Ram, On 12/01/17 02:36, Ankireddypalle Reddy wrote: Xavi, I added some more logging information. The trusted.ec.size field

Re: [Gluster-devel] [Gluster-users] Lot of EIO errors in disperse volume

2017-01-11 Thread Xavier Hernandez
-disperse-4: Heal failed [Invalid argument] Thanks and Regards, Ram -Original Message- From: Ankireddypalle Reddy Sent: Wednesday, January 11, 2017 9:29 AM To: Ankireddypalle Reddy; Xavier Hernandez; Gluster Devel (gluster-devel@gluster.org); gluster-us...@gluster.org Subject: RE: [Gluster

Re: [Gluster-devel] Lot of EIO errors in disperse volume

2017-01-10 Thread Xavier Hernandez
tories. Xavi Thanks and Regards, Ram -Original Message- From: Xavier Hernandez [mailto:xhernan...@datalab.es] Sent: Tuesday, January 10, 2017 7:53 AM To: Ankireddypalle Reddy; Gluster Devel (gluster-devel@gluster.org); gluster-us...@gluster.org Subject: Re: [Gluster-devel] Lot of EIO error

Re: [Gluster-devel] Lot of EIO errors in disperse volume

2017-01-10 Thread Xavier Hernandez
that some of these operations would succeed if retried. Do you know of any communication-related errors that are being reported/triaged? Thanks and Regards, Ram -Original Message- From: Xavier Hernandez [mailto:xhernan...@datalab.es] Sent: Tuesday, January 10, 2017 7:23 AM To: Ankireddypalle

Re: [Gluster-devel] Lot of EIO errors in disperse volume

2017-01-10 Thread Xavier Hernandez
exact file that triggers the EIO. The attached attributes seem consistent and that directory shouldn't cause any problem. Does an 'ls' on that directory fail or does it show the contents ? Xavi Thanks and Regards, Ram -Original Message- From: Xavier Hernandez [mailto:xhernan...@datalab.es] S

Re: [Gluster-devel] Lot of EIO errors in disperse volume

2017-01-10 Thread Xavier Hernandez
the servers at a time. The volume was brought down during upgrade. Thanks and Regards, Ram -Original Message- From: Xavier Hernandez [mailto:xhernan...@datalab.es] Sent: Tuesday, January 10, 2017 6:35 AM To: Ankireddypalle Reddy; Gluster Devel (gluster-devel@gluster.org); gluster-us

Re: [Gluster-devel] Lot of EIO errors in disperse volume

2017-01-10 Thread Xavier Hernandez
Hi Ram, how did you upgrade gluster? From which version? Did you upgrade one server at a time and wait until self-heal finished before upgrading the next server? Xavi On 10/01/17 11:39, Ankireddypalle Reddy wrote: Hi, We upgraded to GlusterFS 3.7.18 yesterday. We see a lot of

Re: [Gluster-devel] GFID2 - Proposal to add extra byte to existing GFID

2016-12-16 Thread Xavier Hernandez
On 12/16/2016 08:31 AM, Aravinda wrote: Proposal to add one more byte to GFID to store "Type" information. The extra byte will represent the type (directory: 00, file: 01, symlink: 02, etc.). For example, if a directory GFID is f4f18c02-0360-4cdc-8c00-0164e49a7afd then, GFID2 will be
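
A minimal sketch of the idea (the layout and names below are assumptions chosen for illustration, not the actual GFID2 proposal): a 16-byte UUID extended with one leading type byte, so the object type is known without an extra lookup.

    /* Illustrative sketch only; field names and layout are assumptions. */
    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    enum gfid2_type { GFID2_DIR = 0x00, GFID2_FILE = 0x01, GFID2_SYMLINK = 0x02 };

    /* A GFID is a 16-byte UUID; the proposal adds one extra byte that
     * carries the object type. */
    struct gfid2 {
        uint8_t type;
        uint8_t uuid[16];
    };

    static void gfid2_print(const struct gfid2 *g)
    {
        int i;

        printf("%02x-", g->type);
        for (i = 0; i < 16; i++)
            printf("%02x%s", g->uuid[i],
                   (i == 3 || i == 5 || i == 7 || i == 9) ? "-" : "");
        printf("\n");
    }

    int main(void)
    {
        struct gfid2 g = { .type = GFID2_DIR };

        /* arbitrary example bytes standing in for a real UUID */
        memset(g.uuid, 0xab, sizeof(g.uuid));
        gfid2_print(&g);
        return 0;
    }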

Re: [Gluster-devel] 1402538 : Assertion failure during rebalance of symbolic links

2016-12-15 Thread Xavier Hernandez
On 12/15/2016 01:41 PM, Nithya Balachandran wrote: On 15 December 2016 at 18:07, Xavier Hernandez <xhernan...@datalab.es <mailto:xhernan...@datalab.es>> wrote: On 12/15/2016 12:48 PM, Raghavendra Gowdappa wrote: I need to step back a little to understand the R

Re: [Gluster-devel] 1402538 : Assertion failure during rebalance of symbolic links

2016-12-15 Thread Xavier Hernandez
On 12/15/2016 12:48 PM, Raghavendra Gowdappa wrote: I need to step back a little to understand the RCA correctly. If I understand the code correctly, the call stack which resulted in the failed setattr is (in the rebalance process): dht_lookup -> dht_lookup_cbk -> dht_lookup_everywhere ->

Re: [Gluster-devel] 1402538 : Assertion failure during rebalance of symbolic links

2016-12-14 Thread Xavier Hernandez
On 12/14/2016 10:28 AM, Pranith Kumar Karampuri wrote: On Wed, Dec 14, 2016 at 2:54 PM, Xavier Hernandez <xhernan...@datalab.es <mailto:xhernan...@datalab.es>> wrote: On 12/14/2016 10:17 AM, Pranith Kumar Karampuri wrote: On Wed, Dec 14, 2016 at 1:48 PM, Xavi

Re: [Gluster-devel] 1402538 : Assertion failure during rebalance of symbolic links

2016-12-14 Thread Xavier Hernandez
On 12/14/2016 10:17 AM, Pranith Kumar Karampuri wrote: On Wed, Dec 14, 2016 at 1:48 PM, Xavier Hernandez <xhernan...@datalab.es <mailto:xhernan...@datalab.es>> wrote: There's another issue with the patch that Ashish sent. The original problem is that a setattr on a symbol

Re: [Gluster-devel] 1402538 : Assertion failure during rebalance of symbolic links

2016-12-14 Thread Xavier Hernandez
most of these problems should be solved. Xavi On 12/14/2016 09:02 AM, Xavier Hernandez wrote: On 12/14/2016 06:10 AM, Raghavendra Gowdappa wrote: - Original Message - From: "Pranith Kumar Karampuri" <pkara...@redhat.com> To: "Ashish Pandey" <aspan...@r

Re: [Gluster-devel] 1402538 : Assertion failure during rebalance of symbolic links

2016-12-14 Thread Xavier Hernandez
anathan" <srang...@redhat.com>, "Nithya Balachandran" <nbala...@redhat.com>, "Xavier Hernandez" <xhernan...@datalab.es>, "Raghavendra Gowdappa" <rgowd...@redhat.com> Sent: Tuesday, December 13, 2016 9:29:46 PM Subject: Re: 1402538 : Assertion

Re: [Gluster-devel] Question about EC locking

2016-12-12 Thread Xavier Hernandez
rishnan...@gmail.com <mailto:jayakrishnan...@gmail.com>> wrote: Hi Xavier, Thank you very much for your explanation. This helped me to understand more about locking in EC. Best Regards JK On Mon, Nov 28, 2016 at 4:17 PM, Xavier Hernandez <xhernan...@datalab

Re: [Gluster-devel] Question about EC locking

2016-11-28 Thread Xavier Hernandez
Hi, On 11/28/2016 02:59 AM, jayakrishnan mm wrote: Hi Xavier, Notice that EC xlator uses blocking locks. Any specific reason for this? In a distributed filesystem like gluster a synchronization mechanism is a must to avoid data corruption. Do you think this will affect the

Re: [Gluster-devel] Why vandermonde matrix is used in EC?

2016-11-27 Thread Xavier Hernandez
verify in some way that the other bricks do not contain updated data. Best regards, Xavi Best regards, Han 2016-11-24 17:26 GMT+09:00 Xavier Hernandez <xhernan...@datalab.es <mailto:xhernan...@datalab.es>>: Hi Han, On 11/24/2016 04:25 AM, 한우형 wrote: Hi,

Re: [Gluster-devel] Why vandermonde matrix is used in EC?

2016-11-24 Thread Xavier Hernandez
Hi Han, On 11/24/2016 04:25 AM, 한우형 wrote: Hi, I'm working on the dispersed volume (ec) and I found that the ec encode/decode algorithm uses a non-systematic Vandermonde matrix. My question is this: why is a non-systematic algorithm used? Non-systematic encoding/decoding doesn't alter performance when
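
For reference, in general terms a k+m erasure code computes n = k+m fragments y from k data words x as y = Gx. A non-systematic Vandermonde construction versus a systematic one looks roughly like this:

    y = G\,x, \qquad
    G_{\mathrm{non\text{-}sys}} =
    \begin{pmatrix}
      1 & \alpha_1 & \alpha_1^2 & \cdots & \alpha_1^{k-1} \\
      1 & \alpha_2 & \alpha_2^2 & \cdots & \alpha_2^{k-1} \\
      \vdots & \vdots & \vdots &        & \vdots \\
      1 & \alpha_n & \alpha_n^2 & \cdots & \alpha_n^{k-1}
    \end{pmatrix},
    \qquad
    G_{\mathrm{sys}} = \begin{pmatrix} I_k \\ P \end{pmatrix}

with x the k data words, y the n encoded fragments, and all alpha_i distinct. With the systematic form the first k fragments are the data itself, so a read with all data bricks healthy needs no matrix inversion; with a non-systematic Vandermonde matrix every read must invert a k x k submatrix, but all fragments are treated uniformly, which relates to the trade-off discussed in this thread.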

Re: [Gluster-devel] Possible problem introduced by http://review.gluster.org/15573

2016-10-24 Thread Xavier Hernandez
Hi Soumya, On 21/10/16 16:15, Soumya Koduri wrote: On 10/21/2016 06:35 PM, Soumya Koduri wrote: Hi Xavi, On 10/21/2016 12:57 PM, Xavier Hernandez wrote: Looking at the code, I think that the added fd_unref() should only be called if the fop preparation fails. Otherwise the callback already

Re: [Gluster-devel] Possible problem introduced by http://review.gluster.org/15573

2016-10-24 Thread Xavier Hernandez
On 21/10/16 15:05, Soumya Koduri wrote: Hi Xavi, On 10/21/2016 12:57 PM, Xavier Hernandez wrote: Looking at the code, I think that the added fd_unref() should only be called if the fop preparation fails. Otherwise the callback already unreferences the fd. Code flow

Re: [Gluster-devel] Possible problem introduced by http://review.gluster.org/15573

2016-10-21 Thread Xavier Hernandez
Hi Niels, On 21/10/16 10:03, Niels de Vos wrote: On Fri, Oct 21, 2016 at 09:03:30AM +0200, Xavier Hernandez wrote: Hi, I've just tried Gluster 3.8.5 with Proxmox using gfapi and I consistently see a crash each time an attempt to connect to the volume is made. Thanks, that likely is the same

Re: [Gluster-devel] Possible problem introduced by http://review.gluster.org/15573

2016-10-21 Thread Xavier Hernandez
. * When glfs_io_async_cbk() is called another ref is released. Note that if fop preparation fails, a single fd_unref() is called, but on success two fd_unref() are called. Xavi On 21/10/16 09:03, Xavier Hernandez wrote: Hi, I've just tried Gluster 3.8.5 with Proxmox using gfapi and I
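
The double release being debugged here can be reduced to a generic refcounting sketch (hypothetical names; this is not the gfapi code itself): the callback always drops the reference taken for it, so the submitter may only drop that reference on the error path where the callback will never run.

    /* Generic sketch of the refcounting pattern under discussion. */
    #include <stdio.h>
    #include <stdlib.h>

    struct fd_t { int refcount; };

    static void fd_ref(struct fd_t *fd)
    {
        fd->refcount++;
    }

    static void fd_unref(struct fd_t *fd)
    {
        if (--fd->refcount == 0) {
            printf("fd destroyed\n");
            free(fd);
        }
    }

    /* The async callback owns one reference and always releases it. */
    static void io_async_cbk(struct fd_t *fd, int ret)
    {
        printf("callback completed, ret = %d\n", ret);
        fd_unref(fd);
    }

    /* Submit path: take a reference for the callback. If preparation fails
     * the callback will never run, so the submitter must release that
     * reference itself; releasing it on the success path too would be the
     * double unref described above. */
    static int io_async_submit(struct fd_t *fd, int fail_prepare)
    {
        fd_ref(fd);
        if (fail_prepare) {
            fd_unref(fd);            /* error path only */
            return -1;
        }
        io_async_cbk(fd, 0);         /* stands in for the deferred completion */
        return 0;
    }

    int main(void)
    {
        struct fd_t *fd = calloc(1, sizeof(*fd));

        fd->refcount = 1;            /* caller's own reference */
        io_async_submit(fd, 0);
        fd_unref(fd);                /* caller drops its own reference */
        return 0;
    }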

[Gluster-devel] Possible problem introduced by http://review.gluster.org/15573

2016-10-21 Thread Xavier Hernandez
Hi, I've just tried Gluster 3.8.5 with Proxmox using gfapi and I consistently see a crash each time an attempt to connect to the volume is made. The backtrace of the crash shows this: #0 pthread_spin_lock () at ../nptl/sysdeps/x86_64/pthread_spin_lock.S:24 #1 0x7fe5345776a5 in

Re: [Gluster-devel] Multiplexing - good news, bad news, and a plea for help

2016-09-20 Thread Xavier Hernandez
On 19/09/16 15:26, Jeff Darcy wrote: I have brick multiplexing[1] functional to the point that it passes all basic AFR, EC, and quota tests. There are still some issues with tiering, and I wouldn't consider snapshots functional at all, but it seemed like a good point to see how well it

Re: [Gluster-devel] Review request for 3.9 patches

2016-09-19 Thread Xavier Hernandez
Hi Poornima, On 19/09/16 07:01, Poornima Gurusiddaiah wrote: Hi All, There are 3 more patches that we need for enabling md-cache invalidation in 3.9. Request your help with the reviews: http://review.gluster.org/#/c/15378/ - afr: Implement IPC fop http://review.gluster.org/#/c/15387/ -

Re: [Gluster-devel] Query regards to heal xattr heal in dht

2016-09-15 Thread Xavier Hernandez
On 15/09/16 11:31, Raghavendra G wrote: On Thu, Sep 15, 2016 at 12:02 PM, Nithya Balachandran wrote: On 8 September 2016 at 12:02, Mohit Agrawal wrote: Hi All,

Re: [Gluster-devel] Need help with https://bugzilla.redhat.com/show_bug.cgi?id=1224180

2016-09-13 Thread Xavier Hernandez
to find a custom solution to this. Xavi Thanks and Regards, Sanoj - Original Message - From: "Xavier Hernandez" <xhernan...@datalab.es> To: "Raghavendra Gowdappa" <rgowd...@redhat.com>, "Sanoj Unnikrishnan" <sunni...@redhat.com> Cc

Re: [Gluster-devel] Need help with https://bugzilla.redhat.com/show_bug.cgi?id=1224180

2016-09-13 Thread Xavier Hernandez
Hi Sanoj, I'm unable to see bug 1224180. Access is restricted. Not sure what the problem is exactly, but I see that quota is involved. Currently disperse doesn't play well with quota when the limit is near. The reason is that not all bricks fail at the same time with EDQUOT due to small

Re: [Gluster-devel] Spurious termination of fuse invalidation notifier thread

2016-09-06 Thread Xavier Hernandez
Hi Raghavendra, On 06/09/16 06:11, Raghavendra Gowdappa wrote: - Original Message - From: "Xavier Hernandez" <xhernan...@datalab.es> To: "Raghavendra Gowdappa" <rgowd...@redhat.com>, "Kaleb Keithley" <kkeit...@redhat.com>, "Pran

Re: [Gluster-devel] Spurious termination of fuse invalidation notifier thread

2016-09-05 Thread Xavier Hernandez
Hi Raghavendra, On 03/09/16 05:42, Raghavendra Gowdappa wrote: Hi Xavi/Kaleb/Pranith, During few of our older conversations (like [1], but not only one), some of you had reported that the thread which writes invalidation notifications (of inodes, entries) to /dev/fuse terminates spuriously.

Re: [Gluster-devel] Notifications (was Re: GF_PARENT_DOWN on SIGKILL)

2016-07-25 Thread Xavier Hernandez
Hi Jeff, On 22/07/16 16:14, Jeff Darcy wrote: I don't think we need any list traversal because notify sends it down the graph. Good point. I think we need to change that, BTW. Relying on translators to propagate notifications has proven very fragile, as many of those events are overloaded

Re: [Gluster-devel] GF_PARENT_DOWN on SIGKILL

2016-07-25 Thread Xavier Hernandez
Hi Jeff, On 22/07/16 15:37, Jeff Darcy wrote: Gah! sorry sorry, I meant to send the mail as SIGTERM. Not SIGKILL. So xavi and I were wondering why cleanup_and_exit() is not sending GF_PARENT_DOWN event. OK, then that grinding sound you hear is my brain shifting gears. ;) It seems that

Re: [Gluster-devel] performance issues Manoj found in EC testing

2016-06-28 Thread Xavier Hernandez
Poornima *From: *"Pranith Kumar Karampuri" <pkara...@redhat.com <mailto:pkara...@redhat.com>> *To: *"Xavier Hernandez" <xhernan...@datalab.es <mailto:xhernan...@datalab.es>> *Cc: *"Gluster Devel" &l

Re: [Gluster-devel] performance issues Manoj found in EC testing

2016-06-27 Thread Xavier Hernandez
: "Pranith Kumar Karampuri" <pkara...@redhat.com> To: "Xavier Hernandez" <xhernan...@datalab.es> Cc: "Manoj Pillai" <mpil...@redhat.com>, "Gluster Devel" <gluster-devel@gluster.org> Sent: Thursday, June 23, 2016 8:50:44 PM Subject: perfo

Re: [Gluster-devel] Wrong assumptions about disperse

2016-06-20 Thread Xavier Hernandez
Hi Shyam, On 17/06/16 15:59, Shyam wrote: On 06/17/2016 04:59 AM, Xavier Hernandez wrote: Firstly, thanks for the overall post, was informative and helps clarify some aspects of EC. AFAIK the real problem of EC is the communications layer. It adds a lot of latency and having to communicate

[Gluster-devel] Wrong assumptions about disperse

2016-06-17 Thread Xavier Hernandez
Hi all, I've seen in many places the belief that disperse, or erasure coding in general, is slow because of the complex or costly math involved. It's true that there's an overhead compared to the simple copy that replica does, but this overhead is much smaller than many people think. The
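
As a rough worked example of that claim (assuming a 4+2 disperse layout and ignoring metadata updates): for a write of size S the client sends (k+m)/k * S bytes to the bricks, versus r * S for replica r:

    \text{disperse } 4+2:\quad \frac{4+2}{4}\,S = 1.5\,S
    \qquad\qquad
    \text{replica } 3:\quad 3\,S

So the bytes moved per write are actually lower for 4+2 than for replica 3; what cannot be avoided is contacting 6 bricks and waiting for enough answers on every operation, which is the communications-layer latency mentioned in the reply above.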

Re: [Gluster-devel] Failure to release unusable file open fd_count on glusterfs v3.7.11

2016-06-09 Thread Xavier Hernandez
Hi, thanks for testing it. I've identified an fd leak in the disperse xlator. I've filed a bug [1] for this. Xavi [1] https://bugzilla.redhat.com/show_bug.cgi?id=1344396 On 08.06.2016 05:00, 彭繼霆 wrote: > Hi, I have a volume created with 3 bricks. After deleting a file which was created by

Re: [Gluster-devel] dht mkdir preop check, afr and (non-)readable afr subvols

2016-06-06 Thread Xavier Hernandez
Hi Raghavendra, On 06/06/16 10:54, Raghavendra G wrote: On Wed, Jun 1, 2016 at 12:50 PM, Xavier Hernandez <xhernan...@datalab.es <mailto:xhernan...@datalab.es>> wrote: Hi, On 01/06/16 08:53, Raghavendra Gowdappa wrote: - Original Message -

Re: [Gluster-devel] dht mkdir preop check, afr and (non-)readable afr subvols

2016-06-01 Thread Xavier Hernandez
com <mailto:raghaven...@gluster.com>> wrote: On Tue, May 31, 2016 at 12:37 PM, Xavier Hernandez <xhernan...@datalab.es <mailto:xhernan...@datalab.es>> wrote: Hi, On 31/05/16 07:05, Raghavendra Gowdappa wrote: +gluster-devel, +

Re: [Gluster-devel] dht mkdir preop check, afr and (non-)readable afr subvols

2016-05-31 Thread Xavier Hernandez
Hi, On 31/05/16 07:05, Raghavendra Gowdappa wrote: +gluster-devel, +Xavi Hi all, The context is [1], where bricks do pre-operation checks before doing a fop and proceed with fop only if pre-op check is successful. @Xavi, We need your inputs on behavior of EC subvolumes as well. If I

Re: [Gluster-devel] Possible bug in the communications layer ?

2016-05-09 Thread Xavier Hernandez
I've uploaded a patch for this problem: http://review.gluster.org/14270 Any review will be much appreciated :) Thanks, Xavi On 09/05/16 12:35, Raghavendra Gowdappa wrote: - Original Message - From: "Xavier Hernandez" <xhernan...@datalab.es> To: "Raghave"

Re: [Gluster-devel] Possible bug in the communications layer ?

2016-05-09 Thread Xavier Hernandez
The program interprets an xdata_len of 0 as meaning there's no xdata, so it continues reading the remainder of the RPC packet into the payload buffer. If you want, I can send a patch for this. Xavi On 05/05/16 10:21, Xavier Hernandez wrote: I've undone all changes and now I'm unable to
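
A minimal sketch of the kind of check being described (hypothetical header layout; this is not the real GlusterFS RPC decoder): when the decoder sees an xdata length of 0 it assumes no xdata block follows and hands everything after the header to the payload buffer.

    /* Minimal sketch of the ambiguity described above; hypothetical layout. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    struct readv_rsp_hdr {
        uint32_t op_ret;
        uint32_t xdata_len;   /* 0 is taken to mean "no xdata follows" */
    };

    /* Split the rest of an RPC fragment between xdata and read payload. */
    static void decode_fragment(const uint8_t *buf, size_t len)
    {
        struct readv_rsp_hdr hdr;
        size_t rest;

        memcpy(&hdr, buf, sizeof(hdr));
        rest = len - sizeof(hdr);

        if (hdr.xdata_len == 0) {
            /* Everything after the header is treated as read payload; a
             * partially received or misread header ends up here too, which
             * is where a wrong interpretation would show. */
            printf("no xdata, %zu bytes of payload\n", rest);
        } else {
            printf("%u bytes of xdata, %zu bytes of payload\n",
                   hdr.xdata_len, rest - hdr.xdata_len);
        }
    }

    int main(void)
    {
        uint8_t frag[64] = { 0 };   /* stand-in fragment, header zeroed */

        decode_fragment(frag, sizeof(frag));
        return 0;
    }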

Re: [Gluster-devel] Bugs with incorrect status

2016-05-06 Thread Xavier Hernandez
I think there's a problem with the script that generates this report. The changes I2fac59 and Ie1934f are bound to bug 1332054, not 1236065. Xavi On 06/05/16 10:41, Niels de Vos wrote: 1236065 (mainline) MODIFIED: Disperse volume: FUSE I/O error after self healing the failed disk files

Re: [Gluster-devel] [Gluster-users] Fwd: dht_is_subvol_filled messages on client

2016-05-05 Thread Xavier Hernandez
On 05/05/16 13:59, Kaushal M wrote: On Thu, May 5, 2016 at 4:37 PM, Xavier Hernandez <xhernan...@datalab.es> wrote: On 05/05/16 11:31, Kaushal M wrote: On Thu, May 5, 2016 at 2:36 PM, David Gossage <dgoss...@carouselchecks.com> wrote: On Thu, May 5, 2016 at 3:28 AM,

Re: [Gluster-devel] [Gluster-users] Fwd: dht_is_subvol_filled messages on client

2016-05-05 Thread Xavier Hernandez
n Thu, May 5, 2016 at 9:33 AM, Xavier Hernandez <xhernan...@datalab.es> wrote: Can you post the result of 'gluster volume status v0 detail' ? On 05/05/16 06:49, Serkan Çoban wrote: Hi, Can anyone suggest something for this issue? df, du has no issue for the bricks yet one subvolume not bei

Re: [Gluster-devel] [Gluster-users] Fwd: dht_is_subvol_filled messages on client

2016-05-05 Thread Xavier Hernandez
Can you post the result of 'gluster volume status v0 detail' ? On 05/05/16 06:49, Serkan Çoban wrote: Hi, Can anyone suggest something for this issue? df, du has no issue for the bricks yet one subvolume not being used by gluster.. On Wed, May 4, 2016 at 4:40 PM, Serkan Çoban

Re: [Gluster-devel] Possible bug in the communications layer ?

2016-05-04 Thread Xavier Hernandez
On 04/05/16 14:47, Raghavendra Gowdappa wrote: - Original Message - From: "Xavier Hernandez" <xhernan...@datalab.es> To: "Raghavendra Gowdappa" <rgowd...@redhat.com> Cc: "Gluster Devel" <gluster-devel@gluster.org> Sent: Wednesday, Ma

Re: [Gluster-devel] Possible bug in the communications layer ?

2016-05-04 Thread Xavier Hernandez
ain what would be the best solution? Xavi On 29/04/16 14:52, Xavier Hernandez wrote: With your patch applied, it seems that the bug is not hit. I guess it's a timing issue that the new logging hides. Maybe no more data is available after reading the partial readv header? (it will arrive late

[Gluster-devel] Improve EXPECT/EXPECT_WITHIN result check in tests

2016-05-02 Thread Xavier Hernandez
Hi, I've found a spurious failure caused by an incorrect check of the expected value in EXPECT_WITHIN. The problem is that the value passed to EXPECT_WITHIN (EXPECT also has the same problem) is treated as a regular expression, but most tests do not pass a full/valid regular expression.
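
A small standalone demo of the underlying pitfall (POSIX regex in C, not the test-framework code itself): a literal expected value, once interpreted as an unanchored regular expression, can match values it was never meant to accept, while an anchored pattern only matches the exact value.

    /* Demo of the general pitfall: unanchored vs anchored matching. */
    #include <regex.h>
    #include <stdio.h>

    static int matches(const char *pattern, const char *value)
    {
        regex_t re;
        int rc;

        if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
            return -1;                      /* not a valid regular expression */
        rc = regexec(&re, value, 0, NULL, 0);
        regfree(&re);
        return rc == 0;
    }

    int main(void)
    {
        /* "0" used as a regex matches anywhere inside "10"... */
        printf("\"0\"   vs \"10\": %d\n", matches("0", "10"));
        /* ...while an anchored pattern only accepts the exact value. */
        printf("\"^0$\" vs \"10\": %d\n", matches("^0$", "10"));
        printf("\"^0$\" vs \"0\" : %d\n", matches("^0$", "0"));
        return 0;
    }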

Re: [Gluster-devel] Possible bug in the communications layer ?

2016-04-29 Thread Xavier Hernandez
: Attaching the patch. - Original Message - From: "Raghavendra Gowdappa" <rgowd...@redhat.com> To: "Xavier Hernandez" <xhernan...@datalab.es> Cc: "Gluster Devel" <gluster-devel@gluster.org> Sent: Friday, April 29, 2016 5:14:02 PM Subject: Re: [G

Re: [Gluster-devel] Regression-test-burn-in crash in EC test

2016-04-29 Thread Xavier Hernandez
Hi Jeff, On 27/04/16 20:01, Jeff Darcy wrote: One of the "rewards" of reviewing and merging people's patches is getting email if the next regression-test-burn-in should fail - even if it fails for a completely unrelated reason. Today I got one that's not among the usual suspects. The

Re: [Gluster-devel] Possible bug in the communications layer ?

2016-04-29 Thread Xavier Hernandez
ot; <rgowd...@redhat.com> To: "Xavier Hernandez" <xhernan...@datalab.es> Cc: "Gluster Devel" <gluster-devel@gluster.org> Sent: Friday, April 29, 2016 12:36:43 PM Subject: Re: [Gluster-devel] Possible bug in the communications layer ? - Original Message - F

Re: [Gluster-devel] Possible bug in the communications layer ?

2016-04-28 Thread Xavier Hernandez
I've filed bug https://bugzilla.redhat.com/show_bug.cgi?id=1331502 and added Raghavendra Gowdappa in the CC list (he appears as a maintainer of RPC). Xavi On 28.04.2016 18:42, Xavier Hernandez wrote: > Hi Niels, > > On 28.04.2016 15:44, Niels de Vos wrote: > >> On

Re: [Gluster-devel] Possible bug in the communications layer ?

2016-04-28 Thread Xavier Hernandez
Hi Niels, On 28.04.2016 15:44, Niels de Vos wrote: > On Thu, Apr 28, 2016 at 02:43:01PM +0200, Xavier Hernandez wrote: > >> Hi, I've seen what seems to be a bug in the communications layer. The first sign is an "XDR decoding failed" error in the logs. This happens with G

Re: [Gluster-devel] Possible bug in the communications layer ?

2016-04-28 Thread Xavier Hernandez
Hi Jeff, On 28.04.2016 15:20, Jeff Darcy wrote: >> This happens with Gluster 3.7.11 accessed through Ganesha and gfapi. The volume is a distributed-disperse 4*(4+2). I'm able to reproduce the problem easily by doing the following test: iozone -t2 -s10g -r1024k -i0 -w -F/iozone{1..2}.dat echo 3

[Gluster-devel] Possible bug in the communications layer ?

2016-04-28 Thread Xavier Hernandez
Hi, I've seen what seems to be a bug in the communications layer. The first sign is an "XDR decoding failed" error in the logs. This happens with Gluster 3.7.11 accessed through Ganesha and gfapi. The volume is a distributed-disperse 4*(4+2). I'm able to reproduce the problem easily by doing the

Re: [Gluster-devel] disperse volume file to subvolume mapping

2016-04-19 Thread Xavier Hernandez
Hi Serkan, On 19/04/16 09:18, Serkan Çoban wrote: Hi, I just reinstalled a fresh 3.7.11 and I am seeing the same behavior. 50 clients are copying part-0- named files to gluster using mapreduce, one thread per server, and they are using only 20 servers out of 60. On the other hand fio tests

Re: [Gluster-devel] Fragment size in Systematic erasure code

2016-03-14 Thread Xavier Hernandez
Hi Ashish, On 14/03/16 12:31, Ashish Pandey wrote: Hi Xavi, I think for the systematic erasure coded volume you are going to use a fragment size of 512 bytes. Will there be any CLI option to configure this block size? We were having a discussion and Manoj was suggesting having this option, which

Re: [Gluster-devel] GlusterFS FUSE client leaks summary — part I

2016-02-02 Thread Xavier Hernandez
ol-name=fuse:dentry_t hot-count=32761 if '32761' is the current active dentry count, it still doesn't seem to match up to inode count. Thanks, Soumya And here is Valgrind output: https://gist.github.com/2490aeac448320d98596 On субота, 30 січня 2016 р. 22:56:37 EET Xavier Hernandez wrote: The

Re: [Gluster-devel] [Gluster-users] GlusterFS FUSE client leaks summary — part I

2016-02-01 Thread Xavier Hernandez
nt. Thanks, Soumya And here is Valgrind output: https://gist.github.com/2490aeac448320d98596 On субота, 30 січня 2016 р. 22:56:37 EET Xavier Hernandez wrote: There's another inode leak caused by an incorrect counting of lookups on directory reads. Here's a patch that solves the problem for 3.7: h

Re: [Gluster-devel] [Gluster-users] GlusterFS FUSE client leaks summary — part I

2016-02-01 Thread Xavier Hernandez
ll doesn't seem to match up to inode count. Thanks, Soumya And here is Valgrind output: https://gist.github.com/2490aeac448320d98596 On субота, 30 січня 2016 р. 22:56:37 EET Xavier Hernandez wrote: There's another inode leak caused by an incorrect counting of lookups on directory reads. Here

Re: [Gluster-devel] GlusterFS FUSE client leaks summary — part I

2016-01-30 Thread Xavier Hernandez
There's another inode leak caused by an incorrect counting of lookups on directory reads. Here's a patch that solves the problem for 3.7: http://review.gluster.org/13324 Hopefully with this patch the memory leaks should disappear. Xavi On 29.01.2016 19:09, Oleksandr Natalenko wrote: >
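
A toy model of the lookup/forget balance involved (not GlusterFS code): every successful lookup of an entry, including entries returned by directory reads, bumps a per-inode count that a later forget has to cancel exactly; if the two sides disagree on the count, the inode either lingers forever (a leak) or is dropped too early.

    /* Toy model of the lookup/forget balance; not GlusterFS code. */
    #include <stdio.h>

    struct inode { long nlookup; };

    /* Every successful lookup of the entry increments the count... */
    static void inode_lookup(struct inode *in)         { in->nlookup++; }
    /* ...and the kernel later cancels it with a forget carrying a count. */
    static void inode_forget(struct inode *in, long n) { in->nlookup -= n; }

    int main(void)
    {
        struct inode in = { 0 };

        inode_lookup(&in);    /* explicit lookup from the kernel */
        inode_lookup(&in);    /* same entry returned by a directory read */

        inode_forget(&in, 2); /* both references must eventually be forgotten */

        printf("remaining lookups: %ld (only at 0 can the inode be purged)\n",
               in.nlookup);
        return 0;
    }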

Re: [Gluster-devel] distributed files/directories and [cm]time updates

2016-01-26 Thread Xavier Hernandez
Hi Joseph, On 26/01/16 10:42, Joseph Fernandes wrote: Hi Xavi, Answer inline: - Original Message - From: "Xavier Hernandez" <xhernan...@datalab.es> To: "Joseph Fernandes" <josfe...@redhat.com> Cc: "Pranith Kumar Karampuri" <pkara...

Re: [Gluster-devel] distributed files/directories and [cm]time updates

2016-01-26 Thread Xavier Hernandez
Hi Joseph, On 26/01/16 09:07, Joseph Fernandes wrote: Answer inline: - Original Message - From: "Xavier Hernandez" <xhernan...@datalab.es> To: "Pranith Kumar Karampuri" <pkara...@redhat.com>, "Gluster Devel" <gluster-devel@gluster.or

Re: [Gluster-devel] distributed files/directories and [cm]time updates

2016-01-25 Thread Xavier Hernandez
Hi Pranith, On 26/01/16 03:47, Pranith Kumar Karampuri wrote: hi, Traditionally gluster has been using ctime/mtime of the files/dirs on the bricks as stat output. The problem we are seeing with this approach is that software which depends on it gets confused when there are differences in

Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client

2016-01-21 Thread Xavier Hernandez
message per each remount): === [2016-01-18 23:42:08.742447] I [fuse-bridge.c:3875:notify_kernel_loop] 0- glusterfs-fuse: kernel notifier loop terminated === On середа, 20 січня 2016 р. 09:51:23 EET Xavier Hernandez wrote: I'm seeing a similar problem with 3.7.6. This latest statedump contains

Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client

2016-01-20 Thread Xavier Hernandez
I'm seeing a similar problem with 3.7.6. This latest statedump contains a lot of gf_fuse_mt_invalidate_node_t objects in fuse. Looking at the code I see they are used to send invalidations to kernel fuse; however, this is done in a separate thread that writes a log message when it exits. On

Re: [Gluster-devel] tests/basic/ec/ec-3-1.t generated core

2016-01-14 Thread Xavier Hernandez
recvd $14 = 0 '\000' Xavi On 14/01/16 08:33, Xavier Hernandez wrote: The failure happens when a statedump is generated. For some reason priv->active_subvol is NULL, causing a segmentation fault: (gdb) t 1 [Switching to thread 1 (LWP 4179)] #0 0x7f75dae83137 in fuse_itable_dump (th

Re: [Gluster-devel] tests/basic/ec/ec-3-1.t generated core

2016-01-13 Thread Xavier Hernandez
I'm looking it. On 14/01/16 08:03, Atin Mukherjee wrote: [1] has caused a regression failure with a core from the mentioned test. Mind having a look? [1] https://build.gluster.org/job/rackspace-regression-2GB-triggered/17579/consoleFull Thanks, Atin

Re: [Gluster-devel] tests/basic/ec/ec-3-1.t generated core

2016-01-13 Thread Xavier Hernandez
tive_subvol->itable, 4989 "xlator.mount.fuse.itable"); 4990 4991 return 0; 4992 } (gdb) print priv->active_subvol $5 = (xlator_t *) 0x0 Does this sound familiar to anyone? Xavi On 14/01/16 08:08, Xavier Hernandez wr
