Re: [Gluster-devel] Removing problematic language in geo-replication

2020-07-22 Thread Kotresh Hiremath Ravishankar
+1

On Wed, Jul 22, 2020 at 2:34 PM Ravishankar N 
wrote:

> Hi,
>
> The gluster code base has some words and terminology (blacklist,
> whitelist, master, slave etc.) that can be considered hurtful/offensive
> to people in a global open source setting. Some of these words can be fixed
> trivially but the Geo-replication code seems to be something that needs
> extensive rework. More so because we have these words being used in the
> CLI itself. Two questions that I had were:
>
> 1. Can I replace master:slave with primary:secondary everywhere in the
> code and the CLI? Are there any suggestions for more appropriate
> terminology?
>
Primary:secondary looks good to me.

>
> 2. Is it okay to target the changes to a major release (release-9) and
> *not* provide backward compatibility for the CLI?
>
>  I think that should be good. This also needs a change in the tools/scripts
that use geo-rep, such as the geo-rep scheduler script.
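
For anyone picturing the scope of the CLI change, here is a geo-rep command as
it reads today and a rough sketch of how it might read after such a rename (the
renamed form is only an illustration of the proposal, not a finalized syntax):

    # today's syntax (master/slave terminology)
    gluster volume geo-replication <master-vol> <slave-host>::<slave-vol> status

    # possible syntax after the rename (hypothetical, for illustration only)
    gluster volume geo-replication <primary-vol> <secondary-host>::<secondary-vol> status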

Thanks,
>
> Ravi
>
>

-- 
Thanks and Regards,
Kotresh H R

Re: [Gluster-devel] could you help to check about a glusterfs issue seems to be related to ctime

2020-03-17 Thread Kotresh Hiremath Ravishankar
The whole ctime feature relies on time provided by the clients, which are
expected to be time synchronized. This patch brings in the time from the server
to compare against the time sent from the client.
As Amar mentioned, this doesn't fit well into the scheme of how ctime is
designed. Keeping it optional and disabling it by default is definitely one
way. But is that your intention here?
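
If it does go in as an option, I would expect it to be a normal posix option
toggled through volume set; a rough sketch of what that could look like (the
option name below is invented purely for illustration and does not exist):

    # hypothetical option name, disabled by default -- illustration only
    gluster volume set <volname> storage.allow-mdata-rewind on
    gluster volume get <volname> storage.allow-mdata-rewind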

On Tue, Mar 17, 2020 at 10:56 AM Zhou, Cynthia (NSB - CN/Hangzhou) <
cynthia.z...@nokia-sbell.com> wrote:

> Ok, thanks for your feedback!
>
> I will do local test to verify this patch first.
>
>
>
> cynthia
>
>
>
> *From:* Amar Tumballi 
> *Sent:* 2020年3月17日 13:18
> *To:* Zhou, Cynthia (NSB - CN/Hangzhou) 
> *Cc:* Kotresh Hiremath Ravishankar ; Gluster Devel <
> gluster-devel@gluster.org>
> *Subject:* Re: [Gluster-devel] could you help to check about a glusterfs
> issue seems to be related to ctime
>
>
>
>
>
> On Tue, Mar 17, 2020 at 10:18 AM Zhou, Cynthia (NSB - CN/Hangzhou) <
> cynthia.z...@nokia-sbell.com> wrote:
>
> Hi glusterfs expert,
>
> Our product needs to tolerate the date being changed to the future and then back.
>
> How about a change like this?
>
>
> https://review.gluster.org/#/c/glusterfs/+/24229/1/xlators/storage/posix/src/posix-metadata.c
>
>
>
> when the time is changed to the future and back, it should still be able to
> update mdata, so that subsequent changes to the file can be propagated to
> other clients.
>
>
>
>
>
> We do like to have people integrating with GlusterFS. But this change is
> not in line with the 'assumptions' we had about the feature.
>
>
>
> If you have verified this change works for you, please add it as an
> 'option' in posix, which can be changed through volume set, and keep this
> option disabled (off) by default. That should be an easier way to get the
> patch reviewed and take it further. Please make sure to provide the
> 'description' for the option with details.
>
>
>
> Regards,
>
> Amar
>
>
>
>
>
> cynthia
>
>
>
> *From:* Zhou, Cynthia (NSB - CN/Hangzhou)
> *Sent:* 2020年3月12日 17:31
> *To:* 'Kotresh Hiremath Ravishankar' 
> *Cc:* 'Gluster Devel' 
> *Subject:* RE: could you help to check about a glusterfs issue seems to
> be related to ctime
>
>
>
> Hi,
>
> One more question: I find each client has the same future time stamp. Where
> are those time stamps from, since they differ from any brick-stored
> time stamp? And after I modify files from the clients, it remains the
> same.
>
> [root@mn-0:/home/robot]
>
> # stat /mnt/export/testfile
>
>   File: /mnt/export/testfile
>
>   Size: 193 Blocks: 1  IO Block: 131072 regular file
>
> Device: 28h/40d Inode: 10383279039841136109  Links: 1
>
> Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (
> 615/_nokfsuifileshare)
>
> Access: 2020-04-11 12:20:22.114365172 +0300
>
> Modify: 2020-04-11 12:20:22.121552573 +0300
>
> Change: 2020-04-11 12:20:22.121552573 +0300
>
>
>
> [root@mn-0:/home/robot]
>
> # date
>
> Thu Mar 12 11:27:33 EET 2020
>
> [root@mn-0:/home/robot]
>
>
>
> [root@mn-0:/home/robot]
>
> # stat /mnt/bricks/export/brick/testfile
>
>   File: /mnt/bricks/export/brick/testfile
>
>   Size: 193 Blocks: 16 IO Block: 4096   regular file
>
> Device: fc02h/64514dInode: 512015  Links: 2
>
> Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (
> 615/_nokfsuifileshare)
>
> Access: 2020-04-11 12:20:22.100395536 +0300
>
> Modify: 2020-03-12 11:25:04.095981276 +0200
>
> Change: 2020-03-12 11:25:04.095981276 +0200
>
> Birth: 2020-04-11 08:53:26.805163816 +0300
>
>
>
>
>
> [root@mn-1:/root]
>
> # stat /mnt/bricks/export/brick/testfile
>
>   File: /mnt/bricks/export/brick/testfile
>
>   Size: 193 Blocks: 16 IO Block: 4096   regular file
>
> Device: fc02h/64514dInode: 512015  Links: 2
>
> Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (
> 615/_nokfsuifileshare)
>
> Access: 2020-04-11 12:20:22.100395536 +0300
>
> Modify: 2020-03-12 11:25:04.094913452 +0200
>
> Change: 2020-03-12 11:25:04.095913453 +0200
>
> Birth: 2020-03-12 07:53:26.803783053 +0200
>
>
>
>
>
>
>
> *From:* Zhou, Cynthia (NSB - CN/Hangzhou)
> *Sent:* 2020年3月12日 16:09
> *To:* 'Kotresh Hiremath Ravishankar' 
> *Cc:* Gluster Devel 
> *Subject:* RE: could you help to check about a glusterfs issue seems to
> be related to ctime
>
>
>
> Hi,
>
> This is an abnormal test case; however, when it happens it will have a big
> impact on the apps using those files. And this can not be

Re: [Gluster-devel] could you help to check about a glusterfs issue seems to be related to ctime

2020-03-12 Thread Kotresh Hiremath Ravishankar
All the perf xlators depend on time (mostly mtime, I guess). In my setup,
only quick-read was enabled and hence disabling it worked for me.
All the perf xlators need to be disabled to make it work correctly. But I
still fail to understand how normal this kind of workload is.
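
Concretely, that means turning the client-side perf xlators off on the affected
volume; roughly something like the following (the volume name is a placeholder,
and the exact set needed may vary):

    gluster volume set <volname> performance.quick-read off
    gluster volume set <volname> performance.io-cache off
    gluster volume set <volname> performance.stat-prefetch off
    gluster volume set <volname> performance.read-ahead off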

Thanks,
Kotresh

On Thu, Mar 12, 2020 at 11:20 AM Zhou, Cynthia (NSB - CN/Hangzhou) <
cynthia.z...@nokia-sbell.com> wrote:

> When both quick-read and performance.io-cache are disabled, everything is
> back to normal.
>
> I attached the log when only enable quick-read and performance.io-cache is
> still on glusterfs trace log
>
> This is when executing the command “cat /mnt/export/testfile”.
>
> Can you help to find why it still fails to show the correct content?
>
> The file size shown is 141, but actually in the brick it is longer than that.
>
>
>
>
>
> cynthia
>
>
>
>
>
> *From:* Zhou, Cynthia (NSB - CN/Hangzhou)
> *Sent:* 2020年3月12日 12:53
> *To:* 'Kotresh Hiremath Ravishankar' 
> *Cc:* 'Gluster Devel' 
> *Subject:* RE: could you help to check about a glusterfs issue seems to
> be related to ctime
>
>
>
> From my local test, only when both features.ctime and ctime.noatime are
> disabled is this issue gone.
>
> Or
>
> do echo 3 >/proc/sys/vm/drop_caches each time after some client changes
> the file; then the cat command shows the correct data (same as the brick).
>
>
>
> cynthia
>
>
>
> *From:* Zhou, Cynthia (NSB - CN/Hangzhou)
> *Sent:* 2020年3月12日 9:53
> *To:* 'Kotresh Hiremath Ravishankar' 
> *Cc:* Gluster Devel 
> *Subject:* RE: could you help to check about a glusterfs issue seems to
> be related to ctime
>
>
>
> Hi,
>
> Thanks for your responding!
>
> I’ve tried to disable quick-read:
>
> [root@mn-0:/home/robot]
>
> # gluster v get export all| grep quick
>
> performance.quick-read  off
>
> performance.nfs.quick-read  off
>
>
>
> however, this issue still exists.
>
> Two clients see different contents.
>
>
>
> it seems only after I disable utime this issue is completely gone.
>
> features.ctime  off
>
> ctime.noatime   off
>
>
>
>
>
> Do you know why is this?
>
>
>
>
>
> Cynthia
>
> Nokia storage team
>
> *From:* Kotresh Hiremath Ravishankar 
> *Sent:* 2020年3月11日 22:05
> *To:* Zhou, Cynthia (NSB - CN/Hangzhou) 
> *Cc:* Gluster Devel 
> *Subject:* Re: could you help to check about a glusterfs issue seems to
> be related to ctime
>
>
>
> Hi,
>
> I figured out what's happening. The issue is that the file has its 'c|a|m'
> times set to the future (the file is created after the date is set to +30
> days). This is done from client-1. On client-2, with the correct date, when
> data is appended, it doesn't update the mtime and ctime, because both mtime
> and ctime are less than the time already set on the file. This protection is
> required to keep the latest time when two clients are writing to the same
> file. We update the c|m|a times only if they are greater than the existing
> time. As a result, the perf xlators on client-1, which rely on mtime, don't
> send the read to the server, as they think nothing has changed since the
> times haven't changed.
>
>
>
> Workarounds:
> 1. Disabling quick-read solved the issue for me.
>
> I don't know how real this kind of workload is. Is this a normal scenario?
> The other option is to remove that protection of updating the time only
> if it's greater, but that would open up a race when two clients are
> updating the same file.
>
> This would result in keeping an older time than the latest. It requires a
> code change and I don't think that should be done.
>
>
>
> Thanks,
> Kotresh
>
>
>
> On Wed, Mar 11, 2020 at 3:02 PM Kotresh Hiremath Ravishankar <
> khire...@redhat.com> wrote:
>
> Exactly, I am also curious about this. I will debug and update about
> what's exactly happening.
>
>
>
> Thanks,
> Kotresh
>
>
>
> On Wed, Mar 11, 2020 at 1:56 PM Zhou, Cynthia (NSB - CN/Hangzhou) <
> cynthia.z...@nokia-sbell.com> wrote:
>
> I used to think the file is cached in some client side buffer, because
> I’ve checked from different sn brick, the file content are all correct. But
> when I open client side trace level log, and cat the file, I only find
> lookup/open/flush fop from fuse-bridge side, I am just wondering how is
> file content served to client side? Should not there be readv fop seen from
> trace log?
>
>
>
> cynthia
>
>
>
> *From:* Zhou, Cynthia (NSB - CN/Hangzhou)
> *Sent:* 2020年3月11日 15:54
> *To:* 'Kotresh Hiremath Ravishankar' 
> *Subjec

Re: [Gluster-devel] could you help to check about a glusterfs issue seems to be related to ctime

2020-03-11 Thread Kotresh Hiremath Ravishankar
Hi,

I figured out what's happening. The issue is that the file has its 'c|a|m'
times set to the future (the file is created after the date is set to +30
days). This is done from client-1. On client-2, with the correct date, when
data is appended, it doesn't update the mtime and ctime, because both mtime
and ctime are less than the time already set on the file. This protection is
required to keep the latest time when two clients are writing to the same
file. We update the c|m|a times only if they are greater than the existing
time. As a result, the perf xlators on client-1, which rely on mtime, don't
send the read to the server, as they think nothing has changed since the
times haven't changed.
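
This state can be cross-checked by comparing what the mount reports with the
stored mdata xattr on the brick, for example with the paths used earlier in
this thread:

    # on the client: the times served to applications
    stat /mnt/export/testfile

    # on the brick node: the (future) times recorded in the mdata xattr
    getfattr -n trusted.glusterfs.mdata -e hex /mnt/bricks/export/brick/testfile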

Workarounds:
1. Disabling quick-read solved the issue for me.

I don't know how real this kind of workload is. Is this a normal scenario?
The other option is to remove that protection of updating the time only if
it's greater, but that would open up a race when two clients are updating
the same file.
This would result in keeping an older time than the latest. It requires a
code change and I don't think that should be done.

Thanks,
Kotresh

On Wed, Mar 11, 2020 at 3:02 PM Kotresh Hiremath Ravishankar <
khire...@redhat.com> wrote:

> Exactly, I am also curious about this. I will debug and update about
> what's exactly happening.
>
> Thanks,
> Kotresh
>
> On Wed, Mar 11, 2020 at 1:56 PM Zhou, Cynthia (NSB - CN/Hangzhou) <
> cynthia.z...@nokia-sbell.com> wrote:
>
>> I used to think the file is cached in some client side buffer, because
>> I’ve checked from different sn brick, the file content are all correct. But
>> when I open client side trace level log, and cat the file, I only find
>> lookup/open/flush fop from fuse-bridge side, I am just wondering how is
>> file content served to client side? Should not there be readv fop seen from
>> trace log?
>>
>>
>>
>> cynthia
>>
>>
>>
>> *From:* Zhou, Cynthia (NSB - CN/Hangzhou)
>> *Sent:* 2020年3月11日 15:54
>> *To:* 'Kotresh Hiremath Ravishankar' 
>> *Subject:* RE: could you help to check about a glusterfs issue seems to
>> be related to ctime
>>
>>
>>
>> Does that require that the clients should be time-synced at all times?
>> What if the client time is not synced for a while and then restored?
>>
>> I made a test where, after the time had been restored, a client changed the
>> file; the file’s modify and access times remain wrong. Is that
>> correct?
>>
>>
>>
>> root@mn-0:/home/robot]
>>
>> # echo "fromm mn-0">>/mnt/export/testfile
>>
>> [root@mn-0:/home/robot]
>>
>> # stat /mnt/export/testfile
>>
>>   File: /mnt/export/testfile
>>
>>   Size: 30  Blocks: 1  IO Block: 131072 regular file
>>
>> Device: 28h/40d Inode: 9855109080001305442  Links: 1
>>
>> Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (
>> 615/_nokfsuifileshare)
>>
>> Access: 2020-05-10 09:33:59.713840197 +0300
>>
>> Modify: 2020-05-10 09:33:59.713840197 +0300
>>
>> Change: 2020-05-10 09:33:59.714413772 +0300  //remains to be future time
>>
>> Birth: -
>>
>> [root@mn-0:/home/robot]
>>
>> # cat /mnt/export/testfil
>>
>> cat: /mnt/export/testfil: No such file or directory
>>
>> [root@mn-0:/home/robot]
>>
>> # cat /mnt/export/testfile
>>
>> from mn0
>>
>> from mn-1
>>
>> fromm mn-0
>>
>> [root@mn-0:/home/robot]
>>
>> # date
>>
>> Wed 11 Mar 2020 09:05:58 AM EET
>>
>> [root@mn-0:/home/robot]
>>
>>
>>
>> cynthia
>>
>>
>>
>> *From:* Kotresh Hiremath Ravishankar 
>> *Sent:* 2020年3月11日 15:41
>> *To:* Zhou, Cynthia (NSB - CN/Hangzhou) 
>> *Subject:* Re: could you help to check about a glusterfs issue seems to
>> be related to ctime
>>
>>
>>
>>
>>
>>
>>
>> On Wed, Mar 11, 2020 at 12:46 PM Zhou, Cynthia (NSB - CN/Hangzhou) <
>> cynthia.z...@nokia-sbell.com> wrote:
>>
>> But there are times when the ntp service goes wrong, and the time on two
>> storage nodes may not be synced.
>>
>> Or do you mean that when we cannot guarantee that the time on two clients is
>> synced, we should not enable this ctime feature?
>>
>> Yes, that's correct. The ctime feature relies on the time generated at
>> the client (that's the utime xlator loaded in client) and hence
>> expects all clients to be ntp synced.
>>
>>
>>
>> Without ctime feature, is there some way to avoid this “file changed as
>> we read it” issue?

Re: [Gluster-devel] Solving Ctime Issue with legacy files [BUG 1593542]

2019-06-18 Thread Kotresh Hiremath Ravishankar
Hi Xavi,

On Tue, Jun 18, 2019 at 12:28 PM Xavi Hernandez  wrote:

> Hi Kotresh,
>
> On Tue, Jun 18, 2019 at 8:33 AM Kotresh Hiremath Ravishankar <
> khire...@redhat.com> wrote:
>
>> Hi Xavi,
>>
>> Reply inline.
>>
>> On Mon, Jun 17, 2019 at 5:38 PM Xavi Hernandez 
>> wrote:
>>
>>> Hi Kotresh,
>>>
>>> On Mon, Jun 17, 2019 at 1:50 PM Kotresh Hiremath Ravishankar <
>>> khire...@redhat.com> wrote:
>>>
>>>> Hi All,
>>>>
>>>> The ctime feature is enabled by default from release gluster-6. But as
>>>> explained in bug [1]  there is a known issue with legacy files i.e., the
>>>> files which are created before ctime feature is enabled. These files would
>>>> not have the "trusted.glusterfs.mdata" xattr, which maintains the time attributes. So
>>>> on accessing those files, it gets created with the latest time attributes.
>>>> This is not correct because all the time attributes (atime, mtime, ctime)
>>>> get updated instead of required time attributes.
>>>>
>>>> There are a couple of approaches to solve this.
>>>>
>>>> 1. On accessing the files, let the posix update the time attributes
>>>> from  the back end file on respective replicas. This obviously results in
>>>> inconsistent "trusted.glusterfs.mdata" xattr values with in replica set.
>>>> AFR/EC should heal this xattr as part of metadata heal upon accessing this
>>>> file. It can choose to replicate from any subvolume. Ideally we should
>>>> consider the highest time from the replica and treat it as source but I
>>>> think that should be fine as replica time attributes are mostly in sync
>>>> with max difference in order of few seconds if am not wrong.
>>>>
>>>>But client side self heal is disabled by default because of
>>>> performance reasons [2]. If we chose to go by this approach, we need to
>>>> consider enabling at least client side metadata self heal by default.
>>>> Please share your thoughts on enabling the same by default.
>>>>
>>>> 2. Don't let posix update the legacy files from the backend. On lookup
>>>> cbk, let the utime xlator update the time attributes from statbuf received
>>>> synchronously.
>>>>
>>>> Both approaches are similar as both result in updating the xattr
>>>> during lookup. Please share your inputs on which approach is better.
>>>>
>>>
>>> I prefer second approach. First approach is not feasible for EC volumes
>>> because self-heal requires that k bricks (on a k+r configuration) agree on
>>> the value of this xattr, otherwise it considers the metadata damaged and
>>> needs manual intervention to fix it. During upgrade, the first r bricks will be
>>> upgraded without problems, but trusted.glusterfs.mdata won't be healed
>>> because r < k. In fact this xattr will be removed from new bricks because
>>> the majority of bricks agree on xattr not being present. Once the r+1 brick
>>> is upgraded, it's possible that posix sets different values for
>>> trusted.glusterfs.mdata, which will cause self-heal to fail.
>>>
>>> Second approach seems better to me if guarded by a new option that
>>> enables this behavior. utime xlator should only update the mdata xattr if
>>> that option is set, and that option should only be settable once all nodes
>>> have been upgraded (controlled by op-version). In this situation the first
>>> lookup on a file where utime detects that mdata is not set, will require a
>>> synchronous update. I think this is good enough because it will only happen
>>> once per file. We'll need to consider cases where different clients do
>>> lookups at the same time, but I think this can be easily solved by ignoring
>>> the request if mdata is already present.
>>>
>>
>> Initially there were two issues.
>> 1. Upgrade Issue with EC Volume as described by you.
>>  This is solved with the patch [1]. There was a bug in ctime
>> posix where it was creating xattr even when ctime is not set on client
>> (during utimes system call). With patch [1], the behavior
>> is that utimes system call will only update the
>> "trusted.glusterfs.mdata" xattr if present else it won't create. The new
>> xattr creation should only happen during entry operations (i.e create,
>> mknod and others).
>>So there won't be any problems with upgrade. I think we don't need new

Re: [Gluster-devel] Solving Ctime Issue with legacy files [BUG 1593542]

2019-06-18 Thread Kotresh Hiremath Ravishankar
Hi Xavi,

Reply inline.

On Mon, Jun 17, 2019 at 5:38 PM Xavi Hernandez  wrote:

> Hi Kotresh,
>
> On Mon, Jun 17, 2019 at 1:50 PM Kotresh Hiremath Ravishankar <
> khire...@redhat.com> wrote:
>
>> Hi All,
>>
>> The ctime feature is enabled by default from release gluster-6. But as
>> explained in bug [1]  there is a known issue with legacy files i.e., the
>> files which are created before ctime feature is enabled. These files would
>> not have the "trusted.glusterfs.mdata" xattr, which maintains the time attributes. So
>> on accessing those files, it gets created with the latest time attributes.
>> This is not correct because all the time attributes (atime, mtime, ctime)
>> get updated instead of required time attributes.
>>
>> There are a couple of approaches to solve this.
>>
>> 1. On accessing the files, let the posix update the time attributes from
>> the back end file on respective replicas. This obviously results in
>> inconsistent "trusted.glusterfs.mdata" xattr values with in replica set.
>> AFR/EC should heal this xattr as part of metadata heal upon accessing this
>> file. It can choose to replicate from any subvolume. Ideally we should
>> consider the highest time from the replica and treat it as source but I
>> think that should be fine as replica time attributes are mostly in sync
>> with max difference in order of few seconds if am not wrong.
>>
>>But client side self heal is disabled by default because of
>> performance reasons [2]. If we chose to go by this approach, we need to
>> consider enabling at least client side metadata self heal by default.
>> Please share your thoughts on enabling the same by default.
>>
>> 2. Don't let posix update the legacy files from the backend. On lookup
>> cbk, let the utime xlator update the time attributes from statbuf received
>> synchronously.
>>
>> Both approaches are similar as both result in updating the xattr during
>> lookup. Please share your inputs on which approach is better.
>>
>
> I prefer second approach. First approach is not feasible for EC volumes
> because self-heal requires that k bricks (on a k+r configuration) agree on
> the value of this xattr, otherwise it considers the metadata damaged and
> needs manual intervention to fix it. During upgrade, the first r bricks will be
> upgraded without problems, but trusted.glusterfs.mdata won't be healed
> because r < k. In fact this xattr will be removed from new bricks because
> the majority of bricks agree on xattr not being present. Once the r+1 brick
> is upgraded, it's possible that posix sets different values for
> trusted.glusterfs.mdata, which will cause self-heal to fail.
>
> Second approach seems better to me if guarded by a new option that enables
> this behavior. utime xlator should only update the mdata xattr if that
> option is set, and that option should only be settable once all nodes have
> been upgraded (controlled by op-version). In this situation the first
> lookup on a file where utime detects that mdata is not set, will require a
> synchronous update. I think this is good enough because it will only happen
> once per file. We'll need to consider cases where different clients do
> lookups at the same time, but I think this can be easily solved by ignoring
> the request if mdata is already present.
>

Initially there were two issues.

1. The upgrade issue with EC volumes, as described by you.
   This is solved with patch [1]. There was a bug in ctime posix
where it was creating the xattr even when ctime is not set on the client
(during the utimes system call). With patch [1], the behavior
is that the utimes system call will only update the
"trusted.glusterfs.mdata" xattr if it is already present; otherwise it won't
create it. New xattr creation should only happen during entry operations
(i.e. create, mknod and others).
   So there won't be any problems with the upgrade. I think we don't need a new
option dependent on op-version, if I am not wrong.

2. After the upgrade, how do we update the "trusted.glusterfs.mdata" xattr?
This mail thread was for this. Which approach is better here? I
understand that from the EC point of view the second approach is the best one.
The question I had was: can't EC treat 'trusted.glusterfs.mdata'
as a special xattr and add the logic to heal it from one subvolume (i.e.
remove the requirement of having consistent data on k subvolumes
in a k+r configuration)?

The second approach is independent of AFR and EC. So if we choose it,
do we need a new option to guard it? If the upgrade procedure is to upgrade the
servers first and then the clients, we don't need to guard, I think?
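
For reference, a legacy file is easy to spot on a brick because the xattr is
simply missing until it gets created; a quick check (the brick path below is
just an example):

    # a file created before ctime/utime were enabled has no mdata xattr yet
    getfattr -n trusted.glusterfs.mdata -e hex /bricks/brick1/legacy-file
    # expected output is roughly: "trusted.glusterfs.mdata: No such attribute";
    # with either approach above, the first lookup would populate it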

>
> Xavi
>
>
>>
>>
>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1593542

Re: [Gluster-devel] Bitrot: Time of signing depending on the file size???

2019-03-05 Thread Kotresh Hiremath Ravishankar
Hi David,

Thanks for raising the bug. But from the above validation, it's clear that
bitrot is not directly involved. Bitrot waits for the last fd to be closed. We
will have to investigate the reason for the fd not being closed for large files.
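
As a side note, whether a given file has been signed can be checked directly on
the brick (the brick path below is just an example):

    # the signature xattr appears on the backend file once the signer has run
    getfattr -n trusted.bit-rot.signature -e hex /mnt/bricks/<volname>/brick/<file>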

Thanks,
Kotresh HR

On Mon, Mar 4, 2019 at 3:13 PM David Spisla  wrote:

> Hello Kotresh,
>
> Yes, the fd was still open for larger files. I could verify this with a
> 500MiB file and some smaller files. After a specific time only the fd for
> the 500MiB was up and the file still had no signature, for the smaller
> files there were no fds and they already had a signature. I don't know the
> reason for this. Maybe the client still keeps the fd open? I opened a bug for
> this:
> https://bugzilla.redhat.com/show_bug.cgi?id=1685023
>
> Regards
> David
>
> Am Fr., 1. März 2019 um 18:29 Uhr schrieb Kotresh Hiremath Ravishankar <
> khire...@redhat.com>:
>
>> Interesting observation! But as discussed in the thread, the bitrot signing
>> process depends on a 2 min timeout (by default) after the last fd closes. It
>> doesn't have any correlation with the size of the file.
>> Did you happen to verify that the fd was still open for large files for
>> some reason?
>>
>>
>>
>> On Fri, Mar 1, 2019 at 1:19 PM David Spisla  wrote:
>>
>>> Hello folks,
>>>
>>> I did some observations concerning the bitrot daemon. It seems to be
>>> that the bitrot signer is signing files depending on file size. I copied
>>> files with different sizes into a volume and I was wondering because the
>>> files do not get their signature at the same time (I keep the expiry time
>>> at the default of 120). Here are some examples:
>>>
>>> 300 KB file ~2-3 m
>>> 70 MB file ~ 40 m
>>> 115 MB file ~ 1.5 h
>>> 800 MB file ~ 4,5 h
>>>
>>> What is the expected behaviour here?
>>> Why does it take so long to sign a 800MB file?
>>> What about 500GB or 1TB?
>>> Is there a way to speed up the sign process?
>>>
>>> My ambition is to understand this observation
>>>
>>> Regards
>>> David Spisla
>>
>>
>>
>> --
>> Thanks and Regards,
>> Kotresh H R
>>
>

-- 
Thanks and Regards,
Kotresh H R

Re: [Gluster-devel] Bitrot: Time of signing depending on the file size???

2019-03-01 Thread Kotresh Hiremath Ravishankar
Interesting observation! But as discussed in the thread, the bitrot signing
process depends on a 2 min timeout (by default) after the last fd closes. It
doesn't have any correlation with the size of the file.
Did you happen to verify that the fd was still open for large files for
some reason?
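
For reference, a rough way to check that on the brick node (the brick path and
file name below are placeholders):

    # find the brick process for the suspect brick, then look for the file
    # among its open fds
    BRICK_PID=$(pgrep -f 'glusterfsd.*<brick-path>' | head -n 1)
    ls -l /proc/"$BRICK_PID"/fd | grep '<filename>'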



On Fri, Mar 1, 2019 at 1:19 PM David Spisla  wrote:

> Hello folks,
>
> I did some observations concerning the bitrot daemon. It seems to be that
> the bitrot signer is signing files depending on file size. I copied files
> with different sizes into a volume and I was wondering because the files
> do not get their signature at the same time (I keep the expiry time at the
> default of 120). Here are some examples:
>
> 300 KB file ~2-3 m
> 70 MB file ~ 40 m
> 115 MB file ~ 1.5 h
> 800 MB file ~ 4,5 h
>
> What is the expected behaviour here?
> Why does it take so long to sign a 800MB file?
> What about 500GB or 1TB?
> Is there a way to speed up the sign process?
>
> My ambition is to understand this observation
>
> Regards
> David Spisla



-- 
Thanks and Regards,
Kotresh H R

Re: [Gluster-devel] Geo-rep tests failing on master Cent7-regressions

2018-12-04 Thread Kotresh Hiremath Ravishankar
On Tue, Dec 4, 2018 at 10:02 PM Amar Tumballi  wrote:

> Looks like that is correct, but that also is failing in another regression
> shard/zero-flag.t
>
It's not related to this, as it doesn't involve any code changes. The changes
are restricted to tests.


> On Tue, Dec 4, 2018 at 7:40 PM Shyam Ranganathan 
> wrote:
>
>> Hi Kotresh,
>>
>> Multiple geo-rep tests are failing on master on various patch regressions.
>>
>> Looks like you have put in
>> https://review.gluster.org/c/glusterfs/+/21794 for review, to address
>> the issue at present.
>>
>> Would that be correct?
>>
> Yes!

>
>> Thanks,
>> Shyam
>>
>>
>>
>
> --
> Amar Tumballi (amarts)



-- 
Thanks and Regards,
Kotresh H R

Re: [Gluster-devel] [Gluster-Maintainers] Release 5: Branched and further dates

2018-10-08 Thread Kotresh Hiremath Ravishankar
Had forgot to add milind, ccing.

On Mon, Oct 8, 2018 at 11:41 AM Kotresh Hiremath Ravishankar <
khire...@redhat.com> wrote:

>
>
> On Fri, Oct 5, 2018 at 10:31 PM Shyam Ranganathan 
> wrote:
>
>> On 10/05/2018 10:59 AM, Shyam Ranganathan wrote:
>> > On 10/04/2018 11:33 AM, Shyam Ranganathan wrote:
>> >> On 09/13/2018 11:10 AM, Shyam Ranganathan wrote:
>> >>> RC1 would be around 24th of Sep. with final release tagging around 1st
>> >>> of Oct.
>> >> RC1 now stands to be tagged tomorrow, and patches that are being
>> >> targeted for a back port include,
>> > We still are awaiting release notes (other than the bugs section) to be
>> > closed.
>> >
>> > There is one new bug that needs attention from the replicate team.
>> > https://bugzilla.redhat.com/show_bug.cgi?id=1636502
>> >
>> > The above looks important to me to be fixed before the release, @ravi or
>> > @pranith can you take a look?
>> >
>>
>> RC1 is tagged and release tarball generated.
>>
>> We still have 2 issues to work on,
>>
>> 1. The above messages from AFR in self heal logs
>>
>> 2. We need to test with Py3, else we risk putting out packages there on
>> Py3 default distros and causing some mayhem if basic things fail.
>>
>> I am open to suggestions on how to ensure we work with Py3, thoughts?
>>
>> I am thinking we run a regression on F28 (or a platform that defaults to
>> Py3) and ensure regressions are passing at the very least. For other
>> Python code that regressions do not cover,
>> - We have a list at [1]
>> - How can we split ownership of these?
>>
>
> +1 for the regression run on a py3-default platform. We don't need to run
> full regressions.
> We can choose to run only those test cases related to python. Categorically
> we have
> 1. geo-rep
> 2. events framework
> 3. glusterfind
> 4. tools/scripts
>
> I can take care of geo-rep. With following two patches, geo-rep works both
> on py2 and py3.
> I have tested these locally on centos-7.5 (py2 is default) and fedora28
> (making py3 default by
> symlink /usr/bin/python -> python3). Again the test was very basic, we can
> fix going forward,
> if there are any corner cases.
>
> 1. https://review.gluster.org/#/c/glusterfs/+/21356/ (Though this is
> events patch, geo-rep internally uses it, so is required for geo-rep.)
> 2. https://review.gluster.org/#/c/glusterfs/+/21357/
>
> I think we need to add regression tests for events and glusterfind.
> Adding Milind to comment on glusterfind.
>
>>
>> @Aravinda, @Kotresh, and @ppai, looking to you folks to help out with
>> the process and needs here.
>>
>> Shyam
>>
>> [1] https://github.com/gluster/glusterfs/issues/411
>>
>
>
> --
> Thanks and Regards,
> Kotresh H R
>


-- 
Thanks and Regards,
Kotresh H R

Re: [Gluster-devel] [Gluster-Maintainers] Release 5: Branched and further dates

2018-10-08 Thread Kotresh Hiremath Ravishankar
On Fri, Oct 5, 2018 at 10:31 PM Shyam Ranganathan 
wrote:

> On 10/05/2018 10:59 AM, Shyam Ranganathan wrote:
> > On 10/04/2018 11:33 AM, Shyam Ranganathan wrote:
> >> On 09/13/2018 11:10 AM, Shyam Ranganathan wrote:
> >>> RC1 would be around 24th of Sep. with final release tagging around 1st
> >>> of Oct.
> >> RC1 now stands to be tagged tomorrow, and patches that are being
> >> targeted for a back port include,
> > We still are awaiting release notes (other than the bugs section) to be
> > closed.
> >
> > There is one new bug that needs attention from the replicate team.
> > https://bugzilla.redhat.com/show_bug.cgi?id=1636502
> >
> > The above looks important to me to be fixed before the release, @ravi or
> > @pranith can you take a look?
> >
>
> RC1 is tagged and release tarball generated.
>
> We still have 2 issues to work on,
>
> 1. The above messages from AFR in self heal logs
>
> 2. We need to test with Py3, else we risk putting out packages there on
> Py3 default distros and causing some mayhem if basic things fail.
>
> I am open to suggestions on how to ensure we work with Py3, thoughts?
>
> I am thinking we run a regression on F28 (or a platform that defaults to
> Py3) and ensure regressions are passing at the very least. For other
> Python code that regressions do not cover,
> - We have a list at [1]
> - How can we split ownership of these?
>

+1 for the regression run on a py3-default platform. We don't need to run
full regressions.
We can choose to run only those test cases related to python. Categorically
we have
1. geo-rep
2. events framework
3. glusterfind
4. tools/scripts

I can take care of geo-rep. With following two patches, geo-rep works both
on py2 and py3.
I have tested these locally on centos-7.5 (py2 is default) and fedora28
(making py3 default by
symlink /usr/bin/python -> python3). Again the test was very basic, we can
fix going forward,
if there are any corner cases.

1. https://review.gluster.org/#/c/glusterfs/+/21356/ (Though this is events
patch, geo-rep internally uses it, so is required for geo-rep.)
2. https://review.gluster.org/#/c/glusterfs/+/21357/

I think we need to add regression tests for events and glusterfind.
Adding Milind to comment on glusterfind.
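
For anyone wanting to reproduce the py3-default setup mentioned above, this is
roughly what I did (on a throwaway Fedora 28 machine only, since it changes the
system default interpreter):

    # make python3 the system default 'python' on the test box
    ln -sf /usr/bin/python3 /usr/bin/python
    python --version   # should now report a 3.x version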

>
> @Aravinda, @Kotresh, and @ppai, looking to you folks to help out with
> the process and needs here.
>
> Shyam
>
> [1] https://github.com/gluster/glusterfs/issues/411
>


-- 
Thanks and Regards,
Kotresh H R

Re: [Gluster-devel] Release 5: Branched and further dates

2018-10-04 Thread Kotresh Hiremath Ravishankar
On Thu, Oct 4, 2018 at 9:03 PM Shyam Ranganathan 
wrote:

> On 09/13/2018 11:10 AM, Shyam Ranganathan wrote:
> > RC1 would be around 24th of Sep. with final release tagging around 1st
> > of Oct.
>
> RC1 now stands to be tagged tomorrow, and patches that are being
> targeted for a back port include,
>
> 1) https://review.gluster.org/c/glusterfs/+/21314 (snapshot volfile in
> mux cases)
>
> @RaBhat working on this.
>
> 2) Py3 corrections in master
>
> @Kotresh are all changes made to master backported to release-5 (may not
> be merged, but looking at if they are backported and ready for merge)?
>

All changes made to master are backported to release-5. But py3 support is
still not complete.

>
> 3) Release notes review and updates with GD2 content pending
>
> @Kaushal/GD2 team can we get the updates as required?
> https://review.gluster.org/c/glusterfs/+/21303
>
> 4) This bug [2] was filed when we released 4.0.
>
> The issue has not bitten us in 4.0 or in 4.1 (yet!) (i.e the options
> missing and hence post-upgrade clients failing the mount). This is
> possibly the last chance to fix it.
>
> Glusterd and protocol maintainers, can you chime in, if this bug needs
> to be and can be fixed? (thanks to @anoopcs for pointing it out to me)
>
> The tracker bug [1] does not have any other blockers against it, hence
> assuming we are not tracking/waiting on anything other than the set above.
>
> Thanks,
> Shyam
>
> [1] Tracker: https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-5.0
> [2] Potential upgrade bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=1540659
>


-- 
Thanks and Regards,
Kotresh H R

Re: [Gluster-devel] Python3 build process

2018-09-27 Thread Kotresh Hiremath Ravishankar
On Thu, Sep 27, 2018 at 5:38 PM Kaleb S. KEITHLEY 
wrote:

> On 9/26/18 8:28 PM, Shyam Ranganathan wrote:
> > Hi,
> >
> > With the introduction of default python 3 shebangs and the change in
> > configure.ac to correct these to py2 if the build is being attempted on
> > a machine that does not have py3, there are a couple of issues
> > uncovered. Here is the plan to fix the same, suggestions welcome.
> >
> > Issues:
> > - A configure job is run when creating the dist tarball, and this runs
> > on non py3 platforms, hence changing the dist tarball to basically have
> > py2 shebangs, as a result the release-new build job always outputs py
> > files with the py2 shebang. See tarball in [1]
> >
> > - All regression hosts are currently py2 and so if we do not run the py
> > shebang correction during configure (as we do not build and test from
> > RPMS), we would be running with incorrect py3 shebangs (although this
> > seems to work, see [2]. @kotresh can we understand why?)
>
> Is it because we don't test any of the python in the regression tests?
>
> Geo-replication does have regression tests, but I am not sure about glusterfind and
events.

Or because when we do, we invoke python scripts with `python foo.py` or
> `$PYTHON foo.py` everywhere? The shebangs are ignored when scripts are
> invoked this way.
>
Geo-rep is passing for the same reason mentioned above: the geo-rep
python file is invoked from a C program, always prefixing it with python, as
follows.

python = getenv("PYTHON");
if (!python)
    python = PYTHON;
nargv[j++] = python;
nargv[j++] = GSYNCD_PREFIX "/python/syncdaemon/" GSYNCD_PY;
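
A small demonstration of why the shebang does not matter in this code path (the
file path is just an example, and python3 is assumed to be installed):

    # the interpreter named on the command line wins over the shebang
    printf '#!/usr/bin/python3\nimport sys\nprint(sys.version_info.major)\n' > /tmp/shebang-demo.py
    python2 /tmp/shebang-demo.py     # prints 2 -- shebang ignored
    chmod +x /tmp/shebang-demo.py
    /tmp/shebang-demo.py             # prints 3 -- shebang honored only here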

>
> > Plan to address the above is detailed in this bug [3].
> >
> > The thought is,
> > - Add a configure option "--enable-py-version-correction" to configure,
> > that is disabled by default
>
> "correction" implies there's something that's incorrect. How about
> "conversion" or perhaps just --enable-python2
>
> >
> > - All regression jobs will run with the above option, and hence this
> > will correct the py shebangs in the regression machines. In the future
> > as we run on both py2 and py3 machines, this will run with the right
> > python shebangs on these machines.
> >
> > - The packaging jobs will now run the py version detection and shebang
> > correction during actual build and packaging, Kaleb already has put up a
> > patch for the same [2].
> >
> > Thoughts?
> >
>
> Also note that until --enable-whatever is added to configure(.ac), if
> you're building and testing any of the python bits on RHEL or CentOS
> you'll need to convert the shebangs. Perhaps the easiest way to do that
> now (master branch and release-5 branch) is to build+install rpms.
>
> If you're currently doing
>
>   `git clone; ./autogen.sh; ./configure; make; make install`
>
> then change that to
>
>   `git clone; ./autogen.sh; ./configure; make -C extras/LinuxRPMS
> glusterrpms`
>
> and then yum install those rpms. The added advantage is that it's easier
> to remove rpms than anything installed with `make install`.
>
> If you're developing on Fedora (hopefully 27 or later) or Debian or
> Ubuntu you don't need to do anything different as they all have python3.
>
> --
>
> Kaleb
>


-- 
Thanks and Regards,
Kotresh H R

Re: [Gluster-devel] Clang-Formatter for GlusterFS.

2018-09-18 Thread Kotresh Hiremath Ravishankar
On Tue, Sep 18, 2018 at 2:44 PM, Amar Tumballi  wrote:

>
>
> On Tue, Sep 18, 2018 at 2:33 PM, Kotresh Hiremath Ravishankar <
> khire...@redhat.com> wrote:
>
>> I have a different problem. clang is complaining on the 4.1 backport of
>> a patch which was merged in master before
>> clang-format was brought in. Is there a way I can get smoke +1 for 4.1, as
>> it won't be neat to have clang changes
>> in 4.1 and not in master for the same patch? It might further affect
>> clean backports.
>>
>>
> This is a bug.. please file an 'project-infrastructure' bug to disable
> clang-format job in release branches (other than release-5 branch).
>
ok, done https://bugzilla.redhat.com/show_bug.cgi?id=1630259

>
> -Amar
>
>
>> - Kotresh HR
>>
>> On Tue, Sep 18, 2018 at 2:13 PM, Ravishankar N 
>> wrote:
>>
>>>
>>>
>>> On 09/18/2018 02:02 PM, Hari Gowtham wrote:
>>>
>>>> I see that the procedure mentioned in the coding standard document is
>>>> buggy.
>>>>
>>>> git show --pretty="format:" --name-only | grep -v "contrib/" | egrep
>>>> "*\.[ch]$" | xargs clang-format -i
>>>>
>>>> The above command edited the whole file, which is not supposed to
>>>> happen.
>>>>
>>> It works fine on fedora 28 (clang version 6.0.1). I had the same problem
>>> you faced on fedora 26 though, presumably because of the older clang
>>> version.
>>> -Ravi
>>>
>>>
>>>
>>>> +1 for the readability of the code having been affected.
>>>> On Mon, Sep 17, 2018 at 10:45 AM Amar Tumballi 
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Mon, Sep 17, 2018 at 10:00 AM, Ravishankar N <
>>>>> ravishan...@redhat.com> wrote:
>>>>>
>>>>>>
>>>>>> On 09/13/2018 03:34 PM, Niels de Vos wrote:
>>>>>>
>>>>>>> On Thu, Sep 13, 2018 at 02:25:22PM +0530, Ravishankar N wrote:
>>>>>>> ...
>>>>>>>
>>>>>>>> What rules does clang impose on function/argument wrapping and
>>>>>>>> alignment? I
>>>>>>>> somehow found the new code wrapping to be random and highly
>>>>>>>> unreadable. An
>>>>>>>> example of 'before and after' the clang format patches went in:
>>>>>>>> https://paste.fedoraproject.org/paste/dC~aRCzYgliqucGYIzxPrQ
>>>>>>>> Wondering if
>>>>>>>> this is just me or is it some problem of spurious clang fixes.
>>>>>>>>
>>>>>>> I agree that this example looks pretty ugly. Looking at random
>>>>>>> changes
>>>>>>> to the code where I am most active does not show this awkward
>>>>>>> formatting.
>>>>>>>
>>>>>>
>>>>>> So one of my recent patches is failing smoke and clang-format is
>>>>>> insisting [https://build.gluster.org/job/clang-format/22/console] on
>>>>>> wrapping function arguments in an unsightly manner. Should I resend my
>>>>>> patch with this new style of wrapping ?
>>>>>>
>>>>>> I would say yes! We will get better, by changing options of
>>>>> clang-format once we get better options there. But for now, just following
>>>>> the option suggested by clang-format job is good IMO.
>>>>>
>>>>> -Amar
>>>>>
>>>>> Regards,
>>>>>> Ravi
>>>>>>
>>>>>>
>>>>>>
>>>>>> However, I was expecting to see enforcing of the
>>>>>>> single-line-if-statements like this (and while/for/.. loops):
>>>>>>>
>>>>>>>   if (need_to_do_it) {
>>>>>>>do_it();
>>>>>>>   }
>>>>>>>
>>>>>>> instead of
>>>>>>>
>>>>>>>   if (need_to_do_it)
>>>>>>>do_it();
>>>>>>>
>>>>>>> At least the conversion did not take care of this. But, maybe I'm
>>>>>>> wrong
>>>>>>> as I can not find the discussion in https://bugzilla.redhat.com/15
>>>>>>> 64149
>>>>>>> about this. Does someone remember what was decided in the end?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Niels
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Amar Tumballi (amarts)
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>> --
>> Thanks and Regards,
>> Kotresh H R
>>
>
>
>
> --
> Amar Tumballi (amarts)
>



-- 
Thanks and Regards,
Kotresh H R

Re: [Gluster-devel] Clang-Formatter for GlusterFS.

2018-09-18 Thread Kotresh Hiremath Ravishankar
I have a different problem. clang is complaining on the 4.1 backport of a
patch which was merged in master before
clang-format was brought in. Is there a way I can get smoke +1 for 4.1, as it
won't be neat to have clang changes
in 4.1 and not in master for the same patch? It might further affect clean
backports.

- Kotresh HR

On Tue, Sep 18, 2018 at 2:13 PM, Ravishankar N 
wrote:

>
>
> On 09/18/2018 02:02 PM, Hari Gowtham wrote:
>
>> I see that the procedure mentioned in the coding standard document is
>> buggy.
>>
>> git show --pretty="format:" --name-only | grep -v "contrib/" | egrep
>> "*\.[ch]$" | xargs clang-format -i
>>
>> The above command edited the whole file, which is not supposed to happen.
>>
> It works fine on fedora 28 (clang version 6.0.1). I had the same problem
> you faced on fedora 26 though, presumably because of the older clang
> version.
> -Ravi
>
>
>
>> +1 for the readability of the code having been affected.
>> On Mon, Sep 17, 2018 at 10:45 AM Amar Tumballi 
>> wrote:
>>
>>>
>>>
>>> On Mon, Sep 17, 2018 at 10:00 AM, Ravishankar N 
>>> wrote:
>>>

 On 09/13/2018 03:34 PM, Niels de Vos wrote:

> On Thu, Sep 13, 2018 at 02:25:22PM +0530, Ravishankar N wrote:
> ...
>
>> What rules does clang impose on function/argument wrapping and
>> alignment? I
>> somehow found the new code wrapping to be random and highly
>> unreadable. An
>> example of 'before and after' the clang format patches went in:
>> https://paste.fedoraproject.org/paste/dC~aRCzYgliqucGYIzxPrQ
>> Wondering if
>> this is just me or is it some problem of spurious clang fixes.
>>
> I agree that this example looks pretty ugly. Looking at random changes
> to the code where I am most active does not show this awkward
> formatting.
>

 So one of my recent patches is failing smoke and clang-format is
 insisting [https://build.gluster.org/job/clang-format/22/console] on
 wrapping function arguments in an unsightly manner. Should I resend my
 patch with this new style of wrapping ?

 I would say yes! We will get better, by changing options of
>>> clang-format once we get better options there. But for now, just following
>>> the option suggested by clang-format job is good IMO.
>>>
>>> -Amar
>>>
>>> Regards,
 Ravi



 However, I was expecting to see enforcing of the
> single-line-if-statements like this (and while/for/.. loops):
>
>   if (need_to_do_it) {
>do_it();
>   }
>
> instead of
>
>   if (need_to_do_it)
>do_it();
>
> At least the conversion did not take care of this. But, maybe I'm wrong
> as I can not find the discussion in https://bugzilla.redhat.com/15
> 64149
> about this. Does someone remember what was decided in the end?
>
> Thanks,
> Niels
>


>>>
>>> --
>>> Amar Tumballi (amarts)
>>>
>>
>>
>>
>



-- 
Thanks and Regards,
Kotresh H R

Re: [Gluster-devel] Cloudsync with AFR

2018-09-16 Thread Kotresh Hiremath Ravishankar
Hi Anuradha,

To enable the ctime (consistent time) feature, please enable the following two
options:

gluster vol set <volname> utime on
gluster vol set <volname> ctime on
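
To confirm the options took effect, they can be queried back (the volume name
is a placeholder):

    gluster volume get <volname> all | grep -E 'ctime|utime'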

Thanks,
Kotresh HR

On Fri, Sep 14, 2018 at 12:18 PM, Rafi Kavungal Chundattu Parambil <
rkavu...@redhat.com> wrote:

> Hi Anuradha,
>
> We have an xlator to provide consistent time across the replica set. You can
> enable this xlator to get consistent mtime, atime and ctime.
>
>
> Regards
> Rafi KC
>
> - Original Message -
> From: "Anuradha Talur" 
> To: gluster-devel@gluster.org
> Cc: ama...@redhat.com, "Ram Ankireddypalle" ,
> "Sachin Pandit" 
> Sent: Thursday, September 13, 2018 7:19:26 AM
> Subject: [Gluster-devel] Cloudsync with AFR
>
>
>
> Hi,
>
> We recently started testing cloudsync xlator on a replica volume.
> And we have noticed a few issues. We would like some advice on how to
> proceed with them.
>
>
>
> 1) As we know, when stubbing a file cloudsync uses mtime of files to
> decide whether a file should be truncated or not.
>
> If the mtime provided as part of the setfattr operation is lesser than the
> current mtime of the file on brick, stubbing isn't completed.
>
> This works fine in a plain distribute volume. But in case of a replica
> volume, the mtime could be different for the files on each of the replica
> brick.
>
>
> During our testing we came across the following scenario for a replica 3
> volume with 3 bricks:
>
> We performed `setfattr -n "trusted.glusterfs.csou.complete" -v m1 file1`
> from our gluster mount to stub the files.
> It so happened that on brick1 this operation succeeded and truncated file1
> as it should have. But on brick2 and brick3, mtime found on file1
> was greater than m1, leading to failure there.
>
> From AFR's perspective this operation failed as a whole because quorum
> could not be met. But on the brick where this setxattr succeeded, truncate
> was already performed. So now we have one of the replica bricks out of sync
> and AFR has no awareness of this. This file needs to be rolled back to its
> state before the
>
>
> setfattr.
>
> Ideally, it appears that we should add intelligence in AFR to handle this.
> How do you suggest we do that?
>
>
> The case is also applicable to EC volumes of course.
>
> 2) Given that cloudsync depends on mtime to make the decision of
> truncating, how do we ensure that we don't end up in this situation again?
>
> Thanks,
> Anuradha
> ***Legal Disclaimer***
> "This communication may contain confidential and privileged material for
> the
> sole use of the intended recipient. Any unauthorized review, use or
> distribution
> by others is strictly prohibited. If you have received the message by
> mistake,
> please advise the sender by reply email and delete the message. Thank
> you."
> **
>
>



-- 
Thanks and Regards,
Kotresh H R

Re: [Gluster-devel] [Gluster-infra] Setting up machines from softserve in under 5 mins

2018-08-14 Thread Kotresh Hiremath Ravishankar
In /etc/hosts, I think it is adding a different IP.

On Mon, Aug 13, 2018 at 5:59 PM, Rafi Kavungal Chundattu Parambil <
rkavu...@redhat.com> wrote:

> This is so nice. I tried it and successfully created a test machine. It
> would be great if there were a provision to extend the lifetime of VMs
> beyond the time provided during creation. First I ran the ansible-playbook
> from the VM itself, then I realized that it has to be executed from an outside
> machine. Maybe we can mention that info in the doc.
>
> Regards
> Rafi KC
>
> - Original Message -
> From: "Nigel Babu" 
> To: "gluster-devel" 
> Cc: "gluster-infra" 
> Sent: Monday, August 13, 2018 3:38:17 PM
> Subject: [Gluster-devel] Setting up machines from softserve in under 5 mins
>
> Hello folks,
>
> Deepshikha did the work a while ago to make loaning a machine and running
> your regressions on it faster. I've tested it a few times today
> to confirm it works as expected. In the past, Softserve[1] machines would
> be a clean CentOS 7 image. Now, we have an image with all the dependencies
> installed and *almost* set up to run regressions. It just needs a few steps
> run on it, and we have a simplified playbook that will set up *just* those
> steps. This brings down the time to set up a machine from around 30 mins to
> less than 5 mins. The instructions[2] are on the softserve wiki for now,
> but will move to the site itself in the future.
>
> Please let us know if you face troubles by filing a bug.[3]
> [1]: https://softserve.gluster.org/
> [2]: https://github.com/gluster/softserve/wiki/Running-
> Regressions-on-loaned-Softserve-instances
> [3]: https://bugzilla.redhat.com/enter_bug.cgi?product=
> GlusterFS=project-infrastructure
>
> --
> nigelb
>
>



-- 
Thanks and Regards,
Kotresh H R

Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status (Wed, August 08th)

2018-08-10 Thread Kotresh Hiremath Ravishankar
Hi Shyam/Atin,

I have posted patch [1] for the geo-rep test case failures:
tests/00-geo-rep/georep-basic-dr-rsync.t
tests/00-geo-rep/georep-basic-dr-tarssh.t
tests/00-geo-rep/00-georep-verify-setup.t

Please include patch [1] while triggering tests.
The instrumentation patch [2] which was included can be removed.

[1]  https://review.gluster.org/#/c/glusterfs/+/20704/
[2]  https://review.gluster.org/#/c/glusterfs/+/20477/

Thanks,
Kotresh HR




On Fri, Aug 10, 2018 at 3:21 PM, Pranith Kumar Karampuri <
pkara...@redhat.com> wrote:

>
>
> On Thu, Aug 9, 2018 at 4:02 PM Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
>
>>
>>
>> On Thu, Aug 9, 2018 at 6:34 AM Shyam Ranganathan 
>> wrote:
>>
>>> Today's patch set 7 [1], included fixes provided till last evening IST,
>>> and its runs can be seen here [2] (yay! we can link to comments in
>>> gerrit now).
>>>
>>> New failures: (added to the spreadsheet)
>>> ./tests/bugs/protocol/bug-808400-repl.t (core dumped)
>>> ./tests/bugs/quick-read/bug-846240.t
>>>
>>> Older tests that had not recurred, but failed today: (moved up in the
>>> spreadsheet)
>>> ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
>>> ./tests/bugs/index/bug-1559004-EMLINK-handling.t
>>>
>>
>> The above test is timing out. I had to increase the timeout while adding
>> the .t so that creation of maximum number of links that will max-out in
>> ext4. Will re-check if it is the same issue and get back.
>>
>
> This test is timing out with lcov. I bumped up timeout to 30 minutes @
> https://review.gluster.org/#/c/glusterfs/+/20699, I am not happy that
> this test takes so long, but without this it is difficult to find
> regression on ext4 which has limits on number of hardlinks in a
> directory(It took us almost one year after we introduced regression to find
> this problem when we did introduce regression last time). If there is a way
> of running this .t once per day and before each release. I will be happy to
> make it part of that. Let me know.
>
>
>>
>>
>>>
>>> Other issues;
>>> Test ./tests/basic/ec/ec-5-2.t core dumped again
>>> Few geo-rep failures, Kotresh should have more logs to look at with
>>> these runs
>>> Test ./tests/bugs/glusterd/quorum-validation.t dumped core again
>>>
>>> Atin/Amar, we may need to merge some of the patches that have proven to
>>> be holding up and fixing issues today, so that we do not leave
>>> everything to the last. Check and move them along or lmk.
>>>
>>> Shyam
>>>
>>> [1] Patch set 7: https://review.gluster.org/c/glusterfs/+/20637/7
>>> [2] Runs against patch set 7 and its status (incomplete as some runs
>>> have not completed):
>>> https://review.gluster.org/c/glusterfs/+/20637/7#message-
>>> 37bc68ce6f2157f2947da6fd03b361ab1b0d1a77
>>> (also updated in the spreadsheet)
>>>
>>> On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
>>> > Deserves a new beginning, threads on the other mail have gone deep
>>> enough.
>>> >
>>> > NOTE: (5) below needs your attention, rest is just process and data on
>>> > how to find failures.
>>> >
>>> > 1) We are running the tests using the patch [2].
>>> >
>>> > 2) Run details are extracted into a separate sheet in [3] named "Run
>>> > Failures" use a search to find a failing test and the corresponding run
>>> > that it failed in.
>>> >
>>> > 3) Patches that are fixing issues can be found here [1], if you think
>>> > you have a patch out there, that is not in this list, shout out.
>>> >
>>> > 4) If you own up a test case failure, update the spreadsheet [3] with
>>> > your name against the test, and also update other details as needed (as
>>> > comments, as edit rights to the sheet are restricted).
>>> >
>>> > 5) Current test failures
>>> > We still have the following tests failing and some without any RCA or
>>> > attention, (If something is incorrect, write back).
>>> >
>>> > ./tests/bugs/replicate/bug-1290965-detect-bitrotten-objects.t (needs
>>> > attention)
>>> > ./tests/00-geo-rep/georep-basic-dr-tarssh.t (Kotresh)
>>> > ./tests/bugs/glusterd/add-brick-and-validate-replicated-
>>> volume-options.t
>>> > (Atin)
>>> > ./tests/bugs/ec/bug-1236065.t (Ashish)
>>> > ./tests/00-geo-rep/georep-basic-dr-rsync.t (Kotresh)
>>> > ./tests/basic/ec/ec-1468261.t (needs attention)
>>> > ./tests/basic/afr/add-brick-self-heal.t (needs attention)
>>> > ./tests/basic/afr/granular-esh/replace-brick.t (needs attention)
>>> > ./tests/bugs/core/multiplex-limit-issue-151.t (needs attention)
>>> > ./tests/bugs/glusterd/validating-server-quorum.t (Atin)
>>> > ./tests/bugs/replicate/bug-1363721.t (Ravi)
>>> >
>>> > Here are some newer failures, but mostly one-off failures except cores
>>> > in ec-5-2.t. All of the following need attention as these are new.
>>> >
>>> > ./tests/00-geo-rep/00-georep-verify-setup.t
>>> > ./tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t
>>> > ./tests/basic/stats-dump.t
>>> > ./tests/bugs/bug-1110262.t
>>> > ./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-
>>> 

Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status

2018-08-08 Thread Kotresh Hiremath Ravishankar
Hi Atin/Shyam

For the geo-rep test retries, could you take this instrumentation patch [1]
and give it a run?
I have tried thrice on the patch, with brick mux enabled and without, but
couldn't hit the geo-rep failure. Maybe it is some race and it's not
happening with the instrumentation patch.

[1] https://review.gluster.org/20477

Thanks,
Kotresh HR


On Wed, Aug 8, 2018 at 4:00 PM, Pranith Kumar Karampuri  wrote:

>
>
> On Wed, Aug 8, 2018 at 5:08 AM Shyam Ranganathan 
> wrote:
>
>> Deserves a new beginning, threads on the other mail have gone deep enough.
>>
>> NOTE: (5) below needs your attention, rest is just process and data on
>> how to find failures.
>>
>> 1) We are running the tests using the patch [2].
>>
>> 2) Run details are extracted into a separate sheet in [3] named "Run
>> Failures" use a search to find a failing test and the corresponding run
>> that it failed in.
>>
>> 3) Patches that are fixing issues can be found here [1], if you think
>> you have a patch out there, that is not in this list, shout out.
>>
>> 4) If you own up a test case failure, update the spreadsheet [3] with
>> your name against the test, and also update other details as needed (as
>> comments, as edit rights to the sheet are restricted).
>>
>> 5) Current test failures
>> We still have the following tests failing and some without any RCA or
>> attention, (If something is incorrect, write back).
>>
>> ./tests/bugs/replicate/bug-1290965-detect-bitrotten-objects.t (needs
>> attention)
>> ./tests/00-geo-rep/georep-basic-dr-tarssh.t (Kotresh)
>> ./tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
>> (Atin)
>> ./tests/bugs/ec/bug-1236065.t (Ashish)
>> ./tests/00-geo-rep/georep-basic-dr-rsync.t (Kotresh)
>> ./tests/basic/ec/ec-1468261.t (needs attention)
>> ./tests/basic/afr/add-brick-self-heal.t (needs attention)
>> ./tests/basic/afr/granular-esh/replace-brick.t (needs attention)
>>
>
> Sent https://review.gluster.org/#/c/glusterfs/+/20681 for the failure
> above. Because it was retried there were no logs. Entry heal succeeded but
> data/metadata heal after that didn't succeed. Found only one case based on
> code reading and the point at which it failed in .t
>
>
>> ./tests/bugs/core/multiplex-limit-issue-151.t (needs attention)
>> ./tests/bugs/glusterd/validating-server-quorum.t (Atin)
>> ./tests/bugs/replicate/bug-1363721.t (Ravi)
>>
>> Here are some newer failures, but mostly one-off failures except cores
>> in ec-5-2.t. All of the following need attention as these are new.
>>
>> ./tests/00-geo-rep/00-georep-verify-setup.t
>> ./tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t
>> ./tests/basic/stats-dump.t
>> ./tests/bugs/bug-1110262.t
>> ./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-
>> post-glusterd-restart.t
>> ./tests/basic/ec/ec-data-heal.t
>> ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t
>> ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-
>> other-processes-accessing-mounted-path.t
>> ./tests/basic/ec/ec-5-2.t
>>
>> 6) Tests that are addressed or are not occurring anymore are,
>>
>> ./tests/bugs/glusterd/rebalance-operations-in-single-node.t
>> ./tests/bugs/index/bug-1559004-EMLINK-handling.t
>> ./tests/bugs/replicate/bug-1386188-sbrain-fav-child.t
>> ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
>> ./tests/bitrot/bug-1373520.t
>> ./tests/bugs/distribute/bug-1117851.t
>> ./tests/bugs/glusterd/quorum-validation.t
>> ./tests/bugs/distribute/bug-1042725.t
>> ./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-
>> txn-on-quorum-failure.t
>> ./tests/bugs/quota/bug-1293601.t
>> ./tests/bugs/bug-1368312.t
>> ./tests/bugs/distribute/bug-1122443.t
>> ./tests/bugs/core/bug-1432542-mpx-restart-crash.t
>>
>> Shyam (and Atin)
>>
>> On 08/05/2018 06:24 PM, Shyam Ranganathan wrote:
>> > Health on master as of the last nightly run [4] is still the same.
>> >
>> > Potential patches that rectify the situation (as in [1]) are bunched in
>> > a patch [2] that Atin and myself have put through several regressions
>> > (mux, normal and line coverage) and these have also not passed.
>> >
>> > Till we rectify the situation we are locking down master branch commit
>> > rights to the following people, Amar, Atin, Shyam, Vijay.
>> >
>> > The intention is to stabilize master and not add more patches that may
>> > destabilize it.
>> >
>> > Test cases that are tracked as failures and need action are present here
>> > [3].
>> >
>> > @Nigel, request you to apply the commit rights change as you see this
>> > mail and let the list know regarding the same as well.
>> >
>> > Thanks,
>> > Shyam
>> >
>> > [1] Patches that address regression failures:
>> > https://review.gluster.org/#/q/starredby:srangana%2540redhat.com
>> >
>> > [2] Bunched up patch against which regressions were run:
>> > https://review.gluster.org/#/c/20637
>> >
>> > [3] Failing tests list:
>> > https://docs.google.com/spreadsheets/d/1IF9GhpKah4bto19RQLr0y_Kkw26E_
>> -crKALHSaSjZMQ/edit?usp=sharing
>> >
>> > [4] 

Re: [Gluster-devel] [Gluster-Maintainers] Release 5: Master branch health report (Week of 30th July)

2018-08-03 Thread Kotresh Hiremath Ravishankar
Hi Du/Poornima,

I was analysing the bitrot and geo-rep failures and I suspect there is a bug
in some perf xlator that is one of the causes. I was seeing the following
behaviour in a few runs.

1. Geo-rep synced data to the slave. It creates an empty file first and then
rsync syncs the data. But the test does a stat --format "%F" on the file
to confirm: if the file is empty, it returns "regular empty file", else
"regular file". I believe it kept getting "regular empty file" instead of
"regular file" until the timeout (see the small sketch after this list).

2. The other behaviour is with bitrot, with brick-mux. If a file is deleted
on the back end on one brick and a lookup is then done, which performance
xlators need to be disabled for the lookup/revalidate to reach the brick
where the file was deleted? Earlier, only md-cache was disabled and it used
to work. Now it's failing intermittently.
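
For reference, here is a minimal C sketch of the distinction point 1 relies
on (illustrative only, not part of the test suite; it just mirrors GNU stat's
"%F" output, which reports "regular empty file" when a regular file's
st_size is 0):

/*
 * What "stat --format %F" effectively distinguishes: the file exists on
 * the slave as soon as the entry is synced, but it reports "regular file"
 * only after rsync has written data into it.
 */
#include <stdio.h>
#include <sys/stat.h>

int
main (int argc, char *argv[])
{
        struct stat st;

        if (argc < 2 || stat (argv[1], &st) != 0) {
                perror ("stat");
                return 1;
        }

        if (!S_ISREG (st.st_mode))
                puts ("not a regular file");
        else if (st.st_size == 0)
                puts ("regular empty file");  /* entry created, data not yet synced */
        else
                puts ("regular file");        /* rsync has written the data */

        return 0;
}

So the test keeps polling until the output changes to "regular file"; if the
data sync is delayed long enough, it times out still seeing "regular empty
file".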

Are there any pending patches around these areas that need to be merged?
If there are, they could be affecting other tests as well.

Thanks,
Kotresh HR

On Fri, Aug 3, 2018 at 3:07 PM, Karthik Subrahmanya 
wrote:

>
>
> On Fri, Aug 3, 2018 at 2:12 PM Karthik Subrahmanya 
> wrote:
>
>>
>>
>> On Thu, Aug 2, 2018 at 11:00 PM Karthik Subrahmanya 
>> wrote:
>>
>>>
>>>
>>> On Tue 31 Jul, 2018, 10:17 PM Atin Mukherjee, 
>>> wrote:
>>>
 I just went through the nightly regression report of brick mux runs and
 here's what I can summarize.

 
 
 =============================================================
 Fails only with brick-mux
 =============================================================
 tests/bugs/core/bug-1432542-mpx-restart-crash.t - Times out even after
 400 secs. Refer https://fstat.gluster.org/failure/209?state=2_
 date=2018-06-30_date=2018-07-31=all, specifically the
 latest report https://build.gluster.org/job/
 regression-test-burn-in/4051/consoleText . Wasn't timing out as
 frequently as it was till 12 July. But since 27 July, it has timed out
 twice. Beginning to believe commit 9400b6f2c8aa219a493961e0ab9770b7f12e80d2
 has added the delay and now 400 secs isn't sufficient enough (Mohit?)

 tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
 (Ref - https://build.gluster.org/job/regression-test-with-
 multiplex/814/console) -  Test fails only in brick-mux mode, AI on
 Atin to look at and get back.

 tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t (
 https://build.gluster.org/job/regression-test-with-
 multiplex/813/console) - Seems like failed just twice in last 30 days
 as per https://fstat.gluster.org/failure/251?state=2_
 date=2018-06-30_date=2018-07-31=all. Need help from AFR
 team.

 tests/bugs/quota/bug-1293601.t (https://build.gluster.org/
 job/regression-test-with-multiplex/812/console) - Hasn't failed after
 26 July and earlier it was failing regularly. Did we fix this test through
 any patch (Mohit?)

 tests/bitrot/bug-1373520.t - (https://build.gluster.org/
 job/regression-test-with-multiplex/811/console)  - Hasn't failed after
 27 July and earlier it was failing regularly. Did we fix this test through
 any patch (Mohit?)

 tests/bugs/glusterd/remove-brick-testcases.t - Failed once with a
 core, not sure if related to brick mux or not, so not sure if brick mux is
 culprit here or not. Ref - https://build.gluster.org/job/
 regression-test-with-multiplex/806/console . Seems to be a glustershd
 crash. Need help from AFR folks.

 
 
 =============================================================
 Fails for non-brick mux case too
 =============================================================
 tests/bugs/distribute/bug-1122443.t - Seems to be failing on my setup
 very often, without brick mux as well. Refer
 https://build.gluster.org/job/regression-test-burn-in/4050/consoleText
 . There's an email in gluster-devel and a BZ 1610240 for the same.

 tests/bugs/bug-1368312.t - Seems to be recent failures (
 https://build.gluster.org/job/regression-test-with-
 multiplex/815/console) - seems to be a new failure, however seen this
 for a non-brick-mux case too - https://build.gluster.org/job/
 regression-test-burn-in/4039/consoleText . Need some eyes from AFR
 folks.

 tests/00-geo-rep/georep-basic-dr-tarssh.t - this isn't specific to
 brick mux, have seen this failing at multiple default regression runs.
 Refer 

Re: [Gluster-devel] [Gluster-Maintainers] Release 5: Master branch health report (Week of 30th July)

2018-08-02 Thread Kotresh Hiremath Ravishankar
I have attached the logs in the bug: https://bugzilla.redhat.com/show_bug.cgi?id=1611635


On Thu, 2 Aug 2018, 22:21 Raghavendra Gowdappa,  wrote:

>
>
> On Thu, Aug 2, 2018 at 5:48 PM, Kotresh Hiremath Ravishankar <
> khire...@redhat.com> wrote:
>
>> I am facing a different issue on the softserve machines. The fuse mount
>> itself is failing.
>> I tried day before yesterday to debug the geo-rep failures. I discussed
>> with Raghu,
>> but could not root-cause it.
>>
>
> Where can I find the complete client logs for this?
>
> So none of the tests were passing. It happened on
>> both machine instances I tried.
>>
>> 
>> [2018-07-31 10:41:49.288117] D [fuse-bridge.c:5407:notify] 0-fuse: got
>> event 6 on graph 0
>> [2018-07-31 10:41:49.289427] D [fuse-bridge.c:4990:fuse_get_mount_status]
>> 0-fuse: mount status is 0
>> [2018-07-31 10:41:49.289555] D [fuse-bridge.c:4256:fuse_init]
>> 0-glusterfs-fuse: Detected support for FUSE_AUTO_INVAL_DATA. Enabling
>> fopen_keep_cache automatically.
>> [2018-07-31 10:41:49.289591] T [fuse-bridge.c:278:send_fuse_iov]
>> 0-glusterfs-fuse: writev() result 40/40
>> [2018-07-31 10:41:49.289610] I [fuse-bridge.c:4314:fuse_init]
>> 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel
>> 7.22
>> [2018-07-31 10:41:49.289627] I [fuse-bridge.c:4948:fuse_graph_sync]
>> 0-fuse: switched to graph 0
>> [2018-07-31 10:41:49.289696] T [MSGID: 0] [syncop.c:1261:syncop_lookup]
>> 0-stack-trace: stack-address: 0x7f36e4001058, winding from fuse to
>> meta-autoload
>> [2018-07-31 10:41:49.289743] T [MSGID: 0]
>> [defaults.c:2716:default_lookup] 0-stack-trace: stack-address:
>> 0x7f36e4001058, winding from meta-autoload to master
>> [2018-07-31 10:41:49.289787] T [MSGID: 0]
>> [io-stats.c:2788:io_stats_lookup] 0-stack-trace: stack-address:
>> 0x7f36e4001058, winding from master to master-md-cache
>> [2018-07-31 10:41:49.289833] T [MSGID: 0]
>> [md-cache.c:513:mdc_inode_iatt_get] 0-md-cache: mdc_inode_ctx_get failed
>> (----0001)
>> [2018-07-31 10:41:49.289923] T [MSGID: 0] [md-cache.c:1200:mdc_lookup]
>> 0-stack-trace: stack-address: 0x7f36e4001058, winding from master-md-cache
>> to master-open-behind
>> [2018-07-31 10:41:49.289946] T [MSGID: 0]
>> [defaults.c:2716:default_lookup] 0-stack-trace: stack-address:
>> 0x7f36e4001058, winding from master-open-behind to master-quick-read
>> [2018-07-31 10:41:49.289973] T [MSGID: 0] [quick-read.c:556:qr_lookup]
>> 0-stack-trace: stack-address: 0x7f36e4001058, winding from
>> master-quick-read to master-io-cache
>> [2018-07-31 10:41:49.290002] T [MSGID: 0] [io-cache.c:298:ioc_lookup]
>> 0-stack-trace: stack-address: 0x7f36e4001058, winding from master-io-cache
>> to master-readdir-ahead
>> [2018-07-31 10:41:49.290034] T [MSGID: 0]
>> [defaults.c:2716:default_lookup] 0-stack-trace: stack-address:
>> 0x7f36e4001058, winding from master-readdir-ahead to master-read-ahead
>> [2018-07-31 10:41:49.290052] T [MSGID: 0]
>> [defaults.c:2716:default_lookup] 0-stack-trace: stack-address:
>> 0x7f36e4001058, winding from master-read-ahead to master-write-behind
>> [2018-07-31 10:41:49.290077] T [MSGID: 0] [write-behind.c:2439:wb_lookup]
>> 0-stack-trace: stack-address: 0x7f36e4001058, winding from
>> master-write-behind to master-dht
>> [2018-07-31 10:41:49.290156] D [MSGID: 0]
>> [dht-common.c:3674:dht_do_fresh_lookup] 0-master-dht: /: no subvolume in
>> layout for path, checking on all the subvols to see if it is a directory
>> [2018-07-31 10:41:49.290180] D [MSGID: 0]
>> [dht-common.c:3688:dht_do_fresh_lookup] 0-master-dht: /: Found null hashed
>> subvol. Calling lookup on all nodes.
>> [2018-07-31 10:41:49.290199] T [MSGID: 0]
>> [dht-common.c:3695:dht_do_fresh_lookup] 0-stack-trace: stack-address:
>> 0x7f36e4001058, winding from master-dht to master-replicate-0
>> [2018-07-31 10:41:49.290245] I [MSGID: 108006]
>> [afr-common.c:5582:afr_local_init] 0-master-replicate-0: no subvolumes up
>> [2018-07-31 10:41:49.290291] D [MSGID: 0]
>> [afr-common.c:3212:afr_discover] 0-stack-trace: stack-address:
>> 0x7f36e4001058, master-replicate-0 returned -1 error: Transport endpoint is
>> not conne
>> cted [Transport endpoint is not connected]
>> [2018-07-31 10:41:49.290323] D [MSGID: 0]
>> [dht-common.c:1391:dht_lookup_dir_cbk] 0-master-dht: lookup of / on
>> master-replicate-0 returned error [Transport endpoint is not connected]
>> [2018-07-31 10:41:49.290350] T [MSGID: 0]
>> [dht-common.c:3695:dht_do_fresh_lookup] 0-stack-trace: 

Re: [Gluster-devel] [Gluster-Maintainers] Release 5: Master branch health report (Week of 30th July)

2018-08-02 Thread Kotresh Hiremath Ravishankar
is not connected [Transport endpoint is not connected]
[2018-07-31 10:41:49.290504] D [MSGID: 0] [io-cache.c:268:ioc_lookup_cbk]
0-stack-trace: stack-address: 0x7f36e4001058, master-io-cache returned -1
error: Transport endpoint is not connected [Transport endpoint is not
connected]
[2018-07-31 10:41:49.290530] D [MSGID: 0] [quick-read.c:515:qr_lookup_cbk]
0-stack-trace: stack-address: 0x7f36e4001058, master-quick-read returned -1
error: Transport endpoint is not connected [Transport endpoint is not
connected]
[2018-07-31 10:41:49.290554] D [MSGID: 0] [md-cache.c:1130:mdc_lookup_cbk]
0-stack-trace: stack-address: 0x7f36e4001058, master-md-cache returned -1
error: Transport endpoint is not connected [Transport endpoint is not
connected]
[2018-07-31 10:41:49.290581] D [MSGID: 0]
[io-stats.c:2276:io_stats_lookup_cbk] 0-stack-trace: stack-address:
0x7f36e4001058, master returned -1 error: Transport endpoint is not
connected [Transport endpoint is not connected]
[2018-07-31 10:41:49.290626] E [fuse-bridge.c:4382:fuse_first_lookup]
0-fuse: first lookup on root failed (Transport endpoint is not connected)
-

On Thu, Aug 2, 2018 at 5:35 PM, Nigel Babu  wrote:

> On Thu, Aug 2, 2018 at 5:12 PM Kotresh Hiremath Ravishankar <
> khire...@redhat.com> wrote:
>
>> Don't know, something to do with perf xlators I suppose. It's not
>> reproduced on my local system even with brick-mux enabled. But it's
>> happening on Xavi's system.
>>
>> Xavi,
>> Could you try with the patch [1] and let me know whether it fixes the
>> issue.
>>
>> [1] https://review.gluster.org/#/c/20619/1
>>
>
> If you cannot reproduce it on your laptop, why don't you request a machine
> from softserve[1] and try it out?
>
> [1]: https://github.com/gluster/softserve/wiki/Running-
> Regressions-on-clean-Centos-7-machine
>
> --
> nigelb
>



-- 
Thanks and Regards,
Kotresh H R
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-Maintainers] Release 5: Master branch health report (Week of 30th July)

2018-08-02 Thread Kotresh Hiremath Ravishankar
On Thu, Aug 2, 2018 at 5:05 PM, Atin Mukherjee 
wrote:

>
>
> On Thu, Aug 2, 2018 at 4:37 PM Kotresh Hiremath Ravishankar <
> khire...@redhat.com> wrote:
>
>>
>>
>> On Thu, Aug 2, 2018 at 3:49 PM, Xavi Hernandez 
>> wrote:
>>
>>> On Thu, Aug 2, 2018 at 6:14 AM Atin Mukherjee 
>>> wrote:
>>>
>>>>
>>>>
>>>> On Tue, Jul 31, 2018 at 10:11 PM Atin Mukherjee 
>>>> wrote:
>>>>
>>>>> I just went through the nightly regression report of brick mux runs
>>>>> and here's what I can summarize.
>>>>>
>>>>> 
>>>>> 
>>>>> =============================================================
>>>>> Fails only with brick-mux
>>>>> =============================================================
>>>>> tests/bugs/core/bug-1432542-mpx-restart-crash.t - Times out even
>>>>> after 400 secs. Refer https://fstat.gluster.org/
>>>>> failure/209?state=2_date=2018-06-30_date=2018-
>>>>> 07-31=all, specifically the latest report
>>>>> https://build.gluster.org/job/regression-test-burn-in/4051/consoleText
>>>>> . Wasn't timing out as frequently as it was till 12 July. But since 27
>>>>> July, it has timed out twice. Beginning to believe commit
>>>>> 9400b6f2c8aa219a493961e0ab9770b7f12e80d2 has added the delay and now
>>>>> 400 secs isn't sufficient enough (Mohit?)
>>>>>
>>>>> tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
>>>>> (Ref - https://build.gluster.org/job/regression-test-with-
>>>>> multiplex/814/console) -  Test fails only in brick-mux mode, AI on
>>>>> Atin to look at and get back.
>>>>>
>>>>> tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t (
>>>>> https://build.gluster.org/job/regression-test-with-
>>>>> multiplex/813/console) - Seems like failed just twice in last 30 days
>>>>> as per https://fstat.gluster.org/failure/251?state=2_
>>>>> date=2018-06-30_date=2018-07-31=all. Need help from AFR
>>>>> team.
>>>>>
>>>>> tests/bugs/quota/bug-1293601.t (https://build.gluster.org/
>>>>> job/regression-test-with-multiplex/812/console) - Hasn't failed after
>>>>> 26 July and earlier it was failing regularly. Did we fix this test through
>>>>> any patch (Mohit?)
>>>>>
>>>>> tests/bitrot/bug-1373520.t - (https://build.gluster.org/
>>>>> job/regression-test-with-multiplex/811/console)  - Hasn't failed
>>>>> after 27 July and earlier it was failing regularly. Did we fix this test
>>>>> through any patch (Mohit?)
>>>>>
>>>>
>>>> I see this has failed in day before yesterday's regression run as well
>>>> (and I could reproduce it locally with brick mux enabled). The test fails
>>>> in healing a file within a particular time period.
>>>>
>>>> *15:55:19* not ok 25 Got "0" instead of "512", LINENUM:55*15:55:19* FAILED 
>>>> COMMAND: 512 path_size /d/backends/patchy5/FILE1
>>>>
>>>> Need EC dev's help here.
>>>>
>>>
>>> I'm not sure where the problem is exactly. I've seen that when the test
>>> fails, self-heal is attempting to heal the file, but when the file is
>>> accessed, an Input/Output error is returned, aborting heal. I've checked
>>> that a heal is attempted every time the file is accessed, but it fails
>>> always. This error seems to come from bit-rot stub xlator.
>>>
>>> When in this situation, if I stop and start the volume, self-heal
>>> immediately heals the files. It seems like an stale state that is kept by
>>> the stub xlator, preventing the file from being healed.
>>>
>>> Adding bit-rot maintainers for help on this one.
>>>
>>
>> Bitrot-stub marks the file as corrupted in inode_ctx. But when the file
>> and its hardlink are deleted from that brick and a lookup is done
>> on the fi

Re: [Gluster-devel] [Gluster-Maintainers] Release 5: Master branch health report (Week of 30th July)

2018-08-02 Thread Kotresh Hiremath Ravishankar
On Thu, Aug 2, 2018 at 4:50 PM, Amar Tumballi  wrote:

>
>
> On Thu, Aug 2, 2018 at 4:37 PM, Kotresh Hiremath Ravishankar <
> khire...@redhat.com> wrote:
>
>>
>>
>> On Thu, Aug 2, 2018 at 3:49 PM, Xavi Hernandez 
>> wrote:
>>
>>> On Thu, Aug 2, 2018 at 6:14 AM Atin Mukherjee 
>>> wrote:
>>>
>>>>
>>>>
>>>> On Tue, Jul 31, 2018 at 10:11 PM Atin Mukherjee 
>>>> wrote:
>>>>
>>>>> I just went through the nightly regression report of brick mux runs
>>>>> and here's what I can summarize.
>>>>>
>>>>> 
>>>>> 
>>>>> =============================================================
>>>>> Fails only with brick-mux
>>>>> =============================================================
>>>>> tests/bugs/core/bug-1432542-mpx-restart-crash.t - Times out even
>>>>> after 400 secs. Refer https://fstat.gluster.org/fail
>>>>> ure/209?state=2_date=2018-06-30_date=2018-07-31=all,
>>>>> specifically the latest report https://build.gluster.org/job/
>>>>> regression-test-burn-in/4051/consoleText . Wasn't timing out as
>>>>> frequently as it was till 12 July. But since 27 July, it has timed out
>>>>> twice. Beginning to believe commit 
>>>>> 9400b6f2c8aa219a493961e0ab9770b7f12e80d2
>>>>> has added the delay and now 400 secs isn't sufficient enough (Mohit?)
>>>>>
>>>>> tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
>>>>> (Ref - https://build.gluster.org/job/regression-test-with-multiplex
>>>>> /814/console) -  Test fails only in brick-mux mode, AI on Atin to
>>>>> look at and get back.
>>>>>
>>>>> tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t (
>>>>> https://build.gluster.org/job/regression-test-with-multiple
>>>>> x/813/console) - Seems like failed just twice in last 30 days as per
>>>>> https://fstat.gluster.org/failure/251?state=2_date=201
>>>>> 8-06-30_date=2018-07-31=all. Need help from AFR team.
>>>>>
>>>>> tests/bugs/quota/bug-1293601.t (https://build.gluster.org/job
>>>>> /regression-test-with-multiplex/812/console) - Hasn't failed after 26
>>>>> July and earlier it was failing regularly. Did we fix this test through 
>>>>> any
>>>>> patch (Mohit?)
>>>>>
>>>>> tests/bitrot/bug-1373520.t - (https://build.gluster.org/job
>>>>> /regression-test-with-multiplex/811/console)  - Hasn't failed after
>>>>> 27 July and earlier it was failing regularly. Did we fix this test through
>>>>> any patch (Mohit?)
>>>>>
>>>>
>>>> I see this has failed in day before yesterday's regression run as well
>>>> (and I could reproduce it locally with brick mux enabled). The test fails
>>>> in healing a file within a particular time period.
>>>>
>>>> *15:55:19* not ok 25 Got "0" instead of "512", LINENUM:55*15:55:19* FAILED 
>>>> COMMAND: 512 path_size /d/backends/patchy5/FILE1
>>>>
>>>> Need EC dev's help here.
>>>>
>>>
>>> I'm not sure where the problem is exactly. I've seen that when the test
>>> fails, self-heal is attempting to heal the file, but when the file is
>>> accessed, an Input/Output error is returned, aborting heal. I've checked
>>> that a heal is attempted every time the file is accessed, but it fails
>>> always. This error seems to come from bit-rot stub xlator.
>>>
>>> When in this situation, if I stop and start the volume, self-heal
>>> immediately heals the files. It seems like an stale state that is kept by
>>> the stub xlator, preventing the file from being healed.
>>>
>>> Adding bit-rot maintainers for help on this one.
>>>
>>
>> Bitrot-stub marks the file as corrupted in inode_ctx. But when the file
>> and its hardlink are deleted from that brick and a lookup is done
>> on the file, it cleans up the marker on gettin

Re: [Gluster-devel] [Gluster-Maintainers] Release 5: Master branch health report (Week of 30th July)

2018-08-02 Thread Kotresh Hiremath Ravishankar
On Thu, Aug 2, 2018 at 11:43 AM, Xavi Hernandez 
wrote:

> On Thu, Aug 2, 2018 at 6:14 AM Atin Mukherjee  wrote:
>
>>
>>
>> On Tue, Jul 31, 2018 at 10:11 PM Atin Mukherjee 
>> wrote:
>>
>>> I just went through the nightly regression report of brick mux runs and
>>> here's what I can summarize.
>>>
>>> 
>>> 
>>> =============================================================
>>> Fails only with brick-mux
>>> =============================================================
>>> tests/bugs/core/bug-1432542-mpx-restart-crash.t - Times out even after
>>> 400 secs. Refer https://fstat.gluster.org/failure/209?state=2_
>>> date=2018-06-30_date=2018-07-31=all, specifically the latest
>>> report https://build.gluster.org/job/regression-test-burn-in/4051/
>>> consoleText . Wasn't timing out as frequently as it was till 12 July.
>>> But since 27 July, it has timed out twice. Beginning to believe commit
>>> 9400b6f2c8aa219a493961e0ab9770b7f12e80d2 has added the delay and now
>>> 400 secs isn't sufficient enough (Mohit?)
>>>
>>> tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
>>> (Ref - https://build.gluster.org/job/regression-test-with-
>>> multiplex/814/console) -  Test fails only in brick-mux mode, AI on Atin
>>> to look at and get back.
>>>
>>> tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t (
>>> https://build.gluster.org/job/regression-test-with-multiplex/813/console)
>>> - Seems like failed just twice in last 30 days as per
>>> https://fstat.gluster.org/failure/251?state=2_
>>> date=2018-06-30_date=2018-07-31=all. Need help from AFR team.
>>>
>>> tests/bugs/quota/bug-1293601.t (https://build.gluster.org/
>>> job/regression-test-with-multiplex/812/console) - Hasn't failed after
>>> 26 July and earlier it was failing regularly. Did we fix this test through
>>> any patch (Mohit?)
>>>
>>> tests/bitrot/bug-1373520.t - (https://build.gluster.org/
>>> job/regression-test-with-multiplex/811/console)  - Hasn't failed after
>>> 27 July and earlier it was failing regularly. Did we fix this test through
>>> any patch (Mohit?)
>>>
>>
>> I see this has failed in day before yesterday's regression run as well
>> (and I could reproduce it locally with brick mux enabled). The test fails
>> in healing a file within a particular time period.
>>
>> *15:55:19* not ok 25 Got "0" instead of "512", LINENUM:55*15:55:19* FAILED 
>> COMMAND: 512 path_size /d/backends/patchy5/FILE1
>>
>> Need EC dev's help here.
>>
>
> I'll investigate this.
>
>
>>
>>
>>> tests/bugs/glusterd/remove-brick-testcases.t - Failed once with a core,
>>> not sure if related to brick mux or not, so not sure if brick mux is
>>> culprit here or not. Ref - https://build.gluster.org/job/
>>> regression-test-with-multiplex/806/console . Seems to be a glustershd
>>> crash. Need help from AFR folks.
>>>
>>> 
>>> 
>>> =============================================================
>>> Fails for non-brick mux case too
>>> =============================================================
>>> tests/bugs/distribute/bug-1122443.t - Seems to be failing on my setup
>>> very often, without brick mux as well. Refer
>>> https://build.gluster.org/job/regression-test-burn-in/4050/consoleText
>>> . There's an email in gluster-devel and a BZ 1610240 for the same.
>>>
>>> tests/bugs/bug-1368312.t - Seems to be recent failures (
>>> https://build.gluster.org/job/regression-test-with-multiplex/815/console)
>>> - seems to be a new failure, however seen this for a non-brick-mux case too
>>> - https://build.gluster.org/job/regression-test-burn-in/4039/consoleText
>>> . Need some eyes from AFR folks.
>>>
>>> tests/00-geo-rep/georep-basic-dr-tarssh.t - this isn't specific to
>>> brick mux, have seen this failing at multiple default regression runs.
>>> Refer https://fstat.gluster.org/failure/392?state=2_
>>> date=2018-06-30_date=2018-07-31=all . We need help from
>>> geo-rep dev to root cause this earlier than later
>>>
>>> tests/00-geo-rep/georep-basic-dr-rsync.t - this isn't specific to brick
>>> mux, have seen this failing at multiple default regression runs. Refer
>>> https://fstat.gluster.org/failure/393?state=2_
>>> date=2018-06-30_date=2018-07-31=all . We need help from
>>> geo-rep dev to root cause this earlier than later
>>>
>>
I have posted the patch [1] for the above two. This should handle connection
timeouts without any logs. But I still see a strange behaviour now and then
where one of the workers doesn't get started at all. I am debugging that
with instrumentation patch [2]. I am not hitting that on this 

Re: [Gluster-devel] [Gluster-Maintainers] Update: Gerrit review system has one more command now

2018-05-21 Thread Kotresh Hiremath Ravishankar
This will be very useful. Thank you.

On Mon, May 21, 2018 at 11:45 PM, Vijay Bellur  wrote:

>
>
> On Mon, May 21, 2018 at 2:29 AM, Amar Tumballi 
> wrote:
>
>> Hi all,
>>
>> As a push towards more flexibility to our developers, and options to run
>> more tests without too much effort, we are moving towards more and more
>> options to trigger tests from Gerrit during reviews.
>>
>> One such example was 'regression-on-demand-multiplex' tests, where any
>> one can ask for a brick-mux regression for a particular patch.
>>
>> In the same way, in certain cases where developers are making changes
>> and more than one test would be impacted, there was no easy way to run the
>> full regression, other than sending one patchset with changes to
>> 'run-tests.sh' to not fail on failures. This was tedious, and is also not
>> known to many new developers. Hence a new command has been added to gerrit, where
>> one can trigger all the runs (if something is failing), by entering *'run
>> full regression'* in a single line at the top of your review comments.
>>
>> With this, a separate job will be triggered which will run the full
>> regression suite with the patch. So, no more requirement to make
>> 'run-tests.sh' changes.
>>
>> More on this at http://bugzilla.redhat.com/1564119
>>
>>
>>
>
> Thank you, Amar! I think it will be quite useful for us.
>
> I am not sure if there's a document that details all possible options &
> tricks with gerrit. If there's none, we could add one to our
> repository/developer-guide so that new developers find it easy to use these
> options.
>
> Regards,
> Vijay
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>



-- 
Thanks and Regards,
Kotresh H R
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Release 4.1: LTM release targeted for end of May

2018-03-21 Thread Kotresh Hiremath Ravishankar
Hi Shyam,

Rafi and I are proposing the consistent time across replica feature for 4.1:

https://github.com/gluster/glusterfs/issues/208

Thanks,
Kotresh H R

On Wed, Mar 21, 2018 at 2:05 PM, Ravishankar N 
wrote:

>
>
> On 03/20/2018 07:07 PM, Shyam Ranganathan wrote:
>
>> On 03/12/2018 09:37 PM, Shyam Ranganathan wrote:
>>
>>> Hi,
>>>
>>> As we wind down on 4.0 activities (waiting on docs to hit the site, and
>>> packages to be available in CentOS repositories before announcing the
>>> release), it is time to start preparing for the 4.1 release.
>>>
>>> 4.1 is where we have GD2 fully functional and shipping with migration
>>> tools to aid Glusterd to GlusterD2 migrations.
>>>
>>> Other than the above, this is a call out for features that are in the
>>> works for 4.1. Please *post* the github issues to the *devel lists* that
>>> you would like as a part of 4.1, and also mention the current state of
>>> development.
>>>
>> Thanks for those who responded. The github lane and milestones for the
>> said features are updated, request those who mentioned issues being
>> tracked for 4.1 check that these are reflected in the project lane [1].
>>
>> I have few requests as follows that if picked up would be a good thing
>> to achieve by 4.1, volunteers welcome!
>>
>> - Issue #224: Improve SOS report plugin maintenance
>>- https://github.com/gluster/glusterfs/issues/224
>>
>> - Issue #259: Compilation warnings with gcc 7.x
>>- https://github.com/gluster/glusterfs/issues/259
>>
>> - Issue #411: Ensure python3 compatibility across code base
>>- https://github.com/gluster/glusterfs/issues/411
>>
>> - NFS Ganesha HA (storhaug)
>>- Does this need an issue for Gluster releases to track? (maybe
>> packaging)
>>
>> I will close the call for features by Monday 26th Mar, 2018. Post this,
>> I would request that features that need to make it into 4.1 be raised as
>> exceptions to the devel and maintainers list for evaluation.
>>
>
> Hi Shyam,
>
> I want to add https://github.com/gluster/glusterfs/issues/363 also for
> 4.1. It is not a new feature but rather an enhancement to a volume option
> in AFR. I don't think it can qualify as a bug fix, so mentioning it here
> just in case it needs to be tracked too. The (only) patch is undergoing
> review cycles.
>
> Regards,
> Ravi
>
>
>> Further, as we hit end of March, we would make it mandatory for features
>>> to have required spec and doc labels, before the code is merged, so
>>> factor in efforts for the same if not already done.
>>>
>>> Current 4.1 project release lane is empty! I cleaned it up, because I
>>> want to hear from all as to what content to add, than add things marked
>>> with the 4.1 milestone by default.
>>>
>> [1] 4.1 Release lane:
>> https://github.com/gluster/glusterfs/projects/1#column-1075416
>>
>> Thanks,
>>> Shyam
>>> P.S: Also any volunteers to shadow/participate/run 4.1 as a release
>>> owner?
>>>
>> Calling this out again!
>>
>> ___
>>> Gluster-devel mailing list
>>> Gluster-devel@gluster.org
>>> http://lists.gluster.org/mailman/listinfo/gluster-devel
>>>
>>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-devel
>>
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>



-- 
Thanks and Regards,
Kotresh H R
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-infra] Infra-related Regression Failures and What We're Doing

2018-01-21 Thread Kotresh Hiremath Ravishankar
On Mon, Jan 22, 2018 at 12:21 PM, Nigel Babu  wrote:

> Hello folks,
>
> As you may have noticed, we've had a lot of centos6-regression failures
> lately. The geo-replication failures are the new ones which particularly
> concern me. These failures have nothing to do with the test. The tests are
> exposing a problem in our infrastructure that we've carried around for a
> long time. Our machines are not clean machines that we automated. We setup
> automation on machines that were already created. At some point, we loaned
> machines for debugging. During this time, developers have inadvertently
> done 'make install' on the system to install onto system paths rather than
> into /build/install. This is what is causing the geo-replication tests to
> fail. I've tried cleaning the machines up several times with little to no
> success.
>
> Last week, we decided to take an aggressive path to fix this problem. We
> planned to replace all our problematic nodes with new Centos 7 nodes. This
> exposed more problems. We expected a specific type of machine from
> Rackspace. These are no longer offered. Thus, our automation fails on some
> steps. I've spent this weekend tweaking our automation so that it works
> on the new Rackspace machines and I'm down to just one test failure[1]. I
> have a patch up to fix this failure[2]. As soon as that patch is merged,
> we can push forward with Centos7 nodes. In 4.0, we're dropping support for
> Centos 6, so this decision makes more sense to do sooner than later.
>
> We'll not be lending machines anymore from production. We'll be creating
> new nodes which are a snapshots of an existing production node. This
> machine will be destroyed after use. This helps prevent this particular
> problem in the future. This also means that our machine capacity at all
> times is at 100 with very minimal wastage.
>

+2 for this

>
> [1]: https://build.gluster.org/job/cage-test/184/consoleText
> [2]: https://review.gluster.org/#/c/19262/
>
> --
> nigelb
>
> ___
> Gluster-infra mailing list
> gluster-in...@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-infra
>



-- 
Thanks and Regards,
Kotresh H R
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-infra] Recent regression failures

2018-01-12 Thread Kotresh Hiremath Ravishankar
Nigel,

Could you give me a machine where geo-rep is failing even with the bashrc
fix, so that I can debug it?

Thanks,
Kotresh HR

On Fri, Jan 12, 2018 at 3:54 PM, Amar Tumballi  wrote:

> Can we have a separate test case to validate all the basic necessities
> for the whole test suite to pass? That way the very first run will report an
> error in the setup and we can fix it faster, instead of waiting 3-4 hrs for
> a failure which is really a setup issue?
>
> -Amar
>
> On Fri, Jan 12, 2018 at 3:47 PM, Atin Mukherjee 
> wrote:
>
>>
>>
>> On Thu, Jan 11, 2018 at 10:15 AM, Nigel Babu  wrote:
>>
>>> Hello folks,
>>>
>>> We may have been a little too quick to blame Meltdown on the Jenkins
>>> failures yesterday. In any case, we've open a ticket with our provider and
>>> they're looking into the failures. I've looked at the last 90 failures to
>>> get a comprehensive number on the failures.
>>>
>>> Total Jobs: 90
>>> Failures: 62
>>> Failure Percentage: 68.89%
>>>
>>> I've analyzed the individual failures categorized them as well.
>>>
>>> slave28.cloud.gluster.org failure: 9
>>> Geo-replication failures: 12
>>> Fops-during-migration.t: 4
>>> Compilation failures: 3
>>> durability-off.t failures: 7
>>>
>>> These alone total to 35 failures. The slave28 failures were due to the
>>> machine running out of disk space. We had a very large binary archived from
>>> an experimental branch build failure. I've cleared that core out and this
>>> is now fixed. The geo-replication failures were due to geo-rep tests
>>> depending on root's .bashrc having the PATH variable modified. This was not
>>> a standard setup and therefore didn't work on many machines. This has now
>>> been fixed. The other 3 were transient failures either limited to a
>>> particular review or a temporary bustage on master. The majority of the
>>> recent failures had more to do with infra than to do with tests.
>>>
>>
>> While Nigel tells me that the infra related problems for geo-rep tests
>> are already fixed, I see a similar failure pops up through
>> https://build.gluster.org/job/centos6-regression/8356/console .
>>
>> @Kotresh - Could you check if this is something different?
>>
>>
>>> I'm therefore cautiously moving with the assumption that the impact of
>>> KPTI patch is minimal so far.
>>>
>>> --
>>> nigelb
>>>
>>> ___
>>> Gluster-devel mailing list
>>> Gluster-devel@gluster.org
>>> http://lists.gluster.org/mailman/listinfo/gluster-devel
>>>
>>
>>
>> ___
>> Gluster-infra mailing list
>> gluster-in...@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-infra
>>
>
>
>
> --
> Amar Tumballi (amarts)
>



-- 
Thanks and Regards,
Kotresh H R
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-users] Request for Comments: Upgrades from 3.x to 4.0+

2017-11-02 Thread Kotresh Hiremath Ravishankar
Hi Amudhan,

Please go through the following points, which should clarify the upgrade
concerns around DHT and RIO in 4.0:


   1. RIO would not deprecate DHT. Both DHT and RIO would co-exist.
   2. DHT volumes would not be migrated to RIO. DHT volumes would still be
   using DHT code.
   3. New volume creation would have to specifically opt for a RIO volume
   once RIO is in place.
   4. RIO should be perceived as another volume type which is chosen during
   volume creation, just like replicate or EC, which should avoid most of the
   confusion.

Shyam,

Please add if I am missing anything.

Thanks,
Kotresh HR

On Thu, Nov 2, 2017 at 4:36 PM, Amudhan P  wrote:

> does RIO improve folder listing and rebalance, when compared to 3.x?
>
> if yes, do you have any performance data comparing RIO and DHT?
>
> On Thu, Nov 2, 2017 at 4:12 PM, Kaushal M  wrote:
>
>> On Thu, Nov 2, 2017 at 4:00 PM, Amudhan P  wrote:
>> > if doing an upgrade from 3.10.1 to 4.0 or 4.1, will I be able to access
>> > volume without any challenge?
>> >
>> > I am asking this because 4.0 comes with DHT2?
>>
>> Very short answer, yes. Your volumes will remain the same. And you
>> will continue to access them the same way.
>>
>> RIO (as DHT2 is now known as) developers in CC can provide more
>> information on this. But in short, RIO will not be replacing DHT. It
>> was renamed to make this clear.
>> Gluster 4.0 will continue to ship both DHT and RIO. All 3.x volumes
>> that exist will continue to use DHT, and continue to work as they
>> always have.
>> You will only be able to create new RIO volumes, and will not be able
>> to migrate DHT to RIO.
>>
>> >
>> >
>> >
>> >
>> > On Thu, Nov 2, 2017 at 2:26 PM, Kaushal M  wrote:
>> >>
>> >> We're fast approaching the time for Gluster-4.0. And we would like to
>> >> set out the expected upgrade strategy and try to polish it to be as
>> >> user friendly as possible.
>> >>
>> >> We're getting this out here now, because there was quite a bit of
>> >> concern and confusion regarding the upgrades between 3.x and 4.0+.
>> >>
>> >> ---
>> >> ## Background
>> >>
>> >> Gluster-4.0 will bring a newer management daemon, GlusterD-2.0 (GD2),
>> >> which is backwards incompatible with the GlusterD (GD1) in
>> >> GlusterFS-3.1+.  As a hybrid cluster of GD1 and GD2 cannot be
>> >> established, rolling upgrades are not possible. This meant that
>> >> upgrades from 3.x to 4.0 would require a volume downtime and possible
>> >> client downtime.
>> >>
>> >> This was a cause of concern among many during the recently concluded
>> >> Gluster Summit 2017.
>> >>
>> >> We would like to keep pains experienced by our users to a minimum, so
>> >> we are trying to develop an upgrade strategy that avoids downtime as
>> >> much as possible.
>> >>
>> >> ## (Expected) Upgrade strategy from 3.x to 4.0
>> >>
>> >> Gluster-4.0 will ship with both GD1 and GD2.
>> >> For fresh installations, only GD2 will be installed and available by
>> >> default.
>> >> For existing installations (upgrades) GD1 will be installed and run by
>> >> default. GD2 will also be installed simultaneously, but will not run
>> >> automatically.
>> >>
>> >> GD1 will allow rolling upgrades, and allow properly setup Gluster
>> >> volumes to be upgraded to 4.0 binaries, without downtime.
>> >>
>> >> Once the full pool is upgraded, and all bricks and other daemons are
>> >> running 4.0 binaries, migration to GD2 can happen.
>> >>
>> >> To migrate to GD2, all GD1 processes in the cluster need to be killed,
>> >> and GD2 started instead.
>> >> GD2 will not automatically form a cluster. A migration script will be
>> >> provided, which will form a new GD2 cluster from the existing GD1
>> >> cluster information, and migrate volume information from GD1 into GD2.
>> >>
>> >> Once migration is complete, GD2 will pick up the running brick and
>> >> other daemon processes and continue. This will only be possible if the
>> >> rolling upgrade with GD1 happened successfully and all the processes
>> >> are running with 4.0 binaries.
>> >>
>> >> During the whole migration process, the volume would still be online
>> >> for existing clients, who can still continue to work. New clients will
>> >> not be possible during this time.
>> >>
>> >> After migration, existing clients will connect back to GD2 for
>> >> updates. GD2 listens on the same port as GD1 and provides the required
>> >> SunRPC programs.
>> >>
>> >> Once migrated to GD2, rolling upgrades to newer GD2 and Gluster
>> >> versions. without volume downtime, will be possible.
>> >>
>> >> ### FAQ and additional info
>> >>
>> >>  Both GD1 and GD2? What?
>> >>
>> >> While both GD1 and GD2 will be shipped, the GD1 shipped will
>> >> essentially be the GD1 from the last 3.x series. It will not support
>> >> any of the newer storage or management features being planned for 4.0.
>> >> All new features will only be available from GD2.
>> >>
>> >>  How long will GD1 

Re: [Gluster-devel] Release 3.12: Status of features (Require responses!)

2017-07-24 Thread Kotresh Hiremath Ravishankar
Answers inline.

On Sat, Jul 22, 2017 at 1:36 AM, Shyam  wrote:

> Hi,
>
> Prepare for a lengthy mail, but needed for the 3.12 release branching, so
> here is a key to aid the impatient,
>
> Key:
> 1) If you asked for an exception to a feature (meaning delayed backport to
> 3.12 branch post branching for the release) see "Section 1"
>   - Handy list of nick's that maybe interested in this:
> - @pranithk, @sunilheggodu, @aspandey, @amarts, @kalebskeithley,
> @kshlm (IPv6), @jdarcy (Halo Hybrid)
>
> 2) If you have/had a feature targeted for 3.12 and have some code posted
> against the same, look at "Section 2" AND we want to hear back from you!
>   - Handy list of nick's that should be interested in this:
> - @csabahenk, @nixpanic, @aravindavk, @amarts, @kotreshhr,
> @soumyakoduri
>
> 3) If you have/had a feature targeted for 3.12 and have posted no code
> against the same yet, see "Section 3", your feature is being dropped from
> the release.
>   - Handy list of nick's that maybe interested in this:
> - @sanoj-unnikrishnan, @aravindavk, @kotreshhr, @amarts, @jdarcy,
> @avra (people who filed the issue)
>
> 4) Finally, if you do not have any features for the release pending,
> please help others out reviewing what is still pending, here [1] is a quick
> link to those reviews.
>
> Sections:
>
> **Section 1:**
> Exceptions granted to the following features: (Total: 8)
> Reasons:
>   - Called out in the mail sent for noting exceptions and feature status
> for 3.12
>   - Awaiting final changes/decision from a few Facebook patches
>
> Issue list:
> - Implement an xlator to delay fops
>   - https://github.com/gluster/glusterfs/issues/257
>
> - Implement parallel writes feature on EC volume
>   - https://github.com/gluster/glusterfs/issues/251
>
> - DISCARD support with EC
>   - https://github.com/gluster/glusterfs/issues/254
>
> - Cache last stripe of an EC volume while write is going on
>   - https://github.com/gluster/glusterfs/issues/256
>
> - gfid-path by default
>   - https://github.com/gluster/glusterfs/issues/139



The following patches need to be merged. They will be backported to the 3.12
branch once they are merged.

https://review.gluster.org/#/c/17744/
https://review.gluster.org/#/c/17785/
https://review.gluster.org/#/c/17839/

As for enabling it by default, I have requested performance testing.
Once that is done, one more patch will enable it by default.



>
> - allow users to enable used of localtime instead of UTC for log entries
>   - https://github.com/gluster/glusterfs/issues/272
>
> - Halo translator: Hybrid mode
>   - https://github.com/gluster/glusterfs/issues/217
>
> - [RFE] Improve IPv6 support in GlusterFS
>   - https://github.com/gluster/glusterfs/issues/192
>
> **Section 2:**
> Issues needing some further clarity: (Total: 6)
> Reason:
>   - There are issues here, for which code is already merged (or submitted)
> and issue is still open. This is the right state for an issue to be in this
> stage of the release, as documentation or release-notes would possibly be
> still pending, which will finally close the issue (or rather mark it fixed)
>   - BUT, without a call out from the contributors that required code is
> already merged in, it is difficult to assess if the issue should qualify
> for the release
>
> Issue list:
> - [RFE] libfuse rebase to latest?
>   - https://github.com/gluster/glusterfs/issues/153
>   - @csabahenk is this all done?
>
> - Decide what to do with glfs_ipc() in libgfapi
>   - https://github.com/gluster/glusterfs/issues/269
>   - @nixpanic I assume there is more than just test case disabling for
> this, is this expected to happen by 3.12?
>
> - Structured log format support for gf_log and gf_msg
>   - https://github.com/gluster/glusterfs/issues/240
>   - @aravindavk this looks done, anything code wise pending here?
>
> - xxhash: Add xxhash library
>   - https://github.com/gluster/glusterfs/issues/253
>   - @kotreshhr this looks done, anything code wise pending here?
>

This issue is complete. This can be closed.

>
> - posix: provide option to handle 'statfs()' properly when more than 1
> brick is exported from 1 node
>   - https://github.com/gluster/glusterfs/issues/241
>   - @amarts patch is still awaiting reviews, should this be tracked as an
> exception?
>
> - gfapi to support leases and lock-owner
>   - https://github.com/gluster/glusterfs/issues/213
>   - @soumyakoduri I do not see work progressing on the patches provided,
> should this be dropped from 3.12?
>
> **Section 3:**
> Issues moved out of the 3.12 Milestone: (Total: 8)
> Reasons:
>   - No commits visible against the github issue
>   - No commits against 'master' branch visible on the github issue
>
> Further changes:
>   - No new milestone assigned, IOW not moved to 4.0 by default, hence
> contributors working on these features would need to rekindle conversations
> on including the same in 4.0 on the ML or on the issue itself.
>
> Issue 

[Gluster-devel] 3.12 Review Request

2017-07-24 Thread Kotresh Hiremath Ravishankar
Hi,

The following patches are targeted for 3.12. They have undergone a few
reviews but are yet to be merged. Please take some time to review them and
merge if they look good.

https://review.gluster.org/#/c/17744/
https://review.gluster.org/#/c/17785/

-- 
Thanks and Regards,
Kotresh H R
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] ./tests/bugs/distribute/bug-1389697.t generates a core file

2017-06-30 Thread Kotresh Hiremath Ravishankar
Hi,

The above-mentioned distribute test case generated a core
which is not related to the patch.

https://build.gluster.org/job/centos6-regression/5218/consoleFull

Here is the backtrace.

#0  0x7f3222fbaebf in dht_build_root_loc (inode=0xa800,
loc=0x7f3220b10e50) at
/home/jenkins/root/workspace/centos6-regression/xlators/cluster/dht/src/dht-rebalance.c:2294
2294
/home/jenkins/root/workspace/centos6-regression/xlators/cluster/dht/src/dht-rebalance.c:
No such file or directory.
[Current thread is 1 (LWP 24730)]
(gdb) bt
#0  0x7f3222fbaebf in dht_build_root_loc (inode=0xa800,
loc=0x7f3220b10e50) at
/home/jenkins/root/workspace/centos6-regression/xlators/cluster/dht/src/dht-rebalance.c:2294
#1  0x7f3222fbffde in dht_file_counter_thread (args=0x7f321c01e600) at
/home/jenkins/root/workspace/centos6-regression/xlators/cluster/dht/src/dht-rebalance.c:4090
#2  0x7f322fe27aa1 in start_thread () from ./lib64/libpthread.so.0
#3  0x7f322f78fbcd in clone () from ./lib64/libc.so.6


Could anybody from the DHT team take a look at it?


-- 
Thanks and Regards,
Kotresh H R
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Adding xxhash to gluster code base

2017-06-28 Thread Kotresh Hiremath Ravishankar
That sounds good to me. I will send it as a separate patch then. And I can
maintain it. No issues.

Thanks and Regards,
Kotresh H R


On Wed, Jun 28, 2017 at 1:02 PM, Niels de Vos <nde...@redhat.com> wrote:

> On Wed, Jun 28, 2017 at 12:51:07PM +0530, Amar Tumballi wrote:
> > On Tue, Jun 27, 2017 at 8:46 PM, Niels de Vos <nde...@redhat.com> wrote:
> >
> > > On Tue, Jun 27, 2017 at 08:09:25AM -0400, Kaleb S. KEITHLEY wrote:
> > > >
> > > > xxhash doesn't seem to change much. Last update to the non-test code
> was
> > > six
> > > > months ago.
> > > >
> > > > bundling giant (for some definition of giant) packages/projects
> would be
> > > > bad. bundling two (three if you count the test) C files doesn't seem
> too
> > > bad
> > > > when you consider that there are already three or four packages in
> fedora
> > > > (perl, python, R-digest, ghc (gnu haskell) that have implementations
> of
> > > > xxhash or murmur but didn't bother to package a C implementation and
> use
> > > it.
> > >
> > > I prefer to have as little maintenance components in the Gluster
> sources
> > > as we can. The maintenance burden is already very high. The number of
> > > changes to xxhash seem limited, but we still need someone to track and
> > > pay attention to them.
> > >
> >
> > I agree that someone should maintain it, and we should add it to
> > MAINTAINERS file
> > (or some other place, where we are tracking the dependencies).
> >
> > For now, Kotresh will be looking into keeping these changes up-to-date
> with
> > upstream xxhash project, along with me.
>
> Kotresh as maintainer/owner, and Aravinda as peer?
>
> > > > I'd be for packaging it in Fedora rather than bundling it in
> gluster. But
> > > > then we get to "carry" it in rhgs as we do with userspace-rcu.
> > >
> > > We should descide what the most maintainable solution is. Having
> package
> > > maintainers with the explicit task to keep xxhash updated and current
> is
> > > apealing to me. Merging (even small) projects into the Gluster codebase
> > > will add more maintenance need to the project members. Therefor I have
> a
> > > strong preference to use xxhash (or an other library) that is provided
> > > by distributions. The more common the library is, the better it will be
> > > maintained without our (Gluster Community's) help.
> > >
> > >
> > While this is desirable, we didn't see any library available for xxhash (
> > http://cyan4973.github.io/xxHash/) in our distro.
> >
> > I would recommend taking these patches with TODO to use library in future
> > when its available, and continue to have xxhash in 'contrib/'. It is not
> > new for us to take code from different libraries and use it for our need
> > and maintain only that part (eg. libfuse). Lets treat this as similar
> setup.
>
> Yes, if there is no suitable alternative available in the majority of
> distributions, this is the only sensible approach. Much of the code in
> contrib/ is not maintained at all. We should prevent this from happening
> with new code and assigning an owner/maintainer and peer(s) just like
> for other components is a must.
>
> Thanks,
> Niels
>
>
> >
> > Regards,
> > Amar
> >
> >
> >
> >
> >
> > > Niels
> > >
> > >
> > > > On 06/27/2017 04:08 AM, Niels de Vos wrote:
> > > > > On Tue, Jun 27, 2017 at 12:25:11PM +0530, Kotresh Hiremath
> Ravishankar
> > > wrote:
> > > > > > Hi,
> > > > > >
> > > > > > We were looking for faster non-cryptographic hash to be used for
> the
> > > > > > gfid2path infra [1]
> > > > > > The initial testing was done with md5 128bit checksum which was a
> > > slow,
> > > > > > cryptographic hash
> > > > > > and using it makes the software not compliant with FIPS [2]
> > > > > >
> > > > > > On searching online a bit we found out xxhash [3] seems to be
> faster
> > > from
> > > > > > the results of
> > > > > > benchmark tests shared and lot of projects use it. So we have
> > > decided to us
> > > > > > xxHash
> > > > > > and added following files to gluster code base with the patch [4]
> > > > > >
> > > > > >  BSD 2-Clause License:
> > > >

Re: [Gluster-devel] Adding xxhash to gluster code base

2017-06-27 Thread Kotresh Hiremath Ravishankar
Sure, I can do that.

On Tue, Jun 27, 2017 at 12:28 PM, Amar Tumballi <atumb...@redhat.com> wrote:

>
>
> On Tue, Jun 27, 2017 at 12:25 PM, Kotresh Hiremath Ravishankar <
> khire...@redhat.com> wrote:
>
>> Hi,
>>
>> We were looking for faster non-cryptographic hash to be used for the
>> gfid2path infra [1]
>> The initial testing was done with md5 128bit checksum which was a slow,
>> cryptographic hash
>> and using it makes the software not compliant with FIPS [2]
>>
>> On searching online a bit we found out xxhash [3] seems to be faster from
>> the results of
>> benchmark tests shared and lot of projects use it. So we have decided to
>> us xxHash
>> and added following files to gluster code base with the patch [4]
>>
>> BSD 2-Clause License:
>>contrib/xxhash/xxhash.c
>>contrib/xxhash/xxhash.h
>>
>> GPL v2 License:
>>tests/utils/xxhsum.c
>>
>> NOTE: We have ignored the code guideline check for these files as
>> maintaining it
>> further becomes difficult.
>>
>> Please comment on the same if there are any issues around it.
>>
>> [1] Issue: https://github.com/gluster/glusterfs/issues/139
>> [2] https://en.wikipedia.org/wiki/Federal_Information_Processing
>> _Standards
>> [3] http://cyan4973.github.io/xxHash/
>> [4] https://review.gluster.org/#/c/17488/10
>>
>>
>>
> Just one comment at the moment. Please separate out the patches as
>
> 1. changes to get xxHash into the project
> 2. gfid2path feature (which can use xxHash code).
>
> That way it will be very easy to review, and also to maintain in future.
>
> -Amar
>
>
>
>>
>> --
>> Thanks and Regards,
>> Kotresh H R and Aravinda VK
>>
>
>
>
> --
> Amar Tumballi (amarts)
>



-- 
Thanks and Regards,
Kotresh H R
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Adding xxhash to gluster code base

2017-06-27 Thread Kotresh Hiremath Ravishankar
Hi,

We were looking for a faster non-cryptographic hash to be used for the
gfid2path infra [1].
The initial testing was done with the md5 128-bit checksum, which is a slow,
cryptographic hash, and using it makes the software not compliant with
FIPS [2].

On searching online a bit, we found that xxhash [3] seems to be faster, based
on the published benchmark results, and a lot of projects use it. So we have
decided to use xxHash and added the following files to the gluster code base
with the patch [4]:

BSD 2-Clause License:
   contrib/xxhash/xxhash.c
   contrib/xxhash/xxhash.h

GPL v2 License:
   tests/utils/xxhsum.c

NOTE: We have ignored the coding guideline check for these files, as
otherwise maintaining them further becomes difficult.

Please comment on the same if there are any issues around it.

[1] Issue: https://github.com/gluster/glusterfs/issues/139
[2] https://en.wikipedia.org/wiki/Federal_Information_Processing_Standards
[3] http://cyan4973.github.io/xxHash/
[4] https://review.gluster.org/#/c/17488/10
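
As a quick illustration of the intended use (this is not the actual gfid2path
code; the key string, the zero seed and the build command below are only
assumptions for the example), here is a minimal sketch of hashing a
pargfid/bname string with the public XXH64() one-shot API. It can be built
against the bundled sources with something like: gcc demo.c contrib/xxhash/xxhash.c

/* Minimal sketch, not the actual gfid2path code: hash a "pargfid/bname"
 * string with xxHash's 64-bit one-shot API. The gfid and the zero seed
 * are illustrative only. */
#include <stdio.h>
#include <string.h>
#include "xxhash.h"   /* contrib/xxhash/xxhash.h */

int main(void)
{
    const char *key = "e77d12b3-92f8-4dfe-9a7f-246e901cbdf1/testfile.log";
    unsigned long long hash = XXH64(key, strlen(key), 0 /* seed */);

    /* A 64-bit non-cryptographic digest, cheap enough to compute on every
     * entry operation, unlike md5. */
    printf("XXH64(\"%s\") = %016llx\n", key, hash);
    return 0;
}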



-- 
Thanks and Regards,
Kotresh H R and Aravinda VK
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] ./tests/encryption/crypt.t fails regression with core

2017-06-22 Thread Kotresh Hiremath Ravishankar
+1

On Fri, Jun 23, 2017 at 10:44 AM, Amar Tumballi <atumb...@redhat.com> wrote:

>
>
> On Thu, Jun 22, 2017 at 9:22 PM, Atin Mukherjee <amukh...@redhat.com>
> wrote:
>
>> I have highlighted about this failure earlier at [1]
>>
>> [1] http://lists.gluster.org/pipermail/gluster-devel/2017-June/
>> 053042.html
>>
>>
> If it's stopping us from adding important features / bringing stability,
> let's document that 'Crypt' (i.e. the encryption xlator) has issues, and
> remove this test case from running?
>
> I see that even in the recent maintainers meeting, no one volunteered to
> fix the encryption translator. So, I am fine with taking it out for now.
> Does anyone have objections?
>
> -Amar
>
>
>> On Wed, Jun 21, 2017 at 10:41 PM, Kotresh Hiremath Ravishankar <
>> khire...@redhat.com> wrote:
>>
>>> Hi
>>>
>>> ./tests/encryption/crypt.t fails regression on
>>> https://build.gluster.org/job/centos6-regression/5112/consoleFull
>>> with a core. It doesn't seem to be related to the patch. Can somebody
>>> take a look at it? Following is the backtrace.
>>>
>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>> #0  0x7effbe9ef92b in offset_at_tail (conf=0xc0,
>>> object=0x7effb000ac28) at /home/jenkins/root/workspace/c
>>> entos6-regression/xlators/encryption/crypt/src/atom.c:96
>>> 96
>>> /home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/atom.c:
>>> No such file or directory.
>>> [Current thread is 1 (LWP 1082)]
>>> (gdb) bt
>>> #0  0x7effbe9ef92b in offset_at_tail (conf=0xc0,
>>> object=0x7effb000ac28) at /home/jenkins/root/workspace/c
>>> entos6-regression/xlators/encryption/crypt/src/atom.c:96
>>> #1  0x7effbe9ef9d5 in offset_at_data_tail (frame=0x7effa4001960,
>>> object=0x7effb000ac28) at /home/jenkins/root/workspace/c
>>> entos6-regression/xlators/encryption/crypt/src/atom.c:110
>>> #2  0x7effbe9f0729 in rmw_partial_block (frame=0x7effa4001960,
>>> cookie=0x7effb4010050, this=0x7effb800b870, op_ret=0, op_errno=2, vec=0x0,
>>> count=0, stbuf=0x7effb402da18,
>>> iobref=0x7effb804a0d0, atom=0x7effbec106a0 <atoms+64>) at
>>> /home/jenkins/root/workspace/centos6-regression/xlators/encr
>>> yption/crypt/src/atom.c:523
>>> #3  0x7effbe9f1339 in rmw_data_tail (frame=0x7effa4001960,
>>> cookie=0x7effb4010050, this=0x7effb800b870, op_ret=0, op_errno=2, vec=0x0,
>>> count=0, stbuf=0x7effb402da18,
>>> iobref=0x7effb804a0d0, xdata=0x0) at /home/jenkins/root/workspace/c
>>> entos6-regression/xlators/encryption/crypt/src/atom.c:716
>>> #4  0x7effbea03684 in __crypt_readv_done (frame=0x7effb4010050,
>>> cookie=0x0, this=0x7effb800b870, op_ret=0, op_errno=0, xdata=0x0)
>>> at /home/jenkins/root/workspace/centos6-regression/xlators/encr
>>> yption/crypt/src/crypt.c:3460
>>> #5  0x7effbea0375f in crypt_readv_done (frame=0x7effb4010050,
>>> this=0x7effb800b870) at /home/jenkins/root/workspace/c
>>> entos6-regression/xlators/encryption/crypt/src/crypt.c:3487
>>> #6  0x7effbea03b25 in put_one_call_readv (frame=0x7effb4010050,
>>> this=0x7effb800b870) at /home/jenkins/root/workspace/c
>>> entos6-regression/xlators/encryption/crypt/src/crypt.c:3514
>>> #7  0x7effbe9f286e in crypt_readv_cbk (frame=0x7effb4010050,
>>> cookie=0x7effb4010160, this=0x7effb800b870, op_ret=0, op_errno=2,
>>> vec=0x7effbfb33880, count=1, stbuf=0x7effbfb33810,
>>> iobref=0x7effb804a0d0, xdata=0x0) at /home/jenkins/root/workspace/c
>>> entos6-regression/xlators/encryption/crypt/src/crypt.c:371
>>> #8  0x7effbec9cb4b in dht_readv_cbk (frame=0x7effb4010160,
>>> cookie=0x7effb400ff40, this=0x7effb800a200, op_ret=0, op_errno=2,
>>> vector=0x7effbfb33880, count=1, stbuf=0x7effbfb33810,
>>> iobref=0x7effb804a0d0, xdata=0x0) at /home/jenkins/root/workspace/c
>>> entos6-regression/xlators/cluster/dht/src/dht-inode-read.c:479
>>> #9  0x7effbeefff83 in client3_3_readv_cbk (req=0x7effb40048b0,
>>> iov=0x7effb40048f0, count=2, myframe=0x7effb400ff40)
>>> at /home/jenkins/root/workspace/centos6-regression/xlators/prot
>>> ocol/client/src/client-rpc-fops.c:2997
>>> #10 0x7effcc7b681e in rpc_clnt_handle_reply (clnt=0x7effb803eb70,
>>> pollin=0x7effb807b350) at /home/jenkins/root/workspace/c
>>> entos6-regression/rpc/rpc-lib/src/rpc-clnt.c:793
>>> #11 0x7effcc7b6de8 in rpc_clnt_notify (trans=0x7effb803ed10,
> >>

[Gluster-devel] ./tests/encryption/crypt.t fails regression with core

2017-06-21 Thread Kotresh Hiremath Ravishankar
Hi

./tests/encryption/crypt.t fails regression on
https://build.gluster.org/job/centos6-regression/5112/consoleFull
with a core. It doesn't seem to be related to the patch. Can somebody take
a look at it? Following is the backtrace.

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x7effbe9ef92b in offset_at_tail (conf=0xc0, object=0x7effb000ac28)
at
/home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/atom.c:96
96
/home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/atom.c:
No such file or directory.
[Current thread is 1 (LWP 1082)]
(gdb) bt
#0  0x7effbe9ef92b in offset_at_tail (conf=0xc0, object=0x7effb000ac28)
at
/home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/atom.c:96
#1  0x7effbe9ef9d5 in offset_at_data_tail (frame=0x7effa4001960,
object=0x7effb000ac28) at
/home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/atom.c:110
#2  0x7effbe9f0729 in rmw_partial_block (frame=0x7effa4001960,
cookie=0x7effb4010050, this=0x7effb800b870, op_ret=0, op_errno=2, vec=0x0,
count=0, stbuf=0x7effb402da18,
iobref=0x7effb804a0d0, atom=0x7effbec106a0 ) at
/home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/atom.c:523
#3  0x7effbe9f1339 in rmw_data_tail (frame=0x7effa4001960,
cookie=0x7effb4010050, this=0x7effb800b870, op_ret=0, op_errno=2, vec=0x0,
count=0, stbuf=0x7effb402da18,
iobref=0x7effb804a0d0, xdata=0x0) at
/home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/atom.c:716
#4  0x7effbea03684 in __crypt_readv_done (frame=0x7effb4010050,
cookie=0x0, this=0x7effb800b870, op_ret=0, op_errno=0, xdata=0x0)
at
/home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/crypt.c:3460
#5  0x7effbea0375f in crypt_readv_done (frame=0x7effb4010050,
this=0x7effb800b870) at
/home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/crypt.c:3487
#6  0x7effbea03b25 in put_one_call_readv (frame=0x7effb4010050,
this=0x7effb800b870) at
/home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/crypt.c:3514
#7  0x7effbe9f286e in crypt_readv_cbk (frame=0x7effb4010050,
cookie=0x7effb4010160, this=0x7effb800b870, op_ret=0, op_errno=2,
vec=0x7effbfb33880, count=1, stbuf=0x7effbfb33810,
iobref=0x7effb804a0d0, xdata=0x0) at
/home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/crypt.c:371
#8  0x7effbec9cb4b in dht_readv_cbk (frame=0x7effb4010160,
cookie=0x7effb400ff40, this=0x7effb800a200, op_ret=0, op_errno=2,
vector=0x7effbfb33880, count=1, stbuf=0x7effbfb33810,
iobref=0x7effb804a0d0, xdata=0x0) at
/home/jenkins/root/workspace/centos6-regression/xlators/cluster/dht/src/dht-inode-read.c:479
#9  0x7effbeefff83 in client3_3_readv_cbk (req=0x7effb40048b0,
iov=0x7effb40048f0, count=2, myframe=0x7effb400ff40)
at
/home/jenkins/root/workspace/centos6-regression/xlators/protocol/client/src/client-rpc-fops.c:2997
#10 0x7effcc7b681e in rpc_clnt_handle_reply (clnt=0x7effb803eb70,
pollin=0x7effb807b350) at
/home/jenkins/root/workspace/centos6-regression/rpc/rpc-lib/src/rpc-clnt.c:793
#11 0x7effcc7b6de8 in rpc_clnt_notify (trans=0x7effb803ed10,
mydata=0x7effb803eba0, event=RPC_TRANSPORT_MSG_RECEIVED,
data=0x7effb807b350)
at
/home/jenkins/root/workspace/centos6-regression/rpc/rpc-lib/src/rpc-clnt.c:986
#12 0x7effcc7b2e0c in rpc_transport_notify (this=0x7effb803ed10,
event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7effb807b350)
at
/home/jenkins/root/workspace/centos6-regression/rpc/rpc-lib/src/rpc-transport.c:538
#13 0x7effc136458a in socket_event_poll_in (this=0x7effb803ed10,
notify_handled=_gf_true) at
/home/jenkins/root/workspace/centos6-regression/rpc/rpc-transport/socket/src/socket.c:2315
#14 0x7effc1364bd5 in socket_event_handler (fd=10, idx=2, gen=1,
data=0x7effb803ed10, poll_in=1, poll_out=0, poll_err=0)
at
/home/jenkins/root/workspace/centos6-regression/rpc/rpc-transport/socket/src/socket.c:2467
#15 0x7effcca6216e in event_dispatch_epoll_handler
(event_pool=0x2105fc0, event=0x7effbfb33e70) at
/home/jenkins/root/workspace/centos6-regression/libglusterfs/src/event-epoll.c:572
#16 0x7effcca62470 in event_dispatch_epoll_worker (data=0x215d950) at
/home/jenkins/root/workspace/centos6-regression/libglusterfs/src/event-epoll.c:648
#17 0x7effcbcc9aa1 in start_thread () from ./lib64/libpthread.so.0
#18 0x7effcb631bcd in clone () from ./lib64/libc.so.6


Thanks,
Kotresh H R
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Announcing release 3.11 : Scope, schedule and feature tracking

2017-04-26 Thread Kotresh Hiremath Ravishankar
Hi Shyam,

The following RFE is merged in master with a GitHub issue and should go into 3.11:

GitHub issue: https://github.com/gluster/glusterfs/issues/191
Patch:https://review.gluster.org/#/c/15472/
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1443373

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Shyam" 
> To: gluster-devel@gluster.org
> Sent: Wednesday, April 26, 2017 5:00:02 AM
> Subject: Re: [Gluster-devel] Announcing release 3.11 : Scope, schedule and 
> feature tracking
> 
> On 04/25/2017 10:16 AM, Shyam wrote:
> > On 04/20/2017 02:46 AM, Kaushal M wrote:
> >>> 2) IPv6 support (@kaushal)
> >> This is under review at https://review.gluster.org/16228 . The patch
> >> mostly looks fine.
> >>
> >> The only issue is that it currently depends and links with an internal
> >> FB fork of tirpc (mainly for some helper functions and utilities).
> >> This makes it hard for the community to make actual use of, and test,
> >> the IPv6 features/fixes introduced by the change.
> >>
> >> If the change were refactored to use publicly available versions of
> >> tirpc or ntirpc, I'm OK for it to be merged. I did try it out myself.
> >> While I was able to build it against available versions of tirpc, I
> >> wasn't able to get it working correctly.
> >>
> >
> > I checked the patch and here are my comments on merging this,
> >
> > 1) We are encouraging FB to actually not use FB specific configure time
> > options, and instead use a site.h like approach (wherein we can build
> > with different site.h files and not proliferate options). This
> > discussion I realize is not public, nor is there a github issue for the
> > same.
> >
> > Considering this, we would need this patch to change appropriately.
> >
> > 2) I also agree on the tirpc dependency, if we could make it work with
> > the publicly available tirpc, it is better as otherwise it is difficult
> > to use by the community.
> >
> > Considering this, I would suggest we (as in all concerned) work on these
> > aspects and get it right in master before we take it in for a release.
> 
> Forgot to mention, please add a github issue to track this feature
> against the release scope (3.11 or otherwise).
> 
> >
> > Thanks,
> > Shyam
> > ___
> > Gluster-devel mailing list
> > Gluster-devel@gluster.org
> > http://lists.gluster.org/mailman/listinfo/gluster-devel
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] Announcing release 3.11 : Scope, schedule and feature tracking

2017-04-25 Thread Kotresh Hiremath Ravishankar
Hi Serkan,

Even though bitrot is not enabled, versioning was being done.
As part of it, on every fresh lookup, getxattr calls were
made to find out whether the object is bad and to get its current
version and signature. So a find on a gluster mount would sometimes
cause high CPU utilization.

Since this is an RFE, it will be available from 3.11 and will not
be back-ported to 3.10.x.


Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Serkan Çoban" <cobanser...@gmail.com>
> To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> Cc: "Shyam" <srang...@redhat.com>, "Gluster Users" 
> <gluster-us...@gluster.org>, "Gluster Devel"
> <gluster-devel@gluster.org>
> Sent: Tuesday, April 25, 2017 1:25:39 PM
> Subject: Re: [Gluster-users] [Gluster-devel] Announcing release 3.11 : Scope, 
> schedule and feature tracking
> 
> How does this affect CPU usage? Does it read the whole file and calculate a
> hash after it is written?
> Will this patch land in 3.10.x?
> 
> On Tue, Apr 25, 2017 at 10:32 AM, Kotresh Hiremath Ravishankar
> <khire...@redhat.com> wrote:
> > Hi
> >
> > https://github.com/gluster/glusterfs/issues/188 is merged in master
> > and needs to go in 3.11
> >
> > Thanks and Regards,
> > Kotresh H R
> >
> > - Original Message -
> >> From: "Kaushal M" <kshlms...@gmail.com>
> >> To: "Shyam" <srang...@redhat.com>
> >> Cc: gluster-us...@gluster.org, "Gluster Devel" <gluster-devel@gluster.org>
> >> Sent: Thursday, April 20, 2017 12:16:39 PM
> >> Subject: Re: [Gluster-devel] Announcing release 3.11 : Scope, schedule and
> >> feature tracking
> >>
> >> On Thu, Apr 13, 2017 at 8:17 PM, Shyam <srang...@redhat.com> wrote:
> >> > On 02/28/2017 10:17 AM, Shyam wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> With release 3.10 shipped [1], it is time to set the dates for release
> >> >> 3.11 (and subsequently 4.0).
> >> >>
> >> >> This mail has the following sections, so please read or revisit as
> >> >> needed,
> >> >>   - Release 3.11 dates (the schedule)
> >> >>   - 3.11 focus areas
> >> >
> >> >
> >> > Pinging the list on the above 2 items.
> >> >
> >> >> *Release 3.11 dates:*
> >> >> Based on our release schedule [2], 3.11 would be 3 months from the 3.10
> >> >> release and would be a Short Term Maintenance (STM) release.
> >> >>
> >> >> This puts 3.11 schedule as (working from the release date backwards):
> >> >> - Release: May 30th, 2017
> >> >> - Branching: April 27th, 2017
> >> >
> >> >
> >> > Branching is about 2 weeks away, other than the initial set of overflow
> >> > features from 3.10 nothing else has been raised on the lists and in
> >> > github
> >> > as requests for 3.11.
> >> >
> >> > So, a reminder to folks who are working on features, to raise the
> >> > relevant
> >> > github issue for the same, and post it to devel list for consideration
> >> > in
> >> > 3.11 (also this helps tracking and ensuring we are waiting for the right
> >> > things at the time of branching).
> >> >
> >> >>
> >> >> *3.11 focus areas:*
> >> >> As maintainers of gluster, we want to harden testing around the various
> >> >> gluster features in this release. Towards this the focus area for this
> >> >> release are,
> >> >>
> >> >> 1) Testing improvements in Gluster
> >> >>   - Primary focus would be to get automated test cases to determine
> >> >> release health, rather than repeating a manual exercise every 3 months
> >> >>   - Further, we would also attempt to focus on maturing Glusto[7] for
> >> >> this, and other needs (as much as possible)
> >> >>
> >> >> 2) Merge all (or as much as possible) Facebook patches into master, and
> >> >> hence into release 3.11
> >> >>   - Facebook has (as announced earlier [3]) started posting their
> >> >> patches mainline, and this needs some attention to make it into master
> >> >>
> >> >
> >> > Further to the above, we are also considering the following features for
> >> > this release, re

Re: [Gluster-devel] Announcing release 3.11 : Scope, schedule and feature tracking

2017-04-25 Thread Kotresh Hiremath Ravishankar
Hi

https://github.com/gluster/glusterfs/issues/188 is merged in master
and needs to go in 3.11

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Kaushal M" 
> To: "Shyam" 
> Cc: gluster-us...@gluster.org, "Gluster Devel" 
> Sent: Thursday, April 20, 2017 12:16:39 PM
> Subject: Re: [Gluster-devel] Announcing release 3.11 : Scope, schedule and 
> feature tracking
> 
> On Thu, Apr 13, 2017 at 8:17 PM, Shyam  wrote:
> > On 02/28/2017 10:17 AM, Shyam wrote:
> >>
> >> Hi,
> >>
> >> With release 3.10 shipped [1], it is time to set the dates for release
> >> 3.11 (and subsequently 4.0).
> >>
> >> This mail has the following sections, so please read or revisit as needed,
> >>   - Release 3.11 dates (the schedule)
> >>   - 3.11 focus areas
> >
> >
> > Pinging the list on the above 2 items.
> >
> >> *Release 3.11 dates:*
> >> Based on our release schedule [2], 3.11 would be 3 months from the 3.10
> >> release and would be a Short Term Maintenance (STM) release.
> >>
> >> This puts 3.11 schedule as (working from the release date backwards):
> >> - Release: May 30th, 2017
> >> - Branching: April 27th, 2017
> >
> >
> > Branching is about 2 weeks away, other than the initial set of overflow
> > features from 3.10 nothing else has been raised on the lists and in github
> > as requests for 3.11.
> >
> > So, a reminder to folks who are working on features, to raise the relevant
> > github issue for the same, and post it to devel list for consideration in
> > 3.11 (also this helps tracking and ensuring we are waiting for the right
> > things at the time of branching).
> >
> >>
> >> *3.11 focus areas:*
> >> As maintainers of gluster, we want to harden testing around the various
> >> gluster features in this release. Towards this the focus area for this
> >> release are,
> >>
> >> 1) Testing improvements in Gluster
> >>   - Primary focus would be to get automated test cases to determine
> >> release health, rather than repeating a manual exercise every 3 months
> >>   - Further, we would also attempt to focus on maturing Glusto[7] for
> >> this, and other needs (as much as possible)
> >>
> >> 2) Merge all (or as much as possible) Facebook patches into master, and
> >> hence into release 3.11
> >>   - Facebook has (as announced earlier [3]) started posting their
> >> patches mainline, and this needs some attention to make it into master
> >>
> >
> > Further to the above, we are also considering the following features for
> > this release, request feature owners to let us know if these are actively
> > being worked on and if these will make the branching dates. (calling out
> > folks that I think are the current feature owners for the same)
> >
> > 1) Halo - Initial Cut (@pranith)
> > 2) IPv6 support (@kaushal)
> 
> This is under review at https://review.gluster.org/16228 . The patch
> mostly looks fine.
> 
> The only issue is that it currently depends and links with an internal
> FB fork of tirpc (mainly for some helper functions and utilities).
> This makes it hard for the community to make actual use of, and test, the
> IPv6 features/fixes introduced by the change.
> 
> If the change were refactored to use publicly available versions of
> tirpc or ntirpc, I'm OK for it to be merged. I did try it out myself.
> While I was able to build it against available versions of tirpc, I
> wasn't able to get it working correctly.
> 
> > 3) Negative lookup (@poornima)
> > 4) Parallel Readdirp - More changes to default settings. (@poornima, @du)
> >
> >
> >> [1] 3.10 release announcement:
> >> http://lists.gluster.org/pipermail/gluster-devel/2017-February/052188.html
> >>
> >> [2] Gluster release schedule:
> >> https://www.gluster.org/community/release-schedule/
> >>
> >> [3] Mail regarding facebook patches:
> >> http://lists.gluster.org/pipermail/gluster-devel/2016-December/051784.html
> >>
> >> [4] Release scope: https://github.com/gluster/glusterfs/projects/1
> >>
> >> [5] glusterfs github issues: https://github.com/gluster/glusterfs/issues
> >>
> >> [6] github issues for features and major fixes:
> >> https://hackmd.io/s/BkgH8sdtg#
> >>
> >> [7] Glusto tests: https://github.com/gluster/glusto-tests
> >> ___
> >> Gluster-devel mailing list
> >> Gluster-devel@gluster.org
> >> http://lists.gluster.org/mailman/listinfo/gluster-devel
> >
> > ___
> > Gluster-devel mailing list
> > Gluster-devel@gluster.org
> > http://lists.gluster.org/mailman/listinfo/gluster-devel
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] High load on glusterfsd process

2017-04-25 Thread Kotresh Hiremath Ravishankar
Hi Abhishek,

As this is an enhancement, it won't be back-ported to 3.7/3.8/3.10.
It will only be available from the upcoming 3.11 release.

But I did try applying it to 3.7.6; it has a lot of conflicts.
If it's important for you, you can upgrade to the latest version
available and back-port it. If it's impossible to upgrade to the
latest version, at least 3.7.20 would do; it has minimal
conflicts. I can help you out with that.

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "ABHISHEK PALIWAL" <abhishpali...@gmail.com>
> To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> Cc: "Pranith Kumar Karampuri" <pkara...@redhat.com>, "Gluster Devel" 
> <gluster-devel@gluster.org>, "gluster-users"
> <gluster-us...@gluster.org>
> Sent: Tuesday, April 25, 2017 10:58:41 AM
> Subject: Re: [Gluster-users] High load on glusterfsd process
> 
> Hi Kotresh,
> 
> Could you please update whether it is possible to get the patch or bakport
> this patch on Gluster 3.7.6 version.
> 
> Regards,
> Abhishek
> 
> On Mon, Apr 24, 2017 at 6:14 PM, ABHISHEK PALIWAL <abhishpali...@gmail.com>
> wrote:
> 
> > What is the way to take this patch on Gluster 3.7.6 or only way to upgrade
> > the version?
> >
> > On Mon, Apr 24, 2017 at 3:22 PM, ABHISHEK PALIWAL <abhishpali...@gmail.com
> > > wrote:
> >
> >> Hi Kotresh,
> >>
> >> I have seen the patch available on the link which you shared. It seems we
> >> don't have some files in gluser 3.7.6 which you modified in the patch.
> >>
> >> Is there any possibility to provide the patch for Gluster 3.7.6?
> >>
> >> Regards,
> >> Abhishek
> >>
> >> On Mon, Apr 24, 2017 at 3:07 PM, Kotresh Hiremath Ravishankar <
> >> khire...@redhat.com> wrote:
> >>
> >>> Hi Abhishek,
> >>>
> >>> Bitrot requires versioning of files to be down on writes.
> >>> This was being done irrespective of whether bitrot is
> >>> enabled or not. This takes considerable CPU. With the
> >>> fix https://review.gluster.org/#/c/14442/, it is made
> >>> optional and is enabled only with bitrot. If bitrot
> >>> is not enabled, then you won't see any setxattr/getxattrs
> >>> related to bitrot.
> >>>
> >>> The fix would be available in 3.11.
> >>>
> >>>
> >>> Thanks and Regards,
> >>> Kotresh H R
> >>>
> >>> - Original Message -
> >>> > From: "ABHISHEK PALIWAL" <abhishpali...@gmail.com>
> >>> > To: "Pranith Kumar Karampuri" <pkara...@redhat.com>
> >>> > Cc: "Gluster Devel" <gluster-devel@gluster.org>, "gluster-users" <
> >>> gluster-us...@gluster.org>, "Kotresh Hiremath
> >>> > Ravishankar" <khire...@redhat.com>
> >>> > Sent: Monday, April 24, 2017 11:30:57 AM
> >>> > Subject: Re: [Gluster-users] High load on glusterfsd process
> >>> >
> >>> > Hi Kotresh,
> >>> >
> >>> > Could you please update me on this?
> >>> >
> >>> > Regards,
> >>> > Abhishek
> >>> >
> >>> > On Sat, Apr 22, 2017 at 12:31 PM, Pranith Kumar Karampuri <
> >>> > pkara...@redhat.com> wrote:
> >>> >
> >>> > > +Kotresh who seems to have worked on the bug you mentioned.
> >>> > >
> >>> > > On Fri, Apr 21, 2017 at 12:21 PM, ABHISHEK PALIWAL <
> >>> > > abhishpali...@gmail.com> wrote:
> >>> > >
> >>> > >>
> >>> > >> If the patch provided in that case will resolve my bug as well then
> >>> > >> please provide the patch so that I will backport it on 3.7.6
> >>> > >>
> >>> > >> On Fri, Apr 21, 2017 at 11:30 AM, ABHISHEK PALIWAL <
> >>> > >> abhishpali...@gmail.com> wrote:
> >>> > >>
> >>> > >>> Hi Team,
> >>> > >>>
> >>> > >>> I have noticed that there are so many glusterfsd threads are
> >>> running in
> >>> > >>> my system and we observed some of those thread consuming more cpu.
> >>> I
> >>> > >>> did “strace” on two such threads (before the problem disappeared by
> >>> > >>> itself)

Re: [Gluster-devel] [Gluster-users] High load on glusterfsd process

2017-04-24 Thread Kotresh Hiremath Ravishankar
Hi Abhishek,

Bitrot requires versioning of files to be done on writes.
This was being done irrespective of whether bitrot is
enabled or not, and it takes considerable CPU. With the
fix https://review.gluster.org/#/c/14442/, versioning is made
optional and is enabled only with bitrot. If bitrot
is not enabled, you won't see any setxattr/getxattr calls
related to bitrot.

The fix will be available in 3.11.
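
For anyone who wants to verify on a brick whether a given file already carries
these xattrs, here is a minimal sketch using the same lgetxattr() call seen in
the strace output below; the default path is just a placeholder.

/* Minimal sketch: probe a brick file for the bitrot signature xattr.
 * The default path is a placeholder; pass a real brick path and run as root. */
#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/xattr.h>

int main(int argc, char **argv)
{
    const char *path = (argc > 1) ? argv[1] : "/path/to/brick/file";
    char value[256];
    ssize_t len = lgetxattr(path, "trusted.bit-rot.signature",
                            value, sizeof(value));

    if (len < 0 && errno == ENODATA)
        printf("%s: not signed yet (no trusted.bit-rot.signature)\n", path);
    else if (len < 0)
        perror("lgetxattr");
    else
        printf("%s: signature xattr present, %zd bytes\n", path, len);
    return 0;
}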


Thanks and Regards,
Kotresh H R

- Original Message -
> From: "ABHISHEK PALIWAL" <abhishpali...@gmail.com>
> To: "Pranith Kumar Karampuri" <pkara...@redhat.com>
> Cc: "Gluster Devel" <gluster-devel@gluster.org>, "gluster-users" 
> <gluster-us...@gluster.org>, "Kotresh Hiremath
> Ravishankar" <khire...@redhat.com>
> Sent: Monday, April 24, 2017 11:30:57 AM
> Subject: Re: [Gluster-users] High load on glusterfsd process
> 
> Hi Kotresh,
> 
> Could you please update me on this?
> 
> Regards,
> Abhishek
> 
> On Sat, Apr 22, 2017 at 12:31 PM, Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
> 
> > +Kotresh who seems to have worked on the bug you mentioned.
> >
> > On Fri, Apr 21, 2017 at 12:21 PM, ABHISHEK PALIWAL <
> > abhishpali...@gmail.com> wrote:
> >
> >>
> >> If the patch provided in that case will resolve my bug as well then
> >> please provide the patch so that I will backport it on 3.7.6
> >>
> >> On Fri, Apr 21, 2017 at 11:30 AM, ABHISHEK PALIWAL <
> >> abhishpali...@gmail.com> wrote:
> >>
> >>> Hi Team,
> >>>
> >>> I have noticed that there are so many glusterfsd threads running in
> >>> my system, and we observed some of those threads consuming more CPU. I
> >>> did “strace” on two such threads (before the problem disappeared by
> >>> itself)
> >>> and found that there is a continuous activity like below:
> >>>
> >>> lstat("/opt/lvmdir/c2/brick/.glusterfs/e7/7d/e77d12b3-92f8-4
> >>> dfe-9a7f-246e901cbdf1/002700/firewall_-J208482-425_20170126T113552+.log.gz",
> >>> {st_mode=S_IFREG|0670, st_size=1995, ...}) = 0
> >>> lgetxattr("/opt/lvmdir/c2/brick/.glusterfs/e7/7d/e77d12b3-92
> >>> f8-4dfe-9a7f-246e901cbdf1/002700/firewall_-J208482-425_20170126T113552+.log.gz",
> >>> "trusted.bit-rot.bad-file", 0x3fff81f58550, 255) = -1 ENODATA (No data
> >>> available)
> >>> lgetxattr("/opt/lvmdir/c2/brick/.glusterfs/e7/7d/e77d12b3-92
> >>> f8-4dfe-9a7f-246e901cbdf1/002700/firewall_-J208482-425_20170126T113552+.log.gz",
> >>> "trusted.bit-rot.signature", 0x3fff81f58550, 255) = -1 ENODATA (No data
> >>> available)
> >>> lstat("/opt/lvmdir/c2/brick/.glusterfs/e7/7d/e77d12b3-92f8-4
> >>> dfe-9a7f-246e901cbdf1/002700/tcli_-J208482-425_20170123T180550+.log.gz",
> >>> {st_mode=S_IFREG|0670, st_size=169, ...}) = 0
> >>> lgetxattr("/opt/lvmdir/c2/brick/.glusterfs/e7/7d/e77d12b3-92
> >>> f8-4dfe-9a7f-246e901cbdf1/002700/tcli_-J208482-425_20170123T180550+.log.gz",
> >>> "trusted.bit-rot.bad-file", 0x3fff81f58550, 255) = -1 ENODATA (No data
> >>> available)
> >>> lgetxattr("/opt/lvmdir/c2/brick/.glusterfs/e7/7d/e77d12b3-92
> >>> f8-4dfe-9a7f-246e901cbdf1/002700/tcli_-J208482-425_20170123T180550+.log.gz",
> >>> "trusted.bit-rot.signature", 0x3fff81f58550, 255) = -1 ENODATA (No data
> >>> available)
> >>>
> >>> I have found the below existing issue which is very similar to my
> >>> scenario.
> >>>
> >>> https://bugzilla.redhat.com/show_bug.cgi?id=1298258
> >>>
> >>> We are using the gluster-3.7.6 and it seems that the issue is fixed in
> >>> 3.8.4 version.
> >>>
> >>> Could you please let me know why it showing the number of above logs and
> >>> reason behind it as it is not explained in the above bug.
> >>>
> >>> Regards,
> >>> Abhishek
> >>>
> >>> --
> >>>
> >>>
> >>>
> >>>
> >>> Regards
> >>> Abhishek Paliwal
> >>>
> >>
> >>
> >>
> >> --
> >>
> >>
> >>
> >>
> >> Regards
> >> Abhishek Paliwal
> >>
> >> ___
> >> Gluster-users mailing list
> >> gluster-us...@gluster.org
> >> http://lists.gluster.org/mailman/listinfo/gluster-users
> >>
> >
> >
> >
> > --
> > Pranith
> >
> 
> 
> 
> --
> 
> 
> 
> 
> Regards
> Abhishek Paliwal
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] 3.9. feature freeze status check

2016-08-28 Thread Kotresh Hiremath Ravishankar
Hi Pranith,

Please add the following bitrot feature to the 3.9 roadmap page. It is merged.

feature/bitrot: Ondemand scrub option for bitrot 
(http://review.gluster.org/#/c/15111/)

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Poornima Gurusiddaiah" 
> To: "Pranith Kumar Karampuri" 
> Cc: "Anoop Chirayath Manjiyil Sajan" , "Kaushal Madappa" 
> , "Gluster Devel"
> , "Jose Rivera" 
> Sent: Monday, August 29, 2016 10:18:46 AM
> Subject: Re: [Gluster-devel] 3.9. feature freeze status check
> 
> Hi,
> 
> Updated inline.
> 
> 
> 
> 
> From: "Pranith Kumar Karampuri" 
> To: "Rajesh Joseph" , "Manikandan Selvaganesh"
> , "Csaba Henk" , "Niels de Vos"
> , "Jiffin Thottan" , "Aravinda
> Vishwanathapura Krishna Murthy" , "Anoop Chirayath
> Manjiyil Sajan" , "Ravishankar Narayanankutty"
> , "Kaushal Madappa" ,
> "Raghavendra Talur" , "Poornima Gurusiddaiah"
> , "Soumya Koduri" , "Kaleb
> Keithley" , "Jose Rivera" ,
> "Prashanth Pai" , "Samikshan Bairagya"
> , "Vijay Bellur" , "Prasanna
> Kalever" 
> Cc: "Gluster Devel" 
> Sent: Friday, August 26, 2016 12:21:07 PM
> Subject: Re: 3.9. feature freeze status check
> 
> Prasanna, Prashant,
> Could you add a short description of the features you are working on for 3.9
> as well to the list?
> 
> On Fri, Aug 26, 2016 at 9:39 PM, Pranith Kumar Karampuri <
> pkara...@redhat.com > wrote:
> 
> 
> 
> 
> 
> On Fri, Aug 26, 2016 at 9:38 PM, Pranith Kumar Karampuri <
> pkara...@redhat.com > wrote:
> 
> 
> 
> hi,
> Now that we are almost near the feature freeze date (31st of Aug), want to
> get a sense if any of the status of the features.
> 
> I meant "want to get a sense of the status of the features"
> 
> 
> 
> 
> Please respond with:
> 1) Feature already merged
> 2) Undergoing review will make it by 31st Aug
> 3) Undergoing review, but may not make it by 31st Aug
> 4) Feature won't make it for 3.9.
> 
> I added the features that were not planned(i.e. not in the 3.9 roadmap page)
> but made it to the release and not planned but may make it to release at the
> end of this mail.
> If you added a feature on master that will be released as part of 3.9.0 but
> forgot to add it to roadmap page, please let me know I will add it.
> 
> Here are the features planned as per the roadmap:
> 1) Throttling
> Feature owner: Ravishankar
> 
> 2) Trash improvements
> Feature owners: Anoop, Jiffin
> 
> 3) Kerberos for Gluster protocols:
> Feature owners: Niels, Csaba
> 
> 4) SELinux on gluster volumes:
> Feature owners: Niels, Manikandan
> 
> 5) Native sub-directory mounts:
> Feature owners: Kaushal, Pranith
> 
> 6) RichACL support for GlusterFS:
> Feature owners: Rajesh Joseph
> 
> 7) Sharemodes/Share reservations:
> Feature owners: Raghavendra Talur, Poornima G, Soumya Koduri, Rajesh Joseph,
> Anoop C S
> 
> 8) Integrate with external resource management software
> Feature owners: Kaleb Keithley, Jose Rivera
> 
> 9) Python Wrappers for Gluster CLI Commands
> Feature owners: Aravinda VK
> 
> 10) Package and ship libgfapi-python
> Feature owners: Prashant Pai
> 
> 11) Management REST APIs
> Feature owners: Aravinda VK
> 
> 12) Events APIs
> Feature owners: Aravinda VK
> 
> 13) CLI to get state representation of a cluster from the local glusterd pov
> Feature owners: Samikshan Bairagya
> 
> 14) Posix-locks Reclaim support
> Feature owners: Soumya Koduri
> 
> 15) Deprecate striped volumes
> Feature owners: Vijay Bellur, Niels de Vos
> 
> 16) Improvements in Gluster NFS-Ganesha integration
> Feature owners: Jiffin Tony Thottan, Soumya Koduri
> 
> The following need to be added to the roadmap:
> 
> > Features that made it to master already but were not planned:
> 1) Multi threaded self-heal in EC
> Feature owner: Pranith (Did this because serkan asked for it. He has 9PB
> volume, self-healing takes a long time :-/)
> 
> 2) Lock revocation (Facebook patch)
> Feature owner: Richard Wareing
> 
> Features that look like will make it to 3.9.0:
> 1) Hardware extension support for EC
> Feature owner: Xavi
> 
> 2) Reset brick support for replica volumes:
> Feature owner: Anuradha
> 
> 3) Md-cache perf improvements in smb:
> Feature owner: Poornima
> Feature undergoing review. It will be in tech-preview for this release. Main
> feature will be merged by 31st August 2016.
> 
> 
> 
> 
> 
> 
> 
> 
> 
> --
> Pranith
> 
> 
> 
> --
> Pranith
> 
> 
> 
> --
> Pranith
> 
> Regards,
> Poornima
> 
> 
> 
> 
> 
> 
> ___
> 

Re: [Gluster-devel] CFP for Gluster Developer Summit

2016-08-23 Thread Kotresh Hiremath Ravishankar
Hi,

We would like to propose the following talk.

Title: Gluster Geo-replication
Theme: Stability and Performance

We plan to cover the following things.
- Introduction
- New Features
- Stability and Usability Improvements
- Performance Improvements.
- Road-map

Thanks,
Kotresh HR and Aravinda VK

- Original Message -
> From: "Vijay Bellur" 
> To: "Gluster Devel" , "gluster-users Discussion 
> List" 
> Cc: "Amye Scavarda" , "Ric Wheeler" 
> Sent: Saturday, August 13, 2016 1:18:49 AM
> Subject: [Gluster-devel] CFP for Gluster Developer Summit
> 
> Hey All,
> 
> Gluster Developer Summit 2016 is fast approaching [1] on us. We are
> looking to have talks and discussions related to the following themes in
> the summit:
> 
> 1. Gluster.Next - focusing on features shaping the future of Gluster
> 
> 2. Experience - Description of real world experience and feedback from:
> a> Devops and Users deploying Gluster in production
> b> Developers integrating Gluster with other ecosystems
> 
> 3. Use cases  - focusing on key use cases that drive Gluster.today and
> Gluster.Next
> 
> 4. Stability & Performance - focusing on current improvements to reduce
> our technical debt backlog
> 
> 5. Process & infrastructure  - focusing on improving current workflow,
> infrastructure to make life easier for all of us!
> 
> If you have a talk/discussion proposal that can be part of these themes,
> please send out your proposal(s) by replying to this thread. Please
> clearly mention the theme for which your proposal is relevant when you
> do so. We will be ending the CFP by 12 midnight PDT on August 31st, 2016.
> 
> If you have other topics that do not fit in the themes listed, please
> feel free to propose and we might be able to accommodate some of them as
> lightening talks or something similar.
> 
> Please do reach out to me or Amye if you have any questions.
> 
> Thanks!
> Vijay
> 
> [1] https://www.gluster.org/events/summit2016/
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] ./tests/basic/afr/granular-esh/add-brick.t spurious failure

2016-07-26 Thread Kotresh Hiremath Ravishankar
Hi,

The above-mentioned AFR test has failed in the regression run below, and the
failure is not related to the patch under test.

https://build.gluster.org/job/rackspace-regression-2GB-triggered/22485/consoleFull

Can someone from the AFR team look into it?

Thanks and Regards,
Kotresh H R

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] ./tests/basic/afr/entry-self-heal.t regression failure

2016-07-21 Thread Kotresh Hiremath Ravishankar
Hi,

One more AFR test has failed for the patch http://review.gluster.org/14903/,
and the failure is not related to the patch. Can someone from the AFR team
look into it?

https://build.gluster.org/job/rackspace-regression-2GB-triggered/22357/console

Thanks and Regards,
Kotresh H R

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Regression failures in last 3 days

2016-07-20 Thread Kotresh Hiremath Ravishankar
Hi,

Here is the patch for the br-stub.t failures:
http://review.gluster.org/14960
Thanks, Soumya, for root-causing this.

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Poornima Gurusiddaiah" <pguru...@redhat.com>
> To: "Gluster Devel" <gluster-devel@gluster.org>, "Kotresh Hiremath 
> Ravishankar" <khire...@redhat.com>, "Rajesh
> Joseph" <rjos...@redhat.com>, "Ravishankar N" <ravishan...@redhat.com>, 
> "Ashish Pandey" <aspan...@redhat.com>
> Sent: Wednesday, July 20, 2016 10:43:33 AM
> Subject: Regression failures in last 3 days
> 
> Hi,
> 
> Below are the list of test cases that have failed regression in the last 3
> days. Please take a look at them:
> 
> ./tests/bitrot/br-stub.t ; Failed 8 times
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22356/consoleFull
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22355/consoleFull
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22340/consoleFull
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22325/consoleFull
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22322/consoleFull
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22316/consoleFull
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22313/consoleFull
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22293/consoleFull
> 
> ./tests/bugs/snapshot/bug-1316437.t ; Failed 6 times
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22361/consoleFull
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22343/consoleFull
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22340/consoleFull
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22329/consoleFull
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22327/consoleFull
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22324/consoleFull
> 
> ./tests/basic/afr/arbiter-mount.t ; Failed 4 times
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22354/consoleFull
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22353/consoleFull
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22311/consoleFull
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22306/consoleFull
> 
> ./tests/basic/ec/ec.t ; Failed 3 times
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22335/consoleFull
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22290/consoleFull
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22287/consoleFull
> 
> ./tests/bugs/disperse/bug-1236065.t ; Failed 1 times
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22339/consoleFull
> 
> ./tests/basic/afr/add-brick-self-heal.t ; Failed 1 times
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22315/consoleFull
> 
> ./tests/basic/tier/tierd_check.t ; Failed 2 times
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22299/consoleFull
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22296/consoleFull
> 
> ./tests/bugs/glusterd/bug-041.t ; Failed 1 times
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22331/consoleFull
> 
> ./tests/bugs/glusterd/bug-1089668.t ; Failed 1 times
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22304/consoleFull
> 
> ./tests/basic/ec/ec-new-entry.t ; Failed 1 times
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22359/consoleFull
> 
> ./tests/basic/uss.t ; Failed 1 times
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22352/consoleFull
> 
> ./tests/basic/geo-replication/marker-xattrs.t ; Failed 1 times
> Regression Links:
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22337/consoleFull
> 
> ./tests/basic/bd.t ; Failed 1 times
> Regression Links:
> https://build.gluster.org/job/rackspace-regressio

[Gluster-devel] ./tests/basic/afr/split-brain-favorite-child-policy.t regression failure on NetBSD

2016-07-18 Thread Kotresh Hiremath Ravishankar
Hi,

The above-mentioned test has failed for the patch
http://review.gluster.org/#/c/14927/1
and the failure is not related to my patch. Can someone from the AFR team
look into it?

https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/18132/console

Thanks and Regards,
Kotresh H R

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-infra] Please test Gerrit 2.12.2

2016-05-31 Thread Kotresh Hiremath Ravishankar
Hi Prasanna,

The 'Fix' button is visible. Maybe you are missing something; please check.

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Prasanna Kalever" 
> To: "Nigel Babu" 
> Cc: "gluster-infra" , "gluster-devel" 
> 
> Sent: Tuesday, May 31, 2016 12:13:47 PM
> Subject: Re: [Gluster-devel] [Gluster-infra] Please test Gerrit 2.12.2
> 
> Hi Nigel,
> 
> I don't see the 'Fix' button in the comment section, which was introduced
> in 2.12.2 (the release with the "fix for a remote code execution exploit");
> it helps us edit the code in the Gerrit web editor instantaneously, so we
> don't have to cherry-pick the patch every time to address minor code
> changes.
> 
> I feel that is really helpful for the developers to address comments
> faster and easier.
> 
> Please see [1], it also has attachments showing how this looks
> 
> [1] http://www.gluster.org/pipermail/gluster-devel/2016-May/049429.html
> 
> 
> Thanks,
> --
> Prasanna
> 
> On Tue, May 31, 2016 at 10:39 AM, Nigel Babu  wrote:
> > Hello,
> >
> > A reminder: I'm hoping to get this done tomorrow morning at 0230 GMT[1].
> > I'll have a backup ready in case something goes wrong. I've tested this
> > process on review.nigelb.me and it's gone reasonably smoothly.
> >
> > [1]:
> > http://www.timeanddate.com/worldclock/fixedtime.html?msg=Maintenance=20160601T08=176=1
> >
> > On Mon, May 30, 2016 at 7:26 PM, Nigel Babu  wrote:
> >>
> >> Hello,
> >>
> >> I've now upgraded Gerrit on http://review.nigelb.me to 2.12.2. Please
> >> spend a few minutes testing that everything works as you expect it to. If
> >> I
> >> don't hear anything negative by tomorrow, I'd like to schedule an upgrade
> >> this week.
> >>
> >> --
> >> nigelb
> >
> >
> >
> >
> > --
> > nigelb
> >
> > ___
> > Gluster-infra mailing list
> > gluster-in...@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-infra
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] 3.8: Centos Regression Failure

2016-05-05 Thread Kotresh Hiremath Ravishankar
Hi

./tests/bugs/replicate/bug-977797.t fails in the following run.

https://build.gluster.org/job/rackspace-regression-2GB-triggered/20473/console

It succeeds on my local machine; it could be spurious.

Could someone from the replication team look into it?


Thanks and Regards,
Kotresh H R

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Bitrot Review Request

2016-04-29 Thread Kotresh Hiremath Ravishankar
Hi Pranith,

You had a concern about consuming I/O threads when bit-rot uses the rchecksum
interface for signing, normal scrubbing, and on-demand scrubbing with tiering.
 
  http://review.gluster.org/#/c/13833/5/xlators/storage/posix/src/posix.c

As discussed in the review comments, the concern is valid, so the above patch
is not being taken in and will be abandoned.

I have the following patch, where signing and normal scrubbing would not
consume io-threads; only on-demand scrubbing consumes io-threads. I think
this should be fine, as tiering is single-threaded and only consumes
one I/O thread (as noted by Joseph on Patch Set 6).

  http://review.gluster.org/#/c/13969/

Since on-demand scrubbing is disabled by default, there is a size cap, and we
document increasing the default number of I/O threads, consuming one I/O
thread for scrubbing should be fine, I guess.

Let me know your thoughts.

Thanks and Regards,
Kotresh H R

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-Maintainers] Update on 3.7.10 - on schedule to be tagged at 2200PDT 30th March.

2016-03-31 Thread Kotresh Hiremath Ravishankar
Point noted, I will keep you informed next time!

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Kaushal M" <kshlms...@gmail.com>
> To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> Cc: "Aravinda" <avish...@redhat.com>, "Gluster Devel" 
> <gluster-devel@gluster.org>, maintain...@gluster.org
> Sent: Thursday, March 31, 2016 7:32:58 PM
> Subject: Re: [Gluster-Maintainers] Update on 3.7.10 - on schedule to be 
> tagged at 2200PDT 30th March.
> 
> This is a really hard to hit issue, that requires a lot of things to
> be in place for it to happen.
> But it is an unexpected data loss issue.
> 
> I'll wait tonight for the change to be merged, though I really don't like it.
> 
> You could have informed me on this thread earlier.
> Please, in the future, keep release-managers/maintainers updated about
> any critical changes.
> 
> The only reason this is getting merged now, is because of the Jenkins
> migration which got completed surprisingly quickly.
> 
> On Thu, Mar 31, 2016 at 7:08 PM, Kotresh Hiremath Ravishankar
> <khire...@redhat.com> wrote:
> > Kaushal,
> >
> > I just replied to Aravinda's mail. Anyway pasting the snippet if someone
> > misses that.
> >
> > "In the scenario mentioned by aravinda below, when an unlink comes on a
> > entry, in changelog xlator, it's 'loc->pargfid'
> > was getting modified to "/". So consequence is that , when it hits
> > posix, the 'loc->pargfid' would be pointing
> > to "/" instead of actual parent. This is not so terrible yet, as we are
> > saved by posix. Posix checks
> > for "loc->path" first, only if it's not filled, it will use
> > "pargfid/bname" combination. So only for
> > clients like self-heal who does not populate 'loc->path' and the same
> > basename exists on root, the
> > unlink happens on root instead of actual path."
> >
> > Thanks and Regards,
> > Kotresh H R
> >
> > - Original Message -
> >> From: "Kaushal M" <kshlms...@gmail.com>
> >> To: "Aravinda" <avish...@redhat.com>
> >> Cc: "Gluster Devel" <gluster-devel@gluster.org>, maintain...@gluster.org,
> >> "Kotresh Hiremath Ravishankar"
> >> <khire...@redhat.com>
> >> Sent: Thursday, March 31, 2016 6:56:18 PM
> >> Subject: Re: [Gluster-Maintainers] Update on 3.7.10 - on schedule to be
> >> tagged at 2200PDT 30th March.
> >>
> >> Kotresh, Could you please provide the details?
> >>
> >> On Thu, Mar 31, 2016 at 6:43 PM, Aravinda <avish...@redhat.com> wrote:
> >> > Hi Kaushal,
> >> >
> >> > We have a Changelog bug which can lead to data loss if Glusterfind is
> >> > enabled(To be specific,  when changelog.capture-del-path and
> >> > changelog.changelog options enabled on a replica volume).
> >> >
> >> > http://review.gluster.org/#/c/13861/
> >> >
> >> > This is very corner case. but good to go with the release. We tried to
> >> > merge
> >> > this before the merge window for 3.7.10, but regressions not yet
> >> > complete
> >> > :(
> >> >
> >> > Do you think we should wait for this patch?
> >> >
> >> > @Kotresh can provide more details about this issue.
> >> >
> >> > regards
> >> > Aravinda
> >> >
> >> >
> >> > On 03/31/2016 01:29 PM, Kaushal M wrote:
> >> >>
> >> >> The last change for 3.7.10 has been merged now. Commit 2cd5b75 will be
> >> >> used for the release. I'll be preparing release-notes, and tagging the
> >> >> release soon.
> >> >>
> >> >> After running verification tests and checking for any perf
> >> >> improvements, I'll make be making the release tarball.
> >> >>
> >> >> Regards,
> >> >> Kaushal
> >> >>
> >> >> On Wed, Mar 30, 2016 at 7:00 PM, Kaushal M <kshlms...@gmail.com> wrote:
> >> >>>
> >> >>> Hi all,
> >> >>>
> >> >>> I'll be taking over the release duties for 3.7.10. Vijay is busy and
> >> >>> could not get the time to do a scheduled release.
> >> >>>
> >> >>> The .10 release has been scheduled for tagging on 30th (ie. today).
> >> >>> In the interests of providing some heads up to developers wishing to
> >> >>> get changes merged,
> >> >>> I'll be waiting till 10PM PDT, 30th March. (0500UTC/1030IST 31st
> >> >>> March), to tag the release.
> >> >>>
> >> >>> So you have ~15 hours to get any changes required merged.
> >> >>>
> >> >>> Thanks,
> >> >>> Kaushal
> >> >>
> >> >> ___
> >> >> maintainers mailing list
> >> >> maintain...@gluster.org
> >> >> http://www.gluster.org/mailman/listinfo/maintainers
> >> >
> >> >
> >>
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-Maintainers] Update on 3.7.10 - on schedule to be tagged at 2200PDT 30th March.

2016-03-31 Thread Kotresh Hiremath Ravishankar
Kaushal,

I just replied to Aravinda's mail. Anyway, pasting the snippet here in case
someone misses it.

"In the scenario mentioned by aravinda below, when an unlink comes on a 
entry, in changelog xlator, it's 'loc->pargfid'
was getting modified to "/". So consequence is that , when it hits posix, 
the 'loc->pargfid' would be pointing
to "/" instead of actual parent. This is not so terrible yet, as we are 
saved by posix. Posix checks
for "loc->path" first, only if it's not filled, it will use "pargfid/bname" 
combination. So only for
clients like self-heal who does not populate 'loc->path' and the same 
basename exists on root, the
unlink happens on root instead of actual path."
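
To make the failure mode easier to follow, here is a small stand-alone sketch
(a simplified illustration, not the actual posix xlator code or gluster's
loc_t) of why clients that fill loc->path are unaffected, while gfid-only
clients like self-heal end up resolving against root:

/* Simplified illustration only, not gluster code. "/" here stands for the
 * root gfid that the buggy changelog path wrongly puts into pargfid. */
#include <stdio.h>

struct demo_loc {
    const char *path;     /* full path; NULL for gfid-based clients */
    const char *pargfid;  /* parent gfid; wrongly rewritten to root  */
    const char *name;     /* basename */
};

static void resolve(const char *who, const struct demo_loc *loc)
{
    if (loc->path)        /* the real path is preferred when present */
        printf("%-10s unlinks %s\n", who, loc->path);
    else                  /* otherwise fall back to pargfid/bname    */
        printf("%-10s unlinks <gfid:%s>/%s\n", who, loc->pargfid, loc->name);
}

int main(void)
{
    struct demo_loc fuse_client = { "/dir/file.log", "/", "file.log" };
    struct demo_loc self_heal   = { NULL,            "/", "file.log" };

    resolve("fuse", &fuse_client);     /* still removes the right file */
    resolve("self-heal", &self_heal);  /* resolves under root instead  */
    return 0;
}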

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Kaushal M" <kshlms...@gmail.com>
> To: "Aravinda" <avish...@redhat.com>
> Cc: "Gluster Devel" <gluster-devel@gluster.org>, maintain...@gluster.org, 
> "Kotresh Hiremath Ravishankar"
> <khire...@redhat.com>
> Sent: Thursday, March 31, 2016 6:56:18 PM
> Subject: Re: [Gluster-Maintainers] Update on 3.7.10 - on schedule to be 
> tagged at 2200PDT 30th March.
> 
> Kotresh, Could you please provide the details?
> 
> On Thu, Mar 31, 2016 at 6:43 PM, Aravinda <avish...@redhat.com> wrote:
> > Hi Kaushal,
> >
> > We have a Changelog bug which can lead to data loss if Glusterfind is
> > enabled(To be specific,  when changelog.capture-del-path and
> > changelog.changelog options enabled on a replica volume).
> >
> > http://review.gluster.org/#/c/13861/
> >
> > This is very corner case. but good to go with the release. We tried to
> > merge
> > this before the merge window for 3.7.10, but regressions not yet complete
> > :(
> >
> > Do you think we should wait for this patch?
> >
> > @Kotresh can provide more details about this issue.
> >
> > regards
> > Aravinda
> >
> >
> > On 03/31/2016 01:29 PM, Kaushal M wrote:
> >>
> >> The last change for 3.7.10 has been merged now. Commit 2cd5b75 will be
> >> used for the release. I'll be preparing release-notes, and tagging the
> >> release soon.
> >>
> >> After running verification tests and checking for any perf
> >> improvements, I'll make be making the release tarball.
> >>
> >> Regards,
> >> Kaushal
> >>
> >> On Wed, Mar 30, 2016 at 7:00 PM, Kaushal M <kshlms...@gmail.com> wrote:
> >>>
> >>> Hi all,
> >>>
> >>> I'll be taking over the release duties for 3.7.10. Vijay is busy and
> >>> could not get the time to do a scheduled release.
> >>>
> >>> The .10 release has been scheduled for tagging on 30th (ie. today).
> >>> In the interests of providing some heads up to developers wishing to
> >>> get changes merged,
> >>> I'll be waiting till 10PM PDT, 30th March. (0500UTC/1030IST 31st
> >>> March), to tag the release.
> >>>
> >>> So you have ~15 hours to get any changes required merged.
> >>>
> >>> Thanks,
> >>> Kaushal
> >>
> >> ___
> >> maintainers mailing list
> >> maintain...@gluster.org
> >> http://www.gluster.org/mailman/listinfo/maintainers
> >
> >
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-Maintainers] Update on 3.7.10 - on schedule to be tagged at 2200PDT 30th March.

2016-03-31 Thread Kotresh Hiremath Ravishankar
Inline...

- Original Message -
> From: "Aravinda" <avish...@redhat.com>
> To: "Kaushal M" <kshlms...@gmail.com>, "Gluster Devel" 
> <gluster-devel@gluster.org>, maintain...@gluster.org, "Kotresh
> Hiremath Ravishankar" <khire...@redhat.com>
> Sent: Thursday, March 31, 2016 6:43:20 PM
> Subject: Re: [Gluster-Maintainers] Update on 3.7.10 - on schedule to be 
> tagged at 2200PDT 30th March.
> 
> Hi Kaushal,
> 
> We have a Changelog bug which can lead to data loss if Glusterfind is
> enabled(To be specific,  when changelog.capture-del-path and
> changelog.changelog options enabled on a replica volume).
> 
> http://review.gluster.org/#/c/13861/
> 
> This is very corner case. but good to go with the release. We tried to
> merge this before the merge window for 3.7.10, but regressions not yet
> complete :(
> 
> Do you think we should wait for this patch?
> 
> @Kotresh can provide more details about this issue.

In the above scenario, when an unlink comes on an entry, its 'loc->pargfid'
was getting modified to "/" in the changelog xlator. So the consequence is
that, when it hits posix, 'loc->pargfid' points to "/" instead of the actual
parent. This is not so terrible yet, as we are saved by posix: posix checks
"loc->path" first, and only if it is not filled does it use the
"pargfid/bname" combination. So only for clients like self-heal, which do
not populate 'loc->path', and when the same basename exists on root, does
the unlink happen on root instead of on the actual path.
> 
> regards
> Aravinda
> 
> On 03/31/2016 01:29 PM, Kaushal M wrote:
> > The last change for 3.7.10 has been merged now. Commit 2cd5b75 will be
> > used for the release. I'll be preparing release-notes, and tagging the
> > release soon.
> >
> > After running verification tests and checking for any perf
> > improvements, I'll be making the release tarball.
> >
> > Regards,
> > Kaushal
> >
> > On Wed, Mar 30, 2016 at 7:00 PM, Kaushal M <kshlms...@gmail.com> wrote:
> >> Hi all,
> >>
> >> I'll be taking over the release duties for 3.7.10. Vijay is busy and
> >> could not get the time to do a scheduled release.
> >>
> >> The .10 release has been scheduled for tagging on 30th (ie. today).
> >> In the interests of providing some heads up to developers wishing to
> >> get changes merged,
> >> I'll be waiting till 10PM PDT, 30th March. (0500UTC/1030IST 31st
> >> March), to tag the release.
> >>
> >> So you have ~15 hours to get any changes required merged.
> >>
> >> Thanks,
> >> Kaushal
> > ___
> > maintainers mailing list
> > maintain...@gluster.org
> > http://www.gluster.org/mailman/listinfo/maintainers
> 
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] NetBSD Regression failure on 3.7: ./tests/features/trash.t

2016-03-14 Thread Kotresh Hiremath Ravishankar
Hi,

trash.t is failing on the 3.7 branch for the below patch.

https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/15153/console

Could someone look into it?

Thanks and Regards,
Kotresh H R

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [ANNOUNCE] Maintainer Update

2016-03-07 Thread Kotresh Hiremath Ravishankar
Congrats and all the best Aravinda!

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Venky Shankar" 
> To: "Gluster Devel" 
> Cc: maintain...@gluster.org
> Sent: Tuesday, March 8, 2016 10:49:46 AM
> Subject: [Gluster-devel] [ANNOUNCE] Maintainer Update
> 
> Hey folks,
> 
> As of yesterday, Aravinda has taken over the maintainership of
> Geo-replication. Over
> the past year or so, he has been actively involved in it's development -
> introducing
> new features, fixing bugs, reviewing patches and helping out the community.
> Needless
> to say, he's the go-to guy for anything related to Geo-replication.
> 
> Although this shift should have been in effect far earlier, it's better late
> than
> never (and before Aravinda could change his mind, Jeff took the liberty of
> merging
> the maintainer update patch ;)).
> 
> Congrats and all the best with the new role, Aravinda.
> 
> Thanks!
> 
> Venky
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] CentOS Regression generated core by .tests/basic/tier/tier-file-create.t

2016-03-07 Thread Kotresh Hiremath Ravishankar
Hi All,

The regression run generated a core for the below patch.

https://build.gluster.org/job/rackspace-regression-2GB-triggered/18859/console

From the initial analysis, it is a tiered setup where the ec sub-volume is the
cold tier and afr is the hot tier. The crash happened during lookup: the lookup
was wound to the cold tier, and since the file is not present there, dht issued
a discover onto the hot tier. While serializing the dictionary for that call,
it found that the 'data' for the key 'trusted.ec.size' had already been freed.
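
As a minimal illustration of that failure mode (generic C with a toy
refcounted value standing in for data_t; this is not the actual dict/EC code),
a value shared with a second container must get its own reference, otherwise
the second holder ends up serializing memory the first holder already freed,
which is exactly the memcpy crash in the backtrace below:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy refcounted value (illustration only). */
struct value {
        int    ref;
        char  *bytes;
        size_t len;
};

static struct value *
value_new (const char *s)
{
        struct value *v = calloc (1, sizeof (*v));
        v->ref   = 1;
        v->bytes = strdup (s);
        v->len   = strlen (s);
        return v;
}

static struct value *
value_ref (struct value *v)
{
        v->ref++;
        return v;
}

static void
value_unref (struct value *v)
{
        if (--v->ref == 0) {
                free (v->bytes);
                free (v);
        }
}

int
main (void)
{
        struct value *ec_size = value_new ("16384");

        /* The second holder takes its own reference.  Skipping this ref is
         * the class of bug seen below: the value gets freed while another
         * dict still points at it. */
        struct value *shared = value_ref (ec_size);

        value_unref (ec_size);                    /* first holder goes away */
        printf ("%.*s\n", (int) shared->len, shared->bytes);  /* still valid */
        value_unref (shared);
        return 0;
}
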

(gdb) bt
#0  0x7fe059df9772 in memcpy () from ./lib64/libc.so.6
#1  0x7fe05b209902 in dict_serialize_lk (this=0x7fe04809f7dc, 
buf=0x7fe0480a2b7c "") at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/dict.c:2533
#2  0x7fe05b20a182 in dict_allocate_and_serialize (this=0x7fe04809f7dc, 
buf=0x7fe04ef6bb08, length=0x7fe04ef6bb00) at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/dict.c:2780
#3  0x7fe04e3492de in client3_3_lookup (frame=0x7fe0480a22dc, 
this=0x7fe048008c00, data=0x7fe04ef6bbe0) at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/protocol/client/src/client-rpc-fops.c:3368
#4  0x7fe04e32c8c8 in client_lookup (frame=0x7fe0480a22dc, 
this=0x7fe048008c00, loc=0x7fe0480a4354, xdata=0x7fe04809f7dc) at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/protocol/client/src/client.c:417
#5  0x7fe04dbdaf5f in afr_lookup_do (frame=0x7fe04809f6dc, 
this=0x7fe048029e00, err=0) at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/afr/src/afr-common.c:2422
#6  0x7fe04dbdb4bb in afr_lookup (frame=0x7fe04809f6dc, 
this=0x7fe048029e00, loc=0x7fe03c0082f4, xattr_req=0x7fe03c00810c) at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/afr/src/afr-common.c:2532
#7  0x7fe04de3c2b8 in dht_lookup (frame=0x7fe0480a0a3c, 
this=0x7fe04802c580, loc=0x7fe03c0082f4, xattr_req=0x7fe03c00810c) at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/dht/src/dht-common.c:2429
#8  0x7fe04d91f07e in dht_lookup_everywhere (frame=0x7fe03c0081ec, 
this=0x7fe04802d450, loc=0x7fe03c0082f4) at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/dht/src/dht-common.c:1803
#9  0x7fe04d920953 in dht_lookup_cbk (frame=0x7fe03c0081ec, 
cookie=0x7fe03c00902c, this=0x7fe04802d450, op_ret=-1, op_errno=2, inode=0x0, 
stbuf=0x0, xattr=0x0, postparent=0x0)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/dht/src/dht-common.c:2056
#10 0x7fe04de35b94 in dht_lookup_everywhere_done (frame=0x7fe03c00902c, 
this=0x7fe0480288a0) at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/dht/src/dht-common.c:1338
#11 0x7fe04de38281 in dht_lookup_everywhere_cbk (frame=0x7fe03c00902c, 
cookie=0x7fe04809ed2c, this=0x7fe0480288a0, op_ret=-1, op_errno=2, inode=0x0, 
buf=0x0, xattr=0x0, postparent=0x0)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/dht/src/dht-common.c:1768
#12 0x7fe05b27 in default_lookup_cbk (frame=0x7fe04809ed2c, 
cookie=0x7fe048099ddc, this=0x7fe048027590, op_ret=-1, op_errno=2, inode=0x0, 
buf=0x0, xdata=0x0, postparent=0x0) at defaults.c:1188
#13 0x7fe04e0a4861 in ec_manager_lookup (fop=0x7fe048099ddc, state=-5) at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/ec/src/ec-generic.c:864
#14 0x7fe04e0a0b3a in __ec_manager (fop=0x7fe048099ddc, error=2) at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/ec/src/ec-common.c:2098
#15 0x7fe04e09c912 in ec_resume (fop=0x7fe048099ddc, error=0) at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/ec/src/ec-common.c:289
#16 0x7fe04e09caf8 in ec_complete (fop=0x7fe048099ddc) at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/ec/src/ec-common.c:362
#17 0x7fe04e0a41a8 in ec_lookup_cbk (frame=0x7fe04800107c, cookie=0x5, 
this=0x7fe048027590, op_ret=-1, op_errno=2, inode=0x7fe03c00152c, 
buf=0x7fe04ef6c860, xdata=0x0, postparent=0x7fe04ef6c7f0)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/ec/src/ec-generic.c:758
#18 0x7fe04e348239 in client3_3_lookup_cbk (req=0x7fe04809dd4c, 
iov=0x7fe04809dd8c, count=1, myframe=0x7fe04809964c)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/protocol/client/src/client-rpc-fops.c:3028
#19 0x7fe05afd83e6 in rpc_clnt_handle_reply (clnt=0x7fe048066350, 
pollin=0x7fe0480018f0) at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:759
#20 0x7fe05afd8884 in rpc_clnt_notify (trans=0x7fe0480667f0, 
mydata=0x7fe048066380, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7fe0480018f0)
at 

[Gluster-devel] Using geo-replication as backup solution using gluster volume snapshot!

2016-03-07 Thread Kotresh Hiremath Ravishankar
Hi All,

Here is the idea: we can use geo-replication as a backup solution by taking
gluster volume snapshots on the slave side. One of the drawbacks of
geo-replication is that it is continuous asynchronous replication, so it cannot
give you last week's or yesterday's data. If we take gluster snapshots at the
slave end, those snapshots can provide last week's or yesterday's data, making
the combination a candidate for a backup solution. The limitation is that the
snapshots at the slave end cannot be restored, as that would break the running
geo-replication. They can, however, be mounted, giving access to the data as it
was when each snapshot was taken. It's just a naive idea; any suggestions and
use cases are worth discussing :)


Thanks and Regards,
Kotresh H R

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Cores generated with ./tests/geo-rep/georep-basic-dr-tarssh.t

2016-03-03 Thread Kotresh Hiremath Ravishankar
Hi,

Yes, with this patch we need not set conn->trans to NULL in rpc_clnt_disable

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Soumya Koduri" <skod...@redhat.com>
> To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>, "Raghavendra G" 
> <raghaven...@gluster.com>
> Cc: "Gluster Devel" <gluster-devel@gluster.org>
> Sent: Thursday, March 3, 2016 5:06:00 PM
> Subject: Re: [Gluster-devel] Cores generated with 
> ./tests/geo-rep/georep-basic-dr-tarssh.t
> 
> 
> 
> On 03/03/2016 04:58 PM, Kotresh Hiremath Ravishankar wrote:
> > [Replying on top of my own reply]
> >
> > Hi,
> >
> > I have submitted the below patch [1] to avoid the issue of
> > 'rpc_clnt_submit'
> > getting reconnected. But it won't take care of memory leak problem you were
> > trying to fix. That we have to carefully go through all cases and fix it.
> > Please have a look at it.
> >
> Looks good. IIUC, with this patch, we need not set conn->trans to NULL
> in 'rpc_clnt_disable()'. Right? If yes, then it takes care of memleak as
> the transport object shall then get freed as part of
> 'rpc_clnt_trigger_destroy'.
> 
> 
> > http://review.gluster.org/#/c/13592/
> >
> > Thanks and Regards,
> > Kotresh H R
> >
> > - Original Message -
> >> From: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> >> To: "Soumya Koduri" <skod...@redhat.com>
> >> Cc: "Raghavendra G" <raghaven...@gluster.com>, "Gluster Devel"
> >> <gluster-devel@gluster.org>
> >> Sent: Thursday, March 3, 2016 3:39:11 PM
> >> Subject: Re: [Gluster-devel] Cores generated with
> >> ./tests/geo-rep/georep-basic-dr-tarssh.t
> >>
> >> Hi Soumya,
> >>
> >> I tested the lastes patch [2] on master where your previous patch [1] in
> >> merged.
> >> I see crashes at different places.
> >>
> >> 1. If there are code paths that hold the rpc object without taking a ref
> >> on it, all those code paths will crash on invoking rpc submit on that
> >> object, as the rpc object would have been freed by the last unref on the
> >> DISCONNECT event. I see this kind of use-case in the changelog rpc code.
> >> Need to check the other users of rpc.
> Agree. We should fix all such code-paths. Since this seem to be an
> intricate fix, shall we take these patches only in master branch and not
> in 3.7 release for now till we fix all such paths as we encounter?
> 
> >>
> >> 2. We also need to take care of the reconnect timers that are set and
> >> that retry the connection on expiration. In those cases too, we might
> >> crash, as the rpc object would have been freed.
> Your patch addresses this..right?
> 
> Thanks,
> Soumya
> 
> >>
> >>
> >> [1] http://review.gluster.org/#/c/13507/
> >> [2] http://review.gluster.org/#/c/13587/
> >>
> >> Thanks and Regards,
> >> Kotresh H R
> >>
> >> - Original Message -
> >>> From: "Soumya Koduri" <skod...@redhat.com>
> >>> To: "Raghavendra G" <raghaven...@gluster.com>, "Kotresh Hiremath
> >>> Ravishankar" <khire...@redhat.com>
> >>> Cc: "Gluster Devel" <gluster-devel@gluster.org>
> >>> Sent: Thursday, March 3, 2016 12:24:00 PM
> >>> Subject: Re: [Gluster-devel] Cores generated with
> >>> ./tests/geo-rep/georep-basic-dr-tarssh.t
> >>>
> >>> Thanks a lot Kotresh.
> >>>
> >>> On 03/03/2016 08:47 AM, Raghavendra G wrote:
> >>>> Hi Soumya,
> >>>>
> >>>> Can you send a fix to this regression on upstream master too? This patch
> >>>> is merged there.
> >>>>
> >>> I have submitted below patch.
> >>>   http://review.gluster.org/#/c/13587/
> >>>
> >>> Kindly review the same.
> >>>
> >>> Thanks,
> >>> Soumya
> >>>
> >>>> regards,
> >>>> Raghavendra
> >>>>
> >>>> On Tue, Mar 1, 2016 at 10:34 PM, Kotresh Hiremath Ravishankar
> >>>> <khire...@redhat.com <mailto:khire...@redhat.com>> wrote:
> >>>>
> >>>>  Hi Soumya,
> >>>>
> >>>>  I analysed the issue an

Re: [Gluster-devel] Cores generated with ./tests/geo-rep/georep-basic-dr-tarssh.t

2016-03-03 Thread Kotresh Hiremath Ravishankar
[Replying on top of my own reply]

Hi,

I have submitted the below patch [1] to avoid the issue of 'rpc_clnt_submit'
getting reconnected. But it won't take care of the memory leak problem you were
trying to fix; for that, we have to carefully go through all the cases and fix
them. Please have a look at it.

http://review.gluster.org/#/c/13592/

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> To: "Soumya Koduri" <skod...@redhat.com>
> Cc: "Raghavendra G" <raghaven...@gluster.com>, "Gluster Devel" 
> <gluster-devel@gluster.org>
> Sent: Thursday, March 3, 2016 3:39:11 PM
> Subject: Re: [Gluster-devel] Cores generated with 
> ./tests/geo-rep/georep-basic-dr-tarssh.t
> 
> Hi Soumya,
> 
> I tested the latest patch [2] on master where your previous patch [1] is
> merged.
> I see crashes at different places.
> 
> 1. If there are code paths that hold the rpc object without taking a ref on
>    it, all those code paths will crash on invoking rpc submit on that object,
>    as the rpc object would have been freed by the last unref on the
>    DISCONNECT event. I see this kind of use-case in the changelog rpc code.
>    Need to check the other users of rpc.
>
> 2. We also need to take care of the reconnect timers that are set and that
>    retry the connection on expiration. In those cases too, we might crash,
>    as the rpc object would have been freed.
>
> 
> [1] http://review.gluster.org/#/c/13507/
> [2] http://review.gluster.org/#/c/13587/
> 
> Thanks and Regards,
> Kotresh H R
> 
> - Original Message -
> > From: "Soumya Koduri" <skod...@redhat.com>
> > To: "Raghavendra G" <raghaven...@gluster.com>, "Kotresh Hiremath
> > Ravishankar" <khire...@redhat.com>
> > Cc: "Gluster Devel" <gluster-devel@gluster.org>
> > Sent: Thursday, March 3, 2016 12:24:00 PM
> > Subject: Re: [Gluster-devel] Cores generated with
> > ./tests/geo-rep/georep-basic-dr-tarssh.t
> > 
> > Thanks a lot Kotresh.
> > 
> > On 03/03/2016 08:47 AM, Raghavendra G wrote:
> > > Hi Soumya,
> > >
> > > Can you send a fix to this regression on upstream master too? This patch
> > > is merged there.
> > >
> > I have submitted below patch.
> > http://review.gluster.org/#/c/13587/
> > 
> > Kindly review the same.
> > 
> > Thanks,
> > Soumya
> > 
> > > regards,
> > > Raghavendra
> > >
> > > On Tue, Mar 1, 2016 at 10:34 PM, Kotresh Hiremath Ravishankar
> > > <khire...@redhat.com <mailto:khire...@redhat.com>> wrote:
> > >
> > > Hi Soumya,
> > >
> > > I analysed the issue and found out that crash has happened because
> > > of the patch [1].
> > >
> > > The patch doesn't set transport object to NULL in 'rpc_clnt_disable'
> > > but instead does it on
> > > 'rpc_clnt_trigger_destroy'. So if there are pending rpc invocations
> > > on the rpc object that
> > > is disabled (those instances are possible as happening now in
> > > changelog), it will trigger a
> > > CONNECT notify again with 'mydata' that is freed causing a crash.
> > > This happens because
> > > 'rpc_clnt_submit' reconnects if rpc is not connected.
> > >
> > >   rpc_clnt_submit (...) {
> > > ...
> > >  if (conn->connected == 0) {
> > >  ret = rpc_transport_connect (conn->trans,
> > >
> > >   conn->config.remote_port);
> > >  }
> > > ...
> > >   }
> > >
> > > Without your patch, conn->trans was set NULL and hence CONNECT fails
> > > not resulting with
> > > CONNECT notify call. And also the cleanup happens in failure path.
> > >
> > > So the memory leak can happen, if there is no try for rpc invocation
> > > after DISCONNECT.
> > > It will be cleaned up otherwise.
> > >
> > >
> > > [1] http://review.gluster.org/#/c/13507/
> > >
> > > Thanks and Regards,
> > > Kotresh H R
> > >
> > > - Original Message -
> > >  > From: "Kotresh Hiremath Ravishankar" <khire...@redhat.com
> > > <mailto:khire...@redhat.com>>
> > >  > To: "Soumya Koduri" <skod...@redhat.com
&g

Re: [Gluster-devel] Cores generated with ./tests/geo-rep/georep-basic-dr-tarssh.t

2016-03-01 Thread Kotresh Hiremath Ravishankar
Hi Soumya,

I analysed the issue and found that the crash happens because of patch [1].

The patch doesn't set the transport object to NULL in 'rpc_clnt_disable' but
instead does it in 'rpc_clnt_trigger_destroy'. So if there are pending rpc
invocations on the disabled rpc object (such instances are possible, and are
happening now in changelog), it will trigger a CONNECT notify again with a
'mydata' that has already been freed, causing the crash. This happens because
'rpc_clnt_submit' reconnects if the rpc is not connected:

  rpc_clnt_submit (...) {
      ...
      if (conn->connected == 0) {
              ret = rpc_transport_connect (conn->trans,
                                           conn->config.remote_port);
      }
      ...
  }

Without your patch, conn->trans was set to NULL, so the connect attempt fails
and no CONNECT notify is delivered; the cleanup then happens in the failure
path.

So the memory leak can happen only if there is no rpc invocation attempted
after DISCONNECT; otherwise it gets cleaned up.
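
One way to avoid the reconnect problem described above is sketched below
(rough C with made-up struct and function names; this is not actual gluster
code or the eventual patch): once a client has been disabled, submit should
fail fast instead of silently reconnecting, because the reconnect is what
ends up delivering a CONNECT notify against a 'mydata' that may already be
freed.

#include <stdio.h>

struct conn_sketch {
        int connected;
        int disabled;        /* set by the disable/teardown path */
};

static int
submit_sketch (struct conn_sketch *conn)
{
        if (conn->disabled)
                return -1;   /* fail the call, do not resurrect the link */

        if (conn->connected == 0) {
                /* only a live, non-disabled connection may reconnect here,
                 * e.g. via rpc_transport_connect() in the real code */
        }

        /* ... queue and send the request ... */
        return 0;
}

int
main (void)
{
        struct conn_sketch conn = { .connected = 0, .disabled = 1 };

        if (submit_sketch (&conn) < 0)
                printf ("submit refused on a disabled connection\n");
        return 0;
}
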


[1] http://review.gluster.org/#/c/13507/

Thanks and Regards,
Kotresh H R

- Original Message -----
> From: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> To: "Soumya Koduri" <skod...@redhat.com>
> Cc: avish...@redhat.com, "Gluster Devel" <gluster-devel@gluster.org>
> Sent: Monday, February 29, 2016 4:15:22 PM
> Subject: Re: Cores generated with ./tests/geo-rep/georep-basic-dr-tarssh.t
> 
> Hi Soumya,
> 
> I just tested that it is reproducible only with your patch both in master and
> 3.76 branch.
> The geo-rep test cases are marked bad in master. So it's not hit in master.
> rpc is introduced
> in changelog xlator to communicate to applications via libgfchangelog.
> Venky/Me will check
> why is the crash happening and will update.
> 
> 
> Thanks and Regards,
> Kotresh H R
> 
> - Original Message -
> > From: "Soumya Koduri" <skod...@redhat.com>
> > To: avish...@redhat.com, "kotresh" <khire...@redhat.com>
> > Cc: "Gluster Devel" <gluster-devel@gluster.org>
> > Sent: Monday, February 29, 2016 2:10:51 PM
> > Subject: Cores generated with ./tests/geo-rep/georep-basic-dr-tarssh.t
> > 
> > Hi Aravinda/Kotresh,
> > 
> > With [1], I consistently see cores generated with the test
> > './tests/geo-rep/georep-basic-dr-tarssh.t' in release-3.7 branch. From
> > the cores, looks like we are trying to dereference a freed
> > changelog_rpc_clnt_t(crpc) object in changelog_rpc_notify(). Strangely
> > this was not reported in master branch.
> > 
> > I tried debugging but couldn't find any possible suspects. I request you
> > to take a look and let me know if [1] caused any regression.
> > 
> > Thanks,
> > Soumya
> > 
> > [1] http://review.gluster.org/#/c/13507/
> > 
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Cores generated with ./tests/geo-rep/georep-basic-dr-tarssh.t

2016-02-29 Thread Kotresh Hiremath Ravishankar
Hi Soumya,

I just tested this: it is reproducible only with your patch, both on master and
on the 3.7.6 branch. The geo-rep test cases are marked bad in master, so it is
not hit there. The rpc was introduced in the changelog xlator to communicate
with applications via libgfchangelog. Venky or I will check why the crash is
happening and will update.


Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Soumya Koduri" 
> To: avish...@redhat.com, "kotresh" 
> Cc: "Gluster Devel" 
> Sent: Monday, February 29, 2016 2:10:51 PM
> Subject: Cores generated with ./tests/geo-rep/georep-basic-dr-tarssh.t
> 
> Hi Aravinda/Kotresh,
> 
> With [1], I consistently see cores generated with the test
> './tests/geo-rep/georep-basic-dr-tarssh.t' in release-3.7 branch. From
> the cores, looks like we are trying to dereference a freed
> changelog_rpc_clnt_t(crpc) object in changelog_rpc_notify(). Strangely
> this was not reported in master branch.
> 
> I tried debugging but couldn't find any possible suspects. I request you
> to take a look and let me know if [1] caused any regression.
> 
> Thanks,
> Soumya
> 
> [1] http://review.gluster.org/#/c/13507/
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Bitrot stub forget()

2016-02-17 Thread Kotresh Hiremath Ravishankar
I will take care of putting up the patch upstream.

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Venky Shankar" <vshan...@redhat.com>
> To: "FNU Raghavendra Manjunath" <rab...@redhat.com>
> Cc: "Gluster Devel" <gluster-devel@gluster.org>, "Kotresh Hiremath 
> Ravishankar" <khire...@redhat.com>
> Sent: Thursday, February 18, 2016 9:44:27 AM
> Subject: Re: Bitrot stub forget()
> 
> On Tue, Feb 16, 2016 at 11:38:24PM -0500, FNU Raghavendra Manjunath wrote:
> > Venky,
> > 
> > Yes. You are right. We should not remove the quarantine entry in forget.
> > 
> > We have to remove it upon getting -ve lookups in bit-rot-stub and upon
> > getting an unlink.
> > 
> > I have attached a patch for it.
> > 
> > Unfortunately rfc.sh is failing for me with the below error.
> 
> Thanks for the patch. One of us will take care of putting it up for review.
> 
> > 
> > 
> > ssh: connect to host git.gluster.com port 22: Connection timed out
> > fatal: Could not read from remote repository.
> > 
> > Please make sure you have the correct access rights
> > and the repository exists."
> > 
> > 
> > Regards,
> > Raghavendra
> > 
> > 
> > On Tue, Feb 16, 2016 at 10:53 AM, Venky Shankar <vshan...@redhat.com>
> > wrote:
> > 
> > > Hey Raghu,
> > >
> > > Bitrot stub inode forget implementation (br_stub_forget()) deletes the
> > > bad
> > > object
> > > marker (under quarantine directory) if present. This looks incorrect as
> > > ->forget()
> > > can be trigerred when inode table LRU size exceeeds configured limit -
> > > check bug
> > > #1308961 which tracks this issue. I recall that protocol/server calls
> > > inode_forget()
> > > on negative lookup (that might not invoke ->forget() though) and that's
> > > the reason
> > > why br_stub_forget() has this code.
> > >
> > > So, would it make sense to purge bad object marker just in lookup()?
> > > There
> > > might be
> > > a need to do the same in unlink() in case the object was removed by the
> > > client.
> > >
> > > Thoughts?
> > >
> > > Thanks,
> > >
> > > Venky
> > >
> 
> > From a0cc49172df24e263e0db25c53b57f58c19d2cab Mon Sep 17 00:00:00 2001
> > From: Raghavendra Bhat <raghaven...@redhat.com>
> > Date: Tue, 16 Feb 2016 20:22:36 -0500
> > Subject: [PATCH] features/bitrot: do not remove the quarantine handle in
> >  forget
> > 
> > If an object is marked as bad, then an entry corresponding to the
> > bad object is created in the .glusterfs/quarantine directory to help
> > scrub status. The entry name is the gfid of the corrupted object.
> > The quarantine handle is removed in the below 2 cases.
> > 
> > 1) When protocol/server receives the -ve lookup on an entry whose inode
> >    is there in the inode table (it can happen when the corrupted object
> >    is deleted directly from the backend for recovery purposes), it sends a
> >    forget on the inode and bit-rot-stub removes the quarantine handle
> >    upon getting the forget.
> >    (refer to the below commit
> >    f853ed9c61bf65cb39f859470a8ffe8973818868:
> >    http://review.gluster.org/12743)
> > 
> > 2) When bit-rot-stub itself realizes that lookup on a corrupted object
> >    has failed with ENOENT.
> > 
> > But with step 1, there is a problem when bit-rot-stub receives a forget
> > because the lru limit of the inode table has been exceeded. In such cases,
> > though the corrupted object is not deleted (either from the mount point or
> > from the backend), the handle in the quarantine directory is removed and
> > that object is no longer shown in the bad objects list in the scrub status
> > command.
> > 
> > So it is better to follow only the 2nd step (i.e. bit-rot-stub removing the
> > handle from the quarantine directory on -ve lookups). Also, the handle has
> > to be removed when a corrupted object is unlinked from the mount point
> > itself.
> > 
> > Change-Id: Ibc3bbaf4bc8a5f8986085e87b729ab912cbf8cf9
> > Signed-off-by: Raghavendra Bhat <raghaven...@redhat.com>
> > ---
> >  xlators/features/bit-rot/src/stub/bit-rot-stub.c | 103
> >  +--
> >  1 file changed, 95 insertions(+), 8 deletions(-)
> > 
> > diff --git a/xlators/features/bit-rot/src/stub/bit-rot-stub.c
> > b/xlators/features/bit-rot/src/stub/bit-ro

Re: [Gluster-devel] glusterfsd core on NetBSD (https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/14139/consoleFull)

2016-02-10 Thread Kotresh Hiremath Ravishankar
The crash reported in the above link is the same as bug 1221629.
But the stack trace mentioned below looks to be from a different regression
run? Can I get the link for that one?

It is strange that the bt shows 'rpcsvc_record_build_header' calling
'gf_history_changelog', which it does not do! Am I missing something?

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Soumya Koduri" 
> To: "Emmanuel Dreyfus" , "kotresh" 
> Cc: "Gluster Devel" 
> Sent: Wednesday, February 10, 2016 2:26:35 PM
> Subject: Re: glusterfsd core on NetBSD
> (https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/14139/consoleFull)
> 
> Thanks Manu.
> 
> Kotresh,
> 
> Is this issue related to bug1221629 as well?
> 
> Thanks,
> Soumya
> 
> On 02/10/2016 02:10 PM, Emmanuel Dreyfus wrote:
> > On Wed, Feb 10, 2016 at 12:17:23PM +0530, Soumya Koduri wrote:
> >> I see a core generated in this regression run though all the tests seem to
> >> have passed. I do not have a netbsd machine to analyze the core.
> >> Could you please take a look and let me know what the issue could have
> >> been?
> >
> > changelog bug. I am not sure how this could become NULL after it has been
> > checked at the beginning of gf_history_changelog().
> >
> > I note this uses readdir() which is not thread-safe. readdir_r() should
> > probably be used instead.
> >
> > Program terminated with signal SIGSEGV, Segmentation fault.
> > #0  0xb99912b4 in gf_history_changelog (changelog_dir=0xb7b160f0 "\003",
> >  start=3081873456, end=0, n_parallel=-1217773520,
> >  actual_end=0xb7b05310)
> >  at
> >  
> > /home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered/xlators/features/changelog/lib/src/gf-history-changelog.c:834
> > 834 gf_log (this->name, GF_LOG_ERROR,
> > (gdb) print this
> > $1 = (xlator_t *) 0x0
> > #0  0xb99912b4 in gf_history_changelog (changelog_dir=0xb7b160f0 "\003",
> >  start=3081873456, end=0, n_parallel=-1217773520,
> >  actual_end=0xb7b05310)
> >  at
> >  
> > /home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered/xlators/features/changelog/lib/src/gf-history-changelog.c:834
> > #1  0xbb6fec17 in rpcsvc_record_build_header (recordstart=0x0,
> >  rlen=3077193776, reply=..., payload=3081855216)
> >  at
> >  
> > /home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered/rpc/rpc-lib/src/rpcsvc.c:857
> > #2  0xbb6fec95 in rpcsvc_record_build_header (recordstart=0xb7b10030 "",
> >  rlen=3077193776, reply=..., payload=3081855216)
> >  at
> >  
> > /home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered/rpc/rpc-lib/src/rpcsvc.c:874
> > #3  0xbb6ffa81 in rpcsvc_submit_generic (req=0xb7b10030,
> > proghdr=0xb7b160f0,
> >  hdrcount=0, payload=0xb76a4030, payloadcount=1, iobref=0x0)
> >  at
> >  
> > /home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered/rpc/rpc-lib/src/rpcsvc.c:1316
> > #4  0xbb70506c in xdr_to_rpc_reply (msgbuf=0xb7b10030 "", len=0,
> >  reply=0xb76a4030, payload=0xb76a4030,
> >  verfbytes=0x1 )
> >  at
> >  
> > /home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered/rpc/rpc-lib/src/xdr-rpcclnt.c:40
> > #5  0xbb26cbb5 in socket_server_event_handler (fd=16, idx=3,
> > data=0xb7b10030,
> >  poll_in=1, poll_out=0, poll_err=0)
> >  at
> >  
> > /home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered/rpc/rpc-transport/socket/src/socket.c:2765
> > #6  0xbb7908da in syncop_rename (subvol=0xbb143030, oldloc=0xba45b4b0,
> >  newloc=0x3, xdata_in=0x75, xdata_out=0xbb7e8000)
> >  at
> >  
> > /home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered/libglusterfs/src/syncop.c:2225
> > #7  0xbb790c21 in syncop_ftruncate (subvol=0xbb143030, fd=0x8062cc0
> > ,
> >  offset=-4647738537632864458, xdata_in=0xbb7efe75
> >  <_rtld_bind_start+17>,
> >  xdata_out=0xbb7e8000)
> >  at
> >  
> > /home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered/libglusterfs/src/syncop.c:2265
> > #8  0xbb75f6d1 in inode_table_dump (itable=0xbb143030,
> >  prefix=0x2 )
> >  at
> >  
> > /home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered/libglusterfs/src/inode.c:2352
> > #9  0x08050e20 in main (argc=12, argv=0xbf7feaac)
> >  at
> >  
> > /home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered/glusterfsd/src/glusterfsd.c:2345
> >
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] changelog bug

2016-02-09 Thread Kotresh Hiremath Ravishankar
Hi,

This crash can't be the same as BZ 1221629. The crash in BZ 1221629
is with the rpc introduced in changelog in 3.7 along with bitrot.
Could you share the crash dump to analyse?

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Vijay Bellur" <vbel...@redhat.com>
> To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>, "Manikandan 
> Selvaganesh" <mselv...@redhat.com>
> Cc: gluster-devel@gluster.org, "cyril peponnet" 
> <cyril.pepon...@alcatel-lucent.com>
> Sent: Tuesday, February 9, 2016 10:26:57 PM
> Subject: Re: [Gluster-devel] changelog bug
> 
> On 02/08/2016 01:14 AM, Kotresh Hiremath Ravishankar wrote:
> > Hi,
> >
> > This bug is already tracked BZ 1221629
> > I will start working on this and will update once it is fixed.
> >
> 
> Cyril (in CC) also reported a similar crash with changelog in 3.6.5:
> 
> https://gist.github.com/CyrilPeponnet/b67b360f186f31d34d8f
> 
> The crash seems to be consistently reproducible in Cyril's setup. Can we
> address this soon?
> 
> Thanks,
> Vijay
> 
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] changelog bug

2016-02-09 Thread Kotresh Hiremath Ravishankar
Hi

I think the two crashes he is seeing, in changelog_rollover and
changelog_notifier, are related to each other but not to the one in BZ 1221629.

addr2line maps them to the below lines:
changelog_rollover + 0xa9    FD_ZERO();
changelog_notifier + 0x3ee   if (FD_ISSET (cn->rfd, )) {

I am guessing it could be related to FD_SETSIZE and the number of open file
descriptors?
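
If that is the case, the hazard is easy to show with a small self-contained
check (standard C, nothing gluster-specific; the fd number below is made up):
FD_SET/FD_ISSET are only defined for descriptors smaller than FD_SETSIZE, so
once a process has more descriptors open than that, select()-based code like
the lines above can read or write outside the fd_set bitmap:

#include <stdio.h>
#include <sys/select.h>

int
main (void)
{
        int    fd = 1500;     /* hypothetical descriptor number */
        fd_set rfds;

        FD_ZERO (&rfds);

        /* Using FD_SET/FD_ISSET with fd >= FD_SETSIZE is undefined
         * behaviour; it touches memory outside the fd_set, which tends to
         * show up as crashes in unrelated-looking places.  Guard before
         * use, or switch to poll(), which has no such limit. */
        if (fd >= FD_SETSIZE) {
                fprintf (stderr, "fd %d >= FD_SETSIZE (%d), cannot select()\n",
                         fd, (int) FD_SETSIZE);
                return 1;
        }

        FD_SET (fd, &rfds);
        return 0;
}
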


Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Joe Julian" <j...@julianfamily.org>
> To: gluster-devel@gluster.org
> Sent: Wednesday, February 10, 2016 10:18:07 AM
> Subject: Re: [Gluster-devel] changelog bug
> 
> btw... he was also having another crash in changelog_rollover:
> https://gist.githubusercontent.com/CyrilPeponnet/11954cbca725d4b8da7a/raw/2168169f7b208d8ee6193c4a444639505efb634b/gistfile1.txt
> 
> It would be a pretty huge coincidence if these were all unique causes,
> wouldn't it?
> 
> On 02/09/2016 08:27 PM, Kotresh Hiremath Ravishankar wrote:
> > Hi,
> >
> > This crash can't be same as BZ 1221629. The crash in the BZ 1221629
> > is with the rpc introduced in changelog in 3.7 along with bitrot.
> > Could you share the crash dump to analyse ?
> >
> > Thanks and Regards,
> > Kotresh H R
> >
> > - Original Message -
> >> From: "Vijay Bellur" <vbel...@redhat.com>
> >> To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>, "Manikandan
> >> Selvaganesh" <mselv...@redhat.com>
> >> Cc: gluster-devel@gluster.org, "cyril peponnet"
> >> <cyril.pepon...@alcatel-lucent.com>
> >> Sent: Tuesday, February 9, 2016 10:26:57 PM
> >> Subject: Re: [Gluster-devel] changelog bug
> >>
> >> On 02/08/2016 01:14 AM, Kotresh Hiremath Ravishankar wrote:
> >>> Hi,
> >>>
> >>> This bug is already tracked BZ 1221629
> >>> I will start working on this and will update once it is fixed.
> >>>
> >> Cyril (in CC) also reported a similar crash with changelog in 3.6.5:
> >>
> >> https://gist.github.com/CyrilPeponnet/b67b360f186f31d34d8f
> >>
> >> The crash seems to be consistently reproducible in Cyril's setup. Can we
> >> address this soon?
> >>
> >> Thanks,
> >> Vijay
> >>
> >>
> > ___
> > Gluster-devel mailing list
> > Gluster-devel@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel
> 
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] changelog bug

2016-02-07 Thread Kotresh Hiremath Ravishankar
Hi,

This bug is already tracked in BZ 1221629.
I will start working on this and will update once it is fixed.

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Manikandan Selvaganesh" 
> To: "Emmanuel Dreyfus" 
> Cc: gluster-devel@gluster.org
> Sent: Monday, February 8, 2016 11:23:33 AM
> Subject: Re: [Gluster-devel] changelog bug
> 
> Hi Emmanuel,
> 
> Thanks and as you have mentioned, I have no clue how my changes produced a
> core
> due to a NULL pointer in changelog. I will have a look on this and update you
> soon :)
> 
> --
> Thanks & Regards,
> Manikandan Selvaganesh.
> 
> - Original Message -
> From: "Emmanuel Dreyfus" 
> To: gluster-devel@gluster.org, mselv...@redhat.com
> Sent: Sunday, February 7, 2016 9:25:39 AM
> Subject: changelog bug
> 
> NetBSD regression uncovered an apparently unrelated bug for this change:
> http://review.gluster.org/#/c/13363/
> 
> Regression failed because it produced a core. We have a NULL pointer in
> changelog xlator. Perhaps a race condition?
> 
> Core was generated by `glusterfsd'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0xb9a912ec in gf_changelog_reborp_rpcsvc_notify (rpc=0xb7b160f0,
>     mydata=0xb7b1a830, event=RPCSVC_EVENT_ACCEPT, data=0xb78b6030)
>     at /home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered/xlators/features/changelog/lib/src/gf-changelog-reborp.c:114
> 114             priv = this->private;
> (gdb) print this
> $1 = (xlator_t *) 0x0
> (gdb) list
> 109                         event == RPCSVC_EVENT_DISCONNECT))
> 110                     return 0;
> 111
> 112             entry = mydata;
> 113             this = entry->this;
> 114             priv = this->private;
> 115
> 116             switch (event) {
> 117             case RPCSVC_EVENT_ACCEPT:
> 118                     ret = sys_unlink (RPC_SOCK(entry));
> (gdb) print entry
> $2 = (gf_changelog_t *) 0xb7b1a830
> (gdb) print *entry
> $3 = {statelock = {pts_magic = 0, pts_spin = 0 '\000', pts_flags = 0},
>   connstate = GF_CHANGELOG_CONN_STATE_PENDING, this = 0x0, list = {next = 0x0,
>     prev = 0x0}, brick = '\000' <repeats 580 times>..., grpc = {svc = 0x4fc00,
>     rpc = 0xb1a83000, sock = "..."},
>   notify = 523239646, fini = 0x94bb, callback = 0x4fc00,
>   connected = 0xb1a83000, disconnected = 0xadc0deb7, ptr = 0x1f3000de,
>   invokerxl = 0x94bb, ordered = (unknown: 326656), queueevent = 0xb1a83000,
>   pickevent = 0xadc0deb7, event = {lock = {ptm_magic = 523239646,
>       ptm_errorcheck = 187 'ª', ptm_pad1 = "î\000", ptm_interlock = 0 '\000',
>       ptm_pad2 = "\374\004", ptm_owner = 0xb1a83000, ptm_waiters = 0xadc0deb7,
>       ptm_recursed = 

Re: [Gluster-devel] Possible spurious test tests/bitrot/br-stub.t

2016-02-02 Thread Kotresh Hiremath Ravishankar
I will have a look at it!

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Venky Shankar" <vshan...@redhat.com>
> To: "Sakshi Bansal" <saban...@redhat.com>
> Cc: "Gluster Devel" <gluster-devel@gluster.org>, "Kotresh Hiremath 
> Ravishankar" <khire...@redhat.com>
> Sent: Wednesday, February 3, 2016 11:11:54 AM
> Subject: Re: Possible spurious test tests/bitrot/br-stub.t
> 
> On Tue, Feb 02, 2016 at 01:51:56AM -0500, Sakshi Bansal wrote:
> > Hi Venky,
> > 
> > Patch #13262 is failing for the above tests. The patch is just calling
> > STACK_DESTROY at appropriate place to avoid rebalance crashing. The test
> > is not rebalance related so the failure looks spurious to me.
> 
> Kotresh, mind having a look at this?
> 
> (If not, I'll take a look sometime today)
> 
> > 
> > 
> > --
> > Thanks and Regards
> > Sakshi Bansal
> 
> Thanks,
> 
> Venky
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Possible spurious test tests/bitrot/br-stub.t

2016-02-02 Thread Kotresh Hiremath Ravishankar
Hi

I looked into it; it is an NFS mount failure and not related to bitrot.
The logs say:
[2016-02-02 06:42:30.6N]:++ G_LOG:./tests/bitrot/br-stub.t: TEST: 31 31 
mount_nfs nbslave74.cloud.gluster.org:/patchy /mnt/nfs/0 nolock ++
[2016-02-02 06:42:30.264734] W [MSGID: 114031] 
[client-rpc-fops.c:2664:client3_3_readdirp_cbk] 0-patchy-client-0: remote 
operation failed [Invalid argument]

I talked to Soumya (nfs team) and she will be looking into it.


Thanks and Regards,
Kotresh H R


- Original Message -
> From: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> To: "Venky Shankar" <vshan...@redhat.com>
> Cc: "Gluster Devel" <gluster-devel@gluster.org>
> Sent: Wednesday, February 3, 2016 11:30:31 AM
> Subject: Re: [Gluster-devel] Possible spurious test tests/bitrot/br-stub.t
> 
> I will have a look at it!
> 
> Thanks and Regards,
> Kotresh H R
> 
> - Original Message -
> > From: "Venky Shankar" <vshan...@redhat.com>
> > To: "Sakshi Bansal" <saban...@redhat.com>
> > Cc: "Gluster Devel" <gluster-devel@gluster.org>, "Kotresh Hiremath
> > Ravishankar" <khire...@redhat.com>
> > Sent: Wednesday, February 3, 2016 11:11:54 AM
> > Subject: Re: Possible spurious test tests/bitrot/br-stub.t
> > 
> > On Tue, Feb 02, 2016 at 01:51:56AM -0500, Sakshi Bansal wrote:
> > > Hi Venky,
> > > 
> > > Patch #13262 is failing for the above tests. The patch is just calling
> > > STACK_DESTROY at appropriate place to avoid rebalance crashing. The test
> > > is not rebalance related so the failure looks spurious to me.
> > 
> > Kotresh, mind having a look at this?
> > 
> > (If not, I'll take a look sometime today)
> > 
> > > 
> > > 
> > > --
> > > Thanks and Regards
> > > Sakshi Bansal
> > 
> > Thanks,
> > 
> > Venky
> > 
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Glusterd crash in regression

2016-01-05 Thread Kotresh Hiremath Ravishankar
Hi Atin,

The same test caused a glusterd crash for my patch as well.

https://build.gluster.org/job/rackspace-regression-2GB-triggered/17289/consoleFull

Core was generated by `glusterd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x7f8a77b7e223 in dict_lookup_common (this=0x7f8a64000c3c, 
key=0x7f8a6d0eb0f8 "cmd-str")
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/dict.c:287
#1  0x7f8a77b808a9 in dict_get_with_ref (this=0x7f8a64000c3c, 
key=0x7f8a6d0eb0f8 "cmd-str", data=0x7f8a5b5fd140)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/dict.c:1397
#2  0x7f8a77b81bd6 in dict_get_str (this=0x7f8a64000c3c, 
key=0x7f8a6d0eb0f8 "cmd-str", str=0x7f8a5b5fd1b8)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/dict.c:2139
#3  0x7f8a6cfe7016 in glusterd_xfer_cli_probe_resp (req=0x7f8a6000401c, 
op_ret=-1, op_errno=107, op_errstr=0x0, hostname=0x7f8a64000970 "a.b.c.d", 
port=24007, dict=0x7f8a64000c3c)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-handler.c:3902
#4  0x7f8a6cfea2e6 in glusterd_friend_remove_notify 
(peerctx=0x7f8a64001240, 
op_errno=107)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-handler.c:4980
#5  0x7f8a6cfea98b in __glusterd_peer_rpc_notify (rpc=0x7f8a64003490, 
mydata=0x7f8a64001240, event=RPC_CLNT_DISCONNECT, data=0x0)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-handler.c:5115
#6  0x7f8a6cfdbdd5 in glusterd_big_locked_notify (rpc=0x7f8a64003490, 
mydata=0x7f8a64001240, event=RPC_CLNT_DISCONNECT, data=0x0, 
notify_fn=0x7f8a6cfea390 <__glusterd_peer_rpc_notify>)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-handler.c:67
#7  0x7f8a6cfeaa76 in glusterd_peer_rpc_notify (rpc=0x7f8a64003490, 
mydata=0x7f8a64001240, event=RPC_CLNT_DISCONNECT, data=0x0)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-handler.c:5144
#8  0x7f8a77951793 in rpc_clnt_notify (trans=0x7f8a64003910, 
mydata=0x7f8a640034c0, event=RPC_TRANSPORT_DISCONNECT, data=0x7f8a64003910)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:867
#9  0x7f8a7794db5a in rpc_transport_notify (this=0x7f8a64003910, 
event=RPC_TRANSPORT_DISCONNECT, data=0x7f8a64003910)
---Type  to continue, or q  to quit---
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-transport.c:541
#10 0x7f8a6b695754 in socket_connect_error_cbk (opaque=0x7f8a48000fe0)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:2812
#11 0x7f8a76e68a51 in start_thread () from ./lib64/libpthread.so.0
#12 0x7f8a767d293d in clone () from ./lib64/libc.so.6


Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Atin Mukherjee" 
> To: "Poornima Gurusiddaiah" , "Gluster Devel" 
> 
> Cc: "Kaushal Madappa" 
> Sent: Tuesday, January 5, 2016 2:35:25 PM
> Subject: Re: [Gluster-devel] Glusterd crash in regression
> 
> Could you paste the back trace?
> 
> On 01/05/2016 02:33 PM, Poornima Gurusiddaiah wrote:
> > Hi,
> > 
> > In upstream regression, looks like the following test has caused
> > glusterd to crash.
> > 
> > ./tests/bugs/glusterfs/bug-879490.t
> > 
> > Not sure if its a known crash, here is the link to the regression links:
> > https://build.gluster.org/job/rackspace-regression-2GB-triggered/17213/consoleFull
> > 
> > Regards,
> > Poornima
> > 
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Glusterd crash in regression

2016-01-05 Thread Kotresh Hiremath Ravishankar
Hi Atin,

Here is the bug.
https://bugzilla.redhat.com/show_bug.cgi?id=1296004

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Atin Mukherjee" <amukh...@redhat.com>
> To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> Cc: "Poornima Gurusiddaiah" <pguru...@redhat.com>, "Gluster Devel" 
> <gluster-devel@gluster.org>, "Kaushal Madappa"
> <kmada...@redhat.com>
> Sent: Tuesday, January 5, 2016 7:20:49 PM
> Subject: Re: [Gluster-devel] Glusterd crash in regression
> 
> We will analyze it and get back. Mind filing a bug for this?
> 
> ~Atin
> 
> On 01/05/2016 03:36 PM, Kotresh Hiremath Ravishankar wrote:
> > Hi Atin,
> > 
> > The same test caused glusterd crash for my patch as well.
> > 
> > https://build.gluster.org/job/rackspace-regression-2GB-triggered/17289/consoleFull
> > 
> > Core was generated by `glusterd'.
> > Program terminated with signal SIGSEGV, Segmentation fault.
> > #0  0x7f8a77b7e223 in dict_lookup_common (this=0x7f8a64000c3c,
> > key=0x7f8a6d0eb0f8 "cmd-str")
> > at
> > 
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/dict.c:287
> > #1  0x7f8a77b808a9 in dict_get_with_ref (this=0x7f8a64000c3c,
> > key=0x7f8a6d0eb0f8 "cmd-str", data=0x7f8a5b5fd140)
> > at
> > 
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/dict.c:1397
> > #2  0x7f8a77b81bd6 in dict_get_str (this=0x7f8a64000c3c,
> > key=0x7f8a6d0eb0f8 "cmd-str", str=0x7f8a5b5fd1b8)
> > at
> > 
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/dict.c:2139
> > #3  0x7f8a6cfe7016 in glusterd_xfer_cli_probe_resp (req=0x7f8a6000401c,
> > op_ret=-1, op_errno=107, op_errstr=0x0, hostname=0x7f8a64000970
> > "a.b.c.d",
> > port=24007, dict=0x7f8a64000c3c)
> > at
> > 
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-handler.c:3902
> > #4  0x7f8a6cfea2e6 in glusterd_friend_remove_notify
> > (peerctx=0x7f8a64001240,
> > op_errno=107)
> > at
> > 
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-handler.c:4980
> > #5  0x7f8a6cfea98b in __glusterd_peer_rpc_notify (rpc=0x7f8a64003490,
> > mydata=0x7f8a64001240, event=RPC_CLNT_DISCONNECT, data=0x0)
> > at
> > 
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-handler.c:5115
> > #6  0x7f8a6cfdbdd5 in glusterd_big_locked_notify (rpc=0x7f8a64003490,
> > mydata=0x7f8a64001240, event=RPC_CLNT_DISCONNECT, data=0x0,
> > notify_fn=0x7f8a6cfea390 <__glusterd_peer_rpc_notify>)
> > at
> > 
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-handler.c:67
> > #7  0x7f8a6cfeaa76 in glusterd_peer_rpc_notify (rpc=0x7f8a64003490,
> > mydata=0x7f8a64001240, event=RPC_CLNT_DISCONNECT, data=0x0)
> > at
> > 
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-handler.c:5144
> > #8  0x7f8a77951793 in rpc_clnt_notify (trans=0x7f8a64003910,
> > mydata=0x7f8a640034c0, event=RPC_TRANSPORT_DISCONNECT,
> > data=0x7f8a64003910)
> > at
> > 
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:867
> > #9  0x7f8a7794db5a in rpc_transport_notify (this=0x7f8a64003910,
> > event=RPC_TRANSPORT_DISCONNECT, data=0x7f8a64003910)
> > ---Type  to continue, or q  to quit---
> > at
> > 
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-transport.c:541
> > #10 0x7f8a6b695754 in socket_connect_error_cbk (opaque=0x7f8a48000fe0)
> > at
> > 
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:2812
> > #11 0x7f8a76e68a51 in start_thread () from ./lib64/libpthread.so.0
> > #12 0x7f8a767d293d in clone () from ./lib64/libc.so.6
> > 
> > 
> > Thanks and Regards,
> > Kotresh H R
> > 
> > - Original Message -
> >> From: "Atin Mukherjee" <amukh...@redhat.com>
> >> To: "Poornima Gurusiddaiah" <pguru...@redhat.com>, "Glust

Re: [Gluster-devel] compound fop design first cut

2015-12-09 Thread Kotresh Hiremath Ravishankar
Geo-rep requirements inline.

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Pranith Kumar Karampuri" 
> To: "Vijay Bellur" , "Jeff Darcy" , 
> "Raghavendra Gowdappa"
> , "Ira Cooper" 
> Cc: "Gluster Devel" 
> Sent: Wednesday, December 9, 2015 11:44:52 AM
> Subject: Re: [Gluster-devel] compound fop design first cut
> 
> 
> 
> On 12/09/2015 06:37 AM, Vijay Bellur wrote:
> > On 12/08/2015 03:45 PM, Jeff Darcy wrote:
> >>
> >>
> >>
> >> On December 8, 2015 at 12:53:04 PM, Ira Cooper (i...@redhat.com) wrote:
> >>> Raghavendra Gowdappa writes:
> >>> I propose that we define a "compound op" that contains ops.
> >>>
> >>> Within each op, there are fields that can be "inherited" from the
> >>> previous op, via use of a sentinel value.
> >>>
> >>> Sentinel is -1, for all of these examples.
> >>>
> >>> So:
> >>>
> >>> LOOKUP (1, "foo") (Sets the gfid value to be picked up by
> >>> compounding, 1
> >>> is the root directory, as a gfid, by convention.)
> >>> OPEN(-1, O_RDWR) (Uses the gfid value, sets the glfd compound value.)
> >>> WRITE(-1, "foo", 3) (Uses the glfd compound value.)
> >>> CLOSE(-1) (Uses the glfd compound value)
> >>
> >> So, basically, what the programming-language types would call futures
> >> and promises.  It’s a good and well studied concept, which is necessary
> >> to solve the second-order problem of how to specify an argument in
> >> sub-operation N+1 that’s not known until sub-operation N completes.
> >>
> >> To be honest, some of the highly general approaches suggested here scare
> >> me too.  Wrapping up the arguments for one sub-operation in xdata for
> >> another would get pretty hairy if we ever try to go beyond two
> >> sub-operations and have to nest sub-operation #3’s args within
> >> sub-operation #2’s xdata which is itself encoded within sub-operation
> >> #1’s xdata.  There’s also not much clarity about how to handle errors in
> >> that model.  Encoding N sub-operations’ arguments in a linear structure
> >> as Shyam proposes seems a bit cleaner that way.  If I were to continue
> >> down that route I’d suggest just having start_compound and end-compound
> >> fops, plus an extra field (or by-convention xdata key) that either the
> >> client-side or server-side translator could use to build whatever
> >> structure it wants and schedule sub-operations however it wants.
> >>
> >> However, I’d be even more comfortable with an even simpler approach that
> >> avoids the need to solve what the database folks (who have dealt with
> >> complex transactions for years) would tell us is a really hard problem.
> >> Instead of designing for every case we can imagine, let’s design for the
> >> cases that we know would be useful for improving performance. Open plus
> >> read/write plus close is an obvious one.  Raghavendra mentions
> >> create+inodelk as well.  For each of those, we can easily define a
> >> structure that contains the necessary fields, we don’t need a
> >> client-side translator, and the server-side translator can take care of
> >> “forwarding” results from one sub-operation to the next.  We could even
> >> use GF_FOP_IPC to prototype this.  If we later find that the number of
> >> “one-off” compound requests is growing too large, then at least we’ll
> >> have some experience to guide our design of a more general alternative.
> >> Right now, I think we’re trying to look further ahead than we can see
> >> clearly.
> Yes Agree. This makes implementation on the client side simpler as well.
> So it is welcome.
> 
> Just updating the solution.
> 1) New RPCs are going to be implemented.
> 2) client stack will use these new fops.
> 3) On the server side we have server xlator implementing these new fops
> to decode the RPC request then resolve_resume and
> compound-op-receiver(Better name for this is welcome) which sends one op
> after other and send compound fop response.
> 
> List of compound fops identified so far:
> Swift/S3:
> PUT: creat(), write()s, setxattr(), fsync(), close(), rename()
> 
> Dht:
> mkdir + inodelk
> 
> Afr:
> xattrop+writev, xattrop+unlock to begin with.

  Geo-rep (a rough encoding sketch follows below):
  mknod, entrylk, stat (on backend gfid)
  mkdir, entrylk, stat (on backend gfid)
  symlink, entrylk, stat (on backend gfid)
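
  To make the above concrete, here is a rough sketch (plain C, with made-up
  names; this is not a proposed wire format or gluster API) of how the first
  geo-rep sequence could be packed into one compound request, using the -1
  "inherit from the previous sub-op" sentinel discussed earlier in this
  thread:

  #include <stdio.h>

  enum sub_op_type { SUB_OP_MKNOD, SUB_OP_ENTRYLK, SUB_OP_STAT };

  struct sub_op {
          enum sub_op_type type;
          long             gfid;  /* -1: use the gfid produced by the
                                   * previous sub-op */
          const char      *name;  /* basename, where applicable */
  };

  /* mknod creates the entry, entrylk serialises on it, stat fetches the
   * backend gfid/iatt: all resolved server-side in one round trip
   * instead of three. */
  static const struct sub_op georep_mknod_compound[] = {
          { SUB_OP_MKNOD,    1, "file1" },  /* 1 == root gfid, as in the
                                             * LOOKUP example above */
          { SUB_OP_ENTRYLK, -1, "file1" },  /* lock on the gfid mknod made */
          { SUB_OP_STAT,    -1, NULL    },  /* stat that same backend gfid */
  };

  int
  main (void)
  {
          size_t n = sizeof (georep_mknod_compound) /
                     sizeof (georep_mknod_compound[0]);
          printf ("compound request with %zu sub-ops\n", n);
          return 0;
  }
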
  
> 
> Could everyone who needs compound fops add to this list?
> 
> I see that Niels is back on 14th. Does anyone else know the list of
> compound fops he has in mind?
> 
> Pranith.
> >
> > Starting with a well defined set of operations for compounding has its
> > advantages. It would be easier to understand and maintain correctness
> > across the stack. Some of our translators perform transactions &
> > create/update internal metadata for certain fops. It would be easier
> > for such translators if the compound operations are well defined and
> > does not entail deep introspection of a generic representation to
> > ensure that the right 

[Gluster-devel] NetBSD regression not kicking off!

2015-11-29 Thread Kotresh Hiremath Ravishankar
Hi,

I am consistently getting the following errors for my patch.
https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/12193/console

Building remotely on nbslave7i.cloud.gluster.org (netbsd7_regression) in 
workspace /home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url git://review.gluster.org/glusterfs.git # 
 > timeout=10
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from 
git://review.gluster.org/glusterfs.git
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:763)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1012)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1043)
at hudson.scm.SCM.checkout(SCM.java:485)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1276)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:607)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:529)
at hudson.model.Run.execute(Run.java:1738)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:98)
at hudson.model.Executor.run(Executor.java:410)
Caused by: hudson.plugins.git.GitException: Command "git config 
remote.origin.url git://review.gluster.org/glusterfs.git" returned status code 
4:
stdout: 
stderr: error: failed to write new configuration file .git/config.lock

at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:1640)
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:1616)
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:1612)
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommand(CliGitAPIImpl.java:1254)
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommand(CliGitAPIImpl.java:1266)
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.setRemoteUrl(CliGitAPIImpl.java:972)
at hudson.plugins.git.GitAPI.setRemoteUrl(GitAPI.java:160)
at sun.reflect.GeneratedMethodAccessor49.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:608)
at 
hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:583)
at 
hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:542)
at hudson.remoting.UserRequest.perform(UserRequest.java:120)
at hudson.remoting.UserRequest.perform(UserRequest.java:48)
at hudson.remoting.Request$2.run(Request.java:326)
at 
hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
at ..remote call to nbslave7i.cloud.gluster.org(Native Method)
at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1416)
at hudson.remoting.UserResponse.retrieve(UserRequest.java:220)
at hudson.remoting.Channel.call(Channel.java:781)
at 
hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:250)
at com.sun.proxy.$Proxy51.setRemoteUrl(Unknown Source)
at 
org.jenkinsci.plugins.gitclient.RemoteGitImpl.setRemoteUrl(RemoteGitImpl.java:298)
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:751)
... 11 more
ERROR: null
Finished: FAILURE


Could someone look into this?


Thanks and Regards,
Kotresh H R

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Need advice re some major issues with glusterfind

2015-10-23 Thread Kotresh Hiremath Ravishankar
Hi John,

You are welcome and happy to help you!

You can delete the consumed changelogs safely if there is only one glusterfind session
for your gluster volume. But consider multiple glusterfind sessions which are started
with a time gap of, let's say, two days:

1st-day  : session1   - For Purpose 1
2nd-day  : session1
3rd-day  : session1
   session2 (started)  - For Purpose 2

Above, if you had deleted the changelogs of Day-1 and Day-2, then when session2 is
started it needs to crawl the entire filesystem, which defeats the purpose of
glusterfind and is slower. That's the reason I said deleting changelogs is not
recommended. If you don't have use cases of the above kind, you can delete the
changelogs.
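
As a rough illustration of that rule, a changelog is safe to remove only once every
glusterfind session of the volume has processed past it. A small python sketch (the
per-session "last processed time" bookkeeping here is an assumption for the example,
not the real glusterfind session metadata):

---
# Return changelog files that every session has already consumed.
import os

def removable_changelogs(changelog_dir, last_processed_by_session):
    if not last_processed_by_session:
        return []   # no sessions, so nothing is known to be consumed
    safe_before = min(last_processed_by_session.values())
    removable = []
    for name in os.listdir(changelog_dir):
        if not name.startswith("CHANGELOG."):
            continue
        ts = int(name.split(".")[1])
        if ts < safe_before:
            removable.append(os.path.join(changelog_dir, name))
    return removable

# e.g. the session started two days later caps what can be deleted, which is
# why deleting blindly forces it into a full filesystem crawl.
---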


Thanks and Regards,
Kotresh H R

- Original Message -
> From: "John Sincock [FLCPTY]" <j.sinc...@fugro.com>
> To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> Cc: "Vijaikumar Mallikarjuna" <vmall...@redhat.com>, gluster-devel@gluster.org
> Sent: Friday, October 23, 2015 2:54:14 PM
> Subject: RE: [Gluster-devel] Need advice re some major issues with glusterfind
> 
> Aaah I see, thanks Kotresh :-)
> This explains why there are so many files and why I sometimes didn't see some
> changed files during my testing where I was changing files and then
> immediately running a glusterfind.
> 
> When you say deleting the changelogs is not recommended because it will
> affect new glusterfind sessions - I assume it will be OK to delete
> changelogs that are further back into the past than the time period we're
> interested in? Please let me know if this is the case, or if you meant that
> removing old changelogs is likely to trigger bugs and cause all our
> glusterfinds to start failing outright...
> 
> We can leave the old changelogs there if we have to, but if we don’t increase
> the rollover time, the number will become astronomical as time goes on, so I
> hope we can delete or archive old changelogs for time periods we're no
> longer interested in.
> 
> For our purposes I think it should also be OK to try increasing the rollover
> time significantly, eg if we have it set to rollover every 10 minutes, then
> all we have to do is subtract 10 mins from the start time of each
> glusterfind/backup so it overlaps the end of the previous glusterfind
> period. In this way, any files changed just before a glusterfind/backup
> runs, might be missed by the first backup, but they will be caught by the
> next backup that runs later on. And it won't matter if some changed files get
> backed up twice - as long as we get at least one backup of every file that
> does change.
> 
> I note that by default there is no easy way to make glusterfind report on
> changes further back in time than the time you run glusterfind create to
> start a session - but I've already had some success at getting glusterfind
> to give results back to earlier times before the session was created (as
> long as the changelogs exist). I did this by using a script to manually set
> the time we're interested in in the status file(s) - ie in the main status
> file on the node running the "pre" command", and for every one of the extra
> status files stored on every node for each of their bricks :-)
> 
> I think my only remaining concern is how cpu-intensive the process is. I've
> had glusterfinds return very quickly if only reporting on changes for the
> last hour, or the last 10 hours or so. But if I go back a bit further, the
> time taken to do the glusterfind seems to really blow out and it sits there
> pegging all our CPUs at 100% for hours.
> 
> But you and Vijay have definitely given me a few tweaks I can look into - I
> think I will bump-up the changelog rollover a bit, and will follow Vijay's
> tip to get all our files labelled with pgfid's, and then perhaps the
> glusterfinds will be less cpu-intensive.
> 
> Thanks for the tips (Kotresh & Vijay), and I'll let you know how it goes.
> 
> If the glusterfinds are still very cpu-intensive after all the pgfid
> labelling is done, I'll be happy to do some further testing if it can be of
> any help to you. Or if you're already trying to find time to work on
> increasing the efficiency of processing the changelogs, and you know where
> the improvements need to be made I'll just leave you to it and hope it all
> goes smoothly for you
> 
> Thanks again, and cheerios :-)
> John
> 
> 
> 
> 
> 
>  
> 
> 
> 
> 
> -Original Message-
> From: Kotresh Hiremath Ravishankar [mailto:khire...@redhat.com]
> Sent: Friday, 23 October 2015 5:24 PM
> To: Sincock, John [FLCPTY]
> Cc: Vijaikumar Mallikarjuna; gluster-devel@gluster.org
> Subject: Re: [Gluster-devel] Need advice re some major iss

Re: [Gluster-devel] Need advice re some major issues with glusterfind

2015-10-23 Thread Kotresh Hiremath Ravishankar
Hi John,

The changelog files are generated every 15 secs, recording the changes that happened
to the filesystem within that span. So every 15 secs, once the new changelog file is
generated, it is ready to be consumed by glusterfind or any other consumers. The 15
sec time period is tunable.
e.g.,
 gluster vol set  changelog.rollover-time 300

The above will generate a new changelog file every 300 secs instead of 15 secs,
hence reducing the number of changelogs. But glusterfind will come to know about
the changes in the filesystem only after 300 secs!
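
For a rough feel of the numbers (plain arithmetic, assuming one changelog file per
brick per rollover interval):

---
# Changelog files generated per brick per day for a given rollover-time.
def changelogs_per_day(rollover_secs):
    return 86400 // rollover_secs

print(changelogs_per_day(15))    # 5760 files per brick per day (the default)
print(changelogs_per_day(300))   # 288 files per brick per day
---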

Deleting these changelogs at .glusterfs/changelog/... is not recommended. It will
affect any new glusterfind session that is going to be established.


Thanks and Regards,
Kotresh H R

- Original Message -
> From: "John Sincock [FLCPTY]" 
> To: "Vijaikumar Mallikarjuna" 
> Cc: gluster-devel@gluster.org
> Sent: Friday, October 23, 2015 9:54:25 AM
> Subject: Re: [Gluster-devel] Need advice re some major issues with glusterfind
> 
> 
> Hi Vijay, pls see below again (I'm wondering if top-posting would be easier,
> that's usually what I do, though I know some ppl don’t like it)
> 
>  
> On Wed, Oct 21, 2015 at 5:53 AM, Sincock, John [FLCPTY] 
> wrote:
> Hi Everybody,
> 
> We have recently upgraded our 220 TB gluster to 3.7.4, and we've been trying
> to use the new glusterfind feature but have been having some serious
> problems with it. Overall the glusterfind looks very promising, so I don't
> want to offend anyone by raising these issues.
> 
> If these issues can be resolved or worked around, glusterfind will be a great
> feature.  So I would really appreciate any information or advice:
> 
> 1) What can be done about the vast number of tiny changelogs? We are seeing
> often 5+ small 89 byte changelog files per minute on EACH brick. Larger
> files if busier. We've been generating these changelogs for a few weeks and
> have in excess of 10,000 or 12,000 on most bricks. This makes glusterfinds
> very, very slow, especially on a node which has a lot of bricks, and looks
> unsustainable in the long run. Why are these files so small, and why are
> there so many of them, and how are they supposed to be managed in the long
> run? The sheer number of these files looks sure to impact performance in the
> long run.
> 
> 2) Pgfid xattribute is wreaking havoc with our backup scheme - when gluster
> adds this extended attribute to files it changes the ctime, which we were
> using to determine which files need to be archived. There should be a
> warning added to release notes & upgrade notes, so people can make a plan to
> manage this if required.
> 
> Also, we ran a rebalance immediately after the 3.7.4 upgrade, and the
> rebalance took 5 days or so to complete, which looks like a major speed
> improvement over the more serial rebalance algorithm, so that's good. But I
> was hoping that the rebalance would also have had the side-effect of
> triggering all files to be labelled with the pgfid attribute by the time the
> rebalance completed, or failing that, after creation of an mlocate database
> across our entire gluster (which would have accessed every file, unless it
> is getting the info it needs only from directory inodes). Now it looks like
> ctimes are still being modified, and I think this can only be caused by
> files still being labelled with pgfids.
> 
> How can we force gluster to get this pgfid labelling over and done with, for
> all files that are already on the volume? We can't have gluster continuing
> to add pgfids in bursts here and there, eg when files are read for the first
> time since the upgrade. We need to get it over and done with. We have just
> had to turn off pgfid creation on the volume until we can force gluster to
> get it over and done with in one go.
>  
>  
> Hi John,
>  
> Was quota turned on/off before/after performing re-balance? If the pgfid is
>  missing, this can be healed by performing 'find  | xargs
> stat', all the files will get looked-up once and the pgfid healing will
> happen.
> Also could you please provide all the volume files under
> '/var/lib/glusterd/vols//*.vol'?
>  
> Thanks,
> Vijay
>  
>  
> Hi Vijay
>  
> Quota has never been turned on in our gluster, so it can’t be any
> quota-related xattrs which are resetting our ctimes, so I’m pretty sure it
> must be due to pgfids still being added.
>  
> Thanks for the tip re using stat, if that should trigger the pgfid build on
> each file, then I will run that when I have a chance. We’ll have to get our
> archiving of data back up to date, re-enable pgfid build option, and then
> run the stat over a weekend or something, as it will take a while.
>  
> I’m still quite concerned about the number of changelogs being generated. Do
> you know if there are any plans to change the way changelogs are generated so
> there aren’t so many of them, and to process them more efficiently? I think
> this will be vital to improving 

Re: [Gluster-devel] Spurious failures

2015-09-27 Thread Kotresh Hiremath Ravishankar
Thanks Michael!

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Michael Scherer" <msche...@redhat.com>
> To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> Cc: "Krutika Dhananjay" <kdhan...@redhat.com>, "Atin Mukherjee" 
> <amukh...@redhat.com>, "Gaurav Garg"
> <gg...@redhat.com>, "Aravinda" <avish...@redhat.com>, "Gluster Devel" 
> <gluster-devel@gluster.org>
> Sent: Thursday, 24 September, 2015 11:09:52 PM
> Subject: Re: Spurious failures
> 
> Le jeudi 24 septembre 2015 à 07:59 -0400, Kotresh Hiremath Ravishankar a
> écrit :
> > Thank you:) and also please check the script I had given passes in all
> > machines
> 
> So it worked everywhere, but on slave0 and slave1. Not sure what is
> wrong, or if they are used, I will check later.
> 
> 
> --
> Michael Scherer
> Sysadmin, Community Infrastructure and Platform, OSAS
> 
> 
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Spurious failures

2015-09-24 Thread Kotresh Hiremath Ravishankar
>>> Ok, this definitely requires some tests and thoughts. It only uses ipv4
>>> too ?
>>> (I guess yes, since ipv6 is removed from the rackspace build slaves)
   
Yes!

Could we know when these settings can be done on all linux slave machines?
If it takes some time, we should consider moving all geo-rep testcases under
bad tests till then.

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Michael Scherer" <msche...@redhat.com>
> To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> Cc: "Krutika Dhananjay" <kdhan...@redhat.com>, "Atin Mukherjee" 
> <amukh...@redhat.com>, "Gaurav Garg"
> <gg...@redhat.com>, "Aravinda" <avish...@redhat.com>, "Gluster Devel" 
> <gluster-devel@gluster.org>
> Sent: Thursday, 24 September, 2015 1:18:16 PM
> Subject: Re: Spurious failures
> 
> Le jeudi 24 septembre 2015 à 02:24 -0400, Kotresh Hiremath Ravishankar a
> écrit :
> > Hi,
> > 
> > >>>So, it is ok if I restrict that to be used only on 127.0.0.1 ?
> > I think no, testcases use 'H0' to create volumes
> >  H0=${H0:=`hostname`};
> > Geo-rep expects passwordLess SSH to 'H0'
> >  
> 
> Ok, this definitely requires some tests and thoughts. It only uses ipv4
> too ?
> (I guess yes, since ipv6 is removed from the rackspace build slaves)
> --
> Michael Scherer
> Sysadmin, Community Infrastructure and Platform, OSAS
> 
> 
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Spurious failures

2015-09-24 Thread Kotresh Hiremath Ravishankar
Thank you:) and also please check the script I had given passes in all machines

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Michael Scherer" <msche...@redhat.com>
> To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> Cc: "Krutika Dhananjay" <kdhan...@redhat.com>, "Atin Mukherjee" 
> <amukh...@redhat.com>, "Gaurav Garg"
> <gg...@redhat.com>, "Aravinda" <avish...@redhat.com>, "Gluster Devel" 
> <gluster-devel@gluster.org>
> Sent: Thursday, 24 September, 2015 5:00:43 PM
> Subject: Re: Spurious failures
> 
> Le jeudi 24 septembre 2015 à 06:50 -0400, Kotresh Hiremath Ravishankar a
> écrit :
> > >>> Ok, this definitely requires some tests and toughts. It only use ipv4
> > >>> too ?
> > >>> (I guess yes, since ipv6 is removed from the rackspace build slaves)
> >
> > Yes!
> > 
> > Could we know when can these settings be done on all linux slave
> > machines?
> > If it takes sometime, we should consider moving all geo-rep testcases
> > under bad tests
> > till then.
> 
> I will do that this afternoon, now I have a clear idea of what need to
> be done.
> ( I already pushed the path change )
> 
> --
> Michael Scherer
> Sysadmin, Community Infrastructure and Platform, OSAS
> 
> 
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Spurious failures

2015-09-24 Thread Kotresh Hiremath Ravishankar
Hi,

>>>So, it is ok if I restrict that to be used only on 127.0.0.1 ?
I think no, testcases use 'H0' to create volumes
 H0=${H0:=`hostname`};
Geo-rep expects passwordLess SSH to 'H0'  
 

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Michael Scherer" <msche...@redhat.com>
> To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> Cc: "Krutika Dhananjay" <kdhan...@redhat.com>, "Atin Mukherjee" 
> <amukh...@redhat.com>, "Gaurav Garg"
> <gg...@redhat.com>, "Aravinda" <avish...@redhat.com>, "Gluster Devel" 
> <gluster-devel@gluster.org>
> Sent: Wednesday, 23 September, 2015 5:05:58 PM
> Subject: Re: Spurious failures
> 
> Le mercredi 23 septembre 2015 à 06:24 -0400, Kotresh Hiremath
> Ravishankar a écrit :
> > Hi Michael,
> > 
> > Please find my replies below.
> > 
> > >>> Root login using password should be disabled, so no. If that's still
> > >>> working and people use it, that's gonna change soon, too much problems
> > >>> with it.
> > 
> >   Ok
> > 
> > >>>Can you be more explicit on where should the user come from so I can
> > >>>properly integrate that ?
> > 
> >   It's just PasswordLess SSH from root to root on to same host.
> >   1. Generate ssh key:
> > #ssh-keygen
> >   2. Add it to /root/.ssh/authorized_keys
> > #ssh-copy-id -i  root@host
> > 
> >   Requirement by geo-replication:
> > 'ssh root@host' should not ask for password
> 
> So, it is ok if I restrict that to be used only on 127.0.0.1 ?
> 
> --
> Michael Scherer
> Sysadmin, Community Infrastructure and Platform, OSAS
> 
> 
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Spurious failures

2015-09-23 Thread Kotresh Hiremath Ravishankar
Hi Michael,

Please find my replies below.

>>> Root login using password should be disabled, so no. If that's still
>>> working and people use it, that's gonna change soon, too much problems
>>> with it.

  Ok

>>>Can you be more explicit on where should the user come from so I can
>>>properly integrate that ?

  It's just PasswordLess SSH from root to root on the same host.
  1. Generate ssh key:
#ssh-keygen
  2. Add it to /root/.ssh/authorized_keys
#ssh-copy-id -i  root@host

  Requirement by geo-replication:
'ssh root@host' should not ask for password
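
A quick way to sanity-check that prerequisite on a slave (just a sketch; BatchMode
makes ssh fail instead of prompting, and the host is whatever the tests use as H0):

---
# Verify that 'ssh root@host' works without asking for a password.
import socket
import subprocess

def passwordless_ssh_ok(host):
    rc = subprocess.call(
        ["ssh", "-o", "BatchMode=yes", "-o", "StrictHostKeyChecking=no",
         "root@" + host, "true"])
    return rc == 0

print(passwordless_ssh_ok(socket.gethostname()))   # H0 defaults to `hostname`
---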


>>>There is something adding lots of lines to /root/.ssh/authorized_keys on
>>>the slave, and this makes me quite uncomfortable, so if that's it, I
>>>rather have it done cleanly, and for that, I need to understand the
>>>test, and the requirement.

  Yes, geo-rep is doing it. It adds only once per session. Since the
   test is running continuously for different patches, it's building up.
   I will submit a patch to clean it up in geo-rep testsuite itself.

>>>I will do this one.
  
Thank you!

>>>Is georep supposed to work on other platform like freebsd ? ( because
>>>freebsd do not have bash, so I have to adapt to local way, but if that's
>>>not gonna be tested, I rather not spend too much time on reading the
>>>handbook for now )

As of now it is supported only on Linux; it has known issues with other platforms
such as NetBSD...

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Michael Scherer" <msche...@redhat.com>
> To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> Cc: "Krutika Dhananjay" <kdhan...@redhat.com>, "Atin Mukherjee" 
> <amukh...@redhat.com>, "Gaurav Garg"
> <gg...@redhat.com>, "Aravinda" <avish...@redhat.com>, "Gluster Devel" 
> <gluster-devel@gluster.org>
> Sent: Wednesday, September 23, 2015 3:30:39 PM
> Subject: Re: Spurious failures
> 
> Le mercredi 23 septembre 2015 à 03:25 -0400, Kotresh Hiremath
> Ravishankar a écrit :
> > Hi Krutika,
> > 
> > Looks like the prerequisites for geo-replication to work is changed
> > in slave21
> > 
> > Hi Michael,
> 
> Hi,
> 
> > Could you please check following settings are made in all linux regression
> > machines?
> 
> Yeah, I will add to salt.
> 
> > Or provide me with root password so that I can verify.
> 
> Root login using password should be disabled, so no. If that's still
> working and people use it, that's gonna change soon, too much problems
> with it.
> 
> > 1. Setup Passwordless SSH for the root user:
> 
> Can you be more explicit on where should the user come from so I can
> properly integrate that ?
> 
> There is something adding lots of lines to /root/.ssh/authorized_keys on
> the slave, and this makes me quite uncomfortable, so if that's it, I
> rather have it done cleanly, and for that, I need to understand the
> test, and the requirement.
>  
> > 2. Add below line in /root/.bashrc. This is required as geo-rep does
> > "gluster --version" via ssh
> >and it can't find the gluster PATH via ssh.
> >  export PATH=$PATH:/build/install/sbin:/build/install/bin
> 
> I will do this one.
> 
> Is georep supposed to work on other platform like freebsd ? ( because
> freebsd do not have bash, so I have to adapt to local way, but if that's
> not gonna be tested, I rather not spend too much time on reading the
> handbook for now )
> 
> > Once above settings are done, the following script should output proper
> > version.
> > 
> > ---
> > #!/bin/bash
> > 
> > function SSHM()
> > {
> > ssh -q \
> > -oPasswordAuthentication=no \
> > -oStrictHostKeyChecking=no \
> > -oControlMaster=yes \
> > "$@";
> > }
> > 
> > function cmd_slave()
> > {
> > local cmd_line;
> > cmd_line=$(cat <<EOF
> > function do_verify() {
> > ver=\$(gluster --version | head -1 | cut -f2 -d " ");
> > echo \$ver;
> > };
> > source /etc/profile && do_verify;
> > EOF
> > );
> > echo $cmd_line;
> > }
> > 
> > HOST=$1
> > cmd_line=$(cmd_slave);
> > ver=`SSHM root@$HOST bash -c "'$cmd_line'"`;
> > echo $ver
> > -
> > 
> > I could verify for slave32.
> > [root@slave32 ~]# vi /tmp/gver.sh
> > [root@slave32 ~]# /tmp/gver.sh slave32
> > 3.8dev
> > 
> > Please help me in verifying the same for all the linux regression machines.
> > 
> 
> --
> Michael Scherer
> Sysadmin, Community Infrastructure and Platform, OSAS
> 
> 
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Spurious failures

2015-09-23 Thread Kotresh Hiremath Ravishankar
Hi Krutika,

It's failing with

++ gluster --mode=script --wignore volume geo-rep master 
slave21.cloud.gluster.org::slave create push-pem
Gluster version mismatch between master and slave.

I will look into it.

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Krutika Dhananjay" <kdhan...@redhat.com>
> To: "Atin Mukherjee" <amukh...@redhat.com>
> Cc: "Gluster Devel" <gluster-devel@gluster.org>, "Gaurav Garg" 
> <gg...@redhat.com>, "Aravinda" <avish...@redhat.com>,
> "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> Sent: Tuesday, September 22, 2015 9:03:44 PM
> Subject: Re: Spurious failures
> 
> Ah! Sorry. I didn't read that line. :)
> 
> Just figured even ./tests/geo-rep/georep-basic-dr-rsync.t is added to bad
> tests list.
> 
> So it's just /tests/geo-rep/georep-basic-dr-tarssh.t for now.
> 
> Thanks Atin!
> 
> -Krutika
> 
> - Original Message -
> 
> > From: "Atin Mukherjee" <amukh...@redhat.com>
> > To: "Krutika Dhananjay" <kdhan...@redhat.com>
> > Cc: "Gluster Devel" <gluster-devel@gluster.org>, "Gaurav Garg"
> > <gg...@redhat.com>, "Aravinda" <avish...@redhat.com>, "Kotresh Hiremath
> > Ravishankar" <khire...@redhat.com>
> > Sent: Tuesday, September 22, 2015 8:51:22 PM
> > Subject: Re: Spurious failures
> 
> > ./tests/bugs/glusterd/bug-1238706-daemons-stop-on-peer-cleanup.t (Wstat:
> > 0 Tests: 8 Failed: 2)
> > Failed tests: 6, 8
> > Files=1, Tests=8, 48 wallclock secs ( 0.01 usr 0.01 sys + 0.88 cusr
> > 0.56 csys = 1.46 CPU)
> > Result: FAIL
> > ./tests/bugs/glusterd/bug-1238706-daemons-stop-on-peer-cleanup.t: bad
> > status 1
> > *Ignoring failure from known-bad test
> > ./tests/bugs/glusterd/bug-1238706-daemons-stop-on-peer-cleanup.t*
> > [11:24:16] ./tests/bugs/glusterd/bug-1242543-replace-brick.t .. ok
> > 17587 ms
> > [11:24:16]
> > All tests successful
> 
> > On 09/22/2015 08:46 PM, Krutika Dhananjay wrote:
> > > https://build.gluster.org/job/rackspace-regression-2GB-triggered/14421/consoleFull
> > >
> > > Ctrl + f 'not ok'.
> > >
> > > -Krutika
> > >
> > > 
> > >
> > > *From: *"Atin Mukherjee" <amukh...@redhat.com>
> > > *To: *"Krutika Dhananjay" <kdhan...@redhat.com>, "Gluster Devel"
> > > <gluster-devel@gluster.org>
> > > *Cc: *"Gaurav Garg" <gg...@redhat.com>, "Aravinda"
> > > <avish...@redhat.com>, "Kotresh Hiremath Ravishankar"
> > > <khire...@redhat.com>
> > > *Sent: *Tuesday, September 22, 2015 8:39:56 PM
> > > *Subject: *Re: Spurious failures
> > >
> > > Krutika,
> > >
> > > ./tests/bugs/glusterd/bug-1238706-daemons-stop-on-peer-cleanup.t is
> > > already a part of bad_tests () in both mainline and 3.7. Could you
> > > provide me the link where this test has failed explicitly and that has
> > > caused the regression to fail?
> > >
> > > ~Atin
> > >
> > >
> > > On 09/22/2015 07:27 PM, Krutika Dhananjay wrote:
> > > > Hi,
> > > >
> > > > The following tests seem to be failing consistently on the build
> > > > machines in Linux:
> > > >
> > > > ./tests/bugs/glusterd/bug-1238706-daemons-stop-on-peer-cleanup.t ..
> > > >
> > > > ./tests/geo-rep/georep-basic-dr-rsync.t ..
> > > >
> > > > ./tests/geo-rep/georep-basic-dr-tarssh.t ..
> > > >
> > > > I have added these tests into the tracker etherpad.
> > > >
> > > > Meanwhile could someone from geo-rep and glusterd team take a look or
> > > > perhaps move them to bad tests list?
> > > >
> > > >
> > > > Here is one place where the three tests failed:
> > > >
> > > https://build.gluster.org/job/rackspace-regression-2GB-triggered/14421/consoleFull
> > > >
> > > > -Krutika
> > > >
> > >
> > >
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Spurious failures

2015-09-23 Thread Kotresh Hiremath Ravishankar
Hi Krutika,

Looks like the prerequisites for geo-replication to work have changed
on slave21.

Hi Michael,

Could you please check following settings are made in all linux regression 
machines?
Or provide me with root password so that I can verify.

1. Setup Passwordless SSH for the root user:
 
2. Add below line in /root/.bashrc. This is required as geo-rep does "gluster 
--version" via ssh
   and it can't find the gluster PATH via ssh.
 export PATH=$PATH:/build/install/sbin:/build/install/bin

Once above settings are done, the following script should output proper version.

---
#!/bin/bash

function SSHM()
{
ssh -q \
-oPasswordAuthentication=no \
-oStrictHostKeyChecking=no \
-oControlMaster=yes \
"$@";
}

function cmd_slave()
{
local cmd_line;
cmd_line=$(cat <<EOF
function do_verify() {
ver=\$(gluster --version | head -1 | cut -f2 -d " ");
echo \$ver;
};
source /etc/profile && do_verify;
EOF
);
echo $cmd_line;
}

HOST=$1
cmd_line=$(cmd_slave);
ver=`SSHM root@$HOST bash -c "'$cmd_line'"`;
echo $ver
-

I could verify for slave32.
[root@slave32 ~]# vi /tmp/gver.sh
[root@slave32 ~]# /tmp/gver.sh slave32
3.8dev

Please help me in verifying the same for all the linux regression machines.

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> To: "Krutika Dhananjay" <kdhan...@redhat.com>
> Cc: "Atin Mukherjee" <amukh...@redhat.com>, "Gluster Devel" 
> <gluster-devel@gluster.org>, "Gaurav Garg"
> <gg...@redhat.com>, "Aravinda" <avish...@redhat.com>
> Sent: Wednesday, September 23, 2015 12:31:12 PM
> Subject: Re: Spurious failures
> 
> Hi Krutika,
> 
> It's failing with
> 
> ++ gluster --mode=script --wignore volume geo-rep master
> slave21.cloud.gluster.org::slave create push-pem
> Gluster version mismatch between master and slave.
> 
> I will look into it.
> 
> Thanks and Regards,
> Kotresh H R
> 
> - Original Message -
> > From: "Krutika Dhananjay" <kdhan...@redhat.com>
> > To: "Atin Mukherjee" <amukh...@redhat.com>
> > Cc: "Gluster Devel" <gluster-devel@gluster.org>, "Gaurav Garg"
> > <gg...@redhat.com>, "Aravinda" <avish...@redhat.com>,
> > "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
> > Sent: Tuesday, September 22, 2015 9:03:44 PM
> > Subject: Re: Spurious failures
> > 
> > Ah! Sorry. I didn't read that line. :)
> > 
> > Just figured even ./tests/geo-rep/georep-basic-dr-rsync.t is added to bad
> > tests list.
> > 
> > So it's just /tests/geo-rep/georep-basic-dr-tarssh.t for now.
> > 
> > Thanks Atin!
> > 
> > -Krutika
> > 
> > - Original Message -
> > 
> > > From: "Atin Mukherjee" <amukh...@redhat.com>
> > > To: "Krutika Dhananjay" <kdhan...@redhat.com>
> > > Cc: "Gluster Devel" <gluster-devel@gluster.org>, "Gaurav Garg"
> > > <gg...@redhat.com>, "Aravinda" <avish...@redhat.com>, "Kotresh Hiremath
> > > Ravishankar" <khire...@redhat.com>
> > > Sent: Tuesday, September 22, 2015 8:51:22 PM
> > > Subject: Re: Spurious failures
> > 
> > > ./tests/bugs/glusterd/bug-1238706-daemons-stop-on-peer-cleanup.t (Wstat:
> > > 0 Tests: 8 Failed: 2)
> > > Failed tests: 6, 8
> > > Files=1, Tests=8, 48 wallclock secs ( 0.01 usr 0.01 sys + 0.88 cusr
> > > 0.56 csys = 1.46 CPU)
> > > Result: FAIL
> > > ./tests/bugs/glusterd/bug-1238706-daemons-stop-on-peer-cleanup.t: bad
> > > status 1
> > > *Ignoring failure from known-bad test
> > > ./tests/bugs/glusterd/bug-1238706-daemons-stop-on-peer-cleanup.t*
> > > [11:24:16] ./tests/bugs/glusterd/bug-1242543-replace-brick.t .. ok
> > > 17587 ms
> > > [11:24:16]
> > > All tests successful
> > 
> > > On 09/22/2015 08:46 PM, Krutika Dhananjay wrote:
> > > > https://build.gluster.org/job/rackspace-regression-2GB-triggered/14421/consoleFull
> > > >
> > > > Ctrl + f 'not ok'.
> > > >
> > > > -Krutika
> > > >
> > > > 
> > > >
> > > > *From: *"Atin Mukherjee" <amukh...@redhat.com>
> > > > *To: *"Krutika Dhananjay" <kdhan...@redhat.com>, "Gluster Devel"
> > > > <gluster-devel@gluster.org>
> > > > *Cc: *"Gaurav Garg" <gg...@redhat.com>, "Aravinda"
> > > > <avish...@redhat.com>, "Kotresh Hiremath Ravishankar"
> > > > <khire...@redhat.com>
> > > > *Sent: *Tuesday, September 22, 2015 8:39:56 PM
> > > > *Subject: *Re: Spurious failures
> > > >
> > > > Krutika,
> > > >
> > > > ./tests/bugs/glusterd/bug-1238706-daemon

[Gluster-devel] Geo-rep: Solving changelog ordering problem!

2015-09-03 Thread Kotresh Hiremath Ravishankar
Hi DHT Team and Others,

Changelog is a server-side translator that sits above POSIX and records FOPs.
Hence, the order of operations is preserved only within a brick; the order
of operations is lost across bricks.

e.g., (f1 hashes to brick1 and f2 to brick2)

      brick1                          brick2
      CREATE f1
      RENAME f1, f2
      ---- re-balance happens (very common with Tiering in place) ---->
                                      RENAME f2, f3
                                      DATA f3

The moment re-balance happens, the changelog records related to the same entry are
distributed across bricks, and since geo-rep syncs these changes independently, it
is well possible that it processes them in the wrong order and ends up in an
inconsistent state on the slave.

SOLUTION APPROACHES:

1. Capture re-balance traffic as well and work out all combinations of FOPs to end
   up in the correct state. Though we started thinking along these lines, one or the
   other corner case always exists and we still end up in out-of-order syncing.

2. The changes related to an 'entry' (file) should always be captured on the first
   brick where it was recorded initially, no matter where the file moves because of
   re-balance. This retains the ordering for an entry implicitly, and yet geo-rep can
   sync in a distributed manner from each brick, keeping the performance up.

   DHT needs to maintain, for each entry, the state of where it was first cached (to
   be precise, which brick its operations get recorded in by changelog) and always
   notify changelog of the FOP. A rough sketch of this bookkeeping is given below.

   I think if we can achieve the second solution, it would solve geo-rep's
   out-of-order syncing problem for ever.
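
   Sketch (illustration only; the map, the FOP representation and the notification
   path are assumptions for the example, not existing DHT or changelog code):

---
# Pin an entry's changelog records to the brick where it was first recorded,
# so per-entry ordering is preserved on that one brick.
class ChangelogRouter(object):
    def __init__(self):
        self.first_brick = {}   # gfid -> brick that first recorded the entry

    def record(self, gfid, fop_type, current_brick, brick_changelogs):
        brick = self.first_brick.setdefault(gfid, current_brick)
        brick_changelogs[brick].append((gfid, fop_type))

router = ChangelogRouter()
logs = {"brick1": [], "brick2": []}
router.record("f1-gfid", "CREATE", "brick1", logs)
router.record("f1-gfid", "RENAME f1,f2", "brick1", logs)
# re-balance moves the entry to brick2, but its records still land on brick1
router.record("f1-gfid", "RENAME f2,f3", "brick2", logs)
router.record("f1-gfid", "DATA", "brick2", logs)
print(logs)   # all records for the entry are on brick1, in order
---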

   Let me know your comments and suggestions on this!

 
  

Thanks and Regards,
Kotresh H R

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Introducing georepsetup - Gluster Geo-replication Setup Tool

2015-09-02 Thread Kotresh Hiremath Ravishankar
Hi Aravinda,

I used it yesterday. It greatly simplifies the geo-rep setup.
It would be great if it were enhanced to troubleshoot what's
wrong in an already corrupted setup.

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Aravinda" 
> To: "Gluster Devel" , "gluster-users" 
> 
> Sent: Wednesday, September 2, 2015 11:25:15 PM
> Subject: [Gluster-devel] Introducing georepsetup - Gluster Geo-replication
> Setup Tool
> 
> Hi,
> 
> Created a CLI tool using Python to simplify the Geo-replication Setup
> process. This tool takes care of running gsec_create command,
> distributing the SSH keys from Master to all Slave nodes etc. All in
> one single command :)
> 
> Initial password less SSH login is not required, this tool prompts the
> Root's password during run. Will not store password!
> 
> 
> Wrote a blog post about the same.
> http://aravindavk.in/blog/introducing-georepsetup
> 
> Comments and Suggestions Welcome.
> 
> --
> regards
> Aravinda
> http://aravindavk.in
> 
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Geo-rep portability issues with NetBSD

2015-08-31 Thread Kotresh Hiremath Ravishankar
Hi Emmanuel,

Right now, I am just tweaking things to get past the obstacles and
uncover other issues. I have not fixed them in code; that takes
a while.

Well now I am stuck with the following.

geo-rep uses the libgfchangelog.so shared library.
When gluster is installed from source, all other shared libraries are loaded
fine, but libgfchangelog isn't.
LINUX:
   ldconfig /usr/local/lib

NETBSD:
   Since that didn't work, I did /usr/lib/libgfchangelog.so ->
   /usr/local/lib/libgfchangelog.so
Well, it is able to find the library, but the function call fails. I need to debug
the reason.
You can login into nbslave75 and check.

#gluster vol geo-rep master nbslave75.cloud.gluster.org::slave status
#gluster vol geo-rep master nbslave75.cloud.gluster.org::slave start

Logs:
/var/log/gluster/geo-replication/master/ssh%3A%2F%2Froot%4023.253.43.210%3Agluster%3A%2F%2F127.0.0.1%3Aslave.log
which says the library was not found. After playing around with symlinks, it did
work, but the function call failed with "name too long"; I am pretty sure it is
something else (Invalid Argument errors).
You can see the logs of libgfchangelog here:
ssh%3A%2F%2Froot%4023.253.43.210%3Agluster%3A%2F%2F127.0.0.1%3Aslave.%2Fd%2Fbackends%2Fmaster-changes.log
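
One quick way to see what the loader actually resolves, independent of the geo-rep
daemon, is a small ctypes check (a sketch; adjust the library name or symbol if your
build differs, gf_changelog_register is what geo-rep's python side looks up):

---
# Check whether the dynamic loader can find libgfchangelog and resolve a symbol.
import ctypes

try:
    lib = ctypes.CDLL("libgfchangelog.so")
    print("loaded:", lib)
    print("has gf_changelog_register:", hasattr(lib, "gf_changelog_register"))
except OSError as e:
    print("loader could not open the library:", e)
---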


Rsync version update:
Thanks for the update! But it still doesn't support the --xattrs and --acls
options.
Those can be disabled in the geo-rep config for now.

Let me know if you can find out something!
Thanks in Advance!

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Emmanuel Dreyfus" <m...@netbsd.org>
> To: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>, "Gluster Devel" 
> <gluster-devel@gluster.org>
> Cc: "Aravinda" <avish...@redhat.com>
> Sent: Monday, August 31, 2015 7:39:35 AM
> Subject: Re: Geo-rep  portability issues with NetBSD
> 
> Kotresh Hiremath Ravishankar <khire...@redhat.com> wrote:
> 
> > Please let me know your thoughts
> 
> We can disable them, OTOH you seem close to have resolved the 5 points.
> Do you need any help there?
> 
> --
> Emmanuel Dreyfus
> http://hcpnet.free.fr/pubz
> m...@netbsd.org
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Geo-rep portability issues with NetBSD

2015-08-28 Thread Kotresh Hiremath Ravishankar
Hi Emmanuel and others,

Geo-rep has a few issues that need to be addressed for it to work with NetBSD.
The following bug is raised to track the same.
https://bugzilla.redhat.com/show_bug.cgi?id=1257847

So till these issues are fixed, I think we should disable the tests only on
NetBSD and let them run on linux. Currently the geo-rep tests are
marked as bad tests.

Please let me know your thoughts

Thanks and Regards,
Kotresh H R

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] NetBSD regression failures

2015-08-19 Thread Kotresh Hiremath Ravishankar
Hi Emmanuel,

Thank you, but anyway the geo-rep testsuite is moved under bad-tests.
I am working on getting geo-rep to work on NetBSD. I am fixing issues one
by one to get there. I have a few questions.


1. geo-rep does a lazy umount of the gluster volume, which needs to be modified
   to use 'gf_umount_lazy' provided by libglusterfs, correct?

2. geo-rep uses lgetxattr; it is throwing 'undefined error'. I tried searching
   for a man page for lgetxattr on NetBSD but couldn't find one. Is there a known
   portability issue with it?


Thanks and Regards,
Kotresh H R

- Original Message -
 From: Emmanuel Dreyfus m...@netbsd.org
 To: Kotresh Hiremath Ravishankar khire...@redhat.com, Avra Sengupta 
 aseng...@redhat.com
 Cc: gluster-infra gluster-in...@gluster.org, Gluster Devel 
 gluster-devel@gluster.org
 Sent: Wednesday, August 19, 2015 12:28:18 PM
 Subject: Re: [Gluster-devel] NetBSD regression failures
 
 Kotresh Hiremath Ravishankar khire...@redhat.com wrote:
 
  Since the geo-rep regression tests are failing only in NetBSD, Is there
  a way we can mask it's run only in NetBSD and let it run in linux?
  I am working on geo-rep issues with NetBSD. Once these are fixed we can
  enable on NetBSD as well.
 
 Yes, I can wipe them from regression.sh before running the tests, like
 we do for tests/bugs (never ported), tests/basic/tier/tier.t  and
 tests/basic/ec (the two later used to pass but started exhibiting too
 much spurious failures).
 
 --
 Emmanuel Dreyfus
 http://hcpnet.free.fr/pubz
 m...@netbsd.org
 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] NetBSD regression failures

2015-08-19 Thread Kotresh Hiremath Ravishankar
Hi,

Since the geo-rep regression tests are failing only on NetBSD, is there
a way we can mask their run only on NetBSD and let them run on linux?
I am working on the geo-rep issues with NetBSD. Once these are fixed we can
enable them on NetBSD as well.

Thanks and Regards,
Kotresh H R

- Original Message -
 From: Kotresh Hiremath Ravishankar khire...@redhat.com
 To: Avra Sengupta aseng...@redhat.com
 Cc: gluster-infra gluster-in...@gluster.org, Gluster Devel 
 gluster-devel@gluster.org
 Sent: Tuesday, August 18, 2015 11:50:00 AM
 Subject: Re: [Gluster-devel] NetBSD regression failures
 
 Yes, it makes sense to move both geo-rep tests to bad tests for now, till
 the issue gets fixed on NetBSD. I am looking into the NetBSD failures.
 
 Thanks and Regards,
 Kotresh H R
 
 - Original Message -
  From: Avra Sengupta aseng...@redhat.com
  To: Atin Mukherjee amukh...@redhat.com, Gluster Devel
  gluster-devel@gluster.org, gluster-infra
  gluster-in...@gluster.org, Raghavendra Talur rta...@redhat.com
  Sent: Tuesday, August 18, 2015 11:02:08 AM
  Subject: Re: [Gluster-devel] NetBSD regression failures
  
  On 08/18/2015 09:25 AM, Atin Mukherjee wrote:
  
   On 08/17/2015 02:20 PM, Avra Sengupta wrote:
   That patch itself might not pass all regressions as it might fail at the
   geo-rep test. I have sent a patch (http://review.gluster.org/#/c/11934/)
   with both the tests being moved to bad test. Talur could you please
   abandon 11933.
   It seems like we need to move tests/geo-rep/georep-basic-dr-tarssh.t as
   well to the bad test?
  Yes looks like it. I will resend the patch with this change.
   Regards,
   Avra
  
   On 08/17/2015 02:12 PM, Atin Mukherjee wrote:
   tests/basic/mount-nfs-auth.t has been already been added to bad test by
   http://review.gluster.org/11933
  
   ~Atin
  
   On 08/17/2015 02:09 PM, Avra Sengupta wrote:
   Will send a patch moving ./tests/basic/mount-nfs-auth.t and
   ./tests/geo-rep/georep-basic-dr-rsync.t to bad test.
  
   Regards,
   Avra
  
   On 08/17/2015 12:45 PM, Avra Sengupta wrote:
   On 08/17/2015 12:29 PM, Vijaikumar M wrote:
   On Monday 17 August 2015 12:22 PM, Avra Sengupta wrote:
   Hi,
  
   The NetBSD regression tests are continuously failing with errors in
   the following tests:
  
   ./tests/basic/mount-nfs-auth.t
   ./tests/basic/quota-anon-fd-nfs.t
   quota-anon-fd-nfs.t is known issues with NFS client caching so it is
   marked as bad test, final test will be marked as success even if
   this
   test fails.
   Yes it seems ./tests/geo-rep/georep-basic-dr-rsync.t also fails in
   the runs where quota-anon-fd-nfs.t fails, and that marks the final
   tests as failure.
  
  
   Is there any recent change that is trigerring this behaviour. Also
   currently one machine is running NetBSD tests. Can someone with
   access to Jenkins, bring up a few more slaves to run NetBSD
   regressions in parallel.
  
   Regards,
   Avra
   ___
   Gluster-devel mailing list
   Gluster-devel@gluster.org
   http://www.gluster.org/mailman/listinfo/gluster-devel
   ___
   Gluster-devel mailing list
   Gluster-devel@gluster.org
   http://www.gluster.org/mailman/listinfo/gluster-devel
   ___
   Gluster-devel mailing list
   Gluster-devel@gluster.org
   http://www.gluster.org/mailman/listinfo/gluster-devel
  
  ___
  Gluster-devel mailing list
  Gluster-devel@gluster.org
  http://www.gluster.org/mailman/listinfo/gluster-devel
  
 ___
 Gluster-devel mailing list
 Gluster-devel@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-devel
 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] NetBSD regression failures

2015-08-18 Thread Kotresh Hiremath Ravishankar
Yes, it makes sense to move both geo-rep tests to bad tests for now, till
the issue gets fixed on NetBSD. I am looking into the NetBSD failures.

Thanks and Regards,
Kotresh H R

- Original Message -
 From: Avra Sengupta aseng...@redhat.com
 To: Atin Mukherjee amukh...@redhat.com, Gluster Devel 
 gluster-devel@gluster.org, gluster-infra
 gluster-in...@gluster.org, Raghavendra Talur rta...@redhat.com
 Sent: Tuesday, August 18, 2015 11:02:08 AM
 Subject: Re: [Gluster-devel] NetBSD regression failures
 
 On 08/18/2015 09:25 AM, Atin Mukherjee wrote:
 
  On 08/17/2015 02:20 PM, Avra Sengupta wrote:
  That patch itself might not pass all regressions as it might fail at the
  geo-rep test. I have sent a patch (http://review.gluster.org/#/c/11934/)
  with both the tests being moved to bad test. Talur could you please
  abandon 11933.
  It seems like we need to move tests/geo-rep/georep-basic-dr-tarssh.t as
  well to the bad test?
 Yes looks like it. I will resend the patch with this change.
  Regards,
  Avra
 
  On 08/17/2015 02:12 PM, Atin Mukherjee wrote:
  tests/basic/mount-nfs-auth.t has been already been added to bad test by
  http://review.gluster.org/11933
 
  ~Atin
 
  On 08/17/2015 02:09 PM, Avra Sengupta wrote:
  Will send a patch moving ./tests/basic/mount-nfs-auth.t and
  ./tests/geo-rep/georep-basic-dr-rsync.t to bad test.
 
  Regards,
  Avra
 
  On 08/17/2015 12:45 PM, Avra Sengupta wrote:
  On 08/17/2015 12:29 PM, Vijaikumar M wrote:
  On Monday 17 August 2015 12:22 PM, Avra Sengupta wrote:
  Hi,
 
  The NetBSD regression tests are continuously failing with errors in
  the following tests:
 
  ./tests/basic/mount-nfs-auth.t
  ./tests/basic/quota-anon-fd-nfs.t
  quota-anon-fd-nfs.t is known issues with NFS client caching so it is
  marked as bad test, final test will be marked as success even if this
  test fails.
  Yes it seems ./tests/geo-rep/georep-basic-dr-rsync.t also fails in
  the runs where quota-anon-fd-nfs.t fails, and that marks the final
  tests as failure.
 
 
  Is there any recent change that is trigerring this behaviour. Also
  currently one machine is running NetBSD tests. Can someone with
  access to Jenkins, bring up a few more slaves to run NetBSD
  regressions in parallel.
 
  Regards,
  Avra
  ___
  Gluster-devel mailing list
  Gluster-devel@gluster.org
  http://www.gluster.org/mailman/listinfo/gluster-devel
  ___
  Gluster-devel mailing list
  Gluster-devel@gluster.org
  http://www.gluster.org/mailman/listinfo/gluster-devel
  ___
  Gluster-devel mailing list
  Gluster-devel@gluster.org
  http://www.gluster.org/mailman/listinfo/gluster-devel
 
 ___
 Gluster-devel mailing list
 Gluster-devel@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-devel
 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] testcase ./tests/geo-rep/georep-basic-dr-rsync.t failure

2015-08-17 Thread Kotresh Hiremath Ravishankar
Thanks Emmanuel, I could not look into it as I was out of station.
I will debug it today.

Thanks and Regards,
Kotresh H R

- Original Message -
 From: Emmanuel Dreyfus m...@netbsd.org
 To: Kotresh Hiremath Ravishankar khire...@redhat.com
 Cc: Gluster Devel gluster-devel@gluster.org
 Sent: Friday, August 14, 2015 12:45:49 AM
 Subject: Re: [Gluster-devel] testcase ./tests/geo-rep/georep-basic-dr-rsync.t 
 failure
 
 Kotresh Hiremath Ravishankar khire...@redhat.com wrote:
 
  We need a netbsd machine off the ring to debug. Could you please provide
  one?
 
 nbslave75 is offline for you.
 
 
 --
 Emmanuel Dreyfus
 http://hcpnet.free.fr/pubz
 m...@netbsd.org
 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] testcase ./tests/geo-rep/georep-basic-dr-rsync.t failure

2015-08-12 Thread Kotresh Hiremath Ravishankar
Hi Emmanuel,

I checked the netbsd regression machines and found that they were already
configured with
PasswordLess SSH for root.

The issue was that Geo-rep runs 'gluster vol info' via ssh and it can't find the
gluster PATH via ssh.

I have fixed the above on the following four machines which are up, by adding
export PATH=$PATH:/build/install/sbin:/build/install/bin in ~/.kshrc and
similarly in other shells, as I didn't know the default shell used by the
regression run:

nbslave75
nbslave79
nbslave7g
nbslave7j

This hopefully should fix the geo-rep regression runs on the netbsd machines.
The same should be done for other netbsd slave machines when those are brought 
up.

Thanks and Regards,
Kotresh H R

- Original Message -
 From: Kotresh Hiremath Ravishankar khire...@redhat.com
 To: Susant Palai spa...@redhat.com, Emmanuel Dreyfus m...@netbsd.org
 Cc: Gluster Devel gluster-devel@gluster.org
 Sent: Wednesday, August 12, 2015 5:31:35 PM
 Subject: Re: [Gluster-devel] testcase ./tests/geo-rep/georep-basic-dr-rsync.t 
 failure
 
 Hi Susanth,
 
 It could be issue with PasswordLess SSH not being setup in NetBSD machines.
 
 Emmanuel,
 
 Could you please setup PasswordLess SSH in all NetBSD regression machines ?
 Otherwise, till then geo-rep testsuite should be skipped for now.
 
 Thanks and Regards,
 Kotresh H R
 
 - Original Message -
  From: Susant Palai spa...@redhat.com
  To: Gluster Devel gluster-devel@gluster.org
  Sent: Wednesday, August 12, 2015 3:56:31 PM
  Subject: [Gluster-devel] testcase ./tests/geo-rep/georep-basic-dr-rsync.t
  failure
  
  Hi,
 ./tests/geo-rep/georep-basic-dr-rsync.t fails in regression machine as
 well as in my local machine also. Requesting geo-rep team to look in to
 it.
  
  link:
  
  https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/9158/consoleFull
  
  Regards,
  Susant
  ___
  Gluster-devel mailing list
  Gluster-devel@gluster.org
  http://www.gluster.org/mailman/listinfo/gluster-devel
  
 ___
 Gluster-devel mailing list
 Gluster-devel@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-devel
 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] NetBSD regression tests not Initializing...

2015-07-05 Thread Kotresh Hiremath Ravishankar
Thanks Emmanuel.

Thanks and Regards,
Kotresh H R

- Original Message -
 From: Emmanuel Dreyfus m...@netbsd.org
 To: Kotresh Hiremath Ravishankar khire...@redhat.com, Gluster Devel 
 gluster-devel@gluster.org
 Sent: Sunday, July 5, 2015 12:52:23 AM
 Subject: Re: [Gluster-devel] NetBSD regression tests not Initializing...
 
 Kotresh Hiremath Ravishankar khire...@redhat.com wrote:
 
  Any help is appreciated.
 
 nbslave72 was sick indeed: it refused SSH connexions. I rebooted it and
 retiggered your change, but it went on another machine.
 
 --
 Emmanuel Dreyfus
 http://hcpnet.free.fr/pubz
 m...@netbsd.org
 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] NetBSD regression tests not Initializing...

2015-07-03 Thread Kotresh Hiremath Ravishankar
Hi

NetBSD regressions are not initializing because of the following error, consistently
across multiple re-triggers.
I see the same error for quite a few patches.

http://review.gluster.org/#/c/11443/
Building remotely on nbslave72.cloud.gluster.org (netbsd7_regression) in 
workspace /home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered
  git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
  git config remote.origin.url http://review.gluster.org/glusterfs.git # 
  timeout=10
Fetching upstream changes from http://review.gluster.org/glusterfs.git
  git --version # timeout=10
  git -c core.askpass=true fetch --tags --progress 
  http://review.gluster.org/glusterfs.git refs/changes/43/11443/9
ERROR: Error fetching remote repo 'origin'
ERROR: Error fetching remote repo 'origin'
Finished: FAILURE

Any help is appreciated.

Thanks and Regards,
Kotresh H R

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Regression Failure: ./tests/basic/quota.t

2015-07-02 Thread Kotresh Hiremath Ravishankar
Comments inline.

Thanks and Regards,
Kotresh H R

- Original Message -
 From: Susant Palai spa...@redhat.com
 To: Sachin Pandit span...@redhat.com
 Cc: Kotresh Hiremath Ravishankar khire...@redhat.com, Gluster Devel 
 gluster-devel@gluster.org
 Sent: Thursday, July 2, 2015 12:35:08 PM
 Subject: Re: [Gluster-devel] Regression Failure: ./tests/basic/quota.t
 
 Comments inline.
 
 - Original Message -
  From: Sachin Pandit span...@redhat.com
  To: Kotresh Hiremath Ravishankar khire...@redhat.com
  Cc: Gluster Devel gluster-devel@gluster.org
  Sent: Thursday, July 2, 2015 12:21:44 PM
  Subject: Re: [Gluster-devel] Regression Failure: ./tests/basic/quota.t
  
  - Original Message -
   From: Vijaikumar M vmall...@redhat.com
   To: Kotresh Hiremath Ravishankar khire...@redhat.com, Gluster Devel
   gluster-devel@gluster.org
   Cc: Sachin Pandit span...@redhat.com
    Sent: Thursday, July 2, 2015 12:01:03 PM
   Subject: Re: Regression Failure: ./tests/basic/quota.t
   
   We look into this issue
   
   Thanks,
   Vijay
   
   On Thursday 02 July 2015 11:46 AM, Kotresh Hiremath Ravishankar wrote:
Hi,
   
I see quota.t regression failure for the following. The changes are
related
to
example programs in libgfchangelog.
   
http://build.gluster.org/job/rackspace-regression-2GB-triggered/11785/consoleFull
   
Could someone from quota team, take a look at it.
  
  Hi,
  
  I had a quick look at this. It looks like the following test case failed
  
  TEST $CLI volume add-brick $V0 $H0:$B0/brick{3,4}
  EXPECT_WITHIN $REBALANCE_TIMEOUT 0 rebalance_completed
  
  
  I looked at the logs too, and found out the following errors
  
  patchy-rebalance.log:[2015-07-01 09:27:23.040756] E [MSGID: 109026]
  [dht-rebalance.c:2689:gf_defrag_start_crawl] 0-patchy-dht: fix layout on /
  failed
  build-install-etc-glusterfs-glusterd.vol.log:[2015-07-01 09:27:23.040998] E
  [MSGID: 106224]
  [glusterd-rebalance.c:960:glusterd_defrag_event_notify_handle]
  0-management:
  Failed to update status
  StartMigrationDuringRebalanceTest-rebalance.log:[2015-06-19
  14:34:47.557887]
  E [rpc-clnt.c:362:saved_frames_unwind] (--
  /build/install/lib/libglusterfs.so.0(_gf_log_callingfn+0x240)[0x7fc882d04d5a]
  (--
  /build/install/lib/libgfrpc.so.0(saved_frames_unwind+0x212)[0x7fc882ace086]
  (--
  /build/install/lib/libgfrpc.so.0(saved_frames_destroy+0x1f)[0x7fc882ace183]
  (--
  /build/install/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x11e)[0x7fc882ace615]
  (--
  /build/install/lib/libgfrpc.so.0(rpc_clnt_notify+0x147)[0x7fc882acf00f]
  ) 0-StartMigrationDuringRebalanceTest-client-0: forced unwinding frame
  type(GlusterFS 3.3) op(LOOKUP(27)) called at 2015-06-19 14:34:47.554862
  (xid=0xc)
  StartMigrationDuringRebalanceTest-rebalance.log:[2015-06-19
  14:34:47.561191]
  E [MSGID: 114031] [client-rpc-fops.c:1623:client3_3_inodelk_cbk]
  0-StartMigrationDuringRebalanceTest-client-0: remote operation failed:
  Transport endpoint is not connected [Transport endpoint is not connected]
  StartMigrationDuringRebalanceTest-rebalance.log:[2015-06-19
  14:34:47.561417]
  E [socket.c:2332:socket_connect_finish]
  0-StartMigrationDuringRebalanceTest-client-0: connection to
  23.253.62.104:24007 failed (Connection refused)
  StartMigrationDuringRebalanceTest-rebalance.log:[2015-06-19
  14:34:47.561707]
  E [dht-common.c:2643:dht_find_local_subvol_cbk]
  0-StartMigrationDuringRebalanceTest-dht: getxattr err (Transport endpoint
  is
  not connected) for dir
  
 Seems like a network partition. Rebalance fails if it receives ENOTCONN
 on its child.

Is this intended to happen on regression machines?
 
  
  Any help regarding this or more information on this would be much
  appreciated.
  
  Thanks,
  Sachin Pandit.
  
  
   
Thanks and Regards,
Kotresh H R
   
   
   
  ___
  Gluster-devel mailing list
  Gluster-devel@gluster.org
  http://www.gluster.org/mailman/listinfo/gluster-devel
  
 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Build and Regression failure in master branch!

2015-06-28 Thread Kotresh Hiremath Ravishankar
Hi,

Sorry, rpmbuild is failing in the local setup as well.
Reverting the following commit helped me in the local setup.

commit 3741804bec65a33d400af38dcc80700c8a668b81
Author: arao a...@redhat.com
Date:   Mon Jun 22 11:10:05 2015 +0530

Logging: Porting the performance translator
 logs to new logging framework.


Thanks and Regards,
Kotresh H R

- Original Message -
 From: Kotresh Hiremath Ravishankar khire...@redhat.com
 To: Gluster Devel gluster-devel@gluster.org
 Sent: Sunday, June 28, 2015 12:01:22 PM
 Subject: [Gluster-devel] Build and Regression failure in master branch!
 
 Hi,
 
 rpm build is consistently failing for the patch
 (http://review.gluster.org/#/c/11443/)
 with following error where as it is passing in local setup.
 
 ...
 Making all in performance
 Making all in write-behind
 Making all in src
   CC   write-behind.lo
 write-behind.c:24:35: fatal error: write-behind-messages.h: No such file or
 directory
  #include write-behind-messages.h
^
 compilation terminated.
 make[5]: *** [write-behind.lo] Error 1
 make[4]: *** [all-recursive] Error 1
 make[3]: *** [all-recursive] Error 1
 make[2]: *** [all-recursive] Error 1
 make[1]: *** [all-recursive] Error 1
 make: *** [all] Error 2
 RPM build errors:
 error: Bad exit status from /var/tmp/rpm-tmp.8QmLg0 (%build)
 Bad exit status from /var/tmp/rpm-tmp.8QmLg0 (%build)
 
 
 
 Regression Failures: ./tests/basic/afr/client-side-heal.t
 
 Above test case is consistently failing for the patch.
 http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/7596/consoleFull
 http://build.gluster.org/job/rackspace-regression-2GB-triggered/11641/consoleFull
 
 Are there known issues?
 
 
 Thanks and Regards,
 Kotresh H R
 
 ___
 Gluster-devel mailing list
 Gluster-devel@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-devel
 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Build and Regression failure in master branch!

2015-06-28 Thread Kotresh Hiremath Ravishankar
Yes, Atin. You are right, the header files are missing in the Makefiles.
The build fails because of the commit 3741804bec65a33d400af38dcc80700c8a668b81.

I have sent the patch for the same.
http://review.gluster.org/#/c/11451/

Please someone review and merge it.


Thanks and Regards,
Kotresh H R

- Original Message -
 From: Atin Mukherjee atin.mukherje...@gmail.com
 To: Kotresh Hiremath Ravishankar khire...@redhat.com
 Cc: Gluster Devel gluster-devel@gluster.org
 Sent: Sunday, June 28, 2015 12:56:21 PM
 Subject: Re: [Gluster-devel] Build and Regression failure in master branch!
 
 -Atin
 Sent from one plus one
 On Jun 28, 2015 12:01 PM, Kotresh Hiremath Ravishankar 
 khire...@redhat.com wrote:
 
  Hi,
 
  rpm build is consistently failing for the patch (
 http://review.gluster.org/#/c/11443/)
  with following error where as it is passing in local setup.
 
  ...
  Making all in performance
  Making all in write-behind
  Making all in src
CC   write-behind.lo
  write-behind.c:24:35: fatal error: write-behind-messages.h: No such file
 or directory
   #include write-behind-messages.h
 ^
  compilation terminated.
  make[5]: *** [write-behind.lo] Error 1
  make[4]: *** [all-recursive] Error 1
  make[3]: *** [all-recursive] Error 1
  make[2]: *** [all-recursive] Error 1
  make[1]: *** [all-recursive] Error 1
  make: *** [all] Error 2
  RPM build errors:
  error: Bad exit status from /var/tmp/rpm-tmp.8QmLg0 (%build)
  Bad exit status from /var/tmp/rpm-tmp.8QmLg0 (%build)
 This means the entry of this file is missing in the respective makefile.
  
 
 
  Regression Failures: ./tests/basic/afr/client-side-heal.t
 
  Above test case is consistently failing for the patch.
 
 http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/7596/consoleFull
 
 http://build.gluster.org/job/rackspace-regression-2GB-triggered/11641/consoleFull
 
  Are there known issues?
 
 
  Thanks and Regards,
  Kotresh H R
 
  ___
  Gluster-devel mailing list
  Gluster-devel@gluster.org
  http://www.gluster.org/mailman/listinfo/gluster-devel
 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Regresssion Failure (3.7 branch): afr-quota-xattr-mdata-heal.t

2015-06-25 Thread Kotresh Hiremath Ravishankar
Ok, Thanks. I have re-triggered it.

Thanks and Regards,
Kotresh H R

- Original Message -
 From: Pranith Kumar Karampuri pkara...@redhat.com
 To: Kotresh Hiremath Ravishankar khire...@redhat.com, Gluster Devel 
 gluster-devel@gluster.org
 Sent: Thursday, June 25, 2015 11:55:22 AM
 Subject: Re: Regresssion Failure (3.7 branch): afr-quota-xattr-mdata-heal.t
 
 This is a known spurious failure.
 
 Pranith
 On 06/25/2015 11:14 AM, Kotresh Hiremath Ravishankar wrote:
  Hi,
 
  I see the above test case failing for my patch which is not related.
  Could some one from AFR team look into it?
  http://build.gluster.org/job/rackspace-regression-2GB-triggered/11332/consoleFull
 
 
  Thanks and Regards,
  Kotresh H R
 
 
 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Regression Failure: bug-1134822-read-only-default-in-graph.t

2015-06-24 Thread Kotresh Hiremath Ravishankar
Hi All,

The above mentioned testcase failed for me which is not related to the patch.
Could someone look into it?

http://build.gluster.org/job/rackspace-regression-2GB-triggered/11267/consoleFull

Thanks and Regards,
Kotresh H R

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Regresssion Failure (3.7 branch): afr-quota-xattr-mdata-heal.t

2015-06-24 Thread Kotresh Hiremath Ravishankar
Hi,

I see the above test case failing for my patch which is not related.
Could some one from AFR team look into it?
http://build.gluster.org/job/rackspace-regression-2GB-triggered/11332/consoleFull


Thanks and Regards,
Kotresh H R

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

