Re: [Gluster-users] Problem with glusterd locks on gluster 3.6.1

2016-06-17 Thread B.K.Raghuram
No, we had not. As the developer is no longer around and I have not looked
at the code, I was not sure if I should. We have been using the code
changes against 3.6.1 and they work fine.

On Fri, Jun 17, 2016 at 5:19 PM, Joe Julian  wrote:

> Have you offered those patches upstream?
>
> On June 16, 2016 1:02:24 AM PDT, "B.K.Raghuram"  wrote:
>
>> Thanks a lot Atin,
>>
>> The problem is that we are using a forked version of 3.6.1 which has been
>> modified to work with ZFS (for snapshots) but we do not have the resources
>> to port that over to the later versions of gluster.
>>
>> Would you know of anyone who would be willing to take this on?!
>>
>> Regards,
>> -Ram
>>
>> On Thu, Jun 16, 2016 at 11:02 AM, Atin Mukherjee 
>> wrote:
>>
>>>
>>>
>>> On 06/16/2016 10:49 AM, B.K.Raghuram wrote:
>>> >
>>> >
>>> > On Wed, Jun 15, 2016 at 5:01 PM, Atin Mukherjee wrote:
>>> >
>>> >
>>> >
>>> > On 06/15/2016 04:24 PM, B.K.Raghuram wrote:
>>> > > Hi,
>>> > >
>>> > > We're using gluster 3.6.1 and we periodically find that gluster
>>> commands
>>> > > fail saying the it could not get the lock on one of the brick
>>> machines.
>>> > > The logs on that machine then say something like :
>>> > >
>>> > > [2016-06-15 08:17:03.076119] E
>>> > > [glusterd-op-sm.c:3058:glusterd_op_ac_lock] 0-management: Unable
>>> to
>>> > > acquire lock for vol2
>>> >
>>> > This is a possible case if concurrent volume operations are run.
>>> Do you
>>> > have any script which checks for volume status on an interval from
>>> all
>>> > the nodes, if so then this is an expected behavior.
>>> >
>>> >
>>> > Yes, I do have a couple of scripts that check on volume and quota
>>> > status.. Given this, I do get a "Another transaction is in progress.."
>>> > message which is ok. The problem is that sometimes I get the volume
>>> lock
>>> > held message which never goes away. This sometimes results in glusterd
>>> > consuming a lot of memory and CPU and the problem can only be fixed
>>> with
>>> > a reboot. The log files are huge so I'm not sure if its ok to attach
>>> > them to an email.
>>>
>>> Ok, so this is known. We have fixed lots of stale lock issues in 3.7
>>> branch and some of them if not all were also backported to 3.6 branch.
>>> The issue is you are using 3.6.1 which is quite old. If you can upgrade
>>> to latest versions of 3.7 or at worst of 3.6 I am confident that this
>>> will go away.
>>>
>>> ~Atin
>>> >
>>> > >
>>> > > After sometime, glusterd then seems to give up and die..
>>> >
>>> > Do you mean glusterd shuts down or segfaults, if so I am more
>>> interested
>>> > in analyzing this part. Could you provide us the glusterd log,
>>> > cmd_history log file along with core (in case of SEGV) from all the
>>> > nodes for the further analysis?
>>> >
>>> >
>>> > There is no segfault. glusterd just shuts down. As I said above,
>>> > sometimes this happens and sometimes it just continues to hog a lot of
>>> > memory and CPU..
>>> >
>>> >
>>> > >
>>> > > Interestingly, I also find the following line in the beginning of
>>> > > etc-glusterfs-glusterd.vol.log and I dont know if this has any
>>> > > significance to the issue :
>>> > >
>>> > > [2016-06-14 06:48:57.282290] I
>>> > > [glusterd-store.c:2063:glusterd_restore_op_version] 0-management:
>>> > > Detected new install. Setting op-version to maximum : 30600
>>> > >
>>> >
>>> >
>>> > What does this line signify?
>>>
>>
>> --
>>
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>>
>>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Problem with glusterd locks on gluster 3.6.1

2016-06-17 Thread Joe Julian
Have you offered those patches upstream?

On June 16, 2016 1:02:24 AM PDT, "B.K.Raghuram"  wrote:
>Thanks a lot Atin,
>
>The problem is that we are using a forked version of 3.6.1 which has
>been
>modified to work with ZFS (for snapshots) but we do not have the
>resources
>to port that over to the later versions of gluster.
>
>Would you know of anyone who would be willing to take this on?!
>
>Regards,
>-Ram
>
>On Thu, Jun 16, 2016 at 11:02 AM, Atin Mukherjee 
>wrote:
>
>>
>>
>> On 06/16/2016 10:49 AM, B.K.Raghuram wrote:
>> >
>> >
>> > On Wed, Jun 15, 2016 at 5:01 PM, Atin Mukherjee wrote:
>> >
>> >
>> >
>> > On 06/15/2016 04:24 PM, B.K.Raghuram wrote:
>> > > Hi,
>> > >
>> > > We're using gluster 3.6.1 and we periodically find that
>gluster
>> commands
>> > > fail saying the it could not get the lock on one of the brick
>> machines.
>> > > The logs on that machine then say something like :
>> > >
>> > > [2016-06-15 08:17:03.076119] E
>> > > [glusterd-op-sm.c:3058:glusterd_op_ac_lock] 0-management:
>Unable to
>> > > acquire lock for vol2
>> >
>> > This is a possible case if concurrent volume operations are
>run. Do
>> you
>> > have any script which checks for volume status on an interval
>from
>> all
>> > the nodes, if so then this is an expected behavior.
>> >
>> >
>> > Yes, I do have a couple of scripts that check on volume and quota
>> > status.. Given this, I do get a "Another transaction is in
>progress.."
>> > message which is ok. The problem is that sometimes I get the volume
>lock
>> > held message which never goes away. This sometimes results in
>glusterd
>> > consuming a lot of memory and CPU and the problem can only be fixed
>with
>> > a reboot. The log files are huge so I'm not sure if its ok to
>attach
>> > them to an email.
>>
>> Ok, so this is known. We have fixed lots of stale lock issues in 3.7
>> branch and some of them if not all were also backported to 3.6
>branch.
>> The issue is you are using 3.6.1 which is quite old. If you can
>upgrade
>> to latest versions of 3.7 or at worst of 3.6 I am confident that this
>> will go away.
>>
>> ~Atin
>> >
>> > >
>> > > After sometime, glusterd then seems to give up and die..
>> >
>> > Do you mean glusterd shuts down or segfaults, if so I am more
>> interested
>> > in analyzing this part. Could you provide us the glusterd log,
>> > cmd_history log file along with core (in case of SEGV) from all
>the
>> > nodes for the further analysis?
>> >
>> >
>> > There is no segfault. glusterd just shuts down. As I said above,
>> > sometimes this happens and sometimes it just continues to hog a lot
>of
>> > memory and CPU..
>> >
>> >
>> > >
>> > > Interestingly, I also find the following line in the
>beginning of
>> > > etc-glusterfs-glusterd.vol.log and I dont know if this has
>any
>> > > significance to the issue :
>> > >
>> > > [2016-06-14 06:48:57.282290] I
>> > > [glusterd-store.c:2063:glusterd_restore_op_version]
>0-management:
>> > > Detected new install. Setting op-version to maximum : 30600
>> > >
>> >
>> >
>> > What does this line signify?
>>
>
>
>
>
>___
>Gluster-users mailing list
>Gluster-users@gluster.org
>http://www.gluster.org/mailman/listinfo/gluster-users

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Problem with glusterd locks on gluster 3.6.1

2016-06-17 Thread B.K.Raghuram
I'd tried that some time back but ran into some merge conflicts and was not
sure whom to turn to :) May I come to you for help with that?!

On Fri, Jun 17, 2016 at 3:29 PM, Atin Mukherjee  wrote:

>
>
> On 06/17/2016 03:21 PM, B.K.Raghuram wrote:
> > Thanks a ton Atin. That fixed cherry-pick. Will build it and let you
> > know how it goes. Does it make sense to try and merge the whole upstream
> > glusterfs repo for the 3.6 branch in order to get all the other bug
> > fixes? That may bring in many more merge conflicts though..
>
> Yup, I'd not recommend that. Applying your local changes on the latest
> version is a much easier option :)
>
> >
> > On Fri, Jun 17, 2016 at 3:07 PM, Atin Mukherjee wrote:
> >
> > I've resolved the merge conflicts and files are attached. Copy these
> > files and follow the instructions from the cherry pick command which
> > failed.
> >
> > ~Atin
> >
> > On 06/17/2016 02:55 PM, B.K.Raghuram wrote:
> > >
> > > Thanks Atin, I had three merge conflicts in the third patch.. I've
> > > attached the files with the conflicts. Would any of the intervening
> > > commits be needed as well?
> > >
> > > The conflicts were in :
> > >
> > > both modified:  libglusterfs/src/mem-types.h
> > > both modified:  xlators/mgmt/glusterd/src/glusterd-utils.c
> > > both modified:  xlators/mgmt/glusterd/src/glusterd-utils.h
> > >
> > >
> > > On Fri, Jun 17, 2016 at 2:17 PM, Atin Mukherjee wrote:
> > >
> > >
> > >
> > > On 06/17/2016 12:44 PM, B.K.Raghuram wrote:
> > > > Thanks Atin.. I'm not familiar with pulling patches the
> review system
> > > > but will try:)
> > >
> > > It's not that difficult. Open the gerrit review link, go to
> the download
> > > drop box at the top right corner, click on it and then you
> will see a
> > > cherry pick option, copy that content and paste it the source
> code repo
> > > you host. If there are no merge conflicts, it should auto
> apply,
> > > otherwise you'd need to fix them manually.
> > >
> > > HTH.
> > > Atin
> > >
> > > >
> > > > On Fri, Jun 17, 2016 at 12:35 PM, Atin Mukherjee wrote:
> > > >
> > > >
> > > > On 06/16/2016 06:17 PM, Atin Mukherjee wrote:
> > > > >
> > > > >
> > > > > On 06/16/2016 01:32 PM, B.K.Raghuram wrote:
> > > > >> Thanks a lot Atin,
> > > > >>
> > > > >> The problem is that we are using a forked version of
> 3.6.1 which has
> > > > >> been modified to work with ZFS (for snapshots) but we
> do not have the
> > > > >> resources to port that over to the later versions of
> gluster.
> > > > >>
> > > > >> Would you know of anyone who would be willing to take
> this on?!
> > > > >
> > > > > If you can cherry pick the patches and apply them on
> your source and
> > > > > rebuild it, I can point the patches to you, but you'd
> need to give a
> > > > > day's time to me as I have some other items to finish
> from my plate.
> > > >
> > > >
> > > > Here is the list of the patches need to be applied on
> the following
> > > > order:
> > > >
> > > > http://review.gluster.org/9328
> > > > http://review.gluster.org/9393
> > > > http://review.gluster.org/10023
> > > >
> > > > >
> > > > > ~Atin
> > > > >>
> > > > >> Regards,
> > > > >> -Ram
> > > > >>
> > > > >> On Thu, Jun 16, 2016 at 11:02 AM, Atin Mukherjee wrote:
> > > > >>
> > > > >>
> > > > >>
> > > > >> On 06/16/2016 10:49 AM, B.K.Raghuram wrote:
> > > > >> >
> > > > >> >
> > > 

Re: [Gluster-users] Problem with glusterd locks on gluster 3.6.1

2016-06-17 Thread Atin Mukherjee


On 06/17/2016 03:21 PM, B.K.Raghuram wrote:
> Thanks a ton Atin. That fixed cherry-pick. Will build it and let you
> know how it goes. Does it make sense to try and merge the whole upstream
> glusterfs repo for the 3.6 branch in order to get all the other bug
> fixes? That may bring in many more merge conflicts though..

Yup, I'd not recommend that. Applying your local changes on the latest
version is a much easier option :)

> 
> On Fri, Jun 17, 2016 at 3:07 PM, Atin Mukherjee wrote:
> 
> I've resolved the merge conflicts and files are attached. Copy these
> files and follow the instructions from the cherry pick command which
> failed.
> 
> ~Atin
> 
> On 06/17/2016 02:55 PM, B.K.Raghuram wrote:
> >
> > Thanks Atin, I had three merge conflicts in the third patch.. I've
> > attached the files with the conflicts. Would any of the intervening
> > commits be needed as well?
> >
> > The conflicts were in :
> >
> > both modified:  libglusterfs/src/mem-types.h
> > both modified:  xlators/mgmt/glusterd/src/glusterd-utils.c
> > both modified:  xlators/mgmt/glusterd/src/glusterd-utils.h
> >
> >
> > On Fri, Jun 17, 2016 at 2:17 PM, Atin Mukherjee wrote:
> >
> >
> >
> > On 06/17/2016 12:44 PM, B.K.Raghuram wrote:
> > > Thanks Atin.. I'm not familiar with pulling patches the review 
> system
> > > but will try:)
> >
> > It's not that difficult. Open the gerrit review link, go to the 
> download
> > drop box at the top right corner, click on it and then you will see 
> a
> > cherry pick option, copy that content and paste it the source code 
> repo
> > you host. If there are no merge conflicts, it should auto apply,
> > otherwise you'd need to fix them manually.
> >
> > HTH.
> > Atin
> >
> > >
> > > On Fri, Jun 17, 2016 at 12:35 PM, Atin Mukherjee wrote:
> > >
> > >
> > > On 06/16/2016 06:17 PM, Atin Mukherjee wrote:
> > > >
> > > >
> > > > On 06/16/2016 01:32 PM, B.K.Raghuram wrote:
> > > >> Thanks a lot Atin,
> > > >>
> > > >> The problem is that we are using a forked version of 3.6.1 
> which has
> > > >> been modified to work with ZFS (for snapshots) but we do 
> not have the
> > > >> resources to port that over to the later versions of 
> gluster.
> > > >>
> > > >> Would you know of anyone who would be willing to take this 
> on?!
> > > >
> > > > If you can cherry pick the patches and apply them on your 
> source and
> > > > rebuild it, I can point the patches to you, but you'd need 
> to give a
> > > > day's time to me as I have some other items to finish from 
> my plate.
> > >
> > >
> > > Here is the list of the patches need to be applied on the 
> following
> > > order:
> > >
> > > http://review.gluster.org/9328
> > > http://review.gluster.org/9393
> > > http://review.gluster.org/10023
> > >
> > > >
> > > > ~Atin
> > > >>
> > > >> Regards,
> > > >> -Ram
> > > >>
> > > >> On Thu, Jun 16, 2016 at 11:02 AM, Atin Mukherjee wrote:
> > > >>
> > > >>
> > > >>
> > > >> On 06/16/2016 10:49 AM, B.K.Raghuram wrote:
> > > >> >
> > > >> >
> > > >> > On Wed, Jun 15, 2016 at 5:01 PM, Atin Mukherjee wrote:
> 

Re: [Gluster-users] Problem with glusterd locks on gluster 3.6.1

2016-06-17 Thread B.K.Raghuram
Thanks a ton, Atin. That fixed the cherry-pick. I will build it and let you know
how it goes. Does it make sense to try to merge the whole upstream
glusterfs repo for the 3.6 branch in order to get all the other bug fixes?
That may bring in many more merge conflicts, though..

On Fri, Jun 17, 2016 at 3:07 PM, Atin Mukherjee  wrote:

> I've resolved the merge conflicts and files are attached. Copy these
> files and follow the instructions from the cherry pick command which
> failed.
>
> ~Atin
>
> On 06/17/2016 02:55 PM, B.K.Raghuram wrote:
> >
> > Thanks Atin, I had three merge conflicts in the third patch.. I've
> > attached the files with the conflicts. Would any of the intervening
> > commits be needed as well?
> >
> > The conflicts were in :
> >
> > both modified:  libglusterfs/src/mem-types.h
> > both modified:  xlators/mgmt/glusterd/src/glusterd-utils.c
> > both modified:  xlators/mgmt/glusterd/src/glusterd-utils.h
> >
> >
> > On Fri, Jun 17, 2016 at 2:17 PM, Atin Mukherjee wrote:
> >
> >
> >
> > On 06/17/2016 12:44 PM, B.K.Raghuram wrote:
> > > Thanks Atin.. I'm not familiar with pulling patches the review
> system
> > > but will try:)
> >
> > It's not that difficult. Open the gerrit review link, go to the
> download
> > drop box at the top right corner, click on it and then you will see a
> > cherry pick option, copy that content and paste it the source code
> repo
> > you host. If there are no merge conflicts, it should auto apply,
> > otherwise you'd need to fix them manually.
> >
> > HTH.
> > Atin
> >
> > >
> > > On Fri, Jun 17, 2016 at 12:35 PM, Atin Mukherjee wrote:
> > >
> > >
> > >
> > > On 06/16/2016 06:17 PM, Atin Mukherjee wrote:
> > > >
> > > >
> > > > On 06/16/2016 01:32 PM, B.K.Raghuram wrote:
> > > >> Thanks a lot Atin,
> > > >>
> > > >> The problem is that we are using a forked version of 3.6.1
> which has
> > > >> been modified to work with ZFS (for snapshots) but we do
> not have the
> > > >> resources to port that over to the later versions of
> gluster.
> > > >>
> > > >> Would you know of anyone who would be willing to take this
> on?!
> > > >
> > > > If you can cherry pick the patches and apply them on your
> source and
> > > > rebuild it, I can point the patches to you, but you'd need
> to give a
> > > > day's time to me as I have some other items to finish from
> my plate.
> > >
> > >
> > > Here is the list of the patches need to be applied on the
> following
> > > order:
> > >
> > > http://review.gluster.org/9328
> > > http://review.gluster.org/9393
> > > http://review.gluster.org/10023
> > >
> > > >
> > > > ~Atin
> > > >>
> > > >> Regards,
> > > >> -Ram
> > > >>
> > > >> On Thu, Jun 16, 2016 at 11:02 AM, Atin Mukherjee wrote:
> > > >>
> > > >>
> > > >> On 06/16/2016 10:49 AM, B.K.Raghuram wrote:
> > > >> >
> > > >> >
> > > >> > On Wed, Jun 15, 2016 at 5:01 PM, Atin Mukherjee wrote:
> > > >> >
> > > >> >
> > > >> >
> > > >> > On 06/15/2016 04:24 PM, B.K.Raghuram wrote:
> > > >> > > Hi,
> > > >> > >
> > > >> > > We're using gluster 3.6.1 and we periodically
> find
> > > that gluster commands
> > > >> > > fail saying the it could not get the lock on
> one of
> > > the brick machines.
> > > >> > > The logs on that machine then say something
> like :
> > > >> > >
> > > >> > > [2016-06-15 08:17:03.076119] E
> > > >> > > [glusterd-op-sm.c:3058:glusterd_op_ac_lock]
> > > 0-management: Unable to
> > > >> > > acquire lock for vol2
> > > >> >
> >

Re: [Gluster-users] Problem with glusterd locks on gluster 3.6.1

2016-06-17 Thread Atin Mukherjee


On 06/17/2016 12:44 PM, B.K.Raghuram wrote:
> Thanks Atin.. I'm not familiar with pulling patches the review system
> but will try:)

It's not that difficult. Open the gerrit review link, go to the download
drop-down at the top right corner, click on it and you will see a
cherry pick option; copy that content and paste it into the source code repo
you host. If there are no merge conflicts, it should auto-apply;
otherwise you'd need to fix them manually.

HTH.
Atin
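(For readers following the workflow above: the command Gerrit hands out for the
cherry-pick option generally boils down to a git fetch of the change ref followed
by git cherry-pick, run inside the local clone of the forked source. The sketch
below is only an approximation for the three reviews listed later in this thread;
the anonymous fetch URL and the trailing patchset number in each ref are
assumptions, so copy the exact command from the Gerrit download box rather than
this.)

    # Run inside your clone of the forked glusterfs 3.6.1 source, on the branch you build.
    # Changes 9328, 9393 and 10023, in that order; the "/1" patchset suffix is a guess.
    cd /path/to/glusterfs
    for ref in refs/changes/28/9328/1 refs/changes/93/9393/1 refs/changes/23/10023/1; do
        git fetch https://review.gluster.org/glusterfs "$ref" && git cherry-pick FETCH_HEAD || break
    done
    # On a conflict, fix the reported files, then:
    #   git add <files> && git cherry-pick --continue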

> 
> On Fri, Jun 17, 2016 at 12:35 PM, Atin Mukherjee wrote:
> 
> 
> 
> On 06/16/2016 06:17 PM, Atin Mukherjee wrote:
> >
> >
> > On 06/16/2016 01:32 PM, B.K.Raghuram wrote:
> >> Thanks a lot Atin,
> >>
> >> The problem is that we are using a forked version of 3.6.1 which has
> >> been modified to work with ZFS (for snapshots) but we do not have the
> >> resources to port that over to the later versions of gluster.
> >>
> >> Would you know of anyone who would be willing to take this on?!
> >
> > If you can cherry pick the patches and apply them on your source and
> > rebuild it, I can point the patches to you, but you'd need to give a
> > day's time to me as I have some other items to finish from my plate.
> 
> 
> Here is the list of the patches need to be applied on the following
> order:
> 
> http://review.gluster.org/9328
> http://review.gluster.org/9393
> http://review.gluster.org/10023
> 
> >
> > ~Atin
> >>
> >> Regards,
> >> -Ram
> >>
> >> On Thu, Jun 16, 2016 at 11:02 AM, Atin Mukherjee wrote:
> >>
> >>
> >>
> >> On 06/16/2016 10:49 AM, B.K.Raghuram wrote:
> >> >
> >> >
> >> > On Wed, Jun 15, 2016 at 5:01 PM, Atin Mukherjee wrote:
> >> >
> >> >
> >> > On 06/15/2016 04:24 PM, B.K.Raghuram wrote:
> >> > > Hi,
> >> > >
> >> > > We're using gluster 3.6.1 and we periodically find
> that gluster commands
> >> > > fail saying the it could not get the lock on one of
> the brick machines.
> >> > > The logs on that machine then say something like :
> >> > >
> >> > > [2016-06-15 08:17:03.076119] E
> >> > > [glusterd-op-sm.c:3058:glusterd_op_ac_lock]
> 0-management: Unable to
> >> > > acquire lock for vol2
> >> >
> >> > This is a possible case if concurrent volume operations
> are run. Do you
> >> > have any script which checks for volume status on an
> interval from all
> >> > the nodes, if so then this is an expected behavior.
> >> >
> >> >
> >> > Yes, I do have a couple of scripts that check on volume and
> quota
> >> > status.. Given this, I do get a "Another transaction is in
> progress.."
> >> > message which is ok. The problem is that sometimes I get
> the volume lock
> >> > held message which never goes away. This sometimes results
> in glusterd
> >> > consuming a lot of memory and CPU and the problem can only
> be fixed with
> >> > a reboot. The log files are huge so I'm not sure if its ok
> to attach
> >> > them to an email.
> >>
> >> Ok, so this is known. We have fixed lots of stale lock issues
> in 3.7
> >> branch and some of them if not all were also backported to
> 3.6 branch.
> >> The issue is you are using 3.6.1 which is quite old. If you
> can upgrade
> >> to latest versions of 3.7 or at worst of 3.6 I am confident
> that this
> >> will go away.
> >>
> >> ~Atin
> >> >
> >> > >
> >> > > After sometime, glusterd then seems to give up and die..
> >> >
> >> > Do you mean glusterd shuts down or segfaults, if so I
> am more
> >> interested
> >> > in analyzing this part. Could you provide us the
> glusterd log,
> >> > cmd_history log file along with core (in case of SEGV) from
> >> all the
> >> > nodes for the further analysis?
> >> >
> >> >
> >> > There is no segfault. glusterd just shuts down. As I said
> above,
> >> > sometimes this happens and sometimes it just continues to
> hog a lot of
> >> > memory and CPU..
> >> >
> >> >
> >> > >
> >> > > Interestingly, I also find the following line in the
> >> beginning of
> >> > > etc-gluster

Re: [Gluster-users] Problem with glusterd locks on gluster 3.6.1

2016-06-17 Thread B.K.Raghuram
Thanks Atin.. I'm not familiar with pulling patches from the review system but
will try :)

On Fri, Jun 17, 2016 at 12:35 PM, Atin Mukherjee 
wrote:

>
>
> On 06/16/2016 06:17 PM, Atin Mukherjee wrote:
> >
> >
> > On 06/16/2016 01:32 PM, B.K.Raghuram wrote:
> >> Thanks a lot Atin,
> >>
> >> The problem is that we are using a forked version of 3.6.1 which has
> >> been modified to work with ZFS (for snapshots) but we do not have the
> >> resources to port that over to the later versions of gluster.
> >>
> >> Would you know of anyone who would be willing to take this on?!
> >
> > If you can cherry pick the patches and apply them on your source and
> > rebuild it, I can point the patches to you, but you'd need to give a
> > day's time to me as I have some other items to finish from my plate.
>
>
> Here is the list of the patches need to be applied on the following order:
>
> http://review.gluster.org/9328
> http://review.gluster.org/9393
> http://review.gluster.org/10023
>
> >
> > ~Atin
> >>
> >> Regards,
> >> -Ram
> >>
> >> On Thu, Jun 16, 2016 at 11:02 AM, Atin Mukherjee wrote:
> >>
> >>
> >>
> >> On 06/16/2016 10:49 AM, B.K.Raghuram wrote:
> >> >
> >> >
> >> > On Wed, Jun 15, 2016 at 5:01 PM, Atin Mukherjee wrote:
> >> >
> >> >
> >> >
> >> > On 06/15/2016 04:24 PM, B.K.Raghuram wrote:
> >> > > Hi,
> >> > >
> >> > > We're using gluster 3.6.1 and we periodically find that
> gluster commands
> >> > > fail saying the it could not get the lock on one of the
> brick machines.
> >> > > The logs on that machine then say something like :
> >> > >
> >> > > [2016-06-15 08:17:03.076119] E
> >> > > [glusterd-op-sm.c:3058:glusterd_op_ac_lock] 0-management:
> Unable to
> >> > > acquire lock for vol2
> >> >
> >> > This is a possible case if concurrent volume operations are
> run. Do you
> >> > have any script which checks for volume status on an interval
> from all
> >> > the nodes, if so then this is an expected behavior.
> >> >
> >> >
> >> > Yes, I do have a couple of scripts that check on volume and quota
> >> > status.. Given this, I do get a "Another transaction is in
> progress.."
> >> > message which is ok. The problem is that sometimes I get the
> volume lock
> >> > held message which never goes away. This sometimes results in
> glusterd
> >> > consuming a lot of memory and CPU and the problem can only be
> fixed with
> >> > a reboot. The log files are huge so I'm not sure if its ok to
> attach
> >> > them to an email.
> >>
> >> Ok, so this is known. We have fixed lots of stale lock issues in 3.7
> >> branch and some of them if not all were also backported to 3.6
> branch.
> >> The issue is you are using 3.6.1 which is quite old. If you can
> upgrade
> >> to latest versions of 3.7 or at worst of 3.6 I am confident that
> this
> >> will go away.
> >>
> >> ~Atin
> >> >
> >> > >
> >> > > After sometime, glusterd then seems to give up and die..
> >> >
> >> > Do you mean glusterd shuts down or segfaults, if so I am more
> >> interested
> >> > in analyzing this part. Could you provide us the glusterd log,
> >> > cmd_history log file along with core (in case of SEGV) from
> >> all the
> >> > nodes for the further analysis?
> >> >
> >> >
> >> > There is no segfault. glusterd just shuts down. As I said above,
> >> > sometimes this happens and sometimes it just continues to hog a
> lot of
> >> > memory and CPU..
> >> >
> >> >
> >> > >
> >> > > Interestingly, I also find the following line in the
> >> beginning of
> >> > > etc-glusterfs-glusterd.vol.log and I dont know if this has
> any
> >> > > significance to the issue :
> >> > >
> >> > > [2016-06-14 06:48:57.282290] I
> >> > > [glusterd-store.c:2063:glusterd_restore_op_version]
> >> 0-management:
> >> > > Detected new install. Setting op-version to maximum : 30600
> >> > >
> >> >
> >> >
> >> > What does this line signify?
> >>
> >>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Problem with glusterd locks on gluster 3.6.1

2016-06-17 Thread Atin Mukherjee


On 06/16/2016 06:17 PM, Atin Mukherjee wrote:
> 
> 
> On 06/16/2016 01:32 PM, B.K.Raghuram wrote:
>> Thanks a lot Atin,
>>
>> The problem is that we are using a forked version of 3.6.1 which has
>> been modified to work with ZFS (for snapshots) but we do not have the
>> resources to port that over to the later versions of gluster.
>>
>> Would you know of anyone who would be willing to take this on?!
> 
> If you can cherry pick the patches and apply them on your source and
> rebuild it, I can point the patches to you, but you'd need to give a
> day's time to me as I have some other items to finish from my plate.


Here is the list of patches that need to be applied, in the following order:

http://review.gluster.org/9328
http://review.gluster.org/9393
http://review.gluster.org/10023

> 
> ~Atin
>>
>> Regards,
>> -Ram
>>
>> On Thu, Jun 16, 2016 at 11:02 AM, Atin Mukherjee wrote:
>>
>>
>>
>> On 06/16/2016 10:49 AM, B.K.Raghuram wrote:
>> >
>> >
>> > On Wed, Jun 15, 2016 at 5:01 PM, Atin Mukherjee wrote:
>> >
>> >
>> >
>> > On 06/15/2016 04:24 PM, B.K.Raghuram wrote:
>> > > Hi,
>> > >
>> > > We're using gluster 3.6.1 and we periodically find that gluster 
>> commands
>> > > fail saying the it could not get the lock on one of the brick 
>> machines.
>> > > The logs on that machine then say something like :
>> > >
>> > > [2016-06-15 08:17:03.076119] E
>> > > [glusterd-op-sm.c:3058:glusterd_op_ac_lock] 0-management: Unable 
>> to
>> > > acquire lock for vol2
>> >
>> > This is a possible case if concurrent volume operations are run. 
>> Do you
>> > have any script which checks for volume status on an interval from 
>> all
>> > the nodes, if so then this is an expected behavior.
>> >
>> >
>> > Yes, I do have a couple of scripts that check on volume and quota
>> > status.. Given this, I do get a "Another transaction is in progress.."
>> > message which is ok. The problem is that sometimes I get the volume 
>> lock
>> > held message which never goes away. This sometimes results in glusterd
>> > consuming a lot of memory and CPU and the problem can only be fixed 
>> with
>> > a reboot. The log files are huge so I'm not sure if its ok to attach
>> > them to an email.
>>
>> Ok, so this is known. We have fixed lots of stale lock issues in 3.7
>> branch and some of them if not all were also backported to 3.6 branch.
>> The issue is you are using 3.6.1 which is quite old. If you can upgrade
>> to latest versions of 3.7 or at worst of 3.6 I am confident that this
>> will go away.
>>
>> ~Atin
>> >
>> > >
>> > > After sometime, glusterd then seems to give up and die..
>> >
>> > Do you mean glusterd shuts down or segfaults, if so I am more
>> interested
>> > in analyzing this part. Could you provide us the glusterd log,
>> > cmd_history log file along with core (in case of SEGV) from
>> all the
>> > nodes for the further analysis?
>> >
>> >
>> > There is no segfault. glusterd just shuts down. As I said above,
>> > sometimes this happens and sometimes it just continues to hog a lot of
>> > memory and CPU..
>> >
>> >
>> > >
>> > > Interestingly, I also find the following line in the
>> beginning of
>> > > etc-glusterfs-glusterd.vol.log and I dont know if this has any
>> > > significance to the issue :
>> > >
>> > > [2016-06-14 06:48:57.282290] I
>> > > [glusterd-store.c:2063:glusterd_restore_op_version]
>> 0-management:
>> > > Detected new install. Setting op-version to maximum : 30600
>> > >
>> >
>> >
>> > What does this line signify?
>>
>>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Problem with glusterd locks on gluster 3.6.1

2016-06-16 Thread Atin Mukherjee


On 06/16/2016 01:32 PM, B.K.Raghuram wrote:
> Thanks a lot Atin,
> 
> The problem is that we are using a forked version of 3.6.1 which has
> been modified to work with ZFS (for snapshots) but we do not have the
> resources to port that over to the later versions of gluster.
> 
> Would you know of anyone who would be willing to take this on?!

If you can cherry-pick the patches and apply them on your source and
rebuild it, I can point you to the patches, but you'd need to give me a
day's time as I have some other items to finish on my plate.

~Atin
> 
> Regards,
> -Ram
> 
> On Thu, Jun 16, 2016 at 11:02 AM, Atin Mukherjee wrote:
> 
> 
> 
> On 06/16/2016 10:49 AM, B.K.Raghuram wrote:
> >
> >
> > On Wed, Jun 15, 2016 at 5:01 PM, Atin Mukherjee wrote:
> >
> >
> >
> > On 06/15/2016 04:24 PM, B.K.Raghuram wrote:
> > > Hi,
> > >
> > > We're using gluster 3.6.1 and we periodically find that gluster 
> commands
> > > fail saying the it could not get the lock on one of the brick 
> machines.
> > > The logs on that machine then say something like :
> > >
> > > [2016-06-15 08:17:03.076119] E
> > > [glusterd-op-sm.c:3058:glusterd_op_ac_lock] 0-management: Unable 
> to
> > > acquire lock for vol2
> >
> > This is a possible case if concurrent volume operations are run. Do 
> you
> > have any script which checks for volume status on an interval from 
> all
> > the nodes, if so then this is an expected behavior.
> >
> >
> > Yes, I do have a couple of scripts that check on volume and quota
> > status.. Given this, I do get a "Another transaction is in progress.."
> > message which is ok. The problem is that sometimes I get the volume lock
> > held message which never goes away. This sometimes results in glusterd
> > consuming a lot of memory and CPU and the problem can only be fixed with
> > a reboot. The log files are huge so I'm not sure if its ok to attach
> > them to an email.
> 
> Ok, so this is known. We have fixed lots of stale lock issues in 3.7
> branch and some of them if not all were also backported to 3.6 branch.
> The issue is you are using 3.6.1 which is quite old. If you can upgrade
> to latest versions of 3.7 or at worst of 3.6 I am confident that this
> will go away.
> 
> ~Atin
> >
> > >
> > > After sometime, glusterd then seems to give up and die..
> >
> > Do you mean glusterd shuts down or segfaults, if so I am more
> interested
> > in analyzing this part. Could you provide us the glusterd log,
> > cmd_history log file along with core (in case of SEGV) from
> all the
> > nodes for the further analysis?
> >
> >
> > There is no segfault. glusterd just shuts down. As I said above,
> > sometimes this happens and sometimes it just continues to hog a lot of
> > memory and CPU..
> >
> >
> > >
> > > Interestingly, I also find the following line in the
> beginning of
> > > etc-glusterfs-glusterd.vol.log and I dont know if this has any
> > > significance to the issue :
> > >
> > > [2016-06-14 06:48:57.282290] I
> > > [glusterd-store.c:2063:glusterd_restore_op_version]
> 0-management:
> > > Detected new install. Setting op-version to maximum : 30600
> > >
> >
> >
> > What does this line signify?
> 
> 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Problem with glusterd locks on gluster 3.6.1

2016-06-16 Thread B.K.Raghuram
Thanks a lot Atin,

The problem is that we are using a forked version of 3.6.1 which has been
modified to work with ZFS (for snapshots) but we do not have the resources
to port that over to the later versions of gluster.

Would you know of anyone who would be willing to take this on?!

Regards,
-Ram

On Thu, Jun 16, 2016 at 11:02 AM, Atin Mukherjee 
wrote:

>
>
> On 06/16/2016 10:49 AM, B.K.Raghuram wrote:
> >
> >
> > On Wed, Jun 15, 2016 at 5:01 PM, Atin Mukherjee wrote:
> >
> >
> >
> > On 06/15/2016 04:24 PM, B.K.Raghuram wrote:
> > > Hi,
> > >
> > > We're using gluster 3.6.1 and we periodically find that gluster
> commands
> > > fail saying the it could not get the lock on one of the brick
> machines.
> > > The logs on that machine then say something like :
> > >
> > > [2016-06-15 08:17:03.076119] E
> > > [glusterd-op-sm.c:3058:glusterd_op_ac_lock] 0-management: Unable to
> > > acquire lock for vol2
> >
> > This is a possible case if concurrent volume operations are run. Do
> you
> > have any script which checks for volume status on an interval from
> all
> > the nodes, if so then this is an expected behavior.
> >
> >
> > Yes, I do have a couple of scripts that check on volume and quota
> > status.. Given this, I do get a "Another transaction is in progress.."
> > message which is ok. The problem is that sometimes I get the volume lock
> > held message which never goes away. This sometimes results in glusterd
> > consuming a lot of memory and CPU and the problem can only be fixed with
> > a reboot. The log files are huge so I'm not sure if its ok to attach
> > them to an email.
>
> Ok, so this is known. We have fixed lots of stale lock issues in 3.7
> branch and some of them if not all were also backported to 3.6 branch.
> The issue is you are using 3.6.1 which is quite old. If you can upgrade
> to latest versions of 3.7 or at worst of 3.6 I am confident that this
> will go away.
>
> ~Atin
> >
> > >
> > > After sometime, glusterd then seems to give up and die..
> >
> > Do you mean glusterd shuts down or segfaults, if so I am more
> interested
> > in analyzing this part. Could you provide us the glusterd log,
> > cmd_history log file along with core (in case of SEGV) from all the
> > nodes for the further analysis?
> >
> >
> > There is no segfault. glusterd just shuts down. As I said above,
> > sometimes this happens and sometimes it just continues to hog a lot of
> > memory and CPU..
> >
> >
> > >
> > > Interestingly, I also find the following line in the beginning of
> > > etc-glusterfs-glusterd.vol.log and I dont know if this has any
> > > significance to the issue :
> > >
> > > [2016-06-14 06:48:57.282290] I
> > > [glusterd-store.c:2063:glusterd_restore_op_version] 0-management:
> > > Detected new install. Setting op-version to maximum : 30600
> > >
> >
> >
> > What does this line signify?
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Problem with glusterd locks on gluster 3.6.1

2016-06-15 Thread Atin Mukherjee


On 06/16/2016 10:49 AM, B.K.Raghuram wrote:
> 
> 
> On Wed, Jun 15, 2016 at 5:01 PM, Atin Mukherjee wrote:
> 
> 
> 
> On 06/15/2016 04:24 PM, B.K.Raghuram wrote:
> > Hi,
> >
> > We're using gluster 3.6.1 and we periodically find that gluster commands
> > fail saying the it could not get the lock on one of the brick machines.
> > The logs on that machine then say something like :
> >
> > [2016-06-15 08:17:03.076119] E
> > [glusterd-op-sm.c:3058:glusterd_op_ac_lock] 0-management: Unable to
> > acquire lock for vol2
> 
> This is a possible case if concurrent volume operations are run. Do you
> have any script which checks for volume status on an interval from all
> the nodes, if so then this is an expected behavior.
> 
> 
> Yes, I do have a couple of scripts that check on volume and quota
> status.. Given this, I do get a "Another transaction is in progress.."
> message which is ok. The problem is that sometimes I get the volume lock
> held message which never goes away. This sometimes results in glusterd
> consuming a lot of memory and CPU and the problem can only be fixed with
> a reboot. The log files are huge so I'm not sure if its ok to attach
> them to an email.

Ok, so this is known. We have fixed lots of stale lock issues in the 3.7
branch, and some of them, if not all, were also backported to the 3.6 branch.
The issue is that you are using 3.6.1, which is quite old. If you can upgrade
to the latest version of 3.7, or at worst of 3.6, I am confident that this
will go away.

~Atin
> 
> >
> > After sometime, glusterd then seems to give up and die..
> 
> Do you mean glusterd shuts down or segfaults, if so I am more interested
> in analyzing this part. Could you provide us the glusterd log,
> cmd_history log file along with core (in case of SEGV) from all the
> nodes for the further analysis?
> 
> 
> There is no segfault. glusterd just shuts down. As I said above,
> sometimes this happens and sometimes it just continues to hog a lot of
> memory and CPU..
> 
> 
> >
> > Interestingly, I also find the following line in the beginning of
> > etc-glusterfs-glusterd.vol.log and I dont know if this has any
> > significance to the issue :
> >
> > [2016-06-14 06:48:57.282290] I
> > [glusterd-store.c:2063:glusterd_restore_op_version] 0-management:
> > Detected new install. Setting op-version to maximum : 30600
> >
> 
> 
> What does this line signify?
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Problem with glusterd locks on gluster 3.6.1

2016-06-15 Thread B.K.Raghuram
On Wed, Jun 15, 2016 at 5:01 PM, Atin Mukherjee  wrote:

>
>
> On 06/15/2016 04:24 PM, B.K.Raghuram wrote:
> > Hi,
> >
> > We're using gluster 3.6.1 and we periodically find that gluster commands
> > fail saying the it could not get the lock on one of the brick machines.
> > The logs on that machine then say something like :
> >
> > [2016-06-15 08:17:03.076119] E
> > [glusterd-op-sm.c:3058:glusterd_op_ac_lock] 0-management: Unable to
> > acquire lock for vol2
>
> This is a possible case if concurrent volume operations are run. Do you
> have any script which checks for volume status on an interval from all
> the nodes, if so then this is an expected behavior.
>

Yes, I do have a couple of scripts that check on volume and quota status.
Given this, I do get an "Another transaction is in progress.." message, which
is OK. The problem is that sometimes I get the volume-lock-held message,
which never goes away. This sometimes results in glusterd consuming a lot
of memory and CPU, and the problem can only be fixed with a reboot. The log
files are huge, so I'm not sure if it's OK to attach them to an email.
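(Since concurrent status checks are what trigger the contention, one way to make
such monitoring scripts more forgiving is to retry with a pause whenever the CLI
reports that another transaction holds the cluster lock, rather than failing or
immediately re-running. The sketch below is only an illustration under that
assumption; the retry count, sleep interval, and the exact error strings matched
are guesses rather than documented gluster behaviour, and "vol2" is simply the
volume named in the log line quoted in this thread.)

    #!/bin/bash
    # Hypothetical wrapper for the periodic volume/quota checks: retry a few
    # times when glusterd says another transaction is in progress or the
    # cluster lock could not be acquired.
    run_gluster_check() {
        local tries=5 out
        while [ "$tries" -gt 0 ]; do
            if out=$("$@" 2>&1); then
                printf '%s\n' "$out"
                return 0
            fi
            case "$out" in
                *"Another transaction is in progress"*|*"Unable to acquire lock"*)
                    tries=$((tries - 1))
                    sleep 10 ;;               # back off before retrying
                *)
                    printf '%s\n' "$out" >&2  # some other failure: give up
                    return 1 ;;
            esac
        done
        printf '%s\n' "$out" >&2
        return 1
    }

    run_gluster_check gluster volume status
    run_gluster_check gluster volume quota vol2 list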

>
> > After sometime, glusterd then seems to give up and die..
>
> Do you mean glusterd shuts down or segfaults, if so I am more interested
> in analyzing this part. Could you provide us the glusterd log,
> cmd_history log file along with core (in case of SEGV) from all the
> nodes for the further analysis?
>

There is no segfault; glusterd just shuts down. As I said above, sometimes
this happens and sometimes it just continues to hog a lot of memory and
CPU.

>
> >
> > Interestingly, I also find the following line in the beginning of
> > etc-glusterfs-glusterd.vol.log and I dont know if this has any
> > significance to the issue :
> >
> > [2016-06-14 06:48:57.282290] I
> > [glusterd-store.c:2063:glusterd_restore_op_version] 0-management:
> > Detected new install. Setting op-version to maximum : 30600
> >
>

What does this line signify?
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Problem with glusterd locks on gluster 3.6.1

2016-06-15 Thread Atin Mukherjee


On 06/15/2016 04:24 PM, B.K.Raghuram wrote:
> Hi,
> 
> > We're using gluster 3.6.1 and we periodically find that gluster commands
> > fail, saying that it could not get the lock on one of the brick machines.
> > The logs on that machine then say something like:
> 
> [2016-06-15 08:17:03.076119] E
> [glusterd-op-sm.c:3058:glusterd_op_ac_lock] 0-management: Unable to
> acquire lock for vol2

This is a possible case if concurrent volume operations are run. Do you
have any script which checks volume status on an interval from all
the nodes? If so, then this is expected behavior.
> 
> After sometime, glusterd then seems to give up and die..

Do you mean glusterd shuts down, or does it segfault? If so, I am more interested
in analyzing this part. Could you provide us the glusterd log and
cmd_history log file, along with the core (in case of SEGV), from all the
nodes for further analysis?
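(For anyone who needs to gather the same data, a rough sketch is below, assuming
the default log directory /var/log/glusterfs that matches the
etc-glusterfs-glusterd.vol.log file mentioned in the quoted message; the archive
name and the core-file locations are placeholders, since where a core lands
depends on each node's core_pattern setting.)

    # Run on every node in the cluster; adjust paths if your logs live elsewhere.
    node=$(hostname -s)
    tar czf "glusterd-logs-${node}.tar.gz" \
        /var/log/glusterfs/etc-glusterfs-glusterd.vol.log* \
        /var/log/glusterfs/cmd_history.log*
    # If glusterd actually crashed (SEGV), also look for a core dump, e.g.:
    #   ls /core* /var/crash/ 2>/dev/null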

> 
> Interestingly, I also find the following line in the beginning of
> etc-glusterfs-glusterd.vol.log and I dont know if this has any
> significance to the issue :
> 
> [2016-06-14 06:48:57.282290] I
> [glusterd-store.c:2063:glusterd_restore_op_version] 0-management:
> Detected new install. Setting op-version to maximum : 30600
> 
> Any idea what the problem may be?
> 
> Thanks,
> -Ram
> 
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
> 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users