Re: [Gluster-users] Pending fcntl locks found!

2009-04-09 Thread Greg

Keith Freedman wrote:

all of a sudden, I'm getting messages such as this:

2009-03-13 23:14:06 C [posix.c:709:pl_forget] posix-locks-home1: 
Pending fcntl locks found!


and some processes are hanging waiting presumably for the locks?
any way to find out what files are being locked and unlock them.
restarting gluster doesn't seem to solve the problem.



Hi,

I'm facing the same problem with rc7: the server uses 100% of the
CPU and clients are unable to access files; they just hang, waiting for
something. I've had to remove the glusterfs stack from our production
environment and go back to local hard drives. See the attached CPU graphs
of both servers.


Config:
# file: /etc/glusterfs/glusterfsd.vol

#
# Volumes
#
volume media-small
   type storage/posix
   option directory /var/local/glusterfs/media_small
end-volume

volume media-medium
   type storage/posix
   option directory /var/local/glusterfs/media_medium
end-volume

# Lock posix
volume media-small-locks
   type features/posix-locks
   option mandatory-locks on
   subvolumes media-small
#   subvolumes trash # enable this if you need trash can support
#   (NOTE: not present in 1.3.0-pre5+ releases)

end-volume

volume media-medium-locks
   type features/posix-locks
   option mandatory-locks on
   subvolumes media-medium
#   subvolumes trash # enable this if you need trash can support
#   (NOTE: not present in 1.3.0-pre5+ releases)

end-volume


#
# Performance
#
volume media-small-iot
   type performance/io-threads
   subvolumes media-small-locks
   option thread-count 4 # default value is 1
end-volume

volume media-small-ioc
   type performance/io-cache
   option cache-size 128MB # default is 32MB
   option page-size 128KB  # default is 128KB
   subvolumes media-small-iot
end-volume

volume media-small-wb
   type performance/write-behind
   #option flush-behind on # default is off
   subvolumes media-small-ioc
end-volume

volume media-small-ra
   type performance/read-ahead
   subvolumes media-small-wb
   option page-size 256KB  # default is 256KB
   option page-count 4 # default is 2 - cache per file = (page-count x page-size)
   option force-atime-update no # default is 'no'
end-volume


volume media-medium-iot
   type performance/io-threads
   subvolumes media-medium-locks
   option thread-count 4 # default value is 1
end-volume

volume media-medium-ioc
   type performance/io-cache
   option cache-size 128MB # default is 32MB
   option page-size 128KB  # default is 128KB
   subvolumes media-medium-iot
end-volume

volume media-medium-wb
   type performance/write-behind
   #option flush-behind on # default is off
   subvolumes media-medium-ioc
end-volume

volume media-medium-ra
   type performance/read-ahead
   subvolumes media-medium-wb
   option page-size 256KB  # default is 256KB
   option page-count 4 # default is 2 - cache per file = (page-count x page-size)
   option force-atime-update no # default is 'no'
end-volume




#
# Server
#
volume server
   type protocol/server
   option transport-type tcp/server
   option auth.addr.media-small-ra.allow 10.0.*.*
   option auth.addr.media-medium-ra.allow 10.0.*.*
   # Autoconfiguration, e.g. :
   # glusterfs -l /tmp/glusterfs.log --server=filer-04 ./Cache
   option client-volume-filename /etc/glusterfs/glusterfs.vol
   subvolumes media-small-ra media-medium-ra # exported volumes
end-volume

# file: /etc/glusterfs/glusterfs.vol


#
# Clients
#
volume media-small-filer-04
   type protocol/client
   option transport-type tcp/client
   option remote-host filer-04.local
   option remote-subvolume media-small-ra
end-volume

volume media-small-filer-05
   type protocol/client
   option transport-type tcp/client
   option remote-host filer-05.local
   option remote-subvolume media-small-ra
end-volume

volume media-medium-filer-04
   type protocol/client
   option transport-type tcp/client
   option remote-host filer-04.local
   option remote-subvolume media-medium-ra
end-volume

volume media-medium-filer-05
   type protocol/client
   option transport-type tcp/client
   option remote-host filer-05.local
   option remote-subvolume media-medium-ra
end-volume


#
# Main volume
#
volume afr-small
   # AFR has been renamed to "Replicate" for simplicity.
   type cluster/replicate
   # The server with the least disk space must be listed first:
   # "When doing a "df -h" on a client, the AVAILABLE disk space
   # will display the maximum disk space of the first AFR sub volume defined
   # in the spec file. So if you have two servers with 50 gigs and 100 gigs
   # of free disk space, and the server with 100 gigs is listed first, then
   # you will see 100 gigs avai

Re: [Gluster-users] Pending fcntl locks found!

2009-03-18 Thread Keith Freedman

alright.. I'll try to do that tonight and see if it solves the problem.

if I were able to identify the exact things that were causing this 
I'd pass them along, but right now I just have general ideas.
For example, SquirrelMail webmail just breaks hard with RC4 (single
server mode), whereas in RC2 it works.  At the same time, Roundcube
webmail works fine.  But I still have no way of knowing specifically
where within SquirrelMail it gets stuck, so I can't identify a
situation to debug.


Keith

At 06:07 AM 3/18/2009, you wrote:

Hi.

Try the split approach - I didn't notice this here on RC4, during my testing.

Regards.

2009/3/18 Keith Freedman 
<freed...@freeformit.com>

I rolled back to RC2, and no longer have this problem.
I don't see these error messages, and processes that were hanging
before are not hanging now.


I'm using single process AFR.  should I split it out and try RC4, or 
does it not seem logical that this should be a problem?


Keith


At 09:57 AM 3/16/2009, Keith Freedman wrote:
Also, I'm wondering if this is related to the fact that I have a
single-process client/server setup, which used to be the recommended
method and now is not.

if I split those out, will that solve my problem?

At 09:50 AM 3/16/2009, Keith Freedman wrote:
At 04:06 AM 3/16/2009, Vikas Gorur wrote:
2009/3/14 Keith Freedman 
<freed...@freeformit.com>:

> all of a sudden, I'm getting messages such as this:
>
> 2009-03-13 23:14:06 C [posix.c:709:pl_forget] posix-locks-home1: Pending
> fcntl locks found!
>
> and some processes are hanging waiting presumably for the locks?
> any way to find out what files are being locked and unlock them.
> restarting gluster doesn't seem to solve the problem.

Are you using any applications that hold POSIX fcntl locks? Try
running the server in debug mode --- then you can find out which files
are being locked/not unlocked, etc.


Well, I'm sure I am, though I've no idea which ones; there are some PHP
scripts and some Python programs which seem to hang.

However, this problem only manifested itself when I upgraded to rc4
and the new fuse-2.7.4glfs11,

so something must be significantly different about how those (or
that combination) handle locks.

Also, debug mode won't really solve the problem, because knowing which
exact file is the problem isn't going to help: that won't
really tell me how to prevent this from happening.  Clearly one side
should get the lock and one should wait, rather than both servers in
the replicate pair just hanging on the same file?

In addition, ERROR mode logging should log enough related
information to know this stuff (this is an enhancement request :) )



Vikas
--
Engineer - Z Research
http://gluster.com/



___
Gluster-users mailing list
Gluster-users@gluster.org
http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users





Re: [Gluster-users] Pending fcntl locks found!

2009-03-18 Thread Stas Oskin
Hi.

Try the split approach - I didn't notice this here on RC4, during my
testing.

Regards.

2009/3/18 Keith Freedman 

> I rolled back to RC2, and no longer have this problem.
> I don't see these error messages, and processes that were hanging before
> are not hanging now.
>
> I'm using single process AFR.  should I split it out and try RC4, or does
> it not seem logical that this should be a problem?
>
> Keith
>
>
> At 09:57 AM 3/16/2009, Keith Freedman wrote:
>
>> Also, I'm wondering if this is related to the fact that I have a
>> single-process client/server setup, which used to be the recommended
>> method and now is not.
>>
>> if I split those out, will that solve my problem?
>>
>> At 09:50 AM 3/16/2009, Keith Freedman wrote:
>>
>>> At 04:06 AM 3/16/2009, Vikas Gorur wrote:
>>>
>>>> 2009/3/14 Keith Freedman :
>>>> > all of a sudden, I'm getting messages such as this:
>>>> >
>>>> > 2009-03-13 23:14:06 C [posix.c:709:pl_forget] posix-locks-home1: Pending
>>>> > fcntl locks found!
>>>> >
>>>> > and some processes are hanging waiting presumably for the locks?
>>>> > any way to find out what files are being locked and unlock them.
>>>> > restarting gluster doesn't seem to solve the problem.
>>>>
>>>> Are you using any applications that hold POSIX fcntl locks? Try
>>>> running the server in debug mode --- then you can find out which files
>>>> are being locked/not unlocked, etc.

>>>
>>> Well, I'm sure I am, though I've no idea which ones; there are some PHP
>>> scripts and some Python programs which seem to hang.
>>>
>>> However, this problem only manifested itself when I upgraded to rc4 and
>>> the new fuse-2.7.4glfs11,
>>>
>>> so something must be significantly different about how those (or that
>>> combination) handle locks.
>>>
>>> Also, debug mode won't really solve the problem, because knowing which
>>> exact file is the problem isn't going to help: that won't really tell me
>>> how to prevent this from happening.  Clearly one side should get the lock
>>> and one should wait, rather than both servers in the replicate pair just
>>> hanging on the same file?
>>>
>>> In addition, ERROR mode logging should log enough related information to
>>> know this stuff (this is an enhancement request :) )
>>>
>>>
>>>> Vikas
>>>> --
>>>> Engineer - Z Research
>>>> http://gluster.com/



Re: [Gluster-users] Pending fcntl locks found!

2009-03-18 Thread Keith Freedman

I rolled back to RC2, and no longer have this problem.
I don't see these error messages, and processes that were hanging
before are not hanging now.


I'm using single process AFR.  should I split it out and try RC4, or 
does it not seem logical that this should be a problem?


Keith

At 09:57 AM 3/16/2009, Keith Freedman wrote:
Also, I'm wondering if this is related to the fact that I have a
single-process client/server setup, which used to be the recommended
method and now is not.

if I split those out, will that solve my problem?

At 09:50 AM 3/16/2009, Keith Freedman wrote:

At 04:06 AM 3/16/2009, Vikas Gorur wrote:

2009/3/14 Keith Freedman :
> all of a sudden, I'm getting messages such as this:
>
> 2009-03-13 23:14:06 C [posix.c:709:pl_forget] posix-locks-home1: Pending
> fcntl locks found!
>
> and some processes are hanging waiting presumably for the locks?
> any way to find out what files are being locked and unlock them.
> restarting gluster doesn't seem to solve the problem.

Are you using any applications that hold POSIX fcntl locks? Try
running the server in debug mode --- then you can find out which files
are being locked/not unlocked, etc.


Well, I'm sure I am, though I've no idea which ones; there are some PHP
scripts and some Python programs which seem to hang.

However, this problem only manifested itself when I upgraded to rc4
and the new fuse-2.7.4glfs11,

so something must be significantly different about how those (or
that combination) handle locks.

Also, debug mode won't really solve the problem, because knowing which
exact file is the problem isn't going to help: that won't
really tell me how to prevent this from happening.  Clearly one
side should get the lock and one should wait, rather than both
servers in the replicate pair just hanging on the same file?

In addition, ERROR mode logging should log enough related
information to know this stuff (this is an enhancement request :) )




Vikas
--
Engineer - Z Research
http://gluster.com/





Re: [Gluster-users] Pending fcntl locks found!

2009-03-16 Thread Keith Freedman
Also, I'm wondering if this is related to the fact that I have a
single-process client/server setup, which used to be the recommended
method and now is not.

if I split those out, will that solve my problem?

At 09:50 AM 3/16/2009, Keith Freedman wrote:

At 04:06 AM 3/16/2009, Vikas Gorur wrote:

2009/3/14 Keith Freedman :
> all of a sudden, I'm getting messages such as this:
>
> 2009-03-13 23:14:06 C [posix.c:709:pl_forget] posix-locks-home1: Pending
> fcntl locks found!
>
> and some processes are hanging waiting presumably for the locks?
> any way to find out what files are being locked and unlock them.
> restarting gluster doesn't seem to solve the problem.

Are you using any applications that hold POSIX fcntl locks? Try
running the server in debug mode --- then you can find out which files
are being locked/not unlocked, etc.


Well, I'm sure I am, though I've no idea which ones; there are some PHP
scripts and some Python programs which seem to hang.

However, this problem only manifested itself when I upgraded to rc4
and the new fuse-2.7.4glfs11,

so something must be significantly different about how those (or
that combination) handle locks.

Also, debug mode won't really solve the problem, because knowing which
exact file is the problem isn't going to help: that won't
really tell me how to prevent this from happening.  Clearly one side
should get the lock and one should wait, rather than both servers in
the replicate pair just hanging on the same file?

In addition, ERROR mode logging should log enough related
information to know this stuff (this is an enhancement request :) )




Vikas
--
Engineer - Z Research
http://gluster.com/





Re: [Gluster-users] Pending fcntl locks found!

2009-03-16 Thread Keith Freedman

At 04:06 AM 3/16/2009, Vikas Gorur wrote:

2009/3/14 Keith Freedman :
> all of a sudden, I'm getting messages such as this:
>
> 2009-03-13 23:14:06 C [posix.c:709:pl_forget] posix-locks-home1: Pending
> fcntl locks found!
>
> and some processes are hanging waiting presumably for the locks?
> any way to find out what files are being locked and unlock them.
> restarting gluster doesn't seem to solve the problem.

Are you using any applications that hold POSIX fcntl locks? Try
running the server in debug mode --- then you can find out which files
are being locked/not unlocked, etc.


Well, I'm sure I am, though I've no idea which ones; there are some PHP
scripts and some Python programs which seem to hang.

However, this problem only manifested itself when I upgraded to rc4
and the new fuse-2.7.4glfs11,

so something must be significantly different about how those (or that
combination) handle locks.

Also, debug mode won't really solve the problem, because knowing which
exact file is the problem isn't going to help: that won't
really tell me how to prevent this from happening.  Clearly one side
should get the lock and one should wait, rather than both servers in
the replicate pair just hanging on the same file?

In addition, ERROR mode logging should log enough related information
to know this stuff (this is an enhancement request :) )




Vikas
--
Engineer - Z Research
http://gluster.com/
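
[Editor's sketch] The expectation stated above ("one side should get the lock and one should wait") is how POSIX fcntl locks behave on a plain local filesystem. The Python below illustrates only that baseline. It says nothing about what GlusterFS rc4 actually does. The helper names `try_lock` and `demo_contention` are made up for this example, and it assumes a Unix system where `os.fork` is available.

```python
import errno
import fcntl
import os
import tempfile


def try_lock(fd: int) -> bool:
    """Try to take a non-blocking exclusive fcntl lock on the whole file."""
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError as exc:
        # EAGAIN/EACCES mean another process holds a conflicting lock.
        if exc.errno in (errno.EAGAIN, errno.EACCES):
            return False
        raise


def demo_contention() -> tuple:
    """Return (parent_got_lock, child_got_lock) for one shared file."""
    with tempfile.NamedTemporaryFile() as tmp:
        parent_fd = os.open(tmp.name, os.O_RDWR)
        parent_got = try_lock(parent_fd)

        pid = os.fork()
        if pid == 0:
            # Child: fcntl locks are per process and are not inherited
            # across fork, so this request conflicts with the parent's lock.
            code = 2
            try:
                child_fd = os.open(tmp.name, os.O_RDWR)
                code = 0 if try_lock(child_fd) else 1
            finally:
                os._exit(code)  # skip the parent's cleanup handlers

        _, status = os.waitpid(pid, 0)
        child_got = os.WEXITSTATUS(status) == 0

        fcntl.lockf(parent_fd, fcntl.LOCK_UN)
        os.close(parent_fd)
        return parent_got, child_got
```

Running `demo_contention()` against a directory on the GlusterFS mount instead of the default temp directory (for example by passing `dir=` to `NamedTemporaryFile`) might help narrow down whether lock contention itself misbehaves there, though that is a suggestion rather than something tested in this thread.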





Re: [Gluster-users] Pending fcntl locks found!

2009-03-16 Thread Vikas Gorur
2009/3/14 Keith Freedman :
> all of a sudden, I'm getting messages such as this:
>
> 2009-03-13 23:14:06 C [posix.c:709:pl_forget] posix-locks-home1: Pending
> fcntl locks found!
>
> and some processes are hanging waiting presumably for the locks?
> any way to find out what files are being locked and unlock them.
> restarting gluster doesn't seem to solve the problem.

Are you using any applications that hold POSIX fcntl locks? Try
running the server in debug mode --- then you can find out which files
are being locked/not unlocked, etc.

Vikas
-- 
Engineer - Z Research
http://gluster.com/
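
[Editor's sketch] Independently of GlusterFS debug logging, a Linux kernel lists the file locks it knows about in /proc/locks, including waiters marked with "->". The Python below parses that format to show which PIDs hold or wait on POSIX locks. The names `LockEntry`, `parse_locks` and `SAMPLE` are invented for this example, and the sample lines are made-up data. Whether locks taken through a GlusterFS/FUSE mount appear in /proc/locks depends on the FUSE and GlusterFS versions in use, which has not been verified here.

```python
from typing import Iterable, List, NamedTuple


class LockEntry(NamedTuple):
    blocked: bool   # True when the line describes a waiter ("->")
    kind: str       # POSIX, FLOCK or OFDLCK
    mode: str       # ADVISORY or MANDATORY
    access: str     # READ or WRITE
    pid: int
    device: str     # "major:minor" as printed by the kernel
    inode: int
    start: str
    end: str


def parse_locks(lines: Iterable[str]) -> List[LockEntry]:
    """Parse lines in the /proc/locks format into LockEntry records."""
    entries = []
    for line in lines:
        fields = line.split()
        blocked = len(fields) > 1 and fields[1] == "->"
        if blocked:
            fields = fields[:1] + fields[2:]
        if len(fields) < 8:
            continue  # not a lock line
        kind, mode, access, pid, where, start, end = fields[1:8]
        try:
            major, minor, inode = where.split(":")
            entries.append(LockEntry(blocked, kind, mode, access, int(pid),
                                     f"{major}:{minor}", int(inode), start, end))
        except ValueError:
            continue  # skip lines that do not match the expected layout
    return entries


# Made-up sample in the /proc/locks layout, for illustration only.
SAMPLE = """\
1: POSIX  ADVISORY  WRITE 2134 08:01:131073 0 EOF
1: -> POSIX  ADVISORY  WRITE 2140 08:01:131073 0 EOF
2: FLOCK  ADVISORY  WRITE 990 08:01:262150 0 EOF
"""

if __name__ == "__main__":
    for entry in parse_locks(SAMPLE.splitlines()):
        state = "waiting" if entry.blocked else "holding"
        print(f"pid {entry.pid} {state} {entry.kind} {entry.access} "
              f"lock on inode {entry.inode}")
```

On a live system the same function can be fed `open("/proc/locks")`, and an inode number can then be mapped back to a path with `find <export-dir> -inum <inode>`.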



Re: [Gluster-users] Pending fcntl locks found!

2009-03-15 Thread Keith Freedman

any thoughts on this one?

it seems to be causing some severe problems.
there are occasions where things block on all nodes, seemingly 
waiting for one to get a lock that it never gets?

and I've no real way of finding out which file is the problem or why?


At 11:21 PM 3/13/2009, Keith Freedman wrote:

all of a sudden, I'm getting messages such as this:

2009-03-13 23:14:06 C [posix.c:709:pl_forget] posix-locks-home1: 
Pending fcntl locks found!


and some processes are hanging waiting presumably for the locks?
any way to find out what files are being locked and unlock them.
restarting gluster doesn't seem to solve the problem.

