Re: [Gluster-users] uninterruptible processes writing to glusterfsshare

2011-06-08 Thread Justice London
Hopefully this will help some people... try disabling the io-cache routine
in the fuse configurations for your share. Let me know if you need
instruction on doing this. It solved all of the lockup issues I was
experiencing. I believe there is some sort of as-yet-undetermined memory
leak here.
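
For anyone who wants to try this, here is a minimal sketch. It assumes the volume is managed through the gluster CLI and that your release accepts the `performance.io-cache` key for `gluster volume set` (please check the documentation for your version). The volume name `office-data` is taken from the volfiles quoted in this thread; the `RUN` switch and the helper name are only illustrative.

```shell
#!/bin/sh
# Sketch: turn off the io-cache translator for one volume via the gluster CLI.
# VOLNAME defaults to the volume quoted in this thread; override it as needed.
VOLNAME="${VOLNAME:-office-data}"

# Print the command so it can be reviewed before anything is changed.
io_cache_off_cmd() {
    printf 'gluster volume set %s performance.io-cache off\n' "$1"
}

# By default only show the command; set RUN=1 to execute it on a server node.
if [ "${RUN:-0}" = "1" ]; then
    gluster volume set "$VOLNAME" performance.io-cache off
else
    io_cache_off_cmd "$VOLNAME"
fi
```

If the volume uses hand-written volfiles instead, the equivalent change would be to take the performance/io-cache stanza out of the client (fuse) volfile and remount.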

Justice London

-----Original Message-----
From: gluster-users-boun...@gluster.org
[mailto:gluster-users-boun...@gluster.org] On Behalf Of bxma...@gmail.com
Sent: Wednesday, June 08, 2011 12:22 PM
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] uninterruptible processes writing to glusterfsshare

I've been using kernel 2.6.34 + fuse 2.5.5 + gluster 3.2 from the beginning,
and it happened again today ...
php-fpm froze, and a reboot was the only solution.

Matus


2011/6/7 Markus Fröhlich markus.froehl...@xidras.com:
 hi!

 there is no relevant output from dmesg.
 no entries in the server log - only the one line in the client-server log
 that I already posted.

 the glusterfs version on the server was updated to gfs 3.2.0 more than
 a month ago.
 because of the troubles on the backup server, I deleted the whole backup
 share and started from scratch.

 I looked for an update of fuse and upgraded from 2.7.2-61.18.1 to
 2.8.5-41.1; maybe this helps.

 here is the changelog info:

 Authors:
 
    Miklos Szeredi mik...@szeredi.hu
 Distribution: systemsmanagement:baracus / SLE_11_SP1
 * Tue Mar 29 2011 db...@novell.com
 - remove the --no-canonicalize usage for suse_version = 11.3

 * Mon Mar 21 2011 co...@novell.com
 - licenses package is about to die

 * Thu Feb 17 2011 mszer...@suse.cz
 - In case of failure to add to /etc/mtab don't umount. [bnc#668820]
  [CVE-2011-0541]

 * Tue Nov 16 2010 mszer...@suse.cz
 - Fix symlink attack for mount and umount [bnc#651598]

 * Wed Oct 27 2010 mszer...@suse.cz
 - Remove /etc/init.d/boot.fuse [bnc#648843]

 * Tue Sep 28 2010 mszer...@suse.cz
 - update to 2.8.5
  * fix option escaping for fusermount [bnc#641480]

 * Wed Apr 28 2010 mszer...@suse.cz
 - keep examples and internal docs in devel package (from jnweiger)

 * Mon Apr 26 2010 mszer...@suse.cz
 - update to 2.8.4
  * fix checking for symlinks in umount from /tmp
  * fix umounting if /tmp is a symlink


 kind regards
 markus froehlich

 On 06.06.2011 21:19, Anthony J. Biacco wrote:

 Could be fuse, check 'dmesg' for kernel module timeouts.
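
 A minimal sketch of such a check, assuming the kernel's hung-task detector is enabled so that stuck tasks are reported with the usual "blocked for more than N seconds" message. The helper name `hung_task_lines` is made up for illustration.

```shell
#!/bin/sh
# Sketch: keep only kernel log lines that report hung tasks or mention fuse.
# Reads log text on stdin so it works on live dmesg output or a saved log.
hung_task_lines() {
    grep -iE 'blocked for more than [0-9]+ seconds|fuse'
}
```

 Typical use would be `dmesg | hung_task_lines`; no output means the kernel logged nothing of this kind, which does not rule out a fuse problem.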

 In a similar vein, has anyone seen significant performance/reliability
 differences between fuse versions? Say, the latest source vs. the RHEL
 distro RPM versions.

 -Tony



 -----Original Message-----
 From: Mohit Anchlia mohitanch...@gmail.com
 Sent: June 06, 2011 1:14 PM
 To: Markus Fröhlich markus.froehl...@xidras.com
 Cc: gluster-users@gluster.org
 Subject: Re: [Gluster-users] uninterruptible processes writing to glusterfsshare

 Is there anything in the server logs? Does it follow any particular
 pattern before going into this mode?

 Did you upgrade Gluster, or is this a new install?

 2011/6/6 Markus Fröhlich markus.froehl...@xidras.com:

 hi!

 sometimes we have hanging uninterruptible processes on some client-servers
 (the STAT column in ps aux shows D), and on one of them the CPU I/O wait
 grows to 100% within a few minutes.
 you cannot kill such processes - not even kill -9 works. when you attach
 strace to such a process, you see nothing and you cannot detach it
 again.
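
 To spot these processes quickly, here is a small sketch that assumes a procps-style `ps` that accepts the `-eo` output format. The helper name `d_state_only` is illustrative; it keeps the header row plus every row whose STAT field starts with D.

```shell
#!/bin/sh
# Sketch: filter ps output down to processes in uninterruptible sleep (D).
# Expects the STAT value in the second column, as produced by the ps call below.
d_state_only() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# PID, state, kernel wait channel and command line for every process.
ps -eo pid,stat,wchan:32,args | d_state_only
```

 The WCHAN column shows which kernel function the process is sleeping in, which can hint at whether it is waiting inside fuse.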

 there are only two possibilities:
 killing the glusterfs process (umount GFS share) or rebooting the server.

 the only log entry I found was on one client - just a single line:
 [2011-06-06 10:44:18.593211] I [afr-common.c:581:afr_lookup_collect_xattr] 0-office-data-replicate-0: data self-heal is pending for /pc-partnerbet-public/Promotionaktionen/Mailakquise_2009/Webmaster_2010/HTML/bilder/Thumbs.db.

 one of the client-servers is a samba-server, the other one a backup-server
 based on rsync with millions of small files.

 gfs-servers + gfs-clients: SLES11 x86_64, glusterfs V 3.2.0

 and here are the configs from server and client:
 server config
 /etc/glusterd/vols/office-data/office-data.gfs-01-01.GFS-office-data02.vol:
 volume office-data-posix
    type storage/posix
    option directory /GFS/office-data02
 end-volume

 volume office-data-access-control
    type features/access-control
    subvolumes office-data-posix
 end-volume

 volume office-data-locks
    type features/locks
    subvolumes office-data-access-control
 end-volume

 volume office-data-io-threads
    type performance/io-threads
    subvolumes office-data-locks
 end-volume

 volume office-data-marker
    type features/marker
    option volume-uuid 3c6e633d-a0bb-4c52-8f05-a2db9bc9c659
    option timestamp-file /etc/glusterd/vols/office-data/marker.tstamp
    option xtime off
    option quota off
    subvolumes office-data-io-threads
 end-volume

 volume /GFS/office-data02
    type debug/io-stats
    option latency-measurement off
    option count-fop-hits off
    subvolumes office-data-marker
 end-volume

 volume office-data

Re: [Gluster-users] uninterruptible processes writing to glusterfsshare

2011-06-08 Thread Mohit Anchlia
On Wed, Jun 8, 2011 at 12:29 PM, Justice London jlon...@lawinfo.com wrote:
 Hopefully this will help some people... try disabling the io-cache routine
 in the fuse configurations for your share. Let me know if you need
 instruction on doing this. It solved all of the lockup issues I was
 experiencing. I believe there is some sort of as-yet-undetermined memory
 leak here.

Was there a bug filed? If you think this is a bug, filing it will help others as well.


Re: [Gluster-users] uninterruptible processes writing to glusterfsshare

2011-06-08 Thread bxma...@gmail.com
How do I disable the io-cache routine? I will try it and report back :)

thanks

Matus


Re: [Gluster-users] uninterruptible processes writing to glusterfsshare

2011-06-08 Thread Justice London
I have been working with the Gluster support department (under paid support,
I should mention), with no resolution so far. I did not file a bug report
myself, though.

Justice London
