Re: [Gluster-users] glusterfs and pacemaker

2011-07-18 Thread Uwe Weiss
Hello Marcel,

thank you very much for the patch. Great Job.

It worked on the first shot. Mounting and migration of the fs work. So far I
could not test a hard reset of a cluster node, because a colleague is
currently using the cluster.

I applied the following parameters:

Fstype: glusterfs
Mountdir: /virtfs
Glustervolume: 192.168.50.1:/gl_vol1
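
For reference, those parameters map onto a Pacemaker Filesystem primitive
roughly like this (a sketch only, using the crm shell; the resource name is
made up):

    primitive p_virtfs ocf:heartbeat:Filesystem \
        params device="192.168.50.1:/gl_vol1" directory="/virtfs" fstype="glusterfs" \
        op monitor interval="20s"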

Perhaps you can answer a question for my better understanding?

My second node is 192.168.50.2, but in the Filesystem RA I referenced
192.168.50.1 (see above). During my first test node1 was up and running,
but what happens if node1 is completely gone and the address is
inaccessible?

Thx
Uwe


Uwe Weiss
weiss edv-consulting
Lattenkamp 14
22299 Hamburg
Phone:  +49 40 51323431
Fax:    +49 40 51323437
eMail:  u.we...@netz-objekte.de

-----Original Message-----
From: gluster-users-boun...@gluster.org
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Marcel Pennewiß
Sent: Sunday, July 17, 2011 17:37
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] glusterfs and pacemaker

On Friday 15 July 2011 13:12:02 Marcel Pennewiß wrote:
  My idea is that pacemaker starts and monitors the glusterfs mountpoints
  and migrates some resources to the remaining node if one or more
  mountpoint(s) fail.
 
 For using mountpoints, please have a look at the OCF Filesystem agent.

Uwe informed me (via PM) that this didn't work - we had not used it until
now. After some investigation you'll see that ocf::Filesystem does not
detect/work with glusterfs shares :(

A few changes are necessary to add basic support for glusterfs.
@Uwe: Please have a look at [1] and try to patch your
Filesystem OCF script (which may be located in
/usr/lib/ocf/resource.d/heartbeat).

[1] http://subversion.fem.tu-ilmenau.de/repository/fem-overlay/trunk/sys-
cluster/resource-agents/files/filesystem-glusterfs-support.patch
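
Applying it could look something like this (a sketch only; the -p level
depends on how the patch was generated, so check with --dry-run first):

    cd /usr/lib/ocf/resource.d/heartbeat
    patch -p0 --dry-run < filesystem-glusterfs-support.patch
    patch -p0 < filesystem-glusterfs-support.patch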

best regards
Marcel
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] glusterfs and pacemaker

2011-07-18 Thread Marcel Pennewiß
On Monday 18 July 2011 12:10:36 Uwe Weiss wrote:
 My second node is 192.168.50.2. But in the Filesystem RA I have referenced
 to 192.168.50.1 (see above). During my first test node1 was up and running,
 but what happens if node1 is completely away and the address is
 inaccessible?

We're using a replicated setup, and both nodes share an IPv4/IPv6 address (via
pacemaker) which is used for accessing/mounting the glusterfs share and the
nfs share (from the backup server).
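
A minimal sketch of such a shared address as a Pacemaker resource (crm shell;
the address and resource name below are illustrative):

    primitive p_gluster_ip ocf:heartbeat:IPaddr2 \
        params ip="192.168.50.10" cidr_netmask="24" \
        op monitor interval="10s"

The Filesystem resource then mounts via this floating IP rather than a fixed
node address.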

Marcel
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] glusterfs and pacemaker

2011-07-18 Thread samuel
I don't know from which version onwards, but if you use the native client for
mounting the volumes, the IP only needs to be reachable at mount time. After
that, the native client will transparently handle node failures.
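
In other words, a plain native-client mount such as the one below (volume and
mountpoint taken from Uwe's mail) only needs the named server reachable while
the volfile is fetched at mount time:

    mount -t glusterfs 192.168.50.1:/gl_vol1 /virtfs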

Best regards,
Samuel.

On 18 July 2011 13:14, Marcel Pennewiß mailingli...@pennewiss.de wrote:

 On Monday 18 July 2011 12:10:36 Uwe Weiss wrote:
  My second node is 192.168.50.2. But in the Filesystem RA I have
 referenced
  to 192.168.50.1 (see above). During my first test node1 was up and
 running,
  but what happens if node1 is completely away and the address is
  inaccessible?

 We're using replicated setup and both nodes share an IPv4/IPv6-address (via
 pacemaker) which is used for accessing/mounting glusterfs-share and
 nfs-share
 (from backup-server).

 Marcel
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] glusterfs and pacemaker

2011-07-18 Thread Marcel Pennewiß
On Monday 18 July 2011 13:26:00 samuel wrote:
 I don't know from which version onwards, but if you use the native client for
 mounting the volumes, the IP only needs to be reachable at mount time. After
 that, the native client will transparently handle node failures.

ACK, that's why we use this shared IP (e.g. for backups via nfs). AFAIR
glusterFS retrieves the volfile (via the shared IP) and then connects to the
nodes.

Marcel
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Issue with Gluster Quota

2011-07-18 Thread Brian Smith
Updated to the 3.2.2 release and found that Quotas do not work when
using the POSIX-ACL translator.  In fact, after I disabled the ACLs, I
had to remove the quota'd directory (presumably, to remove some
attributes) and start over in order to get them to work.  Once I
disabled ACLs and re-created my directory, quotas worked as expected.
Is this a known limitation of using POSIX ACLs?  I happen to need both
features, so that could pose an issue :)
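
For context, the sequence being tested looks roughly like this (volume name,
path and limit taken from the output quoted below; the exact quota
sub-commands and the hypothetical mountpoint may need adjusting for your
release):

    gluster volume quota home enable
    gluster volume quota home limit-usage /brs 10MB
    # native mount with POSIX ACLs enabled (mountpoint illustrative)
    mount -t glusterfs -o acl gluster1:/home /mnt/home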

-Brian

Brian Smith
Senior Systems Administrator
IT Research Computing, University of South Florida
4202 E. Fowler Ave. ENB308
Office Phone: +1 813 974-1467
Organization URL: http://rc.usf.edu

On 07/11/2011 12:48 PM, Brian Smith wrote:
 According to the logs, the last commit was:
 
 commit 5c20eb3bbf870edadd22d06babb5d38dad222533
 Author: shishir gowda shishi...@gluster.com
 Date:   Tue Jul 5 03:41:51 2011 +
 
 [root@gluster1 glusterfs-3.2git]# gluster volume quota home list
   path  limit_set  size
 --
 /brs   10485760 81965056
 
 [root@gluster1 glusterfs-3.2git]# gluster volume info
 
 Volume Name: home
 Type: Distribute
 Status: Started
 Number of Bricks: 2
 Transport-type: tcp,rdma
 Bricks:
 Brick1: gluster1:/glusterfs/home
 Brick2: gluster2:/glusterfs/home
 Options Reconfigured:
 features.limit-usage: /brs:10MB
 features.quota: on
 
 -Brian
 
 Brian Smith
 Senior Systems Administrator
 IT Research Computing, University of South Florida
 4202 E. Fowler Ave. ENB308
 Office Phone: +1 813 974-1467
 Organization URL: http://rc.usf.edu
 
 On 07/11/2011 08:29 AM, Saurabh Jain wrote:
 Hello Brian,


  I synced my gluster repository back to July 5th and tried quota on a
 certain dir of a distribute volume, and the quota was implemented properly on
 it; here are the logs:

[root@centos-qa-client-3 glusterfs]# /root/july6git/inst/sbin/gluster 
 volume quota dist list
  path  limit_set  size
 --
 /dir   10485760 10485760


 [root@centos-qa-client-3 glusterfs]# /root/july6git/inst/sbin/gluster volume 
 info

 Volume Name: dist
 Type: Distribute
 Status: Started
 Number of Bricks: 2
 Transport-type: tcp,rdma
 Bricks:
 Brick1: 10.1.12.134:/mnt/dist
 Brick2: 10.1.12.135:/mnt/dist
 Options Reconfigured:
 features.limit-usage: /dir:10MB
 features.quota: on
 [root@centos-qa-client-3 glusterfs]# 

 requesting you to please inform us about the commit id to which your 
 workspace is synced.

 Thanks,
 Saurabh
 
 From: gluster-users-boun...@gluster.org [gluster-users-boun...@gluster.org] 
 on behalf of gluster-users-requ...@gluster.org 
 [gluster-users-requ...@gluster.org]
 Sent: Friday, July 08, 2011 12:30 AM
 To: gluster-users@gluster.org
 Subject: Gluster-users Digest, Vol 39, Issue 13

 Send Gluster-users mailing list submissions to
 gluster-users@gluster.org

 To subscribe or unsubscribe via the World Wide Web, visit
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
 or, via email, send a message with subject or body 'help' to
 gluster-users-requ...@gluster.org

 You can reach the person managing the list at
 gluster-users-ow...@gluster.org

 When replying, please edit your Subject line so it is more specific
 than Re: Contents of Gluster-users digest...


 Today's Topics:

1. Re: Issue with Gluster Quota (Brian Smith)
2. Re: Issues with geo-rep (Carl Chenet)


 --

 Message: 1
 Date: Thu, 07 Jul 2011 13:10:06 -0400
 From: Brian Smith b...@usf.edu
 Subject: Re: [Gluster-users] Issue with Gluster Quota
 To: gluster-users@gluster.org
 Message-ID: 4e15e86e.6030...@usf.edu
 Content-Type: text/plain; charset=ISO-8859-1

 Sorry about that.  I re-populated with an 82MB dump from dd:

 [root@gluster1 ~]# gluster volume quota home list
 path  limit_set  size
 --
 /brs   10485760 81965056

 [root@gluster1 ~]# getfattr -m . -d -e hex /glusterfs/home/brs
 getfattr: Removing leading '/' from absolute path names
 # file: glusterfs/home/brs
 security.selinux=0x726f6f743a6f626a6563745f723a66696c655f743a733000
 trusted.gfid=0x1bbcb9a08bf64406b440f3bb3ad334ed
 trusted.glusterfs.dht=0x00017fff
 trusted.glusterfs.quota.----0001.contri=0x6000
 trusted.glusterfs.quota.dirty=0x3000
 trusted.glusterfs.quota.size=0x6000

 [root@gluster2 ~]# getfattr -m . -d -e hex /glusterfs/home/brs
 getfattr: Removing leading '/' from absolute path names
 # file: glusterfs/home/brs
 security.selinux=0x726f6f743a6f626a6563745f723a66696c655f743a733000
 

Re: [Gluster-users] not sure how to troubleshoot SMB / CIFS overload when using GlusterFS

2011-07-18 Thread Ken Randall
Whit,

Genius!

This morning I set out to remove as many variables as possible to whittle
down the repro case as much as possible.  I've become pretty good at
debugging memory dumps on the Windows side over the years, and even
inspected the web processes.  Nothing looked out of the ordinary there, just
a bunch of threads waiting to get file attribute data from the Gluster
share.

So then, to follow your lead, I reduced the Page of Death down from
thousands of images to just five.  I tried accessing the page, and boom,
everything's frozen for minutes.  Interesting.  So I reduced it to one
image, accessed the page, and boom, everything's dead instantly.  That one
image is a file that doesn't exist.

So now, knowing that GlusterFS is kicking into overdrive fretting about a
file it can't find, I decided to eliminate the web server altogether.  I
opened up Windows Explorer, and typed in a directory that didn't exist, and
sure enough, I'm unable to navigate through the share in another Explorer
window until it finally responds again a minute later.  I think the Page of
Death was exhibiting such a massive death (e.g. only able to respond again
upwards of five minutes later) because it was systematically trying to
access several files that weren't found, and each one it can't find causes
the SMB connection to hang for close to a minute.

I feel like this is a bit of major progress toward pinpointing the problem
for a possible resolution.  Here are some additional details that may help:

The GlusterFS directory in question, /storage, has about 80,000 subdirs in
it.  As such, I'm using ext4 to overcome the subdir limitations of ext3.
The non-existent image file that is able to cause everything to freeze
exists in a directory, /storage/thisdirdoesntexist/images/blah.gif, where
thisdirdoesntexist is in that storage directory along with those 80,000
real subdirs.  I know it's a pretty laborious thing for Gluster to piece
together a directory listing, and combined with Joseph's recognition of the
flood of getdents, does it seem reasonable that Gluster or Samba is
freezing because it's for some reason generating a subdir listing of
/storage whenever it can't find one of its subdirs?
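
One quick way to test that hypothesis from a shell on the Gluster mount is to
compare a full listing of /storage against a stat of the missing path (paths
as in the example above):

    time ls -f /storage > /dev/null
    time stat /storage/thisdirdoesntexist/images/blah.gif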

As another test, I accessed a file inside a non-existent subdir of a dir
that only has five subdirs, and nothing froze.

So the freezing seems to be a function of the number of subdirectories that
are siblings of the first part of the path that doesn't exist, if that makes
sense.  So in /this/is/a/long/path, if "is" doesn't exist, then Samba will
generate a list of subdirs under /this.  And if /this has 100,000
immediate subdirs under it, then you're about to experience a world of hurt.

I read somewhere that FUSE's implementation of readdir() is a blocking
operation.  If true, the above explanation, plus FUSE's readdir(), are to
blame.

And I am therefore up a creek.  It is not feasible to enforce the system to
only have a few subdirs at any given level to prevent the lockup.  Unless
somebody, after reading this novel, has some ideas for me to try.  =)  Any
magical ways to not get FUSE to block, or any trickery on Samba's side?

Ken



On Sun, Jul 17, 2011 at 10:29 PM, Whit Blauvelt
whit.glus...@transpect.com wrote:

 On Sun, Jul 17, 2011 at 10:19:00PM -0500, Ken Randall wrote:

  (The no such file or directory part is expected since some of the image
  references don't exist.)

 Wild guess on that: Gluster may work harder at files it doesn't find than
 files it finds. It's going to look on one side or the other of the
 replicated file at first, and if it finds the file deliver it. But if it
 doesn't find the file, wouldn't it then check the other side of the
 replicated storage to make sure this wasn't a replication error?

 Might be interesting to run a version of the test where all the images
 referenced do exist, to see if it's the missing files that are driving up
 the CPU cycles.

 Whit


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] not sure how to troubleshoot SMB / CIFS overload when using GlusterFS

2011-07-18 Thread Ken Randall
Joseph,

Thank you for your response.  Yours, combined with Whit's, led me to come up
with a pretty solid repro case, and a pinpointing of what I think is going
on.

I tried your additional SMB configuration settings, and was hopeful, but they
didn't alleviate the issue.  Your interpretation of the logs was helpful,
though.  It makes sense now that Samba was pounding on GlusterFS, doing its
string of getdents operations.

I also took your advice last night on stat-cache (I assume that was on the
Gluster side, which I enabled), and wasn't sure where fast lookups was.
That didn't seem to make a noticeable difference either.

I think the lockups are happening as a result of being crippled by
GlusterFS's relatively slow directory listing (5x-10x slower generating a
dir listing than a raw SMB share), combined with FUSE's blocking readdir().
I'm not positive on that last point since there was only one mention of that
on the internet.   Am praying that somebody will see this and say, oh yeah,
well sure, just change this one thing in FUSE and you're good to go!
Somehow I don't think that's going to happen.  :)

Ken

On Sun, Jul 17, 2011 at 10:35 PM, Joe Landman 
land...@scalableinformatics.com wrote:

 On 07/17/2011 11:19 PM, Ken Randall wrote:

 Joe,

 Thank you for your response.  After seeing what you wrote, I bumped up
 the performance.cache-size up to 4096MB, the max allowed, and ran into
 the same wall.


 Hmmm ...



 I wouldn't think that any SMB caching would help in this case, since the
 same Samba server on top of the raw Gluster data wasn't exhibiting any
 trouble, or am I deceived?


 Samba could cache better so it didn't have to hit Gluster so hard.


  I haven't used strace before, but I ran it on the glusterfs process, and
 saw a lot of:
 epoll_wait(3, {{EPOLLIN, {u32=9, u64=9}}}, 257, 4294967295) = 1
 readv(9, [{\200\0\16,, 4}], 1)= 4
 readv(9, [{\0\n;\227\0\0\0\1, 8}], 1) = 8
 readv(9,
 [{\0\0\0\0\0\0\0\0\0\0\0\0\0\**0\0\0\0\0\0\31\0\0\0\0\0\0\0\**1\0\0\0\0...,
 3620}],
 1) = 1436
 readv(9, 0xa90b1b8, 1)  = -1 EAGAIN (Resource
 temporarily unavailable)


 Interesting ... I am not sure why it's reporting an EAGAIN for readv, other
 than that it can't fill the vector from the read.


  And when I ran it on smbd, I saw a constant stream of this kind of
 activity:
 getdents(29, /* 25 entries */, 32768)   = 840
 getdents(29, /* 25 entries */, 32768)   = 856
 getdents(29, /* 25 entries */, 32768)   = 848
 getdents(29, /* 24 entries */, 32768)   = 856
 getdents(29, /* 25 entries */, 32768)   = 864
 getdents(29, /* 24 entries */, 32768)   = 832
 getdents(29, /* 25 entries */, 32768)   = 832
 getdents(29, /* 24 entries */, 32768)   = 856
 getdents(29, /* 25 entries */, 32768)   = 840
 getdents(29, /* 24 entries */, 32768)   = 832
 getdents(29, /* 25 entries */, 32768)   = 784
 getdents(29, /* 25 entries */, 32768)   = 824
 getdents(29, /* 25 entries */, 32768)   = 808
 getdents(29, /* 25 entries */, 32768)   = 840
 getdents(29, /* 25 entries */, 32768)   = 864
 getdents(29, /* 25 entries */, 32768)   = 872
 getdents(29, /* 25 entries */, 32768)   = 832
 getdents(29, /* 24 entries */, 32768)   = 832
 getdents(29, /* 25 entries */, 32768)   = 840
 getdents(29, /* 25 entries */, 32768)   = 824
 getdents(29, /* 25 entries */, 32768)   = 824
 getdents(29, /* 24 entries */, 32768)   = 864
 getdents(29, /* 25 entries */, 32768)   = 848
 getdents(29, /* 24 entries */, 32768)   = 840


 Get directory entries.  This is the stuff that NTFS is caching for its web
 server, and it appears Samba is not.

 Try

aio read size = 32768
csc policy = documents
dfree cache time = 60
directory name cache size = 10
fake oplocks = yes
getwd cache = yes
level2 oplocks = yes
max stat cache size = 16384


  That chunk would get repeated over and over and over again as fast as the
 screen could go; only occasionally (every 5-10 seconds or so) would you see
 anything that you'd normally expect to see, such as:
 close(29)   = 0
 stat(Storage/01, 0x7fff07dae870) = -1 ENOENT (No such file or directory)
 write(23,
 \0\0\0#\377SMB24\0\0\300\**210A\310\0\0\0\0\0\0\0\0\0\0\**
 0\0\1\0d\233...,
 39) = 39
 select(38, [5 20 23 27 30 31 35 36 37], [], NULL, {60, 0}) = 1 (in [23],
 left {60, 0})
 read(23, \0\0\0x, 4)  = 4
 read(23,
 \377SMB2\0\0\0\0\30\7\310\0\**0\0\0\0\0\0\0\0\0\0\0\1\0\**
 250P\273\0[8...,
 120) = 120
 stat(Storage, {st_mode=S_IFDIR|0755, st_size=1581056, ...}) = 0
 stat(Storage/011235, 0x7fff07dad470) = -1 ENOENT (No such file or
 directory)
 stat(Storage/011235, 0x7fff07dad470) = -1 ENOENT (No such file or
 directory)
 open(Storage, O_RDONLY|O_NONBLOCK|O_**DIRECTORY) = 29
 fcntl(29, F_SETFD, FD_CLOEXEC)  = 0

 (The no such file or directory part is expected since some of the image
 references don't exist.)


 Ok.  It looks like Samba is pounding on GlusterFS metadata (getdents).
 

Re: [Gluster-users] not sure how to troubleshoot SMB / CIFS overload when using GlusterFS

2011-07-18 Thread Anand Avati
Please find responses inline.


 So now, knowing that GlusterFS is kicking into overdrive fretting about a
 file it can't find, I decided to eliminate the web server altogether.  I
 opened up Windows Explorer, and typed in a directory that didn't exist, and
 sure enough, I'm unable to navigate through the share in another Explorer
 window until it finally responds again a minute later.  I think the Page of
 Death was exhibiting such a massive death (e.g. only able to respond again
 upwards of five minutes later) because it was systematically trying to
 access several files that weren't found, and each one it can't find causes
 the SMB connection to hang for close to a minute.



Gluster does not really care if a file is not found. It just looks up the
filename on all servers and returns -ENOENT. End of story for Gluster.
What's happening here is that Samba is 'searching' through all filenames in
the directory to match some other filename with strcasecmp() to provide a
case-insensitive match to the user.




 I feel like this is a bit of major progress toward pinpointing the problem
 for a possible resolution.  Here are some additional details that may help:

 The GlusterFS directory in question, /storage, has about 80,000 subdirs in
 it.  As such, I'm using ext4 to overcome the subdir limitations of ext3.
 The non-existent image file that is able to cause everything to freeze
 exists in a directory, /storage/thisdirdoesntexist/images/blah.gif, where
 thisdirdoesntexist is in that storage directory along with those 80,000
 real subdirs.  I know it's a pretty laborious thing for Gluster to piece
 together a directory listing, and combined with Joseph's recognition of the
 flood of getdents, does it seem reasonable that Gluster or Samba is
 freezing because it's for some reason generating a subdir listing of
 /storage whenever it can't find one of its subdirs?



Yes, it is Samba searching around for the case-insensitive match.


As another test, I accessed a file inside a non-existent subdir of a dir
 that only has five subdirs, and nothing froze.



That is because iterating over 5 names to determine the non-existence of a
case-insensitive match is trivially fast.



 So the freezing seems to be a function of the number of subdirectories that
 are siblings of the first part of the path that doesn't exist, if that makes
 sense.  So in /this/is/a/long/path, if "is" doesn't exist, then Samba will
 generate a list of subdirs under /this.  And if /this has 100,000
 immediate subdirs under it, then you're about to experience a world of hurt.

 I read somewhere that FUSE's implementation of readdir() is a blocking
 operation.  If true, the above explanation, plus FUSE's readdir(), are to
 blame.


What do you mean by that?  FUSE's readdir() is as blocking or non-blocking as
the rest of its operations (open/create/getattr/setattr, etc.).  What you
probably meant is that the fuse kernel module does not cache dirents?


 And I am therefore up a creek.  It is not feasible to enforce the system to
 only have a few subdirs at any given level to prevent the lockup.  Unless
 somebody, after reading this novel, has some ideas for me to try.  =)  Any
 magical ways to not get FUSE to block, or any trickery on Samba's side?


It is not FUSE blocking that is your problem. You need a quicker trick to
achieve case insensitivity.

Avati
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] not sure how to troubleshoot SMB / CIFS overload when using GlusterFS

2011-07-18 Thread Anand Avati
 I also took your advice last night on stat-cache (I assume that was on the
 Gluster side, which I enabled), and wasn't sure where fast lookups was.
 That didn't seem to make a noticeable difference either.


stat-prefetch xlator does not help here. It helps quicken lookups. But Samba
is not interested in lookups/attributes. It is only looking for the existence
of directory entry names, without caring whether each is a file or a directory.



 I think the lockups are happening as a result of being crippled by
 GlusterFS's relatively slow directory listing (5x-10x slower generating a
 dir listing than a raw SMB share), combined with FUSE's blocking readdir().
 I'm not positive on that last point since there was only one mention of that
 on the internet.   Am praying that somebody will see this and say, oh yeah,
 well sure, just change this one thing in FUSE and you're good to go!
 Somehow I don't think that's going to happen.  :)


GlusterFS uses readdirp() internally even if FUSE asks for readdir(), because
it makes generating the unique list of entry names in distribute much more
efficient and simple. This might be causing readdir ops themselves to be
slow. Doing an 'strace -Tc' on smbd can show you what percentage of time is
spent in getdents().
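
For example (attach to the smbd process serving the test client; Ctrl-C prints
the summary table):

    strace -Tc -p <smbd-pid>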

One test you could try is checking whether a pure replicated setup without
stat-prefetch has the same performance hit as a distributed (+replicated?)
setup. Both stat-prefetch and distribute upgrade readdirs into readdirps
(one for performance, the other for unique list generation).

Avati
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Tools for the admin

2011-07-18 Thread Mohit Anchlia
Or you can also use tr (e.g. tr -d ':') to remove the ':'

On Sun, Jul 17, 2011 at 8:17 PM, Whit Blauvelt
whit.glus...@transpect.com wrote:
 On Mon, Jul 18, 2011 at 03:48:11AM +0100, Dan Bretherton wrote:

 I had a closer look at this.  It is the output of gfid-mismatch
 causing the problem; paths are shown with a trailing colon as in
 GlusterFS log files.  The cut -f1 -d: to extract the paths
 obviously removes all the colons.  I'm sure there is an easy way to
 remove the trailing ':' from filenames but I can't think of one off
 hand (and it is 3:30AM).

 Something along the lines of sed 's/.$//', as in:

        dog=doggy:; echo $dog | sed 's/.$//'

 That would remove any last character. To just strip a trailing ':':

        dog=doggy:; echo $dog | sed 's/:$//'

 (No, I didn't know that. I googled.)

 Whit

 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] not sure how to troubleshoot SMB / CIFS overload when using GlusterFS

2011-07-18 Thread Whit Blauvelt
On Mon, Jul 18, 2011 at 09:49:42PM +0530, Anand Avati wrote:

 It is not FUSE blocking that is your problem. You need a quicker trick to
 achieve case insensitivity.

Could this help?

  5.4.2.1 case sensitive

  This share-level option, which has the obtuse synonym casesignames,
  specifies whether Samba should preserve case when resolving filenames in a
  specific share. The default value for this option is no, which is how
  Windows handles file resolution. If clients are using an operating system
  that takes advantage of case-sensitive filenames, you can set this
  configuration option to yes as shown here:
  
  [accounting]
case sensitive = yes
  
  Otherwise, we recommend that you leave this option set to its default.

From http://oreilly.com/catalog/samba/chapter/book/ch05_04.html

As I read that, "case sensitive = yes" tells Samba not to bother with any
case substitutions. That is, you may not need a "quicker trick to achieve
case insensitivity" if you just get rid of case insensitivity.

Whit

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Issue with Gluster Quota

2011-07-18 Thread Brian Smith
After further tests, it appears both now work after the update.  Seems
the attributes set while using the git source build on the 'brs'
directory were gummed up.  When I recreated the directory and remounted
with -o acl, ACLs worked and so did quota enforcement.  I'll keep
testing and post if anything else comes up.  So far, so good.

-Brian

Brian Smith
Senior Systems Administrator
IT Research Computing, University of South Florida
4202 E. Fowler Ave. ENB308
Office Phone: +1 813 974-1467
Organization URL: http://rc.usf.edu

On 07/18/2011 11:57 AM, Brian Smith wrote:
 Updated to the 3.2.2 release and found that Quotas do not work when
 using the POSIX-ACL translator.  In fact, after I disabled the ACLs, I
 had to remove the quota'd directory (presumably, to remove some
 attributes) and start over in order to get them to work.  Once I
 disabled ACLs and re-created my directory, quotas worked as expected.
 Is this a known limitation of using POSIX ACLs?  I happen to need both
 features, so that could pose an issue :)
 
 -Brian
 
 Brian Smith
 Senior Systems Administrator
 IT Research Computing, University of South Florida
 4202 E. Fowler Ave. ENB308
 Office Phone: +1 813 974-1467
 Organization URL: http://rc.usf.edu
 
 On 07/11/2011 12:48 PM, Brian Smith wrote:
 According to the logs, the last commit was:

 commit 5c20eb3bbf870edadd22d06babb5d38dad222533
 Author: shishir gowda shishi...@gluster.com
 Date:   Tue Jul 5 03:41:51 2011 +

 [root@gluster1 glusterfs-3.2git]# gluster volume quota home list
  path  limit_set  size
 --
 /brs   10485760 81965056

 [root@gluster1 glusterfs-3.2git]# gluster volume info

 Volume Name: home
 Type: Distribute
 Status: Started
 Number of Bricks: 2
 Transport-type: tcp,rdma
 Bricks:
 Brick1: gluster1:/glusterfs/home
 Brick2: gluster2:/glusterfs/home
 Options Reconfigured:
 features.limit-usage: /brs:10MB
 features.quota: on

 -Brian

 Brian Smith
 Senior Systems Administrator
 IT Research Computing, University of South Florida
 4202 E. Fowler Ave. ENB308
 Office Phone: +1 813 974-1467
 Organization URL: http://rc.usf.edu

 On 07/11/2011 08:29 AM, Saurabh Jain wrote:
 Hello Brian,


  I synced my gluster repository back to July 5th and tried quota on a
 certain dir of a distribute volume, and the quota was implemented properly on
 it; here are the logs:

[root@centos-qa-client-3 glusterfs]# /root/july6git/inst/sbin/gluster 
 volume quota dist list
 path  limit_set  size
 --
 /dir   10485760 10485760


 [root@centos-qa-client-3 glusterfs]# /root/july6git/inst/sbin/gluster 
 volume info

 Volume Name: dist
 Type: Distribute
 Status: Started
 Number of Bricks: 2
 Transport-type: tcp,rdma
 Bricks:
 Brick1: 10.1.12.134:/mnt/dist
 Brick2: 10.1.12.135:/mnt/dist
 Options Reconfigured:
 features.limit-usage: /dir:10MB
 features.quota: on
 [root@centos-qa-client-3 glusterfs]# 

 requesting you to please inform us about the commit id to which your 
 workspace is synced.

 Thanks,
 Saurabh
 
 From: gluster-users-boun...@gluster.org [gluster-users-boun...@gluster.org] 
 on behalf of gluster-users-requ...@gluster.org 
 [gluster-users-requ...@gluster.org]
 Sent: Friday, July 08, 2011 12:30 AM
 To: gluster-users@gluster.org
 Subject: Gluster-users Digest, Vol 39, Issue 13

 Send Gluster-users mailing list submissions to
 gluster-users@gluster.org

 To subscribe or unsubscribe via the World Wide Web, visit
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
 or, via email, send a message with subject or body 'help' to
 gluster-users-requ...@gluster.org

 You can reach the person managing the list at
 gluster-users-ow...@gluster.org

 When replying, please edit your Subject line so it is more specific
 than Re: Contents of Gluster-users digest...


 Today's Topics:

1. Re: Issue with Gluster Quota (Brian Smith)
2. Re: Issues with geo-rep (Carl Chenet)


 --

 Message: 1
 Date: Thu, 07 Jul 2011 13:10:06 -0400
 From: Brian Smith b...@usf.edu
 Subject: Re: [Gluster-users] Issue with Gluster Quota
 To: gluster-users@gluster.org
 Message-ID: 4e15e86e.6030...@usf.edu
 Content-Type: text/plain; charset=ISO-8859-1

 Sorry about that.  I re-populated with an 82MB dump from dd:

 [root@gluster1 ~]# gluster volume quota home list
 path  limit_set  size
 --
 /brs   10485760 81965056

 [root@gluster1 ~]# getfattr -m . -d -e hex /glusterfs/home/brs
 getfattr: Removing leading '/' from absolute path names
 # file: glusterfs/home/brs
 

[Gluster-users] GlusterFS v3.1.5 Stable Configuration

2011-07-18 Thread Remi Broemeling
Hi,

We've been using GlusterFS to manage shared files across a number of hosts
for the past few months and have run into a few problems -- roughly one
every month.  The problems are occasionally extremely difficult to
trace back to GlusterFS, as they often masquerade as something else in the
application log files that we have.  The problems have been one instance of
split-brain and then a number of instances of "stuck" files (i.e. any stat
calls would block for an hour and then time out with an error) as well as a
couple of instances of "ghost" files (the file is removed, but GlusterFS
continues to show it for a little while until the cache times out).

We do *not* place a large amount of load on GlusterFS, and don't have any
significant performance issues to deal with.  With that in mind, the core
question of this e-mail is: how can I modify our configuration to be the
absolute *most* stable (problem free) that it can be, even if it means
sacrificing performance?  In short, I don't have any particular performance
concerns at this moment, but the GlusterFS bugs that we encounter are quite
problematic -- so I'm willing to entertain any suggested stability
improvement, even if it has a negative impact on performance (I suspect that
the answer here is just "turn off all performance-enhancing gluster
caching", but I wanted to validate that this is actually true before going so
far).  Thus please suggest anything that could be done to improve the
stability of our setup -- as an aside, I think that this would be an
advantageous thing to add to the FAQ.  Right now the FAQ contains
information for *performance* tuning, but not for *stability* tuning.
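
If it helps frame answers: the "turn off the performance caching" experiment
would presumably look something like the following, using the translators
listed in the volfile below (please double-check that these are valid 'volume
set' keys for 3.1.5 before applying):

    gluster volume set shared-application-data performance.quick-read off
    gluster volume set shared-application-data performance.stat-prefetch off
    gluster volume set shared-application-data performance.io-cache off
    gluster volume set shared-application-data performance.read-ahead off
    gluster volume set shared-application-data performance.write-behind off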

Thanks for any help that you can give/suggestions that you can make.

Here are the details of our environment:

OS: RHEL5
GlusterFS Version: 3.1.5
Mount method: glusterfsd/FUSE
GlusterFS Servers: web01, web02
GlusterFS Clients: web01, web02, dj01, dj02

$ sudo gluster volume info

Volume Name: shared-application-data
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: web01:/var/glusterfs/bricks/shared
Brick2: web02:/var/glusterfs/bricks/shared
Options Reconfigured:
network.ping-timeout: 5
nfs.disable: on

Configuration File Contents:
/etc/glusterd/vols/shared-application-data/shared-application-data-fuse.vol:
volume shared-application-data-client-0
type protocol/client
option remote-host web01
option remote-subvolume /var/glusterfs/bricks/shared
option transport-type tcp
option ping-timeout 5
end-volume

volume shared-application-data-client-1
type protocol/client
option remote-host web02
option remote-subvolume /var/glusterfs/bricks/shared
option transport-type tcp
option ping-timeout 5
end-volume

volume shared-application-data-replicate-0
type cluster/replicate
subvolumes shared-application-data-client-0
shared-application-data-client-1
end-volume

volume shared-application-data-write-behind
type performance/write-behind
subvolumes shared-application-data-replicate-0
end-volume

volume shared-application-data-read-ahead
type performance/read-ahead
subvolumes shared-application-data-write-behind
end-volume

volume shared-application-data-io-cache
type performance/io-cache
subvolumes shared-application-data-read-ahead
end-volume

volume shared-application-data-quick-read
type performance/quick-read
subvolumes shared-application-data-io-cache
end-volume

volume shared-application-data-stat-prefetch
type performance/stat-prefetch
subvolumes shared-application-data-quick-read
end-volume

volume shared-application-data
type debug/io-stats
subvolumes shared-application-data-stat-prefetch
end-volume

/etc/glusterfs/glusterd.vol:
volume management
type mgmt/glusterd
option working-directory /etc/glusterd
option transport-type socket,rdma
option transport.socket.keepalive-time 10
option transport.socket.keepalive-interval 2
end-volume

-- 
Remi Broemeling
System Administrator
Clio - Practice Management Simplified
1-888-858-2546 x(2^5) | r...@goclio.com
www.goclio.com | blog: http://www.goclio.com/blog | twitter: http://www.twitter.com/goclio | facebook: http://www.facebook.com/goclio

   
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] Reminder: Monitoring GlusterFS Webinar is Tomorrow

2011-07-18 Thread John Mark Walker
Greetings - if you're curious about monitoring GlusterFS performance, be sure
to sign up for tomorrow's webinar. We will also post the recording online
should you not be able to make it.


Introducing Gluster for Geeks Technical Webinar Series

In this Gluster for Geeks technical webinar, Craig Carl, Senior Systems 
Engineer, will explain and demonstrate how to monitor your Gluster 
cluster for availability and performance.  

Register: https://www3.gotomeeting.com/register/542541630 


Topics covered will include:

- What services and logs to watch
- How to run baseline performance testing
- How to use Ganglia for ongoing performance monitoring

Craig will demonstrate how to use Ganglia to collate performance data from 
across a Gluster cluster and present it in a usable format. With time 
permitting we will demonstrate performance monitoring using Amazon CloudWatch 
and 
RightScale's monitoring tools.

 
Webinar: Monitoring GlusterFS 3.2

Tuesday, July 19 at 10am PT / 1pm ET / 6pm UK (London)

Speaker: Craig Carl, Senior Systems Engineer

This will be a 90-minute webinar, with the first hour dedicated to content
and the last 30 minutes scheduled for Q&A. We look forward to seeing
you there!  If you can't make it, register anyway and we'll send you the
recording.

Register: https://www3.gotomeeting.com/register/542541630 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] not sure how to troubleshoot SMB / CIFS overload when using GlusterFS

2011-07-18 Thread Ken Randall
Anand, Whit, and Joseph,

I appreciate your help very, very much.  Anand's assertion about Samba doing
string comparisons was spot on.  And Whit's suggestion to change smb.conf to
make the share case sensitive did the trick.  I am also forcing
"default case = lower" and "preserve case = no" in smb.conf to make sure
everything stays lower-case going in.
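
For the record, the relevant share section now looks roughly like this (share
name illustrative):

    [storage]
        case sensitive = yes
        default case = lower
        preserve case = no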

With those changes in place, I believe I can hear a song in the back of my
head: "It's a whole new world..."

On the web app side we will be writing a request handler that will
automatically lower-case any requests coming in, so any referenced images
and files will work no matter the casing specified.  (I've noticed that not
many Linux hosted sites and SaaS platforms handle casing well, not sure
why.)

We will have some users complain about case sensitivity not being maintained
on their files, but I think that the huge win for us being able to use
GlusterFS is worth it.  There are no great Windows solutions for
ever-expandable storage, and we're well past the published limitations of
Windows DFS-R.  DFS-R is an amazing, refined piece of technology, but it is
a solution for a different kind of problem.

Thanks again, guys, I never would have navigated to this solution without
you.

Ken
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] GlusterFS v3.1.5 Stable Configuration

2011-07-18 Thread Mohit Anchlia
On Mon, Jul 18, 2011 at 10:53 AM, Remi Broemeling r...@goclio.com wrote:
 Hi,

 We've been using GlusterFS to manage shared files across a number of hosts
 in the past few months and have ran into a few problems -- basically one
 every month, roughly.  The problems are occasionally extremely difficult to
 track down to GlusterFS, as they often masquerade as something else in the
 application log files that we have.  The problems have been one instance of
 split-brain and then a number of instances of stuck files (i.e. any stat
 calls would block for an hour and then timeout with an error) as well as a
 couple instances of ghost files (remove the file, but GlusterFS continues
 to show it for a little while until the cache times out).

 We do not place a large amount of load on GlusterFS, and don't have any
 significant performance issues to deal with.  With that in mind, the core
 question of this e-mail is: How can I modify our configuration to be the
 absolute most stable (problem free) that it can be, even if it means
 sacrificing performance?  In sum, I don't have any particular performance

It depends on the kind of bugs or issues you are encountering. There might
be a solution for some bugs but not for others.

 concerns at this moment, but the GlusterFS bugs that we encounter are quite
 problematic -- so I'm willing to entertain any suggested stability
 improvement, even if it has a negative impact on performance (I suspect that
 the answer here is just turn off all performance-enhancing gluster
 caching, but I wanted to validate that is actually true before going so
 far).  Thus please suggest anything that could be done to improve the
 stability of our setup -- as an aside, I think that this would be an
 advantageous thing to add to the FAQ.  Right now the FAQ contains
 information for performance tuning, but not for stability tuning.

 Thanks for any help that you can give/suggestions that you can make.

 Here are the details of our environment:

 OS: RHEL5
 GlusterFS Version: 3.1.5
 Mount method: glusterfsd/FUSE
 GlusterFS Servers: web01, web02
 GlusterFS Clients: web01, web02, dj01, dj02

 $ sudo gluster volume info

 Volume Name: shared-application-data
 Type: Replicate
 Status: Started
 Number of Bricks: 2
 Transport-type: tcp
 Bricks:
 Brick1: web01:/var/glusterfs/bricks/shared
 Brick2: web02:/var/glusterfs/bricks/shared
 Options Reconfigured:
 network.ping-timeout: 5
 nfs.disable: on

 Configuration File Contents:
 /etc/glusterd/vols/shared-application-data/shared-application-data-fuse.vol
 volume shared-application-data-client-0
     type protocol/client
     option remote-host web01
     option remote-subvolume /var/glusterfs/bricks/shared
     option transport-type tcp
     option ping-timeout 5
 end-volume

 volume shared-application-data-client-1
     type protocol/client
     option remote-host web02
     option remote-subvolume /var/glusterfs/bricks/shared
     option transport-type tcp
     option ping-timeout 5
 end-volume

 volume shared-application-data-replicate-0
     type cluster/replicate
     subvolumes shared-application-data-client-0
 shared-application-data-client-1
 end-volume

 volume shared-application-data-write-behind
     type performance/write-behind
     subvolumes shared-application-data-replicate-0
 end-volume

 volume shared-application-data-read-ahead
     type performance/read-ahead
     subvolumes shared-application-data-write-behind
 end-volume

 volume shared-application-data-io-cache
     type performance/io-cache
     subvolumes shared-application-data-read-ahead
 end-volume

 volume shared-application-data-quick-read
     type performance/quick-read
     subvolumes shared-application-data-io-cache
 end-volume

 volume shared-application-data-stat-prefetch
     type performance/stat-prefetch
     subvolumes shared-application-data-quick-read
 end-volume

 volume shared-application-data
     type debug/io-stats
     subvolumes shared-application-data-stat-prefetch
 end-volume

 /etc/glusterfs/glusterd.vol
 volume management
     type mgmt/glusterd
     option working-directory /etc/glusterd
     option transport-type socket,rdma
     option transport.socket.keepalive-time 10
     option transport.socket.keepalive-interval 2
 end-volume

 --
 Remi Broemeling
 System Administrator
 Clio - Practice Management Simplified
 1-888-858-2546 x(2^5) | r...@goclio.com
 www.goclio.com | blog | twitter | facebook




 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] Questions on Replication and Design

2011-07-18 Thread Brian Smith
Hi, all,

We're looking to replace our DRBD/Ext4/NFS file storage configuration with
RHEL cluster w/ GlusterFS 3.2.2 (or greater, depending on the
timeline).  Currently, our configuration consists of two four-node clusters
at different sites with:

1. 8 @ 1.5TB LVM LVs on top of an HP-P2000 storage array
2. Each LV is mirrored to an identical LV on a cluster at a remote site
using DRBD
3. Each LV/DRBD is an Ext4 volume and an NFS mount-point
4. DRBD is started as primary in one site and secondary in the remote
site and can be switched easily in the event of a failure.
5. RHEL cluster w/ some patches runs DRBD, floating IP, ext4, NFS as a
service for each of the 8 mounts.

We went this route because it was the only way to get POSIX-ACL support
and a working Quota implementation w/ replication of a large-ish volume
without spending incredible amounts of money.  With GlusterFS 3.2.2,
these are both supported features.  My proposed layout for the new
configuration would look like so:

1. 8 @ 1.5TB LVM LVs on top of an HP-P2000 storage array
2. Each LV is an Ext4 FS
3. RHEL cluster runs a glusterd instance, floating IP, and ext4 mount
for each of the 8 LVs.
4. Each of the 8 LVs is configured with a replicated pair in our remote
site while they distribute across the local site.  For instance:

site1-node1: site2-node1:
  gluster1:/glusterfs   --   gluster9:/glusterfs
  gluster2:/glusterfs   --   gluster10:/glusterfs
site1-node2:
  gluster3:/glusterfs   --   gluster11:/glusterfs
  gluster4:/glusterfs   --   gluster12:/glusterfs
site1-node3:
  gluster5:/glusterfs   --   gluster13:/glusterfs
  gluster6:/glusterfs   --   gluster14:/glusterfs
site1-node4:
  gluster7:/glusterfs   --   gluster15:/glusterfs
  gluster8:/glusterfs   --   gluster16:/glusterfs


                     Distributed
     ------------------------------------------
       |          |           |           |
     client     client      client      client
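
Expressed as a single distributed-replicated volume, the pairing above would
presumably be created along these lines (volume name illustrative; with
"replica 2", each consecutive pair of bricks forms a replica set):

    gluster volume create shared replica 2 transport tcp \
        gluster1:/glusterfs gluster9:/glusterfs \
        gluster2:/glusterfs gluster10:/glusterfs \
        gluster3:/glusterfs gluster11:/glusterfs \
        gluster4:/glusterfs gluster12:/glusterfs \
        gluster5:/glusterfs gluster13:/glusterfs \
        gluster6:/glusterfs gluster14:/glusterfs \
        gluster7:/glusterfs gluster15:/glusterfs \
        gluster8:/glusterfs gluster16:/glusterfs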

We will still use RHEL cluster to facilitate HA and failover of the
glusterd/ip/fs instances on each cluster site.

What say the experts about this approach and what caveats/issues should
I be looking out for?  I'll be building a test environment, but was
wondering, before I start, whether this is a supportable configuration
in the event we decide to get support, etc.

Many thanks in advance!

-Brian

-- 
Brian Smith
Senior Systems Administrator
IT Research Computing, University of South Florida
4202 E. Fowler Ave. ENB308
Office Phone: +1 813 974-1467
Organization URL: http://rc.usf.edu
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Tools for the admin

2011-07-18 Thread Vikas Gorur
On 17 July 2011 18:05, Dan Bretherton d.a.brether...@reading.ac.uk wrote:

Dear Vikas-



 Thanks for providing these tools.  Unfortunately I think I have found a
 problem with the procedure outlined in the README - I don't think it
 works for files with names containing the colon character.  I still have
 a lot of gfid errors in my logs after running the gfid tools on one
 volume, and all the filenames have one or more ':' characters.  There
 are 1677 files still showing differing gfids, so I don't think it
 can be a coincidence.


Thanks for pointing this out, Dan. There was some urgency in writing the
tool and I forgot to document that it wouldn't handle files with a : in
them. It'll be fixed soon.

-- 
Vikas Gorur
Engineer - Gluster
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users