Re: [Gluster-users] Is it possible to install Gluster management console using manual install process?

2011-11-09 Thread phil cryer
On Wed, Nov 9, 2011 at 10:05 AM, Jérémie Tarot silopo...@gmail.com wrote:
 Hi,

 2011/11/8 Bala.FA b...@gluster.com:
 Hi Xybrek,

 Gluster Management Console is not available for download now.  It will
 be released soon.


 Is it the same management console that was available on the appliance ?
 Is there somewhere to get an idea of the features of the MC ? Screenshots ?
 Last, will the MC be specific to RH/Fedora/CentOS or will it be
 available for other distros ?

 Thanks
 Jé

I have the same question, but want to know if it will be available for
Debian (or if I can build it from source). I've had the cluster
running fine for a long time, but having a web-based console to check
on the status of all of the disks would be killer.

Thanks

P
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] cannot access /mnt/glusterfs: Stale NFS file handle

2011-09-12 Thread phil cryer
I've mounted my glusterfs share as I always do:
mount -t glusterfs `hostname`:/bhl-volume /mnt/glusterfs

and I can see it in df:
# df -h | tail -n1
clustr-01:/bhl-volume90T   51T   39T  57% /mnt/glusterfs

but I can't change into it, or access any of the files in it:
# ls -al /mnt/glusterfs
ls: cannot access /mnt/glusterfs: Stale NFS file handle

Any idea what could be causing this? It was working fine last week (in
fact I haven't remounted it in months and have had clients accessing
it constantly), but we did do a reboot across all 6 of the nodes over
the weekend.
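
A stale FUSE mount like this can usually be cleared with a lazy unmount
followed by a remount (a sketch, reusing the mount command above):

# umount -l /mnt/glusterfs
# mount -t glusterfs `hostname`:/bhl-volume /mnt/glusterfs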

details and version numbers:
# uname -a; glusterfs -V
Linux clustr-01 2.6.32-5-amd64 #1 SMP Tue Jun 14 09:42:28 UTC 2011
x86_64 GNU/Linux
glusterfs 3.1.2 built on Jan 16 2011 18:14:56
Repository revision: v3.1.1-64-gf2a067c
Copyright (c) 2006-2010 Gluster Inc. http://www.gluster.com
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU
Affero General Public License.

P
-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] anyone else playing with gcollect yet?

2011-07-27 Thread phil cryer
This looks interesting, but is there any tie-in with collectd
(http://collectd.org/)? I'm currently using ganglia, but only on the
cluster; I want to run collectd and have it consolidate data to the
central monitoring server. Wondering if gcollect could do this
too... could call it gcollectd then :)

Thanks

P

On Wed, Jul 27, 2011 at 9:40 AM,  greg_sw...@aotx.uscourts.gov wrote:
 gluster-users-boun...@gluster.org wrote on 07/27/2011 07:40:13 AM:


 I'm messing with it and had to do a few patches to get rid of
 warnings/errors on my system (it threw lots of warnings because of my
 configured options on volumes and there was a traceback due to a typo),
 but now it just returns empty with a return code of 0.


 so for the sake of discussion... reasons it is returning blank:

 1: the iterator only goes through number of bricks-1 when evaluating the
 bricks, many times missing the local brick because I happen to be testing on
 the last node in the list.  patched here:
 https://github.com/gregswift/Gluster/commit/a16567b5149aea2ddbec1e61d6b9a8e8e3b10e76

 2: hostname check doesn't work on my system because I don't use
 hostnames ;)  yes yes.. i usually love dns... please don't fight me on this
 one. working on patch

 -greg

 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users




-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] rsync for WAN replication (active/active)

2011-03-24 Thread phil cryer
On Thu, Mar 24, 2011 at 12:31 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
 Thanks for pointing that out. I think rsync also has options to sync
 based on time, md5 hash and other attributes, if I am not wrong. If we
 can preserve times and only sync the latest file then I think we
 should be OK? What do you think? I can't think of any other option
 other than looking at some other DFS systems. We definitely don't want
 to add the remote site as a brick because of the latency that we have.
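
As a sketch of the rsync options being discussed (the paths and host are
placeholders): -a preserves times and permissions, -u skips files that are
newer on the receiver, and --temp-dir stages transfers in a separate directory:

rsync -au --temp-dir=/data/.rsync-tmp /export/share/ remote-site:/export/share/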

 On Thu, Mar 24, 2011 at 5:31 AM, Jonathan Barber
 jonathan.bar...@gmail.com wrote:
 On 17 March 2011 17:08, Mohit Anchlia mohitanch...@gmail.com wrote:
 Thanks! I was going to trigger it through cron say every 10 mts. if
 rsync is not currently running.

 Regarding point 3), I thought of it also! I think this problem cannot
 be solved even when using bricks. If someone is editing 2 files at the
 same time, only one will win (always). The only way we can avoid this is
 through the application making sure that a customer accessing the file can't
 go to 2 sites simultaneously. But I agree this scenario is the most
 complicated of all.

 This is a different issue; with gluster locking solves it (obviously
 the application has to know how to handle locks). Also, and I don't
 know if gluster supports this, some systems support byte range file
 locks, so both sites can write to the same file at the same time.

 The scenario I was trying to describe was a race condition between the
 rsync processes clobbering your files. I don't think this race
 condition is removed by using the --temp-dir option (although it
 probably decreases the window by a large amount). But if you don't run
 the sync process whilst the remote site is sync'ing to you, then it's
 not a problem.

 I was planning to use --temp-dir option (not tested it). Also I think
 rsync first copies the file as temporary files and then moves it.

 I just thought of another problem; which is that in the worst case you
 might require twice the amount of storage to sync your data (1x for
 the old data, 1x for the new data).

 In our case rsync will not handle deletes. If we want to delete any
 files it will be done manually.

Nice thread; I've heard this come up a few times in regard to
Gluster, and it relates to a project I'm working on. Basically I use a
server/client setup built on rsync, with inotify kicking off the sync
once changes are seen. One box acts as the server and all the others
are clients. This way, when clients have new or changed files, those
changes are sync'd to the server; when files are removed on a client,
that deletion is likewise only sync'd to the server. A separate cron
job run on the clients then syncs with the server to learn about the
files each client needs to delete from its own store.
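
The pattern is roughly the following (a minimal sketch of the idea, not the
actual lipsync code; the path and host are placeholders):

#!/bin/sh
# watch the local tree and push every change to the central server;
# coarse but simple: each event triggers a full rsync of the tree
inotifywait -m -r -e close_write -e create -e delete --format '%w%f' /srv/share | \
while read changed; do
    rsync -az --delete /srv/share/ syncserver:/srv/share/
done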

It's definitely a work in progress, but the more people I talk to, the
more I think this is needed. I will have it running on my gluster
cluster soon to sync it with another (non-gluster) cluster in another
country. If you're interested, or have a better idea :), the project is
hosted here: https://github.com/philcryer/lipsync

Thanks

P
-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Debian, 3.1.1, duplicate files

2011-02-11 Thread phil cryer
On Thu, Jan 13, 2011 at 3:53 PM, Jacob Shucart ja...@gluster.com wrote:
 Phil,

 This sounds to me like a known issue that affects Gluster directories
 created under older versions, related to extended attributes that were
 set on the directories.  I believe this issue is supposed to be fixed in
 3.1.2.  I don't know how large your dataset is, but a way to fix it would be
 to:

 1. Delete the Gluster volume.
 2. On the back end directories on your nodes, scrub the offending extended
 attribute with the command:
        find /back/end/dir -exec setfattr -x trusted.gfid {} \;
 3. Create the Gluster volume again.
 4. Mount the volume somewhere as GlusterFS (mount -t glusterfs) and run:
        find /mnt/gluster -print0 | xargs --null stat
 5. Enjoy.
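
For reference, a quick way to confirm whether the trusted.gfid attribute is
actually present on a backend directory before and after the scrub (a sketch;
run as root on a node, path as in step 2):

# getfattr -d -m . -e hex /back/end/dir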

Jacob
Thanks for your reply. To solve this I installed 3.1.2, then
re-formatted all of my drives (bricks). It might have been overkill,
but I wanted to start completely fresh with 3.1.2. So far we've had
no issues with the setup, and I'll be careful from now on when I
update versions; hopefully there will be an upgrade path that avoids
gotchas like this!

Thanks

P


 Please let me know if that helps.  Thank you.

 -Jacob

 -Original Message-
 From: gluster-users-boun...@gluster.org
 [mailto:gluster-users-boun...@gluster.org] On Behalf Of phil cryer
 Sent: Thursday, January 13, 2011 9:07 AM
 To: gluster-users@gluster.org
 Subject: Re: [Gluster-users] Debian, 3.1.1, duplicate files

 I haven't heard anything back, so I just wanted to update this in
 case anyone else comes across it. This was an old store that
 we created in 3.0.4 that kept getting duplicate files. Basically we
 ran an update script that would use wget to try to download any files
 that were not present on the local box but were on the remote. Of
 course, if it downloaded the same file again it would either 1) ignore
 it and not download it because it would see that we already have it, 2)
 overwrite that file (clobber) with a new version of that file, or 3)
 rewrite the file as file.1 so as not to mess with the original one
 (no-clobber) - but in fact it did none of these, and instead we ended
 up with the bizarre feature of having multiple/identical files in the
 same directory. Meanwhile we're also using far more space than we
 should have (~70TB instead of ~40TB or so) thanks to having
 directories like this:

 # ls -al /mnt/glusterfs//www/t/tijdschriftvoore1951nede/
 total 536436
 drwxr-xr-x    2 www-data www-data    294912 Jan 13 10:05 .
 drwx------ 1016 www-data www-data   3846144 Dec 12 11:10 ..
 -rwxr-xr-x    1 www-data www-data   1151282 Jul 12  2010
 tijdschriftvoore1951nede_djvu.txt
 -rwxr-xr-x    1 www-data www-data   1151282 Jul 12  2010
 tijdschriftvoore1951nede_djvu.txt
 -rwxr-xr-x    1 www-data www-data  12078834 Jul 12  2010
 tijdschriftvoore1951nede_djvu.xml
 -rwxr-xr-x    1 www-data www-data  12078834 Jul 12  2010
 tijdschriftvoore1951nede_djvu.xml
 -rwxr-xr-x    1 www-data www-data    271733 Jul 12  2010
 tijdschriftvoore1951nede.gif
 -rwxr-xr-x    1 www-data www-data    271733 Jul 12  2010
 tijdschriftvoore1951nede.gif
 -rwxr-xr-x    1 www-data www-data 257779301 Jul 12  2010
 tijdschriftvoore1951nede_jp2.zip
 -rwxr-xr-x    1 www-data www-data 257779301 Jul 12  2010
 tijdschriftvoore1951nede_jp2.zip
 -rwxr-xr-x    1 www-data www-data      2278 Jul 12  2010
 tijdschriftvoore1951nede_marc.xml
 -rwxr-xr-x    1 www-data www-data      2278 Jul 12  2010
 tijdschriftvoore1951nede_marc.xml
 -rwxr-xr-x    1 www-data www-data       720 Jul 12  2010
 tijdschriftvoore1951nede_meta.mrc
 -rwxr-xr-x    1 www-data www-data       720 Jul 12  2010
 tijdschriftvoore1951nede_meta.mrc
 -rwxr-xr-x    1 www-data www-data    546411 Jul 12  2010
 tijdschriftvoore1951nede_names.xml
 -rwxr-xr-x    1 www-data www-data    546411 Jul 12  2010
 tijdschriftvoore1951nede_names.xml
 -rwxr-xr-x    1 www-data www-data       256 Jul 12  2010
 tijdschriftvoore1951nede_names.xml_meta.txt
 -rwxr-xr-x    1 www-data www-data       256 Jul 12  2010
 tijdschriftvoore1951nede_names.xml_meta.txt
 -rwxr-xr-x    1 www-data www-data    257556 Jul 13  2010
 tijdschriftvoore1951nede_scandata.xml
 -rwxr-xr-x    1 www-data www-data    257556 Jul 13  2010
 tijdschriftvoore1951nede_scandata.xml

 Ouch, right? So I installed 3.1.1; that went well, and I got it on all
 the drives and servers we had before, so we have a total capacity of 96TB
 again, and all seems to be working. I mounted the old directories,
 saw the same issue with the duplicate files, and let it sit overnight
 to see if it would notice this and try to fix things. Then we're
 seeing gluster logs saying things like:

 == glusterfs/mnt-glusterfs.log ==
 [2011-01-13 11:46:23.2762] I [afr-common.c:662:afr_lookup_done]
 bhl-volume-replicate-55: entries are missing in lookup of
 /www/t/tijdschriftvoore1951nede.
 [2011-01-13 11:46:23.2817] I [afr-common.c:716:afr_lookup_done]
 bhl-volume-replicate-55: background  meta-data data entry self-heal
 triggered. path

[Gluster-users] Running Storage Platform to admin normal GlusterFS instances?

2011-02-08 Thread phil cryer
I'm running GlusterFS 3.1.2 on some nodes, and they're all working.
Can I now run the Gluster Storage Platform on another server and
administer those existing nodes with it, or do you have to have SP
on all the servers, admin and nodes alike?

Thanks

P
-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] 3.1.2 Debian - client_rpc_notify failed to get the port number for remote subvolume

2011-02-04 Thread phil cryer
On Fri, Feb 4, 2011 at 12:33 PM, Anand Avati anand.av...@gmail.com wrote:
 It is very likely the brick process is failing to start. Please look at the
 brick log on that server. (in /var/log/glusterfs/bricks/* )
 Avati

Thanks, so if I'm looking at it right, the 'bhl-volume-client-98' is
really Brick98: clustr-02:/mnt/data17 - I'm figuring that from this:

 [2011-02-04 13:09:28.407300] I [client.c:1590:client_rpc_notify]
 bhl-volume-client-98: disconnected

 However, if I do a gluster volume info I see that it's listed:
 # gluster volume info | grep 98
 Brick98: clustr-02:/mnt/data17

But on that server I don't see any issues with that brick starting:

# head mnt-data17.log -n50
[2011-02-03 23:29:24.235648] W [graph.c:274:gf_add_cmdline_options]
bhl-volume-server: adding option 'listen-port' for volume
'bhl-volume-server' with value '24025'
[2011-02-03 23:29:24.236017] W
[rpc-transport.c:566:validate_volume_options] tcp.bhl-volume-server:
option 'listen-port' is deprecated, preferred is
'transport.socket.listen-port', continuing with correction
Given volfile:
+--+
  1: volume bhl-volume-posix
  2: type storage/posix
  3: option directory /mnt/data17
  4: end-volume
  5:
  6: volume bhl-volume-access-control
  7: type features/access-control
  8: subvolumes bhl-volume-posix
  9: end-volume
 10:
 11: volume bhl-volume-locks
 12: type features/locks
 13: subvolumes bhl-volume-access-control
 14: end-volume
 15:
 16: volume bhl-volume-io-threads
 17: type performance/io-threads
 18: subvolumes bhl-volume-locks
 19: end-volume
 20:
 21: volume /mnt/data17
 22: type debug/io-stats
 23: subvolumes bhl-volume-io-threads
 24: end-volume
 25:
 26: volume bhl-volume-server
 27: type protocol/server
 28: option transport-type tcp
 29: option auth.addr./mnt/data17.allow *
 30: subvolumes /mnt/data17
 31: end-volume

+--+
[2011-02-03 23:29:28.575630] I
[server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
client from 128.128.164.219:724
[2011-02-03 23:29:28.583169] I
[server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
client from 127.0.1.1:985
[2011-02-03 23:29:28.603357] I
[server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
client from 128.128.164.218:726
[2011-02-03 23:29:28.605650] I
[server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
client from 128.128.164.217:725
[2011-02-03 23:29:28.608033] I
[server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
client from 128.128.164.215:725
[2011-02-03 23:29:31.161985] I
[server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
client from 128.128.164.74:697
[2011-02-04 00:40:11.600314] I
[server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
client from 128.128.164.74:805

Plus, looking at the tail of this log, it's still working; these are the
latest messages (from 4 seconds ago) as I'm moving some things on the
cluster:

[2011-02-04 23:13:35.53685] W [server-resolve.c:565:server_resolve]
bhl-volume-server: pure path resolution for
/www/d/dasobstdertropen00schrrich (INODELK)
[2011-02-04 23:13:35.57107] W [server-resolve.c:565:server_resolve]
bhl-volume-server: pure path resolution for
/www/d/dasobstdertropen00schrrich (SETXATTR)
[2011-02-04 23:13:35.59699] W [server-resolve.c:565:server_resolve]
bhl-volume-server: pure path resolution for
/www/d/dasobstdertropen00schrrich (INODELK)
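
One extra sanity check on clustr-02 would be to confirm that the brick process
for /mnt/data17 is running and bound to the port from the volfile above
(24025); a sketch:

# ps aux | grep 'glusterfsd.*data17'
# netstat -tlnp | grep 24025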

Thanks!

P




 On Fri, Feb 4, 2011 at 10:19 AM, phil cryer p...@cryer.us wrote:

 I have glusterfs 3.1.2 running on Debian; I'm able to start the volume
 and mount it via mount -t glusterfs, and I can see everything. I am
 still seeing the following error in /var/log/glusterfs/nfs.log:

 [2011-02-04 13:09:16.404851] E
 [client-handshake.c:1079:client_query_portmap_cbk]
 bhl-volume-client-98: failed to get the port number for remote
 subvolume
 [2011-02-04 13:09:16.404909] I [client.c:1590:client_rpc_notify]
 bhl-volume-client-98: disconnected
 [2011-02-04 13:09:20.405843] E
 [client-handshake.c:1079:client_query_portmap_cbk]
 bhl-volume-client-98: failed to get the port number for remote
 subvolume
 [2011-02-04 13:09:20.405938] I [client.c:1590:client_rpc_notify]
 bhl-volume-client-98: disconnected
 [2011-02-04 13:09:24.406634] E
 [client-handshake.c:1079:client_query_portmap_cbk]
 bhl-volume-client-98: failed to get the port number for remote
 subvolume
 [2011-02-04 13:09:24.406711] I [client.c:1590:client_rpc_notify]
 bhl-volume-client-98: disconnected
 [2011-02-04 13:09:28.407249] E
 [client-handshake.c:1079:client_query_portmap_cbk]
 bhl-volume-client-98: failed to get the port number for remote
 subvolume
 [2011-02-04 13:09:28.407300] I [client.c:1590:client_rpc_notify]
 bhl-volume-client-98: disconnected

 However, if I do a gluster volume info I see that it's listed:
 # gluster volume

Re: [Gluster-users] df causes hang

2011-02-03 Thread phil cryer
That wasn't my issue, but I'm still having the problem. Today I purged
glusterfs 3.1.1 and installed 3.1.2 fresh from the deb. I recreated my
volume, started it, and everything was going fine; I mounted the share, then
ran df -h to see it, and now every few seconds my log posts this:

== /var/log/glusterfs/nfs.log ==
[2011-02-03 15:55:57.145626] E
[client-handshake.c:1079:client_query_portmap_cbk]
bhl-volume-client-98: failed to get the port number for remote
subvolume
[2011-02-03 15:55:57.145694] I [client.c:1590:client_rpc_notify]
bhl-volume-client-98: disconnected

== /var/log/glusterfs/mnt-glusterfs.log ==
[2011-02-03 15:55:57.605802] E [common-utils.c:124:gf_resolve_ip6]
resolver: getaddrinfo failed (Name or service not known)
[2011-02-03 15:55:57.605834] E
[name.c:251:af_inet_client_get_remote_sockaddr] glusterfs: DNS
resolution failed on host /etc/glusterfs/glusterfs.vol

over and over. Any clues as to how I can fix this? This one issue has
made our entire 100TB store unusable.

and again, gluster volume info shows all the bricks are OK, including 98:

gluster volume info

Volume Name: bhl-volume
Type: Distributed-Replicate
Status: Started
Number of Bricks: 72 x 2 = 144
Transport-type: tcp
Bricks:
[...]
Brick92: clustr-02:/mnt/data16
Brick93: clustr-03:/mnt/data16
Brick94: clustr-04:/mnt/data16
Brick95: clustr-05:/mnt/data16
Brick96: clustr-06:/mnt/data16
Brick97: clustr-01:/mnt/data17
Brick98: clustr-02:/mnt/data17
Brick99: clustr-03:/mnt/data17
Brick100: clustr-04:/mnt/data17
Brick101: clustr-05:/mnt/data17
Brick102: clustr-06:/mnt/data17
Brick103: clustr-01:/mnt/data18
Brick104: clustr-02:/mnt/data18
Brick105: clustr-03:/mnt/data18
[...]


P


On Mon, Jan 31, 2011 at 4:26 PM, Anand Avati anand.av...@gmail.com wrote:
 Can you post your server logs? What happens if you run 'df -k' on your
 backend export filesystems?

 Thanks
 Avati

 On Mon, Jan 17, 2011 at 5:27 AM, Joe Warren-Meeks
 j...@encoretickets.co.ukwrote:


  (sorry about top-posting.)

 Just changing the timeout would only mask the problem. The real issue is
 that running 'df' on either node causes a hang.

 All other operations seem fine, files can be created and deleted as
 normal with the results showing up on both.

 I'd like to work out why it's hanging on df so I can fix it and get my
 monitoring and cron scripts running again :)

  -- joe.

 -Original Message-
 From: gluster-users-boun...@gluster.org
 [mailto:gluster-users-boun...@gluster.org] On Behalf Of Daniel Maher
 Sent: 17 January 2011 12:48
 To: gluster-users@gluster.org
 Subject: Re: [Gluster-users] df causes hang

 On 01/17/2011 10:47 AM, Joe Warren-Meeks wrote:
  Hey chaps,
 
  Anyone got any pointers as to what this might be? This is still
 causing
  a lot of problems for us whenever we attempt to do df.
 
    -- joe.
 
  -Original Message-

  However, for some reason, they've got into a bit of a state such that
  typing 'df -k' causes both to hang, resulting in a loss of service for 42
  seconds. I see the following messages in the log files:
 
 

 42 seconds is the default tcp timeout time for any given node - you
 could try tuning that down and seeing how it works for you.

 http://www.gluster.com/community/documentation/index.php/Gluster_3.1:_Setting_Volume_Options
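
For reference, the 42-second default mentioned above corresponds to the
network.ping-timeout volume option, which can be lowered per volume (a
sketch; the volume name is a placeholder):

# gluster volume set VOLNAME network.ping-timeout 20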


 --
 Daniel Maher dma+gluster AT witbe DOT net
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users





-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] df causes hang

2011-02-03 Thread phil cryer
Avati - thanks for your reply, my comments below

 [name.c:251:af_inet_client_get_remote_sockaddr] glusterfs: DNS
 resolution failed on host /etc/glusterfs/glusterfs.vol

 Please make sure you are able to resolve hostnames as given in volume info
 in all of your servers via 'dig'. The logs clearly show that host resolution
 seems to be failing.

Agreed; however, that doesn't seem to be the issue, because I can dig the
host (they're all defined in my hosts file too, so it doesn't have to
look them up) named clustr-02, and in fact there are 23 other 'bricks'
on that host that are working fine:

# gluster volume info | grep clustr-02
Brick2: clustr-02:/mnt/data01
Brick8: clustr-02:/mnt/data02
Brick14: clustr-02:/mnt/data03
Brick20: clustr-02:/mnt/data04
Brick26: clustr-02:/mnt/data05
Brick32: clustr-02:/mnt/data06
Brick38: clustr-02:/mnt/data07
Brick44: clustr-02:/mnt/data08
Brick50: clustr-02:/mnt/data09
Brick56: clustr-02:/mnt/data10
Brick62: clustr-02:/mnt/data11
Brick68: clustr-02:/mnt/data12
Brick74: clustr-02:/mnt/data13
Brick80: clustr-02:/mnt/data14
Brick86: clustr-02:/mnt/data15
Brick92: clustr-02:/mnt/data16
Brick98: clustr-02:/mnt/data17
Brick104: clustr-02:/mnt/data18
Brick110: clustr-02:/mnt/data19
Brick116: clustr-02:/mnt/data20
Brick122: clustr-02:/mnt/data21
Brick128: clustr-02:/mnt/data22
Brick134: clustr-02:/mnt/data23
Brick140: clustr-02:/mnt/data24

I logged into that host, unmounted that mount, ran fsck.ext4 on it,
but it came back clean.

Another thing: the log says glusterfs: DNS resolution failed on host
/etc/glusterfs/glusterfs.vol - however, there is obviously no host
named /etc/glusterfs/glusterfs.vol - does this point to an issue?

And lastly, I don't even have a file named /etc/glusterfs/glusterfs.vol:

ls -ls /etc/glusterfs
-rw-r--r-- 1 root root  229 Jan 16 21:15 glusterd.vol
-rw-r--r-- 1 root root 1908 Jan 16 21:15 glusterfsd.vol.sample
-rw-r--r-- 1 root root 2005 Jan 16 21:15 glusterfs.vol.sample

I created all of the configs via the gluster commandline tool.
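
For what it's worth, the 3.1-style native mount takes a server:/volume pair
rather than a volfile path (as in the mount commands used elsewhere in these
threads); if a client were started with /etc/glusterfs/glusterfs.vol where a
server name is expected, a resolver error of exactly that shape would be
expected. A sketch using the hostnames and volume name from the volume info
above:

# mount -t glusterfs clustr-01:/bhl-volume /mnt/glusterfs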

Thanks

P




On Thu, Feb 3, 2011 at 6:39 PM, Anand Avati anand.av...@gmail.com wrote:
 Please make sure you are able to resolve hostnames as given in volume info
 in all of your servers via 'dig'. The logs clearly show that host resolution
 seems to be failing.
 Avati

 On Thu, Feb 3, 2011 at 1:08 PM, phil cryer p...@cryer.us wrote:

 This wasn't my issue, but I'm still having the issue. Today I purged
 glusterfs 3.1.1 and installed 3.1.2 fresh from deb. I recreated my
 volume, started it, everything was going fine, mounted the share, then
 ran df -h to see it, now every few seconds my logs posts this:

 == /var/log/glusterfs/nfs.log ==
 [2011-02-03 15:55:57.145626] E
 [client-handshake.c:1079:client_query_portmap_cbk]
 bhl-volume-client-98: failed to get the port number for remote
 subvolume
 [2011-02-03 15:55:57.145694] I [client.c:1590:client_rpc_notify]
 bhl-volume-client-98: disconnected

 == /var/log/glusterfs/mnt-glusterfs.log ==
 [2011-02-03 15:55:57.605802] E [common-utils.c:124:gf_resolve_ip6]
 resolver: getaddrinfo failed (Name or service not known)
 [2011-02-03 15:55:57.605834] E
 [name.c:251:af_inet_client_get_remote_sockaddr] glusterfs: DNS
 resolution failed on host /etc/glusterfs/glusterfs.vol

 over and over. Any clues as to how I can fix this? This one issue has
 made our entire 100TB store unusable.

 and again, gluster volume info shows all the bricks are OK, including 98:

 gluster volume info

 Volume Name: bhl-volume
 Type: Distributed-Replicate
 Status: Started
 Number of Bricks: 72 x 2 = 144
 Transport-type: tcp
 Bricks:
 [...]
 Brick92: clustr-02:/mnt/data16
 Brick93: clustr-03:/mnt/data16
 Brick94: clustr-04:/mnt/data16
 Brick95: clustr-05:/mnt/data16
 Brick96: clustr-06:/mnt/data16
 Brick97: clustr-01:/mnt/data17
 Brick98: clustr-02:/mnt/data17
 Brick99: clustr-03:/mnt/data17
 Brick100: clustr-04:/mnt/data17
 Brick101: clustr-05:/mnt/data17
 Brick102: clustr-06:/mnt/data17
 Brick103: clustr-01:/mnt/data18
 Brick104: clustr-02:/mnt/data18
 Brick105: clustr-03:/mnt/data18
 [...]


 P


 On Mon, Jan 31, 2011 at 4:26 PM, Anand Avati anand.av...@gmail.com
 wrote:
  Can you post your server logs? What happens if you run 'df -k' on your
  backend export filesystems?
 
  Thanks
  Avati
 
  On Mon, Jan 17, 2011 at 5:27 AM, Joe Warren-Meeks
  j...@encoretickets.co.ukwrote:
 
 
  (sorry about top-posting.)
 
  Just changing the timeout would only mask the problem. The real issue
  is
  that running 'df' on either node causes a hang.
 
  All other operations seem fine, files can be created and deleted as
  normal with the results showing up on both.
 
  I'd like to work out why it's hanging on df so I can fix it and get my
  monitoring and cron scripts running again :)
 
   -- joe.
 
  -Original Message-
  From: gluster-users-boun...@gluster.org
  [mailto:gluster-users-boun...@gluster.org] On Behalf Of Daniel Maher
  Sent: 17 January 2011 12:48
  To: gluster-users@gluster.org
  Subject: Re: [Gluster-users] df

[Gluster-users] 3.1 Can't mount glusterfs mountpoint - Address already in use

2011-01-31 Thread phil cryer
I have a problem with Gluster 3.1.1 (Debian) that I didn't have a few
weeks ago. I can run glusterd on my 6 servers and I can see all 6 of them
from the main one, but the logs keep complaining about:
== /var/log/glusterfs/nfs.log ==
[2011-01-31 14:46:49.157527] E
[client-handshake.c:1067:client_query_portmap_cbk]
bhl-volume-client-98: failed to get the port number for remote
subvolume
[2011-01-31 14:46:53.158708] E
[client-handshake.c:1067:client_query_portmap_cbk]
bhl-volume-client-98: failed to get the port number for remote
subvolume
[2011-01-31 14:46:57.159754] E
[client-handshake.c:1067:client_query_portmap_cbk]
bhl-volume-client-98: failed to get the port number for remote
subvolume
[2011-01-31 14:47:01.160804] E
[client-handshake.c:1067:client_query_portmap_cbk]
bhl-volume-client-98: failed to get the port number for remote
subvolume

Then, if I try to mount the glusterfs share I get the following errors in
mnt-glusterfs.log:

== /var/log/glusterfs/mnt-glusterfs.log ==
[2011-01-31 14:46:48.720968] I [glusterd.c:275:init] management: Using
/etc/glusterd as working directory
[2011-01-31 14:46:48.721514] E [socket.c:322:__socket_server_bind]
socket.management: binding to  failed: Address already in use
[2011-01-31 14:46:48.721535] E [socket.c:325:__socket_server_bind]
socket.management: Port is already in use
[2011-01-31 14:46:48.721561] E [glusterd.c:348:init] management:
creation of listener failed
[2011-01-31 14:46:48.721577] E [xlator.c:909:xlator_init] management:
Initialization of volume 'management' failed, review your volfile
again
[2011-01-31 14:46:48.721595] E [graph.c:331:glusterfs_graph_init]
management: initializing translator failed
[2011-01-31 14:46:48.721635] I [fuse-bridge.c:3616:fini] fuse:
Unmounting '/mnt/glusterfs'.
[2011-01-31 14:46:48.823199] I [glusterfsd.c:672:cleanup_and_exit]
glusterfsd: shutting down

But I don't see how it's already in use; I've made sure everything was
stopped/killed on the main server, but when I restart everything fresh
the above happens. So, are these two related, or how can I debug the
mounting issue? Incidentally, if I do a volume info everything seems
to be mounted and OK:

# gluster
gluster volume info bhl-volume

Volume Name: bhl-volume
Type: Distributed-Replicate
Status: Started
Number of Bricks: 72 x 2 = 144
Transport-type: tcp
Bricks:
Brick1: clustr-01:/mnt/data01
Brick2: clustr-02:/mnt/data01
Brick3: clustr-03:/mnt/data01
Brick4: clustr-04:/mnt/data01
Brick5: clustr-05:/mnt/data01
Brick6: clustr-06:/mnt/data01
Brick7: clustr-01:/mnt/data02
Brick8: clustr-02:/mnt/data02
Brick9: clustr-03:/mnt/data02
Brick10: clustr-04:/mnt/data02
Brick11: clustr-05:/mnt/data02
Brick12: clustr-06:/mnt/data02
Brick13: clustr-01:/mnt/data03
Brick14: clustr-02:/mnt/data03
Brick15: clustr-03:/mnt/data03
Brick16: clustr-04:/mnt/data03
Brick17: clustr-05:/mnt/data03
Brick18: clustr-06:/mnt/data03
Brick19: clustr-01:/mnt/data04
Brick20: clustr-02:/mnt/data04
Brick21: clustr-03:/mnt/data04
Brick22: clustr-04:/mnt/data04
Brick23: clustr-05:/mnt/data04
Brick24: clustr-06:/mnt/data04
Brick25: clustr-01:/mnt/data05
Brick26: clustr-02:/mnt/data05
Brick27: clustr-03:/mnt/data05
Brick28: clustr-04:/mnt/data05
Brick29: clustr-05:/mnt/data05
Brick30: clustr-06:/mnt/data05
Brick31: clustr-01:/mnt/data06
Brick32: clustr-02:/mnt/data06
Brick33: clustr-03:/mnt/data06
Brick34: clustr-04:/mnt/data06
Brick35: clustr-05:/mnt/data06
Brick36: clustr-06:/mnt/data06
Brick37: clustr-01:/mnt/data07
Brick38: clustr-02:/mnt/data07
Brick39: clustr-03:/mnt/data07
Brick40: clustr-04:/mnt/data07
Brick41: clustr-05:/mnt/data07
Brick42: clustr-06:/mnt/data07
Brick43: clustr-01:/mnt/data08
Brick44: clustr-02:/mnt/data08
Brick45: clustr-03:/mnt/data08
Brick46: clustr-04:/mnt/data08
Brick47: clustr-05:/mnt/data08
Brick48: clustr-06:/mnt/data08
Brick49: clustr-01:/mnt/data09
Brick50: clustr-02:/mnt/data09
Brick51: clustr-03:/mnt/data09
Brick52: clustr-04:/mnt/data09
Brick53: clustr-05:/mnt/data09
Brick54: clustr-06:/mnt/data09
Brick55: clustr-01:/mnt/data10
Brick56: clustr-02:/mnt/data10
Brick57: clustr-03:/mnt/data10
Brick58: clustr-04:/mnt/data10
Brick59: clustr-05:/mnt/data10
Brick60: clustr-06:/mnt/data10
Brick61: clustr-01:/mnt/data11
Brick62: clustr-02:/mnt/data11
Brick63: clustr-03:/mnt/data11
Brick64: clustr-04:/mnt/data11
Brick65: clustr-05:/mnt/data11
Brick66: clustr-06:/mnt/data11
Brick67: clustr-01:/mnt/data12
Brick68: clustr-02:/mnt/data12
Brick69: clustr-03:/mnt/data12
Brick70: clustr-04:/mnt/data12
Brick71: clustr-05:/mnt/data12
Brick72: clustr-06:/mnt/data12
Brick73: clustr-01:/mnt/data13
Brick74: clustr-02:/mnt/data13
Brick75: clustr-03:/mnt/data13
Brick76: clustr-04:/mnt/data13
Brick77: clustr-05:/mnt/data13
Brick78: clustr-06:/mnt/data13
Brick79: clustr-01:/mnt/data14
Brick80: clustr-02:/mnt/data14
Brick81: clustr-03:/mnt/data14
Brick82: clustr-04:/mnt/data14
Brick83: clustr-05:/mnt/data14
Brick84: clustr-06:/mnt/data14
Brick85: clustr-01:/mnt/data15
Brick86: clustr-02:/mnt/data15
Brick87: 

Re: [Gluster-users] Debian, 3.1.1, duplicate files

2011-01-13 Thread phil cryer
tijdschriftvoore1951nede_meta.mrc
-rwxr-xr-x 1 www-data www-data    720 Jul 12  2010
tijdschriftvoore1951nede_meta.mrc
-rwxr-xr-x 1 www-data www-data 546411 Jul 12  2010
tijdschriftvoore1951nede_names.xml
-rwxr-xr-x 1 www-data www-data 546411 Jul 12  2010
tijdschriftvoore1951nede_names.xml
-rwxr-xr-x 1 www-data www-data    256 Jul 12  2010
tijdschriftvoore1951nede_names.xml_meta.txt
-rwxr-xr-x 1 www-data www-data    256 Jul 12  2010
tijdschriftvoore1951nede_names.xml_meta.txt
-rwxr-xr-x 1 www-data www-data 257556 Jul 13  2010
tijdschriftvoore1951nede_scandata.xml
-rwxr-xr-x 1 www-data www-data 257556 Jul 13  2010
tijdschriftvoore1951nede_scandata.xml

but, this allows us to do (in my opinion) scary things like this:

# ls -al /mnt/glusterfs//www/t/tijdschriftvoore1951nede/*_names.xml
-rwxr-xr-x 1 www-data www-data 546411 Jul 12  2010
/mnt/glusterfs//www/t/tijdschriftvoore1951nede/tijdschriftvoore1951nede_names.xml
-rwxr-xr-x 1 www-data www-data 546411 Jul 12  2010
/mnt/glusterfs//www/t/tijdschriftvoore1951nede/tijdschriftvoore1951nede_names.xml

# rm 
/mnt/glusterfs//www/t/tijdschriftvoore1951nede/tijdschriftvoore1951nede_names.xml

# ls -al /mnt/glusterfs//www/t/tijdschriftvoore1951nede/*_names.xml
-rwxr-xr-x 1 www-data www-data 546411 Jul 12  2010
/mnt/glusterfs//www/t/tijdschriftvoore1951nede/tijdschriftvoore1951nede_names.xml

Eek! So it only removed one of the files, even though they both had
the same name. At this point we're going to wipe all 70TB and
re-transfer, hoping it stops once it has all the files and doesn't
start writing files with the same names as before. Anyone with
advice or insight into this issue? I would love to learn why it did
this, and REALLY hope it doesn't do it again.

Thanks

P



On Wed, Jan 12, 2011 at 2:37 PM, phil cryer p...@cryer.us wrote:
 I'm now running gluster 3.1.1 on Debian. A directory that was running
 under 3.0.4 had duplicate files, but I've remounted things now that
 we're running 3.1.1 in hopes it would fix things, but so far it has
 not:

 # ls -l /mnt/glusterfs/www/0/0descriptionofta581unit
 total 37992
 -rwxr-xr-x 1 www-data www-data   796343 Jun 23  2010
 0descriptionofta581unit_bw.pdf
 -rwxr-xr-x 1 www-data www-data   796343 Jun 23  2010
 0descriptionofta581unit_bw.pdf
 ---------T 1 root     root         1497 Jun 24  2010
 0descriptionofta581unit_dc.xml
 ---------T 1 root     root         1497 Jun 24  2010
 0descriptionofta581unit_dc.xml
 ---------T 1 www-data www-data   577050 Jun 24  2010
 0descriptionofta581unit.djvu
 ---------T 1 www-data www-data   577050 Jun 24  2010
 0descriptionofta581unit.djvu
 -rwxr-xr-x 1 www-data www-data    33272 Jun 22  2010
 0descriptionofta581unit_djvu.txt
 -rwxr-xr-x 1 www-data www-data    33272 Jun 22  2010
 0descriptionofta581unit_djvu.txt
 -rwxr-xr-x 1 www-data www-data     4445 Jun 23  2010
 0descriptionofta581unit_files.xml
 -rwxr-xr-x 1 www-data www-data     4445 Jun 23  2010
 0descriptionofta581unit_files.xml
 -rwxr-xr-x 1 www-data www-data     5011 Jun 22  2010
 0descriptionofta581unit_marc.xml
 -rwxr-xr-x 1 www-data www-data     5011 Jun 22  2010
 0descriptionofta581unit_marc.xml
 -rwxr-xr-x 1 www-data www-data      360 Jun 23  2010
 0descriptionofta581unit_metasource.xml
 -rwxr-xr-x 1 www-data www-data      360 Jun 23  2010
 0descriptionofta581unit_metasource.xml
 -rwxr-xr-x 1 www-data www-data     2848 Jun 22  2010
 0descriptionofta581unit_meta.xml
 -rwxr-xr-x 1 www-data www-data     2848 Jun 22  2010
 0descriptionofta581unit_meta.xml
 -rwxr-xr-x 1 www-data www-data 16916480 Jun 22  2010
 0descriptionofta581unit_orig_jp2.tar
 -rwxr-xr-x 1 www-data www-data 16916480 Jun 22  2010
 0descriptionofta581unit_orig_jp2.tar
 -rwxr-xr-x 1 www-data www-data  1051810 Jun 22  2010 
 0descriptionofta581unit.pdf
 -rwxr-xr-x 1 www-data www-data  1051810 Jun 22  2010 
 0descriptionofta581unit.pdf

 While running the latest, 3.1.1, I noticed some log files that said:

 [..]
 [2011-01-12 15:24:33.325546] I
 [afr-common.c:613:afr_lookup_self_heal_check] bhl-volume-replicate-69:
 size differs for
 /www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
 [2011-01-12 15:24:33.325558] I [afr-common.c:716:afr_lookup_done]
 bhl-volume-replicate-69: background  meta-data data self-heal
 triggered. path:
 /www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
 [2011-01-12 15:24:33.364501] I
 [afr-self-heal-common.c:1526:afr_self_heal_completion_cbk]
 bhl-volume-replicate-66: background  meta-data data self-heal
 completed on /www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
 [2011-01-12 15:24:33.364881] I
 [afr-self-heal-common.c:1526:afr_self_heal_completion_cbk]
 bhl-volume-replicate-69: background  meta-data data self-heal
 completed on /www/0/0descriptionofta581unit/0descriptionofta581unit.djvu

 I assumed it was fixing that, but it didn't. Here's the full logs that
 include all the gluster.log work it did in this directory:
 http://pastebin.com/8X52Em7Y

 Question: how can I 'fix' this, or is the best

[Gluster-users] what does the permission T mean? -rwx-----T

2011-01-13 Thread phil cryer
I have a file that looks like this, what does T tell me in terms of
permissions and glusterfs?

-rwx-----T 1 root root   3414 Oct 22 15:27 reportr2.sh
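
The T in the last position means the sticky bit is set while the execute bit
for "others" is not; a quick way to reproduce the display (illustrative file
name):

# touch /tmp/demo.sh
# chmod 1700 /tmp/demo.sh
# ls -l /tmp/demo.sh     # shows -rwx-----T

On GlusterFS distribute backends, zero-length files with mode ---------T are
commonly DHT link files, which may be where this mode is coming from.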

P
-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] Debian, 3.1.1, duplicate files

2011-01-12 Thread phil cryer
I'm now running gluster 3.1.1 on Debian. A directory that was running
under 3.0.4 had duplicate files, but I've remounted things now that
we're running 3.1.1 in hopes it would fix things, but so far it has
not:

# ls -l /mnt/glusterfs/www/0/0descriptionofta581unit
total 37992
-rwxr-xr-x 1 www-data www-data   796343 Jun 23  2010
0descriptionofta581unit_bw.pdf
-rwxr-xr-x 1 www-data www-data   796343 Jun 23  2010
0descriptionofta581unit_bw.pdf
---------T 1 root root 1497 Jun 24  2010
0descriptionofta581unit_dc.xml
---------T 1 root root 1497 Jun 24  2010
0descriptionofta581unit_dc.xml
---------T 1 www-data www-data   577050 Jun 24  2010
0descriptionofta581unit.djvu
---------T 1 www-data www-data   577050 Jun 24  2010
0descriptionofta581unit.djvu
-rwxr-xr-x 1 www-data www-data33272 Jun 22  2010
0descriptionofta581unit_djvu.txt
-rwxr-xr-x 1 www-data www-data33272 Jun 22  2010
0descriptionofta581unit_djvu.txt
-rwxr-xr-x 1 www-data www-data 4445 Jun 23  2010
0descriptionofta581unit_files.xml
-rwxr-xr-x 1 www-data www-data 4445 Jun 23  2010
0descriptionofta581unit_files.xml
-rwxr-xr-x 1 www-data www-data 5011 Jun 22  2010
0descriptionofta581unit_marc.xml
-rwxr-xr-x 1 www-data www-data 5011 Jun 22  2010
0descriptionofta581unit_marc.xml
-rwxr-xr-x 1 www-data www-data  360 Jun 23  2010
0descriptionofta581unit_metasource.xml
-rwxr-xr-x 1 www-data www-data  360 Jun 23  2010
0descriptionofta581unit_metasource.xml
-rwxr-xr-x 1 www-data www-data 2848 Jun 22  2010
0descriptionofta581unit_meta.xml
-rwxr-xr-x 1 www-data www-data 2848 Jun 22  2010
0descriptionofta581unit_meta.xml
-rwxr-xr-x 1 www-data www-data 16916480 Jun 22  2010
0descriptionofta581unit_orig_jp2.tar
-rwxr-xr-x 1 www-data www-data 16916480 Jun 22  2010
0descriptionofta581unit_orig_jp2.tar
-rwxr-xr-x 1 www-data www-data  1051810 Jun 22  2010 0descriptionofta581unit.pdf
-rwxr-xr-x 1 www-data www-data  1051810 Jun 22  2010 0descriptionofta581unit.pdf

While running the latest, 3.1.1, I noticed some log files that said:

[..]
[2011-01-12 15:24:33.325546] I
[afr-common.c:613:afr_lookup_self_heal_check] bhl-volume-replicate-69:
size differs for
/www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
[2011-01-12 15:24:33.325558] I [afr-common.c:716:afr_lookup_done]
bhl-volume-replicate-69: background  meta-data data self-heal
triggered. path:
/www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
[2011-01-12 15:24:33.364501] I
[afr-self-heal-common.c:1526:afr_self_heal_completion_cbk]
bhl-volume-replicate-66: background  meta-data data self-heal
completed on /www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
[2011-01-12 15:24:33.364881] I
[afr-self-heal-common.c:1526:afr_self_heal_completion_cbk]
bhl-volume-replicate-69: background  meta-data data self-heal
completed on /www/0/0descriptionofta581unit/0descriptionofta581unit.djvu

I assumed it was fixing that, but it didn't. Here are the full logs that
include all the gluster.log work it did in this directory:
http://pastebin.com/8X52Em7Y

Question: how can I 'fix' this, or is the best bet to remove
everything and start over? It's going to set us back, but I'd rather
do it now than keep banging on this without any resolution.

Thanks for the help, really like the new gluster command, very nice!

P
-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Hardware advice?

2010-09-24 Thread phil cryer
On Fri, Sep 24, 2010 at 8:30 AM, Janne Aho ja...@citynetwork.se wrote:

 Hi,

 We are looking into setting up a glusterfs cluster to store VM images (for
 KVM and VMware). Usually we have machines from Dell, but we haven't found
 any good machine to use which allows a good amount of disk space and
 possibility to have at least 4 NICs (we are thinking about using 10 gigabit
 network, otherwise we need to bond and use more NICs).

  Sure, we could buy just off-the-shelf stuff to keep the cost down, but we are
  looking to have good hardware support (to be sure that if something breaks
  down, we will get spare parts).

 Does anyone here have suggestion on hardware that can do the following:

  1. having iDRAC or similar (remote access to console)
  2. at least 4 NICs which can be 10 gigabit (this is for redundancy).
  3. having an architecture which is supported by gluster (in other words, no
  mc68k).
  4. having enough space for a good amount of disks, or a JBOD that can be
  connected to the machine (please no suggestions of Promise JBODs).
  5. It has to be rack mounted

We currently have six servers set up like this in our Gluster cluster:
http://philcryer.com/wiki/doku.php?id=building_steam_from_a_grain_of_salt_-_redux

We have more details if you need them; so far, so good is our experience!

P


  If suggesting something other than Dell, please give some price indication for
  the hardware; I don't care if it's accurate or not, just so I get some
  understanding of whether it's something that can fit our budget.


 Thanks in advance for your replies.


 --
 Janne Aho (Developer) | City Network Hosting AB - www.citynetwork.se
 Phone: +46 455 690022 | Cell: +46 733 312775
 EMail/MSN: ja...@citynetwork.se
 ICQ: 567311547 | Skype: janne_mz | AIM: janne4cn | Gadu: 16275665
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users




-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Cannot remove directory - Directory not empty

2010-09-20 Thread phil cryer
[r...@3d13 ~]# rm -rfv /flock/proj/tele2_holland/rnd/comp/010/v003
rm: cannot remove directory `/flock/proj/tele2_holland/rnd/comp/010/v003': 
Directory not empty

When I had this issue it was because I modified the files outside of
glusterfs - so for example, when gluster was not running, I
moved/modified files. I believe you have to run the scale-n-defrag.sh
script that you'll find in the contrib directory of the gluster
source.
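
One way to check what those leftover backend entries actually are before
touching them (a sketch; the path comes from the brick listing quoted below,
and it needs to be run on the brick as root):

# ls -la /node04/storage/proj/tele2_holland/rnd/comp/010/v003/
# getfattr -d -m . -e hex /node04/storage/proj/tele2_holland/rnd/comp/010/v003/*.exr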

P


On Mon, Sep 20, 2010 at 4:49 AM, Thomas Ericsson
thomas.erics...@fido.se wrote:
 I cannot delete quite a few directories on our glusterfs mounts. The error
 is Directory not empty. A listing shows no files in the directory; however,
 if I do a listing on the brick volumes, some of them show files.

 Any idea how this can happen and how to remove the directories? Would
 it be safe to remove the invisible files straight from the brick volumes?

 Best regards
 Thomas


 From a glusterfs client
 [r...@3d13 ~]# ls -lai /flock/proj/tele2_holland/rnd/comp/010/v003/
 total 0
 38939716797 drwxrwxr-x 2 poal FidoUsers 162 Sep 16 09:15 .
 60331700537 drwxrwxr-x 5 poal FidoUsers 536 Sep  4 01:24 ..
 [r...@3d13 ~]# rm -rfv /flock/proj/tele2_holland/rnd/comp/010/v003
 rm: cannot remove directory `/flock/proj/tele2_holland/rnd/comp/010/v003': 
 Directory not empty

 From a glusterfsd brick
 flock01 ~ # ls -lai /node04/storage/proj/tele2_holland/rnd/comp/010/v003/
 total 0
  305414438 drwxrwxr-x 2 1038 fido_user 57 Sep 16 09:15 .
 7541462567 drwxrwxr-x 5 1038 fido_user 61 Jul  7 09:44 ..
  305414403 ---------T 1 root root       0 Sep  4 01:24
 tele2_holland_010_comp_v003.0031.exr

 From another glusterfsd brick
 flock04 ~ # ls -lai /node03/storage/proj/tele2_holland/rnd/comp/010/v003/
 total 0
 4861583534 drwxrwxr-x 2 1038  500 57 Sep 16 09:15 .
  280040615 drwxrwxr-x 5 1038  500 61 Jul  7 09:44 ..
 4861671820 ---------T 1 root root  0 Sep  4 01:24
 tele2_holland_010_comp_v003.0007.exr


 --

 Server and clients are vesion 2.0.8 with FUSE 2.7.4

 Server config
 flock04 ~ # cat /usr/local/etc/glusterfs/glusterfs.server
 volume posix01
  type storage/posix
  option directory /node01/storage
 end-volume

 volume locks01
  type features/locks
  subvolumes posix01
 end-volume

 volume brick01
  type performance/io-threads
  option thread-count 2
  subvolumes locks01
 end-volume

 volume posix02
  type storage/posix
  option directory /node02/storage
 end-volume

 volume locks02
  type features/locks
  subvolumes posix02
 end-volume

 volume brick02
  type performance/io-threads
  option thread-count 2
  subvolumes locks02
 end-volume

 volume posix03
  type storage/posix
  option directory /node03/storage
 end-volume

 volume locks03
  type features/locks
  subvolumes posix03
 end-volume

 volume brick03
  type performance/io-threads
  option thread-count 32
  subvolumes locks03
 end-volume

 volume posix04
  type storage/posix
  option directory /node04/storage
 end-volume

 volume locks04
  type features/locks
  subvolumes posix04
 end-volume

 volume brick04
  type performance/io-threads
  option thread-count 32
  subvolumes locks04
 end-volume

 volume server
  type protocol/server
  option transport-type ib-verbs/server
  option auth.addr.brick01.allow *
  option auth.addr.brick02.allow *
  option auth.addr.brick03.allow *
  option auth.addr.brick04.allow *
  subvolumes brick01 brick02 brick03 brick04
 end-volume

 volume tcp_server
  type protocol/server
  option transport-type tcp/server
  option transport.socket.nodelay on
  option auth.addr.brick01.allow *
  option auth.addr.brick02.allow *
  option auth.addr.brick03.allow *
  option auth.addr.brick04.allow *
  subvolumes brick01 brick02 brick03 brick04
 end-volume

 Client config
 volume remote01
  type protocol/client
  option transport-type ib-verbs/client
  option remote-host flock01
  option remote-subvolume brick03
 end-volume

 volume remote02
  type protocol/client
  option transport-type ib-verbs/client
  option remote-host flock01
  option remote-subvolume brick04
 end-volume

 volume remote03
  type protocol/client
  option transport-type ib-verbs/client
  option remote-host flock03
  option remote-subvolume brick03
 end-volume

 volume remote04
  type protocol/client
  option transport-type ib-verbs/client
  option remote-host flock03
  option remote-subvolume brick04
 end-volume

 volume remote05
  type protocol/client
  option transport-type ib-verbs/client
  option remote-host flock04
  option remote-subvolume brick03
 end-volume

 volume remote06
  type protocol/client
  option transport-type ib-verbs/client
  option remote-host flock04
  option remote-subvolume brick04
 end-volume

 volume remote07
  type protocol/client
  option transport-type ib-verbs/client
  option remote-host flock08
  option remote-subvolume brick03
 end-volume

 volume remote08
  type protocol/client
  option transport-type ib-verbs/client
  option remote-host flock08
  option 

Re: [Gluster-users] Shared web hosting with GlusterFS and inotify

2010-09-15 Thread phil cryer
We're interested in this as well, as we will be serving our docroot
from a GlusterFS share. Have you tried nginx? I have not tested this,
but after your benchmarks it sounds like I need to. Your inotify
script looks like it would work, but it wouldn't for me; we use
GlusterFS so we can store ~70TB of data, which we can't copy to the
regular filesystem. This is a big risk indeed - can you share your
benchmarking method? Did you simply use ab?

Thanks for the heads up

P

On Wed, Sep 15, 2010 at 9:58 AM, Emile Heitor
emile.hei...@nbs-system.com wrote:
 Hi list,

  For a couple of weeks we've been experimenting with a web hosting system based
  on GlusterFS in order to share customers' documentroots between
  more than one machine.

 Involved hardware and software are :

 Two servers composed of 2x Intel 5650 (i.e. 2x12 cores @2,6Ghz), 24GB
 DDR3 RAM, 146GB SAS disks / RAID 1
 Both servers running 64bits Debian Lenny GNU/Linux with GlusterFS 3.0.5
 The web server is Apache 2.2, the application is a huge PHP/MySQL monster.

  For our first naive tests we were using the glusterfs mountpoint as
  apache's documentroot. In short, performance was catastrophic.
  A single one of these servers, without GlusterFS, is capable of handling
  about 170 pages per second with 100 concurrent users.
  The same server, with the apache documentroot being a gluster mountpoint,
  drops to 5 PPS for 20 CU and just stops responding for 40+.

  We tried a lot of tips (quick-read, io-threads, io-cache, thread-count,
  timeouts...) that we read on this very mailing list or various websites, or
  tried from our own experience, but we never got better than 10 PPS / 20 users.

  So we took another approach: instead of declaring the gluster mountpoint as
  the documentroot, we declared the local storage; but of course, without
  any modification, this would lead to inconsistencies if by any chance
  apache writes something (.htaccess, tmp file, log...). And so enters
  inotify. Using inotify-tools' inotifywait, we have this little script
  watching for local documentroot modifications and duplicating them to the
  glusterfs share. The infinite loop is avoided by an md5 comparison. Here is
  a very early proof of concept:

  #!/bin/sh

  [ $# -lt 2 ] && echo "usage: $0 <source> <destination>" && exit 1

  PATH=${PATH}:/bin:/sbin:/usr/bin:/usr/sbin; export PATH

  SRC=$1
  DST=$2

  cd ${SRC}

  # no recursion
  RSYNC='rsync -dlptgoD --delete ${srcdir} ${dstdir}/'

  inotifywait -mr \
     --exclude "\..*\.sw.*" \
     -e close_write -e create -e delete_self -e delete . | \
     while read dir action file
     do
         srcdir=${SRC}/${dir}
         dstdir=${DST}/${dir}

         # skip anything that lives on tmpfs
         [ -d ${srcdir} ] && \
         [ ! -z "`df -T \"${srcdir}\" | grep tmpfs`" ] && \
             continue

         # debug
         echo ${dir} ${action} ${file}

         case ${action} in
         CLOSE_WRITE,CLOSE)
             [ ! -f ${dstdir}/${file} ] && eval ${RSYNC} && continue

             md5src=`md5sum "${srcdir}/${file}" | cut -d' ' -f1`
             md5dst=`md5sum "${dstdir}/${file}" | cut -d' ' -f1`
             [ "$md5src" != "$md5dst" ] && eval ${RSYNC}
             ;;
         CREATE,ISDIR)
             [ ! -d ${dstdir}/${file} ] && eval ${RSYNC}
             ;;
         DELETE|DELETE,ISDIR)
             eval ${RSYNC}
             ;;
         esac
     done

  As of now a gluster mountpoint is basically unusable as an Apache
  DocumentRoot for us (and yes, with htaccess disabled). I'd like to have
  the list's point of view on this approach. Do you see any terrible glitch?

 Thanks in advance,

 --
 Emile Heitor, Responsable d'Exploitation
 ---
 www.nbs-system.com, 140 Bd Haussmann, 75008 Paris
 Tel: 01.58.56.60.80 / Fax: 01.58.56.60.81


 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users




-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Adding volumes - How redistribute existing data

2010-08-06 Thread phil cryer
If a drive dies and you want to repopulate its replacement with `ls -R
/mnt/glusterfs` is it necessary to have the options set too, or is
this specific to the scale-n-defrag.sh script?

P

On Thu, Jul 29, 2010 at 12:22 AM, Amar Tumballi a...@gluster.com wrote:
 Hi Michael,

 Sorry for the confusion on the 'scale-n-defrag.sh' script.

 To make sure the script does the defrag, you need to have two options set in
 the distribute volume:

 'option unhashed-sticky-bit on'
 'option lookup-unhashed on'

 Without these options it will not move the data files in the backend. If you
 don't want to bring down the current mount point to run the defrag, you can
 have another mount point with a changed volume file, and run the defrag over it.
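
For reference, a sketch of where those two options would sit in the
client-side volfile, using the cluster/distribute translator (the volume and
subvolume names here are illustrative, not taken from this setup):

volume dht0
  type cluster/distribute
  option unhashed-sticky-bit on
  option lookup-unhashed on
  subvolumes remote01 remote02 remote03
end-volume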

 Let us know if you have any more questions regarding defrag process.

 Regards,
 Amar

 On Wed, Jul 28, 2010 at 9:37 PM, Moore, Michael
 michael.mo...@lifetech.comwrote:

 Hi,

   I am trying to add several new backend volumes to an existing GlusterFS
 setup.  I am running GlusterFS 3.0.4 using the distribute translator.  I've
 tried running the scale-n-defrag.sh script to redistribute the data across
 the additional volumes, but after running for a significant time, nothing
 was significantly redistributed.  What are the proper steps to do to
 redistribute the data?  Do I need to clean up the links GlusterFS makes on
 the backends before I run scale-n-defrag?

   I am running GlusterFS 3.0.4 on top of CentOS 5.4.  This is not running
 GlusterSP.


 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users





-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] Kernel panic when populating cluster

2010-07-21 Thread phil cryer
I'm populating my 6-node cluster, running glusterfs 3.0.4, by copying
files into /mnt/glusterfs, the gluster-mounted filesystem. I had one
machine go down with a kernel panic last week, but I wasn't able to
see the error (it's a remote server), so we just restarted and went
along. I was running 4 instances again today, all writing to the
/mnt/glusterfs directory, and saw the following error in the logs. I've
stopped and restarted my processes, this time just running two of
them, and I'm not seeing the error. Obviously this is taking much
longer to populate the cluster; could I have overloaded it by having
four shell scripts copying files into the mount? What does this error
mean, and is my method the proper way to populate a 6-node cluster
with 50TB capacity?

Thanks

P

== /var/log/syslog ==
Jul 20 23:49:54 clustr-01 kernel: [794473.515204] INFO: task cp:6706
blocked for more than 120 seconds.
Jul 20 23:49:54 clustr-01 kernel: [794473.515235] echo 0 >
/proc/sys/kernel/hung_task_timeout_secs disables this message.
Jul 20 23:49:54 clustr-01 kernel: [794473.515292] cpD
880143d955c0 0  6706  1 0x0004
Jul 20 23:49:54 clustr-01 kernel: [794473.515296]  88013ce55bd0
0046 88005d4afbc8 88005d4afbc4
Jul 20 23:49:54 clustr-01 kernel: [794473.515300]  000e
0096 f8a0 88005d4affd8
Jul 20 23:49:54 clustr-01 kernel: [794473.515303]  000155c0
000155c0 88023e35e2e0 88023e35e5d8
Jul 20 23:49:54 clustr-01 kernel: [794473.515306] Call Trace:
Jul 20 23:49:54 clustr-01 kernel: [794473.515315]
[a0213a99] ? fuse_request_send+0x196/0x249 [fuse]
Jul 20 23:49:54 clustr-01 kernel: [794473.515319]
[81064a56] ? autoremove_wake_function+0x0/0x2e
Jul 20 23:49:54 clustr-01 kernel: [794473.515324]
[a0218086] ? fuse_flush+0xca/0xfe [fuse]
Jul 20 23:49:54 clustr-01 kernel: [794473.515328]
[810eb90e] ? filp_close+0x37/0x62
Jul 20 23:49:54 clustr-01 kernel: [794473.515332]
[8104f710] ? put_files_struct+0x64/0xc1
Jul 20 23:49:54 clustr-01 kernel: [794473.515335]
[81050fb2] ? do_exit+0x225/0x6b5
Jul 20 23:49:54 clustr-01 kernel: [794473.515337]
[810514b8] ? do_group_exit+0x76/0x9d
Jul 20 23:49:54 clustr-01 kernel: [794473.515341]
[8105dc50] ? get_signal_to_deliver+0x310/0x33c
Jul 20 23:49:54 clustr-01 kernel: [794473.515353]
[8101002f] ? do_notify_resume+0x87/0x73f
Jul 20 23:49:54 clustr-01 kernel: [794473.515357]
[810cb774] ? handle_mm_fault+0x2f7/0x7a5
Jul 20 23:49:54 clustr-01 kernel: [794473.515361]
[810eddd6] ? vfs_read+0xa6/0xff
Jul 20 23:49:54 clustr-01 kernel: [794473.515363]
[81010e0e] ? int_signal+0x12/0x17

-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] scale-n-defrag of 50TB across a 6 node cluster

2010-07-06 Thread phil cryer
Our cluster was out of balance: we had two servers running glusterfs
under a RAID1 setup, and only after those two servers were full did we
add the additional four to our group.  Now we're running the
scale-n-defrag.sh script over all 50TB of data on the six node
cluster.  We keep getting closer to having all of the data
balanced across the 6 nodes of the cluster, though it seems to be
going very slowly overall - this process has been running for more
than a week now.  The networking graph on this page shows
that it's still working and passing data across the network:
http://whbhl01.ubio.org/ganglia/?m=load_oner=hours=descendingc=Woods+Holeh=sh=1hc=3z=medium

Looking at the servers' disk usage on the command line, we see that the
data is indeed being equally distributed across all 24 mounts on each
node.  While we aren't able to get a progress update from the gluster
process, we can see by physically looking at the disk usage that:

1 - done balancing
2 - done balancing
3 - done balancing
4 - beginning balancing
5 - beginning balancing
6 - about 1/2 complete balancing

This makes sense, since 4/5 were the first two servers, and were the
full ones, and the most out of sync with the others.  It seems like
1/2/3 and most of 6 have gotten the majority of the balancing
complete.  Does this sound normal?  Also, would it cause the process
to run longer if we started moving files around in their directories
on the nodes? (we need to move the files to a shared docroot so they
can be served via HTTP).  I realize now that the best way to build
this cluster would have been to have the entire cluster up and
running, and then load the data, but since over 50TB needed to be
transferred to the cluster over the Internet, we thought starting
sooner and adding nodes as we grew was the best way to proceed.

Also, does anyone have configuration suggestions for serving static
files for websites from glusterfs?  Either in terms of the configuration
of the .vol files, or the architecture of how the servers are laid out -
I'm thinking of two ways (a sample vhost for the docroot follows below):

Internet - SERVER 1 (www server with glusterfs client running) using
/mnt/glusterfs/www as the docroot

- or -

Internet - SERVER1 (www server) - CLUSTER1 (www server with
glusterfs server and client running) using /mnt/glusterfs/www as the
docroot
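
Whichever way, the box that mounts the volume just needs a vhost whose
DocumentRoot points at the mount - a minimal Apache 2 sketch, where
www.example.org is only a placeholder name:

<VirtualHost *:80>
    ServerName www.example.org
    # docroot lives on the glusterfs client mount
    DocumentRoot /mnt/glusterfs/www
    <Directory /mnt/glusterfs/www>
        Options FollowSymLinks
        AllowOverride None
        Order allow,deny
        Allow from all
    </Directory>
</VirtualHost>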

P
-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Transport endpoint is not connected - getfattr

2010-06-18 Thread phil cryer
So I'm working on this today, maybe I can simplify my issue: in
Glusterfs 3.0.4, I can create files and directories fine, I can delete
files, but not directories.  I'm running the server in DEBUG and it's
not saying anything.  For example, I want to delete
/mnt/glusterfs/www/new :

[23:12:03] [r...@clustr-01 /mnt]# mount -t glusterfs
/etc/glusterfs/glusterfs.vol /mnt/glusterfs -o log-level=DEBUG
[23:12:13] [r...@clustr-01 /mnt]# ls -al /mnt/glusterfs/www/ | grep new
drwxrwxrwx   3 www-data www-data 196608 2010-06-18 23:10 new
[23:12:26] [r...@clustr-01 /mnt]# rm -rf /mnt/glusterfs/www/new/
rm: cannot remove directory `/mnt/glusterfs/www/new/__MACOSX':
Transport endpoint is not connected

I'm running glusterfsd in another window in DEBUG, and it doesn't log
anything when this happens.  So I've already deleted the files in that
directory, I just can't remove the two remaining directories, new and
__MACOSX.  Again, I created these yesterday so I haven't made any
config changes between then and now - how can I figure out why this is
failing?

Thanks

P



On Thu, Jun 17, 2010 at 4:23 PM, phil cryer p...@cryer.us wrote:
 I'm having problems removing directories, if I do a mv or if I do a rm
 I'll get an error like this:

 [00:57:57] [r...@clustr-01 /]# rm -rf /mnt/glusterfs/bhl/
 rm: cannot remove directory `/mnt/glusterfs/bhl': Transport endpoint
 is not connected

 EdWyse on IRC suggested I run getfattr -m  on a few bricks; when I
 did, I got various results (see below).  Is this a case where I can run
 something like backend-cleanup.sh or backend-xattr-sanitize.sh to fix,
 or is there a manual command?  We're around 45TB, so I don't have
 anywhere to copy the files off.  Thanks!

 [16:30:25] [r...@clustr-04 /root/bin]# getfattr -m  /mnt/data04
 getfattr: Removing leading '/' from absolute path names
 # file: mnt/data04
 trusted.afr.clustr-04-1
 trusted.afr.clustr-04-10
 trusted.afr.clustr-04-11
 trusted.afr.clustr-04-12
 trusted.afr.clustr-04-13
 trusted.afr.clustr-04-14
 trusted.afr.clustr-04-15
 trusted.afr.clustr-04-16
 trusted.afr.clustr-04-17
 trusted.afr.clustr-04-18
 trusted.afr.clustr-04-19
 trusted.afr.clustr-04-2
 trusted.afr.clustr-04-20
 trusted.afr.clustr-04-21
 trusted.afr.clustr-04-22
 trusted.afr.clustr-04-23
 trusted.afr.clustr-04-24
 trusted.afr.clustr-04-3
 trusted.afr.clustr-04-4
 trusted.afr.clustr-04-5
 trusted.afr.clustr-04-6
 trusted.afr.clustr-04-7
 trusted.afr.clustr-04-8
 trusted.afr.clustr-04-9
 trusted.afr.clustr-05-1
 trusted.afr.clustr-05-10
 trusted.afr.clustr-05-11
 trusted.afr.clustr-05-12
 trusted.afr.clustr-05-13
 trusted.afr.clustr-05-14
 trusted.afr.clustr-05-15
 trusted.afr.clustr-05-16
 trusted.afr.clustr-05-17
 trusted.afr.clustr-05-18
 trusted.afr.clustr-05-19
 trusted.afr.clustr-05-2
 trusted.afr.clustr-05-20
 trusted.afr.clustr-05-21
 trusted.afr.clustr-05-22
 trusted.afr.clustr-05-23
 trusted.afr.clustr-05-24
 trusted.afr.clustr-05-3
 trusted.afr.clustr-05-4
 trusted.afr.clustr-05-5
 trusted.afr.clustr-05-6
 trusted.afr.clustr-05-7
 trusted.afr.clustr-05-8
 trusted.afr.clustr-05-9
 trusted.glusterfs.dht
 trusted.posix4.gen

 ---
 Another server

 [01:02:05] [r...@clustr-01 /]#  getfattr -m  /mnt/data09
 getfattr: Removing leading '/' from absolute path names
 # file: mnt/data09
 trusted.afr.clustr-01-10
 trusted.afr.clustr-01-9
 trusted.glusterfs.dht
 trusted.glusterfs.test
 trusted.posix9.gen


 [00:43:14] [r...@clustr-01 /]#  getfattr -m  /mnt/data04
 getfattr: Removing leading '/' from absolute path names
 # file: mnt/data04
 trusted.afr.clustr-01-3
 trusted.afr.clustr-01-4
 trusted.glusterfs.dht
 trusted.glusterfs.test
 trusted.posix4.gen




-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] Transport endpoint is not connected - getfattr

2010-06-17 Thread phil cryer
I'm having problems removing directories, if I do a mv or if I do a rm
I'll get an error like this:

[00:57:57] [r...@clustr-01 /]# rm -rf /mnt/glusterfs/bhl/
rm: cannot remove directory `/mnt/glusterfs/bhl': Transport endpoint
is not connected

EdWyse on IRC suggested I run getfattr -m  on a few bricks; when I
did, I got various results (see below).  Is this a case where I can run
something like backend-cleanup.sh or backend-xattr-sanitize.sh to fix,
or is there a manual command?  We're around 45TB, so I don't have
anywhere to copy the files off.  Thanks!

[16:30:25] [r...@clustr-04 /root/bin]# getfattr -m  /mnt/data04
getfattr: Removing leading '/' from absolute path names
# file: mnt/data04
trusted.afr.clustr-04-1
trusted.afr.clustr-04-10
trusted.afr.clustr-04-11
trusted.afr.clustr-04-12
trusted.afr.clustr-04-13
trusted.afr.clustr-04-14
trusted.afr.clustr-04-15
trusted.afr.clustr-04-16
trusted.afr.clustr-04-17
trusted.afr.clustr-04-18
trusted.afr.clustr-04-19
trusted.afr.clustr-04-2
trusted.afr.clustr-04-20
trusted.afr.clustr-04-21
trusted.afr.clustr-04-22
trusted.afr.clustr-04-23
trusted.afr.clustr-04-24
trusted.afr.clustr-04-3
trusted.afr.clustr-04-4
trusted.afr.clustr-04-5
trusted.afr.clustr-04-6
trusted.afr.clustr-04-7
trusted.afr.clustr-04-8
trusted.afr.clustr-04-9
trusted.afr.clustr-05-1
trusted.afr.clustr-05-10
trusted.afr.clustr-05-11
trusted.afr.clustr-05-12
trusted.afr.clustr-05-13
trusted.afr.clustr-05-14
trusted.afr.clustr-05-15
trusted.afr.clustr-05-16
trusted.afr.clustr-05-17
trusted.afr.clustr-05-18
trusted.afr.clustr-05-19
trusted.afr.clustr-05-2
trusted.afr.clustr-05-20
trusted.afr.clustr-05-21
trusted.afr.clustr-05-22
trusted.afr.clustr-05-23
trusted.afr.clustr-05-24
trusted.afr.clustr-05-3
trusted.afr.clustr-05-4
trusted.afr.clustr-05-5
trusted.afr.clustr-05-6
trusted.afr.clustr-05-7
trusted.afr.clustr-05-8
trusted.afr.clustr-05-9
trusted.glusterfs.dht
trusted.posix4.gen

---
Another server

[01:02:05] [r...@clustr-01 /]#  getfattr -m  /mnt/data09
getfattr: Removing leading '/' from absolute path names
# file: mnt/data09
trusted.afr.clustr-01-10
trusted.afr.clustr-01-9
trusted.glusterfs.dht
trusted.glusterfs.test
trusted.posix9.gen


[00:43:14] [r...@clustr-01 /]#  getfattr -m  /mnt/data04
getfattr: Removing leading '/' from absolute path names
# file: mnt/data04
trusted.afr.clustr-01-3
trusted.afr.clustr-01-4
trusted.glusterfs.dht
trusted.glusterfs.test
trusted.posix4.gen
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Replication between two separate clusters

2010-05-25 Thread phil cryer
 Is it possible with Gluster to have two separate clusters
 replicating volumes that are already mirrored in their independent cluster?
 So far it looks like that's what AFR is supposed to do.

This is what I'm setting up as well: we'll have individual, stand-alone
clusters that are synced using a combination of rsync/ssh/lsyncd/csync2.
For me this is a true DR environment, since each instance will be storing
and serving content at the same time and won't be reliant on the other,
but as long as they're both up they will keep each other in sync so that
they act as mirrors of each other.
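
The rsync-over-ssh leg of that is roughly the following - a sketch only,
where mirror.example.org stands in for the remote cluster's entry point
and both sides mount the volume at /mnt/glusterfs; lsyncd or a cron job
would drive it:

# push local changes to the remote cluster; --delete keeps the mirror exact
rsync -az --delete -e ssh /mnt/glusterfs/ mirror.example.org:/mnt/glusterfs/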

P

On Mon, May 24, 2010 at 10:39 AM, Jeffrey Negro jne...@billtrust.com wrote:
 Hello -

 My company is in need of a clustered NAS solution, mostly for CIFS
 fileshares.  We have been considering commercial solutions from Isilon and
 NetApp, but I have a feeling I'm not going to get the budget approval for
 those products.  I also tend to stay away from closed hardware solutions...
 but I digress.  We want to have a production and a DR cluster that replicate
 across a WAN.  Is it possible with Gluster to have two separate clusters
 replicating volumes that are already mirrored in their independent cluster?
 So far it looks like that's what AFR is supposed to do.

 Any information or assistance that anyone can provide in clarifying my
 understanding of this scenario would be very helpful and much appreciated.

 Thank you,

 Jeffrey

 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users





-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Input/output error when running `ls` and `cd` on directories

2010-05-17 Thread phil cryer
Lakshmipathi
Attached are the gluster vol files, and the glusterfsd.log file while
running under TRACE.  I did the same queries as in the original email,
and got the same results.  Let me know if you want each action broken
out in the logfile, in case you can't tell from the details alone.
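
For reference, a TRACE run can be reproduced roughly like this - the vol
file and log paths are assumptions (the stock locations used elsewhere in
this thread) and may differ on other setups:

glusterfsd -f /etc/glusterfs/glusterfsd.vol -L TRACE -l /var/log/glusterfs/glusterfsd.log
mount -t glusterfs /etc/glusterfs/glusterfs.vol /mnt/glusterfs -o log-level=TRACE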

Thanks for your help on this.

P

On Sat, May 15, 2010 at 7:20 AM, Lakshmipathi lakshmipa...@gluster.com wrote:
 Hi,
 Can you please send us the server/client log files and server/client volume
 files?  If you think there is not enough detail in the logs,
 please set the log level to TRACE instead of DEBUG and send us the logs.

 Cheers,
 Lakshmipathi.G


 - Original Message -
 From: phil cryer p...@cryer.us
 To: gluster-users@gluster.org
 Sent: Saturday, May 15, 2010 9:44:18 AM
 Subject: [Gluster-users] Input/output error when running `ls` and `cd` on     
   directories

 I'm getting Input/output errors on gluster mounted directories.
 First, I have a few directories I created a few weeks ago, but when I
 run an ls on them, their status is listed as ???:

 [23:52:54] [r...@clustr06 /mnt/glusterfs]# ls -al
 ls: cannot access lost+found: Input/output error
 ls: cannot access bhl: Input/output error
 total 1920
 drwxr-xr-x  7 root root 294912 2010-05-13 19:11 .
 drwxr-xr-x 27 root root   4096 2010-04-30 15:28 ..
 d?  ? ?    ?         ?                ? bhl
 drwx--  2 root root 294912 2010-05-05 22:37 bin
 drwx--  4 root root 294912 2010-05-10 14:37 clustr-02
 drwx-- 46 root root 294912 2010-05-13 19:13 clustr-04
 d?  ? ?    ?         ?                ? lost+found

 Then, I go into a directory I've been trying to populate with files to
 see how far along it is, and I can't see it:

 [23:55:48] [r...@clustr06 /mnt/glusterfs/clustr-04]# ls grab4*
 ls: cannot access grab43: Input/output error
 ls: cannot access grab44: Input/output error
 ls: cannot access grab45: Input/output error
 grab4:
 grabby.sh  status

 grab40:
 grabby.sh  status

 grab41:
 complete  grabby.sh  status

 grab42:
 grabby.sh  status

 I have glusterfsd running in debug mode, but it's not giving me any
 details.  I've stopped, restarted glusterfsd, unmounting and
 remounting the glusterfs shares after that.  What is happening and how
 can I fix it?  Thanks.

 P
 --
 http://philcryer.com
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users




-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] Input/output error when running `ls` and `cd` on directories

2010-05-14 Thread phil cryer
I'm getting Input/output errors on gluster mounted directories.
First, I have a few directories I created a few weeks ago, but when I
run an ls on them, their status is listed as ???:

[23:52:54] [r...@clustr06 /mnt/glusterfs]# ls -al
ls: cannot access lost+found: Input/output error
ls: cannot access bhl: Input/output error
total 1920
drwxr-xr-x  7 root root 294912 2010-05-13 19:11 .
drwxr-xr-x 27 root root   4096 2010-04-30 15:28 ..
d?  ? ?? ?? bhl
drwx--  2 root root 294912 2010-05-05 22:37 bin
drwx--  4 root root 294912 2010-05-10 14:37 clustr-02
drwx-- 46 root root 294912 2010-05-13 19:13 clustr-04
d?  ? ?? ?? lost+found

Then, I go into a directory I've been trying to populate with files to
see how far along it is, and I can't see it:

[23:55:48] [r...@clustr06 /mnt/glusterfs/clustr-04]# ls grab4*
ls: cannot access grab43: Input/output error
ls: cannot access grab44: Input/output error
ls: cannot access grab45: Input/output error
grab4:
grabby.sh  status

grab40:
grabby.sh  status

grab41:
complete  grabby.sh  status

grab42:
grabby.sh  status

I have glusterfsd running in debug mode, but it's not giving me any
details.  I've stopped and restarted glusterfsd, and unmounted and
remounted the glusterfs shares after that.  What is happening and how
can I fix it?  Thanks.

P
-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Monitoring Gluster availability

2010-05-10 Thread phil cryer
On Fri, May 7, 2010 at 3:13 AM, Kelvin Westlake kel...@netbasic.co.uk wrote:
 Can anybody recommend a way of monitoring gluster availability? I need to
 be made aware if a server or client crashes out. Is there some port or
 system component that can be monitored?

I use monit [http://mmonit.com/monit/] extensively, and have written a
simple config snippet to watch glusterfsd and restart it if it has
failed.

from /etc/monit/monitrc

check process glusterfsd with pidfile /var/run/glusterfsd.pid
start program = /etc/init.d/glusterfsd start
stop program = /etc/init.d/glusterfsd stop
if failed host 127.0.0.1 port 6996 then restart
if loadavg(5min) greater than 10 for 8 cycles then restart
if 5 restarts within 5 cycles then timeout

Today I was looking for a more 'gluster native' way of checking all
the nodes to see if each one in the cluster is up, but haven't
gotten very far, save for pulling the hostnames out of the volfile:

grep "option remote-host" /etc/glusterfs/glusterfs.vol | uniq | cut -d' ' -f7

but from there you'd need to do a shared ssh key setup for a script to
loop through those entries and check things in the logs on all the
servers...
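
A bare-bones version of that loop might look like this - it assumes
passwordless ssh as root to every node, and the pgrep check is only a
liveness test, not a real health check:

for h in $(grep "option remote-host" /etc/glusterfs/glusterfs.vol | uniq | cut -d' ' -f7); do
    printf '%s: ' "$h"
    ssh root@"$h" 'pgrep glusterfsd >/dev/null && echo glusterfsd running || echo glusterfsd DOWN'
done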

Does anyone have a way they do it?

P


On Fri, May 7, 2010 at 3:13 AM, Kelvin Westlake kel...@netbasic.co.uk wrote:
 Hi Guys



 Can anybody recommend a way of monitoring gluster availability? I need to
 be made aware if a server or client crashes out. Is there some port or
 system component that can be monitored?



 Cheers

 Kelvin





 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users





-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] gluster-volgen - syntax for mirroring/distributing across 6 nodes

2010-05-03 Thread phil cryer
On Mon, May 3, 2010 at 1:25 AM, Lakshmipathi lakshmipa...@gluster.com wrote:
 Hi Phil,
 1) Yes, your glusterfs-volgen command should provide the required volume files
 for your setup, but you can also use this:

 glusterfs-volgen --name repstore1 --raid 1 clustr-01:/mnt/data01 
 clustr-02:/mnt/data01 clustr-03:/mnt/data01 clustr-04:/mnt/data01
 clustr-05:/mnt/data01 clustr-06:/mnt/data01 -c ~/files/

 which will create the volume files under your home directory; make sure the
 ~/files directory already exists.

Thanks for this.  I'm currently trying to debug my config: when I try
to mount ONE of the servers' 24 bricks, it's only showing capacity as if
it's mounting just ONE of the bricks.  The error in the logs is:

[2010-05-03 11:18:35] W [posix.c:246:posix_lstat_with_gen] posix1:
Access to /mnt/data01//.. (on dev 16771) is crossing device (2257)

What is this telling me?  I'm looking at just dropping back to
mirroring 2 servers and replicating from there, then trying again
against all 6.  I'll post my config in a bit.

Thanks

P




 2) No, currently glusterfs-volgen doesn't support shorthand methods for export
 directories - they need to be exact strings.

 For more details on volgen - please check
 http://www.gluster.com/community/documentation/index.php/Glusterfs-volgen_Reference_Page


 Cheers,
 Lakshmipathi.G







 - Original Message -
 From: phil cryer p...@cryer.us
 To: gluster-users@gluster.org
 Sent: Friday, April 30, 2010 11:56:44 PM
 Subject: [Gluster-users] gluster-volgen - syntax for mirroring/distributing   
   across 6 nodes

 NOTE: posted this to gluster-devel when I meant to post it to gluster-users

 01 | 02 mirrored --|
 03 | 04 mirrored --| distributed
 05 | 06 mirrored --|

 1) Would this command work for that?
 glusterfs-volgen --name repstore1 --raid 1 clustr-01:/mnt/data01
 clustr-02:/mnt/data01 --raid 1 clustr-03:/mnt/data01
 clustr-04:/mnt/data01 --raid 1 clustr-05:/mnt/data01
 clustr-06:/mnt/data01

 So the 'repstore1' is the distributed part, and within that are 3 sets
 of mirrored nodes.

 2) Then, since we're running 24 drives in JBOD mode, we've got
 mounts from /mnt/data01 - /mnt/data24.  Is there a way to write this
 in shorthand, because the last time I generated a config across 3 of
 these hosts, the command looked like this:

 glusterfs-volgen --name store123 clustr-01:/mnt/data01
 clustr-02:/mnt/data01 clustr-03:/mnt/data01 clustr-01:/mnt/data02
 clustr-02:/mnt/data02 clustr-03:/mnt/data02 clustr-01:/mnt/data03
 clustr-02:/mnt/data03 clustr-03:/mnt/data03 clustr-01:/mnt/data04
 clustr-02:/mnt/data04 clustr-03:/mnt/data04 clustr-01:/mnt/data05
 clustr-02:/mnt/data05 clustr-03:/mnt/data05 clustr-01:/mnt/data06
 clustr-02:/mnt/data06 clustr-03:/mnt/data06 clustr-01:/mnt/data07
 clustr-02:/mnt/data07 clustr-03:/mnt/data07 clustr-01:/mnt/data08
 clustr-02:/mnt/data08 clustr-03:/mnt/data08 clustr-01:/mnt/data09
 clustr-02:/mnt/data09 clustr-03:/mnt/data09 clustr-01:/mnt/data10
 clustr-02:/mnt/data10 clustr-03:/mnt/data10 clustr-01:/mnt/data11
 clustr-02:/mnt/data11 clustr-03:/mnt/data11 clustr-01:/mnt/data12
 clustr-02:/mnt/data12 clustr-03:/mnt/data12 clustr-01:/mnt/data13
 clustr-02:/mnt/data13 clustr-03:/mnt/data13 clustr-01:/mnt/data14
 clustr-02:/mnt/data14 clustr-03:/mnt/data14 clustr-01:/mnt/data15
 clustr-02:/mnt/data15 clustr-03:/mnt/data15 clustr-01:/mnt/data16
 clustr-02:/mnt/data16 clustr-03:/mnt/data16 clustr-01:/mnt/data17
 clustr-02:/mnt/data17 clustr-03:/mnt/data17 clustr-01:/mnt/data18
 clustr-02:/mnt/data18 clustr-03:/mnt/data18 clustr-01:/mnt/data19
 clustr-02:/mnt/data19 clustr-03:/mnt/data19 clustr-01:/mnt/data20
 clustr-02:/mnt/data20 clustr-03:/mnt/data20 clustr-01:/mnt/data21
 clustr-02:/mnt/data21 clustr-03:/mnt/data21 clustr-01:/mnt/data22
 clustr-02:/mnt/data22 clustr-03:/mnt/data22 clustr-01:/mnt/data23
 clustr-02:/mnt/data23 clustr-03:/mnt/data23 clustr-01:/mnt/data24
 clustr-02:/mnt/data24 clustr-03:/mnt/data24

 Thanks

 P
 --
 http://philcryer.com
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users




-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] gluster-volgen - syntax for mirroring/distributing across 6 nodes

2010-04-30 Thread phil cryer
NOTE: posted this to gluster-devel when I meant to post it to gluster-users

01 | 02 mirrored --|
03 | 04 mirrored --| distributed
05 | 06 mirrored --|

1) Would this command work for that?
glusterfs-volgen --name repstore1 --raid 1 clustr-01:/mnt/data01
clustr-02:/mnt/data01 --raid 1 clustr-03:/mnt/data01
clustr-04:/mnt/data01 --raid 1 clustr-05:/mnt/data01
clustr-06:/mnt/data01

So the 'repstore1' is the distributed part, and within that are 3 sets
of mirrored nodes.

2) Then, since we're running 24 drives in JBOD mode, we've got
mounts from /mnt/data01 - /mnt/data24.  Is there a way to write this
in shorthand (a loop sketch follows the full command below)?  The last
time I generated a config across 3 of these hosts, the command looked like this:

glusterfs-volgen --name store123 clustr-01:/mnt/data01
clustr-02:/mnt/data01 clustr-03:/mnt/data01 clustr-01:/mnt/data02
clustr-02:/mnt/data02 clustr-03:/mnt/data02 clustr-01:/mnt/data03
clustr-02:/mnt/data03 clustr-03:/mnt/data03 clustr-01:/mnt/data04
clustr-02:/mnt/data04 clustr-03:/mnt/data04 clustr-01:/mnt/data05
clustr-02:/mnt/data05 clustr-03:/mnt/data05 clustr-01:/mnt/data06
clustr-02:/mnt/data06 clustr-03:/mnt/data06 clustr-01:/mnt/data07
clustr-02:/mnt/data07 clustr-03:/mnt/data07 clustr-01:/mnt/data08
clustr-02:/mnt/data08 clustr-03:/mnt/data08 clustr-01:/mnt/data09
clustr-02:/mnt/data09 clustr-03:/mnt/data09 clustr-01:/mnt/data10
clustr-02:/mnt/data10 clustr-03:/mnt/data10 clustr-01:/mnt/data11
clustr-02:/mnt/data11 clustr-03:/mnt/data11 clustr-01:/mnt/data12
clustr-02:/mnt/data12 clustr-03:/mnt/data12 clustr-01:/mnt/data13
clustr-02:/mnt/data13 clustr-03:/mnt/data13 clustr-01:/mnt/data14
clustr-02:/mnt/data14 clustr-03:/mnt/data14 clustr-01:/mnt/data15
clustr-02:/mnt/data15 clustr-03:/mnt/data15 clustr-01:/mnt/data16
clustr-02:/mnt/data16 clustr-03:/mnt/data16 clustr-01:/mnt/data17
clustr-02:/mnt/data17 clustr-03:/mnt/data17 clustr-01:/mnt/data18
clustr-02:/mnt/data18 clustr-03:/mnt/data18 clustr-01:/mnt/data19
clustr-02:/mnt/data19 clustr-03:/mnt/data19 clustr-01:/mnt/data20
clustr-02:/mnt/data20 clustr-03:/mnt/data20 clustr-01:/mnt/data21
clustr-02:/mnt/data21 clustr-03:/mnt/data21 clustr-01:/mnt/data22
clustr-02:/mnt/data22 clustr-03:/mnt/data22 clustr-01:/mnt/data23
clustr-02:/mnt/data23 clustr-03:/mnt/data23 clustr-01:/mnt/data24
clustr-02:/mnt/data24 clustr-03:/mnt/data24
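
If there's no shorthand in volgen itself, a shell loop can at least build
the same argument list - an untested sketch for the three hosts and 24
mounts above:

bricks=""
for n in $(seq -w 1 24); do
    for h in clustr-01 clustr-02 clustr-03; do
        bricks="$bricks $h:/mnt/data$n"
    done
done
# bricks is left unquoted on purpose so it expands into separate arguments
glusterfs-volgen --name store123 $bricks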

Thanks

P
-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] client mount fails on boot under debian lenny...

2010-04-24 Thread phil cryer
On Sat, Apr 24, 2010 at 9:55 PM,  mki-gluste...@mozone.net wrote:
 On Sat, Apr 24, 2010 at 07:57:32PM +0200, Smart Weblications GmbH - Florian 
 Wiessner wrote:
 On 24.04.2010 19:08, mki-gluste...@mozone.net wrote:
  I quote below: The fstab entry contains options noatime,_netdev
  already. :)  Any other thoughts?

 the parameter is iirc no_netdev. you can also add glusterfs to the function
 mount_all_local() in /etc/init.d/mountall.sh and update 
 ../init.d/mountnfs.sh to
 mount glusterfs.

 ../init.d/mountnfs.sh is executed after networking is established.

 Yeah I did try modifying mountall.sh, mountnfs.sh and a couple of others
 that already had references to gfs/ocfs in them, and added glusterfs to the
 list.  I even added it to /etc/network/if-up.d/mountnfs, but even with
 that it did the same thing and barfed until I added a sleep 3 to the
 script right before its mount attempt line.  My fear with modifying all
 those system files is the next apt-get update/upgrade will end up blowing
 the changes away?

My fear with modifying all those system files is the next apt-get
 update/upgrade will end up blowing the changes away?

This would be my concern as well, and bolsters the original thought of
using /etc/rc.local to handle it.  Perhaps background a `sleep x;
mount -a` command, or the like.
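
Something along these lines in /etc/rc.local, say - the 10 second delay
is a guess, and it assumes the glusterfs entry is already in fstab:

#!/bin/sh -e
# give networking and glusterfsd a moment, then mount any glusterfs
# fstab entries that are still missing
( sleep 10; mount -a -t glusterfs ) &
exit 0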

P
-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Monitoring gluster with nagios - was Re: [Gluster-devel] Gluster health/status

2010-04-14 Thread phil cryer
Ian
Very nice, IMO this should be added to the Gluster wiki.

P

On Tue, Apr 13, 2010 at 5:07 PM, Ian Rogers ian.rog...@contactclean.com wrote:

 Answering my own question, hope these instructions are useful -
 http://www.sirgroane.net/2010/04/monitoring-gluster-with-nagios/

 Cheers,

 Ian

 On 09/04/2010 06:01, Ian Rogers wrote:

 Gluster devs,

 I found the message below in the archives. glfs-health.sh is not included
 in the v3.0.3 sources - is there any plan to add this to the extras
 directory? What's its status?

 Ian

 == snip ==

 Raghavendra G
 Mon, 22 Feb 2010 20:20:33 -0800

 Hi all,

 Here is some work related to Health monitoring. glfs-health.sh is a shell
 script to check the health of glusterfs.

 http://git.gluster.com/?p=users/avati/glfs-health.git;a=blob_plain;f=glfs-health.sh;hb=5bf3cb50452525f545018fa5f8eed06cb2fbbe7d

 Documentation can be found from

 http://git.gluster.com/?p=users/avati/glfs-health.git;a=blob_plain;f=README;hb=5bf3cb50452525f545018fa5f8eed06cb2fbbe7d

 We welcome improvements and discussions on this.


 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users




-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Random No such file or directory error in gluster client logs - FIXED

2010-03-18 Thread phil cryer
 The solution was quite simple. It turned out that it was because the server's 
 data drive was formatted in ext4. Switched it to ext3 and the problems went 
 away!

Is this a known issue with Gluster 3.0.3?  I've set up our cluster with
ext4 on Debian, but have not had any issues like this yet (then again,
we're not running live yet).  Is this something to be concerned about?
Should we change everything back to ext3?
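
For anyone double-checking their own setup, something like this shows
what each backend export is actually formatted as (assuming the bricks
are mounted under /mnt/data* as elsewhere in this thread):

# print device, mount point and filesystem type for every brick mount
mount | grep -E '/mnt/data[0-9]+' | awk '{print $1, $3, $5}'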

P

On Thu, Mar 18, 2010 at 8:48 AM, Lee Simpson l...@leenix.co.uk wrote:
 Hello,

 Just thought I'd share the experience I had with a gluster client error and
 the solution I found after much searching and chatting with some IRC guys.

 I'm running a simple 2-server setup with multiple clients using cluster/replicate.
 Randomly, newly created files produced the following error in the gluster
 client logs when accessed:

 W [fuse-bridge.c:858:fuse_fd_cbk] glusterfs-fuse: 59480: OPEN()
 /data/randomfile-here => -1 (No such file or directory)

 These files are created by apache or other scripts (such as awstats on a
 cron).  Apache is then unable to read the file, and the above message appears
 in the gluster logs every time you try. If I SSH into the apache server and
 cat the file, it displays fine and then apache starts reading it fine.

 I upgraded the client and server to 3.0.3 and tried reducing my configs to the
 bare minimum without any performance volumes, but the problem persisted...


 SOLUTION

 The solution was quite simple. It turned out that it was because the server's 
 data drive was formatted in ext4. Switched it to ext3 and the problems went 
 away!


 Hope that helps someone else who finds this.


 - Lee







 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users




-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster-users Digest, Vol 20, Issue 22

2010-01-05 Thread phil cryer
 that it is much less of a problem.

 By using more smaller machines you also eliminate the need for redundant
 power supplies (which would be a requirement in your large boxes because
 it would be a single point of failure on a large percentage of your
 storage system).

 Hope the information helps.

 Regards,
 Larry Bates


 --
 Message: 6
 Date: Thu, 17 Dec 2009 00:18:54 -0600
 From: phil cryer p...@cryer.us
 Subject: [Gluster-users] Recommended GlusterFS configuration for 6
       node    cluster
 To: gluster-users@gluster.org gluster-users@gluster.org
 Message-ID:
       3a3bc55a0912162218i4e3f326cr9956dd37132bf...@mail.gmail.com
 Content-Type: text/plain; charset=UTF-8

 We're setting up 6 servers, each with 24 x 1.5TB drives, the systems
 will run Debian testing and Gluster 3.x.  The SATA RAID card offers
 RAID5 and RAID6, we're wondering what the optimum setup would be for
 this configuration.  Do we RAID5 the disks, and have GlusterFS use
 them that way, or do we keep them all 'raw' and have GlusterFS handle
 the replication (though not 2x as we would have with the RAID
 options)?  Obviously a lot of ways to do this, just wondering what
 GlusterFS devs and other experienced users would recommend.

 Thanks

 P

 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users




-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] volume sizes

2009-12-30 Thread phil cryer
Thanks for all of the responses.  As Anthony said, we're just now
building this out, so we look forward to doing benchmarking to find
what's right for our environment, and of course sharing it to help
adoption of GlusterFS.  Also, we gave a talk in November at a
conference in France covering our plans and reasons for moving to
Gluster; you can see the slides for the talk here:
http://www.slideshare.net/phil.cryer/building-a-scalable-open-source-storage-solution-2482448

All questions, comments are welcome!

I'm now working on some detailed documentation for a Gluster install
using the latest Debian (Squeeze) testing branch to take advantage
of ext4.  The storage platform looks very promising; I hope that you
can break the web UI out of it so we can install it on our 'hand
rolled' boxes.  Thanks again for the support!

P

On Wed, Dec 30, 2009 at 12:07 PM, Tejas N. Bhise te...@gluster.com wrote:
 Thanks, Raghvendra.

 Anthony,

 It's a lazy self-heal mechanism, if you will. If one wants it all done right
 away, an ls -alR will access each file and hence cause the rebuild of the
 whole glusterfs volume, which _may_, like you mentioned, be spread across disk
 partitions, LVM/RAID LUNs or even server nodes.
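
(Concretely, that walk is just run from a client mount point - the path
below is only the usual example from this list:

# stat every file through the mount so replicate heals each one on access
ls -alR /mnt/glusterfs > /dev/null
)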

 Even after all that, only the files impacted in the volume would need to be 
 rebuilt - although there might be some difference in overheads for different 
 sized and configured Glusterfs volumes. It might be interesting to check - we 
 have not done numbers on this.

 Let me check with the person who is more familiar with this area of code than 
 me and he may be able to suggest  some ballpark numbers till we run some real 
 numbers. Meanwhile, if you do some tests, please share the numbers with the 
 community.

 Regards,
 Tejas.


 - Original Message -
 From: Raghavendra G raghaven...@gluster.com
 To: Anthony Goddard agodd...@mbl.edu
 Cc: Tejas N. Bhise te...@gluster.com, gluster-users 
 gluster-users@gluster.org
 Sent: Wednesday, December 30, 2009 9:10:23 PM GMT +05:30 Chennai, Kolkata, 
 Mumbai, New Delhi
 Subject: Re: [Gluster-users] volume sizes

 Hi Anthony,


 On Wed, Dec 30, 2009 at 6:30 PM, Anthony Goddard  agodd...@mbl.edu  wrote:


 Hi Tejas,
 Thanks for the advice. I will be using RAID as well as gluster replication I 
 think.. as we'll only need to sacrifice 1 drive per raid set to add a bit of 
 extra redundancy.

 The rebuild happens at the first access of a file, does this mean that the 
 entire brick/node is rebuilt upon an initial file access?

 No, only the file which is accessed is rebuilt. That is the reason we 
 recursively access all the files using 'ls -laR' on mount point.



 I think this is what I've seen from using gluster previously. If this is the 
 case, it would rebuild the entire volume which could span many raid volumes 
 or even machines, is this correct? If this is the case, then the underlying 
 disk wouldn't have any effect at all, but if it's spanned over multiple 
 machines and it only needs to rebuild one machine (or multiple volumes on one 
 machine) it only needs to rebuild one volume.
 I don't know if that made any sense.. haha.. but if it did, any insights into 
 whether the size of the volumes (aside from RAID rebuilds) will have a 
 positive effect on gluster's rebuild operations?


 Cheers,
 Ant.





 On Dec 30, 2009, at 2:56 AM, Tejas N. Bhise wrote:

 Anthony,

 Gluster can take the smaller ( 6TB ) volumes and aggregate them into a large 
 Gluster volume ( as seen from the clients ). So that takes care of 
 managebility on the client side of things. On the server side, once you make 
 those smaller 6 TB volumes, you will depend on RAID to rebuild the disk 
 behind it, so its good to have a smaller partition. Since you are using RAID 
 and not Gluster replication, it might just make sense to have smaller RAID 
 partitions.

 If instead you were using Gluster replication and resulting recovery, it 
 would happen at first access of the file and the size of the Gluster volume 
 or the backend native FS volume or the RAID ( or raw ) partition behind it 
 would not be much of a consideration.

 Regards,
 Tejas.

 - Original Message -
 From: Anthony Goddard  agodd...@mbl.edu 
 To: gluster-users@gluster.org
 Sent: Wednesday, December 30, 2009 3:24:35 AM GMT +05:30 Chennai, Kolkata, 
 Mumbai, New Delhi
 Subject: [Gluster-users] volume sizes

 First post!
 We're looking at setting up 6x 24 bay storage servers (36TB of JBOD storage 
 per node) and running glusterFS over this cluster.
 We have RAID cards on these boxes and are trying to decide what the best 
 size of each volume should be, for example if we present the OS's (and 
 gluster) with six 36TB volumes, I imagine rebuilding one node would take a 
 long time, and there may be other performance implications of this. On the 
 other hand, if we present gluster / the OS's with 6x 6TB volumes on each 
 node, we might have more trouble in managing a larger number of volumes.

 My gut tells me a lot 

Re: [Gluster-users] Recommended GlusterFS configuration for 6 node cluster

2009-12-17 Thread phil cryer
Thanks Tejas
We have scanned biodiversity texts, so our aims are to have storage
capable of holding our full store of data (approx 24TB) and then, as
we'll have everything in one place, to be able to serve the data up via
standard HTTP calls.  Later we will look at doing syncs of the data
once we have other clusters up regionally, and globally, for further
redundancy as well as to provide better presentation for other parts of
the world.  So overall we just need to have a cluster that is
redundant and is able to serve files relatively quickly (some of the
scans are large, whereas the accompanying metadata files are small).
GlusterFS gives us this ability, something we've wanted for some time,
so this is amazing functionality for us to put into place.

Does this give you enough to go on?  If not, let me know, I appreciate
any/all suggestions.

P

On Thu, Dec 17, 2009 at 4:29 AM, Tejas N. Bhise te...@gluster.com wrote:
 Hi Phil,

 It's great to know that you are using Gluster. It would be easy to make 
 suggestions on the points you bring up if there is more information on what 
 use you want to put the system to.

 Regards,
 Tejas.

 - Original Message -
 From: phil cryer p...@cryer.us
 To: gluster-users@gluster.org
 Sent: Thursday, December 17, 2009 11:48:54 AM GMT +05:30 Chennai, Kolkata, 
 Mumbai, New Delhi
 Subject: [Gluster-users] Recommended GlusterFS configuration for 6 node 
 cluster

 We're setting up 6 servers, each with 24 x 1.5TB drives, the systems
 will run Debian testing and Gluster 3.x.  The SATA RAID card offers
 RAID5 and RAID6, we're wondering what the optimum setup would be for
 this configuration.  Do we RAID5 the disks, and have GlusterFS use
 them that way, or do we keep them all 'raw' and have GlusterFS handle
 the replication (though not 2x as we would have with the RAID
 options)?  Obviously a lot of ways to do this, just wondering what
 GlusterFS devs and other experienced users would recommend.

 Thanks

 P
 --
 http://philcryer.com
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users




-- 
http://philcryer.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users