Re: [Gluster-users] gluster no longer serving all of its files

2012-02-01 Thread Pranith Kumar K

hi Pat,
It seems that brick is disconnecting frequently:
pranithk @ ~/Downloads/pat-mit/old
14:48:39 :) $ grep -w disconnected gdata.log | awk '{print $5, $6}' | sort | uniq -c

  5 0-gdata-client-0: disconnected
  6 0-gdata-client-1: disconnected
308 0-gdata-client-2: disconnected   <- it disconnected 308 
times, whereas the others fewer than 10

  2 gdata-client-0: disconnected
  1 gdata-client-1: disconnected
  8 gdata-client-2: disconnected

You mentioned that you tried running self-heal, but I don't see any 
replicate configuration in your setup; it is a plain distribute setup. 
Am I missing something?


This is the file I looked at: 
http://mseas.mit.edu/download/phaley/GlusterUsers/Old/client_logs/gdata.log


Pranith.

On 01/30/2012 11:47 PM, Pat Haley wrote:


Hi,

We recently upgraded our version of Gluster from 3.1.4 to
3.2.5. Shortly afterwards I noticed that Gluster was not serving
up all of its files; that is, if I log onto one of the individual
bricks and do an ls of the underlying nfs directories, I can
see files that I do not see if I do an ls from a client node
in the equivalent Gluster directory.

We may have caused this by also running a script that went
through and removed pointer files (this was how we had been
dealing with bad pointers under Gluster 3.1.4).

So far, we have tried
   - running a gluster self-heal under version 3.2.5
   - rolling back to version 3.1.4
   - running a gluster self-heal under version 3.1.4
   - running a rebalance under version 3.1.4
None of these have solved the problem.  The one other
piece of data we have is that all the files which are
not being served appear to reside on a single brick
(at least, every missing file we have found so far has
 been on that server, and we have not found one missing
 from the other servers).
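
For reference, the usual way to trigger a full self-heal on pre-3.3
Gluster releases was to stat every file through a client mount; a
minimal sketch, assuming the volume is mounted at /mnt/gdata (the
path is an assumption, not from this thread):

    # Walk the mount and stat each file; on a replicated volume this
    # forces self-heal of any out-of-sync copies.
    find /mnt/gdata -noleaf -print0 | xargs --null stat > /dev/null

Note that, as Pranith points out above, self-heal only applies to
replicated volumes; on a plain distribute volume there is no second
copy to heal from.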

Any advice you can give us would be greatly appreciated.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley  Email:  pha...@mit.edu
Center for Ocean Engineering   Phone:  (617) 253-6824
Dept. of Mechanical EngineeringFax:(617) 253-8125
MIT, Room 5-222B   http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] best practices? 500+ win/mac computers, gluster, samba, new SAN, new hardware

2012-02-01 Thread D. Dante Lorenso

On 1/28/12 6:02 PM, Brian Candler wrote:

On Sat, Jan 28, 2012 at 05:31:28PM -0600, D. Dante Lorenso wrote:

Thinking about buying 8 servers with 4 x 2TB 7200 rpm SATA drives
(expandable to 8 drives).  Each server will have 8 network ports and
will be connected to a SAN switch using 4 ports link aggregated and
connected to a LAN switch using the other 4 ports aggregated.  The
servers will run CentOS 6.2 Linux.  The LAN side will run Samba and
export the network shares, and the SAN side will run Gluster daemon.


Just a terminology issue, but Gluster isn't really a SAN, it's a distributed
NAS.

A SAN uses a block-level protocol (e.g. iSCSI), into which the client runs a
regular filesystem like ext4 or xfs or whatever.  A NAS is a file-sharing
protocol (e.g.  NFS).  Gluster is the latter.


I need a word to describe the switch that I'll plug all my storage 
machines into.  Distributed NAS sounds good.  Might have a few iSCSI 
devices on there too, however.



With 8 machines and 4 ports for SAN each, I need 32 ports total.
I'm thinking a 48 port switch would work well as a SAN back-end
switch giving me left over space to add iSCSI devices and backup
servers which need to hook into the SAN.


Out of interest, why are you considering two different network fabrics?  Is
there one set of clients which are talking CIFS and a different set of
clients using the Gluster native client?


Most of my clients (95%) are all Windows 7 workstations.  The only way I 
think I can get GlusterFS to work with Win7 is through Samba.  I was 
planning to use SMB/CIFS on the Win7 side of the network (using 2 bonded 
ports) and use Gluster native client on the storage side (using another 
2 bonded ports).
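
For concreteness, a minimal sketch of that layout; the hostname
"storage1", volume name "gvol", and paths are assumptions:

    # Mount the volume with the Gluster native (FUSE) client on the
    # Samba host; storage1 and gvol are assumed names.
    mount -t glusterfs storage1:/gvol /mnt/gluster

    # Then export the mount point to the Win7 clients via
    # /etc/samba/smb.conf:
    #   [shared]
    #       path = /mnt/gluster
    #       read only = no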



4) Performance tuning.  So far, I've discovered using dd and iperf
to debug my transfer rates.  I use dd to test raw speed of the
underlying disks (should I use RAID 0, RAID 1, RAID 5 ?)


Try some dd measurements onto a RAID 5 volume, especially for writing, and
you'll find it sucks.

I also suggest something like bonnie++ to get a more realistic performance
measurement than just the dd throughput, as it will include seeks and
filesystem operations (e.g.  file creations/deletions).


Good advice, I'll check into bonnie++.
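
Hedged examples of the measurements discussed above; device paths and
hostnames are assumptions:

    # Raw sequential write speed of the underlying disks (4 GiB file,
    # bypassing the page cache):
    dd if=/dev/zero of=/data/test.bin bs=1M count=4096 oflag=direct

    # Network throughput between two hosts (run "iperf -s" on the
    # server side first):
    iperf -c storage1 -t 30

    # Seek- and metadata-heavy workload; -u is required when run as root:
    bonnie++ -d /data -u nobody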


Perhaps if my drives on each of the 8
servers are RAID 0, then I can use replicate 2 through gluster and
get the RAID 1 equivalent.  I think using replicate 2 in gluster
will 1/2 my network write/read speed, though.


In theory Gluster replication ought to improve your read speed, since some
clients can access one copy spindle while other clients access the other.
But I'm not sure how much it will impact the write speed.

I would however suggest that building a local RAID 0 array is probably a bad
idea, because if one disk of the set fails, that whole filesystem is toast.

Gluster does give you the option of a distributed replicated volume, so
you can get both the RAID 0 and RAID 1 functionality.


If you have 8 drives connected to a single machine, how do you introduce 
those drives to Gluster?  I was thinking I'd combine them into a single 
volume using RAID 0, mount that volume on a box, and turn it into a 
brick.  Otherwise you have to add 8 separate bricks, right?  That's not 
better, is it?


-- Dante

D. Dante Lorenso
da...@lorenso.com
972-333-4139
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] best practices? 500+ win/mac computers, gluster, samba, new SAN, new hardware

2012-02-01 Thread D. Dante Lorenso

On 1/28/12 5:46 PM, Matthew Mackes wrote:

Hello,

You are on the right track.  Your mount points are fine.  I like to mount my
Gluster storage under /mnt/gluster and place my bricks inside /STORAGE.

I think that you are planning many more network interfaces per node than
your 4 (or even 8) SATA drives per node at 7200 RPM will require. 2
aggregate ports should be plenty for heavy load, and one for normal use.

In my experience the 7200 RPM SATA drives will be your bottleneck.
  15,000 RPM SAS is a better choice for a storage node that has to
sustain a heavy storage load.



OMG, drive prices are insane!  Just 4 x 2TB SATA drives cost more than 
the rest of an entire 2U system.  We are considering getting desktop 
drives temporarily and waiting for Thailand to rebuild before filling 
out the remainder of our disk arrays!



The only case I can think of for 4+ network interfaces per machine is if
you intend to subnet your Gluster SAN network from your normal network
used for storage access and administration. In that case you could bond
2 interfaces for the Gluster SAN network (for replication, striping, etc.
between nodes) and bond the other pair for your Samba and management
access.


We are thinking now to just get 4 NICs: 2 bonded for the storage 
network and 2 bonded for the Samba interface to the LAN.  That ought 
to do it.  I think your math on the drive speed vs network speed is right.


-- Dante

D. Dante Lorenso
da...@lorenso.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] best practices? 500+ win/mac computers, gluster, samba, new SAN, new hardware

2012-02-01 Thread Brian Candler
On Wed, Feb 01, 2012 at 12:21:17PM -0600, D. Dante Lorenso wrote:
 Gluster does give you the option of a distributed replicated volume, so
 you can get both the RAID 0 and RAID 1 functionality.
 
 If you have 8 drives connected to a single machine, how do you
 introduce those drives to Gluster?  I was thinking I'd combine them
 into a single volume using RAID 0 and mount that volume on a box and
 turn it into a brick.  Otherwise you have to add 8 separate bricks,
 right?  That's not better is it?

I'm in the process of building a pair of test systems (in my case 12 disks
per server), and haven't quite got to building the Gluster layer, but yes 8
separate filesystems and 8 separate bricks per server is what I'm suggesting
you consider.

Then you create a distributed replicated volume using 16 bricks across 2
servers, added in the correct order so that they pair up and down
(serverA:brick1 serverB:brick1 serverA:brick2 serverB:brick2 etc) - or
across 4 servers or however many you're building.
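
A minimal sketch of that ordering; server names, brick paths, and the
volume name are assumptions:

    # With "replica 2", consecutive bricks in the list form mirror
    # pairs, so alternate servers to keep each pair on different machines.
    gluster volume create gvol replica 2 transport tcp \
        serverA:/bricks/disk1 serverB:/bricks/disk1 \
        serverA:/bricks/disk2 serverB:/bricks/disk2 \
        serverA:/bricks/disk3 serverB:/bricks/disk3 \
        serverA:/bricks/disk4 serverB:/bricks/disk4
    # ...continue the pattern through disk8, then:
    gluster volume start gvol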

The advantage is that if you lose one disk, 7/8 of the data still has
both copies, and 1/8 is still available on one disk.  If you lose a second
disk, there is a 1 in 15 chance that it's the mirror of the other failed
one, but a 14 in 15 chance that you won't lose any data.  Furthermore,
replacing the failed disk will only have to synchronise (heal) one disk
worth of data.

Now, if you decide to make RAID0 sets instead, then losing one disk will
destroy the whole filesystem.  If you lose any disk in the second server you
will have lost everything.  And when you replace the one failed disk, you
will need to make a new filesystem across the whole RAID0 array and resync
all 8 disks worth of data.

I think it only makes sense to build an array brick if you are using RAID1
or higher.  RAID1 or RAID10 is fast but presumably you don't want to store 4
copies of your data, 2 on each server.  The write performance of RAID5 and
RAID6 is terrible.  An expensive RAID card with battery-backed write-through
cache will make it slightly less terrible, but still terrible.

Regards,

Brian.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] Sorry for x-post: RDMA Mounts drop with transport endpoints not connected

2012-02-01 Thread Brian Smith
Having serious issues w/ glusterfs 3.2.5 over rdma.  Clients are 
periodically dropping off with transport endpoint not connected.  Any 
help would be appreciated. Environment is HPC.  GlusterFS is being used 
as a shared /work|/scratch directory.  Standard distributed volume 
configuration.  Nothing fancy.


Pastie log snippet is here: http://pastie.org/3291330

Any help would be appreciated!

--
Brian Smith
Senior Systems Administrator
IT Research Computing, University of South Florida
4202 E. Fowler Ave. ENB308
Office Phone: +1 813 974-1467
Organization URL: http://rc.usf.edu
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Sorry for x-post: RDMA Mounts drop with transport endpoints not connected

2012-02-01 Thread Joe Landman

On 02/01/2012 04:49 PM, Brian Smith wrote:

Having serious issues w/ glusterfs 3.2.5 over rdma. Clients are
periodically dropping off with transport endpoint not connected. Any
help would be appreciated. Environment is HPC. GlusterFS is being used
as a shared /work|/scratch directory. Standard distributed volume
configuration. Nothing fancy.

Pastie log snippet is here: http://pastie.org/3291330

Any help would be appreciated!




What OS, kernel rev, OFED version, etc.?  What HCAs, switch, etc.?

What does ibv_devinfo report for nodes experiencing the transport 
endpoint issue?
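
For reference, a few commands that answer those questions on the
affected nodes (ofed_info is only present when the OFED stack is
installed):

    ibv_devinfo            # HCA model, firmware, and port state (want PORT_ACTIVE)
    ibstat                 # per-port link state and rate
    uname -r               # kernel revision
    ofed_info | head -n 1  # OFED release, if installed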


--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: land...@scalableinformatics.com
web  : http://scalableinformatics.com
   http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users