Re: [Gluster-users] gluster no longer serving all of its files
hi Pat,

It seems that brick is disconnecting frequently:

pranithk @ ~/Downloads/pat-mit/old
14:48:39 :) $ grep -w disconnected gdata.log | awk '{print $5, $6}' | sort | uniq -c
      5 0-gdata-client-0: disconnected
      6 0-gdata-client-1: disconnected
    308 0-gdata-client-2: disconnected

- It disconnected 308 times, whereas the others only around 10:

      2 gdata-client-0: disconnected
      1 gdata-client-1: disconnected
      8 gdata-client-2: disconnected

You were mentioning that you tried doing self-heal, but I don't see any
replicate configuration in your setup. It is a plain distribute setup. Am I
missing something?

This is the file I looked into:
http://mseas.mit.edu/download/phaley/GlusterUsers/Old/client_logs/gdata.log

Pranith.

On 01/30/2012 11:47 PM, Pat Haley wrote:
> Hi,
>
> We recently upgraded our version of gluster from 3.1.4 to 3.2.5. Shortly
> after, I noticed that gluster was not serving up all of its files; that
> is, if I log onto one of the individual bricks and do an ls of the
> underlying NFS directories, I can see files that I do not see if I do an
> ls from a client node in the equivalent gluster directory.
>
> We may have caused this by also running a script that went through and
> removed pointer files (this had been how we had been dealing with bad
> pointers under gluster 3.1.4). So far, we have tried:
>
> - running a gluster self-heal under version 3.2.5
> - rolling back to version 3.1.4
> - running a gluster self-heal under version 3.1.4
> - running a rebalance under version 3.1.4
>
> None of these have solved the problem. The one other piece of data we
> have is that all the files which are not being served appear to reside on
> a single brick (at least every file we have missed so far has been on
> that server, and we have not found one missing from the other servers).
>
> Any advice you can give us would be greatly appreciated.
>
> Pat Haley                          Email: pha...@mit.edu
> Center for Ocean Engineering       Phone: (617) 253-6824
> Dept. of Mechanical Engineering    Fax:   (617) 253-8125
> MIT, Room 5-222B                   http://web.mit.edu/phaley/www/
> 77 Massachusetts Avenue
> Cambridge, MA 02139-4301

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
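Pranith's one-liner generalises to any GlusterFS client log. Here is the same
pipeline as a self-contained sketch; the two sample lines are synthetic
stand-ins for real 3.x client-log entries, and the field positions ($5 and $6)
assume the log layout shown above.

```shell
# Count how often each client translator logged a disconnect.
# The sample log below is synthetic; point LOG at your real client log
# (e.g. /var/log/glusterfs/gdata.log) instead.
LOG=sample-gdata.log
printf '%s\n' \
  '[2012-01-30 11:47:01.1] I [client.c:2090:client_rpc_notify] 0-gdata-client-2: disconnected from remote host' \
  '[2012-01-30 11:48:01.1] I [client.c:2090:client_rpc_notify] 0-gdata-client-2: disconnected from remote host' \
  '[2012-01-30 11:49:01.1] I [client.c:2090:client_rpc_notify] 0-gdata-client-0: disconnected from remote host' \
  > "$LOG"

# Field 5 is the client subvolume name, field 6 the word "disconnected";
# sort -rn puts the most frequently disconnecting brick first.
grep -w disconnected "$LOG" | awk '{print $5, $6}' | sort | uniq -c | sort -rn
```

If one subvolume's count dwarfs the others, as client-2 did here with 308,
that brick's network path or brick process is the place to look first.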
Re: [Gluster-users] best practices? 500+ win/mac computers, gluster, samba, new SAN, new hardware
On 1/28/12 6:02 PM, Brian Candler wrote:
> On Sat, Jan 28, 2012 at 05:31:28PM -0600, D. Dante Lorenso wrote:
>> Thinking about buying 8 servers with 4 x 2TB 7200 RPM SATA drives
>> (expandable to 8 drives). Each server will have 8 network ports and will
>> be connected to a SAN switch using 4 ports link-aggregated, and to a LAN
>> switch using the other 4 ports aggregated. The servers will run CentOS
>> 6.2 Linux. The LAN side will run Samba and export the network shares,
>> and the SAN side will run the Gluster daemon.
>
> Just a terminology issue, but Gluster isn't really a SAN, it's a
> distributed NAS. A SAN uses a block-level protocol (e.g. iSCSI), on top
> of which the client runs a regular filesystem like ext4 or xfs or
> whatever. A NAS is a file-sharing protocol (e.g. NFS). Gluster is the
> latter.

I need a word to describe the switch that I'll plug all my storage machines
into. "Distributed NAS" sounds good. I might have a few iSCSI devices on
there too, however.

>> With 8 machines and 4 ports for the SAN each, I need 32 ports total.
>> I'm thinking a 48-port switch would work well as a SAN back-end switch,
>> giving me leftover space to add iSCSI devices and backup servers which
>> need to hook into the SAN.
>
> Out of interest, why are you considering two different network fabrics?
> Is there one set of clients talking CIFS and a different set of clients
> using the Gluster native client?

Most of my clients (95%) are Windows 7 workstations. The only way I think I
can get GlusterFS to work with Win7 is through Samba. I was planning to use
SMB/CIFS on the Win7 side of the network (using 2 bonded ports) and the
Gluster native client on the storage side (using another 2 bonded ports).

>> 4) Performance tuning. So far, I've discovered using dd and iperf to
>> debug my transfer rates. I use dd to test the raw speed of the
>> underlying disks (should I use RAID 0, RAID 1, or RAID 5?).
>
> Try some dd measurements onto a RAID 5 volume, especially for writing,
> and you'll find it sucks.
>
> I also suggest something like bonnie++ to get a more realistic
> performance measurement than just the dd throughput, as it will include
> seeks and filesystem operations (e.g. file creations/deletions).

Good advice, I'll check into bonnie++.

>> Perhaps if my drives on each of the 8 servers are RAID 0, then I can
>> use replica 2 through gluster and get the RAID 1 equivalent. I think
>> using replica 2 in gluster will halve my network write/read speed,
>> though.
>
> In theory Gluster replication ought to improve your read speed, since
> some clients can access one copy spindle while other clients access the
> other. But I'm not sure how much it will impact the write speed.
>
> I would however suggest that building a local RAID 0 array is probably a
> bad idea, because if one disk of the set fails, the whole filesystem is
> toast. Gluster does give you the option of a distributed replicated
> volume, so you can get both the RAID 0 and RAID 1 functionality.

If you have 8 drives connected to a single machine, how do you introduce
those drives to Gluster? I was thinking I'd combine them into a single
volume using RAID 0, mount that volume on a box, and turn it into a brick.
Otherwise you have to add 8 separate bricks, right? That's not better, is
it?

-- Dante

D. Dante Lorenso
da...@lorenso.com
972-333-4139
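The dd measurements Brian suggests are easy to inflate via the page cache;
a minimal sketch that flushes to disk so the number is honest (the file name
and size are placeholders, and `conv=fdatasync` assumes GNU dd as shipped
with CentOS 6):

```shell
# Rough sequential-write test: write the data and force it to disk at
# the end, so the page cache doesn't inflate the reported rate. Repeat
# on each candidate layout (RAID 0 / 1 / 5) to compare them.
dd if=/dev/zero of=testfile bs=1M count=64 conv=fdatasync

# Rough sequential-read test; for honest numbers the file should be
# larger than RAM (or the cache dropped), otherwise you measure memory.
dd if=testfile of=/dev/null bs=1M
rm -f testfile
```

As Brian notes, this only shows sequential throughput; bonnie++ adds the
seeks and create/delete operations that a real Samba workload generates.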
Re: [Gluster-users] best practices? 500+ win/mac computers, gluster, samba, new SAN, new hardware
On 1/28/12 5:46 PM, Matthew Mackes wrote:
> Hello,
>
> You are on the right track. Your mount points are fine. I like to mount
> my Gluster storage under /mnt/gluster and place my bricks inside
> /STORAGE.
>
> I think that you are planning many more network interfaces per node than
> your 4 (even 8) SATA drives per node at 7200 RPM will require. 2
> aggregated ports should be plenty for heavy load, and one for normal
> use. In my experience the 7200 RPM SATA drives will be your bottleneck.
> 15,000 RPM SAS is a better choice for a storage node that requires heavy
> storage load.

OMG, drive prices are insane! Just 4 x 2TB SATA drives cost more than the
rest of an entire 2U system. We are considering getting desktop drives
temporarily and waiting for Thailand to rebuild before filling out the
remainder of our disk arrays!

> The only case I can think of for 4+ network interfaces per machine is if
> you intend to subnet your Gluster SAN network from your normal network
> used for storage access and administration. In that case you could bond
> 2 interfaces for the Gluster SAN network (for replication, striping,
> etc. between nodes) and the other pair bonded for your Samba and
> management access.

We are thinking now to just get 4 NICs: 2 bonded for the storage network
and 2 bonded for the Samba interface to the LAN. That outta do it. I think
your math on drive speed vs. network speed is right.

-- Dante
D. Dante Lorenso
da...@lorenso.com
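The drive-speed-vs-network-speed math referred to above can be sketched with
back-of-the-envelope figures. The per-drive and per-link numbers below are
rough rules of thumb, not measurements from this hardware:

```shell
# Can 2 bonded GigE ports carry what 4 SATA drives deliver?
# All per-device figures are rough assumptions, not measurements.
DRIVES=4
SEQ_MB=120      # optimistic sequential MB/s per 7200 RPM SATA drive
RAND_MB=40      # seek-heavy / mixed-load MB/s per drive (much lower)
NICS=2
NIC_MB=125      # 1 Gbit/s link ~= 125 MB/s

echo "net (2 bonded GigE):   $((NICS * NIC_MB)) MB/s"
echo "disk, sequential best: $((DRIVES * SEQ_MB)) MB/s"
echo "disk, seek-heavy:      $((DRIVES * RAND_MB)) MB/s"
```

A pure sequential stream could in principle outrun two bonded gigabit links,
but under the seek-heavy mixed load that 500 Windows clients generate, the
drives deliver far less than the links can carry, which is why 2 bonded
ports per side should indeed be plenty.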
Re: [Gluster-users] best practices? 500+ win/mac computers, gluster, samba, new SAN, new hardware
On Wed, Feb 01, 2012 at 12:21:17PM -0600, D. Dante Lorenso wrote:
>> Gluster does give you the option of a distributed replicated volume, so
>> you can get both the RAID 0 and RAID 1 functionality.
>
> If you have 8 drives connected to a single machine, how do you introduce
> those drives to Gluster? I was thinking I'd combine them into a single
> volume using RAID 0 and mount that volume on a box and turn it into a
> brick. Otherwise you have to add 8 separate bricks, right? That's not
> better, is it?

I'm in the process of building a pair of test systems (in my case 12 disks
per server), and haven't quite got to building the Gluster layer, but yes:
8 separate filesystems and 8 separate bricks per server is what I'm
suggesting you consider. Then you create a distributed replicated volume
using 16 bricks across 2 servers, added in the correct order so that they
pair up across servers (serverA:brick1 serverB:brick1 serverA:brick2
serverB:brick2, etc.), or across 4 servers or however many you're building.

The advantage is that if you lose one disk, 7/8 of the data is still
available on both servers, and the remaining 1/8 is still available on one
disk. If you lose a second disk, there is a 1-in-15 chance that it's the
mirror of the first failed one, but a 14-in-15 chance that you won't lose
any data. Furthermore, replacing the failed disk only requires
synchronising (healing) one disk's worth of data.

Now, if you decide to make RAID 0 sets instead, then losing one disk will
destroy that whole filesystem. If you then lose any disk in the second
server, you will have lost everything. And when you replace the one failed
disk, you will need to make a new filesystem across the whole RAID 0 array
and resync all 8 disks' worth of data.

I think it only makes sense to build an array brick if you are using RAID 1
or higher. RAID 1 or RAID 10 is fast, but presumably you don't want to
store 4 copies of your data (2 on each server). The write performance of
RAID 5 and RAID 6 is terrible. An expensive RAID card with battery-backed
write-back cache will make it slightly less terrible, but still terrible.

Regards,

Brian.
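Brian's ordering requirement is easy to get wrong when typing 16 bricks by
hand, so it can help to generate the command. A sketch that only prints the
resulting `gluster volume create` line (the volume name, server names, and
brick paths are illustrative placeholders; with `replica 2`, Gluster treats
each consecutive pair of bricks on the command line as one replica set):

```shell
# Build (but do not run) a volume-create command with bricks interleaved
# so that each serverA/serverB pair forms one replica-2 set. Names and
# paths are placeholders; substitute your own before running anything.
vol="gvol"
bricks=""
for i in 1 2 3 4 5 6 7 8; do
    bricks="$bricks serverA:/bricks/b$i serverB:/bricks/b$i"
done
echo "gluster volume create $vol replica 2$bricks"
```

Inspect the printed command and confirm every adjacent pair straddles two
servers before running it; if both bricks of a pair land on the same host,
losing that host loses data.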
[Gluster-users] Sorry for x-post: RDMA Mounts drop with transport endpoints not connected
Having serious issues w/ glusterfs 3.2.5 over RDMA. Clients are
periodically dropping off with "transport endpoint not connected". The
environment is HPC; GlusterFS is being used as a shared /work|/scratch
directory. Standard distributed volume configuration, nothing fancy.

Pastie log snippet is here: http://pastie.org/3291330

Any help would be appreciated!

--
Brian Smith
Senior Systems Administrator
IT Research Computing, University of South Florida
4202 E. Fowler Ave. ENB308
Office Phone: +1 813 974-1467
Organization URL: http://rc.usf.edu
Re: [Gluster-users] Sorry for x-post: RDMA Mounts drop with transport endpoints not connected
On 02/01/2012 04:49 PM, Brian Smith wrote:
> Having serious issues w/ glusterfs 3.2.5 over RDMA. Clients are
> periodically dropping off with "transport endpoint not connected". The
> environment is HPC; GlusterFS is being used as a shared /work|/scratch
> directory. Standard distributed volume configuration, nothing fancy.
>
> Pastie log snippet is here: http://pastie.org/3291330
>
> Any help would be appreciated!

What OS, kernel rev, OFED version, etc.? What HCAs, switch, etc.? What does
ibv_devinfo report for the nodes experiencing the transport-endpoint issue?

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: land...@scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
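A quick way to collect the details Joe asks about, run on each affected
node. ofed_info, ibv_devinfo, and ibstat come from the OFED /
infiniband-diags packages; the loop simply reports any tool that isn't
installed rather than failing:

```shell
# Collect the basics: OS/kernel revision, OFED version, HCA and port
# state. Each tool is only run if it is actually installed.
uname -srm
for tool in ofed_info ibv_devinfo ibstat; do
    if command -v "$tool" >/dev/null 2>&1; then
        "$tool"                      # print the tool's report
    else
        echo "$tool: not installed"
    fi
done
```

Comparing ibv_devinfo's port state (PORT_ACTIVE vs. PORT_DOWN) and firmware
versions across the nodes that drop their mounts is usually the fastest way
to spot an odd HCA or cable.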