Re: [Gluster-users] Regular I/O Freezing since downgrading from 3.7.12 to 3.7.11
On 7 July 2016 at 15:42, Krutika Dhananjay wrote:
> could you please share the glusterfs client logs?

Alas, being qemu/libgfapi there aren't any client logs :( I'll shut down the VM and restart it from the cmd line tonight with a stdout redirect, then wait for it to freeze again. Might take a couple of days.

Cheers,
--
Lindsay

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Problem rebalancing a distributed volume
Hi,

Please share the rebalance log from the 1st server for further analysis; it can be found at /var/log/glusterfs/$VOL-rebalance.log. We also need the current layout xattrs from both bricks, which can be extracted with the following command: "getfattr -d -m . -e hex <$BRICK_PATH>".

Thanks,
Susant

- Original Message -
> From: "Kyle Johnson"
> To: gluster-users@gluster.org
> Sent: Tuesday, 5 July, 2016 10:58:09 PM
> Subject: [Gluster-users] Problem rebalancing a distributed volume
>
> Hello everyone,
>
> I am having trouble with a distributed volume. In short, the rebalance
> command does not seem to work for me: existing files are not migrated, and
> new files are not created on the new brick.
>
> I am running glusterfs 3.7.6 on two servers:
>
> 1) FreeBSD 10.3-RELEASE (colossus2 - 192.168.110.1)
> 2) CentOS 6.7 (colossus - 192.168.110.2)
>
> The bricks are zfs-backed on both servers, and the network consists of two
> direct-connected cat6 cables on 10gig NICs. The NICs are bonded (lagg'd)
> together with mode 4 (LACP).
>
> Here is what I am seeing:
>
> [root@colossus ~]# gluster volume create fubar 192.168.110.2:/ftp/bricks/fubar
> volume create: fubar: success: please start the volume to access data
> [root@colossus ~]# gluster volume start fubar
> volume start: fubar: success
> [root@colossus ~]# mount -t glusterfs 192.168.110.2:/fubar /mnt/test
> [root@colossus ~]# touch /mnt/test/file{1..100}
> [root@colossus ~]# ls /mnt/test/ | wc -l
> 100
> [root@colossus ~]# ls /ftp/bricks/fubar | wc -l
> 100
>
> # So far, so good.
>
> [root@colossus ~]# gluster volume add-brick fubar 192.168.110.1:/tank/bricks/fubar
> volume add-brick: success
>
> # For good measure, I'll do an explicit fix-layout first.
>
> [root@colossus ~]# gluster volume rebalance fubar fix-layout start
> volume rebalance: fubar: success: Rebalance on fubar has been started
> successfully. Use rebalance status command to check status of the rebalance process.
> ID: 2da23238-dbe4-4759-97b2-08879db271e7
>
> [root@colossus ~]# gluster volume rebalance fubar status
> Node           Rebalanced-files  size    scanned  failures  skipped  status                run time in secs
> localhost      0                 0Bytes  0        0         0        fix-layout completed  0.00
> 192.168.110.1  0                 0Bytes  0        0         0        fix-layout completed  0.00
> volume rebalance: fubar: success
>
> # Now to do the actual rebalance.
>
> [root@colossus ~]# gluster volume rebalance fubar start
> volume rebalance: fubar: success: Rebalance on fubar has been started
> successfully. Use rebalance status command to check status of the rebalance process.
> ID: 67160a67-01b2-4a51-9a11-114aa6269ee9
>
> [root@colossus ~]# gluster volume rebalance fubar status
> Node           Rebalanced-files  size    scanned  failures  skipped  status     run time in secs
> localhost      0                 0Bytes  100      0         0        completed  0.00
> 192.168.110.1  0                 0Bytes  0        0         0        completed  0.00
> volume rebalance: fubar: success
> [root@colossus ~]# ls /mnt/test/ | wc -l
> 101
> [root@colossus ~]# ls /ftp/bricks/fubar/ | wc -l
> 100
>
> # As the output shows, 100 files were scanned, but none were moved.
>
> # And for another test, I'll create 100 new post-fix-layout files.
>
> [root@colossus ~]# touch /mnt/test/file{101..200}
> [root@colossus ~]# ls /ftp/bricks/fubar/ | wc -l
> 199
>
> # As you can see here, they were all created on the first server. The
> second server isn't touched at all.
>
> Not sure if this is relevant, but if I create the volume with both bricks to
> begin with, files are properly distributed.
>
> Thanks!
> Kyle
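For readers wondering what the requested xattr dump contains: the trusted.glusterfs.dht value printed by "getfattr -d -m . -e hex" is, in the common case, a packed 16-byte structure of four 32-bit big-endian integers (count, hash type, hash-range start, hash-range stop). A minimal Python sketch of decoding such a hex dump, under that layout assumption (a sketch only, not gluster source):

```python
import struct

def decode_dht_layout(hex_value: str) -> dict:
    """Decode a trusted.glusterfs.dht hex dump into its fields.

    Assumes the common 16-byte on-disk format: four 32-bit big-endian
    integers (cnt, type, start, stop). Illustrative sketch only."""
    raw = bytes.fromhex(hex_value.removeprefix("0x"))
    cnt, layout_type, start, stop = struct.unpack(">4I", raw[:16])
    return {"cnt": cnt, "type": layout_type, "start": start, "stop": stop}

# Example: a brick owning the first quarter of the 32-bit hash space.
layout = decode_dht_layout("0x0000000100000000000000003fffffff")
print(layout)  # {'cnt': 1, 'type': 0, 'start': 0, 'stop': 1073741823}
```

On a healthy post-fix-layout volume, every brick's start/stop range should be non-empty and the ranges together should cover the whole 32-bit space; a brick with a zero-length range never receives new files, which is one thing the requested dumps would reveal.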
Re: [Gluster-users] Regular I/O Freezing since downgrading from 3.7.12 to 3.7.11
Yes, could you please share the glusterfs client logs?

-Krutika

On Thu, Jul 7, 2016 at 5:12 AM, Lindsay Mathieson <lindsay.mathie...@gmail.com> wrote:
> This is becoming a serious problem. Since my misadventure with 3.7.12 and
> downgrading back to 3.7.11, I have daily freezes of VMs where they *appear*
> to be unable to write to disk. It seems to be localised to just a few
> VMs, one of which, unfortunately, is our AD server. The only fix is a hard
> reset of the VM.
>
> nb. The entire cluster has been rebooted since the downgrade.
>
> Logging available as needed.
>
> --
> Lindsay Mathieson | Senior Developer
> Softlog Systems Australia
> 43 Kedron Park Road, Wooloowin, QLD, 4030
> [T] +61 7 3632 8804 | [F] +61 1800-818-914 | [W] softlog.com.au
[Gluster-users] Regular I/O Freezing since downgrading from 3.7.12 to 3.7.11
This is becoming a serious problem. Since my misadventure with 3.7.12 and downgrading back to 3.7.11, I have daily freezes of VMs where they *appear* to be unable to write to disk. It seems to be localised to just a few VMs, one of which, unfortunately, is our AD server. The only fix is a hard reset of the VM.

nb. The entire cluster has been rebooted since the downgrade.

Logging available as needed.

--
Lindsay Mathieson | Senior Developer
Softlog Systems Australia
43 Kedron Park Road, Wooloowin, QLD, 4030
[T] +61 7 3632 8804 | [F] +61 1800-818-914 | [W] softlog.com.au

DISCLAIMER: This Email and any attachments are a confidential communication intended exclusively for the recipient. If you are not the intended recipient you must not disclose or use any of the contents of this Email. Should you receive this Email in error, contact us immediately by return Email and delete this Email and any attachments. If you are the intended recipient of this Email and propose to rely on its contents, you should contact the writer to confirm the same. Copyright and privilege relating to the contents of this Email and any attachments are reserved. It is the recipient's responsibility to scan all attachments for viruses prior to use.
[Gluster-users] Securing GlusterD management
As some of you might already have noticed, GlusterD has been notably insecure ever since it was written. Unlike our I/O path, which does check access control on each request, anyone who can craft a CLI RPC request and send it to GlusterD's well-known TCP port can do anything that the CLI itself can do. TLS support was added for GlusterD a while ago, but it has always been a bit problematic, and as far as I know it hasn't been used much. It's a bit of a chicken-and-egg problem: nobody wants to use a buggy or incomplete feature, but as long as nobody's using it there's little incentive to improve it.

Recently, there have been some efforts to add features which would turn the existing security problem into a full-fledged "arbitrary code execution" vulnerability (as the security folks would call it). These efforts have been blocked, but they have also highlighted the fact that we're *long* past the point where we should have tried to make GlusterD more secure. To that end, I've submitted the following patch to make TLS mandatory for all GlusterD communication, with some very basic authorization for CLI commands.

http://review.gluster.org/#/c/14866/

The technical details are in the commit message, but the salient point is that it requires *zero configuration* to get basic authentication and encryption. This is equivalent to putting a lock on the door. Sure, maybe everybody knows the default combination, but *at least there's a lock*, and people who want to secure their systems can change the combination to whatever they want. That's better than the door hanging open, without even a solid attachment point for a lock, and it's essential infrastructure for anything else we might do. The patch also fixes some bugs that affect even today's optional TLS implementation.

One significant downside of this change has to do with rolling upgrades. While it might be possible for those who are already using TLS to do a rolling upgrade, it would still require some manual steps. The vast majority of users who haven't enabled TLS will be unable to upgrade without "stopping the world" (as is already the case for enabling TLS).

I'd appreciate feedback from users on both the positive and negative aspects of this change. Should it go into 3.9? Should it be backported to 3.8? Or should it wait until 4.0? Feedback from developers is also appreciated, though at this point I think any problems with the patch itself have already been resolved, to the point where GlusterFS with the patch is more stable than GlusterFS without it. I'm just fighting through some NetBSD testing issues at this point, hoping to make that situation better as well.
[Gluster-users] rebalance immediately fails 3.7.11, 3.7.12, 3.8.0
Hi All,

I am trying to do some gluster testing for my customer. I am experiencing the same issue as described here:

http://serverfault.com/questions/782602/glusterfs-rebalancing-volume-failed

Except: I have a dispersed-distributed volume, and I only let the fix-layout run for a few hours before stopping it. Could someone please help? I have the same immediate rebalance fault on 3.7.11, 3.7.12, and 3.8.0, on Ubuntu Trusty if it matters.

Best Regards,
Wade
[Gluster-users] Expand distributed replicated volume
Let's assume a distributed-replicated (replica 3) volume with sharding enabled. Can I add one node at a time, or do I have to add three nodes every time?
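On the expansion arithmetic behind this question: for a replica-3 volume, add-brick expects bricks in multiples of the replica count, so that each new distribute subvolume is a complete replica set (the node count matters less than the brick count). A small illustrative check, with a hypothetical helper name, not gluster code:

```python
def valid_addition(new_bricks: int, replica: int = 3) -> bool:
    """True if `new_bricks` can form whole replica sets.

    In a distributed-replicated volume, add-brick must be given a
    multiple of the replica count so each new distribute subvolume
    is a complete replica set. Illustrative helper, not gluster code."""
    return new_bricks > 0 and new_bricks % replica == 0

assert not valid_addition(1)  # a single brick cannot form a replica-3 set
assert valid_addition(3)      # one full replica set
assert valid_addition(6)      # two new replica sets
```

Adding a single node is therefore only workable if you can still present three new bricks (e.g. by also placing bricks on existing nodes), which has fault-tolerance implications worth thinking through.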
[Gluster-users] gluster dispersed volume configuration
Hi everyone,

I am trying to configure a dispersed volume following this documentation page:

http://gluster.readthedocs.io/en/latest/Administrator%20Guide/Setting%20Up%20Volumes/#creating-dispersed-volumes

I have a set of 8 storage nodes, and I want the erasure coding settings to be (k=4)+(r=2). I understood that 6 bricks must be provided in order to create the volume. But my question is: is it possible to exploit the set of 8 servers given this coding configuration? My goal is the following: for each file, a set of 6 servers among the 8 is selected (in a round-robin way, for example).

Regards,
Dimitri Pertin
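As background on why 8 servers don't map cleanly onto a 4+2 configuration: to my understanding, gluster groups bricks into *fixed* disperse subvolumes of exactly k+r bricks, so the total brick count must be a multiple of 6; it does not pick a fresh 6-of-8 subset per file. A quick sketch of the arithmetic (illustration only, not gluster behaviour guaranteed by this thread):

```python
from math import comb

k, r = 4, 2          # data and redundancy fragments
subvol_size = k + r  # bricks per disperse subvolume: 6
servers = 8

# A fixed disperse subvolume needs exactly k + r bricks, so 8 bricks
# (one per server) leave an unusable remainder of 2.
assert servers % subvol_size == 2

# With 12 bricks spread over the 8 servers, two 4+2 subvolumes fit exactly.
assert 12 % subvol_size == 0 and 12 // subvol_size == 2

# Per-file selection of any 6 of the 8 servers would allow C(8,6) = 28
# distinct placements -- but the layout is static, not per-file.
print(comb(8, 6))  # 28
```

So one common way to use all 8 servers would be 12 bricks forming two 4+2 subvolumes, at the cost of some servers hosting two bricks of the same subvolume unless placement is chosen carefully.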
[Gluster-users] Weekly community meeting - 06/Jul/2016
Today's meeting didn't go according to the agenda, as we initially had low attendance. Attendance overall was low as well, owing to a holiday in Bangalore.

The minutes and logs for the meeting are available at the links below.

Minutes: https://meetbot.fedoraproject.org/gluster-meeting/2016-07-06/weekly_community_meeting_06jul2016.2016-07-06-12.09.html
Minutes (text): https://meetbot.fedoraproject.org/gluster-meeting/2016-07-06/weekly_community_meeting_06jul2016.2016-07-06-12.09.txt
Log: https://meetbot.fedoraproject.org/gluster-meeting/2016-07-06/weekly_community_meeting_06jul2016.2016-07-06-12.09.log.html

Next week's meeting will be held at the same time. See you all next week.

~kaushal
[Gluster-users] replace brick in distributed-dispersed setup
Hi all,

I'm doing some testing with glusterfs in a virtualized environment, running a 3 x (8 + 4) distributed-dispersed volume simulating a 3-node cluster with 12 drives per node.

The system versions are:
OS: Debian jessie, kernel 3.16
Gluster: 3.8.0-2, installed from the gluster.org debian repository

I have tested the node-failure scenario while some clients were running read/write operations, and the setup works as expected. Now I'm trying to test how to replace a faulty drive on this setup; however, I'm not able to replace a brick. To test it I have:

1. Found the pid of the brick I'd like to 'fail' and killed the process
   (I tried removing the drive from the host, but that made the whole guest unresponsive).
2. Attached a new virtual drive, formatted and mounted it.
3. Tried the gluster volume replace-brick command.

And I'm getting the following error:

gluster volume replace-brick vol_1 glusterserver1:/ext/bricks/brick-1 glusterserver1:/ext/bricks/brick-13 commit force
volume replace-brick: failed: Fuse unavailable
Replace-brick failed

I assume I'm doing something wrong, but I don't know what exactly. Looking in the documentation, I have not found information about brick replacement in distributed-dispersed setups.

Thanks!
Iñaki
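For context on what killing brick-1 affects in this layout: bricks passed to volume create are grouped in order into disperse subvolumes of data + redundancy bricks (12 here), so a single dead brick only reduces the redundancy of its own subvolume. A small illustrative mapping, assuming the sequential grouping described above (sketch only, not gluster source):

```python
def subvolume_of(brick_index: int, data: int = 8, redundancy: int = 4) -> int:
    """Map a 0-based brick index to its disperse subvolume.

    Assumes bricks are grouped in creation order into subvolumes of
    data + redundancy bricks (12 here), matching a 3 x (8 + 4)
    distributed-dispersed volume. Illustrative only."""
    return brick_index // (data + redundancy)

# Indices 0..11 (brick-1..brick-12) form the first subvolume; killing one
# of them leaves that subvolume with 11 of 12 bricks, still serving I/O,
# since up to `redundancy` (4) bricks may be lost per subvolume.
assert subvolume_of(0) == 0
assert subvolume_of(11) == 0
assert subvolume_of(12) == 1
```

This is why the node-failure test passed: losing a whole node costs each subvolume at most its redundancy count of bricks.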
Re: [Gluster-users] [Gluster-devel] Non Shared Persistent Gluster Storage with Kubernetes
On Wed, Jul 6, 2016 at 12:24 AM, Shyam wrote:
> On 07/01/2016 01:45 AM, B.K.Raghuram wrote:
>> I have not gone through this implementation nor the new iscsi
>> implementation being worked on for 3.9, but I thought I'd share the
>> design behind a distributed iscsi implementation that we'd worked on
>> some time back, based on the istgt code with a libgfapi hook.
>>
>> The implementation used the idea of using one file to represent one
>> block (of a chosen size), thus allowing us to use gluster as the backend
>> to store these files while presenting a single block device of possibly
>> infinite size. We used a fixed file naming convention based on the block
>> number, which allows the system to determine which file(s) need to be
>> operated on for the requested byte offset. This gave us the advantage of
>> automatically accessing all of gluster's file-based functionality
>> underneath to provide a fully distributed iscsi implementation.
>>
>> Would this be similar to the new iscsi implementation that's being worked
>> on for 3.9?
>
> Ultimately the idea would be to use sharding, as a part of the gluster
> volume graph, to distribute the blocks (or rather shard the blocks), rather
> than having the disk image on one distribute subvolume, and hence scale disk
> sizes to the size of the cluster. Further, sharding should work well here,
> as this is a single-client access case (or are we past that hurdle
> already?).

Not yet, we need a common transaction frame in place to reduce the latency for synchronization.

> What this achieves is similar to the iSCSI implementation that you talk
> about, but with gluster doing the block splitting, and hence distribution,
> rather than the iSCSI implementation (istgt) doing the same.
>
> <I did a cursory check on the blog post, but did not find a shard
> reference, so maybe others could pitch in here, if they know about the
> direction>

There are two directions which will eventually converge:

1) A granular data self-heal implementation, so that taking a snapshot becomes as simple as a reflink.
2) Bringing in snapshots of files with shards - this is a bit more involved compared to the solution above.

Once 2) is also complete, we will have both 1) + 2) combined, so that data self-heal will heal the exact blocks inside each shard. If users are not worried about snapshots, 2) is the best option.

> Further, in your original proposal, how do you maintain device properties,
> such as the size of the device and used/free blocks? I ask about used and
> free, as that is an overhead to compute if each block is maintained as a
> separate file by itself, or difficult to achieve consistency of the size
> and block updates (as they are separate operations). Just curious.

--
Pranith
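The block-per-file scheme Raghuram describes above maps a requested byte range to one or more fixed-size block files purely by block number. A minimal sketch of that mapping (the file naming convention and 4 MiB block size are hypothetical illustrations, as the original istgt-based code is not shown in this thread):

```python
def blocks_for_range(offset: int, length: int,
                     block_size: int = 4 * 1024 * 1024) -> list:
    """Return the block-file names covering a byte range of the device.

    One file represents one fixed-size block; the file name encodes the
    block number, so the backend can locate the file(s) for any offset.
    The 'block-%012d' naming scheme is a hypothetical illustration."""
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return ["block-%012d" % n for n in range(first, last + 1)]

# A 6 MiB I/O starting at offset 3 MiB spans blocks 0..2 with 4 MiB blocks.
print(blocks_for_range(3 * 2**20, 6 * 2**20))
# ['block-000000000000', 'block-000000000001', 'block-000000000002']
```

This also makes Shyam's used/free question concrete: the device's used space is the sum over the block files that exist, which requires a scan (or a separately maintained counter) rather than a single stat, since block creation and size accounting would be separate operations.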