Re: [Gluster-users] missing files on FUSE mounts

2020-11-26 Thread James Hammett
Yes, I compared the client count like this:

gluster volume status  clients |grep -B1 connected

I ran the find command on each client before and after shutting down the 
problematic daemon to determine any file count differences:

find /mount/point |wc -l
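
To compare several clients at once I just looped over them from one box, something like this (a sketch; the hostnames are placeholders for the real clients):

for h in client1 client2 client3; do
    printf '%s: ' "$h"
    ssh "$h" 'find /mount/point | wc -l'
done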

After my last post I discovered that one of the clients had somehow been 
blocked by iptables from connecting to one of the bricks. So for an extended 
period any file creation from that one client was perpetuating an imbalance 
between bricks, causing different files to be visible for different clients. 
What baffles me is that gluster wouldn't automatically fix an imbalance between 
replicas like that.
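
In hindsight, a quick way to catch that kind of silent blockage would have been to test the glusterd and brick TCP ports from every client, e.g. (a sketch; 24007 is the management port, and the brick ports are the ones 'gluster volume status' reports):

# run from each client
for host in odroid1 odroid2 odroid3 odroid4; do
    nc -zv -w 3 "$host" 24007    # glusterd management port
    nc -zv -w 3 "$host" 49152    # brick port(s) from 'gluster volume status'
done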

On Fri, Nov 20, 2020, at 1:24 AM, Benedikt Kaleß wrote:
> Dear James,

> we have exactly the same problems.

> Could you describe what you did to discover which of your bricks had the 
> worst file count discrepancy and how you found out that all clients matched 
> after shutting down this daemon?

> 

> Best regards

> Benedikt

> Am 02.11.20 um 17:30 schrieb James H:
>> I found a solution after making a discovery. I logged into the brick with 
>> the worst file count discrepancy - odroid4 - and killed the gluster daemon 
>> there. All file counts across all clients then matched. So I started the 
>> daemon and ran this command to try to fix it up:

>> 

>> gluster volume replace-brick gvol0 odroid4:/srv/gfs-brick/gvol0 
>> odroid4:/srv/gfs-brick/gvol0_2 commit force

>> 

>> ...and that fixed it. It's disconcerting that it's possible for Gluster to 
>> merrily hum along without any problems showing up in the various status 
>> summaries yet show vastly different directory listings to different clients. 
>> Is this a known problem or shall I open a bug report? Are there any 
>> particular error logs I should monitor to be alerted to this bad state?

>> 
>> On Thu, Oct 29, 2020 at 8:39 PM James H  wrote:
>>> Hi folks, I'm struggling to find a solution to missing files on FUSE 
>>> mounts. Which files are missing is different on different clients. I can 
>>> stat or ls the missing files directly when called by filename but listing 
>>> directories won't show them.
>>> 
>>> So far I've:
>>>  * verified heal info shows no files in need of healing and no split brain 
>>> condition
>>>  * verified the same number of clients are connected to each brick 
>>>  * verified the file counts on the bricks match
>>>  * upgraded Gluster server and clients from 3.x to 6.x and 7.x
>>>  * run a stat on all files
>>>  * run a heal full
>>>  * rebooted / remounted FUSE clients
>>> File count from running a 'find' command on FUSE mounts on the bricks 
>>> themselves. These counts should all be the same:
>>> 38823  fuse-odroid1-share2
>>> 38823  fuse-odroid2-share2
>>> 60962  fuse-odroid3-share2
>>>  7202  fuse-odroid4-share2
>>> 
>>> ...and a FUSE mount on a separate server:
>>> 38823  fuse-phn2dsm-share2
>>> 
>>> File count from running a 'find' command on brick directories themselves:
>>> 43382  brick-odroid1-share2
>>> 43382  brick-odroid2-share2
>>> 43382  brick-arbiter-odroid3-share2
>>> 23075  brick-odroid3-share2
>>> 23075  brick-odroid4-share2
>>> 23075  brick-arbiter-odroid2-share2
>>> 
>>> Here's some info about the setup:
>>> 
>>> # gluster --version | head -1; cat /etc/lsb-release; uname -r
>>> glusterfs 7.8
>>> DISTRIB_ID=Ubuntu
>>> DISTRIB_RELEASE=18.04
>>> DISTRIB_CODENAME=bionic
>>> DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS"
>>> 4.14.157-171
>>> 
>>> 
>>> # gluster volume info
>>> Volume Name: gvol0
>>> Type: Distributed-Replicate
>>> Volume ID: 57e3a085-5fb7-417d-a71a-fed5cd0ae2d9
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 2 x (2 + 1) = 6
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: odroid1:/srv/gfs-brick/gvol0
>>> Brick2: odroid2:/srv/gfs-brick/gvol0
>>> Brick3: odroid3:/srv/gfs-brick/gvol0-arbiter2 (arbiter)
>>> Brick4: odroid3:/srv/gfs-brick/gvol0_2
>>> Brick5: odroid4:/srv/gfs-brick/gvol0
>>> Brick6: odroid2:/srv/gfs-brick/gvol0-arbiter2 (arbiter)
>>> Options Reconfigured:
>>> cluster.self-heal-daemon: enable
>>> performance.readdir-ahead: yes
>>> performance.cache-invalidation: on
>>> performance.stat-prefetch: on
>>> performance.quick-read: on
>>> cluster

Re: [Gluster-users] missing files on FUSE mounts

2020-11-02 Thread James H
I found a solution after making a discovery. I logged into the brick with
the worst file count discrepancy - odroid4 - and killed the gluster daemon
there. All file counts across all clients then matched. So I started the
daemon and ran this command to try to fix it up:


gluster volume replace-brick gvol0 odroid4:/srv/gfs-brick/gvol0
odroid4:/srv/gfs-brick/gvol0_2 commit force


...and that fixed it. It's disconcerting that it's possible for Gluster to
merrily hum along without any problems showing up in the various status
summaries yet show vastly different directory listings to different
clients. Is this a known problem or shall I open a bug report? Are there
any particular error logs I should monitor to be alerted to this bad state?
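
For my own monitoring I am tempted to just watch the FUSE client log, which by default lands under /var/log/glusterfs and is named after the mount point - a sketch, assuming a mount at /mnt/share2:

tail -f /var/log/glusterfs/mnt-share2.log | grep -Ei 'disconnect|split-brain'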

On Thu, Oct 29, 2020 at 8:39 PM James H  wrote:

> Hi folks, I'm struggling to find a solution to missing files on FUSE
> mounts. Which files are missing is different on different clients. I can
> stat or ls the missing files directly when called by filename but listing
> directories won't show them.
>
> So far I've:
>
>- verified heal info shows no files in need of healing and no split
>brain condition
>- verified the same number of clients are connected to each brick
>- verified the file counts on the bricks match
>- upgraded Gluster server and clients from 3.x to 6.x and 7.x
>- run a stat on all files
>- run a heal full
>- rebooted / remounted FUSE clients
>
> File count from running a 'find' command on FUSE mounts on the bricks
> themselves. These counts should all be the same:
> 38823  fuse-odroid1-share2
> 38823  fuse-odroid2-share2
> 60962  fuse-odroid3-share2
>  7202  fuse-odroid4-share2
>
> ...and a FUSE mount on a separate server:
> 38823  fuse-phn2dsm-share2
>
> File count from running a 'find' command on brick directories
> themselves:
> 43382  brick-odroid1-share2
> 43382  brick-odroid2-share2
> 43382  brick-arbiter-odroid3-share2
> 23075  brick-odroid3-share2
> 23075  brick-odroid4-share2
> 23075  brick-arbiter-odroid2-share2
>
> Here's some info about the setup:
>
> # gluster --version | head -1; cat /etc/lsb-release; uname -r
> glusterfs 7.8
> DISTRIB_ID=Ubuntu
> DISTRIB_RELEASE=18.04
> DISTRIB_CODENAME=bionic
> DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS"
> 4.14.157-171
>
> # gluster volume info
> Volume Name: gvol0
> Type: Distributed-Replicate
> Volume ID: 57e3a085-5fb7-417d-a71a-fed5cd0ae2d9
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 2 x (2 + 1) = 6
> Transport-type: tcp
> Bricks:
> Brick1: odroid1:/srv/gfs-brick/gvol0
> Brick2: odroid2:/srv/gfs-brick/gvol0
> Brick3: odroid3:/srv/gfs-brick/gvol0-arbiter2 (arbiter)
> Brick4: odroid3:/srv/gfs-brick/gvol0_2
> Brick5: odroid4:/srv/gfs-brick/gvol0
> Brick6: odroid2:/srv/gfs-brick/gvol0-arbiter2 (arbiter)
> Options Reconfigured:
> cluster.self-heal-daemon: enable
> performance.readdir-ahead: yes
> performance.cache-invalidation: on
> performance.stat-prefetch: on
> performance.quick-read: on
> cluster.shd-max-threads: 4
> performance.parallel-readdir: on
> cluster.server-quorum-type: server
> server.event-threads: 4
> client.event-threads: 4
> performance.nl-cache-timeout: 600
> performance.nl-cache: on
> network.inode-lru-limit: 20
> performance.md-cache-timeout: 600
> performance.cache-samba-metadata: on
> features.cache-invalidation-timeout: 600
> features.cache-invalidation: on
> storage.fips-mode-rchecksum: on
> performance.client-io-threads: off
> nfs.disable: on
> transport.address-family: inet
> features.bitrot: on
> features.scrub: Active
> features.scrub-throttle: lazy
> features.scrub-freq: daily
> cluster.min-free-disk: 10%
>
> # gluster volume status gvol0 detail
> Status of volume: gvol0
>
> --
> Brick: Brick odroid1:/srv/gfs-brick/gvol0
> TCP Port : 49152
> RDMA Port: 0
> Online   : Y
> Pid  : 702
> File System  : xfs
> Device   : /dev/sda
> Mount Options:
> rw,noatime,nouuid,attr2,inode64,sunit=256,swidth=2560,noquota
> Inode Size   : 512
> Disk Space Free  : 983.4GB
> Total Disk Space : 5.5TB
> Inode Count  : 586052224
> Free Inodes  : 585835873
>
> --
> Brick: Brick odroid2:/srv/gfs-brick/gvol0
> TCP Port : 49152
> RDMA Port: 0
> Online   : Y
> Pid  : 30206
> File System  : xfs
> 

[Gluster-users] missing files on FUSE mounts

2020-10-29 Thread James H
Hi folks, I'm struggling to find a solution to missing files on FUSE
mounts. Which files are missing is different on different clients. I can
stat or ls the missing files directly when called by filename but listing
directories won't show them.

So far I've:

   - verified heal info shows no files in need of healing and no split
   brain condition
   - verified the same number of clients are connected to each brick
   - verified the file counts on the bricks match
   - upgraded Gluster server and clients from 3.x to 6.x and 7.x
   - run a stat on all files
   - run a heal full
   - rebooted / remounted FUSE clients
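
For reference, the checks above were roughly the following commands (a sketch; /mnt/share2 stands in for the real mount point):

gluster volume heal gvol0 info                               # nothing pending
gluster volume heal gvol0 info split-brain                   # no split-brain entries
gluster volume status gvol0 clients | grep -B1 connected     # same client count per brick
find /mnt/share2 | wc -l                                     # file count, run on every client
find /mnt/share2 -exec stat {} + > /dev/null                 # force a lookup on everything
gluster volume heal gvol0 full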

File count from running a 'find' command on FUSE mounts on the bricks
themselves. These counts should all be the same:
38823  fuse-odroid1-share2
38823  fuse-odroid2-share2
60962  fuse-odroid3-share2
 7202  fuse-odroid4-share2

...and a FUSE mount on a separate server:
38823  fuse-phn2dsm-share2

File count from running a 'find' command on brick directories themselves:
43382  brick-odroid1-share2
43382  brick-odroid2-share2
43382  brick-arbiter-odroid3-share2
23075  brick-odroid3-share2
23075  brick-odroid4-share2
23075  brick-arbiter-odroid2-share2

Here's some info about the setup:

# gluster --version | head -1; cat /etc/lsb-release; uname -r
glusterfs 7.8
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS"
4.14.157-171

# gluster volume info
Volume Name: gvol0
Type: Distributed-Replicate
Volume ID: 57e3a085-5fb7-417d-a71a-fed5cd0ae2d9
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Bricks:
Brick1: odroid1:/srv/gfs-brick/gvol0
Brick2: odroid2:/srv/gfs-brick/gvol0
Brick3: odroid3:/srv/gfs-brick/gvol0-arbiter2 (arbiter)
Brick4: odroid3:/srv/gfs-brick/gvol0_2
Brick5: odroid4:/srv/gfs-brick/gvol0
Brick6: odroid2:/srv/gfs-brick/gvol0-arbiter2 (arbiter)
Options Reconfigured:
cluster.self-heal-daemon: enable
performance.readdir-ahead: yes
performance.cache-invalidation: on
performance.stat-prefetch: on
performance.quick-read: on
cluster.shd-max-threads: 4
performance.parallel-readdir: on
cluster.server-quorum-type: server
server.event-threads: 4
client.event-threads: 4
performance.nl-cache-timeout: 600
performance.nl-cache: on
network.inode-lru-limit: 20
performance.md-cache-timeout: 600
performance.cache-samba-metadata: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
storage.fips-mode-rchecksum: on
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
features.bitrot: on
features.scrub: Active
features.scrub-throttle: lazy
features.scrub-freq: daily
cluster.min-free-disk: 10%

# gluster volume status gvol0 detail
Status of volume: gvol0
--
Brick: Brick odroid1:/srv/gfs-brick/gvol0
TCP Port : 49152
RDMA Port: 0
Online   : Y
Pid  : 702
File System  : xfs
Device   : /dev/sda
Mount Options:
rw,noatime,nouuid,attr2,inode64,sunit=256,swidth=2560,noquota
Inode Size   : 512
Disk Space Free  : 983.4GB
Total Disk Space : 5.5TB
Inode Count  : 586052224
Free Inodes  : 585835873
--
Brick: Brick odroid2:/srv/gfs-brick/gvol0
TCP Port : 49152
RDMA Port: 0
Online   : Y
Pid  : 30206
File System  : xfs
Device   : /dev/sda
Mount Options:
rw,noatime,nouuid,attr2,inode64,sunit=256,swidth=2560,noquota
Inode Size   : 512
Disk Space Free  : 983.3GB
Total Disk Space : 5.5TB
Inode Count  : 586052224
Free Inodes  : 585711242
--
Brick: Brick odroid3:/srv/gfs-brick/gvol0-arbiter2
TCP Port : 49152
RDMA Port: 0
Online   : Y
Pid  : 32449
File System  : xfs
Device   : /dev/sda
Mount Options:
rw,noatime,nouuid,attr2,inode64,sunit=256,swidth=2560,noquota
Inode Size   : 512
Disk Space Free  : 1.4TB
Total Disk Space : 2.7TB
Inode Count  : 293026624
Free Inodes  : 292378835
--
Brick: Brick odroid3:/srv/gfs-brick/gvol0_2
TCP Port : 49153
RDMA Port: 0
Online   : Y
Pid  : 32474
File System  : xfs
Device   : /dev/sda
Mount Options:
rw,noatime,nouuid,attr2,inode64,sunit=256,swidth=2560,noquota
Inode Size   : 512
Disk Space Free  : 1.4TB
Total Disk Space : 2.7TB
Inode Count  : 293026624
Free Inodes  : 292378835

[Gluster-users] Using block storage for bricks? (ie. EBS, Digitalocean BS, etc)

2017-02-13 Thread James Addison
I've got a use scenario where I would like to have an expandable volume to
handle a growing set of files. Mostly original images + thumbnails. They'll
get written once, but read many times. So say the file size varies between
40kb and 20mb.

There will be a 3rd party CDN in front of the nginx server tier that will
be reading the files from the volume.

I plan to host this in a cloud environment, AWS is a possibility, but I
would like to investigate Digitalocean's block storage as well. Each node
can have 7 BS disks attached, each disk up to 1.6tb in size.

I gather from my reading that glusterfs and other technologies (ceph, etc.)
work best on bare metal, not in a virtualized environment. Some people have
anecdotally stated "don't do POSIX shared filesystems in the cloud", i.e. that
I'm asking for trouble.

Is this feasible with glusterfs? If not, what else should I look at? If it is
feasible, is libgfapi a better connection strategy to architect around than
the POSIX/FUSE interface?
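
For clarity, by the two options I mean something like the following (a rough illustration only; hostnames and volume names are made up, and qemu-img is just an example of a client that can use both paths):

# POSIX/FUSE: the application sees a normal local mount
mount -t glusterfs server1:/webassets /mnt/webassets
ls /mnt/webassets/thumbs | head

# libgfapi: the application links against gfapi and talks to the volume directly
qemu-img info gluster://server1/webassets/images/test.img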

Am I asking the right questions here? Thanks.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Cluster not healing

2017-01-23 Thread James Wilkins
On 23 January 2017 at 20:04, Gambit15  wrote:

> Have you verified that Gluster has marked the files as split-brain?
>

Gluster does not recognise all the files as split-brain - in fact only a
handful are recognised as such. E.g. for the example I pasted, it's not
listed - however the gfid is different (I believe this should be the same?)
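
For reference, I am comparing the gfid by reading the xattr of the same path on each brick directly, along these lines (the path is just a placeholder for the full file path under the brick):

# run on each server that hosts a replica of the file
getfattr -n trusted.gfid -e hex /storage/sdc/brick_storage01/<path-under-the-brick>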




>
> gluster volume heal  info split-brain
>
> If you're fairly confident about which files are correct, you can automate
> the split-brain healing procedure.
>
> From the manual...
>
>> volume heal <VOLNAME> split-brain bigger-file <FILE>
>>   Performs healing of <FILE> which is in split-brain by
>> choosing the bigger file in the replica as source.
>>
>> volume heal <VOLNAME> split-brain source-brick
>> <HOSTNAME:BRICKNAME>
>>   Selects <HOSTNAME:BRICKNAME> as the source for all the
>> files that are in split-brain in that replica and heals them.
>>
>> volume heal <VOLNAME> split-brain source-brick
>> <HOSTNAME:BRICKNAME> <FILE>
>>   Selects the split-brained <FILE> present in
>> <HOSTNAME:BRICKNAME> as source and completes heal.
>>
>
> D
>
> On 23 January 2017 at 16:28, James Wilkins  wrote:
>
>> Hello,
>>
>> I have a couple of gluster clusters - set up with distributed/replicated
>> volumes that have started incrementing the heal-count from statistics -
>> and for some files returning input/output error when attempting to access
>> said files from a fuse mount.
>>
>> If i take one volume, from one cluster as an example:
>>
>> gluster volume heal storage01 statistics info
>> 
>> Brick storage02.:/storage/sdc/brick_storage01
>> Number of entries: 595
>> 
>>
>> And then proceed to look at one of these files (have found 2 copies - one
>> on each server / brick)
>>
>> First brick:
>>
>> # getfattr -m . -d -e hex  /storage/sdc/brick_storage01/
>> projects/183-57c559ea4d60e-canary-test--node02/wordpress285-
>> data/html/wp-content/themes/twentyfourteen/single.php
>> getfattr: Removing leading '/' from absolute path names
>> # file: storage/sdc/brick_storage01/projects/183-57c559ea4d60e-canar
>> y-test--node02/wordpress285-data/html/wp-content/themes/
>> twentyfourteen/single.php
>> security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7
>> 573746572645f627269636b5f743a733000
>> trusted.afr.dirty=0x
>> trusted.afr.storage01-client-0=0x00020001
>> trusted.bit-rot.version=0x02005874e2cd459d
>> trusted.gfid=0xda4253be1c2647b7b6ec5c045d61d216
>> trusted.glusterfs.quota.c9764826-596a-4886-9bc0-60ee9b3fce44
>> .contri.1=0x0601
>> trusted.pgfid.c9764826-596a-4886-9bc0-60ee9b3fce44=0x0001
>>
>> Second Brick:
>>
>> # getfattr -m . -d -e hex /storage/sdc/brick_storage01/p
>> rojects/183-57c559ea4d60e-canary-test--node02/wordpress285-
>> data/html/wp-content/themes/twentyfourteen/single.php
>> getfattr: Removing leading '/' from absolute path names
>> # file: storage/sdc/brick_storage01/projects/183-57c559ea4d60e-canar
>> y-test--node02/wordpress285-data/html/wp-content/themes/
>> twentyfourteen/single.php
>> security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7
>> 573746572645f627269636b5f743a733000
>> trusted.afr.dirty=0x
>> trusted.bit-rot.version=0x020057868423000d6332
>> trusted.gfid=0x14f74b04679345289dbd3290a3665cbc
>> trusted.glusterfs.quota.47e007ee-6f91-4187-81f8-90a393deba2b
>> .contri.1=0x0601
>> trusted.pgfid.47e007ee-6f91-4187-81f8-90a393deba2b=0x0001
>>
>>
>>
>> I can see that only the first brick has the appropriate
>> trusted.afr. tag - e.g. in this case
>>
>> trusted.afr.storage01-client-0=0x00020001
>>
>> Files are same size under stat - just the access/modify/change dates are
>> different.
>>
>> My first question is - reading
>> https://gluster.readthedocs.io/en/latest/Troubleshooting/split-brain/ this
>> suggests that I should have this field on both copies of the files - or am I
>> mis-reading?
>>
>> Secondly - am I correct that each one of these entries will require
>> manual fixing?  (I have approx 6K files/directories in this state over two
>> clusters - which appears like an awful lot of manual fixing)
>>
>> I've checked gluster volume info  and all appropriate
>> services/self-heal daemons are running.  We've even tried a full heal/heal
>> and iterating over parts of the filesystem in question with find / stat /
>> md5sum.
>>
>> Any input appreciated.
>>
>> Cheers,
>>
>>
>>
>>
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Cluster not healing

2017-01-23 Thread James Wilkins
Hello,

I have a couple of gluster clusters - set up with distributed/replicated
volumes that have started incrementing the heal-count from statistics -
and for some files return input/output errors when attempting to access
said files from a fuse mount.

If i take one volume, from one cluster as an example:

gluster volume heal storage01 statistics info

Brick storage02.:/storage/sdc/brick_storage01
Number of entries: 595


And then proceed to look at one of these files (have found 2 copies - one
on each server / brick)

First brick:

# getfattr -m . -d -e hex /storage/sdc/brick_storage01/projects/183-57c559ea4d60e-canary-test--node02/wordpress285-data/html/wp-content/themes/twentyfourteen/single.php
getfattr: Removing leading '/' from absolute path names
# file:
storage/sdc/brick_storage01/projects/183-57c559ea4d60e-canary-test--node02/wordpress285-data/html/wp-content/themes/twentyfourteen/single.php
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x
trusted.afr.storage01-client-0=0x00020001
trusted.bit-rot.version=0x02005874e2cd459d
trusted.gfid=0xda4253be1c2647b7b6ec5c045d61d216
trusted.glusterfs.quota.c9764826-596a-4886-9bc0-60ee9b3fce44.contri.1=0x0601
trusted.pgfid.c9764826-596a-4886-9bc0-60ee9b3fce44=0x0001

Second Brick:

# getfattr -m . -d -e hex
/storage/sdc/brick_storage01/projects/183-57c559ea4d60e-canary-test--node02/wordpress285-data/html/wp-content/themes/twentyfourteen/single.php
getfattr: Removing leading '/' from absolute path names
# file:
storage/sdc/brick_storage01/projects/183-57c559ea4d60e-canary-test--node02/wordpress285-data/html/wp-content/themes/twentyfourteen/single.php
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x
trusted.bit-rot.version=0x020057868423000d6332
trusted.gfid=0x14f74b04679345289dbd3290a3665cbc
trusted.glusterfs.quota.47e007ee-6f91-4187-81f8-90a393deba2b.contri.1=0x0601
trusted.pgfid.47e007ee-6f91-4187-81f8-90a393deba2b=0x0001



I can see that only the first brick has the appropriate trusted.afr.
tag - e.g. in this case

trusted.afr.storage01-client-0=0x00020001

Files are same size under stat - just the access/modify/change dates are
different.

My first question is - reading
https://gluster.readthedocs.io/en/latest/Troubleshooting/split-brain/ this
suggests that I should have this field on both copies of the files - or am
I mis-reading?

Secondly - am I correct that each one of these entries will require manual
fixing?  (I have approx 6K files/directories in this state over two
clusters - which appears like an awful lot of manual fixing)

I've checked gluster volume info  and all appropriate
services/self-heal daemons are running.  We've even tried a full heal/heal
and iterating over parts of the filesystem in question with find / stat /
md5sum.
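
If it does come down to per-file resolution, I assume the CLI can at least be scripted - a rough sketch (assuming one brick is known to hold the good copies, which needs verifying first, and noting it only covers entries that heal info lists by path rather than by gfid):

SRC=storage02.example.com:/storage/sdc/brick_storage01   # placeholder for the known-good brick
gluster volume heal storage01 info split-brain \
  | grep '^/' | sort -u \
  | while read -r f; do
      gluster volume heal storage01 split-brain source-brick "$SRC" "$f"
    done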

Any input appreciated.

Cheers,
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] remove old gluster storage domain and resize remaining gluster storage domain

2016-12-01 Thread Bill James

glusterfs-3.7.11-1.el7.x86_64


I have a 3 node ovirt cluster with a replica 3 gluster volume.
But for some reason the volume is not using the full size available.
I thought maybe it was because I had created a second gluster volume on
the same partition, so I tried to remove it.


I was able to put it in maintenance mode and detach it, but in no window 
was I able to get the "remove" option to be enabled.
Now if I select "attach data" I see ovirt thinks the volume is still 
there, although it is not.


2 questions.

1. how do I clear out the old removed volume from ovirt?

2. how do I get gluster to use the full disk space available?

It's a 1T partition but it only created a 225G gluster volume. Why? How
do I get the space back?


All three nodes look the same:
/dev/mapper/rootvg01-lv02  1.1T  135G  929G  13% /ovirt-store
ovirt1-gl.j2noc.com:/gv1   225G  135G   91G  60% 
/rhev/data-center/mnt/glusterSD/ovirt1-gl.j2noc.com:_gv1
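
One comparison that might help narrow down where the 225G figure comes from (a sketch):

# what gluster itself reports per brick
gluster volume status gv1 detail | grep -E 'Brick|Disk Space'

# what the OS reports for the filesystem backing the brick, on each node
df -h /ovirt-store/brick1/gv1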



[root@ovirt1 prod ovirt1-gl.j2noc.com:_gv1]# gluster volume status
Status of volume: gv1
Gluster process TCP Port  RDMA Port Online  Pid
-- 


Brick ovirt1-gl.j2noc.com:/ovirt-store/brick1/gv1   49152 0 Y 5218
Brick ovirt3-gl.j2noc.com:/ovirt-store/brick1/gv1   49152 0 Y 5678
Brick ovirt2-gl.j2noc.com:/ovirt-store/brick1/gv1   49152 0 Y 61386
NFS Server on localhost 2049  0 Y 31312
Self-heal Daemon on localhost   N/A   N/A Y 31320
NFS Server on ovirt3-gl.j2noc.com   2049  0 Y 38109
Self-heal Daemon on ovirt3-gl.j2noc.com N/A   N/A Y 38119
NFS Server on ovirt2-gl.j2noc.com   2049  0 Y 5387
Self-heal Daemon on ovirt2-gl.j2noc.com N/A   N/A Y 5402

Task Status of Volume gv1
-- 


There are no active volume tasks


Thanks.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] gluster brick hang/High CPU load after 10 hours file transfer test

2016-11-21 Thread James Zhu
Hi,


I encountered a gluster hang after a 10 hour file transfer test.


gluster 3.7.14, nfs-ganesha 2.3.2


We are running on a 56-core SuperMicro PC.


>sudo system-docker stats gluster nfs

CONTAINER   CPU %      MEM USAGE / LIMIT     MEM %   NET I/O     BLOCK I/O
gluster     2694.74%   2.434 GB / 270.4 GB   0.90%   0 B / 0 B   0 B / 1.073 MB
nfs         30.07%     146.6 MB / 270.4 GB   0.05%   0 B / 0 B   4.096 kB / 0 B


>top capture:

root S2556m   0%   24% /usr/local/sbin/glusterfsd -s denali-bm-qa-45 --volfile-id gluster-volume



Attaching gdb to one of the glusterfsd threads reported:

#0  pthread_spin_lock () at ../sysdeps/x86_64/nptl/pthread_spin_lock.S:32
#1  0x7f945f379ae5 in pl_inode_get (this=this@entry=0x7f9460010720, 
inode=inode@entry=0x7f943ffe1edc) at common.c:416
#2  0x7f945f3883be in pl_common_inodelk (frame=0x7f9467dc2ed8, 
this=0x7f9460010720, volume=0x7f945b5a9ac0 "gluster-volume-disperse-0", 
inode=0x7f943ffe1edc, cmd=6, flock=0x7f94678653d8, loc=0x7f94678652d8, fd=0x0,
xdata=0x7f946a2e9180) at inodelk.c:743
#3  0x7f945f388e27 in pl_inodelk (frame=<optimized out>, this=<optimized out>, volume=<optimized out>, loc=<optimized out>, cmd=<optimized out>, flock=<optimized out>, xdata=0x7f946a2e9180) at inodelk.c:816
#4  0x7f946a00b5c6 in default_inodelk (frame=0x7f9467dc2ed8, 
this=0x7f9460011bf0, volume=0x7f945b5a9ac0 "gluster-volume-disperse-0", 
loc=0x7f94678652d8, cmd=6, lock=0x7f94678653d8, xdata=0x7f946a2e9180) at 
defaults.c:2032
#5  0x7f946a01e324 in default_inodelk_resume (frame=0x7f9467dbabd4, 
this=0x7f9460013070, volume=0x7f945b5a9ac0 "gluster-volume-disperse-0", 
loc=0x7f94678652d8, cmd=6, lock=0x7f94678653d8, xdata=0x7f946a2e9180) at 
defaults.c:1589
#6  0x7f946a03c1ce in call_resume_wind (stub=<optimized out>) at call-stub.c:2210
#7  0x7f946a03c5bd in call_resume (stub=0x7f9467865298) at call-stub.c:2576
#8  0x7f945ef5b2b2 in iot_worker (data=0x7f9460052ec0) at io-threads.c:215
#9  0x7f946979270a in start_thread (arg=0x7f943cd5e700) at 
pthread_create.c:333
#10 0x7f94694c882d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:109

It shows that many glusterfsd sub-threads are waiting on a pthread_spin_lock;
this is what drives the CPU load so high.
   |-glusterfsd(772)-+-{glusterfsd}(773)
   | |-{glusterfsd}(774)
   | |-{glusterfsd}(775)
   | |-{glusterfsd}(776)
   | |-{glusterfsd}(777)
   | |-{glusterfsd}(778)
   | |-{glusterfsd}(779)
   | |-{glusterfsd}(780)
   | |-{glusterfsd}(781)
   | |-{glusterfsd}(782)
   | |-{glusterfsd}(783)
   | |-{glusterfsd}(784)
   | |-{glusterfsd}(785)
   | |-{glusterfsd}(786)
   | |-{glusterfsd}(787)
   | |-{glusterfsd}(788)
   | `-{glusterfsd}(789)
   |-glusterfsd(791)-+-{glusterfsd}(792)
   | |-{glusterfsd}(793)
   | |-{glusterfsd}(794)
   | |-{glusterfsd}(795)
   | |-{glusterfsd}(796)
   | |-{glusterfsd}(797)
   | |-{glusterfsd}(798)
   | |-{glusterfsd}(799)
   | |-{glusterfsd}(800)
   | |-{glusterfsd}(801)
   | |-{glusterfsd}(802)
   | |-{glusterfsd}(803)
   | |-{glusterfsd}(804)
   | |-{glusterfsd}(805)
   | |-{glusterfsd}(806)
   | |-{glusterfsd}(807)
   | `-{glusterfsd}(808)



If we just wait a few hours, the system recovers back to normal.


I am wondering how to dig deeper and discover what caused one of the threads to
hold the lock for so long. Please give me your professional advice.
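
For example, would a statedump of the brick process plus a quick CPU profile while it is spinning be the right direction? Something like (a sketch; assuming these commands exist in 3.7, and 772 is just the brick PID from the pstree above):

gluster --print-statedumpdir              # where the dump files get written
gluster volume statedump gluster-volume   # locks, call pool, memory for each brick process
perf top -g -p 772                        # where the spinning threads burn CPU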


Best Regards!


James Zhu
Email Disclaimer & Confidentiality Notice
This message is confidential and intended solely for the use of the recipient 
to whom they are addressed. If you are not the intended recipient you should 
not deliver, distribute or copy this e-mail. Please notify the sender 
immediately by e-mail and delete this e-mail from your system. Copyright © 2016 
by Istuary Innovation Labs, Inc. All rights reserved.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] write performance with NIC bonding

2016-09-21 Thread James Ching

Hi,

I'm using gluster 3.7.5 and I'm trying to get port bonding working
properly with the gluster protocol.  I've bonded the NICs using round
robin because I also bond them at the switch level with link aggregation.
I've used this type of bonding without a problem with my other
applications, but for some reason gluster does not want to utilize all 3
NICs for writes, although it does for reads... Have any of you come across
this or know why?  Here's the traffic output for the NICs; you can see that
RX is unbalanced but TX is completely balanced across the 3 NICs.  I've
tried mounting via both glusterfs and nfs; both result in the same
imbalance. Am I missing some configuration?
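
One thing I am not sure about is whether the switch's LAG hash is what decides the inbound distribution here (a single gluster TCP connection would then always land on one NIC). The bond state and hash policy can be checked with something like this (a sketch):

cat /proc/net/bonding/bond0                          # mode, slaves, link state
cat /sys/class/net/bond0/bonding/xmit_hash_policy    # only affects transmit hashing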



root@e-gluster-01:~# ifconfig
bond0 Link encap:Ethernet
  inet addr:  Bcast:128.33.23.255 Mask:255.255.248.0
  inet6 addr: fe80::46a8:42ff:fe43:8817/64 Scope:Link
  UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500 Metric:1
  RX packets:160972852 errors:0 dropped:0 overruns:0 frame:0
  TX packets:122295229 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:152800624950 (142.3 GiB)  TX bytes:138720356365 (129.1 GiB)


em1   Link encap:Ethernet
  UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
  RX packets:160793725 errors:0 dropped:0 overruns:0 frame:0
  TX packets:40763142 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:152688146880 (142.2 GiB)  TX bytes:46239971255 (43.0 GiB)

  Interrupt:41

em2   Link encap:Ethernet
  UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
  RX packets:92451 errors:0 dropped:0 overruns:0 frame:0
  TX packets:40750031 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:9001370 (8.5 MiB)  TX bytes:46216513162 (43.0 GiB)
  Interrupt:45

em3   Link encap:Ethernet
  UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
  RX packets:86676 errors:0 dropped:0 overruns:0 frame:0
  TX packets:40782056 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:103476700 (98.6 MiB)  TX bytes:46263871948 (43.0 GiB)
  Interrupt:40


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] gluster VM disk permissions

2016-05-19 Thread Bill James
I tried posting this to ovirt-users list but got no response so I'll try 
here too.



I just setup a new ovirt cluster with gluster & nfs data domains.

VMs on the NFS domain startup with no issues.
VMs on the gluster domains complain of "Permission denied" on startup.

2016-05-17 14:14:51,959 ERROR [org.ovirt.engine.core.dal.dbbroker.audi
tloghandling.AuditLogDirector] (ForkJoinPool-1-worker-11) [] Correlation 
ID: null, Call Stack: null, Custom Event ID: -1, Message: VM 
billj7-2.j2noc.com is down with error. Exit message: internal error: 
process exited while connecting to monitor: 2016-05-17T21:14:51.162932Z 
qemu-kvm: -drive 
file=/rhev/data-center/0001-0001-0001-0001-02c5/22df0943-c131-4ed8-ba9c-05923afcf8e3/images/2ddf0d0e-6a7e-4eb9-b1d5-6d7792da0d25/a2b0a04d-041f-4342-9687-142cc641b35e,if=none,id=drive-virtio-disk0,format=raw,serial=2ddf0d0e-6a7e-4eb9-b1d5-6d7792da0d25,cache=none,werror=stop,rerror=stop,aio=threads: 
Could not open 
'/rhev/data-center/0001-0001-0001-0001-02c5/22df0943-c131-4ed8-ba9c-05923afcf8e3/images/2ddf0d0e-6a7e-4eb9-b1d5-6d7792da0d25/a2b0a04d-041f-4342-9687-142cc641b35e': 
Permission denied



I did setup gluster permissions:
gluster volume set gv1 storage.owner-uid 36
gluster volume set gv1 storage.owner-gid 36

files look fine:
[root@ovirt1 prod 2ddf0d0e-6a7e-4eb9-b1d5-6d7792da0d25]# ls -lah
total 2.0G
drwxr-xr-x  2 vdsm kvm 4.0K May 17 09:39 .
drwxr-xr-x 11 vdsm kvm 4.0K May 17 10:40 ..
-rw-rw  1 vdsm kvm  20G May 17 10:33 a2b0a04d-041f-4342-9687-142cc641b35e
-rw-rw  1 vdsm kvm 1.0M May 17 09:38 a2b0a04d-041f-4342-9687-142cc641b35e.lease
-rw-r--r--  1 vdsm kvm  259 May 17 09:39 a2b0a04d-041f-4342-9687-142cc641b35e.meta


I did check and the vdsm user can read the file just fine.
If I change the disk mode to 666 the VM starts up fine.
Also, if I chgrp to qemu the VM starts up fine.

[root@ovirt2 prod a7af2477-4a19-4f01-9de1-c939c99e53ad]# ls -l 253f9615-f111-45ca-bdce-cbc9e70406df
-rw-rw 1 vdsm qemu 21474836480 May 18 11:38 253f9615-f111-45ca-bdce-cbc9e70406df



Seems similar to issue here but that suggests it was fixed:
https://bugzilla.redhat.com/show_bug.cgi?id=1052114



[root@ovirt1 prod 2ddf0d0e-6a7e-4eb9-b1d5-6d7792da0d25]# grep 36 
/etc/passwd /etc/group

/etc/passwd:vdsm:x:36:36:Node Virtualization Manager:/:/bin/bash
/etc/group:kvm:x:36:qemu,sanlock


ovirt-engine-3.6.4.1-1.el7.centos.noarch
glusterfs-3.7.11-1.el7.x86_64
qemu-img-ev-2.3.0-31.el7_2.4.1.x86_64
qemu-kvm-ev-2.3.0-31.el7_2.4.1.x86_64
libvirt-daemon-1.2.17-13.el7_2.4.x86_64


I also set libvirt qemu user to root, for import-to-ovirt.pl script.

[root@ovirt1 prod 2ddf0d0e-6a7e-4eb9-b1d5-6d7792da0d25]# grep ^user 
/etc/libvirt/qemu.conf

user = "root"
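
It may also be worth me comparing the group and dynamic_ownership settings in the same file across the nodes, something like:

grep -E '^(user|group|dynamic_ownership)' /etc/libvirt/qemu.conf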


[root@ovirt1 prod 2ddf0d0e-6a7e-4eb9-b1d5-6d7792da0d25]# gluster volume 
info gv1


Volume Name: gv1
Type: Replicate
Volume ID: 062aa1a5-91e8-420d-800e-b8bc4aff20d8
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: ovirt1-gl.j2noc.com:/ovirt-store/brick1/gv1
Brick2: ovirt2-gl.j2noc.com:/ovirt-store/brick1/gv1
Brick3: ovirt3-gl.j2noc.com:/ovirt-store/brick1/gv1
Options Reconfigured:
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
features.shard: on
features.shard-block-size: 64MB
storage.owner-uid: 36
storage.owner-gid: 36

[root@ovirt1 prod 2ddf0d0e-6a7e-4eb9-b1d5-6d7792da0d25]# gluster volume 
status gv1

Status of volume: gv1
Gluster process TCP Port  RDMA Port Online  Pid
--
Brick ovirt1-gl.j2noc.com:/ovirt-store/brick1/gv1   49152 0 Y   2046
Brick ovirt2-gl.j2noc.com:/ovirt-store/brick1/gv1   49152 0 Y   22532
Brick ovirt3-gl.j2noc.com:/ovirt-store/brick1/gv1   49152 0 Y   59683
NFS Server on localhost 2049  0 Y   2200
Self-heal Daemon on localhost   N/A   N/A Y   2232
NFS Server on ovirt3-gl.j2noc.com   2049  0 Y   65363
Self-heal Daemon on ovirt3-gl.j2noc.com N/A   N/A Y   65371
NFS Server on ovirt2-gl.j2noc.com   2049  0 Y   17621
Self-heal Daemon on ovirt2-gl.j2noc.com N/A   N/A Y   17629

Task Status of Volume gv1
--
There are no active volume tasks



Any ideas on why ovirt thinks it needs the qemu group?



___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [ovirt-users] ovirt glusterfs performance

2016-02-12 Thread Bill James

thank you for the reply.

We setup gluster using the names associated with  NIC 2 IP.
 Brick1: ovirt1-ks.test.j2noc.com:/gluster-store/brick1/gv1
 Brick2: ovirt2-ks.test.j2noc.com:/gluster-store/brick1/gv1
 Brick3: ovirt3-ks.test.j2noc.com:/gluster-store/brick1/gv1

That's NIC 2's IP.
Using 'iftop -i eno2 -L 5 -t' :

dd if=/dev/zero of=/root/testfile bs=1M count=1000 oflag=direct
1048576000 bytes (1.0 GB) copied, 68.0714 s, 15.4 MB/s

Peak rate (sent/received/total):        281Mb   5.36Mb    282Mb
Cumulative (sent/received/total):      1.96GB   14.6MB   1.97GB


gluster volume info gv1:
 Options Reconfigured:
 performance.write-behind-window-size: 4MB
 performance.readdir-ahead: on
 performance.cache-size: 1GB
 performance.write-behind: off

performance.write-behind: off didn't help.
Neither did any other changes I've tried.


There is no VM traffic on this VM right now except my test.



On 02/10/2016 11:55 PM, Nir Soffer wrote:

On Thu, Feb 11, 2016 at 2:42 AM, Ravishankar N  wrote:

+gluster-users

Does disabling 'performance.write-behind' give a better throughput?



On 02/10/2016 11:06 PM, Bill James wrote:

I'm setting up a ovirt cluster using glusterfs and noticing not stellar
performance.
Maybe my setup could use some adjustments?

3 hardware nodes running centos7.2, glusterfs 3.7.6.1, ovirt 3.6.2.6-1.
Each node has 8 spindles configured in 1 array which is split using LVM
with one logical volume for system and one for gluster.
They each have 4 NICs,
  NIC1 = ovirtmgmt
  NIC2 = gluster  (1GbE)

How do you ensure that gluster traffic is using this nic?


  NIC3 = VM traffic

How do you ensure that vm traffic is using this nic?


I tried with default glusterfs settings

And did you find any difference?


and also with:
performance.cache-size: 1GB
performance.readdir-ahead: on
performance.write-behind-window-size: 4MB

[root@ovirt3 test scripts]# gluster volume info gv1

Volume Name: gv1
Type: Replicate
Volume ID: 71afc35b-09d7-4384-ab22-57d032a0f1a2
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: ovirt1-ks.test.j2noc.com:/gluster-store/brick1/gv1
Brick2: ovirt2-ks.test.j2noc.com:/gluster-store/brick1/gv1
Brick3: ovirt3-ks.test.j2noc.com:/gluster-store/brick1/gv1
Options Reconfigured:
performance.cache-size: 1GB
performance.readdir-ahead: on
performance.write-behind-window-size: 4MB


Using simple dd test on VM in ovirt:
   dd if=/dev/zero of=/root/testfile bs=1G count=1 oflag=direct

block size of 1G?!

Try 1M (our default for storage operations)


   1073741824 bytes (1.1 GB) copied, 65.9337 s, 16.3 MB/s

Another VM not in ovirt using nfs:
dd if=/dev/zero of=/root/testfile bs=1G count=1 oflag=direct
   1073741824 bytes (1.1 GB) copied, 27.0079 s, 39.8 MB/s


Is that expected or is there a better way to set it up to get better
performance?

Adding Niels for advice.


This email, its contents and 

Please avoid this, this is a public mailing list, everything you write
here is public.

Nir

I'll have to look into how to remove this sig for this mailing list

Cloud Services for Business www.j2.com
j2 | eFax | eVoice | FuseMail | Campaigner | KeepItSafe | Onebox


This email, its contents and attachments contain information from j2 Global, 
Inc. and/or its affiliates which may be privileged, confidential or otherwise 
protected from disclosure. The information is intended to be for the 
addressee(s) only. If you are not an addressee, any disclosure, copy, 
distribution, or use of the contents of this message is prohibited. If you have 
received this email in error please notify the sender by reply e-mail and 
delete the original message and any copies. (c) 2015 j2 Global, Inc. All rights 
reserved. eFax, eVoice, Campaigner, FuseMail, KeepItSafe, and Onebox are 
registered trademarks of j2 Global, Inc. and its affiliates.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [ovirt-users] ovirt glusterfs performance

2016-02-12 Thread Bill James

wow, that made a whole lot of difference!
Thank you!

[root@billjov1 ~]# time dd if=/dev/zero of=/root/testfile1 bs=1M 
count=1000 oflag=direct

1048576000 bytes (1.0 GB) copied, 20.2778 s, 51.7 MB/s

these are the options now for the record.

Options Reconfigured:
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
performance.write-behind: off
performance.write-behind-window-size: 4MB
performance.cache-size: 1GB
performance.readdir-ahead: on
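
These should roughly match what the virt group applies; for anyone wanting to do it in one step, a sketch (group file path as in the stock glusterfs packaging):

gluster volume set gv1 group virt
cat /var/lib/glusterd/groups/virt    # the plain-text list of options behind the group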


Thanks again!


On 2/11/16 8:18 PM, Ravishankar N wrote:

Hi Bill,
Can you enable the virt-profile setting for your volume and see if that 
helps? You need to enable this optimization when you create the volume 
using ovirt, or use the following command for an existing volume:


#gluster volume set  group virt

-Ravi


On 02/12/2016 05:22 AM, Bill James wrote:

My apologies, I'm showing how much of a noob I am.
Ignore last direct to gluster numbers, as that wasn't really glusterfs.


[root@ovirt2 test ~]# mount -t glusterfs 
ovirt2-ks.test.j2noc.com:/gv1 /mnt/tmp/
[root@ovirt2 test ~]# time dd if=/dev/zero of=/mnt/tmp/testfile2 
bs=1M count=1000 oflag=direct

1048576000 bytes (1.0 GB) copied, 65.8596 s, 15.9 MB/s

That's more how I expected, it is pointing to glusterfs performance.



On 02/11/2016 03:27 PM, Bill James wrote:
don't know if it helps, but I ran a few more tests, all from the 
same hardware node.


The VM:
[root@billjov1 ~]# time dd if=/dev/zero of=/root/testfile bs=1M 
count=1000 oflag=direct

1048576000 bytes (1.0 GB) copied, 62.5535 s, 16.8 MB/s

Writing directly to gluster volume:
[root@ovirt2 test ~]# time dd if=/dev/zero 
of=/gluster-store/brick1/gv1/testfile bs=1M count=1000 oflag=direct

1048576000 bytes (1.0 GB) copied, 9.92048 s, 106 MB/s


Writing to NFS volume:
[root@ovirt2 test ~]# time dd if=/dev/zero 
of=/mnt/storage/qa/testfile bs=1M count=1000 oflag=direct

1048576000 bytes (1.0 GB) copied, 10.5776 s, 99.1 MB/s

NFS & Gluster are using the same interface. Tests were not run at 
same time.


This would suggest my problem isn't glusterfs, but the VM performance.



On 02/11/2016 03:13 PM, Bill James wrote:

xml attached.


On 02/11/2016 12:28 PM, Nir Soffer wrote:
On Thu, Feb 11, 2016 at 8:27 PM, Bill James  
wrote:

thank you for the reply.

We setup gluster using the names associated with  NIC 2 IP.
  Brick1: ovirt1-ks.test.j2noc.com:/gluster-store/brick1/gv1
  Brick2: ovirt2-ks.test.j2noc.com:/gluster-store/brick1/gv1
  Brick3: ovirt3-ks.test.j2noc.com:/gluster-store/brick1/gv1

That's NIC 2's IP.
Using 'iftop -i eno2 -L 5 -t' :

dd if=/dev/zero of=/root/testfile bs=1M count=1000 oflag=direct
1048576000 bytes (1.0 GB) copied, 68.0714 s, 15.4 MB/s

Can you share the xml of this vm? You can find it in vdsm log,
at the time you start the vm.

Or you can do (on the host):

# virsh
virsh # list
(username: vdsm@ovirt password: shibboleth)
virsh # dumpxml vm-id


Peak rate (sent/received/total):  281Mb 5.36Mb
282Mb
Cumulative (sent/received/total): 1.96GB 14.6MB
1.97GB

gluster volume info gv1:
  Options Reconfigured:
  performance.write-behind-window-size: 4MB
  performance.readdir-ahead: on
  performance.cache-size: 1GB
  performance.write-behind: off

performance.write-behind: off didn't help.
Neither did any other changes I've tried.


There is no VM traffic on this VM right now except my test.



On 02/10/2016 11:55 PM, Nir Soffer wrote:
On Thu, Feb 11, 2016 at 2:42 AM, Ravishankar N 


wrote:

+gluster-users

Does disabling 'performance.write-behind' give a better 
throughput?




On 02/10/2016 11:06 PM, Bill James wrote:
I'm setting up a ovirt cluster using glusterfs and noticing 
not stellar

performance.
Maybe my setup could use some adjustments?

3 hardware nodes running centos7.2, glusterfs 3.7.6.1, ovirt 
3.6.2.6-1.
Each node has 8 spindles configured in 1 array which is split 
using LVM

with one logical volume for system and one for gluster.
They each have 4 NICs,
   NIC1 = ovirtmgmt
   NIC2 = gluster  (1GbE)

How do you ensure that gluster traffic is using this nic?


   NIC3 = VM traffic

How do you ensure that vm traffic is using this nic?


I tried with default glusterfs settings

And did you find any difference?


and also with:
performance.cache-size: 1GB
performance.readdir-ahead: on
performance.write-behind-window-size: 4MB

[root@ovirt3 test scripts]# gluster volume info gv1

Volume Name: gv1
Type: Replicate
Volume ID: 71afc35b-09d7-4384-ab22-57d032a0f1a2
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: ovirt1-ks.test.j2noc.com:/gluster-store/brick1/gv1
Brick2: ovirt2-ks.test.j2noc.com:/gluster-store/brick1/gv1
Brick3: ovirt3-ks.test.j2noc.com:/gluster-store/brick1/gv1
Options Reconfigured:
performance.cache-size: 1GB
perfo

Re: [Gluster-users] [ovirt-users] ovirt glusterfs performance

2016-02-11 Thread Bill James

My apologies, I'm showing how much of a noob I am.
Ignore the last direct-to-gluster numbers, as that wasn't really glusterfs.


[root@ovirt2 test ~]# mount -t glusterfs ovirt2-ks.test.j2noc.com:/gv1 
/mnt/tmp/
[root@ovirt2 test ~]# time dd if=/dev/zero of=/mnt/tmp/testfile2 bs=1M 
count=1000 oflag=direct

1048576000 bytes (1.0 GB) copied, 65.8596 s, 15.9 MB/s

That's more how I expected, it is pointing to glusterfs performance.



On 02/11/2016 03:27 PM, Bill James wrote:
don't know if it helps, but I ran a few more tests, all from the same 
hardware node.


The VM:
[root@billjov1 ~]# time dd if=/dev/zero of=/root/testfile bs=1M 
count=1000 oflag=direct

1048576000 bytes (1.0 GB) copied, 62.5535 s, 16.8 MB/s

Writing directly to gluster volume:
[root@ovirt2 test ~]# time dd if=/dev/zero 
of=/gluster-store/brick1/gv1/testfile bs=1M count=1000 oflag=direct

1048576000 bytes (1.0 GB) copied, 9.92048 s, 106 MB/s


Writing to NFS volume:
[root@ovirt2 test ~]# time dd if=/dev/zero of=/mnt/storage/qa/testfile 
bs=1M count=1000 oflag=direct

1048576000 bytes (1.0 GB) copied, 10.5776 s, 99.1 MB/s

NFS & Gluster are using the same interface. Tests were not run at same 
time.


This would suggest my problem isn't glusterfs, but the VM performance.



On 02/11/2016 03:13 PM, Bill James wrote:

xml attached.


On 02/11/2016 12:28 PM, Nir Soffer wrote:

On Thu, Feb 11, 2016 at 8:27 PM, Bill James  wrote:

thank you for the reply.

We setup gluster using the names associated with  NIC 2 IP.
  Brick1: ovirt1-ks.test.j2noc.com:/gluster-store/brick1/gv1
  Brick2: ovirt2-ks.test.j2noc.com:/gluster-store/brick1/gv1
  Brick3: ovirt3-ks.test.j2noc.com:/gluster-store/brick1/gv1

That's NIC 2's IP.
Using 'iftop -i eno2 -L 5 -t' :

dd if=/dev/zero of=/root/testfile bs=1M count=1000 oflag=direct
1048576000 bytes (1.0 GB) copied, 68.0714 s, 15.4 MB/s

Can you share the xml of this vm? You can find it in vdsm log,
at the time you start the vm.

Or you can do (on the host):

# virsh
virsh # list
(username: vdsm@ovirt password: shibboleth)
virsh # dumpxml vm-id


Peak rate (sent/received/total):  281Mb 5.36Mb
282Mb
Cumulative (sent/received/total):1.96GB 14.6MB
1.97GB

gluster volume info gv1:
  Options Reconfigured:
  performance.write-behind-window-size: 4MB
  performance.readdir-ahead: on
  performance.cache-size: 1GB
  performance.write-behind: off

performance.write-behind: off didn't help.
Neither did any other changes I've tried.


There is no VM traffic on this VM right now except my test.



On 02/10/2016 11:55 PM, Nir Soffer wrote:
On Thu, Feb 11, 2016 at 2:42 AM, Ravishankar N 


wrote:

+gluster-users

Does disabling 'performance.write-behind' give a better throughput?



On 02/10/2016 11:06 PM, Bill James wrote:
I'm setting up a ovirt cluster using glusterfs and noticing not 
stellar

performance.
Maybe my setup could use some adjustments?

3 hardware nodes running centos7.2, glusterfs 3.7.6.1, ovirt 
3.6.2.6-1.
Each node has 8 spindles configured in 1 array which is split 
using LVM

with one logical volume for system and one for gluster.
They each have 4 NICs,
   NIC1 = ovirtmgmt
   NIC2 = gluster  (1GbE)

How do you ensure that gluster traffic is using this nic?


   NIC3 = VM traffic

How do you ensure that vm traffic is using this nic?


I tried with default glusterfs settings

And did you find any difference?


and also with:
performance.cache-size: 1GB
performance.readdir-ahead: on
performance.write-behind-window-size: 4MB

[root@ovirt3 test scripts]# gluster volume info gv1

Volume Name: gv1
Type: Replicate
Volume ID: 71afc35b-09d7-4384-ab22-57d032a0f1a2
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: ovirt1-ks.test.j2noc.com:/gluster-store/brick1/gv1
Brick2: ovirt2-ks.test.j2noc.com:/gluster-store/brick1/gv1
Brick3: ovirt3-ks.test.j2noc.com:/gluster-store/brick1/gv1
Options Reconfigured:
performance.cache-size: 1GB
performance.readdir-ahead: on
performance.write-behind-window-size: 4MB


Using simple dd test on VM in ovirt:
dd if=/dev/zero of=/root/testfile bs=1G count=1 oflag=direct

block size of 1G?!

Try 1M (our default for storage operations)


1073741824 bytes (1.1 GB) copied, 65.9337 s, 16.3 MB/s

Another VM not in ovirt using nfs:
 dd if=/dev/zero of=/root/testfile bs=1G count=1 oflag=direct
1073741824 bytes (1.1 GB) copied, 27.0079 s, 39.8 MB/s


Is that expected or is there a better way to set it up to get 
better

performance?

Adding Niels for advice.


This email, its contents and 
Please avoid this, this is a public mailing list, everything you 
write

here is public.

Nir
I'll have to look into how to remove this sig for this mailing 
list


Cloud Services for Business www.j2.com
j2 | eFax | eVoice | FuseMail | Campaigner | KeepItSafe | Onebox


This email, its contents and attachments contain information from 
j2 Global,

Re: [Gluster-users] [ovirt-users] ovirt glusterfs performance

2016-02-11 Thread Bill James
don't know if it helps, but I ran a few more tests, all from the same 
hardware node.


The VM:
[root@billjov1 ~]# time dd if=/dev/zero of=/root/testfile bs=1M 
count=1000 oflag=direct

1048576000 bytes (1.0 GB) copied, 62.5535 s, 16.8 MB/s

Writing directly to gluster volume:
[root@ovirt2 test ~]# time dd if=/dev/zero 
of=/gluster-store/brick1/gv1/testfile bs=1M count=1000 oflag=direct

1048576000 bytes (1.0 GB) copied, 9.92048 s, 106 MB/s


Writing to NFS volume:
[root@ovirt2 test ~]# time dd if=/dev/zero of=/mnt/storage/qa/testfile 
bs=1M count=1000 oflag=direct

1048576000 bytes (1.0 GB) copied, 10.5776 s, 99.1 MB/s

NFS & Gluster are using the same interface. Tests were not run at the same time.

This would suggest my problem isn't glusterfs, but the VM performance.



On 02/11/2016 03:13 PM, Bill James wrote:

xml attached.


On 02/11/2016 12:28 PM, Nir Soffer wrote:

On Thu, Feb 11, 2016 at 8:27 PM, Bill James  wrote:

thank you for the reply.

We setup gluster using the names associated with  NIC 2 IP.
  Brick1: ovirt1-ks.test.j2noc.com:/gluster-store/brick1/gv1
  Brick2: ovirt2-ks.test.j2noc.com:/gluster-store/brick1/gv1
  Brick3: ovirt3-ks.test.j2noc.com:/gluster-store/brick1/gv1

That's NIC 2's IP.
Using 'iftop -i eno2 -L 5 -t' :

dd if=/dev/zero of=/root/testfile bs=1M count=1000 oflag=direct
1048576000 bytes (1.0 GB) copied, 68.0714 s, 15.4 MB/s

Can you share the xml of this vm? You can find it in vdsm log,
at the time you start the vm.

Or you can do (on the host):

# virsh
virsh # list
(username: vdsm@ovirt password: shibboleth)
virsh # dumpxml vm-id


Peak rate (sent/received/total):  281Mb 5.36Mb
282Mb
Cumulative (sent/received/total):1.96GB 14.6MB
1.97GB

gluster volume info gv1:
  Options Reconfigured:
  performance.write-behind-window-size: 4MB
  performance.readdir-ahead: on
  performance.cache-size: 1GB
  performance.write-behind: off

performance.write-behind: off didn't help.
Neither did any other changes I've tried.


There is no VM traffic on this VM right now except my test.



On 02/10/2016 11:55 PM, Nir Soffer wrote:
On Thu, Feb 11, 2016 at 2:42 AM, Ravishankar N 


wrote:

+gluster-users

Does disabling 'performance.write-behind' give a better throughput?



On 02/10/2016 11:06 PM, Bill James wrote:
I'm setting up a ovirt cluster using glusterfs and noticing not 
stellar

performance.
Maybe my setup could use some adjustments?

3 hardware nodes running centos7.2, glusterfs 3.7.6.1, ovirt 
3.6.2.6-1.
Each node has 8 spindles configured in 1 array which is split 
using LVM

with one logical volume for system and one for gluster.
They each have 4 NICs,
   NIC1 = ovirtmgmt
   NIC2 = gluster  (1GbE)

How do you ensure that gluster traffic is using this nic?


   NIC3 = VM traffic

How do you ensure that vm traffic is using this nic?


I tried with default glusterfs settings

And did you find any difference?


and also with:
performance.cache-size: 1GB
performance.readdir-ahead: on
performance.write-behind-window-size: 4MB

[root@ovirt3 test scripts]# gluster volume info gv1

Volume Name: gv1
Type: Replicate
Volume ID: 71afc35b-09d7-4384-ab22-57d032a0f1a2
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: ovirt1-ks.test.j2noc.com:/gluster-store/brick1/gv1
Brick2: ovirt2-ks.test.j2noc.com:/gluster-store/brick1/gv1
Brick3: ovirt3-ks.test.j2noc.com:/gluster-store/brick1/gv1
Options Reconfigured:
performance.cache-size: 1GB
performance.readdir-ahead: on
performance.write-behind-window-size: 4MB


Using simple dd test on VM in ovirt:
dd if=/dev/zero of=/root/testfile bs=1G count=1 oflag=direct

block size of 1G?!

Try 1M (our default for storage operations)


1073741824 bytes (1.1 GB) copied, 65.9337 s, 16.3 MB/s

Another VM not in ovirt using nfs:
 dd if=/dev/zero of=/root/testfile bs=1G count=1 oflag=direct
1073741824 bytes (1.1 GB) copied, 27.0079 s, 39.8 MB/s


Is that expected or is there a better way to set it up to get better
performance?

Adding Niels for advice.


This email, its contents and 

Please avoid this, this is a public mailing list, everything you write
here is public.

Nir

I'll have to look into how to remove this sig for this mailing list

Cloud Services for Business www.j2.com
j2 | eFax | eVoice | FuseMail | Campaigner | KeepItSafe | Onebox


This email, its contents and attachments contain information from j2 
Global,

Inc. and/or its affiliates which may be privileged, confidential or
otherwise protected from disclosure. The information is intended to 
be for
the addressee(s) only. If you are not an addressee, any disclosure, 
copy,
distribution, or use of the contents of this message is prohibited. 
If you
have received this email in error please notify the sender by reply 
e-mail
and delete the original message and any copies. (c) 2015 j2 Global, 
Inc. All
rights reserved. eFax, eVoice, Campaigner, Fu

Re: [Gluster-users] [ovirt-users] ovirt glusterfs performance

2016-02-11 Thread Bill James

xml attached.


On 02/11/2016 12:28 PM, Nir Soffer wrote:

On Thu, Feb 11, 2016 at 8:27 PM, Bill James  wrote:

thank you for the reply.

We setup gluster using the names associated with  NIC 2 IP.
  Brick1: ovirt1-ks.test.j2noc.com:/gluster-store/brick1/gv1
  Brick2: ovirt2-ks.test.j2noc.com:/gluster-store/brick1/gv1
  Brick3: ovirt3-ks.test.j2noc.com:/gluster-store/brick1/gv1

That's NIC 2's IP.
Using 'iftop -i eno2 -L 5 -t' :

dd if=/dev/zero of=/root/testfile bs=1M count=1000 oflag=direct
1048576000 bytes (1.0 GB) copied, 68.0714 s, 15.4 MB/s

Can you share the xml of this vm? You can find it in vdsm log,
at the time you start the vm.

Or you can do (on the host):

# virsh
virsh # list
(username: vdsm@ovirt password: shibboleth)
virsh # dumpxml vm-id


Peak rate (sent/received/total):  281Mb 5.36Mb
282Mb
Cumulative (sent/received/total):1.96GB 14.6MB
1.97GB

gluster volume info gv1:
  Options Reconfigured:
  performance.write-behind-window-size: 4MB
  performance.readdir-ahead: on
  performance.cache-size: 1GB
  performance.write-behind: off

performance.write-behind: off didn't help.
Neither did any other changes I've tried.


There is no VM traffic on this VM right now except my test.



On 02/10/2016 11:55 PM, Nir Soffer wrote:

On Thu, Feb 11, 2016 at 2:42 AM, Ravishankar N 
wrote:

+gluster-users

Does disabling 'performance.write-behind' give a better throughput?



On 02/10/2016 11:06 PM, Bill James wrote:

I'm setting up a ovirt cluster using glusterfs and noticing not stellar
performance.
Maybe my setup could use some adjustments?

3 hardware nodes running centos7.2, glusterfs 3.7.6.1, ovirt 3.6.2.6-1.
Each node has 8 spindles configured in 1 array which is split using LVM
with one logical volume for system and one for gluster.
They each have 4 NICs,
   NIC1 = ovirtmgmt
   NIC2 = gluster  (1GbE)

How do you ensure that gluster traffic is using this nic?


   NIC3 = VM traffic

How do you ensure that vm traffic is using this nic?


I tried with default glusterfs settings

And did you find any difference?


and also with:
performance.cache-size: 1GB
performance.readdir-ahead: on
performance.write-behind-window-size: 4MB

[root@ovirt3 test scripts]# gluster volume info gv1

Volume Name: gv1
Type: Replicate
Volume ID: 71afc35b-09d7-4384-ab22-57d032a0f1a2
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: ovirt1-ks.test.j2noc.com:/gluster-store/brick1/gv1
Brick2: ovirt2-ks.test.j2noc.com:/gluster-store/brick1/gv1
Brick3: ovirt3-ks.test.j2noc.com:/gluster-store/brick1/gv1
Options Reconfigured:
performance.cache-size: 1GB
performance.readdir-ahead: on
performance.write-behind-window-size: 4MB


Using simple dd test on VM in ovirt:
dd if=/dev/zero of=/root/testfile bs=1G count=1 oflag=direct

block size of 1G?!

Try 1M (our default for storage operations)


1073741824 bytes (1.1 GB) copied, 65.9337 s, 16.3 MB/s

Another VM not in ovirt using nfs:
 dd if=/dev/zero of=/root/testfile bs=1G count=1 oflag=direct
1073741824 bytes (1.1 GB) copied, 27.0079 s, 39.8 MB/s


Is that expected or is there a better way to set it up to get better
performance?

Adding Niels for advice.


This email, its contents and 

Please avoid this, this is a public mailing list, everything you write
here is public.

Nir

I'll have to look into how to remove this sig for this mailing list

Cloud Services for Business www.j2.com
j2 | eFax | eVoice | FuseMail | Campaigner | KeepItSafe | Onebox


This email, its contents and attachments contain information from j2 Global,
Inc. and/or its affiliates which may be privileged, confidential or
otherwise protected from disclosure. The information is intended to be for
the addressee(s) only. If you are not an addressee, any disclosure, copy,
distribution, or use of the contents of this message is prohibited. If you
have received this email in error please notify the sender by reply e-mail
and delete the original message and any copies. (c) 2015 j2 Global, Inc. All
rights reserved. eFax, eVoice, Campaigner, FuseMail, KeepItSafe, and Onebox
are registered trademarks of j2 Global, Inc. and its affiliates.


Please enter your authentication name: Please enter your password:
[virsh dumpxml output for the VM billjov1.test.j2noc.com followed here, but the
XML tags were stripped by the list archive. What is still recoverable: VM UUID
c6aa56b4-f387-4a5b-84b6-a7db6ef89686, 16 vCPUs, SandyBridge CPU model, SMBIOS
strings "oVirt" / "oVirt Node" / 7-2.1511.el7.centos.2.10, oVirt tune metadata
(http://ovirt.org/vm/tune/1.0), emulator /usr/libexec/qemu-kvm. The devices
section, including the disk definitions, did not survive.]

Re: [Gluster-users] volume command issues

2015-07-06 Thread Alun James
Ok thanks. I see this: 


Unable to get lock for uuid: 06a089b9-01d8-48c1-b02c-fa051897ab45, lock held 
by: 749ac41e-6d00-4d3c-9fb8-55b73135b4ee 


I have tried gluster vol status on node 749ac41e-6d00-4d3c-9fb8-55b73135b4ee 
too, however it is giving: 



gluster vol status 


Locking failed on my-web01 . Please check log file for details. 
Locking failed on my-web02. Please check log file for details. 


- Original Message -

From: "Atin Mukherjee"  
To: "Alun James"  
Cc: gluster-users@gluster.org 
Sent: Monday, 6 July, 2015 3:07:57 PM 
Subject: Re: [Gluster-users] volume command issues 


-Atin 
Sent from one plus one 
On Jul 6, 2015 7:30 PM, "Alun James" < aja...@tibus.com > wrote: 
> 
> Hi folks, 
> 
> I have a 3 node gluster volume that developed a problem with gluster vol 
> commands e.g. gluster vol status 
> 
> GlusterFS version 3.6.2 
> 
> The volume is successfully mounted and appears to be working correctly (for 
> now). 
> 
> gluster volume info 
> 
> Volume Name: my_filestore_vol 
> Type: Replicate 
> Volume ID:  
> Status: Started 
> Number of Bricks: 1 x 3 = 3 
> Transport-type: tcp 
> Bricks: 
> Brick1: my-web01:/export/brick0 
> Brick2: my-web02:/export/brick0 
> Brick3: my-web03:/export/brick0 
> Options Reconfigured: 
> nfs.drc: off 
> diagnostics.brick-log-level: WARNING 
> 
> gluster volume status 
> 
> Another transaction is in progress. Please try again after sometime. 
This indicates that another command has acquired the lock on the same volume. 
Please check etc-glusterfs-*.log (the glusterd log) to see which node has 
acquired the volume lock. The cmd_history.log file will also hint at which 
other concurrent command was run. 
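In practice that check looks something like this on each node (a sketch; the
log locations assume the usual /var/log/glusterfs layout, adjust if yours
differ):

# look for volume-lock acquire/release messages in the glusterd log
grep -i "lock" /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | tail -n 20
# see which gluster CLI commands were run recently on this node
tail -n 50 /var/log/glusterfs/cmd_history.log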
> 
> I am also seeing errors in the self-heal logs... 
> 
> /var/log/glusterfs/glfsheal-my_filestore_vol.log 
> 
> [2015-07-06 13:49:12.609285] E [glfs-mgmt.c:520:mgmt_getspec_cbk] 0-gfapi: 
> failed to get the 'volume file' from server 
> [2015-07-06 13:49:12.609331] E [glfs-mgmt.c:599:mgmt_getspec_cbk] 
> 0-glfs-mgmt: failed to fetch volume file (key:my_filestore_vol) 
> [2015-07-06 13:49:27.377875] E [glfs-mgmt.c:520:mgmt_getspec_cbk] 0-gfapi: 
> failed to get the 'volume file' from server 
> [2015-07-06 13:49:27.377928] E [glfs-mgmt.c:599:mgmt_getspec_cbk] 
> 0-glfs-mgmt: failed to fetch volume file (key:my_filestore_vol) 
> [2015-07-06 13:50:21.963651] E [glfs-mgmt.c:520:mgmt_getspec_cbk] 0-gfapi: 
> failed to get the 'volume file' from server 
> [2015-07-06 13:50:21.963710] E [glfs-mgmt.c:599:mgmt_getspec_cbk] 
> 0-glfs-mgmt: failed to fetch volume file (key:my_filestore_vol) 
> 
> Has anyone experienced similar? 
> 
> ALUN JAMES 
> Senior Systems Engineer 
> Tibus 
> 
> T: +44 (0)28 9033 1122 
> E: aja...@tibus.com 
> W: www.tibus.com 
> 
> Follow us on Twitter @tibus 
> 
> Tibus is a trading name of The Internet Business Ltd, a company limited by 
> share capital and registered in Northern Ireland, NI31235. It is part of UTV 
> Media Plc. 
> 
> This email and any attachment may contain confidential information for the 
> sole use of the intended recipient. Any review, use, distribution or 
> disclosure by others is strictly prohibited. If you are not the intended 
> recipient (or authorised to receive for the recipient), please contact the 
> sender by reply email and delete all copies of this message. 
> 
> ___ 
> Gluster-users mailing list 
> Gluster-users@gluster.org 
> http://www.gluster.org/mailman/listinfo/gluster-users 

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] volume command issues

2015-07-06 Thread Alun James
Hi folks, 


I have a 3 node gluster volume that developed a problem with gluster vol 
commands e.g. gluster vol status 


GlusterFS version 3.6.2 


The volume is successfully mounted and appears to be working correctly (for 
now). 



gluster volume info 


Volume Name: my_filestore_vol 
Type: Replicate 
Volume ID:  
Status: Started 
Number of Bricks: 1 x 3 = 3 
Transport-type: tcp 
Bricks: 
Brick1: my-web01:/export/brick0 
Brick2: my-web02:/export/brick0 
Brick3: my-web03:/export/brick0 
Options Reconfigured: 
nfs.drc: off 
diagnostics.brick-log-level: WARNING 



gluster volume status 


Another transaction is in progress. Please try again after sometime. 


I am also seeing errors in the self-heal logs... 


/var/log/glusterfs/glfsheal-my_filestore_vol.log 



[2015-07-06 13:49:12.609285] E [glfs-mgmt.c:520:mgmt_getspec_cbk] 0-gfapi: 
failed to get the 'volume file' from server 
[2015-07-06 13:49:12.609331] E [glfs-mgmt.c:599:mgmt_getspec_cbk] 0-glfs-mgmt: 
failed to fetch volume file (key:my_filestore_vol) 
[2015-07-06 13:49:27.377875] E [glfs-mgmt.c:520:mgmt_getspec_cbk] 0-gfapi: 
failed to get the 'volume file' from server 
[2015-07-06 13:49:27.377928] E [glfs-mgmt.c:599:mgmt_getspec_cbk] 0-glfs-mgmt: 
failed to fetch volume file (key:my_filestore_vol) 
[2015-07-06 13:50:21.963651] E [glfs-mgmt.c:520:mgmt_getspec_cbk] 0-gfapi: 
failed to get the 'volume file' from server 
[2015-07-06 13:50:21.963710] E [glfs-mgmt.c:599:mgmt_getspec_cbk] 0-glfs-mgmt: 
failed to fetch volume file (key:my_filestore_vol) 


Has anyone experienced similar? 

ALUN JAMES 
Senior Systems Engineer 
Tibus 

T: +44 (0)28 9033 1122 
E: aja...@tibus.com 
W: www.tibus.com 

Follow us on Twitter @tibus 

Tibus is a trading name of The Internet Business Ltd, a company limited by 
share capital and registered in Northern Ireland, NI31235. It is part of UTV 
Media Plc. 

This email and any attachment may contain confidential information for the sole 
use of the intended recipient. Any review, use, distribution or disclosure by 
others is strictly prohibited. If you are not the intended recipient (or 
authorised to receive for the recipient), please contact the sender by reply 
email and delete all copies of this message. 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Un-synced entries / self-heal errors

2015-06-12 Thread Alun James
If it helps, the gfids mentioned below all appear in the following 
directories... 



find /export/brick0/.glusterfs -name d230ce2e-6baf-46a8-87a0-c0882ad51721 -type 
f 


/export/brick0/.glusterfs/indices/xattrop/d230ce2e-6baf-46a8-87a0-c0882ad51721 




find /export/brick0/.glusterfs -name d614004f-d1f3-4c4b-b15e-e3b8d419a959 -type 
f 


/export/brick0/.glusterfs/indices/xattrop/d614004f-d1f3-4c4b-b15e-e3b8d419a959 
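If you need to map one of those GFIDs back to a real path on the brick, a common
approach is the one sketched below. It assumes the GFID still exists as a hard
link under .glusterfs (true for regular files; directories are symlinks there
instead). If nothing exists under .glusterfs/d2/30/ for that GFID, the xattrop
index entry is most likely stale.

# the first two and next two characters of the GFID name the subdirectories
ls -l /export/brick0/.glusterfs/d2/30/d230ce2e-6baf-46a8-87a0-c0882ad51721
# regular files are hard links, so -samefile reveals the normal brick path
find /export/brick0 -samefile \
  /export/brick0/.glusterfs/d2/30/d230ce2e-6baf-46a8-87a0-c0882ad51721 ! -path '*/.glusterfs/*'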
- Original Message -

From: "Alun James"  
To: gluster-users@gluster.org 
Sent: Friday, 12 June, 2015 4:59:30 PM 
Subject: [Gluster-users] Un-synced entries / self-heal errors 


Hi folks, 


We have a 3 node replicated volume that has been reporting un-synced entries for a 
number of days. "gluster vol heal my_volume info" lists ~600 entries, e.g. 



 
 
 
 
 
 


Each time the self-heal daemon executes, the glustershd.log gains ~600 error 
entries referencing the named entries above. e.g. 




[2015-06-12 15:56:06.182541] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_vol-client-1: remote operation failed: No such file or directory. Path: 
 
(1b6c7720-997b-4acf-bfef-7d51c4ae9f6c) 
[2015-06-12 15:56:06.182586] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_vol-client-2: remote operation failed: No such file or directory. Path: 
 
(1b6c7720-997b-4acf-bfef-7d51c4ae9f6c) 
[2015-06-12 15:56:06.182886] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_vol-client-0: remote operation failed: No such file or directory. Path: 
 
(74586164-6bf0-4ccd-95ff-8d09fa7f6681) 
[2015-06-12 15:56:06.182931] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_vol-client-1: remote operation failed: No such file or directory. Path: 
 
(74586164-6bf0-4ccd-95ff-8d09fa7f6681) 
[2015-06-12 15:56:06.182976] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_vol-client-2: remote operation failed: No such file or directory. Path: 
 
(74586164-6bf0-4ccd-95ff-8d09fa7f6681) 


Apart from this, the cluster is working fine and files are synced between nodes 
without any issues. The number of entries is NOT increasing. Any thoughts on 
how to clear these 600 entries? 


Regards, 


ALUN JAMES 
Senior Systems Engineer 
Tibus 

T: +44 (0)28 9033 1122 
E: aja...@tibus.com 
W: www.tibus.com 

Follow us on Twitter @tibus 

Tibus is a trading name of The Internet Business Ltd, a company limited by 
share capital and registered in Northern Ireland, NI31235. It is part of UTV 
Media Plc. 

This email and any attachment may contain confidential information for the sole 
use of the intended recipient. Any review, use, distribution or disclosure by 
others is strictly prohibited. If you are not the intended recipient (or 
authorised to receive for the recipient), please contact the sender by reply 
email and delete all copies of this message. 

___ 
Gluster-users mailing list 
Gluster-users@gluster.org 
http://www.gluster.org/mailman/listinfo/gluster-users 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Un-synced entries / self-heal errors

2015-06-12 Thread Alun James
Hi folks, 


We have a 3 node replicated volume that has been reporting un-synced entries for a 
number of days. "gluster vol heal my_volume info" lists ~600 entries, e.g. 



 
 
 
 
 
 


Each time the self-heal daemon executes, the glustershd.log gains ~600 error 
entries referencing the named entries above. e.g. 




[2015-06-12 15:56:06.182541] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_vol-client-1: remote operation failed: No such file or directory. Path: 
 
(1b6c7720-997b-4acf-bfef-7d51c4ae9f6c) 
[2015-06-12 15:56:06.182586] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_vol-client-2: remote operation failed: No such file or directory. Path: 
 
(1b6c7720-997b-4acf-bfef-7d51c4ae9f6c) 
[2015-06-12 15:56:06.182886] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_vol-client-0: remote operation failed: No such file or directory. Path: 
 
(74586164-6bf0-4ccd-95ff-8d09fa7f6681) 
[2015-06-12 15:56:06.182931] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_vol-client-1: remote operation failed: No such file or directory. Path: 
 
(74586164-6bf0-4ccd-95ff-8d09fa7f6681) 
[2015-06-12 15:56:06.182976] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_vol-client-2: remote operation failed: No such file or directory. Path: 
 
(74586164-6bf0-4ccd-95ff-8d09fa7f6681) 


Apart from this, the cluster is working fine and files are synced between nodes 
without any issues. The number of entries is NOT increasing. Any thoughts on 
how to clear these 600 entries? 


Regards, 


ALUN JAMES 
Senior Systems Engineer 
Tibus 

T: +44 (0)28 9033 1122 
E: aja...@tibus.com 
W: www.tibus.com 

Follow us on Twitter @tibus 

Tibus is a trading name of The Internet Business Ltd, a company limited by 
share capital and registered in Northern Ireland, NI31235. It is part of UTV 
Media Plc. 

This email and any attachment may contain confidential information for the sole 
use of the intended recipient. Any review, use, distribution or disclosure by 
others is strictly prohibited. If you are not the intended recipient (or 
authorised to receive for the recipient), please contact the sender by reply 
email and delete all copies of this message. 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] glustershd.log errors

2015-05-15 Thread Alun James
I should add that the other 2 nodes are not creating such huge logs, only 
host01, yet on host01 the volume is replicating data just fine. 


Regards, 


Alun. 
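For anyone hitting the same flood of warnings, a quick sketch of the commands
usually used to see what the self-heal daemon is still retrying (the volume
name below is the one from this thread):

gluster volume heal my_filestore_vol info               # entries still queued for heal
gluster volume heal my_filestore_vol info split-brain   # check whether any are split-brain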

- Original Message -

From: "Alun James"  
To: gluster-users@gluster.org 
Cc: "Gary Armstrong"  
Sent: Friday, 15 May, 2015 12:48:07 PM 
Subject: glustershd.log errors 



Folks, 


We have a server whose glustershd.log is gigs in size and rising. The 
cluster looks healthy and file replication is working fine between all nodes, 
but one node is continually writing errors. 


Any ideas? 


[2015-05-15 11:43:46.171692] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-0: remote operation failed: No such file or 
directory. Path:  
(028dc1a8-138e-4ed2-a114-12d707c70c79) 
[2015-05-15 11:43:46.171756] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-1: remote operation failed: No such file or 
directory. Path:  
(028dc1a8-138e-4ed2-a114-12d707c70c79) 
[2015-05-15 11:43:46.171809] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-2: remote operation failed: No such file or 
directory. Path:  
(028dc1a8-138e-4ed2-a114-12d707c70c79) 
[2015-05-15 11:43:46.172178] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-1: remote operation failed: No such file or 
directory. Path:  
(3fd6837f-55a4-4d02-b568-1a4c2c23e8a9) 
[2015-05-15 11:43:46.172233] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-0: remote operation failed: No such file or 
directory. Path:  
(3fd6837f-55a4-4d02-b568-1a4c2c23e8a9) 
[2015-05-15 11:43:46.172284] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-2: remote operation failed: No such file or 
directory. Path:  
(3fd6837f-55a4-4d02-b568-1a4c2c23e8a9) 
[2015-05-15 11:43:46.172650] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-0: remote operation failed: No such file or 
directory. Path:  
(4bec2fff-5c5f-4486-909d-bedb5a609ee8) 
[2015-05-15 11:43:46.172706] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-1: remote operation failed: No such file or 
directory. Path:  
(4bec2fff-5c5f-4486-909d-bedb5a609ee8) 
[2015-05-15 11:43:46.172753] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-2: remote operation failed: No such file or 
directory. Path:  
(4bec2fff-5c5f-4486-909d-bedb5a609ee8) 
[2015-05-15 11:43:46.173114] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-0: remote operation failed: No such file or 
directory. Path:  
(7afb4663-1448-4d25-8511-fd3dc5a1acc3) 
[2015-05-15 11:43:46.173180] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-1: remote operation failed: No such file or 
directory. Path:  
(7afb4663-1448-4d25-8511-fd3dc5a1acc3) 
[2015-05-15 11:43:46.173229] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-2: remote operation failed: No such file or 
directory. Path:  
(7afb4663-1448-4d25-8511-fd3dc5a1acc3) 
[2015-05-15 11:43:46.174018] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-0: remote operation failed: No such file or 
directory. Path:  
(0039a6df-890a-410b-82c8-60d0882d9d6b) 
[2015-05-15 11:43:46.174071] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-1: remote operation failed: No such file or 
directory. Path:  
(0039a6df-890a-410b-82c8-60d0882d9d6b) 
[2015-05-15 11:43:46.174117] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-2: remote operation failed: No such file or 
directory. Path:  
(0039a6df-890a-410b-82c8-60d0882d9d6b) 


gluster vol info 


Volume Name: my_filestore_vol 
Type: Replicate 
Volume ID: 25951af0-005d-47a2-92a4-7babfbe9437e 
Status: Started 
Number of Bricks: 1 x 3 = 3 
Transport-type: tcp 
Bricks: 
Brick1: host01:/export/brick0 
Brick2: host02:/export/brick0 
Brick3: host03:/export/brick0 





gluster vol status 


Status of volume: my_filestore_vol 
Gluster process Port Online Pid 
-- 
Brick host01:/export/brick0 49152 Y 2628 
Brick host02:/export/brick0 49152 Y 2049 
Brick host03:/export/brick0 49152 Y 2163 
NFS Server on localhost 2049 Y 2637 
Self-heal Daemon on localhost N/A Y 2642 
NFS Server on host03 2049 Y 2172 
Self-heal Daemon on host03 N/A Y 2177 
NFS Server on host02 2049 Y 2058 
Self-heal Daemon on host02 N/A Y 2063 


Task Status of Volume my_filestore_vol 
------ 



ALUN JAMES 
Senior Systems Engineer 
Tibus 

T: +44 (0)28 9033 1122 
E: aja...@tibus.com 
W: www.tibus.com 

Follow us on Twitter @tibus 

Tibus is a trading name of The Internet Business Ltd, a company limited by 
share capital and registered in Northern Ireland, NI31235. It is part of UTV 
Media Plc. 

This email and any attachment

[Gluster-users] glustershd.log errors

2015-05-15 Thread Alun James

Folks, 


We have a server whose glustershd.log is gigs in size and rising. The 
cluster looks healthy and file replication is working fine between all nodes, 
but one node is continually writing errors. 


Any ideas? 


[2015-05-15 11:43:46.171692] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-0: remote operation failed: No such file or 
directory. Path:  
(028dc1a8-138e-4ed2-a114-12d707c70c79) 
[2015-05-15 11:43:46.171756] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-1: remote operation failed: No such file or 
directory. Path:  
(028dc1a8-138e-4ed2-a114-12d707c70c79) 
[2015-05-15 11:43:46.171809] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-2: remote operation failed: No such file or 
directory. Path:  
(028dc1a8-138e-4ed2-a114-12d707c70c79) 
[2015-05-15 11:43:46.172178] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-1: remote operation failed: No such file or 
directory. Path:  
(3fd6837f-55a4-4d02-b568-1a4c2c23e8a9) 
[2015-05-15 11:43:46.172233] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-0: remote operation failed: No such file or 
directory. Path:  
(3fd6837f-55a4-4d02-b568-1a4c2c23e8a9) 
[2015-05-15 11:43:46.172284] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-2: remote operation failed: No such file or 
directory. Path:  
(3fd6837f-55a4-4d02-b568-1a4c2c23e8a9) 
[2015-05-15 11:43:46.172650] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-0: remote operation failed: No such file or 
directory. Path:  
(4bec2fff-5c5f-4486-909d-bedb5a609ee8) 
[2015-05-15 11:43:46.172706] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-1: remote operation failed: No such file or 
directory. Path:  
(4bec2fff-5c5f-4486-909d-bedb5a609ee8) 
[2015-05-15 11:43:46.172753] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-2: remote operation failed: No such file or 
directory. Path:  
(4bec2fff-5c5f-4486-909d-bedb5a609ee8) 
[2015-05-15 11:43:46.173114] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-0: remote operation failed: No such file or 
directory. Path:  
(7afb4663-1448-4d25-8511-fd3dc5a1acc3) 
[2015-05-15 11:43:46.173180] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-1: remote operation failed: No such file or 
directory. Path:  
(7afb4663-1448-4d25-8511-fd3dc5a1acc3) 
[2015-05-15 11:43:46.173229] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-2: remote operation failed: No such file or 
directory. Path:  
(7afb4663-1448-4d25-8511-fd3dc5a1acc3) 
[2015-05-15 11:43:46.174018] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-0: remote operation failed: No such file or 
directory. Path:  
(0039a6df-890a-410b-82c8-60d0882d9d6b) 
[2015-05-15 11:43:46.174071] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-1: remote operation failed: No such file or 
directory. Path:  
(0039a6df-890a-410b-82c8-60d0882d9d6b) 
[2015-05-15 11:43:46.174117] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 
0-my_filestore_vol-client-2: remote operation failed: No such file or 
directory. Path:  
(0039a6df-890a-410b-82c8-60d0882d9d6b) 


gluster vol info 


Volume Name: my_filestore_vol 
Type: Replicate 
Volume ID: 25951af0-005d-47a2-92a4-7babfbe9437e 
Status: Started 
Number of Bricks: 1 x 3 = 3 
Transport-type: tcp 
Bricks: 
Brick1: host01:/export/brick0 
Brick2: host02:/export/brick0 
Brick3: host03:/export/brick0 





gluster vol status 


Status of volume: my_filestore_vol 
Gluster process Port Online Pid 
-- 
Brick host01:/export/brick0 49152 Y 2628 
Brick host02:/export/brick0 49152 Y 2049 
Brick host03:/export/brick0 49152 Y 2163 
NFS Server on localhost 2049 Y 2637 
Self-heal Daemon on localhost N/A Y 2642 
NFS Server on host03 2049 Y 2172 
Self-heal Daemon on host03 N/A Y 2177 
NFS Server on host02 2049 Y 2058 
Self-heal Daemon on host02 N/A Y 2063 


Task Status of Volume my_filestore_vol 
-- 



ALUN JAMES 
Senior Systems Engineer 
Tibus 

T: +44 (0)28 9033 1122 
E: aja...@tibus.com 
W: www.tibus.com 

Follow us on Twitter @tibus 

Tibus is a trading name of The Internet Business Ltd, a company limited by 
share capital and registered in Northern Ireland, NI31235. It is part of UTV 
Media Plc. 

This email and any attachment may contain confidential information for the sole 
use of the intended recipient. Any review, use, distribution or disclosure by 
others is strictly prohibited. If you are not the intended recipient (or 
authorised to receive for the recipient), please contact the sender by reply 
email and delete all copies of this me

Re: [Gluster-users] High load / hang

2015-05-12 Thread Alun James
Even running a Wordpress update (a 6 meg file) via wp-admin seems to cause 
the gluster process CPU to spike and server load to go higher than I am 
comfortable with. I get that due to replication there will be some latency 
overhead, but I was not expecting a meltdown :)


Regards,


A

- Original Message -

From: "Alun James" 
To: "Hoggins!" 
Cc: gluster-users@gluster.org
Sent: Tuesday, 12 May, 2015 2:16:02 PM
Subject: Re: [Gluster-users] High load / hang


A hard reboot solved it for me too, but I am a little worried on the stability 
of it.


The entire wordpress web root is stored on the gluster volume. A day later I 
attempted to tar/gzip the web directory on the gluster volume, and again the 
load on one particular server skyrocketed, the server became unresponsive, and 
kernel hang messages appeared claiming the gzip process was hanging. It seems 
that if I cause any serious IO on the volume, things snowball.


There are 3 web nodes with a replicated gluster volume between them all. Each 
web node also mounts the gluster volume locally using NFS (an attempt to 
improve performance for lots of small php files). I tried mounting native 
glusterfs instead of NFS for client mount, but I had the same issues.



ii glusterfs-client 3.6.2-ubuntu1~trusty3 amd64 clustered file-system (client 
package)
ii glusterfs-common 3.6.2-ubuntu1~trusty3 amd64 GlusterFS common libraries and 
translator modules
ii glusterfs-server 3.6.2-ubuntu1~trusty3 amd64 clustered file-system (server 
package)



Volume Name: my_filestore_vol
Type: Replicate
Volume ID: xyz
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: my-web01:/export/brick0
Brick2: my-web02:/export/brick0
Brick3: my-web03:/export/brick0
Options Reconfigured:
nfs.drc: off
diagnostics.brick-log-level: WARNING



Status of volume: my_filestore_vol
Gluster process Port Online Pid
--
Brick my-web01:/export/brick0 49152 Y 2138
Brick my-web02:/export/brick0 49152 Y 21104
Brick my-web03:/export/brick0 49152 Y 1827
NFS Server on localhost 2049 Y 2145
Self-heal Daemon on localhost N/A Y 2152
NFS Server on my-web03 2049 Y 1834
Self-heal Daemon on my-web03 N/A Y 1841
NFS Server on my-web02 2049 Y 21118
Self-heal Daemon on my-web02 N/A Y 21123


Task Status of Volume my_filestore_vol
--
There are no active volume tasks




Regards,


A

- Original Message -

From: "Hoggins!" 
To: gluster-users@gluster.org
Sent: Friday, 8 May, 2015 1:25:26 PM
Subject: Re: [Gluster-users] High load / hang

Well, that's "funny", because the exact same thing happened to me this morning, 
except that I could hard reboot the machine, and it got up and running normally 
again.
But the symptoms you describe are oddly similar, and strangely simultaneous.


Le 08/05/2015 10:36, Alun James a écrit :


Hi folks,


I have a 3 node gluster/web/db cluster running a Wordpress site . This morning 
one of the nodes is under very high load and the mounted gluster partition is 
inaccessible. Attempts to reboot that node have failed with the server 
seemingly hung/blocked. Has anyone else experienced this or can give any 
pointers in how to diagnose the cause of gluster going awry?



___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] High load / hang

2015-05-12 Thread Alun James
A hard reboot solved it for me too, but I am a little worried on the stability 
of it.


The entire wordpress web root is stored on the gluster volume. A day later I 
attempted to tar/gzip the web directory on the gluster volume, and again the 
load on one particular server skyrocketed, the server became unresponsive, and 
kernel hang messages appeared claiming the gzip process was hanging. It seems 
that if I cause any serious IO on the volume, things snowball.


There are 3 web nodes with a replicated gluster volume between them all. Each 
web node also mounts the gluster volume locally using NFS (an attempt to 
improve performance for lots of small php files). I tried mounting native 
glusterfs instead of NFS for client mount, but I had the same issues.



ii glusterfs-client 3.6.2-ubuntu1~trusty3 amd64 clustered file-system (client 
package)
ii glusterfs-common 3.6.2-ubuntu1~trusty3 amd64 GlusterFS common libraries and 
translator modules
ii glusterfs-server 3.6.2-ubuntu1~trusty3 amd64 clustered file-system (server 
package)



Volume Name: my_filestore_vol
Type: Replicate
Volume ID: xyz
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: my-web01:/export/brick0
Brick2: my-web02:/export/brick0
Brick3: my-web03:/export/brick0
Options Reconfigured:
nfs.drc: off
diagnostics.brick-log-level: WARNING



Status of volume: my_filestore_vol
Gluster process Port Online Pid
--
Brick my-web01:/export/brick0 49152 Y 2138
Brick my-web02:/export/brick0 49152 Y 21104
Brick my-web03:/export/brick0 49152 Y 1827
NFS Server on localhost 2049 Y 2145
Self-heal Daemon on localhost N/A Y 2152
NFS Server on my-web03 2049 Y 1834
Self-heal Daemon on my-web03 N/A Y 1841
NFS Server on my-web02 2049 Y 21118
Self-heal Daemon on my-web02 N/A Y 21123


Task Status of Volume my_filestore_vol
--
There are no active volume tasks




Regards,


A

- Original Message -

From: "Hoggins!" 
To: gluster-users@gluster.org
Sent: Friday, 8 May, 2015 1:25:26 PM
Subject: Re: [Gluster-users] High load / hang

Well, that's "funny", because the exact same thing happened to me this morning, 
except that I could hard reboot the machine, and it got up and running normally 
again.
But the symptoms you describe are oddly similar, and strangely simultaneous.


Le 08/05/2015 10:36, Alun James a écrit :


Hi folks,


I have a 3 node gluster/web/db cluster running a Wordpress site . This morning 
one of the nodes is under very high load and the mounted gluster partition is 
inaccessible. Attempts to reboot that node have failed with the server 
seemingly hung/blocked. Has anyone else experienced this or can give any 
pointers in how to diagnose the cause of gluster going awry?



___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] High load / hang

2015-05-08 Thread Alun James
Similar to this one... 


When running a Wordpress update, the load on the server goes through the roof 
due to Gluster, and the site is very unresponsive during this time. How can I 
improve the performance? 


Regards, 


Alun. 

- Original Message -

From: "Alun James"  
To: gluster-users@gluster.org 
Sent: Friday, 8 May, 2015 9:36:08 AM 
Subject: High load / hang 


Hi folks, 


I have a 3 node gluster/web/db cluster running a Wordpress site . This morning 
one of the nodes is under very high load and the mounted gluster partition is 
inaccessible. Attempts to reboot that node have failed with the server 
seemingly hung/blocked. Has anyone else experienced this or can give any 
pointers in how to diagnose the cause of gluster going awry? 


Cheers, 


ALUN JAMES 
Senior Systems Engineer 
Tibus 

T: +44 (0)28 9033 1122 
E: aja...@tibus.com 
W: www.tibus.com 

Follow us on Twitter @tibus 

Tibus is a trading name of The Internet Business Ltd, a company limited by 
share capital and registered in Northern Ireland, NI31235. It is part of UTV 
Media Plc. 

This email and any attachment may contain confidential information for the sole 
use of the intended recipient. Any review, use, distribution or disclosure by 
others is strictly prohibited. If you are not the intended recipient (or 
authorised to receive for the recipient), please contact the sender by reply 
email and delete all copies of this message. 

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] High load / hang

2015-05-08 Thread Alun James
Hi folks, 


I have a 3 node gluster/web/db cluster running a Wordpress site . This morning 
one of the nodes is under very high load and the mounted gluster partition is 
inaccessible. Attempts to reboot that node have failed with the server 
seemingly hung/blocked. Has anyone else experienced this or can give any 
pointers in how to diagnose the cause of gluster going awry? 


Cheers, 


ALUN JAMES 
Senior Systems Engineer 
Tibus 

T: +44 (0)28 9033 1122 
E: aja...@tibus.com 
W: www.tibus.com 

Follow us on Twitter @tibus 

Tibus is a trading name of The Internet Business Ltd, a company limited by 
share capital and registered in Northern Ireland, NI31235. It is part of UTV 
Media Plc. 

This email and any attachment may contain confidential information for the sole 
use of the intended recipient. Any review, use, distribution or disclosure by 
others is strictly prohibited. If you are not the intended recipient (or 
authorised to receive for the recipient), please contact the sender by reply 
email and delete all copies of this message. 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] ls gives Input/Output error

2015-04-29 Thread James Whitwell
 05:14:25.77] I [socket.c:3495:socket_init] 0-management:
using system polling thread
[2015-04-30 05:14:25.778326] I
[glusterd-utils.c:1079:glusterd_volume_brickinfo_get] 0-management: Found
brick
[2015-04-30 05:14:25.778356] I [socket.c:2236:socket_event_handler]
0-transport: disconnecting now

Thanks,
 James.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Got a slogan idea?

2015-04-01 Thread James Cuff
GlusterFS: It's not DevNull(tm)

https://rc.fas.harvard.edu/news-home/feature-stories/fas-research-computing-implements-novel-big-data-storage-system/

Best,

j.

--
dr. james cuff, assistant dean for research computing, harvard
university | division of science | thirty eight oxford street,
cambridge. ma. 02138 | +1 617 384 7647 | http://rc.fas.harvard.edu


On Wed, Apr 1, 2015 at 9:01 AM, Jeff Darcy  wrote:
>> What I am saying is that if you have a slogan idea for Gluster, I want
>> to hear it. You can reply on list or send it to me directly. I will
>> collect all the proposals (yours and the ones that Red Hat comes up
>> with) and circle back around for community discussion in about a month
>> or so.
>
> Personally I don't like any of these all that much, but maybe they'll
> get someone else thinking.
>
> GlusterFS: your data, your way
>
> GlusterFS: any data, any servers, any protocol
>
> GlusterFS: scale-out storage for everyone
>
> GlusterFS: software defined storage for everyone
>
> GlusterFS: the Swiss Army Knife of storage
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster/NFS mount issues

2015-03-26 Thread Alun James
Thanks for that Ben. Adding "/bin/mount /data" to rc.local is working 
perfectly. 


Regards, 


Alun. 
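For reference, the workaround described above amounts to something like this in
/etc/rc.local (a sketch; the short sleep is optional padding, and /data is the
mount point used in this thread):

# rc.local runs late in the boot sequence, by which point glusterd is normally up
sleep 5             # optional grace period for the gluster NFS server
/bin/mount /data    # mount the entry already defined in /etc/fstab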

- Original Message -

From: "Ben Turner"  
To: "Alun James"  
Cc: gluster-users@gluster.org 
Sent: Wednesday, 25 March, 2015 9:21:14 PM 
Subject: Re: [Gluster-users] Gluster/NFS mount issues 

Normally when I see this the NICs are not fully initialized. I have done a 
couple different things to work around this: 

-Try adding the linkdelay parameter to the ifcfg script: 

LINKDELAY=time 
where time is the number of seconds to wait for link negotiation before 
configuring the device. 

-Try turning on portfast on your switch to speed up negotiation. 

-Try putting a sleep in your init scripts just before it goes to mount your 
fstab items 

-Try putting the mount command in rc.local or whatever is the last thing your 
system runs as it finishes booting. 

Last time I looked at the _netdev code it only looked for an active link, it 
didn't ensure that the NIC was up and able to send traffic. I would start with 
the linkdelay and go from there. LMK how this works out for ya, I am not very 
well versed on the Ubuntu boot process :/ 

-b 

- Original Message - 
> From: "Alun James"  
> To: gluster-users@gluster.org 
> Sent: Wednesday, March 25, 2015 6:33:05 AM 
> Subject: [Gluster-users] Gluster/NFS mount issues 
> 
> Hi folks, 
> 
> I am having some issues getting NFS to mount the glusterfs volume on boot-up, 
> I have tried all the usual mount options in fstab, but thus far none have 
> helped I am using NFS as it seems to give better performance for my workload 
> compared with glusterfs client. 
> 
> [Node Setup] 
> 
> 3 x Nodes mounting vol locally. 
> Ubuntu 14.04 3.13.0-45-generic 
> GlusterFS: 3.6.2-ubuntu1~trusty3 
> nfs-common 1:1.2.8-6ubuntu1.1 
> 
> Type: Replicate 
> Status: Started 
> Number of Bricks: 1 x 3 = 3 
> Transport-type: tcp 
> Bricks: 
> Brick1: node01:/export/brick0 
> Brick2: node 02:/export/brick0 
> Brick3: node 03:/export/brick0 
> 
> /etc/fstab: 
> 
> /dev/mapper/gluster--vg-brick0 /export/brick0 xfs defaults 0 0 
> 
> localhost:/my_filestore_vol /data nfs 
> defaults,nobootwait,noatime,_netdev,nolock,mountproto=tcp,vers=3 0 0 
> 
> 
> [Issue] 
> 
> On boot, the /data partition is not mounted, however, I can jump on each node 
> and simply run "mount /data" without any problems, so I assume my fstab 
> options are OK. I have noticed the following log: 
> 
> /var/log/upstart/mountall.log: 
> 
> mount.nfs: requested NFS version or transport protocol is not supported 
> mountall: mount /data [1178] terminated with status 32 
> 
> I have attempted the following fstab options without success and similar log 
> message: 
> 
> localhost:/my_filestore_vol /data nfs 
> defaults,nobootwait,noatime,_netdev,nolock 0 0 
> localhost:/my_filestore_vol /data nfs 
> defaults,nobootwait,noatime,_netdev,nolock,mountproto=tcp,vers=3 0 0 
> localhost:/my_filestore_vol /data nfs 
> defaults,nobootwait,noatime,_netdev,nolock,vers=3 0 0 
> localhost:/my_filestore_vol /data nfs 
> defaults,nobootwait,noatime,_netdev,nolock,nfsvers=3 0 0 
> localhost:/my_filestore_vol /data nfs 
> defaults,nobootwait,noatime,_netdev,nolock,mountproto=tcp,nfsvers=3 0 0 
> 
> Anything else I can try? 
> 
> Regards, 
> 
> ALUN JAMES 
> Senior Systems Engineer 
> Tibus 
> 
> T: +44 (0)28 9033 1122 
> E: aja...@tibus.com 
> W: www.tibus.com 
> 
> Follow us on Twitter @tibus 
> 
> Tibus is a trading name of The Internet Business Ltd, a company limited by 
> share capital and registered in Northern Ireland, NI31235. It is part of UTV 
> Media Plc. 
> 
> This email and any attachment may contain confidential information for the 
> sole use of the intended recipient. Any review, use, distribution or 
> disclosure by others is strictly prohibited. If you are not the intended 
> recipient (or authorised to receive for the recipient), please contact the 
> sender by reply email and delete all copies of this message. 
> 
> ___ 
> Gluster-users mailing list 
> Gluster-users@gluster.org 
> http://www.gluster.org/mailman/listinfo/gluster-users 

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Gluster/NFS mount issues

2015-03-25 Thread Alun James

Hi folks, 


I am having some issues getting NFS to mount the glusterfs volume on boot-up. I 
have tried all the usual mount options in fstab, but thus far none have helped. 
I am using NFS as it seems to give better performance for my workload compared 
with the glusterfs client. 


[Node Setup] 



3 x Nodes mounting vol locally. 

Ubuntu 14.04 3.13.0-45-generic 
GlusterFS: 3.6.2-ubuntu1~trusty3 
nfs-common 1:1.2.8-6ubuntu1.1 



Type: Replicate 

Status: Started 
Number of Bricks: 1 x 3 = 3 
Transport-type: tcp 
Bricks: 
Brick1: node01:/export/brick0 
Brick2: node 02:/export/brick0 
Brick3: node 03:/export/brick0 


/etc/fstab: 


/dev/mapper/gluster--vg-brick0 /export/brick0 xfs defaults 0 0 



localhost:/my_filestore_vol /data nfs 
defaults,nobootwait,noatime,_netdev,nolock,mountproto=tcp,vers=3 0 0 




[Issue] 


On boot, the /data partition is not mounted, however, I can jump on each node 
and simply run "mount /data" without any problems, so I assume my fstab options 
are OK. I have noticed the following log: 



/var/log/upstart/mountall.log: 




mount.nfs: requested NFS version or transport protocol is not supported 
mountall: mount /data [1178] terminated with status 32 


I have attempted the following fstab options without success and similar log 
message: 


localhost:/my_filestore_vol /data nfs 
defaults,nobootwait,noatime,_netdev,nolock 0 0 
localhost:/my_filestore_vol /data nfs 
defaults,nobootwait,noatime,_netdev,nolock,mountproto=tcp,vers=3 0 0 
localhost:/my_filestore_vol /data nfs 
defaults,nobootwait,noatime,_netdev,nolock,vers=3 0 0 
localhost:/my_filestore_vol /data nfs 
defaults,nobootwait,noatime,_netdev,nolock,nfsvers=3 0 0 
localhost:/my_filestore_vol /data nfs 
defaults,nobootwait,noatime,_netdev,nolock,mountproto=tcp,nfsvers=3 0 0 



Anything else I can try? 


Regards, 

ALUN JAMES 
Senior Systems Engineer 
Tibus 

T: +44 (0)28 9033 1122 
E: aja...@tibus.com 
W: www.tibus.com 

Follow us on Twitter @tibus 

Tibus is a trading name of The Internet Business Ltd, a company limited by 
share capital and registered in Northern Ireland, NI31235. It is part of UTV 
Media Plc. 

This email and any attachment may contain confidential information for the sole 
use of the intended recipient. Any review, use, distribution or disclosure by 
others is strictly prohibited. If you are not the intended recipient (or 
authorised to receive for the recipient), please contact the sender by reply 
email and delete all copies of this message. 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Recommended xfs mount options

2014-11-02 Thread James
On Sun, Nov 2, 2014 at 9:08 PM, Lindsay Mathieson
 wrote:
> Further to my earlier question about the recommend underlying file
> system for gluster (xfs), are there recommended mount options for xfs?

https://github.com/purpleidea/puppet-gluster/blob/master/manifests/brick.pp#L261

Cheers,
James
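For a concrete starting point, a brick fstab entry using the options Lindsay
lists below might look like this (a sketch; the device path and mount point are
placeholders, not taken from this thread):

# XFS brick mount with the options discussed in this thread
/dev/vg_gluster/brick1  /export/brick1  xfs  noatime,nodiratime,inode64  0 0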

>
> Currently I'm using:  noatime,inode64,nodiratime
>
> And BTW - great piece of software! its working pretty well so far in
> my three node proxmox gluster. Easy to setup and understand.
>
> I'm getting 80 MB/s sustained writes over a 2*1GB bonded (balance-rr)
> network and hope to improve that once I get my smart switch configured
> with LACP and jumbo frames.
>
> --
> Lindsay
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Is RAID necessary/recommended?

2014-10-28 Thread James
On Tue, Oct 28, 2014 at 9:53 PM, Lindsay Mathieson
 wrote:
> Ok, thanks James, Juan.
>
> Given my budget, I think I'll switch to using a single 3TB drive in
> each node, but add an extra 1GB Intel network card to each node and
> bond them for better network performance.
Did you test your workload to find your bottlenecks, or is this all
just conjecture? Test!


>
> Also I will be adding a third proxmox node for quorum purposes - it
> will just be a Intel NUC, won't be used for running VM's (though it
> could manage a couple).
Sweet... I almost bought a NUC to replace my Pentium 4 home server,
but they were kind of pricey. How is it?

>
>
> On 29 October 2014 11:10, Juan José Pavlik Salles  wrote:
>> RAID6 is the best choice when working with arrays with many disks. RAID10
>> doesn't make sense to me since you already have replication with gluster.

Keep in mind that if you've got an array of 24 disks, you'd probably
want to split that up into multiple RAID 6's. IOW, you'll have a few
bricks per host, each composed of a RAID 6. I think the magic number
of disks for a set is probably at least six, but not much more than
eight. I got this number from my imagination. Test your workloads
(also under failure scenarios) to decide for yourself.

Cheers,
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Is RAID necessary/recommended?

2014-10-28 Thread James
On Tue, Oct 28, 2014 at 7:53 PM, Lindsay Mathieson
 wrote:
> I thought RAID5 was no longer considered a good option these days,
> with RAID10 being preferred?


RAID6 preferred
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Recomended underlying filesystem

2014-10-27 Thread James
On Mon, Oct 27, 2014 at 8:52 AM, Lindsay Mathieson
 wrote:
> Looking at setting up a two node replicated gluster filesystem. Base hard
> disks on each node are 2*2TB in RAID1. It will be used for serving VM Images.
>
> Does the underlying filesystem particularly matter? EXT4? XFS?
>
>
> thanks,


Something with xattrs. XFS is most tested/supported.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] geo-replication breaks on CentOS 6.5 + gluster 3.6.0 beta3

2014-10-18 Thread James Payne
Not in my particular use case, which is where a new folder or file is created in 
Windows through Explorer. The new folder is created by Windows with the name 
'New Folder', which the user will almost certainly then rename. The same goes 
for newly created files in Explorer.

Does this mean the issue shouldn't be there in a replicate-only scenario?

Regards
James

--- Original Message ---

From: "M S Vishwanath Bhat" 
Sent: 17 October 2014 20:53
To: "Kingsley" , "James Payne" 

Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] geo-replication breaks on CentOS 6.5 + gluster 
3.6.0 beta3

Hi,

Right now, distributed geo-rep has a bunch of known issues with deletes
and renames. Part of the issue was solved with a patch sent upstream
recently, but that still doesn't solve the complete issue.

So, long story short, dist-geo-rep still has issues with short-lived
renames where the renamed file is hashed to a different subvolume
(brick). If the renamed file is hashed to the same brick then the issue
should not be seen (hopefully).

Using volume set, we can force the renamed file to be hashed to the same
brick: "gluster volume set <VOLNAME> cluster.extra-hash-regex '<regex>'"

For example, if you open a file in vi, it will rename the file to
filename.txt~, so the regex should be:
gluster volume set VOLNAME cluster.extra-hash-regex '^(.+)~$'

But for this to work, the format of the files created by your
application has to be identified. Does your application create files in
an identifiable format which can be specified in a regex? Is this a
possibility?


Best Regards,
Vishwanath

On 15/10/14 15:41, Kingsley wrote:
> I have added a comment to that bug report (a paste of my original
> email).
>
> Cheers,
> Kingsley.
>
> On Tue, 2014-10-14 at 22:10 +0100, James Payne wrote:
>> Just adding that I have verified this as well with the 3.6 beta, I added a
>> log to the ticket regarding this.
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1141379
>>
>> Please feel free to add to the bug report, I think we are seeing the same
>> issue. It isn't present in the 3.4 series which in the one I'm testing
>> currently. (no distributed geo rep though)
>>
>> Regards
>> James
>>
>> -Original Message-
>> From: Kingsley [mailto:glus...@gluster.dogwind.com]
>> Sent: 13 October 2014 16:51
>> To: gluster-users@gluster.org
>> Subject: [Gluster-users] geo-replication breaks on CentOS 6.5 + gluster
>> 3.6.0 beta3
>>
>> Hi,
>>
>> I have a small script to simulate file activity for an application we have.
>> It breaks geo-replication within about 15 - 20 seconds when I try it.
>>
>> This is on a small Gluster test environment running in some VMs running
>> CentOS 6.5 and using gluster 3.6.0 beta3. I have 6 VMs - test1, test2,
>> test3, test4, test5 and test6. test1, test2 , test3 and test4 are gluster
>> servers while test5 and test6 are the clients. test3 is actually not used in
>> this test.
>>
>>
>> Before the test, I had a single gluster volume as follows:
>>
>> test1# gluster volume status
>> Status of volume: gv0
>> Gluster process                              Port    Online  Pid
>> ------------------------------------------------------------------
>> Brick test1:/data/brick/gv0                  49168   Y       12017
>> Brick test2:/data/brick/gv0                  49168   Y       11835
>> NFS Server on localhost                      2049    Y       12032
>> Self-heal Daemon on localhost                N/A     Y       12039
>> NFS Server on test4                          2049    Y       7934
>> Self-heal Daemon on test4                    N/A     Y       7939
>> NFS Server on test3                          2049    Y       11768
>> Self-heal Daemon on test3                    N/A     Y       11775
>> NFS Server on test2                          2049    Y       11849
>> Self-heal Daemon on test2                    N/A     Y       11855
>>
>> Task Status of Volume gv0
>> 
>> --
>> There are no active volume tasks
>>
>>
>> I created a new volume and set up geo-replication as follows (as these are
>> test machines I only have one file system on each, hence using "force" to
>> create the bricks in the root FS):
>>
>> test4# date ; gluster volume create gv0-slave test4:/data/brick/gv0-slave
>> force; date Mon Oct 13 15:03:14 BST 2014 volume create: gv0-slave: 

Re: [Gluster-users] Re-Sync Geo Replication

2014-10-14 Thread James Payne
I can confirm that using touch works and the file is resync'd

Thanks for your help :)

James
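For the archive, the fix boils down to something like this (a sketch; the mount
point and file path are placeholders):

# on a client mount of the master volume, touch each file geo-replication missed;
# this creates a new changelog entry so the next sync cycle picks the file up
touch /mnt/master-vol/path/to/missed-file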

--- Original Message ---

From: "Venky Shankar" 
Sent: 8 October 2014 12:41
To: "James Payne" 
Cc: "gluster-users" 
Subject: Re: [Gluster-users] Re-Sync Geo Replication

​"touch " should trigger a re-sync.

On Sun, Oct 5, 2014 at 3:03 AM, James Payne  wrote:

> Hi,
>
>
>
> Just wondering if there was a method to manually force a re-sync of a geo
> replication slave so it is an identical mirror of the master?
>
>
>
> History of this request is that I have a test setup and the Gluster
> Geo-Replication seems to have missed 7 files out completely (not sure if
> this was a bug or an issue with my setup specifically as this is a test
> setup it has been setup and torn down a few times). Now though the Geo
> replica will not converge to be the same, ie. It’s stable, new files add
> fine and files will delete, but the missing files just don’t seem to be
> interested in synchronising! I’m guessing that as the rsync is triggered by
> the change log and as these files aren’t changing it won’t ever notice them
> again? I can manually copy the files (there are only 7 after all…) but I
> have only found them through a consistency checking script I wrote. I can
> run this through a cron to pick up any missing files, however I wondered if
> Gluster had something built in which did a check and sync? Also, If I did
> manually copy these files across how would that affect the consistency of
> the geo replica session?
>
>
>
> Running: GlusterFS 3.4.5 on CentOS 6.5
>
>
>
> Regards
>
> James
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] geo-replication breaks on CentOS 6.5 + gluster 3.6.0 beta3

2014-10-14 Thread James Payne
Just adding that I have verified this as well with the 3.6 beta, I added a
log to the ticket regarding this. 

https://bugzilla.redhat.com/show_bug.cgi?id=1141379

Please feel free to add to the bug report, I think we are seeing the same
issue. It isn't present in the 3.4 series which in the one I'm testing
currently. (no distributed geo rep though)

Regards
James

-Original Message-
From: Kingsley [mailto:glus...@gluster.dogwind.com] 
Sent: 13 October 2014 16:51
To: gluster-users@gluster.org
Subject: [Gluster-users] geo-replication breaks on CentOS 6.5 + gluster
3.6.0 beta3

Hi,

I have a small script to simulate file activity for an application we have.
It breaks geo-replication within about 15 - 20 seconds when I try it.

This is on a small Gluster test environment running in some VMs running
CentOS 6.5 and using gluster 3.6.0 beta3. I have 6 VMs - test1, test2,
test3, test4, test5 and test6. test1, test2 , test3 and test4 are gluster
servers while test5 and test6 are the clients. test3 is actually not used in
this test.


Before the test, I had a single gluster volume as follows:

test1# gluster volume status
Status of volume: gv0
Gluster process                              Port    Online  Pid
------------------------------------------------------------------
Brick test1:/data/brick/gv0                  49168   Y       12017
Brick test2:/data/brick/gv0                  49168   Y       11835
NFS Server on localhost                      2049    Y       12032
Self-heal Daemon on localhost                N/A     Y       12039
NFS Server on test4                          2049    Y       7934
Self-heal Daemon on test4                    N/A     Y       7939
NFS Server on test3                          2049    Y       11768
Self-heal Daemon on test3                    N/A     Y       11775
NFS Server on test2                          2049    Y       11849
Self-heal Daemon on test2                    N/A     Y       11855

Task Status of Volume gv0

--
There are no active volume tasks


I created a new volume and set up geo-replication as follows (as these are
test machines I only have one file system on each, hence using "force" to
create the bricks in the root FS):

test4# date ; gluster volume create gv0-slave test4:/data/brick/gv0-slave force; date
Mon Oct 13 15:03:14 BST 2014
volume create: gv0-slave: success: please start the volume to access data
Mon Oct 13 15:03:15 BST 2014

test4# date ; gluster volume start gv0-slave; date
Mon Oct 13 15:03:36 BST 2014
volume start: gv0-slave: success
Mon Oct 13 15:03:39 BST 2014

test4# date ; gluster volume geo-replication gv0 test4::gv0-slave create push-pem force ; date
Mon Oct 13 15:05:59 BST 2014
Creating geo-replication session between gv0 & test4::gv0-slave has been successful
Mon Oct 13 15:06:11 BST 2014


I then mount volume gv0 on one of the client machines. I can create files
within the gv0 volume and can see the changes being replicated to the
gv0-slave volume, so I know that geo-replication is working at the start.

When I run my script (which quickly creates, deletes and renames files),
geo-replication breaks within a very short time. The test script output is
in http://gluster.dogwind.com/files/georep20141013/test6_script-output.log
(I interrupted the script once I saw that geo-replication was broken).
Note that when it deletes a file, it renames any later-numbered file so that
the file numbering remains sequential with no gaps; this simulates a real
world application that we use.

If you want a copy of the test script, it's here:
http://gluster.dogwind.com/files/georep20141013/test_script.tar.gz


The various gluster log files can be downloaded from here:
http://gluster.dogwind.com/files/georep20141013/ - each log file has the
actual log file path at the top of the file.

If you want to run the test script on your own system, edit test.pl so that
@mailstores contains a directory path to a gluster volume.

My systems' timezone is BST (GMT+1 / UTC+1) so any timestamps outside of
gluster logs are in this timezone.

Let me know if you need any more info.

--
Cheers,
Kingsley.



___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Documentation for Snapshot feature introduced in Gluster 3.6

2014-10-13 Thread James
On Mon, 2014-10-13 at 01:22 -0400, Rajesh Joseph wrote:
> Hi all,
> 
> I have written a small write-up for the new snapshot feature introduced in 
> Gluster 3.6 release.
> 
> http://rajesh-joseph.blogspot.in/p/gluster-volume-snapshot-howto.html
> 
> Please go through it and let me know if you have any comments.
> 
> Thanks & Regards,
> Rajesh

Shamefully piggy backing on this thread...

Puppet-Gluster also supports setting up lvm_thinp for snapshots since
april [1]:

https://github.com/purpleidea/puppet-gluster/blob/master/DOCUMENTATION.md#lvm_thinp

although it has received very little testing, so if someone is
interested in trying this, please step up and let me know :)

If someone does do this, and shows interest, then I'm happy to accept
patches or show interest in writing some to support native
puppet::volume::snapshot types.

Cheers,
James

[1]
https://github.com/purpleidea/puppet-gluster/commit/b0f645e24d6622008a56504c099fbd2ce64497fe



signature.asc
Description: This is a digitally signed message part
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] If using ZFS with GlusterFS, why disable the ZIL?

2014-10-10 Thread James
On 10 October 2014 12:51, Nathan Fiedler  wrote:
> That makes sense, thanks for the explanation. I'm encouraged now to use ZFS
> with Gluster.


Remember! If you break it, you get to keep both pieces!
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Re-Sync Geo Replication

2014-10-04 Thread James Payne
Hi,

 

Just wondering if there was a method to manually force a re-sync of a geo
replication slave so it is an identical mirror of the master? 

 

History of this request: I have a test setup and Gluster geo-replication seems
to have missed 7 files out completely (not sure if this was a bug or an issue
with my setup specifically; as this is a test setup, it has been set up and
torn down a few times). Now, though, the geo replica will not converge to be
the same, i.e. it's stable, new files add fine and files will delete, but the
missing files just don't seem to be interested in synchronising! I'm guessing
that as the rsync is triggered by the changelog, and as these files aren't
changing, it won't ever notice them again? I can manually copy the files (there
are only 7 after all), but I have only found them through a consistency-checking
script I wrote. I can run this through cron to pick up any missing files;
however, I wondered if Gluster had something built in which did a check and
sync? Also, if I did manually copy these files across, how would that affect
the consistency of the geo-replication session?

 

Running: GlusterFS 3.4.5 on CentOS 6.5

 

Regards

James

 

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Export a gluster volume read-only to some clients but not all

2014-10-02 Thread James
On 3 October 2014 00:49, Tom van Leeuwen  wrote:
> I am not in control of the client

Then you should not use GlusterFS without adding some sort of access
control. The feature you want is currently not available in GlusterFS.
As you said, you can put a kerberized nfs server (krb5p) in front of
GlusterFS if you want, but there is a certain amount of overhead in
doing so.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Export a gluster volume read-only to some clients but not all

2014-10-02 Thread James
On 3 October 2014 00:30, Tom van Leeuwen  wrote:
> I'm running gluster 3.4.2 and have the requirement to export a glusterfs
> volume read-only to some hosts, but definitely not all.
> Is there any way to achieve this without having to introduce a frontend nfs
> server?


You should only allow hosts access to your GlusterFS servers that you
trust. Firewall off the rest, and use auth.allow and auth.reject for
the others.

This is needed because GlusterFS doesn't have built-in authentication.
So the answer is that you must mount the volume read-only on those
clients, using the standard mount 'ro' vs. 'rw' options.
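
For example, something along these lines (volume name, addresses and mount
point are placeholders):

gluster volume set myvol auth.allow 192.168.10.*
# and on the clients that should only read:
mount -t glusterfs -o ro server1:/myvol /mnt/myvol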

If there is a better solution than this, I don't know it, and maybe
someone will let me know.

Cheers,
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] geo-replication fails on CentOS 6.5, gluster v 3.5.2

2014-09-27 Thread James Payne
Not sure but is this the same as bug
https://bugzilla.redhat.com/show_bug.cgi?id=1141379

I have seen similar behaviour, but in my case it showed up when using
Samba: every time a user created a folder (Windows calls it New Folder)
and renamed it quickly, the geo-rep copy became instantly incorrect.

James


-Original Message-
From: Kingsley [mailto:glus...@gluster.dogwind.com] 
Sent: 27 September 2014 00:16
To: gluster-users@gluster.org
Subject: [Gluster-users] geo-replication fails on CentOS 6.5, gluster v
3.5.2

Hi,

I'm new to gluster so forgive me if I'm being an idiot. I've searched the
list archives back to May but haven't found the exact issue I've come
across, so I thought I'd ask on here.

Firstly, I'd like to thank the people working on this project. I've found
gluster to be pretty simple to get going and it seems to work pretty well so
far. It looks like it will be a good fit for the application I have in mind,
if we can get geo-replication to work reliably.

Now on to my problem ...

I've set up an additional gluster volume and configured geo-replication to
replicate the master volume to it using the instructions here:

https://github.com/gluster/glusterfs/blob/master/doc/admin-guide/en-US/markdown/admin_distributed_geo_rep.md

To keep things simple while it was all new to me and I was just testing, I
didn't want to add confusion by thinking about using non-privileged accounts
and mountbroker and stuff so I just set everything up to use root.

Anyway, I mounted the master volume and slave on a client machine (I didn't
modify the content of the slave volume, I just mounted it so that I could
check things were working).

When I manually create or delete a few files and wait 60 seconds for
replication to do its thing, it seems to work fine.

However, when I hit it with a script to simulate intense user activity,
geo-replication breaks. I deleted the geo-replication and removed the slave
volume, then re-created and re-enabled geo-replication several times so that
I could start again from scratch. Each time, my script (which just creates,
renames and deletes files in the master volume via a glusterfs mount) runs
for barely a minute before geo-replication breaks.

I tried this with the slave volume containing just one brick, and also with
it containing 2 bricks replicating each other. Each time, it broke.

On the slave, I noticed that the geo-replication logs contained entries like
these:

[2014-09-26 16:32:23.995539] W [fuse-bridge.c:1214:fuse_err_cbk]
0-glusterfs-fuse: 6384: SETXATTR()
/.gfid/5f9b6d20-a062-4168-9333-8d28f2ba2d57 => -1 (File exists)
[2014-09-26 16:32:23.995798] W [client-rpc-fops.c:256:client3_3_mknod_cbk]
0-gv2-slave-client-0: remote operation failed: File exists. Path:
/msg02
[2014-09-26 16:32:23.996042] W [fuse-bridge.c:1214:fuse_err_cbk]
0-glusterfs-fuse: 6385: SETXATTR()
/.gfid/855b5eda-f694-487c-adae-a4723fe6c316 => -1 (File exists)
[2014-09-26 16:32:24.550009] W [fuse-bridge.c:1911:fuse_create_cbk]
0-glusterfs-fuse: 6469: /.gfid/05a27020-5931-4890-9b74-a77cb1aca918 => -1
(Operation not permitted)
[2014-09-26 16:32:24.550533] W [defaults.c:1381:default_release]
(-->/usr/lib64/glusterfs/3.5.2/xlator/mount/fuse.so(+0x1e7d0)
[0x7fb2ebd1e7d0]
(-->/usr/lib64/glusterfs/3.5.2/xlator/mount/fuse.so(free_fuse_state+0x93)
[0x7fb2ebd07063] (-->/usr/lib64/libglusterfs.so.0(fd_unref+0x10e)
[0x7fb2eef36fbe]))) 0-fuse: xlator does not implement release_cbk

I also noticed that at some point, rsync was returning error code 23.

Now ... I noted from the page I linked above that it requires rsync version
3.0.7 and the version that ships with CentOS 6.5 is, wait for it ... 3.0.6.
Is this going to be the issue, or is the problem something else?
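
For what it's worth, a quick check of the installed version on the master and
slave nodes is just:

rpm -q rsync
rsync --version | head -1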

If you need more logs, let me know. If you need a copy of my client script
that breaks it, let me know that and I'll send it along.

--
Cheers,
Kingsley.



___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Bricks as BTRFS

2014-09-26 Thread James
On Fri, Sep 26, 2014 at 5:57 PM, Ric Wheeler  wrote:
> reflink for backup is really a bad idea since you will not have really made
> a second copy - if the disk fails (even partially!) you might lose data
> since we will not have a second copy of the blocks. Where it is not
> supported, you will still need to do a full file copy which means normal
> file operation speed for the backup and restore.


Perhaps I didn't express the feature well enough, but here is what I
think would be particularly useful, and how the reflink operation can
make this extremely powerful. The below description works for the
single server use case, but does not (AFAICT) yet exist for the multi
machine Gluster scenario.

When running an incremental backup with something like rsnapshot, only
changed files are copied. Therefore in this common scenario you
actually only have one copy of each current file, and each file that
is the same across snapshots is in fact a hardlink to the same data.

To do a restore, the operation could take hours, days, or even longer
depending on how much data is present. If you cp --reflink, then you
effectively complete the copy to the restore location instantly (since the
blocks don't need to be duplicated), and as they are COW files, changes
happen as needed, without affecting the backup.

Similarly, with a backup operation, many TB of data can be copied very
quickly, because the data blocks don't need to move.
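
As a concrete single-machine illustration (paths are made up, and this needs a
reflink-capable filesystem such as btrfs):

cp -a --reflink=always /backups/daily.0/project /srv/restore/project

The blocks are shared copy-on-write, so the "copy" returns almost immediately
regardless of how big the tree is.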

The only problem with this scheme is that it only works for the single
machine use case. Distributing it and replicating it across a scalable
system like GlusterFS would extend this scheme to large data sets. I
don't know how feasible that is, but I have been led to believe that
it is.

Hopefully the above use case is a compelling one. Any sysadmin who has
sat and watched their data restore for 3 days knows the value of it.

Cheers,
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Bricks as BTRFS

2014-09-26 Thread James
On Fri, Sep 26, 2014 at 3:15 PM, Ric Wheeler  wrote:
> On 09/26/2014 01:58 PM, James wrote:
>>
>> On Thu, Sep 25, 2014 at 2:53 AM, Venky Shankar 
>> wrote:
>>>
>>> Hey folks,
>>>
>>> Wanted to check if anyone out here uses BTRFS (and willing to share their
>>> experiences[1]) as the backend filesystem for GlusterFS. We're planning
>>> to
>>> explore some of it's features and put it to use for GlusterFS. This was
>>> discussed briefly during the weekly meeting on #gluster-meeting[2].
>>>
>>> To start with, we plan to explore data/metadata checksumming (+
>>> scrubbing)
>>> and subvolumes to "offload" the work to BTRFS. The mentioned features
>>> would
>>> help us with BitRot detection[3] and Openstack Manila use cases
>>> respectively
>>> (though there are various other nifty things one would want to do with
>>> them).
>>>
>>> Thanks in advance!
>>
>>
>> Hey,
>>
>> I couldn't make the meeting, but I am interested in BTRFS. I added
>> this in puppet-gluster a bunch of months ago as a feature branch.
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1094860
>>
>> I just pushed it to git master.
>>
>>
>> https://github.com/purpleidea/puppet-gluster/commit/6c962083d8b100dcaeb6f11dbe61e6071f3d13f0
>>
>> The reason I want btrfs support, is I want glusterfs to eventually be
>> able to support reflinks across gluster volumes. There is a strong use
>> case for this feature.
>>
>> Let me know if this helps!
>> Cheers,
>> James
>>
>
> Reflinks in btrfs (or ocfs2) need to be between files in the same linux
> kernel instance of btrfs.  Effectively, we have two inodes backed by the
> same physical blocks.
>
> It won't, in general, be useful for reflinks across volumes
>
> Regards,
>
> Ric


Agreed... Which is why this isn't a trivial thing for GlusterFS to do,
but we've discussed certain mechanisms to emulate this behaviour
across a Gluster volume. For example:

* If the reflink causes the file to be on the same brick, just reflink.
* If the reflink causes the file to be on a different brick, then
reflink to self, and put a pointer to that original brick
* If we want to reflink across volumes, then it's tricky, because fuse
would have to pass this information through and down to the
filesystem.

The winning use case for this feature is that someone could
backup/restore petabytes of data "virtually instantly". This is
possible with single volume things, but I'd like to scale this to a
distributed-replicated data store.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Bricks as BTRFS

2014-09-26 Thread James
On Thu, Sep 25, 2014 at 2:53 AM, Venky Shankar  wrote:
> Hey folks,
>
> Wanted to check if anyone out here uses BTRFS (and willing to share their
> experiences[1]) as the backend filesystem for GlusterFS. We're planning to
> explore some of it's features and put it to use for GlusterFS. This was
> discussed briefly during the weekly meeting on #gluster-meeting[2].
>
> To start with, we plan to explore data/metadata checksumming (+ scrubbing)
> and subvolumes to "offload" the work to BTRFS. The mentioned features would
> help us with BitRot detection[3] and Openstack Manila use cases respectively
> (though there are various other nifty things one would want to do with
> them).
>
> Thanks in advance!


Hey,

I couldn't make the meeting, but I am interested in BTRFS. I added
this in puppet-gluster a bunch of months ago as a feature branch.

https://bugzilla.redhat.com/show_bug.cgi?id=1094860

I just pushed it to git master.

https://github.com/purpleidea/puppet-gluster/commit/6c962083d8b100dcaeb6f11dbe61e6071f3d13f0

The reason I want btrfs support, is I want glusterfs to eventually be
able to support reflinks across gluster volumes. There is a strong use
case for this feature.

Let me know if this helps!
Cheers,
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Migrating data from a failing filesystem

2014-09-24 Thread james . bellinger
> On 09/24/2014 07:35 PM, james.bellin...@icecube.wisc.edu wrote:
>> Thanks for the info!
>> I started the remove-brick start and, of course, the brick went
>> read-only
>> in less than an hour.
>> This morning I checked the status a couple of minutes apart and found:
>>
>>        Node   Rebalanced-files      size   scanned   failures        status
>>  ----------   ----------------   -------   -------   --------   -----------
>>  gfs-node04               6634   590.7GB     81799      14868   in progress
>>  ...
>>  gfs-node04               6669   596.5GB     86584      15271   in progress
>>
>> I'm not sure exactly what it is doing here:  4785 files scanned, 403
>> failures, and 35 rebalanced.
> What it is supposed to be doing is to scan all the files in the volume,
> and for the files present in itself, i.e.gfs-node04:/sdb, migrate
> (rebalance) it into other bricks in the volume. Let it go to completion.
> The rebalance log should give you an idea of the 403 failures.

I'll have a look at that.

>>   The used amount on the partition hasn't
>> changed.
> Probably because after copying the files to the other bricks, the
> unlinks/rmdirs on itself are failing because of the FS being mounted
> read-only.
>> If anything, the _other_ brick on the server is shrinking!
> Because the data is being copied into this brick as a part of migration?

No, the space used on the read/write brick is decreasing.  The readonly
one isn't changing, of course.

FWIW, this operation seems to have triggered a failure elsewhere, so I was
a little occupied in getting a filesystem working again.  (I can hardly
wait to remainder this system...)

>> (Which is related to the question I had before that you mention below.)
>>
>> gluster volume remove-brick scratch gfs-node04:/sdb start
> What is your original volume configuration? (gluster vol info scratch)?

$ sudo gluster volume info scratch

Volume Name: scratch
Type: Distribute
Volume ID: de1fbb47-3e5a-45dc-8df8-04f7f73a3ecc
Status: Started
Number of Bricks: 12
Transport-type: tcp,rdma
Bricks:
Brick1: gfs-node01:/sdb
Brick2: gfs-node01:/sdc
Brick3: gfs-node01:/sdd
Brick4: gfs-node03:/sda
Brick5: gfs-node03:/sdb
Brick6: gfs-node03:/sdc
Brick7: gfs-node04:/sda
Brick8: gfs-node04:/sdb
Brick9: gfs-node05:/sdb
Brick10: gfs-node06:/sdb
Brick11: gfs-node06:/sdc
Brick12: gfs-node05:/sdc
Options Reconfigured:
cluster.min-free-disk: 30GB


>> but...
>> df /sda
>> Filesystem   1K-blocks  Used Available Use% Mounted on
>> /dev/sda 12644872688 10672989432 1844930140  86% /sda
>> ...
>> /dev/sda 12644872688 10671453672 1846465900  86% /sda
>>
>> Have I shot myself in the other foot?
>> jim
>>
>>
>>
>>
>>
>>> On 09/23/2014 08:56 PM, james.bellin...@icecube.wisc.edu wrote:
>>>> I inherited a non-replicated gluster system based on antique hardware.
>>>>
>>>> One of the brick filesystems is flaking out, and remounts read-only.
>>>> I
>>>> repair it and remount it, but this is only postponing the inevitable.
>>>>
>>>> How can I migrate files off a failing brick that intermittently turns
>>>> read-only?  I have enough space, thanks to a catastrophic failure on
>>>> another brick; I don't want to present people with another one.  But
>>>> if
>>>> I
>>>> understand migration correctly references have to be deleted, which
>>>> isn't
>>>> possible if the filesystem turns read-only.
>>> What you could do is initiate the  migration  with `remove-brick start'
>>> and monitor the progress with 'remove-brick status`. Irrespective of
>>> whether the rebalance  completes or fails (due to the brick turning
>>> read-only), you could anyway update the volume configuration with
>>> 'remove-brick commit`. Now if the brick still has files left, mount the
>>> gluster volume on that node and copy the files from the brick to the
>>> volume via the mount.  You can then safely rebuild the array/ add a
>>> different brick or whatever.
>>>
>>>> What I want to do is migrate the files off, remove it from gluster,
>>>> rebuild the array, rebuild the filesystem, and then add it back as a
>>>> brick.  (Actually what I'd really like is to hear that the students
>>>> are
>>>> all done with the system and I can turn the whole thing off, but
>>>> theses
>>>> aren't complete yet.)
>>>>
>>>> Any advice or words of warni

Re: [Gluster-users] Migrating data from a failing filesystem

2014-09-24 Thread james . bellinger
Thanks for the info!
I started the remove-brick start and, of course, the brick went read-only
in less than an hour.
This morning I checked the status a couple of minutes apart and found:

       Node   Rebalanced-files      size   scanned   failures        status
 ----------   ----------------   -------   -------   --------   -----------
 gfs-node04               6634   590.7GB     81799      14868   in progress
 ...
 gfs-node04               6669   596.5GB     86584      15271   in progress

I'm not sure exactly what it is doing here: between those two status checks
it scanned another 4785 files, with 403 more failures and 35 more files
rebalanced.  The used amount on the partition hasn't
changed.  If anything, the _other_ brick on the server is shrinking! 
(Which is related to the question I had before that you mention below.)

gluster volume remove-brick scratch gfs-node04:/sdb start
but...
df /sda
Filesystem   1K-blocks  Used Available Use% Mounted on
/dev/sda 12644872688 10672989432 1844930140  86% /sda
...
/dev/sda 12644872688 10671453672 1846465900  86% /sda

Have I shot myself in the other foot?
jim





> On 09/23/2014 08:56 PM, james.bellin...@icecube.wisc.edu wrote:
>> I inherited a non-replicated gluster system based on antique hardware.
>>
>> One of the brick filesystems is flaking out, and remounts read-only.  I
>> repair it and remount it, but this is only postponing the inevitable.
>>
>> How can I migrate files off a failing brick that intermittently turns
>> read-only?  I have enough space, thanks to a catastrophic failure on
>> another brick; I don't want to present people with another one.  But if
>> I
>> understand migration correctly references have to be deleted, which
>> isn't
>> possible if the filesystem turns read-only.
>
> What you could do is initiate the  migration  with `remove-brick start'
> and monitor the progress with 'remove-brick status`. Irrespective of
> whether the rebalance  completes or fails (due to the brick turning
> read-only), you could anyway update the volume configuration with
> 'remove-brick commit`. Now if the brick still has files left, mount the
> gluster volume on that node and copy the files from the brick to the
> volume via the mount.  You can then safely rebuild the array/ add a
> different brick or whatever.
>
>> What I want to do is migrate the files off, remove it from gluster,
>> rebuild the array, rebuild the filesystem, and then add it back as a
>> brick.  (Actually what I'd really like is to hear that the students are
>> all done with the system and I can turn the whole thing off, but theses
>> aren't complete yet.)
>>
>> Any advice or words of warning will be appreciated.
>
> Looks like your bricks are in trouble for over a year now
> (http://gluster.org/pipermail/gluster-users.old/2013-September/014319.html).
> Better get them fixed sooner than later! :-)

Oddly enough the old XRAID systems are holding up better than the VTRAK
arrays.  That doesn't help me much, though, since they're so small.

> HTH,
> Ravi
>
>> James Bellinger
>>
>>
>>
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>
>


___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Migrating data from a failing filesystem

2014-09-23 Thread james . bellinger
I inherited a non-replicated gluster system based on antique hardware.

One of the brick filesystems is flaking out, and remounts read-only.  I
repair it and remount it, but this is only postponing the inevitable.

How can I migrate files off a failing brick that intermittently turns
read-only?  I have enough space, thanks to a catastrophic failure on
another brick; I don't want to present people with another one.  But if I
understand migration correctly references have to be deleted, which isn't
possible if the filesystem turns read-only.

What I want to do is migrate the files off, remove it from gluster,
rebuild the array, rebuild the filesystem, and then add it back as a
brick.  (Actually what I'd really like is to hear that the students are
all done with the system and I can turn the whole thing off, but theses
aren't complete yet.)

Any advice or words of warning will be appreciated.

James Bellinger




___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] glusterfs replica volume self heal lots of small file very very slow?how to improve? why slow?

2014-09-22 Thread James
On Mon, Sep 22, 2014 at 10:04 AM, Jocelyn Hotte
 wrote:
> When a self-heal hits in our use-case, it is a direct impact in performance 
> for the users. The CPU of the Gluster nodes hits 100%, and maintains this for 
> usually 1 hour, but sometimes goes up to 4-5 hours.
> This usually renders the Gluster cluster unusable, with a high impact for us.


Have you tried using cgroups to limit the effect of self-heal?
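
For example, a rough sketch using the libcgroup tools (the 25% CPU quota and
the way of finding the self-heal daemon PID are assumptions to adjust):

cgcreate -g cpu:/gluster-shd
cgset -r cpu.cfs_period_us=100000 gluster-shd
cgset -r cpu.cfs_quota_us=25000 gluster-shd    # roughly a quarter of one core
cgclassify -g cpu:/gluster-shd $(pgrep -f glustershd)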
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] Proposal for GlusterD-2.0

2014-09-12 Thread James
On Fri, Sep 12, 2014 at 10:38 AM, Eliezer Croitoru  wrote:
> Are you the maintainer of puppet-gluster??
Yes.

> Is it compatible with ubuntu?
Yes, although I don't test on ubuntu, that testing is community driven.
This was built to facilitate that effort:
https://ttboj.wordpress.com/2014/06/04/hiera-data-in-modules-and-os-independent-puppet/
If you find any bugs, please report them, and/or send patches.

Cheers,
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] Proposal for GlusterD-2.0

2014-09-12 Thread James
On Fri, Sep 12, 2014 at 12:02 AM, Krishnan Parthasarathi
 wrote:
> - Original Message -
>> On Thu, Sep 11, 2014 at 4:55 AM, Krishnan Parthasarathi
>>  wrote:
>> >
>> > I think using Salt as the orchestration framework is a good idea.
>> > We would still need to have a consistent distributed store. I hope
>> > Salt has the provision to use one of our choice. It could be consul
>> > or something that satisfies the criteria for choosing alternate technology.
>> > I would wait for a couple of days for the community to chew on this and
>> > share their thoughts. If we have a consensus on this, we could 'port'
>> > the 'basic'[1] volume management commands to a system built using Salt and
>> > see for real how it fits our use case. Thoughts?
>>
>>
>> I disagree. I think puppet + puppet-gluster would be a good idea :)
>> One advantage is that the technology is already proven, and there's a
>> working POC.
>> Feel free to prove me wrong, or to request any features that it's missing. ;)
>>
>
> I am glad you joined this discussion. I was expecting you to join earlier :)
Assuming non-sarcasm, then thank you :)
I didn't join earlier, because 1) I'm not a hardcore algorithmist like
most of you are, and 2) I'm busy a lot :P

>
> IIUC, puppet-gluster uses glusterd to perform glusterfs deployments. I think
> it's important to consider puppet given its acceptance. What are your
> thoughts on building 'glusterd' using puppet?
I think I can describe my proposal simply, and then give the reason why...

Proposal:
glusterd shouldn't go away or aim to greatly automate / do much more
than it does today already (with a few exceptions).
puppet-gluster should be used as a higher layer abstraction to do the
complex management. More features would still need to be added to
address every use case and corner case, but I think we're a good deal
of the way there. My work on automatically growing gluster volumes was
demo-ed as a POC but never finished and pushed to git master.

I have no comment on language choices or rewrites of glusterd itself,
since functionality wise it mostly "works for me".

Why?
The reasons this makes a lot of sense:
1) Higher level declarative languages can guarantee a lot of "safety"
in terms of avoiding incorrect operations. It's easy to get the config
management graph to error out, which typically means there is a bug in
the code to be fixed. In this scenario, no code actually runs! This
means your data won't get accidentally hurt, or put into a partial
state.
2) Lines of code to accomplish certain things in puppet might be an
order of magnitude less than in a typical imperative language.
Statistically speaking, by keeping LOC down, the logic can be more
concise, and have fewer bugs. This also lets us reason about things
from a higher POV.
3) Understanding the logic in puppet can be easier than reading a pile
of c or go code. This is why you can look at a page of python and
understand, but staring at three pages of assembly is useless.

In any case, I don't think it's likely that Gluster will end up using
puppet, although I do hope people will think about this a bit more and
at least consider it seriously. Since many people are not very
familiar with configuration management, please don't be shy if you'd
like to have a quick chat about it, and maybe a little demo to show
you what's truly possible.

HTH,
James


>
> The proposal mail describes the functions glusterd performs today. With that
> as a reference could you elaborate on how we could use puppet to perform some
> (or all) the functions of glusterd?
>
> ~KP
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Proposal for GlusterD-2.0

2014-09-11 Thread James
On Thu, Sep 11, 2014 at 12:01 PM, Prasad, Nirmal  wrote:
> I really hope whatever the outcome and final choice is ... as an end user I 
> hope that Gluster stays as simple to deploy as it is today.

I think it's pretty simple already with puppet-gluster. It takes me
around 15 minutes while I'm off drinking a $BEVERAGE.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] Proposal for GlusterD-2.0

2014-09-11 Thread James
On Thu, Sep 11, 2014 at 4:55 AM, Krishnan Parthasarathi
 wrote:
>
> I think using Salt as the orchestration framework is a good idea.
> We would still need to have a consistent distributed store. I hope
> Salt has the provision to use one of our choice. It could be consul
> or something that satisfies the criteria for choosing alternate technology.
> I would wait for a couple of days for the community to chew on this and
> share their thoughts. If we have a consensus on this, we could 'port'
> the 'basic'[1] volume management commands to a system built using Salt and
> see for real how it fits our use case. Thoughts?


I disagree. I think puppet + puppet-gluster would be a good idea :)
One advantage is that the technology is already proven, and there's a
working POC.
Feel free to prove me wrong, or to request any features that it's missing. ;)
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Run gluster server from flash drive

2014-08-11 Thread James
On Mon, Aug 11, 2014 at 4:27 PM, Juan José Pavlik Salles
 wrote:
> Hi James, that post was my inspiration actually hahaha.
lol, oh cool. Check out some of the new articles, they are more fun :)

> I just re-checked
> the link and you are right, there's an optional fixed internal bay for
> drives so I'll ask the seller if they can provide it.
Exactly, this usually exists too. It's fine, the only difference is
you might have to take down the node for internal disk failures. Good
thing Gluster is around!

>When I asked them for
> this specification they said that only 826B chasis have the rear drive kit.

Ah, that's too bad. See if there's a similar chassis that has it. It's
very useful!!
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Run gluster server from flash drive

2014-08-11 Thread James
On Mon, Aug 11, 2014 at 3:03 PM, Juan José Pavlik Salles
 wrote:
> Hi Guys, we are about to get two of these
> http://www.supermicro.com/products/system/4U/6047/SSG-6047R-E1R36N.cfm and
> the seller said that we have to use one of the 36 drives as OS drive. I'd
> like to avoid using one of the swappable drives for the OS, has anyone tried
> running gluster nodes from flash pendrives? Is it a better idea than loosing
> one of the drives? Any other idea?
>
> Regards,


Some of these supermicro servers allow you to add two 2.5" drives (eg:
for a software mdadm raid 1) in the rear with a special bracket. If
the server doesn't support this bracket, then you can enclose them
internally. I wrote about this type of hardware setup here:
https://ttboj.wordpress.com/2012/07/19/my-gluster-setup-described/
(old post, not necessarily still relevant, but idk)
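
The mdadm side of that is just a plain two-disk RAID 1 for the OS, something
like (device names are placeholders):

mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdy1 /dev/sdz1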

HTH
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster vs NFS export sync

2014-07-10 Thread James
On Thu, Jul 10, 2014 at 8:40 PM, Franco Broi  wrote:
> Is there any way to make Gluster emulate the behaviour of a NFS
> filesystem exported with the sync option? By that I mean is it possible
> to write a file from one client and guarantee that the data will be
> instantly available on close to all other clients?


Gluster native writes are already synchronous.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Call for Topics: Gluster Usage and Development

2014-07-08 Thread James
On Tue, Jul 8, 2014 at 11:33 AM, John Mark Walker  wrote:

> Here's what I need from you:
>
> 1. vote on the topics above. If you don't like the topics, feel free to add 
> one (or a few)

* NSR
* SSL
* 4.0
* ARM
* Btrfs
* Erasure coding

> 2. nominate experts to cover the above. This could be anyone, from members of 
> the developer team to yourself
> 3. got tips and tricks you've discovered for your deployment?
Yeah, I just use puppet-gluster :P

Cheers,
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] postgresql HA

2014-07-07 Thread James
On Mon, Jul 7, 2014 at 9:58 AM, Thomaz Luiz Santos
 wrote:
> hello!
>
> Is it possible to use glusterfs 3.4 to make a high-availability cluster for
> replicating a postgresql data directory? I have an environment available
> for testing, with 2 computers.

Typically you'll probably want to use the built-in sql replication for
that, eg: 
http://www.postgresql.org/docs/current/interactive/high-availability.html

But JoeJulian did a cool hack a while back demonstrating that this
sort of thing was possible. Maybe he can post the slides.

Cheers,
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] GlusterFS Consultants & Support companies page

2014-06-11 Thread James
On Wed, Jun 11, 2014 at 11:19 AM, Lalatendu Mohanty  wrote:
>
> We can sort that list alphabetically, first by country and then by
> organisation name, i.e. country names sorted alphabetically and company
> names within each country also sorted alphabetically. Also, before the list
> we should state the sorting rule so that it is transparent to everybody.

That's it! I'm starting a company in Albania called "AAA Gluster
consulting services, we are the best Inc."... Anyone want in?
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] Need testers for GlusterFS 3.4.4

2014-06-04 Thread James
On Wed, Jun 4, 2014 at 2:43 PM, BGM  wrote:
> we might get a cfengine/puppet framework to easily

https://github.com/purpleidea/puppet-gluster
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Possible to pre-sync data before geo-rep?

2014-05-30 Thread James Le Cuirot
Hi Venky,

On Thu, 29 May 2014 14:09:38 +0530
Venky Shankar  wrote:

> There are couple of things here:
> 
> 1. With 3.5, geo-replication would also take care to maintain the
> GFIDs of files to be in sync. Syncing data using rsync this way would
> have mangled the GFIDs. This is very much similar to an upgrade
> scenario to 3.5 where data synced by geo-rep pre 3.5 would not have
> the GFIDs in sync. So, you would need to follow the upgrade steps
> here[1].

I synced the data before creating volumes from it. Even still, I knew
to exclude the .glusterfs directory and did so when I did a dry-run
rsync to check whether they had somehow diverged after starting the
replication. They hadn't.
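
That is, a dry-run of this general form (host and brick paths are
placeholders):

rsync -avn --exclude=.glusterfs /srv/brick/gv0/ slavehost:/srv/brick/gv0/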

I did see a very long string of errors like this in the log though.

[2014-05-27 10:26:49.764568] W [master(/mnt/brick/gv0):250:regjob] : 
Rsync: .gfid/d2fd40a5-06e6-45b6-bf23-152c39c89f79 [errcode: 23]

I don't know what it means. I can't find a .gfid directory in the brick
on either side.

> 2. Regarding geo-rep replicating already replicated data, as of now
> there is no *easy *way to _tell_ gsyncd to skip hybrid crawl and
> start processing live changes (a.k.a. *changelog* mode). Maybe we
> could generate the metadata but not replicate anything, but then, if
> we're fully sure data is in sync after a session restart.

What's the hard way then? ;) I just thought it was odd given that it
takes rsync less than a minute to determine that they are in sync. It
would be a very useful feature to have. We have three data centres
spread across the country and sometimes we copy the data locally before
taking the disks out into the field to be placed in other machines.

Regards,
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] Need testers for GlusterFS 3.4.4

2014-05-28 Thread James
On Wed, May 28, 2014 at 5:02 PM, Justin Clift  wrote:
> Hi all,
>
> Are there any Community members around who can test the GlusterFS 3.4.4
> beta (rpms are available)?

I've provided all the tools and how-to to do this yourself. Should
probably take about ~20 min.

Old example:

https://ttboj.wordpress.com/2014/01/16/testing-glusterfs-during-glusterfest/

Same process should work, except base your testing on the latest
vagrant article:

https://ttboj.wordpress.com/2014/05/13/vagrant-on-fedora-with-libvirt-reprise/

If you haven't set it up already.

Cheers,
James


>
> We're looking for success/failure reports before releasing 3.4.4 final. :)
>
> Regards and best wishes,
>
> Justin Clift
>
> --
> Open Source and Standards @ Red Hat
>
> twitter.com/realjustinclift
>
> ___
> Gluster-devel mailing list
> gluster-de...@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Possible to pre-sync data before geo-rep?

2014-05-27 Thread James Le Cuirot
Hello,

I've set up geo-rep under 3.5 over a slow link. There are 75GB of data
and I already have it present on the slave. The data has been fully
synced with rsync beforehand but gluster still seems to insist on fully
resyncing everything. I left it overnight and it's still chewing up all
our bandwidth in hybrid crawl mode. Is there any way to tell it that
the slave is already up to date? I know it probably has to generate
some metadata but does that really mean all the data has to be resent?

Regards,
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] geo-replication delay?

2014-05-09 Thread James Le Cuirot
Hi Venky,

> On Wed, May 7, 2014 at 11:07 PM, Venky Shankar
> wrote:
> >
> > On Wed, May 7, 2014 at 6:10 PM, James Le Cuirot
> > wrote:
> >
> >> I have set up geo-replication between two machines on my LAN for
> >> testing. Both are using NTP and the clocks are definitely in sync.
> >> CRAWL STATUS reports Changelog Crawl. When I make a change on the
> >> master, it takes up to a minute (sometimes less) for the slave to
> >> notice.
> >
> > ​That's because the change​ detection interval is 60 seconds by
> > default[1] in 3.5. This has been changed to 15 seconds lately.

I have a very bad habit of needing things that haven't quite been
released yet. ;)

> >> Now I understand that geo-replication will always have some delay,
> >> not least because it is asynchronous, but given these are tiny
> >> changes with practically no other activity going on, I was
> >> expecting it to be a little more responsive. Even in production,
> >> there will very little traffic so some additional resource usage
> >> to speed things up would not be an issue. Is this configurable at
> >> all?
> >
> > ​It's configurable. Try this:
> >
> > # gluster volume set <volname> rollover-time 1
> >
> > to indentify changes (followed by syncing it to the slave) every
> > second. This will definitely speed up replication.

Fantastic! I just tried it and it works a charm. Many thanks. :)

Regards,
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] geo-replication delay?

2014-05-07 Thread James Le Cuirot
Hello,

I know this has been asked before but I felt that it wasn't fully
answered and I think the situation may have changed in 3.5.

I have set up geo-replication between two machines on my LAN for
testing. Both are using NTP and the clocks are definitely in sync.
CRAWL STATUS reports Changelog Crawl. When I make a change on the
master, it takes up to a minute (sometimes less) for the slave to
notice.

Now I understand that geo-replication will always have some delay, not
least because it is asynchronous, but given these are tiny changes with
practically no other activity going on, I was expecting it to be a
little more responsive. Even in production, there will very little
traffic so some additional resource usage to speed things up would not
be an issue. Is this configurable at all?

I've seen it explained previously that inotify does not scale well and
gluster has taken a more efficient approach. I hadn't expected this to
come at the cost of such long delays though. We're only planning to
have two nodes, a master and a slave, and I've also seen it said that
gluster provides little benefit over a simple periodic call to rsync
under such setups. Combine that with inotify and the rsync solution
even starts to look favourable.

Lowering this delay is not a deal-breaker for us, it's just that it
seems unnecessarily long. I'd appreciate hearing your thoughts.

Regards,
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] tar_ssh.pem?

2014-05-07 Thread James Le Cuirot
On Tue, 6 May 2014 19:20:25 +0530
Venky Shankar  wrote:

> push-pem expects passwordless SSH between the node where the CLI is
> executed and a slave node (the slave endpoint used for session creation).
> It then adds master's SSH keys to *authorized_keys* on all slave
> nodes (prepended with command=... for restricting access to gsyncd).
> As you said, prompting for password is definitely better and should
> be thought of.

I thought that maybe just removing the check from gverify.sh would do
the trick but after trying it, I see that it's not quite that
straightforward. It doesn't execute that script in the foreground?

> Non-root geo-replication does not work as of now (upstream/3.5). I'm
> in the process of getting in to work (patch
> http://review.gluster.org/#/c/7658/ in gerrit). Even with this you'd
> need password less SSH to one of the nodes on the slave (to an
> unprivileged user in this case). Your argument of prompting for
> password still holds true here.

Good to hear, I'll keep an eye on that. Given that push-pem writes
files to /var on the remote end, would that step still require root? We
generally disable root SSH login as per security policy although
temporarily allowing it for this one step would not be the end of the
world. It looks like this problem has been considered but not yet
solved in gerrit.

> I see the document link you mentioned in BZ #1091079 (comment #2)
> still points to old style geo-replication (we'd need to correct
> that). Are you following that in any case? Comment #1 points to the
> correct URL.

3.5 is the first version I've tried but I came across the older
documentation first. Even after discovering the newer documentation, I
got the impression that "push-pem" is more of a convenience thing to
save you from copying the keys around manually. I only have two nodes,
a master and a slave, so the new "distributed" model doesn't add much
for me.

Regards,
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] tar_ssh.pem?

2014-05-06 Thread James Le Cuirot
On Wed, 30 Apr 2014 20:25:03 +0100
James Le Cuirot  wrote:

> > > On April 28, 2014 6:03:16 AM PDT, Venky Shankar
> > >  wrote:
> 
> > >> On 04/27/2014 11:55 PM, James Le Cuirot wrote:
> > >>> I'm new to Gluster but have successfully tried geo-rep with
> > >>> 3.5.0. I've read about the new tar+ssh feature and it sounds
> > >>> good but nothing has been said about the tar_ssh.pem file that
> > >>> gsyncd.conf references. Why is a separate key needed? Does it
> > >>> not use gsyncd on the other end? If not, what command should I
> > >>> lock it down to in authorized_keys, bug #1091079
> > >>> notwithstanding?
> 
> > >> geo-replication "create push-pem" command should add the keys on
> > >> the slave for tar+ssh to work. That is done as part of geo-rep
> > >> setup.
> 
> I had seen the new "create push-pem" option and gave it a try today. I
> see that it does indeed create a different key with a different
> command in the authorized_keys file.
> 
> One question remains though and this stems back to bug #1091079.
> push-pem expects you to have setup passwordless SSH access already so
> what is the point of adding further lines to authorized_keys when
> general access is already allowed? Surely this is bad for security?
> Wouldn't it be better for push-pem to prompt for a password so that
> only the required access is added?

Sorry for this but could I please get an answer on the above? Security
is a very big deal for us as it should be for everyone here. I gather
the mountbroker can be used to do this replication as non-root which
helps but general SSH access for this user is something I would still
like to avoid if it is really not necessary.

Regards,
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Subject: "Accept Peer Request" state

2014-05-05 Thread James
On Mon, May 5, 2014 at 5:19 PM, Cary Tsai  wrote:
> Yea, restart glusterfs works.
> Thanks


Please comment on the bug so that it's confirmed by someone else. Thanks.
Leaving info about your setup is useful too, thanks.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Subject: "Accept Peer Request" state

2014-05-05 Thread James
On Mon, May 5, 2014 at 5:11 PM, Cary Tsai  wrote:
> I have 4 systems, us-east-1, us-east-2, us-west-1, and us-west-2.
> From us-east-1, it sees us-east-2 state as :  Accepted peer request
> (Connected)
> But other systems sees it as "Peer in Cluster (Connected)"
>
> Due to us-east-2 is "  Accepted peer request " I cannot create a volume
> using brick in us-east-2 on us-east-1.
>
> How do I make us-east-2 seen as "Peer in Cluster" in us-east-1?
>
> Thanks

BTW, normal gluster mode isn't usually meant for "geo-distribution"...
You might want to look at the geo-replication feature instead.

HTH

>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Subject: "Accept Peer Request" state

2014-05-05 Thread James
On Mon, May 5, 2014 at 5:11 PM, Cary Tsai  wrote:
> I have 4 systems, us-east-1, us-east-2, us-west-1, and us-west-2.
> From us-east-1, it sees us-east-2 state as :  Accepted peer request
> (Connected)
> But other systems sees it as "Peer in Cluster (Connected)"
>
> Due to us-east-2 is "  Accepted peer request " I cannot create a volume
> using brick in us-east-2 on us-east-1.
>
> How do I make us-east-2 seen as "Peer in Cluster" in us-east-1?
>
> Thanks

This looks like bug: https://bugzilla.redhat.com/show_bug.cgi?id=1051992
Puppet-Gluster [1] automatically detects this issue and works around it.
You can restart glusterd on the affected host to workaround it too.
Please comment on the bug with your information.
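
On sysvinit-based systems that restart is simply, as root on the affected
peer:

service glusterd restart

(or "systemctl restart glusterd" under systemd).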

HTH,
James

[1] https://github.com/purpleidea/puppet-gluster
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] tar_ssh.pem?

2014-04-30 Thread James Le Cuirot
> > On April 28, 2014 6:03:16 AM PDT, Venky Shankar
> >  wrote:

> >> On 04/27/2014 11:55 PM, James Le Cuirot wrote:
> >>> I'm new to Gluster but have successfully tried geo-rep with 3.5.0.
> >>> I've read about the new tar+ssh feature and it sounds good but
> >>> nothing has been said about the tar_ssh.pem file that gsyncd.conf
> >>> references. Why is a separate key needed? Does it not use gsyncd
> >>> on the other end? If not, what command should I lock it down to
> >>> in authorized_keys, bug #1091079 notwithstanding?

> >> geo-replication "create push-pem" command should add the keys on
> >> the slave for tar+ssh to work. That is done as part of geo-rep
> >> setup.
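
For reference, the command being discussed takes this general form (volume
and host names are placeholders):

gluster volume geo-replication mastervol slavehost::slavevol create push-pem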

I had seen the new "create push-pem" option and gave it a try today. I
see that it does indeed create a different key with a different command
in the authorized_keys file.

One question remains though and this stems back to bug #1091079.
push-pem expects you to have setup passwordless SSH access already so
what is the point of adding further lines to authorized_keys when
general access is already allowed? Surely this is bad for security?
Wouldn't it be better for push-pem to prompt for a password so that
only the required access is added?

Regards,
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] Update on release maintainers

2014-04-29 Thread James
Congratulations Niels!

On Wed, Apr 30, 2014 at 12:02 AM, Vijay Bellur  wrote:
> Hi All,
>
> I am happy to announce that Niels de Vos will be the release maintainer for
> release-3.5. Kaleb Keithley will continue to function as the release
> maintainer for release-3.4. Please join me in congratulating Niels on his
> new role and extend your co-operation to him for further 3.5.x releases :).
>
> Cheers,
> Vijay
> ___
> Gluster-devel mailing list
> gluster-de...@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] tar_ssh.pem?

2014-04-27 Thread James Le Cuirot
Hello all,

I'm new to Gluster but have successfully tried geo-rep with 3.5.0. I've
read about the new tar+ssh feature and it sounds good but nothing has
been said about the tar_ssh.pem file that gsyncd.conf references. Why
is a separate key needed? Does it not use gsyncd on the other end? If
not, what command should I lock it down to in authorized_keys, bug
#1091079 notwithstanding?

Regards,
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Scaling for repository purposes

2014-04-21 Thread James
On Mon, Apr 21, 2014 at 9:54 AM, Peter Milanese  wrote:
> Greeting-
Hey,

>
>  I'm relatively new to the Gluster community, and would like to investigate
> Gluster as a solution to augment our current storage systems. My use of
> Gluster has been limited to niche use cases. Is there anybody in the
> Library/Digital Repository space that has implemented this for mass storage
> (multi-petabyte). I'd be interested in having a discussion via email if
> that's ok.

TBH, the best way to learn about Gluster is to start playing with it.
It's fairly easy to do by hand, but there is also a Puppet-Gluster
module to automate it, and it integrates with Vagrant. Disclaimer: I'm
the author of this code, so of course, I think it's a great idea to
use it.
When I first started testing gluster, this enabled me to try different
configurations and get familiar with how it works.

# the code
https://github.com/purpleidea/puppet-gluster

# some related articles
https://ttboj.wordpress.com/

Try out GlusterFS, and once you've used it for a week, come back with
the harder questions :)
When you want professional support, you can also go to Red Hat for the
commercial product (Red Hat Storage).

HTH,
James

>
> Thanks.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Horrendously slow directory access

2014-04-09 Thread James Cuff
Hey Joe!

Yeah we are all XFS all the time round here - none of that nasty ext4
combo that we know causes raised levels of mercury :-)

The brick errors: we have not seen any; we have been busy grepping and
alerting on anything suspect in our logs.  Mind you, there are hundreds
of brick logs to search through, so I won't say we couldn't have
missed one, but after asking the boys in chat just now they are pretty
convinced that was not the smoking gun.  I'm sure they will chip in on
this thread if there is anything.


j.

--
dr. james cuff, assistant dean for research computing, harvard
university | division of science | thirty eight oxford street,
cambridge. ma. 02138 | +1 617 384 7647 | http://rc.fas.harvard.edu


On Wed, Apr 9, 2014 at 10:36 AM, Joe Julian  wrote:
> What's the backend filesystem?
> Were there any brick errors, probably around 2014-03-31 22:44:04 (half an
> hour before the frame timeout)?
>
>
> On April 9, 2014 7:10:58 AM PDT, James Cuff  wrote:
>>
>> Hi team,
>>
>> I hate "me too" emails, they're sometimes not at all constructive, but I
>> feel I really ought to chip in from the real-world systems we use in anger
>> and at massive scale here.
>>
>> So we also use NFS to "mask" this and other performance issues.  The
>> cluster.readdir-optimize gave us similar results unfortunately.
>>
>> We reported our other challenge back last summer but we stalled on this:
>>
>> http://www.gluster.org/pipermail/gluster-users/2013-June/036252.html
>>
>> We also unfortunately now see a new NFS phenotype that I've pasted
>> below which is again is causing real heartburn.
>>
>> Small files, always difficult for any FS, might be worth doing some
>> regression testing with small file directory scenarios in test - it's
>> an easy reproducer on even moderately sized gluster clusters.  Hope
>> some good progress can be
>> made, and I understand it's a tough one to
>> track down performance hangs and issues.  I just wanted to say that we
>> really do see them, and have tried many things to avoid them.
>>
>> Here's the note from my team:
>>
>> We were hitting 30 minute timeouts on getxattr/system.posix_acl_access
>> calls on directories in a NFS v3 mount (w/ acl option) of a 10-node
>> 40-brick gluster 3.4.0 volume.  Strace shows where the client hangs:
>>
>> $ strace -tt -T getfacl d6h_take1
>> ...
>> 18:43:57.929225 lstat("d6h_take1", {st_mode=S_IFDIR|0755,
>> st_size=7024, ...}) = 0 <0.257107>
>> 18:43:58.186461 getxattr("d6h_take1", "system.posix_acl_access",
>> 0x7fffdf2b9f50, 132) = -1 ENODATA (No data available) <1806.296893>
>> 19:14:04.483556 stat("d6h_take1", {st_mode=S_IFDIR|0755, st_size=7024,
>> ...}) = 0 <0.642362>
>> 19:14:05.126025 getxattr("d6h_take1", "system.posix_acl_default",
>> 0x7fffdf2b9f50, 132) = -1 ENODATA (No data
>> available) <0.24>
>> 19:14:05.126114 stat("d6h_take1", {st_mode=S_IFDIR|0755, st_size=7024,
>> ...}) = 0 <0.10>
>> ...
>>
>> Load on the servers was moderate.  While the above was hanging,
>> getfacl worked nearly instantaneously on that directory on all bricks.
>>  When it finally hit the 30 minute timeout, gluster logged it in
>> nfs.log:
>>
>> [2014-03-31 23:14:04.481154] E [rpc-clnt.c:207:call_bail]
>> 0-holyscratch-client-36: bailing out frame type(GlusterFS 3.3)
>> op(GETXATTR(18)) xid = 0x8168809x sent = 2014-03-31 22:43:58.442411.
>> timeout = 1800
>> [2014-03-31 23:14:04.481233] W
>> [client-rpc-fops.c:1112:client3_3_getxattr_cbk]
>> 0-holyscratch-client-36: remote operation failed: Transport endpoint
>> is not connected. Path: 
>> (b116fb01-b13d-448a-90d0-a8693a98698b). Key: (null)
>>
>> Other than that, we didn't see anything directly related in the nfs or
>> brick logs or anything out of sorts with the gluster services.  A
>> couple other errors raise eyebrows, but these are different
>> directories (neighbors of the example above) and at different times:
>>
>> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:30:47.794454]
>> I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found
>> anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1
>> overlaps=0
>> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:31:47.794447]
>> I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found
>> anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1
>> overlaps=0
>> holyscratch07: /var/log/glusterfs/nfs.log:[201

Re: [Gluster-users] Horrendously slow directory access

2014-04-09 Thread James Cuff
fRN.txt:
key:system.posix_acl_access error:Invalid argument
holyscratch08: /var/log/glusterfs/bricks/holyscratch08_03-brick.log:[2014-03-31
01:18:12.345818] E [posix-helpers.c:696:posix_handle_pair]
0-holyscratch-posix:
/holyscratch08_03/brick/ramanathan_lab/dhuh/d9_take2_BGI/cuffdiffRN.txt:
key:system.posix_acl_access error:Invalid argument
holyscratch05: /var/log/glusterfs/bricks/holyscratch05_04-brick.log:[2014-03-31
21:16:37.057674] E [posix-helpers.c:696:posix_handle_pair]
0-holyscratch-posix:
/holyscratch05_04/brick/ramanathan_lab/dhuh/d9_take2_BGI/Diffreg/cuffdiffRN.txt:
key:system.posix_acl_access error:Invalid argument

--
dr. james cuff, assistant dean for research computing, harvard
university | division of science | thirty eight oxford street,
cambridge. ma. 02138 | +1 617 384 7647 | http://rc.fas.harvard.edu


On Wed, Apr 9, 2014 at 9:52 AM,   wrote:
> I am seeing something perhaps similar.  3.4.2-1, 2 servers, each with 1
> brick, replicated.  A du of a local (ZFS) directory tree of 297834 files
> and 525GB takes about 17 minutes.  A du of the gluster copy is still not
> finished after 22 hours.  Network activity has been about 5-6KB/sec until
> (I gather) du hit a directory with 22450 files, when activity jumped to
> 300KB/sec (200 packets/sec) for about 15-20 minutes.  If I assume that the
> spike came from scanning the two largest directories, that looks like
> about 8K of traffic per file, and about 5 packets.
>
> A 3.3.2 gluster installation that we are trying to retire is not afflicted
> this way.
>
> James Bellinger
>
>>
>> Am I the only person using Gluster suffering from very slow directory
>> access? It's so seriously bad that it almost makes Gluster unusable.
>>
>> Using NFS instead of the Fuse client masks the problem as long as the
>> directories are cached but it's still hellishly slow when you first
>> access them.
>>
>> Has there been any progress at all fixing this bug?
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1067256
>>
>> Cheers,
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Horrendously slow directory access

2014-04-09 Thread james . bellinger
I am seeing something perhaps similar.  3.4.2-1, 2 servers, each with 1
brick, replicated.  A du of a local (ZFS) directory tree of 297834 files
and 525GB takes about 17 minutes.  A du of the gluster copy is still not
finished after 22 hours.  Network activity has been about 5-6KB/sec until
(I gather) du hit a directory with 22450 files, when activity jumped to
300KB/sec (200 packets/sec) for about 15-20 minutes.  If I assume that the
spike came from scanning the two largest directories, that looks like
about 8K of traffic per file, and about 5 packets.

A 3.3.2 gluster installation that we are trying to retire is not afflicted
this way.

James Bellinger

>
> Am I the only person using Gluster suffering from very slow directory
> access? It's so seriously bad that it almost makes Gluster unusable.
>
> Using NFS instead of the Fuse client masks the problem as long as the
> directories are cached but it's still hellishly slow when you first
> access them.
>
> Has there been any progress at all fixing this bug?
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1067256
>
> Cheers,
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>


___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster-Deploy

2014-04-08 Thread James
You might also want to check out Puppet-Gluster.

https://ttboj.wordpress.com/code/puppet-gluster/

https://ttboj.wordpress.com/2014/01/08/automatically-deploying-glusterfs-with-puppet-gluster-vagrant/

Disclaimer: I'm the author :P

James


On Tue, Apr 8, 2014 at 1:05 PM, Jimmy Lu  wrote:
> Hello Gluster Guru,
>
> I'd like to deploy about 5 nodes of gluster using gluster-deploy. Would
> someone please point me to the link where I can download it? I do not see it
> in the repo. I am using rhel on my 5 nodes.
>
> Thanks in advance!
>
> -Jimmy
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Weekly Community Meeting

2014-03-25 Thread James
On Wed, Mar 26, 2014 at 1:18 AM, Lalatendu Mohanty  wrote:
> Hey JM,
>
> Two meetings every week will be a overkill for me.  Can we schedule this
> meeting monthly once or on alternative weeks with the current development
> meeting? We can start with monthly once  (may be the first Wednesday of a
> month) and see if that works. I think development meetings wont be affected
> much with one meeting less per month.
>
> Thanks,
> Lala

+1
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Use single large brick or several smaller ones?

2014-03-22 Thread James
On Sat, Mar 22, 2014 at 3:18 PM, Carlos Capriotti
 wrote:
> James, can we agree to disagree on that ?

Of course! There are lots of different possible configurations, each
with different advantages and disadvantages.

>
> First, one of the main ideas of having a replicated/distributed filesystem
> is to have your data safe, and avoiding a single point of failure or, at the
> very least, minimizing the points of failure.
>
> By adding multiple bricks per server, you are increasing your risk.

Not sure I agree unilaterally.

>
> Regarding performance, glusterd will run on a single core. The more work you
> push to it (multiple volumes), the quicker it will saturate and cap your
> performance. (Well, in all fairness, it will also happen with several
> connections, so let's call this even.)

Actually you get a different glusterfsd for each brick, so this can
increase the parallelism.

>
> Again, on performance, splitting volumes on a RAID6 will only make you
> literally lose disk space IF you divide your volumes on the controller.
> Remember: 6 disks in RAID6 means having the available space of 4 disks,
> since the other two are "spares" (this is a simplification). On the other
> hand, if you use all 12 disks, you will have the available space of
> 10 disks, instead of 4x2=8.

There is some balance between performance and max available space.
It's up to the individual to pick that spot :)

>
> If the alternative is to create a single volume on the RAID controller, and
> then create two logical volumes in the OS, you STILL have one single RAID
> controller; thus you might still lose out, since the OS would have some
> overhead to control two volumes.
>
> Now, a REALLY important factor is, indeed, having "symmetrical" settings,
> like the same processors, same disk configuration for bricks, same amount
> of RAM and same NIC configurations, just to rule out all of those potential
> problems that would make one node wait for the other.
>
> Remember: a gluster write, as would be expected, is only as fast as the
> slowest node, thus affecting performance greatly.
>
> Hope this helps.
>
>
>
>
> On Fri, Mar 21, 2014 at 7:20 PM, James  wrote:
>>
>> On Fri, Mar 21, 2014 at 2:02 PM, Justin Dossey  wrote:
>> > The more bricks you allocate, the higher your operational complexity.
>> > One
>> > brick per server is perfectly fine.
>>
>> I don't agree that you necessarily have a "higher operational
>> complexity" by adding more bricks per host. Especially if you're using
>> Puppet-Gluster [1] to manage it all ;) I do think you'll have higher
>> complexity if your cluster isn't homogeneous or your bricks per host
>> aren't symmetrical across the cluster, or if you're using chaining.
>> Otherwise, I think it's recommended to use more than one brick.
>>
>> There are a number of reasons why you might want more than one brick per
>> host.
>>
>> * Splitting of large RAID sets (might be better to have 2x12 drives
>> per RAID6, instead of one giant RAID6 set)
>>
>> * More parallel IO workload (you might want to see how much
>> performance gains you get from more bricks and your workload. keep
>> adding until you plateau. Puppet-Gluster is a useful tool for
>> deploying a cluster (vm's or iron) to test a certain config, and then
>> using it again to re-deploy and test a new brick count).
>>
>> * More than one brick per server is required if you want to do volume
>> chaining. (Advanced, unsupported feature, but has cool implications.)
>>
>> And so on...
>>
>> The famous "semiosis" has given at least one talk, explaining how he
>> chose his brick count, and detailing his method. I believe he uses 6
>> or 8 bricks per host. If there's a reference, maybe he can chime in
>> and add some context.
>>
>> HTH,
>> James
>>
>> [1] https://github.com/purpleidea/puppet-gluster
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Use single large brick or several smaller ones?

2014-03-21 Thread James
On Fri, Mar 21, 2014 at 2:02 PM, Justin Dossey  wrote:
> The more bricks you allocate, the higher your operational complexity.  One
> brick per server is perfectly fine.

I don't agree that you necessarily have a "higher operational
complexity" by adding more bricks per host. Especially if you're using
Puppet-Gluster [1] to manage it all ;) I do think you'll have higher
complexity if your cluster isn't homogeneous or your bricks per host
aren't symmetrical across the cluster, or if you're using chaining.
Otherwise, I think it's recommended to use more than one brick.

There are a number of reasons why you might want more than one brick per host.

* Splitting of large RAID sets (might be better to have 2x12 drives
per RAID6, instead of one giant RAID6 set)

* More parallel IO workload (you might want to see how much
performance gains you get from more bricks and your workload. keep
adding until you plateau. Puppet-Gluster is a useful tool for
deploying a cluster (vm's or iron) to test a certain config, and then
using it again to re-deploy and test a new brick count).

* More than one brick per server is required if you want to do volume
chaining. (Advanced, unsupported feature, but has cool implications.)

And so on...

The famous "semiosis" has given at least one talk, explaining how he
chose his brick count, and detailing his method. I believe he uses 6
or 8 bricks per host. If there's a reference, maybe he can chime in
and add some context.

HTH,
James

[1] https://github.com/purpleidea/puppet-gluster
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Bring up a brick after disk failure

2014-03-19 Thread James
On Wed, Mar 19, 2014 at 2:28 PM, Joe Julian  wrote:
> Probably should have checked the logs to see what the problem was. The
> "force" command overrides things like preventing adding a brick directory
> that exists on your root partition (like when you forget to mount your raid
> volume).


Actually, there's an open bug about this issue!
https://bugzilla.redhat.com/show_bug.cgi?id=1051993

Hope this gets fixed! It would be a huge win for users and automation
especially.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] PLEASE READ ! We need your opinion. GSOC-2014 and the Gluster community

2014-03-18 Thread James
On Tue, 2014-03-18 at 12:35 +0530, Kaushal M wrote:
> I had a discussion with some developers here in the office regarding
> this. We created a list of ideas which we thought could be suitable
> for student projects. I've added these to [1]. But I'm also putting
> them on here for more visibility.
> 
> (I've tried to arrange the list in descending order of difficulty as I find 
> it)
> 
> . Glusterd services high availability
> Glusterd should restart the processes it manages, bricks, nfs
> server, self-heal daemon & quota daemon, whenever it detects they have
> died.

It might make sense to think about the interplay between this and the
systemd feature set... 

> . glusterfsiostat - Top like utility for glusterfs
> These are client side tools which will display stats from the
> io-stats translator. I'm not currently sure of the difference between
> the two.
> . ovirt gui for stats
> Have pretty graphs and tables in ovirt for the GlusterFS top and
> profile commands.
> . monitoring integrations - munin others.
> The more monitoring support we have for GlusterFS the better.
> . More compression algorithms for compression xlator
> The onwire compression translator should be extended to support
> more compression algorithms. Ideally it should be pluggable.
> . cinder glusterfs backup driver
> Write a driver for cinder, a part of openstack, to allow backup
> onto GlusterFS volumes
> . rsockets - sockets for rdma transport
> Coding for RDMA using the familiar socket api should lead to a
> more robust rdma transport
> . data import tool
> Create a tool which will allow importing already existing
> data in the brick directories into the gluster volume. This is most
> likely going to be a special rebalance process.
> . rebalance improvements
> Improve rebalance performance.
> . Improve the meta translator
> The meta xlator provides a /proc like interface to GlusterFS
> xlators. We could further improve this and make it a part of the
> standard volume graph.
> . geo-rep using rest-api
> This might be suitable for geo replication over WAN. Using
> rsync/ssh over WAN isn't too nice.
> . quota using underlying fs quota
> GlusterFS quota is currently maintained completely in GlusterFS's
> namespace using xattrs. We could make use of the quota capabilities of
> the underlying fs (XFS) for better performance.
> . snapshot pluggability
> Snapshot should be able to make use of snapshot support provided
> by btrfs for example.

This would be very useful :)

> . compression at rest
> Lessons learnt while implementing encryption at rest can be used
> with the compression at rest.
> . file-level deduplication
> GlusterFS works on files. So why not have dedup at the level of files as
> well.
> . composition xlator for small files
> Merge small files into a designated large file using our own custom
> semantics. This can improve our small file performance.
> . multi master geo-rep
> Nothing much to say here. This has been discussed many times.
> 
> Any comments on this list?
> ~kaushal
> 
> [1] http://www.gluster.org/community/documentation/index.php/Projects
> 
> On Tue, Mar 18, 2014 at 9:07 AM, Lalatendu Mohanty  
> wrote:
> > On 03/13/2014 11:49 PM, John Mark Walker wrote:
> >>
> >> - Original Message -
> >>
> >>> Welcome, Carlos.  I think it's great that you're taking initiative here.
> >>
> >> +1 - I love enthusiastic fresh me^H^H^H^H^H^H^H^Hcommunity members! :)
> >>
> >>
> >>> However, it's also important to set proper expectations for what a GSoC
> >>> intern
> >>> could reasonably be expected to achieve.  I've seen some amazing stuff
> >>> out of
> >>> GSoC, but if we set the bar too high then we end up with incomplete code
> >>> and
> >>> the student doesn't learn much except frustration.
> >>
> >> This. The reason we haven't really participated in GSoC is not because we
> >> don't want to - it's because it's exceptionally difficult for a project of
> >> our scope, but that doesn't mean there aren't any possibilities. As an
> >> example, last year the Open Source Lab at OSU worked with a student to
> >> create an integration with Ganeti, which was mostly successful, and I think
> >> work has continued on that project. That's an example of a project with the
> >> right scope.
> >
> >
> > IMO integration projects are ideal fits for GSoC. I can see some information
> > in the Trello backlog, i.e. under "Ecosystem Integration", but I'm not sure
> > of their current status. I think we should take another look at these and
> > see if something can be done through GSoC.
> >
> >
>  3) Accelerator node project. Some storage solutions out there offer an
>  "accelerator node", which is, in short, a, extra node with a lot of RAM,
>  eventually fast disks (SSD), and that works like a proxy to the regular
>  volumes. active chunks of files are moved there, logs (ZIL style) are
>  recorded on fast media, among other things. There is NO active pro

Re: [Gluster-users] Different brick sizes in a volume

2014-03-18 Thread James
On Tue, Mar 18, 2014 at 1:42 PM, Greg Waite
 wrote:
> Hi,
>
> I've been playing around with a 2x2 distributed replicated setup with
> replicating group 1 having a different brick size than replicating group 2.
> I've been running into "out of disk" errors when the smaller replicating
> pair disks fill up. I know of the minimum free disk feature which should
> prevent this issue. My question is, are there features that allow gluster to
> smartly use different brick sizes so extra space on larger bricks does not go
> unused?
>
> Thanks


It looks like different sized bricks will be a core feature in 3.6
(coming soon).

Stay tuned! For now, don't run out of space :)
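
In the meantime, the minimum-free-disk knob Greg mentioned is set roughly
like this (volume name and threshold are placeholders; untested from
memory, so check 'gluster volume set help' on your version first):

  gluster volume set myvol cluster.min-free-disk 10%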
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster 3.4.2 geo-replication bandwith limiting

2014-03-18 Thread James
Re-read the bug. I just added:

It's --bwlimit not --bwlimits.
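
If I'm remembering the syntax right (please double-check it against your
3.4.2 install before trusting me), it gets passed through the geo-rep
config as an rsync option, something like:

  gluster volume geo-replication <mastervol> <slave> config rsync-options "--bwlimit=1024"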

HTH
James


On Tue, Mar 18, 2014 at 10:28 AM, Steve Dainard  wrote:
> According to this BZ https://bugzilla.redhat.com/show_bug.cgi?id=764826 its
> possible to set rysnc bandwidth options for geo-replication on vers 3.2.1.
>
> Is this supported in 3.4.2? I just added the option referenced in the above
> link and the replication agreement status changed to faulty.
>
> Thanks,
>
> Steve
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Has anyone dockerized gluster yet?

2014-03-10 Thread James
On Mon, Mar 10, 2014 at 4:20 PM, Jay Vyas  wrote:
> Hi folks.
>
> Well, after James Shubin's tour-de-awesome of Gluster on vagrant, I think we
> all learned two things:

Who :P

>
> - james is an awesome hacker

Thank you!

> - vagrant isn't ready for primetime on KVM yet.

Sadly, I agree, but not necessarily for the same reason. It was
supposed to be included in Fedora 20 as a feature/package, but
something happened and it got pushed back. Once that happens, I think
all the important barriers will be gone.

The other issue was the lack of proper vagrant boxes, but I consider
this now solved, especially with the hosted box that I provide [1]
(thanks to JMW for the hosting), and because you can build you own
with other tools, including my own:
https://ttboj.wordpress.com/2014/01/20/building-base-images-for-vagrant-with-a-makefile

>
> So, an alternative for quick and easy spin up of gluster systems for
> dev/test would be a docker recipe for gluster on fedora/centos/...

I don't see it as an alternative, actually, I see using Docker with
GlusterFS as being another goal that needs more investigating :)

>
> Has anyone set up a docker container that installs and mounts a couple of
> gluster peers as linux containers yet?  It love to see how that works, and
> then maybe layer in hadoop on top of it.

I recently saw this:

https://lists.gnu.org/archive/html/gluster-devel/2014-03/msg00010.html

Which I wanted to reply to, but didn't have time to yet.
Sorry Kaushal!

>
> Jay Vyas
> http://jayunit100.blogspot.com

Cheers,
James

[1] https://download.gluster.org/pub/gluster/purpleidea/vagrant/

>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] RFC: Gluster bundles for 3.5

2014-03-10 Thread James
On Mon, Mar 10, 2014 at 4:16 PM, John Mark Walker  wrote:
> Greetings,
>
> As you may or may not be aware, there are quite a few maturing projects that 
> may warrant inclusion in a GlusterFS 3.5 bundle:
>
> - gluster-swift
> - gluster-hdfs/gluster-hadoop
> - oVirt 3.4 (for RESTful API and management GUI)
> - pmux - for distributed MapReduce jobs
> - gflocator - implements some measure of data locality, used in pmux
> - glubix - for monitoring with zabbix
> - puppet-gluster - puppet module for deploying GlusterFS

Obviously I agree ;)

I would also nominate libgfapi-python [1] and glupy [2].

I wouldn't expect (although I'd certainly welcome it) that major
breakages in GlusterFS which break Puppet-Gluster would be treated as
blockers for that change; in any case, I don't really see this being an
issue, since the user-facing --xml interface is stable.

>
> I mention these in particular because there are actual, real-life users for 
> each of them. If there are other projects out there that you feel have been 
> overlooked, please nominate them.

Thanks for saving me from doing this :)

>
>
> How do we distribute this software? I've long thought that the easiest way to 
> release them is as packages that we make available for download with each 
> major release. We could call the major releases "Gluster Software 
> Distribution" or "Gluster++" or some other witty name that makes it clear 
> that it's more than just GlusterFS.

The amazing kkeithley [3]* might be helping me rpm-ify Puppet-Gluster
this week. This will serve a few use cases including the above, the
CentOS storage SIG, and maybe even RHS if they ever wanted to use
Puppet-Gluster.

>
>
> Please comment on both the projects I've listed above, as well as how we 
> should go about making this part of a release.
>
> -JM

Cheers,
James
@purpleidea (twitter / irc)
https://ttboj.wordpress.com/

[1] https://github.com/gluster/libgfapi-python/
[2] https://github.com/jdarcy/glupy/
[3]* kkeithley is even cooler in person than on IRC


>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Clarification on cluster quorum

2014-03-10 Thread James
A quick search reveals:

# Sets the quorum percentage for the trusted storage pool.
cluster.server-quorum-ratio

# If set to server, enables the specified volume to
participate in quorum.
cluster.server-quorum-type

# If quorum-type is "fixed" only allow writes if this many
bricks are present. Other quorum types will OVERWRITE this value.
cluster.quorum-count

# If value is "fixed" only allow writes if quorum-count bricks
are present. If value is "auto" only allow writes if more than half of
bricks, or exactly half including the first, are present.
cluster.quorum-type

I took these from my previous "notes" (code) in:
https://github.com/purpleidea/puppet-gluster/blob/master/manifests/volume/property/data.pp#L18

You can get newer values or appropriate values for your version by
running something like:

gluster volume set help ( i think )
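
And a minimal sketch of actually setting them (volume name is a
placeholder; untested here, so verify against the help output above):

  gluster volume set all cluster.server-quorum-ratio 51%
  gluster volume set myvol cluster.server-quorum-type server
  gluster volume set myvol cluster.quorum-type auto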

Cheers,
James


On Mon, Mar 10, 2014 at 2:43 AM, Andrew Lau  wrote:
> Thanks again James!
>
> So far so good, I plan to test this a little more in a few days but so far
> it seems the only volume setting I need is:
> cluster.server-quorum-type: server
>
> Default cluster.server-quorum-ratio >50%
> So 2 is greater than 1.5.. which should allow writes.
>
> On Thu, Mar 6, 2014 at 5:00 PM, James  wrote:
>>
>> Top posting sorry:
>>
>> Yes, you can add a third "arbiter" node, that exists to help with the
>> quorum issues. AFAICT, all you do is peer it with the cluster (as you
>> did with the other hosts) but don't add any storage for example.
>>
>> Then you set the cluster.quorum* style volume settings that you're
>> interested. I don't have a list of exactly which ones off the top of
>> my head, but if you make a list, let me know!
>>
>> Cheers,
>> James
>>
>>
>> On Wed, Mar 5, 2014 at 10:51 PM, Andrew Lau  wrote:
>> > Hi,
>> >
>> > I'm looking for an option to add an arbiter node to the gluster
>> > cluster, but the leads I've been following seem to lead to
>> > inconclusive results.
>> >
>> > The scenario is, a 2 node replicated cluster. What I want to do is
>> > introduce a fake host/arbiter node which would set the cluster to a 3
>> > node setup, meaning we can meet the condition of allowing over 50% to write
>> > (ie. 2 can write, 1 can not).
>> >
>> > elyograg from IRC gave me a few links [1], [2]
>> > But these appear to be over a year old, and still under review.
>> >
>> > Gluster 3.2 volume options (I'm running 3.4, but there doesn't seem to
>> > be an updated page) [3]
>> > seem to state that the cluster quorum is identified by active
>> > peers. This also backs up the statement in [2] in regards to a patch
>> > for active volumes rather than cluster peers.
>> >
>> > Has anyone gone down this path, or could they confirm any of these
>> > leads? (ie. does a host w/o any volumes get considered as a peer
>> > within the cluster)
>> >
>> > Thanks,
>> > Andrew
>> >
>> > [1] https://bugzilla.redhat.com/show_bug.cgi?id=914804
>> > [2] http://review.gluster.org/#/c/4363/
>> > [3]
>> > http://gluster.org/community/documentation/index.php/Gluster_3.2:_Setting_Volume_Options#cluster.quorum-type
>> > ___
>> > Gluster-users mailing list
>> > Gluster-users@gluster.org
>> > http://supercolony.gluster.org/mailman/listinfo/gluster-users
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Clarification on cluster quorum

2014-03-05 Thread James
Top posting sorry:

Yes, you can add a third "arbiter" node, that exists to help with the
quorum issues. AFAICT, all you do is peer it with the cluster (as you
did with the other hosts) but don't add any storage for example.

Then you set the cluster.quorum* style volume settings that you're
interested. I don't have a list of exactly which ones off the top of
my head, but if you make a list, let me know!
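
The bare-bones version would look something like this (hostname and
volume name are made up, and I haven't tested this exact sequence):

  gluster peer probe arbiter1
  gluster volume set myvol cluster.server-quorum-type server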

Cheers,
James


On Wed, Mar 5, 2014 at 10:51 PM, Andrew Lau  wrote:
> Hi,
>
> I'm looking for an option to add an arbiter node to the gluster
> cluster, but the leads I've been following seem to lead to
> inconclusive results.
>
> The scenario is, a 2 node replicated cluster. What I want to do is
> introduce a fake host/arbiter node which would set the cluster to a 3
> node setup, meaning we can meet the condition of allowing over 50% to write
> (ie. 2 can write, 1 can not).
>
> elyograg from IRC gave me a few links [1], [2]
> But these appear to be over a year old, and still under review.
>
> Gluster 3.2 volume options (I'm running 3.4, but there doesn't seem to
> be an updated page) [3]
> seem to state that the cluster quorum is identified by active
> peers. This also backs up the statement in [2] in regards to a patch
> for active volumes rather than cluster peers.
>
> Has anyone gone down this path, or could they confirm any of these
> leads? (ie. does a host w/o any volumes get considered as a peer
> within the cluster)
>
> Thanks,
> Andrew
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=914804
> [2] http://review.gluster.org/#/c/4363/
> [3] 
> http://gluster.org/community/documentation/index.php/Gluster_3.2:_Setting_Volume_Options#cluster.quorum-type
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Fwd: [CentOS-devel] Update on stats of SIG plans

2014-02-27 Thread James
Cool... Thanks for posting this... I've been a bit busy as of late, but if
someone else from Gluster.org is getting involved, I'd like to be looped in
too... In particular, (obviously) I'd love to get a stable release of
Puppet-Gluster packaged and installed alongside GlusterFS so that users
could (optionally) use it to try out GlusterFS.

Karanbir came by #gluster a while back and we talked about this a bit.

If anyone can help out with the RPM packaging, please let me know :)
If you're doing the SIG, let me know! I certainly wouldn't propose to
manage the GlusterFS side alone!

Cheers,
James


On Mon, Feb 24, 2014 at 7:37 AM, Lalatendu Mohanty wrote:

>
> FYI,
>
> For folks who are interested to see a CentOS variant with all required
> GlusterFS packages i.e. CentOS storage SIG
>
> Thanks,
> Lala
>
>  Original Message 
> Subject: [CentOS-devel] Update on stats of SIG plans
> Date: Fri, 21 Feb 2014 19:45:47 +
> From: Karanbir Singh
> Reply-To: The CentOS developers mailing list
> To: centos-de...@centos.org
>
> hi,
>
> I am sure people are wondering what the state of the SIG's is at this
> point. And this is a quick recap of stuff from my perspective.
>
> We have 8 SIG's that are under active consideration and planning.
>
> Every SIG proposal needs to go via the CentOS Board for inclusion and setup.
>
> The CentOS Board is starting to get a regular meeting schedule going,
> and will meet every Wednesday ( minutes for every meeting will be posted
> on the website, Jim is working out the mechanics for that ).
>
> Starting with the Board meeting of the 5th March, we will consider SIG
> plans, no more than 2 a meeting, at every other board meeting. These
> meetings will be held in public, and the SIG's being considered will be
> notified in advance so they can come and be a part of the conversations
> ( and any followup can happen immediately after the meeting ).
>
> In the coming days, I will be reaching out to the people who nominated
> themselves to be SIG coordinators, for the SIG's I've offered to help
> sponsor and start working on the proposals. The proposals will be on 
> the wiki.centos.org site and I'll try to post updates to this list (
> centos-devel ) so others can chime in and we incorporate wider, public
> viewpoints on the proposal.
>
> - KB
>
> --
> Karanbir Singh | +44-207-0999389 | http://www.karan.org/ | twitter.com/kbsingh
> GnuPG Key : http://www.karan.org/publickey.asc
> ___
> CentOS-devel mailing list
> CentOS-devel@centos.org
> http://lists.centos.org/mailman/listinfo/centos-devel
>
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Best Practices for different failure scenarios?

2014-02-19 Thread James
On Wed, Feb 19, 2014 at 4:50 PM, Michael Peek  wrote:
> Thanks for the quick reply.
>
> On 02/19/2014 03:15 PM, James wrote:
>> Short answer, it sounds like you'd benefit from playing with a test
>> cluster... Would I be correct in guessing that you haven't set up a
>> gluster pool yet? You might want to look at:
>> https://ttboj.wordpress.com/2014/01/08/automatically-deploying-glusterfs-with-puppet-gluster-vagrant/
>> This way you can try them out easily...
>
> You're close.  I've got a test cluster up and running now, and I'm about
> to go postal on it to see just in how many different ways I can break
> it, and what I need to know to bring it back to life.
"Go postal on it" -- I like this.
Remember: if you break it, you get to keep both pieces!

>
>> For some of those points... solve them with...
>>>  Sort of a crib notes for things like:
>>>
>>> 1) What do you do if you see that a drive is about to fail?
>>>
>>> 2) What do you do if a drive has already failed?
>> RAID6
>
> Derp.  Shoulda seen that one.
Typically on iron, people will have between 2 and N different bricks,
each composed of a RAID6 set. Other setups are possible depending on
what kind of engineering you're doing.

>
>>> 3) What do you do if a peer is about to fail?
>> Get a new peer ready...
>
> Here's what I think needs to happen, correct me if I've got this wrong:
> 1) Set up a new host with gluster installed
> 2) From the new host, probe one of the other peers (or from one of the
> other peers, probe the new host)
The pool has to probe the peer. Not the other way around...

> 3) gluster volume replace-brick volname failing-host:/failing/brick
> new-host:/new/brick start
In the latest gluster, replace-brick is going away... turning into
add/remove brick...
Try it out with a vagrant setup to get comfortable with it!
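Roughly, from memory (untested, and this is the plain distribute case;
with replica volumes, add-brick needs the matching "replica N" count):
  gluster volume add-brick volname new-host:/new/brick
  gluster volume remove-brick volname failing-host:/failing/brick start
  gluster volume remove-brick volname failing-host:/failing/brick status
  gluster volume remove-brick volname failing-host:/failing/brick commit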

>
> Find out how it's going with:
> gluster volume replace-brick volname failing-host:/failing/brick
> new-host:/new/brick status
>
>>> 4) What do you do if a peer has failed?
>> Replace with new peer...
>>
>
> Same steps as (3) above, then:
> 4) gluster volume heal volname
> to begin copying data over from a replicant.
>
>>> 5) What do you do to reinstall a peer from scratch (i.e. what
>>> configuration files/directories do you need to restore to get the host
>>> back up and talking to the rest of the cluster)?
>> Bring up a new peer. Add to cluster... Same as failed peer...
>>
>
>
>>> 6) What do you do with failed-heals?
>>> 7) What do you do with split-brains?
>> These are more complex issues and a number of people have written about 
>> them...
>> Eg: http://joejulian.name/blog/fixing-split-brain-with-glusterfs-33/
>
> This covers split-brain, but what about failed-heal?  Do you do the same
> thing?
Depends on what has happened... Look at the logs, see what's going on.
Oh, make sure you aren't running out of disk space, because bad things
could happen... :P

>
> Michael

HTH
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Best Practices for different failure scenarios?

2014-02-19 Thread James
On Wed, Feb 19, 2014 at 3:07 PM, Michael Peek  wrote:
> Is there a best practices document somewhere for how to handle standard
> problems that crop up?

Short answer, it sounds like you'd benefit from playing with a test
cluster... Would I be correct in guessing that you haven't set up a
gluster pool yet?
You might want to look at:
https://ttboj.wordpress.com/2014/01/08/automatically-deploying-glusterfs-with-puppet-gluster-vagrant/
This way you can try them out easily...
For some of those points... solve them with...

>  Sort of a crib notes for things like:
>
> 1) What do you do if you see that a drive is about to fail?
RAID6

> 2) What do you do if a drive has already failed?
RAID6

> 3) What do you do if a peer is about to fail?
Get a new peer ready...

> 4) What do you do if a peer has failed?
Replace with new peer...

> 5) What do you do to reinstall a peer from scratch (i.e. what
> configuration files/directories do you need to restore to get the host
> back up and talking to the rest of the cluster)?
Bring up a new peer. Add to cluster... Same as failed peer...

> 6) What do you do with failed-heals?
> 7) What do you do with split-brains?
These are more complex issues and a number of people have written about them...
Eg: http://joejulian.name/blog/fixing-split-brain-with-glusterfs-33/

Cheers,
James


>
> Michael
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Complete machine lockup, v3.4.2

2014-02-18 Thread James
On Tue, Feb 18, 2014 at 8:12 AM, Laurent Chouinard
 wrote:
> Before the system freeze, the last thing the kernel seemed to be doing is
> killing HTTPD threads (INFO: task httpd:7910 blocked for more than 120
> seconds.)  End-users talk to Apache in order to read/write from the Gluster
> volume, so it seems a simple case of “something wrong” with gluster which
> locks read/writes, and eventually the kernel kills them.


If the kernel was killing things, check that it wasn't the OOM killer.
If so, you might want to ensure you've got swap, enough memory, check
if anything is leaking, and finally if you have memory management
issues between services, cgroups might be the thing to use to control
this.
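
A quick way to confirm or rule that out (the log path varies by distro):

  dmesg | grep -iE 'out of memory|killed process'
  grep -i oom /var/log/messages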

HTH,
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Testing replication and HA

2014-02-11 Thread James
On Tue, Feb 11, 2014 at 9:43 AM, David Bierce  wrote:
> When the timeout is reached for the brick failed brick, it does have to 
> recreate handles for all the files in the volume, which is apparently quite 
> an expensive operation.  In our environment, with only 100s of files, this 
> has been livable, but if you have 100k files, I’d imagine it is quite a wait 
> to get the clients state of the volume back to usable.

I'm interested in hearing more about this situation. How expensive,
and where do you see the cost? As CPU usage on the client side? On the
brick side? Or what?

Cheers,
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Testing replication and HA

2014-02-11 Thread James
Thanks to everyone for their replies...

On Tue, Feb 11, 2014 at 2:37 AM, Kaushal M  wrote:
> The 42 second hang is most likely the ping timeout of the client translator.
Indeed I think it is...

>
> What most likely happened was that, the brick on annex3 was being used
> for the read when you pulled its plug. When you pulled the plug, the
> connection between the client and annex3 isn't gracefully terminated
> and the client translator still sees the connection as alive. Because
> of this the next fop is also sent to annex3, but it will timeout as
> annex3 is dead. After the timeout happens, the connection is marked as
> dead, and the associated client xlator is marked as down. Since afr
> now knows annex3 is dead, it sends the next fop to annex4, which is
> still alive.
I think this sounds right... My thought was that maybe Gluster could
do better somehow. For example, once the timeout counter passes some small
threshold (say 1 sec), it could immediately start looking for a different
brick to continue from. This way a routine failover wouldn't interrupt activity for 42
seconds. Maybe this is a feature that could be part of the new style
replication?

>
> These kinds of unclean connection terminations are only handled by
> request/ping timeouts currently. You could set the ping timeout values
> to be lower, to reduce the detection time.
The reason I don't want to set this value significantly lower, is that
in the case of a _real_ disaster, or high load condition, I want to
have the 42 seconds to give things a chance to recover without having
to kill the "in process" client mount. So it makes sense to keep it
like this.

>
> ~kaushal

Cheers,
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Testing replication and HA

2014-02-10 Thread James
It's been a while since I did some gluster replication testing, so I
spun up a quick cluster *cough, plug* using puppet-gluster+vagrant (of
course) and here are my results.

* Setup is a 2x2 distributed-replicated cluster
* Hosts are named: annex{1..4}
* Volume name is 'puppet'
* Client vm's mount (fuse) the volume.

* On the client:

# cd /mnt/gluster/puppet/
# dd if=/dev/urandom of=random.51200 count=51200
# sha1sum random.51200
# rsync -v --bwlimit=10 --progress random.51200 root@localhost:/tmp

* This gives me about an hour to mess with the bricks...
* By looking on the hosts directly, I see that the random.51200 file is
on annex3 and annex4...

* On annex3:
# poweroff
[host shuts down...]

* On client1:
# time ls
random.51200

real0m42.705s
user0m0.001s
sys 0m0.002s

[hangs for about 42 seconds, and then returns successfully...]

* I then power up annex3, and then pull the plug on annex4. The same sort
of thing happens... It hangs for 42 seconds, but then everything works
as normal. This is of course the cluster timeout value, and the answer to
life, the universe and everything.

Question: Why doesn't glusterfs automatically flip over to using the
other available host right away? If you agree, I'll report this as a
bug. If there's a way to do this, let me know.

Apart from the delay, glad that this is of course still HA ;)

Cheers,
James
@purpleidea (twitter/irc)
https://ttboj.wordpress.com/



signature.asc
Description: This is a digitally signed message part
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Same server different networks and different bricks

2014-02-10 Thread James
On Mon, Feb 10, 2014 at 12:43 PM, Elías David
 wrote:
> Ahhh...I see your point, thanks!
>
> What I was thinking of was isolating both gluster volumes despite their
> being on the same servers, not by controlling client access but by really
> isolating them, like VLANs without actually using VLANs.

I don't think GlusterFS offers an out-of-the-box way to do exactly
what you think you want. Of course you can always isolate by building
something yourself using perhaps virt-sandbox-service, cgroups and
selinux...

James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Same server different networks and different bricks

2014-02-10 Thread James
On Mon, Feb 10, 2014 at 12:15 PM, Elías David
 wrote:
> So two vols, different bricks, different networks but the same 5 servers
> handling both

So as I said in my first message:

> for the servers, build it all on the same network.

and then create your volumes however you like, using whichever bricks you want.

and use the:

> auth.allow volume property to control access

If that's not what you want, you'll have to explain more clearly, or
maybe someone else knows what you're trying to do.

HTH,
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Same server different networks and different bricks

2014-02-10 Thread James
On Mon, Feb 10, 2014 at 11:56 AM, Elías David
 wrote:
> Hello all, I would like to do something but I'm not sure if it's possible, I
> have 5 servers each with 6 ethernet ports, 4 ports working on say
> 192.168.100.0/24 and 2 ports working on subnet 10.10.10.0/24.
>
> Each server has 6 disks and what I would like to do is to have a gluster
> vol, say GVol0 using two disks from each server and available to
> 192.168.100.0/24 and another vol, say GVol1 with another 2 disks from each
> servers but available to subnet 10.10.10.0/24 only.
>
> I tried but I was unable to do this; it's like gluster said "you already have
> a vol on the 192.168... network and the 10.10... network points to the same
> servers already peered in 192.168... network"
>
> Is my setup possible? Please if I'm not being clear let me know.

If I understand correctly, all you need is the:

auth.allow

volume property.

You should use that to control access, and as for the servers, build
it all on the same network.

Alternatively, set up two separate Gluster pools that don't peer with
each other.

Here's an auth.reject example (similar syntax to auth.allow) with
Puppet-Gluster, for example:
https://github.com/purpleidea/puppet-gluster/blob/master/examples/distributed-replicate-example.pp#L140
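
With the plain CLI, a rough sketch for your two subnets would be something
like this (untested; check that the wildcard form matches your version):

  gluster volume set GVol0 auth.allow 192.168.100.*
  gluster volume set GVol1 auth.allow 10.10.10.*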

HTH,
James


>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Striped Volume and Client Configuration

2014-02-10 Thread James
On Mon, Feb 10, 2014 at 11:05 AM, Justin Clift  wrote:
> Possibly also interesting is the "gluster-deploy" wizard written by
> Paul Cuzner:
>
>   https://forge.gluster.org/gluster-deploy
>
> He did a screencast/talk thing for an early version of it, and it's
> progressed a bunch since then (has v good rep):
>
>   http://www.youtube.com/watch?v=UxyPLnlCdhA
>
> Does that help? :)
>
> Regards and best wishes,
>
> Justin Clift

Alternatively, there is:
https://ttboj.wordpress.com/2014/01/08/automatically-deploying-glusterfs-with-puppet-gluster-vagrant/

and associated screencasts:
http://ttboj.wordpress.com/2014/01/27/screencasts-of-puppet-gluster-vagrant/

With the above, you get a fast to deploy/re-deploy cluster, which you
can break and re-build.

Cheers,
James
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

