Re: [Gluster-users] Swap space requirements

2017-02-08 Thread Dev Sidious
ERRATA:

The sizing rule should read:

RAM < 4 GB:  SWAP = RAM * 2
RAM > 4 GB:  SWAP = RAM + 2 GB
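
For what it's worth, a small sketch of that rule as a shell check (free -g rounds down, so treat the result as approximate):

    ram_gb=$(free -g | awk '/^Mem:/ {print $2}')
    if [ "$ram_gb" -lt 4 ]; then
        swap_gb=$((ram_gb * 2))
    else
        swap_gb=$((ram_gb + 2))
    fi
    echo "Suggested swap: ${swap_gb} GB"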

On 2/7/2017 5:37 PM, Cedric Lemarchand wrote:
> Hmm ... I would say that RAM is really expensive right now!
> In response to the OP: an old rule was to provision double the RAM as swap;
> these days you could provision half that, or none at all.
> From my point of view, swap only helps in that it delays the OOM killer
> and acts as a sort of warning, because it slows the whole OS down a lot.
> 
> Cheers
> 
> --
> Cédric Lemarchand
> 
> On 7 Feb 2017, at 17:11, Gambit15 wrote:
> 
>> Gluster doesn't "require" swap any more than any other service, and
>> with the price of RAM today, most admins could even consider removing
>> swap altogether.
>>
>> D
>>
>> On 7 February 2017 at 10:56, Mark Connor wrote:
>>
>> I am planning on deploying about 18 bricks of about 50 TB each,
>> spanning 8-10 servers. My servers are high-end machines with
>> 128 GB of RAM each. I have searched and cannot find any detail on swap
>> partition requirements for the latest Gluster server. Can anyone
>> offer me some advice?
>>
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
> 
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
> 



___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Issue in installing Gluster 3.9.0

2017-02-08 Thread Amudhan P
Hi All,

Using './configure --disable-events' fixes the above problem.
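
For reference, a minimal sketch of the build sequence implied above (tarball name assumed):

    tar xf glusterfs-3.9.0.tar.gz
    cd glusterfs-3.9.0
    ./configure --disable-events
    make
    sudo make install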

Thank you, Niklas for forwarding this info.

regards
Amudhan

On Sat, Dec 17, 2016 at 5:49 PM, Amudhan P  wrote:

> Hi All,
>
> Did anyone face the below issue when installing 3.9.0 from the source tar file on
> Ubuntu?
>
>
>
> On Thu, Dec 15, 2016 at 4:51 PM, Amudhan P  wrote:
>
>> Hi,
>>
>> I am trying to install Gluster 3.9.0 from tarball, downloaded from
>> gluster site.
>>
>> configure and make are completing successfully.
>>
>> I get the below error message when running "make install":
>>
>> /usr/bin/install: cannot stat 'glustereventsd-Debian': No such file or
>> directory
>> make[3]: *** [Debain] Error 1
>> make[2]: *** [install-am] Error 2
>> make[1]: *** [install-recursive] Error 1
>> make: *** [install-recursive] Error 1
>>
>> OS version: Ubuntu 14.04
>>
>> Note: I was able to install 3.8 and 3.7 without any issue on the same OS.
>>
>> regards
>> Amudhan
>>
>>
>>
>>
>>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Input/output error - would not heal

2017-02-08 Thread Nag Pavan Chilakam

- Original Message -
From: "lejeczek" 
To: "Nag Pavan Chilakam" 
Cc: gluster-users@gluster.org
Sent: Wednesday, 8 February, 2017 7:15:29 PM
Subject: Re: [Gluster-users] Input/output error - would not heal



On 08/02/17 06:11, Nag Pavan Chilakam wrote:
> "gluster volume info" and "gluster vol status" would help in us debug faster.
>
> However, coming to the gfid mismatch: yes, the file "abbreviations.log" (I assume
> the other brick copy is also "abbreviations.log" and that "breviations.log" was a
> typo?) has mismatching gfids, leading to the I/O error (gfid split-brain).
> Resolving data and metadata split-brains from the backend brick is not recommended.
> But in the case of a GFID split-brain (like the file abbreviations.log), the only
> method available is resolving it from the backend brick.
> You can read more about this in
> http://gluster.readthedocs.io/en/latest/Troubleshooting/split-brain/?highlight=gfid
> (the "Fixing Directory entry split-brain" section).
> (There is already a bug open to support resolving gfid split-brain via the CLI.)
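
For reference, a minimal sketch of the backend procedure referred to above (the brick path and gfid are placeholders; run this only on the brick whose copy you intend to discard):

    # find the file's gfid on the copy to be discarded
    getfattr -n trusted.gfid -e hex /path/to/brick/dir/abbreviations.log
    # remove the file and its .glusterfs hard link
    # (for gfid aabbccdd-... the link lives at <brick>/.glusterfs/aa/bb/aabbccdd-...)
    rm /path/to/brick/dir/abbreviations.log
    rm /path/to/brick/.glusterfs/aa/bb/aabbccdd-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    # then trigger a heal so the surviving copy is propagated back
    gluster volume heal USER-HOME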
>
>   
I've read that doc, however I'm not sure what to do with
cases that are not covered there, namely:
when some xattr exists on one copy but not on the
other, like:

3]$ getfattr -d -m . -e hex .vim.backup/.bash_profile.swp
# file: .vim.backup/.bash_profile.swp
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.USER-HOME-client-0=0x00010001
trusted.afr.USER-HOME-client-5=0x00010001

2]$ getfattr -d -m . -e hex .vim.backup/.bash_profile.swp
# file: .vim.backup/.bash_profile.swp
security.selinux=0x73797374656d5f753a6f626a6563745f723a64656661756c745f743a733000
trusted.afr.USER-HOME-client-5=0x00010001
trusted.afr.USER-HOME-client-6=0x00010001

That means the file .bash_profile.swp is possibly in a data and metadata
split-brain.
I need to understand the volume configuration; that is the reason I am asking
for the volume info.
From the above, I am guessing it is a x3 volume (3 replica copies).
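
(For reference: each trusted.afr.* value is 12 bytes, i.e. three 32-bit big-endian counters of pending data, metadata and entry operations. As a hypothetical example, 0x000000020000000100000000 decodes to data=2, metadata=1, entry=0; non-zero data/metadata counters on both copies, each blaming the other, indicate a data/metadata split-brain. The values quoted above appear truncated in this archive.)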


Unless the doc does talk about it and I've gone (temporarily)
blind; but if it does not, it would be great to include
more scenarios/cases there.
many thx.
L.

>
>
> thanks,
> nagpavan
>
>
> - Original Message -
> From: "lejeczek" 
> To: "Nag Pavan Chilakam" 
> Cc: gluster-users@gluster.org
> Sent: Tuesday, 7 February, 2017 10:53:07 PM
> Subject: Re: [Gluster-users] Input/output error - would not heal
>
>
>
> On 07/02/17 12:50, Nag Pavan Chilakam wrote:
>> Hi,
>> Can you help us with more information on the volume, like volume status and 
>> volume info
>> One reason for a "transport endpoint" error is that the brick could be down.
>>
>> Also, I see that the syntax used for healing is wrong.
>> You need to use it as below:
>> gluster v heal <VOLNAME> split-brain source-brick <HOSTNAME:BRICKPATH> <FILE, relative to the brick path>
>>
>> In your case, if the brick path is "/G-store/1" and the file to be healed is
>> "that_file", then use the syntax below (here I am assuming "that_file" lies
>> directly under the brick path):
>>
>> gluster volume heal USER-HOME split-brain source-brick 10.5.6.100:/G-store/1 
>> /that_file
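
(For reference: the entries Gluster itself flags as split-brain can be listed with "gluster volume heal USER-HOME info split-brain".)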
> That was just my copy-paste typo; with the correct syntax it still does not heal.
> Interestingly, that file is not reported by heal.
>
> I've replied to -  GFID Mismatch - Automatic Correction ? -
> I think my problem is similar, here is a file the heal
> actually sees:
>
>
> $ gluster vol heal USER-HOME info
> Brick
> 10.5.6.100:/__.aLocalStorages/3/0-GLUSTERs/0-USER.HOME/aUser/.vim.backup/.bash_profile.swp
>
> Status: Connected
> Number of entries: 1
>
> Brick
> 10.5.6.49:/__.aLocalStorages/3/0-GLUSTERs/0-USER.HOME/aUser/.vim.backup/.bash_profile.swp
>
> Status: Connected
> Number of entries: 1
>
> I'm copying+pasting what I said in that reply to that thread:
> ...
>
> yep, I'm seeing the same:
> as follows:
> 3]$ getfattr -d -m . -e hex .
> # file: .
> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
> trusted.afr.USER-HOME-client-2=0x
> trusted.afr.USER-HOME-client-3=0x
> trusted.afr.USER-HOME-client-5=0x
> trusted.afr.dirty=0x
> trusted.gfid=0x06341b521ba94ab7938eca57f7a1824f
> trusted.glusterfs.9e4ed9b7-373a-413b-bc82-b6f978e82ec4.xtime=0x5898e0cf000dd2fe
> trusted.glusterfs.dht=0x0001
> trusted.glusterfs.quota.----0001.contri.1=0x00701c90fcb11200fef6f08c798e006a99819205
> trusted.glusterfs.quota.dirty=0x3000
> trusted.glusterfs.quota.size.1=0x00701c90fcb11200fef6f08c798e006a99819205
> 3]$ getfattr -d -m . -e hex .vim.backup
> # file: .vim.backup
> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
> trusted.afr.USER-HOME-client-3=0x
> 

[Gluster-users] Can't start cloned volume: "Volume id mismatch"

2017-02-08 Thread Gambit15
Hey guys,
 Any ideas?

[root@v0 ~]# gluster volume start data2
volume start: data2: failed: Volume id mismatch for brick
s0:/run/gluster/snaps/data2/brick1/data/brick. Expected volume id
d8b0a411-70d9-454d-b5fb-7d7ca424adf2, volume id
a7eae608-f1c4-44fd-a6aa-5b9c19e13565 found

[root@v0 ~]# gluster volume info data2 | grep Volume\ ID
Volume ID: d8b0a411-70d9-454d-b5fb-7d7ca424adf2

[root@v0 ~]# gluster snapshot info data-bck_GMT-2017.02.07-14.30.28
Snapshot  : data-bck_GMT-2017.02.07-14.30.28
Snap UUID : 911ef04b-b922-4611-91a1-9abc29fd2360
Created   : 2017-02-07 14:30:28
Snap Volumes:

Snap Volume Name  : a7eae608f1c444fda6aa5b9c19e13565
Origin Volume name: data
Snaps taken for data  : 1
Snaps available for data  : 255
Status: Started


Cheers,
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Moving a physical disk from one machine to another

2017-02-08 Thread Scott Hazelhurst

Dear Ted and Ashish 

Thanks for your helpful response. Unfortunately I'm running the EPEL version
for my SL6.8 distro, so that's 3.7.

So, I’ve tried to remove the bad brick. (I am running a 3x2 system), but I get 
the following error


gluster volume remove-brick A01 replica 1 m55:/export/brickA01_1 force
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit force: failed: need 3(xN) bricks for reducing 
replica count of the volume from 2 to 1


Any help gratefully received.

Thanks

Scott



 


http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Distributed volumes

2017-02-08 Thread Dev Sidious
Hi,

I'm new to Gluster too, so I welcome more experienced users to correct me
if I'm wrong.

Based on some quick tests in my environment, it works like this:

a) The following creates a replicated (replica count = 2),
non-distributed volume

gluster volume create replicated_but_not_distributed replica 2
host1:/GlusterFS/replicated_but_not_distributed
host2:/GlusterFS/replicated_but_not_distributed

[user@whatever]# gluster volume info replicated_but_not_distributed

Volume Name: replicated_but_not_distributed
Type: Replicate
Volume ID: blah-blah-blah-blah-blah
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: host1:/GlusterFS/replicated_but_not_distributed
Brick2: host2:/GlusterFS/replicated_but_not_distributed
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet


b) The following creates a distributed, non-replicated volume. Note that
I omitted the "replica" directive:

gluster volume create distributed_but_not_replicated
host1:/GlusterFS/distributed_but_not_replicated
host2:/GlusterFS/distributed_but_not_replicated

[user@whatever]# gluster volume info distributed_but_not_replicated

Volume Name: distributed_but_not_replicated
Type: Distribute
Volume ID: blah-blah-blah-blah-blah
Status: Started
Snapshot Count: 0
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: host1:/GlusterFS/distributed_but_not_replicated
Brick2: host2:/GlusterFS/distributed_but_not_replicated
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on

c) The following creates a replicated AND distributed volume as shown
here
(https://access.redhat.com/documentation/en-US/Red_Hat_Storage/2.1/html/Administration_Guide/sect-User_Guide-Setting_Volumes-Distributed_Replicated.html):

gluster volume create replicated_and_distributed replica 2
host1:/GlusterFS/replicated_and_distributed
host2:/GlusterFS/replicated_and_distributed
host3:/GlusterFS/replicated_and_distributed
host4:/GlusterFS/replicated_and_distributed

I don't have 2 other nodes online at this moment to paste the output of
"gluster volume info" for this one.

I hope this helps.

On 2/8/2017 4:12 AM, Dave Fan wrote:
> Hi,
> 
> I'm new to Gluster so a very basic question. Are all volumes distributed
> by default? Is there a switch to turn this feature on/off?
> 
> I ask this because in an intro to Gluster I saw "Replicated Volume" and
> "Distributed Replicated Volume". Is the first type, "Replicated Volume",
> not distributed?
> 
> Many thanks,
> Dave
> 
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
> 



___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Input/output error - would not heal

2017-02-08 Thread lejeczek



On 08/02/17 06:11, Nag Pavan Chilakam wrote:

"gluster volume info" and "gluster vol status" would help in us debug faster.

However, coming to gfid mismatch, yes the file "abbreviations.log" (I assume the other brick copy 
also to be " abbreviations.log" and not "breviations.log" typo mistake?) is in gfid 
mismatch leading to IO error(gfid splitbrain)
Resolving data and metadata splitbrains are not recommended to be done from 
backend brick.
But in case of a GFID splitbrain(like in file abbreviations.log), the only 
method available is resolving from backend brick
You can read more about this in 
http://gluster.readthedocs.io/en/latest/Troubleshooting/split-brain/?highlight=gfid
   (Fixing Directory entry split-brain   section)
(There is a bug already existing to resolve gfid splitbrain using CLI )

  
I've read that doc, however I'm not sure what to do with
cases that are not covered there, namely:
when some xattr exists on one copy but not on the
other, like:


3]$ getfattr -d -m . -e hex .vim.backup/.bash_profile.swp
# file: .vim.backup/.bash_profile.swp
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.USER-HOME-client-0=0x00010001
trusted.afr.USER-HOME-client-5=0x00010001

2]$ getfattr -d -m . -e hex .vim.backup/.bash_profile.swp
# file: .vim.backup/.bash_profile.swp
security.selinux=0x73797374656d5f753a6f626a6563745f723a64656661756c745f743a733000
trusted.afr.USER-HOME-client-5=0x00010001
trusted.afr.USER-HOME-client-6=0x00010001

Unless the doc does talk about it and I've gone (temporarily)
blind; but if it does not, it would be great to include
more scenarios/cases there.

many thx.
L.




thanks,
nagpavan


- Original Message -
From: "lejeczek" 
To: "Nag Pavan Chilakam" 
Cc: gluster-users@gluster.org
Sent: Tuesday, 7 February, 2017 10:53:07 PM
Subject: Re: [Gluster-users] Input/output error - would not heal



On 07/02/17 12:50, Nag Pavan Chilakam wrote:

Hi,
Can you help us with more information on the volume, like volume status and 
volume info
One reason for a "transport endpoint" error is that the brick could be down.

Also, I see that the syntax used for healing is wrong.
You need to use it as below:
gluster v heal <VOLNAME> split-brain source-brick <HOSTNAME:BRICKPATH> <FILE, relative to the brick path>

In your case, if the brick path is "/G-store/1" and the file to be healed is "that_file", then use
the syntax below (here I am assuming "that_file" lies directly under the brick path):

gluster volume heal USER-HOME split-brain source-brick 10.5.6.100:/G-store/1 
/that_file

That was just my copy-paste typo; with the correct syntax it still does not heal.
Interestingly, that file is not reported by heal.

I've replied to -  GFID Mismatch - Automatic Correction ? -
I think my problem is similar, here is a file the heal
actually sees:


$ gluster vol heal USER-HOME info
Brick
10.5.6.100:/__.aLocalStorages/3/0-GLUSTERs/0-USER.HOME/aUser/.vim.backup/.bash_profile.swp

Status: Connected
Number of entries: 1

Brick
10.5.6.49:/__.aLocalStorages/3/0-GLUSTERs/0-USER.HOME/aUser/.vim.backup/.bash_profile.swp

Status: Connected
Number of entries: 1

I'm copying+pasting what I said in that reply to that thread:
...

yep, I'm seeing the same:
as follows:
3]$ getfattr -d -m . -e hex .
# file: .
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.USER-HOME-client-2=0x
trusted.afr.USER-HOME-client-3=0x
trusted.afr.USER-HOME-client-5=0x
trusted.afr.dirty=0x
trusted.gfid=0x06341b521ba94ab7938eca57f7a1824f
trusted.glusterfs.9e4ed9b7-373a-413b-bc82-b6f978e82ec4.xtime=0x5898e0cf000dd2fe
trusted.glusterfs.dht=0x0001
trusted.glusterfs.quota.----0001.contri.1=0x00701c90fcb11200fef6f08c798e006a99819205
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.size.1=0x00701c90fcb11200fef6f08c798e006a99819205
3]$ getfattr -d -m . -e hex .vim.backup
# file: .vim.backup
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.USER-HOME-client-3=0x
trusted.gfid=0x0b3a223955534de89086679a4dce8156
trusted.glusterfs.9e4ed9b7-373a-413b-bc82-b6f978e82ec4.xtime=0x5898621c0005d720
trusted.glusterfs.dht=0x0001
trusted.glusterfs.quota.06341b52-1ba9-4ab7-938e-ca57f7a1824f.contri.1=0x04020001
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.size.1=0x04020001
3]$ getfattr -d -m . -e hex .vim.backup/.bash_profile.swp
# file: .vim.backup/.bash_profile.swp
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.USER-HOME-client-0=0x00010001
trusted.afr.USER-HOME-client-5=0x00010001

Re: [Gluster-users] Slow performance on samba with small files

2017-02-08 Thread Дмитрий Глушенок
For _every_ file copied, Samba performs a readdir() to get all entries of the
destination folder. The list is then searched for the filename (to prevent name
collisions, as SMB shares are not case sensitive). The more files in the folder,
the longer readdir() takes. It is a lot worse for Gluster, because a single
folder's contents are distributed among many servers, and Gluster has to join
many directory listings (requested over the network) into one before returning
it to the caller.

Rsync does not perform readdir(); it just checks for file existence with stat(),
IIRC. And since modern Gluster versions by default check for the file only at its
hashed destination (when the volume is balanced), that check is relatively fast.

You can hack Samba to avoid such checks if your goal is just to get files copied
less slowly (since you are sure the files you are copying do not already exist at
the destination). But try to perform 'ls -l' on a _not_ cached folder with thousands
of files - it will take tens of seconds. This is time your users will waste
browsing shares.
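
One such tweak, as a hedged sketch (share name and path are examples; verify the behaviour against your Samba version): marking the share case sensitive lets Samba stat() the exact name instead of scanning the whole directory, e.g. in smb.conf:

    [gluster-share]
        path = /mnt/glustervol
        # avoid the per-file directory scan done for case-insensitive name matching
        case sensitive = yes

Note that Windows clients expect case-insensitive behaviour, so mixed-case names created elsewhere may no longer resolve as before.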

> On 8 Feb 2017, at 13:17, Gary Lloyd wrote:
> 
> Thanks for the reply
> 
> I've just done a bit more testing. If I use rsync from a gluster client to 
> copy the same files to the mount point it only takes a couple of minutes.
> For some reason it's very slow on samba though (version 4.4.4).
> 
> I have tried various samba tweaks / settings and have yet to get acceptable 
> write speed on small files.
> 
> 
> Gary Lloyd
> 
> I.T. Systems:Keele University
> Finance & IT Directorate
> Keele:Staffs:IC1 Building:ST5 5NB:UK
> +44 1782 733063 
> 
> 
> On 8 February 2017 at 10:05, Дмитрий Глушенок wrote:
> Hi,
> 
> There is a number of tweaks/hacks to make it better, but IMHO overall 
> performance with small files is still unacceptable for such folders with 
> thousands of entries.
> 
> If your shares are not too large to be placed on single filesystem and you 
> still want to use Gluster - it is possible to run VM on top of Gluster. 
> Inside that VM you can create ZFS/NTFS to be shared.
> 
>> On 8 Feb 2017, at 12:10, Gary Lloyd wrote:
>> 
>> Hi
>> 
>> I am currently testing gluster 3.9 replicated/distributed on centos 7.3 with 
>> samba/ctdb.
>> I have been able to get it all up and running, but writing small files is 
>> really slow. 
>> 
>> If I copy large files from gluster backed samba I get almost wire speed (We 
>> only have 1Gb at the moment). I get around half that speed if I copy large 
>> files to the gluster backed samba system, which I am guessing is due to it 
>> being replicated (This is acceptable).
>> 
>> Small file write performance seems really poor for us though:
>> As an example I have an eclipse IDE workspace folder that is 6MB in size 
>> that has around 6000 files in it. A lot of these files are <1k in size.
>> 
>> If I copy this up to gluster backed samba it takes almost one hour to get 
>> there.
>> With our basic samba deployment it only takes about 5 minutes.
>> 
>> Both systems reside on the same disks/SAN.
>> 
>> 
>> I was hoping that we would be able to move away from using a proprietary SAN 
>> to house our network shares and use gluster instead.
>> 
>> Does anyone have any suggestions of anything I could tweak to make it better 
>> ?
>> 
>> Many Thanks
>> 
>> 
>> Gary Lloyd
>> 
>> I.T. Systems:Keele University
>> Finance & IT Directorate
>> Keele:Staffs:IC1 Building:ST5 5NB:UK
>> 
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org 
>> http://lists.gluster.org/mailman/listinfo/gluster-users 
>> 
> --
> Dmitry Glushenok
> Jet Infosystems
> 
> 

--
Dmitry Glushenok
Jet Infosystems

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Slow performance on samba with small files

2017-02-08 Thread Gary Lloyd
Thanks for the reply

I've just done a bit more testing. If I use rsync from a gluster client to
copy the same files to the mount point it only takes a couple of minutes.
For some reason it's very slow on samba though (version 4.4.4).

I have tried various samba tweaks / settings and have yet to get acceptable
write speed on small files.


*Gary Lloyd*

I.T. Systems:Keele University
Finance & IT Directorate
Keele:Staffs:IC1 Building:ST5 5NB:UK
+44 1782 733063


On 8 February 2017 at 10:05, Дмитрий Глушенок  wrote:

> Hi,
>
> There is a number of tweaks/hacks to make it better, but IMHO overall
> performance with small files is still unacceptable for such folders with
> thousands of entries.
>
> If your shares are not too large to be placed on single filesystem and you
> still want to use Gluster - it is possible to run VM on top of Gluster.
> Inside that VM you can create ZFS/NTFS to be shared.
>
> On 8 Feb 2017, at 12:10, Gary Lloyd wrote:
>
> Hi
>
> I am currently testing gluster 3.9 replicated/distributed on centos 7.3
> with samba/ctdb.
> I have been able to get it all up and running, but writing small files is
> really slow.
>
> If I copy large files from gluster backed samba I get almost wire speed
> (We only have 1Gb at the moment). I get around half that speed if I copy
> large files to the gluster backed samba system, which I am guessing is due
> to it being replicated (This is acceptable).
>
> Small file write performance seems really poor for us though:
> As an example I have an eclipse IDE workspace folder that is 6MB in size
> that has around 6000 files in it. A lot of these files are <1k in size.
>
> If I copy this up to gluster backed samba it takes almost one hour to get
> there.
> With our basic samba deployment it only takes about 5 minutes.
>
> Both systems reside on the same disks/SAN.
>
>
> I was hoping that we would be able to move away from using a proprietary
> SAN to house our network shares and use gluster instead.
>
> Does anyone have any suggestions of anything I could tweak to make it
> better ?
>
> Many Thanks
>
>
> *Gary Lloyd*
> 
> I.T. Systems:Keele University
> Finance & IT Directorate
> Keele:Staffs:IC1 Building:ST5 5NB:UK
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
>
> --
> Dmitry Glushenok
> Jet Infosystems
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] should geo repl pick up changes to a vol?

2017-02-08 Thread Kotresh Hiremath Ravishankar
Hi lejeczek,

Try stop force.

gluster vol geo-rep <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> stop force

Thanks and Regards,
Kotresh H R


- Original Message -
> From: "lejeczek" 
> To: "Kotresh Hiremath Ravishankar" 
> Cc: gluster-users@gluster.org
> Sent: Tuesday, February 7, 2017 3:42:18 PM
> Subject: [Gluster-users] should geo repl pick up changes to a vol?
> 
> 
> 
> On 03/02/17 07:25, Kotresh Hiremath Ravishankar wrote:
> > Hi,
> >
> > The following steps needs to be followed when a brick is added from new
> > node on master.
> >
> > 1. Stop geo-rep
> geo repl which master volume had a brick removed:
> 
> ~]$ gluster volume geo-replication GROUP-WORK
> 10.5.6.32::GROUP-WORK-Replica status
> 
> MASTER NODE:    10.5.6.100
> MASTER VOL:     GROUP-WORK
> MASTER BRICK:   /__.aLocalStorages/3/0-GLUSTERs/GROUP-WORK
> SLAVE USER:     root
> SLAVE:          10.5.6.32::GROUP-WORK-Replica
> SLAVE NODE:     10.5.6.32
> STATUS:         Active
> CRAWL STATUS:   History Crawl
> LAST_SYNCED:    2017-02-01 15:24:05
> 
> ~]$ gluster volume geo-replication GROUP-WORK
> 10.5.6.32::GROUP-WORK-Replica stop
> Staging failed on 10.5.6.49. Error: Geo-replication session
> between GROUP-WORK and 10.5.6.32::GROUP-WORK-Replica does
> not exist.
> geo-replication command failed
> 
> 10.5.6.49 is the node whose brick was added; it is now part of the
> master vol.
> 
> 
> >
> > 2. Run the following command on the master node where passwordless SSH
> >connection is configured, in order to create a common pem pub file.
> >
> >  # gluster system:: execute gsec_create
> >
> > 3. Create the geo-replication session using the following command.
> >The push-pem and force options are required to perform the necessary
> >pem-file setup on the slave nodes.
> >
> >  # gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL
> >  create push-pem force
> >
> > 4. Start geo-rep
> >
> > Thanks and Regards,
> > Kotresh H R
> >
> > - Original Message -
> >> From: "lejeczek" 
> >> To: gluster-users@gluster.org
> >> Sent: Thursday, February 2, 2017 1:14:07 AM
> >> Subject: [Gluster-users] should geo repl pick up changes to a vol?
> >>
> >> dear all
> >>
> >> should gluster update geo repl when a volume changes?
> >> eg. bricks are added, taken away.
> >>
> >> reason I'm asking is because it doe not seem like gluster is
> >> doing it on my systems?
> >> Well, I see gluster removed a node form geo-repl, brick that
> >> I removed.
> >> But I added a brick to a vol and it's not there in geo-repl.
> >>
> >> bw.
> >> L.
> >> ___
> >> Gluster-users mailing list
> >> Gluster-users@gluster.org
> >> http://lists.gluster.org/mailman/listinfo/gluster-users
> >>
> 
> 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Slow performance on samba with small files

2017-02-08 Thread Дмитрий Глушенок
Hi,

There are a number of tweaks/hacks to make it better, but IMHO overall
performance with small files is still unacceptable for folders with
thousands of entries.

If your shares are not too large to be placed on a single filesystem and you
still want to use Gluster, it is possible to run a VM on top of Gluster. Inside
that VM you can create a ZFS/NTFS filesystem to be shared.

> On 8 Feb 2017, at 12:10, Gary Lloyd wrote:
> 
> Hi
> 
> I am currently testing gluster 3.9 replicated/distributed on centos 7.3 with 
> samba/ctdb.
> I have been able to get it all up and running, but writing small files is 
> really slow. 
> 
> If I copy large files from gluster backed samba I get almost wire speed (We 
> only have 1Gb at the moment). I get around half that speed if I copy large 
> files to the gluster backed samba system, which I am guessing is due to it 
> being replicated (This is acceptable).
> 
> Small file write performance seems really poor for us though:
> As an example I have an eclipse IDE workspace folder that is 6MB in size that 
> has around 6000 files in it. A lot of these files are <1k in size.
> 
> If I copy this up to gluster backed samba it takes almost one hour to get 
> there.
> With our basic samba deployment it only takes about 5 minutes.
> 
> Both systems reside on the same disks/SAN.
> 
> 
> I was hoping that we would be able to move away from using a proprietary SAN 
> to house our network shares and use gluster instead.
> 
> Does anyone have any suggestions of anything I could tweak to make it better ?
> 
> Many Thanks
> 
> 
> Gary Lloyd
> 
> I.T. Systems:Keele University
> Finance & IT Directorate
> Keele:Staffs:IC1 Building:ST5 5NB:UK
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users

--
Dmitry Glushenok
Jet Infosystems

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Remove an artificial limitation of disperse volume

2017-02-08 Thread Xavier Hernandez

On 07/02/17 17:29, Olivier Lambert wrote:

Yep, but if I hit a 30% penalty, I don't want that :) Any idea of the
perf impact? I'll probably contact Xavier directly if he's not here!


It depends on the workload. The basic problem is that the I/O write size
must be a multiple of the number of data bricks to avoid
read-modify-write cycles (i.e. writes always cover full stripes, so nothing
needs to be read from disk).


Normally applications issue writes that are a power of 2 (512, 4096, 
131072, ...), so the best configurations for ec should also be powers of 
two. But of course you can have bad performance with ec even if you use 
a power of two if the application ends writing blocks of 1000 bytes for 
example.
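
As a worked illustration (assuming the stripe size documented for disperse volumes, i.e. 512 bytes times the number of data bricks): on a 4+2 volume the stripe is 2048 bytes, so a 4096-byte write covers exactly two full stripes and needs no read-modify-write; on a 10+2 volume the stripe is 5120 bytes, so the same 4096-byte write fills only part of a stripe and forces ec to read the rest of it before rewriting.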


Anyway, I cannot give you real numbers. Long time ago I did some tests 
but I was unable to get conclusive results. Probably I was having too 
many interferences from other translators that were hiding the real 
performance impact (maybe write-behind). This is good because this means 
that peak performance could be roughly the same with "optimal" and 
"non-optimal" configurations, but sustained throughput should be worse 
for "non-optimal" configurations because at some point the caching made 
by write-behind and other xlators will become full and they won't be 
able to continue hiding the performance hit.


Some day I'll try to find time to get more accurate results regarding
this issue, but for now the only thing I can say is to test your specific
workload with your specific configuration and see if it suffers a
performance impact or not.


Xavi



On Tue, Feb 7, 2017 at 5:27 PM, Jeff Darcy  wrote:



- Original Message -

Okay so the 4 nodes thing is a kind of exception? What about 8 nodes
with redundancy 4?

I made a table to recap possible configurations, can you take a quick
look and tell me if it's OK?

Here: https://gist.github.com/olivierlambert/8d530ac11b10dd8aac95749681f19d2c


As I understand it, the "power of two" thing is only about maximum
efficiency, and other values can work without wasting space (they'll
just be a bit slower).  So, for example, with 12 disks you would be
able to do 10+2 and get 83% space efficiency.  Xavier's the expert,
though, so it's probably best to let him clarify.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users



___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Remove an artificial limitation of disperse volume

2017-02-08 Thread Xavier Hernandez

Hi Pavan,

On 07/02/17 14:51, Nag Pavan Chilakam wrote:

You can always go for x3 (3 replica copies) to address the need you asked
about.
EC volumes can be seen as RAID for understanding purposes, but don't take it as
an apples-to-apples comparison.
RAID 4/6 (mostly) relies on XOR'ing bits (so basic addition and subtraction), but
EC involves a more complex algorithm (Reed-Solomon).


In fact, RAID-5 and RAID-6 can be seen as an implementation of
Reed-Solomon, though for really simple cases, so they are computed directly
using XORs and no one talks about Reed-Solomon.


For example, one possible Reed-Solomon matrix that implements a RAID-5
is equivalent to computing the redundancy as the XOR of all data blocks.
This is precisely what RAID-5 does. RAID-6 is also very similar.
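
As a minimal illustration of that equivalence (a 2+1 layout with hypothetical byte values): given data fragments D0 = 0x5A and D1 = 0x3C, the redundancy fragment is P = D0 XOR D1 = 0x66, and a lost D0 is recovered as P XOR D1 = 0x5A.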


The current implementation of the Reed-Solomon in EC only uses XORs to 
compute the parity and recover the data. The average number of XORs per 
input byte needed to compute the redundancy depends on the CPU 
extensions used (none, SSE, AVX) and the configuration. This is a table 
showing this:


        x86_64    SSE     AVX
 2+1    0.79      0.39    0.20
 4+2    1.76      0.88    0.44
 4+3    2.06      1.03    0.51
 8+3    3.40      1.70    0.85
 8+4    3.71      1.86    0.93
16+4    6.34      3.17    1.59

Note that for AVX and a 16+4 configuration it only uses 1.59 xors on 
average to compute the 4 redundancies. It only needs more than one xor 
per byte of redundancy for x86_64 and 16+4 (6.34 / 4 = 1.585).


There's a technical document explaining how EC works internally here, 
though it's oriented to developers and people who already know the 
basics about erasure codes:


https://review.gluster.org/#/c/15637/4/doc/developer-guide/ec-implementation.md

Xavi




- Original Message -
From: "Olivier Lambert" 
To: "gluster-users" 
Sent: Tuesday, 7 February, 2017 6:46:37 PM
Subject: [Gluster-users] Remove an artificial limitation of disperse volume

Hi everyone!

I'm currently working on implementing Gluster on XenServer/Xen Orchestra.

I want to expose some Gluster features (in the easiest possible way to
the user).

Therefore, I want to expose only "distributed/replicated" and
"disperse" mode. From what I understand, they are working differently.
Let's take a simple example.

Setup: 6x nodes with 1x 200GB disk each.

* Disperse with redundancy 2 (4+2): I can lose **any 2 of all my
disks**. Total usable space is 800GB. It's a kind of RAID6 (or RAIDZ2)
* Distributed/replicated with replica 2: I can lose 2 disks **BUT**
not on the same "mirror". Total usable space is 600GB. It's a kind of
RAID10

So far, is it correct?

My main point is that behavior is very different (pairing disks in
distributed/replicated and "shared" parity in disperse).

Now, let's imagine something else. 4x nodes with 1x 200GB disk each.

Why not have disperse with redundancy 2? It would be the same in
terms of storage space as distributed/replicated, **BUT** in
disperse I can lose any 2 disks. In dist/rep, only if they are not
on the same "mirror".

So far, I can't create a disperse volume if the redundancy level is
50% or more of the number of bricks. I know that perf would be better in
dist/rep, but what if I prefer to have disperse anyway?

Conclusion: would it be possible to have a "force" flag during
disperse volume creation even if redundancy is higher than 50%?



Thanks!



Olivier.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users



___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Distributed volumes

2017-02-08 Thread Dave Fan
Hi,


I'm new to Gluster so a very basic question. Are all volumes distributed by 
default? Is there a switch to turn this feature on/off?


I ask this because in an intro to Gluster I saw "Replicated Volume" and 
"Distributed Replicated Volume". Is the first type, "Replicated Volume", not 
distributed?


Many thanks,
Dave
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Remove an artificial limitation of disperse volume

2017-02-08 Thread Xavier Hernandez

Hi Olivier,

sorry, didn't see the email earlier...

We've already talked about this in private, but to make things clearer
to everyone I'll answer here as well.


On 07/02/17 14:16, Olivier Lambert wrote:

Hi everyone!

I'm currently working on implementing Gluster on XenServer/Xen Orchestra.

I want to expose some Gluster features (in the easiest possible way to
the user).

Therefore, I want to expose only "distributed/replicated" and
"disperse" mode. From what I understand, they are working differently.
Let's take a simple example.

Setup: 6x nodes with 1x 200GB disk each.

* Disperse with redundancy 2 (4+2): I can lose **any 2 of all my
disks**. Total usable space is 800GB. It's a kind of RAID6 (or RAIDZ2)
* Distributed/replicated with replica 2: I can lose 2 disks **BUT**
not on the same "mirror". Total usable space is 600GB. It's a kind of
RAID10

So far, is it correct?


Yes, but sometimes you can gain some performance by splitting each disk 
into two bricks if the disks are not the bottleneck.




My main point is that behavior is very different (pairing disks in
distributed/replicated and "shared" parity in disperse).

Now, let's imagine something else. 4x nodes with 1x 200GB disk each.

Why not have disperse with redundancy 2? It would be the same in
terms of storage space as distributed/replicated, **BUT** in
disperse I can lose any 2 disks. In dist/rep, only if they are not
on the same "mirror".

So far, I can't create a disperse volume if the redundancy level is
50% or more of the number of bricks. I know that perf would be better in
dist/rep, but what if I prefer to have disperse anyway?

Conclusion: would it be possible to have a "force" flag during
disperse volume creation even if redundancy is higher than 50%?


That's a design decision, made to avoid most split-brain scenarios and on the
assumption that 50% redundancy is already provided by replicate (even if
the conditions are not really the same).


The Reed-Solomon algorithm is able to create as many redundancy fragments as
there are data bricks, or even more (the only real limitation is the Galois
field used). However, allowing this in disperse introduced a lot of complex
scenarios that are both difficult to solve and prone to failures/data
corruption. So it was decided not to support those configurations.


Xavi





Thanks!



Olivier.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users



___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Slow performance on samba with small files

2017-02-08 Thread Gary Lloyd
Hi

I am currently testing gluster 3.9 replicated/distributed on centos 7.3 with
samba/ctdb.
I have been able to get it all up and running, but writing small files is
really slow.

If I copy large files from gluster backed samba I get almost wire speed (We
only have 1Gb at the moment). I get around half that speed if I copy large
files to the gluster backed samba system, which I am guessing is due to it
being replicated (This is acceptable).

Small file write performance seems really poor for us though:
As an example I have an eclipse IDE workspace folder that is 6MB in size
that has around 6000 files in it. A lot of these files are <1k in size.

If I copy this up to gluster backed samba it takes almost one hour to get
there.
With our basic samba deployment it only takes about 5 minutes.

Both systems reside on the same disks/SAN.


I was hoping that we would be able to move away from using a proprietary
SAN to house our network shares and use gluster instead.

Does anyone have any suggestions of anything I could tweak to make it
better ?

Many Thanks


*Gary Lloyd*

I.T. Systems:Keele University
Finance & IT Directorate
Keele:Staffs:IC1 Building:ST5 5NB:UK

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users