Re: [Gluster-users] Slow performance on samba with small files

2017-02-08 Thread Дмитрий Глушенок
For _every_ file copied, samba performs readdir() to get all entries of the 
destination folder. The list is then searched for the filename (to prevent name 
collisions, as SMB shares are not case sensitive). The more files in the folder, 
the longer readdir() takes. It is a lot worse for Gluster, because the contents 
of a single folder are distributed among many servers and Gluster has to join 
many directory listings (requested over the network) into one before returning 
it to the caller.

Rsync does not perform readdir(); it just checks file existence with stat(), 
IIRC. And as modern Gluster versions by default look a file up only on the brick 
its name hashes to (when the volume is balanced), the check is relatively fast.

You can hack samba to prevent such checks if your goal is simply to get files 
copied faster (and you are sure the files you are copying do not already exist 
at the destination). But try to perform 'ls -l' on a _not_ cached folder with 
thousands of files - it will take tens of seconds. That is the time your users 
will waste browsing shares.
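
For illustration, one generic Samba-side tweak in this direction (a sketch, not 
necessarily the exact hack meant above) is to make the share case sensitive so 
smbd can skip the per-name directory scan. Only do this if you are sure your 
clients never rely on case-insensitive names; the share name and path below are 
hypothetical:

[projects]
    path = /mnt/glustervol/projects
    ; avoid case-insensitive name searches in large directories
    case sensitive = yes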

> On 8 Feb 2017, at 13:17, Gary Lloyd wrote:
> 
> Thanks for the reply
> 
> I've just done a bit more testing. If I use rsync from a gluster client to 
> copy the same files to the mount point it only takes a couple of minutes.
> For some reason it's very slow on samba though (version 4.4.4).
> 
> I have tried various samba tweaks / settings and have yet to get acceptable 
> write speed on small files.
> 
> 
> Gary Lloyd
> 
> I.T. Systems:Keele University
> Finance & IT Directorate
> Keele:Staffs:IC1 Building:ST5 5NB:UK
> +44 1782 733063 
> ________
> 
> On 8 February 2017 at 10:05, Дмитрий Глушенок  <mailto:gl...@jet.msk.su>> wrote:
> Hi,
> 
> There are a number of tweaks/hacks to make it better, but IMHO overall 
> performance with small files is still unacceptable for such folders with 
> thousands of entries.
> 
> If your shares are not too large to be placed on a single filesystem and you 
> still want to use Gluster - it is possible to run a VM on top of Gluster. 
> Inside that VM you can create a ZFS/NTFS filesystem to be shared.
> 
>> On 8 Feb 2017, at 12:10, Gary Lloyd <mailto:g.ll...@keele.ac.uk> wrote:
>> 
>> Hi
>> 
>> I am currently testing gluster 3.9 replicated/distributed on CentOS 7.3 with 
>> samba/ctdb.
>> I have been able to get it all up and running, but writing small files is 
>> really slow. 
>> 
>> If I copy large files from gluster backed samba I get almost wire speed (We 
>> only have 1Gb at the moment). I get around half that speed if I copy large 
>> files to the gluster backed samba system, which I am guessing is due to it 
>> being replicated (This is acceptable).
>> 
>> Small file write performance seems really poor for us though:
>> As an example I have an eclipse IDE workspace folder that is 6MB in size 
>> that has around 6000 files in it. A lot of these files are <1k in size.
>> 
>> If I copy this up to gluster backed samba it takes almost one hour to get 
>> there.
>> With our basic samba deployment it only takes about 5 minutes.
>> 
>> Both systems reside on the same disks/SAN.
>> 
>> 
>> I was hoping that we would be able to move away from using a proprietary SAN 
>> to house our network shares and use gluster instead.
>> 
>> Does anyone have any suggestions of anything I could tweak to make it better 
>> ?
>> 
>> Many Thanks
>> 
>> 
>> Gary Lloyd
>> 
>> I.T. Systems:Keele University
>> Finance & IT Directorate
>> Keele:Staffs:IC1 Building:ST5 5NB:UK
>> 
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org <mailto:Gluster-users@gluster.org>
>> http://lists.gluster.org/mailman/listinfo/gluster-users 
>> <http://lists.gluster.org/mailman/listinfo/gluster-users>
> --
> Dmitry Glushenok
> Jet Infosystems
> 
> 

--
Dmitry Glushenok
Jet Infosystems

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Slow performance on samba with small files

2017-02-08 Thread Дмитрий Глушенок
Hi,

There are a number of tweaks/hacks to make it better, but IMHO overall 
performance with small files is still unacceptable for such folders with 
thousands of entries.

If your shares are not too large to be placed on a single filesystem and you 
still want to use Gluster - it is possible to run a VM on top of Gluster. Inside 
that VM you can create a ZFS/NTFS filesystem to be shared.
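
As a rough sketch of that layout (hypothetical names and sizes; the hypervisor 
integration details are up to you):

# on the hypervisor, back the VM disk with an image file on the Gluster mount
mount -t glusterfs srv01:/gv0 /mnt/gv0
qemu-img create -f qcow2 /mnt/gv0/fileserver.qcow2 2T
# inside the guest: create the ZFS/NTFS filesystem on that disk and share it
# via Samba there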

> On 8 Feb 2017, at 12:10, Gary Lloyd wrote:
> 
> Hi
> 
> I am currently testing gluster 3.9 replicated/distributed on CentOS 7.3 with 
> samba/ctdb.
> I have been able to get it all up and running, but writing small files is 
> really slow. 
> 
> If I copy large files from gluster backed samba I get almost wire speed (We 
> only have 1Gb at the moment). I get around half that speed if I copy large 
> files to the gluster backed samba system, which I am guessing is due to it 
> being replicated (This is acceptable).
> 
> Small file write performance seems really poor for us though:
> As an example I have an eclipse IDE workspace folder that is 6MB in size that 
> has around 6000 files in it. A lot of these files are <1k in size.
> 
> If I copy this up to gluster backed samba it takes almost one hour to get 
> there.
> With our basic samba deployment it only takes about 5 minutes.
> 
> Both systems reside on the same disks/SAN.
> 
> 
> I was hoping that we would be able to move away from using a proprietary SAN 
> to house our network shares and use gluster instead.
> 
> Does anyone have any suggestions of anything I could tweak to make it better ?
> 
> Many Thanks
> 
> 
> Gary Lloyd
> 
> I.T. Systems:Keele University
> Finance & IT Directorate
> Keele:Staffs:IC1 Building:ST5 5NB:UK
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users

--
Dmitry Glushenok
Jet Infosystems

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Vm migration between diff clusters

2017-01-19 Thread Дмитрий Глушенок
Hello,

For offline migration you can use a storage domain of type Export, shared 
between the clusters. For online storage migration the source and destination 
storage domains have to be present in the current cluster.

Regarding the different glusterfs versions - it should not be a problem, because 
oVirt uses VM images as files on a filesystem, without gfapi involvement.

> On 19 Jan 2017, at 2:11, p...@email.cz wrote:
> 
> Hello, 
> how can I migrate VM  between  two different clusters with gluster FS ? ( 3.5 
> x 4.0 )
> They have different ovirt mgmt.
> 
> regards
> Paf1
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users

--
Dmitry Glushenok
Jet Infosystems

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] How to properly set ACLs in GlusterFS?

2016-12-08 Thread Дмитрий Глушенок
Hi,

According to the man page for setfacl: for uid and gid you can specify either a 
name or a number.
But the information will actually be stored in xattrs in the form of numbers, 
AFAIK.

One way to solve your problem is consistent name/ID mapping, which can be 
achieved by using a directory server such as FreeIPA, for example.
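
For example (paths are illustrative; the point is that the stored ACL is keyed 
by the numeric UID, which is why the IDs must match on every client):

# these two are equivalent while test1 resolves to uid 1002 on this host
setfacl -m u:test1:rwx /repositories/test
setfacl -m u:1002:rwx /repositories/test
# the ACL is kept in the system.posix_acl_access xattr with numeric IDs
getfattr -d -m . -e hex /repositories/test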

> On 7 Dec 2016, at 16:59, Alexandr Porunov wrote:
> 
> Hello,
> 
> I am trying to use ACLs but it seems that it doesn't recognize user names but 
> user IDs. 
> I.e. I have 2 machines with next users: user1, user2.
> On the first machine I have created users like this:
> useradd user1
> useradd user2
> 
> On the second machine I have created users like this:
> useradd user2
> useradd user1
> 
> Now I see id's of the users. Here is what I see:
> 
> Machine 1:
> # id test1
> uid=1002(test1) gid=1003(test1) groups=1003(test1)
> # id test2
> uid=1003(test2) gid=1004(test2) groups=1004(test2)
> 
> Machine 2:
> # id test1
> uid=1003(test1) gid=1004(test1) groups=1004(test1)
> # id test2
> uid=1002(test2) gid=1003(test2) groups=1003(test2)
> 
> So, on the machine1 test1 user has 1002 ID and on the machine2 test1 user has 
> 1003
> 
> Now If on the machine1 I set a permission a on file like this:
> setfacl -R -m u:test1:rwx /repositories/test
> 
> On the machine2 test1 user won't have any access to the file but the user 
> test2 will! How to set permissions based on the user/group ID?
> 
> Here is how I mount a gluster volume:
> mount -t glusterfs -o acl 192.168.0.120:/gv0 /repositories/
> 
> Sincerely,
> Alexandr
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

--
Dmitry Glushenok
Jet Infosystems

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] dealing with gluster outages due to disk timeouts

2016-12-02 Thread Дмитрий Глушенок
Hi,

I always thought that hardware RAID is a requirement for SDS, as it hides all 
the dirty work with raw disks from software that simply cannot deal with every 
kind of hardware fault. If a disk starts experiencing long delays, then after 
about 7 seconds the RAID controller marks the disk as failed (this is what 
TLER/ERC is for). If your RAID card behaves differently, you can try to decrease 
the OS timeouts (the disk driver will then return an I/O error to the 
filesystem). But in this case the complete brick will go offline, and you will 
definitely need a replicated setup.
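
As a sketch of the "decrease OS timeouts" option on Linux (device name and value 
are illustrative; the default SCSI command timer is usually 30 seconds):

# check the current command timeout for one disk
cat /sys/block/sdb/device/timeout
# lower it to roughly match TLER/ERC behaviour
echo 7 > /sys/block/sdb/device/timeout
# (make it persistent with a udev rule or an rc.local/tuned profile)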

> On 24 Nov 2016, at 9:12, Christian Rice wrote:
> 
> This is a long-standing problem for me, and I’m wondering how to insulate 
> myself from it…pardon the long-windedness in advance.
>  
> I use gluster internationally as regional repositories of files, and it’s 
> pretty constantly being rsync’d to (ie, written to solely by rsync, optimized 
> with –inplace or similar).
>  
> These regional repositories are also being read from, each to the tune of 
> 10-50MB/s.  Each gluster pool is anywhere between 4 to 16 servers, each with 
> one brick of RAID6, all pools in a distributed-only config.  I’m not 
> currently using distributed-replicated, but even that configuration is not 
> immune to my problem.
>  
> So, here’s the problem:
>  
> If one disk on one gluster brick experiences timeouts, all the gluster 
> clients block.  This is likely because the rate at which the disks are being 
> exercised by rsyncs (writes and stats) plus reads (client file access) causes 
> an overwhelming backlog of gluster ops, something perhaps is bottlenecked and 
> locking up, but in general it’s fairly useless to me.  Running a ‘df’ hangs 
> completely.
>  
> This has been an issue for me for years.  My usual procedure is to manually 
> fail the disk that’s experiencing timeouts, if it hasn’t been ejected already 
> by the raid controller, and remove the load from the gluster file system—it 
> only takes a fraction of a minute before the gluster volume recovers and I 
> can add the load back.  Rebuilding parity to the brick’s raid is not the 
> problem—it’s the moments before the disk ultimately fails that causes the 
> backlog of requests that really causes problems.
>  
> I’m looking for advice as to how to insulate myself from this problem better. 
>  My RAID cards don’t support modifying disk timeouts to be incredibly short.  
> I can see disk timeout messages from the raid card, and write an omprog 
> function to fail the disk, but that’s kinda brutal.  Maybe I could get a 
> different raid card that supports shorter timeouts or fast disk failures, but 
> if anyone has experience with, say md raid1 not having this problem, or 
> something similar, it might be worth the expense to go that route.
>  
> If my memory is correct, gluster still has this problem with a 
> distributed-replicated configuration, because writes need to succeed on both 
> leafs before an operation is considered complete, so a timeout on one node is 
> still detrimental.
>  
> Insight, experience designing around this, tunables I haven’t considered—I’ll 
> take anything.  I really like gluster, I’ll keep using it, but this is its 
> Achilles’ heel for me.  Is there a magic bullet?  Or do I just need to fail 
> faster?
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org 
> http://www.gluster.org/mailman/listinfo/gluster-users 
> 
--
Dmitry Glushenok
Jet Infosystems

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] corruption using gluster and iSCSI with LIO

2016-12-02 Thread Дмитрий Глушенок
>>>>>> Halting node2, the VM continues to work after a small "lag"/freeze.
>>>>>> I restarted node2 and it was back online: OK
>>>>>> 
>>>>>> Then, after waiting few minutes, halting node1. And **just** at this
>>>>>> moment, the VM is corrupted (segmentation fault, /var/log folder empty
>>>>>> etc.)
>>>>>> 
>>>>> Other than waiting a few minutes did you make sure heals had completed?
>>>>> 
>>>>>> 
>>>>>> dmesg of the VM:
>>>>>> 
>>>>>> [ 1645.852905] EXT4-fs error (device xvda1):
>>>>>> htree_dirblock_to_tree:988: inode #19: block 8286: comm bash: bad
>>>>>> entry in directory: rec_len is smaller than minimal - offset=0(0),
>>>>>> inode=0, rec_len=0, name_len=0
>>>>>> [ 1645.854509] Aborting journal on device xvda1-8.
>>>>>> [ 1645.855524] EXT4-fs (xvda1): Remounting filesystem read-only
>>>>>> 
>>>>>> And got a lot of " comm bash: bad entry in directory" messages then...
>>>>>> 
>>>>>> Here is the current config with all Node back online:
>>>>>> 
>>>>>> # gluster volume info
>>>>>> 
>>>>>> Volume Name: gv0
>>>>>> Type: Replicate
>>>>>> Volume ID: 5f15c919-57e3-4648-b20a-395d9fe3d7d6
>>>>>> Status: Started
>>>>>> Snapshot Count: 0
>>>>>> Number of Bricks: 1 x (2 + 1) = 3
>>>>>> Transport-type: tcp
>>>>>> Bricks:
>>>>>> Brick1: 10.0.0.1:/bricks/brick1/gv0
>>>>>> Brick2: 10.0.0.2:/bricks/brick1/gv0
>>>>>> Brick3: 10.0.0.3:/bricks/brick1/gv0 (arbiter)
>>>>>> Options Reconfigured:
>>>>>> nfs.disable: on
>>>>>> performance.readdir-ahead: on
>>>>>> transport.address-family: inet
>>>>>> features.shard: on
>>>>>> features.shard-block-size: 16MB
>>>>>> network.remote-dio: enable
>>>>>> cluster.eager-lock: enable
>>>>>> performance.io-cache: off
>>>>>> performance.read-ahead: off
>>>>>> performance.quick-read: off
>>>>>> performance.stat-prefetch: on
>>>>>> performance.strict-write-ordering: off
>>>>>> cluster.server-quorum-type: server
>>>>>> cluster.quorum-type: auto
>>>>>> cluster.data-self-heal: on
>>>>>> 
>>>>>> 
>>>>>> # gluster volume status
>>>>>> Status of volume: gv0
>>>>>> Gluster process TCP Port  RDMA Port  Online
>>>>>> Pid
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Brick 10.0.0.1:/bricks/brick1/gv0   49152 0  Y
>>>>>> 1331
>>>>>> Brick 10.0.0.2:/bricks/brick1/gv0   49152 0  Y
>>>>>> 2274
>>>>>> Brick 10.0.0.3:/bricks/brick1/gv0   49152 0  Y
>>>>>> 2355
>>>>>> Self-heal Daemon on localhost   N/A   N/AY
>>>>>> 2300
>>>>>> Self-heal Daemon on 10.0.0.3N/A   N/AY
>>>>>> 10530
>>>>>> Self-heal Daemon on 10.0.0.2N/A   N/AY
>>>>>> 2425
>>>>>> 
>>>>>> Task Status of Volume gv0
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> There are no active volume tasks
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Thu, Nov 17, 2016 at 11:35 PM, Olivier Lambert
>>>>>>  wrote:
>>>>>>> It's planned to have an arbiter soon :) It was just preliminary
>>>>>>> tests.
>>>>>>> 
>>>>>>> Thanks for the settings, I'll test this soon and I'll come back to
>>>>>>> you!
>>>>>>> 
>>>>>>> On Thu, Nov 17, 2016 at 11:29 PM, Lindsay Mathieson
>>>>>>>  wrote:
>>>>>>>> On 18/11/2016 8:17 AM, Olivier Lambert wrote:
>>>>>>>>> 
>>>>>>>>> gluster volume info gv0
>>>>>>>>> 
>>>>>>>>> Volume Name: gv0
>>>>>>>>> Type: Replicate
>>>>>>>>> Volume ID: 2f8658ed-0d9d-4a6f-a00b-96e9d3470b53
>>>>>>>>> Status: Started
>>>>>>>>> Snapshot Count: 0
>>>>>>>>> Number of Bricks: 1 x 2 = 2
>>>>>>>>> Transport-type: tcp
>>>>>>>>> Bricks:
>>>>>>>>> Brick1: 10.0.0.1:/bricks/brick1/gv0
>>>>>>>>> Brick2: 10.0.0.2:/bricks/brick1/gv0
>>>>>>>>> Options Reconfigured:
>>>>>>>>> nfs.disable: on
>>>>>>>>> performance.readdir-ahead: on
>>>>>>>>> transport.address-family: inet
>>>>>>>>> features.shard: on
>>>>>>>>> features.shard-block-size: 16MB
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> When hosting VM's its essential to set these options:
>>>>>>>> 
>>>>>>>> network.remote-dio: enable
>>>>>>>> cluster.eager-lock: enable
>>>>>>>> performance.io-cache: off
>>>>>>>> performance.read-ahead: off
>>>>>>>> performance.quick-read: off
>>>>>>>> performance.stat-prefetch: on
>>>>>>>> performance.strict-write-ordering: off
>>>>>>>> cluster.server-quorum-type: server
>>>>>>>> cluster.quorum-type: auto
>>>>>>>> cluster.data-self-heal: on
>>>>>>>> 
>>>>>>>> Also with replica two and quorum on (required) your volume will
>>>>>>>> become
>>>>>>>> read-only when one node goes down to prevent the possibility of
>>>>>>>> split-brain
>>>>>>>> - you *really* want to avoid that :)
>>>>>>>> 
>>>>>>>> I'd recommend a replica 3 volume, that way 1 node can go down, but
>>>>>>>> the
>>>>>>>> other
>>>>>>>> two still form a quorum and will remain r/w.
>>>>>>>> 
>>>>>>>> If the extra disks are not possible, then a Arbiter volume can be
>>>>>>>> setup
>>>>>>>> -
>>>>>>>> basically dummy files on the third node.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Lindsay Mathieson
>>>>>>>> 
>>>>>>>> ___
>>>>>>>> Gluster-users mailing list
>>>>>>>> Gluster-users@gluster.org
>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>>> ___
>>>>>> Gluster-users mailing list
>>>>>> Gluster-users@gluster.org
>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>> 
>>>>> 
>>> 
>>> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

--
Dmitry Glushenok
Jet Infosystems
+7-910-453-2568

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Minio as object storage

2016-09-28 Thread Дмитрий Глушенок
Hi,

I've tried Minio and Scality S3 (both as Docker containers). Neither of them 
gave me more than 60 MB/sec for a single stream.

--
Dmitry Glushenok
Jet Infosystems

> On 28 Sep 2016, at 1:04, Gandalf Corvotempesta wrote:
> 
> Anyone tried Minio as object storage over gluster?
> It mostly a one-liner:
> https://docs.minio.io/docs/minio-quickstart-guide
> 
> something like:
> ./minio server /mnt/my_gluster_volume
> 
> Having an Amazon S3 compatible object store could be great in some 
> environments
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Production cluster planning

2016-09-26 Thread Дмитрий Глушенок
Hi,

Red Hat only supports XFS for some reason: 
https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/Installation_Guide/sect-Prerequisites1.html
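
For reference, Red Hat's docs typically recommend creating bricks roughly like 
this (device and mount point are illustrative, and this is from memory - check 
the linked guide for the exact current recommendation):

mkfs.xfs -f -i size=512 /dev/vg_bricks/lv_brick1
mount -o inode64,noatime /dev/vg_bricks/lv_brick1 /bricks/brick1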

--
Dmitry Glushenok
Jet Infosystems

> On 26 Sep 2016, at 14:26, Lindsay Mathieson wrote:
> 
> On 26/09/2016 8:18 PM, Gandalf Corvotempesta wrote:
>> No one ?
>> And what about gluster on ZFS? Is that fully supported ?
> 
> I certainly hope so because I'm running a Replica 3 production cluster on ZFS 
> :)
> 
> -- 
> Lindsay Mathieson
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] auth.allow doesn't seem to work

2016-09-23 Thread Дмитрий Глушенок
Hi,

It looks like for NFS you have to change nfs.rpc-auth-allow, not auth.allow 
(which is for access via the API). The docs for nfs.rpc-auth-allow state that 
"By default, all clients are disallowed", but in fact the option has "all" as 
its default value.
Regarding auth.allow and information disclosure using FUSE - glusterd is not 
secure by design 
(https://www.gluster.org/pipermail/gluster-users/2016-July/027299.html). So 
iptables may be an option.
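
A sketch of both approaches for the volume from the quoted message (subnet, 
interface and port ranges are illustrative - adjust the brick port range to 
your setup):

# limit the Gluster NFS server to the private subnet
gluster volume set VMs nfs.rpc-auth-allow 10.10.0.*
# and/or drop Gluster traffic arriving on the public interface
iptables -A INPUT -i eth0 -p tcp --dport 24007:24008 -j DROP
iptables -A INPUT -i eth0 -p tcp --dport 49152:49251 -j DROP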

--
Dmitry Glushenok
Jet Infosystems

> On 23 Sep 2016, at 13:06, Kevin Lemonnier wrote:
> 
> Hi,
> 
> Using GlusterFS 3.7.15 on Debian 8 I'm trying to limit access using 
> auth.allow on my volume.
> I have 3 nodes in replication with both a public interface and a private 
> interface on each.
> Gluster uses the private IPs to communicate, but I noticed it was possible to 
> mount the volume
> from the internet (that's bad ..) so I googled a bit. auth.allow, if I 
> understand it correctly,
> should allow me to limit access of the volume to a list of IPs, is that 
> correct ?
> 
> I ran gluster volume set VMs auth.allow 10.10.0.* and it said success (it 
> does appear in the info),
> but I can still mount the volume from the internet. It works only using NFS 
> because using fuse it's
> trying to use the private adresses, which won't work on the internet, but it 
> still gets the volume
> config and the nodes names anyway.
> 
> Should I do something specific after setting auth.allow ?
> 
> Here is the volume info :
> 
> Volume Name: VMs
> Type: Replicate
> Volume ID: d0ee13f2-055c-4f37-9c75-527d5e86b78d
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: ips1adm.clientname:/mnt/storage/VMs
> Brick2: ips2adm.clientname:/mnt/storage/VMs
> Brick3: ips3adm.clientname:/mnt/storage/VMs
> Options Reconfigured:
> auth.allow: 10.10.0.*
> network.ping-timeout: 15
> cluster.data-self-heal-algorithm: full
> features.shard-block-size: 64MB
> features.shard: on
> performance.stat-prefetch: off
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> cluster.eager-lock: enable
> network.remote-dio: enable
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> performance.readdir-ahead: on
> 
> 
> -- 
> Kevin Lemonnier
> PGP Fingerprint : 89A5 2283 04A0 E6E9 0111
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] write performance with NIC bonding

2016-09-22 Thread Дмитрий Глушенок
Hi,

It is because your switch is not performing round-robin distribution when 
sending data to the server (probably it can't). Usually it is enough to 
configure IP+port LACP hashing to evenly distribute traffic across all ports in 
the aggregation. But any single TCP connection will still use only one 
interface.
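
On the Linux side, the matching policy would look roughly like this - a sketch 
assuming 802.3ad/LACP on both the host and the switch, with an illustrative 
file path:

# /etc/modprobe.d/bonding.conf
options bonding mode=802.3ad miimon=100 xmit_hash_policy=layer3+4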

--
Dmitry Glushenok
Jet Infosystems

> On 21 Sep 2016, at 23:42, James Ching wrote:
> 
> Hi,
> 
> I'm using gluster 3.7.5 and I'm trying to get port bonding working properly 
> with the gluster protocol.  I've bonded the NICs using round robin because I 
> also bond it at the switch level with link aggregation.  I've used this type 
> of bonding without a problem with my other applications but for some reason 
> gluster does not want to utilize all 3 NICs for writes but it does for 
> reads... any of you come across this or know why?  Here's the output of the 
> traffic on the NICs you can see that RX is unbalanced but TX is completely 
> balanced across the 3 NICs.  I've tried both mounting via glusterfs or nfs, 
> both result in the same imbalance. Am I missing some configuration?
> 
> 
> root@e-gluster-01:~# ifconfig
> bond0 Link encap:Ethernet
>  inet addr:  Bcast:128.33.23.255 Mask:255.255.248.0
>  inet6 addr: fe80::46a8:42ff:fe43:8817/64 Scope:Link
>  UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500 Metric:1
>  RX packets:160972852 errors:0 dropped:0 overruns:0 frame:0
>  TX packets:122295229 errors:0 dropped:0 overruns:0 carrier:0
>  collisions:0 txqueuelen:0
>  RX bytes:152800624950 (142.3 GiB)  TX bytes:138720356365 (129.1 GiB)
> 
> em1   Link encap:Ethernet
>  UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>  RX packets:160793725 errors:0 dropped:0 overruns:0 frame:0
>  TX packets:40763142 errors:0 dropped:0 overruns:0 carrier:0
>  collisions:0 txqueuelen:1000
>  RX bytes:152688146880 (142.2 GiB)  TX bytes:46239971255 (43.0 GiB)
>  Interrupt:41
> 
> em2   Link encap:Ethernet
>  UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>  RX packets:92451 errors:0 dropped:0 overruns:0 frame:0
>  TX packets:40750031 errors:0 dropped:0 overruns:0 carrier:0
>  collisions:0 txqueuelen:1000
>  RX bytes:9001370 (8.5 MiB)  TX bytes:46216513162 (43.0 GiB)
>  Interrupt:45
> 
> em3   Link encap:Ethernet
>  UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>  RX packets:86676 errors:0 dropped:0 overruns:0 frame:0
>  TX packets:40782056 errors:0 dropped:0 overruns:0 carrier:0
>  collisions:0 txqueuelen:1000
>  RX bytes:103476700 (98.6 MiB)  TX bytes:46263871948 (43.0 GiB)
>  Interrupt:40
> 
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Self healing does not see files to heal

2016-08-17 Thread Дмитрий Глушенок
You are right, stat triggers self-heal. Thank you!

--
Dmitry Glushenok
Jet Infosystems

> On 17 Aug 2016, at 13:38, Ravishankar N wrote:
> 
> On 08/17/2016 03:48 PM, Дмитрий Глушенок wrote:
>> Unfortunately not:
>> 
>> Remount FS, then access test file from second client:
>> 
>> [root@srv02 ~]# umount /mnt
>> [root@srv02 ~]# mount -t glusterfs srv01:/test01 /mnt
>> [root@srv02 ~]# ls -l /mnt/passwd 
>> -rw-r--r--. 1 root root 1505 авг 16 19:59 /mnt/passwd
>> [root@srv02 ~]# ls -l /R1/test01/
>> итого 4
>> -rw-r--r--. 2 root root 1505 авг 16 19:59 passwd
>> [root@srv02 ~]# 
>> 
>> Then remount FS and check if accessing the file from second node triggered 
>> self-heal on first node:
>> 
>> [root@srv01 ~]# umount /mnt
>> [root@srv01 ~]# mount -t glusterfs srv01:/test01 /mnt
>> [root@srv01 ~]# ls -l /mnt
> 
> Can you try `stat /mnt/passwd` from this node after remounting? You need to 
> explicitly lookup the file.  `ls -l /mnt`  is only triggering readdir on the 
> parent directory.
> If that doesn't work, is this mount connected to both clients? i.e. if you 
> create a new file from here, is it getting replicated to both bricks?
> 
> -Ravi
> 
>> итого 0
>> [root@srv01 ~]# ls -l /R1/test01/
>> итого 0
>> [root@srv01 ~]#
>> 
>> Nothing appeared.
>> 
>> [root@srv01 ~]# gluster volume info test01
>>  
>> Volume Name: test01
>> Type: Replicate
>> Volume ID: 2c227085-0b06-4804-805c-ea9c1bb11d8b
>> Status: Started
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: srv01:/R1/test01
>> Brick2: srv02:/R1/test01
>> Options Reconfigured:
>> features.scrub-freq: hourly
>> features.scrub: Active
>> features.bitrot: on
>> transport.address-family: inet
>> performance.readdir-ahead: on
>> nfs.disable: on
>> [root@srv01 ~]# 
>> 
>> [root@srv01 ~]# gluster volume get test01 all | grep heal
>> cluster.background-self-heal-count      8
>> cluster.metadata-self-heal              on
>> cluster.data-self-heal                  on
>> cluster.entry-self-heal                 on
>> cluster.self-heal-daemon                on
>> cluster.heal-timeout                    600
>> cluster.self-heal-window-size           1
>> cluster.data-self-heal-algorithm        (null)
>> cluster.self-heal-readdir-size          1KB
>> cluster.heal-wait-queue-length          128
>> features.lock-heal                      off
>> features.lock-heal                      off
>> storage.health-check-interval           30
>> features.ctr_lookupheal_link_timeout    300
>> features.ctr_lookupheal_inode_timeout   300
>> cluster.disperse-self-heal-daemon       enable
>> disperse.background-heals               8
>> disperse.heal-wait-qlength              128
>> cluster.heal-timeout                    600
>> cluster.granular-entry-heal             no
>> [root@srv01 ~]#
>> 
>> --
>> Dmitry Glushenok
>> Jet Infosystems
>> 
>>> On 17 Aug 2016, at 11:30, Ravishankar N <mailto:ravishan...@redhat.com> wrote:
>>> 
>>> On 08/17/2016 01:48 PM, Дмитрий Глушенок wrote:
>>>> Hello Ravi,
>>>> 
>>>> Thank you for reply. Found bug number (for those who will google the 
>>>> email) https://bugzilla.redhat.com/show_bug.cgi?id=1112158 
>>>> <https://bugzilla.redhat.com/show_bug.cgi?id=1112158>
>>>> 
>>>> Accessing the removed file from the mount point does not always work, 
>>>> because we have to find a special client for which DHT will point to the 
>>>> brick with the removed file. Otherwise the file will be accessed from the 
>>>> good brick and self-healing will not happen.

Re: [Gluster-users] Self healing does not see files to heal

2016-08-17 Thread Дмитрий Глушенок
Unfortunately not:

Remount FS, then access test file from second client:

[root@srv02 ~]# umount /mnt
[root@srv02 ~]# mount -t glusterfs srv01:/test01 /mnt
[root@srv02 ~]# ls -l /mnt/passwd 
-rw-r--r--. 1 root root 1505 авг 16 19:59 /mnt/passwd
[root@srv02 ~]# ls -l /R1/test01/
итого 4
-rw-r--r--. 2 root root 1505 авг 16 19:59 passwd
[root@srv02 ~]# 

Then remount FS and check if accessing the file from second node triggered 
self-heal on first node:

[root@srv01 ~]# umount /mnt
[root@srv01 ~]# mount -t glusterfs srv01:/test01 /mnt
[root@srv01 ~]# ls -l /mnt
итого 0
[root@srv01 ~]# ls -l /R1/test01/
итого 0
[root@srv01 ~]#

Nothing appeared.

[root@srv01 ~]# gluster volume info test01
 
Volume Name: test01
Type: Replicate
Volume ID: 2c227085-0b06-4804-805c-ea9c1bb11d8b
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: srv01:/R1/test01
Brick2: srv02:/R1/test01
Options Reconfigured:
features.scrub-freq: hourly
features.scrub: Active
features.bitrot: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
[root@srv01 ~]# 

[root@srv01 ~]# gluster volume get test01 all | grep heal
cluster.background-self-heal-count      8
cluster.metadata-self-heal              on
cluster.data-self-heal                  on
cluster.entry-self-heal                 on
cluster.self-heal-daemon                on
cluster.heal-timeout                    600
cluster.self-heal-window-size           1
cluster.data-self-heal-algorithm        (null)
cluster.self-heal-readdir-size          1KB
cluster.heal-wait-queue-length          128
features.lock-heal                      off
features.lock-heal                      off
storage.health-check-interval           30
features.ctr_lookupheal_link_timeout    300
features.ctr_lookupheal_inode_timeout   300
cluster.disperse-self-heal-daemon       enable
disperse.background-heals               8
disperse.heal-wait-qlength              128
cluster.heal-timeout                    600
cluster.granular-entry-heal             no
[root@srv01 ~]#

--
Dmitry Glushenok
Jet Infosystems

> On 17 Aug 2016, at 11:30, Ravishankar N wrote:
> 
> On 08/17/2016 01:48 PM, Дмитрий Глушенок wrote:
>> Hello Ravi,
>> 
>> Thank you for reply. Found bug number (for those who will google the email) 
>> https://bugzilla.redhat.com/show_bug.cgi?id=1112158 
>> <https://bugzilla.redhat.com/show_bug.cgi?id=1112158>
>> 
>> Accessing the removed file from the mount point does not always work, 
>> because we have to find a special client for which DHT will point to the 
>> brick with the removed file. Otherwise the file will be accessed from the 
>> good brick and self-healing will not happen (just verified). Or by accessing 
>> did you mean something like touch?
> 
> Sorry should have been more explicit. I meant triggering a lookup on that 
> file with `stat filename`. I don't think you need a special client. DHT sends 
> the lookup to AFR which in turn sends to all its children. When one of them 
> returns ENOENT (because you removed it from the brick), AFR will 
> automatically trigger heal. I'm guessing it is not always working in your 
> case due to caching at various levels and the lookup not coming till AFR. 
> If you do it from a fresh mount, it should always work.
> -Ravi
> 
>> Dmitry Glushenok
>> Jet Infosystems
>> 
>>> On 17 Aug 2016, at 4:24, Ravishankar N <mailto:ravishan...@redhat.com> wrote:
>>> 
>>> On 08/16/2016 10:44 PM, Дмитрий Глушенок wrote:
>>>> Hello,
>>>> 
>>>> While testing healing after bitrot error it was found that self healing 
>>>> cannot heal files which were manually deleted from brick. Gluster 3.8.1:
>>>> 
>>>> - Create volume, mount it locally and copy test file to it
>>>> [root@srv01 ~]# gluster volume create test01 replica 2  srv01:/R1/test01 
>>>> srv02:/R1/test01
>>>> volume create: test01: success: please start the volume to access data
>>>> [root@srv01 ~]# gluster volume start test01
>>>

Re: [Gluster-users] Self healing does not see files to heal

2016-08-17 Thread Дмитрий Глушенок
Hello Ravi,

Thank you for the reply. Found the bug number (for those who will google this email): 
https://bugzilla.redhat.com/show_bug.cgi?id=1112158

Accessing the removed file from the mount point does not always work, because we 
have to find a special client for which DHT will point to the brick with the 
removed file. Otherwise the file will be accessed from the good brick and 
self-healing will not happen (just verified). Or by accessing did you mean 
something like touch?

--
Dmitry Glushenok
Jet Infosystems

> On 17 Aug 2016, at 4:24, Ravishankar N wrote:
> 
> On 08/16/2016 10:44 PM, Дмитрий Глушенок wrote:
>> Hello,
>> 
>> While testing healing after bitrot error it was found that self healing 
>> cannot heal files which were manually deleted from brick. Gluster 3.8.1:
>> 
>> - Create volume, mount it locally and copy test file to it
>> [root@srv01 ~]# gluster volume create test01 replica 2  srv01:/R1/test01 
>> srv02:/R1/test01
>> volume create: test01: success: please start the volume to access data
>> [root@srv01 ~]# gluster volume start test01
>> volume start: test01: success
>> [root@srv01 ~]# mount -t glusterfs srv01:/test01 /mnt
>> [root@srv01 ~]# cp /etc/passwd /mnt
>> [root@srv01 ~]# ls -l /mnt
>> итого 2
>> -rw-r--r--. 1 root root 1505 авг 16 19:59 passwd
>> 
>> - Then remove test file from first brick like we have to do in case of 
>> bitrot error in the file
> 
> You also need to remove all hard-links to the corrupted file from the brick, 
> including the one in the .glusterfs folder.
> There is a bug in heal-full that prevents it from crawling all bricks of the 
> replica. The right way to heal the corrupted files as of now is to access 
> them from the mount-point like you did after removing the hard-links. The 
> list of files that are corrupted can be obtained with the scrub status 
> command.
> 
> Hope this helps,
> Ravi
> 
>> [root@srv01 ~]# rm /R1/test01/passwd
>> [root@srv01 ~]# ls -l /mnt
>> итого 0
>> [root@srv01 ~]#
>> 
>> - Issue full self heal
>> [root@srv01 ~]# gluster volume heal test01 full
>> Launching heal operation to perform full self heal on volume test01 has been 
>> successful
>> Use heal info commands to check status
>> [root@srv01 ~]# tail -2 /var/log/glusterfs/glustershd.log
>> [2016-08-16 16:59:56.483767] I [MSGID: 108026] 
>> [afr-self-heald.c:611:afr_shd_full_healer] 0-test01-replicate-0: starting 
>> full sweep on subvol test01-client-0
>> [2016-08-16 16:59:56.486560] I [MSGID: 108026] 
>> [afr-self-heald.c:621:afr_shd_full_healer] 0-test01-replicate-0: finished 
>> full sweep on subvol test01-client-0
>> 
>> - Now we still see no files in mount point (it becomes empty right after 
>> removing file from the brick)
>> [root@srv01 ~]# ls -l /mnt
>> итого 0
>> [root@srv01 ~]#
>> 
>> - Then try to access file by using full name (lookup-optimize and 
>> readdir-optimize are turned off by default). Now glusterfs shows the file!
>> [root@srv01 ~]# ls -l /mnt/passwd
>> -rw-r--r--. 1 root root 1505 авг 16 19:59 /mnt/passwd
>> 
>> - And it reappeared in the brick
>> [root@srv01 ~]# ls -l /R1/test01/
>> итого 4
>> -rw-r--r--. 2 root root 1505 авг 16 19:59 passwd
>> [root@srv01 ~]#
>> 
>> Is it a bug or we can tell self heal to scan all files on all bricks in the 
>> volume?
>> 
>> --
>> Dmitry Glushenok
>> Jet Infosystems
>> 
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org <mailto:Gluster-users@gluster.org>
>> http://www.gluster.org/mailman/listinfo/gluster-users 
>> <http://www.gluster.org/mailman/listinfo/gluster-users>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Self healing does not see files to heal

2016-08-16 Thread Дмитрий Глушенок
Hello,

While testing healing after a bitrot error it was found that self-healing cannot 
heal files which were manually deleted from a brick. Gluster 3.8.1:

- Create volume, mount it locally and copy test file to it
[root@srv01 ~]# gluster volume create test01 replica 2  srv01:/R1/test01 
srv02:/R1/test01 
volume create: test01: success: please start the volume to access data
[root@srv01 ~]# gluster volume start test01
volume start: test01: success
[root@srv01 ~]# mount -t glusterfs srv01:/test01 /mnt
[root@srv01 ~]# cp /etc/passwd /mnt
[root@srv01 ~]# ls -l /mnt
итого 2
-rw-r--r--. 1 root root 1505 авг 16 19:59 passwd

- Then remove test file from first brick like we have to do in case of bitrot 
error in the file
[root@srv01 ~]# rm /R1/test01/passwd 
[root@srv01 ~]# ls -l /mnt
итого 0
[root@srv01 ~]# 

- Issue full self heal
[root@srv01 ~]# gluster volume heal test01 full
Launching heal operation to perform full self heal on volume test01 has been 
successful 
Use heal info commands to check status
[root@srv01 ~]# tail -2 /var/log/glusterfs/glustershd.log
[2016-08-16 16:59:56.483767] I [MSGID: 108026] 
[afr-self-heald.c:611:afr_shd_full_healer] 0-test01-replicate-0: starting full 
sweep on subvol test01-client-0
[2016-08-16 16:59:56.486560] I [MSGID: 108026] 
[afr-self-heald.c:621:afr_shd_full_healer] 0-test01-replicate-0: finished full 
sweep on subvol test01-client-0

- Now we still see no files in mount point (it becomes empty right after 
removing file from the brick)
[root@srv01 ~]# ls -l /mnt
итого 0
[root@srv01 ~]# 

- Then try to access file by using full name (lookup-optimize and 
readdir-optimize are turned off by default). Now glusterfs shows the file!
[root@srv01 ~]# ls -l /mnt/passwd
-rw-r--r--. 1 root root 1505 авг 16 19:59 /mnt/passwd

- And it reappeared in the brick
[root@srv01 ~]# ls -l /R1/test01/
итого 4
-rw-r--r--. 2 root root 1505 авг 16 19:59 passwd
[root@srv01 ~]#

Is it a bug, or can we tell self-heal to scan all files on all bricks in the 
volume?

--
Dmitry Glushenok
Jet Infosystems

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster not saturating 10gb network

2016-08-09 Thread Дмитрий Глушенок
Hi,

Same problem on 3.8.1. Even on the loopback interface (traffic does not leave 
the gluster node):

Writing locally to replica 2 volume (each brick is separate local RAID6): 613 
MB/sec
Writing locally to 1-brick volume: 877 MB/sec
Writing locally to the brick itself (directly to XFS): 1400 MB/sec

Tests were performed using fio with the following settings:

bs=4096k
ioengine=libaio
iodepth=32
direct=0
runtime=600
directory=/R1
numjobs=1
rw=write
size=40g

Even with direct=1 the brick itself gives 1400 MB/sec.

1-brick volume profiling below:

# gluster volume profile test-data-03 info
Brick: gluster-01:/R1/test-data-03
---
Cumulative Stats:
   Block Size: 131072b+  262144b+ 
 No. of Reads:0 0 
No. of Writes:   88907220 
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls Fop
 -   ---   ---   ---   
  0.00   0.00 us   0.00 us   0.00 us  3 RELEASE
100.00 122.96 us  67.00 us   42493.00 us 208598   WRITE
 
Duration: 1605 seconds
   Data Read: 0 bytes
Data Written: 116537688064 bytes
 
Interval 0 Stats:
   Block Size: 131072b+  262144b+ 
 No. of Reads:0 0 
No. of Writes:   88907220 
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls Fop
 -   ---   ---   ---   
  0.00   0.00 us   0.00 us   0.00 us  3 RELEASE
100.00 122.96 us  67.00 us   42493.00 us 208598   WRITE
 
Duration: 1605 seconds
   Data Read: 0 bytes
Data Written: 116537688064 bytes
 
#

As you can see, all writes are performed using a 128 KB block size, and it looks 
like a bottleneck - which was discussed previously, by the way: 
http://www.gluster.org/pipermail/gluster-devel/2013-March/038821.html

Using GFAPI to access the volume shows better speed, but still far from the raw 
brick. fio tests with ioengine=gfapi give the following:

Writing locally to replica 2 volume (each brick is separate local RAID6): 680 
MB/sec
Writing locally to 1-brick volume: 960 MB/sec
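
For reference, a gfapi job roughly equivalent to the libaio settings above looks 
like this (a sketch, not the exact job file used; volume= and brick= are fio's 
gfapi engine options, with values matching the test volume here):

[gfapi-write]
ioengine=gfapi
volume=test-data-03
brick=gluster-01
bs=4096k
rw=write
size=40g
numjobs=1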


According to the 1-brick volume profile, 128 KB blocks are no longer used:

# gluster volume profile tzk-data-03 info
Brick: j-gluster-01.vcod.jet.su:/R1/tzk-data-03
---
Cumulative Stats:
   Block Size:4194304b+ 
 No. of Reads:0 
No. of Writes: 9211 
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls Fop
 -   ---   ---   ---   
100.002237.67 us1880.00 us5785.00 us   8701   WRITE
 
Duration: 49 seconds
   Data Read: 0 bytes
Data Written: 38633734144 bytes
 
Interval 0 Stats:
   Block Size:4194304b+ 
 No. of Reads:0 
No. of Writes: 9211 
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls Fop
 -   ---   ---   ---   
100.002237.67 us1880.00 us5785.00 us   8701   WRITE
 
Duration: 49 seconds
   Data Read: 0 bytes
Data Written: 38633734144 bytes
 
[root@j-gluster-01 ~]# 

So, it may be worth trying NFS-Ganesha with its GFAPI plugin.


> On 3 Aug 2016, at 9:40, Kaamesh Kamalaaharan wrote:
> 
> Hi , 
> I have gluster 3.6.2 installed on my server network. Due to internal issues 
> we are not allowed to upgrade the gluster version. All the clients are on the 
> same version of gluster. When transferring files  to/from the clients or 
> between my nodes over the 10gb network, the transfer rate is capped at 
> 450Mb/s .Is there any way to increase the transfer speeds for gluster mounts? 
> 
> Our server setup is as following:
> 
> 2 gluster servers -gfs1 and gfs2
>  volume name : gfsvolume
> 3 clients - hpc1, hpc2,hpc3
> gluster volume mounted on /export/gfsmount/
> 
> 
> 
> The following is the average results what i did so far:
> 
> 1) test bandwidth with iperf between all machines - 9.4 GiB/s
> 2) test write speed with dd 
> dd if=/dev/zero of=/export/gfsmount/testfile bs=1G count=1
> 
> result=399Mb/s
> 
> 3) test read speed with dd
> dd if=/export/gfsmount/testfile of=/dev/zero bs=1G count=1
> 
> result=284MB/s
> 
> My gluster volume configuration:
>  
> Volume Name: gfsvolume
> Type: Replicate
> Volume ID: a29bd2fb-b1ef-4481-be10-c2f4faf4059b
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: gfs1:/export/sda/brick
> Brick2: gfs2:/export/sda/brick
> Options Reconfigured:
> performance.quick-read: off
> network.ping-timeout: 30
> network.frame-timeout: 90
> performance.cache-max-file-size: 2MB
> cluster.server-quorum-type

Re: [Gluster-users] Bit rot disabled as default

2016-06-15 Thread Дмитрий Глушенок
Sharding almost solves the problem (for inactive blocks), but it was declared 
stable just today :)

http://blog.gluster.org/2016/06/glusterfs-3-8-released/ 
<http://blog.gluster.org/2016/06/glusterfs-3-8-released/>
- Sharding is now stable for VM image storage.
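
For reference, enabling it takes just two volume options (volume name 
illustrative; note that sharding only applies to files created after it is 
enabled):

gluster volume set myvol features.shard on
gluster volume set myvol features.shard-block-size 64MB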

--
Dmitry Glushenok
Jet Infosystems

> On 15 Jun 2016, at 19:42, Gandalf Corvotempesta wrote:
> 
> 2016-06-15 18:12 GMT+02:00 Дмитрий Глушенок :
>> Hello.
>> 
>> Maybe because of the current implementation of bit rot detection - one hash
>> for the whole file. Imagine a 40 GB VM image - a few parts of the image are
>> modified continuously (VM log files and application data are constantly
>> changing). Those writes invalidate the checksum and BitD has to recalculate
>> it endlessly. As a result, the checksum of the VM image can never be verified.
> 
> I think you are right
> But what about sharding? In this case, the hash should be created for
> each shard and not the whole file.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Bit rot disabled as default

2016-06-15 Thread Дмитрий Глушенок
Hello.

Maybe because of the current implementation of bit rot detection - one hash for 
the whole file. Imagine a 40 GB VM image - a few parts of the image are modified 
continuously (VM log files and application data are constantly changing). Those 
writes invalidate the checksum and BitD has to recalculate it endlessly. As a 
result, the checksum of the VM image can never be verified.
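
For completeness, the feature and scrubbing are controlled like this (volume 
name illustrative; these are the standard bitrot CLI commands):

gluster volume bitrot myvol enable
gluster volume bitrot myvol scrub-frequency daily
gluster volume bitrot myvol scrub status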

> On 15 Jun 2016, at 9:37, Gandalf Corvotempesta wrote:
> 
> I was looking at docs.
> why bit rot protection is disabled by defaults?
> with huge files like a qcow image a bit rot could lead to the whole image 
> corrupted and replicated to the whole cluster
> 
> Any drawbacks with bit rot detection to explain the default to off?
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

--
Dmitry Glushenok
Jet Infosystems

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users