On Thu, Mar 10, 2016 at 4:52 PM, Lindsay Mathieson <lindsay.mathie...@gmail.com> wrote:
> On 11/03/2016 2:24 AM, David Gossage wrote:
>
>> It is file-based, not block-based, healing, so it saw multi-GB files
>> that it had to recopy in full. It had to halt all writes to those files
>> while that occurred, or it would have been a never-ending cycle of
>> re-copying the large images, so the fact that most VMs went haywire
>> isn't that odd. Based on the timing of the alerts, it does look like
>> the 2 bricks that stayed up kept serving images until the 3rd brick
>> came back. It did heal all images just fine.
>>
>
> What version are you running? 3.7.x has sharding (it breaks large files
> into chunks) to allow much finer-grained healing, which speeds up heals
> a *lot*. However, it can't be applied retroactively; you have to enable
> sharding and then copy the VM over :(
>
> http://blog.gluster.org/2015/12/introducing-shard-translator/

Yes, I am on 3.7 and was planning on testing that out soon at the office:
attach an NFS mount and move the disks off and back on until all of them
are sharded (the relevant commands are sketched at the end of this mail).

> In regards to rolling reboots, it can be done with replicated storage,
> and gluster will transparently hand over client reads/writes, but for
> each VM image only one copy at a time can be healing; otherwise access
> will be blocked, as you saw.
>
> So the recommended procedure is:
> - Enable sharding
> - Copy the VMs over
> - When rebooting, wait for heals to complete before rebooting the next
>   node
>
The odd thing is that I only rebooted the one node, so I was expecting
only one copy to need healing (the one on the rebooted node) and the
other 2 to keep handling writes during the heal. However, that was not
what happened.

> nb: I thoroughly recommend 3-way replication, as you have done; it saves
> a lot of headaches with quorum and split-brain.
>
> --
> Lindsay Mathieson
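PS for anyone else following along: enabling sharding is a per-volume
option. A minimal sketch, assuming a volume named "myvol" (the volume
name and the block size here are placeholders, not from this thread):

    gluster volume set myvol features.shard on
    gluster volume set myvol features.shard-block-size 64MB

Note it only applies to files written after the option is set, hence the
copy-off/copy-on dance above.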
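And for the "wait for heals to complete" step, something like the loop
below works as a rough sketch. Again "myvol" is a placeholder, and the
grep pattern assumes the "Number of entries: N" lines that "heal info"
prints in 3.7; double-check it against your version's output:

    #!/bin/bash
    VOL=myvol   # placeholder volume name
    # Block until every brick reports zero pending heal entries;
    # only then is it safe to reboot the next node.
    while gluster volume heal "$VOL" info | grep -q '^Number of entries: [1-9]'; do
        echo "heals still pending on $VOL, waiting..."
        sleep 60
    done
    echo "no pending heals on $VOL; OK to reboot the next node"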
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users