Re: [Gluster-users] GlusterFS as virtual machine storage

2017-08-25 Thread Gionatan Danti

On 26-08-2017 07:38, Gionatan Danti wrote:

I'll surely give a look at the documentation. I have the "bad" habit
of not putting into production anything I know how to repair/cope
with.

Thanks.


Mmmm, this should read as:

"I have the "bad" habit of not putting into production anything I do NOT 
know how to repair/cope with"


Really :D

Thanks.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.da...@assyoma.it - i...@assyoma.it
GPG public key ID: FF5F32A8
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] GlusterFS as virtual machine storage

2017-08-25 Thread Gionatan Danti

On 26-08-2017 01:13, WK wrote:

Big +1 on what Kevin just said. Just avoiding the problem is the
best strategy.


Ok, never run Gluster with anything less than a replica2 + arbiter ;)
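
For anyone following along, the layout being recommended here is created by
passing the arbiter count on the create command. A minimal sketch, with
placeholder host names and brick paths:

# gluster volume create vmvol replica 3 arbiter 1 \
      node1:/bricks/vmvol node2:/bricks/vmvol arb1:/bricks/vmvol-arbiter
# gluster volume start vmvol

The third brick only stores file names and metadata, so the arbiter node can
get by with far less disk than the two data nodes.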


However, for the record,  and if you really, really want to get deep
into the weeds on the subject, then the  Gluster people have docs on
Split-Brain recovery.

https://gluster.readthedocs.io/en/latest/Troubleshooting/split-brain/

and if you Google the topic, there are a lot of other blog posts,
emails, etc that discuss it.

I'd recommend reviewing those as well to wrap your head around what is 
going on.


I'll surely give a look at the documentation. I have the "bad" habit of 
not putting into production anything I know how to repair/cope with.


Thanks.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.da...@assyoma.it - i...@assyoma.it
GPG public key ID: FF5F32A8
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-08-25 Thread WK

On 8/25/2017 2:21 PM, lemonni...@ulrar.net wrote:

This concerns me, and it is the reason I would like to avoid sharding.
How can I recover from such a situation? How can I "decide" which
(reconstructed) file is the one to keep rather than delete?


No need, on a replica 3 that just doesn't happen. That's the main
advantage of it, that and the fact that you can perform operations on
your servers without having the volume go down.

For a replica 2 though, it will happen. With or without sharding the
operation is the same, it involves fiddling with gfids and is a bit
annoying, but not that hard for one file. But with sharding enabled
you'll need to pick each split-brained shard out, which I imagine is a
huge pain. Again, just don't do 2 nodes, it's a _bad_ idea. Add at the
very least an arbiter.
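
For a rough idea of what that per-file recovery looks like (a sketch only,
with placeholder volume, brick and file names - the real procedure is in the
split-brain docs linked below):

# gluster volume heal vmvol info split-brain
(lists the affected files or gfids)
# gluster volume heal vmvol split-brain latest-mtime /images/vm1.img
(or pick the winning copy explicitly)
# gluster volume heal vmvol split-brain source-brick node1:/bricks/vmvol /images/vm1.img

With sharding enabled the same choice has to be repeated for every affected
shard under .shard/ on the bricks, which is exactly the pain being described.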




Big +1 on what Kevin just said. Just avoiding the problem is the 
best strategy.


However, for the record,  and if you really, really want to get deep 
into the weeds on the subject, then the  Gluster people have docs on 
Split-Brain recovery.


https://gluster.readthedocs.io/en/latest/Troubleshooting/split-brain/

and if you Google the topic, there are a lot of other blog posts, 
emails, etc that discuss it.


I'd recommend reviewing those as well to wrap your head around what is 
going on.




___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] GlusterFS as virtual machine storage

2017-08-25 Thread Gionatan Danti

On 25-08-2017 21:48, WK wrote:

On 8/25/2017 12:56 AM, Gionatan Danti wrote:


We ran Rep2 for years on 3.4. It does work if you are really, really
careful, but in a crash on one side you might have lost some bits
that were in flight. The VM would then try to heal.
Without sharding, big VMs take a while because the WHOLE VM file has
to be copied over. Then you might get split-brain and have to stop the
VM, pick the good one, make sure that is healed on both sides and then
restart the VM.


Ok, so sharding needs to be enabled for VM disk storage, otherwise heal 
time skyrockets.
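
For reference, sharding is a per-volume option; a minimal sketch with a
placeholder volume name and the default 64MB shard size:

# gluster volume set vmvol features.shard on
# gluster volume set vmvol features.shard-block-size 64MB

Note that only files written after the option is enabled get sharded, and the
option should never be turned off again on a volume that already holds
sharded files.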



Arbiter/Replica 3 prevents that. Sharding helps a lot as well by
making the heals really quick, though in a Replica 2 with sharding you
no longer have a nice big  .img file sitting on each brick in plain
view and picking a split-brain winner is now WAY more complicated. You
would have to re-assemble things.


This concerns me, and it is the reason I would like to avoid sharding. 
How can I recover from such a situation? How can I "decide" which 
(reconstructed) file is the one to keep rather than delete?




We were quite good at fixing broken Gluster 3.4 nodes, but we are
*much* happier with the Arbiter node and sharding. It is a huge
difference.
We could go to Rep3 but we like the extra speed and we are comfortable
with the Arb limitations (we also have excellent off-cluster backups).


Also, on a two-node setup, is it *guaranteed* that updates to one node 
will put the whole volume offline?


If you still have quorum turned on, then yes. One side goes and you are 
down.


On the other hand, is a 3-way setup (or 2+arbiter) free from all these 
problems?




Yes, you can lose one of the three nodes and after the pause,
everything just continues. If you have a second failure before you can
recover, then you have lost quorum.

If that second failure is the other actual replica, then you could get
into a situation where the arbiter isn't happy with either copy when
you come back up and of course the arbiter doesn't have a good copy
itself. Pavel alluded to something like that when describing his
problem.

That is where replica 3 helps. In theory, with replica 3, you could
lose 2 nodes and still have a reasonable copy of your VM, though
you've lost quorum and are still down. At that point, *I* would kill
the two bad nodes (STONITH) to prevent them from coming back AND turn
off quorum. You could then run on the single node until you can
save/copy those VM images, preferably by migrating off that volume
completely. Create a remote pool using SSHFS if you have nothing else
available. THEN I would go back and fix the gluster cluster and
migrate back into it.

Replica2/Replica3 does not matter if you lose your Gluster network
switch, but again the Arb or Rep3 setup makes it easier to recover. I
suppose the only advantage of Replica2 is that you can use a cross
over cable and not worry about losing the switch, but bonding/teaming
works well and there are bonding modes that don't require the same
switch for the bond slaves. So you can build in some redundancy there
as well.


Thank you for the very valuable information.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.da...@assyoma.it - i...@assyoma.it
GPG public key ID: FF5F32A8
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-08-25 Thread Gionatan Danti

On 25-08-2017 21:43, lemonni...@ulrar.net wrote:

I think you are talking about DRBD 8, which is indeed very easy. DRBD 9
on the other hand, which is the one that compares to gluster (more or
less), is a whole other story. Never managed to make it work correctly
either


Oh yes, absolutely: DRBD version 8.4.x.
In my opinion, DRBD 9.x needs to mature.

Thanks.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.da...@assyoma.it - i...@assyoma.it
GPG public key ID: FF5F32A8
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] GlusterFS as virtual machine storage

2017-08-25 Thread WK



On 8/25/2017 12:43 PM, lemonni...@ulrar.net wrote:


I think you are talking about DRBD 8, which is indeed very easy. DRBD 9
on the other hand, which is the one that compares to gluster (more or
less), is a whole other story. Never managed to make it work correctly
either




Yes, and I noticed that Digimer's Anvil project still uses DRBD8 as well.

https://www.alteeve.com/w/Build_an_m2_Anvil

If she is still using DRBD8 on her stuff, then you know that DRBD9 isn't 
fully baked yet.


That being said, we use DRBD on projects such as NFS servers and have found 
it to be reliable, easy to use and useful, as long as we stayed with 
active/passive.


Our experiments with active/active were very unsatisfactory. OCFS2 in 
particular was very unhappy with lots of disk i/o, even just stats. We 
tried GFS2 as well but bailed due to stability (lockup) issues (probably 
our fault, but still not worth the effort).


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] GlusterFS as virtual machine storage

2017-08-25 Thread WK



On 8/25/2017 12:56 AM, Gionatan Danti wrote:




WK wrote:
2 node plus Arbiter. You NEED the arbiter or a third node. Do NOT try 2
node with a VM


This is true even if I manage locking at application level (via 
virlock or sanlock)?



We ran Rep2 for years on 3.4. It does work if you are really, really 
careful, but in a crash on one side you might have lost some bits that 
were in flight. The VM would then try to heal.
Without sharding, big VMs take a while because the WHOLE VM file has to 
be copied over. Then you might get split-brain and have to stop the VM, 
pick the good one, make sure that is healed on both sides and then 
restart the VM.


Arbiter/Replica 3 prevents that. Sharding helps a lot as well by making 
the heals really quick, though in a Replica 2 with sharding you no 
longer have a nice big  .img file sitting on each brick in plain view 
and picking a split-brain winner is now WAY more complicated. You would 
have to re-assemble things.


We were quite good at fixing broken Gluster 3.4 nodes, but we are 
*much* happier with the Arbiter node and sharding. It is a huge difference.
We could go to Rep3 but we like the extra speed and we are comfortable 
with the Arb limitations (we also have excellent off-cluster backups).



Also, on a two-node setup, is it *guaranteed* that updates to one node 
will put the whole volume offline?


If you still have quorum turned on, then yes. One side goes and you are 
down.
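
For reference, the quorum behaviour described here comes from a couple of
volume options; a minimal sketch with a placeholder volume name:

# gluster volume set vmvol cluster.quorum-type auto
(client-side quorum)
# gluster volume set vmvol cluster.server-quorum-type server
# gluster volume set all cluster.server-quorum-ratio 51%
(glusterd-side quorum)

On a plain two-node replica these settings can take the volume down when a
node is lost; with replica 3 or an arbiter a single failure still leaves a
majority.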


On the other hand, is a 3-way setup (or 2+arbiter) free from all these 
problems?




Yes, you can lose one of the three nodes and after the pause, everything 
just continues. If you have a second failure before you can recover, 
then you have lost quorum.


If that second failure is the other actual replica, then you could get 
into a situation where the arbiter isn't happy with either copy when you 
come back up and of course the arbiter doesn't have a good copy itself. 
Pavel alluded to something like that when describing his problem.


That is where replica 3 helps. In theory, with replica 3, you could lose 
2 nodes and still have a reasonable copy of your VM, though you've lost 
quorum and are still down. At that point, *I* would kill the two bad 
nodes (STONITH) to prevent them from coming back AND turn off quorum. 
You could then run on the single node until you can save/copy those VM 
images, preferably by migrating off that volume completely. Create a 
remote pool using SSHFS if you have nothing else available. THEN I would 
go back and fix the gluster cluster and migrate back into it.
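
A sketch of that last-resort SSHFS idea, with placeholder host, user and
paths (anything that gets the images off the degraded volume works; SSHFS
just needs nothing more than an SSH account on the far side):

# mkdir -p /mnt/rescue
# sshfs backup@remotebox:/srv/vm-rescue /mnt/rescue -o reconnect
# rsync -avP /mnt/vmvol/images/ /mnt/rescue/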


Replica2/Replica3 does not matter if you lose your Gluster network 
switch, but again the Arb or Rep3 setup makes it easier to recover. I 
suppose the only advantage of Replica2 is that you can use a crossover 
cable and not worry about losing the switch, but bonding/teaming works 
well and there are bonding modes that don't require the same switch for 
the bond slaves. So you can build in some redundancy there as well.
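
As an illustration of the bonding point: active-backup (mode 1) needs no
switch cooperation, so the two slaves can hang off different switches. A
sketch for a distribution using ifcfg-style network scripts (interface names
and addresses are placeholders):

/etc/sysconfig/network-scripts/ifcfg-bond0:
    DEVICE=bond0
    BONDING_OPTS="mode=active-backup miimon=100"
    IPADDR=10.10.10.11
    NETMASK=255.255.255.0
    ONBOOT=yes

/etc/sysconfig/network-scripts/ifcfg-eth1 (and likewise for eth2):
    DEVICE=eth1
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes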



___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-08-25 Thread lemonnierk
> 
> This surprise me: I found DRBD quite simple to use, albeit I mostly use 
> active/passive setup in production (with manual failover)
> 

I think you are talking about DRBD 8, which is indeed very easy. DRBD 9
on the other hand, which is the one that compares to gluster (more or
less), is a whole other story. Never managed to make it work correctly
either


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-08-25 Thread Gionatan Danti

On 25-08-2017 14:22, Lindsay Mathieson wrote:

On 25/08/2017 6:50 PM, lemonni...@ulrar.net wrote:

I run Replica 3 VM hosting (gfapi) via a 3 node proxmox cluster. Have
done a lot of rolling node updates, power failures etc, never had a
problem. Performance is better than any other DFS I've tried (Ceph,
lizard/moose).


Hi, very interesting! Are you using client or server quorum?


Never did get DRBD working.


This surprises me: I found DRBD quite simple to use, albeit I mostly use 
an active/passive setup in production (with manual failover).
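
For context, failover on a DRBD 8.4 active/passive pair really is just a
couple of drbdadm calls; a sketch with a placeholder resource name and mount
point:

On the node giving up the service (if it is still alive):
# umount /export
# drbdadm secondary r0

On the node taking over:
# drbdadm primary r0
# mount /dev/drbd0 /export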



nb: ZFS Bricks, with each brick RAID10 - so a little paranoid on the
redundancy :)


Yeah, I remember you on the zfs-discuss mailing list ;)


For me, Gluster's biggest problem is its lack of flexibility in adding
bricks and nodes. And replacing them is an exercise in nail-biting.
Hoping V4 improves on this, though maybe that will lead to performance
trade-offs.


Can you elaborate? What are the biggest problems/inconveniences?

Thanks.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.da...@assyoma.it - i...@assyoma.it
GPG public key ID: FF5F32A8
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Gluster 4.0: Update

2017-08-25 Thread Amye Scavarda
On Fri, Aug 25, 2017 at 2:53 AM, Niels de Vos  wrote:

> On Fri, Aug 25, 2017 at 12:21:08PM +1000, Lindsay Mathieson wrote:
> > >
> > > This feature (and the patent) is from facebook folks.
> > >
> > >
> > Does that mean its not a problem?
>
> Facebook contributed the patches for this feature, so I don't think
> there is a problem.
>
> Niels (not a lawyer)
> ___
> Gluster-devel mailing list
> gluster-de...@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>

A quick note on this: we're delighted to have Facebook contributing their
work to the project overall.
The patent is unrelated to their decision to contribute this work, and Halo
replication is currently experimental in 3.11. However, more testing and
users looking at this are always welcome.

Hope that helps, happy to clarify off list.
- amye

-- 
Amye Scavarda | a...@redhat.com | Gluster Community Lead
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-08-25 Thread Gionatan Danti

On 23-08-2017 18:51, Gionatan Danti wrote:

On 23-08-2017 18:14, Pavel Szalbot wrote:

Hi, after many VM crashes during upgrades of Gluster, losing network
connectivity on one node etc. I would advise running replica 2 with
arbiter.


Hi Pavel, this is bad news :(
So, in your case at least, Gluster was not stable? Something as simple
as an update would let it crash?


I once even managed to break this setup (with arbiter) due to network
partitioning - one data node never healed and I had to restore from
backups (it was easier and kind of non-production). Be extremely
careful and plan for failure.


I would use VM locking via sanlock or virtlock, so a split brain
should not cause simultaneous changes on both replicas. I am more
concerned about volume heal time: what will happen if the standby node
crashes/reboots? Will *all* data be re-synced from the master, or only
the changed bits? As stated above, I would like to avoid
using sharding...

Thanks.


Hi all,
any other advice from those who use (or do not use) Gluster as a replicated 
VM backend?


Thanks.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.da...@assyoma.it - i...@assyoma.it
GPG public key ID: FF5F32A8
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Backup and Restore strategy in Gluster FS 3.8.4

2017-08-25 Thread Sunil Aggarwal
Hi,

What is the preferred way of taking a GlusterFS backup?

I am using GlusterFS 3.8.4.

We've configured Gluster on a thick-provisioned LV, in which 50% of the VG is
kept free for the LVM snapshot.

Is it any different than taking a snapshot on a thin-provisioned LV?
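
For what it's worth, the snapshot mechanics differ slightly between the two
cases; a minimal sketch with placeholder VG/LV names:

Thick LV - the snapshot needs its own preallocated space, which is why the
VG headroom is kept free:
# lvcreate -s -n brick_snap -L 200G /dev/vg_gluster/brick_lv

Thin LV - the snapshot allocates from the same thin pool on demand, so no
reserved headroom is needed:
# lvcreate -s -n brick_snap vg_gluster/thin_brick_lv

Note also that Gluster's own snapshot feature (gluster snapshot create ...)
requires the bricks to sit on thinly provisioned LVs.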

-- 
Thanks,
Sunil Aggarwal
844-734-5346
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 3.8 Upgrade to 3.10

2017-08-25 Thread Lindsay Mathieson

On 25/08/2017 11:41 PM, Shyam Ranganathan wrote:


3.8 will receive no further updates once 3.12 is released. If the 
cluster is stable and you are not waiting on any fixes, then staying 
on a bit longer may not hurt.



Yes, issues raised on 3.8 will *possibly* receive an answer which 
would be, "please upgrade and let us know if the problem persists".



It should not, given your usage; the problem exists in the 3.8 
version as well, and in the latest 3.10 releases the last known problems 
regarding this corruption are fixed. We are just awaiting 
confirmation from users to remove the note.



The upgrade notes [1] cover this; in short, an offline upgrade is not 
needed in your setup, as disperse is not a part of the stack.


Shyam

[1] 3.10 upgrade guide: 
https://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/



Thanks Shyam, that covers everything.


--
Lindsay Mathieson

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Rolling upgrade from 3.6.3 to 3.10.5

2017-08-25 Thread Yong Tseng
Hi Diego,

That's valuable information to know. Thanks for the input!

On Fri, Aug 25, 2017 at 9:08 PM, Diego Remolina  wrote:

> Yes, I did an offline upgrade.
>
> 1. Stop all clients using gluster servers.
> 2. Stop glusterfsd and glusterd on both servers.
> 3. Backed up /var/lib/gluster* in all servers just to be safe.
> 4. Upgraded all servers from 3.6.x to 3.10.x (I did not have quotas or
> anything that required special steps)
> 5. Started gluster daemons again and confirmed everything was fine
> prior to letting clients connect.
> 5. Ran 3.10.x with the older op version for a few days to make sure
> all was OK (Not all was OK for me, but that may be a samba issue as I
> use it as a file server).
> 6. Upgraded the op version to maximum available.
>
> In my case, I have two servers with bricks and one server that acts as
> a witness.
>
> HTH,
>
> Diego
>
> On Fri, Aug 25, 2017 at 8:56 AM, Yong Tseng  wrote:
> > Hi Diego,
> >
> > Just to clarify, so did you do an offline upgrade with an existing
> cluster
> > (3.6.x => 3.10.x)?
> > Thanks.
> >
> > On Fri, Aug 25, 2017 at 8:42 PM, Diego Remolina 
> wrote:
> >>
> >> I was never able to go from 3.6.x to 3.7.x without downtime. Then
> >> 3.7.x did not work well for me, so I stuck with 3.6.x until recently.
> >> I went from 3.6.x to 3.10.x but downtime was scheduled.
> >>
> >> Diego
> >>
> >> On Fri, Aug 25, 2017 at 8:25 AM, Yong Tseng 
> wrote:
> >> > Hi Diego,
> >> >
> >> > Thanks for the information. I tried only setting 'allow-insecure on'
> but
> >> > nada.
> >> > The sentence "If you are using GlusterFS version 3.4.x or below, you
> can
> >> > upgrade it to following" in documentation is surely misleading.
> >> > So would you suggest creating a new 3.10 cluster from scratch then
> >> > rsync(?)
> >> > the data from old cluster to the new?
> >> >
> >> >
> >> > On Fri, Aug 25, 2017 at 7:53 PM, Diego Remolina 
> >> > wrote:
> >> >>
> >> >> You cannot do a rolling upgrade from 3.6.x to 3.10.x You will need
> >> >> downtime.
> >> >>
> >> >> Even 3.6 to 3.7 was not possible... see some references to it below:
> >> >>
> >> >> https://marc.info/?l=gluster-users=145136214452772=2
> >> >> https://gluster.readthedocs.io/en/latest/release-notes/3.7.1/
> >> >>
> >> >> # gluster volume set  server.allow-insecure on Edit
> >> >> /etc/glusterfs/glusterd.vol to contain this line: option
> >> >> rpc-auth-allow-insecure on
> >> >>
> >> >> Post 1, restarting the volume would be necessary:
> >> >>
> >> >> # gluster volume stop 
> >> >> # gluster volume start 
> >> >>
> >> >>
> >> >> HTH,
> >> >>
> >> >> Diego
> >> >>
> >> >> On Fri, Aug 25, 2017 at 7:46 AM, Yong Tseng 
> >> >> wrote:
> >> >> > Hi all,
> >> >> >
> >> >> > I'm currently in process of upgrading a replicated cluster (1 x 4)
> >> >> > from
> >> >> > 3.6.3 to 3.10.5. The nodes run CentOS 6. However after upgrading
> the
> >> >> > first
> >> >> > node, the said node fails to connect to other peers (as seen via
> >> >> > 'gluster
> >> >> > peer status'), but somehow other non-upgraded peers can still see
> the
> >> >> > upgraded peer as connected.
> >> >> >
> >> >> > Writes to the Gluster volume via local mounts of non-upgraded peers
> >> >> > are
> >> >> > replicated to the upgraded peer, but I can't write via the upgraded
> >> >> > peer
> >> >> > as
> >> >> > the local mount seems forced to read-only.
> >> >> >
> >> >> > Launching heal operations from non-upgraded peers will output
> 'Commit
> >> >> > failed
> >> >> > on . Please check log for details'.
> >> >> >
> >> >> > In addition, during upgrade process there were warning messages
> about
> >> >> > my
> >> >> > old
> >> >> > vol files renamed with .rpmsave extension. I tried starting Gluster
> >> >> > with
> >> >> > my
> >> >> > old vol files but the problem persisted. I tried generating new vol
> >> >> > files
> >> >> > with 'glusterd --xlator-option "*.upgrade=on" -N', still no avail.
> >> >> >
> >> >> > Also I checked the brick log it had several messages about "failed
> to
> >> >> > get
> >> >> > client opversion". I don't know if this is pertinent. Could it be
> >> >> > that
> >> >> > the
> >> >> > upgraded node cannot connect to older nodes but still can receive
> >> >> > instructions from them?
> >> >> >
> >> >> > Below are command outputs; some data are masked.
> >> >> > I'd provide more information if required.
> >> >> > Thanks in advance.
> >> >> >
> >> >> > ===> 'gluster volume status' ran on non-upgraded peers
> >> >> >
> >> >> > Status of volume: gsnfs
> >> >> > Gluster process Port
> >> >> > Online
> >> >> > Pid
> >> >> >
> >> >> >
> >> >> > 
> --
> >> >> > Brick gs-nfs01:/ftpdata 49154   Y
> >> >> > 2931
> >> >> > Brick gs-nfs02:/ftpdata 49152 

Re: [Gluster-users] 3.8 Upgrade to 3.10

2017-08-25 Thread Shyam Ranganathan

On 08/25/2017 09:17 AM, Lindsay Mathieson wrote:
Currently running 3.8.12, planning to do a rolling upgrade to 3.8.15 this 
weekend.


  * debian 8
  * 3 nodes
  * Replica 3
  * Sharded
  * VM Hosting only

The release notes strongly recommend upgrading to 3.10

  * Is there any downside to staying on 3.8.15 for a while longer?


3.8 will receive no further updates once 3.12 is released. If the 
cluster is stable and you are not waiting on any fixes, then staying on 
a bit longer may not hurt.



  * I didn't see anything I had to have in 3.10, but ongoing updates are
always good :(


Yes, issues raised on 3.8 will *possibly* receive an answer which would 
be, "please upgrade and let us know if the problem persists".




This mildly concerned me:

  * Expanding a gluster volume that is sharded may cause file corruption

But I have no plans to expand or change the volume, so it shouldn't be an issue?


It should not, given your usage; the problem exists in the 3.8 
version as well, and in the latest 3.10 releases the last known problems 
regarding this corruption are fixed. We are just awaiting 
confirmation from users to remove the note.





Upgrading

  * Can I go straight from 3.8 to 3.10?
  * Do I need to offline the volume first?


The upgrade notes [1] cover this; in short, an offline upgrade is not 
needed in your setup, as disperse is not a part of the stack.


Shyam

[1] 3.10 upgrade guide: 
https://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] 3.8 Upgrade to 3.10

2017-08-25 Thread Lindsay Mathieson
Currently running 3.8.12, planning to do a rolling upgrade to 3.8.15 this 
weekend.


 * debian 8
 * 3 nodes
 * Replica 3
 * Sharded
 * VM Hosting only

The release notes strongly recommend upgrading to 3.10

 * Is there any downside to staying on 3.8.15 for a while longer?
 * I didn't see anything I had to have in 3.10, but ongoing updates are
   always good :(

This mildly concerned me:

 * Expanding a gluster volume that is sharded may cause file corruption

But I have no plans to expand or change the volume, so it shouldn't be an issue?


Upgrading

 * Can I go straight from 3.8 to 3.10?
 * Do I need to offline the volume first?


Thanks


--
Lindsay Mathieson

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Rolling upgrade from 3.6.3 to 3.10.5

2017-08-25 Thread Diego Remolina
Yes, I did an offline upgrade.

1. Stop all clients using gluster servers.
2. Stop glusterfsd and glusterd on both servers.
3. Backed up /var/lib/gluster* in all servers just to be safe.
4. Upgraded all servers from 3.6.x to 3.10.x (I did not have quotas or
anything that required special steps)
5. Started gluster daemons again and confirmed everything was fine
prior to letting clients connect.
6. Ran 3.10.x with the older op version for a few days to make sure
all was OK (not all was OK for me, but that may be a Samba issue, as I
use it as a file server).
7. Upgraded the op version to the maximum available.

In my case, I have two servers with bricks and one server that acts as
a witness.
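
For reference, the final op-version bump is a single cluster-wide setting; a
rough sketch (31000 is just the 3.10 baseline - use whatever value
max-op-version, available from 3.10 on, reports on your build):

# gluster volume get all cluster.max-op-version
# gluster volume set all cluster.op-version 31000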

HTH,

Diego

On Fri, Aug 25, 2017 at 8:56 AM, Yong Tseng  wrote:
> Hi Diego,
>
> Just to clarify, so did you do an offline upgrade with an existing cluster
> (3.6.x => 3.10.x)?
> Thanks.
>
> On Fri, Aug 25, 2017 at 8:42 PM, Diego Remolina  wrote:
>>
>> I was never able to go from 3.6.x to 3.7.x without downtime. Then
>> 3.7.x did not work well for me, so I stuck with 3.6.x until recently.
>> I went from 3.6.x to 3.10.x but downtime was scheduled.
>>
>> Diego
>>
>> On Fri, Aug 25, 2017 at 8:25 AM, Yong Tseng  wrote:
>> > Hi Diego,
>> >
>> > Thanks for the information. I tried only setting 'allow-insecure on' but
>> > nada.
>> > The sentence "If you are using GlusterFS version 3.4.x or below, you can
>> > upgrade it to following" in documentation is surely misleading.
>> > So would you suggest creating a new 3.10 cluster from scratch then
>> > rsync(?)
>> > the data from old cluster to the new?
>> >
>> >
>> > On Fri, Aug 25, 2017 at 7:53 PM, Diego Remolina 
>> > wrote:
>> >>
>> >> You cannot do a rolling upgrade from 3.6.x to 3.10.x You will need
>> >> downtime.
>> >>
>> >> Even 3.6 to 3.7 was not possible... see some references to it below:
>> >>
>> >> https://marc.info/?l=gluster-users=145136214452772=2
>> >> https://gluster.readthedocs.io/en/latest/release-notes/3.7.1/
>> >>
>> >> # gluster volume set  server.allow-insecure on Edit
>> >> /etc/glusterfs/glusterd.vol to contain this line: option
>> >> rpc-auth-allow-insecure on
>> >>
>> >> Post 1, restarting the volume would be necessary:
>> >>
>> >> # gluster volume stop 
>> >> # gluster volume start 
>> >>
>> >>
>> >> HTH,
>> >>
>> >> Diego
>> >>
>> >> On Fri, Aug 25, 2017 at 7:46 AM, Yong Tseng 
>> >> wrote:
>> >> > Hi all,
>> >> >
>> >> > I'm currently in process of upgrading a replicated cluster (1 x 4)
>> >> > from
>> >> > 3.6.3 to 3.10.5. The nodes run CentOS 6. However after upgrading the
>> >> > first
>> >> > node, the said node fails to connect to other peers (as seen via
>> >> > 'gluster
>> >> > peer status'), but somehow other non-upgraded peers can still see the
>> >> > upgraded peer as connected.
>> >> >
>> >> > Writes to the Gluster volume via local mounts of non-upgraded peers
>> >> > are
>> >> > replicated to the upgraded peer, but I can't write via the upgraded
>> >> > peer
>> >> > as
>> >> > the local mount seems forced to read-only.
>> >> >
>> >> > Launching heal operations from non-upgraded peers will output 'Commit
>> >> > failed
>> >> > on . Please check log for details'.
>> >> >
>> >> > In addition, during upgrade process there were warning messages about
>> >> > my
>> >> > old
>> >> > vol files renamed with .rpmsave extension. I tried starting Gluster
>> >> > with
>> >> > my
>> >> > old vol files but the problem persisted. I tried generating new vol
>> >> > files
>> >> > with 'glusterd --xlator-option "*.upgrade=on" -N', still no avail.
>> >> >
>> >> > Also I checked the brick log it had several messages about "failed to
>> >> > get
>> >> > client opversion". I don't know if this is pertinent. Could it be
>> >> > that
>> >> > the
>> >> > upgraded node cannot connect to older nodes but still can receive
>> >> > instructions from them?
>> >> >
>> >> > Below are command outputs; some data are masked.
>> >> > I'd provide more information if required.
>> >> > Thanks in advance.
>> >> >
>> >> > ===> 'gluster volume status' ran on non-upgraded peers
>> >> >
>> >> > Status of volume: gsnfs
>> >> > Gluster process Port
>> >> > Online
>> >> > Pid
>> >> >
>> >> >
>> >> > --
>> >> > Brick gs-nfs01:/ftpdata 49154   Y
>> >> > 2931
>> >> > Brick gs-nfs02:/ftpdata 49152   Y
>> >> > 29875
>> >> > Brick gs-nfs03:/ftpdata 49153   Y
>> >> > 6987
>> >> > Brick gs-nfs04:/ftpdata 49153   Y
>> >> > 24768
>> >> > Self-heal Daemon on localhost   N/A Y
>> >> > 2938
>> >> > Self-heal Daemon on gs-nfs04N/A Y
>> >> > 24788
>> >> > Self-heal Daemon on gs-nfs03

Re: [Gluster-users] self-heal not working

2017-08-25 Thread mabi
Hi Ravi,

Did you get a chance to have a look at the log files I have attached in my last 
mail?

Best,
Mabi

>  Original Message 
> Subject: Re: [Gluster-users] self-heal not working
> Local Time: August 24, 2017 12:08 PM
> UTC Time: August 24, 2017 10:08 AM
> From: m...@protonmail.ch
> To: Ravishankar N 
> Ben Turner , Gluster Users 
>
> Thanks for confirming the command. I have now enabled DEBUG client-log-level, 
> run a heal and then attached the glustershd log files of all 3 nodes in this 
> mail.
>
> The volume concerned is called myvol-pro, the other 3 volumes have no problem 
> so far.
>
> Also note that in the mean time it looks like the file has been deleted by 
> the user and as such the heal info command does not show the file name 
> anymore but just is GFID which is:
>
> gfid:1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea
>
> Hope that helps for debugging this issue.
>
>>  Original Message 
>> Subject: Re: [Gluster-users] self-heal not working
>> Local Time: August 24, 2017 5:58 AM
>> UTC Time: August 24, 2017 3:58 AM
>> From: ravishan...@redhat.com
>> To: mabi 
>> Ben Turner , Gluster Users 
>>
>> Unlikely. In your case only the afr.dirty is set, not the 
>> afr.volname-client-xx xattr.
>>
>> `gluster volume set myvolume diagnostics.client-log-level DEBUG` is right.
>>
>> On 08/23/2017 10:31 PM, mabi wrote:
>>
>>> I just saw the following bug which was fixed in 3.8.15:
>>>
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1471613
>>>
>>> Is it possible that the problem I described in this post is related to that 
>>> bug?
>>>
  Original Message 
 Subject: Re: [Gluster-users] self-heal not working
 Local Time: August 22, 2017 11:51 AM
 UTC Time: August 22, 2017 9:51 AM
 From: ravishan...@redhat.com
 To: mabi [](mailto:m...@protonmail.ch)
 Ben Turner [](mailto:btur...@redhat.com), Gluster 
 Users [](mailto:gluster-users@gluster.org)

 On 08/22/2017 02:30 PM, mabi wrote:

> Thanks for the additional hints, I have the following 2 questions first:
>
> - In order to launch the index heal is the following command correct:
> gluster volume heal myvolume

 Yes

> - If I run a "volume start force" will it have any short disruptions on 
> my clients which mount the volume through FUSE? If yes, how long? This is 
> a production system that's why I am asking.

 No. You can actually create a test volume on  your personal linux box to 
 try these kinds of things without needing multiple machines. This is how 
 we develop and test our patches :)
 'gluster volume create testvol replica 3 /home/mabi/bricks/brick{1..3} 
 force` and so on.

 HTH,
 Ravi

>>  Original Message 
>> Subject: Re: [Gluster-users] self-heal not working
>> Local Time: August 22, 2017 6:26 AM
>> UTC Time: August 22, 2017 4:26 AM
>> From: ravishan...@redhat.com
>> To: mabi [](mailto:m...@protonmail.ch), Ben Turner 
>> [](mailto:btur...@redhat.com)
>> Gluster Users 
>> [](mailto:gluster-users@gluster.org)
>>
>> Explore the following:
>>
>> - Launch index heal and look at the glustershd logs of all bricks for 
>> possible errors
>>
>> - See if the glustershd in each node is connected to all bricks.
>>
>> - If not try to restart shd by `volume start force`
>>
>> - Launch index heal again and try.
>>
>> - Try debugging the shd log by setting client-log-level to DEBUG 
>> temporarily.
>>
>> On 08/22/2017 03:19 AM, mabi wrote:
>>
>>> Sure, it doesn't look like a split brain based on the output:
>>>
>>> Brick node1.domain.tld:/data/myvolume/brick
>>> Status: Connected
>>> Number of entries in split-brain: 0
>>>
>>> Brick node2.domain.tld:/data/myvolume/brick
>>> Status: Connected
>>> Number of entries in split-brain: 0
>>>
>>> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick
>>> Status: Connected
>>> Number of entries in split-brain: 0
>>>
  Original Message 
 Subject: Re: [Gluster-users] self-heal not working
 Local Time: August 21, 2017 11:35 PM
 UTC Time: August 21, 2017 9:35 PM
 From: btur...@redhat.com
 To: mabi [](mailto:m...@protonmail.ch)
 Gluster Users 
 [](mailto:gluster-users@gluster.org)

 Can you also provide:

 gluster v heal  info split-brain

 If it is split brain just delete the incorrect file from the brick and 
 run heal again. I haven't tried 

Re: [Gluster-users] Rolling upgrade from 3.6.3 to 3.10.5

2017-08-25 Thread Yong Tseng
Hi Diego,

Just to clarify, so did you do an offline upgrade with an existing cluster
(3.6.x => 3.10.x)?
Thanks.

On Fri, Aug 25, 2017 at 8:42 PM, Diego Remolina  wrote:

> I was never able to go from 3.6.x to 3.7.x without downtime. Then
> 3.7.x did not work well for me, so I stuck with 3.6.x until recently.
> I went from 3.6.x to 3.10.x but downtime was scheduled.
>
> Diego
>
> On Fri, Aug 25, 2017 at 8:25 AM, Yong Tseng  wrote:
> > Hi Diego,
> >
> > Thanks for the information. I tried only setting 'allow-insecure on' but
> > nada.
> > The sentence "If you are using GlusterFS version 3.4.x or below, you can
> > upgrade it to following" in documentation is surely misleading.
> > So would you suggest creating a new 3.10 cluster from scratch then
> rsync(?)
> > the data from old cluster to the new?
> >
> >
> > On Fri, Aug 25, 2017 at 7:53 PM, Diego Remolina 
> wrote:
> >>
> >> You cannot do a rolling upgrade from 3.6.x to 3.10.x You will need
> >> downtime.
> >>
> >> Even 3.6 to 3.7 was not possible... see some references to it below:
> >>
> >> https://marc.info/?l=gluster-users=145136214452772=2
> >> https://gluster.readthedocs.io/en/latest/release-notes/3.7.1/
> >>
> >> # gluster volume set  server.allow-insecure on Edit
> >> /etc/glusterfs/glusterd.vol to contain this line: option
> >> rpc-auth-allow-insecure on
> >>
> >> Post 1, restarting the volume would be necessary:
> >>
> >> # gluster volume stop 
> >> # gluster volume start 
> >>
> >>
> >> HTH,
> >>
> >> Diego
> >>
> >> On Fri, Aug 25, 2017 at 7:46 AM, Yong Tseng 
> wrote:
> >> > Hi all,
> >> >
> >> > I'm currently in process of upgrading a replicated cluster (1 x 4)
> from
> >> > 3.6.3 to 3.10.5. The nodes run CentOS 6. However after upgrading the
> >> > first
> >> > node, the said node fails to connect to other peers (as seen via
> >> > 'gluster
> >> > peer status'), but somehow other non-upgraded peers can still see the
> >> > upgraded peer as connected.
> >> >
> >> > Writes to the Gluster volume via local mounts of non-upgraded peers
> are
> >> > replicated to the upgraded peer, but I can't write via the upgraded
> peer
> >> > as
> >> > the local mount seems forced to read-only.
> >> >
> >> > Launching heal operations from non-upgraded peers will output 'Commit
> >> > failed
> >> > on . Please check log for details'.
> >> >
> >> > In addition, during upgrade process there were warning messages about
> my
> >> > old
> >> > vol files renamed with .rpmsave extension. I tried starting Gluster
> with
> >> > my
> >> > old vol files but the problem persisted. I tried generating new vol
> >> > files
> >> > with 'glusterd --xlator-option "*.upgrade=on" -N', still no avail.
> >> >
> >> > Also I checked the brick log it had several messages about "failed to
> >> > get
> >> > client opversion". I don't know if this is pertinent. Could it be that
> >> > the
> >> > upgraded node cannot connect to older nodes but still can receive
> >> > instructions from them?
> >> >
> >> > Below are command outputs; some data are masked.
> >> > I'd provide more information if required.
> >> > Thanks in advance.
> >> >
> >> > ===> 'gluster volume status' ran on non-upgraded peers
> >> >
> >> > Status of volume: gsnfs
> >> > Gluster process PortOnline
> >> > Pid
> >> >
> >> > 
> --
> >> > Brick gs-nfs01:/ftpdata 49154   Y
> >> > 2931
> >> > Brick gs-nfs02:/ftpdata 49152   Y
> >> > 29875
> >> > Brick gs-nfs03:/ftpdata 49153   Y
> >> > 6987
> >> > Brick gs-nfs04:/ftpdata 49153   Y
> >> > 24768
> >> > Self-heal Daemon on localhost   N/A Y
> >> > 2938
> >> > Self-heal Daemon on gs-nfs04N/A Y
> >> > 24788
> >> > Self-heal Daemon on gs-nfs03N/A Y
> >> > 7007
> >> > Self-heal Daemon on   N/A Y   29866
> >> >
> >> > Task Status of Volume gsnfs
> >> >
> >> > 
> --
> >> > There are no active volume tasks
> >> >
> >> >
> >> >
> >> > ===> 'gluster volume status' on upgraded peer
> >> >
> >> > Gluster process TCP Port  RDMA Port
> Online
> >> > Pid
> >> >
> >> > 
> --
> >> > Brick gs-nfs02:/ftpdata 49152 0  Y
> >> > 29875
> >> > Self-heal Daemon on localhost   N/A   N/AY
> >> > 29866
> >> >
> >> > Task Status of Volume gsnfs
> >> >
> >> > 
> --
> >> > There are no active volume tasks
> >> >
> >> >
> >> >
> >> > ===> 'gluster peer status' 

Re: [Gluster-users] Rolling upgrade from 3.6.3 to 3.10.5

2017-08-25 Thread Yong Tseng
Hi Diego,

Thanks for the information. I tried only setting 'allow-insecure on' but
nada.
The sentence "If you are using GlusterFS version 3.4.x or below, you can
upgrade it to following" in the documentation is surely misleading.
So would you suggest creating a new 3.10 cluster from scratch and then
rsync(?) the data from the old cluster to the new?

On Fri, Aug 25, 2017 at 7:53 PM, Diego Remolina  wrote:

> You cannot do a rolling upgrade from 3.6.x to 3.10.x You will need
> downtime.
>
> Even 3.6 to 3.7 was not possible... see some references to it below:
>
> https://marc.info/?l=gluster-users=145136214452772=2
> https://gluster.readthedocs.io/en/latest/release-notes/3.7.1/
>
> # gluster volume set  server.allow-insecure on Edit
> /etc/glusterfs/glusterd.vol to contain this line: option
> rpc-auth-allow-insecure on
>
> Post 1, restarting the volume would be necessary:
>
> # gluster volume stop 
> # gluster volume start 
>
>
> HTH,
>
> Diego
>
> On Fri, Aug 25, 2017 at 7:46 AM, Yong Tseng  wrote:
> > Hi all,
> >
> > I'm currently in process of upgrading a replicated cluster (1 x 4) from
> > 3.6.3 to 3.10.5. The nodes run CentOS 6. However after upgrading the
> first
> > node, the said node fails to connect to other peers (as seen via 'gluster
> > peer status'), but somehow other non-upgraded peers can still see the
> > upgraded peer as connected.
> >
> > Writes to the Gluster volume via local mounts of non-upgraded peers are
> > replicated to the upgraded peer, but I can't write via the upgraded peer
> as
> > the local mount seems forced to read-only.
> >
> > Launching heal operations from non-upgraded peers will output 'Commit
> failed
> > on . Please check log for details'.
> >
> > In addition, during upgrade process there were warning messages about my
> old
> > vol files renamed with .rpmsave extension. I tried starting Gluster with
> my
> > old vol files but the problem persisted. I tried generating new vol files
> > with 'glusterd --xlator-option "*.upgrade=on" -N', still no avail.
> >
> > Also I checked the brick log it had several messages about "failed to get
> > client opversion". I don't know if this is pertinent. Could it be that
> the
> > upgraded node cannot connect to older nodes but still can receive
> > instructions from them?
> >
> > Below are command outputs; some data are masked.
> > I'd provide more information if required.
> > Thanks in advance.
> >
> > ===> 'gluster volume status' ran on non-upgraded peers
> >
> > Status of volume: gsnfs
> > Gluster process PortOnline
> Pid
> > 
> --
> > Brick gs-nfs01:/ftpdata 49154   Y
>  2931
> > Brick gs-nfs02:/ftpdata 49152   Y
> > 29875
> > Brick gs-nfs03:/ftpdata 49153   Y
>  6987
> > Brick gs-nfs04:/ftpdata 49153   Y
> > 24768
> > Self-heal Daemon on localhost   N/A Y
>  2938
> > Self-heal Daemon on gs-nfs04N/A Y
> > 24788
> > Self-heal Daemon on gs-nfs03N/A Y
>  7007
> > Self-heal Daemon on   N/A Y   29866
> >
> > Task Status of Volume gsnfs
> > 
> --
> > There are no active volume tasks
> >
> >
> >
> > ===> 'gluster volume status' on upgraded peer
> >
> > Gluster process TCP Port  RDMA Port  Online
> Pid
> > 
> --
> > Brick gs-nfs02:/ftpdata 49152 0  Y
> > 29875
> > Self-heal Daemon on localhost   N/A   N/AY
> > 29866
> >
> > Task Status of Volume gsnfs
> > 
> --
> > There are no active volume tasks
> >
> >
> >
> > ===> 'gluster peer status' on non-upgraded peer
> >
> > Number of Peers: 3
> >
> > Hostname: gs-nfs03
> > Uuid: 4c1544e6-550d-481a-95af-2a1da32d10ad
> > State: Peer in Cluster (Connected)
> >
> > Hostname: 
> > Uuid: 17d554fd-9181-4b53-9521-55acf69ac35f
> > State: Peer in Cluster (Connected)
> > Other names:
> > gs-nfs02
> >
> > Hostname: gs-nfs04
> > Uuid: c6d165e6-d222-414c-b57a-97c64f06c5e9
> > State: Peer in Cluster (Connected)
> >
> >
> >
> > ===> 'gluster peer status' on upgraded peer
> >
> > Number of Peers: 3
> >
> > Hostname: gs-nfs03
> > Uuid: 4c1544e6-550d-481a-95af-2a1da32d10ad
> > State: Peer in Cluster (Disconnected)
> >
> > Hostname: gs-nfs01
> > Uuid: 90d3ed27-61ac-4ad3-93a9-3c2b68f41ecf
> > State: Peer in Cluster (Disconnected)
> > Other names:
> > 
> >
> > Hostname: gs-nfs04
> > Uuid: c6d165e6-d222-414c-b57a-97c64f06c5e9
> > State: Peer in Cluster (Disconnected)
> >
> >
> > --
> > - Yong
> >
> > 

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-08-25 Thread Lindsay Mathieson

On 25/08/2017 6:50 PM, lemonni...@ulrar.net wrote:

Free from a lot of problems, but apparently not as good as a replica 3
volume. I can't comment on arbiter, I only have replica 3 clusters. I
can tell you that my colleagues setting up 2 nodes clusters have_a lot_
of problems.


I run Replica 3 VM hosting (gfapi) via a 3 node Proxmox cluster. Have 
done a lot of rolling node updates, power failures etc, never had a 
problem. Performance is better than any other DFS I've tried (Ceph, 
lizard/moose). Never did get DRBD working.



nb: ZFS Bricks, with each brick RAID10 - so a little paranoid on the 
redundancy :)



For me, Gluster's biggest problem is its lack of flexibility in adding 
bricks and nodes. And replacing them is an exercise in nail-biting. 
Hoping V4 improves on this, though maybe that will lead to performance 
trade-offs.


--
Lindsay Mathieson

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Rolling upgrade from 3.6.3 to 3.10.5

2017-08-25 Thread Diego Remolina
You cannot do a rolling upgrade from 3.6.x to 3.10.x. You will need downtime.

Even 3.6 to 3.7 was not possible... see some references to it below:

https://marc.info/?l=gluster-users=145136214452772=2
https://gluster.readthedocs.io/en/latest/release-notes/3.7.1/

# gluster volume set <volname> server.allow-insecure on

Edit /etc/glusterfs/glusterd.vol to contain this line:
option rpc-auth-allow-insecure on

After that, restarting the volume would be necessary:

# gluster volume stop <volname>
# gluster volume start <volname>


HTH,

Diego

On Fri, Aug 25, 2017 at 7:46 AM, Yong Tseng  wrote:
> Hi all,
>
> I'm currently in process of upgrading a replicated cluster (1 x 4) from
> 3.6.3 to 3.10.5. The nodes run CentOS 6. However after upgrading the first
> node, the said node fails to connect to other peers (as seen via 'gluster
> peer status'), but somehow other non-upgraded peers can still see the
> upgraded peer as connected.
>
> Writes to the Gluster volume via local mounts of non-upgraded peers are
> replicated to the upgraded peer, but I can't write via the upgraded peer as
> the local mount seems forced to read-only.
>
> Launching heal operations from non-upgraded peers will output 'Commit failed
> on . Please check log for details'.
>
> In addition, during upgrade process there were warning messages about my old
> vol files renamed with .rpmsave extension. I tried starting Gluster with my
> old vol files but the problem persisted. I tried generating new vol files
> with 'glusterd --xlator-option "*.upgrade=on" -N', still no avail.
>
> Also I checked the brick log it had several messages about "failed to get
> client opversion". I don't know if this is pertinent. Could it be that the
> upgraded node cannot connect to older nodes but still can receive
> instructions from them?
>
> Below are command outputs; some data are masked.
> I'd provide more information if required.
> Thanks in advance.
>
> ===> 'gluster volume status' ran on non-upgraded peers
>
> Status of volume: gsnfs
> Gluster process PortOnline  Pid
> --
> Brick gs-nfs01:/ftpdata 49154   Y   2931
> Brick gs-nfs02:/ftpdata 49152   Y
> 29875
> Brick gs-nfs03:/ftpdata 49153   Y   6987
> Brick gs-nfs04:/ftpdata 49153   Y
> 24768
> Self-heal Daemon on localhost   N/A Y   2938
> Self-heal Daemon on gs-nfs04N/A Y
> 24788
> Self-heal Daemon on gs-nfs03N/A Y   7007
> Self-heal Daemon on   N/A Y   29866
>
> Task Status of Volume gsnfs
> --
> There are no active volume tasks
>
>
>
> ===> 'gluster volume status' on upgraded peer
>
> Gluster process TCP Port  RDMA Port  Online  Pid
> --
> Brick gs-nfs02:/ftpdata 49152 0  Y
> 29875
> Self-heal Daemon on localhost   N/A   N/AY
> 29866
>
> Task Status of Volume gsnfs
> --
> There are no active volume tasks
>
>
>
> ===> 'gluster peer status' on non-upgraded peer
>
> Number of Peers: 3
>
> Hostname: gs-nfs03
> Uuid: 4c1544e6-550d-481a-95af-2a1da32d10ad
> State: Peer in Cluster (Connected)
>
> Hostname: 
> Uuid: 17d554fd-9181-4b53-9521-55acf69ac35f
> State: Peer in Cluster (Connected)
> Other names:
> gs-nfs02
>
> Hostname: gs-nfs04
> Uuid: c6d165e6-d222-414c-b57a-97c64f06c5e9
> State: Peer in Cluster (Connected)
>
>
>
> ===> 'gluster peer status' on upgraded peer
>
> Number of Peers: 3
>
> Hostname: gs-nfs03
> Uuid: 4c1544e6-550d-481a-95af-2a1da32d10ad
> State: Peer in Cluster (Disconnected)
>
> Hostname: gs-nfs01
> Uuid: 90d3ed27-61ac-4ad3-93a9-3c2b68f41ecf
> State: Peer in Cluster (Disconnected)
> Other names:
> 
>
> Hostname: gs-nfs04
> Uuid: c6d165e6-d222-414c-b57a-97c64f06c5e9
> State: Peer in Cluster (Disconnected)
>
>
> --
> - Yong
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Rolling upgrade from 3.6.3 to 3.10.5

2017-08-25 Thread Yong Tseng
Hi all,

I'm currently in the process of upgrading a replicated cluster (1 x 4) from
3.6.3 to 3.10.5. The nodes run CentOS 6. However, after upgrading the first
node, that node fails to connect to the other peers (as seen via 'gluster
peer status'), but somehow the other non-upgraded peers can still see the
upgraded peer as connected.

Writes to the Gluster volume via local mounts of non-upgraded peers are
replicated to the upgraded peer, but I can't write via the upgraded peer, as
its local mount seems to be forced read-only.

Launching heal operations from non-upgraded peers will output 'Commit
failed on . Please check log for details'.

In addition, during the upgrade process there were warning messages about my
old vol files being renamed with a .rpmsave extension. I tried starting
Gluster with my old vol files but the problem persisted. I tried generating
new vol files with 'glusterd --xlator-option "*.upgrade=on" -N', still to no
avail.

Also, I checked the brick log and it had several messages about "failed to
get client opversion". I don't know if this is pertinent. Could it be that
the upgraded node cannot connect to the older nodes but can still receive
instructions from them?

Below are command outputs; some data are masked.
I'd provide more information if required.
Thanks in advance.

===> 'gluster volume status' ran on non-upgraded peers

Status of volume: gsnfs
Gluster process PortOnline  Pid
--
Brick gs-nfs01:/ftpdata 49154   Y   2931
Brick gs-nfs02:/ftpdata 49152   Y
29875
Brick gs-nfs03:/ftpdata 49153   Y   6987
Brick gs-nfs04:/ftpdata 49153   Y
24768
Self-heal Daemon on localhost   N/A Y   2938
Self-heal Daemon on gs-nfs04N/A Y
24788
Self-heal Daemon on gs-nfs03N/A Y   7007
Self-heal Daemon on   N/A Y   29866

Task Status of Volume gsnfs
--
There are no active volume tasks



===> 'gluster volume status' on upgraded peer

Gluster process TCP Port  RDMA Port  Online  Pid
--
Brick gs-nfs02:/ftpdata 49152 0  Y
29875
Self-heal Daemon on localhost   N/A   N/AY
29866

Task Status of Volume gsnfs
--
There are no active volume tasks



===> 'gluster peer status' on non-upgraded peer

Number of Peers: 3

Hostname: gs-nfs03
Uuid: 4c1544e6-550d-481a-95af-2a1da32d10ad
State: Peer in Cluster (Connected)

Hostname: 
Uuid: 17d554fd-9181-4b53-9521-55acf69ac35f
State: Peer in Cluster (Connected)
Other names:
gs-nfs02

Hostname: gs-nfs04
Uuid: c6d165e6-d222-414c-b57a-97c64f06c5e9
State: Peer in Cluster (Connected)



===> 'gluster peer status' on upgraded peer

Number of Peers: 3

Hostname: gs-nfs03
Uuid: 4c1544e6-550d-481a-95af-2a1da32d10ad
State: Peer in Cluster (Disconnected)

Hostname: gs-nfs01
Uuid: 90d3ed27-61ac-4ad3-93a9-3c2b68f41ecf
State: Peer in Cluster (Disconnected)
Other names:


Hostname: gs-nfs04
Uuid: c6d165e6-d222-414c-b57a-97c64f06c5e9
State: Peer in Cluster (Disconnected)


-- 
- Yong
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Gluster 4.0: Update

2017-08-25 Thread Niels de Vos
On Fri, Aug 25, 2017 at 12:21:08PM +1000, Lindsay Mathieson wrote:
> >
> > This feature (and the patent) is from facebook folks.
> >
> >
> Does that mean its not a problem?

Facebook contributed the patches for this feature, so I don't think
there is a problem.

Niels (not a lawyer)
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] GlusterFS as virtual machine storage

2017-08-25 Thread Gionatan Danti

On 25-08-2017 10:50, lemonni...@ulrar.net wrote:

Yes. Gluster has its own quorum, you can disable it but that's just a
recipe for a disaster.

Free from a lot of problems, but apparently not as good as a replica 3
volume. I can't comment on arbiter, I only have replica 3 clusters. I
can tell you that my colleagues setting up 2-node clusters have _a lot_
of problems.


Thanks, this is very valuable information.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.da...@assyoma.it - i...@assyoma.it
GPG public key ID: FF5F32A8
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] GlusterFS as virtual machine storage

2017-08-25 Thread lemonnierk
> This is true even if I manage locking at application level (via virlock 
> or sanlock)?

Yes. Gluster has its own quorum, you can disable it but that's just a
recipe for a disaster.

> Also, on a two-node setup it is *guaranteed* for updates to one node to 
> put offline the whole volume?

I think so, but I never took the chance so who knows.

> On the other hand, a 3-way setup (or 2+arbiter) if free from all these 
> problems?
> 

Free from a lot of problems, but apparently not as good as a replica 3
volume. I can't comment on arbiter, I only have replica 3 clusters. I
can tell you that my colleagues setting up 2-node clusters have _a lot_
of problems.


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-08-25 Thread Gionatan Danti

On 25-08-2017 08:32, Gionatan Danti wrote:

Hi all,
any other advice from those who use (or do not use) Gluster as a replicated
VM backend?

Thanks.


Sorry, I was not seeing messages because I was not subscribed to the 
list; I read it from the web.


So it seems that Pavel and WK have vastly different experience with 
Gluster. Any plausible cause for that difference?



WK wrote:
2 node plus Arbiter. You NEED the arbiter or a third node. Do NOT try 2
node with a VM


This is true even if I manage locking at application level (via virlock 
or sanlock)?
Also, on a two-node setup, is it *guaranteed* that updates to one node will 
put the whole volume offline?
On the other hand, is a 3-way setup (or 2+arbiter) free from all these 
problems?


Thanks.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.da...@assyoma.it - i...@assyoma.it
GPG public key ID: FF5F32A8
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] NFS versus Fuse file locking problem (NFS works, fuse doesn't...)

2017-08-25 Thread Krist van Besien
On 25 August 2017 at 04:47, Vijay Bellur  wrote:

>
>
> On Thu, Aug 24, 2017 at 9:01 AM, Krist van Besien 
> wrote:
>
> Would it be possible to obtain a statedump of the native client when the
> application becomes completely unresponsive? A statedump can help in
> understanding operations within the gluster stack. Log file of the native
> client might also offer some clues.
>
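
For reference, the statedump mentioned above can be captured without
restarting anything; a rough sketch (mount point, pgrep pattern and volume
name are placeholders):

# gluster --print-statedumpdir
(where the dump files land, usually /var/run/gluster)
# kill -USR1 $(pgrep -f 'glusterfs.*/mnt/recordings')
(makes the fuse client write its statedump)
# gluster volume statedump myvol
(brick-side dumps, for comparison)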

I've increased logging to debug on both client and bricks, but didn't see
anything that hinted at problems.
Maybe we have to go for Ganesha after all.

But currently we are stuck at the customer having trouble actually
generating enough load to test the server with...

When I try to simulate the workload with a script that writes and renames
files at the same rate that the video recorders do, I can run it without any
issue, and can ramp up to the point where I am hitting the network ceiling.
So the gluster cluster is up to the task.
But the recorder software itself is running into issues, which makes me
suspect that it may have to do with the way some aspects of it are coded.
And it is there that I am looking for answers. Any hints, like "if you call
fopen() you should give these flags and not these flags or you get into
trouble"...

Krist

-- 
Vriendelijke Groet |  Best Regards | Freundliche Grüße | Cordialement
--

Krist van Besien

senior architect, RHCE, RHCSA Open Stack

Red Hat Red Hat Switzerland S.A. 

kr...@redhat.comM: +41-79-5936260

TRIED. TESTED. TRUSTED. 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users