Re: [Gluster-users] Gluster Peer behavior

2016-07-04 Thread Atin Mukherjee
On Tue, Jul 5, 2016 at 11:01 AM, Atul Yadav  wrote:

> Hi All,
>
> The glusterfs environment details are given below:-
>
> [root@master1 ~]# cat /etc/redhat-release
> CentOS release 6.7 (Final)
> [root@master1 ~]# uname -r
> 2.6.32-642.1.1.el6.x86_64
> [root@master1 ~]# rpm -qa | grep -i gluster
> glusterfs-rdma-3.8rc2-1.el6.x86_64
> glusterfs-api-3.8rc2-1.el6.x86_64
> glusterfs-3.8rc2-1.el6.x86_64
> glusterfs-cli-3.8rc2-1.el6.x86_64
> glusterfs-client-xlators-3.8rc2-1.el6.x86_64
> glusterfs-server-3.8rc2-1.el6.x86_64
> glusterfs-fuse-3.8rc2-1.el6.x86_64
> glusterfs-libs-3.8rc2-1.el6.x86_64
> [root@master1 ~]#
>
> Volume Name: home
> Type: Replicate
> Volume ID: 2403ddf9-c2e0-4930-bc94-734772ef099f
> Status: Stopped
> Number of Bricks: 1 x 2 = 2
> Transport-type: rdma
> Bricks:
> Brick1: master1-ib.dbt.au:/glusterfs/home/brick1
> Brick2: master2-ib.dbt.au:/glusterfs/home/brick2
> Options Reconfigured:
> network.ping-timeout: 20
> nfs.disable: on
> performance.readdir-ahead: on
> transport.address-family: inet
> config.transport: rdma
> cluster.server-quorum-type: server
> cluster.quorum-type: fixed
> cluster.quorum-count: 1
> locks.mandatory-locking: off
> cluster.enable-shared-storage: disable
> cluster.server-quorum-ratio: 51%
>
> When only my single master node is up, the other nodes are still showing as
> connected:
> gluster pool list
> UUID                                    Hostname                State
> 89ccd72e-cb99-4b52-a2c0-388c99e5c7b3    master2-ib.dbt.au       Connected
> d2c47fc2-f673-4790-b368-d214a58c59f4    compute01-ib.dbt.au     Connected
> a5608d66-a3c6-450e-a239-108668083ff2    localhost               Connected
> [root@master1 ~]#
>
>
> Please advise us:
> Is this normal behavior, or is this an issue?
>

First off, we don't have any master-slave configuration mode for the gluster
trusted storage pool, i.e. the peer list. Secondly, if master2 and compute01 are
still showing as 'connected' even though they are down, it means that localhost
here didn't receive disconnect events for some reason. Could you restart the
glusterd service on this node and check the output of gluster pool list again?
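
For reference, on CentOS 6 that would be something along these lines (a minimal
sketch; the init script name assumes the stock glusterfs-server package):

service glusterd restart
gluster pool list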



>
> Thank You
> Atul Yadav
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Gluster Peer behavior

2016-07-04 Thread Atul Yadav
Hi All,

The glusterfs environment details are given below:-

[root@master1 ~]# cat /etc/redhat-release
CentOS release 6.7 (Final)
[root@master1 ~]# uname -r
2.6.32-642.1.1.el6.x86_64
[root@master1 ~]# rpm -qa | grep -i gluster
glusterfs-rdma-3.8rc2-1.el6.x86_64
glusterfs-api-3.8rc2-1.el6.x86_64
glusterfs-3.8rc2-1.el6.x86_64
glusterfs-cli-3.8rc2-1.el6.x86_64
glusterfs-client-xlators-3.8rc2-1.el6.x86_64
glusterfs-server-3.8rc2-1.el6.x86_64
glusterfs-fuse-3.8rc2-1.el6.x86_64
glusterfs-libs-3.8rc2-1.el6.x86_64
[root@master1 ~]#

Volume Name: home
Type: Replicate
Volume ID: 2403ddf9-c2e0-4930-bc94-734772ef099f
Status: Stopped
Number of Bricks: 1 x 2 = 2
Transport-type: rdma
Bricks:
Brick1: master1-ib.dbt.au:/glusterfs/home/brick1
Brick2: master2-ib.dbt.au:/glusterfs/home/brick2
Options Reconfigured:
network.ping-timeout: 20
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
config.transport: rdma
cluster.server-quorum-type: server
cluster.quorum-type: fixed
cluster.quorum-count: 1
locks.mandatory-locking: off
cluster.enable-shared-storage: disable
cluster.server-quorum-ratio: 51%

When only my single master node is up, the other nodes are still showing as
connected:
gluster pool list
UUID                                    Hostname                State
89ccd72e-cb99-4b52-a2c0-388c99e5c7b3    master2-ib.dbt.au       Connected
d2c47fc2-f673-4790-b368-d214a58c59f4    compute01-ib.dbt.au     Connected
a5608d66-a3c6-450e-a239-108668083ff2    localhost               Connected
[root@master1 ~]#


Please advise us:
Is this normal behavior, or is this an issue?

Thank You
Atul Yadav
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] to RAID or not?

2016-07-04 Thread Dmitry Melekhov

On 04.07.2016 at 19:01, Matt Robinson wrote:

With mdadm any raid6 (especially with 12 disks) will be rubbish.


Well, this may be off-topic, but could you please explain why? (I've never 
used md RAID other than RAID 1...)


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] to RAID or not?

2016-07-04 Thread tom
If you go the ZFS route - be absolutely sure you set xattr=sa on all 
filesystems that will hold bricks BEFORE you create bricks on them.  Not doing 
so will cause major problems: data that should be deleted is not reclaimed 
until after a forced unmount or reboot (which can take hours to days if there 
are several terabytes of data to reclaim).

Setting it also vastly improves directory and stat() performance.

Setting it after the bricks had been created led to data inconsistencies and 
eventual data loss on a cluster we used to operate.
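
For reference, the property is set per dataset, before the brick directories
are created; a minimal sketch, assuming a dataset named tank/gluster:

zfs set xattr=sa tank/gluster
zfs get xattr tank/gluster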

-t

> On Jul 4, 2016, at 4:35 PM, Lindsay Mathieson  
> wrote:
> 
> On 5/07/2016 12:54 AM, Gandalf Corvotempesta wrote:
>> No suggestions ?
>> 
>> On 14 Jun 2016 at 10:01 AM, "Gandalf Corvotempesta" 
>> > 
>> wrote:
>> Let's assume a small cluster made by 3 servers, 12 disks/bricks each.
>> This cluster would be expanded to a maximum of 15 servers in near future.
>> 
>> What do you suggest, a JBOD or a RAID? Which RAID level?
> 
> 
> I set up my much smaller cluster with ZFS RAID10 on each node. 
> - Greatly increased the iops per node
> 
> - auto bitrot detection and repair
> 
> - SSD caches
> 
> - compression clawed back 30% of the disk space I lost to RAID10.
> -- 
> Lindsay Mathieson

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] to RAID or not?

2016-07-04 Thread Lindsay Mathieson

On 5/07/2016 12:54 AM, Gandalf Corvotempesta wrote:


No suggestions ?

On 14 Jun 2016 at 10:01 AM, "Gandalf Corvotempesta" 
> wrote:


Let's assume a small cluster made by 3 servers, 12 disks/bricks each.
This cluster would be expanded to a maximum of 15 servers in near
future.

What do you suggest, a JBOD or a RAID? Which RAID level?




I set up my much smaller cluster with ZFS RAID10 on each node.

- Greatly increased the iops per node

- auto bitrot detection and repair

- SSD caches

- compression clawed back 30% of the disk space I lost to RAID10.
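
For anyone curious, a rough sketch of what that layout looks like at pool
creation time (pool and device names are only examples, not my exact setup):

zpool create tank mirror sda sdb mirror sdc sdd   # striped mirrors, i.e. RAID10
zpool add tank cache sde                          # SSD read cache (L2ARC)
zfs set compression=lz4 tank                      # transparent compression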

--
Lindsay Mathieson

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] to RAID or not?

2016-07-04 Thread Russell Purinton
Agreed… It took me almost 2 years of tweaking and testing to get the 
performance I wanted.   

Different workloads require different configurations. Test different 
configurations and find what works best for you!

> On Jul 4, 2016, at 2:15 PM, t...@encoding.com wrote:
> 
> I would highly stress, regardless of whatever solution you choose - make sure 
> you test actual workload performance before going all-in.
> 
> In my testing, performance (esp. iops and latency) decreased as I added 
> bricks and additional nodes.  Since you have many spindles now, I would 
> encourage you to test your workload up to and including the total brick count 
> you ultimately expect.  RAID level and whether it’s md, zfs, or hardware 
> isn’t likely to make as significant of a performance impact as Gluster and 
> its various clients will.  Test failure scenarios and performance 
> characteristics during impairment events thoroughly.  Make sure heals happen 
> as you expect, including final contents of files modified during an 
> impairment.  If you have many small files or directories that will be 
> accessed concurrently, make sure to stress that behavior in your testing.
> 
> Gluster can be great for targeting availability and distribution at low 
> software cost, and I would say as of today at the expense of performance, but 
> as with any scale-out NAS there are limitations and some surprises along the 
> path.
> 
> Good hunting,
> -t
> 
>> On Jul 4, 2016, at 10:44 AM, Gandalf Corvotempesta 
>>  wrote:
>> 
>> 2016-07-04 19:35 GMT+02:00 Russell Purinton :
>>> For 3 servers with 12 disks each, I would do Hardware RAID0 (or mdadm if 
>>> you don’t have a RAID card) of 3 disks.  So four 3-disk RAID0’s per server.
>> 
>> 3 servers is just the start. We plan to use 5 servers in a short time
>> and up to 15 in production.
>> 
>>> I would set them up as Replica 3 Arbiter 1
>>> 
>>> server1:/brickA server2:/brickC server3:/brickA
>>> server1:/brickB server2:/brickD server3:/brickB
>>> server2:/brickA server3:/brickC server1:/brickA
>>> server2:/brickB server3:/brickD server1:/brickB
>>> server3:/brickA server1:/brickC server2:/brickA
>>> server3:/brickB server1:/brickD server2:/brickB
>>> 
>>> The benefit of this is that you can lose an entire server node (12 disks) 
>>> and all of your data is still accessible.   And you get the same space as 
>>> if they were all in a RAID10.
>>> 
>>> If you lose any disk, the entire 3 disk brick will need to be healed from 
>>> the replica.   I have 20GbE on each server so it doesn’t take long.   It 
>>> copied 20TB in about 18 hours once.
>> 
>> So, any disk failure would mean at least 6TB to be recovered via the
>> network. This means high network utilization and, as long as gluster
>> doesn't have a dedicated network for replication,
>> this can slow down client access.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] to RAID or not?

2016-07-04 Thread Russell Purinton
Sorry, example of 5 servers should read

> server1 A & B   replica to server 2 C & D
> server2 A & B   replica to server 3 C & D
> server3 A & B   replica to server 4 C & D
> server4 A & B   replica to server 5 C & D
> server5 A & B   replica to server 1 C & D


Adding each server should be as simple as using the replace-brick command to 
move bricks C and D from server1 onto bricks C and D of the new server.

Then you can add-brick to create 2 new replica pairs from the new server's A and B 
bricks to server1's C and D.
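
A very rough sketch of the commands that procedure implies (the volume name is
illustrative, and the vacated brick directories would need to be cleaned or
recreated before being re-added):

gluster volume replace-brick myvol server1:/brickC server5:/brickC commit force
gluster volume replace-brick myvol server1:/brickD server5:/brickD commit force
gluster volume add-brick myvol server5:/brickA server1:/brickC server5:/brickB server1:/brickD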


> On Jul 4, 2016, at 1:54 PM, Russell Purinton  
> wrote:
> 
> The fault tolerance is provided by Gluster replica translator.
> 
> RAID0 to me is preferable to JBOD because you get 3x read performance and 3x 
> write performance.   If performance is not a concern, or if you only have 
> 1GbE, then it may not matter, and you could just do JBOD with a ton of bricks.
> 
> The same method scales to however many servers you need… imagine them in a 
> ring…
> 
> server1 A & B   replica to server 2 C & D
> server2 A & B   replica to server 3 C & D
> server3 A & B   replica to server 1 C & D
> 
> Adding a 4th server?  No problem… you can reconfigure the bricks to 
> do
> server1 A & B   replica to server 2 C & D
> server2 A & B   replica to server 3 C & D
> server3 A & B   replica to server 4 C & D
> server4 A & B   replica to server 1 C & D
> 
> or 5 servers
> server1 A & B   replica to server 2 C & D
> server2 A & B   replica to server 3 C & D
> server3 A & B   replica to server 4 C & D
> server4 A & B   replica to server 5 C & D
> server5 A & B   replica to server 6 C & D
> 
> I guess my recommendation is not the best for redundancy and data protection… 
> because I’m concerned with performance and space; as long as I have 2 copies 
> of the data on different servers, I’m happy.  
> 
> If you care more about performance than space, and want extra data redundancy 
> (more than 2 copies), then use RAID 10 on the nodes, and use gluster replica. 
>  This means you have every byte of data on 4 disks.
> 
> If you care more about space than performance and want extra redundancy use 
> RAID 6, and gluster replica.
> 
> I always recommend gluster replica, because several times I have lost entire 
> servers… and it’s nice to have the data on more than one server.
> 
>> On Jul 4, 2016, at 1:46 PM, Gandalf Corvotempesta 
>>  wrote:
>> 
>> 2016-07-04 19:44 GMT+02:00 Gandalf Corvotempesta
>> :
>>> So, any disk failure would mean at least 6TB to be recovered via the
>>> network. This means high network utilization and, as long as gluster
>>> doesn't have a dedicated network for replication,
>>> this can slow down client access.
>> 
>> Additionally, using a RAID-0 doesn't give any fault tolerance.
>> My question was about achieving the best redundancy and data protection
>> available. If I have to use RAID-0, which doesn't protect data, why not
>> remove RAID altogether?
> 

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] to RAID or not?

2016-07-04 Thread tom
I would highly stress, regardless of whatever solution you choose - make sure 
you test actual workload performance before going all-in.

In my testing, performance (esp. iops and latency) decreased as I added bricks 
and additional nodes.  Since you have many spindles now, I would encourage you 
to test your workload up to and including the total brick count you ultimately 
expect.  RAID level and whether it’s md, zfs, or hardware isn’t likely to make 
as significant of a performance impact as Gluster and its various clients will. 
 Test failure scenarios and performance characteristics during impairment 
events thoroughly.  Make sure heals happen as you expect, including final 
contents of files modified during an impairment.  If you have many small files 
or directories that will be accessed concurrently, make sure to stress that 
behavior in your testing.

Gluster can be great for targeting availability and distribution at low 
software cost, and I would say as of today at the expense of performance, but 
as with any scale-out NAS there are limitations and some surprises along the 
path.

Good hunting,
-t

> On Jul 4, 2016, at 10:44 AM, Gandalf Corvotempesta 
>  wrote:
> 
> 2016-07-04 19:35 GMT+02:00 Russell Purinton :
>> For 3 servers with 12 disks each, I would do Hardware RAID0 (or mdadm if you 
>> don’t have a RAID card) of 3 disks.  So four 3-disk RAID0’s per server.
> 
> 3 servers is just the start. We plan to use 5 servers in a short time
> and up to 15 in production.
> 
>> I would set them up as Replica 3 Arbiter 1
>> 
>> server1:/brickA server2:/brickC server3:/brickA
>> server1:/brickB server2:/brickD server3:/brickB
>> server2:/brickA server3:/brickC server1:/brickA
>> server2:/brickB server3:/brickD server1:/brickB
>> server3:/brickA server1:/brickC server2:/brickA
>> server3:/brickB server1:/brickD server2:/brickB
>> 
>> The benefit of this is that you can lose an entire server node (12 disks) 
>> and all of your data is still accessible.   And you get the same space as if 
>> they were all in a RAID10.
>> 
>> If you lose any disk, the entire 3 disk brick will need to be healed from 
>> the replica.   I have 20GbE on each server so it doesn’t take long.   It 
>> copied 20TB in about 18 hours once.
> 
> So, any disk failure would mean at least 6TB to be recovered via the
> network. This means high network utilization and, as long as gluster
> doesn't have a dedicated network for replication,
> this can slow down client access.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] to RAID or not?

2016-07-04 Thread Russell Purinton
The fault tolerance is provided by Gluster replica translator.

RAID0 to me is preferable to JBOD because you get 3x read performance and 3x 
write performance.   If performance is not a concern, or if you only have 1GbE, 
then it may not matter, and you could just do JBOD with a ton of bricks.

The same method scales to however many servers you need… imagine them in a 
ring…

server1 A & B   replica to server 2 C & D
server2 A & B   replica to server 3 C & D
server3 A & B   replica to server 1 C & D

Adding a 4th server?  No problem… you can reconfigure the bricks to do
server1 A & B   replica to server 2 C & D
server2 A & B   replica to server 3 C & D
server3 A & B   replica to server 4 C & D
server4 A & B   replica to server 1 C & D

or 5 servers
server1 A & B   replica to server 2 C & D
server2 A & B   replica to server 3 C & D
server3 A & B   replica to server 4 C & D
server4 A & B   replica to server 5 C & D
server5 A & B   replica to server 6 C & D

I guess my recommendation is not the best for redundancy and data protection… 
because I’m concerned with performance and space; as long as I have 2 copies 
of the data on different servers, I’m happy.  

If you care more about performance than space, and want extra data redundancy 
(more than 2 copies), then use RAID 10 on the nodes, and use gluster replica.  
This means you have every byte of data on 4 disks.

If you care more about space than performance and want extra redundancy use 
RAID 6, and gluster replica.

I always recommend gluster replica, because several times I have lost entire 
servers… and it’s nice to have the data on more than one server.

> On Jul 4, 2016, at 1:46 PM, Gandalf Corvotempesta 
>  wrote:
> 
> 2016-07-04 19:44 GMT+02:00 Gandalf Corvotempesta
> :
>> So, any disk failure would mean at least 6TB to be recovered via the
>> network. This means high network utilization and, as long as gluster
>> doesn't have a dedicated network for replication,
>> this can slow down client access.
> 
> Additionally, using a RAID-0 doesn't give any fault tolerance.
> My question was about achieving the best redundancy and data protection
> available. If I have to use RAID-0, which doesn't protect data, why not
> remove RAID altogether?

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] to RAID or not?

2016-07-04 Thread Gandalf Corvotempesta
2016-07-04 19:44 GMT+02:00 Gandalf Corvotempesta
:
> So, any disk failure would mean at least 6TB to be recovered via the
> network. This means high network utilization and, as long as gluster
> doesn't have a dedicated network for replication,
> this can slow down client access.

Additionally, using a RAID-0 doesn't give any fault tolerance.
My question was about achieving the best redundancy and data protection
available. If I have to use RAID-0, which doesn't protect data, why not
remove RAID altogether?
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] to RAID or not?

2016-07-04 Thread Gandalf Corvotempesta
2016-07-04 19:35 GMT+02:00 Russell Purinton :
> For 3 servers with 12 disks each, I would do Hardware RAID0 (or mdadm if you 
> don’t have a RAID card) of 3 disks.  So four 3-disk RAID0’s per server.

3 servers is just the start. We plan to use 5 servers in a short time
and up to 15 in production.

> I would set them up as Replica 3 Arbiter 1
>
> server1:/brickA server2:/brickC server3:/brickA
> server1:/brickB server2:/brickD server3:/brickB
> server2:/brickA server3:/brickC server1:/brickA
> server2:/brickB server3:/brickD server1:/brickB
> server3:/brickA server1:/brickC server2:/brickA
> server3:/brickB server1:/brickD server2:/brickB
>
> The benefit of this is that you can lose an entire server node (12 disks) and 
> all of your data is still accessible.   And you get the same space as if they 
> were all in a RAID10.
>
> If you lose any disk, the entire 3 disk brick will need to be healed from the 
> replica.   I have 20GbE on each server so it doesn’t take long.   It copied 
> 20TB in about 18 hours once.

So, any disk failure would mean at least 6TB to be recovered via the
network. This means high network utilization and, as long as gluster
doesn't have a dedicated network for replication,
this can slow down client access.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] to RAID or not?

2016-07-04 Thread Russell Purinton
For 3 servers with 12 disks each, I would do Hardware RAID0 (or mdadm if you 
don’t have a RAID card) of 3 disks.  So four 3-disk RAID0’s per server. 
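
A rough sketch of one such 3-disk RAID0 brick with mdadm (device names,
filesystem, and mount point are just examples):

mdadm --create /dev/md0 --level=0 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd
mkfs.xfs /dev/md0
mkdir -p /bricks/brickA && mount /dev/md0 /bricks/brickA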

I would set them up as Replica 3 Arbiter 1

server1:/brickA server2:/brickC server3:/brickA
server1:/brickB server2:/brickD server3:/brickB
server2:/brickA server3:/brickC server1:/brickA
server2:/brickB server3:/brickD server1:/brickB
server3:/brickA server1:/brickC server2:/brickA
server3:/brickB server1:/brickD server2:/brickB

The benefit of this is that you can lose an entire server node (12 disks) and 
all of your data is still accessible.   And you get the same space as if they 
were all in a RAID10.

If you lose any disk, the entire 3 disk brick will need to be healed from the 
replica.   I have 20GbE on each server so it doesn’t take long.   It copied 
20TB in about 18 hours once.
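
For reference, a layout like the one above maps onto the volume-create syntax
roughly as follows; the volume name and brick paths are placeholders, and with
arbiter 1 the third brick of each triple holds only metadata:

gluster volume create gv0 replica 3 arbiter 1 \
  server1:/bricks/a server2:/bricks/c server3:/bricks/arb1 \
  server1:/bricks/b server2:/bricks/d server3:/bricks/arb2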
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] to RAID or not?

2016-07-04 Thread Gandalf Corvotempesta
2016-07-04 19:25 GMT+02:00 Matt Robinson :
> If you don't trust the hardware raid, then steer clear of raid-6 as mdadm 
> raid 6 is stupidly slow.
> I don't completely trust hardware raid either, but rebuild times should be 
> under a day and in order to lose a raid-6 array you have to lose 3 disks.
> My own systems are hardware raid-6.
> If you're not terribly worried about maximising usable storage, then mdadm 
> raid-10 is your friend.

All of my servers are hardware RAID-6 with 8x300GB SAS 15K (some
servers with 600GB)

A rebuild of a single disk in a 6x600GB SAS RAID-6 takes exactly 22 hours.

That's with 15K SAS disks. Now try it with 2TB SATA 7200 rpm disks (more than
twice the size, less than half the speed).
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] to RAID or not?

2016-07-04 Thread Matt Robinson
If you don't trust the hardware raid, then steer clear of raid-6 as mdadm raid 
6 is stupidly slow.
I don't completely trust hardware raid either, but rebuild times should be 
under a day and in order to lose a raid-6 array you have to lose 3 disks.
My own systems are hardware raid-6.
If you're not terribly worried about maximising usable storage, then mdadm 
raid-10 is your friend.


> On 4 Jul 2016, at 18:15:26, Gandalf Corvotempesta 
>  wrote:
> 
> 2016-07-04 17:01 GMT+02:00 Matt Robinson :
>> Hi Gandalf,
>> 
>> Are you using hardware raid or mdadm?
>> On high quality hardware raid, a 12 disk raid-6 is pretty solid.  With mdadm 
>> any raid6 (especially with 12 disks) will be rubbish.
> 
> I can use both.
> I don't like hardware RAID very much, even high quality. Recently I've been
> having too many issues with hardware RAID (like multiple disks kicked
> out for no apparent reason and virtual disks failing with data loss).
> 
> A RAID-6 with 12x2TB SATA disks would take days to rebuild; in the
> meanwhile, multiple disks could fail, resulting in data loss.
> Yes, gluster is able to recover from this, but I prefer to avoid having
> to resync 24TB of data over the network.
> 
> What about software RAID-1? 6 RAID-1 arrays for each gluster node and 6
> disks wasted, but SATA disks are cheaper.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] to RAID or not?

2016-07-04 Thread Joe Julian
IMHO you use raid for performance reasons and gluster for fault tolerance and 
scale.

On July 4, 2016 7:54:44 AM PDT, Gandalf Corvotempesta 
 wrote:
>No suggestions ?
>On 14 Jun 2016 at 10:01 AM, "Gandalf Corvotempesta" <
>gandalf.corvotempe...@gmail.com> wrote:
>
>> Let's assume a small cluster made by 3 servers, 12 disks/bricks each.
>> This cluster would be expanded to a maximum of 15 servers in near
>future.
>>
>> What do you suggest, a JBOD or a RAID? Which RAID level?
>>
>> 15 servers with 12 disks/bricks in JBOD are 180 bricks. Is this an
>> acceptable value?
>> Multiple RAID-6 arrays for each server? For example, RAID-6 with 6 disks and
>> another RAID-6 with the other 6 disks. I'd lose 4 disks on each
>> server, performance would be affected and rebuild times would be
>> huge
>> (by using 2TB/4TB disks)
>>
>> Any suggestions?
>>
>
>
>
>

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] to RAID or not?

2016-07-04 Thread ML mail
Hi Gandalf
Not really suggesting here, just mentioning what I am using: an HBA adapter 
with 12 disks, so basically JBOD, but with ZFS on top and an array of 12 disks 
in RAIDZ2 (sort of RAID-6, but ZFS-style). I am pretty happy with that setup 
so far.
Cheers
ML
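
In case it helps, a rough sketch of how such a pool gets created (pool and
device names are only examples, not my exact layout):

zpool create tank raidz2 sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk sdl
zfs set xattr=sa tank    # per the xattr advice elsewhere in this thread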
 

On Monday, July 4, 2016 4:54 PM, Gandalf Corvotempesta 
 wrote:
 

No suggestions?
On 14 Jun 2016 at 10:01 AM, "Gandalf Corvotempesta" 
 wrote:

Let's assume a small cluster made by 3 servers, 12 disks/bricks each.
This cluster would be expanded to a maximum of 15 servers in near future.

What do you suggest, a JBOD or a RAID? Which RAID level?

15 servers with 12 disks/bricks in JBOD are 180 bricks. Is this an
acceptable value?
Multiple RAID-6 arrays for each server? For example, RAID-6 with 6 disks and
another RAID-6 with the other 6 disks. I'd lose 4 disks on each
server, performance would be affected and rebuild times would be huge
(by using 2TB/4TB disks)

Any suggestions?



   ___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] to RAID or not?

2016-07-04 Thread Matt Robinson
Hi Gandalf,

Are you using hardware raid or mdadm?
On high quality hardware raid, a 12 disk raid-6 is pretty solid.  With mdadm 
any raid6 (especially with 12 disks) will be rubbish.

Matt.

> On 4 Jul 2016, at 15:54:44, Gandalf Corvotempesta 
>  wrote:
> 
> No suggestions ?
> 
> Il 14 giu 2016 10:01 AM, "Gandalf Corvotempesta" 
>  ha scritto:
> Let's assume a small cluster made by 3 servers, 12 disks/bricks each.
> This cluster would be expanded to a maximum of 15 servers in near future.
> 
> What do you suggest, a JBOD or a RAID? Which RAID level?
> 
> 15 servers with 12 disks/bricks in JBOD are 180 bricks. Is this an
> acceptable value?
> Multiple RAID-6 arrays for each server? For example, RAID-6 with 6 disks and
> another RAID-6 with the other 6 disks. I'd lose 4 disks on each
> server, performance would be affected and rebuild times would be huge
> (by using 2TB/4TB disks)
> 
> Any suggestions?

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] to RAID or not?

2016-07-04 Thread Gandalf Corvotempesta
No suggestions ?
On 14 Jun 2016 at 10:01 AM, "Gandalf Corvotempesta" <
gandalf.corvotempe...@gmail.com> wrote:

> Let's assume a small cluster made by 3 servers, 12 disks/bricks each.
> This cluster would be expanded to a maximum of 15 servers in near future.
>
> What do you suggest, a JBOD or a RAID? Which RAID level?
>
> 15 servers with 12 disks/bricks in JBOD are 180 bricks. Is this an
> acceptable value?
> Multiple RAID-6 arrays for each server? For example, RAID-6 with 6 disks and
> another RAID-6 with the other 6 disks. I'd lose 4 disks on each
> server, performance would be affected and rebuild times would be huge
> (by using 2TB/4TB disks)
>
> Any suggestions?
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 3.7.12/3.8.qemu/proxmox testing

2016-07-04 Thread Lindsay Mathieson

On 4/07/2016 11:06 PM, Poornima Gurusiddaiah wrote:
Found the RCA for the issue; an explanation of the same can be found at https://bugzilla.redhat.com/show_bug.cgi?id=1352482#c8
The patch for this will follow shortly and we hope to include it in 3.1.13


Brilliant, thanks all.

--
Lindsay Mathieson

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] 3.7.12/3.8.qemu/proxmox testing

2016-07-04 Thread Atin Mukherjee
On Mon, Jul 4, 2016 at 6:36 PM, Poornima Gurusiddaiah 
wrote:

>
> - Original Message -
> > From: "Lindsay Mathieson" 
> > To: "Kaushal M" , gluster-users@gluster.org
> > Cc: "Gluster Devel" 
> > Sent: Monday, July 4, 2016 4:23:37 PM
> > Subject: Re: [Gluster-users] 3.7.12/3.8.qemu/proxmox testing
> >
> > On 4/07/2016 7:16 PM, Kaushal M wrote:
> > > An update on this, we are tracking this issue on bugzilla [1].
> > > I've added some of the observations made till now in the bug. Copying
> > > the same here.
> >
> > Thanks Kaushal, appreciate the updates.
> >
>
> Found the RCA for the issue, an explanation of the same can be found at
> https://bugzilla.redhat.com/show_bug.cgi?id=1352482#c8
> The patch for this will follow shortly and we hope to include it in 3.1.13
>

You meant 3.7.13, isn't it :)

Kaushal, RTalur, Poornima - Good work guys!



>
> Regards,
> Poornima
>
> >
> > --
> > Lindsay Mathieson
> >
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 3.7.12/3.8.qemu/proxmox testing

2016-07-04 Thread Poornima Gurusiddaiah

- Original Message -
> From: "Lindsay Mathieson" 
> To: "Kaushal M" , gluster-users@gluster.org
> Cc: "Gluster Devel" 
> Sent: Monday, July 4, 2016 4:23:37 PM
> Subject: Re: [Gluster-users] 3.7.12/3.8.qemu/proxmox testing
> 
> On 4/07/2016 7:16 PM, Kaushal M wrote:
> > An update on this, we are tracking this issue on bugzilla [1].
> > I've added some of the observations made till now in the bug. Copying
> > the same here.
> 
> Thanks Kaushal, appreciate the updates.
> 

Found the RCA for the issue; an explanation of the same can be found at 
https://bugzilla.redhat.com/show_bug.cgi?id=1352482#c8 
The patch for this will follow shortly and we hope to include it in 3.1.13

Regards,
Poornima

> 
> --
> Lindsay Mathieson
> 
> 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] root-squash permission denied on rename

2016-07-04 Thread Matt Robinson
Hi,

I'm fairly new to gluster so please forgive me if I'm in any way out of 
protocol for this list.

I have a distributed volume with 2 servers hosting bricks and a third just 
managing the system.
I really want root squashing, as there are a large number of clients and I do 
not want a bad keystroke on one to wipe out the contents of the gluster 
file-system.
I'm using gluster 3.7.10 on Scientific Linux 6.7.

I just cannot get gluster to work properly for normal users with root-squashing 
enabled.  The problem is easiest to reproduce if one creates a directory with 
mkdir, creates a file with say 'echo hi > filename' and then tries to rename 
the latter to place it in the former using mv.  This fails about 50% of the 
time.  My reading suggests that it occurs when gluster decides to move the file 
from one brick to another as it renames it.  rebalance and fix-layout have been 
run, but have long finished and the problem persists.
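
For the record, the reproduction looks roughly like this on the FUSE mount
(the mount point is just an example), run as a normal user:

mkdir /mnt/gluster/somedir
echo hi > /mnt/gluster/somefile
mv /mnt/gluster/somefile /mnt/gluster/somedir/    # fails roughly half the time with root-squash on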

I've spent a fair amount of time googling this issue and it's clearly not 
unprecedented, but it's supposedly fixed long before  v3.7.10.
I really would appreciate it if somebody could rescue me.  For the moment I'm 
running with server.root-squash turned off.
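
For anyone searching the archives later, the option I'm toggling is the
per-volume setting below (the volume name is a placeholder):

gluster volume set myvol server.root-squash off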

Thanks,

Matt.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] 3.7.12/3.8.qemu/proxmox testing

2016-07-04 Thread Lindsay Mathieson

On 4/07/2016 7:16 PM, Kaushal M wrote:

An update on this, we are tracking this issue on bugzilla [1].
I've added some of the observations made till now in the bug. Copying
the same here.


Thanks Kaushal, appreciate the updates.


--
Lindsay Mathieson

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] 3.7.12/3.8.qemu/proxmox testing

2016-07-04 Thread Kaushal M
On Mon, Jul 4, 2016 at 10:46 AM, Kaushal M  wrote:
> On Mon, Jul 4, 2016 at 9:47 AM, Dmitry Melekhov  wrote:
>> On 01.07.2016 at 07:31, Lindsay Mathieson wrote:
>>>
>>> Started a new thread for this to get away from the somewhat panicky
>>> subject line ...
>>>
>>> Some more test results. I built pve-qemu-kvm against gluster 3.8 and
>>> installed, which I hoped would remove any libglusterfs version
>>> issues.
>>>
>>> Unfortunately it made no difference - same problems emerged.
>>>
>> Hello!
>>
>> I guess there is a problem on the server side, because in CentOS 7 libgfapi is
>> dynamically linked,
>> and, thus, is automatically upgraded. But we have the same problem.
>>
>
> Thanks for the updates guys. This does indicate something has changed
> in libgfapi with the latest update.
> We are still trying to identify the cause, and will keep you updated on this.
>

An update on this, we are tracking this issue on bugzilla [1].
I've added some of the observations made till now in the bug. Copying
the same here.

```
With qemu-img at least the hangs happen when creating qcow2 images.
The command doesn't hang when creating raw images.

When creating a qcow2 image, the qemu-img appears to be reloading the
glusterfs graph several times. This can be observed in the attached
log where qemu-img is run against glusterfs-3.7.11.

With glusterfs-3.7.12, this doesn't happen as an early writev failure
happens on the brick transport with a EFAULT (Bad address) errno (see
attached log). No further actions happen after this, and the qemu-img
command hangs till the RPC ping-timeout happens and then fails.
```
~kaushal
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1352482
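
For reference, the two qemu-img invocations being compared look roughly like
this (host and volume names are illustrative, and qemu-img must be built with
gluster support):

qemu-img create -f qcow2 gluster://server1/testvol/test.qcow2 1G    # hangs against 3.7.12
qemu-img create -f raw gluster://server1/testvol/test.raw 1G        # completes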

>>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] (WAS : Re: Fedora upgrade to f24 installed 3.8.0 client and broke mounting)

2016-07-04 Thread Hoggins!
Hello Vijay,

On 24/06/2016 at 21:49, Vijay Bellur wrote:
> Note that if servers are upgraded ahead of the clients, this problem
> should not be seen. 

I'm facing the same problem, and for that I'm planning an upgrade of the
servers. I have 3 GlusterFS bricks, and I wish to upgrade them (3.7.11
to 3.8.0) one by one.
Am I going to break anything?

Thanks !

Hoggins!



___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users