Re: [Gluster-users] [ovirt-users] Tracking down high writes in GlusterFS volume

2019-02-25 Thread Krutika Dhananjay
On Fri, Feb 15, 2019 at 12:30 AM Jayme  wrote:

> Running an oVirt 4.3 HCI 3-way replica cluster with SSD backed storage.
> I've noticed that my SSD writes (smart Total_LBAs_Written) are quite high
> on one particular drive.  Specifically I've noticed one volume has much
> higher total bytes written than the others (despite using less overall space).
>

Are writes higher on one particular volume? Or did one brick see more
writes than its two replicas within the same volume? Could you share the
volume info output of the affected volume and, if the issue is with a
single brick, the name of that brick?

Also, did you check whether the volume was undergoing any heals (`gluster
volume heal <volname> info`)?
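
As a quick sketch of both checks from the shell (assuming a hypothetical
volume name 'myvol'):

    # Check for pending heals on the volume
    gluster volume heal myvol info

    # Enable profiling and compare per-brick WRITE fop counts across
    # the replicas
    gluster volume profile myvol start
    gluster volume profile myvol info

    # On each host, read the SSD wear counter from SMART (attribute
    # names vary by drive vendor)
    smartctl -A /dev/sda | grep -i total_lbas_written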

-Krutika

> My volume is writing over 1TB of data per day (by my manual calculation,
> and with glusterfs profiling) and wearing my SSDs quickly, how can I best
> determine which VM or process is at fault here?
>
> There are 5 low use VMs using the volume in question.  I'm attempting to
> track iostats on each of the vm's individually but so far I'm not seeing
> anything obvious that would account for 1TB of writes per day that the
> gluster volume is reporting.
> ___
> Users mailing list -- us...@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/us...@ovirt.org/message/OZHZXQS4GUPPJXOZSBTO6X5ZL6CATFXK/
>
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Code of Conduct Update

2019-02-25 Thread Amye Scavarda
We've updated the code of conduct for Gluster to make it clearer; it's
now based on the Contributor Covenant 1.4.
(https://www.contributor-covenant.org/version/1/4/code-of-conduct.html)
This is the same Code of Conduct that many other communities have
adopted (https://www.contributor-covenant.org/adopters).

You can find the code of conduct at
https://www.gluster.org/legal-page/code-of-conduct/
Feel free to email the Technical Leadership Council (tlc@) with
questions; our code of conduct is designed to make participation in
the community easy.

-- 
Amye Scavarda | a...@redhat.com | Gluster Community Lead
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Geo-Replication in "FAULTY" state after files are added to master volume: gsyncd worker crashed in syncdutils with "OSError: [Errno 22] Invalid argument"

2019-02-25 Thread Boubacar Cisse
Hello all,

I'm having trouble getting gluster geo-replication working on Ubuntu 18.04
(Bionic). The Gluster version is 5.3. I'm able to create the geo-replication
session successfully, but its status goes from "Initializing" to "Faulty" in
a loop after the session is started. I've filed a bug report with all the
necessary information at https://bugzilla.redhat.com/show_bug.cgi?id=1680324
Any assistance/tips for fixing this issue will be greatly appreciated.
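
As a first diagnostic pass, the per-brick worker status and the
master-side worker log are usually the quickest leads (a sketch using
the volume and host names from the session below):

    # Show per-brick worker status for the session
    gluster volume geo-replication gfs1 media03::gfs1 status detail

    # Watch the master-side worker log for the crash backtrace
    tail -f /var/log/glusterfs/geo-replication/gfs1_media03_gfs1/gsyncd.log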

5/ Log entries
[MASTER SERVER GEO REP LOG]
root@media01:/var/log/glusterfs/geo-replication/gfs1_media03_gfs1# cat
gsyncd.log
[2019-02-23 21:36:43.851184] I
[gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status
Change status=Initializing...
[2019-02-23 21:36:43.851489] I [monitor(monitor):157:monitor] Monitor:
starting gsyncd worker brick=/gfs1-data/brick slave_node=media03
[2019-02-23 21:36:43.856857] D [monitor(monitor):228:monitor] Monitor:
Worker would mount volume privately
[2019-02-23 21:36:43.895652] I [gsyncd(agent /gfs1-data/brick):308:main]
: Using session config file
path=/var/lib/glusterd/geo-replication/gfs1_media03_gfs1/gsyncd.conf
[2019-02-23 21:36:43.896118] D [subcmds(agent
/gfs1-data/brick):103:subcmd_agent] : RPC FD rpc_fd='8,11,10,9'
[2019-02-23 21:36:43.896435] I [changelogagent(agent
/gfs1-data/brick):72:__init__] ChangelogAgent: Agent listining...
[2019-02-23 21:36:43.897432] I [gsyncd(worker /gfs1-data/brick):308:main]
: Using session config file
path=/var/lib/glusterd/geo-replication/gfs1_media03_gfs1/gsyncd.conf
[2019-02-23 21:36:43.904604] I [resource(worker
/gfs1-data/brick):1366:connect_remote] SSH: Initializing SSH connection
between master and slave...
[2019-02-23 21:36:43.905631] D [repce(worker /gfs1-data/brick):196:push]
RepceClient: call 22733:140323447641920:1550957803.9055686
__repce_version__() ...
[2019-02-23 21:36:45.751853] D [repce(worker
/gfs1-data/brick):216:__call__] RepceClient: call
22733:140323447641920:1550957803.9055686 __repce_version__ -> 1.0
[2019-02-23 21:36:45.752202] D [repce(worker /gfs1-data/brick):196:push]
RepceClient: call 22733:140323447641920:1550957805.7521348 version() ...
[2019-02-23 21:36:45.785690] D [repce(worker
/gfs1-data/brick):216:__call__] RepceClient: call
22733:140323447641920:1550957805.7521348 version -> 1.0
[2019-02-23 21:36:45.786081] D [repce(worker /gfs1-data/brick):196:push]
RepceClient: call 22733:140323447641920:1550957805.7860181 pid() ...
[2019-02-23 21:36:45.820014] D [repce(worker
/gfs1-data/brick):216:__call__] RepceClient: call
22733:140323447641920:1550957805.7860181 pid -> 24141
[2019-02-23 21:36:45.820337] I [resource(worker
/gfs1-data/brick):1413:connect_remote] SSH: SSH connection between master
and slave established. duration=1.9156
[2019-02-23 21:36:45.820520] I [resource(worker
/gfs1-data/brick):1085:connect] GLUSTER: Mounting gluster volume locally...
[2019-02-23 21:36:45.837300] D [resource(worker
/gfs1-data/brick):859:inhibit] DirectMounter: auxiliary glusterfs mount in
place
[2019-02-23 21:36:46.843754] D [resource(worker
/gfs1-data/brick):933:inhibit] DirectMounter: auxiliary glusterfs mount
prepared
[2019-02-23 21:36:46.844113] I [resource(worker
/gfs1-data/brick):1108:connect] GLUSTER: Mounted gluster volume
duration=1.0234
[2019-02-23 21:36:46.844283] I [subcmds(worker
/gfs1-data/brick):80:subcmd_worker] : Worker spawn successful.
Acknowledging back to monitor
[2019-02-23 21:36:46.844623] D [master(worker
/gfs1-data/brick):101:gmaster_builder] : setting up change detection
mode mode=xsync
[2019-02-23 21:36:46.844768] D [monitor(monitor):271:monitor] Monitor:
worker(/gfs1-data/brick) connected
[2019-02-23 21:36:46.846079] D [master(worker
/gfs1-data/brick):101:gmaster_builder] : setting up change detection
mode mode=changelog
[2019-02-23 21:36:46.847300] D [master(worker
/gfs1-data/brick):101:gmaster_builder] : setting up change detection
mode mode=changeloghistory
[2019-02-23 21:36:46.884938] D [repce(worker /gfs1-data/brick):196:push]
RepceClient: call 22733:140323447641920:1550957806.8848307 version() ...
[2019-02-23 21:36:46.885751] D [repce(worker
/gfs1-data/brick):216:__call__] RepceClient: call
22733:140323447641920:1550957806.8848307 version -> 1.0
[2019-02-23 21:36:46.886019] D [master(worker
/gfs1-data/brick):774:setup_working_dir] _GMaster: changelog working dir
/var/lib/misc/gluster/gsyncd/gfs1_media03_gfs1/gfs1-data-brick
[2019-02-23 21:36:46.886212] D [repce(worker /gfs1-data/brick):196:push]
RepceClient: call 22733:140323447641920:1550957806.8861625 init() ...
[2019-02-23 21:36:46.892709] D [repce(worker
/gfs1-data/brick):216:__call__] RepceClient: call
22733:140323447641920:1550957806.8861625 init -> None
[2019-02-23 21:36:46.892794] D [repce(worker /gfs1-data/brick):196:push]
RepceClient: call 22733:140323447641920:1550957806.892774
register('/gfs1-data/brick',
'/var/lib/misc/gluster/gsyncd/gfs1_media03_gfs1/gfs1-data-brick',
'/var/log/glusterfs/geo-replication/gfs1_media03_gfs1/changes-gfs1-data-brick.log',
8, 5) ...
[2019-02-23 

[Gluster-users] GlusterFS - 6.0RC - Test days (27th, 28th Feb)

2019-02-25 Thread Amar Tumballi Suryanarayan
Hi all,

We are calling on our users and developers to contribute to validating the
‘glusterfs-6.0rc’ build in their use cases, especially for upgrade,
stability, and performance.

Some of the key highlights of the release are listed in the release-notes
draft. Please note that some features are being dropped from this release,
so making sure your setup is not going to be affected is critical. Also,
the new default lru-limit option for inodes on fuse mounts should help
control the memory usage of client processes. All good reasons to give it
a shot in your test setup.

If you are a developer using the gfapi interface to integrate with other
projects, note that there are some signature changes, so please make sure
your project works with the latest release. Even if you are only using a
project that depends on gfapi, report any errors you hit with the new RPMs.
We will help fix them.

As part of the test days, we want to focus on testing the latest upcoming
release, i.e. GlusterFS-6, and Gluster volunteers will be on the #gluster
channel on freenode to assist. The key things we are looking for in bug
reports are:

   - See if upgrade from your current version to 6.0rc is smooth, and
     works as documented. Report bugs in the process, or in the
     documentation if you find a mismatch.
   - Functionality is all as expected for your use case, with no issues
     in the actual applications you would run in production, etc.
   - Performance has not degraded in your use case.
      - While we have added some performance options to the code, not
        all of them are turned on, as they have to be enabled based on
        use cases.
      - Make sure the default setup is at least as fast as your current
        version.
      - Try out a few options mentioned in the release notes (especially
        --auto-invalidation=no) and see if they help performance.
   - While doing all the above, also check the points below (see the
     sketch after this list for the relevant commands):
      - See if the log files make sense and are not flooded with "for
        developer only" types of messages.
      - Get 'profile info' output from the old and new versions, and see
        if there is anything out of normal expectation. Check with us on
        the numbers.
      - Get a 'statedump' when there are issues. Try to make sense of
        it, and raise a bug if you don't understand it completely.
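
For those less familiar with the commands involved, a minimal sketch
(assuming a hypothetical test volume named 'testvol'):

    # Capture profile info before and after the upgrade for comparison
    gluster volume profile testvol start
    gluster volume profile testvol info > profile-$(date +%F).txt

    # Trigger a statedump when something looks wrong; the dump files
    # land under /var/run/gluster/ by default
    gluster volume statedump testvol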

Process expected on test days:

   - We have a tracker bug [0]; we will attach all the 'blocker' bugs
     to it.
   - Use this link to report bugs, so that we have more metadata around
     the given bugzilla: Click Here [1].
   - The test cases to be tested are listed in this sheet [2]; please
     add to it, update it, and keep it up-to-date to reduce duplicate
     effort.

Let's make this release a success, together.

Also check if we covered some of the open issues from the weekly
untriaged bugs list [3].

For details on the build and RPMs, check this email [4].

Finally, the dates :-)

   - Wednesday - Feb 27th, and
   - Thursday - Feb 28th

Note that our goal is to identify as many issues as possible in upgrade
and stability scenarios, and if any blockers are found, to make sure we
release with fixes for them, so that each of you, Gluster users, can feel
comfortable upgrading to version 6.0.

Regards,
Gluster Ants.

-- 
Amar Tumballi (amarts)
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster and bonding

2019-02-25 Thread Alvin Starr

On 2/25/19 11:48 AM, Boris Zhmurov wrote:

On 25/02/2019 14:24, Jorick Astrego wrote:


Hi,

Have not measured it as we have been running this way for years now 
and haven't experienced any problems with "transport endpoint is not 
connected” with this setup.




Hello,

Jorick, how often (during those years) did your NICs break?


Over the years (30 of them) I have had problems with bad ports on switches,
with some manufacturers being worse than others.


--
Alvin Starr   ||   land:  (905)513-7688
Netvel Inc.   ||   Cell:  (416)806-0133
al...@netvel.net  ||

___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster and bonding

2019-02-25 Thread Boris Zhmurov

On 25/02/2019 14:24, Jorick Astrego wrote:


Hi,

Have not measured it as we have been running this way for years now 
and haven't experienced any problems with "transport endpoint is not 
connected” with this setup.




Hello,

Jorick, how often (during those years) did your NICs break?


--
Kind regards,
Boris Zhmurov
mailto: b...@kernelpanic.ru

___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-Maintainers] glusterfs-6.0rc0 released

2019-02-25 Thread Shyam Ranganathan
Hi,

Release-6 RC0 packages are built (see the mail below). This is a good time
to start testing the release bits and reporting any issues on bugzilla.
Please post to the lists any testing done and the feedback from it.

We have about 2 weeks to GA of release-6, barring any major blockers
uncovered during the test phase. Please take this time to help make the
release effective by testing it.

Thanks,
Shyam

NOTE: CentOS StorageSIG packages for the same are still pending and
should be available in due course.
On 2/23/19 9:41 AM, Kaleb Keithley wrote:
> 
> GlusterFS 6.0rc0 is built in Fedora 30 and Fedora 31/rawhide.
> 
> Packages for Fedora 29, RHEL 8, RHEL 7, and RHEL 6* and Debian 9/stretch
> and Debian 10/buster are at
> https://download.gluster.org/pub/gluster/glusterfs/qa-releases/6.0rc0/
> 
> Packages are signed. The public key is at
> https://download.gluster.org/pub/gluster/glusterfs/6/rsa.pub
> 
> * RHEL 6 is client-side only. Fedora 29, RHEL 7, and RHEL 6 RPMs are
> Fedora Koji scratch builds. RHEL 7 and RHEL 6 RPMs are provided here for
> convenience only, and are independent of the RPMs in the CentOS Storage SIG.
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster and bonding

2019-02-25 Thread Jorick Astrego
Hi,

We have not measured it, as we have been running this way for years now
and haven't experienced any problems with "transport endpoint is not
connected" with this setup.

We used the default options "BONDING_OPTS='mode=6 miimon=100'"

miimon=time_in_milliseconds
    Specifies (in milliseconds) how often MII link monitoring
    occurs. This is useful if high availability is required because
    MII is used to verify that the NIC is active.
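
For reference, a mode 6 bond on a sysconfig-style system is roughly the
following (a sketch with hypothetical interface names and addressing,
not our exact production config):

    # /etc/sysconfig/network-scripts/ifcfg-bond0
    DEVICE=bond0
    TYPE=Bond
    BONDING_MASTER=yes
    BONDING_OPTS="mode=6 miimon=100"
    BOOTPROTO=none
    IPADDR=10.0.0.11
    PREFIX=24
    ONBOOT=yes

    # /etc/sysconfig/network-scripts/ifcfg-eth0 (and likewise eth1)
    DEVICE=eth0
    MASTER=bond0
    SLAVE=yes
    BOOTPROTO=none
    ONBOOT=yes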


On 2/25/19 2:22 PM, Martin Toth wrote:
> How long does it take for your devices (using mode 5 or 6; ALB is
> preferred for GlusterFS) to take over the MAC? This can result in your
> error - "transport endpoint is not connected" - there are some
> timeouts within gluster set by default.
> I am using LACP and it works without any problem. Can you share your
> mode 5/6 configuration?
>
> Thanks.
> Martin
>
>> On 25 Feb 2019, at 13:44, Jorick Astrego wrote:
>>
>> Hi,
>>
>> Well no, mode 5 and mode 6 also have fault tolerance and don't need
>> any special switch config.
>>
>> Quick google search:
>>
>> https://serverfault.com/questions/734246/does-balance-alb-and-balance-tlb-support-fault-tolerance
>>
>> Bonding Mode 5 (balance-tlb) works by looking at all the devices
>> in the bond, and sending out the slave with the least current
>> traffic load. Traffic is only received by one slave (the "primary
>> slave"). If a slave is lost, that slave is not considered for
>> transmission, so this mode is fault-tolerant.
>>
>> Bonding Mode 6 (balance-alb) works as above, except incoming ARP
>> requests are intercepted by the bonding driver, and the bonding
>> driver generates ARP replies so that external hosts are tricked
>> into sending their traffic into one of the other bonding slaves
>> instead of the primary slave. If many hosts in the same broadcast
>> domain contact the bond, then traffic should balance roughly
>> evenly into all slaves.
>>
>> If a slave is lost in Mode 6, then it may take some time for a
>> remote host to time out its ARP table entry and send a new ARP
>> request. A TCP or SCTP retransmission tends to lead to an ARP
>> request fairly quickly, but a UDP datagram does not, and will
>> rely on the usual ARP table refresh. So Mode 6 is fault
>> tolerant, but convergence on slave loss may take some time
>> depending on the Layer 4 protocol used.
>>
>> If you are worried about fast fault tolerance, then consider
>> using Mode 4 (802.3ad aka LACP) which negotiates link aggregation
>> between the bond and the switch, and constantly updates the link
>> status between the aggregation partners. Mode 4 also has
>> configurable load balance hashing so is better for in-order
>> delivery of TCP streams compared to Mode 5 or Mode 6.
>>
>> https://wiki.linuxfoundation.org/networking/bonding
>>
>>  * balance-tlb or 5
>>    Adaptive transmit load balancing: channel bonding that does not
>>    require any special switch support. The outgoing traffic is
>>    distributed according to the current load (computed relative to
>>    the speed) on each slave. Incoming traffic is received by the
>>    current slave. If the receiving slave fails, another slave takes
>>    over the MAC address of the failed receiving slave.
>>      o Prerequisite: ethtool support in the base drivers for
>>        retrieving the speed of each slave.
>>  * balance-alb or 6
>>    Adaptive load balancing: includes balance-tlb plus receive load
>>    balancing (rlb) for IPv4 traffic, and does not require any
>>    special switch support. The receive load balancing is achieved
>>    by ARP negotiation.
>>      o The bonding driver intercepts the ARP Replies sent by the
>>        local system on their way out and overwrites the source
>>        hardware address with the unique hardware address of one of
>>        the slaves in the bond such that different peers use
>>        different hardware addresses for the server.
>>      o Receive traffic from connections created by the server is
>>        also balanced. When the local system sends an ARP Request
>>        the bonding driver copies and saves the peer's IP
>>        information from the ARP packet.
>>      o When the ARP Reply arrives from the peer, its hardware
>>        address is retrieved and the bonding driver initiates an
>>        ARP reply to this peer assigning it to one of the slaves in
>>        the bond.
>>      o A problematic outcome of using ARP negotiation for balancing
>>        is that each time that an ARP request is broadcast it uses
>>        the hardware address of the bond. Hence, peers learn the
>>        hardware address of the bond and the balancing of receive
>>        traffic collapses to the current slave. This is handled by
>>        sending updates 

Re: [Gluster-users] Gluster and bonding

2019-02-25 Thread Martin Toth
How long does it take for your devices (using mode 5 or 6; ALB is preferred
for GlusterFS) to take over the MAC? This can result in your error -
"transport endpoint is not connected" - there are some timeouts within
gluster set by default.
I am using LACP and it works without any problem. Can you share your mode
5/6 configuration?

Thanks.
Martin

> On 25 Feb 2019, at 13:44, Jorick Astrego  wrote:
> 
> Hi,
> 
> Well no, mode 5 and mode 6 also have fault tolerance and don't need any 
> special switch config.
> 
> Quick google search:
> 
> https://serverfault.com/questions/734246/does-balance-alb-and-balance-tlb-support-fault-tolerance
>  
> 
> Bonding Mode 5 (balance-tlb) works by looking at all the devices in the bond, 
> and sending out the slave with the least current traffic load. Traffic is 
> only received by one slave (the "primary slave"). If a slave is lost, that 
> slave is not considered for transmission, so this mode is fault-tolerant.
> 
> Bonding Mode 6 (balance-alb) works as above, except incoming ARP requests are 
> intercepted by the bonding driver, and the bonding driver generates ARP 
> replies so that external hosts are tricked into sending their traffic into 
> one of the other bonding slaves instead of the primary slave. If many hosts 
> in the same broadcast domain contact the bond, then traffic should balance 
> roughly evenly into all slaves.
> 
> If a slave is lost in Mode 6, then it may take some time for a remote host to 
> time out its ARP table entry and send a new ARP request. A TCP or SCTP 
> retransmission tends to lead to an ARP request fairly quickly, but a UDP 
> datagram does not, and will rely on the usual ARP table refresh. So Mode 6 is 
> fault tolerant, but convergence on slave loss may take some time depending on 
> the Layer 4 protocol used.
> 
> If you are worried about fast fault tolerance, then consider using Mode 4 
> (802.3ad aka LACP) which negotiates link aggregation between the bond and the 
> switch, and constantly updates the link status between the aggregation 
> partners. Mode 4 also has configurable load balance hashing so is better for 
> in-order delivery of TCP streams compared to Mode 5 or Mode 6.
> 
> https://wiki.linuxfoundation.org/networking/bonding 
> 
> balance-tlb or 5
> Adaptive transmit load balancing: channel bonding that does not require any 
> special switch support. The outgoing traffic is distributed according to the 
> current load (computed relative to the speed) on each slave. Incoming traffic 
> is received by the current slave. If the receiving slave fails, another slave 
> takes over the MAC address of the failed receiving slave.
> Prerequisite:
> Ethtool support in the base drivers for retrieving the speed of each slave.
> balance-alb or 6 
> Adaptive load balancing: includes balance-tlb plus receive load balancing 
> (rlb) for IPV4 traffic, and does not require any special switch support. The 
> receive load balancing is achieved by ARP negotiation.
> The bonding driver intercepts the ARP Replies sent by the local system on 
> their way out and overwrites the source hardware address with the unique 
> hardware address of one of the slaves in the bond such that different peers 
> use different hardware addresses for the server.
> Receive traffic from connections created by the server is also balanced. When 
> the local system sends an ARP Request the bonding driver copies and saves the 
> peer's IP information from the ARP packet.
> When the ARP Reply arrives from the peer, its hardware address is retrieved 
> and the bonding driver initiates an ARP reply to this peer assigning it to 
> one of the slaves in the bond.
> A problematic outcome of using ARP negotiation for balancing is that each 
> time that an ARP request is broadcast it uses the hardware address of the 
> bond. Hence, peers learn the hardware address of the bond and the balancing 
> of receive traffic collapses to the current slave. This is handled by sending 
> updates (ARP Replies) to all the peers with their individually assigned 
> hardware address such that the traffic is redistributed. Receive traffic is 
> also redistributed when a new slave is added to the bond and when an inactive 
> slave is re-activated. The receive load is distributed sequentially (round 
> robin) among the group of highest speed slaves in the bond.
> When a link is reconnected or a new slave joins the bond the receive traffic 
> is redistributed among all active slaves in the bond by initiating ARP 
> Replies with the selected mac address to each of the clients. The updelay 
> parameter (detailed below) must be set to a value equal or greater than the 
> switch's forwarding delay so that the ARP Replies sent to the peers will not 
> be blocked by the switch.
> On 2/25/19 1:16 PM, Martin Toth wrote:
>> Hi Alex,
>> 
>> you have 

Re: [Gluster-users] Gluster and bonding

2019-02-25 Thread Jorick Astrego
Hi,

Well no, mode 5 and mode 6 also have fault tolerance and don't need any
special switch config.

Quick google search:

https://serverfault.com/questions/734246/does-balance-alb-and-balance-tlb-support-fault-tolerance

Bonding Mode 5 (balance-tlb) works by looking at all the devices in
the bond, and sending out the slave with the least current traffic
load. Traffic is only received by one slave (the "primary slave").
If a slave is lost, that slave is not considered for transmission,
so this mode is fault-tolerant.

Bonding Mode 6 (balance-alb) works as above, except incoming ARP
requests are intercepted by the bonding driver, and the bonding
driver generates ARP replies so that external hosts are tricked into
sending their traffic into one of the other bonding slaves instead
of the primary slave. If many hosts in the same broadcast domain
contact the bond, then traffic should balance roughly evenly into
all slaves.

If a slave is lost in Mode 6, then it may take some time for a
remote host to time out its ARP table entry and send a new ARP
request. A TCP or SCTP retransmission tends to lead to an ARP request
fairly quickly, but a UDP datagram does not, and will rely on the
usual ARP table refresh. So Mode 6 is fault tolerant, but
convergence on slave loss may take some time depending on the Layer
4 protocol used.

If you are worried about fast fault tolerance, then consider using
Mode 4 (802.3ad aka LACP) which negotiates link aggregation between
the bond and the switch, and constantly updates the link status
between the aggregation partners. Mode 4 also has configurable load
balance hashing so is better for in-order delivery of TCP streams
compared to Mode 5 or Mode 6.

https://wiki.linuxfoundation.org/networking/bonding

  * balance-tlb or 5
    Adaptive transmit load balancing: channel bonding that does not
    require any special switch support. The outgoing traffic is
    distributed according to the current load (computed relative to the
    speed) on each slave. Incoming traffic is received by the current
    slave. If the receiving slave fails, another slave takes over the
    MAC address of the failed receiving slave.
      o Prerequisite: ethtool support in the base drivers for
        retrieving the speed of each slave.
  * balance-alb or 6
    Adaptive load balancing: includes balance-tlb plus receive load
    balancing (rlb) for IPv4 traffic, and does not require any special
    switch support. The receive load balancing is achieved by ARP
    negotiation.
      o The bonding driver intercepts the ARP Replies sent by the local
        system on their way out and overwrites the source hardware
        address with the unique hardware address of one of the slaves
        in the bond such that different peers use different hardware
        addresses for the server.
      o Receive traffic from connections created by the server is also
        balanced. When the local system sends an ARP Request the
        bonding driver copies and saves the peer's IP information from
        the ARP packet.
      o When the ARP Reply arrives from the peer, its hardware address
        is retrieved and the bonding driver initiates an ARP reply to
        this peer assigning it to one of the slaves in the bond.
      o A problematic outcome of using ARP negotiation for balancing is
        that each time that an ARP request is broadcast it uses the
        hardware address of the bond. Hence, peers learn the hardware
        address of the bond and the balancing of receive traffic
        collapses to the current slave. This is handled by sending
        updates (ARP Replies) to all the peers with their individually
        assigned hardware address such that the traffic is
        redistributed. Receive traffic is also redistributed when a new
        slave is added to the bond and when an inactive slave is
        re-activated. The receive load is distributed sequentially
        (round robin) among the group of highest speed slaves in the
        bond.
      o When a link is reconnected or a new slave joins the bond the
        receive traffic is redistributed among all active slaves in the
        bond by initiating ARP Replies with the selected mac address to
        each of the clients. The updelay parameter (detailed below)
        must be set to a value equal or greater than the switch's
        forwarding delay so that the ARP Replies sent to the peers will
        not be blocked by the switch.

On 2/25/19 1:16 PM, Martin Toth wrote:
> Hi Alex,
>
> you have to use bond mode 4 (LACP - 802.3ad) in order to achieve
> redundancy of cables/ports/switches. I suppose this is what you want.
>
> BR,
> Martin
>
>> On 25 Feb 2019, at 11:43, Alex K wrote:
>>
>> Hi All,
>>
>> I was asking if it 

Re: [Gluster-users] Gluster and bonding

2019-02-25 Thread Dmitry Melekhov

On 25.02.2019 14:43, Alex K wrote:

Hi All,

I was asking if it is possible to have the two separate cables 
connected to two different physical switches.



Yes, if these switches are in a cluster. We use Comware switches, so we 
use IRF; I guess Cisco has LACP support across several switches in Nexus.



When trying mode6 or mode1 in this setup gluster was refusing to start 
the volumes, giving me "transport endpoint is not connected".


server1: cable1 ---- switch1 ---- server2: cable1
                        |
server1: cable2 ---- switch2 ---- server2: cable2


Both switches are connected with each other also. This is done to 
achieve redundancy for the switches.

When disconnecting cable2 from both servers, then gluster was happy.
What could be the problem?


If you need just redundancy, maybe you can use STP? Combine the ports in
a bridge.

I've never tried this though, so I don't know how good STP support is in
the Linux bridge...
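
With iproute2 that would look roughly like this (an untested sketch
with hypothetical interface names):

    # Create a bridge with STP enabled and enslave both NICs
    ip link add name br0 type bridge
    ip link set br0 type bridge stp_state 1
    ip link set eth0 master br0
    ip link set eth1 master br0
    ip link set br0 up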



BTW, I don't think this is a Gluster problem; I think you should ask on
some sort of Linux networking list.





___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster and bonding

2019-02-25 Thread Jim Kinney
Unless the link between the two switches is set as a dedicated management link, 
won't that link create a problem? On the dual switch setup I have, there's a 
dedicated connection that handles inter-switch data. I'm not using bonding or 
teaming at the servers as I have 40Gb Ethernet NICs. Gluster is fine across 
this.

On February 25, 2019 5:43:24 AM EST, Alex K  wrote:
>Hi All,
>
>I was asking if it is possible to have the two separate cables
>connected to
>two different physical switched. When trying mode6 or mode1 in this
>setup
>gluster was refusing to start the volumes, giving me "transport
>endpoint is
>not connected".
>
>server1: cable1 ---- switch1 ---- server2: cable1
>                        |
>server1: cable2 ---- switch2 ---- server2: cable2
>
>Both switches are connected with each other also. This is done to
>achieve
>redundancy for the switches.
>When disconnecting cable2 from both servers, then gluster was happy.
>What could be the problem?
>
>Thanx,
>Alex
>
>
>On Mon, Feb 25, 2019 at 11:32 AM Jorick Astrego 
>wrote:
>
>> Hi,
>>
>> We use bonding mode 6 (balance-alb) for GlusterFS traffic
>>
>>
>>
>https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html/administration_guide/network4
>>
>> Preferred bonding mode for Red Hat Gluster Storage client is mode 6
>> (balance-alb), this allows client to transmit writes in parallel on
>> separate NICs much of the time.
>>
>> Regards,
>>
>> Jorick Astrego
>> On 2/25/19 5:41 AM, Dmitry Melekhov wrote:
>>
>> On 23.02.2019 19:54, Alex K wrote:
>>
>> Hi all,
>>
>> I have a replica 3 setup where each server was configured with dual
>> interfaces in mode 6 bonding. All cables were connected to one common
>> network switch.
>>
>> To add redundancy to the switch, and avoid being a single point of
>> failure, I connected each second cable of each server to a second
>switch.
>> This turned out to not function as gluster was refusing to start the
>volume
>> logging "transport endpoint is disconnected" although all nodes were
>able
>> to reach each other (ping) in the storage network. I switched the
>mode to
>> mode 1 (active/passive) and initially it worked but following a
>reboot of
>> all cluster same issue appeared. Gluster is not starting the volumes.
>>
>> Isn't active/passive supposed to work like that? Can one have such
>> redundant network setup or are there any other recommended
>approaches?
>>
>>
>> Yes, we use LACP; I guess this is mode 4 (we use teamd). It is, no
>> doubt, the best way.
>>
>>
>> Thanx,
>> Alex
>>
>> ___
>> Gluster-users mailing list Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>>
>>
>> ___
>> Gluster-users mailing list Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>>
>>
>>
>>
>> Met vriendelijke groet, With kind regards,
>>
>> Jorick Astrego
>>
>> *Netbulae Virtualization Experts *
>> --
>> Tel: 053 20 30 270 i...@netbulae.eu Staalsteden 4-3A KvK 08198180
>> Fax: 053 20 30 271 www.netbulae.eu 7547 TA Enschede BTW NL821234584B01
>> --
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users

-- 
Sent from my Android device with K-9 Mail. All tyopes are thumb related and 
reflect authenticity.___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster and bonding

2019-02-25 Thread Martin Toth
Hi Alex,

you have to use bond mode 4 (LACP - 802.3ad) in order to achieve redundancy of 
cables/ports/switches. I suppose this is what you want.

BR,
Martin
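
For example, on a sysconfig-style distro the bond definition for mode 4
is roughly this (a sketch; the miimon/lacp_rate/xmit_hash values are
illustrative, and the switch ports must be configured for 802.3ad too):

    # /etc/sysconfig/network-scripts/ifcfg-bond0
    DEVICE=bond0
    TYPE=Bond
    BONDING_OPTS="mode=4 miimon=100 lacp_rate=fast xmit_hash_policy=layer3+4"
    BOOTPROTO=none
    ONBOOT=yes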

> On 25 Feb 2019, at 11:43, Alex K  wrote:
> 
> Hi All, 
> 
> I was asking if it is possible to have the two separate cables connected to 
> two different physical switches. When trying mode6 or mode1 in this setup 
> gluster was refusing to start the volumes, giving me "transport endpoint is 
> not connected". 
> 
> server1: cable1 ---- switch1 ---- server2: cable1
>                         |
> server1: cable2 ---- switch2 ---- server2: cable2
> 
> Both switches are connected with each other also. This is done to achieve 
> redundancy for the switches. 
> When disconnecting cable2 from both servers, then gluster was happy. 
> What could be the problem?
> 
> Thanx,
> Alex
> 
> 
> On Mon, Feb 25, 2019 at 11:32 AM Jorick Astrego wrote:
> Hi,
> 
> We use bonding mode 6 (balance-alb) for GlusterFS traffic
> 
> https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html/administration_guide/network4
>  
> 
> Preferred bonding mode for Red Hat Gluster Storage client is mode 6 
> (balance-alb), this allows client to transmit writes in parallel on separate 
> NICs much of the time. 
> Regards,
> 
> Jorick Astrego
> On 2/25/19 5:41 AM, Dmitry Melekhov wrote:
>> On 23.02.2019 19:54, Alex K wrote:
>>> Hi all, 
>>> 
>>> I have a replica 3 setup where each server was configured with dual 
>>> interfaces in mode 6 bonding. All cables were connected to one common 
>>> network switch. 
>>> 
>>> To add redundancy to the switch, and avoid being a single point of failure, 
>>> I connected each second cable of each server to a second switch. This 
>>> turned out to not function as gluster was refusing to start the volume 
>>> logging "transport endpoint is disconnected" although all nodes were able 
>>> to reach each other (ping) in the storage network. I switched the mode to 
>>> mode 1 (active/passive) and initially it worked but following a reboot of 
>>> all cluster same issue appeared. Gluster is not starting the volumes. 
>>> 
>>> Isn't active/passive supposed to work like that? Can one have such 
>>> redundant network setup or are there any other recommended approaches?
>>> 
>> 
>> Yes, we use LACP; I guess this is mode 4 (we use teamd). It is, no doubt, 
>> the best way.
>> 
>> 
>>> Thanx, 
>>> Alex
>>> 
>>> 
>>> ___
>>> Gluster-users mailing list
>>> Gluster-users@gluster.org 
>>> https://lists.gluster.org/mailman/listinfo/gluster-users 
>>> 
>> 
>> 
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org 
>> https://lists.gluster.org/mailman/listinfo/gluster-users 
>> 
> 
> 
> 
> Met vriendelijke groet, With kind regards,
> 
> Jorick Astrego
> 
> Netbulae Virtualization Experts 
> Tel: 053 20 30 270  i...@netbulae.eu   Staalsteden 4-3A  KvK 08198180
> Fax: 053 20 30 271  www.netbulae.eu    7547 TA Enschede  BTW NL821234584B01
> 
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org 
> https://lists.gluster.org/mailman/listinfo/gluster-users 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster and bonding

2019-02-25 Thread Alex K
Hi All,

I was asking if it is possible to have the two separate cables connected to
two different physical switches. When trying mode6 or mode1 in this setup
gluster was refusing to start the volumes, giving me "transport endpoint is
not connected".

server1: cable1 ---- switch1 ---- server2: cable1
                        |
server1: cable2 ---- switch2 ---- server2: cable2

Both switches are connected with each other also. This is done to achieve
redundancy for the switches.
When disconnecting cable2 from both servers, then gluster was happy.
What could be the problem?
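
(For reference, the bonding driver's own view of each slave can be
checked on every node with the following; 'bond0' here stands for
whatever the bond interface is actually named:

    # Shows per-slave link state, failure counts, and the active slave
    cat /proc/net/bonding/bond0
)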

Thanx,
Alex


On Mon, Feb 25, 2019 at 11:32 AM Jorick Astrego  wrote:

> Hi,
>
> We use bonding mode 6 (balance-alb) for GlusterFS traffic
>
>
> https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html/administration_guide/network4
>
> Preferred bonding mode for Red Hat Gluster Storage client is mode 6
> (balance-alb), this allows client to transmit writes in parallel on
> separate NICs much of the time.
>
> Regards,
>
> Jorick Astrego
> On 2/25/19 5:41 AM, Dmitry Melekhov wrote:
>
> On 23.02.2019 19:54, Alex K wrote:
>
> Hi all,
>
> I have a replica 3 setup where each server was configured with dual
> interfaces in mode 6 bonding. All cables were connected to one common
> network switch.
>
> To add redundancy to the switch, and avoid being a single point of
> failure, I connected each second cable of each server to a second switch.
> This turned out to not function as gluster was refusing to start the volume
> logging "transport endpoint is disconnected" although all nodes were able
> to reach each other (ping) in the storage network. I switched the mode to
> mode 1 (active/passive) and initially it worked but following a reboot of
> all cluster same issue appeared. Gluster is not starting the volumes.
>
> Isn't active/passive supposed to work like that? Can one have such
> redundant network setup or are there any other recommended approaches?
>
>
> Yes, we use LACP; I guess this is mode 4 (we use teamd). It is, no
> doubt, the best way.
>
>
> Thanx,
> Alex
>
> ___
> Gluster-users mailing list Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
> ___
> Gluster-users mailing list Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
>
> Met vriendelijke groet, With kind regards,
>
> Jorick Astrego
>
> *Netbulae Virtualization Experts *
> --
> Tel: 053 20 30 270 i...@netbulae.eu Staalsteden 4-3A KvK 08198180
> Fax: 053 20 30 271 www.netbulae.eu 7547 TA Enschede BTW NL821234584B01
> --
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster and bonding

2019-02-25 Thread Jorick Astrego
Hi,

We use bonding mode 6 (balance-alb) for GlusterFS traffic

https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html/administration_guide/network4

Preferred bonding mode for Red Hat Gluster Storage client is mode 6
(balance-alb), this allows client to transmit writes in parallel on
separate NICs much of the time.

Regards,

Jorick Astrego

On 2/25/19 5:41 AM, Dmitry Melekhov wrote:
> On 23.02.2019 19:54, Alex K wrote:
>> Hi all,
>>
>> I have a replica 3 setup where each server was configured with dual
>> interfaces in mode 6 bonding. All cables were connected to one common
>> network switch.
>>
>> To add redundancy to the switch, and avoid being a single point of
>> failure, I connected each second cable of each server to a second
>> switch. This turned out to not function as gluster was refusing to
>> start the volume logging "transport endpoint is disconnected"
>> although all nodes were able to reach each other (ping) in the
>> storage network. I switched the mode to mode 1 (active/passive) and
>> initially it worked but following a reboot of all cluster same issue
>> appeared. Gluster is not starting the volumes.
>>
>> Isn't active/passive supposed to work like that? Can one have such
>> redundant network setup or are there any other recommended approaches?
>>
>
> Yes, we use LACP; I guess this is mode 4 (we use teamd). It is, no
> doubt, the best way.
>
>
>> Thanx,
>> Alex
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users




Met vriendelijke groet, With kind regards,

Jorick Astrego

Netbulae Virtualization Experts

Tel: 053 20 30 270  i...@netbulae.eu   Staalsteden 4-3A  KvK 08198180
Fax: 053 20 30 271  www.netbulae.eu    7547 TA Enschede  BTW NL821234584B01



___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users