Hi Vlad,

I have not thoroughly tested my setup yet, but so far things look good. The only problem is that I have to manually activate the OSDs using the ceph-deploy command; manually mounting the OSD partition doesn't work.
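For reference, the manual activation looks something like this (hostname and partition are placeholders; the `osd activate` subcommand is from the pre-2.0 ceph-deploy releases, so the exact invocation may differ on newer versions):

    # Activate an already-prepared OSD on a given host and data partition
    ceph-deploy osd activate <hostname>:/dev/sdX1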
Thanks for replying.

Regards,
Rahul S

On 27 June 2018 at 14:15, Дробышевский, Владимир <v...@itgorod.ru> wrote:
> Hello, Rahul!
>
> Do you have your problem during initial cluster creation, or on every
> reboot/leadership transfer? If the former, try removing the floating IP
> while creating the mons, and temporarily transfer the leadership away
> from the server you are going to create the OSD on.
>
> We are using the same configuration without any issues (though we have a
> few more servers), but the Ceph cluster had been created before the
> OpenNebula setup.
>
> We have a number of physical/virtual interfaces on top of IPoIB _and_ an
> ethernet network (with bonding), so there are 3 interfaces for the
> internal communications:
>
> ib0.8003 - 10.103.0.0/16 - Ceph public network and OpenNebula Raft virtual IP
> ib0.8004 - 10.104.0.0/16 - Ceph cluster network
> br0 (on top of the ethernet bonding interface) - 10.101.0.0/16 -
> physical "management" network
>
> We also have a number of other virtual interfaces for per-tenant
> intra-VM networks (VXLAN on top of IP) and so on.
>
> In /etc/hosts we have only "fixed" IPs from the 10.103.0.0/16 network, like:
>
> 10.103.0.1 e001n01.dc1.xxxxxxxx.xx e001n01
>
> /etc/one/oned.conf:
>
> # Executed when a server transits from follower->leader
> RAFT_LEADER_HOOK = [
>     COMMAND   = "raft/vip.sh",
>     ARGUMENTS = "leader ib0.8003 10.103.255.254/16"
> ]
>
> # Executed when a server transits from leader->follower
> RAFT_FOLLOWER_HOOK = [
>     COMMAND   = "raft/vip.sh",
>     ARGUMENTS = "follower ib0.8003 10.103.255.254/16"
> ]
>
> /etc/ceph/ceph.conf:
>
> [global]
> public_network  = 10.103.0.0/16
> cluster_network = 10.104.0.0/16
>
> mon_initial_members = e001n01, e001n02, e001n03
> mon_host = 10.103.0.1,10.103.0.2,10.103.0.3
>
> The cluster and mons were created with ceph-deploy; each OSD was added
> with a modified ceph-disk.py (as we have only 3 drive slots per server,
> we had to co-locate the system partition with the OSD partition on our
> SSDs), on a per-host/per-drive basis:
>
> admin@<host>:~$ sudo ./ceph-disk-mod.py -v prepare --dmcrypt \
>     --dmcrypt-key-dir /etc/ceph/dmcrypt-keys --bluestore --cluster ceph \
>     --fs-type xfs -- /dev/sda
>
> And the current state on the leader:
>
> oneadmin@e001n02:~/remotes/tm$ onezone show 0
> ZONE 0 INFORMATION
> ID   : 0
> NAME : OpenNebula
>
> ZONE SERVERS
> ID NAME     ENDPOINT
>  0 e001n01  http://10.103.0.1:2633/RPC2
>  1 e001n02  http://10.103.0.2:2633/RPC2
>  2 e001n03  http://10.103.0.3:2633/RPC2
>
> HA & FEDERATION SYNC STATUS
> ID NAME     STATE     TERM  INDEX     COMMIT    VOTE  FED_INDEX
>  0 e001n01  follower  1571  68250418  68250417     1         -1
>  1 e001n02  leader    1571  68250418  68250418     1         -1
>  2 e001n03  follower  1571  68250418  68250417    -1         -1
> ...
>
> admin@e001n02:~$ ip addr show ib0.8003
> 9: ib0.8003@ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc mq state UP group default qlen 256
>     link/infiniband a0:00:03:00:fe:80:00:00:00:00:00:00:00:1e:67:03:00:47:c1:1b brd 00:ff:ff:ff:ff:12:40:1b:80:03:00:00:00:00:00:00:ff:ff:ff:ff
>     inet 10.103.0.2/16 brd 10.103.255.255 scope global ib0.8003
>        valid_lft forever preferred_lft forever
>     inet 10.103.255.254/16 scope global secondary ib0.8003
>        valid_lft forever preferred_lft forever
>     inet6 fe80::21e:6703:47:c11b/64 scope link
>        valid_lft forever preferred_lft forever
>
> admin@e001n02:~$ sudo netstat -anp | grep mon
> tcp  0  0  10.103.0.2:6789  0.0.0.0:*         LISTEN       168752/ceph-mon
> tcp  0  0  10.103.0.2:6789  10.103.0.2:44270  ESTABLISHED  168752/ceph-mon
> ...
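A side note on the RAFT hooks in oned.conf above: conceptually, the raft/vip.sh hook just adds or removes the floating IP on the given interface when the leadership changes. A minimal sketch of the idea, assuming standard ip(8)/arping tooling (this is an illustration, not the script OpenNebula ships):

    #!/bin/bash
    # Sketch of a leader/follower VIP hook.
    # Usage: vip.sh {leader|follower} <interface> <address/prefix>
    ACTION=$1
    IFACE=$2
    VIP=$3

    if [ "$ACTION" = "leader" ]; then
        # New leader: bring the floating IP up on the interface...
        sudo ip address add "$VIP" dev "$IFACE"
        # ...and send gratuitous ARP so peers refresh their caches.
        sudo arping -c 3 -U -I "$IFACE" "${VIP%%/*}" || true
    else
        # Demoted to follower: drop the floating IP.
        sudo ip address del "$VIP" dev "$IFACE" || true
    fi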
> admin@e001n02:~$ sudo netstat -anp | grep osd
> tcp  0  0  10.104.0.2:6800   0.0.0.0:*         LISTEN       6736/ceph-osd
> tcp  0  0  10.104.0.2:6801   0.0.0.0:*         LISTEN       6736/ceph-osd
> tcp  0  0  10.103.0.2:6801   0.0.0.0:*         LISTEN       6736/ceph-osd
> tcp  0  0  10.103.0.2:6802   0.0.0.0:*         LISTEN       6736/ceph-osd
> tcp  0  0  10.104.0.2:6801   10.104.0.6:42868  ESTABLISHED  6736/ceph-osd
> tcp  0  0  10.104.0.2:51788  10.104.0.1:6800   ESTABLISHED  6736/ceph-osd
> ...
>
> admin@e001n02:~$ sudo ceph -s
>   cluster:
>     id:     <uuid>
>     health: HEALTH_OK
>
> oneadmin@e001n02:~/remotes/tm$ onedatastore show 0
> DATASTORE 0 INFORMATION
> ID        : 0
> NAME      : system
> USER      : oneadmin
> GROUP     : oneadmin
> CLUSTERS  : 0
> TYPE      : SYSTEM
> DS_MAD    : -
> TM_MAD    : ceph_shared
> BASE PATH : /var/lib/one//datastores/0
> DISK_TYPE : RBD
> STATE     : READY
>
> ...
>
> DATASTORE TEMPLATE
> ALLOW_ORPHANS="YES"
> BRIDGE_LIST="e001n01 e001n02 e001n03"
> CEPH_HOST="e001n01 e001n02 e001n03"
> CEPH_SECRET="secret_uuid"
> CEPH_USER="libvirt"
> DEFAULT_DEVICE_PREFIX="sd"
> DISK_TYPE="RBD"
> DS_MIGRATE="NO"
> POOL_NAME="rbd-ssd"
> RESTRICTED_DIRS="/"
> SAFE_DIRS="/mnt"
> SHARED="YES"
> TM_MAD="ceph_shared"
> TYPE="SYSTEM_DS"
>
> ...
>
> oneadmin@e001n02:~/remotes/tm$ onedatastore show 1
> DATASTORE 1 INFORMATION
> ID        : 1
> NAME      : default
> USER      : oneadmin
> GROUP     : oneadmin
> CLUSTERS  : 0
> TYPE      : IMAGE
> DS_MAD    : ceph
> TM_MAD    : ceph_shared
> BASE PATH : /var/lib/one//datastores/1
> DISK_TYPE : RBD
> STATE     : READY
>
> ...
>
> DATASTORE TEMPLATE
> ALLOW_ORPHANS="YES"
> BRIDGE_LIST="e001n01 e001n02 e001n03"
> CEPH_HOST="e001n01 e001n02 e001n03"
> CEPH_SECRET="secret_uuid"
> CEPH_USER="libvirt"
> CLONE_TARGET="SELF"
> DISK_TYPE="RBD"
> DRIVER="raw"
> DS_MAD="ceph"
> LN_TARGET="NONE"
> POOL_NAME="rbd-ssd"
> SAFE_DIRS="/mnt /var/lib/one/datastores/tmp"
> STAGING_DIR="/var/lib/one/datastores/tmp"
> TM_MAD="ceph_shared"
> TYPE="IMAGE_DS"
>
> IMAGES
> ...
>
> Leadership transfers without any issues as well.
>
> BR
>
> 2018-06-26 13:17 GMT+05:00 Rahul S <saple.rahul.eightyth...@gmail.com>:
>
>> Hi! In my organisation we are using OpenNebula as our cloud platform.
>> Currently we are testing the High Availability (HA) feature with a Ceph
>> cluster as our storage backend. In our test setup we have 3 systems,
>> with front-end HA already successfully set up and configured with a
>> floating IP between them. We have our Ceph cluster (3 OSDs and 3 mons)
>> on these very 3 machines. However, when we try to deploy the Ceph
>> cluster, we get a successful quorum, but with the following issues on
>> the OpenNebula 'LEADER' node:
>>
>> 1) The mon daemon successfully starts, but takes up the floating IP
>> rather than the actual IP.
>>
>> 2) The osd daemon, on the other hand, goes down after a while with the
>> following error:
>>
>> log_channel(cluster) log [ERR] : map e29 had wrong cluster addr
>> (192.x.x.20:6801/10821 != my 192.x.x.245:6801/10821)
>>
>> 192.x.x.20 being the floating IP and 192.x.x.245 the actual IP.
>>
>> Apart from that, we are getting a HEALTH_WARN status when running
>> ceph -s, with many PGs in a degraded, unclean, undersized state.
>>
>> Also, if it matters, we have our OSDs on a separate partition rather
>> than a whole disk.
>>
>> We only need to get the cluster into a healthy state in our minimalistic
>> setup. Any idea on how to get past this?
>> Thanks and Regards,
>> Rahul S

> --
>
> Best regards,
> Дробышевский Владимир
> Компания "АйТи Город"
> +7 343 2222192
>
> IT consulting
> Turnkey project delivery
> IT services outsourcing
> IT infrastructure outsourcing
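P.S. On the wrong-cluster-addr symptom quoted above: since the floating IP sits inside the Ceph public network, a daemon can end up binding to it. One way to avoid that (a hedged suggestion using standard Ceph options, shown here with the addressing from the example setup above; adjust the daemon IDs and IPs to your own hosts) is to pin each daemon to its fixed address in ceph.conf:

    [mon.e001n01]
    # Pin the monitor to the host's fixed address, not the VIP
    mon addr = 10.103.0.1:6789

    [osd.0]
    # Pin the OSD's public and cluster addresses to the fixed IPs
    public addr  = 10.103.0.1
    cluster addr = 10.104.0.1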
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com