Rahul, if you are using whole drives for OSDs, then ceph-deploy is a good option in most cases.
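[Editor's note] For reference, a sketch of the whole-drive flow with ceph-deploy. Hostnames and device names are placeholders, and the exact subcommands depend on the ceph-deploy version in use (the older ceph-disk based releases split the work into prepare/activate, which matches the manual activation Rahul mentions below):

```
# ceph-deploy >= 2.0 (ceph-volume based): create an OSD from a whole drive
ceph-deploy osd create --data /dev/sdb node1

# older ceph-deploy (ceph-disk based): prepare the drive, then activate
# the resulting data partition
ceph-deploy osd prepare node1:/dev/sdb
ceph-deploy osd activate node1:/dev/sdb1
```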
2018-06-28 18:12 GMT+05:00 Rahul S <saple.rahul.eightyth...@gmail.com>:

> Hi Vlad,
>
> I have not thoroughly tested my setup, but so far things look good. The
> only problem is that I have to manually activate the OSDs using the
> ceph-deploy command; manually mounting the OSD partition doesn't work.
>
> Thanks for replying.
>
> Regards,
> Rahul S
>
> On 27 June 2018 at 14:15, Дробышевский, Владимир <v...@itgorod.ru> wrote:
>
>> Hello, Rahul!
>>
>> Do you hit this problem during the initial cluster creation, or on every
>> reboot/leadership transfer? If the former, try removing the floating IP
>> while creating the mons, and temporarily transfer the leadership away
>> from the server you are going to create the OSD on.
>>
>> We are using the same configuration without any issues (though with a
>> few more servers), but our Ceph cluster was created before the
>> OpenNebula setup.
>>
>> We have a number of physical/virtual interfaces on top of IPoIB _and_ an
>> ethernet network (with bonding).
>>
>> So there are 3 interfaces for the internal communications:
>>
>> ib0.8003 - 10.103.0.0/16 - Ceph public network and the OpenNebula raft
>>            virtual IP
>> ib0.8004 - 10.104.0.0/16 - Ceph cluster network
>> br0 (on top of the ethernet bonding interface) - 10.101.0.0/16 -
>>            physical "management" network
>>
>> We also have a number of other virtual interfaces for per-tenant
>> intra-VM networks (VXLAN on top of IP) and so on.
>> In /etc/hosts we have only the "fixed" IPs from the 10.103.0.0/16
>> network, like:
>>
>> 10.103.0.1 e001n01.dc1.xxxxxxxx.xx e001n01
>>
>> /etc/one/oned.conf:
>>
>> # Executed when a server transits from follower->leader
>> RAFT_LEADER_HOOK = [
>>     COMMAND   = "raft/vip.sh",
>>     ARGUMENTS = "leader ib0.8003 10.103.255.254/16"
>> ]
>>
>> # Executed when a server transits from leader->follower
>> RAFT_FOLLOWER_HOOK = [
>>     COMMAND   = "raft/vip.sh",
>>     ARGUMENTS = "follower ib0.8003 10.103.255.254/16"
>> ]
>>
>> /etc/ceph/ceph.conf:
>>
>> [global]
>> public_network  = 10.103.0.0/16
>> cluster_network = 10.104.0.0/16
>>
>> mon_initial_members = e001n01, e001n02, e001n03
>> mon_host = 10.103.0.1,10.103.0.2,10.103.0.3
>>
>> The cluster and mons were created with ceph-deploy; each OSD was added
>> via a modified ceph-disk.py (as we have only 3 drive slots per server,
>> we had to co-locate the system partition with the OSD partition on our
>> SSDs), one host/drive at a time:
>>
>> admin@<host>:~$ sudo ./ceph-disk-mod.py -v prepare --dmcrypt \
>>     --dmcrypt-key-dir /etc/ceph/dmcrypt-keys --bluestore --cluster ceph \
>>     --fs-type xfs -- /dev/sda
>>
>> And the current state on the leader:
>>
>> oneadmin@e001n02:~/remotes/tm$ onezone show 0
>> ZONE 0 INFORMATION
>> ID   : 0
>> NAME : OpenNebula
>>
>> ZONE SERVERS
>> ID NAME     ENDPOINT
>>  0 e001n01  http://10.103.0.1:2633/RPC2
>>  1 e001n02  http://10.103.0.2:2633/RPC2
>>  2 e001n03  http://10.103.0.3:2633/RPC2
>>
>> HA & FEDERATION SYNC STATUS
>> ID NAME     STATE     TERM  INDEX     COMMIT    VOTE  FED_INDEX
>>  0 e001n01  follower  1571  68250418  68250417  1     -1
>>  1 e001n02  leader    1571  68250418  68250418  1     -1
>>  2 e001n03  follower  1571  68250418  68250417  -1    -1
>> ...
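[Editor's note] The raft/vip.sh hook wired into oned.conf above ships with OpenNebula; the script below is only a minimal sketch of its leader/follower logic, not the actual distribution script. The helper builds the ip(8) command rather than running it (the real hook would execute it with root privileges), so the logic is easy to inspect:

```shell
#!/bin/sh
# Sketch of what a raft VIP hook does: add the virtual IP on promotion
# to leader, remove it on demotion to follower.
# vip_cmd is a hypothetical helper that only *builds* the ip(8) command.

vip_cmd() {
    action="$1"; iface="$2"; vip="$3"
    case "$action" in
        leader)   echo "ip address add $vip dev $iface" ;;  # take the VIP
        follower) echo "ip address del $vip dev $iface" ;;  # release it
        *)        echo "usage: vip_cmd {leader|follower} iface ip/prefix" >&2
                  return 1 ;;
    esac
}

# Matching the ARGUMENTS from oned.conf above:
vip_cmd leader   ib0.8003 10.103.255.254/16   # on follower->leader
vip_cmd follower ib0.8003 10.103.255.254/16   # on leader->follower
```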
>> admin@e001n02:~$ ip addr show ib0.8003
>> 9: ib0.8003@ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc mq state UP group default qlen 256
>>     link/infiniband a0:00:03:00:fe:80:00:00:00:00:00:00:00:1e:67:03:00:47:c1:1b brd 00:ff:ff:ff:ff:12:40:1b:80:03:00:00:00:00:00:00:ff:ff:ff:ff
>>     inet 10.103.0.2/16 brd 10.103.255.255 scope global ib0.8003
>>        valid_lft forever preferred_lft forever
>>     inet 10.103.255.254/16 scope global secondary ib0.8003
>>        valid_lft forever preferred_lft forever
>>     inet6 fe80::21e:6703:47:c11b/64 scope link
>>        valid_lft forever preferred_lft forever
>>
>> admin@e001n02:~$ sudo netstat -anp | grep mon
>> tcp  0  0  10.103.0.2:6789   0.0.0.0:*         LISTEN       168752/ceph-mon
>> tcp  0  0  10.103.0.2:6789   10.103.0.2:44270  ESTABLISHED  168752/ceph-mon
>> ...
>>
>> admin@e001n02:~$ sudo netstat -anp | grep osd
>> tcp  0  0  10.104.0.2:6800   0.0.0.0:*         LISTEN       6736/ceph-osd
>> tcp  0  0  10.104.0.2:6801   0.0.0.0:*         LISTEN       6736/ceph-osd
>> tcp  0  0  10.103.0.2:6801   0.0.0.0:*         LISTEN       6736/ceph-osd
>> tcp  0  0  10.103.0.2:6802   0.0.0.0:*         LISTEN       6736/ceph-osd
>> tcp  0  0  10.104.0.2:6801   10.104.0.6:42868  ESTABLISHED  6736/ceph-osd
>> tcp  0  0  10.104.0.2:51788  10.104.0.1:6800   ESTABLISHED  6736/ceph-osd
>> ...
>>
>> admin@e001n02:~$ sudo ceph -s
>>   cluster:
>>     id:     <uuid>
>>     health: HEALTH_OK
>>
>> oneadmin@e001n02:~/remotes/tm$ onedatastore show 0
>> DATASTORE 0 INFORMATION
>> ID        : 0
>> NAME      : system
>> USER      : oneadmin
>> GROUP     : oneadmin
>> CLUSTERS  : 0
>> TYPE      : SYSTEM
>> DS_MAD    : -
>> TM_MAD    : ceph_shared
>> BASE PATH : /var/lib/one//datastores/0
>> DISK_TYPE : RBD
>> STATE     : READY
>>
>> ...
>> DATASTORE TEMPLATE
>> ALLOW_ORPHANS="YES"
>> BRIDGE_LIST="e001n01 e001n02 e001n03"
>> CEPH_HOST="e001n01 e001n02 e001n03"
>> CEPH_SECRET="secret_uuid"
>> CEPH_USER="libvirt"
>> DEFAULT_DEVICE_PREFIX="sd"
>> DISK_TYPE="RBD"
>> DS_MIGRATE="NO"
>> POOL_NAME="rbd-ssd"
>> RESTRICTED_DIRS="/"
>> SAFE_DIRS="/mnt"
>> SHARED="YES"
>> TM_MAD="ceph_shared"
>> TYPE="SYSTEM_DS"
>>
>> ...
>>
>> oneadmin@e001n02:~/remotes/tm$ onedatastore show 1
>> DATASTORE 1 INFORMATION
>> ID        : 1
>> NAME      : default
>> USER      : oneadmin
>> GROUP     : oneadmin
>> CLUSTERS  : 0
>> TYPE      : IMAGE
>> DS_MAD    : ceph
>> TM_MAD    : ceph_shared
>> BASE PATH : /var/lib/one//datastores/1
>> DISK_TYPE : RBD
>> STATE     : READY
>>
>> ...
>>
>> DATASTORE TEMPLATE
>> ALLOW_ORPHANS="YES"
>> BRIDGE_LIST="e001n01 e001n02 e001n03"
>> CEPH_HOST="e001n01 e001n02 e001n03"
>> CEPH_SECRET="secret_uuid"
>> CEPH_USER="libvirt"
>> CLONE_TARGET="SELF"
>> DISK_TYPE="RBD"
>> DRIVER="raw"
>> DS_MAD="ceph"
>> LN_TARGET="NONE"
>> POOL_NAME="rbd-ssd"
>> SAFE_DIRS="/mnt /var/lib/one/datastores/tmp"
>> STAGING_DIR="/var/lib/one/datastores/tmp"
>> TM_MAD="ceph_shared"
>> TYPE="IMAGE_DS"
>>
>> IMAGES
>> ...
>>
>> Leadership transfers work without any issues as well.
>>
>> BR
>>
>> 2018-06-26 13:17 GMT+05:00 Rahul S <saple.rahul.eightyth...@gmail.com>:
>>
>>> Hi! In my organisation we are using OpenNebula as our cloud platform.
>>> Currently we are testing the High Availability (HA) feature with a Ceph
>>> cluster as our storage backend. In our test setup we have 3 systems
>>> with front-end HA already successfully set up and configured, with a
>>> floating IP between them. We have our Ceph cluster (3 OSDs and 3 mons)
>>> on these very same 3 machines. However, when we try to deploy the Ceph
>>> cluster, we get a successful quorum, but with the following issues on
>>> the OpenNebula 'LEADER' node:
>>>
>>> 1) The mon daemon successfully starts, but takes up the floating IP
>>> rather than the actual IP.
>>>
>>> 2) The OSD daemon, on the other hand, goes down after a while with the
>>> following error:
>>>
>>> log_channel(cluster) log [ERR] : map e29 had wrong cluster addr
>>> (192.x.x.20:6801/10821 != my 192.x.x.245:6801/10821)
>>>
>>> 192.x.x.20 being the floating IP
>>> 192.x.x.245 being the actual IP
>>>
>>> Apart from that, we get a HEALTH_WARN status when running ceph -s, with
>>> many PGs in a degraded, unclean, undersized state.
>>>
>>> Also, if it matters, we have our OSDs on a separate partition rather
>>> than a whole disk.
>>>
>>> We only need to get the cluster into a healthy state in our
>>> minimalistic setup. Any idea on how to get past this?
>>>
>>> Thanks and Regards,
>>> Rahul S

--
Best regards,
Vladimir Drobyshevskiy
"АйТи Город" (IT Gorod) company
+7 343 2222192

IT consulting
Turnkey project delivery
IT services outsourcing
IT infrastructure outsourcing
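[Editor's note] For readers who hit the same "wrong cluster addr" error: a commonly suggested workaround, not confirmed anywhere in this thread, is to pin each daemon to the host's fixed IP in ceph.conf so the mon and OSD never bind to the floating address. The section names and addresses below are placeholders following Rahul's 192.x.x.245 example:

```ini
# Hypothetical /etc/ceph/ceph.conf fragment: pin daemons to the fixed IP
# so they do not pick up the floating (raft) address.
[mon.hostname]
public addr = 192.x.x.245

[osd.0]
public addr = 192.x.x.245
cluster addr = 192.x.x.245
```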
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com