Re: [ceph-users] Long OSD restart after upgrade to 10.2.9
Hi, Anton. You need to run the OSD with debug_ms = 1/1 and debug_osd = 20/20 for detailed information.

2017-07-17 8:26 GMT+03:00 Anton Dmitriev:
> Hi, all!
>
> After upgrading from 10.2.7 to 10.2.9 I see that restarting OSDs by
> 'restart ceph-osd id=N' or 'restart ceph-osd-all' takes about 10 minutes
> to get an OSD from DOWN to UP. The same situation on all 208 OSDs on 7
> servers.
>
> Also, OSDs start very slowly after rebooting the servers.
>
> Before the upgrade it took no more than 2 minutes.
>
> Does anyone have the same situation as mine?
>
> [full OSD startup log trimmed; it is quoted in the original message below]
>
> --
> Dmitriev Anton

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
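[Editor's note: the debug levels suggested above can be set in ceph.conf or injected into a running daemon. A minimal sketch, assuming the slow daemon is osd.0:]

```ini
# ceph.conf fragment: verbose messenger and OSD logging for diagnosis
# (restart the OSD for this to take effect)
[osd]
debug ms = 1/1
debug osd = 20/20
```

The runtime equivalent is `ceph tell osd.0 injectargs '--debug_ms 1/1 --debug_osd 20/20'`. Remember to lower the levels again afterwards; debug_osd at 20 is extremely chatty and will inflate the log volume considerably.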
[ceph-users] Long OSD restart after upgrade to 10.2.9
Hi, all!

After upgrading from 10.2.7 to 10.2.9 I see that restarting OSDs by 'restart ceph-osd id=N' or 'restart ceph-osd-all' takes about 10 minutes to get an OSD from DOWN to UP. The same situation on all 208 OSDs on 7 servers.

Also, OSDs start very slowly after rebooting the servers.

Before the upgrade it took no more than 2 minutes.

Does anyone have the same situation as mine?

2017-07-17 08:07:26.895600 7fac2d656840 0 set uid:gid to 4402:4402 (ceph:ceph)
2017-07-17 08:07:26.895615 7fac2d656840 0 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0), process ceph-osd, pid 197542
2017-07-17 08:07:26.897018 7fac2d656840 0 pidfile_write: ignore empty --pid-file
2017-07-17 08:07:26.906489 7fac2d656840 0 filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2017-07-17 08:07:26.917074 7fac2d656840 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-07-17 08:07:26.917092 7fac2d656840 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2017-07-17 08:07:26.917112 7fac2d656840 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: splice is supported
2017-07-17 08:07:27.037031 7fac2d656840 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2017-07-17 08:07:27.037154 7fac2d656840 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize is disabled by conf
2017-07-17 08:15:17.839072 7fac2d656840 0 filestore(/var/lib/ceph/osd/ceph-0) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2017-07-17 08:15:20.150446 7fac2d656840 0 cls/hello/cls_hello.cc:305: loading cls_hello
2017-07-17 08:15:20.152483 7fac2d656840 0 cls/cephfs/cls_cephfs.cc:202: loading cephfs_size_scan
2017-07-17 08:15:20.210428 7fac2d656840 0 osd.0 224167 crush map has features 2200130813952, adjusting msgr requires for clients
2017-07-17 08:15:20.210443 7fac2d656840 0 osd.0 224167 crush map has features 2200130813952 was 8705, adjusting msgr requires for mons
2017-07-17 08:15:20.210448 7fac2d656840 0 osd.0 224167 crush map has features 2200130813952, adjusting msgr requires for osds
2017-07-17 08:15:58.902173 7fac2d656840 0 osd.0 224167 load_pgs
2017-07-17 08:16:19.083406 7fac2d656840 0 osd.0 224167 load_pgs opened 242 pgs
2017-07-17 08:16:19.083969 7fac2d656840 0 osd.0 224167 using 0 op queue with priority op cut off at 64.
2017-07-17 08:16:19.109547 7fac2d656840 -1 osd.0 224167 log_to_monitors {default=true}
2017-07-17 08:16:19.522448 7fac2d656840 0 osd.0 224167 done with init, starting boot process

--
Dmitriev Anton
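[Editor's note: the timestamps in the log above already show where the ~10 minutes go, with a long stall between the filestore mount at 08:07:27 and the journal-mode line at 08:15:17. A small hedged helper can surface such gaps automatically; it assumes the stock log format shown above ("YYYY-MM-DD HH:MM:SS.micro ...") and does not handle logs that cross midnight:]

```shell
# find_gaps: report gaps of 30 s or more between consecutive ceph-osd log
# lines, so the slow startup phase stands out. Reads the log on stdin.
find_gaps() {
    awk '{
        split($2, t, /[:.]/)                  # HH : MM : SS . microseconds
        s = t[1] * 3600 + t[2] * 60 + t[3]    # seconds since midnight
        if (NR > 1 && s - prev >= 30)
            printf "%d s gap before: %s\n", s - prev, $0
        prev = s
    }'
}
# Usage (path is an assumption; adjust to your cluster):
#   find_gaps < /var/log/ceph/ceph-osd.0.log
```

Run against the log above, this would flag the mount stall and the load_pgs phase as the two big contributors.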
Re: [ceph-users] Re: How's cephfs going?
I work at Monash University. We are using active-standby MDS. We don't yet have it in full production, as we need some of the newer Luminous features before we can roll it out more broadly; however, we are moving towards letting a subset of users on (just slowly ticking off related work like putting an external backup system in place, writing some janitor scripts to check quota enforcement, and so on). Our HPC folks are quite keen for more, as it has proved very useful for shunting a bit of data around between disparate systems.

We're also testing NFS and CIFS gateways. After some initial issues with the CTDB setup, that part seems to be working, but we are now hitting some interesting xattr ACL behaviours (e.g. an ACL'd dir writable through one gateway node but not another); other colleagues are looking at that and poking Red Hat for assistance...

On 17 July 2017 at 13:27, 许雪寒 wrote:
> Hi, thanks for the quick reply :-)
>
> May I ask which company you are at? I'm asking this because we are collecting
> CephFS usage information as the basis of our judgement about whether to use
> CephFS. Also, how are you using it? Are you using a single MDS, the
> so-called active-standby mode? And could you give some information on your
> CephFS usage pattern? For example, do your client nodes mount CephFS
> directly, or through NFS or something like it re-exporting a directory that
> is mounted with CephFS? And are you using ceph-fuse?
>
> [earlier messages in the thread trimmed; they are quoted in full below]

--
Cheers,
~Blairo
[ceph-users] Re: How's cephfs going?
Hi, thanks for the quick reply :-)

May I ask which company you are at? I'm asking this because we are collecting CephFS usage information as the basis of our judgement about whether to use CephFS. Also, how are you using it? Are you using a single MDS, the so-called active-standby mode? And could you give some information on your CephFS usage pattern? For example, do your client nodes mount CephFS directly, or through NFS or something like it re-exporting a directory that is mounted with CephFS? And are you using ceph-fuse?

----- Original message -----
From: Blair Bethwaite [mailto:blair.bethwa...@gmail.com]
Sent: 17 July 2017 11:14
To: 许雪寒
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] How's cephfs going?

[Blair's reply and the original question trimmed; both are quoted in full below]
Re: [ceph-users] How's cephfs going?
It works and can reasonably be called "production ready". However, in Jewel there are still some features (e.g. directory sharding, multiple active MDS, and some security constraints) that may limit widespread usage. Also note that userspace client support in e.g. nfs-ganesha and Samba is a mixed bag across distros, and you may find yourself having to resort to re-exporting ceph-fuse or kernel mounts in order to provide those gateway services. We haven't tried Luminous CephFS yet, as we are still waiting for the first full (non-RC) release to drop, but things seem very positive there...

On 17 July 2017 at 12:59, 许雪寒 wrote:
> Hi, everyone.
>
> We intend to use the Jewel version of CephFS; however, we don't know its status.
> Is it production ready in Jewel? Does it still have lots of bugs? Is it a
> major effort of the current Ceph development? And who is using CephFS now?

--
Cheers,
~Blairo
[ceph-users] How's cephfs going?
Hi, everyone.

We intend to use the Jewel version of CephFS; however, we don't know its status. Is it production ready in Jewel? Does it still have lots of bugs? Is it a major effort of the current Ceph development? And who is using CephFS now?
[ceph-users] Systemd dependency cycle in Luminous
Hi all,

I recently upgraded two separate Ceph clusters from Jewel to Luminous (OS is Ubuntu xenial). Everything went smoothly, except that on one of the monitors in each cluster I had a problem shutting down/starting up. It seems the systemd dependencies are messed up. I get:

systemd[1]: ceph-osd.target: Found ordering cycle on ceph-osd.target/start
systemd[1]: ceph-osd.target: Found dependency on ceph-osd@16.service/start
systemd[1]: ceph-osd.target: Found dependency on ceph-mon.target/start
systemd[1]: ceph-osd.target: Found dependency on ceph.target/start
systemd[1]: ceph-osd.target: Found dependency on ceph-osd.target/start

Has anyone seen this? I ignored it the first time it happened (and fixed it by uninstalling, purging and reinstalling Ceph on that one node), but now it has happened while upgrading a completely different cluster, and this one would be quite a pain to uninstall/reinstall Ceph on.

Any ideas?

Thanks
Michael
Re: [ceph-users] Broken Ceph Cluster when adding new one - Proxmox 5.0 & Ceph Luminous
On 16/07/2017 at 17:02, Udo Lembke wrote:
> Hi,
> On 16.07.2017 15:04, Phil Schwarz wrote:
>> ...
>> Same result, the OSD is known by the node, but not by the cluster.
>> ...
> Firewall? Or mismatch in /etc/hosts or DNS??
> Udo

OK,
- no firewall,
- no DNS issue at this point,
- same procedure followed with the last node, except for a full cluster update before adding the new node and new OSD.

The only strange behavior is from the 'pveceph createosd' command, which was shown in the previous mail:

systemd[1]: ceph-disk@dev-sdc1.service: Main process exited, code=exited, status=1/FAILURE
systemd[1]: Failed to start Ceph disk activation: /dev/sdc1.
systemd[1]: ceph-disk@dev-sdc1.service: Unit entered failed state.
systemd[1]: ceph-disk@dev-sdc1.service: Failed with result 'exit-code'

What consequences should I expect when switching /etc/hosts from public IPs to private IPs? (Apart from a time-travel paradox or a bursting black hole...)

Thanks.
Re: [ceph-users] Broken Ceph Cluster when adding new one - Proxmox 5.0 & Ceph Luminous
Hi,

On 16.07.2017 15:04, Phil Schwarz wrote:
> ...
> Same result, the OSD is known by the node, but not by the cluster.
> ...

Firewall? Or mismatch in /etc/hosts or DNS??

Udo
Re: [ceph-users] Broken Ceph Cluster when adding new one - Proxmox 5.0 & Ceph Luminous
On 15/07/2017 at 23:09, Udo Lembke wrote:
> Hi,
> On 15.07.2017 16:01, Phil Schwarz wrote:
>> Hi,
>> ...
>> While investigating, I wondered about my config.
>> Question relative to the /etc/hosts file:
>> should I use private_replication_LAN IPs or public ones?
> private_replication_LAN!! And the pve-cluster should use another
> network (NICs) if possible.
> Udo

OK, thanks Udo.

After investigation, I did:
- set noout on the OSDs
- stopped the CPU-pegging LXC
- checked the cabling
- restarted the whole cluster

Everything went fine!

But when I tried to add a new OSD:

fdisk /dev/sdc      --> deleted the partition table
parted /dev/sdc     --> mklabel msdos (the disk came from a ZFS FreeBSD system)
dd if=/dev/null of=/dev/sdc
ceph-disk zap /dev/sdc
dd if=/dev/zero of=/dev/sdc bs=10M count=1000

And I recreated the OSD via the web GUI. Same result: the OSD is known by the node, but not by the cluster.

The logs seem to show an issue with this Bluestore OSD; have a look at the file. I'm going to try recreating the OSD using Filestore.

Thanks

pvedaemon[3077]: starting task UPID:varys:7E7D:0004F489:596B5FCE:cephcreateosd:sdc:root@pam:
kernel: [ 3267.263313] sdc:
systemd[1]: Created slice system-ceph\x2ddisk.slice.
systemd[1]: Starting Ceph disk activation: /dev/sdc2...
sh[1074]: main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdc2', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=, log_stdout=True, prepend_to_path='/usr/bin', prog='ceph-disk', setgroup=None, setuser=None, statedir='/var/lib/ceph', sync=True,
sh[1074]: command: Running command: /sbin/init --version
sh[1074]: command_check_call: Running command: /bin/chown ceph:ceph /dev/sdc2
sh[1074]: command: Running command: /sbin/blkid -o udev -p /dev/sdc2
sh[1074]: command: Running command: /sbin/blkid -o udev -p /dev/sdc2
sh[1074]: main_trigger: trigger /dev/sdc2 parttype cafecafe-9b03-4f30-b4c6-b4b80ceff106 uuid 7a6d7546-b93a-452b-9bbc-f660f9a8416c
sh[1074]: command: Running command: /usr/sbin/ceph-disk --verbose activate-block /dev/sdc2
systemd[1]: Stopped Ceph disk activation: /dev/sdc2.
systemd[1]: Starting Ceph disk activation: /dev/sdc2...
sh[1074]: main_trigger:
sh[1074]: main_trigger: get_dm_uuid: get_dm_uuid /dev/sdc2 uuid path is /sys/dev/block/8:34/dm/uuid
sh[1074]: command: Running command: /sbin/blkid -o udev -p /dev/sdc2
sh[1074]: command: Running command: /usr/bin/ceph-osd --get-device-fsid /dev/sdc2
sh[1074]: get_space_osd_uuid: Block /dev/sdc2 has OSD UUID ----
sh[1074]: main_activate_space: activate: OSD device not present, not starting, yet
systemd[1]: Stopped Ceph disk activation: /dev/sdc2.
systemd[1]: Starting Ceph disk activation: /dev/sdc2...
sh[1475]: main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdc2', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=, log_stdout=True, prepend_to_path='/usr/bin', prog='ceph-disk', setgroup=None, setuser=None, statedir='/var/lib/ceph', sync=True,
sh[1475]: command: Running command: /sbin/init --version
sh[1475]: command_check_call: Running command: /bin/chown ceph:ceph /dev/sdc2
sh[1475]: command: Running command: /sbin/blkid -o udev -p /dev/sdc2
sh[1475]: command: Running command: /sbin/blkid -o udev -p /dev/sdc2
sh[1475]: main_trigger: trigger /dev/sdc2 parttype cafecafe-9b03-4f30-b4c6-b4b80ceff664 uuid 7a6d7546-b93a-452b-9bbc-f660f9a84664
sh[1475]: command: Running command: /usr/sbin/ceph-disk --verbose activate-block /dev/sdc2
kernel: [ 3291.171474] sdc: sdc1 sdc2
sh[1475]: main_trigger:
sh[1475]: main_trigger: get_dm_uuid: get_dm_uuid /dev/sdc2 uuid path is /sys/dev/block/8:34/dm/uuid
sh[1475]: command: Running command: /sbin/blkid -o udev -p /dev/sdc2
sh[1475]: command: Running command: /usr/bin/ceph-osd --get-device-fsid /dev/sdc2
sh[1475]: get_space_osd_uuid: Block /dev/sdc2 has OSD UUID ----
sh[1475]: main_activate_space: activate: OSD device not present, not starting, yet
systemd[1]: Stopped Ceph disk activation: /dev/sdc2.
systemd[1]: Starting Ceph disk activation: /dev/sdc2...
sh[1492]: main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdc2', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=, log_stdout=True, prepend_to_path='/usr/bin', prog='ceph-disk', setgroup=None, setuser=None, statedir='/var/lib/ceph', sync=True,
sh[1492]: command: Running command: /sbin/init --version
sh[1492]: command_check_call: Running command: /bin/chown ceph:ceph /dev/sdc2
sh[1492]: command: Running command: /sbin/blkid -o udev -p /dev/sdc2
sh[1492]: command: Running command: /sbin/blkid -o udev -p /dev/sdc2
sh[1492]: main_trigger: trigger /dev/sdc2 parttype cafecafe-9b03-4f30-b4c6-b4b80ceff664 uuid 7a6d7546-b93a-452b-9bbc-f660f9a84664
sh[1492]: command: Running command: /usr/sbin/ceph-disk --verbose activate-block /dev/sdc2
sh[1492]: main_trigger:
sh[1492]: main_trigger: get_dm_uuid: get_dm_uuid /dev/sdc2 uuid path is /sys/dev/block/8:34/dm/uuid
sh[1492]:
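[Editor's note: for the Filestore retry mentioned above, a dry-run sketch of the ceph-disk sequence. The `--filestore` flag is the Luminous-era ceph-disk syntax and may differ on other versions; /dev/sdc is the disk from this thread. The `run` wrapper only prints each command, so nothing is executed until you redefine it:]

```shell
# Dry-run sketch: re-prepare /dev/sdc as a Filestore OSD instead of Bluestore.
# "run" echoes each command; redefine it as  run() { "$@"; }  to execute.
DEV=/dev/sdc
run() { echo "+ $*"; }

run ceph-disk zap "$DEV"                                  # wipe partition table and data
run ceph-disk prepare --filestore --fs-type xfs "$DEV"    # Filestore, not Bluestore
run ceph-disk activate "${DEV}1"                          # mount data partition, start OSD
```

On Proxmox the equivalent is reportedly `pveceph createosd /dev/sdc -bluestore 0`, but verify against your pveceph version before relying on it.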
Re: [ceph-users] Delete unused RBD volume takes to long.
In Hammer, Ceph does not keep track of which objects exist and which ones don't, so the delete tries to remove every possible object in the RBD. In Jewel, the object map keeps track of which objects exist, and deleting that RBD would be drastically faster. Even so, if you wrote 1PB of data to the RBD, it would still take a couple of days to delete it.

The faster way to delete a large, empty RBD in Hammer would be to find the RBD's object prefix, run rados ls grepping for the prefix, and then use the rados command to delete only the objects that exist for that RBD.

On Sun, Jul 16, 2017, 1:45 AM Alvaro Soto wrote:
> Hi,
> does anyone know why the delete process takes longer than the creation of
> an RBD volume?
>
> My test was this:
>
> - Create a 1PB volume -> less than a minute
> - Delete the volume created -> like 2 days
>
> The result was unexpected by me, and till now I don't know the reason. The
> deletion was initiated exactly at the end of the creation, so the volume
> was never used.
>
> About the environment:
>
> - Ubuntu trusty 64 bits
> - 5-server cluster
> - 24 Intel SSDs per server * 800GB each disk
> - Intel NVMe PCIe for journal
> - Ceph Hammer
> - Replica 3
>
> Hope that someone can tell me why this deletion takes so long.
> Best.
>
> --
> ATTE. Alvaro Soto Escobar
>
> --
> Great people talk about ideas,
> average people talk about things,
> small people talk ... about other people.
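[Editor's note: a sketch of the prefix-based delete described above. The pool and image names are placeholders, it has not been run against a live cluster, and it assumes a Hammer-style `block_name_prefix` line in `rbd info` output; treat it as an outline rather than a recipe:]

```shell
# prefix_delete POOL IMAGE: delete only the objects that actually exist for
# a Hammer RBD image, instead of letting "rbd rm" probe every possible object.
prefix_delete() {
    POOL=$1; IMAGE=$2
    # 1. The image's object-name prefix, e.g. "rb.0.1234.238e1f29" on Hammer.
    PREFIX=$(rbd -p "$POOL" info "$IMAGE" | awk '/block_name_prefix/ {print $2}')
    [ -n "$PREFIX" ] || return 1
    # 2. Remove each object that exists, then the nearly-empty image itself.
    rados -p "$POOL" ls | grep "^$PREFIX" | while read -r OBJ; do
        rados -p "$POOL" rm "$OBJ"
    done
    rbd -p "$POOL" rm "$IMAGE"    # fast now: no data objects left to probe
}
# Usage (placeholder names): prefix_delete rbd bigvol
```

Deletions could also be parallelized by piping the object list through `xargs -P`, at the cost of more cluster load.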