Ronny, talking about reboots, has anyone had experience with live kernel patching on a Ceph cluster? I am asking out of simple curiosity.
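(On Ubuntu, live kernel patching usually means Canonical Livepatch. A minimal sketch of what that looks like, assuming an Ubuntu LTS host with snapd and a Livepatch token from an Ubuntu One account; <TOKEN> below is a placeholder. Note that this only patches the running kernel, which matters mostly for hosts using the kernel RBD/CephFS clients; the Ceph daemons themselves still need the usual rolling restarts after package upgrades.)

    sudo snap install canonical-livepatch      # install the livepatch client
    sudo canonical-livepatch enable <TOKEN>    # <TOKEN>: your Livepatch key (placeholder)
    canonical-livepatch status --verbose       # show which kernel fixes are applied live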
On 25 April 2018 at 19:40, Ronny Aasen <ronny+ceph-us...@aasen.cx> wrote:

> The difference in cost between 2 and 3 servers is not HUGE, but the reliability difference between a size 2/1 pool and a 3/2 pool is massive. A 2/1 pool is just a single fault during maintenance away from data loss, whereas you need multiple simultaneous faults, and very bad luck, to break a 3/2 pool.
>
> I would rather recommend using 2/2 pools if you are willing to accept a little downtime when a disk dies. Cluster I/O would stop until the remaining disks backfill to cover for the lost disk, but that is better than having inconsistent PGs or data loss because a disk (or two) crashed during a routine reboot.
>
> Also worth reading is this link, a good explanation: https://www.spinics.net/lists/ceph-users/msg32895.html
>
> Of course, if you have good backups and are willing to restore the whole pool, it is your privilege to run 2/1 pools; just be mindful of the risks of doing so.
>
> Kind regards
> Ronny Aasen
>
> BTW: I did not know Ubuntu automagically rebooted after an upgrade. You can probably avoid that reboot somehow in Ubuntu and do the restarts of the services manually, if you wish to maintain service during the upgrade.
>
> On 25.04.2018 11:52, Ranjan Ghosh wrote:
>
>> Thanks a lot for your detailed answer. The problem for us, however, was that we use the Ceph packages that come with the Ubuntu distribution. If you do an Ubuntu upgrade, all packages are upgraded in one go and the server is rebooted. You cannot influence anything or start/stop services one by one etc. This was concerning me, because the upgrade instructions didn't mention anything about an alternative or what to do in this case. But someone here enlightened me that, in general, it all doesn't matter that much *if you are just accepting a downtime*. And, indeed, it all worked nicely. We stopped all services on all servers, upgraded the Ubuntu version, rebooted all servers and were ready to go again. Didn't encounter any problems there. The only problem turned out to be our own fault and simply a firewall misconfiguration.
>>
>> And, yes, we're running a "size:2 min_size:1" setup because we're on a very tight budget. If I understand correctly, this means: writes go to one server and are *eventually* copied to the other server. I hope this *eventually* means after a few minutes. Up until now I've never experienced *any* problems with file integrity with this configuration. In fact, Ceph is incredibly stable. Amazing. I have never ever had any issues whatsoever with broken files, partially written files, files that contain garbage etc., even after starting/stopping services, rebooting and so on. With GlusterFS and other cluster file systems I've experienced many such problems over the years, so this is what makes Ceph so great. I now have a lot of trust in Ceph, that it will eventually repair everything :-) And if a file that was written a few seconds ago is really lost, it wouldn't be that bad for our use case. It's a web server. The most important stuff is in the DB. We have hourly backups of everything. In a huge emergency, we could even restore the backup from an hour ago if we really had to. Not nice, but if it happens every 6 years or so due to some freak hardware failure, I think it is manageable.
>> I accept it's not the recommended/perfect solution if you have infinite amounts of money at your hands, but in our case I think it's not extremely audacious either to do it like this, right?
>>
>> On 11.04.2018 at 19:25, Ronny Aasen wrote:
>>
>>> Ceph upgrades are usually not a problem: Ceph has to be upgraded in the right order. Normally, when each service is on its own machine, this is not difficult, but when you have mon, mgr, osd, mds and clients on the same host you have to do it a bit carefully.
>>>
>>> I tend to have a terminal open with "watch ceph -s" running, and I never move on to another service until the health is OK again.
>>>
>>> First, apt upgrade the packages on all the hosts. This only updates the software on disk, not the running services. Then restart the services in the right order, and only on one host at a time:
>>>
>>> mons: first restart the mon service on all mon-running hosts. All 3 mons are active at the same time, so there is no "shifting around", but make sure the quorum is OK again before you do the next mon.
>>>
>>> mgr: then restart mgr on all hosts that run a mgr. There is only one active mgr at a time, so here there will be a bit of shifting around, but it is only used for statistics/management, so it may affect your "ceph -s" command but not cluster operation.
>>>
>>> osd: restart osd processes one OSD at a time, and make sure the cluster is HEALTH_OK before doing the next osd process. Do this for all hosts that have OSDs.
>>>
>>> mds: restart the MDS daemons one at a time. You will notice the standby mds taking over for the mds that was restarted. Do both.
>>>
>>> clients: restart the clients; that means remounting filesystems, migrating or restarting VMs, or restarting whatever process uses the old Ceph libraries.
>>>
>>> About pools: since you only have 2 OSDs, you obviously cannot be running the recommended 3-replica pools? This makes me worry that you may be running size:2 min_size:1 pools, and are daily running the risk of data loss due to corruption and inconsistencies, especially when you restart OSDs.
>>>
>>> If your pools are size:2 min_size:2, then I/O on your cluster will stop whenever an OSD is restarted, until that OSD is up and healthy again, but you have less chance of data loss than with 2/1 pools.
>>>
>>> If you added an OSD on a third host, you could run size:3 min_size:2, the recommended config, where you get both redundancy and high availability.
>>>
>>> Kind regards
>>> Ronny Aasen
>>>
>>> On 11.04.2018 17:42, Ranjan Ghosh wrote:
>>>
>>>> Ah, never mind, we've solved it. It was a firewall issue. The only thing that's weird is that it became an issue immediately after an update. Perhaps it has something to do with monitor nodes shifting around. Well, thanks again for your quick support, it's much appreciated.
>>>>
>>>> BR
>>>>
>>>> Ranjan
>>>>
>>>> On 11.04.2018 at 17:07, Ranjan Ghosh wrote:
>>>>
>>>>> Thank you for your answer. Do you have any specifics on which thread you're talking about? I would be very interested to read about a success story, because I fear that if I update the other node the whole cluster comes down.
>>>>>
>>>>> On 11.04.2018 at 10:47, Marc Roos wrote:
>>>>>
>>>>>> I think you have to update all OSDs, mons etc. I can remember running into a similar issue. You should be able to find more about this in the mailing list archive.
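In practice, the restart sequence Ronny describes above boils down to a handful of systemd and status commands per host. A rough sketch only, run one host at a time and waiting for HEALTH_OK in between; the ceph-<type>@<id> unit names and the pool name "cephfs_data" are assumptions here, so substitute the instance names shown by "systemctl list-units 'ceph-*'" and the pools shown by "ceph osd pool ls":

    # 1. mons, one host at a time; wait until "ceph -s" shows all mons back in quorum
    systemctl restart ceph-mon@$(hostname -s)
    ceph -s

    # 2. mgrs (the active mgr fails over to a standby while it restarts)
    systemctl restart ceph-mgr@$(hostname -s)

    # 3. OSDs, one daemon at a time; wait for HEALTH_OK before the next one
    systemctl restart ceph-osd@0
    ceph -s

    # 4. MDS daemons, one at a time; the standby takes over
    systemctl restart ceph-mds@$(hostname -s)

    # Replication is set per pool; per the size:3 min_size:2 recommendation above:
    ceph osd pool set cephfs_data size 3
    ceph osd pool set cephfs_data min_size 2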
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Ranjan Ghosh [mailto:gh...@pw6.de]
>>>>>> Sent: Wednesday, 11 April 2018 16:02
>>>>>> To: ceph-users
>>>>>> Subject: [ceph-users] Cluster degraded after Ceph Upgrade 12.2.1 => 12.2.2
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> We have a two-node cluster (with a third "monitoring-only" node). Over the last months, everything ran *perfectly* smoothly. Today, I did an Ubuntu "apt-get upgrade" on one of the two servers. Among others, the Ceph packages were upgraded from 12.2.1 to 12.2.2. A minor release update, one might think. But, to my surprise, after restarting the services, Ceph is now in a degraded state :-( (see below). Only the first node, which is still on 12.2.1, seems to be running. I did a bit of research and found this:
>>>>>>
>>>>>> https://ceph.com/community/new-luminous-pg-overdose-protection/
>>>>>>
>>>>>> I did set "mon_max_pg_per_osd = 300", to no avail. I don't know if this is the problem at all.
>>>>>>
>>>>>> Looking at the status, it seems we have 264 PGs, right? When I enter "ceph osd df" (which I found on another website claiming it should print the number of PGs per OSD), it just hangs (I need to abort it with Ctrl+C).
>>>>>>
>>>>>> I hope somebody can help me. The cluster now works with the single node, but it is definitely quite worrying because we don't have redundancy.
>>>>>>
>>>>>> Thanks in advance,
>>>>>>
>>>>>> Ranjan
>>>>>>
>>>>>> root@tukan2 /var/www/projects # ceph -s
>>>>>>   cluster:
>>>>>>     id:     19895e72-4a0c-4d5d-ae23-7f631ec8c8e4
>>>>>>     health: HEALTH_WARN
>>>>>>             insufficient standby MDS daemons available
>>>>>>             Reduced data availability: 264 pgs inactive
>>>>>>             Degraded data redundancy: 264 pgs unclean
>>>>>>
>>>>>>   services:
>>>>>>     mon: 3 daemons, quorum tukan1,tukan2,tukan0
>>>>>>     mgr: tukan0(active), standbys: tukan2
>>>>>>     mds: cephfs-1/1/1 up {0=tukan2=up:active}
>>>>>>     osd: 2 osds: 2 up, 2 in
>>>>>>
>>>>>>   data:
>>>>>>     pools:   3 pools, 264 pgs
>>>>>>     objects: 0 objects, 0 bytes
>>>>>>     usage:   0 kB used, 0 kB / 0 kB avail
>>>>>>     pgs:     100.000% pgs unknown
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
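On the PG-overdose-protection setting from the original report above: a minimal sketch of how such an option is usually applied and verified, not a confirmed fix for this particular cluster. The daemon names mon.tukan2 and osd.0 are taken from the "ceph -s" output above and are assumptions about this setup:

    # /etc/ceph/ceph.conf on every node, in [global] so mons and OSDs both see it
    [global]
    mon_max_pg_per_osd = 300

    # after restarting a daemon, confirm it actually picked the value up
    # (run on the host where that daemon lives, via its admin socket):
    ceph daemon mon.tukan2 config get mon_max_pg_per_osd
    ceph daemon osd.0 config get mon_max_pg_per_osd

    # or inject at runtime; some options only take full effect after a restart:
    ceph tell mon.* injectargs '--mon_max_pg_per_osd=300'
    ceph tell osd.* injectargs '--mon_max_pg_per_osd=300'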
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com