Re: [ceph-users] Proxmox/ceph upgrade and addition of a new node/OSDs
On Fri, Sep 21, 2018 at 09:03:15AM +0200, Hervé Ballans wrote: > Hi MJ (and all), > > So we upgraded our Proxmox/Ceph cluster, and if we have to summarize the > operation in a few words : overall, everything went well :) > The most critical operation of all is the 'osd crush tunables optimal', I > talk about it in more detail after... > > The Proxmox documentation is really well written and accurate and, normally, > following the documentation step by step is almost sufficient ! Glad to hear that everything worked well. > > * first step : upgrade Ceph Jewel to Luminous : > https://pve.proxmox.com/wiki/Ceph_Jewel_to_Luminous > (Note here : OSDs remain in FileStore backend, no BlueStore migration) > > * second step : upgrade Proxmox version 4 to 5 : > https://pve.proxmox.com/wiki/Upgrade_from_4.x_to_5.0 > > Just some numbers, observations and tips (based on our feedback, I'm not an > expert !) : > > * Before migration, make sure you are on the latest version of Proxmox 4 > (4.4-24) and Ceph Jewel (10.2.11) > > * We don't use the pve repository for ceph packages but the official one > (download.ceph.com). Thus, during the upgrade of Proxmox PVE, we don't > replace the ceph.com repository with the proxmox.com Ceph repository... This is not recommended (and for a reason) - our packages are almost identical to the upstream/official ones. But we do include the occasional bug fix much faster than the official packages do, including reverting breakage. Furthermore, when using our repository, you know that the packages went through our own testing to ensure compatibility with our stack (e.g., issues like JSON output changing from one minor release to the next breaking our integration/GUI). Also, this natural delay between upstream releases and availability in our repository has saved our users from lots of "serious bug noticed one day after release" issues since we switched to providing Ceph via our own repositories.
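For reference, the choice discussed above boils down to which APT source the nodes carry. The two entries below are sketches only - verify the exact paths and suite names against the current Proxmox and Ceph documentation before using them:

```text
# official upstream Ceph repository (what the original poster kept):
deb https://download.ceph.com/debian-luminous stretch main

# Proxmox-provided Ceph repository (the recommended alternative):
deb http://download.proxmox.com/debian/ceph-luminous stretch main
```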
> * When you upgrade Ceph to Luminous (without tunables optimal), there is no > impact on Proxmox 4. VMs are still running normally. > The side effect (non blocking for the functioning of VMs) is located in the > GUI, on the Ceph menu : it can't report the status of the ceph cluster as it > has a JSON formatting error (indeed the output of the command 'ceph -s' is > completely different, really more readable on Luminous) Yes, this is to be expected. Backporting all of that just for the short time window of "upgrade in progress" is too much work for too little gain. > > * A little step is missing in section 8 "Create Manager instances" of the > Ceph upgrade documentation. As the Ceph manager daemon is new since > Luminous, the package doesn't exist on Jewel. So you have to install the > ceph-mgr package on each node first before doing 'pveceph createmgr' It actually does not ;) ceph-mgr is pulled in by ceph on upgrades from Jewel to Luminous - unless you manually removed that package at some point. > Otherwise : > - verify that all your VMs have recently been backed up to external storage (in > case of a disaster recovery plan !) Good idea in general :D > - if you can, stop all your non-critical VMs (in order to limit client io > operations) > - if any, wait for the end of current backups then disable datacenter backup > (in order to limit client io operations). !! do not forget to re-enable it > when all is over !! > - if any and if no longer needed, delete your snapshots, it removes many > useless objects ! > - start the tunables operation outside of major activity periods (night, > week-end, ...) and take into account that it can be very slow... Scheduling and carefully planning rebalancing operations is always needed on a production cluster. Note that the upgrade docs state that switching to "tunables optimal" is recommended, but "will cause a massive rebalance".
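The checklist above can be condensed into a rough command sequence. This is a sketch only - the backup and backup-job steps are done via the Proxmox GUI or your own tooling, and it should run only against a healthy cluster during an off-peak window:

```
# confirm the cluster is healthy before starting
ceph status

# trigger the tunables change - this starts a massive data rebalance
ceph osd crush tunables optimal

# follow recovery progress until the cluster reports HEALTH_OK again
watch -n 10 'ceph -s'
```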
> There are probably some options to configure in ceph to avoid 'pgs stuck' > states, but on our side, as we previously moved our critical VM's disks, we > didn't care about that ! > > * Anyway, the upgrade step of Proxmox PVE is done easily and quickly (just > follow the documentation). Note that you can upgrade Proxmox PVE before > doing the 'tunables optimal' operation. > > Hoping that you will find this information useful, good luck with your very > next migration ! Thank you for the detailed report and feedback! ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] [Ceph-maintainers] download.ceph.com repository changes
On Mon, Jul 30, 2018 at 11:36:55AM -0600, Ken Dreyer wrote: > On Fri, Jul 27, 2018 at 1:28 AM, Fabian Grünbichler > wrote: > > On Tue, Jul 24, 2018 at 10:38:43AM -0400, Alfredo Deza wrote: > >> Hi all, > >> > >> After the 12.2.6 release went out, we've been thinking on better ways > >> to remove a version from our repositories to prevent users from > >> upgrading/installing a known bad release. > >> > >> The way our repos are structured today means every single version of > >> the release is included in the repository. That is, for Luminous, > >> every 12.x.x version of the binaries is in the same repo. This is true > >> for both RPM and DEB repositories. > >> > >> However, the DEB repos don't allow pinning to a given version because > >> our tooling (namely reprepro) doesn't construct the repositories in a > >> way that this is allowed. For RPM repos this is fine, and version > >> pinning works. > > > > If you mean that reprepro does not support referencing multiple versions > > of packages in the Packages file, there is a patched fork that does > > (that seems well-supported): > > > > https://github.com/profitbricks/reprepro > > Thanks for this link. That's great to know someone's working on this. > > What's the status of merging that back into the main reprepro code, or > else shipping that fork as the new reprepro package in Debian / > Ubuntu? The Ceph project could end up responsible for maintaining that > reprepro fork if the main Ubuntu community does not pick it up :) The > fork is several years old, and the latest update on > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=570623 was over a > year ago. I don't know anything more than what is publicly available about either merging back to the original reprepro or shipping in Debian/Ubuntu.
We are using our own custom repo software built around lower level tools, I was just aware of the fork for unrelated reasons :)
Re: [ceph-users] [Ceph-maintainers] download.ceph.com repository changes
On Tue, Jul 24, 2018 at 10:38:43AM -0400, Alfredo Deza wrote: > Hi all, > > After the 12.2.6 release went out, we've been thinking on better ways > to remove a version from our repositories to prevent users from > upgrading/installing a known bad release. > > The way our repos are structured today means every single version of > the release is included in the repository. That is, for Luminous, > every 12.x.x version of the binaries is in the same repo. This is true > for both RPM and DEB repositories. > > However, the DEB repos don't allow pinning to a given version because > our tooling (namely reprepro) doesn't construct the repositories in a > way that this is allowed. For RPM repos this is fine, and version > pinning works. If you mean that reprepro does not support referencing multiple versions of packages in the Packages file, there is a patched fork that does (that seems well-supported): https://github.com/profitbricks/reprepro > > To remove a bad version we have two proposals (and would like to hear > ideas on other possibilities), one that would involve symlinks and the > other one which purges the known bad version from our repos. > > *Symlinking* > When releasing we would have a "previous" and "latest" symlink that > would get updated as versions move forward. It would require > separation of versions at the URL level (all versions would no longer > be available in one repo). > > The URL structure would then look like: > > debian/luminous/12.2.3/ > debian/luminous/previous/ (points to 12.2.5) > debian/luminous/latest/ (points to 12.2.7) > > Caveats: the url structure would change from debian-luminous/ to > prevent breakage, and the versions would be split. For RPMs it would > mean a regression if someone is used to pinning, for example pinning > to 12.2.2 wouldn't be possible using the same url.
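For context, the kind of APT version pinning being discussed would look like the hypothetical /etc/apt/preferences.d/ fragment below. It only has an effect if the repository's Packages index still lists that version - which is exactly what stock reprepro does not provide:

```text
Package: ceph*
Pin: version 12.2.5*
Pin-Priority: 1001
```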
> > Pros: Faster release times, less need to move packages around, and > easier to remove a bad version > > > *Single version removal* > Our tooling would need to go and remove the known bad version from the > repository, which would require to rebuild the repository again, so > that the metadata is updated with the difference in the binaries. > > Caveats: time intensive process, almost like cutting a new release > which takes about a day (and sometimes longer). Error prone since the > process wouldn't be the same (one off, just when a version needs to be > removed) I am not involved in this process, but that seems like something is wrong somewhere. You keep all the binary debs on the public mirror, so "retracting" a broken latest one should just consist of: - deleting the .deb files of the broken release - regenerating the Packages*, Content* and *Release* metadata files The former should be quasi-instant, the latter takes a bit (ceph packages are quite big, especially the ones containing debug symbols, and they need to be hashed multiple times), but nowhere near a day. If you keep the "old" metadata files around, both steps should be almost instant: - delete broken .deb files - revert (expensive) metadata files to previous snapshot > Pros: all urls for download.ceph.com and its structure are kept the same. that is quite a big pro, IMHO.
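The two retraction steps described above can be sketched with a toy repository. File names and layout here are made up for illustration - the real ceph.com tooling is custom, and real Debian metadata is produced by tools like reprepro or apt-ftparchive - but the shape of the work is the same:

```shell
# Toy model of retracting a bad release: delete its files from the pool,
# then regenerate the hashed metadata index over what remains.
set -e
repo=$(mktemp -d)
mkdir "$repo/pool"
echo "good"  > "$repo/pool/ceph_12.2.5_amd64.deb"
echo "bad"   > "$repo/pool/ceph_12.2.6_amd64.deb"
echo "fixed" > "$repo/pool/ceph_12.2.7_amd64.deb"

# Step 1: delete the broken release's files (quasi-instant).
rm "$repo/pool/ceph_12.2.6_amd64.deb"

# Step 2: regenerate the metadata index; hashing the (large) packages is
# the expensive part, but nowhere near a day's work.
(cd "$repo/pool" && sha256sum *.deb) > "$repo/Packages.sha256"
cat "$repo/Packages.sha256"
```

Keeping a snapshot of the previous metadata files around would make step 2 a simple file revert instead of a re-hash.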
Re: [ceph-users] Ceph Mimic on Debian 9 Stretch
On Mon, Jun 18, 2018 at 07:15:49PM +, Sage Weil wrote: > On Mon, 18 Jun 2018, Fabian Grünbichler wrote: > > it's of course within your purview as upstream project (lead) to define > > certain platforms/architectures/distros as fully supported, and others > > as best-effort/community-driven/... . there was no clear public > > communication (AFAICT, only the one thread on ceph-maintainers, which is > > rather low visibility) that Debian moves from somewhere in the middle[2] > > to the latter category with Mimic, and has now (at least for the time > > being) effectively joined FreeBSD (which has at least one community > > member pouring in enormous amounts of work) and the various community > > Linux distros like Arch, Gentoo, ... (where I frankly have no idea about > > the status quo). there is also no mention in the docs or the release > > notes about the lack of Debian packages (and the state of Xenial > > packages) for Mimic. all of which gives off more of an "unintended > > consequence" vibe, rather than "conscious decision to drop Debian". > > This is a fair assessment, and it's good to hear that there is some path > forward. > > It looks like the Buster release date is Feb '19 (give or take), which > corresponds to Nautilus, so it should be possible for Debian users to skip > mimic and upgrade directly from luminous to nautilus as long as we build > some buster packages for luminous as well right around the end of its > lifetime (and/or mimic point releases once buster gets closer to stable). > > Is this reasonable? yes. like I already indicated, this is our "Plan B" in case Mimic on Stretch is not feasible. we'll likely skip Mimic entirely (except for some internal testing to catch and fix issues before Nautilus) in that case, and jump straight to Nautilus with Buster and keep Luminous on "life support" for Stretch and upgrading. > https://github.com/ceph/ceph/pull/22602 LGTM.
still would like to see some note about the Xenial toolchain stability issues, but that is more for the sake of others (I am not an Ubuntu user).
Re: [ceph-users] Ceph Mimic on Debian 9 Stretch
On Wed, Jun 13, 2018 at 12:36:50PM +, Sage Weil wrote: > Hi Fabian, thanks for your quick response, and sorry for my delayed one (only having 1.5 usable arms atm). > > On Wed, 13 Jun 2018, Fabian Grünbichler wrote: > > On Mon, Jun 04, 2018 at 06:39:08PM +, Sage Weil wrote: > > > [adding ceph-maintainers] > > > > [and ceph-devel] > > > On Mon, 4 Jun 2018, Charles Alva wrote: > > > > Hi Guys, > > > > > > > > When will the Ceph Mimic packages for Debian Stretch be released? I could > > > > not > > > > find the packages even after changing the sources.list. > > > > > > The problem is that we're now using c++17, which requires a newer gcc > > > than stretch or jessie provide, and Debian does not provide backports of > > > the newer gcc packages. We currently can't build the latest Ceph for > > > those releases. > > > > IMHO this is backwards. if you want to support distro X you should take > > care to not need toolchain features that are not included in distro X. > > Well, I thought we did. When we were making the C++17 decision we > verified that we could do builds on Ubuntu and CentOS using a newer > compiler toolchain. My assumption was that since both of these distros > had backports that pretty much everyone did. Clearly I was wrong. just to make this explicit: I did not mean to imply any malicious intent on your part. I am aware this whole issue is more a question of priorities and capacities than anything else (on all sides, not just yours). I do appreciate all the work you and the rest of the main Ceph developers and the community as a whole has done and continues to do, and we see ourselves very much as part of this community! > > > [...] > > effectively this means the current Xenial builds are about as safe and > > production-ready as doing your own gcc backports for Stretch - i.e., not > > very. > > I missed this nuance as well. just as another point, the install-deps.sh script from ceph.git will upgrade(!)
the following packages on Xenial:
- libatomic1
- libcc1-0
- libcilkrts5
- libgcc1
- libgomp1
- libitm1
- liblsan0
- libquadmath0
- libstdc++6
- libtsan0
- libubsan0
some of them will even be upgraded to versions built from gcc-8. so not only is this backport not very production-grade from a testing and support POV, it is also not self-contained (in contrast to the old Wheezy Mozilla GCC backport, or RedHat's DTS, at least from my understanding?). IMHO both issues should be mentioned in the appropriate places (regular docs for the former, dev docs for the latter?). > > > We'd love to build for stretch, but until there is a newer gcc for that > > > distro it's not possible. We could build packages for 'testing', but I'm > > > not sure if those will be usable on stretch. > > > > saying you'd love to build for a distro, while effectively making sure a > > build according to that distro's release policies is impossible without > > major effort by someone else does strike me as a bit of a hollow > > statement. in the end this is a further nail in the coffin of upstream > > support for the Debian(-based) distros, with only the latest (1.5 months > > old!) Ubuntu LTS being properly supported. > > I think we need to be clear about the use of the term "support" here. I > was careful to say we'd like to *build* for Debian, but I'm not sure what > organizations out there are offering formal *support* for any of the > ceph.com packages (in the sense of providing technical support for bug > escalations or any guarantees around stability etc). This incident is > perhaps an indication that those organizations should become more involved > in the upstream development and decision-making process. I am aware that there is no formal support (in the sense of commercial agreements, etc.) for the packages on download.ceph.com, and the fact that most of the testing the packages for Debian get is just a side-effect of you testing Ubuntu Xenial and now Bionic.
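To check on a given node whether these toolchain libraries were in fact upgraded beyond the distro's own versions, something like the following can be used (a sketch; run on a Debian/Ubuntu system, with the package names taken from the list above):

```
# Show where the installed libstdc++6 came from and which candidate
# versions each configured repository (e.g. the toolchain PPA) offers.
apt-cache policy libstdc++6

# Quick loop over a few of the libraries touched by install-deps.sh:
for pkg in libatomic1 libgcc1 libgomp1 libstdc++6; do
    dpkg-query -W -f='${Package} ${Version}\n' "$pkg" 2>/dev/null
done
```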
we are already rolling our own .deb packages for Proxmox VE (based on the upstream debian/ directory) because we have been bitten in the past by issues not caught in the upstream CI infrastructure. we try to stay involved in the community, e.g. by opening or forwarding bugs after initial triaging, and contributing fixes when possible. we do spread the "gospel of Ceph" and promote it quite heavily, and we do develop integration and management layers that probably allow end users to set up and use Ceph who would otherwise not dare to because of the complexity involved. but in the end, we (as in Ceph the upstream project, and Proxmox as downstream distro) both face a similar dilemma - given limited developer
Re: [ceph-users] Ceph Mimic on Debian 9 Stretch
On Mon, Jun 04, 2018 at 06:39:08PM +, Sage Weil wrote: > [adding ceph-maintainers] [and ceph-devel] > > On Mon, 4 Jun 2018, Charles Alva wrote: > > Hi Guys, > > > > When will the Ceph Mimic packages for Debian Stretch be released? I could not > > find the packages even after changing the sources.list. > > The problem is that we're now using c++17, which requires a newer gcc > than stretch or jessie provide, and Debian does not provide backports of > the newer gcc packages. We currently can't build the latest Ceph for > those releases. IMHO this is backwards. if you want to support distro X you should take care to not need toolchain features that are not included in distro X. Debian only provides one toolchain backport, and that is for Firefox, which has a stable update exception because it is such an important component for desktop systems and cannot be supported otherwise[1]. This package is also not intended as a general purpose toolchain, but built solely for enabling a Firefox backport. > We raised this with the Debian package maintainers about a month ago[1][2] > when the first release candidate was built and didn't get any response > (beyond a "yes, there are no gcc package backports"). this is not how Debian works, as you most likely know ;) > Both ubuntu and > fedora/rhel/centos (and I presume sles/opensuse) provide compiler > backports, so we did not anticipate this being a problem. this is also not very accurate. it is true that Canonical provides a toolchain PPA[2] which the Ceph build for Xenial seems to use, but there is (AFAICT) no official guarantee for the level of support, security or otherwise[3]. in fact, the PPA description states that it contains "Toolchain test builds", which seems to mean pretty automatic backports of whatever is in the current Ubuntu dev release, with a very short delay between upload to Cosmic and the PPA for Xenial. e.g., for the currently contained gcc-7 packages, there was less than a week between hitting Cosmic and Xenial.
Cosmic at the current point in the release cycle is already not really exposed to public testing scrutiny in general, and for sure not to the level something like the core toolchain would require. effectively this means the current Xenial builds are about as safe and production-ready as doing your own gcc backports for Stretch - i.e., not very. > We'd love to build for stretch, but until there is a newer gcc for that > distro it's not possible. We could build packages for 'testing', but I'm > not sure if those will be usable on stretch. saying you'd love to build for a distro, while effectively making sure a build according to that distro's release policies is impossible without major effort by someone else does strike me as a bit of a hollow statement. in the end this is a further nail in the coffin of upstream support for the Debian(-based) distros, with only the latest (1.5 months old!) Ubuntu LTS being properly supported. I hope we find some way to support Mimic+ for Stretch without requiring a backport of gcc-7+, although it unfortunately seems unlikely at this point. 1: https://tracker.debian.org/pkg/gcc-mozilla 2: https://launchpad.net/~ubuntu-toolchain-r/+archive/ubuntu/test 3: https://wiki.ubuntu.com/ToolChain#Toolchain_Updates
Re: [ceph-users] No more Luminous packages for Debian Jessie ??
On Wed, Mar 07, 2018 at 02:04:52PM +0100, Fabian Grünbichler wrote: > On Wed, Feb 28, 2018 at 10:24:50AM +0100, Florent B wrote: > > Hi, > > > > Since yesterday, the "ceph-luminous" repository does not contain any > > package for Debian Jessie. > > > > Is it expected ? > > AFAICT the packages are all there[2], but the Packages file only > references the ceph-deploy package so apt does not find the rest. > > IMHO this looks like something went wrong when generating the repository > metadata files - so maybe it's just a question of getting the people who > maintain the repository to notice this thread ;) and as alfredo just pointed out on IRC, it has already been fixed!
Re: [ceph-users] No more Luminous packages for Debian Jessie ??
On Wed, Feb 28, 2018 at 10:24:50AM +0100, Florent B wrote: > Hi, > > Since yesterday, the "ceph-luminous" repository does not contain any > package for Debian Jessie. > > Is it expected ? AFAICT the packages are all there[2], but the Packages file only references the ceph-deploy package so apt does not find the rest. IMHO this looks like something went wrong when generating the repository metadata files - so maybe it's just a question of getting the people who maintain the repository to notice this thread ;) 2: http://download.ceph.com/debian-luminous/pool/main/c/ceph/
Re: [ceph-users] ceph-volume lvm deactivate/destroy/zap
On Tue, Jan 09, 2018 at 02:14:51PM -0500, Alfredo Deza wrote: > On Tue, Jan 9, 2018 at 1:35 PM, Reed Dier wrote: > > I would just like to mirror what Dan van der Ster’s sentiments are. > > > > As someone attempting to move an OSD to bluestore, with limited/no LVM > > experience, it is a completely different beast and complexity level compared > > to the ceph-disk/filestore days. > > > > ceph-deploy was a very simple tool that did exactly what I was looking to > > do, but now we have deprecated ceph-disk halfway into a release, ceph-deploy > > doesn’t appear to fully support ceph-volume, which is now the official way > > to manage OSDs moving forward. > > ceph-deploy now fully supports ceph-volume, we should get a release soon > > > > > My ceph-volume create statement ‘succeeded’ but the OSD doesn’t start, so > > now I am trying to zap the disk to try to recreate the OSD, and the zap is > > failing as Dan’s did. > > I would encourage you to open a ticket in the tracker so that we can > improve on what failed for you > > http://tracker.ceph.com/projects/ceph-volume/issues/new > > ceph-volume keeps thorough logs in /var/log/ceph/ceph-volume.log and > /var/log/ceph/ceph-volume-systemd.log > > If you create a ticket, please make sure to add all the output and > steps that you can > > > > And yes, I was able to get it zapped using the lvremove, vgremove, pvremove > > commands, but that is not obvious to someone who hasn’t used LVM extensively > > for storage management before. > > > > I also want to mirror Dan’s sentiments about the unnecessary complexity > > imposed on what I expect is the default use case of an entire disk being > > used. I can’t see anything more than the ‘entire disk’ method being the > > largest use case for users of ceph, especially the smaller clusters trying > > to maximize hardware/spend. > > We don't take lightly the introduction of LVM here. The new tool is > addressing several insurmountable issues with how ceph-disk operated.
> > Although using an entire disk might be easier in the use case you are > in, it is certainly not the only thing we have to support, so then > again, we can't > reliably decide what strategy would be best to destroy that volume, or > group, or if the PV should be destroyed as well. wouldn't it be possible to detect on creation that it is a full physical disk that gets initialized completely by ceph-volume, store that in the metadata somewhere and clean up accordingly when destroying the OSD? > > The 'zap' sub-command will allow that lv to be reused for an OSD and > that should work. Again, if it isn't sufficient, we really do need > more information and a > ticket in the tracker is the best way.
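The manual cleanup mentioned earlier in the thread (when 'ceph-volume lvm zap' is not sufficient) amounts to tearing the LVM stack down from the top. The volume group, logical volume and device names below are hypothetical placeholders - substitute the ones 'lvs'/'vgs'/'pvs' report for the failed OSD:

```
# Remove the logical volume, volume group and physical volume that
# ceph-volume created, then wipe leftover signatures from the disk.
lvremove -f ceph-block-vg/osd-block-lv   # hypothetical VG/LV names
vgremove ceph-block-vg
pvremove /dev/sdX                        # replace with the actual device
wipefs --all /dev/sdX
```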
Re: [ceph-users] Hangs with qemu/libvirt/rbd when one host disappears
On Thu, Dec 07, 2017 at 09:59:43AM +0100, Marcus Priesch wrote: > Hello Brad, > > thanks for your answer ! > > >> at least the point of all is that a single host should be allowed to > >> fail and the vm's continue running ... ;) > > > > You don't really have six MONs do you (although I know the answer to > > this question)? I think you need to take another look at some of the > > docs about monitors. > > however i dont get the point here ... > > because its an even number ? > > i read docs ... but dont get any hints on the number of mons ... i would > assume, the more the better ... is this wrong ? an even number is always bad for quorum-based systems (6 is no better than 5, as you can only tolerate a loss of 2 before losing quorum). in Ceph, additional monitors require additional resources AND generate additional overhead (more mons -> more communication). the rule of thumb is 3 for small to mid-sized clusters. the next step up performance-wise would be to move the 3 mons to their own stand-alone nodes, and only once that starts to bottleneck, you increase the number to 5 and/or upgrade the HW to become faster. for really big clusters, you can then start splitting out the mgr instances to reduce the load further.
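The quorum arithmetic behind "6 is no better than 5" can be spelled out: a majority quorum of N monitors survives as long as more than N/2 are alive, so it tolerates floor((N-1)/2) failures. A quick sketch:

```shell
# Failure tolerance of a majority quorum: floor((N - 1) / 2).
# Note that an even N never tolerates more failures than N - 1 does.
for n in 3 4 5 6 7; do
  echo "$n mons tolerate $(( (n - 1) / 2 )) failure(s)"
done
```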
Re: [ceph-users] Increasing mon_pg_warn_max_per_osd in v12.2.2
On Mon, Dec 04, 2017 at 11:21:42AM +0100, SOLTECSIS - Victor Rodriguez Cortes wrote: > > > Why are you OK with this? A high amount of PGs can cause serious peering > > issues. OSDs might eat up a lot of memory and CPU after a reboot or such. > > > > Wido > > Mainly because there was no warning at all in v12.2.1 and it just > appeared after upgrading to v12.2.2. Besides, it's not a "too high" number > of PGs for this environment and no CPU/peering issues have been detected > yet. > > I'll plan a way to create new OSD's/new CephFS and move files to it, but > in the mean time I would like to just increase that variable, which is > supposed to be supported and easy. > > Thanks the option is now called 'mon_max_pg_per_osd'. this was originally slated for v12.2.1 where it was erroneously mentioned in the release notes[1] despite not being part of the release (I remember asking for updated/fixed release notes after 12.2.1, seems like that never happened?). now it was applied as part of v12.2.2, but is not mentioned at all in the release notes[2]... 1: http://docs.ceph.com/docs/master/release-notes/#v12-2-1-luminous 2: http://docs.ceph.com/docs/master/release-notes/#v12-2-2-luminous
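Raising the renamed option then looks like the hypothetical ceph.conf fragment below; 300 is just an example value (the v12.2.x default is 200):

```ini
[global]
# renamed from mon_pg_warn_max_per_osd in earlier releases
mon_max_pg_per_osd = 300
```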
Re: [ceph-users] ceph-disk removal roadmap (was ceph-disk is now deprecated)
On Thu, Nov 30, 2017 at 11:25:03AM -0500, Alfredo Deza wrote: > Thanks all for your feedback on deprecating ceph-disk, we are very > excited to be able to move forwards on a much more robust tool and > process for deploying and handling activation of OSDs, removing the > dependency on UDEV which has been a tremendous source of constant > issues. > > Initially (see "killing ceph-disk" thread [0]) we planned for removal > of Mimic, but we didn't want to introduce the deprecation warnings up > until we had an out for those who had OSDs deployed in previous > releases with ceph-disk (we are now able to handle those as well). > That is the reason ceph-volume, although present since the first > Luminous release, hasn't been pushed forward much. > > Now that we feel like we can cover almost all cases, we would really > like to see a wider usage so that we can improve on issues/experience. > > Given that 12.2.2 is already in the process of getting released, we > can't undo the deprecation warnings for that version, but we will > remove them for 12.2.3, add them back again in Mimic, which will mean > ceph-disk will be kept around a bit longer, and finally fully removed > by N. > > To recap: > > * ceph-disk deprecation warnings will stay for 12.2.2 > * deprecation warnings will be removed in 12.2.3 (and from all later > Luminous releases) > * deprecation warnings will be added again in ceph-disk for all Mimic releases > * ceph-disk will no longer be available for the 'N' release, along > with the UDEV rules > > I believe these four points address most of the concerns voiced in > this thread, and should give enough time to port clusters over to > ceph-volume. 
> > [0] > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-October/021358.html Thank you for listening to the feedback - I think most of us know that the balance that needs to be struck between moving a project forward and decrufting a code base versus providing a stable enough interface for users is not always easy to find. I think the above roadmap is a good compromise for all involved parties, and I hope we can use the remainder of Luminous to prepare for a seamless and painless transition to ceph-volume in time for the Mimic release, and then finally retire ceph-disk for good!
Re: [ceph-users] ceph-disk is now deprecated
On Thu, Nov 30, 2017 at 07:04:33AM -0500, Alfredo Deza wrote: > On Thu, Nov 30, 2017 at 6:31 AM, Fabian Grünbichler > <f.gruenbich...@proxmox.com> wrote: > > On Tue, Nov 28, 2017 at 10:39:31AM -0800, Vasu Kulkarni wrote: > >> On Tue, Nov 28, 2017 at 9:22 AM, David Turner <drakonst...@gmail.com> > >> wrote: > >> > Isn't marking something as deprecated meaning that there is a better > >> > option > >> > that we want you to use and you should switch to it sooner than later? I > >> > don't understand how this is ready to be marked as such if ceph-volume > >> > can't > >> > be switched to for all supported use cases. If ZFS, encryption, FreeBSD, > >> > etc > >> > are all going to be supported under ceph-volume, then how can ceph-disk > >> > be > >> > deprecated before ceph-volume can support them? I can imagine many Ceph > >> > admins wasting time chasing an erroneous deprecated warning because it > >> > came > >> > out before the new solution was mature enough to replace the existing > >> > solution. > >> > >> There is no need to worry about this deprecation, Its mostly for > >> admins to be prepared > >> for the changes coming ahead and its mostly for *new* installations > >> that can plan on using ceph-volume which provides > >> great flexibility compared to ceph-disk. > > > > changing existing installations to output deprecation warnings from one > > minor release to the next means it is not just for new installations > > though, no matter how you spin it. a mention in the release notes and > > docs would be enough to get admins to test and use ceph-volume on new > > installations. 
> > > > I am pretty sure many admins will be bothered by all nodes running OSDs > > spamming the logs and their terminals with huge deprecation warnings on > > each OSD activation[1] or other actions involving ceph-disk, and having > > this state for the remainder of Luminous unless they switch to a new > > (and as of yet not battle-tested) way of activating their OSDs seems > > crazy to me. > > > > I know our users will be, and given the short notice and huge impact > > this would have we will likely have to remove the deprecation warnings > > altogether in our (downstream) packages until we have completed testing > > of and implementing support for ceph-volume.. > > > >> > >> a) many dont use ceph-disk or ceph-volume directly, so the tool you > >> have right now eg: ceph-deploy or ceph-ansible > >> will still support the ceph-disk, the previous ceph-deploy release is > >> still available from pypi > >> https://pypi.python.org/pypi/ceph-deploy > > > > we have >> 10k (user / customer managed!) installations on Ceph Luminous > > alone, all using our wrapper around ceph-disk - changing something like > > this in the middle of a release causes huge headaches for downstreams > > like us, and is not how a stable project is supposed to be run. > > If you are using a wrapper around ceph-disk, then silencing the > deprecation warnings should be easy to do. > > These are plain Python warnings, and can be silenced within Python or > environment variables. There are some details > on how to do that here https://github.com/ceph/ceph/pull/18989 the problem is not how to get rid of the warnings, but having to do so at all when upgrading from one bug-fix release to the next.
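The suppression Alfredo refers to can be illustrated without ceph-disk itself - a stand-in DeprecationWarning behaves the same way (the warning text below is made up; the real ceph-disk message differs):

```shell
# A DeprecationWarning raised in __main__ is printed to stderr by default:
python3 -c 'import warnings; warnings.warn("ceph-disk is deprecated", DeprecationWarning)' 2>&1

# ... but the PYTHONWARNINGS environment variable silences it, which is
# the kind of workaround a downstream wrapper script could apply:
PYTHONWARNINGS="ignore::DeprecationWarning" \
    python3 -c 'import warnings; warnings.warn("ceph-disk is deprecated", DeprecationWarning)' 2>&1
```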
> > > >> > >> b) also the current push will help anyone who is using ceph-deploy or > >> ceph-disk in scripts/chef/etc > >>to have time to think about using newer cli based on ceph-volume > > > > a regular deprecate at the beginning of the release cycle were the > > replacement is deemed stable, remove in the next release cycle would be > > adequate for this purpose. > > > > I don't understand the rush to shoe-horn ceph-volume into existing > > supposedly stable Ceph installations at all - especially given the > > current state of ceph-volume (we'll file bugs once we are done writing > > them up, but a quick rudimentary test already showed stuff like choking > > on valid ceph.conf files because they contain leading whitespace and > > incomplete error handling leading to crush map entries for failed OSD > > creation attempts). > > Any ceph-volume bugs are welcomed as soon as you can get them to us. > Waiting to get them reported is a problem, since ceph-volume > is tied to Ceph releases, it means that these will now have to wait > for another point re
Re: [ceph-users] ceph-disk is now deprecated
On Tue, Nov 28, 2017 at 10:39:31AM -0800, Vasu Kulkarni wrote: > On Tue, Nov 28, 2017 at 9:22 AM, David Turnerwrote: > > Isn't marking something as deprecated meaning that there is a better option > > that we want you to use and you should switch to it sooner than later? I > > don't understand how this is ready to be marked as such if ceph-volume can't > > be switched to for all supported use cases. If ZFS, encryption, FreeBSD, etc > > are all going to be supported under ceph-volume, then how can ceph-disk be > > deprecated before ceph-volume can support them? I can imagine many Ceph > > admins wasting time chasing an erroneous deprecated warning because it came > > out before the new solution was mature enough to replace the existing > > solution. > > There is no need to worry about this deprecation, Its mostly for > admins to be prepared > for the changes coming ahead and its mostly for *new* installations > that can plan on using ceph-volume which provides > great flexibility compared to ceph-disk. changing existing installations to output deprecation warnings from one minor release to the next means it is not just for new installations though, no matter how you spin it. a mention in the release notes and docs would be enough to get admins to test and use ceph-volume on new installations. I am pretty sure many admins will be bothered by all nodes running OSDs spamming the logs and their terminals with huge deprecation warnings on each OSD activation[1] or other actions involving ceph-disk, and having this state for the remainder of Luminous unless they switch to a new (and as of yet not battle-tested) way of activating their OSDs seems crazy to me. I know our users will be, and given the short notice and huge impact this would have we will likely have to remove the deprecation warnings altogether in our (downstream) packages until we have completed testing of and implementing support for ceph-volume.. 
> > a) many dont use ceph-disk or ceph-volume directly, so the tool you > have right now eg: ceph-deploy or ceph-ansible > will still support the ceph-disk, the previous ceph-deploy release is > still available from pypi > https://pypi.python.org/pypi/ceph-deploy we have >> 10k (user / customer managed!) installations on Ceph Luminous alone, all using our wrapper around ceph-disk - changing something like this in the middle of a release causes huge headaches for downstreams like us, and is not how a stable project is supposed to be run. > > b) also the current push will help anyone who is using ceph-deploy or > ceph-disk in scripts/chef/etc >to have time to think about using newer cli based on ceph-volume A regular deprecation at the beginning of the release cycle in which the replacement is deemed stable, with removal in the next release cycle, would be adequate for this purpose. I don't understand the rush to shoe-horn ceph-volume into existing supposedly stable Ceph installations at all - especially given the current state of ceph-volume (we'll file bugs once we are done writing them up, but a quick rudimentary test already showed stuff like choking on valid ceph.conf files because they contain leading whitespace, and incomplete error handling leading to crush map entries for failed OSD creation attempts). I DO understand the motivation behind ceph-volume and the desire to get rid of the udev-based trigger mess, but the solution is not to scare users into switching in the middle of a release by introducing deprecation warnings for a core piece of the deployment stack. IMHO the only reason to push or force such a switch in this manner would be a (grave) security or data corruption bug, which is not the case at all here.. 1: Have you looked at the journal / boot logs of a mid-sized OSD node using ceph-disk for activation with the deprecation warning active? If my boot log is suddenly filled with 20% warnings, my first reaction will be that something is very wrong.. 
My second reaction, on realizing what is going on, is probably not fit for posting to a public mailing list ;) ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] [Ceph-announce] Luminous v12.2.1 released
On Thu, Sep 28, 2017 at 05:46:30PM +0200, Abhishek wrote: > This is the first bugfix release of Luminous v12.2.x long term stable > release series. It contains a range of bug fixes and a few features > across CephFS, RBD & RGW. We recommend all the users of 12.2.x series > update. > > For more details, refer to the release notes entry at the official > blog[1] and the complete changelog[2] > > Notable Changes > --- > > [ snip ] > > * The maximum number of PGs per OSD before the monitor issues a >warning has been reduced from 300 to 200 PGs. 200 is still twice >the generally recommended target of 100 PGs per OSD. This limit can >be adjusted via the ``mon_max_pg_per_osd`` option on the >monitors. The older ``mon_pg_warn_max_per_osd`` option has been > removed. > > * Creating pools or adjusting pg_num will now fail if the change would >make the number of PGs per OSD exceed the configured >``mon_max_pg_per_osd`` limit. The option can be adjusted if it >is really necessary to create a pool with more PGs. > > [ snip ] > > Getting Ceph > > > [ snip ] > > [1]: http://ceph.com/releases/v12-2-1-luminous-released/ > [2]: https://github.com/ceph/ceph/blob/master/doc/changelog/v12.2.1.txt > Those release notes should be corrected: the change in [1] apparently did not make the cut for 12.2.1, yet it makes up a third of the notable changes. 1: https://github.com/ceph/ceph/pull/17814
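Since the new limit is expressed per OSD while pg_num is a per-pool total, the arithmetic behind the check is worth spelling out; a small illustrative helper (not Ceph code, names are my own):

```python
def pgs_per_osd(pg_num: int, replica_count: int, num_osds: int) -> float:
    """Average PG copies per OSD: each of the pg_num PGs is stored
    on replica_count OSDs, spread across num_osds devices."""
    return pg_num * replica_count / num_osds

# A 3-replica pool with 1024 PGs on a 12-OSD cluster:
print(pgs_per_osd(1024, 3, 12))        # 256.0, above the new 200 limit
print(pgs_per_osd(1024, 3, 12) > 200)  # True: creating it would now fail
```

With the default `mon_max_pg_per_osd` of 200, such a pool creation would be rejected unless the option is raised on the monitors.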
Re: [ceph-users] Ceph packages for Debian Stretch?
On Wed, Jun 21, 2017 at 05:30:02PM +0900, Christian Balzer wrote: > > Hello, > > On Wed, 21 Jun 2017 09:47:08 +0200 (CEST) Alexandre DERUMIER wrote: > > > Hi, > > > > Proxmox is maintening a ceph-luminous repo for stretch > > > > http://download.proxmox.com/debian/ceph-luminous/ > > > > > > git is here, with patches and modifications to get it work > > https://git.proxmox.com/?p=ceph.git;a=summary > > > While this is probably helpful for the changes needed, my quest is for > Jewel (really all supported builds) for Stretch. > And not whenever Luminous gets released, but within the next 10 days. I think you should be able to just backport the needed commits from http://tracker.ceph.com/issues/19884 on top of v10.2.7, bump the version in debian/changelog and use dpkg-buildpackage (or a wrapper of your choice) to rebuild the packages. Building takes a while though ;) Alternatively, use the slightly outdated stock Debian packages (based on 10.2.5, with slightly deviating packaging and the patches in [1]) and switch over to the official ones when they are available. 1: https://anonscm.debian.org/cgit/pkg-ceph/ceph.git/tree/debian/patches?id=7e85745cc7aece92e8f2e505285d451ec2210afa > > Though clearly that's not going to happen, oh well. Mismatched schedules between yourself and upstream can be cumbersome - but at least in the case of FLOSS you can always take matters into your own hands and roll your own if the need is big enough ;) > > Christian
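The rebuild route suggested above can be sketched roughly as follows; the commit IDs to cherry-pick come from the tracker issue and are left as placeholders here, and the changelog suffix and entry text are illustrative, not prescribed:

```
# Sketch only: rebuild Jewel v10.2.7 packages for Stretch
git clone https://github.com/ceph/ceph.git && cd ceph
git checkout v10.2.7
git cherry-pick <commit-from-issue-19884>   # repeat for each needed commit
dch --local +stretch "Rebuild for Debian Stretch with backported fixes"
dpkg-buildpackage -us -uc -b                # unsigned, binary-only build
```

`dch --local` (from devscripts) appends a local suffix to the version so the rebuilt packages are upgradeable to the eventual official ones; as noted, the build itself takes a while.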
[ceph-users] sortbitwise warning broken on Ceph Jewel?
The Kraken release notes[1] contain the following note about the sortbitwise flag and upgrading from <= Jewel to > Jewel: The sortbitwise flag must be set on the Jewel cluster before upgrading to Kraken. The latest Jewel (10.2.4+) releases issue a health warning if the flag is not set, so this is probably already set. If it is not, Kraken OSDs will refuse to start and will print an error message in their log. I think this refers to the warning introduced by d3dbd8581 [2], which is triggered if:
- a mon config key is set to true (default, not there in master anymore)
- the sortbitwise flag is not set (default for clusters upgrading from hammer, not the default for new jewel clusters)
- the OSDs support sortbitwise (I assume this is the default for Jewel OSDs? I am not sure how to get this information from a running OSD?)
I have not been able to trigger this warning for either an upgraded Hammer cluster (all nodes upgraded from latest Hammer to latest Jewel and rebooted) which does not have sortbitwise set, nor for a freshly installed Jewel cluster where I manually unset sortbitwise and rebooted afterwards. Am I doing something wrong, or is the check somehow broken? If the latter is the case, the release notes are very misleading (as users will probably rely on "no health warning -> safe to upgrade"). I also see one follow-up fix[3] which was only included in Kraken so far, but AFAICT this should only possibly affect the second test with a manually unset sortbitwise on Jewel, and not the Hammer -> Jewel -> Kraken/Luminous upgrade path. 1: http://docs.ceph.com/docs/master/release-notes/#upgrading-from-jewel 2: https://github.com/ceph/ceph/commit/d3dbd8581bd39572dc55d4953b5d8c49255426d7 3: https://github.com/ceph/ceph/pull/12682
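The three conditions can be summarized as a boolean sketch (a paraphrase of the logic in the commit referenced above, not the actual C++, and the parameter names are my own):

```python
def sortbitwise_warning(mon_warn_enabled: bool,
                        sortbitwise_set: bool,
                        osds_support_sortbitwise: bool) -> bool:
    """Warn only when the mon config key is enabled, the flag is
    NOT set, and the OSDs already support sortbitwise."""
    return (mon_warn_enabled
            and not sortbitwise_set
            and osds_support_sortbitwise)

# Upgraded-from-Hammer cluster that never set the flag:
print(sortbitwise_warning(True, False, True))   # True -> should warn
# Fresh Jewel cluster, flag set at creation:
print(sortbitwise_warning(True, True, True))    # False -> no warning
```

By this reading, the upgraded Hammer cluster described above should have warned, which is exactly why the observed silence looks like a broken check.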
Re: [ceph-users] Automatic OSD start on Jewel
On Wed, Jan 04, 2017 at 12:55:56PM +0100, Florent B wrote: > On 01/04/2017 12:18 PM, Fabian Grünbichler wrote: > > On Wed, Jan 04, 2017 at 12:03:39PM +0100, Florent B wrote: > >> Hi everyone, > >> > >> I have a problem with automatic start of OSDs on Debian Jessie with Ceph > >> Jewel. > >> > >> My osd.0 is using /dev/sda5 for data and /dev/sda2 for journal, it is > >> listed in ceph-disk list : > >> > >> /dev/sda : > >> /dev/sda1 other, 21686148-6449-6e6f-744e-656564454649 > >> /dev/sda3 other, linux_raid_member > >> /dev/sda4 other, linux_raid_member > >> /dev/sda2 ceph journal, for /dev/sda5 > >> /dev/sda5 ceph data, active, cluster ceph, osd.0, journal /dev/sda2 > >> > >> It was created with ceph-disk prepare. > >> > >> When I run "ceph-disk activate /dev/sda5", it is mounted and started. > >> > >> If I run "systemctl start ceph-disk@/dev/sda5", the same, it's OK. But > >> this is a service that can't be "enabled" !! > >> > >> But on reboot, nothing happen. The only thing which tries to start is > >> ceph-osd@0 service (enabled by ceph-disk, not me), and of course it > >> fails because its data is not mounted. > >> > >> I think udev rules should do this, but it does not seem to. 
> >> > >> > >> root@host102:~# sgdisk -i 2 /dev/sda > >> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown) > >> Partition unique GUID: D0F4F00F-723D-4DAD-BA2E-93D52EB564C1 > >> First sector: 2048 (at 1024.0 KiB) > >> Last sector: 9765887 (at 4.7 GiB) > >> Partition size: 9763840 sectors (4.7 GiB) > >> Attribute flags: > >> Partition name: 'ceph journal' > >> > >> root@host102:~# sgdisk -i 5 /dev/sda > >> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown) > >> Partition unique GUID: 5AB4F732-AFBE-4DEA-A4C6-AD290C1302D9 > >> First sector: 123047424 (at 58.7 GiB) > >> Last sector: 1953459199 (at 931.5 GiB) > >> Partition size: 1830411776 sectors (872.8 GiB) > >> Attribute flags: > >> Partition name: 'ceph data' > >> > >> > >> Does someone have an idea of what's going on ? > >> > >> Thank you. > >> > >> Florent > > are you using the packages from ceph.com? if so, you might be affected > > by http://tracker.ceph.com/issues/18305 (and > > http://tracker.ceph.com/issues/17889) > > > > did you mask the ceph.service unit generated from the ceph init script? > > > > what does "systemctl status '*ceph*'" show? what does "journalctl -b > > '*ceph*'" show? > > > > what happens if you run "ceph-disk activate-all"? (this is what is > > called last in the init script and will probably trigger mounting of the > > OSD disk/partition and starting of the ceph-osd@.. service) > > > > Thank you, that was the problem : I disabled ceph.service unit because I > thought it was an "old" thing, I didn't knew it is always used. > Re-enabling it did the trick. > > Isn't it an "old way" of doing things ? > I am not sure if the init script was left on purpose or if nobody realized that the existing systemd units don't cover all the activation paths because the init script was forgotten and hides this fact quite well. I assume the latter ;) IMHO the current situation is wrong, which is why I filed the bug (including a proposed fix). 
Especially since the init script actually starts monitors using systemd-run as transient units instead of via ceph-mon@XYZ, the startup situation on monitor nodes can get quite confusing and racy. So far there hasn't been any feedback; maybe this thread will help and get some more eyes to look at it.
Re: [ceph-users] Automatic OSD start on Jewel
On Wed, Jan 04, 2017 at 12:03:39PM +0100, Florent B wrote: > Hi everyone, > > I have a problem with automatic start of OSDs on Debian Jessie with Ceph > Jewel. > > My osd.0 is using /dev/sda5 for data and /dev/sda2 for journal, it is > listed in ceph-disk list : > > /dev/sda : > /dev/sda1 other, 21686148-6449-6e6f-744e-656564454649 > /dev/sda3 other, linux_raid_member > /dev/sda4 other, linux_raid_member > /dev/sda2 ceph journal, for /dev/sda5 > /dev/sda5 ceph data, active, cluster ceph, osd.0, journal /dev/sda2 > > It was created with ceph-disk prepare. > > When I run "ceph-disk activate /dev/sda5", it is mounted and started. > > If I run "systemctl start ceph-disk@/dev/sda5", the same, it's OK. But > this is a service that can't be "enabled" !! > > But on reboot, nothing happen. The only thing which tries to start is > ceph-osd@0 service (enabled by ceph-disk, not me), and of course it > fails because its data is not mounted. > > I think udev rules should do this, but it does not seem to. > > > root@host102:~# sgdisk -i 2 /dev/sda > Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown) > Partition unique GUID: D0F4F00F-723D-4DAD-BA2E-93D52EB564C1 > First sector: 2048 (at 1024.0 KiB) > Last sector: 9765887 (at 4.7 GiB) > Partition size: 9763840 sectors (4.7 GiB) > Attribute flags: > Partition name: 'ceph journal' > > root@host102:~# sgdisk -i 5 /dev/sda > Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown) > Partition unique GUID: 5AB4F732-AFBE-4DEA-A4C6-AD290C1302D9 > First sector: 123047424 (at 58.7 GiB) > Last sector: 1953459199 (at 931.5 GiB) > Partition size: 1830411776 sectors (872.8 GiB) > Attribute flags: > Partition name: 'ceph data' > > > Does someone have an idea of what's going on ? > > Thank you. > > Florent are you using the packages from ceph.com? 
If so, you might be affected by http://tracker.ceph.com/issues/18305 (and http://tracker.ceph.com/issues/17889). Did you mask the ceph.service unit generated from the ceph init script? What does "systemctl status '*ceph*'" show? What does "journalctl -b '*ceph*'" show? What happens if you run "ceph-disk activate-all"? (This is what is called last in the init script and will probably trigger mounting of the OSD disk/partition and starting of the ceph-osd@.. service.)
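As an aside, the "(Unknown)" labels in the quoted sgdisk output only mean that sgdisk has no display name for those type GUIDs; they are in fact the standard ceph-disk GPT type codes that the udev rules key on to trigger activation. A small lookup sketch using the two GUIDs from the output above:

```python
# GPT partition type GUIDs as shown in the sgdisk output above; these
# are the codes ceph-disk's udev rules match to activate partitions.
CEPH_TYPE_GUIDS = {
    "4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D": "ceph data",
    "45B0969E-9B03-4F30-B4C6-B4B80CEFF106": "ceph journal",
}

def classify(type_guid: str) -> str:
    return CEPH_TYPE_GUIDS.get(type_guid.upper(),
                               "not a ceph-disk partition")

print(classify("45b0969e-9b03-4f30-b4c6-b4b80ceff106"))  # ceph journal
print(classify("0fc63daf-8483-4772-8e79-3d69d8477de4"))  # not a ceph-disk partition
```

So in the problem report above, the type codes themselves were correct, which points the finger at the udev/systemd plumbing rather than at the partitioning.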