Re: [ceph-users] Please help: change IP address of a cluster
Hi,

I had to change the IPs of my cluster some time ago. The process was quite easy. I don't understand what you mean by configuring and deleting static routes. The easiest way is if the router allows (at least for the duration of the change) all traffic between the old and the new network.

I did the following steps:

1. Add the new IP network, space separated, to the "public network" line in your ceph.conf.

2. OSDs: stop the OSDs on the first node, reconfigure the host network and start the OSDs again. Repeat this for all hosts, one by one.

3. MON: stop and remove one mon from the cluster, delete all data in /var/lib/ceph/mon/<mon-id>, reconfigure the host network and create the new mon instance (don't forget the "mon host" entries in your ceph.conf and on your clients as well). Of course this requires at least 3 mons in your cluster! After 2 of the 5 mons in my cluster had been moved, I added the new mon addresses to my clients and restarted them.

4. MGR: stop the mgr daemon, reconfigure the host network and start the mgr daemon again, one node at a time.

I wouldn't recommend the "messy way" of reconfiguring your mons. Removing and adding mons to the cluster is quite easy and, in my opinion, the safest approach. The complete IP change in our cluster worked without outage while the cluster was in production.

I hope I could help you.

Regards
Manuel

On Fri, 19 Jul 2019 10:22:37 + "ST Wong (ITSC)" wrote:

> Hi all,
>
> Our cluster has to change to a new IP range in the same VLAN: 10.0.7.0/24
> -> 10.0.18.0/23, while the IP addresses on the private network for OSDs
> remain unchanged. I wonder if we can do that in either one of the following
> ways:
>
> =
>
> 1.
> a. Define static route for 10.0.18.0/23 on each node
> b. Do it one by one:
>    For each monitor/mgr:
>    - remove from cluster
>    - change IP address
>    - add static route to original IP range 10.0.7.0/24
>    - delete static route for 10.0.18.0/23
>    - add back to cluster
>    For each OSD:
>    - stop OSD daemons
>    - change IP address
>    - add static route to original IP range 10.0.7.0/24
>    - delete static route for 10.0.18.0/23
>    - start OSD daemons
> c. Clean up all static routes defined.
>
> 2.
> a. Export and update monmap using the messy way as described in
>    http://docs.ceph.com/docs/mimic/rados/operations/add-or-rm-mons/
>
>    ceph mon getmap -o {tmp}/{filename}
>    monmaptool --rm node1 --rm node2 ... --rm nodeN {tmp}/{filename}
>    monmaptool --add node1 v2:10.0.18.1:3330,v1:10.0.18.1:6789 --add node2
>      v2:10.0.18.2:3330,v1:10.0.18.2:6789 ... --add nodeN
>      v2:10.0.18.N:3330,v1:10.0.18.N:6789 {tmp}/{filename}
>
> b. Stop entire cluster daemons and change IP addresses.
> c. For each mon node: ceph-mon -i {mon-id} --inject-monmap {tmp}/{filename}
> d. Restart cluster daemons.
>
> 3. Or any better method...
> =
>
> Would anyone please help? Thanks a lot.
> Rgds
> /st wong
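The ceph.conf changes described in steps 1 and 3 boil down to something like the following sketch (addresses are placeholders taken from the thread; the exact mon add/remove commands depend on the release in use):

    [global]
        # old and new network listed together for the duration of the migration
        public network = 10.0.7.0/24, 10.0.18.0/23
        # extend "mon host" with the new addresses as each mon is recreated
        mon host = 10.0.7.1, 10.0.7.2, 10.0.18.3

    # remove a mon, move the host to the new network, then recreate it:
    ceph mon remove <mon-id>
    ceph mon add <mon-id> 10.0.18.x:6789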
Re: [ceph-users] chown -R on every osd activating
On Tue, 5 Mar 2019 11:04:16 +0100 Paul Emmerich wrote:

> On Tue, Mar 5, 2019 at 10:51 AM Manuel Lausch wrote:
> > Now after rebooting a host I see there is a chown -R ceph:ceph
> > running on each OSD before the OSD daemon starts.
> >
> > This takes a lot of time (-> millions of objects per OSD) and I
> > think this is unnecessary on each startup. In my opinion chowning
> > was only needed for the update from hammer to jewel.
> >
> > I found this commit:
> > https://github.com/ceph/ceph/commit/100f2613a4659b3bd4e550250a41593860118010
> >
> > Is this intentional or is there a check missing whether the chown is
> > really necessary?
>
> This is clearly a bug; it should either have the recursive=False
> parameter set or explicitly chown the necessary files.
> I think it should be the former.
>
> Please open an issue at http://tracker.ceph.com/

Thanks. Here is the ticket: http://tracker.ceph.com/issues/38581
[ceph-users] chown -R on every osd activating
Hi,

we recently updated to ceph luminous 12.2.11 after running into this bug: http://tracker.ceph.com/issues/37784. But that is another story.

Now, after rebooting a host, I see there is a chown -R ceph:ceph running on each OSD before the OSD daemon starts. This takes a lot of time (-> millions of objects per OSD) and I think this is unnecessary on each startup. In my opinion chowning was only needed for the update from hammer to jewel.

I found this commit:
https://github.com/ceph/ceph/commit/100f2613a4659b3bd4e550250a41593860118010

Is this intentional, or is there a check missing whether the chown is really necessary?

Regards
Manuel
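As a rough illustration of the check discussed in the reply above (chown only when actually needed), a hypothetical shell guard; this is not the upstream fix, which belongs in ceph-volume itself:

    # hypothetical prestart guard; paths are examples only
    OSD_DIR=/var/lib/ceph/osd/ceph-0
    if [ "$(stat -c '%U:%G' "$OSD_DIR")" != "ceph:ceph" ]; then
        chown -R ceph:ceph "$OSD_DIR"
    fi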
Re: [ceph-users] migrate ceph-disk to ceph-volume fails with dmcrypt
On Wed, 23 Jan 2019 16:32:08 +0100 Manuel Lausch wrote:

> > The key api for encryption is *very* odd and a lot of its quirks are
> > undocumented. For example, ceph-volume is stuck supporting naming
> > files and keys 'lockbox' (for backwards compatibility) but there is
> > no real lockbox anymore. Another quirk is that when storing the
> > secret in the monitor, it is done using the following convention:
> >
> > dm-crypt/osd/{OSD FSID}/luks
> >
> > The 'luks' part there doesn't indicate anything about the type of
> > encryption (!!) so regardless of the type of encryption (luks or
> > plain) the key would still go there.
> >
> > If you manage to get the keys into the monitors you still wouldn't
> > be able to scan OSDs to produce the JSON files, but you would be
> > able to create the JSON file with the metadata that ceph-volume
> > needs to run the OSD.
>
> I think it is not that much of a problem to create the JSON files myself.
> Moving the keys to the monitors and creating appropriate auth keys
> should be more or less easy as well.
>
> The problem I see is that there are individual keys for the journal
> and data partition, while the new process uses only one key for both
> partitions.
>
> Maybe I can recreate the journal partition with the other key. But is
> this possible? Is there important data remaining on the journal after
> cleanly stopping the OSD which I cannot throw away without trashing
> the whole OSD?

OK, with a new empty journal the OSD will not start. I have now rescued the data with dd, re-encrypted it with another key and copied the data back. This worked so far.

Now I encoded the key with base64 and put it into the key-value store. I also created the necessary auth keys. Creating the JSON file by hand was quite easy.

But now there is one problem. ceph-disk opens the crypt like

  cryptsetup --key-file /etc/ceph/dmcrypt-keys/foobar ...

ceph-volume pipes the key via stdin like this

  cat foobar | cryptsetup --key-file - ...

The big problem: if the key is given via stdin, cryptsetup hashes this key by default with some hash. It only works if I set --hash plain. I think this is a bug in ceph-volume. Can someone confirm this?

Here is the related code I mean in ceph-volume:
https://github.com/ceph/ceph/blob/v12.2.10/src/ceph-volume/ceph_volume/util/encryption.py#L59

Regards
Manuel
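For illustration, the two invocation styles compared above, with placeholders for device and key names and with cipher/key-size options omitted (a sketch of the difference, not the exact ceph-disk/ceph-volume command lines):

    # ceph-disk style: key material read from a file on disk
    cryptsetup --key-file /etc/ceph/dmcrypt-keys/<key> open --type plain /dev/sdX1 <dm-name>

    # ceph-volume style: key piped in on stdin; in plain mode the input is
    # passphrase-hashed by default, so --hash plain is needed to get the same key material
    cat /etc/ceph/dmcrypt-keys/<key> | cryptsetup --key-file - --hash plain open --type plain /dev/sdX1 <dm-name>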
Re: [ceph-users] migrate ceph-disk to ceph-volume fails with dmcrypt
On Wed, 23 Jan 2019 08:11:31 -0500 Alfredo Deza wrote:

> I don't know how that would look like, but I think it is worth a try
> if re-deploying OSDs is not feasible for you.

Yes, if there is a working way to migrate this, I will give it a try.

> The key api for encryption is *very* odd and a lot of its quirks are
> undocumented. For example, ceph-volume is stuck supporting naming
> files and keys 'lockbox' (for backwards compatibility) but there is
> no real lockbox anymore. Another quirk is that when storing the
> secret in the monitor, it is done using the following convention:
>
> dm-crypt/osd/{OSD FSID}/luks
>
> The 'luks' part there doesn't indicate anything about the type of
> encryption (!!) so regardless of the type of encryption (luks or
> plain) the key would still go there.
>
> If you manage to get the keys into the monitors you still wouldn't be
> able to scan OSDs to produce the JSON files, but you would be able to
> create the JSON file with the metadata that ceph-volume needs to run
> the OSD.

I think it is not that much of a problem to create the JSON files myself. Moving the keys to the monitors and creating appropriate auth keys should be more or less easy as well.

The problem I see is that there are individual keys for the journal and data partition, while the new process uses only one key for both partitions.

Maybe I can recreate the journal partition with the other key. But is this possible? Is there important data remaining on the journal after cleanly stopping the OSD which I cannot throw away without trashing the whole OSD?
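A rough sketch of what moving a key into the monitors might look like, following the naming convention Alfredo describes (OSD FSID and paths are placeholders; this is an untested outline, not a documented migration path):

    # base64-encode the existing dmcrypt key and store it under the expected key name
    base64 -w0 /etc/ceph/dmcrypt-keys/<key-uuid> > /tmp/key.b64
    ceph config-key set dm-crypt/osd/<osd-fsid>/luks -i /tmp/key.b64   # "config-key put" on older releases
    ceph config-key get dm-crypt/osd/<osd-fsid>/luks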
Re: [ceph-users] migrate ceph-disk to ceph-volume fails with dmcrypt
On Wed, 23 Jan 2019 14:25:00 +0100 Jan Fajerski wrote:

> I might be wrong on this, since it's been a while since I played with
> that. But iirc you can't migrate a subset of ceph-disk OSDs to
> ceph-volume on one host. Once you run ceph-volume simple activate,
> the ceph-disk systemd units and udev profiles will be disabled. While
> the remaining ceph-disk OSDs will continue to run, they won't come up
> after a reboot. I'm sure there's a way to get them running again, but
> I imagine you'd rather not manually deal with that.

Yes, you are right. The activate disables ceph-disk system-wide. This is done by symlinking /etc/systemd/system/ceph-disk@.service to /dev/null. After deleting this symlink my OSDs started again after a reboot.

The startup processes from ceph-volume and ceph-disk might conflict with each other, but on a QA system this did work.
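For reference, a short sketch of checking and undoing that mask (this is exactly the symlink described above):

    ls -l /etc/systemd/system/ceph-disk@.service    # points to /dev/null when masked
    systemctl unmask ceph-disk@.service             # or simply remove the symlink
    systemctl daemon-reload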
Re: [ceph-users] migrate ceph-disk to ceph-volume fails with dmcrypt
Hi,

that's bad news. Roughly 5000 OSDs are affected by this issue. It's not really a solution to redeploy these OSDs.

Is it possible to migrate the local keys to the monitors? I see that the OSDs with the "lockbox feature" have only one key for the data and journal partition, while the older OSDs have individual keys for journal and data. Might this be a problem?

And another question: is it a good idea to mix ceph-disk and ceph-volume managed OSDs on one host? Then I could migrate only the newer OSDs to ceph-volume and deploy new ones (after disk replacements) with ceph-volume until hopefully there is a solution.

Regards
Manuel

On Tue, 22 Jan 2019 07:44:02 -0500 Alfredo Deza wrote:

> This is one case we didn't anticipate :/ We supported the wonky
> lockbox setup and thought we wouldn't need to go further back,
> although we did add support for both plain and luks keys.
>
> Looking through the code, it is very tightly coupled to
> storing/retrieving keys from the monitors, and I don't know what
> workarounds might be possible here other than throwing away the OSD
> and deploying a new one (I take it this is not an option for you at
> all)
[ceph-users] migrate ceph-disk to ceph-volume fails with dmcrypt
turned non-zero exit status: 32

ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)

Regards
Manuel
Re: [ceph-users] tunable question
Hi,

We have similar issues. After upgrading from hammer to jewel, the tunable "chooseleaf_stable" was introduced. If we activate it, nearly all data will be moved. The cluster has 2400 OSDs on 40 nodes across two datacenters and is filled with 2.5 PB of data. We tried to enable it, but the backfilling traffic is too high to be handled without impacting other services on the network.

Does someone know whether it is necessary to enable this tunable? And could it be a problem in the future if we want to upgrade to newer versions without it enabled?

Regards,
Manuel Lausch

Am Thu, 28 Sep 2017 10:29:58 +0200 schrieb Dan van der Ster:

> Hi,
>
> How big is your cluster and what is your use case?
>
> For us, we'll likely never enable the recent tunables that need to
> remap *all* PGs -- it would simply be too disruptive for marginal
> benefit.
>
> Cheers, Dan
>
> On Thu, Sep 28, 2017 at 9:21 AM, mj wrote:
> > Hi,
> >
> > We have completed the upgrade to jewel, and we set tunables to
> > hammer. Cluster again HEALTH_OK. :-)
> >
> > But now, we would like to proceed in the direction of luminous and
> > bluestore OSDs, and we would like to ask for some feedback first.
> >
> > From the jewel ceph docs on tunables: "Changing tunable to
> > "optimal" on an existing cluster will result in a very large amount
> > of data movement as almost every PG mapping is likely to change."
> >
> > Given the above, and the fact that we would like to proceed to
> > luminous/bluestore in the not too far away future: What is cleverer:
> >
> > 1 - keep the cluster at tunable hammer now, upgrade to luminous in
> > a little while, change OSDs to bluestore, and then set tunables to
> > optimal
> >
> > or
> >
> > 2 - set tunable to optimal now, take the impact of "almost all PG
> > remapping", and when that is finished, upgrade to luminous,
> > bluestore etc.
> >
> > Which route is the preferred one?
> >
> > Or is there a third (or fourth?) option..? :-)
> >
> > MJ
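For context, a sketch of how the tunable can be inspected and, if one decides to take the data movement, enabled on its own via the crushmap (standard upstream commands; whether to run them is exactly the question in this thread):

    ceph osd crush show-tunables          # shows chooseleaf_stable among others

    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt
    # edit crush.txt: set "tunable chooseleaf_stable 1"
    crushtool -c crush.txt -o crush.new
    ceph osd setcrushmap -i crush.new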
Re: [ceph-users] Very slow start of osds after reboot
Hi,

I have the same issue with Ceph Jewel (10.2.9), RedHat 7 and dmcrypt.

Is there any fix or at least a workaround available?

Regards,
Manuel

Am Thu, 31 Aug 2017 16:24:10 +0200 schrieb Piotr Dzionek:

> Hi,
>
> For the last 3 weeks I have been running the latest LTS Luminous Ceph
> release on CentOS7. It started with the 4th RC and now I have the stable
> release. The cluster runs fine, however I noticed that if I do a reboot
> of one of the nodes, it takes a really long time for the cluster to be in
> ok status. Osds are starting up, but not as soon as the server is up.
> They come up one by one over a period of 5 minutes. I checked the
> logs and all osds have the following errors.
>
>
> As you can see the xfs volume (the part with meta-data) is not mounted
> yet. My question here: what mounts it and why does it take so long?
> Maybe there is a setting that randomizes the start up process of osds
> running on the same node?
>
> Kind regards,
> Piotr Dzionek
Re: [ceph-users] ceph-osd restartd via systemd in case of disk error
Am Tue, 19 Sep 2017 08:24:48 + schrieb Adrian Saul:

> > I understand what you mean and it's indeed dangerous, but see:
> > https://github.com/ceph/ceph/blob/master/systemd/ceph-osd%40.service
> >
> > Looking at the systemd docs it's difficult though:
> > https://www.freedesktop.org/software/systemd/man/systemd.service.html
> >
> > If the OSD crashes due to another bug you do want it to restart.
> >
> > But for systemd it's not possible to see if the crash was due to a
> > disk I/O error or a bug in the OSD itself or maybe the OOM-killer
> > or something.
>
> Perhaps using something like RestartPreventExitStatus and defining a
> specific exit code for the OSD to exit on when it is exiting due to
> an IO error.

Another idea: the OSD daemon keeps running in a defined error state and only stops the listeners to other OSDs and the clients.
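To illustrate Adrian's suggestion, a hypothetical systemd drop-in; the exit status 5 is invented for the example, since ceph-osd would first have to exit with a dedicated code on I/O errors:

    # /etc/systemd/system/ceph-osd@.service.d/io-error.conf  (hypothetical)
    [Service]
    # never restart the OSD when it exits with the assumed I/O-error status
    RestartPreventExitStatus=5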
[ceph-users] ceph-osd restartd via systemd in case of disk error
Hi,

I see an issue with systemd's restart behaviour and disk I/O errors. If a disk fails with I/O errors, ceph-osd stops running. Systemd detects this and starts the daemon again.

In our cluster I have seen loops of OSD crashes caused by a disk failure and restarts triggered by systemd, every time with peering impact and timeouts to our application, until systemd gave up.

Obviously ceph needs the restart feature (at least with dmcrypt) to avoid race conditions in the startup process. But in the case of disk-related failures this is counterproductive.

What do you think about this? Is this a bug which should be fixed?

We use ceph jewel (10.2.9).

Regards
Manuel
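For reference, the restart behaviour can also be rate-limited with a drop-in instead of editing the packaged unit; the values below are purely illustrative, not recommendations:

    # /etc/systemd/system/ceph-osd@.service.d/restart-limit.conf
    [Service]
    Restart=on-failure
    StartLimitInterval=30min
    StartLimitBurst=3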
Re: [ceph-users] Blocked requests problem
Hi,

Sometimes we have the same issue on our 10.2.9 cluster (24 nodes á 60 OSDs). I think there is some race condition or something like that which results in this state. The blocked requests start exactly at the time the PG begins to scrub.

You can try the following; the OSD will automatically recover and the blocked requests will disappear:

  ceph osd down 31

In my opinion this is a bug, but I have not investigated it so far. Maybe some developer can say something about this issue.

Regards,
Manuel

Am Tue, 22 Aug 2017 16:20:14 +0300 schrieb Ramazan Terzi:

> Hello,
>
> I have a Ceph Cluster with the specifications below:
> 3 x Monitor node
> 6 x Storage Node (6 disks per Storage Node, 6TB SATA disks, all disks
> have SSD journals). Distributed public and private networks. All NICs
> are 10Gbit/s.
> osd pool default size = 3
> osd pool default min size = 2
>
> Ceph version is Jewel 10.2.6.
>
> My cluster is active and a lot of virtual machines are running on it
> (Linux and Windows VMs, database clusters, web servers etc).
>
> During normal use, the cluster slowly went into a state of blocked
> requests. Blocked requests are periodically incrementing. All OSDs seem
> healthy. Benchmark, iowait, network tests, all of them succeed.
>
> Yesterday, 08:00:
> $ ceph health detail
> HEALTH_WARN 3 requests are blocked > 32 sec; 3 osds have slow requests
> 1 ops are blocked > 134218 sec on osd.31
> 1 ops are blocked > 134218 sec on osd.3
> 1 ops are blocked > 8388.61 sec on osd.29
> 3 osds have slow requests
>
> Today, 16:05:
> $ ceph health detail
> HEALTH_WARN 32 requests are blocked > 32 sec; 3 osds have slow requests
> 1 ops are blocked > 134218 sec on osd.31
> 1 ops are blocked > 134218 sec on osd.3
> 16 ops are blocked > 134218 sec on osd.29
> 11 ops are blocked > 67108.9 sec on osd.29
> 2 ops are blocked > 16777.2 sec on osd.29
> 1 ops are blocked > 8388.61 sec on osd.29
> 3 osds have slow requests
>
> $ ceph pg dump | grep scrub
> dumped all in format plain
> pg_stat  objects  mip  degr  misp  unf  bytes  log  disklog  state  state_stamp  v  reported  up  up_primary  acting  acting_primary  last_scrub  scrub_stamp  last_deep_scrub  deep_scrub_stamp
> 20.1e  25183  0  0  0  0  98332537930  3066  3066  active+clean+scrubbing  2017-08-21 04:55:13.354379  6930'23908781  6930:20905696  [29,31,3]  29  [29,31,3]  29  6712'22950171  2017-08-20 04:46:59.208792  6712'22950171  2017-08-20 04:46:59.208792
>
> The active scrub does not finish (about 24 hours). I did not restart any
> OSD meanwhile. I'm thinking of setting the noscrub, nodeep-scrub, norebalance,
> nobackfill and norecover flags and restarting OSDs 3, 29 and 31. Will this
> solve my problem? Or does anyone have a suggestion about this problem?
>
> Thanks,
> Ramazan
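Condensed, the workaround described above (the OSD id comes from the thread and is a placeholder):

    # find PGs stuck in scrubbing and their acting primary
    ceph pg dump | grep scrub

    # mark the primary down; it recovers on its own and the stuck scrub is cleared
    ceph osd down 31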
[ceph-users] purpose of ceph-mgr daemon
Hi,

we decided to test the upcoming ceph release (luminous) a bit. It seems that I need to install this ceph-mgr daemon as well, but I don't understand exactly why I need this service and what I can do with it. The ceph cluster is working well without any manager daemon installed; however, in the ceph status output there is a health error ("no active mgr").

My questions:
Can we run the cluster without this additional daemon? If yes, is it possible to suppress this health error?
What exactly is the purpose of this service? The documentation contains only sparse information about it.

Regards
Manuel
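For completeness, a minimal sketch of bringing up a mgr daemon by hand on luminous (the id "x" is a placeholder; the caps follow the upstream documentation as far as I recall it):

    mkdir -p /var/lib/ceph/mgr/ceph-x
    ceph auth get-or-create mgr.x mon 'allow profile mgr' osd 'allow *' mds 'allow *' \
        -o /var/lib/ceph/mgr/ceph-x/keyring
    ceph-mgr -i x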
[ceph-users] releasedate for 10.2.8?
Hi,

is there a release date for the next Jewel release (10.2.8)? I have been waiting for it for a few weeks because it includes some fixes related to snapshot deletion and snap trim sleep.

Thanks
Manuel
[ceph-users] memory usage ceph jewel OSDs
Hello,

in the last days I have been trying to figure out why my OSDs need a huge amount of RAM (1.2 - 4 GB). With this, my system memory is at its limit. At the beginning I thought it was because of the huge amount of backfilling (some disks died). But now, since a few days, everything is fine again and yet the memory stays at its level. Restarting the OSDs did not change this behaviour.

I am running Ceph Jewel (10.2.6) on RedHat 7. The cluster has 8 hosts with 36 4TB OSDs each and 4 hosts with 15 4TB OSDs.

I tried to profile the used memory as documented here:
http://docs.ceph.com/docs/jewel/rados/troubleshooting/memory-profiling/
But the output of these commands didn't help me, and I am confused about the used memory.

From "ceph tell osd.98 heap dump" I get the following output:

# ceph tell osd.98 heap dump
osd.98 dumping heap profile now.
MALLOC:     1290458456 ( 1230.7 MiB) Bytes in use by application
MALLOC: +            0 (    0.0 MiB) Bytes in page heap freelist
MALLOC: +     63583000 (   60.6 MiB) Bytes in central cache freelist
MALLOC: +      5896704 (    5.6 MiB) Bytes in transfer cache freelist
MALLOC: +    102784400 (   98.0 MiB) Bytes in thread cache freelists
MALLOC: +     11350176 (   10.8 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =   1474072736 ( 1405.8 MiB) Actual memory used (physical + swap)
MALLOC: +    129064960 (  123.1 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =   1603137696 ( 1528.9 MiB) Virtual address space used
MALLOC:
MALLOC:          88305 Spans in use
MALLOC:           1627 Thread heaps in use
MALLOC:           8192 Tcmalloc page size
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.

I would say the application needs 1230.7 MB of RAM. But if I analyse the corresponding dump with pprof, only a few megabytes are mentioned. Following are the first few lines of pprof:

# pprof --text /usr/bin/ceph-osd osd.98.profile.0002.heap
Using local file /usr/bin/ceph-osd.
Using local file osd.98.profile.0002.heap.
Total: 8.9 MB
 3.3  36.7%  36.7%   3.3  36.7%  ceph::log::Log::create_entry
 2.3  25.5%  62.2%   2.3  25.5%  ceph::buffer::list::append@a1f280
 1.1  12.1%  74.3%   2.0  23.1%  SimpleMessenger::add_accept_pipe
 0.9  10.4%  84.7%   0.9  10.5%  Pipe::Pipe
 0.2   2.8%  87.5%   0.2   2.8%  std::map::operator[]
 0.2   2.2%  89.7%   0.2   2.2%  std::vector::_M_default_append
 0.2   1.8%  91.5%   0.2   1.8%  std::_Rb_tree::_M_copy
 0.1   0.8%  92.4%   0.1   0.8%  ceph::buffer::create_aligned
 0.1   0.8%  93.2%   0.1   0.8%  std::string::_Rep::_S_create

Is this normal? Am I doing something wrong? Is there a bug? Why do my OSDs need so much RAM?

Thanks for your help.

Regards,
Manuel
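For reference, the tcmalloc heap commands behind the output above, including the release call the dump itself suggests (standard "ceph tell" heap subcommands):

    ceph tell osd.98 heap start_profiler
    ceph tell osd.98 heap dump
    ceph tell osd.98 heap stats
    ceph tell osd.98 heap release      # hand freelist memory back to the OS
    ceph tell osd.98 heap stop_profiler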
Re: [ceph-users] osd down detection broken in jewel?
Yes. This parameter is used in the condition described here, and it works:
http://docs.ceph.com/docs/jewel/rados/configuration/mon-osd-interaction/#osds-report-their-status
I think the default timeout of 900s is quite large.

The documentation also describes another mechanism which checks the health of OSDs and reports them down:
http://docs.ceph.com/docs/jewel/rados/configuration/mon-osd-interaction/#osds-report-down-osds
As far as I can see in the source code, this documentation is not valid anymore!

I found this commit:
https://github.com/ceph/ceph/commit/bcb8f362ec6ac47c4908118e7860dec7971d001f#diff-0a5db46a44ae9900e226289a810f10e8

"mon_osd_min_down_reporters" is now the threshold for how many "mon_osd_reporter_subtree_level" entities have to report a down OSD. In Hammer this was how many other OSDs had to report it. In Hammer there was also the parameter "mon_osd_min_down_reports", which set how often another OSD had to report an OSD. In Jewel this parameter doesn't exist anymore.

With this "knowledge" I adjusted my configuration and will now test it.

BTW: while reading the source code I may have found another bug. Can you confirm this? In the function "OSDMonitor::check_failure" in src/mon/OSDMonitor.cc, the code which counts the "reporters_by_subtree" sits inside the "if (g_conf->mon_osd_adjust_heartbeat_grace) {" block. So if I disable adjust_heartbeat_grace, the reporters_by_subtree functionality will not work at all.

Regards,
Manuel

Am 30.11.2016 um 15:24 schrieb John Petrini:

> It's right there in your config.
>
> mon osd report timeout = 900
>
> See: http://docs.ceph.com/docs/jewel/rados/configuration/mon-osd-interaction/
>
> John Petrini
>
> On Wed, Nov 30, 2016 at 6:39 AM, Manuel Lausch wrote:
>
> > Hi,
> >
> > In a test with ceph jewel we tested how long the cluster needs to
> > detect and mark down OSDs after they are killed (with kill -9). The
> > result -> 900 seconds. In Hammer this took about 20 - 30 seconds.
> >
> > In the logfile from the leader monitor there are a lot of messages like
> > 2016-11-30 11:32:20.966567 7f158f5ab700 0 log_channel(cluster) log [DBG] :
> > osd.7 10.78.43.141:8120/106673 reported failed by osd.272 10.78.43.145:8106/117053
> >
> > A deeper look at this: a lot of OSDs reported this exactly one time.
> > In Hammer the OSDs reported a down OSD a few more times. Finally there
> > is the following and the osd is marked down.
> > 2016-11-30 11:36:22.633253 7f158fdac700 0 log_channel(cluster) log [INF] :
> > osd.7 marked down after no pg stats for 900.982893seconds
> >
> > In my ceph.conf I have the following lines in the global section:
> > mon osd min down reporters = 10
> > mon osd min down reports = 3
> > mon osd report timeout = 900
> >
> > It seems the parameter "mon osd min down reports" was removed in jewel
> > but the documentation is not updated ->
> > http://docs.ceph.com/docs/jewel/rados/configuration/mon-osd-interaction/
> >
> > Can someone tell me how ceph jewel detects down OSDs and marks them
> > down in an appropriate time?
> >
> > The cluster:
> > ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)
> > 24 hosts á 60 OSDs -> 1440 OSDs
> > 2 pools with replication factor 4
> > 65536 PGs
> > 5 Mons
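Summarising the jewel-era settings discussed above as a ceph.conf sketch (the values shown are the upstream defaults as far as I recall them, not recommendations):

    [global]
        # how many distinct failure-domain subtrees must report an OSD before it is marked down
        mon osd min down reporters = 2
        mon osd reporter subtree level = host
        # note: the reporters_by_subtree check sits inside this code path
        mon osd adjust heartbeat grace = true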
[ceph-users] osd down detection broken in jewel?
Hi,

In a test with ceph jewel we tested how long the cluster needs to detect and mark down OSDs after they are killed (with kill -9). The result -> 900 seconds. In Hammer this took about 20 - 30 seconds.

In the logfile from the leader monitor there are a lot of messages like
2016-11-30 11:32:20.966567 7f158f5ab700 0 log_channel(cluster) log [DBG] : osd.7 10.78.43.141:8120/106673 reported failed by osd.272 10.78.43.145:8106/117053

A deeper look at this: a lot of OSDs reported this exactly one time. In Hammer the OSDs reported a down OSD a few more times. Finally there is the following and the osd is marked down.
2016-11-30 11:36:22.633253 7f158fdac700 0 log_channel(cluster) log [INF] : osd.7 marked down after no pg stats for 900.982893seconds

In my ceph.conf I have the following lines in the global section:
mon osd min down reporters = 10
mon osd min down reports = 3
mon osd report timeout = 900

It seems the parameter "mon osd min down reports" was removed in jewel but the documentation is not updated ->
http://docs.ceph.com/docs/jewel/rados/configuration/mon-osd-interaction/

Can someone tell me how ceph jewel detects down OSDs and marks them down in an appropriate time?

The cluster:
ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)
24 hosts á 60 OSDs -> 1440 OSDs
2 pools with replication factor 4
65536 PGs
5 Mons
Re: [ceph-users] resolve split brain situation in ceph cluster
Hi Gregory,

each datacenter has its own IP subnet, which is routed. We simultaneously created iptables rules on each host which drop all packets to and from the other datacenter.

After this our application wrote to DC A; there are 3 of the 5 monitor nodes there. Then we modified the monmap in B (removed all mon nodes from DC A, so there are now 2 of 2 mons active). The monmap in A is untouched. The cluster part in B was now active as well, and the applications in B could write to it. So we definitely wrote data into both cluster parts.

After this we shut down the mon nodes in A. The part in A was now unavailable. Some hours later we removed the iptables rules and tried to rejoin the two parts. We rejoined the three mon nodes from A as new nodes; the old mon data from these nodes was destroyed.

Do you need further information?

Regards,
Manuel

Am 14.10.2016 um 17:58 schrieb Gregory Farnum:

> On Fri, Oct 14, 2016 at 7:27 AM, Manuel Lausch wrote:
> > Hi,
> >
> > I need some help to fix a broken cluster. I think we broke the cluster,
> > but I want to know your opinion and if you see a possibility to recover it.
> > Let me explain what happened.
> >
> > We have a cluster (version 0.94.9) in two datacenters (A and B), with 12
> > nodes á 60 OSDs in each. In A we have 3 monitor nodes and in B 2. The
> > crush rule and replication factor force two replicas in each datacenter.
> > We write objects via librados into the cluster. The objects are immutable,
> > so they are either present or absent.
> >
> > In this cluster we tested what happens if datacenter A fails and we need
> > to bring up the cluster in B by creating a monitor quorum in B. We did this
> > by cutting the network connection between the two datacenters. The OSDs
> > from DC B went down as expected. Then we removed the mon nodes from the
> > monmap in B (by extracting it offline and editing it). Our clients wrote
> > data into both independent cluster parts before we stopped the mons in A.
> > (YES, I know. This is a really bad thing.)
>
> This story line seems to be missing some points. How did you cut off the
> network connection? What leads you to believe the OSDs accepted writes on
> both sides of the split? Did you edit the monmap in both data centers, or
> just DC A (that you wanted to remain alive)? What monitor counts do you
> have in each DC?
> -Greg
>
> > Now we try to join the two sides again. But so far without success. Only
> > the OSDs in B are running. The OSDs in A started but stay down. In the mon
> > log we see a lot of "...(leader).pg v3513957 ignoring stats from non-active
> > osd" alerts.
> >
> > We see that the current osdmap epoch in the running cluster is "28873".
> > On the OSDs in A the epoch is "29003". We assume that this is the reason
> > why the OSDs won't join.
> >
> > BTW: This is only a test cluster, so no important data is harmed.
> >
> > Regards
> > Manuel
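For reference, the offline monmap surgery referred to in this thread is normally done along these lines (mon ids and paths are placeholders; this only documents the mechanism, not a recommendation to split a cluster):

    # with the monitor stopped:
    ceph-mon -i <mon-id> --extract-monmap /tmp/monmap
    monmaptool --print /tmp/monmap
    monmaptool --rm <mon-in-other-dc> /tmp/monmap
    ceph-mon -i <mon-id> --inject-monmap /tmp/monmap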
[ceph-users] resolve split brain situation in ceph cluster
Hi,

I need some help to fix a broken cluster. I think we broke the cluster, but I want to know your opinion and whether you see a possibility to recover it. Let me explain what happened.

We have a cluster (version 0.94.9) in two datacenters (A and B), with 12 nodes á 60 OSDs in each. In A we have 3 monitor nodes and in B 2. The crush rule and replication factor force two replicas in each datacenter. We write objects via librados into the cluster. The objects are immutable, so they are either present or absent.

In this cluster we tested what happens if datacenter A fails and we need to bring up the cluster in B by creating a monitor quorum in B. We did this by cutting the network connection between the two datacenters. The OSDs from DC B went down as expected. Then we removed the mon nodes from the monmap in B (by extracting it offline and editing it). Our clients wrote data into both independent cluster parts before we stopped the mons in A. (YES, I know. This is a really bad thing.)

Now we try to join the two sides again. But so far without success. Only the OSDs in B are running. The OSDs in A started but stay down. In the mon log we see a lot of "...(leader).pg v3513957 ignoring stats from non-active osd" alerts.

We see that the current osdmap epoch in the running cluster is "28873". On the OSDs in A the epoch is "29003". We assume that this is the reason why the OSDs won't join.

BTW: This is only a test cluster, so no important data is harmed.

Regards
Manuel
Re: [ceph-users] Try to install ceph hammer on CentOS7
Hi,

Thanks for your help. I found the problem: via puppet I had configured a versionlock, and there I had a wrong version epoch configured.

Regards,
Manuel

Am 23.07.2016 um 05:11 schrieb Brad Hubbard:

> On Sat, Jul 23, 2016 at 1:41 AM, Ruben Kerkhof wrote:
> > Please keep the mailing list on the CC.
> >
> > On Fri, Jul 22, 2016 at 3:40 PM, Manuel Lausch wrote:
> > > oh. This was a copy&paste failure. Of course I checked my config again.
> > > Some other variations of the configuration didn't help either.
> > >
> > > Finally I put the ceph-0.94.7-0.el7.x86_64.rpm into a directory and
> > > created the necessary repository index files with createrepo. Also with
> > > this as a repository the ceph package is not visible. Other packages in
> > > the repository work fine. If I install the package with
> > > "yum install ~/ceph-0.94.7-0.el7.x86_64.rpm" the installation, including
> > > the dependencies, is successful.
> > >
> > > My knowledge of rpm and yum is not as big as it should be, so I don't
> > > know how to debug further.
> >
> > What does yum repolist show?
>
> This is good advice. I'd also advise running "yum clean all" before
> proceeding once you have confirmed everything is configured correctly.
>
> HTH,
> Brad
>
> > It looks like the ceph-noarch repo is ok, the ceph repo isn't.
> >
> > Regards,
> > Ruben
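For anyone hitting the same thing, a quick way to spot and clear such a lock (yum versionlock plugin commands; adjust the package glob as needed):

    yum versionlock list
    yum versionlock delete 'ceph*'
    yum clean all
    yum install ceph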
[ceph-users] Try to install ceph hammer on CentOS7
Hi,

I try to install ceph hammer on CentOS7, but something with the RPM repository seems to be wrong.

In my yum.repos.d/ceph.repo file I have the following configuration:

[ceph]
name=Ceph packages for $basearch
baseurl=baseurl=http://download.ceph.com/rpm-hammer/el7/$basearch
enabled=1
priority=2
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc

[ceph-noarch]
name=Ceph noarch packages
baseurl=http://download.ceph.com/rpm-hammer/el7/noarch
enabled=1
priority=2
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc

Now I can find some packages out of this repository with "yum search ceph", but the "ceph" package itself is missing. Is there something wrong with my configuration, or is there some issue with the repository itself?

This is the output from yum search ceph:

# yum search ceph
Loaded plugins: fastestmirror, priorities, versionlock
Loading mirror speeds from cached hostfile
112 packages excluded due to repository priority protections
== N/S matched: ceph ===
centos-release-ceph-hammer.noarch : Ceph Hammer packages from the CentOS Storage SIG repository
ceph-common.x86_64 : Ceph Common
ceph-dash.noarch : ceph dashboard
ceph-debuginfo.x86_64 : Debug information for package ceph
ceph-deploy.noarch : Admin and deploy tool for Ceph
ceph-devel-compat.x86_64 : Compatibility package for Ceph headers
ceph-fuse.x86_64 : Ceph fuse-based client
ceph-libs-compat.x86_64 : Meta package to include ceph libraries
ceph-test.x86_64 : Ceph benchmarks and test tools
cephfs-java.x86_64 : Java libraries for the Ceph File System
collectd-ceph.x86_64 : Ceph plugin for collectd
libcephfs1.x86_64 : Ceph distributed file system client library
libcephfs1-devel.x86_64 : Ceph distributed file system headers
libcephfs_jni1.x86_64 : Java Native Interface library for CephFS Java bindings
libcephfs_jni1-devel.x86_64 : Development files for CephFS Java Native Interface library
python-ceph-compat.x86_64 : Compatibility package for Cephs python libraries
python-cephfs.x86_64 : Python libraries for Ceph distributed file system
ceph-radosgw.x86_64 : Rados REST gateway
rbd-fuse.x86_64 : Ceph fuse-based client
Name and summary matches only, use "search all" for everything.

Regards,
Manuel
[ceph-users] Question about how to start ceph OSDs with systemd
hi,

In the last days I have been playing around with ceph jewel on Debian Jessie and CentOS 7. Now I have a question about systemd on these systems.

I installed ceph jewel (ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)) on Debian Jessie and prepared some OSDs. While playing around I decided to reinstall my operating system (of course without deleting the OSD devices). After reinstalling ceph and putting the old ceph.conf back I thought the previously prepared OSDs would simply start and everything would be fine. With Debian Wheezy and ceph firefly this worked well, but with the new versions and systemd it doesn't work at all.

What do I have to do to get the OSDs running again? The following command didn't work and I didn't get any output from it:

systemctl start ceph-osd.target

And this is the output from systemctl status ceph-osd.target:

● ceph-osd.target - ceph target allowing to start/stop all ceph-osd@.service instances at once
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd.target; enabled; vendor preset: enabled)
   Active: active since Fri 2016-07-08 17:19:29 CEST; 36min ago

Jul 08 17:19:29 cs-dellbrick01.server.lan systemd[1]: Reached target ceph target allowing to start/stop all ceph-osd@.service instances at once.
Jul 08 17:19:29 cs-dellbrick01.server.lan systemd[1]: Starting ceph target allowing to start/stop all ceph-osd@.service instances at once.
Jul 08 17:31:15 cs-dellbrick01.server.lan systemd[1]: Reached target ceph target allowing to start/stop all ceph-osd@.service instances at once.

Thanks,
Manuel
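What usually brings already-prepared jewel OSDs back after such a reinstall is re-triggering ceph-disk activation rather than the target unit; a sketch with placeholder device and OSD names:

    ceph-disk activate-all          # or per partition:
    ceph-disk activate /dev/sdb1
    systemctl start ceph-osd@0      # the per-OSD unit, once activation has created it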
[ceph-users] Need help with synchronizing ceph mons
Hi,

I have some problems with the ceph monitor nodes in my cluster. I had 5 mons in the cluster. On all 5 nodes the leveldb store grew to about 80 - 90 GB in size. To get rid of this I triggered a compaction on one node with the following command:

ceph tell mon.d compact

The monitor compacted its data to about 5 GB. After this, the mon tried to synchronize its data with the other mons, and here my problems started. After a short time (20 - 30 sec) of streaming data from another mon node, the stream breaks and the sender then obviously reads all the other data in its store at maximum speed. At this point the cluster loses its leader and tries to elect a new one. The leader election only works once the reading of the data is done.

I tried to remove the mon from the cluster completely and rejoin it as a new one, but while syncing I experience the same issue. So currently the cluster has only 4 mons. During further investigation and testing I lost another mon which wanted to sync data after starting, with the same behaviour.

It seems the node which streams data while syncing is stressed by reading and sending the data. I tried to limit the network bandwidth of the joining node to reduce the load. I also tried to set "ionice -c3" on the process which does all the disk IO while reading. But nothing helped.

Because the cluster is in production I don't want to experiment further without knowing what's going on. Does anyone have any idea what's going on and how I can try to fix this?

I am using ceph version 0.67.11 (bc8b67bef6309a32361be76cd11fb56b057ea9d2), with 5 monitor nodes using SSDs for the leveldb store and 24 OSD hosts with 1416 OSDs.

Thank you
Manuel
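For reference, the compaction command used above together with two mon options that come up in this context (a sketch; whether they behave well on 0.67.x should be verified before relying on them):

    ceph tell mon.d compact

    [mon]
        mon compact on start = true
        # smaller sync chunks can ease synchronisation, at the cost of speed
        mon sync max payload size = 65536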