Re: [ceph-users] Need advice with setup planning

2019-09-21 Thread mj
be 10G, since 1G is surely going to be a bottleneck. We are running the above setup. No problems. Only issue is: adding a fourth node will be relatively intrusive. MJ On 9/20/19 8:23 PM, Salsa wrote: Replying inline. -- Salsa

Re: [ceph-users] clock skew

2019-04-28 Thread mj
nc-status" is 0.00 on all hosts. Seems that relying on "chronyc sources" is not always enough to make sure that everything is indeed synced. Thanks for the help! MJ
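
For reference, a minimal sketch of the chrony-side checks that usually go with the Ceph-side status mentioned above (nothing here is taken from the thread itself):

    # offset and stratum of the local chronyd
    chronyc tracking
    # per-source state; '*' marks the currently selected source
    chronyc sources -v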

Re: [ceph-users] clock skew

2019-04-26 Thread mj
if the peer config will actually help in this situation. But time will tell. @John: Thanks for the maxsources suggestion @Bill: thanks for the interesting article, will check it out! MJ On 4/25/19 5:47 PM, Bill Sharer wrote: If you are just synching to the outside pool, the three hosts may end
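
The maxsources suggestion referred to above usually ends up as a single option on the pool line; a minimal sketch (pool name illustrative, not from the thread):

    # cap how many servers chrony will select from this pool line
    pool 2.debian.pool.ntp.org iburst maxsources 3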

[ceph-users] clock skew

2019-04-25 Thread mj
ent clock skew from Ceph's perspective? Because "ceph health detail" in case of HEALTH_OK does not show it. (I want to start monitoring it continuously, to see if I can find some sort of pattern) Thanks! MJ
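
One way to see the skew even while the cluster is HEALTH_OK is the monitors' own time-sync report (available on Jewel and newer); a minimal sketch, interval chosen arbitrarily:

    # print per-MON skew/latency as JSON every 5 minutes
    while true; do
        ceph time-sync-status -f json-pretty
        sleep 300
    done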

Re: [ceph-users] PG inconsistent, "pg repair" not working

2018-09-25 Thread mj
Hi, I was able to solve a similar issue on our cluster using this blog: https://ceph.com/geen-categorie/ceph-manually-repair-object/ It does help if you are running a 3/2 config. Perhaps it helps you as well. MJ On 09/25/2018 02:37 AM, Sergey Malinin wrote: Hello, During normal operation
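
Roughly the procedure the linked blog post walks through, as a hedged sketch assuming a FileStore OSD managed by systemd and a Jewel-or-newer CLI; the PG id, OSD id and paths are placeholders:

    ceph health detail                                  # find the inconsistent PG, e.g. 17.1c1
    rados list-inconsistent-obj 17.1c1 --format=json-pretty   # identify the bad object/shard
    systemctl stop ceph-osd@21                          # stop the OSD holding the bad copy
    ceph-osd -i 21 --flush-journal                      # flush its journal first
    # move the damaged object file out of the PG directory on that OSD, then:
    systemctl start ceph-osd@21
    ceph pg repair 17.1c1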

Re: [ceph-users] [slightly OT] XFS vs. BTRFS vs. others as root/usr/var/tmp filesystems ?

2018-09-24 Thread mj
On 09/24/2018 08:53 AM, Nicolas Huillard wrote: Thanks for your anecdote ;-) Could it be that I stack too many things (XFS in LVM in md-RAID in SSD's FTL)? No, we regularly use the same combination of layers, just without the SSD. mj

Re: [ceph-users] [slightly OT] XFS vs. BTRFS vs. others as root/usr/var/tmp filesystems ?

2018-09-24 Thread mj
zfs, like in adding disks to raids to expand space for example. mj

Re: [ceph-users] [slightly OT] XFS vs. BTRFS vs. others as root/usr/var/tmp filesystems ?

2018-09-23 Thread mj
) always 'something' happened. (same for the few times we tried reiserfs, btw) So, while my story may be very anecdotal (and you will probably find many others here claiming the opposite) our own conclusion is very clear: we love xfs, and do not like btrfs very much. MJ On 09/22/2018 10:58 AM

Re: [ceph-users] Proxmox/ceph upgrade and addition of a new node/OSDs

2018-09-21 Thread mj
Hi Hervé! Thanks for the detailed summary, much appreciated! Best, MJ On 09/21/2018 09:03 AM, Hervé Ballans wrote: Hi MJ (and all), So we upgraded our Proxmox/Ceph cluster, and if we have to summarize the operation in a few words: overall, everything went well :) The most critical

Re: [ceph-users] Proxmox/ceph upgrade and addition of a new node/OSDs

2018-09-13 Thread mj
Hi Hervé, No answer from me, but just to say that I have exactly the same upgrade path ahead of me. :-) Please report here any tips, tricks, or things you encountered doing the upgrades. It could potentially save us a lot of time. :-) Thanks! MJ On 09/13/2018 05:23 PM, Hervé Ballans wrote

Re: [ceph-users] HEALTH_ERR vs HEALTH_WARN

2018-08-23 Thread mj
I assumed that a simple "ceph pg repair 2.1a9" was enough to solve this without introducing corruption into our 3/2 cluster. MJ On 08/23/2018 12:28 PM, Mark Schouten wrote: Gregory's answer worries us. We thought that with a 3/2 pool, and one PG corrupted, the assumption would be:
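
The thread's concern is whether "pg repair" picks the right copy; a minimal sketch of inspecting the inconsistency before repairing, using the PG id quoted in the message (requires a Jewel-or-newer CLI):

    # see what exactly is flagged in the PG before touching it
    rados list-inconsistent-obj 2.1a9 --format=json-pretty
    # then repair
    ceph pg repair 2.1a9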

Re: [ceph-users] HEALTH_ERR vs HEALTH_WARN

2018-08-23 Thread mj
sed to see this is on our cluster, as it should be and has been running stable and reliably for over two years. Perhaps just a one-time glitch. Thanks for your replies! MJ On 08/23/2018 01:06 AM, Gregory Farnum wrote: On Wed, Aug 22, 2018 at 2:46 AM John Spray wrote

[ceph-users] HEALTH_ERR vs HEALTH_WARN

2018-08-22 Thread mj
e this is a size 3, min 2 pool... shouldn't this have been taken care of automatically..? ('self-healing' and all that..?) So, I'm having my morning coffee finally, wondering what happened... :-) Best regards to all, have a nice day! MJ

Re: [ceph-users] lacp bonding | working as expected..?

2018-06-21 Thread mj
no way to specify what outgoing port iperf should use, otherwise I could try again using the same ports, to check the pattern. Thanks again! MJ

Re: [ceph-users] Adding additional disks to the production cluster without performance impacts on the existing

2018-06-08 Thread mj
and then started ramping up each OSD. I created a script to do it dynamically, which will check CPU of the new host with OSDs that Would you mind sharing this script..? MJ
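
The script being asked for is not shown in the archive; a hedged sketch of the general idea only: raise a new OSD's CRUSH weight in small steps and pause until the cluster settles. OSD id, target weight and step size are illustrative, and instead of checking the host's CPU (as the original reportedly did) this simply waits for HEALTH_OK:

    #!/bin/bash
    OSD=42
    TARGET=3.64
    STEP=0.2
    CURRENT=0
    while (( $(echo "$CURRENT < $TARGET" | bc -l) )); do
        CURRENT=$(echo "$CURRENT + $STEP" | bc -l)
        # do not overshoot the target weight
        if (( $(echo "$CURRENT > $TARGET" | bc -l) )); then CURRENT=$TARGET; fi
        ceph osd crush reweight osd.$OSD "$CURRENT"
        # wait until backfill/recovery has finished before the next step
        until ceph health | grep -q HEALTH_OK; do sleep 60; done
    done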

Re: [ceph-users] Adding cluster network to running cluster

2018-06-07 Thread mj
On 06/07/2018 01:45 PM, Wido den Hollander wrote: Removing cluster network is enough. After the restart the OSDs will not publish a cluster network in the OSDMap anymore. You can keep the public network in ceph.conf and can even remove that after you removed the 10.10.x.x addresses from the
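
A hedged sketch of the ceph.conf change being described, with example subnets only; the cluster-network line is dropped and the OSDs are restarted one at a time afterwards:

    [global]
        public network = 192.168.1.0/24
        # cluster network = 10.10.0.0/24   # removed; after a restart the OSDs
                                           # no longer publish it in the OSDMap

    # then, per OSD:
    systemctl restart ceph-osd@3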

Re: [ceph-users] tunable question

2017-10-05 Thread mj
nt of data: 32730 GB used, 56650 GB / 89380 GB avail We set noscrub and nodeep-scrub during the rebalance, and our VMs experienced basically no impact. MJ On 10/03/2017 05:37 PM, lists wrote: Thanks Jake, for your extensive reply. :-) MJ On 3-10-2017 15:21, Jake Young wrote: On Tue, Oc
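
The scrub flags mentioned above are cluster-wide toggles; a minimal sketch of setting and clearing them around a rebalance:

    # before the rebalance
    ceph osd set noscrub
    ceph osd set nodeep-scrub
    # ... rebalance ...
    # afterwards
    ceph osd unset noscrub
    ceph osd unset nodeep-scrub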

Re: [ceph-users] tunable question

2017-09-28 Thread mj
"hammer". This resulted in approx 24 hours of rebuilding, but actually without significant impact on the hosted VMs. Is it safe to assume that setting it to "optimal" would have a similar impact, or are the implications bigger? MJ On 09/28/2017 10:29 AM, Dan van der Ster wr
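
The commands under discussion, for reference; how much data moves depends entirely on the cluster, so the impact comparison asked about cannot be read off the commands themselves:

    # what was done in the thread
    ceph osd crush tunables hammer
    # the further step being asked about
    ceph osd crush tunables optimal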

[ceph-users] tunable question

2017-09-28 Thread mj
Which route is the preferred one? Or is there a third (or fourth?) option..? :-) MJ

Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

2017-09-24 Thread mj
Hi, I forwarded your announcement to the dovecot mailing list. The following reply was posted there by Timo Sirainen. I'm forwarding it here, as you might not be reading the dovecot mailing list. Wido: First, the Github link: https://github.com/ceph-dovecot/dovecot-ceph-plugin I am

Re: [ceph-users] Restart ceph cluster

2017-05-12 Thread mj
after a reboot. But perhaps I completely misunderstand your question... ;-) MJ

Re: [ceph-users] slow requests and short OSD failures in small cluster

2017-04-20 Thread mj
their impact) Any other tips, do's or don'ts, or things to keep in mind related to snapshots, VM/OSD filesystems, or using fstrim..? (our cluster is also small, hammer, three servers with 8 OSDs each, and journals on ssd, plenty of cpu/ram) Again, thanks for your interesting post. MJ
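
Not from this thread, but one knob often mentioned for the snapshot-removal impact discussed here (on hammer-era clusters) is the OSD snap-trim sleep; a hedged sketch, value illustrative:

    # throttle snapshot trimming (seconds of sleep between trim operations)
    ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.1'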

Re: [ceph-users] slow requests and short OSD failures in small cluster

2017-04-18 Thread mj
...) We are still on hammer, but if the result of upgrading to jewel is actually a massive performance decrease, I might postpone as long as possible... Most of our VMs have a snapshot or two... MJ

Re: [ceph-users] slow requests and short OSD failures in small cluster

2017-04-14 Thread mj
ah right: _during_ the actual removal, you mean. :-) clear now. mj On 04/13/2017 05:50 PM, Lionel Bouton wrote: On 13/04/2017 at 17:47, mj wrote: Hi, On 04/13/2017 04:53 PM, Lionel Bouton wrote: We use rbd snapshots on Firefly (and Hammer now) and I didn't see any measurable impact

Re: [ceph-users] slow requests and short OSD failures in small cluster

2017-04-13 Thread mj
Hi, On 04/13/2017 04:53 PM, Lionel Bouton wrote: We use rbd snapshots on Firefly (and Hammer now) and I didn't see any measurable impact on performance... until we tried to remove them. What exactly do you mean by that? MJ

Re: [ceph-users] clock skew

2017-04-01 Thread mj
On 04/01/2017 04:02 PM, John Petrini wrote: Hello, I'm also curious about the impact of clock drift. We see the same on both of our clusters despite trying various NTP servers including our own local servers. Ultimately we just ended up adjusting our monitoring to be less sensitive to it

Re: [ceph-users] clock skew

2017-04-01 Thread mj
Hi, On 04/01/2017 02:10 PM, Wido den Hollander wrote: You could try the chrony NTP daemon instead of ntpd and make sure all MONs are peers from each other. I understand now what that means. I have set it up according to your suggestion. Curious to see how this works out, thanks! MJ

Re: [ceph-users] clock skew

2017-04-01 Thread mj
good experiences with those ntp servers. So, you're telling me that the MONs should be peers of each other... But if all MONs listen/sync to/with each other, where do I configure the external stratum1 source? MJ
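
To the question of where the external source goes: in chrony both can live in the same config; a hedged sketch with placeholder host names, not taken from the thread:

    # /etc/chrony/chrony.conf on each MON host
    server ntp.example.org iburst      # the external (stratum-1) source
    peer ceph-mon2 iburst              # the other MONs as peers
    peer ceph-mon3 iburst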

Re: [ceph-users] clock skew

2017-04-01 Thread mj
Hi! On 04/01/2017 12:49 PM, Wei Jin wrote: mon_clock_drift_allowed should be used in monitor process, what's the output of `ceph daemon mon.foo config show | grep clock`? how did you change the value? command line or config file? I guess I changed it wrong then... Did it in ceph.conf, like:
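
A hedged sketch of the two ways the value is usually applied: in ceph.conf under [mon] (picked up when the monitors restart), or injected into the running daemons; the mon name "foo" is the one quoted in the thread:

    # ceph.conf
    [mon]
        mon clock drift allowed = 0.2

    # or, without a restart:
    ceph tell mon.* injectargs '--mon_clock_drift_allowed 0.2'
    # verify, as in the thread:
    ceph daemon mon.foo config show | grep clock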

[ceph-users] clock skew

2017-04-01 Thread mj
ame HEALTH_WARN clock skew detected on mon.1; Monitor clock skew detected mon.1 addr 10.10.89.2:6789/0 clock skew 0.113709s > max 0.1s (latency 0.000523111s) Can anyone explain why the running config shows "mon_clock_drift_allowed": "0.2" and the HEALTH_WARN says "

Re: [ceph-users] default pools gone. problem?

2017-03-24 Thread mj
On 03/24/2017 10:13 PM, Bob R wrote: You can operate without the default pools without issue. Thanks!

[ceph-users] default pools gone. problem?

2017-03-24 Thread mj
ault pools a problem? Do I need to recreate them, or can they safely be deleted? I'm on hammer, but intending to upgrade to jewel, and trying to identify potential issues, therefore this question. MJ

Re: [ceph-users] ceph 'tech' question

2017-03-24 Thread mj
benefits. Anyway, thanks for your reply. :-) MJ

[ceph-users] ceph 'tech' question

2017-03-24 Thread mj
over the network. And if this is not the case, then why not? :-) Thanks for any insights or pointers! MJ

Re: [ceph-users] add multiple OSDs to cluster

2017-03-22 Thread mj
Hi Jonathan, Anthony and Steve, Thanks very much for your valuable advice and suggestions! MJ On 03/21/2017 08:53 PM, Jonathan Proulx wrote: If it took 7hr for one drive you have probably already done this (or defaults are for low impact recovery) but before doing anything you want to be sure

[ceph-users] add multiple OSDs to cluster

2017-03-21 Thread mj
will rebuild anyway... and I have the feeling that rebuilding from 4 -> 8 OSDs is not going to be much heavier than rebuilding from 4 -> 5 OSDs. Right? So better add all new OSDs together on a specific server? Or not? :-) MJ
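
Not the only approach, but a common way to add several OSDs at once and get a single rebalance: bring them in with zero CRUSH weight, then raise the weights afterwards. A hedged sketch; ids and weights are illustrative:

    # ceph.conf, before creating the new OSDs
    [osd]
        osd crush initial weight = 0

    # create all new OSDs, then weight them in together
    for id in 4 5 6 7; do
        ceph osd crush reweight osd.$id 3.64
    done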

Re: [ceph-users] suddenly high memory usage for ceph-mon process

2016-11-05 Thread mj
happen, so next time we know where to look first. Thanks both, for your replies, MJ On 11/04/2016 03:26 PM, igor.podo...@ts.fujitsu.com wrote: Maybe you hit this: https://github.com/ceph/ceph/pull/10238 (still waiting for merge). This will occur only if you have a ceph-mds process in your cluster

[ceph-users] suddenly high memory usage for ceph-mon process

2016-11-04 Thread mj
process? MJ

Re: [ceph-users] 10Gbit switch advice for small ceph cluster upgrade

2016-10-27 Thread mj
, with _direct_ 10G cable connections (quasi crosslink) between the three hosts. This is very low-budget, as it gives you 10G speed, without a (relatively) expensive 10G switch. Working fine here, with each host having a double 10G intel nic, plus a regular 1G interface. MJ
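
A hedged sketch of how such a switchless three-node mesh is often addressed: each pair of hosts gets its own small point-to-point subnet over a direct cable. Interface names and subnets are illustrative, not from the message:

    # /etc/network/interfaces fragment on the first host
    auto ens1f0
    iface ens1f0 inet static
        address 10.15.15.1
        netmask 255.255.255.252
        # direct cable to host 2

    auto ens1f1
    iface ens1f1 inet static
        address 10.15.16.1
        netmask 255.255.255.252
        # direct cable to host 3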

Re: [ceph-users] running xfs_fsr on ceph OSDs

2016-10-26 Thread mj
Hi Christian, Thanks for the reply / suggestion! MJ On 10/24/2016 10:02 AM, Christian Balzer wrote: Hello, On Mon, 24 Oct 2016 09:41:37 +0200 mj wrote: Hi, We have been running xfs on our servers for many years, and we are used to run a scheduled xfs_fsr during the weekend. Lately we

[ceph-users] running xfs_fsr on ceph OSDs

2016-10-24 Thread mj
are also mostly running xfs. Both of which (in theory anyway) could be defragmented. Google doesn't tell me a lot, so I'm posing the question here: what is the consensus? Is it worth running xfs_fsr on VMs and OSDs? (or perhaps just one of the two?) MJ
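
The scheduled run described above is typically just a cron entry; a hedged sketch, with the schedule and runtime purely illustrative:

    # /etc/cron.d/xfs_fsr -- run the reorganizer for at most two hours,
    # Saturday night (-t is in seconds)
    30 2 * * 6  root  /usr/sbin/xfs_fsr -t 7200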

Re: [ceph-users] Surviving a ceph cluster outage: the hard way

2016-10-20 Thread mj
Hi, Interesting reading! Any chance you could share some of the lessons (if any) you learned..? I can, for example, imagine your situation would have been much better with a replication factor of three instead of two..? MJ On 10/20/2016 12:09 AM, Kostis Fardelas wrote: Hello cephers

Re: [ceph-users] rbd pool:replica size choose: 2 vs 3

2016-09-23 Thread mj
? (our cluster is HEALTH_OK, enough disk space, etc, etc) MJ

Re: [ceph-users] rados bench output question

2016-09-08 Thread mj
Hi Christian, Thanks a lot for all your information! (especially the bit that Ceph never reads from the journal but writes to the OSD from memory; that was new for me) MJ On 09/07/2016 03:20 AM, Christian Balzer wrote: hello, On Tue, 6 Sep 2016 13:38:45 +0200 lists wrote: Hi Christian, Thanks
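
For reference, the kind of invocation the thread's subject refers to; a minimal sketch with a placeholder pool name:

    # 60-second write benchmark, keeping the objects for a later read test
    rados bench -p testpool 60 write --no-cleanup
    # sequential reads against the objects written above
    rados bench -p testpool 60 seq
    # remove the benchmark objects afterwards
    rados -p testpool cleanup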