[ceph-users] Re: Questions RE: Ceph/CentOS/IBM

2021-03-03 Thread Szabo, Istvan (Agoda)
But are you then using a 4.x kernel with CentOS 7?

Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---

-Original Message-
From: Marc 
Sent: Wednesday, March 3, 2021 11:40 PM
To: Alexander E. Patrakov ; Drew Weaver 

Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: Questions RE: Ceph/CentOS/IBM



> This is wrong. Ceph 15 runs on CentOS 7 just fine, but without the
> dashboard.
>

I also hope that Ceph keeps support for el7 until it is EOL in 2024, so I
have enough time to figure out what OS to choose.
___
ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to 
ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions RE: Ceph/CentOS/IBM

2021-03-03 Thread Matt Wilder
On Wed, Mar 3, 2021 at 9:20 AM Teoman Onay  wrote:

> Just go for CentOS stream it will be at least as stable as CentOS and
> probably even more.
>
> CentOS Stream is just the next minor version of the current RHEL minor
> which means it already contains fixes not yet released for RHEL but
> available for CentOS stream. It is not as if CentOS stream would be a beta.
>
>
This is fundamentally not true.  CentOS Stream uses the "rolling release"
model, which means that it is not versioned (there will be no CentOS Stream
8, 9, 10, etc.).
This is an absolute nightmare to manage if you are trying to run a
large-scale homogeneous fleet of systems, and it makes Stream completely
unsuitable for the task.
Given that this is the ceph-users mailing list, my guess is that most
people here run large numbers of homogeneous machines that make up a Ceph
cluster.  CentOS Stream is a horrible choice of platform for that.

When people talk about a distribution's stability, they are usually not
talking about "does it crash a lot"; they are talking about the frequency
of change.  CentOS Stream is essentially the definition of "unstable".

To answer OP's question, a bunch of choices are outlined here.

I have no personal experience with any of these, but I would probably check
out Rocky Linux first, since it was created by one of the original CentOS
developers.


> On Wed, Mar 3, 2021 at 5:39 PM Radoslav Milanov <
> radoslav.mila...@gmail.com>
> wrote:
>
> > +1
> >
> > On 3.3.2021 at 11:37, Marc wrote:
> > >>> Secondly, are we expecting IBM to "kill off" Ceph as well?
> > >>>
> > >> Stop spreading rumors! really! one can take it further and say kill
> > >> product
> > >> x, y, z until none exist!
> > >>
> > > This natural / logical thinking, the only one to blame here is
> > IBM/redhat. If you have no regards for maintaining the release period as
> it
> > was scheduled, and just cut it short by 7-8 years. More professional
> would
> > have been to announce this for el9, and not change 8 like this.
> > >
> > > How can you trust anything else they are now saying How can you
> know
> > the opensource version of ceph is going to be having restricted features.
> > With such management they will not even inform you. You will be the last
> to
> > know, like all clients. I think it is a valid concern.
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > ___
> > > ceph-users mailing list -- ceph-users@ceph.io
> > > To unsubscribe send an email to ceph-users-le...@ceph.io
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions RE: Ceph/CentOS/IBM

2021-03-03 Thread Dave Hall
I have been told that Rocky Linux is a fork of CentOS that aims to be what
CentOS used to be before all this happened.  I'm not sure how that figures
in here, but it's worth knowing.

-Dave

--
Dave Hall
Binghamton University
kdh...@binghamton.edu



On Wed, Mar 3, 2021 at 12:41 PM Drew Weaver  wrote:

> > As I understand it right now Ceph 14 is the last version that will run
> on CentOS/EL7 but CentOS8 was "killed off".
>
> >This is wrong. Ceph 15 runs on CentOS 7 just fine, but without the
> dashboard.
>
> Oh, what I should have said is that I want it to be fully functional.
>
>
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions RE: Ceph/CentOS/IBM [EXT]

2021-03-03 Thread Clyso GmbH - Ceph Foundation Member

Hello Matthew,

 I agree with you.

We have been running Ceph clusters on Debian, CentOS, SUSE Linux
Enterprise, Red Hat, openSUSE, Garden Linux and whatever else is LSB
compliant for the last 9 years.


I think the trend towards containers further decouples Ceph from Linux
distributions.


Regards, Joachim

___
Clyso GmbH - Ceph Foundation Member
supp...@clyso.com
https://www.clyso.com

Am 03.03.2021 um 17:19 schrieb Matthew Vernon:

Hi,

You can get support for running Ceph on a number of distributions - RH 
support both RHEL and Ubuntu, Canonical support Ubuntu, the smaller 
consultancies seem happy to support anything plausible (e.g. Debian), 
this mailing list will opine regardless of what distro you're running ;-)


Regards,

Matthew



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Metadata for LibRADOS

2021-03-03 Thread Gregory Farnum
Unfortunately, nothing like this exists in RADOS. It can't really --
scaling is inimical to the sort of data collation you seem to be
looking for. If you use librados, you need to maintain all your own
metadata. RGW has done a lot of work to support these features;
depending on what you need you may be able to make do by implementing
your own centralized metadata system in other objects (omap might be
useful); if not you'll need to set up a more extensive system.
-Greg
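
(For a very rough illustration of the omap idea: you could keep a central
index object in a pool and attach key/value metadata to it, e.g. with the
rados CLI; the pool, object and key names below are made up.)

    rados -p mypool create metadata-index
    rados -p mypool setomapval metadata-index obj.12345.owner alice
    rados -p mypool setomapval metadata-index obj.12345.mtime 2021-03-03
    rados -p mypool listomapvals metadata-index

The same omap read/write operations exist in librados itself, so an
application can maintain such an index, but as noted above you get none of
it for free.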

On Tue, Mar 2, 2021 at 1:12 PM Cary FitzHugh  wrote:
>
> Phooey. :)
>
> Do you know of any notification subsystems in libRADOS that might be
> useful?
>
> Will have to think on this...
>
> Thanks
>
> On Tue, Mar 2, 2021 at 4:05 PM Matt Benjamin  wrote:
>
> > Right.  The elastic search integration--or something custom you could
> > base on s3 bucket notifications--would both be working with events
> > generated in RGW.
> >
> > Matt
> >
> > On Tue, Mar 2, 2021 at 3:55 PM Cary FitzHugh 
> > wrote:
> > >
> > > Understood.
> > >
> > > With the RGW architecture comes more load balancing concerns, more
> > moving parts, more tedious (to me) ACLs, less features (append and some
> > other things not supported in S3).  Was hoping for a solution which didn't
> > require us to be hamstrung and only read / write to a pool via the gateway.
> > >
> > > If the RGW Metadata search was able to "source" it's data from the OSDs
> > and sync that way, then I'd be up for setting up a skeleton
> > implementation,  but it sounds like RGW Metadata is only going to record
> > things which are flowing through the gateway.  (Is that correct?)
> > >
> > >
> > >
> > >
> > > On Tue, Mar 2, 2021 at 3:46 PM Matt Benjamin 
> > wrote:
> > >>
> > >> Hi Cary,
> > >>
> > >> As you've said, these are well-developed features of RGW, I think that
> > >> would be the way to go, in the Ceph ecosystem.
> > >>
> > >> Matt
> > >>
> > >> On Tue, Mar 2, 2021 at 3:41 PM Cary FitzHugh 
> > wrote:
> > >> >
> > >> > Hello -
> > >> >
> > >> > We're trying to use native libRADOS and the only challenge we're
> > running
> > >> > into is searching metadata.
> > >> >
> > >> > Using the rgw metadata sync seems to require all data to be pushed
> > through
> > >> > the rgw, which is not something we're interested in setting up at the
> > >> > moment.
> > >> >
> > >> > Are there hooks or features of libRADOS which could be leveraged to
> > enable
> > >> > syncing of metadata to an external system (elastic-search / postgres
> > / etc)?
> > >> >
> > >> > Is there a way to listen to a stream of updates to a pool in
> > real-time,
> > >> > with some guarantees I wouldn't miss things?
> > >> >
> > >> > Are there any features like this in libRADOS?
> > >> >
> > >> > Thank you
> > >> > ___
> > >> > ceph-users mailing list -- ceph-users@ceph.io
> > >> > To unsubscribe send an email to ceph-users-le...@ceph.io
> > >> >
> > >>
> > >>
> > >> --
> > >>
> > >> Matt Benjamin
> > >> Red Hat, Inc.
> > >> 315 West Huron Street, Suite 140A
> > >> Ann Arbor, Michigan 48103
> > >>
> > >> http://www.redhat.com/en/technologies/storage
> > >>
> > >> tel.  734-821-5101
> > >> fax.  734-769-8938
> > >> cel.  734-216-5309
> > >>
> >
> >
> > --
> >
> > Matt Benjamin
> > Red Hat, Inc.
> > 315 West Huron Street, Suite 140A
> > Ann Arbor, Michigan 48103
> >
> > http://www.redhat.com/en/technologies/storage
> >
> > tel.  734-821-5101
> > fax.  734-769-8938
> > cel.  734-216-5309
> >
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs: unable to mount share with 5.11 mainline, ceph 15.2.9, MDS 14.1.16

2021-03-03 Thread Stefan Kooman

On 3/3/21 1:16 PM, Ilya Dryomov wrote:



And from this documentation:
https://docs.ceph.com/en/latest/rados/configuration/network-config-ref/#ipv4-ipv6-dual-stack-mode
we learned that dual stack is not possible for any current stable
release, but might be possible with latest code. So the takeaway is that
the linux kernel client needs fixing to be able to support dual stack
clusters in the future (multiple v1 / v2 address families), and, that
until then you should run with ms_bind_ipv4=false for IPv6 only clusters.


I don't think we do any dual stack testing, whether in userspace or
(certainly!) with the kernel client.



I'll make a PR to clear up the documentation. Do you want me to create a
tracker for the kernel client? I will happily test your changes.


Sure.  You are correct that the kernel client needs a bit of work as we
haven't considered dual-stack configurations there at all.


I added another tracker as it is related to this thread:

https://tracker.ceph.com/issues/49584

^^^ do _not_ try this on an IPv6-only production cluster with 
ms_bind_ipv4=true or you will regret it ;-).


Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs: unable to mount share with 5.11 mainline, ceph 15.2.9, MDS 14.1.16

2021-03-03 Thread Stefan Kooman

On 3/3/21 1:16 PM, Ilya Dryomov wrote:



Sure.  You are correct that the kernel client needs a bit of work as we
haven't considered dual-stack configurations there at all.


https://tracker.ceph.com/issues/49581

Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions RE: Ceph/CentOS/IBM

2021-03-03 Thread Mark Nelson


On 3/3/21 10:37 AM, Marc wrote:

Secondly, are we expecting IBM to "kill off" Ceph as well?


Stop spreading rumors! really! one can take it further and say kill
product
x, y, z until none exist!


This is natural / logical thinking; the only one to blame here is IBM/Red Hat. They 
had no regard for maintaining the release period as it was scheduled and just cut it 
short by 7-8 years. It would have been more professional to announce this for el9, 
and not change el8 like this.

How can you trust anything else they are now saying? How can you know whether the 
open-source version of Ceph is going to end up with restricted features? With such 
management they will not even inform you; you will be the last to know, like 
all clients. I think it is a valid concern.


Speaking only for myself (but as someone who has been working on Ceph 
for nearly a decade all the way back to DreamHost), I do not believe 
IBM/Red Hat want to change the "upstream first" development model we 
follow for Ceph.  There's always been a little tension regarding how 
much time engineers spend on upstream development vs supporting the 
downstream products (and that existed even before Red Hat), but honestly 
I'm not really worried about it.  Ultimately releases flow from upstream 
to downstream except in rare circumstances (ie immediate hotfixes 
needed) and that model has worked well imho.


FWIW a lot of the people working on Ceph are passionate about open 
source.  It's baked into our culture and integral to how we run the 
project.  A large part of the Crimson development for instance is being 
done by outside contributors from Intel, Samsung, Qihoo 360, and 
others.  If significant changes were forced on Ceph there would be a lot 
of upset people including me. That doesn't mean it can't happen, but 
part of our job is to continually showcase and advocate for why open 
source is a better model not only for the world at large, but for our 
customers and IBM as well.  I believe companies (mostly!) do what's in 
their self interest, and I fully believe that it's in IBM's self 
interest right now to keep investing in Ceph (and fwiw they have been 
via additional upstream hardware purchases, testing, code contributions, 
product integration, etc).


Anyway, I don't know if that makes you feel any better, but imho Red Hat 
and IBM have been good custodians of Ceph so far, and at least for the 
immediate future I expect that to continue.  Also fwiw, I still use 
CentOS 8 stream for our upstream performance testing clusters and have 
no plans to change any time soon.



Mark









___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: "optimal" tunables on release upgrade

2021-03-03 Thread Clyso GmbH - Ceph Foundation Member

Hi Matthew,

my colleagues and I can still remember that the values do not change 
automatically when you upgrade.
I remember performance problems after an upgrade with old tunables a few 
years ago.


But such behaviour may change with the next version.

Meanwhile you get a warning in ceph status if they are not set correctly.

https://docs.ceph.com/en/latest/rados/operations/health-checks/#old-crush-tunables
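
If it helps, the currently active values can be inspected before deciding, e.g.:

    ceph osd crush show-tunables
    ceph osd crush tunables optimal   # only when you are ready for the data movement

The second command is the one that actually changes them and triggers
rebalancing.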


Regards, Joachim


___
Clyso GmbH - Ceph Foundation Member
supp...@clyso.com
https://www.clyso.com

Am 26.02.2021 um 15:52 schrieb Matthew Vernon:

Hi,

Having been slightly caught out by tunables on my Octopus upgrade[0], 
can I just check that if I do

ceph osd crush tunables optimal

That will update the tunables on the cluster to the current "optimal" 
values (and move a lot of data around), but that this doesn't mean 
they'll change next time I upgrade the cluster or anything like that?


It's not quite clear from the documentation whether the next time 
"optimal" tunables change that'll be applied to a cluster where I've 
set tunables thus, or if tunables are only ever changed by a fresh 
invocation of ceph osd crush tunables...


[I assume the same answer applies to "default"?]

Regards,

Matthew

[0] I foolishly thought a cluster initially installed as Jewel would 
have jewel tunables




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions RE: Ceph/CentOS/IBM

2021-03-03 Thread Milan Kupcevic
On 3/3/21 10:45 AM, Drew Weaver wrote:
> Howdy,
> 
> After the IBM acquisition of RedHat the landscape for CentOS quickly changed.
> 
> As I understand it right now Ceph 14 is the last version that will run on 
> CentOS/EL7 but CentOS8 was "killed off".
> 
> So given that, if you were going to build a Ceph cluster today would you even 
> bother doing it using a non-commercial distribution or would you just use 
> RHEL 8 (or even their commercial Ceph product).
> 



We run our Ceph 14.2 Nautilus cluster on Ubuntu 18.04 Bionic Beaver:
  https://download.ceph.com/debian-nautilus/dists/bionic/


Planning to run Ceph 15.2 Octopus on Ubuntu 20.04 Focal Fossa:
  https://download.ceph.com/debian-octopus/dists/focal/


Regards,

Milan


-- 
Milan Kupcevic
Senior Cyberinfrastructure Engineer at Project NESE
Harvard University
FAS Research Computing
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: CephFS: side effects of not using ceph-mgr volumes / subvolumes

2021-03-03 Thread Patrick Donnelly
On Wed, Mar 3, 2021 at 5:49 AM Sebastian Knust
 wrote:
>
> Hi,
>
> Assuming a cluster (currently octopus, might upgrade to pacific once
> released) serving only CephFS and that only to a handful of kernel and
> fuse-clients (no OpenStack, CSI or similar): Are there any side effects
> of not using the ceph-mgr volumes module abstractions [1], namely
> subvolumes and subvolume groups, that I have to consider?

The "volume" abstraction helps with creating the file system/MDS and
may help with management in the future. No side-effects for not using
either one.
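
Roughly speaking the difference is only in how the file system gets
created; both end up as a plain CephFS. A sketch, with arbitrary pool/fs
names (and assuming an orchestrator backend for the MDS part of the first
command):

    # with the volumes abstraction (mgr creates the pools and MDS for you)
    ceph fs volume create cephfs

    # bare CephFS, doing the equivalent steps by hand
    ceph osd pool create cephfs_metadata
    ceph osd pool create cephfs_data
    ceph fs new cephfs cephfs_metadata cephfs_data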

> I would still only mount subtrees of the whole (single) CephFS file
> system and have some clients which mount multiple disjunct subtrees.
> Quotas would only be set on the subtree level which I am mounting,
> likewise file layouts. Snapshots (via mkdir in .snap) would be used on
> the mounting level or one level above.
>
>
> Background: I don't require the abstraction features per se. Some
> restrictions (e.g. subvolume group snapshots not being supported) seem
> to me to be caused only by the abstraction layer and not the underlying
> CephFS. For my specific use case I require snapshots on the subvolume
> group layer. It therefore seems better to just forego the abstraction as
> a whole and work on bare CephFS.

subvolumegroup snapshots will come back, probably in a minor release of Pacific.


-- 
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions RE: Ceph/CentOS/IBM

2021-03-03 Thread Drew Weaver
> As I understand it right now Ceph 14 is the last version that will run on 
> CentOS/EL7 but CentOS8 was "killed off".

>This is wrong. Ceph 15 runs on CentOS 7 just fine, but without the dashboard.

Oh, what I should have said is that I want it to be fully functional.




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions RE: Ceph/CentOS/IBM [EXT]

2021-03-03 Thread Teoman Onay
Hi Matthew,

Starting with Ceph 4, RH only supports RHEL 7.x & 8.1; Ubuntu support has
been deprecated.

Regards

On Wed, Mar 3, 2021 at 5:19 PM Matthew Vernon  wrote:

> Hi,
>
> You can get support for running Ceph on a number of distributions - RH
> support both RHEL and Ubuntu, Canonical support Ubuntu, the smaller
> consultancies seem happy to support anything plausible (e.g. Debian),
> this mailing list will opine regardless of what distro you're running ;-)
>
> Regards,
>
> Matthew
>
>
> --
>  The Wellcome Sanger Institute is operated by Genome Research
>  Limited, a charity registered in England with number 1021457 and a
>  company registered in England with number 2742969, whose registered
>  office is 215 Euston Road, London, NW1 2BE.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions RE: Ceph/CentOS/IBM

2021-03-03 Thread Teoman Onay
Just go for CentOS Stream; it will be at least as stable as CentOS, and
probably even more so.

CentOS Stream is just the next minor version of the current RHEL release,
which means it already contains fixes not yet released for RHEL but already
available in CentOS Stream. It is not as if CentOS Stream were a beta.

On Wed, Mar 3, 2021 at 5:39 PM Radoslav Milanov 
wrote:

> +1
>
> On 3.3.2021 at 11:37, Marc wrote:
> >>> Secondly, are we expecting IBM to "kill off" Ceph as well?
> >>>
> >> Stop spreading rumors! really! one can take it further and say kill
> >> product
> >> x, y, z until none exist!
> >>
> > This natural / logical thinking, the only one to blame here is
> IBM/redhat. If you have no regards for maintaining the release period as it
> was scheduled, and just cut it short by 7-8 years. More professional would
> have been to announce this for el9, and not change 8 like this.
> >
> > How can you trust anything else they are now saying How can you know
> the opensource version of ceph is going to be having restricted features.
> With such management they will not even inform you. You will be the last to
> know, like all clients. I think it is a valid concern.
> >
> >
> >
> >
> >
> >
> >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions RE: Ceph/CentOS/IBM

2021-03-03 Thread Vasu Kulkarni
On Wed, Mar 3, 2021 at 8:46 AM Marc  wrote:

> > > Secondly, are we expecting IBM to "kill off" Ceph as well?
> > >
> > Stop spreading rumors! really! one can take it further and say kill
> > product
> > x, y, z until none exist!
> >
>
> This natural / logical thinking, the only one to blame here is IBM/redhat.
> If you have no regards for maintaining the release period as it was
> scheduled, and just cut it short by 7-8 years. More professional would have
> been to announce this for el9, and not change 8 like this.
>
> How can you trust anything else they are now saying How can you know
> the opensource version of ceph is going to be having restricted features.
> With such management they will not even inform you. You will be the last to
> know, like all clients. I think it is a valid concern.
>
I think that unlike the CentOS decision (which was sad for many employees as well),
Ceph's decisions are owned by the foundation - https://ceph.io/foundation/ - so I
don't think one company has full say over its future.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions RE: Ceph/CentOS/IBM

2021-03-03 Thread Anthony D'Atri
I’m at something of a loss to understand all the panic here.

Unless I’ve misinterpreted, CentOS isn’t killed, it’s being updated more 
frequently.  Want something stable?  Freeze a repository into a local copy, and 
deploy off of that.  Like we all should be doing anyway, vs. relying on 
slurping packages over the net all the time from upstream repositories we don’t 
control.  This has been best practice for years:

* No unexpected regressions
* Lessened exposure to trojans
* Lower latency, better availability
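
A sketch of what that looks like on an EL host with yum-utils/createrepo
(repo id and paths here are illustrative):

    reposync --repoid=Ceph -p /srv/localrepo
    createrepo /srv/localrepo/Ceph
    # then point clients at baseurl=file:///srv/localrepo/Ceph (or serve it over http)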

Half of us run our own kernels — and other updated packages — on CentOS anyway, 
so how different is Stream *really*?

>>> Secondly, are we expecting IBM to "kill off" Ceph as well?
> 
> This natural / logical thinking, the only one to blame here is IBM/redhat. If 
> you have no regards for maintaining the release period as it was scheduled, 
> and just cut it short by 7-8 years. More professional would have been to 
> announce this for el9, and not change 8 like this.
> 
> How can you trust anything else they are now saying How can you know the 
> opensource version of ceph is going to be having restricted features. With 
> such management they will not even inform you. You will be the last to know, 
> like all clients. I think it is a valid concern.


I don’t think IBM “owns” Ceph in a way that would let them do that.  The 
scenarios:

* Status quo: people with certain corporate postures keep paying for RHCS, 
others use and contribute to the community release
* IBM cuts it loose.  OSS forges on.
* The Solaris / ZFS phenomenon:  fork fork fork.  Not all that different from 
the first scenario.


ymmv
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions RE: Ceph/CentOS/IBM

2021-03-03 Thread Marc
> This is wrong. Ceph 15 runs on CentOS 7 just fine, but without the
> dashboard.
> 

I also hope that Ceph keeps support for el7 until it is EOL in 2024, so I
have enough time to figure out what OS to choose.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions RE: Ceph/CentOS/IBM

2021-03-03 Thread Radoslav Milanov

+1

On 3.3.2021 at 11:37, Marc wrote:

Secondly, are we expecting IBM to "kill off" Ceph as well?


Stop spreading rumors! really! one can take it further and say kill
product
x, y, z until none exist!


This is natural / logical thinking; the only one to blame here is IBM/Red Hat. They 
had no regard for maintaining the release period as it was scheduled and just cut it 
short by 7-8 years. It would have been more professional to announce this for el9, 
and not change el8 like this.

How can you trust anything else they are now saying? How can you know whether the 
open-source version of Ceph is going to end up with restricted features? With such 
management they will not even inform you; you will be the last to know, like 
all clients. I think it is a valid concern.







___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions RE: Ceph/CentOS/IBM

2021-03-03 Thread Marc
> > Secondly, are we expecting IBM to "kill off" Ceph as well?
> >
> Stop spreading rumors! really! one can take it further and say kill
> product
> x, y, z until none exist!
> 

This is natural / logical thinking; the only one to blame here is IBM/Red Hat. They 
had no regard for maintaining the release period as it was scheduled and just cut it 
short by 7-8 years. It would have been more professional to announce this for el9, 
and not change el8 like this.

How can you trust anything else they are now saying? How can you know whether the 
open-source version of Ceph is going to end up with restricted features? With such 
management they will not even inform you; you will be the last to know, like 
all clients. I think it is a valid concern.







___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions RE: Ceph/CentOS/IBM

2021-03-03 Thread Alexander E. Patrakov
On Wed, 3 Mar 2021 at 20:45, Drew Weaver  wrote:
>
> Howdy,
>
> After the IBM acquisition of RedHat the landscape for CentOS quickly changed.
>
> As I understand it right now Ceph 14 is the last version that will run on 
> CentOS/EL7 but CentOS8 was "killed off".

This is wrong. Ceph 15 runs on CentOS 7 just fine, but without the dashboard.

-- 
Alexander E. Patrakov
CV: http://u.pc.cd/wT8otalK
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs: unable to mount share with 5.11 mainline, ceph 15.2.9, MDS 14.1.16

2021-03-03 Thread Stefan Kooman

On 3/3/21 1:16 PM, Ilya Dryomov wrote:



I have tested with 5.11 kernel (5.11.2-arch1-1 #1 SMP PREEMPT Fri, 26
Feb 2021 18:26:41 + x86_64 GNU/Linux) port 3300 and ms_mode=crc as
well as ms_mode=prefer-crc and that works when cluster is running with
ms_bind_ipv4=false. So the "fix" is to have this config option set: ceph
config set global ms_bind_ipv4 false


Right.  According to your original post that was already the case:
"ms_bind_ipv6=trie, ms_bind_ipv4=false".


Indeed, I wrote that, but it was not correct. We *did* set that on a 
test cluster, but those changes never propagated to production.



And from this documentation:
https://docs.ceph.com/en/latest/rados/configuration/network-config-ref/#ipv4-ipv6-dual-stack-mode
we learned that dual stack is not possible for any current stable
release, but might be possible with latest code. So the takeaway is that
the linux kernel client needs fixing to be able to support dual stack
clusters in the future (multiple v1 / v2 address families), and, that
until then you should run with ms_bind_ipv4=false for IPv6 only clusters.


I don't think we do any dual stack testing, whether in userspace or
(certainly!) with the kernel client.


Yeah, there are quite a few combinations possible, especially with 
dual stack and separate public / cluster networks, maybe even with a 
different IP family per network? It would be good to know (as in, for the 
broader community) what is being tested / supported with regard to 
networking. Nowadays a separate public/cluster network is not advised 
anymore (which is a good thing, thanks Wido), but maybe this should also 
be made clear for IP families. IMHO it would be good to test both IPv4 and IPv6.






I'll make a PR to clear up the documentation. Do you want me to create a
tracker for the kernel client? I will happily test your changes.


Sure.  You are correct that the kernel client needs a bit of work as we
haven't considered dual-stack configurations there at all.


Check, I'll do that and come back with a tracker ID.

Thanks,

Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions RE: Ceph/CentOS/IBM

2021-03-03 Thread Vasu Kulkarni
On Wed, Mar 3, 2021 at 7:45 AM Drew Weaver  wrote:

> Howdy,
>
> After the IBM acquisition of RedHat the landscape for CentOS quickly
> changed.
>
> As I understand it right now Ceph 14 is the last version that will run on
> CentOS/EL7 but CentOS8 was "killed off".
>
> So given that, if you were going to build a Ceph cluster today would you
> even bother doing it using a non-commercial distribution or would you just
> use RHEL 8 (or even their commercial Ceph product).
>
> Secondly, are we expecting IBM to "kill off" Ceph as well?
>
Stop spreading rumors! really! one can take it further and say kill product
x, y, z until none exist!

>
> Thanks,
> -Drew
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: grafana-api-url not only for one host

2021-03-03 Thread Vladimir Sigunov
Hi,
I guess you can use a load balancer like HAProxy + keepalived to make the
API highly available and point the dashboard to the VIP. Of course, you need
to deploy more than one Grafana instance.

Thanks,
Vladimir
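
Something along these lines; the VIP/hostnames are placeholders and the
haproxy part is only a sketch:

    # haproxy backend in front of two grafana instances, VIP held by keepalived:
    #   backend grafana
    #       server ceph01 ceph01.hostxyz.tld:3000 check
    #       server ceph02 ceph02.hostxyz.tld:3000 check

    ceph dashboard set-grafana-api-url https://grafana-vip.hostxyz.tld:3000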

On Wed, Mar 3, 2021 at 5:07 AM E Taka <0eta...@gmail.com> wrote:

> Hi,
>
> if the host  fails, to which the grafana-api-url points (in the example
> below ceph01.hostxyz.tld:3000), Ceph Dashboard can't Display Grafana Data:
>
> # ceph dashboard get-grafana-api-url
> https://ceph01.hostxyz.tld:3000
>
> Is it possible to automagically switch to an other host?
>
> Thanks, Erich
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions RE: Ceph/CentOS/IBM [EXT]

2021-03-03 Thread Matthew Vernon

Hi,

You can get support for running Ceph on a number of distributions - RH 
support both RHEL and Ubuntu, Canonical support Ubuntu, the smaller 
consultancies seem happy to support anything plausible (e.g. Debian), 
this mailing list will opine regardless of what distro you're running ;-)


Regards,

Matthew


--
The Wellcome Sanger Institute is operated by Genome Research 
Limited, a charity registered in England with number 1021457 and a 
company registered in England with number 2742969, whose registered 
office is 215 Euston Road, London, NW1 2BE. 
___

ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions RE: Ceph/CentOS/IBM

2021-03-03 Thread Freddy Andersen
I would use croit

From: Drew Weaver 
Date: Wednesday, March 3, 2021 at 7:45 AM
To: 'ceph-users@ceph.io' 
Subject: [ceph-users] Questions RE: Ceph/CentOS/IBM
Howdy,

After the IBM acquisition of RedHat the landscape for CentOS quickly changed.

As I understand it right now Ceph 14 is the last version that will run on 
CentOS/EL7 but CentOS8 was "killed off".

So given that, if you were going to build a Ceph cluster today would you even 
bother doing it using a non-commercial distribution or would you just use RHEL 
8 (or even their commercial Ceph product).

Secondly, are we expecting IBM to "kill off" Ceph as well?

Thanks,
-Drew

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Questions RE: Ceph/CentOS/IBM

2021-03-03 Thread Drew Weaver
Howdy,

After the IBM acquisition of RedHat the landscape for CentOS quickly changed.

As I understand it right now Ceph 14 is the last version that will run on 
CentOS/EL7 but CentOS8 was "killed off".

So given that, if you were going to build a Ceph cluster today would you even 
bother doing it using a non-commercial distribution or would you just use RHEL 
8 (or even their commercial Ceph product).

Secondly, are we expecting IBM to "kill off" Ceph as well?

Thanks,
-Drew

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Monitor leveldb growing without bound v14.2.16

2021-03-03 Thread Lincoln Bryant
Hi all,

Thanks for the responses.

I stopped the monitor that wasn't syncing and dumped keys with the 
monstoretool. The keys seemed to mostly be of type 'logm' which I guess matches 
up with the huge amount of log messages I was getting about slow ops. I tried 
injecting clog_to_monitor=false along the way but it did not help in my case.
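
For the archives, that was roughly the following, run with the mon stopped
and with the store path adjusted to your mon:

    ceph-monstore-tool /var/lib/ceph/mon/ceph-b dump-keys | awk '{print $1}' | sort | uniq -c
    ceph-monstore-tool /var/lib/ceph/mon/ceph-b compact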

I ended up doing a rolling restart of the whole cluster, which must have 
cleared whatever was blocking things because the monitors automatically 
compacted and 'b' rejoined the quorum about 75% of the way through.

Thanks,
Lincoln

From: Wido den Hollander 
Sent: Wednesday, March 3, 2021 2:03 AM
To: Lincoln Bryant ; ceph-users 
Subject: Re: [ceph-users] Monitor leveldb growing without bound v14.2.16



On 03/03/2021 00:55, Lincoln Bryant wrote:
> Hi list,
>
> We recently had a cluster outage over the weekend where several OSDs were 
> inaccessible over night for several hours. When I found the cluster in the 
> morning, the monitors' root disks (which contained both the monitor's leveldb 
> and the Ceph logs) had completely filled.
>
> After restarting OSDs, cleaning out the monitors' logs, moving /var/lib/ceph 
> to dedicated disks on the mons, and starting recovery (in which there was 1 
> unfound object that I marked lost, if that has any relevancy), the leveldb 
> continued/continues to grow without bound. The cluster has all PGs in 
> active+clean at this point, yet I'm accumulating what seems like 
> approximately ~1GB/hr of new leveldb data.
>
> Two of the monitors (a, c) are in quorum, while the third (b) has been 
> synchronizing for the last several hours, but doesn't seem to be able to 
> catch up. Mon 'b' has been running for 4 hours now in the 'synchronizing' 
> state. The mon's log has many messages about compacting and deleting files, 
> yet we never exit the synchronization state.
>
> The ceph.log is also rapidly accumulating complaints that the mons are slow 
> (not surprising, I suppose, since the levelDBs are ~100GB at this point).
>
> I've found that using monstore tool to do compaction on mons 'a' and 'c' 
> thelps but is only a temporary fix. Soon the database inflates again and I'm 
> back to where I started.

Are all the PGs in the active+clean state? I don't assume so? This will
cause the MONs to keep a large history of OSDMaps in their DB and thus
it will keep growing.

>
> Thoughts on how to proceed here? Some ideas I had:
> - Would it help to add some new monitors that use RocksDB?

They would need to sync which can take a lot of time. Moving to RocksDB
is a good idea when this is all fixed.

> - Stop a monitor and dump the keys via monstoretool, just to get an idea 
> of what's going on?
> - Increase mon_sync_max_payload_size to try to move data in larger chunks?

I would just try it.

> - Drop down to a single monitor, and see if normal compaction triggers 
> and stops growing unbounded?

It will keep growing, the compact only works for a limited time. Make
sure the PGs become clean again.

In the meantime make sure you have enough disk space.

Wido

> - Stop both 'a' and 'c', compact them, start them, and immediately start 
> 'b' ?
>
> Appreciate any advice.
>
> Regards,
> Lincoln
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs: unable to mount share with 5.11 mainline, ceph 15.2.9, MDS 14.1.16

2021-03-03 Thread Stefan Kooman

On 3/2/21 6:00 PM, Jeff Layton wrote:





v2 support in the kernel is keyed on the ms_mode= mount option, so that
has to be passed in if you're connecting to a v2 port. Until the mount
helpers get support for that option you'll need to specify the address
and port manually if you want to use v2.


I've tried feeding it ms_mode=v2 but I get a "mount error 22 = Invalid
argument"; ms_mode=legacy does work, but fails with the same errors.



That needs different values. See:

 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=00498b994113a871a556f7ff24a4cf8a00611700

You can try passing in a specific mon address and port, like:

 192.168.21.22:3300:/cephfs/dir/

...and then pass in ms_mode=crc or something similar.

That said, what you're doing should be working, so this sounds like a
regression. I presume you're able to mount with earlier kernels? What's
the latest kernel version that you have that works?


5.11 kernel (5.11.2-arch1-1 #1 SMP PREEMPT Fri, 26 Feb 2021 18:26:41 
+ x86_64 GNU/Linux) with a cluster that has ms_bind_ipv4=false 
works. Port 3300 ms_mode=prefer-crc and ms_mode=crc work.


I have tested with 5.11 kernel (5.11.2-arch1-1 #1 SMP PREEMPT Fri, 26 
Feb 2021 18:26:41 + x86_64 GNU/Linux) port 3300 and ms_mode=crc as 
well as ms_mode=prefer-crc and that works when cluster is running with 
ms_bind_ipv4=false. So the "fix" is to have this config option set: ceph 
config set global ms_bind_ipv4 false
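
For completeness, the mount that now works looks something like this (mon 
address, path and credentials are placeholders):

    mount -t ceph [2001:db8::1]:3300:/ /mnt/cephfs \
        -o name=admin,secretfile=/etc/ceph/admin.secret,ms_mode=crc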


5.10 kernel (5.10.19-1-lts Arch Linux) works with a cluster that is IPv6 
only but has ms_bind_ipv4=true. So it's "broken" since 5.11.


So, we have done more reading / researching on the ms_bind_ip{4,6} options:

- 
https://pve.proxmox.com/wiki/Ceph_Luminous_to_Nautilus#Restart_the_OSD_daemon_on_all_nodes


- https://github.com/rook/rook/issues/6266

^^ Describes that you have to disable binding to IPv4.

- https://github.com/ceph/ceph/pull/13317

^^ this PR is not completely correct:

   **Note:** You may use IPv6 addresses instead of IPv4 addresses, but
   you must set ``ms bind ipv6`` to ``true``.

^^ That is not enough, as we have learned, and it starts to give trouble 
with the 5.11 Linux CephFS client.


And from this documentation: 
https://docs.ceph.com/en/latest/rados/configuration/network-config-ref/#ipv4-ipv6-dual-stack-mode 
we learned that dual stack is not possible for any current stable 
release, but might be possible with latest code. So the takeaway is that 
the linux kernel client needs fixing to be able to support dual stack 
clusters in the future (multiple v1 / v2 address families), and, that 
until then you should run with ms_bind_ipv4=false for IPv6 only clusters.


I'll make a PR to clear up the documentation. Do you want me to create a 
tracker for the kernel client? I will happily test your changes.


Thanks,

Stefan


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] CephFS: side effects of not using ceph-mgr volumes / subvolumes

2021-03-03 Thread Sebastian Knust

Hi,

Assuming a cluster (currently octopus, might upgrade to pacific once 
released) serving only CephFS and that only to a handful of kernel and 
fuse-clients (no OpenStack, CSI or similar): Are there any side effects 
of not using the ceph-mgr volumes module abstractions [1], namely 
subvolumes and subvolume groups, that I have to consider?


I would still only mount subtrees of the whole (single) CephFS file 
system and have some clients which mount multiple disjunct subtrees. 
Quotas would only be set on the subtree level which I am mounting, 
likewise file layouts. Snapshots (via mkdir in .snap) would be used on 
the mounting level or one level above.
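
Concretely, on a bare CephFS mount that would be something like the 
following; paths, pool name and sizes are placeholders:

    setfattr -n ceph.quota.max_bytes -v 10995116277760 /mnt/cephfs/groupA   # 10 TiB quota
    setfattr -n ceph.dir.layout.pool -v cephfs_data_ec /mnt/cephfs/groupA   # file layout
    mkdir /mnt/cephfs/groupA/.snap/2021-03-03                               # snapshot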



Background: I don't require the abstraction features per se. Some 
restrictions (e.g. subvolume group snapshots not being supported) seem 
to me to be caused only by the abstraction layer and not the underlying 
CephFS. For my specific use case I require snapshots on the subvolume 
group layer. It therefore seems better to just forego the abstraction as 
a whole and work on bare CephFS.



Cheers
Sebastian

[1] https://docs.ceph.com/en/octopus/cephfs/fs-volumes/
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: bug in latest cephadm bootstrap: got an unexpected keyword argument 'verbose_on_failure'

2021-03-03 Thread Sebastian Wagner
Indeed. That is going to be fixed by

https://github.com/ceph/ceph/pull/39633



Am 03.03.21 um 07:31 schrieb Philip Brown:
> Seems like someone is not testing cephadm on centos 7.9
> 
> Just tried installing cephadm from the repo, and ran
> cephadm bootstrap --mon-ip=xxx
> 
> it blew up, with
> 
> ceph TypeError: __init__() got an unexpected keyword argument 
> 'verbose_on_failure'
> 
> just after the firewall section.
> 
> I happen to have a test cluser from a few months ago, and compared the code.
> 
> Some added, in line 2348,
> 
> "out, err, ret = call([self.cmd, '--permanent', '--query-port', 
> tcp_port], verbose_on_failure=False)"
> 
> this made the init fail, on my centos 7.9 system, freshly installed and 
> updated today.
> 
> # cephadm version
> ceph version 15.2.9 (357616cbf726abb779ca75a551e8d02568e15b17) octopus 
> (stable)
> 
> 
> Simply commenting out that line makes it complete the cluster init like I 
> remember.
> 
> 
> --
> Philip Brown| Sr. Linux System Administrator | Medata, Inc. 
> 5 Peters Canyon Rd Suite 250 
> Irvine CA 92606 
> Office 714.918.1310| Fax 714.918.1325 
> pbr...@medata.com| www.medata.com
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs: unable to mount share with 5.11 mainline, ceph 15.2.9, MDS 14.1.16

2021-03-03 Thread Ilya Dryomov
On Wed, Mar 3, 2021 at 11:15 AM Stefan Kooman  wrote:
>
> On 3/2/21 6:00 PM, Jeff Layton wrote:
>
> >>
> >>>
> >>> v2 support in the kernel is keyed on the ms_mode= mount option, so that
> >>> has to be passed in if you're connecting to a v2 port. Until the mount
> >>> helpers get support for that option you'll need to specify the address
> >>> and port manually if you want to use v2.
> >>
> >> I've tried feeding it ms_mode=v2 but I get a "mount error 22 = Invalid
> >> argument", the ms_mode=legacy does work, but fails with the same errors.
> >>
> >
> > That needs different values. See:
> >
> >  
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=00498b994113a871a556f7ff24a4cf8a00611700
> >
> > You can try passing in a specific mon address and port, like:
> >
> >  192.168.21.22:3300:/cephfs/dir/
> >
> > ...and then pass in ms_mode=crc or something similar.
> >
> > That said, what you're doing should be working, so this sounds like a
> > regression. I presume you're able to mount with earlier kernels? What's
> > the latest kernel version that you have that works?
>
> 5.11 kernel (5.11.2-arch1-1 #1 SMP PREEMPT Fri, 26 Feb 2021 18:26:41
> + x86_64 GNU/Linux) with a cluster that has ms_bind_ipv4=false
> works. Port 3300 ms_mode=prefer-crc and ms_mode=crc work.
>
> I have tested with 5.11 kernel (5.11.2-arch1-1 #1 SMP PREEMPT Fri, 26
> Feb 2021 18:26:41 + x86_64 GNU/Linux) port 3300 and ms_mode=crc as
> well as ms_mode=prefer-crc and that works when cluster is running with
> ms_bind_ipv4=false. So the "fix" is to have this config option set: ceph
> config set global ms_bind_ipv4 false

Right.  According to your original post that was already the case:
"ms_bind_ipv6=trie, ms_bind_ipv4=false".

>
> 5.10 kernel (5.10.19-1-lts Arch Linux) works with a cluster that is IPv6
> only but has ms_bind_ipv4=true. So it's "broken" since 5.11.
>
> So, we have done more reading / researching on the ms_bind_ip{4,6} options:
>
> -
> https://pve.proxmox.com/wiki/Ceph_Luminous_to_Nautilus#Restart_the_OSD_daemon_on_all_nodes
>
> - https://github.com/rook/rook/issues/6266
>
> ^^ Describe that you have to disable bind to IPv4.
>
> - https://github.com/ceph/ceph/pull/13317
>
> ^^ this PR is not completely correct:
>
> **Note:** You may use IPv6 addresses instead of IPv4 addresses, but
> you must set ``ms bind ipv6`` to ``true``.
>
> ^^ That is not enough as we have learned, and starts to give trouble
> with 5.11 linux cephfs client.
>
> And from this documentation:
> https://docs.ceph.com/en/latest/rados/configuration/network-config-ref/#ipv4-ipv6-dual-stack-mode
> we learned that dual stack is not possible for any current stable
> release, but might be possible with latest code. So the takeaway is that
> the linux kernel client needs fixing to be able to support dual stack
> clusters in the future (multiple v1 / v2 address families), and, that
> until then you should run with ms_bind_ipv4=false for IPv6 only clusters.

I don't think we do any dual stack testing, whether in userspace or
(certainly!) with the kernel client.

>
> I'll make a PR to clear up the documenation. Do you want me to create a
> tracker for the kernel client? I will happily test your changes.

Sure.  You are correct that the kernel client needs a bit of work as we
haven't considered dual-stack configurations there at all.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Monitor leveldb growing without bound v14.2.16

2021-03-03 Thread Frank Schilder
Slow mon sync can be caused by too large mon_sync_max_payload_size. The default 
is usually way too high. I had sync problems until I set

mon_sync_max_payload_size = 4096

Since then mon sync is not an issue any more.
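
On recent releases that can be set centrally, e.g.:

    ceph config set mon mon_sync_max_payload_size 4096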

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Peter Woodman 
Sent: 03 March 2021 06:26:47
To: Lincoln Bryant
Cc: ceph-users
Subject: [ceph-users] Re: Monitor leveldb growing without bound v14.2.16

Is the ceph insights plugin enabled? This caused huge bloat of the mon
stores for me. Before I figured that out, I turned on leveldb compression
options on the mon store and got pretty significant savings, too.

On Tue, Mar 2, 2021 at 6:56 PM Lincoln Bryant  wrote:

> Hi list,
>
> We recently had a cluster outage over the weekend where several OSDs were
> inaccessible over night for several hours. When I found the cluster in the
> morning, the monitors' root disks (which contained both the monitor's
> leveldb and the Ceph logs) had completely filled.
>
> After restarting OSDs, cleaning out the monitors' logs, moving
> /var/lib/ceph to dedicated disks on the mons, and starting recovery (in
> which there was 1 unfound object that I marked lost, if that has any
> relevancy), the leveldb continued/continues to grow without bound. The
> cluster has all PGs in active+clean at this point, yet I'm accumulating
> what seems like approximately ~1GB/hr of new leveldb data.
>
> Two of the monitors (a, c) are in quorum, while the third (b) has been
> synchronizing for the last several hours, but doesn't seem to be able to
> catch up. Mon 'b' has been running for 4 hours now in the 'synchronizing'
> state. The mon's log has many messages about compacting and deleting files,
> yet we never exit the synchronization state.
>
> The ceph.log is also rapidly accumulating complaints that the mons are
> slow (not surprising, I suppose, since the levelDBs are ~100GB at this
> point).
>
> I've found that using monstore tool to do compaction on mons 'a' and 'c'
> thelps but is only a temporary fix. Soon the database inflates again and
> I'm back to where I started.
>
> Thoughts on how to proceed here? Some ideas I had:
>- Would it help to add some new monitors that use RocksDB?
>- Stop a monitor and dump the keys via monstoretool, just to get an
> idea of what's going on?
>- Increase mon_sync_max_payload_size to try to move data in larger
> chunks?
>- Drop down to a single monitor, and see if normal compaction triggers
> and stops growing unbounded?
>- Stop both 'a' and 'c', compact them, start them, and immediately
> start 'b' ?
>
> Appreciate any advice.
>
> Regards,
> Lincoln
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Octopus auto-scale causing HEALTH_WARN re object numbers [EXT]

2021-03-03 Thread Matthew Vernon

On 02/03/2021 16:38, Matthew Vernon wrote:


root@sto-t1-1:~# ceph health detail
HEALTH_WARN 1 pools have many more objects per pg than average; 9 pgs 
not deep-scrubbed in time
[WRN] MANY_OBJECTS_PER_PG: 1 pools have many more objects per pg than 
average
     pool default.rgw.buckets.data objects per pg (313153) is more than 
23.4063 times cluster average (13379)


...which seems like the wrong thing for the auto-scaler to be doing. Is 
this a known problem?


The autoscaler has finished, and I still have the health warning:

root@sto-t1-1:~# ceph health detail
HEALTH_WARN 1 pools have many more objects per pg than average
[WRN] MANY_OBJECTS_PER_PG: 1 pools have many more objects per pg than 
average
pool default.rgw.buckets.data objects per pg (313153) is more than 
23.0871 times cluster average (13564)


Am I right that the auto-scaler only considers size and never object count?

If so, am I right that this is a bug?

I mean, I think I can bodge around it with pg_num_min, but I thought one 
of the merits of Octopus was that the admin had to spend less time 
worrying about pool sizes...
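
The bodge would be something like the following, with the value picked to
suit the object count:

    ceph osd pool set default.rgw.buckets.data pg_num_min 128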


Regards,

Matthew


--
The Wellcome Sanger Institute is operated by Genome Research 
Limited, a charity registered in England with number 1021457 and a 
company registered in England with number 2742969, whose registered 
office is 215 Euston Road, London, NW1 2BE. 
___

ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Best practices for OSD on bcache

2021-03-03 Thread James Page
Hi Norman

On Wed, Mar 3, 2021 at 2:47 AM Norman.Kern  wrote:

> James,
>
> Can you tell me what the hardware config of your bcache is? I use a 400G
> SATA SSD as the cache device and
>
> a 10T HDD as the storage device.  Hardware related?
>

It might be - all of the deployments I've seen/worked with use NVMe SSD
devices and some more recent ones have used NVMe attached Optane devices as
well (but that is usual).

Backing HDDs are SAS-attached 12TB-ish 7K RPM spinning disks.
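
For reference, the tunables quoted below are applied via sysfs along these
lines (bcache device name and cache-set UUID are placeholders):

    echo writeback > /sys/block/bcache0/bcache/cache_mode
    echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
    echo 0 > /sys/fs/bcache/<cset-uuid>/congested_read_threshold_us
    echo 0 > /sys/fs/bcache/<cset-uuid>/congested_write_threshold_us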



>
> On 2021/3/2 下午4:49, James Page wrote:
> > Hi Norman
> >
> > On Mon, Mar 1, 2021 at 4:38 AM Norman.Kern  wrote:
> >
> >> Hi, guys
> >>
> >> I am testing ceph on bcache devices,  I found the performance is not
> good
> >> as expected. Does anyone have any best practices for it?  Thanks.
> >>
> > I've used bcache quite a bit with Ceph with the following configuration
> > options tweaked
> >
> > a) use writeback mode rather than writethrough (which is the default)
> >
> > This ensures that the cache device is actually used for write caching
> >
> > b) turn off the sequential cutoff
> >
> > sequential_cutoff = 0
> >
> > This means that sequential writes will also always go to the cache device
> > rather than the backing device
> >
> > c) disable the congestion read and write thresholds
> >
> > congested_read_threshold_us = congested_write_threshold_us = 0
> >
> > The following repository:
> >
> > https://git.launchpad.net/charm-bcache-tuning/tree/src/files
> >
> > has a python script and systemd configuration todo b) and c)
> automatically
> > on all bcache devices on boot; a) we let the provisioning system take
> care
> > of.
> >
> > HTH
> >
> >
> >> ___
> >> ceph-users mailing list -- ceph-users@ceph.io
> >> To unsubscribe send an email to ceph-users-le...@ceph.io
> >>
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] grafana-api-url not only for one host

2021-03-03 Thread E Taka
Hi,

if the host  fails, to which the grafana-api-url points (in the example
below ceph01.hostxyz.tld:3000), Ceph Dashboard can't Display Grafana Data:

# ceph dashboard get-grafana-api-url
https://ceph01.hostxyz.tld:3000

Is it possible to automagically switch to an other host?

Thanks, Erich
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Monitor leveldb growing without bound v14.2.16

2021-03-03 Thread Wido den Hollander




On 03/03/2021 00:55, Lincoln Bryant wrote:

Hi list,

We recently had a cluster outage over the weekend where several OSDs were 
inaccessible over night for several hours. When I found the cluster in the 
morning, the monitors' root disks (which contained both the monitor's leveldb 
and the Ceph logs) had completely filled.

After restarting OSDs, cleaning out the monitors' logs, moving /var/lib/ceph to 
dedicated disks on the mons, and starting recovery (in which there was 1 
unfound object that I marked lost, if that has any relevancy), the leveldb 
continued/continues to grow without bound. The cluster has all PGs in 
active+clean at this point, yet I'm accumulating what seems like approximately 
~1GB/hr of new leveldb data.

Two of the monitors (a, c) are in quorum, while the third (b) has been 
synchronizing for the last several hours, but doesn't seem to be able to catch 
up. Mon 'b' has been running for 4 hours now in the 'synchronizing' state. The 
mon's log has many messages about compacting and deleting files, yet we never 
exit the synchronization state.

The ceph.log is also rapidly accumulating complaints that the mons are slow 
(not surprising, I suppose, since the levelDBs are ~100GB at this point).

I've found that using monstore tool to do compaction on mons 'a' and 'c' thelps 
but is only a temporary fix. Soon the database inflates again and I'm back to 
where I started.


Are all the PGs in the active+clean state? I don't assume so? This will 
cause the MONs to keep a large history of OSDMaps in their DB and thus 
it will keep growing.




Thoughts on how to proceed here? Some ideas I had:
- Would it help to add some new monitors that use RocksDB?


They would need to sync which can take a lot of time. Moving to RocksDB 
is a good idea when this is all fixed.



- Stop a monitor and dump the keys via monstoretool, just to get an idea of 
what's going on?
- Increase mon_sync_max_payload_size to try to move data in larger chunks?


I would just try it.


- Drop down to a single monitor, and see if normal compaction triggers and 
stops growing unbounded?


It will keep growing, the compact only works for a limited time. Make 
sure the PGs become clean again.


In the meantime make sure you have enough disk space.

Wido


- Stop both 'a' and 'c', compact them, start them, and immediately start 
'b' ?

Appreciate any advice.

Regards,
Lincoln

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io