[ceph-users] Re: which of cpu frequency and number of threads servers osd better?

2020-11-12 Thread Martin Verges
Hello Tony,

As these are HDDs, your CPU won't be the bottleneck at all. Both CPUs
are overprovisioned.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Fri, 13 Nov 2020 at 03:57, Tony Liu wrote:

> Hi,
>
> For example, 16 threads with 3.2GHz and 32 threads with 3.0GHz,
> which makes 11 OSDs (10x12TB HDD and 1x960GB SSD) with better
> performance?
>
>
> Thanks!
> Tony
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph on ARM ?

2020-11-24 Thread Martin Verges
Hello,

> I'm curious however if the ARM servers are better or not for this use case 
> (object-storage only).  For example, instead of using 2xSilver/Gold server, I 
> can use a Taishan 5280 server with 2x Kungpen 920 ARM CPUs with up to 128 
> cores in total .  So I can have twice as many CPU cores (or even more) per 
> server comparing with x86.  Probably the price is lower for the ARM servers 
> as well.

Even if they were cheaper, which I strongly doubt, you would get
less performance out of them. More cores won't give you any benefit
in Ceph, but much faster cores are somewhat of a game changer. Just
use a good AMD Epyc for the best price/performance/power ratio.

> Has anyone tested Ceph in such scenario?  Is the Ceph software really 
> optimised for the ARM architecture ?  What do you think about this ?

If you choose ARM for your Ceph, you are one of very, very few people
and will most probably hit some crazy bugs that will cause trouble.
That is a high price to pay, in my opinion, just for an "imaginary"
performance or power-reduction benefit. Storage has to run 24*7 all
year long without a single incident; everything else is, in my world,
unacceptable.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Tue, 24 Nov 2020 at 13:57, Robert Sander wrote:
>
> On 24.11.20 at 13:12, Adrian Nicolae wrote:
>
> > Has anyone tested Ceph in such scenario ?  Is the Ceph software
> > really optimised for the ARM architecture ?
>
> Personally I have not run Ceph on ARM, but there are companies selling
> such setups:
>
> https://softiron.com/
> https://www.ambedded.com.tw/
>
> Regards
> --
> Robert Sander
> Heinlein Support GmbH
> Schwedter Str. 8/9b, 10119 Berlin
>
> http://www.heinlein-support.de
>
> Tel: 030 / 405051-43
> Fax: 030 / 405051-19
>
> Zwangsangaben lt. §35a GmbHG:
> HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
> Geschäftsführer: Peer Heinlein -- Sitz: Berlin
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Centos 8 2021 with ceph, how to move forward?

2021-01-14 Thread Martin Verges
Hello,

we at croit use Ceph on Debian and deploy all our clusters with it.
It works like a charm, and I personally have had quite good experience
with it for ~20 years. It is a fantastically solid OS for servers.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Thu, 14 Jan 2021 at 11:12, David Majchrzak, ODERLAND Webbhotell AB wrote:
>
> One of our providers (cloudlinux)  released a 1:1 binary compatible
> redhat fork due to the changes with Centos 8.
>
> Could be worth looking at.
>
> https://almalinux.org/
>
> In our case we're using ceph on debian 10.
>
> --
>
> David Majchrzak
> CTO
> Oderland Webbhotell AB
> Östra Hamngatan 50B, 411 09 Göteborg, SWEDEN
>
> On 2021-01-14 at 09:04, Szabo, Istvan (Agoda) wrote:
> > Hi,
> >
> > Just curious how you guys move forward with this Centos 8 change.
> >
> > We just finished installing our full multisite cluster and looks like we 
> > need to change the operating system.
> >
> > So curious if you are using centos 8 with ceph, where you are going to move 
> > forward.
> >
> > Thank you
> >
> > 
> > This message is confidential and is for the sole use of the intended 
> > recipient(s). It may also be privileged or otherwise protected by copyright 
> > or other legal rules. If you have received it by mistake please let us know 
> > by reply email and delete it from your system. It is prohibited to copy 
> > this message or disclose its content to anyone. Any confidentiality or 
> > privilege is not waived or lost by any mistaken delivery or unauthorized 
> > disclosure of the message. All messages sent to and from Agoda may be 
> > monitored to ensure compliance with company policies, to protect the 
> > company's interests and to remove potential malware. Electronic messages 
> > may be intercepted, amended, lost or deleted, or contain viruses.
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Using RBD to pack billions of small files

2021-01-31 Thread Martin Verges
Hello,

Source code should be compressible; maybe just create something like
a tar.gz per repo? That way you would get much bigger objects, which
could improve speed and make the data easier to store on any storage
system.
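
For illustration only, a minimal shell sketch of the append-and-index
idea from the quoted mail below: one growing packfile plus a separate
offset/size index. Paths and file names are made up.

  PACK=/mnt/rbd/packfile
  INDEX=/mnt/rbd/packfile.idx
  for f in artifacts/*; do
      offset=$(stat -c %s "$PACK" 2>/dev/null || echo 0)  # current end of the packfile
      size=$(stat -c %s "$f")
      cat "$f" >> "$PACK"                                 # append the artifact bytes
      echo "$(basename "$f") $offset $size" >> "$INDEX"   # record where they live
  done
  # read one artifact back via its recorded offset and size:
  # dd if="$PACK" bs=1 skip="$offset" count="$size" status=none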

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Sat, 30 Jan 2021 at 16:01, Loïc Dachary wrote:
>
> Bonjour,
>
> In the context Software Heritage (a noble mission to preserve all source 
> code)[0], artifacts have an average size of ~3KB and there are billions of 
> them. They never change and are never deleted. To save space it would make 
> sense to write them, one after the other, in an every growing RBD volume 
> (more than 100TB). An index, located somewhere else, would record the offset 
> and size of the artifacts in the volume.
>
> I wonder if someone already implemented this idea with success? And if not... 
> does anyone see a reason why it would be a bad idea?
>
> Cheers
>
> [0] https://docs.softwareheritage.org/
>
> --
> Loïc Dachary, Artisan Logiciel Libre
>
>
>
>
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Worst thing that can happen if I have size= 2

2021-02-03 Thread Martin Verges
Hello Adam,

Two copies are safe, min_size 1 is not.
As long as there is no write while one copy is missing, you can
recover from the remaining copy, or from the unavailable copy once it
comes online again.
If you have min_size 1 and therefore write data to a single copy, no
safety net will protect you.

In general, we consider even a 2-copy setup not secure enough.
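
If in doubt, the pool settings can be checked and raised at any time;
"mypool" is just a placeholder here:

  ceph osd pool get mypool size        # number of replicas
  ceph osd pool get mypool min_size    # replicas required to accept I/O
  ceph osd pool set mypool size 3
  ceph osd pool set mypool min_size 2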

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Wed, 3 Feb 2021 at 16:50, Adam Boyhan wrote:
>
> Isn't this somewhat reliant on the OSD type?
>
> Redhat/Micron/Samsung/Supermicro have all put out white papers backing the 
> idea of 2 copies on NVMe's as safe for production.
>
>
> From: "Magnus HAGDORN" 
> To: pse...@avalon.org.ua
> Cc: "ceph-users" 
> Sent: Wednesday, February 3, 2021 4:43:08 AM
> Subject: [ceph-users] Re: Worst thing that can happen if I have size= 2
>
> On Wed, 2021-02-03 at 09:39 +, Max Krasilnikov wrote:
> > > if a OSD becomes unavailble (broken disk, rebooting server) then
> > > all
> > > I/O to the PGs stored on that OSD will block until replication
> > > level of
> > > 2 is reached again. So, for a highly available cluster you need a
> > > replication level of 3
> >
> >
> > AFAIK, with min_size 1 it is possible to write even to only active
> > OSD serving
> >
> yes, that's correct but then you seriously risk trashing your data
>
> The University of Edinburgh is a charitable body, registered in Scotland, 
> with registration number SC005336.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: reinstalling node with orchestrator/cephadm

2021-02-08 Thread Martin Verges
Hello,

you could switch to croit. We can take over existing clusters without
much pain, and then you have a single button to upgrade in the future
;)

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Mon, 8 Feb 2021 at 16:53, Kenneth Waegeman wrote:
>
> Hi Eugen, all,
>
> Thanks for sharing your results! Since we have multiple clusters and
> clusters with +500 OSDs, this solution is not feasible for us.
>
> In the meantime I created an issue for this :
>
> https://tracker.ceph.com/issues/49159
>
> We would need this especially to migrate/reinstall all our clusters to
> Rhel8 (without destroying/recreating all osd disks), so I really hope
> there is another solution :)
>
> Thanks again!
>
> Kenneth
>
> On 05/02/2021 16:11, Eugen Block wrote:
> > Hi Kenneth,
> >
> > I managed to succeed with this just now. It's a lab environment and
> > the OSDs are not encrypted but I was able to get the OSDs up again.
> > The ceph-volume commands also worked (just activation didn't) so I had
> > the required information about those OSDs.
> >
> > What I did was
> >
> > - collect the OSD data (fsid, keyring)
> > - create directories for osd daemons under
> > /var/lib/ceph//osd.
> > - note that the directory with the ceph uuid already existed since the
> > crash container had been created after bringing the node back into the
> > cluster
> > - creating the content for that OSD by copying the required files from
> > a different host and changed the contents of
> > - fsid
> > - keyring
> > - whoami
> > - unit.run
> > - unit.poststop
> >
> > - created the symlinks to the OSD devices:
> > - ln -s /dev/ceph-/osd-block- block
> > - ln -s /dev/ceph-/osd-block- block.db
> >
> > - changed ownership to ceph
> > - chown -R ceph.ceph /var/lib/ceph//osd./
> >
> > - started the systemd unit
> > - systemctl start ceph-@osd..service
> >
> > I repeated this for all OSDs on that host, now all OSDs are online and
> > the cluster is happy. I'm not sure what else is necessary in case of
> > encrypted OSDs, but maybe this procedure helps you.
> > I don't know if there's a smoother or even automated way, I don't
> > think there currently is. Maybe someone is working on it though.
> >
> > Regards,
> > Eugen
> >
> >
> > Zitat von Kenneth Waegeman :
> >
> >> Hi all,
> >>
> >> I'm running a 15.2.8 cluster using ceph orch with all daemons adopted
> >> to cephadm.
> >>
> >> I tried reinstall an OSD node. Is there a way to make ceph
> >> orch/cephadm activate the devices on this node again, ideally
> >> automatically?
> >>
> >> I tried running `cephadm ceph-volume -- lvm activate --all` but this
> >> has an error related to dmcrypt:
> >>
> >>> [root@osd2803 ~]# cephadm ceph-volume -- lvm activate --all
> >>> Using recent ceph image docker.io/ceph/ceph:v15
> >>> /usr/bin/podman:stderr --> Activating OSD ID 0 FSID
> >>> 697698fd-3fa0-480f-807b-68492bd292bf
> >>> /usr/bin/podman:stderr Running command: /usr/bin/mount -t tmpfs
> >>> tmpfs /var/lib/ceph/osd/ceph-0
> >>> /usr/bin/podman:stderr Running command: /usr/bin/ceph-authtool
> >>> /var/lib/ceph/osd/ceph-0/lockbox.keyring --create-keyring --name
> >>> client.osd-lockbox.697698fd-3fa0-480f-807b-68492bd292bf --add-key
> >>> AQAy7Bdg0jQsBhAAj0gcteTEbcpwNNvMGZqTTg==
> >>> /usr/bin/podman:stderr  stdout: creating
> >>> /var/lib/ceph/osd/ceph-0/lockbox.keyring
> >>> /usr/bin/podman:stderr added entity
> >>> client.osd-lockbox.697698fd-3fa0-480f-807b-68492bd292bf
> >>> auth(key=AQAy7Bdg0jQsBhAAj0gcteTEbcpwNNvMGZqTTg==)
> >>> /usr/bin/podman:stderr Running command: /usr/bin/chown -R ceph:ceph
> >>> /var/lib/ceph/osd/ceph-0/lockbox.keyring
> >>> /usr/bin/podman:stderr Running command: /usr/bin/ceph --cluster ceph
> >>> --name client.osd-lockbox.697698fd-3fa0-480f-807b-68492bd292bf
> >>> --keyring /var/lib/ceph/osd/ceph-0/lockbox.keyring config-key get
> >>> dm-crypt/osd/697698fd-3fa0-480f-807b-68492bd292bf/luks
> >>> /usr/bin/podman:stderr  stderr: Error initializing cluster client:
>

[ceph-users] Re: How's the maturity of CephFS and how's the maturity of Ceph erasure code?

2021-02-08 Thread Martin Verges
Hello Fred,

from hundreds of installations, we can say it is production-ready and
works fine if deployed and maintained correctly. As always, it
depends, but it works for a huge number of use cases.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Tue, 9 Feb 2021 at 03:00, fanyuanli wrote:
>
> Hi all,
> I'm a rookie in CEPH. I want to ask two questions. One is the maturity of 
> cephfs, the file system of CEPH, and whether it is recommended for production 
> environment. The other is the maturity of CEPH's erasure code and whether it 
> can be used in production environment. Are the above two questions explained 
> in the official documents? I may not see where they are.
> Thank you!
> Fred Fan
>
>
>
>
>
>
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 10G stackabe lacp switches

2021-02-15 Thread Martin Verges
Hello,

you can buy Arista 7050QX-32S switches; they are 40G switches, cost
around 500-700€ each and can be run as an MLAG pair. They work great.
If you want to spend more money, the 7060QX-32S are 100G switches
available for around 1500€.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Mon, 15 Feb 2021 at 12:16, mj wrote:
>
> Hi,
>
> Hapy to report that we recently upgraded our three-host 24 OSD cluster
> from HDD filestore to SSD BlueStore. After a few months of use, their
> WEAR is still at 1%, and the cluster performance ("rados bench" etc) has
> dramatically improved. So all in all: yes, we're happy Samsung PM883
> ceph users. :-)
>
> We currently have a "meshed" ceph setup, with the three hosts connected
> directly to each other over 10G ethernet, as described here:
>
> https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server#Method_2_.28routed.29
>
> As we would like to be able to add more storage hosts, we need to loose
> the meshed network setup.
>
> My idea is to add two stacked 10G ethernet switches to the setup, so we
> can start using lacp bonded networking over two physical switches.
>
> Looking around, we can get refurb Cisco Small Business 550X for around
> 1300 euro. We also noticed that mikrotik and TP-Link have some even
> nicer-priced 10G switches, but those all lack bonding. :-(
>
> Therfore I'm asking here: anyone here with suggestions on what to look
> at, for nice-priced 10G stackable switches..?
>
> We would like to continue using ethernet, as we use that everywhere, and
> also performance-wise we're happy with what we currently have.
>
> Last december I wrote to mikrotik support, asking if they will support
> stacking / LACP any time soon, and their answer was: probably 2nd half
> of 2021.
>
> So, anyone here with interesting insights to share for ceph 10G ethernet
> storage networking?
>
> Thanks,
> MJ
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrading Ceph luminous to mimic on debian-buster

2021-02-16 Thread Martin Verges
Hello,

you can migrate to Nautilus and skip the outdated Mimic. Save
yourself the trouble of Mimic, it's not worth it.
You can find packages in debian-backports
(https://packages.debian.org/buster-backports/ceph) or in the croit
Debian mirror.
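
As a rough, untested sketch of how such an upgrade could look on
Buster with the backports packages (adjust package names and order to
your setup; mons first, then mgrs, then OSDs, one host at a time):

  echo "deb http://deb.debian.org/debian buster-backports main" \
      > /etc/apt/sources.list.d/backports.list
  apt update
  apt -t buster-backports install ceph ceph-mon ceph-mgr ceph-osd
  systemctl restart ceph-mon.target       # on the mon hosts, one at a time
  systemctl restart ceph-mgr.target
  systemctl restart ceph-osd.target       # on the OSD hosts, one at a time
  ceph osd require-osd-release nautilus   # once every daemon runs Nautilus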

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Tue, 16 Feb 2021 at 14:25, Jean-Marc FONTANA wrote:
>
> Hello everyone,
>
> We just installed a Ceph cluster version luminous (12.2.11) on servers
> working with Debian buster (10.8)
> using ceph-deploy and we are trying to upgrade it to mimic but can't
> find a way to do it.
>
> We tried ceph-deploy install --release mimic mon1 mon2 mon3 (after
> having modified /etc/apt/sources.list.d/ceph.list)
> but this does nothing because the packets are said to be up to date.
>
> Could someone help us, please ?
>
> Best regards
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 10G stackabe lacp switches

2021-02-21 Thread Martin Verges
Hello MJ,

Arista has good documentation available, for example at
https://www.arista.com/en/um-eos/eos-multi-chassis-link-aggregation or
https://eos.arista.com/mlag-basic-configuration/. Don't worry: once
you know exactly what you want to configure, it's just a few lines of
config in the end.

It's more or less:

On Both Switches:
-
no spanning-tree vlan-id 4094
!
vlan 4094
   trunk group mlag
!
interface Port-Channel2000
   switchport mode trunk
   switchport trunk group mlag
!
interface Ethernet35
   description MLAG link
   channel-group 2000 mode active
!
interface Ethernet36
   description MLAG link
   channel-group 2000 mode active
!

On switch 1 in addition:
---
interface Vlan4094
   ip address 10.2.0.1/30
!
mlag configuration
   domain-id mlag
   local-interface Vlan4094
   peer-address 10.2.0.2
   peer-link Port-Channel2000
!

On switch 2 in addition:
---
interface Vlan4094
   ip address 10.2.0.2/30
!
mlag configuration
   domain-id mlag
   local-interface Vlan4094
   peer-address 10.2.0.1
   peer-link Port-Channel2000
!

After that, you create port-channels for the ports you would like to
bundle, each with a specific MLAG ID.

On both switches as an example:
-
interface Port-Channel1
   switchport access vlan 101
   mlag 1
!
interface Ethernet7/1
   description "MLAG bonding to Server XXX"
   channel-group 1 mode active
!
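
On the server side, the matching counterpart is a plain 802.3ad bond.
A quick iproute2 sketch (interface names and the IP are examples; most
people will put this into ifupdown or netplan instead):

  ip link add bond0 type bond mode 802.3ad miimon 100
  ip link set eth0 down && ip link set eth0 master bond0
  ip link set eth1 down && ip link set eth1 master bond0
  ip link set bond0 up
  ip addr add 192.0.2.10/24 dev bond0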

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Sat, 20 Feb 2021 at 16:38, mj wrote:
>
> Hi,
>
> Just to follow-up this dialogue:
>
> After Martin's tip for the arista's, we started looked around, and while
> buying from ebay did not appeal to us, we did manage to find a couple of
> refurb equipment resellers that offer them for a (little) bit more, but
> then including a year warranty.
>
> And actually, for an extra 175 euro, I was able to upgrade warranty from
> one to three years.
>
> Now I just need to get an ok from management here. And then the fun
> begins: start reading up on arista EOS MLAG config, en doing a lot of
> testing with new stuff.
>
> Now I wonder if it's accepted to ask arista-specific (MLAG) config
> questions here on this list...
>
> Have a nice weekend all!
>
> MJ
>
> On 2/15/21 1:41 PM, Sebastian Trojanowski wrote:
> > 3y ago I bought it on ebay to my home lab for 750$ with transport and
> > duty and additional tax, so it's possible
> >
> > https://www.ebay.com/sch/i.html?_from=R40&_trksid=p2334524.m570.l1313&_nkw=7050QX-32&_sacat=0&LH_TitleDesc=0&_osacat=0&_odkw=7050QX-32S
> >
> >
> > BR,
> > Sebastian
> >
> > On 15.02.2021 12:57, huxia...@horebdata.cn wrote:
> >> Just wondering from where one can buy Arista 7050QX-32S for 700 USD?
> >>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Gradually Increasing PG/PGP

2021-02-21 Thread Martin Verges
Hello Mark,

Ceph itself does this incrementally. Just set the final value you
want to end up with and wait for Ceph to get there.
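
In other words, a single step is enough; "mypool" is a placeholder,
and on Nautilus or newer pgp_num follows along automatically:

  ceph osd pool set mypool pg_num 1024
  ceph osd pool set mypool pgp_num 1024   # only needed on older releases
  # optional: let Ceph choose and adjust the value on its own
  ceph osd pool set mypool pg_autoscale_mode on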

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Sun, 21 Feb 2021 at 23:34, Mark Johnson wrote:
>
> Hi,
>
> Probably a basic/stupid question but I'm asking anyway.  Through lack of 
> knowledge and experience at the time, when we set up our pools, our pool that 
> holds the majority of our data was created with a PG/PGP num of 64.  As the 
> amount of data has grown, this has started causing issues with balance of 
> data across OSDs.  I want to increase the PG count to at least 512, or maybe 
> 1024 - obviously, I want to do this incrementally.  However, rather than 
> going from 64 to 128, then 256 etc, I'm considering doing this in much 
> smaller increments over a longer period of time so that it will hopefully be 
> doing the majority of moving around of data during the quieter time of day.  
> So, may start by going in increments of 4 until I get up to 128 and then go 
> in jumps of 8 and so on.
>
> My question is, will I still end up with the same net result going in 
> increments of 4 until I hit 128 as I would if I were to go straight to 128 in 
> one hit.  What I mean by that is that once I reach 128, would I have the 
> exact same level of data balance across PGs as I would if I went straight to 
> 128?  Are there any drawbacks in going up in small increments over a long 
> period of time?  I know that I'll have uneven PG sizes until I get to that 
> exponent of 2 but that should be OK as long as the end result is the desired 
> result.  I suspect I may have a greater amount of data moving around overall 
> doing it this way but given my goal is to reduce the amount of intensive data 
> moves during higher traffic times, that's not a huge concern in the grand 
> scheme of things.
>
> Thanks in advance,
> Mark
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Slow cluster / misplaced objects - Ceph 15.2.9

2021-02-26 Thread Martin Verges
Hello,

within croit, we have network latency monitoring that would have
shown you the packet loss.
We therefore suggest installing something like Smokeping on your
infrastructure to monitor the quality of your network.

Why does it affect your cluster?

The network is the central component of a Ceph cluster. If it does
not function stably and reliably, Ceph cannot work properly either. It
is practically the backbone of the scale-out cluster and cannot be
replaced by anything. Even single packet losses lead to retransmits,
increased latency and thus reduced data throughput. This in turn has a
higher impact during replication work, and is particularly pronounced
with EC, where not only writes but also reads must be served from
several OSDs.
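
Two quick checks that usually expose such problems (the host name is a
placeholder): packet loss in the ping summary, or constantly growing
send queues and retransmit counters in ss towards one particular host,
point to a bad link or bond member.

  ping -c 500 -i 0.2 ceph-node2 | tail -3   # loss and latency summary
  ss -tin                                   # per-connection Send-Q and retransmits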

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Fri, 26 Feb 2021 at 20:00, David Orman wrote:
>
> We figured this out - it was a leg of an LACP-based interface that was
> misbehaving. Once we dropped it, everything went back to normal. Does
> anybody know a good way to get a sense of what might be slowing down a
> cluster in this regard, with EC? We didn't see any indication of a single
> host as a problem until digging into the socket statistics and seeing high
> sendqs to that host.
>
> On Thu, Feb 25, 2021 at 7:33 PM David Orman  wrote:
>
> > Hi,
> >
> > We've got an interesting issue we're running into on Ceph 15.2.9. We're
> > experiencing VERY slow performance from the cluster, and extremely slow
> > misplaced object correction, with very little cpu/disk/network utilization
> > (almost idle) across all nodes in the cluster.
> >
> > We have 7 servers in this cluster, 24 rotational OSDs, and two NVMEs with
> > 12 OSD's worth of DB/WAL files on them. The OSDs are all equal weighted, so
> > the tree is pretty straightforward:
> >
> > root@ceph01:~# ceph osd tree
> >
> > Inferring fsid 41bb9256-c3bf-11ea-85b9-9e07b0435492
> >
> > Inferring config
> > /var/lib/ceph/41bb9256-c3bf-11ea-85b9-9e07b0435492/mon.ceph01/config
> >
> > Using recent ceph image
> > docker.io/ceph/ceph@sha256:4e710662986cf366c282323bfb4c4ca507d7e117c5ccf691a8273732073297e5
> >
> > ID  CLASS  WEIGHT      TYPE NAME        STATUS  REWEIGHT  PRI-AFF
> >  -1         2149.39062  root default
> >  -2         2149.39062      rack rack1
> >  -5          307.05579          host ceph01
> >   0    hdd    12.79399              osd.0    up   1.0       1.0
> >   1    hdd    12.79399              osd.1    up   1.0       1.0
> >   2    hdd    12.79399              osd.2    up   1.0       1.0
> >   3    hdd    12.79399              osd.3    up   1.0       1.0
> >   4    hdd    12.79399              osd.4    up   1.0       1.0
> >   5    hdd    12.79399              osd.5    up   1.0       1.0
> >   6    hdd    12.79399              osd.6    up   1.0       1.0
> >   7    hdd    12.79399              osd.7    up   1.0       1.0
> >   8    hdd    12.79399              osd.8    up   1.0       1.0
> >   9    hdd    12.79399              osd.9    up   1.0       1.0
> >  10    hdd    12.79399              osd.10   up   1.0       1.0
> >  11    hdd    12.79399              osd.11   up   1.0       1.0
> >  12    hdd    12.79399              osd.12   up   1.0       1.0
> >  13    hdd    12.79399              osd.13   up   1.0       1.0
> >  14    hdd    12.79399              osd.14   up   1.0       1.0
> >  15    hdd    12.79399              osd.15   up   1.0       1.0
> >  16    hdd    12.79399              osd.16   up   1.0       1.0
> >  17    hdd    12.79399              osd.17   up   1.0       1.0
> >  18    hdd    12.79399              osd.18   up   1.0       1.0
> >  19    hdd    12.79399              osd.19   up   1.0       1.0
> >  20    hdd    12.79399              osd.20   up   1.0       1.0
> >  21    hdd    12.79399              osd.21   up   1.0       1.0
> >  22    hdd    12.79399              osd.22   up   1.0       1.0
> >
> &g

[ceph-users] Re: How big an OSD disk could be?

2021-03-12 Thread Martin Verges
A good mix of size and performance is the Seagate Exos 2X14 MACH.2
dual-actuator 14TB HDD.
This drive presents itself as two individual 7TB block devices, and
you install an OSD on each.

https://croit.io/blog/benchmarking-seagate-exos2x14-mach-2-hdds

We have a bunch of them in a permanent test cluster; if someone wants
to do some specific Ceph workload benchmarks, feel free to drop me a
mail.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Fri, 12 Mar 2021 at 18:35, Robert Sander wrote:
>
> On 12.03.21 at 18:30, huxia...@horebdata.cn wrote:
>
> > Any other aspects on the limits of bigger capacity hard disk drives?
>
> Recovery will take longer increasing the risk of another failure in the
> same time.
>
> Regards
> --
> Robert Sander
> Heinlein Support GmbH
> Schwedter Str. 8/9b, 10119 Berlin
>
> http://www.heinlein-support.de
>
> Tel: 030 / 405051-43
> Fax: 030 / 405051-19
>
> Zwangsangaben lt. §35a GmbHG:
> HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
> Geschäftsführer: Peer Heinlein -- Sitz: Berlin
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How big an OSD disk could be?

2021-03-13 Thread Martin Verges
> failure-domain=host

yes (or rack/room/datacenter/...); for regular clusters it is
therefore absolutely no problem, as you correctly assumed.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Sat, 13 Mar 2021 at 13:08, Janne Johansson wrote:
>
> On Sat, 13 Mar 2021 at 12:56, Marc wrote:
> > > A good mix of size and performance is the Seagate 2X14 MACH.2 Dual
> > > Actor 14TB HDD.
> > > This drive reports as 2x 7TB individual block devices and you install
> > > a OSD on each.
> >
> > My first thought was, wow quite nice this dual exposes itself as two 
> > drives. I was always under the impression that it was just one single drive 
> > with just more iops.
> > But come to think of it, with such a solution you do create a failure 
> > domain that ceph does not know about. With the default host failure domain 
> > the only problem is that you have an increased risk of 2 drives failing at 
> > once.
> > Is it possible to configure this drive whether it exposes itself as two 
> > drives or just one bigger-faster one?
>
> Well, if you run with failure-domain=host, then if it says "I have 8
> 14TB drives and one failed" or "I have 16 7TB drives and two failed"
> isn't going to matter much in terms of recovery, is it?
> It would mostly matter for failure-domain=OSD, otherwise it seems about equal.
>
> --
> May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How big an OSD disk could be?

2021-03-13 Thread Martin Verges
If you have a small cluster without host redundancy, you can still
configure Ceph to handle this correctly by adding a drive failure
domain between the host and OSD levels. So yes, you need to change
more than just failure-domain=osd, as that alone would be a problem.
However, it is essentially the same as having multiple OSDs per NVMe,
as some people do.
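
If someone wants to try this, the usual way is to edit the CRUSH map
by hand; a rough outline only (the bucket type name "drive" is just an
example, test this on a lab cluster first):

  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt
  # edit crushmap.txt:
  #  - add a new bucket type between "osd" and "host", e.g. "drive"
  #  - group the two OSDs of each dual-actuator disk into one drive bucket
  #  - let the rule choose its leaves over "drive" instead of "osd"/"host"
  crushtool -c crushmap.txt -o crushmap.new
  ceph osd setcrushmap -i crushmap.new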

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Sat, 13 Mar 2021 at 13:11, Marc wrote:
>
> > Well, if you run with failure-domain=host, then if it says "I have 8
> > 14TB drives and one failed" or "I have 16 7TB drives and two failed"
> > isn't going to matter much in terms of recovery, is it?
> > It would mostly matter for failure-domain=OSD, otherwise it seems about
> > equal.
>
> Yes, but especially in small clusters, people are changing the failure domain 
> to osd to be able to use EC (like I have ;))
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How big an OSD disk could be?

2021-03-13 Thread Martin Verges
> So perhaps we'll need to change the OSD to allow for 500 or 1000 PGs

We had a support case last year where we were forced to set the OSD
limit to >4000 for a few days, and had more than 4k active PGs on that
single OSD. You can do that, however it is quite uncommon.
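
For reference, the knob in question on a release with the central
config store (the value is an example; the default is 250, and OSDs
additionally enforce mon_max_pg_per_osd * osd_max_pg_per_osd_hard_ratio
as a hard limit):

  ceph config get mon mon_max_pg_per_osd
  ceph config set global mon_max_pg_per_osd 1000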

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Sat, 13 Mar 2021 at 14:29, Dan van der Ster  wrote:
>
> On Fri, Mar 12, 2021 at 6:35 PM Robert Sander
>  wrote:
> >
> > On 12.03.21 at 18:30, huxia...@horebdata.cn wrote:
> >
> > > Any other aspects on the limits of bigger capacity hard disk drives?
> >
> > Recovery will take longer increasing the risk of another failure in the
> > same time.
> >
>
> Another limitation is that OSDs should store 100 PGs each regardless
> of their size, so those PGs will each need to store many more objects
> and therefore recovery, scrubbing, removal, listing, etc... will all
> take longer and longer.
>
> So perhaps we'll need to change the OSD to allow for 500 or 1000 PGs
> per OSD eventually, (meaning also that PGs per cluster needs to scale
> up too!)
>
> Cheers, Dan
>
> > Regards
> > --
> > Robert Sander
> > Heinlein Support GmbH
> > Schwedter Str. 8/9b, 10119 Berlin
> >
> > http://www.heinlein-support.de
> >
> > Tel: 030 / 405051-43
> > Fax: 030 / 405051-19
> >
> > Zwangsangaben lt. §35a GmbHG:
> > HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
> > Geschäftsführer: Peer Heinlein -- Sitz: Berlin
> >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Diskless boot for Ceph nodes

2021-03-17 Thread Martin Verges
Hello,

that's right, you can test our croit.io software for free or watch
how it works in a recording of a webinar:
https://youtu.be/uMNxOIP1kHI?t=752

From our point of view, booting systems via PXE provides at least the
same benefits as containers on a system, but with much stronger
integration. For example, we not only have Debian 10 Buster images but
also openSUSE Leap 15.2 images that we can boot up at any time. You
can even do that within the same cluster without a problem, or migrate
at any time between different operating systems. It makes you
independent as well as flexible. If you break your OS, you simply
press the reboot button and get a fresh, clean OS booted into memory.
Besides that, it is easy to maintain, solid, and all your hosts run
exactly the same software and configuration state (kernel, libs, Ceph,
everything).

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Tue, 16 Mar 2021 at 22:07, Stefan Kooman wrote:
>
> On 3/16/21 6:37 PM, Stephen Smith6 wrote:
> > Hey folks - thought I'd check and see if anyone has ever tried to use
> > ephemeral (tmpfs / ramfs based) boot disks for Ceph nodes?
>
> croit.io does that quite succesfully I believe [1].
>
> Gr. Stefan
>
> [1]: https://www.croit.io/software/features
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph-ansible in Pacific and beyond?

2021-03-17 Thread Martin Verges
Hello,

> Finer grained ability to allocate resources to services. (This process
gets 2g of ram and 1 cpu)

do you really believe this is a benefit? How can it be a benefit to have
crashing or slow OSDs? It sounds cool but doesn't work in most environments
I have ever had my hands on.
We often encounter clusters that fall apart or have a meltdown just because
they run out of memory, and we use tricks like zram to help them out and
recover their clusters. If I now enforce limits per container/OSD in a
finer-grained way, it will just blow up even more.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Wed, 17 Mar 2021 at 18:59, Fox, Kevin M wrote:

> There are a lot of benefits to containerization that is hard to do without
> it.
> Finer grained ability to allocate resources to services. (This process
> gets 2g of ram and 1 cpu)
> Security is better where only minimal software is available within the
> container so on service compromise its harder to escape.
> Ability to run exactly what was tested / released by upstream. Fewer
> issues with version mismatches. Especially useful across different distros.
> Easier to implement orchestration on top which enables some of the
> advanced features such as easy to allocate iscsi/nfs volumes. Ceph is
> finally doing so now that it is focusing on containers.
> And much more.
>
> 
> From: Teoman Onay 
> Sent: Wednesday, March 17, 2021 10:38 AM
> To: Matthew H
> Cc: Matthew Vernon; ceph-users
> Subject: [ceph-users] Re: ceph-ansible in Pacific and beyond?
>
> Check twice before you click! This email originated from outside PNNL.
>
>
> A containerized environment just makes troubleshooting more difficult,
> getting access and retrieving details on Ceph processes isn't as
> straightforward as with a non containerized infrastructure. I am still not
> convinced that containerizing everything brings any benefits except the
> collocation of services.
>
> On Wed, Mar 17, 2021 at 6:27 PM Matthew H 
> wrote:
>
> > There should not be any performance difference between an
> un-containerized
> > version and a containerized one.
> >
> > The shift to containers makes sense, as this is the general direction
> that
> > the industry as a whole is taking. I would suggest giving cephadm a try,
> > it's relatively straight forward and significantly faster for deployments
> > then ceph-ansible is.
> >
> > 
> > From: Matthew Vernon 
> > Sent: Wednesday, March 17, 2021 12:50 PM
> > To: ceph-users 
> > Subject: [ceph-users] ceph-ansible in Pacific and beyond?
> >
> > Hi,
> >
> > I caught up with Sage's talk on what to expect in Pacific (
> >
> https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DPVtn53MbxTc&data=04%7C01%7CKevin.Fox%40pnnl.gov%7Cc8375da0c5e949514eae08d8e96beb60%7Cd6faa5f90ae240338c0130048a38deeb%7C0%7C0%7C637515997042609565%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=7uTZ2om6cgMF7wVMY6ujPHdGS%2FltOUbv0C8L%2FKF3BSU%3D&reserved=0
> ) and there was no mention
> > of ceph-ansible at all.
> >
> > Is it going to continue to be supported? We use it (and uncontainerised
> > packages) for all our clusters, so I'd be a bit alarmed if it was going
> > to go away...
> >
> > Regards,
> >
> > Matthew
> >
> >
> > --
> >  The Wellcome Sanger Institute is operated by Genome Research
> >  Limited, a charity registered in England with number 1021457 and a
> >  company registered in England with number 2742969, whose registered
> >  office is 215 Euston Road, London, NW1 2BE.
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph-ansible in Pacific and beyond?

2021-03-17 Thread Martin Verges
>  I am still not convinced that containerizing everything brings any
benefits except the collocation of services.

Is there even a benefit? We at croit collocate all our services, from Ceph
itself (MON, MGR, MDS, OSD, ...) to iSCSI, SMB, NFS, ..., on the same host.
No problem with that, not a single one.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Wed, 17 Mar 2021 at 18:39, Teoman Onay wrote:

> A containerized environment just makes troubleshooting more difficult,
> getting access and retrieving details on Ceph processes isn't as
> straightforward as with a non containerized infrastructure. I am still not
> convinced that containerizing everything brings any benefits except the
> collocation of services.
>
> On Wed, Mar 17, 2021 at 6:27 PM Matthew H 
> wrote:
>
> > There should not be any performance difference between an
> un-containerized
> > version and a containerized one.
> >
> > The shift to containers makes sense, as this is the general direction
> that
> > the industry as a whole is taking. I would suggest giving cephadm a try,
> > it's relatively straight forward and significantly faster for deployments
> > then ceph-ansible is.
> >
> > 
> > From: Matthew Vernon 
> > Sent: Wednesday, March 17, 2021 12:50 PM
> > To: ceph-users 
> > Subject: [ceph-users] ceph-ansible in Pacific and beyond?
> >
> > Hi,
> >
> > I caught up with Sage's talk on what to expect in Pacific (
> > https://www.youtube.com/watch?v=PVtn53MbxTc ) and there was no mention
> > of ceph-ansible at all.
> >
> > Is it going to continue to be supported? We use it (and uncontainerised
> > packages) for all our clusters, so I'd be a bit alarmed if it was going
> > to go away...
> >
> > Regards,
> >
> > Matthew
> >
> >
> > --
> >  The Wellcome Sanger Institute is operated by Genome Research
> >  Limited, a charity registered in England with number 1021457 and a
> >  company registered in England with number 2742969, whose registered
> >  office is 215 Euston Road, London, NW1 2BE.
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Email alerts from Ceph

2021-03-18 Thread Martin Verges
By adding hook scripts within croit ("onHealthDegrade" and
"onHealthRecover") that notify us via Telegram/Slack/... ;)

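Without croit, a small cron-driven script that mimics those hooks is
often enough. A hedged sketch; the mail command, state file and
recipient are assumptions, and recent releases also ship an "alerts"
mgr module that can send SMTP notifications directly:

  #!/bin/bash
  # mail on every health state change (degrade as well as recover)
  STATE_FILE=/var/tmp/ceph_last_health
  CURRENT=$(ceph health | awk '{print $1}')   # HEALTH_OK / HEALTH_WARN / HEALTH_ERR
  LAST=$(cat "$STATE_FILE" 2>/dev/null)
  if [ "$CURRENT" != "$LAST" ]; then
      ceph health detail | mail -s "Ceph health changed: $LAST -> $CURRENT" ops@example.com
      echo "$CURRENT" > "$STATE_FILE"
  fi
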
--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Wed, 17 Mar 2021 at 23:27, Andrew Walker-Brown <andrew_jbr...@hotmail.com> wrote:

> Hi all,
>
> How have folks implemented getting email or snmp alerts out of Ceph?
> Getting things like osd/pool nearly full or osd/daemon failures etc.
>
> Kind regards
>
> Andrew
>
> Sent from my iPhone
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph-ansible in Pacific and beyond?

2021-03-18 Thread Martin Verges
> So no, I am not convinced yet. Not against it, but personally I would say
it's not the only way forward.

100% agree with your whole answer

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Thu, 18 Mar 2021 at 09:42, Wido den Hollander  wrote:

>
>
> On 18/03/2021 09:09, Janne Johansson wrote:
> > On Wed, 17 Mar 2021 at 20:17, Matthew H wrote:
> >>
> >> "A containerized environment just makes troubleshooting more difficult,
> getting access and retrieving details on Ceph processes isn't as
> straightforward as with a non containerized infrastructure. I am still not
> convinced that containerizing everything brings any benefits except the
> collocation of services."
> >>
> >> It changes the way you troubleshoot, but I don't find it more difficult
> in the issues I have seen and had. Even today without containers, all
> services can be co-located within the same hosts (mons,mgrs,osds,mds).. Is
> there a situation you've seen where that has not been the case?
> >
> > New ceph users pop in all the time on the #ceph IRC and have
> > absolutely no idea on how to see the relevant logs from the
> > containerized services.
> >
> > Me being one of the people that do run services on bare metal (and
> > VMs) I actually can't help them, and it seems several other old ceph
> > admins can't either.
> >
>
> Me being one of them.
>
> Yes, it's all possible with containers, but it's different. And I don't
> see the true benefit of running Ceph in Docker just yet.
>
> Another layer of abstraction which you need to understand. Also, when
> you need to do real emergency stuff like working with
> ceph-objectstore-tool to fix broken OSDs/PGs it's just much easier to
> work on a bare-metal box than with containers (if you ask me).
>
> So no, I am not convinced yet. Not against it, but personally I would
> say it's not the only way forward.
>
> DEB and RPM packages are still alive and kicking.
>
> Wido
>
> > Not that it is impossible or might not even be hard to get them, but
> > somewhere in the "it is so easy to get it up and running, just pop a
> > container and off you go" docs there seem to be a lack of the parts
> > "when the OSD crashes at boot, run this to export the file normally
> > called /var/log/ceph/ceph-osd.12.log" meaning it becomes a black box
> > to the users and they are left to wipe/reinstall or something else
> > when it doesn't work. At the end, I guess the project will see less
> > useful reports with Assert Failed logs from impossible conditions and
> > more people turning away from something that could be fixed in the
> > long run.
> >
> > I get some of the advantages, and for stateless services elsewhere it
> > might be gold to have containers, I am not equally enthusiastic about
> > it for ceph.
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Can I create 8+2 Erasure coding pool on 5 node?

2021-03-25 Thread Martin Verges
You can change the CRUSH rule to be OSD-specific instead of host-specific.
That way Ceph will place one chunk per OSD and multiple chunks per host.

Please keep in mind that this will cause an outage if one of your hosts is
offline.
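
For a new pool that is just a profile setting; the names below are
placeholders. For an already existing pool you would instead create a
matching rule and switch the pool over to it:

  ceph osd erasure-code-profile set ec-8-2 k=8 m=2 crush-failure-domain=osd
  ceph osd pool create ecpool 128 128 erasure ec-8-2

  # existing EC pool: create a rule from the profile and assign it
  ceph osd crush rule create-erasure ec-8-2-osd ec-8-2
  ceph osd pool set ecpool crush_rule ec-8-2-osd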

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Thu, 25 Mar 2021 at 19:02, by morphin  wrote:

> Hello.
>
> I have 5 node Cluster in A datacenter. Also I have same 5 node in B
> datacenter.
> They're gonna be 10 node 8+2 EC cluster for backup but I need to add
> the 5 node later.
> I have to sync my S3 data with multisite on the 5 node cluster in A
> datacenter and move
> them to the B and add the other 5 node to the same cluster.
>
> The question is: Can I create 8+2 ec pool on 5 node cluster and add
> the 5 node later? How can I rebalance the data after that?
> Or is there any better solution in my case? what should I do?
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How's the maturity of CephFS and how's the maturity of Ceph erasure code?

2021-03-31 Thread Martin Verges
>  Do you mean that the file system and erasure code of Ceph are ready enough 
> for production

yes, no doubts

> but it is strongly depends on deployment and maintance correctly.

yes

> Besides your descripton, may I know where can I get such data and evidence to 
> enhance my confidence.

Sorry, I can't. But as Stefan Kooman wrote, "Best way is to find out
for yourself and your use case(s) with a PoC cluster".
We use it for our own stuff, and croit customers use it in a lot of
installations. However, nothing is better than a PoC to gain more
confidence.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx



On Wed, 31 Mar 2021 at 12:50, Fred  wrote:
>
> Hi Martin,
> Sorry for late reply. I think maybe I understand what you said. Do you mean 
> that the file system and erasure code of Ceph are ready enough for 
> production, but it is strongly depends on deployment and maintance correctly.
> Besides your descripton, may I know where can I get such data and evidence to 
> enhance my confidence.
> Thanks very much! I appreciate it.
>
>
>
>
>
>
>
> At 2021-02-09 15:38:33, "Martin Verges"  wrote:
> >Hello Fred,
> >
> >from hundreds of installations, we can say it is production ready and
> >working fine if deployed and maintained correctly. As always it
> >depends, but it works for a huge amount of use cases.
> >
> >--
> >Martin Verges
> >Managing director
> >
> >Mobile: +49 174 9335695
> >E-Mail: martin.ver...@croit.io
> >Chat: https://t.me/MartinVerges
> >
> >croit GmbH, Freseniusstr. 31h, 81247 Munich
> >CEO: Martin Verges - VAT-ID: DE310638492
> >Com. register: Amtsgericht Munich HRB 231263
> >
> >Web: https://croit.io
> >YouTube: https://goo.gl/PGE1Bx
> >
> >On Tue, 9 Feb 2021 at 03:00, fanyuanli wrote:
> >>
> >> Hi all,
> >> I'm a rookie in CEPH. I want to ask two questions. One is the maturity of 
> >> cephfs, the file system of CEPH, and whether it is recommended for 
> >> production environment. The other is the maturity of CEPH's erasure code 
> >> and whether it can be used in production environment. Are the above two 
> >> questions explained in the official documents? I may not see where they 
> >> are.
> >> Thank you!
> >> Fred Fan
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> ___
> >> ceph-users mailing list -- ceph-users@ceph.io
> >> To unsubscribe send an email to ceph-users-le...@ceph.io
> >___
> >ceph-users mailing list -- ceph-users@ceph.io
> >To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [Ceph-maintainers] v16.2.0 Pacific released

2021-04-01 Thread Martin Verges
Hello,

thanks for a very interesting new Ceph Release.

Are there any plans to build for Debian Bullseye as well? It has been
in "hard freeze" since 2021-03-12, and at the moment it ships a
Nautilus release that will already be EOL by the time Bullseye becomes
the official stable. That will be a pain for Debian users, and if it
is still possible, we should try to avoid it. Is there something we
could do to help make it happen?

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Thu, 1 Apr 2021 at 16:27, David Galloway  wrote:
>
> We're glad to announce the first release of the Pacific v16.2.0 stable
> series. There have been a lot of changes across components from the
> previous Ceph releases, and we advise everyone to go through the release
> and upgrade notes carefully.
>
> Major Changes from Octopus
> --
>
> General
> ~~~
>
> * Cephadm can automatically upgrade an Octopus cluster to Pacific with a 
> single
>   command to start the process.
>
> * Cephadm has improved significantly over the past year, with improved
>   support for RGW (standalone and multisite), and new support for NFS
>   and iSCSI.  Most of these changes have already been backported to
>   recent Octopus point releases, but with the Pacific release we will
>   switch to backporting bug fixes only.
>
> * Packages are built for the following distributions:
>
>   - CentOS 8
>   - Ubuntu 20.04 (Focal)
>   - Ubuntu 18.04 (Bionic)
>   - Debian Buster
>   - Container image (based on CentOS 8)
>
>   With the exception of Debian Buster, packages and containers are
>   built for both x86_64 and aarch64 (arm64) architectures.
>
>   Note that cephadm clusters may work on many other distributions,
>   provided Python 3 and a recent version of Docker or Podman is
>   available to manage containers.  For more information, see
>   `cephadm-host-requirements`.
>
>
> Dashboard
> ~
>
> The `mgr-dashboard` brings improvements in the following management areas:
>
> * Orchestrator/Cephadm:
>
>   - Host management: maintenance mode, labels.
>   - Services: display placement specification.
>   - OSD: disk replacement, display status of ongoing deletion, and improved
> health/SMART diagnostics reporting.
>
> * Official `mgr ceph api`:
>
>   - OpenAPI v3 compliant.
>   - Stability commitment starting from Pacific release.
>   - Versioned via HTTP `Accept` header (starting with v1.0).
>   - Thoroughly tested (>90% coverage and per Pull Request validation).
>   - Fully documented.
>
> * RGW:
>
>   - Multi-site synchronization monitoring.
>   - Management of multiple RGW daemons and their resources (buckets and 
> users).
>   - Bucket and user quota usage visualization.
>   - Improved configuration of S3 tenanted users.
>
> * Security (multiple enhancements and fixes resulting from a pen testing 
> conducted by IBM):
>
>   - Account lock-out after a configurable number of failed log-in attempts.
>   - Improved cookie policies to mitigate XSS/CSRF attacks.
>   - Reviewed and improved security in HTTP headers.
>   - Sensitive information reviewed and removed from logs and error messages.
>   - TLS 1.0 and 1.1 support disabled.
>   - Debug mode when enabled triggers HEALTH_WARN.
>
> * Pools:
>
>   - Improved visualization of replication and erasure coding modes.
>   - CLAY erasure code plugin supported.
>
> * Alerts and notifications:
>
>   - Alert triggered on MTU mismatches in the cluster network.
>   - Favicon changes according cluster status.
>
> * Other:
>
>   - Landing page: improved charts and visualization.
>   - Telemetry configuration wizard.
>   - OSDs: management of individual OSD flags.
>   - RBD: per-RBD image Grafana dashboards.
>   - CephFS: Dirs and Caps displayed.
>   - NFS: v4 support only (v3 backward compatibility planned).
>   - Front-end: Angular 10 update.
>
>
> RADOS
> ~
>
> * Pacific introduces RocksDB sharding, which reduces disk space requirements.
>
> * Ceph now provides QoS between client I/O and background operations via the
>   mclock scheduler.
>
> * The balancer is now on by default in upmap mode to improve distribution of
>   PGs across OSDs.
>
> * The output of `ceph -s` has been improved to show recovery progress in
>   one progress bar. More detailed progress bars are visible via the
>   `ceph progress` command.
>
>
> RBD block storage
> ~
>
> * Image liv

[ceph-users] Re: RGW failed to start after upgrade to pacific

2021-04-05 Thread Martin Verges
Hello,

we see the same problems. Deleting all the RGW pools and redeploying RGW
solved it on that test cluster (the rough commands are below the log);
however, that is no solution for production ;)

systemd[1]: Started Ceph rados gateway.
radosgw[7171]: 2021-04-04T14:37:51.508+ 7fc6641efc00  0 deferred set
uid:gid to 167:167 (ceph:ceph)
radosgw[7171]: failed to chown /dev/null: (30) Read-only file system
radosgw[7171]: 2021-04-04T14:37:51.508+ 7fc6641efc00  0 ceph version
16.2.0-31-g5922b2b9c1 (5922b2b9c17f0877f84b0b3f2557ab72a628cbfe) pacific
(stable), process radosgw, pid 7171
radosgw[7171]: 2021-04-04T14:37:51.508+ 7fc6641efc00  0 framework:
beast
radosgw[7171]: 2021-04-04T14:37:51.508+ 7fc6641efc00  0 framework conf
key: ssl_port, val: 443
radosgw[7171]: 2021-04-04T14:37:51.508+ 7fc6641efc00  0 framework conf
key: port, val: 80
radosgw[7171]: 2021-04-04T14:37:51.508+ 7fc6641efc00  0 framework conf
key: ssl_certificate, val: /etc/ceph/rgwcert.pem
radosgw[7171]: 2021-04-04T14:37:51.508+ 7fc6641efc00  1 radosgw_Main
not setting numa affinity
radosgw[7171]: 2021-04-04T14:37:51.680+ 7fc6641efc00 -1 static int
rgw::cls::fifo::FIFO::create(librados::v14_2_0::IoCtx,
std::__cxx11::string, std::unique_ptr*,
optional_yield, std::optional,
std::optional >, bool, uint64_t, uint64_t):925
create_meta failed: r=-5
radosgw[7171]: 2021-04-04T14:37:51.680+ 7fc6641efc00 -1 static int
rgw::cls::fifo::FIFO::create(librados::v14_2_0::IoCtx,
std::__cxx11::string, std::unique_ptr*,
optional_yield, std::optional,
std::optional >, bool, uint64_t, uint64_t):925
create_meta failed: r=-5
radosgw[7171]: 2021-04-04T14:37:51.680+ 7fc6641efc00 -1 int
RGWDataChangesLog::start(const RGWZone*, const RGWZoneParams&, RGWSI_Cls*,
librados::v14_2_0::Rados*): Error when starting backend: Input/output error
radosgw[7171]: 2021-04-04T14:37:51.680+ 7fc6641efc00  0 ERROR: failed
to start datalog_rados service ((5) Input/output error
radosgw[7171]: 2021-04-04T14:37:51.680+ 7fc6641efc00 -1 int
RGWDataChangesLog::start(const RGWZone*, const RGWZoneParams&, RGWSI_Cls*,
librados::v14_2_0::Rados*): Error when starting backend: Input/output error
radosgw[7171]: 2021-04-04T14:37:51.680+ 7fc6641efc00  0 ERROR: failed
to init services (ret=(5) Input/output error)
radosgw[7171]: 2021-04-04T14:37:51.700+ 7fc6641efc00 -1 Couldn't init
storage provider (RADOS)
radosgw[7171]: 2021-04-04T14:37:51.700+ 7fc6641efc00 -1 Couldn't init
storage provider (RADOS)
systemd[1]: ceph-rado...@rgw.new-croit-host-C0DE01.service: Main process
exited, code=exited, status=5/NOTINSTALLED
systemd[1]: ceph-rado...@rgw.new-croit-host-C0DE01.service: Unit entered
failed state.
systemd[1]: ceph-rado...@rgw.new-croit-host-C0DE01.service: Failed with
result 'exit-code'.
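
For the record, what we did on that throwaway test cluster was roughly the
following. It is destructive, it wipes all RGW metadata and users, and the
pool and unit names below are just the defaults (they may differ on your
setup), so this is really only an option for a test cluster:

  ceph config set mon mon_allow_pool_delete true
  for p in .rgw.root default.rgw.log default.rgw.control default.rgw.meta; do
      ceph osd pool rm "$p" "$p" --yes-i-really-really-mean-it
  done
  systemctl restart ceph-radosgw@rgw.$(hostname -s).service   # RGW recreates its pools on startup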

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx



On Mon, 5 Apr 2021 at 19:59, Robert Sander 
wrote:

> Hi,
>
> Am 04.04.21 um 15:22 schrieb 胡 玮文:
>
> > bash[9823]: debug 2021-04-04T13:01:04.995+ 7ff80f172440 -1 static
> int rgw::cls::fifo::FIFO::create(librados::v14_2_0::IoCtx,
> std::__cxx11::string, std::unique_ptr*,
> optional_yield, std::optional,
> std::optional >, bool, uint64_t, uint64_t):925
> create_meta failed: r=-5
> > bash[9823]: debug 2021-04-04T13:01:04.995+ 7ff80f172440 -1 int
> RGWDataChangesLog::start(const RGWZone*, const RGWZoneParams&, RGWSI_Cls*,
> librados::v14_2_0::Rados*): Error when starting backend: Input/output error
> > bash[9823]: debug 2021-04-04T13:01:04.995+ 7ff80f172440  0 ERROR:
> failed to start datalog_rados service ((5) Input/output error
> > bash[9823]: debug 2021-04-04T13:01:04.995+ 7ff80f172440  0 ERROR:
> failed to init services (ret=(5) Input/output error)
>
> I see the same issues on an upgraded cluster.
>
> Regards
> --
> Robert Sander
> Heinlein Consulting GmbH
> Schwedter Str. 8/9b, 10119 Berlin
>
> http://www.heinlein-support.de
>
> Tel: 030 / 405051-43
> Fax: 030 / 405051-19
>
> Zwangsangaben lt. §35a GmbHG:
> HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
> Geschäftsführer: Peer Heinlein -- Sitz: Berlin
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Best distro to run ceph.

2021-05-01 Thread Martin Verges
Hello Peter,

if you want to get rid of all these troubles, just use croit. We ship a
pre-built and tested Debian 10 (Buster) OS, or optionally (in beta) an
openSUSE Leap 15.2 based image, with our deployment and management
software.

Btw, all the OS, installation and upgrade hassles are solved with the
free-forever version.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Fri, 30 Apr 2021 at 21:14, Peter Childs  wrote:
>
> I'm trying to set up a new ceph cluster, and I've hit a bit of a blank.
>
> I started off with centos7 and cephadm. Worked fine to a point, except I
> had to upgrade podman but it mostly worked with octopus.
>
> Since this is a fresh cluster and hence no data at risk, I decided to jump
> straight into Pacific when it came out and upgrade. Which is where my
> trouble began. Mostly because Pacific needs a version on lvm later than
> what's in centos7.
>
> I can't upgrade to centos8 as my boot drives are not supported by centos8
> due to the way redhst disabled lots of disk drivers. I think I'm looking at
> Ubuntu or debian.
>
> Given cephadm has a very limited set of depends it would be good to have a
> supported matrix, it would also be good to have a check in cephadm on
> upgrade, that says no I won't upgrade if the version of lvm2 is too low on
> any host and let's the admin fix the issue and try again.
>
> I was thinking to upgrade to centos8 for this project anyway until I
> relised that centos8 can't support my hardware I've inherited. But
> currently I've got a broken cluster unless I can workout some way to
> upgrade lvm in centos7.
>
> Peter.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Why you might want packages not containers for Ceph deployments

2021-06-02 Thread Martin Verges
Hello,

I agree with Matthew. Here at croit we work with containers all day long;
no problem with that, and we have enough knowledge to say for sure it's not
just a matter of getting used to them.
For us and our decisions here, storage is the most valuable piece of IT
equipment in a company. If you have problems with your storage, you most
likely have huge pain: costs, problems, downtime, whatever. Therefore, your
storage solution must be damn simple: you switch it on, it has to work.

Take a short look at the Ceph documentation on how to deploy a cephadm
cluster vs. croit: we strongly believe ours is much easier, as we take away
all the pain from the OS up to Ceph while keeping it simple behind the
scenes. You can still always log in to a node, kill a process, attach
strace or whatever you like, just as you know it from years of Linux
administration, without any complexity layers like docker/podman/... It's
just frictionless. In the end, what do you need? A kernel, an initramfs,
some systemd, a bit of libs and tooling, and the Ceph packages.
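
A trivial example of what that looks like on a package-based node, with
nothing but standard tools (the OSD id is of course just an example):

  systemctl status ceph-osd@12
  journalctl -u ceph-osd@12 --since -1h
  strace -fp "$(pgrep -of ceph-osd)"   # attach to an OSD process directly, no "enter the container" step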

In addition, we help lots of Ceph users on a regular basis with their
hand-made setups, but we don't really want to touch the cephadm ones, as
they are often harder to debug. But of course we do it anyway :).

To have perfect storage, strip away anything unnecessary. Avoid any
complexity, avoid anything that might affect your system. Keep it simple,
stupid.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Wed, 2 Jun 2021 at 11:38, Matthew Vernon  wrote:

> Hi,
>
> In the discussion after the Ceph Month talks yesterday, there was a bit
> of chat about cephadm / containers / packages. IIRC, Sage observed that
> a common reason in the recent user survey for not using cephadm was that
> it only worked on containerised deployments. I think he then went on to
> say that he hadn't heard any compelling reasons why not to use
> containers, and suggested that resistance was essentially a user
> education question[0].
>
> I'd like to suggest, briefly, that:
>
> * containerised deployments are more complex to manage, and this is not
> simply a matter of familiarity
> * reducing the complexity of systems makes admins' lives easier
> * the trade-off of the pros and cons of containers vs packages is not
> obvious, and will depend on deployment needs
> * Ceph users will benefit from both approaches being supported into the
> future
>
> We make extensive use of containers at Sanger, particularly for
> scientific workflows, and also for bundling some web apps (e.g.
> Grafana). We've also looked at a number of container runtimes (Docker,
> singularity, charliecloud). They do have advantages - it's easy to
> distribute a complex userland in a way that will run on (almost) any
> target distribution; rapid "cloud" deployment; some separation (via
> namespaces) of network/users/processes.
>
> For what I think of as a 'boring' Ceph deploy (i.e. install on a set of
> dedicated hardware and then run for a long time), I'm not sure any of
> these benefits are particularly relevant and/or compelling - Ceph
> upstream produce Ubuntu .debs and Canonical (via their Ubuntu Cloud
> Archive) provide .debs of a couple of different Ceph releases per Ubuntu
> LTS - meaning we can easily separate out OS upgrade from Ceph upgrade.
> And upgrading the Ceph packages _doesn't_ restart the daemons[1],
> meaning that we maintain control over restart order during an upgrade.
> And while we might briefly install packages from a PPA or similar to
> test a bugfix, we roll those (test-)cluster-wide, rather than trying to
> run a mixed set of versions on a single cluster - and I understand this
> single-version approach is best practice.
>
> Deployment via containers does bring complexity; some examples we've
> found at Sanger (not all Ceph-related, which we run from packages):
>
> * you now have 2 process supervision points - dockerd and systemd
> * docker updates (via distribution unattended-upgrades) have an
> unfortunate habit of rudely restarting everything
> * docker squats on a chunk of RFC 1918 space (and telling it not to can
> be a bore), which coincides with our internal network...
> * there is more friction if you need to look inside containers
> (particularly if you have a lot running on a host and are trying to find
> out what's going on)
> * you typically need to be root to build docker containers (unlike
> packages)
> * we already have package deployment infrastructure (which we'll need
> regardl

[ceph-users] Re: Rolling upgrade model to new OS

2021-06-04 Thread Martin Verges
Hello Drew,

our whole deployment and management solution is built on simply replacing
the OS whenever there is an update. We at croit.io even provide Debian- and
SUSE-based OS images, and you can switch between them per host at any time.
No problem.

Just go and reinstall a node, install Ceph, and the services will come up
without a problem as long as you have all the configs in place.
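
For an OSD node that usually boils down to something like the following
(assuming the data disks are left untouched and ceph.conf plus the keyrings
are restored from a backup or copied from another node):

  apt install ceph                 # or dnf/zypper, depending on the distro
  scp other-node:/etc/ceph/ceph.conf /etc/ceph/
  scp other-node:/var/lib/ceph/bootstrap-osd/ceph.keyring /var/lib/ceph/bootstrap-osd/
  ceph-volume lvm activate --all   # finds the OSD LVs and starts the ceph-osd@N units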

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Fri, 4 Jun 2021 at 14:56, Drew Weaver  wrote:

> Hello,
>
> I need to upgrade the OS that our Ceph cluster is running on to support
> new versions of Ceph.
>
> Has anyone devised a model for how you handle this?
>
> Do you just:
>
> Install some new nodes with the new OS
> Install the old version of Ceph on the new nodes
> Add those nodes/osds to the cluster
> Remove the old nodes
> Upgrade Ceph on the new nodes
>
> Are there any specific OS that Ceph has said that will have longer future
> version support? Would like to only touch the OS every 3-4 years if
> possible.
>
> Thanks,
> -Drew
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Connect ceph to proxmox

2021-06-05 Thread Martin Verges
Hello Szabo,

you can try it with our docs at
https://croit.io/docs/master/hypervisors/proxmox; maybe they help you
connect your Ceph cluster to Proxmox.
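
Roughly, for an external cluster it comes down to something like this on
the Proxmox side (storage ID, pool name and monitor addresses are
placeholders; the keyring file name has to match the storage ID):

  mkdir -p /etc/pve/priv/ceph
  scp ceph-mon1:/etc/ceph/ceph.client.admin.keyring /etc/pve/priv/ceph/ext-ceph.keyring
  pvesm add rbd ext-ceph --monhost "10.0.0.1 10.0.0.2 10.0.0.3" \
      --pool rbd-vms --content images --username admin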

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Sat, 5 Jun 2021 at 04:20, Szabo, Istvan (Agoda) 
wrote:

> Hi,
>
> Is there a way to connect from my nautilus ceph setup the pool that I
> created in ceph to proxmox? Or need a totally different ceph install?
>
> Istvan Szabo
> Senior Infrastructure Engineer
> ---
> Agoda Services Co., Ltd.
> e: istvan.sz...@agoda.com<mailto:istvan.sz...@agoda.com>
> ---
>
> 
> This message is confidential and is for the sole use of the intended
> recipient(s). It may also be privileged or otherwise protected by copyright
> or other legal rules. If you have received it by mistake please let us know
> by reply email and delete it from your system. It is prohibited to copy
> this message or disclose its content to anyone. Any confidentiality or
> privilege is not waived or lost by any mistaken delivery or unauthorized
> disclosure of the message. All messages sent to and from Agoda may be
> monitored to ensure compliance with company policies, to protect the
> company's interests and to remove potential malware. Electronic messages
> may be intercepted, amended, lost or deleted, or contain viruses.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Why you might want packages not containers for Ceph deployments

2021-06-19 Thread Martin Verges
Hello Sage,

> ...I think that part of this comes down to a learning curve...
> ...cephadm represent two of the most successful efforts to address
usability...

Somehow it does not look right to me.

There is much more to operating a Ceph cluster than just deploying
software. Of course that helps in the short run, so that people don't leave
the train right when they start their Ceph journey. But the harder part is
what to do when shit hits the fan, your cluster is down due to some issue,
and additional layers of complexity kick in and bite you. Just saying that
day-2 ops is much more important than getting a cluster up and running. In
my belief, no admin wants to dig around in containers and other
abstractions when the single most important part of the whole IT
infrastructure stops working. But that's just my thought, maybe I'm wrong.

In my opinion, the best possible way to run IT software is KISS, keep it
simple and stupid. No additional layers, no abstractions of abstractions,
and good error messages.

For example the docker topic here looks like something that can be
showcased:
> Question: If it uses docker and docker daemon fails what happens to you
containers?
> Answer: This is an obnoxious feature of docker

As you can see, you need a lot of knowledge about the abstraction layers to
operate them well. Docker for example provides so-called live-restore (
https://docs.docker.com/config/containers/live-restore/), which allows you
to stop the daemon without killing your containers. This enables you to
update the Docker daemon without downtime, but you have to know about it
and of course enable it. This can make operating a Ceph cluster harder, not
easier.
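
For completeness, enabling it is just a tiny config change, but you have to
know it exists (path and reload behaviour as documented by Docker; merge
the setting into an existing daemon.json by hand if you already have one):

cat > /etc/docker/daemon.json <<'EOF'
{
  "live-restore": true
}
EOF
systemctl reload docker   # SIGHUP re-reads the config; running containers keep running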

What about more sophisticated topics, for example performance? Ceph is
already not a fast storage solution, with way too high latency. Does it
help to add containers instead of going more directly to the hardware and
reducing overhead? Of course you can run SPDK and/or DPDK inside
containers, but does that make it better, faster or even easier? If you
need high-performance storage today, you can turn to open source
alternatives that are massively cheaper per IO and only minimally more
expensive per GB. I therefore believe that stripping out overhead is also
an important topic for the future of Ceph.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Fri, 18 Jun 2021 at 20:43, Sage Weil  wrote:

> Following up with some general comments on the main container
> downsides and on the upsides that led us down this path in the first
> place.
>
> Aside from a few minor misunderstandings, it seems like most of the
> objections to containers boil down to a few major points:
>
> > Containers are more complicated than packages, making debugging harder.
>
> I think that part of this comes down to a learning curve and some
> semi-arbitrary changes to get used to (e.g., systemd unit name has
> changed; logs now in /var/log/ceph/$fsid instead of /var/log/ceph).
> Another part of these changes are real hoops to jump through: to
> inspect process(es) inside a container you have to `cephadm enter
> --name ...`; ceph CLI may not be automatically installed on every
> host; stracing or finding coredumps requires extra steps. We're
> continuing to improve the tools etc so please call these things out as
> you see them!
>
> > Security (50 containers -> 50 versions of openssl to patch)
>
> This feels like the most tangible critique.  It's a tradeoff.  We have
> had so many bugs over the years due to varying versions of our
> dependencies that containers feel like a huge win: we can finally test
> and distribute something that we know won't break due to some random
> library on some random distro.  But it means the Ceph team is on the
> hook for rebuilding our containers when the libraries inside the
> container need to be patched.
>
> On the flip side, cephadm's use of containers offer some huge wins:
>
> - Package installation hell is gone.  Previously, ceph-deploy and
> ceph-ansible had thousands of lines of code to deal with the myriad
> ways that packages could be installed and where they could be
> published.  With containers, this now boils down to a single string,
> which is usually just something like "ceph/ceph:v16".  We're grown a
> handful of complexity there to let you log into private registries,
> but otherwise things are so much simpler.  Not to mention what happens
> when package dependencies break.
> - Upgrades/downgrades can be carefully orchestrated. With packages,
> the version change is by host, with a limbo period (and oc

[ceph-users] Re: Why you might want packages not containers for Ceph deployments

2021-06-22 Thread Martin Verges
> There is no "should be", there is no one answer to that, other than 42.
Containers have been there before Docker, but Docker made them popular,
exactly for the same reason as why Ceph wants to use them: ship a known
good version (CI tests) of the software with all dependencies, that can be
run "as is" on any supported platform.

So you ship it tested for container runtime XXX and run it on YYY. How will
that benefit me as a user? There are differences between running a Docker
container, LXC, nspawn, podman, Kubernetes and whatever else. So you trade
error A for error B. There are even problems with containers if you don't
use version X of Docker. That's what the past has told us; why should it be
better in the future with even more container environments? Have you tried
running Rancher on Debian in the past? It breaks apart due to iptables or
other stuff.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Tue, 22 Jun 2021 at 17:53, Stefan Kooman  wrote:

> On 6/21/21 7:37 PM, Marc wrote:
>
>
> >
> > I have seen no arguments why to use containers other than to try and
> make it "easier" for new ceph people.
>
> I advise to read the whole thread again, especially Sage his comments,
> as there are other benefits. It would free up resources that can be
> dedicated to (arguably) more pressing issues.
>
> Containers are not being used as they should be.
>
> There is no "should be", there is no one answer to that, other than 42.
> Containers have been there before Docker, but Docker made them popular,
> exactly for the same reason as why Ceph wants to use them: ship a known
> good version (CI tests) of the software with all dependencies, that can
> be run "as is" on any supported platform.
>
> Gr. Stefan
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph with BGP?

2021-07-05 Thread Martin Verges
Hello,

> This is not easy to answer without all the details. But for sure there
are cluster running with BGP in the field just fine.

Out of curiosity, is there someone here who has their Ceph cluster running
with BGP in production?
As far as I remember, here at croit, with multiple hundreds of supported
clusters, we have never encountered a BGP deployment in the field. It's
always only in theoretical discussions or test setups that we hear about
BGP.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Tue, 6 Jul 2021 at 07:11, Stefan Kooman  wrote:

> On 7/5/21 6:26 PM, German Anders wrote:
> > Hi All,
> >
> > I have an already created and functional ceph cluster (latest
> luminous
> > release) with two networks one for the public (layer 2+3) and the other
> for
> > the cluster, the public one uses VLAN and its 10GbE and the other one
> uses
> > Infiniband with 56Gb/s, the cluster works ok. The public network uses
> > Juniper QFX5100 switches with VLAN in layer2+3 configuration but the
> > network team needs to move to a full layer3 and they want to use BGP, so
> > the question is, how can we move to that schema? What are the
> > considerations? Is it possible? Is there any step-by-step way to move to
> > that schema? Also is anything better than BGP or other alternatives?
>
> Ceph doesn't care at all. Just as long as the nodes can communicate to
> each other, it's fine. It depends on your failure domains how easy you
> can move to this L3 model. Do you have separate datacenters that you can
> do one by one, or separate racks?
>
> And you can do BGP on different levels: router, top of rack switches, or
> even on the Ceph host itselfs (FRR).
>
> We use BGP / VXLAN / EVPN for our Ceph cluster. But it all depends on
> why your networking teams wants to change to L3, and why.
>
> There are no step by step guides, as most deployments are unique.
>
> This might be a good time to reconsider a separate cluster network.
> Normally there is no need for that, and might make things simpler.
>
> Do you have separate storage switches? Whre are your clients connected
> to (separate switches or connected to storage switches as well).
>
> This is not easy to answer without all the details. But for sure there
> are cluster running with BGP in the field just fine.
>
> Gr. Stefan
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to make CephFS a tiered file system?

2021-07-20 Thread Martin Verges
Hello Samuel,

you can use the docs from
https://docs.ceph.com/en/latest/cephfs/file-layouts/ to find out how to put
a file on a different pool. Example: "setfattr -n ceph.dir.layout.pool -v
poolname foldername"

Ceph itself unfortunately does not provide tiering out of the box. You
therefore have to write a script that scans your storage and migrates
files. This will cause a lot of IO and CPU/RAM load on the MDS, so make
sure it won't affect your regular operations.

Pseudo code like:

for each file in cephfs do
    if file.date <  then    // or whatever threshold you want
        copy file to a new temporary file in the target pool
        remove old file
        rename temp file to old file location
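
In shell terms such a loop could look roughly like this (untested sketch;
mount point, pool name and age threshold are placeholders, and the layout
has to be set while the new file is still empty):

  find /mnt/cephfs -type f -mtime +365 -print0 | while IFS= read -r -d '' f; do
      tmp="$f.migrating.$$"
      touch "$tmp"                                         # create the file empty ...
      setfattr -n ceph.file.layout.pool -v ec-hdd "$tmp"   # ... so its layout can still be changed
      cp -p "$f" "$tmp" && mv "$tmp" "$f"                  # copy the data, then replace the original
  done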

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Mon, 19 Jul 2021 at 23:28, huxia...@horebdata.cn 
wrote:

> Dear Cepher,
>
> I have a requirement to use CephFS as a tiered file system, i.e. the data
> will be first stored onto an all-flash pool (using SSD OSDs), and then
> automatically moved to an EC coded pool (using HDD OSDs) according to
> threshold on file creation time (or access time). The reason for such a
> file system is due to the fact that, files are created and most likely
> accessed within the first 6 months or 1 year, and after that period, those
> files have much less chance to be accessed and thus could be moved to a
> slower and cheap pool.
>
> Does CephFS already support such a tiered feature? and if yes, how to
> implement such feature with a pool of all SSD pool and a pool of EC-coded
> HDD pool?
>
> Any suggestion, ideas, comments are highly appreciated,
>
> best regards,
>
> samuel
>
>
>
> huxia...@horebdata.cn
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Migrating CEPH OS looking for suggestions

2021-09-30 Thread Martin Verges
Just PXE boot whatever OS you like at the time. If you need to switch to
another one, a reboot is enough to change the OS. It's even possible
without containers, so absolutely no problem at all.
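
For the network-boot part, a plain dnsmasq is already enough to get started
(a minimal sketch; subnet, paths and boot file are just examples):

  # /etc/dnsmasq.conf
  dhcp-range=192.168.10.50,192.168.10.150,12h
  enable-tftp
  tftp-root=/srv/tftp
  dhcp-boot=pxelinux.0   # or a GRUB/iPXE EFI binary for UEFI machines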

--
Martin Verges
Managing director

Mobile: +49 174 9335695  | Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


On Thu, 30 Sept 2021 at 15:58, Stefan Kooman  wrote:

> On 9/30/21 14:48, Drew Weaver wrote:
> > Hi,
> >
> > I am going to migrate our ceph cluster to a new OS and I am trying to
> choose the right one so that I won't have to replace it again when python4
> becomes a requirement mid-cycle [or whatever].
> >
> > Has anyone seen any recommendations from the devs as to what distro they
> are targeting for lets say the next 5 years?
>
> CentOS 8 stream should be supported to may 31, 2024. Ubuntu 20.04 should
> bring you to april 2025. I don't think Ubuntu 20.04 will be targeted for
> a Ceph version released in 2026.
>
> If, in some point in time, you want to decouple host OS from Ceph, a
> containerized approach can be used (cephadm). You will just use whatever
> container is available / preferred at that time. With extended
> maintenance, you might be able to leave host OS running Ubuntu 20.04 for
> ~ 8.5 years from now. As long as requirements regarding container infra
> (docker / podman, etc.) don't change in the mean time ...
>
> It also depends to what Ceph version you want to upgrade. You can wait
> for new Ubuntu version 22.04, and newest Ceph release Quincy and manage
> a 5 year period.
>
> Gr. Stefan
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RFP for arm64 test nodes

2021-10-11 Thread Martin Verges
Hello Dan,

why not use somewhat bigger machines and run the tests in VMs? We have
quite good experience with that and it works like a charm. If you plan them
as hypervisors, you can run a lot of tests simultaneously. Use the 80-core
ARM parts, put 512GB or more in them and use some good NVMe like the P55XX
or so. In addition, put 2x 25GbE/40GbE in the servers and you only need a
few of them to simulate a lot. This would save costs, make maintenance
easier, and give you much more flexibility, for example for running tests
on different OSes, injecting latency, simulating errors and more.

--
Martin Verges
Managing director

Mobile: +49 174 9335695  | Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


On Sat, 9 Oct 2021 at 01:25, Dan Mick  wrote:

> Ceph has been completely ported to build and run on ARM hardware
> (architecture arm64/aarch64), but we're unable to test it due to lack of
> hardware.  We propose to purchase a significant number of ARM servers
> (50+?) to install in our upstream Sepia test lab to use for upstream
> testing of Ceph, alongside the x86 hardware we already own.
>
> This message is to start a discussion of what the nature of that
> hardware should be, and an investigation as to what's available and how
> much it might cost.  The general idea is to build something arm64-based
> that is similar to the smithi/gibba nodes:
>
> https://wiki.sepia.ceph.com/doku.php?id=hardware:gibba
>
> Some suggested features:
>
> * base hardware/peripheral support for current releases of RHEL, CentOS,
> Ubuntu
> * 1 fast and largish (400GB+) NVME drive for OSDs (it will be
> partitioned into 4-5 subdrives for tests)
> * 1 large (1TB+) SSD/HDD for boot/system and logs (faster is better but
> not as crucial as for cluster storage)
> * Remote/headless management (IPMI?)
> * At least 1 10G network interface per host
> * Order of 64GB main memory per host
>
> Density is valuable to the lab; we have space but not an unlimited amount.
>
> Any suggestions on vendors or specific server configurations?
>
> Thanks!
>
> ___
> Dev mailing list -- d...@ceph.io
> To unsubscribe send an email to dev-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Stretch cluster experiences in production?

2021-10-18 Thread Martin Verges
Hello Matthew,

building stretch clusters is not a big deal. They work quite well and
stably as long as you have your network under control. The network is the
most error-prone part of a stretch cluster, but it can easily be handled
when you choose a good vendor and good network gear.

For 3 data centers, make sure to have a dark-fiber interconnect and avoid
things like managed Ethernet. Build a ring out of them using overlay
network technologies like EVPN (BGP+ECMP+VXLAN) and keep all network paths
identical and active. This provides a stable, highly available network and
in addition avoids different packet round-trip times through your network.
Once the storage backbone is capable of in-service upgrades and
downtime-free operation, just configure a CRUSH rule that spans the 3 data
centers with the correct host/OSD selection; a sketch follows below. Don't
forget to place your MONs and services in all of these data centers as
well.
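
A rough sketch of the CRUSH side (bucket, rule and pool names are made up,
and the hosts still need to be moved under the right datacenter buckets):

  ceph osd crush add-bucket dc-a datacenter
  ceph osd crush add-bucket dc-b datacenter
  ceph osd crush add-bucket dc-c datacenter
  ceph osd crush move dc-a root=default           # same for dc-b and dc-c
  ceph osd crush move host01 datacenter=dc-a      # repeat for every host
  ceph osd crush rule create-replicated rep-3dc default datacenter
  ceph osd pool set mypool crush_rule rep-3dc
  ceph osd pool set mypool size 3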

As additional tuning, your CRUSH rule can reflect a primary data center.
So if all your workload is in DC-A, you can configure it so that the
primary OSD of every PG is in this DC. This way your read access is always
local, which reduces network congestion. In addition, your writes will be a
little bit faster as well.

We have quite some experience with that and can be of help if you need more
details and vendor suggestions.

--
Martin Verges
Managing director

Mobile: +49 174 9335695  | Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


On Fri, 15 Oct 2021 at 17:22, Matthew Vernon  wrote:

> Hi,
>
> Stretch clusters[0] are new in Pacific; does anyone have experience of
> using one in production?
>
> I ask because I'm thinking about new RGW cluster (split across two main
> DCs), which I would naturally be doing using RGW multi-site between two
> clusters.
>
> But it strikes me that a stretch cluster might be simpler (multi-site
> RGW isn't entirely straightforward e.g. round resharding), and 2 copies
> per site is quite a bit less storage than 3 per site. But I'm not sure
> if this new feature is considered production-deployment-ready
>
> Also, if I'm using RGWs, will they do the right thing location-wise?
> i.e. DC A RGWs will talk to DC A OSDs wherever possible?
>
> Thanks,
>
> Matthew
>
> [0] https://docs.ceph.com/en/latest/rados/operations/stretch-mode/
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Which verison of ceph is better

2021-10-18 Thread Martin Verges
Use pacific for new deployments.

--
Martin Verges
Managing director

Mobile: +49 174 9335695  | Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


On Tue, 19 Oct 2021 at 04:10, norman.kern  wrote:

> Hi guys,
>
> I have had a long holiday since this summer. I came back to set up a new
> ceph server, and I want to know which stable version of ceph you're using
> for production?
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Open discussing: Designing 50GB/s CephFS or S3 ceph cluster

2021-10-21 Thread Martin Verges
Hello,

if you chose Seagate MACH.2 2X14 drives, you would get much better
throughput as well as density. Your RAM could be a bit on the low end, and
for the MACH.2 it would definitely be too low.

You need dedicated metadata drives for S3 or the MDS as well. Choose
blazing-fast, low-capacity NVMe and put them in each server.
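
Wiring that up usually looks roughly like this (assuming the NVMe OSDs
carry the "nvme" device class; profile, rule and pool names are made up):

  ceph osd erasure-code-profile set ec83 k=8 m=3 crush-failure-domain=host
  ceph osd crush rule create-replicated rep-nvme default host nvme
  ceph osd pool create cephfs_metadata 256 256 replicated rep-nvme
  ceph osd pool create cephfs_data 4096 4096 erasure ec83
  ceph osd pool set cephfs_data allow_ec_overwrites true   # required before CephFS can write to the EC pool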

> How many nodes should be deployed in order to achieve a minimum of
50GB/s, if possible, with the above hardware setting?
About 50 nodes should be able to deliver it, but it strongly depends on
many more factors.
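
A very rough back-of-envelope, only to show the order of magnitude
(assuming roughly 150 MB/s sustained per HDD and ignoring metadata,
recovery and small-IO overhead):

  per-node raw HDD write:       36 x 150 MB/s      ~ 5.4 GB/s
  client write after EC 8+3:    5.4 GB/s x 8/11    ~ 3.9 GB/s
  per-node network (2x 25GbE):  ~6 GB/s, shared between client and EC chunk traffic
  realistic client write/node:  ~2-3 GB/s
  nodes for 50 GB/s:            50 / ~2.5 ~ 20 in theory; derate for DB/WAL,
                                scrubs, recovery and tail latency and you
                                quickly land in the 40-50+ node range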

> How many Cephfs MDS are required? (suppose 1MB request size), and how
many clients are needed for reach a total of 50GB/s?
The MDS needs to be scaled more with the number of files than with the
request rate. Of course, the more writes you want to do, the more load they
get as well. Just colocate them on the servers and you are free to scale
the number of active MDS daemons to your liking.

> From the perspective of getting the maximum bandwidth, which one should i
choose, CephFS or Ceph S3?
Choose what's best for your application / use case scenario.

--
Martin Verges
Managing director

Mobile: +49 174 9335695  | Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


On Thu, 21 Oct 2021 at 18:24, huxia...@horebdata.cn 
wrote:

> Dear Cephers,
>
> I am thinking of designing a cephfs or S3 cluster, with a target to
> achieve a minimum of 50GB/s (write) bandwidth. For each node, I prefer 4U
> 36x 3.5" Supermicro server with 36x 12TB 7200K RPM HDDs, 2x Intel P4610
> 1.6TB NVMe SSD as DB/WAL, a single CPU socket with AMD 7302, and 256GB DDR4
> memory. Each node comes with 2x 25Gb networking in mode 4 bonded. 8+3 EC
> will be used.
>
> My questions are the following:
>
> 1   How many nodes should be deployed in order to achieve a minimum of
> 50GB/s, if possible, with the above hardware setting?
>
> 2   How many Cephfs MDS are required? (suppose 1MB request size), and how
> many clients are needed for reach a total of 50GB/s?
>
> 3   From the perspective of getting the maximum bandwidth, which one
> should i choose, CephFS or Ceph S3?
>
> Any comments, suggestions, or improvement tips are warmly welcome
>
> best regards,
>
> Samuel
>
>
>
> huxia...@horebdata.cn
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Why you might want packages not containers for Ceph deployments

2021-11-17 Thread Martin Verges
Just as a friendly reminder:

1) No one prevents you from hiring developers to work on Ceph in a way you
like.
2) I personally dislike the current release cycle and would like to change
that a bit.
3) There is a reason companies like ours prevent users from running the
latest release in production: we tag such releases as nightly and don't
roll them out to most environments until we or others have found enough
bugs, or until we are sure that the release is stable enough for
production. This sometimes means ignoring releases, and we still have not
made Pacific the default for new installations after a very long time. So
every new croit deployment is by default (without changing it) still on
Octopus instead of Pacific, and as you can see from the changelog, for good
reason.

I would strongly suggest not running something like a Luminous or Mimic
release anymore; not only is it old, it also lacks a lot of debugging
functionality, which makes it hard for support teams like ours to help
users fix a cluster. A 5-10 year LTS is therefore something I would not
recommend; in my personal opinion that is a requirement from the past.
However, a timespan of 3-5 years is something the Ceph community should try
to achieve, and we at croit would happily support that.

--
Martin Verges
Managing director

Mobile: +49 174 9335695  | Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


On Wed, 17 Nov 2021 at 16:00, Marc  wrote:

> > >
> > > But since when do developers decide? Do you know any factory where
> > factory workers decide what product they are going to make and not the
> > product management???
> >
> > You might want to check out [1] and [2]. There are different
> > stakeholders with different interests. All these parties have the
> > possibility to influence the project, on all different levels and areas.
> > So it's not product management.
> >
> > IT is becoming such a refuge for undetected unprofessionals.
> >
> > So who do you target here? The ceph devs that try to make the product
> > better every day?
> >
>
> The target is, decision making based on "I do not want to need implement
> something, I want to move to a new version of python where there is a
> default library I can include" or "I need this functionality but there is
> no package available for centos7 so lets drop centos7 and only support
> centos8" or "but it works on my macos development environment" etc etc.
>
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Why you might want packages not containers for Ceph deployments

2021-11-17 Thread Martin Verges
Hello Dave,

> The potential to lose or lose access to millions of files/objects or
petabytes of data is enough to keep you up at night.
> Many of us out here have become critically dependent on Ceph storage, and
probably most of us can barely afford our production clusters, much less a
test cluster.

Please remember, free software still comes with a price. You cannot expect
someone to work on your individual problem while you stay cheap about your
highly critical data. If your data has value, then you should invest in
ensuring data safety. There are companies out there paying Ceph developers
and fixing bugs, so your problem will be gone as soon as you A) contribute
code yourself or B) pay someone to contribute code.

Don't get me wrong, every dev here should focus on delivering rock-solid
work, and I believe they do, but in the end it's software, and software
will never be free of bugs. Ceph does quite a good job of protecting your
data, and in my personal experience, if you don't do crazy stuff and
execute even crazier commands with "yes-i-really-mean-it", you usually
don't lose data.

> The real point here:  From what I'm reading in this mailing list it
appears that most non-developers are currently afraid to risk an upgrade to
Octopus or Pacific.  If this is an accurate perception then THIS IS THE
ONLY PROBLEM.

Octopus is one of the best releases ever. Our support engineers often
upgrade old, unmaintained installations from some super-old release to
Octopus to get them running again, or to have proper tooling to fix the
issue. But I agree, we at croit are still afraid of pushing our users to
Pacific, as we encounter bugs in our tests. This will change soon, however,
as we believe we are close to a stable enough Pacific release.

--
Martin Verges
Managing director

Mobile: +49 174 9335695  | Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


On Wed, 17 Nov 2021 at 18:41, Dave Hall  wrote:

> Sorry to be a bit edgy, but...
>
> So at least 5 customers that you know of have a test cluster, or do you
> have 5 test clusters?  So 5 test clusters out of how many total Ceph
> clusters worldwide.
>
> Answers like this miss the point.  Ceph is an amazing concept.  That it is
> Open Source makes it more amazing by 10x.  But storage is big, like
> glaciers and tectonic plates.  The potential to lose or lose access to
> millions of files/objects or petabytes of data is enough to keep you up at
> night.
>
> Many of us out here have become critically dependent on Ceph storage, and
> probably most of us can barely afford our production clusters, much less a
> test cluster.
>
> The best I could do right now today for a test cluster would be 3
> Virtualbox VMs with about 10GB of disk each.  Does anybody out there think
> I could find my way past some of the more gnarly O and P issues with this
> as my test cluster?
>
> The real point here:  From what I'm reading in this mailing list it appears
> that most non-developers are currently afraid to risk an upgrade to Octopus
> or Pacific.  If this is an accurate perception then THIS IS THE ONLY
> PROBLEM.
>
> Don't shame the users who are more concerned about stability than fresh
> paint.
>
> -Dave
>
> --
> Dave Hall
> Binghamton University
> kdh...@binghamton.edu
>
> On Wed, Nov 17, 2021 at 11:18 AM Stefan Kooman  wrote:
>
> > On 11/17/21 16:19, Marc wrote:
> > >> The CLT is discussing a more feasible alternative to LTS, namely to
> > >> publish an RC for each point release and involve the user community to
> > >> help test it.
> > >
> > > How many users even have the availability of a 'test cluster'?
> >
> > At least 5 (one physical 3 node). We installed a few of them with the
> > exact same version as when we started prod (luminous 12.2.4 IIRC) and
> > upgraded ever since. Especially for cases where old pieces of metadata
> > might cause issues in the long run (pre jewel blows up in pacific for
> > MDS case). Same for the osd OMAP conversion troubles in pacific.
> > Especially in these cases Ceph testing on real prod might have revealed
> > that. A VM enviroment would be ideal for this. As you could just
> > snapshot state and play back when needed. Ideally MDS / RGW / RBD
> > workloads on them to make sure all use cases are tested.
> >
> > But these cluster have not the same load as prod. Not the same data ...
> > so still stuff might break in special ways. But at least we try to avoid
> > that as much as possible.
> >
> > Gr. Stefan
> > __

[ceph-users] Re: Why you might want packages not containers for Ceph deployments

2021-11-17 Thread Martin Verges
> And it looks like I'll have to accept the move to containers even though
I have serious concerns about operational maintainability due to the
inherent opaqueness of container solutions.

There are still alternative solutions without the need for useless
containers and added complexity. Stay away from that crap and you won't
have a hard time. 😜

We at croit have started our own OBS infrastructure to build packages for
x86_64 and arm64. This should help us maintain packages and avoid the
useless Ceph containers. I can post an update to the user ML when it's
ready for public use.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

Dave Hall  schrieb am Mi., 17. Nov. 2021, 20:05:

>
> On Wed, Nov 17, 2021 at 1:05 PM Martin Verges 
> wrote:
>
>> Hello Dave,
>>
>> > The potential to lose or lose access to millions of files/objects or
>> petabytes of data is enough to keep you up at night.
>> > Many of us out here have become critically dependent on Ceph storage,
>> and probably most of us can barely afford our production clusters, much
>> less a test cluster.
>>
>> Please remember, free software comes still with a price. You can not
>> expect someone to work on your individual problem while being cheap on your
>> highly critical data. If your data has value, then you should invest in
>> ensuring data safety. There are companies out, paying Ceph developers and
>> fixing bugs, so your problem will be gone as soon as you A) contribute code
>> yourself or B) pay someone to contribute code.
>>
>
> It's always tricky when one gets edgy.  I completely agree with your
> statements on free software.
>
> For the record, I don't actually have any Ceph problems right now.  It's
> been pretty smooth sailing since I first set the cluster up (Nautilus on
> Debian with Ceph-Ansible).  Some procedural confusion, but no outages in 18
> months  and expansion from 3 nodes to 12.
>
> So it's not about my pet bug or feature request.  What it is about is
> exactly the undeniable and unavoidable dilemmas of distributed open source
> development.  Ceph is wonderful, but it is incredibly complex all on it's
> own.  It wouldn't be easy even if all of the developers were sitting in the
> same building working for the same company.
>
> Further explanation:  Our Ceph cluster is entirely funded by research
> grants.  We can't just go out and buy a whole second cluster for data
> safety.  We can't go to management and ask for more systems.  We can't even
> get enough paid admins to do what we need to do.  But we also can't allow
> these limitations to impede useful research activities.  So unpaid overtime
> and shoe-string hardware budgets.
>
> We (myself and the researcher I'm supporting) chose Ceph because it is
> readily scalable and because it has redundancy and resiliency built in in
> the form of configurable failure domains, replication and EC pools.  I've
> looked at a lot of the distributed storage solutions out there.  Most,
> including the commercial offerings, don't even come close to Ceph on these
> points.
>
>
>> Don't get me wrong, every dev here should have the focus in providing
>> rock solid work and I believe they do, but in the end it's software, and
>> software never will be free of bugs. Ceph does quite a good job protecting
>> your data, and in my personal experience, if you don't do crazy stuff and
>> execute even crazier commands with "yes-i-really-mean-it", you usually
>> don't lose data.
>>
>
> I believe you that there are a lot of good devs out there doing good
> work.  Complexity is the biggest issue Ceph faces.  This complexity is
> necessary, but it can bite you.
>
> My honest perception of Pacific right now is that something dreadful could
> go wrong in the course of an upgrade to Pacific, even with a sensible
> cluster and a sensible cluster admin.  I wish I could pull some scrap
> hardware together and play out some scenarios, but I don't have the time or
> the hardware.
>
> To be completely straight, I am speaking up today because I want to see
> Ceph succeed and it seems like things are a bit rough right now.  In my
> decades in the business I've seen projects and even whole companies
> collapse because the developers lost touch with, or stopped listening to,
> the users.  I don't want that to happen to Ceph.
>
>
>>
>>

[ceph-users] Re: Why you might want packages not containers for Ceph deployments

2021-11-17 Thread Martin Verges
Docker itself is not the problem, it's super nice. It's just that
cephadm/orch is yet another deployment tool, and yet again not reliable
enough. It's easy to break, and it adds additional errors like the one you
can see in my screenshot. I have a collection of them ;).

We are talking about storage, meant to store data in a reliable way. Not
for days, but for years or longer. The oldest cluster we maintain has been
running for around 8 years by now, with completely replaced hardware, a
Filestore-to-BlueStore migration, lots of bugs and overall a hell of a
ride. But it never lost data, and that's what matters. However, when you
add complexity, you endanger exactly that.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

Hans van den Bogert  schrieb am Mi., 17. Nov. 2021,
23:34:

> On 11/17/21 8:19 PM, Martin Verges wrote:
> > There are still alternative solutions without the need for useless
> > containers and added complexity. Stay away from that crap and you won't
> > have a hard time. 😜
>
> I don't have a problem with the containers *at all*. And with me
> probably a lot of users. But those who don't see the problem are silent
> in this thread.
>
> I love cephadm, finally spinning up a cluster and doing upgrades has
> become a lot less tedious.
>
> A positive sound was in due order here.
>
> Signing out again!
>
> Hans
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: how many developers are working on ceph?

2021-11-19 Thread Martin Verges
Hello Marc,

3. someone mentioned the option for paid 'bug' fixing, but I have never
heard or seen anything about this here. How would one apply for this?
That would be good, but it can only be done by the companies working on
Ceph. However, I would vote for the Ceph Foundation hiring devs and doing
actual development.

4. Afaik it is suggested to get paid 'hands on' support from companies
active on the mailing list. But is it also possible to get this directly
from the ceph 'organization'?
It's part of the Ceph ecosystem. Ceph is not an organization and has
absolutely no processes, personnel or structure to act as a support
organization. Therefore companies like ours do the support around Ceph.

--
Martin Verges
Managing director

Mobile: +49 174 9335695  | Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


On Fri, 19 Nov 2021 at 11:07, Marc  wrote:

>
> The recent discussions made me wonder about:
>
> 1. how many paid full time developers are currently working on ceph?
> 2. how many hours are contributed by the community?
> 3. someone mentioned the option for paid 'bug' fixing, but I have never
> heard or seen anything about this here. How would one apply for this?
> 4. Afaik it is suggested to get paid 'hands on' support from companies
> active on the mailing list. But is it also possible to get this directly
> from the ceph 'organization'?
>
>
>
>
>
>
>
>
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Is ceph itself a single point of failure?

2021-11-22 Thread Martin Verges
> In my setup size=2 and min_size=1

just don't.

> Real case: host goes down, individual OSDs from other hosts started
consuming >100GB RAM during backfill and get OOM-killed

configuring your cluster in a better way can help
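
To make that a bit more concrete, these are the knobs that usually matter
in such a scenario (values are examples, not recommendations for your
cluster):

  ceph osd pool set <pool> size 3
  ceph osd pool set <pool> min_size 2
  ceph config set osd osd_memory_target 4294967296   # ~4 GiB cache target per OSD
  ceph config set osd osd_max_backfills 1            # throttle backfill concurrency
  ceph config set osd osd_recovery_max_active 1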

There will never be a system so redundant that it has 100% uptime. And as
you can see on a regular basis, even big corporations like Facebook have
outages of their highly redundant systems. But there is a difference
between losing data and being unable to access your data for a short
period. You can design Ceph to be super redundant, to not lose data, and to
keep running without downtime even if one datacenter burns down. But this
all comes with costs, sometimes quite high costs. Often it's cheaper to
live with a short interruption, or to build two separate systems, than to
add more nines of availability to a single one.

--
Martin Verges
Managing director

Mobile: +49 174 9335695  | Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


On Mon, 22 Nov 2021 at 11:40, Marius Leustean  wrote:

> > I do not know what you mean by this, you can tune this with your min size
> and replication. It is hard to believe that exactly harddrives fail in the
> same pg. I wonder if this is not more related to your 'non-default' config?
>
> In my setup size=2 and min_size=1. I had cases when 1 PG being stuck in
> peering state was causing all the VMs in that pool to not get any I/O. My
> setup is really "default", deployed with minimal config changes derived
> from ceph-ansible and with even number of OSDs per host.
>
> > That is also very hard to believe, since I am updating ceph and reboot
> one node at time, which is just going fine.
>
> Real case: host goes down, individual OSDs from other hosts started
> consuming >100GB RAM during backfill and get OOM-killed (but hey,
> documentation says that "provisioning ~8GB per BlueStore OSD is advised.")
>
> > If you would read and investigate, you would not need to ask this
> question.
>
> I was thinking of getting insights on other people's environments, thus
> asking questions :)
>
> > Is your lack of knowledge of ceph maybe a critical issue?
>
> I'm just that poor guy reading and understanding the official documentation
> and lists, but getting hit by the real world ceph.
>
> On Mon, Nov 22, 2021 at 12:23 PM Marc  wrote:
>
> > >
> > > Many of us deploy ceph as a solution to storage high-availability.
> > >
> > > During the time, I've encountered a couple of moments when ceph refused
> > > to
> > > deliver I/O to VMs even when a tiny part of the PGs were stuck in
> > > non-active states due to challenges on the OSDs.
> >
> > I do not know what you mean by this, you can tune this with your min size
> > and replication. It is hard to believe that exactly harddrives fail in
> the
> > same pg. I wonder if this is not more related to your 'non-default'
> config?
> >
> > > So I found myself in very unpleasant situations when an entire cluster
> > > went
> > > down because of 1 single node, even if that cluster was supposed to be
> > > fault-tolerant.
> >
> > That is also very hard to believe, since I am updating ceph and reboot
> one
> > node at time, which is just going fine.
> >
> > >
> > > Regardless of the reason, the cluster itself can be a single point of
> > > failure, even if it's has a lot of nodes.
> >
> > Indeed, like the data center, and like the planet. The question you
> should
> > ask yourself, do you have a better alternative? For the 3-4 years I have
> > been using ceph, I did not find a better alternative (also not looking
> for
> > it ;))
> >
> > > How do you segment your deployments so that your business doesn't
> > > get jeopardised in the case when your ceph cluster misbehaves?
> > >
> > > Does anyone even use ceph for a very large clusters, or do you prefer
> to
> > > separate everything into smaller clusters?
> >
> > If you would read and investigate, you would not need to ask this
> > question.
> > Is your lack of knowledge of ceph maybe a critical issue? I know the ceph
> > organization likes to make everything as simple as possible for everyone.
> > But this has of course its flip side when users run into serious issues.
> >
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: SATA SSD recommendations.

2021-11-22 Thread Martin Verges
As the price for SSDs is the same regardless of the interface, I would not
invest so much money in a still slow and outdated platform.
Just buy some new chassis as well and go NVMe. It adds only a little cost
but will increase performance drastically.

--
Martin Verges
Managing director

Mobile: +49 174 9335695  | Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


On Mon, 22 Nov 2021 at 13:57, Luke Hall  wrote:

> Hello,
>
> We are looking to replace the 36 aging 4TB HDDs in our 6 OSD machines
> with 36x 4TB SATA SSDs.
>
> There's obviously a big range of prices for large SSDs so I would
> appreciate any recommendations of Manufacturer/models to consider/avoid.
>
> I expect the balance to be between
>
> price/performance/durability
>
> Thanks in advance for any advice offered.
>
> Luke
>
> --
> All postal correspondence to:
> The Positive Internet Company, 24 Ganton Street, London. W1F 7QY
>
> *Follow us on Twitter* @posipeople
>
> The Positive Internet Company Limited is registered in England and Wales.
> Registered company number: 3673639. VAT no: 726 7072 28.
> Registered office: Northside House, Mount Pleasant, Barnet, Herts, EN4 9EE.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Experience reducing size 3 to 2 on production cluster?

2021-12-11 Thread Martin Verges
Hello,

avoid size 2 whenever you can. As long as you know that you might lose
data, it can be an acceptable risk while migrating the cluster. We have done
that multiple times in the past and it is a valid use case in our opinion.
However, make sure to monitor the state and recover as fast as possible.
Leave min_size at 2 as well and accept the potential downtime!
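
A rough sketch of the relevant commands (the pool name is just an example,
adjust it to yours):

  # show the current size/min_size of every pool
  ceph osd pool ls detail
  # reduce the replica count of one pool
  ceph osd pool set mypool size 2
  # keep min_size at 2 so you never accept single-copy writes
  ceph osd pool set mypool min_size 2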

--
Martin Verges
Managing director

Mobile: +49 174 9335695  | Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


On Fri, 10 Dec 2021 at 18:05, Marco Pizzolo  wrote:

> Hello,
>
> As part of a migration process where we will be swinging Ceph hosts from
> one cluster to another we need to reduce the size from 3 to 2 in order to
> shrink the footprint sufficiently to allow safe removal of an OSD/Mon node.
>
> The cluster has about 500M objects as per dashboard, and is about 1.5PB in
> size comprised solely of small files served through CephFS to Samba.
>
> Has anyone encountered a similar situation?  What (if any) problems did you
> face?
>
> Ceph 14.2.22 bare metal deployment on Centos.
>
> Thanks in advance.
>
> Marco
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cephadm is stable or not in product?

2022-03-07 Thread Martin Verges
Some say it is, some say it's not.
Every time I try it, it's buggy as hell and I can destroy my test clusters
with ease. That's why I still avoid it. But as you can see in my signature,
I am biased ;).

--
Martin Verges
Managing director

Mobile: +49 174 9335695  | Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


On Tue, 8 Mar 2022 at 05:18, norman.kern  wrote:

> Dear Ceph folks,
>
> Anyone is using cephadm in product(Version: Pacific)? I found several bugs
> on it and
> I really doubt it.
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Migration Nautilus to Pacific : Very high latencies (EC profile)

2022-05-15 Thread Martin Verges
Hello,

what exact EC level do you use?

I can confirm that our internal data shows a performance drop when using
Pacific. So far Octopus is faster and better than Pacific, but I doubt you
can roll back to it. We haven't rerun our benchmarks on Quincy yet, but
according to some presentations it should be faster than Pacific. Maybe try
to jump away from the Pacific release into the unknown!

--
Martin Verges
Managing director

Mobile: +49 174 9335695  | Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


On Sat, 14 May 2022 at 12:27, stéphane chalansonnet 
wrote:

> Hello,
>
> After a successful update from Nautilus to Pacific on Centos8.5, we
> observed some high latencies on our cluster.
>
> We did not find very much thing on community related to latencies post
> migration
>
> Our setup is
> 6x storage Node (256GRAM, 2SSD OSD + 5*6To SATA HDD)
> Erasure coding profile
> We have two EC pool :
> -> Pool1 : Full HDD SAS Drive 6To
> -> Pool2 : Full SSD Drive
>
> Object S3 and RBD block workload
>
> Our performances in nautilus, before the upgrade , are acceptable.
> However , the next day , performance dropped by 3 or 4
> Benchmark showed 15KIOPS on flash drive , before upgrade we had
> almost 80KIOPS
> Also, HDD pool is almost down (too much lantencies
>
> We suspected , maybe, an impact on erasure Coding configuration on Pacific
> Anyone observed the same behaviour ? any tuning ?
>
> Thank you for your help.
>
> ceph osd tree
> ID  CLASS  WEIGHT     TYPE NAME                 STATUS  REWEIGHT  PRI-AFF
> -1         347.61304  root default
> -3          56.71570      host cnp31tcephosd01
>  0  hdd      5.63399          osd.0                 up       1.0      1.0
>  1  hdd      5.63399          osd.1                 up       1.0      1.0
>  2  hdd      5.63399          osd.2                 up       1.0      1.0
>  3  hdd      5.63399          osd.3                 up       1.0      1.0
>  4  hdd      5.63399          osd.4                 up       1.0      1.0
>  5  hdd      5.63399          osd.5                 up       1.0      1.0
>  6  hdd      5.63399          osd.6                 up       1.0      1.0
>  7  hdd      5.63399          osd.7                 up       1.0      1.0
> 40  ssd      5.82190          osd.40                up       1.0      1.0
> 48  ssd      5.82190          osd.48                up       1.0      1.0
> -5          56.71570      host cnp31tcephosd02
>  8  hdd      5.63399          osd.8                 up       1.0      1.0
>  9  hdd      5.63399          osd.9               down       1.0      1.0
> 10  hdd      5.63399          osd.10                up       1.0      1.0
> 11  hdd      5.63399          osd.11                up       1.0      1.0
> 12  hdd      5.63399          osd.12                up       1.0      1.0
> 13  hdd      5.63399          osd.13                up       1.0      1.0
> 14  hdd      5.63399          osd.14                up       1.0      1.0
> 15  hdd      5.63399          osd.15                up       1.0      1.0
> 49  ssd      5.82190          osd.49                up       1.0      1.0
> 50  ssd      5.82190          osd.50                up       1.0      1.0
> -7          56.71570      host cnp31tcephosd03
> 16  hdd      5.63399          osd.16                up       1.0      1.0
> 17  hdd      5.63399          osd.17                up       1.0      1.0
> 18  hdd      5.63399          osd.18                up       1.0      1.0
> 19  hdd      5.63399          osd.19                up       1.0      1.0
> 20  hdd      5.63399          osd.20                up       1.0      1.0
> 21  hdd      5.63399          osd.21                up       1.0      1.0
> 22  hdd      5.63399          osd.22                up       1.0      1.0
> 23  hdd      5.63399          osd.23                up       1.0      1.0
> 51  ssd      5.82190          osd.51                up       1.0      1.0
> 52  ssd      5.82190          osd.52                up       1.0      1.0
> -9          56.71570      host cnp31tcephosd04
> 24  hdd      5.63399          osd.24                up       1.0      1.0
> 25  hdd      5.63399          osd.25                up       1.0      1.0
> 26  hdd      5.63399          osd.26                up       1.0      1.0
> 27  hdd      5.63399          osd.27                up       1.0      1.0
> 28  hdd      5.63399          osd.28                up       1.0      1.0
> 29  hdd      5.63399          osd.29                up       1.0      1.0
> 30  hdd      5.63399          osd.30                up       1

[ceph-users] Re: Migration Nautilus to Pacific : Very high latencies (EC profile)

2022-05-15 Thread Martin Verges
Hello,

depending on your workload, drives and OSD allocation size, using 3+2
can be way slower than 4+2. Maybe run a small benchmark and check whether
you see a big difference. We ran some benchmarks with such profiles and they
showed quite ugly results in some tests. In our findings, the best way to
deploy EC is with k as a power of 2, like 2+x, 4+x, 8+x, 16+x. Especially
with OSDs deployed before the Ceph allocation size change, you might end up
consuming way more space if you don't use a power of 2. With the 4k
allocation size, at least, this has been greatly improved for newly deployed
OSDs.
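
As a minimal sketch of how such a profile is set up (profile name, pool name
and PG count are only examples):

  # 4+2 profile with host as the failure domain
  ceph osd erasure-code-profile set ec-4-2 k=4 m=2 crush-failure-domain=host
  # create an EC pool that uses it
  ceph osd pool create ecpool 128 128 erasure ec-4-2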

--
Martin Verges
Managing director

Mobile: +49 174 9335695  | Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


On Sun, 15 May 2022 at 20:30, stéphane chalansonnet 
wrote:

> Hi,
>
> Thank you for your answer.
> this is not a good news if you also notice a performance decrease on your
> side
> No, as far as we know, you cannot downgrade to Octopus.
> Going forward seems to be the only way, so Quincy .
> We have a a qualification cluster so we can try on it (but full virtual
> configuration)
>
>
> We are using 4+2 and 3+2 profile
> Are you also on the same profile on your Cluster ?
> Maybe replicated profile are not be impacted ?
>
> Actually, we are trying to recreate one by one the OSD.
> some parameters can be only set by this way .
> The first storage Node is almost rebuild, we will see if the latencies on
> it are below the others ...
>
> Wait and see .
>
> Le dim. 15 mai 2022 à 10:16, Martin Verges  a
> écrit :
>
>> Hello,
>>
>> what exact EC level do you use?
>>
>> I can confirm, that our internal data shows a performance drop when using
>> pacific. So far Octopus is faster and better than pacific but I doubt you
>> can roll back to it. We haven't rerun our benchmarks on Quincy yet, but
>> according to some presentation it should be faster than pacific. Maybe try
>> to jump away from the pacific release into the unknown!
>>
>> --
>> Martin Verges
>> Managing director
>>
>> Mobile: +49 174 9335695  | Chat: https://t.me/MartinVerges
>>
>> croit GmbH, Freseniusstr. 31h, 81247 Munich
>> CEO: Martin Verges - VAT-ID: DE310638492
>> Com. register: Amtsgericht Munich HRB 231263
>> Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
>>
>>
>> On Sat, 14 May 2022 at 12:27, stéphane chalansonnet 
>> wrote:
>>
>>> Hello,
>>>
>>> After a successful update from Nautilus to Pacific on Centos8.5, we
>>> observed some high latencies on our cluster.
>>>
>>> We did not find very much thing on community related to latencies post
>>> migration
>>>
>>> Our setup is
>>> 6x storage Node (256GRAM, 2SSD OSD + 5*6To SATA HDD)
>>> Erasure coding profile
>>> We have two EC pool :
>>> -> Pool1 : Full HDD SAS Drive 6To
>>> -> Pool2 : Full SSD Drive
>>>
>>> Object S3 and RBD block workload
>>>
>>> Our performances in nautilus, before the upgrade , are acceptable.
>>> However , the next day , performance dropped by 3 or 4
>>> Benchmark showed 15KIOPS on flash drive , before upgrade we had
>>> almost 80KIOPS
>>> Also, HDD pool is almost down (too much lantencies
>>>
>>> We suspected , maybe, an impact on erasure Coding configuration on
>>> Pacific
>>> Anyone observed the same behaviour ? any tuning ?
>>>
>>> Thank you for your help.
>>>
>>> ceph osd tree
> >> ID  CLASS  WEIGHT     TYPE NAME                 STATUS  REWEIGHT  PRI-AFF
> >> -1         347.61304  root default
> >> -3          56.71570      host cnp31tcephosd01
> >>  0  hdd      5.63399          osd.0                 up       1.0      1.0
> >>  1  hdd      5.63399          osd.1                 up       1.0      1.0
> >>  2  hdd      5.63399          osd.2                 up       1.0      1.0
> >>  3  hdd      5.63399          osd.3                 up       1.0      1.0
> >>  4  hdd      5.63399          osd.4                 up       1.0      1.0
> >>  5  hdd      5.63399          osd.5                 up       1.0      1.0
> >>  6  hdd      5.63399          osd.6                 up       1.0      1.0
> >>  7  hdd      5.63399          osd.7                 up       1.0      1.0
> >> 40  ssd      5.82

[ceph-users] Re: Ceph as a Fileserver for 3D Content Production

2020-05-17 Thread Martin Verges
Hello Moritz,

drop the EVO disk and use an SSD that works well with Ceph. For example, just
use a PM883 / PM983 from the same vendor and you will have a huge performance
increase.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am So., 17. Mai 2020 um 15:12 Uhr schrieb Moritz Wilhelm :

> Hi Marc,
>
> thank you very much for your feedback, actually that is what I am looking
> for (Design advices and feedback). I also wanted to get in touch with the
> community because right now I am on my own with this project with no
> experience at all. But I also wanted to get into it first and learn the
> basics and setup a system before bothering other people with it and to be
> able to even hold a conversation about it.
>
> So we are a 3D Content Production Company with around 10-12 Workstations
> and 6 Render stations. We work with software like Autodesk Maya or Adobe
> After effects. All our Projects are stored on the NAS and we work directly
> with those files on the NAS.
>
> This is our current System:
>
> Synology RS4017xs+
> Intel Xeon D-1541 2,1 Ghz, 8 Cores
> 32GB Ram
> 16x 8TB WD Red Drives
> 2  TB SSD Cache
> 2 x 10 Gb SFP+ Nics
>
>
> About the Bcache I just did tests with it as I found the idea very
> interesting and also performance was better. If this project is ever
> getting into real-life production I will stick to the most common setups
> for sure.
>
> Here are some Results of Diskmark on one Computer. (I Had only 5 OSDs per
> Node up and running at this test because I am still waiting for the new
> Seagate drives) For Sequential Reads and Writes I sometimes maxed out the
> 10Gbe host connection so the data is not very useful but the other numbers
> are quite interesting:
>
> For Comparison:
>
> Hosts Local NVME Drive (Samsung 970 EVO Pro 1TB):
> Sequential 1MiB (Q= 8, T= 1):  Read 3507.700 MB/s [  3345.2 IOPS] <  2389.61 us> | Write 2548.063 MB/s [  2430.0 IOPS] <  3284.79 us>
> Sequential 1MiB (Q= 1, T= 1):  Read 2368.308 MB/s [  2258.6 IOPS] <   442.07 us> | Write 2456.471 MB/s [  2342.7 IOPS] <   426.10 us>
> Random 4KiB     (Q=32, T=16):  Read 1555.565 MB/s [379776.6 IOPS] <  1312.45 us> | Write   16.580 MB/s [  4047.9 IOPS] <124267.88 us>
> Random 4KiB     (Q= 1, T= 1):  Read   51.666 MB/s [ 12613.8 IOPS] <    78.82 us> | Write  108.983 MB/s [ 26607.2 IOPS] <    37.13 us>
>
> Current Synology NAS (SMB):
> Sequential 1MiB (Q= 8, T= 1):  Read 1045.594 MB/s [   997.2 IOPS] <  7990.33 us> | Write 1101.007 MB/s [  1050.0 IOPS] <  7588.25 us>
> Sequential 1MiB (Q= 1, T= 1):  Read  953.709 MB/s [   909.5 IOPS] <  1098.57 us> | Write  847.847 MB/s [   808.6 IOPS] <  1235.26 us>
> Random 4KiB     (Q=32, T=16):  Read    4.198 MB/s [  1024.9 IOPS] <380158.65 us> | Write  188.827 MB/s [ 46100.3 IOPS] < 11076.14 us>
> Random 4KiB     (Q= 1, T= 1):  Read    2.486 MB/s [   606.9 IOPS] <  1637.08 us> | Write    7.177 MB/s [  1752.2 IOPS] <   570.16 us>
>
> Ceph With WAL/DB on NVME total of 5 old SATA HDD OSDs (SMB):
> Sequential 1MiB (Q= 8, T= 1):  Read  534.050 MB/s [   509.3 IOPS] < 15628.63 us> | Write  198.420 MB/s [   189.2 IOPS] < 42020.67 us>
> Sequential 1MiB (Q= 1, T= 1):  Read  340.580 MB/s [   324.8 IOPS] <  2921.17 us> | Write  184.329 MB/s [   175.8 IOPS] <  5603.99 us>
> Random 4KiB     (Q=32, T=16):  Read    3.172 MB/s [   774.4 IOPS] <398622.73 us> | Write    8.639 MB/s [  2109.1 IOPS] <222699.43 us>
> Random 4KiB     (Q= 1, T= 1):  Read    1.907 MB/s [   465.6 IOPS] <  2139.08 us> | Write    7.294 MB/s [  1780.8 IOPS] <   560.91 us>
>
> Ceph With Bcache total of 5 old SATA HDD OSDs (SMB):
> Sequential 1MiB (Q= 8, T= 1):  Read  967.386 MB/s [   922.6 IOPS] <  8660.59 us>   Sequential 1

[ceph-users] Re: Ceph Nautilus not working after setting MTU 9000

2020-05-24 Thread Martin Verges
Just save yourself the trouble. You won't have any real benefit from MTU
9000. It has some small advantages, but they are not worth the effort,
problems, and loss of reliability in most environments.
Try it yourself and do some benchmarks, especially with your regular
workload on the cluster (not the maximum peak performance), then drop the
MTU back to the default ;).

If anyone has other real-world benchmarks showing big differences in
regular Ceph clusters, please feel free to post them here.
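
If you do test it, verify jumbo frames end to end first, roughly like this
(8972 = 9000 minus 28 bytes of IP/ICMP headers; the IP and interface name are
only examples):

  # must succeed between every pair of hosts, otherwise stay at MTU 1500
  ping -M do -s 8972 -c 3 10.0.0.2
  # check the MTU actually configured on the interface
  ip link show dev eth0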

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am So., 24. Mai 2020 um 15:54 Uhr schrieb Suresh Rama :

> Ping with 9000 MTU won't get response as I said and it should be 8972. Glad
> it is working but you should know what happened to avoid this issue later.
>
> On Sun, May 24, 2020, 3:04 AM Amudhan P  wrote:
>
> > No, ping with MTU size 9000 didn't work.
> >
> > On Sun, May 24, 2020 at 12:26 PM Khodayar Doustar 
> > wrote:
> >
> > > Does your ping work or not?
> > >
> > >
> > > On Sun, May 24, 2020 at 6:53 AM Amudhan P  wrote:
> > >
> > >> Yes, I have set setting on the switch side also.
> > >>
> > >> On Sat 23 May, 2020, 6:47 PM Khodayar Doustar, 
> > >> wrote:
> > >>
> > >>> Problem should be with network. When you change MTU it should be
> > changed
> > >>> all over the network, any single hup on your network should speak and
> > >>> accept 9000 MTU packets. you can check it on your hosts with
> "ifconfig"
> > >>> command and there is also equivalent commands for other
> > network/security
> > >>> devices.
> > >>>
> > >>> If you have just one node which it not correctly configured for MTU
> > 9000
> > >>> it wouldn't work.
> > >>>
> > >>> On Sat, May 23, 2020 at 2:30 PM si...@turka.nl 
> wrote:
> > >>>
> > >>>> Can the servers/nodes ping eachother using large packet sizes? I
> guess
> > >>>> not.
> > >>>>
> > >>>> Sinan Polat
> > >>>>
> > >>>> > Op 23 mei 2020 om 14:21 heeft Amudhan P  het
> > >>>> volgende geschreven:
> > >>>> >
> > >>>> > In OSD logs "heartbeat_check: no reply from OSD"
> > >>>> >
> > >>>> >> On Sat, May 23, 2020 at 5:44 PM Amudhan P 
> > >>>> wrote:
> > >>>> >>
> > >>>> >> Hi,
> > >>>> >>
> > >>>> >> I have set Network switch with MTU size 9000 and also in my
> netplan
> > >>>> >> configuration.
> > >>>> >>
> > >>>> >> What else needs to be checked?
> > >>>> >>
> > >>>> >>
> > >>>> >>> On Sat, May 23, 2020 at 3:39 PM Wido den Hollander <
> w...@42on.com
> > >
> > >>>> wrote:
> > >>>> >>>
> > >>>> >>>
> > >>>> >>>
> > >>>> >>>> On 5/23/20 12:02 PM, Amudhan P wrote:
> > >>>> >>>> Hi,
> > >>>> >>>>
> > >>>> >>>> I am using ceph Nautilus in Ubuntu 18.04 working fine wit MTU
> > size
> > >>>> 1500
> > >>>> >>>> (default) recently i tried to update MTU size to 9000.
> > >>>> >>>> After setting Jumbo frame running ceph -s is timing out.
> > >>>> >>>
> > >>>> >>> Ceph can run just fine with an MTU of 9000. But there is
> probably
> > >>>> >>> something else wrong on the network which is causing this.
> > >>>> >>>
> > >>>> >>> Check the Jumbo Frames settings on all the switches as well to
> > make
> > >>>> sure
> > >>>> >>> they forward all the packets.
> > >>>> >>>
> > >>>> >>> This is definitely not a Ceph issue.
> > >>>> >>>
> > >>>> >>> Wido
> > >>>> >>>
> > >>>> >>>>
> > >>>> >>>> regards
> > >>>> >>>> Amudhan P
> > >>>> >>>> ___
> > >>>> >>>> ceph-users mailing list -- ceph-users@ceph.io
> > >>>> >>>> To unsubscribe send an email to ceph-users-le...@ceph.io
> > >>>> >>>>
> > >>>> >>> ___
> > >>>> >>> ceph-users mailing list -- ceph-users@ceph.io
> > >>>> >>> To unsubscribe send an email to ceph-users-le...@ceph.io
> > >>>> >>>
> > >>>> >>
> > >>>> > ___
> > >>>> > ceph-users mailing list -- ceph-users@ceph.io
> > >>>> > To unsubscribe send an email to ceph-users-le...@ceph.io
> > >>>>
> > >>>> ___
> > >>>> ceph-users mailing list -- ceph-users@ceph.io
> > >>>> To unsubscribe send an email to ceph-users-le...@ceph.io
> > >>>>
> > >>>
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: looking for telegram group in English or Chinese

2020-05-26 Thread Martin Verges
Hello,

as I find it a good idea and couldn't find another one, I just created
https://t.me/ceph_users.
Please feel free to join and let's see if we can get this channel started ;)

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Mi., 27. Mai 2020 um 07:07 Uhr schrieb Konstantin Shalygin <
k0...@k0ste.ru>:

> On 5/26/20 1:13 PM, Zhenshi Zhou wrote:
> > Is there any telegram group for communicating with ceph users?
>
> AFAIK there is only Russian (CIS) group [1], but feel free to join with
> English!
>
>
>
>
> [1] https://t.me/ceph_ru
>
> k
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Octopus: orchestrator not working correctly with nfs

2020-06-11 Thread Martin Verges
Hello,

you could use another deployment and management solution to get NFS and
everything else with ease. Take a look at
https://croit.io/docs/croit/master/gateways/nfs.html#services to see how easy
it would be to deploy NFS.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Do., 11. Juni 2020 um 13:20 Uhr schrieb Simon Sutter :

> Hello,
>
>
> You just copied the same message.
>
> I'll make a ticket in the tracker.
>
>
> Regards,
>
> Simon
>
> 
> Von: Amudhan P 
> Gesendet: Donnerstag, 11. Juni 2020 09:32:36
> An: Simon Sutter
> Cc: ceph-users@ceph.io
> Betreff: Re: [ceph-users] Re: Octopus: orchestrator not working correctly
> with nfs
>
> Hi,
>
> I have not worked with orchestrator but I remember I read somewhere that
> NFS implementation is not supported.
>
> Refer Cephadm documentation and for NFS you have configure nfs Ganesha.
>
> You can manage NFS thru dashboard but for that you have initial config in
> dashboard and in nfsganaesha you have refer it.
>
> Regards
> Amudhan
>
> On Thu 11 Jun, 2020, 11:40 AM Simon Sutter,  ssut...@hosttech.ch>> wrote:
> Hello,
>
>
> Did I not provide enough information, or simply nobody knows how to solve
> the problem?
> Should I write to the ceph tracker or does this just produce unnecessary
> overhead?
>
>
> Thanks in advance,
>
> Simon
>
> 
> Von: Simon Sutter mailto:ssut...@hosttech.ch>>
> Gesendet: Montag, 8. Juni 2020 10:56:00
> An: ceph-users@ceph.io<mailto:ceph-users@ceph.io>
> Betreff: [ceph-users] Octopus: orchestrator not working correctly with nfs
>
> Hello
>
>
> I know that nfs on octopus is still a bit under development.
>
> I'm trying to deploy nfs daemons and have some issues with the
> orchestartor.
>
> For the other daemons, for example monitors, I can issue the command "ceph
> orch apply mon 3"
>
> This will tell the orchestrator to deploy or remove monitor daemons until
> there are three of them.
>
> The command does not work with nfs, and now the orchestrator is a bit
> missconfigured...
>
> And with missconfigured I mean, that I have now a nfs daemon on node 1 and
> the orchestrator wants to create another one on node 1 but with wrong
> settings (it fails).
> Also a "ceph orch apply nfs –unconfigured" does not work, so I can't
> manually manage the nfs containers.
>
> Is there a manual way to tell ceph orch, to not create or remove nfs
> daemons? then I would be able to set them up manually.
> Or a manual way of configuring the orchestrator so it does the right thing.
>
>
> Thanks in advance
>
> Simon
> ___
> ceph-users mailing list -- ceph-users@ceph.io<mailto:ceph-users@ceph.io>
> To unsubscribe send an email to ceph-users-le...@ceph.io ceph-users-le...@ceph.io>
> hosttech GmbH | Simon Sutter
> hosttech.ch <https://www.hosttech.ch>
>
> WE LOVE TO HOST YOU.
>
> create your own website!
> more information & online-demo: www.website-creator.ch
> ___
> ceph-users mailing list -- ceph-users@ceph.io<mailto:ceph-users@ceph.io>
> To unsubscribe send an email to ceph-users-le...@ceph.io ceph-users-le...@ceph.io>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph deployment and Managing suite

2020-06-13 Thread Martin Verges
Hello,

take a look at croit.io, we believe we have the most sophisticated Ceph
deployment and management solution.

If something is missing, please let us know.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

Amudhan P  schrieb am Sa., 13. Juni 2020, 14:11:

> Hi,
>
> I am looking for a Software suite to deploy Ceph Storage Node and Gateway
> server (SMB & NFS) and also dashboard Showing entire Cluster status,
> Individual node health, disk identification or maintenance activity,
> network utilization.
> Simple user manageable dashboard.
>
> Please suggest any Paid or Community based you have been using or you
> recommend to others.
>
> regards
> Amudhan
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: advantage separate cluster network on single interface

2020-06-17 Thread Martin Verges
In my opinion, the additional and error-prone work of configuring,
maintaining, and monitoring a separate network outweighs the small
benefits.
In the past, we saw lots of clusters that had reduced availability due to
misconfigured or broken networks. This went so far that we included network
packet loss monitoring in our Ceph management solution to help customers
track down their network issues.

Therefore, choosing a single network strongly increases the reliability and
availability of your cluster.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Di., 16. Juni 2020 um 16:43 Uhr schrieb Marcel Kuiper :

> Hi
>
> I wonder if there is any (theoretical) advantage running a separate
> backend network next to the public network (through vlan separation) over
> a single interface
>
> I googled a lot and while some blogs advice to do so, they do not give any
> argument that supports this statement
>
> Any insights on this is much appreciated
>
> Thanks
>
> Marcel
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: OSD node OS upgrade strategy

2020-06-21 Thread Martin Verges
Hello Jared,

you could use croit to manage the OSDs and drop that work entirely. You would
just need to reboot host by host over the PXE network to migrate.
Besides that, you can freshly install Ubuntu within a running environment
using debootstrap and then just reboot the host. However, that is quite
tricky and not suggested for inexperienced users.

If you don't touch the Ceph disks at all, the services will come up again
without any changes needed on your side. Sometimes it's better to clean up
some old mess and do it the way you currently work.
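
If the OSDs were deployed with ceph-volume lvm, a rough sketch after the fresh
install would be: install the same Ceph packages, restore ceph.conf and the
bootstrap-osd keyring, and then run something like:

  # detect and start all LVM-based OSDs found on the local disks
  ceph-volume lvm activate --all
  # verify they rejoined the cluster
  ceph osd tree

For OSDs created with the older 'simple' method, ceph-volume simple scan /
activate covers the same step.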

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Fr., 19. Juni 2020 um 16:29 Uhr schrieb shubjero :

> Hi all,
>
> I have a 39 node, 1404 spinning disk Ceph Mimic cluster across 6 racks
> for a total of 9.1PiB raw and about 40% utilized. These storage nodes
> started their life on Ubuntu 14.04 and in-place upgraded to 16.04 2
> years ago however I have started a project to do fresh installs of
> each OSD node to Ubuntu 18.04 to keep things fresh and well supported.
> I am reaching out to see what others might suggest in terms of
> strategy to get these hosts updated quicker than my current strategy.
>
> Current strategy:
> 1. Pick 3 nodes, drain them by modifying the crush weight
> 2. Fresh install 18.04 using automation tool (MAAS) + some Ansible
> playbooks to setup server
> 3. Purge OSD node worth of OSD' (this causes data to be 'misplaced'
> due to rack weight changing)
> 4. Run ceph-volume lvm batch for osd node
> 5. Move OSD's in to desired hosts in crush map (large rebalancing to
> fill back up)
>
> If anyone has suggestions on a quicker way to do this I am all ears.
>
> I am wondering if its not necessary to have to drain/fill OSD nodes at
> all and if this can be done with just a fresh install and not touch
> the OSD's however I don't know how to perform a fresh installation and
> then tell ceph that I have OSD's with data on them and to somehow
> re-register them with the cluster? Or is there a better order of
> operations to draining/filling without causing a high amount of
> objects to be misplaced due to manipulating the crush map.
>
> That being said, since our cluster is a bit older and the majority of
> our bluestore osd's are provisioned in the 'simple' method using a
> small metadata partition and the remainder as a raw partition whereas
> now it seems the suggested way is to use the lvm layout and tmpfs.
>
> Anyways, I'm all ears and appreciate any feedback.
>
> Jared Baker
> Ontario Institute for Cancer Research
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Nautilus 14.2.10 mon_warn_on_pool_no_redundancy

2020-06-29 Thread Martin Verges
I agree, please have the check also cover min_size, so that min_size=1 and
size=2 configs are caught, as we have done in our software for our users for
years. It is important and it can prevent lots of issues.
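
A quick sketch of how to spot affected pools today (the pool name is only an
example):

  # shows size and min_size for every pool
  ceph osd pool ls detail
  # raise min_size where it is still 1
  ceph osd pool set mypool min_size 2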

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Mo., 29. Juni 2020 um 15:06 Uhr schrieb Wout van Heeswijk :

> Hi All,
>
> I really like the idea of warning users against using unsafe practices.
>
> Wouldn't it make sense to warn against using min_size=1 instead of size=1.
>
> I've seen data loss happen with size=2 min_size=1 when multiple failures
> occur and write have been done between the failures. Effectively the new
> warning below says "It is not considered safe to run with no
> redundancy". Which is true, but when failure occurs or maintenance is
> executed, with size=2 and min_size=1, as soon as data is written, there
> might not be data redundancy for that newly written data. A failure of
> an OSD at that moment would result in data loss.
>
> Since you cannot run size=1 with min_size > 1, this use-case would also
> be covered.
>
> I understand this has implications for size=2 when executing
> maintenance, but I think most people are not aware of the risks they are
> taking with min_size=1. Those that are aware can suppress the warning.
>
> * Ceph will issue a health warning if a RADOS pool's `size` is set to 1
>or in other words the pool is configured with no redundancy. This can
>be fixed by setting the pool size to the minimum recommended value
>with::
>  ceph osd pool set  size 
>The warning can be silenced with::
>  ceph config set global mon_warn_on_pool_no_redundancy false
>
> --
> kind regards,
>
> Wout
> 42on
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Monitor IPs

2020-07-15 Thread Martin Verges
Hello,

just delete the old one and deploy a new one.
Make sure to have a quorum (2 of 3 or 3 of 5) online while doing so.
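
On a cephadm-deployed cluster, a rough sketch could look like this (mon/host
names and the IP are examples, check the cephadm docs for your exact release):

  # stop cephadm from re-deploying mons automatically while you move them
  ceph orch apply mon --unmanaged
  # remove one old mon and its daemon
  ceph mon remove ceph01
  ceph orch daemon rm mon.ceph01 --force
  # add it back on the new network
  ceph orch daemon add mon ceph01:192.168.10.11

Do this one monitor at a time and keep the quorum up in between.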

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Mi., 15. Juli 2020 um 13:14 Uhr schrieb Will Payne :

> I need to change the network my monitors are on. It seems this is not a
> trivial thing to do. Are there any up-to-date instructions for doing so on
> a cephadm-deployed cluster?
>
> I’ve found some steps in older versions of the docs but not sure if these
> are still correct - they mention using the ceph-mon command which I don’t
> have.
>
> Will
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: I can just add 4Kn drives, not?

2020-08-06 Thread Martin Verges
Yes, no problem

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Do., 6. Aug. 2020 um 12:13 Uhr schrieb Marc Roos <
m.r...@f1-outsourcing.eu>:

>
>
> I can just add 4Kn drives to my existing setup not? Since this
> technology is only specific to how the osd daemon is talking to the
> disk?
>
>
>
>
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: add debian buster stable support for ceph-deploy

2020-09-07 Thread Martin Verges
Hello,

yes this is correct. Sorry for the inconvenience.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Mo., 7. Sept. 2020 um 10:19 Uhr schrieb Lars Täuber :

> Hi Paul,
>
> the GPG Key of the repo has changed on 4th of June. Is this correct?
>
> Thanks for your buster repo!
>
> Cheers,
> Lars
>
> Mon, 18 Nov 2019 20:08:01 +0100
> Paul Emmerich  ==> Jelle de Jong <
> jelledej...@powercraft.nl> :
> > We maintain an unofficial mirror for Buster packages:
> > https://croit.io/2019/07/07/2019-07-07-debian-mirror
> >
> >
> > Paul
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Choosing suitable SSD for Ceph cluster

2020-09-14 Thread Martin Verges
Hello,

Please keep in mind that you can have significant operational problems if
you choose OSDs that are too small. Sometimes your OSDs require >40G for
osdmaps/pgmaps/... and the smaller your OSD, the more likely this becomes a
problem, as Ceph is totally unable to deal with full disks and breaks apart.
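
Keeping an eye on the fill level is cheap, for example:

  # per-OSD utilisation, watch the %USE column
  ceph osd df tree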

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Mo., 14. Sept. 2020 um 15:58 Uhr schrieb :

> https://www.kingston.com/unitedkingdom/en/ssd/dc1000b-data-center-boot-ssd
>
> look good for your purpose.
>
>
>
> - Original Message -
> From: "Seena Fallah" 
> To: "Виталий Филиппов" 
> Cc: "Anthony D'Atri" , "ceph-users" <
> ceph-users@ceph.io>
> Sent: Monday, September 14, 2020 2:47:14 PM
> Subject: [ceph-users] Re: Choosing suitable SSD for Ceph cluster
>
> Thanks for the sheet. I need a low space disk for my use case (around
> 240GB). Do you have any suggestions with M.2 and capacitors?
>
> On Mon, Sep 14, 2020 at 6:11 PM  wrote:
>
> > There's also Micron 7300 Pro/Max. Please benchmark it like described here
> >
> https://docs.google.com/spreadsheets/d/1E9-eXjzsKboiCCX-0u0r5fAjjufLKayaut_FOPxYZjc/edit
> > and send me the results if you get one :))
> >
> > Samsung PM983 M.2
> >
> > I want to have a separate disk for buckets index pool and all of my
> server
> > bays are full and I should use m2 storage devices. Also the bucket index
> > doesn't need much space so I plan to have a 6x device with replica 3 for
> > it. Each disk could be 240GB to not waste space but there is no
> enterprise
> > nvme disk in this space! Do you have any recommendations?
> > On Sun, Sep 13, 2020 at 10:17 PM Виталий Филиппов 
> > wrote:
> >
> > Easy, 883 has capacitors and 970 evo doesn't
> > 13 сентября 2020 г. 0:57:43 GMT+03:00, Seena Fallah <
> seenafal...@gmail.com>
> > пишет:
> >
> > Hi. How do you say 883DCT is faster than 970 EVO? I saw the
> specifications and 970 EVO has higher IOPS than 883DCT! Can you please tell
> why 970 EVO act lower than 883DCT?
> >
> > --
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> >
> > --
> > With best regards,
> > Vitaliy Filippov
> >
> >
> >
> >
> >
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: NVMe's

2020-09-24 Thread Martin Verges
I did not see 10 cores, but 7 cores per osd over a long period on pm1725a
disks with around 60k IO/s according to sysstat of each disk.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Do., 24. Sept. 2020 um 18:47 Uhr schrieb :

> OK, I'll retry my tests several times more.
>
> But I've never seen OSD utilize 10 cores, so... I won't believe it until I
> see it myself on my machine. :-))
>
> I tried a fresh OSD on a block ramdisk ("brd"), for example. It was eating
> 658% CPU and pushing only 4138 write iops...
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: NVMe's

2020-09-24 Thread Martin Verges
Hello,

It was some time ago but as far as I remember and found in the chat log, it
was during backfill/recovery and high client workload and on Intel Xeon
Silver 4110, 2.10GHz, 8C/16T Cpu.
I found a screenshot in my chat history stating 775% and 722% cpu usage in
htop for 2 OSDs (the server has 2 PCIe PM1725a NVMe OSDs and 12 HDD OSDs).
Unfortunately I have no console log output that would show more details
like IO pattern.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Do., 24. Sept. 2020 um 21:01 Uhr schrieb Mark Nelson :

> Mind if I ask what size of IOs those where, what kind of IOs
> (reads/writes/sequential/random?) and what kind of cores?
>
>
> Mark
>
>
> On 9/24/20 1:43 PM, Martin Verges wrote:
> > I did not see 10 cores, but 7 cores per osd over a long period on
> > pm1725a disks with around 60k IO/s according to sysstat of each disk.
> >
> > --
> > Martin Verges
> > Managing director
> >
> > Mobile: +49 174 9335695
> > E-Mail: martin.ver...@croit.io <mailto:martin.ver...@croit.io>
> > Chat: https://t.me/MartinVerges
> >
> > croit GmbH, Freseniusstr. 31h, 81247 Munich
> > CEO: Martin Verges - VAT-ID: DE310638492
> > Com. register: Amtsgericht Munich HRB 231263
> >
> > Web: https://croit.io
> > YouTube: https://goo.gl/PGE1Bx
> >
> >
> > Am Do., 24. Sept. 2020 um 18:47 Uhr schrieb  > <mailto:vita...@yourcmc.ru>>:
> >
> > OK, I'll retry my tests several times more.
> >
> > But I've never seen OSD utilize 10 cores, so... I won't believe it
> > until I see it myself on my machine. :-))
> >
> > I tried a fresh OSD on a block ramdisk ("brd"), for example. It
> > was eating 658% CPU and pushing only 4138 write iops...
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > <mailto:ceph-users@ceph.io>
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> > <mailto:ceph-users-le...@ceph.io>
> >
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rebalancing adapted during rebalancing with new updates?

2020-09-26 Thread Martin Verges
Without knowing the source code and just from my observations, I would say
that every time the OSD map changes, the PG mapping is recalculated. However,
a running backfill is not stopped; only PGs still in backfill_wait are
reconsidered.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Sa., 26. Sept. 2020 um 13:33 Uhr schrieb Marc Roos <
m.r...@f1-outsourcing.eu>:

>
> When I add an osd rebalancing is taking place, lets say ceph relocates
> 40 pg's.
>
> When I add another osd during rebalancing, when ceph has only relocated
> 10 pgs and has to do still 30 pgs.
>
> What happens then:
>
> 1. Is ceph just finishing the relocation of these 30 pgs and then
> calculates how the new environment with the newly added osd should be
> relocated and starts relocating that.
>
> 2. or is ceph is finishing only the relocation of the pg it is currently
> doing, and then recalculates immediately how pg's should be distributed
> without finishing these 30 pg's it was planning to do.
>
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Feedback for proof of concept OSD Node

2020-10-02 Thread Martin Verges
For private projects, you can look for small 1U servers with up to 4x 3.5"
disk slots and an E3-1230 v3/4/5 CPU. They can be bought used for 250-350€,
and then you just plug in a disk.
They are also good for SATA SSDs and work quite well. You can mix both drive
types in the same system as well.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Fr., 2. Okt. 2020 um 08:32 Uhr schrieb Ignacio Ocampo :

> Hi Brian,
>
> Here more context about what I want to accomplish: I've migrated a bunch of
> services from AWS to a local server, but having everything in a single
> server is not safe, and instead of investing in RAID, I would like to start
> setting up a small Ceph Cluster to have redundancy and a robust mechanism
> in case any component fails.
>
> Also, in the mid-term, I do have plans to deploy a small OpenStack Cluster.
>
> Because of that, I would like to set up the first small Ceph Cluster that
> can scale as my needs grow, the idea is to have 3 OSD nodes with the same
> characteristics and add additional HDDs as needed, up to 5 HDD per OSD
> node, starting with 1 HDD per node.
>
> Thanks!
>
> On Thu, Oct 1, 2020 at 11:35 AM Brian Topping 
> wrote:
>
> > Welcome to Ceph!
> >
> > I think better questions to start with are “what are your objectives in
> > your study?” Is it just seeing Ceph run with many disks, or are you
> trying
> > to see how much performance you can get out of it with distributed disk?
> > What is your budget? Do you want to try different combinations of storage
> > devices to learn how they differ in performance or do you just want to
> jump
> > to the fastest things out there?
> >
> > One often doesn’t need a bunch of machines to determine that Ceph is a
> > really versatile and robust solution. I pretty regularly deploy Ceph on a
> > single node using Kubernetes and Rook. Some would ask “why would one ever
> > do that, just use direct storage!”. The answer is when I want to expand a
> > cluster, I am willing to have traded initial performance overhead for
> > letting Ceph distribute data at a later date. And the overhead is far
> lower
> > than one might think when there’s not a network bottleneck to deal with.
> I
> > do use direct storage on LVM when I have distributed workloads such as
> > Kafka that abstract storage that a service instance depends on. It
> doesn’t
> > make much sense in my mind for Kafka or Cassandra to use Ceph because I
> can
> > afford to lose nodes using those services.
> >
> > In other words, Ceph is virtualized storage. You have likely come to it
> > because your workloads need to be able to come up anywhere on your
> network
> > and reach that storage. How do you see those workloads exercising the
> > capabilities of Ceph? That’s where your interesting use cases come from,
> > and can help you better decide what the best lab platform is to get
> started.
> >
> > Hope that helps, Brian
> >
> > On Sep 29, 2020, at 12:44 AM, Ignacio Ocampo  wrote:
> >
> > Hi All :),
> >
> > I would like to get your feedback about the components below to build a
> > PoC OSD Node (I will build 3 of these).
> >
> > SSD for OS.
> > NVMe for cache.
> > HDD for storage.
> >
> > The Supermicro motherboard has 2 10Gb cards, and I will use ECC memories.
> >
> > 
> >
> > Thanks for your feedback!
> >
> > --
> > Ignacio Ocampo
> >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> >
> >
>
> --
> Ignacio Ocampo
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Massive Mon DB Size with noout on 14.2.11

2020-10-02 Thread Martin Verges
As long as the cluster is not healthy, the mon DB will require much more
space, depending on the cluster size and other factors. Yes, this is somewhat
normal.
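
To keep an eye on it and reclaim space once the cluster is healthy again,
roughly (mon id and path are only examples):

  # size of the monitor's store on disk
  du -sh /var/lib/ceph/mon/ceph-mon01/store.db
  # trigger a manual compaction
  ceph tell mon.mon01 compact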

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Fr., 2. Okt. 2020 um 15:46 Uhr schrieb Andreas John :

> Hello,
>
> we observed massive and sudden growth of the mon db size on disk, from
> 50MB to 20GB+ (GB!) and thus reaching 100% disk usage on the mountpoint.
>
> As far as we can see, it happens if we set "noout" for a node reboot:
> After the node and the OSDs come back it looks like the mon db size
> increased drastically.
>
> We have 14.2.11, 10 OSD @ 2TB and cephfs in use.
>
> Is this a known issue? Should we avoid noout?
>
>
> TIA,
>
> derjohn
>
>
> --
> Andreas John
> net-lab GmbH  |  Frankfurter Str. 99  |  63067 Offenbach
> Geschaeftsfuehrer: Andreas John | AG Offenbach, HRB40832
> Tel: +49 69 8570033-1 | Fax: -2 | http://www.net-lab.net
>
> Facebook: https://www.facebook.com/netlabdotnet
> Twitter: https://twitter.com/netlabdotnet
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph iscsi latency too high for esxi?

2020-10-04 Thread Martin Verges
Hello,

No, iSCSI + VMware works without such problems.

> We are on latest Nautilus, 12 x 10 TB OSDs (4 servers), 25 Gbit/s
Ethernet, erasure coded rbd pool with 128 PGs, aroun 200 PGs per OSD total.

Nautilus is a good choice.
12x 10TB HDDs are not good for VMs.
25 Gbit/s on HDDs is way more than that system needs.
200 PGs per OSD is too much, I would suggest 75-100 PGs per OSD.

You can improve latency on HDD clusters using an external DB/WAL on NVMe. That
might help you.
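
If you recreate OSDs for that, a rough sketch for a single OSD (device names
are examples, size the DB device to your needs):

  # data on the HDD, RocksDB/WAL on an NVMe partition or LV
  ceph-volume lvm create --data /dev/sdb --block.db /dev/nvme0n1p1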

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am So., 4. Okt. 2020 um 14:37 Uhr schrieb Golasowski Martin <
martin.golasow...@vsb.cz>:

> Hi,
> does anyone here use CEPH iSCSI with VMware ESXi? It seems that we are
> hitting the 5 second timeout limit on software HBA in ESXi. It appears
> whenever there is increased load on the cluster, like deep scrub or
> rebalance. Is it normal behaviour in production? Or is there something
> special we need to tune?
>
> We are on latest Nautilus, 12 x 10 TB OSDs (4 servers), 25 Gbit/s
> Ethernet, erasure coded rbd pool with 128 PGs, aroun 200 PGs per OSD total.
>
>
> ESXi Log:
>
> 2020-10-04T01:57:04.314Z cpu34:2098959)WARNING: iscsi_vmk:
> iscsivmk_ConnReceiveAtomic:517: vmhba64:CH:1 T:0 CN:0: Failed to receive
> data: Connection closed by peer
> 2020-10-04T01:57:04.314Z cpu34:2098959)iscsi_vmk:
> iscsivmk_ConnRxNotifyFailure:1235: vmhba64:CH:1 T:0 CN:0: Connection rx
> notifying failure: Failed to Receive. State=Bound
> 2020-10-04T01:57:04.566Z cpu19:2098979)WARNING: iscsi_vmk:
> iscsivmk_StopConnection:741: vmhba64:CH:1 T:0 CN:0: iSCSI connection is
> being marked "OFFLINE" (Event:4)
> 2020-10-04T01:57:04.654Z cpu7:2097866)WARNING: VMW_SATP_ALUA:
> satp_alua_issueCommandOnPath:788: Probe cmd 0xa3 failed for path
> "vmhba64:C2:T0:L0" (0x5/0x20/0x0). Check if failover mode is still ALUA.
>
>
> OSD Log:
>
> [303088.450088] Did not receive response to NOPIN on CID: 0, failing
> connection for I_T Nexus
> iqn.1994-05.com.redhat:esxi1,i,0x00023d02,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,t,0x01
> [324926.694077] Did not receive response to NOPIN on CID: 0, failing
> connection for I_T Nexus
> iqn.1994-05.com.redhat:esxi2,i,0x00023d01,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,t,0x01
> [407067.404538] ABORT_TASK: Found referenced iSCSI task_tag: 5891
> [407076.077175] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 5891
> [411677.887690] ABORT_TASK: Found referenced iSCSI task_tag: 6722
> [411683.297425] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 6722
> [481459.755876] ABORT_TASK: Found referenced iSCSI task_tag: 7930
> [481460.787968] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 7930
>
> Cheers,
> Martin___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph iscsi latency too high for esxi?

2020-10-04 Thread Martin Verges
Hello,

in my personal opinion, HDDs are a technology from the last century and I
would never ever think about using such old technology for modern
VM/container/... workloads. My time, as well as any employee's, is too
precious to wait for a hard drive to find the requested data! Use EC on NVMe
if you need to save some money. It's still much faster, with lower latency,
than HDDs.

As each HDD only adds around 100 IO/s and 20-30 MB/s to your cluster, you can
throw in 100 disks and won't even come near the performance of a single
SSD. Yes, each disk will improve your performance, but by such a small
amount that it makes no sense in my eyes.

> Does that mean that occasional iSCSI path drop-outs are somewhat
expected?
Not that I'm aware of, but I have no HDD-based iSCSI cluster at hand to
check. Sorry.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am So., 4. Okt. 2020 um 16:06 Uhr schrieb Golasowski Martin <
martin.golasow...@vsb.cz>:

> Thanks!
>
> Does that mean that occasional iSCSI path drop-outs are somewhat expected?
> We are using SSDs for WAL/DB on each OSD server, so at least that.
>
> Do you think that If we buy additional 6/12 HDDs would that help with the
> IOPS for the VMs?
>
> Regards,
> Martin
>
>
>
> On 4 Oct 2020, at 15:17, Martin Verges  wrote:
>
> Hello,
>
> no iSCSI + VMware works without such problems.
>
> > We are on latest Nautilus, 12 x 10 TB OSDs (4 servers), 25 Gbit/s
> Ethernet, erasure coded rbd pool with 128 PGs, aroun 200 PGs per OSD total.
>
> Nautilus is a good choice
> 12*10TB HDD is not good for VMs
> 25Gbit/s on HDD is way to much for that system
> 200 PGs per OSD is to much, I would suggest 75-100 PGs per OSD
>
> You can improve latency on HDD clusters using external DB/WAL on NVMe.
> That might help you
>
> --
> Martin Verges
> Managing director
>
> Mobile: +49 174 9335695
> E-Mail: martin.ver...@croit.io
> Chat: https://t.me/MartinVerges
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
>
> Web: https://croit.io
> YouTube: https://goo.gl/PGE1Bx
>
>
> Am So., 4. Okt. 2020 um 14:37 Uhr schrieb Golasowski Martin <
> martin.golasow...@vsb.cz>:
>
>> Hi,
>> does anyone here use CEPH iSCSI with VMware ESXi? It seems that we are
>> hitting the 5 second timeout limit on software HBA in ESXi. It appears
>> whenever there is increased load on the cluster, like deep scrub or
>> rebalance. Is it normal behaviour in production? Or is there something
>> special we need to tune?
>>
>> We are on latest Nautilus, 12 x 10 TB OSDs (4 servers), 25 Gbit/s
>> Ethernet, erasure coded rbd pool with 128 PGs, aroun 200 PGs per OSD total.
>>
>>
>> ESXi Log:
>>
>> 2020-10-04T01:57:04.314Z cpu34:2098959)WARNING: iscsi_vmk:
>> iscsivmk_ConnReceiveAtomic:517: vmhba64:CH:1 T:0 CN:0: Failed to receive
>> data: Connection closed by peer
>> 2020-10-04T01:57:04.314Z cpu34:2098959)iscsi_vmk:
>> iscsivmk_ConnRxNotifyFailure:1235: vmhba64:CH:1 T:0 CN:0: Connection rx
>> notifying failure: Failed to Receive. State=Bound
>> 2020-10-04T01:57:04.566Z cpu19:2098979)WARNING: iscsi_vmk:
>> iscsivmk_StopConnection:741: vmhba64:CH:1 T:0 CN:0: iSCSI connection is
>> being marked "OFFLINE" (Event:4)
>> 2020-10-04T01:57:04.654Z cpu7:2097866)WARNING: VMW_SATP_ALUA:
>> satp_alua_issueCommandOnPath:788: Probe cmd 0xa3 failed for path
>> "vmhba64:C2:T0:L0" (0x5/0x20/0x0). Check if failover mode is still ALUA.
>>
>>
>> OSD Log:
>>
>> [303088.450088] Did not receive response to NOPIN on CID: 0, failing
>> connection for I_T Nexus
>> iqn.1994-05.com.redhat:esxi1,i,0x00023d02,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,t,0x01
>> [324926.694077] Did not receive response to NOPIN on CID: 0, failing
>> connection for I_T Nexus
>> iqn.1994-05.com.redhat:esxi2,i,0x00023d01,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,t,0x01
>> [407067.404538] ABORT_TASK: Found referenced iSCSI task_tag: 5891
>> [407076.077175] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag:
>> 5891
>> [411677.887690] ABORT_TASK: Found referenced iSCSI task_tag: 6722
>> [411683.297425] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag:
>> 6722
>> [481459.755876] ABORT_TASK: Found referenced iSCSI task_tag: 7930
>> [481460.787968] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag:
>> 7930
>>
>> Cheers,
>> Martin___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Monitor recovery

2020-10-10 Thread Martin Verges
Hello Brian,

as long as you have at least one working MON, it's kind of easy to recover.
Shut down all MONs, modify the MONMAP by hand so that only one of the
working MONs is left, and then start that one up. After that, redeploy the
other MONs to get your quorum and redundancy back.

You find more details and commands at
https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/#recovering-a-monitor-s-broken-monmap
.
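
In short, the procedure looks roughly like this (untested sketch; the mon
ID "a" and the names of the removed mons are placeholders, and you should
back up the mon store first):

# stop all mons, then on the surviving mon:
systemctl stop ceph-mon@a
ceph-mon -i a --extract-monmap /tmp/monmap
monmaptool --print /tmp/monmap            # check the current members
monmaptool --rm b --rm c /tmp/monmap      # drop the broken/lost mons
ceph-mon -i a --inject-monmap /tmp/monmap
systemctl start ceph-mon@a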

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Sa., 10. Okt. 2020 um 07:16 Uhr schrieb Brian Topping <
brian.topp...@gmail.com>:

> Hello experts,
>
> I have accidentally created a situation where the only monitor in a
> cluster has been moved to a new node without it’s /var/lib/ceph contents.
> Not realizing what I had done, I decommissioned the original node, but
> still have the contents of it’s /var/lib/ceph.
>
> Can I shut down the monitor running on the new node, copy monitor data
> from the original node to the new node and restart the monitor? Or is there
> information in the monitor database that is tied to the original node? If
> that’s the case, I suspect I need to somehow recommission the original node.
>
> Thanks for any feedback on this situation!
>
> Brian
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph test cluster, how to estimate performance.

2020-10-12 Thread Martin Verges
Hello Daniel,

just throw away your crappy Samsung SSD 860 Pro. It won't work in an
acceptable way.

See
https://docs.google.com/spreadsheets/d/1E9-eXjzsKboiCCX-0u0r5fAjjufLKayaut_FOPxYZjc/edit?usp=sharing
for a performance indication of individual disks.
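
If you want to measure a drive yourself, the usual single-job sync write
test gives a good indication (the device name is a placeholder, and the
test is destructive when run against a raw device):

fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
    --numjobs=1 --iodepth=1 --runtime=60 --time_based --name=journal-test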

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Di., 13. Okt. 2020 um 07:31 Uhr schrieb Daniel Mezentsev :

> Hi Ceph users,
>
> Im working on  common lisp client utilizing rados library. Got some
> results, but don't know how to estimate if i am getting correct
> performance. I'm running test cluster from laptop - 2 OSDs -  VM, RAM
> 4Gb, 4 vCPU each, monitors and mgr are running from the same VM(s). As
> for storage, i have Samsung SSD 860 Pro, 512G. Disk is splitted into 2
> logical volumes (LVMs), and that volumes are attached to VMs. I know
> that i can't expect too much from that layout, just want to know if im
> getting adequate numbers. Im doing read/write operations on very small
> objects - up to 1kb. In async write im getting ~7.5-8.0 KIOPS.
> Synchronouse read - pretty much the same 7.5-8.0 KIOPS. Async read is
> segfaulting don't know why. Disk itself is capable to deliver well
> above 50 KIOPS. Difference is magnitude. Any info is more welcome.
>   Daniel Mezentsev, founder
> (+1) 604 313 8592.
> Soleks Data Group.
> Shaping the clouds.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph Octopus and Snapshot Schedules

2020-10-22 Thread Martin Verges
Hello Adam,

in our croit Ceph Management Software, we have a snapshot manager feature
that is capable of doing that.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Do., 22. Okt. 2020 um 15:38 Uhr schrieb Adam Boyhan :

> Hey all.
>
> I was wondering if Ceph Octopus is capable of automating/managing snapshot
> creation/retention and then replication? Ive seen some notes about it, but
> can't seem to find anything solid.
>
> Open to suggestions as well. Appreciate any input!
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Comercial support - Brazil

2019-10-04 Thread Martin Verges
Hello Gesiel,

we as croit do provide worldwide services in English, but we are not
located in Brazil.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Do., 3. Okt. 2019 um 02:27 Uhr schrieb Gesiel Galvão Bernardes <
gesiel.bernar...@gmail.com>:

> Hi everyone,
>
> I searching  to  consulting and support of Ceph in Brazil. Does anyone on
> the list provide consulting in Brazil?
>
> Regards,
> Gesiel
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: iSCSI write performance

2019-10-24 Thread Martin Verges
Hello,

we did some local testing a few days ago on a new installation of a small
cluster.
Our iSCSI implementation showed a performance drop of 20-30% compared to
krbd.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Do., 24. Okt. 2019 um 19:37 Uhr schrieb Drew Weaver <
drew.wea...@thenap.com>:

> I was told by someone at Red Hat that ISCSI performance is still several
> magnitudes behind using the client / driver.
>
> Thanks,
> -Drew
>
>
> -Original Message-
> From: Nathan Fish 
> Sent: Thursday, October 24, 2019 1:27 PM
> To: Ryan 
> Cc: ceph-users 
> Subject: [ceph-users] Re: iSCSI write performance
>
> Are you using Erasure Coding or replication? What is your crush rule?
> What SSDs and CPUs? Does each OSD use 100% of a core or more when writing?
>
> On Thu, Oct 24, 2019 at 1:22 PM Ryan  wrote:
> >
> > I'm in the process of testing the iscsi target feature of ceph. The
> cluster is running ceph 14.2.4 and ceph-iscsi 3.3. It consists of 5 hosts
> with 12 SSD OSDs per host. Some basic testing moving VMs to a ceph backed
> datastore is only showing 60MB/s transfers. However moving these back off
> the datastore is fast at 200-300MB/s.
> >
> > What should I be looking at to track down the write performance issue?
> In comparison with the Nimble Storage arrays I can see 200-300MB/s in both
> directions.
> >
> > Thanks,
> > Ryan
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> > email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Choosing suitable SSD for Ceph cluster

2019-10-24 Thread Martin Verges
Hello,

think about migrating to a much faster and better Ceph version and towards
BlueStore to increase the performance of the existing hardware.

If you want to go with a PCIe card, the Samsung PM1725b can provide quite
good speeds, but at much higher costs than the EVO. If you want to check
drives, take a look at the uncached write latency. The lower the value, the
better the drive.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Do., 24. Okt. 2019 um 21:09 Uhr schrieb Hermann Himmelbauer <
herm...@qwer.tk>:

> Hi,
> I am running a nice ceph (proxmox 4 / debian-8 / ceph 0.94.3) cluster on
> 3 nodes (supermicro X8DTT-HIBQF), 2 OSD each (2TB SATA harddisks),
> interconnected via Infiniband 40.
>
> Problem is that the ceph performance is quite bad (approx. 30MiB/s
> reading, 3-4 MiB/s writing ), so I thought about plugging into each node
> a PCIe to NVMe/M.2 adapter and install SSD harddisks. The idea is to
> have a faster ceph storage and also some storage extension.
>
> The question is now which SSDs I should use. If I understand it right,
> not every SSD is suitable for ceph, as is denoted at the links below:
>
>
> https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
> or here:
> https://www.proxmox.com/en/downloads/item/proxmox-ve-ceph-benchmark
>
> In the first link, the Samsung SSD 950 PRO 512GB NVMe is listed as a
> fast SSD for ceph. As the 950 is not available anymore, I ordered a
> Samsung 970 1TB for testing, unfortunately, the "EVO" instead of PRO.
>
> Before equipping all nodes with these SSDs, I did some tests with "fio"
> as recommended, e.g. like this:
>
> fio --filename=/dev/DEVICE --direct=1 --sync=1 --rw=write --bs=4k
> --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting
> --name=journal-test
>
> The results are as the following:
>
> ---
> 1) Samsung 970 EVO NVMe M.2 mit PCIe Adapter
> Jobs: 1:
> read : io=26706MB, bw=445MiB/s, iops=113945, runt= 60001msec
> write: io=252576KB, bw=4.1MiB/s, iops=1052, runt= 60001msec
>
> Jobs: 4:
> read : io=21805MB, bw=432.7MiB/s, iops=93034, runt= 60001msec
> write: io=422204KB, bw=6.8MiB/s, iops=1759, runt= 60002msec
>
> Jobs: 10:
> read : io=26921MB, bw=448MiB/s, iops=114859, runt= 60001msec
> write: io=435644KB, bw=7MiB/s, iops=1815, runt= 60004msec
> ---
>
> So the read speed is impressive, but the write speed is really bad.
>
> Therefore I ordered the Samsung 970 PRO (1TB) as it has faster NAND
> chips (MLC instead of TLC). The results are, however even worse for
> writing:
>
> ---
> Samsung 970 PRO NVMe M.2 mit PCIe Adapter
> Jobs: 1:
> read : io=15570MB, bw=259.4MiB/s, iops=66430, runt= 60001msec
> write: io=199436KB, bw=3.2MiB/s, iops=830, runt= 60001msec
>
> Jobs: 4:
> read : io=48982MB, bw=816.3MiB/s, iops=208986, runt= 60001msec
> write: io=327800KB, bw=5.3MiB/s, iops=1365, runt= 60002msec
>
> Jobs: 10:
> read : io=91753MB, bw=1529.3MiB/s, iops=391474, runt= 60001msec
> write: io=343368KB, bw=5.6MiB/s, iops=1430, runt= 60005msec
> ---
>
> I did some research and found out, that the "--sync" flag sets the flag
> "O_DSYNC" which seems to disable the SSD cache which leads to these
> horrid write speeds.
>
> It seems that this relates to the fact that the write cache is only not
> disabled for SSDs which implement some kind of battery buffer that
> guarantees a data flush to the flash in case of a powerloss.
>
> However, It seems impossible to find out which SSDs do have this
> powerloss protection, moreover, these enterprise SSDs are crazy
> expensive compared to the SSDs above - moreover it's unclear if
> powerloss protection is even available in the NVMe form factor. So
> building a 1 or 2 TB cluster seems not really affordable/viable.
>
> So, can please anyone give me hints what to do? Is it possible to ensure
> that the write cache is not disabled in some way (my server is situated
> in a data center, so there will probably never be loss of power).
>
> Or is the link above already outdated as newer ceph releases somehow
> deal with this problem? Or maybe a later Debian release (10) will handle
> the O_DSYNC flag differently?
>
> Perhaps I should simply invest in faster (and bigger) harddisks and
> forget the SSD-cluster idea?
>
> Thank you in advance for any help,
>
> Best Regards,
> Herm

[ceph-users] Re: multiple pgs down with all disks online

2019-11-03 Thread Martin Verges
We had this with older Ceph versions; maybe just try to restart all OSDs of
the affected PGs.
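
Something along these lines, using the PG from your output below as an
example (the OSD ids come from its up/acting set):

ceph pg map 41.3db              # shows the up/acting OSD set of the PG
ceph pg 41.3db query            # often hints at why the PG is down
systemctl restart ceph-osd@32   # then restart each OSD of that set, one by one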

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am So., 3. Nov. 2019 um 20:13 Uhr schrieb Kári Bertilsson <
karibert...@gmail.com>:

> pgs: 14.377% pgs not active
>  3749681/537818808 objects misplaced (0.697%)
>  810 active+clean
>  156 down
>  124 active+remapped+backfilling
>  1   active+remapped+backfill_toofull
>  1   down+inconsistent
>
> when looking at the down pg's all disks are online
>
> 41.3db   537750 0   0 401643186092   0
>  0 3044  down6m 161222'303144 162913:4630171
> [32,96,128,115,86,129,113,124,57,109]p32
> [32,96,128,115,86,129,113,124,57,109]p32 2019-11-03
>
> Any way to see why the pg is down ?
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Slow write speed on 3-node cluster with 6* SATA Harddisks (~ 3.5 MB/s)

2019-11-04 Thread Martin Verges
Hello,

Hard disks are awfully slow. That's normal and expected, and a result of
the random I/O you get in a Ceph cluster.
You can speed up raw bandwidth using EC, but not on such small clusters and
not under high I/O load.

As you mentioned Proxmox: when it comes to VM workloads, spinning media is
never an option, use flash!

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Mo., 4. Nov. 2019 um 23:44 Uhr schrieb Hermann Himmelbauer <
herm...@qwer.tk>:

> Hi,
> I recently upgraded my 3-node cluster to proxmox 6 / debian-10 and
> recreated my ceph cluster with a new release (14.2.4 bluestore) -
> basically hoping to gain some I/O speed.
>
> The installation went flawlessly, reading is faster than before (~ 80
> MB/s), however, the write speed is still really slow (~ 3,5 MB/s).
>
> I wonder if I can do anything to speed things up?
>
> My Hardware is as the following:
>
> 3 Nodes with Supermicro X8DTT-HIBQF Mainboard each,
> 2 OSD per node (2TB SATA harddisks, WDC WD2000F9YZ-0),
> interconnected via Infiniband 40
>
> The network should be reasonably fast, I measure ~ 16 GBit/s with iperf,
> so this seems fine.
>
> I use ceph for RBD only, so my measurement is simply doing a very simple
> "dd" read and write test within a virtual machine (Debian 8) like the
> following:
>
> read:
> dd if=/dev/vdb | pv | dd of=/dev/null
> -> 80 MB/s
>
>
> write:
> dd if=/dev/zero | pv | dd of=/dev/vdb
> -> 3.5 MB/s
>
> When I do the same on the virtual machine on a disk that is on a NFS
> storage, I get something about 30 MB/s for reading and writing.
>
> If I disable the write cache on all OSD disks via "hdparm -W 0
> /dev/sdX", I gain a little bit of performance, write speed is then 4.3
> MB/s.
>
> Thanks to your help from the list I plan to install a second ceph
> cluster which is SSD based (Samsung PM1725b) which should be much
> faster, however, I still wonder if there is any way to speed up my
> harddisk based cluster?
>
> Thank you in advance for any help,
>
> Best Regards,
> Hermann
>
>
> --
> herm...@qwer.tk
> PGP/GPG: 299893C7 (on keyservers)
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: v13.2.7 mimic released

2019-11-27 Thread Martin Verges
Hello,

as far as I know, Mimic and Nautilus are still not available on Debian.
Unfortunately we do not provide Mimic on our mirror for Debian 10 Buster.
But if you want to migrate to Nautilus, feel free to use our public mirrors
described at https://croit.io/2019/07/07/2019-07-07-debian-mirror.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Mi., 27. Nov. 2019 um 14:38 Uhr schrieb Sang, Oliver <
oliver.s...@intel.com>:

> can this version be installed on Debian 10?
> If not, is there a plan for Mimic to support Debian 10?
>
> -Original Message-
> From: Sage Weil 
> Sent: Monday, November 25, 2019 10:50 PM
> To: ceph-annou...@ceph.io; ceph-users@ceph.io; d...@ceph.io
> Subject: [ceph-users] v13.2.7 mimic released
>
> This is the seventh bugfix release of the Mimic v13.2.x long term stable
> release series. We recommend all Mimic users upgrade.
>
> For the full release notes, see
>
> https://ceph.io/releases/v13-2-7-mimic-released/
>
> Notable Changes
>
> MDS:
>
>  - Cache trimming is now throttled. Dropping the MDS cache via the “ceph
> tell mds. cache drop” command or large reductions in the cache size
> will no longer cause service unavailability.
>  - Behavior with recalling caps has been significantly improved to not
> attempt recalling too many caps at once, leading to instability. MDS with a
> large cache (64GB+) should be more stable.
>  - MDS now provides a config option “mds_max_caps_per_client” (default:
> 1M) to limit the number of caps a client session may hold. Long running
> client sessions with a large number of caps have been a source of
> instability in the MDS when all of these caps need to be processed during
> certain session events. It is recommended to not unnecessarily increase
> this value.
>  - The “mds_recall_state_timeout” config parameter has been removed. Late
> client recall warnings are now generated based on the number of caps the
> MDS has recalled which have not been released. The new config parameters
> “mds_recall_warning_threshold” (default: 32K) and
> “mds_recall_warning_decay_rate” (default: 60s) set the threshold for this
> warning.
>  - The “cache drop” admin socket command has been removed. The “ceph tell
> mds.X cache drop” remains.
>
> OSD:
>
>  - A health warning is now generated if the average osd heartbeat ping
> time exceeds a configurable threshold for any of the intervals computed.
> The OSD computes 1 minute, 5 minute and 15 minute intervals with average,
> minimum and maximum values. New configuration option
> “mon_warn_on_slow_ping_ratio” specifies a percentage of
> “osd_heartbeat_grace” to determine the threshold. A value of zero disables
> the warning. A new configuration option “mon_warn_on_slow_ping_time”,
> specified in milliseconds, overrides the computed value, causing a warning
> when OSD heartbeat pings take longer than the specified amount. A new admin
> command “ceph daemon mgr.# dump_osd_network [threshold]” lists all
> connections with a ping time longer than the specified threshold or value
> determined by the config options, for the average for any of the 3
> intervals. A new admin command ceph daemon osd.# dump_osd_network
> [threshold]” does the same but only including heartbeats initiated by the
> specified OSD.
>  - The default value of the
> “osd_deep_scrub_large_omap_object_key_threshold” parameter has been
> lowered to detect an object with large number of omap keys more easily.
>
> RGW:
>
> - radosgw-admin introduces two subcommands that allow the managing of
> expire-stale objects that might be left behind after a bucket reshard in
> earlier versions of RGW. One subcommand lists such objects and the other
> deletes them. Read the troubleshooting section of the dynamic resharding
> docs for details.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: v13.2.7 mimic released

2019-11-28 Thread Martin Verges
Hello,

we (croit GmbH) are a founding member of the Ceph Foundation and we build
the packages from the official git repository to ship them with our own
solution.
However, we are not Ceph itself, and so this is not an official mirror.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Do., 28. Nov. 2019 um 04:24 Uhr schrieb Sang, Oliver <
oliver.s...@intel.com>:

> Thanks a lot for information!
>
>
>
> what’s the relationship of this mirror with ceph official website?
>
> Basically we want to use an official release and hesitate to use a 3rd
> part build package.
>
>
>
> *From:* Martin Verges 
> *Sent:* Wednesday, November 27, 2019 9:58 PM
> *To:* Sang, Oliver 
> *Cc:* Sage Weil ; ceph-annou...@ceph.io;
> ceph-users@ceph.io; d...@ceph.io
> *Subject:* Re: [ceph-users] Re: v13.2.7 mimic released
>
>
>
> Hello,
>
>
>
> as far I know Mimic and nautilus are still not available on debian.
> Unfortunately we do not provide mimic on our mirror for debian 10 buster.
> But if you want to migrate to nautilus, feel free to use our public mirrors
> described at https://croit.io/2019/07/07/2019-07-07-debian-mirror.
>
>
> --
> Martin Verges
> Managing director
>
> Mobile: +49 174 9335695
> E-Mail: martin.ver...@croit.io
> Chat: https://t.me/MartinVerges
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
>
> Web: https://croit.io
> YouTube: https://goo.gl/PGE1Bx
>
>
>
>
>
> Am Mi., 27. Nov. 2019 um 14:38 Uhr schrieb Sang, Oliver <
> oliver.s...@intel.com>:
>
> can this version be installed on Debian 10?
> If not, is there a plan for Mimic to support Debian 10?
>
> -Original Message-
> From: Sage Weil 
> Sent: Monday, November 25, 2019 10:50 PM
> To: ceph-annou...@ceph.io; ceph-users@ceph.io; d...@ceph.io
> Subject: [ceph-users] v13.2.7 mimic released
>
> This is the seventh bugfix release of the Mimic v13.2.x long term stable
> release series. We recommend all Mimic users upgrade.
>
> For the full release notes, see
>
> https://ceph.io/releases/v13-2-7-mimic-released/
>
> Notable Changes
>
> MDS:
>
>  - Cache trimming is now throttled. Dropping the MDS cache via the “ceph
> tell mds. cache drop” command or large reductions in the cache size
> will no longer cause service unavailability.
>  - Behavior with recalling caps has been significantly improved to not
> attempt recalling too many caps at once, leading to instability. MDS with a
> large cache (64GB+) should be more stable.
>  - MDS now provides a config option “mds_max_caps_per_client” (default:
> 1M) to limit the number of caps a client session may hold. Long running
> client sessions with a large number of caps have been a source of
> instability in the MDS when all of these caps need to be processed during
> certain session events. It is recommended to not unnecessarily increase
> this value.
>  - The “mds_recall_state_timeout” config parameter has been removed. Late
> client recall warnings are now generated based on the number of caps the
> MDS has recalled which have not been released. The new config parameters
> “mds_recall_warning_threshold” (default: 32K) and
> “mds_recall_warning_decay_rate” (default: 60s) set the threshold for this
> warning.
>  - The “cache drop” admin socket command has been removed. The “ceph tell
> mds.X cache drop” remains.
>
> OSD:
>
>  - A health warning is now generated if the average osd heartbeat ping
> time exceeds a configurable threshold for any of the intervals computed.
> The OSD computes 1 minute, 5 minute and 15 minute intervals with average,
> minimum and maximum values. New configuration option
> “mon_warn_on_slow_ping_ratio” specifies a percentage of
> “osd_heartbeat_grace” to determine the threshold. A value of zero disables
> the warning. A new configuration option “mon_warn_on_slow_ping_time”,
> specified in milliseconds, overrides the computed value, causing a warning
> when OSD heartbeat pings take longer than the specified amount. A new admin
> command “ceph daemon mgr.# dump_osd_network [threshold]” lists all
> connections with a ping time longer than the specified threshold or value
> determined by the config options, for the average for any of the 3
> intervals. A new admin command ceph daemon osd.# dump_osd_network
> [threshold]” does the same but only including heartbeats initiated by the
> specified OSD.
>  - 

[ceph-users] Re: Building a petabyte cluster from scratch

2019-12-03 Thread Martin Verges
Hello,

   * 2 x Xeon Silver 4212 (12C/24T)
>

I would choose single-CPU AMD EPYC systems for a lower price with better
performance. Supermicro does have some good systems for AMD as well.

   * 16 x 10 TB nearline SAS HDD (8 bays for future needs)
>

Don't waste money here either. No real gain. Better invest it in more or
faster (SSD) disks.

   * 4 x 40G QSFP+
>

With 24x spinning media, even a single 40G link will be enough. Again, no
gain for a lot of money.

  * 2 x 40G per server for ceph network (LACP/VPC for HA)
>   * 2 x 40G per server for public network (LACP/VPC for HA)
>

Use VLANs if you really want to separate the networks. Most of the time we
see new customers coming in with problems on such configurations, and from
our experience we don't suggest configuring Ceph that way.
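If you decide to keep them separated anyway, it is just two subnets in
ceph.conf on top of the VLAN interfaces (a sketch, the subnets below are
placeholders):

[global]
public network  = 192.168.10.0/24   # e.g. VLAN 10
cluster network = 192.168.20.0/24   # e.g. VLAN 20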

 * ZFS on RBD, exposed via samba shares (cluster with failover)
>

Maybe, just maybe, think about simply using Samba on top of CephFS to
export the data. No need for all the overhead and the possible bugs you
would otherwise encounter.

* We're used to run mons and mgrs daemons on a few of our OSD nodes,
> without any issue so far : is this a bad idea for a big cluster ?
>

We always do so and have never had a problem with it. Just make sure the
MON has enough resources for your workload.

* We thought using cache tiering on an SSD pool, but a large part of the PB
> is used on a daily basis, so we expect the cache to be not so effective
> and really expensive ?
>

Cache tiering tends to be error-prone, and we have seen a lot of cluster
meltdowns in the last 7 years because of it. Just go for an all-flash
cluster, or use DB/WAL devices to improve performance.

 * Could a 2x10G network be enough ?
>

Yes ;), but it may slow down recovery a bit under recovery workloads.
However, I don't believe that it will be a problem in the scenario you
mentioned.

 * ZFS on Ceph ? Any thoughts ?
>

just don't ;)

 * What about CephFS ? We'd like to use RBD diff for backups but it looks
> impossible to use snapshot diff with Cephfs ?


Please see https://docs.ceph.com/docs/master/dev/cephfs-snapshots/
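
In short, CephFS snapshots are taken per directory through the hidden
.snap directory (a sketch, assuming the filesystem is mounted under
/mnt/cephfs and snapshots are enabled on your release):

mkdir /mnt/cephfs/projects/.snap/2019-12-03   # snapshot of "projects"
ls /mnt/cephfs/projects/.snap/                # list existing snapshots
rmdir /mnt/cephfs/projects/.snap/2019-12-03   # delete the snapshot again

There is no equivalent of "rbd export-diff" for these snapshots, so
incremental backups are usually done with an external tool (e.g. rsync)
against such a snapshot.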

If you have questions or want some consulting to get the best Ceph cluster
for the job, please feel free to contact us.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Di., 3. Dez. 2019 um 21:07 Uhr schrieb Fabien Sirjean <
fsirj...@eddie.fdn.fr>:

> Hi Ceph users !
>
> After years of using Ceph, we plan to build soon a new cluster bigger than
> what
> we've done in the past. As the project is still in reflection, I'd like to
> have your thoughts on our planned design : any feedback is welcome :)
>
>
> ## Requirements
>
>  * ~1 PB usable space for file storage, extensible in the future
>  * The files are mostly "hot" data, no cold storage
>  * Purpose : storage for big files being essentially used on windows
> workstations (10G access)
>  * Performance is better :)
>
>
> ## Global design
>
>  * 8+3 Erasure Coded pool
>  * ZFS on RBD, exposed via samba shares (cluster with failover)
>
>
> ## Hardware
>
>  * 1 rack (multi-site would be better, of course...)
>
>  * OSD nodes : 14 x supermicro servers
>* 24 usable bays in 2U rackspace
>* 16 x 10 TB nearline SAS HDD (8 bays for future needs)
>* 2 x Xeon Silver 4212 (12C/24T)
>* 128 GB RAM
>* 4 x 40G QSFP+
>
>  * Networking : 2 x Cisco N3K 3132Q or 3164Q
>* 2 x 40G per server for ceph network (LACP/VPC for HA)
>* 2 x 40G per server for public network (LACP/VPC for HA)
>* QSFP+ DAC cables
>
>
> ## Sizing
>
> If we've done the maths well, we expect to have :
>
>  * 2.24 PB of raw storage, extensible to 3.36 PB by adding HDD
>  * 1.63 PB expected usable space with 8+3 EC, extensible to 2.44 PB
>  * ~1 PB of usable space if we want to keep the OSD use under 66% to allow
>loosing nodes without problem, extensible to 1.6 PB (same condition)
>
>
> ## Reflections
>
>  * We're used to run mons and mgrs daemons on a few of our OSD nodes,
> without
>any issue so far : is this a bad idea for a big cluster ?
>
>  * We thought using cache tiering on an SSD pool, but a large part of the
> PB is
>used on a daily basis, so we expect the cache to be not so effective and
>really expensive ?
>
>  * Could a 2x10G network be enough ?
>
>  * ZFS on Ceph ? Any thoughts ?
>
>  * What about CephFS ? We'd like to use RBD diff for backups but it looks
>impossible to use snapshot diff with Cephfs ?
>
>
> Thanks for reading, a

[ceph-users] Re: High swap usage on one replication node

2019-12-07 Thread Martin Verges
Hello,

I would suggest to:

~# swapoff -a
~# vi /etc/fstab
... remove swap line ...

and buy additional RAM if required. Without knowing your exact use case,
128GB would be our minimum RAM for simple use cases, and most likely not
enough for EC and complex configurations.

Swap is nothing you want to have in a server, as it is very slow and can
cause long downtimes.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Sa., 7. Dez. 2019 um 12:34 Uhr schrieb Xavier Trilla <
xavier.tri...@clouding.io>:

> Hi there,
>
> I think we have our OSD nodes setup with vm.swappiness = 0
>
> If I remember correctly few years ago vm.swappiness = 0 was changed and
> now it does not prevent swapping it just reduces the changes of memory
> being send to swap.
>
> Cheers,
> Xavier.
> -Mensaje original-
> De: Götz Reinicke 
> Enviado el: viernes, 6 de diciembre de 2019 8:14
> Para: ceph-users 
> Asunto: [ceph-users] High swap usage on one replication node
>
> Hi,
>
> our Ceph 14.2.3 cluster so far runs smooth with replicated and EC pools,
> but since a couple of days one of the dedicated replication nodes consumes
> up to 99% swap and stays at that level. The other two replicated nodes use
> +- 50 - 60% of swap.
>
> All the 24 NVMe OSDs per node are BlueStore with default settings, 128GB
> RAM. The vm.swappiness is set to 10.
>
> Do you have any suggestions how to handle/reduce the swap usage?
>
> Thanks for feedback and regards . Götz
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: High swap usage on one replication node

2019-12-08 Thread Martin Verges
Without reading the links:

From more than 20 years of Linux server and datacenter hosting, around 7
years of Ceph, and hundreds of different systems all configured without
swap: I have never had a problem running without swap that would have been
solved by using swap.

But on my Linux desktop, swap helps me a bit.

By the way, all of the several hundred croit-based Ceph deployments are
100% swap-free. As we boot over the network directly into RAM, there is no
swap disk available. We don't have any issues with this, and I doubt we
ever will.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

Nigel Williams  schrieb am So., 8. Dez. 2019,
23:15:

> On Sun, 8 Dec 2019 at 00:53, Martin Verges  wrote:
> > Swap is nothing you want to have in a Server as it is very slow and can
> cause long downtimes.
>
> Given the commentary on this page advocating at least some swap to
> enable Linux to manage memory when under pressure:
>
> https://utcc.utoronto.ca/~cks/space/blog/unix/NoSwapConsequence
>
> is it worth modifying the advice to at least have some swap available
> (even if only say 5% of overall memory)?
>
> There was a hnews thread here but to me it seemed inconclusive about
> solving the overall problem (other than applications taking more
> responsibility for memory consumption):
>
> https://news.ycombinator.com/item?id=20641551
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph mgr daemon multiple ip addresses

2019-12-09 Thread Martin Verges
There should be no issue and we have a lot of systems with multiple IPs.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

Frank R  schrieb am Mo., 9. Dez. 2019, 17:55:

> Hi all,
>
> Does anyone know what possible issues can arise if the ceph mgr daemon is
> running on a mon node that has 2 ips in the public net range (1 is a
> loopback address).
>
> As I understand the it. mgr will bind to all ips
>
> FYI - I am not sure why the loopback is there, I am trying to find out.
>
> thx
> Frank
>
>
>
>
> mlovell - ceph anycast
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph on CentOS 8?

2019-12-13 Thread Martin Verges
Hello Manuel,

if you want to get rid of all the OS type problems, you can use our free
community version to deploy Ceph.
We make sure every dependency is met and you do not need to worry about
anything like that anymore.

How to do that?
 - Deploy the croit docker container on an independent management node
 - Import your cluster using our assistant/wizard
 - Reboot host by host with boot from network option
 - Done

After that, whenever you want to migrate to a newer version or release,
there is a button that you just click, and that's all you need to do from
that point on. No hassle, no pain, no OS trouble. It all comes with
absolutely no costs!

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Fr., 13. Dez. 2019 um 17:44 Uhr schrieb Sage Weil :

> On Fri, 13 Dec 2019, Manuel Lausch wrote:
> > Hi,
> >
> > I am interested in el8 Packages as well.
> > Is there any plan to provide el8 packages in the near future?
>
> Ceph Octopus will be based on CentOS 8.  It's due out in March.
>
> The centos8 transition is awkward because our python 2 dependencies don't
> exist on in centos8, and it is a huge amount of effort to produce them.
> Octopus switches to python 3, but those dependencies cannot be produces
> for centos7.  So the nautilus->octopus upgrade will be either involve a
> transition to the new containerized deployment (either cephadm or
> ceph-ansible's container mode) or a simultaneous upgrade of the OS and
> Ceph.
>
> sage
>
>
>
> >
> > Regards
> > Manuel
> >
> > On Mon, 2 Dec 2019 11:16:01 +0100
> > Jan Kasprzak  wrote:
> >
> > > Hello, Ceph users,
> > >
> > > does anybody use Ceph on recently released CentOS 8? Apparently there
> > > are no el8 packages neither at download.ceph.com, nor in the native
> > > CentOS package tree. I am thinking about upgrading my cluster to C8
> > > (because of other software running on it apart from Ceph). Do el7
> > > packages simply work? Can they be rebuilt using rpmbuild --rebuild?
> > > Or is running Ceph on C8 more complicated than that?
> > >
> > > Thanks,
> > >
> > > -Yenya
> > >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Benchmark results for Seagate Exos2X14 Dual Actuator HDDs

2020-01-16 Thread Martin Verges
Hello,

according to some prices we have heard so far, the Seagate dual-actuator
HDD will cost around 15-20% more than a single-actuator drive.

We can help with a good hardware selection if interested.

--
Martin Verges
Managing director

Hint: Secure one of the last slots in the upcoming 4-day Ceph Intensive
Training at https://croit.io/training/4-days-ceph-in-depth-training.

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Do., 16. Jan. 2020 um 10:11 Uhr schrieb mj :

> Hi,
>
> Interesting technology!
>
> It seems they have only one capacity: 14TB? Or are they planning
> different sizes as well? Also the linked pdf mentions just this one disk.
>
> And obviouly the price would be interesting to know...
>
> MJ
>
> On 1/16/20 9:51 AM, Konstantin Shalygin wrote:
> > On 1/15/20 11:58 PM, Paul Emmerich wrote:
> >>
> >> we ran some benchmarks with a few samples of Seagate's new HDDs that
> >> some of you might find interesting:
> >>
> >> Blog post:
> >> https://croit.io/2020/01/06/2020-01-06-benchmark-mach2
> >>
> >> GitHub repo with scripts and raw data:
> >> https://github.com/croit/benchmarks/tree/master/mach2-disks
> >>
> >> Tl;dr: way faster for writes, somewhat faster for reads in some
> scenarios
> >
> > Very interesting, thanks for sharing results. Price is available?
> >
> >
> >
> > k
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Micron SSD/Basic Config

2020-01-31 Thread Martin Verges
Hello Adam,

Can you describe what performance values you want to get out of your
cluster?
What's the use case?
EC or replica?

In general, more disks are preferred over bigger ones.
As Micron has not provided us with demo hardware, we can't say how fast
these disks are in reality. Before thinking about 40 vs 25/50/100 GbE, I
would work on reducing the latency of these disks.

--
Martin Verges
Managing director

Hint: Secure one of the last slots in the upcoming 4-day Ceph Intensive
Training at https://croit.io/training/4-days-ceph-in-depth-training.

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Fr., 31. Jan. 2020 um 13:58 Uhr schrieb Adam Boyhan :

> Looking to role out a all flash Ceph cluster. Wanted to see if anyone else
> was using Micron drives along with some basic input on my design so far?
>
> Basic Config
> Ceph OSD Nodes
> 8x Supermicro A+ Server 2113S-WTRT
> - AMD EPYC 7601 32 Core 2.2Ghz
> - 256G Ram
> - AOC-S3008L-L8e HBA
> - 10GB SFP+ for client network
> - 40GB QSFP+ for ceph cluster network
>
> OSD
> 10x Micron 5300 PRO 7.68TB in each ceph node
> - 80 total drives across the 8 nodes
>
> WAL/DB
> 5x Micron 7300 MAX NVMe 800GB per Ceph Node
> - Plan on dedicating 1 for each 2 OSD's
>
> Still thinking out a external monitor node as I have a lot of options, but
> this is a pretty good start. Open to suggestions as well!
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions on Erasure Coding

2020-02-02 Thread Martin Verges
Hello Dave,

you can configure Ceph to pick multiple OSDs per host and therefore behave
like a classic RAID.
It will cause downtime whenever you have to do maintenance on a system, but
if you plan to grow the cluster quite fast, it may be an option for you.
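
A rough sketch of such a CRUSH rule for an 8+3 EC pool on 3 hosts, to be
added to a decompiled CRUSH map (the id and the numbers are only an example
and have to match your setup):

rule ec-multiple-osds-per-host {
    id 2
    type erasure
    min_size 3
    max_size 11
    step set_chooseleaf_tries 5
    step set_choose_tries 100
    step take default
    step choose indep 3 type host
    step chooseleaf indep 4 type osd
    step emit
}

This picks 3 hosts and up to 4 OSDs on each of them, so no single host ever
holds more than 4 of the 11 shards.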

--
Martin Verges
Managing director

Hint: Secure one of the last slots in the upcoming 4-day Ceph Intensive
Training at https://croit.io/training/4-days-ceph-in-depth-training.

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am So., 2. Feb. 2020 um 05:11 Uhr schrieb Dave Hall :

> Hello.
>
> Thanks to advice from bauen1 I now have OSDs on Debian/Nautilus and have
> been able to move on to MDS and CephFS.  Also, looking around in the
> Dashboard I noticed the options for Crush Failure Domain and further
> that it's possible to select 'OSD'.
>
> As I mentioned earlier our cluster is fairly small at this point (3
> hosts, 24 OSDs) , but we want to get as much usable storage as possible
> until we can get more nodes.  SInce the nodes are brand new we are
> probably more concerned about disk failures than about node failures for
> the next few months.
>
> If I interpret Crush Failure Domain = OSD, this means it's possible to
> create pools that behave somewhat similar to RAID 6 - something like 8 +
> 2 except dispersed across multiple nodes.  With the pool spread around
> like this loosing any one disk shouldn't put the cluster into read-only
> mode - if a disk did fail, would the cluster re-balance and reconstruct
> the lost data until the failed OSD was replaced.
>
> Does this make sense?  Or is it just wishful thinking.
>
> Thanks.
>
> -Dave
>
> --
> Dave Hall
> Binghamton University
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph positions

2020-02-03 Thread Martin Verges
Hello Frank,

we are always looking for Ceph/Linux consultants.

--
Martin Verges
Managing director

Hint: Secure one of the last slots in the upcoming 4-day Ceph Intensive
Training at https://croit.io/training/4-days-ceph-in-depth-training.

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Mo., 3. Feb. 2020 um 17:26 Uhr schrieb Frank R :

> Hi all,
>
> I really hope this isn't seen as spam. I am looking to find a position
> where I can focus on Linux storage/Ceph. If anyone is currently
> looking please let me know. Linkedin profile frankritchie.
>
> Thanks,
> Frank
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: EC Pools w/ RBD - IOPs

2020-02-13 Thread Martin Verges
Hello,

please do not even think about using an EC pool (k=2, m=1). See other posts
here, just don't.

EC works quite well and we have a lot of users with EC-based VMs, often
with Proxmox (RBD) or VMware (iSCSI) hypervisors.
Performance depends on the hardware and is definitely slower than
replicated pools, but it is cost efficient and more than OK for most
workloads. If you split generic VMs and databases (or similar workloads),
you can save a lot of money with EC.
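
As a sketch of a more sensible setup than 2+1 (pool name, PG count and k/m
are placeholders and have to match your number of hosts):

ceph osd erasure-code-profile set ec42 k=4 m=2 crush-failure-domain=host
ceph osd pool create ecpool 128 128 erasure ec42
ceph osd pool application enable ecpool rbd
ceph osd pool set ecpool allow_ec_overwrites true   # required for RBD on EC
rbd create --size 100G --data-pool ecpool rbd/vm-disk-1

The RBD image metadata still lives in a replicated pool (here "rbd"), only
the data objects go to the EC pool.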

--
Martin Verges
Managing director

Hint: Secure one of the last slots in the upcoming 4-day Ceph Intensive
Training at https://croit.io/training/4-days-ceph-in-depth-training.

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Do., 13. Feb. 2020 um 17:52 Uhr schrieb Anthony Brandelli (abrandel) <
abran...@cisco.com>:

> Hi Ceph Community,
>
> Wondering what experiences good/bad you have with EC pools for iops
> intensive workloads (IE: 4Kish random IO from things like VMWare ESXi). I
> realize that EC pools are a tradeoff between more usable capacity, and
> having larger latency/lower iops, but in my testing the tradeoff for small
> IO seems to be much worse than I had anticipated.
>
> On an all flash 3x replicated pool we’re seeing 45k random read, and 35k
> random write iops testing with fio on a client living on an iSCSI LUN
> presented to an ESXi host. Average latencies for these ops are 4.2ms, and
> 5.5ms, which is respectable at an io depth of 32.
>
> Take this same setup with an EC pool (k=2, m=1, tested with both ISA and
> jerasure, ISA does give better performance for our use case) and we see 30k
> random read, and 16k random write iops. Random reads see 6.5ms average,
> while random writes suffer with 12ms average.
>
> Are others using EC pools seeing similar hits to random writes with small
> IOs? Any way to improve this?
>
> Thanks,
> Anthony
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Performance of old vs new hw?

2020-02-18 Thread Martin Verges
It depends on your current SSDs and the new SSDs. It is highly likely that
most of the performance increase will come from choosing good new NVMe
drives. In addition, a higher clock frequency will increase IOPS as well,
but only if the CPU is the bottleneck.

--
Martin Verges
Managing director

Hint: Secure one of the last slots in the upcoming 4-day Ceph Intensive
Training at https://croit.io/training/4-days-ceph-in-depth-training.

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Mo., 17. Feb. 2020 um 19:55 Uhr schrieb :

>
> Hi
>
> We have some oldish servers with ssds - all on 25gbit nics. R815 AMD -
> 2,4ghz+
>
> Is there significant performance benefits in moving to a new NVMe based,
> new cpus?
>
> +20% IOPs? + 50% IOPs?
>
> Jesper
>
>
>
> Sent from myMail for iOS
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Nautilus 14.2.8

2020-03-03 Thread Martin Verges
*cough* use croit to deploy your cluster, then you have a well-tested
OS+Ceph image and no random version changes ;) *cough*

--
Martin Verges
Managing director

Hint: Secure one of the last slots in the upcoming 4-day Ceph Intensive
Training at https://croit.io/training/4-days-ceph-in-depth-training.

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Di., 3. März 2020 um 13:00 Uhr schrieb Fyodor Ustinov :

> Hi!
>
> > I really do not care about these 1-2 days in between, why are you? Do
> > not install it, configure yum to  lock a version, update your local repo
> > less frequent.
>
> I already asked this question - what to do to those who today decide to
> install the CEPH for the first time?
>
> ceph-deploy installs from ceph repo. Ceph repo already have 14.2.8 and no
> any information about this version.
>
> >
> >
> >
> > -Original Message-
> > Sent: 03 March 2020 11:22
> > To: ceph-users
> > Subject: [ceph-users] Nautilus 14.2.8
> >
> > Hi!
> >
> > Again. New version in repository without announce.
> >
> > :(
> >
> > I wonder who needs to write a letter and complain that there would
> > always be an announcement, and then a new version in the repository?
> >
> > WBR,
> >Fyodor.
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> > email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Is there a better way to make a samba/nfs gateway?

2020-03-13 Thread Martin Verges
Hello,

we have a CTDB-based HA Samba setup in our Ceph Management Solution.
It works like a charm, and we connect it to existing Active Directories as
well.

It's based on vfs_ceph and you can read more about how to configure it
yourself on
https://www.samba.org/samba/docs/current/man-html/vfs_ceph.8.html.
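
A minimal share definition with vfs_ceph looks roughly like this (a
sketch; the cephx user, path and share name are assumptions for
illustration):

[projects]
    path = /shares/projects
    vfs objects = ceph
    ceph:config_file = /etc/ceph/ceph.conf
    ceph:user_id = samba
    kernel share modes = no
    read only = no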

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Fr., 13. März 2020 um 13:06 Uhr schrieb Nathan Fish :

> Note that we have had issues with deadlocks when re-exporting CephFS
> via Samba. It appears to only occur with Mac clients, though. In some
> cases it has hung on a request for a high-level directory and hung
> that branch for all clients.
>
> On Fri, Mar 13, 2020 at 1:56 AM Konstantin Shalygin 
> wrote:
> >
> >
> > On 3/11/20 11:16 PM, Seth Galitzer wrote:
> > > I have a hybrid environment and need to share with both Linux and
> > > Windows clients. For my previous iterations of file storage, I
> > > exported nfs and samba shares directly from my monolithic file server.
> > > All Linux clients used nfs and all Windows clients used samba. Now
> > > that I've switched to ceph, things are a bit more complicated. I built
> > > a gateway to export nfs and samba as needed, and connect that as a
> > > client to my ceph cluster.
> > >
> > > After having file locking problems with kernel nfs, I made the switch
> > > to nfs-ganesha, which has helped immensely. For Linux clients that
> > > have high I/O needs, like desktops and some web servers, I connect to
> > > ceph directly for those shares. For all other Linux needs, I use nfs
> > > from the gateway. For all Windows clients (desktops and a small number
> > > of servers), I use samba exported from the gateway.
> > >
> > > Since my ceph cluster went live in August, I have had some kind of
> > > strange (to me) error at least once a week, almost always related to
> > > the gateway client. Last night, it was MDS_CLIENT_OLDEST_TID. Since
> > > we're on Spring Break at my university and not very busy, I decided to
> > > unmount/remount the ceph share, requiring stopping nfs and samba
> > > services. Stopping nfs-ganesha took a while, but it finally completed
> > > with no complaints from the ceph cluster. Stopping samba took longer
> > > and gave me MDS_SLOW_REQUEST and MDS_CLIENT_LATE_RELEASE on the mds.
> > > It finally finished, and I was able to unmount/remount the ceph share
> > > and that finally cleared all the errors.
> > >
> > > This is leading me to believe that samba on the gateway and all the
> > > clients attaching to that is putting a strain on the connection back
> > > to ceph. Which finally brings me to my question: is there a better way
> > > to export samba to my clients using the ceph back end? Or is this as
> > > good as it gets and I just have to put up with the seemingly frequent
> > > errors? I can live with the errors and have been able to handle them
> > > so far, but I know people who have much bigger clusters and many more
> > > clients than me (by an order of magnitude) and don't see nearly as
> > > many errors as I do. Which is why I'm trying to figure out what is
> > > special about my setup.
> > >
> > > All my ceph nodes are running latest nautilus on Centos 7 (I just
> > > updated last week to 14.2.8), as is the gateway host. I'm mounting
> > > ceph directly on the gateway (by way of the kernel using cephfs, not
> > > rados/rbd) to a single mount point and exporting from there.
> > >
> > > My searches so far have not turned up anything extraordinarily useful,
> > > so I'm asking for some guidance here. Any advice is welcome.
> >
> > You can connect to your cluster directly from userland, without kernel.
> > Use Samba vfs_ceph for this.
> >
> >
> >
> > k
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Is there a better way to make a samba/nfs gateway? (Marc Roos)

2020-03-14 Thread Martin Verges
Hello Chad,

There are plenty, ranging from avoiding the problems caused by lost connections
with the kernel CephFS mount to a much simpler service setup.
But what would be the point of stacking different tools (kernel mount, SMB
service, ...) together untested just because you can?

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Fr., 13. März 2020 um 16:24 Uhr schrieb Chad William Seys <
cws...@physics.wisc.edu>:

> Awhile back I thought there were some limitations which prevented us
> from trying this, but I cannot remember...
>
> What does the ceph vfs gain you over exporting by cephfs kernel module
> (kernel 4.19).  What does it lose you?
>
> (I.e. pros and cons versus kernel module?)
>
> Thanks!
> C.
>
> > It's based on vfs_ceph and you can read more about how to configure it
> > yourself on
> > https://www.samba.org/samba/docs/current/man-html/vfs_ceph.8.html.
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: New 3 node Ceph cluster

2020-03-14 Thread Martin Verges
Hello Amudhan,

I will be using Cephfs + with samba for max 10 clients for upload and
> download.
>

Please use the Samba VFS module (vfs_ceph) and not the kernel mount.

Earlier I have tested orchestration using ceph-deploy in the test setup.
> now, is there any other alternative to ceph-deploy?
>

Yes, try our deployment tool. It brings you Ceph the easy way, including
everything from Ceph RBD, S3, CephFS, NFS, iSCSI and SMB, hassle free.

Storage Node HW is Intel Xeon E5v2 8 core single Proc, 32GB RAM and 10Gb
> Nic 2 nos., 6TB SATA  HDD 24 Nos. each node, OS separate SSD disk
>

You need at least 4 GB of RAM per HDD, so using 24 disks per system is not
suggested: that alone calls for roughly 96 GB of RAM, far more than the 32 GB
you have planned. With croit, you won't need an OS disk, as the storage node
gets live-booted over the network using PXE. Depending on your requirements,
the CPU could be a bottleneck as well.

Can I restrict folder access to the user using cephfs + vfs samba or should
> I use ceph client + samba?
>

In croit you can attach the Samba service to an Active Directory to make
use of its permissions. You can configure them by hand as well.

Ubuntu or Centos?
>

Debian ;) it's the best. But Ubuntu is ok as well.

Any block size consideration for object size, metadata when using cephfs?
>

Leave them at the defaults unless you know you have a special case. A lot
of the issues we see in the wild come from bad configurations copy-pasted
from some random page found on Google.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Sa., 14. März 2020 um 08:17 Uhr schrieb Amudhan P :

> Hi,
>
> I am planning to create a new 3 node ceph storage cluster.
>
> I will be using Cephfs + with samba for max 10 clients for upload and
> download.
>
> Storage Node HW is Intel Xeon E5v2 8 core single Proc, 32GB RAM and 10Gb
> Nic 2 nos., 6TB SATA  HDD 24 Nos. each node, OS separate SSD disk.
>
> Earlier I have tested orchestration using ceph-deploy in the test setup.
> now, is there any other alternative to ceph-deploy?
>
> Can I restrict folder access to the user using cephfs + vfs samba or should
> I use ceph client + samba?
>
> Ubuntu or Centos?
>
> Any block size consideration for object size, metadata when using cephfs?
>
> Idea or suggestion from existing users. I am also going to start to explore
> all the above.
>
> regards
> Amudhan
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: New 3 node Ceph cluster

2020-03-15 Thread Martin Verges
This is too little memory. We have already seen an MDS with well over 50 GB
of RAM requirements.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am So., 15. März 2020 um 14:34 Uhr schrieb Amudhan P :

> Thank you, All for your suggestions and ideas.
>
> what is your view on using MON, MGR, MDS and cephfs client or samba-ceph
> vfs in a single machine (10 core xeon CPU with 16GB RAM and SSD disk)?.
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions on Ceph cluster without OS disks

2020-03-22 Thread Martin Verges
Hello Samuel,

we at croit.io don't use NFS to boot up servers. We copy the OS directly
into RAM (approximately 0.5-1 GB). Think of it like a container: you
start it and throw it away when you no longer need it.
This way we can save the slots of OS hard disks to add more storage per node
and reduce overall costs, as 1 GB of RAM is cheaper than an OS disk and consumes
less power.

If our management node is down, nothing will happen to the cluster. No
impact, no downtime. However, you do need the mgmt node to boot up the
cluster. So after a very rare total power outage, your first system would
be the mgmt node and then the cluster itself. But again, if you configure
your systems correctly, no manual work is required to recover from that. For
everything else, it is possible (but definitely not needed) to deploy our
mgmt node in active/passive HA.

We have several hundred installations worldwide in production
environments. Our strong PXE knowledge comes from more than 20 years of
datacenter hosting experience, and it has never failed us in the last >10
years.

The main benefits out of that:
 - Immutable OS freshly booted: Every host has exactly the same version,
same library, kernel, Ceph versions,...
 - OS is heavily tested by us: Every croit deployment has exactly the same
image. We can find errors much faster and hit much fewer errors.
 - Easy updates: Updating the OS, Ceph or anything else is just a node reboot.
No cluster downtime, no service impact, fully automatic handling by our mgmt
software.
 - No need to install OS: No maintenance costs, no labor required, no other
OS management required.
 - Centralized Logs/Stats: As it is booted in memory, all logs and
statistics are collected on a central place for easy access.
 - Easy to scale: It doesn't matter whether you boot 3 or 300 nodes, they all
boot the exact same image in a few seconds.
 .. lots more

Please do not hesitate to contact us directly. We always try to offer an
excellent service and are strongly customer oriented.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Sa., 21. März 2020 um 13:53 Uhr schrieb huxia...@horebdata.cn <
huxia...@horebdata.cn>:

> Hello, Martin,
>
> I notice that Croit advocate the use of ceph cluster without OS disks, but
> with PXE boot.
>
> Do you use a NFS server to serve the root file system for each node? such
> as hosting configuration files, user and password, log files, etc. My
> question is, will the NFS server be a single point of failure? If the NFS
> server goes down, the network experience any outage, ceph nodes may not be
> able to write to the local file systems, possibly leading to service outage.
>
> How do you deal with the above potential issues in production? I am a bit
> worried...
>
> best regards,
>
> samuel
>
>
>
>
> --
> huxia...@horebdata.cn
>
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions on Ceph cluster without OS disks

2020-03-23 Thread Martin Verges
Hello Thomas,

by default we allocate 1 GB per host on the management node, and nothing on the
PXE-booted server.

This value can be changed in the management container config file
(/config/config.yml):
> ...
> logFilesPerServerGB: 1
> ...
After changing the config, you need to restart the mgmt container.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Mo., 23. März 2020 um 09:30 Uhr schrieb Thomas Schneider <
74cmo...@gmail.com>:

> Hello Martin,
>
> how much disk space do you reserve for log in the PXE setup?
>
> Regards
> Thomas
>
> Am 22.03.2020 um 20:50 schrieb Martin Verges:
> > Hello Samuel,
> >
> > we from croit.io don't use NFS to boot up Servers. We copy the OS
> directly
> > into the RAM (approximately 0.5-1GB). Think of it like a container, you
> > start it and throw it away when you no longer need it.
> > This way we can save the slots of OS harddisks to add more storage per
> node
> > and reduce overall costs as 1GB ram is cheaper then an OS disk and
> consumes
> > less power.
> >
> > If our management node is down, nothing will happen to the cluster. No
> > impact, no downtime. However, you do need the mgmt node to boot up the
> > cluster. So after a very rare total power outage, your first system would
> > be the mgmt node and then the cluster itself. But again, if you configure
> > your systems correct, no manual work is required to recover from that.
> For
> > everything else, it is possible (but definitely not needed) to deploy our
> > mgmt node in active/passive HA.
> >
> > We have multiple hundred installations worldwide in production
> > environments. Our strong PXE knowledge comes from more than 20 years of
> > datacenter hosting experience and it never ever failed us in the last >10
> > years.
> >
> > The main benefits out of that:
> >  - Immutable OS freshly booted: Every host has exactly the same version,
> > same library, kernel, Ceph versions,...
> >  - OS is heavily tested by us: Every croit deployment has exactly the
> same
> > image. We can find errors much faster and hit much fewer errors.
> >  - Easy Update: Updating OS, Ceph or anything else is just a node reboot.
> > No cluster downtime, No service Impact, full automatic handling by our
> mgmt
> > Software.
> >  - No need to install OS: No maintenance costs, no labor required, no
> other
> > OS management required.
> >  - Centralized Logs/Stats: As it is booted in memory, all logs and
> > statistics are collected on a central place for easy access.
> >  - Easy to scale: It doesn't matter if you boot 3 oder 300 nodes, all
> > boot the exact same image in a few seconds.
> >  .. lots more
> >
> > Please do not hesitate to contact us directly. We always try to offer an
> > excellent service and are strongly customer oriented.
> >
> > --
> > Martin Verges
> > Managing director
> >
> > Mobile: +49 174 9335695
> > E-Mail: martin.ver...@croit.io
> > Chat: https://t.me/MartinVerges
> >
> > croit GmbH, Freseniusstr. 31h, 81247 Munich
> > CEO: Martin Verges - VAT-ID: DE310638492
> > Com. register: Amtsgericht Munich HRB 231263
> >
> > Web: https://croit.io
> > YouTube: https://goo.gl/PGE1Bx
> >
> >
> > Am Sa., 21. März 2020 um 13:53 Uhr schrieb huxia...@horebdata.cn <
> > huxia...@horebdata.cn>:
> >
> >> Hello, Martin,
> >>
> >> I notice that Croit advocate the use of ceph cluster without OS disks,
> but
> >> with PXE boot.
> >>
> >> Do you use a NFS server to serve the root file system for each node?
> such
> >> as hosting configuration files, user and password, log files, etc. My
> >> question is, will the NFS server be a single point of failure? If the
> NFS
> >> server goes down, the network experience any outage, ceph nodes may not
> be
> >> able to write to the local file systems, possibly leading to service
> outage.
> >>
> >> How do you deal with the above potential issues in production? I am a
> bit
> >> worried...
> >>
> >> best regards,
> >>
> >> samuel
> >>
> >>
> >>
> >>
> >> --
> >> huxia...@horebdata.cn
> >>
> >>
> >>
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions on Ceph cluster without OS disks

2020-03-24 Thread Martin Verges
Hello Thomas,

we export the logs using systemd-journal-remote / systemd-journal-upload.
Long-term retention can be done by configuring an external syslog / ELK / ...
target in our config file.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Di., 24. März 2020 um 08:47 Uhr schrieb Thomas Schneider <
74cmo...@gmail.com>:

> Hello Martin,
>
> I suspect you're using a central syslog server.
> Can you share information which central syslog server you use?
> Is this central server running on ceph cluster, too?
>
> Regards
> Thomas
>
> Am 23.03.2020 um 09:39 schrieb Martin Verges:
>
> Hello Thomas,
>
> by default we allocate 1GB per Host on the Management Node, nothing on the
> PXE booted server.
>
> This value can be changed in the management container config file
> (/config/config.yml):
> > ...
> > logFilesPerServerGB: 1
> > ...
> After changing the config, you need to restart the mgmt container.
>
> --
> Martin Verges
> Managing director
>
> Mobile: +49 174 9335695
> E-Mail: martin.ver...@croit.io
> Chat: https://t.me/MartinVerges
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
>
> Web: https://croit.io
> YouTube: https://goo.gl/PGE1Bx
>
>
> Am Mo., 23. März 2020 um 09:30 Uhr schrieb Thomas Schneider <
> 74cmo...@gmail.com>:
>
>> Hello Martin,
>>
>> how much disk space do you reserve for log in the PXE setup?
>>
>> Regards
>> Thomas
>>
>> Am 22.03.2020 um 20:50 schrieb Martin Verges:
>> > Hello Samuel,
>> >
>> > we from croit.io don't use NFS to boot up Servers. We copy the OS
>> directly
>> > into the RAM (approximately 0.5-1GB). Think of it like a container, you
>> > start it and throw it away when you no longer need it.
>> > This way we can save the slots of OS harddisks to add more storage per
>> node
>> > and reduce overall costs as 1GB ram is cheaper then an OS disk and
>> consumes
>> > less power.
>> >
>> > If our management node is down, nothing will happen to the cluster. No
>> > impact, no downtime. However, you do need the mgmt node to boot up the
>> > cluster. So after a very rare total power outage, your first system
>> would
>> > be the mgmt node and then the cluster itself. But again, if you
>> configure
>> > your systems correct, no manual work is required to recover from that.
>> For
>> > everything else, it is possible (but definitely not needed) to deploy
>> our
>> > mgmt node in active/passive HA.
>> >
>> > We have multiple hundred installations worldwide in production
>> > environments. Our strong PXE knowledge comes from more than 20 years of
>> > datacenter hosting experience and it never ever failed us in the last
>> >10
>> > years.
>> >
>> > The main benefits out of that:
>> >  - Immutable OS freshly booted: Every host has exactly the same version,
>> > same library, kernel, Ceph versions,...
>> >  - OS is heavily tested by us: Every croit deployment has exactly the
>> same
>> > image. We can find errors much faster and hit much fewer errors.
>> >  - Easy Update: Updating OS, Ceph or anything else is just a node
>> reboot.
>> > No cluster downtime, No service Impact, full automatic handling by our
>> mgmt
>> > Software.
>> >  - No need to install OS: No maintenance costs, no labor required, no
>> other
>> > OS management required.
>> >  - Centralized Logs/Stats: As it is booted in memory, all logs and
>> > statistics are collected on a central place for easy access.
>> >  - Easy to scale: It doesn't matter if you boot 3 oder 300 nodes, all
>> > boot the exact same image in a few seconds.
>> >  .. lots more
>> >
>> > Please do not hesitate to contact us directly. We always try to offer an
>> > excellent service and are strongly customer oriented.
>> >
>> > --
>> > Martin Verges
>> > Managing director
>> >
>> > Mobile: +49 174 9335695
>> > E-Mail: martin.ver...@croit.io
>> > Chat: https://t.me/MartinVerges
>> >
>> > croit GmbH, Freseniusstr. 31h, 81247 Munich
>> > CEO: Martin Verges - VAT-ID: DE310638492
>> > Com. r

[ceph-users] Re: luminous: osd continue down because of the hearbeattimeout

2020-04-02 Thread Martin Verges
Hello,

Check your network; maybe a link is flapping or something else is wrong. Your
port shows a high dropped-packet count as well. Use a tool like Smokeping to
detect loss within your network, or, if you are using croit, use the network
loss detection feature within the statistics view.
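
For a quick manual check before setting up Smokeping, a plain ping between the
OSD nodes at a lower interval already shows loss quite well, for example
(address taken from your log, run it in both directions):

  ping -c 500 -i 0.2 10.255.255.55 | tail -n 2

Anything above 0% packet loss at that rate is worth investigating, as is a
growing number of errors or drops on the bond members.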

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Do., 2. Apr. 2020 um 04:10 Uhr schrieb linghucongsong <
linghucongs...@163.com>:

> HI! all! Thanks for reading this msg.
>
>
> I have one Ceph cluster installed with Ceph v12.2.12. It ran well for
> about half a year.
>
>
> Last week we added another two machines to this Ceph cluster. Then all the
> OSDs became unstable.
>
>
> The OSD async messenger complains that the OSDs cannot heartbeat each other,
> but the network ping shows no dropped packets and no error packets.
>
>
> I use bond0 for the Ceph cluster front and back network. Now that I have set
> nodown and noout the cluster has become stable,
>
>
> but in the log I see a lot of async messenger errors. I have tried the simple
> messenger, and it shows the same error.
>
>
> All the OSDs log errors like the ones below:
>
>
> NG_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg accept
> replacing existing (lossy) channel (new one lossy=1)
> 2020-04-02 09:59:17.989469 7f42794da700  0 -- 10.255.255.54:6814/106
> >> 10.255.255.56:0/7 conn(0x55721e0e5800 :6814
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg
> accept replacing existing (lossy) channel (new one lossy=1)
> 2020-04-02 09:59:17.989557 7f42784d8700  0 -- 10.255.255.54:6819/106
> >> 10.255.255.52:0/7 conn(0x55721e0e8800 :6819
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg
> accept replacing existing (lossy) channel (new one lossy=1)
> 2020-04-02 09:59:17.989728 7f4278cd9700  0 -- 10.255.255.54:6814/106
> >> 10.255.255.55:0/7 conn(0x55722973b000 :6814
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg
> accept replacing existing (lossy) channel (new one lossy=1)
> 2020-04-02 09:59:17.989872 7f42794da700  0 -- 10.255.255.54:6819/106
> >> 10.255.255.55:0/7 conn(0x557225b15000 :6819
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg
> accept replacing existing (lossy) channel (new one lossy=1)
> 2020-04-02 09:59:17.990111 7f42794da700  0 -- 10.255.255.54:6819/106
> >> 10.255.255.55:0/7 conn(0x557228506000 :6819
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg
> accept replacing existing (lossy) channel (new one lossy=1)
> 2020-04-02 09:59:17.990161 7f42784d8700  0 -- 10.255.255.54:6819/106
> >> 10.255.255.56:0/7 conn(0x55722632 :6819
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg
> accept replacing existing (lossy) channel (new one lossy=1)
> 2020-04-02 09:59:17.990196 7f42794da700  0 -- 10.255.255.54:6814/106
> >> 10.255.255.56:0/7 conn(0x55722650b000 :6814
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg
> accept replacing existing (lossy) channel (new one lossy=1)
> 2020-04-02 09:59:17.991450 7f4278cd9700  0 -- 10.255.255.54:6819/106
> >> 10.255.255.55:0/7 conn(0x5572298d7800 :6819
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg
> accept replacing existing (lossy) channel (new one lossy=1)
> 2020-04-02 09:59:17.991458 7f42784d8700  0 -- 10.255.255.54:6814/106
> >> 10.255.255.52:0/7 conn(0x557226f19000 :6814
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg
> accept replacing existing (lossy) channel (new one lossy=1)
> 2020-04-02 09:59:17.991639 7f4278cd9700  0 -- 10.255.255.54:6819/106
> >> 10.255.255.52:0/7 conn(0x557226867800 :6819
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg
> accept replacing existing (lossy) channel (new one lossy=1)
> 2020-04-02 09:59:17.991798 7f42794da700  0 -- 10.255.255.54:6814/106
> >> 10.255.255.56:0/7 conn(0x55722a20b000 :6814
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg
> accept replacing existing (lossy) channel (new one lossy=1)
> 2020-04-02 09:59:17.991842 7f42784d8700  0 -- 10.255.255.54:6819/106
> >> 10.255.255.56:0/7 conn(0x557226869000 :6819
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg
> accept replacing existing (lossy) channel (new one lossy=1)
>
>
> The network config:
> bond0 Link encap:Ethernet  HWaddr 6c:92:bf:c2:8e:e

[ceph-users] Re: Questions on Ceph cluster without OS disks

2020-04-05 Thread Martin Verges
Hello Brent,

no, swap is definitely not needed if you configure your systems correctly.
Swap on Ceph nodes kills your performance and brings a lot of harm to
clusters. It increases downtime, decreases performance and can
result in much longer recovery times, which endangers your data.

In the old days, swap was required because you could not fit enough
memory into your systems. Today's servers do not require a swap partition,
and I have personally disabled it on all my systems for the past >10 years. As
my last company was a datacenter provider with several thousand systems, I
believe I have quite some insight into whether that is stable.

What happens if you run out of memory, you might ask? Simple: the OOM killer
kills one process and systemd restarts it; the service is back up in a few
seconds.
Can you choose which process is most likely to be killed? Yes, you can. Take a
look at /proc/*/oom_adj.
What happens if swap gets filled up instead? Total destruction ;). The OOM
killer still kills one process, but freeing up swap takes much longer, the
system load skyrockets, services become unresponsive, and Ceph client IO can
drop to near zero... just save yourself the trouble.
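
For reference, these are the few commands involved if you want to follow that
approach; a sketch only, adjust the fstab edit to your layout, and note that
current kernels use oom_score_adj (range -1000 to 1000) instead of the older
oom_adj:

  # turn swap off now and keep it off after the next reboot
  swapoff -a
  sed -i '/\sswap\s/d' /etc/fstab
  # make a given daemon an unlikely OOM victim (-1000 exempts it completely)
  echo -500 > /proc/<pid>/oom_score_adj

With systemd you can also set this per service via OOMScoreAdjust= in the unit
file instead of writing to /proc.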

So yes, we strongly believe we have a far superior system by design,
simply by avoiding swap altogether.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am So., 5. Apr. 2020 um 01:59 Uhr schrieb Brent Kennedy :

> Forgive me for asking, but it seems most OSes require a swap file, and when
> I look into doing something similar (meaning not having any), they all
> say the OS could become unstable without it.  It seems that anyone doing this
> needs to be 100% certain that memory will never be used at 100%, or the OS
> would crash if no swap were there.  How are you getting around this, and has
> it ever been a thing?
>
> Also, for the ceph OSDs, where are you storing the osd and host
> configurations ( central storage? )?
>
> Regards,
> -Brent
>
> Existing Clusters:
> Test: Nautilus 14.2.2 with 3 osd servers, 1 mon/man, 1 gateway, 2 iscsi
> gateways ( all virtual on nvme )
> US Production(HDD): Nautilus 14.2.2 with 11 osd servers, 3 mons, 4
> gateways, 2 iscsi gateways
> UK Production(HDD): Nautilus 14.2.2 with 12 osd servers, 3 mons, 4 gateways
> US Production(SSD): Nautilus 14.2.2 with 6 osd servers, 3 mons, 3
> gateways, 2 iscsi gateways
>
>
>
>
> -Original Message-
> From: Martin Verges 
> Sent: Sunday, March 22, 2020 3:50 PM
> To: huxia...@horebdata.cn
> Cc: ceph-users 
> Subject: [ceph-users] Re: Questions on Ceph cluster without OS disks
>
> Hello Samuel,
>
> we from croit.io don't use NFS to boot up Servers. We copy the OS
> directly into the RAM (approximately 0.5-1GB). Think of it like a
> container, you start it and throw it away when you no longer need it.
> This way we can save the slots of OS harddisks to add more storage per
> node and reduce overall costs as 1GB ram is cheaper then an OS disk and
> consumes less power.
>
> If our management node is down, nothing will happen to the cluster. No
> impact, no downtime. However, you do need the mgmt node to boot up the
> cluster. So after a very rare total power outage, your first system would
> be the mgmt node and then the cluster itself. But again, if you configure
> your systems correct, no manual work is required to recover from that. For
> everything else, it is possible (but definitely not needed) to deploy our
> mgmt node in active/passive HA.
>
> We have multiple hundred installations worldwide in production
> environments. Our strong PXE knowledge comes from more than 20 years of
> datacenter hosting experience and it never ever failed us in the last >10
> years.
>
> The main benefits out of that:
>  - Immutable OS freshly booted: Every host has exactly the same version,
> same library, kernel, Ceph versions,...
>  - OS is heavily tested by us: Every croit deployment has exactly the same
> image. We can find errors much faster and hit much fewer errors.
>  - Easy Update: Updating OS, Ceph or anything else is just a node reboot.
> No cluster downtime, No service Impact, full automatic handling by our
> mgmt Software.
>  - No need to install OS: No maintenance costs, no labor required, no
> other OS management required.
>  - Centralized Logs/Stats: As it is booted in memory, all logs and
> statistics are collected on a central place for easy access.
>  - Easy to scale: It doesn't matter if you boot 3 oder 300 nodes, all boot
> the exact same image in a few seconds.
>  .. lots more
>
> Please do not hesitate to c

[ceph-users] Re: Questions on Ceph cluster without OS disks

2020-04-05 Thread Martin Verges
Hello Brent,

just use
https://pages.croit.io/croit/v2002/getting-started/installation.html.
Our free community edition provides all the logic, and you can use it to
run a reliable PXE-booted Ceph system.

If you want to see it in action, please feel free to contact me and I will
give you a live presentation and answer all your questions.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am So., 5. Apr. 2020 um 20:13 Uhr schrieb Brent Kennedy :

> I agree with the sentiment regarding swap, however it seems the OS devs
> still suggest having a swap, even if it's small.  We monitor swap file usage
> and there is none in the Ceph clusters. I am mainly looking at eliminating
> it (assuming it's "safe" to do so), but don't want to risk production
> machines just to save some OS space on disk.  However, the idea of loading
> the OS into memory is very interesting to me, at least in the instance of a
> production environment.  Not that it’s a new thing, more so in the use case
> of ceph clusters.  We already run all the command and control on VMs, so
> running the OSD host server OS’s in memory seems like a nifty idea to allow
> us to fully use every disk bay.  We have some older 620s that use an SD
> card on mirror (which is not super reliable in practice); they might be
> good candidates for this.  I am just wondering how we would drop in the
> correct ceph configuration files during boot without needing to do tons of
> scripting ( the clusters are 15-20 machines ).
>
>
>
> -Brent
>
>
>
> *From:* Martin Verges 
> *Sent:* Sunday, April 5, 2020 3:04 AM
> *To:* Brent Kennedy 
> *Cc:* huxia...@horebdata.cn; ceph-users 
> *Subject:* Re: [ceph-users] Re: Questions on Ceph cluster without OS disks
>
>
>
> Hello Brent,
>
>
>
> no, swap is definitely not needed if you configure systems correctly.
>
> Swap in Ceph kills all your performance and brings a lot of harm to
> clusters. It increases the downtime, decreases the performance and can
> result in much longer recovery times which endangers your data.
>
>
>
> In the very old times, swap was required as you were unable to have enough
> memory in your systems. Today's server does not require a swap partition
> and I personally disable it on all my systems in the past >10y. As my last
> company was a datacenter provider with multiple thousand systems, I
> believe to have quite some insights if that is stable.
>
>
>
> What happens if you run out of memory you might ask? - simple, OOM killer
> kills one process and systemd restarts it, service is back up in a few
> seconds.
>
> Can you choose what process is killed most likely? - yes you can. Take a
> look into /proc/*/oom_adj
>
> What happens if I swap gets filled up? - total destruction ;), your OOM
> killer kills one process, freeing up swap takes a much longer time, system
> load skyrocks, services become unresponsive, Ceph client IO can drop to
> near zero... just save yourself the trouble.
>
>
>
> So yes, we strongly believe to have a far superior system by design
> by just preventing swap at all.
>
>
> --
>
> Martin Verges
> Managing director
>
> Mobile: +49 174 9335695
> E-Mail: martin.ver...@croit.io
> Chat: https://t.me/MartinVerges
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
>
> Web: https://croit.io
> YouTube: https://goo.gl/PGE1Bx
>
>
>
>
>
> Am So., 5. Apr. 2020 um 01:59 Uhr schrieb Brent Kennedy <
> bkenn...@cfl.rr.com>:
>
> Forgive me for asking but it seems most OS's require a swap file and when
> I look into doing something similar(meaning not having anything), they all
> say the OS could go unstable without it.  It seems that anyone doing this
> needs to be 100 certain memory will not be used at 100% ever or the OS
> would crash if no swap was there.  How are you getting around this and has
> it ever been a thing?
>
> Also, for the ceph OSDs, where are you storing the osd and host
> configurations ( central storage? )?
>
> Regards,
> -Brent
>
> Existing Clusters:
> Test: Nautilus 14.2.2 with 3 osd servers, 1 mon/man, 1 gateway, 2 iscsi
> gateways ( all virtual on nvme )
> US Production(HDD): Nautilus 14.2.2 with 11 osd servers, 3 mons, 4
> gateways, 2 iscsi gateways
> UK Production(HDD): Nautilus 14.2.2 with 12 osd servers, 3 mons, 4 gateways
> US Production(SSD): Nautilus 14.2.2 with 6 osd servers, 3 mons, 3
> ga
