[ceph-users] Re: User + Dev Meetup Tomorrow!

2024-05-24 Thread Sebastian Wagner

Hi Frédéric,

I agree. Maybe we should re-frame things? Containers can run on bare metal or 
virtualized, and distribution packages can likewise run on bare metal or 
virtualized.


What about asking independently about:

 * Do you run containers or distribution packages?
 * Do you run bare-metal or virtualized?

Best,
Sebastian

On 24.05.24 at 12:28, Frédéric Nass wrote:

Hello everyone,

Nice talk yesterday. :-)

Regarding containers vs RPMs and orchestration, and the related discussion from 
yesterday, I wanted to share a few things (which I wasn't able to share 
yesterday on the call due to a headset/bluetooth stack issue) to explain why we 
use cephadm and ceph orch these days with bare-metal clusters even though, as 
someone said, cephadm was not supposed to work with (nor support) bare-metal 
clusters (which actually surprised me since cephadm is all about managing 
containers on a host, regardless of its type). I also think this explains the 
observation that was made that half of the reports (iirc) are supposedly using 
cephadm with bare-metal clusters.

Over the years, we've deployed and managed bare-metal clusters with ceph-deploy 
in Hammer, then switched to ceph-ansible (take-over-existing-cluster.yml) with 
Jewel (or was it Luminous?), and then moved to cephadm, cephadm-ansible and 
ceph orch with Pacific, to manage the exact same bare-metal cluster. I guess 
this explains why some bare-metal clusters today are managed using cephadm. 
These are not new clusters deployed with Rook in K8s environments, but existing 
bare-metal clusters that continue to serve brilliantly 10 years after 
installation.

Regarding RPMs vs containers, as mentioned during the call, I'm not sure why one 
would still want to use RPMs over containers considering the simplicity and 
velocity that containers offer for upgrades with ceph orch's clever 
automation. Some reported performance reasons for preferring RPMs over containers, 
meaning RPM binaries would perform better than containerized ones. Is there any 
evidence of that?
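To illustrate the velocity argument: with ceph orch a whole-cluster upgrade boils 
down to something like this (a sketch, the version string is only an example):

  ceph orch upgrade start --ceph-version 17.2.6
  ceph orch upgrade status   # daemons are restarted one by one, in the right order
  ceph -s                    # cluster health stays visible throughout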

Perhaps the reason people still use RPMs is rather that they have invested 
a lot of time and effort into developing automation tools/scripts/playbooks for 
RPM installations and they consider the transition to ceph orch and 
containerized environments a significant challenge.

Regarding containerized Ceph, I remember asking Sage for a minimalist CephOS 
back in 2018 (there were no containers at that time). IIRC, he said maintaining 
a Ceph-specific Linux distro would take too much time and resources, so it was 
not something considered at that time. Now that Ceph is all containers, I 
really hope that a minimalist rolling Ceph distro comes out one day. ceph orch 
could even handle rare distro upgrades such as kernel upgrades as well as 
ordered reboots. This would make Ceph clusters much easier to maintain over 
time (compared to the last complicated upgrade path from non-containerized 
RHEL7+RHCS4.3 to containerized RHEL9+RHCS5.2 that we had to follow a year ago).

Bests,
Frédéric.

- On 23 May 24, at 15:58, Laura Flores lflo...@redhat.com wrote:


Hi all,

The meeting will be starting shortly! Join us at this link:
https://meet.jit.si/ceph-user-dev-monthly

- Laura

On Wed, May 22, 2024 at 2:55 PM Laura Flores  wrote:


Hi all,

The User + Dev Meetup will be held tomorrow at 10:00 AM EDT. We will be
discussing the results of the latest survey, and users who attend will have
the opportunity to provide additional feedback in real time.

See you there!
Laura Flores

Meeting Details:
https://www.meetup.com/ceph-user-group/events/300883526/

--

Laura Flores

She/Her/Hers

Software Engineer, Ceph Storage

Chicago, IL

lflo...@ibm.com | lflo...@redhat.com
M: +17087388804




--

Laura Flores

She/Her/Hers

Software Engineer, Ceph Storage

Chicago, IL

lflo...@ibm.com | lflo...@redhat.com
M: +17087388804
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

--
Head of Software Development
E-Mail: sebastian.wag...@croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges, Andy Muthmann - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Is it possible to stripe rados object?

2022-01-26 Thread Sebastian Wagner
libradosstriper ?
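A quick way to try it without writing code, assuming your rados CLI was built with 
striper support, is the --striper flag (pool and object names below are made up):

  rados --striper -p testpool put bigobject ./bigfile
  rados --striper -p testpool stat bigobject
  rados -p testpool ls    # the stripe chunks appear as separate RADOS objects

The same behaviour is available programmatically through the libradosstriper API on 
top of a normal rados ioctx.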

On 26.01.22 at 10:16, lin yunfan wrote:
> Hi,
> I know with rbd and cephfs there is a stripe setting to stripe data
> into multiple rodos object.
> Is it possible to use librados api to stripe a large object into many
> small ones?
>
> linyunfan
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Single Node Cephadm Upgrade to Pacific

2022-01-10 Thread Sebastian Wagner
Hi Nathan,

Should work, as long as you have two MGRs deployed. Please have a look at:

ceph config set mgr mgr/mgr_standby_modules false
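A minimal sketch of how the rest of the upgrade could then be driven (instead of 
editing unit.run by hand; the version matches the image you already pulled):

  ceph orch apply mgr 2                    # second MGR so the upgrade can fail over
  ceph config set mgr mgr/mgr_standby_modules false
  ceph orch upgrade start --ceph-version 16.2.0
  ceph orch upgrade status

That should walk the remaining mon, crash, mds and OSD daemons to v16.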

Best,
Sebastian

On 08.01.22 at 17:44, Nathan McGuire wrote:
> Hello!
>
> I'm running into an issue with upgrading Cephadm v15 to v16 on a single host. 
> I've found a recent discussion at 
> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/WGALKHM5ZVS32IX7AVHU2TN76JTRVCRY/
>  and have manually updated the unit.run to pull the v16.2.0 image for mgr but 
> other services are still running on v15.
>
> NAME HOST   STATUS REFRESHED  AGE  PORTS  VERSION 
>  IMAGE ID  CONTAINER ID
> alertmanager.prod1   prod1  running (68m)  2m ago 9M   -  0.20.0  
>  0881eb8f169f  1d076486c019
> crash.prod1  prod1  running (68m)  2m ago 9M   -  15.2.13 
>  2cf504fded39  ffa06d65577a
> mds.cephfs.prod1.awlcoq  prod1  running (68m)  2m ago 9M   -  15.2.13 
>  2cf504fded39  21e0cbb21ee4
> mgr.prod1.bxenuc prod1  running (59m)  2m ago 9M   -  16.2.0  
>  24ecd6d5f14c  cf0a7d5af51d
> mon.prod1prod1  running (68m)  2m ago 9M   -  15.2.13 
>  2cf504fded39  1d1a0cba5414
> node-exporter.prod1  prod1  running (68m)  2m ago 9M   -  0.18.1  
>  e5a616e4b9cf  41ec9f0fcfb1
> osd.0prod1  running (68m)  2m ago 9M   -  15.2.13 
>  2cf504fded39  353d308ecc6e
> osd.1prod1  running (68m)  2m ago 9M   -  15.2.13 
>  2cf504fded39  2ccc28d5aa3e
> osd.2prod1  running (68m)  2m ago 9M   -  15.2.13 
>  2cf504fded39  a98009d4726e
> osd.3prod1  running (68m)  2m ago 9M   -  15.2.13 
>  2cf504fded39  aa8f84c6edb5
> osd.4prod1  running (68m)  2m ago 9M   -  15.2.13 
>  2cf504fded39  ccbc89a0a41c
> osd.5prod1  running (68m)  2m ago 9M   -  15.2.13 
>  2cf504fded39  c6cd024f2f73
> osd.6prod1  running (68m)  2m ago 9M   -  15.2.13 
>  2cf504fded39  e38ff4a66c7c
> osd.7prod1  running (68m)  2m ago 9M   -  15.2.13 
>  2cf504fded39  55ce0bcfa0e3
> osd.8prod1  running (68m)  2m ago 9M   -  15.2.13 
>  2cf504fded39  ac6c0c8eaac8
> osd.9prod1  running (68m)  2m ago 9M   -  15.2.13 
>  2cf504fded39  f5978d39b51d
> prometheus.prod1 prod1  running (68m)  2m ago 9M   -  2.18.1  
>  de242295e225  d974a83515fd
>
> Any ideas on how to get the rest of the cluster to v16 besides just mgr?
> Thanks!
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: airgap install

2021-12-17 Thread Sebastian Wagner
Hi Zoran,

I'd like to have this properly documented in the Ceph documentation as
well. I just created https://github.com/ceph/ceph/pull/44346 to add the
monitoring images to that section. Feel free to review this one.
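Until that lands, a rough sketch of an air-gapped bootstrap against a local mirror 
(registry.local:5000 is a placeholder for your own registry):

  cephadm --image registry.local:5000/ceph/ceph:v16.2.7 bootstrap \
      --mon-ip <mon-ip> --skip-monitoring-stack

followed by the mgr/cephadm/container_image_* settings quoted below to point the 
monitoring stack at the same registry.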

Sebastian

On 17.12.21 at 11:06, Zoran Bošnjak wrote:
> Kai, thank you for your answer. It looks like the "ceph config set mgr..." 
> commands are the key part, to specify my local registry. However, I haven't 
> got that far with the installation. I have tried various options, but I have 
> problems already with the bootstrap step.
>
> I have documented the procedure (and the errors) here:
> https://github.com/zoranbosnjak/ceph-install#readme
>
> Would you please have a look and suggest corrections.
> Ideally, I would like to run administrative commands from a dedicated (admin) 
> node... or alternatively to setup mon nodes to be able to run administrative 
> commands...
>
> regards,
> Zoran
>
> - Original Message -
> From: "Kai Stian Olstad" 
> To: "Zoran Bošnjak" 
> Cc: "ceph-users" 
> Sent: Thursday, December 16, 2021 9:40:22 AM
> Subject: Re: [ceph-users] airgap install
>
> On Mon, Dec 13, 2021 at 06:18:55PM +, Zoran Bošnjak wrote:
>> I am using "ubuntu 20.04" and I am trying to install "ceph pacific" version 
>> with "cephadm".
>>
>> Are there any instructions available about using "cephadm bootstrap" and 
>> other related commands in an airgap environment (that is: on the local 
>> network, without internet access)?
> Unfortunately they say cephadm is stable but I would call it beta because of
> lacking features, bugs and missing documentation.
>
> I can give you some pointers.
>
> The best source to find the images you need is in cephadm code and for 16.2.7
> you find it here [1].
>
> cephadm bootstrap has the --image option to specify what image to use.
> I also run the bootstrap with --skip-monitoring-stack, otherwise it fails since
> it can't find the images.
>
> After that you can point the monitoring containers to your registry:
> cephadm shell
> ceph config set mgr mgr/cephadm/container_image_prometheus <image>
> ceph config set mgr mgr/cephadm/container_image_node_exporter <image>
> ceph config set mgr mgr/cephadm/container_image_grafana <image>
> ceph config set mgr mgr/cephadm/container_image_alertmanager <image>
>
> Check the result with
> ceph config get mgr
>
> To deploy the monitoring
> ceph mgr module enable prometheus
> ceph orch apply node-exporter '*'
> ceph orch apply alertmanager --placement ...
> ceph orch apply prometheus --placement ...
> ceph orch apply grafana --placement ...
>
>
> This should be what you need to get Ceph running in an isolated network.
>
> [1] https://github.com/ceph/ceph/blob/v16.2.7/src/cephadm/cephadm#L50-L61
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Octopus: conversion from ceph-ansible to Cephadm causes unexpected 15.2.15→.13 downgrade for MDSs and RGWs

2021-12-16 Thread Sebastian Wagner
Hi Florian, hi Guillaume

On 16.12.21 at 14:18, Florian Haas wrote:
> Hello everyone,
>
> my colleagues and I just ran into an interesting situation updating
> our Ceph training course. That course's labs cover deploying a
> Nautilus cluster with ceph-ansible, upgrading it to Octopus (also with
> ceph-ansible), and then converting it to Cephadm before proceeding
> with the upgrade to Pacific.

I'd go a different route actually.

 1. convert the nautilus cluster to be containerized
 2. upgrade the containerized cluster to Pacific using ceph-ansible
 3. run the adopt playbook.

Not because this path is better or worse, but because it's better tested.
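Roughly, with ceph-ansible's infrastructure playbooks (a sketch; playbook names can 
differ between ceph-ansible releases, so check your checkout):

  ansible-playbook -i hosts infrastructure-playbooks/switch-from-non-containerized-to-containerized-ceph-daemons.yml
  ansible-playbook -i hosts infrastructure-playbooks/rolling_update.yml   # with ceph_docker_image_tag pointing at a Pacific release
  ansible-playbook -i hosts infrastructure-playbooks/cephadm-adopt.yml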

Guillaume, should we recommend this somehow in the ceph-ansible docs?

Best,
Sebastian

>
> When freshly upgraded to Octopus with ceph-ansible, the entire cluster
> is at version 15.2.15. And everything that is then being adopted into
> Cephadm management (with "cephadm adopt --style legacy") gets
> containers running that release. So far, so good.
>
> When we've completed the adoption process for MGRs, MONs, and OSDs, we
> proceed to redeploying our MDSs and RGWs, using "ceph orch apply mds"
> and "ceph orch apply rgw". Here, what we end up with is a bunch of
> MDSs and RGWs running on 15.2.13. Since the cluster previously ran
> Ansible-deployed 15.2.15 MDSs and RGWs, that makes this a partial (and
> very unexpected) downgrade.
>
> The docs at https://docs.ceph.com/en/octopus/cephadm/adoption/ do
> state that we can use "cephadm --image " to set the image. But
> we don't actually need that when we invoke cephadm directly ("cephadm
> adopt" does pull the correct image). Rather we'd need to set the
> correct image for deployment by "ceph orch apply", and there doesn't
> seem to be a straightforward way to do that.
>
> I suppose that this can be worked around in a couple of ways:
>
> * by following the documentation and then running "ceph orch upgrade
> start --ceph-version 15.2.15" immediately after;
> * by running "ceph orch daemon redeploy", which does support an
> --image parameter (but is per-daemon, thus less convenient than
> running through a rolling update).
>
> But I'd argue that none of those additional steps should actually be
> necessary — rather, "ceph orch apply" should just deploy the correct
> (latest) version without additional user involvement.
>
> The documentation seems to suggest another approach, namely to use an
> updated service spec, but unfortunately that won't work as we can't
> set "image" that way. Example for the rgw service:
>
> ---
> # rgw.yml
> service_type: rgw
> service_id: default.default
> placement:
>   count: 3
> image: "quay.io/ceph/ceph:v15"
> ports:
>   - 7480
>
> # ceph orch apply -i rgw.yaml
> Error EINVAL: ServiceSpec: __init__() got an unexpected keyword
> argument 'image'
Right, this won't work.
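What should work instead is pinning the image cephadm uses and letting the 
orchestrator converge on it, roughly like this (the daemon name is a placeholder):

  ceph config set global container_image quay.io/ceph/ceph:v15.2.15
  ceph orch daemon redeploy rgw.default.default.host1.abcdef
  # or, as you note, run a same-version upgrade to converge everything at once:
  ceph orch upgrade start --ceph-version 15.2.15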
>
> So, we're curious what's the correct way to ensure that "ceph orch
> apply" installs the latest Octopus release for MDSs and RGWs being
> redeployed as part of a Cephadm cluster conversion. Or is this simply
> a bug somewhere in the orchestrator that would need fixing?
>
> Cheers,
> Florian
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: v16.2.7 Pacific released

2021-12-08 Thread Sebastian Wagner
Hi Robert,

it would have been much better to avoid this NFS situation altogether by
avoiding those different implementations in the first place.
Unfortunately this wasn't the case and I agree this is not great.

In any case, here are the manual steps that are performed by the
migration automatically, in case something goes wrong:

https://github.com/ceph/ceph/pull/44252

I hope that helps!

Best,
Sebastian


On 08.12.21 at 10:42, Robert Sander wrote:
> On 08.12.21 at 01:11, David Galloway wrote:
>
>> * Cephadm & Ceph Dashboard: NFS management has been completely reworked
>> to ensure that NFS exports are managed consistently across the different
>> Ceph components. Prior to this, there were 3 incompatible
>> implementations for configuring the NFS exports: Ceph-Ansible/OpenStack
>> Manila, Ceph Dashboard and 'mgr/nfs' module. With this release the
>> 'mgr/nfs' way becomes the official interface, and the remaining
>> components (Cephadm and Ceph Dashboard) adhere to it. While this might
>> require manually migrating from the deprecated implementations, it will
>> simplify the user experience for those heavily relying on NFS exports.
>
> This change is introduced in a point release?
>
> After upgrading a cluster all NFS shares have to be configured again
> and in the meantime NFS services do not work. Not so great IMHO.
>
> Regards
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 16.2.7 pacific QE validation status, RC1 available for testing

2021-12-02 Thread Sebastian Wagner

On 29.11.21 at 18:23, Yuri Weinstein wrote:
> Details of this release are summarized here:
>
> https://tracker.ceph.com/issues/53324
> Release Notes - https://github.com/ceph/ceph/pull/44131
>
> Seeking approvals for:
>
> rados - Neha
rados/cephadm looks good. Except for https://tracker.ceph.com/issues/53365
> rgw - Casey
> rbd - Ilya, Deepika
> krbd  Ilya, Deepika
> fs - Venky, Patrick
> upgrade/nautilus-x - Neha, Josh
> upgrade/pacific-p2p - Neha, Josh
>
> 
> We are also publishing a release candidate this time for users to try
> for testing only.
>
> The branch name is pacific-16.2.7_RC1
> (https://shaman.ceph.com/builds/ceph/pacific-16.2.7_RC1/fdc003bc12f1b2443c4596eeacb32cf62e806970/)
>
> ***Don’t use this RC on production clusters!***
>
> The idea of doing Release Candidates (RC) for point releases before
> doing final point releases was discussed in the first-ever Ceph User +
> Dev Monthly Meeting. Everybody thought that it was a good idea, to
> help identify bugs that do not get caught in integration testing.
>
> The goal is to give users time to test and give feedback on RC
> releases while our upstream long-running cluster also runs the same RC
> release during that time (period of one week). We will kick this
> process off with 16.2.7 and the pacific-16.2.7_RC1 release is now
> available for users to test.
>
> Please respond to this email to provide any feedback on issues found
> in this release.
>
> Thx
> YuriW
>
> ___
> Dev mailing list -- d...@ceph.io
> To unsubscribe send an email to dev-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Expose rgw using consul or service discovery

2021-11-09 Thread Sebastian Wagner

On 09.11.21 at 15:58, Pierre GINDRAUD wrote:
> Coming back to radosgw deployment: I've tested the cephadm ingress
> service and these are my findings:
>
> The haproxy service is deployed but not "managed" by cephadm, see the
> sources:
> https://github.com/ceph/ceph/blob/9ab9cc26e200cdc3108525770353b91b3dd6c6d8/src/pybind/mgr/cephadm/services/ingress.py
> So, when cephadm shuts down a radosgw backend, it does not "drain" or "put
> in maintenance" the haproxy backend first. Haproxy continues to serve
> requests to the failed backend until it is marked down by the healthcheck.
> Fortunately, the new retry feature of HAProxy 2
> https://www.haproxy.com/fr/blog/haproxy-layer-7-retries-and-chaos-engineering/
> will retry failed requests on another backend. But as the documentation
> notes, not all failure cases are handled. So when the server
> (radosgw) returns an empty answer, haproxy does not retry the request
> and forwards the 502 to the client. We could enable the "retry-on
> all-retryable-errors" option, but what about retrying a POST or a PUT
> against an API? If the first request went through and only its
> answer was broken, the first "action" may still have completed successfully.
> In addition, the haproxy configuration file is not "fully" customizable:
> https://github.com/ceph/ceph/blob/9ab9cc26e200cdc3108525770353b91b3dd6c6d8/src/pybind/mgr/cephadm/templates/services/ingress/haproxy.cfg.j2
> does not allow for a custom log format.
you can overwrite this template (compare it to the monitoring templates,
https://docs.ceph.com/en/latest/cephadm/services/monitoring/#using-custom-configuration-files
). This way you can do whatever you want right now.
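A sketch of how that looks (double-check the exact config-key path against the 
custom-template docs for your release; the service name is a placeholder):

  # store your adjusted copy of the shipped haproxy.cfg.j2, then reconfigure
  ceph config-key set mgr/cephadm/services/ingress/haproxy.cfg -i ./haproxy.cfg.j2
  ceph orch reconfig ingress.rgw.default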
>
> In light of these findings, I'm wondering whether cephadm should approach
> the problem differently. For example, my previous proposal for
> "pre-tasks" and "post-tasks", or allowing service registration in a backend
> such as "consul".
Maybe!
>
> Finally, in our setup, I will certainly deploy my own haproxy (using
> our infrastructure tools) and use consul and healthcheck to have a
> setup similar to ingress service but in our standards.
>
> Do you think any of my proposals could realistically be brought to the Ceph
> developer team?

At some point the amount of flexibility provided by cephadm's services
will reach its limit. And at that point one will need an escape hatch.
Right now the escape hatch is making those templates overwritable.
Is that enough?


>
> On 23/10/2021 01:47, Maged Mokhtar wrote:
>>
>>>> In PetaSAN we use Consul to provide a service mesh for running
>>>> services active/active over Ceph.
>>>>
>>>> For rgw, we use nginx to load balance rgw gateways, the nginx
>>>> themselves run in an active/active ha setup so they do not become a
>>>> bottleneck as you pointed out with the haproxy setup.
>>>
>>
>>> How do you manage rgw upgrade ? do you use cephadm or any other
>>> automation tool ?
>>>
>>> How is nginx configured to talk to rgw ? using a upstream an a proxy
>>> pass ?
>>>
>>>
>> PetaSAN is a Ceph storage appliance based on Ubuntu OS and SUSE
>> kernel. We rely on Consul service mesh to scale the service/gateways
>> layer in a scale-out active/active fashion, this is for iSCSI, NFS,
>> SMB and S3.
>> Upgrades are done live via apt upgrade We do not use cephadm, we
>> provide a web based deployment ui (wizard like steps) as well as ui
>> for cluster management.
>> For nginx, we use the upstream method to configure the load balancing
>> of the rgws. The nginx config file is dynamically created/updated by
>> a python script which receives notifications from Consul (nodes
>> added/nodes down/ip changes..).
>> You can read more on our website
>> http://www.petasan.org <http://www.petasan.org>
>>
>>
>>>>
>>>> /Maged
>>>>
>>>> On 22/10/2021 16:41, Pierre GINDRAUD wrote:
>>>>>
>>>>> On 20/10/2021 10:17, Sebastian Wagner wrote:
>>>>>> Am 20.10.21 um 09:12 schrieb Pierre GINDRAUD:
>>>>>>> Hello,
>>>>>>>
>>>>>>> I'm migrating from puppet to cephadm to deploy a ceph cluster,
>>>>>>> and I'm
>>>>>>> using consul to expose radosgateway. Before, with puppet, we were
>>>>>>> deploying radosgateway with "apt install radosgw" and applying
>>>>>>> upgrade
>>>>>>> using "apt upgrade radosgw". In our 

[ceph-users] Re: cephadm does not find podman objects for osds

2021-10-28 Thread Sebastian Wagner
Some thoughts:

  * Do you have any error messages from the MDS daemons? (see the quick
    sketch below for gathering them)

https://docs.ceph.com/en/latest/cephadm/troubleshooting/#gathering-log-files 
  * Do you have any error messages from the OSDs?
  * What do you mean by "osd podman object"?
  * Try downgrading to 3.0.1
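For the log gathering, a minimal sketch on the affected host (the fsid is the one 
from your output below, daemon names are examples):

  cephadm ls                               # what cephadm thinks is deployed
  cephadm logs --name mds.s1               # wraps journalctl for that daemon unit
  journalctl -u ceph-86bbd6c5-ae96-4c78-8a5e-50623f0ae524@osd.0 --since "1 hour ago"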


On 25.10.21 at 23:05, Magnus Harlander wrote:
> Hi,
>
> after converting my 2 node cluster to cephadm I'm in lots of trouble.
>
> - containerized mds are not available in the cluster. I must
>   run mds from systemd to make my fs available.
>
> - osd podman objects are not found after a reboot of one node. I
>   don't want to test it on the second node, because if it loses
>   the podman config as well, my cluster is dead!
>
> Can anybody help me?
>
> Best regards Magnus
>
> Output from cephadm.log:
>
> cephadm ['--image', 'docker.io/ceph/ceph:v15', '--no-container-init', 'ls']
> 2021-10-25 22:47:01,929 DEBUG container_init=False
> 2021-10-25 22:47:01,929 DEBUG Running command: systemctl is-enabled
> ceph-mds@s1
> 2021-10-25 22:47:01,935 DEBUG systemctl: stdout enabled
> 2021-10-25 22:47:01,935 DEBUG Running command: systemctl is-active
> ceph-mds@s1
> 2021-10-25 22:47:01,940 DEBUG systemctl: stdout active
> 2021-10-25 22:47:01,940 DEBUG Running command: ceph -v
> 2021-10-25 22:47:02,009 DEBUG ceph: stdout ceph version 15.2.13
> (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus (stable)
> 2021-10-25 22:47:02,016 DEBUG Running command: systemctl is-enabled
> ceph-osd@1
> 2021-10-25 22:47:02,024 DEBUG systemctl: stdout disabled
> 2021-10-25 22:47:02,024 DEBUG Running command: systemctl is-active
> ceph-osd@1
> 2021-10-25 22:47:02,031 DEBUG systemctl: stdout inactive
> 2021-10-25 22:47:02,031 DEBUG Running command: systemctl is-enabled
> ceph-86bbd6c5-ae96-4c78-8a5e-50623f0ae524@mon.s1
> 2021-10-25 22:47:02,036 DEBUG systemctl: stdout enabled
> 2021-10-25 22:47:02,036 DEBUG Running command: systemctl is-active
> ceph-86bbd6c5-ae96-4c78-8a5e-50623f0ae524@mon.s1
> 2021-10-25 22:47:02,041 DEBUG systemctl: stdout active
> 2021-10-25 22:47:02,041 DEBUG Running command: /bin/podman --version
> 2021-10-25 22:47:02,068 DEBUG /bin/podman: stdout podman version 3.2.3
> 2021-10-25 22:47:02,069 DEBUG Running command: /bin/podman inspect
> --format {{.Id}},{{.Config.Image}},{{.Image}},{{.Created}},{{index
> .Config.Labels "io.ceph.version"}}
> ceph-86bbd6c5-ae96-4c78-8a5e-50623f0ae524-mon.s1
> 2021-10-25 22:47:02,111 DEBUG /bin/podman: stdout
> 1fd6debed26923212e3b3b263e88505d2b70fe024b7a1c01105299bb746d7c48,docker.io/ceph/ceph:v15,2cf504fded3980c76b59a354fca8f301941f86e369215a08752874d1ddb69b73,2021-10-25
> 22:45:39.315992691 +0200 CEST,
> 2021-10-25 22:47:02,203 DEBUG Running command: /bin/podman exec
> 1fd6debed26923212e3b3b263e88505d2b70fe024b7a1c01105299bb746d7c48 ceph -v
> 2021-10-25 22:47:02,356 DEBUG /bin/podman: stdout ceph version 15.2.13
> (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus (stable)
> 2021-10-25 22:47:02,434 DEBUG Running command: systemctl is-enabled
> ceph-86bbd6c5-ae96-4c78-8a5e-50623f0ae524@mgr.s1
> 2021-10-25 22:47:02,440 DEBUG systemctl: stdout enabled
> 2021-10-25 22:47:02,440 DEBUG Running command: systemctl is-active
> ceph-86bbd6c5-ae96-4c78-8a5e-50623f0ae524@mgr.s1
> 2021-10-25 22:47:02,445 DEBUG systemctl: stdout active
> 2021-10-25 22:47:02,445 DEBUG Running command: /bin/podman --version
> 2021-10-25 22:47:02,468 DEBUG /bin/podman: stdout podman version 3.2.3
> 2021-10-25 22:47:02,470 DEBUG Running command: /bin/podman inspect
> --format {{.Id}},{{.Config.Image}},{{.Image}},{{.Created}},{{index
> .Config.Labels "io.ceph.version"}}
> ceph-86bbd6c5-ae96-4c78-8a5e-50623f0ae524-mgr.s1
> 2021-10-25 22:47:02,513 DEBUG /bin/podman: stdout
> 2b08ddb6182e14985939e50a00e2306a77c3068a65ed122dab8bd5604c91af65,docker.io/ceph/ceph:v15,2cf504fded3980c76b59a354fca8f301941f86e369215a08752874d1ddb69b73,2021-10-25
> 22:45:39.535973449 +0200 CEST,
> 2021-10-25 22:47:02,601 DEBUG Running command: systemctl is-enabled
> ceph-86bbd6c5-ae96-4c78-8a5e-50623f0ae524@osd.0
> 2021-10-25 22:47:02,607 DEBUG systemctl: stdout enabled
> 2021-10-25 22:47:02,607 DEBUG Running command: systemctl is-active
> ceph-86bbd6c5-ae96-4c78-8a5e-50623f0ae524@osd.0
> 2021-10-25 22:47:02,612 DEBUG systemctl: stdout activating
> 2021-10-25 22:47:02,612 DEBUG Running command: /bin/podman --version
> 2021-10-25 22:47:02,636 DEBUG /bin/podman: stdout podman version 3.2.3
> 2021-10-25 22:47:02,637 DEBUG Running command: /bin/podman inspect
> --format {{.Id}},{{.Config.Image}},{{.Image}},{{.Created}},{{index
> .Config.Labels "io.ceph.version"}}
> ceph-86bbd6c5-ae96-4c78-8a5e-50623f0ae524-osd.0
> 2021-10-25 22:47:02,709 DEBUG /bin/podman: stderr Error: error
> inspecting object: no such object:
> "ceph-86bbd6c5-ae96-4c78-8a5e-50623f0ae524-osd.0"
> 2021-10-25 22:47:02,712 DEBUG Running command: systemctl is-enabled
> ceph-86bbd6c5-ae96-4c78-8a5e-50623f0ae524@osd.2
> 2021-10-25 22:47:02,718 DEBUG systemctl: stdout enabled
> 2021-10-25 22:47:02,718 DEBUG Runn

[ceph-users] Re: MDS and OSD Problems with cephadm@rockylinux solved

2021-10-28 Thread Sebastian Wagner
In case you still have the error messages and additional info, do you
want to create a tracker issue for this?
https://tracker.ceph.com/projects/orchestrator/issues/new . To me this
sounds like a network issue and not like a rockylinux issue.


On 26.10.21 at 13:17, Magnus Harlander wrote:
> Hi,
>
> I solved all my problems mentioned earlier. It boiled down
> to a minimal ceph.conf that was created by cephadm without
> network infos. After replacing the minimal config
> for osd and mds daemons in /var/lib/ceph/UUID/*/config
> everything was fine and osd and mds containers came
> up clean and working.
> Without the public_network directive the daemons didn't
> know which public address to use.
>
> This might be due to rockylinux not explicitly supported
> by cephadm, so id does not know how to parse
> 'ip a' output, or some other strange bug. Btw I'm
> using bonding interfaces and a VM bridge for qemu.
>
> Best regards,
>
> Magnus
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Expose rgw using consul or service discovery

2021-10-20 Thread Sebastian Wagner

On 20.10.21 at 09:12, Pierre GINDRAUD wrote:
> Hello,
>
> I'm migrating from puppet to cephadm to deploy a ceph cluster, and I'm
> using consul to expose radosgateway. Before, with puppet, we were
> deploying radosgateway with "apt install radosgw" and applying upgrade
> using "apt upgrade radosgw". In our consul service a simple healthcheck
> on this url worked fine "/swift/healthcheck", because we were able to
> put consul agent in maintenance mode before operations.
> I've seen this thread
> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/32JZAIU45KDTOWEW6LKRGJGXOFCTJKSS/#N7EGVSDHMMIXHCTPEYBA4CYJBWLD3LLP
> that proves consul is a possible way.
>
> So, with cephadm, the upgrade process decides by itself when to stop,
> upgrade and start each radosgw instance. 

Right

> It's an issue because the
> consul healthcheck must detect "as fast as possible" that the instance is down,
> to minimize the number of application hits that can still reach the down
> instance's IP.
>
> In some applications like traefik
> https://doc.traefik.io/traefik/reference/static-configuration/cli/ we
> have an option "requestacceptgracetimeout" that allows the "http server"
> to keep handling requests for some time after a stop signal has been received,
> while the healthcheck endpoint immediately starts responding with an "error".
> This allows the load balancer (consul here) to mark the instance down and stop
> traffic to it before it effectively goes down.
>
> In https://docs.ceph.com/en/latest/radosgw/config-ref/ I haven't seen any
> option like that. And in cephadm I haven't seen "pre-task" and "post-task"
> hooks to, for example, touch a file somewhere consul will be able to
> test, or put a host in maintenance.
>
> How do you expose the radosgw service to your applications?

cephadm nowadays ships an ingress service using haproxy for this use case:

https://docs.ceph.com/en/latest/cephadm/services/rgw/#high-availability-service-for-rgw
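A minimal spec sketch for it (all values are placeholders for your environment):

  service_type: ingress
  service_id: rgw.default
  placement:
    count: 2
  spec:
    backend_service: rgw.default
    virtual_ip: 192.0.2.10/24
    frontend_port: 8080
    monitor_port: 1967

applied with "ceph orch apply -i ingress.yaml"; it deploys haproxy plus keepalived 
for the virtual IP.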

> Do you have any idea for a workaround to my issue?

Plenty actually. cephadm itself does not provide a notification
mechanisms, but other component in the deployment stack might.

On the highest level we have the config-key store of the MONs. you
should be able to get notifications for config-key changes.
Unfortunately this would involve some Coding.

On the systemd level we have systemd-notify. I haven't looked into it,
but maybe you can get events about the rgw unit deployed by cephadm.

On the container level we have "podman events" that prints state changes
of containers.

A script that calls podman events on one hand and pushes updates
to consul sounds like the most promising solution to me.
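Something along these lines, as a hypothetical starting point (the Consul service 
id "rgw1" and the agent address are placeholders):

  podman events --filter type=container --filter event=start --filter event=died |
  while read -r line; do
      case "$line" in
          *rgw*died*)  curl -s -X PUT "http://127.0.0.1:8500/v1/agent/service/maintenance/rgw1?enable=true" ;;
          *rgw*start*) curl -s -X PUT "http://127.0.0.1:8500/v1/agent/service/maintenance/rgw1?enable=false" ;;
      esac
  done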

In case you get this setup working properly, I'd love to read a blog
post about it.

>
> Regards
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm cluster behing a proxy

2021-10-14 Thread Sebastian Wagner
Hi Luis,

Yes, there is downstream documentation for it here:
https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/5/html-single/installation_guide/index#configuring-a-custom-registry-for-disconnected-installation_install
but we clearly lack an upstream version of it.

I'd love to merge a PR that adds this use case to the docs.
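The short version of what that doc describes is mirroring the images and pointing 
the cluster at the mirror, roughly (registry.local:5000 is a placeholder; pull on a 
machine that does have internet access, then push to your mirror):

  podman pull quay.io/ceph/ceph:v16.2.6
  podman tag quay.io/ceph/ceph:v16.2.6 registry.local:5000/ceph/ceph:v16.2.6
  podman push registry.local:5000/ceph/ceph:v16.2.6
  ceph config set global container_image registry.local:5000/ceph/ceph:v16.2.6
  ceph orch upgrade start --image registry.local:5000/ceph/ceph:v16.2.6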

Sebastian


On 14.10.21 at 10:18, Luis Domingues wrote:
> Hello,
>
> We have a cluster deployed with cephadm that sits behind a proxy. It has no 
> direct access to internet.
>
> Deploying was not an issue, we did cephadm pull on all the machines before 
> bootstrapping the cluster. But we are now facing errors when we try to update 
> the cluster, basically this kind of isues:
>
> /bin/podman: stderr Error: Error initializing source 
> docker://quay.io/ceph/ceph:v16.2.6: error pinging docker registry quay.io: 
> Get "https://quay.io/v2/": dial tcp 54.156.10.58:443: connect: network is 
> unreachable
>
> Is there a way to tell cephadm to use an http proxy? I did not found anything 
> on the documentation, and I want to avoid to have http_proxy environment 
> variables set on shell system wise.
>
> Or should I use a local container registry mirroring the ceph images?
>
> Thanks,
> Luis Domingues
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cephadm set rgw SSL port

2021-09-29 Thread Sebastian Wagner
Here you go: https://github.com/ceph/ceph/pull/43332
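In the meantime, a sketch of a spec that should do what you want, using the 
rgw_frontend_port field mentioned below instead of port/ssl_port:

  service_type: rgw
  service_id: connectTest
  placement:
    hosts:
      - cv1xta-conctcephradosgw000
      - cv1xta-conctcephradosgw001
  spec:
    rgw_frontend_port: 7443
    ssl: true
    rgw_frontend_ssl_certificate: |
      --MULTILINE CERT --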

On 28.09.21 at 15:49, Sebastian Wagner wrote:
> On 28.09.21 at 15:12, Daniel Pivonka wrote:
>> Hi,
>>
>> 1. I believe the field is called  'rgw_frontend_port'
>> 2. I don't think something like that exists but probably should
>
> At least for RGWs, we have:
> https://docs.ceph.com/en/pacific/cephadm/rgw/#service-specification
>
>> -Daniel Pivonka
>>
>>
>> On Mon, Sep 27, 2021 at 4:40 PM Sergei Genchev  wrote:
>>
>>> Hi, I need to deploy RGW with SSL and was looking at the page
>>> https://docs.ceph.com/en/pacific/cephadm/rgw/  I want rados gateway to
>>> listen on a custom port.
>>> My yaml file looks like this:
>>> service_type: rgw
>>> service_id: connectTest
>>> placement:
>>>   hosts:
>>>   - cv1xta-conctcephradosgw000
>>>   - cv1xta-conctcephradosgw001
>>> spec:
>>>   #ssl_port: 7443
>>>   #port: 7443
>>>   rgw_frontend_ssl_certificate: |
>>> --MULTILINE CERT --
>>>   ssl: true
>>> I tried setting both port: and ssl_port, and could not make it work. I get
>>> Error EINVAL: ServiceSpec: __init__() got an unexpected keyword argument
>>> 'ssl_port' or
>>> Error EINVAL: ServiceSpec: __init__() got an unexpected keyword argument
>>> 'port'
>>>
>>> 1. Do you know how I can set up both SSL and a custom port?
>>> 2. More generic: is there a place where all available yaml config options
>>> are listed for rgw, and other services?
>>>
>>> Thanks!
>>> ___
>>> ceph-users mailing list -- ceph-users@ceph.io
>>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>>
>>>
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cephadm set rgw SSL port

2021-09-28 Thread Sebastian Wagner

On 28.09.21 at 15:12, Daniel Pivonka wrote:
> Hi,
>
> 1. I believe the field is called  'rgw_frontend_port'
> 2. I don't think something like that exists but probably should


At least for RGWs, we have:
https://docs.ceph.com/en/pacific/cephadm/rgw/#service-specification

>
> -Daniel Pivonka
>
>
> On Mon, Sep 27, 2021 at 4:40 PM Sergei Genchev  wrote:
>
>> Hi, I need to deploy RGW with SSL and was looking at the page
>> https://docs.ceph.com/en/pacific/cephadm/rgw/  I want rados gateway to
>> listen on a custom port.
>> My yaml file looks like this:
>> service_type: rgw
>> service_id: connectTest
>> placement:
>>   hosts:
>>   - cv1xta-conctcephradosgw000
>>   - cv1xta-conctcephradosgw001
>> spec:
>>   #ssl_port: 7443
>>   #port: 7443
>>   rgw_frontend_ssl_certificate: |
>> --MULTILINE CERT --
>>   ssl: true
>> I tried setting both port: and ssl_port, and could not make it work. I get
>> Error EINVAL: ServiceSpec: __init__() got an unexpected keyword argument
>> 'ssl_port' or
>> Error EINVAL: ServiceSpec: __init__() got an unexpected keyword argument
>> 'port'
>>
>> 1. Do you know how I can set up both SSL and a custom port?
>> 2. More generic: is there a place where all available yaml config options
>> are listed for rgw, and other services?
>>
>> Thanks!
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Error ceph-mgr on fedora 36

2021-09-27 Thread Sebastian Wagner

looks like you should create a tracker issue for this.

https://tracker.ceph.com/projects/mgr/issues/new 



On 18.09.21 at 14:34, Igor Savlook wrote:

OS: Fedora 36 (rawhide)
Ceph: 16.2.6
Python: 3.10

When start ceph-mgr he is try load core python modules but crash with 
error:


Sep 18 15:11:35 ceph-master0 ceph-mgr[71463]: 
2021-09-18T15:11:35.086+0300 6adb4530 -1 mgr load Failed to 
construct class in 'balancer'
Sep 18 15:11:35 ceph-master0 ceph-mgr[71463]: 
2021-09-18T15:11:35.086+0300 6adb4530 -1 mgr load Traceback (most 
recent call last):
    File 
"/usr/share/ceph/mgr/balancer/module.py", line 427, in __init__
  super(Module, 
self).__init__(*args, **kwargs)
    File 
"/usr/share/ceph/mgr/mgr_module.py", line 882, in __init__

self._configure_logging(mgr_level, log_level, cluster_level,
    File 
"/usr/share/ceph/mgr/mgr_module.py", line 581, in _configure_logging

self._cluster_log_handler = ClusterLogHandler(self)
    File 
"/usr/share/ceph/mgr/mgr_module.py", line 535, in __init__

super().__init__()
  RuntimeError: super(): 
__class__ cell not found 


Sep 18 15:11:35 ceph-master0 ceph-mgr[71463]: 
2021-09-18T15:11:35.090+0300 6adb4530 -1 mgr operator() Failed to 
run module in active mode ('balancer')
Sep 18 15:11:35 ceph-master0 ceph-mgr[71463]: 
2021-09-18T15:11:35.090+0300 6adb4530 -1 mgr load Failed to 
construct class in 'crash'
Sep 18 15:11:35 ceph-master0 ceph-mgr[71463]: 
2021-09-18T15:11:35.090+0300 6adb4530 -1 mgr load Traceback (most 
recent call last):
    File 
"/usr/share/ceph/mgr/crash/module.py", line 37, in __init__
  super(Module, 
self).__init__(*args, **kwargs)
    File 
"/usr/share/ceph/mgr/mgr_module.py", line 882, in __init__

self._configure_logging(mgr_level, log_level, cluster_level,
    File 
"/usr/share/ceph/mgr/mgr_module.py", line 581, in _configure_logging

self._cluster_log_handler = ClusterLogHandler(self)
    File 
"/usr/share/ceph/mgr/mgr_module.py", line 535, in __init__

super().__init__()
  RuntimeError: super(): 
__class__ cell not found 


Sep 18 15:11:35 ceph-master0 ceph-mgr[71463]: 
2021-09-18T15:11:35.096+0300 6adb4530 -1 mgr operator() Failed to 
run module in active mode ('iostat')
Sep 18 15:11:35 ceph-master0 ceph-mgr[71463]: 
2021-09-18T15:11:35.096+0300 6adb4530 -1 mgr load Failed to 
construct class in 'orchestrator' Sep 18 15:11:35 ceph-master0 
ceph-mgr[71463]: 2021-09-18T15:11:35.096+0300 6adb4530 -1 mgr 
load Traceback (most recent call last):
    File 
"/usr/share/ceph/mgr/orchestrator/module.py", line 210, in __init__ 
super(OrchestratorCli, self).__init__(*args, **kwargs)
    File 
"/usr/share/ceph/mgr/mgr_module.py", line 882, in __init__ 
self._configure_logging(mgr_level, log_level, cluster_level,
    File 
"/usr/share/ceph/mgr/mgr_module.py", line 581, in _configure_logging 
self._cluster_log_handler = ClusterLogHandler(self)
    File 
"/usr/share/ceph/mgr/mgr_module.py", line 535, in __init__ 
super().__init__()
  RuntimeError: super(): 
__class__ cell not found


and etc.

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Remoto 1.1.4 in Ceph 16.2.6 containers

2021-09-27 Thread Sebastian Wagner

Thank you David!

On 24.09.21 at 00:41, David Galloway wrote:

I just repushed the 16.2.6 container with remoto 1.2.1 in it.

On 9/22/21 4:19 PM, David Orman wrote:

https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2021-4b2736a28c

^^ if people want to test and provide feedback for a potential merge
to EPEL8 stable.

David

On Wed, Sep 22, 2021 at 11:43 AM David Orman  wrote:

I'm wondering if this was installed using pip/pypi before, and now
switched to using EPEL? That would explain it - 1.2.1 may never have
been pushed to EPEL.

David

On Wed, Sep 22, 2021 at 11:26 AM David Orman  wrote:

We'd worked on pushing a change to fix
https://tracker.ceph.com/issues/50526 for a deadlock in remoto here:
https://github.com/alfredodeza/remoto/pull/63

A new version, 1.2.1, was built to help with this. With the Ceph
release 16.2.6 (at least), we see 1.1.4 is again part of the
containers. Looking at EPEL8, all that is built now is 1.1.4. We're
not sure what happened, but would it be possible to get 1.2.1 pushed
to EPEL8 again, and figure out why it was removed? We'd then need a
rebuild of the 16.2.6 containers to 'fix' this bug.

This is definitely a high urgency bug, as it impacts any deployments
with medium to large counts of OSDs or split db/wal devices, like many
modern deployments.

https://koji.fedoraproject.org/koji/packageinfo?packageID=18747
https://dl.fedoraproject.org/pub/epel/8/Everything/x86_64/Packages/p/

Respectfully,
David Orman

___
Dev mailing list -- d...@ceph.io
To unsubscribe send an email to dev-le...@ceph.io


___
Dev mailing list -- d...@ceph.io
To unsubscribe send an email to dev-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How you loadbalance your rgw endpoints?

2021-09-27 Thread Sebastian Wagner

Hi Szabo,

I think you can have a look at 
https://docs.ceph.com/en/latest/cephadm/rgw/#high-availability-service-for-rgw 
 
even if you don't deploy ceph using cephadm.


On 24.09.21 at 07:59, Szabo, Istvan (Agoda) wrote:

Hi,

Wonder how you guys do it due to we will always have limitation on the network 
bandwidth of the loadbalancer.

Or if no balancer what to monitor if 1 rgw maxed out? I’m using 15rgw.

Ty
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Restore OSD disks damaged by deployment misconfiguration

2021-09-27 Thread Sebastian Wagner

Hi Phil,


On 27.09.21 at 10:06, Phil Merricks wrote:

Hey folks,

A recovery scenario I'm looking at right now is this:

1: In a clean 3-node Ceph cluster (pacific, deployed with cephadm), the OS
Disk is lost from all nodes
2: Trying to be helpful, a self-healing deployment system reinstalls the OS
on each node, and rebuilds the ceph services
3: Somewhere in the deployment system are 'sensible defaults' that assume
there are no stateful workloads, so the superblock is wiped from the other
block devices attached to these nodes to prevent stale metadata conflicts
4: The ceph rebuild has no knowledge of prior states and is expected to
simply restore based on discovery of existing devices.
5: Out of 5 block devices, 3 had their superblock wiped, 1 suffered
mechanical failure upon reboot, and 1 is completely intact, with
'ceph-volume lvm list' returning the correct information.
As soon as you manage to restore the OSDs, there is a command to re-create 
the OSD containers.
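On a recent Pacific that is roughly (hostname is a placeholder):

  ceph cephadm osd activate <host>

which scans the host for existing OSD volumes and re-creates the systemd units and 
containers for them.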


Is there a way to restore the 3 devices with wiped superblocks?  Some basic
attempts with fsck to find the superblocks on the disks that were affected
yielded nothing.

Thanks for your time reading this message.

Cheers

Phil
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Docker & CEPH-CRASH

2021-09-16 Thread Sebastian Wagner
ceph-crash should work, as crash dumps aren't namespaced in the kernel. 
Note that you need a pid1 process in your containers in order for crash 
dumps to be created.


On 16.09.21 at 08:57, Eugen Block wrote:
I haven't tried it myself but it would probably work to run the crash 
services apart from cephadm, maybe someone else has a similar setup. 
Since those are not critical services you can try it without impacting 
the rest of the cluster. But again, I haven't tried it this way.



Quoting Guilherme Geronimo :


Got it: one instance per host is enough.

In my case, I'm not using "ceph orch".
We did it manually,  crafting one docker-compose.yml per host.

The question is:
Is it possible to run a "crash instance" per host, or does the solution 
oblige me to adopt the cephadm solution?


Thanks!

[]'s
Arthur

On 15/09/2021 08:30, Eugen Block wrote:

Hi,

ceph-crash services are standalone containers, they are not running 
inside other containers:


host1:~ # ceph orch ls
NAME   RUNNING  REFRESHED  AGE 
PLACEMENT    IMAGE 
NAME     IMAGE ID
crash  4/4  9m ago 3w * mix 
d2b64e3c3805


Do you see it in your specs? Can you share this output:

ceph orch ls --export --format yaml

You can add the crash service to a spec file and apply it with 'ceph 
orch apply -i crash-service.yml' where the yml file could look like 
this:


service_type: crash
service_name: crash
placement:
  host_pattern: '*'




Quoting Guilherme Geronimo :


Hey Guys!
I'm running my  entire cluster (12hosts/89osds - v15.2.22) on 
Docker and everything runs smoothly.


But  I'm kind of "blind" here: ceph-crash is not running inside the 
containers.
And there's nothing related to "ceph-crash" in the docker logs 
either


Is there a special way to configure it?
Should I create and external volume and run a single instance of it?

Thanks!
Guilherme Geronimo (aKa Arthur)

docker-compose example:

services:
  osd.106:
    container_name: osd106
    image: ceph/daemon:latest-nautilus
    command: osd_directory_single
    restart: unless-stopped
    pid: "host"
    network_mode: host
    privileged: true
    volumes:
      - /dev/:/dev/
      - ../ceph.conf:/etc/ceph
      - ./data/ceph-106/:/var/lib/ceph/osd/ceph-106

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to purge/remove rgw from ceph/pacific

2021-09-11 Thread Sebastian Wagner

Yeah, looks like this was missing from the docs. See

https://github.com/ceph/ceph/pull/43141
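In practice the flow Eugen describes looks roughly like this (service and daemon 
names are placeholders):

  ceph orch ls rgw --export > rgw.yaml
  # edit rgw.yaml and add:  unmanaged: true
  ceph orch apply -i rgw.yaml
  ceph orch daemon rm rgw.myrealm.myzone.host1.abcdef --force
  # or drop the whole service, daemons included:
  ceph orch rm rgw.myrealm.myzone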



On 11.09.21 at 12:46, Eugen Block wrote:
Edit your rgw service specs and set „unmanaged“ to true so cephadm 
won’t redeploy a daemon, then remove it as you did before.

See [1] for more details.

[1] https://docs.ceph.com/en/pacific/cephadm/service-management.html


Quoting Cem Zafer :


Hi,
How to remove rgw from hosts? When I execute ```ceph orch daemon rm
```, it spawns another.
What is the proper way to remove rgw from ceph hosts?
Thanks.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mon startup problem on upgrade octopus to pacific

2021-09-02 Thread Sebastian Wagner
Could you please verify that the monmap of each mon contains all the
correct mons?
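For example (mon ids taken from your log; run inside a cephadm shell on the 
respective host, and stop the mon before extracting its on-disk map):

  ceph tell mon.b2 mon_status
  ceph-mon -i b5 --extract-monmap /tmp/monmap
  monmaptool --print /tmp/monmap

All mons should agree on the same set of names and addresses, and each of those 
addresses has to be reachable from the probing mon.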


On 30.08.21 at 21:45, Chris Dunlop wrote:

Hi,

Does anyone have any suggestions?

Thanks,

Chris

On Mon, Aug 30, 2021 at 03:52:29PM +1000, Chris Dunlop wrote:

Hi,

I'm stuck, mid upgrade from octopus to pacific using cephadm, at the 
point of upgrading the mons.


I have 3 mons still on octopus and in quorum. When I try to bring up 
a new pacific mon it stays permanently in "probing" state.


The pacific mon is running off:

docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb 



The lead octopus mon is running off:

quay.io/ceph/ceph:v15

The other 2 octopus mons are 15.2.14-1~bpo10+1. These are manually 
started due to the cephadm upgrade failing at the point of upgrading 
the mons and leaving me with only one cephadm mon running.


I've confirmed all mons (current and new) can contact each other on 
ports 3300 and 6789, and max mtu packets (9000) get through in all 
directions.


On the box where I'm trying to start the pacific mon, if I start up 
an octopus mon it happily joins the mon set.


With debug_mon=20 on the pacific mon I see *constant* repeated 
mon_probe reply processing. The first mon_probe reply produces:


e0  got newer/committed monmap epoch 35, mine was 0

Subsequent mon_probe replies produce:

e35 got newer/committed monmap epoch 35, mine was 35

...but this just keeps repeating and it never gets any further - see 
below.


Where to from here?

Cheers,

Chris

--
debug_mon=20 from pacific mon
--
Aug 29 08:25:34 b5 conmon[2648666]: debug 
2021-08-28T22:25:34.792+ 7f74f223a700 10 mon.b5@-1(probing) e0 
handle_probe mon_probe(reply c6618970-0ce0-4cb2-bc9a-dd5f29b62e24 
name b4 quorum 0,1,2 leader 0 paxos( fc 364908695 lc 364909318 ) 
mon_release octopus) v7
Aug 29 08:25:34 b5 conmon[2648666]: debug 
2021-08-28T22:25:34.792+ 7f74f223a700 10 mon.b5@-1(probing) e0 
handle_probe_reply mon.2 v2:10.200.63.132:3300/0 mon_probe(reply 
c6618970-0ce0-4cb2-bc9a-dd5f29b62e24 name b4 quorum 0,1,2 leader 0 
paxos( fc 364908695 lc 364909318 ) mon_release octopus) v7
Aug 29 08:25:34 b5 conmon[2648666]: debug 
2021-08-28T22:25:34.792+ 7f74f223a700 10 mon.b5@-1(probing) e0  
monmap is e0: 3 mons at 
{noname-a=[v2:10.200.63.130:3300/0,v1:10.200.63.130:6789/0],noname-b=[v2:10.200.63.132:3300/0,v1:10.200.63.132:6789/0],noname-c=[v2:192.168.254.251:3300/0,v1:192.168.254.251:6789/0]}
Aug 29 08:25:34 b5 conmon[2648666]: debug 
2021-08-28T22:25:34.792+ 7f74f223a700 10 mon.b5@-1(probing) e0  
got newer/committed monmap epoch 35, mine was 0
Aug 29 08:25:34 b5 conmon[2648666]: debug 
2021-08-28T22:25:34.792+ 7f74f223a700 10 mon.b5@-1(probing) e35 
bootstrap
Aug 29 08:25:34 b5 conmon[2648666]: debug 
2021-08-28T22:25:34.792+ 7f74f223a700 10 mon.b5@-1(probing) e35 
sync_reset_requester
Aug 29 08:25:34 b5 conmon[2648666]: debug 
2021-08-28T22:25:34.792+ 7f74f223a700 10 mon.b5@-1(probing) e35 
unregister_cluster_logger - not registered
Aug 29 08:25:34 b5 conmon[2648666]: debug 
2021-08-28T22:25:34.792+ 7f74f223a700 10 mon.b5@-1(probing) e35 
cancel_probe_timeout 0x5564a433c900
Aug 29 08:25:34 b5 conmon[2648666]: debug 
2021-08-28T22:25:34.792+ 7f74f223a700 10 mon.b5@-1(probing) e35 
monmap e35: 3 mons at 
{b2=[v2:10.200.63.130:3300/0,v1:10.200.63.130:6789/0],b4=[v2:10.200.63.132:3300/0,v1:10.200.63.132:6789/0],k2=[v2:192.168.254.251:3300/0,v1:192.168.254.251:6789/0]}
Aug 29 08:25:34 b5 conmon[2648666]: debug 
2021-08-28T22:25:34.792+ 7f74f223a700 10 mon.b5@-1(probing) e35 
_reset
Aug 29 08:25:34 b5 conmon[2648666]: debug 
2021-08-28T22:25:34.792+ 7f74f223a700 10 mon.b5@-1(probing).auth 
v0 _set_mon_num_rank num 0 rank 0
Aug 29 08:25:34 b5 conmon[2648666]: debug 
2021-08-28T22:25:34.792+ 7f74f223a700 10 mon.b5@-1(probing) e35 
cancel_probe_timeout (none scheduled)
Aug 29 08:25:34 b5 conmon[2648666]: debug 
2021-08-28T22:25:34.792+ 7f74f223a700 10 mon.b5@-1(probing) e35 
timecheck_finish
Aug 29 08:25:34 b5 conmon[2648666]: debug 
2021-08-28T22:25:34.792+ 7f74f223a700 15 mon.b5@-1(probing) e35 
health_tick_stop
Aug 29 08:25:34 b5 conmon[2648666]: debug 
2021-08-28T22:25:34.792+ 7f74f223a700 15 mon.b5@-1(probing) e35 
health_interval_stop
Aug 29 08:25:34 b5 conmon[2648666]: debug 
2021-08-28T22:25:34.792+ 7f74f223a700 10 mon.b5@-1(probing) e35 
scrub_event_cancel
Aug 29 08:25:34 b5 conmon[2648666]: debug 
2021-08-28T22:25:34.792+ 7f74f223a700 10 mon.b5@-1(probing) e35 
scrub_reset
Aug 29 08:25:34 b5 conmon[2648666]: debug 
2021-08-28T22:25:34.792+ 7f74f223a700 10 mon.b5@-1(probing) e35 
cancel_probe_timeout (none scheduled)
Aug 29 08:25:34 b5 conmon[2648666]: debug 
2021-08-28T22:25:34.792+ 7f74f223a700 10 mon.b5@-1(probing) e35 
reset_probe_timeout 0x5564a433c900 after 2 seconds
Aug 29 

[ceph-users] Re: Cephadm cannot aquire lock

2021-09-02 Thread Sebastian Wagner


On 31.08.21 at 04:05, fcid wrote:

Hi ceph community,

I'm having some trouble trying to delete an OSD.

I've been using cephadm in one of our clusters and it works fine, 
but lately, after an OSD failure, I cannot delete it using the 
orchestrator. Since the orchestrator is not working (for some unknown 
reason) I tried to manually delete the OSD using the following command:


ceph osd purge  --yes-i-really-mean-it

This command removed the OSD from the crush map, but then the warning 
CEPHADM_FAILED_DAEMON appeared. So the next step is to delete the daemon 
on the server that used to host the failed OSD. The command I used here 
was the following:


cephadm rm-daemon --name osd. --fsid 

But this command does not work because, according to the log, cephadm 
cannot acquire the lock:


2021-08-30 21:50:09,712 DEBUG Lock 139899822730784 not acquired on 
/run/cephadm/$FSID.lock, waiting 0.05 seconds ...
2021-08-30 21:50:09,762 DEBUG Acquiring lock 139899822730784 on 
/run/cephadm/$FSID.lock
2021-08-30 21:50:09,763 DEBUG Lock 139899822730784 not acquired on 
/run/cephadm/$FSID.lock, waiting 0.05 seconds ...


The file /run/cephadm/$FSID.lock does exist. Can I safely remove it? 
What should I check before doing such a task?


Yes, as long as you're sure that no other cephadm process is stuck
(check with `ps`).
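Roughly (ids and fsid as in your commands above):

  ps aux | grep cephadm            # make sure nothing still holds the lock
  rm /run/cephadm/$FSID.lock
  cephadm rm-daemon --name osd.<id> --fsid $FSID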




I'll really appreciate any hint you can give relating this matter.

Thanks! regards.



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Very beginner question for cephadm: config file for bootstrap and osd_crush_chooseleaf_type

2021-09-02 Thread Sebastian Wagner

It sets three config options (the equivalent commands are sketched below):

1. global/osd_crush_chooseleaf_type = 0
2. global/osd_pool_default_size = 2
3. mgr/mgr_standby_modules = False
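Expressed as commands against a running cluster, that would be roughly (option 
names as listed above; double-check them with "ceph config help" on your release):

  ceph config set global osd_crush_chooseleaf_type 0
  ceph config set global osd_pool_default_size 2
  ceph config set mgr mgr/mgr_standby_modules false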

On 31.08.21 at 13:08, Ignacio García wrote:
Just for experimenting, which are those single host defaults? Maybe 
these?:


mon_allow_pool_size_one = 1
osd_pool_default_size = 1

Ignacio


On 30/8/21 at 17:31, Sebastian Wagner wrote:

Try running `cephadm bootstrap --single-host-defaults`

On 20.08.21 at 18:23, Eugen Block wrote:

Hi,

you can just set the config option with 'ceph config set ...' after 
your cluster has been bootstrapped. See [1] for more details about 
the config store.


[1] 
https://docs.ceph.com/en/latest/rados/configuration/ceph-conf/#monitor-configuration-database



Quoting Dong Xie :



Dear All,



Early days of my venture with Ceph. I understand that one should 
have at
least two hosts to truly appreciate the design of Ceph. But as a 
baby step

having one playground on a single host is really unavoidable.



My env:

A single Azure VM

Ubuntu 20.04

Installed cephadm via apt per official doc.

ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) 
octopus

(stable)



Issues encountered:

Following steps here: https://docs.ceph.com/en/latest/cephadm/install/

I’ve created a initial-ceph.conf to set /osd crush chooseleaf type 
= 0/.




1, with cmd like:

sudo cephadm bootstrap --config initial-ceph.conf --mon-ip 10.2.0.4

It seems my config line is simply ignored in the final conf.

I’ve tried “/osd crush chooseleaf type = 0”/ (no underline symbol but
rather space, I doubt if the document is correct).

I’ve also tried “osd_crush_chooseleaf_type = 0” (underline rather
than space).

I’ve tried putting a tab to start the line, or without tab.



My line is always ignored, unless I use --no-minimize-config switch.



2, with above switch and seeing my value set to 0 in conf file, when I
login to the dashboard, looking at Cluster -> Configuration, search 
this
item, it shows Default value to 1, Current value empty, means it 
doesn’t

really got set?



I’ve tried to search an answer for a while, but didn’t get any hint or
workaround, any help would be greatly appreciated,



Best regards,



Dong Xie

CodeRobin Ltd.






___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm Pacific bootstrap hangs waiting for mon

2021-09-02 Thread Sebastian Wagner

by chance do you still have the logs of the mon the never went up?

https://docs.ceph.com/en/latest/cephadm/troubleshooting/#checking-cephadm-logs 
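
For example (a sketch, using the daemon name and fsid from the bootstrap
output quoted below):

    cephadm logs --fsid fb45c7b2-0911-11ec-9731-bc97e15d6534 --name mon.cmgmt01.example.net
    # or directly via systemd/journald
    journalctl -u ceph-fb45c7b2-0911-11ec-9731-bc97e15d6534@mon.cmgmt01.example.net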



Sebastian

Am 31.08.21 um 23:51 schrieb Matthew Pounsett:

On Tue, 31 Aug 2021 at 03:24, Arnaud MARTEL
 wrote:

Hi Matthew,

I dont' know if it will be helpful but I had the same problem using debian 10 
and the solution was to install docker from docker.io and not from the debian 
package (too old).


Ah, that makes sense.  Thanks!
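
For anyone hitting the same thing: one way to get a recent docker engine
on Debian 10 is Docker's upstream install route rather than the distro
package, roughly like this (a sketch only; double-check against Docker's
own install docs):

    apt-get remove docker docker.io containerd runc
    curl -fsSL https://get.docker.com | sh
    systemctl enable --now docker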


Arnaud

- Mail original -
De: "Matthew Pounsett" 
À: "ceph-users" 
Envoyé: Lundi 30 Août 2021 17:34:32
Objet: [ceph-users] cephadm Pacific bootstrap hangs waiting for mon

I'm just getting started with Pacific, and I've run into this problem
trying to get bootstrapped.  cephadm is waiting for the mon to start,
and waiting, and waiting ...   checking docker ps it looks like it's
running, but I guess it's never finishing its startup tasks?   I
waited about 30 minutes the first time.  Killed cephadm and restarted,
and I seem to have the same problem; I let it run overnight and got
some additional output that doesn't actually help me much.  Details
pasted below.

What additional things should I be doing to try to troubleshoot this?

In case it's useful reference info, the mon IP I've given is on our
"admin" VLAN which is reachable from all hosts on our network.  The
cluster network subnet I supplied is the 10G VLAN reachable only by
the servers in the ceph cluster I'm building.  The IP supplied is
reachable on the local host.

% sudo cephadm bootstrap --allow-fqdn-hostname --mon-ip 192.168.1.192
--cluster-network 192.168.0.0/24
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit systemd-timesyncd.service is enabled and running
Repeating the final host check...
podman|docker (/usr/bin/docker) is present
systemctl is present
lvcreate is present
Unit systemd-timesyncd.service is enabled and running
Host looks OK
Cluster fsid: fb45c7b2-0911-11ec-9731-bc97e15d6534
Verifying IP 192.168.1.192 port 3300 ...
Verifying IP 192.168.1.192 port 6789 ...
Mon IP `192.168.1.192` is in CIDR network `192.168.1.0/24`
Pulling container image docker.io/ceph/ceph:v16...
Ceph version: ceph version 16.2.5
(0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
Non-zero exit code 1 from /usr/bin/docker run --rm --ipc=host
--stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e
CONTAINER_IMAGE=docker.io/ceph/ceph:v16 -e
NODE_NAME=cmgmt01.example.net -e CEPH_USE_RANDOM_NONCE=1 -v
/var/lib/ceph/fb45c7b2-0911-11ec-9731-bc97e15d6534/mon.cmgmt01.example.net:/var/lib/ceph/mon/ceph-cmgmt01.example.net:z
-v /tmp/ceph-tmp8q3oxeg3:/etc/ceph/ceph.client.admin.keyring:z -v
/tmp/ceph-tmp4_69yc31:/etc/ceph/ceph.conf:z docker.io/ceph/ceph:v16
status
/usr/bin/ceph: stderr 2021-08-29T21:47:23.263+ 7f2aeaa37700  0
monclient(hunting): authenticate timed out after 300
/usr/bin/ceph: stderr 2021-08-29T21:52:23.262+ 7f2aeaa37700  0
monclient(hunting): authenticate timed out after 300
/usr/bin/ceph: stderr 2021-08-29T21:57:23.266+ 7f2aeaa37700  0
monclient(hunting): authenticate timed out after 300
/usr/bin/ceph: stderr 2021-08-29T22:02:23.265+ 7f2aeaa37700  0
monclient(hunting): authenticate timed out after 300
/usr/bin/ceph: stderr 2021-08-29T22:07:23.268+ 7f2aeaa37700  0
monclient(hunting): authenticate timed out after 300
/usr/bin/ceph: stderr 2021-08-29T22:12:23.268+ 7f2aeaa37700  0
monclient(hunting): authenticate timed out after 300
/usr/bin/ceph: stderr 2021-08-29T22:17:23.271+ 7f2aeaa37700  0
monclient(hunting): authenticate timed out after 300
/usr/bin/ceph: stderr 2021-08-29T22:22:23.266+ 7f2aeaa37700  0
monclient(hunting): authenticate timed out after 300
/usr/bin/ceph: stderr 2021-08-29T22:27:23.270+ 7f2aeaa37700  0
monclient(hunting): authenticate timed out after 300
/usr/bin/ceph: stderr 2021-08-29T22:32:23.273+ 7f2aeaa37700  0
monclient(hunting): authenticate timed out after 300
/usr/bin/ceph: stderr [errno 110] RADOS timed out (error connecting to
the cluster)
mon not available, waiting (1/15)...
[ repeats ... ]

The log contains identical info.  The only extra I see is a note at
the end about releasing locks, which I'm sure is expected and of no
additional help.

2021-08-30 11:03:02,801 DEBUG Releasing lock 140656683483824 on
/run/cephadm/fb45c7b2-0911-11ec-9731-bc97e15d6534.lock
2021-08-30 11:03:02,801 DEBUG Lock 140656683483824 released on
/run/cephadm/fb45c7b2-0911-11ec-9731-bc97e15d6534.lock
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

_

[ceph-users] Re: Brand New Cephadm Deployment, OSDs show either in/down or out/down

2021-09-02 Thread Sebastian Wagner
Can you verify that the `/usr/lib/sysctl.d/` folder exists on your 
debian machines?
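
A quick way to check, and a possible stop-gap if the directory is indeed
missing (a sketch; the proper fix is tracked in the ticket quoted below):

    ls -ld /usr/lib/sysctl.d
    # on minimal installs the directory can be absent; creating it may unblock cephadm
    mkdir -p /usr/lib/sysctl.d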


Am 01.09.21 um 15:19 schrieb Alcatraz:

Sebastian,


I appreciate all your help. I actually (out of desperation) spun up 
another cluster, same specs, just using Ubuntu 18.04 rather than 
Debian 10. All the OSDs were recognized, and all went up/in without 
issue.



Thanks

On 9/1/21 06:15, Sebastian Wagner wrote:

Am 30.08.21 um 17:39 schrieb Alcatraz:

Sebastian,


Thanks for responding! And of course.


1. ceph orch ls --service-type osd --format yaml

Output:

service_type: osd
service_id: all-available-devices
service_name: osd.all-available-devices
placement:
  host_pattern: '*'
unmanaged: true
spec:
  data_devices:
    all: true
  filter_logic: AND
  objectstore: bluestore
status:
  created: '2021-08-30T13:57:51.000178Z'
  last_refresh: '2021-08-30T15:24:10.534710Z'
  running: 0
  size: 6
events:
- 2021-08-30T03:48:01.652108Z service:osd.all-available-devices 
[INFO] "service was

  created"
- "2021-08-30T03:49:00.267808Z service:osd.all-available-devices 
[ERROR] \"Failed\
  \ to apply: cephadm exited with an error code: 1, stderr:Non-zero 
exit code 1 from\
  \ /usr/bin/docker container inspect --format {{.State.Status}} 
ceph-d1405594-0944-11ec-8ebc-f23c92edc936-osd.0\n\
  /usr/bin/docker: stdout \n/usr/bin/docker: stderr Error: No such 
container: ceph-d1405594-0944-11ec-8ebc-f23c92edc936-osd.0\n\
  Deploy daemon osd.0 ...\nTraceback (most recent call last):\n File 
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\
  , line 8230, in \n    main()\n  File 
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\
  , line 8218, in main\n    r = ctx.func(ctx)\n  File 
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\
  , line 1759, in _default_image\n    return func(ctx)\n  File 
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\ 

  , line 4326, in command_deploy\n    ports=daemon_ports)\n File 
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\ 

  , line 2632, in deploy_daemon\n    c, osd_fsid=osd_fsid, 
ports=ports)\n  File \"\
/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\ 

  , line 2801, in deploy_daemon_units\n    install_sysctl(ctx, fsid, 
daemon_type)\n\
  \  File 
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\
  , line 2963, in install_sysctl\n    _write(conf, lines)\n File 
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\ 

  , line 2948, in _write\n    with open(conf, 'w') as 
f:\nFileNotFoundError: [Errno\
  \ 2] No such file or directory: 
'/usr/lib/sysctl.d/90-ceph-d1405594-0944-11ec-8ebc-f23c92edc936-osd.conf'\""


https://tracker.ceph.com/issues/52481

- '2021-08-30T03:49:08.356762Z service:osd.all-available-devices 
[ERROR] "Failed to

  apply: auth get failed: failed to find osd.0 in keyring retval: -2"'
- '2021-08-30T03:52:34.100977Z service:osd.all-available-devices 
[ERROR] "Failed to

  apply: auth get failed: failed to find osd.3 in keyring retval: -2"'
- '2021-08-30T03:52:42.260439Z service:osd.all-available-devices 
[ERROR] "Failed to

  apply: auth get failed: failed to find osd.6 in keyring retval: -2"'


Will be fixed by https://github.com/ceph/ceph/pull/42989





2. ceph orch ps --daemon-type osd --format yaml

Output: ...snip...

3. ceph auth add osd.0 osd 'allow *' mon 'allow rwx' -i 
/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/osd.0/keyring


I verified 
/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/osd.0/keyring 
file does exist.


Output:

Error EINVAL: caps cannot be specified both in keyring and in command



You only need to create the keyring, you don't need to store the 
keyring anywhere. I'd still suggest to somehow create the keyring, 
but I haven't seen this particular error before.
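
If you want to retry it, one possibility is to let the keyring carry the
caps and not repeat them on the command line, e.g. (sketch, using the path
from your output):

    # import the keyring as-is, caps included
    ceph auth import -i /var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/osd.0/keyring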



hth

Sebastian




Thanks

On 8/30/21 10:28, Sebastian Wagner wrote:

Could you run

1. ceph orch ls --service-type osd --format yaml

2. ceph orch ps --daemon-type osd --format yaml

3. try running the `ceph auth add` call from 
https://docs.ceph.com/en/mimic/rados/operations/add-or-rm-osds/#adding-an-osd-manual 




Am 30.08.21 um 14:49 schrieb Alcatraz:

Hello all,

Running into some issues trying to

[ceph-users] Re: cephadm 15.2.14 - mixed container registries?

2021-09-02 Thread Sebastian Wagner


Am 02.09.21 um 02:54 schrieb Nigel Williams:

I managed to upgrade to 15.2.14 by doing:

ceph orch upgrade start --image quay.io/ceph/ceph:v15.2.14

(anything else I tried would fail)

When I look in ceph orch ps output though I see quay.io for most image
sources, but alertmanager, grafana, node-exporter are coming from docker.io

Before doing the upgrade I had set /etc/containers/registries.conf to only
have quay.io for unqualified-search-registries.

Questions: does it matter? is it easily fixed so there are consistent image
sources?


Perfectly fine. Every developer community is free to choose its 
registry. Some push to docker.io, some to quay.io.
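
If you do want consistent sources, the monitoring images are ordinary
mgr/cephadm settings, so they can be repointed and the services redeployed.
A sketch (the image references below are placeholders; pick whatever
registry and tags you trust):

    ceph config set mgr mgr/cephadm/container_image_grafana <your-registry>/ceph-grafana:6.7.4
    ceph config set mgr mgr/cephadm/container_image_alertmanager <your-registry>/alertmanager:v0.20.0
    ceph config set mgr mgr/cephadm/container_image_node_exporter <your-registry>/node-exporter:v0.18.1
    ceph orch redeploy grafana
    ceph orch redeploy alertmanager
    ceph orch redeploy node-exporter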





thanks.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: podman daemons in error state - where to find logs?

2021-09-02 Thread Sebastian Wagner

We have a troubleshooting section here:

https://docs.ceph.com/en/latest/cephadm/troubleshooting/#checking-cephadm-logs 



a ceph user should not be required for the containers to log to systemd. 
Did things end up in syslog?
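
In that case journald is usually a better place to look than the files
under /var/log/ceph, e.g. (a sketch; substitute your fsid and daemon id):

    cephadm ls | less          # find the exact daemon name and the fsid
    journalctl -u ceph-<fsid>@osd.<id>.service -n 100
    # or let cephadm resolve the unit for you
    cephadm logs --name osd.<id>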


Am 02.09.21 um 02:58 schrieb Nigel Williams:

thanks for the tip.

All OSD logs on all hosts are zero length for me though, I suspect a
permission problem but most hosts don't have a ceph user defined.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Brand New Cephadm Deployment, OSDs show either in/down or out/down

2021-09-01 Thread Sebastian Wagner

Am 30.08.21 um 17:39 schrieb Alcatraz:

Sebastian,


Thanks for responding! And of course.


1. ceph orch ls --service-type osd --format yaml

Output:

service_type: osd
service_id: all-available-devices
service_name: osd.all-available-devices
placement:
  host_pattern: '*'
unmanaged: true
spec:
  data_devices:
    all: true
  filter_logic: AND
  objectstore: bluestore
status:
  created: '2021-08-30T13:57:51.000178Z'
  last_refresh: '2021-08-30T15:24:10.534710Z'
  running: 0
  size: 6
events:
- 2021-08-30T03:48:01.652108Z service:osd.all-available-devices [INFO] 
"service was

  created"
- "2021-08-30T03:49:00.267808Z service:osd.all-available-devices 
[ERROR] \"Failed\
  \ to apply: cephadm exited with an error code: 1, stderr:Non-zero 
exit code 1 from\
  \ /usr/bin/docker container inspect --format {{.State.Status}} 
ceph-d1405594-0944-11ec-8ebc-f23c92edc936-osd.0\n\
  /usr/bin/docker: stdout \n/usr/bin/docker: stderr Error: No such 
container: ceph-d1405594-0944-11ec-8ebc-f23c92edc936-osd.0\n\
  Deploy daemon osd.0 ...\nTraceback (most recent call last):\n File 
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\
  , line 8230, in \n    main()\n  File 
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\
  , line 8218, in main\n    r = ctx.func(ctx)\n  File 
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\
  , line 1759, in _default_image\n    return func(ctx)\n  File 
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\
  , line 4326, in command_deploy\n    ports=daemon_ports)\n  File 
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\ 

  , line 2632, in deploy_daemon\n    c, osd_fsid=osd_fsid, 
ports=ports)\n  File \"\
/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\ 

  , line 2801, in deploy_daemon_units\n    install_sysctl(ctx, fsid, 
daemon_type)\n\
  \  File 
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\
  , line 2963, in install_sysctl\n    _write(conf, lines)\n  File 
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\ 

  , line 2948, in _write\n    with open(conf, 'w') as 
f:\nFileNotFoundError: [Errno\
  \ 2] No such file or directory: 
'/usr/lib/sysctl.d/90-ceph-d1405594-0944-11ec-8ebc-f23c92edc936-osd.conf'\""


https://tracker.ceph.com/issues/52481

- '2021-08-30T03:49:08.356762Z service:osd.all-available-devices 
[ERROR] "Failed to

  apply: auth get failed: failed to find osd.0 in keyring retval: -2"'
- '2021-08-30T03:52:34.100977Z service:osd.all-available-devices 
[ERROR] "Failed to

  apply: auth get failed: failed to find osd.3 in keyring retval: -2"'
- '2021-08-30T03:52:42.260439Z service:osd.all-available-devices 
[ERROR] "Failed to

  apply: auth get failed: failed to find osd.6 in keyring retval: -2"'


Will be fixed by https://github.com/ceph/ceph/pull/42989





2. ceph orch ps --daemon-type osd --format yaml

Output: ...snip...

3. ceph auth add osd.0 osd 'allow *' mon 'allow rwx' -i 
/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/osd.0/keyring


I verified 
/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/osd.0/keyring file 
does exist.


Output:

Error EINVAL: caps cannot be specified both in keyring and in command



You only need to create the keyring, you don't need to store the keyring 
anywhere. I'd still suggest to somehow create the keyring, but I haven't 
seen this particular error before.



hth

Sebastian




Thanks

On 8/30/21 10:28, Sebastian Wagner wrote:

Could you run

1. ceph orch ls --service-type osd --format yaml

2. ceph orch ps --daemon-type osd --format yaml

3. try running the `ceph auth add` call from 
https://docs.ceph.com/en/mimic/rados/operations/add-or-rm-osds/#adding-an-osd-manual 




Am 30.08.21 um 14:49 schrieb Alcatraz:

Hello all,

Running into some issues trying to build a virtual PoC for Ceph. 
Went to my cloud provider of choice and spun up some nodes. I have 
three identical hosts consisting of:


Debian 10
8 cpu cores
16GB RAM
1x315GB Boot Drive
3x400GB Data drives

After deploying Ceph (v 16.2.5) using cephadm, adding hosts, and 
logging into the dashboard, Ceph showed 9 OSDs, 0 up, 9 in. I 
thought perhaps it just needed some time to bring up the OSDs, so I

[ceph-users] Re: Very beginner question for cephadm: config file for bootstrap and osd_crush_chooseleaf_type

2021-08-30 Thread Sebastian Wagner

Try running `cephadm bootstrap --single-host-defaults`

Am 20.08.21 um 18:23 schrieb Eugen Block:

Hi,

you can just set the config option with 'ceph config set ...' after 
your cluster has been bootstrapped. See [1] for more details about the 
config store.


[1] 
https://docs.ceph.com/en/latest/rados/configuration/ceph-conf/#monitor-configuration-database



Zitat von Dong Xie :



Dear All,



Early days of my venture with Ceph. I understand that one should have at
least two hosts to truly appreciate the design of Ceph. But as a baby 
step

having one playground on a single host is really unavoidable.



My env:

A single Azure VM

Ubuntu 20.04

Installed cephadm via apt per official doc.

ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus
(stable)



Issues encountered:

Following steps here: https://docs.ceph.com/en/latest/cephadm/install/

I’ve created an initial-ceph.conf to set “osd crush chooseleaf type = 0”.



1, with cmd like:

sudo cephadm bootstrap --config initial-ceph.conf --mon-ip 10.2.0.4

It seems my config line is simply ignored in the final conf.

I’ve tried “osd crush chooseleaf type = 0” (with spaces rather than
underscores; I doubt whether the documentation is correct).

I’ve also tried “osd_crush_chooseleaf_type = 0” (with underscores rather
than spaces).

I’ve tried putting a tab to start the line, or without tab.



My line is always ignored, unless I use --no-minimize-config switch.



2, with the above switch, and with my value set to 0 in the conf file, when I
log in to the dashboard and look at Cluster -> Configuration and search for this
item, it shows a Default value of 1 and an empty Current value. Does that mean
it didn’t really get set?



I’ve tried to search for an answer for a while, but didn’t find any hint or
workaround; any help would be greatly appreciated.



Best regards,



Dong Xie

CodeRobin Ltd.






___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Brand New Cephadm Deployment, OSDs show either in/down or out/down

2021-08-30 Thread Sebastian Wagner

Could you run

1. ceph orch ls --service-type osd --format yaml

2. ceph orch ps --daemon-type osd --format yaml

3. try running the `ceph auth add` call from 
https://docs.ceph.com/en/mimic/rados/operations/add-or-rm-osds/#adding-an-osd-manual 




Am 30.08.21 um 14:49 schrieb Alcatraz:

Hello all,

Running into some issues trying to build a virtual PoC for Ceph. Went 
to my cloud provider of choice and spun up some nodes. I have three 
identical hosts consisting of:


Debian 10
8 cpu cores
16GB RAM
1x315GB Boot Drive
3x400GB Data drives

After deploying Ceph (v 16.2.5) using cephadm, adding hosts, and 
logging into the dashboard, Ceph showed 9 OSDs, 0 up, 9 in. I thought 
perhaps it just needed some time to bring up the OSDs, so I left it 
running overnight.


This morning, I checked, and the Ceph dashboard shows 9 OSDs, 0 up, 6 
in, 3 out. I find this odd, as it hasn't been touched since it was 
deployed. Ceph health shows "HEALTH_OK", `ceph osd tree` outputs:


ID  CLASS  WEIGHT  TYPE NAME STATUS  REWEIGHT  PRI-AFF
-1  0  root default
 0  0  osd.0   down 0  1.0
 1  0  osd.1   down 0  1.0
 2  0  osd.2   down 0  1.0
 3  0  osd.3   down   1.0  1.0
 4  0  osd.4   down   1.0  1.0
 5  0  osd.5   down   1.0  1.0
 6  0  osd.6   down   1.0  1.0
 7  0  osd.7   down   1.0  1.0
 8  0  osd.8   down   1.0  1.0

and if I run `ls /var/run/ceph` the only thing it outputs is 
"d1405594-0944-11ec-8ebc-f23c92edc936" (sans quotes), which I assume 
is the cluster ID? So of course, if I run `ceph daemon osd.8 help` for 
example, it just returns:


Can't get admin socket path: unable to get conf option admin_socket 
for osd: b"error parsing 'osd': expected string of the form TYPE.ID, 
valid types are: auth, mon, osd, mds, mgr, client\n"


If I look at the log within the Ceph dashboard, no errors or warnings 
appear. Will Ceph not work on virtual hardware? Is there something I 
need to do to bring up the OSDs?


Just as I was about to send this email I went to check the logs and it 
shows the following (traceback ommited for length):


8/30/21 7:44:15 AM[ERR]Failed to apply osd.all-available-devices spec 
DriveGroupSpec(name=all-available-devices->placement=PlacementSpec(host_pattern='*'), 
service_id='all-available-devices', service_type='osd', 
data_devices=DeviceSelection(all=True), osd_id_claims={}, 
unmanaged=False, filter_logic='AND', preview_only=False): auth get 
failed: failed to find osd.6 in keyring retval: -2


8/30/21 7:45:19 AM[ERR]executing create_from_spec_one(([('ceph01', 
0x7f63a930bf98>), ('ceph02', 
0x7f63a81ac8d0>), ('ceph03', 
0x7f63a930b0b8>)],)) failed.


and similar for the other OSDs. I'm not sure why it's complaining 
about auth, because in order to even add the hosts to the cluster I 
had to copy the ceph public key to the hosts to begin with.


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: "ceph orch ls", "ceph orch daemon rm" fail with exception "'KeyError: 'not'" on 15.2.10

2021-08-10 Thread Sebastian Wagner

Hi,

you managed to hit https://tracker.ceph.com/issues/51176 which will be 
fixed by https://github.com/ceph/ceph/pull/42177 .


https://tracker.ceph.com/issues/51176#note-9 contains a list of steps 
for you to recover from this.



Hope that helps,

Sebastian



Am 09.08.21 um 13:11 schrieb Erkki Seppala:

Hi,

Might anyone have any insight for this issue?  I have been unable to resolve
it so far and it prevents many "ceph orch" commands and breaks many aspects
of the Web user interface.



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph Pacific mon is not starting after host reboot

2021-08-10 Thread Sebastian Wagner

Good morning Robert,

Am 10.08.21 um 09:53 schrieb Robert Sander:

Hi,

Am 09.08.21 um 20:44 schrieb Adam King:

This issue looks the same as https://tracker.ceph.com/issues/51027 
which is
being worked on. Essentially, it seems that hosts that were being 
rebooted
were temporarily marked as offline and cephadm had an issue where it 
would
try to remove all daemons (outside of osds I believe) from offline 
hosts.


Sorry for maybe being rude but how on earth does one come up with the 
idea to automatically remove components from a cluster where just one 
node is currently rebooting without any operator interference?


Obviously no one :-). We already have over 750 tests for the cephadm 
scheduler and I can foresee that we'll get some additional ones for this 
case as well.


Kind regards,

Sebastian




Regards


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm shell fails to start due to missing config files?

2021-07-05 Thread Sebastian Wagner

Hi Vladimir,

The behavior of `cephadm shell` will be improved by 
https://github.com/ceph/ceph/pull/42028. In the meantime, as a 
workaround, you can either deploy a daemon on this host or you can copy 
the system's ceph.conf into the location that is shown in the error 
message.
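
A sketch of that workaround, using the paths from your output (assuming
the host has a valid /etc/ceph/ceph.conf):

    mkdir -p /var/lib/ceph/79656e6e-21e2-4092-ac04-d536f25a435d/mon.ceph-1
    cp /etc/ceph/ceph.conf /var/lib/ceph/79656e6e-21e2-4092-ac04-d536f25a435d/mon.ceph-1/config

Alternatively, pointing the shell at the system config explicitly should
also work:

    cephadm shell -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring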


Hope that helps,

Sebastian

Am 02.07.21 um 19:04 schrieb Vladimir Brik:

Hello

I am getting an error on one node in my cluster (other nodes are fine) 
when trying to run "cephadm shell". Historically this machine has been 
used as the primary Ceph management host, so it would be nice if this 
could be fixed.


ceph-1 ~ # cephadm -v shell
container_init=False
Inferring fsid 79656e6e-21e2-4092-ac04-d536f25a435d
Inferring config 
/var/lib/ceph/79656e6e-21e2-4092-ac04-d536f25a435d/mon.ceph-1/config
Running command: /usr/bin/podman images --filter label=ceph=True 
--filter dangling=false --format {{.Repository}}@{{.Digest}}
/usr/bin/podman: stdout 
docker.io/ceph/daemon-base@sha256:0810dc7db854150bc48cf8fc079875e28b3138d070990a630b8fb7cec7cd2ced
/usr/bin/podman: stdout 
docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949
/usr/bin/podman: stdout 
docker.io/ceph/ceph@sha256:16d37584df43bd6545d16e5aeba527de7d6ac3da3ca7b882384839d2d86acc7d
Using recent ceph image 
docker.io/ceph/daemon-base@sha256:0810dc7db854150bc48cf8fc079875e28b3138d070990a630b8fb7cec7cd2ced
Running command: /usr/bin/podman run --rm --ipc=host --net=host 
--entrypoint stat -e 
CONTAINER_IMAGE=docker.io/ceph/daemon-base@sha256:0810dc7db854150bc48cf8fc079875e28b3138d070990a630b8fb7cec7cd2ced 
-e NODE_NAME=ceph-1 
docker.io/ceph/daemon-base@sha256:0810dc7db854150bc48cf8fc079875e28b3138d070990a630b8fb7cec7cd2ced 
-c %u %g /var/lib/ceph

stat: stdout 167 167
Running command (timeout=None): /usr/bin/podman run --rm --ipc=host 
--net=host --privileged --group-add=disk -it -e LANG=C -e PS1=[ceph: 
\u@\h \W]\$  -e 
CONTAINER_IMAGE=docker.io/ceph/daemon-base@sha256:0810dc7db854150bc48cf8fc079875e28b3138d070990a630b8fb7cec7cd2ced 
-e NODE_NAME=ceph-1 -v 
/var/run/ceph/79656e6e-21e2-4092-ac04-d536f25a435d:/var/run/ceph:z -v 
/var/log/ceph/79656e6e-21e2-4092-ac04-d536f25a435d:/var/log/ceph:z -v 
/var/lib/ceph/79656e6e-21e2-4092-ac04-d536f25a435d/crash:/var/lib/ceph/crash:z 
-v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm 
-v /run/lock/lvm:/run/lock/lvm -v 
/var/lib/ceph/79656e6e-21e2-4092-ac04-d536f25a435d/mon.ceph-1/config:/etc/ceph/ceph.conf:z 
-v /etc/ceph/ceph.client.admin.keyring:/etc/ceph/ceph.keyring:z -v 
/var/lib/ceph/79656e6e-21e2-4092-ac04-d536f25a435d/home:/root 
--entrypoint bash 
docker.io/ceph/daemon-base@sha256:0810dc7db854150bc48cf8fc079875e28b3138d070990a630b8fb7cec7cd2ced
Error: error checking path 
"/var/lib/ceph/79656e6e-21e2-4092-ac04-d536f25a435d/mon.ceph-1/config": 
stat 
/var/lib/ceph/79656e6e-21e2-4092-ac04-d536f25a435d/mon.ceph-1/config: 
no such file or directory



The machine in question doesn't run a mon daemon (but it did a long 
time ago), so I am not sure why "cephadm shell" on this particular 
machine is looking for mon.ceph-1/config



Can anybody help?

Thanks,

Vlad
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io





___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Module 'devicehealth' has failed:

2021-06-15 Thread Sebastian Wagner

Hi Torkil,

you should see more information in the MGR log file.

Might be an idea to restart the MGR to get some recent logs.
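
If the cluster is managed by cephadm/ceph orch, that could look roughly
like this (a sketch; <mgr-name> is a placeholder for the active mgr's name):

    ceph orch restart mgr
    # or fail over to a standby mgr and watch the new active one's log
    ceph mgr fail <mgr-name>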

Am 15.06.21 um 09:41 schrieb Torkil Svensgaard:

Hi

Looking at this error in v15.2.13:

"
[ERR] MGR_MODULE_ERROR: Module 'devicehealth' has failed:
    Module 'devicehealth' has failed:
"

It used to work. Since the module is always on I can't seem to restart 
it and I've found no clue as to why it failed. I've tried rebooting 
all hosts to no avail.


Suggestions?

Thanks,

Torkil


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io





___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: lib remoto in ubuntu

2021-06-11 Thread Sebastian Wagner

Hi Alfredo,

If you don't use cephadm, then I'd recommend not installing the 
ceph-mgr-cephadm package.


If you use cephadm with an ubuntu based container, you'll have to make 
sure that the MGR properly finds the remoto package within the container.


Thanks,

Sebastian

Am 11.06.21 um 05:24 schrieb Alfredo Rezinovsky:

I cannot enable cephadm because it cannot find remoto lib.

Even when I installed it using "pip3 install remoto" and then installed it
from the deb package built from the git sources at
https://github.com/alfredodeza/remoto/

If I type "import remoto" in a python3 prompt it works.





___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mon vanished after cephadm upgrade

2021-05-14 Thread Sebastian Wagner

Hi Ashley,

is sn-m01 listed in `ceph -s`? Which hosts are listed in `ceph orch ps 
--daemon-type mon`?



Otherwise, there are two helpful commands now:

 * `ceph orch daemon rm mon.sn-m01` to remove the mon
 * `ceph orch daemon start mon.sn-m01` to start it again

Am 14.05.21 um 14:14 schrieb Ashley Merrick:

I had a 3-mon Ceph cluster. After updating from 15.2.x to 16.2.x, one of my mons is showing
as in a stopped state in the Ceph dashboard. Checking the cephadm logs on the server in
question I can see "/usr/bin/docker: Error: No such object:
ceph-30449cba-44e4-11eb-ba64-dda10beff041-mon.sn-m01". There are a few OSD services running
on the same physical server and they are all starting/running fine via docker. I tried to do
a cephadm apply mon to push a new mon to the same host, but it seems to not do anything;
nothing shows in the same log file on sn-m01. Also, ceph -s shows full health and no errors
and has no trace of the "failed" mon (not sure if this is expected); only in the Ceph
dashboard under services can I see the stopped, not running mon.
  
Sent via MXlogin

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: one of 3 monitors keeps going down

2021-04-29 Thread Sebastian Wagner

Right, here are the docs for that workflow:

https://docs.ceph.com/en/latest/cephadm/mon/#mon-service
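
In short, something along these lines (a sketch; only do this while the
other two mons are healthy and in quorum):

    # recreate the mon container in place
    ceph orch daemon redeploy mon.cube
    # or remove the daemon and let the orchestrator re-create it from the mon spec
    ceph orch daemon rm mon.cube --force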

Am 29.04.21 um 13:13 schrieb Eugen Block:

Hi,

instead of copying MON data to this one did you also try to redeploy the 
MON container entirely so it gets a fresh start?



Zitat von "Robert W. Eckert" :


Hi,
On a daily basis, one of my monitors goes down

[root@cube ~]# ceph health detail
HEALTH_WARN 1 failed cephadm daemon(s); 1/3 mons down, quorum 
rhel1.robeckert.us,story

[WRN] CEPHADM_FAILED_DAEMON: 1 failed cephadm daemon(s)
    daemon mon.cube on cube.robeckert.us is in error state
[WRN] MON_DOWN: 1/3 mons down, quorum rhel1.robeckert.us,story
    mon.cube (rank 2) addr 
[v2:192.168.2.142:3300/0,v1:192.168.2.142:6789/0] is down (out of quorum)

[root@cube ~]# ceph --version
ceph version 15.2.11 (e3523634d9c2227df9af89a4eac33d16738c49cb) 
octopus (stable)


I have a script that will copy the mon data from another server and it 
restarts and runs well for a while.


It is always the same monitor, and when I look at the logs the only 
thing I really see is the cephadm log showing it down


2021-04-28 10:07:26,173 DEBUG Running command: /usr/bin/podman --version
2021-04-28 10:07:26,217 DEBUG /usr/bin/podman: stdout podman version 
2.2.1
2021-04-28 10:07:26,222 DEBUG Running command: /usr/bin/podman inspect 
--format {{.Id}},{{.Config.Image}},{{.Image}},{{.Created}},{{index 
.Config.Labels "io.ceph.version"}} 
ceph-fe3a7cb0-69ca-11eb-8d45-c86000d08867-osd.2
2021-04-28 10:07:26,326 DEBUG /usr/bin/podman: stdout 
fab17e5242eb4875e266df19ca89b596a2f2b1d470273a99ff71da2ae81eeb3c,docker.io/ceph/ceph:v15,5b724076c58f97872fc2f7701e8405ec809047d71528f79da452188daf2af72e,2021-04-26 
17:13:15.54183375 -0400 EDT,
2021-04-28 10:07:26,328 DEBUG Running command: systemctl is-enabled 
ceph-fe3a7cb0-69ca-11eb-8d45-c86000d08...@mon.cube 


2021-04-28 10:07:26,334 DEBUG systemctl: stdout enabled
2021-04-28 10:07:26,335 DEBUG Running command: systemctl is-active 
ceph-fe3a7cb0-69ca-11eb-8d45-c86000d08...@mon.cube 


2021-04-28 10:07:26,340 DEBUG systemctl: stdout failed
2021-04-28 10:07:26,340 DEBUG Running command: /usr/bin/podman --version
2021-04-28 10:07:26,395 DEBUG /usr/bin/podman: stdout podman version 
2.2.1
2021-04-28 10:07:26,402 DEBUG Running command: /usr/bin/podman inspect 
--format {{.Id}},{{.Config.Image}},{{.Image}},{{.Created}},{{index 
.Config.Labels "io.ceph.version"}} 
ceph-fe3a7cb0-69ca-11eb-8d45-c86000d08867-mon.cube
2021-04-28 10:07:26,526 DEBUG /usr/bin/podman: stdout 
04e7c673cbacf5160427b0c3eb2f0948b2f15d02c58bd1d9dd14f975a84cfc6f,docker.io/ceph/ceph:v15,5b724076c58f97872fc2f7701e8405ec809047d71528f79da452188daf2af72e,2021-04-28 
08:54:57.614847512 -0400 EDT,


I don't know if it matters, but this  server is an AMD 3600XT while my 
other two servers which have had no issues are intel based.


The root file system was originally on a SSD, and I switched to NVME, 
so I eliminated controller or drive issues.  (I didn't see anything in 
dmesg anyway)


If someone could point me in the right direction on where to 
troubleshoot next, I would appreciate it.


Thanks,
Rob Eckert
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm: how to create more than 1 rgw per host

2021-04-19 Thread Sebastian Wagner
Hi Ivan,

this is a feature that is not yet released in Pacific. It seems the
documentation is a bit ahead of time right now.

Sebastian

On Fri, Apr 16, 2021 at 10:58 PM i...@z1storage.com 
wrote:

> Hello,
>
> According to the documentation, there's count-per-host key to 'ceph
> orch', but it does not work for me:
>
> :~# ceph orch apply rgw z1 sa-1 --placement='label:rgw count-per-host:2'
> --port=8000 --dry-run
> Error EINVAL: Host and label are mutually exclusive
>
> Why it says anything about Host if I don't specify any hosts, just labels?
>
> ~# ceph orch host ls
> HOST  ADDR  LABELS   STATUS
> s101  s101  mon rgw
> s102  s102  mgr mon rgw
> s103  s103  mon rgw
> s104  s104  mgr mon rgw
> s105  s105  mgr mon rgw
> s106  s106  mon rgw
> s107  s107  mon rgw
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to disable ceph-grafana during cephadm bootstrap

2021-04-14 Thread Sebastian Wagner
cephadm bootstrap --skip-monitoring-stack

should do the trick. See `man cephadm`.
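
If the cluster is already bootstrapped, the monitoring services can also
be removed again afterwards, e.g. (a sketch; service names as shown by
`ceph orch ls`):

    ceph orch rm grafana
    ceph orch rm alertmanager
    ceph orch rm prometheus
    ceph orch rm node-exporter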

On Tue, Apr 13, 2021 at 6:05 PM mabi  wrote:

> Hello,
>
> When bootstrapping a new ceph Octopus cluster with "cephadm bootstrap",
> how can I tell the cephadm bootstrap NOT to install the ceph-grafana
> container?
>
> Thank you very much in advance for your answer.
>
> Best regards,
> Mabi
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm custom mgr modules

2021-04-12 Thread Sebastian Wagner
You want to build a custom container image for that use case indeed.
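
A hypothetical sketch of that approach (the module name, registry and
image tag below are placeholders; adjust to your setup):

    printf 'FROM docker.io/ceph/ceph:v15\nCOPY my_module /usr/share/ceph/mgr/my_module\n' > Dockerfile
    docker build -t registry.example.com/ceph/ceph:v15-my-module .
    docker push registry.example.com/ceph/ceph:v15-my-module
    # have cephadm use the custom image for future (re)deployments
    ceph config set global container_image registry.example.com/ceph/ceph:v15-my-module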

On Mon, Apr 12, 2021 at 2:18 PM Rob Haverkamp  wrote:

> Hi there,
>
> I'm developing a custom ceph-mgr module and have issues deploying this on
> a cluster deployed with cephadm.
> With a cluster deployed with ceph-deploy, I can just put my code under
> /usr/share/ceph/mgr/ and load the module. This works fine.
>
> I think I found 2 options to do this with cephadm:
>
> 1. build a custom container image:
> https://docs.ceph.com/en/octopus/cephadm/install/#deploying-custom-containers
> 2. use the --shared_ceph_folder during cephadm bootstrap: 'Development
> mode. Several folders in containers are volumes mapped to different
> sub-folders in the ceph source folder'
>
>
> The shared folder method is only meant for development. So that is not an
> option in a production environment.
> Building a custom container image should be possible, but I don't think I
> want to go there.
>
> Are there more options?
>
> It would be nice if it was possible to deploy the managers with a custom
> service specification that for example mounts a folder from the host system
> to /usr/share/ceph/mgr/ in the container.
>
>
> Thanks!
>
> Rob Haverkamp
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Unhealthy Cluster | Remove / Purge duplicate osds | Fix daemon

2021-03-16 Thread Sebastian Wagner
Hi Oliver,

I don't know how you managed to remove all MGRs from the cluster, but
there is the documentation to manually recover from this:


> https://docs.ceph.com/en/latest/cephadm/troubleshooting/#manually-deploying-a-mgr-daemon

Hope that helps,
Sebastian


Am 15.03.21 um 18:24 schrieb Oliver Weinmann:
> Hi Sebastian,
> 
> thanks that seems to have worked. At least on one of the two nodes. But
> now I have another problem. It seems that all mgr daemons are gone and
> ceph command is stuck.
> 
> [root@gedasvl02 ~]# cephadm ls | grep mgr
> 
> I tried to deploy a new mgr but this doesn't seem to work either:
> 
> [root@gedasvl02 ~]# cephadm ls | grep mgr
> [root@gedasvl02 ~]# cephadm deploy --fsid
> d0920c36-2368-11eb-a5de-005056b703af --name mgr.gedaopl03
> INFO:cephadm:Deploy daemon mgr.gedaopl03 ...
> 
> At least I can't see a mgr container on node gedaopl03:
> 
> [root@gedaopl03 ~]# podman ps
> CONTAINER ID  IMAGE
> COMMAND   CREATED STATUS PORTS  NAMES
> 63518d95201b  docker.io/prom/node-exporter:v0.18.1 
> --no-collector.ti...  3 days ago  Up 3 days ago
> ceph-d0920c36-2368-11eb-a5de-005056b703af-node-exporter.gedaopl03
> aa9b57fd77b8  docker.io/ceph/ceph:v15   -n
> client.crash.g...  3 days ago  Up 3 days ago
> ceph-d0920c36-2368-11eb-a5de-005056b703af-crash.gedaopl03
> 8b02715f9cb4  docker.io/ceph/ceph:v15   -n osd.2 -f
> --set...  3 days ago  Up 3 days ago
> ceph-d0920c36-2368-11eb-a5de-005056b703af-osd.2
> 40f15a6357fe  docker.io/ceph/ceph:v15   -n osd.7 -f
> --set...  3 days ago  Up 3 days ago
> ceph-d0920c36-2368-11eb-a5de-005056b703af-osd.7
> bda260378239  docker.io/ceph/ceph:v15   -n
> mds.cephfs.ged...  3 days ago  Up 3 days ago
> ceph-d0920c36-2368-11eb-a5de-005056b703af-mds.cephfs.gedaopl03.kybzgy
> [root@gedaopl03 ~]# systemctl --failed
>  
> UNIT  
> LOAD  
> ACTIVE SUB    DESCRIPTION
> ●
> ceph-d0920c36-2368-11eb-a5de-005056b703af@crash.gedaopl03.service 
> loaded
> failed failed Ceph crash.gedaopl03 for d0920c36-2368-11eb-a5de-005056b703af
> ●
> ceph-d0920c36-2368-11eb-a5de-005056b703af@mon.gedaopl03.service   
> loaded
> failed failed Ceph mon.gedaopl03 for d0920c36-2368-11eb-a5de-005056b703af
> ●
> ceph-d0920c36-2368-11eb-a5de-005056b703af@node-exporter.gedaopl03.service 
> loaded
> failed failed Ceph node-exporter.gedaopl03 for
> d0920c36-2368-11eb-a5de-005056b703af
> ●
> ceph-d0920c36-2368-11eb-a5de-005056b703af@osd.3.service   
> loaded
> failed failed Ceph osd.3 for d0920c36-2368-11eb-a5de-005056b703af
> 
> LOAD   = Reflects whether the unit definition was properly loaded.
> ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
> SUB    = The low-level unit activation state, values depend on unit type.
> 
> 4 loaded units listed. Pass --all to see loaded but inactive units, too.
> To show all installed unit files use 'systemctl list-unit-files'.
> 
> Maybe it's best to just scrap the whole cluster. It is only for testing,
> but I guess it is also a good practice for recovery. :)
> 
> Am 12. März 2021 um 12:35 schrieb Sebastian Wagner :
> 
>> Hi Oliver,
>>
>> # ssh gedaopl02
>> # cephadm rm-daemon osd.0
>>
>> should do the trick.
>>
>> Be careful to remove the broken OSD :-)
>>
>> Best,
>>
>> Sebastian
>>
>> Am 11.03.21 um 22:10 schrieb Oliver Weinmann:
>>> Hi,
>>>
>>> On my 3 node Octopus 15.2.5 test cluster, that I haven't used for quite
>>> a while, I noticed that it shows some errors:
>>>
>>> [root@gedasvl02 ~]# ceph health detail
>>> INFO:cephadm:Inferring fsid d0920c36-2368-11eb-a5de-005056b703af
>>> INFO:cephadm:Inferring config
>>> /var/lib/ceph/d0920c36-2368-11eb-a5de-005056b703af/mon.gedasvl02/config
>>> INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
>>> HEALTH_WARN 2 failed cephadm daemon(s)
>>> [WRN] CEPHADM_FAILED_DAEMON: 2 failed cephadm daemon(s)
>>>     daemon osd.0 on gedaopl02 is in error state
>>>     daemon node-exporter.gedaopl01 on gedaopl01 is in error state
>>>
>>> The error about the osd.0 is strange since osd.0 is actually up and
>>> running but on a different node. I guess I missed to correctly remove it
>>> from node gedaopl02 and then added a new osd to a different node
>>> gedaopl01 and now there are duplicate osd ids for osd.0 and osd.2.
>>>
>>> [root@gedasv

[ceph-users] Re: Container deployment - Ceph-volume activation

2021-03-12 Thread Sebastian Wagner


Am 11.03.21 um 18:40 schrieb 胡 玮文:
> Hi,
> 
> Assuming you are using cephadm? Checkout this 
> https://docs.ceph.com/en/latest/cephadm/osd/#activate-existing-osds
> 
> 
> ceph cephadm osd activate ...


Might not be backported.

see https://tracker.ceph.com/issues/46691#note-1 for the workaround


> 
> 在 2021年3月11日,23:01,Cloud Guy  写道:
> 
> Hello,
> 
> 
> 
> TL;DR
> 
> Looking for guidance on ceph-volume lvm activate --all as it would apply to
> a containerized ceph deployment (Nautilus or Octopus).
> 
> 
> 
> Detail:
> 
> I’m planning to upgrade my Nautilus non-container cluster to Octopus
> (eventually containerized).   There’s an expanded procedure that was tested
> and working in our lab, however won’t go into the whole process.   My
> question is around existing OSD hosts.
> 
> 
> 
> I have to re-platform the host OS, and one of the ways in the OSDs were
> reactivated previously when this was done (non-containerized) was to
> install ceph packages, deploy keys, config, etc.   then run ceph-volume lvm
> activate --all to magically bring up all OSDs.
> 
> 
> 
> Looking for a similar approach except if the OSDs are containerized, and I
> re-platform the host OS (Centos -> Ubuntu), how could I reactivate all OSDs
> as containers and avoid rebuilding data on the OSDs?
> 
> 
> 
> Thank you.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Unhealthy Cluster | Remove / Purge duplicate osds | Fix daemon

2021-03-12 Thread Sebastian Wagner
Hi Oliver,

# ssh gedaopl02
# cephadm rm-daemon osd.0

should do the trick.

Be careful to remove the broken OSD :-)

Best,

Sebastian

Am 11.03.21 um 22:10 schrieb Oliver Weinmann:
> Hi,
> 
> On my 3 node Octopus 15.2.5 test cluster, that I haven't used for quite
> a while, I noticed that it shows some errors:
> 
> [root@gedasvl02 ~]# ceph health detail
> INFO:cephadm:Inferring fsid d0920c36-2368-11eb-a5de-005056b703af
> INFO:cephadm:Inferring config
> /var/lib/ceph/d0920c36-2368-11eb-a5de-005056b703af/mon.gedasvl02/config
> INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
> HEALTH_WARN 2 failed cephadm daemon(s)
> [WRN] CEPHADM_FAILED_DAEMON: 2 failed cephadm daemon(s)
>     daemon osd.0 on gedaopl02 is in error state
>     daemon node-exporter.gedaopl01 on gedaopl01 is in error state
> 
> The error about the osd.0 is strange since osd.0 is actually up and
> running but on a different node. I guess I missed to correctly remove it
> from node gedaopl02 and then added a new osd to a different node
> gedaopl01 and now there are duplicate osd ids for osd.0 and osd.2.
> 
> [root@gedasvl02 ~]# ceph orch ps
> INFO:cephadm:Inferring fsid d0920c36-2368-11eb-a5de-005056b703af
> INFO:cephadm:Inferring config
> /var/lib/ceph/d0920c36-2368-11eb-a5de-005056b703af/mon.gedasvl02/config
> INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
> NAME HOST   STATUS    REFRESHED AGE 
> VERSION    IMAGE NAME    IMAGE ID  CONTAINER ID
> alertmanager.gedasvl02   gedasvl02  running (6h)  7m ago 4M  
> 0.20.0 docker.io/prom/alertmanager:v0.20.0 0881eb8f169f  5b80fb977a5f
> crash.gedaopl01  gedaopl01  stopped   7m ago 4M  
> 15.2.5 docker.io/ceph/ceph:v15 4405f6339e35  810cf432b6d6
> crash.gedaopl02  gedaopl02  running (5h)  7m ago 4M  
> 15.2.5 docker.io/ceph/ceph:v15 4405f6339e35  34ab264fd5ed
> crash.gedaopl03  gedaopl03  running (2d)  7m ago 2d  
> 15.2.9 docker.io/ceph/ceph:v15 dfc483079636  233f30086d2d
> crash.gedasvl02  gedasvl02  running (6h)  7m ago 4M  
> 15.2.5 docker.io/ceph/ceph:v15 4405f6339e35  ea3d3e7c4f58
> grafana.gedasvl02    gedasvl02  running (6h)  7m ago 4M  
> 6.6.2  docker.io/ceph/ceph-grafana:6.6.2 a0dce381714a  5a94f3e41c32
> mds.cephfs.gedaopl01.zjuhem  gedaopl01  stopped   7m ago 3M  
>   docker.io/ceph/ceph:v15  
> mds.cephfs.gedasvl02.xsjtpi  gedasvl02  running (6h)  7m ago 3M  
> 15.2.5 docker.io/ceph/ceph:v15 4405f6339e35  26e7c8759d89
> mgr.gedaopl03.zilwbl gedaopl03  running (7h)  7m ago 7h  
> 15.2.9 docker.io/ceph/ceph:v15 dfc483079636  e18b6f40871c
> mon.gedaopl03    gedaopl03  running (7h)  7m ago 7h  
> 15.2.9 docker.io/ceph/ceph:v15 dfc483079636  5afdf40e41ba
> mon.gedasvl02    gedasvl02  running (6h)  7m ago 4M  
> 15.2.5 docker.io/ceph/ceph:v15 4405f6339e35  e83dfcd864aa
> node-exporter.gedaopl01  gedaopl01  error 7m ago 4M  
> 0.18.1 docker.io/prom/node-exporter:v0.18.1 e5a616e4b9cf  0fefcfcc9639
> node-exporter.gedaopl02  gedaopl02  running (5h)  7m ago 4M  
> 0.18.1 docker.io/prom/node-exporter:v0.18.1 e5a616e4b9cf  f459045b7e41
> node-exporter.gedaopl03  gedaopl03  running (2d)  7m ago 2d  
> 0.18.1 docker.io/prom/node-exporter:v0.18.1 e5a616e4b9cf  3bd9f8dd6d5b
> node-exporter.gedasvl02  gedasvl02  running (6h)  7m ago 4M  
> 0.18.1 docker.io/prom/node-exporter:v0.18.1 e5a616e4b9cf  72e96963261e
> *osd.0    gedaopl01  running (5h)  7m ago 5h  
> 15.2.5 docker.io/ceph/ceph:v15 4405f6339e35  ed76fafb1988**
> **osd.0    gedaopl02  error 7m ago 4M  
>  docker.io/ceph/ceph:v15    *
> osd.1    gedaopl01  running (4h)  7m ago 3d  
> 15.2.5 docker.io/ceph/ceph:v15 4405f6339e35  41a43733e601
> *osd.2    gedaopl01  stopped   7m ago 4M  
>  docker.io/ceph/ceph:v15    **
> **osd.2    gedaopl03  running (7h)  7m ago 7h  
> 15.2.9 docker.io/ceph/ceph:v15 dfc483079636  ac9e660db2fb*
> osd.3    gedaopl03  running (7h)  7m ago 7h  
> 15.2.9 docker.io/ceph/ceph:v15 dfc483079636  bde17b5bb2fb
> osd.4    gedaopl02  running (5h)  7m ago 3d  
> 15.2.5 docker.io/ceph/ceph:v15 4405f6339e35  7cc3ef7c4469
> osd.5    gedaopl02  running (5h)  7m ago 3d  
> 15.2.5 docker.io/ceph/ceph:v15 4405f6339e35  761b96d235e4
> osd.6    gedaopl02  running (5h)  7m ago 3d  
> 15.2.5 docker.io/ceph/ceph:v15 4405f6339e35  d047b28fe2bd
> osd.7    gedaopl03  running (7h)  7m ago 7h  
> 15.2.9 docker.io/ceph/ceph:v15 dfc483079636  3b54b01841f4
> osd.8    gedaopl01  running (5h)  7m ago 5h  
> 15.2.5 docker.io/ceph/ceph:v15 4405f6339e35  cdd308cdc82b
> prometheus.gedasvl

[ceph-users] Re: Cephadm: Upgrade 15.2.5 -> 15.2.9 stops on non existing OSD

2021-03-11 Thread Sebastian Wagner
yes
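
A sketch of that removal (fsid and host taken from the `cephadm ls` output
quoted below; double-check the daemon name first):

    # on pech-hd-009
    cephadm rm-daemon --fsid 3614abcc-201c-11eb-995a-2794bcc75ae0 --name osd.355 --force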

Am 11.03.21 um 15:46 schrieb Kai Stian Olstad:
> Hi Sebastian
> 
> On 11.03.2021 13:13, Sebastian Wagner wrote:
>> looks like
>>
>> $ ssh pech-hd-009
>> # cephadm ls
>>
>> is returning this non-existent OSDs.
>>
>> can you verify that `cephadm ls` on that host doesn't
>> print osd.355 ?
> 
> "cephadm ls" on the node does list this drive
> 
> {
>     "style": "cephadm:v1",
>     "name": "osd.355",
>     "fsid": "3614abcc-201c-11eb-995a-2794bcc75ae0",
>     "systemd_unit": "ceph-3614abcc-201c-11eb-995a-2794bcc75ae0@osd.355",
>     "enabled": true,
>     "state": "stopped",
>     "container_id": null,
>     "container_image_name":
> "goharbor.example.com/library/ceph/ceph:v15.2.5",
>     "container_image_id": null,
>     "version": null,
>     "started": null,
>     "created": "2021-01-20T09:53:22.229080",
>     "deployed": "2021-02-09T09:24:02.855576",
>     "configured": "2021-02-09T09:24:04.211587"
> }
> 
> 
> To resolve it, could I just remove it with "cephadm rm-daemon"?
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm (curl master)/15.2.9:: how to add orchestration

2021-03-11 Thread Sebastian Wagner
Hi Adrian,



Am 11.03.21 um 13:55 schrieb Adrian Sevcenco:
> Hi! After an initial bumpy bootstrapping (IMHO the defaults should be
> whatever is already defined in .ssh of the user and custom values setup
> with cli arguments) now i'm stuck adding any service/hosts/osds because
> apparently i lack orchestration

did you call bootstrap with --skip-ssh? Might explain this.

Don't know how you ended up with your url, but the correct one is:


> https://docs.ceph.com/en/octopus/mgr/orchestrator/

$ ceph mgr module enable cephadm
$ ceph orch set backend cephadm
$ ceph cephadm set-ssh-config -i ...
$ ceph cephadm set-priv-key -i ...
$ ceph cephadm set-pub-key -i ...

Then

> https://docs.ceph.com/en/octopus/cephadm/install/#add-hosts-to-the-cluster


should get you going again.

I'd recommend to avoid calling --skip-ssh to avoid this roundtrip.
Setting the ssh configs via


> https://docs.ceph.com/en/latest/man/8/cephadm/#bootstrap

works better typically.


> .. the documentation shows a big
> "Page does not exist"
> see
> https://docs.ceph.com/en/latest/docs/octopus/mgr/orchestrator
> 
> so, what is it and what options do i have?
> to set up it seems that is as easy as:
> ceph orch set backend
> 
> I just started with ceph and i just want to start a ceph service (i
> cannot call it a cluster) on my desktop (with 2 dedicated osds) to get
> familiar also with usage.
> 
> Thanks a lot!
> Adrian
> 
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cephadm: Upgrade 15.2.5 -> 15.2.9 stops on non existing OSD

2021-03-11 Thread Sebastian Wagner
Hi Kai,

looks like

$ ssh pech-hd-009
# cephadm ls

is returning this non-existent OSDs.

can you verify that `cephadm ls` on that host doesn't
print osd.355 ?

Best,
Sebastian

Am 11.03.21 um 12:16 schrieb Kai Stian Olstad:
> Before I started the upgrade the cluster was healthy but one
> OSD(osd.355) was down, can't remember if it was in or out.
> Upgrade was started with
>     ceph orch upgrade start --image
> goharbor.example.com/library/ceph/ceph:v15.2.9
> 
> The upgrade started but when Ceph tried to upgrade osd.355 it paused
> with the following messages:
> 
>     2021-03-11T09:15:35.638104+ mgr.pech-mon-2.cjeiyc [INF] Upgrade:
> Target is goharbor.example.com/library/ceph/ceph:v15.2.9 with id
> dfc48307963697ff48acd9dd6fda4a7a24017b9d8124f86c2
> a542b0802fe77ba
>     2021-03-11T09:15:35.639882+ mgr.pech-mon-2.cjeiyc [INF] Upgrade:
> Checking mgr daemons...
>     2021-03-11T09:15:35.644170+ mgr.pech-mon-2.cjeiyc [INF] Upgrade:
> All mgr daemons are up to date.
>     2021-03-11T09:15:35.644376+ mgr.pech-mon-2.cjeiyc [INF] Upgrade:
> Checking mon daemons...
>     2021-03-11T09:15:35.647669+ mgr.pech-mon-2.cjeiyc [INF] Upgrade:
> All mon daemons are up to date.
>     2021-03-11T09:15:35.647866+ mgr.pech-mon-2.cjeiyc [INF] Upgrade:
> Checking crash daemons...
>     2021-03-11T09:15:35.652035+ mgr.pech-mon-2.cjeiyc [INF] Upgrade:
> Setting container_image for all crash...
>     2021-03-11T09:15:35.653683+ mgr.pech-mon-2.cjeiyc [INF] Upgrade:
> All crash daemons are up to date.
>     2021-03-11T09:15:35.653896+ mgr.pech-mon-2.cjeiyc [INF] Upgrade:
> Checking osd daemons...
>     2021-03-11T09:15:36.273345+ mgr.pech-mon-2.cjeiyc [INF] It is
> presumed safe to stop ['osd.355']
>     2021-03-11T09:15:36.273504+ mgr.pech-mon-2.cjeiyc [INF] Upgrade:
> It is presumed safe to stop ['osd.355']
>     2021-03-11T09:15:36.273887+ mgr.pech-mon-2.cjeiyc [INF] Upgrade:
> Redeploying osd.355
>     2021-03-11T09:15:36.276673+ mgr.pech-mon-2.cjeiyc [ERR] Upgrade:
> Paused due to UPGRADE_REDEPLOY_DAEMON: Upgrading daemon osd.355 on host
> pech-hd-009 failed.
> 
> 
> One of the first ting the upgrade did was to upgrade mon, so they are
> restarted and now the osd.355 no longer exist
> 
>     $ ceph osd info osd.355
>     Error EINVAL: osd.355 does not exist
> 
> But if I run a resume
>     ceph orch upgrade resume
> it still tries to upgrade osd.355, same message as above.
> 
> I tried to stop and start the upgrade again with
>     ceph orch upgrade stop
>     ceph orch upgrade start --image
> goharbor.example.com/library/ceph/ceph:v15.2.9
> it still tries to upgrade osd.355, with the same message as above.
> 
> Looking at the source code it looks like it gets the daemons to upgrade from
> the mgr cache, so I restarted both mgrs but still it tries to upgrade osd.355.
> 
> 
> Does anyone know how I can get the upgrade to continue?
> 
> -- 
> Kai Stian Olstad
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Alertmanager not using custom configuration template

2021-03-11 Thread Sebastian Wagner
Hi Mark,

Indeed. I just merged https://github.com/ceph/ceph/pull/39932
which fixes the names of those config keys.

Might want to try again (with slashes instead of underscores).

Thanks for reporting this,

Sebastian

Am 10.03.21 um 15:34 schrieb Marc 'risson' Schmitt:
> Hi,
> 
> I'm trying to use a custom template for Alertmanager deployed with
> Cephadm. Following its documentation[1], I set the option
> `mgr/cephadm/alertmanager_alertmanager.yml` to my own template,
> restarted the mgr, and re-deployed Alertmanager. However, Cephadm seems
> to always use its internal template.
> 
> After some debugging, I found that the mgr indeed queries for
> `mgr/cephadm/alertmanager_alertmanager.yml`, but the end template
> always is the default one.
> 
> Am I doing something wrong? Is anyone else having this issue?
> 
> [1]
> https://docs.ceph.com/en/octopus/cephadm/monitoring/#using-custom-configuration-files
> 
> Regards,
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: bug in latest cephadm bootstrap: got an unexpected keyword argument 'verbose_on_failure'

2021-03-03 Thread Sebastian Wagner
Indeed. That is going to be fixed by

https://github.com/ceph/ceph/pull/39633



Am 03.03.21 um 07:31 schrieb Philip Brown:
> Seems like someone is not testing cephadm on centos 7.9
> 
> Just tried installing cephadm from the repo, and ran
> cephadm bootstrap --mon-ip=xxx
> 
> it blew up, with
> 
> ceph TypeError: __init__() got an unexpected keyword argument 
> 'verbose_on_failure'
> 
> just after the firewall section.
> 
> I happen to have a test cluster from a few months ago, and compared the code.
> 
> Someone added, in line 2348,
> 
> "out, err, ret = call([self.cmd, '--permanent', '--query-port', 
> tcp_port], verbose_on_failure=False)"
> 
> this made the init fail, on my centos 7.9 system, freshly installed and 
> updated today.
> 
> # cephadm version
> ceph version 15.2.9 (357616cbf726abb779ca75a551e8d02568e15b17) octopus 
> (stable)
> 
> 
> Simply commenting out that line makes it complete the cluster init like I 
> remember.
> 
> 
> --
> Philip Brown| Sr. Linux System Administrator | Medata, Inc. 
> 5 Peters Canyon Rd Suite 250 
> Irvine CA 92606 
> Office 714.918.1310| Fax 714.918.1325 
> pbr...@medata.com| www.medata.com
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Python API mon_comand()

2021-01-15 Thread Sebastian Wagner


Am 15.01.21 um 09:24 schrieb Robert Sander:
> Hi,
> 
> I am trying to get some statistics via the Python API but fail to run the 
> equivalent of "ceph df detail".
> 
> 
> ...snip...
> cluster.mon_command(json.dumps({'prefix': 'df detail', 'format': 'json'}),
> b'')
> (-22, '', u'command not known')

> 
> Anything I can do to get the output of "ceph df detail" via Python API?
> I would like to have the stats fields "rd", "wr", "rd_bytes" and "wr_bytes" 
> per pool.

https://docs.ceph.com/en/latest/api/mon_command_api/#df

cluster.mon_command(json.dumps(
    {
        'prefix': 'df',
        'detail': 'detail',
        'format': 'json',
    }
), b'')
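
A more complete, runnable sketch (assuming a standard /etc/ceph/ceph.conf and an
admin keyring on the machine; treat the stats field names as illustrative and
check them against the JSON your cluster actually returns):

    import json
    import rados

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        ret, outbuf, outs = cluster.mon_command(
            json.dumps({'prefix': 'df', 'detail': 'detail', 'format': 'json'}),
            b'')
        if ret != 0:
            raise RuntimeError(outs)
        report = json.loads(outbuf)
        for pool in report['pools']:
            stats = pool['stats']
            # per-pool read/write counters, as requested
            print(pool['name'],
                  stats.get('rd'), stats.get('rd_bytes'),
                  stats.get('wr'), stats.get('wr_bytes'))
    finally:
        cluster.shutdown()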

> 
> Regards
> 
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Module 'dashboard' has failed: '_cffi_backend.CDataGCP' object has no attribute 'type'

2020-11-18 Thread Sebastian Wagner
Sounds like a bug. Mind creating a tracker issue?

https://tracker.ceph.com/projects/mgr/issues/new

Am 17.11.20 um 17:39 schrieb Marcelo:
> Hello all.
> 
> I'm trying to deploy the dashboard (Nautilus 14.2.8), and after I run ceph
> dashboard create-self-signed-cert, the cluster started to show this warning:
> # ceph health detail
> HEALTH_ERR Module 'dashboard' has failed: '_cffi_backend.CDataGCP' object
> has no attribute 'type'
> MGR_MODULE_ERROR Module 'dashboard' has failed: '_cffi_backend.CDataGCP'
> object has no attribute 'type'
> Module 'dashboard' has failed: '_cffi_backend.CDataGCP' object has no
> attribute 'type'
> 
> If I set ceph config set mgr mgr/dashboard/ssl false, the error goes away.
> 
> I tried to manually upload the certs, but I'm still hitting the error.
> 
> Has anyone experienced something similar?
> 
> Thanks, Marcelo.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm & iSCSI

2020-09-04 Thread Sebastian Wagner
Thanks! Do you want to create a bug for that?

https://tracker.ceph.com/projects/orchestrator/issues/new

Am 04.09.20 um 15:25 schrieb Robert Sander:
> Hi,
> 
> yes, I have read https://docs.ceph.com/docs/octopus/cephadm/stability/
> and know that the iSCSI support is still under development.
> 
> But why is the command "ceph orch apply iscsi" available
> when cephadmin is the orchestrator backend?
> 
> I am at the stage that I successfully rolled out an iSCSI gateway
> which is visible in the Ceph dashboard.
> 
> But when trying to setup a target an error occurs:
> 
> Sep 04 15:15:58 ceph02 rbd-target-api[124040]: Unable to create the Target 
> definition - Could not load module: iscsi_target_mod
> Sep 04 15:15:58 ceph02 rbd-target-api[124040]: Unhandled Exception
>Traceback (most recent call 
> last):
>  File 
> "/usr/lib/python3.6/site-packages/rtslib_fb/node.py", line 71, in 
> _create_in_cfs_ine
>os.mkdir(self.path)
>FileNotFoundError: [Errno 2] 
> No such file or directory: '/sys/kernel/config/target/iscsi'
>
>During handling of the above 
> exception, another exception occurred:
>
>Traceback (most recent call 
> last):
>  File 
> "/usr/lib/python3.6/site-packages/rtslib_fb/fabric.py", line 156, in 
> _check_self
>
> self._create_in_cfs_ine('any')
>  File 
> "/usr/lib/python3.6/site-packages/rtslib_fb/node.py", line 74, in 
> _create_in_cfs_ine
>% self.__class__.__name__)
>rtslib_fb.utils.RTSLibError: 
> Could not create ISCSIFabricModule in configFS
>
>During handling of the above 
> exception, another exception occurred:
>
>Traceback (most recent call 
> last):
>  File 
> "/usr/lib/python3.6/site-packages/rtslib_fb/utils.py", line 432, in modprobe
>
> kmod.Kmod().modprobe(module)
>  File "kmod/kmod.pyx", line 
> 106, in kmod.kmod.Kmod.modprobe
>  File "kmod/kmod.pyx", line 
> 82, in lookup
>kmod.error.KmodError: Could 
> not modprobe
>
>During handling of the above 
> exception, another exception occurred:
>
>Traceback (most recent call 
> last):
>  File 
> "/usr/lib/python3.6/site-packages/flask/app.py", line 1612, in 
> full_dispatch_request
>rv = 
> self.dispatch_request()
>  File 
> "/usr/lib/python3.6/site-packages/flask/app.py", line 1598, in 
> dispatch_request
>return 
> self.view_functions[rule.endpoint](**req.view_args)
>  File 
> "/usr/bin/rbd-target-api", line 106, in decorated
>return f(*args, **kwargs)
>  File 
> "/usr/bin/rbd-target-api", line 304, in target
>target.manage('init')
>  File 
> "/usr/lib/python3.6/site-packages/ceph_iscsi_config/target.py", line 710, in 
> manage
>
> 'mutual_password_encryption_enabled'])
>  File 
> "/usr/lib/python3.6/site-packages/ceph_iscsi_config/discovery.py", line 14, 
> in set_discovery_auth_lio
>
> iscsi_fabric.clear_discovery_auth_settings()
>  File 
> "/usr/lib/python3.6/site-packages/rtslib_fb/fabric.py", line 224, in 
> clear_discovery_auth_settings
>self._check_self()
>  File 
> "/usr/lib/python3.6/site-packages/rtslib_fb/fabric.py", line 158, in 
> _check_self
> 

[ceph-users] Re: help me enable ceph iscsi gatewaty in ceph octopus

2020-08-05 Thread Sebastian Wagner
Until iSCSI is fully working in cephadm, you can install ceph-iscsi
manually as described here:

https://docs.ceph.com/docs/master/rbd/iscsi-target-cli/



Am 05.08.20 um 11:44 schrieb Hoài Thương:
> Hello swagner,
> Can you give me the documentation? I use cephadm.

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: help me enable ceph iscsi gatewaty in ceph octopus

2020-08-05 Thread Sebastian Wagner
hi David, hi Ricardo,

I think we first have to clarify whether this was actually a cephadm
deployment (and not ceph-ansible).

If you install Ceph using ceph-ansible, then please refer to the
ceph-ansible docs.

If we're actually talking about cephadm here (which is not clear to me):
iSCSI support for cephadm will land in the next Octopus release, and at that
point we can add proper documentation.



Hope that helps,

Sebastian

Am 05.08.20 um 11:11 schrieb Ricardo Marques:
> Hi David,
> 
> I was able to configure iSCSI gateways on my local test environment using the 
> following spec:
> 
> ```
> # tail -14 service_spec_gw.yml
> ---
> service_type: iscsi
> service_id: iscsi_service
> placement:
>   hosts:
>     - 'node1'
>     - 'node2'
> spec:
>   pool: rbd
>   trusted_ip_list: 10.20.94.201,10.20.94.202,10.20.94.203
>   api_port: 5000
>   api_user: admin1
>   api_password: admin2
>   api_secure: False
> 
> # ceph orch apply -i service_spec_gw.yml
> ```
> 
> You can use this spec as a starting point, but note that the pool must exist 
> (in this case `rbd` pool), and you will need to adapt `hosts`,  
> `trusted_ip_list`, etc...
> 
> You may also want to change `api_secure` to `True` and set `ssl_cert` and 
> `ssl_key` accordingly.
> 
> Unfortunately, iSCSI deployment is not included in the documentation yet 
> (coming soon).
> 
> [1] https://docs.ceph.com/docs/octopus/cephadm/install/
> 
> 
> Ricardo Marques
> 
> 
> From: David Thuong 
> Sent: Wednesday, August 5, 2020 5:16 AM
> To: ceph-users@ceph.io 
> Subject: [ceph-users] help me enable ceph iscsi gatewaty in ceph octopus
> 
> Please help me enable the Ceph iSCSI gateway in Ceph Octopus. When the Ceph
> installation completes, I see the iSCSI gateway is not enabled. Please help me
> configure it.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 6 hosts fail cephadm check (15.2.4)

2020-07-28 Thread Sebastian Wagner
Looks as if your cluster is still running 15.2.1.

Have a look at https://docs.ceph.com/docs/master/cephadm/upgrade/
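
For reference, the corresponding commands would be something like this
(a sketch; the target version is taken from the 15.2.4 packages listed below):

    ceph orch upgrade start --ceph-version 15.2.4
    ceph orch upgrade status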

Am 28.07.20 um 09:57 schrieb Ml Ml:
> Hello,
> 
> i get:
> 
> [WRN] CEPHADM_HOST_CHECK_FAILED: 6 hosts fail cephadm check
> host ceph01 failed check: Failed to connect to ceph01 (ceph01).
> Check that the host is reachable and accepts connections using the
> cephadm SSH key
> you may want to run:
>> ssh -F =(ceph cephadm get-ssh-config) -i =(ceph config-key get 
>> mgr/cephadm/ssh_identity_key) root@ceph01
> host ceph02 failed check: Failed to connect to ceph02 (10.10.1.2).
> Check that the host is reachable and accepts connections using the
> cephadm SSH key
> you may want to run:
>> ssh -F =(ceph cephadm get-ssh-config) -i =(ceph config-key get 
>> mgr/cephadm/ssh_identity_key) root@ceph02
> host ceph03 failed check: Failed to connect to ceph03 (10.10.1.3).
> Check that the host is reachable and accepts connections using the
> cephadm SSH key
> you may want to run:
>> ssh -F =(ceph cephadm get-ssh-config) -i =(ceph config-key get 
>> mgr/cephadm/ssh_identity_key) root@ceph03
> host ceph04 failed check: Failed to connect to ceph04 (10.10.1.4).
> Check that the host is reachable and accepts connections using the
> cephadm SSH key
> you may want to run:
>> ssh -F =(ceph cephadm get-ssh-config) -i =(ceph config-key get 
>> mgr/cephadm/ssh_identity_key) root@ceph04
> host ceph05 failed check: Failed to connect to ceph05 (10.10.1.5).
> Check that the host is reachable and accepts connections using the
> cephadm SSH key
> you may want to run:
>> ssh -F =(ceph cephadm get-ssh-config) -i =(ceph config-key get 
>> mgr/cephadm/ssh_identity_key) root@ceph05
> host ceph06 failed check: Failed to connect to ceph06 (10.10.1.6).
> Check that the host is reachable and accepts connections using the
> cephadm SSH key
> 
> 
> on ceph01 i run:
> ceph cephadm get-ssh-config > /tmp/ceph.conf
> ceph config-key get mgr/cephadm/ssh_identity_key > /tmp/ceph.key
> chmod 600 /tmp/ceph.key
> ssh -F /tmp/ceph.conf -i /tmp/ceph.key root@ceph01 (which works)
> 
> So i can not understand the errors above.
> 
> root@ceph01:~# ceph versions
> {
> "mon": {
> "ceph version 15.2.1
> (9fd2f65f91d9246fae2c841a6222d34d121680ee) octopus (stable)": 3
> },
> "mgr": {
> "ceph version 15.2.1
> (9fd2f65f91d9246fae2c841a6222d34d121680ee) octopus (stable)": 3
> },
> "osd": {
> "ceph version 15.2.1
> (9fd2f65f91d9246fae2c841a6222d34d121680ee) octopus (stable)": 56
> },
> "mds": {
> "ceph version 15.2.1
> (9fd2f65f91d9246fae2c841a6222d34d121680ee) octopus (stable)": 1
> },
> "overall": {
> "ceph version 15.2.1
> (9fd2f65f91d9246fae2c841a6222d34d121680ee) octopus (stable)": 63
> }
> }
> 
> root@ceph01:~# dpkg -l |grep ceph
> ii  ceph-base   15.2.4-1~bpo10+1
>amd64common ceph daemon libraries and management tools
> ii  ceph-common 15.2.4-1~bpo10+1
>amd64common utilities to mount and interact with a ceph
> storage cluster
> ii  ceph-deploy 2.0.1
>all  Ceph-deploy is an easy to use configuration tool
> ii  ceph-fuse   15.2.4-1~bpo10+1
>amd64FUSE-based client for the Ceph distributed file system
> ii  ceph-grafana-dashboards 15.2.4-1~bpo10+1
>all  grafana dashboards for the ceph dashboard
> ii  ceph-mds15.2.4-1~bpo10+1
>amd64metadata server for the ceph distributed file system
> ii  ceph-mgr15.2.4-1~bpo10+1
>amd64manager for the ceph distributed storage system
> ii  ceph-mgr-cephadm15.2.4-1~bpo10+1
>all  cephadm orchestrator module for ceph-mgr
> ii  ceph-mgr-dashboard  15.2.4-1~bpo10+1
>all  dashboard module for ceph-mgr
> ii  ceph-mgr-diskprediction-cloud   15.2.4-1~bpo10+1
>all  diskprediction-cloud module for ceph-mgr
> ii  ceph-mgr-diskprediction-local   15.2.4-1~bpo10+1
>all  diskprediction-local module for ceph-mgr
> ii  ceph-mgr-k8sevents  15.2.4-1~bpo10+1
>all  kubernetes events module for ceph-mgr
> ii  ceph-mgr-modules-core   15.2.4-1~bpo10+1
>all  ceph manager modules which are always enabled
> ii  ceph-mon15.2.4-1~bpo10+1
>amd64monitor server for the ceph storage system
> ii  ceph-osd15.2.4-1~bpo10+1
>amd64OSD server for the ceph storage system
> ii  cephadm 15.2.4-1~bpo10+1
>amd64cephadm utility to bootstrap ceph daemons with systemd
> and containers
> ii  libcephfs1  10.2.11-2
>amd64Ceph distributed file system client libra

[ceph-users] Re: ceph orch apply [osd, mon] -i YAML file not found

2020-07-24 Thread Sebastian Wagner
Did you `alias ceph='cephadm shell -- ceph'`? If so, the spec file on the host
is not visible inside the container, which would explain the "No such file or
directory" error.

Then

cat /root/osd_spec.yml | ceph orch apply -i -

should do the trick.

Nevertheless, I'll remove the alias command from
https://docs.ceph.com/docs/master/cephadm/install/?highlight=alias#enable-ceph-cli
immediately.

Thanks for the report.


Am 23.07.20 um 22:28 schrieb Hayashida, Mami:
> I am exploring the new cephadm tool by deploying a tiny test cluster using
> OpenStack VMs, but have encountered the following issue: the ceph orch apply
> osd -i <file> command fails with "No such file or directory: <path to file>".
> (Monitor setup with a YAML file resulted in the same error.)
> 
> ```
> root@octopus-mon-1:~# ls -lh
> total 12K
> drwxr-xr-x 2 root root 4.0K Jul 23 14:20 Playbooks
> -rw-r--r-- 1 root root   99 Jul 23 16:16 mon_spec.yml
> -rw-r--r-- 1 root root  127 Jul 23 16:03 osd_spec.yml
> 
> root@octopus-mon-1:~# cat osd_spec.yml
> ---
> service_type: osd
> service_id: default_drive_group
> placement:
>   host_pattern: 'octopus-osd-[1-3]'
> data_devices:
>   all: true
> 
> root@octopus-mon-1:~# ceph orch apply osd -i /root/osd_spec.yml --dry-run
>  --verbose
> INFO:cephadm:Inferring fsid 9e87977c-cd16-11ea-b571-fa163ec9a1ad
> INFO:cephadm:Inferring config
> /var/lib/ceph/9e87977c-cd16-11ea-b571-fa163ec9a1ad/mon.octopus-mon-1/config
> INFO:cephadm:Using recent ceph image ceph/ceph:v15
> parsed_args: Namespace(admin_socket=None, block=False, cephconf=None,
> client_id=None, client_name=None, cluster=None, cluster_timeout=None,
> completion=False, help=False, input_file='/root/osd_spec.yml',
> output_file=None, output_format=None, period=1, setgroup=None,
> setuser=None, status=False, verbose=True, version=False, watch=False,
> watch_channel=None, watch_debug=False, watch_error=False, watch_info=False,
> watch_sec=False, watch_warn=False), childargs: ['orch', 'apply', 'osd',
> '--dry-run']
> Can't open input file /root/osd_spec.yml: [Errno 2] No such file or
> directory: '/root/osd_spec.yml'
> 
> root@octopus-mon-1:~# ceph orch --version
> INFO:cephadm:Inferring fsid 9e87977c-cd16-11ea-b571-fa163ec9a1ad
> INFO:cephadm:Inferring config
> /var/lib/ceph/9e87977c-cd16-11ea-b571-fa163ec9a1ad/mon.octopus-mon-1/config
> INFO:cephadm:Using recent ceph image ceph/ceph:v15
> ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus
> (stable)
> 
> 
> 
> 
> *Mami Hayashida*
> *Research Computing Associate*
> Univ. of Kentucky ITS Research Computing Infrastructure
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Monitor IPs

2020-07-16 Thread Sebastian Wagner
Well, for a cephadm deployment, I'd recommend sticking to the workflow
that deploys new MONs.

In order to use the workflow that is based on injecting monmaps, I'd
wait until we have tested documentation for it.



Am 15.07.20 um 15:34 schrieb Amit Ghadge:
> you can try, ceph mon set-addrs a [v2:1.2.3.4:1112,v1:1.2.3.4:],
> https://docs.ceph.com/docs/nautilus/rados/configuration/msgr2/#msgr2-ceph-conf
> 
> 
> On Wed, Jul 15, 2020 at 4:43 PM Will Payne  wrote:
> 
>> I need to change the network my monitors are on. It seems this is not a
>> trivial thing to do. Are there any up-to-date instructions for doing so on
>> a cephadm-deployed cluster?
>>
>> I’ve found some steps in older versions of the docs but not sure if these
>> are still correct - they mention using the ceph-mon command which I don’t
>> have.
>>
>> Will
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm adoption failed

2020-07-14 Thread Sebastian Wagner
Strange. 0xc3 actually looks like UTF-8-encoded German to me.

By chance, do you have a hexdump of

> chown -c -R $uid:$gid $data_dir_dst

?
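
A defensive decode in cephadm would avoid the abort. A minimal local-patch
sketch, based on the line shown in your traceback (not the official fix):

    # in /usr/sbin/cephadm, call(): the 1024-byte read chunk apparently ends in
    # the middle of a multi-byte UTF-8 character, so don't raise on decode errors
    message = message_b.decode('utf-8', errors='replace')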

Am 13.07.20 um 20:51 schrieb Tobias Gall:
> Hello,
> 
> I'm trying to adopt an existing cluster.
> The cluster consists of 5 converged (mon, mgr, osd, mds on same host)
> servers running Octopus 15.2.4.
> 
> I've followed the guide:
> https://docs.ceph.com/docs/octopus/cephadm/adoption/
> 
> Adopting the first mon I've got the following problem:
> 
> root@mulberry:/home/toga# cephadm adopt --style legacy --name mon.mulberry
> INFO:cephadm:Pulling latest docker.io/ceph/ceph:v15 container...
> INFO:cephadm:Stopping old systemd unit ceph-mon@mulberry...
> INFO:cephadm:Disabling old systemd unit ceph-mon@mulberry...
> INFO:cephadm:Moving data...
> INFO:cephadm:Chowning content...
> Traceback (most recent call last):
>   File "/usr/sbin/cephadm", line 4761, in 
>     r = args.func()
>   File "/usr/sbin/cephadm", line 1162, in _default_image
>     return func()
>   File "/usr/sbin/cephadm", line 3241, in command_adopt
>     command_adopt_ceph(daemon_type, daemon_id, fsid);
>   File "/usr/sbin/cephadm", line 3387, in command_adopt_ceph
>     call_throws(['chown', '-c', '-R', '%d.%d' % (uid, gid), data_dir_dst])
>   File "/usr/sbin/cephadm", line 844, in call_throws
>     out, err, ret = call(command, **kwargs)
>   File "/usr/sbin/cephadm", line 784, in call
>     message = message_b.decode('utf-8')
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc3 in position
> 1023: unexpected end of data
> 
> In `cephadm ls` the old mon is gone and the new is present:
> 
> {
>     "style": "cephadm:v1",
>     "name": "mon.mulberry",
>     "fsid": "74307e84-e1fe-4706-8312-fe47703928a1",
>     "systemd_unit":
> "ceph-74307e84-e1fe-4706-8312-fe47703928a1@mon.mulberry",
>     "enabled": false,
>     "state": "stopped",
>     "container_id": null,
>     "container_image_name": null,
>     "container_image_id": null,
>     "version": null,
>     "started": null,
>     "created": null,
>     "deployed": null,
>     "configured": null
> }
> 
> But there is no container running.
> How can I resolve this?
> 
> Regards,
> Tobias
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Error on upgrading to 15.2.4 / invalid service name using containers

2020-07-13 Thread Sebastian Wagner
Thanks! I've created https://tracker.ceph.com/issues/46497

Am 13.07.20 um 11:51 schrieb Mario J. Barchéin Molina:
> Hello. We finally solved the problem, we just deleted the failed service
> with:
> 
>  # ceph orch rm mds.label:mds
> 
> and after that, we could finish the upgrade to 15.2.4.
> 
> 
> El vie., 10 jul. 2020 a las 3:41, Mario J. Barchéin Molina (<
> ma...@intelligenia.com>) escribió:
> 
>> Hello. I'm trying to upgrade to ceph 15.2.4 from 15.2.3. The upgrade is
>> almost finished, but it has entered in a service start/stop loop. I'm using
>> a container deployment over Debian 10 with 4 nodes. The problem is with a
>> service named literally "mds.label:mds". It has the colon character, which
>> is of special use in docker. This character can't appear in the name of the
>> container and also breaks the volumen binding syntax.
>>
>> I have seen in the /var/lib/ceph/UUID/ the files for this service:
>>
>> root@ceph-admin:/var/lib/ceph/0ce93550-b628-11ea-9484-f6dc192416ca# ls
>> -la
>> total 48
>> drwx-- 12167 167 4096 jul 10 02:54 .
>> drwxr-x---  3 ceph   ceph4096 jun 24 16:36 ..
>> drwx--  3 nobody nogroup 4096 jun 24 16:37 alertmanager.ceph-admin
>> drwx--  3167 167 4096 jun 24 16:36 crash
>> drwx--  2167 167 4096 jul 10 01:35 crash.ceph-admin
>> drwx--  4998 996 4096 jun 24 16:38 grafana.ceph-admin
>> drwx--  2167 167 4096 jul 10 02:55
>> mds.label:mds.ceph-admin.rwmtkr
>> drwx--  2167 167 4096 jul 10 01:33 mgr.ceph-admin.doljkl
>> drwx--  3167 167 4096 jul 10 01:34 mon.ceph-admin
>> drwx--  2 nobody nogroup 4096 jun 24 16:38 node-exporter.ceph-admin
>> drwx--  4 nobody nogroup 4096 jun 24 16:38 prometheus.ceph-admin
>> drwx--  4 root   root4096 jul  3 02:43 removed
>>
>>
>> root@ceph-admin:/var/lib/ceph/0ce93550-b628-11ea-9484-f6dc192416ca/mds.label:mds.ceph-admin.rwmtkr#
>> ls -la
>> total 32
>> drwx--  2  167  167 4096 jul 10 02:55 .
>> drwx-- 12  167  167 4096 jul 10 02:54 ..
>> -rw---  1  167  167  295 jul 10 02:55 config
>> -rw---  1  167  167  152 jul 10 02:55 keyring
>> -rw---  1  167  167   38 jul 10 02:55 unit.configured
>> -rw---  1  167  167   48 jul 10 02:54 unit.created
>> -rw---  1 root root   24 jul 10 02:55 unit.image
>> -rw---  1 root root0 jul 10 02:55 unit.poststop
>> -rw---  1 root root  981 jul 10 02:55 unit.run
>>
>> root@ceph-admin:/var/lib/ceph/0ce93550-b628-11ea-9484-f6dc192416ca/mds.label:mds.ceph-admin.rwmtkr#
>> cat unit.run
>> /usr/bin/install -d -m0770 -o 167 -g 167
>> /var/run/ceph/0ce93550-b628-11ea-9484-f6dc192416ca
>> /usr/bin/docker run --rm --net=host --ipc=host --name
>> ceph-0ce93550-b628-11ea-9484-f6dc192416ca-mds.label:mds.ceph-admin.rwmtkr
>> -e CONTAINER_IMAGE=docker.io/ceph/ceph:v15 -e NODE_NAME=ceph-admin -v
>> /var/ru
>> n/ceph/0ce93550-b628-11ea-9484-f6dc192416ca:/var/run/ceph:z -v
>> /var/log/ceph/0ce93550-b628-11ea-9484-f6dc192416ca:/var/log/ceph:z -v
>> /var/lib/ceph/0ce93550-b628-11ea-9484-f6dc192416ca/crash:/var/lib/ceph/c
>> rash:z -v
>> /var/lib/ceph/0ce93550-b628-11ea-9484-f6dc192416ca/mds.label:mds.ceph-admin.rwmtkr:/var/lib/ceph/mds/ceph-label:mds.ceph-admin.rwmtkr:z
>> -v /var/lib/ceph/0ce93550-b628-11ea-9484-f6dc192416ca/mds.l
>> abel:mds.ceph-admin.rwmtkr/config:/etc/ceph/ceph.conf:z --entrypoint
>> /usr/bin/ceph-mds docker.io/ceph/ceph:v15 -n
>> mds.label:mds.ceph-admin.rwmtkr -f --setuser ceph --setgroup ceph
>> --default-log-to-file=fal
>> se --default-log-to-stderr=true --default-log-stderr-prefix="debug "
>>
>> If I try to manually run the docker command, this is the error:
>>
>> docker: Error response from daemon: Invalid container name
>> (ceph-0ce93550-b628-11ea-9484-f6dc192416ca-mds.label:mds.ceph-admin.rwmtkr),
>> only [a-zA-Z0-9][a-zA-Z0-9_.-] are allowed.
>>
>> If I try with a different container name, then the volume binding error
>> rises:
>>
>> docker: Error response from daemon: invalid volume specification:
>> '/var/lib/ceph/0ce93550-b628-11ea-9484-f6dc192416ca/mds.label:mds.ceph-admin.rwmtkr:/var/lib/ceph/mds/ceph-label:mds.ceph-admin.rwmtkr:z'.
>>
>> This mds is not needed and I would be happy simply removing it, but I
>> don't know how to do it. The documentation says how to do it for "normal"
>> services, but my installation is a container deployment. I have tried to
>> remove the directory and restart the upgrading process but then the
>> directory with this service appears again.
>>
>> Please, how can I remove or rename this service so I can complete the
>> upgrading?
>>
>> Also, I think it's a bug to allow docker-forbidden characters in the
>> service names when using container deployment and it should be checked.
>>
>> Thank you very much.
>>
>> --
>> *Mario J. Barchéin Molina*
>> *Departamento de I+D+i*
>> ma...@intelligenia.com
>> Madrid: +34 911 86 35 46
>> US: +1 (918) 856 - 3838
>> Granada: +34 958 07 70 70
>> ――
>> intelligenia · Intelligent Engineering · Web 

[ceph-users] Re: Ceph SSH orchestrator?

2020-07-03 Thread Sebastian Wagner


Am 02.07.20 um 19:57 schrieb Oliver Freyermuth:
> Dear Cephalopodians,
> 
> as we all know, ceph-deploy is on its demise since a while and essentially in 
> "maintenance mode". 
> 
> We've been eyeing the "ssh orchestrator" which was in Nautilus as the 
> "successor in spirit" of ceph-deploy. 
> While we have not tried it out just yet, I find this module seems to be gone 
> without a trace in Octopus. 
> There's still an Orchestrator module, but this seems to work "only" with 
> containers. 
> 
> Is this true, or is there still an SSH orchestrator capable of bare-metal 
> operation in Octopus (or are there plans to have something like this)? 
> 
> While I see many advantages of containers in many areas, and certainly also 
> for smaller setups or test setups with Ceph,
> as any technology, they come with their own problems. 
> Example issues (which all can be solved, but require extra work from the 
> administrator) are:
> - Operation on machines without connectivity to the internet (you'd need to 
> mirror the containers or run your own registry),
> - Ensuring automated security updates both outside the containers and inside 
> the containers, or re-pull them regularly (and monitor that),
> - Integrate with existing logging and configuration management systems,
> - Potential hardware issues, such das InfiniBand RDMA. 
> 
> There's surely more (and there are also as many benefits), and as I said, all 
> can be solved; the point I want to make is:
> Containers are not the best solution in all environments and also not for all 
> admins. 
> 
> So my question is: Is there something like the SSH orchestrator still 
> available? 
> I guess essentially the cephadm orchestrator does something similar behind 
> the screnes, with the added bells and whistles to manage the containers. 
> Of course, a reduced feature-set would be expected (e.g. no "ceph orch 
> upgrade"), but it would jump into the hole ceph-deploy has left.

We renamed the SSH orchestrator to cephadm. See

https://github.com/ceph/ceph/pull/32193 for the
corresponding pull request.

Hope that helps,

Sebastian


> 
> Maybe this is as easy as setting a configuration knob? Or is it also possible 
> to switch to a "bare-metal edition" of cephadm (which might rely on users
> or existing configuration management to install the packages, e.g.)? 
> 
> Cheers,
>   Oliver
> 
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to ceph-volume on remote hosts?

2020-06-24 Thread Sebastian Wagner


Am 24.06.20 um 05:15 schrieb steven prothero:
> Hello,
> 
> I am new to CEPH and on a few test servers attempting to setup and
> learn a test ceph system.
> 
> I started off the install with the "Cephadm" option and it uses podman
> containers.
> Followed steps here:
> https://docs.ceph.com/docs/master/cephadm/install/
> 
> I ran the bootstrap, added remote hosts, added monitors and everything
> is looking good.
> 
> Now I would like to add OSDs...
> 
> On the bootstrapped server i did a :
> 
> ceph-volume lvm prepare   --data /dev/sda6
>and then the "activate" and "ceph orch daemon add osd (etc)" to add
> it and it works...
> 
> But now I am ready to add OSDs on the remote nodes. I am not able to
> find documentation or examples on how to do :
> 
>   ceph-volume lvm prepare & activate steps on the remote hosts.
> 
> How do we prepare & activate the remote hosts disks?

ceph orch apply osd

as described in
https://docs.ceph.com/docs/master/cephadm/install/#deploy-osds

should do the trick. In case it doesn't, what's the output of

ceph orch device ls

?
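
For example, to simply turn every unused, available device into an OSD
(a sketch):

    ceph orch apply osd --all-available-devices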

> 
> Thank you very much for your input,
> 
> Cheers
> Steve
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph orch upgrade stuck at the beginning.

2020-06-04 Thread Sebastian Wagner
d, 87 TiB / 87 
> TiB avail
> 2020-05-21T18:54:11.686+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v53: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail
> 2020-05-21T18:54:13.690+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v54: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail
> 2020-05-21T18:54:15.691+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v55: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail
> 2020-05-21T18:54:17.691+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v56: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail
> 2020-05-21T18:54:18.631+ 7faee6b7d700  0 log_channel(cephadm) log [INF] : 
> Upgrade: It is NOT safe to stop mon.vx-rg23-rk65-u43-130
> 2020-05-21T18:54:19.691+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v57: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail
> 2020-05-21T18:54:21.691+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v58: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail
> 2020-05-21T18:54:23.691+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v59: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail
> 2020-05-21T18:54:25.695+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v60: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail; 170 B/s wr, 0 op/s
> 2020-05-21T18:54:27.695+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v61: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail; 170 B/s wr, 0 op/s
> 2020-05-21T18:54:29.695+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v62: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail; 170 B/s wr, 0 op/s
> 2020-05-21T18:54:31.695+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v63: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail; 170 B/s wr, 0 op/s
> 2020-05-21T18:54:33.647+ 7faee6b7d700  0 log_channel(cephadm) log [INF] : 
> Upgrade: It is NOT safe to stop mon.vx-rg23-rk65-u43-130
> 2020-05-21T18:54:33.695+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v64: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail; 170 B/s wr, 0 op/s
> 2020-05-21T18:54:35.699+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v65: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail; 170 B/s wr, 0 op/s
> 2020-05-21T18:54:37.699+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v66: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail
> 2020-05-21T18:54:39.699+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v67: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail
> 2020-05-21T18:54:41.699+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v68: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail
> 2020-05-21T18:54:43.703+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v69: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail
> 2020-05-21T18:54:45.703+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v70: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail
> 2020-05-21T18:54:47.703+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v71: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail
> 2020-05-21T18:54:48.663+ 7faee6b7d700  0 log_channel(cephadm) log [INF] : 
> Upgrade: It is NOT safe to stop mon.vx-rg23-rk65-u43-130
> 2020-05-21T18:54:49.703+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v72: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail
> 2020-05-21T18:54:51.707+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v73: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail
> 2020-05-21T18:54:53.707+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v74: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail
> 2020-05-21T18:54:55.707+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v75: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail
> 2020-05-21T18:54:57.707+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgmap v76: 97 pgs: 97 active+clean; 4.8 GiB data, 12 GiB used, 87 TiB / 87 
> TiB avail
> 2020-05-21T18:54:59.711+ 7faed5caa700  0 log_channel(cluster) log [DBG] : 
> pgma

[ceph-users] Re: Cephadm Setup Query

2020-06-04 Thread Sebastian Wagner


Am 26.05.20 um 08:16 schrieb Shivanshi .:
> Hi,
> 
> I am facing an issue on Cephadm cluster setup. Whenever, I try to add
> remote devices as OSDs, command just hangs.
> 
> The steps I have followed :
> 
> sudo ceph orch daemon add osd node1:device
> 
>  
> 
>  1. For the setup, I have followed the steps mentioned in:
> 
> https://ralph.blog.imixs.com/2020/04/14/ceph-octopus-running-on-debian-buster/
> 
>  
> 
>  2. To make sure it is not facing SSH errors and the host is reachable, I
> have tried the following commands:
> cephadm shell -- ceph config-key get mgr/cephadm/ssh_identity_key > key
> cephadm shell -- ceph cephadm get-ssh-config > config
> ssh -F config -i key root@hostname
> 
>   I am able to connect to the host as root.
> 
>  
> 
>  3. Then I tried collecting the log information
>      1. Command: sudo cephadm logs --fsid
> e236062e-96ad-11ea-bedb-5254002e4127 --name osd
> Result :
> Traceback (most recent call last):
> File "/usr/sbin/cephadm", line 4282, in 
> r = args.func()
> File "/usr/sbin/cephadm", line 921, in _infer_fsid
> return func()
> File "/usr/sbin/cephadm", line 2689, in command_logs
> (daemon_type, daemon_id) = args.name.split('.', 1)
> ValueError: not enough values to unpack (expected 2, got 1)

cephadm logs expects a name as returned by `cephadm ls | jq '.[].name'`
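
For example (a sketch; "osd.3" below is a made-up daemon id, use one that
`cephadm ls` actually reports on that host):

    cephadm ls | jq '.[].name'
    cephadm logs --fsid e236062e-96ad-11ea-bedb-5254002e4127 --name osd.3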


>      2. Command: sudo ceph log last cephadm
>  
> 
> Result :
> 
>  
> 
> INFO:cephadm:Verifying port 9100 ...
> 
>  WARNING:cephadm:Cannot bind to IP 0.0.0.0 port 9100: [Errno 98] Address
> already in use
> 
>  ERROR: TCP Port(s) '9100' required for node-exporter is already in use
> 
>  Traceback (most recent call last):
> 
> File "/usr/share/ceph/mgr/cephadm/module.py", line 1638, in _run_cephadm
> 
> code, '\n'.join(err)))
> 
>  RuntimeError: cephadm exited with an error code: 1,
> stderr:INFO:cephadm:Deploying daemon node-exporter.ceph-mon ...
> 
>  INFO:cephadm:Verifying port 9100 ...
> 
>  WARNING:cephadm:Cannot bind to IP 0.0.0.0 port 9100: [Errno 98] Address
> already in use
> 
>  ERROR: TCP Port(s) '9100' required for node-exporter is already in use

Looks like a node-exporter is already running on this host. I don't know
where this comes from. Was a node-exporter installed previously?
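
A quick way to see what is already holding port 9100 on that host (a sketch):

    ss -tlnp | grep ':9100'
    systemctl list-units --all | grep -i exporter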

> 
>  2020-05-15T13:33:46.966159+ mgr.ceph-mgr.dixgvy (mgr.14161) 678 :
> cephadm [WRN] Failed to apply node-exporter spec ServiceSpec(
> 
> {'placement': PlacementSpec(host_pattern='*'), 'service_type':
> 'node-exporter', 'service_id': None, 'unmanaged': False}
> 
> ): cephadm exited with an error code: 1, stderr:INFO:cephadm:Deploying
> daemon node-exporter.ceph-mon ...
> 
>  INFO:cephadm:Verifying port 9100 ...
> 
>  WARNING:cephadm:Cannot bind to IP 0.0.0.0 port 9100: [Errno 98] Address
> already in use
> 
>  ERROR: TCP Port(s) '9100' required for node-exporter is already in use
> 
>  
> 
>  
> 
> But I am not able to infer from these log information. Can you please
> help me with the issue.
> 
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Octopus 15.2.2 unable to make drives available (reject reason locked)...

2020-06-04 Thread Sebastian Wagner
Hi Marco,

Note that encrypted OSDs will land in the next Octopus release.

Regarding the locked state, you could run ceph-volume directly on the
host to understand the issue better; ceph-volume should give you the reject reasons.
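
A sketch of how that could look (the device below is just one of yours as an
example; the cephadm wrapper runs ceph-volume inside the container):

    cephadm ceph-volume inventory
    # or, if the ceph-osd package is installed natively on the host:
    ceph-volume inventory
    # only if you really intend to wipe the device:
    cephadm ceph-volume lvm zap --destroy /dev/nvme0n1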

Am 29.05.20 um 03:18 schrieb Marco Pizzolo:
> Rebooting addressed
> 
> On Thu, May 28, 2020 at 4:52 PM Marco Pizzolo 
> wrote:
> 
>> Hello,
>>
>> Hitting an issue with a new 15.2.2 deployment using cephadm.  I am having
>> a problem creating encrypted, 2 osds per device OSDs (they are NVMe).
>>
>> After removing and bootstrapping the cluster again, i am unable to create
>> OSDs as they're locked.  sgdisk, wipefs, zap all fail to leave the drives
>> as available.
>>
>> Any help would be appreciated.
>> Any comments on performance experiences with ceph in containers (cephadm
>> deployed) vs bare metal (ceph-deploy) would be greatly appreciated as well.
>>
>> Thanks,
>> Marco
>>
>> ceph orch device ls
>> HOST PATH  TYPE   SIZE  DEVICE
>>   AVAIL  REJECT REASONS
>> prdhcistonode01  /dev/nvme0n1  ssd   11.6T
>>  Micron_9300_MTFDHAL12T8TDR_2006266528D1  False  *locked*
>> prdhcistonode01  /dev/nvme1n1  ssd   11.6T
>>  Micron_9300_MTFDHAL12T8TDR_2006266534D9  False  *locked*
>> prdhcistonode01  /dev/nvme2n1  ssd953G  INTEL
>> SSDPEKKF010T8_BTHH850215GA1P0E False  *locked*
>> prdhcistonode01  /dev/nvme3n1  ssd   11.6T
>>  Micron_9300_MTFDHAL12T8TDR_200626651473  False  *locked*
>> prdhcistonode01  /dev/nvme4n1  ssd   11.6T
>>  Micron_9300_MTFDHAL12T8TDR_2006266508FB  False * locked*
>> prdhcistonode01  /dev/nvme5n1  ssd   11.6T
>>  Micron_9300_MTFDHAL12T8TDR_20062664E6E8  False  *locked*
>> prdhcistonode01  /dev/nvme6n1  ssd   11.6T
>>  Micron_9300_MTFDHAL12T8TDR_200626653CC0  False * locked*
>> prdhcistonode01  /dev/nvme7n1  ssd   11.6T
>>  Micron_9300_MTFDHAL12T8TDR_1939243B797E  False * locked*
>> prdhcistonode01  /dev/nvme8n1  ssd   11.6T
>>  Micron_9300_MTFDHAL12T8TDR_200626652441  False  *locked*
>>
>>
>> lsblk
>>
>> NAME
>>MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
>> nvme2n1
>> 259:00 953.9G  0 disk
>> ├─nvme2n1p1
>> 259:10   512M  0 part /boot/efi
>> └─nvme2n1p2
>> 259:20 953.4G  0 part /
>> nvme3n1
>> 259:30  11.7T  0 disk
>> └─ceph--5bd47cae--97b3--4cad--b010--215fd982497b-osd--data--e6045acd--a56d--41d2--a016--b8647b9a717a
>>  253:10  11.7T  0 lvm
>> nvme4n1
>> 259:40  11.7T  0 disk
>> └─ceph--bf7dbfb4--afe3--4391--9847--08e461bf6247-osd--data--12faafac--b695--4c30--b6d7--7046d8275d9f
>>  253:00  11.7T  0 lvm
>> nvme0n1
>> 259:50  11.7T  0 disk
>> └─ceph--1a5d8e23--ff7d--44c3--b6d2--de143fed2b7d-osd--block--b6593547--e99a--4add--8edd--5d0fb53254cd
>> 253:20  11.7T  0 lvm
>> nvme5n1
>> 259:60  11.7T  0 disk
>> └─ceph--7d85ff24--79c8--4792--a2c8--bb4908f77ff0-osd--data--fc4e9dbd--920f--41b8--8467--74e9dcbd57ca
>>  253:30  11.7T  0 lvm
>> nvme6n1
>> 259:70  11.7T  0 disk
>> └─ceph--d8c8652a--1cd8--4e10--a333--4ea10f3b5004-osd--data--9a70a549--3cba--4f0d--a13a--8465781a10e9
>>  253:50  11.7T  0 lvm
>> nvme8n1
>> 259:80  11.7T  0 disk
>> └─ceph--e1914f1c--2385--4c0c--9951--d4b9200b7164-osd--data--8876559c--6393--4fbc--821b--7ac74cfb5a54
>>  253:70  11.7T  0 lvm
>> nvme7n1
>> 259:90  11.7T  0 disk
>> └─ceph--3765b53a--75eb--489e--97e1--d6b03bc25532-osd--data--777638e0--a325--401d--a01d--459676871003
>>  253:40  11.7T  0 lvm
>> nvme1n1
>> 259:10   0  11.7T  0 disk
>> └─ceph--2124f206--2b50--41a1--8a3c--d47c1a909a3b-osd--block--88e4f1eb--73f4--4c83--b978--fe7cabc0c3e6
>> 253:60  11.7T  0 lvm
>>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cephadm Hangs During OSD Apply

2020-06-04 Thread Sebastian Wagner
Encrypted OSDs should land in the next Octopus release:

https://tracker.ceph.com/issues/44625

Am 27.05.20 um 20:31 schrieb m...@silvenga.com:
> I noticed the luks volumes were open, even though luksOpen hung. I killed 
> cryptsetup (once per disk) and ceph-volume continued and eventually created 
> the osd's for the host (yes, this node will be slated for another reinstall 
> when cephadm is stabilized).
> 
> Is there a way to remove an osd service spec with the current tooling? The 
> drives are immediately locked when the node is added to orch.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph orch upgrade stuck at the beginning.

2020-05-25 Thread Sebastian Wagner


Am 22.05.20 um 19:28 schrieb Gencer W. Genç:
> Upgrade: It is NOT safe to stop mon.vx-rg23-rk65-u43-130

Please make sure that

ceph mon ok-to-stop mon.vx-rg23-rk65-u43-130

returns OK.
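
If it does not, a quick way to see why stopping that mon would be unsafe
(a sketch; with only two mons, as in your `ceph versions` output, taking one
down would lose quorum):

    ceph quorum_status -f json-pretty
    ceph -s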

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph orch upgrade stuck at the beginning.

2020-05-20 Thread Sebastian Wagner
Hi Gencer,

I'm going to need the full mgr log file.
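
For reference, one way to grab it from a cephadm deployment (a sketch; take the
real mgr daemon name from the first command, the one below is made up):

    ceph orch ps | grep mgr
    cephadm logs --name mgr.node1.abcdef > mgr.log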

Best,
Sebastian

Am 20.05.20 um 15:07 schrieb Gencer W. Genç:
> Ah yes,
> 
> {
>     "mon": {
>         "ceph version 15.2.1 (9fd2f65f91d9246fae2c841a6222d34d121680ee) 
> octopus (stable)": 2
>     },
>     "mgr": {
>         "ceph version 15.2.2 (0c857e985a29d90501a285f242ea9c008df49eb8) 
> octopus (stable)": 2
>     },
>     "osd": {
>         "ceph version 15.2.1 (9fd2f65f91d9246fae2c841a6222d34d121680ee) 
> octopus (stable)": 24
>     },
>     "mds": {
>         "ceph version 15.2.1 (9fd2f65f91d9246fae2c841a6222d34d121680ee) 
> octopus (stable)": 2
>     },
>     "overall": {
>         "ceph version 15.2.1 (9fd2f65f91d9246fae2c841a6222d34d121680ee) 
> octopus (stable)": 28,
>         "ceph version 15.2.2 (0c857e985a29d90501a285f242ea9c008df49eb8) 
> octopus (stable)": 2
>     }
> }
> 
> How can i fix this?
> 
> Gencer.
> On 20.05.2020 16:04:33, Ashley Merrick  wrote:
> Does:
> 
> 
> ceph versions
> 
> 
> show any services yet running on 15.2.2?
> 
> 
> 
>  On Wed, 20 May 2020 21:01:12 +0800 Gencer W. Genç  
> wrote 
> 
> 
> Hi Ashley,
> $ ceph orch upgrade status
> 
> 
> {
> 
>     "target_image": "docker.io/ceph/ceph:v15.2.2",
> 
>     "in_progress": true,
> 
>     "services_complete": [],
> 
>     "message": ""
> 
> }
> 
> 
> Thanks,
> 
> Gencer.
> 
> 
> On 20.05.2020 15:58:34, Ashley Merrick <singap...@amerrick.co.uk> wrote:
> 
> What does
> 
> ceph orch upgrade status
> 
> show?
> 
> 
> 
>  On Wed, 20 May 2020 20:52:39 +0800, Gencer W. Genç <gen...@gencgiyen.com> wrote:
> 
> 
> Hi,
> 
> I've 15.2.1 installed on all machines. On primary machine I executed ceph 
> upgrade command:
> 
> $ ceph orch upgrade start --ceph-version 15.2.2
> 
> 
> When I check ceph -s I see this:
> 
>   progress:
>     Upgrade to docker.io/ceph/ceph:v15.2.2 (30m)
>       [=...] (remaining: 8h)
> 
> It says 8 hours. It is already ran for 3 hours. No upgrade processed. It get 
> stuck at this point.
> 
> Is there any way to know why this has stuck?
> 
> Thanks,
> Gencer.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 
> 
> 
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cephadm and rados gateways

2020-05-18 Thread Sebastian Wagner
This will be fixed in 15.2.2 

https://tracker.ceph.com/issues/45215
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to apply ceph.conf changes using new tool cephadm

2020-05-05 Thread Sebastian Wagner
ceph@elchaka.de wrote:
> I am not absolutely sure, but you should be able to do something like
> 
>  ceph config mon set

Yes, please use `ceph config ...`. cephadm only uses a minimal ceph.conf, which
only contains the IPs of the other MONs.
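
For example (a sketch; the option names are only illustrations):

    ceph config set osd osd_max_backfills 2
    ceph config set global osd_pool_default_size 3
    ceph config dump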
> 
> Or try to restart the mon/osd daemon
> 
> Hth
> 
> Am 29. April 2020 16:42:31 MESZ schrieb "Gencer W. Genç"
>  > Hi,
> > 
> > I just deployed a new cluster with cephadm instead of ceph-deploy. In
> > tyhe past, If i change ceph.conf for tweaking, i was able to copy them
> > and apply to all servers. But i cannot find this on new cephadm tool.
> > 
> > I did few changes on ceph.conf but ceph is unaware of those changes.
> > How can i apply them? I've used it with docker.
> > 
> > Thanks,
> > Gencer.
> > ___
> > ceph-users mailing list -- ceph-users(a)ceph.io
> > To unsubscribe send an email to ceph-users-leave(a)ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to debug ssh: ceph orch host add ceph01 10.10.1.1

2020-04-29 Thread Sebastian Wagner
We've improved the docs a little bit. 

Does https://docs.ceph.com/docs/master/cephadm/troubleshooting/#ssh-errors help 
you now?
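
In short, the manual check from that section looks like this (a sketch; use the
hostname you passed to `ceph orch host add`):

    ceph cephadm get-ssh-config > ssh_config
    ceph config-key get mgr/cephadm/ssh_identity_key > key
    chmod 0600 key
    ssh -F ssh_config -i key root@ceph01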
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: PGs unknown (osd down) after conversion to cephadm

2020-04-16 Thread Sebastian Wagner
Hi Marco,

# ceph orch upgrade start --ceph-version 15.2.1

should do the trick.
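
To watch the progress afterwards (a sketch):

    ceph orch upgrade status
    ceph -W cephadm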



Am 15.04.20 um 17:40 schrieb Dr. Marco Savoca:
> Hi Sebastian,
> 
>  
> 
> as I said, the orchestrator does not seem to be reachable after
> cluster’s reboot. The requested output could only be gathered after
> manual restart of the osd containers. By the way, if I try to upgrade to
> v15.2.1 via cephadm (ceph orch upgrade start --version 15.2.1), I only
> get the output “ceph version 15.2.0
> (dc6a0b5c3cbf6a5e1d6d4f20b5ad466d76b96247) octopus (rc)” and the upgrade
> does not start:
> 
> sudo ceph orch upgrade status
> 
> {
> 
>     "target_image": null,
> 
>     "in_progress": false,
> 
>     "services_complete": [],
> 
>     "message": ""
> 
> }
> 
>  
> 
> Maybe it’s time to open a ticket.
> 
>  
> 
> Here the requested outputs.
> 
>  
> 
> sudo ceph orch host ls --format json
> 
>  
> 
> [{"addr": "ceph1.domainname.de", "hostname": "ceph1.domainname.de",
> "labels": [], "status": ""}, {"addr": "ceph2.domainname.de", "hostname":
> "ceph2.domainname.de", "labels": [], "status": ""}, {"addr":
> "ceph3.domainname.de", "hostname": "ceph3.domainname.de", "labels": [],
> "status": ""}]
> 
>  
> 
> sudo ceph orch ls --format json
> 
>  
> 
> [{"container_image_id":
> "204a01f9b0b6710dd0c0af7f37ce7139c47ff0f0105d778d7104c69282dfbbf1",
> "container_image_name": "docker.io/ceph/ceph:v15", "service_name":
> "mds.media", "size": 3, "running": 3, "spec": {"placement": {"count":
> 3}, "service_type": "mds", "service_id": "media"}, "last_refresh":
> "2020-04-15T15:26:53.664473", "created": "2020-03-30T23:51:32.239555"},
> {"container_image_id":
> "204a01f9b0b6710dd0c0af7f37ce7139c47ff0f0105d778d7104c69282dfbbf1",
> "container_image_name": "docker.io/ceph/ceph:v15", "service_name":
> "mgr", "size": 0, "running": 3, "last_refresh":
> "2020-04-15T15:26:53.664098"}, {"container_image_id":
> "204a01f9b0b6710dd0c0af7f37ce7139c47ff0f0105d778d7104c69282dfbbf1",
> "container_image_name": "docker.io/ceph/ceph:v15", "service_name":
> "mon", "size": 0, "running": 3, "last_refresh":
> "2020-04-15T15:26:53.664270"}]
> 
>  
> 
> Thanks,
> 
>  
> 
> Marco
> 
>  
> 
>  
> 
> *Von: *Sebastian Wagner <mailto:swag...@suse.com>
> *Gesendet: *Dienstag, 14. April 2020 16:53
> *An: *ceph-users@ceph.io <mailto:ceph-users@ceph.io>
> *Betreff: *[ceph-users] Re: PGs unknown (osd down) after conversion to
> cephadm
> 
>  
> 
> Might be an issue with cephadm.
> 
>  
> 
> Do you have the output of `ceph orch host ls --format json` and `ceph
> 
> orch ls --format json`?
> 
>  
> 
> Am 09.04.20 um 13:23 schrieb Dr. Marco Savoca:
> 
>> Hi all,
> 
>>
> 
>>  
> 
>>
> 
>> last week I successfully upgraded my cluster to Octopus and converted it
> 
>> to cephadm. The conversion process (according to the docs) went well and
> 
>> the cluster ran in an active+clean status.
> 
>>
> 
>>  
> 
>>
> 
>> But after a reboot all osd went down with a delay of a couple of minutes
> 
>> after reboot and all (100%) of the PGs ran into the unknown state. The
> 
>> orchestrator isn’t reacheable during this state (ceph orch status
> 
>> doesn’t come to an end).
> 
>>
> 
>>  
> 
>>
> 
>> A manual restart of the osd-daemons resolved the problem and the cluster
> 
>> is now active+clean again.
> 
>>
> 
>>  
> 
>>
> 
>> This behavior is reproducible.
> 
>>
> 
>>  
> 
>>
> 
>>  
> 
>>
> 
>> The “ceph log last cephadm” command spits out (redacted):
> 
>>
> 
>>  
> 
>>
> 
>>  
> 
>>
> 
>> 2020-03-30T23:07:06.881061+ mgr.ceph2 (mgr.1854484) 42 : cephadm
> 
>> [INF] Generating ssh key...
> 
>>
> 
>> 2020-03-30T23:22:00.250422+ mgr.ceph2 (mgr.1854484) 492 : cephadm
> 
>> [ERR] _Promise failed
> 
>

[ceph-users] Re: PGs unknown (osd down) after conversion to cephadm

2020-04-14 Thread Sebastian Wagner
Might be an issue with cephadm.

Do you have the output of `ceph orch host ls --format json` and `ceph
orch ls --format json`?

Am 09.04.20 um 13:23 schrieb Dr. Marco Savoca:
> Hi all,
> 
>  
> 
> last week I successfully upgraded my cluster to Octopus and converted it
> to cephadm. The conversion process (according to the docs) went well and
> the cluster ran in an active+clean status.
> 
>  
> 
> But after a reboot all osd went down with a delay of a couple of minutes
> after reboot and all (100%) of the PGs ran into the unknown state. The
> orchestrator isn’t reacheable during this state (ceph orch status
> doesn’t come to an end).
> 
>  
> 
> A manual restart of the osd-daemons resolved the problem and the cluster
> is now active+clean again.
> 
>  
> 
> This behavior is reproducible.
> 
>  
> 
>  
> 
> The “ceph log last cephadm” command spits out (redacted):
> 
>  
> 
>  
> 
> 2020-03-30T23:07:06.881061+ mgr.ceph2 (mgr.1854484) 42 : cephadm
> [INF] Generating ssh key...
> 
> 2020-03-30T23:22:00.250422+ mgr.ceph2 (mgr.1854484) 492 : cephadm
> [ERR] _Promise failed
> 
> Traceback (most recent call last):
> 
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 444, in do_work
> 
>     res = self._on_complete_(*args, **kwargs)
> 
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 512, in 
> 
>     return cls(on_complete=lambda x: f(*x), value=args, name=name,
> **c_kwargs)
> 
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 1648, in add_host
> 
>     spec.hostname, spec.addr, err))
> 
> orchestrator._interface.OrchestratorError: New host ceph1 (ceph1) failed
> check: ['INFO:cephadm:podman|docker (/usr/bin/docker) is present',
> 'INFO:cephadm:systemctl is present', 'INFO:cephadm:lvcreate is present',
> 'INFO:cephadm:Unit systemd-timesyncd.service is enabled and running',
> 'ERROR: hostname "ceph1.domain.de" does not match expected hostname
> "ceph1"']
> 
> 2020-03-30T23:22:27.267344+ mgr.ceph2 (mgr.1854484) 508 : cephadm
> [INF] Added host ceph1.domain.de
> 
> 2020-03-30T23:22:36.078462+ mgr.ceph2 (mgr.1854484) 515 : cephadm
> [INF] Added host ceph2.domain.de
> 
> 2020-03-30T23:22:55.200280+ mgr.ceph2 (mgr.1854484) 527 : cephadm
> [INF] Added host ceph3.domain.de
> 
> 2020-03-30T23:23:17.491596+ mgr.ceph2 (mgr.1854484) 540 : cephadm
> [ERR] _Promise failed
> 
> Traceback (most recent call last):
> 
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 444, in do_work
> 
>     res = self._on_complete_(*args, **kwargs)
> 
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 512, in 
> 
>     return cls(on_complete=lambda x: f(*x), value=args, name=name,
> **c_kwargs)
> 
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 1648, in add_host
> 
>     spec.hostname, spec.addr, err))
> 
> orchestrator._interface.OrchestratorError: New host ceph1 (10.10.0.10)
> failed check: ['INFO:cephadm:podman|docker (/usr/bin/docker) is
> present', 'INFO:cephadm:systemctl is present', 'INFO:cephadm:lvcreate is
> present', 'INFO:cephadm:Unit systemd-timesyncd.service is enabled and
> running', 'ERROR: hostname "ceph1.domain.de" does not match expected
> hostname "ceph1"']
> 
>  
> 
> Could this be a problem with the ssh key?
> 
>  
> 
> Thanks for the help and happy eastern.
> 
>  
> 
> Marco Savoca
> 
>  
> 
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: osd with specifiying directories

2020-04-06 Thread Sebastian Wagner
Hi Micha,

cephadm does not (yet) support Filestore.

See https://tracker.ceph.com/issues/44874 for details.
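
For whole devices, which is what cephadm and ceph-volume expect, the usual
steps are (a sketch; replace host and device with real values):

    ceph orch device ls
    ceph orch daemon add osd <host>:<device>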

Best,

Sebastian

Am 03.04.20 um 10:11 schrieb Micha:
> Hi,
> 
> I want to try using object storage with java.
> Is it possible to set up osds with "only" directories as data destination
> (using cephadmin) , instead of whole disks? I have read through much of the
> docu but didn't found how to do it (if it's possible).
> 
> Thanks
>  Michael
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io