[ceph-users] Re: Cluster down

2021-10-13 Thread Alex Gorbachev
Hi Jorge,

This looks like a corosync problem to me.  If corosync loses connectivity,
the Proxmox nodes would fence and reboot.  Ideally, you'd have a second
ring on different switch(es), even a cheap 1Gb switch will do.
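For reference, a second link in /etc/pve/corosync.conf (corosync 3 with kronosnet, as used by recent Proxmox) looks roughly like the sketch below. The node names and addresses here are made up, and when editing the file you also need to bump config_version:

totem {
  # ... existing settings, including config_version on Proxmox ...
  interface {
    linknumber: 0
  }
  interface {
    linknumber: 1
  }
}

nodelist {
  node {
    name: pve1
    nodeid: 1
    quorum_votes: 1
    # existing corosync network
    ring0_addr: 10.0.0.1
    # second ring on the separate (cheap) switch
    ring1_addr: 192.168.1.1
  }
  # ... add a ring1_addr to every other node as well ...
}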

--
Alex Gorbachev
ISS - Storcium



On Wed, Oct 13, 2021 at 7:07 AM Jorge JP  wrote:

> Hello Marc,
>
> To add a node to the ceph cluster with Proxmox, first I have to install Proxmox
> hehe, so this is not the problem.
>
> The configuration file has been reviewed and is correct. I understand your words,
> but it is not a configuration problem.
>
> I can understand that the cluster can have problems if any server is not
> configured correctly or ports on the switches are not configured correctly. But
> this server never became a member of the cluster.
>
> I extracted a part of the logfile from when ceph went down.
>
> A few weeks ago, I had a problem with a port configuration that removed MTU
> 9216, and several hypervisors of the Proxmox cluster rebooted. But today the
> server is not related to the ceph cluster. It only has public and private IPs
> in the same networks, but the ports are not configured.
>
> 
> From: Marc 
> Sent: Wednesday, 13 October 2021 12:49
> To: Jorge JP ; ceph-users@ceph.io
> Subject: RE: Cluster down
>
> >
> > We currently have a ceph cluster in Proxmox, with 5 ceph nodes with the
> > public and private network correctly configured and without problems.
> > The state of ceph was optimal.
> >
> > We had prepared a new server to add to the ceph cluster. We did the
> > first step of installing Proxmox with the same version. I was at the
> > point where I was setting up the network.
>
> I am not using proxmox, just libvirt. But I would say the most important
> part is your ceph cluster. So before doing anything I would make sure to
> add the ceph node first and then install other things.
>
> For this step, what I did was connect via SSH to the new server and copy the
> > network configuration of one of the ceph nodes to this new one. Of
> > course, changing the ip addresses.
>
> I would not copy at all. Just change the files manually. If you did not
> edit one file correctly, or the server reboots before you change the ip
> addresses, you can get into all kinds of problems.
>
> > What happened when restarting the network service is that I lost access
> > to the cluster. I couldn't access any of the 5 servers that are part of
> > the  ceph cluster. Also, 2 of 3 hypervisors
> > that we have in the proxmox cluster were restarted directly.
>
> So now you know: you first have to configure networking, then ceph and
> then proxmox. Take your time adding a server. I guess the main reason you
> are in the current situation is that you tried to do it too quickly.
>
> > Why has this happened if the new server is not yet inside the ceph
> > cluster on the proxmox cluster and I don't even have the ports
> > configured on my switch?
>
> Without logs nobody is able to tell.
>
> > Do you have any idea?
> >
> > I do not understand, if now I go and take any server and configure an IP
> > of the cluster network and even if the ports are not even configured,
> > will the cluster knock me down?
>
> Nothing should happen if you install an OS and use ip addresses in the
> same space as your cluster/client network. Do this first.
>
> I recovered the cluster by physically removing the cables from the new
> > server.
>
> So wipe it, and start over.
>
> > Thanks a lot and sorry for my english...
>
> No worries, your english is much better than my spanish ;)
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Snap-schedule stopped working?

2021-10-13 Thread Kyriazis, George
Hello ceph-users,

I am running Proxmox 7 with ceph 16.2.6 with 46 OSDs.

I enabled snap_schedule about a month ago, and it seemed to be going fine, at 
least at the beginning.

I’ve noticed, however, that snapshots stopped happening, as shown below:

root@vis-mgmt:/ceph/backups/nassie/NAS/.snap# ls
scheduled-2021-09-12-23_00_00  scheduled-2021-09-24-23_00_00  
scheduled-2021-09-27-18_00_00
scheduled-2021-09-19-23_00_00  scheduled-2021-09-25-23_00_00  
scheduled-2021-09-27-19_00_00
scheduled-2021-09-20-23_00_00  scheduled-2021-09-26-23_00_00  
scheduled-2021-09-27-20_00_00
scheduled-2021-09-21-23_00_00  scheduled-2021-09-27-15_00_00  
scheduled-2021-09-27-21_00_00
scheduled-2021-09-22-23_00_00  scheduled-2021-09-27-16_00_00
scheduled-2021-09-23-23_00_00  scheduled-2021-09-27-17_00_00
root@vis-mgmt:/ceph/backups/nassie/NAS/.snap# 

Snap-schedule is below:

root@vis-mgmt:/ceph/backups/nassie/NAS/.snap# ceph fs snap-schedule list 
/backups/nassie/NAS
/backups/nassie/NAS 1h 
/backups/nassie/NAS 24h 
/backups/nassie/NAS 7d 
/backups/nassie/NAS 4w 
root@vis-mgmt:/ceph/backups/nassie/NAS/.snap# 

And snap-schedule status:

root@vis-mgmt:/ceph/backups/nassie/NAS/.snap# ceph fs snap-schedule status 
/backups/nassie/NAS
{"fs": "cephfs", "subvol": null, "path": "/backups/nassie/NAS", "rel_path": 
"/backups/nassie/NAS", "schedule": "1h", "retention": {}, "start": 
"2021-09-08T02:00:00", "created": "2021-09-09T22:57:14", "first": 
"2021-09-09T23:00:00", "last": "2021-09-09T23:00:00", "last_pruned": null, 
"created_count": 1, "pruned_count": 0, "active": true}
===
{"fs": "cephfs", "subvol": null, "path": "/backups/nassie/NAS", "rel_path": 
"/backups/nassie/NAS", "schedule": "24h", "retention": {}, "start": 
"2021-09-08T02:00:00", "created": "2021-09-09T23:01:17", "first": null, "last": 
null, "last_pruned": null, "created_count": 0, "pruned_count": 0, "active": 
true}
===
{"fs": "cephfs", "subvol": null, "path": "/backups/nassie/NAS", "rel_path": 
"/backups/nassie/NAS", "schedule": "7d", "retention": {}, "start": 
"2021-09-08T02:00:00", "created": "2021-09-09T23:01:26", "first": null, "last": 
null, "last_pruned": null, "created_count": 0, "pruned_count": 0, "active": 
true}
===
{"fs": "cephfs", "subvol": null, "path": "/backups/nassie/NAS", "rel_path": 
"/backups/nassie/NAS", "schedule": "4w", "retention": {}, "start": 
"2021-09-08T02:00:00", "created": "2021-09-09T23:01:36", "first": null, "last": 
null, "last_pruned": null, "created_count": 0, "pruned_count": 0, "active": 
true}
root@vis-mgmt:/ceph/backups/nassie/NAS/.snap#

So, the snap scheduler looks like it’s active, but no snapshots are being taken.

/var/log/ceph/ceph-mgr..*.log on the node running ceph-mgr only has 
the following lines regarding snapshot schedules:

2021-10-13T22:44:11.339-0500 7f3acac28700  0 [snap_schedule INFO mgr_util] 
scanning for idle connections..
2021-10-13T22:44:11.339-0500 7f3acac28700  0 [snap_schedule INFO mgr_util] 
cleaning up connections: []
2021-10-13T22:44:41.336-0500 7f3acac28700  0 [snap_schedule INFO mgr_util] 
scanning for idle connections..
2021-10-13T22:44:41.336-0500 7f3acac28700  0 [snap_schedule INFO mgr_util] 
cleaning up connections: []


Any reasons why snapshot schedule would stop working?  Why would they get stuck?
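For anyone suggesting things to check: these are roughly the commands I believe apply here, per the Pacific snap_schedule syntax; the retention values are only examples, since the retention field above is empty:

# re-arm the 1h schedule for the path
ceph fs snap-schedule deactivate /backups/nassie/NAS 1h
ceph fs snap-schedule activate /backups/nassie/NAS 1h

# add a retention policy so old scheduled snapshots get pruned
ceph fs snap-schedule retention add /backups/nassie/NAS 24h7d4w

# or bounce the whole mgr module
ceph mgr module disable snap_schedule
ceph mgr module enable snap_schedule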

Thank you!

George

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: OSD's fail to start after power loss

2021-10-13 Thread Orbiting Code, Inc.
I have an update on the topic "OSD's fail to start after power loss". We 
have fixed the issue. After our last "apt upgrade" procedure about 90 
days ago, the package python-pkg-resources was removed via "apt 
autoremove" after rebooting the OSD host. The command below shows that 
the module pkg_resources was missing when running ceph-volume manually.


root@osd3:/root/# ceph-volume lvm activate --all
Traceback (most recent call last):
  File "/usr/sbin/ceph-volume", line 6, in <module>
    from pkg_resources import load_entry_point
ImportError: No module named pkg_resources

After installing python-pkg-resources, the above command succeeded, and 
all 12 OSD's are now active in the cluster.


And Dominic, to answer your questions, I am running Ceph 14.2.2 
(Nautilus) on Ubuntu 18.04. I used ceph-deploy to install the cluster. 
The tmpfs directories /var/lib/ceph/osd/ceph-* were not mounted due to 
the missing pkg_resources module, which caused the keyrings to be 
unavailable.
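
For the record, the fix boiled down to roughly the following; the apt-mark step is just a precaution I am adding so a later "apt autoremove" does not take the dependency away again:

apt install python-pkg-resources
apt-mark manual python-pkg-resources
ceph-volume lvm activate --all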


Thanks Everyone,
Todd

On 10/13/21 4:21 PM, dhils...@performair.com wrote:

Todd;

What version of ceph are you running? Are you running containers or 
packages? Was the cluster installed manually, or using a deployment tool?


Logs provided are for osd ID 31, is ID 31 appropriate for that server? 
Have you verified that the ceph.conf on that server is intact, and 
correct?


Your log snippet references /var/lib/ceph/osd/ceph-31/keyring; does 
this file exist? Does the /var/lib/ceph/osd/ceph-31/ folder exist? If 
both exist, are the ownership and permissions correct / appropriate?


Thank you,

Dominic L. Hilsbos, MBA
Vice President - Information Technology
Perform Air International Inc.
dhils...@performair.com
www.PerformAir.com


-Original Message-
From: Orbiting Code, Inc. [mailto:supp...@orbitingcode.com]
Sent: Wednesday, October 13, 2021 7:21 AM
To: ceph-users@ceph.io
Subject: [ceph-users] OSD's fail to start after power loss

Hello Everyone,

I have 3 OSD hosts with 12 OSD's each. After a power failure on 1 host,
all 12 OSD's fail to start on that host. The other 2 hosts did not lose
power, and are functioning. Obviously I don't want to restart the
working hosts at this time. Syslog shows:

Oct 12 17:24:07 osd3 systemd[1]:
ceph-volume@lvm-31-cae13d9a-1d3d-4003-a57f-6ffac21a682e.service: Main
process exited, code
=exited, status=1/FAILURE
Oct 12 17:24:07 osd3 systemd[1]:
ceph-volume@lvm-31-cae13d9a-1d3d-4003-a57f-6ffac21a682e.service: Failed
with result 'exit-
code'.
Oct 12 17:24:07 osd3 systemd[1]: Failed to start Ceph Volume activation:
lvm-31-cae13d9a-1d3d-4003-a57f-6ffac21a682e.

This is repeated for all 12 OSD's on the failed host. Running the
following command, shows additional errors.

root@osd3:/var/log# /usr/bin/ceph-osd -f --cluster ceph --id 31
--setuser ceph --setgroup ceph
2021-10-12 17:50:23.117 7fce92e6ac00 -1 auth: unable to find a keyring
on /var/lib/ceph/osd/ceph-31/keyring: (2) No such file or directory
2021-10-12 17:50:23.117 7fce92e6ac00 -1 AuthRegistry(0x55c4ec50aa40) no
keyring found at /var/lib/ceph/osd/ceph-31/keyring, disabling cephx
2021-10-12 17:50:23.117 7fce92e6ac00 -1 auth: unable to find a keyring
on /var/lib/ceph/osd/ceph-31/keyring: (2) No such file or directory
2021-10-12 17:50:23.117 7fce92e6ac00 -1 AuthRegistry(0x7ffe9b64eb08) no
keyring found at /var/lib/ceph/osd/ceph-31/keyring, disabling cephx
failed to fetch mon config (--no-mon-config to skip)

No tmpfs mounts exist for any directories in /var/lib/ceph/osd/ceph-**

Any assistance helping with this situation would be greatly appreciated.

Thank you,
Todd
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW pubsub deprecation

2021-10-13 Thread Dave Piper
Hi Yuval,

We're using pubsub!

We opted for pubsub over bucket notifications as the pull mode fits well with 
our requirements.

1) We want to be able to guarantee that our client (the external server) has 
received and processed each event. My initial understanding of bucket 
notifications was that they weren't stored on ceph at all, and were simply 
broadcast and then forgotten. Actually I see that the docs state the 
notification will be retried until acked [3]. Is that guaranteed? Will ceph 
ultimately give up and drop an event? Is there a way of seeing how many events 
have been unacked / dropped?

2) Being able to pull a list of missed events back, rather than receiving them 
one at a time, allows our client to cut down on processing. As an example, if 
the same object is updated 10 times, pubsub catchup list will list 10 events 
for the same object, and the client can recognise this and only needs to 
process the object once and ack all 10 events.  The bucket notification model 
suggests we will have to process each event in turn. There are possibly ways we 
can work around this though (e.g. queue incoming bucket notifications on the 
client and process them in batches).

We've had a number of issues with pubsub and still aren't confident in its 
behaviour. Your post suggests it's not well used, which might imply it has less 
field hardening than bucket notifications. If so, it sounds like it might be 
better for us both if we switched to using the bucket notifications method 
instead. It'd be good to get your thoughts on how we could satisfy the two 
requirements above.

If pubsub is likely to be deprecated, we'll need to start moving fast. What's 
the latest thinking on this?

Cheers,

Dave

[3] 
https://docs.ceph.com/en/latest/radosgw/notifications/#notification-reliability

- 



-Original Message-
From: Yuval Lifshitz  
Sent: 05 November 2020 06:57
To: ceph-users 
Subject: [ceph-users] RGW pubsub deprecation


Dear Community,
Since Nautilus, we have 2 mechanisms for notifying 3rd parties on changes in 
buckets and objects: "bucket notifications" [1] and "pubsub" [2].

In "bucket notifications" (="push mode") the events are sent from the RGW to an 
external entity (kafka, rabbitmq etc.), while in "pubsub" (="pull
mode") the events are synched with a special zone, where they are stored and 
could be later fetched by an external app.

From communications that I've seen so far, users preferred to use "bucket 
notifications" over "pubsub". Since supporting both modes has maintenance 
overhead, I was considering deprecating "pubsub".
However, before doing that I would like to see what the community has to say!

So, if you are currently using pubsub, or plan to use it, as "pull mode"
fits your usecase better than "push mode" please chime in.

Yuval

[1] https://docs.ceph.com/en/latest/radosgw/notifications/
[2] https://docs.ceph.com/en/latest/radosgw/pubsub-module/
___
ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to 
ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: OSD's fail to start after power loss

2021-10-13 Thread DHilsbos
Todd;

What version of ceph are you running?  Are you running containers or packages?  
Was the cluster installed manually, or using a deployment tool?

Logs provided are for osd ID 31, is ID 31 appropriate for that server?  Have 
you verified that the ceph.conf on that server is intact, and correct?

Your log snippet references /var/lib/ceph/osd/ceph-31/keyring; does this file 
exist?  Does the /var/lib/ceph/osd/ceph-31/ folder exist?  If both exist, are 
the ownership and permissions correct / appropriate?

Thank you,

Dominic L. Hilsbos, MBA
Vice President - Information Technology
Perform Air International Inc.
dhils...@performair.com
www.PerformAir.com


-Original Message-
From: Orbiting Code, Inc. [mailto:supp...@orbitingcode.com] 
Sent: Wednesday, October 13, 2021 7:21 AM
To: ceph-users@ceph.io
Subject: [ceph-users] OSD's fail to start after power loss

Hello Everyone,

I have 3 OSD hosts with 12 OSD's each. After a power failure on 1 host, 
all 12 OSD's fail to start on that host. The other 2 hosts did not lose 
power, and are functioning. Obviously I don't want to restart the 
working hosts at this time. Syslog shows:

Oct 12 17:24:07 osd3 systemd[1]: 
ceph-volume@lvm-31-cae13d9a-1d3d-4003-a57f-6ffac21a682e.service: Main 
process exited, code
=exited, status=1/FAILURE
Oct 12 17:24:07 osd3 systemd[1]: 
ceph-volume@lvm-31-cae13d9a-1d3d-4003-a57f-6ffac21a682e.service: Failed 
with result 'exit-
code'.
Oct 12 17:24:07 osd3 systemd[1]: Failed to start Ceph Volume activation: 
lvm-31-cae13d9a-1d3d-4003-a57f-6ffac21a682e.

This is repeated for all 12 OSD's on the failed host. Running the 
following command, shows additional errors.

root@osd3:/var/log# /usr/bin/ceph-osd -f --cluster ceph --id 31 
--setuser ceph --setgroup ceph
2021-10-12 17:50:23.117 7fce92e6ac00 -1 auth: unable to find a keyring 
on /var/lib/ceph/osd/ceph-31/keyring: (2) No such file or directory
2021-10-12 17:50:23.117 7fce92e6ac00 -1 AuthRegistry(0x55c4ec50aa40) no 
keyring found at /var/lib/ceph/osd/ceph-31/keyring, disabling cephx
2021-10-12 17:50:23.117 7fce92e6ac00 -1 auth: unable to find a keyring 
on /var/lib/ceph/osd/ceph-31/keyring: (2) No such file or directory
2021-10-12 17:50:23.117 7fce92e6ac00 -1 AuthRegistry(0x7ffe9b64eb08) no 
keyring found at /var/lib/ceph/osd/ceph-31/keyring, disabling cephx
failed to fetch mon config (--no-mon-config to skip)

No tmpfs mounts exist for any directories in /var/lib/ceph/osd/ceph-**

Any assistance helping with this situation would be greatly appreciated.

Thank you,
Todd
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cluster down

2021-10-13 Thread DHilsbos
Jorge;

This sounds, to me, like something to discuss with the proxmox folks.

Unless there was an IP conflict between the rebooted server, and one of the 
existing mons, I can't see the ceph cluster going unavailable.  Further, I 
don't see where anything ceph related would cause hypervisors, on other hosts, 
to restart.

Thank you,

Dominic L. Hilsbos, MBA
Vice President - Information Technology
Perform Air International Inc.
dhils...@performair.com
www.PerformAir.com


-Original Message-
From: Jorge JP [mailto:jorg...@outlook.es] 
Sent: Wednesday, October 13, 2021 4:07 AM
To: Marc; ceph-users@ceph.io
Subject: [ceph-users] Re: Cluster down

Hello Marc,

To add a node to the ceph cluster with Proxmox, first I have to install Proxmox hehe, 
so this is not the problem.

The configuration file has been reviewed and is correct. I understand your words, but 
it is not a configuration problem.

I can understand that the cluster can have problems if any server is not configured 
correctly or ports on the switches are not configured correctly. But this server 
never became a member of the cluster.

I extracted a part of the logfile from when ceph went down.

A few weeks ago, I had a problem with a port configuration that removed MTU 9216, 
and several hypervisors of the Proxmox cluster rebooted. But today the server is not 
related to the ceph cluster. It only has public and private IPs in the same 
networks, but the ports are not configured.


From: Marc 
Sent: Wednesday, 13 October 2021 12:49
To: Jorge JP ; ceph-users@ceph.io 
Subject: RE: Cluster down

>
> We currently have a ceph cluster in Proxmox, with 5 ceph nodes with the
> public and private network correctly configured and without problems.
> The state of ceph was optimal.
>
> We had prepared a new server to add to the ceph cluster. We did the
> first step of installing Proxmox with the same version. I was at the
> point where I was setting up the network.

I am not using proxmox, just libvirt. But I would say the most important part 
is your ceph cluster. So before doing anything I would make sure to add the 
ceph node first and then install other things.

> For this step, what I did was connect via SSH to the new server and copy the
> network configuration of one of the ceph nodes to this new one. Of
> course, changing the ip addresses.

I would not copy at all. Just change the files manually. If you did not edit one 
file correctly, or the server reboots before you change the ip addresses, you can 
get into all kinds of problems.

> What happened when restarting the network service is that I lost access
> to the cluster. I couldn't access any of the 5 servers that are part of
> the  ceph cluster. Also, 2 of 3 hypervisors
> that we have in the proxmox cluster were restarted directly.

So now you know: you first have to configure networking, then ceph and then 
proxmox. Take your time adding a server. I guess the main reason you are in the 
current situation is that you tried to do it too quickly.

> Why has this happened if the new server is not yet inside the ceph
> cluster on the proxmox cluster and I don't even have the ports
> configured on my switch?

Without logs nobody is able to tell.

> Do you have any idea?
>
> I do not understand, if now I go and take any server and configure an IP
> of the cluster network and even if the ports are not even configured,
> will the cluster knock me down?

Nothing should happen if you install an OS and use ip addresses in the same 
space as your cluster/client network. Do this first.

> I recovered the cluster by physically removing the cables from the new
> server.

So wipe it, and start over.

> Thanks a lot and sorry for my english...

No worries, your english is much better than my spanish ;)

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Adopting "unmanaged" OSDs into OSD service specification

2021-10-13 Thread David Orman
That's the exact situation we've found too. We'll add it to our backlog to
investigate on the development side since it seems nobody else has run into
this issue before.

David

On Wed, Oct 13, 2021 at 4:24 AM Luis Domingues 
wrote:

> Hi,
>
> We have the same issue on our lab cluster. The only way I found to have
> the osds on the new specification was to drain, remove and re-add the host.
> The orchestrator was happy to recreate the osds under the good
> specification.
>
> But I do not think this is a good solution for production cluster. We are
> still looking for a more smooth way to do that.
>
> Luis Domingues
>
> ‐‐‐ Original Message ‐‐‐
>
> On Monday, October 4th, 2021 at 10:01 PM, David Orman <
> orma...@corenode.com> wrote:
>
> > We have an older cluster which has been iterated on many times. It's
> >
> > always been cephadm deployed, but I am certain the OSD specification
> >
> > used has changed over time. I believe at some point, it may have been
> >
> > 'rm'd.
> >
> > So here's our current state:
> >
> > root@ceph02:/# ceph orch ls osd --export
> >
> > service_type: osd
> >
> > service_id: osd_spec_foo
> >
> > service_name: osd.osd_spec_foo
> >
> > placement:
> >
> > label: osd
> >
> > spec:
> >
> > data_devices:
> >
> > rotational: 1
> >
> > db_devices:
> >
> > rotational: 0
> >
> > db_slots: 12
> >
> > filter_logic: AND
> >
> > objectstore: bluestore
> >
> 
> >
> > service_type: osd
> >
> > service_id: unmanaged
> >
> > service_name: osd.unmanaged
> >
> > placement: {}
> >
> > unmanaged: true
> >
> > spec:
> >
> > filter_logic: AND
> >
> > objectstore: bluestore
> >
> > root@ceph02:/# ceph orch ls
> >
> > NAME PORTS RUNNING REFRESHED AGE PLACEMENT
> >
> > crash 7/7 10m ago 14M *
> >
> > mgr 5/5 10m ago 7M label:mgr
> >
> > mon 5/5 10m ago 14M label:mon
> >
> > osd.osd_spec_foo 0/7 - 24m label:osd
> >
> > osd.unmanaged 167/167 10m ago - 
> >
> > The osd_spec_foo would match these devices normally, so we're curious
> >
> > how we can get these 'managed' under this service specification.
> >
> > What's the appropriate way in order to effectively 'adopt' these
> >
> > pre-existing OSDs into the service specification that we want them to
> >
> > be managed under?
> >
> > ceph-users mailing list -- ceph-users@ceph.io
> >
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: MDS: corrupted header/values: decode past end of struct encoding: Malformed input

2021-10-13 Thread Stefan Kooman

On 10/13/21 16:22, von Hoesslin, Volker wrote:

okay, I did it :S I have run this command: cephfs-data-scan scan_links


it ends in the next error, check out the attachment. I think I will 
re-deploy my complete ceph storage and recover my external 
backup files... thx for help.


It's the same assert. Did you already fix the PG issues?

You might be able to build the metadata pool from the data pool as 
described in [1]. If it works you might be able to recover all files and 
learn  while doing so.
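For orientation, the scan steps on that page have roughly this shape (the pool name below is a placeholder, and scan_extents/scan_inodes can be run with multiple workers); please read the whole page first, since these tools can also make things worse:

cephfs-data-scan init
cephfs-data-scan scan_extents cephfs_data
cephfs-data-scan scan_inodes cephfs_data
cephfs-data-scan scan_links
cephfs-data-scan cleanup cephfs_data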


Gr. Stefan

[1]: 
https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/#disaster-recovery-experts

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Do people still use LevelDBStore?

2021-10-13 Thread Casey Bodley
+1 from a dev's perspective. we don't test leveldb, and we don't
expect it to perform as well as rocksdb in ceph, so i don't see any
value in keeping it

the rados team put a ton of effort into converting existing clusters
to rocksdb, so i would be very surprised if removing leveldb left any
users stuck without an upgrade path

On Wed, Oct 13, 2021 at 2:13 PM Ken Dreyer  wrote:
>
> I think it's a great idea to remove it.
>
> - Ken
>
> On Wed, Oct 13, 2021 at 12:52 PM Adam C. Emerson  wrote:
> >
> > Good day,
> >
> > Some time ago, the LevelDB maintainers turned -fno-rtti on in their
> > build. As we don't use -fno-rtti, building LevelDBStore
> > against newer LevelDB packages can fail.
> >
> > This has made me wonder, are there still people who use LevelDBStore
> > and rely on it, or can we deprecate and/or remove it?
> >
> > ___
> > Dev mailing list -- d...@ceph.io
> > To unsubscribe send an email to dev-le...@ceph.io
> >
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Do people still use LevelDBStore?

2021-10-13 Thread Ken Dreyer
I think it's a great idea to remove it.

- Ken

On Wed, Oct 13, 2021 at 12:52 PM Adam C. Emerson  wrote:
>
> Good day,
>
> Some time ago, the LevelDB maintainers turned -fno-rtti on in their
> build. As we don't use -fno-rtti, building LevelDBStore
> against newer LevelDB packages can fail.
>
> This has made me wonder, are there still people who use LevelDBStore
> and rely on it, or can we deprecate and/or remove it?
>
> ___
> Dev mailing list -- d...@ceph.io
> To unsubscribe send an email to dev-le...@ceph.io
>

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Do people still use LevelDBStore?

2021-10-13 Thread Adam C. Emerson
Good day,

Some time ago, the LevelDB maintainers turned -fno-rtti on in their
build. As we don't use -fno-rtti, building LevelDBStore
against newer LevelDB packages can fail.

This has made me wonder, are there still people who use LevelDBStore
and rely on it, or can we deprecate and/or remove it?

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Ganesha NFS hangs on any rebalancing or degraded data redundancy

2021-10-13 Thread Jeff Turmelle
We are using NFS-Ganesha to serve data from our Nautilus cluster to older 
clients.  We recently had an OSD fail and the NFS server will not respond while 
we have degraded data redundancy.  This also happens on the rare occasion when 
we have some lost objects on a PG.  Is this a known issue and is there a 
workaround?

—
Jeff Turmelle, Lead Systems Analyst
International Research Institute for Climate and Society 

Columbia Climate School 
cell: (845) 652-3461

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Default policy for bucket creation

2021-10-13 Thread Dante F . B . Colò
Hello everyone ,

I'm not very experienced ceph user/administrator, i'm looking for some way
to set a default policy for newly created buckets , i can set a policy for
some user existing buckets, but i need this policy on bucket creation , is
there anyway  i can accomplish this ?

Best Regards
Dante
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Accessing Ceph storage from a Windows guest.

2021-10-13 Thread open infra
Hi,

I have deployed Openstack with Ceph.
To get better performance from a Windows guest do I need to have specific
client-side configuration?

My use-case is hundreds of Openstack  Windows guests supposed access 2TB
shared volume (with executable files) (multiattach) as drive D and each VM
has 100GB  root disk.

Regards
Danishka
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: MDS: corrupted header/values: decode past end of struct encoding: Malformed input

2021-10-13 Thread von Hoesslin, Volker
okay, I did it :S I have run this command: cephfs-data-scan scan_links


it ends in the next error, check out the attachment. I think I will re-deploy my 
complete ceph storage and recover my external backup files... thx for help.


volker.




From: Stefan Kooman 
Sent: Thursday, 7 October 2021 12:26:44
To: von Hoesslin, Volker; ceph-users@ceph.io
Subject: [ceph-users] Re: MDS: corrupted header/values: decode past end of 
struct encoding: Malformed input

On 10/7/21 11:27, von Hoesslin, Volker wrote:
> So should I just run this command?

I don't know. The problem is that you already went through the disaster
recovery procedure and reset a couple of things, and that the cause of it
going into a degraded state is not clear to me. In the current state
it's probably difficult for anyone to gauge what the best path forward
is (sorry for the pun).

AFAICT The assert is there because it cannot find "_", possibly looking
up the head of an object (_head) that it cannot find.

So not sure if that command is going to fix that for you, or possibly
make things worse.

Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RFP for arm64 test nodes

2021-10-13 Thread Mark Nelson
There are a lot of advantages to going bare metal if you can make use of 
all of the cores.  It's sort of ironic that it's one of the things Ceph 
is fairly good at.  If you need more parallelism you can throw more OSDs 
at the problem.  Failure domain and general simplicity have always been 
the wildcards that keep me on the side of smaller dense nodes with fewer 
cores, straightforward topography, and surprisingly sometimes lower 
cost.  As far as ARM processors go, it's still pretty wild west imho.  
We might be able to get away with it for functional testing, but we 
can't necessarily buy Ampere and expect that it gives us a good picture 
of how we would behave on Graviton2, Grace, M1, or other setups 
(especially if we were to eventually target things like GPU erasure 
coding offload).



Mark


On 10/11/21 7:56 PM, Dan Mick wrote:
We have some experience testing Ceph on x86 VMs; we used to do that a 
lot, but have moved to mostly physical hosts. I could be wrong, but I 
think our experience is that the cross-loading from one swamped VM to 
another on the same physical host can skew the load/failure recovery 
testing enough that it's attractive for our normal test strategy/load 
to have separate physical hosts.


On 10/11/2021 12:00 AM, Martin Verges wrote:

Hello Dan,

why not using a bit bigger machines and use VMs for tests? We have 
quite good experience with that and it works like a charm. If you 
plan them as hypervisors, you can run a lot of tests simultaneous. 
Use the 80 core ARM, put 512GB or more in them and use some good NVMe 
like P55XX or so. In addition put 2*25GbE/40GbE in the servers and 
you need only a few of them to simulate a lot. This would save costs, 
makes it easier to maintain, and you are much more flexible. For 
example running tests on different OS, injecting latency, simulating 
errors and more.


--
Martin Verges
Managing director

Mobile: +49 174 9335695  | Chat: https://t.me/MartinVerges 



croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io  | YouTube: 
https://goo.gl/PGE1Bx 



On Sat, 9 Oct 2021 at 01:25, Dan Mick > wrote:


    Ceph has been completely ported to build and run on ARM hardware
    (architecture arm64/aarch64), but we're unable to test it due to
    lack of
    hardware.  We propose to purchase a significant number of ARM 
servers

    (50+?) to install in our upstream Sepia test lab to use for upstream
    testing of Ceph, alongside the x86 hardware we already own.

    This message is to start a discussion of what the nature of that
    hardware should be, and an investigation as to what's available 
and how
    much it might cost.  The general idea is to build something 
arm64-based

    that is similar to the smithi/gibba nodes:

    https://wiki.sepia.ceph.com/doku.php?id=hardware:gibba


    Some suggested features:

    * base hardware/peripheral support for current releases of RHEL,
    CentOS,
    Ubuntu
    * 1 fast and largish (400GB+) NVME drive for OSDs (it will be
    partitioned into 4-5 subdrives for tests)
    * 1 large (1TB+) SSD/HDD for boot/system and logs (faster is 
better but

    not as crucial as for cluster storage)
    * Remote/headless management (IPMI?)
    * At least 1 10G network interface per host
    * Order of 64GB main memory per host

    Density is valuable to the lab; we have space but not an unlimited
    amount.

    Any suggestions on vendors or specific server configurations?

    Thanks!

    ___
    Dev mailing list -- d...@ceph.io 
    To unsubscribe send an email to dev-le...@ceph.io
    



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] OSD's fail to start after power loss

2021-10-13 Thread Orbiting Code, Inc.

Hello Everyone,

I have 3 OSD hosts with 12 OSD's each. After a power failure on 1 host, 
all 12 OSD's fail to start on that host. The other 2 hosts did not lose 
power, and are functioning. Obviously I don't want to restart the 
working hosts at this time. Syslog shows:


Oct 12 17:24:07 osd3 systemd[1]: 
ceph-volume@lvm-31-cae13d9a-1d3d-4003-a57f-6ffac21a682e.service: Main 
process exited, code

=exited, status=1/FAILURE
Oct 12 17:24:07 osd3 systemd[1]: 
ceph-volume@lvm-31-cae13d9a-1d3d-4003-a57f-6ffac21a682e.service: Failed 
with result 'exit-

code'.
Oct 12 17:24:07 osd3 systemd[1]: Failed to start Ceph Volume activation: 
lvm-31-cae13d9a-1d3d-4003-a57f-6ffac21a682e.


This is repeated for all 12 OSD's on the failed host. Running the 
following command, shows additional errors.


root@osd3:/var/log# /usr/bin/ceph-osd -f --cluster ceph --id 31 
--setuser ceph --setgroup ceph
2021-10-12 17:50:23.117 7fce92e6ac00 -1 auth: unable to find a keyring 
on /var/lib/ceph/osd/ceph-31/keyring: (2) No such file or directory
2021-10-12 17:50:23.117 7fce92e6ac00 -1 AuthRegistry(0x55c4ec50aa40) no 
keyring found at /var/lib/ceph/osd/ceph-31/keyring, disabling cephx
2021-10-12 17:50:23.117 7fce92e6ac00 -1 auth: unable to find a keyring 
on /var/lib/ceph/osd/ceph-31/keyring: (2) No such file or directory
2021-10-12 17:50:23.117 7fce92e6ac00 -1 AuthRegistry(0x7ffe9b64eb08) no 
keyring found at /var/lib/ceph/osd/ceph-31/keyring, disabling cephx

failed to fetch mon config (--no-mon-config to skip)

No tmpfs mounts exist for any directories in /var/lib/ceph/osd/ceph-**

Any assistance helping with this situation would be greatly appreciated.

Thank you,
Todd
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Adopting "unmanaged" OSDs into OSD service specification

2021-10-13 Thread Luis Domingues
Hi,

We have the same issue on our lab cluster. The only way I found to have the 
osds on the new specification was to drain, remove and re-add the host. The 
orchestrator was happy to recreate the osds under the good specification.

But I do not think this is a good solution for production cluster. We are still 
looking for a more smooth way to do that.

Luis Domingues

‐‐‐ Original Message ‐‐‐

On Monday, October 4th, 2021 at 10:01 PM, David Orman  
wrote:

> We have an older cluster which has been iterated on many times. It's
>
> always been cephadm deployed, but I am certain the OSD specification
>
> used has changed over time. I believe at some point, it may have been
>
> 'rm'd.
>
> So here's our current state:
>
> root@ceph02:/# ceph orch ls osd --export
>
> service_type: osd
>
> service_id: osd_spec_foo
>
> service_name: osd.osd_spec_foo
>
> placement:
>
> label: osd
>
> spec:
>
> data_devices:
>
> rotational: 1
>
> db_devices:
>
> rotational: 0
>
> db_slots: 12
>
> filter_logic: AND
>
> objectstore: bluestore
> 
>
> service_type: osd
>
> service_id: unmanaged
>
> service_name: osd.unmanaged
>
> placement: {}
>
> unmanaged: true
>
> spec:
>
> filter_logic: AND
>
> objectstore: bluestore
>
> root@ceph02:/# ceph orch ls
>
> NAME PORTS RUNNING REFRESHED AGE PLACEMENT
>
> crash 7/7 10m ago 14M *
>
> mgr 5/5 10m ago 7M label:mgr
>
> mon 5/5 10m ago 14M label:mon
>
> osd.osd_spec_foo 0/7 - 24m label:osd
>
> osd.unmanaged 167/167 10m ago - 
>
> The osd_spec_foo would match these devices normally, so we're curious
>
> how we can get these 'managed' under this service specification.
>
> What's the appropriate way in order to effectively 'adopt' these
>
> pre-existing OSDs into the service specification that we want them to
>
> be managed under?
>
> ceph-users mailing list -- ceph-users@ceph.io
>
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: is it possible to remove the db+wal from an external device (nvme)

2021-10-13 Thread Szabo, Istvan (Agoda)
Is it possible to extend the block.db lv of that specific osd with lvextend 
command or it needs some special bluestore extend?
I want to extend that lv with the size of the spillover, compact it and migrate 
after.

Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---

From: Igor Fedotov 
Sent: Tuesday, October 12, 2021 7:15 PM
To: Szabo, Istvan (Agoda) 
Cc: ceph-users@ceph.io; 胡 玮文 
Subject: Re: [ceph-users] Re: is it possible to remove the db+wal from an 
external device (nvme)



Istvan,

So things with migrations are clear at the moment, right? As I mentioned the 
migrate command in 15.2.14 has a bug which causes corrupted OSD if db->slow 
migration occurs on spilled over OSD. To work around that you might want to 
migrate slow to db first or try manual compaction. Please make sure there is no 
spilled over data left after any of them via bluestore-tool's 
bluefs-bdev-sizes command before proceeding with db->slow migrate...
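
As an illustration only (the OSD id, fsid and LV names below are placeholders, and the OSD has to be stopped first), that sequence would look roughly like:

systemctl stop ceph-osd@12

# 1) pull any spilled-over BlueFS data from the main (slow) device back onto the DB LV
ceph-volume lvm migrate --osd-id 12 --osd-fsid <osd-fsid> --from data --target ceph-db-vg/db-12

# 2) verify nothing is left on the slow device for BlueFS
ceph-bluestore-tool bluefs-bdev-sizes --path /var/lib/ceph/osd/ceph-12

# 3) only then move DB/WAL down onto the main device
ceph-volume lvm migrate --osd-id 12 --osd-fsid <osd-fsid> --from db wal --target ceph-block-vg/block-12

systemctl start ceph-osd@12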

just a side note - IMO it sounds a bit controversial that you're 
expecting/experiencing better performance without standalone DB and at the same 
time spillovers cause performance issues... Spillover means some data goes to 
main device (which you're trying to achieve by migrating as well) hence it 
would rather improve things... Or the root cause of your performace issues is 
different... Just want to share my thoughts - I don't have any better ideas 
about that so far...



Thanks,

Igor
On 10/12/2021 2:54 PM, Szabo, Istvan (Agoda) wrote:
I’m having 1 billion objects in the cluster and we are still increasing, and we 
faced spillovers all over the clusters.
After 15-18 spilled-over osds (out of the 42-50) the osds started to die, 
flapping.
I tried to compact the spilled-over ones manually, but it didn’t help; however, the 
osds that had not spilled over crashed less frequently.
In our design 3 SSDs were used with 1 NVMe for db+wal, but this NVMe has 30k iops on 
random write, while the SSDs behind this NVMe individually have 67k, so 
the SSDs are actually faster at writes than the NVMe, which means our config is 
suboptimal.

I’ve decided to update the cluster to 15.2.14 to be able to run this 
ceph-volume lvm migrate command and started to use it.

10-20% is the failed migration at the moment, 80-90% is successful.
I want to avoid this spillover in the future so I’ll use bare SSDs as osds 
without wal+db. At the moment my iowait decreased a lot without the nvme drives; I 
just hope I didn’t do anything wrong with this migration, right?

The failed ones I’m removing from the cluster and add it back after cleaned up.

Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---

From: Igor Fedotov 
Sent: Tuesday, October 12, 2021 6:45 PM
To: Szabo, Istvan (Agoda) 

Cc: ceph-users@ceph.io; 胡 玮文 

Subject: Re: [ceph-users] Re: is it possible to remove the db+wal from an 
external device (nvme)



You mean you run migrate for these 72 OSDs and all of them aren't starting any 
more? Or you just upgraded them to Octopus and experiencing performance issues.

In the latter case and if you have enough space at DB device you might want to 
try to migrate data from slow to db first. Run fsck (just in case) and then 
migrate from DB/WAL back to slow.
Theoretically this should help in avoiding the before-mentioned bug. But  I 
haven't try that personally...

And this wouldn't fix the corrupted OSDs if any though...



Thanks,

Igor
On 10/12/2021 2:36 PM, Szabo, Istvan (Agoda) wrote:
Omg, I’ve already migrated 24x osds in each dc-s (altogether 72).
What should I do then? 12 left (altogether 36). In my case slow device is 
faster in random write iops than the one which is serving it.

Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---



On 2021. Oct 12., at 13:21, Igor Fedotov 
 wrote:


Istvan,

you're bitten by

It's not fixed in 15.2.14. This has got a backport to upcoming Octopus
minor release. Please do not use 'migrate' command from WAL/DB to slow
volume if some data is already present 

[ceph-users] Re: Cluster down

2021-10-13 Thread Jorge JP
Hello Marc,

To add a node to the ceph cluster with Proxmox, first I have to install Proxmox hehe, 
so this is not the problem.

The configuration file has been reviewed and is correct. I understand your words, but 
it is not a configuration problem.

I can understand that the cluster can have problems if any server is not configured 
correctly or ports on the switches are not configured correctly. But this server 
never became a member of the cluster.

I extracted a part of the logfile from when ceph went down.

A few weeks ago, I had a problem with a port configuration that removed MTU 9216, 
and several hypervisors of the Proxmox cluster rebooted. But today the server is not 
related to the ceph cluster. It only has public and private IPs in the same 
networks, but the ports are not configured.


From: Marc 
Sent: Wednesday, 13 October 2021 12:49
To: Jorge JP ; ceph-users@ceph.io 
Subject: RE: Cluster down

>
> We currently have a ceph cluster in Proxmox, with 5 ceph nodes with the
> public and private network correctly configured and without problems.
> The state of ceph was optimal.
>
> We had prepared a new server to add to the ceph cluster. We did the
> first step of installing Proxmox with the same version. I was at the
> point where I was setting up the network.

I am not using proxmox, just libvirt. But I would say the most important part 
is your ceph cluster. So before doing anything I would make sure to add the 
ceph node first and then install other things.

> For this step, what I did was connect via SSH to the new server and copy the
> network configuration of one of the ceph nodes to this new one. Of
> course, changing the ip addresses.

I would not copy at all. Just change the files manually. If you did not edit one 
file correctly, or the server reboots before you change the ip addresses, you can 
get into all kinds of problems.

> What happened when restarting the network service is that I lost access
> to the cluster. I couldn't access any of the 5 servers that are part of
> the  ceph cluster. Also, 2 of 3 hypervisors
> that we have in the proxmox cluster were restarted directly.

So now you know: you first have to configure networking, then ceph and then 
proxmox. Take your time adding a server. I guess the main reason you are in the 
current situation is that you tried to do it too quickly.

> Why has this happened if the new server is not yet inside the ceph
> cluster on the proxmox cluster and I don't even have the ports
> configured on my switch?

Without logs nobody is able to tell.

> Do you have any idea?
>
> I do not understand, if now I go and take any server and configure an IP
> of the cluster network and even if the ports are not even configured,
> will the cluster knock me down?

Nothing should happen if you install an OS and use ip addresses in the same 
space as your cluster/client network. Do this first.

> I recovered the cluster by physically removing the cables from the new
> server.

So wipe it, and start over.

> Thanks a lot and sorry for my english...

No worries, your english is much better than my spanish ;)

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Datacenter migration: How to change cluster network.

2021-10-13 Thread mhnx
Hello.

We're moving our cluster to a different datacenter and I need to change the
cluster and public networks.
Is there any procedure or guide for doing this?

I think I should follow these steps:
1- Power-on all nodes.
2- Do not start any Mon,Mgr,Mds,Osd.
3- Set up the old network ip's as vlan0 access mode on Public NIC's
temporarily. (There is no gateway but it will work as vlan0 cause they're
in the same lan.) But with these settings there is no communication between
public and cluster networks.
4- Add the new network as a Cluster network using different NIC's, set up
bonding and be sure everything works perfectly.
5- Start Mon's with old network configuration on Vlan0 (do not unset
pause,norecover, norebalance, nobackfill, nodown,noout)
6- Change the monmap and add the new network in ceph.conf. (I'm not sure; see the monmaptool sketch below.)
7- ?? is there any other steps ??
8- Start OSD's
9- Unset flags and done.

I've created a 3 node test cluster on VM's to try it.
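
A rough sketch of step 6 with monmaptool (mon id and address are placeholders, all mons stopped, and with a backup of the mon stores taken first):

ceph-mon -i mon-a --extract-monmap /tmp/monmap
monmaptool --print /tmp/monmap
monmaptool --rm mon-a /tmp/monmap
monmaptool --add mon-a 192.168.50.11:6789 /tmp/monmap
ceph-mon -i mon-a --inject-monmap /tmp/monmap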
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cluster down

2021-10-13 Thread Marc
> 
> We currently have a ceph cluster in Proxmox, with 5 ceph nodes with the
> public and private network correctly configured and without problems.
> The state of ceph was optimal.
> 
> We had prepared a new server to add to the ceph cluster. We did the
> first step of installing Proxmox with the same version. I was at the
> point where I was setting up the network.

I am not using proxmox, just libvirt. But I would say the most important part 
is your ceph cluster. So before doing anything I would make sure to add the 
ceph node first and then install other things.

> For this step, what I did was connect via SSH to the new server and copy the
> network configuration of one of the ceph nodes to this new one. Of
> course, changing the ip addresses.

I would not copy at all. Just change the files manually. If you did not edit one 
file correctly, or the server reboots before you change the ip addresses, you can 
get into all kinds of problems.

> What happened when restarting the network service is that I lost access
> to the cluster. I couldn't access any of the 5 servers that are part of
> the  ceph cluster. Also, 2 of 3 hypervisors
> that we have in the proxmox cluster were restarted directly.

So now you know: you first have to configure networking, then ceph and then 
proxmox. Take your time adding a server. I guess the main reason you are in the 
current situation is that you tried to do it too quickly.

> Why has this happened if the new server is not yet inside the ceph
> cluster on the proxmox cluster and I don't even have the ports
> configured on my switch?

Without logs nobody is able to tell.

> Do you have any idea?
> 
> I do not understand, if now I go and take any server and configure an IP
> of the cluster network and even if the ports are not even configured,
> will the cluster knock me down?

Nothing should happen if you install an OS and use ip addresses in the same 
space as your cluster/client network. Do this first.

> I recovered the cluster by physically removing the cables from the new
> server.

So wipe it, and start over.

> Thanks a lot and sorry for my english...

No worries, your english is much better than my spanish ;)

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Cluster down

2021-10-13 Thread Jorge JP
Hello,

We currently have a ceph cluster in Proxmox, with 5 ceph nodes with the public 
and private network correctly configured and without problems. The state of 
ceph was optimal.

We had prepared a new server to add to the ceph cluster. We did the first step 
of installing Proxmox with the same version. I was at the point where I was 
setting up the network.

For this step, what I did was connect via SSH to the new server and copy the network 
configuration of one of the ceph nodes to this new one. Of course, changing the 
ip addresses.

On each ceph node, I have 2 ports configured on two different switches 
configured with bond.

In the new ceph node, I did not have the ports configured with the bond in the 
switch (Cisco). My intention was, once the configuration file was saved, to restart 
the network service on the server
and then go to configure the ports on the switch. I have to configure the ports 
in the last step, because the server is configured without the bond, and if I 
configure the ports with the bond I lose access to the server.

What happened when restarting the network service is that I lost access to the 
cluster. I couldn't access any of the 5 servers that are part of the  ceph 
cluster. Also, 2 of 3 hypervisors
that we have in the proxmox cluster were restarted directly.

Why has this happened if the new server is not yet inside the ceph cluster on 
the proxmox cluster and I don't even have the ports configured on my switch?

Do you have any idea?

I do not understand, if now I go and take any server and configure an IP of the 
cluster network and even if the ports are not even configured, will the cluster 
knock me down?

I recovered the cluster by physically removing the cables from the new server.

Thanks a lot and sorry for my english...
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Multisite Pubsub - Duplicates Growing Uncontrollably

2021-10-13 Thread Yuval Lifshitz
Hi Alex,
How many overall zones do you have configured in the system?
We have an issue with pubsub based notifications, where we may get as many
as (#zone-1) duplicates per object.
This, however, won't explain 13 events per object.

Did you verify that these are indeed the same events? For the same object,
do you see the same mtime and etag?

RGW restarts may also explain the issue, if the RGW restarts mid bucket
syncing, it will restart the sync over, and this may result in duplicate
notification events.

As a side note, why are you using pubsub based notifications and not the
regular "push" bucket notifications [1]?
The application APIs for bucket notifications are more standard (exists in
most SDKs), and the feature is more robust (especially with the async
"persistent" notifications), easier to configure (no special zone is
needed), and more feature-rich.

Yuval

[1] https://docs.ceph.com/en/octopus/radosgw/notifications/
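
For reference, a minimal sketch of creating a push topic and attaching it to a bucket with boto3; the endpoint, credentials, broker URL and names below are all made up, and [1] lists the full set of topic attributes:

import boto3

endpoint = 'http://rgw.example.com:8000'   # hypothetical RGW endpoint
creds = dict(aws_access_key_id='ACCESS', aws_secret_access_key='SECRET',
             region_name='default')

# create a topic that pushes events to an AMQP broker
sns = boto3.client('sns', endpoint_url=endpoint, **creds)
sns.create_topic(Name='mytopic',
                 Attributes={'push-endpoint': 'amqp://broker.example.com:5672',
                             'amqp-exchange': 'ex1'})

# attach a notification for object-create events on the bucket
s3 = boto3.client('s3', endpoint_url=endpoint, **creds)
s3.put_bucket_notification_configuration(
    Bucket='mybucket',
    NotificationConfiguration={'TopicConfigurations': [{
        'Id': 'notif1',
        'TopicArn': 'arn:aws:sns:default::mytopic',
        'Events': ['s3:ObjectCreated:*'],
    }]})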



On Mon, Oct 11, 2021 at 9:54 PM Alex Hussein-Kershaw 
wrote:

> Hi Ceph-Users,
>
> I have a multisite Ceph cluster deployed on containers within 3 VMs (6 VMs
> total over 2 sites). Each VM has a mon, osd, mgr, mds, and two rgw
> containers (regular and pubsub).  It was installed with ceph-ansible.
>
> One of the sites has been up for a few years, the other site has been
> recently re-installed and paired with the initial site. The initial site is
> using Nautlius (14.2.9), the new site is on Octopus (15.2.13). (Side point
> - is this valid?)
>
> I've noticed that on the new site, pubsub is building a gigantic queue of
> objects (it's building faster than our product can acknowledge the events).
> I'm having a rough time trying to debug this/understand why the queue is
> building.
>
> I currently have 450k objects stored in an S3 bucket, that is mostly
> inactive (our test system backed by this cluster is off while we attempt to
> resolve this), synced between the two sites. The pubsub queue on the second
> site currently has 1.7M objects, and I've disabled the pubsub containers to
> prevent it building further.  As soon as I enable the pubsub containers
> again this starts building at an alarming rate.
>
> What I've tried:
>
>   *   Interacting with the pubsub REST API. I pulled all the events in the
> pubsub queue and did some analysis on them.
>   *   Of the 1.7M events, there were 106k unique S3 objects referenced.
>   *   The average S3 object had 13 pubsub events referring to it. This
> seems very odd given the inactivity of the data, I was expecting to find no
> duplicate entries here.
>   *   The most mentioned S3 object was referred to 362 times (i.e. a
> single S3 object had 362 pubsub OBJECT_CREATE events).
>   *   All the mTimes are from 2020 (other than 35 in 2021) - the second
> site was only deployed this month.
>
> Does anyone have any suggestions as to why this is occurring?
>
> Thanks,
> Alex
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: is it possible to remove the db+wal from an external device (nvme)

2021-10-13 Thread Igor Fedotov

Yes. For DB volume expanding underlying device/lv should be enough...
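
i.e. something along these lines (LV name, size and OSD id are placeholders):

systemctl stop ceph-osd@12
lvextend -L +32G /dev/ceph-db-vg/db-12
# recent OSDs pick up the larger device on start; bluefs-bdev-expand makes it explicit
ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-12
systemctl start ceph-osd@12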

--
Igor Fedotov
Ceph Lead Developer

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx

On 10/13/2021 12:03 PM, Szabo, Istvan (Agoda) wrote:


Is it possible to extend the block.db lv of that specific osd with 
lvextend command or it needs some special bluestore extend?


I want to extend that lv with the size of the spillover, compact it 
and migrate after.


Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com 
---

*From:* Igor Fedotov 
*Sent:* Tuesday, October 12, 2021 7:15 PM
*To:* Szabo, Istvan (Agoda) 
*Cc:* ceph-users@ceph.io; 胡 玮文 
*Subject:* Re: [ceph-users] Re: is it possible to remove the db+wal 
from an external device (nvme)






Istvan,

So things with migrations are clear at the moment, right? As I 
mentioned the migrate command in 15.2.14 has a bug which causes 
corrupted OSD if db->slow migration occurs on spilled over OSD. To 
work around that you might want to migrate slow to db first or try 
manual compaction. Please make sure there is no spilled over data left 
after any of them via bluestore-tool's bluestore-bdev-sizes command 
before proceeding with db->slow migrate...


just a side note - IMO it sounds a bit controversial that you're 
expecting/experiencing better performance without standalone DB and at 
the same time spillovers cause performance issues... Spillover means 
some data goes to main device (which you're trying to achieve by 
migrating as well) hence it would rather improve things... Or the root 
cause of your performace issues is different... Just want to share my 
thoughts - I don't have any better ideas about that so far...


Thanks,

Igor

On 10/12/2021 2:54 PM, Szabo, Istvan (Agoda) wrote:

I’m having 1 billions of objects in the cluster and we are still
increasing and faced spillovers allover the clusters.

After 15-18 spilledover osds (out of the 42-50) the osds started
to die, flapping.

Tried to compact manually the spilleovered ones, but didn’t help,
however the not spilled osds less frequently crashed.

In our design 3 ssd was used 1 nvme for db+wal, but this nvme has
30k iops on random write, however the ssds behind this nvme have
individually 67k so actually the SSDs are faster in write than the
nvme which means our config suboptimal.

I’ve decided to update the cluster to 15.2.14 to be able to run
this ceph-volume lvm migrate command and started to use it.

10-20% is the failed migration at the moment, 80-90% is successful.

I want to avoid this spillover in the future so I’ll use bare SSDs
as osds without wal+db. At the moment my iowait decreased  a lot
without nvme drives, I just hope didn’t do anything wrong with
this migration right?

The failed ones I’m removing from the cluster and add it back
after cleaned up.

Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com 
---

*From:* Igor Fedotov 

*Sent:* Tuesday, October 12, 2021 6:45 PM
*To:* Szabo, Istvan (Agoda) 

*Cc:* ceph-users@ceph.io ; 胡 玮文
 
*Subject:* Re: [ceph-users] Re: is it possible to remove the
db+wal from an external device (nvme)




You mean you run migrate for these 72 OSDs and all of them aren't
starting any more? Or you just upgraded them to Octopus and
experiencing performance issues.

In the latter case and if you have enough space at DB device you
might want to try to migrate data from slow to db first. Run fsck
(just in case) and then migrate from DB/WAl back to slow.

Theoretically this should help in avoiding the before-mentioned
bug. But  I haven't try that personally...

And this wouldn't fix the corrupted OSDs if any though...

Thanks,

Igor

On 10/12/2021 2:36 PM, Szabo, Istvan (Agoda) wrote:

Omg, I’ve already migrated 24x osds in each dc-s (altogether 72).

What should I do then? 12 left (altogether 36). In