Hi All,
Regarding the monitoring services on a Ceph cluster (i.e. Prometheus,
Grafana, Alertmanager, Loki, Node-Exporter, Promtail, etc.): how many
instances should/can we run for fault tolerance purposes? I can't seem
to recall that advice being in the doco anywhere (but of course, I
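A hedged sketch of how one could scale these services with the cephadm orchestrator; the counts here are illustrative assumptions, not advice from the docs:
ceph orch apply prometheus --placement="count:2"     # two Prometheus instances
ceph orch apply alertmanager --placement="count:2"   # two Alertmanagers
ceph orch apply node-exporter --placement="*"        # node-exporter on every host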
On Fri, Jan 19, 2024 at 2:38 PM Marc wrote:
>
> Am I doing something weird when I do the following on a ceph node (nautilus, el7):
>
> rbd snap ls vps-test -p rbd
> rbd map vps-test@vps-test.snap1 -p rbd
>
> mount -o ro /dev/mapper/VGnew-LVnew /mnt/disk <--- reset/reboot ceph node
Hi Marc,
It's not clear
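One hedged guess at a sequence that should work, assuming the image carries an LVM PV for VG "VGnew" (names taken from the post): map the snapshot read-only and activate the VG before mounting.
rbd map --read-only vps-test@vps-test.snap1 -p rbd   # snapshots map read-only anyway
vgchange -ay VGnew                                   # activate the VG found on the mapped device
mount -o ro /dev/mapper/VGnew-LVnew /mnt/disk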
Hi Roman,
The fact that changing the pg_num for the index pool drops the latency
back down might be a clue. Do you have a lot of deletes happening on
this cluster? If you have a lot of deletes and long pauses between
writes, you could be accumulating tombstones that you have to keep
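If tombstones are the suspect, one hedged way to test the theory is to compact the RocksDB of the OSDs backing the index pool and see whether latency drops (the osd IDs are placeholders):
ceph tell osd.<id> compact                                           # online compaction
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-<id> compact   # offline, with the OSD stopped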
Hi Stefan,
> Do you make use of a separate db partition as well? And if so, where is
> it stored?
>
No, only the WAL partition is on a separate NVMe partition. Not sure if
ceph-ansible could install Ceph with the db partition on a separate device on
v17.2.6
Do you only see latency increase in reads? And not
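For reference, a hedged ceph-volume invocation that puts both WAL and DB on separate NVMe partitions; the device paths are made up for illustration:
ceph-volume lvm create --data /dev/sdb \
    --block.wal /dev/nvme0n1p1 \
    --block.db /dev/nvme0n1p2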
On 1/18/24 03:40, Frank Schilder wrote:
For the multi- vs. single-OSD-per-flash-drive decision, the following test might be
useful:
We found dramatic improvements using multiple OSDs per flash drive with Octopus
*if* the bottleneck is the kv_sync_thread. Apparently, each OSD has only one
and
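A hedged example of how one can stand up two OSDs per flash device with ceph-volume (the device path is an assumption):
ceph-volume lvm batch --osds-per-device 2 /dev/nvme0n1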
Hi Eugen,
> How is the data growth in your cluster? Is the pool size rather stable or
> is it constantly growing?
>
Pool size is fairly constant with a tiny upward trend. Its growth doesn't
correlate with the increase of OSD read latency. I've combined pool usage with
OSD read latency on one graph to
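For anyone wanting to eyeball the same correlation without graphs, a hedged pair of CLI checks sampled over time:
ceph osd perf      # per-OSD commit/apply latency in ms
ceph df detail     # per-pool usage, to watch growth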
On 16-01-2024 11:22, Roman Pashin wrote:
Hello Ceph users,
we see a strange issue on our latest Ceph installation, v17.2.6. We store
data on an HDD pool; the index pool is on SSD. Each OSD stores its WAL on an NVMe
partition.
Do you make use of a separate db partition as well? And if so, where is
it
Oh that does sound strange indeed. I don't have a good idea right now,
hopefully someone from the dev team can shed some light on this.
Quoting Robert Sander:
Hi,
more strange behaviour:
When I issue "ceph mgr fail" a backup MGR takes over and updates
all config files on all hosts
Hi,
I checked two production clusters which don't use RGW too heavily,
both on Pacific though. There's no latency increase visible there. How
is the data growth in your cluster? Is the pool size rather stable or
is it constantly growing?
Thanks,
Eugen
Quoting Roman Pashin:
Hello
Hi,
more strange behaviour:
When I issue "ceph mgr fail" a backup MGR takes over and updates all
config files on all hosts including /etc/ceph/ceph.conf.
At first I thought this was the solution, but when I now remove the
_admin label and add it again, the new MGR also does not update
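A hedged sketch of the steps described, with the hostname as a placeholder:
ceph orch host label rm host1 _admin
ceph orch host label add host1 _admin   # expected: ceph.conf and admin keyring redistributed
ceph mgr fail                           # fail over; the new MGR rewrites config files on all hosts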
Hi Eugen,
thanks for verifying this. I have created a tracker issue:
https://tracker.ceph.com/issues/64102
-Yenya
Eugen Block wrote:
: Hi,
:
: I checked the behaviour on Octopus, Pacific and Quincy, and I can
: confirm it. I don't have the time to dig deeper right now, but I'd
: suggest opening
I'm having a bit of a weird issue with cluster rebalances with a new EC
pool. I have a 3-machine cluster, each machine with 4 HDD OSDs (+1 SSD).
Until now I've been using an erasure coded k=5 m=3 pool for most of my
data. I've recently started to migrate to a k=5 m=4 pool, so I can
configure the
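For context, a hedged sketch of how such a pool might be created; the profile and pool names are assumptions, and with 9 shards on 3 hosts the failure domain has to be osd rather than host:
ceph osd erasure-code-profile set k5m4 k=5 m=4 crush-failure-domain=osd
ceph osd pool create data-ec erasure k5m4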
Hi,
I checked the behaviour on Octopus, Pacific and Quincy, and I can confirm
it. I don't have the time to dig deeper right now, but I'd suggest opening
a tracker issue.
Thanks,
Eugen
Quoting Jan Kasprzak:
Hello, Ceph users,
what is the correct location of the keyring for ceph-crash?
I tried
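For reference, a hedged version of what the upstream docs suggest for the crash client; whether ceph-crash actually picks it up from /etc/ceph may depend on the version, which seems to be the point of this thread:
ceph auth get-or-create client.crash mon 'profile crash' mgr 'profile crash' \
    -o /etc/ceph/ceph.client.crash.keyring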