[ceph-users] Re: ceph iscsi latency too high for esxi?

2020-10-04 Thread Maged Mokhtar
It is a load issue: your combined load of client I/O, recovery and scrub is higher than what your cluster can handle. Whereas some Ceph commands can block when things are very busy, VMware iSCSI is less tolerant, but it is not the problem. If you have charts, look at the metric for disk % utilization
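If you don't have charts, a minimal way to spot disk saturation from the shell could look like this (the 5-second interval is just an example, and ceph osd perf only gives a coarse per-OSD latency view):

    # on an OSD node: watch the %util column for the OSD data disks
    iostat -x 5
    # from any node with an admin keyring: commit/apply latency per OSD in ms
    ceph osd perf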

[ceph-users] Re: ceph iscsi latency too high for esxi?

2020-10-04 Thread Golasowski Martin
Oh, thanks, that does not sound very encouraging. In our case it looked the same: we had to reboot three ESXi nodes via IPMI because they got stuck during an ordinary soft reboot. 1. RecoveryTimeout is set to 25 on our nodes 2. We have one two-port adapter per node (ConnectX-5) and 4 iSCSI GWs total,
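For reference, a hedged sketch of how the RecoveryTimeout value from point 1 can be checked and set from the ESXi shell; the adapter name vmhba64 is only a placeholder for the software iSCSI adapter on each host:

    # show current parameters of the software iSCSI adapter
    esxcli iscsi adapter param get -A vmhba64
    # set the timeout mentioned above (25 seconds)
    esxcli iscsi adapter param set -A vmhba64 -k RecoveryTimeout -v 25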

[ceph-users] Re: ceph iscsi latency too high for esxi?

2020-10-04 Thread Phil Regnauld
Yep, and we're still experiencing it every few months. One (and only one) of our ESXi nodes, which are otherwise identical, experiences a total freeze of all I/O and won't recover. I mean, ESXi is so dead that we have to go into IPMI and reset the box... We're using Croit's software, but the

[ceph-users] Re: ceph iscsi latency too high for esxi?

2020-10-04 Thread Golasowski Martin
For clarity, the issue has also been reported before: https://www.spinics.net/lists/ceph-users/msg59798.html https://www.spinics.net/lists/target-devel/msg10469.html

[ceph-users] Re: ceph iscsi latency too high for esxi?

2020-10-04 Thread Martin Verges
Hello, in my personal opinion, HDDs are a technology from the last century and I would never ever think about using such old technology for modern VM/Container/... workloads. My time, as well as that of any employee, is too precious to wait for a hard drive to find the requested data! Use EC on NVMe if you
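To make the EC-on-NVMe suggestion concrete, a minimal sketch of what such an RBD setup could look like; the profile (k=4, m=2), pool names, PG counts and device class are assumptions to adapt, not a sizing recommendation:

    # EC profile pinned to NVMe OSDs
    ceph osd erasure-code-profile set ec-nvme k=4 m=2 crush-device-class=nvme
    # EC data pool plus a small replicated pool for RBD metadata
    ceph osd pool create rbd-data 128 128 erasure ec-nvme
    ceph osd pool set rbd-data allow_ec_overwrites true
    ceph osd pool create rbd-meta 64 64 replicated
    rbd pool init rbd-meta
    # image headers live in the replicated pool, data in the EC pool
    rbd create rbd-meta/vm01 --size 100G --data-pool rbd-data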

[ceph-users] Re: ceph iscsi latency too high for esxi?

2020-10-04 Thread Golasowski Martin
Thanks! Does that mean that occasional iSCSI path drop-outs are somewhat expected? We are using SSDs for WAL/DB on each OSD server, so at least that. Do you think that buying an additional 6/12 HDDs would help with the IOPS for the VMs? Regards, Martin

[ceph-users] Re: ceph iscsi latency too high for esxi?

2020-10-04 Thread Martin Verges
Hello, no, iSCSI + VMware works without such problems. > We are on latest Nautilus, 12 x 10 TB OSDs (4 servers), 25 Gbit/s Ethernet, erasure-coded rbd pool with 128 PGs, around 200 PGs per OSD total. Nautilus is a good choice. 12*10TB HDD is not good for VMs. 25 Gbit/s on HDD is way too much for that
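As a rough back-of-the-envelope illustration of why 12 spinning disks struggle here (the ~150 random IOPS per HDD figure and the 4+2 EC profile are assumptions, not measurements from this cluster):

    # raw HDD IOPS divided by the OSDs touched per EC write (k+m = 6)
    # gives a crude ceiling, before recovery and scrub take their share
    echo $(( 12 * 150 / 6 ))   # roughly 300 client write IOPS cluster-wide

That budget is shared by every VM on the datastore, which fits the latency and path drop-out symptoms described earlier in the thread.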