[ceph-users] ceph iscsi latency too high for esxi?

2020-10-04 Thread Golasowski Martin
Hi, does anyone here use Ceph iSCSI with VMware ESXi? It seems that we are hitting the 5 second timeout limit on the software HBA in ESXi. It appears whenever there is increased load on the cluster, like deep scrub or rebalance. Is it normal behaviour in production? Or is there something special we

[ceph-users] Re: ceph iscsi latency too high for esxi?

2020-10-04 Thread Martin Verges
Hello, no, iSCSI + VMware works without such problems. > We are on latest Nautilus, 12 x 10 TB OSDs (4 servers), 25 Gbit/s Ethernet, erasure coded rbd pool with 128 PGs, around 200 PGs per OSD total. Nautilus is a good choice. 12*10TB HDD is not good for VMs. 25Gbit/s on HDD is way too much for that

[ceph-users] Re: ceph iscsi latency too high for esxi?

2020-10-04 Thread Golasowski Martin
Thanks! Does that mean that occasional iSCSI path drop-outs are somewhat expected? We are using SSDs for WAL/DB on each OSD server, so at least that. Do you think that buying an additional 6/12 HDDs would help with the IOPS for the VMs? Regards, Martin > On 4 Oct 2020, at 15:17, Mart

[ceph-users] Re: ceph iscsi latency too high for esxi?

2020-10-04 Thread Martin Verges
Hello, in my personal opinion, HDDs are a technology from the last century and I would never ever think about using such old technology for modern VM/Container/... workloads. My time, as well as any employee's, is too precious to wait for a hard drive to find the requested data! Use EC on NVMe if you

[ceph-users] Re: ceph iscsi latency too high for esxi?

2020-10-04 Thread Steve Thompson
On Sun, 4 Oct 2020, Martin Verges wrote: Does that mean that occasional iSCSI path drop-outs are somewhat expected? Not that I'm aware of, but I have no HDD-based iSCSI cluster at hand to check. Sorry. I use iSCSI extensively, but for ZFS and not Ceph. Path drop-outs are not common; indeed,

[ceph-users] Re: ceph iscsi latency too high for esxi?

2020-10-04 Thread Golasowski Martin
For clarity, the issue has also been reported before: https://www.spinics.net/lists/ceph-users/msg59798.html https://www.spinics.net/lists/target-devel/msg10469.html > On 4 Oct 2020, at 16:46, Steve Thompson wrote: > > On Sun, 4 Oct 2

[ceph-users] Re: ceph iscsi latency too high for esxi?

2020-10-04 Thread Phil Regnauld
Yep, and we're still experiencing it every few months. One (and only one) of our ESXi nodes, which are otherwise identical, experiences a total freeze of all I/O, and it won't recover - I mean, ESXi is so dead, we have to go into IPMI and reset the box... We're using Croit's software, but the is

[ceph-users] Re: Feedback for proof of concept OSD Node

2020-10-04 Thread Brian Topping
Hi Ignacio, apologies I missed your responses here. I would agree with Martin about buying used hardware for as cheap as possible, but also understand the desire to have hardware you can promote into future OpenStack usage. Regarding networking, I started to use SFP+ cables like https://amzn.

[ceph-users] Re: ceph iscsi latency too high for esxi?

2020-10-04 Thread Golasowski Martin
Oh, thanks, that does not sound very encouraging. In our case it looked the same: we had to reboot three ESXi nodes via IPMI because they got stuck during an ordinary soft reboot. 1. RecoveryTimeout is set at 25 on our nodes 2. We have one two-port adapter per node (ConnectX-5) and 4 iSCSI GWs total,

[ceph-users] Re: Feedback for proof of concept OSD Node

2020-10-04 Thread Ignacio Ocampo
Hi Brian and Martin, Physical space isn't a constraint at this point, the only requirement I've in mind is to maintain a *low level of noise* (since the equipment will be in my office) and *if possible low energy consumption*. Based on my limited experience, the only downside with used hardware i

[ceph-users] Re: Feedback for proof of concept OSD Node

2020-10-04 Thread Brian Topping
Comments inline > On Oct 4, 2020, at 2:27 PM, Ignacio Ocampo wrote: > > Physical space isn't a constraint at this point, the only requirement I've in > mind is to maintain a low level of noise (since the equipment will be in my > office) and if possible low energy consumption. > > Based on my

[ceph-users] Re: ceph iscsi latency too high for esxi?

2020-10-04 Thread Maged Mokhtar
It is a load issue. Your combined load (client I/O, recovery, scrub) is higher than what your cluster can handle. Whereas some Ceph commands can block when things are very busy, VMware iSCSI is less tolerant, but it is not the problem. If you have charts, look at the metric for disk % utilization/bu
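
For readers without charts at hand, the disk % utilization figure mentioned above can be approximated on a Linux OSD host from two readings of /proc/diskstats. The following is only a rough sketch: the helper name `disk_busy_percent` and the two sample lines are made up for illustration and are not taken from the thread. It relies on the 13th whitespace-separated field of a diskstats line being the cumulative time (in ms) the device spent doing I/O, which is the same counter iostat uses for its %util column.

```python
def disk_busy_percent(sample_before: str, sample_after: str, interval_s: float) -> float:
    """Return the share of the interval (in percent) during which the device was busy.

    Both samples are single lines from /proc/diskstats for the same device,
    taken interval_s seconds apart.
    """
    # Fields: major, minor, name, then the stats. The 13th field (index 12)
    # is the cumulative "time spent doing I/Os" in milliseconds.
    busy_before_ms = int(sample_before.split()[12])
    busy_after_ms = int(sample_after.split()[12])
    return 100.0 * (busy_after_ms - busy_before_ms) / (interval_s * 1000.0)


# Hypothetical samples for one HDD, taken 5 seconds apart (values are invented).
before = "8 0 sda 1000 0 8000 500 2000 0 16000 900 0 10000 1400"
after = "8 0 sda 1500 0 12000 900 2600 0 20800 1500 3 14500 2400"

print(disk_busy_percent(before, after, 5.0))  # 4500 ms busy out of 5000 ms
```

A spinning disk that stays close to 100% here during scrub or recovery has little headroom left for client I/O, which would be consistent with the load explanation given in this reply.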

[ceph-users] Re: Feedback for proof of concept OSD Node

2020-10-04 Thread Anthony D'Atri
>> If you guys have any suggestions about used hardware that can be a good fit >> considering mainly low noise, please let me know. > > So we didn’t get these requirements initially, there’s no way for us to help > you when the requirements aren’t available for us to consider, even if we had >

[ceph-users] Re: Feedback for proof of concept OSD Node

2020-10-04 Thread Ignacio Ocampo
A cluster to serve actual workloads. I think Raspberry Pi and virtual machines are out of scope; this is not just for learning. Thanks! Ignacio Ocampo > On 4 Oct 2020, at 16:01, Anthony D'Atri wrote: > > >>> If you guys have any suggestions about used hardware that can be a good fit >>> co