@Vahric, FYI, if you use direct I/O instead of sync (which is how a database is configured by default), you will just be benchmarking the RBD cache. Look at the latency in your numbers: it is lower than the time it takes a packet to traverse the network. You'll need to use sync=1 if you want to see what the storage is really doing.
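A minimal fio job sketch of the two modes being compared; the filename, size, and runtime below are placeholders, not values from this thread:

```ini
; hypothetical fio job file -- filename/size/runtime are placeholders
[global]
filename=/mnt/rbd/testfile
size=4g
bs=4k
rw=randwrite
ioengine=libaio
runtime=60
time_based

[cached]
; direct=1 bypasses the page cache, but writes can still land in the RBD cache
direct=1

[durable]
; sync=1 (O_SYNC) forces each write to be durable, so the latency you see
; includes the real round trip to the OSDs
sync=1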
If fault domain is a concern, you can always split the cloud up into 3
regions, each with a dedicated Ceph cluster. It isn't necessarily going to
mean more hardware, just logical splits. This is kind of assuming that the
network doesn't share the same fault domain though.
Alternatively, you can
We are in the same boat: we can't get rid of ephemeral because of its speed and
independence. I get it, but it makes management of all these tiny pools a
scheduling and capacity nightmare.
Warren @ Walmart
On Wed, Feb 17, 2016 at 1:50 PM, Ned Rhudy (BLOOMBERG/ 731 LEX) <
erh...@bloomberg.net> wrote:
The only time we saw major performance issues with ephemeral (we're using
SSDs in RAID 0) was when we ran fio against a sparse file. It sounds like
you ran it against a properly filled file though, and it looks like you're
on a single spinning drive, based on the fio numbers. Can you confirm?
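One way to avoid the sparse-file pitfall is to fully write the test file before benchmarking against it; a hypothetical fio prefill job (the path and size are placeholders):

```ini
; hypothetical prefill job -- run this once before the real benchmark so the
; file is fully allocated rather than sparse
[prefill]
filename=/data/fio.dat
size=10g
rw=write
bs=1m
ioengine=libaio
iodepth=8
```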
I'm gonna forward this to my co-workers :) I've been kicking this idea
around for some time now, and it hasn't caught traction. I think it could
work for a modest overcommit, depending on the memory workload. We decided
that it should be possible to do this sanely, but that it needed testing.
I'm
Even though we're using Ceph as a backend, we still use qcow2 images as our
golden images, since a significant number (maybe a majority) of our users are
still using true ephemeral disks. It would be nice if glance were clever
enough to convert where appropriate.
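The conversion being wished for here is essentially a `qemu-img convert` from qcow2 to raw (Ceph-backed stores want raw so RBD can clone it). A minimal sketch; the helper name and file paths are ours, not glance's:

```python
import subprocess

def qcow2_to_raw(src, dst, run=subprocess.check_call):
    """Build (and optionally run) a qemu-img conversion from qcow2 to raw.

    Pass run=None to only build the command, e.g. for logging or dry runs.
    """
    cmd = ["qemu-img", "convert", "-f", "qcow2", "-O", "raw", src, dst]
    if run is not None:
        run(cmd)
    return cmd
```

Keeping the golden image in qcow2 and converting on upload per backend is the kind of policy glance itself could apply.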
Warren
On Thu, May 28, 2015
I would avoid co-locating Ceph and compute processes. Memory on compute
nodes is a scarce resource, assuming you're not running with any overcommit
(which you shouldn't). Ceph requires a fair amount of guaranteed memory
(2GB per OSD to be safe) to deal with recovery. You can certainly overload memory
and
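The back-of-the-envelope budget above can be sketched as follows; only the 2GB-per-OSD rule comes from this thread, while the host-overhead figure is our own assumption:

```python
def ceph_memory_reservation_gb(num_osds, per_osd_gb=2):
    """Memory to set aside for Ceph OSDs on a hyperconverged node.

    Uses the rule of thumb from the thread: ~2 GB per OSD to be safe
    during recovery.
    """
    return num_osds * per_osd_gb

def vm_memory_budget_gb(total_gb, num_osds, host_overhead_gb=4):
    """What's left for nova instances after Ceph and host overhead.

    host_overhead_gb is an assumed placeholder, not from the thread.
    """
    return total_gb - ceph_memory_reservation_gb(num_osds) - host_overhead_gb
```

For example, a 128GB node carrying 10 OSDs would leave roughly 104GB for instances under these assumptions, which is the kind of margin you'd want to check before co-locating.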
Your understanding is correct. I have the same problem as well. For now, my
plan is to just move cinder-volume to our more robust hosts, and run
database changes to modify the host, as needed.
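The database change alluded to above might look something like the following; the `volumes.host` column is part of cinder's schema, but the `host@backend#pool` strings are placeholders, and supported tooling is preferable to raw SQL where your release provides it:

```sql
-- hypothetical example: repoint volumes from a retired cinder-volume host
-- to its replacement; the host strings below are placeholders
UPDATE volumes
   SET host = 'newhost@rbd-backend#rbd-backend'
 WHERE host = 'oldhost@rbd-backend#rbd-backend';
```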
I have noticed a growing trend to replace the host parameter with a
generic, but I agree that this
It would be nice to have an archive status for images, where brand-new
instances could not be launched from an archived image, but it could still be
used as a reference for existing instances and snapshots.
Not sure about other folks, but we just keep piling up the old images, and
they have to remain public. There's