On Thu, Mar 30, 2017 at 12:01:14AM +0200, Gabriel Marais wrote:
On Wed, Mar 29, 2017 at 6:01 PM, Stéphane Graber <stgra...@ubuntu.com>
wrote:
On Wed, Mar 29, 2017 at 03:13:36PM +0200, Gabriel Marais wrote:
Hi Guys
If this is the incorrect platform for this post, please point me in the
right direction.
We are in the process of deploying a small production environment with the following equipment:-
2 x Dell R430 servers each with 128GB Ram and 3 x 600GB SAS 10k drives
1 x Dell PowerVault MD3400 with
3 x 600GB 15k SAS Drives
3 x 6TB 7.2k Nearline SAS drives
The PowerVault is cabled directly to the Host Servers via Direct Attached
Storage, redundantly.
We would like to run a mixture of KVM and LXD containers on both Host
Servers.
The big question is, how do we implement the PowerVault (and to a certain extent the storage on the Host Servers themselves) to be most beneficial in this mixed environment.
I have a few ideas on what I could do, but since I don't have much experience with shared storage, I am probably just grasping at straws and would like to hear from others who probably have more experience than me.
Hi,
I'm not particularly familiar with the DELL PowerVault series, but it
looks like the other answers you've received so far have entirely missed
the "Direct Attached Storage" part of your description :)
For others reading this thread, this setup will effectively show up on both servers as directly attached disks through /dev/mapper (because of multipath), so there is no need to use any kind of networked storage on top of this.
The answer to your question I suspect will depend greatly on whether
you're dealing with a fixed number of VMs and containers, or if you
intend to spawn and delete them frequently. And also on whether you need
fast (no copy) migration of individual VMs and containers between the
two hosts.
We are not planning on having a fixed number of VMs and containers. It will keep growing as requirements grow, to as many as CPU and RAM will allow.
Migration of containers seems easy enough using the lxc migrate command, although we have only done migrations with containers running on a normal filesystem (e.g. ext4) on a host.
One approach is to have a physical partition per virtual machine. Meaning, to create physical partitions on the controller and assign them to both hosts. So let's say we create a 60GB physical partition on the controller to be used for a specific VM. That partition will show up as e.g. sdd on the host. We can then install the VM on that partition. If I understand you correctly?
Correct, though you should use something more unique than sdd, as those device names are sequential based on detection order and so can change after a reboot. Since you're in an HA setup, you should have multiple paths to each drive. Then something like multipathd will detect the different paths and set up a virtual device in /dev/mapper for you to use. That device is typically named after the WWN of the drive, which is a unique, stable identifier that you can rely on.
With this, you can then access the drive from either host (obviously
never from both at the same time), which means that should you want to
start the VM on the other host, you just need to stop the kvm process on
one and start it again on the other, without any data ever being moved.
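For example, the multipath maps and their WWN-based names can be checked with something like:
multipath -ll
ls -l /dev/mapper/
The resulting /dev/mapper/<wwn> path is then what you point kvm at from either host.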
I assume we would simply use XML (export / import) to create the VM on the other host, pointing to the partition sitting on the storage device?
Yep
For containers, it's a bit trickier as we don't support using a raw
block device as the root of the container. So you'd need LXD to either
use the host's local storage for the container root and then mount block
devices into those containers at paths that hold the data you care
about. Or you'd need to define a block device for each server in the
PowerVault and have LXD use that for storage (avoiding using the local
storage).
I like the idea of creating a block device on the storage for each
container and have LXD use that block device for a specific container. I'm
not sure how and if we would be able to simply migrate a container from one
host to the other (assuming that we would have those block devices
available to both hosts)...?
Right, so as I mentioned, you can't have an LXD container use a single partition from your storage array the way kvm lets you do.
So instead your best bet is probably to set up one chunk of storage for use for the container root filesystems and point LXD to that as the default storage pool.
For container data, you can then use a partition from your storage
array, format it and pass it to the LXD container with something like:
lxc config device add some-container mysql disk source=/dev/mapper/PARTITION path=/var/lib/mysql
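For example, the formatting step beforehand, assuming ext4 and the same placeholder name:
mkfs.ext4 /dev/mapper/PARTITION
After the device add, something like "lxc exec some-container -- df -h /var/lib/mysql" confirms it is mounted at the expected path inside the container.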
The obvious advantage of the second option is that should one of the servers go away for whatever reason, you'd be able to mount that server's LXD pool onto the other server and spawn a second LXD daemon on it, taking over the role of the dead server.
From what I've read, a particular host can only use one ZFS Pool. This
creates a limitation since we won't be able to create two pools - 1 for the
faster storage drives and 1 for the slower storage drives.
That's true of LXD prior to 2.9, which introduces our storage API and does allow you to use multiple different storage pools in whatever way you want.
On recent LXD, you can now do:
lxc storage create ssd zfs tank-ssd/lxd
lxc storage create spindle zfs tank-spindle/lxd
Which will define two storage pools, both using ZFS, each backed by a dataset on a different zpool for the containers.
You can choose which pool to use at launch time with:
lxc launch ubuntu:16.04 blah -s ssd
Or set a default pool for a profile with:
lxc profile device add default root disk path=/ pool=spindle
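For reference, the tank-ssd and tank-spindle zpools used above would be created beforehand on the multipath devices, roughly like this (device paths are placeholders):
zpool create tank-ssd /dev/mapper/<wwn-of-fast-volume>
zpool create tank-spindle /dev/mapper/<wwn-of-slow-volume>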
My initial planning was to:-
Option 1
----------------
Take the Fast Storage (3 x 600GB 15k SAS drives) and configure them on the
controller as RAID5, split them into 2 and give each host a partition (sdx)
of +/- 600GB
Take the Slower Storage (3 x 6TB 7.2k drives) and configure them on the
controller as RAID5, split them into 2 and give each host a partition (sdy)
of +/- 6TB
Set up LVM with two volume groups, e.g. Vol0 for fast storage (300GB) and Vol1 for slow storage (3TB).
Create logical volumes as needed in terms of disk space per container and VM, and install the containers and VMs onto those logical volumes.
That would work fine, though I'd recommend you use ZFS rather than LVM.
ZFS is much better whenever you have to deal with files, as containers do, but it also supports creating block devices for use with virtual machines.
You can even configure libvirt to directly allocate such ZFS volumes as needed (if you're using libvirt); otherwise, they can be created with "zfs create -V".
Option 2
----------------
The other option I was thinking of was to create partitions on the controller and split the storage up so we could use, say:-
150GB Fast Storage as a ZFS Pool for containers (that needs disk speed)
1.5TB Slow Storage as a ZFS Pool for containers (that doesn't need as much
disk speed)
1.5TB Slow Storage with LVM for VMs
and the same with the Slower Storage, but then as far as I know we can only have one ZFS Pool per host. So that's not going to work...?
So if you're willing to use the non-LTS branch of LXD, you can get the
multi-pool support I described above.
Though again, I'd recommend just putting everything on ZFS so you don't
need to guess how much space you need for containers vs VMs and can run
everything using the same storage technology.
Option 3
----------------
The last option I had in mind was to create the partitions on the controller and assign them to their respective hosts (having sda, sdb, sdc etc.)
That way, we could select whether we wanted Fast or Slow storage for a
partition and have LXD and KVM use the partition for the installation.
At least with LVM we could still benefit from snapshots.
Option 4
----------------
Create a partition per host on the controller, use ZFS on those partitions and set up LXD to use them, selecting either slower or faster storage for the purpose.
Create the same size partition on both hosts.
That way we can benefit from live migrations and snapshots.
Use LVM for the VMs and/or data mount points within the VMs / Containers.
So it sounds like your best bet may be (a rough command sketch follows below):
- Set up two "fast" volumes on the PowerVault, give one to each server
- Set up two "slow" volumes on the PowerVault, give one to each server
- Set up the hosts to each have:
  - external-fast zpool
  - external-slow zpool
  - internal-fast zpool
- Then for each zpool, create a "vm" and a "lxd" dataset. If using the
newer LXD, then create 3 LXD pools using those "lxd" datasets.
- You can then set quotas for the "vm" and "lxd" datasets as needed,
tweak compression settings and any other ZFS option.
- You can then choose, for every container and virtual machine, what kind of storage to give them, and can create additional volumes to attach to them as needed (for example, a container that's on the limited fast storage could get a big chunk of storage attached from the slower storage pool for a given path).
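To make that concrete, here is a rough sketch of the commands involved, assuming a recent LXD with the storage API; all device paths, pool names, container names and sizes below are placeholders:
# one zpool per chunk of storage
zpool create external-fast /dev/mapper/<wwn-fast-volume>
zpool create external-slow /dev/mapper/<wwn-slow-volume>
zpool create internal-fast /dev/sdX
# a "vm" and an "lxd" dataset per pool
zfs create external-fast/vm
zfs create external-fast/lxd
# (repeat for external-slow and internal-fast)
# LXD storage pools on the "lxd" datasets (LXD 2.9+)
lxc storage create external-fast zfs source=external-fast/lxd
lxc storage create external-slow zfs source=external-slow/lxd
lxc storage create internal-fast zfs source=internal-fast/lxd
# pick a pool at launch time
lxc launch ubuntu:16.04 db01 -s external-fast
# attach a big chunk of slow storage to a container living on the fast pool
zfs create -V 500G external-slow/vm/db01-data
mkfs.ext4 /dev/zvol/external-slow/vm/db01-data
lxc config device add db01 data disk source=/dev/zvol/external-slow/vm/db01-data path=/srv/data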
Hope that helps.