Phase I is complete! The brain of KPLUG's web and email server,
SparKPLUG, has been moved off its old hardware and into a fully
virtualized Xen virtual machine on our new server. This gives the
current setup more room to breathe, and now allows us to set up a
fresh new system in parallel and migrate the data over gradually.
Hooray for virtualization!
This migration wasn't as easy as I had hoped, but then again what
type of sysadmin work ever goes exactly as planned? Roadblocks came
from a number of directions - gaps in my experience, the 2.4 kernel
and outdated Debian system on the old hardware, and insufficient or
just plain inaccurate Xen documentation. For educational purposes,
here's a writeup of the migration process with some details that I
thought useful or interesting. Feel free to ask questions or make
suggestions.
We were starting with a Debian Sarge 3.1 system running a 2.4 kernel
on an old Pentium III based computer with a SCSI hard drive. Our
goal was to lift the entire Linux system, as is, and drop it into a
Xen virtual machine on our new Xeon-based (VT enabled) CentOS5 server
with as few modifications as possible. This is what's known as a P2V
- Physical To Virtual - migration. In commercial settings, vendors
such as VMware have nifty software tools that help with the process.
None of those were available to us, but that's no big deal because
they're really more useful for OSes that are hard to move without
breaking, such as Windows. Linux on the other hand is usually pretty
accepting of new hardware.
In my head there were a couple of possible methods of migrating the
data:
1) dd over netcat
2) tar over ssh (or netcat)
3) rsync over ssh
I was originally leaning toward 'dd' since I knew I could bring the
entire disk, partition table and all, over and just plop it in as a
Xen virtual disk image. However this ran contrary to my favored Xen
"best practice" of making a LVM partition for each virtual machine's
storage. Plus I was planning to perform the migration onsite at the
colo facility, and should something happen halfway through the dd
process I'd have to start over. Left with options 2 and 3, I chose
rsync since, given the proper switches, it could perform just as
complete a backup as tar, and should it fail part way through
it could pick right back up where it left off. The Xen wiki also
suggests rsync in their page on manual P2V migrations
(http://wiki.xensource.com/xenwiki/XenManualPtoVProcess).
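For reference, the two approaches I passed over would have looked
roughly like this - hostnames and device names are placeholders, not
our actual setup:
# Option 1: dd over netcat (start the listener on the new server)
newserver# nc -l -p 9000 > oldspark-disk.img
oldserver# dd if=/dev/sda bs=1M | nc newserver 9000
# Option 2: tar over ssh
oldserver# tar -cpf - --exclude=proc --exclude=tmp / | \
             ssh root@newserver 'tar -xpf - -C /mnt/OldSpark'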
So I went ahead and carved out a new logical volume for "OldSpark"
using LVM, shut down all services on the old SparKPLUG and fired up
the following rsync:
# rsync -av --numeric-ids -H -S -D --exclude-from="xenexcl" / \
[EMAIL PROTECTED]:/mnt/OldSpark/
The contents of the "xenexcl" file were as follows:
proc/*
tmp/*
lost+found/
etc/mtab
I originally had dev/* in there as well until I remembered that the
old Debian system did not use devfs and therefore we needed the
device nodes to migrate as well (hence the -D parameter for rsync).
I'm not sure the -H (hard links) and -S (sparse files) switches were
necessary, but I wanted to be thorough. The --numeric-ids switch for
rsync was critical to prevent it from trying to match up user/group
names between the Debian and RedHat systems.
This migration took quite a bit longer than I had hoped (~3 hours for
~8GB of data), and I'm not sure why - the two machines were connected
by 100Mbit Ethernet. During the first 1.5 hours, the new server was
simultaneously building a RAID1 md set, so that surely had some
effect, but it still doesn't explain it fully. No matter, I sat in
the corner and worked on other projects while I waited.
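(For a rough sense of scale: 100Mbit is roughly 11-12MB/sec, so ~8GB
should move in about 12 minutes at wire speed. My guess is the rest
went to rsync's per-file overhead across thousands of small files and
disk contention from the RAID rebuild, but I never measured it.)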
Once complete, I verified that everything had copied over as I had
expected and then swapped out the physical hardware and went home for
the night. The next day I would build the Xen config files and fire
it up!
In Xen, given hardware with virtualization extensions (Intel's VT or
AMD's Pacifica), you have two options on how to run a virtual Linux
machine:
Paravirtualized - The guest OS has a kernel compiled with special
extensions that let it work with the host hypervisor. This is the
preferred method from a performance and manageability standpoint.
Fully Virtualized - The guest OS runs completely unmodified and
thinks it has complete control of its hardware. Since the host has
to intercept and reroute fundamental system calls to give the guest
its "complete control," a performance hit is taken. This is the only
choice for virtualizing an OS such as Windows where you do not have
the ability to modify the kernel.
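A quick sanity check that the hypervisor actually sees those
extensions is to look at the capabilities Xen reports - the 'hvm'
entries mean full virtualization is available:
# xm info | grep xen_caps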
Though I could have retrofitted our old Debian system to run a Xen-
enabled kernel, I didn't want to put that effort into the obsolete
system, and instead chose to just run the old system image fully
virtualized. Here's what the Xen config file looks like for our
fully virtualized OldSpark system:
/etc/xen/oldspark.cfg
---
name = "oldspark"
builder = "hvm"
memory = "1024"
vcpus=1
disk = [ 'file:/var/lib/xen/images/oldspark,hda,w',
         'file:/var/lib/xen/images/knoppix37.iso,hdc:cdrom,r',
         'file:/var/lib/xen/images/oldsparkswap,hdb,w' ]
# boot = 'd'
vif = [ 'type=ioemu, bridge=xenbr0', ]
device_model = "/usr/lib64/xen/bin/qemu-dm"
kernel = "/usr/lib/xen/boot/hvmloader"
vnc=1
vncunused=1
apic=1
acpi=1
pae=1
serial = "pty" # enable serial console
on_reboot = 'restart'
on_crash = 'restart'
---
The keys to the full virtualization are the 'hvm' (Hardware Virtual
Machine) kernel and qemu device model. For gory technical details
behind how all this is done, you can read here:
http://www.linuxjournal.com/article/8909
What you see above is a copy of the working Xen config file... but
it's not quite what I started with. One of the first problems I ran
into was one of documentation - Xen has been around a while but full
virtualization only came with v3. Therefore 90% of what's written
about Xen (official docs as well as mailing lists and other related
articles found through Google) assumes a paravirtualization setup.
The LVM partition approach I used during the migration was based on
this, but wouldn't suffice for what I wanted to do.
Given a Xen-enabled (paravirtualized) kernel, a disk configuration
can be written like this:
disk = [ 'phy:/dev/G0/OldSpark,hda1,w' ]
See how I was able to specify a mapping to 'hda1'? With that, I
could simply pass a Xen-enabled kernel 'root=/dev/hda1' and it would
find the system per the config file and boot.
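To round out the picture, a complete paravirtualized config isn't
much longer than that one line. Something like the following, where
every name and path is purely illustrative:
---
name    = "guest"
memory  = 512
kernel  = "/boot/vmlinuz-2.6-xen"
ramdisk = "/boot/initrd-2.6-xen.img"
vif     = [ 'bridge=xenbr0' ]
disk    = [ 'phy:/dev/G0/guest,hda1,w' ]
root    = "/dev/hda1 ro"
---
Note that the kernel and ramdisk live on the dom0 host rather than
inside the guest's disk.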
But the fully virtualized setup isn't so granular - it can only
accept an entire disk image, which it maps to a faked up IDE
controller inside the virtual machine. The LVM volume I originally
migrated to was lacking such important things as a boot sector and
partition table! Of course Xen didn't bother to give any errors; it
just refused to do the mapping. That's why you see a Knoppix CD
image in the configuration above - booting it up inside the virtual
machine was an invaluable method for troubleshooting.
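For anyone following along, the boot-and-look cycle amounts to
something like this (with vncunused=1 the VNC display is whatever
qemu-dm grabs first, usually :0):
# xm create /etc/xen/oldspark.cfg
# xm list
# vncviewer localhost:0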
So I had to regroup - the LVM partition wasn't going to work as I had
configured it. I had to get the old SparkPLUG data laid out in a way
that Xen would accept as hda. This is a job for disk images!
I mocked up an entire hard drive layout inside a single file, and
configured Xen to use that image. (Had I used 'dd' in the original
P2V I would have already had such an image... but this was more
fun. :) Here's the process for those interested:
First create a ~10GB blank image file (note that I could have just
done "count=20480" to allocate the entire file up front, but the seek
trick instead creates a sparse file that will fill up as necessary
and only takes seconds to create):
# dd if=/dev/zero of=oldspark bs=516096c seek=20479 count=1
Next attach the image to a loop device:
# losetup /dev/loop0 oldspark
Now lay down a partition table. Since the file is not an actual hard
drive, the geometry has to be specified in the fdisk command line:
# fdisk -u -C20480 -S63 -H16 /dev/loop0
Inside fdisk, create a single partition that takes up the whole
image, leaving a partition table that looks like this:
---
Disk /dev/loop0: 10.5 GB, 10569646080 bytes
16 heads, 63 sectors/track, 20480 cylinders, total 20643840 sectors
Units = sectors of 1 * 512 = 512 bytes
      Device Boot      Start         End      Blocks   Id  System
/dev/loop0p1   *          63    20643839    10321888+  83  Linux
---
Note two important numbers in that table - the start sector of 63 and
block count of 10321888. Those are necessary to get the filesystem
formatted correctly.
Detach the image:
# losetup -d /dev/loop0
Then reattach it, 63 sectors in. To do this, give losetup an offset.
Since the sectors are 512 bytes, and 63 of them need to be skipped to
get to the beginning of the first partition, the offset is 63*512, or
32256:
# losetup -o32256 /dev/loop0 oldspark
Now format the filesystem, using the block count from above:
# mke2fs -b1024 -j /dev/loop0 10321888
At this point the disk was ready, so I copied the old server's
contents onto it, using the same command I did for the initial rsync:
# mount /dev/loop0 /mnt/images
# rsync -av --numeric-ids -H -S -D --exclude-from="xenexcl" \
    /mnt/oldspark/ /mnt/images/
# umount /mnt/images
# losetup -d /dev/loop0
I verified that Xen would accept this image as hda and it did... but
there was still one part missing. Because there are no ties between
the host and guest systems, our guest had to be 100% responsible for
booting itself, which means it needed a boot loader.
Again I turned to the Knoppix boot CD, booting it inside the Xen
guest along with the new hda image, and then chrooting to the Debian
system. From there I ran GRUB and told it to set up a fresh boot
sector. I also modified the GRUB configuration to consider the new
device names and edited /etc/fstab accordingly.
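The chroot and GRUB steps go roughly like this - a sketch, assuming
Knoppix sees the emulated IDE disk as /dev/hda:
# mkdir -p /mnt/debian
# mount /dev/hda1 /mnt/debian
# mount --bind /dev /mnt/debian/dev
# chroot /mnt/debian /bin/bash
# grub
grub> root (hd0,0)
grub> setup (hd0)
grub> quit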
At this point I shut the guest down, removed the Knoppix boot CD, and
fired it back up. Happily, I was presented with a GRUB bootloader
screen and my choice of kernel. Unhappily, that kernel immediately
panicked, complaining that it could not initialize the sym53c8xx
driver and could not find the disk containing its root filesystem.
This was because the original hardware was SCSI based and the initrd
had been created accordingly. Once again, I brought up Knoppix,
chrooted to the Debian install, and created an initrd with the
correct PIIX ATA drivers.
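The initrd rebuild inside that chroot is short, though the exact
module names depend on the kernel build - something along these
lines, with the kernel version as a placeholder:
# echo piix >> /etc/mkinitrd/modules
# echo ide-disk >> /etc/mkinitrd/modules
# mkinitrd -o /boot/initrd.img-2.4.27-2-686 2.4.27-2-686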
This time, booting into the virtual machine was successful!
SparKPLUG was alive again, and in its new home. There were two
remaining cleanup tasks to address. First, I had neglected to
consider swap when creating my disk image. At this point, the
easiest thing to do was create another disk image just as I had
above, mount it up in the virtual machine as hdb, and use that for swap.
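The swap image is just a smaller repeat of the same sparse-file
trick (the 1GB size here is illustrative):
# dd if=/dev/zero of=oldsparkswap bs=1M seek=1023 count=1
Hand it to the guest as hdb via the disk list in the config shown
earlier, then inside the guest:
# mkswap /dev/hdb
# swapon /dev/hdb
and give it a line in /etc/fstab ("/dev/hdb none swap sw 0 0") so it
comes back on reboot.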
The second task related to the network card driver. The virtual
machine actually had come up on the network just fine, but I noticed
during the boot that there were quite a few errors relating to the
network driver. Debian's 'discover' system had correctly detected
the emulated Realtek 8139 network card, and helpfully loaded the
8139too driver, which worked. But upon load the 8139too driver
recognized that the emulated chip was actually an "enhanced 8139C+"
and suggested that we use the 8139cp driver instead. On top of that,
later in the boot the 'hotplug' system *again* detected the network
hardware and tried to load the driver, not once but twice - once for
8139too and once for 8139cp.
My goal was to get the system to autoload the 8139cp driver *once*.
I started by adding a line to /etc/discover.conf telling it to skip
loading the 8139too driver, hoping that it would instead pick up the
8139cp. Unfortunately its device database seems to be old enough not
to realize that 8139cp was a valid alternative. That wasn't a problem,
since hotplug was more than eager to do the driver loading instead.
The only problem was that it was trying the 8139too driver before
8139cp, so I had to add 8139too to /etc/hotplug/blacklist.d/local,
which prevented it from loading.
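For the record, the two changes boil down to a couple of one-liners
(the discover.conf syntax is from memory, so double-check it against
the man page):
/etc/discover.conf:
skip 8139too
/etc/hotplug/blacklist.d/local:
8139too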
This was all slightly annoying because all along I'd had the line
"alias eth0 8139cp" in modules.conf, but those instructions only
count for the kernel. Strangely, the kernel is perfectly capable of
loading the network driver without help from discover or hotplug, so
I'm not sure why Debian is set up that way. I'll write it off to the
old distribution release and 2.4 kernel.
Anyhow, there we go -- SparKPLUG is now virtualized inside our new
server and running quite cleanly, happy to have the extra RAM and CPU
afforded by the new hardware. It's still not as snappy as I'd like
it to be, but part of that lies in some Plone/Zope work that needs to
be done, and part of that lies in the subsequent OS upgrade and
paravirtualization that we'll be doing. I'll be sure to keep the
list updated with progress!
--
Joshua Penix http://www.binarytribe.com
Binary Tribe Linux Integration Services & Network Consulting