Re: [smartos-discuss] High availability solutions with SmartOS

Joshua M. Clulow Mon, 28 Dec 2015 15:23:22 -0800

On 28 December 2015 at 14:47, Will Beazley <[email protected]> wrote:
> I imagine if you want to do it SmartOS would be an OS to do it. I guess it
> may require a mapping scheme so heterogeneous processes could be migrated.


It's much more likely that you would be able to migrate an entire
_zone_ than just one process.

One of the many issues with migrating a UNIX process from one machine
to another is the set of essentially opaque, integer-typed identifiers
that refer to resources outside the memory space of the process.  The
classical example is the process ID (pid), though the same applies to
other handles, such as file descriptors.
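
As a rough illustration of why such identifiers do not travel well,
here is a minimal Python sketch.  The function names are made up for
this example and nothing here is SmartOS-specific; it only assumes a
POSIX system.  It serialises a process's handles, releases the
underlying resources, and then deserialises them.

```python
import os
import pickle


def snapshot_and_restore():
    """Serialise a process's handles, release them, then deserialise."""
    read_fd, write_fd = os.pipe()
    handles = {"pid": os.getpid(), "read_fd": read_fd, "write_fd": write_fd}
    blob = pickle.dumps(handles)   # only the integers are captured
    os.close(read_fd)              # the kernel objects behind them go away
    os.close(write_fd)
    return handles, pickle.loads(blob)


def fd_is_live(fd):
    """Return True if fd currently names an open file in this process."""
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    before, after = snapshot_and_restore()
    print("integers survived:", before == after)
    print("read end still usable:", fd_is_live(after["read_fd"]))
```

The integers come back unchanged, but they no longer name anything:
the state they referred to lived in the kernel, not in the memory of
the process.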

In order to be able to migrate a zone from one machine to another, you
would need to preserve the pid of each mutually visible process within
that zone.  Processes become aware of each other through their pids,
and these relationships must be preserved.  Currently the pid
namespace is global to the entire machine; you would need to create a
per-zone unique namespace to prevent a newly migrated zone from
stepping on the pids of another zone already on the migration target.
The system would still require a globally unique pid underneath, so
you'd probably end up having a "shadow" pid within the zone; this would
complicate the implementation of ps(1), DTrace, every ptool, etc.  It
would also complicate the operator's view of the system (from the
global zone).
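
To make the "shadow" pid idea concrete, here is a toy model in
Python.  It is a sketch only: the Host class and its methods are
hypothetical and bear no relation to how illumos actually manages
pids.  Each zone keeps its own table from zone-visible pid to the
globally unique pid underneath.

```python
import itertools


class Host:
    """Toy machine: one global pid counter, one pid table per zone."""

    def __init__(self, first_global_pid=1):
        self._next_global = itertools.count(first_global_pid)
        self.zones = {}  # zone name -> {zone-visible pid: global pid}

    def spawn(self, zone, zone_pid=None):
        """Create a process; allocate a zone pid unless one is imposed."""
        table = self.zones.setdefault(zone, {})
        if zone_pid is None:
            zone_pid = max(table, default=0) + 1
        if zone_pid in table:
            raise ValueError(f"pid {zone_pid} already in use in {zone}")
        table[zone_pid] = next(self._next_global)
        return zone_pid, table[zone_pid]

    def export_zone(self, zone):
        """Detach a zone and return the pids its processes know about."""
        return sorted(self.zones.pop(zone))

    def import_zone(self, zone, zone_pids):
        """Recreate a zone's processes, preserving their zone-visible pids."""
        for zone_pid in zone_pids:
            self.spawn(zone, zone_pid=zone_pid)


if __name__ == "__main__":
    source, target = Host(), Host()
    for _ in range(3):
        source.spawn("web")
    target.spawn("db")
    target.spawn("db")
    target.import_zone("web", source.export_zone("web"))
    print(target.zones)
```

Inside the migrated zone the pids are unchanged, while the global
pids are freshly allocated on the target.  Every tool run from the
global zone then has to decide which of the two numbers to show.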

There are many other extra-process resources that you would need to
serialise and deserialise as well: a file descriptor really connects
you with a particular vnode within a filesystem, so you'd need to be
able to reconstitute that (even if unlinked) after presumably also
sending the ZFS dataset to the remote peer.  You would also need to
move the VNIC (MAC & IP) as well as all open connection state, with
the contents of the send and receive buffers, etc.  You would need to
unhook any inter-zone socket connections that have been fused together
for performance reasons.
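
Even the easiest case, a regular file that still has a name, needs
more than the descriptor number.  The following Python sketch (POSIX
only; FdRecord, capture and restore are names invented for this
example) records the minimum a receiver would need and reopens the
file from it.  It deliberately ignores unlinked files, sockets,
locks, and everything else mentioned above.

```python
import fcntl
import os
import tempfile
from dataclasses import dataclass


@dataclass
class FdRecord:
    path: str    # assumes the file is still linked under this name
    flags: int   # access mode and status flags reported by F_GETFL
    offset: int  # current file position


def capture(fd, path):
    """Record what is needed to reopen a regular file at the same position."""
    return FdRecord(
        path=path,
        flags=fcntl.fcntl(fd, fcntl.F_GETFL),
        offset=os.lseek(fd, 0, os.SEEK_CUR),
    )


def restore(record):
    """Reopen the file and seek to the saved offset.

    The new descriptor number may differ from the original one; a real
    implementation would also have to put it back in the same slot.
    """
    fd = os.open(record.path, record.flags)
    os.lseek(fd, record.offset, os.SEEK_SET)
    return fd


def demo():
    """Read part of a file, capture, close, restore, and read the rest."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "data.txt")
        with open(path, "wb") as handle:
            handle.write(b"hello world")
        fd = os.open(path, os.O_RDONLY)
        os.read(fd, 6)                 # consume b"hello "
        record = capture(fd, path)
        os.close(fd)
        new_fd = restore(record)
        try:
            return os.read(new_fd, 5)  # continues where the old fd stopped
        finally:
            os.close(new_fd)


if __name__ == "__main__":
    print(demo())
```

This works only because the path still resolves to the same vnode on
the same machine.  After a migration the path would have to resolve
to the same file in the received ZFS dataset, and an unlinked file
has no path to record at all.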

Even though the zone (instead of the process) could potentially make
for a crisper boundary along which to detach a set of processes (and
associated resources) and move them to another machine, it is almost
certainly not worth the trouble.  Such an architectural shift would
forever complicate the implementation of every operating system
feature added afterwards.  Each new feature would need to be
built with a view to being paused, serialised, deserialised and
resumed; there is little practical difference between this kind of
migration and a checkpoint/restart style facility.

It is cleaner, simpler, and more robust to do HA (whatever that means
for you) in your application.  There, you can make different and
nuanced decisions about consistency and availability for individual
abstract pieces of your application, even down to the level of
particular tables in a database, rather than trying to create a UNIX
host that magically spans the data centre.
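
What this looks like depends entirely on the application.  Purely as
a sketch, with table names and thresholds invented for illustration,
a per-table replication policy might be expressed like this in
Python:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    replicas: int    # copies of the table kept across machines
    write_acks: int  # acknowledgements required before a write succeeds


# Hypothetical tables: one favours consistency, one favours availability.
POLICIES = {
    "payments": Policy(replicas=3, write_acks=2),  # majority must agree
    "sessions": Policy(replicas=3, write_acks=1),  # any one replica will do
}


def write_accepted(table, acks_received):
    """Decide whether a write to `table` can be reported as successful."""
    policy = POLICIES[table]
    if not 0 <= acks_received <= policy.replicas:
        raise ValueError("acknowledgement count out of range")
    return acks_received >= policy.write_acks


if __name__ == "__main__":
    # With two of three replicas unreachable, only one ack arrives.
    for table in POLICIES:
        print(table, write_accepted(table, acks_received=1))
```

With only one replica reachable, writes to the hypothetical sessions
table still succeed while writes to payments are refused.  That is a
trade-off the application can make per table and the operating
system cannot make on its behalf.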

-- 
Joshua M. Clulow
UNIX Admin/Developer
http://blog.sysmgr.org

