Chip,

The issues with clock drift on the system VMs go further and deeper than
S3-backed Secondary Storage.  Essentially, anything the system VMs do that
involves time cannot be trusted.  Take, for example, the timestamps of files
written by the SSVM.  Bear in mind that it is possible for a system VM to
have a slow clock.  Therefore, in a worst-case scenario, the timestamp of
a file would be in the past, breaking any logic that scans for updated
files.  Additionally, correlating logs on a system VM with those of the
management server or other parts of the infrastructure is difficult to
impossible in these types of clock drift scenarios.  In summary, when time
in a distributed system gets skewed, a raft of subtle but significant
operational issues emerge.
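
To make the file timestamp case concrete, here is a minimal Python sketch
of a scanner that looks for files modified since its previous pass.  It is
an illustration only, not CloudStack code: StoredFile, find_updated,
simulate, and the file name are all hypothetical, and the times are plain
numbers rather than real clock readings.

```python
from dataclasses import dataclass


@dataclass
class StoredFile:
    # Hypothetical record of a file written by a system VM.
    name: str
    mtime: float  # stamped using the writer's (possibly skewed) clock


def find_updated(files, last_scan_time):
    """Return names of files whose mtime is later than the previous scan."""
    return [f.name for f in files if f.mtime > last_scan_time]


def simulate(writer_skew_seconds):
    # All times are seconds on the scanner's (reference) clock.
    last_scan_time = 1000.0   # scanner finished its previous pass
    true_write_time = 1060.0  # the file is really written 60 s later
    # The writer records the time according to its own clock.
    stamped = true_write_time + writer_skew_seconds
    files = [StoredFile("template.vhd", stamped)]
    return find_updated(files, last_scan_time)


print(simulate(0))     # accurate writer clock: the file is detected
print(simulate(-300))  # writer clock five minutes slow: the file is skipped
```

With an accurate clock the new file is picked up, but with the writer
running five minutes slow the file is stamped earlier than the last scan
and is never noticed, even though it was written afterwards.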

It is also important to note that fixing this issue involves more than
simply running NTP on the system VMs.  As I noted on the ticket, each
hypervisor has a recommended approach for ensuring clock synchronization
(e.g. VMware and KVM provide daemons and/or kernel drivers to sync clocks
properly).  The proper fix for the issue will be to implement those best
practices in each hypervisor-specific system VM ISO.  I think the biggest
challenge in implementing the fix will be testing rather than development.

Thanks,
-John



On Wed, May 15, 2013 at 10:18 AM, Chip Childers
<chip.child...@sungard.com> wrote:

> Starting a thread on this specific issue.
>
> CLOUDSTACK-2492 was opened, which is basically the fact that the System
> VMs aren't syncing time to the host or to an NTP server.  The S3
> integration is broken because of this problem, and therefore could not
> be considered a function available in 4.1 if we release as is.
>
> We need input from people that know about the current system VMs (the
> 3.x VMs), as well as the possibility of using the newer ones that we
> have been considering experimental for 4.1.0.
>
> What should we do?
>
> -chip
>
