Greg Ganger wrote:
> Yea, that would be the concern... perhaps it should be a config setting
> that one can set? [Where the DFS is fast, go there directly... where
> not, stage through the local disk.]
I think the original issue, the one Richard suspects prompted this,
may have gone away.
So if the data has to end up on DFS anyway, sending it there directly
eliminates the double copy. Right now (on SVN trunk), the data is written
completely to a local filesystem before being sent to DFS.
In this case, the bottleneck is the local disk, and the double copy
halves its effective throughput. Let's say the state file is 100 GB
and the disk has a throughput of 40 MB/s: writing the file and then
reading it back for the DFS upload means two full passes over the disk,
so storing it would theoretically take about 1.4 hours.
If we save directly to DFS, which has a throughput of about 70 MB/s,
it should theoretically finish in about 24 minutes.
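For concreteness, here is the back-of-the-envelope arithmetic behind those
two numbers (the sizes and throughputs are just the example figures above):

```python
# Rough transfer-time estimates for the two scenarios above.

STATE_SIZE_MB = 100 * 1000  # 100 GB state file, in MB


def transfer_minutes(size_mb, throughput_mb_s):
    """Minutes needed to move size_mb at the given throughput."""
    return size_mb / throughput_mb_s / 60


# Double copy: write to the 40 MB/s local disk, then read it back
# and push it to DFS -- two full passes over the slow disk.
double_copy = 2 * transfer_minutes(STATE_SIZE_MB, 40)

# Direct: a single pass straight to DFS at ~70 MB/s.
direct = transfer_minutes(STATE_SIZE_MB, 70)

print(f"double copy:   {double_copy:.0f} min")  # ~83 min, about 1.4 h
print(f"direct to DFS: {direct:.0f} min")       # ~24 min
```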
I am playing around with user-configurable suspend and resume handlers
which are free to stage the data however they wish. In my case, they
try to compress the VM state in different ways.
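As a rough sketch of what such a handler might do, the following pipes the
state stream through an external compressor while writing it out. The
function name and signature are hypothetical, not the project's actual
handler interface; plain gzip is used here, and swapping in pigz gives the
parallel-gzip variant:

```python
# Hypothetical suspend handler: stream the VM state through an external
# compressor into a destination file. This is a sketch under assumed
# interfaces, not the real handler API.
import shutil
import subprocess


def suspend_handler(state_stream, dest_path, compressor=("gzip", "-c")):
    """Compress state_stream with an external tool and write to dest_path.

    Passing compressor=("pigz", "-c", "-p", "4") would use parallel gzip
    with 4 threads instead (assuming pigz is installed).
    """
    with open(dest_path, "wb") as out:
        proc = subprocess.Popen(list(compressor),
                                stdin=subprocess.PIPE, stdout=out)
        # Stream the state into the compressor without buffering it all.
        shutil.copyfileobj(state_stream, proc.stdin)
        proc.stdin.close()
        return proc.wait()
```

Because the handler only sees a stream and a destination, the same shape
works whether the destination is a local staging file or a path on DFS.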
The workload is quite variable in nature; at times the hypervisor sends
more data than the CPU can compress, at other times the storage system
is the bottleneck.
For a VM with the above characteristics and uninitialized local storage,
actual suspend times ranged from 130 minutes using the default gzip and
double copying down to 45 minutes using parallel gzip and storing
directly onto DFS.
Greetings,
Michael.