Re: [Qemu-devel] [PATCH 00/15] optimize Qemu RSS usage

2016-11-01 Thread Michael R. Hines
On 10/31/2016 05:00 PM, Michael R. Hines wrote: On 10/18/2016 05:47 AM, Peter Lieven wrote: Am 12.10.2016 um 23:18 schrieb Michael R. Hines: Peter, Greetings from DigitalOcean. We're experiencing the same symptoms without this patch. We have, collectively, many gigabytes of un-planne

Re: [Qemu-devel] [PATCH 00/15] optimize Qemu RSS usage

2016-10-31 Thread Michael R. Hines
On 10/18/2016 05:47 AM, Peter Lieven wrote: Am 12.10.2016 um 23:18 schrieb Michael R. Hines: Peter, Greetings from DigitalOcean. We're experiencing the same symptoms without this patch. We have, collectively, many gigabytes of un-planned-for RSS being used per-hypervisor that we would

Re: [Qemu-devel] [PATCH 00/15] optimize Qemu RSS usage

2016-10-19 Thread Michael R. Hines
Thank you for the response! I'll run off and test that. =) /* * Michael R. Hines * Senior Engineer, DigitalOcean. */ On 10/18/2016 05:47 AM, Peter Lieven wrote: Am 12.10.2016 um 23:18 schrieb Michael R. Hines: Peter, Greetings from DigitalOcean. We're experiencing the sam

Re: [Qemu-devel] [PATCH 00/15] optimize Qemu RSS usage

2016-10-12 Thread Michael R. Hines
eline for it? - Michael /* * Michael R. Hines * Senior Engineer, DigitalOcean. */ On 06/28/2016 04:01 AM, Peter Lieven wrote: I recently found that Qemu is using several hundred megabytes of RSS memory more than older versions such as Qemu 2.2.0. So I started tracing memory allocation and fo

Re: [Qemu-devel] [PATCH 0/3] RDMA error handling

2016-09-23 Thread Michael R. Hines
Reviewed-by: Michael R. Hines (By the way, I no longer work for IBM and no longer have direct access to RDMA hardware. If someone is willing to let me login to something that does in the future, I don't mind debugging things. I just don't have any hardware of my own anymore to debu

Re: [Qemu-devel] An RDMA race?

2016-01-09 Thread Michael R. Hines
the COLO guys are doing). Maybe we can coalesce around something? - Michael On 01/04/2016 12:15 PM, Dr. David Alan Gilbert wrote: * Michael R. Hines (mhi...@digitalocean.com) wrote: Adding such a control message would defeat the benefits of RDMA, as there shouldn't be any signalling i

Re: [Qemu-devel] An RDMA race?

2015-12-19 Thread Michael R. Hines
And, yes, out-of-order messages are totally fine - we just have to be careful with the design. - Michael On Sun, Dec 20, 2015 at 3:08 PM, Michael R. Hines wrote: > Adding such a control message would defeat the benefits of RDMA, as there > shouldn't be any signalling in the actu

Re: [Qemu-devel] An RDMA race?

2015-12-19 Thread Michael R. Hines
ch up? > > Yes I think so; I also added a sequence number to the 'ready' messages > to check I wasn't losing one. > I had a chat to one of our RDMA guys (Doug Ledford) and he said > it's perfectly legal for RDMA to take longer to return the signal > from the send than for the round trip of the destination responding; > the 'signal' doesn't happen until an ack has been received from the > destination card anyway, so the ack can get delayed or retried. > So I think we do need to fix this; the question then is how do we fix > it for all control messages without breaking anything else. Are there > any cases that rely on having received the signal from the send before > continuing, or could I just do what I'm doing for all control messages? > > Dave > > > - Michael > -- > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK > -- /* * Michael R. Hines * https://michael.hinespot.com */

Re: [Qemu-devel] An RDMA race?

2015-12-11 Thread Michael R. Hines
David, Thanks for including my email directly. It helps a lot. Below, I'm going to assume that only "dest" is calling qemu_rdma_exchange_recv() and only src is calling qemu_rdma_exchange_send(), since you didn't specify who is sending and who is receiving. If that assumption is wrong, please res

Re: [Qemu-devel] [PATCH COLO-BLOCK v7 00/17] Block replication for continuous checkpoints

2015-07-08 Thread Michael R. Hines
On 07/07/2015 08:38 PM, Wen Congyang wrote: On 07/08/2015 12:56 AM, Michael R. Hines wrote: On 07/07/2015 04:23 AM, Paolo Bonzini wrote: On 07/07/2015 11:13, Dr. David Alan Gilbert wrote: This log is very strange. The NBD client connects to NBD server, and NBD server wants to read data from

Re: [Qemu-devel] [PATCH COLO-BLOCK v7 00/17] Block replication for continuous checkpoints

2015-07-07 Thread Michael R. Hines
On 07/07/2015 04:23 AM, Paolo Bonzini wrote: On 07/07/2015 11:13, Dr. David Alan Gilbert wrote: This log is very strange. The NBD client connects to NBD server, and NBD server wants to read data from NBD client, but reading fails. It seems that the connection is closed unexpectedly. Can you gi

Re: [Qemu-devel] [PATCH COLO-BLOCK v7 00/17] Block replication for continuous checkpoints

2015-07-06 Thread Michael R. Hines
On 07/04/2015 07:46 AM, Wen Congyang wrote: At 2015/7/3 23:30, Dr. David Alan Gilbert Wrote: * Wen Congyang (we...@cn.fujitsu.com) wrote: Block replication is a very important feature which is used for continuous checkpoints(for example: COLO). Usage: Please refer to docs/block-replication.txt

Re: [Qemu-devel] [PATCH COLO-BLOCK v7 00/17] Block replication for continuous checkpoints

2015-07-02 Thread Michael R. Hines
On 06/29/2015 10:34 PM, Wen Congyang wrote: Block replication is a very important feature which is used for continuous checkpoints(for example: COLO). Usage: Please refer to docs/block-replication.txt You can get the patch here: https://github.com/wencongyang/qemu-colo/commits/block-replication

Re: [Qemu-devel] [PATCH COLO-BLOCK v7 13/17] docs: block replication's description

2015-07-02 Thread Michael R. Hines
On 06/29/2015 10:34 PM, Wen Congyang wrote: Signed-off-by: Wen Congyang Signed-off-by: Yang Hongyang Signed-off-by: zhanghailiang Signed-off-by: Gonglei --- docs/block-replication.txt | 179 + 1 file changed, 179 insertions(+) create mode 10064

Re: [Qemu-devel] [PATCH COLO-BLOCK v7 00/17] Block replication for continuous checkpoints

2015-07-02 Thread Michael R. Hines
Is this up to date: On 06/29/2015 10:34 PM, Wen Congyang wrote: Block replication is a very important feature which is used for continuous checkpoints(for example: COLO). Usage: Please refer to docs/block-replication.txt You can get the patch here: https://github.com/wencongyang/qemu-colo/comm

Re: [Qemu-devel] [RFC PATCH COLO v2 00/13] Block replication for continuous checkpoints

2015-07-01 Thread Michael R. Hines
On 06/30/2015 11:11 PM, Wen Congyang wrote: On 07/01/2015 11:09 AM, Michael R. Hines wrote: On 03/25/2015 04:36 AM, Wen Congyang wrote: Block replication is a very important feature which is used for continuous checkpoints(for example: COLO). Usage: Please refer to docs/block-replication.txt

Re: [Qemu-devel] [RFC PATCH COLO v2 00/13] Block replication for continuous checkpoints

2015-06-30 Thread Michael R. Hines
On 03/25/2015 04:36 AM, Wen Congyang wrote: Block replication is a very important feature which is used for continuous checkpoints(for example: COLO). Usage: Please refer to docs/block-replication.txt You can get the patch here: https://github.com/wencongyang/qemu-colo/commits/block-replication

Re: [Qemu-devel] [PATCH v2 06/12] Translate offsets to destination address space

2015-06-12 Thread Michael R. Hines
On 06/12/2015 01:50 PM, Dr. David Alan Gilbert wrote: * Michael R. Hines (mrhi...@linux.vnet.ibm.com) wrote: On 06/11/2015 01:58 PM, Dr. David Alan Gilbert wrote: * Michael R. Hines (mrhi...@linux.vnet.ibm.com) wrote: On 06/11/2015 12:17 PM, Dr. David Alan Gilbert (git) wrote: From: "

Re: [Qemu-devel] [PATCH v2 06/12] Translate offsets to destination address space

2015-06-11 Thread Michael R. Hines
On 06/11/2015 01:58 PM, Dr. David Alan Gilbert wrote: * Michael R. Hines (mrhi...@linux.vnet.ibm.com) wrote: On 06/11/2015 12:17 PM, Dr. David Alan Gilbert (git) wrote: From: "Dr. David Alan Gilbert" The 'offset' field in RDMACompress and 'current_addr' field in

Re: [Qemu-devel] [PATCH v2 10/12] Sort destination RAMBlocks to be the same as the source

2015-06-11 Thread Michael R. Hines
"vs %" PRIu64, local->block[i].block_name, i, +local->block[i].length, +rdma->dest_blocks[i].length); return -EINVAL; } +local->block[i].remote_host_addr =

Re: [Qemu-devel] [PATCH v2 09/12] Rework ram block hash

2015-06-11 Thread Michael R. Hines
; RDMA_WRID_MAX; idx++) { ret = qemu_rdma_reg_control(rdma, idx); if (ret) { You didn't want to use the ID string as a key? I forget Reviewed-by: Michael R. Hines

Re: [Qemu-devel] [PATCH v2 08/12] Allow rdma_delete_block to work without the hash

2015-06-11 Thread Michael R. Hines
mu_rdma_cleanup(RDMAContext *rdma) if (rdma->local_ram_blocks.block) { while (rdma->local_ram_blocks.nb_blocks) { -rdma_delete_block(rdma, rdma->local_ram_blocks.block->offset); +rdma_delete_block(rdma, &rdma->local_ram_blocks.block[0]); }

Re: [Qemu-devel] [PATCH v2 07/12] Rework ram_control_load_hook to hook during block load

2015-06-11 Thread Michael R. Hines
On 06/11/2015 12:17 PM, Dr. David Alan Gilbert (git) wrote: From: "Dr. David Alan Gilbert" We need the names of RAMBlocks as they're loaded for RDMA, reuse a slightly modified ram_control_load_hook: a) Pass a 'data' parameter to use for the name in the block-reg case b) Only some ho

Re: [Qemu-devel] [PATCH v2 06/12] Translate offsets to destination address space

2015-06-11 Thread Michael R. Hines
On 06/11/2015 12:17 PM, Dr. David Alan Gilbert (git) wrote: From: "Dr. David Alan Gilbert" The 'offset' field in RDMACompress and 'current_addr' field in RDMARegister are commented as being offsets within a particular RAMBlock, however they appear to actually be offsets within the ram_addr_t sp

Re: [Qemu-devel] [PATCH v2 04/12] rdma typos

2015-06-11 Thread Michael R. Hines
On 06/11/2015 12:17 PM, Dr. David Alan Gilbert (git) wrote: From: "Dr. David Alan Gilbert" A couple of typo fixes. Signed-off-by: Dr. David Alan Gilbert --- migration/rdma.c | 6 +++--- trace-events | 4 ++-- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/migration/rdm

Re: [Qemu-devel] [PATCH 08/10] Rework ram block hash

2015-05-19 Thread Michael R. Hines
On 05/19/2015 01:55 PM, Dr. David Alan Gilbert wrote: I would like to keep the ramblock list directly addressable by hash on both sides, because, as I mentioned earlier, we want as much flexibility in registering RAMBlock memory as possible by being able to add or delete arbitrary blocks into the

Re: [Qemu-devel] [PATCH 04/10] Translate offsets to destination address space

2015-05-19 Thread Michael R. Hines
On 05/19/2015 01:44 PM, Dr. David Alan Gilbert wrote: * Michael R. Hines (mrhi...@linux.vnet.ibm.com) wrote: On 04/20/2015 10:57 AM, Dr. David Alan Gilbert (git) wrote: From: "Dr. David Alan Gilbert" The 'offset' field in RDMACompress and 'current_addr' field in

Re: [Qemu-devel] [PATCH 10/10] Sanity check RDMA remote data

2015-05-19 Thread Michael R. Hines
: bad chunk for block %s" +" chunk: %" PRIx64, +block->block_name, reg->key.chunk); +ret = -ERANGE; + break; +} } chunk_start = ram_chunk_start(block, chunk); chunk_end = ram_chunk_end(block, chunk + reg->chunks); Reviewed-by: Michael R. Hines

Re: [Qemu-devel] [PATCH 09/10] Sort destination RAMBlocks to be the same as the source

2015-05-19 Thread Michael R. Hines
On 04/20/2015 10:57 AM, Dr. David Alan Gilbert (git) wrote: From: "Dr. David Alan Gilbert" Use the order of incoming RAMBlocks from the source to record an index number; that then allows us to sort the destination local RAMBlock list to match the source. Now that the RAMBlocks are known to be

Re: [Qemu-devel] [PATCH 08/10] Rework ram block hash

2015-05-19 Thread Michael R. Hines
On 04/20/2015 10:57 AM, Dr. David Alan Gilbert (git) wrote: From: "Dr. David Alan Gilbert" RDMA uses a hash from block offset->RAM Block; this isn't needed on the destination, and now that the destination sorts the ramblock list, is harder to maintain. Destination sorts the ramblock list? Is

Re: [Qemu-devel] [PATCH 07/10] Simplify rdma_delete_block and remove it's dependence on the hash

2015-05-19 Thread Michael R. Hines
On 04/20/2015 10:57 AM, Dr. David Alan Gilbert (git) wrote: From: "Dr. David Alan Gilbert" rdma_delete_block is currently very general, but it's only used in cleanup at the end. Simplify it and remove its dependence on the hash table and remove all of the hash-table regeneration designed to

Re: [Qemu-devel] [PATCH 05/10] Rework ram_control_load_hook to hook during block load

2015-05-19 Thread Michael R. Hines
On 04/20/2015 10:57 AM, Dr. David Alan Gilbert (git) wrote: From: "Dr. David Alan Gilbert" We need the names of RAMBlocks as they're loaded for RDMA, reuse an existing QEMUFile hook with some small mods. Signed-off-by: Dr. David Alan Gilbert --- arch_init.c | 4 +++- inc

Re: [Qemu-devel] [PATCH 06/10] Remove unneeded memset

2015-05-19 Thread Michael R. Hines
rdma->current_chunk = -1; Reviewed-by: Michael R. Hines

Re: [Qemu-devel] [PATCH 04/10] Translate offsets to destination address space

2015-05-19 Thread Michael R. Hines
On 04/20/2015 10:57 AM, Dr. David Alan Gilbert (git) wrote: From: "Dr. David Alan Gilbert" The 'offset' field in RDMACompress and 'current_addr' field in RDMARegister are commented as being offsets within a particular RAMBlock, however they appear to actually be offsets within the ram_addr_t sp

Re: [Qemu-devel] [PATCH 03/10] Store block name in local blocks structure

2015-05-19 Thread Michael R. Hines
" end: %" PRIu64 " bits %" PRIu64 " chunks %d" rdma_delete_block(int block, uint64_t addr, uint64_t offset, uint64_t len, uint64_t end, uint64_t bits, int chunks) "Deleted Block: %d, addr: %" PRIu64 ", offset: %" PRIu64 " length: %" PRIu64 " end: %" PRIu64 " bits %" PRIu64 " chunks %d" rdma_start_incoming_migration(void) "" rdma_start_incoming_migration_after_dest_init(void) "" Reviewed-by: Michael R. Hines

Re: [Qemu-devel] [PATCH 02/10] qemu_ram_foreach_block: pass up error value, and down the ramblock name

2015-05-19 Thread Michael R. Hines
(opaque, host_addr, block_offset, length); } /* Shame on me for not checking the return value =) Reviewed-by: Michael R. Hines

Re: [Qemu-devel] [PATCH 01/10] Rename RDMA structures to make destination clear

2015-05-19 Thread Michael R. Hines
return -EINVAL; } local->block[j].remote_host_addr = -rdma->block[i].remote_host_addr; -local->block[j].remote_rkey = rdma->block[i].remote_rkey; +rdma->dest_blocks[i].remote_host_addr; +local->block[j].remote_rkey = rdma->dest_blocks[i].remote_rkey; break; } Good to get these renamed, thanks. Reviewed-by: Michael R. Hines

Re: [Qemu-devel] [PATCH 00/10] Remove RDMA migration dependence on RAMBlock offset

2015-05-18 Thread Michael R. Hines
On 04/20/2015 10:57 AM, Dr. David Alan Gilbert (git) wrote: From: "Dr. David Alan Gilbert" RDMA migration currently relies on the source and destination RAMBlocks having the same offsets within ram_addr_t space; unfortunately that's just not true when: a) You hotplug on the source but then

Re: [Qemu-devel] [Bug 1363641] Re: Build of v2.1.0 fails on armv7l due to undeclared __NR_select

2014-09-23 Thread Michael R. Hines
On 09/09/2014 02:09 AM, Dr. David Alan Gilbert wrote: (cc'ing Michael Hines who owns and knows the RDMA code) * Karl-Philipp Richter (krichter...@aol.de) wrote: ** Description changed: After `make clean` and `git clean -x -f -d` `git checkout v2.1.0 && configure --prefix=/home/user/prefi

Re: [Qemu-devel] ballooning not working on hotplugged pc-dimm

2014-09-10 Thread Michael R. Hines
On 09/11/2014 02:22 PM, Paolo Bonzini wrote: Il 11/09/2014 03:57, Michael R. Hines ha scritto: Why does hotplugging use a different name? This also affects RDMA live migration - we are explicitly looking up "pc.ram" ram blocks and pinning them for memory registration with Linux.

Re: [Qemu-devel] ballooning not working on hotplugged pc-dimm

2014-09-10 Thread Michael R. Hines
On 09/10/2014 05:00 PM, zhanghailiang wrote: On 2014/9/9 11:05, Alexandre DERUMIER wrote: Hello, I was playing with pc-dimm hotplug, and I notice that ballooning is not working on memory space of pc-dimm devices. example: qemu -m size=1024,slots=255,maxmem=15000M #free -m : 1024M -> qmp ba

Re: [Qemu-devel] Microcheckpointing: Memory-VCPU / Disk State consistency

2014-09-10 Thread Michael R. Hines
for any ideas Walid Am 17.08.2014 11:52, schrieb Paolo Bonzini: Il 11/08/2014 22:15, Michael R. Hines ha scritto: Excellent question: QEMU does have a feature called "drive-mirror" in block/mirror.c that was introduced a couple of years ago. I'm not sure what the adoption rate

Re: [Qemu-devel] Microcheckpointing: Memory-VCPU / Disk State consistency

2014-08-14 Thread Michael R. Hines
On 08/14/2014 06:58 PM, Dr. David Alan Gilbert wrote: cc'ing in a couple of the COLOers. Thanks, David. Glad to see their patches in last month - I need to take a look at them. The 2013 paper says: 'COLO modifies the guest OS’s TCP/IP stack in order to make the behavior more deterministic.

Re: [Qemu-devel] Microcheckpointing: Memory-VCPU / Disk State consistency

2014-08-13 Thread Michael R. Hines
On 08/13/2014 10:03 PM, Walid Nouri wrote: While looking to find some ideas for approaches to replicating block devices I have read the paper about the Remus implementation. I think MC can take a similar approach for local disk. I agree. Here are the main facts that I have understood: L

Re: [Qemu-devel] Microcheckpointing: Memory-VCPU / Disk State consistency

2014-08-11 Thread Michael R. Hines
irectly build into Qemu? Walid Am 09.08.2014 14:25, schrieb Michael R. Hines: On Sat, 2014-08-09 at 14:08 +0200, Walid Nouri wrote: Hi Michael, how is the weather in Beijing? :-) It's terrible. Lots of pollution =( May I ask you some questions to your MC implementation? Currently I'm trying

Re: [Qemu-devel] [RFC] COLO HA Project proposal

2014-07-09 Thread Michael R. Hines
On 07/03/2014 11:42 AM, Hongyang Yang wrote: I wonder if there is any way to coordinate this between COLO, Michael Hines microcheckpointing and the two separate reverse-execution projects that also need to do some similar things. Are there any standard APIs for the heartbeat thing we can already

Re: [Qemu-devel] [PATCH] rdma: bug fixes

2014-06-12 Thread Michael R. Hines
On 02/18/2014 10:34 AM, mrhi...@linux.vnet.ibm.com wrote: From: "Michael R. Hines" 1. Fix small memory leak in parsing inet address from command line in data_init() 2. Fix ibv_post_send() return value check and pass error code back up correctly. 3. Fix rdma_destroy_qp() segfault aft

Re: [Qemu-devel] [PATCH] rdma: Fix block during rdma migration

2014-05-14 Thread Michael R. Hines
On 05/09/2014 12:25 PM, Gonglei (Arei) wrote: Hi, -Original Message- From: Michael R. Hines [mailto:mrhi...@linux.vnet.ibm.com] Sent: Tuesday, April 01, 2014 8:42 AM To: Gonglei (Arei); qemu-devel@nongnu.org Cc: Huangweidong (C); quint...@redhat.com; dgilb...@redhat.com; owass

Re: [Qemu-devel] [Qemu-trivial] [PATCH] arch_init.c: remove duplicate function

2014-04-14 Thread Michael R. Hines
the vector-optimized version like migration does? Just for RDMA: Reviewed-by: Michael R. Hines - Michael

Re: [Qemu-devel] [RFC PATCH v2 10/12] mc: expose tunable parameter for checkpointing frequency

2014-04-10 Thread Michael R. Hines
On 04/04/2014 10:56 PM, Eric Blake wrote: On 04/03/2014 11:29 PM, Michael R. Hines wrote: I'm trying to think of a back-compat method, which exploits the fact that we now have flat unions (something we didn't have when migrate-set-capabilities was first added). Maybe something like

Re: [Qemu-devel] [RFC PATCH v2 10/12] mc: expose tunable parameter for checkpointing frequency

2014-04-03 Thread Michael R. Hines
On 03/12/2014 06:49 AM, Eric Blake wrote: On 03/11/2014 04:15 PM, Juan Quintela wrote: Eric Blake wrote: On 02/18/2014 01:50 AM, mrhi...@linux.vnet.ibm.com wrote: From: "Michael R. Hines" We're building up a LOT of migrate- tunable commands. Maybe it's time to think a

Re: [Qemu-devel] [RFC PATCH v2 11/12] mc: introduce new capabilities to control micro-checkpointing

2014-04-03 Thread Michael R. Hines
On 03/12/2014 06:07 AM, Eric Blake wrote: On 03/11/2014 04:02 PM, Juan Quintela wrote: mrhi...@linux.vnet.ibm.com wrote: From: "Michael R. Hines" +# @mc-net-disable: Deactivate network buffering against outbound network +# traffic while Micro-Checkpointing (@mc)

Re: [Qemu-devel] [RFC PATCH v2 11/12] mc: introduce new capabilities to control micro-checkpointing

2014-04-03 Thread Michael R. Hines
On 03/12/2014 06:02 AM, Juan Quintela wrote: mrhi...@linux.vnet.ibm.com wrote: From: "Michael R. Hines" New capabilities include the use of RDMA acceleration, use of network buffering, and keepalive support, as documented in patch #1. Signed-off-by: Michael R. Hines --- qapi-s

Re: [Qemu-devel] [RFC PATCH v2 07/12] mc: introduce additional QMP statistics for micro-checkpointing

2014-04-03 Thread Michael R. Hines
On 03/12/2014 05:59 AM, Juan Quintela wrote: mrhi...@linux.vnet.ibm.com wrote: From: "Michael R. Hines" MC provides a lot of new information, including the same RAM statistics that ordinary migration does, so we centralize a lot of that printing code into a common function so th

Re: [Qemu-devel] [RFC PATCH v2 06/12] mc: introduce state machine changes for MC

2014-04-03 Thread Michael R. Hines
On 03/12/2014 05:57 AM, Juan Quintela wrote: mrhi...@linux.vnet.ibm.com wrote: From: "Michael R. Hines" This patch sets up the initial changes to the migration state machine and prototypes to be used by the checkpointing code to interact with the state machine so that we can la

Re: [Qemu-devel] [RFC PATCH v2 11/12] mc: introduce new capabilities to control micro-checkpointing

2014-04-03 Thread Michael R. Hines
On 03/12/2014 05:57 AM, Eric Blake wrote: --- qapi-schema.json | 36 +++- 1 file changed, 35 insertions(+), 1 deletion(-) +# Only for performance testing. (Since 2.x) +# +# @mc-rdma-copy: MC requires creating a local-memory checkpoint before +#

Re: [Qemu-devel] [RFC PATCH v2 10/12] mc: expose tunable parameter for checkpointing frequency

2014-04-03 Thread Michael R. Hines
On 03/12/2014 05:49 AM, Eric Blake wrote: diff --git a/hmp-commands.hx b/hmp-commands.hx index f3fc514..2066c76 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -888,7 +888,7 @@ ETEXI "\n\t\t\t -b for migration without shared storage with" " full c

Re: [Qemu-devel] [RFC PATCH v2 07/12] mc: introduce additional QMP statistics for micro-checkpointing

2014-04-03 Thread Michael R. Hines
On 03/12/2014 05:45 AM, Eric Blake wrote: +++ b/qapi-schema.json @@ -603,6 +603,36 @@ 'cache-miss': 'int', 'overflow': 'int' } } ## +# @MCStats +# +# Detailed Micro Checkpointing (MC) statistics +# +# @mbps: throughput of transmitting last MC +# +# @xmit-time: milliseconds to t

Re: [Qemu-devel] [RFC PATCH v2 03/12] mc: introduce a 'checkpointing' status check into the VCPU states

2014-04-03 Thread Michael R. Hines
On 03/12/2014 05:40 AM, Eric Blake wrote: +++ b/qapi-schema.json @@ -169,6 +169,8 @@ # # @save-vm: guest is paused to save the VM state # +# @checkpoint-vm: guest is paused to checkpoint the VM state +# It would be nice to mention '(since 2.1)'. Acknowledged. # @shutdown: guest is sh

Re: [Qemu-devel] [RFC PATCH v2 03/12] mc: introduce a 'checkpointing' status check into the VCPU states

2014-04-03 Thread Michael R. Hines
On 03/12/2014 05:36 AM, Juan Quintela wrote: mrhi...@linux.vnet.ibm.com wrote: From: "Michael R. Hines" During micro-checkpointing, the VCPUs get repeatedly paused and resumed. We need to not freak out when the VM begins micro-checkpointing. Signed-off-by: Michael R. Hines di

Re: [Qemu-devel] [RFC PATCH v2 02/12] mc: timestamp migration_bitmap and KVM logdirty usage

2014-04-03 Thread Michael R. Hines
On 03/12/2014 05:31 AM, Juan Quintela wrote: mrhi...@linux.vnet.ibm.com wrote: From: "Michael R. Hines" We also later export these statistics over QMP for better monitoring of micro-checkpointing as the workload changes. Signed-off-by: Michael R. Hines --- arch_i

Re: [Qemu-devel] [PATCH] rdma: Fix block during rdma migration

2014-03-31 Thread Michael R. Hines
t equal to event_addr_resolved %s", rdma_event_str(cm_event->event)); perror("rdma_resolve_addr"); +rdma_ack_cm_event(cm_event); ret = -EINVAL; goto err_resolve_get_addr; } Reviewed-by: Michael R. Hines Good catch. =) Tha

[Qemu-devel] RDMA upstream moved to stable status - will proceed formally with libvirt patchset and more FT review

2014-03-26 Thread Michael R. Hines
--- migration/next for 20140225 Dr. David Alan Gilbert (2): Fix vmstate_info_int32_le comparison/assign Fix two XBZRLE corruption issues Juan Quintela (1): qemu_file: use fwrite() correctly Michael R. Hines (1): rdma: rename 'x-rdma

Re: [Qemu-devel] [PATCH] rdma: bug fixes

2014-03-26 Thread Michael R. Hines
On 02/27/2014 11:49 PM, Michael Roth wrote: Quoting mrhi...@linux.vnet.ibm.com (2014-02-17 20:34:06) From: "Michael R. Hines" 1. Fix small memory leak in parsing inet address from command line in data_init() 2. Fix ibv_post_send() return value check and pass error code back up co

Re: [Qemu-devel] [RFC PATCH v2 01/12] mc: add documentation for micro-checkpointing

2014-03-02 Thread Michael R. Hines
On 02/21/2014 05:44 PM, Dr. David Alan Gilbert wrote: It's not clear to me how much (or any) of this control loop should be in QEMU or in the management software, but I would definitely agree that a minimum of at least the ability to detect the situation and remedy the situation should be

Re: [Qemu-devel] [RFC PATCH v2 06/12] mc: introduce state machine changes for MC

2014-02-21 Thread Michael R. Hines
On 02/19/2014 09:00 AM, Li Guang wrote: Hi, mrhi...@linux.vnet.ibm.com wrote: From: "Michael R. Hines" This patch sets up the initial changes to the migration state machine and prototypes to be used by the checkpointing code to interact with the state machine so that we can la

Re: [Qemu-devel] [RFC PATCH v2 01/12] mc: add documentation for micro-checkpointing

2014-02-20 Thread Michael R. Hines
On 02/21/2014 12:32 AM, Dr. David Alan Gilbert wrote: I'm happy to use more memory to get FT, all I'm trying to do is see if it's possible to put a lower bound than 2x on it while still maintaining full FT, at the expense of performance in the case where it uses a lot of memory. The bottom lin

Re: [Qemu-devel] [RFC PATCH v2 01/12] mc: add documentation for micro-checkpointing

2014-02-20 Thread Michael R. Hines
On 02/20/2014 07:14 PM, Li Guang wrote: Dr. David Alan Gilbert wrote: * Michael R. Hines (mrhi...@linux.vnet.ibm.com) wrote: On 02/19/2014 07:27 PM, Dr. David Alan Gilbert wrote: I was just wondering if a separate 'max buffer size' knob would allow you to more reasonably bound memo

Re: [Qemu-devel] [RFC PATCH v2 01/12] mc: add documentation for micro-checkpointing

2014-02-20 Thread Michael R. Hines
On 02/20/2014 06:09 PM, Dr. David Alan Gilbert wrote: * Michael R. Hines (mrhi...@linux.vnet.ibm.com) wrote: On 02/19/2014 07:27 PM, Dr. David Alan Gilbert wrote: I was just wondering if a separate 'max buffer size' knob would allow you to more reasonably bound memory without setting

Re: [Qemu-devel] [RFC PATCH v2 06/12] mc: introduce state machine changes for MC

2014-02-19 Thread Michael R. Hines
On 02/19/2014 09:00 AM, Li Guang wrote: Hi, mrhi...@linux.vnet.ibm.com wrote: From: "Michael R. Hines" This patch sets up the initial changes to the migration state machine and prototypes to be used by the checkpointing code to interact with the state machine so that we can la

Re: [Qemu-devel] [RFC PATCH v2 01/12] mc: add documentation for micro-checkpointing

2014-02-19 Thread Michael R. Hines
On 02/19/2014 07:27 PM, Dr. David Alan Gilbert wrote: I was just wondering if a separate 'max buffer size' knob would allow you to more reasonably bound memory without setting policy; I don't think people like having potentially x2 memory. Note: Checkpoint memory is not monotonic in this patch

Re: [Qemu-devel] [RFC PATCH v2 08/12] mc: core logic

2014-02-18 Thread Michael R. Hines
On 02/19/2014 10:53 AM, Li Guang wrote: Michael R. Hines wrote: On 02/19/2014 09:07 AM, Li Guang wrote: Hi, mrhi...@linux.vnet.ibm.com wrote: From: "Michael R. Hines" This implements the core logic, all described in the first patch (docs/mc.txt). Signed-off-by: Michae

Re: [Qemu-devel] [RFC PATCH v1 0/3] provenance: save migration stats after completion to destination

2014-02-18 Thread Michael R. Hines
On 02/19/2014 10:40 AM, Eric Blake wrote: On 02/18/2014 07:30 PM, Michael R. Hines wrote: qemu 2.0 -> 2.0: pass the smaller struct from source, expect the smaller struct on dest, no problem qemu 2.0 -> 2.1: pass the smaller struct from source, dest notices the optional field is missi

Re: [Qemu-devel] [RFC PATCH v1 0/3] provenance: save migration stats after completion to destination

2014-02-18 Thread Michael R. Hines
On 02/18/2014 09:40 PM, Eric Blake wrote: On 02/17/2014 10:53 PM, mrhi...@linux.vnet.ibm.com wrote: From: "Michael R. Hines" This series allows us to send the contents of the entire MigrationInfo structure to the destination once the migration is over. This is very useful for ana

Re: [Qemu-devel] [RFC PATCH v2 08/12] mc: core logic

2014-02-18 Thread Michael R. Hines
On 02/19/2014 09:07 AM, Li Guang wrote: Hi, mrhi...@linux.vnet.ibm.com wrote: From: "Michael R. Hines" This implements the core logic, all described in the first patch (docs/mc.txt). Signed-off-by: Michael R. Hines --- migration-checkpoin

Re: [Qemu-devel] [RFC PATCH v2 06/12] mc: introduce state machine changes for MC

2014-02-18 Thread Michael R. Hines
On 02/19/2014 09:00 AM, Li Guang wrote: Hi, mrhi...@linux.vnet.ibm.com wrote: From: "Michael R. Hines" This patch sets up the initial changes to the migration state machine and prototypes to be used by the checkpointing code to interact with the state machine so that we can la

Re: [Qemu-devel] [RFC PATCH v2 02/12] mc: timestamp migration_bitmap and KVM logdirty usage

2014-02-18 Thread Michael R. Hines
On 02/18/2014 06:32 PM, Dr. David Alan Gilbert wrote: * mrhi...@linux.vnet.ibm.com (mrhi...@linux.vnet.ibm.com) wrote: From: "Michael R. Hines" We also later export these statistics over QMP for better monitoring of micro-checkpointing as the workload changes. @@ -548,9 +568,11

Re: [Qemu-devel] [RFC PATCH v2 01/12] mc: add documentation for micro-checkpointing

2014-02-18 Thread Michael R. Hines
On 02/18/2014 08:45 PM, Dr. David Alan Gilbert wrote: +The Micro-Checkpointing Process +Basic Algorithm +Micro-Checkpoints (MC) work against the existing live migration path in QEMU, and can effectively be understood as a "live migration that never ends". As such, iteration rounds happen at the

Re: [Qemu-devel] [RFC PATCH v2 00/12] mc: fault tolerante through micro-checkpointing

2014-02-18 Thread Michael R. Hines
directory. - Michael mrhi...@linux.vnet.ibm.com wrote: From: "Michael R. Hines" Changes since v1: 1. Re-based against Juan's improved migration_bitmap performance changes 2. Overhauled RDMA support to prepare for better usage of RDMA in other parts of the QEMU code base (such as stora

Re: [Qemu-devel] [Qemu-stable] [PATCH] rdma: memory leak InetSocketAddress

2014-02-17 Thread Michael R. Hines
On 02/16/2014 10:33 AM, Michael Roth wrote: Quoting Frank (2013-09-12 08:51:56) It is allocated by g_new0() in inet_parse(), so needs to be freed in qemu_rdma_data_init(). From d7a8d1aad11fbe9af389cf9dd6cee14cc3249b1f Mon Sep 17 00:00:00 2001 From: Frank Yang Date: Thu, 12 Sep 2013 21:37:56

Re: [Qemu-devel] qemu_rdma_cleanup seg - related to 5a91337?

2014-02-17 Thread Michael R. Hines
On 02/17/2014 05:06 PM, Dr. David Alan Gilbert wrote: * Michael R. Hines (mrhi...@linux.vnet.ibm.com) wrote: On 02/06/2014 08:26 PM, Dr. David Alan Gilbert wrote: Hi Isaku, I hit a seg in qemu_rdma_cleanup in the code changed by your '[PATCH] rdma: clean up of qemu_rdma_cl

Re: [Qemu-devel] qemu_rdma_cleanup seg - related to 5a91337?

2014-02-16 Thread Michael R. Hines
On 02/06/2014 08:26 PM, Dr. David Alan Gilbert wrote: Hi Isaku, I hit a seg in qemu_rdma_cleanup in the code changed by your '[PATCH] rdma: clean up of qemu_rdma_cleanup()' migration-rdma.c ~ 2241 if (rdma->qp) { rdma_destroy_qp(rdma->cm_id); rdma->qp = NULL; }

Re: [Qemu-devel] [PULL 00/50] migration queue

2014-01-12 Thread Michael R. Hines
On 12/25/2013 12:06 AM, Juan Quintela wrote: Hi Anthony This is the patches in the migration queue. Please pull. This includes: - Eduardo refactorings & tests - Matthew rate limit fix - Zhanghaoyu CANCELLING fixes - My bitmap changes Integration work was done by Orit. Happy Christmas, Juan.

Re: [Qemu-devel] [PATCH] migration: qmp_migrate(): keep working after syntax error

2014-01-02 Thread Michael R. Hines
r *uri, bool has_blk, bool blk, #endif } else { error_set(errp, QERR_INVALID_PARAMETER_VALUE, "uri", "a valid migration protocol"); +s->state = MIG_STATE_ERROR; return; } Reviewed-by: Michael R. Hines

Re: [Qemu-devel] [PATCH 11/17] add argument ram_addr_t to hook_ram_load

2013-12-26 Thread Michael R. Hines
qemu_file_set_error(f, ret); } Reviewed-by: Michael R. Hines

Re: [Qemu-devel] [PATCH 07/17] save_page: replace block_offset with a MemoryRegion

2013-12-26 Thread Michael R. Hines
if (ret != RAM_SAVE_CONTROL_DELAYED) { if (bytes_sent && *bytes_sent > 0) { Reviewed-by: Michael R. Hines

Re: [Qemu-devel] [PATCH v4 resend] rdma: rename 'x-rdma' => 'rdma'

2013-12-18 Thread Michael R. Hines
On 12/18/2013 09:51 PM, Eric Blake wrote: On 12/17/2013 10:43 PM, mrhi...@linux.vnet.ibm.com wrote: From: "Michael R. Hines" As far as we can tell, all known bugs have been fixed: 1. Parallel migrations are working 2. IPv6 migration is working 3. virt-test is working +++ b/qapi-s

Re: [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization

2013-11-24 Thread Michael R. Hines
On 11/25/2013 02:15 PM, Michael R. Hines wrote: On 11/06/2013 09:04 PM, Juan Quintela wrote: Hi [v2] In this version: - fixed all the comments from last versions (thanks Eric) - kvm migration bitmap is synchronized using bitmap operations - qemu bitmap -> migration bitmap is synchronized us

Re: [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization

2013-11-24 Thread Michael R. Hines
On 11/06/2013 11:49 PM, Paolo Bonzini wrote: On 06/11/2013 15:37, Gerd Hoffmann wrote: - vga ram by default is not aligned in a page number multiple of 64, it could be optimized. Kraxel? It syncs the kvm bitmap at least 1 a second or so? bitmap is only 2048 pages (16MB by default). We ne

Re: [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization

2013-11-24 Thread Michael R. Hines
On 11/06/2013 09:04 PM, Juan Quintela wrote: Hi [v2] In this version: - fixed all the comments from last versions (thanks Eric) - kvm migration bitmap is synchronized using bitmap operations - qemu bitmap -> migration bitmap is synchronized using bitmap operations If bitmaps are not properly ali

Re: [Qemu-devel] [PATCH v3 for-1.7 resend] rdma: rename 'x-rdma' => 'rdma'

2013-11-22 Thread Michael R. Hines
On 11/23/2013 12:37 AM, Daniel P. Berrange wrote: On Sat, Nov 23, 2013 at 12:29:51AM +0800, mrhi...@linux.vnet.ibm.com wrote: From: "Michael R. Hines" As far as we can tell, all known bugs have been fixed: 3. Libvirt patches are ready Please stop claiming this. A proof of concept

Re: [Qemu-devel] [PATCH v3 for-1.7] rdma: rename 'x-rdma' => 'rdma'

2013-11-15 Thread Michael R. Hines
On 11/15/2013 02:25 PM, Eric Blake wrote: On 11/15/2013 10:40 AM, Michael R. Hines wrote: This is unrelated to RDMA - accessing the /dev/infiniband device nodes is already supported by libvirt by modifying the configuration file in /etc and that works just fine. http://wiki.qemu.org/Features

Re: [Qemu-devel] [PATCH v3 for-1.7] rdma: rename 'x-rdma' => 'rdma'

2013-11-15 Thread Michael R. Hines
On 11/15/2013 12:06 PM, Daniel P. Berrange wrote: On Wed, Nov 06, 2013 at 01:59:14PM -0500, mrhi...@linux.vnet.ibm.com wrote: From: "Michael R. Hines" As far as we can tell, all known bugs have been fixed: [snip] 3. Libvirt patches are ready [snip] Objections? There was a f

Re: [Qemu-devel] [PATCH v2] rdma: rename 'x-rdma' => 'rdma'

2013-11-06 Thread Michael R. Hines
On 11/05/2013 05:19 PM, Eric Blake wrote: On 10/26/2013 02:03 PM, mrhi...@linux.vnet.ibm.com wrote: From: "Michael R. Hines" As far as we can tell, all known bugs have been fixed: 1. Parallel RDMA migrations are working 2. IPv6 migration is working 3. Libvirt patches are ready 4.

Re: [Qemu-devel] [PATCH v3 0/4] Curling: KVM Fault Tolerance

2013-11-06 Thread Michael R. Hines
On 10/23/2013 01:23 AM, Jules wrote: On 2013-10-22 17:00 -0400, Michael R. Hines wrote: On 10/15/2013 03:26 AM, Jules Wang wrote: v2 -> v3: * add documentation of new option in qapi-schema. * long option name: ft -> fault-tolerant v1 -> v2: * cmdline: migrate cu

Re: [Qemu-devel] [RFC PATCH v1: 11/12] mc: register MC qemu-file functions and expose MC tunable capability

2013-11-06 Thread Michael R. Hines
, mrhi...@linux.vnet.ibm.com wrote: From: "Michael R. Hines" The capability allows management software to throttle the MC frequency during VM application transience. The qemu-file savevm() functions inform the destination that the incoming traffic is MC-specific traffic and not va

Re: [Qemu-devel] [PATCH] rdma: rename 'x-rdma' => 'rdma'

2013-10-26 Thread Michael R. Hines
On 10/25/2013 11:14 AM, Paolo Bonzini wrote: On 25/10/2013 16:03, Michael R. Hines wrote: Well, I tried posting libvirt support with this naming scheme, but they didn't accept it. Their reason (Daniel, I think) is valid: experimental implies that it shouldn't be exposed in the
