[Qemu-devel] Wiki account request

2015-04-09 Thread Fan
Hi guys,

This is my first post here, so nice to meet you, everyone!

I’m wondering if one of you could create a wiki account for me. I want to add an 
installation guide (e.g. a dependency list) for Arch Linux to 
http://wiki.qemu.org/Hosts/Linux .

Thanks.

[Qemu-devel] fail to test virtio-gpu with heaven benchmark4.0

2016-08-29 Thread Chen Fan
 float(((vec4(1,1,1,1)).w));
temp0[0].x = float(((in_0.).x));
temp0[0].y = float(((in_0.).y));
temp0[0].z = float(((in_4.).z));
temp0[2].w = float(((vec4(1,1,1,1)).w));
temp0[2].x = float(dot(vec4( temp0[3] ), vec4( temp0[0] )));
temp0[5].x = float(dot(vec4( temp0[4] ), vec4( temp0[0] )));
temp0[2].y = float(( temp0[5]. .y));
temp0[0].x = float(dot(vec4( temp0[1] ), vec4( temp0[0] )));
temp0[2].z = float(( temp0[0]. .z));
temp0[0].x = float(dot(vec3( temp0[3].xyzz ), vec3((in_4.xyzz;
temp0[5].x = float(dot(vec3( temp0[4].xyzz ), vec3((in_4.xyzz;
temp0[0].y = float(( temp0[5]. .y));
temp0[5].x = float(dot(vec3( temp0[1].xyzz ), vec3((in_4.xyzz;
temp0[0].z = float(( temp0[5]. .z));
temp0[5].x = float(dot(vec3( temp0[3].xyzz ), vec3((in_5.xyzz;
temp0[6].x = float(dot(vec3( temp0[4].xyzz ), vec3((in_5.xyzz;
temp0[5].y = float(( temp0[6]. .y));
temp0[6].x = float(dot(vec3( temp0[1].xyzz ), vec3((in_5.xyzz;
temp0[5].z = float(( temp0[6]. .z));
temp0[6].x = float(uintBitsToFloat((uvec4(gl_InstanceID) * 
uvec4(uvec4(ivec4(3,3,3,3).x);
temp0[6].xyz = 
vec3(intBitsToFloat(ivec4((uvec4(floatBitsToUint(temp0[6].)) + 
uvec4(uvec4(ivec4(0,1,2,2)).xyz);

addr0 = int(floatBitsToInt(temp0[6].));
temp0[3] = vec4((uintBitsToFloat(vsconst0[addr0 + 199])));
addr0 = int(floatBitsToInt(temp0[6].));
temp0[4] = vec4((uintBitsToFloat(vsconst0[addr0 + 199])));
addr0 = int(floatBitsToInt(temp0[6].));
temp0[1] = vec4((uintBitsToFloat(vsconst0[addr0 + 199])));
temp0[6].x = float(dot(vec4( temp0[3] ), vec4( temp0[2] )));
temp0[7].x = float(dot(vec4( temp0[4] ), vec4( temp0[2] )));
temp0[6].y = float(( temp0[7]. .y));
temp0[2].x = float(dot(vec4( temp0[1] ), vec4( temp0[2] )));
temp0[6].z = float(( temp0[2]. .z));
temp0[8].x = float(dot(vec3( temp0[3].xyzz ), vec3( temp0[0].xyzz )));
temp0[9].x = float(dot(vec3( temp0[4].xyzz ), vec3( temp0[0].xyzz )));
temp0[8].y = float(( temp0[9]. .y));
temp0[0].x = float(dot(vec3( temp0[1].xyzz ), vec3( temp0[0].xyzz )));
temp0[8].z = float(( temp0[0]. .z));
temp0[0].x = float(dot(vec3( temp0[8].xyzz ), vec3( temp0[8].xyzz )));
temp0[0].x = float(inversesqrt( temp0[0]. .x));
temp0[0].xyz = vec3((( temp0[8].xyzz  *  temp0[0]. )).xyz);
temp0[3].x = float(dot(vec3( temp0[3].xyzz ), vec3( temp0[5].xyzz )));
temp0[4].x = float(dot(vec3( temp0[4].xyzz ), vec3( temp0[5].xyzz )));
temp0[3].y = float(( temp0[4]. .y));
temp0[1].x = float(dot(vec3( temp0[1].xyzz ), vec3( temp0[5].xyzz )));
temp0[3].z = float(( temp0[1]. .z));
temp0[1].x = float(dot(vec3( temp0[3].xyzz ), vec3( temp0[3].xyzz )));
temp0[1].x = float(inversesqrt( temp0[1]. .x));
temp0[1].xyz = vec3((( temp0[3].xyzz  *  temp0[1]. )).xyz);
temp0[3].xyz = vec3((( temp0[0].zxyy  *  temp0[1].yzxx )).xyz);
temp0[3].xyz = vec3(( temp0[0].yzxx  *  temp0[1].zxyy  + -temp0[3].xyzz 
).xyz);

temp0[3].xyz = vec3((( temp0[3].xyzz  * (in_5.))).xyz);
temp0[4].zw = vec2(((in_3.wwzw).zw));
temp0[4].xy = vec2(((in_3.xyyy) * uintBitsToFloat(vsconst0[223].xyyy) + 
uintBitsToFloat(vsconst0[223].zwww)).xy);
temp0[0].xyz = vec3((( temp0[0].xyzz  * 
uintBitsToFloat(vsconst0[5].))).xyz);

temp0[5] = vec4(((uintBitsToFloat(vsconst0[1]) *  temp0[6]. )));
temp0[5] = vec4((uintBitsToFloat(vsconst0[2]) *  temp0[7].  + 
temp0[5] ));
temp0[2] = vec4((uintBitsToFloat(vsconst0[3]) *  temp0[2].  + 
temp0[5] ));

temp0[2] = vec4((( temp0[2]  + uintBitsToFloat(vsconst0[4];
temp0[5].xyz = vec3((( temp0[6].xyzz  * 
uintBitsToFloat(vsconst0[0].))).xyz);

temp0[6].x = float(( temp0[1]. .x));
temp0[6].y = float(( temp0[3]. .y));
temp0[6].z = float(( temp0[0]. .z));
temp0[6].xyz = vec3(( temp0[6].xyzx .xyz));
temp0[7].x = float(( temp0[1]. .x));
temp0[7].y = float(( temp0[3]. .y));
temp0[7].z = float(( temp0[0]. .z));
temp0[7].xyz = vec3(( temp0[7].xyzx .xyz));
temp0[1].x = float(( temp0[1]. .x));
temp0[1].y = float(( temp0[3]. .y));
temp0[1].z = float(( temp0[0]. .z));
temp0[0].xyz = vec3(( temp0[1].xyzx .xyz));
ex_g11 = vec4(( temp0[6] ));
ex_g12 = vec4(( temp0[7] ));
ex_g13 = vec4(( temp0[0] ));
gl_Position = vec4(( temp0[2] ));
ex_g9 = vec4(( temp0[4] ));
ex_g10 = vec4(( temp0[5] ));
gl_Position.y = gl_Position.y * winsys_adjust.y;
gl_Position.z = dot(gl_Position, vec4(0.0, 0.0, winsys_adjust.zw));
}

vrend_report_buffer_error: context error reported 3 "Xorg" Illegal 
command buffer 149816321


--
Sincerely,
Chen Fan

[root@localhost Unigine_Heaven-4.0]# ./heaven 
Loading "/root/Desktop/Unigine_Heaven-4.0/bin/../data/heaven_4.0.cfg"...
Loading "libGPUMonitor_x64.so"...
Loading "libGL.so.1"...
Loading "libopenal.so.1"...
Set 1920x1080 fullscreen video mode
Set 1.00 gamma value
Unigine engine http://unigine.com/
Binary: Linux 64bit GCC 4.4.5 Release Feb 13 2013 r11284
Features: OpenGL OpenAL XPad360 Joystick Flash Editor
App path:  /root/Desktop/Unigine_

[Qemu-devel] [PATCH] virtio: rename the bar index field name in VirtIOPCIProxy

2016-09-28 Thread Chen Fan
The bar index field names are very similar to the bar memory region
names; add an _idx suffix to the index fields to improve code readability.

Signed-off-by: Chen Fan 
---
 hw/display/virtio-vga.c |  4 ++--
 hw/virtio/virtio-pci.c  | 20 ++--
 hw/virtio/virtio-pci.h  |  8 
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/hw/display/virtio-vga.c b/hw/display/virtio-vga.c
index f77b401..f9b017d 100644
--- a/hw/display/virtio-vga.c
+++ b/hw/display/virtio-vga.c
@@ -120,8 +120,8 @@ static void virtio_vga_realize(VirtIOPCIProxy *vpci_dev, 
Error **errp)
  * virtio regions are moved to the end of bar #2, to make room for
  * the stdvga mmio registers at the start of bar #2.
  */
-vpci_dev->modern_mem_bar = 2;
-vpci_dev->msix_bar = 4;
+vpci_dev->modern_mem_bar_idx = 2;
+vpci_dev->msix_bar_idx = 4;
 
 if (!(vpci_dev->flags & VIRTIO_PCI_FLAG_PAGE_PER_VQ)) {
 /*
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 2d60a00..06831de 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1551,7 +1551,7 @@ static void 
virtio_pci_modern_mem_region_map(VirtIOPCIProxy *proxy,
  struct virtio_pci_cap *cap)
 {
 virtio_pci_modern_region_map(proxy, region, cap,
- &proxy->modern_bar, proxy->modern_mem_bar);
+ &proxy->modern_bar, 
proxy->modern_mem_bar_idx);
 }
 
 static void virtio_pci_modern_io_region_map(VirtIOPCIProxy *proxy,
@@ -1559,7 +1559,7 @@ static void 
virtio_pci_modern_io_region_map(VirtIOPCIProxy *proxy,
 struct virtio_pci_cap *cap)
 {
 virtio_pci_modern_region_map(proxy, region, cap,
- &proxy->io_bar, proxy->modern_io_bar);
+ &proxy->io_bar, proxy->modern_io_bar_idx);
 }
 
 static void virtio_pci_modern_mem_region_unmap(VirtIOPCIProxy *proxy,
@@ -1670,14 +1670,14 @@ static void virtio_pci_device_plugged(DeviceState *d, 
Error **errp)
 memory_region_init(&proxy->io_bar, OBJECT(proxy),
"virtio-pci-io", 0x4);
 
-pci_register_bar(&proxy->pci_dev, proxy->modern_io_bar,
+pci_register_bar(&proxy->pci_dev, proxy->modern_io_bar_idx,
  PCI_BASE_ADDRESS_SPACE_IO, &proxy->io_bar);
 
 virtio_pci_modern_io_region_map(proxy, &proxy->notify_pio,
 &notify_pio.cap);
 }
 
-pci_register_bar(&proxy->pci_dev, proxy->modern_mem_bar,
+pci_register_bar(&proxy->pci_dev, proxy->modern_mem_bar_idx,
  PCI_BASE_ADDRESS_SPACE_MEMORY |
  PCI_BASE_ADDRESS_MEM_PREFETCH |
  PCI_BASE_ADDRESS_MEM_TYPE_64,
@@ -1693,7 +1693,7 @@ static void virtio_pci_device_plugged(DeviceState *d, 
Error **errp)
 
 if (proxy->nvectors) {
 int err = msix_init_exclusive_bar(&proxy->pci_dev, proxy->nvectors,
-  proxy->msix_bar);
+  proxy->msix_bar_idx);
 if (err) {
 /* Notice when a system that supports MSIx can't initialize it.  */
 if (err != -ENOTSUP) {
@@ -1716,7 +1716,7 @@ static void virtio_pci_device_plugged(DeviceState *d, 
Error **errp)
   &virtio_pci_config_ops,
   proxy, "virtio-pci", size);
 
-pci_register_bar(&proxy->pci_dev, proxy->legacy_io_bar,
+pci_register_bar(&proxy->pci_dev, proxy->legacy_io_bar_idx,
  PCI_BASE_ADDRESS_SPACE_IO, &proxy->bar);
 }
 
@@ -1760,10 +1760,10 @@ static void virtio_pci_realize(PCIDevice *pci_dev, 
Error **errp)
  *   region 4+5 --  virtio modern memory (64bit) bar
  *
  */
-proxy->legacy_io_bar  = 0;
-proxy->msix_bar   = 1;
-proxy->modern_io_bar  = 2;
-proxy->modern_mem_bar = 4;
+proxy->legacy_io_bar_idx  = 0;
+proxy->msix_bar_idx   = 1;
+proxy->modern_io_bar_idx  = 2;
+proxy->modern_mem_bar_idx = 4;
 
 proxy->common.offset = 0x0;
 proxy->common.size = 0x1000;
diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
index 541cbdb..b4edea6 100644
--- a/hw/virtio/virtio-pci.h
+++ b/hw/virtio/virtio-pci.h
@@ -143,10 +143,10 @@ struct VirtIOPCIProxy {
 MemoryRegion io_bar;
 MemoryRegion modern_cfg;
 AddressSpace modern_as;
-uint32_t legacy_io_bar;
-uint32_t msix_bar;
-uint32_t modern_io_bar;
-uint32_t modern_mem_bar;
+uint32_t legacy_io_bar_idx;
+uint32_t msix_bar_idx;
+uint32_t modern_io_bar_idx;
+uint32_t modern_mem_bar_idx;
 int config_cap;
 uint32_t flags;
 bool disable_modern;
-- 
2.7.4




Re: [Qemu-devel] [PATCH RFC] migration: set cpu throttle value by workload

2017-01-17 Thread Chao Fan
am version.

Any comments will be welcome.

[*]http://accc.riken.jp/en/supercom/himenobmt/

Thanks,

Chao Fan

On Thu, Dec 29, 2016 at 05:16:19PM +0800, Chao Fan wrote:
>This RFC PATCH is my demo about the new feature, here is my POC mail:
>https://lists.gnu.org/archive/html/qemu-devel/2016-12/msg00646.html
>
>When migration_bitmap_sync is executed, get the time and read the bitmap to
>calculate how many pages were dirtied between two syncs.
>Use inst_dirty_pages / (time_now - time_prev) / ram_size to get
>inst_dirty_pages_rate, then map inst_dirty_pages_rate to a cpu
>throttle value. I have no idea how best to map it, so I just do
>it in a simple way. The mapping is just a guess and should
>be improved.
>
>This is just a demo. There are other possible methods:
>1. In another file, calculate inst_dirty_pages_rate every second,
>  every two seconds, or at another fixed interval, then set the cpu
>  throttle value according to inst_dirty_pages_rate.
>2. When inst_dirty_pages_rate reaches a threshold, begin cpu throttling
>  and set the throttle value.
>
>Any comments will be welcome.
>
>Signed-off-by: Chao Fan 
>---
> include/qemu/bitmap.h | 17 +
> migration/ram.c   | 49 +
> 2 files changed, 66 insertions(+)
>
>diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
>index 63ea2d0..dc99f9b 100644
>--- a/include/qemu/bitmap.h
>+++ b/include/qemu/bitmap.h
>@@ -235,4 +235,21 @@ static inline unsigned long *bitmap_zero_extend(unsigned 
>long *old,
> return new;
> }
> 
>+static inline unsigned long bitmap_weight(const unsigned long *src, long 
>nbits)
>+{
>+unsigned long i, count = 0, nlong = nbits / BITS_PER_LONG;
>+
>+if (small_nbits(nbits)) {
>+return hweight_long(*src & BITMAP_LAST_WORD_MASK(nbits));
>+}
>+for (i = 0; i < nlong; i++) {
>+count += hweight_long(src[i]);
>+}
>+if (nbits % BITS_PER_LONG) {
>+count += hweight_long(src[i] & BITMAP_LAST_WORD_MASK(nbits));
>+}
>+
>+return count;
>+}
>+
> #endif /* BITMAP_H */
>diff --git a/migration/ram.c b/migration/ram.c
>index a1c8089..f96e3e3 100644
>--- a/migration/ram.c
>+++ b/migration/ram.c
>@@ -44,6 +44,7 @@
> #include "exec/ram_addr.h"
> #include "qemu/rcu_queue.h"
> #include "migration/colo.h"
>+#include "hw/boards.h"
> 
> #ifdef DEBUG_MIGRATION_RAM
> #define DPRINTF(fmt, ...) \
>@@ -599,6 +600,9 @@ static int64_t num_dirty_pages_period;
> static uint64_t xbzrle_cache_miss_prev;
> static uint64_t iterations_prev;
> 
>+static int64_t dirty_pages_time_prev;
>+static int64_t dirty_pages_time_now;
>+
> static void migration_bitmap_sync_init(void)
> {
> start_time = 0;
>@@ -606,6 +610,49 @@ static void migration_bitmap_sync_init(void)
> num_dirty_pages_period = 0;
> xbzrle_cache_miss_prev = 0;
> iterations_prev = 0;
>+
>+dirty_pages_time_prev = 0;
>+dirty_pages_time_now = 0;
>+}
>+
>+static void migration_inst_rate(void)
>+{
>+RAMBlock *block;
>+MigrationState *s = migrate_get_current();
>+int64_t inst_dirty_pages_rate, inst_dirty_pages = 0;
>+int64_t i;
>+unsigned long *num;
>+unsigned long len = 0;
>+
>+dirty_pages_time_now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>+if (dirty_pages_time_prev != 0) {
>+rcu_read_lock();
>+DirtyMemoryBlocks *blocks = atomic_rcu_read(
>+ &ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]);
>+QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>+if (len == 0) {
>+len = block->offset;
>+}
>+len += block->used_length;
>+}
>+ram_addr_t idx = (len >> TARGET_PAGE_BITS) / DIRTY_MEMORY_BLOCK_SIZE;
>+if (((len >> TARGET_PAGE_BITS) % DIRTY_MEMORY_BLOCK_SIZE) != 0) {
>+idx++;
>+}
>+for (i = 0; i < idx; i++) {
>+num = blocks->blocks[i];
>+inst_dirty_pages += bitmap_weight(num, DIRTY_MEMORY_BLOCK_SIZE);
>+}
>+rcu_read_unlock();
>+
>+inst_dirty_pages_rate = inst_dirty_pages * TARGET_PAGE_SIZE *
>+1024 * 1024 * 1000 /
>+(dirty_pages_time_now - dirty_pages_time_prev) /
>+current_machine->ram_size;
>+s->parameters.cpu_throttle_initial = inst_dirty_pages_rate / 200;
>+s->parameters.cpu_throttle_increment = inst_dirty_pages_rate / 200;
>+}
>+dirty_pages_time_prev = dirty_pages_time_now;
> }
> 
> static void migration_bitmap_sync(
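
The counting and rate steps described in the quoted patch can be sketched in standalone C. This is only an illustration under stated assumptions, not the QEMU code: `popcount_long`, `bitmap_weight_sketch`, and `dirty_bytes_per_sec` are hypothetical names, the loop-based popcount stands in for `hweight_long`, and the rate is left in bytes per second rather than being normalized by RAM size.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Count the set bits in one word (portable stand-in for hweight_long). */
static unsigned long popcount_long(unsigned long w)
{
    unsigned long n = 0;

    while (w) {
        w &= w - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Count the set bits among the first nbits bits of the bitmap src. */
static unsigned long bitmap_weight_sketch(const unsigned long *src,
                                          unsigned long nbits)
{
    unsigned long i, count = 0;
    unsigned long nlong = nbits / SKETCH_BITS_PER_LONG;
    unsigned long rem = nbits % SKETCH_BITS_PER_LONG;

    for (i = 0; i < nlong; i++) {
        count += popcount_long(src[i]);
    }
    if (rem) {
        /* ignore bits beyond nbits in the last, partial word */
        count += popcount_long(src[i] & ((1UL << rem) - 1));
    }
    return count;
}

/* Dirty bytes per second from a page count, page size and elapsed ms. */
static uint64_t dirty_bytes_per_sec(uint64_t dirty_pages, uint64_t page_size,
                                    int64_t elapsed_ms)
{
    assert(elapsed_ms > 0);
    return dirty_pages * page_size * 1000 / (uint64_t)elapsed_ms;
}
```

For instance, 256 dirty pages of 4096 bytes observed over 500 ms give 256 * 4096 * 1000 / 500 = 2097152 bytes/s. How that figure should map to a throttle percentage is exactly the open question raised in the mail, so the sketch stops before that step.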

Re: [Qemu-devel] [PATCH v8 11/12] vfio: register aer resume notification handler for aer resume

2016-06-21 Thread Chen Fan

On 2016-06-21 11:13, Alex Williamson wrote:

On Tue, 21 Jun 2016 10:16:25 +0800
Zhou Jie  wrote:


Hi, Alex


I was really hoping to hear your opinion, or at least some further
discussion of pros and cons rather than simply parroting back my idea.

I understand.


My current thinking is that a resume notifier to userspace is poorly
defined, it's not clear what the user can and cannot do between an
error notification and the resume notification.

Yes, doing nothing during that time is better.


One approach to solve
that might be that the kernel internally handles the resume
notifications.  Maybe that means blocking the ioctl (interruptible
timeout) until the internal resume occurs, or maybe that means
returning -EAGAIN.

I don't think it is a good idea.
The kernel gives the error and resume notifications; that is enough.
It's up to the user how to use them.

Well that's exactly why it's poorly defined.  What does a resume
notification signal a user that they're allowed to do?  What can they
not do between error and resume notification.  Clearly you had issues
attempting to perform a reset during this time period since it was
racing with the kernel reset, so is a user allowed to do a hot reset
between error and resume?  Where do we define it?  Do we prevent it if
they try?  Why?  What about the reset ioctl?  How and why is that
different from a hot reset?  (hint, they can be the same)  Do we define
that resets are not allowed between error and resume, but other
operations like read/write or interrupt setup ioctls are allowed? Why?
Clearly we can't do anything that manipulates the device between error
and resume since it might be lost or ineffective, but where do we
define it and do we need to actively enforce those rules?  I'm arguing
that it's poorly defined, so "it's up to the user how to use them"
does not give me any additional confidence in that approach.  We
can't trust the user to be polite, we can't even trust the user not to
be malicious.

Hi Alex,

On the kernel side, I think that if we don't trust the user's behavior, we
should disable access to the vfio-pci interface once the vfio-pci driver gets
error_detected. We should disable all access to the vfio fd regardless of
whether the vfio-pci device is assigned to a VM. We can also return an EAGAIN
error if the user tries to access it during the reset period, until the host
reset has finished.

On the qemu side, when we get error_detected, we pass the aer error through
to the guest directly and ignore all access to vfio-pci during this time.
When qemu needs to do a hot reset, we can retry the get info ioctl until it
reports that the vfio-pci reset has finished, then do the hot_reset ioctl if
needed. The kernel should ensure the ioctl becomes accessible after the host
reset has completed.
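
The retry idea on the qemu side could look roughly like the sketch below. It is only an illustration: `device_op_fn`, `retry_while_busy`, and `busy_then_ok` are hypothetical names, no real vfio ioctl is called, and the assumption is simply that the operation reports -EAGAIN while the host reset is still in progress.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical device operation: returns 0 on success or a negative errno. */
typedef int (*device_op_fn)(void *opaque);

/*
 * Call op until it stops reporting -EAGAIN or max_tries is exhausted.
 * A real caller would sleep or wait for an event between attempts.
 */
static int retry_while_busy(device_op_fn op, void *opaque, int max_tries)
{
    int ret = -EAGAIN;

    assert(op && max_tries > 0);
    for (int i = 0; i < max_tries && ret == -EAGAIN; i++) {
        ret = op(opaque);
    }
    return ret;
}

/* Stand-in for a device that stays busy for *opaque more calls. */
static int busy_then_ok(void *opaque)
{
    int *remaining = opaque;

    if (*remaining > 0) {
        (*remaining)--;
        return -EAGAIN;
    }
    return 0;
}
```

Bounding the number of attempts matters here: if the host reset never completes, the caller gets -EAGAIN back and can decide to give up instead of spinning forever.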

Thanks,
Chen


  

Probably implementations of each need to be worked
through to determine which is better.  We don't want to add complexity
to the kernel simply to make things easier for userspace, but we also
don't want a poorly specified interface that is difficult for
userspace to use correctly.  Thanks,

In qemu, the aer recovery process:
1. Detect support for resume notification.
   If the host vfio driver does not support resume notification,
   fail to boot the VM with aer enabled.
2. Immediately notify the VM when an error is detected.
3. Disable the device.
   Unmap the config space and bar regions.
4. Delay the guest directed bus reset.
5. Wait for resume notification.
   If we don't get the resume notification from the host after
   some timeout, we abort the guest directed bus reset
   altogether and unplug the device to prevent it from further
   interacting with the VM.
6. After getting the resume notification, reset the bus and enable the device.

I think we only need to make sure the disabled device
   will not interact with the VM.
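
The recovery steps listed above can be read as a small state machine. The sketch below is only an illustration under that reading; the state and event names are hypothetical and are not taken from QEMU or the kernel, and the actual device handling (unmapping, bus reset, unplug) is reduced to comments.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states for the recovery flow. */
typedef enum {
    AER_IDLE,            /* device working normally */
    AER_ERROR_REPORTED,  /* guest notified, device disabled, reset delayed */
    AER_RESUMED,         /* resume seen, bus reset done, device re-enabled */
    AER_UNPLUGGED        /* resume timed out, device removed from the VM */
} AerState;

/* Hypothetical events delivered to the flow. */
typedef enum {
    AER_EV_ERROR_DETECTED,
    AER_EV_RESUME,
    AER_EV_TIMEOUT
} AerEvent;

/* Return the state that follows s when event ev arrives. */
static AerState aer_next_state(AerState s, AerEvent ev)
{
    assert((int)s >= 0 && (int)s <= (int)AER_UNPLUGGED);

    switch (s) {
    case AER_IDLE:
    case AER_RESUMED:
        /* steps 2-4: notify the guest, disable the device, delay reset */
        return ev == AER_EV_ERROR_DETECTED ? AER_ERROR_REPORTED : s;
    case AER_ERROR_REPORTED:
        if (ev == AER_EV_RESUME) {
            return AER_RESUMED;      /* step 6: reset bus, enable device */
        }
        if (ev == AER_EV_TIMEOUT) {
            return AER_UNPLUGGED;    /* step 5: give up and unplug */
        }
        return s;
    case AER_UNPLUGGED:
        return s;                    /* terminal: device is gone */
    }
    return s;
}

/* The guest may touch the device only outside the recovery window. */
static bool aer_guest_access_allowed(AerState s)
{
    return s == AER_IDLE || s == AER_RESUMED;
}
```

Writing it this way makes the contested window explicit: everything between the error event and the resume event is a single state in which guest access is refused, and that window is what the rest of the thread debates.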

Should interrupt irqfds then also be disabled so they trap into QEMU
and we can prevent that interaction?  Also, QEMU can be polite, but as
above, QEMU is just one user, the API is open to anyone and QEMU might
be exploited to not be so polite.  So if there are points where the
user can interfere with the kernel or exploit the knowledge that the
device is going through a reset, the kernel can't rely on a friendly
user.  Thanks,

Alex



--
Sincerely,
Chen Fan



[Qemu-devel] [PATCH] Add inst_dirty_pages_rate in 'info migrate'

2017-03-01 Thread Chao Fan
Auto-converge aims to accelerate migration by slowing down the
generation of dirty pages. But users have no way to determine a suitable
throttle value, so a new item "inst-dirty-pages-rate" in "info migrate"
would help them choose one.

Signed-off-by: Chao Fan 
---
 hmp.c |  4 
 include/migration/migration.h |  1 +
 include/qemu/bitmap.h | 17 +
 migration/migration.c |  2 ++
 migration/ram.c   | 44 +++
 qapi-schema.json  |  1 +
 6 files changed, 69 insertions(+)

diff --git a/hmp.c b/hmp.c
index 2bc4f06..c7892ea 100644
--- a/hmp.c
+++ b/hmp.c
@@ -219,6 +219,10 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
 monitor_printf(mon, "dirty pages rate: %" PRIu64 " pages\n",
info->ram->dirty_pages_rate);
 }
+if (info->ram->inst_dirty_pages_rate) {
+monitor_printf(mon, "inst dirty pages rate: %" PRIu64 " bytes/s\n",
+   info->ram->inst_dirty_pages_rate);
+}
 if (info->ram->postcopy_requests) {
 monitor_printf(mon, "postcopy request count: %" PRIu64 "\n",
info->ram->postcopy_requests);
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 1735d66..95f0453 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -164,6 +164,7 @@ struct MigrationState
 int64_t downtime;
 int64_t expected_downtime;
 int64_t dirty_pages_rate;
+int64_t inst_dirty_pages_rate;
 int64_t dirty_bytes_rate;
 bool enabled_capabilities[MIGRATION_CAPABILITY__MAX];
 int64_t xbzrle_cache_size;
diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
index 63ea2d0..dc99f9b 100644
--- a/include/qemu/bitmap.h
+++ b/include/qemu/bitmap.h
@@ -235,4 +235,21 @@ static inline unsigned long *bitmap_zero_extend(unsigned 
long *old,
 return new;
 }
 
+static inline unsigned long bitmap_weight(const unsigned long *src, long nbits)
+{
+unsigned long i, count = 0, nlong = nbits / BITS_PER_LONG;
+
+if (small_nbits(nbits)) {
+return hweight_long(*src & BITMAP_LAST_WORD_MASK(nbits));
+}
+for (i = 0; i < nlong; i++) {
+count += hweight_long(src[i]);
+}
+if (nbits % BITS_PER_LONG) {
+count += hweight_long(src[i] & BITMAP_LAST_WORD_MASK(nbits));
+}
+
+return count;
+}
+
 #endif /* BITMAP_H */
diff --git a/migration/migration.c b/migration/migration.c
index c6ae69d..18fc2ec 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -644,6 +644,7 @@ static void populate_ram_info(MigrationInfo *info, 
MigrationState *s)
 if (s->state != MIGRATION_STATUS_COMPLETED) {
 info->ram->remaining = ram_bytes_remaining();
 info->ram->dirty_pages_rate = s->dirty_pages_rate;
+info->ram->inst_dirty_pages_rate = s->inst_dirty_pages_rate;
 }
 }
 
@@ -1099,6 +1100,7 @@ MigrationState *migrate_init(const MigrationParams 
*params)
 s->downtime = 0;
 s->expected_downtime = 0;
 s->dirty_pages_rate = 0;
+s->inst_dirty_pages_rate = 0;
 s->dirty_bytes_rate = 0;
 s->setup_time = 0;
 s->dirty_sync_count = 0;
diff --git a/migration/ram.c b/migration/ram.c
index f289fcd..185556f 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -44,6 +44,7 @@
 #include "exec/ram_addr.h"
 #include "qemu/rcu_queue.h"
 #include "migration/colo.h"
+#include "hw/boards.h"
 
 static int dirty_rate_high_cnt;
 
@@ -591,6 +592,9 @@ static int64_t num_dirty_pages_period;
 static uint64_t xbzrle_cache_miss_prev;
 static uint64_t iterations_prev;
 
+static int64_t dirty_pages_time_prev;
+static int64_t dirty_pages_time_now;
+
 static void migration_bitmap_sync_init(void)
 {
 start_time = 0;
@@ -598,6 +602,44 @@ static void migration_bitmap_sync_init(void)
 num_dirty_pages_period = 0;
 xbzrle_cache_miss_prev = 0;
 iterations_prev = 0;
+dirty_pages_time_prev = 0;
+dirty_pages_time_now = 0;
+}
+
+static void migration_inst_rate(void)
+{
+RAMBlock *block;
+MigrationState *s = migrate_get_current();
+int64_t inst_dirty_pages = 0;
+int64_t i;
+unsigned long *num;
+unsigned long len = 0;
+
+dirty_pages_time_now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+if (dirty_pages_time_prev != 0) {
+rcu_read_lock();
+DirtyMemoryBlocks *blocks = atomic_rcu_read(
+ &ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]);
+QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+if (len == 0) {
+len = block->offset;
+}
+len += block->used_length;
+}
+ram_a

[Qemu-devel] [PATCH v2] Add inst_dirty_pages_rate in 'info migrate'

2017-03-08 Thread Chao Fan
Auto-converge aims to accelerate migration by slowing down the
generation of dirty pages. But users have no way to determine a suitable
throttle value, so a new item "inst-dirty-pages-rate" in "info migrate"
would help them choose one.

Signed-off-by: Chao Fan 

---
v2:
  Update the way to calculate the time.
  Add tag '(since 2.9)' in documentation of qapi-schema.json
---
 hmp.c |  4 
 include/migration/migration.h |  1 +
 include/qemu/bitmap.h | 17 
 migration/migration.c |  2 ++
 migration/ram.c   | 45 +++
 qapi-schema.json  |  4 
 6 files changed, 73 insertions(+)

diff --git a/hmp.c b/hmp.c
index 2bc4f06..c7892ea 100644
--- a/hmp.c
+++ b/hmp.c
@@ -219,6 +219,10 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
 monitor_printf(mon, "dirty pages rate: %" PRIu64 " pages\n",
info->ram->dirty_pages_rate);
 }
+if (info->ram->inst_dirty_pages_rate) {
+monitor_printf(mon, "inst dirty pages rate: %" PRIu64 " bytes/s\n",
+   info->ram->inst_dirty_pages_rate);
+}
 if (info->ram->postcopy_requests) {
 monitor_printf(mon, "postcopy request count: %" PRIu64 "\n",
info->ram->postcopy_requests);
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 1735d66..95f0453 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -164,6 +164,7 @@ struct MigrationState
 int64_t downtime;
 int64_t expected_downtime;
 int64_t dirty_pages_rate;
+int64_t inst_dirty_pages_rate;
 int64_t dirty_bytes_rate;
 bool enabled_capabilities[MIGRATION_CAPABILITY__MAX];
 int64_t xbzrle_cache_size;
diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
index 63ea2d0..dc99f9b 100644
--- a/include/qemu/bitmap.h
+++ b/include/qemu/bitmap.h
@@ -235,4 +235,21 @@ static inline unsigned long *bitmap_zero_extend(unsigned 
long *old,
 return new;
 }
 
+static inline unsigned long bitmap_weight(const unsigned long *src, long nbits)
+{
+unsigned long i, count = 0, nlong = nbits / BITS_PER_LONG;
+
+if (small_nbits(nbits)) {
+return hweight_long(*src & BITMAP_LAST_WORD_MASK(nbits));
+}
+for (i = 0; i < nlong; i++) {
+count += hweight_long(src[i]);
+}
+if (nbits % BITS_PER_LONG) {
+count += hweight_long(src[i] & BITMAP_LAST_WORD_MASK(nbits));
+}
+
+return count;
+}
+
 #endif /* BITMAP_H */
diff --git a/migration/migration.c b/migration/migration.c
index c6ae69d..18fc2ec 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -644,6 +644,7 @@ static void populate_ram_info(MigrationInfo *info, 
MigrationState *s)
 if (s->state != MIGRATION_STATUS_COMPLETED) {
 info->ram->remaining = ram_bytes_remaining();
 info->ram->dirty_pages_rate = s->dirty_pages_rate;
+info->ram->inst_dirty_pages_rate = s->inst_dirty_pages_rate;
 }
 }
 
@@ -1099,6 +1100,7 @@ MigrationState *migrate_init(const MigrationParams 
*params)
 s->downtime = 0;
 s->expected_downtime = 0;
 s->dirty_pages_rate = 0;
+s->inst_dirty_pages_rate = 0;
 s->dirty_bytes_rate = 0;
 s->setup_time = 0;
 s->dirty_sync_count = 0;
diff --git a/migration/ram.c b/migration/ram.c
index f289fcd..7b440fd 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -44,6 +44,7 @@
 #include "exec/ram_addr.h"
 #include "qemu/rcu_queue.h"
 #include "migration/colo.h"
+#include "hw/boards.h"
 
 static int dirty_rate_high_cnt;
 
@@ -590,6 +591,7 @@ static int64_t bytes_xfer_prev;
 static int64_t num_dirty_pages_period;
 static uint64_t xbzrle_cache_miss_prev;
 static uint64_t iterations_prev;
+static int64_t dirty_pages_time_prev;
 
 static void migration_bitmap_sync_init(void)
 {
@@ -598,6 +600,47 @@ static void migration_bitmap_sync_init(void)
 num_dirty_pages_period = 0;
 xbzrle_cache_miss_prev = 0;
 iterations_prev = 0;
+dirty_pages_time_prev = 0;
+}
+
+static void migration_inst_rate(void)
+{
+int64_t dirty_pages_time_now;
+if (!dirty_pages_time_prev) {
+dirty_pages_time_prev = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+}
+dirty_pages_time_now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+if (dirty_pages_time_now > dirty_pages_time_prev + 1000) {
+RAMBlock *block;
+MigrationState *s = migrate_get_current();
+int64_t inst_dirty_pages = 0;
+int64_t i;
+unsigned long *num;
+unsigned long len = 0;
+
+rcu_read_lock();
+DirtyMemoryBlocks *blocks = atomic_rcu_read(
+  

Re: [Qemu-devel] [PATCH v2] Add inst_dirty_pages_rate in 'info migrate'

2017-03-09 Thread Chao Fan
On Wed, Mar 08, 2017 at 01:45:59PM +, Daniel P. Berrange wrote:
>On Wed, Mar 08, 2017 at 04:28:19PM +0800, Chao Fan wrote:
>> Auto-converge aims to accelerate migration by slowing down the
>> generation of dirty pages. But user doesn't know how to determine the
>> throttle value, so, a new item "inst-dirty-pages-rate" in "info migrate"
>> would be helpful for user's determination.
Hi Daniel,

Thank you for your reply.
>
>The "info migrate" command already reports a "dirty-pages-rate" value.
>
>Maybe I'm misunderstanding what you're calculating, but this proposal
>looks the same, except reporting in bytes rather than page counts.
>
>QEMU in fact already records the bytes count internally too in the
>'dirty_pages_bytes' parameter which is calculated from taking
>'dirty_pages_size * TARGET_PAGE_SIZE'.
>
>So I wonder if we can just export the existing dirty-pages-bytes
>value in info migrate, and avoid needing this new code here:
>
It's different: inst-dirty-pages-rate in this patch is greater than
or equal to dirty-pages-bytes, because of this code in
cpu_physical_memory_sync_dirty_bitmap, in include/exec/ram_addr.h:

if (src[idx][offset]) {
unsigned long bits = atomic_xchg(&src[idx][offset], 0);
unsigned long new_dirty;
new_dirty = ~dest[k];
dest[k] |= bits;
new_dirty &= bits;
num_dirty += ctpopl(new_dirty);
}

After this code runs, only pages that are dirty in
dirty_memory[DIRTY_MEMORY_MIGRATION] but not yet dirty in the bitmap (dest)
are counted. For example:
When ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] = 0b,
and atomic_rcu_read(&migration_bitmap_rcu)->bmap = 0b0011,
new_dirty will be 0b1100, and this function will return 2 rather than
the expected 4.
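
The effect can be reproduced with a standalone sketch of the same word-level merge. The source value in the example above is cut off in this mail, so the sketch assumes, purely for illustration, that all four low bits are set (0xF); `count_bits` and `sync_word` are hypothetical names, with `count_bits` standing in for `ctpopl`.

```c
#include <assert.h>

/* Count the set bits in a word (portable stand-in for ctpopl). */
static unsigned long count_bits(unsigned long w)
{
    unsigned long n = 0;

    while (w) {
        w &= w - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/*
 * Merge freshly harvested dirty bits into one destination bitmap word and
 * return how many bits became newly dirty, mirroring the quoted snippet.
 */
static unsigned long sync_word(unsigned long *dest, unsigned long bits)
{
    unsigned long new_dirty;

    assert(dest);
    new_dirty = ~*dest;   /* bits not yet dirty in the destination */
    *dest |= bits;
    new_dirty &= bits;    /* keep only those that were just dirtied */
    return count_bits(new_dirty);
}
```

With a destination word of 0x3 and harvested bits of 0xF, `sync_word` returns 2 while `count_bits(0xF)` is 4. That gap between newly dirty pages and all currently dirty pages is the difference described above.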

Thanks,
Chao Fan

>> +static void migration_inst_rate(void)
>> +{
>> +int64_t dirty_pages_time_now;
>> +if (!dirty_pages_time_prev) {
>> +dirty_pages_time_prev = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>> +}
>> +dirty_pages_time_now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>> +if (dirty_pages_time_now > dirty_pages_time_prev + 1000) {
>> +RAMBlock *block;
>> +MigrationState *s = migrate_get_current();
>> +int64_t inst_dirty_pages = 0;
>> +int64_t i;
>> +unsigned long *num;
>> +unsigned long len = 0;
>> +
>> +rcu_read_lock();
>> +DirtyMemoryBlocks *blocks = atomic_rcu_read(
>> + &ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]);
>> +QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>> +if (len == 0) {
>> +len = block->offset;
>> +}
>> +len += block->used_length;
>> +}
>> +ram_addr_t idx = (len >> TARGET_PAGE_BITS) / 
>> DIRTY_MEMORY_BLOCK_SIZE;
>> +if (((len >> TARGET_PAGE_BITS) % DIRTY_MEMORY_BLOCK_SIZE) != 0) {
>> +idx++;
>> +}
>> +for (i = 0; i < idx; i++) {
>> +num = blocks->blocks[i];
>> +inst_dirty_pages += bitmap_weight(num, DIRTY_MEMORY_BLOCK_SIZE);
>> +}
>> +rcu_read_unlock();
>> +
>> +s->inst_dirty_pages_rate = inst_dirty_pages * TARGET_PAGE_SIZE *
>> +1000 / (dirty_pages_time_now - dirty_pages_time_prev);
>> +}
>> +dirty_pages_time_prev = dirty_pages_time_now;
>>  }
>
>Regards,
>Daniel
>-- 
>|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
>|: http://libvirt.org  -o- http://virt-manager.org :|
>|: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|
>
>





[PATCH] COLO-compare: Fix incorrect `if` logic

2019-09-24 Thread Fan Yang
'colo_mark_tcp_pkt' should return 'true' when packets are the same, and
'false' otherwise.  However, it returns 'true' when
'colo_compare_packet_payload' returns non-zero while
'colo_compare_packet_payload' is just a 'memcmp'.  The result is that
COLO-compare reports inconsistent TCP packets when they are actually
the same.

Signed-off-by: Fan Yang 
---
 net/colo-compare.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 7489840bde..7ee17f2cf8 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -319,7 +319,7 @@ static bool colo_mark_tcp_pkt(Packet *ppkt, Packet *spkt,
 *mark = 0;
 
 if (ppkt->tcp_seq == spkt->tcp_seq && ppkt->seq_end == spkt->seq_end) {
-if (colo_compare_packet_payload(ppkt, spkt,
+if (!colo_compare_packet_payload(ppkt, spkt,
 ppkt->header_size, spkt->header_size,
 ppkt->payload_size)) {
 *mark = COLO_COMPARE_FREE_SECONDARY | COLO_COMPARE_FREE_PRIMARY;
@@ -329,7 +329,7 @@ static bool colo_mark_tcp_pkt(Packet *ppkt, Packet *spkt,
 
 /* one part of secondary packet payload still need to be compared */
 if (!after(ppkt->seq_end, spkt->seq_end)) {
-if (colo_compare_packet_payload(ppkt, spkt,
+if (!colo_compare_packet_payload(ppkt, spkt,
 ppkt->header_size + ppkt->offset,
 spkt->header_size + spkt->offset,
 ppkt->payload_size - ppkt->offset)) {
@@ -348,7 +348,7 @@ static bool colo_mark_tcp_pkt(Packet *ppkt, Packet *spkt,
 /* primary packet is longer than secondary packet, compare
  * the same part and mark the primary packet offset
  */
-if (colo_compare_packet_payload(ppkt, spkt,
+if (!colo_compare_packet_payload(ppkt, spkt,
 ppkt->header_size + ppkt->offset,
 spkt->header_size + spkt->offset,
 spkt->payload_size - spkt->offset)) {
-- 
2.17.1
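
The inverted condition is easy to reproduce outside QEMU. The sketch below is a minimal illustration, assuming only what the commit message states, namely that the payload comparison behaves like `memcmp`; `payloads_equal` is a hypothetical helper, not a function from net/colo-compare.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true when the two payloads are byte-for-byte identical. */
static bool payloads_equal(const void *a, const void *b, size_t len)
{
    assert(a && b);
    /* memcmp returns 0 on equality, so the result must be compared to 0 */
    return memcmp(a, b, len) == 0;
}
```

Using the raw `memcmp` result as a truth value, as in `if (memcmp(a, b, len))`, takes the branch when the buffers differ, which is the opposite of what an "are they the same" check wants.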




Re: [PATCH] COLO-compare: Fix incorrect `if` logic

2019-09-24 Thread Fan Yang
OK, thank you all :)

Jason Wang  writes:

> On 2019/9/24 11:35 PM, Philippe Mathieu-Daudé wrote:
>> Hi Fan,
>>
>> you forgot to Cc the maintainers (doing that for you):
>>
>> ./scripts/get_maintainer.pl -f net/colo-compare.c
>> Zhang Chen  (supporter:COLO Proxy)
>> Li Zhijian  (supporter:COLO Proxy)
>> Jason Wang  (maintainer:Network device ba...)
>> qemu-devel@nongnu.org (open list:All patches CC here)
>>
>> On 9/24/19 4:08 PM, Fan Yang wrote:
>>> 'colo_mark_tcp_pkt' should return 'true' when packets are the same, and
>>> 'false' otherwise.  However, it returns 'true' when
>>> 'colo_compare_packet_payload' returns non-zero while
>>> 'colo_compare_packet_payload' is just a 'memcmp'.  The result is that
>>> COLO-compare reports inconsistent TCP packets when they are actually
>>> the same.
>>>
>> Fixes: f449c9e549c
>> Reviewed-by: Philippe Mathieu-Daudé 
>
>
> Applied.
>
> Thanks
>
>
>>
>>> Signed-off-by: Fan Yang 
>>> ---
>>>  net/colo-compare.c | 6 +++---
>>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>>> index 7489840bde..7ee17f2cf8 100644
>>> --- a/net/colo-compare.c
>>> +++ b/net/colo-compare.c
>>> @@ -319,7 +319,7 @@ static bool colo_mark_tcp_pkt(Packet *ppkt, Packet 
>>> *spkt,
>>>  *mark = 0;
>>>  
>>>  if (ppkt->tcp_seq == spkt->tcp_seq && ppkt->seq_end == spkt->seq_end) {
>>> -if (colo_compare_packet_payload(ppkt, spkt,
>>> +if (!colo_compare_packet_payload(ppkt, spkt,
>>>  ppkt->header_size, 
>>> spkt->header_size,
>>>  ppkt->payload_size)) {
>>>  *mark = COLO_COMPARE_FREE_SECONDARY | 
>>> COLO_COMPARE_FREE_PRIMARY;
>>> @@ -329,7 +329,7 @@ static bool colo_mark_tcp_pkt(Packet *ppkt, Packet 
>>> *spkt,
>>>  
>>>  /* one part of secondary packet payload still need to be compared */
>>>  if (!after(ppkt->seq_end, spkt->seq_end)) {
>>> -if (colo_compare_packet_payload(ppkt, spkt,
>>> +if (!colo_compare_packet_payload(ppkt, spkt,
>>>  ppkt->header_size + ppkt->offset,
>>>  spkt->header_size + spkt->offset,
>>>  ppkt->payload_size - 
>>> ppkt->offset)) {
>>> @@ -348,7 +348,7 @@ static bool colo_mark_tcp_pkt(Packet *ppkt, Packet 
>>> *spkt,
>>>  /* primary packet is longer than secondary packet, compare
>>>   * the same part and mark the primary packet offset
>>>   */
>>> -if (colo_compare_packet_payload(ppkt, spkt,
>>> +if (!colo_compare_packet_payload(ppkt, spkt,
>>>  ppkt->header_size + ppkt->offset,
>>>  spkt->header_size + spkt->offset,
>>>  spkt->payload_size - 
>>> spkt->offset)) {
>>>
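The inverted condition follows from the memcmp convention: memcmp returns 0
when the two buffers are equal, so a "same payload" test has to negate the
result. A minimal standalone sketch of that logic is below; payloads_match is
a hypothetical helper written for illustration, not a function from
net/colo-compare.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns true when the first len bytes of the
 * primary and secondary payloads are identical.
 * memcmp() returns 0 on equality, so the result must be negated.
 */
static bool payloads_match(const unsigned char *primary,
                           const unsigned char *secondary,
                           size_t len)
{
    return !memcmp(primary, secondary, len);
}
```

Without the negation, identical payloads would yield false and differing
payloads true, which matches the symptom described in the commit message.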



Re: [Qemu-devel] An issue for migration determining the cpu throttle value according to workload

2016-12-11 Thread Chao Fan
ping...

Thanks,
Chao Fan

On Tue, Dec 06, 2016 at 04:52:11PM +0800, Chao Fan wrote:
>Hi all,
>
>Here is an issue in auto-converge feature of migration.
>
>When migrating a guest that consumes a lot of CPU and memory, the number
>of dirty pages increases significantly, and so does the migration time;
>in the worst case, the migration cannot complete at all.
>
>I ran some simple tests on this feature. I set the two throttle parameters
>to the same value (10, 20, 30, 40, 50, 60, 70, 80, 99 in turn) and ran the
>same task in the same guest. Roughly, as the two parameters increase,
>total_time and dirty_sync_count decrease. In other words, the larger the
>two parameters are, the faster the migration completes, but the more
>slowly the guest runs.
>
>So I think there should be an appropriate throttle value for each guest
>workload, but users do not know how to determine it.
>
>So I would like QEMU to set the throttle value according to the guest
>workload. I think QEMU could calculate the instant dirty page rate and
>then determine an appropriate throttle value from it. The instant dirty
>page rate means how many pages are dirtied in a short, fixed time.
>But I have two questions:
>1. Where to add this feature. I have two options:
>   a. Currently QEMU checks the remaining migration time and decides
>      whether to start CPU throttling. This could be changed so that
>      QEMU starts CPU throttling when the instant dirty page rate rises
>      above a certain threshold, and sets the throttle value according
>      to the instant dirty page rate.
>   b. Keep the current trigger as it is: when the remaining migration
>      time is too long and CPU throttling begins, assign an appropriate
>      throttle value according to the workload. This method requires
>      fewer code changes.
>2. How to determine the CPU throttle value from the dirty pages.
>   My preliminary idea is that the CPU throttle value should be related
>   to the instant dirty page rate and the total memory.
>   But I am not sure what the best mapping from instant dirty page rate
>   and total memory to CPU throttle value is.
>
>Any comments will be welcome, and I want to know whether more people
>think this feature is needed.
>If anyone has good ideas, please tell me.
>
>Thanks,
>Chao Fan
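One possible shape for question 2, purely as a sketch: express the instant
dirty page rate as the percentage of guest RAM dirtied per second, then clamp
it into the 1..99 range used in the tests above. Both function names and the
linear mapping are assumptions made for illustration; nothing here is taken
from QEMU.

```c
#include <stdint.h>

/*
 * Hypothetical: percentage of guest RAM dirtied per second.
 * dirty_pages is the number of pages dirtied during elapsed_ms.
 */
static uint64_t dirty_rate_percent(uint64_t dirty_pages, uint64_t page_size,
                                   uint64_t elapsed_ms, uint64_t ram_size)
{
    uint64_t bytes_per_sec;

    if (elapsed_ms == 0 || ram_size == 0) {
        return 0;
    }
    /* Scale milliseconds to seconds before relating to RAM size. */
    bytes_per_sec = dirty_pages * page_size * 1000 / elapsed_ms;
    return bytes_per_sec * 100 / ram_size;
}

/* Hypothetical linear mapping, clamped to the 1..99 range. */
static int throttle_from_rate(uint64_t rate_percent)
{
    if (rate_percent < 1) {
        return 1;
    }
    if (rate_percent > 99) {
        return 99;
    }
    return (int)rate_percent;
}
```

Whether a linear mapping is adequate is exactly the open question raised in
the mail; the sketch only makes the inputs and the output range concrete.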





[Qemu-devel] [PATCH RFC] migration: set cpu throttle value by workload

2016-12-29 Thread Chao Fan
This RFC PATCH is a demo of the new feature; here is my POC mail:
https://lists.gnu.org/archive/html/qemu-devel/2016-12/msg00646.html

When migration_bitmap_sync is executed, get the time and read the bitmap
to calculate how many pages were dirtied between two syncs.
Use inst_dirty_pages / (time_now - time_prev) / ram_size to get
inst_dirty_pages_rate, then map inst_dirty_pages_rate to a cpu throttle
value. I have no idea how best to map it, so I just do it in a simple
way. The mapping is just a guess and should be improved.

This is just a demo. There are other possible methods:
1. In another file, calculate inst_dirty_pages_rate every second,
   every two seconds, or at another fixed interval. Then set the cpu
   throttle value according to inst_dirty_pages_rate.
2. When inst_dirty_pages_rate reaches a threshold, begin cpu throttling
   and set the throttle value.

Any comments will be welcome.

Signed-off-by: Chao Fan 
---
 include/qemu/bitmap.h | 17 +
 migration/ram.c   | 49 +
 2 files changed, 66 insertions(+)

diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
index 63ea2d0..dc99f9b 100644
--- a/include/qemu/bitmap.h
+++ b/include/qemu/bitmap.h
@@ -235,4 +235,21 @@ static inline unsigned long *bitmap_zero_extend(unsigned 
long *old,
 return new;
 }
 
+static inline unsigned long bitmap_weight(const unsigned long *src, long nbits)
+{
+unsigned long i, count = 0, nlong = nbits / BITS_PER_LONG;
+
+if (small_nbits(nbits)) {
+return hweight_long(*src & BITMAP_LAST_WORD_MASK(nbits));
+}
+for (i = 0; i < nlong; i++) {
+count += hweight_long(src[i]);
+}
+if (nbits % BITS_PER_LONG) {
+count += hweight_long(src[i] & BITMAP_LAST_WORD_MASK(nbits));
+}
+
+return count;
+}
+
 #endif /* BITMAP_H */
diff --git a/migration/ram.c b/migration/ram.c
index a1c8089..f96e3e3 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -44,6 +44,7 @@
 #include "exec/ram_addr.h"
 #include "qemu/rcu_queue.h"
 #include "migration/colo.h"
+#include "hw/boards.h"
 
 #ifdef DEBUG_MIGRATION_RAM
 #define DPRINTF(fmt, ...) \
@@ -599,6 +600,9 @@ static int64_t num_dirty_pages_period;
 static uint64_t xbzrle_cache_miss_prev;
 static uint64_t iterations_prev;
 
+static int64_t dirty_pages_time_prev;
+static int64_t dirty_pages_time_now;
+
 static void migration_bitmap_sync_init(void)
 {
 start_time = 0;
@@ -606,6 +610,49 @@ static void migration_bitmap_sync_init(void)
 num_dirty_pages_period = 0;
 xbzrle_cache_miss_prev = 0;
 iterations_prev = 0;
+
+dirty_pages_time_prev = 0;
+dirty_pages_time_now = 0;
+}
+
+static void migration_inst_rate(void)
+{
+RAMBlock *block;
+MigrationState *s = migrate_get_current();
+int64_t inst_dirty_pages_rate, inst_dirty_pages = 0;
+int64_t i;
+unsigned long *num;
+unsigned long len = 0;
+
+dirty_pages_time_now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+if (dirty_pages_time_prev != 0) {
+rcu_read_lock();
+DirtyMemoryBlocks *blocks = atomic_rcu_read(
+ &ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]);
+QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+if (len == 0) {
+len = block->offset;
+}
+len += block->used_length;
+}
+ram_addr_t idx = (len >> TARGET_PAGE_BITS) / DIRTY_MEMORY_BLOCK_SIZE;
+if (((len >> TARGET_PAGE_BITS) % DIRTY_MEMORY_BLOCK_SIZE) != 0) {
+idx++;
+}
+for (i = 0; i < idx; i++) {
+num = blocks->blocks[i];
+inst_dirty_pages += bitmap_weight(num, DIRTY_MEMORY_BLOCK_SIZE);
+}
+rcu_read_unlock();
+
+inst_dirty_pages_rate = inst_dirty_pages * TARGET_PAGE_SIZE *
+1024 * 1024 * 1000 /
+(dirty_pages_time_now - dirty_pages_time_prev) /
+current_machine->ram_size;
+s->parameters.cpu_throttle_initial = inst_dirty_pages_rate / 200;
+s->parameters.cpu_throttle_increment = inst_dirty_pages_rate / 200;
+}
+dirty_pages_time_prev = dirty_pages_time_now;
 }
 
 static void migration_bitmap_sync(void)
@@ -629,6 +676,8 @@ static void migration_bitmap_sync(void)
 trace_migration_bitmap_sync_start();
 memory_global_dirty_log_sync();
 
+migration_inst_rate();
+
 qemu_mutex_lock(&migration_bitmap_mutex);
 rcu_read_lock();
 QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
-- 
2.9.3
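The bitmap_weight() helper added by this patch is a population count over a
bitmap: whole words first, then the masked tail. A standalone sketch of the
same idea follows. It assumes a GCC or Clang compiler, using
__builtin_popcountl in place of QEMU's hweight_long(); count_set_bits and
BITS_PER_LONG_SKETCH are hypothetical names.

```c
#include <limits.h>

#define BITS_PER_LONG_SKETCH (sizeof(unsigned long) * CHAR_BIT)

/* Counts the set bits among the first nbits bits of the bitmap src. */
static unsigned long count_set_bits(const unsigned long *src,
                                    unsigned long nbits)
{
    unsigned long i, count = 0;
    unsigned long nlong = nbits / BITS_PER_LONG_SKETCH;
    unsigned long rem = nbits % BITS_PER_LONG_SKETCH;

    /* Full words: every bit belongs to the bitmap. */
    for (i = 0; i < nlong; i++) {
        count += (unsigned long)__builtin_popcountl(src[i]);
    }
    /* Partial last word: mask off the bits beyond nbits. */
    if (rem) {
        unsigned long mask = (1UL << rem) - 1;
        count += (unsigned long)__builtin_popcountl(src[i] & mask);
    }
    return count;
}
```

In the patch, each set bit stands for one dirty page, so the returned count
is the number of dirty pages in the block that was scanned.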






Re: [Qemu-devel] [PATCH RFC] migration: set cpu throttle value by workload

2016-12-29 Thread Chao Fan
Hi all,

There is something to explain in this RFC PATCH.

On Thu, Dec 29, 2016 at 05:16:19PM +0800, Chao Fan wrote:
>This RFC PATCH is a demo of the new feature; here is my POC mail:
>https://lists.gnu.org/archive/html/qemu-devel/2016-12/msg00646.html
>
>When migration_bitmap_sync is executed, get the time and read the bitmap
>to calculate how many pages were dirtied between two syncs.
>Use inst_dirty_pages / (time_now - time_prev) / ram_size to get
>inst_dirty_pages_rate, then map inst_dirty_pages_rate to a cpu throttle
>value. I have no idea how best to map it, so I just do it in a simple
>way. The mapping is just a guess and should be improved.
>
>This is just a demo. There are other possible methods:
>1. In another file, calculate inst_dirty_pages_rate every second,
>   every two seconds, or at another fixed interval. Then set the cpu
>   throttle value according to inst_dirty_pages_rate.
>2. When inst_dirty_pages_rate reaches a threshold, begin cpu throttling
>   and set the throttle value.
>
>Any comments will be welcome.
>
>Signed-off-by: Chao Fan 
>---
> include/qemu/bitmap.h | 17 +
> migration/ram.c   | 49 +
> 2 files changed, 66 insertions(+)
>
>diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
>index 63ea2d0..dc99f9b 100644
>--- a/include/qemu/bitmap.h
>+++ b/include/qemu/bitmap.h
>@@ -235,4 +235,21 @@ static inline unsigned long *bitmap_zero_extend(unsigned 
>long *old,
> return new;
> }
> 
>+static inline unsigned long bitmap_weight(const unsigned long *src, long 
>nbits)

This function is imported from the Linux kernel; it is used here to count
the dirty pages.

>+{
>+unsigned long i, count = 0, nlong = nbits / BITS_PER_LONG;
>+
>+if (small_nbits(nbits)) {
>+return hweight_long(*src & BITMAP_LAST_WORD_MASK(nbits));
>+}
>+for (i = 0; i < nlong; i++) {
>+count += hweight_long(src[i]);
>+}
>+if (nbits % BITS_PER_LONG) {
>+count += hweight_long(src[i] & BITMAP_LAST_WORD_MASK(nbits));
>+}
>+
>+return count;
>+}
>+
> #endif /* BITMAP_H */
>diff --git a/migration/ram.c b/migration/ram.c
>index a1c8089..f96e3e3 100644
>--- a/migration/ram.c
>+++ b/migration/ram.c
>@@ -44,6 +44,7 @@
> #include "exec/ram_addr.h"
> #include "qemu/rcu_queue.h"
> #include "migration/colo.h"
>+#include "hw/boards.h"
> 
> #ifdef DEBUG_MIGRATION_RAM
> #define DPRINTF(fmt, ...) \
>@@ -599,6 +600,9 @@ static int64_t num_dirty_pages_period;
> static uint64_t xbzrle_cache_miss_prev;
> static uint64_t iterations_prev;
> 
>+static int64_t dirty_pages_time_prev;
>+static int64_t dirty_pages_time_now;
>+
> static void migration_bitmap_sync_init(void)
> {
> start_time = 0;
>@@ -606,6 +610,49 @@ static void migration_bitmap_sync_init(void)
> num_dirty_pages_period = 0;
> xbzrle_cache_miss_prev = 0;
> iterations_prev = 0;
>+
>+dirty_pages_time_prev = 0;
>+dirty_pages_time_now = 0;
>+}
>+
>+static void migration_inst_rate(void)
>+{
>+RAMBlock *block;
>+MigrationState *s = migrate_get_current();
>+int64_t inst_dirty_pages_rate, inst_dirty_pages = 0;
>+int64_t i;
>+unsigned long *num;
>+unsigned long len = 0;
>+
>+dirty_pages_time_now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);

We do this each time a sync is executed. Sampling the pages and the time
every second, or at another fixed interval, would also be OK, but I have
no idea which is better.

>+if (dirty_pages_time_prev != 0) {
>+rcu_read_lock();
>+DirtyMemoryBlocks *blocks = atomic_rcu_read(
>+ &ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]);
>+QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>+if (len == 0) {
>+len = block->offset;
>+}
>+len += block->used_length;
>+}
>+ram_addr_t idx = (len >> TARGET_PAGE_BITS) / DIRTY_MEMORY_BLOCK_SIZE;
>+if (((len >> TARGET_PAGE_BITS) % DIRTY_MEMORY_BLOCK_SIZE) != 0) {
>+idx++;
>+}
>+for (i = 0; i < idx; i++) {
>+num = blocks->blocks[i];
>+inst_dirty_pages += bitmap_weight(num, DIRTY_MEMORY_BLOCK_SIZE);
>+}
>+rcu_read_unlock();
>+
>+inst_dirty_pages_rate = inst_dirty_pages * TARGET_PAGE_SIZE *
>+1024 * 1024 * 1000 /

The time we get is in ms, so the page count is multiplied by 1000 to
convert the rate to per second.

The two *1024 factors are only there to keep the magnitude; otherwise
inst_dirty_pages is so small that the rate would be 0.

>+(dirty_pages_time_now 
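To make the scaling concrete, the arithmetic described above can be written
as a standalone function with plain integers. This is a sketch that mirrors
the formula in the quoted patch; inst_rate_scaled is a hypothetical name and
the guard against a zero interval is an addition not present in the patch.

```c
#include <stdint.h>

/*
 * Returns (bytes dirtied per second / ram_size) scaled by 1024 * 1024.
 * The factor 1000 converts the millisecond interval to seconds.
 */
static int64_t inst_rate_scaled(int64_t dirty_pages, int64_t page_size,
                                int64_t elapsed_ms, int64_t ram_size)
{
    if (elapsed_ms <= 0 || ram_size <= 0) {
        return 0;
    }
    return dirty_pages * page_size * 1024 * 1024 * 1000 /
           elapsed_ms / ram_size;
}
```

For example, 1024 dirty pages of 4 KiB in one second on a 4 GiB guest give a
scaled rate of 1024, and the patch's division by 200 then yields a throttle
value of 5. Note that with 4 KiB pages the intermediate product exceeds the
int64_t range once more than roughly 2.1 million pages (about 8 GiB) are
dirty in one interval, so a real implementation would probably need to
divide earlier or use a wider type.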

Re: [Qemu-devel] [patch v6 11/12] vfio: register aer resume notification handler for aer resume

2016-05-05 Thread Chen Fan


On 04/26/2016 10:48 PM, Alex Williamson wrote:

On Tue, 26 Apr 2016 11:39:02 +0800
Chen Fan  wrote:


On 04/14/2016 09:02 AM, Chen Fan wrote:

On 04/12/2016 05:38 AM, Alex Williamson wrote:

On Tue, 5 Apr 2016 19:42:02 +0800
Cao jin  wrote:
  

From: Chen Fan

To support AER recovery, host and guest run the same AER recovery code,
which performs a secondary bus reset if the error is fatal. The AER
recovery process is:
1. error_detected
2. reset_link (if fatal)
3. slot_reset/mmio_enabled
4. resume

This means the host will perform a secondary bus reset in step 2 to reset
the physical devices under the bus, which leaves the devices in D3 state
for a short time. In QEMU, however, we register an error-detected handler
that is invoked when the host broadcasts the error-detected event in
step 1. If the guest performs reset_link while the host is performing
reset_link at the same time, it may cause a fatal error. To avoid this,
we introduce a resume notifier to make sure the host reset has completed,
and only then inject the AER into the guest.

Why is it safe to continue running the VM between the error detected
notification and the resume notification?  We're just pushing back the
point at which we inject the AER into the guest, potentially negating
any benefit by allowing the VM to consume bad data.  Shouldn't we
instead be immediately notifying the VM on error detected, but stalling
any access to the device until resume is signaled?  How do we know that
resume will ever be signaled?  We have both the problem that we may be
running on an older kernel that won't support a resume notification and
the problem that seeing a resume notification depends on the host being
able to successfully complete a link reset after fatal error. We can
detect support for resume notification, but we still need a strategy
for never receiving it.  Thanks,

That makes sense, but I haven't come up with a good idea. Do you have
any ideas, Alex?

I don't know that there are any good solutions here.  We need to
respond to the current error notifier interrupt and not regress from
our support there.  I think that means that if we want to switch from a
simple halt-on-error to a mechanism for the guest to handle recovery,
we need to disable access to the device between being notified that the
error occurred and being notified to resume.  We can do that by
disabling mmaps to the device and preventing access via the slow path
handlers.  I don't know what the best solution is for preventing access,
do we block and pause the VM or do we drop writes and return -1 for
reads, that's something that needs to be determined.  We also need to
inject the AER into the VM at the point we're notified of an error
because the VM needs to know as soon as possible to stop using the
device or trusting any data from it.  The next coordination point would
be something like the resume notifier that you've added and there are
numerous questions around the interaction of that with the guest
handling.  Clearly we can't do a guest directed bus reset until we get
the resume notifier, so do we block that execution path in QEMU until
the resume notification is received?  What happens if we don't get that
notification?  Is there any way that we can rely on the host having
done a bus reset to the point where we don't need to act on the guest
directed reset?  These are all things that need to be figured out.
Thanks,

Maybe we can simply pause the vCPUs to prevent the VM from accessing the
device, and add two flags in VFIO_DEVICE_GET_INFO to query whether the
vfio pci driver has a resume notifier. If it does not have the resume
notifier flags, we can simply fail to boot the VM with aer enabled.
Otherwise, we should wait for the resume notifier to arrive before
restarting the CPUs. As for the problem of the duplicated bus reset by
host and guest, I think QEMU can decide whether a bus reset is needed on
the guest according to whether the error is fatal or non-fatal. I think
it's not critical and could be resolved later.


Thanks,
Chen
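The sequence proposed above (refuse to start without resume-notifier support,
pause on error, restart on resume) can be sketched as a tiny state machine.
Every name below is hypothetical and chosen for illustration; none of them is
a real VFIO or QEMU interface.

```c
#include <stdbool.h>

typedef enum {
    AER_SKETCH_RUNNING,      /* vCPUs running, device accessible */
    AER_SKETCH_PAUSED,       /* error seen, waiting for host resume */
    AER_SKETCH_UNSUPPORTED   /* host lacks the resume notifier */
} AerSketchState;

typedef enum {
    AER_SKETCH_EV_ERROR_DETECTED,
    AER_SKETCH_EV_RESUME
} AerSketchEvent;

/* Start-up check: without a resume notifier, aer cannot be enabled. */
static AerSketchState aer_sketch_init(bool host_has_resume_notifier)
{
    return host_has_resume_notifier ? AER_SKETCH_RUNNING
                                    : AER_SKETCH_UNSUPPORTED;
}

/* Advance the state for one notification from the host. */
static AerSketchState aer_sketch_step(AerSketchState s, AerSketchEvent ev)
{
    switch (s) {
    case AER_SKETCH_RUNNING:
        return ev == AER_SKETCH_EV_ERROR_DETECTED ? AER_SKETCH_PAUSED : s;
    case AER_SKETCH_PAUSED:
        return ev == AER_SKETCH_EV_RESUME ? AER_SKETCH_RUNNING : s;
    default:
        return s;
    }
}
```

The sketch deliberately leaves out the case Alex raises, where the resume
notification never arrives; as written, the machine would stay paused
forever, so some timeout or failure path would still have to be designed.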



Alex


.








[Qemu-devel] [PATCH] i386/helper: add cpu dump APIC information

2014-07-21 Thread Chen Fan
When the KVM exit reason is KVM_EXIT_SHUTDOWN, the guest is reset, but
we get no information to help diagnose the cause. KVM sets exit_reason
to KVM_EXIT_SHUTDOWN when it handles a triple fault, so we should also
dump the APIC information to help with debugging.

Signed-off-by: Chen Fan 
---
 include/qom/cpu.h|  2 ++
 kvm-all.c|  1 +
 target-i386/helper.c | 53 
 3 files changed, 56 insertions(+)

diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 1aafbf5..2d4d9d9 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -362,11 +362,13 @@ int cpu_write_elf32_qemunote(WriteCoreDumpFunction f, 
CPUState *cpu,
  * @CPU_DUMP_CODE:
  * @CPU_DUMP_FPU: dump FPU register state, not just integer
  * @CPU_DUMP_CCOP: dump info about TCG QEMU's condition code optimization state
+ * @CPU_DUMP_APIC: dump APIC info about interrupt executed state
  */
 enum CPUDumpFlags {
 CPU_DUMP_CODE = 0x0001,
 CPU_DUMP_FPU  = 0x0002,
 CPU_DUMP_CCOP = 0x0004,
+CPU_DUMP_APIC = 0x0008,
 };
 
 /**
diff --git a/kvm-all.c b/kvm-all.c
index 3ae30ee..74f27e6 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1753,6 +1753,7 @@ int kvm_cpu_exec(CPUState *cpu)
 case KVM_EXIT_SHUTDOWN:
 DPRINTF("shutdown\n");
 qemu_system_reset_request();
+cpu_dump_state(cpu, stderr, fprintf, CPU_DUMP_CODE|CPU_DUMP_APIC);
 ret = EXCP_INTERRUPT;
 break;
 case KVM_EXIT_UNKNOWN:
diff --git a/target-i386/helper.c b/target-i386/helper.c
index 11ca864..1a2d26e 100644
--- a/target-i386/helper.c
+++ b/target-i386/helper.c
@@ -25,6 +25,8 @@
 #include "monitor/monitor.h"
 #endif
 
+#include "hw/i386/apic_internal.h"
+
 //#define DEBUG_MMU
 
 static void cpu_x86_version(CPUX86State *env, int *family, int *model)
@@ -186,6 +188,7 @@ void x86_cpu_dump_state(CPUState *cs, FILE *f, 
fprintf_function cpu_fprintf,
 int flags)
 {
 X86CPU *cpu = X86_CPU(cs);
+APICCommonState *apic = APIC_COMMON(cpu->apic_state);
 CPUX86State *env = &cpu->env;
 int eflags, i, nb;
 char cc_op_name[32];
@@ -356,6 +359,56 @@ void x86_cpu_dump_state(CPUState *cs, FILE *f, 
fprintf_function cpu_fprintf,
 cpu_fprintf(f, " ");
 }
 }
+if (flags & CPU_DUMP_APIC && apic) {
+cpu_fprintf(f, "APICBASE=%08x APICID=%08x VERSION=%08x APR=%08x\n"
+   " TPR=%08xSVR=%08x LDR=%08x DFR=%08x\n",
+apic->apicbase,
+apic->id,
+apic->version,
+apic->arb_id,
+apic->tpr,
+apic->spurious_vec,
+apic->log_dest,
+apic->dest_mode);
+  for (i = 0; i < 8; i++) {
+  cpu_fprintf(f, "ISR[%3d:%3d]=%08x",
+  (32 * i), (32 * i + 31), apic->isr[i]);
+  if (i == 3 || i == 7) {
+  cpu_fprintf(f, "\n");
+  } else {
+  cpu_fprintf(f, " ");
+  }
+  }
+  for (i = 0; i < 8; i++) {
+  cpu_fprintf(f, "TMR[%3d:%3d]=%08x",
+  (32 * i), (32 * i + 31), apic->tmr[i]);
+  if (i == 3 || i == 7) {
+  cpu_fprintf(f, "\n");
+  } else {
+  cpu_fprintf(f, " ");
+  }
+  }
+  for (i = 0; i < 8; i++) {
+  cpu_fprintf(f, "IRR[%3d:%3d]=%08x",
+  (32 * i), (32 * i + 31), apic->irr[i]);
+  if (i == 3 || i == 7) {
+  cpu_fprintf(f, "\n");
+  } else {
+  cpu_fprintf(f, " ");
+  }
+  }
+  for (i = 0; i < APIC_LVT_NB; i++) {
+  cpu_fprintf(f, "LVT[%d]=%08x", i, apic->lvt[i]);
+  if (i == (APIC_LVT_NB - 1) / 2 || i == (APIC_LVT_NB -1)) {
+  cpu_fprintf(f, "\n");
+  } else {
+  cpu_fprintf(f, " ");
+  }
+  }
+  cpu_fprintf(f, "ESR=%08x ", apic->esr);
+  cpu_fprintf(f, "ICR[31:0]=%08x ICR[63:32]=%08x\n", apic->icr[0], 
apic->icr[1]);
+}
+
 if (flags & CPU_DUMP_CODE) {
 target_ulong base = env->segs[R_CS].base + env->eip;
 target_ulong offs = MIN(env->eip, DUMP_CODE_BYTES_BACKWARD);
-- 
1.9.3
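The ISR, TMR and IRR registers are printed by this patch as eight 32-bit
words, each labelled with the bit range it covers. A standalone sketch of
that label formatting is below, using snprintf instead of cpu_fprintf;
format_apic_word is a hypothetical helper, not part of the patch.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Formats word i of a 256-bit APIC register as "NAME[lo:hi]=value",
 * where lo and hi are the first and last bit covered by that word.
 * Returns the snprintf() result (length of the formatted string).
 */
static int format_apic_word(char *buf, size_t len, const char *name,
                            int i, uint32_t val)
{
    return snprintf(buf, len, "%s[%3d:%3d]=%08x",
                    name, 32 * i, 32 * i + 31, (unsigned)val);
}
```

With name "ISR", i = 1 and val = 0x10, the buffer holds
"ISR[ 32: 63]=00000010", which is the layout one line of the dump would use.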




[Qemu-devel] [PATCH] target-i386/cpu.c: Fix two error output indentation

2014-07-28 Thread Chen Fan
Signed-off-by: Chen Fan 
---
 target-i386/cpu.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 6d008ab..217500c 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -1716,9 +1716,9 @@ static void x86_set_hv_spinlocks(Object *obj, Visitor *v, 
void *opaque,
 
 if (value < min || value > max) {
 error_setg(errp, "Property %s.%s doesn't take value %" PRId64
-  " (minimum: %" PRId64 ", maximum: %" PRId64 ")",
-  object_get_typename(obj), name ? name : "null",
-  value, min, max);
+   " (minimum: %" PRId64 ", maximum: %" PRId64 ")",
+   object_get_typename(obj), name ? name : "null",
+   value, min, max);
 return;
 }
 cpu->hyperv_spinlock_attempts = value;
@@ -1808,8 +1808,8 @@ static void x86_cpu_parse_featurestr(CPUState *cs, char 
*features,
 }
 if (numvalue < min) {
 error_report("hv-spinlocks value shall always be >= 0x%x"
-", fixup will be removed in future versions",
-min);
+ ", fixup will be removed in future versions",
+ min);
 numvalue = min;
 }
 snprintf(num, sizeof(num), "%" PRId32, numvalue);
-- 
1.9.3




[Qemu-devel] [PATCH 1/3] query-memdev: fix potential memory leaks

2014-08-01 Thread Chen Fan
Signed-off-by: Chen Fan 
---
 numa.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/numa.c b/numa.c
index 7bf7834..a2b4bca 100644
--- a/numa.c
+++ b/numa.c
@@ -318,10 +318,11 @@ void memory_region_allocate_system_memory(MemoryRegion 
*mr, Object *owner,
 static int query_memdev(Object *obj, void *opaque)
 {
 MemdevList **list = opaque;
+MemdevList *m = NULL;
 Error *err = NULL;
 
 if (object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)) {
-MemdevList *m = g_malloc0(sizeof(*m));
+m = g_malloc0(sizeof(*m));
 
 m->value = g_malloc0(sizeof(*m->value));
 
@@ -369,6 +370,9 @@ static int query_memdev(Object *obj, void *opaque)
 
 return 0;
 error:
+g_free(m->value);
+g_free(m);
+
 return -1;
 }
 
-- 
1.9.3
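The pattern in this patch is the usual goto-based cleanup: the pointer is
declared at function scope so the error label can release whatever was
allocated before the failure. A standalone sketch with plain malloc/free
follows; Entry, add_entry and the fail flag are hypothetical stand-ins for
MemdevList, query_memdev() and a failed property lookup.

```c
#include <stdlib.h>

typedef struct Entry {
    int *value;
    struct Entry *next;
} Entry;

/*
 * Builds one entry and prepends it to *list.
 * fail != 0 simulates an error after both allocations succeeded.
 * Returns 0 on success, -1 on error with nothing leaked.
 */
static int add_entry(Entry **list, int v, int fail)
{
    Entry *e = NULL;

    e = calloc(1, sizeof(*e));
    if (!e) {
        goto error;
    }
    e->value = calloc(1, sizeof(*e->value));
    if (!e->value) {
        goto error;
    }
    if (fail) {
        goto error;
    }
    *e->value = v;
    e->next = *list;
    *list = e;
    return 0;

error:
    /* e may still be NULL here, so check before dereferencing it. */
    if (e) {
        free(e->value);
        free(e);
    }
    return -1;
}
```

The NULL check at the label is a choice made in this sketch; in the patch
the error label dereferences m directly, which relies on every jump to it
happening after m has been allocated.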




[Qemu-devel] [PATCH 0/3] Fix some memory leaks about query memdev

2014-08-01 Thread Chen Fan
While using valgrind to test the "query memdev" command, I found
some memory leaks. The test results:

==13802== 4 bytes in 1 blocks are definitely lost in loss record 125 of 8,508
==13802==at 0x4A08934: malloc (vg_replace_malloc.c:291)
==13802==by 0x4A08AA8: realloc (vg_replace_malloc.c:687)
==13802==by 0x64C5736: g_realloc (in /usr/lib64/libglib-2.0.so.0.3400.2)
==13802==by 0x64DE226: ??? (in /usr/lib64/libglib-2.0.so.0.3400.2)
==13802==by 0x64DE279: g_string_sized_new (in /usr/lib64/libglib-2.0.so.0.3400.2)
==13802==by 0x47CFBB: string_output_visitor_new (string-output-visitor.c:341)
==13802==by 0x3F456F: object_property_get_uint16List (object.c:970)
==13802==by 0x1E8764: query_memdev (numa.c:361)
==13802==by 0x3F3248: object_child_foreach (object.c:686)
==13802==by 0x1E9141: qmp_query_memdev (numa.c:389)
==13802==by 0x2D79A0: qmp_marshal_input_query_memdev (qmp-marshal.c:5057)
==13802==by 0x1DD7D7: handle_qmp_command (monitor.c:5038)

==15046== 48 (16 direct, 32 indirect) bytes in 1 blocks are definitely lost in loss record 4,722 of 8,549
==15046==at 0x4A08934: malloc (vg_replace_malloc.c:291)
==15046==by 0x64C541D: ??? (in /usr/lib64/libglib-2.0.so.0.3400.2)
==15046==by 0x64C56E6: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.3400.2)
==15046==by 0x1E868C: query_memdev (numa.c:325)
==15046==by 0x3F3258: object_child_foreach (object.c:686)
==15046==by 0x1E9141: qmp_query_memdev (numa.c:389)
==15046==by 0x2DDFF3: hmp_info_memdev (hmp.c:1687)
==15046==by 0x1E4B08: handle_user_command (monitor.c:4119)
==15046==by 0x1E4E7A: monitor_command_cb (monitor.c:5156)
==15046==by 0x496056: readline_handle_byte (readline.c:391)
==15046==by 0x1E4BCE: monitor_read (monitor.c:5139)
==15046==by 0x2BCDEF: fd_chr_read (qemu-char.c:213)

Chen Fan (3):
  query-memdev: fix potential memory leaks
  qom/object.c: fix string_output_get_string() memory leak
  hmp: fix MemdevList memory leak

 hmp.c| 15 ++-
 numa.c   |  6 +-
 qom/object.c | 11 ---
 3 files changed, 23 insertions(+), 9 deletions(-)

-- 
1.9.3




[Qemu-devel] [PATCH 2/3] qom/object.c: fix string_output_get_string() memory leak

2014-08-01 Thread Chen Fan
string_output_get_string() always returns the data that sov->string
points to, and the callers never free it.

Signed-off-by: Chen Fan 
---
 hmp.c|  6 --
 qom/object.c | 11 ---
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/hmp.c b/hmp.c
index 4d1838e..2414cc7 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1687,6 +1687,7 @@ void hmp_info_memdev(Monitor *mon, const QDict *qdict)
 MemdevList *memdev_list = qmp_query_memdev(&err);
 MemdevList *m = memdev_list;
 StringOutputVisitor *ov;
+char *str;
 int i = 0;
 
 
@@ -1704,10 +1705,11 @@ void hmp_info_memdev(Monitor *mon, const QDict *qdict)
m->value->prealloc ? "true" : "false");
 monitor_printf(mon, "  policy: %s\n",
HostMemPolicy_lookup[m->value->policy]);
-monitor_printf(mon, "  host nodes: %s\n",
-   string_output_get_string(ov));
+str = string_output_get_string(ov);
+monitor_printf(mon, "  host nodes: %s\n", str);
 
 string_output_visitor_cleanup(ov);
+g_free(str);
 m = m->next;
 i++;
 }
diff --git a/qom/object.c b/qom/object.c
index 0e8267b..7ae4f33 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -948,16 +948,18 @@ int object_property_get_enum(Object *obj, const char 
*name,
 {
 StringOutputVisitor *sov;
 StringInputVisitor *siv;
+char *str;
 int ret;
 
 sov = string_output_visitor_new(false);
 object_property_get(obj, string_output_get_visitor(sov), name, errp);
-siv = string_input_visitor_new(string_output_get_string(sov));
+str = string_output_get_string(sov);
+siv = string_input_visitor_new(str);
 string_output_visitor_cleanup(sov);
 visit_type_enum(string_input_get_visitor(siv),
 &ret, strings, NULL, name, errp);
 string_input_visitor_cleanup(siv);
-
+g_free(str);
 return ret;
 }
 
@@ -966,15 +968,18 @@ void object_property_get_uint16List(Object *obj, const 
char *name,
 {
 StringOutputVisitor *ov;
 StringInputVisitor *iv;
+char *str;
 
 ov = string_output_visitor_new(false);
 object_property_get(obj, string_output_get_visitor(ov),
 name, errp);
-iv = string_input_visitor_new(string_output_get_string(ov));
+str = string_output_get_string(ov);
+iv = string_input_visitor_new(str);
 visit_type_uint16List(string_input_get_visitor(iv),
   list, NULL, errp);
 string_output_visitor_cleanup(ov);
 string_input_visitor_cleanup(iv);
+g_free(str);
 }
 
 void object_property_parse(Object *obj, const char *string,
-- 
1.9.3
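The leak fixed here is an ownership question: once a getter hands its buffer
to the caller, the caller has to free it, because the producer's cleanup no
longer sees it. A standalone sketch with plain C allocation is below;
OutSketch and both functions are hypothetical and only imitate the roles of
the string output visitor, string_output_get_string() and its cleanup.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *buf;   /* heap buffer owned by the producer until taken */
} OutSketch;

/* Transfers ownership of the buffer to the caller. */
static char *out_sketch_take_string(OutSketch *o)
{
    char *s = o->buf;

    o->buf = NULL;   /* the producer no longer owns the buffer */
    return s;
}

/* Frees only what the producer still owns. */
static void out_sketch_cleanup(OutSketch *o)
{
    free(o->buf);
    o->buf = NULL;
}
```

A caller would therefore pair each out_sketch_take_string() with its own
free(), in the same way the patch pairs string_output_get_string() with
g_free().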




[Qemu-devel] [PATCH 3/3] hmp: fix MemdevList memory leak

2014-08-01 Thread Chen Fan
Signed-off-by: Chen Fan 
---
 hmp.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/hmp.c b/hmp.c
index 2414cc7..0b1c830 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1685,13 +1685,14 @@ void hmp_info_memdev(Monitor *mon, const QDict *qdict)
 {
 Error *err = NULL;
 MemdevList *memdev_list = qmp_query_memdev(&err);
-MemdevList *m = memdev_list;
+MemdevList *m;
 StringOutputVisitor *ov;
 char *str;
 int i = 0;
 
 
-while (m) {
+while (memdev_list) {
+m = memdev_list;
 ov = string_output_visitor_new(false);
 visit_type_uint16List(string_output_get_visitor(ov),
   &m->value->host_nodes, NULL, NULL);
@@ -1710,7 +1711,9 @@ void hmp_info_memdev(Monitor *mon, const QDict *qdict)
 
 string_output_visitor_cleanup(ov);
 g_free(str);
-m = m->next;
+memdev_list = memdev_list->next;
+g_free(m->value);
+g_free(m);
 i++;
 }
 
-- 
1.9.3
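Freeing a list while walking it only works if the next pointer is read
before the current node is released, which is what the reordered loop in
this patch does. A standalone sketch of that loop with plain malloc/free is
below; Node and free_list are hypothetical names used for illustration.

```c
#include <stdlib.h>

typedef struct Node {
    int *value;
    struct Node *next;
} Node;

/* Frees every node and its value; returns how many nodes were freed. */
static int free_list(Node *head)
{
    int n = 0;

    while (head) {
        Node *next = head->next;   /* read before freeing the node */

        free(head->value);
        free(head);
        head = next;
        n++;
    }
    return n;
}
```

Keeping the release logic in one function, rather than inside the printing
loop, also lets every caller reuse it.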




[Qemu-devel] [v2 1/3] query-memdev: fix potential memory leaks

2014-08-03 Thread Chen Fan
Signed-off-by: Chen Fan 
Reviewed-by: Peter Crosthwaite 
---
 numa.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/numa.c b/numa.c
index 7bf7834..a2b4bca 100644
--- a/numa.c
+++ b/numa.c
@@ -318,10 +318,11 @@ void memory_region_allocate_system_memory(MemoryRegion 
*mr, Object *owner,
 static int query_memdev(Object *obj, void *opaque)
 {
 MemdevList **list = opaque;
+MemdevList *m = NULL;
 Error *err = NULL;
 
 if (object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)) {
-MemdevList *m = g_malloc0(sizeof(*m));
+m = g_malloc0(sizeof(*m));
 
 m->value = g_malloc0(sizeof(*m->value));
 
@@ -369,6 +370,9 @@ static int query_memdev(Object *obj, void *opaque)
 
 return 0;
 error:
+g_free(m->value);
+g_free(m);
+
 return -1;
 }
 
-- 
1.9.3




[Qemu-devel] [v2 2/3] qom/object.c: fix string_output_get_string() memory leak

2014-08-03 Thread Chen Fan
string_output_get_string() uses g_string_free(str, false) to
transfer the 'str' pointer to callers, and the callers never free it.

Signed-off-by: Chen Fan 
---
 hmp.c|  6 --
 qom/object.c | 12 ++--
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/hmp.c b/hmp.c
index 4d1838e..ba40c75 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1687,6 +1687,7 @@ void hmp_info_memdev(Monitor *mon, const QDict *qdict)
 MemdevList *memdev_list = qmp_query_memdev(&err);
 MemdevList *m = memdev_list;
 StringOutputVisitor *ov;
+char *str;
 int i = 0;
 
 
@@ -1704,9 +1705,10 @@ void hmp_info_memdev(Monitor *mon, const QDict *qdict)
m->value->prealloc ? "true" : "false");
 monitor_printf(mon, "  policy: %s\n",
HostMemPolicy_lookup[m->value->policy]);
-monitor_printf(mon, "  host nodes: %s\n",
-   string_output_get_string(ov));
+str = string_output_get_string(ov);
+monitor_printf(mon, "  host nodes: %s\n", str);
 
+g_free(str);
 string_output_visitor_cleanup(ov);
 m = m->next;
 i++;
diff --git a/qom/object.c b/qom/object.c
index 0e8267b..e5aed60 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -948,14 +948,18 @@ int object_property_get_enum(Object *obj, const char 
*name,
 {
 StringOutputVisitor *sov;
 StringInputVisitor *siv;
+char *str;
 int ret;
 
 sov = string_output_visitor_new(false);
 object_property_get(obj, string_output_get_visitor(sov), name, errp);
-siv = string_input_visitor_new(string_output_get_string(sov));
+str = string_output_get_string(sov);
+siv = string_input_visitor_new(str);
 string_output_visitor_cleanup(sov);
 visit_type_enum(string_input_get_visitor(siv),
 &ret, strings, NULL, name, errp);
+
+g_free(str);
 string_input_visitor_cleanup(siv);
 
 return ret;
@@ -966,13 +970,17 @@ void object_property_get_uint16List(Object *obj, const 
char *name,
 {
 StringOutputVisitor *ov;
 StringInputVisitor *iv;
+char *str;
 
 ov = string_output_visitor_new(false);
 object_property_get(obj, string_output_get_visitor(ov),
 name, errp);
-iv = string_input_visitor_new(string_output_get_string(ov));
+str = string_output_get_string(ov);
+iv = string_input_visitor_new(str);
 visit_type_uint16List(string_input_get_visitor(iv),
   list, NULL, errp);
+
+g_free(str);
 string_output_visitor_cleanup(ov);
 string_input_visitor_cleanup(iv);
 }
-- 
1.9.3




[Qemu-devel] [v2 0/3] Fix some memory leaks about query memdev

2014-08-03 Thread Chen Fan
While using valgrind to test the "query memdev" command, I found
some memory leaks. The test results:

==13802== 4 bytes in 1 blocks are definitely lost in loss record 125 of 8,508
==13802==at 0x4A08934: malloc (vg_replace_malloc.c:291)
==13802==by 0x4A08AA8: realloc (vg_replace_malloc.c:687)
==13802==by 0x64C5736: g_realloc (in /usr/lib64/libglib-2.0.so.0.3400.2)
==13802==by 0x64DE226: ??? (in /usr/lib64/libglib-2.0.so.0.3400.2)
==13802==by 0x64DE279: g_string_sized_new (in /usr/lib64/libglib-2.0.so.0.3400.2)
==13802==by 0x47CFBB: string_output_visitor_new (string-output-visitor.c:341)
==13802==by 0x3F456F: object_property_get_uint16List (object.c:970)
==13802==by 0x1E8764: query_memdev (numa.c:361)
==13802==by 0x3F3248: object_child_foreach (object.c:686)
==13802==by 0x1E9141: qmp_query_memdev (numa.c:389)
==13802==by 0x2D79A0: qmp_marshal_input_query_memdev (qmp-marshal.c:5057)
==13802==by 0x1DD7D7: handle_qmp_command (monitor.c:5038)

==15046== 48 (16 direct, 32 indirect) bytes in 1 blocks are definitely lost in 
loss record 4,722 of 8,549
==15046==at 0x4A08934: malloc (vg_replace_malloc.c:291)
==15046==by 0x64C541D: ??? (in /usr/lib64/libglib-2.0.so.0.3400.2)
==15046==by 0x64C56E6: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.3400.2)
==15046==by 0x1E868C: query_memdev (numa.c:325)
==15046==by 0x3F3258: object_child_foreach (object.c:686)
==15046==by 0x1E9141: qmp_query_memdev (numa.c:389)
==15046==by 0x2DDFF3: hmp_info_memdev (hmp.c:1687)
==15046==by 0x1E4B08: handle_user_command (monitor.c:4119)
==15046==by 0x1E4E7A: monitor_command_cb (monitor.c:5156)
==15046==by 0x496056: readline_handle_byte (readline.c:391)
==15046==by 0x1E4BCE: monitor_read (monitor.c:5139)
==15046==by 0x2BCDEF: fd_chr_read (qemu-char.c:213)

Chen Fan (3):
  query-memdev: fix potential memory leaks
  qom/object.c: fix string_output_get_string() memory leak
  hmp: fix MemdevList memory leak

 hmp.c| 15 ++-
 numa.c   |  6 +-
 qom/object.c | 11 ---
 3 files changed, 23 insertions(+), 9 deletions(-)

-- 
1.9.3




[Qemu-devel] [v2 3/3] hmp: fix MemdevList memory leak

2014-08-03 Thread Chen Fan
The memdev_list in hmp_info_memdev() is never freed, so use the
existing qapi_free_MemdevList() to free it. Also use
qapi_free_MemdevList() in place of the open-coded list loop that
cleans up the memdev list in the error path.

Signed-off-by: Chen Fan 
---
 hmp.c  | 2 ++
 numa.c | 9 ++---
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/hmp.c b/hmp.c
index ba40c75..40a90da 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1715,4 +1715,6 @@ void hmp_info_memdev(Monitor *mon, const QDict *qdict)
 }
 
 monitor_printf(mon, "\n");
+
+qapi_free_MemdevList(memdev_list);
 }
diff --git a/numa.c b/numa.c
index a2b4bca..b3e140e 100644
--- a/numa.c
+++ b/numa.c
@@ -379,7 +379,7 @@ error:
 MemdevList *qmp_query_memdev(Error **errp)
 {
 Object *obj;
-MemdevList *list = NULL, *m;
+MemdevList *list = NULL;
 
 obj = object_resolve_path("/objects", NULL);
 if (obj == NULL) {
@@ -393,11 +393,6 @@ MemdevList *qmp_query_memdev(Error **errp)
 return list;
 
 error:
-while (list) {
-m = list;
-list = list->next;
-g_free(m->value);
-g_free(m);
-}
+qapi_free_MemdevList(list);
 return NULL;
 }
-- 
1.9.3
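
The cleanup that this patch delegates to qapi_free_MemdevList() can be
pictured with a simplified stand-in. The sketch below is only an
illustration: Item, ItemList, item_list_prepend() and item_list_free()
are hypothetical names, and the real qapi_free_MemdevList() is generated
QAPI code that does more than this loop. It shows why one shared free
helper is preferable to repeating the loop at every error path.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical payload and list node, shaped like a QAPI list
 * (value pointer plus next pointer). Not QEMU's types. */
typedef struct Item {
    int id;
} Item;

typedef struct ItemList {
    Item *value;
    struct ItemList *next;
} ItemList;

/* Push a new node at the head of the list and return the new head. */
static ItemList *item_list_prepend(ItemList *head, int id)
{
    ItemList *node = malloc(sizeof(*node));

    assert(node != NULL);
    node->value = malloc(sizeof(*node->value));
    assert(node->value != NULL);
    node->value->id = id;
    node->next = head;
    return node;
}

/* Free every node and its payload. Accepts NULL, so callers can use
 * it unconditionally on both success and error paths.
 * Returns the number of nodes released (handy for testing). */
static size_t item_list_free(ItemList *list)
{
    size_t released = 0;

    while (list) {
        ItemList *next = list->next;

        free(list->value);
        free(list);
        list = next;
        released++;
    }
    return released;
}
```

Because the helper tolerates an empty list, a caller can invoke it once
at the end of the function without tracking whether anything was added.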




Re: [Qemu-devel] [PATCH v2] pc-dimm/numa: Fix stat of memory size in node when hotplug memory

2014-09-18 Thread Chen, Fan
On Thu, 2014-09-18 at 20:07 +0800, zhanghailiang wrote: 
> When do memory hotplug, if there is numa node, we should add
> the memory size to the corresponding node memory size.
> 
> For now, it mainly affects the result of hmp command "info numa".
> 
> Signed-off-by: zhanghailiang 
> ---
>  v2:
> - Don't modify the numa_info.node_mem directly when treating hotplug memory,
>   fix the "info numa" instead (suggested by Igor Mammedov)
> ---
>  hw/mem/pc-dimm.c | 27 +++
>  include/hw/mem/pc-dimm.h |  2 ++
>  include/sysemu/sysemu.h  |  1 +
>  monitor.c|  6 +-
>  numa.c   | 15 +++
>  5 files changed, 50 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
> index 5bfc5b7..7b233a6 100644
> --- a/hw/mem/pc-dimm.c
> +++ b/hw/mem/pc-dimm.c
> @@ -195,6 +195,33 @@ out:
>  return ret;
>  }
>  
> +static int pc_dimm_stat_mem_size(Object *obj, void *opaque)
> +{
> +uint64_t *node_mem = opaque;
> +int ret;
> +
> +if (object_dynamic_cast(obj, TYPE_PC_DIMM)) {
> +DeviceState *dev = DEVICE(obj);
> +if (dev->realized && dev->hotplugged) {
> +PCDIMMDevice *dimm = PC_DIMM(obj);
> +int size = object_property_get_int(OBJECT(dimm), 
> PC_DIMM_SIZE_PROP,
> +   NULL);
> +if (size < 0) {
> +return -1;
> +}
> +node_mem[dimm->node] += size;
> +}
> +}
> +
> +ret = object_child_foreach(obj, pc_dimm_stat_mem_size, opaque);
> +return ret;
> +}
> +
> +void pc_dimm_stat_node_mem(uint64_t *node_mem)
> +{
> +object_child_foreach(qdev_get_machine(), pc_dimm_stat_mem_size, 
> node_mem);
> +}
> +
>  static Property pc_dimm_properties[] = {
>  DEFINE_PROP_UINT64(PC_DIMM_ADDR_PROP, PCDIMMDevice, addr, 0),
>  DEFINE_PROP_UINT32(PC_DIMM_NODE_PROP, PCDIMMDevice, node, 0),
> diff --git a/include/hw/mem/pc-dimm.h b/include/hw/mem/pc-dimm.h
> index 761eeef..0c9a8eb 100644
> --- a/include/hw/mem/pc-dimm.h
> +++ b/include/hw/mem/pc-dimm.h
> @@ -78,4 +78,6 @@ uint64_t pc_dimm_get_free_addr(uint64_t address_space_start,
>  int pc_dimm_get_free_slot(const int *hint, int max_slots, Error **errp);
>  
>  int qmp_pc_dimm_device_list(Object *obj, void *opaque);
> +
> +void pc_dimm_stat_node_mem(uint64_t *node_mem);
>  #endif
> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> index d8539fd..cfc1592 100644
> --- a/include/sysemu/sysemu.h
> +++ b/include/sysemu/sysemu.h
> @@ -160,6 +160,7 @@ typedef struct node_info {
>  extern NodeInfo numa_info[MAX_NODES];
>  void set_numa_nodes(void);
>  void set_numa_modes(void);
> +int query_numa_node_mem(uint64_t *node_mem);
>  extern QemuOptsList qemu_numa_opts;
>  int numa_init_func(QemuOpts *opts, void *opaque);
>  
> diff --git a/monitor.c b/monitor.c
> index 7467521..c8c812f 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -1948,7 +1948,10 @@ static void do_info_numa(Monitor *mon, const QDict 
> *qdict)
>  {
>  int i;
>  CPUState *cpu;
> +uint64_t *node_mem;
>  
> +node_mem = g_new0(uint64_t, nb_numa_nodes);
> +query_numa_node_mem(node_mem);
>  monitor_printf(mon, "%d nodes\n", nb_numa_nodes);
>  for (i = 0; i < nb_numa_nodes; i++) {
>  monitor_printf(mon, "node %d cpus:", i);
> @@ -1959,8 +1962,9 @@ static void do_info_numa(Monitor *mon, const QDict 
> *qdict)
>  }
>  monitor_printf(mon, "\n");
>  monitor_printf(mon, "node %d size: %" PRId64 " MB\n", i,
> -numa_info[i].node_mem >> 20);
> +node_mem[i] >> 20);
the indentation looks weird.

>  }
> +g_free(node_mem);
>  }
>  
>  #ifdef CONFIG_PROFILER
> diff --git a/numa.c b/numa.c
> index 3b98135..4e27dd8 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -35,6 +35,7 @@
>  #include "hw/boards.h"
>  #include "sysemu/hostmem.h"
>  #include "qmp-commands.h"
> +#include "hw/mem/pc-dimm.h"
>  
>  QemuOptsList qemu_numa_opts = {
>  .name = "numa",
> @@ -315,6 +316,20 @@ void memory_region_allocate_system_memory(MemoryRegion 
> *mr, Object *owner,
>  }
>  }
>  
> +int query_numa_node_mem(uint64_t *node_mem)
> +{
> +int i;
> +
> +if (nb_numa_nodes <= 0) {
> +return 0;
If you return here without touching the out parameter node_mem,
node_mem should be initialized.



Regards,
Chen 

> +}
> +pc_dimm_stat_node_mem(node_mem);
> +for (i = 0; i < nb_numa_nodes; i++) {
> +node_mem[i] += numa_info[i].node_mem;
> +}
> +return 0;
> +}
> +
>  static int query_memdev(Object *obj, void *opaque)
>  {
>  MemdevList **list = opaque;



[Qemu-devel] [RESEND v2 1/3] query-memdev: fix potential memory leaks

2014-08-17 Thread Chen Fan
Signed-off-by: Chen Fan 
Reviewed-by: Peter Crosthwaite 
Reviewed-by: Hu Tao 

---
 numa.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/numa.c b/numa.c
index c78cec9..aa772aa 100644
--- a/numa.c
+++ b/numa.c
@@ -318,10 +318,11 @@ void memory_region_allocate_system_memory(MemoryRegion 
*mr, Object *owner,
 static int query_memdev(Object *obj, void *opaque)
 {
 MemdevList **list = opaque;
+MemdevList *m = NULL;
 Error *err = NULL;
 
 if (object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)) {
-MemdevList *m = g_malloc0(sizeof(*m));
+m = g_malloc0(sizeof(*m));
 
 m->value = g_malloc0(sizeof(*m->value));
 
@@ -369,6 +370,9 @@ static int query_memdev(Object *obj, void *opaque)
 
 return 0;
 error:
+g_free(m->value);
+g_free(m);
+
 return -1;
 }
 
-- 
1.9.3




[Qemu-devel] [RESEND v2 0/3] Fix some memory leaks about query memdev

2014-08-17 Thread Chen Fan
When using valgrind to test the command "query memdev", I found
some memory leaks. The test results:

==13802== 4 bytes in 1 blocks are definitely lost in loss record 125 of 8,508
==13802==at 0x4A08934: malloc (vg_replace_malloc.c:291)
==13802==by 0x4A08AA8: realloc (vg_replace_malloc.c:687)
==13802==by 0x64C5736: g_realloc (in /usr/lib64/libglib-2.0.so.0.3400.2)
==13802==by 0x64DE226: ??? (in /usr/lib64/libglib-2.0.so.0.3400.2)
==13802==by 0x64DE279: g_string_sized_new (in 
/usr/lib64/libglib-2.0.so.0.3400.2)
==13802==by 0x47CFBB: string_output_visitor_new 
(string-output-visitor.c:341)
==13802==by 0x3F456F: object_property_get_uint16List (object.c:970)
==13802==by 0x1E8764: query_memdev (numa.c:361)
==13802==by 0x3F3248: object_child_foreach (object.c:686)
==13802==by 0x1E9141: qmp_query_memdev (numa.c:389)
==13802==by 0x2D79A0: qmp_marshal_input_query_memdev (qmp-marshal.c:5057)
==13802==by 0x1DD7D7: handle_qmp_command (monitor.c:5038)

==15046== 48 (16 direct, 32 indirect) bytes in 1 blocks are definitely lost in 
loss record 4,722 of 8,549
==15046==at 0x4A08934: malloc (vg_replace_malloc.c:291)
==15046==by 0x64C541D: ??? (in /usr/lib64/libglib-2.0.so.0.3400.2)
==15046==by 0x64C56E6: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.3400.2)
==15046==by 0x1E868C: query_memdev (numa.c:325)
==15046==by 0x3F3258: object_child_foreach (object.c:686)
==15046==by 0x1E9141: qmp_query_memdev (numa.c:389)
==15046==by 0x2DDFF3: hmp_info_memdev (hmp.c:1687)
==15046==by 0x1E4B08: handle_user_command (monitor.c:4119)
==15046==by 0x1E4E7A: monitor_command_cb (monitor.c:5156)
==15046==by 0x496056: readline_handle_byte (readline.c:391)
==15046==by 0x1E4BCE: monitor_read (monitor.c:5139)
==15046==by 0x2BCDEF: fd_chr_read (qemu-char.c:213)


Chen Fan (3):
  query-memdev: fix potential memory leaks
  qom/object.c: fix string_output_get_string() memory leak
  hmp: fix MemdevList memory leak

 hmp.c|  8 ++--
 numa.c   | 15 +++
 qom/object.c | 12 ++--
 3 files changed, 23 insertions(+), 12 deletions(-)

-- 
1.9.3




[Qemu-devel] [RESEND v2 3/3] hmp: fix MemdevList memory leak

2014-08-17 Thread Chen Fan
The memdev_list in hmp_info_memdev() is never freed, so use the
existing qapi_free_MemdevList() to free it. Also use
qapi_free_MemdevList() in place of the open-coded list loop that
cleans up the memdev list in the error path.

Signed-off-by: Chen Fan 
Reviewed-by: Peter Crosthwaite 
Reviewed-by: Hu Tao 

---
 hmp.c  | 2 ++
 numa.c | 9 ++---
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/hmp.c b/hmp.c
index ba40c75..40a90da 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1715,4 +1715,6 @@ void hmp_info_memdev(Monitor *mon, const QDict *qdict)
 }
 
 monitor_printf(mon, "\n");
+
+qapi_free_MemdevList(memdev_list);
 }
diff --git a/numa.c b/numa.c
index aa772aa..f07149b 100644
--- a/numa.c
+++ b/numa.c
@@ -379,7 +379,7 @@ error:
 MemdevList *qmp_query_memdev(Error **errp)
 {
 Object *obj;
-MemdevList *list = NULL, *m;
+MemdevList *list = NULL;
 
 obj = object_resolve_path("/objects", NULL);
 if (obj == NULL) {
@@ -393,11 +393,6 @@ MemdevList *qmp_query_memdev(Error **errp)
 return list;
 
 error:
-while (list) {
-m = list;
-list = list->next;
-g_free(m->value);
-g_free(m);
-}
+qapi_free_MemdevList(list);
 return NULL;
 }
-- 
1.9.3




[Qemu-devel] [RESEND v2 2/3] qom/object.c: fix string_output_get_string() memory leak

2014-08-17 Thread Chen Fan
string_output_get_string() uses g_string_free(str, false) to
transfer ownership of the 'str' pointer to its callers, which never
free it.

Signed-off-by: Chen Fan 
Reviewed-by: Peter Crosthwaite 
Reviewed-by: Hu Tao 

---
 hmp.c|  6 --
 qom/object.c | 12 ++--
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/hmp.c b/hmp.c
index 4d1838e..ba40c75 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1687,6 +1687,7 @@ void hmp_info_memdev(Monitor *mon, const QDict *qdict)
 MemdevList *memdev_list = qmp_query_memdev(&err);
 MemdevList *m = memdev_list;
 StringOutputVisitor *ov;
+char *str;
 int i = 0;
 
 
@@ -1704,9 +1705,10 @@ void hmp_info_memdev(Monitor *mon, const QDict *qdict)
m->value->prealloc ? "true" : "false");
 monitor_printf(mon, "  policy: %s\n",
HostMemPolicy_lookup[m->value->policy]);
-monitor_printf(mon, "  host nodes: %s\n",
-   string_output_get_string(ov));
+str = string_output_get_string(ov);
+monitor_printf(mon, "  host nodes: %s\n", str);
 
+g_free(str);
 string_output_visitor_cleanup(ov);
 m = m->next;
 i++;
diff --git a/qom/object.c b/qom/object.c
index 0e8267b..e5aed60 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -948,14 +948,18 @@ int object_property_get_enum(Object *obj, const char 
*name,
 {
 StringOutputVisitor *sov;
 StringInputVisitor *siv;
+char *str;
 int ret;
 
 sov = string_output_visitor_new(false);
 object_property_get(obj, string_output_get_visitor(sov), name, errp);
-siv = string_input_visitor_new(string_output_get_string(sov));
+str = string_output_get_string(sov);
+siv = string_input_visitor_new(str);
 string_output_visitor_cleanup(sov);
 visit_type_enum(string_input_get_visitor(siv),
 &ret, strings, NULL, name, errp);
+
+g_free(str);
 string_input_visitor_cleanup(siv);
 
 return ret;
@@ -966,13 +970,17 @@ void object_property_get_uint16List(Object *obj, const 
char *name,
 {
 StringOutputVisitor *ov;
 StringInputVisitor *iv;
+char *str;
 
 ov = string_output_visitor_new(false);
 object_property_get(obj, string_output_get_visitor(ov),
 name, errp);
-iv = string_input_visitor_new(string_output_get_string(ov));
+str = string_output_get_string(ov);
+iv = string_input_visitor_new(str);
 visit_type_uint16List(string_input_get_visitor(iv),
   list, NULL, errp);
+
+g_free(str);
 string_output_visitor_cleanup(ov);
 string_input_visitor_cleanup(iv);
 }
-- 
1.9.3
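
The ownership rule behind this fix can be shown without glib. The
sketch below is an assumption-laden model, not QEMU code: Builder,
builder_take_string() and the other names are hypothetical, and plain
malloc()/free() stand in for g_malloc()/g_free(). It mirrors the
behaviour described above: once the getter hands the buffer out, the
object's own cleanup no longer frees it, so the caller has to.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string-building visitor. */
typedef struct {
    char *buf;   /* owned by the builder until it is handed off */
} Builder;

static Builder *builder_new(const char *text)
{
    Builder *b = malloc(sizeof(*b));

    assert(b != NULL);
    b->buf = malloc(strlen(text) + 1);
    assert(b->buf != NULL);
    strcpy(b->buf, text);
    return b;
}

/* Hand the buffer to the caller. From here on the caller owns it
 * and must free() it; the builder forgets the pointer. */
static char *builder_take_string(Builder *b)
{
    char *s = b->buf;

    b->buf = NULL;
    return s;
}

/* Frees only what the builder still owns (free(NULL) is a no-op). */
static void builder_free(Builder *b)
{
    free(b->buf);
    free(b);
}

/* The pattern the patch introduces: keep the returned pointer in a
 * local, use it, then release it explicitly. */
static size_t use_and_release(const char *text)
{
    Builder *b = builder_new(text);
    char *str = builder_take_string(b);
    size_t len = strlen(str);

    free(str);        /* without this line the buffer leaks */
    builder_free(b);
    return len;
}
```

Passing the getter's result straight into another call, as the old code
did, leaves no pointer behind to free, which is why the patch adds the
local 'str' variable.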




[Qemu-devel] [PATCH] gtk.c: Fix memory leak in gd_set_keycode_type()

2014-09-01 Thread Chen Fan
 This memory leak was introduced by
 commit 3158a3482b0093e41f2b2596fba50774ea31ae08.

 valgrind output:
 ==14553== 21,459 (72 direct, 21,387 indirect) bytes in 1 blocks are definitely
 lost in loss record 8,055 of 8,082
 ==14553==at 0x4A06BC3: calloc (vg_replace_malloc.c:618)
 ==14553==by 0x80DBFBC: XkbGetKeyboardByName (in /usr/lib64/libX11.so.6.3.0)
 ==14553==by 0x40C704: gtk_display_init (gtk.c:1798)
 ==14553==by 0x1AEDC1: main (vl.c:4480)

Signed-off-by: Chen Fan 
---
 ui/gtk.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/ui/gtk.c b/ui/gtk.c
index 2345d7e..cdd2567 100644
--- a/ui/gtk.c
+++ b/ui/gtk.c
@@ -1810,6 +1810,13 @@ static void gd_set_keycode_type(GtkDisplayState *s)
 fprintf(stderr, "unknown keycodes `%s', please report to "
 "qemu-devel@nongnu.org\n", keycodes);
 }
+
+if (desc) {
+XkbFreeKeyboard(desc, XkbGBN_AllComponentsMask, True);
+}
+if (keycodes) {
+XFree(keycodes);
+}
 }
 #endif
 }
-- 
1.9.3




[Qemu-devel] [RFC 1/3] using CPUMASK bitmaps to calculate cpu index

2014-05-13 Thread Chen Fan
Instead of counting the number of CPUs, use the CPUMASK bitmap to
calculate the cpu index, which would also be a great benefit when
removing a cpu index.

Signed-off-by: Chen Fan 
---
 exec.c  | 9 -
 include/qom/cpu.h   | 9 +
 include/sysemu/sysemu.h | 7 ---
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/exec.c b/exec.c
index cf12049..2948841 100644
--- a/exec.c
+++ b/exec.c
@@ -473,16 +473,15 @@ void cpu_exec_init(CPUArchState *env)
 {
 CPUState *cpu = ENV_GET_CPU(env);
 CPUClass *cc = CPU_GET_CLASS(cpu);
-CPUState *some_cpu;
 int cpu_index;
 
 #if defined(CONFIG_USER_ONLY)
 cpu_list_lock();
 #endif
-cpu_index = 0;
-CPU_FOREACH(some_cpu) {
-cpu_index++;
-}
+cpu_index = find_first_zero_bit(cc->cpu_present_mask,
+MAX_CPUMASK_BITS);
+set_bit(cpu_index, cc->cpu_present_mask);
+
 cpu->cpu_index = cpu_index;
 cpu->numa_node = 0;
 QTAILQ_INIT(&cpu->breakpoints);
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index df977c8..b8f46b1 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -70,6 +70,13 @@ typedef void (*CPUUnassignedAccess)(CPUState *cpu, hwaddr 
addr,
 
 struct TranslationBlock;
 
+/* The following shall be true for all CPUs:
+ *   cpu->cpu_index < max_cpus <= MAX_CPUMASK_BITS
+ *
+ * Note that cpu->get_arch_id() may be larger than MAX_CPUMASK_BITS.
+ */
+#define MAX_CPUMASK_BITS 255
+
 /**
  * CPUClass:
  * @class_by_name: Callback to map -cpu command line model name to an
@@ -142,6 +149,8 @@ typedef struct CPUClass {
 const struct VMStateDescription *vmsd;
 int gdb_num_core_regs;
 const char *gdb_core_xml_file;
+
+DECLARE_BITMAP(cpu_present_mask, MAX_CPUMASK_BITS);
 } CPUClass;
 
 #ifdef HOST_WORDS_BIGENDIAN
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index ba5c7f8..04edb8b 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -134,13 +134,6 @@ extern QEMUClockType rtc_clock;
 
 #define MAX_NODES 64
 
-/* The following shall be true for all CPUs:
- *   cpu->cpu_index < max_cpus <= MAX_CPUMASK_BITS
- *
- * Note that cpu->get_arch_id() may be larger than MAX_CPUMASK_BITS.
- */
-#define MAX_CPUMASK_BITS 255
-
 extern int nb_numa_nodes;
 extern uint64_t node_mem[MAX_NODES];
 extern unsigned long *node_cpumask[MAX_NODES];
-- 
1.8.1.4
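
The difference between counting CPUs and using a bitmap shows up once
an index is released. The following sketch is a stand-alone model under
stated assumptions: IndexMask and the mask_* / index_* helpers are
hypothetical, written here instead of QEMU's bitmap helpers, and the
"mask full" check is an addition that the patch itself does not
contain.

```c
#include <assert.h>
#include <limits.h>

#define MAX_BITS 255   /* mirrors MAX_CPUMASK_BITS in the patch */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MASK_WORDS ((MAX_BITS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical bitmap type: one bit per index in use. */
typedef struct {
    unsigned long words[MASK_WORDS];
} IndexMask;

static int mask_test(const IndexMask *m, unsigned bit)
{
    return (m->words[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL;
}

static void mask_set(IndexMask *m, unsigned bit)
{
    m->words[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

static void mask_clear(IndexMask *m, unsigned bit)
{
    m->words[bit / BITS_PER_WORD] &= ~(1UL << (bit % BITS_PER_WORD));
}

/* Lowest clear bit below MAX_BITS, or MAX_BITS when every bit is set. */
static unsigned mask_first_zero(const IndexMask *m)
{
    unsigned i;

    for (i = 0; i < MAX_BITS; i++) {
        if (!mask_test(m, i)) {
            break;
        }
    }
    return i;
}

/* Allocate the lowest free index, or return -1 when none is left. */
static int index_alloc(IndexMask *m)
{
    unsigned i = mask_first_zero(m);

    if (i >= MAX_BITS) {
        return -1;
    }
    mask_set(m, i);
    return (int)i;
}

/* Give an index back so a later allocation can reuse it. */
static void index_release(IndexMask *m, unsigned i)
{
    assert(i < MAX_BITS);
    mask_clear(m, i);
}
```

With a plain counter, removing index 1 out of {0, 1, 2} and adding a new
entry would yield 2 again and collide with the existing one; the bitmap
returns the freed slot instead.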




[Qemu-devel] [RFC 0/3] cpu: add device_add foo-x86_64-cpu support

2014-05-13 Thread Chen Fan
This series tries to implement cpu hotplug with device_add and makes
-device foo-x86_64-cpu available. The apic-id property can also be
set on the command line; if apic-id is not set, the first unoccupied
APIC ID is used as the default for the new cpu. When hotplugging a cpu
with device_add, the APIC ID must be checked after cpu object
initialization. This differs from the 'cpu_add' command, which
checks 'ids' at the beginning.

Chen Fan (3):
  using CPUMASK bitmaps to calculate cpu index
  cpu: introduce CpuTopoInfo structure for argument simplification
  cpu: add device_add foo-x86_64-cpu support

 exec.c  |  9 +++--
 include/qom/cpu.h   | 11 ++
 include/sysemu/sysemu.h |  7 
 qdev-monitor.c  | 11 ++
 target-i386/cpu.c   | 91 -
 target-i386/topology.h  | 51 ++-
 6 files changed, 151 insertions(+), 29 deletions(-)

-- 
1.8.1.4




[Qemu-devel] [RFC 2/3] cpu: introduce CpuTopoInfo structure for argument simplification

2014-05-13 Thread Chen Fan
Signed-off-by: Chen Fan 
Reviewed-by: Eduardo Habkost 
---
 target-i386/topology.h | 33 +
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/target-i386/topology.h b/target-i386/topology.h
index 07a6c5f..e9ff89c 100644
--- a/target-i386/topology.h
+++ b/target-i386/topology.h
@@ -47,6 +47,12 @@
  */
 typedef uint32_t apic_id_t;
 
+typedef struct X86CPUTopoInfo {
+unsigned pkg_id;
+unsigned core_id;
+unsigned smt_id;
+} X86CPUTopoInfo;
+
 /* Return the bit width needed for 'count' IDs
  */
 static unsigned apicid_bitwidth_for_count(unsigned count)
@@ -92,13 +98,11 @@ static inline unsigned apicid_pkg_offset(unsigned nr_cores, 
unsigned nr_threads)
  */
 static inline apic_id_t apicid_from_topo_ids(unsigned nr_cores,
  unsigned nr_threads,
- unsigned pkg_id,
- unsigned core_id,
- unsigned smt_id)
+ const X86CPUTopoInfo *topo)
 {
-return (pkg_id  << apicid_pkg_offset(nr_cores, nr_threads)) |
-   (core_id << apicid_core_offset(nr_cores, nr_threads)) |
-   smt_id;
+return (topo->pkg_id  << apicid_pkg_offset(nr_cores, nr_threads)) |
+   (topo->core_id << apicid_core_offset(nr_cores, nr_threads)) |
+   topo->smt_id;
 }
 
 /* Calculate thread/core/package IDs for a specific topology,
@@ -107,14 +111,12 @@ static inline apic_id_t apicid_from_topo_ids(unsigned 
nr_cores,
 static inline void x86_topo_ids_from_idx(unsigned nr_cores,
  unsigned nr_threads,
  unsigned cpu_index,
- unsigned *pkg_id,
- unsigned *core_id,
- unsigned *smt_id)
+ X86CPUTopoInfo *topo)
 {
 unsigned core_index = cpu_index / nr_threads;
-*smt_id = cpu_index % nr_threads;
-*core_id = core_index % nr_cores;
-*pkg_id = core_index / nr_cores;
+topo->smt_id = cpu_index % nr_threads;
+topo->core_id = core_index % nr_cores;
+topo->pkg_id = core_index / nr_cores;
 }
 
 /* Make APIC ID for the CPU 'cpu_index'
@@ -125,10 +127,9 @@ static inline apic_id_t x86_apicid_from_cpu_idx(unsigned 
nr_cores,
 unsigned nr_threads,
 unsigned cpu_index)
 {
-unsigned pkg_id, core_id, smt_id;
-x86_topo_ids_from_idx(nr_cores, nr_threads, cpu_index,
-  &pkg_id, &core_id, &smt_id);
-return apicid_from_topo_ids(nr_cores, nr_threads, pkg_id, core_id, smt_id);
+X86CPUTopoInfo topo;
+x86_topo_ids_from_idx(nr_cores, nr_threads, cpu_index, &topo);
+return apicid_from_topo_ids(nr_cores, nr_threads, &topo);
 }
 
 #endif /* TARGET_I386_TOPOLOGY_H */
-- 
1.8.1.4
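
For readers unfamiliar with the topology helpers, here is a
self-contained sketch of how a cpu index maps to topology IDs and then
to an APIC ID. It follows the arithmetic visible in the diff, but the
offset computation is an assumption: the real apicid_core_offset() and
apicid_pkg_offset() are not shown in this patch, so bitwidth_for_count()
and the offsets below are a plausible reconstruction, and TopoIds is a
hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical equivalent of the X86CPUTopoInfo struct in the patch. */
typedef struct {
    unsigned pkg_id;
    unsigned core_id;
    unsigned smt_id;
} TopoIds;

/* Number of bits needed to encode the IDs 0 .. count-1. */
static unsigned bitwidth_for_count(unsigned count)
{
    unsigned width = 0;
    unsigned max_id;

    assert(count >= 1);
    for (max_id = count - 1; max_id != 0; max_id >>= 1) {
        width++;
    }
    return width;
}

/* Split a linear cpu index into thread, core and package IDs. */
static void topo_ids_from_idx(unsigned nr_cores, unsigned nr_threads,
                              unsigned cpu_index, TopoIds *topo)
{
    unsigned core_index = cpu_index / nr_threads;

    topo->smt_id = cpu_index % nr_threads;
    topo->core_id = core_index % nr_cores;
    topo->pkg_id = core_index / nr_cores;
}

/* Pack the IDs into bit fields: thread bits lowest, then core bits,
 * then package bits (assumed layout, see the note above). */
static uint32_t apicid_from_topo(unsigned nr_cores, unsigned nr_threads,
                                 const TopoIds *topo)
{
    unsigned core_offset = bitwidth_for_count(nr_threads);
    unsigned pkg_offset = core_offset + bitwidth_for_count(nr_cores);

    return ((uint32_t)topo->pkg_id << pkg_offset) |
           ((uint32_t)topo->core_id << core_offset) |
           topo->smt_id;
}
```

With 3 cores and 2 threads per package, the core field needs 2 bits, so
the resulting IDs are not contiguous: cpu index 5 maps to 5, while cpu
index 6 starts the next package at 8.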




[Qemu-devel] [RFC 3/3] cpu: add device_add foo-x86_64-cpu support

2014-05-13 Thread Chen Fan
In order to implement adding a cpu with device_add, the APIC ID check
must happen after object_init(). So add a UserCreatable complete
method that checks APIC ID availability, and introduce cpu_physid_mask
to record occupied APIC IDs. With that, -device foo-x86_64-cpu can be
used without setting the apic-id property, and a default APIC ID is
assigned.

Signed-off-by: Chen Fan 
---
 include/qom/cpu.h  |  2 ++
 qdev-monitor.c | 11 ++
 target-i386/cpu.c  | 91 +-
 target-i386/topology.h | 18 ++
 4 files changed, 121 insertions(+), 1 deletion(-)

diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index b8f46b1..8ba9f7b 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -151,6 +151,7 @@ typedef struct CPUClass {
 const char *gdb_core_xml_file;
 
 DECLARE_BITMAP(cpu_present_mask, MAX_CPUMASK_BITS);
+DECLARE_BITMAP(cpu_physid_mask, MAX_CPUMASK_BITS);
 } CPUClass;
 
 #ifdef HOST_WORDS_BIGENDIAN
@@ -296,6 +297,7 @@ struct CPUState {
 QTAILQ_HEAD(CPUTailQ, CPUState);
 extern struct CPUTailQ cpus;
 #define CPU_NEXT(cpu) QTAILQ_NEXT(cpu, node)
+#define CPU_REMOVE(cpu) QTAILQ_REMOVE(&cpus, cpu, node)
 #define CPU_FOREACH(cpu) QTAILQ_FOREACH(cpu, &cpus, node)
 #define CPU_FOREACH_SAFE(cpu, next_cpu) \
 QTAILQ_FOREACH_SAFE(cpu, &cpus, node, next_cpu)
diff --git a/qdev-monitor.c b/qdev-monitor.c
index 02cbe43..36c200e 100644
--- a/qdev-monitor.c
+++ b/qdev-monitor.c
@@ -24,6 +24,7 @@
 #include "qmp-commands.h"
 #include "sysemu/arch_init.h"
 #include "qemu/config-file.h"
+#include "qom/object_interfaces.h"
 
 /*
  * Aliases were a bad idea from the start.  Let's keep them
@@ -556,6 +557,16 @@ DeviceState *qdev_device_add(QemuOpts *opts)
 return NULL;
 }
 
+user_creatable_complete(OBJECT(dev), &err);
+if (err != NULL) {
+qerror_report_err(err);
+ error_free(err);
+ object_unparent(OBJECT(dev));
+ object_unref(OBJECT(dev));
+ qerror_report(QERR_DEVICE_INIT_FAILED, driver);
+ return NULL;
+}
+
 dev->opts = opts;
 object_property_set_bool(OBJECT(dev), true, "realized", &err);
 if (err != NULL) {
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 8f193a9..56cc3ad 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -48,6 +48,7 @@
 #include "hw/i386/apic_internal.h"
 #endif
 
+#include "qom/object_interfaces.h"
 
 /* Cache topology CPUID constants: */
 
@@ -158,7 +159,7 @@
 #define L2_ITLB_4K_ASSOC   4
 #define L2_ITLB_4K_ENTRIES   512
 
-
+static int64_t cpu_2_physid[MAX_CPUMASK_BITS];
 
 static void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
  uint32_t vendor2, uint32_t vendor3)
@@ -1546,12 +1547,16 @@ static void x86_cpuid_get_apic_id(Object *obj, Visitor 
*v, void *opaque,
 static void x86_cpuid_set_apic_id(Object *obj, Visitor *v, void *opaque,
   const char *name, Error **errp)
 {
+CPUState *cs = CPU(obj);
+CPUClass *cc = CPU_GET_CLASS(obj);
 X86CPU *cpu = X86_CPU(obj);
 DeviceState *dev = DEVICE(obj);
 const int64_t min = 0;
 const int64_t max = UINT32_MAX;
 Error *error = NULL;
 int64_t value;
+X86CPUTopoInfo topo;
+int64_t phys_id;
 
 if (dev->realized) {
 error_setg(errp, "Attempt to set property '%s' on '%s' after "
@@ -1571,10 +1576,28 @@ static void x86_cpuid_set_apic_id(Object *obj, Visitor 
*v, void *opaque,
 return;
 }
 
+if (value > x86_cpu_apic_id_from_index(max_cpus - 1)) {
+error_setg(errp, "CPU with APIC ID %" PRIi64
+   " is more than MAX APIC ID limits", value);
+return;
+}
+
+x86_topo_ids_from_apic_id(smp_cores, smp_threads, value, &topo);
+if (topo.smt_id >= smp_threads || topo.core_id >= smp_cores) {
+error_setg(errp, "CPU with APIC ID %" PRIi64 " does not match "
+   "topology configuration.", value);
+return;
+}
+
 if ((value != cpu->env.cpuid_apic_id) && cpu_exists(value)) {
 error_setg(errp, "CPU with APIC ID %" PRIi64 " exists", value);
 return;
 }
+
+phys_id = (topo.smt_id + topo.core_id * smp_threads
++ topo.pkg_id * smp_cores * smp_threads);
+set_bit(phys_id, cc->cpu_physid_mask);
+cpu_2_physid[cs->cpu_index] = phys_id;
 cpu->env.cpuid_apic_id = value;
 }
 
@@ -1999,12 +2022,57 @@ out:
 return cpu;
 }
 
+static void x86_cpu_cpudef_instance_init(Object *obj)
+{
+DeviceState *dev = DEVICE(obj);
+X86CPU *cpu = X86_CPU(obj);
+CPUX86State *env = &cpu->env;
+
+dev->hotplugged = true;
+
+env->cpuid_apic_id = ~0U;
+}
+
+static void x86_cpu_cpudef_complete(U

[Qemu-devel] [PATCH] target-i386: cpu: keeping function parameters alignment on new line

2014-11-05 Thread Chen Fan
Signed-off-by: Chen Fan 
---
 target-i386/cpu.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index fa860de..3f13dfe 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -540,8 +540,8 @@ void host_cpuid(uint32_t function, uint32_t count,
  * otherwise the string is assumed to sized by a terminating nul.
  * Return lexical ordering of *s1:*s2.
  */
-static int sstrcmp(const char *s1, const char *e1, const char *s2,
-const char *e2)
+static int sstrcmp(const char *s1, const char *e1,
+   const char *s2, const char *e2)
 {
 for (;;) {
 if (!*s1 || !*s2 || *s1 != *s2)
@@ -1859,7 +1859,7 @@ static void x86_cpu_parse_featurestr(CPUState *cs, char 
*features,
  * if flags, suppress names undefined in featureset.
  */
 static void listflags(char *buf, int bufsize, uint32_t fbits,
-const char **featureset, uint32_t flags)
+  const char **featureset, uint32_t flags)
 {
 const char **p = &featureset[31];
 char *q, *b, bit;
-- 
1.9.3




[Qemu-devel] [PATCH 0/2] pcie-aer: Fix command pcie_aer_inject_error invalid

2014-11-18 Thread Chen Fan
See each patch for details.

Chen Fan (2):
  pcie_aer: fix typos in pcie_aer_inject_error comment
  pcie-aer: Fix command pcie_aer_inject_error is invalid

 hw/pci/pcie_aer.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

-- 
1.9.3




[Qemu-devel] [PATCH 1/2] pcie_aer: fix typos in pcie_aer_inject_error comment

2014-11-18 Thread Chen Fan
According to the "PCI Express Base Spec 3.0", these comments do not
match the description in the spec, so fix them.

Signed-off-by: Chen Fan 
---
 hw/pci/pcie_aer.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
index 1f4be16..7ca077a 100644
--- a/hw/pci/pcie_aer.c
+++ b/hw/pci/pcie_aer.c
@@ -618,11 +618,11 @@ static bool pcie_aer_inject_uncor_error(PCIEAERInject 
*inj, bool is_fatal)
  * non-Function specific error must be recorded in all functions.
  * It is the responsibility of the caller of this function.
  * It is also caller's responsibility to determine which function should
- * report the rerror.
+ * report the error.
  *
  * 6.2.4 Error Logging
- * 6.2.5 Sqeunce of Device Error Signaling and Logging Operations
- * table 6-2: Flowchard Showing Sequence of Device Error Signaling and Logging
+ * 6.2.5 Sequence of Device Error Signaling and Logging Operations
+ * table 6-2: Flowchart Showing Sequence of Device Error Signaling and Logging
  *Operations
  */
 int pcie_aer_inject_error(PCIDevice *dev, const PCIEAERErr *err)
-- 
1.9.3




[Qemu-devel] [PATCH 2/2] pcie-aer: Fix command pcie_aer_inject_error is invalid

2014-11-18 Thread Chen Fan
In the "PCI Express 3.0" spec, section 6.2.6, Figure 6-3, the virtual
bridge part of the flowchart shows that SERR# Enable in the Bridge
Control register, together with the system error bit in the Secondary
Status register, allows the error message to be sent. But the
bridge_control value read from dev->config is 0, and SERR# is set in
dev->wmask by pcie_aer_init(), which is called by root port and
switch devices, so the wmask (writable) bits should be included when
reading bridge control.
We can use a command line like:
qemu_system_x86_64:
-device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,id=bridge1
-device x3130-upstream,bus=bridge1,id=up.1,addr=00.0
-device xio3130-downstream,bus=up.1,id=down.1,port=1,addr=00.0,chassis=5

(qemu) pcie_aer_inject_error net0 POISON_TLP

After that, the guest outputs the error message.

Signed-off-by: Chen Fan 
---
 hw/pci/pcie_aer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
index 7ca077a..571dc92 100644
--- a/hw/pci/pcie_aer.c
+++ b/hw/pci/pcie_aer.c
@@ -231,7 +231,8 @@ pcie_aer_msg_alldev(PCIDevice *dev, const PCIEAERMsg *msg)
  */
 static bool pcie_aer_msg_vbridge(PCIDevice *dev, const PCIEAERMsg *msg)
 {
-uint16_t bridge_control = pci_get_word(dev->config + PCI_BRIDGE_CONTROL);
+uint16_t bridge_control = pci_get_word(dev->config + PCI_BRIDGE_CONTROL) |
+  pci_get_word(dev->wmask + PCI_BRIDGE_CONTROL);
 
 if (pcie_aer_msg_is_uncor(msg)) {
 /* Received System Error */
-- 
1.9.3




[Qemu-devel] [PATCH V9 0/4] qemu-img: add preallocation=full

2014-05-27 Thread Chen Fan
From: Hu Tao 

The purpose of this series is to use posix_fallocate() when creating an
image file to ensure there is disk space for it, which is much faster
than actually writing to disk. But this only works at the file system
level. For cases like thin provisioning, a full preallocation option is
added that writes zeros to storage to ensure disk space.

Changes since v8 mainly address Eric's comments:

 - round up image file size to nearest sector size
 - don't blindly lose error info
 - target 2.1 rather than 2.0
 - and, rebase to latest git tree


Hu Tao (4):
  qapi: introduce PreallocMode and a new PreallocMode full.
  raw, qcow2: don't convert file size to sector size
  raw-posix: Add full image preallocation option
  qcow2: Add full image preallocation option

 block/qcow2.c  | 95 --
 block/raw-posix.c  | 64 +++
 block/raw-win32.c  |  5 ++-
 qapi-schema.json   | 14 +++
 tests/qemu-iotests/082.out | 54 +-
 5 files changed, 184 insertions(+), 48 deletions(-)

-- 
1.9.3




[Qemu-devel] [PATCH V9 1/4] qapi: introduce PreallocMode and a new PreallocMode full.

2014-05-27 Thread Chen Fan
From: Hu Tao 

This patch prepares for the subsequent patches.

Reviewed-by: Fam Zheng 
Reviewed-by: Eric Blake 
Signed-off-by: Hu Tao 
---
 block/qcow2.c|  8 
 qapi-schema.json | 14 ++
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index a4b97e8..51f547d 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1594,7 +1594,7 @@ static int preallocate(BlockDriverState *bs)
 
 static int qcow2_create2(const char *filename, int64_t total_size,
  const char *backing_file, const char *backing_format,
- int flags, size_t cluster_size, int prealloc,
+ int flags, size_t cluster_size, PreallocMode prealloc,
  QEMUOptionParameter *options, int version,
  Error **errp)
 {
@@ -1771,7 +1771,7 @@ static int qcow2_create(const char *filename, 
QEMUOptionParameter *options,
 uint64_t sectors = 0;
 int flags = 0;
 size_t cluster_size = DEFAULT_CLUSTER_SIZE;
-int prealloc = 0;
+PreallocMode prealloc = PREALLOC_MODE_OFF;
 int version = 3;
 Error *local_err = NULL;
 int ret;
@@ -1792,9 +1792,9 @@ static int qcow2_create(const char *filename, 
QEMUOptionParameter *options,
 }
 } else if (!strcmp(options->name, BLOCK_OPT_PREALLOC)) {
 if (!options->value.s || !strcmp(options->value.s, "off")) {
-prealloc = 0;
+prealloc = PREALLOC_MODE_OFF;
 } else if (!strcmp(options->value.s, "metadata")) {
-prealloc = 1;
+prealloc = PREALLOC_MODE_METADATA;
 } else {
 error_setg(errp, "Invalid preallocation mode: '%s'",
options->value.s);
diff --git a/qapi-schema.json b/qapi-schema.json
index 7bc33ea..80abcb7 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -4722,3 +4722,17 @@
   'btn' : 'InputBtnEvent',
   'rel' : 'InputMoveEvent',
   'abs' : 'InputMoveEvent' } }
+
+##
+# @PreallocMode
+#
+# Preallocation mode of QEMU image file
+#
+# @off: no preallocation
+# @metadata: preallocate only for metadata
+# @full: preallocate all data, including metadata
+#
+# Since 2.1
+##
+{ 'enum': 'PreallocMode',
+  'data': [ 'off', 'metadata', 'full' ] }
-- 
1.9.3




[Qemu-devel] [PATCH V9 3/4] raw-posix: Add full image preallocation option

2014-05-27 Thread Chen Fan
From: Hu Tao 

This patch adds a new preallocation option for the raw format and implements
full preallocation by writing zeros to disk.

The metadata option is changed to use posix_fallocate() to ensure that
subsequent writes to the image file won't fail for lack of disk space.

The purpose is to guarantee disk space for the image file. Where
posix_fallocate() is supported, the metadata option can be used; otherwise
(posix_fallocate() is not supported by the filesystem, or thin provisioning
is in use), the full option has to be used. The user has to choose the
appropriate mode.
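
For illustration only, the two strategies described above can be tried
outside QEMU. This is a minimal sketch, not the code from the patch:
reserve_space() and its "full" flag are hypothetical names, and the helper
assumes a freshly created regular file.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper (not part of the patch): reserve `size` bytes for fd.
 * full == 0: ask the filesystem to reserve blocks via posix_fallocate().
 * full != 0: write zeros over the whole range, then fsync().
 * Returns 0 on success or a negative errno value. */
static int reserve_space(int fd, int64_t size, int full)
{
    enum { CHUNK = 65536 };
    char *buf;
    int64_t left = size;
    int ret = 0;

    if (ftruncate(fd, size) != 0) {
        return -errno;
    }
    if (!full) {
        /* posix_fallocate() returns the error number; it does not set errno */
        return -posix_fallocate(fd, 0, size);
    }
    if (lseek(fd, 0, SEEK_SET) < 0) {
        return -errno;
    }
    buf = calloc(1, CHUNK);
    if (buf == NULL) {
        return -ENOMEM;
    }
    while (left > 0) {
        size_t want = left < CHUNK ? (size_t)left : (size_t)CHUNK;
        ssize_t n = write(fd, buf, want);
        if (n < 0) {
            if (errno == EINTR) {
                continue;           /* interrupted, retry the same chunk */
            }
            ret = -errno;
            break;
        }
        left -= n;                  /* a short write just means more loops */
    }
    if (ret == 0 && fsync(fd) != 0) {
        ret = -errno;
    }
    free(buf);
    return ret;
}
```

Unlike a loop that subtracts the requested length, this sketch subtracts the
number of bytes write() actually reported, so a short write does not leave a
hole at the end of the file.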

Signed-off-by: Hu Tao 
---
 block/raw-posix.c | 61 ---
 1 file changed, 54 insertions(+), 7 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 710ea9b..07e2088 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1246,6 +1246,7 @@ static int raw_create(const char *filename, 
QEMUOptionParameter *options,
 int fd;
 int result = 0;
 int64_t total_size = 0;
+PreallocMode prealloc = PREALLOC_MODE_OFF;
 
 strstart(filename, "file:", &filename);
 
@@ -1254,6 +1255,18 @@ static int raw_create(const char *filename, 
QEMUOptionParameter *options,
 if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
 total_size = (options->value.n + BDRV_SECTOR_SIZE) &
 BDRV_SECTOR_MASK;
+} else if (!strcmp(options->name, BLOCK_OPT_PREALLOC)) {
+if (!options->value.s || !strcmp(options->value.s, "off")) {
+prealloc = PREALLOC_MODE_OFF;
+} else if (!strcmp(options->value.s, "metadata")) {
+prealloc = PREALLOC_MODE_METADATA;
+} else if (!strcmp(options->value.s, "full")) {
+prealloc = PREALLOC_MODE_FULL;
+} else {
+error_setg(errp, "Invalid preallocation mode: '%s'",
+   options->value.s);
+return -EINVAL;
+}
 }
 options++;
 }
@@ -1263,16 +1276,45 @@ static int raw_create(const char *filename, 
QEMUOptionParameter *options,
 if (fd < 0) {
 result = -errno;
 error_setg_errno(errp, -result, "Could not create file");
-} else {
-if (ftruncate(fd, total_size) != 0) {
-result = -errno;
-error_setg_errno(errp, -result, "Could not resize file");
+goto out;
+}
+if (ftruncate(fd, total_size) != 0) {
+result = -errno;
+error_setg_errno(errp, -result, "Could not resize file");
+goto out_close;
+}
+if (prealloc == PREALLOC_MODE_METADATA) {
+/* posix_fallocate() doesn't set errno. */
+result = -posix_fallocate(fd, 0, total_size);
+if (result != 0) {
+error_setg_errno(errp, -result,
+ "Could not preallocate data for the new file");
 }
-if (qemu_close(fd) != 0) {
-result = -errno;
-error_setg_errno(errp, -result, "Could not close the new file");
+} else if (prealloc == PREALLOC_MODE_FULL) {
+char *buf = g_malloc0(65536);
+int64_t num = 0, left = total_size;
+
+while (left > 0) {
+num = MIN(left, 65536);
+result = write(fd, buf, num);
+if (result < 0) {
+result = -errno;
+error_setg_errno(errp, -result,
+ "Could not write to the new file");
+g_free(buf);
+goto out_close;
+}
+left -= num;
 }
+fsync(fd);
+g_free(buf);
+}
+out_close:
+if (qemu_close(fd) != 0 && result == 0) {
+result = -errno;
+error_setg_errno(errp, -result, "Could not close the new file");
 }
+out:
 return result;
 }
 
@@ -1447,6 +1489,11 @@ static QEMUOptionParameter raw_create_options[] = {
 .type = OPT_SIZE,
 .help = "Virtual disk size"
 },
+{
+.name = BLOCK_OPT_PREALLOC,
+.type = OPT_STRING,
+.help = "Preallocation mode (allowed values: off, metadata, full)"
+},
 { NULL }
 };
 
-- 
1.9.3




[Qemu-devel] [PATCH V9 2/4] raw, qcow2: don't convert file size to sector size

2014-05-27 Thread Chen Fan
From: Hu Tao 

Keep the file size in bytes instead of converting it to sectors and then
converting it back later. Also round the file size up to the nearest sector.
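
As a side note for reviewers, here is a small sketch comparing the expression
used in this patch with the conventional round-up idiom. It assumes that
BDRV_SECTOR_MASK is ~(BDRV_SECTOR_SIZE - 1), as the names suggest; the
SECTOR_* macros and function names below are local stand-ins, not QEMU
symbols.

```c
#include <stdint.h>

#define SECTOR_SIZE 512ULL                 /* stand-in for BDRV_SECTOR_SIZE */
#define SECTOR_MASK (~(SECTOR_SIZE - 1))   /* assumed shape of BDRV_SECTOR_MASK */

/* The expression as written in the patch: adds a whole sector, then masks. */
static uint64_t round_as_patched(uint64_t n)
{
    return (n + SECTOR_SIZE) & SECTOR_MASK;
}

/* Conventional round-up: adds (sector - 1), so aligned sizes are unchanged. */
static uint64_t round_up_sector(uint64_t n)
{
    return (n + SECTOR_SIZE - 1) & SECTOR_MASK;
}
```

Under that assumption the two differ only for sizes that are already a
multiple of 512: round_as_patched(512) yields 1024, while
round_up_sector(512) yields 512.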

Signed-off-by: Hu Tao 
---
 block/qcow2.c | 8 
 block/raw-posix.c | 5 +++--
 block/raw-win32.c | 5 +++--
 3 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 51f547d..81c2979 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1715,7 +1715,7 @@ static int qcow2_create2(const char *filename, int64_t 
total_size,
 }
 
 /* Okay, now that we have a valid image, let's give it the right size */
-ret = bdrv_truncate(bs, total_size * BDRV_SECTOR_SIZE);
+ret = bdrv_truncate(bs, total_size);
 if (ret < 0) {
 error_setg_errno(errp, -ret, "Could not resize image");
 goto out;
@@ -1768,7 +1768,7 @@ static int qcow2_create(const char *filename, 
QEMUOptionParameter *options,
 {
 const char *backing_file = NULL;
 const char *backing_fmt = NULL;
-uint64_t sectors = 0;
+uint64_t size = 0;
 int flags = 0;
 size_t cluster_size = DEFAULT_CLUSTER_SIZE;
 PreallocMode prealloc = PREALLOC_MODE_OFF;
@@ -1779,7 +1779,7 @@ static int qcow2_create(const char *filename, 
QEMUOptionParameter *options,
 /* Read out options */
 while (options && options->name) {
 if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
-sectors = options->value.n / 512;
+size = (options->value.n + BDRV_SECTOR_SIZE) & BDRV_SECTOR_MASK;
 } else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) {
 backing_file = options->value.s;
 } else if (!strcmp(options->name, BLOCK_OPT_BACKING_FMT)) {
@@ -1830,7 +1830,7 @@ static int qcow2_create(const char *filename, 
QEMUOptionParameter *options,
 return -EINVAL;
 }
 
-ret = qcow2_create2(filename, sectors, backing_file, backing_fmt, flags,
+ret = qcow2_create2(filename, size, backing_file, backing_fmt, flags,
 cluster_size, prealloc, options, version, &local_err);
 if (local_err) {
 error_propagate(errp, local_err);
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 6586a0c..710ea9b 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1252,7 +1252,8 @@ static int raw_create(const char *filename, 
QEMUOptionParameter *options,
 /* Read out options */
 while (options && options->name) {
 if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
-total_size = options->value.n / BDRV_SECTOR_SIZE;
+total_size = (options->value.n + BDRV_SECTOR_SIZE) &
+BDRV_SECTOR_MASK;
 }
 options++;
 }
@@ -1263,7 +1264,7 @@ static int raw_create(const char *filename, 
QEMUOptionParameter *options,
 result = -errno;
 error_setg_errno(errp, -result, "Could not create file");
 } else {
-if (ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) {
+if (ftruncate(fd, total_size) != 0) {
 result = -errno;
 error_setg_errno(errp, -result, "Could not resize file");
 }
diff --git a/block/raw-win32.c b/block/raw-win32.c
index 064ea31..faa574b 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -489,7 +489,8 @@ static int raw_create(const char *filename, 
QEMUOptionParameter *options,
 /* Read out options */
 while (options && options->name) {
 if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
-total_size = options->value.n / 512;
+total_size = (options->value.n + BDRV_SECTOR_SIZE) &
+BDRV_SECTOR_MASK;
 }
 options++;
 }
@@ -501,7 +502,7 @@ static int raw_create(const char *filename, 
QEMUOptionParameter *options,
 return -EIO;
 }
 set_sparse(fd);
-ftruncate(fd, total_size * 512);
+ftruncate(fd, total_size);
 qemu_close(fd);
 return 0;
 }
-- 
1.9.3




[Qemu-devel] [PATCH V9 4/4] qcow2: Add full image preallocation option

2014-05-27 Thread Chen Fan
From: Hu Tao 

This adds a preallocation=full mode to qcow2 image creation, which
creates a non-sparse image file.
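
The metadata size estimate this patch computes before creating the file can
be tried in isolation. The sketch below follows the same steps as the diff
(header, L2 tables, L1 table, refcount blocks, refcount table); align_up()
and qcow2_meta_estimate() are hypothetical names, and align_up() is assumed
to behave like QEMU's align_offset().

```c
#include <stdint.h>

/* Round x up to a multiple of n (assumed equivalent to align_offset()). */
static uint64_t align_up(uint64_t x, uint64_t n)
{
    return (x + n - 1) / n * n;
}

/* Estimate the qcow2 metadata size in bytes for a guest disk of total_size. */
static uint64_t qcow2_meta_estimate(uint64_t total_size, uint64_t cluster_size)
{
    uint64_t meta, nl2e, nl1e, nrefblocke, nreftablee;

    total_size = align_up(total_size, cluster_size);

    /* header: 1 cluster */
    meta = cluster_size;

    /* L2 tables: one 8-byte entry per guest cluster */
    nl2e = align_up(total_size / cluster_size,
                    cluster_size / sizeof(uint64_t));
    meta += nl2e * sizeof(uint64_t);

    /* L1 table: one 8-byte entry per L2 table */
    nl1e = align_up(nl2e * sizeof(uint64_t) / cluster_size,
                    cluster_size / sizeof(uint64_t));
    meta += nl1e * sizeof(uint64_t);

    /* refcount blocks: y1 = (a + m) / (c - 2 - 2 * 8 / c), with one
     * extra cluster of slack in the numerator as in the patch */
    nrefblocke = (uint64_t)((total_size + meta + cluster_size) /
                 (cluster_size - sizeof(uint16_t) -
                  1.0 * sizeof(uint16_t) * sizeof(uint64_t) / cluster_size));
    nrefblocke = align_up(nrefblocke, cluster_size / sizeof(uint16_t));
    meta += nrefblocke * sizeof(uint16_t);

    /* refcount table: one 8-byte entry per refcount block */
    nreftablee = align_up(nrefblocke * sizeof(uint16_t) / cluster_size,
                          cluster_size / sizeof(uint64_t));
    meta += nreftablee * sizeof(uint64_t);

    return meta;
}
```

For a 1 GiB image with 64 KiB clusters this works out to 393216 bytes, i.e.
six clusters: header, two L2 tables, one L1 cluster, one refcount block and
one refcount table cluster.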

Signed-off-by: Hu Tao 
---
 block/qcow2.c  | 79 --
 tests/qemu-iotests/082.out | 54 +++
 2 files changed, 103 insertions(+), 30 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 81c2979..5807dc0 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1598,6 +1598,7 @@ static int qcow2_create2(const char *filename, int64_t 
total_size,
  QEMUOptionParameter *options, int version,
  Error **errp)
 {
+QEMUOptionParameter *alloc_options = NULL;
 /* Calculate cluster_bits */
 int cluster_bits;
 cluster_bits = ffs(cluster_size) - 1;
@@ -1627,10 +1628,78 @@ static int qcow2_create2(const char *filename, int64_t 
total_size,
 Error *local_err = NULL;
 int ret;
 
+if (prealloc == PREALLOC_MODE_FULL || prealloc == PREALLOC_MODE_METADATA) {
+int64_t meta_size = 0;
+unsigned nreftablee, nrefblocke, nl1e, nl2e;
+BlockDriver *drv;
+
+total_size = align_offset(total_size, cluster_size);
+
+drv = bdrv_find_protocol(filename, true);
+if (drv == NULL) {
+error_setg(errp, "Could not find protocol for file '%s'", 
filename);
+return -ENOENT;
+}
+
+alloc_options = append_option_parameters(alloc_options,
+ drv->create_options);
+alloc_options = append_option_parameters(alloc_options, options);
+
+/* header: 1 cluster */
+meta_size += cluster_size;
+
+/* total size of L2 tables */
+nl2e = total_size / cluster_size;
+nl2e = align_offset(nl2e, cluster_size / sizeof(uint64_t));
+meta_size += nl2e * sizeof(uint64_t);
+
+/* total size of L1 tables */
+nl1e = nl2e * sizeof(uint64_t) / cluster_size;
+nl1e = align_offset(nl1e, cluster_size / sizeof(uint64_t));
+meta_size += nl1e * sizeof(uint64_t);
+
+/* total size of refcount blocks
+ *
+ * note: every host cluster is reference-counted, including metadata
+ * (even refcount blocks are recursively included).
+ * Let:
+ *   a = total_size (this is the guest disk size)
+ *   m = meta size not including refcount blocks and refcount tables
+ *   c = cluster size
+ *   y1 = number of refcount blocks entries
+ *   y2 = meta size including everything
+ * then,
+ *   y1 = (y2 + a)/c
+ *   y2 = y1 * sizeof(u16) + y1 * sizeof(u16) * sizeof(u64) / c + m
+ * we can get y1:
+ *   y1 = (a + m) / (c - sizeof(u16) - sizeof(u16) * sizeof(u64) / c)
+ */
+nrefblocke = (total_size + meta_size + cluster_size) /
+(cluster_size - sizeof(uint16_t) -
+ 1.0 * sizeof(uint16_t) * sizeof(uint64_t) / cluster_size);
+nrefblocke = align_offset(nrefblocke, cluster_size / sizeof(uint16_t));
+meta_size += nrefblocke * sizeof(uint16_t);
+
+/* total size of refcount tables */
+nreftablee = nrefblocke * sizeof(uint16_t) / cluster_size;
+nreftablee = align_offset(nreftablee, cluster_size / sizeof(uint64_t));
+meta_size += nreftablee * sizeof(uint64_t);
+
+set_option_parameter_int(alloc_options, BLOCK_OPT_SIZE,
+ total_size + meta_size);
+if (prealloc == PREALLOC_MODE_FULL) {
+set_option_parameter(alloc_options, BLOCK_OPT_PREALLOC, "full");
+} else if (prealloc == PREALLOC_MODE_METADATA) {
+set_option_parameter(alloc_options, BLOCK_OPT_PREALLOC, 
"metadata");
+}
+
+options = alloc_options;
+}
+
 ret = bdrv_create_file(filename, options, &local_err);
 if (ret < 0) {
 error_propagate(errp, local_err);
-return ret;
+goto out_options;
 }
 
 bs = NULL;
@@ -1638,7 +1707,7 @@ static int qcow2_create2(const char *filename, int64_t 
total_size,
 NULL, &local_err);
 if (ret < 0) {
 error_propagate(errp, local_err);
-return ret;
+goto out_options;
 }
 
 /* Write the header */
@@ -1760,6 +1829,8 @@ out:
 if (bs) {
 bdrv_unref(bs);
 }
+out_options:
+free_option_parameters(alloc_options);
 return ret;
 }
 
@@ -1795,6 +1866,8 @@ static int qcow2_create(const char *filename, 
QEMUOptionParameter *options,
 prealloc = PREALLOC_MODE_OFF;
 } else if (!strcmp(options->value.s, "metadata")) {
 prealloc = PREALLOC_MODE_METADATA;
+} else if (!strcmp(options->value.s, "full")) {
+prealloc = PREALLOC_MODE_FULL;
 } else {
 error_setg(errp, "Invalid preallocation mode: '%s'",
options->value.s);
@@ -2360,7 +24

[Qemu-devel] [PATCH] trace: docs: add trace file description

2014-07-10 Thread Chen Fan
When a user runs the trace pretty-print command from docs/tracing.txt:
  ./scripts/simpletrace.py trace-events trace-*

the "trace-*" argument may be misleading: if the user copies the command
line verbatim and runs it, it just prints the unhelpful usage message:
"usage: ./scripts/simpletrace.py  "

So we should describe what "trace-*" represents.

Signed-off-by: Chen Fan 
---
 docs/tracing.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/tracing.txt b/docs/tracing.txt
index c6ab1c1..c2299ce 100644
--- a/docs/tracing.txt
+++ b/docs/tracing.txt
@@ -23,7 +23,7 @@ for debugging, profiling, and observing execution.
 
 4. Pretty-print the binary trace file:
 
-./scripts/simpletrace.py trace-events trace-*
+./scripts/simpletrace.py trace-events trace-* # Override * with QEMU 
 
 == Trace events ==
 
-- 
1.9.3




[Qemu-devel] Merging latest qemu and Marss' qemu

2013-08-12 Thread Songchun Fan
Hello everyone,

I am wondering if there is a way to merge the latest official qemu with
Marss' qemu. I already switched to the qemu branch in Marss in order to get
a newer version of qemu (v1.1), yet it still differs a lot from the latest
version v1.6. The reason I need to merge these two is that one of my images
can only boot on v1.6.

Using diff I saw that Marss added a lot of things to qemu, while the latest
qemu also updated quite a lot compared to the v1.1 version. Does anyone
know how to do the merging?

Thanks in advance!

SF


[Qemu-devel] [RFC][PATCH] cpu: implement CPEJ method for unplugging cpu

2013-08-28 Thread Chen Fan
After the OS has successfully ejected a vcpu, it calls the CPEJ method,
which communicates the masked vcpu bitmap to QEMU.
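
For readers less familiar with ASL, the bitmap update that the new CPEJ body
performs is roughly the following in C. This is only a sketch:
clear_present_bit() is a hypothetical name, and a 32-bit presence map is
assumed for simplicity.

```c
#include <stdint.h>

/* Hypothetical C rendering of the ASL sequence: clear one CPU's bit in a
 * presence bitmap and return the updated map. Assumes cpu < 32. */
static uint32_t clear_present_bit(uint32_t prs, unsigned cpu)
{
    uint32_t mask = 1u << cpu;  /* Store(One, Local0); ShiftLeft(Local0, Arg0, Local0) */

    mask = ~mask;               /* Not(Local0, Local0) */
    return prs & mask;          /* And(PRS, Local0, PRS) */
}
```

With four CPUs present (0xF), ejecting CPU 2 leaves 0xB; clearing an already
cleared bit is a no-op.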

Signed-off-by: Chen Fan 
---
 src/acpi-dsdt-cpu-hotplug.dsl | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/acpi-dsdt-cpu-hotplug.dsl b/src/acpi-dsdt-cpu-hotplug.dsl
index 0f3e83b..b25963c 100644
--- a/src/acpi-dsdt-cpu-hotplug.dsl
+++ b/src/acpi-dsdt-cpu-hotplug.dsl
@@ -34,7 +34,11 @@ Scope(\_SB) {
 }
 Method(CPEJ, 2, NotSerialized) {
 // _EJ0 method - eject callback
-Sleep(200)
+Store(Zero, Index(CPON, ToInteger(Arg0)))
+Store(One, Local0)
+ShiftLeft(Local0, Arg0, Local0)
+Not(Local0, Local0)
+And(PRS, Local0, PRS)
 }
 
 /* CPU hotplug notify method */
-- 
1.8.1.4




[Qemu-devel] [RFC][PATCH 2/6] cpus: release allocated vcpu objects and exit vcpu thread

2013-08-28 Thread Chen Fan
After ACPI gets a signal to eject a vcpu, it notifies the vcpu thread
that it needs to exit. Before the vcpu thread exits, it releases the
vcpu-related objects.

Signed-off-by: Chen Fan 
---
 cpus.c   | 36 
 hw/acpi/piix4.c  | 16 
 include/qom/cpu.h|  9 +
 include/sysemu/kvm.h |  1 +
 kvm-all.c| 26 ++
 5 files changed, 84 insertions(+), 4 deletions(-)

diff --git a/cpus.c b/cpus.c
index 70cc617..6b793cb 100644
--- a/cpus.c
+++ b/cpus.c
@@ -697,6 +697,30 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void 
*data), void *data)
 qemu_cpu_kick(cpu);
 }
 
+static void qemu_kvm_destroy_vcpu(CPUState *cpu)
+{
+CPUState *pcpu, *pcpu1;
+
+pcpu = first_cpu;
+pcpu1 = NULL;
+
+while (pcpu) {
+if (pcpu == cpu && pcpu1) {
+pcpu1->next_cpu = cpu->next_cpu;
+break;
+}
+pcpu1 = pcpu;
+pcpu = pcpu->next_cpu;
+}
+
+if (kvm_destroy_vcpu(cpu) < 0) {
+fprintf(stderr, "kvm_destroy_vcpu failed.\n");
+exit(1);
+}
+
+qdev_free(DEVICE(X86_CPU(cpu)));
+}
+
 static void flush_queued_work(CPUState *cpu)
 {
 struct qemu_work_item *wi;
@@ -788,6 +812,11 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
 }
 }
 qemu_kvm_wait_io_event(cpu);
+if (cpu->exit && !cpu_can_run(cpu)) {
+qemu_kvm_destroy_vcpu(cpu);
+qemu_mutex_unlock(&qemu_global_mutex);
+return NULL;
+}
 }
 
 return NULL;
@@ -1080,6 +1109,13 @@ static void qemu_dummy_start_vcpu(CPUState *cpu)
 }
 }
 
+void qemu_down_vcpu(CPUState *cpu)
+{
+cpu->stop = true;
+cpu->exit = true;
+qemu_cpu_kick(cpu);
+}
+
 void qemu_init_vcpu(CPUState *cpu)
 {
 cpu->nr_cores = smp_cores;
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 1aaa7a4..44bc809 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -611,10 +611,18 @@ static const MemoryRegionOps piix4_pci_ops = {
 },
 };
 
-static void acpi_piix_eject_vcpu(int64_t cpuid)
+static void acpi_piix_eject_vcpu(PIIX4PMState *s, int64_t cpuid)
 {
-/* TODO: eject a vcpu, release allocated vcpu and exit the vcpu pthread.  
*/
-PIIX4_DPRINTF("vcpu: %" PRIu64 " need to be ejected.\n", cpuid);
+CPUStatus *cpus = &s->gpe_cpu;
+CPUState *cs = NULL;
+
+cs = qemu_get_cpu(cpuid);
+if (cs == NULL) {
+return;
+}
+
+cpus->old_sts[cpuid / 8] &= ~(1 << (cpuid % 8));
+qemu_down_vcpu(cs);
 }
 
 static uint64_t cpu_status_read(void *opaque, hwaddr addr, unsigned int size)
@@ -647,7 +655,7 @@ static void cpu_status_write(void *opaque, hwaddr addr, 
uint64_t data,
 }
 
 if (cpuid != 0) {
-acpi_piix_eject_vcpu(cpuid);
+acpi_piix_eject_vcpu(s, cpuid);
 }
 }
 
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 3e49936..fa8ec8a 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -180,6 +180,7 @@ struct CPUState {
 bool created;
 bool stop;
 bool stopped;
+bool exit;
 volatile sig_atomic_t exit_request;
 volatile sig_atomic_t tcg_exit_req;
 uint32_t interrupt_request;
@@ -489,6 +490,14 @@ void cpu_exit(CPUState *cpu);
 void cpu_resume(CPUState *cpu);
 
 /**
+ * qemu_down_vcpu:
+ * @cpu: The vCPU to take down.
+ *
+ * Takes a vCPU down.
+ */
+void qemu_down_vcpu(CPUState *cpu);
+
+/**
  * qemu_init_vcpu:
  * @cpu: The vCPU to initialize.
  *
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index de74411..fd85605 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -158,6 +158,7 @@ int kvm_has_intx_set_mask(void);
 
 int kvm_init_vcpu(CPUState *cpu);
 int kvm_cpu_exec(CPUState *cpu);
+int kvm_destroy_vcpu(CPUState *cpu);
 
 #ifdef NEED_CPU_H
 
diff --git a/kvm-all.c b/kvm-all.c
index 716860f..fda3601 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -225,6 +225,32 @@ static void kvm_reset_vcpu(void *opaque)
 kvm_arch_reset_vcpu(cpu);
 }
 
+int kvm_destroy_vcpu(CPUState *cpu)
+{
+KVMState *s = kvm_state;
+long mmap_size;
+int ret = 0;
+
+DPRINTF("kvm_destroy_vcpu\n");
+
+mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
+if (mmap_size < 0) {
+ret = mmap_size;
+DPRINTF("KVM_GET_VCPU_MMAP_SIZE failed\n");
+goto err;
+}
+
+ret = munmap(cpu->kvm_run, mmap_size);
+if (ret < 0) {
+goto err;
+}
+
+close(cpu->kvm_fd);
+
+err:
+return ret;
+}
+
 int kvm_init_vcpu(CPUState *cpu)
 {
 KVMState *s = kvm_state;
-- 
1.8.1.4




[Qemu-devel] [RFC][PATCH 3/6] qom cpu: rename variable 'cpu_added_notifier' to 'cpu_hotplug_notifier'

2013-08-28 Thread Chen Fan
Rename variable 'cpu_added_notifier' to 'cpu_hotplug_notifier', for
adding vcpu-remove notifier support.

Signed-off-by: Chen Fan 
---
 hw/acpi/piix4.c | 10 +-
 hw/i386/pc.c|  2 +-
 include/sysemu/sysemu.h |  2 +-
 qom/cpu.c   | 10 +-
 4 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 44bc809..0a58ff7 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -96,7 +96,7 @@ typedef struct PIIX4PMState {
 uint8_t s4_val;
 
 CPUStatus gpe_cpu;
-Notifier cpu_added_notifier;
+Notifier cpu_hotplug_notifier;
 } PIIX4PMState;
 
 #define TYPE_PIIX4_PM "PIIX4_PM"
@@ -700,9 +700,9 @@ static void piix4_cpu_hotplug_req(PIIX4PMState *s, CPUState 
*cpu,
 pm_update_sci(s);
 }
 
-static void piix4_cpu_added_req(Notifier *n, void *opaque)
+static void piix4_cpu_hotplug(Notifier *n, void *opaque)
 {
-PIIX4PMState *s = container_of(n, PIIX4PMState, cpu_added_notifier);
+PIIX4PMState *s = container_of(n, PIIX4PMState, cpu_hotplug_notifier);
 
 piix4_cpu_hotplug_req(s, CPU(opaque), PLUG);
 }
@@ -738,8 +738,8 @@ static void piix4_acpi_system_hot_add_init(MemoryRegion 
*parent,
 memory_region_init_io(&s->io_cpu, OBJECT(s), &cpu_hotplug_ops, s,
   "acpi-cpu-hotplug", PIIX4_PROC_LEN);
 memory_region_add_subregion(parent, PIIX4_PROC_BASE, &s->io_cpu);
-s->cpu_added_notifier.notify = piix4_cpu_added_req;
-qemu_register_cpu_added_notifier(&s->cpu_added_notifier);
+s->cpu_hotplug_notifier.notify = piix4_cpu_hotplug;
+qemu_register_cpu_hotplug_notifier(&s->cpu_hotplug_notifier);
 }
 
 static void enable_device(PIIX4PMState *s, int slot)
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index e8bc8ce..c0e7cbd 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -408,7 +408,7 @@ void pc_cmos_init(ram_addr_t ram_size, ram_addr_t 
above_4g_mem_size,
 /* init CPU hotplug notifier */
 cpu_hotplug_cb.rtc_state = s;
 cpu_hotplug_cb.cpu_added_notifier.notify = rtc_notify_cpu_added;
-qemu_register_cpu_added_notifier(&cpu_hotplug_cb.cpu_added_notifier);
+qemu_register_cpu_hotplug_notifier(&cpu_hotplug_cb.cpu_added_notifier);
 
 if (set_boot_dev(s, boot_device)) {
 exit(1);
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index d7a77b6..a7384c0 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -153,7 +153,7 @@ void do_pci_device_hot_remove(Monitor *mon, const QDict 
*qdict);
 void drive_hot_add(Monitor *mon, const QDict *qdict);
 
 /* CPU hotplug */
-void qemu_register_cpu_added_notifier(Notifier *notifier);
+void qemu_register_cpu_hotplug_notifier(Notifier *notifier);
 
 /* pcie aer error injection */
 void pcie_aer_inject_error_print(Monitor *mon, const QObject *data);
diff --git a/qom/cpu.c b/qom/cpu.c
index e71e57b..e3e75de 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -79,12 +79,12 @@ static void cpu_common_get_memory_mapping(CPUState *cpu,
 }
 
 /* CPU hot-plug notifiers */
-static NotifierList cpu_added_notifiers =
-NOTIFIER_LIST_INITIALIZER(cpu_add_notifiers);
+static NotifierList cpu_hotplug_notifiers =
+NOTIFIER_LIST_INITIALIZER(cpu_hotplug_notifiers);
 
-void qemu_register_cpu_added_notifier(Notifier *notifier)
+void qemu_register_cpu_hotplug_notifier(Notifier *notifier)
 {
-notifier_list_add(&cpu_added_notifiers, notifier);
+notifier_list_add(&cpu_hotplug_notifiers, notifier);
 }
 
 void cpu_reset_interrupt(CPUState *cpu, int mask)
@@ -230,7 +230,7 @@ static void cpu_common_realizefn(DeviceState *dev, Error 
**errp)
 
 if (dev->hotplugged) {
 cpu_synchronize_post_init(cpu);
-notifier_list_notify(&cpu_added_notifiers, dev);
+notifier_list_notify(&cpu_hotplug_notifiers, dev);
 cpu_resume(cpu);
 }
 }
-- 
1.8.1.4




[Qemu-devel] [RFC][PATCH 4/6] qmp: add 'cpu-del' command support

2013-08-28 Thread Chen Fan
Signed-off-by: Chen Fan 
---
 hw/i386/pc.c |  5 +
 hw/i386/pc_piix.c|  1 +
 include/hw/boards.h  |  2 ++
 include/hw/i386/pc.h |  1 +
 qapi-schema.json | 12 
 qmp-commands.hx  | 23 +++
 qmp.c|  9 +
 7 files changed, 53 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index c0e7cbd..75fc9bb 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -958,6 +958,11 @@ void pc_hot_add_cpu(const int64_t id, Error **errp)
 pc_new_cpu(current_cpu_model, apic_id, icc_bridge, errp);
 }
 
+void pc_hot_del_cpu(const int64_t id, Error **errp)
+{
+/* TODO: hot remove VCPU. */
+}
+
 void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
 {
 int i;
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 6e1e654..d779b75 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -347,6 +347,7 @@ static QEMUMachine pc_i440fx_machine_v1_6 = {
 .desc = "Standard PC (i440FX + PIIX, 1996)",
 .init = pc_init_pci_1_6,
 .hot_add_cpu = pc_hot_add_cpu,
+.hot_del_cpu = pc_hot_del_cpu,
 .max_cpus = 255,
 .is_default = 1,
 DEFAULT_MACHINE_OPTIONS,
diff --git a/include/hw/boards.h b/include/hw/boards.h
index fb7c6f1..fea3737 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -23,6 +23,7 @@ typedef void QEMUMachineInitFunc(QEMUMachineInitArgs *args);
 typedef void QEMUMachineResetFunc(void);
 
 typedef void QEMUMachineHotAddCPUFunc(const int64_t id, Error **errp);
+typedef void QEMUMachineHotDelCPUFunc(const int64_t id, Error **errp);
 
 typedef struct QEMUMachine {
 const char *name;
@@ -31,6 +32,7 @@ typedef struct QEMUMachine {
 QEMUMachineInitFunc *init;
 QEMUMachineResetFunc *reset;
 QEMUMachineHotAddCPUFunc *hot_add_cpu;
+QEMUMachineHotDelCPUFunc *hot_del_cpu;
 BlockInterfaceType block_default_type;
 int max_cpus;
 unsigned int no_serial:1,
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index f79d478..b7e66f4 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -96,6 +96,7 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level);
 
 void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge);
 void pc_hot_add_cpu(const int64_t id, Error **errp);
+void pc_hot_del_cpu(const int64_t id, Error **errp);
 void pc_acpi_init(const char *default_dsdt);
 
 PcGuestInfo *pc_guest_info_init(ram_addr_t below_4g_mem_size,
diff --git a/qapi-schema.json b/qapi-schema.json
index a51f7d2..6d3bf5f 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1432,6 +1432,18 @@
 ##
 { 'command': 'cpu-add', 'data': {'id': 'int'} }
 
+# @cpu-del
+
+# Deletes CPU with specified ID
+#
+# @id: ID of CPU to be deleted, valid values [0..max_cpus)
+#
+# Returns: Nothing on success
+#
+# Since 1.6
+##
+{ 'command': 'cpu-del', 'data': {'id': 'int'} }
+
 ##
 # @memsave:
 #
diff --git a/qmp-commands.hx b/qmp-commands.hx
index cf47e3f..16b54fd 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -411,6 +411,29 @@ Example:
 EQMP
 
 {
+.name   = "cpu-del",
+.args_type  = "id:i",
+.mhandler.cmd_new = qmp_marshal_input_cpu_del,
+},
+
+SQMP
+cpu-del
+---
+
+Deletes virtual cpu
+
+Arguments:
+
+- "id": cpu id (json-int)
+
+Example:
+
+-> { "execute": "cpu-del", "arguments": { "id": 2 } }
+<- { "return": {} }
+
+EQMP
+
+{
 .name   = "memsave",
 .args_type  = "val:l,size:i,filename:s,cpu:i?",
 .mhandler.cmd_new = qmp_marshal_input_memsave,
diff --git a/qmp.c b/qmp.c
index 4c149b3..84dc873 100644
--- a/qmp.c
+++ b/qmp.c
@@ -118,6 +118,15 @@ void qmp_cpu_add(int64_t id, Error **errp)
 }
 }
 
+void qmp_cpu_del(int64_t id, Error **errp)
+{
+if (current_machine->hot_del_cpu) {
+current_machine->hot_del_cpu(id, errp);
+} else {
+error_setg(errp, "Not supported");
+}
+}
+
 #ifndef CONFIG_VNC
 /* If VNC support is enabled, the "true" query-vnc command is
defined in the VNC subsystem */
-- 
1.8.1.4




[Qemu-devel] [RFC][PATCH 1/6] piix4: implement function 'cpu_status_write' for vcpu ejection

2013-08-28 Thread Chen Fan
When the OS ejects a vcpu (e.g. echo 1 > /sys/bus/acpi/devices/LNXCPUXX/eject),
it calls the ACPI _EJ0 method and the firmware writes the new cpumap, so QEMU
knows which vcpu needs to be ejected.
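
The comparison of old and new status bytes can be sketched on its own as
below. This is an illustration under assumptions, not the patch code:
changed_cpus() is a hypothetical helper, and unlike cpu_status_write() in
this patch (which keeps only the last changed bit it finds and skips cpuid 0)
it collects every changed bit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: list the CPU ids whose presence bit differs between
 * the previous and the newly written status byte at byte offset `addr`.
 * Returns the number of ids stored in `ids`. */
static size_t changed_cpus(uint8_t old_sts, uint8_t new_sts, unsigned addr,
                           int64_t ids[8])
{
    uint8_t diff = old_sts ^ new_sts;   /* 1 bits mark changed CPUs */
    size_t n = 0;
    int i;

    for (i = 0; i < 8; i++) {
        if (diff & (1u << i)) {
            ids[n++] = 8 * (int64_t)addr + i;
        }
    }
    return n;
}
```

For example, going from 0x0F to 0x0B at offset 0 reports CPU 2 only.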

Signed-off-by: Chen Fan 
---
 hw/acpi/piix4.c | 35 ++-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index c885690..1aaa7a4 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -61,6 +61,7 @@ struct pci_status {
 
 typedef struct CPUStatus {
 uint8_t sts[PIIX4_PROC_LEN];
+uint8_t old_sts[PIIX4_PROC_LEN];
 } CPUStatus;
 
 typedef struct PIIX4PMState {
@@ -610,6 +611,12 @@ static const MemoryRegionOps piix4_pci_ops = {
 },
 };
 
+static void acpi_piix_eject_vcpu(int64_t cpuid)
+{
+/* TODO: eject a vcpu, release allocated vcpu and exit the vcpu pthread.  
*/
+PIIX4_DPRINTF("vcpu: %" PRIu64 " need to be ejected.\n", cpuid);
+}
+
 static uint64_t cpu_status_read(void *opaque, hwaddr addr, unsigned int size)
 {
 PIIX4PMState *s = opaque;
@@ -622,7 +629,26 @@ static uint64_t cpu_status_read(void *opaque, hwaddr addr, 
unsigned int size)
 static void cpu_status_write(void *opaque, hwaddr addr, uint64_t data,
  unsigned int size)
 {
-/* TODO: implement VCPU removal on guest signal that CPU can be removed */
+PIIX4PMState *s = opaque;
+CPUStatus *cpus = &s->gpe_cpu;
+uint8_t val;
+int i;
+int64_t cpuid = 0;
+
+val = cpus->old_sts[addr] ^ data;
+
+if (val == 0)
+return;
+
+for (i = 0; i < 8; i++) {
+if (val & 1 << i) {
+cpuid = 8 * addr + i;
+}
+}
+
+if (cpuid != 0) {
+acpi_piix_eject_vcpu(cpuid);
+}
 }
 
 static const MemoryRegionOps cpu_hotplug_ops = {
@@ -647,11 +673,17 @@ static void piix4_cpu_hotplug_req(PIIX4PMState *s, 
CPUState *cpu,
 ACPIGPE *gpe = &s->ar.gpe;
 CPUClass *k = CPU_GET_CLASS(cpu);
 int64_t cpu_id;
+int i;
 
 assert(s != NULL);
 
 *gpe->sts = *gpe->sts | PIIX4_CPU_HOTPLUG_STATUS;
 cpu_id = k->get_arch_id(CPU(cpu));
+
+for (i = 0; i < PIIX4_PROC_LEN; i++) {
+g->old_sts[i] = g->sts[i];
+}
+
 if (action == PLUG) {
 g->sts[cpu_id / 8] |= (1 << (cpu_id % 8));
 } else {
@@ -675,6 +707,7 @@ static void piix4_init_cpu_status(CPUState *cpu, void *data)
 
 g_assert((id / 8) < PIIX4_PROC_LEN);
 g->sts[id / 8] |= (1 << (id % 8));
+g->old_sts[id / 8] |= (1 << (id % 8));
 }
 
 static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev,
-- 
1.8.1.4




[Qemu-devel] [RFC][PATCH 0/6] i386: add cpu hot remove support

2013-08-28 Thread Chen Fan
This series implements the standard ACPI _EJ0 method: after the guest OS
hot-removes a vcpu, it is able to send a signal to QEMU, and QEMU can then
notify the assigned vcpu to exit.

This patch series must be used together with the seabios patch and the KVM patch.

for KVM patches:
 http://comments.gmane.org/gmane.comp.emulators.kvm.devel/114347

for seabios patches:
 http://comments.gmane.org/gmane.comp.emulators.qemu/230460

Chen Fan (6):
  piix4: implement function 'cpu_status_write' for vcpu ejection
  cpus: release allocated vcpu objects and exit vcpu thread
  qom cpu: rename variable 'cpu_added_notifier' to
'cpu_hotplug_notifier'
  qmp: add 'cpu-del' command support
  qom cpu: add struct CPUNotifier for supporting PLUG and UNPLUG cpu
notifier
  i386: implement cpu interface 'cpu_common_unrealizefn'

 cpus.c  | 36 +
 hw/acpi/piix4.c | 61 +++--
 hw/i386/pc.c| 24 ++-
 hw/i386/pc_piix.c   |  1 +
 include/hw/boards.h |  2 ++
 include/hw/i386/pc.h|  1 +
 include/qom/cpu.h   | 19 +++
 include/sysemu/kvm.h|  1 +
 include/sysemu/sysemu.h |  2 +-
 kvm-all.c   | 26 +
 qapi-schema.json| 12 ++
 qmp-commands.hx | 23 +++
 qmp.c   |  9 
 qom/cpu.c   | 27 ++
 14 files changed, 225 insertions(+), 19 deletions(-)

-- 
1.8.1.4




[Qemu-devel] [RFC][PATCH 5/6] qom cpu: add struct CPUNotifier for supporting PLUG and UNPLUG cpu notifier

2013-08-28 Thread Chen Fan
Move the HotplugEventType enum from hw/acpi/piix4.c to include/qom/cpu.h,
and add struct CPUNotifier to support PLUG and UNPLUG cpu notifiers.
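
To show how a listener might use the event type, here is a small
self-contained sketch. CpuEvent and on_cpu_event() are hypothetical; in
particular cpu_id stands in for the DeviceState pointer that CPUNotifier
actually carries.

```c
#include <stdint.h>

typedef enum {
    PLUG,
    UNPLUG,
} HotplugEventType;

/* Simplified stand-in for CPUNotifier: an id instead of DeviceState *dev. */
typedef struct CpuEvent {
    int cpu_id;
    HotplugEventType type;
} CpuEvent;

/* Update a presence bitmap (one bit per CPU) according to the event type. */
static void on_cpu_event(uint8_t sts[], const CpuEvent *ev)
{
    uint8_t bit = (uint8_t)(1u << (ev->cpu_id % 8));

    if (ev->type == PLUG) {
        sts[ev->cpu_id / 8] |= bit;
    } else {
        sts[ev->cpu_id / 8] &= (uint8_t)~bit;
    }
}
```

Passing the type along with the device lets one callback serve both
directions instead of registering separate add and remove notifiers.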

Signed-off-by: Chen Fan 
---
 hw/acpi/piix4.c   |  8 ++--
 include/qom/cpu.h | 10 ++
 qom/cpu.c |  6 +-
 3 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 0a58ff7..fa82768 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -669,11 +669,6 @@ static const MemoryRegionOps cpu_hotplug_ops = {
 },
 };
 
-typedef enum {
-PLUG,
-UNPLUG,
-} HotplugEventType;
-
 static void piix4_cpu_hotplug_req(PIIX4PMState *s, CPUState *cpu,
   HotplugEventType action)
 {
@@ -703,8 +698,9 @@ static void piix4_cpu_hotplug_req(PIIX4PMState *s, CPUState 
*cpu,
 static void piix4_cpu_hotplug(Notifier *n, void *opaque)
 {
 PIIX4PMState *s = container_of(n, PIIX4PMState, cpu_hotplug_notifier);
+CPUNotifier *notifier = opaque;
 
-piix4_cpu_hotplug_req(s, CPU(opaque), PLUG);
+piix4_cpu_hotplug_req(s, CPU(notifier->dev), notifier->type);
 }
 
 static void piix4_init_cpu_status(CPUState *cpu, void *data)
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index fa8ec8a..c6f612d 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -518,6 +518,16 @@ void qemu_init_vcpu(CPUState *cpu);
  */
 void cpu_single_step(CPUState *cpu, int enabled);
 
+typedef enum {
+PLUG,
+UNPLUG,
+} HotplugEventType;
+
+typedef struct CPUNotifier {
+DeviceState *dev;
+HotplugEventType type;
+} CPUNotifier;
+
 #ifdef CONFIG_SOFTMMU
 extern const struct VMStateDescription vmstate_cpu_common;
 #else
diff --git a/qom/cpu.c b/qom/cpu.c
index e3e75de..3439c5d 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -227,10 +227,14 @@ static ObjectClass *cpu_common_class_by_name(const char 
*cpu_model)
 static void cpu_common_realizefn(DeviceState *dev, Error **errp)
 {
 CPUState *cpu = CPU(dev);
+CPUNotifier notifier;
+
+notifier.dev = dev;
+notifier.type = PLUG;
 
 if (dev->hotplugged) {
 cpu_synchronize_post_init(cpu);
-notifier_list_notify(&cpu_hotplug_notifiers, dev);
+notifier_list_notify(&cpu_hotplug_notifiers, &notifier);
 cpu_resume(cpu);
 }
 }
-- 
1.8.1.4




[Qemu-devel] [RFC][PATCH 6/6] i386: implement cpu interface 'cpu_common_unrealizefn'

2013-08-28 Thread Chen Fan
Implement the cpu interface 'cpu_common_unrealizefn' to emit a vcpu-remove
notification to ACPI, so that ACPI can send an SCI interrupt to the OS to
hot-remove the vcpu.

Signed-off-by: Chen Fan 
---
 hw/i386/pc.c | 19 ++-
 qom/cpu.c| 13 +
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 75fc9bb..9a87ac0 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -960,7 +960,24 @@ void pc_hot_add_cpu(const int64_t id, Error **errp)
 
 void pc_hot_del_cpu(const int64_t id, Error **errp)
 {
-/* TODO: hot remove VCPU. */
+CPUState *s = NULL;
+X86CPU *cpu = NULL;
+DeviceState *ds = NULL;
+DeviceClass *dc = NULL;
+
+s = qemu_get_cpu(id);
+if (s == NULL) {
+error_setg(errp, "Unable to find cpu-index: %" PRIi64
+   ",it non-exists or has been deleted.", id);
+return;
+}
+
+cpu = X86_CPU(s);
+ds = DEVICE(cpu);
+dc = DEVICE_GET_CLASS(ds);
+if (dc->unrealize) {
+dc->unrealize(ds, errp);
+}
 }
 
 void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
diff --git a/qom/cpu.c b/qom/cpu.c
index 3439c5d..d2b0c9e 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -239,6 +239,18 @@ static void cpu_common_realizefn(DeviceState *dev, Error 
**errp)
 }
 }
 
+static void cpu_common_unrealizefn(DeviceState *dev, Error **errp)
+{
+CPUNotifier notifier;
+
+notifier.dev = dev;
+notifier.type = UNPLUG;
+
+if (dev->hotplugged) {
+notifier_list_notify(&cpu_hotplug_notifiers, &notifier);
+ }
+}
+
 static void cpu_common_initfn(Object *obj)
 {
 CPUState *cpu = CPU(obj);
@@ -269,6 +281,7 @@ static void cpu_class_init(ObjectClass *klass, void *data)
 k->gdb_read_register = cpu_common_gdb_read_register;
 k->gdb_write_register = cpu_common_gdb_write_register;
 dc->realize = cpu_common_realizefn;
+dc->unrealize = cpu_common_unrealizefn;
 dc->no_user = 1;
 }
 
-- 
1.8.1.4




[Qemu-devel] [RFC 2/3] target-i386: add -smp X,apics=0x option

2014-01-14 Thread Chen Fan
This option provides the infrastructure for specifying APIC IDs when
booting a VM. For example:

 #boot with apicid 0 and 2:
 -smp 2,apics=0xA,maxcpus=4  /* 1010 */
 #boot with apicid 1 and 7:
 -smp 2,apics=0x41,maxcpus=8 /* 0100 0001 */

Signed-off-by: Chen Fan 
---
 hw/i386/pc.c|  9 +--
 include/sysemu/sysemu.h |  4 
 qemu-options.hx | 15 +---
 vl.c| 62 -
 4 files changed, 84 insertions(+), 6 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 963446f..3582167 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -991,8 +991,13 @@ void pc_cpus_init(const char *cpu_model, DeviceState 
*icc_bridge)
 current_cpu_model = cpu_model;
 
 for (i = 0; i < smp_cpus; i++) {
-cpu = pc_new_cpu(cpu_model, x86_cpu_apic_id_from_index(i),
- icc_bridge, &error);
+int64_t apic_id;
+if (nb_boot_apics == 0) {
+apic_id = x86_cpu_apic_id_from_index(i);
+} else {
+apic_id = x86_cpu_apic_id_from_index(boot_apics[i]);
+}
+cpu = pc_new_cpu(cpu_model, apic_id, icc_bridge, &error);
 if (error) {
 error_report("%s", error_get_pretty(error));
 error_free(error);
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 495dae8..510a626 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -149,6 +149,10 @@ extern int nb_option_roms;
 extern const char *prom_envs[MAX_PROM_ENVS];
 extern unsigned int nb_prom_envs;
 
+#define MAX_APICS 255
+extern int nb_boot_apics;
+extern int64_t boot_apics[MAX_APICS];
+
 /* pci-hotplug */
 void pci_device_hot_add(Monitor *mon, const QDict *qdict);
 int pci_drive_hot_add(Monitor *mon, const QDict *qdict, DriveInfo *dinfo);
diff --git a/qemu-options.hx b/qemu-options.hx
index bcfe9ea..7f86519 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -73,16 +73,17 @@ Select CPU model (@code{-cpu help} for list and additional 
feature selection)
 ETEXI
 
 DEF("smp", HAS_ARG, QEMU_OPTION_smp,
-"-smp 
[cpus=]n[,maxcpus=cpus][,cores=cores][,threads=threads][,sockets=sockets]\n"
+"-smp 
[cpus=]n[,maxcpus=cpus][,cores=cores][,threads=threads][,sockets=sockets][,apics=apics]\n"
 "set the number of CPUs to 'n' [default=1]\n"
 "maxcpus= maximum number of total cpus, including\n"
 "offline CPUs for hotplug, etc\n"
 "cores= number of CPU cores on one socket\n"
 "threads= number of threads on one CPU core\n"
-"sockets= number of discrete sockets in the system\n",
+"sockets= number of discrete sockets in the system\n"
+"apics= a hex number with leading '0x' as boot bitmap of 
existed apicid\n",
 QEMU_ARCH_ALL)
 STEXI
-@item -smp 
[cpus=]@var{n}[,cores=@var{cores}][,threads=@var{threads}][,sockets=@var{sockets}][,maxcpus=@var{maxcpus}]
+@item -smp 
[cpus=]@var{n}[,cores=@var{cores}][,threads=@var{threads}][,sockets=@var{sockets}][,maxcpus=@var{maxcpus}][,apics=@var{apics}]
 @findex -smp
 Simulate an SMP system with @var{n} CPUs. On the PC target, up to 255
 CPUs are supported. On Sparc32 target, Linux limits the number of usable CPUs
@@ -92,6 +93,14 @@ of @var{threads} per cores and the total number of 
@var{sockets} can be
 specified. Missing values will be computed. If any on the three values is
 given, the total number of CPUs @var{n} can be omitted. @var{maxcpus}
 specifies the maximum number of hotpluggable CPUs.
+@var{apics} specifies the boot bitmap of existed apicid.
+
+@example
+#specify the boot bitmap of apicid with 0 and 2:
+qemu-system-i386 -smp 2,apics=0xA,maxcpus=4  /* 1010 */
+#specify the boot bitmap of apicid with 1 and 7:
+qemu-system-i386 -smp 2,apics=0x41,maxcpus=8 /* 0100 0001 */
+@end example
 ETEXI
 
 DEF("numa", HAS_ARG, QEMU_OPTION_numa,
diff --git a/vl.c b/vl.c
index 7511e70..870b1bd 100644
--- a/vl.c
+++ b/vl.c
@@ -254,6 +254,9 @@ unsigned long *node_cpumask[MAX_NODES];
 uint8_t qemu_uuid[16];
 bool qemu_uuid_set;
 
+int nb_boot_apics;
+int64_t boot_apics[MAX_APICS];
+
 static QEMUBootSetHandler *boot_set_handler;
 static void *boot_set_opaque;
 
@@ -1379,6 +1382,9 @@ static QemuOptsList qemu_smp_opts = {
 }, {
 .name = "maxcpus",
 .type = QEMU_OPT_NUMBER,
+}, {
+.name = "apics",
+.type = QEMU_OPT_STRING,
 },
 { /*End of list */ }
 },
@@ -1392,6 +1398,7 @@ static void smp_parse(QemuOpts *opts)
 unsigned sockets = qemu_opt_get_number(opts, "sockets", 0);
 unsigned cores   = qemu_opt_get_number(opts, "cores", 0);
 unsigned threads = qemu_opt_get_num

[Qemu-devel] [RFC 0/3] fix migration issues after hotplug a discontinuous cpuid

2014-01-14 Thread Chen Fan
At present, after a CPU with a discontinuous id is hotplugged on the source
side and the guest is then migrated, a further hotplug fails on the
destination side. For example:
on source side:
   1) boot with -smp 1,maxcpus=4
   2) cpu-add id=2
   3) live-migration
on destination side:
   1) boot with -smp 2,maxcpus=4
   2) cpu-add id=1

The root cause is that, when CPUs are initialized at boot time on the
destination side, the generated APIC IDs are sequential from 0 to
smp_cpus - 1, so there the APIC IDs are 0 and 1, whereas on the source side
the APIC IDs existing after the hotplug are 0 and 2. Adding a CPU with id=1
on the destination therefore fails with an error saying that this CPU
already exists.
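
The mismatch can be reproduced with a small standalone model. This is only a
sketch: it assumes the APIC ID equals the CPU index (the real mapping in QEMU
goes through x86_cpu_apic_id_from_index() and depends on the topology), and
the helper names id_present(), boot_ids() and cpu_add() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if id is already in ids[0..n). */
bool id_present(const int *ids, size_t n, int id)
{
    for (size_t i = 0; i < n; i++) {
        if (ids[i] == id) {
            return true;
        }
    }
    return false;
}

/* Boot-time assignment: "-smp n" yields sequential IDs 0 .. n-1. */
size_t boot_ids(int *ids, size_t cap, size_t n)
{
    assert(n <= cap);
    for (size_t i = 0; i < n; i++) {
        ids[i] = (int)i;
    }
    return n;
}

/* Model of cpu-add: fails when the ID is taken or no room is left. */
bool cpu_add(int *ids, size_t *n, size_t cap, int id)
{
    if (*n >= cap || id_present(ids, *n, id)) {
        return false;
    }
    ids[(*n)++] = id;
    return true;
}
```

Booting the source with one CPU and adding id 2 gives the set {0, 2}, while
booting the destination with two CPUs gives {0, 1}; a later cpu_add() of id 1
on the destination is then rejected even though id 1 never existed on the
source.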

This series adds a -smp X,apics=0x option to specify the APIC map.
Continuing the example above:
on destination side:
   1) boot with -smp 2,maxcpus=4,apics=0xA
The apics value is a hex number giving the bitmap of existing APIC IDs; 0xA
is 1010B, meaning APIC IDs 0 and 2.

This series will be helpful for arbitrary CPU hot-remove as well.
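
For readers checking the bitmap examples by hand, here is a hedged decoding
sketch. The bit order is an assumption inferred from the two examples in this
series (0xA with maxcpus=4 giving IDs 0 and 2, 0x41 with maxcpus=8 giving IDs
1 and 7): the bitmap is read left to right over a field of maxcpus bits. The
parsing code in vl.c is not reproduced here, and decode_apics() is a
hypothetical helper, not a QEMU function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Decode an "apics" hex string into a list of APIC IDs.
 * Assumed convention: ID i corresponds to bit (maxcpus - 1 - i), i.e. the
 * leftmost bit of the maxcpus-wide field is ID 0.
 * Returns the number of IDs written to out (at most cap).
 */
size_t decode_apics(const char *hex, unsigned maxcpus, int *out, size_t cap)
{
    unsigned long long map;
    size_t n = 0;

    assert(maxcpus >= 1 && maxcpus <= 64);
    /* Base 16 accepts an optional leading "0x". */
    map = strtoull(hex, NULL, 16);

    for (unsigned id = 0; id < maxcpus && n < cap; id++) {
        unsigned bit = maxcpus - 1 - id;

        if ((map >> bit) & 1ULL) {
            out[n++] = (int)id;
        }
    }
    return n;
}
```

Under that assumption, decode_apics("0xA", 4, ...) yields IDs 0 and 2, and
decode_apics("0x41", 8, ...) yields IDs 1 and 7, matching the examples given
in the series.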

Chen Fan (3):
  target-i386: moving registers of vmstate from cpu_exec_init() to
x86_cpu_realizefn()
  target-i386: add -smp X,apics=0x option
  target-i386: add qmp command 'query-cpus' to display apic_id

 cpus.c  |  1 +
 exec.c  |  5 
 hw/i386/pc.c|  9 +--
 include/sysemu/sysemu.h |  4 
 qapi-schema.json|  4 +++-
 qemu-options.hx | 15 +---
 target-i386/cpu.c   |  9 +++
 vl.c| 62 -
 8 files changed, 102 insertions(+), 7 deletions(-)

-- 
1.8.1.4




[Qemu-devel] [RFC 3/3] target-i386: add qmp command 'query-cpus' to display apic_id

2014-01-14 Thread Chen Fan
This patch adds the apic_id to the output of the 'query-cpus' command.

Signed-off-by: Chen Fan 
---
 cpus.c   | 1 +
 qapi-schema.json | 4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/cpus.c b/cpus.c
index ca4c59f..e6ed098 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1351,6 +1351,7 @@ CpuInfoList *qmp_query_cpus(Error **errp)
 #if defined(TARGET_I386)
 info->value->has_pc = true;
 info->value->pc = env->eip + env->segs[R_CS].base;
+info->value->apic_id = env->cpuid_apic_id;
 #elif defined(TARGET_PPC)
 info->value->has_nip = true;
 info->value->nip = env->nip;
diff --git a/qapi-schema.json b/qapi-schema.json
index c3c939c..40c67ac 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -783,6 +783,8 @@
 #
 # @thread_id: ID of the underlying host thread
 #
+# @apic_id: the apic id of the virtual CPU
+#
 # Since: 0.14.0
 #
 # Notes: @halted is a transient state that changes frequently.  By the time the
@@ -790,7 +792,7 @@
 ##
 { 'type': 'CpuInfo',
   'data': {'CPU': 'int', 'current': 'bool', 'halted': 'bool', '*pc': 'int',
-   '*nip': 'int', '*npc': 'int', '*PC': 'int', 'thread_id': 'int'} }
+   '*nip': 'int', '*npc': 'int', '*PC': 'int', 'thread_id': 'int', 
'apic_id': 'int'} }
 
 ##
 # @query-cpus:
-- 
1.8.1.4




[Qemu-devel] [RFC 1/3] target-i386: moving registers of vmstate from cpu_exec_init() to x86_cpu_realizefn()

2014-01-14 Thread Chen Fan
The intent of this patch is to register cpu vmstates with the apic id instead
of the cpu index. Because the apic_id property is set after cpu
initialization, we move the registration of the cpu vmstate from
cpu_exec_init() to x86_cpu_realizefn() so that the apic id that has been set
can be passed as the parameter.

Signed-off-by: Chen Fan 
---
 exec.c| 5 +
 target-i386/cpu.c | 9 +
 2 files changed, 14 insertions(+)

diff --git a/exec.c b/exec.c
index 7e49e8e..9be5855 100644
--- a/exec.c
+++ b/exec.c
@@ -438,7 +438,9 @@ CPUState *qemu_get_cpu(int index)
 void cpu_exec_init(CPUArchState *env)
 {
 CPUState *cpu = ENV_GET_CPU(env);
+#if !defined(TARGET_I386)
 CPUClass *cc = CPU_GET_CLASS(cpu);
+#endif
 CPUState *some_cpu;
 int cpu_index;
 
@@ -460,6 +462,8 @@ void cpu_exec_init(CPUArchState *env)
 #if defined(CONFIG_USER_ONLY)
 cpu_list_unlock();
 #endif
+
+#if !defined(TARGET_I386)
 if (qdev_get_vmsd(DEVICE(cpu)) == NULL) {
 vmstate_register(NULL, cpu_index, &vmstate_cpu_common, cpu);
 }
@@ -472,6 +476,7 @@ void cpu_exec_init(CPUArchState *env)
 if (cc->vmsd != NULL) {
 vmstate_register(NULL, cpu_index, cc->vmsd, cpu);
 }
+#endif /* !TARGET_I386 */
 }
 
 #if defined(TARGET_HAS_ICE)
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 967529a..dada6f6 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -2552,6 +2552,7 @@ static void x86_cpu_apic_realize(X86CPU *cpu, Error 
**errp)
 static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
 {
 CPUState *cs = CPU(dev);
+CPUClass *cc = CPU_GET_CLASS(cs);
 X86CPU *cpu = X86_CPU(dev);
 X86CPUClass *xcc = X86_CPU_GET_CLASS(dev);
 CPUX86State *env = &cpu->env;
@@ -2615,6 +2616,14 @@ static void x86_cpu_realizefn(DeviceState *dev, Error 
**errp)
 cpu_reset(cs);
 
 xcc->parent_realize(dev, &local_err);
+
+if (qdev_get_vmsd(DEVICE(cs)) == NULL) {
+vmstate_register(NULL, env->cpuid_apic_id, &vmstate_cpu_common, cs);
+}
+
+if (cc->vmsd != NULL) {
+vmstate_register(NULL, env->cpuid_apic_id, cc->vmsd, cs);
+}
 out:
 if (local_err != NULL) {
 error_propagate(errp, local_err);
-- 
1.8.1.4




Re: [Qemu-devel] [RFC 1/3] target-i386: moving registers of vmstate from cpu_exec_init() to x86_cpu_realizefn()

2014-01-15 Thread Chen Fan
On Tue, 2014-01-14 at 11:40 +0100, Igor Mammedov wrote:
> On Tue, 14 Jan 2014 17:27:20 +0800
> Chen Fan  wrote:
> 
> > the intend of this patch is to register cpu vmstates with apic id instead 
> > of cpu
> > index, due to the property setting of apic_id is behind the cpu 
> > initialization. so
> > we move the registers of cpu vmstate from cpu_exec_init() to 
> > x86_cpu_realizefn() to
> > let the set apicid as the parameter.
> > 
> > Signed-off-by: Chen Fan 
> > ---
> >  exec.c| 5 +
> >  target-i386/cpu.c | 9 +
> >  2 files changed, 14 insertions(+)
> > 
> > diff --git a/exec.c b/exec.c
> > index 7e49e8e..9be5855 100644
> > --- a/exec.c
> > +++ b/exec.c
> > @@ -438,7 +438,9 @@ CPUState *qemu_get_cpu(int index)
> >  void cpu_exec_init(CPUArchState *env)
> >  {
> >  CPUState *cpu = ENV_GET_CPU(env);
> > +#if !defined(TARGET_I386)
> >  CPUClass *cc = CPU_GET_CLASS(cpu);
> > +#endif
> >  CPUState *some_cpu;
> >  int cpu_index;
> >  
> > @@ -460,6 +462,8 @@ void cpu_exec_init(CPUArchState *env)
> >  #if defined(CONFIG_USER_ONLY)
> >  cpu_list_unlock();
> >  #endif
> > +
> > +#if !defined(TARGET_I386)
> >  if (qdev_get_vmsd(DEVICE(cpu)) == NULL) {
> >  vmstate_register(NULL, cpu_index, &vmstate_cpu_common, cpu);
> >  }
> > @@ -472,6 +476,7 @@ void cpu_exec_init(CPUArchState *env)
> >  if (cc->vmsd != NULL) {
> >  vmstate_register(NULL, cpu_index, cc->vmsd, cpu);
> >  }
> > +#endif /* !TARGET_I386 */
> >  }
> >  
> >  #if defined(TARGET_HAS_ICE)
> > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > index 967529a..dada6f6 100644
> > --- a/target-i386/cpu.c
> > +++ b/target-i386/cpu.c
> > @@ -2552,6 +2552,7 @@ static void x86_cpu_apic_realize(X86CPU *cpu, Error 
> > **errp)
> >  static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
> >  {
> >  CPUState *cs = CPU(dev);
> > +CPUClass *cc = CPU_GET_CLASS(cs);
> >  X86CPU *cpu = X86_CPU(dev);
> >  X86CPUClass *xcc = X86_CPU_GET_CLASS(dev);
> >  CPUX86State *env = &cpu->env;
> > @@ -2615,6 +2616,14 @@ static void x86_cpu_realizefn(DeviceState *dev, 
> > Error **errp)
> >  cpu_reset(cs);
> >  
> >  xcc->parent_realize(dev, &local_err);
> > +
> > +if (qdev_get_vmsd(DEVICE(cs)) == NULL) {
> > +vmstate_register(NULL, env->cpuid_apic_id, &vmstate_cpu_common, 
> > cs);
> > +}
> > +
> > +if (cc->vmsd != NULL) {
> > +vmstate_register(NULL, env->cpuid_apic_id, cc->vmsd, cs);
> > +}
> how about doing it in common CPUclass.realize()
> you can use get_arch_id() for getting CPU id, it returns cpu_index by default
> and apic_id for target-i386.

Thanks for your kind suggestion. Does this mean we can directly move
vmstate_register to cpu_common_realizefn()?

> Pls note that changing vmstate id should be done only for new machine types
> so not to break old qemu -> new qemu migration.
Yes.

Thanks.
Chen

> 
> >  out:
> >  if (local_err != NULL) {
> >  error_propagate(errp, local_err);
> 
> 





Re: [Qemu-devel] Exposing and calculating CPU APIC IDs (was Re: [RFC 1/3] target-i386: moving registers of vmstate from cpu_exec_init() to x86_cpu_realizefn())

2014-01-20 Thread Chen Fan
On Mon, 2014-01-20 at 13:29 +0100, Igor Mammedov wrote:
> On Fri, 17 Jan 2014 17:13:55 -0200
> Eduardo Habkost  wrote:
> 
> > On Wed, Jan 15, 2014 at 03:37:04PM +0100, Igor Mammedov wrote:
> > > On Wed, 15 Jan 2014 20:24:01 +0800
> > > Chen Fan  wrote:
> > > > On Tue, 2014-01-14 at 11:40 +0100, Igor Mammedov wrote:
> > > > > On Tue, 14 Jan 2014 17:27:20 +0800
> > > > > Chen Fan  wrote:
> > > > > 
> > > > > > the intend of this patch is to register cpu vmstates with apic id 
> > > > > > instead of cpu
> > > > > > index, due to the property setting of apic_id is behind the cpu 
> > > > > > initialization. so
> > > > > > we move the registers of cpu vmstate from cpu_exec_init() to 
> > > > > > x86_cpu_realizefn() to
> > > > > > let the set apicid as the parameter.
> > > > > > 
> > > > > > Signed-off-by: Chen Fan 
> > > > > > ---
> > > > > >  exec.c| 5 +
> > > > > >  target-i386/cpu.c | 9 +
> > > > > >  2 files changed, 14 insertions(+)
> > > > > > 
> > > > > > diff --git a/exec.c b/exec.c
> > > > > > index 7e49e8e..9be5855 100644
> > > > > > --- a/exec.c
> > > > > > +++ b/exec.c
> > > > > > @@ -438,7 +438,9 @@ CPUState *qemu_get_cpu(int index)
> > > > > >  void cpu_exec_init(CPUArchState *env)
> > > > > >  {
> > > > > >  CPUState *cpu = ENV_GET_CPU(env);
> > > > > > +#if !defined(TARGET_I386)
> > > > > >  CPUClass *cc = CPU_GET_CLASS(cpu);
> > > > > > +#endif
> > > > > >  CPUState *some_cpu;
> > > > > >  int cpu_index;
> > > > > >  
> > > > > > @@ -460,6 +462,8 @@ void cpu_exec_init(CPUArchState *env)
> > > > > >  #if defined(CONFIG_USER_ONLY)
> > > > > >  cpu_list_unlock();
> > > > > >  #endif
> > > > > > +
> > > > > > +#if !defined(TARGET_I386)
> > > > > >  if (qdev_get_vmsd(DEVICE(cpu)) == NULL) {
> > > > > >  vmstate_register(NULL, cpu_index, &vmstate_cpu_common, 
> > > > > > cpu);
> > > > > >  }
> > > > > > @@ -472,6 +476,7 @@ void cpu_exec_init(CPUArchState *env)
> > > > > >  if (cc->vmsd != NULL) {
> > > > > >  vmstate_register(NULL, cpu_index, cc->vmsd, cpu);
> > > > > >  }
> > > > > > +#endif /* !TARGET_I386 */
> > > > > >  }
> > > > > >  
> > > > > >  #if defined(TARGET_HAS_ICE)
> > > > > > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > > > > > index 967529a..dada6f6 100644
> > > > > > --- a/target-i386/cpu.c
> > > > > > +++ b/target-i386/cpu.c
> > > > > > @@ -2552,6 +2552,7 @@ static void x86_cpu_apic_realize(X86CPU *cpu, 
> > > > > > Error **errp)
> > > > > >  static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
> > > > > >  {
> > > > > >  CPUState *cs = CPU(dev);
> > > > > > +CPUClass *cc = CPU_GET_CLASS(cs);
> > > > > >  X86CPU *cpu = X86_CPU(dev);
> > > > > >  X86CPUClass *xcc = X86_CPU_GET_CLASS(dev);
> > > > > >  CPUX86State *env = &cpu->env;
> > > > > > @@ -2615,6 +2616,14 @@ static void x86_cpu_realizefn(DeviceState 
> > > > > > *dev, Error **errp)
> > > > > >  cpu_reset(cs);
> > > > > >  
> > > > > >  xcc->parent_realize(dev, &local_err);
> > > > > > +
> > > > > > +if (qdev_get_vmsd(DEVICE(cs)) == NULL) {
> > > > > > +vmstate_register(NULL, env->cpuid_apic_id, 
> > > > > > &vmstate_cpu_common, cs);
> > > > > > +}
> > > > > > +
> > > > > > +if (cc->vmsd != NULL) {
> > > > > > +vmstate_register(NULL, env->cpuid_apic_id, cc->vmsd, cs);
> > > > > > +}
> > > > > how about doing it in common CPUclass.realize()
> > > > > you can use get_arch_id() for getting CPU id, it returns cpu_index by 
> 

Re: [Qemu-devel] Exposing and calculating CPU APIC IDs (was Re: [RFC 1/3] target-i386: moving registers of vmstate from cpu_exec_init() to x86_cpu_realizefn())

2014-01-21 Thread Chen Fan
On Tue, 2014-01-21 at 10:31 +0100, Igor Mammedov wrote:
> On Tue, 21 Jan 2014 15:12:45 +0800
> Chen Fan  wrote:
> 
> > On Mon, 2014-01-20 at 13:29 +0100, Igor Mammedov wrote:
> > > On Fri, 17 Jan 2014 17:13:55 -0200
> > > Eduardo Habkost  wrote:
> > > 
> > > > On Wed, Jan 15, 2014 at 03:37:04PM +0100, Igor Mammedov wrote:
> > > > > On Wed, 15 Jan 2014 20:24:01 +0800
> > > > > Chen Fan  wrote:
> > > > > > On Tue, 2014-01-14 at 11:40 +0100, Igor Mammedov wrote:
> > > > > > > On Tue, 14 Jan 2014 17:27:20 +0800
> > > > > > > Chen Fan  wrote:
> > > > > > > 
> > > > > > > > the intend of this patch is to register cpu vmstates with apic 
> > > > > > > > id instead of cpu
> > > > > > > > index, due to the property setting of apic_id is behind the cpu 
> > > > > > > > initialization. so
> > > > > > > > we move the registers of cpu vmstate from cpu_exec_init() to 
> > > > > > > > x86_cpu_realizefn() to
> > > > > > > > let the set apicid as the parameter.
> > > > > > > > 
> > > > > > > > Signed-off-by: Chen Fan 
> > > > > > > > ---
> > > > > > > >  exec.c| 5 +
> > > > > > > >  target-i386/cpu.c | 9 +
> > > > > > > >  2 files changed, 14 insertions(+)
> > > > > > > > 
> > > > > > > > diff --git a/exec.c b/exec.c
> > > > > > > > index 7e49e8e..9be5855 100644
> > > > > > > > --- a/exec.c
> > > > > > > > +++ b/exec.c
> > > > > > > > @@ -438,7 +438,9 @@ CPUState *qemu_get_cpu(int index)
> > > > > > > >  void cpu_exec_init(CPUArchState *env)
> > > > > > > >  {
> > > > > > > >  CPUState *cpu = ENV_GET_CPU(env);
> > > > > > > > +#if !defined(TARGET_I386)
> > > > > > > >  CPUClass *cc = CPU_GET_CLASS(cpu);
> > > > > > > > +#endif
> > > > > > > >  CPUState *some_cpu;
> > > > > > > >  int cpu_index;
> > > > > > > >  
> > > > > > > > @@ -460,6 +462,8 @@ void cpu_exec_init(CPUArchState *env)
> > > > > > > >  #if defined(CONFIG_USER_ONLY)
> > > > > > > >  cpu_list_unlock();
> > > > > > > >  #endif
> > > > > > > > +
> > > > > > > > +#if !defined(TARGET_I386)
> > > > > > > >  if (qdev_get_vmsd(DEVICE(cpu)) == NULL) {
> > > > > > > >  vmstate_register(NULL, cpu_index, &vmstate_cpu_common, 
> > > > > > > > cpu);
> > > > > > > >  }
> > > > > > > > @@ -472,6 +476,7 @@ void cpu_exec_init(CPUArchState *env)
> > > > > > > >  if (cc->vmsd != NULL) {
> > > > > > > >  vmstate_register(NULL, cpu_index, cc->vmsd, cpu);
> > > > > > > >  }
> > > > > > > > +#endif /* !TARGET_I386 */
> > > > > > > >  }
> > > > > > > >  
> > > > > > > >  #if defined(TARGET_HAS_ICE)
> > > > > > > > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > > > > > > > index 967529a..dada6f6 100644
> > > > > > > > --- a/target-i386/cpu.c
> > > > > > > > +++ b/target-i386/cpu.c
> > > > > > > > @@ -2552,6 +2552,7 @@ static void x86_cpu_apic_realize(X86CPU 
> > > > > > > > *cpu, Error **errp)
> > > > > > > >  static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
> > > > > > > >  {
> > > > > > > >  CPUState *cs = CPU(dev);
> > > > > > > > +CPUClass *cc = CPU_GET_CLASS(cs);
> > > > > > > >  X86CPU *cpu = X86_CPU(dev);
> > > > > > > >  X86CPUClass *xcc = X86_CPU_GET_CLASS(dev);
> > > > > > > >  CPUX86State *env = &cpu->env;
> > > > > > > > @@ -2615,6 +2616,14 @@ static void 
> > > > > > > > x86_cpu_realizefn(DeviceState *dev, Error **errp)
> > > >

[Qemu-devel] [RFC qom-cpu v4 03/10] apic: remove local_apics array and using CPU_FOREACH instead

2013-10-09 Thread Chen Fan
Use the CPU_FOREACH() macro instead of scanning the entire
local_apics array, for faster APIC lookup.
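
As an illustration of the idea, the sketch below walks a list of APICs and
tests each one's bit in the delivery bitmask, instead of walking every bit of
the mask and indexing an array. It is a standalone model: the Apic list node
and the mask_set() and deliver() helpers are hypothetical, and a plain linked
list stands in for CPU_FOREACH().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_APIC_WORDS 8   /* 8 x 32 bits covers indexes 0..255 */

/* Hypothetical list node standing in for a CPU and its APIC. */
typedef struct Apic {
    uint8_t idx;
    int hits;
    struct Apic *next;
} Apic;

/* Set the bit for index idx in the delivery bitmask. */
void mask_set(uint32_t *mask, unsigned idx)
{
    assert(idx / 32 < MAX_APIC_WORDS);
    mask[idx / 32] |= UINT32_C(1) << (idx % 32);
}

/*
 * Visit only the APICs that exist and check each one's bit:
 * word = idx / 32, bit = idx % 32. Returns the number of deliveries.
 */
int deliver(Apic *head, const uint32_t *mask)
{
    int delivered = 0;

    for (Apic *a = head; a != NULL; a = a->next) {
        uint32_t word = mask[a->idx / 32];

        if (word & (UINT32_C(1) << (a->idx % 32))) {
            a->hits++;
            delivered++;
        }
    }
    return delivered;
}
```

The sketch uses an unsigned constant for the shift so that bit 31 is handled
without signed overflow.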

Signed-off-by: Chen Fan 
---
 hw/intc/apic.c  | 73 ++---
 include/hw/i386/apic_internal.h |  2 --
 2 files changed, 32 insertions(+), 43 deletions(-)

diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index a913186..f8f2cbf 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -32,8 +32,6 @@
 #define SYNC_TO_VAPIC   0x2
 #define SYNC_ISR_IRR_TO_VAPIC   0x4
 
-static APICCommonState *local_apics[MAX_APICS + 1];
-
 static void apic_set_irq(APICCommonState *s, int vector_num, int trigger_mode);
 static void apic_update_irq(APICCommonState *s);
 static void apic_get_delivery_bitmask(uint32_t *deliver_bitmask,
@@ -200,18 +198,15 @@ static void apic_external_nmi(APICCommonState *s)
 
 #define foreach_apic(apic, deliver_bitmask, code) \
 {\
+CPUState *cpu;\
 int __i, __j, __mask;\
-for(__i = 0; __i < MAX_APIC_WORDS; __i++) {\
+CPU_FOREACH(cpu) {\
+apic = APIC_COMMON(X86_CPU(cpu)->apic_state);\
+__i = apic->idx / 32;\
+__j = apic->idx % 32;\
 __mask = deliver_bitmask[__i];\
-if (__mask) {\
-for(__j = 0; __j < 32; __j++) {\
-if (__mask & (1 << __j)) {\
-apic = local_apics[__i * 32 + __j];\
-if (apic) {\
-code;\
-}\
-}\
-}\
+if (__mask & (1 << __j)) {\
+code;\
 }\
 }\
 }
@@ -235,9 +230,13 @@ static void apic_bus_deliver(const uint32_t 
*deliver_bitmask,
 }
 }
 if (d >= 0) {
-apic_iter = local_apics[d];
-if (apic_iter) {
-apic_set_irq(apic_iter, vector_num, trigger_mode);
+CPUState *cpu;
+CPU_FOREACH(cpu) {
+apic_iter = APIC_COMMON(X86_CPU(cpu)->apic_state);
+if (apic_iter->idx == d) {
+apic_set_irq(apic_iter, vector_num, trigger_mode);
+break;
+}
 }
 }
 }
@@ -422,18 +421,14 @@ static void apic_eoi(APICCommonState *s)
 
 static int apic_find_dest(uint8_t dest)
 {
-APICCommonState *apic = local_apics[dest];
-int i;
-
-if (apic && apic->id == dest)
-return dest;  /* shortcut in case apic->id == apic->idx */
+APICCommonState *apic;
+CPUState *cpu;
 
-for (i = 0; i < MAX_APICS; i++) {
-apic = local_apics[i];
-   if (apic && apic->id == dest)
-return i;
-if (!apic)
-break;
+CPU_FOREACH(cpu) {
+apic = APIC_COMMON(X86_CPU(cpu)->apic_state);
+if (apic->id == dest) {
+return apic->idx;
+}
 }
 
 return -1;
@@ -443,7 +438,7 @@ static void apic_get_delivery_bitmask(uint32_t 
*deliver_bitmask,
   uint8_t dest, uint8_t dest_mode)
 {
 APICCommonState *apic_iter;
-int i;
+CPUState *cpu;
 
 if (dest_mode == 0) {
 if (dest == 0xff) {
@@ -457,20 +452,17 @@ static void apic_get_delivery_bitmask(uint32_t 
*deliver_bitmask,
 } else {
 /* XXX: cluster mode */
 memset(deliver_bitmask, 0x00, MAX_APIC_WORDS * sizeof(uint32_t));
-for(i = 0; i < MAX_APICS; i++) {
-apic_iter = local_apics[i];
-if (apic_iter) {
-if (apic_iter->dest_mode == 0xf) {
-if (dest & apic_iter->log_dest)
-apic_set_bit(deliver_bitmask, i);
-} else if (apic_iter->dest_mode == 0x0) {
-if ((dest & 0xf0) == (apic_iter->log_dest & 0xf0) &&
-(dest & apic_iter->log_dest & 0x0f)) {
-apic_set_bit(deliver_bitmask, i);
-}
+CPU_FOREACH(cpu) {
+apic_iter = APIC_COMMON(X86_CPU(cpu)->apic_state);
+if (apic_iter->dest_mode == 0xf) {
+if (dest & apic_iter->log_dest) {
+apic_set_bit(deliver_bitmask, apic_iter->idx);
+}
+} else if (apic_iter->dest_mode == 0x0) {
+if ((dest & 0xf0) == (apic_iter->log_dest & 0xf0) &&
+(dest & apic_iter->log_dest & 0x0f)) {
+apic_set_bit(deliver_bitmask, apic_iter->idx);
 }
-} else {
-break;
 }
 }
 }
@@ -877,7 +869,6 @@ static void apic_init(APICCommonState *s)
   APIC_SPACE_SIZE);
 
 s->timer = timer_new_ns(QEMU_CLOCK_VIR

[Qemu-devel] [RFC qom-cpu v4 06/10] qom cpu: rename variable 'cpu_added_notifier' to 'cpu_hotplug_notifier'

2013-10-09 Thread Chen Fan
Rename variable 'cpu_added_notifier' to 'cpu_hotplug_notifier', for
adding vcpu-remove notifier support.

Signed-off-by: Chen Fan 
---
 hw/acpi/piix4.c | 10 +-
 hw/i386/pc.c|  2 +-
 include/sysemu/sysemu.h |  2 +-
 qom/cpu.c   | 10 +-
 4 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index b46bd5e..06f55d6 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -95,7 +95,7 @@ typedef struct PIIX4PMState {
 uint8_t s4_val;
 
 CPUStatus gpe_cpu;
-Notifier cpu_added_notifier;
+Notifier cpu_hotplug_notifier;
 } PIIX4PMState;
 
 #define TYPE_PIIX4_PM "PIIX4_PM"
@@ -661,9 +661,9 @@ static void piix4_cpu_hotplug_req(PIIX4PMState *s, CPUState 
*cpu,
 pm_update_sci(s);
 }
 
-static void piix4_cpu_added_req(Notifier *n, void *opaque)
+static void piix4_cpu_hotplug(Notifier *n, void *opaque)
 {
-PIIX4PMState *s = container_of(n, PIIX4PMState, cpu_added_notifier);
+PIIX4PMState *s = container_of(n, PIIX4PMState, cpu_hotplug_notifier);
 
 piix4_cpu_hotplug_req(s, CPU(opaque), PLUG);
 }
@@ -696,8 +696,8 @@ static void piix4_acpi_system_hot_add_init(MemoryRegion 
*parent,
 memory_region_init_io(&s->io_cpu, OBJECT(s), &cpu_hotplug_ops, s,
   "acpi-cpu-hotplug", PIIX4_PROC_LEN);
 memory_region_add_subregion(parent, PIIX4_PROC_BASE, &s->io_cpu);
-s->cpu_added_notifier.notify = piix4_cpu_added_req;
-qemu_register_cpu_added_notifier(&s->cpu_added_notifier);
+s->cpu_hotplug_notifier.notify = piix4_cpu_hotplug;
+qemu_register_cpu_hotplug_notifier(&s->cpu_hotplug_notifier);
 }
 
 static void enable_device(PIIX4PMState *s, int slot)
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 40d611e..8ab6e4f 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -406,7 +406,7 @@ void pc_cmos_init(ram_addr_t ram_size, ram_addr_t 
above_4g_mem_size,
 /* init CPU hotplug notifier */
 cpu_hotplug_cb.rtc_state = s;
 cpu_hotplug_cb.cpu_added_notifier.notify = rtc_notify_cpu_added;
-qemu_register_cpu_added_notifier(&cpu_hotplug_cb.cpu_added_notifier);
+qemu_register_cpu_hotplug_notifier(&cpu_hotplug_cb.cpu_added_notifier);
 
 if (set_boot_dev(s, boot_device)) {
 exit(1);
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index cd5791e..72c5ff9 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -158,7 +158,7 @@ void do_pci_device_hot_remove(Monitor *mon, const QDict 
*qdict);
 void drive_hot_add(Monitor *mon, const QDict *qdict);
 
 /* CPU hotplug */
-void qemu_register_cpu_added_notifier(Notifier *notifier);
+void qemu_register_cpu_hotplug_notifier(Notifier *notifier);
 
 /* pcie aer error injection */
 void pcie_aer_inject_error_print(Monitor *mon, const QObject *data);
diff --git a/qom/cpu.c b/qom/cpu.c
index 818fb26..0913c9c 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -67,12 +67,12 @@ static void cpu_common_get_memory_mapping(CPUState *cpu,
 }
 
 /* CPU hot-plug notifiers */
-static NotifierList cpu_added_notifiers =
-NOTIFIER_LIST_INITIALIZER(cpu_add_notifiers);
+static NotifierList cpu_hotplug_notifiers =
+NOTIFIER_LIST_INITIALIZER(cpu_hotplug_notifiers);
 
-void qemu_register_cpu_added_notifier(Notifier *notifier)
+void qemu_register_cpu_hotplug_notifier(Notifier *notifier)
 {
-notifier_list_add(&cpu_added_notifiers, notifier);
+notifier_list_add(&cpu_hotplug_notifiers, notifier);
 }
 
 void cpu_reset_interrupt(CPUState *cpu, int mask)
@@ -219,7 +219,7 @@ static void cpu_common_realizefn(DeviceState *dev, Error 
**errp)
 
 if (dev->hotplugged) {
 cpu_synchronize_post_init(cpu);
-notifier_list_notify(&cpu_added_notifiers, dev);
+notifier_list_notify(&cpu_hotplug_notifiers, dev);
 cpu_resume(cpu);
 }
 }
-- 
1.8.1.4




[Qemu-devel] [RFC qom-cpu v4 02/10] apic: remove redundant variable 'apic_no' from apic_init_common()

2013-10-09 Thread Chen Fan
struct APICCommonState already has an id field, which was set earlier by
qdev_prop_set_uint8(env->apic_state, "id", env->cpuid_apic_id);
so we use the id field instead of the variable 'apic_no' to represent the
unique apic index.

Signed-off-by: Chen Fan 
---
 hw/intc/apic_common.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index a0beb10..82fbb7f 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -289,13 +289,9 @@ static int apic_init_common(ICCDevice *dev)
 APICCommonState *s = APIC_COMMON(dev);
 APICCommonClass *info;
 static DeviceState *vapic;
-static int apic_no;
 static bool mmio_registered;
 
-if (apic_no >= MAX_APICS) {
-return -1;
-}
-s->idx = apic_no++;
+s->idx = s->id;
 
 info = APIC_COMMON_GET_CLASS(s);
 info->init(s);
-- 
1.8.1.4




[Qemu-devel] [RFC qom-cpu v4 04/10] x86: add x86_cpu_unrealizefn() for cpu apic remove

2013-10-09 Thread Chen Fan
Implement x86_cpu_unrealizefn() as the counterpart of x86_cpu_realizefn();
it is mostly used here to clear the apic related information.
Also refactor apic initialization to use the QOM realizefn.
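
The refactoring follows a common class-chaining pattern: the subclass saves
the parent's realize hook at class-init time, installs its own, and calls the
saved hook at the end of its own realize. A minimal standalone sketch of that
pattern, using hypothetical Dev and DevClass types rather than the QOM ones:

```c
#include <assert.h>
#include <stddef.h>

typedef struct Dev Dev;
typedef void (*DevFn)(Dev *d);

/* Simplified class: own hooks plus the saved parent realize. */
typedef struct DevClass {
    DevFn realize;
    DevFn unrealize;
    DevFn parent_realize;
} DevClass;

struct Dev {
    DevClass *cls;
    int common_ready;   /* set by the common (parent) realize */
    int io_ready;       /* set by the subclass realize */
};

/* Parent realize: common setup shared by every subclass. */
void common_realize(Dev *d)
{
    d->common_ready = 1;
}

/* Subclass realize: do the specific setup, then chain to the parent. */
void child_realize(Dev *d)
{
    d->io_ready = 1;
    d->cls->parent_realize(d);
}

/* Subclass unrealize: undo what child_realize set up. */
void child_unrealize(Dev *d)
{
    d->io_ready = 0;
}

void common_class_init(DevClass *k)
{
    k->realize = common_realize;
}

/* Runs after the parent's class init: save its hook, then override it. */
void child_class_init(DevClass *k)
{
    assert(k->realize != NULL);
    k->parent_realize = k->realize;
    k->realize = child_realize;
    k->unrealize = child_unrealize;
}
```

Saving the hook before overwriting it is what lets the subclass run its own
setup first and still reach the common code afterwards.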

Signed-off-by: Chen Fan 
---
 hw/i386/kvm/apic.c  | 18 --
 hw/intc/apic.c  | 18 --
 hw/intc/apic_common.c   | 11 +++
 include/hw/i386/apic_internal.h |  4 +++-
 target-i386/cpu-qom.h   |  1 +
 target-i386/cpu.c   | 35 +++
 6 files changed, 74 insertions(+), 13 deletions(-)

diff --git a/hw/i386/kvm/apic.c b/hw/i386/kvm/apic.c
index 5609063..87f1cce 100644
--- a/hw/i386/kvm/apic.c
+++ b/hw/i386/kvm/apic.c
@@ -171,21 +171,35 @@ static const MemoryRegionOps kvm_apic_io_ops = {
 .endianness = DEVICE_NATIVE_ENDIAN,
 };
 
-static void kvm_apic_init(APICCommonState *s)
+static void kvm_apic_realize(DeviceState *dev, Error **errp)
 {
+APICCommonState *s = APIC_COMMON(dev);
+APICCommonClass *acc = APIC_COMMON_GET_CLASS(s);
+
 memory_region_init_io(&s->io_memory, NULL, &kvm_apic_io_ops, s, 
"kvm-apic-msi",
   APIC_SPACE_SIZE);
 
 if (kvm_has_gsi_routing()) {
 msi_supported = true;
 }
+
+acc->parent_realize(dev, errp);
+}
+
+static void kvm_apic_unrealize(DeviceState *dev, Error **errp)
+{
+APICCommonState *s = APIC_COMMON(dev);
+memory_region_destroy(&s->io_memory);
 }
 
 static void kvm_apic_class_init(ObjectClass *klass, void *data)
 {
 APICCommonClass *k = APIC_COMMON_CLASS(klass);
+DeviceClass *dc = DEVICE_CLASS(klass);
 
-k->init = kvm_apic_init;
+k->parent_realize = dc->realize;
+dc->realize = kvm_apic_realize;
+dc->unrealize = kvm_apic_unrealize;
 k->set_base = kvm_apic_set_base;
 k->set_tpr = kvm_apic_set_tpr;
 k->get_tpr = kvm_apic_get_tpr;
diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index f8f2cbf..c022640 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -863,21 +863,35 @@ static const MemoryRegionOps apic_io_ops = {
 .endianness = DEVICE_NATIVE_ENDIAN,
 };
 
-static void apic_init(APICCommonState *s)
+static void apic_realize(DeviceState *dev, Error **errp)
 {
+APICCommonState *s = APIC_COMMON(dev);
+APICCommonClass *acc = APIC_COMMON_GET_CLASS(s);
+
 memory_region_init_io(&s->io_memory, OBJECT(s), &apic_io_ops, s, 
"apic-msi",
   APIC_SPACE_SIZE);
 
 s->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, apic_timer, s);
 
 msi_supported = true;
+
+acc->parent_realize(dev, errp);
+}
+
+static void apic_unrealize(DeviceState *dev, Error **errp)
+{
+APICCommonState *s = APIC_COMMON(dev);
+memory_region_destroy(&s->io_memory);
 }
 
 static void apic_class_init(ObjectClass *klass, void *data)
 {
 APICCommonClass *k = APIC_COMMON_CLASS(klass);
+DeviceClass *dc = DEVICE_CLASS(klass);
 
-k->init = apic_init;
+k->parent_realize = dc->realize;
+dc->realize = apic_realize;
+dc->unrealize = apic_unrealize;
 k->set_base = apic_set_base;
 k->set_tpr = apic_set_tpr;
 k->get_tpr = apic_get_tpr;
diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index 82fbb7f..fbb276d 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -284,17 +284,15 @@ static int apic_load_old(QEMUFile *f, void *opaque, int 
version_id)
 return 0;
 }
 
-static int apic_init_common(ICCDevice *dev)
+static void apic_common_realize(DeviceState *dev, Error **errp)
 {
 APICCommonState *s = APIC_COMMON(dev);
-APICCommonClass *info;
+APICCommonClass *info = APIC_COMMON_GET_CLASS(s);
 static DeviceState *vapic;
 static bool mmio_registered;
 
 s->idx = s->id;
 
-info = APIC_COMMON_GET_CLASS(s);
-info->init(s);
 if (!mmio_registered) {
 ICCBus *b = ICC_BUS(qdev_get_parent_bus(DEVICE(dev)));
 memory_region_add_subregion(b->apic_address_space, 0, &s->io_memory);
@@ -310,8 +308,6 @@ static int apic_init_common(ICCDevice *dev)
 if (apic_report_tpr_access && info->enable_tpr_reporting) {
 info->enable_tpr_reporting(s, true);
 }
-
-return 0;
 }
 
 static void apic_dispatch_pre_save(void *opaque)
@@ -377,14 +373,13 @@ static Property apic_properties_common[] = {
 
 static void apic_common_class_init(ObjectClass *klass, void *data)
 {
-ICCDeviceClass *idc = ICC_DEVICE_CLASS(klass);
 DeviceClass *dc = DEVICE_CLASS(klass);
 
+dc->realize = apic_common_realize;
 dc->vmsd = &vmstate_apic_common;
 dc->reset = apic_reset_common;
 dc->no_user = 1;
 dc->props = apic_properties_common;
-idc->init = apic_init_common;
 }
 
 static const TypeInfo apic_common_type = {
diff --git a/include/hw/i386/apic_internal.h b/include/hw/i386/apic_internal.h
index 5b763ac..9f885e7 100644
--- a

[Qemu-devel] [RFC qom-cpu v4 01/10] x86: move apic_state field from CPUX86State to X86CPU

2013-10-09 Thread Chen Fan
This move prepares for the subsequent refactoring of the vCPU APIC.

Signed-off-by: Chen Fan 
---
 cpu-exec.c|  2 +-
 cpus.c|  5 ++---
 hw/i386/kvmvapic.c|  8 +++-
 hw/i386/pc.c  | 17 -
 target-i386/cpu-qom.h |  4 
 target-i386/cpu.c | 22 ++
 target-i386/cpu.h |  4 
 target-i386/helper.c  |  9 -
 target-i386/kvm.c | 23 ++-
 target-i386/misc_helper.c |  8 
 10 files changed, 46 insertions(+), 56 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 30cfa2a..2711c58 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -320,7 +320,7 @@ int cpu_exec(CPUArchState *env)
 #if !defined(CONFIG_USER_ONLY)
 if (interrupt_request & CPU_INTERRUPT_POLL) {
 cpu->interrupt_request &= ~CPU_INTERRUPT_POLL;
-apic_poll_irq(env->apic_state);
+apic_poll_irq(x86_env_get_cpu(env)->apic_state);
 }
 #endif
 if (interrupt_request & CPU_INTERRUPT_INIT) {
diff --git a/cpus.c b/cpus.c
index e566297..4ace860 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1383,12 +1383,11 @@ void qmp_inject_nmi(Error **errp)
 
 CPU_FOREACH(cs) {
 X86CPU *cpu = X86_CPU(cs);
-CPUX86State *env = &cpu->env;
 
-if (!env->apic_state) {
+if (!cpu->apic_state) {
 cpu_interrupt(cs, CPU_INTERRUPT_NMI);
 } else {
-apic_deliver_nmi(env->apic_state);
+apic_deliver_nmi(cpu->apic_state);
 }
 }
 #elif defined(TARGET_S390X)
diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
index 1c2dbf5..9fa346b 100644
--- a/hw/i386/kvmvapic.c
+++ b/hw/i386/kvmvapic.c
@@ -366,7 +366,7 @@ static int vapic_enable(VAPICROMState *s, X86CPU *cpu)
 (((hwaddr)cpu_number) << VAPIC_CPU_SHIFT);
 cpu_physical_memory_rw(vapic_paddr + offsetof(VAPICState, enabled),
(void *)&enabled, sizeof(enabled), 1);
-apic_enable_vapic(cpu->env.apic_state, vapic_paddr);
+apic_enable_vapic(cpu->apic_state, vapic_paddr);
 
 s->state = VAPIC_ACTIVE;
 
@@ -496,12 +496,10 @@ static void vapic_enable_tpr_reporting(bool enable)
 };
 CPUState *cs;
 X86CPU *cpu;
-CPUX86State *env;
 
 CPU_FOREACH(cs) {
 cpu = X86_CPU(cs);
-env = &cpu->env;
-info.apic = env->apic_state;
+info.apic = cpu->apic_state;
 run_on_cpu(cs, vapic_do_enable_tpr_reporting, &info);
 }
 }
@@ -697,7 +695,7 @@ static void vapic_write(void *opaque, hwaddr addr, uint64_t 
data,
 default:
 case 4:
 if (!kvm_irqchip_in_kernel()) {
-apic_poll_irq(env->apic_state);
+apic_poll_irq(cpu->apic_state);
 }
 break;
 }
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 0c313fe..832c9b2 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -169,13 +169,14 @@ void cpu_smm_update(CPUX86State *env)
 int cpu_get_pic_interrupt(CPUX86State *env)
 {
 int intno;
+X86CPU *cpu = x86_env_get_cpu(env);
 
-intno = apic_get_interrupt(env->apic_state);
+intno = apic_get_interrupt(cpu->apic_state);
 if (intno >= 0) {
 return intno;
 }
 /* read the irq from the PIC */
-if (!apic_accept_pic_intr(env->apic_state)) {
+if (!apic_accept_pic_intr(cpu->apic_state)) {
 return -1;
 }
 
@@ -187,15 +188,13 @@ static void pic_irq_request(void *opaque, int irq, int 
level)
 {
 CPUState *cs = first_cpu;
 X86CPU *cpu = X86_CPU(cs);
-CPUX86State *env = &cpu->env;
 
 DPRINTF("pic_irqs: %s irq %d\n", level? "raise" : "lower", irq);
-if (env->apic_state) {
+if (cpu->apic_state) {
 CPU_FOREACH(cs) {
 cpu = X86_CPU(cs);
-env = &cpu->env;
-if (apic_accept_pic_intr(env->apic_state)) {
-apic_deliver_pic_intr(env->apic_state, level);
+if (apic_accept_pic_intr(cpu->apic_state)) {
+apic_deliver_pic_intr(cpu->apic_state, level);
 }
 }
 } else {
@@ -890,7 +889,7 @@ DeviceState *cpu_get_current_apic(void)
 {
 if (current_cpu) {
 X86CPU *cpu = X86_CPU(current_cpu);
-return cpu->env.apic_state;
+return cpu->apic_state;
 } else {
 return NULL;
 }
@@ -984,7 +983,7 @@ void pc_cpus_init(const char *cpu_model, DeviceState 
*icc_bridge)
 }
 
 /* map APIC MMIO area if CPU has APIC */
-if (cpu && cpu->env.apic_state) {
+if (cpu && cpu->apic_state) {
 /* XXX: what if the base changes? */
 sysbus_mmio_map_overlap(SYS_BUS_DEVICE(icc_bridge), 0,
 APIC_DEFAULT_ADDRESS, 0x1000);
diff --git a/target-i386/cpu-qom.h b/

[Qemu-devel] [RFC qom-cpu v4 08/10] i386: implement pc interface pc_hot_del_cpu()

2013-10-09 Thread Chen Fan
Implement the CPU interface pc_hot_del_cpu() to unrealize the vCPU device.
Unrealizing emits a vCPU-remove notifier to ACPI, which can then send an SCI
interrupt to the guest OS to hot-remove the vCPU.

Signed-off-by: Chen Fan 
---
 hw/i386/pc.c | 30 --
 qom/cpu.c| 12 
 2 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 8ab6e4f..a6b9b78 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -958,8 +958,34 @@ void pc_hot_add_cpu(const int64_t id, Error **errp)
 
 void pc_hot_del_cpu(const int64_t id, Error **errp)
 {
-/* TODO: hot remove vCPU. */
-error_setg(errp, "Hot-remove CPU is not supported.");
+CPUState *cpu;
+bool found = false;
+X86CPUClass *xcc;
+
+CPU_FOREACH(cpu) {
+CPUClass *cc = CPU_GET_CLASS(cpu);
+int64_t cpuid = cc->get_arch_id(cpu);
+
+if (cpuid == id) {
+found = true;
+break;
+}
+}
+
+if (!found) {
+error_setg(errp, "Unable to find cpu-index: %" PRIi64
+   ", it doesn't exist or has been deleted.", id);
+return;
+}
+
+if (cpu == first_cpu && !CPU_NEXT(cpu)) {
+error_setg(errp, "Unable to delete the last "
+   "one cpu when VM running.");
+return;
+}
+
+xcc = X86_CPU_GET_CLASS(DEVICE(cpu));
+xcc->parent_unrealize(DEVICE(cpu), errp);
 }
 
 void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
diff --git a/qom/cpu.c b/qom/cpu.c
index d20783b..89fc8bd 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -228,6 +228,17 @@ static void cpu_common_realizefn(DeviceState *dev, Error 
**errp)
 }
 }
 
+static void cpu_common_unrealizefn(DeviceState *dev, Error **errp)
+{
+CPUNotifier notifier;
+
+notifier.dev = dev;
+notifier.type = UNPLUG;
+
+notifier_list_notify(&cpu_hotplug_notifiers, &notifier);
+}
+
+
 static void cpu_common_initfn(Object *obj)
 {
 CPUState *cpu = CPU(obj);
@@ -258,6 +269,7 @@ static void cpu_class_init(ObjectClass *klass, void *data)
 k->gdb_read_register = cpu_common_gdb_read_register;
 k->gdb_write_register = cpu_common_gdb_write_register;
 dc->realize = cpu_common_realizefn;
+dc->unrealize = cpu_common_unrealizefn;
 dc->no_user = 1;
 }
 
-- 
1.8.1.4
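
The lookup and the last-CPU guard in pc_hot_del_cpu() above can be sketched outside QEMU roughly as follows. This is only an illustration under stated assumptions: struct cpu, cpu_list_del() and the DEL_* codes are hypothetical names, the list uses the BSD <sys/queue.h> TAILQ macros rather than QEMU's QTAILQ, and the sketch unlinks the entry directly instead of calling an unrealize hook.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/queue.h>

/* Hypothetical stand-in for CPUState; only the fields the sketch needs. */
struct cpu {
    int64_t arch_id;
    TAILQ_ENTRY(cpu) node;
};
TAILQ_HEAD(cpu_list, cpu);

enum del_result { DEL_OK = 0, DEL_NOT_FOUND = -1, DEL_LAST_CPU = -2 };

/*
 * Unlink the CPU whose arch_id equals id. On success *out points at the
 * unlinked entry; on failure *out is NULL and the list is unchanged.
 */
static enum del_result cpu_list_del(struct cpu_list *list, int64_t id,
                                    struct cpu **out)
{
    struct cpu *c;

    assert(list != NULL && out != NULL);
    *out = NULL;

    /* TAILQ_FOREACH leaves c == NULL when the loop runs to completion. */
    TAILQ_FOREACH(c, list, node) {
        if (c->arch_id == id) {
            break;
        }
    }
    if (c == NULL) {
        return DEL_NOT_FOUND;
    }

    /* Same guard as the patch: refuse to remove the only remaining CPU. */
    if (c == TAILQ_FIRST(list) && TAILQ_NEXT(c, node) == NULL) {
        return DEL_LAST_CPU;
    }

    TAILQ_REMOVE(list, c, node);
    *out = c;
    return DEL_OK;
}
```

Returning distinct codes for "not found" and "last CPU" mirrors the two error_setg() paths in the patch while keeping the sketch free of QEMU's Error API.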




[Qemu-devel] [RFC qom-cpu v4 00/10] i386: add cpu hot remove support

2013-10-09 Thread Chen Fan
By implementing the standard ACPI _EJ0 method in the BIOS, the guest OS can
signal QEMU after it hot-removes a vCPU, and QEMU can then notify the
corresponding vCPU to exit. This series also introduces the QMP command
'cpu-del' to remove a vCPU from QEMU itself.

This work is based on Andreas Färber's qom-cpu tree:
git://github.com/afaerber/qemu-cpu.git

This patch series must be used together with the SeaBIOS and KVM patches.

KVM patches:
http://comments.gmane.org/gmane.comp.emulators.kvm.devel/114347

SeaBIOS patches:
http://comments.gmane.org/gmane.comp.emulators.qemu/230460

Chen Fan (10):
  x86: move apic_state field from CPUX86State to X86CPU
  apic: remove redundant variable 'apic_no' from apic_init_common()
  apic: remove local_apics array and using CPU_FOREACH instead
  x86: add x86_cpu_unrealizefn() for cpu apic remove
  qmp: add 'cpu-del' command support
  qom cpu: rename variable 'cpu_added_notifier' to
'cpu_hotplug_notifier'
  qom cpu: add UNPLUG cpu notifier support
  i386: implement pc interface pc_hot_del_cpu()
  piix4: implement function cpu_status_write() for vcpu ejection
  cpus: reclaim allocated vCPU objects

 cpu-exec.c  |  2 +-
 cpus.c  | 51 +--
 hw/acpi/piix4.c | 66 --
 hw/i386/kvm/apic.c  | 18 +++-
 hw/i386/kvmvapic.c  |  8 ++--
 hw/i386/pc.c| 51 ++-
 hw/i386/pc_piix.c   |  3 +-
 hw/intc/apic.c  | 91 ++---
 hw/intc/apic_common.c   | 17 ++--
 include/hw/boards.h |  2 +
 include/hw/i386/apic_internal.h |  6 +--
 include/hw/i386/pc.h|  1 +
 include/qom/cpu.h   | 20 +
 include/sysemu/kvm.h|  1 +
 include/sysemu/sysemu.h |  2 +-
 kvm-all.c   | 25 +++
 qapi-schema.json| 12 ++
 qmp-commands.hx | 23 +++
 qmp.c   |  9 
 qom/cpu.c   | 26 +---
 target-i386/cpu-qom.h   |  5 +++
 target-i386/cpu.c   | 57 --
 target-i386/cpu.h   |  4 --
 target-i386/helper.c|  9 ++--
 target-i386/kvm.c   | 23 +--
 target-i386/misc_helper.c   |  8 ++--
 26 files changed, 403 insertions(+), 137 deletions(-)

-- 
1.8.1.4




[Qemu-devel] [RFC qom-cpu v4 09/10] piix4: implement function cpu_status_write() for vcpu ejection

2013-10-09 Thread Chen Fan
When the guest OS ejects a vCPU (e.g. echo 1 > /sys/bus/acpi/devices/LNXCPUXX/eject),
it calls the ACPI _EJ0 method; the firmware then writes the new CPU map, from
which QEMU can tell which vCPU needs to be ejected.

Signed-off-by: Chen Fan 
---
 hw/acpi/piix4.c | 37 -
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index dc506bf..fd27001 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -61,6 +61,7 @@ struct pci_status {
 
 typedef struct CPUStatus {
 uint8_t sts[PIIX4_PROC_LEN];
+uint8_t old_sts[PIIX4_PROC_LEN];
 } CPUStatus;
 
 typedef struct PIIX4PMState {
@@ -611,6 +612,12 @@ static const MemoryRegionOps piix4_pci_ops = {
 },
 };
 
+static void acpi_piix_eject_vcpu(int64_t cpuid)
+{
+/* TODO: eject a vcpu, release allocated vcpu and exit the vcpu pthread.  
*/
+PIIX4_DPRINTF("vcpu: %" PRIu64 " need to be ejected.\n", cpuid);
+}
+
 static uint64_t cpu_status_read(void *opaque, hwaddr addr, unsigned int size)
 {
 PIIX4PMState *s = opaque;
@@ -623,7 +630,27 @@ static uint64_t cpu_status_read(void *opaque, hwaddr addr, 
unsigned int size)
 static void cpu_status_write(void *opaque, hwaddr addr, uint64_t data,
  unsigned int size)
 {
-/* TODO: implement VCPU removal on guest signal that CPU can be removed */
+PIIX4PMState *s = opaque;
+CPUStatus *cpus = &s->gpe_cpu;
+uint8_t val;
+int i;
+int64_t cpuid = 0;
+
+val = cpus->old_sts[addr] ^ data;
+
+if (val == 0) {
+return;
+}
+
+for (i = 0; i < 8; i++) {
+if (val & 1 << i) {
+cpuid = 8 * addr + i;
+}
+}
+
+if (cpuid != 0) {
+acpi_piix_eject_vcpu(cpuid);
+}
 }
 
 static const MemoryRegionOps cpu_hotplug_ops = {
@@ -643,13 +670,20 @@ static void piix4_cpu_hotplug_req(PIIX4PMState *s, 
CPUState *cpu,
 ACPIGPE *gpe = &s->ar.gpe;
 CPUClass *k = CPU_GET_CLASS(cpu);
 int64_t cpu_id;
+int i;
 
 assert(s != NULL);
 
 *gpe->sts = *gpe->sts | PIIX4_CPU_HOTPLUG_STATUS;
 cpu_id = k->get_arch_id(CPU(cpu));
+
+for (i = 0; i < PIIX4_PROC_LEN; i++) {
+g->old_sts[i] = g->sts[i];
+}
+
 if (action == PLUG) {
 g->sts[cpu_id / 8] |= (1 << (cpu_id % 8));
+g->old_sts[cpu_id / 8] |= (1 << (cpu_id % 8));
 } else {
 g->sts[cpu_id / 8] &= ~(1 << (cpu_id % 8));
 }
@@ -688,6 +722,7 @@ static void piix4_acpi_system_hot_add_init(MemoryRegion 
*parent,
 
 g_assert((id / 8) < PIIX4_PROC_LEN);
 s->gpe_cpu.sts[id / 8] |= (1 << (id % 8));
+s->gpe_cpu.old_sts[id / 8] |= (1 << (id % 8));
 }
 memory_region_init_io(&s->io_cpu, OBJECT(s), &cpu_hotplug_ops, s,
   "acpi-cpu-hotplug", PIIX4_PROC_LEN);
-- 
1.8.1.4
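
The idea behind cpu_status_write() above, comparing the previous CPU presence bitmap with the newly written one to find the ejected vCPU, can be sketched as standalone C. This is a sketch under assumptions, not the patch's code: find_ejected_cpu() is a hypothetical helper, it scans the whole map rather than the single byte at addr, it uses old & ~new so that only cleared bits count (the patch XORs the two bytes), and it returns -1 rather than 0 when nothing changed, so that CPU 0 stays distinguishable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the id of the lowest-numbered CPU whose presence bit is set in
 * old_sts but cleared in new_sts, or -1 if no CPU was ejected.
 * Bit i of byte b represents CPU id 8 * b + i, as in the patch.
 */
static int64_t find_ejected_cpu(const uint8_t *old_sts,
                                const uint8_t *new_sts, size_t len)
{
    assert(old_sts != NULL && new_sts != NULL);

    for (size_t byte = 0; byte < len; byte++) {
        /* Bits that were 1 before and are 0 now. */
        uint8_t cleared = old_sts[byte] & (uint8_t)~new_sts[byte];

        for (int bit = 0; bit < 8; bit++) {
            if (cleared & (1u << bit)) {
                return (int64_t)(8 * byte + (size_t)bit);
            }
        }
    }
    return -1;
}
```

Using -1 as the "nothing to eject" value matches the later change in patch 10/10, where the initial cpuid is switched from 0 to -1.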




[Qemu-devel] [RFC qom-cpu v4 05/10] qmp: add 'cpu-del' command support

2013-10-09 Thread Chen Fan
Signed-off-by: Chen Fan 
---
 hw/i386/pc.c |  6 ++
 hw/i386/pc_piix.c|  3 ++-
 include/hw/boards.h  |  2 ++
 include/hw/i386/pc.h |  1 +
 qapi-schema.json | 12 
 qmp-commands.hx  | 23 +++
 qmp.c|  9 +
 7 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 832c9b2..40d611e 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -956,6 +956,12 @@ void pc_hot_add_cpu(const int64_t id, Error **errp)
 pc_new_cpu(current_cpu_model, apic_id, icc_bridge, errp);
 }
 
+void pc_hot_del_cpu(const int64_t id, Error **errp)
+{
+/* TODO: hot remove vCPU. */
+error_setg(errp, "Hot-remove CPU is not supported.");
+}
+
 void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
 {
 int i;
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index c6042c7..e7039be 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -337,7 +337,8 @@ static void pc_xen_hvm_init(QEMUMachineInitArgs *args)
 #define PC_I440FX_MACHINE_OPTIONS \
 PC_DEFAULT_MACHINE_OPTIONS, \
 .desc = "Standard PC (i440FX + PIIX, 1996)", \
-.hot_add_cpu = pc_hot_add_cpu
+.hot_add_cpu = pc_hot_add_cpu, \
+.hot_del_cpu = pc_hot_del_cpu
 
 #define PC_I440FX_1_7_MACHINE_OPTIONS PC_I440FX_MACHINE_OPTIONS
 static QEMUMachine pc_i440fx_machine_v1_7 = {
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 5a7ae9f..5934828 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -20,6 +20,7 @@ typedef void QEMUMachineInitFunc(QEMUMachineInitArgs *args);
 typedef void QEMUMachineResetFunc(void);
 
 typedef void QEMUMachineHotAddCPUFunc(const int64_t id, Error **errp);
+typedef void QEMUMachineHotDelCPUFunc(const int64_t id, Error **errp);
 
 typedef struct QEMUMachine {
 const char *name;
@@ -28,6 +29,7 @@ typedef struct QEMUMachine {
 QEMUMachineInitFunc *init;
 QEMUMachineResetFunc *reset;
 QEMUMachineHotAddCPUFunc *hot_add_cpu;
+QEMUMachineHotDelCPUFunc *hot_del_cpu;
 BlockInterfaceType block_default_type;
 int max_cpus;
 unsigned int no_serial:1,
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 6083839..e7f4313 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -96,6 +96,7 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level);
 
 void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge);
 void pc_hot_add_cpu(const int64_t id, Error **errp);
+void pc_hot_del_cpu(const int64_t id, Error **errp);
 void pc_acpi_init(const char *default_dsdt);
 
 PcGuestInfo *pc_guest_info_init(ram_addr_t below_4g_mem_size,
diff --git a/qapi-schema.json b/qapi-schema.json
index 145eca8..e2a47ea 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1479,6 +1479,18 @@
 ##
 { 'command': 'cpu-add', 'data': {'id': 'int'} }
 
+# @cpu-del
+
+# Deletes CPU with specified ID
+#
+# @id: ID of CPU to be deleted, valid values [0..max_cpus)
+#
+# Returns: Nothing on success
+#
+# Since 1.7
+##
+{ 'command': 'cpu-del', 'data': {'id': 'int'} }
+
 ##
 # @memsave:
 #
diff --git a/qmp-commands.hx b/qmp-commands.hx
index b17c46e..8f2bfdb 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -411,6 +411,29 @@ Example:
 EQMP
 
 {
+.name   = "cpu-del",
+.args_type  = "id:i",
+.mhandler.cmd_new = qmp_marshal_input_cpu_del,
+},
+
+SQMP
+cpu-del
+---
+
+Deletes virtual cpu
+
+Arguments:
+
+- "id": cpu id (json-int)
+
+Example:
+
+-> { "execute": "cpu-del", "arguments": { "id": 2 } }
+<- { "return": {} }
+
+EQMP
+
+{
 .name   = "memsave",
 .args_type  = "val:l,size:i,filename:s,cpu:i?",
 .mhandler.cmd_new = qmp_marshal_input_memsave,
diff --git a/qmp.c b/qmp.c
index 4c149b3..84dc873 100644
--- a/qmp.c
+++ b/qmp.c
@@ -118,6 +118,15 @@ void qmp_cpu_add(int64_t id, Error **errp)
 }
 }
 
+void qmp_cpu_del(int64_t id, Error **errp)
+{
+if (current_machine->hot_del_cpu) {
+current_machine->hot_del_cpu(id, errp);
+} else {
+error_setg(errp, "Not supported");
+}
+}
+
 #ifndef CONFIG_VNC
 /* If VNC support is enabled, the "true" query-vnc command is
defined in the VNC subsystem */
-- 
1.8.1.4




[Qemu-devel] [RFC qom-cpu v4 10/10] cpus: reclaim allocated vCPU objects

2013-10-09 Thread Chen Fan
After ACPI gets a signal to eject a vCPU, it notifies the vCPU thread to exit
in KVM, and the vCPU must be removed from the CPU list. Before the vCPU is
really removed, all related vCPU objects are released.

Signed-off-by: Chen Fan 
---
 cpus.c   | 46 ++
 hw/acpi/piix4.c  | 23 +--
 include/qom/cpu.h| 10 ++
 include/sysemu/kvm.h |  1 +
 kvm-all.c| 25 +
 5 files changed, 99 insertions(+), 6 deletions(-)

diff --git a/cpus.c b/cpus.c
index 4ace860..942af0a 100644
--- a/cpus.c
+++ b/cpus.c
@@ -714,6 +714,26 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void 
*data), void *data)
 qemu_cpu_kick(cpu);
 }
 
+static void qemu_kvm_destroy_vcpu(CPUState *cpu)
+{
+CPU_REMOVE(cpu);
+
+if (kvm_destroy_vcpu(cpu) < 0) {
+fprintf(stderr, "kvm_destroy_vcpu failed.\n");
+exit(1);
+}
+
+object_property_set_bool(OBJECT(cpu), false, "realized", NULL);
+qdev_free(DEVICE(cpu));
+}
+
+static void qemu_tcg_destroy_vcpu(CPUState *cpu)
+{
+CPU_REMOVE(cpu);
+object_property_set_bool(OBJECT(cpu), false, "realized", NULL);
+qdev_free(DEVICE(cpu));
+}
+
 static void flush_queued_work(CPUState *cpu)
 {
 struct qemu_work_item *wi;
@@ -805,6 +825,11 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
 }
 }
 qemu_kvm_wait_io_event(cpu);
+if (cpu->exit && !cpu_can_run(cpu)) {
+qemu_kvm_destroy_vcpu(cpu);
+qemu_mutex_unlock(&qemu_global_mutex);
+return NULL;
+}
 }
 
 return NULL;
@@ -857,6 +882,7 @@ static void tcg_exec_all(void);
 static void *qemu_tcg_cpu_thread_fn(void *arg)
 {
 CPUState *cpu = arg;
+CPUState *remove_cpu = NULL;
 
 qemu_tcg_init_cpu_signals();
 qemu_thread_get_self(cpu->thread);
@@ -889,6 +915,16 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
 }
 }
 qemu_tcg_wait_io_event();
+CPU_FOREACH(cpu) {
+if (cpu->exit && !cpu_can_run(cpu)) {
+remove_cpu = cpu;
+break;
+}
+}
+if (remove_cpu) {
+qemu_tcg_destroy_vcpu(remove_cpu);
+remove_cpu = NULL;
+}
 }
 
 return NULL;
@@ -1045,6 +1081,13 @@ void resume_all_vcpus(void)
 }
 }
 
+void cpu_remove(CPUState *cpu)
+{
+cpu->stop = true;
+cpu->exit = true;
+qemu_cpu_kick(cpu);
+}
+
 static void qemu_tcg_init_vcpu(CPUState *cpu)
 {
 /* share a single thread for all cpus with TCG */
@@ -1219,6 +1262,9 @@ static void tcg_exec_all(void)
 break;
 }
 } else if (cpu->stop || cpu->stopped) {
+if (cpu->exit) {
+next_cpu = CPU_NEXT(cpu);
+}
 break;
 }
 }
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index fd27001..bde8123 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -612,10 +612,21 @@ static const MemoryRegionOps piix4_pci_ops = {
 },
 };
 
-static void acpi_piix_eject_vcpu(int64_t cpuid)
+static void acpi_piix_eject_vcpu(PIIX4PMState *s, int64_t cpuid)
 {
-/* TODO: eject a vcpu, release allocated vcpu and exit the vcpu pthread.  
*/
-PIIX4_DPRINTF("vcpu: %" PRIu64 " need to be ejected.\n", cpuid);
+CPUStatus *g = &s->gpe_cpu;
+CPUState *cpu;
+
+CPU_FOREACH(cpu) {
+CPUClass *cc = CPU_GET_CLASS(cpu);
+int64_t id = cc->get_arch_id(cpu);
+
+if (cpuid == id) {
+g->old_sts[cpuid / 8] &= ~(1 << (cpuid % 8));
+cpu_remove(cpu);
+break;
+}
+}
 }
 
 static uint64_t cpu_status_read(void *opaque, hwaddr addr, unsigned int size)
@@ -634,7 +645,7 @@ static void cpu_status_write(void *opaque, hwaddr addr, 
uint64_t data,
 CPUStatus *cpus = &s->gpe_cpu;
 uint8_t val;
 int i;
-int64_t cpuid = 0;
+int64_t cpuid = -1;
 
 val = cpus->old_sts[addr] ^ data;
 
@@ -648,8 +659,8 @@ static void cpu_status_write(void *opaque, hwaddr addr, 
uint64_t data,
 }
 }
 
-if (cpuid != 0) {
-acpi_piix_eject_vcpu(cpuid);
+if (cpuid != -1) {
+acpi_piix_eject_vcpu(s, cpuid);
 }
 }
 
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 0238532..eb8d32b 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -181,6 +181,7 @@ struct CPUState {
 bool created;
 bool stop;
 bool stopped;
+bool exit;
 volatile sig_atomic_t exit_request;
 volatile sig_atomic_t tcg_exit_req;
 uint32_t interrupt_request;
@@ -206,6 +207,7 @@ struct CPUState {
 QTAILQ_HEAD(CPUTailQ, CPUState);
 extern struct CPUTailQ cpus;
 #define CPU_NEXT(cpu) QTAILQ_NEXT(cpu, node)
+#define CPU_REMOVE(cpu) QTAILQ_REMOVE(&cpus, cpu, node)
 #define CPU_FOREACH(cpu) QTAILQ_FOREACH(cpu, 

[Qemu-devel] [RFC qom-cpu v4 07/10] qom cpu: add UNPLUG cpu notifier support

2013-10-09 Thread Chen Fan
Move enum HotplugEventType from piix4.c to include/qom/cpu.h,
and add struct CPUNotifier to support the UNPLUG CPU notifier.

Signed-off-by: Chen Fan 
---
 hw/acpi/piix4.c   |  8 ++--
 include/qom/cpu.h | 10 ++
 qom/cpu.c |  6 +-
 3 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 06f55d6..dc506bf 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -636,11 +636,6 @@ static const MemoryRegionOps cpu_hotplug_ops = {
 },
 };
 
-typedef enum {
-PLUG,
-UNPLUG,
-} HotplugEventType;
-
 static void piix4_cpu_hotplug_req(PIIX4PMState *s, CPUState *cpu,
   HotplugEventType action)
 {
@@ -664,8 +659,9 @@ static void piix4_cpu_hotplug_req(PIIX4PMState *s, CPUState 
*cpu,
 static void piix4_cpu_hotplug(Notifier *n, void *opaque)
 {
 PIIX4PMState *s = container_of(n, PIIX4PMState, cpu_hotplug_notifier);
+CPUNotifier *notifier = opaque;
 
-piix4_cpu_hotplug_req(s, CPU(opaque), PLUG);
+piix4_cpu_hotplug_req(s, CPU(notifier->dev), notifier->type);
 }
 
 static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev,
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 7739e00..0238532 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -507,6 +507,16 @@ void qemu_init_vcpu(CPUState *cpu);
  */
 void cpu_single_step(CPUState *cpu, int enabled);
 
+typedef enum {
+PLUG,
+UNPLUG,
+} HotplugEventType;
+
+typedef struct CPUNotifier {
+DeviceState *dev;
+HotplugEventType type;
+} CPUNotifier;
+
 #ifdef CONFIG_SOFTMMU
 extern const struct VMStateDescription vmstate_cpu_common;
 #else
diff --git a/qom/cpu.c b/qom/cpu.c
index 0913c9c..d20783b 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -216,10 +216,14 @@ static ObjectClass *cpu_common_class_by_name(const char 
*cpu_model)
 static void cpu_common_realizefn(DeviceState *dev, Error **errp)
 {
 CPUState *cpu = CPU(dev);
+CPUNotifier notifier;
+
+notifier.dev = dev;
+notifier.type = PLUG;
 
 if (dev->hotplugged) {
 cpu_synchronize_post_init(cpu);
-notifier_list_notify(&cpu_hotplug_notifiers, dev);
+notifier_list_notify(&cpu_hotplug_notifiers, &notifier);
 cpu_resume(cpu);
 }
 }
-- 
1.8.1.4
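
The change above replaces the bare device pointer passed to listeners with a small struct that also carries the event type. A minimal sketch of that pattern, independent of QEMU, might look like the following. Listener, listeners_notify() and HotplugCounter are hypothetical names invented for the illustration (QEMU's own mechanism is Notifier/NotifierList), and dev is a void pointer standing in for DeviceState.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { PLUG, UNPLUG } HotplugEventType;

/* Payload handed to every listener: which device, and what happened. */
typedef struct CPUNotifier {
    void *dev;                  /* stand-in for DeviceState * */
    HotplugEventType type;
} CPUNotifier;

/* Hypothetical minimal listener list. */
typedef struct Listener {
    void (*notify)(struct Listener *self, void *data);
    struct Listener *next;
} Listener;

/* Call every registered listener with the same opaque payload. */
static void listeners_notify(Listener *head, void *data)
{
    for (Listener *l = head; l != NULL; l = l->next) {
        l->notify(l, data);
    }
}

/* Example listener that counts plug and unplug events separately. */
typedef struct HotplugCounter {
    Listener listener;          /* first member, so the cast below is valid */
    int plugged;
    int unplugged;
} HotplugCounter;

static void hotplug_counter_cb(Listener *self, void *data)
{
    HotplugCounter *hc = (HotplugCounter *)self;
    CPUNotifier *event = data;

    assert(event != NULL);
    if (event->type == PLUG) {
        hc->plugged++;
    } else {
        hc->unplugged++;
    }
}
```

One callback can then serve both directions, which is what lets piix4_cpu_hotplug() forward notifier->type instead of hard-coding PLUG.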




[Qemu-devel] [PATCH v1 3/3] x86: move apic_state field from CPUX86State to X86CPU

2013-10-22 Thread Chen Fan
Signed-off-by: Chen Fan 
---
 cpu-exec.c|  2 +-
 cpus.c|  5 ++---
 hw/i386/kvmvapic.c|  8 +++-
 hw/i386/pc.c  | 17 -
 hw/intc/apic.c|  8 
 target-i386/cpu-qom.h |  4 
 target-i386/cpu.c | 22 ++
 target-i386/cpu.h |  4 
 target-i386/helper.c  |  9 -
 target-i386/kvm.c | 23 ++-
 target-i386/misc_helper.c |  8 
 11 files changed, 50 insertions(+), 60 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 30cfa2a..2711c58 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -320,7 +320,7 @@ int cpu_exec(CPUArchState *env)
 #if !defined(CONFIG_USER_ONLY)
 if (interrupt_request & CPU_INTERRUPT_POLL) {
 cpu->interrupt_request &= ~CPU_INTERRUPT_POLL;
-apic_poll_irq(env->apic_state);
+apic_poll_irq(x86_env_get_cpu(env)->apic_state);
 }
 #endif
 if (interrupt_request & CPU_INTERRUPT_INIT) {
diff --git a/cpus.c b/cpus.c
index e566297..4ace860 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1383,12 +1383,11 @@ void qmp_inject_nmi(Error **errp)
 
 CPU_FOREACH(cs) {
 X86CPU *cpu = X86_CPU(cs);
-CPUX86State *env = &cpu->env;
 
-if (!env->apic_state) {
+if (!cpu->apic_state) {
 cpu_interrupt(cs, CPU_INTERRUPT_NMI);
 } else {
-apic_deliver_nmi(env->apic_state);
+apic_deliver_nmi(cpu->apic_state);
 }
 }
 #elif defined(TARGET_S390X)
diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
index 1c2dbf5..9fa346b 100644
--- a/hw/i386/kvmvapic.c
+++ b/hw/i386/kvmvapic.c
@@ -366,7 +366,7 @@ static int vapic_enable(VAPICROMState *s, X86CPU *cpu)
 (((hwaddr)cpu_number) << VAPIC_CPU_SHIFT);
 cpu_physical_memory_rw(vapic_paddr + offsetof(VAPICState, enabled),
(void *)&enabled, sizeof(enabled), 1);
-apic_enable_vapic(cpu->env.apic_state, vapic_paddr);
+apic_enable_vapic(cpu->apic_state, vapic_paddr);
 
 s->state = VAPIC_ACTIVE;
 
@@ -496,12 +496,10 @@ static void vapic_enable_tpr_reporting(bool enable)
 };
 CPUState *cs;
 X86CPU *cpu;
-CPUX86State *env;
 
 CPU_FOREACH(cs) {
 cpu = X86_CPU(cs);
-env = &cpu->env;
-info.apic = env->apic_state;
+info.apic = cpu->apic_state;
 run_on_cpu(cs, vapic_do_enable_tpr_reporting, &info);
 }
 }
@@ -697,7 +695,7 @@ static void vapic_write(void *opaque, hwaddr addr, uint64_t 
data,
 default:
 case 4:
 if (!kvm_irqchip_in_kernel()) {
-apic_poll_irq(env->apic_state);
+apic_poll_irq(cpu->apic_state);
 }
 break;
 }
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 0c313fe..832c9b2 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -169,13 +169,14 @@ void cpu_smm_update(CPUX86State *env)
 int cpu_get_pic_interrupt(CPUX86State *env)
 {
 int intno;
+X86CPU *cpu = x86_env_get_cpu(env);
 
-intno = apic_get_interrupt(env->apic_state);
+intno = apic_get_interrupt(cpu->apic_state);
 if (intno >= 0) {
 return intno;
 }
 /* read the irq from the PIC */
-if (!apic_accept_pic_intr(env->apic_state)) {
+if (!apic_accept_pic_intr(cpu->apic_state)) {
 return -1;
 }
 
@@ -187,15 +188,13 @@ static void pic_irq_request(void *opaque, int irq, int 
level)
 {
 CPUState *cs = first_cpu;
 X86CPU *cpu = X86_CPU(cs);
-CPUX86State *env = &cpu->env;
 
 DPRINTF("pic_irqs: %s irq %d\n", level? "raise" : "lower", irq);
-if (env->apic_state) {
+if (cpu->apic_state) {
 CPU_FOREACH(cs) {
 cpu = X86_CPU(cs);
-env = &cpu->env;
-if (apic_accept_pic_intr(env->apic_state)) {
-apic_deliver_pic_intr(env->apic_state, level);
+if (apic_accept_pic_intr(cpu->apic_state)) {
+apic_deliver_pic_intr(cpu->apic_state, level);
 }
 }
 } else {
@@ -890,7 +889,7 @@ DeviceState *cpu_get_current_apic(void)
 {
 if (current_cpu) {
 X86CPU *cpu = X86_CPU(current_cpu);
-return cpu->env.apic_state;
+return cpu->apic_state;
 } else {
 return NULL;
 }
@@ -984,7 +983,7 @@ void pc_cpus_init(const char *cpu_model, DeviceState 
*icc_bridge)
 }
 
 /* map APIC MMIO area if CPU has APIC */
-if (cpu && cpu->env.apic_state) {
+if (cpu && cpu->apic_state) {
 /* XXX: what if the base changes? */
 sysbus_mmio_map_overlap(SYS_BUS_DEVICE(icc_bridge), 0,
 APIC_DEFAULT_ADDRESS, 0x1000);
diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index fc18600..74edf

[Qemu-devel] [PATCH v1 1/3] Change apic/kvm/xen to use QOM typing

2013-10-22 Thread Chen Fan
Get rid of unused icc_device_realize()
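
The pattern this patch applies (a subclass saves the parent's realize pointer in class_init, installs its own, and calls the saved one at the end of its realize) can be sketched in plain C as below. This is an illustration only: Device, DeviceClass, the int error flag and all function names are hypothetical simplifications and not QEMU's QOM API.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Device Device;
typedef void (*RealizeFn)(Device *dev, int *err);

/* Hypothetical, flattened stand-in for DeviceClass plus APICCommonClass. */
typedef struct DeviceClass {
    RealizeFn realize;          /* slot invoked when the device is realized */
    RealizeFn parent_realize;   /* parent implementation saved by a subclass */
} DeviceClass;

struct Device {
    DeviceClass *klass;
    int backend_ready;
    int common_ready;
};

/* Parent step: relies on state the backend has already set up. */
static void common_realize(Device *dev, int *err)
{
    if (!dev->backend_ready) {
        *err = 1;
        return;
    }
    dev->common_ready = 1;
}

/* Subclass step: do backend-specific setup, then chain to the parent. */
static void backend_realize(Device *dev, int *err)
{
    dev->backend_ready = 1;
    assert(dev->klass->parent_realize != NULL);
    dev->klass->parent_realize(dev, err);
}

static void common_class_init(DeviceClass *k)
{
    k->realize = common_realize;
    k->parent_realize = NULL;
}

static void backend_class_init(DeviceClass *k)
{
    k->parent_realize = k->realize;   /* remember the parent's hook */
    k->realize = backend_realize;     /* then override it */
}
```

Chaining at the end of the subclass hook matters here because the common code uses state (in the patch, s->io_memory) that the backend initializes first.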

Signed-off-by: Chen Fan 
---
 hw/cpu/icc_bus.c| 17 -
 hw/i386/kvm/apic.c  | 10 --
 hw/intc/apic.c  | 18 --
 hw/intc/apic_common.c   | 17 +++--
 hw/xen/xen_apic.c   | 11 +--
 include/hw/cpu/icc_bus.h|  1 -
 include/hw/i386/apic_internal.h |  3 ++-
 7 files changed, 38 insertions(+), 39 deletions(-)

diff --git a/hw/cpu/icc_bus.c b/hw/cpu/icc_bus.c
index 9a4ea7e..5038836 100644
--- a/hw/cpu/icc_bus.c
+++ b/hw/cpu/icc_bus.c
@@ -38,27 +38,10 @@ static const TypeInfo icc_bus_info = {
 .instance_init = icc_bus_init,
 };
 
-
-/* icc-device implementation */
-
-static void icc_device_realize(DeviceState *dev, Error **errp)
-{
-ICCDevice *id = ICC_DEVICE(dev);
-ICCDeviceClass *idc = ICC_DEVICE_GET_CLASS(id);
-
-if (idc->init) {
-if (idc->init(id) < 0) {
-error_setg(errp, "%s initialization failed.",
-   object_get_typename(OBJECT(dev)));
-}
-}
-}
-
 static void icc_device_class_init(ObjectClass *oc, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(oc);
 
-dc->realize = icc_device_realize;
 dc->bus_type = TYPE_ICC_BUS;
 }
 
diff --git a/hw/i386/kvm/apic.c b/hw/i386/kvm/apic.c
index 5609063..ba30599 100644
--- a/hw/i386/kvm/apic.c
+++ b/hw/i386/kvm/apic.c
@@ -171,21 +171,27 @@ static const MemoryRegionOps kvm_apic_io_ops = {
 .endianness = DEVICE_NATIVE_ENDIAN,
 };
 
-static void kvm_apic_init(APICCommonState *s)
+static void kvm_apic_realize(DeviceState *dev, Error **errp)
 {
+APICCommonState *s = APIC_COMMON(dev);
+APICCommonClass *acc = APIC_COMMON_GET_CLASS(s);
+
 memory_region_init_io(&s->io_memory, NULL, &kvm_apic_io_ops, s, 
"kvm-apic-msi",
   APIC_SPACE_SIZE);
 
 if (kvm_has_gsi_routing()) {
 msi_supported = true;
 }
+acc->parent_realize(dev, errp);
 }
 
 static void kvm_apic_class_init(ObjectClass *klass, void *data)
 {
 APICCommonClass *k = APIC_COMMON_CLASS(klass);
+DeviceClass *dc = DEVICE_CLASS(klass);
 
-k->init = kvm_apic_init;
+k->parent_realize = dc->realize;
+dc->realize = kvm_apic_realize;
 k->set_base = kvm_apic_set_base;
 k->set_tpr = kvm_apic_set_tpr;
 k->get_tpr = kvm_apic_get_tpr;
diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index a913186..8080e20 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -871,22 +871,36 @@ static const MemoryRegionOps apic_io_ops = {
 .endianness = DEVICE_NATIVE_ENDIAN,
 };
 
-static void apic_init(APICCommonState *s)
+static void apic_realize(DeviceState *dev, Error **errp)
 {
+APICCommonState *s = APIC_COMMON(dev);
+APICCommonClass *acc = APIC_COMMON_GET_CLASS(s);
+static int apic_no;
+
+if (apic_no >= MAX_APICS) {
+error_setg(errp, "the new apic number: %d "
+   "exceeded max apic number", apic_no);
+return;
+}
+
 memory_region_init_io(&s->io_memory, OBJECT(s), &apic_io_ops, s, 
"apic-msi",
   APIC_SPACE_SIZE);
 
 s->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, apic_timer, s);
+s->idx = apic_no++;
 local_apics[s->idx] = s;
 
 msi_supported = true;
+acc->parent_realize(dev, errp);
 }
 
 static void apic_class_init(ObjectClass *klass, void *data)
 {
 APICCommonClass *k = APIC_COMMON_CLASS(klass);
+DeviceClass *dc = DEVICE_CLASS(klass);
 
-k->init = apic_init;
+k->parent_realize = dc->realize;
+dc->realize = apic_realize;
 k->set_base = apic_set_base;
 k->set_tpr = apic_set_tpr;
 k->get_tpr = apic_get_tpr;
diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index a0beb10..eac538f 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -284,21 +284,13 @@ static int apic_load_old(QEMUFile *f, void *opaque, int 
version_id)
 return 0;
 }
 
-static int apic_init_common(ICCDevice *dev)
+static void apic_common_realize(DeviceState *dev, Error **errp)
 {
 APICCommonState *s = APIC_COMMON(dev);
-APICCommonClass *info;
+APICCommonClass *info = APIC_COMMON_GET_CLASS(s);
 static DeviceState *vapic;
-static int apic_no;
 static bool mmio_registered;
 
-if (apic_no >= MAX_APICS) {
-return -1;
-}
-s->idx = apic_no++;
-
-info = APIC_COMMON_GET_CLASS(s);
-info->init(s);
 if (!mmio_registered) {
 ICCBus *b = ICC_BUS(qdev_get_parent_bus(DEVICE(dev)));
 memory_region_add_subregion(b->apic_address_space, 0, &s->io_memory);
@@ -314,8 +306,6 @@ static int apic_init_common(ICCDevice *dev)
 if (apic_report_tpr_access && info->enable_tpr_reporting) {
 info->enable_tpr_reporting(s, true);
 }
-
-return 0;
 }
 
 static void apic_dispatch_pr

[Qemu-devel] [PATCH v1 2/3] Using CPU_FOREACH() instead of scanning local_apics

2013-10-22 Thread Chen Fan
Also drop the MAX_APICS cast macro altogether.
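
The reworked foreach_apic logic, which walks the list of live APICs and tests each one's bit in the delivery bitmask instead of scanning a fixed array, can be sketched as follows. The names struct apic, bitmask_set() and deliver_to_selected() are hypothetical, a plain singly linked list stands in for CPU_FOREACH, and MAX_APIC_WORDS is assumed here to be 8 (256 possible indices).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_APIC_WORDS 8        /* assumed: 8 * 32 bits = 256 APIC indices */

/* Hypothetical stand-in for APICCommonState. */
struct apic {
    uint8_t idx;                /* index into the delivery bitmask */
    int irq_count;              /* how many interrupts were delivered */
    struct apic *next;
};

/* Mark APIC index idx as a delivery target. */
static void bitmask_set(uint32_t *mask, unsigned idx)
{
    mask[idx / 32] |= UINT32_C(1) << (idx % 32);
}

/*
 * Visit every APIC on the list and deliver to those whose bit is set.
 * idx is a uint8_t, so idx / 32 is always below MAX_APIC_WORDS.
 * Returns the number of APICs that received the interrupt.
 */
static int deliver_to_selected(struct apic *head, const uint32_t *mask)
{
    int delivered = 0;

    assert(mask != NULL);
    for (struct apic *a = head; a != NULL; a = a->next) {
        if (mask[a->idx / 32] & (UINT32_C(1) << (a->idx % 32))) {
            a->irq_count++;
            delivered++;
        }
    }
    return delivered;
}
```

Iterating over existing APICs avoids having to check for empty array slots, which is what makes the local_apics array unnecessary.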

Signed-off-by: Chen Fan 
---
 hw/intc/apic.c  | 82 +
 include/hw/i386/apic_internal.h |  2 -
 2 files changed, 33 insertions(+), 51 deletions(-)

diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index 8080e20..fc18600 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -32,8 +32,6 @@
 #define SYNC_TO_VAPIC   0x2
 #define SYNC_ISR_IRR_TO_VAPIC   0x4
 
-static APICCommonState *local_apics[MAX_APICS + 1];
-
 static void apic_set_irq(APICCommonState *s, int vector_num, int trigger_mode);
 static void apic_update_irq(APICCommonState *s);
 static void apic_get_delivery_bitmask(uint32_t *deliver_bitmask,
@@ -200,18 +198,15 @@ static void apic_external_nmi(APICCommonState *s)
 
 #define foreach_apic(apic, deliver_bitmask, code) \
 {\
+CPUState *cpu;\
 int __i, __j, __mask;\
-for(__i = 0; __i < MAX_APIC_WORDS; __i++) {\
+CPU_FOREACH(cpu) {\
+apic = APIC_COMMON(X86_CPU(cpu)->env.apic_state);\
+__i = apic->idx / 32;\
+__j = apic->idx % 32;\
 __mask = deliver_bitmask[__i];\
-if (__mask) {\
-for(__j = 0; __j < 32; __j++) {\
-if (__mask & (1 << __j)) {\
-apic = local_apics[__i * 32 + __j];\
-if (apic) {\
-code;\
-}\
-}\
-}\
+if (__mask & (1 << __j)) {\
+code;\
 }\
 }\
 }
@@ -235,9 +230,13 @@ static void apic_bus_deliver(const uint32_t *deliver_bitmask,
 }
 }
 if (d >= 0) {
-apic_iter = local_apics[d];
-if (apic_iter) {
-apic_set_irq(apic_iter, vector_num, trigger_mode);
+CPUState *cpu;
+CPU_FOREACH(cpu) {
+apic_iter = APIC_COMMON(X86_CPU(cpu)->env.apic_state);
+if (apic_iter->idx == d) {
+apic_set_irq(apic_iter, vector_num, trigger_mode);
+break;
+}
 }
 }
 }
@@ -422,18 +421,14 @@ static void apic_eoi(APICCommonState *s)
 
 static int apic_find_dest(uint8_t dest)
 {
-APICCommonState *apic = local_apics[dest];
-int i;
-
-if (apic && apic->id == dest)
-return dest;  /* shortcut in case apic->id == apic->idx */
+APICCommonState *apic;
+CPUState *cpu;
 
-for (i = 0; i < MAX_APICS; i++) {
-apic = local_apics[i];
-   if (apic && apic->id == dest)
-return i;
-if (!apic)
-break;
+CPU_FOREACH(cpu) {
+apic = APIC_COMMON(X86_CPU(cpu)->env.apic_state);
+if (apic->id == dest) {
+return apic->idx;
+}
 }
 
 return -1;
@@ -443,7 +438,7 @@ static void apic_get_delivery_bitmask(uint32_t *deliver_bitmask,
   uint8_t dest, uint8_t dest_mode)
 {
 APICCommonState *apic_iter;
-int i;
+CPUState *cpu;
 
 if (dest_mode == 0) {
 if (dest == 0xff) {
@@ -457,20 +452,17 @@ static void apic_get_delivery_bitmask(uint32_t *deliver_bitmask,
 } else {
 /* XXX: cluster mode */
 memset(deliver_bitmask, 0x00, MAX_APIC_WORDS * sizeof(uint32_t));
-for(i = 0; i < MAX_APICS; i++) {
-apic_iter = local_apics[i];
-if (apic_iter) {
-if (apic_iter->dest_mode == 0xf) {
-if (dest & apic_iter->log_dest)
-apic_set_bit(deliver_bitmask, i);
-} else if (apic_iter->dest_mode == 0x0) {
-if ((dest & 0xf0) == (apic_iter->log_dest & 0xf0) &&
-(dest & apic_iter->log_dest & 0x0f)) {
-apic_set_bit(deliver_bitmask, i);
-}
+CPU_FOREACH(cpu) {
+apic_iter = APIC_COMMON(X86_CPU(cpu)->env.apic_state);
+if (apic_iter->dest_mode == 0xf) {
+if (dest & apic_iter->log_dest) {
+apic_set_bit(deliver_bitmask, apic_iter->idx);
+}
+} else if (apic_iter->dest_mode == 0x0) {
+if ((dest & 0xf0) == (apic_iter->log_dest & 0xf0) &&
+(dest & apic_iter->log_dest & 0x0f)) {
+apic_set_bit(deliver_bitmask, apic_iter->idx);
 }
-} else {
-break;
 }
 }
 }
@@ -875,20 +867,12 @@ static void apic_realize(DeviceState *dev, Error **errp)
 {
 APICCommonState *s = APIC_COMMON(dev);
 APICCommonClass *acc = APIC_COMMON_GET_CLASS(s);
-   

[Qemu-devel] [PATCH v1 0/3] refactor x86 apic to QOM typing

2013-10-22 Thread Chen Fan
In order to implement 'cpu-del' in the future, the x86 APIC code first
needs to be refactored.
This series converts the init() callbacks of apic/kvm/xen to realize(),
drops local_apics[] from hw/intc/apic.c, and
moves the apic_state field from CPUX86State to X86CPU.
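
As a rough illustration of the init() to realize() conversion, the pattern
is that the subclass saves the parent's realize hook in its class_init and
calls it from its own realize. The sketch below models that with plain
function pointers; all sketch_* names are hypothetical, and the real code
uses DeviceClass, DeviceState and Error ** rather than int return values.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_dev;
typedef int (*sketch_realize_fn)(struct sketch_dev *dev);

/* Hypothetical class: the hook the core calls plus the saved parent hook. */
struct sketch_class {
    sketch_realize_fn realize;
    sketch_realize_fn parent_realize;
};

struct sketch_dev {
    const struct sketch_class *klass;
    int common_ready;
    int backend_ready;
};

/* Parent (common) realize: sets up state shared by all backends. */
int sketch_common_realize(struct sketch_dev *dev)
{
    dev->common_ready = 1;
    return 0;
}

/* Backend realize: run the saved parent hook first, then our own setup. */
int sketch_backend_realize(struct sketch_dev *dev)
{
    assert(dev->klass != NULL);
    if (dev->klass->parent_realize) {
        int ret = dev->klass->parent_realize(dev);
        if (ret < 0) {
            return ret;
        }
    }
    dev->backend_ready = 1;
    return 0;
}

/* class_init: remember the inherited hook, then override it. */
void sketch_backend_class_init(struct sketch_class *k)
{
    k->parent_realize = k->realize;
    k->realize = sketch_backend_realize;
}
```

With this arrangement the common code no longer needs a separate init()
callback to reach into the backend.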
 
Chen Fan (3):
  Change apic/kvm/xen to use QOM typing
  Using CPU_FOREACH() instead of scanning local_apics
  x86: move apic_state field from CPUX86State to X86CPU

 cpu-exec.c  |  2 +-
 cpus.c  |  5 +--
 hw/cpu/icc_bus.c| 17 -
 hw/i386/kvm/apic.c  | 10 -
 hw/i386/kvmvapic.c  |  8 ++--
 hw/i386/pc.c| 17 -
 hw/intc/apic.c  | 84 -
 hw/intc/apic_common.c   | 17 ++---
 hw/xen/xen_apic.c   | 11 +-
 include/hw/cpu/icc_bus.h|  1 -
 include/hw/i386/apic_internal.h |  5 +--
 target-i386/cpu-qom.h   |  4 ++
 target-i386/cpu.c   | 22 +--
 target-i386/cpu.h   |  4 --
 target-i386/helper.c|  9 ++---
 target-i386/kvm.c   | 23 +--
 target-i386/misc_helper.c   |  8 ++--
 17 files changed, 109 insertions(+), 138 deletions(-)

-- 
1.8.1.4




[Qemu-devel] [PATCH 1/1] docs/ccid.txt: fix the typo

2013-10-24 Thread Weng Fan

From: WengFan 
Date: Wed, 25 Oct 2013 11:18:22 -0400
Subject: [PATCH 1/1] fix the typo

Signed-off-by: WengFan 
---
 qemu-master/docs/ccid.txt |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/qemu-master/docs/ccid.txt b/qemu-master/docs/ccid.txt
index 8bbaa94..83c174d 100644
--- a/qemu-master/docs/ccid.txt
+++ b/qemu-master/docs/ccid.txt
@@ -52,7 +52,7 @@ Configuring and building:
 Assuming you have a working smartcard on the host with the current
 user, using NSS, qemu acts as another NSS client using ccid-card-emulated:

-qemu -usb -device usb-ccid -device ccid-card-emualated
+qemu -usb -device usb-ccid -device ccid-card-emulated

 4. Using ccid-card-emulated with certificates

--
1.7.1




Re: [Qemu-devel] [PATCH v2 2/4] apic: QOM'ify apic & icc_bus

2013-11-05 Thread Chen Fan
On Tue, 2013-11-05 at 15:55 +0800, xiaoqiang zhao wrote:
> Changes include:
> 1. use type constant for apic and kvm_apic
> 2. convert function 'init' to QOM's 'realize' for apic/kvm_apic
> 3. for consistency, also QOM'ify apic's parent bus 'icc_bus'
> ---
>  hw/cpu/icc_bus.c|   14 ++
>  hw/i386/kvm/apic.c  |   10 +++---
>  hw/intc/apic.c  |   10 +++---
>  hw/intc/apic_common.c   |   13 +++--
>  include/hw/cpu/icc_bus.h|3 ++-
>  include/hw/i386/apic_internal.h |5 +++--
>  6 files changed, 32 insertions(+), 23 deletions(-)
> 
> diff --git a/hw/cpu/icc_bus.c b/hw/cpu/icc_bus.c
> index 9a4ea7e..1cc64ac 100644
> --- a/hw/cpu/icc_bus.c
> +++ b/hw/cpu/icc_bus.c
> @@ -43,15 +43,13 @@ static const TypeInfo icc_bus_info = {
>  
>  static void icc_device_realize(DeviceState *dev, Error **errp)
>  {
> -ICCDevice *id = ICC_DEVICE(dev);
> -ICCDeviceClass *idc = ICC_DEVICE_GET_CLASS(id);
> -
> -if (idc->init) {
> -if (idc->init(id) < 0) {
> -error_setg(errp, "%s initialization failed.",
> -   object_get_typename(OBJECT(dev)));
> -}
> +ICCDeviceClass *idc = ICC_DEVICE_GET_CLASS(dev);
> +
> +/* convert to QOM */
> +if (idc->realize) {
> + idc->realize(dev, errp);
>  }
> +
>  }
>  
>  static void icc_device_class_init(ObjectClass *oc, void *data)
> diff --git a/hw/i386/kvm/apic.c b/hw/i386/kvm/apic.c
> index 84f6056..55f4a53 100644
> --- a/hw/i386/kvm/apic.c
> +++ b/hw/i386/kvm/apic.c
> @@ -13,6 +13,8 @@
>  #include "hw/pci/msi.h"
>  #include "sysemu/kvm.h"
>  
> +#define TYPE_KVM_APIC "kvm-apic"
> +
>  static inline void kvm_apic_set_reg(struct kvm_lapic_state *kapic,
>  int reg_id, uint32_t val)
>  {
> @@ -171,8 +173,10 @@ static const MemoryRegionOps kvm_apic_io_ops = {
>  .endianness = DEVICE_NATIVE_ENDIAN,
>  };
>  
> -static void kvm_apic_init(APICCommonState *s)
> +static void kvm_apic_realize(DeviceState *dev, Error **errp)
>  {
> +APICCommonState *s = APIC_COMMON(dev);
> +
>  memory_region_init_io(&s->io_memory, NULL, &kvm_apic_io_ops, s, "kvm-apic-msi",
>APIC_SPACE_SIZE);
>  
> @@ -185,7 +189,7 @@ static void kvm_apic_class_init(ObjectClass *klass, void *data)
>  {
>  APICCommonClass *k = APIC_COMMON_CLASS(klass);
>  
> -k->init = kvm_apic_init;
> +k->realize = kvm_apic_realize;
>  k->set_base = kvm_apic_set_base;
>  k->set_tpr = kvm_apic_set_tpr;
>  k->get_tpr = kvm_apic_get_tpr;
> @@ -195,7 +199,7 @@ static void kvm_apic_class_init(ObjectClass *klass, void *data)
>  }
>  
>  static const TypeInfo kvm_apic_info = {
> -.name = "kvm-apic",
> +.name = TYPE_KVM_APIC,
>  .parent = TYPE_APIC_COMMON,
>  .instance_size = sizeof(APICCommonState),
>  .class_init = kvm_apic_class_init,
> diff --git a/hw/intc/apic.c b/hw/intc/apic.c
> index b542628..2d7891d 100644
> --- a/hw/intc/apic.c
> +++ b/hw/intc/apic.c
> @@ -32,6 +32,8 @@
>  #define SYNC_TO_VAPIC   0x2
>  #define SYNC_ISR_IRR_TO_VAPIC   0x4
>  
> +#define TYPE_APIC "apic"
> +
>  static APICCommonState *local_apics[MAX_APICS + 1];
>  
>  static void apic_set_irq(APICCommonState *s, int vector_num, int trigger_mode);
> @@ -871,8 +873,10 @@ static const MemoryRegionOps apic_io_ops = {
>  .endianness = DEVICE_NATIVE_ENDIAN,
>  };
>  
> -static void apic_init(APICCommonState *s)
> +static void apic_realize(DeviceState *dev, Error **errp)
>  {
> +APICCommonState *s = APIC_COMMON(dev);
> +
>  memory_region_init_io(&s->io_memory, OBJECT(s), &apic_io_ops, s, "apic-msi",
>APIC_SPACE_SIZE);
>  
> @@ -886,7 +890,7 @@ static void apic_class_init(ObjectClass *klass, void *data)
>  {
>  APICCommonClass *k = APIC_COMMON_CLASS(klass);
>  
> -k->init = apic_init;
> +k->realize = apic_realize;
>  k->set_base = apic_set_base;
>  k->set_tpr = apic_set_tpr;
>  k->get_tpr = apic_get_tpr;
> @@ -897,7 +901,7 @@ static void apic_class_init(ObjectClass *klass, void *data)
>  }
>  
>  static const TypeInfo apic_info = {
> -.name  = "apic",
> +.name  = TYPE_APIC,
>  .instance_size = sizeof(APICCommonState),
>  .parent= TYPE_APIC_COMMON,
>  .class_init= apic_class_init,
> diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
> index f3edf48..5a413cc 100644
> --- a/hw/intc/apic_common.c
> +++ b/hw/intc/apic_common.c
> @@ -284,7 +284,7 @@ static int apic_load_old(QEMUFile *f, void *opaque, int version_id)
>  return 0;
>  }
>  
> -static int apic_init_common(ICCDevice *dev)
> +static void apic_common_realize(DeviceState *dev, Error **errp)
>  {
>  APICCommonState *s = APIC_COMMON(dev);
>  APICCommonClass *info;
> @@ -293,14 +293,16 @@ static int apic_init_common(ICCDevice *dev)
>  static bool mmio_registered;
>  
>  if (apic_no

[Qemu-devel] [RFC qom-cpu v2 2/8] x86: add x86_cpu_unrealizefn() for cpu apic remove

2013-09-10 Thread Chen Fan
Implement x86_cpu_unrealizefn() as the counterpart of x86_cpu_realizefn().
Here it is mostly used to clear the APIC-related state.

Signed-off-by: Chen Fan 
---
 hw/cpu/icc_bus.c| 11 +++
 hw/i386/kvm/apic.c  |  6 ++
 hw/intc/apic.c  |  7 +++
 hw/intc/apic_common.c   | 11 +++
 include/hw/cpu/icc_bus.h|  1 +
 include/hw/i386/apic_internal.h |  1 +
 target-i386/cpu-qom.h   |  1 +
 target-i386/cpu.c   | 35 +++
 8 files changed, 73 insertions(+)

diff --git a/hw/cpu/icc_bus.c b/hw/cpu/icc_bus.c
index 8748cc5..45e87d1 100644
--- a/hw/cpu/icc_bus.c
+++ b/hw/cpu/icc_bus.c
@@ -54,11 +54,22 @@ static void icc_device_realize(DeviceState *dev, Error **errp)
 }
 }
 
+static void icc_device_unrealize(DeviceState *dev, Error **errp)
+{
+ICCDevice *id = ICC_DEVICE(dev);
+ICCDeviceClass *idc = ICC_DEVICE_GET_CLASS(id);
+
+if (idc->exit) {
+idc->exit(id);
+}
+}
+
 static void icc_device_class_init(ObjectClass *oc, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(oc);
 
 dc->realize = icc_device_realize;
+dc->unrealize = icc_device_unrealize;
 dc->bus_type = TYPE_ICC_BUS;
 }
 
diff --git a/hw/i386/kvm/apic.c b/hw/i386/kvm/apic.c
index 5609063..8f028a1 100644
--- a/hw/i386/kvm/apic.c
+++ b/hw/i386/kvm/apic.c
@@ -181,11 +181,17 @@ static void kvm_apic_init(APICCommonState *s)
 }
 }
 
+static void kvm_apic_exit(APICCommonState *s)
+{
+memory_region_destroy(&s->io_memory);
+}
+
 static void kvm_apic_class_init(ObjectClass *klass, void *data)
 {
 APICCommonClass *k = APIC_COMMON_CLASS(klass);
 
 k->init = kvm_apic_init;
+k->exit = kvm_apic_exit;
 k->set_base = kvm_apic_set_base;
 k->set_tpr = kvm_apic_set_tpr;
 k->get_tpr = kvm_apic_get_tpr;
diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index a913186..23488b4 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -882,11 +882,18 @@ static void apic_init(APICCommonState *s)
 msi_supported = true;
 }
 
+static void apic_uninit(APICCommonState *s)
+{
+memory_region_destroy(&s->io_memory);
+local_apics[s->idx] = NULL;
+}
+
 static void apic_class_init(ObjectClass *klass, void *data)
 {
 APICCommonClass *k = APIC_COMMON_CLASS(klass);
 
 k->init = apic_init;
+k->exit = apic_uninit;
 k->set_base = apic_set_base;
 k->set_tpr = apic_set_tpr;
 k->get_tpr = apic_get_tpr;
diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index 5568621..32c2f74 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -316,6 +316,16 @@ static int apic_init_common(ICCDevice *dev)
 return 0;
 }
 
+static void apic_exit_common(ICCDevice *dev)
+{
+APICCommonState *s = APIC_COMMON(dev);
+APICCommonClass *info;
+
+info = APIC_COMMON_GET_CLASS(s);
+if (info->exit)
+info->exit(s);
+}
+
 static void apic_dispatch_pre_save(void *opaque)
 {
 APICCommonState *s = APIC_COMMON(opaque);
@@ -387,6 +397,7 @@ static void apic_common_class_init(ObjectClass *klass, void *data)
 dc->no_user = 1;
 dc->props = apic_properties_common;
 idc->init = apic_init_common;
+idc->exit = apic_exit_common;
 }
 
 static const TypeInfo apic_common_type = {
diff --git a/include/hw/cpu/icc_bus.h b/include/hw/cpu/icc_bus.h
index b550070..15d5374 100644
--- a/include/hw/cpu/icc_bus.h
+++ b/include/hw/cpu/icc_bus.h
@@ -67,6 +67,7 @@ typedef struct ICCDeviceClass {
 /*< public >*/
 
 int (*init)(ICCDevice *dev); /* TODO replace with QOM realize */
+void (*exit)(ICCDevice *dev);
 } ICCDeviceClass;
 
 #define TYPE_ICC_DEVICE "icc-device"
diff --git a/include/hw/i386/apic_internal.h b/include/hw/i386/apic_internal.h
index 1b0a7fb..87d5248 100644
--- a/include/hw/i386/apic_internal.h
+++ b/include/hw/i386/apic_internal.h
@@ -81,6 +81,7 @@ typedef struct APICCommonClass
 ICCDeviceClass parent_class;
 
 void (*init)(APICCommonState *s);
+void (*exit)(APICCommonState *s);
 void (*set_base)(APICCommonState *s, uint64_t val);
 void (*set_tpr)(APICCommonState *s, uint8_t val);
 uint8_t (*get_tpr)(APICCommonState *s);
diff --git a/target-i386/cpu-qom.h b/target-i386/cpu-qom.h
index c4447c2..1e520be 100644
--- a/target-i386/cpu-qom.h
+++ b/target-i386/cpu-qom.h
@@ -50,6 +50,7 @@ typedef struct X86CPUClass {
 /*< public >*/
 
 DeviceRealize parent_realize;
+DeviceUnrealize parent_unrealize;
 void (*parent_reset)(CPUState *cpu);
 } X86CPUClass;
 
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 2b99683..6f9154d 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -2339,10 +2339,31 @@ static void x86_cpu_apic_realize(X86CPU *cpu, Error **errp)
 return;
 }
 }
+
+static void x86_cpu_apic_unrealize(X86CPU *cpu, Error **errp)
+{
+CPUX86State

[Qemu-devel] [RFC qom-cpu v2 1/8] apic: remove apic_no from apic_init_common()

2013-09-10 Thread Chen Fan
'apic_no' is incremented each time a vCPU is created, so the s->idx of
each new APICCommonState always increases.
However, to re-add a vCPU after one has been removed, we need to reuse the
vacant s->idx of the removed vCPU.
So set s->idx from the unique CPU apic_id instead of the running counter.
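
To make the difference concrete, here is a minimal sketch of slot
assignment keyed by a stable id, so that a slot freed by removal can be
claimed again by the same id. The names, the limit of 8 and the
"already in use" check are assumptions for illustration only; the patch
itself just assigns apic->idx from env->cpuid_apic_id.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_APICS 8          /* assumed, small for the example */

/* Hypothetical occupancy table, one entry per possible slot. */
bool sketch_slot_used[SKETCH_MAX_APICS];

/* Claim the slot that matches the stable id; -1 if out of range or taken. */
int sketch_apic_plug(unsigned apic_id)
{
    if (apic_id >= SKETCH_MAX_APICS || sketch_slot_used[apic_id]) {
        return -1;
    }
    sketch_slot_used[apic_id] = true;
    return (int)apic_id;
}

/* Release a slot so that the same id can be plugged again later. */
void sketch_apic_unplug(int idx)
{
    assert(idx >= 0 && idx < SKETCH_MAX_APICS);
    sketch_slot_used[idx] = false;
}
```

With a running counter the second plug of id 1 would land in a new slot
and eventually hit the limit; with the id as index it lands in the same
slot as before.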

Signed-off-by: Chen Fan 
---
 hw/intc/apic_common.c | 4 +---
 target-i386/cpu.c | 1 +
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index a0beb10..5568621 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -289,13 +289,11 @@ static int apic_init_common(ICCDevice *dev)
 APICCommonState *s = APIC_COMMON(dev);
 APICCommonClass *info;
 static DeviceState *vapic;
-static int apic_no;
 static bool mmio_registered;
 
-if (apic_no >= MAX_APICS) {
+if (s->idx >= MAX_APICS) {
 return -1;
 }
-s->idx = apic_no++;
 
 info = APIC_COMMON_GET_CLASS(s);
 info->init(s);
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 42c5de0..2b99683 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -2322,6 +2322,7 @@ static void x86_cpu_apic_create(X86CPU *cpu, Error **errp)
 /* TODO: convert to link<> */
 apic = APIC_COMMON(env->apic_state);
 apic->cpu = cpu;
+apic->idx = env->cpuid_apic_id;
 }
 
 static void x86_cpu_apic_realize(X86CPU *cpu, Error **errp)
-- 
1.8.1.4




[Qemu-devel] [RFC qom-cpu v2 7/8] piix4: implement function cpu_status_write() for vcpu ejection

2013-09-10 Thread Chen Fan
When the guest OS ejects a vCPU (e.g. echo 1 > /sys/bus/acpi/devices/LNXCPUXX/eject),
it calls the ACPI _EJ0 method and the firmware writes the new CPU map, from
which QEMU learns which vCPU needs to be ejected.
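
The detection step boils down to XOR-ing the previous status byte with the
newly written one and turning each changed bit into a CPU id
(8 * addr + bit). The sketch below shows that arithmetic in isolation; it
is not the code from this patch, the function name and the output array
are hypothetical, and unlike the patch it reports every changed bit
(including CPU 0) rather than only the last one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compare the old and new status byte at byte offset addr and store the id
 * of every CPU whose bit changed. Returns the number of ids written, at
 * most cap.
 */
size_t sketch_changed_cpus(uint8_t old_sts, uint8_t new_sts, unsigned addr,
                           int64_t *out, size_t cap)
{
    uint8_t diff = old_sts ^ new_sts;
    size_t n = 0;

    assert(out != NULL);
    for (unsigned i = 0; i < 8; i++) {
        if ((diff & (1u << i)) && n < cap) {
            out[n++] = (int64_t)8 * addr + i;
        }
    }
    return n;
}
```

For example, old 0x0f and new 0x0b at offset 1 differ in bit 2, which
corresponds to CPU id 8 * 1 + 2 = 10.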

Signed-off-by: Chen Fan 
---
 hw/acpi/piix4.c | 37 -
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 2ddc9a8..0e9b5bd 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -61,6 +61,7 @@ struct pci_status {
 
 typedef struct CPUStatus {
 uint8_t sts[PIIX4_PROC_LEN];
+uint8_t old_sts[PIIX4_PROC_LEN];
 } CPUStatus;
 
 typedef struct PIIX4PMState {
@@ -610,6 +611,12 @@ static const MemoryRegionOps piix4_pci_ops = {
 },
 };
 
+static void acpi_piix_eject_vcpu(int64_t cpuid)
+{
+/* TODO: eject a vcpu, release allocated vcpu and exit the vcpu pthread. */
+PIIX4_DPRINTF("vcpu: %" PRIu64 " need to be ejected.\n", cpuid);
+}
+
 static uint64_t cpu_status_read(void *opaque, hwaddr addr, unsigned int size)
 {
 PIIX4PMState *s = opaque;
@@ -622,7 +629,27 @@ static uint64_t cpu_status_read(void *opaque, hwaddr addr, unsigned int size)
 static void cpu_status_write(void *opaque, hwaddr addr, uint64_t data,
  unsigned int size)
 {
-/* TODO: implement VCPU removal on guest signal that CPU can be removed */
+PIIX4PMState *s = opaque;
+CPUStatus *cpus = &s->gpe_cpu;
+uint8_t val;
+int i;
+int64_t cpuid = 0;
+
+val = cpus->old_sts[addr] ^ data;
+
+if (val == 0) {
+return;
+}
+
+for (i = 0; i < 8; i++) {
+if (val & 1 << i) {
+cpuid = 8 * addr + i;
+}
+}
+
+if (cpuid != 0) {
+acpi_piix_eject_vcpu(cpuid);
+}
 }
 
 static const MemoryRegionOps cpu_hotplug_ops = {
@@ -642,13 +669,20 @@ static void piix4_cpu_hotplug_req(PIIX4PMState *s, CPUState *cpu,
 ACPIGPE *gpe = &s->ar.gpe;
 CPUClass *k = CPU_GET_CLASS(cpu);
 int64_t cpu_id;
+int i;
 
 assert(s != NULL);
 
 *gpe->sts = *gpe->sts | PIIX4_CPU_HOTPLUG_STATUS;
 cpu_id = k->get_arch_id(CPU(cpu));
+
+for (i = 0; i < PIIX4_PROC_LEN; i++) {
+g->old_sts[i] = g->sts[i];
+}
+
 if (action == PLUG) {
 g->sts[cpu_id / 8] |= (1 << (cpu_id % 8));
+g->old_sts[cpu_id / 8] |= (1 << (cpu_id % 8));
 } else {
 g->sts[cpu_id / 8] &= ~(1 << (cpu_id % 8));
 }
@@ -687,6 +721,7 @@ static void piix4_acpi_system_hot_add_init(MemoryRegion *parent,
 
 g_assert((id / 8) < PIIX4_PROC_LEN);
 s->gpe_cpu.sts[id / 8] |= (1 << (id % 8));
+s->gpe_cpu.old_sts[id / 8] |= (1 << (id % 8));
 }
 memory_region_init_io(&s->io_cpu, OBJECT(s), &cpu_hotplug_ops, s,
   "acpi-cpu-hotplug", PIIX4_PROC_LEN);
-- 
1.8.1.4




[Qemu-devel] [RFC qom-cpu v2 4/8] qom cpu: rename variable 'cpu_added_notifier' to 'cpu_hotplug_notifier'

2013-09-10 Thread Chen Fan
Rename the variable 'cpu_added_notifier' to 'cpu_hotplug_notifier' in
preparation for adding vCPU-remove notifier support.

Signed-off-by: Chen Fan 
---
 hw/acpi/piix4.c | 10 +-
 hw/i386/pc.c|  2 +-
 include/sysemu/sysemu.h |  2 +-
 qom/cpu.c   | 10 +-
 4 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 0b8d1d9..c8f4182 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -95,7 +95,7 @@ typedef struct PIIX4PMState {
 uint8_t s4_val;
 
 CPUStatus gpe_cpu;
-Notifier cpu_added_notifier;
+Notifier cpu_hotplug_notifier;
 } PIIX4PMState;
 
 #define TYPE_PIIX4_PM "PIIX4_PM"
@@ -660,9 +660,9 @@ static void piix4_cpu_hotplug_req(PIIX4PMState *s, CPUState *cpu,
 pm_update_sci(s);
 }
 
-static void piix4_cpu_added_req(Notifier *n, void *opaque)
+static void piix4_cpu_hotplug(Notifier *n, void *opaque)
 {
-PIIX4PMState *s = container_of(n, PIIX4PMState, cpu_added_notifier);
+PIIX4PMState *s = container_of(n, PIIX4PMState, cpu_hotplug_notifier);
 
 piix4_cpu_hotplug_req(s, CPU(opaque), PLUG);
 }
@@ -695,8 +695,8 @@ static void piix4_acpi_system_hot_add_init(MemoryRegion *parent,
 memory_region_init_io(&s->io_cpu, OBJECT(s), &cpu_hotplug_ops, s,
   "acpi-cpu-hotplug", PIIX4_PROC_LEN);
 memory_region_add_subregion(parent, PIIX4_PROC_BASE, &s->io_cpu);
-s->cpu_added_notifier.notify = piix4_cpu_added_req;
-qemu_register_cpu_added_notifier(&s->cpu_added_notifier);
+s->cpu_hotplug_notifier.notify = piix4_cpu_hotplug;
+qemu_register_cpu_hotplug_notifier(&s->cpu_hotplug_notifier);
 }
 
 static void enable_device(PIIX4PMState *s, int slot)
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 3de9c51..f36903f 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -407,7 +407,7 @@ void pc_cmos_init(ram_addr_t ram_size, ram_addr_t above_4g_mem_size,
 /* init CPU hotplug notifier */
 cpu_hotplug_cb.rtc_state = s;
 cpu_hotplug_cb.cpu_added_notifier.notify = rtc_notify_cpu_added;
-qemu_register_cpu_added_notifier(&cpu_hotplug_cb.cpu_added_notifier);
+qemu_register_cpu_hotplug_notifier(&cpu_hotplug_cb.cpu_added_notifier);
 
 if (set_boot_dev(s, boot_device)) {
 exit(1);
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index b1aa059..e1c1120 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -153,7 +153,7 @@ void do_pci_device_hot_remove(Monitor *mon, const QDict *qdict);
 void drive_hot_add(Monitor *mon, const QDict *qdict);
 
 /* CPU hotplug */
-void qemu_register_cpu_added_notifier(Notifier *notifier);
+void qemu_register_cpu_hotplug_notifier(Notifier *notifier);
 
 /* pcie aer error injection */
 void pcie_aer_inject_error_print(Monitor *mon, const QObject *data);
diff --git a/qom/cpu.c b/qom/cpu.c
index fa7ec6b..7992fe1 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -67,12 +67,12 @@ static void cpu_common_get_memory_mapping(CPUState *cpu,
 }
 
 /* CPU hot-plug notifiers */
-static NotifierList cpu_added_notifiers =
-NOTIFIER_LIST_INITIALIZER(cpu_add_notifiers);
+static NotifierList cpu_hotplug_notifiers =
+NOTIFIER_LIST_INITIALIZER(cpu_hotplug_notifiers);
 
-void qemu_register_cpu_added_notifier(Notifier *notifier)
+void qemu_register_cpu_hotplug_notifier(Notifier *notifier)
 {
-notifier_list_add(&cpu_added_notifiers, notifier);
+notifier_list_add(&cpu_hotplug_notifiers, notifier);
 }
 
 void cpu_reset_interrupt(CPUState *cpu, int mask)
@@ -218,7 +218,7 @@ static void cpu_common_realizefn(DeviceState *dev, Error **errp)
 
 if (dev->hotplugged) {
 cpu_synchronize_post_init(cpu);
-notifier_list_notify(&cpu_added_notifiers, dev);
+notifier_list_notify(&cpu_hotplug_notifiers, dev);
 cpu_resume(cpu);
 }
 }
-- 
1.8.1.4




[Qemu-devel] [RFC qom-cpu v2 5/8] qom cpu: add UNPLUG cpu notifier support

2013-09-10 Thread Chen Fan
Move the HotplugEventType enum from hw/acpi/piix4.c to include/qom/cpu.h,
and add struct CPUNotifier to support the UNPLUG CPU notifier.
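
The point of bundling the device with an event type is that a single
callback can handle both directions. A minimal stand-alone sketch of such
a listener follows; the sketch_* types and the 32-CPU bitmap are
assumptions for illustration and do not mirror the QEMU Notifier API.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    SKETCH_PLUG,
    SKETCH_UNPLUG,
} sketch_event_type;

/* Hypothetical payload: which CPU, and what happened to it. */
struct sketch_cpu_event {
    int cpu_id;
    sketch_event_type type;
};

/* Hypothetical listener state: presence bitmap for up to 32 CPUs. */
struct sketch_listener {
    unsigned char present[4];
};

/* One callback for both plug and unplug, dispatching on the event type. */
void sketch_on_hotplug(void *opaque, const struct sketch_cpu_event *ev)
{
    struct sketch_listener *l = opaque;
    unsigned byte, bit;

    assert(l != NULL && ev != NULL);
    assert(ev->cpu_id >= 0 && ev->cpu_id < 32);
    byte = (unsigned)ev->cpu_id / 8;
    bit = (unsigned)ev->cpu_id % 8;

    if (ev->type == SKETCH_PLUG) {
        l->present[byte] |= (unsigned char)(1u << bit);
    } else {
        l->present[byte] &= (unsigned char)~(1u << bit);
    }
}
```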

Signed-off-by: Chen Fan 
---
 hw/acpi/piix4.c   |  8 ++--
 include/qom/cpu.h | 10 ++
 qom/cpu.c |  6 +-
 3 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index c8f4182..2ddc9a8 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -635,11 +635,6 @@ static const MemoryRegionOps cpu_hotplug_ops = {
 },
 };
 
-typedef enum {
-PLUG,
-UNPLUG,
-} HotplugEventType;
-
 static void piix4_cpu_hotplug_req(PIIX4PMState *s, CPUState *cpu,
   HotplugEventType action)
 {
@@ -663,8 +658,9 @@ static void piix4_cpu_hotplug_req(PIIX4PMState *s, CPUState *cpu,
 static void piix4_cpu_hotplug(Notifier *n, void *opaque)
 {
 PIIX4PMState *s = container_of(n, PIIX4PMState, cpu_hotplug_notifier);
+CPUNotifier *notifier = opaque;
 
-piix4_cpu_hotplug_req(s, CPU(opaque), PLUG);
+piix4_cpu_hotplug_req(s, CPU(notifier->dev), notifier->type);
 }
 
 static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev,
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 7739e00..0238532 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -507,6 +507,16 @@ void qemu_init_vcpu(CPUState *cpu);
  */
 void cpu_single_step(CPUState *cpu, int enabled);
 
+typedef enum {
+PLUG,
+UNPLUG,
+} HotplugEventType;
+
+typedef struct CPUNotifier {
+DeviceState *dev;
+HotplugEventType type;
+} CPUNotifier;
+
 #ifdef CONFIG_SOFTMMU
 extern const struct VMStateDescription vmstate_cpu_common;
 #else
diff --git a/qom/cpu.c b/qom/cpu.c
index 7992fe1..c6d7ebc 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -215,10 +215,14 @@ static ObjectClass *cpu_common_class_by_name(const char *cpu_model)
 static void cpu_common_realizefn(DeviceState *dev, Error **errp)
 {
 CPUState *cpu = CPU(dev);
+CPUNotifier notifier;
+
+notifier.dev = dev;
+notifier.type = PLUG;
 
 if (dev->hotplugged) {
 cpu_synchronize_post_init(cpu);
-notifier_list_notify(&cpu_hotplug_notifiers, dev);
+notifier_list_notify(&cpu_hotplug_notifiers, &notifier);
 cpu_resume(cpu);
 }
 }
-- 
1.8.1.4




[Qemu-devel] [RFC qom-cpu v2 3/8] qmp: add 'cpu-del' command support

2013-09-10 Thread Chen Fan
Signed-off-by: Chen Fan 
---
 hw/i386/pc.c |  5 +
 hw/i386/pc_piix.c|  1 +
 include/hw/boards.h  |  2 ++
 include/hw/i386/pc.h |  1 +
 qapi-schema.json | 12 
 qmp-commands.hx  | 23 +++
 qmp.c|  9 +
 7 files changed, 53 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 0c313fe..3de9c51 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -957,6 +957,11 @@ void pc_hot_add_cpu(const int64_t id, Error **errp)
 pc_new_cpu(current_cpu_model, apic_id, icc_bridge, errp);
 }
 
+void pc_hot_del_cpu(const int64_t id, Error **errp)
+{
+/* TODO: hot remove vCPU. */
+}
+
 void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
 {
 int i;
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 6e1e654..d779b75 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -347,6 +347,7 @@ static QEMUMachine pc_i440fx_machine_v1_6 = {
 .desc = "Standard PC (i440FX + PIIX, 1996)",
 .init = pc_init_pci_1_6,
 .hot_add_cpu = pc_hot_add_cpu,
+.hot_del_cpu = pc_hot_del_cpu,
 .max_cpus = 255,
 .is_default = 1,
 DEFAULT_MACHINE_OPTIONS,
diff --git a/include/hw/boards.h b/include/hw/boards.h
index fb7c6f1..fea3737 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -23,6 +23,7 @@ typedef void QEMUMachineInitFunc(QEMUMachineInitArgs *args);
 typedef void QEMUMachineResetFunc(void);
 
 typedef void QEMUMachineHotAddCPUFunc(const int64_t id, Error **errp);
+typedef void QEMUMachineHotDelCPUFunc(const int64_t id, Error **errp);
 
 typedef struct QEMUMachine {
 const char *name;
@@ -31,6 +32,7 @@ typedef struct QEMUMachine {
 QEMUMachineInitFunc *init;
 QEMUMachineResetFunc *reset;
 QEMUMachineHotAddCPUFunc *hot_add_cpu;
+QEMUMachineHotDelCPUFunc *hot_del_cpu;
 BlockInterfaceType block_default_type;
 int max_cpus;
 unsigned int no_serial:1,
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index f79d478..b7e66f4 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -96,6 +96,7 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level);
 
 void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge);
 void pc_hot_add_cpu(const int64_t id, Error **errp);
+void pc_hot_del_cpu(const int64_t id, Error **errp);
 void pc_acpi_init(const char *default_dsdt);
 
 PcGuestInfo *pc_guest_info_init(ram_addr_t below_4g_mem_size,
diff --git a/qapi-schema.json b/qapi-schema.json
index a51f7d2..6052aa9 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1432,6 +1432,18 @@
 ##
 { 'command': 'cpu-add', 'data': {'id': 'int'} }
 
+# @cpu-del
+
+# Deletes CPU with specified ID
+#
+# @id: ID of CPU to be deleted, valid values [0..max_cpus)
+#
+# Returns: Nothing on success
+#
+# Since 1.7
+##
+{ 'command': 'cpu-del', 'data': {'id': 'int'} }
+
 ##
 # @memsave:
 #
diff --git a/qmp-commands.hx b/qmp-commands.hx
index cf47e3f..16b54fd 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -411,6 +411,29 @@ Example:
 EQMP
 
 {
+.name   = "cpu-del",
+.args_type  = "id:i",
+.mhandler.cmd_new = qmp_marshal_input_cpu_del,
+},
+
+SQMP
+cpu-del
+---
+
+Deletes virtual cpu
+
+Arguments:
+
+- "id": cpu id (json-int)
+
+Example:
+
+-> { "execute": "cpu-del", "arguments": { "id": 2 } }
+<- { "return": {} }
+
+EQMP
+
+{
 .name   = "memsave",
 .args_type  = "val:l,size:i,filename:s,cpu:i?",
 .mhandler.cmd_new = qmp_marshal_input_memsave,
diff --git a/qmp.c b/qmp.c
index 4c149b3..84dc873 100644
--- a/qmp.c
+++ b/qmp.c
@@ -118,6 +118,15 @@ void qmp_cpu_add(int64_t id, Error **errp)
 }
 }
 
+void qmp_cpu_del(int64_t id, Error **errp)
+{
+if (current_machine->hot_del_cpu) {
+current_machine->hot_del_cpu(id, errp);
+} else {
+error_setg(errp, "Not supported");
+}
+}
+
 #ifndef CONFIG_VNC
 /* If VNC support is enabled, the "true" query-vnc command is
defined in the VNC subsystem */
-- 
1.8.1.4




[Qemu-devel] [RFC qom-cpu v2 6/8] i386: implement pc interface pc_hot_del_cpu()

2013-09-10 Thread Chen Fan
Implement the PC interface pc_hot_del_cpu() to unrealize a vCPU device.
It emits a vCPU-remove notification to ACPI, which can then send an SCI
interrupt to the guest OS to hot-remove the vCPU.
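
The lookup and the "do not remove the last CPU" guard can be sketched on a
plain singly linked list as below. This is a simplified model: the
sketch_* names and error codes are hypothetical, and the sketch unlinks
the entry right away, whereas the patch only triggers the unrealize
notification at this point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_cpu {
    int64_t arch_id;
    struct sketch_cpu *next;
};

enum {
    SKETCH_OK = 0,
    SKETCH_ENOENT = -1,   /* no CPU with that id */
    SKETCH_ELAST = -2,    /* refusing to remove the only CPU */
};

/* Find the CPU with the given id and unlink it, unless it is the last one. */
int sketch_hot_del(struct sketch_cpu **head, int64_t id,
                   struct sketch_cpu **removed)
{
    struct sketch_cpu **link;

    assert(head != NULL && removed != NULL);

    link = head;
    while (*link && (*link)->arch_id != id) {
        link = &(*link)->next;
    }
    if (!*link) {
        return SKETCH_ENOENT;
    }
    if (!(*head)->next) {
        return SKETCH_ELAST;
    }

    *removed = *link;
    *link = (*link)->next;
    (*removed)->next = NULL;
    return SKETCH_OK;
}
```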

Signed-off-by: Chen Fan 
---
 hw/i386/pc.c | 29 -
 qom/cpu.c| 11 +++
 2 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index f36903f..6f88e41 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -959,7 +959,34 @@ void pc_hot_add_cpu(const int64_t id, Error **errp)
 
 void pc_hot_del_cpu(const int64_t id, Error **errp)
 {
-/* TODO: hot remove vCPU. */
+CPUState *cpu;
+bool found = false;
+X86CPUClass *xcc;
+
+CPU_FOREACH(cpu) {
+CPUClass *cc = CPU_GET_CLASS(cpu);
+int64_t cpuid = cc->get_arch_id(cpu);
+
+if (cpuid == id) {
+found = true;
+break;
+}
+}
+
+if (!found) {
+error_setg(errp, "Unable to find cpu-index: %" PRIi64
+   ", it doesn't exist or has been deleted.", id);
+return;
+}
+
+if (cpu == first_cpu && !CPU_NEXT(cpu)) {
+error_setg(errp, "Unable to delete the last"
+   " cpu when VM running.");
+return;
+}
+
+xcc = X86_CPU_GET_CLASS(DEVICE(cpu));
+xcc->parent_unrealize(DEVICE(cpu), errp);
 }
 
 void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
diff --git a/qom/cpu.c b/qom/cpu.c
index c6d7ebc..9cd7fcd 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -227,6 +227,16 @@ static void cpu_common_realizefn(DeviceState *dev, Error **errp)
 }
 }
 
+static void cpu_common_unrealizefn(DeviceState *dev, Error **errp)
+{
+CPUNotifier notifier;
+
+notifier.dev = dev;
+notifier.type = UNPLUG;
+
+notifier_list_notify(&cpu_hotplug_notifiers, &notifier);
+}
+
 static void cpu_common_initfn(Object *obj)
 {
 CPUState *cpu = CPU(obj);
@@ -257,6 +267,7 @@ static void cpu_class_init(ObjectClass *klass, void *data)
 k->gdb_read_register = cpu_common_gdb_read_register;
 k->gdb_write_register = cpu_common_gdb_write_register;
 dc->realize = cpu_common_realizefn;
+dc->unrealize = cpu_common_unrealizefn;
 dc->no_user = 1;
 }
 
-- 
1.8.1.4




[Qemu-devel] [RFC qom-cpu v2 0/8] i386: add cpu hot remove support

2013-09-10 Thread Chen Fan
By implementing the standard ACPI _EJ0 method in the BIOS, the guest OS can
signal QEMU after it hot-removes a vCPU, and QEMU can then notify the
corresponding vCPU to exit.

This work is based on Andreas Färber's qom-cpu tree:
git://github.com/afaerber/qemu-cpu.git

This patch series must be used together with the SeaBIOS and KVM patches.

KVM patches:
  http://comments.gmane.org/gmane.comp.emulators.kvm.devel/114347

SeaBIOS patches:
  http://comments.gmane.org/gmane.comp.emulators.qemu/230460

Chen Fan (8):
  apic: remove apic_no from apic_init_common()
  x86: add x86_cpu_unrealizefn() for cpu apic remove
  qmp: add 'cpu-del' command support
  qom cpu: rename variable 'cpu_added_notifier' to 'cpu_hotplug_notifier'
  qom cpu: add UNPLUG cpu notifier support
  i386: implement pc interface pc_hot_del_cpu()
  piix4: implement function cpu_status_write() for vcpu ejection
  cpus: release allocated vCPU objects

 cpus.c  | 46 
 hw/acpi/piix4.c | 66 +
 hw/cpu/icc_bus.c| 11 +++
 hw/i386/kvm/apic.c  |  6 
 hw/i386/pc.c| 34 -
 hw/i386/pc_piix.c   |  1 +
 hw/intc/apic.c  |  7 +
 hw/intc/apic_common.c   | 15 --
 include/hw/boards.h |  2 ++
 include/hw/cpu/icc_bus.h|  1 +
 include/hw/i386/apic_internal.h |  1 +
 include/hw/i386/pc.h|  1 +
 include/qom/cpu.h   | 20 +
 include/sysemu/kvm.h|  1 +
 include/sysemu/sysemu.h |  2 +-
 kvm-all.c   | 25 
 qapi-schema.json| 12 
 qmp-commands.hx | 23 ++
 qmp.c   |  9 ++
 qom/cpu.c   | 25 
 target-i386/cpu-qom.h   |  1 +
 target-i386/cpu.c   | 36 ++
 22 files changed, 323 insertions(+), 22 deletions(-)

-- 
1.8.1.4




[Qemu-devel] [RFC qom-cpu v2 8/8] cpus: release allocated vCPU objects

2013-09-10 Thread Chen Fan
After ACPI gets a signal to eject a vCPU, it notifies the vCPU thread to
exit (when using KVM), and the vCPU must be removed from the CPU list.
Before the vCPU is actually removed, all related vCPU objects and the
APIC device are released.

Signed-off-by: Chen Fan 
---
 cpus.c   | 46 ++
 hw/acpi/piix4.c  | 23 +--
 include/qom/cpu.h| 10 ++
 include/sysemu/kvm.h |  1 +
 kvm-all.c| 25 +
 5 files changed, 99 insertions(+), 6 deletions(-)

diff --git a/cpus.c b/cpus.c
index 980697e..10dded3 100644
--- a/cpus.c
+++ b/cpus.c
@@ -714,6 +714,26 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data)
 qemu_cpu_kick(cpu);
 }
 
+static void qemu_kvm_destroy_vcpu(CPUState *cpu)
+{
+CPU_REMOVE(cpu);
+
+if (kvm_destroy_vcpu(cpu) < 0) {
+fprintf(stderr, "kvm_destroy_vcpu failed.\n");
+exit(1);
+}
+
+object_property_set_bool(OBJECT(cpu), false, "realized", NULL);
+qdev_free(DEVICE(cpu));
+}
+
+static void qemu_tcg_destroy_vcpu(CPUState *cpu)
+{
+CPU_REMOVE(cpu);
+object_property_set_bool(OBJECT(cpu), false, "realized", NULL);
+qdev_free(DEVICE(cpu));
+}
+
 static void flush_queued_work(CPUState *cpu)
 {
 struct qemu_work_item *wi;
@@ -805,6 +825,11 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
 }
 }
 qemu_kvm_wait_io_event(cpu);
+if (cpu->exit && !cpu_can_run(cpu)) {
+qemu_kvm_destroy_vcpu(cpu);
+qemu_mutex_unlock(&qemu_global_mutex);
+return NULL;
+}
 }
 
 return NULL;
@@ -857,6 +882,7 @@ static void tcg_exec_all(void);
 static void *qemu_tcg_cpu_thread_fn(void *arg)
 {
 CPUState *cpu = arg;
+CPUState *remove_cpu = NULL;
 
 qemu_tcg_init_cpu_signals();
 qemu_thread_get_self(cpu->thread);
@@ -889,6 +915,16 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
 }
 }
 qemu_tcg_wait_io_event();
+CPU_FOREACH(cpu) {
+if (cpu->exit && !cpu_can_run(cpu)) {
+remove_cpu = cpu;
+break;
+}
+}
+if (remove_cpu) {
+qemu_tcg_destroy_vcpu(remove_cpu);
+remove_cpu = NULL;
+}
 }
 
 return NULL;
@@ -1045,6 +1081,13 @@ void resume_all_vcpus(void)
 }
 }
 
+void cpu_remove(CPUState *cpu)
+{
+cpu->stop = true;
+cpu->exit = true;
+qemu_cpu_kick(cpu);
+}
+
 static void qemu_tcg_init_vcpu(CPUState *cpu)
 {
 /* share a single thread for all cpus with TCG */
@@ -1219,6 +1262,9 @@ static void tcg_exec_all(void)
 break;
 }
 } else if (cpu->stop || cpu->stopped) {
+if (cpu->exit) {
+next_cpu = CPU_NEXT(cpu);
+}
 break;
 }
 }
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 0e9b5bd..c2cf519 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -611,10 +611,21 @@ static const MemoryRegionOps piix4_pci_ops = {
 },
 };
 
-static void acpi_piix_eject_vcpu(int64_t cpuid)
+static void acpi_piix_eject_vcpu(PIIX4PMState *s, int64_t cpuid)
 {
-/* TODO: eject a vcpu, release allocated vcpu and exit the vcpu pthread.  */
-PIIX4_DPRINTF("vcpu: %" PRIu64 " need to be ejected.\n", cpuid);
+CPUStatus *g = &s->gpe_cpu;
+CPUState *cpu;
+
+CPU_FOREACH(cpu) {
+CPUClass *cc = CPU_GET_CLASS(cpu);
+int64_t id = cc->get_arch_id(cpu);
+
+if (cpuid == id) {
+g->old_sts[cpuid / 8] &= ~(1 << (cpuid % 8));
+cpu_remove(cpu);
+break;
+}
+}
 }
 
 static uint64_t cpu_status_read(void *opaque, hwaddr addr, unsigned int size)
@@ -633,7 +644,7 @@ static void cpu_status_write(void *opaque, hwaddr addr, uint64_t data,
 CPUStatus *cpus = &s->gpe_cpu;
 uint8_t val;
 int i;
-int64_t cpuid = 0;
+int64_t cpuid = -1;
 
 val = cpus->old_sts[addr] ^ data;
 
@@ -647,8 +658,8 @@ static void cpu_status_write(void *opaque, hwaddr addr, uint64_t data,
 }
 }
 
-if (cpuid != 0) {
-acpi_piix_eject_vcpu(cpuid);
+if (cpuid != -1) {
+acpi_piix_eject_vcpu(s, cpuid);
 }
 }
 
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 0238532..eb8d32b 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -181,6 +181,7 @@ struct CPUState {
 bool created;
 bool stop;
 bool stopped;
+bool exit;
 volatile sig_atomic_t exit_request;
 volatile sig_atomic_t tcg_exit_req;
 uint32_t interrupt_request;
@@ -206,6 +207,7 @@ struct CPUState {
 QTAILQ_HEAD(CPUTailQ, CPUState);
 extern struct CPUTailQ cpus;
 #define CPU_NEXT(cpu) QTAILQ_NEXT(cpu, node)
+#define CPU_REMOVE(cpu) QTAILQ_REMOVE(&cpus, cpu, node)
 #define CPU_FOREAC

[Qemu-devel] [RFC qom-cpu v3 02/10] apic: remove redundant variable 'apic_no' from apic_init_common()

2013-09-15 Thread Chen Fan
struct APICCommonState already has an id field, which is set earlier by
qdev_prop_set_uint8(env->apic_state, "id", env->cpuid_apic_id);
so use the id field instead of the variable 'apic_no' as the unique APIC
index.

Signed-off-by: Chen Fan 
---
 hw/intc/apic_common.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index a0beb10..82fbb7f 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -289,13 +289,9 @@ static int apic_init_common(ICCDevice *dev)
 APICCommonState *s = APIC_COMMON(dev);
 APICCommonClass *info;
 static DeviceState *vapic;
-static int apic_no;
 static bool mmio_registered;
 
-if (apic_no >= MAX_APICS) {
-return -1;
-}
-s->idx = apic_no++;
+s->idx = s->id;
 
 info = APIC_COMMON_GET_CLASS(s);
 info->init(s);
-- 
1.8.1.4




[Qemu-devel] [RFC qom-cpu v3 05/10] qmp: add 'cpu-del' command support

2013-09-15 Thread Chen Fan
Signed-off-by: Chen Fan 
---
 hw/i386/pc.c |  6 ++
 hw/i386/pc_piix.c|  1 +
 include/hw/boards.h  |  2 ++
 include/hw/i386/pc.h |  1 +
 qapi-schema.json | 12 
 qmp-commands.hx  | 23 +++
 qmp.c|  9 +
 7 files changed, 54 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 832c9b2..40d611e 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -956,6 +956,12 @@ void pc_hot_add_cpu(const int64_t id, Error **errp)
 pc_new_cpu(current_cpu_model, apic_id, icc_bridge, errp);
 }
 
+void pc_hot_del_cpu(const int64_t id, Error **errp)
+{
+/* TODO: hot remove vCPU. */
+error_setg(errp, "Hot-remove CPU is not supported.");
+}
+
 void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
 {
 int i;
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 6e1e654..d779b75 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -347,6 +347,7 @@ static QEMUMachine pc_i440fx_machine_v1_6 = {
 .desc = "Standard PC (i440FX + PIIX, 1996)",
 .init = pc_init_pci_1_6,
 .hot_add_cpu = pc_hot_add_cpu,
+.hot_del_cpu = pc_hot_del_cpu,
 .max_cpus = 255,
 .is_default = 1,
 DEFAULT_MACHINE_OPTIONS,
diff --git a/include/hw/boards.h b/include/hw/boards.h
index fb7c6f1..fea3737 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -23,6 +23,7 @@ typedef void QEMUMachineInitFunc(QEMUMachineInitArgs *args);
 typedef void QEMUMachineResetFunc(void);
 
 typedef void QEMUMachineHotAddCPUFunc(const int64_t id, Error **errp);
+typedef void QEMUMachineHotDelCPUFunc(const int64_t id, Error **errp);
 
 typedef struct QEMUMachine {
 const char *name;
@@ -31,6 +32,7 @@ typedef struct QEMUMachine {
 QEMUMachineInitFunc *init;
 QEMUMachineResetFunc *reset;
 QEMUMachineHotAddCPUFunc *hot_add_cpu;
+QEMUMachineHotDelCPUFunc *hot_del_cpu;
 BlockInterfaceType block_default_type;
 int max_cpus;
 unsigned int no_serial:1,
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index f79d478..b7e66f4 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -96,6 +96,7 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level);
 
 void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge);
 void pc_hot_add_cpu(const int64_t id, Error **errp);
+void pc_hot_del_cpu(const int64_t id, Error **errp);
 void pc_acpi_init(const char *default_dsdt);
 
 PcGuestInfo *pc_guest_info_init(ram_addr_t below_4g_mem_size,
diff --git a/qapi-schema.json b/qapi-schema.json
index a51f7d2..6052aa9 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1432,6 +1432,18 @@
 ##
 { 'command': 'cpu-add', 'data': {'id': 'int'} }
 
+# @cpu-del
+#
+# Deletes CPU with specified ID
+#
+# @id: ID of CPU to be deleted, valid values [0..max_cpus)
+#
+# Returns: Nothing on success
+#
+# Since 1.7
+##
+{ 'command': 'cpu-del', 'data': {'id': 'int'} }
+
 ##
 # @memsave:
 #
diff --git a/qmp-commands.hx b/qmp-commands.hx
index cf47e3f..16b54fd 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -411,6 +411,29 @@ Example:
 EQMP
 
 {
+.name   = "cpu-del",
+.args_type  = "id:i",
+.mhandler.cmd_new = qmp_marshal_input_cpu_del,
+},
+
+SQMP
+cpu-del
+---
+
+Deletes virtual cpu
+
+Arguments:
+
+- "id": cpu id (json-int)
+
+Example:
+
+-> { "execute": "cpu-del", "arguments": { "id": 2 } }
+<- { "return": {} }
+
+EQMP
+
+{
 .name   = "memsave",
 .args_type  = "val:l,size:i,filename:s,cpu:i?",
 .mhandler.cmd_new = qmp_marshal_input_memsave,
diff --git a/qmp.c b/qmp.c
index 4c149b3..84dc873 100644
--- a/qmp.c
+++ b/qmp.c
@@ -118,6 +118,15 @@ void qmp_cpu_add(int64_t id, Error **errp)
 }
 }
 
+void qmp_cpu_del(int64_t id, Error **errp)
+{
+if (current_machine->hot_del_cpu) {
+current_machine->hot_del_cpu(id, errp);
+} else {
+error_setg(errp, "Not supported");
+}
+}
+
 #ifndef CONFIG_VNC
 /* If VNC support is enabled, the "true" query-vnc command is
defined in the VNC subsystem */
-- 
1.8.1.4




[Qemu-devel] [RFC qom-cpu v3 01/10] x86: move apic_state field from CPUX86State to X86CPU

2013-09-15 Thread Chen Fan
This move prepares for the subsequent refactoring of the vCPU APIC.

Signed-off-by: Chen Fan 
---
 cpu-exec.c|  2 +-
 cpus.c|  5 ++---
 hw/i386/kvmvapic.c|  8 +++-
 hw/i386/pc.c  | 17 -
 target-i386/cpu-qom.h |  4 
 target-i386/cpu.c | 22 ++
 target-i386/cpu.h |  4 
 target-i386/helper.c  |  9 -
 target-i386/kvm.c | 23 ++-
 target-i386/misc_helper.c |  8 
 10 files changed, 46 insertions(+), 56 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 301be28..463dc2e 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -320,7 +320,7 @@ int cpu_exec(CPUArchState *env)
 #if !defined(CONFIG_USER_ONLY)
 if (interrupt_request & CPU_INTERRUPT_POLL) {
 cpu->interrupt_request &= ~CPU_INTERRUPT_POLL;
-apic_poll_irq(env->apic_state);
+apic_poll_irq(x86_env_get_cpu(env)->apic_state);
 }
 #endif
 if (interrupt_request & CPU_INTERRUPT_INIT) {
diff --git a/cpus.c b/cpus.c
index 980697e..3bc10f4 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1383,12 +1383,11 @@ void qmp_inject_nmi(Error **errp)
 
 CPU_FOREACH(cs) {
 X86CPU *cpu = X86_CPU(cs);
-CPUX86State *env = &cpu->env;
 
-if (!env->apic_state) {
+if (!cpu->apic_state) {
 cpu_interrupt(cs, CPU_INTERRUPT_NMI);
 } else {
-apic_deliver_nmi(env->apic_state);
+apic_deliver_nmi(cpu->apic_state);
 }
 }
 #else
diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
index d3a6fbe..5a01c35 100644
--- a/hw/i386/kvmvapic.c
+++ b/hw/i386/kvmvapic.c
@@ -366,7 +366,7 @@ static int vapic_enable(VAPICROMState *s, X86CPU *cpu)
 (((hwaddr)cpu_number) << VAPIC_CPU_SHIFT);
 cpu_physical_memory_rw(vapic_paddr + offsetof(VAPICState, enabled),
(void *)&enabled, sizeof(enabled), 1);
-apic_enable_vapic(cpu->env.apic_state, vapic_paddr);
+apic_enable_vapic(cpu->apic_state, vapic_paddr);
 
 s->state = VAPIC_ACTIVE;
 
@@ -496,12 +496,10 @@ static void vapic_enable_tpr_reporting(bool enable)
 };
 CPUState *cs;
 X86CPU *cpu;
-CPUX86State *env;
 
 CPU_FOREACH(cs) {
 cpu = X86_CPU(cs);
-env = &cpu->env;
-info.apic = env->apic_state;
+info.apic = cpu->apic_state;
 run_on_cpu(cs, vapic_do_enable_tpr_reporting, &info);
 }
 }
@@ -690,7 +688,7 @@ static void vapic_write(void *opaque, hwaddr addr, uint64_t data,
 default:
 case 4:
 if (!kvm_irqchip_in_kernel()) {
-apic_poll_irq(env->apic_state);
+apic_poll_irq(cpu->apic_state);
 }
 break;
 }
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 0c313fe..832c9b2 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -169,13 +169,14 @@ void cpu_smm_update(CPUX86State *env)
 int cpu_get_pic_interrupt(CPUX86State *env)
 {
 int intno;
+X86CPU *cpu = x86_env_get_cpu(env);
 
-intno = apic_get_interrupt(env->apic_state);
+intno = apic_get_interrupt(cpu->apic_state);
 if (intno >= 0) {
 return intno;
 }
 /* read the irq from the PIC */
-if (!apic_accept_pic_intr(env->apic_state)) {
+if (!apic_accept_pic_intr(cpu->apic_state)) {
 return -1;
 }
 
@@ -187,15 +188,13 @@ static void pic_irq_request(void *opaque, int irq, int level)
 {
 CPUState *cs = first_cpu;
 X86CPU *cpu = X86_CPU(cs);
-CPUX86State *env = &cpu->env;
 
 DPRINTF("pic_irqs: %s irq %d\n", level? "raise" : "lower", irq);
-if (env->apic_state) {
+if (cpu->apic_state) {
 CPU_FOREACH(cs) {
 cpu = X86_CPU(cs);
-env = &cpu->env;
-if (apic_accept_pic_intr(env->apic_state)) {
-apic_deliver_pic_intr(env->apic_state, level);
+if (apic_accept_pic_intr(cpu->apic_state)) {
+apic_deliver_pic_intr(cpu->apic_state, level);
 }
 }
 } else {
@@ -890,7 +889,7 @@ DeviceState *cpu_get_current_apic(void)
 {
 if (current_cpu) {
 X86CPU *cpu = X86_CPU(current_cpu);
-return cpu->env.apic_state;
+return cpu->apic_state;
 } else {
 return NULL;
 }
@@ -984,7 +983,7 @@ void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
 }
 
 /* map APIC MMIO area if CPU has APIC */
-if (cpu && cpu->env.apic_state) {
+if (cpu && cpu->apic_state) {
 /* XXX: what if the base changes? */
 sysbus_mmio_map_overlap(SYS_BUS_DEVICE(icc_bridge), 0,
 APIC_DEFAULT_ADDRESS, 0x1000);
diff --git a/target-i386/cpu-qom.h b/target-i386/cpu-qom.h
i

[Qemu-devel] [RFC qom-cpu v3 00/10] i386: add cpu hot remove support

2013-09-15 Thread Chen Fan
By implementing the standard ACPI _EJ0 method in the BIOS, the guest OS can
signal QEMU after it hot-removes a vCPU, and QEMU can then notify the
corresponding vCPU to exit. This series also introduces the QMP command
'cpu-del' to remove a vCPU from QEMU itself.

This work is based on Andreas Färber's qom-cpu branch:
git://github.com/afaerber/qemu-cpu.git

This patch series must be used together with the SeaBIOS and KVM patches.

For the KVM patches:
http://comments.gmane.org/gmane.comp.emulators.kvm.devel/114347

For the SeaBIOS patches:
http://comments.gmane.org/gmane.comp.emulators.qemu/230460

Chen Fan (10):
  x86: move apic_state field from CPUX86State to X86CPU
  apic: remove redundant variable 'apic_no' from apic_init_common()
  apic: remove local_apics array and using CPU_FOREACH instead
  x86: add x86_cpu_unrealizefn() for cpu apic remove
  qmp: add 'cpu-del' command support
  qom cpu: rename variable 'cpu_added_notifier' to
'cpu_hotplug_notifier'
  qom cpu: add UNPLUG cpu notifier support
  i386: implement pc interface pc_hot_del_cpu()
  piix4: implement function cpu_status_write() for vcpu ejection
  cpus: reclaim allocated vCPU objects

 cpu-exec.c  |  2 +-
 cpus.c  | 51 --
 hw/acpi/piix4.c | 66 +++--
 hw/i386/kvm/apic.c  |  8 
 hw/i386/kvmvapic.c  |  8 ++--
 hw/i386/pc.c| 51 +-
 hw/i386/pc_piix.c   |  1 +
 hw/intc/apic.c  | 81 -
 hw/intc/apic_common.c   |  6 +--
 include/hw/boards.h |  2 +
 include/hw/i386/apic_internal.h |  2 -
 include/hw/i386/pc.h|  1 +
 include/qom/cpu.h   | 20 ++
 include/sysemu/kvm.h|  1 +
 include/sysemu/sysemu.h |  2 +-
 kvm-all.c   | 25 +
 qapi-schema.json| 12 ++
 qmp-commands.hx | 23 
 qmp.c   |  9 +
 qom/cpu.c   | 26 ++---
 target-i386/cpu-qom.h   |  5 +++
 target-i386/cpu.c   | 57 +++--
 target-i386/cpu.h   |  4 --
 target-i386/helper.c|  9 ++---
 target-i386/kvm.c   | 23 +---
 target-i386/misc_helper.c   |  8 ++--
 26 files changed, 380 insertions(+), 123 deletions(-)

-- 
1.8.1.4




[Qemu-devel] [RFC qom-cpu v3 09/10] piix4: implement function cpu_status_write() for vcpu ejection

2013-09-15 Thread Chen Fan
When the guest OS ejects a vCPU (e.g. echo 1 > /sys/bus/acpi/devices/LNXCPUXX/eject),
it calls the ACPI _EJ0 method and the firmware writes the new CPU map, from
which QEMU can tell which vCPU needs to be ejected.
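
The detection step can be sketched in isolation as follows. This is only a
minimal illustration, not the code from this patch: changed_cpu_id() and
PROC_LEN are hypothetical names, and the sketch uses -1 as the "nothing
changed" value (the sentinel adopted later in this series) so that CPU 0
remains reportable.

```c
#include <assert.h>
#include <stdint.h>

#define PROC_LEN 32 /* assumed size of the status bitmap, in bytes */

/*
 * Return the id of the CPU whose bit differs between the previously
 * recorded status byte and the byte the guest just wrote, or -1 if the
 * two bytes are identical. When several bits differ, the highest one
 * wins, as in a plain loop over all eight bits.
 */
static int64_t changed_cpu_id(const uint8_t *old_sts, unsigned addr,
                              uint8_t data)
{
    assert(addr < PROC_LEN);

    uint8_t diff = old_sts[addr] ^ data; /* bits that flipped */
    int64_t cpuid = -1;                  /* -1: no bit changed */

    for (int i = 0; i < 8; i++) {
        if (diff & (1u << i)) {
            cpuid = 8 * (int64_t)addr + i;
        }
    }
    return cpuid;
}
```

Each byte of the map covers eight CPUs, so a write at byte offset addr that
flips bit i identifies CPU 8 * addr + i.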

Signed-off-by: Chen Fan 
---
 hw/acpi/piix4.c | 37 -
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 2ddc9a8..0e9b5bd 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -61,6 +61,7 @@ struct pci_status {
 
 typedef struct CPUStatus {
 uint8_t sts[PIIX4_PROC_LEN];
+uint8_t old_sts[PIIX4_PROC_LEN];
 } CPUStatus;
 
 typedef struct PIIX4PMState {
@@ -610,6 +611,12 @@ static const MemoryRegionOps piix4_pci_ops = {
 },
 };
 
+static void acpi_piix_eject_vcpu(int64_t cpuid)
+{
+/* TODO: eject a vcpu, release allocated vcpu and exit the vcpu pthread.  */
+PIIX4_DPRINTF("vcpu: %" PRIu64 " need to be ejected.\n", cpuid);
+}
+
 static uint64_t cpu_status_read(void *opaque, hwaddr addr, unsigned int size)
 {
 PIIX4PMState *s = opaque;
@@ -622,7 +629,27 @@ static uint64_t cpu_status_read(void *opaque, hwaddr addr, unsigned int size)
 static void cpu_status_write(void *opaque, hwaddr addr, uint64_t data,
  unsigned int size)
 {
-/* TODO: implement VCPU removal on guest signal that CPU can be removed */
+PIIX4PMState *s = opaque;
+CPUStatus *cpus = &s->gpe_cpu;
+uint8_t val;
+int i;
+int64_t cpuid = 0;
+
+val = cpus->old_sts[addr] ^ data;
+
+if (val == 0) {
+return;
+}
+
+for (i = 0; i < 8; i++) {
+if (val & 1 << i) {
+cpuid = 8 * addr + i;
+}
+}
+
+if (cpuid != 0) {
+acpi_piix_eject_vcpu(cpuid);
+}
 }
 
 static const MemoryRegionOps cpu_hotplug_ops = {
@@ -642,13 +669,20 @@ static void piix4_cpu_hotplug_req(PIIX4PMState *s, CPUState *cpu,
 ACPIGPE *gpe = &s->ar.gpe;
 CPUClass *k = CPU_GET_CLASS(cpu);
 int64_t cpu_id;
+int i;
 
 assert(s != NULL);
 
 *gpe->sts = *gpe->sts | PIIX4_CPU_HOTPLUG_STATUS;
 cpu_id = k->get_arch_id(CPU(cpu));
+
+for (i = 0; i < PIIX4_PROC_LEN; i++) {
+g->old_sts[i] = g->sts[i];
+}
+
 if (action == PLUG) {
 g->sts[cpu_id / 8] |= (1 << (cpu_id % 8));
+g->old_sts[cpu_id / 8] |= (1 << (cpu_id % 8));
 } else {
 g->sts[cpu_id / 8] &= ~(1 << (cpu_id % 8));
 }
@@ -687,6 +721,7 @@ static void piix4_acpi_system_hot_add_init(MemoryRegion *parent,
 
 g_assert((id / 8) < PIIX4_PROC_LEN);
 s->gpe_cpu.sts[id / 8] |= (1 << (id % 8));
+s->gpe_cpu.old_sts[id / 8] |= (1 << (id % 8));
 }
 memory_region_init_io(&s->io_cpu, OBJECT(s), &cpu_hotplug_ops, s,
   "acpi-cpu-hotplug", PIIX4_PROC_LEN);
-- 
1.8.1.4




[Qemu-devel] [RFC qom-cpu v3 06/10] qom cpu: rename variable 'cpu_added_notifier' to 'cpu_hotplug_notifier'

2013-09-15 Thread Chen Fan
Rename variable 'cpu_added_notifier' to 'cpu_hotplug_notifier', in
preparation for adding vCPU-remove notifier support.

Signed-off-by: Chen Fan 
---
 hw/acpi/piix4.c | 10 +-
 hw/i386/pc.c|  2 +-
 include/sysemu/sysemu.h |  2 +-
 qom/cpu.c   | 10 +-
 4 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 0b8d1d9..c8f4182 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -95,7 +95,7 @@ typedef struct PIIX4PMState {
 uint8_t s4_val;
 
 CPUStatus gpe_cpu;
-Notifier cpu_added_notifier;
+Notifier cpu_hotplug_notifier;
 } PIIX4PMState;
 
 #define TYPE_PIIX4_PM "PIIX4_PM"
@@ -660,9 +660,9 @@ static void piix4_cpu_hotplug_req(PIIX4PMState *s, CPUState *cpu,
 pm_update_sci(s);
 }
 
-static void piix4_cpu_added_req(Notifier *n, void *opaque)
+static void piix4_cpu_hotplug(Notifier *n, void *opaque)
 {
-PIIX4PMState *s = container_of(n, PIIX4PMState, cpu_added_notifier);
+PIIX4PMState *s = container_of(n, PIIX4PMState, cpu_hotplug_notifier);
 
 piix4_cpu_hotplug_req(s, CPU(opaque), PLUG);
 }
@@ -695,8 +695,8 @@ static void piix4_acpi_system_hot_add_init(MemoryRegion *parent,
 memory_region_init_io(&s->io_cpu, OBJECT(s), &cpu_hotplug_ops, s,
   "acpi-cpu-hotplug", PIIX4_PROC_LEN);
 memory_region_add_subregion(parent, PIIX4_PROC_BASE, &s->io_cpu);
-s->cpu_added_notifier.notify = piix4_cpu_added_req;
-qemu_register_cpu_added_notifier(&s->cpu_added_notifier);
+s->cpu_hotplug_notifier.notify = piix4_cpu_hotplug;
+qemu_register_cpu_hotplug_notifier(&s->cpu_hotplug_notifier);
 }
 
 static void enable_device(PIIX4PMState *s, int slot)
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 40d611e..8ab6e4f 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -406,7 +406,7 @@ void pc_cmos_init(ram_addr_t ram_size, ram_addr_t above_4g_mem_size,
 /* init CPU hotplug notifier */
 cpu_hotplug_cb.rtc_state = s;
 cpu_hotplug_cb.cpu_added_notifier.notify = rtc_notify_cpu_added;
-qemu_register_cpu_added_notifier(&cpu_hotplug_cb.cpu_added_notifier);
+qemu_register_cpu_hotplug_notifier(&cpu_hotplug_cb.cpu_added_notifier);
 
 if (set_boot_dev(s, boot_device)) {
 exit(1);
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index b1aa059..e1c1120 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -153,7 +153,7 @@ void do_pci_device_hot_remove(Monitor *mon, const QDict *qdict);
 void drive_hot_add(Monitor *mon, const QDict *qdict);
 
 /* CPU hotplug */
-void qemu_register_cpu_added_notifier(Notifier *notifier);
+void qemu_register_cpu_hotplug_notifier(Notifier *notifier);
 
 /* pcie aer error injection */
 void pcie_aer_inject_error_print(Monitor *mon, const QObject *data);
diff --git a/qom/cpu.c b/qom/cpu.c
index fa7ec6b..7992fe1 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -67,12 +67,12 @@ static void cpu_common_get_memory_mapping(CPUState *cpu,
 }
 
 /* CPU hot-plug notifiers */
-static NotifierList cpu_added_notifiers =
-NOTIFIER_LIST_INITIALIZER(cpu_add_notifiers);
+static NotifierList cpu_hotplug_notifiers =
+NOTIFIER_LIST_INITIALIZER(cpu_hotplug_notifiers);
 
-void qemu_register_cpu_added_notifier(Notifier *notifier)
+void qemu_register_cpu_hotplug_notifier(Notifier *notifier)
 {
-notifier_list_add(&cpu_added_notifiers, notifier);
+notifier_list_add(&cpu_hotplug_notifiers, notifier);
 }
 
 void cpu_reset_interrupt(CPUState *cpu, int mask)
@@ -218,7 +218,7 @@ static void cpu_common_realizefn(DeviceState *dev, Error **errp)
 
 if (dev->hotplugged) {
 cpu_synchronize_post_init(cpu);
-notifier_list_notify(&cpu_added_notifiers, dev);
+notifier_list_notify(&cpu_hotplug_notifiers, dev);
 cpu_resume(cpu);
 }
 }
-- 
1.8.1.4




[Qemu-devel] [RFC qom-cpu v3 03/10] apic: remove local_apics array and using CPU_FOREACH instead

2013-09-15 Thread Chen Fan
Use the CPU_FOREACH() macro instead of scanning the entire
local_apics array, so that APIC lookups only visit CPUs that exist.
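
The idea can be sketched outside QEMU as follows. This is a simplified
illustration under stated assumptions, not the patched code: struct apic,
find_dest(), set_bit() and test_bit() are hypothetical stand-ins, the CPU list
is modeled as a plain singly linked list, and MAX_APIC_WORDS is assumed to be
8 (eight 32-bit words).

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_APIC_WORDS 8 /* assumed: 8 * 32 = 256 possible APIC indexes */

/* Hypothetical stand-in for an APIC reachable from a CPU list node. */
struct apic {
    uint8_t id;        /* APIC ID assigned when the CPU was created */
    int idx;           /* index used in delivery bitmasks */
    struct apic *next; /* next APIC on the list, or NULL */
};

/* Visit only the APICs that exist; return the index for dest, or -1. */
static int find_dest(const struct apic *head, uint8_t dest)
{
    for (const struct apic *a = head; a != NULL; a = a->next) {
        if (a->id == dest) {
            return a->idx;
        }
    }
    return -1;
}

/* Set the bit for index idx in a bitmask made of 32-bit words. */
static void set_bit(uint32_t *mask, int idx)
{
    mask[idx / 32] |= UINT32_C(1) << (idx % 32);
}

/* Return 1 if the bit for index idx is set, 0 otherwise. */
static int test_bit(const uint32_t *mask, int idx)
{
    return (int)((mask[idx / 32] >> (idx % 32)) & 1);
}
```

With a list, the cost of a lookup follows the number of present CPUs rather
than the size of a fixed table, and a removed CPU simply disappears from the
walk instead of leaving a stale array slot behind.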

Signed-off-by: Chen Fan 
---
 hw/intc/apic.c  | 73 ++---
 include/hw/i386/apic_internal.h |  2 --
 2 files changed, 32 insertions(+), 43 deletions(-)

diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index a913186..f8f2cbf 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -32,8 +32,6 @@
 #define SYNC_TO_VAPIC   0x2
 #define SYNC_ISR_IRR_TO_VAPIC   0x4
 
-static APICCommonState *local_apics[MAX_APICS + 1];
-
 static void apic_set_irq(APICCommonState *s, int vector_num, int trigger_mode);
 static void apic_update_irq(APICCommonState *s);
 static void apic_get_delivery_bitmask(uint32_t *deliver_bitmask,
@@ -200,18 +198,15 @@ static void apic_external_nmi(APICCommonState *s)
 
 #define foreach_apic(apic, deliver_bitmask, code) \
 {\
+CPUState *cpu;\
 int __i, __j, __mask;\
-for(__i = 0; __i < MAX_APIC_WORDS; __i++) {\
+CPU_FOREACH(cpu) {\
+apic = APIC_COMMON(X86_CPU(cpu)->apic_state);\
+__i = apic->idx / 32;\
+__j = apic->idx % 32;\
 __mask = deliver_bitmask[__i];\
-if (__mask) {\
-for(__j = 0; __j < 32; __j++) {\
-if (__mask & (1 << __j)) {\
-apic = local_apics[__i * 32 + __j];\
-if (apic) {\
-code;\
-}\
-}\
-}\
+if (__mask & (1 << __j)) {\
+code;\
 }\
 }\
 }
@@ -235,9 +230,13 @@ static void apic_bus_deliver(const uint32_t *deliver_bitmask,
 }
 }
 if (d >= 0) {
-apic_iter = local_apics[d];
-if (apic_iter) {
-apic_set_irq(apic_iter, vector_num, trigger_mode);
+CPUState *cpu;
+CPU_FOREACH(cpu) {
+apic_iter = APIC_COMMON(X86_CPU(cpu)->apic_state);
+if (apic_iter->idx == d) {
+apic_set_irq(apic_iter, vector_num, trigger_mode);
+break;
+}
 }
 }
 }
@@ -422,18 +421,14 @@ static void apic_eoi(APICCommonState *s)
 
 static int apic_find_dest(uint8_t dest)
 {
-APICCommonState *apic = local_apics[dest];
-int i;
-
-if (apic && apic->id == dest)
-return dest;  /* shortcut in case apic->id == apic->idx */
+APICCommonState *apic;
+CPUState *cpu;
 
-for (i = 0; i < MAX_APICS; i++) {
-apic = local_apics[i];
-   if (apic && apic->id == dest)
-return i;
-if (!apic)
-break;
+CPU_FOREACH(cpu) {
+apic = APIC_COMMON(X86_CPU(cpu)->apic_state);
+if (apic->id == dest) {
+return apic->idx;
+}
 }
 
 return -1;
@@ -443,7 +438,7 @@ static void apic_get_delivery_bitmask(uint32_t *deliver_bitmask,
   uint8_t dest, uint8_t dest_mode)
 {
 APICCommonState *apic_iter;
-int i;
+CPUState *cpu;
 
 if (dest_mode == 0) {
 if (dest == 0xff) {
@@ -457,20 +452,17 @@ static void apic_get_delivery_bitmask(uint32_t *deliver_bitmask,
 } else {
 /* XXX: cluster mode */
 memset(deliver_bitmask, 0x00, MAX_APIC_WORDS * sizeof(uint32_t));
-for(i = 0; i < MAX_APICS; i++) {
-apic_iter = local_apics[i];
-if (apic_iter) {
-if (apic_iter->dest_mode == 0xf) {
-if (dest & apic_iter->log_dest)
-apic_set_bit(deliver_bitmask, i);
-} else if (apic_iter->dest_mode == 0x0) {
-if ((dest & 0xf0) == (apic_iter->log_dest & 0xf0) &&
-(dest & apic_iter->log_dest & 0x0f)) {
-apic_set_bit(deliver_bitmask, i);
-}
+CPU_FOREACH(cpu) {
+apic_iter = APIC_COMMON(X86_CPU(cpu)->apic_state);
+if (apic_iter->dest_mode == 0xf) {
+if (dest & apic_iter->log_dest) {
+apic_set_bit(deliver_bitmask, apic_iter->idx);
+}
+} else if (apic_iter->dest_mode == 0x0) {
+if ((dest & 0xf0) == (apic_iter->log_dest & 0xf0) &&
+(dest & apic_iter->log_dest & 0x0f)) {
+apic_set_bit(deliver_bitmask, apic_iter->idx);
 }
-} else {
-break;
 }
 }
 }
@@ -877,7 +869,6 @@ static void apic_init(APICCommonState *s)
   APIC_SPACE_SIZE);
 
 s->timer = timer_new_ns(QEMU_CLOCK_VIR

[Qemu-devel] [RFC qom-cpu v3 07/10] qom cpu: add UNPLUG cpu notifier support

2013-09-15 Thread Chen Fan
Move enum HotplugEventType from piix4.c to include/qom/cpu.h,
and add struct CPUNotifier to support the UNPLUG CPU notifier.
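
The pattern of passing an event type together with the device can be sketched
like this. It is a reduced illustration only: struct cpu_dev, struct cpu_event
and on_cpu_event() are hypothetical names, and the listener here just
maintains a presence bitmap instead of raising an ACPI event.

```c
#include <stdint.h>

typedef enum {
    PLUG,
    UNPLUG,
} HotplugEventType;

/* Hypothetical device: only the CPU id matters for this sketch. */
struct cpu_dev {
    int id;
};

/* Hypothetical payload: which device, and what happened to it. */
struct cpu_event {
    struct cpu_dev *dev;
    HotplugEventType type;
};

/* A single callback serves both directions by checking the type. */
static void on_cpu_event(uint8_t *sts, const struct cpu_event *ev)
{
    int id = ev->dev->id;

    if (ev->type == PLUG) {
        sts[id / 8] |= (uint8_t)(1u << (id % 8));  /* mark present */
    } else {
        sts[id / 8] &= (uint8_t)~(1u << (id % 8)); /* mark absent */
    }
}
```

Because the type travels with the event, one registration is enough for both
plug and unplug, which is why the listener no longer hard-codes PLUG.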

Signed-off-by: Chen Fan 
---
 hw/acpi/piix4.c   |  8 ++--
 include/qom/cpu.h | 10 ++
 qom/cpu.c |  6 +-
 3 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index c8f4182..2ddc9a8 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -635,11 +635,6 @@ static const MemoryRegionOps cpu_hotplug_ops = {
 },
 };
 
-typedef enum {
-PLUG,
-UNPLUG,
-} HotplugEventType;
-
 static void piix4_cpu_hotplug_req(PIIX4PMState *s, CPUState *cpu,
   HotplugEventType action)
 {
@@ -663,8 +658,9 @@ static void piix4_cpu_hotplug_req(PIIX4PMState *s, CPUState *cpu,
 static void piix4_cpu_hotplug(Notifier *n, void *opaque)
 {
 PIIX4PMState *s = container_of(n, PIIX4PMState, cpu_hotplug_notifier);
+CPUNotifier *notifier = opaque;
 
-piix4_cpu_hotplug_req(s, CPU(opaque), PLUG);
+piix4_cpu_hotplug_req(s, CPU(notifier->dev), notifier->type);
 }
 
 static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev,
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 7739e00..0238532 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -507,6 +507,16 @@ void qemu_init_vcpu(CPUState *cpu);
  */
 void cpu_single_step(CPUState *cpu, int enabled);
 
+typedef enum {
+PLUG,
+UNPLUG,
+} HotplugEventType;
+
+typedef struct CPUNotifier {
+DeviceState *dev;
+HotplugEventType type;
+} CPUNotifier;
+
 #ifdef CONFIG_SOFTMMU
 extern const struct VMStateDescription vmstate_cpu_common;
 #else
diff --git a/qom/cpu.c b/qom/cpu.c
index 7992fe1..c6d7ebc 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -215,10 +215,14 @@ static ObjectClass *cpu_common_class_by_name(const char *cpu_model)
 static void cpu_common_realizefn(DeviceState *dev, Error **errp)
 {
 CPUState *cpu = CPU(dev);
+CPUNotifier notifier;
+
+notifier.dev = dev;
+notifier.type = PLUG;
 
 if (dev->hotplugged) {
 cpu_synchronize_post_init(cpu);
-notifier_list_notify(&cpu_hotplug_notifiers, dev);
+notifier_list_notify(&cpu_hotplug_notifiers, &notifier);
 cpu_resume(cpu);
 }
 }
-- 
1.8.1.4




[Qemu-devel] [RFC qom-cpu v3 10/10] cpus: reclaim allocated vCPU objects

2013-09-15 Thread Chen Fan
After ACPI gets a signal to eject a vCPU, it notifies the vCPU thread to
exit (in KVM), and the vCPU must be removed from the CPU list. Before the
vCPU is actually removed, all related vCPU objects are released.
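
The flag-then-reclaim flow can be sketched without QEMU as follows. This is a
single-threaded illustration under stated assumptions: it uses the BSD TAILQ
macros from <sys/queue.h> (available with glibc) in place of QEMU's QTAILQ,
struct vcpu and the vcpu_*() helpers are hypothetical, and the run-state check
is reduced to the stop and exit flags.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <sys/queue.h>

struct vcpu {
    int id;
    bool stop;              /* vCPU must not run any more */
    bool exit;              /* vCPU is scheduled for removal */
    TAILQ_ENTRY(vcpu) node; /* linkage on the CPU list */
};
TAILQ_HEAD(vcpu_list, vcpu);

/* Allocate a vCPU and append it to the list; NULL on failure. */
static struct vcpu *vcpu_add(struct vcpu_list *list, int id)
{
    struct vcpu *c = calloc(1, sizeof(*c));
    if (c == NULL) {
        return NULL;
    }
    c->id = id;
    TAILQ_INSERT_TAIL(list, c, node);
    return c;
}

/* Request removal: only set flags; teardown happens in the owner loop. */
static void vcpu_request_remove(struct vcpu *c)
{
    c->stop = true;
    c->exit = true;
}

/*
 * Reclaim at most one flagged vCPU. The victim is picked inside the
 * walk but unlinked and freed after it, so the iteration never steps
 * through a freed node. Returns the removed id, or -1 if none.
 */
static int vcpu_reap_one(struct vcpu_list *list)
{
    struct vcpu *c, *victim = NULL;

    TAILQ_FOREACH(c, list, node) {
        if (c->exit && c->stop) {
            victim = c;
            break;
        }
    }
    if (victim == NULL) {
        return -1;
    }

    int id = victim->id;
    TAILQ_REMOVE(list, victim, node);
    free(victim);
    return id;
}
```

Separating the request from the reclaim keeps the unlink and free in the
context that iterates the list, which mirrors how the remove request here only
sets flags and kicks the vCPU.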

Signed-off-by: Chen Fan 
---
 cpus.c   | 46 ++
 hw/acpi/piix4.c  | 23 +--
 include/qom/cpu.h| 10 ++
 include/sysemu/kvm.h |  1 +
 kvm-all.c| 25 +
 5 files changed, 99 insertions(+), 6 deletions(-)

diff --git a/cpus.c b/cpus.c
index 3bc10f4..c1ad2f4 100644
--- a/cpus.c
+++ b/cpus.c
@@ -714,6 +714,26 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data)
 qemu_cpu_kick(cpu);
 }
 
+static void qemu_kvm_destroy_vcpu(CPUState *cpu)
+{
+CPU_REMOVE(cpu);
+
+if (kvm_destroy_vcpu(cpu) < 0) {
+fprintf(stderr, "kvm_destroy_vcpu failed.\n");
+exit(1);
+}
+
+object_property_set_bool(OBJECT(cpu), false, "realized", NULL);
+qdev_free(DEVICE(cpu));
+}
+
+static void qemu_tcg_destroy_vcpu(CPUState *cpu)
+{
+CPU_REMOVE(cpu);
+object_property_set_bool(OBJECT(cpu), false, "realized", NULL);
+qdev_free(DEVICE(cpu));
+}
+
 static void flush_queued_work(CPUState *cpu)
 {
 struct qemu_work_item *wi;
@@ -805,6 +825,11 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
 }
 }
 qemu_kvm_wait_io_event(cpu);
+if (cpu->exit && !cpu_can_run(cpu)) {
+qemu_kvm_destroy_vcpu(cpu);
+qemu_mutex_unlock(&qemu_global_mutex);
+return NULL;
+}
 }
 
 return NULL;
@@ -857,6 +882,7 @@ static void tcg_exec_all(void);
 static void *qemu_tcg_cpu_thread_fn(void *arg)
 {
 CPUState *cpu = arg;
+CPUState *remove_cpu = NULL;
 
 qemu_tcg_init_cpu_signals();
 qemu_thread_get_self(cpu->thread);
@@ -889,6 +915,16 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
 }
 }
 qemu_tcg_wait_io_event();
+CPU_FOREACH(cpu) {
+if (cpu->exit && !cpu_can_run(cpu)) {
+remove_cpu = cpu;
+break;
+}
+}
+if (remove_cpu) {
+qemu_tcg_destroy_vcpu(remove_cpu);
+remove_cpu = NULL;
+}
 }
 
 return NULL;
@@ -1045,6 +1081,13 @@ void resume_all_vcpus(void)
 }
 }
 
+void cpu_remove(CPUState *cpu)
+{
+cpu->stop = true;
+cpu->exit = true;
+qemu_cpu_kick(cpu);
+}
+
 static void qemu_tcg_init_vcpu(CPUState *cpu)
 {
 /* share a single thread for all cpus with TCG */
@@ -1219,6 +1262,9 @@ static void tcg_exec_all(void)
 break;
 }
 } else if (cpu->stop || cpu->stopped) {
+if (cpu->exit) {
+next_cpu = CPU_NEXT(cpu);
+}
 break;
 }
 }
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 0e9b5bd..c2cf519 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -611,10 +611,21 @@ static const MemoryRegionOps piix4_pci_ops = {
 },
 };
 
-static void acpi_piix_eject_vcpu(int64_t cpuid)
+static void acpi_piix_eject_vcpu(PIIX4PMState *s, int64_t cpuid)
 {
-/* TODO: eject a vcpu, release allocated vcpu and exit the vcpu pthread.  */
-PIIX4_DPRINTF("vcpu: %" PRIu64 " need to be ejected.\n", cpuid);
+CPUStatus *g = &s->gpe_cpu;
+CPUState *cpu;
+
+CPU_FOREACH(cpu) {
+CPUClass *cc = CPU_GET_CLASS(cpu);
+int64_t id = cc->get_arch_id(cpu);
+
+if (cpuid == id) {
+g->old_sts[cpuid / 8] &= ~(1 << (cpuid % 8));
+cpu_remove(cpu);
+break;
+}
+}
 }
 
 static uint64_t cpu_status_read(void *opaque, hwaddr addr, unsigned int size)
@@ -633,7 +644,7 @@ static void cpu_status_write(void *opaque, hwaddr addr, uint64_t data,
 CPUStatus *cpus = &s->gpe_cpu;
 uint8_t val;
 int i;
-int64_t cpuid = 0;
+int64_t cpuid = -1;
 
 val = cpus->old_sts[addr] ^ data;
 
@@ -647,8 +658,8 @@ static void cpu_status_write(void *opaque, hwaddr addr, uint64_t data,
 }
 }
 
-if (cpuid != 0) {
-acpi_piix_eject_vcpu(cpuid);
+if (cpuid != -1) {
+acpi_piix_eject_vcpu(s, cpuid);
 }
 }
 
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 0238532..eb8d32b 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -181,6 +181,7 @@ struct CPUState {
 bool created;
 bool stop;
 bool stopped;
+bool exit;
 volatile sig_atomic_t exit_request;
 volatile sig_atomic_t tcg_exit_req;
 uint32_t interrupt_request;
@@ -206,6 +207,7 @@ struct CPUState {
 QTAILQ_HEAD(CPUTailQ, CPUState);
 extern struct CPUTailQ cpus;
 #define CPU_NEXT(cpu) QTAILQ_NEXT(cpu, node)
+#define CPU_REMOVE(cpu) QTAILQ_REMOVE(&cpus, cpu, node)
 #define CPU_FOREACH(cpu) QTAILQ_FOREACH(cpu, 

[Qemu-devel] [RFC qom-cpu v3 04/10] x86: add x86_cpu_unrealizefn() for cpu apic remove

2013-09-15 Thread Chen Fan
Implement x86_cpu_unrealizefn() as the counterpart of x86_cpu_realizefn();
for now it is mostly used to clear the APIC-related state.

Signed-off-by: Chen Fan 
---
 hw/i386/kvm/apic.c|  8 
 hw/intc/apic.c|  8 
 target-i386/cpu-qom.h |  1 +
 target-i386/cpu.c | 35 +++
 4 files changed, 52 insertions(+)

diff --git a/hw/i386/kvm/apic.c b/hw/i386/kvm/apic.c
index 5609063..9461600 100644
--- a/hw/i386/kvm/apic.c
+++ b/hw/i386/kvm/apic.c
@@ -181,11 +181,19 @@ static void kvm_apic_init(APICCommonState *s)
 }
 }
 
+static void kvm_apic_unrealize(DeviceState *dev, Error **errp)
+{
+APICCommonState *s = APIC_COMMON(dev);
+memory_region_destroy(&s->io_memory);
+}
+
 static void kvm_apic_class_init(ObjectClass *klass, void *data)
 {
 APICCommonClass *k = APIC_COMMON_CLASS(klass);
+DeviceClass *dc = DEVICE_CLASS(klass);
 
 k->init = kvm_apic_init;
+dc->unrealize = kvm_apic_unrealize;
 k->set_base = kvm_apic_set_base;
 k->set_tpr = kvm_apic_set_tpr;
 k->get_tpr = kvm_apic_get_tpr;
diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index f8f2cbf..46ea047 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -873,11 +873,19 @@ static void apic_init(APICCommonState *s)
 msi_supported = true;
 }
 
+static void apic_unrealize(DeviceState *dev, Error **errp)
+{
+APICCommonState *s = APIC_COMMON(dev);
+memory_region_destroy(&s->io_memory);
+}
+
 static void apic_class_init(ObjectClass *klass, void *data)
 {
 APICCommonClass *k = APIC_COMMON_CLASS(klass);
+DeviceClass *dc = DEVICE_CLASS(klass);
 
 k->init = apic_init;
+dc->unrealize = apic_unrealize;
 k->set_base = apic_set_base;
 k->set_tpr = apic_set_tpr;
 k->get_tpr = apic_get_tpr;
diff --git a/target-i386/cpu-qom.h b/target-i386/cpu-qom.h
index 548a449..ad8ad82 100644
--- a/target-i386/cpu-qom.h
+++ b/target-i386/cpu-qom.h
@@ -50,6 +50,7 @@ typedef struct X86CPUClass {
 /*< public >*/
 
 DeviceRealize parent_realize;
+DeviceUnrealize parent_unrealize;
 void (*parent_reset)(CPUState *cpu);
 } X86CPUClass;
 
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 047bb77..6ac3ff2 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -2336,10 +2336,31 @@ static void x86_cpu_apic_realize(X86CPU *cpu, Error **errp)
 return;
 }
 }
+
+static void x86_cpu_apic_unrealize(X86CPU *cpu, Error **errp)
+{
+Error *local_err = NULL;
+
+if (cpu->apic_state == NULL) {
+return;
+}
+
+object_property_set_bool(OBJECT(cpu->apic_state),
+ false, "realized", &local_err);
+if (local_err != NULL) {
+error_propagate(errp, local_err);
+return;
+}
+
+qdev_free(cpu->apic_state);
+}
 #else
 static void x86_cpu_apic_realize(X86CPU *cpu, Error **errp)
 {
 }
+static void x86_cpu_apic_unrealize(X86CPU *cpu, Error **errp)
+{
+}
 #endif
 
 static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
@@ -2415,6 +2436,18 @@ out:
 }
 }
 
+static void x86_cpu_unrealizefn(DeviceState *dev, Error **errp)
+{
+X86CPU *cpu = X86_CPU(dev);
+Error *local_err = NULL;
+
+x86_cpu_apic_unrealize(cpu, &local_err);
+if (local_err != NULL) {
+error_propagate(errp, local_err);
+return;
+}
+}
+
 /* Enables contiguous-apic-ID mode, for compatibility */
 static bool compat_apic_id_mode;
 
@@ -2546,7 +2579,9 @@ static void x86_cpu_common_class_init(ObjectClass *oc, void *data)
 DeviceClass *dc = DEVICE_CLASS(oc);
 
 xcc->parent_realize = dc->realize;
+xcc->parent_unrealize = dc->unrealize;
 dc->realize = x86_cpu_realizefn;
+dc->unrealize = x86_cpu_unrealizefn;
 dc->bus_type = TYPE_ICC_BUS;
 dc->props = x86_cpu_properties;
 
-- 
1.8.1.4




[Qemu-devel] [RFC qom-cpu v3 08/10] i386: implement pc interface pc_hot_del_cpu()

2013-09-15 Thread Chen Fan
Implement the CPU interface pc_hot_del_cpu() for unrealizing a vCPU device.
It emits a vcpu-remove notifier to ACPI, so that ACPI can send an SCI
interrupt to the OS to hot-remove the vCPU.

Signed-off-by: Chen Fan 
---
 hw/i386/pc.c | 30 --
 qom/cpu.c| 12 
 2 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 8ab6e4f..ce7b20f 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -958,8 +958,34 @@ void pc_hot_add_cpu(const int64_t id, Error **errp)
 
 void pc_hot_del_cpu(const int64_t id, Error **errp)
 {
-/* TODO: hot remove vCPU. */
-error_setg(errp, "Hot-remove CPU is not supported.");
+CPUState *cpu;
+bool found = false;
+X86CPUClass *xcc;
+
+CPU_FOREACH(cpu) {
+CPUClass *cc = CPU_GET_CLASS(cpu);
+int64_t cpuid = cc->get_arch_id(cpu);
+
+if (cpuid == id) {
+found = true;
+break;
+}
+}
+
+if (!found) {
+error_setg(errp, "Unable to find cpu-index: %" PRIi64
+   ", it doesn't exist or has been deleted.", id);
+return;
+}
+
+if (cpu == first_cpu && !CPU_NEXT(cpu)) {
+error_setg(errp, "Unable to delete the last "
+   "one cpu when VM running.");
+return;
+}
+
+xcc = X86_CPU_GET_CLASS(DEVICE(cpu));
+xcc->parent_unrealize(DEVICE(cpu), errp);
 }
 
 void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
diff --git a/qom/cpu.c b/qom/cpu.c
index c6d7ebc..b413a4c 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -227,6 +227,17 @@ static void cpu_common_realizefn(DeviceState *dev, Error **errp)
 }
 }
 
+static void cpu_common_unrealizefn(DeviceState *dev, Error **errp)
+{
+CPUNotifier notifier;
+
+notifier.dev = dev;
+notifier.type = UNPLUG;
+
+notifier_list_notify(&cpu_hotplug_notifiers, &notifier);
+}
+
+
 static void cpu_common_initfn(Object *obj)
 {
 CPUState *cpu = CPU(obj);
@@ -257,6 +268,7 @@ static void cpu_class_init(ObjectClass *klass, void *data)
 k->gdb_read_register = cpu_common_gdb_read_register;
 k->gdb_write_register = cpu_common_gdb_write_register;
 dc->realize = cpu_common_realizefn;
+dc->unrealize = cpu_common_unrealizefn;
 dc->no_user = 1;
 }
 
-- 
1.8.1.4
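
For readers unfamiliar with the notifier pattern that the commit message
relies on, here is a minimal standalone sketch in plain C. It does not use
QEMU's Notifier API; every name in it (cpu_notifier, cpu_notifiers_fire,
acpi_listener and so on) is hypothetical and chosen only for illustration.
The unrealize path fires an UNPLUG event, and a listener standing in for the
ACPI side records which CPU went away.

```c
#include <stddef.h>

/* Hypothetical event types, mirroring the PLUG/UNPLUG idea in the patch. */
enum cpu_event { CPU_PLUG, CPU_UNPLUG };

/* One registered listener; listeners form a singly linked list. */
struct cpu_notifier {
    void (*notify)(struct cpu_notifier *n, enum cpu_event ev, int cpu_id);
    struct cpu_notifier *next;
};

static struct cpu_notifier *notifier_head;

/* Register a listener at the head of the list. */
static void cpu_notifier_add(struct cpu_notifier *n)
{
    n->next = notifier_head;
    notifier_head = n;
}

/* Deliver one event to every registered listener. */
static void cpu_notifiers_fire(enum cpu_event ev, int cpu_id)
{
    struct cpu_notifier *n;

    for (n = notifier_head; n != NULL; n = n->next) {
        n->notify(n, ev, cpu_id);
    }
}

/* Stand-in for the ACPI side: remembers the last CPU that was unplugged. */
struct acpi_listener {
    struct cpu_notifier base;   /* must stay the first member */
    int last_unplugged;
};

static void acpi_on_cpu_event(struct cpu_notifier *n, enum cpu_event ev,
                              int cpu_id)
{
    /* Valid because base is the first member of struct acpi_listener. */
    struct acpi_listener *l = (struct acpi_listener *)n;

    if (ev == CPU_UNPLUG) {
        l->last_unplugged = cpu_id;
    }
}
```

In the real patch the listener would raise the SCI rather than store an
integer; the sketch only shows how the unrealize path and the ACPI code stay
decoupled.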




Re: [Qemu-devel] Exposing and calculating CPU APIC IDs (was Re: [RFC 1/3] target-i386: moving registers of vmstate from cpu_exec_init() to x86_cpu_realizefn())

2014-02-12 Thread Chen Fan
On Tue, 2014-01-21 at 11:10 +0100, Andreas Färber wrote:
> Am 21.01.2014 10:51, schrieb Chen Fan:
> > On Tue, 2014-01-21 at 10:31 +0100, Igor Mammedov wrote:
> >> On Tue, 21 Jan 2014 15:12:45 +0800
> >> Chen Fan  wrote:
> >>> On Mon, 2014-01-20 at 13:29 +0100, Igor Mammedov wrote:
> >>>> On Fri, 17 Jan 2014 17:13:55 -0200
> >>>> Eduardo Habkost  wrote:
> >>>>> On Wed, Jan 15, 2014 at 03:37:04PM +0100, Igor Mammedov wrote:
> >>>>>> I recall there were objections to it since APIC ID contains topology
> >>>>>> information and it's not trivial for user to get it right.
> >>>>>> The last idea that was discussed to fix it was not expose APIC ID to
> >>>>>> user but rather introduce QOM hierarchy like:
> >>>>>>   /machine/node/N/socket/X/core/Y/thread/Z
> >>>>>> and use it in user interface as a means to specify an arbitrary CPU
> >>>>>> and let QEMU calculate APIC ID based on this path.
> >>>>>>
> >>>>>> But nobody took on implementing it yet.
> >>>>>
> >>>>> We're taking so long to get a decent interface implemented, that part of
> >>>>> me is considering exposing the APIC ID directly like suggested before,
> >>>>> and requiring libvirt to calculate topology-aware APIC IDs[1] to
> >>>>> properly implement CPU hotplug (and possibly for other tasks).
> >>>> If you are speaking about 
> >>>> 'qemu will core dump with "-smp 254, sockets=2, cores=3, threads=2"'
> >>>> http://patchwork.ozlabs.org/patch/301272/
> >>>> bug then it's limitation of ACPI implementation,
> >>>> I'm going to refactor it to use full APIC ids instead of using bitmap,
> >>>> so that we won't ever run into issue regardless of cpu supported CPU 
> >>>> count.
> >>>>
> >>>>>
> >>>>> Another part of me is hoping that the libvirt developers ask us to
> >>>>> please not do that, so I can use it as argument against exposing the
> >>>>> APIC IDs directly the next time we discuss this.  :)
> >>>>
> >>>> why not try your  /machine/node/N/socket/X/core/Y/thread/Z idea first.
> >>>> It will benefit not only cpu hotplug but also '-numa' and topology
> >>>> description in general.
> >>>>
> >>> have there been any plan/model of the idea? Need to add a new option to
> >>> qemu command?
> >> I suppose we can start with internal default implementation first.
> >>
> >> one way could be
> >>  1. let machine prebuild empty QOM tree 
> >> /machine/node/N/socket/X/core/Y/thread/Z
> >>  2. add node, socket, core, thread properties to CPU and link CPU into 
> >> respective
> >> link created by #1
> >>  
> > Thanks, I hope I can take some time to make some patches to implement
> > it.
> 
> Please give us a few hours to reply. :)
> 
> /machine/node seems too broad a term to me.
> You can't prebuild the full tree, you can only prepare the nodes.
> core[Y]/thread[Z] was previously discussed as syntax.
> 
> The important part to decide on will be what is going to be child<> and
> what link<>. Has anyone played with the Intel Quark platform for
> instance? (Galileo board or upcoming Edison card) On a regular
> mainboard, we would have socket[X] as a link, which might
> point to a child /machine/memory-node[W]/cpu[X]. But if we do so we
> can't reassign it to another memory node - acceptable? With Quark (or
> Qseven modules etc.) there would be a container object rather than the
> /machine itself that has a child instead of a link.
> I guess the memory nodes could still be on the /machine though.
> The other point of discussion between Anthony and me was whether core[Y]
> should be a link<> or child<>, same for thread. I believe a child<> is
> better as it enforces that unrealizing the CPU will unrealize all its
> cores and all its threads in the future.
> 
> More issues may pop up when thinking about it longer than a few minutes.
> But yes, we need to start investigating this, and so far I had other
> priorities like getting the CPUState mess I created cleaned up.
Hi, Igor, Andreas,

  In addition, I would like to know how a user would specify an arbitrary
CPU under the /machine/node/N/socket/X/core/Y/thread/Z scheme. Would it be
-device qemu64,socket=X,core=Y,thread=Z, or a new command line option?

Thanks,
Chen

> 
> Regards,
> Andreas
> 
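
To make "let QEMU calculate APIC ID based on this path" concrete, here is a
minimal sketch in C. It assumes the common x86 convention in which the thread,
core and socket numbers are packed into bit fields, each just wide enough for
its count. The helper names (field_width, apic_id_from_path) are hypothetical
and are not taken from the QEMU tree.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to represent the values 0..count-1. */
static unsigned field_width(unsigned count)
{
    unsigned bits = 0;

    assert(count >= 1);
    while ((1u << bits) < count) {
        bits++;
    }
    return bits;
}

/*
 * Pack socket/core/thread indices into an APIC ID, given the number of
 * cores per socket and threads per core.
 */
static uint32_t apic_id_from_path(unsigned cores, unsigned threads,
                                  unsigned socket, unsigned core,
                                  unsigned thread)
{
    unsigned thread_bits = field_width(threads);
    unsigned core_bits = field_width(cores);

    assert(core < cores && thread < threads);
    return ((uint32_t)socket << (core_bits + thread_bits)) |
           ((uint32_t)core << thread_bits) |
           thread;
}
```

With cores=3 and threads=2, the core field needs 2 bits and the thread field
1 bit, so socket 1 starts at APIC ID 8 and IDs 6 and 7 are never used. This
gap is why APIC IDs are not simply the CPU index, and why it is hard for a
user to supply them by hand.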





Re: [Qemu-devel] Exposing and calculating CPU APIC IDs (was Re: [RFC 1/3] target-i386: moving registers of vmstate from cpu_exec_init() to x86_cpu_realizefn())

2014-02-17 Thread Chen Fan
On Thu, 2014-02-13 at 10:44 +0100, Igor Mammedov wrote:
> On Thu, 13 Feb 2014 14:14:08 +0800
> Chen Fan  wrote:
> 
> > On Tue, 2014-01-21 at 11:10 +0100, Andreas Färber wrote:
> > > Am 21.01.2014 10:51, schrieb Chen Fan:
> > > > On Tue, 2014-01-21 at 10:31 +0100, Igor Mammedov wrote:
> > > >> On Tue, 21 Jan 2014 15:12:45 +0800
> > > >> Chen Fan  wrote:
> > > >>> On Mon, 2014-01-20 at 13:29 +0100, Igor Mammedov wrote:
> > > >>>> On Fri, 17 Jan 2014 17:13:55 -0200
> > > >>>> Eduardo Habkost  wrote:
> > > >>>>> On Wed, Jan 15, 2014 at 03:37:04PM +0100, Igor Mammedov wrote:
> > > >>>>>> I recall there were objections to it since APIC ID contains 
> > > >>>>>> topology
> > > >>>>>> information and it's not trivial for user to get it right.
> > > >>>>>> The last idea that was discussed to fix it was not expose APIC ID 
> > > >>>>>> to
> > > >>>>>> user but rather introduce QOM hierarchy like:
> > > >>>>>>   /machine/node/N/socket/X/core/Y/thread/Z
> > > >>>>>> and use it in user interface as a means to specify an arbitrary CPU
> > > >>>>>> and let QEMU calculate APIC ID based on this path.
> > > >>>>>>
> > > >>>>>> But nobody took on implementing it yet.
> > > >>>>>
> > > >>>>> We're taking so long to get a decent interface implemented, that 
> > > >>>>> part of
> > > >>>>> me is considering exposing the APIC ID directly like suggested 
> > > >>>>> before,
> > > >>>>> and requiring libvirt to calculate topology-aware APIC IDs[1] to
> > > >>>>> properly implement CPU hotplug (and possibly for other tasks).
> > > >>>> If you are speaking about 
> > > >>>> 'qemu will core dump with "-smp 254, sockets=2, cores=3, threads=2"'
> > > >>>> http://patchwork.ozlabs.org/patch/301272/
> > > >>>> bug then it's limitation of ACPI implementation,
> > > >>>> I'm going to refactor it to use full APIC ids instead of using 
> > > >>>> bitmap,
> > > >>>> so that we won't ever run into issue regardless of cpu supported CPU 
> > > >>>> count.
> > > >>>>
> > > >>>>>
> > > >>>>> Another part of me is hoping that the libvirt developers ask us to
> > > >>>>> please not do that, so I can use it as argument against exposing the
> > > >>>>> APIC IDs directly the next time we discuss this.  :)
> > > >>>>
> > > >>>> why not try your  /machine/node/N/socket/X/core/Y/thread/Z idea 
> > > >>>> first.
> > > >>>> It will benefit not only cpu hotplug but also '-numa' and topology
> > > >>>> description in general.
> > > >>>>
> > > >>> have there been any plan/model of the idea? Need to add a new option 
> > > >>> to
> > > >>> qemu command?
> > > >> I suppose we can start with internal default implementation first.
> > > >>
> > > >> one way could be
> > > >>  1. let machine prebuild empty QOM tree 
> > > >> /machine/node/N/socket/X/core/Y/thread/Z
> > > >>  2. add node, socket, core, thread properties to CPU and link CPU into 
> > > >> respective
> > > >> link created by #1
> > > >>  
> > > > Thanks, I hope I can take some time to make some patches to implement
> > > > it.
> > > 
> > > Please give us a few hours to reply. :)
> > > 
> > > /machine/node seems too broad a term to me.
> > > You can't prebuild the full tree, you can only prepare the nodes.
> > > core[Y]/thread[Z] was previously discussed as syntax.
> > > 
> > > The important part to decide on will be what is going to be child<> and
> > > what link<>. Has anyone played with the Intel Quark platform for
> > > instance? (Galileo board or upcoming Edison card) On a regular
> > > mainboard, we would have socket[X] as a link, which might
> > > point to a child /machine/memory-node[W]/cpu[X]. But

Re: [Qemu-devel] [RFC 2/3] target-i386: add -smp X,apics=0x option

2014-02-17 Thread Chen Fan
On Mon, 2014-02-17 at 11:37 -0700, Eric Blake wrote:
> On 01/14/2014 02:27 AM, Chen Fan wrote:
> > This option provides the infrastructure for specifying apicids when
> > booting the VM. For example:
> > 
> >  #boot with apicid 0 and 2:
> >  -smp 2,apics=0xA,maxcpus=4  /* 1010 */
> >  #boot with apicid 1 and 7:
> >  -smp 2,apics=0x41,maxcpus=8 /* 0100 0001 */
> 
> This syntax feels a bit odd when maxcpus is not a multiple of 8, and
> even harder when not a multiple of 4.  I think part of my confusion
> stems from you treating the lsb as the left-most bit, but expect me to
> write in hex where I'm used to the right-most bit being lsb.  Wouldn't
> it be easier to express:
> 
> msb  lsb
> 
> with leading 0s implied as needed, as in:
> 
> 0x5 => 0101 => id 0 (lsb) and id 2 are enabled, regardless of whether
> maxcpus=4 or maxcpus=8
> 0x82 => 1000 0010 => id 1 and id 7 are enabled, regardless of whether
> maxcpus=8 or maxcpus=256
> 
> 0x100000000 => id 32 is enabled
> 
> Or even better, why not reuse existing parsers that take cpu ids
> directly as numbers instead of making me compute a bitmap (as in
> maxcpus=4,id=0,id=2 - although I don't quite know QemuOpts well enough
> to know if you can repeat id= for forming a list of disjoint id numbers)
> 
Thanks for your review, but this form was deprecated. Igor proposed
using -device/device_add to specify the disjoint id numbers.

Thanks,
Chen 

> > @@ -92,6 +93,14 @@ of @var{threads} per cores and the total number of 
> > @var{sockets} can be
> >  specified. Missing values will be computed. If any on the three values is
> >  given, the total number of CPUs @var{n} can be omitted. @var{maxcpus}
> >  specifies the maximum number of hotpluggable CPUs.
> > +@var{apics} specifies the boot bitmap of existed apicid.
> 
> s/existed/existing/
> 
> > +
> > +@example
> > +#specify the boot bitmap of apicid with 0 and 2:
> > +qemu-system-i386 -smp 2,apics=0xA,maxcpus=4  /* 1010 */
> > +#specify the boot bitmap of apicid with 1 and 7:
> > +qemu-system-i386 -smp 2,apics=0x41,maxcpus=8 /* 0100 0001 */
> > +@end example
> 
> These examples would need updating to match my concerns.
> 
> > @@ -1379,6 +1382,9 @@ static QemuOptsList qemu_smp_opts = {
> >  }, {
> >  .name = "maxcpus",
> >  .type = QEMU_OPT_NUMBER,
> > +}, {
> > +.name = "apics",
> > +.type = QEMU_OPT_STRING,
> 
> Why a string with your own ad-hoc parser?  Can't we reuse some of the
> existing parsers that already know how to handle (possibly-disjoint)
> lists of cpu numbers?
> 
> > +if (apics) {
> > +if (strstart(apics, "0x", &apics)) {
> 
> Why not also allow 0X?
> 
> > +if (*apics != '\0') {
> > +int i, count;
> > +int64_t max_apicid = 0;
> > +uint32_t val;
> > +char tmp[2];
> > +
> > +count = strlen(apics);
> > +
> > +for (i = 0; i < count; i++) {
> > +tmp[0] = apics[i];
> > +tmp[1] = '\0';
> > +sscanf(tmp, "%x", &val);
> 
> sscanf is evil.  It has undefined behavior on input overflow (that is,
> if I say 0x1, there is no guarantee what sscanf will
> stick into val).  All the more reason you should be using an existing
> parser which gracefully handles overflow.
> 
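
As an illustration of the two review points above (right-most bit is the lsb,
and overflow must be detected rather than left undefined), here is a minimal
sketch in C. It is not the parser from the patch and not an existing QEMU
helper; parse_id_bitmap and bitmap_to_ids are hypothetical names, and the
64-bit limit is an assumption made to keep the example short. It uses
strtoull(), which accepts both "0x" and "0X" in base 16 and reports overflow
through errno.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a hex bitmap such as "0x5" or "0X82" into *out.
 * Returns false on empty input, trailing junk, a sign, or overflow.
 */
static bool parse_id_bitmap(const char *s, uint64_t *out)
{
    char *end;
    unsigned long long v;

    /* strtoull() silently accepts a leading '-', so reject it up front. */
    if (s == NULL || *s == '\0' || strchr(s, '-') != NULL) {
        return false;
    }

    errno = 0;
    v = strtoull(s, &end, 16);
    if (errno == ERANGE || end == s || *end != '\0' || v > UINT64_MAX) {
        return false;
    }

    *out = (uint64_t)v;
    return true;
}

/*
 * Expand a bitmap into the list of enabled ids, treating bit 0 (the
 * right-most hex bit) as id 0. Returns the number of ids written.
 */
static size_t bitmap_to_ids(uint64_t bitmap, unsigned ids[64])
{
    size_t n = 0;
    unsigned id;

    for (id = 0; id < 64; id++) {
        if (bitmap & (UINT64_C(1) << id)) {
            ids[n++] = id;
        }
    }
    return n;
}
```

Under this convention "0x5" yields ids 0 and 2, and "0x82" yields ids 1 and 7,
independent of maxcpus. A value with more than 16 hex digits is rejected
instead of being truncated.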




