Re: [Qemu-devel] [PATCH V14 00/11] Add support for binding guest numa nodes to host numa nodes

2013-10-20 Thread Wanlong Gao
Hi folks,

Any more comments?

Thanks,
Wanlong Gao

> [full cover letter quoted below in the original posting; quote trimmed]

Re: [Qemu-devel] [PATCH V14 00/11] Add support for binding guest numa nodes to host numa nodes

2013-10-15 Thread Wanlong Gao
Hi folks,

Another week has gone by; could someone pick this up?

Thanks,
Wanlong Gao

> [full cover letter quoted below in the original posting; quote trimmed]

[Qemu-devel] [PATCH V14 00/11] Add support for binding guest numa nodes to host numa nodes

2013-10-07 Thread Wanlong Gao
As you know, QEMU can't direct its memory allocation now, which may cause
cross-node access performance regressions in the guest.
Worse, if PCI passthrough is used, the direct-attached device uses DMA
transfers between the device and the QEMU process, and all pages of the
guest will be pinned by get_user_pages().

KVM_ASSIGN_PCI_DEVICE ioctl
  kvm_vm_ioctl_assign_device()
    => kvm_assign_device()
      => kvm_iommu_map_memslots()
        => kvm_iommu_map_pages()
          => kvm_pin_pages()

So, with a direct-attached device, every guest page's refcount is
incremented, page migration will not work, and neither will AutoNUMA.

So we should set the guest nodes' memory allocation policy before
the pages are actually mapped.

With this patch set, we are able to set the guest nodes' memory policy
like the following:

 -numa node,nodeid=0,cpus=0 \
 -numa mem,size=1024M,policy=membind,host-nodes=0-1 \
 -numa node,nodeid=1,cpus=1 \
 -numa mem,size=1024M,policy=interleave,host-nodes=1

This supports a format like
"policy={default|membind|interleave|preferred},relative=true,host-nodes=N-N".

A QMP command "query-numa" is added to show NUMA info through this API,
and the "info numa" monitor command is converted to use the new
"query-numa" QMP command.
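For illustration, with the two-node configuration above, the converted
monitor command would print output of this shape (illustrative only,
following the existing "info numa" format):

```
(qemu) info numa
2 nodes
node 0 cpus: 0
node 0 size: 1024 MB
node 1 cpus: 1
node 1 size: 1024 MB
```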

This version removes "set-mem-policy" qmp and hmp commands temporarily
as Marcelo and Paolo suggested.

V1->V2:
change to use QemuOpts in numa options (Paolo)
handle Error in mpol parser (Paolo)
change qmp command format to be like mem-policy=membind,mem-hostnode=0-1
(Paolo)
V2->V3:
also handle Error in cpus parser (5/10)
split out common parser from cpus and hostnode parser (Bandan 6/10)
V3->V4:
rebase to request for comments
V4->V5:
use OptVisitor and split -numa option (Paolo)
 - s/set-mpol/set-mem-policy (Andreas)
 - s/mem-policy/policy
 - s/mem-hostnode/host-nodes
fix hmp command process after error (Luiz)
add qmp command query-numa and convert info numa to it (Luiz)
V5->V6:
remove tabs in json file (Laszlo, Paolo)
add back "-numa node,mem=xxx" as legacy (Paolo)
change cpus and host-nodes to array (Laszlo, Eric)
change "nodeid" to "uint16"
add NumaMemPolicy enum type (Eric)
rebased on Laszlo's "OptsVisitor: support / flatten integer ranges for 
repeating options" patch set, thanks for Laszlo's help
V6->V7:
change UInt16 to uint16 (Laszlo)
fix a typo in adding qmp command set-mem-policy
V7-V8:
rebase to current master with Laszlo's V2 of OptsVisitor patch set
fix an adding white space line error
V8->V9:
rebase to current master
check if total numa memory size is equal to ram_size (Paolo)
add comments to the OptsVisitor stuff in qapi-schema.json (Eric, Laszlo)
replace the use of numa_num_configured_nodes() (Andrew)
avoid abusing the fact i==nodeid (Andrew)
V9->V10:
rebase to current master
remove libnuma (Andrew)
MAX_NODES=64 -> MAX_NODES=128 since libnuma selected 128 (Andrew)
use MAX_NODES instead of MAX_CPUMASK_BITS for host_mem bitmap (Andrew)
remove a useless clear_bit() operation (Andrew)
V10->V11:
rebase to current master
fix "maxnode" argument of mbind(2)
V11->V12:
rebase to current master
split patch 02/11 of V11 (Eduardo)
add some max value check (Eduardo)
split MAX_NODES change patch (Eduardo)
V12->V13:
rebase to current master
thanks for Luiz's review (Luiz)
doc hmp command set-mem-policy (Luiz)
rename: NUMAInfo -> NUMANode (Luiz)
V13->V14:
remove "set-mem-policy" qmp and hmp commands (Marcelo, Paolo)


*I hope this can catch the 1.7 release train.*

Thanks,
Wanlong Gao

Wanlong Gao (11):
  NUMA: move numa related code to new file numa.c
  NUMA: check if the total numa memory size is equal to ram_size
  NUMA: Add numa_info structure to contain numa nodes info
  NUMA: convert -numa option to use OptsVisitor
  NUMA: introduce NumaMemOptions
  NUMA: add "-numa mem," options
  NUMA: expand MAX_NODES from 64 to 128
  NUMA: parse guest numa nodes memory policy
  NUMA: set guest numa nodes memory policy
  NUMA: add qmp command query-numa
  NUMA: convert hmp command info_numa to use qmp command query_numa

 Makefile.target |   2 +-
 cpus.c  |  14 --
 hmp.c   |  57 +++
 hmp.h   |   1 +
 hw/i386/pc.c|   4 +-
 include/sysemu/cpus.h   |   1 -
 include/sysemu/sysemu.h |  18 ++-
 monitor.c   |  21 +--
 numa.c  | 395 
 qapi-schema.json| 112 ++
 qemu-options.hx |   6 +-
 qmp-commands.hx |  48 ++
 vl.c| 160 +++-
 13 files changed, 654 insertions(+), 185 deletions(-)
 create mode 100644 numa.c

-- 
1.8.4.474.g128a96c