Re: read_barrier_depends() usage in vhost.c

2019-12-19 Thread Jason Wang


On 2019/12/18 5:19 PM, Herbert Xu wrote:

Will Deacon  wrote:

--->8

// drivers/vhost/vhost.c
static int get_indirect(struct vhost_virtqueue *vq,
   struct iovec iov[], unsigned int iov_size,
   unsigned int *out_num, unsigned int *in_num,
   struct vhost_log *log, unsigned int *log_num,
   struct vring_desc *indirect)
{
   [...]

   /* We will use the result as an address to read from, so most
* architectures only need a compiler barrier here. */
   read_barrier_depends();

--->8

Unfortunately, although the barrier is commented (hurrah!), it's not
particularly enlightening about the accesses making up the dependency
chain, and I don't understand the supposed need for a compiler barrier
either (read_barrier_depends() doesn't generally provide this).

Does anybody know which accesses are being ordered here? Usually you'd need
a READ_ONCE()/rcu_dereference() beginning the chain, but I haven't managed
to find one...

I think what it's trying to separate is using indirect->addr as a
base and then reading from that through copy_from_iter.
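
For readers less familiar with address dependencies, the pattern under
discussion is roughly the following (a hand-written sketch in kernel style,
not the actual vhost code paths; the guest/vhost split and the variable
names are illustrative only):

--->8

/* Guest: publish a descriptor, then make it visible via the index. */
vq->desc[head] = filled_desc;      /* desc.addr points at an indirect table */
smp_wmb();
WRITE_ONCE(vq->avail->idx, avail_idx + 1);

/* vhost: consume it. */
avail_idx = READ_ONCE(vq->avail->idx);
smp_rmb();                  /* orders the index read vs. the descriptor read */
desc = vq->desc[head];
read_barrier_depends();     /* orders the desc.addr read vs. the dependent
                             * read *through* desc.addr below; a no-op on
                             * everything except Alpha */
copy_from_user(&indirect, (void __user *)(unsigned long)desc.addr,
               sizeof(indirect));

--->8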

Cheers,



The question is that there's already an smp_rmb() earlier in vhost_get_vq_desc();
isn't that sufficient for this?


Thanks


[PATCH 4.9 115/199] virtio-balloon: fix managed page counts when migrating pages between zones

2019-12-19 Thread Greg Kroah-Hartman
From: David Hildenbrand 

commit 63341ab03706e11a31e3dd8ccc0fbc9beaf723f0 upstream.

In case we have to migrate a balloon page to a new page of another zone, the
managed page count of both zones is wrong. Paired with memory offlining
(which will adjust the managed page count), we can trigger kernel crashes
and all kinds of different symptoms.
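
The fixup the patch title describes is, in spirit, the following in the
balloon page-migration path (a minimal sketch paraphrased from this
description, not the literal upstream diff):

/* A ballooned page is migrated from 'page' to 'newpage'. If the two sit in
 * different zones, both zones' managed page counts must be corrected: the
 * source zone gets its page back, the destination zone now holds the
 * ballooned (unavailable) page. */
if (page_zone(page) != page_zone(newpage)) {
	adjust_managed_page_count(page, 1);
	adjust_managed_page_count(newpage, -1);
}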

One way to reproduce:
1. Start a QEMU guest with 4GB, no NUMA
2. Hotplug a 1GB DIMM and online the memory to ZONE_NORMAL
3. Inflate the balloon to 1GB
4. Unplug the DIMM (be quick, otherwise unmovable data ends up on it)
5. Observe /proc/zoneinfo
  Node 0, zone   Normal
pages free 16810
  min  24848885473806
  low  18471592959183339
  high 36918337032892872
  spanned  262144
  present  262144
  managed  18446744073709533486
6. Do anything that requires some memory (e.g., inflate the balloon some
more). The OOM goes crazy and the system crashes
  [  238.324946] Out of memory: Killed process 537 (login) total-vm:27584kB, 
anon-rss:860kB, file-rss:0kB, shmem-rss:00
  [  238.338585] systemd invoked oom-killer: 
gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
  [  238.339420] CPU: 0 PID: 1 Comm: systemd Tainted: G  D W 
5.4.0-next-20191204+ #75
  [  238.340139] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu4
  [  238.341121] Call Trace:
  [  238.341337]  dump_stack+0x8f/0xd0
  [  238.341630]  dump_header+0x61/0x5ea
  [  238.341942]  oom_kill_process.cold+0xb/0x10
  [  238.342299]  out_of_memory+0x24d/0x5a0
  [  238.342625]  __alloc_pages_slowpath+0xd12/0x1020
  [  238.343024]  __alloc_pages_nodemask+0x391/0x410
  [  238.343407]  pagecache_get_page+0xc3/0x3a0
  [  238.343757]  filemap_fault+0x804/0xc30
  [  238.344083]  ? ext4_filemap_fault+0x28/0x42
  [  238.34]  ext4_filemap_fault+0x30/0x42
  [  238.344789]  __do_fault+0x37/0x1a0
  [  238.345087]  __handle_mm_fault+0x104d/0x1ab0
  [  238.345450]  handle_mm_fault+0x169/0x360
  [  238.345790]  do_user_addr_fault+0x20d/0x490
  [  238.346154]  do_page_fault+0x31/0x210
  [  238.346468]  async_page_fault+0x43/0x50
  [  238.346797] RIP: 0033:0x7f47eba4197e
  [  238.347110] Code: Bad RIP value.
  [  238.347387] RSP: 002b:7ffd7c0c1890 EFLAGS: 00010293
  [  238.347834] RAX: 0002 RBX: 55d196a20a20 RCX: 
7f47eba4197e
  [  238.348437] RDX: 0033 RSI: 7ffd7c0c18c0 RDI: 
0004
  [  238.349047] RBP: 7ffd7c0c1c20 R08:  R09: 
0033
  [  238.349660] R10:  R11: 0293 R12: 
0001
  [  238.350261] R13:  R14:  R15: 
7ffd7c0c18c0
  [  238.350878] Mem-Info:
  [  238.351085] active_anon:3121 inactive_anon:51 isolated_anon:0
  [  238.351085]  active_file:12 inactive_file:7 isolated_file:0
  [  238.351085]  unevictable:0 dirty:0 writeback:0 unstable:0
  [  238.351085]  slab_reclaimable:5565 slab_unreclaimable:10170
  [  238.351085]  mapped:3 shmem:111 pagetables:155 bounce:0
  [  238.351085]  free:720717 free_pcp:2 free_cma:0
  [  238.353757] Node 0 active_anon:12484kB inactive_anon:204kB 
active_file:48kB inactive_file:28kB unevictable:0kB iss
  [  238.355979] Node 0 DMA free:11556kB min:36kB low:48kB high:60kB 
reserved_highatomic:0KB active_anon:152kB inactivB
  [  238.358345] lowmem_reserve[]: 0 2955 2884 2884 2884
  [  238.358761] Node 0 DMA32 free:2677864kB min:7004kB low:10028kB 
high:13052kB reserved_highatomic:0KB active_anon:0B
  [  238.361202] lowmem_reserve[]: 0 0 72057594037927865 72057594037927865 
72057594037927865
  [  238.361888] Node 0 Normal free:193448kB min:99395541895224kB 
low:73886371836733356kB high:147673348131571488kB reB
  [  238.364765] lowmem_reserve[]: 0 0 0 0 0
  [  238.365101] Node 0 DMA: 7*4kB (U) 5*8kB (UE) 6*16kB (UME) 2*32kB (UM) 
1*64kB (U) 2*128kB (UE) 3*256kB (UME) 2*512B
  [  238.366379] Node 0 DMA32: 0*4kB 1*8kB (U) 2*16kB (UM) 2*32kB (UM) 2*64kB 
(UM) 1*128kB (U) 1*256kB (U) 1*512kB (U)B
  [  238.367654] Node 0 Normal: 1985*4kB (UME) 1321*8kB (UME) 844*16kB (UME) 
524*32kB (UME) 300*64kB (UME) 138*128kB (B
  [  238.369184] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 
hugepages_size=2048kB
  [  238.369915] 130 total pagecache pages
  [  238.370241] 0 pages in swap cache
  [  238.370533] Swap cache stats: add 0, delete 0, find 0/0
  [  238.370981] Free swap  = 0kB
  [  238.371239] Total swap = 0kB
  [  238.371488] 1048445 pages RAM
  [  238.371756] 0 pages HighMem/MovableOnly
  [  238.372090] 306992 pages reserved
  [  238.372376] 0 pages cma reserved
  [  238.372661] 0 pages hwpoisoned

In another instance (older kernel), I was able to observe this
(negative page count :/):
  [  180.896971] Offlined Pages 32768
  [  182.667462] Offlined Pages 32768
  [  184.408117] Offlined Pages 32768
  [  186.026321] Offlined Pages 32768
  [  187.684861] Offlined Pages 32768
  [  189.227013] Offlined P

[PATCH 4.4 098/162] virtio-balloon: fix managed page counts when migrating pages between zones

2019-12-19 Thread Greg Kroah-Hartman
From: David Hildenbrand 

commit 63341ab03706e11a31e3dd8ccc0fbc9beaf723f0 upstream.

In case we have to migrate a balloon page to a new page of another zone, the
managed page count of both zones is wrong. Paired with memory offlining
(which will adjust the managed page count), we can trigger kernel crashes
and all kinds of different symptoms.

One way to reproduce:
1. Start a QEMU guest with 4GB, no NUMA
2. Hotplug a 1GB DIMM and online the memory to ZONE_NORMAL
3. Inflate the balloon to 1GB
4. Unplug the DIMM (be quick, otherwise unmovable data ends up on it)
5. Observe /proc/zoneinfo
  Node 0, zone   Normal
pages free 16810
  min  24848885473806
  low  18471592959183339
  high 36918337032892872
  spanned  262144
  present  262144
  managed  18446744073709533486
6. Do anything that requires some memory (e.g., inflate the balloon some
more). The OOM goes crazy and the system crashes
  [  238.324946] Out of memory: Killed process 537 (login) total-vm:27584kB, 
anon-rss:860kB, file-rss:0kB, shmem-rss:00
  [  238.338585] systemd invoked oom-killer: 
gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
  [  238.339420] CPU: 0 PID: 1 Comm: systemd Tainted: G  D W 
5.4.0-next-20191204+ #75
  [  238.340139] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu4
  [  238.341121] Call Trace:
  [  238.341337]  dump_stack+0x8f/0xd0
  [  238.341630]  dump_header+0x61/0x5ea
  [  238.341942]  oom_kill_process.cold+0xb/0x10
  [  238.342299]  out_of_memory+0x24d/0x5a0
  [  238.342625]  __alloc_pages_slowpath+0xd12/0x1020
  [  238.343024]  __alloc_pages_nodemask+0x391/0x410
  [  238.343407]  pagecache_get_page+0xc3/0x3a0
  [  238.343757]  filemap_fault+0x804/0xc30
  [  238.344083]  ? ext4_filemap_fault+0x28/0x42
  [  238.34]  ext4_filemap_fault+0x30/0x42
  [  238.344789]  __do_fault+0x37/0x1a0
  [  238.345087]  __handle_mm_fault+0x104d/0x1ab0
  [  238.345450]  handle_mm_fault+0x169/0x360
  [  238.345790]  do_user_addr_fault+0x20d/0x490
  [  238.346154]  do_page_fault+0x31/0x210
  [  238.346468]  async_page_fault+0x43/0x50
  [  238.346797] RIP: 0033:0x7f47eba4197e
  [  238.347110] Code: Bad RIP value.
  [  238.347387] RSP: 002b:7ffd7c0c1890 EFLAGS: 00010293
  [  238.347834] RAX: 0002 RBX: 55d196a20a20 RCX: 
7f47eba4197e
  [  238.348437] RDX: 0033 RSI: 7ffd7c0c18c0 RDI: 
0004
  [  238.349047] RBP: 7ffd7c0c1c20 R08:  R09: 
0033
  [  238.349660] R10:  R11: 0293 R12: 
0001
  [  238.350261] R13:  R14:  R15: 
7ffd7c0c18c0
  [  238.350878] Mem-Info:
  [  238.351085] active_anon:3121 inactive_anon:51 isolated_anon:0
  [  238.351085]  active_file:12 inactive_file:7 isolated_file:0
  [  238.351085]  unevictable:0 dirty:0 writeback:0 unstable:0
  [  238.351085]  slab_reclaimable:5565 slab_unreclaimable:10170
  [  238.351085]  mapped:3 shmem:111 pagetables:155 bounce:0
  [  238.351085]  free:720717 free_pcp:2 free_cma:0
  [  238.353757] Node 0 active_anon:12484kB inactive_anon:204kB 
active_file:48kB inactive_file:28kB unevictable:0kB iss
  [  238.355979] Node 0 DMA free:11556kB min:36kB low:48kB high:60kB 
reserved_highatomic:0KB active_anon:152kB inactivB
  [  238.358345] lowmem_reserve[]: 0 2955 2884 2884 2884
  [  238.358761] Node 0 DMA32 free:2677864kB min:7004kB low:10028kB 
high:13052kB reserved_highatomic:0KB active_anon:0B
  [  238.361202] lowmem_reserve[]: 0 0 72057594037927865 72057594037927865 
72057594037927865
  [  238.361888] Node 0 Normal free:193448kB min:99395541895224kB 
low:73886371836733356kB high:147673348131571488kB reB
  [  238.364765] lowmem_reserve[]: 0 0 0 0 0
  [  238.365101] Node 0 DMA: 7*4kB (U) 5*8kB (UE) 6*16kB (UME) 2*32kB (UM) 
1*64kB (U) 2*128kB (UE) 3*256kB (UME) 2*512B
  [  238.366379] Node 0 DMA32: 0*4kB 1*8kB (U) 2*16kB (UM) 2*32kB (UM) 2*64kB 
(UM) 1*128kB (U) 1*256kB (U) 1*512kB (U)B
  [  238.367654] Node 0 Normal: 1985*4kB (UME) 1321*8kB (UME) 844*16kB (UME) 
524*32kB (UME) 300*64kB (UME) 138*128kB (B
  [  238.369184] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 
hugepages_size=2048kB
  [  238.369915] 130 total pagecache pages
  [  238.370241] 0 pages in swap cache
  [  238.370533] Swap cache stats: add 0, delete 0, find 0/0
  [  238.370981] Free swap  = 0kB
  [  238.371239] Total swap = 0kB
  [  238.371488] 1048445 pages RAM
  [  238.371756] 0 pages HighMem/MovableOnly
  [  238.372090] 306992 pages reserved
  [  238.372376] 0 pages cma reserved
  [  238.372661] 0 pages hwpoisoned

In another instance (older kernel), I was able to observe this
(negative page count :/):
  [  180.896971] Offlined Pages 32768
  [  182.667462] Offlined Pages 32768
  [  184.408117] Offlined Pages 32768
  [  186.026321] Offlined Pages 32768
  [  187.684861] Offlined Pages 32768
  [  189.227013] Offlined P

Re: [PATCH v10 00/11] x86: PIE support to extend KASLR randomization

2019-12-19 Thread Thomas Garnier
On Thu, Dec 19, 2019 at 5:35 AM Peter Zijlstra  wrote:
>
> On Wed, Dec 04, 2019 at 04:09:37PM -0800, Thomas Garnier wrote:
> > Minor changes based on feedback and rebase from v9.
> >
> > Splitting the previous series in two. This part contains assembly code
> > changes required for PIE but without any direct dependencies on the
> > rest of the patchset.
>
> ISTR suggesting you add an objtool pass that verifies there are no
> absolute text references left. Otherwise we'll forever be chasing that
> last one...

Correct, I have a reference in the changelog saying I will tackle it in
the next patchset: we still have non-PIE references in other places, but
the fix is a bit more complex (for example per-cpu) and is not included
in this phase. I will add a better explanation in the next message for
patch v11.
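
For context, the class of leftover such an objtool pass would need to flag
is the difference between an absolute text reference and its
position-independent form. Illustrative C with inline assembly, not taken
from the series:

extern char some_symbol[];

static void *absolute_ref(void)
{
	void *p;

	/* Encodes the link-time address of some_symbol; not PIE-safe. */
	asm("movq $some_symbol, %0" : "=r"(p));
	return p;
}

static void *rip_relative_ref(void)
{
	void *p;

	/* RIP-relative addressing; works at any load address. */
	asm("leaq some_symbol(%%rip), %0" : "=r"(p));
	return p;
}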


Re: [PATCH v3 4/5] iommu: intel: Use generic_iommu_put_resv_regions()

2019-12-19 Thread Joerg Roedel
On Thu, Dec 19, 2019 at 01:47:47PM +0100, Thierry Reding wrote:
> On Thu, Dec 19, 2019 at 09:53:22AM +0800, Lu Baolu wrote:
> > Please tweak the title to
> > 
> > "iommu/vt-d: Use generic_iommu_put_resv_regions()"
> > 
> > then,
> > 
> > Acked-by: Lu Baolu 
> > 
> > Best regards,
> > baolu
> 
> Joerg, do you want me to resend with this change or is it more efficient
> if you fix up the subject while applying?

No need to re-send, I'll fix it up in this patch and in the others too.


Joerg


CfP VHPC20: HPC Containers-Kubernetes

2019-12-19 Thread VHPC 20

CALL FOR PAPERS

15th Workshop on Virtualization in High-Performance Cloud Computing
(VHPC 20) held in conjunction with the International Supercomputing
Conference - High Performance, June 21-25, 2020, Frankfurt, Germany.
(Springer LNCS Proceedings)




Date: June 25, 2020
Workshop URL: vhpc[dot]org


Abstract registration Deadline: Jan 31st, 2020
Paper Submission Deadline: Apr 05th, 2020
Springer LNCS



Call for Papers


Containers and virtualization technologies constitute key enabling
factors for flexible resource management in modern data centers, and
particularly in cloud environments. Cloud providers need to manage
complex infrastructures in a seamless fashion to support the highly
dynamic and heterogeneous workloads and hosted applications customers
deploy. Similarly, HPC environments have been increasingly adopting
techniques that enable flexible management of vast computing and
networking resources, close to marginal provisioning cost, which is
unprecedented in the history of scientific and commercial computing.
Most recently, Function as a Service (FaaS) and serverless computing,
utilizing lightweight VMs and containers, widen the spectrum of
applications that can be deployed in a cloud environment, especially
in an HPC context. Here, HPC-provided services can become accessible
to distributed workloads outside of large cluster environments.

Various virtualization-containerization technologies contribute to the
overall picture in different ways: machine virtualization, with its
capability to enable consolidation of multiple underutilized servers
with heterogeneous software and operating systems (OSes), and its
capability to live-migrate a fully operating virtual machine (VM)
with a very short downtime, enables novel and dynamic ways to manage
physical servers; OS-level virtualization (i.e., containerization),
with its capability to isolate multiple user-space environments and
to allow for their coexistence within the same OS kernel, promises to
provide many of the advantages of machine virtualization with high
levels of responsiveness and performance; lastly, unikernels provide
for many virtualization benefits with a minimized OS/library surface.
I/O Virtualization in turn allows physical network interfaces to take
traffic from multiple VMs or containers; network virtualization, with
its capability to create logical network overlays that are independent
of the underlying physical topology is furthermore enabling
virtualization of HPC infrastructures.


Publication


Accepted papers will be published in a Springer LNCS proceedings
volume.


Topics of Interest


The VHPC program committee solicits original, high-quality submissions
related to virtualization across the entire software stack with a
special focus on the intersection of HPC, containers-virtualization
and the cloud.


Major Topics:
- HPC workload orchestration (Kubernetes)
- Kubernetes HPC batch
- HPC Container Environments Landscape
- HW Heterogeneity
- Container ecosystem (Docker alternatives)
- Networking
- Lightweight Virtualization
- Unikernels / LibOS
- State-of-the-art processor virtualization (RISC-V, EPI)
- Containerizing HPC Stacks/Apps/Codes:
  Climate model containers


Each major topic encompasses design/architecture, management,
performance management, modeling, and configuration/tooling.
Specifically, we invite papers that deal with the following topics:

- HPC orchestration (Kubernetes)
   - Virtualizing Kubernetes for HPC
   - Deployment paradigms
   - Multitenancy
   - Serverless
   - Declarative data center integration
   - Network provisioning
   - Storage
   - OCI i.a. images
   - Isolation/security
- HW Accelerators, including GPUs, FPGAs, AI, and others
   - State-of-practice/art, including transition to cloud
   - Frameworks, system software
   - Programming models, runtime systems, and APIs to facilitate cloud
 adoption
   - Edge use-cases
   - Application adaptation, success stories
- Kubernetes Batch
   - Scheduling, job management
   - Execution paradigm - workflow
   - Data management
   - Deployment paradigm
   - Multi-cluster/scalability
   - Performance improvement
   - Workflow / execution paradigm
- Podman: end-to-end Docker alternative container environment & use-cases
   - Creating, Running containers as non-root (rootless)
   - Running rootless containers with MPI
   - Container live migration
   - Running containers in restricted environments without setuid
- Networking
   - Software defined networks and network virtualization
   - New virtualization NICs / Nitro-like ASICs for the data center?
   - Kubernetes SDN policy (Calico i.a.)
   - Kubernetes network provisioning (Flannel i.a.)
- Lightweight Virtualization
   - Micro VMMs (Rust-VMM, Firecracker, solo5)
   - Xen
   - Nitro hypervisor (KVM)
   - RVirt
   - Cloud Hypervisor
- Unikernels / LibOS
- HPC S

Re: [PATCH v10 00/11] x86: PIE support to extend KASLR randomization

2019-12-19 Thread Peter Zijlstra
On Wed, Dec 04, 2019 at 04:09:37PM -0800, Thomas Garnier wrote:
> Minor changes based on feedback and rebase from v9.
> 
> Splitting the previous series in two. This part contains assembly code
> changes required for PIE but without any direct dependencies on the
> rest of the patchset.

ISTR suggesting you add an objtool pass that verifies there are no
absolute text references left. Otherwise we'll forever be chasing that
last one...


Re: [PATCH v3 4/5] iommu: intel: Use generic_iommu_put_resv_regions()

2019-12-19 Thread Thierry Reding
On Thu, Dec 19, 2019 at 09:53:22AM +0800, Lu Baolu wrote:
> Please tweak the title to
> 
> "iommu/vt-d: Use generic_iommu_put_resv_regions()"
> 
> then,
> 
> Acked-by: Lu Baolu 
> 
> Best regards,
> baolu

Joerg, do you want me to resend with this change or is it more efficient
if you fix up the subject while applying?

Thierry

> On 12/18/19 9:42 PM, Thierry Reding wrote:
> > From: Thierry Reding 
> > 
> > Use the new standard function instead of open-coding it.
> > 
> > Cc: David Woodhouse 
> > Signed-off-by: Thierry Reding 
> > ---
> >   drivers/iommu/intel-iommu.c | 11 +--
> >   1 file changed, 1 insertion(+), 10 deletions(-)
> > 
> > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> > index 42966611a192..a6d5b7cf9183 100644
> > --- a/drivers/iommu/intel-iommu.c
> > +++ b/drivers/iommu/intel-iommu.c
> > @@ -5744,15 +5744,6 @@ static void intel_iommu_get_resv_regions(struct 
> > device *device,
> > list_add_tail(&reg->list, head);
> >   }
> > -static void intel_iommu_put_resv_regions(struct device *dev,
> > -struct list_head *head)
> > -{
> > -   struct iommu_resv_region *entry, *next;
> > -
> > -   list_for_each_entry_safe(entry, next, head, list)
> > -   kfree(entry);
> > -}
> > -
> >   int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct device 
> > *dev)
> >   {
> > struct device_domain_info *info;
> > @@ -5987,7 +5978,7 @@ const struct iommu_ops intel_iommu_ops = {
> > .add_device = intel_iommu_add_device,
> > .remove_device  = intel_iommu_remove_device,
> > .get_resv_regions   = intel_iommu_get_resv_regions,
> > -   .put_resv_regions   = intel_iommu_put_resv_regions,
> > +   .put_resv_regions   = generic_iommu_put_resv_regions,
> > .apply_resv_region  = intel_iommu_apply_resv_region,
> > .device_group   = pci_device_group,
> > .dev_has_feat   = intel_iommu_dev_has_feat,
> > 



Re: [PATCH 2/5] KVM: arm64: Implement PV_LOCK_FEATURES call

2019-12-19 Thread yezengruan
Hi Steve,

On 2019/12/17 22:28, Steven Price wrote:
> On Tue, Dec 17, 2019 at 01:55:46PM +, yezengr...@huawei.com wrote:
>> From: Zengruan Ye 
>>
>> This provides a mechanism for querying which paravirtualized lock
>> features are available in this hypervisor.
>>
>> Also add the header file which defines the ABI for the paravirtualized
>> lock features we're about to add.
>>
>> Signed-off-by: Zengruan Ye 
>> ---
>>  arch/arm64/include/asm/pvlock-abi.h | 16 
>>  include/linux/arm-smccc.h   | 13 +
>>  virt/kvm/arm/hypercalls.c   |  3 +++
>>  3 files changed, 32 insertions(+)
>>  create mode 100644 arch/arm64/include/asm/pvlock-abi.h
>>
>> diff --git a/arch/arm64/include/asm/pvlock-abi.h 
>> b/arch/arm64/include/asm/pvlock-abi.h
>> new file mode 100644
>> index ..06e0c3d7710a
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/pvlock-abi.h
>> @@ -0,0 +1,16 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * Copyright(c) 2019 Huawei Technologies Co., Ltd
>> + * Author: Zengruan Ye 
>> + */
>> +
>> +#ifndef __ASM_PVLOCK_ABI_H
>> +#define __ASM_PVLOCK_ABI_H
>> +
>> +struct pvlock_vcpu_state {
>> +__le64 preempted;
> 
> Somewhere we need to document when 'preempted' is set. It looks like it's a
> 1-bit field from the later patches.

Good point, I'll document this in the pvlock doc.
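
As a strawman for that documentation, the guest-side consumer would look
something like the sketch below (names such as pv_lock_state and
pv_vcpu_is_preempted are made up for illustration; the real series defines
how the shared page is actually mapped per vcpu):

static struct pvlock_vcpu_state __percpu *pv_lock_state;

static bool pv_vcpu_is_preempted(int cpu)
{
	struct pvlock_vcpu_state *st = per_cpu_ptr(pv_lock_state, cpu);

	/* Only bit 0 carries meaning today; the rest is reserved/padding. */
	return !!(le64_to_cpu(READ_ONCE(st->preempted)) & 1);
}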

> 
>> +/* Structure must be 64 byte aligned, pad to that size */
>> +u8 padding[56];
>> +} __packed;
>> +
>> +#endif
>> diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
>> index 59494df0f55b..59e65a951959 100644
>> --- a/include/linux/arm-smccc.h
>> +++ b/include/linux/arm-smccc.h
>> @@ -377,5 +377,18 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, 
>> unsigned long a1,
>> ARM_SMCCC_OWNER_STANDARD_HYP,\
>> 0x21)
>>  
>> +/* Paravirtualised lock calls */
>> +#define ARM_SMCCC_HV_PV_LOCK_FEATURES   \
>> +ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
>> +   ARM_SMCCC_SMC_64,\
>> +   ARM_SMCCC_OWNER_STANDARD_HYP,\
>> +   0x40)
>> +
>> +#define ARM_SMCCC_HV_PV_LOCK_PREEMPTED  \
>> +ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
>> +   ARM_SMCCC_SMC_64,\
>> +   ARM_SMCCC_OWNER_STANDARD_HYP,\
>> +   0x41)
>> +
>>  #endif /*__ASSEMBLY__*/
>>  #endif /*__LINUX_ARM_SMCCC_H*/
>> diff --git a/virt/kvm/arm/hypercalls.c b/virt/kvm/arm/hypercalls.c
>> index 550dfa3e53cd..ff13871fd85a 100644
>> --- a/virt/kvm/arm/hypercalls.c
>> +++ b/virt/kvm/arm/hypercalls.c
>> @@ -52,6 +52,9 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
>>  case ARM_SMCCC_HV_PV_TIME_FEATURES:
>>  val = SMCCC_RET_SUCCESS;
>>  break;
>> +case ARM_SMCCC_HV_PV_LOCK_FEATURES:
>> +val = SMCCC_RET_SUCCESS;
>> +break;
> 
> Ideally you wouldn't report that PV_LOCK_FEATURES exists until the
> actual hypercalls are wired up to avoid breaking a bisect.

Thanks for pointing it out to me! I'll update the code.

> 
> Steve
> 
>>  }
>>  break;
>>  case ARM_SMCCC_HV_PV_TIME_FEATURES:
>> -- 
>> 2.19.1
>>
>>
> 
> .
> 

Thanks,

Zengruan




Re: [PATCH 1/5] KVM: arm64: Document PV-lock interface

2019-12-19 Thread yezengruan
Hi Steve,

On 2019/12/17 22:21, Steven Price wrote:
> On Tue, Dec 17, 2019 at 01:55:45PM +, yezengr...@huawei.com wrote:
>> From: Zengruan Ye 
>>
>> Introduce a paravirtualization interface for KVM/arm64 to obtain whether the
>> vcpu is currently running or not.
>>
>> A hypercall interface is provided for the guest to interrogate the
>> hypervisor's support for this interface and the location of the shared
>> memory structures.
>>
>> Signed-off-by: Zengruan Ye 
>> ---
>>  Documentation/virt/kvm/arm/pvlock.rst | 31 +++
>>  1 file changed, 31 insertions(+)
>>  create mode 100644 Documentation/virt/kvm/arm/pvlock.rst
>>
>> diff --git a/Documentation/virt/kvm/arm/pvlock.rst 
>> b/Documentation/virt/kvm/arm/pvlock.rst
>> new file mode 100644
>> index ..eec0c36edf17
>> --- /dev/null
>> +++ b/Documentation/virt/kvm/arm/pvlock.rst
>> @@ -0,0 +1,31 @@
>> +.. SPDX-License-Identifier: GPL-2.0
>> +
>> +Paravirtualized lock support for arm64
>> +==
>> +
>> +KVM/arm64 provides some hypervisor service calls to support a paravirtualized
>> +guest obtaining whether the vcpu is currently running or not.
>> +
>> +Two new SMCCC compatible hypercalls are defined:
>> +
>> +* PV_LOCK_FEATURES:   0xC540
>> +* PV_LOCK_PREEMPTED:  0xC541
> 
> These values are in the "Standard Hypervisor Service Calls" section of
> SMCCC - so is there a document that describes this features such that
> other OSes or hypervisors can implement it? I'm also not entirely sure
> of the process of ensuring that the IDs picked are non-conflicting.
> 
> Otherwise if this is a KVM specific interface this should probably
> belong within the "Vendor Specific Hypervisor Service Calls" section
> along with some probing that the hypervisor is actually KVM. Although I
> don't see anything KVM specific.

Thanks for pointing it out to me! Actually, I also don't see any documents
or KVM specific that describes this features. The values in the "Vendor
Specific Hypervisor Service Calls" section may be more appropriate, such as
the following

* PV_LOCK_FEATURES:   0xC620
* PV_LOCK_PREEMPTED:  0xC621

Please let me know if you have any suggestions.
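
Whichever range the IDs end up in, the guest would presumably probe for them
the same way other SMCCC-based PV features are probed. A sketch only (the
macro names are the ones proposed in this series; the helper has_pv_lock is
made up):

static bool has_pv_lock(void)
{
	struct arm_smccc_res res;

	/* First ask whether the PV_LOCK_FEATURES call itself is implemented. */
	arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
			     ARM_SMCCC_HV_PV_LOCK_FEATURES, &res);
	if (res.a0 != SMCCC_RET_SUCCESS)
		return false;

	/* Then query support for the PV_LOCK_PREEMPTED call. */
	arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_LOCK_FEATURES,
			     ARM_SMCCC_HV_PV_LOCK_PREEMPTED, &res);

	return res.a0 == SMCCC_RET_SUCCESS;
}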

> 
>> +
>> +The existence of the PV_LOCK hypercall should be probed using the SMCCC 1.1
>> +ARCH_FEATURES mechanism before calling it.
>> +
>> +PV_LOCK_FEATURES
>> += ==
>> +Function ID:  (uint32)0xC540
>> +PV_call_id:   (uint32)The function to query for support.
>> +Return value: (int64) NOT_SUPPORTED (-1) or SUCCESS (0) if the 
>> relevant
>> +  PV-lock feature is supported by the 
>> hypervisor.
>> += ==
>> +
>> +PV_LOCK_PREEMPTED
>> += ==
>> +Function ID:  (uint32)0xC541
>> +Return value: (int64) NOT_SUPPORTED (-1) or SUCCESS (0) if the IPA 
>> of
>> +  this vcpu's pv data structure is configured by
>> +  the hypervisor.
>> += ==
> 
> From the code it looks like there's another argument for this SMC - the
> physical address (or IPA) of a struct pvlock_vcpu_state. This structure
> also needs to be described as it is part of the ABI.

Will update.

> 
> Steve
> 
> .
> 

Thanks,

Zengruan




Re: [PATCH net-next v3 00/11] VSOCK: add vsock_test test suite

2019-12-19 Thread Stefan Hajnoczi
On Wed, Dec 18, 2019 at 07:06:57PM +0100, Stefano Garzarella wrote:
> The vsock_diag.ko module already has a test suite but the core AF_VSOCK
> functionality has no tests. This patch series adds several test cases that
> exercise AF_VSOCK SOCK_STREAM socket semantics (send/recv, connect/accept,
> half-closed connections, simultaneous connections).
> 
> The v1 of this series was originally sent by Stefan.
> 
> v3:
> - Patch 6:
>   * check the byte received in the recv_byte()
>   * use send(2)/recv(2) instead of write(2)/read(2) to test also flags
> (e.g. MSG_PEEK)
> - Patch 8:
>   * removed unnecessary control_expectln("CLOSED") [Stefan].
> - removed patches 9,10,11 added in the v2
> - new Patch 9 add parameters to list and skip tests (e.g. useful for vmci
>   that doesn't support half-closed socket in the host)
> - new Patch 10 prints a list of options in the help
> - new Patch 11 tests MSG_PEEK flags of recv(2)
> 
> v2: https://patchwork.ozlabs.org/cover/1140538/
> v1: https://patchwork.ozlabs.org/cover/847998/
> 
> Stefan Hajnoczi (7):
>   VSOCK: fix header include in vsock_diag_test
>   VSOCK: add SPDX identifiers to vsock tests
>   VSOCK: extract utility functions from vsock_diag_test.c
>   VSOCK: extract connect/accept functions from vsock_diag_test.c
>   VSOCK: add full barrier between test cases
>   VSOCK: add send_byte()/recv_byte() test utilities
>   VSOCK: add AF_VSOCK test cases
> 
> Stefano Garzarella (4):
>   vsock_test: wait for the remote to close the connection
>   testing/vsock: add parameters to list and skip tests
>   testing/vsock: print list of options and description
>   vsock_test: add SOCK_STREAM MSG_PEEK test
> 
>  tools/testing/vsock/.gitignore|   1 +
>  tools/testing/vsock/Makefile  |   9 +-
>  tools/testing/vsock/README|   3 +-
>  tools/testing/vsock/control.c |  15 +-
>  tools/testing/vsock/control.h |   2 +
>  tools/testing/vsock/timeout.h |   1 +
>  tools/testing/vsock/util.c| 376 +
>  tools/testing/vsock/util.h|  49 
>  tools/testing/vsock/vsock_diag_test.c | 202 --
>  tools/testing/vsock/vsock_test.c  | 379 ++
>  10 files changed, 883 insertions(+), 154 deletions(-)
>  create mode 100644 tools/testing/vsock/util.c
>  create mode 100644 tools/testing/vsock/util.h
>  create mode 100644 tools/testing/vsock/vsock_test.c
> 
> -- 
> 2.24.1
> 

Reviewed-by: Stefan Hajnoczi 

