Re: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST ioctl

2021-03-09 Thread Kalra, Ashish


> On Mar 9, 2021, at 3:22 AM, Steve Rutherford  wrote:
> 
> On Mon, Mar 8, 2021 at 1:11 PM Brijesh Singh  wrote:
>> 
>> 
>>> On 3/8/21 1:51 PM, Sean Christopherson wrote:
>>> On Mon, Mar 08, 2021, Ashish Kalra wrote:
 On Fri, Feb 26, 2021 at 09:44:41AM -0800, Sean Christopherson wrote:
> +Will and Quentin (arm64)
> 
> Moving the non-KVM x86 folks to bcc, I don't think they care about KVM details
> at this point.
> 
> On Fri, Feb 26, 2021, Ashish Kalra wrote:
>> On Thu, Feb 25, 2021 at 02:59:27PM -0800, Steve Rutherford wrote:
>>> On Thu, Feb 25, 2021 at 12:20 PM Ashish Kalra  
>>> wrote:
>>> Thanks for grabbing the data!
>>> 
>>> I am fine with both paths. Sean has stated an explicit desire for
>>> hypercall exiting, so I think that would be the current consensus.
> Yep, though it'd be good to get Paolo's input, too.
> 
>>> If we want to do hypercall exiting, this should be in a follow-up
>>> series where we implement something more generic, e.g. a hypercall
>>> exiting bitmap or hypercall exit list. If we are taking the hypercall
>>> exit route, we can drop the kvm side of the hypercall.
> I don't think this is a good candidate for arbitrary hypercall 
> interception.  Or
> rather, I think hypercall interception should be an orthogonal 
> implementation.
> 
> The guest, including guest firmware, needs to be aware that the hypercall 
> is
> supported, and the ABI needs to be well-defined.  Relying on userspace 
> VMMs to
> implement a common ABI is an unnecessary risk.
> 
> We could make KVM's default behavior be a nop, i.e. have KVM enforce the 
> ABI but
> require further VMM intervention.  But, I just don't see the point, it 
> would
> save only a few lines of code.  It would also limit what KVM could do in 
> the
> future, e.g. if KVM wanted to do its own bookkeeping _and_ exit to 
> userspace,
> then mandatory interception would essentially make it impossible for KVM 
> to do
> bookkeeping while still honoring the interception request.
> 
> However, I do think it would make sense to have the userspace exit be a 
> generic
> exit type.  But hey, we already have the necessary ABI defined for that!  
> It's
> just not used anywhere.
> 
>/* KVM_EXIT_HYPERCALL */
>struct {
>__u64 nr;
>__u64 args[6];
>__u64 ret;
>__u32 longmode;
>__u32 pad;
>} hypercall;
> 
> 
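For illustration, a VMM consuming such an exit could look roughly like the sketch
below. KVM_EXIT_HYPERCALL and the kvm_run hypercall member are existing uAPI; the
hypercall number, the args[] layout and the tracking helper are assumptions based
on this series' proposal, not mainline definitions.

#include <stdint.h>
#include <linux/kvm.h>

#define KVM_HC_PAGE_ENC_STATUS  12      /* number proposed by this series */

/* Stand-in for whatever structure the VMM chooses (bitmap, list, ...). */
static void vmm_track_enc_status(uint64_t gfn, uint64_t npages, int enc)
{
        (void)gfn; (void)npages; (void)enc;
}

static void handle_exit(struct kvm_run *run)
{
        if (run->exit_reason != KVM_EXIT_HYPERCALL ||
            run->hypercall.nr != KVM_HC_PAGE_ENC_STATUS)
                return;

        /* Assumed args[] layout: start gfn, number of pages, encryption state. */
        vmm_track_enc_status(run->hypercall.args[0],
                             run->hypercall.args[1],
                             run->hypercall.args[2]);

        run->hypercall.ret = 0;         /* report success back to the guest */
}
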
>>> Userspace could also handle the MSR using MSR filters (would need to
>>> confirm that).  Then userspace could also be in control of the cpuid 
>>> bit.
> An MSR is not a great fit; it's x86 specific and limited to 64 bits of 
> data.
> The data limitation could be fudged by shoving data into non-standard 
> GPRs, but
> that will result in truly heinous guest code, and extensibility issues.
> 
> The data limitation is a moot point, because the x86-only thing is a deal
> breaker.  arm64's pKVM work has a near-identical use case for a guest to 
> share
> memory with a host.  I can't think of a clever way to avoid having to 
> support
> TDX's and SNP's hypervisor-agnostic variants, but we can at least not have
> multiple KVM variants.
> 
 Potentially, there is another reason for in-kernel hypercall handling
 considering SEV-SNP. In case of SEV-SNP the RMP table tracks the state
 of each guest page, for instance pages in hypervisor state, i.e., pages
 with C=0 and pages in guest valid state with C=1.
 
 Now, there shouldn't be a need for page encryption status hypercalls on
 SEV-SNP as KVM can track & reference guest page status directly using
 the RMP table.
>>> Relying on the RMP table itself would require locking the RMP table for an
>>> extended duration, and walking the entire RMP to find shared pages would be
>>> very inefficient.
>>> 
 As KVM maintains the RMP table, therefore we will need SET/GET type of
 interfaces to provide the guest page encryption status to userspace.
>>> Hrm, somehow I temporarily forgot about SNP and TDX adding their own 
>>> hypercalls
>>> for converting between shared and private.  And in the case of TDX, the 
>>> hypercall
>>> can't be trusted, i.e. is just a hint, otherwise the guest could induce a 
>>> #MC in
>>> the host.
>>> 
>>> But, the different guest behavior doesn't require KVM to maintain a 
>>> list/tree,
>>> e.g. adding a dedicated KVM_EXIT_* for notifying userspace of page 
>>> encryption
>>> status changes would also suffice.
>>> 
>>> Actually, that made me think of another argument against maintaining a list 
>>> in
>>> KVM: there's no way to notify userspace that a page's status has changed.
>>> Userspace would need to query KVM to do GET_LIST after every GET_DIRTY.
>>> Obviously not a 

RE: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST ioctl

2021-02-18 Thread Kalra, Ashish
[AMD Public Use]


-Original Message-
From: Sean Christopherson  
Sent: Tuesday, February 16, 2021 7:03 PM
To: Kalra, Ashish 
Cc: pbonz...@redhat.com; t...@linutronix.de; mi...@redhat.com; h...@zytor.com; 
rkrc...@redhat.com; j...@8bytes.org; b...@suse.de; Lendacky, Thomas 
; x...@kernel.org; k...@vger.kernel.org; 
linux-kernel@vger.kernel.org; srutherf...@google.com; 
venu.busire...@oracle.com; Singh, Brijesh 
Subject: Re: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST 
ioctl

On Thu, Feb 04, 2021, Ashish Kalra wrote:
> From: Brijesh Singh 
> 
> The ioctl is used to retrieve a guest's shared pages list.

>What's the performance hit to boot time if KVM_HC_PAGE_ENC_STATUS is passed 
>through to userspace?  That way, userspace could manage the set of pages in 
>whatever data structure they want, and these get/set ioctls go away.

I would be more concerned about the performance hit during guest DMA I/O if the 
page encryption status hypercalls are passed through to user-space. A lot of 
guest DMA I/O dynamically flips page encryption status at DMA setup and again at 
DMA completion, so guest I/O will surely take a performance hit with this 
pass-through approach.
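To make the concern concrete, the guest-side pattern is roughly the sketch below.
The hypercall number and function names follow this series' proposal (the
SEV-specific VMMCALL-based hypercall helper and the page-status notification);
treat them as illustrative rather than mainline interfaces.

#include <linux/types.h>

#define KVM_HC_PAGE_ENC_STATUS  12      /* number proposed by this series */

/* SEV-specific hypercall that unconditionally uses VMMCALL, so it works even
 * before alternatives patching (see the hypercall3 patch in this series);
 * same register convention as kvm_hypercall3(). */
static inline long kvm_sev_hypercall3(unsigned int nr, unsigned long p1,
                                      unsigned long p2, unsigned long p3)
{
        long ret;

        asm volatile("vmmcall"
                     : "=a"(ret)
                     : "a"(nr), "b"(p1), "c"(p2), "d"(p3)
                     : "memory");
        return ret;
}

/* Invoked from the C-bit flipping path, i.e. around set_memory_decrypted()
 * when a DMA/SWIOTLB buffer is shared with the device and
 * set_memory_encrypted() when the I/O completes.  Every such transition
 * becomes a hypercall, and with the pass-through proposal a round trip to
 * userspace as well. */
static void notify_page_enc_status_changed(unsigned long pfn,
                                           unsigned int npages, bool enc)
{
        kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS, pfn, npages, enc);
}
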

Thanks,
Ashish


RE: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST ioctl

2021-02-18 Thread Kalra, Ashish
[AMD Public Use]


-Original Message-
From: Sean Christopherson  
Sent: Thursday, February 18, 2021 10:39 AM
To: Kalra, Ashish 
Cc: pbonz...@redhat.com; t...@linutronix.de; mi...@redhat.com; h...@zytor.com; 
rkrc...@redhat.com; j...@8bytes.org; b...@suse.de; Lendacky, Thomas 
; x...@kernel.org; k...@vger.kernel.org; 
linux-kernel@vger.kernel.org; srutherf...@google.com; 
venu.busire...@oracle.com; Singh, Brijesh 
Subject: Re: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST 
ioctl

On Thu, Feb 18, 2021, Kalra, Ashish wrote:
> From: Sean Christopherson 
> 
> On Wed, Feb 17, 2021, Kalra, Ashish wrote:
> >> From: Sean Christopherson 
> >> On Thu, Feb 04, 2021, Ashish Kalra wrote:
> >> > From: Brijesh Singh 
> >> > 
> >> > The ioctl is used to retrieve a guest's shared pages list.
> >> 
> >> >What's the performance hit to boot time if KVM_HC_PAGE_ENC_STATUS 
> >> >is passed through to userspace?  That way, userspace could manage 
> >> >the set of pages in whatever data structure they want, and these get/set 
> >> >ioctls go away.
> >> 
> >> What is the advantage of passing KVM_HC_PAGE_ENC_STATUS through to 
> >> user-space?
> >> 
> >> As such it is just a simple interface to get the shared pages list 
> >> via the get/set ioctls; simply an array is passed to these ioctls 
> >> to get/set the shared pages list.
>> 
>> > It eliminates any probability of the kernel choosing the wrong data 
>> > structure, and it's two fewer ioctls to maintain and test.
>> 
>> The set shared pages list ioctl cannot be avoided, as it needs to be 
>> issued to set up the shared pages list on the migrated VM; that cannot be 
>> achieved by passing KVM_HC_PAGE_ENC_STATUS through to user-space.

>Why's that?  AIUI, KVM doesn't do anything with the list other than pass it 
>back to userspace.  Assuming that's the case, userspace can just hold onto the 
>list for the next migration.

KVM does use it as part of the SEV DBG_DECRYPT API, within sev_dbg_decrypt(), to 
check whether the guest page(s) are encrypted or not, and accordingly decides 
whether to decrypt the guest page(s) before returning them to user-space or to 
return them as they are.
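In other words, the debug-decrypt path consults the tracked status roughly like
the sketch below; the helper names here are hypothetical stand-ins for
illustration, not the actual functions in the series.

/* Sketch only: enc_status_lookup(), copy_plain_page() and
 * sev_fw_dbg_decrypt() are hypothetical stand-ins. */
static int dbg_decrypt_one_page(struct kvm *kvm, gfn_t gfn,
                                void *dst, unsigned int size)
{
        if (!enc_status_lookup(kvm, gfn)) {
                /*
                 * The guest marked this page shared (C=0), so its contents
                 * are already plaintext; copy it out as-is instead of
                 * issuing the SEV DBG_DECRYPT firmware command.
                 */
                return copy_plain_page(kvm, gfn, dst, size);
        }

        /* Guest-private page: decrypt through the SEV firmware as usual. */
        return sev_fw_dbg_decrypt(kvm, gfn, dst, size);
}
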

Thanks,
Ashish


RE: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST ioctl

2021-02-17 Thread Kalra, Ashish
[AMD Public Use]

-Original Message-
From: Sean Christopherson  
Sent: Wednesday, February 17, 2021 10:13 AM
To: Kalra, Ashish 
Cc: pbonz...@redhat.com; t...@linutronix.de; mi...@redhat.com; h...@zytor.com; 
rkrc...@redhat.com; j...@8bytes.org; b...@suse.de; Lendacky, Thomas 
; x...@kernel.org; k...@vger.kernel.org; 
linux-kernel@vger.kernel.org; srutherf...@google.com; 
venu.busire...@oracle.com; Singh, Brijesh 
Subject: Re: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST 
ioctl

On Wed, Feb 17, 2021, Kalra, Ashish wrote:
>> From: Sean Christopherson 
>> On Thu, Feb 04, 2021, Ashish Kalra wrote:
>> > From: Brijesh Singh 
>> > 
>> > The ioctl is used to retrieve a guest's shared pages list.
>> 
>> >What's the performance hit to boot time if KVM_HC_PAGE_ENC_STATUS is 
>> >passed through to userspace?  That way, userspace could manage the 
>> >set of pages in whatever data structure they want, and these get/set 
>> >ioctls go away.
>> 
>> What is the advantage of passing KVM_HC_PAGE_ENC_STATUS through to 
>> user-space?
>> 
>> As such it is just a simple interface to get the shared pages list via 
>> the get/set ioctls; simply an array is passed to these ioctls to 
>> get/set the shared pages list.

> It eliminates any probability of the kernel choosing the wrong data 
> structure, and it's two fewer ioctls to maintain and test.

The set shared pages list ioctl cannot be avoided, as it needs to be issued to 
set up the shared pages list on the migrated VM; that cannot be achieved by 
passing KVM_HC_PAGE_ENC_STATUS through to user-space.

So it makes sense to add both get/set shared pages list ioctls; passing through 
to user-space just adds more complexity without any significant gains.
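For context, the flow across a migration would look roughly like the sketch
below.  The ioctl names are the ones discussed in this thread (defined by the
series' headers, not mainline), and the structure layout is a stand-in, not the
series' actual uAPI.

#include <sys/ioctl.h>
#include <linux/types.h>

/* Stand-in layout; the real uAPI structure in the series may differ. */
struct shared_pages_list {
        __u64  nents;           /* number of entries in gfns[]          */
        __u64 *gfns;            /* guest frame numbers currently shared */
};

/* Source side: snapshot the list and ship it in the migration stream. */
static int save_shared_list(int vm_fd, struct shared_pages_list *list)
{
        return ioctl(vm_fd, KVM_GET_SHARED_PAGES_LIST, list);
}

/* Destination side: restore the list before the guest runs, so that
 * DBG_DECRYPT and any later migration passes see the correct per-page
 * encryption status. */
static int restore_shared_list(int vm_fd, struct shared_pages_list *list)
{
        return ioctl(vm_fd, KVM_SET_SHARED_PAGES_LIST, list);
}
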

> >Also, aren't there plans for an in-guest migration helper?  If so, do 
> >we have any idea what that interface will look like?  E.g. if we're 
> >going to end up with a full fledged driver in the guest, why not 
> >bite the bullet now and bypass KVM entirely?
> 
> Even the in-guest migration helper will be using page encryption 
> status hypercalls, so some interface is surely required.

>If it's a driver with a more extensive interface, then the hypercalls can be 
>replaced by a driver operation.  That's obviously a big if, though.

> Also, the in-guest migration helper will be mainly an OVMF component; it 
> won't really be a full-fledged kernel driver in the guest.

>Is there code and/or a description of what the proposed helper would look like?

Not right now; there are prototype(s) under development, and I assume they will 
be posted upstream soon.

Thanks,
Ashish


RE: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST ioctl

2021-02-17 Thread Kalra, Ashish
[AMD Public Use]

-Original Message-
From: Sean Christopherson  
Sent: Tuesday, February 16, 2021 7:03 PM
To: Kalra, Ashish 
Cc: pbonz...@redhat.com; t...@linutronix.de; mi...@redhat.com; h...@zytor.com; 
rkrc...@redhat.com; j...@8bytes.org; b...@suse.de; Lendacky, Thomas 
; x...@kernel.org; k...@vger.kernel.org; 
linux-kernel@vger.kernel.org; srutherf...@google.com; 
venu.busire...@oracle.com; Singh, Brijesh 
Subject: Re: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST 
ioctl

On Thu, Feb 04, 2021, Ashish Kalra wrote:
> From: Brijesh Singh 
> 
> The ioctl is used to retrieve a guest's shared pages list.

>What's the performance hit to boot time if KVM_HC_PAGE_ENC_STATUS is passed 
>through to userspace?  That way, userspace could manage the set of pages in 
>whatever data structure they want, and these get/set ioctls go away.

What is the advantage of passing KVM_HC_PAGE_ENC_STATUS through to user-space?

As such it is just a simple interface to get the shared pages list via the 
get/set ioctls; simply an array is passed to these ioctls to get/set the shared 
pages list.

>Also, aren't there plans for an in-guest migration helper?  If so, do we have 
>any idea what that interface will look like?  E.g. if we're going to end up 
>with a full fledged driver in the guest, why not bite the bullet now and 
>bypass KVM entirely?

Even the in-guest migration helper will be using page encryption status 
hypercalls, so some interface is surely required.

Also, the in-guest migration helper will be mainly an OVMF component; it won't 
really be a full-fledged kernel driver in the guest.

Thanks,
Ashish


RE: [PATCH v9 00/18] Add AMD SEV guest live migration support

2021-01-14 Thread Kalra, Ashish
[AMD Public Use]

Hello Steve,

I don't think we have ever discussed supporting this command; maybe we can 
support it in a future follow-up patch.

Thanks,
Ashish

-Original Message-
From: Steve Rutherford  
Sent: Thursday, January 14, 2021 6:32 PM
To: Kalra, Ashish 
Cc: Paolo Bonzini ; Thomas Gleixner ; 
Ingo Molnar ; H. Peter Anvin ; Radim Krčmář 
; Joerg Roedel ; Borislav Petkov 
; Lendacky, Thomas ; X86 ML 
; KVM list ; LKML 
; Venu Busireddy ; 
Singh, Brijesh 
Subject: Re: [PATCH v9 00/18] Add AMD SEV guest live migration support

Forgot to ask this: is there an intention to support SEND_CANCEL in a follow up 
patch?


On Tue, Dec 8, 2020 at 2:03 PM Ashish Kalra  wrote:
>
> From: Ashish Kalra 
>
> The series add support for AMD SEV guest live migration commands. To 
> protect the confidentiality of an SEV protected guest memory while in 
> transit we need to use the SEV commands defined in SEV API spec [1].
>
> SEV guest VMs have the concept of private and shared memory. Private 
> memory is encrypted with the guest-specific key, while shared memory 
> may be encrypted with the hypervisor key. The commands provided by the SEV 
> FW are meant to be used for the private memory only. The patch series 
> introduces a new hypercall.
> The guest OS can use this hypercall to notify the page encryption status.
> If the page is encrypted with the guest-specific key then we use the SEV 
> command during the migration. If the page is not encrypted then we fall 
> back to the default.
>
> The patch adds new ioctls KVM_{SET,GET}_PAGE_ENC_BITMAP. The ioctls can 
> be used by qemu to get the page encryption bitmap. Qemu can consult 
> this bitmap during the migration to know whether the page is encrypted.
>
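As an illustration of how qemu might use these ioctls during a migration pass
(the structure fields mirror the series' kvm_page_enc_bitmap only approximately
and should be treated as assumptions, as should the availability of the ioctl
definition from the series' headers):

#include <sys/ioctl.h>
#include <linux/types.h>

struct kvm_page_enc_bitmap {
        __u64 start_gfn;
        __u64 num_pages;
        void *enc_bitmap;       /* one bit per page, set = encrypted */
};

/* For every dirty page found via the dirty log, consult this bitmap to pick
 * the SEV SEND_UPDATE_DATA path (encrypted page) or a plain copy (shared
 * page). */
static int fetch_enc_bitmap(int vm_fd, __u64 start_gfn, __u64 num_pages,
                            unsigned long *bitmap)
{
        struct kvm_page_enc_bitmap b = {
                .start_gfn  = start_gfn,
                .num_pages  = num_pages,
                .enc_bitmap = bitmap,
        };

        return ioctl(vm_fd, KVM_GET_PAGE_ENC_BITMAP, &b);
}
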
> This section describes how the SEV live migration feature is negotiated 
> between the host and guest: the host indicates this feature support 
> via KVM_FEATURE_CPUID. The guest firmware (OVMF) detects this feature 
> and sets a UEFI environment variable indicating OVMF support for live 
> migration. The guest kernel also detects the host support for this 
> feature via cpuid and, in case of an EFI boot, verifies whether OVMF also 
> supports this feature by reading the UEFI environment variable; if it is 
> set, it enables the live migration feature on the host by writing to a 
> custom MSR. If not booted under EFI, it simply enables the feature by 
> writing to the custom MSR. The host returns an error as part of the 
> SET_PAGE_ENC_BITMAP ioctl if the guest has not enabled live migration.
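On the guest side, that negotiation boils down to something like the sketch
below; the feature and MSR names follow the series, but the constant values
here are placeholders, not mainline definitions, and sev_active() is the
helper of that kernel era.

#include <asm/kvm_para.h>       /* kvm_para_has_feature() */
#include <asm/msr.h>            /* wrmsrl()               */

#define KVM_FEATURE_SEV_LIVE_MIGRATION  14              /* placeholder */
#define MSR_KVM_SEV_LIVE_MIGRATION      0x4b564d08      /* placeholder */
#define KVM_SEV_LIVE_MIGRATION_ENABLED  (1ULL << 0)

static void sev_enable_live_migration(void)
{
        if (!sev_active())
                return;

        /* Host support is advertised through KVM's paravirt CPUID leaf. */
        if (!kvm_para_has_feature(KVM_FEATURE_SEV_LIVE_MIGRATION))
                return;

        /*
         * On an EFI boot the kernel additionally checks the UEFI variable
         * set by OVMF before getting here; on a non-EFI boot that check is
         * skipped, as described in the cover letter above.
         */
        wrmsrl(MSR_KVM_SEV_LIVE_MIGRATION, KVM_SEV_LIVE_MIGRATION_ENABLED);
}
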
>
> A branch containing these patches is available here:
> https://github.com/AMDESE/linux/tree/sev-migration-v9
>
> [1] https://developer.amd.com/wp-content/resources/55766.PDF
>
> Changes since v8:
> - Rebasing to kvm next branch.
> - Fixed and added comments as per review feedback on v8 patches.
> - Removed implicitly enabling live migration for incoming VMs in
>   in KVM_SET_PAGE_ENC_BITMAP, it is now done via KVM_SET_MSR ioctl.
> - Adds support for bypassing unencrypted guest memory regions for
>   DBG_DECRYPT API calls, guest memory region encryption status in
>   sev_dbg_decrypt() is referenced using the page encryption bitmap.
>
> Changes since v7:
> - Removed the hypervisor specific hypercall/paravirt callback for
>   SEV live migration and moved back to calling kvm_sev_hypercall3
>   directly.
> - Fix build errors as
>   Reported-by: kbuild test robot , specifically fixed
>   build error when CONFIG_HYPERVISOR_GUEST=y and
>   CONFIG_AMD_MEM_ENCRYPT=n.
> - Implicitly enabled live migration for incoming VM(s) to handle
>   A->B->C->... VM migrations.
> - Fixed Documentation as per comments on v6 patches.
> - Fixed error return path in sev_send_update_data() as per comments
>   on v6 patches.
>
> Changes since v6:
> - Rebasing to mainline and refactoring to the new split SVM
>   infrastructre.
> - Move to static allocation of the unified Page Encryption bitmap
>   instead of the dynamic resizing of the bitmap, the static allocation
>   

Re: [PATCH v2 1/9] KVM: x86: Add AMD SEV specific Hypercall3

2020-12-07 Thread Kalra, Ashish

> 
>> I suspect a list
>> would consume far less memory, hopefully without impacting performance.

And how much host memory are we talking about here? Say for a 4 GB guest, the 
bitmap will be using just something like 128 KB (4 GB of guest memory is 1M 
4 KB pages, at one bit per page).

Thanks,
Ashish

> On Dec 7, 2020, at 10:16 PM, Kalra, Ashish  wrote:
> 
> I don’t think that the bitmap by itself is really a performance bottleneck 
> here.
> 
> Thanks,
> Ashish
> 
>>> On Dec 7, 2020, at 9:10 PM, Steve Rutherford  wrote:
>>> On Mon, Dec 7, 2020 at 12:42 PM Sean Christopherson  
>>> wrote:
>>>> On Sun, Dec 06, 2020, Paolo Bonzini wrote:
>>>> On 03/12/20 01:34, Sean Christopherson wrote:
>>>>> On Tue, Dec 01, 2020, Ashish Kalra wrote:
>>>>>> From: Brijesh Singh 
>>>>>> KVM hypercall framework relies on alternative framework to patch the
>>>>>> VMCALL -> VMMCALL on AMD platform. If a hypercall is made before
>>>>>> apply_alternative() is called then it defaults to VMCALL. The approach
>>>>>> works fine on non SEV guest. A VMCALL would cause #UD, and hypervisor
>>>>>> will be able to decode the instruction and do the right things. But
>>>>>> when SEV is active, guest memory is encrypted with guest key and
>>>>>> hypervisor will not be able to decode the instruction bytes.
>>>>>> Add SEV specific hypercall3, it unconditionally uses VMMCALL. The 
>>>>>> hypercall
>>>>>> will be used by the SEV guest to notify encrypted pages to the 
>>>>>> hypervisor.
>>>>> What if we invert KVM_HYPERCALL and X86_FEATURE_VMMCALL to default to 
>>>>> VMMCALL
>>>>> and opt into VMCALL?  It's a synthetic feature flag either way, and I 
>>>>> don't
>>>>> think there are any existing KVM hypercalls that happen before 
>>>>> alternatives are
>>>>> patched, i.e. it'll be a nop for sane kernel builds.
>>>>> I'm also skeptical that a KVM specific hypercall is the right approach 
>>>>> for the
>>>>> encryption behavior, but I'll take that up in the patches later in the 
>>>>> series.
>>>> Do you think that it's the guest that should "donate" memory for the bitmap
>>>> instead?
>>> No.  Two things I'd like to explore:
>>> 1. Making the hypercall to announce/request private vs. shared common across
>>>   hypervisors (KVM, Hyper-V, VMware, etc...) and technologies (SEV-* and 
>>> TDX).
>>>   I'm concerned that we'll end up with multiple hypercalls that do more or
>>>   less the same thing, e.g. KVM+SEV, Hyper-V+SEV, TDX, etc...  Maybe it's a
>>>   pipe dream, but I'd like to at least explore options before shoving in 
>>> KVM-
>>>   only hypercalls.
>>> 2. Tracking shared memory via a list of ranges instead of a using bitmap to
>>>   track all of guest memory.  For most use cases, the vast majority of guest
>>>   memory will be private, most ranges will be 2mb+, and conversions between
>>>   private and shared will be uncommon events, i.e. the overhead to walk and
>>>   split/merge list entries is hopefully not a big concern.  I suspect a list
>>>   would consume far less memory, hopefully without impacting performance.
>> For a fancier data structure, I'd suggest an interval tree. Linux
>> already has an rbtree-based interval tree implementation, which would
>> likely work, and would probably assuage any performance concerns.
>> Something like this would not be worth doing unless most of the shared
>> pages were physically contiguous. A sample Ubuntu 20.04 VM on GCP had
>> 60ish discontiguous shared regions. This is by no means a thorough
>> search, but it's suggestive. If this is typical, then the bitmap would
>> be far less efficient than most any interval-based data structure.
>> You'd have to allow userspace to upper bound the number of intervals
>> (similar to the maximum bitmap size), to prevent host OOMs due to
>> malicious guests. There's something nice about the guest donating
>> memory for this, since that would eliminate the OOM risk.


Re: [PATCH v2 1/9] KVM: x86: Add AMD SEV specific Hypercall3

2020-12-07 Thread Kalra, Ashish
I don’t think that the bitmap by itself is really a performance bottleneck here.

Thanks,
Ashish

> On Dec 7, 2020, at 9:10 PM, Steve Rutherford  wrote:
> 
> On Mon, Dec 7, 2020 at 12:42 PM Sean Christopherson  
> wrote:
>> 
>>> On Sun, Dec 06, 2020, Paolo Bonzini wrote:
>>> On 03/12/20 01:34, Sean Christopherson wrote:
 On Tue, Dec 01, 2020, Ashish Kalra wrote:
> From: Brijesh Singh 
> 
> KVM hypercall framework relies on alternative framework to patch the
> VMCALL -> VMMCALL on AMD platform. If a hypercall is made before
> apply_alternative() is called then it defaults to VMCALL. The approach
> works fine on non SEV guest. A VMCALL would cause #UD, and hypervisor
> will be able to decode the instruction and do the right things. But
> when SEV is active, guest memory is encrypted with guest key and
> hypervisor will not be able to decode the instruction bytes.
> 
> Add SEV specific hypercall3, it unconditionally uses VMMCALL. The 
> hypercall
> will be used by the SEV guest to notify encrypted pages to the hypervisor.
 
 What if we invert KVM_HYPERCALL and X86_FEATURE_VMMCALL to default to 
 VMMCALL
 and opt into VMCALL?  It's a synthetic feature flag either way, and I don't
 think there are any existing KVM hypercalls that happen before 
 alternatives are
 patched, i.e. it'll be a nop for sane kernel builds.
 
 I'm also skeptical that a KVM specific hypercall is the right approach for 
 the
 encryption behavior, but I'll take that up in the patches later in the 
 series.
>>> 
>>> Do you think that it's the guest that should "donate" memory for the bitmap
>>> instead?
>> 
>> No.  Two things I'd like to explore:
>> 
>>  1. Making the hypercall to announce/request private vs. shared common across
>> hypervisors (KVM, Hyper-V, VMware, etc...) and technologies (SEV-* and 
>> TDX).
>> I'm concerned that we'll end up with multiple hypercalls that do more or
>> less the same thing, e.g. KVM+SEV, Hyper-V+SEV, TDX, etc...  Maybe it's a
>> pipe dream, but I'd like to at least explore options before shoving in 
>> KVM-
>> only hypercalls.
>> 
>> 
>>  2. Tracking shared memory via a list of ranges instead of a using bitmap to
>> track all of guest memory.  For most use cases, the vast majority of 
>> guest
>> memory will be private, most ranges will be 2mb+, and conversions between
>> private and shared will be uncommon events, i.e. the overhead to walk and
>> split/merge list entries is hopefully not a big concern.  I suspect a 
>> list
>> would consume far less memory, hopefully without impacting performance.
> 
> For a fancier data structure, I'd suggest an interval tree. Linux
> already has an rbtree-based interval tree implementation, which would
> likely work, and would probably assuage any performance concerns.
> 
> Something like this would not be worth doing unless most of the shared
> pages were physically contiguous. A sample Ubuntu 20.04 VM on GCP had
> 60ish discontiguous shared regions. This is by no means a thorough
> search, but it's suggestive. If this is typical, then the bitmap would
> be far less efficient than most any interval-based data structure.
> 
> You'd have to allow userspace to upper bound the number of intervals
> (similar to the maximum bitmap size), to prevent host OOMs due to
> malicious guests. There's something nice about the guest donating
> memory for this, since that would eliminate the OOM risk.
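To make Sean's second point concrete, tracking shared (C=0) memory as a sorted
list of gfn ranges could look like the minimal sketch below.  This is purely
illustrative (names and structure invented here, not code from the series); an
interval tree as Steve suggests would replace the linear walk with an O(log n)
lookup.

#include <stdint.h>
#include <stdlib.h>

struct shared_range {
        uint64_t start;                 /* first gfn in the range      */
        uint64_t end;                   /* one past the last gfn       */
        struct shared_range *next;      /* next range, sorted by start */
};

/* Mark [start, end) shared, merging with any overlapping or adjacent
 * ranges so the list stays sorted and non-overlapping. */
static void mark_shared(struct shared_range **head, uint64_t start, uint64_t end)
{
        struct shared_range **pp = head;
        struct shared_range *r;

        /* Skip ranges that end strictly before the new one begins. */
        while (*pp && (*pp)->end < start)
                pp = &(*pp)->next;

        /* Absorb every range that overlaps or touches [start, end). */
        while (*pp && (*pp)->start <= end) {
                struct shared_range *victim = *pp;

                if (victim->start < start)
                        start = victim->start;
                if (victim->end > end)
                        end = victim->end;
                *pp = victim->next;
                free(victim);
        }

        r = malloc(sizeof(*r));
        if (!r)
                return;                 /* sketch: no error propagation */
        r->start = start;
        r->end   = end;
        r->next  = *pp;
        *pp      = r;
}
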


Re: [PATCH v7] swiotlb: Adjust SWIOTBL bounce buffer size for SEV guests.

2020-12-07 Thread Kalra, Ashish


> On Dec 7, 2020, at 4:14 PM, Borislav Petkov  wrote:
> 
> On Mon, Dec 07, 2020 at 10:06:24PM +, Ashish Kalra wrote:
>> This is related to the earlier static adjustment of the SWIOTLB buffers
>> as per guest memory size and Konrad's feedback on the same, as copied
>> below:
>> 
 That is eating 128MB for 1GB, aka 12% of the guest memory allocated 
 statically for this.
 
 And for guests that are 2GB, that is 12% until it gets to 3GB when 
 it is 8% and then 6% at 4GB.
 
 I would prefer this to be based on your memory count, that is 6% of 
 total memory.
> 
> So no rule of thumb and no measurements? Just a magic number 6.

It is more of an approximation of the earlier static adjustment which was 128M 
for <1G guests, 256M for 1G-4G guests and 512M for >4G guests.
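A sketch of the policy being compared, for reference (this illustrates the
arithmetic only, not the code that was eventually merged):

/* ~6% of guest memory for SWIOTLB when SEV is active, replacing the
 * static 128M/256M/512M table. */
static unsigned long sev_swiotlb_size(unsigned long total_mem)
{
        /*
         * Roughly tracks the old static table, e.g.:
         *   4 GB guest -> ~246 MB (static table: 256 MB)
         *   8 GB guest -> ~492 MB (static table: 512 MB)
         */
        return total_mem * 6 / 100;
}
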

Thanks,
Ashish

Re: [PATCH v8 13/18] KVM: x86: Introduce new KVM_FEATURE_SEV_LIVE_MIGRATION feature & Custom MSR.

2020-12-06 Thread Kalra, Ashish

> On Dec 6, 2020, at 4:58 AM, Paolo Bonzini  wrote:
> 
> On 04/12/20 18:23, Sean Christopherson wrote:
>>> On Fri, Dec 04, 2020, Ashish Kalra wrote:
>>> An immediate response: actually the SEV live migration patches are preferred
>>> over the Page encryption bitmap patches. In other words, if the SEV live
>>> migration patches are applied then we don't need the Page encryption bitmap
>>> patches, and we prefer the live migration series to be applied.
>>> 
>>> It is not that page encryption bitmap series supersede the live migration
>>> patches, they are just cut of the live migration patches.
>> In that case, can you post a fresh version of the live migration series?  
>> Paolo
>> is obviously willing to take a big chunk of that series, and it will likely 
>> be
>> easier to review with the full context, e.g. one of my comments on the 
>> standalone
>> encryption bitmap series was going to be that it's hard to review without 
>> seeing
>> the live migration aspect.
> 
> It still applies without change.  For now I'll only keep the series queued in 
> my (n)SVM branch, but will hold on applying it to kvm.git's queue and next 
> branches.
> 

Ok thanks Paolo.


Re: [PATCH v8 13/18] KVM: x86: Introduce new KVM_FEATURE_SEV_LIVE_MIGRATION feature & Custom MSR.

2020-12-04 Thread Kalra, Ashish
This time I received your email directly.

Thanks,
Ashish

> On Dec 4, 2020, at 12:41 PM, Sean Christopherson  wrote:
> 
> On Fri, Dec 4, 2020 at 10:07 AM Ashish Kalra  wrote:
>> 
>> Yes i will post a fresh version of the live migration patches.
>> 
>> Also, can you please check your email settings? We are only able to see your 
>> response on the
>> mailing list, but we are not getting your direct responses.
> 
> Hrm, as in you don't get the email?
> 
> Is this email any different?  Sending via gmail instead of mutt...


Re: [PATCH v6] swiotlb: Adjust SWIOTBL bounce buffer size for SEV guests.

2020-11-24 Thread Kalra, Ashish


> On Nov 24, 2020, at 3:38 AM, Borislav Petkov  wrote:
> 
> On Tue, Nov 24, 2020 at 09:25:06AM +0000, Kalra, Ashish wrote:
>> But what will be the criteria to figure out this percentage?
>> 
>> As I mentioned earlier, this can be made as complicated as possible by
>> adding all kind of heuristics but without any predictable performance
>> gain.
>> 
>> Or it can be kept simple by using a static percentage value.
> 
> Yes, static percentage number based on the guest memory. X% of the guest
> memory is used for SWIOTLB.
> 
> Since you use sev_active(), it means the size computation is done in the
> guest so that SWIOTLB size is per-guest. Yes?

Yes

> 
> If so, you can simply take, say, 5% of the guest memory's size and use
> that for SWIOTLB buffers. Or 6 or X or whatever.
> 
> Makes sense?

Sure it does.

Thanks,
Ashish

> 
> -- 
> Regards/Gruss,
>Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v6] swiotlb: Adjust SWIOTBL bounce buffer size for SEV guests.

2020-11-24 Thread Kalra, Ashish


> On Nov 24, 2020, at 3:04 AM, Borislav Petkov  wrote:
> 
> On Mon, Nov 23, 2020 at 10:56:31PM +, Ashish Kalra wrote:
>> As I mentioned earlier, the patch was initially based on using a % of
>> guest memory,
> 
> Can you figure out how much the guest memory is and then allocate a
> percentage?

But what will be the criteria to figure out this percentage?

As I mentioned earlier, this can be made as complicated as possible by adding 
all kind of heuristics but without any predictable performance gain.

Or it can be kept simple by using a static percentage value.

Thanks,
Ashish

> -- 
> Regards/Gruss,
>Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette


RE: [PATCH 0/3] UCC TDM driver for MPC83xx platforms

2008-01-16 Thread Kalra Ashish
Hello All,
 
I am sure that the TDM bus driver model/framework will make us put a lot
more programming effort without
any assurance of the code being accepted by the Linux community,
especially as there are many
Telephony/VoIP stack implementations in Linux such as the Sangoma
WANPIPE Kernel suite which
have their own Zaptel TDM (channelized zero-copy) interface layer. There
are other High Speed serial (HSS)
API interfaces, again supporting channelized and/or prioritized API
interfaces. All these implementations 
are proprietary and have their own tightly coupled upper layers and
hardware abstraction layers. It is
difficult to predict that these stacks will move towards a generic TDM
bus driver interface. Therefore, I think
we can have our own tightly coupled interface with our VoIP framework
and let us keep the driver as it is,
i.e., as a misc driver.

Ashish

-Original Message-
From: Kumar Gala [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, January 15, 2008 9:01 AM
To: Andrew Morton
Cc: Phillips Kim; Aggrwal Poonam; [EMAIL PROTECTED];
[EMAIL PROTECTED]; [EMAIL PROTECTED];
[EMAIL PROTECTED]; linux-kernel@vger.kernel.org; Barkowski Michael;
Kalra Ashish; Cutler Richard
Subject: Re: [PATCH 0/3] UCC TDM driver for MPC83xx platforms


On Jan 14, 2008, at 3:15 PM, Andrew Morton wrote:

> On Mon, 14 Jan 2008 12:00:51 -0600
> Kim Phillips <[EMAIL PROTECTED]> wrote:
>
>> On Thu, 10 Jan 2008 21:41:20 -0700
>> "Aggrwal Poonam" <[EMAIL PROTECTED]> wrote:
>>
>>> Hello  All
>>>
>>> I am waiting for more feedback on the patches.
>>>
>>> If there are no objections please consider them for 2.6.25.
>>>
>> if this isn't going to go through Alessandro Rubini/misc drivers, can

>> it go through the akpm/mm tree?
>>
>
> That would work.  But it might be more appropriate to go Kumar-
> >paulus->Linus.

I'm ok w/taking the arch/powerpc bits, but I'm a bit concerned about  
the driver itself.  I'm wondering if we need a TDM framework in the  
kernel.

I guess if Poonam could possibly describe how this driver is actually  
used that would be helpful.  I see we have 8315 with a discrete TDM  
block and I'm guessing 82xx/85xx based CPM parts of some form of TDM  
as well.

- k

