Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

Alexander Graf Wed, 24 Jul 2013 01:26:32 -0700

On 24.07.2013, at 04:26, “tiejun.chen” wrote:

> On 07/18/2013 06:27 PM, Alexander Graf wrote:
>> 
>> On 18.07.2013, at 12:19, “tiejun.chen” wrote:
>> 
>>> On 07/18/2013 06:12 PM, Alexander Graf wrote:
>>>> 
>>>> On 18.07.2013, at 12:08, “tiejun.chen” wrote:
>>>> 
>>>>> On 07/18/2013 05:48 PM, Alexander Graf wrote:
>>>>>> 
>>>>>> On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote:
>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Bhushan Bharat-R65777
>>>>>>>> Sent: Thursday, July 18, 2013 1:53 PM
>>>>>>>> To: '"�tiejun.chen�"'
>>>>>>>> Cc: [email protected]; [email protected]; [email protected]; Wood 
>>>>>>>> Scott-
>>>>>>>> B07421
>>>>>>>> Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for 
>>>>>>>> kernel
>>>>>>>> managed pages
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: "�tiejun.chen�" [mailto:[email protected]]
>>>>>>>>> Sent: Thursday, July 18, 2013 1:52 PM
>>>>>>>>> To: Bhushan Bharat-R65777
>>>>>>>>> Cc: [email protected]; [email protected]; [email protected]; Wood
>>>>>>>>> Scott-
>>>>>>>>> B07421
>>>>>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
>>>>>>>>> kernel managed pages
>>>>>>>>> 
>>>>>>>>> On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: [email protected]
>>>>>>>>>>> [mailto:[email protected]] On Behalf Of "�tiejun.chen�"
>>>>>>>>>>> Sent: Thursday, July 18, 2013 1:01 PM
>>>>>>>>>>> To: Bhushan Bharat-R65777
>>>>>>>>>>> Cc: [email protected]; [email protected]; [email protected];
>>>>>>>>>>> Wood
>>>>>>>>>>> Scott-
>>>>>>>>>>> B07421
>>>>>>>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
>>>>>>>>>>> kernel managed pages
>>>>>>>>>>> 
>>>>>>>>>>> On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: "�tiejun.chen�" [mailto:[email protected]]
>>>>>>>>>>>>> Sent: Thursday, July 18, 2013 11:56 AM
>>>>>>>>>>>>> To: Bhushan Bharat-R65777
>>>>>>>>>>>>> Cc: [email protected]; [email protected]; [email protected];
>>>>>>>>>>>>> Wood
>>>>>>>>>>>>> Scott- B07421; Bhushan Bharat-R65777
>>>>>>>>>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
>>>>>>>>>>>>> for kernel managed pages
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
>>>>>>>>>>>>>> If there is a struct page for the requested mapping then it's
>>>>>>>>>>>>>> normal DDR and the mapping sets "M" bit (coherent, cacheable)
>>>>>>>>>>>>>> else this is treated as I/O and we set  "I + G"  (cache
>>>>>>>>>>>>>> inhibited,
>>>>>>>>>>>>>> guarded)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> This helps setting proper TLB mapping for direct assigned device
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Signed-off-by: Bharat Bhushan <[email protected]>
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>    arch/powerpc/kvm/e500_mmu_host.c |   17 ++++++++++++-----
>>>>>>>>>>>>>>    1 files changed, 12 insertions(+), 5 deletions(-)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> diff --git a/arch/powerpc/kvm/e500_mmu_host.c
>>>>>>>>>>>>>> b/arch/powerpc/kvm/e500_mmu_host.c
>>>>>>>>>>>>>> index 1c6a9d7..089c227 100644
>>>>>>>>>>>>>> --- a/arch/powerpc/kvm/e500_mmu_host.c
>>>>>>>>>>>>>> +++ b/arch/powerpc/kvm/e500_mmu_host.c
>>>>>>>>>>>>>> @@ -64,13 +64,20 @@ static inline u32
>>>>>>>>>>>>>> e500_shadow_mas3_attrib(u32 mas3, int
>>>>>>>>>>>>> usermode)
>>>>>>>>>>>>>>          return mas3;
>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
>>>>>>>>>>>>>> usermode)
>>>>>>>>>>>>>> +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
>>>>>>>>>>>>>>    {
>>>>>>>>>>>>>> +        u32 mas2_attr;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +        mas2_attr = mas2 & MAS2_ATTRIB_MASK;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +        if (!pfn_valid(pfn)) {
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Why not directly use kvm_is_mmio_pfn()?
>>>>>>>>>>>> 
>>>>>>>>>>>> What I understand from this function (someone can correct me) is
>>>>>>>>>>>> that it
>>>>>>>>>>> returns "false" when the page is managed by kernel and is not
>>>>>>>>>>> marked as RESERVED (for some reason). For us it does not matter
>>>>>>>>>>> whether the page is reserved or not, if it is kernel visible page 
>>>>>>>>>>> then it
>>>>>>>> is DDR.
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> I think you are setting I|G by addressing all mmio pages, right? If
>>>>>>>>>>> so,
>>>>>>>>>>> 
>>>>>>>>>>>      KVM: direct mmio pfn check
>>>>>>>>>>> 
>>>>>>>>>>>      Userspace may specify memory slots that are backed by mmio
>>>>>>>>>>> pages rather than
>>>>>>>>>>>      normal RAM.  In some cases it is not enough to identify these
>>>>>>>>>>> mmio
>>>>>>>>> pages
>>>>>>>>>>>      by pfn_valid().  This patch adds checking the PageReserved as 
>>>>>>>>>>> well.
>>>>>>>>>> 
>>>>>>>>>> Do you know what are those "some cases" and how checking
>>>>>>>>>> PageReserved helps in
>>>>>>>>> those cases?
>>>>>>>>> 
>>>>>>>>> No, myself didn't see these actual cases in qemu,too. But this should
>>>>>>>>> be chronically persistent as I understand ;-)
>>>>>>>> 
>>>>>>>> Then I will wait till someone educate me :)
>>>>>>> 
>>>>>>> The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do 
>>>>>>> not want to call this for all tlbwe operation unless it is necessary.
>>>>>> 
>>>>>> It certainly does more than we need and potentially slows down the fast 
>>>>>> path (RAM mapping). The only thing it does on top of "if (pfn_valid())" 
>>>>>> is to check for pages that are declared reserved on the host. This 
>>>>>> happens in 2 cases:
>>>>>> 
>>>>>>   1) Non cache coherent DMA
>>>>>>   2) Memory hot remove
>>>>>> 
>>>>>> The non coherent DMA case would be interesting, as with the mechanism as 
>>>>>> it is in place in Linux today, we could potentially break normal guest 
>>>>>> operation if we don't take it into account. However, it's Kconfig 
>>>>>> guarded by:
>>>>>> 
>>>>>>         depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON
>>>>>>         default n if PPC_47x
>>>>>>         default y
>>>>>> 
>>>>>> so we never hit it with any core we care about ;).
>>>>>> 
>>>>>> Memory hot remove does not exist on e500 FWIW, so we don't have to worry 
>>>>>> about that one either.
>>>>> 
>>>>> Thanks for this good information :)
>>>>> 
>>>>> So why not limit those codes with CONFIG_MEMORY_HOTPLUG inside 
>>>>> kvm_is_mmio_pfn() to make sure that check is only valid when that is 
>>>>> really needed? This can decrease those unnecessary performance loss.
>>>>> 
>>>>> If I'm wrong please correct me :)
>>>> 
>>>> You're perfectly right, but this is generic KVM code. So it gets run 
>>>> across all architectures. What if someone has the great idea to add a new 
>>>> case here for x86, but doesn't tell us? In that case we potentially break 
>>>> x86.
>>>> 
>>>> I'd rather not like to break x86 :).
>>>> 
>>>> However, it'd be very interesting to see a benchmark with this change. Do 
>>>> you think you could just rip out the whole reserved check and run a few 
>>>> benchmarks and show us the results?
>>>> 
>>> 
>>> Often what case should be adopted to validate this scenario?
>> 
>> Something which hammers the TLB emulation heavily. I usually just run 
>> /bin/echo a thousand times in "time" and see how long it takes ;)
>> 
> 
> I tried to run five times with this combination, "time `for ((i=0; i<5000; 
> i++));  do /bin/echo; done`", to calculate the average value with this change:
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 1580dd4..5e8635b 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -102,6 +102,10 @@ static bool largepages_enabled = true;
> 
> bool kvm_is_mmio_pfn(pfn_t pfn)
> {
> +#ifdef CONFIG_MEMORY_HOTPLUG


I'd feel safer if we narrow this down to e500.

> +       /*
> +        * Currently only in memory hot remove case we may still need this.
> +        */
>        if (pfn_valid(pfn)) {

We still have to check for pfn_valid, no? So the #ifdef should be down here.

>                int reserved;
>                struct page *tail = pfn_to_page(pfn);
> @@ -124,6 +128,7 @@ bool kvm_is_mmio_pfn(pfn_t pfn)
>                }
>                return PageReserved(tail);
>        }
> +#endif
> 
>        return true;
> }
> 
> Before apply this change:
> 
> real    (1m19.954s + 1m20.918s + 1m22.740s + 1m21.146s + 1m22.120s)/5= 
> 1m21.376s
> user    (0m23.181s + 0m23.550s + 0m23.506s + 0m23.410s + 0m23.520s)/5= 
> 0m23.433s
> sys   (0m49.087s + 0m49.563s + 0m51.758s + 0m50.290s + 0m51.047s)/5= 0m50.349s
> 
> After apply this change:
> 
> real    (1m19.507s + 1m20.919s + 1m21.436s + 1m21.179s + 1m20.293s)/5= 
> 1m20.667s
> user    (0m22.595s + 0m22.719s + 0m22.484s + 0m22.811s + 0m22.467s)/5= 
> 0m22.615s
> sys   (0m48.841s + 0m49.929s + 0m50.310s + 0m49.813s + 0m48.587s)/5= 0m49.496s
> 
> So,
> 
> real    (1m20.667s - 1m21.376s)/1m21.376s x 100% = -0.6%
> user    (0m22.615s - 0m23.433s)/0m23.433s x 100% = -3.5%
> sys   (0m49.496s - 0m50.349s)/0m50.349s x 100% = -1.7%

Very nice, so there is a real world performance benefit to doing this. Then 
yes, I think it would make sense to change the global helper function to be 
fast on e500 and use that one from e500_shadow_mas2_attrib() instead.

Gleb, Paolo, any hard feelings?


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

Reply via email to