Re: [Qemu-devel] host physical address width issues/questions for x86_64

Prasad Singamsetty Fri, 20 Oct 2017 15:55:16 -0700


On 10/18/2017 8:33 PM, Peter Xu wrote:

On Wed, Oct 18, 2017 at 10:19:31AM -0700, Prasad Singamsetty wrote:



On 10/16/2017 8:56 PM, Peter Xu wrote:

On Mon, Oct 16, 2017 at 10:02:25AM -0700, Prasad Singamsetty wrote:



On 10/14/2017 8:53 PM, Peter Xu wrote:

On Fri, Oct 13, 2017 at 11:14:03AM -0600, Alex Williamson wrote:

On Fri, 13 Oct 2017 18:01:44 +0100
"Dr. David Alan Gilbert" <dgilb...@redhat.com> wrote:

* Prasad Singamsetty (prasad.singamse...@oracle.com) wrote:

Hi,

I am new to the alias. I have some questions on this subject
and seek some clarifications from the experts in the team.
I ran into a couple of issues when I tried a large configuration
(>= 1TB memory, > 255 CPUs) for an x86_64 guest machine.

1. QEMU uses a default address width of 40 (TCG_PHYS_ADDR_BITS) if
    the user has not specified the phys-bits or host-phys-bits=true
    property. This default is not sufficient and causes the guest
    kernel to crash when configured with >= 1TB of memory. Depending
    on the Linux kernel version in the guest, the panic occurs in
    different code paths. The workaround is for the user to specify
    the phys-bits property or set host-phys-bits=true.

    QUESTIONS:

...

2. host_address_width in DMAR table structure

    In this case, the default value is set to 39
    (VTD_HOST_ADDRESS_WIDTH - 1). With interrupt remapping
    enabled for the Intel IOMMU and the guest configured
    with > 255 CPUs and >= 1TB memory, the guest kernel hangs
    during boot. This needs to be fixed.

    QUESTION:
    Again, can we fix this to use the real address width
    from the host as the default?
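
To make the arithmetic behind both items concrete, here is a minimal
sketch of how many address bits a guest of this size needs. The helper
names and the 1 GiB of RAM assumed to sit above the 1 TiB mark are
hypothetical; the real guest memory layout depends on the machine type.

```python
GIB = 1 << 30
TIB = 1 << 40

def bits_needed(highest_addr: int) -> int:
    """Number of address bits needed to represent highest_addr."""
    return max(1, highest_addr.bit_length())

def reachable_bytes(addr_bits: int) -> int:
    """Size of the address space covered by addr_bits bits."""
    return 1 << addr_bits

# Assumption (hypothetical layout): 1 TiB of RAM, with 1 GiB of it
# pushed above the 1 TiB mark by a hole in the low address range.
ram = 1 * TIB
relocated = 1 * GIB
top_of_ram = ram + relocated - 1   # highest guest physical address

print(bits_needed(top_of_ram))        # bits needed for this layout
print(reachable_bytes(39) // GIB)     # GiB reachable with 39 bits
print(reachable_bytes(40) // GIB)     # GiB reachable with 40 bits
```

Under these assumptions, 39 bits cover only 512 GiB and 40 bits cover
exactly 1 TiB, so any address at or above the 1 TiB mark needs at
least 41 bits.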


I don't know DMAR stuff; chatting to Alex (cc'd) it does sound
like that's an omission that should be fixed.


[CC +Peter]

On physical hardware VT-d supports either 39 or 48 bit address widths
and generally you'd expect a sufficiently capable IOMMU to be matched
with the CPU.  Seems QEMU has only implemented a lower bit width and
it should probably be forcing phys bits of the VM to 39 to match until
the extended width can be implemented.  Thanks,

Alex


There were patches that tried to enable 48 bits GAW, but they were
not accepted for some reason:

   https://lists.gnu.org/archive/html/qemu-devel/2016-12/msg01886.html

Would this help in any way?


Thanks Alex for the patch info. Just curious why the patch was not
accepted. Anyway, I will try it.


I'm not sure I know the reason.  Anyway, it originated from one of
Fam's requests for some NVMe tests.  If it can really help for your use
case as well, please feel free to revive those patches, or let me know
so that I can respin.  Thanks,


Thanks Peter. I will start with your patch and see if I can get
it to work first.

A quick question. Looking at the code, it doesn't look like there
is a way to disable DMA remapping. A user may be interested only
in interrupt remapping (for > 255 CPUs) and not DMA remapping.
Has that scenario been considered before?


It can be done in the guest if the guest doesn't want DMAR.

Note that there are two isolated kernel tunables for the VT-d device:

- intel_iommu: "on" to turn on DMAR, "off" to turn off DMAR
- intremap:    "on" to turn on IR, "off" to turn off IR

So even if the guest has "intel_iommu=off" in its boot parameters, IR
will still be on by default (or specify it explicitly using "intremap=on").


Thanks Peter. I think I figured out the problem in my test case
due to VTD_HOST_ADDRESS_WIDTH.

Problem scenario:

The guest kernel (machine type q35) is configured with 1TB memory.
With interrupt remapping enabled, the interrupt remapping
table is allocated by the guest kernel and can be anywhere
in the available physical memory. In my test case,
the physical address of the table is 0xfc3ec00000, and
this gets truncated by the vtd_interrupt_remap_table_setup()
function to 0x7c3ec00000. This causes the guest kernel to
get invalid data later on, and it loops forever in
qi_submit_sync() in the guest kernel trying to check the
fault status.
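
The truncation can be reproduced with plain bit arithmetic. This is a
minimal sketch, assuming the mask is simply the low N bits set for an
N-bit address width; width_mask() is a hypothetical helper, not a QEMU
function.

```python
def width_mask(bits: int) -> int:
    """Mask with the low `bits` bits set (assumed shape of the HAW mask)."""
    return (1 << bits) - 1

table_addr = 0xfc3ec00000          # IR table address from the test case

truncated_39 = table_addr & width_mask(39)
kept_48 = table_addr & width_mask(48)
lost_bits = table_addr ^ truncated_39

print(hex(truncated_39))   # address seen with a 39-bit mask
print(hex(kept_48))        # address seen with a 48-bit mask
print(hex(lost_bits))      # bits dropped by the 39-bit mask
```

The table address needs bit 39 (counting from 0), which a 39-bit mask
clears; a 48-bit mask leaves the address intact.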

This is after applying the patch from Peter Xu. The patch
is incomplete because VTD_HAW_MASK is unchanged, so it is
still defined for 39 bits. Several other masks used in
accessing IOMMU data structures are defined based on it.
So more changes are needed to implement Peter's approach of
providing an x-aw-bits property.

Proposal:

We can simply change VTD_HOST_ADDRESS_WIDTH to 48 bits
without any other changes to the code. The current set of
features in the Intel IOMMU emulator code works for the q35
machine type, and the change has no other side effects.
Since the remapping tables are allocated by the guest kernel,
they are always within the phys-bits range, and as long
as the same range is supported by the Intel IOMMU code in QEMU
it works fine. For the current q35 machine type, all the
supported CPUs have a physical address width of <= 48 bits.
For the short term, just changing VTD_HOST_ADDRESS_WIDTH
to 48 should work fine for q35. I tried this and it seems
to work fine.

For the long term, VTD_HOST_ADDRESS_WIDTH has to match the
host CPU address width. If necessary, we may need to define
a new machine type so that the VTD_HOST_ADDRESS_WIDTH value
matches the host CPU.
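
As a rough illustration of the matching rule, the sketch below picks
the smallest IOMMU address width that covers a given CPU physical
address width. The list of widths (39 and 48) is taken from the
discussion earlier in this thread; pick_iommu_width() is a hypothetical
helper, not existing QEMU code.

```python
from typing import Optional

# Address widths mentioned in this thread for VT-d hardware.
SUPPORTED_WIDTHS = (39, 48)

def pick_iommu_width(cpu_phys_bits: int) -> Optional[int]:
    """Return the smallest supported width >= cpu_phys_bits, or None."""
    for width in sorted(SUPPORTED_WIDTHS):
        if width >= cpu_phys_bits:
            return width
    return None  # no supported width covers this CPU

for phys_bits in (39, 40, 46, 52):
    print(phys_bits, pick_iommu_width(phys_bits))
```

A None result marks the case where neither width is enough, which is
where a different default or a new machine type would be needed.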

Please let me know if you have any comments or suggestions
on this.

Thanks.
--Prasad


Thanks,
