2016-08-09 15:09+0800, Peter Xu:
> On Tue, Aug 09, 2016 at 08:33:13AM +0200, Jan Kiszka wrote:
>> On 2016-08-09 08:24, Peter Xu wrote:
>> > On Tue, Aug 09, 2016 at 02:18:15PM +0800, Peter Xu wrote:
>> >> On Tue, Aug 09, 2016 at 12:33:17PM +0800, Chao Gao wrote:
>> >>> On Mon, Aug 08, 2016 at 04:57:14PM +0800, Peter Xu wrote:
>> >>>> On Mon, Aug 08, 2016 at 03:41:23PM +0800, Chao Gao wrote:
>> >>>>> Hi, everyone.
>> >>>>>
>> >>>>> We have done some tests after merging this patch set into the latest
>> >>>>> qemu master. On the kvm side, we use the latest kvm linux-next branch.
>> >>>>> Here are some problems we have met.
>> >>>>>
>> >>>>> 1. We can't boot a 288-vcpu linux guest with this command line:
>> >>>>> qemu-system-x86_64 -boot c -m 4096 -sdl -monitor pty --enable-kvm \
>> >>>>> -M kernel-irqchip=split -serial stdio -bios bios.bin -smp cpus=288 \
>> >>>>> -hda vdisk.img -device intel-iommu,intremap=on -machine q35.
>> >>>>> The problem persists even when we assign only 32 vcpus to the linux
>> >>>>> guest. Maybe this output is a clue:
>> >>>>> "do_IRQ: 146.113 No irq handler for vector (irq -1)"
>> >>>>> The output of qemu and the kernel is in the attachments. Do you have
>> >>>>> any idea about the problem and how to solve it?
>> >>>>
>> >>>> IIUC, we need to wait for Radim's QEMU patches to finally enable 288
>> >>>> vcpus?
>> >>>>
>> >>>> Btw, could you please try adding this to the QEMU cmdline when testing
>> >>>> with 32 vcpus:
>> >>>>
>> >>>>  -global ioapic.version=0x20
>> >>>>
>> >>>> I see that you were running a RHEL 7.2 guest with a default e1000. In
>> >>>> that case, we may need to boost ioapic version to 0x20.
>> >>>
>> >>> It doesn't work. My host machine has 16 cpus. When I assign 4 or 8 vcpus
>> >>> to the guest, or 255 vcpus with "kernel-irqchip=off", the guest works
>> >>> well. Maybe when the irqchip is in the kernel, intremap can only handle
>> >>> cases where the number of vcpus is less than the number of physical
>> >>> cpus. Do you think that's right?
>> >>
>> >> I don't think so. Vcpu number should not be related to host cpu
>> >> numbers.
>> >>
>> >> I think the problem is with x2apic. Currently, x2apic is enabled in
>> >> vIOMMU when kernel irqchip is used. This is problematic, since
>> >> we are actually throwing away dest_id[31:8] without Radim's patches,
>> >> and I see that x2apic uses cluster mode by default.
>> >>
>> >> In cluster mode, 8 bits will possibly not suffice (I think the reason
>> >> why >17 vcpus will bring trouble is that each cluster has 16 vcpus,
>> >> we'll have trouble if we have more than one cluster).
>> >>
>> >> To temporarily solve your issue, you not only need "-global
>> >> ioapic.version=0x20" on the QEMU command line, but should also add
>> >> "x2apic_phys" to your guest kernel boot parameters, to force the guest
>> >> to boot in x2apic physical mode (not cluster mode). This can only work
>> >> for <255 vcpus, though. IMHO we may still need to wait for Radim's
>> >> patches to test the >255 case.
>> > 
>> > Not sure whether we should temporarily disable EIM by default for now
>> > (provide an extra flag to optionally enable it)? Since it might break
>> > guests with >17 vcpus.
>> > 
>> > CC Jan as well.
>> 
>> A switch for EIM would be fine for me if it helps.
>> 
>> To my understanding, the issue will be gone with an enhanced KVM
>> interface that we can then also detect via some cap (to flip the default
>> again)?
> 
> Would you help explain how to do it?
> 
> Btw, if we have that switch, the default can go back to EIM mode along
> with Radim's future patches.

I will post patches today as the feature made it upstream.
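
For reference, the dest_id truncation discussed above can be illustrated with
a rough sketch. It assumes the cluster-mode logical x2APIC ID layout described
in the Intel SDM (cluster ID in bits 31:16, a one-hot bit selected by APIC
ID[3:0] in bits 15:0), and it models "throwing away dest_id[31:8]" as a plain
mask to the low 8 bits. The function names are made up for illustration; this
is not QEMU or KVM code.

```python
def x2apic_logical_id(apic_id: int) -> int:
    """Cluster-mode logical x2APIC ID (layout assumed from the Intel SDM).

    bits 31:16 = cluster ID   (APIC ID bits 19:4)
    bits 15:0  = one-hot bit  (selected by APIC ID bits 3:0)
    """
    cluster = (apic_id >> 4) & 0xFFFF
    return (cluster << 16) | (1 << (apic_id & 0xF))


def truncate_dest_id(dest_id: int) -> int:
    """Simplified model (assumption): only dest_id[7:0] survives."""
    return dest_id & 0xFF


def survives_truncation(apic_id: int) -> bool:
    """True if the logical ID is unchanged after dropping dest_id[31:8]."""
    full = x2apic_logical_id(apic_id)
    return truncate_dest_id(full) == full


if __name__ == "__main__":
    for apic_id in (0, 7, 8, 15, 16, 31):
        full = x2apic_logical_id(apic_id)
        cut = truncate_dest_id(full)
        print(f"apic_id={apic_id:3d} logical=0x{full:08x} truncated=0x{cut:02x}")
```

In this simplified model, any logical ID with a bit set above bit 7 is
altered, and every APIC ID in a second cluster loses its cluster ID, so it
aliases an ID in cluster 0 (APIC ID 16 ends up looking like APIC ID 0). In
physical mode the destination is the APIC ID itself, which fits in 8 bits for
<255 vcpus.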
