On 01/02/2019 06:41, Zheng Xiang wrote:
> On 2019/1/31 23:12, Marc Zyngier wrote:
>> Hi Zheng,
>>
>> On 31/01/2019 14:47, Zheng Xiang wrote:
>>> Hi Marc,
>>>
>>> On 2019/1/29 13:42, Zheng Xiang wrote:
>>>> On 2019/1/28 21:51, Marc Zyngier wrote:
>>>>> On 28/01/2019 07:13, Zheng Xiang wrote:
>>>>>> Hi Marc,
>>>>>>
>>>>>> Thanks for your review.
>>>>>>
>>>>>> On 2019/1/26 19:38, Marc Zyngier wrote:
>>>>>>> Hi Zheng,
>>>>>>>
>>>>>>> On Sat, 26 Jan 2019 06:16:24 +0000,
>>>>>>> Zheng Xiang <zhengxia...@huawei.com> wrote:
>>>>>>>>
>>>>>>>> Currently each PCI device under a PCI bridge shares the same device id
>>>>>>>> and ITS device. Assume two PCI devices call its_msi_prepare
>>>>>>>> concurrently, and both go looking for their ITS device in order to
>>>>>>>> create it. There is a chance that the later one cannot find the ITS
>>>>>>>> device before the other one has created it, which makes the later one
>>>>>>>> create a second ITS device even though they share the same device_id.
>>>>>>>
>>>>>>> Interesting finding. Is this something you've actually seen in practice
>>>>>>> with two devices being probed in parallel? Or something that you found
>>>>>>> by inspection?
>>>>>>
>>>>>> Yes, I found this problem while analyzing the reason for a VM hang. In
>>>>>> the end, I found that the virtio-gpu could not receive MSI interrupts
>>>>>> because it shared the same event_id as the virtio-serial device.
>>>>>>
>>>>>> See https://lkml.org/lkml/2019/1/10/299 for the bug report.
>>>>>>
>>>>>> This problem can be reproduced with high probability by booting a
>>>>>> Qemu/KVM VM with a virtio-serial controller and a virtio-gpu added to a
>>>>>> PCI bridge, and also adding some delay before creating the ITS device.
>>>>>
>>>>> Fair enough. Do you mind sharing your QEMU command line? It'd be useful
>>>>> if I could reproduce it here (and would give me a way to check that it
>>>>> doesn't regress).
>>>>
>>>
>>> Have you reproduced it with my QEMU command line?
>>>
>>> If so, should I send a V2 patch with your suggestion?
>>
>> I've queued the following, much more complete patch:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/commit/?h=irq/irqchip-next&id=9791ec7df0e7b4d80706ccea8f24b6542f6059e9
>>
>> Can you check that it works for you? I didn't manage to get the right
>> timing conditions, but I also had issues getting virtio-gpu running on
>> my TX2, so one might explain the other.
>>
>
> It works for my case, but I am worried about the lines below, which may
> cause a memory leak:
>
> @@ -2627,8 +2640,14 @@ static void its_irq_domain_free(struct irq_domain *domain, unsigned int virq,
>  		irq_domain_reset_irq_data(data);
>  	}
>
> -	/* If all interrupts have been freed, start mopping the floor */
> -	if (bitmap_empty(its_dev->event_map.lpi_map,
> +	mutex_lock(&its->dev_alloc_lock);
> +
> +	/*
> +	 * If all interrupts have been freed, start mopping the
> +	 * floor. This is conditioned on the device not being shared.
> +	 */
> +	if (!its_dev->shared &&
> +	    bitmap_empty(its_dev->event_map.lpi_map,
>  			 its_dev->event_map.nr_lpis)) {
>  		its_lpi_free(its_dev->event_map.lpi_map,
>  			     its_dev->event_map.lpi_base,
>
> It seems that the shared its_dev would never be freed, since the value of
> its_dev->shared is always *true*.
Yes, and that is on purpose. As we don't refcount the number of
interrupts that have been requested in the prepare phase, there is a
race between free and alloc. We can have the following situation:

	CPU0:				CPU1:

	msi_prepare:
	mutex_lock()
	find device() -> found
	store its_dev
	mutex_unlock()
					its_irq_domain_free:
					mutex_lock()
					free_device()
					mutex_unlock()
	its_irq_domain_alloc:
	use its_dev -> boom.

So the trick is not to free the its_dev structure if it shares a devid.
It is not really a leak, as the next device sharing the same devid will
pick up the same structure.

Does it make sense?
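In case it helps, here is a rough userspace sketch of the scheme. The
its_dev_get()/its_dev_put() names, the linked list and the pthread mutex
are made up for the illustration; the real driver does the equivalent in
its_msi_prepare() and its_irq_domain_free() under its->dev_alloc_lock:

	/* Hypothetical userspace analogue of the devid-sharing scheme. */
	#include <pthread.h>
	#include <stdbool.h>
	#include <stdio.h>
	#include <stdlib.h>

	struct its_dev {
		unsigned int devid;
		bool shared;	/* set once a second user hits this devid */
		int nr_lpis;	/* allocated LPIs (stand-in for the bitmap) */
		struct its_dev *next;
	};

	static pthread_mutex_t dev_alloc_lock = PTHREAD_MUTEX_INITIALIZER;
	static struct its_dev *dev_list;

	/* msi_prepare analogue: find-or-create under the lock */
	static struct its_dev *its_dev_get(unsigned int devid)
	{
		struct its_dev *d;

		pthread_mutex_lock(&dev_alloc_lock);
		for (d = dev_list; d; d = d->next) {
			if (d->devid == devid) {
				d->shared = true; /* another prepare aliases us */
				goto out;
			}
		}
		d = calloc(1, sizeof(*d));
		if (d) {
			d->devid = devid;
			d->next = dev_list;
			dev_list = d;
		}
	out:
		pthread_mutex_unlock(&dev_alloc_lock);
		return d;
	}

	/* irq_domain_free analogue: only mop the floor for unshared devices */
	static void its_dev_put(struct its_dev *d)
	{
		struct its_dev **pp;

		pthread_mutex_lock(&dev_alloc_lock);
		if (!d->shared && d->nr_lpis == 0) {
			/* safe: no concurrent prepare can pick this devid up */
			for (pp = &dev_list; *pp; pp = &(*pp)->next) {
				if (*pp == d) {
					*pp = d->next;
					break;
				}
			}
			free(d);
		}
		/*
		 * A shared its_dev is deliberately left in place: the next
		 * device aliasing the same devid will find and reuse it.
		 */
		pthread_mutex_unlock(&dev_alloc_lock);
	}

	int main(void)
	{
		struct its_dev *a = its_dev_get(42);
		struct its_dev *b = its_dev_get(42); /* marks devid 42 shared */

		its_dev_put(a);
		its_dev_put(b);
		printf("devid 42 %s around\n",
		       dev_list ? "is still" : "is no longer");
		return 0;
	}

Thanks,

	M.
--
Jazz is not dead. It just smells funny...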