Hi Henry,

On 22/05/2024 02:22, Henry Wang wrote:
On 5/22/2024 9:16 AM, Stefano Stabellini wrote:
On Wed, 22 May 2024, Henry Wang wrote:
Hi Julien,

On 5/21/2024 8:30 PM, Julien Grall wrote:
Hi,

On 21/05/2024 05:35, Henry Wang wrote:
diff --git a/xen/arch/arm/gic-vgic.c b/xen/arch/arm/gic-vgic.c
index 56490dbc43..956c11ba13 100644
--- a/xen/arch/arm/gic-vgic.c
+++ b/xen/arch/arm/gic-vgic.c
@@ -439,24 +439,33 @@ int vgic_connect_hw_irq(struct domain *d, struct
vcpu *v, unsigned int virq,
         /* We are taking to rank lock to prevent parallel connections. */
       vgic_lock_rank(v_target, rank, flags);
+    spin_lock(&v_target->arch.vgic.lock);
I know this is what Stefano suggested, but v_target would point to the
current affinity whereas the interrupt may be pending/active on the
"previous" vCPU. So it is a little unclear whether v_target is the correct
lock. Do you have more pointer to show this is correct?
No I think you are correct, we have discussed this in the initial version of
this patch. Sorry.

I followed the way from that discussion to note down the vcpu ID and retrieve
here, below is the diff, would this make sense to you?

diff --git a/xen/arch/arm/gic-vgic.c b/xen/arch/arm/gic-vgic.c
index 956c11ba13..134ed4e107 100644
--- a/xen/arch/arm/gic-vgic.c
+++ b/xen/arch/arm/gic-vgic.c
@@ -439,7 +439,7 @@ int vgic_connect_hw_irq(struct domain *d, struct vcpu *v,
unsigned int virq,

      /* We are taking to rank lock to prevent parallel connections. */
      vgic_lock_rank(v_target, rank, flags);
-    spin_lock(&v_target->arch.vgic.lock);
+ spin_lock(&d->vcpu[p->spi_vcpu_id]->arch.vgic.lock);

      if ( connect )
      {
@@ -465,7 +465,7 @@ int vgic_connect_hw_irq(struct domain *d, struct vcpu *v,
unsigned int virq,
              p->desc = NULL;
      }

-    spin_unlock(&v_target->arch.vgic.lock);
+ spin_unlock(&d->vcpu[p->spi_vcpu_id]->arch.vgic.lock);
      vgic_unlock_rank(v_target, rank, flags);

      return ret;
diff --git a/xen/arch/arm/include/asm/vgic.h b/xen/arch/arm/include/asm/vgic.h
index 79b73a0dbb..f4075d3e75 100644
--- a/xen/arch/arm/include/asm/vgic.h
+++ b/xen/arch/arm/include/asm/vgic.h
@@ -85,6 +85,7 @@ struct pending_irq
      uint8_t priority;
      uint8_t lpi_priority;       /* Caches the priority if this is an LPI. */
      uint8_t lpi_vcpu_id;        /* The VCPU for an LPI. */
+    uint8_t spi_vcpu_id;        /* The VCPU for an SPI. */
      /* inflight is used to append instances of pending_irq to
       * vgic.inflight_irqs */
      struct list_head inflight;
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index c04fc4f83f..e852479f13 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -632,6 +632,7 @@ void vgic_inject_irq(struct domain *d, struct vcpu *v,
unsigned int virq,
      }
      list_add_tail(&n->inflight, &v->arch.vgic.inflight_irqs);
  out:
+    n->spi_vcpu_id = v->vcpu_id;
      spin_unlock_irqrestore(&v->arch.vgic.lock, flags);

      /* we have a new higher priority irq, inject it into the guest */
      vcpu_kick(v);


Also, while looking at the locking, I noticed that we are not doing anything with GIC_IRQ_GUEST_MIGRATING. In gic_update_one_lr(), we seem to assume that
if the flag is set, then p->desc cannot be NULL.

Can we reach vgic_connect_hw_irq() with the flag set?
I think even from the perspective of making the code extra safe, we should also check GIC_IRQ_GUEST_MIGRATING as the LR is allocated for this case. I
will also add the check of GIC_IRQ_GUEST_MIGRATING here.
Yes. I think it might be easier to check for GIC_IRQ_GUEST_MIGRATING
early and return error immediately in that case. Otherwise, we can
continue and take spin_lock(&v_target->arch.vgic.lock) because no
migration is in progress

Ok, this makes sense to me, I will add

     if( test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
     {
         vgic_unlock_rank(v_target, rank, flags);
         return -EBUSY;
     }

right after taking the vgic rank lock.

I think that would be ok. I have to admit, I am still a bit wary about allowing to remove interrupts when the domain is running.

I am less concerned about the add part. Do you need the remove part now? If not, I would suggest to split in two so we can get the most of this series merged for 4.19 and continue to deal with the remove path in the background.

I will answer here to the other reply:

> I don't think so, if I am not mistaken, no LR will be allocated with other flags set.

I wasn't necessarily thinking about the LR allocation. I was more thinking whether there are any flags that could still be set.

IOW, will the vIRQ like new once vgic_connect_hw_irq() is succesful?

Also, while looking at the flags, I noticed we clear _IRQ_INPROGRESS before vgic_connect_hw_irq(). Shouldn't we only clear *after*?

This brings to another question. You don't special case a dying domain. If the domain is crashing, wouldn't this mean it wouldn't be possible to destroy it?

Cheers,

--
Julien Grall

Reply via email to