All:
        This is a draft design for IRQ virtualization; comments are
appreciated.
Thx, Eddie
                Xen/IA64 interrupt virtualization


* Introduction
This document targets Xen/IA64 developers, providing a design overview of
interrupt virtualization: what the guest IOSAPIC looks like and how the
machine IOSAPIC is used in the hypervisor.


* Terminology
(Not formal definition, just for better understanding)

PIRQ:  Physical IRQ generated by a partitioned device; vectors 0-255 in X86

VIRQ:  Dynamic IRQ that is purely virtual; vectors 256-511 in X86

IPI:   Inter-processor interrupt
VIPI:  Virtual IPI, i.e. an IPI between virtual processors

MMIO:  Memory Mapped IO

Event channel:  Xen's asynchronous notification primitive; pending/mask bits
live in memory shared between Xen and the guest, and pending events are
delivered to the guest through its registered callback



* Background

How Xen/X86 handles the callback and event channels:
  In a Xen environment, a para-virtualized guest registers its callback and
failsafe-callback entries with the hypervisor for batched delivery of events
to the guest. When the guest has pending events (marked in a shared bitmap),
guest execution is transferred to the pre-registered callback function
(evtchn_do_upcall), much as an interrupt would be on a native system. This
control transfer can be disabled through another shared variable,
evtchn_upcall_mask, so guest software can disable upcalls when it needs to.
Within evtchn_do_upcall the events are dispatched, i.e. do_IRQ() or
evtchn_device_upcall() is called.
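
For reference, a simplified C sketch of this dispatch loop, modeled on the
x86 Xenlinux evtchn_do_upcall (field and helper names follow that code from
memory and may differ in detail):

    /* Simplified sketch of the x86 Xenlinux event dispatcher (illustrative). */
    asmlinkage void evtchn_do_upcall(struct pt_regs *regs)
    {
        shared_info_t *s    = HYPERVISOR_shared_info;
        vcpu_info_t   *vcpu = &s->vcpu_data[smp_processor_id()];
        unsigned long  l1, l2;
        unsigned int   l1i, l2i, port;
        int            irq;

        vcpu->evtchn_upcall_pending = 0;

        /* Two-level pending bitmap: a per-VCPU selector word, then one
         * word of pending bits per selector bit. */
        l1 = xchg(&vcpu->evtchn_pending_sel, 0);
        while (l1 != 0) {
            l1i = __ffs(l1);
            l1 &= ~(1UL << l1i);

            l2 = s->evtchn_pending[l1i] & ~s->evtchn_mask[l1i];
            while (l2 != 0) {
                l2i = __ffs(l2);
                l2 &= ~(1UL << l2i);

                port = (l1i * BITS_PER_LONG) + l2i;
                irq  = evtchn_to_irq[port];
                if (irq != -1)
                    do_IRQ(irq, regs);          /* PIRQ/VIRQ/IPI path     */
                else
                    evtchn_device_upcall(port); /* unbound -> evtchn device */
            }
        }
    }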

Current IA64 approach for the callback:
  Current Xen/IA64 uses a pseudo physical IRQ to indicate that events are
pending and dispatches them in that pseudo IRQ handler. At the Xen summit we
all agreed to implement the callback/failsafe-callback mechanism instead, to
avoid potential bugs, and Intel is working on that now.

How X86 Xenlinux handles IRQs:
   Guest IRQs, including PIRQs, VIRQs, IPIs and interdomain communication
channels, are all bound to event channels, i.e. they are all carried by
event channels.

   At initialization time, the guest initializes the IO_APIC hardware based
on the knowledge presented by firmware, and eventually registers a purely
virtual "pirq_type" as the hw_interrupt_type instead of ioapic_level_type
and ioapic_edge_type.

   At run time "pirq_type" performs purely event-channel-based operations.
For example, irq_desc->handler->ack (which becomes ack_pirq) masks the
corresponding event channel (no hypercall). irq_desc->handler->end
(which becomes end_pirq) unmasks the corresponding event channel and
may notify Xen through a hypercall (PHYSDEVOP_IRQ_UNMASK_NOTIFY)
so that Xen calls its own irq_desc->handler->end; the latter may signal "EOI"
in the hypervisor (for IO_APIC this is unmask_IO_APIC_irq).
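
A rough C sketch of those two runtime ops, based on the x86 Xenlinux evtchn
code (helper names such as irq_to_evtchn[] and pirq_needs_unmask_notify()
are illustrative):

    /* Sketch of pirq_type's runtime ops (illustrative names). */
    static void ack_pirq(unsigned int irq)
    {
        /* Only touch shared memory: mask and clear the bound
         * event channel. No hypercall is needed here. */
        mask_evtchn(irq_to_evtchn[irq]);
        clear_evtchn(irq_to_evtchn[irq]);
    }

    static void end_pirq(unsigned int irq)
    {
        int evtchn = irq_to_evtchn[irq];

        unmask_evtchn(evtchn);

        /* If Xen still holds the machine IRQ "in flight", ask it to run
         * its own irq_desc->handler->end (e.g. unmask_IO_APIC_irq). */
        if (pirq_needs_unmask_notify(irq)) {
            physdev_op_t op = { .cmd = PHYSDEVOP_IRQ_UNMASK_NOTIFY };
            (void)HYPERVISOR_physdev_op(&op);
        }
    }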

 
Difference between pirq_type and ioapic_level_type/ioapic_edge_type:
     The initialization paths of these two types are similar, i.e.
   startup/shutdown and enable/disable are the same; both may need to access
   machine resources.
     But the runtime services, i.e. ack/end, are quite different. pirq_type
   mainly accesses event-channel-related shared memory for mask/unmask, while
   ioapic_level_type/ioapic_edge_type need to access machine IOSAPIC
   resources; for example, ack_edge_ioapic_irq and ack_edge_ioapic_vector
   need to mask the APIC resource and ack the APIC.
     Another difference is that with the event channel approach a single
   hw_interrupt_type, i.e. pirq_type, works for both level- and
   edge-triggered IRQs.
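
For illustration, the registration then looks roughly like the following
(a sketch following the 2.6-era hw_interrupt_type layout; the op names
mirror the Xenlinux ones):

    /* Sketch: one virtual hw_interrupt_type serves both level- and
     * edge-triggered PIRQs, since all ops act on event channels. */
    static struct hw_interrupt_type pirq_type = {
        .typename = "Phys-irq",
        .startup  = startup_pirq,   /* bind the PIRQ to an event channel */
        .shutdown = shutdown_pirq,  /* close the event channel           */
        .enable   = enable_pirq,
        .disable  = disable_pirq,
        .ack      = ack_pirq,       /* mask/clear event channel          */
        .end      = end_pirq,       /* unmask + optional notify to Xen   */
    };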


When Xen receives PHYSDEVOP_IRQ_UNMASK_NOTIFY (from the guest's pirq_type.end):
    pirq_guest_unmask()
        if ( --irq_desc->action->in_flight == 0 ) {
                irq_desc->handler->end();   // "EOI"
        }
    Done;



Machine IRQ delivery in Xen/X86
  The code flow of Xen IRQ delivery (for an IRQ that belongs to guests):
  A machine IRQ happens -> xen -> do_IRQ() of xen.
    irq_desc->handler->ack();   // same as Linux, operates on the real resource
    __do_IRQ_guest()
        for each bound guest {
            send_guest_pirq();
            irq_desc->action->in_flight++; 
        }
    Done;
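
   In C this step looks roughly as follows (a sketch of Xen's
__do_IRQ_guest; the in_flight counter is what later gates the real EOI in
pirq_guest_unmask()):

    /* Sketch of __do_IRQ_guest (xen/arch/x86/irq.c style). */
    static void __do_IRQ_guest(int vector)
    {
        irq_desc_t         *desc   = &irq_desc[vector];
        irq_guest_action_t *action = (irq_guest_action_t *)desc->action;
        int                 i;

        for (i = 0; i < action->nr_guests; i++) {
            /* Each bound guest owes one PHYSDEVOP_IRQ_UNMASK_NOTIFY
             * before the machine-level end()/EOI is performed. */
            action->in_flight++;
            send_guest_pirq(action->guest[i], vector);
        }
    }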

   send_guest_pirq():
     Sets the pending event-channel bit (shared evtchn_pending) for the
   target processor. In an SMP system, if the target processor is currently
   running, a machine IPI is sent to it (evtchn_notify).
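
   A sketch of that path (modeled on xen/common/event_channel.c; structure
   and field names are from memory and only illustrative):

    /* Sketch: forward a machine PIRQ to the guest as an event. */
    static void send_guest_pirq(struct domain *d, int pirq)
    {
        int            port = d->pirq_to_evtchn[pirq];
        struct vcpu   *v    = d->vcpu[0];        /* notified VCPU */
        shared_info_t *s    = d->shared_info;

        /* Mark the port pending; if it was not already pending, not
         * masked, and the selector bit was clear, raise the upcall. */
        if (!test_and_set_bit(port, &s->evtchn_pending[0]) &&
            !test_bit(port, &s->evtchn_mask[0]) &&
            !test_and_set_bit(port / BITS_PER_LONG,
                              &v->vcpu_info->evtchn_pending_sel)) {
            v->vcpu_info->evtchn_upcall_pending = 1;
            evtchn_notify(v);   /* machine IPI if v runs on another CPU */
        }
    }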
        
When Xen returns to the guest
     Before restore_all_guest, if VCPUINFO_upcall_mask = 0,
   i.e. evtchn_upcall_mask = 0, and there is a pending event channel,
   Xen creates a bounce frame on the guest stack, similar to an
   exception frame, and guest control then goes to the callback entry.
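
   In C-like pseudocode that return-path check is roughly as follows (the
   real x86 version lives in assembly around restore_all_guest and
   create_bounce_frame; names are illustrative):

    /* Pseudocode of the event check on the return-to-guest path. */
    vcpu_info_t *vi = current->vcpu_info;   /* shared with the guest */

    if (!vi->evtchn_upcall_mask && vi->evtchn_upcall_pending) {
        vi->evtchn_upcall_mask = 1;      /* "interrupts off" during upcall */
        create_bounce_frame(current);    /* exception-like frame on the
                                            guest stack */
        /* guest resumes at its registered callback (evtchn_do_upcall) */
    }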

    



* Xen/IA64 IRQ virtualization design

1: Hypervisor owns the machine IOSAPIC/LSAPIC exclusively.
   This makes IRQ sharing between driver domains much easier, as there
is no contention among domains.

 
2: Machine IRQ delivery in Xen/IA64:
 
   The basic logic is exactly the same as in Linux/IA64:
   An IRQ happens -> IVT+0x3000 -> ia64_handle_irq()
   while (IRQ exists) {
        vector = CR.IVR;
        mask IRQ using TPR;
        __do_IRQ();
        unmask IRQ using TPR;
        issue CR.EOI;
   }
   
   A slight difference is __do_IRQ: in Linux the loop calls do_IRQ, but
Xen merges do_IRQ and __do_IRQ together and uses the name
__do_IRQ.   --- Reuse
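
   For reference, a C sketch of that loop, modeled on the Linux/IA64
ia64_handle_irq (register accessors are the Linux ones; the Xen flavour
differs mainly in which __do_IRQ it calls):

    /* Sketch of the interrupt loop, modeled on Linux/IA64. */
    void ia64_handle_irq(ia64_vector vector, struct pt_regs *regs)
    {
        unsigned long saved_tpr = ia64_getreg(_IA64_REG_CR_TPR);

        while (vector != IA64_SPURIOUS_INT_VECTOR) {
            /* Mask this and lower-priority interrupts via TPR. */
            ia64_setreg(_IA64_REG_CR_TPR, vector);
            ia64_srlz_d();

            __do_IRQ(local_vector_to_irq(vector), regs);

            /* Restore TPR, then signal EOI to the local SAPIC. */
            ia64_setreg(_IA64_REG_CR_TPR, saved_tpr);
            ia64_srlz_d();
            ia64_eoi();

            vector = ia64_get_ivr();    /* any further pending IRQ? */
        }
    }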
   
   do_IRQ does the following (see the API in Xen/arch/x86/irq.c); the code
sequence is the same, shown here in a bit more detail:
        desc = &irq_desc[vector];
        desc->handler->ack(vector);
        if ( desc->status & IRQ_GUEST ) {
                __do_IRQ_guest(vector);  // forward to the bound guest(s)
                return;                  // end()/EOI is deferred until the
                                         // guests' unmask notify arrives
        }
        // otherwise this is Xen's own IRQ
        action = desc->action;
        action->handler(...);
        desc->handler->end(vector);
 
   __do_IRQ_guest(..) is the same as on X86. ---- Reuse


3: The machine SAPIC operation snapshot:
    A machine IRQ happens ->
    while (IRQ exists) {
        read IVR
        mask by TPR
        set event channel;
        unmask by TPR
        issue CR.EOI
    };
    
  In the above sequence IOSAPIC.EOI is not issued, so when the driver
domain gets active it does:
    Mask the event channel;
    run the action to handle the device IRQ;
    Unmask the event channel;
    Notify Xen to issue IOSAPIC.EOI through the PHYSDEVOP_IRQ_UNMASK_NOTIFY
hypercall.
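
  On the Xen side, the deferred EOI is then performed by the machine IOSAPIC
"end" handler once pirq_guest_unmask() sees in_flight reach zero; roughly
as follows (a sketch modeled on the Linux/IA64 iosapic code, simplified to a
single routing entry per vector; IOSAPIC_EOI is the register at offset 0x10):

    /* Sketch of the deferred IOSAPIC EOI (modeled on Linux/IA64 iosapic.c). */
    static inline void iosapic_eoi(char __iomem *iosapic_base, u32 vector)
    {
        /* Writing the vector to the EOI register re-arms the level-
         * triggered RTE so the device can assert the line again. */
        writel(vector, iosapic_base + IOSAPIC_EOI);
    }

    /* Called as irq_desc->handler->end() from pirq_guest_unmask(). */
    static void iosapic_end_level_irq(unsigned int irq)
    {
        ia64_vector vec = irq_to_vector(irq);

        iosapic_eoi(iosapic_intr_info[vec].addr, vec);
    }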
    
4: IRQ prioritization
    Use the event channel priority as the guest IRQ priority...
    
    
    
What the future patch looks like
    Most of the work for this patch will be in the initialization APIs, like
io_apic-xen.c in X86. The runtime features are already there and need
almost no code change. The current event channel mechanism supports SMP
hosts and guests well.
