Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-06-02 Thread Ian Campbell
On Mon, 2015-06-01 at 15:57 -0700, Manish Jaggi wrote:

> > Anyway, the general shape of this plan seems plausible enough.
> Could you modify http://xenbits.xen.org/people/ianc/vits/draftC.html
> (section 5, vITS to pITS mapping) based on this approach?

I'm updating things as I go and feedback will be reflected in the next
draft.


> > > -5- domU is booted with a single virtual ITS node in the device tree. The
> > > frontend driver attaches this ITS as msi-parent.
> > > -6- When domU accesses to the ITS are trapped in Xen, the pITS can be
> > > retrieved using a helper function, say
> > > get_phys_its_for_guest(guest_id, guest_sbdf, /*[out]*/its_ptr *its)
> > > 
> > > AFAIK this is NUMA safe.
> > > > > 2) When a PCI device is assigned to DomU, how does domU choose
> > > > >  the vITS to send commands to? AFAIK, the BDF of the assigned device
> > > > >  is different from the actual BDF in DomU.
> > > > AIUI this is described in the firmware tables.
> > > > 
> > > > e.g. in DT via the msi-parent phandle on the PCI root complex or
> > > > individual device.
> > > > 
> > > > Is there an assumption here that a single PCI root bridge is associated
> > > > with a single ITS block? Or can different devices on a PCI bus use
> > > > different ITS blocks?
> > > > 
> > > > Ian.
> > > > 
> > > > 


Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-06-01 Thread Manish Jaggi



On Tuesday 26 May 2015 06:04 AM, Ian Campbell wrote:

On Thu, 2015-05-21 at 05:37 -0700, Manish Jaggi wrote:

On Tuesday 19 May 2015 07:18 AM, Ian Campbell wrote:

On Tue, 2015-05-19 at 19:34 +0530, Vijay Kilari wrote:

On Tue, May 19, 2015 at 7:24 PM, Ian Campbell  wrote:

On Tue, 2015-05-19 at 14:36 +0100, Ian Campbell wrote:

On Tue, 2015-05-19 at 14:27 +0100, Julien Grall wrote:

With the multiple vITS we would have to retrieve the number of vITS.
Maybe by extending the xen_arch_domainconfig?

I'm sure we can find a way.

The important question is whether we want to go for a N:N vits:pits
mapping or 1:N.

So far I think we are leaning (slightly?) towards the 1:N model, if we
can come up with a satisfactory answer for what to do with global
commands.

Actually, Julien just mentioned NUMA which I think is a strong argument
for the N:N model.

We need to make a choice here one way or another, since it has knock on
effects on other parts, e.g the handling of SYNC and INVALL etc.

Given that N:N seems likely to be simpler from the Xen side and in any
case doesn't preclude us moving to a 1:N model (or even a 2:N model etc)
in the future how about we start with that?

If there is agreement in taking this direction then I will adjust the
relevant sections of the document to reflect this.

Yes, this makes the Xen side simple. The most important points to discuss are:

1) How Xen maps vITS to pITS. its0 -> vits0?

The choices are basically either Xen chooses and the tools get told (or
"Just Know" the result), or the tools choose and setup the mapping in
Xen via hypercalls.


This could be one possible flow:
-1- Xen code parses the PCI node and creates a pci_hostbridge structure
which stores the device_tree ptr
(using this pointer the msi-parent (i.e. the respective ITS) can be retrieved).
-2- dom0 invokes a hypercall to register the pci_hostbridge (seg_no:cfg_addr).
-3- Xen now knows which ITS serves a given device id (seg:bus:dev.fn).
Using a helper function the ITS node for a seg_no can be retrieved.
-4- When a device is assigned to a domU, we introduce a new hypercall,
map_guest_bdf, which would let Xen know
how a guest's virtual SBDF maps to a physical SBDF.

This is an extension to XEN_DOMCTL_assign_device, I think. An extension
because that hypercall currently only receives the physical SBDF.

I wonder how x86 knows the virtual SBDF. Perhaps it has no need to for
some reason.

Anyway, the general shape of this plan seems plausible enough.


Could you modify http://xenbits.xen.org/people/ianc/vits/draftC.html
(section 5, vITS to pITS mapping) based on this approach?


-5- domU is booted with a single virtual ITS node in the device tree. The
frontend driver attaches this ITS as msi-parent.
-6- When domU accesses to the ITS are trapped in Xen, the pITS can be
retrieved using a helper function, say

get_phys_its_for_guest(guest_id, guest_sbdf, /*[out]*/its_ptr *its)

AFAIK this is NUMA safe.
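
As a rough illustration, the lookup in -6- could be backed by a per-guest
table populated by the map_guest_bdf hypercall from -4-. A minimal sketch in
C, assuming hypothetical types and a hypothetical sbdf_map_of() helper (only
get_phys_its_for_guest and map_guest_bdf come from the flow above):

#include <stdint.h>

struct its_node;                   /* physical ITS, as parsed from DT */

struct sbdf_map_entry {
    uint32_t guest_sbdf;           /* virtual seg:bus:dev.fn */
    uint32_t phys_sbdf;            /* physical seg:bus:dev.fn */
    struct its_node *pits;         /* pITS serving the physical device */
    struct sbdf_map_entry *next;
};

/* Per-guest list populated by the proposed map_guest_bdf hypercall. */
struct sbdf_map_entry *sbdf_map_of(uint32_t guest_id);

int get_phys_its_for_guest(uint32_t guest_id, uint32_t guest_sbdf,
                           struct its_node **its)
{
    struct sbdf_map_entry *e;

    for ( e = sbdf_map_of(guest_id); e; e = e->next )
        if ( e->guest_sbdf == guest_sbdf )
        {
            *its = e->pits;
            return 0;
        }

    return -1;                     /* device not assigned to this guest */
}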

2) When a PCI device is assigned to DomU, how does domU choose
  the vITS to send commands to? AFAIK, the BDF of the assigned device
  is different from the actual BDF in DomU.

AIUI this is described in the firmware tables.

e.g. in DT via the msi-parent phandle on the PCI root complex or
individual device.

Is there an assumption here that a single PCI root bridge is associated
with a single ITS block? Or can different devices on a PCI bus use
different ITS blocks?

Ian.




Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-26 Thread Ian Campbell
On Thu, 2015-05-21 at 05:37 -0700, Manish Jaggi wrote:
> 
> On Tuesday 19 May 2015 07:18 AM, Ian Campbell wrote:
> > On Tue, 2015-05-19 at 19:34 +0530, Vijay Kilari wrote:
> >> On Tue, May 19, 2015 at 7:24 PM, Ian Campbell  
> >> wrote:
> >>> On Tue, 2015-05-19 at 14:36 +0100, Ian Campbell wrote:
>  On Tue, 2015-05-19 at 14:27 +0100, Julien Grall wrote:
> > With the multiple vITS we would have to retrieve the number of vITS.
> > Maybe by extending the xen_arch_domainconfig?
>  I'm sure we can find a way.
> 
>  The important question is whether we want to go for a N:N vits:pits
>  mapping or 1:N.
> 
>  So far I think we are leaning (slightly?) towards the 1:N model, if we
>  can come up with a satisfactory answer for what to do with global
>  commands.
> >>> Actually, Julien just mentioned NUMA which I think is a strong argument
> >>> for the N:N model.
> >>>
> >>> We need to make a choice here one way or another, since it has knock on
> >>> effects on other parts, e.g the handling of SYNC and INVALL etc.
> >>>
> >>> Given that N:N seems likely to be simpler from the Xen side and in any
> >>> case doesn't preclude us moving to a 1:N model (or even a 2:N model etc)
> >>> in the future how about we start with that?
> >>>
> >>> If there is agreement in taking this direction then I will adjust the
> >>> relevant sections of the document to reflect this.
> >> Yes, this makes the Xen side simple. The most important points to discuss are:
> >>
> >> 1) How Xen maps vITS to pITS. its0 -> vits0?
> > The choices are basically either Xen chooses and the tools get told (or
> > "Just Know" the result), or the tools choose and setup the mapping in
> > Xen via hypercalls.
> >
> This could be one possible flow:
> -1- Xen code parses the PCI node and creates a pci_hostbridge structure 
> which stores the device_tree ptr
> (using this pointer the msi-parent (i.e. the respective ITS) can be retrieved).
> -2- dom0 invokes a hypercall to register the pci_hostbridge (seg_no:cfg_addr).
> -3- Xen now knows which ITS serves a given device id (seg:bus:dev.fn).
> Using a helper function the ITS node for a seg_no can be retrieved.
> -4- When a device is assigned to a domU, we introduce a new hypercall 
> map_guest_bdf, which would let Xen know
> how a guest's virtual SBDF maps to a physical SBDF.

This is an extension to XEN_DOMCTL_assign_device, I think. An extension
because that hypercall currently only receives the physical SBDF.

I wonder how x86 knows the virtual SBDF. Perhaps it has no need to for
some reason.

Anyway, the general shape of this plan seems plausible enough.

> -5- domU is booted with a single virtual ITS node in the device tree. The
> frontend driver attaches this ITS as msi-parent.
> -6- When domU accesses to the ITS are trapped in Xen, the pITS can be
> retrieved using a helper function, say
> get_phys_its_for_guest(guest_id, guest_sbdf, /*[out]*/its_ptr *its)
> 
> AFAIK this is NUMA safe.
> >> 2) When a PCI device is assigned to DomU, how does domU choose
> >>  the vITS to send commands to? AFAIK, the BDF of the assigned device
> >>  is different from the actual BDF in DomU.
> > AIUI this is described in the firmware tables.
> >
> > e.g. in DT via the msi-parent phandle on the PCI root complex or
> > individual device.
> >
> > Is there an assumption here that a single PCI root bridge is associated
> > with a single ITS block? Or can different devices on a PCI bus use
> > different ITS blocks?
> >
> > Ian.
> >
> >


Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-21 Thread Manish Jaggi



On Tuesday 19 May 2015 07:18 AM, Ian Campbell wrote:

On Tue, 2015-05-19 at 19:34 +0530, Vijay Kilari wrote:

On Tue, May 19, 2015 at 7:24 PM, Ian Campbell  wrote:

On Tue, 2015-05-19 at 14:36 +0100, Ian Campbell wrote:

On Tue, 2015-05-19 at 14:27 +0100, Julien Grall wrote:

With the multiple vITS we would have to retrieve the number of vITS.
Maybe by extending the xen_arch_domainconfig?

I'm sure we can find a way.

The important question is whether we want to go for a N:N vits:pits
mapping or 1:N.

So far I think we are leaning (slightly?) towards the 1:N model, if we
can come up with a satisfactory answer for what to do with global
commands.

Actually, Julien just mentioned NUMA which I think is a strong argument
for the N:N model.

We need to make a choice here one way or another, since it has knock on
effects on other parts, e.g the handling of SYNC and INVALL etc.

Given that N:N seems likely to be simpler from the Xen side and in any
case doesn't preclude us moving to a 1:N model (or even a 2:N model etc)
in the future how about we start with that?

If there is agreement in taking this direction then I will adjust the
relevant sections of the document to reflect this.

Yes, this makes the Xen side simple. The most important points to discuss are:

1) How Xen maps vITS to pITS. its0 -> vits0?

The choices are basically either Xen chooses and the tools get told (or
"Just Know" the result), or the tools choose and setup the mapping in
Xen via hypercalls.


This could be one possible flow:
-1- Xen code parses the PCI node and creates a pci_hostbridge structure
which stores the device_tree ptr
(using this pointer the msi-parent (i.e. the respective ITS) can be retrieved).
-2- dom0 invokes a hypercall to register the pci_hostbridge (seg_no:cfg_addr).
-3- Xen now knows which ITS serves a given device id (seg:bus:dev.fn).
Using a helper function the ITS node for a seg_no can be retrieved.
-4- When a device is assigned to a domU, we introduce a new hypercall,
map_guest_bdf, which would let Xen know
how a guest's virtual SBDF maps to a physical SBDF.
-5- domU is booted with a single virtual ITS node in the device tree. The
frontend driver attaches this ITS as msi-parent.
-6- When domU accesses to the ITS are trapped in Xen, the pITS can be
retrieved using a helper function, say

get_phys_its_for_guest(guest_id, guest_sbdf, /*[out]*/its_ptr *its)

AFAIK this is NUMA safe.

2) When a PCI device is assigned to DomU, how does domU choose
 the vITS to send commands to? AFAIK, the BDF of the assigned device
 is different from the actual BDF in DomU.

AIUI this is described in the firmware tables.

e.g. in DT via the msi-parent phandle on the PCI root complex or
individual device.

Is there an assumption here that a single PCI root bridge is associated
with a single ITS block? Or can different devices on a PCI bus use
different ITS blocks?

Ian.




Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-19 Thread Julien Grall
On 19/05/15 15:48, Ian Campbell wrote:
>> Software would be buggy if no INV/INVALL is sent after changing the LPI
>> configuration table.
> 
> Specifically _guest_ software.
> 
> AIUI the ITS is not required to reread the LPI cfg table unless an
> INV/INVALL is issued, but it is allowed to do so if it wants, i.e. it
> could pick up the config change at any point after the write to the cfg
> table. Is that correct?

Yes.

> If so then as long as it cannot blow up in Xen's face (i.e. an interrupt
> storm) I think between a write to the LPI config table and the next
> associated INV/INVALL we are entitled either to continue using the old
> config until the INV/INVALL, to immediately enact the change, or anything
> in between. I think this gives a fair bit of flexibility.

The interrupt is deprivileged by Xen and EOIed by the guest. I don't think
it's possible to produce an interrupt storm.

> You've proposed something at the "immediately enact" end of the
> spectrum.

Yes, it is one suggestion among others.

>> As suggested in a previous mail, I think we can get rid of sending
>> INV/INVALL commands to the pITS by trapping the LPI configuration table:
> 
> The motivation here is simply to avoid the potential negative impact on
> the system of a guest which fills its command queue with INVALL
> commands?

Right.

> I think we don't especially care about INV since they are targeted. We
> care about INVALL because they are global. INV handling comes along for
> the ride though.
> 
>> For every write access, when the vLPI is valid (i.e. associated with a
>> device/interrupt), Xen will toggle the enable bit in the hardware LPI
>> configuration table, send an INV * and sync its internal state. This
>> requires being able to translate the vLPI to a (device, ID) pair.
> 
> "INV *"? You don't mean INVALL I think, but rather INV of the specific
> device?

Yes, I mean the INV command.

> 
> One possible downside is that you will convert this guest vits
> interaction:
> for all LPIs
> enable LPI
> INVALL
> 
> Into this pits interaction:
> for all LPIs
> enable LPI
> INV LPI
> 
> Also sequences of events which toggle things back and forth before
> invalidating are similarly made more synchronous. (Such sequences seem
> dumb to me, but kernel side abstractions sometimes lead to such things).

Correct, this will result in sending many more commands to the ITS.

>> INVALL/INV commands could be ignored and CREADR directly incremented (with
>> some care) because they only ensure that the command has been executed,
>> not fully completed. A SYNC would be required from the guest in order to
>> ensure the completion.
>>
>> Therefore we would need more care for the SYNC. Maybe by injecting a
>> SYNC when it's necessary.
>>
>> Note that we would need Xen to send commands on behalf of the guest (i.e.
>> not part of the command queue).
> 
> A guest may do this:
> Enqueue command A
> Enqueue command B
> Change LPI1 cfg table
> Change LPI2 cfg table
> Enqueue command C
> Enqueue command D
> Enqueue INV LPI2
> Enqueue INV LPI1
> 
> With your change this would end up going to the PITS as:
> Enqueue command A
> Enqueue command B
> Change LPI1 cfg table
> Enqueue INV LPI1
> Change LPI2 cfg table
> Enqueue INV LPI2
> Enqueue command C
> Enqueue command D
> 
> Note that the INV's have been reordered WRT command C and D as well as
> each other. Are there sequences of commands where this may make a
> semantic difference?

AFAICT, the commands' semantics don't depend on the state of the LPI
configuration.

> What if command C is a SYNC for example?

That would not be a problem. As soon as the OS writes into the LPI
configuration table it must expect the ITS to pick up the change at any time.

>> With this solution, it would be possible to have a small amount of time
>> where the pITS doesn't use the correct configuration (i.e. the
>> interrupt not yet enabled/disabled). Xen is able to cooperate with that
>> and will queue the interrupt to the guest.
> 
> I think it is inherent in the h/w design that an LPI may still be
> delivered after the cfg table has changed or even the INV enqueued; it
> is only guaranteed to take effect with a sync following the INV.

Right.

> I had in mind a lazier scheme which I'll mention for completeness, not
> because I necessarily think it is better.

I wasn't expecting to have a correct solution from the beginning ;). It
was more a first step towards a better one such as yours.

> For each vits we maintain a bit map which marks LPI cfg table entries as
> dirty. Possibly a count of dirty entries too.
> 
> On trap of cfg table write we propagate the change to the physical table
> and set the corresponding dirty bit (and count++ if we are doing that)
> 
> On INV we insert the corresponding INV to the PITS iff
> test_and_clear(dirty, LPI) and count--.
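
A minimal sketch of this lazy scheme in C, assuming hypothetical structure
and helper names (only the test_and_clear/count idea comes from the text
above; the bit helpers stand in for Xen's usual atomic bitops):

#include <stdint.h>

#define NR_LPIS       8192
#define BITS_PER_LONG (8 * sizeof(unsigned long))

struct vits {
    unsigned long dirty[NR_LPIS / BITS_PER_LONG]; /* dirty cfg entries */
    unsigned int nr_dirty;                        /* optional count */
};

/* Assumed helpers: Xen-style atomic bitops and pITS plumbing. */
int test_and_set_bit(int nr, volatile void *addr);
int test_and_clear_bit(int nr, volatile void *addr);
void propagate_cfg_write_to_pits(uint32_t vlpi, uint8_t val);
void pits_enqueue_inv(uint32_t vlpi);

/* Trap of a guest write to the LPI configuration table. */
void vits_cfg_write(struct vits *v, uint32_t vlpi, uint8_t val)
{
    propagate_cfg_write_to_pits(vlpi, val);
    if ( !test_and_set_bit(vlpi, v->dirty) )
        v->nr_dirty++;
}

/* Guest INV: forward it to the pITS only if the entry is dirty. */
void vits_handle_inv(struct vits *v, uint32_t vlpi)
{
    if ( test_and_clear_bit(vlpi, v->dirty) )
    {
        v->nr_dirty--;
        pits_enqueue_inv(vlpi);
    }
}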

Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-19 Thread Ian Campbell
On Tue, 2015-05-19 at 15:05 +0100, Julien Grall wrote:
> On 19/05/15 13:48, Vijay Kilari wrote:
> > On Tue, May 19, 2015 at 5:49 PM, Ian Campbell  
> > wrote:
> >> On Tue, 2015-05-19 at 17:40 +0530, Vijay Kilari wrote:
>  If a guest issues (for example) a MOVI which is not followed by an
>  INV/INVALL on native then what would trigger the LPI configuration to be
>  applied by the h/w?
> 
>  If a guest is required to send an INV/INVALL in order for some change to
>  take effect and it does not do so then it is buggy, isn't it?
> >>>
> >>> agreed.
> >>>
> 
>  IOW all Xen needs to do is to propagate any guest initiated INV/INVALL
>  as/when it occurs in the command queue. I don't think we need to
>  fabricate an additional INV/INVALL while emulating a MOVI.
> 
>  What am I missing?
> >>>
> >>> back to point:
> >>>
> >>> INV has a device id, so it is not an issue.
> >>> INVALL does not have a device id to know which pITS to send it to.
> >>> For that reason Xen is expected to insert INVALL at the proper
> >>> places, similar to SYNC, and ignore the guest's INV/INVALL.
> >>
> >> Why wouldn't Xen just insert an INVALL into all relevant pITS in
> >> response to an INVALL from the guest?
> > 
> > If INVALL is sent on all pITS, then we need to wait for all pITS to complete
> > the command before we update CREADR of vITS.
> > 
> >>
> >> If you are proposing something different then please be explicit about what
> >> you mean by "proper places similar to SYNC". Ideally by proposing some
> >> new text which I can use in the document.
> > 
> > If the platform has more than 1 pITS, the ITS commands are mapped
> > from vITS to pITS using the device ID provided with the ITS command.
> > 
> > However SYNC and INVALL do not have a device ID.
> > In such a case there are two ways to handle this:
> > 1) the guest's SYNC and INVALL will be sent to a pITS based on the
> > guest's previous ITS commands
> > 2) Xen will insert/append SYNC and INVALL to the guest's ITS commands
> > wherever required and ignore the guest's SYNC and INVALL commands
> > 
> > IMO (2) would be better, as approach (1) might fail to handle the
> > scenario wherein the guest sends only SYNC & INVALL commands.
> 
> When the guest sends a SYNC, it expects all the commands to be completed.
> If you send SYNC only when you think it's required we will end up with
> unexpected behavior.
> 
> Now, for INVALL, as said in a previous mail it's never required after an
> instruction. It's used to ask the ITS to invalidate its cache of the LPI
> configuration.
> 
> Software would be buggy if no INV/INVALL is sent after changing the LPI
> configuration table.

Specifically _guest_ software.

AIUI the ITS is not required to reread the LPI cfg table unless an
INV/INVALL is issued, but it is allowed to do so if it wants, i.e. it
could pick up the config change at any point after the write to the cfg
table. Is that correct?

If so then as long as it cannot blow up in Xen's face (i.e. an interrupt
storm) I think between a write to the LPI config table and the next
associated INV/INVALL we are entitled either to continue using the old
config until the INV/INVALL, to immediately enact the change, or anything
in between. I think this gives a fair bit of flexibility.

You've proposed something at the "immediately enact" end of the
spectrum.

> As suggested in a previous mail, I think we can get rid of sending
> INV/INVALL commands to the pITS by trapping the LPI configuration table:

The motivation here is simply to avoid the potential negative impact on
the system of a guest which fills its command queue with INVALL
commands?

I think we don't especially care about INV since they are targeted. We
care about INVALL because they are global. INV handling comes along for
the ride though.

> For every write access, when the vLPI is valid (i.e. associated with a
> device/interrupt), Xen will toggle the enable bit in the hardware LPI
> configuration table, send an INV * and sync its internal state. This
> requires being able to translate the vLPI to a (device, ID) pair.

"INV *"? You don't mean INVALL I think, but rather INV of the specific
device?

One possible downside is that you will convert this guest vits
interaction:
for all LPIs
enable LPI
INVALL

Into this pits interaction:
for all LPIs
enable LPI
INV LPI

Also sequences of events which toggle things back and forth before
invalidating are similarly made more synchronous. (Such sequences seem
dumb to me, but kernel side abstractions sometimes lead to such things).

> INVALL/INV commands could be ignored and CREADR directly incremented (with
> some care) because they only ensure that the command has been executed,
> not fully completed. A SYNC would be required from the guest in order to
> ensure the completion.
> 
> Therefore we would need more care for the SYNC. Maybe by injecting a
> SYNC when it's necessary.
> 
> Note that we would need Xen to send commands on behalf of the guest (i.e.
> not part of the command queue).

Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-19 Thread Ian Campbell
On Tue, 2015-05-19 at 19:34 +0530, Vijay Kilari wrote:
> On Tue, May 19, 2015 at 7:24 PM, Ian Campbell  wrote:
> > On Tue, 2015-05-19 at 14:36 +0100, Ian Campbell wrote:
> >> On Tue, 2015-05-19 at 14:27 +0100, Julien Grall wrote:
> >> > With the multiple vITS we would have to retrieve the number of vITS.
> >> > Maybe by extending the xen_arch_domainconfig?
> >>
> >> I'm sure we can find a way.
> >>
> >> The important question is whether we want to go for a N:N vits:pits
> >> mapping or 1:N.
> >>
> >> So far I think we are leaning (slightly?) towards the 1:N model, if we
> >> can come up with a satisfactory answer for what to do with global
> >> commands.
> >
> > Actually, Julien just mentioned NUMA which I think is a strong argument
> > for the N:N model.
> >
> > We need to make a choice here one way or another, since it has knock on
> > effects on other parts, e.g the handling of SYNC and INVALL etc.
> >
> > Given that N:N seems likely to be simpler from the Xen side and in any
> > case doesn't preclude us moving to a 1:N model (or even a 2:N model etc)
> > in the future how about we start with that?
> >
> > If there is agreement in taking this direction then I will adjust the
> > relevant sections of the document to reflect this.
> 
> Yes, this makes the Xen side simple. The most important points to discuss are:
> 
> 1) How Xen maps vITS to pITS. its0 -> vits0?

The choices are basically either Xen chooses and the tools get told (or
"Just Know" the result), or the tools choose and setup the mapping in
Xen via hypercalls.

> 2) When a PCI device is assigned to DomU, how does domU choose
> the vITS to send commands to? AFAIK, the BDF of the assigned device
> is different from the actual BDF in DomU.

AIUI this is described in the firmware tables.

e.g. in DT via the msi-parent phandle on the PCI root complex or
individual device.

Is there an assumption here that a single PCI root bridge is associated
with a single ITS block? Or can different devices on a PCI bus use
different ITS blocks?

Ian.




Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-19 Thread Julien Grall
On 19/05/15 14:54, Ian Campbell wrote:
> On Tue, 2015-05-19 at 14:36 +0100, Ian Campbell wrote:
>> On Tue, 2015-05-19 at 14:27 +0100, Julien Grall wrote:
>>> With the multiple vITS we would have to retrieve the number of vITS.
>>> Maybe by extending the xen_arch_domainconfig?
>>
>> I'm sure we can find a way.
>>
>> The important question is whether we want to go for a N:N vits:pits
>> mapping or 1:N.
>>
>> So far I think we are leaning (slightly?) towards the 1:N model, if we
>> can come up with a satisfactory answer for what to do with global
>> commands.
> 
> Actually, Julien just mentioned NUMA which I think is a strong argument
> for the N:N model.
> 
> We need to make a choice here one way or another, since it has knock on
> effects on other parts, e.g the handling of SYNC and INVALL etc.
> 
> Given that N:N seems likely to be simpler from the Xen side and in any
> case doesn't preclude us moving to a 1:N model (or even a 2:N model etc)
> in the future how about we start with that?

+1.

-- 
Julien Grall



Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-19 Thread Julien Grall
On 19/05/15 13:48, Vijay Kilari wrote:
> On Tue, May 19, 2015 at 5:49 PM, Ian Campbell  wrote:
>> On Tue, 2015-05-19 at 17:40 +0530, Vijay Kilari wrote:
 If a guest issues (for example) a MOVI which is not followed by an
 INV/INVALL on native then what would trigger the LPI configuration to be
 applied by the h/w?

 If a guest is required to send an INV/INVALL in order for some change to
 take effect and it does not do so then it is buggy, isn't it?
>>>
>>> agreed.
>>>

 IOW all Xen needs to do is to propagate any guest initiated INV/INVALL
 as/when it occurs in the command queue. I don't think we need to
 fabricate an additional INV/INVALL while emulating a MOVI.

 What am I missing?
>>>
>>> back to point:
>>>
>>> INV has a device id, so it is not an issue.
>>> INVALL does not have a device id to know which pITS to send it to.
>>> For that reason Xen is expected to insert INVALL at the proper
>>> places, similar to SYNC, and ignore the guest's INV/INVALL.
>>
>> Why wouldn't Xen just insert an INVALL into all relevant pITS in
>> response to an INVALL from the guest?
> 
> If INVALL is sent on all pITS, then we need to wait for all pITS to complete
> the command before we update CREADR of vITS.
> 
>>
>> If you are proposing something different then please be explicit about what
>> you mean by "proper places similar to SYNC". Ideally by proposing some
>> new text which I can use in the document.
> 
> If the platform has more than 1 pITS, the ITS commands are mapped
> from vITS to pITS using the device ID provided with the ITS command.
> 
> However SYNC and INVALL do not have a device ID.
> In such a case there are two ways to handle this:
> 1) the guest's SYNC and INVALL will be sent to a pITS based on the
> guest's previous ITS commands
> 2) Xen will insert/append SYNC and INVALL to the guest's ITS commands
> wherever required and ignore the guest's SYNC and INVALL commands
> 
> IMO (2) would be better, as approach (1) might fail to handle the
> scenario wherein the guest sends only SYNC & INVALL commands.

When the guest sends a SYNC, it expects all the commands to be completed.
If you send SYNC only when you think it's required we will end up with
unexpected behavior.

Now, for INVALL, as said in a previous mail it's never required after an
instruction. It's used to ask the ITS to invalidate its cache of the LPI
configuration.

Software would be buggy if no INV/INVALL is sent after changing the LPI
configuration table.

As suggested in a previous mail, I think we can get rid of sending
INV/INVALL commands to the pITS by trapping the LPI configuration table:

For every write access, when the vLPI is valid (i.e. associated with a
device/interrupt), Xen will toggle the enable bit in the hardware LPI
configuration table, send an INV * and sync its internal state. This
requires being able to translate the vLPI to a (device, ID) pair.

INVALL/INV commands could be ignored and CREADR directly incremented (with
some care) because they only ensure that the command has been executed,
not fully completed. A SYNC would be required from the guest in order to
ensure the completion.

Therefore we would need more care for the SYNC. Maybe by injecting a
SYNC when it's necessary.

Note that we would need Xen to send commands on behalf of the guest (i.e.
not part of the command queue).

With this solution, it would be possible to have a small amount of time
where the pITS doesn't use the correct configuration (i.e. the
interrupt not yet enabled/disabled). Xen is able to cooperate with that
and will queue the interrupt to the guest.
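
A minimal sketch in C of the trapping scheme described above, with
hypothetical helper names (the vLPI-to-(device, ID) translation, the enable
bit toggle and the targeted INV are the steps described in the text; the
helper signatures are invented here):

#include <stdbool.h>
#include <stdint.h>

#define LPI_CFG_ENABLE 0x1

struct domain;                                    /* opaque here */

/* Assumed helpers. */
bool vlpi_to_device(struct domain *d, uint32_t vlpi,
                    uint32_t *devid, uint32_t *eventid);
void plpi_cfg_set_enable(uint32_t devid, uint32_t eventid, bool enable);
void pits_send_inv(uint32_t devid, uint32_t eventid);
void vits_update_cfg_state(struct domain *d, uint32_t vlpi, uint8_t cfg);

/* Trap of a guest write to its (virtual) LPI configuration table. */
void vits_cfg_table_write(struct domain *d, uint32_t vlpi, uint8_t cfg)
{
    uint32_t devid, eventid;

    /* Only act on vLPIs already associated with a (device, ID) pair. */
    if ( !vlpi_to_device(d, vlpi, &devid, &eventid) )
        return;

    /* Toggle the enable bit in the hardware LPI configuration table. */
    plpi_cfg_set_enable(devid, eventid, cfg & LPI_CFG_ENABLE);

    /* Targeted INV (not INVALL), sent by Xen on behalf of the guest. */
    pits_send_inv(devid, eventid);

    /* Sync Xen's internal state for this vLPI. */
    vits_update_cfg_state(d, vlpi, cfg);
}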

Regards,

-- 
Julien Grall



Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-19 Thread Vijay Kilari
On Tue, May 19, 2015 at 7:24 PM, Ian Campbell  wrote:
> On Tue, 2015-05-19 at 14:36 +0100, Ian Campbell wrote:
>> On Tue, 2015-05-19 at 14:27 +0100, Julien Grall wrote:
>> > With the multiple vITS we would have to retrieve the number of vITS.
>> > Maybe by extending the xen_arch_domainconfig?
>>
>> I'm sure we can find a way.
>>
>> The important question is whether we want to go for a N:N vits:pits
>> mapping or 1:N.
>>
>> So far I think we are leaning (slightly?) towards the 1:N model, if we
>> can come up with a satisfactory answer for what to do with global
>> commands.
>
> Actually, Julien just mentioned NUMA which I think is a strong argument
> for the N:N model.
>
> We need to make a choice here one way or another, since it has knock on
> effects on other parts, e.g the handling of SYNC and INVALL etc.
>
> Given that N:N seems likely to be simpler from the Xen side and in any
> case doesn't preclude us moving to a 1:N model (or even a 2:N model etc)
> in the future how about we start with that?
>
> If there is agreement in taking this direction then I will adjust the
> relevant sections of the document to reflect this.

Yes, this makes the Xen side simple. The most important points to discuss are:

1) How Xen maps vITS to pITS. its0 -> vits0?
2) When a PCI device is assigned to DomU, how does domU choose
the vITS to send commands to? AFAIK, the BDF of the assigned device
is different from the actual BDF in DomU.



Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-19 Thread Ian Campbell
On Tue, 2015-05-19 at 14:36 +0100, Ian Campbell wrote:
> On Tue, 2015-05-19 at 14:27 +0100, Julien Grall wrote:
> > With the multiple vITS we would have to retrieve the number of vITS.
> > Maybe by extending the xen_arch_domainconfig?
> 
> I'm sure we can find a way.
> 
> The important question is whether we want to go for a N:N vits:pits
> mapping or 1:N.
> 
> So far I think we are leaning (slightly?) towards the 1:N model, if we
> can come up with a satisfactory answer for what to do with global
> commands.

Actually, Julien just mentioned NUMA which I think is a strong argument
for the N:N model.

We need to make a choice here one way or another, since it has knock on
effects on other parts, e.g the handling of SYNC and INVALL etc.

Given that N:N seems likely to be simpler from the Xen side and in any
case doesn't preclude us moving to a 1:N model (or even a 2:N model etc)
in the future how about we start with that?

If there is agreement in taking this direction then I will adjust the
relevant sections of the document to reflect this.

Ian.




Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-19 Thread Julien Grall
On 19/05/15 14:36, Ian Campbell wrote:
> On Tue, 2015-05-19 at 14:27 +0100, Julien Grall wrote:
>> With the multiple vITS we would have to retrieve the number of vITS.
>> Maybe by extending the xen_arch_domainconfig?
> 
> I'm sure we can find a way.
> 
> The important question is whether we want to go for a N:N vits:pits
> mapping or 1:N.
> 
> So far I think we are leaning (slightly?) towards the 1:N model, if we
> can come up with a satisfactory answer for what to do with global
> commands.

I was leaning toward the 1:1 model :).

I think the 1:N model will result in more complex scheduling and would
slow down the emulation in environments where each domain is using a
different pITS.

Also there is the question of I/O NUMA.

Regards,

-- 
Julien Grall



Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-19 Thread Ian Campbell
On Tue, 2015-05-19 at 14:27 +0100, Julien Grall wrote:
> With the multiple vITS we would have to retrieve the number of vITS.
> Maybe by extending the xen_arch_domainconfig?

I'm sure we can find a way.

The important question is whether we want to go for a N:N vits:pits
mapping or 1:N.

So far I think we are leaning (slightly?) towards the 1:N model, if we
can come up with a satisfactory answer for what to do with global
commands.

Ian.




Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-19 Thread Julien Grall
Hi Ian,

On 19/05/15 13:14, Ian Campbell wrote:
> On Fri, 2015-05-15 at 14:24 +0100, Julien Grall wrote:
>> Hi Ian,
>>
>> On 15/05/15 13:58, Ian Campbell wrote:
> Therefore it is proposed that the restriction that a single vITS maps
> to one pITS be retained. If a guest requires access to devices
> associated with multiple pITSs then multiple vITS should be
> configured.

 Having multiple vITS per domain brings other issues:
- How do you know the number of ITS to describe in the device 
 tree at boot?
>>>
>>> I'm not sure. I don't think 1 vs N is very different from the question
>>> of 0 vs 1 though, somehow the tools need to know about the pITS setup.
>>
>> I don't see why the tools would require to know the pITS setup.
>
> Even with only a single vits the tools need to know if the system has 0,
> 1, or more pits, to know whether to create a vits at all or not.

 In the 1 vITS solution no, it's only necessary to add a new gic define
 for the gic_version field in xen_arch_domainconfig.
>>>
>>> Would we expose a vITS to guests on a host which has no pITS at all?
>>
>> No, Xen will check if we can support vITS. See an example with my "GICv2
>> on GICv3" series. Obviously, we don't allow vGICv3 on GICv2.
> 
> Did you mean to refer to "arm: Allow the user to specify the GIC
> version" or some other part of that series?

Yes I mean this patch.

> I suppose you are proposing a new flag vits=yes|no passed as part of the
> domain config which Xen can then update to indicate yes or no? Or is
> there more to it than that? Could Xen not equally well expose nr_vits
> back to the tools?

A new flag, or extending the gic_version parameter (gic_version = "v3-its").

With the multiple vITS we would have to retrieve the number of vITS.
Maybe by extending the xen_arch_domainconfig?
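
For illustration only, one hypothetical shape such an extension could take
(the nr_vits field and its semantics are invented here, not the real public
interface):

#include <stdint.h>

struct xen_arch_domainconfig {
    /* IN/OUT */
    uint8_t gic_version;  /* existing field; could grow a v3-its value */
    /* IN: number of vITS requested; OUT: number actually provided. */
    uint8_t nr_vits;
    /* ... other fields ... */
};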

Regards,

-- 
Julien Grall



Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-19 Thread Ian Campbell
On Tue, 2015-05-19 at 18:18 +0530, Vijay Kilari wrote:
> On Tue, May 19, 2015 at 5:49 PM, Ian Campbell  wrote:
> > On Tue, 2015-05-19 at 17:40 +0530, Vijay Kilari wrote:
> >> > If a guest issues (for example) a MOVI which is not followed by an
> >> > INV/INVALL on native then what would trigger the LPI configuration to be
> >> > applied by the h/w?
> >> >
> >> > If a guest is required to send an INV/INVALL in order for some change to
> >> > take effect and it does not do so then it is buggy, isn't it?
> >>
> >> agreed.
> >>
> >> >
> >> > IOW all Xen needs to do is to propagate any guest initiated INV/INVALL
> >> > as/when it occurs in the command queue. I don't think we need to
> >> > fabricate an additional INV/INVALL while emulating a MOVI.
> >> >
> >> > What am I missing?
> >>
> >> back to point:
> >>
> >> INV has device id so not an issue.
> >> INVALL does not have device id to know pITS to send.
> >> For that reason Xen is expected to insert INVALL at proper
> >> places similar to SYNC and ignore INV/INVALL of guest.
> >
> > Why wouldn't Xen just insert an INVALL into all relevant pITS in
> > response to an INVALL from the guest?
> 
> If INVALL is sent on all pITS, then we need to wait for all pITS to complete
> the command before we update CREADR of vITS.

Correct, but doesn't that already naturally fall out of any scheme which
maps one vITS onto multiple pITS? It's not specific to INVALL that we
need to consider the progress of all pITS before updating the vITS.
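
One hypothetical way to track that, sketched in C with invented names: fan a
guest command (or batch) out to the relevant pITS, record which of them are
still executing, and only advance the vITS CREADR once the last completes:

#include <stdint.h>

struct vits_batch {
    unsigned long pits_pending;   /* bitmap of pITS still executing */
    uint64_t next_creadr;         /* vCREADR value once all complete */
};

/* Assumed helper: makes the new CREADR visible to the guest. */
void vits_set_creadr(uint64_t creadr);

/* Called when pITS 'pits_id' signals completion (e.g. via SYNC+INT). */
void vits_batch_complete(struct vits_batch *b, unsigned int pits_id)
{
    b->pits_pending &= ~(1UL << pits_id);
    if ( b->pits_pending == 0 )   /* last pITS done: expose progress */
        vits_set_creadr(b->next_creadr);
}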

> >
> > If you are proposing something different then please be explicit about what
> > you mean by "proper places similar to SYNC". Ideally by proposing some
> > new text which I can use in the document.
> 
> If the platform has more than 1 pITS, the ITS commands are mapped
> from vITS to pITS using the device ID provided with the ITS command.
> 
> However SYNC and INVALL do not have a device ID.
> In such a case there are two ways to handle this:
> 1) the guest's SYNC and INVALL will be sent to a pITS based on the
> guest's previous ITS commands
> 2) Xen will insert/append SYNC and INVALL to the guest's ITS commands
> wherever required and ignore the guest's SYNC and INVALL commands
> 
> IMO (2) would be better, as approach (1) might fail to handle the
> scenario wherein the guest sends only SYNC & INVALL commands.

That depends on what "where-ever required" evaluates to. Please be
explicit here.

It sounds like this needs to be something which is handled as a new
chapter on translation, in a subsection dealing with non-device specific
command handling.

Ian.




Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-19 Thread Vijay Kilari
On Tue, May 19, 2015 at 5:49 PM, Ian Campbell  wrote:
> On Tue, 2015-05-19 at 17:40 +0530, Vijay Kilari wrote:
>> > If a guest issues (for example) a MOVI which is not followed by an
>> > INV/INVALL on native then what would trigger the LPI configuration to be
>> > applied by the h/w?
>> >
>> > If a guest is required to send an INV/INVALL in order for some change to
>> > take effect and it does not do so then it is buggy, isn't it?
>>
>> agreed.
>>
>> >
>> > IOW all Xen needs to do is to propagate any guest initiated INV/INVALL
>> > as/when it occurs in the command queue. I don't think we need to
>> > fabricate an additional INV/INVALL while emulating a MOVI.
>> >
>> > What am I missing?
>>
>> back to point:
>>
>> INV has a device id, so it is not an issue.
>> INVALL does not have a device id to know which pITS to send it to.
>> For that reason Xen is expected to insert INVALL at the proper
>> places, similar to SYNC, and ignore the guest's INV/INVALL.
>
> Why wouldn't Xen just insert an INVALL into all relevant pITS in
> response to an INVALL from the guest?

If INVALL is sent on all pITS, then we need to wait for all pITS to complete
the command before we update CREADR of vITS.

>
> If you are proposing something different then please be explicit about what
> you mean by "proper places similar to SYNC". Ideally by proposing some
> new text which I can use in the document.

If the platform has more than 1 pITS, the ITS commands are mapped
from vITS to pITS using the device ID provided with the ITS command.

However SYNC and INVALL do not have a device ID.
In such a case there are two ways to handle this:
1) the guest's SYNC and INVALL will be sent to a pITS based on the guest's
previous ITS commands
2) Xen will insert/append SYNC and INVALL to the guest's ITS commands
wherever required and ignore the guest's SYNC and INVALL commands

IMO (2) would be better, as approach (1) might fail to handle the
scenario wherein the guest sends only SYNC & INVALL commands.

Regards
Vijay



Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-19 Thread Ian Campbell
On Tue, 2015-05-19 at 17:40 +0530, Vijay Kilari wrote:
> > If a guest issues (for example) a MOVI which is not followed by an
> > INV/INVALL on native then what would trigger the LPI configuration to be
> > applied by the h/w?
> >
> > If a guest is required to send an INV/INVALL in order for some change to
> > take effect and it does not do so then it is buggy, isn't it?
> 
> agreed.
> 
> >
> > IOW all Xen needs to do is to propagate any guest initiated INV/INVALL
> > as/when it occurs in the command queue. I don't think we need to
> > fabricate an additional INV/INVALL while emulating a MOVI.
> >
> > What am I missing?
> 
> back to point:
> 
> INV has a device id, so it is not an issue.
> INVALL does not have a device id to know which pITS to send it to.
> For that reason Xen is expected to insert INVALL at the proper
> places, similar to SYNC, and ignore the guest's INV/INVALL.

Why wouldn't Xen just insert an INVALL into all relevant pITS in
response to an INVALL from the guest?

If you are proposing something different then please be explicit about what
you mean by "proper places similar to SYNC". Ideally by proposing some
new text which I can use in the document.

Ian.




Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-19 Thread Ian Campbell
On Fri, 2015-05-15 at 14:24 +0100, Julien Grall wrote:
> Hi Ian,
> 
> On 15/05/15 13:58, Ian Campbell wrote:
> >>> Therefore it is proposed that the restriction that a single vITS maps
> >>> to one pITS be retained. If a guest requires access to devices
> >>> associated with multiple pITSs then multiple vITS should be
> >>> configured.
> >>
> >> Having multiple vITS per domain brings other issues:
> >>- How do you know the number of ITS to describe in the device 
> >> tree at boot?
> >
> > I'm not sure. I don't think 1 vs N is very different from the question
> > of 0 vs 1 though, somehow the tools need to know about the pITS setup.
> 
>  I don't see why the tools would require to know the pITS setup.
> >>>
> >>> Even with only a single vits the tools need to know if the system has 0,
> >>> 1, or more pits, to know whether to create a vits at all or not.
> >>
> >> In the 1 vITS solution no, it's only necessary to add a new gic define
> >> for the gic_version field in xen_arch_domainconfig.
> > 
> > Would we expose a vITS to guests on a host which has no pITS at all?
> 
> No, Xen will check if we can support vITS. See an example with my "GICv2
> on GICv3" series. Obviously, we don't allow vGICv3 on GICv2.

Did you mean to refer to "arm: Allow the user to specify the GIC
version" or some other part of that series?

I suppose you are proposing a new flag vits=yes|no passed as part of the
domain config which Xen can then update to indicate yes or no? Or is
there more to it than that? Could Xen not equally well expose nr_vits
back to the tools?

Ian.




Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-19 Thread Vijay Kilari
On Tue, May 19, 2015 at 5:25 PM, Ian Campbell  wrote:
> On Tue, 2015-05-19 at 17:08 +0530, Vijay Kilari wrote:
>> Hi Ian,
>>
>>If we want to target 4.6, then I think we should draw a conclusion
>>
>> On Sat, May 16, 2015 at 2:19 PM, Julien Grall  
>> wrote:
>> > Hi,
>> >
>> >
>> > On 16/05/2015 05:03, Vijay Kilari wrote:
>> >>
>> >> On Fri, May 15, 2015 at 11:01 PM, Julien Grall 
>> >> wrote:
>> >>>
>> >>> On 15/05/15 16:38, Ian Campbell wrote:
>> 
>>  On Fri, 2015-05-15 at 16:05 +0100, Julien Grall wrote:
>> >
>> > On 15/05/15 15:04, Vijay Kilari wrote:
>> >>
>> >> On Fri, May 15, 2015 at 7:14 PM, Julien Grall
>> >>  wrote:
>> >>>
>> >>> On 15/05/15 14:24, Ian Campbell wrote:
>> 
>>  On Fri, 2015-05-15 at 18:44 +0530, Vijay Kilari wrote:
>> >
>> > On Fri, May 15, 2015 at 6:23 PM, Ian Campbell
>> >  wrote:
>> >>
>> >> On Fri, 2015-05-15 at 18:17 +0530, Vijay Kilari wrote:
>> >>>
>> >>> On Fri, May 15, 2015 at 5:33 PM, Julien Grall
>> >>>  wrote:
>> 
>>  On 15/05/15 12:30, Ian Campbell wrote:
>> >>
>> >> Handling of a single vITS and multiple pITS can be made simple.
>> >>
>> >> All ITS commands except SYNC & INVALL have a device id which will
>> >> help us to know which pITS they should be sent to.
>> >>
>> >> The guest's SYNC & INVALL can be dropped by Xen,
>> >>   and Xen appends SYNC & INVALL wherever required
>> >> (e.g. the Linux driver adds SYNC for required commands).
>> >> With this assumption, all ITS commands are mapped to a pITS
>> >> and there is no need for synchronization across pITS.
>> >
>> >
>> > You've ignored the second bullet its three sub-bullets, I
>> > think.
>> 
>> 
>> >>> Why can't we group the batch of commands based on pITS it has
>> >>> to be sent?.
>> >>
>> >>
>> >> Are you suggesting that each batch we send should be synchronous?
>> >> (i.e.
>> >> end with SYNC+INT) That doesn't seem at all desirable.
>> >
>> >
>> > Not only at the end of batch, SYNC can be appended based on every
>> > command within the batch.
>> 
>> 
>>  Could be, but something to avoid I think?
>> >>>
>> >>>
>> >>> That would slow down the ITS processing (SYNC waits until the
>> >>> previous command has executed).
>> >>>
>> >>> Also, what about INVALL? Sending it every time would be horrible for
>> >>> the
>> >>> performance because it flushes the ITS cache.
>> >>
>> >>
>> >> INVALL is not required every time. It can be sent only as mentioned in
>> >> the spec note,
>> >> e.g. MOVI.
>> >>>
>> >>>
>> >>> BTW, when you quote the spec, can you give the section number/version of
>> >>> the spec? So far, I'm not able to find anything about the relation
>> >>> between MOVI and INVALL in my spec.
>> >>>
>> >>
>> >> See 5.13.19 INVALL collection of PRD03-GENC-010745 20.0
>> >
>> >
>> > Still nothing about MOVI... How did you deduce it?
>>
>>  I have quoted it as an example where INVALL might be needed.
>>
>> >
>> >
>> > The spec only says:
>> >
>> > "this command is expected to be used by software when it changed the
>> > re-configuration of an LPI in memory
>> > to ensure any cached copies of the old configuration are discarded."
>> >
>> >>> INV* commands are sent in order to ask the ITS reloading the
>> >>> configuration tables (see 4.8.4 PRD03-GENC-010745 24.0):
>> >>>
>> >>> "The effects of this caching are not visible to software except when
>> >>> reconfiguring an LPI, in which case an explicit invalidate command must
>> >>> be issued (e.g. an ITS INV command or a write to GICR_INVLPIR)
>> >>> Note: this means hardware must manage its caches automatically when
>> >>> moving interrupts"
>> >>>
>> >>> So, it looks to me like INV* commands are only necessary when the
>> >>> configuration tables are changed.
>> >>>
>> >>> FWIW, Linux is using INVALL when a collection is mapped and INV when the
>> >>> LPI configuration is changed. I don't see any INV* command after MOVI.
>> >>> So it confirms what the spec says.
>> >>>
>> >> Note: this command is expected to be used by software when it changed
>> >> the re-configuration
>> >> of an LPI in memory to ensure any cached copies of the old
>> >> configuration are discarded.
>> >
>> >
>> > INVALL is used when a large number of LPIs has been reconfigured. If you
>> > send one per MOVI it is not efficient at all and will slow down all the
>> > interrupts for a few milliseconds. We need to use them with caution.
>> >
>> > Usually a guest will send one for multiple MOVI commands.
>> 
>> 
>>  We should be prepared for a guest which does nothing but send INVALL
>>  commands (i.e. trying to DoS the host).

Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-19 Thread Ian Campbell
On Tue, 2015-05-19 at 17:08 +0530, Vijay Kilari wrote:
> Hi Ian,
> 
>If we want to target 4.6, then I think we should draw a conclusion
> 
> On Sat, May 16, 2015 at 2:19 PM, Julien Grall  wrote:
> > Hi,
> >
> >
> > On 16/05/2015 05:03, Vijay Kilari wrote:
> >>
> >> On Fri, May 15, 2015 at 11:01 PM, Julien Grall 
> >> wrote:
> >>>
> >>> On 15/05/15 16:38, Ian Campbell wrote:
> 
>  On Fri, 2015-05-15 at 16:05 +0100, Julien Grall wrote:
> >
> > On 15/05/15 15:04, Vijay Kilari wrote:
> >>
> >> On Fri, May 15, 2015 at 7:14 PM, Julien Grall
> >>  wrote:
> >>>
> >>> On 15/05/15 14:24, Ian Campbell wrote:
> 
>  On Fri, 2015-05-15 at 18:44 +0530, Vijay Kilari wrote:
> >
> > On Fri, May 15, 2015 at 6:23 PM, Ian Campbell
> >  wrote:
> >>
> >> On Fri, 2015-05-15 at 18:17 +0530, Vijay Kilari wrote:
> >>>
> >>> On Fri, May 15, 2015 at 5:33 PM, Julien Grall
> >>>  wrote:
> 
>  On 15/05/15 12:30, Ian Campbell wrote:
> >>
> >> Handling of a single vITS and multiple pITS can be made simple.
> >>
> >> All ITS commands except SYNC & INVALL have a device id which will
> >> help us to know which pITS they should be sent to.
> >>
> >> The guest's SYNC & INVALL can be dropped by Xen,
> >>   and Xen appends SYNC & INVALL wherever required
> >> (e.g. the Linux driver adds SYNC for required commands).
> >> With this assumption, all ITS commands are mapped to a pITS
> >> and there is no need for synchronization across pITS.
> >
> >
> > You've ignored the second bullet its three sub-bullets, I
> > think.
> 
> 
> >>> Why can't we group the batch of commands based on pITS it has
> >>> to be sent?.
> >>
> >>
> >> Are you suggesting that each batch we send should be synchronous?
> >> (i.e.
> >> end with SYNC+INT) That doesn't seem at all desirable.
> >
> >
> > Not only at the end of batch, SYNC can be appended based on every
> > command within the batch.
> 
> 
>  Could be, but something to avoid I think?
> >>>
> >>>
> >>> That would slow down the ITS processing (SYNC waits until the
> >>> previous command has executed).
> >>>
> >>> Also, what about INVALL? Sending it every time would be horrible for
> >>> the
> >>> performance because it flushes the ITS cache.
> >>
> >>
> >> INVALL is not required every time. It can be sent only as mentioned in
> >> the spec note,
> >> e.g. MOVI.
> >>>
> >>>
> >>> BTW, when you quote the spec, can you give the section number/version of
> >>> the spec? So far, I'm not able to find anything about the relation
> >>> between MOVI and INVALL in my spec.
> >>>
> >>
> >> See 5.13.19 INVALL collection of PRD03-GENC-010745 20.0
> >
> >
> > Still nothing about MOVI... How did you deduce it?
> 
>  I have quoted it as an example where INVALL might be needed.
> 
> >
> >
> > The spec only says:
> >
> > "this command is expected to be used by software when it changed the
> > re-configuration of an LPI in memory
> > to ensure any cached copies of the old configuration are discarded."
> >
> >>> INV* commands are sent in order to ask the ITS reloading the
> >>> configuration tables (see 4.8.4 PRD03-GENC-010745 24.0):
> >>>
> >>> "The effects of this caching are not visible to software except when
> >>> reconfiguring an LPI, in which case an explicit invalidate command must
> >>> be issued (e.g. an ITS INV command or a write to GICR_INVLPIR)
> >>> Note: this means hardware must manage its caches automatically when
> >>> moving interrupts"
> >>>
> >>> So, it looks to me like INV* commands are only necessary when the
> >>> configuration tables are changed.
> >>>
> >>> FWIW, Linux is using INVALL when a collection is mapped and INV when the
> >>> LPI configuration is changed. I don't see any INV* command after MOVI.
> >>> So it confirms what the spec says.
> >>>
> >> Note: this command is expected to be used by software when it changed
> >> the re-configuration
> >> of an LPI in memory to ensure any cached copies of the old
> >> configuration are discarded.
> >
> >
> > INVALL is used when a large number of LPIs has been reconfigured. If you
> > send one per MOVI it is not efficient at all and will slow down all the
> > interrupts for a few milliseconds. We need to use them with caution.
> >
> > Usually a guest will send one for multiple MOVI commands.
> 
> 
>  We should be prepared for a guest which does nothing but send INVALL
>  commands (i.e. trying to DoS the host).
> 
>  I mentioned earlier about maybe needing to track which pITSs a SYNC
>  goes to (based on what SYNCs have happened already and what commands the
>  guest has sent since).

Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-19 Thread Ian Campbell
On Tue, 2015-05-19 at 17:08 +0530, Vijay Kilari wrote:
> Hi Ian,
> 
>If we want to target 4.6, then I think we should draw a conclusion

I'm waiting for this subthread to reach some sort of conclusion before
posting another draft.

Ian.




Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-19 Thread Vijay Kilari
Hi Ian,

If we want to target 4.6, then I think we should draw a conclusion

On Sat, May 16, 2015 at 2:19 PM, Julien Grall  wrote:
> Hi,
>
>
> On 16/05/2015 05:03, Vijay Kilari wrote:
>>
>> On Fri, May 15, 2015 at 11:01 PM, Julien Grall 
>> wrote:
>>>
>>> On 15/05/15 16:38, Ian Campbell wrote:

 On Fri, 2015-05-15 at 16:05 +0100, Julien Grall wrote:
>
> On 15/05/15 15:04, Vijay Kilari wrote:
>>
>> On Fri, May 15, 2015 at 7:14 PM, Julien Grall
>>  wrote:
>>>
>>> On 15/05/15 14:24, Ian Campbell wrote:

 On Fri, 2015-05-15 at 18:44 +0530, Vijay Kilari wrote:
>
> On Fri, May 15, 2015 at 6:23 PM, Ian Campbell
>  wrote:
>>
>> On Fri, 2015-05-15 at 18:17 +0530, Vijay Kilari wrote:
>>>
>>> On Fri, May 15, 2015 at 5:33 PM, Julien Grall
>>>  wrote:

 On 15/05/15 12:30, Ian Campbell wrote:
>>
>> Handling of a single vITS and multiple pITS can be made simple.
>>
>> All ITS commands except SYNC & INVALL have a device id which will
>> help us to know which pITS they should be sent to.
>>
>> The guest's SYNC & INVALL can be dropped by Xen,
>>   and Xen appends SYNC & INVALL wherever required
>> (e.g. the Linux driver adds SYNC for required commands).
>> With this assumption, all ITS commands are mapped to a pITS
>> and there is no need for synchronization across pITS.
>
>
> You've ignored the second bullet its three sub-bullets, I
> think.


>>> Why can't we group the batch of commands based on pITS it has
>>> to be sent?.
>>
>>
>> Are you suggesting that each batch we send should be synchronous?
>> (i.e.
>> end with SYNC+INT) That doesn't seem at all desirable.
>
>
> Not only at the end of batch, SYNC can be appended based on every
> command within the batch.


 Could be, but something to avoid I think?
>>>
>>>
>>> That would slow down the ITS processing (SYNC waits until the
>>> previous command has executed).
>>>
>>> Also, what about INVALL? Sending it every time would be horrible for
>>> the
>>> performance because it flushes the ITS cache.
>>
>>
>> INVALL is not required every time. It can be sent only as mentioned in
>> the spec note,
>> e.g. MOVI.
>>>
>>>
>>> BTW, when you quote the spec, can you give the section number/version of
>>> the spec? So far, I'm not able to find anything about the relation
>>> between MOVI and INVALL in my spec.
>>>
>>
>> See 5.13.19 INVALL collection of PRD03-GENC-010745 20.0
>
>
> Still nothing about MOVI... How did you deduce it?

 I have quoted it as an example where INVALL might be needed.

>
>
> The spec only says:
>
> "this command is expected to be used by software when it changed the
> re-configuration of an LPI in memory
> to ensure any cached copies of the old configuration are discarded."
>
>>> INV* commands are sent in order to ask the ITS reloading the
>>> configuration tables (see 4.8.4 PRD03-GENC-010745 24.0):
>>>
>>> "The effects of this caching are not visible to software except when
>>> reconfiguring an LPI, in which case an explicit invalidate command must
>>> be issued (e.g. an ITS INV command or a write to GICR_INVLPIR)
>>> Note: this means hardware must manage its caches automatically when
>>> moving interrupts"
>>>
>>> So, it looks to me like INV* commands are only necessary when the
>>> configuration tables are changed.
>>>
>>> FWIW, Linux is using INVALL when a collection is mapped and INV when the
>>> LPI configuration is changed. I don't see any INV* command after MOVI.
>>> So it confirms what the spec says.
>>>
>> Note: this command is expected to be used by software when it changed
>> the re-configuration
>> of an LPI in memory to ensure any cached copies of the old
>> configuration are discarded.
>
>
> INVALL is used when a large number of LPIs has been reconfigured. If you
> send one per MOVI it is not efficient at all and will slow down all the
> interrupts for a few milliseconds. We need to use them with caution.
>
> Usually a guest will send one for multiple MOVI commands.


 We should be prepared for a guest which does nothing but send INVALL
 commands (i.e. trying to DoS the host).

 I mentioned earlier about maybe needing to track which pITSs a SYNC
 goes to (based on what SYNCs have happened already and what commands the
 guest has sent since).

 Do we also need to track which LPIs a guest has fiddled with in order to
 decide (perhaps via a threshold) whether to use INVALL vs a small number
 of targeted INVALL?
>>>
>>>
>>> I did some reading about the INV* commands (INV and INVALL).

Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-16 Thread Julien Grall

Hi,

On 16/05/2015 05:03, Vijay Kilari wrote:

On Fri, May 15, 2015 at 11:01 PM, Julien Grall  wrote:

On 15/05/15 16:38, Ian Campbell wrote:

On Fri, 2015-05-15 at 16:05 +0100, Julien Grall wrote:

On 15/05/15 15:04, Vijay Kilari wrote:

On Fri, May 15, 2015 at 7:14 PM, Julien Grall  wrote:

On 15/05/15 14:24, Ian Campbell wrote:

On Fri, 2015-05-15 at 18:44 +0530, Vijay Kilari wrote:

On Fri, May 15, 2015 at 6:23 PM, Ian Campbell  wrote:

On Fri, 2015-05-15 at 18:17 +0530, Vijay Kilari wrote:

On Fri, May 15, 2015 at 5:33 PM, Julien Grall  wrote:

On 15/05/15 12:30, Ian Campbell wrote:

Handling of a single vITS and multiple pITS can be made simple.

All ITS commands except SYNC & INVALL have a device ID which will
help us to know to which pITS each should be sent.

SYNC & INVALL requested by the guest can be dropped by Xen,
  and Xen can instead append SYNC & INVALL wherever they are required.
(Ex: the Linux driver adds SYNC for the commands that require it.)
With this assumption, all ITS commands are mapped to a pITS
and there is no need for synchronization across pITS.


You've ignored the second bullet and its three sub-bullets, I think.



Why can't we group the batch of commands based on the pITS it has
to be sent to?


Are you suggesting that each batch we send should be synchronous? (i.e.
end with SYNC+INT) That doesn't seem at all desirable.


Not only at the end of a batch; SYNC can be appended after every
command within the batch.


Could be, but something to avoid I think?


That would slow down the ITS processing (SYNC waits until the
previous command has executed).

Also, what about INVALL? Sending it every time would be horrible for
performance because it flushes the ITS cache.


INVALL is not required every time. It can be sent only as mentioned in the
spec note, e.g. for MOVI.


BTW, when you quote the spec, can you give the section number/version of
the spec? So far, I'm not able to find anything about the relation
between MOVI and INVALL in my spec.



See 5.13.19 INVALL collection of PRD03-GENC-010745 20.0


Still nothing about MOVI... How did you deduce it?

The spec only says:

"this command is expected to be used by software when it changed the 
re-configuration of an LPI in memory

to ensure any cached copies of the old configuration are discarded."


INV* commands are sent in order to ask the ITS to reload the
configuration tables (see 4.8.4 PRD03-GENC-010745 24.0):

"The effects of this caching are not visible to software except when
reconfiguring an LPI, in which case an explicit invalidate command must
be issued (e.g. an ITS INV command or a write to GICR_INVLPIR)
Note: this means hardware must manage its caches automatically when
moving interrupts"

So, it looks to me like INV* commands are only necessary when the
configuration tables are changed.

FWIW, Linux is using INVALL when a collection is mapped and INV when the
LPI configuration is changed. I don't see any INV* command after MOVI.
So it confirms what the spec says.


Note: this command is expected to be used by software when it changed
the re-configuration
of an LPI in memory to ensure any cached copies of the old
configuration are discarded.


INVALL is used when a large number of LPIs have been reconfigured. Sending
one per MOVI is not efficient at all and will slow down all the
interrupts for a few milliseconds. We need to use them with caution.

Usually a guest will send one for multiple MOVI commands.


We should be prepared for a guest which does nothing but send INVALL
commands (i.e. trying to DoS the host).

I mentioned earlier about maybe needing to track which pITSs a SYNC
goes to (based on what SYNCs have happened already and what commands the
guest has sent since).

Do we also need to track which LPIs a guest has fiddled with in order to
decide (perhaps via a threshold) whether to use INVALL vs a small number
of targeted INVALLs?


I did some reading about the INV* commands (INV and INVALL). The
interesting section in GICv3 is 4.8.4 PRD03-GENC-010745 24.0.

They are only used to ensure the ITS re-reads the LPI configuration
table. I'm not speaking about the pending table, as the spec (4.8.5) says
that it's maintained solely by a re-distributor. It's up to the
implementation to provide a mechanism to sync the memory (useful for
Power Management).

The LPI configuration table is used to enable/disable the LPI and set
the priority. Only the enable/disable bit needs to be replicated to the
hardware.

The pITS LPI configuration table is managed by Xen. Each guest will
provide its own LPI configuration table to the vITS.

The emulation of the INV* commands will depend on how we decide to emulate
the LPI configuration table.

Solution 1: Trap every access to the guest LPIs configuration table


Trapping on the guest LPI configuration table is mandatory to
enable/disable an LPI in the LPI pending table. There is no ITS command
for this. In my RFC patches I have done this, where Xen calls the
irq_hw_controller's set_affinity, which will send an INVALL command.


Trapping is not mandatory. The ITS may no

Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Vijay Kilari
On Fri, May 15, 2015 at 11:01 PM, Julien Grall  wrote:
> On 15/05/15 16:38, Ian Campbell wrote:
>> On Fri, 2015-05-15 at 16:05 +0100, Julien Grall wrote:
>>> On 15/05/15 15:04, Vijay Kilari wrote:
 On Fri, May 15, 2015 at 7:14 PM, Julien Grall  
 wrote:
> On 15/05/15 14:24, Ian Campbell wrote:
>> On Fri, 2015-05-15 at 18:44 +0530, Vijay Kilari wrote:
>>> On Fri, May 15, 2015 at 6:23 PM, Ian Campbell  
>>> wrote:
 On Fri, 2015-05-15 at 18:17 +0530, Vijay Kilari wrote:
> On Fri, May 15, 2015 at 5:33 PM, Julien Grall 
>  wrote:
>> On 15/05/15 12:30, Ian Campbell wrote:
 Handling of a single vITS and multiple pITS can be made simple.

 All ITS commands except SYNC & INVALL have a device ID which will
 help us to know to which pITS each should be sent.

 SYNC & INVALL requested by the guest can be dropped by Xen,
  and Xen can instead append SYNC & INVALL wherever they are required.
 (Ex: the Linux driver adds SYNC for the commands that require it.)
 With this assumption, all ITS commands are mapped to a pITS
 and there is no need for synchronization across pITS.
>>>
>>> You've ignored the second bullet and its three sub-bullets, I think.
>>
> Why can't we group the batch of commands based on the pITS it has
> to be sent to?

 Are you suggesting that each batch we send should be synchronous? (i.e.
 end with SYNC+INT) That doesn't seem at all desirable.
>>>
>>> Not only at the end of a batch; SYNC can be appended after every
>>> command within the batch.
>>
>> Could be, but something to avoid I think?
>
> That would slow down the ITS processing (SYNC waits until the
> previous command has executed).
>
> Also, what about INVALL? Sending it every time would be horrible for
> performance because it flushes the ITS cache.

 INVALL is not required every time. It can be sent only as mentioned in the
 spec note, e.g. for MOVI.
>
> BTW, when you quote the spec, can you give the section number/version of
> the spec? So far, I'm not able to find anything about the relation
> between MOVI and INVALL in my spec.
>

See 5.13.19 INVALL collection of PRD03-GENC-010745 20.0

> INV* commands are sent in order to ask the ITS to reload the
> configuration tables (see 4.8.4 PRD03-GENC-010745 24.0):
>
> "The effects of this caching are not visible to software except when
> reconfiguring an LPI, in which case an explicit invalidate command must
> be issued (e.g. an ITS INV command or a write to GICR_INVLPIR)
> Note: this means hardware must manage its caches automatically when
> moving interrupts"
>
> So, it looks to me like INV* commands are only necessary when the
> configuration tables are changed.
>
> FWIW, Linux is using INVALL when a collection is mapped and INV when the
> LPI configuration is changed. I don't see any INV* command after MOVI.
> So it confirms what the spec says.
>
 Note: this command is expected to be used by software when it changed
 the re-configuration
 of an LPI in memory to ensure any cached copies of the old
 configuration are discarded.
>>>
>>> INVALL is used when a large number of LPIs have been reconfigured. Sending
>>> one per MOVI is not efficient at all and will slow down all the
>>> interrupts for a few milliseconds. We need to use them with caution.
>>>
>>> Usually a guest will send one for multiple MOVI commands.
>>
>> We should be prepared for a guest which does nothing but send INVALL
>> commands (i.e. trying to DoS the host).
>>
>> I mentioned earlier about maybe needing to track which pITSs a SYNC
>> goes to (based on what SYNCs have happened already and what commands the
>> guest has sent since).
>>
>> Do we also need to track which LPIs a guest has fiddled with in order to
>> decide (perhaps via a threshold) whether to use INVALL vs a small number
>> of targeted INVALLs?
>
> I did some reading about the INV* commands (INV and INVALL). The
> interesting section in GICv3 is 4.8.4 PRD03-GENC-010745 24.0.
>
> They are only used to ensure the ITS re-reads the LPI configuration
> table. I'm not speaking about the pending table, as the spec (4.8.5) says
> that it's maintained solely by a re-distributor. It's up to the
> implementation to provide a mechanism to sync the memory (useful for
> Power Management).
>
> The LPI configuration table is used to enable/disable the LPI and set
> the priority. Only the enable/disable bit needs to be replicated to the
> hardware.
>
> The pITS LPI configuration table is managed by Xen. Each guest will
> provide its own LPI configuration table to the vITS.
>
> The emulation of the INV* commands will depend on how we decide to emulate
> the LPI configuration table.
>
> Solution 1: Trap every access to the guest LPIs configuration table
>
   Trapping on the guest LPI configuration table is mandatory to
enable/disable an LPI in the LPI pending table.

Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Julien Grall
On 15/05/15 16:38, Ian Campbell wrote:
> On Fri, 2015-05-15 at 16:05 +0100, Julien Grall wrote:
>> On 15/05/15 15:04, Vijay Kilari wrote:
>>> On Fri, May 15, 2015 at 7:14 PM, Julien Grall  
>>> wrote:
 On 15/05/15 14:24, Ian Campbell wrote:
> On Fri, 2015-05-15 at 18:44 +0530, Vijay Kilari wrote:
>> On Fri, May 15, 2015 at 6:23 PM, Ian Campbell  
>> wrote:
>>> On Fri, 2015-05-15 at 18:17 +0530, Vijay Kilari wrote:
 On Fri, May 15, 2015 at 5:33 PM, Julien Grall 
  wrote:
> On 15/05/15 12:30, Ian Campbell wrote:
>>> Handling of a single vITS and multiple pITS can be made simple.
>>>
>>> All ITS commands except SYNC & INVALL have a device ID which will
>>> help us to know to which pITS each should be sent.
>>>
>>> SYNC & INVALL requested by the guest can be dropped by Xen,
>>>  and Xen can instead append SYNC & INVALL wherever they are required.
>>> (Ex: the Linux driver adds SYNC for the commands that require it.)
>>> With this assumption, all ITS commands are mapped to a pITS
>>> and there is no need for synchronization across pITS.
>>
>> You've ignored the second bullet and its three sub-bullets, I think.
>
Why can't we group the batch of commands based on the pITS it has
 to be sent to?
>>>
>>> Are you suggesting that each batch we send should be synchronous? (i.e.
>>> end with SYNC+INT) That doesn't seem at all desirable.
>>
>> Not only at the end of a batch; SYNC can be appended after every
>> command within the batch.
>
> Could be, but something to avoid I think?

 That would slow down the ITS processing (SYNC waits until the
 previous command has executed).

 Also, what about INVALL? Sending it every time would be horrible for
 performance because it flushes the ITS cache.
>>>
>>> INVALL is not required every time. It can be sent only as mentioned in the
>>> spec note, e.g. for MOVI.

BTW, when you quote the spec, can you give the section number/version of
the spec? So far, I'm not able to find anything about the relation
between MOVI and INVALL in my spec.

INV* commands are sent in order to ask the ITS to reload the
configuration tables (see 4.8.4 PRD03-GENC-010745 24.0):

"The effects of this caching are not visible to software except when
reconfiguring an LPI, in which case an explicit invalidate command must
be issued (e.g. an ITS INV command or a write to GICR_INVLPIR)
Note: this means hardware must manage its caches automatically when
moving interrupts"

So, it looks to me like INV* commands are only necessary when the
configuration tables are changed.

FWIW, Linux is using INVALL when a collection is mapped and INV when the
LPI configuration is changed. I don't see any INV* command after MOVI.
So it confirms what the spec says.

>>> Note: this command is expected to be used by software when it changed
>>> the re-configuration
>>> of an LPI in memory to ensure any cached copies of the old
>>> configuration are discarded.
>>
>> INVALL is used when a large number of LPIs have been reconfigured. Sending
>> one per MOVI is not efficient at all and will slow down all the
>> interrupts for a few milliseconds. We need to use them with caution.
>>
>> Usually a guest will send one for multiple MOVI commands.
> 
> We should be prepared for a guest which does nothing but send INVALL
> commands (i.e. trying to DoS the host).
> 
> I mentioned earlier about maybe needing to track which pITSs a SYNC
> goes to (based on what SYNCs have happened already and what commands the
> guest has sent since).
> 
> Do we also need to track which LPIs a guest has fiddled with in order to
> decide (perhaps via a threshold) whether to use INVALL vs a small number
> of targeted INVALLs?

I did some reading about the INV* commands (INV and INVALL). The
interesting section in GICv3 is 4.8.4 PRD03-GENC-010745 24.0.

They are only used to ensure the ITS re-reads the LPI configuration
table. I'm not speaking about the pending table, as the spec (4.8.5) says
that it's maintained solely by a re-distributor. It's up to the
implementation to provide a mechanism to sync the memory (useful for
Power Management).

The LPI configuration table is used to enable/disable the LPI and set
the priority. Only the enable/disable bit needs to be replicated to the
hardware.

The pITS LPI configuration table is managed by Xen. Each guest will
provide its own LPI configuration table to the vITS.

The emulation of the INV* commands will depend on how we decide to emulate
the LPI configuration table.

Solution 1: Trap every access to the guest LPI configuration table

For every write access, when the vLPI is valid (i.e. associated with a
device/interrupt), Xen will toggle the enable bit in the hardware LPI
configuration table and send an INV. This requires being able to
translate the vLPI to a (device, ID) pair.
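To make Solution 1 concrete, here is a minimal sketch in C of what the
trap handler could look like. All names are illustrative assumptions, not
existing Xen code; it assumes a helper exists to translate a vLPI to its
physical LPI and (device, ID) pair.

    #include <stdbool.h>
    #include <stdint.h>

    #define LPI_ENABLE_BIT 0x1     /* bit 0 of an LPI configuration byte */

    struct its_device;                        /* opaque here */
    extern uint8_t *plpi_cfg_table;           /* hardware LPI config table */
    extern bool vlpi_to_device(uint32_t vlpi, uint32_t *plpi,
                               struct its_device **dev, uint32_t *event_id);
    extern void pits_send_inv(struct its_device *dev, uint32_t event_id);

    /* Called when a guest write to its LPI configuration table traps. */
    static void vlpi_cfg_write(uint32_t vlpi, uint8_t new_cfg)
    {
        struct its_device *dev;
        uint32_t plpi, event_id;

        /* Only propagate writes for valid (mapped) vLPIs. */
        if (!vlpi_to_device(vlpi, &plpi, &dev, &event_id))
            return;

        /* Mirror only the enable bit; priority stays under Xen's control. */
        if (new_cfg & LPI_ENABLE_BIT)
            plpi_cfg_table[plpi] |= LPI_ENABLE_BIT;
        else
            plpi_cfg_table[plpi] &= ~LPI_ENABLE_BIT;

        /* Ask the pITS to re-read the entry for this (device, ID). */
        pits_send_inv(dev, event_id);
    }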

INVALL/INV command could be ignored and directly increment CREADR
because it only ensure that th

Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Ian Campbell
On Fri, 2015-05-15 at 16:05 +0100, Julien Grall wrote:
> On 15/05/15 15:04, Vijay Kilari wrote:
> > On Fri, May 15, 2015 at 7:14 PM, Julien Grall  
> > wrote:
> >> On 15/05/15 14:24, Ian Campbell wrote:
> >>> On Fri, 2015-05-15 at 18:44 +0530, Vijay Kilari wrote:
>  On Fri, May 15, 2015 at 6:23 PM, Ian Campbell  
>  wrote:
> > On Fri, 2015-05-15 at 18:17 +0530, Vijay Kilari wrote:
> >> On Fri, May 15, 2015 at 5:33 PM, Julien Grall 
> >>  wrote:
> >>> On 15/05/15 12:30, Ian Campbell wrote:
> > Handling of a single vITS and multiple pITS can be made simple.
> >
> > All ITS commands except SYNC & INVALL have a device ID which will
> > help us to know to which pITS each should be sent.
> >
> > SYNC & INVALL requested by the guest can be dropped by Xen,
> >  and Xen can instead append SYNC & INVALL wherever they are required.
> > (Ex: the Linux driver adds SYNC for the commands that require it.)
> > With this assumption, all ITS commands are mapped to a pITS
> > and there is no need for synchronization across pITS.
> 
>  You've ignored the second bullet and its three sub-bullets, I think.
> >>>
> >> Why can't we group the batch of commands based on the pITS it has
> >> to be sent to?
> >
> > Are you suggesting that each batch we send should be synchronous? (i.e.
> > end with SYNC+INT) That doesn't seem at all desirable.
> 
>  Not only at the end of a batch; SYNC can be appended after every
>  command within the batch.
> >>>
> >>> Could be, but something to avoid I think?
> >>
> >> That would slow down the ITS processing (SYNC waits until the
> >> previous command has executed).
> >>
> >> Also, what about INVALL? Sending it every time would be horrible for
> >> performance because it flushes the ITS cache.
> > 
> > INVALL is not required every time. It can be sent only as mentioned in
> > the spec note, e.g. for MOVI.
> > 
> > Note: this command is expected to be used by software when it changed
> > the re-configuration
> > of an LPI in memory to ensure any cached copies of the old
> > configuration are discarded.
> 
> INVALL is used when a large number of LPIs have been reconfigured. Sending
> one per MOVI is not efficient at all and will slow down all the
> interrupts for a few milliseconds. We need to use them with caution.
> 
> Usually a guest will send one for multiple MOVI commands.

We should be prepared for a guest which does nothing but send INVALL
commands (i.e. trying to DoS the host).

I mentioned earlier about maybe needing to track which pITSs a SYNC
goes to (based on what SYNCs have happened already and what commands the
guest has sent since).

Do we also need to track which LPIs a guest has fiddled with in order to
decide (perhaps via a threshold) whether to use INVALL vs a small number
of targeted INVALLs?
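As a sketch of that threshold idea (counter, limit and helper names are
all invented here, nothing from the draft):

    #include <stdint.h>

    #define INV_THRESHOLD 32       /* tuning knob; the value is a guess */

    struct vits {
        unsigned int dirty_count;            /* LPIs touched since last flush */
        uint32_t dirty_lpis[INV_THRESHOLD];  /* which ones, while they are few */
    };

    extern void pits_send_inv_lpi(uint32_t plpi);
    extern void pits_send_invall(uint16_t collection);

    /* Record one reconfigured LPI; the list stops filling past the limit. */
    static void note_lpi_reconfig(struct vits *v, uint32_t plpi)
    {
        if (v->dirty_count < INV_THRESHOLD)
            v->dirty_lpis[v->dirty_count] = plpi;
        v->dirty_count++;
    }

    /* Flush: many changes get one INVALL, few changes get targeted INVs. */
    static void flush_lpi_reconfig(struct vits *v, uint16_t collection)
    {
        if (v->dirty_count >= INV_THRESHOLD)
            pits_send_invall(collection);
        else
            for (unsigned int i = 0; i < v->dirty_count; i++)
                pits_send_inv_lpi(v->dirty_lpis[i]);
        v->dirty_count = 0;
    }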

Ian.




Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Julien Grall
On 15/05/15 15:04, Vijay Kilari wrote:
> On Fri, May 15, 2015 at 7:14 PM, Julien Grall  wrote:
>> On 15/05/15 14:24, Ian Campbell wrote:
>>> On Fri, 2015-05-15 at 18:44 +0530, Vijay Kilari wrote:
 On Fri, May 15, 2015 at 6:23 PM, Ian Campbell  
 wrote:
> On Fri, 2015-05-15 at 18:17 +0530, Vijay Kilari wrote:
>> On Fri, May 15, 2015 at 5:33 PM, Julien Grall  
>> wrote:
>>> On 15/05/15 12:30, Ian Campbell wrote:
> Handling of a single vITS and multiple pITS can be made simple.
>
> All ITS commands except SYNC & INVALL have a device ID which will
> help us to know to which pITS each should be sent.
>
> SYNC & INVALL requested by the guest can be dropped by Xen,
>  and Xen can instead append SYNC & INVALL wherever they are required.
> (Ex: the Linux driver adds SYNC for the commands that require it.)
> With this assumption, all ITS commands are mapped to a pITS
> and there is no need for synchronization across pITS.

 You've ignored the second bullet and its three sub-bullets, I think.
>>>
>> Why can't we group the batch of commands based on the pITS it has
>> to be sent to?
>
> Are you suggesting that each batch we send should be synchronous? (i.e.
> end with SYNC+INT) That doesn't seem at all desirable.

 Not only at the end of a batch; SYNC can be appended after every
 command within the batch.
>>>
>>> Could be, but something to avoid I think?
>>
>> That would slow down the ITS processing (SYNC waits until the
>> previous command has executed).
>>
>> Also, what about INVALL? Sending it every time would be horrible for
>> performance because it flushes the ITS cache.
> 
> INVALL is not required every time. It can be sent only as mentioned in
> the spec note, e.g. for MOVI.
> 
> Note: this command is expected to be used by software when it changed
> the re-configuration
> of an LPI in memory to ensure any cached copies of the old
> configuration are discarded.

INVALL is used when a large number of LPIs have been reconfigured. Sending
one per MOVI is not efficient at all and will slow down all the
interrupts for a few milliseconds. We need to use them with caution.

Usually a guest will send one for multiple MOVI commands.

Regards,

-- 
Julien Grall



Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Ian Campbell
On Fri, 2015-05-15 at 14:44 +0100, Julien Grall wrote:
> On 15/05/15 14:24, Ian Campbell wrote:
> > On Fri, 2015-05-15 at 18:44 +0530, Vijay Kilari wrote:
> >> On Fri, May 15, 2015 at 6:23 PM, Ian Campbell  
> >> wrote:
> >>> On Fri, 2015-05-15 at 18:17 +0530, Vijay Kilari wrote:
>  On Fri, May 15, 2015 at 5:33 PM, Julien Grall  
>  wrote:
> > On 15/05/15 12:30, Ian Campbell wrote:
> >>> Handling of a single vITS and multiple pITS can be made simple.
> >>>
> >>> All ITS commands except SYNC & INVALL have a device ID which will
> >>> help us to know to which pITS each should be sent.
> >>>
> >>> SYNC & INVALL requested by the guest can be dropped by Xen,
> >>>  and Xen can instead append SYNC & INVALL wherever they are required.
> >>> (Ex: the Linux driver adds SYNC for the commands that require it.)
> >>> With this assumption, all ITS commands are mapped to a pITS
> >>> and there is no need for synchronization across pITS.
> >>
> >> You've ignored the second bullet and its three sub-bullets, I think.
> >
> > Why can't we group the batch of commands based on the pITS it has
> >  to be sent to?
> >>>
> >>> Are you suggesting that each batch we send should be synchronous? (i.e.
> >>> end with SYNC+INT) That doesn't seem at all desirable.
> >>
> >> Not only at the end of a batch; SYNC can be appended after every
> >> command within the batch.
> >
> > Could be, but something to avoid I think?
> 
> That would slow down the ITS processing (SYNC waits until the
> previous command has executed).
> 
> Also, what about INVALL? Sending it every time would be horrible for
> performance because it flushes the ITS cache.
> 
> >> Also, to handle the second bullet, a batch of commands might be
> >> sent to multiple pITS. In that case the batch of ITS commands is split
> >> across pITS and we have
> >> to wait for all the pITS to complete. Managing this would be difficult.
> >> For this I propose that batches be created/split such that each batch
> >> contains commands related to one pITS. But it leads to small batches of
> >> commands.
> 
> If I understand correctly, even with multiple pITS only a single batch
> per domain would be in-flight, right?
> 
> > That's not a bad idea; commonly I would expect commands for one device
> > to come in a short batch anyway. So long as the thing does cope if not, I
> > think this might work.
> 
> This doesn't work well; we would need to read/validate a command twice.
> The first time to get the devID and notice we need to create a separate
> batch, the second time to effectively queue the command.
> 
> Given that validation is the part where the emulation will spend most of
> the time, we should avoid doing it twice.

Which can trivially be arranged by not doing it the dumb way. At worst
you remember the first translation which mismatched and use it again
next time.

Or you do translates in batches into a queue and then dequeue into the
physical command queue based on the target devices.

Thinking about global commands a bit, you could make those somewhat less
painful by remembering on a per `vits_cq` basis which pits devices it
has sent commands to since the last invalidate on that device and elide
any where the guest didn't touch that pits. Doesn't help against a
malicious guest in the worst case but does improve things in the common
case.
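A sketch of that per-`vits_cq` bookkeeping (the bitmap and helpers are
invented; this only shows the shape of the idea):

    #include <stdint.h>

    #define MAX_PITS 4

    struct vits_cq {
        uint32_t touched;   /* bitmap: pITSs sent commands since last global */
    };

    extern void pits_forward_global(unsigned int pits, const void *cmd);

    /* Note that a per-device command was routed to a given pITS. */
    static void vits_note_cmd(struct vits_cq *cq, unsigned int pits)
    {
        cq->touched |= 1u << pits;
    }

    /* Forward a SYNC/INVALL only to the pITSs the guest actually touched. */
    static void vits_global_cmd(struct vits_cq *cq, const void *cmd)
    {
        for (unsigned int i = 0; i < MAX_PITS; i++)
            if (cq->touched & (1u << i))
                pits_forward_global(i, cmd);
        cq->touched = 0;
    }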

> Although, if we cache the validation we may send the wrong command/data
> if the guest decides to write to the command queue at the same time.

A guest which modifies its command queue after having advanced CWRITER
past that point deserves whatever it gets.

Ian.




Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Vijay Kilari
On Fri, May 15, 2015 at 7:14 PM, Julien Grall  wrote:
> On 15/05/15 14:24, Ian Campbell wrote:
>> On Fri, 2015-05-15 at 18:44 +0530, Vijay Kilari wrote:
>>> On Fri, May 15, 2015 at 6:23 PM, Ian Campbell  
>>> wrote:
 On Fri, 2015-05-15 at 18:17 +0530, Vijay Kilari wrote:
> On Fri, May 15, 2015 at 5:33 PM, Julien Grall  
> wrote:
>> On 15/05/15 12:30, Ian Campbell wrote:
 Handling of a single vITS and multiple pITS can be made simple.

 All ITS commands except SYNC & INVALL have a device ID which will
 help us to know to which pITS each should be sent.

 SYNC & INVALL requested by the guest can be dropped by Xen,
  and Xen can instead append SYNC & INVALL wherever they are required.
 (Ex: the Linux driver adds SYNC for the commands that require it.)
 With this assumption, all ITS commands are mapped to a pITS
 and there is no need for synchronization across pITS.
>>>
>>> You've ignored the second bullet and its three sub-bullets, I think.
>>
> Why can't we group the batch of commands based on the pITS it has
> to be sent to?

 Are you suggesting that each batch we send should be synchronous? (i.e.
 end with SYNC+INT) That doesn't seem at all desirable.
>>>
>>> Not only at the end of a batch; SYNC can be appended after every
>>> command within the batch.
>>
>> Could be, but something to avoid I think?
>
> That would slow down the ITS processing (SYNC waits until the
> previous command has executed).
>
> Also, what about INVALL? Sending it every time would be horrible for
> performance because it flushes the ITS cache.

INVALL is not required every time. It can be sent only as mentioned in the
spec note, e.g. for MOVI.

Note: this command is expected to be used by software when it changed
the re-configuration
of an LPI in memory to ensure any cached copies of the old
configuration are discarded.

>
>>> Also, to handle the second bullet, a batch of commands might be
>>> sent to multiple pITS. In that case the batch of ITS commands is split
>>> across pITS and we have
>>> to wait for all the pITS to complete. Managing this would be difficult.
>>> For this I propose that batches be created/split such that each batch
>>> contains commands related to one pITS. But it leads to small batches of
>>> commands.
>
> If I understand correctly, even with multiple pITS only a single batch
> per domain would be in-flight, right?
>
>> That's not a bad idea; commonly I would expect commands for one device
>> to come in a short batch anyway. So long as the thing does cope if not, I
>> think this might work.
>
> This doesn't work well; we would need to read/validate a command twice.
> The first time to get the devID and notice we need to create a separate
> batch, the second time to effectively queue the command.
>
> Given that validation is the part where the emulation will spend most of
> the time, we should avoid doing it twice.
>
> Although, if we cache the validation we may send the wrong command/data
> if the guest decides to write to the command queue at the same time.

The devID in the first command of the batch will decide the pITS, and all
subsequent commands will be added to the same batch if the devID is the
same. I don't think the devID-to-pITS mapping can be changed by the guest
at any time. (A sketch of this splitting follows below.)
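A sketch of that splitting rule, reading each command only once (all names
are invented; SYNC/INVALL, which carry no devID, would need the separate
handling discussed above):

    #include <stddef.h>
    #include <stdint.h>

    struct its_cmd { uint64_t raw[4]; };

    extern unsigned int devid_to_pits(uint32_t devid);   /* fixed mapping */
    extern uint32_t cmd_devid(const struct its_cmd *cmd);
    extern void batch_emit(unsigned int pits, const struct its_cmd *cmds,
                           size_t n);

    /* Close the current batch whenever the devID maps to a new pITS. */
    static void split_batch(const struct its_cmd *cmds, size_t n)
    {
        size_t start = 0;
        unsigned int cur;

        if (!n)
            return;
        cur = devid_to_pits(cmd_devid(&cmds[0]));
        for (size_t i = 1; i < n; i++) {
            unsigned int p = devid_to_pits(cmd_devid(&cmds[i]));
            if (p != cur) {
                batch_emit(cur, &cmds[start], i - start);
                start = i;
                cur = p;
            }
        }
        batch_emit(cur, &cmds[start], n - start);
    }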



Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Julien Grall
On 15/05/15 14:24, Ian Campbell wrote:
> On Fri, 2015-05-15 at 18:44 +0530, Vijay Kilari wrote:
>> On Fri, May 15, 2015 at 6:23 PM, Ian Campbell  
>> wrote:
>>> On Fri, 2015-05-15 at 18:17 +0530, Vijay Kilari wrote:
 On Fri, May 15, 2015 at 5:33 PM, Julien Grall  
 wrote:
> On 15/05/15 12:30, Ian Campbell wrote:
>>> Handling of a single vITS and multiple pITS can be made simple.
>>>
>>> All ITS commands except SYNC & INVALL have a device ID which will
>>> help us to know to which pITS each should be sent.
>>>
>>> SYNC & INVALL requested by the guest can be dropped by Xen,
>>>  and Xen can instead append SYNC & INVALL wherever they are required.
>>> (Ex: the Linux driver adds SYNC for the commands that require it.)
>>> With this assumption, all ITS commands are mapped to a pITS
>>> and there is no need for synchronization across pITS.
>>
>> You've ignored the second bullet and its three sub-bullets, I think.
>
Why can't we group the batch of commands based on the pITS it has
 to be sent to?
>>>
>>> Are you suggesting that each batch we send should be synchronous? (i.e.
>>> end with SYNC+INT) That doesn't seem at all desirable.
>>
>> Not only at the end of a batch; SYNC can be appended after every
>> command within the batch.
> 
> Could be, but something to avoid I think?

That would slow down the ITS processing (SYNC waits until the
previous command has executed).

Also, what about INVALL? Sending it every time would be horrible for
performance because it flushes the ITS cache.

>> Also, to handle the second bullet, a batch of commands might be
>> sent to multiple pITS. In that case the batch of ITS commands is split
>> across pITS and we have
>> to wait for all the pITS to complete. Managing this would be difficult.
>> For this I propose that batches be created/split such that each batch
>> contains commands related to one pITS. But it leads to small batches of
>> commands.

If I understand correctly, even with multiple pITS only a single batch
per domain would be in-flight, right?

> That's not a bad idea; commonly I would expect commands for one device
> to come in a short batch anyway. So long as the thing does cope if not, I
> think this might work.

This doesn't work well; we would need to read/validate a command twice.
The first time to get the devID and notice we need to create a separate
batch, the second time to effectively queue the command.

Given that validation is the part where the emulation will spend most of
the time, we should avoid doing it twice.

Although, if we cache the validation we may send the wrong command/data
if the guest decides to write to the command queue at the same time.

Regards,

-- 
Julien Grall



Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Ian Campbell
On Fri, 2015-05-15 at 18:44 +0530, Vijay Kilari wrote:
> On Fri, May 15, 2015 at 6:23 PM, Ian Campbell  wrote:
> > On Fri, 2015-05-15 at 18:17 +0530, Vijay Kilari wrote:
> >> On Fri, May 15, 2015 at 5:33 PM, Julien Grall  
> >> wrote:
> >> > On 15/05/15 12:30, Ian Campbell wrote:
> >> >>> Handling of a single vITS and multiple pITS can be made simple.
> >> >>>
> >> >>> All ITS commands except SYNC & INVALL have a device ID which will
> >> >>> help us to know to which pITS each should be sent.
> >> >>>
> >> >>> SYNC & INVALL requested by the guest can be dropped by Xen,
> >> >>>  and Xen can instead append SYNC & INVALL wherever they are required.
> >> >>> (Ex: the Linux driver adds SYNC for the commands that require it.)
> >> >>> With this assumption, all ITS commands are mapped to a pITS
> >> >>> and there is no need for synchronization across pITS.
> >> >>
> >> >> You've ignored the second bullet and its three sub-bullets, I think.
> >> >
> >> Why can't we group the batch of commands based on the pITS it has
> >> to be sent to?
> >
> > Are you suggesting that each batch we send should be synchronous? (i.e.
> > end with SYNC+INT) That doesn't seem at all desirable.
> 
> Not only at the end of a batch; SYNC can be appended after every
> command within the batch.

Could be, but something to avoid I think?

> Also, to handle the second bullet, a batch of commands might be
> sent to multiple pITS. In that case the batch of ITS commands is split
> across pITS and we have
> to wait for all the pITS to complete. Managing this would be difficult.
> For this I propose that batches be created/split such that each batch
> contains commands related to one pITS. But it leads to small batches of
> commands.

That's not a bad idea; commonly I would expect commands for one device
to come in a short batch anyway. So long as the thing does cope if not, I
think this might work.

> 
> >> > Aside from ignoring the second bullet, it's not possible to drop a
> >> > SYNC/INVALL command sent by the guest like that. How can you decide
> >> > when a SYNC is required or not? Why would dropping an "optional" SYNC
> >> > be fine? The spec only says "This command specifies that all actions
> >> > for the specified re-distributor must be completed"...
> >>
> >> If Xen is sending SYNC/INVALL commands to the pITS based on the commands
> >> Xen is sending on the pITS, there is no harm in ignoring guest commands.
> >>
> >> SYNC/INVALL always depend on previous ITS commands.
> >> IMO, alone these commands do not have any significance.
> >>
> >> >
> >> > Linux is not a good example of respecting the spec. Developers may
> >> > decide to put SYNC in different, newly necessary places and we won't be
> >> > able to handle it correctly in Xen (see the vGICv3 re-dist example...).
> >> >
> >> > If we go with one vITS for multiple pITS we would have to send the
> >> > SYNC/INVALL command to every pITS.
> >> >
> >> > Regards,
> >> >
> >> > --
> >> > Julien Grall
> >
> >
> 


Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Julien Grall
Hi Ian,

On 15/05/15 13:58, Ian Campbell wrote:
>>> Therefore it is proposed that the restriction that a single vITS maps
>>> to one pITS be retained. If a guest requires access to devices
>>> associated with multiple pITSs then multiple vITS should be
>>> configured.
>>
>> Having multiple vITS per domain brings other issues:
>>  - How do you know the number of ITS to describe in the device tree at 
>> boot?
>
> I'm not sure. I don't think 1 vs N is very different from the question
> of 0 vs 1 though, somehow the tools need to know about the pITS setup.

 I don't see why the tools would require to know the pITS setup.
>>>
>>> Even with only a single vITS the tools need to know if the system has 0,
>>> 1, or more pITS, to know whether to create a vITS at all or not.
>>
>> In the 1 vITS solution no, it's only necessary to add a new gic define
>> for the gic_version field in xen_arch_domainconfig.
> 
> Would we expose a vITS to guests on a host which has no pITS at all?

No, Xen will check if we can support vITS. See an example with my "GICv2
on GICv3" series. Obviously, we don't allow vGICv3 on GICv2.

 If we are going to expose multiple vITS to the guest, we should only use
 vITS for guests using PCI passthrough. This is because migration won't be
 compatible with it.
>>>
>>> It would be possible to support one s/w-only vITS for migration, i.e. the
>>> evtchn thing at the end, but for the general case that is correct. On
>>> x86 I believe that if you hot unplug all passthrough devices you can
>>> migrate and then plug in other devices at the other end.
>>
>> What about migration on platforms having fewer/more pITS (AFAIU on Cavium
>> it may be possible because there is only one node)? If we want to
>> migrate vITS we would have to handle the case where there is a mismatch,
>> which brings us to the solution with one vITS.
> 
> At the moment I don't think we are expecting to do heterogeneous
> migration. But perhaps we should plan for that eventuality, since one
> day it seems people would want to at least move to a newer version of
> the same silicon family for upgrade purposes.

I was thinking of migration within the same version of the silicon.

AFAICT, Cavium can be shipped with 1 or 2 nodes. This will result in
1 or 2 ITSs.

Migration wouldn't be possible between servers using a different number of
nodes.

Regards,

-- 
Julien Grall



Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Julien Grall
On 15/05/15 13:38, Vijay Kilari wrote:
>> Can you give some examples of the heaviest translations please so I can
>> get a feel for actually how expensive we are talking here.
>>
> For example to translate MAPVI device_ID, event_ID, vID, vCID
> 
> 1) Read from vITS command queue

Not expensive

> 2) Validate device_ID is valid by looking at device list attached
> to that domain (vITS)

It can be reduced by using a tree rather than a list.

> 3) Validate vCID (virtual Collection ID) by checking against
> re-distributor address/cpu numbers
> of this domain

Validating vCID can be O(1) if you use only the cpu numbers (see
GITS_TYPER.PTA = 0).
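For instance, when collections target processor numbers (GITS_TYPER.PTA =
0) the check can be a simple bounds test; a sketch, with invented names:

    #include <stdbool.h>
    #include <stdint.h>

    struct domain { unsigned int max_vcpus; };

    /* O(1): one collection per vCPU, so a vCID is valid iff it names an
     * existing vCPU of the domain. */
    static bool vcid_is_valid(const struct domain *d, uint16_t vcid)
    {
        return vcid < d->max_vcpus;
    }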

> 4) Allocate physical LPI for the vID (virtual LPI) from lpi map of
> this device
>- Check if virtual LPI is already allocated from this device.
>- If not allocate it

Not expensive. Only looking in a bitmap.

>- Update lpi entries for this device

What do you mean by updating the LPI entries for this device?

> 5) Allocate memory for physical LPI descriptor (Add radix tree
> entry) and populate it
> 6) Call route_irq_to_guest() for this LPI

This could be done earlier by pre-allocating a chunk of LPIs.

If memory usage is a concern, I think we could allocate one IRQ
descriptor per chunk of LPIs and manage it ourself.

> 7) Format physical ITS command and send to pITS

Not expensive.

Overall, I don't think commands are so expensive if we take the time to
think about how to optimize the emulation.

Regards,

-- 
Julien Grall



Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Vijay Kilari
On Fri, May 15, 2015 at 6:23 PM, Ian Campbell  wrote:
> On Fri, 2015-05-15 at 18:17 +0530, Vijay Kilari wrote:
>> On Fri, May 15, 2015 at 5:33 PM, Julien Grall  
>> wrote:
>> > On 15/05/15 12:30, Ian Campbell wrote:
>> >>> Handling of a single vITS and multiple pITS can be made simple.
>> >>>
>> >>> All ITS commands except SYNC & INVALL have a device ID which will
>> >>> help us to know to which pITS each should be sent.
>> >>>
>> >>> SYNC & INVALL requested by the guest can be dropped by Xen,
>> >>>  and Xen can instead append SYNC & INVALL wherever they are required.
>> >>> (Ex: the Linux driver adds SYNC for the commands that require it.)
>> >>> With this assumption, all ITS commands are mapped to a pITS
>> >>> and there is no need for synchronization across pITS.
>> >>
>> >> You've ignored the second bullet and its three sub-bullets, I think.
>> >
>> Why can't we group the batch of commands based on the pITS it has
>> to be sent to?
>
> Are you suggesting that each batch we send should be synchronous? (i.e.
> end with SYNC+INT) That doesn't seem at all desirable.

Not only at the end of a batch; SYNC can be appended after every
command within the batch.

Also, to handle the second bullet, a batch of commands might be
sent to multiple pITS. In that case the batch of ITS commands is split
across pITS and we have
to wait for all the pITS to complete. Managing this would be difficult.
For this I propose that batches be created/split such that each batch
contains commands related to one pITS. But it leads to small batches of commands.

>> > Aside from ignoring the second bullet, it's not possible to drop a
>> > SYNC/INVALL command sent by the guest like that. How can you decide
>> > when a SYNC is required or not? Why would dropping an "optional" SYNC
>> > be fine? The spec only says "This command specifies that all actions
>> > for the specified re-distributor must be completed"...
>>
>> If Xen is sending SYNC/INVALL commands to the pITS based on the commands
>> Xen is sending on the pITS, there is no harm in ignoring guest commands.
>>
>> SYNC/INVALL always depend on previous ITS commands.
>> IMO, alone these commands do not have any significance.
>>
>> >
>> > Linux is not a good example of respecting the spec. Developers may
>> > decide to put SYNC in different, newly necessary places and we won't be
>> > able to handle it correctly in Xen (see the vGICv3 re-dist example...).
>> >
>> > If we go with one vITS for multiple pITS we would have to send the
>> > SYNC/INVALL command to every pITS.
>> >
>> > Regards,
>> >
>> > --
>> > Julien Grall
>
>



Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Ian Campbell
On Fri, 2015-05-15 at 18:08 +0530, Vijay Kilari wrote:
> On Fri, May 15, 2015 at 4:58 PM, Ian Campbell  wrote:
> > On Wed, 2015-05-13 at 21:57 +0530, Vijay Kilari wrote:
> >> > * On receipt of an interrupt notification arising from Xen's own use
> >> >   of `INT`; (see discussion under Completion)
> >>
> >> If the INT notification method is used, then I don't think there is a
> >> need for pITS scheduling on CREADR read.
> >>
> >> As we discussed in patch #13, the steps below should suffice to
> >> virtualize the command queue.
> >>
> >> 1) On each guest CWRITER update, read a batch ('m' commands) of
> >> commands, translate it and put it on the pITS schedule list. If there
> >> are more than 'm' commands, create m/n entries in the schedule list.
> >> Append an INT command for each schedule list entry.
> >
> > How many INT commands do you mean here?
> 
> One INT command (Xen's completion INT) per batch
> 
> >
> >>  1a) If there is no ongoing command from this vITS on the physical
> >>queue, send to the physical queue.
> >>  1b) If there is an ongoing command, return to the guest.
> >> 2) On receiving the completion interrupt, update the guest's CREADR and
> >> post the next command from the schedule list to the physical queue.
> >>
> >>- There will be no overhead of translating commands in interrupt
> >> context, which is quite heavy because translating an ITS command
> >> requires validating and updating internal ITS structures.
> >> and updating interval ITS structures.
> >
> > Can you give some examples of the heaviest translations please so I can
> > get a feel for actually how expensive we are talking here.
> >
> For example to translate MAPVI device_ID, event_ID, vID, vCID
[...]

Thanks.

> >>- Always only one request from the guest will be posted to the
> >>  physical queue
> >>- Even if the guest floods us with a large number of commands, all the
> >>  commands will be translated, queued in the schedule list and posted
> >>  batch by batch
> >>- The scheduling pass is called only on CWRITER & completion INT.
> >
> > I think the main difference in what you propose here is that commands
> > are queued in pre-translated form to be injected (cheaply) during
> > scheduling as opposed to being left on the guest queue and translated
> > directly into the pits queue.
> >
> > I think `INT` vs `CREADR` scheduling is largely orthogonal to that.
> >
> > Julien proposed moving scheduling to a softirq, which gets it out of IRQ
> > context (good) but doesn't necessarily account the translation to the
> > guest, which is a benefit of your approach. (I think things which happen
> > in a softirq are implicitly accounted to current, whoever that may be.)
> >
> One softirq that looks at all the vITS and posts the commands to the
> pITS? Or one softirq per vITS?

The former.

However in draft B I proposed that we might need something more like the
latter for accounting purposes, either the actual scheduling pass or a
per-vITS translation pass.

> > On the downside pretranslation adds memory overhead and reintroduces the
> > issue of a potentially long synchronous translation during `CWRITER`
> > handling.
> 
> Memory that is allocated is freed after completion of that batch.

It is still overhead.

>   The translation duration depends on how many commands the guest is
> writing before updating CWRITER.

Xen cannot trust a guest not to write an enormous batch. We need to
think in terms of malicious guest behaviour, i.e. a guest deliberately
trying to subvert or DoS the system; we cannot assume a well-behaved guest.

> >> > Possible simplification: If we arrange that no guest ever has multiple
> >> > batches in flight (which can occur if we wrap around the list several
> >> > times) then we may be able to simplify the book keeping
> >> > required. However this may need some careful thought wrt fairness for
> >> > guests submitting frequent small batches of commands vs those sending
> >> > large batches.
> >>
> >>   If one LPI of the dummy device assigned to one VM, then book keeping
> >> per vITS becomes simple
> >
> > What dummy device do you mean? What simplifications does it imply?
> >
> 
>   I mean a fake (non-existent) device to generate the completion INT.
> Using a unique completion INT for every vITS, the book keeping would be
> simple. This helps to identify the vITS on receiving a completion INT
> (completion INT <=> vITS mapping).

It already seems interesting to find one INT; would finding N (for
potentially large N) be possible?

However, given the synchronous nature of things I think one suffices; you
can fairly easily keep the vITSs on a list in the order they appear on
the ring etc.
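A sketch of that bookkeeping with a single completion INT (list handling
and names are invented, and CREADR wrap-around is ignored for brevity):

    #include <stdint.h>

    struct vits_cq {
        struct vits_cq *next;   /* in-flight list, in ring order */
        uint64_t batch_end;     /* physical CREADR once its batch is done */
    };

    static struct vits_cq *inflight;   /* head = oldest batch on the ring */

    extern uint64_t pits_read_creadr(void);
    extern void vits_complete_batch(struct vits_cq *cq);

    /* Completion-INT handler: retire every batch the pITS has consumed. */
    static void completion_int(void)
    {
        uint64_t creadr = pits_read_creadr();

        while (inflight && inflight->batch_end <= creadr) {
            struct vits_cq *cq = inflight;

            inflight = cq->next;
            vits_complete_batch(cq);   /* update guest CREADR, post more */
        }
    }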

> 
> >>
> >> >
> >> > ### Completion
> >> >
> >> > It is expected that commands will normally be completed (resulting in
> >> > an update of the corresponding `vits_cq.creadr`) via guest read from
> >> > `CREADR`. This will trigger a scheduling pass which will ensure the
> >> > `vits_cq.creadr` value is up to date before it is returned.
> >> >
> >> If guest is CREADR to know c

Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Ian Campbell
On Fri, 2015-05-15 at 13:19 +0100, Julien Grall wrote:
> On 15/05/15 11:59, Ian Campbell wrote:
>  AFAIU the process suggested, Xen will inject small batches as long as the
>  physical command queue is not full.
> >>>
>  Let's take a simple case: only a single domain is using vITS on the
>  platform. If it injects a huge number of commands, Xen will split it
>  into lots of small batches. All batches will be injected in the same pass
>  as long as they fit in the physical command queue. Am I correct?
> >>>
> >>> That's how it is currently written, yes. With the "possible
> >>> simplification" above the answer is no, only a batch at a time would be
> >>> written for each guest.
> >>>
> >>> BTW, it doesn't have to be a single guest, the sum total of the
> >>> injections across all guests could also take a similar amount of time.
> >>> Is that a concern?
> >>
> >> Yes, the example with only a guest was easier to explain.
> > 
> > So as well as limiting the number of commands in each domain's batch we
> > also want to limit the total number of batches?
> 
> Right. We want to have a "short" scheduling pass no matter the size of
> the queue.
> 
>  I think we have to restrict the total number of batches (i.e. for all
>  the domains) injected in the same scheduling pass.
> 
>  I would even tend to allow only one in-flight batch per domain. That
>  would limit the possible problem I pointed out.
> >>>
> >>> This is the "possible simplification" I think. Since it simplifies other
> >>> things (I think) as well as addressing this issue I think it might be a
> >>> good idea.
> >>
> >> With the limitation on commands sent per batch, would the fairness you
> >> were talking about in the design doc still be required?
> > 
> > I think we still want to schedule the guests in a strict round-robin
> > manner, to avoid one guest monopolising things.
> 
> I agree, although I was talking about the fairness you mentioned in
> "However this may need some careful thought wrt fairness for
> guests submitting frequent small batches of commands vs those sending
> large batches."

Ah, yes.

The trade-off here is between the number of INT+scheduling passes vs the
time spent in each INT pass. Smaller batches would mean more INTs and more
overhead there.

So I think limiting batch sizes is ok, but we may need to tweak the
sizing a bit based on experience.

> > Therefore it is proposed that the restriction that a single vITS maps
> > to one pITS be retained. If a guest requires access to devices
> > associated with multiple pITSs then multiple vITS should be
> > configured.
> 
>  Having multiple vITS per domain brings other issues:
>   - How do you know the number of ITS to describe in the device tree at 
>  boot?
> >>>
> >>> I'm not sure. I don't think 1 vs N is very different from the question
> >>> of 0 vs 1 though, somehow the tools need to know about the pITS setup.
> >>
> >> I don't see why the tools would require to know the pITS setup.
> > 
> > Even with only a single vITS the tools need to know if the system has 0,
> > 1, or more pITS, to know whether to create a vITS at all or not.
> 
> In the 1 vITS solution no, it's only necessary to add a new gic define
> for the gic_version field in xen_arch_domainconfig.

Would we expose a vITS to guests on a host which has no pITS at all?
What would happen if the guest tried to use it? That's the 0 vITS case,
and once you can distinguish 0 from 1 distinguishing larger numbers
isn't a huge stretch.

> >> If we are going to expose multiple vITS to the guest, we should only use
> >> vITS for guests using PCI passthrough. This is because migration won't be
> >> compatible with it.
> > 
> > It would be possible to support one s/w-only vITS for migration, i.e. the
> > evtchn thing at the end, but for the general case that is correct. On
> > x86 I believe that if you hot unplug all passthrough devices you can
> > migrate and then plug in other devices at the other end.
> 
> What about migration on platforms having fewer/more pITS (AFAIU on Cavium
> it may be possible because there is only one node)? If we want to
> migrate vITS we would have to handle the case where there is a mismatch,
> which brings us to the solution with one vITS.

At the moment I don't think we are expecting to do heterogeneous
migration. But perhaps we should plan for that eventuality, since one
day it seems people would want to at least move to a newer version of
the same silicon family for upgrade purposes.

> As said in your event channel paragraph, we should put aside the event
> channel injected by the vITS for now. It was only a suggestion and it
> will require more thought than the vITS emulation.








Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Ian Campbell
On Fri, 2015-05-15 at 18:17 +0530, Vijay Kilari wrote:
> On Fri, May 15, 2015 at 5:33 PM, Julien Grall  wrote:
> > On 15/05/15 12:30, Ian Campbell wrote:
> >>> Handling of a single vITS and multiple pITS can be made simple.
> >>>
> >>> All ITS commands except SYNC & INVALL have a device ID which will
> >>> help us to know to which pITS each should be sent.
> >>>
> >>> SYNC & INVALL requested by the guest can be dropped by Xen,
> >>>  and Xen can instead append SYNC & INVALL wherever they are required.
> >>> (Ex: the Linux driver adds SYNC for the commands that require it.)
> >>> With this assumption, all ITS commands are mapped to a pITS
> >>> and there is no need for synchronization across pITS.
> >>
> >> You've ignored the second bullet and its three sub-bullets, I think.
> >
> Why can't we group the batch of commands based on the pITS it has
> to be sent to?

Are you suggesting that each batch we send should be synchronous? (i.e.
end with SYNC+INT) That doesn't seem at all desirable.

> > Aside from ignoring the second bullet, it's not possible to drop a
> > SYNC/INVALL command sent by the guest like that. How can you decide
> > when a SYNC is required or not? Why would dropping an "optional" SYNC
> > be fine? The spec only says "This command specifies that all actions
> > for the specified re-distributor must be completed"...
> 
>  If Xen is sending SYNC/INVALL commands to the pITS based on the commands
> Xen is sending on the pITS, there is no harm in ignoring guest commands.
> 
> SYNC/INVALL always depend on previous ITS commands.
> IMO, alone these commands do not have any significance.
> 
> >
> > Linux is not a good example of respecting the spec. Developers may
> > decide to put SYNC in different, newly necessary places and we won't be
> > able to handle it correctly in Xen (see the vGICv3 re-dist example...).
> >
> > If we go with one vITS for multiple pITS we would have to send the
> > SYNC/INVALL command to every pITS.
> >
> > Regards,
> >
> > --
> > Julien Grall





Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Julien Grall
On 15/05/15 13:47, Vijay Kilari wrote:
>> Aside from ignoring the second bullet, it's not possible to drop a
>> SYNC/INVALL command sent by the guest like that. How can you decide
>> when a SYNC is required or not? Why would dropping an "optional" SYNC
>> be fine? The spec only says "This command specifies that all actions
>> for the specified re-distributor must be completed"...
> 
>  If Xen is sending SYNC/INVALL commands to the pITS based on the commands
> Xen is sending on the pITS, there is no harm in ignoring guest commands.
> 
> SYNC/INVALL always depend on previous ITS commands.
> IMO, alone these commands do not have any significance.

The SYNC command ensures that any commands before it have been completed...

The guest can decide to put one after a single command or after a batch
of commands.

You have to respect it and not let Xen guess when it's necessary to have
one.

Regards,

-- 
Julien Grall



Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Vijay Kilari
On Fri, May 15, 2015 at 5:33 PM, Julien Grall  wrote:
> On 15/05/15 12:30, Ian Campbell wrote:
>>> Handling of a single vITS and multiple pITS can be made simple.
>>>
>>> All ITS commands except SYNC & INVALL have a device ID which will
>>> help us to know to which pITS each should be sent.
>>>
>>> SYNC & INVALL requested by the guest can be dropped by Xen,
>>>  and Xen can instead append SYNC & INVALL wherever they are required.
>>> (Ex: the Linux driver adds SYNC for the commands that require it.)
>>> With this assumption, all ITS commands are mapped to a pITS
>>> and there is no need for synchronization across pITS.
>>
>> You've ignored the second bullet and its three sub-bullets, I think.
>
   Why can't we group the batch of commands based on the pITS it has
to be sent to?

> Aside from ignoring the second bullet, it's not possible to drop a
> SYNC/INVALL command sent by the guest like that. How can you decide
> when a SYNC is required or not? Why would dropping an "optional" SYNC
> be fine? The spec only says "This command specifies that all actions
> for the specified re-distributor must be completed"...

 If Xen is sending SYNC/INVALL commands to the pITS based on the commands
Xen is sending on the pITS, there is no harm in ignoring guest commands.

SYNC/INVALL always depend on previous ITS commands.
IMO, alone these commands do not have any significance.

>
> Linux is not a good example of respecting the spec. Developers may
> decide to put SYNC in different, newly necessary places and we won't be
> able to handle it correctly in Xen (see the vGICv3 re-dist example...).
>
> If we go with one vITS for multiple pITS we would have to send the
> SYNC/INVALL command to every pITS.
>
> Regards,
>
> --
> Julien Grall



Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Vijay Kilari
On Fri, May 15, 2015 at 4:58 PM, Ian Campbell  wrote:
> On Wed, 2015-05-13 at 21:57 +0530, Vijay Kilari wrote:
>> > * On receipt of an interrupt notification arising from Xen's own use
>> >   of `INT`; (see discussion under Completion)
>>
>> If the INT notification method is used, then I don't think there is a
>> need for pITS scheduling on CREADR read.
>>
>> As we discussed in patch #13, the steps below should suffice to
>> virtualize the command queue.
>>
>> 1) On each guest CWRITER update, read a batch ('m' commands) of
>> commands, translate it and put it on the pITS schedule list. If there
>> are more than 'm' commands, create m/n entries in the schedule list.
>> Append an INT command for each schedule list entry.
>
> How many INT commands do you mean here?

   One INT command (Xen's completion INT) per batch

>
>>  1a) If there is no ongoing command from this vITS on the physical
>>queue, send to the physical queue.
>>  1b) If there is an ongoing command, return to the guest.
>> 2) On receiving the completion interrupt, update the guest's CREADR and
>> post the next command from the schedule list to the physical queue.
>>
>> With this,
>>- There will be no overhead of translating commands in interrupt
>> context, which is quite heavy because translating an ITS command
>> requires validating and updating internal ITS structures.
>
> Can you give some examples of the heaviest translations please so I can
> get a feel for actually how expensive we are talking here.
>
For example, to translate MAPVI (device_ID, event_ID, vID, vCID):

1) Read the command from the vITS command queue
2) Validate that device_ID is valid by looking at the device list
attached to that domain (vITS)
3) Validate vCID (virtual Collection ID) by checking it against the
re-distributor addresses/CPU numbers of this domain
4) Allocate a physical LPI for the vID (virtual LPI) from the LPI map
of this device:
   - Check if a physical LPI is already allocated for this virtual LPI
   - If not, allocate it
   - Update the LPI entries for this device
5) Allocate memory for the physical LPI descriptor (add a radix tree
entry) and populate it
6) Call route_irq_to_guest() for this LPI
7) Format the physical ITS command and send it to the pITS
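
As a rough sketch of those steps (all helpers below are hypothetical,
and the real route_irq_to_guest() signature is simplified):

    static int vits_translate_mapvi(struct domain *d, uint32_t devid,
                                    uint32_t event, uint32_t vlpi,
                                    uint32_t vcid)
    {
        struct vits_device *dev;
        uint32_t plpi;

        dev = vits_get_device(d, devid);        /* 2) validate device_ID */
        if ( !dev )
            return -ENODEV;

        if ( !vits_vcid_is_valid(d, vcid) )     /* 3) validate vCID      */
            return -EINVAL;

        plpi = vits_lpi_lookup(dev, vlpi);      /* 4) vLPI -> pLPI       */
        if ( plpi == INVALID_LPI )
        {
            plpi = vits_lpi_alloc(dev, vlpi);
            if ( plpi == INVALID_LPI )
                return -ENOSPC;
        }

        if ( vits_lpi_desc_insert(d, plpi) )    /* 5) radix tree entry   */
            return -ENOMEM;

        route_irq_to_guest(d, plpi, "vits");    /* 6) route to guest     */

        return pits_send_mapvi(devid, event, plpi, vcid); /* 7) to pITS  */
    }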

>>- Always only one request from the guest will be posted to the physical queue
>>- Even if the guest floods with a large number of commands, all the commands
>>  will be translated, queued in the schedule list, and posted batch by batch
>>- Scheduling pass is called only on CWRITER & completion INT.
>
> I think the main difference in what you propose here is that commands
> are queued in pre-translated form to be injected (cheaply) during
> scheduling as opposed to being left on the guest queue and translated
> directly into the pits queue.
>
> I think `INT` vs `CREADR` scheduling is largely orthogonal to that.
>
> Julien proposed moving scheduling to a softirq, which gets it out of IRQ
> context (good) but does not necessarily account the translation to the
> guest, which is a benefit of your approach. (I think things which happen
> in a softirq are implicitly accounted to current, whoever that may be)
>
   One softirq that looks at all the vITSs and posts the commands to the pITS,
or one softirq per vITS?

> On the downside pretranslation adds memory overhead and reintroduces the
> issue of a potentially long synchronous translation during `CWRITER`
> handling.

   Memory that is allocated is freed after completion of that batch.
  The translation duration depends on how many commands the guest has
written before updating CWRITER.

>
> We could pretranslate a batch of commands into a s/w queue rather than
> into the pits queue, but then we are back to where do we refill that
> queue from.
>
> The first draft wasn't particularly clear on when translation occurs
> (although I intended it to be during scheduling). I shall add some
> treatment of that to the next draft.
>>
>> > * On any interrupt injection arising from a guests use of the `INT`
>> >   command; (XXX perhaps, see discussion under Completion)
>> >
>> > Each scheduling pass will:
>> >
>> > * Read the physical `CREADR`;
>> > * For each command between `pits.last_creadr` and the new `CREADR`
>> >   value process completion of that command and update the
>> >   corresponding `vits_cq.creadr`.
>> > * Attempt to refill the pITS Command Queue (see below).
>> >
>> > ### Filling the pITS Command Queue.
>> >
>> > Various algorithms could be used here. For now a simple proposal is
>> > to traverse the `pits.schedule_list` starting from where the last
>> > refill finished (i.e. not from the top of the list each time).
>> >
>> > If a `vits_cq` has no pending commands then it is removed from the
>> > list.
>> >
>> > If a `vits_cq` has some pending commands then `min(pits-free-slots,
>> > vits-outstanding, VITS_BATCH_SIZE)` will be taken from the vITS
>> > command queue, translated and placed onto the pITS
>> > queue. `vits_cq.progress` will be updated to reflect this.
>> >
>> > Each `vits_

Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Julien Grall
On 15/05/15 11:59, Ian Campbell wrote:
 AFAIU the process suggested, Xen will inject small batch as long as the
 physical command queue is not full.
>>>
 Let's take a simple case, only a single domain is using vITS on the
 platform. If it injects a huge number of commands, Xen will split it
 with lots of small batch. All batch will be injected in the same pass as
 long as it fits in the physical command queue. Am I correct?
>>>
>>> That's how it is currently written, yes. With the "possible
>>> simplification" above the answer is no, only a batch at a time would be
>>> written for each guest.
>>>
>>> BTW, it doesn't have to be a single guest, the sum total of the
>>> injections across all guests could also take a similar amount of time.
>>> Is that a concern?
>>
>> Yes, the example with only a guest was easier to explain.
> 
> So as well as limiting the number of commands in each domain's batch we
> also want to limit the total number of batches?

Right. We want to have a "short" scheduling pass no matter the size of
the queue.

 I think we have to restrict total number of batch (i.e for all the
 domain) injected in a same scheduling pass.

 I would even tend to allow only one in flight batch per domain. That
 would limit the possible problem I pointed out.
>>>
>>> This is the "possible simplification" I think. Since it simplifies other
>>> things (I think) as well as addressing this issue I think it might be a
>>> good idea.
>>
>> With the limitation of commands sent per batch, would the fairness you
>> were talking about in the design doc still be required?
> 
> I think we still want to schedule the guests in a strict round robin
> manner, to avoid one guest monopolising things.

I agree, although I was talking about the fairness you mentioned in
"However this may need some careful thought wrt fairness for
guests submitting frequent small batches of commands vs those sending
large batches."

> Therefore it is proposed that the restriction that a single vITS maps
> to one pITS be retained. If a guest requires access to devices
> associated with multiple pITSs then multiple vITS should be
> configured.

 Having multiple vITS per domain brings other issues:
- How do you know the number of ITS to describe in the device tree at 
 boot?
>>>
>>> I'm not sure. I don't think 1 vs N is very different from the question
>>> of 0 vs 1 though, somehow the tools need to know about the pITS setup.
>>
>> I don't see why the tools would require to know the pITS setup.
> 
> Even with only a single vits the tools need to know if the system has 0,
> 1, or more pits, to know whether to create a vits at all or not.

In the 1 vITS solution no, it's only necessary to add a new gic define
for the gic_version field in xen_arch_domainconfig.

Although, I agree that in multiple vITS configuration we would need to
know the number of vITS to create (not necessarily the number of pITS).
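
Purely for illustration, that could be carried like this (the nr_vits
field and any new gic define are hypothetical, not existing ABI, and
the real struct has other fields):

    struct xen_arch_domainconfig {
        uint8_t gic_version;  /* IN: existing field; a new define could
                                 advertise a GICv3 with (v)ITS. */
        uint8_t nr_vits;      /* IN: hypothetical: number of vITSs the
                                 tools want emulated for this domain. */
    };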

- How do you tell the guest that the PCI device is mapped to a
 specific vITS?
>>>
>>> Device Tree or IORT, just like on native and just like we'd have to tell
>>> the guest about that mapping even if there was a single vITS.
>>
>> Right, although the root controller can only be attached to one ITS.
>>
>> It will be necessary to have multiple root controllers in the guest in
>> the case where we passthrough devices using different ITSs.
>>
>> Is pci-back able to expose multiple root controllers?
> 
> In principle the xenstore protocol supports it, but AFAIK all toolstacks
> have only ever used "bus" 0, so I wouldn't be surprised if there were
> bugs lurking.
> 
> But we could fix those, I don't think it is a requirement that this
> stuff suddenly springs into life on ARM even with existing kernels.

Right.

> 
>>> I think the complexity of having one vITS target multiple pITSs is going
>>> to be quite high in terms of data structures and the amount of
>>> thinking/tracking scheduler code will have to do, mostly down to out of
>>> order completion of things put in the pITS queue.
>>
>> I understand the complexity, but exposing one vITS per pITS means that we
>> are exposing the underlying hardware to the guest.
> 
> Some aspect of it, yes, but it is still a virtual ITS.

Yes and no. It makes the migration case more complex (even without PCI
passthrough). See below.

>> If we are going to expose multiple vITS to the guest, we should only use
>> vITS for guest using PCI passthrough. This is because migration won't be
>> compatible with it.
> 
> It would be possible to support one s/w only vits for migration, i.e. the
> evtchn thing at the end, but for the general case that is correct. On
> x86 I believe that if you hot unplug all passthrough devices you can
> migrate and then plug in other devices at the other end.

What about migration on platforms having fewer/more pITSs (AFAIU on Cavium
it may be possible because there is only one node)? If we want to
migrate vITS we should hav

Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Julien Grall
On 15/05/15 12:30, Ian Campbell wrote:
>> Handling of a single vITS and multiple pITSs can be made simple.
>>
>> All ITS commands except SYNC & INVALL has device id which will
>> help us to know to which pITS it should be sent.
>>
>> SYNC & INVALL can be dropped by Xen on guest request
>>  and let Xen append them wherever SYNC & INVALL are required.
>> (Ex; Linux driver adds SYNC for required commands).
>> With this assumption, all ITS commands are mapped to pITS
>> and no need of synchronization across pITS
> 
> You've ignored the second bullet and its three sub-bullets, I think.

Aside from ignoring the second bullet, it's not possible to drop a
SYNC/INVALL command sent by the guest like that. How can you decide when a
SYNC is required or not? Why would dropping an "optional" SYNC be fine? The
spec only says "This command specifies that all actions for the specified
re-distributor must be completed"...

Linux is not a good example for respecting the spec. Developers may
decide to put SYNC in different, newly necessary places and we won't be
able to handle it correctly in Xen (see the vGICv3 re-dist example...).

If we go with one vITS for multiple pITSs we would have to send the
SYNC/INVALL commands to every pITS.

Regards,

-- 
Julien Grall



Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Ian Campbell
On Fri, 2015-05-15 at 16:56 +0530, Vijay Kilari wrote:
> On Fri, May 15, 2015 at 4:29 PM, Ian Campbell  wrote:
> > On Wed, 2015-05-13 at 15:26 +0100, Julien Grall wrote:
> >> >>>   on that vits;
> >> >>> * On receipt of an interrupt notification arising from Xen's own use
> >> >>>   of `INT`; (see discussion under Completion)
> >> >>> * On any interrupt injection arising from a guests use of the `INT`
> >> >>>   command; (XXX perhaps, see discussion under Completion)
> >> >>
> >> >> With all the solutions suggested, it is very likely that we will try
> >> >> to execute multiple scheduling passes at the same time.
> >> >>
> >> >> One way is to wait until the previous pass has finished. But that would
> >> >> mean that the scheduler would be executed very often.
> >> >>
> >> >> Or maybe you plan to offload the scheduler in a softirq?
> >> >
> >> > Good point.
> >> >
> >> > A soft irq might be one solution, but it is problematic during emulation
> >> > of `CREADR`, when we would like to do a pass immediately to complete any
> >> > operations outstanding for the domain doing the read.
> >> >
> >> > Or just using spin_try_lock and not bothering if one is already in
> >> > progress might be another. But has similar problems.
> >> >
> >> > Or we could defer only scheduling from `INT` (either guest or Xen's own)
> >> > to a softirq but do ones from `CREADR` emulation synchronously? The
> >> > softirq would be run on return from the interrupt handler but multiple
> >> > such would be coalesced I think?
> >>
> >> I think we could defer the scheduling to a softirq for CREADR too, if
> >> the guest is using:
> >>   - INT completion: vits.creadr would have been correctly updated when
> >> receiving the INT in Xen.
> >>   - polling completion: the guest will loop on CREADR. It will likely
> >> get the info on the next read. The drawback is the guest may lose a few
> >> instruction cycles.
> >>
> >> Overall, I don't think it's necessary to have an accurate CREADR.
> >
> > Yes, deferring the update by one exit+enter might be tolerable. I added
> > after this list:
> > This may result in lots of contention on the scheduler
> > locking. Therefore we consider that in each case all which happens is
> > triggering of a softirq which will be processed on return to guest,
> > and just once even for multiple events. This is considered OK for the
> > `CREADR` case because at worst the value read will be one cycle out of
> > date.
> >
> >
> >
> >>
> >> [..]
> >>
> >> >> AFAIU the process suggested, Xen will inject small batch as long as the
> >> >> physical command queue is not full.
> >> >
> >> >> Let's take a simple case, only a single domain is using vITS on the
> >> >> platform. If it injects a huge number of commands, Xen will split it
> >> >> with lots of small batch. All batch will be injected in the same pass as
> >> >> long as it fits in the physical command queue. Am I correct?
> >> >
> >> > That's how it is currently written, yes. With the "possible
> >> > simplification" above the answer is no, only a batch at a time would be
> >> > written for each guest.
> >> >
> >> > BTW, it doesn't have to be a single guest, the sum total of the
> >> > injections across all guests could also take a similar amount of time.
> >> > Is that a concern?
> >>
> >> Yes, the example with only a guest was easier to explain.
> >
> > So as well as limiting the number of commands in each domain's batch we
> > also want to limit the total number of batches?
> >
> >> >> I think we have to restrict total number of batch (i.e for all the
> >> >> domain) injected in a same scheduling pass.
> >> >>
> >> >> I would even tend to allow only one in flight batch per domain. That
> >> >> would limit the possible problem I pointed out.
> >> >
> >> > This is the "possible simplification" I think. Since it simplifies other
> >> > things (I think) as well as addressing this issue I think it might be a
> >> > good idea.
> >>
> >> With the limitation of commands sent per batch, would the fairness you
> >> were talking about in the design doc still be required?
> >
> > I think we still want to schedule the guests in a strict round robin
> > manner, to avoid one guest monopolising things.
> >
> >> >>> Therefore it is proposed that the restriction that a single vITS maps
> >> >>> to one pITS be retained. If a guest requires access to devices
> >> >>> associated with multiple pITSs then multiple vITS should be
> >> >>> configured.
> >> >>
> >> >> Having multiple vITS per domain brings other issues:
> >> >>- How do you know the number of ITS to describe in the device tree 
> >> >> at boot?
> >> >
> >> > I'm not sure. I don't think 1 vs N is very different from the question
> >> > of 0 vs 1 though, somehow the tools need to know about the pITS setup.
> >>
> >> I don't see why the tools would require to know the pITS setup.
> >
> > Even with only a single vits the tools need to know if the system has 0

Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Ian Campbell
On Wed, 2015-05-13 at 21:57 +0530, Vijay Kilari wrote:
> > * On receipt of an interrupt notification arising from Xen's own use
> >   of `INT`; (see discussion under Completion)
> 
> If INT notification method is used, then I don't think there is need
> for pITS scheduling on CREADER read.
> 
> As we discussed in patch #13, the below steps should suffice to virtualize
> command queue.
> 
> 1) On each guest CWRITER update, read a batch of 'm' commands,
> translate them, and put them on the pITS schedule list. If there are
> more than 'm' commands, create n/m entries in the schedule list.
> Append an INT command to each schedule list entry

How many INT commands do you mean here?

>  1a) If there is no ongoing command from this vITS on the physical queue,
>      send to the physical queue.
>  1b) If there is an ongoing command, return to the guest.
> 2) On receiving completion interrupt, update CREADR of guest and post next
> command from schedule list to physical queue.
> 
> With this,
>- There will be no overhead of translating commands in interrupt context,
> which is quite heavy because translating an ITS command requires validating
> and updating internal ITS structures.

Can you give some examples of the heaviest translations please, so I can
get a feel for how expensive we are actually talking here.

>- Always only one request from the guest will be posted to the physical queue
>- Even if the guest floods with a large number of commands, all the commands
>  will be translated, queued in the schedule list, and posted batch by batch
>- Scheduling pass is called only on CWRITER & completion INT.

I think the main difference in what you propose here is that commands
are queued in pre-translated form to be injected (cheaply) during
scheduling as opposed to being left on the guest queue and translated
directly into the pits queue.

I think `INT` vs `CREADR` scheduling is largely orthogonal to that.

Julien proposed moving scheduling to a softirq, which gets it out of IRQ
context (good) but does not necessarily account the translation to the
guest, which is a benefit of your approach. (I think things which happen
in a softirq are implicitly accounted to current, whoever that may be)

On the downside pretranslation adds memory overhead and reintroduces the
issue of a potentially long synchronous translation during `CWRITER`
handling.

We could pretranslate a batch of commands into a s/w queue rather than
into the pits queue, but then we are back to where do we refill that
queue from.

The first draft wasn't particularly clear on when translation occurs
(although I intended it to be during scheduling). I shall add some
treatment of that to the next draft.
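
In the meantime, a sketch of the pre-translation variant under
discussion, with one batch in flight per vITS and Xen's completion INT
appended to each batch; every type and helper here is hypothetical:

    struct vits_batch {
        struct list_head entry;                  /* On vcq->batches */
        struct its_cmd cmd[VITS_BATCH_SIZE + 1]; /* +1 for Xen's INT */
        unsigned int len;
    };

    static void vits_handle_cwriter(struct vits_cq *vcq, uint32_t cwriter)
    {
        vcq->cwriter = cwriter;

        /* Translate what the guest just made visible, a batch at a
         * time. This runs in the guest's trap context, so the (heavy)
         * translation cost is accounted to the guest. */
        while ( vits_cmds_pending(vcq) )
        {
            struct vits_batch *b = vits_batch_alloc();

            b->len = vits_translate_some(vcq, b->cmd, VITS_BATCH_SIZE);
            b->cmd[b->len++] = xen_completion_int(vcq); /* INT per batch */
            list_add_tail(&b->entry, &vcq->batches);
        }

        if ( !vcq->batch_in_flight )     /* 1a: idle, post a batch now */
            vits_post_next_batch(vcq);
        /* 1b: otherwise return; the completion IRQ posts the next one. */
    }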
> 
> > * On any interrupt injection arising from a guests use of the `INT`
> >   command; (XXX perhaps, see discussion under Completion)
> >
> > Each scheduling pass will:
> >
> > * Read the physical `CREADR`;
> > * For each command between `pits.last_creadr` and the new `CREADR`
> >   value process completion of that command and update the
> >   corresponding `vits_cq.creadr`.
> > * Attempt to refill the pITS Command Queue (see below).
> >
> > ### Filling the pITS Command Queue.
> >
> > Various algorithms could be used here. For now a simple proposal is
> > to traverse the `pits.schedule_list` starting from where the last
> > refill finished (i.e. not from the top of the list each time).
> >
> > If a `vits_cq` has no pending commands then it is removed from the
> > list.
> >
> > If a `vits_cq` has some pending commands then `min(pits-free-slots,
> > vits-outstanding, VITS_BATCH_SIZE)` will be taken from the vITS
> > command queue, translated and placed onto the pITS
> > queue. `vits_cq.progress` will be updated to reflect this.
> >
> > Each `vits_cq` is handled in turn in this way until the pITS Command
> > Queue is full or there are no more outstanding commands.
> >
> > There will likely need to be a data structure which shadows the pITS
> > Command Queue slots with references to the `vits_cq` which has a
> > command currently occupying that slot and corresponding the index into
> > the virtual command queue, for use when completing a command.
> >
> > `VITS_BATCH_SIZE` should be small, TBD say 4 or 8.
> >
> > Possible simplification: If we arrange that no guest ever has multiple
> > batches in flight (which can occur if we wrap around the list several
> > times) then we may be able to simplify the book keeping
> > required. However this may need some careful thought wrt fairness for
> > guests submitting frequent small batches of commands vs those sending
> > large batches.
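
For concreteness, a sketch of such a pass, including the shadow
structure mentioned above (all names hypothetical; note this version
permits wrap-around, which the simplification would forbid):

    struct pits_slot_shadow {
        struct vits_cq *vcq; /* Owner of the command in this pITS slot */
        uint32_t vidx;       /* Matching index in the virtual queue    */
    };

    static void pits_scheduling_pass(struct pits *p)
    {
        uint32_t creadr = readl(p->base + GITS_CREADR);

        /* Complete every command consumed since the last pass. */
        while ( p->last_creadr != creadr )
        {
            struct pits_slot_shadow *s = &p->shadow[p->last_creadr / 32];

            s->vcq->creadr = s->vidx;      /* Advance virtual CREADR */
            p->last_creadr = (p->last_creadr + 32) % p->size;
        }

        /* Refill round-robin, resuming where the last refill stopped. */
        while ( !list_empty(&p->schedule_list) && pits_free_slots(p) )
        {
            struct vits_cq *vcq = list_entry(p->schedule_list.next,
                                             struct vits_cq, schedule_list);
            unsigned int n = min(min(pits_free_slots(p),
                                     vits_outstanding(vcq)),
                                 (unsigned int)VITS_BATCH_SIZE);

            if ( !n )                      /* No pending commands: drop */
            {
                list_del(&vcq->schedule_list);
                continue;
            }

            pits_queue_translated(p, vcq, n); /* Updates vcq->progress */
            list_move_tail(&vcq->schedule_list, &p->schedule_list);
        }
    }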
> 
>   If one LPI of the dummy device is assigned to each VM, then book keeping
> per vITS becomes simple

What dummy device do you mean? What simplifications does it imply?

> 
> >
> > ### Completion
> >
> > It is expected that commands will normally be completed (resulting in
> > an update of the corresponding `vits_cq.creadr`) via guest read from
> > `CREADR`. This will trigger a sch

Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Vijay Kilari
On Fri, May 15, 2015 at 4:29 PM, Ian Campbell  wrote:
> On Wed, 2015-05-13 at 15:26 +0100, Julien Grall wrote:
>> >>>   on that vits;
>> >>> * On receipt of an interrupt notification arising from Xen's own use
>> >>>   of `INT`; (see discussion under Completion)
>> >>> * On any interrupt injection arising from a guests use of the `INT`
>> >>>   command; (XXX perhaps, see discussion under Completion)
>> >>
>> >> With all the solutions suggested, it is very likely that we will try
>> >> to execute multiple scheduling passes at the same time.
>> >>
>> >> One way is to wait until the previous pass has finished. But that would
>> >> mean that the scheduler would be executed very often.
>> >>
>> >> Or maybe you plan to offload the scheduler in a softirq?
>> >
>> > Good point.
>> >
>> > A soft irq might be one solution, but it is problematic during emulation
>> > of `CREADR`, when we would like to do a pass immediately to complete any
>> > operations outstanding for the domain doing the read.
>> >
>> > Or just using spin_try_lock and not bothering if one is already in
>> > progress might be another. But has similar problems.
>> >
>> > Or we could defer only scheduling from `INT` (either guest or Xen's own)
>> > to a softirq but do ones from `CREADR` emulation synchronously? The
>> > softirq would be run on return from the interrupt handler but multiple
>> > such would be coalesced I think?
>>
>> I think we could defer the scheduling to a softirq for CREADR too, if
>> the guest is using:
>>   - INT completion: vits.creadr would have been correctly updated when
>> receiving the INT in Xen.
>>   - polling completion: the guest will loop on CREADR. It will likely get
>> the info on the next read. The drawback is the guest may lose a few
>> instruction cycles.
>>
>> Overall, I don't think it's necessary to have an accurate CREADR.
>
> Yes, deferring the update by one exit+enter might be tolerable. I added
> after this list:
> This may result in lots of contention on the scheduler
> locking. Therefore we consider that in each case all which happens is
> triggering of a softirq which will be processed on return to guest,
> and just once even for multiple events. This is considered OK for the
> `CREADR` case because at worst the value read will be one cycle out of
> date.
>
>
>
>>
>> [..]
>>
>> >> AFAIU the process suggested, Xen will inject small batch as long as the
>> >> physical command queue is not full.
>> >
>> >> Let's take a simple case, only a single domain is using vITS on the
>> >> platform. If it injects a huge number of commands, Xen will split it
>> >> with lots of small batch. All batch will be injected in the same pass as
>> >> long as it fits in the physical command queue. Am I correct?
>> >
>> > That's how it is currently written, yes. With the "possible
>> > simplification" above the answer is no, only a batch at a time would be
>> > written for each guest.
>> >
>> > BTW, it doesn't have to be a single guest, the sum total of the
>> > injections across all guests could also take a similar amount of time.
>> > Is that a concern?
>>
>> Yes, the example with only a guest was easier to explain.
>
> So as well as limiting the number of commands in each domain's batch we
> also want to limit the total number of batches?
>
>> >> I think we have to restrict total number of batch (i.e for all the
>> >> domain) injected in a same scheduling pass.
>> >>
>> >> I would even tend to allow only one in flight batch per domain. That
>> >> would limit the possible problem I pointed out.
>> >
>> > This is the "possible simplification" I think. Since it simplifies other
>> > things (I think) as well as addressing this issue I think it might be a
>> > good idea.
>>
>> With the limitation of commands sent per batch, would the fairness you
>> were talking about in the design doc still be required?
>
> > I think we still want to schedule the guests in a strict round robin
> manner, to avoid one guest monopolising things.
>
>> >>> Therefore it is proposed that the restriction that a single vITS maps
>> >>> to one pITS be retained. If a guest requires access to devices
>> >>> associated with multiple pITSs then multiple vITS should be
>> >>> configured.
>> >>
>> >> Having multiple vITS per domain brings other issues:
>> >>- How do you know the number of ITS to describe in the device tree at 
>> >> boot?
>> >
>> > I'm not sure. I don't think 1 vs N is very different from the question
>> > of 0 vs 1 though, somehow the tools need to know about the pITS setup.
>>
>> I don't see why the tools would require to know the pITS setup.
>
> Even with only a single vits the tools need to know if the system has 0,
> > 1, or more pits, to know whether to create a vits at all or not.
>
>> >>- How do you tell the guest that the PCI device is mapped to a
>> >> specific vITS?
>> >
>> > Device Tree or IORT, just like on native and just like we'd have to tell
>> > the guest about that mapping

Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-15 Thread Ian Campbell
On Wed, 2015-05-13 at 15:26 +0100, Julien Grall wrote:
> >>>   on that vits;
> >>> * On receipt of an interrupt notification arising from Xen's own use
> >>>   of `INT`; (see discussion under Completion)
> >>> * On any interrupt injection arising from a guests use of the `INT`
> >>>   command; (XXX perhaps, see discussion under Completion)
> >>
> >> With all the solutions suggested, it is very likely that we will try
> >> to execute multiple scheduling passes at the same time.
> >>
> >> One way is to wait until the previous pass has finished. But that would
> >> mean that the scheduler would be executed very often.
> >>
> >> Or maybe you plan to offload the scheduler in a softirq?
> > 
> > Good point.
> > 
> > A soft irq might be one solution, but it is problematic during emulation
> > of `CREADR`, when we would like to do a pass immediately to complete any
> > operations outstanding for the domain doing the read.
> > 
> > Or just using spin_try_lock and not bothering if one is already in
> > progress might be another. But has similar problems.
> > 
> > Or we could defer only scheduling from `INT` (either guest or Xen's own)
> > to a softirq but do ones from `CREADR` emulation synchronously? The
> > softirq would be run on return from the interrupt handler but multiple
> > such would be coalesced I think?
> 
> I think we could defer the scheduling to a softirq for CREADR too, if
> the guest is using:
>   - INT completion: vits.creadr would have been correctly updated when
> receiving the INT in Xen.
>   - polling completion: the guest will loop on CREADR. It will likely get
> the info on the next read. The drawback is the guest may lose a few
> instruction cycles.
> 
> Overall, I don't think it's necessary to have an accurate CREADR.

Yes, deferring the update by one exit+enter might be tolerable. I added
after this list:
This may result in lots of contention on the scheduler
locking. Therefore we consider that in each case all which happens is
triggering of a softirq which will be processed on return to guest,
and just once even for multiple events. This is considered OK for the
`CREADR` case because at worst the value read will be one cycle out of
date.
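
With Xen's existing open_softirq()/raise_softirq(), that could look
like the following (VITS_SOFTIRQ and the vITS helpers are made up):

    static void vits_softirq(void)
    {
        /* One coalesced pass, run on the return-to-guest path. */
        pits_scheduling_pass(this_cpu_pits());
    }

    /* Every trigger point (CWRITER write, completion INT, guest INT,
     * CREADR read) just raises the softirq... */
    void vits_kick(void)
    {
        raise_softirq(VITS_SOFTIRQ);
    }

    /* ...and CREADR emulation simply returns the current
     * vits_cq.creadr, at worst one scheduling pass out of date. */

    static int __init vits_softirq_init(void)
    {
        open_softirq(VITS_SOFTIRQ, vits_softirq);
        return 0;
    }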



> 
> [..]
> 
> >> AFAIU the process suggested, Xen will inject small batch as long as the
> >> physical command queue is not full.
> > 
> >> Let's take a simple case, only a single domain is using vITS on the
> >> platform. If it injects a huge number of commands, Xen will split it
> >> with lots of small batch. All batch will be injected in the same pass as
> >> long as it fits in the physical command queue. Am I correct?
> > 
> > That's how it is currently written, yes. With the "possible
> > simplification" above the answer is no, only a batch at a time would be
> > written for each guest.
> > 
> > BTW, it doesn't have to be a single guest, the sum total of the
> > injections across all guests could also take a similar amount of time.
> > Is that a concern?
> 
> Yes, the example with only a guest was easier to explain.

So as well as limiting the number of commands in each domain's batch we
also want to limit the total number of batches?

> >> I think we have to restrict total number of batch (i.e for all the
> >> domain) injected in a same scheduling pass.
> >>
> >> I would even tend to allow only one in flight batch per domain. That
> >> would limit the possible problem I pointed out.
> > 
> > This is the "possible simplification" I think. Since it simplifies other
> > things (I think) as well as addressing this issue I think it might be a
> > good idea.
> 
> With the limitation of commands sent per batch, would the fairness you
> were talking about in the design doc still be required?

I think we still want to schedule the guests in a strict round robin
manner, to avoid one guest monopolising things.

> >>> Therefore it is proposed that the restriction that a single vITS maps
> >>> to one pITS be retained. If a guest requires access to devices
> >>> associated with multiple pITSs then multiple vITS should be
> >>> configured.
> >>
> >> Having multiple vITS per domain brings other issues:
> >>- How do you know the number of ITS to describe in the device tree at 
> >> boot?
> > 
> > I'm not sure. I don't think 1 vs N is very different from the question
> > of 0 vs 1 though, somehow the tools need to know about the pITS setup.
> 
> I don't see why the tools would require to know the pITS setup.

Even with only a single vits the tools need to know if the system has 0,
1, or more pits, to know whether to create a vits at all or not.

> >>- How do you tell the guest that the PCI device is mapped to a
> >> specific vITS?
> > 
> > Device Tree or IORT, just like on native and just like we'd have to tell
> > the guest about that mapping even if there was a single vITS.
> 
> Right, although the root controller can only be attached to one ITS.
> 
> It will be necessary to have multiple root 

Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-13 Thread Vijay Kilari
Hi Ian,

   Few thoughts..

On Tue, May 12, 2015 at 8:32 PM, Ian Campbell  wrote:
> On Tue, 2015-05-05 at 17:44 +0530, Vijay Kilari wrote:
>> Hi,
>>
>>As discussed, here is the design doc/txt.
>
> There seems to be no consideration of multiple guests or VCPUs all
> accessing one or more vITS in parallel and the associated issues around
> fairness etc.
>
> Overall I think there needs to be a stronger logical separation between
> the vITS emulation and the stuff which interacts with the pITS
> (scheduling, completion handling etc).
>
> I've written up my thinking as a design doc below (it's pandoc and the
> pdf version is also at
> http://xenbits.xen.org/people/ianc/vits/draftA.pdf FWIW).
>
> Corrections and comments welcome. There are several XXX's in it,
> representing open questions or things I wasn't sure about how to handle.
>
> This only really covers command queue virtualisation and not other
> aspects (I'm not sure if they need covering or not).
>
> Lets try and use this as a basis for discussion so we can correct and
> amend it to represent what the actual design will be
>
> Ian.
>
> % Xen on ARM vITS Handling
> % Ian Campbell 
> % Draft A
>
> # Introduction
>
> ARM systems containing a GIC version 3 or later may contain one or
> more ITS logical blocks. An ITS is used to route Message Signalled
> interrupts from devices into an LPI injection on the processor.
>
> The following summarises the ITS hardware design and serves as a set
> of assumptions for the vITS software design. (XXX it is entirely
> possible I've horribly misunderstood how this stuff fits
> together). For full details of the ITS see the "GIC Architecture
> Specification".
>
> Message signalled interrupts are translated into an LPI via a
> translation table which must be configured for each device which can
> generate an MSI. The ITS uses the device id of the originating device
> to look up the corresponding translation table. Device IDs are
> typically described via system firmware, e.g. the ACPI IORT table or
> via device tree.
>
> The ITS is configured and managed, including establishing a
> Translation Table for each device, via an in memory ring shared
> between the CPU and the ITS controller. The ring is managed via the
> `GITS_CBASER` register and indexed by `GITS_CWRITER` and `GITS_CREADR`
> registers.
>
> A processor adds commands to the shared ring and then updates
> `GITS_CWRITER` to make them visible to the ITS controller.
>
> The ITS controller processes commands from the ring and then updates
> `GITS_CREADR` to indicate to the processor that the command has been
> processed.
>
> Commands are processed sequentially.
>
> Commands sent on the ring include operational commands:
>
> * Routing interrupts to processors;
> * Generating interrupts;
> * Clearing the pending state of interrupts;
> * Synchronising the command queue
>
> and maintenance commands:
>
> * Map device/collection/processor;
> * Map virtual interrupt;
> * Clean interrupts;
> * Discard interrupts;
>
> The ITS provides no specific completion notification
> mechanism. Completion is monitored by a combination of a `SYNC`
> command and either polling `GITS_CREADR` or notification via an
> interrupt generated via the `INT` command.
>
> Note that the interrupt generation via `INT` requires an originating
> device ID to be supplied (which is then translated via the ITS into an
> LPI). No specific device ID is defined for this purpose and so the OS
> software is expected to fabricate one.
>
> Possible ways of inventing such a device ID are:
>
> * Enumerate all device ids in the system and pick another one;
> * Use a PCI BDF associated with a non-existent device function (such
>   as an unused one relating to the PCI root-bridge) and translate that
>   (via firmware tables) into a suitable device id;
> * ???
>
> # vITS
>
> A guest domain which is allowed to use ITS functionality (i.e. has
> been assigned pass-through devices which can generate MSIs) will be
> presented with a virtualised ITS.
>
> Accesses to the vITS registers will trap to Xen and be emulated and a
> virtualised Command Queue will be provided.
>
> Commands entered onto the virtual Command Queue will be translated
> into physical commands (this translation is described in the GIC
> specification).
>
> XXX there are other aspects to virtualising the ITS (LPI collection
> management, assignment of LPI ranges to guests). However these are not
> currently considered here. XXX Should they be/do they need to be?
>
> ## Requirements
>
> Emulation should not block in the hypervisor for extended periods. In
> particular Xen should not busy wait on the physical ITS. Doing so
> blocks the physical CPU from doing anything else (such as scheduling
> other VCPUs).
>
> There may be multiple guests which have a vITS, all targeting the same
> underlying pITS. A single guest VCPU should not be able to monopolise
> the pITS via its vITS and all guests should be able to make forward
> progress.
>
> ## Command Queue V

Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-13 Thread Julien Grall
Hi Ian,

On 13/05/15 14:23, Ian Campbell wrote:
> On Tue, 2015-05-12 at 18:35 +0100, Julien Grall wrote:
>>> On read from the virtual `CREADR` iff the vits_cq is such that
>>
>> s/iff/if/
> 
> "iff" is a shorthand for "if and only if". Apparently not as common as I
> think it is though!

Oh ok. I wasn't aware about this shorthand.

> 
>>
>>> commands are outstanding then a scheduling pass is attempted (in order
>>> to update `vits_cq.creadr`). The current value of `vitq_cq.creadr` is
>>> then returned.
>>>
>>> ### pITS Scheduling
>>
>> I'm not sure if the design document is the right place to talk about it.
>>
>> If a domain dies during the process, how would it affect the scheduler?
> 
> 
> So I think we have to wait for them to finish.
> 
> Vague thoughts:
> 
> We can't free a `vits_cq` while it has things on the physical control
> queue, and we cannot cancel things which are on the control queue.
> 
> So we must wait.
> 
> Obviously don't enqueue anything new onto the pits if
> `d->is_dying`.

Right.

> `domain_relinquish_resources()` waits (somehow, with suitable
> continuations etc) for anything which the `vits_cq` has
> outstanding to be completed so that the datastructures can be
> cleared.
> 
> ?

I think that would work.

> 
> I've added that to a new section "Domain Shutdown" right after
> scheduling.

Thanks.

> 
>>>   on that vits;
>>> * On receipt of an interrupt notification arising from Xen's own use
>>>   of `INT`; (see discussion under Completion)
>>> * On any interrupt injection arising from a guests use of the `INT`
>>>   command; (XXX perhaps, see discussion under Completion)
>>
>> With all the solutions suggested, it is very likely that we will try
>> to execute multiple scheduling passes at the same time.
>>
>> One way is to wait until the previous pass has finished. But that would
>> mean that the scheduler would be executed very often.
>>
>> Or maybe you plan to offload the scheduler in a softirq?
> 
> Good point.
> 
> A soft irq might be one solution, but it is problematic during emulation
> of `CREADR`, when we would like to do a pass immediately to complete any
> operations outstanding for the domain doing the read.
> 
> Or just using spin_try_lock and not bothering if one is already in
> progress might be another. But has similar problems.
> 
> Or we could defer only scheduling from `INT` (either guest or Xen's own)
> to a softirq but do ones from `CREADR` emulation synchronously? The
> softirq would be run on return from the interrupt handler but multiple
> such would be coalesced I think?

I think we could defer the scheduling to a softirq for CREADR too, if
the guest is using:
- INT completion: vits.creadr would have been correctly updated when
receiving the INT in Xen.
- polling completion: the guest will loop on CREADR. It will likely get
the info on the next read. The drawback is the guest may lose a few
instruction cycles.

Overall, I don't think it's necessary to have an accurate CREADR.

[..]

>> AFAIU the process suggested, Xen will inject small batch as long as the
>> physical command queue is not full.
> 
>> Let's take a simple case, only a single domain is using vITS on the
>> platform. If it injects a huge number of commands, Xen will split it
>> with lots of small batch. All batch will be injected in the same pass as
>> long as it fits in the physical command queue. Am I correct?
> 
> That's how it is currently written, yes. With the "possible
> simplification" above the answer is no, only a batch at a time would be
> written for each guest.
> 
> BTW, it doesn't have to be a single guest, the sum total of the
> injections across all guests could also take a similar amount of time.
> Is that a concern?

Yes, the example with only a guest was easier to explain.

>> I think we have to restrict total number of batch (i.e for all the
>> domain) injected in a same scheduling pass.
>>
>> I would even tend to allow only one in flight batch per domain. That
>> would limit the possible problem I pointed out.
> 
> This is the "possible simplification" I think. Since it simplifies other
> things (I think) as well as addressing this issue I think it might be a
> good idea.

With the limitation of commands sent per batch, would the fairness you
were talking about in the design doc still be required?

[..]

>>> This assumes that there is no particular benefit to keeping the
>>> `CWRITER` rolling ahead of the pITS's actual processing.
>>
>> I don't understand this assumption. CWRITER will always point to the
>> last command in the queue.
> 
> Correct, but that might be ahead of where the pITS has actually gotten
> to (which we cannot see).
> 
> What I am trying to say here is that there is no point in trying to
> eagerly complete things (by checking `CREADR`) such that we can write
> new things (and hence push `CWRITER` forward) just to keep ahead of 

Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-13 Thread Ian Campbell
On Tue, 2015-05-12 at 18:35 +0100, Julien Grall wrote:
> > Message signalled interrupts are translated into an LPI via a
> > translation table which must be configured for each device which can
> > generate an MSI. The ITS uses the device id of the originating device
> > to look up the corresponding translation table. Device IDs are
> > typically described via system firmware, e.g. the ACPI IORT table or
> > via device tree.
> > 
> > The ITS is configured and managed, including establishing a
> > Translation Table for each device, via an in memory ring shared
> 
> s/an in/a/?

Either is acceptable IMHO. "an (in memory) ring" is how you would parse
what I've written.

> > # vITS
> > 
> > A guest domain which is allowed to use ITS functionality (i.e. has
> > been assigned pass-through devices which can generate MSIs) will be
> > presented with a virtualised ITS.
> > 
> > Accesses to the vITS registers will trap to Xen and be emulated and a
> > virtualised Command Queue will be provided.
> > 
> > Commands entered onto the virtual Command Queue will be translated
> > into physical commands (this translation is described in the GIC
> > specification).
> > 
> > XXX there are other aspects to virtualising the ITS (LPI collection
> > management, assignment of LPI ranges to guests).
> 
> Another aspect to think is device management.

Added.

> > However these are not
> > currently considered here. XXX Should they be/do they need to be?
> 
> I think those aspects are straightforward and doesn't require any
> specific design docs. We could discuss about it during the
> implementation (number of LPIs supported, LPIs allocations...).

OK

> > On read from the virtual `CREADR` iff the vits_cq is such that
> 
> s/iff/if/

"iff" is a shorthand for "if and only if". Apparently not as common as I
think it is though!

> 
> > commands are outstanding then a scheduling pass is attempted (in order
> > to update `vits_cq.creadr`). The current value of `vitq_cq.creadr` is
> > then returned.
> > 
> > ### pITS Scheduling
> 
> I'm not sure if the design document is the right place to talk about it.
> 
> If a domain dies during the process, how would it affect the scheduler?


So I think we have to wait for them to finish.

Vague thoughts:

We can't free a `vits_cq` while it has things on the physical control
queue, and we cannot cancel things which are on the control queue.

So we must wait.

Obviously don't enqueue anything new onto the pits if
`d->is_dying`.

`domain_relinquish_resources()` waits (somehow, with suitable
continuations etc) for anything which the `vits_cq` has
outstanding to be completed so that the datastructures can be
cleared.

?

I've added that to a new section "Domain Shutdown" right after
scheduling.
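
Roughly, that section boils down to something like this (the vITS
helpers and d->arch.vits_list are hypothetical; returning -ERESTART to
be re-invoked follows the usual relinquish pattern):

    static int vits_relinquish_resources(struct domain *d)
    {
        struct vits_cq *vcq;

        list_for_each_entry ( vcq, &d->arch.vits_list, domain_entry )
        {
            /* d->is_dying already prevents queueing anything new. */
            if ( vits_cq_commands_in_flight(vcq) )
                return -ERESTART; /* Cannot cancel what is already on
                                     the physical queue; retry later. */
        }

        /* Nothing outstanding anywhere: safe to tear it all down. */
        vits_free_all(d);
        return 0;
    }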

> >   on that vits;
> > * On receipt of an interrupt notification arising from Xen's own use
> >   of `INT`; (see discussion under Completion)
> > * On any interrupt injection arising from a guests use of the `INT`
> >   command; (XXX perhaps, see discussion under Completion)
> 
> With all the solution suggested, it will be very likely that we will try
> to execute multiple the scheduling pass at the same time.
> 
> One way is to wait, until the previous pass as finished. But that would
> mean that the scheduler would be executed very often.
> 
> Or maybe you plan to offload the scheduler in a softirq?

Good point.

A soft irq might be one solution, but it is problematic during emulation
of `CREADR`, when we would like to do a pass immediately to complete any
operations outstanding for the domain doing the read.

Or just using spin_try_lock and not bothering if one is already in
progress might be another. But has similar problems.

Or we could defer only scheduling from `INT` (either guest or Xen's own)
to a softirq but do ones from `CREADR` emulation synchronously? The
softirq would be run on return from the interrupt handler but multiple
such would be coalesced I think?

I've not updated the doc (apart from a note to remember the issue) while
we think about this.

> 
> > Each scheduling pass will:
> > 
> > * Read the physical `CREADR`;
> > * For each command between `pits.last_creadr` and the new `CREADR`
> >   value process completion of that command and update the
> >   corresponding `vits_cq.creadr`.
> > * Attempt to refill the pITS Command Queue (see below).
> > 
> > ### Filling the pITS Command Queue.
> > 
> > Various algorithms could be used here. For now a simple proposal is
> > to traverse the `pits.schedule_list` starting from where the last
> > refill finished (i.e. not from the top of the list each time).
> > 
> > If a `vits_cq` has no pending commands then it is removed from the
> > list.
> > 
> > If a `vits_cq` has some pending commands then `min(pits-free-slots,
> > vits-outstanding, VITS_BATCH_SIZE)` will be taken from the vITS
> > command queue, translated and placed onto 

Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-12 Thread Julien Grall
Hi Ian,

On 12/05/15 16:02, Ian Campbell wrote:
> On Tue, 2015-05-05 at 17:44 +0530, Vijay Kilari wrote:
>> Hi,
>>
>>As discussed, here is the design doc/txt.
> 
> There seems to be no consideration of multiple guests or VCPUs all
> accessing one or more vITS in parallel and the associated issues around
> fairness etc.
> 
> Overall I think there needs to be a stronger logical separation between
> the vITS emulation and the stuff which interacts with the pITS
> (scheduling, completion handling etc).
> 
> I've written up my thinking as a design doc below (it's pandoc and the
> pdf version is also at
> http://xenbits.xen.org/people/ianc/vits/draftA.pdf FWIW).

Thank you for write the doc.

> 
> Corrections and comments welcome. There are several XXX's in it,
> representing open questions or things I wasn't sure about how to handle.
> 
> This only really covers command queue virtualisation and not other
> aspects (I'm not sure if they need covering or not).
> 
> Lets try and use this as a basis for discussion so we can correct and
> amend it to represent what the actual design will be
> 
> Ian.
> 
> % Xen on ARM vITS Handling
> % Ian Campbell 
> % Draft A
> 
> # Introduction
> 
> ARM systems containing a GIC version 3 or later may contain one or
> more ITS logical blocks. An ITS is used to route Message Signalled
> interrupts from devices into an LPI injection on the processor.
> 
> The following summarises the ITS hardware design and serves as a set
> of assumptions for the vITS software design. (XXX it is entirely
> possible I've horribly misunderstood how this stuff fits
> together). For full details of the ITS see the "GIC Architecture
> Specification".

The summarise of the ITS hardware design looks good to me.

> Message signalled interrupts are translated into an LPI via a
> translation table which must be configured for each device which can
> generate an MSI. The ITS uses the device id of the originating device
> to look up the corresponding translation table. Device IDs are
> typically described via system firmware, e.g. the ACPI IORT table or
> via device tree.
> 
> The ITS is configured and managed, including establishing a
> Translation Table for each device, via an in memory ring shared

s/an in/a/?

> between the CPU and the ITS controller. The ring is managed via the
> `GITS_CBASER` register and indexed by `GITS_CWRITER` and `GITS_CREADR`
> registers.
> 
> A processor adds commands to the shared ring and then updates
> `GITS_CWRITER` to make them visible to the ITS controller.
> 
> The ITS controller processes commands from the ring and then updates
> `GITS_CREADR` to indicate to the processor that the command has been
> processed.
> 
> Commands are processed sequentially.
> 
> Commands sent on the ring include operational commands:
> 
> * Routing interrupts to processors;
> * Generating interrupts;
> * Clearing the pending state of interrupts;
> * Synchronising the command queue
> 
> and maintenance commands:
> 
> * Map device/collection/processor;
> * Map virtual interrupt;
> * Clean interrupts;
> * Discard interrupts;
> 
> The ITS provides no specific completion notification
> mechanism. Completion is monitored by a combination of a `SYNC`
> command and either polling `GITS_CREADR` or notification via an
> interrupt generated via the `INT` command.
> 
> Note that the interrupt generation via `INT` requires an originating
> device ID to be supplied (which is then translated via the ITS into an
> LPI). No specific device ID is defined for this purpose and so the OS
> software is expected to fabricate one.
> 
> Possible ways of inventing such a device ID are:
> 
> * Enumerate all device ids in the system and pick another one;
> * Use a PCI BDF associated with a non-existent device function (such
>   as an unused one relating to the PCI root-bridge) and translate that
>   (via firmware tables) into a suitable device id;
> * ???

I don't have any other ideas in mind.

> # vITS
> 
> A guest domain which is allowed to use ITS functionality (i.e. has
> been assigned pass-through devices which can generate MSIs) will be
> presented with a virtualised ITS.
> 
> Accesses to the vITS registers will trap to Xen and be emulated and a
> virtualised Command Queue will be provided.
> 
> Commands entered onto the virtual Command Queue will be translated
> into physical commands (this translation is described in the GIC
> specification).
> 
> XXX there are other aspects to virtualising the ITS (LPI collection
> management, assignment of LPI ranges to guests).

Another aspect to think is device management.

> However these are not
> currently considered here. XXX Should they be/do they need to be?

I think those aspects are straightforward and doesn't require any
specific design docs. We could discuss about it during the
implementation (number of LPIs supported, LPIs allocations...).

> 
> ## Requirements
> 
> Emulation should not block in the hypervisor for extended periods. In
> particular Xen should not

Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-12 Thread Ian Campbell
On Tue, 2015-05-05 at 17:44 +0530, Vijay Kilari wrote:
> Hi,
> 
>As discussed, here is the design doc/txt.

There seems to be no consideration of multiple guests or VCPUs all
accessing one or more vITS in parallel and the associated issues around
fairness etc.

Overall I think there needs to be a stronger logical separation between
the vITS emulation and the stuff which interacts with the pITS
(scheduling, completion handling etc).

I've written up my thinking as a design doc below (it's pandoc and the
pdf version is also at
http://xenbits.xen.org/people/ianc/vits/draftA.pdf FWIW).

Corrections and comments welcome. There are several XXX's in it,
representing open questions or things I wasn't sure about how to handle.

This only really covers command queue virtualisation and not other
aspects (I'm not sure if they need covering or not).

Lets try and use this as a basis for discussion so we can correct and
amend it to represent what the actual design will be

Ian.

% Xen on ARM vITS Handling
% Ian Campbell 
% Draft A

# Introduction

ARM systems containing a GIC version 3 or later may contain one or
more ITS logical blocks. An ITS is used to route Message Signalled
interrupts from devices into an LPI injection on the processor.

The following summarises the ITS hardware design and serves as a set
of assumptions for the vITS software design. (XXX it is entirely
possible I've horribly misunderstood how this stuff fits
together). For full details of the ITS see the "GIC Architecture
Specification".

Message signalled interrupts are translated into an LPI via a
translation table which must be configured for each device which can
generate an MSI. The ITS uses the device id of the originating device
to look up the corresponding translation table. Device IDs are
typically described via system firmware, e.g. the ACPI IORT table or
via device tree.

The ITS is configured and managed, including establishing a
Translation Table for each device, via an in memory ring shared
between the CPU and the ITS controller. The ring is managed via the
`GITS_CBASER` register and indexed by `GITS_CWRITER` and `GITS_CREADR`
registers.

A processor adds commands to the shared ring and then updates
`GITS_CWRITER` to make them visible to the ITS controller.

The ITS controller processes commands from the ring and then updates
`GITS_CREADR` to indicate to the processor that the command has been
processed.

Commands are processed sequentially.
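
Since each command is 32 bytes, the ring arithmetic is simple; as an
illustration (names made up), with `CWRITER`/`CREADR` holding byte
offsets into the queue:

    #define ITS_CMD_SIZE 32u   /* Architected size of one ITS command */

    /* One slot is kept unused so that CWRITER == CREADR unambiguously
     * means "empty" rather than "full". */
    static inline uint32_t its_cq_free_bytes(uint32_t cwriter,
                                             uint32_t creadr,
                                             uint32_t qsize)
    {
        return (creadr - cwriter - ITS_CMD_SIZE) % qsize;
    }

    /* e.g. a 64KB queue has 64KB / 32B = 2048 slots (2047 usable). */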

Commands sent on the ring include operational commands:

* Routing interrupts to processors;
* Generating interrupts;
* Clearing the pending state of interrupts;
* Synchronising the command queue

and maintenance commands:

* Map device/collection/processor;
* Map virtual interrupt;
* Clean interrupts;
* Discard interrupts;

The ITS provides no specific completion notification
mechanism. Completion is monitored by a combination of a `SYNC`
command and either polling `GITS_CREADR` or notification via an
interrupt generated via the `INT` command.

Note that the interrupt generation via `INT` requires an originating
device ID to be supplied (which is then translated via the ITS into an
LPI). No specific device ID is defined for this purpose and so the OS
software is expected to fabricate one.

Possible ways of inventing such a device ID are:

* Enumerate all device ids in the system and pick another one;
* Use a PCI BDF associated with a non-existent device function (such
  as an unused one relating to the PCI root-bridge) and translate that
  (via firmware tables) into a suitable device id;
* ???

# vITS

A guest domain which is allowed to use ITS functionality (i.e. has
been assigned pass-through devices which can generate MSIs) will be
presented with a virtualised ITS.

Accesses to the vITS registers will trap to Xen and be emulated and a
virtualised Command Queue will be provided.

Commands entered onto the virtual Command Queue will be translated
into physical commands (this translation is described in the GIC
specification).

XXX there are other aspects to virtualising the ITS (LPI collection
management, assignment of LPI ranges to guests). However these are not
currently considered here. XXX Should they be/do they need to be?

## Requirements

Emulation should not block in the hypervisor for extended periods. In
particular Xen should not busy wait on the physical ITS. Doing so
blocks the physical CPU from doing anything else (such as scheduling
other VCPUs).

There may be multiple guests which have a vITS, all targeting the same
underlying pITS. A single guest VCPU should not be able to monopolise
the pITS via its vITS and all guests should be able to make forward
progress.

## Command Queue Virtualisation

The command queue of each vITS is represented by a data structure:

struct vits_cq {
list_head schedule_list; /* Queued onto pits.schedule_list */
uint32_t creadr; /* Virtual creadr */
uint32_t cwriter;/* Virtual cwriter */
uint32_t progress; /* Next virtual queue entry to translate */
};

Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-05 Thread Julien Grall
On 05/05/15 17:09, Vijay Kilari wrote:
> On Tue, May 5, 2015 at 7:39 PM, Julien Grall  wrote:
>> On 05/05/15 13:14, Vijay Kilari wrote:
>>> Proposal 2:
>>> 
>>> Here when guest writes command to vITS queue and updates CWRITER registers,
>>> it is trapped in XEN and below steps are followed to process ITS command
>>>
>>> - Dom0 creates an ITS completion device with device id (00:00.1) and reserves
>>>   n number (256 or so) irqs (LPIs) for this device.
>>> - One irq/LPI (called completion_irq) of this completion device is
>>> allocated per domain
>>> - With this irq/LPI descriptor we can identify the domain/vITS.
>>> - Info of all the ongoing ITS requests(put in pITS Queue) of this domain is
>>>   stored in ITS command status array (called its_requests). This is
>>> managed per vITS.
>>>
>>> 1) Trap of CWRITER write by guest
>>> 2) Take vITS lock
>>> 3) Read all the commands written by guest, translate it
>>> - If one of the guest command is INT command
>>
>> Why do you need a specific handling for the guest INT command?
> 
>   If the guest driver is using the interrupt mechanism instead of polling,
> then an INT command is passed by the guest. To make sure that CREADR is
> updated before the guest's INT command raises an interrupt to the guest, Xen
> has to insert its completion interrupt and update CREADR

Hmmm I see what you mean now. Although, if I understand correctly, Xen
would receive two interrupts: one for the completion, and the other for
the guest.

It would be better if we avoid the first by re-using the INT command
from the guest. If it's not too difficult, of course.
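
As a sketch of that combined scheme (helpers hypothetical): reuse a
trailing guest INT as the notification where possible, otherwise
append Xen's own INT targeting the per-domain completion_irq; either
way the completion handler publishes CREADR first:

    static void vits_post_batch(struct vits_cq *vcq, struct its_cmd *cmds,
                                unsigned int n)
    {
        if ( !vits_batch_ends_with_int(cmds, n) )
            cmds[n++] = its_build_int(COMPLETION_DEVID,
                                      vcq->completion_irq);

        pits_write_cmds(cmds, n);
        vcq->batch_in_flight = true;
    }

    static void vits_completion_irq_handler(int irq, void *data)
    {
        struct vits_cq *vcq = data;  /* LPI descriptor -> domain/vITS */

        vcq->creadr = vcq->pending_creadr; /* Publish progress...     */
        vcq->batch_in_flight = false;
        vits_post_next_batch(vcq);         /* ...then kick the next.  */
    }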

>>>a) Append INT command with completion_irq and write this batch as
>>>   separate request and goto (3) to process next commands
>>> - If more than 'n' commands are sent by guest, start a timer to process
>>>   remaining commands
>>
>> Hmmm... How are you sure the time for the timer would be enough?
>>
>    Not thought about how much time. Maybe the number of pending
>    commands in the physical queue might give some heuristic for the timer value.

I'm wondering if a tasklet would be better here.

>>> 4) Append INT command with completion_irq of current domain
>>> 5) Release vITS lock
>>> 6) Take physical ITS (pITS) lock
>>> 7) Write translated cmds to physical ITS
>>> 8) Add entry in its_requests
>>
>> You don't explain what its_requests is.
>>
>>> 9) Release pITS lock
>>> 10) return from trap
>>>
>>> On receiving completion interrupt:
>>>
>>> 1) Take the first pending request from its_requests.
>>
>> I'm assuming that you have some kind of array/list to store the pending
>> request? I think this would be more difficult to manage than only
>> supporting one batch per domain at any time.
> 
>   Yes, if only one batch per domain is processed at a time,
> then the array could store only one entry. I will tune it when I implement it.

You won't need an array in this case...

>>> 2) Update vITS CREADR of the guest indicating completion of command to
>>> guest
>>>
>>> Cons:
>>>- Has overhead of processing completion interrupt.
>>>- Need to reserve a fake device to generate completion interrupt and
>>>  reserve one LPI per-domain
>>>
>>> Pros:
>>>- VCPU does not poll in Xen for completion of commands.
>>>- Handles the guest flooding the command queue with commands. But needs
>>>  a timer
>>>
>>> Handling Command queue state:
>>>  - Physical Queue cannot be full as it is 64KB, thereby it can
>>> accommodate 2K ITS commands.
>>
>> I don't understand this sentence. Why do you think the physical queue
>> cannot be full?
> 
>   I mean that it is unlikely that the physical ITS command Q would be full
> because of its 64KB size. If at all it is full, then the below action is taken

Oh ok. I thought you were saying it's not possible :).

> 
>>
>>>In case it is full, VCPU has to poll with timeout till the physical
>>> Queue is empty before it posts the next command
>>>  - The vITS Queue condition should be managed by the guest ITS driver.
>>
>> Same here.
> 
> The vITS Queue is under guest control. If Xen is processing commands slowly
> and the guest sees its queue is full, then the guest driver will handle it.

This paragraph is easier to understand, thanks.

Regards,

-- 
Julien Grall



Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-05 Thread Vijay Kilari
On Tue, May 5, 2015 at 7:39 PM, Julien Grall  wrote:
> On 05/05/15 13:14, Vijay Kilari wrote:
>> Hi,
>>
>
> Hi Vijay,
>
>>As discussed, here is the design doc/txt.
>
> I will comment on proposal 2 as it seems to be the preferred one,
> assuming you are able to find why it's slow.
>
>> Proposal 2:
>> 
>> Here, when the guest writes a command to the vITS queue and updates the
>> CWRITER register, it is trapped in Xen and the below steps are followed
>> to process the ITS command:
>>
>> - Dom0 creates an ITS completion device with device id (00:00.1) and
>>   reserves n (256 or so) IRQs (LPIs) for this device.
>> - One IRQ/LPI (called completion_irq) of this completion device is
>>   allocated per domain.
>> - With this IRQ/LPI descriptor we can identify the domain/vITS.
>> - Info on all the ongoing ITS requests (put in the pITS Queue) of this
>>   domain is stored in an ITS command status array (called its_requests).
>>   This is managed per vITS.
>>
>> 1) Trap of CWRITER write by guest
>> 2) Take vITS lock
>> 3) Read all the commands written by guest, translate it
>> - If one of the guest command is INT command
>
> Why do you need a specific handling for the guest INT command?

  If the guest driver is using the interrupt mechanism instead of polling,
then an INT command is passed by the guest. To make sure that CREADR is
updated before the guest's INT command raises an interrupt to the guest,
Xen has to insert a completion interrupt and update CREADR.

>>a) Append INT command with completion_irq and write this batch as
>>   separate request and goto (3) to process next commands
>> - If more than 'n' commands are sent by guest, start a timer to process
>>   remaining commands
>
> Hmmm... How are you sure the time for the timer would be enough?
>
   Not thought of how much time. Maybe the number of pending
   commands in the physical queue might give some heuristic on the timer value.

>> 4) Append INT command with completion_irq of current domain
>> 5) Release vITS lock
>> 6) Take physical ITS (pITS) lock
>> 7) Write translated cmds to physical ITS
>> 8) Add entry in its_requests
>
> You don't explain what its_requests is.
>
>> 9) Release pITS lock
>> 10) return from trap
>>
>> On receiving completion interrupt:
>>
>> 1) Take the first pending request from its_requests.
>
> I'm assuming that you have some kind of array/list to store the pending
> request? I think this would be more difficult to manage than only
> supporting one batch per domain at any time.

  Yes, if only one batch per domain is processed at a time,
then the array could store only one entry. I will tune it when I implement it.

>> 2) Update vITS CREADR of the guest indicating completion of command to guest
>>
>> Cons:
>>- Has overhead of processing completion interrupt.
>>- Need to reserve a fake device to generate completion interrupt and
>>  reserve one LPI per-domain
>>
>> Pros:
>>- VCPU does not poll in Xen for completion of commands.
>>- Handles the guest flooding the command queue with commands. But needs
>>  a timer
>>
>> Handling Command queue state:
>>  - Physical Queue cannot be full as it is 64KB, thereby it can
>> accommodate 2K ITS commands.
>
> I don't understand this sentence. Why do you think the physical queue
> cannot be full?

  I mean that it is unlikely that the physical ITS command Q would be full
because of its 64KB size. If at all it is full, then the below action is taken

>
>>In case it is full, VCPU has to poll with timeout till the physical
>> Queue is empty before it posts the next command
>>  - The vITS Queue condition should be managed by the guest ITS driver.
>
> Same here.

The vITS Queue is under guest control. If Xen is processing commands slowly
and the guest sees its queue is full, then the guest driver will handle it.

>
>> Behaviour of Polling and completion interrupt based guest driver:
>>  - If completion interrupt (INT) is used by guest driver, then insert Xen
>> completion INT command so that CREADR is updated before guest's INT
>> command is injected
>>  - If polling mode is used, trap on CREADR checks for completion of command
>>
>
> Regards,
>
> --
> Julien Grall



Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-05 Thread Vijay Kilari
On Tue, May 5, 2015 at 7:21 PM, Stefano Stabellini wrote:
> On Tue, 5 May 2015, Vijay Kilari wrote:
>> Proposal 2:
>> 
>> Here, when the guest writes a command to the vITS queue and updates the
>> CWRITER register, it is trapped in Xen and the below steps are followed
>> to process the ITS command:
>>
>> - Dom0 creates an ITS completion device with device id (00:00.1) and
>>   reserves n (256 or so) IRQs (LPIs) for this device.
>> - One IRQ/LPI (called completion_irq) of this completion device is
>>   allocated per domain.
>
> Good. Is it possible to actually assign an LPI to a domain when/if a PCI
> device is assigned to the domain? So that we don't waste LPIs for
> domains that are not going to use the vITS?

Yes, we can. On receiving the first MAPD command we can allocate the LPI.
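
Roughly, as a sketch (the names below are illustrative, not existing code):

    /* Allocate the per-domain completion LPI lazily, on the guest's
     * first MAPD command, so idle domains don't consume LPIs. */
    static int vits_get_completion_irq(struct vits_cq *vcq)
    {
        if ( !vcq->completion_irq )
            /* assumed helper: reserve one LPI of Dom0's completion device */
            vcq->completion_irq = completion_dev_alloc_lpi();

        return vcq->completion_irq;  /* 0 if allocation failed */
    }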

>
>
>> - With this IRQ/LPI descriptor we can identify the domain/vITS.
>> - Info on all the ongoing ITS requests (put in the pITS Queue) of this
>>   domain is stored in an ITS command status array (called its_requests).
>>   This is managed per vITS.
>>
>> 1) Trap of CWRITER write by guest
>> 2) Take vITS lock
>> 3) Read all the commands written by guest, translate it
>> - If one of the guest command is INT command
>>a) Append INT command with completion_irq and write this batch as
>>   separate request and goto (3) to process next commands
>> - If more than 'n' commands are sent by guest, start a timer to process
>>   remaining commands
>> 4) Append INT command with completion_irq of current domain
>
> I would consider adding a vcpu_block call
>
>
>> 5) Release vITS lock
>> 6) Take physical ITS (pITS) lock
>> 7) Write translated cmds to physical ITS
>> 8) Add entry in its_requests
>> 9) Release pITS lock
>> 10) return from trap
>>
>> On receiving completion interrupt:
>>
>> 1) Take the first pending request from its_requests.
>> 2) Update vITS CREADR of the guest indicating completion of command to guest
>
> I would add vcpu_unblock
>
>
>> Cons:
>>- Has overhead of processing completion interrupt.
>>- Need to reserve a fake device to generate completion interrupt and
>>  reserve one LPI per-domain
>>
>> Pros:
>>- VCPU does not poll in Xen for completion of commands.
>>- Handles the guest flooding the command queue with commands. But needs
>>  a timer
>>
>> Handling Command queue state:
>>  - Physical Queue cannot be full as it is 64KB, thereby it can
>> accommodate 2K ITS commands.
>>In case it is full, VCPU has to poll with timeout till the physical
>> Queue is empty before it posts the next command
>>  - The vITS Queue condition should be managed by the guest ITS driver.
>>
>> Behaviour of Polling and completion interrupt based guest driver:
>>  - If completion interrupt (INT) is used by guest driver, then insert Xen
>> completion INT command so that CREADR is updated before guest's INT
>> command is injected
>>  - If polling mode is used, trap on CREADR checks for completion of command



Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-05 Thread Julien Grall
On 05/05/15 13:14, Vijay Kilari wrote:
> Hi,
> 

Hi Vijay,

>As discussed, here is the design doc/txt.

I will comment on proposal 2 as it seems to be the preferred one,
assuming you are able to find why it's slow.

> Proposal 2:
> 
> Here, when the guest writes a command to the vITS queue and updates the
> CWRITER register, it is trapped in Xen and the below steps are followed
> to process the ITS command:
> 
> - Dom0 creates an ITS completion device with device id (00:00.1) and
>   reserves n (256 or so) IRQs (LPIs) for this device.
> - One IRQ/LPI (called completion_irq) of this completion device is
>   allocated per domain.
> - With this IRQ/LPI descriptor we can identify the domain/vITS.
> - Info on all the ongoing ITS requests (put in the pITS Queue) of this
>   domain is stored in an ITS command status array (called its_requests).
>   This is managed per vITS.
> 
> 1) Trap of CWRITER write by guest
> 2) Take vITS lock
> 3) Read all the commands written by guest, translate it
> - If one of the guest command is INT command

Why do you need a specific handling for the guest INT command?

>a) Append INT command with completion_irq and write this batch as
>   separate request and goto (3) to process next commands
> - If more than 'n' commands are sent by guest, start a timer to process
>   remaining commands

Hmmm... How are you sure the time for the timer would be enough?

> 4) Append INT command with completion_irq of current domain
> 5) Release vITS lock
> 6) Take physical ITS (pITS) lock
> 7) Write translated cmds to physical ITS
> 8) Add entry in its_requests

You don't explain what its_requests is.

> 9) Release pITS lock
> 10) return from trap
> 
> On receiving completion interrupt:
> 
> 1) Take the first pending request from its_requests.

I'm assuming that you have some kind of array/list to store the pending
request? I think this would be more difficult to manage than only
supporting one batch per domain at any time.

> 2) Update vITS CREADR of the guest indicating completion of command to guest
> 
> Cons:
>- Has overhead of processing completion interrupt.
>- Need to reserve a fake device to generate completion interrupt and
>  reserve one LPI per-domain
> 
> Pros:
>- VCPU does not poll in Xen for completion of commands.
>- Handles the guest flooding the command queue with commands. But needs
>  a timer
> 
> Handling Command queue state:
>  - Physical Queue cannot be full as it is 64KB, thereby it can
> accommodate 2K ITS commands.

I don't understand this sentence. Why do you think the physical queue
cannot be full?

>In case it is full, VCPU has to poll with timeout till the physical
> Queue is empty before it posts the next command
>  - The vITS Queue condition should be managed by the guest ITS driver.

Same here.

> Behaviour of Polling and completion interrupt based guest driver:
>  - If completion interrupt (INT) is used by guest driver, then insert Xen
> completion INT command so that CREADR is updated before guest's INT
> command is injected
>  - If polling mode is used, trap on CREADR checks for completion of command
> 

Regards,

-- 
Julien Grall



Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-05 Thread Julien Grall
On 05/05/15 14:51, Stefano Stabellini wrote:
>> - With this IRQ/LPI descriptor we can identify the domain/vITS.
>> - Info on all the ongoing ITS requests (put in the pITS Queue) of this
>>   domain is stored in an ITS command status array (called its_requests).
>>   This is managed per vITS.
>>
>> 1) Trap of CWRITER write by guest
>> 2) Take vITS lock
>> 3) Read all the commands written by guest, translate it
>> - If one of the guest command is INT command
>>a) Append INT command with completion_irq and write this batch as
>>   separate request and goto (3) to process next commands
>> - If more than 'n' commands are sent by guest, start a timer to process
>>   remaining commands
>> 4) Append INT command with completion_irq of current domain
> 
> I would consider adding a vcpu_block call

I don't think the vcpu_block would improve performance here.

Regards,

-- 
Julien Grall



Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-05 Thread Stefano Stabellini
On Tue, 5 May 2015, Vijay Kilari wrote:
> Proposal 2:
> 
> Here, when the guest writes a command to the vITS queue and updates the
> CWRITER register, it is trapped in Xen and the below steps are followed
> to process the ITS command:
> 
> - Dom0 creates an ITS completion device with device id (00:00.1) and
>   reserves n (256 or so) IRQs (LPIs) for this device.
> - One IRQ/LPI (called completion_irq) of this completion device is
>   allocated per domain.

Good. Is it possible to actually assign an LPI to a domain when/if a PCI
device is assigned to the domain? So that we don't waste LPIs for
domains that are not going to use the vITS?


> - With this IRQ/LPI descriptor we can identify the domain/vITS.
> - Info on all the ongoing ITS requests (put in the pITS Queue) of this
>   domain is stored in an ITS command status array (called its_requests).
>   This is managed per vITS.
> 
> 1) Trap of CWRITER write by guest
> 2) Take vITS lock
> 3) Read all the commands written by guest, translate it
> - If one of the guest command is INT command
>a) Append INT command with completion_irq and write this batch as
>   separate request and goto (3) to process next commands
> - If more than 'n' commands are sent by guest, start a timer to process
>   remaining commands
> 4) Append INT command with completion_irq of current domain

I would consider adding a vcpu_block call


> 5) Release vITS lock
> 6) Take physical ITS (pITS) lock
> 7) Write translated cmds to physical ITS
> 8) Add entry in its_requests
> 9) Release pITS lock
> 10) return from trap
>
> On receiving completion interrupt:
> 
> 1) Take the first pending request from its_requests.
> 2) Update vITS CREADR of the guest indicating completion of command to guest

I would add vcpu_unblock
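
That is, something like this sketch (vcpu_block/vcpu_unblock are the
existing Xen primitives; vcq->waiter is an assumed field recording the
vCPU that posted the batch):

    /* End of the CWRITER trap: sleep instead of letting the vCPU spin. */
    static void vits_wait_for_completion(struct vits_cq *vcq)
    {
        vcq->waiter = current;   /* remember which vCPU to wake */
        vcpu_block();            /* sleeps until vcpu_unblock() */
    }

    /* Completion interrupt handler, after updating the virtual CREADR: */
    static void vits_complete_batch(struct vits_cq *vcq)
    {
        vcpu_unblock(vcq->waiter);
    }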


> Cons:
>- Has overhead of processing completion interrupt.
>- Need to reserve a fake device to generate completion interrupt and
>  reserve one LPI per-domain
> 
> Pros:
>- VCPU does not poll in Xen for completion of commands.
>- Handles the guest flooding the command queue with commands. But needs
>  a timer
> 
> Handling Command queue state:
>  - Physical Queue cannot be full as it is 64KB, thereby it can
> accommodate 2K ITS commands.
>In case it is full, VCPU has to poll with timeout till the physical
> Queue is empty before it posts the next command
>  - The vITS Queue condition should be managed by the guest ITS driver.
> 
> Behaviour of Polling and completion interrupt based guest driver:
>  - If completion interrupt (INT) is used by guest driver, then insert Xen
> completion INT command so that CREADR is updated before guest's INT
> command is injected
>  - If polling mode is used, trap on CREADR checks for completion of command



[Xen-devel] Xen/arm: Virtual ITS command queue handling

2015-05-05 Thread Vijay Kilari
Hi,

   As discussed, here is the design doc/txt.

ARM GICv3 provides the ITS (Interrupt Translation Service) feature to handle
MSI-X interrupts. Below are various mechanisms to handle ITS commands.

ITS command completion detection mechanism:
--
1) Append an INT command to receive an interrupt from the ITS hardware after
completion of the ITS command
2) Poll the ITS command Queue by reading the CREADR register

The ITS driver running in the guest can follow either one or both of these
approaches to detect command completion.

Assumptions:

1) Each VM will have one Virtual ITS (vITS)
2) VM is trapped on CWRITER write.
3) ITS commands should be processed in order of occurrence.
   Though we release the vITS lock before we put commands in the physical
   ITS queue, there will not be any other VCPU that can trap and post
   another ITS command, because the current VCPU is trapped on CWRITER,
   so another VCPU of the same domain cannot trap on a CWRITER update.
   If this assumption is not valid, then the vITS lock should be held
   until the command is posted to the physical ITS.

Below are the proposed methods to emulate ITS commands in Xen.

Proposal 1:

Here, when the guest writes a command to the vITS queue and updates the
CWRITER register, it is trapped in Xen and the below steps are followed to
process the ITS command:

1) Trap of CWRITER write by guest
2) Take vITS lock
3) Read command written by guest and translate it.
4) Release vITS lock
5) Take physical ITS (pITS) lock
6) Write CMD to physical ITS
7) Release pITS lock
8) Poll physical CREADR for completion of command.
9) Update vITS CREADR of the guest
10) If next command is available, goto step (2),
    else return from trap
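
As an illustration, a minimal sketch of this synchronous flow (all names,
the vits_cq fields, its_cmd_t, and the pits helpers, are assumptions for
exposition, not existing Xen code):

    #define ITS_CMD_SIZE 32  /* each GICv3 ITS command is 32 bytes */

    /* Proposal 1: synchronous CWRITER trap handling. */
    static int vits_cwriter_trap(struct vits_cq *vcq, struct pits *pits,
                                 uint32_t new_cwriter)
    {
        its_cmd_t cmd;

        vcq->cwriter = new_cwriter;            /* step 1 */
        while ( vcq->creadr != vcq->cwriter )  /* step 10 */
        {
            spin_lock(&vcq->lock);             /* step 2 */
            vits_read_translate(vcq, &cmd);    /* step 3 */
            spin_unlock(&vcq->lock);           /* step 4 */

            spin_lock(&pits->lock);            /* step 5 */
            pits_post(pits, &cmd);             /* step 6 */
            spin_unlock(&pits->lock);          /* step 7 */

            if ( pits_poll_creadr(pits) )      /* step 8: busy wait */
                return -ETIMEDOUT;

            vcq->creadr += ITS_CMD_SIZE;       /* step 9 */
            if ( vcq->creadr == vcq->size )
                vcq->creadr = 0;               /* the queue wraps */
        }
        return 0;                              /* return from trap */
    }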

Cons:
   - VCPU loops in Xen until all commands written by the guest are completed.
   - All the ITS commands written by the guest are translated and processed
 before the VCPU returns from the trap.
   - If the guest floods with ITS commands, the VCPU keeps posting commands
 continuously.

Pros:
   - Only one set of ITS commands sent by one VCPU per domain is
 processed at a time

Handling Command queue state:
 - vITS Queue cannot be full as the VCPU returns only on completion of the
   ITS command.
 - Physical Queue cannot be full as it is 64KB, thereby it can accommodate
   2K ITS commands (each ITS command is 32 bytes, so 64KB holds 2048).
   If the physical Queue is full, then the VCPU will poll looking for an
   empty physical Queue; on timeout, return an error.

Behaviour of Polling and completion interrupt based guest driver:
 - If completion interrupt (INT) is used by the guest driver, then the guest
   driver will always see an updated CREADR, as commands are completed as
   they are written to the Queue.
 - If polling mode is used, a trap on CREADR checks for completion of command.

Proposal 2:

Here, when the guest writes a command to the vITS queue and updates the
CWRITER register, it is trapped in Xen and the below steps are followed to
process the ITS command:

- Dom0 creates an ITS completion device with device id (00:00.1) and
  reserves n (256 or so) IRQs (LPIs) for this device.
- One IRQ/LPI (called completion_irq) of this completion device is
  allocated per domain.
- With this IRQ/LPI descriptor we can identify the domain/vITS.
- Info on all the ongoing ITS requests (put in the pITS Queue) of this
  domain is stored in an ITS command status array (called its_requests).
  This is managed per vITS.

1) Trap of CWRITER write by guest
2) Take vITS lock
3) Read all the commands written by guest and translate them
- If one of the guest commands is an INT command
   a) Append an INT command with completion_irq, write this batch as a
  separate request, and goto (3) to process next commands
- If more than 'n' commands are sent by guest, start a timer to process
  remaining commands
4) Append INT command with completion_irq of current domain
5) Release vITS lock
6) Take physical ITS (pITS) lock
7) Write translated cmds to physical ITS
8) Add entry in its_requests
9) Release pITS lock
10) return from trap

On receiving completion interrupt:

1) Take the first pending request from its_requests.
2) Update vITS CREADR of the guest indicating completion of command to guest
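
For illustration, a sketch of this batched flow (the struct fields and
helpers are assumptions, and a single outstanding batch per vITS is
assumed, as discussed elsewhere in this thread):

    /* Proposal 2: the CWRITER trap posts a batch terminated by a Xen INT
     * command; the completion LPI handler publishes the new CREADR. */
    static void vits_cwriter_trap(struct vits_cq *vcq, struct pits *pits,
                                  uint32_t new_cwriter)
    {
        spin_lock(&vcq->lock);                      /* step 2 */
        vcq->cwriter = new_cwriter;
        vits_translate_batch(vcq);                  /* step 3 */
        vits_append_int(vcq, vcq->completion_irq);  /* step 4 */
        spin_unlock(&vcq->lock);                    /* step 5 */

        spin_lock(&pits->lock);                     /* step 6 */
        pits_post_batch(pits, vcq);                 /* step 7 */
        vcq->pending = vcq->cwriter;                /* step 8 */
        spin_unlock(&pits->lock);                   /* step 9 */
    }                                               /* step 10: return */

    /* Handler for this domain's completion LPI. */
    static void vits_completion_irq_fn(struct vits_cq *vcq)
    {
        vcq->creadr = vcq->pending;  /* steps 1-2: publish completion */
    }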

Cons:
   - Has overhead of processing completion interrupt.
   - Need to reserve a fake device to generate completion interrupt and
 reserve one LPI per-domain

Pros:
   - VCPU does not poll in Xen for completion of commands.
   - Handles the guest flooding the command queue with commands. But needs
 a timer

Handling Command queue state:
 - Physical Queue cannot be full as it is 64KB, thereby it can accommodate
   2K ITS commands.
   In case it is full, the VCPU has to poll with timeout till the physical
   Queue is empty before it posts the next command
 - The vITS Queue condition should be managed by the guest ITS driver.

Behaviour of Polling and completion interrupt based guest driver:
 - If completion interrupt (INT) is used by the guest driver, then insert a
   Xen completion INT command so that CREADR is updated before the guest's
   INT command is injected
 - If polling mode is used, a trap on CREADR checks for completion of command.