[PATCH v4] Handle wrap around in limit calculation

2024-01-23 Thread Shlomo Pongratz
Hanlde wrap around when calculating the viewport size
caused by the fact that perior to version 460A the limit variable
was 32bit quantity and not 64 bits quantity.
In the i.MX 6Dual/6Quad Applications Processor Reference Manual
document on which the original code was based upon in the
description of the iATU Region Upper Base Address Register it is
written:
Forms bits [63:32] of the start (and end) address of the address region to 
be
translated.
That is in this register is the upper of both base and the limit.
In the current implementation this value was ignored for the limit
which caused a wrap around of the size calculaiton.
Using the documnet example:
Base HI: 0x8000 Base LO: 0xd000 Limit LO: 0xd000
The correct size is 0x8000d000 - 0x8000d000 + 1 =
0x01
The wrong result is 0xd000 - 0x8000d000 + 1 = 0x8001

Signed-off-by: Shlomo Pongratz 



Changes since v3:
 * Use Peter Maydell's suggestion for fixing the bug,
   that is compute a 64 bits limit from the upper 32 bits of the base
   and the given 32 bits limit.

Changes since v2:
 * Don't try to fix the calculation.
 * Change the limit variable from 32bit to 64 bit
 * Set the limit bits [63:32] same as the base according to
   the specification on which the original code was base upon.

Changes since v1:
 * Seperate subject and description
---
 hw/pci-host/designware.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/hw/pci-host/designware.c b/hw/pci-host/designware.c
index dd9e389c07..f81dbe3975 100644
--- a/hw/pci-host/designware.c
+++ b/hw/pci-host/designware.c
@@ -269,8 +269,20 @@ static void 
designware_pcie_update_viewport(DesignwarePCIERoot *root,
 {
 const uint64_t target = viewport->target;
 const uint64_t base   = viewport->base;
-const uint64_t size   = (uint64_t)viewport->limit - base + 1;
+uint64_t size;
 const bool enabled= viewport->cr[1] & DESIGNWARE_PCIE_ATU_ENABLE;
+uint64_t limit;
+
+/*
+ * Versions 460A and above of this PCI controller have a
+ * 64-bit limit register. We currently model the older type
+ * where the limit register is 32 bits, and the upper 32 bits
+ * of the end address are the same as the upper 32 bits of
+ * the start address.
+ */
+
+limit = deposit64(viewport->limit, 32, 32, extract64(base, 32, 32));
+size = limit - base + 1;
 
 MemoryRegion *current, *other;
 
-- 
2.25.1




Re: [PATCH v3] Handle wrap around in limit calculation

2024-01-22 Thread Shlomo Pongratz

Please see inline

On 22/01/2024 01:17, Philippe Mathieu-Daudé wrote:

Hi Shlomo,

On 21/1/24 17:47, Shlomo Pongratz wrote:

From: Shlomo Pongratz 

 Hanlde wrap around when calculating the viewport size
 caused by the fact that perior to version 460A the limit variable
 was 32bit quantity and not 64 bits quantity.
 In the i.MX 6Dual/6Quad Applications Processor Reference Manual
 document on which the original code was based upon in the
 description of the iATU Region Upper Base Address Register it is
 written:
 Forms bits [63:32] of the start (and end) address of the address 
region to be

 translated.
 That is in this register is the upper of both base and the limit.
 In the current implementation this value was ignored for the limit
 which caused a wrap around of the size calculaiton.
 Using the documnet example:
 Base HI: 0x8000 Base LO: 0xd000 Limit LO: 0xd000
 The correct size is 0x8000d000 - 0x8000d000 + 1 =
0x01
 The wrong result is 0xd000 - 0x8000d000 + 1 = 
0x8001


 Signed-off-by: Shlomo Pongratz 

 

 Changes since v2:
  * Don't try to fix the calculation.
  * Change the limit variable from 32bit to 64 bit
  * Set the limit bits [63:32] same as the base according to
    the specification on which the original code was base upon.

 Changes since v1:
  * Seperate subject and description
---
  hw/pci-host/designware.c | 19 ++-
  include/hw/pci-host/designware.h |  2 +-
  2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/hw/pci-host/designware.c b/hw/pci-host/designware.c
index dd9e389c07..43cba9432f 100644
--- a/hw/pci-host/designware.c
+++ b/hw/pci-host/designware.c
@@ -269,7 +269,7 @@ static void 
designware_pcie_update_viewport(DesignwarePCIERoot *root,

  {
  const uint64_t target = viewport->target;
  const uint64_t base   = viewport->base;
-    const uint64_t size   = (uint64_t)viewport->limit - base + 1;
+    const uint64_t size   = viewport->limit - base + 1;
  const bool enabled    = viewport->cr[1] & 
DESIGNWARE_PCIE_ATU_ENABLE;

    MemoryRegion *current, *other;
@@ -351,6 +351,14 @@ static void 
designware_pcie_root_config_write(PCIDevice *d, uint32_t address,

  case DESIGNWARE_PCIE_ATU_UPPER_BASE:
  viewport->base &= 0xULL;
  viewport->base |= (uint64_t)val << 32;
+    /* The documentatoin states that the value of this register
+ * "Forms bits [63:32] of the start (and end) address
+ * of the address region to be translated.
+ * Note that from version 406A there is a sperate
+ * register fot the upper end address
+ */
+    viewport->limit &= 0xULL;
+    viewport->limit |= (uint64_t)val << 32;


This code is easier to review using:

  viewport->limit = deposit64(viewport->limit, 32, 32, val);


It will be strange to have
viewport->base &= 0xULL;
    viewport->base |= (uint64_t)val << 32;
and then
viewport->limit = deposit64(viewport->limit, 32, 32, val);
I think that the code for base and limit should be the same.
And I don't think the original for base should be change to
viewport->base = deposit64(viewport->base, 32, 32, val);
SO everything will look the same.

  break;
    case DESIGNWARE_PCIE_ATU_LOWER_TARGET:
@@ -364,7 +372,8 @@ static void 
designware_pcie_root_config_write(PCIDevice *d, uint32_t address,

  break;
    case DESIGNWARE_PCIE_ATU_LIMIT:
-    viewport->limit = val;
+    viewport->limit &= 0xULL;
+    viewport->limit |= val;


Here:

  viewport->limit = deposit64(viewport->limit, 0, 32, val);

My opinion is that the code should be identical to
case DESIGNWARE_PCIE_ATU_LOWER_BASE:
    viewport->base &= 0xULL;
    viewport->base |= val;
    break;
I don't think it is good to mix two styles.

  break;
    case DESIGNWARE_PCIE_ATU_CR1:
@@ -429,7 +438,7 @@ static void 
designware_pcie_root_realize(PCIDevice *dev, Error **errp)

  viewport->inbound = true;
  viewport->base    = 0xULL;
  viewport->target  = 0xULL;
-    viewport->limit   = UINT32_MAX;
+    viewport->limit   = 0xULL;


Previous code is easier to review IMHO.

Just want to make it clear that this is 64 bit values and that the
upper value is the same as the base's upper value, and according to spec.

  viewport->cr[0]   = DESIGNWARE_PCIE_ATU_TYPE_MEM;
    source  = &host->pci.address_space_root;
@@ -453,7 +462,7 @@ static void 
designware_pcie_root_realize(PCIDevice *dev, Error **errp)

  viewport

[PATCH v3] Handle wrap around in limit calculation

2024-01-21 Thread Shlomo Pongratz
From: Shlomo Pongratz 

Hanlde wrap around when calculating the viewport size
caused by the fact that perior to version 460A the limit variable
was 32bit quantity and not 64 bits quantity.
In the i.MX 6Dual/6Quad Applications Processor Reference Manual
document on which the original code was based upon in the
description of the iATU Region Upper Base Address Register it is
written:
Forms bits [63:32] of the start (and end) address of the address region to 
be
translated.
That is in this register is the upper of both base and the limit.
In the current implementation this value was ignored for the limit
which caused a wrap around of the size calculaiton.
Using the documnet example:
Base HI: 0x8000 Base LO: 0xd000 Limit LO: 0xd000
The correct size is 0x8000d000 - 0x8000d000 + 1 =
0x01
The wrong result is 0xd000 - 0x8000d000 + 1 = 0x8001

Signed-off-by: Shlomo Pongratz 



Changes since v2:
 * Don't try to fix the calculation.
 * Change the limit variable from 32bit to 64 bit
 * Set the limit bits [63:32] same as the base according to
   the specification on which the original code was base upon.

Changes since v1:
 * Seperate subject and description
---
 hw/pci-host/designware.c | 19 ++-
 include/hw/pci-host/designware.h |  2 +-
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/hw/pci-host/designware.c b/hw/pci-host/designware.c
index dd9e389c07..43cba9432f 100644
--- a/hw/pci-host/designware.c
+++ b/hw/pci-host/designware.c
@@ -269,7 +269,7 @@ static void 
designware_pcie_update_viewport(DesignwarePCIERoot *root,
 {
 const uint64_t target = viewport->target;
 const uint64_t base   = viewport->base;
-const uint64_t size   = (uint64_t)viewport->limit - base + 1;
+const uint64_t size   = viewport->limit - base + 1;
 const bool enabled= viewport->cr[1] & DESIGNWARE_PCIE_ATU_ENABLE;
 
 MemoryRegion *current, *other;
@@ -351,6 +351,14 @@ static void designware_pcie_root_config_write(PCIDevice 
*d, uint32_t address,
 case DESIGNWARE_PCIE_ATU_UPPER_BASE:
 viewport->base &= 0xULL;
 viewport->base |= (uint64_t)val << 32;
+/* The documentatoin states that the value of this register
+ * "Forms bits [63:32] of the start (and end) address
+ * of the address region to be translated.
+ * Note that from version 406A there is a sperate
+ * register fot the upper end address
+ */
+viewport->limit &= 0xULL;
+viewport->limit |= (uint64_t)val << 32;
 break;
 
 case DESIGNWARE_PCIE_ATU_LOWER_TARGET:
@@ -364,7 +372,8 @@ static void designware_pcie_root_config_write(PCIDevice *d, 
uint32_t address,
 break;
 
 case DESIGNWARE_PCIE_ATU_LIMIT:
-viewport->limit = val;
+viewport->limit &= 0xULL;
+viewport->limit |= val;
 break;
 
 case DESIGNWARE_PCIE_ATU_CR1:
@@ -429,7 +438,7 @@ static void designware_pcie_root_realize(PCIDevice *dev, 
Error **errp)
 viewport->inbound = true;
 viewport->base= 0xULL;
 viewport->target  = 0xULL;
-viewport->limit   = UINT32_MAX;
+viewport->limit   = 0xULL;
 viewport->cr[0]   = DESIGNWARE_PCIE_ATU_TYPE_MEM;
 
 source  = &host->pci.address_space_root;
@@ -453,7 +462,7 @@ static void designware_pcie_root_realize(PCIDevice *dev, 
Error **errp)
 viewport->inbound = false;
 viewport->base= 0xULL;
 viewport->target  = 0xULL;
-viewport->limit   = UINT32_MAX;
+viewport->limit   = 0xULL;
 viewport->cr[0]   = DESIGNWARE_PCIE_ATU_TYPE_MEM;
 
 destination = &host->pci.memory;
@@ -560,7 +569,7 @@ static const VMStateDescription 
vmstate_designware_pcie_viewport = {
 .fields = (const VMStateField[]) {
 VMSTATE_UINT64(base, DesignwarePCIEViewport),
 VMSTATE_UINT64(target, DesignwarePCIEViewport),
-VMSTATE_UINT32(limit, DesignwarePCIEViewport),
+VMSTATE_UINT64(limit, DesignwarePCIEViewport),
 VMSTATE_UINT32_ARRAY(cr, DesignwarePCIEViewport, 2),
 VMSTATE_END_OF_LIST()
 }
diff --git a/include/hw/pci-host/designware.h b/include/hw/pci-host/designware.h
index 908f3d946b..51052683b7 100644
--- a/include/hw/pci-host/designware.h
+++ b/include/hw/pci-host/designware.h
@@ -41,7 +41,7 @@ typedef struct DesignwarePCIEViewport {
 
 uint64_t base;
 uint64_t target;
-uint32_t limit;
+uint64_t limit;
 uint32_t cr[2];
 
 bool inbound;
-- 
2.25.1




Re: [PATCH V2] Handle wrap around in limit calculation

2024-01-18 Thread Shlomo Pongratz

See Inline.

On 15/01/2024 18:47, Peter Maydell wrote:

On Mon, 15 Jan 2024 at 13:51, Shlomo Pongratz  wrote:

On 15/01/2024 12:37, Peter Maydell wrote:

For instance, the kernel code suggests that pre-460A
there's a 32 bit limit register, and post-460A there
is a 64-bit limit (with an "UPPER_LIMIT" register to
access the top 32 bits), plus a PCIE_ATU_INCREASE_REGION_SIZE
flag bit. That suggests we might either:
   (1) implement all that
   (2) say we're implementing a pre-460A version with a
   32 bit limit field
Presumably we're aiming for (2) here, but I find the
arithmetic you have in this patch completely confusing:
it doesn't look like something hardware would be doing
when it has a 64 bit base address register and a 32 bit limit
address register. It's also weird as C code, because you
add 0x1 when calculating tlimit, but this will
have no effect because of the AND with 0x in
the following line.

My guess at what the hardware is doing is "the actual
limit address takes its low 32 bits from the limit address
register and the high 32 bits from the high 32 bits of
the base address".

The original code which claims to be based on i.MX6 Applications Processor
actually fails for the example given there in page 4131 where the lower
is set to 0xd000
the upper to 0x800 and the limit to 0xd000 which gives us a size
of 0x8001,
which is a negative number. Therefore some fix need to be done.

Ah, I hadn't realised that this PCI controller was documented
in the i.MX6 applications processor reference manual. Looking
at that I see that the "upper base address register" is documented
as "Forms bits [63:32] of the start (and end) address of the address
region to be translated". That "(and end)" part confirms my
guess that the 64-bit limit address is supposed to be formed by
putting together the upper-base-address with the limit value.
Actually this is one implementation of one pre-460A and we don't even 
knows which.

And QEMU version is 0 which I assume should be some ideal implementation.
So as I suspect there is an implied HW limitation that a region should 
never cross

4G boundary e.g. [0x8000f000 0x8001efff] will be sent as
0x8000f000 and 0xefff resulting in[0x8000f000 
0x8000efff]

Your suggestion solve this issue and gives the correct address even if
the addresses are translated by for example by a multiple of 4G, e.g
0x2,
but it won't work if the range is translated in a way that is cross 4G
boundary (e.g. translate by 0x2000).

After reviewing the mathematics I've found a solution which to my
humiliation is more simple and it is likely that the HW
is doing it, and this is just to ignore the high 32 bits of the
calculation's result. That is:
const uint64_t size = ((uint64_t)viewport->limit - base + 1) & 0x;

That gives the wrong answer for the case where the limit register
is set to 0x_ (it will give you 0 rather than 0x1__).

This was a typo
The + 1 should have been added after masking.
That is the calculation is done in 32 bit.

    uint32_t tbase;
    uint32_t tsize;
    uint64_t size;
    uint64_t tlimit;

    // Use 32 bit calculation
    tbase = base; // Truncate
    tsize = limit - tbase;
    // Return to 64 bit
    size = tsize;
    size++;


I think my suggestion is:

uint64_t base = viewport->base;
uint64_t limit = viewport->limit;
uint64_t size;

/*
 * Versions 460A and above of this PCI controller have a
 * 64-bit limit register. We currently model the older type
 * where the limit register is 32 bits, and the upper 32 bits
 * of the end address are the same as the upper 32 bits of
 * the start address.
 */
limit = deposit64(limit, 32, 32, extract64(base, 32, 32));
size = limit - base + 1;

This will work the only thing is that it will not work if for example
The original range is [0x8000f000 0x8001efff]
Stuffing 0x8000f000 0x0efff to your solution yields
size 0x0001 and as result a limit of 0x8000efff
Note that the 32bit calculation above gives the original values.
Full simulation of all algorithms over some ranges can be found in
my gist 
https://gist.github.com/shlomopongartz/f82466b6044f3a1653890b43b046c443


Anyway I assume that in your opinion we should keep the HW limitation 
and not try to fix the HW

That's a fairly clear implementation of what the docs say the
behaviour is, and it gives us a nice easy place to add 64-bit
limit register support if we ever need to (by guarding the setting
of 'limit' with an "if 64-bit limit registers not enabled" check).

I'll be glad to see post 460A implementation.
Since until then it will be impossible to use the same device tree in a 
project
where the HW are working in a post-A460 controller and the

Re: [PATCH V2] Handle wrap around in limit calculation

2024-01-15 Thread Shlomo Pongratz

On 15/01/2024 12:37, Peter Maydell wrote:

See inline.

On Mon, 15 Jan 2024 at 05:58, Shlomo Pongratz  wrote:

Thank you.
Please see comments inline.

On Fri, Jan 12, 2024 at 7:03 PM Peter Maydell  wrote:

On Tue, 9 Jan 2024 at 12:45, Shlomo Pongratz  wrote:

Hi; thanks for this patch.


Hanlde wrap around caused by the fact that perior to version 460A

Is this "460A" version number a version of the hardware
we're modelling ?


the limit was 32bit quantity.
See Linux kernel code in:
drivers/pci/controllers/dwc/pcie-designware.c

"/controller/"


function: __dw_pcie_prog_outbound_atu

There don't seem to be any comments in this kernel function
that say anything about wrap-around:

https://elixir.bootlin.com/linux/latest/source/drivers/pci/controller/dwc/pcie-designware.c#L468

so I'm not sure what you're trying to explain by referring to it.

This is just to give some  context.
If you look at the original submission of the controller patch d64e5eabc4c7e20c
pci: Add support for Designware IP block by Andrey Smirnov it is written there
"Add code needed to get a functional PCI subsytem when using in
 conjunction with upstream Linux guest (4.13+)."

Now in a 64bit system the range can be above 4G but as long as
the limit itself is less then 4G the overflow is avoided

Signed-off-by: Shlomo Pongratz 



Changes since v1:
  * Seperate subject and description
---
  hw/pci-host/designware.c | 15 ++-
  1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/hw/pci-host/designware.c b/hw/pci-host/designware.c
index dd9e389c07..7ce4a6b64d 100644
--- a/hw/pci-host/designware.c
+++ b/hw/pci-host/designware.c
@@ -269,11 +269,24 @@ static void 
designware_pcie_update_viewport(DesignwarePCIERoot *root,
  {
  const uint64_t target = viewport->target;
  const uint64_t base   = viewport->base;
-const uint64_t size   = (uint64_t)viewport->limit - base + 1;
  const bool enabled= viewport->cr[1] & DESIGNWARE_PCIE_ATU_ENABLE;
+uint64_t tbase, tlimit, size;

  MemoryRegion *current, *other;

+/*
+ * Hanlde wrap around caused by the fact that perior to version 460A
+ * the limit was 32bit quantity.
+ * See Linux kernel code in:
+ * drivers/pci/controllers/dwc/pcie-designware.c
+ * function: __dw_pcie_prog_outbound_atu
+ * Now in a 64bit system the range can be above 4G but as long as
+ * the limit itself is less then 4G the overflow is avoided
+ */
+tbase = base & 0x;
+tlimit = 0x1 + (uint64_t)viewport->limit;
+size = ((tlimit - tbase) & 0x) + 1;
+

I find this patch a bit confusing, partly because the comment
seems to be written from the perspective of what the kernel
driver is doing, not from the perspective of the hardware
behaviour.


Again I refer to the original patch comment.
The original patch was written to support Linux kernel 4.13+ and a
specific implementation i.MX6 Applications Processor.
I've looked at the i.MX6 reference manual and it was silent regarding
the wrap-around case.
There is no reference to the  relevant Synopsys' Designware's specification.
Note that the pci version of the QEMU is 0, therefore I assume that
the implementation was tailored
to a specific implementation.

What is the behaviour of the actual hardware here, both before
and after 460A ? Which version are we trying to model?

I don't have access to the pantora of Synopsys' Designware's root port.
I can only conclude from the Linux kernel code that although the base
address was always 64 bit quantity,
then before version 460A that the limit quantity was 32 bit quantity
and from version 460A it is 64 bit quantity.
And the document that the original patch was based on didn't say what
is the behavior in case of wrap around.
I don't want to model any specific version, I just want to work with
device tree definitions that has regions above 4G,
which are possible since the base address is 64 bit quantity as long
as the regions size are
less teh 4G.

But we must model *something*, which is ideally "what the spec
says" or possibly "what some specific version is". In this
particular case, we need to be clear about whether we are
modelling the pre-460A behaviour or the 460A-and-onward
behaviour. "This change seems to be enough to make Linux
work" is generally not sufficient to justify it.

If all we have is the Linux driver code then the flow
has to be:
  * look at what the kernel does
  * deduce what we think the hardware implementation must
be, based on that plus common sense about what is and
isn't typical and easy for hardware to do
  * implement that, with comments about where we're making
guesses

For instance, the kernel code suggests that pre-460A
there's a 32 bit limit register, and post-460A there
is a 64-bit limit (with an "UPPER_LIMIT" re

Re: [PATCH V2] Handle wrap around in limit calculation

2024-01-14 Thread Shlomo Pongratz

Thank you.

Please see comments inline.

On Fri, Jan 12, 2024 at 7:03 PM Peter Maydell  wrote:

On Tue, 9 Jan 2024 at 12:45, Shlomo Pongratz  wrote:

Hi; thanks for this patch.


Hanlde wrap around caused by the fact that perior to version 460A

Is this "460A" version number a version of the hardware
we're modelling ?


the limit was 32bit quantity.
See Linux kernel code in:
drivers/pci/controllers/dwc/pcie-designware.c

"/controller/"


function: __dw_pcie_prog_outbound_atu

There don't seem to be any comments in this kernel function
that say anything about wrap-around:

https://elixir.bootlin.com/linux/latest/source/drivers/pci/controller/dwc/pcie-designware.c#L468

so I'm not sure what you're trying to explain by referring to it.

This is just to give some  context.
If you look at the original submission of the controller patch d64e5eabc4c7e20c
pci: Add support for Designware IP block by Andrey Smirnov it is written there
"Add code needed to get a functional PCI subsytem when using in
 conjunction with upstream Linux guest (4.13+)."

Now in a 64bit system the range can be above 4G but as long as
the limit itself is less then 4G the overflow is avoided

Signed-off-by: Shlomo Pongratz 



Changes since v1:
  * Seperate subject and description
---
  hw/pci-host/designware.c | 15 ++-
  1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/hw/pci-host/designware.c b/hw/pci-host/designware.c
index dd9e389c07..7ce4a6b64d 100644
--- a/hw/pci-host/designware.c
+++ b/hw/pci-host/designware.c
@@ -269,11 +269,24 @@ static void 
designware_pcie_update_viewport(DesignwarePCIERoot *root,
  {
  const uint64_t target = viewport->target;
  const uint64_t base   = viewport->base;
-const uint64_t size   = (uint64_t)viewport->limit - base + 1;
  const bool enabled= viewport->cr[1] & DESIGNWARE_PCIE_ATU_ENABLE;
+uint64_t tbase, tlimit, size;

  MemoryRegion *current, *other;

+/*
+ * Hanlde wrap around caused by the fact that perior to version 460A
+ * the limit was 32bit quantity.
+ * See Linux kernel code in:
+ * drivers/pci/controllers/dwc/pcie-designware.c
+ * function: __dw_pcie_prog_outbound_atu
+ * Now in a 64bit system the range can be above 4G but as long as
+ * the limit itself is less then 4G the overflow is avoided
+ */
+tbase = base & 0x;
+tlimit = 0x1 + (uint64_t)viewport->limit;
+size = ((tlimit - tbase) & 0x) + 1;
+

I find this patch a bit confusing, partly because the comment
seems to be written from the perspective of what the kernel
driver is doing, not from the perspective of the hardware
behaviour.


Again I refer to the original patch comment.
The original patch was written to support Linux kernel 4.13+ and a
specific implementation i.MX6 Applications Processor.
I've looked at the i.MX6 reference manual and it was silent regarding
the wrap-around case.
There is no reference to the  relevant Synopsys' Designware's specification.
Note that the pci version of the QEMU is 0, therefore I assume that
the implementation was tailored
to a specific implementation.

What is the behaviour of the actual hardware here, both before
and after 460A ? Which version are we trying to model?

I don't have access to the pantora of Synopsys' Designware's root port.
I can only conclude from the Linux kernel code that although the base
address was always 64 bit quantity,
then before version 460A that the limit quantity was 32 bit quantity
and from version 460A it is 64 bit quantity.
And the document that the original patch was based on didn't say what
is the behavior in case of wrap around.
I don't want to model any specific version, I just want to work with
device tree definitions that has regions above 4G,
which are possible since the base address is 64 bit quantity as long
as the regions size are
less teh 4G.

thanks
-- PMM




Re: [PATCH V2] Handle wrap around in limit calculation

2024-01-14 Thread Shlomo Pongratz
Thank you.
Please see comments inline.

On Fri, Jan 12, 2024 at 7:03 PM Peter Maydell  wrote:
>
> On Tue, 9 Jan 2024 at 12:45, Shlomo Pongratz  wrote:
>
> Hi; thanks for this patch.
>
> > Hanlde wrap around caused by the fact that perior to version 460A
>
> Is this "460A" version number a version of the hardware
> we're modelling ?
>
> > the limit was 32bit quantity.
> > See Linux kernel code in:
> > drivers/pci/controllers/dwc/pcie-designware.c
>
> "/controller/"
>
> > function: __dw_pcie_prog_outbound_atu
>
> There don't seem to be any comments in this kernel function
> that say anything about wrap-around:
>
> https://elixir.bootlin.com/linux/latest/source/drivers/pci/controller/dwc/pcie-designware.c#L468
>
> so I'm not sure what you're trying to explain by referring to it.
This is just to give some  context.
If you look at the original submission of the controller patch d64e5eabc4c7e20c
pci: Add support for Designware IP block by Andrey Smirnov it is written there
"Add code needed to get a functional PCI subsytem when using in
conjunction with upstream Linux guest (4.13+)."
>
> > Now in a 64bit system the range can be above 4G but as long as
> > the limit itself is less then 4G the overflow is avoided
> >
> > Signed-off-by: Shlomo Pongratz 
> >
> > 
> >
> > Changes since v1:
> >  * Seperate subject and description
> > ---
> >  hw/pci-host/designware.c | 15 ++-
> >  1 file changed, 14 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/pci-host/designware.c b/hw/pci-host/designware.c
> > index dd9e389c07..7ce4a6b64d 100644
> > --- a/hw/pci-host/designware.c
> > +++ b/hw/pci-host/designware.c
> > @@ -269,11 +269,24 @@ static void 
> > designware_pcie_update_viewport(DesignwarePCIERoot *root,
> >  {
> >  const uint64_t target = viewport->target;
> >  const uint64_t base   = viewport->base;
> > -const uint64_t size   = (uint64_t)viewport->limit - base + 1;
> >  const bool enabled= viewport->cr[1] & DESIGNWARE_PCIE_ATU_ENABLE;
> > +uint64_t tbase, tlimit, size;
> >
> >  MemoryRegion *current, *other;
> >
> > +/*
> > + * Hanlde wrap around caused by the fact that perior to version 460A
> > + * the limit was 32bit quantity.
> > + * See Linux kernel code in:
> > + * drivers/pci/controllers/dwc/pcie-designware.c
> > + * function: __dw_pcie_prog_outbound_atu
> > + * Now in a 64bit system the range can be above 4G but as long as
> > + * the limit itself is less then 4G the overflow is avoided
> > + */
> > +tbase = base & 0x;
> > +tlimit = 0x1 + (uint64_t)viewport->limit;
> > +size = ((tlimit - tbase) & 0x) + 1;
> > +
>
> I find this patch a bit confusing, partly because the comment
> seems to be written from the perspective of what the kernel
> driver is doing, not from the perspective of the hardware
> behaviour.
>
Again I refer to the original patch comment.
The original patch was written to support Linux kernel 4.13+ and a
specific implementation i.MX6 Applications Processor.
I've looked at the i.MX6 reference manual and it was silent regarding
the wrap-around case.
There is no reference to the  relevant Synopsys' Designware's specification.
Note that the pci version of the QEMU is 0, therefore I assume that
the implementation was tailored
to a specific implementation.
> What is the behaviour of the actual hardware here, both before
> and after 460A ? Which version are we trying to model?
I don't have access to the pantora of Synopsys' Designware's root port.
I can only conclude from the Linux kernel code that although the base
address was always 64 bit quantity,
then before version 460A that the limit quantity was 32 bit quantity
and from version 460A it is 64 bit quantity.
And the document that the original patch was based on didn't say what
is the behavior in case of wrap around.
I don't want to model any specific version, I just want to work with
device tree definitions that has regions above 4G,
which are possible since the base address is 64 bit quantity as long
as the regions size are
less teh 4G.
>
> thanks
> -- PMM



[PATCH V2] Handle wrap around in limit calculation

2024-01-09 Thread Shlomo Pongratz
Hanlde wrap around caused by the fact that perior to version 460A
the limit was 32bit quantity.
See Linux kernel code in:
drivers/pci/controllers/dwc/pcie-designware.c
function: __dw_pcie_prog_outbound_atu
Now in a 64bit system the range can be above 4G but as long as
the limit itself is less then 4G the overflow is avoided

Signed-off-by: Shlomo Pongratz 



Changes since v1:
 * Seperate subject and description
---
 hw/pci-host/designware.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/hw/pci-host/designware.c b/hw/pci-host/designware.c
index dd9e389c07..7ce4a6b64d 100644
--- a/hw/pci-host/designware.c
+++ b/hw/pci-host/designware.c
@@ -269,11 +269,24 @@ static void 
designware_pcie_update_viewport(DesignwarePCIERoot *root,
 {
 const uint64_t target = viewport->target;
 const uint64_t base   = viewport->base;
-const uint64_t size   = (uint64_t)viewport->limit - base + 1;
 const bool enabled= viewport->cr[1] & DESIGNWARE_PCIE_ATU_ENABLE;
+uint64_t tbase, tlimit, size;
 
 MemoryRegion *current, *other;
 
+/*
+ * Hanlde wrap around caused by the fact that perior to version 460A
+ * the limit was 32bit quantity.
+ * See Linux kernel code in:
+ * drivers/pci/controllers/dwc/pcie-designware.c
+ * function: __dw_pcie_prog_outbound_atu
+ * Now in a 64bit system the range can be above 4G but as long as
+ * the limit itself is less then 4G the overflow is avoided
+ */
+tbase = base & 0x;
+tlimit = 0x1 + (uint64_t)viewport->limit;
+size = ((tlimit - tbase) & 0x) + 1;
+
 if (viewport->cr[0] == DESIGNWARE_PCIE_ATU_TYPE_MEM) {
 current = &viewport->mem;
 other   = &viewport->cfg;
-- 
2.25.1




[PATCH] Hanlde wrap around caused by the fact that perior to version 460A the limit was 32bit quantity. See Linux kernel code in: drivers/pci/controllers/dwc/pcie-designware.c function: __dw_pcie_prog

2024-01-08 Thread Shlomo Pongratz
Signed-off-by: Shlomo Pongratz 
---
 hw/pci-host/designware.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/hw/pci-host/designware.c b/hw/pci-host/designware.c
index dd9e389c07..7ce4a6b64d 100644
--- a/hw/pci-host/designware.c
+++ b/hw/pci-host/designware.c
@@ -269,11 +269,24 @@ static void 
designware_pcie_update_viewport(DesignwarePCIERoot *root,
 {
 const uint64_t target = viewport->target;
 const uint64_t base   = viewport->base;
-const uint64_t size   = (uint64_t)viewport->limit - base + 1;
 const bool enabled= viewport->cr[1] & DESIGNWARE_PCIE_ATU_ENABLE;
+uint64_t tbase, tlimit, size;
 
 MemoryRegion *current, *other;
 
+/*
+ * Hanlde wrap around caused by the fact that perior to version 460A
+ * the limit was 32bit quantity.
+ * See Linux kernel code in:
+ * drivers/pci/controllers/dwc/pcie-designware.c
+ * function: __dw_pcie_prog_outbound_atu
+ * Now in a 64bit system the range can be above 4G but as long as
+ * the limit itself is less then 4G the overflow is avoided
+ */
+tbase = base & 0x;
+tlimit = 0x1 + (uint64_t)viewport->limit;
+size = ((tlimit - tbase) & 0x) + 1;
+
 if (viewport->cr[0] == DESIGNWARE_PCIE_ATU_TYPE_MEM) {
 current = &viewport->mem;
 other   = &viewport->cfg;
-- 
2.25.1




Re: PCIe with Designware RC.

2023-12-24 Thread Shlomo Pongratz
Thank you, see comment inside.

On Sun, Dec 24, 2023 at 1:11 PM BALATON Zoltan  wrote:
>
> On Sun, 24 Dec 2023, Shlomo Pongratz wrote:
> > Hi,
> > I'm working on a AARCH64 project that uses the designeware
> > (hw/pci-host/designware.c).
> > I've copied the designware initialization from hw/arm/fsl-imx7.c and I
> > hope I've updated the dtsi correctly.
> > After fixing an issue with the iATU windows (see patch
> > https://lists.gnu.org/archive/html/qemu-devel/2023-12/msg02643.html)
> > I've tried to add virtualized NVMe controller.
> > When I added the lines:
> >-device nvme,serial=deadbeef,drive=nvme0,bus=pcie \  (Or without 
> > bus=)
> >-drive file=/home/pliops/disk-1.img,if=none,id=nvme1 \
>
> You define drive with if=none,id=nvme1 but have drive=nvme0 in your
> device. You should refer to the drive you want the device to use so I
> think it should either be -device nvme,drive=nvme1 or the if of drive
> should be nvme0.

This was a typo, All are nvme0.

 I don't know how this works for nvme but for CD drives
> for example adding a device would add it without disk and drive defines
> the disk to use. Not sure this makes sense for hard disks or nvme device
> but maybe the command line options don't consider that.
>
> > I could see in QEMU monitor that the NVMe device was preset i.e.
> > (qemu) info pci
> >  Bus  0, device   0, function 0:
> >PCI bridge: PCI device 16c3:abcd
> >  IRQ 0, pin A
> >  BUS 0.
> >  secondary bus 1.
> >  subordinate bus 255.
> >  IO range [0xf000, 0x0fff]
> >  memory range [0xfff0, 0x000f]
> >  prefetchable memory range [0xfff0, 0x000f]
> >  id ""
> >  Bus  0, device   1, function 0:
> >Class 0264: PCI device 1b36:0010
> >  PCI subsystem 1af4:1100
> >  IRQ 0, pin A
> >  BAR0: 64 bit memory at 0x [0x3ffe].
> >  id ""
> > However in lspci it was missing
> > # lspci
> > 00:00.0 Class 0604: 16c3:abcd
> >
> > If I used the following command
> >-drive file=/home/pliops/disk.img,if=none,id=nvme0 \
> >-device nvme,serial=deadbeef,drive=nvme0,bus=dw-pcie \
>
> Here you correctly define both media and drive so it works as expected.
> There are some shortcuts for -drive with media=disk or media=cdrom and
> if=ide or scsi that don't need a separate drive option as if=none does but
> not sure if that supports nvme. You probably have to check documentation
> or code to find out.
>
> > Then in the monitor I see:
> > (qemu) info pci
> >  Bus  0, device   0, function 0:
> >PCI bridge: PCI device 16c3:abcd
> >  IRQ 0, pin A
> >  BUS 0.
> >  secondary bus 1.
> >  subordinate bus 255.
> >  IO range [0xf000, 0x0fff]
> >  memory range [0x4000, 0x401f]
> >  prefetchable memory range [0xfff0, 0x000f]
> >  id ""
> >  Bus  1, device   0, function 0:
> >Class 0264: PCI device 1b36:0010
> >  PCI subsystem 1af4:1100
> >  IRQ 1, pin A
> >  BAR0: 64 bit memory at 0x [0x3ffe].
> >  id ""
> > That is the NVMe is on BUS 1.
> > And in lspci I can now see the device but on bus 1.
> > # lspci
> > 01:00.0 Class 0108: 1b36:0010
> > 00:00.0 Class 0604: 16c3:abcd
> >
> > Is this expected?
> >
> > But the main problem is that during the initialization of the
> > controller registers in BAR0 all the read and writes are actually done
> > into the config space.
>
> I don't know what this is but don't think it's related to the above.
>
> Regards,
> BALATON Zoltan
>
> > Any ideas?
> >
> > Thank you
> > Shlomo Pongratz.
> >
> >



PCIe with Designware RC.

2023-12-24 Thread Shlomo Pongratz
Hi,
I'm working on a AARCH64 project that uses the designeware
(hw/pci-host/designware.c).
I've copied the designware initialization from hw/arm/fsl-imx7.c and I
hope I've updated the dtsi correctly.
After fixing an issue with the iATU windows (see patch
https://lists.gnu.org/archive/html/qemu-devel/2023-12/msg02643.html)
I've tried to add virtualized NVMe controller.
When I added the lines:
-device nvme,serial=deadbeef,drive=nvme0,bus=pcie \  (Or without bus=)
-drive file=/home/pliops/disk-1.img,if=none,id=nvme1 \
I could see in QEMU monitor that the NVMe device was preset i.e.
(qemu) info pci
  Bus  0, device   0, function 0:
PCI bridge: PCI device 16c3:abcd
  IRQ 0, pin A
  BUS 0.
  secondary bus 1.
  subordinate bus 255.
  IO range [0xf000, 0x0fff]
  memory range [0xfff0, 0x000f]
  prefetchable memory range [0xfff0, 0x000f]
  id ""
  Bus  0, device   1, function 0:
Class 0264: PCI device 1b36:0010
  PCI subsystem 1af4:1100
  IRQ 0, pin A
  BAR0: 64 bit memory at 0x [0x3ffe].
  id ""
However in lspci it was missing
# lspci
00:00.0 Class 0604: 16c3:abcd

If I used the following command
-drive file=/home/pliops/disk.img,if=none,id=nvme0 \
-device nvme,serial=deadbeef,drive=nvme0,bus=dw-pcie \
Then in the monitor I see:
(qemu) info pci
  Bus  0, device   0, function 0:
PCI bridge: PCI device 16c3:abcd
  IRQ 0, pin A
  BUS 0.
  secondary bus 1.
  subordinate bus 255.
  IO range [0xf000, 0x0fff]
  memory range [0x4000, 0x401f]
  prefetchable memory range [0xfff0, 0x000f]
  id ""
  Bus  1, device   0, function 0:
Class 0264: PCI device 1b36:0010
  PCI subsystem 1af4:1100
  IRQ 1, pin A
  BAR0: 64 bit memory at 0x [0x3ffe].
  id ""
That is the NVMe is on BUS 1.
And in lspci I can now see the device but on bus 1.
# lspci
01:00.0 Class 0108: 1b36:0010
00:00.0 Class 0604: 16c3:abcd

Is this expected?

But the main problem is that during the initialization of the
controller registers in BAR0 all the read and writes are actually done
into the config space.

Any ideas?

Thank you
Shlomo Pongratz.



[PATCH] Fix iATU num viewports manipulation

2023-12-20 Thread Shlomo Pongratz
The number of iATU ports in the RP emulation is 4.
This value is exported via register at address 0x900
The specification states that the value in the resisetr
is 1 less then the actual number.
the Linux kernel routine dw_pcie_iatu_detect in
drivers/pci/controller/dwc/pcie-designware.c follows the
following protocol, fisrt it reads this register,
and if this value is not 0x it write 0xFF
to this registers and reads it back, then it set the number
of region to be 1 the number read plus 1.
Then the kernel code tries to initialize this number of inbound
and outbound entries.
The current code in QEMU just accepts the number given by the kernel
and returns it back without considering the implementation limit (4).
As a result, with the current code the kernel tries to initalizes
256 enties. This patch limits the number the kernel can set to
the value imposed by the implementation.

Signed-off-by: Shlomo Pongratz 
---
 hw/pci-host/designware.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/pci-host/designware.c b/hw/pci-host/designware.c
index f477f97847..4558d552ab 100644
--- a/hw/pci-host/designware.c
+++ b/hw/pci-host/designware.c
@@ -340,7 +340,8 @@ static void designware_pcie_root_config_write(PCIDevice *d, 
uint32_t address,
 break;
 
 case DESIGNWARE_PCIE_ATU_VIEWPORT:
-root->atu_viewport = val;
+root->atu_viewport = val < DESIGNWARE_PCIE_NUM_VIEWPORTS ?
+ val : (DESIGNWARE_PCIE_NUM_VIEWPORTS - 1);
 break;
 
 case DESIGNWARE_PCIE_ATU_LOWER_BASE:
-- 
2.25.1




Re: Usage of vfio-pci without KVM.

2023-09-12 Thread Shlomo Pongratz
Hi,
What can I as a user do to honor this requirement.
Are you suggesting that I should patch the QEMU code as it is not
supported out of the box?

Thank you.

S.P.

On Tue, Sep 12, 2023 at 3:58 PM Alex Williamson
 wrote:
>
> On Tue, 12 Sep 2023 14:47:41 +0200
> Philippe Mathieu-Daudé  wrote:
>
> > Cc'ing VFIO maintainers.
> >
> > On 12/9/23 14:39, Shlomo Pongratz wrote:
> > > Hi,
> > > I'm running qemu-system-aarch64 (QEMU emulator version 7.0.93) on
> > > Ubuntu 20.04.4 LTS i with Intel's i7.
> > > I'm trying to pass a Samsung NVME device using vfio-pci. I detached
> > > the device from the nvme driver and attached it to the vfio-pci.
> > > Using lspci I can see "Kernel driver in use: vfio-pci"
> > > In QEMU script I've written "-device vfio-pci,host=:03:00.0" where
> > > :03:00.0 is the device PCI address.
> > > I get the error
> > > qemu-system-aarch64: -device vfio-pci,host=:03:00.0: VFIO_MAP_DMA
> > > failed: Invalid argument
> > > qemu-system-aarch64: -device vfio-pci,host=:03:00.0: vfio
> > > :03:00.0: failed to setup container for group 15: memory listener
> > > initialization failed: Region mach-virt.ram:
> > > vfio_dma_map(0x55855c75bf00, 0x4000, 0x1, 0x7f5197e0)
> > > = -22 (Invalid argument
> > >
> > > My question is vfio-pci is supported with cross architecture?
>
> It does, but reserved address ranges need to be honored.  x86 has a
> reserved range at 0xfee0 for MSI mapping, so the VM address space
> needs to be such that it avoids trying to place mappings there.  Thanks,
>
> Alex
>



Re: [Qemu-devel] [RFC 1/5] hw/intc/arm_gicv3_common: Add state information

2016-02-24 Thread Shlomo Pongratz
On Monday, February 22, 2016, Peter Maydell 
wrote:

> From: Pavel Fedin >
>
> Add state information to GICv3 object structure and implement
> arm_gicv3_common_reset(). Also, add some functions for registers which are
> not stored directly but simulated.
>
> State information includes not only pure GICv3 data, but also some legacy
> registers. This will be useful for implementing software emulation of GICv3
> with v2 backwards compatilibity mode.
>
> Signed-off-by: Pavel Fedin >
> [PMM: significantly overhauled:
>  * Add missing qom/cpu.h include
>  * Remove legacy-only state fields (we can add them later if/when we add
>legacy emulation)
>  * Add various missing register offset #defines
>  * Accessor macros removed entirely
>  * Fields in state structures renamed to match architectural register names
>  * Corrected the reset value for GICR_IENABLER0 since we don't support
>legacy mode
>  * Added ARM_LINUX_BOOT_IF interface for "we are directly booting a kernel
> in
>non-secure" so that we can fake up the firmware-mandated reconfiguration
>only when we need it
> ]
> Signed-off-by: Peter Maydell >
> ---
>  hw/intc/arm_gicv3_common.c | 116 -
>  hw/intc/gicv3_internal.h   | 147
> +
>  include/hw/intc/arm_gicv3_common.h | 106 +-
>  3 files changed, 364 insertions(+), 5 deletions(-)
>  create mode 100644 hw/intc/gicv3_internal.h
>
> diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
> index e4f0f5a..a1ce4c1 100644
> --- a/hw/intc/arm_gicv3_common.c
> +++ b/hw/intc/arm_gicv3_common.c
> @@ -3,8 +3,9 @@
>   *
>   * Copyright (c) 2012 Linaro Limited
>   * Copyright (c) 2015 Huawei.
> + * Copyright (c) 2015 Samsung Electronics Co., Ltd.
>   * Written by Peter Maydell
> - * Extended to 64 cores by Shlomo Pongratz
> + * Reworked for GICv3 by Shlomo Pongratz and Pavel Fedin
>   *
>   * This program is free software; you can redistribute it and/or modify
>   * it under the terms of the GNU General Public License as published by
> @@ -21,7 +22,10 @@
>   */
>
>  #include "qemu/osdep.h"
> +#include "qom/cpu.h"
>  #include "hw/intc/arm_gicv3_common.h"
> +#include "gicv3_internal.h"
> +#include "hw/arm/linux-boot-if.h"
>
>  static void gicv3_pre_save(void *opaque)
>  {
> @@ -89,6 +93,8 @@ void gicv3_init_irqs_and_mmio(GICv3State *s,
> qemu_irq_handler handler,
>  static void arm_gicv3_common_realize(DeviceState *dev, Error **errp)
>  {
>  GICv3State *s = ARM_GICV3_COMMON(dev);
> +Object *cpu;
> +int i;
>
>  /* revision property is actually reserved and currently used only in
> order
>   * to keep the interface compatible with GICv2 code, avoiding extra
> @@ -99,11 +105,111 @@ static void arm_gicv3_common_realize(DeviceState
> *dev, Error **errp)
>  error_setg(errp, "unsupported GIC revision %d", s->revision);
>  return;
>  }
> +
> +if (s->num_irq > GICV3_MAXIRQ) {
> +error_setg(errp,
> +   "requested %u interrupt lines exceeds GIC maximum %d",
> +   s->num_irq, GICV3_MAXIRQ);
> +return;
> +}
> +
> +s->cpu = g_new0(GICv3CPUState, s->num_cpu);
> +
> +for (i = 0; i < s->num_cpu; i++) {
> +cpu = OBJECT(qemu_get_cpu(i));
> +s->cpu[i].affinity_id = object_property_get_int(cpu,
> "mp-affinity",
> +NULL);
> +}
>  }
>
>  static void arm_gicv3_common_reset(DeviceState *dev)
>  {
> -/* TODO */
> +GICv3State *s = ARM_GICV3_COMMON(dev);
> +int i;
> +
> +for (i = 0; i < s->num_cpu; i++) {
> +GICv3CPUState *c = &s->cpu[i];
> +
> +c->gicr_waker = GICR_WAKER_ProcessorSleep |
> GICR_WAKER_ChildrenAsleep;
> +c->gicr_ctlr = 0;
> +c->gicr_propbaser = 0;
> +c->gicr_pendbaser = 0;
> +/* If we're resetting a TZ-aware GIC as if secure firmware
> + * had set it up ready to start a kernel in non-secure, we
> + * need to set interrupts to group 1 so the kernel can use them.
> + * Otherwise they reset to group 0 like the hardware.
> + */
> +if (s->security_extn && s->irq_reset_nonsecure) {
> +c->gicr_igroupr0 = 0x;
> +} else {
> +c->gicr_igroupr0 = 0;
> +}
> +
> +c->gicr_ienabler0 = 0;
> +c->gicr_ipendr0 = 0;
> +c-&g

Re: [Qemu-devel] [PATCH RFC V5 0/9] Implement GIC-500 from GICv3 family for arm64

2016-02-21 Thread Shlomo Pongratz
On Tuesday, February 16, 2016, Shlomo Pongratz 
wrote:

>
>
> On Tuesday, February 16, 2016, Peter Maydell  > wrote:
>
>> On 31 January 2016 at 15:54, Shlomo Pongratz 
>> wrote:
>> > I will do a new revision of the GICv3.
>> > I needed to get a time slot from my employee in order to do the work
>> and I
>> > got one starting next week.
>>
>> Hi Shlomo; I was just wondering how you were getting on with this?
>>
>> I ask because now I'm getting to the end of the TrustZone stuff
>> I've been working on I expect to have more time to help with
>> review/code/whatever for the GICv3. So I don't want to tread on
>> your toes with stuff you're working on now, but if you do find
>> you don't have time to make forward progress could you let me know?
>> (or if there's something you want help with do say.)
>>
>> thank
>> -- PMM
>>
>
> Hi,
>
> I just got a go ahead.
> I'm reviewing all the comments and pulling latest revision.
> I assume it will take a week or two.
> Best regards,
>
> S.P.
>
>

Hi,

Working of re-basing the GICv3 I've noticed in the mailing list Pavel's
patch series named "GICv3 live migration support". This patch series
modifies GICv3 structures. I can see that this patch series was not merged
in the latest git that I've pulled (commit 80b5d6bfc). So me question is,
should I use the relevant patches in my patch series or ignore them?

Best regards,

S.P.


Re: [Qemu-devel] [PATCH RFC V5 0/9] Implement GIC-500 from GICv3 family for arm64

2016-01-31 Thread Shlomo Pongratz
On Friday, January 29, 2016, Christopher Covington 
wrote:

> On 10/20/2015 01:22 PM, Shlomo Pongratz wrote:
> > From: Shlomo Pongratz >
> >
> > This patch is a first step multicores support for arm64.
> >
> > This implemntation was tested up to 100 cores.
> >
> > Things left to do:
> >
> > Support SPI, ITS and ITS CONTROL, note that this patch porpose is to
> enable
> > running multi cores using the "virt" virtual machine and this goal is
> achived
> > without that.
> >
> > Add GICv2 backwards competability. Since there is a GICv2 implementation
> I
> > can't see the pusprose for it.
> >
> > Special thanks to Peter Crostwaite whose patch to th Linux (kernel) i.e.
> > Implement cpu_relax as yield solved the problem of the boot process
> getting
> > stuck for 24 cores and more.
> >
> > Figure out why virtual machine name changed from virt-v3 to
> virt-v3-machine
>
> Hi Shlomo,
>
> Were you planning on another revision of this patchset? Are there any
> things you would like help with?
>
> Peter,
>
> Do you have any thoughts about what is essential and what isn't for a
> first wave of TCG GICv3 patches to be mergeable?
>
> Thanks,
> Christopher Covington
>
> --
> Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> a Linux Foundation Collaborative Project
>

Hi,

I will do a new revision of the GICv3.
I needed to get a time slot from my employee in order to do the work and I
got one starting next week.

Best regards,

S.P.


Re: [Qemu-devel] [PATCH v2] target-arm: Extract some external ARM CPU API

2015-10-28 Thread Shlomo Pongratz


-Original Message-
From: Pavel Fedin [mailto:p.fe...@samsung.com] 
Sent: 2015年10月28日 19:13
To: 'Peter Crosthwaite'; 'Paolo Bonzini'
Cc: 'Peter Maydell'; 'Shlomo Pongratz'; 'QEMU Developers'; Shlomo Pongratz
Subject: RE: [Qemu-devel] [PATCH v2] target-arm: Extract some external ARM CPU 
API

 Hello!

>  Ok, so decided. I will convert my code, test the build and send a 
> small patch for this soon, perhaps today.

 arm_gicv3_kvm.o is already in obj-y, and arm_gicv3_common.o does not use any 
of those definitions. So, nothing to move, there will be no patch.

 So far, we have only this small leftover: 
http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg02349.html. Needed 
both by live migration and SW emulation of GICv3.

 Shlomo: Just add your GICv3 code to obj-$(CONFIG_ARM_GIC), and you'll be able 
to include things you need.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia


Hi Pavel,

Thanks, I'm currently in business trip to China.
I'll resume working on it in approx two weeks.
Also please excuse me for using outlook (if the mail is unreadable).

Best regards,

S.P.



Re: [Qemu-devel] [RFC PATCH v3 1/4] hw/intc/arm_gicv3_common: Add state information

2015-10-24 Thread Shlomo Pongratz
Comment on the "workaround" see inline.

On Friday, October 23, 2015, Peter Maydell  wrote:

> On 22 October 2015 at 15:02, Pavel Fedin  > wrote:
> > Add state information to GICv3 object structure and implement
> > arm_gicv3_common_reset(). Also, add some functions for registers which
> are
> > not stored directly but simulated.
> >
> > State information includes not only pure GICv3 data, but also some legacy
> > registers. This will be useful for implementing software emulation of
> GICv3
> > with v2 backwards compatilibity mode.
>
> So we're going to (potentially) emulate:
>  * non-system-register config
>  * non-affinity-routing config
>
> ? OK, but are we really sure we want to do that? Legacy config
> is deprecated and it's really going to complicate the model...
>
> (Are we starting out with the non-legacy config that forces
> system-regs and affinity-routing to always-on?)
>
> > Signed-off-by: Pavel Fedin >
> > ---
> >  hw/intc/arm_gicv3_common.c | 127 +-
> >  hw/intc/gicv3_internal.h   | 265
> +
> >  include/hw/intc/arm_gicv3_common.h |  93 -
> >  3 files changed, 480 insertions(+), 5 deletions(-)
> >  create mode 100644 hw/intc/gicv3_internal.h
> >
> > diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
> > index 032ece2..2082d05 100644
> > --- a/hw/intc/arm_gicv3_common.c
> > +++ b/hw/intc/arm_gicv3_common.c
> > @@ -3,8 +3,9 @@
> >   *
> >   * Copyright (c) 2012 Linaro Limited
> >   * Copyright (c) 2015 Huawei.
> > + * Copyright (c) 2015 Samsung Electronics Co., Ltd.
> >   * Written by Peter Maydell
> > - * Extended to 64 cores by Shlomo Pongratz
> > + * Reworked for GICv3 by Shlomo Pongratz and Pavel Fedin
> >   *
> >   * This program is free software; you can redistribute it and/or modify
> >   * it under the terms of the GNU General Public License as published by
> > @@ -21,6 +22,7 @@
> >   */
> >
> >  #include "hw/intc/arm_gicv3_common.h"
> > +#include "gicv3_internal.h"
> >
> >  static void gicv3_pre_save(void *opaque)
> >  {
> > @@ -88,6 +90,8 @@ void gicv3_init_irqs_and_mmio(GICv3State *s,
> qemu_irq_handler handler,
> >  static void arm_gicv3_common_realize(DeviceState *dev, Error **errp)
> >  {
> >  GICv3State *s = ARM_GICV3_COMMON(dev);
> > +Object *cpu;
> > +unsigned int i, j;
> >
> >  /* revision property is actually reserved and currently used only
> in order
> >   * to keep the interface compatible with GICv2 code, avoiding extra
> > @@ -98,11 +102,130 @@ static void arm_gicv3_common_realize(DeviceState
> *dev, Error **errp)
> >  error_setg(errp, "unsupported GIC revision %d", s->revision);
> >  return;
> >  }
> > +
> > +if (s->num_irq > GICV3_MAXIRQ) {
> > +error_setg(errp,
> > +   "requested %u interrupt lines exceeds GIC maximum
> %d",
> > +   s->num_irq, GICV3_MAXIRQ);
> > +return;
> > +}
> > +
> > +s->cpu = g_malloc(s->num_cpu * sizeof(GICv3CPUState));
>
> g_new0(GICv3CPUState, s->num_cpu);
>
> > +
> > +for (i = 0; i < s->num_cpu; i++) {
> > +for (j = 0; j < GIC_NR_SGIS; j++) {
> > +s->cpu[i].sgi[j].pending =
> g_malloc(BITS_TO_LONGS(s->num_cpu));
> > +s->cpu[i].sgi[j].size = s->num_cpu;
> > +}
> > +
> > +cpu = OBJECT(qemu_get_cpu(i));
> > +s->cpu[i].affinity_id = object_property_get_int(cpu,
> "mp-affinity",
> > +NULL);
> > +}
> >  }
> >
> >  static void arm_gicv3_common_reset(DeviceState *dev)
> >  {
> > -/* TODO */
> > +GICv3State *s = ARM_GICV3_COMMON(dev);
> > +unsigned int i, j;
> > +
> > +for (i = 0; i < s->num_cpu; i++) {
> > +GICv3CPUState *c = &s->cpu[i];
> > +
> > +c->cpu_enabled = false;
> > +c->redist_ctlr = 0;
> > +c->propbaser =
> GICR_PROPBASER_InnerShareable|GICR_PROPBASER_WaWb;
> > +c->pendbaser =
> GICR_PENDBASER_InnerShareable|GICR_PENDBASER_WaWb;
> > +/* Workaround!
> > + * Linux (drivers/irqchip/irq-gic-v3.c) is enabling only group
> one,
> > + * in gic_cpu_sys_reg_init it calls gic_write_grpen1(1);
> > +  

Re: [Qemu-devel] [PATCH RFC V5 8/9] target-arm/cpu64 GICv3 system instructions support

2015-10-22 Thread Shlomo Pongratz
Hi,

On Thursday, October 22, 2015, Pavel Fedin  wrote:

>  Hello!
>
> >> It has nothing to do with KVM. EFI is a firmware, which originates from
> Intel, but now adopted by ARM64 architecture too. You can also run
> >> it under qemu, if you want to make kind of "full" machine. And it
> writes some value to BPR1, which is indeed ignored by Linux kernel.
>
> > So were is it used in QEMU?
>
> You can configure any ARM machine to use firmware blob instead of kernel +
> device tree. This way it looks more like a real machine.
>
>
I don't have any "blob" I can only run virt.


> > Which machine in hw/arm needs it?
>
>  We ran it on virt, for example.
>
>
And how did you accessed the register and for what purpose?
What should be the expected outcome.


> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>  In general I can't release something that I don't understand and can't
test.


Re: [Qemu-devel] [PATCH RFC V5 8/9] target-arm/cpu64 GICv3 system instructions support

2015-10-22 Thread Shlomo Pongratz
Hi

On Thursday, October 22, 2015, Pavel Fedin  wrote:

>  Hello!
>
> > I've implemented the registers accessed by Linux driver in
> drivers/irqchip/irq-gic-v3.c
> > If this register is used only with KVM e.g. virt/kvm/arm/vgic-v3.c than
> it is out of my mandate.
>
>  It has nothing to do with KVM. EFI is a firmware, which originates from
> Intel, but now adopted by ARM64 architecture too. You can also run it under
> qemu, if you want to make kind of "full" machine. And it writes some value
> to BPR1, which is indeed ignored by Linux kernel.
>
>
So were is it used in QEMU?
Which machine in hw/arm needs it?


> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>


Re: [Qemu-devel] [PATCH RFC V5 8/9] target-arm/cpu64 GICv3 system instructions support

2015-10-22 Thread Shlomo Pongratz
Hi

On Thursday, October 22, 2015, Pavel Fedin  wrote:

>  Hello!
>
> > -Original Message-
> > From: Shlomo Pongratz [mailto:shlomopongr...@gmail.com ]
> > Sent: Tuesday, October 20, 2015 8:22 PM
> > To: qemu-devel@nongnu.org 
> > Cc: p.fe...@samsung.com ; peter.mayd...@linaro.org
> ; eric.au...@linaro.org ;
> > shannon.z...@linaro.org ; imamm...@redhat.com
> ; ash...@broadcom.com ; Shlomo Pongratz
> > Subject: [PATCH RFC V5 8/9] target-arm/cpu64 GICv3 system instructions
> support
> >
> > From: Shlomo Pongratz >
> >
> > Add system instructions used by the Linux (kernel) GICv3
> > device driver
> >
> > Signed-off-by: Shlomo Pongratz  >
> > ---
> >  target-arm/cpu-qom.h |   1 +
> >  target-arm/cpu.h |  12 ++
> >  target-arm/cpu64.c   | 118
> +++
> >  3 files changed, 131 insertions(+)
> >
> > diff --git a/target-arm/cpu-qom.h b/target-arm/cpu-qom.h
> > index 25fb1ce..6a50433 100644
> > --- a/target-arm/cpu-qom.h
> > +++ b/target-arm/cpu-qom.h
> > @@ -220,6 +220,7 @@ hwaddr arm_cpu_get_phys_page_debug(CPUState *cpu,
> vaddr addr);
> >
> >  int arm_cpu_gdb_read_register(CPUState *cpu, uint8_t *buf, int reg);
> >  int arm_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
> > +void aarch64_registers_with_opaque_set(Object *obj, void *opaque);
> >
> >  /* Callback functions for the generic timer's timers. */
> >  void arm_gt_ptimer_cb(void *opaque);
> > diff --git a/target-arm/cpu.h b/target-arm/cpu.h
> > index 3daa7f5..d561313 100644
> > --- a/target-arm/cpu.h
> > +++ b/target-arm/cpu.h
> > @@ -1034,6 +1034,18 @@ void armv7m_nvic_set_pending(void *opaque, int
> irq);
> >  int armv7m_nvic_acknowledge_irq(void *opaque);
> >  void armv7m_nvic_complete_irq(void *opaque, int irq);
> >
> > +void armv8_gicv3_set_sgi(void *opaque, int cpuindex, uint64_t value);
> > +uint64_t armv8_gicv3_acknowledge_irq(void *opaque, int cpuindex,
> > +  MemTxAttrs attrs);
> > +void armv8_gicv3_complete_irq(void *opaque, int cpuindex, int irq,
> > +  MemTxAttrs attrs);
> > +uint64_t armv8_gicv3_get_priority_mask(void *opaque, int cpuindex);
> > +void armv8_gicv3_set_priority_mask(void *opaque, int cpuindex, uint32_t
> mask);
> > +uint64_t armv8_gicv3_get_sre(void *opaque);
> > +void armv8_gicv3_set_sre(void *opaque, uint64_t sre);
> > +uint64_t armv8_gicv3_get_igrpen1(void *opaque, int cpuindex);
> > +void armv8_gicv3_set_igrpen1(void *opaque, int cpuindex, uint64_t
> igrpen1);
> > +
> >  /* Interface for defining coprocessor registers.
> >   * Registers are defined in tables of arm_cp_reginfo structs
> >   * which are passed to define_arm_cp_regs().
> > diff --git a/target-arm/cpu64.c b/target-arm/cpu64.c
> > index 63c8b1c..4224779 100644
> > --- a/target-arm/cpu64.c
> > +++ b/target-arm/cpu64.c
> > @@ -45,6 +45,115 @@ static uint64_t a57_a53_l2ctlr_read(CPUARMState
> *env, const ARMCPRegInfo
> > *ri)
> >  }
> >  #endif
> >
> > +#ifndef CONFIG_USER_ONLY
> > +static void sgi_write(CPUARMState *env, const ARMCPRegInfo *ri,
> uint64_t value)
> > +{
> > +CPUState *cpu = ENV_GET_CPU(env);
> > +armv8_gicv3_set_sgi(ri->opaque, cpu->cpu_index, value);
> > +}
> > +
> > +static uint64_t iar_read(CPUARMState *env, const ARMCPRegInfo *ri)
> > +{
> > +uint64_t value;
> > +MemTxAttrs attrs;;
> > +CPUState *cpu = ENV_GET_CPU(env);
> > +attrs.secure = arm_is_secure_below_el3(env) ? 1 : 0;
> > +value = armv8_gicv3_acknowledge_irq(ri->opaque, cpu->cpu_index,
> attrs);
> > +return value;
> > +}
> > +
> > +static void sre_write(CPUARMState *env, const ARMCPRegInfo *ri,
> uint64_t value)
> > +{
> > +armv8_gicv3_set_sre(ri->opaque, value);
> > +}
> > +
> > +static uint64_t sre_read(CPUARMState *env, const ARMCPRegInfo *ri)
> > +{
> > +uint64_t value;
> > +value = armv8_gicv3_get_sre(ri->opaque);
> > +return value;
> > +}
> > +
> > +static void eoir_write(CPUARMState *env, const ARMCPRegInfo *ri,
> uint64_t value)
> > +{
> > +MemTxAttrs attrs;
> > +CPUState *cpu = ENV_GET_CPU(env);
> > +attrs.secure = arm_is_secure_below_el3(env) ? 1 : 0;
> > +armv8_gicv3_complete_irq(ri->opaque, cpu->cpu_index, value, attrs);
> > +}
> > +
> > +static uint64_t pmr_read(CPUARMState *env, const ARMCPR

Re: [Qemu-devel] [PATCH RFC V5 1/9] hw/intc: Implement GIC-500 support files

2015-10-21 Thread Shlomo Pongratz
O.K.

On Wednesday, October 21, 2015, Pavel Fedin  wrote:

>  Hello!
>
> > Do you mean that in virt.c::create_gic I'll take the cpu's affinity from
> the cpu's property and not directly from
> > ARM_CPU(qemu_get_cpu(i))->mp_affinity
>
>  I mean that you can do it in your GIC's realize function. And, even
> better, in arm_gicv3_common_realize(), because KVM GICv3 live migration
> code will also need it:
> --- cut ---
> for (i = 0; i < s->num_cpu; i++) {
> Object *cpu = OBJECT(qemu_get_cpu(i));
> s->cpu[i].mp_affinity = object_property_get_int(cpu, "mp-affinity",
> NULL);
> }
> --- cut ---
>
> > I can do that but that depends on acceptance of your patch.
>
>  Peter ACKed it, just he doesn't like having unused code:
> http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg03105.html
>  Just include it into your next respin and forget. :) Actually, i made my
> RFC so that you could just take 0001 and 0002 from it and use for your
> purpose. With additions, of course, if necessary.
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>
>


Re: [Qemu-devel] [PATCH RFC V5 6/9] hw/intc: arm_gicv3_spi_its

2015-10-21 Thread Shlomo Pongratz
On Wednesday, October 21, 2015, Pavel Fedin  wrote:

>  Hello!
>
> > I just added a placeholder, I didn't add any functionality.
>
>  I see. Just wanted to say, that if we accept my proposal (implementing
> ITS as a separate object), then the only thing we would do with this
> placeholder is to remove it. It should go then to something like
> hw/intc/arm_gicv3_its.c and its object should inherit from
> TYPE_ARM_GICV3_ITS_COMMON.
>  Or do you have some explicit reasons to have everything as a monolith?
>

No I just didn't want to have 3 stub files spi, its and its_control.
Do you suggest that I'll split it to 3 files?

>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>
>


Re: [Qemu-devel] [PATCH RFC V5 6/9] hw/intc: arm_gicv3_spi_its

2015-10-21 Thread Shlomo Pongratz
Hi,

As far as I understand Figure 1 in GICv3 architecture document the
interrupts from device goes to the distributor and from it to the
re-distributors or from the deices via the ITS to the re-distributors.
So eventually ITS should be part of the GIC. That is if the ITS is a
different entity how do you see it work with the rest of the GIC?


On Wednesday, October 21, 2015, Pavel Fedin  wrote:

>  Hello!
>
> >> Or do you have some explicit reasons to have everything as a monolith?
> > No I just didn't want to have 3 stub files spi, its and its_control.
> > Do you suggest that I'll split it to 3 files?
>
>  You didn't understand my question. It's not about internal structure of
> ITS implementation. It is about GIC and ITS connection.
>  Please review my KVM ITS RFC:
> http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg04613.html.
> You'll see that ITS is a separate object of a separate class, which can
> even be omitted at all, if machine model doesn't need it for some reason.
> So, i suggest that all the ITS code should go there, and it would be a
> completely separate entity, and a separate patch set, after your GICv3 is
> accepted. I will help you with this.
>  Peter, i know you can be very busy, but, could you at least take a glance
> at my vITS v2 RFC structure and judge us? Should ITS + GICv3 be a
> monolithic object, or is my suggestion better?
>  By the way, gicv3_init_irqs_and_mmio() expects only two regions, so it
> will not even pay attention to your stubs. You could patch it, of course,
> but... I don't think it's the good thing to do.
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>
>


Re: [Qemu-devel] [PATCH RFC V5 6/9] hw/intc: arm_gicv3_spi_its

2015-10-21 Thread Shlomo Pongratz
Hi,

I just added a placeholder, I didn't add any functionality.

On Wednesday, October 21, 2015, Pavel Fedin  wrote:

>  Hello!
>
> > This patch includes a placeholder code for future spi and its
> > implementation.
>
>  Forgot to comment on this. I see that here you are building an ITS into
> GIC as a monolithic thing. This can be wrong because we
> could want to emulate platforms which have GICv3 but don't have ITS. I
> would suggest to implement ITS as a separate class, and i
> have actually done it in
> http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg04613.html.
>  So, i think we should not bother about ITS for now and suggest you to
> leave it out completely and focus on pure GICv3.
>  Even more, i have software-emulated ITS code in one of my abandoned
> branches, it lacks only GICv3's ability to receive LPIs (which
> is indeed GIC's functionality). After you're done with GIC, i could rebase
> it on top and post at least as an RFC. May be i'll
> complete it myself, may be you'll want to pick it up, i don't know.
> Actually, it works with Linux kernel and successfully translates
> an MSI event into LPI number, which is then injected into the GIC object.
> Just it's GIC missing the appropriate LPI hanlders. I
> wrote it when there was no in-kernel vITS implementation, but abandoned
> when vITS patch series was published.
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>
>


Re: [Qemu-devel] [PATCH RFC V5 9/9] hw/arm: Add virt-v3 machine that uses GIC-500

2015-10-21 Thread Shlomo Pongratz
O.K.

On Wednesday, October 21, 2015, Pavel Fedin  wrote:

>  Hello!
>
> >> The system register implementation belongs in the gic code, not
> >> target-arm/. We already have support for devices that say
> >> "I have some system registers, please add them to this CPU".
>
> > I don't understand.
> > The system registers are defined in ARM Architecture reference Manual.
> > It is true that the real implementation is in arm_gicv3_interrupts.c
> > But the crn, crm, op0, and op1 of the instructions are in CPU domain.
>
>  Not really. If you take a closer look, you'll see that crn, crm, op1, op2
> are the same for both ARM64 and ARM32. The only difference is that ARM64
> uses op0 = 3, and ARM32 uses cp = 15. Both of these specifiers can coexist
> in the descriptor table.
>  I think Peter wants to tell that you should not insert your register
> table and registration function into target-arm/cpu64.c. This code should
> go to hw/intc/arm_gicv3_cpu_interface.c, add .cp = 15, and - voila - it
> magically works with both ARM32 and ARM64.
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>
>


Re: [Qemu-devel] [PATCH RFC V5 6/9] hw/intc: arm_gicv3_spi_its

2015-10-21 Thread Shlomo Pongratz
I just wanted to understand. I don't have any preferences.

On Wednesday, October 21, 2015, Pavel Fedin  wrote:

>  Hello!
>
> > As far as I understand Figure 1 in GICv3 architecture document the
> interrupts from device goes to the distributor and from it to the
> > re-distributors or from the deices via the ITS to the re-distributors.
> > So eventually ITS should be part of the GIC. That is if the ITS is a
> different entity how do you see it work with the rest of the GIC?
>
>  Easy. As i said before, what ITS actually does is transforming device ID
> + event ID into a single LPI number. Then, an LPI goes to the
> redistributor. Therefore, all we need from GIC is a method like
> inject_lpi(int lpi). Well, may be a couple more, for moving LPIs between
> collections. But this is not a problem and makes up a nice interface.
>  Even on the picture you mentioned ITS is a kind of separate thing. It can
> even be completely missing.
>
>  Well, my reasons end here. If you still disagree... Do what you want
> then, it will not be a problem to remove these unused stubs.
>  Peter, we are summoning you. :)
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>
>


Re: [Qemu-devel] [PATCH RFC V5 9/9] hw/arm: Add virt-v3 machine that uses GIC-500

2015-10-21 Thread Shlomo Pongratz
On Wednesday, October 21, 2015, Peter Maydell 
wrote:

> On 21 October 2015 at 12:33, Shlomo Pongratz  > wrote:
> > I assume I can add the system registers to target-arm/cpu.c but I wonder
> if
> > someone really needs to simulate more than 8 AArch32 CPU(s)
>
> The system register implementation belongs in the gic code, not
> target-arm/. We already have support for devices that say
> "I have some system registers, please add them to this CPU".
>
>
I don't understand.
The system registers are defined in ARM Architecture reference Manual.
It is true that the real implementation is in arm_gicv3_interrupts.c
But the crn, crm, op0, and op1 of the instructions are in CPU domain.


> The mechanism is the same for system registers for both 32-bit
> and 64-bit, incidentally.
>
> I agree.


> thanks
> -- PMM
>


Re: [Qemu-devel] [PATCH RFC V5 1/9] hw/intc: Implement GIC-500 support files

2015-10-21 Thread Shlomo Pongratz
See inline

On Wednesday, October 21, 2015, Pavel Fedin  wrote:

>  Hello!
>
> >> See this:
> http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg02349.html.
> This is also a part of my live migration RFC.
> >> I remember that Peter told long time ago that "it should really be a
> property", when i integrated full affinity support. But, he currently
> >> refused to accept this small standalone patch because there are no
> users for now. My GICv3 live migration is waiting for kernel API to be
> >> ready. And kernel API is waiting for kernel 4.5 development cycle to
> begin.
> > So please resubmit it and mention me as a client.
>
>  Ok, i'll PING, but you can also include it into your patchset. BTW, why
> is it still RFC?
>

It is still in RFC because there are still many comments as you can see :-)

>
> > But I wonder if accessing the property in real time (not in only in
> initialization) from the GIC code will have impact on performance.
>
>  It can, but you can cache them during realize. For example, if you accept
> my data layout, then you can just add "uint64_t mp_affinity" to
> GICv3CPUState.
>
>
Do you mean that in virt.c::create_gic I'll take the cpu's affinity from
the cpu's property and not directly from
ARM_CPU(qemu_get_cpu(i))->mp_affinity
I can do that but that depends on acceptance of your patch.


> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>
>


Re: [Qemu-devel] [PATCH RFC V5 1/9] hw/intc: Implement GIC-500 support files

2015-10-21 Thread Shlomo Pongratz
See inline.

On Wednesday, October 21, 2015, Pavel Fedin  wrote:

>  Hello!
>
> > I can't find the patch that handles the for example modification of
> "uint8_t sgi_pending[GIC_NR_SGIS][GIC_NCPU]" to fixed-size bitmaps. as
> > GIC_NCPU is not a constant and uint8_t need to have the size of nubmer
> of CPUs which is no longer bounded.
>
>  This is the only thing which i excluded, because my vGIC implementation
> didn't need these flags. I know that you cannot actually omit them in
> software emulation, because they are needed in order to handle a situation
> where more than one CPU sends the same SGI number.
>  I think you can place it in distributor, in "internal state" section, if
> you look at my patch.
>
> I'll modify the reset along your suggestion.


> > Are you sure that the semantics is the same? In section 4.1.4 of GICv3 I
> see that the security level is a combination of two registers GRPMOD > and
> GROUP and I wanted to be prepared for it.
>
>  Looks like you have some private NDA version of the manual, because my
> one (downloaded from infocenter) doesn't have a paragraph numbered 4.1.4.
> Anyway, after reading my one, i think that GRPMOD simply overrides what is
> specified in GROUP. I cannot find such thing as "group 2" anywhere, and all
> the text starts with "In a GIC which supports two security states". So,
> there's only non-secure and secure state, nothing else.
>  Again, nothing changes since ARM32.
>
>
I'll re-examine the document and see if this is relevant.

> I assume you want to distinguish between Secure EL1 and Secure EL3 (in
> the future).
>
>  As far as i understand, EL3 is what was called "monitor" in ARM32, and i
> would still prefer to call it this way, because it is more meaningful than
> those stupid (IMHO) numbers. The only special thing about this mode is that
> it allows you to freely switch between secure and non-secure states. So,
> again, there's nothing special about "secure EL3".
>
>  Peter, please correct me if i'm wrong.
>
> > I don't have access to the internal CPU's data structures in the GIC's
> core, its an opaque structure to the GIC.
> > Including the CPU include files doesn't seems to work.
>
>  See this:
> http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg02349.html.
> This is also a part of my live migration RFC.
>  I remember that Peter told long time ago that "it should really be a
> property", when i integrated full affinity support. But, he currently
> refused to accept this small standalone patch because there are no users
> for now. My GICv3 live migration is waiting for kernel API to be ready. And
> kernel API is waiting for kernel 4.5 development cycle to begin.
>
> So please resubmit it and mention me as a client.
But I wonder if accessing the property in real time (not in only in
initialization) from the GIC code will have impact on performance.

Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>
>


Re: [Qemu-devel] [PATCH RFC V5 9/9] hw/arm: Add virt-v3 machine that uses GIC-500

2015-10-21 Thread Shlomo Pongratz
See inline

On Wednesday, October 21, 2015, Pavel Fedin  wrote:

>  Hello!
>
> > I see, just how do I pass the gic version from the command line?
>
>  Easy. -machine virt,gic-version=3
>
> > GICV3 is accessed by system instructions that exists only in ARCH64.
>
>  Wrong. In 32-bit mode the CPU sees them as CP15 registers. See "8.5
> AArch32 System register descriptions".
>  I would say, system registers are nothing new, these are just renamed
> CP15 registers, with "CPn" prefix removed.
>

I assume I can add the system registers to target-arm/cpu.c but I wonder if
someone really needs to simulate more than 8 AArch32 CPU(s)


> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>
>


Re: [Qemu-devel] [PATCH RFC V5 9/9] hw/arm: Add virt-v3 machine that uses GIC-500

2015-10-21 Thread Shlomo Pongratz
See inline.

On Wednesday, October 21, 2015, Pavel Fedin  wrote:

>  Hello!
>
> > -Original Message-
> > From: Shlomo Pongratz [mailto:shlomopongr...@gmail.com ]
> > Sent: Tuesday, October 20, 2015 8:22 PM
> > To: qemu-devel@nongnu.org 
> > Cc: p.fe...@samsung.com ; peter.mayd...@linaro.org
> ; eric.au...@linaro.org ;
> > shannon.z...@linaro.org ; imamm...@redhat.com
> ; ash...@broadcom.com ; Shlomo Pongratz
> > Subject: [PATCH RFC V5 9/9] hw/arm: Add virt-v3 machine that uses GIC-500
> >
> > From: Shlomo Pongratz >
> >
> > Add virt-v3 machine that uses GIC-500.
>
>  We already concluded long time ago that adding virt-v3 machine is wrong
> approach. We already have gic-version property for the virt
> machine, and especially for you i left a TODO placeholder in
> gicv3_class_name() function. Just return name of your class there, and
> you're done.
>
>
I see, just how do I pass the gic version from the command line?


> > Registers the CPU system instructions and bind them to the GIC.
> > Pass the cores' affinity to the GIC.
> >
> > Signed-off-by: Shlomo Pongratz  >
> > ---
> >  hw/arm/virt.c | 87
> ++-
> >  include/hw/arm/fdt.h  |  2 ++
> >  include/hw/arm/virt.h |  1 +
> >  target-arm/machine.c  |  7 -
> >  4 files changed, 88 insertions(+), 9 deletions(-)
> >
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 5d38c47..d4fb8f3 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -71,6 +71,7 @@ typedef struct VirtBoardInfo {
> >  uint32_t clock_phandle;
> >  uint32_t gic_phandle;
> >  uint32_t v2m_phandle;
> > +const char *class_name;
> >  } VirtBoardInfo;
> >
> >  typedef struct {
> > @@ -86,6 +87,7 @@ typedef struct {
> >  } VirtMachineState;
> >
> >  #define TYPE_VIRT_MACHINE   MACHINE_TYPE_NAME("virt")
> > +#define TYPE_VIRTV3_MACHINE MACHINE_TYPE_NAME("virt-v3")
> >  #define VIRT_MACHINE(obj) \
> >  OBJECT_CHECK(VirtMachineState, (obj), TYPE_VIRT_MACHINE)
> >  #define VIRT_MACHINE_GET_CLASS(obj) \
> > @@ -111,6 +113,7 @@ static const MemMapEntry a15memmap[] = {
> >  [VIRT_CPUPERIPHS] = { 0x0800, 0x0002 },
> >  /* GIC distributor and CPU interfaces sit inside the CPU peripheral
> space */
> >  [VIRT_GIC_DIST] =   { 0x0800, 0x0001 },
> > +/* Note on GICv3 VIRT_GIC_DIST_SPI takes place of VIRT_GIC_CPU */
>
>  Memory map of the "virt" machine already contains everything needed. BTW,
> what's "SPI"? "MBI", you meant? Well, we are going to
> have an ITS, so this simple message translator will not be needed.
>
Typo.

>
> >  [VIRT_GIC_CPU] ={ 0x0801, 0x0001 },
> >  [VIRT_GIC_V2M] ={ 0x0802, 0x1000 },
> >  /* The space in between here is reserved for GICv3 CPU/vCPU/HYP */
> > @@ -145,21 +148,26 @@ static VirtBoardInfo machines[] = {
> >  .cpu_model = "cortex-a15",
> >  .memmap = a15memmap,
> >  .irqmap = a15irqmap,
> > +.class_name = TYPE_ARM_CPU,
> > +
> >  },
> >  {
> >  .cpu_model = "cortex-a53",
> >  .memmap = a15memmap,
> >  .irqmap = a15irqmap,
> > +.class_name = TYPE_AARCH64_CPU,
> >  },
> >  {
> >  .cpu_model = "cortex-a57",
> >  .memmap = a15memmap,
> >  .irqmap = a15irqmap,
> > +.class_name = TYPE_AARCH64_CPU,
> >  },
> >  {
> >  .cpu_model = "host",
> >  .memmap = a15memmap,
> >  .irqmap = a15irqmap,
> > +.class_name = TYPE_ARM_CPU,
> >  },
> >  };
> >
> > @@ -228,7 +236,6 @@ static void fdt_add_psci_node(const VirtBoardInfo
> *vbi)
> >  if (armcpu->psci_version == 2) {
> >  const char comp[] = "arm,psci-0.2\0arm,psci";
> >  qemu_fdt_setprop(fdt, "/psci", "compatible", comp,
> sizeof(comp));
> > -
> >  cpu_off_fn = QEMU_PSCI_0_2_FN_CPU_OFF;
> >  if (arm_feature(&armcpu->env, ARM_FEATURE_AARCH64)) {
> >  cpu_suspend_fn = QEMU_PSCI_0_2_FN64_CPU_SUSPEND;
> > @@ -241,7 +248,6 @@ static void fdt_add_psci_node(const VirtBoardInfo
> *vbi)
> >  }
> >  } else {
> >  qemu_fdt_setprop_string(fdt, "/p

[Qemu-devel] [PATCH RFC V5 0/9] Implement GIC-500 from GICv3 family for arm64

2015-10-20 Thread Shlomo Pongratz
From: Shlomo Pongratz 

This patch is a first step multicores support for arm64.

This implemntation was tested up to 100 cores.

Things left to do:

Support SPI, ITS and ITS CONTROL, note that this patch porpose is to enable
running multi cores using the "virt" virtual machine and this goal is achived
without that.

Add GICv2 backwards competability. Since there is a GICv2 implementation I
can't see the pusprose for it.

Special thanks to Peter Crostwaite whose patch to th Linux (kernel) i.e.
Implement cpu_relax as yield solved the problem of the boot process getting
stuck for 24 cores and more.

Figure out why virtual machine name changed from virt-v3 to virt-v3-machine

V5->V4
  - Remove the 64 cores limit by using bitmaps.
  - Break the GICv3 code to multiple files e.g. distributer, redistributer, etc,
and submit each file in a patch of its own.
  - Pass the cores' affinity to the GIC using property vector since the GIC
can't access the CPU data structure and it is not aware of the number of the
ARM_CPUS_PER_CLUSTER as defined in "target-arm/cpu.c"
Rebase to hash ee9dfed242610e from Oct 20.

V3->V4:
  - Rebase the patch series on Pavel Feding's "vGICv3 support" V14 patch series.
  - Replace the usage of vnic/irqchip in the CPU structure to keep the gic
object for the system instructions with the usage of the routine
define_arm_cp_regs_with_opaque() as suggested by Peter Maydell in his mail
from August the 3rd. 

V2->V3:
  - Replace my original 1 & 4 patches with Pavel's patches.
  - Add groups support to complies with new GICv2 addtions
  - Cosmetic changes.

V1->V2:
  - Split the original patch to 4 patches
  - Add SRE API to the GIC code.
  - Add call to gicv3_update to armv8_gicv3_set_priority_mask.
  - Cosmetic changes.
  - Fix number of irq when reading GICD_TYPER.


Shlomo Pongratz (9):
  hw/intc: Implement GIC-500 support files
  hw/intc: arm_gicv3_interrupts
  hw/intc: arm_gicv3_cpu_interface
  hw/intc: arm_gicv3_dist
  hw/intc arm_gicv3_redist
  hw/intc: arm_gicv3_spi_its
  hw/intc: arm_gicv3
  target-arm/cpu64 GICv3 system instructions support
  hw/arm: Add virt-v3 machine that uses GIC-500

 hw/arm/virt.c  |  87 -
 hw/intc/Makefile.objs  |   6 +
 hw/intc/arm_gicv3.c| 134 
 hw/intc/arm_gicv3_common.c | 251 +-
 hw/intc/arm_gicv3_cpu_interface.c  | 130 
 hw/intc/arm_gicv3_cpu_interface.h  |  21 ++
 hw/intc/arm_gicv3_dist.c   | 655 +
 hw/intc/arm_gicv3_dist.h   |   9 +
 hw/intc/arm_gicv3_interrupts.c | 295 +
 hw/intc/arm_gicv3_interrupts.h |  11 +
 hw/intc/arm_gicv3_redist.c | 460 ++
 hw/intc/arm_gicv3_redist.h |   9 +
 hw/intc/arm_gicv3_spi_its.c| 359 
 hw/intc/arm_gicv3_spi_its.h|  11 +
 hw/intc/gicv3_internal.h   | 243 ++
 include/hw/arm/fdt.h   |   2 +
 include/hw/arm/virt.h  |   1 +
 include/hw/intc/arm_gicv3.h|  44 +++
 include/hw/intc/arm_gicv3_common.h |  78 -
 target-arm/cpu-qom.h   |   1 +
 target-arm/cpu.h   |  12 +
 target-arm/cpu64.c | 118 +++
 target-arm/machine.c   |   7 +-
 23 files changed, 2929 insertions(+), 15 deletions(-)
 create mode 100644 hw/intc/arm_gicv3.c
 create mode 100644 hw/intc/arm_gicv3_cpu_interface.c
 create mode 100644 hw/intc/arm_gicv3_cpu_interface.h
 create mode 100644 hw/intc/arm_gicv3_dist.c
 create mode 100644 hw/intc/arm_gicv3_dist.h
 create mode 100644 hw/intc/arm_gicv3_interrupts.c
 create mode 100644 hw/intc/arm_gicv3_interrupts.h
 create mode 100644 hw/intc/arm_gicv3_redist.c
 create mode 100644 hw/intc/arm_gicv3_redist.h
 create mode 100644 hw/intc/arm_gicv3_spi_its.c
 create mode 100644 hw/intc/arm_gicv3_spi_its.h
 create mode 100644 hw/intc/gicv3_internal.h
 create mode 100644 include/hw/intc/arm_gicv3.h

-- 
1.9.1




[Qemu-devel] [PATCH RFC V5 3/9] hw/intc: arm_gicv3_cpu_interface

2015-10-20 Thread Shlomo Pongratz
From: Shlomo Pongratz 

This patch incudes the GIC functionality that is exposed to the CPU
via system instructions. In GICv2 this functionality was exposed via
memory mapped access.

Signed-off-by: Shlomo Pongratz 
---
 hw/intc/Makefile.objs |   1 +
 hw/intc/arm_gicv3_cpu_interface.c | 130 ++
 hw/intc/arm_gicv3_cpu_interface.h |  21 ++
 hw/intc/gicv3_internal.h  |   6 ++
 4 files changed, 158 insertions(+)
 create mode 100644 hw/intc/arm_gicv3_cpu_interface.c
 create mode 100644 hw/intc/arm_gicv3_cpu_interface.h

diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
index e8cdd27..fb6494f 100644
--- a/hw/intc/Makefile.objs
+++ b/hw/intc/Makefile.objs
@@ -14,6 +14,7 @@ common-obj-$(CONFIG_ARM_GIC) += arm_gic.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gicv2m.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_common.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_interrupts.o
+common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_cpu_interface.o
 common-obj-$(CONFIG_OPENPIC) += openpic.o
 
 obj-$(CONFIG_APIC) += apic.o apic_common.o
diff --git a/hw/intc/arm_gicv3_cpu_interface.c 
b/hw/intc/arm_gicv3_cpu_interface.c
new file mode 100644
index 000..ba5ee38
--- /dev/null
+++ b/hw/intc/arm_gicv3_cpu_interface.c
@@ -0,0 +1,130 @@
+#include "gicv3_internal.h"
+#include "qom/cpu.h"
+#include "arm_gicv3_cpu_interface.h"
+#include "arm_gicv3_interrupts.h"
+
+/* Cuurently no GICv2 backwards compatibility (no memory mapped regs)
+ * Uses system registers mode.
+ */
+const int gicv3_no_gicv2_bc = 1;
+uint32_t gicv3_sre;
+
+void armv8_gicv3_set_sgi(void *opaque, int cpuindex, uint64_t value)
+{
+GICv3State *s = (GICv3State *) opaque;
+int irq, i;
+
+/* Page 2227 ICC_SGI1R_EL1 */
+
+irq = (value >> 24) & 0xf;
+
+/* The external routines use the hardware vector numbering, ie. the first
+ * IRQ is #16.  The internal GIC routines use #32 as the first IRQ.
+ */
+if (irq >= 16)
+irq += 16;
+
+/* IRM bit */
+if (value & (1ll << 40)) {
+/* Send to all the cores exclude self */
+for (i = 0; i < cpuindex; i++) {
+set_bit(cpuindex, s->sgi[irq].state[i].pending);
+}
+for (i = cpuindex + 1; i < s->num_cpu; i++) {
+set_bit(cpuindex, s->sgi[irq].state[i].pending);
+}
+/* GIC_SET_PENDING(irq, (ALL_CPU_MASK & ~cm)); */
+bitmap_fill(s->irq_state[irq].pending, s->num_cpu);
+clear_bit(cpuindex, s->irq_state[irq].pending);
+DPRINTF("cpu(%d) sends irq(%d) to ALL exclude self\n", cpuindex, irq);
+} else {
+/* Find linear of first core in cluster. See page 2227 ICC_SGI1R_EL1
+ * With our GIC-500 implementation we can have 16 clusters of 8 cpu 
each
+ */
+uint64_t target_affinity;
+uint64_t target_list;
+target_affinity  = (value >> (16 - ARM_AFF1_SHIFT)) & ARM_AFF1_MASK;
+target_affinity |= (value >> (32 - ARM_AFF2_SHIFT)) & ARM_AFF2_MASK;
+target_affinity |= (value >> (48 - ARM_AFF3_SHIFT)) & ARM_AFF3_MASK;
+target_list = value & 0xff;
+
+for (i = 0; i < s->num_cpu; i++) {
+uint64_t cpu_aff0   = s->mp_affinity[i] & ARM_AFF0_MASK;
+uint64_t cpu_aff123 = s->mp_affinity[i] & ~ARM_AFF0_MASK;
+if (cpu_aff123 == target_affinity &&
+((1 << cpu_aff0) & target_list)) {
+set_bit(cpuindex, s->sgi[irq].state[i].pending);
+GIC_SET_PENDING(irq, i);
+}
+}
+}
+gicv3_update(s);
+}
+
+uint64_t armv8_gicv3_acknowledge_irq(void *opaque, int cpuindex,
+ MemTxAttrs attrs)
+{
+GICv3State *s = (GICv3State *) opaque;
+return gicv3_acknowledge_irq(s, cpuindex, attrs);
+}
+
+void armv8_gicv3_complete_irq(void *opaque, int cpuindex, int irq,
+  MemTxAttrs attrs)
+{
+GICv3State *s = (GICv3State *) opaque;
+irq &= 0xff;
+gicv3_complete_irq(s, cpuindex, irq, attrs);
+}
+
+uint64_t armv8_gicv3_get_priority_mask(void *opaque, int cpuindex)
+{
+GICv3State *s = (GICv3State *) opaque;
+return s->priority_mask[cpuindex];
+}
+
+void armv8_gicv3_set_priority_mask(void *opaque, int cpuindex, uint32_t mask)
+{
+GICv3State *s = (GICv3State *) opaque;
+s->priority_mask[cpuindex] = mask & 0xff;
+DPRINTF("%s cpu(%d) priority mask 0x%x\n",
+__func__, cpuindex, s->priority_mask[cpuindex]);
+gicv3_update(s);
+}
+
+uint64_t armv8_gicv3_get_sre(void *opaque)
+{
+/* Uses only system registers, no memory mapped access GICv2 mode */
+return gicv3_sre;
+}
+
+void armv8_gicv3_set_sre(void *opaque, uint64_t sre)
+{
+if (!(sre & 1) && gicv3_no_gicv2_bc) {
+  

[Qemu-devel] [PATCH RFC V5 2/9] hw/intc: arm_gicv3_interrupts

2015-10-20 Thread Shlomo Pongratz
From: Shlomo Pongratz 

This patch includes the part of the GIC's code that handles
the interrupts.

Signed-off-by: Shlomo Pongratz 
---
 hw/intc/Makefile.objs  |   1 +
 hw/intc/arm_gicv3_interrupts.c | 295 +
 hw/intc/arm_gicv3_interrupts.h |  11 ++
 hw/intc/gicv3_internal.h   |  19 +++
 4 files changed, 326 insertions(+)
 create mode 100644 hw/intc/arm_gicv3_interrupts.c
 create mode 100644 hw/intc/arm_gicv3_interrupts.h

diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
index 004b0c2..e8cdd27 100644
--- a/hw/intc/Makefile.objs
+++ b/hw/intc/Makefile.objs
@@ -13,6 +13,7 @@ common-obj-$(CONFIG_ARM_GIC) += arm_gic_common.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gic.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gicv2m.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_common.o
+common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_interrupts.o
 common-obj-$(CONFIG_OPENPIC) += openpic.o
 
 obj-$(CONFIG_APIC) += apic.o apic_common.o
diff --git a/hw/intc/arm_gicv3_interrupts.c b/hw/intc/arm_gicv3_interrupts.c
new file mode 100644
index 000..da2293e
--- /dev/null
+++ b/hw/intc/arm_gicv3_interrupts.c
@@ -0,0 +1,295 @@
+#include "gicv3_internal.h"
+#include "qom/cpu.h"
+#include "arm_gicv3_interrupts.h"
+
+/* TODO: Many places that call this routine could be optimized.  */
+/* Update interrupt status after enabled or pending bits have been changed.  */
+void gicv3_update(GICv3State *s)
+{
+int best_irq;
+int best_prio;
+int irq;
+int irq_level, fiq_level;
+int cpu;
+
+for (cpu = 0; cpu < NUM_CPU(s); cpu++) {
+s->current_pending[cpu] = 1023;
+if (!(s->ctlr & (GICD_CTLR_EN_GRP0 | GICD_CTLR_EN_GRP1_ALL))
+|| !test_bit(cpu, s->cpu_enabled)
+|| !(s->cpu_ctlr[cpu] & (GICC_CTLR_EN_GRP0 | GICC_CTLR_EN_GRP1))) {
+qemu_irq_lower(s->parent_irq[cpu]);
+qemu_irq_lower(s->parent_fiq[cpu]);
+/* In original GICv2 there is a return here. But if status is
+ * disabled then all parent IRQs need to be lowered
+ * And assume CPU i is disabled then with the original GICv2
+ * implementation CPU - 1 will be considered but not CPU + 1
+ */
+continue;
+}
+best_prio = 0x100;
+best_irq = 1023;
+for (irq = 0; irq < s->num_irq; irq++) {
+if (GIC_TEST_ENABLED(irq, cpu) && gic_test_pending(s, irq, cpu) &&
+(irq < GICV3_INTERNAL || GIC_TEST_TARGET(irq, cpu))) {
+if (GIC_GET_PRIORITY(irq, cpu) < best_prio) {
+best_prio = GIC_GET_PRIORITY(irq, cpu);
+best_irq = irq;
+}
+}
+}
+
+irq_level = fiq_level = 0;
+
+if (best_prio < s->priority_mask[cpu]) {
+s->current_pending[cpu] = best_irq;
+if (best_prio < s->running_priority[cpu]) {
+int group = GIC_TEST_GROUP(best_irq, cpu);
+if (extract32(s->ctlr, group, 1) &&
+extract32(s->cpu_ctlr[cpu], group, 1)) {
+if (group == 0 && s->cpu_ctlr[cpu] & GICC_CTLR_FIQ_EN) {
+DPRINTF("Raised pending FIQ %d (cpu %d)\n",
+best_irq, cpu);
+fiq_level = 1;
+} else {
+DPRINTF("Raised pending IRQ %d (cpu %d)\n",
+best_irq, cpu);
+irq_level = 1;
+}
+}
+}
+}
+
+qemu_set_irq(s->parent_irq[cpu], irq_level);
+qemu_set_irq(s->parent_fiq[cpu], fiq_level);
+}
+}
+
+static void gicv3_set_irq_generic(GICv3State *s, int irq, int level,
+  int cm, unsigned long *target)
+{
+if (level) {
+/* if cm is -1 then the macro will set them all */
+GIC_SET_LEVEL(irq, cm);
+DPRINTF("Set %d pending cpu %d\n", irq, cm);
+if (GIC_TEST_EDGE_TRIGGER(irq)) {
+if (cm < 0)
+GIC_SET_PENDING_MASK(irq, target);
+else
+GIC_SET_PENDING(irq, cm);
+}
+} else {
+GIC_CLEAR_LEVEL(irq, cm);
+}
+}
+
+/* Process a change in an external IRQ input.  */
+void gicv3_set_irq(void *opaque, int irq, int level)
+{
+/* Meaning of the 'irq' parameter:
+ *  [0..N-1] : external interrupts
+ *  [N..N+31] : PPI (internal) interrupts for CPU 0
+ *  [N+32..N+63] : PPI (internal interrupts for CPU 1
+ *  ...
+ */
+GICv3State *s = (GICv3State *)opaque;
+int cm;
+unsigned long *target;
+
+if (irq < (s->num_irq - GICV3_INTERNAL)) {
+/* The first external input line is internal interrupt 32.  */
+

[Qemu-devel] [PATCH RFC V5 8/9] target-arm/cpu64 GICv3 system instructions support

2015-10-20 Thread Shlomo Pongratz
From: Shlomo Pongratz 

Add system instructions used by the Linux (kernel) GICv3
device driver

Signed-off-by: Shlomo Pongratz 
---
 target-arm/cpu-qom.h |   1 +
 target-arm/cpu.h |  12 ++
 target-arm/cpu64.c   | 118 +++
 3 files changed, 131 insertions(+)

diff --git a/target-arm/cpu-qom.h b/target-arm/cpu-qom.h
index 25fb1ce..6a50433 100644
--- a/target-arm/cpu-qom.h
+++ b/target-arm/cpu-qom.h
@@ -220,6 +220,7 @@ hwaddr arm_cpu_get_phys_page_debug(CPUState *cpu, vaddr 
addr);
 
 int arm_cpu_gdb_read_register(CPUState *cpu, uint8_t *buf, int reg);
 int arm_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
+void aarch64_registers_with_opaque_set(Object *obj, void *opaque);
 
 /* Callback functions for the generic timer's timers. */
 void arm_gt_ptimer_cb(void *opaque);
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 3daa7f5..d561313 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -1034,6 +1034,18 @@ void armv7m_nvic_set_pending(void *opaque, int irq);
 int armv7m_nvic_acknowledge_irq(void *opaque);
 void armv7m_nvic_complete_irq(void *opaque, int irq);
 
+void armv8_gicv3_set_sgi(void *opaque, int cpuindex, uint64_t value);
+uint64_t armv8_gicv3_acknowledge_irq(void *opaque, int cpuindex,
+  MemTxAttrs attrs);
+void armv8_gicv3_complete_irq(void *opaque, int cpuindex, int irq,
+  MemTxAttrs attrs);
+uint64_t armv8_gicv3_get_priority_mask(void *opaque, int cpuindex);
+void armv8_gicv3_set_priority_mask(void *opaque, int cpuindex, uint32_t mask);
+uint64_t armv8_gicv3_get_sre(void *opaque);
+void armv8_gicv3_set_sre(void *opaque, uint64_t sre);
+uint64_t armv8_gicv3_get_igrpen1(void *opaque, int cpuindex);
+void armv8_gicv3_set_igrpen1(void *opaque, int cpuindex, uint64_t igrpen1);
+
 /* Interface for defining coprocessor registers.
  * Registers are defined in tables of arm_cp_reginfo structs
  * which are passed to define_arm_cp_regs().
diff --git a/target-arm/cpu64.c b/target-arm/cpu64.c
index 63c8b1c..4224779 100644
--- a/target-arm/cpu64.c
+++ b/target-arm/cpu64.c
@@ -45,6 +45,115 @@ static uint64_t a57_a53_l2ctlr_read(CPUARMState *env, const 
ARMCPRegInfo *ri)
 }
 #endif
 
+#ifndef CONFIG_USER_ONLY
+static void sgi_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
+{
+CPUState *cpu = ENV_GET_CPU(env);
+armv8_gicv3_set_sgi(ri->opaque, cpu->cpu_index, value);
+}
+
+static uint64_t iar_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+uint64_t value;
+MemTxAttrs attrs;;
+CPUState *cpu = ENV_GET_CPU(env);
+attrs.secure = arm_is_secure_below_el3(env) ? 1 : 0;
+value = armv8_gicv3_acknowledge_irq(ri->opaque, cpu->cpu_index, attrs);
+return value;
+}
+
+static void sre_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
+{
+armv8_gicv3_set_sre(ri->opaque, value);
+}
+
+static uint64_t sre_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+uint64_t value;
+value = armv8_gicv3_get_sre(ri->opaque);
+return value;
+}
+
+static void eoir_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t 
value)
+{
+MemTxAttrs attrs;
+CPUState *cpu = ENV_GET_CPU(env);
+attrs.secure = arm_is_secure_below_el3(env) ? 1 : 0;
+armv8_gicv3_complete_irq(ri->opaque, cpu->cpu_index, value, attrs);
+}
+
+static uint64_t pmr_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+uint64_t value;
+CPUState *cpu = ENV_GET_CPU(env);
+value = armv8_gicv3_get_priority_mask(ri->opaque, cpu->cpu_index);
+return value;
+}
+
+static void pmr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
+{
+CPUState *cpu = ENV_GET_CPU(env);
+armv8_gicv3_set_priority_mask(ri->opaque, cpu->cpu_index, value);
+}
+
+static uint64_t igrpen1_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+uint64_t value;
+CPUState *cpu = ENV_GET_CPU(env);
+value = armv8_gicv3_get_igrpen1(ri->opaque, cpu->cpu_index);
+return value;
+}
+
+static void igrpen1_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t 
value)
+{
+CPUState *cpu = ENV_GET_CPU(env);
+armv8_gicv3_set_igrpen1(ri->opaque, cpu->cpu_index, value);
+}
+#endif
+
+static const ARMCPRegInfo cortex_a57_a53_cp_with_opaque_reginfo[] = {
+{ .name = "EIOR1_EL1", .state = ARM_CP_STATE_AA64,
+#ifndef CONFIG_USER_ONLY
+  .writefn = eoir_write,
+#endif
+  .opc0 = 3, .opc1 = 0, .crn = 12, .crm = 12, .opc2 = 1,
+  .access = PL1_W, .type = ARM_CP_SPECIAL, .resetvalue = 0 },
+{ .name = "IAR1_EL1", .state = ARM_CP_STATE_AA64,
+#ifndef CONFIG_USER_ONLY
+  .readfn = iar_read,
+#endif
+  .opc0 = 3, .opc1 = 0, .crn = 12, .crm = 12, .opc2 = 0,
+  .access = PL1_R, .type = ARM_CP_SPECIAL, .resetvalue = 0 },
+{ .name = "SGI1R_EL1", .state = ARM_CP_STATE_AA64,
+#ifndef CONFIG_USER_ONLY
+  .writefn = sgi_write,
+#endif
+  .o

[Qemu-devel] [PATCH RFC V5 7/9] hw/intc: arm_gicv3

2015-10-20 Thread Shlomo Pongratz
From: Shlomo Pongratz 

This patch includes the code that binds together the codes of
all the previous patches.

Signed-off-by: Shlomo Pongratz 
---
 hw/intc/Makefile.objs |   1 +
 hw/intc/arm_gicv3.c   | 134 ++
 2 files changed, 135 insertions(+)
 create mode 100644 hw/intc/arm_gicv3.c

diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
index 0e6784f..b7b2288 100644
--- a/hw/intc/Makefile.objs
+++ b/hw/intc/Makefile.objs
@@ -18,6 +18,7 @@ common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_cpu_interface.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_dist.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_redist.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_spi_its.o
+common-obj-$(CONFIG_ARM_GIC) += arm_gicv3.o
 common-obj-$(CONFIG_OPENPIC) += openpic.o
 
 obj-$(CONFIG_APIC) += apic.o apic_common.o
diff --git a/hw/intc/arm_gicv3.c b/hw/intc/arm_gicv3.c
new file mode 100644
index 000..62c8299
--- /dev/null
+++ b/hw/intc/arm_gicv3.c
@@ -0,0 +1,134 @@
+/*
+ * ARM Generic/Distributed Interrupt Controller
+ *
+ * Copyright (c) 2006-2007 CodeSourcery.
+ * Copyright (c) 2015 Huawei.
+ * Written by Shlomo Pongratz
+ * Base on gic.c by Paul Brook
+ *
+ * This code is licensed under the GPL.
+ */
+
+/* This file contains implementation code for the GIC-500 interrupt
+ * controller, which is an implementation of the GICv3 architecture.
+ * Curently it supports up to 64 cores. Enhancmet to 128 cores requires
+ * working with bitops.
+ */
+
+#include "hw/sysbus.h"
+#include "gicv3_internal.h"
+#include "qom/cpu.h"
+#include "arm_gicv3_cpu_interface.h"
+#include "arm_gicv3_interrupts.h"
+#include "arm_gicv3_dist.h"
+#include "arm_gicv3_redist.h"
+#include "arm_gicv3_spi_its.h"
+
+static const MemoryRegionOps gic_ops[] = {
+{
+.read_with_attrs = gic_dist_read,
+.write_with_attrs = gic_dist_write,
+.impl = {
+ .min_access_size = 4,
+ .max_access_size = 8,
+ },
+.endianness = DEVICE_NATIVE_ENDIAN,
+},
+{
+.read_with_attrs = gic_redist_read,
+.write_with_attrs = gic_redist_write,
+.impl = {
+ .min_access_size = 4,
+ .max_access_size = 8,
+ },
+.endianness = DEVICE_NATIVE_ENDIAN,
+},
+{
+.read = gic_its_read,
+.write = gic_its_write,
+.impl = {
+ .min_access_size = 4,
+ .max_access_size = 8,
+ },
+.endianness = DEVICE_NATIVE_ENDIAN,
+},
+{
+.read = gic_spi_read,
+.write = gic_spi_write,
+.impl = {
+ .min_access_size = 4,
+ .max_access_size = 8,
+ },
+.endianness = DEVICE_NATIVE_ENDIAN,
+},
+{
+.read = gic_its_cntrl_read,
+.write = gic_its_cntrl_write,
+.impl = {
+ .min_access_size = 4,
+ .max_access_size = 8,
+ },
+.endianness = DEVICE_NATIVE_ENDIAN,
+}
+};
+
+static void arm_gic_realize(DeviceState *dev, Error **errp)
+{
+/* Device instance realize function for the GIC sysbus device */
+GICv3State *s = ARM_GICV3(dev);
+ARMGICv3Class *agc = ARM_GICV3_GET_CLASS(s);
+Error *local_err = NULL;
+uint32_t power2;
+
+agc->parent_realize(dev, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+
+/* Cuurently no GICv2 backwards compatibility (e.g. no memory mapped regs)
+ * Uses system gicv3_no_gicv2_bc mode
+ */
+gicv3_sre = 1;
+
+/* Tell the common code we're a GICv3 */
+s->revision = REV_V3;
+
+gicv3_init_irqs_and_mmio(s, gicv3_set_irq, gic_ops);
+
+/* Compute mask for decoding the core number in redistributer */
+if (is_power_of_2(NUM_CPU(s)))
+power2 = NUM_CPU(s);
+else
+/* QEMU has only  pow2floor !!! */
+power2 = pow2floor(2 * NUM_CPU(s));
+s->cpu_mask = (power2 - 1);
+
+DPRINTF(" -- NUM_CPUS(%d) - cpu mask(0%x) -- \n", NUM_CPU(s), s->cpu_mask);
+
+bitmap_zero(s->cpu_enabled, s->num_cpu);
+}
+
+static void arm_gicv3_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+ARMGICv3Class *agc = ARM_GICV3_CLASS(klass);
+
+agc->parent_realize = dc->realize;
+dc->realize = arm_gic_realize;
+}
+
+static const TypeInfo arm_gicv3_info = {
+.name = TYPE_ARM_GICV3,
+.parent = TYPE_ARM_GICV3_COMMON,
+.instance_size = sizeof(GICv3State),
+.class_init = arm_gicv3_class_init,
+.class_size = sizeof(ARMGICv3Class),
+};
+
+static void arm_gicv3_register_types(void)
+{
+type_register_static(&arm_gicv3_info);
+}
+
+type_init(arm_gicv3_register_types)
-- 
1.9.1




[Qemu-devel] [PATCH RFC V5 9/9] hw/arm: Add virt-v3 machine that uses GIC-500

2015-10-20 Thread Shlomo Pongratz
From: Shlomo Pongratz 

Add virt-v3 machine that uses GIC-500.
Registers the CPU system instructions and bind them to the GIC.
Pass the cores' affinity to the GIC.

Signed-off-by: Shlomo Pongratz 
---
 hw/arm/virt.c | 87 ++-
 include/hw/arm/fdt.h  |  2 ++
 include/hw/arm/virt.h |  1 +
 target-arm/machine.c  |  7 -
 4 files changed, 88 insertions(+), 9 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 5d38c47..d4fb8f3 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -71,6 +71,7 @@ typedef struct VirtBoardInfo {
 uint32_t clock_phandle;
 uint32_t gic_phandle;
 uint32_t v2m_phandle;
+const char *class_name;
 } VirtBoardInfo;
 
 typedef struct {
@@ -86,6 +87,7 @@ typedef struct {
 } VirtMachineState;
 
 #define TYPE_VIRT_MACHINE   MACHINE_TYPE_NAME("virt")
+#define TYPE_VIRTV3_MACHINE MACHINE_TYPE_NAME("virt-v3")
 #define VIRT_MACHINE(obj) \
 OBJECT_CHECK(VirtMachineState, (obj), TYPE_VIRT_MACHINE)
 #define VIRT_MACHINE_GET_CLASS(obj) \
@@ -111,6 +113,7 @@ static const MemMapEntry a15memmap[] = {
 [VIRT_CPUPERIPHS] = { 0x0800, 0x0002 },
 /* GIC distributor and CPU interfaces sit inside the CPU peripheral space 
*/
 [VIRT_GIC_DIST] =   { 0x0800, 0x0001 },
+/* Note on GICv3 VIRT_GIC_DIST_SPI takes place of VIRT_GIC_CPU */
 [VIRT_GIC_CPU] ={ 0x0801, 0x0001 },
 [VIRT_GIC_V2M] ={ 0x0802, 0x1000 },
 /* The space in between here is reserved for GICv3 CPU/vCPU/HYP */
@@ -145,21 +148,26 @@ static VirtBoardInfo machines[] = {
 .cpu_model = "cortex-a15",
 .memmap = a15memmap,
 .irqmap = a15irqmap,
+.class_name = TYPE_ARM_CPU,
+
 },
 {
 .cpu_model = "cortex-a53",
 .memmap = a15memmap,
 .irqmap = a15irqmap,
+.class_name = TYPE_AARCH64_CPU,
 },
 {
 .cpu_model = "cortex-a57",
 .memmap = a15memmap,
 .irqmap = a15irqmap,
+.class_name = TYPE_AARCH64_CPU,
 },
 {
 .cpu_model = "host",
 .memmap = a15memmap,
 .irqmap = a15irqmap,
+.class_name = TYPE_ARM_CPU,
 },
 };
 
@@ -228,7 +236,6 @@ static void fdt_add_psci_node(const VirtBoardInfo *vbi)
 if (armcpu->psci_version == 2) {
 const char comp[] = "arm,psci-0.2\0arm,psci";
 qemu_fdt_setprop(fdt, "/psci", "compatible", comp, sizeof(comp));
-
 cpu_off_fn = QEMU_PSCI_0_2_FN_CPU_OFF;
 if (arm_feature(&armcpu->env, ARM_FEATURE_AARCH64)) {
 cpu_suspend_fn = QEMU_PSCI_0_2_FN64_CPU_SUSPEND;
@@ -241,7 +248,6 @@ static void fdt_add_psci_node(const VirtBoardInfo *vbi)
 }
 } else {
 qemu_fdt_setprop_string(fdt, "/psci", "compatible", "arm,psci");
-
 cpu_suspend_fn = QEMU_PSCI_0_1_FN_CPU_SUSPEND;
 cpu_off_fn = QEMU_PSCI_0_1_FN_CPU_OFF;
 cpu_on_fn = QEMU_PSCI_0_1_FN_CPU_ON;
@@ -274,6 +280,12 @@ static void fdt_add_timer_nodes(const VirtBoardInfo *vbi, 
int gictype)
 irqflags = deposit32(irqflags, GIC_FDT_IRQ_PPI_CPU_START,
  GIC_FDT_IRQ_PPI_CPU_WIDTH,
  (1 << vbi->smp_cpus) - 1);
+} else if (gictype == 3) {
+uint32_t max;
+/* Argument is 32 bit but 8 bits are reserved for flags */
+max = (vbi->smp_cpus >= 24) ? 24 : vbi->smp_cpus;
+irqflags = deposit32(irqflags, GIC_FDT_IRQ_PPI_CPU_START,
+ GICV3_FDT_IRQ_PPI_CPU_WIDTH, (1 << max) - 1);
 }
 
 qemu_fdt_add_subnode(vbi->fdt, "/timer");
@@ -429,12 +441,32 @@ static void create_gic(VirtBoardInfo *vbi, qemu_irq *pic, 
int type, bool secure)
 gicdev = qdev_create(NULL, gictype);
 qdev_prop_set_uint32(gicdev, "revision", type);
 qdev_prop_set_uint32(gicdev, "num-cpu", smp_cpus);
+
+#ifdef TARGET_AARCH64
+if (type == 3) {
+qdev_prop_set_uint32(gicdev, "len-mp-affinity", smp_cpus);
+/* Connect GIC to CPU */
+for (i = 0; i < smp_cpus; i++) {
+char *propname;
+CPUState *cpu = qemu_get_cpu(i);
+ARMCPU *armcpu = ARM_CPU(cpu);
+propname = g_strdup_printf("mp-affinity[%d]", i);
+qdev_prop_set_uint64(gicdev, propname, armcpu->mp_affinity);
+g_free(propname);
+}
+}
+#endif
 /* Note that the num-irq property counts both internal and external
  * interrupts; there are always 32 of the former (mandated by GIC spec).
  */
 qdev_prop_set_uint32(gicdev, "num-irq", NUM_IRQS + 32);
 if (!kvm_irqchip_in_kernel()) {
-qdev_prop_set_bit(gicdev, "has-security-extensions", secure);
+if (type == 3) {
+/* AARC

[Qemu-devel] [PATCH RFC V5 5/9] hw/intc arm_gicv3_redist

2015-10-20 Thread Shlomo Pongratz
From: Shlomo Pongratz 

This patch includes the code that implements the functionality of
the redisributer, which is the main enhancment of GICv3 over GICv3

Signed-off-by: Shlomo Pongratz 
---
 hw/intc/Makefile.objs  |   1 +
 hw/intc/arm_gicv3_redist.c | 460 +
 hw/intc/arm_gicv3_redist.h |   9 +
 3 files changed, 470 insertions(+)
 create mode 100644 hw/intc/arm_gicv3_redist.c
 create mode 100644 hw/intc/arm_gicv3_redist.h

diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
index cdfb877..5e56acc 100644
--- a/hw/intc/Makefile.objs
+++ b/hw/intc/Makefile.objs
@@ -16,6 +16,7 @@ common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_common.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_interrupts.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_cpu_interface.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_dist.o
+common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_redist.o
 common-obj-$(CONFIG_OPENPIC) += openpic.o
 
 obj-$(CONFIG_APIC) += apic.o apic_common.o
diff --git a/hw/intc/arm_gicv3_redist.c b/hw/intc/arm_gicv3_redist.c
new file mode 100644
index 000..cbac184
--- /dev/null
+++ b/hw/intc/arm_gicv3_redist.c
@@ -0,0 +1,460 @@
+#include "gicv3_internal.h"
+#include "qom/cpu.h"
+#include "arm_gicv3_redist.h"
+#include "arm_gicv3_interrupts.h"
+
+static const uint8_t gic_redist_ids[] = {
+0x44, 0x00, 0x00, 0x00, 0x093, 0xB4, 0x3B, 0x00, 0x0D, 0xF0, 0x05, 0xB1
+};
+
+static uint64_t gic_redist_readb(void *opaque, hwaddr offset, MemTxAttrs attrs)
+{
+GICv3State *s = (GICv3State *)opaque;
+uint64_t res = 0;
+uint64_t sgi_ppi, core, off;
+uint64_t irq, i;
+
+/* GIC-500 Table 3-2 page 3-4 (Rev r0p0)
+ * [  1x<16-bits-offset>]
+ * x = 0: LPIs
+ * x = 1: SGIs & PPIs
+ */
+off = offset;
+sgi_ppi = off & (1 << 16);
+core = (off >> 17) & s->cpu_mask;
+offset = off & 0x;
+
+if (sgi_ppi) {
+/* SGIs, PPIs */
+/* Interrupt Set/Clear Enable.  */
+if (offset < 0x100) {
+if (offset >= 0x80) {
+/* Interrupt Group Registers: these RAZ/WI if this is an NS
+ * access to a GIC with the security extensions, or if the GIC
+ * doesn't have groups at all.
+ */
+res = 0;
+if (!attrs.secure) {
+/* GIC-500 comment 'a' */
+goto bad_reg;
+}
+/* Every byte offset holds 8 group status bits */
+irq = (offset - 0x080) * 8 + GICV3_BASE_IRQ;
+if (irq >= s->num_irq) {
+goto bad_reg;
+}
+if (irq >= GICV3_INTERNAL) {
+DPRINTF("non Internal should be only in dist %lu\n", irq);
+goto bad_reg;
+}
+for (i = 0; i < 8; i++) {
+if (GIC_TEST_GROUP(irq + i, core)) {
+res |= (1 << i);
+}
+}
+return res;
+}
+} else if (offset < 0x200) {
+/* Interrupt Set/Clear Enable.  */
+if (offset < 0x180)
+irq = (offset - 0x100) * 8;
+else
+irq = (offset - 0x180) * 8;
+irq += GICV3_BASE_IRQ;
+if (irq >= s->num_irq)
+goto bad_reg;
+if (irq >= GICV3_INTERNAL) {
+DPRINTF("non Internal should be only in dist %lu\n", irq);
+goto bad_reg;
+}
+res = 0;
+for (i = 0; i < 8; i++) {
+if (GIC_TEST_ENABLED(irq + i, core)) {
+res |= (1 << i);
+}
+}
+} else if (offset < 0x300) {
+/* Interrupt Set/Clear Pending.  */
+if (offset < 0x280)
+irq = (offset - 0x200) * 8;
+else
+irq = (offset - 0x280) * 8;
+irq += GICV3_BASE_IRQ;
+if (irq >= s->num_irq)
+goto bad_reg;
+if (irq >= GICV3_INTERNAL) {
+DPRINTF("non Internal should be only in dist %lu\n", irq);
+goto bad_reg;
+}
+res = 0;
+for (i = 0; i < 8; i++) {
+if (gic_test_pending(s, irq + i, core)) {
+res |= (1 << i);
+}
+}
+} else if (offset < 0x400) {
+/* Interrupt Set/Clear Active.  */
+if (offset < 0x380)
+irq = (offset - 0x300) * 8 + GICV3_BASE_IRQ;
+else
+irq = (offset - 0x380) * 8 + GICV3_BASE_IRQ;
+if (irq >= s->num_irq)
+goto bad_reg;
+if 

[Qemu-devel] [PATCH RFC V5 6/9] hw/intc: arm_gicv3_spi_its

2015-10-20 Thread Shlomo Pongratz
From: Shlomo Pongratz 

This patch includes a placeholder code for future spi and its
implementation.

Signed-off-by: Shlomo Pongratz 
---
 hw/intc/Makefile.objs   |   1 +
 hw/intc/arm_gicv3_spi_its.c | 359 
 hw/intc/arm_gicv3_spi_its.h |  11 ++
 3 files changed, 371 insertions(+)
 create mode 100644 hw/intc/arm_gicv3_spi_its.c
 create mode 100644 hw/intc/arm_gicv3_spi_its.h

diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
index 5e56acc..0e6784f 100644
--- a/hw/intc/Makefile.objs
+++ b/hw/intc/Makefile.objs
@@ -17,6 +17,7 @@ common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_interrupts.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_cpu_interface.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_dist.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_redist.o
+common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_spi_its.o
 common-obj-$(CONFIG_OPENPIC) += openpic.o
 
 obj-$(CONFIG_APIC) += apic.o apic_common.o
diff --git a/hw/intc/arm_gicv3_spi_its.c b/hw/intc/arm_gicv3_spi_its.c
new file mode 100644
index 000..cd2b56d
--- /dev/null
+++ b/hw/intc/arm_gicv3_spi_its.c
@@ -0,0 +1,359 @@
+#include "gicv3_internal.h"
+#include "qom/cpu.h"
+#include "arm_gicv3_spi_its.h"
+#include "arm_gicv3_interrupts.h"
+
+/* ITS routines are stubs for future development */
+static uint64_t gic_its_readb(void *opaque, hwaddr offset)
+{
+return 0;
+}
+
+static uint64_t gic_its_readw(void *opaque, hwaddr offset)
+{
+uint64_t val;
+val = gic_its_readb(opaque, offset);
+val |= gic_its_readb(opaque, offset + 1) << 8;
+return val;
+}
+
+static uint64_t gic_its_readl(void *opaque, hwaddr offset)
+{
+uint64_t val;
+val = gic_its_readw(opaque, offset);
+val |= gic_its_readw(opaque, offset + 2) << 16;
+return val;
+}
+
+static uint64_t gic_its_readll(void *opaque, hwaddr offset)
+{
+uint64_t val;
+val = gic_its_readl(opaque, offset);
+val |= gic_its_readl(opaque, offset + 4) << 32;
+return val;
+}
+
+static void gic_its_writeb(void *opaque, hwaddr offset,
+uint64_t value)
+{
+GICv3State *s = (GICv3State *)opaque;
+gicv3_update(s);
+return;
+}
+
+static void gic_its_writew(void *opaque, hwaddr offset,
+uint64_t value)
+{
+gic_its_writeb(opaque, offset, value & 0xff);
+gic_its_writeb(opaque, offset + 1, value >> 8);
+}
+
+static void gic_its_writel(void *opaque, hwaddr offset,
+uint64_t value)
+{
+gic_its_writel(opaque, offset, value & 0x);
+gic_its_writel(opaque, offset + 2, value >> 16);
+}
+
+static void gic_its_writell(void *opaque, hwaddr offset,
+uint64_t value)
+{
+gic_its_writell(opaque, offset, value & 0x);
+gic_its_writell(opaque, offset + 4, value >> 32);
+}
+
+uint64_t gic_its_read(void *opaque, hwaddr addr, unsigned size)
+{
+uint64_t data;
+switch (size) {
+case 1:
+data = gic_its_readb(opaque, addr);
+break;
+case 2:
+data = gic_its_readw(opaque, addr);
+break;
+case 4:
+data = gic_its_readl(opaque, addr);
+break;
+case 8:
+data = gic_its_readll(opaque, addr);
+break;
+default:
+qemu_log_mask(LOG_GUEST_ERROR,
+  "%s: size %u\n", __func__, size);
+assert(0);
+break;
+}
+DPRINTF("offset %p data %p\n", (void *) addr, (void *) data);
+return data;
+}
+
+void gic_its_write(void *opaque, hwaddr addr, uint64_t data, unsigned size)
+{
+DPRINTF("offset %p data %p\n", (void *) addr, (void *) data);
+switch (size) {
+case 1:
+gic_its_writew(opaque, addr, data);
+break;
+case 2:
+gic_its_writew(opaque, addr, data);
+break;
+case 4:
+gic_its_writel(opaque, addr, data);
+break;
+case 8:
+gic_its_writell(opaque, addr, data);
+break;
+default:
+qemu_log_mask(LOG_GUEST_ERROR,
+  "%s: size %u\n", __func__, size);
+assert(0);
+break;
+}
+}
+
+/* SPI routines are stubs for future development */
+static uint64_t gic_spi_readb(void *opaque, hwaddr offset)
+{
+return 0;
+}
+
+static uint64_t gic_spi_readw(void *opaque, hwaddr offset)
+{
+uint64_t val;
+val = gic_spi_readb(opaque, offset);
+val |= gic_spi_readb(opaque, offset + 1) << 8;
+return val;
+}
+
+static uint64_t gic_spi_readl(void *opaque, hwaddr offset)
+{
+uint64_t val;
+val = gic_spi_readw(opaque, offset);
+val |= gic_spi_readw(opaque, offset + 2) << 16;
+return val;
+}
+
+static uint64_t gic_spi_readll(void *opaque, hwaddr offset)
+{
+uint64_t val;
+val = gic_spi_readl(opaque, offset);
+val |= gic_spi_readl(opaque, offset + 4) << 32;
+return val;
+}
+
+static void gic_sp

[Qemu-devel] [PATCH RFC V5 4/9] hw/intc: arm_gicv3_dist

2015-10-20 Thread Shlomo Pongratz
From: Shlomo Pongratz 

Signed-off-by: Shlomo Pongratz 
---
 hw/intc/Makefile.objs|   1 +
 hw/intc/arm_gicv3_dist.c | 655 +++
 hw/intc/arm_gicv3_dist.h |   9 +
 hw/intc/gicv3_internal.h |   8 +
 4 files changed, 673 insertions(+)
 create mode 100644 hw/intc/arm_gicv3_dist.c
 create mode 100644 hw/intc/arm_gicv3_dist.h

diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
index fb6494f..cdfb877 100644
--- a/hw/intc/Makefile.objs
+++ b/hw/intc/Makefile.objs
@@ -15,6 +15,7 @@ common-obj-$(CONFIG_ARM_GIC) += arm_gicv2m.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_common.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_interrupts.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_cpu_interface.o
+common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_dist.o
 common-obj-$(CONFIG_OPENPIC) += openpic.o
 
 obj-$(CONFIG_APIC) += apic.o apic_common.o
diff --git a/hw/intc/arm_gicv3_dist.c b/hw/intc/arm_gicv3_dist.c
new file mode 100644
index 000..973a1b3
--- /dev/null
+++ b/hw/intc/arm_gicv3_dist.c
@@ -0,0 +1,655 @@
+#include "gicv3_internal.h"
+#include "qom/cpu.h"
+#include "arm_gicv3_dist.h"
+#include "arm_gicv3_interrupts.h"
+
+static const uint8_t gic_dist_ids[] = {
+0x44, 0x00, 0x00, 0x00, 0x092, 0xB4, 0x3B, 0x00, 0x0D, 0xF0, 0x05, 0xB1
+};
+
+static uint64_t gic_dist_readb(void *opaque, hwaddr offset, MemTxAttrs attrs)
+{
+GICv3State *s = (GICv3State *)opaque;
+uint64_t res;
+int irq;
+int i;
+int cpu;
+int cm;
+
+cpu = gic_get_current_cpu(s);
+if (offset < 0x100) {
+if (offset == 0) {/* GICD_CTLR */
+/* GICv3 5.3.20 without GICV2 backwards compatibility  */
+if (s->security_levels > 1) {
+/* Support TWO security states */
+if (attrs.secure) {
+return s->ctlr;
+} else {
+/* For non secure access bits [30:5] are reserved and 
ARE_NS
+ * is RAO/WI note that ARE_NS in this case is bit [4] same
+ * as ARE_S in the prvious secure case. Now since it is
+ * RAO/WI than bit [0] EnableGrp1 is reserved and we can
+ * only use Bit [1] EnableGrp1A to enable Non-secure Group1
+ * interrupts
+ */
+return s->ctlr & (GICD_CTLR_ARE_NS |
+  GICD_CTLR_EN_GRP1NS |
+  GICD_CTLR_RWP);
+}
+} else {
+/* Support ONE security state without ARE is RAO/WI*/
+return s->ctlr & (GICD_CTLR_ARE |
+  GICD_CTLR_EN_GRP0 | GICD_CTLR_EN_GRP1NS |
+  GICD_CTLR_RWP);
+}
+}
+if (offset == 4) { /* GICD_TYPER */
+uint64_t num = NUM_CPU(s);
+/* the number of cores in the system, saturated to 8 minus one. */
+if (num > 8)
+num = 8;
+/* The num_irqs as given from virt machine via "num-irq"
+ * includes the internal irqs, so subtract them
+ */
+res = (s->num_irq - GICV3_INTERNAL) / 32;
+res |= (num - 1) << 5;
+res |= 0xF << 19;
+return res;
+}
+if (offset == 0x08)
+return 0x43B; /* GIC_IIDR */
+if (offset >= 0x80) {
+/* Interrupt Group Registers: these RAZ/WI if this is an NS
+ * access to a GIC with the security extensions, or if the GIC
+ * doesn't have groups at all.
+ */
+res = 0;
+if (!attrs.secure) {
+/* GIC-500 comment 'f' */
+goto bad_reg;
+}
+/* Every byte offset holds 8 group status bits */
+irq = (offset - 0x080) * 8 + GICV3_BASE_IRQ;
+if (irq >= s->num_irq) {
+goto bad_reg;
+}
+for (i = 0; i < 8; i++) {
+if (GIC_TEST_GROUP(irq + i, cpu)) {
+res |= (1 << i);
+}
+}
+return res;
+}
+goto bad_reg;
+} else if (offset < 0x200) {
+/* Interrupt Set/Clear Enable.  */
+if (offset < 0x180)
+irq = (offset - 0x100) * 8;
+else
+irq = (offset - 0x180) * 8;
+irq += GICV3_BASE_IRQ;
+if (irq >= s->num_irq)
+goto bad_reg;
+res = 0;
+for (i = 0; i < 8; i++) {
+if (GIC_TEST_ENABLED(irq + i, cpu)) {
+res |= (1 << i);
+}
+}
+} else if (offset < 0x300) {
+/* Interrupt Set/Clear Pending.  */
+if (offset < 0x280)
+irq = (offset - 0x200) * 8;
+else

Re: [Qemu-devel] [PATCH RFC V4 4/4] Add virt-v3 machine that uses GIC-500

2015-09-28 Thread Shlomo Pongratz
Thanks,

I'll fix it and submit together with with the changes required by Peter.

Best regards,

S.P.

On Thursday, September 24, 2015, Christopher Covington 
wrote:

> On 09/24/2015 02:03 PM, Christopher Covington wrote:
> > Hi,
> >
> > On 09/17/2015 01:38 PM, Shlomo Pongratz wrote:
> >> From: Pavel Fedin >
> >>
> >> I would like to offer this, slightly improved implementation. The key
> thing is a new
> >> kernel_irqchip_type member in Machine class. Currently it it used only
> by virt machine for
> >> its internal purposes, however in future it is to be passed to KVM in
> >> kvm_irqchip_create(). The variable is defined as int in order to be
> architecture agnostic,
> >> for potential future users.
> >>
> >> Signed-off-by: Pavel Fedin >
> >> ---
> >>  hw/arm/virt.c | 72
> +--
> >>  include/hw/arm/fdt.h  |  2 ++
> >>  include/hw/arm/virt.h |  1 +
> >>  target-arm/machine.c  |  7 -
> >>  4 files changed, 73 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> >> index 4d15309..4c2ae7f 100644
> >> --- a/hw/arm/virt.c
> >> +++ b/hw/arm/virt.c
> >
> >> @@ -445,6 +462,14 @@ static void create_gic(VirtBoardInfo *vbi,
> qemu_irq *pic, int type, bool secure)
> >>  sysbus_mmio_map(gicbusdev, 1, vbi->memmap[VIRT_GIC_CPU].base);
> >>  }
> >>
> >> +if (type == 3) {
> >> +/* Connect GIC to CPU */
> >> +for (i = 0; i < smp_cpus; i++) {
> >> +CPUState *cpu = qemu_get_cpu(i);
> >> +aatch64_registers_with_opaque_set(OBJECT(cpu), (void
> *)gicdev);
> >
> > Typo--should be "aarch64".
> >
> > With that, feel free to add the following if it's any use:
> >
> > Tested-by: Christopher Covington >
>
> I originally tested building only for aarch64-softmmu, but I've now
> noticed a
> build issue with arm-softmmu.
>
> Christopher Covington
>
> --
> Qualcomm Innovation Center, Inc.
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>


Re: [Qemu-devel] [PATCH RFC V4 0/4] Implement GIC-500 from GICv3 family for arm64

2015-09-17 Thread Shlomo Pongratz
See inline.

On Thursday, September 17, 2015, Peter Maydell 
wrote:

> On 17 September 2015 at 19:18, Shlomo Pongratz  > wrote:
> > On Thursday, September 17, 2015, Peter Maydell  >
> >> This still seems to have the same issues as were noted in previous
> >> rounds:
> >>  * you need to get rid of the limitation on number of cores. There
> >>should be no hard limit imposed by the GIC emulation code: not
> >>64, not 128, not anything.
> >
> >
> > The GICv3 spec limits the number of cores to 128.
>
> I don't believe it does (the only limit is the affinity fields
> which have a limit up in the millions). Can you point me to the
> part of the spec that you think imposes a lower limit?
>
>
I don't have the GICv3 spec at home only the GIC 500 which is based on it
and is what implemented also in HW and it states:

1.1 About the GIC-500 The GIC-500 is a build-time configurable interrupt
controller that supports up to 128 cores.

I'll send the GICv3 reference next week when I'll come back to work (I have
it only at work).


> I assume you don't expect
> > GICv2 to support more than 8 cores.
> > As I wrote since the largest "atomic" type is 64 bit supporting more
> than 64
> > cores requires major rewrite using bitops. This can be done but I think
> this
> > can wait for future versions.
>
> No, I don't want to accept a version with a limitation.
> Data structures are important and we should get them right from
> the start. Don't look at the GICv2 to figure out what data structures
> you should have and how they should be arranged, look at the GICv3 spec.
>
> (In particular, you probably want a data structure layout that
> says "this GIC has an array of N redistributors, and each redistributor
> has such-and-such state", rather than the QEMU GICv2 model's approach
> of having a struct per interrupt that tries to track per-CPU state
> within that struct.)
>
> > I wonder how can I remove the limit and not crash is someone tries an
> > arbitrary large number.
> > How can I protect from wrong usage?
>
> You need to allocate your data structures based on the number of CPUs
> which the user passes in to you. (We do this in Pavel's patchset for
> irq lines, for instance.) If the user passes in something that's so
> large we can't allocate memory, then QEMU's memory allocation functions
> will fail cleanly.
>
> >>  * patch 2 is over 2000 lines, which is far too big to review. It
> >>needs to be split up. Aim for 200 lines per patch at maximum;
> >>smaller is better.
> >
> >
> > I'm open to suggestions. This is actually a single file (with the
> necessary
> > Makefile addition). Do you suggest that I'll break it for example to
> > distributer part an redistributer part e.t.c.?
> > Please advice.
>
> Yes, you need to split it up into logical subsections that each
> add a small but coherent piece of functionality. The purpose
> of splitting up into patches is to make the code easier to
> review for correctness. It's impossible to read a single 2000
> line patch and keep it all in your head at once. So a split
> into multiple patches is about presenting the changes in a
> way that allows the reader to assess it in a logical order
> and not too much at once.
>
> thanks
> -- PMM
>


Re: [Qemu-devel] [PATCH RFC V4 0/4] Implement GIC-500 from GICv3 family for arm64

2015-09-17 Thread Shlomo Pongratz
Thanks

Please see inline
Best regards,

S.P.


On Thursday, September 17, 2015, Peter Maydell 
wrote:

> On 17 September 2015 at 18:38, Shlomo Pongratz  > wrote:
> > From: Shlomo Pongratz >
> >
> > This patch is a first step toward 128 cores support for arm64.
> >
> > At first only 64 cores are supported.
> > This is because largest integer type has the size of 64 bits and
> modifying
> > essential data structures in order to support 128 cores will require
> > the usage of bitops.
> >
> > Things left to do:
> >
> > Support SPI, note that this patch porpose is to enable running 64 cores
> using
> > the "virt" virtual machine.
> >
> > Add support for 128 cores. This requires the usage
> > of bitops which requires a major rewrite.
> >
> > Special thanks to Peter Crostwaite whose patch to th Linux (kernel) i.e.
> > Implement cpu_relax as yield solved the problem of the boot process
> getting
> > stuck for 24 cores and more.
>
> This still seems to have the same issues as were noted in previous
> rounds:
>  * you need to get rid of the limitation on number of cores. There
>should be no hard limit imposed by the GIC emulation code: not
>64, not 128, not anything.
>

The GICv3 spec limits the number of cores to 128. I assume you don't expect
GICv2 to support more than 8 cores.
As I wrote since the largest "atomic" type is 64 bit supporting more than
64 cores requires major rewrite using bitops. This can be done but I think
this can wait for future versions.
I wonder how can I remove the limit and not crash is someone tries an
arbitrary large number.
How can I protect from wrong usage?

 * patch 2 is over 2000 lines, which is far too big to review. It
>needs to be split up. Aim for 200 lines per patch at maximum;
>smaller is better.
>

I'm open to suggestions. This is actually a single file (with the necessary
Makefile addition). Do you suggest that I'll break it for example to
distributer part an redistributer part e.t.c.?
Please advice.

>
> Also, patch 4 looks like it's an older version of one of Pavel's;
> you probably want to rebase on top of Pavel's recent v14 series,
> which is nearly ready to go into master.
>

Patch #4 is based on Pavel's V14 5/5 patch but adds GICv3 to it.

>
> thanks
> -- PMM
>


[Qemu-devel] [PATCH RFC V4 2/4] hw/intc: Implment GIC-500

2015-09-17 Thread Shlomo Pongratz
From: Shlomo Pongratz 

Implement GIC-500 from GICv3 family for arm64

This patch is a first step toward 128 cores support for arm64.

At first only 64 cores are supported.
This is because the largest integer type has the size of 64 bits and modifying
essential data structures in order to support 128 cores will require
the usage of bitops.

Currently SPI are not supported. This patch implements only what is necessary
in order to run the hw/arm/virt.c board with 64 cores.

Things left to do:

Add support to 128 cores.
Implement SPI.

Signed-off-by: Shlomo Pongratz 
---
 hw/intc/Makefile.objs |1 +
 hw/intc/arm_gicv3.c   | 2044 +
 2 files changed, 2045 insertions(+)
 create mode 100644 hw/intc/arm_gicv3.c

diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
index 004b0c2..f20a19a 100644
--- a/hw/intc/Makefile.objs
+++ b/hw/intc/Makefile.objs
@@ -13,6 +13,7 @@ common-obj-$(CONFIG_ARM_GIC) += arm_gic_common.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gic.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gicv2m.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_common.o
+common-obj-$(CONFIG_ARM_GIC) += arm_gicv3.o
 common-obj-$(CONFIG_OPENPIC) += openpic.o
 
 obj-$(CONFIG_APIC) += apic.o apic_common.o
diff --git a/hw/intc/arm_gicv3.c b/hw/intc/arm_gicv3.c
new file mode 100644
index 000..b3c54ee
--- /dev/null
+++ b/hw/intc/arm_gicv3.c
@@ -0,0 +1,2044 @@
+/*
+ * ARM Generic/Distributed Interrupt Controller
+ *
+ * Copyright (c) 2006-2007 CodeSourcery.
+ * Copyright (c) 2015 Huawei.
+ * Written by Shlomo Pongratz
+ * Base on gic.c by Paul Brook
+ *
+ * This code is licensed under the GPL.
+ */
+
+/* This file contains implementation code for the GIC-500 interrupt
+ * controller, which is an implementation of the GICv3 architecture.
+ * Curently it supports up to 64 cores. Enhancmet to 128 cores requires
+ * working with bitops.
+ */
+
+#include "hw/sysbus.h"
+#include "gicv3_internal.h"
+#include "qom/cpu.h"
+
+#undef DEBUG_GICV3
+
+#ifdef DEBUG_GICV3
+#define DPRINTF(fmt, ...) \
+do { fprintf(stderr, "arm_gicv3::%s: " fmt , __func__, ## __VA_ARGS__); } 
while (0)
+#else
+#define DPRINTF(fmt, ...) do {} while(0)
+#endif
+
+/* These routines are called from cpu64.c and are defined in target-arm/cpu.h
+ * like armv7m_nvic_XXX routines.
+ * I couldn't find how to include it without compilation errors
+ */
+void armv8_gicv3_set_sgi(void *opaque, int cpuindex, uint64_t value);
+uint64_t armv8_gicv3_acknowledge_irq(void *opaque, int cpuindex,
+ MemTxAttrs attrs);
+void armv8_gicv3_complete_irq(void *opaque, int cpuindex, int irq,
+ MemTxAttrs attrs);
+uint64_t armv8_gicv3_get_priority_mask(void *opaque, int cpuindex);
+void armv8_gicv3_set_priority_mask(void *opaque, int cpuindex, uint32_t mask);
+uint64_t armv8_gicv3_get_sre(void *opaque);
+void armv8_gicv3_set_sre(void *opaque, uint64_t sre);
+uint64_t armv8_gicv3_get_igrpen1(void *opaque, int cpuindex);
+void armv8_gicv3_set_igrpen1(void *opaque, int cpuindex, uint64_t igrpen1);
+
+
+static uint32_t gicv3_acknowledge_irq(GICv3State *s, int cpu, MemTxAttrs 
attrs);
+static void gicv3_complete_irq(GICv3State *s, int cpu, int irq,
+   MemTxAttrs attrs);
+static void gicv3_update(GICv3State *s);
+static void gicv3_set_priority(GICv3State *s, int cpu, int irq, uint8_t val,
+   MemTxAttrs attrs);
+
+static const uint8_t gic_dist_ids[] = {
+0x44, 0x00, 0x00, 0x00, 0x092, 0xB4, 0x3B, 0x00, 0x0D, 0xF0, 0x05, 0xB1
+};
+
+static const uint8_t gic_redist_ids[] = {
+0x44, 0x00, 0x00, 0x00, 0x093, 0xB4, 0x3B, 0x00, 0x0D, 0xF0, 0x05, 0xB1
+};
+
+/* Cuurently no GICv2 backwards compatibility (no memory mapped regs)
+ * Uses system registers mode.
+ */
+static const int no_gicv2_bc = 1;
+static uint32_t gic_sre;
+
+#define NUM_CPU(s) ((s)->num_cpu)
+
+static inline int gic_get_current_cpu(GICv3State *s)
+{
+if (s->num_cpu > 1) {
+return current_cpu->cpu_index;
+}
+return 0;
+}
+
+/* Return true if this GIC config has interrupt groups, which is
+ * true if we're a GICv3. Keep just
+ */
+static inline bool gic_has_groups(GICv3State *s)
+{
+return 1;
+}
+
+/* TODO: Many places that call this routine could be optimized.  */
+/* Update interrupt status after enabled or pending bits have been changed.  */
+void gicv3_update(GICv3State *s)
+{
+int best_irq;
+int best_prio;
+int irq;
+int irq_level, fiq_level;
+int cpu;
+uint64_t cm;
+
+for (cpu = 0; cpu < NUM_CPU(s); cpu++) {
+cm = 1ll << cpu;
+s->current_pending[cpu] = 1023;
+if (!(s->ctlr & (GICD_CTLR_EN_GRP0 | GICD_CTLR_EN_GRP1_ALL))
+|| !s->cpu_enabled[cpu]
+|| !(s->cpu_ctlr[cpu] & (GICC_CTLR_EN_GRP0 | GICC_CTLR_EN_GRP1))) {
+qemu_irq_lower(s->parent_irq[cpu]);

[Qemu-devel] [PATCH RFC V4 3/4] target-arm/cpu64 GICv3 system instructions support

2015-09-17 Thread Shlomo Pongratz
From: Shlomo Pongratz 

Add system instructions used by the Linux (kernel) GICv3
device driver

Signed-off-by: Shlomo Pongratz 
---
 target-arm/cpu-qom.h |   1 +
 target-arm/cpu.h |  12 ++
 target-arm/cpu64.c   | 118 +++
 3 files changed, 131 insertions(+)

diff --git a/target-arm/cpu-qom.h b/target-arm/cpu-qom.h
index 25fb1ce..d0bf602 100644
--- a/target-arm/cpu-qom.h
+++ b/target-arm/cpu-qom.h
@@ -220,6 +220,7 @@ hwaddr arm_cpu_get_phys_page_debug(CPUState *cpu, vaddr 
addr);
 
 int arm_cpu_gdb_read_register(CPUState *cpu, uint8_t *buf, int reg);
 int arm_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
+void aatch64_registers_with_opaque_set(Object *obj, void *opaque);
 
 /* Callback functions for the generic timer's timers. */
 void arm_gt_ptimer_cb(void *opaque);
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 1b80516..2ece8b9 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -1033,6 +1033,18 @@ void armv7m_nvic_set_pending(void *opaque, int irq);
 int armv7m_nvic_acknowledge_irq(void *opaque);
 void armv7m_nvic_complete_irq(void *opaque, int irq);
 
+void armv8_gicv3_set_sgi(void *opaque, int cpuindex, uint64_t value);
+uint64_t armv8_gicv3_acknowledge_irq(void *opaque, int cpuindex,
+  MemTxAttrs attrs);
+void armv8_gicv3_complete_irq(void *opaque, int cpuindex, int irq,
+  MemTxAttrs attrs);
+uint64_t armv8_gicv3_get_priority_mask(void *opaque, int cpuindex);
+void armv8_gicv3_set_priority_mask(void *opaque, int cpuindex, uint32_t mask);
+uint64_t armv8_gicv3_get_sre(void *opaque);
+void armv8_gicv3_set_sre(void *opaque, uint64_t sre);
+uint64_t armv8_gicv3_get_igrpen1(void *opaque, int cpuindex);
+void armv8_gicv3_set_igrpen1(void *opaque, int cpuindex, uint64_t igrpen1);
+
 /* Interface for defining coprocessor registers.
  * Registers are defined in tables of arm_cp_reginfo structs
  * which are passed to define_arm_cp_regs().
diff --git a/target-arm/cpu64.c b/target-arm/cpu64.c
index 63c8b1c..4781f49 100644
--- a/target-arm/cpu64.c
+++ b/target-arm/cpu64.c
@@ -45,6 +45,115 @@ static uint64_t a57_a53_l2ctlr_read(CPUARMState *env, const 
ARMCPRegInfo *ri)
 }
 #endif
 
+#ifndef CONFIG_USER_ONLY
+static void sgi_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
+{
+CPUState *cpu = ENV_GET_CPU(env);
+armv8_gicv3_set_sgi(ri->opaque, cpu->cpu_index, value);
+}
+
+static uint64_t iar_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+uint64_t value;
+MemTxAttrs attrs;;
+CPUState *cpu = ENV_GET_CPU(env);
+attrs.secure = arm_is_secure_below_el3(env) ? 1 : 0;
+value = armv8_gicv3_acknowledge_irq(ri->opaque, cpu->cpu_index, attrs);
+return value;
+}
+
+static void sre_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
+{
+armv8_gicv3_set_sre(ri->opaque, value);
+}
+
+static uint64_t sre_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+uint64_t value;
+value = armv8_gicv3_get_sre(ri->opaque);
+return value;
+}
+
+static void eoir_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t 
value)
+{
+MemTxAttrs attrs;
+CPUState *cpu = ENV_GET_CPU(env);
+attrs.secure = arm_is_secure_below_el3(env) ? 1 : 0;
+armv8_gicv3_complete_irq(ri->opaque, cpu->cpu_index, value, attrs);
+}
+
+static uint64_t pmr_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+uint64_t value;
+CPUState *cpu = ENV_GET_CPU(env);
+value = armv8_gicv3_get_priority_mask(ri->opaque, cpu->cpu_index);
+return value;
+}
+
+static void pmr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
+{
+CPUState *cpu = ENV_GET_CPU(env);
+armv8_gicv3_set_priority_mask(ri->opaque, cpu->cpu_index, value);
+}
+
+static uint64_t igrpen1_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+uint64_t value;
+CPUState *cpu = ENV_GET_CPU(env);
+value = armv8_gicv3_get_igrpen1(ri->opaque, cpu->cpu_index);
+return value;
+}
+
+static void igrpen1_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t 
value)
+{
+CPUState *cpu = ENV_GET_CPU(env);
+armv8_gicv3_set_igrpen1(ri->opaque, cpu->cpu_index, value);
+}
+#endif
+
+static const ARMCPRegInfo cortex_a57_a53_cp_with_opaque_reginfo[] = {
+{ .name = "EIOR1_EL1", .state = ARM_CP_STATE_AA64,
+#ifndef CONFIG_USER_ONLY
+  .writefn = eoir_write,
+#endif
+  .opc0 = 3, .opc1 = 0, .crn = 12, .crm = 12, .opc2 = 1,
+  .access = PL1_W, .type = ARM_CP_SPECIAL, .resetvalue = 0 },
+{ .name = "IAR1_EL1", .state = ARM_CP_STATE_AA64,
+#ifndef CONFIG_USER_ONLY
+  .readfn = iar_read,
+#endif
+  .opc0 = 3, .opc1 = 0, .crn = 12, .crm = 12, .opc2 = 0,
+  .access = PL1_R, .type = ARM_CP_SPECIAL, .resetvalue = 0 },
+{ .name = "SGI1R_EL1", .state = ARM_CP_STATE_AA64,
+#ifndef CONFIG_USER_ONLY
+  .writefn = sgi_write,
+#endif
+  .o

[Qemu-devel] [PATCH RFC V4 4/4] Add virt-v3 machine that uses GIC-500

2015-09-17 Thread Shlomo Pongratz
From: Pavel Fedin 

I would like to offer this, slightly improved implementation. The key thing is 
a new
kernel_irqchip_type member in Machine class. Currently it it used only by virt 
machine for
its internal purposes, however in future it is to be passed to KVM in
kvm_irqchip_create(). The variable is defined as int in order to be 
architecture agnostic,
for potential future users.

Signed-off-by: Pavel Fedin 
---
 hw/arm/virt.c | 72 +--
 include/hw/arm/fdt.h  |  2 ++
 include/hw/arm/virt.h |  1 +
 target-arm/machine.c  |  7 -
 4 files changed, 73 insertions(+), 9 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 4d15309..4c2ae7f 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -71,6 +71,7 @@ typedef struct VirtBoardInfo {
 uint32_t clock_phandle;
 uint32_t gic_phandle;
 uint32_t v2m_phandle;
+const char *class_name;
 } VirtBoardInfo;
 
 typedef struct {
@@ -86,6 +87,7 @@ typedef struct {
 } VirtMachineState;
 
 #define TYPE_VIRT_MACHINE   "virt"
+#define TYPE_VIRTV3_MACHINE   "virt-v3"
 #define VIRT_MACHINE(obj) \
 OBJECT_CHECK(VirtMachineState, (obj), TYPE_VIRT_MACHINE)
 #define VIRT_MACHINE_GET_CLASS(obj) \
@@ -111,6 +113,7 @@ static const MemMapEntry a15memmap[] = {
 [VIRT_CPUPERIPHS] = { 0x0800, 0x0002 },
 /* GIC distributor and CPU interfaces sit inside the CPU peripheral space 
*/
 [VIRT_GIC_DIST] =   { 0x0800, 0x0001 },
+/* Note on GICv3 VIRT_GIC_DIST_SPI takes place of VIRT_GIC_CPU */
 [VIRT_GIC_CPU] ={ 0x0801, 0x0001 },
 [VIRT_GIC_V2M] ={ 0x0802, 0x1000 },
 /* The space in between here is reserved for GICv3 CPU/vCPU/HYP */
@@ -145,21 +148,26 @@ static VirtBoardInfo machines[] = {
 .cpu_model = "cortex-a15",
 .memmap = a15memmap,
 .irqmap = a15irqmap,
+.class_name = TYPE_ARM_CPU,
+
 },
 {
 .cpu_model = "cortex-a53",
 .memmap = a15memmap,
 .irqmap = a15irqmap,
+.class_name = TYPE_AARCH64_CPU,
 },
 {
 .cpu_model = "cortex-a57",
 .memmap = a15memmap,
 .irqmap = a15irqmap,
+.class_name = TYPE_AARCH64_CPU,
 },
 {
 .cpu_model = "host",
 .memmap = a15memmap,
 .irqmap = a15irqmap,
+.class_name = TYPE_ARM_CPU,
 },
 };
 
@@ -228,7 +236,6 @@ static void fdt_add_psci_node(const VirtBoardInfo *vbi)
 if (armcpu->psci_version == 2) {
 const char comp[] = "arm,psci-0.2\0arm,psci";
 qemu_fdt_setprop(fdt, "/psci", "compatible", comp, sizeof(comp));
-
 cpu_off_fn = QEMU_PSCI_0_2_FN_CPU_OFF;
 if (arm_feature(&armcpu->env, ARM_FEATURE_AARCH64)) {
 cpu_suspend_fn = QEMU_PSCI_0_2_FN64_CPU_SUSPEND;
@@ -241,7 +248,6 @@ static void fdt_add_psci_node(const VirtBoardInfo *vbi)
 }
 } else {
 qemu_fdt_setprop_string(fdt, "/psci", "compatible", "arm,psci");
-
 cpu_suspend_fn = QEMU_PSCI_0_1_FN_CPU_SUSPEND;
 cpu_off_fn = QEMU_PSCI_0_1_FN_CPU_OFF;
 cpu_on_fn = QEMU_PSCI_0_1_FN_CPU_ON;
@@ -274,6 +280,12 @@ static void fdt_add_timer_nodes(const VirtBoardInfo *vbi, 
int gictype)
 irqflags = deposit32(irqflags, GIC_FDT_IRQ_PPI_CPU_START,
  GIC_FDT_IRQ_PPI_CPU_WIDTH,
  (1 << vbi->smp_cpus) - 1);
+} else if (gictype == 3) {
+uint32_t max;
+/* Argument is 32 bit but 8 bits are reserved for flags */
+max = (vbi->smp_cpus >= 24) ? 24 : vbi->smp_cpus;
+irqflags = deposit32(irqflags, GIC_FDT_IRQ_PPI_CPU_START,
+ GICV3_FDT_IRQ_PPI_CPU_WIDTH, (1 << max) - 1);
 }
 
 qemu_fdt_add_subnode(vbi->fdt, "/timer");
@@ -434,7 +446,12 @@ static void create_gic(VirtBoardInfo *vbi, qemu_irq *pic, 
int type, bool secure)
  */
 qdev_prop_set_uint32(gicdev, "num-irq", NUM_IRQS + 32);
 if (!kvm_irqchip_in_kernel()) {
-qdev_prop_set_bit(gicdev, "has-security-extensions", secure);
+if (type == 3) {
+/* AARCH64 has 4 security levels */
+qdev_prop_set_uint8(gicdev, "security-levels", secure ? 1 : 0);
+} else {
+qdev_prop_set_bit(gicdev, "has-security-extensions", secure);
+}
 }
 qdev_init_nofail(gicdev);
 gicbusdev = SYS_BUS_DEVICE(gicdev);
@@ -445,6 +462,14 @@ static void create_gic(VirtBoardInfo *vbi, qemu_irq *pic, 
int type, bool secure)
 sysbus_mmio_map(gicbusdev, 1, vbi->memmap[VIRT_GIC_CPU].base);
 }
 
+if (type == 3) {
+/* Connect GIC to CPU */
+for (i = 0; i < smp_cpus; i++) {
+CPUState *cpu = qemu_get_cpu(i);
+aatch64_registers_with_opaque_set(OBJECT(cpu), (void *)gicdev);
+}
+}
+
 /* Wire the outputs from each CPU's generic timer to the
  * appropriate GIC PPI inputs, and the GIC's IRQ output to
  * the CPU's IRQ 

[Qemu-devel] [PATCH RFC V4 1/4] hw/intc: Implement GIC-500 support files

2015-09-17 Thread Shlomo Pongratz
From: Shlomo Pongratz 

Add files need to maintain the GIC-500 state.

Signed-off-by: Shlomo Pongratz 
---
 hw/intc/arm_gicv3_common.c | 120 ++-
 hw/intc/gicv3_internal.h   | 161 +
 include/hw/intc/arm_gicv3.h|  44 ++
 include/hw/intc/arm_gicv3_common.h |  56 -
 4 files changed, 378 insertions(+), 3 deletions(-)
 create mode 100644 hw/intc/gicv3_internal.h
 create mode 100644 include/hw/intc/arm_gicv3.h

diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
index 032ece2..4a14d57 100644
--- a/hw/intc/arm_gicv3_common.c
+++ b/hw/intc/arm_gicv3_common.c
@@ -21,6 +21,7 @@
  */
 
 #include "hw/intc/arm_gicv3_common.h"
+#include "gicv3_internal.h"
 
 static void gicv3_pre_save(void *opaque)
 {
@@ -43,11 +44,52 @@ static int gicv3_post_load(void *opaque, int version_id)
 return 0;
 }
 
+static const VMStateDescription vmstate_gicv3_irq_state = {
+.name = "arm_gicv3_irq_state",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_UINT64(pending, gicv3_irq_state),
+VMSTATE_UINT64(active, gicv3_irq_state),
+VMSTATE_UINT64(level, gicv3_irq_state),
+VMSTATE_UINT64(group, gicv3_irq_state),
+VMSTATE_BOOL(edge_trigger, gicv3_irq_state),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static const VMStateDescription vmstate_gicv3_sgi_state = {
+.name = "arm_gicv3_sgi_state",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_UINT64_ARRAY(pending, gicv3_sgi_state, GICV3_NCPU),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static const VMStateDescription vmstate_gicv3 = {
 .name = "arm_gicv3",
 .unmigratable = 1,
 .pre_save = gicv3_pre_save,
 .post_load = gicv3_post_load,
+.fields = (VMStateField[]) {
+VMSTATE_UINT32(ctlr, GICv3State),
+VMSTATE_UINT32_ARRAY(cpu_ctlr, GICv3State, GICV3_NCPU),
+VMSTATE_STRUCT_ARRAY(irq_state, GICv3State, GICV3_MAXIRQ, 1,
+ vmstate_gicv3_irq_state, gicv3_irq_state),
+VMSTATE_UINT64_ARRAY(irq_target, GICv3State, GICV3_MAXIRQ),
+VMSTATE_UINT8_2DARRAY(priority1, GICv3State, GICV3_INTERNAL, 
GICV3_NCPU),
+VMSTATE_UINT8_ARRAY(priority2, GICv3State, GICV3_MAXIRQ - 
GICV3_INTERNAL),
+VMSTATE_UINT16_2DARRAY(last_active, GICv3State, GICV3_MAXIRQ, 
GICV3_NCPU),
+VMSTATE_STRUCT_ARRAY(sgi_state, GICv3State, GICV3_NR_SGIS, 1,
+ vmstate_gicv3_sgi_state, gicv3_sgi_state),
+VMSTATE_UINT16_ARRAY(priority_mask, GICv3State, GICV3_NCPU),
+VMSTATE_UINT16_ARRAY(running_irq, GICv3State, GICV3_NCPU),
+VMSTATE_UINT16_ARRAY(running_priority, GICv3State, GICV3_NCPU),
+VMSTATE_UINT16_ARRAY(current_pending, GICv3State, GICV3_NCPU),
+VMSTATE_END_OF_LIST()
+}
 };
 
 void gicv3_init_irqs_and_mmio(GICv3State *s, qemu_irq_handler handler,
@@ -88,6 +130,7 @@ void gicv3_init_irqs_and_mmio(GICv3State *s, 
qemu_irq_handler handler,
 static void arm_gicv3_common_realize(DeviceState *dev, Error **errp)
 {
 GICv3State *s = ARM_GICV3_COMMON(dev);
+int num_irq = s->num_irq;
 
 /* revision property is actually reserved and currently used only in order
  * to keep the interface compatible with GICv2 code, avoiding extra
@@ -98,18 +141,91 @@ static void arm_gicv3_common_realize(DeviceState *dev, 
Error **errp)
 error_setg(errp, "unsupported GIC revision %d", s->revision);
 return;
 }
+if (s->num_cpu > GICV3_NCPU) {
+error_setg(errp, "requested %u CPUs exceeds GIC maximum %d",
+   s->num_cpu, GICV3_NCPU);
+return;
+}
+s->num_irq += GICV3_BASE_IRQ;
+if (s->num_irq > GICV3_MAXIRQ) {
+error_setg(errp,
+   "requested %u interrupt lines exceeds GIC maximum %d",
+   num_irq, GICV3_MAXIRQ);
+return;
+}
+/* ITLinesNumber is represented as (N / 32) - 1 (see
+ * gic_dist_readb) so this is an implementation imposed
+ * restriction, not an architectural one:
+ */
+if (s->num_irq < 32 || (s->num_irq % 32)) {
+error_setg(errp,
+   "%d interrupt lines unsupported: not divisible by 32",
+   num_irq);
+return;
+}
 }
 
 static void arm_gicv3_common_reset(DeviceState *dev)
 {
-/* TODO */
+GICv3State *s = ARM_GICV3_COMMON(dev);
+int i;
+
+/* Note num_cpu and num_irq are properties */
+
+/* Don't reset anything assigned in arm_gic_realize or any property */
+
+/* No GICv2 backwards computability support */
+for (i = 0; i < s->num_cpu; i++) {
+s->priority_mask[i] = 0;
+s->current_pending[i] = 1023;
+s->running

[Qemu-devel] [PATCH RFC V4 0/4] Implement GIC-500 from GICv3 family for arm64

2015-09-17 Thread Shlomo Pongratz
From: Shlomo Pongratz 

This patch is a first step toward 128 cores support for arm64.

At first only 64 cores are supported.
This is because largest integer type has the size of 64 bits and modifying
essential data structures in order to support 128 cores will require
the usage of bitops.

Things left to do:

Support SPI, note that this patch porpose is to enable running 64 cores using
the "virt" virtual machine.

Add support for 128 cores. This requires the usage
of bitops which requires a major rewrite.

Special thanks to Peter Crostwaite whose patch to th Linux (kernel) i.e.
Implement cpu_relax as yield solved the problem of the boot process getting
stuck for 24 cores and more.

V3->V4:
  - Rebase the patch series on Pavel Feding's "vGICv3 support" V14 patch series.
  - Replace the usage of vnic/irqchip in the CPU structure to keep the gic
object for the system instructions with the usage of the routine
define_arm_cp_regs_with_opaque() as suggested by Peter Maydell in his mail
from August the 3rd. 

V2->V3:
  - Replace my original 1 & 4 patches with Pavel's patches.
  - Add groups support to complies with new GICv2 addtions
  - Cosmetic changes.

V1->V2:
  - Split the original patch to 4 patches
  - Add SRE API to the GIC code.
  - Add call to gicv3_update to armv8_gicv3_set_priority_mask.
  - Cosmetic changes.
  - Fix number of irq when reading GICD_TYPER.


Pavel Fedin (1):
  Add virt-v3 machine that uses GIC-500

Shlomo Pongratz (3):
  hw/intc: Implement GIC-500 support files
  hw/intc: Implment GIC-500
  target-arm/cpu64 GICv3 system instructions support

 hw/arm/virt.c  |   72 +-
 hw/intc/Makefile.objs  |1 +
 hw/intc/arm_gicv3.c| 2044 
 hw/intc/arm_gicv3_common.c |  120 ++-
 hw/intc/gicv3_internal.h   |  161 +++
 include/hw/arm/fdt.h   |2 +
 include/hw/arm/virt.h  |1 +
 include/hw/intc/arm_gicv3.h|   44 +
 include/hw/intc/arm_gicv3_common.h |   56 +-
 target-arm/cpu-qom.h   |1 +
 target-arm/cpu.h   |   12 +
 target-arm/cpu64.c |  118 +++
 target-arm/machine.c   |7 +-
 13 files changed, 2627 insertions(+), 12 deletions(-)
 create mode 100644 hw/intc/arm_gicv3.c
 create mode 100644 hw/intc/gicv3_internal.h
 create mode 100644 include/hw/intc/arm_gicv3.h

-- 
1.9.1




[Qemu-devel] Invitation: Dr Velano @ Wed Sep 16, 2015 19:50 - 20:50 (shlomopongr...@gmail.com)

2015-09-12 Thread Shlomo Pongratz
BEGIN:VCALENDAR
PRODID:-//Google Inc//Google Calendar 70.9054//EN
VERSION:2.0
CALSCALE:GREGORIAN
METHOD:REQUEST
BEGIN:VEVENT
DTSTART:20150916T165000Z
DTEND:20150916T175000Z
DTSTAMP:20150912T140307Z
ORGANIZER;CN=Shlomo Pongratz:mailto:shlomopongr...@gmail.com
UID:mhfgd22vmap7kbtja2d8p70...@google.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=ACCEPTED;RSVP=TRUE
 ;CN=Shlomo Pongratz;X-NUM-GUESTS=0:mailto:shlomopongr...@gmail.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=lav...@bezeqint.net;X-NUM-GUESTS=0:mailto:lav...@bezeqint.net
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=qemu-devel@nongnu.org;X-NUM-GUESTS=0:mailto:qemu-devel@nongnu.org
CREATED:20150912T140307Z
DESCRIPTION:View your event at https://www.google.com/calendar/event?action
 =VIEW&eid=bWhmZ2QyMnZtYXA3a2J0amEyZDhwNzA4bGcgcWVtdS1kZXZlbEBub25nbnUub3Jn&
 tok=MjQjc2hsb21vcG9uZ3JhdHpAZ21haWwuY29tMjIyYzRhMzI5OGRlYzdjZTIxNGRmNWVkZmJ
 mMTY5MDFlZmY0ZDkyOQ&ctz=Asia/Jerusalem&hl=en.
LAST-MODIFIED:20150912T140307Z
LOCATION:Uri Tsvi Grinberg Street\, Tel Aviv-Yafo\, Israel 25
SEQUENCE:0
STATUS:CONFIRMED
SUMMARY:Dr Velano
TRANSP:OPAQUE
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:This is an event reminder
TRIGGER:-P1D
END:VALARM
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:This is an event reminder
TRIGGER:-P0DT1H0M0S
END:VALARM
END:VEVENT
END:VCALENDAR


invite.ics
Description: application/ics


Re: [Qemu-devel] [PATCH v13 5/5] hw/arm/virt: Add gic-version option to virt machine

2015-09-08 Thread Shlomo Pongratz
This patch doesn't apply on latest git commit i.e. 298fae389 from Sep 7.
Can you please rebase?

Best regards,

S.P.



Re: [Qemu-devel] PING: [PATCH v2 0/2] cpu_arm: Implement irqchip property for ARM CPU

2015-09-07 Thread Shlomo Pongratz
On Friday, September 4, 2015, Peter Maydell 
wrote:

> On 4 September 2015 at 07:49, Pavel Fedin  > wrote:
> >  Hi! This message is mainly for Peter. I say you reviewed my major sets,
> but looks like missed this
> > one. If it is OK, we could apply it, and i could successfully bring back
> the missing part in
> > vGICv3-enabled hw/arm/virt.c which attaches irqchip to CPUs. This would
> make us more ready for TCG
> > version of GICv3.
>
> It's not clear to me that the TCG version of GICv3
> emulation should need to have such a link. The original
> emulation patchset was definitely not handling the
> GIC-to-CPU connection in the right way, and I haven't
> seen anybody post an updated version of those patches
> which fixes it. I'd rather not add this link until
> we have the GIC emulation design sorted out and
> we know that we need it.
>
> thanks
> -- PMM
>

Hi,

First I want to apologize for been absent for such a long time.
As far as I remember Peter suggested that with the GICv3 support the
routine define_arm_cp_regs_with_opaque should be used, where the "opaque"
is the gic object.
This "opaque" is later accessed from the register info "opaque" member. I
assume the idea is to take hw/arm/pxa2xx.c as an example.

I also assume that the code that registers the system instructions should
be called from the GICv3 code as the GICv3 mode of operation determines if
the access to the GICv3 is via memory access or via system registers.
Am I right?

I hope to release a new version soon.

Best regards,

S.P.


Re: [Qemu-devel] [PATCH v7 6/6] Add gicversion option to virt machine

2015-08-03 Thread Shlomo Pongratz
On Monday, August 3, 2015, Peter Maydell  wrote:

> On 3 August 2015 at 08:19, Pavel Fedin >
> wrote:
> >  Hello!
> >
> >> >  gicdev = qdev_create(NULL, gictype);
> >> > -qdev_prop_set_uint32(gicdev, "revision", 2);
> >> > +
> >> > +for (i = 0; i < vbi->smp_cpus; i++) {
> >> > +CPUState *cpu = qemu_get_cpu(i);
> >> > +CPUARMState *env = cpu->env_ptr;
> >> > +env->nvic = gicdev;
> >> > +}
> >>
> >> We definitely need to come up with a something cleaner
> >> than this (which is ugly for two reasons
> >
> >  This could be done:
> > a) as property
> > b) as global variable because 'gicdev' is a single of its kind.
> >
> >  But, actually, this is currently only for TCG, which needs it in
> > order to forward system register accesses to GICv3 code. Would it
> > be OK if i just omit this assignment ?
>
> Yes, I think so.
>
> I'm surprised we tell the CPU about the GIC pointer for the
> system register stuff -- I was expecting that we'd give the
> GIC a CPU pointer. (We could in theory implement some
> equivalent of the architected protocol between the redistributors
> and the CPU interfaces, but I think that would be overkill.)
>
> -- PMM
>

The problem is that each function added as a system's instruction helper to
target-arm/cpu64.c has the signature of void set(CPUARMState *env,
ARMCPRegInfo *ri, uint64_t value) or uint64_t get(CPUARMState *env,
ARMCPRegInfo *ri)
I just mimicked the way armv7m_nvic_ API works.
So in a sense the CPU must be familiar with the GIC (as an opaque object of
course).

Best regards,

S.P.


Re: [Qemu-devel] [PATCH v7 2/6] Implement GIC-500 base class

2015-07-26 Thread Shlomo Pongratz
Hi,

See inline.

On Friday, July 24, 2015, Peter Maydell  wrote:

> On 24 July 2015 at 10:55, Pavel Fedin >
> wrote:
> > From: Shlomo Pongratz >
> >
> > This class is to be used by both software and KVM implementations of
> GICv3
> >
>
> > +++ b/include/hw/intc/arm_gicv3_common.h
> > @@ -0,0 +1,112 @@
> > +/*
> > + * ARM GIC support
> > + *
> > + * Copyright (c) 2012 Linaro Limited
> > + * Copyright (c) 2015 Huawei.
> > + * Written by Peter Maydell
> > + * Extended to 64 cores by Shlomo Pongratz
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License as published by
> > + * the Free Software Foundation, either version 2 of the License, or
> > + * (at your option) any later version.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> along
> > + * with this program; if not, see <http://www.gnu.org/licenses/>.
> > + */
> > +
> > +#ifndef HW_ARM_GICV3_COMMON_H
> > +#define HW_ARM_GICV3_COMMON_H
> > +
> > +#include "hw/sysbus.h"
> > +#include "hw/intc/arm_gic_common.h"
> > +
> > +/* Maximum number of possible interrupts, determined by the GIC
> architecture */
> > +#define GICV3_MAXIRQ 1020
>
> So how do LPIs work? They have IDs above 1023.
>

Currently I don't use LPI as the virt virtual machine doesn't use it. There
is no point to write something I can't test.

>
> > +#define GICV3_NCPU 64
>
> Where does '64' come from as a maximum limit?
>
The goal is 128 (the GIC500 limitation) but since the largest builtin is
uint64_t this will require the usage of bitfiled library. Therefore I
thought it should be delayed to future version after the logic part is
debugged.

>
> > +
> > +typedef struct gicv3_irq_state {
> > +/* The enable bits are only banked for per-cpu interrupts.  */
> > +uint64_t enabled;
> > +uint64_t pending;
> > +uint64_t active;
> > +uint64_t level;
> > +uint64_t group;
>
> Why are these uint64_t ?
>

These are bitfields. which were enlarged from 8 (old limit) to 64 (new
limit)

>
> > +bool edge_trigger; /* true: edge-triggered, false: level-triggered
> */
> > +} gicv3_irq_state;
> > +
> > +typedef struct gicv3_sgi_state {
> > +uint64_t pending[GICV3_NCPU];
> > +} gicv3_sgi_state;
> > +
> > +typedef struct GICv3State {
> > +/*< private >*/
> > +SysBusDevice parent_obj;
> > +/*< public >*/
> > +
> > +qemu_irq parent_irq[GICV3_NCPU];
> > +qemu_irq parent_fiq[GICV3_NCPU];
> > +/* GICD_CTLR; for a GIC with the security extensions the NS banked
> version
> > + * of this register is just an alias of bit 1 of the S banked
> version.
> > + */
> > +uint32_t ctlr;
> > +/* Sim GICC_CTLR; again, the NS banked version is just aliases of
> bits of
> > + * the S banked register, so our state only needs to store the S
> version.
> > + */
>
> "Sim" ?
>

Should be simulate as I only support system registers and not memory mapped
registers, but keeping this variable makes the code more close to GICv2
code.

>
> > +uint32_t cpu_ctlr[GICV3_NCPU];
> > +bool cpu_enabled[GICV3_NCPU];
> > +
> > +gicv3_irq_state irq_state[GICV3_MAXIRQ];
> > +uint64_t irq_target[GICV3_MAXIRQ];
> > +uint8_t priority1[GIC_INTERNAL][GICV3_NCPU];
> > +uint8_t priority2[GICV3_MAXIRQ - GIC_INTERNAL];
> > +uint16_t last_active[GICV3_MAXIRQ][GICV3_NCPU];
> > +/* For each SGI on the target CPU, we store 64 bits
> > + * indicating which source CPUs have made this SGI
> > + * pending on the target CPU. These correspond to
> > + * the bytes in the GIC_SPENDSGIR* registers as
> > + * read by the target CPU.
>
> This comment doesn't make any sense. You can't fit
> 64 bits into a byte, and GIC_SPENDSGIR only hold
> 8 set-pending bits per SGI.
>
Correct, the comment is wrong and should be removed as spi_mending was
replaced by a structure.
The usage of a structure and not by a uint64_t vector is because there is
no serialization macro for it.

>
> > + */
> > +gicv3_sgi_s

Re: [Qemu-devel] [PATCH v6 4/6] Introduce irqchip type specification for KVMis

2015-07-24 Thread Shlomo Pongratz
Sorry, weekend.
But I'll test it as soon as possible.
S.P.

On Fri, Jul 24, 2015, 10:13 Pavel Fedin  wrote:

>  Hello!
>
> > This Doesn't compile, a problem with KVM_DEV_TYPE_ARM_VGIC_V2.
> > I assume this is include file issue as it exists in
> linux-headers/linux/kvm.h
> > Note that everything should compile also for TCG only.
>
>  Damn! I think i know what the problem is... Your host arch != target arch
> and CONFIG_KVM gets disabled.
>  Can you try adding explicit #include "linux/kvm.h" in vexpress.c and
> virt.c for a quick test?
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>


Re: [Qemu-devel] [PATCH v6 4/6] Introduce irqchip type specification for KVMis

2015-07-23 Thread Shlomo Pongratz
Hi,

This Doesn't compile, a problem with KVM_DEV_TYPE_ARM_VGIC_V2.
I assume this is include file issue as it exists in
linux-headers/linux/kvm.h
Note that everything should compile also for TCG only.

Best regards,

S.P.


On Wednesday, July 22, 2015, Pavel Fedin  wrote:

> This patch introduces kernel_irqchip_type member in Machine class, which
> it passed to kvm_arch_irqchip_create. It allows machine models to specify
> correct GIC type during KVM capability verification. The variable is
> defined as int in order to be architecture-agnostic for potential future
> uses by other architectures.
>
> Signed-off-by: Pavel Fedin >
> ---
>  hw/arm/vexpress.c|  1 +
>  hw/arm/virt.c|  3 +++
>  include/hw/boards.h  |  1 +
>  include/sysemu/kvm.h |  3 ++-
>  kvm-all.c|  2 +-
>  stubs/kvm.c  |  2 +-
>  target-arm/kvm.c | 13 +++--
>  7 files changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/hw/arm/vexpress.c b/hw/arm/vexpress.c
> index da21788..0675a00 100644
> --- a/hw/arm/vexpress.c
> +++ b/hw/arm/vexpress.c
> @@ -556,6 +556,7 @@ static void vexpress_common_init(MachineState *machine)
>  const hwaddr *map = daughterboard->motherboard_map;
>  int i;
>
> +machine->kernel_irqchip_type = KVM_DEV_TYPE_ARM_VGIC_V2;
>  daughterboard->init(vms, machine->ram_size, machine->cpu_model, pic);
>
>  /*
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 4846892..2e7d858 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -945,6 +945,9 @@ static void virt_instance_init(Object *obj)
>  "Set on/off to enable/disable the ARM
> "
>  "Security Extensions (TrustZone)",
>  NULL);
> +
> +/* Default GIC type is v2 */
> +vms->parent.kernel_irqchip_type = KVM_DEV_TYPE_ARM_VGIC_V2;
>  }
>
>  static void virt_class_init(ObjectClass *oc, void *data)
> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index 2aec9cb..37eb767 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -127,6 +127,7 @@ struct MachineState {
>  char *accel;
>  bool kernel_irqchip_allowed;
>  bool kernel_irqchip_required;
> +int kernel_irqchip_type;
>  int kvm_shadow_mem;
>  char *dtb;
>  char *dumpdtb;
> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> index 983e99e..8f4d485 100644
> --- a/include/sysemu/kvm.h
> +++ b/include/sysemu/kvm.h
> @@ -434,6 +434,7 @@ void kvm_init_irq_routing(KVMState *s);
>  /**
>   * kvm_arch_irqchip_create:
>   * @KVMState: The KVMState pointer
> + * @type: irqchip type, architecture-specific
>   *
>   * Allow architectures to create an in-kernel irq chip themselves.
>   *
> @@ -441,7 +442,7 @@ void kvm_init_irq_routing(KVMState *s);
>   *0: irq chip was not created
>   *  > 0: irq chip was created
>   */
> -int kvm_arch_irqchip_create(KVMState *s);
> +int kvm_arch_irqchip_create(KVMState *s, int type);
>
>  /**
>   * kvm_set_one_reg - set a register value in KVM via KVM_SET_ONE_REG ioctl
> diff --git a/kvm-all.c b/kvm-all.c
> index 06e06f2..8df938d 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -1395,7 +1395,7 @@ static void kvm_irqchip_create(MachineState
> *machine, KVMState *s)
>
>  /* First probe and see if there's a arch-specific hook to create the
>   * in-kernel irqchip for us */
> -ret = kvm_arch_irqchip_create(s);
> +ret = kvm_arch_irqchip_create(s, machine->kernel_irqchip_type);
>  if (ret == 0) {
>  ret = kvm_vm_ioctl(s, KVM_CREATE_IRQCHIP);
>  }
> diff --git a/stubs/kvm.c b/stubs/kvm.c
> index e7c60b6..a8505ff 100644
> --- a/stubs/kvm.c
> +++ b/stubs/kvm.c
> @@ -1,7 +1,7 @@
>  #include "qemu-common.h"
>  #include "sysemu/kvm.h"
>
> -int kvm_arch_irqchip_create(KVMState *s)
> +int kvm_arch_irqchip_create(KVMState *s, int type)
>  {
>  return 0;
>  }
> diff --git a/target-arm/kvm.c b/target-arm/kvm.c
> index b278542..180f75f 100644
> --- a/target-arm/kvm.c
> +++ b/target-arm/kvm.c
> @@ -583,19 +583,28 @@ void kvm_arch_init_irq_routing(KVMState *s)
>  {
>  }
>
> -int kvm_arch_irqchip_create(KVMState *s)
> +int kvm_arch_irqchip_create(KVMState *s, int type)
>  {
>  int ret;
>
> +/* Failure here means forgotten initialization of
> + * machine->kernel_irqchip_type in model code
> + */
> +assert(type != 0);
> +
>  /* If we can create the VGIC using the newer device control API, we
>   * let the device do this when it initializes itself, otherwise we
>   * fall back to the old API */
>
> -ret = kvm_create_device(s, KVM_DEV_TYPE_ARM_VGIC_V2, true);
> +ret = kvm_create_device(s, type, true);
>  if (ret == 0) {
>  return 1;
>  }
>
> +/* Fallback will create VGIC v2 */
> +if (type != KVM_DEV_TYPE_ARM_VGIC_V2) {
> +return ret;
> +}
>  return 0;
>  }
>
> --
> 1.9.5.msysgit.0
>
>


Re: [Qemu-devel] User space vs kernel space instructions distribution.

2015-07-23 Thread Shlomo Pongratz
See inline

On Wednesday, July 22, 2015, Christopher Covington 
wrote:

> On 07/14/2015 04:45 AM, Peter Maydell wrote:
> > On 14 July 2015 at 09:32, Shlomo Pongratz  > wrote:
> >> Hi,
> >>
> >> I'm running aarm64 QEMU and I'm counting the number of instructions
> which
> >> "belong" to user space vs kernel space. My measurements shows that 99
> >> percent of instructions are in kernel space.
>
> How are you counting? Instrumenting QEMU?

I have an array of the three address spaces user/unmapped/kernel and array
of 4 els and I'm add the TB's icount to the appropriate entry according to
the env->pc and arm_current_el(env) before the block execution.
As I wrote before I disabled chaining so TB's icount is the number of
executed instructions.

>

>> I've used both the address of the instructions and the EL just to be
> sure. I
> >> also added an option to disable block chaining just to make sure that
> all
> >> the instructions in every TB  is counted.
> >> When examining some kernel's instructions against the objdump of the
> kernel
> >> I've noticed that most of them are in interrupts/timers area.
> >>
> >> Does this make sense?
> >> Did someone also encountered this phenomenon?
> >
> > Depends entirely on your workload, obviously. If the system only
> > boots then most instructions will be in kernel space. If the system
> > is only sitting idle then it'll just be executing the kernel space
> > idle loop. If you're measuring solely the section of time where
> > a userspace program is doing real work with the CPU and you're
> > still seeing a 99% figure then the obvious conclusion would be that
> > your measurement approach is wrong...
> >
> > If your measurement instrumentation is intrusive and is significantly
> > slowing down QEMU then you'll naturally find that the guest spends
> > more time in timer interrupt handling, because the timer interrupts
> > come in in real time, and you've just effectively reduced the speed
> > of your CPU, so it can get less useful work done between timer
> > interrupts.
>
> I find such behavior undesirable. As best I understand, -icount exists to
> provide an alternative, although it may have bugs. I've tinkered with
> -icount
> a little, but I have yet to come up with useful tests to verify its correct
> behavior. If anyone has suggestions for how to test it, I'd be eager to
> hear.
>
I don't use -icount.

>
> Chris
>
> --
> Qualcomm Innovation Center, Inc.
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>


[Qemu-devel] User space vs kernel space instructions distribution.

2015-07-14 Thread Shlomo Pongratz
Hi,

I'm running aarm64 QEMU and I'm counting the number of instructions which
"belong" to user space vs kernel space. My measurements shows that 99
percent of instructions are in kernel space.
I've used both the address of the instructions and the EL just to be sure.
I also added an option to disable block chaining just to make sure that all
the instructions in every TB  is counted.
When examining some kernel's instructions against the objdump of the kernel
I've noticed that most of them are in interrupts/timers area.

Does this make sense?
Did someone also encountered this phenomenon?

Best regards,

S.P.


Re: [Qemu-devel] [PATCH RFC V3 1/4] Use Aff1 with mpidr This is an improved and KVM-aware alternative to

2015-06-06 Thread Shlomo Pongratz
Hi all.
Patch #1 is actually Pavel Fedin's patch
https://qemu-devel/2015-05/msg04495.html, which I included as a replacement
to my original patch #1 "as there could be only one". I think that Pavel's
needs to address all the issues in the original thread.
Best regards.
S.P.
On Jun 4, 2015 7:18 PM, "Peter Maydell"  wrote:

> On 4 June 2015 at 17:40, Shlomo Pongratz  wrote:
> > From: Pavel Fedin 
>
> Hi. I think this patch largely makes sense, but I have some comments
> below. If you want to fix these and resend as a standalone patch
> I'm happy to apply that.
>
> > 1. MPIDR value (minus feature bits) is now directly stored in CPU
> instance.
> > 2. Default assignment is based on original rule (8 CPUs per cluster).
> > 3. If KVM is used, MPIDR values are overridden with those from KVM. This
> is
> > necessary for
> > proper KVM PSCI functioning.
> > 4. Some cosmetic changes which would make further expansion of this code
> easier.
> > 5. Added some debugging macros because CPU ID assignment is tricky part,
> and if
> > there are
> > some problems, i think there should be a quick way to make sure they are
> > correct.
> >  This does not have an RFC mark because it is perfectly okay to be
> committed
> > alone, it
> > does not break anything. Commit message follows. Cc'ed to all interested
> > parties because i
> > really hope to get things going somewhere this time.
>
> This sort of commentary belongs below the '---' line, so it doesn't
> end up in the final git commit message.
>
> > In order to support up to 128 cores with GIC-500 (GICv3 implementation)
> > affinity1 must be used. GIC-500 support up to 32 clusters with up to
> > 8 cores in a cluster. So for example, if one wishes to have 16 cores,
> > the options are: 2 clusters of 8 cores each, 4 clusters with 4 cores each
> > Currently only the first option is supported for TCG. However, KVM
> expects
> > to have the same clusters layout as host machine has. Therefore, if we
> use
> > 64-bit KVM we override CPU IDs with those provided by the KVM. For 32-bit
> > systems it is OK to leave the default because so far we do not have more
> > than 8 CPUs on any of them.
> > This implementation has a potential to be extended with some methods
> which
> > would define cluster layout instead of hardcoded CPUS_PER_CLUSTER
> > definition. This would allow to emulate real machines with different
> > layouts. However, in case of KVM we would still have to inherit IDs from
> > the host.
>
> I agree that we want to be using the same idea of MPIDR as KVM
> when KVM is enabled, certainly. But this commit message has
> a number of inaccuracies:
>  * KVM doesn't inherit IDs from the host
>  * we don't need to have the same cluster layout as the host machine
>  * this isn't specific to 64-bit VMs so we should fix 32 bits at the same
> time
>
> The bug we're actually trying to fix here is just that the
> kernel's mapping from VCPU ID to MPIDR value has changed
> from the simplistic implementation it originally had. So in
> userspace we need to read the MPIDR value back from the kernel
> rather than assuming we know the mapping. (It happens that the
> reason the kernel changed was to accommodate GICv3 and its
> restrictions on affinity values, but that's not relevant to why
> this patch is needed.)
>
> > Signed-off-by: Shlomo Pongratz 
> > Signed-off-by: Pavel Fedin 
> > ---
> >  hw/arm/virt.c|  6 +-
> >  target-arm/cpu-qom.h |  3 +++
> >  target-arm/cpu.c | 17 +
> >  target-arm/helper.c  |  9 +++--
> >  target-arm/kvm64.c   | 25 +
> >  target-arm/psci.c| 20 ++--
> >  6 files changed, 71 insertions(+), 9 deletions(-)
> >
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 05db8cb..19c8c3a 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -294,7 +294,11 @@ static void fdt_add_cpu_nodes(const VirtBoardInfo
> *vbi)
> >  "enable-method", "psci");
> >  }
> >
> > -qemu_fdt_setprop_cell(vbi->fdt, nodename, "reg", cpu);
> > +/*
> > + * If cpus node's #address-cells property is set to 1
> > + * The reg cell bits [23:0] must be set to bits [23:0] of
> MPIDR_EL1.
> > + */
>
> This comment is either obvious or misleading depending on your
> point of view.
>
> > +qemu_fdt_setprop_cell(vbi->fdt, nodename, "reg", armcpu-&

[Qemu-devel] [PATCH RFC V3 2/4] Implment GIC-500

2015-06-04 Thread Shlomo Pongratz
From: Shlomo Pongratz 

Implement GIC-500 from GICv3 family for arm64

This patch is a first step toward 128 cores support for arm64.

At first only 64 cores are supported for two reasons:
First the largest integer type has the size of 64 bits and modifying
essential data structures in order to support 128 cores will require
the usage of bitops.
Second currently the Linux (kernel) can be configured to support
up to 64 cores thus there is no urgency with 128 cores support.

Things left to do:

Currently the booting Linux may got stuck. The probability of getting stuck
increases with the number of cores. I'll appreciate core review.

Signed-off-by: Shlomo Pongratz 
---
 hw/intc/Makefile.objs  |2 +
 hw/intc/arm_gicv3.c| 2086 
 hw/intc/arm_gicv3_common.c |  217 
 hw/intc/gicv3_internal.h   |  161 +++
 include/hw/intc/arm_gicv3.h|   44 +
 include/hw/intc/arm_gicv3_common.h |  119 ++
 6 files changed, 2629 insertions(+)
 create mode 100644 hw/intc/arm_gicv3.c
 create mode 100644 hw/intc/arm_gicv3_common.c
 create mode 100644 hw/intc/gicv3_internal.h
 create mode 100644 include/hw/intc/arm_gicv3.h
 create mode 100644 include/hw/intc/arm_gicv3_common.h

diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
index 843864a..41fe9ec 100644
--- a/hw/intc/Makefile.objs
+++ b/hw/intc/Makefile.objs
@@ -11,6 +11,8 @@ common-obj-$(CONFIG_SLAVIO) += slavio_intctl.o
 common-obj-$(CONFIG_IOAPIC) += ioapic_common.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gic_common.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gic.o
+common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_common.o
+common-obj-$(CONFIG_ARM_GIC) += arm_gicv3.o
 common-obj-$(CONFIG_OPENPIC) += openpic.o
 
 obj-$(CONFIG_APIC) += apic.o apic_common.o
diff --git a/hw/intc/arm_gicv3.c b/hw/intc/arm_gicv3.c
new file mode 100644
index 000..477be21
--- /dev/null
+++ b/hw/intc/arm_gicv3.c
@@ -0,0 +1,2086 @@
+/*
+ * ARM Generic/Distributed Interrupt Controller
+ *
+ * Copyright (c) 2006-2007 CodeSourcery.
+ * Copyright (c) 2015 Huawei.
+ * Written by Shlomo Pongratz
+ * Base on gic.c by Paul Brook
+ *
+ * This code is licensed under the GPL.
+ */
+
+/* This file contains implementation code for the GIC-500 interrupt
+ * controller, which is an implementation of the GICv3 architecture.
+ * Curently it supports up to 64 cores. Enhancmet to 128 cores requires
+ * working with bitops.
+ */
+
+#include "hw/sysbus.h"
+#include "gicv3_internal.h"
+#include "qom/cpu.h"
+
+#undef DEBUG_GICV3
+
+#ifdef DEBUG_GICV3
+#define DPRINTF(fmt, ...) \
+do { fprintf(stderr, "arm_gicv3::%s: " fmt , __func__, ## __VA_ARGS__); } 
while (0)
+#else
+#define DPRINTF(fmt, ...) do {} while(0)
+#endif
+
+/* These routines are called from cpu64.c and are defined in target-arm/cpu.h
+ * like armv7m_nvic_XXX routines.
+ * I couldn't find how to include it without compilation errors
+ */
+void armv8_gicv3_set_sgi(void *opaque, int cpuindex, uint64_t value);
+uint64_t armv8_gicv3_acknowledge_irq(void *opaque, int cpuindex,
+ MemTxAttrs attrs);
+void armv8_gicv3_complete_irq(void *opaque, int cpuindex, int irq,
+ MemTxAttrs attrs);
+uint64_t armv8_gicv3_get_priority_mask(void *opaque, int cpuindex);
+void armv8_gicv3_set_priority_mask(void *opaque, int cpuindex, uint32_t mask);
+uint64_t armv8_gicv3_get_sre(void *opaque);
+void armv8_gicv3_set_sre(void *opaque, uint64_t sre);
+uint64_t armv8_gicv3_get_igrpen1(void *opaque, int cpuindex);
+void armv8_gicv3_set_igrpen1(void *opaque, int cpuindex, uint64_t igrpen1);
+
+
+static uint32_t gicv3_acknowledge_irq(GICv3State *s, int cpu, MemTxAttrs 
attrs);
+static void gicv3_complete_irq(GICv3State *s, int cpu, int irq,
+   MemTxAttrs attrs);
+static void gicv3_update(GICv3State *s);
+static void gicv3_init_irqs_and_distributor(GICv3State *s, int num_irq);
+static void gicv3_set_priority(GICv3State *s, int cpu, int irq, uint8_t val,
+   MemTxAttrs attrs);
+
+static const uint8_t gic_dist_ids[] = {
+0x44, 0x00, 0x00, 0x00, 0x092, 0xB4, 0x3B, 0x00, 0x0D, 0xF0, 0x05, 0xB1
+};
+
+static const uint8_t gic_redist_ids[] = {
+0x44, 0x00, 0x00, 0x00, 0x093, 0xB4, 0x3B, 0x00, 0x0D, 0xF0, 0x05, 0xB1
+};
+
+/* Cuurently no GICv2 backwards compatibility (no memory mapped regs)
+ * Uses system registers mode.
+ */
+static const int no_gicv2_bc = 1;
+static uint32_t gic_sre;
+
+#define NUM_CPU(s) ((s)->num_cpu)
+
+static inline int gic_get_current_cpu(GICv3State *s)
+{
+if (s->num_cpu > 1) {
+return current_cpu->cpu_index;
+}
+return 0;
+}
+
+/* Return true if this GIC config has interrupt groups, which is
+ * true if we're a GICv3. Keep just
+ */
+static inline bool gic_has_groups(GICv3State *s)
+{
+return 1;
+}
+
+/* TODO: Many places that call this routine could be optim

[Qemu-devel] [PATCH RFC V3 3/4] GICv3 support

2015-06-04 Thread Shlomo Pongratz
From: Shlomo Pongratz 

Add system instructions used by the Linux (kernel) GICv3
device driver

Signed-off-by: Shlomo Pongratz 
---
 target-arm/cpu.h   |  12 ++
 target-arm/cpu64.c | 105 +
 2 files changed, 117 insertions(+)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 21b5b8e..810490d 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -1013,6 +1013,18 @@ void armv7m_nvic_set_pending(void *opaque, int irq);
 int armv7m_nvic_acknowledge_irq(void *opaque);
 void armv7m_nvic_complete_irq(void *opaque, int irq);
 
+void armv8_gicv3_set_sgi(void *opaque, int cpuindex, uint64_t value);
+uint64_t armv8_gicv3_acknowledge_irq(void *opaque, int cpuindex,
+  MemTxAttrs attrs);
+void armv8_gicv3_complete_irq(void *opaque, int cpuindex, int irq,
+  MemTxAttrs attrs);
+uint64_t armv8_gicv3_get_priority_mask(void *opaque, int cpuindex);
+void armv8_gicv3_set_priority_mask(void *opaque, int cpuindex, uint32_t mask);
+uint64_t armv8_gicv3_get_sre(void *opaque);
+void armv8_gicv3_set_sre(void *opaque, uint64_t sre);
+uint64_t armv8_gicv3_get_igrpen1(void *opaque, int cpuindex);
+void armv8_gicv3_set_igrpen1(void *opaque, int cpuindex, uint64_t igrpen1);
+
 /* Interface for defining coprocessor registers.
  * Registers are defined in tables of arm_cp_reginfo structs
  * which are passed to define_arm_cp_regs().
diff --git a/target-arm/cpu64.c b/target-arm/cpu64.c
index bf7dd68..2bce6f6 100644
--- a/target-arm/cpu64.c
+++ b/target-arm/cpu64.c
@@ -45,6 +45,72 @@ static uint64_t a57_a53_l2ctlr_read(CPUARMState *env, const 
ARMCPRegInfo *ri)
 }
 #endif
 
+#ifndef CONFIG_USER_ONLY
+static void sgi_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
+{
+CPUState *cpu = ENV_GET_CPU(env);
+armv8_gicv3_set_sgi(env->nvic, cpu->cpu_index, value);
+}
+
+static uint64_t iar_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+uint64_t value;
+MemTxAttrs attrs;;
+CPUState *cpu = ENV_GET_CPU(env);
+attrs.secure = arm_is_secure_below_el3(env) ? 1 : 0;
+value = armv8_gicv3_acknowledge_irq(env->nvic, cpu->cpu_index, attrs);
+return value;
+}
+
+static void sre_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
+{
+armv8_gicv3_set_sre(env->nvic, value);
+}
+
+static uint64_t sre_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+uint64_t value;
+value = armv8_gicv3_get_sre(env->nvic);
+return value;
+}
+
+static void eoir_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t 
value)
+{
+MemTxAttrs attrs;
+CPUState *cpu = ENV_GET_CPU(env);
+attrs.secure = arm_is_secure_below_el3(env) ? 1 : 0;
+armv8_gicv3_complete_irq(env->nvic, cpu->cpu_index, value, attrs);
+}
+
+static uint64_t pmr_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+uint64_t value;
+CPUState *cpu = ENV_GET_CPU(env);
+value = armv8_gicv3_get_priority_mask(env->nvic, cpu->cpu_index);
+return value;
+}
+
+static void pmr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
+{
+CPUState *cpu = ENV_GET_CPU(env);
+armv8_gicv3_set_priority_mask(env->nvic, cpu->cpu_index, value);
+}
+
+static uint64_t igrpen1_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+uint64_t value;
+CPUState *cpu = ENV_GET_CPU(env);
+value = armv8_gicv3_get_igrpen1(env->nvic, cpu->cpu_index);
+return value;
+}
+
+static void igrpen1_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t 
value)
+{
+CPUState *cpu = ENV_GET_CPU(env);
+armv8_gicv3_set_igrpen1(env->nvic, cpu->cpu_index, value);
+}
+#endif
+
 static const ARMCPRegInfo cortex_a57_a53_cp_reginfo[] = {
 #ifndef CONFIG_USER_ONLY
 { .name = "L2CTLR_EL1", .state = ARM_CP_STATE_AA64,
@@ -89,6 +155,45 @@ static const ARMCPRegInfo cortex_a57_a53_cp_reginfo[] = {
 { .name = "L2MERRSR",
   .cp = 15, .opc1 = 3, .crm = 15,
   .access = PL1_RW, .type = ARM_CP_CONST | ARM_CP_64BIT, .resetvalue = 0 },
+{ .name = "EIOR1_EL1", .state = ARM_CP_STATE_AA64,
+#ifndef CONFIG_USER_ONLY
+  .writefn = eoir_write,
+#endif
+  .opc0 = 3, .opc1 = 0, .crn = 12, .crm = 12, .opc2 = 1,
+  .access = PL1_W, .type = ARM_CP_SPECIAL, .resetvalue = 0 },
+{ .name = "IAR1_EL1", .state = ARM_CP_STATE_AA64,
+#ifndef CONFIG_USER_ONLY
+  .readfn = iar_read,
+#endif
+  .opc0 = 3, .opc1 = 0, .crn = 12, .crm = 12, .opc2 = 0,
+  .access = PL1_R, .type = ARM_CP_SPECIAL, .resetvalue = 0 },
+{ .name = "SGI1R_EL1", .state = ARM_CP_STATE_AA64,
+#ifndef CONFIG_USER_ONLY
+  .writefn = sgi_write,
+#endif
+  .opc0 = 3, .opc1 = 0, .crn = 12, .crm = 11, .opc2 = 5,
+  .access = PL1_RW, .type = ARM_CP_SPECIAL, .resetvalue = 0 },
+{ .name = "PMR_EL1", .state = ARM_CP_STATE_AA64,
+  .opc0 = 3, .opc1 = 0, .crn = 4, .crm = 6, .opc2 = 0,
+#ifndef CONFIG_USER_O

[Qemu-devel] [PATCH RFC V3 0/4] Implement GIC-500 from GICv3 family for arm64

2015-06-04 Thread Shlomo Pongratz
From: Shlomo Pongratz 

This patch is a first step toward 128 cores support for arm64.

At first only 64 cores are supported for two reasons:
First the largest integer type has the size of 64 bits and modifying
essential data structures in order to support 128 cores will require
the usage of bitops.
Second currently the Linux (kernel) can be configured to support
up to 64 cores thus there is no urgency with 128 cores support.

Things left to do:

There is a need to support flexible clusters size. The GIC-500 can support
up to 128 cores, up to 32 clusters and up to 8 cores is a cluster.
So for example, if one wishes to have 16 cores, the options are:
2 clusters of 8 cores each, 4 clusters with 4 cores each
Currently only the first option is supported.
There is an issue of passing clock affinity to via the dtb. In the dtb

interrupt section there are only 24 bit left to affinity since the
variable is a 32 bit entity and 8 bits are reserved for flags.
See Documentation/devicetree/bindings/arm/arch_timer.txt.
Note that this issue is not seems to be critical as when checking
/proc/irq/3/smp_affinity with 32 cores all 32 bits are one.

The last issue is to add support for 128 cores. This requires the usage
of bitops and currently can be tested up to 64 cores.

Special thanks to Peter Crostwaite whose patch to th Linux (kernel) i.e.
Implement cpu_relax as yield solved the problem of the boot process getting
stuck for 24 cores and more.

V2:
  - Split the original patch to 4 patches
  - Add SRE API to the GIC code.
  - Add call to gicv3_update to armv8_gicv3_set_priority_mask.
  - Cosmetic changes.
  - Fix number of irq when reading GICD_TYPER.

V3:
  - Replace my original 1 & 4 patches with Pavel's patches.
  - Add groups support to complies with new GICv2 addtions
  - Cosmetic changes.


Pavel Fedin (2):
  Use Aff1 with mpidr This is an improved and KVM-aware alternative to
  Add virt-v3 machine that uses GIC-500

Shlomo Pongratz (2):
  Implment GIC-500
  GICv3 support

 hw/arm/virt.c  |  165 ++-
 hw/intc/Makefile.objs  |2 +
 hw/intc/arm_gicv3.c| 2086 
 hw/intc/arm_gicv3_common.c |  217 
 hw/intc/gicv3_internal.h   |  161 +++
 include/hw/arm/virt.h  |4 +
 include/hw/boards.h|1 +
 include/hw/intc/arm_gicv3.h|   44 +
 include/hw/intc/arm_gicv3_common.h |  119 ++
 target-arm/cpu-qom.h   |3 +
 target-arm/cpu.c   |   17 +
 target-arm/cpu.h   |   12 +
 target-arm/cpu64.c |  105 ++
 target-arm/helper.c|9 +-
 target-arm/kvm64.c |   25 +
 target-arm/psci.c  |   20 +-
 16 files changed, 2952 insertions(+), 38 deletions(-)
 create mode 100644 hw/intc/arm_gicv3.c
 create mode 100644 hw/intc/arm_gicv3_common.c
 create mode 100644 hw/intc/gicv3_internal.h
 create mode 100644 include/hw/intc/arm_gicv3.h
 create mode 100644 include/hw/intc/arm_gicv3_common.h

-- 
1.9.1




[Qemu-devel] [PATCH RFC V3 1/4] Use Aff1 with mpidr This is an improved and KVM-aware alternative to

2015-06-04 Thread Shlomo Pongratz
From: Pavel Fedin 

1. MPIDR value (minus feature bits) is now directly stored in CPU instance.
2. Default assignment is based on original rule (8 CPUs per cluster).
3. If KVM is used, MPIDR values are overridden with those from KVM. This is
necessary for
proper KVM PSCI functioning.
4. Some cosmetic changes which would make further expansion of this code easier.
5. Added some debugging macros because CPU ID assignment is tricky part, and if
there are
some problems, i think there should be a quick way to make sure they are
correct.
 This does not have an RFC mark because it is perfectly okay to be committed
alone, it
does not break anything. Commit message follows. Cc'ed to all interested
parties because i
really hope to get things going somewhere this time.

In order to support up to 128 cores with GIC-500 (GICv3 implementation)
affinity1 must be used. GIC-500 support up to 32 clusters with up to
8 cores in a cluster. So for example, if one wishes to have 16 cores,
the options are: 2 clusters of 8 cores each, 4 clusters with 4 cores each
Currently only the first option is supported for TCG. However, KVM expects
to have the same clusters layout as host machine has. Therefore, if we use
64-bit KVM we override CPU IDs with those provided by the KVM. For 32-bit
systems it is OK to leave the default because so far we do not have more
than 8 CPUs on any of them.
This implementation has a potential to be extended with some methods which
would define cluster layout instead of hardcoded CPUS_PER_CLUSTER
definition. This would allow to emulate real machines with different
layouts. However, in case of KVM we would still have to inherit IDs from
the host.

Signed-off-by: Shlomo Pongratz 
Signed-off-by: Pavel Fedin 
---
 hw/arm/virt.c|  6 +-
 target-arm/cpu-qom.h |  3 +++
 target-arm/cpu.c | 17 +
 target-arm/helper.c  |  9 +++--
 target-arm/kvm64.c   | 25 +
 target-arm/psci.c| 20 ++--
 6 files changed, 71 insertions(+), 9 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 05db8cb..19c8c3a 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -294,7 +294,11 @@ static void fdt_add_cpu_nodes(const VirtBoardInfo *vbi)
 "enable-method", "psci");
 }
 
-qemu_fdt_setprop_cell(vbi->fdt, nodename, "reg", cpu);
+/*
+ * If cpus node's #address-cells property is set to 1
+ * The reg cell bits [23:0] must be set to bits [23:0] of MPIDR_EL1.
+ */
+qemu_fdt_setprop_cell(vbi->fdt, nodename, "reg", armcpu->mpidr);
 g_free(nodename);
 }
 }
diff --git a/target-arm/cpu-qom.h b/target-arm/cpu-qom.h
index ed5a644..a382a09 100644
--- a/target-arm/cpu-qom.h
+++ b/target-arm/cpu-qom.h
@@ -159,6 +159,7 @@ typedef struct ARMCPU {
 uint64_t id_aa64mmfr1;
 uint32_t dbgdidr;
 uint32_t clidr;
+uint64_t mpidr; /* Without feature bits */
 /* The elements of this array are the CCSIDR values for each cache,
  * in the order L1DCache, L1ICache, L2DCache, L2ICache, etc.
  */
@@ -171,6 +172,8 @@ typedef struct ARMCPU {
 uint64_t rvbar;
 } ARMCPU;
 
+#define CPUS_PER_CLUSTER 8
+
 #define TYPE_AARCH64_CPU "aarch64-cpu"
 #define AARCH64_CPU_CLASS(klass) \
 OBJECT_CLASS_CHECK(AArch64CPUClass, (klass), TYPE_AARCH64_CPU)
diff --git a/target-arm/cpu.c b/target-arm/cpu.c
index 4a888ab..7272466 100644
--- a/target-arm/cpu.c
+++ b/target-arm/cpu.c
@@ -31,6 +31,12 @@
 #include "sysemu/kvm.h"
 #include "kvm_arm.h"
 
+#ifdef DEBUG
+# define DPRINTF(format, ...) printf("armcpu: " format , ## __VA_ARGS__)
+#else
+# define DPRINTF(format, ...) do { } while (0)
+#endif
+
 static void arm_cpu_set_pc(CPUState *cs, vaddr value)
 {
 ARMCPU *cpu = ARM_CPU(cs);
@@ -388,12 +394,23 @@ static void arm_cpu_initfn(Object *obj)
 CPUState *cs = CPU(obj);
 ARMCPU *cpu = ARM_CPU(obj);
 static bool inited;
+uint32_t Aff1, Aff0;
 
 cs->env_ptr = &cpu->env;
 cpu_exec_init(&cpu->env);
 cpu->cp_regs = g_hash_table_new_full(g_int_hash, g_int_equal,
  g_free, g_free);
 
+/*
+ * We don't support setting cluster ID ([16..23]) (known as Aff2
+ * in later ARM ARM versions), or any of the higher affinity level fields,
+ * so these bits always RAZ.
+ */
+Aff1 = cs->cpu_index / CPUS_PER_CLUSTER;
+Aff0 = cs->cpu_index % CPUS_PER_CLUSTER;
+cpu->mpidr = (Aff1 << 8) | Aff0;
+DPRINTF("Init(%p): index %d mpidr 0x%jX\n", cpu, cs->cpu_index, 
cpu->mpidr);
+
 #ifndef CONFIG_USER_ONLY
 /* Our inbound IRQ and FIQ lines */
 if (kvm_enabled()) {
diff --git a/target-arm/helper.c b/target-arm/helper.c
index 1cc4993..99c7a30 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -2063,12 +2063,9 @@ static const ARM

[Qemu-devel] [PATCH RFC V3 4/4] Add virt-v3 machine that uses GIC-500

2015-06-04 Thread Shlomo Pongratz
From: Pavel Fedin 

I would like to offer this, slightly improved implementation. The key thing is 
a new
kernel_irqchip_type member in Machine class. Currently it it used only by virt 
machine for
its internal purposes, however in future it is to be passed to KVM in
kvm_irqchip_create(). The variable is defined as int in order to be 
architecture agnostic,
for potential future users.

Signed-off-by: Pavel Fedin 
---
 hw/arm/virt.c | 159 +-
 include/hw/arm/virt.h |   4 ++
 include/hw/boards.h   |   1 +
 3 files changed, 135 insertions(+), 29 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 19c8c3a..22a39b4 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -45,6 +45,7 @@
 #include "qemu/error-report.h"
 #include "hw/pci-host/gpex.h"
 #include "hw/arm/virt-acpi-build.h"
+#include "linux/kvm.h" /* For KVM_DEV_TYPE_ARM_VGIC_V{2|3} */
 
 /* Number of external interrupt lines to configure the GIC with */
 #define NUM_IRQS 128
@@ -58,7 +59,7 @@
 #define GIC_FDT_IRQ_FLAGS_LEVEL_LO 8
 
 #define GIC_FDT_IRQ_PPI_CPU_START 8
-#define GIC_FDT_IRQ_PPI_CPU_WIDTH 8
+#define GIC_FDT_IRQ_PPI_CPU_WIDTH 24
 
 typedef struct VirtBoardInfo {
 struct arm_boot_info bootinfo;
@@ -69,6 +70,7 @@ typedef struct VirtBoardInfo {
 void *fdt;
 int fdt_size;
 uint32_t clock_phandle;
+const char *class_name;
 } VirtBoardInfo;
 
 typedef struct {
@@ -82,6 +84,7 @@ typedef struct {
 } VirtMachineState;
 
 #define TYPE_VIRT_MACHINE   "virt"
+#define TYPE_VIRTV3_MACHINE   "virt-v3"
 #define VIRT_MACHINE(obj) \
 OBJECT_CHECK(VirtMachineState, (obj), TYPE_VIRT_MACHINE)
 #define VIRT_MACHINE_GET_CLASS(obj) \
@@ -103,20 +106,24 @@ typedef struct {
  */
 static const MemMapEntry a15memmap[] = {
 /* Space up to 0x800 is reserved for a boot ROM */
-[VIRT_FLASH] =  {  0, 0x0800 },
-[VIRT_CPUPERIPHS] = { 0x0800, 0x0002 },
+[VIRT_FLASH] =   {  0, 0x0800 },
+[VIRT_CPUPERIPHS] =  { 0x0800, 0x0002 },
 /* GIC distributor and CPU interfaces sit inside the CPU peripheral space 
*/
-[VIRT_GIC_DIST] =   { 0x0800, 0x0001 },
-[VIRT_GIC_CPU] ={ 0x0801, 0x0001 },
-[VIRT_UART] =   { 0x0900, 0x1000 },
-[VIRT_RTC] ={ 0x0901, 0x1000 },
-[VIRT_FW_CFG] = { 0x0902, 0x000a },
-[VIRT_MMIO] =   { 0x0a00, 0x0200 },
+[VIRT_GIC_DIST] ={ 0x0800, 0x0001 },
+[VIRT_GIC_CPU] = { 0x0801, 0x0001 },
+/* On v3 VIRT_GIC_DIST_SPI takes place of VIRT_GIC_CPU */
+[VIRT_ITS_CONTROL] = { 0x0802, 0x0001 },
+[VIRT_ITS_TRANSLATION] = { 0x0803, 0x0001 },
+[VIRT_LPI] = { 0x0804, 0x0080 },
+[VIRT_UART] ={ 0x0900, 0x1000 },
+[VIRT_RTC] = { 0x0901, 0x1000 },
+[VIRT_FW_CFG] =  { 0x0902, 0x000a },
+[VIRT_MMIO] ={ 0x0a00, 0x0200 },
 /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
-[VIRT_PCIE_MMIO] =  { 0x1000, 0x2eff },
-[VIRT_PCIE_PIO] =   { 0x3eff, 0x0001 },
-[VIRT_PCIE_ECAM] =  { 0x3f00, 0x0100 },
-[VIRT_MEM] ={ 0x4000, 30ULL * 1024 * 1024 * 1024 },
+[VIRT_PCIE_MMIO] =   { 0x1000, 0x2eff },
+[VIRT_PCIE_PIO] ={ 0x3eff, 0x0001 },
+[VIRT_PCIE_ECAM] =   { 0x3f00, 0x0100 },
+[VIRT_MEM] = { 0x4000, 30ULL * 1024 * 1024 * 1024 },
 };
 
 static const int a15irqmap[] = {
@@ -131,16 +138,25 @@ static VirtBoardInfo machines[] = {
 .cpu_model = "cortex-a15",
 .memmap = a15memmap,
 .irqmap = a15irqmap,
+.class_name = TYPE_ARM_CPU,
+},
+{
+.cpu_model = "cortex-a53",
+.memmap = a15memmap,
+.irqmap = a15irqmap,
+.class_name = TYPE_AARCH64_CPU,
 },
 {
 .cpu_model = "cortex-a57",
 .memmap = a15memmap,
 .irqmap = a15irqmap,
+.class_name = TYPE_AARCH64_CPU,
 },
 {
 .cpu_model = "host",
 .memmap = a15memmap,
 .irqmap = a15irqmap,
+.class_name = TYPE_ARM_CPU,
 },
 };
 
@@ -209,7 +225,6 @@ static void fdt_add_psci_node(const VirtBoardInfo *vbi)
 if (armcpu->psci_version == 2) {
 const char comp[] = "arm,psci-0.2\0arm,psci";
 qemu_fdt_setprop(fdt, "/psci", "compatible", comp, sizeof(comp));
-
 cpu_off_fn = QEMU_PSCI_0_2_FN_CPU_OFF;
 if (arm_feature(&armcpu->env, ARM_FEATURE_AARCH64)) {
 cpu_suspend_fn = QEMU_PSCI_0_2_FN64_CPU_SUSPEND;
@@ -222,7 +237,6 @@ static void fdt_add_psci_node(const VirtBoardInfo *vbi)
 }
 } else {
 qemu_fdt_setprop_string(fdt, "/psci", "compatible", "arm,psci");
-
 cpu_suspend_fn = QEMU_PSCI_0_1_FN_CPU_SUSPEND;
 cpu_off_fn = QEMU_PSCI_0_1_FN_CPU_OFF;
 

Re: [Qemu-devel] [PATCH] Use Aff1 with mpidr

2015-06-01 Thread Shlomo Pongratz
Hi

> -Original Message-
> From: Pavel Fedin [mailto:p.fe...@samsung.com]
> Sent: Monday, 01 June, 2015 12:04 PM
> To: Shlomo Pongratz; qemu-devel@nongnu.org
> Cc: peter.mayd...@linaro.org; 'Ashok Kumar'
> Subject: RE: [Qemu-devel] [PATCH] Use Aff1 with mpidr
> 
>  Hi again!
> 
> > I intend to release the V3 version during this week.
> 
>  One more note. I remember that i suggested to merge ITS_CONTROL and
> ITS_TRANSLATION into one region in virt memory map. I was wrong, don't do
> this and leave them as they are.
> There is in-kernel ITS implementation suggested on ARM-Linux mailing list,
> and qemu support will rely on this (it will have to pass CONTROL region to
> KVM and intercept TRANSLATION region).
> 

Hi Pavel,

You mean that I put your patch 
https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg04495.html
The first patch of my V3 patch series, with the sole exception of fixing the 
ITS control & ITS translation.

Best regards,

S.P.


> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia




Re: [Qemu-devel] [PATCH RFC V2 1/4] Use Aff1 with mpidr

2015-06-01 Thread Shlomo Pongratz


> -Original Message-
> From: Igor Mammedov [mailto:imamm...@redhat.com]
> Sent: Wednesday, 27 May, 2015 7:12 PM
> To: shlomopongr...@gmail.com
> Cc: qemu-devel@nongnu.org; peter.mayd...@linaro.org; Shlomo Pongratz
> Subject: Re: [Qemu-devel] [PATCH RFC V2 1/4] Use Aff1 with mpidr
> 
> On Wed,  6 May 2015 17:04:39 +0300
> shlomopongr...@gmail.com wrote:
> 
> > From: Shlomo Pongratz 
> >
> > In order to support up to 128 cores with GIC-500 (GICv3
> > implementation)
> > affinity1 must be used. GIC-500 support up to 32 clusters with up to
> > 8 cores in a cluster. So for example, if one wishes to have 16 cores,
> > the options are: 2 clusters of 8 cores each, 4 clusters with 4 cores
> > each Currently only the first option is supported.
> > In order to have more flexible scheme the virt machine must pass the
> > desired scheme.
> >
> > Signed-off-by: Shlomo Pongratz 
> > ---
> >  target-arm/helper.c | 12 ++--
> >  target-arm/psci.c   | 18 --
> >  2 files changed, 26 insertions(+), 4 deletions(-)
> >
> > diff --git a/target-arm/helper.c b/target-arm/helper.c index
> > f8f8d76..f555c0b 100644
> > --- a/target-arm/helper.c
> > +++ b/target-arm/helper.c
> > @@ -2035,11 +2035,19 @@ static const ARMCPRegInfo
> > strongarm_cp_reginfo[] = {  static uint64_t mpidr_read(CPUARMState
> > *env, const ARMCPRegInfo *ri)  {
> >  CPUState *cs = CPU(arm_env_get_cpu(env));
> > -uint32_t mpidr = cs->cpu_index;
> > -/* We don't support setting cluster ID ([8..11]) (known as Aff1
> > +uint32_t mpidr, aff0, aff1;
> > +uint32_t cpuid = cs->cpu_index;
> > +/* We don't support setting cluster ID ([16..23]) (known as Aff2
> >   * in later ARM ARM versions), or any of the higher affinity level 
> > fields,
> >   * so these bits always RAZ.
> >   */
> > +/* Currently GIC-500 code supports 64 cores in 16 clusters with 8 cores
> each
> > + * Future code will remove this limitation.
> > + * This code is valid for GIC-400 too.
> > + */
> > +aff0 = cpuid % 8;
> Even though GIC-500 spec says that it supports 8 cores I haven't found in it
> how it encodes aff0 but encoding is specified by respective CPU specs. I've
> checked a53, a57, a72 specs and they limit aff0 to max 0x3 value.
> I think encoding should be CPU type specific i.e. not defined by what GIC can
> support and once we add CPU type with 8 cores, it would provide it's own
> version of mpidr_read since it would be defined by spec how to encode aff0.

Assume we let the processor select its mpidr than for cortex-57 it will select 
0-3 now how a system with GICv2 with 8 cores will work?
Note that GICv2 that doesn't affinity levels. In the GIC-500 it is written that 
with GICv2 backwards compatibility " This mode supports up to eight cores. The 
eight cores supported 
by the GIC-500 are the cores with the lowest affinity numbers". So all 8 cores 
must have aff0 0 to 7.
It is obvious that with GICv3 if we have 2 cortex-a57 we should have affinities 
0.0.0.{0-3} and 0.0.1.{0-3} but with GICv2 we should have 0.0.0.{0-7}.
Now the only component that knows which GIC is used is the virt, it can set a 
property telling the GIC and the system what is the size of the lowest level 
affinity.
I think this number is the upper power of two of the SMP number divided by the 
(GICD_TYPER.CPUNumber + 1) but I'm not sure.

Best regards,

S.P.

> 
> > +aff1 = cpuid / 8;
> > +mpidr = (aff1 << 8) | aff0;
> >  if (arm_feature(env, ARM_FEATURE_V7MP)) {
> >  mpidr |= (1U << 31);
> >  /* Cores which are uniprocessor (non-coherent) diff --git
> > a/target-arm/psci.c b/target-arm/psci.c index d8fafab..64fbe61 100644
> > --- a/target-arm/psci.c
> > +++ b/target-arm/psci.c
> > @@ -86,6 +86,7 @@ void arm_handle_psci_call(ARMCPU *cpu)
> >  CPUARMState *env = &cpu->env;
> >  uint64_t param[4];
> >  uint64_t context_id, mpidr;
> > +uint32_t core, Aff1, Aff0;
> >  target_ulong entry;
> >  int32_t ret = 0;
> >  int i;
> > @@ -121,7 +122,16 @@ void arm_handle_psci_call(ARMCPU *cpu)
> >
> >  switch (param[2]) {
> >  case 0:
> > -target_cpu_state = qemu_get_cpu(mpidr & 0xff);
> > +/* MPIDR_EL1 
> > [RES0:affinity-3:RES1:U:RES0:MT:affinity-1:affinity-0]
> > + * GIC 500 code currently supports 32 clusters with 8 cores 
> > each
> > + * but no more than 128 cores. Future version will have 
>

Re: [Qemu-devel] [PATCH] Use Aff1 with mpidr

2015-05-31 Thread Shlomo Pongratz
Hi,

See below.

> -Original Message-
> From: Pavel Fedin [mailto:p.fe...@samsung.com]
> Sent: Friday, 29 May, 2015 9:45 AM
> To: Shlomo Pongratz; qemu-devel@nongnu.org
> Cc: peter.mayd...@linaro.org; 'Ashok Kumar'
> Subject: RE: [PATCH] Use Aff1 with mpidr
> 
>  Hi!
> 
> > I see that you added mpidr to ARMCpu and initialized it in virt.c then
> > you use it in
> mpidr_read.
> > Thus you must fix all other virtual machines in hw/arm not just virt.c
> > as it is not
> initialized for
> > them.
> 
>  The only change in virt.c is:
> --- cut ---
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c index a7f9a10..a1186c5 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -317,7 +317,11 @@ static void fdt_add_cpu_nodes(const VirtBoardInfo
> *vbi)
>  "enable-method", "psci");
>  }
> 
> -qemu_fdt_setprop_cell(vbi->fdt, nodename, "reg", cpu);
> +/*
> + * If cpus node's #address-cells property is set to 1
> + * The reg cell bits [23:0] must be set to bits [23:0] of MPIDR_EL1.
> + */
> +qemu_fdt_setprop_cell(vbi->fdt, nodename, "reg",
> + armcpu->mpidr);
>  g_free(nodename);
>  }
>  }
> --- cut ---
>  So that it takes MPIDR instead of just CPU index. Theoretically - yes, may be
> other machines should also be changed, but:
> 1. They are 32-bit, so MPIDR == index for them, because there are no more
> than 8 CPUs.
> 2. Those machines AFAIK do not compose device tree by themselves. They
> use pre-made ones instead, coming for example with kernel.
>  Only virt currently can be a 64-bit machine and cares about more than 8
> CPUs.
>  As to MPIDR initialization, it is done in arm_cpu_initfn(), therefore all ARM
> CPUs get this automatically. There's no need to modify code for every
> machine.

I think we should take Igor's comment into account. The CPUS_PER_CLUSTER should 
not be a constant, and maybe should be initialized in arm_cpu_initfn and 
aarch64_{a53|a57}_initfn, as psci need to know it.

>  I would kindly ask you to use my patch in your next series, or base
> something on it, if you dislike the implementation, but it's crucial for KVM
> that MPIDR values can be obtained from the host. Your original
> implementation cannot do this by design, this is why i made my own solution.
> See kvm_arch_init_vcpu() in my patch - without this CPUs beyond #7 will not
> power up in KVM.

I understand that you suggest that I'll not use my 1 & 4 patches. I have no 
problem with that, but how do I commit? Should I write that the patch series is 
pending until your patch series is integrated?

>  And when are you planning to post v3? I am waiting for it to be integrated, 
> in
> order to add KVM vGICv3. Otherwise i'm afraid there are too many collisions.

I intend to release the V3 version during this week.
I have some issues with the new groups code that was introduce to GICv2 but 
they are minor.

> 
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
> 




Re: [Qemu-devel] [PATCH] Use Aff1 with mpidr

2015-05-28 Thread Shlomo Pongratz
Hi,

I see that you added mpidr to ARMCpu and initialized it in virt.c then you use 
it in mpidr_read.
Thus you must fix all other virtual machines in hw/arm not just virt.c as it is 
not initialized for them.

I have no other comments.

Best regards,

S.P.


> -Original Message-
> From: Pavel Fedin [mailto:p.fe...@samsung.com]
> Sent: Friday, 22 May, 2015 1:33 PM
> To: qemu-devel@nongnu.org
> Cc: peter.mayd...@linaro.org; 'Ashok Kumar'; Shlomo Pongratz
> Subject: [PATCH] Use Aff1 with mpidr
> 
>  This is an improved and KVM-aware alternative to
> https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg00942.html.
> Changes are:
> 1. MPIDR value (minus feature bits) is now directly stored in CPU instance.
> 2. Default assignment is based on original rule (8 CPUs per cluster).
> 3. If KVM is used, MPIDR values are overridden with those from KVM. This is
> necessary for proper KVM PSCI functioning.
> 4. Some cosmetic changes which would make further expansion of this code
> easier.
> 5. Added some debugging macros because CPU ID assignment is tricky part,
> and if there are some problems, i think there should be a quick way to make
> sure they are correct.
>  This does not have an RFC mark because it is perfectly okay to be committed
> alone, it does not break anything. Commit message follows. Cc'ed to all
> interested parties because i really hope to get things going somewhere this
> time.
> 
> In order to support up to 128 cores with GIC-500 (GICv3 implementation)
> affinity1 must be used. GIC-500 support up to 32 clusters with up to
> 8 cores in a cluster. So for example, if one wishes to have 16 cores, the
> options are: 2 clusters of 8 cores each, 4 clusters with 4 cores each 
> Currently
> only the first option is supported for TCG. However, KVM expects to have
> the same clusters layout as host machine has. Therefore, if we use 64-bit
> KVM we override CPU IDs with those provided by the KVM. For 32-bit
> systems it is OK to leave the default because so far we do not have more
> than 8 CPUs on any of them.
> This implementation has a potential to be extended with some methods
> which would define cluster layout instead of hardcoded CPUS_PER_CLUSTER
> definition. This would allow to emulate real machines with different layouts.
> However, in case of KVM we would still have to inherit IDs from the host.
> 
> Signed-off-by: Shlomo Pongratz 
> Signed-off-by: Pavel Fedin 
> ---
>  hw/arm/virt.c|  6 +-
>  target-arm/cpu-qom.h |  3 +++
>  target-arm/cpu.c | 17 +
>  target-arm/helper.c  |  9 +++--
>  target-arm/kvm64.c   | 25 +
>  target-arm/psci.c| 20 ++--
>  6 files changed, 71 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c index a7f9a10..a1186c5 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -317,7 +317,11 @@ static void fdt_add_cpu_nodes(const VirtBoardInfo
> *vbi)
>  "enable-method", "psci");
>  }
> 
> -qemu_fdt_setprop_cell(vbi->fdt, nodename, "reg", cpu);
> +/*
> + * If cpus node's #address-cells property is set to 1
> + * The reg cell bits [23:0] must be set to bits [23:0] of MPIDR_EL1.
> + */
> +qemu_fdt_setprop_cell(vbi->fdt, nodename, "reg",
> + armcpu->mpidr);
>  g_free(nodename);
>  }
>  }
> diff --git a/target-arm/cpu-qom.h b/target-arm/cpu-qom.h index
> ed5a644..a382a09 100644
> --- a/target-arm/cpu-qom.h
> +++ b/target-arm/cpu-qom.h
> @@ -159,6 +159,7 @@ typedef struct ARMCPU {
>  uint64_t id_aa64mmfr1;
>  uint32_t dbgdidr;
>  uint32_t clidr;
> +uint64_t mpidr; /* Without feature bits */
>  /* The elements of this array are the CCSIDR values for each cache,
>   * in the order L1DCache, L1ICache, L2DCache, L2ICache, etc.
>   */
> @@ -171,6 +172,8 @@ typedef struct ARMCPU {
>  uint64_t rvbar;
>  } ARMCPU;
> 
> +#define CPUS_PER_CLUSTER 8
> +
>  #define TYPE_AARCH64_CPU "aarch64-cpu"
>  #define AARCH64_CPU_CLASS(klass) \
>  OBJECT_CLASS_CHECK(AArch64CPUClass, (klass), TYPE_AARCH64_CPU)
> diff --git a/target-arm/cpu.c b/target-arm/cpu.c index 3ca3fa8..7dc2595
> 100644
> --- a/target-arm/cpu.c
> +++ b/target-arm/cpu.c
> @@ -31,6 +31,12 @@
>  #include "sysemu/kvm.h"
>  #include "kvm_arm.h"
> 
> +#ifdef DEBUG
> +# define DPRINTF(format, ...) printf("armcpu: " format , ##
> +__VA_ARGS__) #else # define DPRINTF(format, ...) do { } while (0)
> +#endif
> +
>  static void arm_cpu_set_pc(CPUState *cs, vaddr value) 

Re: [Qemu-devel] [PATCH RFC V2 1/4] Use Aff1 with mpidr

2015-05-28 Thread Shlomo Pongratz
See below.

> -Original Message-
> From: Igor Mammedov [mailto:imamm...@redhat.com]
> Sent: Thursday, 28 May, 2015 12:30 PM
> To: Shlomo Pongratz
> Cc: shlomopongr...@gmail.com; qemu-devel@nongnu.org;
> peter.mayd...@linaro.org
> Subject: Re: [Qemu-devel] [PATCH RFC V2 1/4] Use Aff1 with mpidr
> 
> On Thu, 28 May 2015 07:23:55 +
> Shlomo Pongratz  wrote:
> 
> >
> >
> > > -Original Message-
> > > From: Igor Mammedov [mailto:imamm...@redhat.com]
> > > Sent: Wednesday, 27 May, 2015 7:12 PM
> > > To: shlomopongr...@gmail.com
> > > Cc: qemu-devel@nongnu.org; peter.mayd...@linaro.org; Shlomo
> Pongratz
> > > Subject: Re: [Qemu-devel] [PATCH RFC V2 1/4] Use Aff1 with mpidr
> > >
> > > On Wed,  6 May 2015 17:04:39 +0300
> > > shlomopongr...@gmail.com wrote:
> > >
> > > > From: Shlomo Pongratz 
> > > >
> > > > In order to support up to 128 cores with GIC-500 (GICv3
> > > > implementation)
> > > > affinity1 must be used. GIC-500 support up to 32 clusters with up
> > > > to
> > > > 8 cores in a cluster. So for example, if one wishes to have 16
> > > > cores, the options are: 2 clusters of 8 cores each, 4 clusters
> > > > with 4 cores each Currently only the first option is supported.
> > > > In order to have more flexible scheme the virt machine must pass
> > > > the desired scheme.
> > > >
> > > > Signed-off-by: Shlomo Pongratz 
> > > > ---
> > > >  target-arm/helper.c | 12 ++--
> > > >  target-arm/psci.c   | 18 --
> > > >  2 files changed, 26 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/target-arm/helper.c b/target-arm/helper.c index
> > > > f8f8d76..f555c0b 100644
> > > > --- a/target-arm/helper.c
> > > > +++ b/target-arm/helper.c
> > > > @@ -2035,11 +2035,19 @@ static const ARMCPRegInfo
> > > > strongarm_cp_reginfo[] = {  static uint64_t mpidr_read(CPUARMState
> > > > *env, const ARMCPRegInfo *ri)  {
> > > >  CPUState *cs = CPU(arm_env_get_cpu(env));
> > > > -uint32_t mpidr = cs->cpu_index;
> > > > -/* We don't support setting cluster ID ([8..11]) (known as Aff1
> > > > +uint32_t mpidr, aff0, aff1;
> > > > +uint32_t cpuid = cs->cpu_index;
> > > > +/* We don't support setting cluster ID ([16..23]) (known as
> > > > + Aff2
> > > >   * in later ARM ARM versions), or any of the higher affinity level
> fields,
> > > >   * so these bits always RAZ.
> > > >   */
> > > > +/* Currently GIC-500 code supports 64 cores in 16 clusters
> > > > + with 8 cores
> > > each
> > > > + * Future code will remove this limitation.
> > > > + * This code is valid for GIC-400 too.
> > > > + */
> > > > +aff0 = cpuid % 8;
> > > Even though GIC-500 spec says that it supports 8 cores I haven't
> > > found in it how it encodes aff0 but encoding is specified by
> > > respective CPU specs. I've checked a53, a57, a72 specs and they limit aff0
> to max 0x3 value.
> > > I think encoding should be CPU type specific i.e. not defined by
> > > what GIC can support and once we add CPU type with 8 cores, it would
> > > provide it's own version of mpidr_read since it would be defined by spec
> how to encode aff0.
> > >
> >
> > Hi,
> >
> > I wonder how this will work with GICv2 with 8 cores.
> > I agree that with GICv3 and 2 cortex-57 the affinities should be
> > 0.0.0.{0..3} and 0.0.1.{0..3} but with GICv2 or GICv3 in backwards
> compatibility we should have 0.0.0.{0..7}. Now only the virt knows which GIC
> is been used and psci is ignorant.
> > I assume the virt can set a property to the GIC and the QEMU saying how
> many cores are in the lowest affinity level.
> > That should be 0..7 with GICv2 according to the SMP value and processor
> dependent for GICV3.
> I think aff0 format is CPU dependent regardless of which GIC is used since
> mpidr is CPU owned register.
> 

O.K. so let's see.
In virt.c::fdt_add_cpu_node we set aff1 to be cpu_number / #cores-in-soc and 
aff0 to be cpu_numer % #cores-is-soc. This is fairly.
In  helpr.c:mpidr_read we calculate aff0 & aff1 from cs->cpu_index in the same 
way but AFAIK the input parameter i.e. CPUARMState doesn't has the 
#cores-in-soc nor any other useful information. We can add 

Re: [Qemu-devel] [PATCH RFC V2 1/4] Use Aff1 with mpidr

2015-05-28 Thread Shlomo Pongratz


> -Original Message-
> From: Igor Mammedov [mailto:imamm...@redhat.com]
> Sent: Wednesday, 27 May, 2015 7:12 PM
> To: shlomopongr...@gmail.com
> Cc: qemu-devel@nongnu.org; peter.mayd...@linaro.org; Shlomo Pongratz
> Subject: Re: [Qemu-devel] [PATCH RFC V2 1/4] Use Aff1 with mpidr
> 
> On Wed,  6 May 2015 17:04:39 +0300
> shlomopongr...@gmail.com wrote:
> 
> > From: Shlomo Pongratz 
> >
> > In order to support up to 128 cores with GIC-500 (GICv3
> > implementation)
> > affinity1 must be used. GIC-500 support up to 32 clusters with up to
> > 8 cores in a cluster. So for example, if one wishes to have 16 cores,
> > the options are: 2 clusters of 8 cores each, 4 clusters with 4 cores
> > each Currently only the first option is supported.
> > In order to have more flexible scheme the virt machine must pass the
> > desired scheme.
> >
> > Signed-off-by: Shlomo Pongratz 
> > ---
> >  target-arm/helper.c | 12 ++--
> >  target-arm/psci.c   | 18 --
> >  2 files changed, 26 insertions(+), 4 deletions(-)
> >
> > diff --git a/target-arm/helper.c b/target-arm/helper.c index
> > f8f8d76..f555c0b 100644
> > --- a/target-arm/helper.c
> > +++ b/target-arm/helper.c
> > @@ -2035,11 +2035,19 @@ static const ARMCPRegInfo
> > strongarm_cp_reginfo[] = {  static uint64_t mpidr_read(CPUARMState
> > *env, const ARMCPRegInfo *ri)  {
> >  CPUState *cs = CPU(arm_env_get_cpu(env));
> > -uint32_t mpidr = cs->cpu_index;
> > -/* We don't support setting cluster ID ([8..11]) (known as Aff1
> > +uint32_t mpidr, aff0, aff1;
> > +uint32_t cpuid = cs->cpu_index;
> > +/* We don't support setting cluster ID ([16..23]) (known as Aff2
> >   * in later ARM ARM versions), or any of the higher affinity level 
> > fields,
> >   * so these bits always RAZ.
> >   */
> > +/* Currently GIC-500 code supports 64 cores in 16 clusters with 8 cores
> each
> > + * Future code will remove this limitation.
> > + * This code is valid for GIC-400 too.
> > + */
> > +aff0 = cpuid % 8;
> Even though GIC-500 spec says that it supports 8 cores I haven't found in it
> how it encodes aff0 but encoding is specified by respective CPU specs. I've
> checked a53, a57, a72 specs and they limit aff0 to max 0x3 value.
> I think encoding should be CPU type specific i.e. not defined by what GIC can
> support and once we add CPU type with 8 cores, it would provide it's own
> version of mpidr_read since it would be defined by spec how to encode aff0.
> 

Hi,

I wonder how this will work with GICv2 with 8 cores.
I agree that with GICv3 and 2 cortex-57 the affinities should be 0.0.0.{0..3} 
and 0.0.1.{0..3} but with GICv2 or GICv3 in backwards compatibility we should 
have
0.0.0.{0..7}. Now only the virt knows which GIC is been used and psci is 
ignorant.
I assume the virt can set a property to the GIC and the QEMU saying how many 
cores are in the lowest affinity level.
That should be 0..7 with GICv2 according to the SMP value and processor 
dependent for GICV3.

Best regards,

S.P.
   

> > +aff1 = cpuid / 8;
> > +mpidr = (aff1 << 8) | aff0;
> >  if (arm_feature(env, ARM_FEATURE_V7MP)) {
> >  mpidr |= (1U << 31);
> >  /* Cores which are uniprocessor (non-coherent) diff --git
> > a/target-arm/psci.c b/target-arm/psci.c index d8fafab..64fbe61 100644
> > --- a/target-arm/psci.c
> > +++ b/target-arm/psci.c
> > @@ -86,6 +86,7 @@ void arm_handle_psci_call(ARMCPU *cpu)
> >  CPUARMState *env = &cpu->env;
> >  uint64_t param[4];
> >  uint64_t context_id, mpidr;
> > +uint32_t core, Aff1, Aff0;
> >  target_ulong entry;
> >  int32_t ret = 0;
> >  int i;
> > @@ -121,7 +122,16 @@ void arm_handle_psci_call(ARMCPU *cpu)
> >
> >  switch (param[2]) {
> >  case 0:
> > -target_cpu_state = qemu_get_cpu(mpidr & 0xff);
> > +/* MPIDR_EL1 
> > [RES0:affinity-3:RES1:U:RES0:MT:affinity-1:affinity-0]
> > + * GIC 500 code currently supports 32 clusters with 8 cores 
> > each
> > + * but no more than 128 cores. Future version will have 
> > flexible
> > + * affinity selection
> > + * GIC 400 supports 8 cores so 0x7 for Aff0 is O.K. too
> > + */
> > +Aff1 = (mpidr & 0xff00) >> (8 - 3); /* Shift by 8 multiply by 
> > 8 */
> > +Aff0 = mpidr & 0x7;
> > +core = Aff1 + Aff0;

Re: [Qemu-devel] [PATCH RFC V2 2/4] Implment GIC-500

2015-05-26 Thread Shlomo Pongratz


> -Original Message-
> From: Pavel Fedin [mailto:p.fe...@samsung.com]
> Sent: Monday, 25 May, 2015 6:27 PM
> To: 'Eric Auger'; qemu-devel@nongnu.org; shlomopongr...@gmail.com;
> Shlomo Pongratz
> Subject: RE: [Qemu-devel] [PATCH RFC V2 2/4] Implment GIC-500
> 
> Hi!
> 
> > > +static const uint8_t gic_lpi_ids[] = {
> > > +0x44, 0x00, 0x00, 0x00, 0x093, 0xB4, 0x3B, 0x00, 0x0D, 0xF0,
> > > +0x05, 0xB1 };
> 
>  Found one more thing. Shouldn't there be 0x091 instead of 0x093 (5th byte =
> PIDR0) ? 0x93 is ITS while REDIST is 0x91.
> 
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
> 

Hi Pavel,

The value is taken from the gic-500 document in the re-distributer's table 
(3-7).
I don't know where the value 0x91 comes from as GICD_PIDR0 is 0x92, GICR_PIDR0 
is 0x93 and GITS_PIDR0 is 0x94.
In the GICv3 document section 5.4.21  it is also been written:
0xFFE0  GICR_PIDR0 Bits [7:0]. Bits [7:0] of the ARM-defined DevID field. This 
field is 0x93 in 
ARM implementations of a GICv3 or later Re-distributor.

Best regards,

S.P.




Re: [Qemu-devel] [PATCH RFC V2 2/4] Implment GIC-500

2015-05-23 Thread Shlomo Pongratz
Hi Pavel,

No problem.

Best regards,

S.P.


On Friday, May 22, 2015, Pavel Fedin  wrote:

>  Hello!
>
> > Please find some more comments inline.
>
>  Since there are notes about code style, i would add one more thing.
> structures of v3
> implementation keep old names (like GICState), and i would suggest to
> rename these things
> (like GICv3State) in order to avoid confusion.
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>
>


Re: [Qemu-devel] [PATCH RFC V2 1/4] Use Aff1 with mpidr

2015-05-23 Thread Shlomo Pongratz
On Friday, May 22, 2015, Pavel Fedin  wrote:

>  Hello!
>
> > The GIC-500 provides registers for managing interrupt sources, interrupt
> behavior, and interrupt
> > routing to one or more cores. It supports:
> > • Multiprocessor environments with up to 128 cores.
> > • Up to 32 affinity-level 1 clusters.
> > • Up to eight cores for each cluster.
>
> > I guess your hardware uses different GIC.
>
>  Heh, yes, looks like that. And perhaps it's somewhat non-standard...
>  I will study kvmtool and try to come up with some good solution.
>
>  By the way, since you're referring to documentation... TRM you have
> mentioned contains references to "GIC architecture reference manual v3.0",
> which i was unable to find. On ARM resource center i see only v2 of the
> manual. And it looks like you have it because otherwise you would not get
> description of many registers. Can you point me at a correct place ?
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>
> Hi Pavel,

We have a copy at work which I came from ARM, I'm sure that since Samsung
manufactures ARM SoC under licence you can put your hand on one.
I can also add that most of my work was done using the GIC-500 document and
by looking for what the Linux (kernel) is doing. Only recently I was able
to put my hand on the GICv3, but It only confirmed what I already assumed.

Best regards,

S.P.


Re: [Qemu-devel] [PATCH RFC V2 1/4] Use Aff1 with mpidr

2015-05-21 Thread Shlomo Pongratz
On Thursday, May 21, 2015, Pavel Fedin  wrote:

>  Hello!
>
> > In order to support up to 128 cores with GIC-500 (GICv3 implementation)
> > affinity1 must be used. GIC-500 support up to 32 clusters with up to
> > 8 cores in a cluster. So for example, if one wishes to have 16 cores,
> > the options are: 2 clusters of 8 cores each, 4 clusters with 4 cores each
>
>  I have found one more concern. Are you really sure about this scheme ? I
> am currently
> experimenting with KVM, and it seems to have 16 CPUs per cluster, at least
> on my machine.
> Actually i suggest that KVM inherits the mapping from the host. Can we do
> the same?
>  I will take a look at kvmtool, how it determines these IDs. But hardcoded
> scheme is
> definitely wrong.
>
>  Cc'ed Ashok because he might also be interested.
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
> Hi Pavel,

I can only quote from GIC-500 document (
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0516b/DDI0516B_gic5000_r0p0_trm.pdf)
which is currently the only GICv3 implementation) I'll recheck the GICv3
when I'll return to work on Monday.

>From section 1.3 features.

The GIC-500 provides registers for managing interrupt sources, interrupt
behavior, and interrupt
routing to one or more cores. It supports:
• Multiprocessor environments with up to 128 cores.
• Up to 32 affinity-level 1 clusters.
• Up to eight cores for each cluster.

I guess your hardware uses different GIC.

Best regards,

S.P.


Re: [Qemu-devel] [PATCH RFC V2 4/4] Add virtv2 machine that uses GIC-500

2015-05-19 Thread Shlomo Pongratz
> -Original Message-
> From: Eric Auger [mailto:eric.au...@linaro.org]
> Sent: Tuesday, 19 May, 2015 5:39 PM
> To: shlomopongr...@gmail.com; qemu-devel@nongnu.org
> Cc: peter.mayd...@linaro.org; Shlomo Pongratz
> Subject: Re: [Qemu-devel] [PATCH RFC V2 4/4] Add virtv2 machine that uses
> GIC-500
> 
> Hi Shlomo,
> On 05/06/2015 04:04 PM, shlomopongr...@gmail.com wrote:
> > From: Shlomo Pongratz 
> >
> > There is a need to support flexible clusters size. The GIC-500 can
> > support up to 128 cores, up to 32 clusters and up to 8 cores is a cluster.
> > So for example, if one wishes to have 16 cores, the options are:
> > 2 clusters of 8 cores each, 4 clusters with 4 cores each Currently
> > only the first option is supported.
> > There is an issue of passing clock affinity to via the dtb. In the dtb
> >
> > interrupt section there are only 24 bit left to affinity since the
> > variable is a 32 bit entity and 8 bits are reserved for flags.
> > See Documentation/devicetree/bindings/arm/arch_timer.txt.
> > Note that this issue is not seems to be critical as when checking
> > /proc/irq/3/smp_affinity with 32 cores all 32 bits are one.
> 
> As a general comment your series needs a good rebase in virt.c. This would
> definitively ease the review.
> 
> I have the impression that the number of modifications related to the
> GICv3 addition do not require to create a new virt.c file. My personal feeling
> is using Ashok's approach based on machine properties may be worth trying.
> Obviously this will urge you to fill the memory map holes but that's should be
> feasible.
> 

Hi Eric,

Thank you.

I have just pulled from git and I'm currently working on rebasing my 1-3 
patches to commit 62bf3d , mainly adding interrupt groups and fiq.
I accept that virtv2 is not needed and I'm currently using Pavel's patch 
https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg02930.html with small 
modifications.
However If there is a consensus I'll move to Ashok's virt instead of Pavel's 
one.

I'm looking forward for a decision regarding which virt is to be used.

Best regards,

S.P. 



Re: [Qemu-devel] [PATCH v2] Add virt-v3 machine that uses GIC-500

2015-05-14 Thread Shlomo Pongratz
Hi Pavel,

Please see in-line.
Best regards,

S.P.


From: Pavel Fedin [p.fe...@samsung.com]
Sent: Wednesday, May 13, 2015 4:57 PM
To: Shlomo Pongratz; qemu-devel@nongnu.org
Subject: RE: [Qemu-devel] [PATCH v2] Add virt-v3 machine that uses GIC-500

 Hello!

> 1. In fdt_add_timer why you didn't used the 24 bit limit I posed on the 
> irqflags? Please
note that
> the argument is 32 bits wide and 8 bits are for flags.

  Simply missed it when checking for differences. Please fix. :) Perhaps it is 
the reason
why >=24 CPUs fail for you.

I wonder how it works for you. Do you aware of an alternative way to configure 
the clock irqflags for more then 24 cores, or is it just ignored. According to 
Linux documentation this fdt field sets the clock IRQ affinity.

My current status is as follows:
With 64 cores there is no printouts what so ever.  
With 32 cores the boot usually get stuck after the message "[   45.719102] SCSI 
subsystem initialized"
With 24 cores the system noontimes complete the boot and sometimes get stuck 
like the 32 cores system.

> 2. In machvirt_init, I used TYPE_AARCH64_CPU while you reverted it to 
> TYPE_ARM_CPU, I
> assume this is because you want to support cortex-a15. Don't you think It 
> should be
according
> to the cortex type?

 Yes, i just left it as it was because it already works fine with ARM64. 
Actually,
TYPE_AARCH64_CPU is a subclass of TYPE_ARM_CPU.

I see but I guess that I want aarch64_cpu_initfn to be called and not 
arm_cpu_initfn.

> (BTW you removed cortex-a53).

 Yes, because i didn't see how it is different from a57 (or a15). I tried to 
follow
minimal intervention principle.
 But perhaps i was wrong because there was real support for a53 added recently:
http://lists.nongnu.org/archive/html/qemu-devel/2015-05/msg01304.html, so feel 
free to
re-add it back.

I agree with you I think this should wait for the patch you mentioned above to 
be integrated.

 BTW, just for interest, have you tried to do anything with KVM support of 
vGICv3? I have
some code but it's inherently unstable and lock up for unknown (yet) reason.

No, this is because I don't have an ARM64 based server needed for running KVM 
for ARM64.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia





Re: [Qemu-devel] [PATCH v2] Add virt-v3 machine that uses GIC-500

2015-05-13 Thread Shlomo Pongratz
> -Original Message-
> From: Pavel Fedin [mailto:p.fe...@samsung.com]
> Sent: Tuesday, 12 May, 2015 3:33 PM
> To: Shlomo Pongratz; qemu-devel@nongnu.org
> Subject: RE: [PATCH v2] Add virt-v3 machine that uses GIC-500
> 
>  Hello!
> 
> > BTW did you try going beyond 16 cores I had problems with 32 and 64 cores.
> 
>  Just tried it. Works fine, except qemu takes incredibly long time to start up
> with so many cores. 64 cores took something like 2 minutes. Indeed, looks
> like freeze, but if you're patient enough, you'll see it running. I believe 
> it's
> interpretation mode flaw.
> 
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
> 


Hi Pavel,

With your patch I can work up to 16 cores it gets stuck at 24 cores.

I have two questions regarding your changes.
1. In fdt_add_timer why you didn't used the 24 bit limit I posed on the 
irqflags? Please note that the argument is 32 bits wide an 8 bits are for flags.
2. In machvirt_init, I used TYPE_AARCH64_CPU while you reverted it to 
TYPE_ARM_CPU, I assume this is because you want to support cortex-a15. Don't 
you think It should be according to the cortex type? (BTW you removed 
cortex-a53).

Best regards,

S.P.





Re: [Qemu-devel] [PATCH v2] Add virt-v3 machine that uses GIC-500

2015-05-12 Thread Shlomo Pongratz
> -Original Message-
> From: Pavel Fedin [mailto:p.fe...@samsung.com]
> Sent: Tuesday, 12 May, 2015 12:06 PM
> To: Shlomo Pongratz; qemu-devel@nongnu.org
> Subject: RE: [PATCH v2] Add virt-v3 machine that uses GIC-500
> 
>  Hello!
> 
> > I just pulled last git and git am-ed your patch on-top of my first 3
> > patches and got
> this error:
> > /home/shlomo/qemu-new.git/qemu-64/hw/arm/virt.c: In function
> 'fdt_add_gic_node':
> > /home/shlomo/qemu-new.git/qemu-64/hw/arm/virt.c:366:17: error:
> > 'KVM_DEV_TYPE_ARM_VGIC_V3' undeclared (first use in this function)
> >  if (type == KVM_DEV_TYPE_ARM_VGIC_V3) {
> 
>  Very strange. Check your source tree. This value is declared in linux-
> headers/linux/kvm.h. Try to #include it directly maybe ? OTOH it perfectly
> works for me, i have just verified, i have no extra #include's which could be
> missing from the patch.
>  But, may be something has changed in master over the last three days?
> Actually my base tree is from friday.
> 
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
> 

Hi Pavel,

I did include "linux/kvm.h" which fixed the issue.
But as I wrote I did "git pull" before applying the patches, so it is strange.

BTW did you try going beyond 16 cores I had problems with 32 and 64 cores.

Best regards,

S.P.




Re: [Qemu-devel] [PATCH v2] Add virt-v3 machine that uses GIC-500

2015-05-11 Thread Shlomo Pongratz

Hi Pavel,

Thank you,

I just pulled last git and git am-ed your patch on-top of my first 3 patches 
and got this error:
/home/shlomo/qemu-new.git/qemu-64/hw/arm/virt.c: In function ‘fdt_add_gic_node’:
/home/shlomo/qemu-new.git/qemu-64/hw/arm/virt.c:366:17: error: 
‘KVM_DEV_TYPE_ARM_VGIC_V3’ undeclared (first use in this function)
 if (type == KVM_DEV_TYPE_ARM_VGIC_V3) {

Best regards,

S.P.



Re: [Qemu-devel] [PATCH RFC V2 4/4] Add virtv2 machine that uses GIC-500

2015-05-07 Thread Shlomo Pongratz
On Thursday, May 7, 2015, Pavel Fedin  wrote:

> Hello!
>
> > The issue is similar to what you have noticed.
>
>  In this case it's probably not your fault.
>
> > The larger the number of smp cores it is more likely that the boot will
> get stuck.
> > I've commuted this patch series as an RFC because of this issue.
>
>  I have tried to give more testing to it. Up to 4 CPUs worked okay (if we
> ignore "press any key" problem which i described). Starting from 5 i get a
> hard freeze right after " SMP: Total of N processors activated" message.
> However, i get this deadlock also with GICv2, so it's perhaps not your
> fault. The second problem is that i did not manage to bring up more than 8
> CPUs. The following happens:
> --- cut ---
> Initializing cgroup subsys memory
> Initializing cgroup subsys hugetlb
> hw perfevents: no hardware support available
> EFI services will not be available.
> CPU1: Booted secondary processor
> Detected PIPT I-cache on CPU1
> CPU1: found redistributor 1 region 0:0x0806
> CPU2: Booted secondary processor
> Detected PIPT I-cache on CPU2
> CPU2: found redistributor 2 region 0:0x0808
> CPU3: Booted secondary processor
> Detected PIPT I-cache on CPU3
> CPU3: found redistributor 3 region 0:0x080a
> CPU4: Booted secondary processor
> Detected PIPT I-cache on CPU4
> CPU4: found redistributor 4 region 0:0x080c
> CPU5: Booted secondary processor
> Detected PIPT I-cache on CPU5
> CPU5: found redistributor 5 region 0:0x080e
> CPU6: Booted secondary processor
> Detected PIPT I-cache on CPU6
> CPU6: found redistributor 6 region 0:0x0810
> CPU7: Booted secondary processor
> Detected PIPT I-cache on CPU7
> CPU7: found redistributor 7 region 0:0x0812
> psci: failed to boot CPU8 (-22)
> CPU8: failed to boot: -22
> psci: failed to boot CPU9 (-22)
> CPU9: failed to boot: -22
> psci: failed to boot CPU10 (-22)
> CPU10: failed to boot: -22
> psci: failed to boot CPU11 (-22)
> CPU11: failed to boot: -22
> psci: failed to boot CPU12 (-22)
> CPU12: failed to boot: -22
> psci: failed to boot CPU13 (-22)
> CPU13: failed to boot: -22
> psci: failed to boot CPU14 (-22)
> CPU14: failed to boot: -22
> psci: failed to boot CPU15 (-22)
> CPU15: failed to boot: -22
> Brought up 8 CPUs
> SMP: Total of 8 processors activated.
> --- cut ---
>  But, perhaps it's the fault of my code (i replaced your patch No 4 with
> my own implementation, merging the code). I will post this alternative
> version soon if you don't mind (need to rebase it on master).
>
>  My current guest kernel is v3.19.5. I will test on 4.1rc2 soon.
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>
> Hi Pavel,

Check your code and see if in fdt_add_cpu_nodes and see if CPUs are added
with correct MPIDR (Affinity).

Best regards,

S.P.


Re: [Qemu-devel] [PATCH RFC V2 4/4] Add virtv2 machine that uses GIC-500

2015-05-07 Thread Shlomo Pongratz
On Thursday, May 7, 2015, Pavel Fedin  wrote:

>  Hello!
>
> ➢ Thank you for your comment, I might do it after solving the stability
> issues.
>
>  What kind of stability issues do you have ?
>  You know, i was testing 64-bit qemu in TCG mode, and i have also seen
> weird behavior. The kernel just stops and waits until you hit any key on
> the console, then it continues. This happens if i try to use more than one
> CPU. This has to do with guest kernel version, in v4.1rc2 this is gone,
> while still present in v4.0rc1. Originally i discovered this issue on
> v3.19. So, perhaps it's some guest kernel flaw, having to do with
> interrupts.
>  The first place where it is stuck is after initializing usbcore. After
> hitting enter it says "switching clocksource to..." (i don't remember) and
> goes on. After this it can stop like this again, in some random moment.
>  Which kernel version do you use for guest ?
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
> Hi Pavel,

The stability issue is described in patch .
The issue is similar to what you have noticed.
The larger the number of smp cores it is more likely that the boot will get
stuck.
I've commuted this patch series as an RFC because of this issue.

Best regards,

S.P.


Re: [Qemu-devel] [PATCH RFC V2 4/4] Add virtv2 machine that uses GIC-500

2015-05-07 Thread Shlomo Pongratz
On Thursday, May 7, 2015, Pavel Fedin  wrote:

>  Hello!
>
>  Why do we need 'virt2' ? I am currently working on testing your
> implementation, and i try to use different approach. I see this as
> '-machine
> virt,gicv=N' option, where N is 2 or 3. Isn't it better than duplicating
> the
> whole virt.c code just for single different function ?
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>
>
Hi Pavel,

Thank you for your comment, I might do it after solving the stability
issues.

Best regards,

S.P.


Re: [Qemu-devel] [PATCH RFC] Implement GIC-500 from GICv3 family for arm64

2015-03-10 Thread Shlomo Pongratz


On 10 آذار, 2015 ص 09:06, Pei XiaoYong wrote:

于 2015/3/9 22:41, shlomo.pongr...@toganetworks.com 写道:

From: Shlomo Pongratz 

This patch is a first step toward 128 cores support for arm64.

At first only 64 cores are supported for two reasons:
First the largest integer type has the size of 64 bits and modifying
essential data structures in order to support 128 cores will require
the usage of bitops.
Second currently the Linux (kernel) can be configured to support
up to 64 cores thus there is no urgency with 128 cores support.

Things left to do:

Currently the booting Linux may got stuck. The probability of getting stuck
increases with the number of cores. I'll appreciate core review.

There is a need to support flexible clusters size. The GIC-500 can support
up to 128 cores, up to 32 clusters and up to 8 cores is a cluster.
So for example, if one wishes to have 16 cores, the options are:
2 clusters of 8 cores each, 4 clusters with 4 cores each
Currently only the first option is supported.
There is an issue of passing clock affinity to via the dtb. In the dtb

interrupt section there are only 24 bit left to affinity since the
variable is a 32 bit entity and 8 bits are reserved for flags.
See Documentation/devicetree/bindings/arm/arch_timer.txt.
Note that this issue is not seems to be critical as when checking
/proc/irq/3/smp_affinity with 32 cores all 32 bits are one.

The last issue is to add support for 128 cores. This requires the usage
of bitops and currently can be tested up to 64 cores.

Signed-off-by: Shlomo Pongratz 
---
  hw/arm/Makefile.objs   |2 +-
  hw/arm/virtv2.c|  774 +
  hw/intc/Makefile.objs  |2 +
  hw/intc/arm_gic_common.c   |2 +
  hw/intc/arm_gicv3.c| 1596 
  hw/intc/arm_gicv3_common.c |  188 +
  hw/intc/gicv3_internal.h   |  153 
  include/hw/intc/arm_gicv3.h|   44 +
  include/hw/intc/arm_gicv3_common.h |  136 +++
  target-arm/cpu.c   |1 +
  target-arm/cpu.h   |6 +
  target-arm/cpu64.c |   92 +++
  target-arm/helper.c|   12 +-
  target-arm/psci.c  |   18 +-
  target-arm/translate-a64.c |   14 +
  15 files changed, 3034 insertions(+), 6 deletions(-)
  create mode 100644 hw/arm/virtv2.c
  create mode 100644 hw/intc/arm_gicv3.c
  create mode 100644 hw/intc/arm_gicv3_common.c
  create mode 100644 hw/intc/gicv3_internal.h
  create mode 100644 include/hw/intc/arm_gicv3.h
  create mode 100644 include/hw/intc/arm_gicv3_common.h









as far as we know , there are many components in gic-v3 implementation ,
like distributor , redistributor , its , lpi . Offsets of them is not
defined in the gic-v3 specification , i think wo should implement these
components independently , not like v2&v1 implementation in qemu.


Hi Peixiaoyong,

My immediate goal is running more than 8 cores, so currently "its" and
"ipi" are not supported.
I've used the offsets' rules from GIC-500 which is an implementation of
GICv3.
When and if "its" and "ipi" will be implemented then I think a new virt
machine will need to be created
as this is like a new HW BSP with different architecture.

Best regards,

S.P.
-
This email and any files transmitted and/or attachments with it are 
confidential and proprietary information of
Toga Networks Ltd., and intended solely for the use of the individual or entity 
to whom they are addressed.
If you have received this email in error please notify the system manager. This 
message contains confidential
information of Toga Networks Ltd., and is intended only for the individual 
named. If you are not the named
addressee you should not disseminate, distribute or copy this e-mail. Please 
notify the sender immediately
by e-mail if you have received this e-mail by mistake and delete this e-mail 
from your system. If you are not
the intended recipient you are notified that disclosing, copying, distributing 
or taking any action in reliance on
the contents of this information is strictly prohibited.





Re: [Qemu-devel] [PATCH RFC] Implement GIC-500 from GICv3 family for arm64

2015-03-10 Thread Shlomo Pongratz


On 10 آذار, 2015 ص 03:18, Shannon Zhao wrote:

On 2015/3/9 22:41, shlomo.pongr...@toganetworks.com wrote:

From: Shlomo Pongratz 

This patch is a first step toward 128 cores support for arm64.

At first only 64 cores are supported for two reasons:
First the largest integer type has the size of 64 bits and modifying
essential data structures in order to support 128 cores will require
the usage of bitops.
Second currently the Linux (kernel) can be configured to support
up to 64 cores thus there is no urgency with 128 cores support.

Things left to do:

Currently the booting Linux may got stuck. The probability of getting stuck
increases with the number of cores. I'll appreciate core review.

There is a need to support flexible clusters size. The GIC-500 can support
up to 128 cores, up to 32 clusters and up to 8 cores is a cluster.
So for example, if one wishes to have 16 cores, the options are:
2 clusters of 8 cores each, 4 clusters with 4 cores each
Currently only the first option is supported.
There is an issue of passing clock affinity to via the dtb. In the dtb

interrupt section there are only 24 bit left to affinity since the
variable is a 32 bit entity and 8 bits are reserved for flags.
See Documentation/devicetree/bindings/arm/arch_timer.txt.
Note that this issue is not seems to be critical as when checking
/proc/irq/3/smp_affinity with 32 cores all 32 bits are one.

The last issue is to add support for 128 cores. This requires the usage
of bitops and currently can be tested up to 64 cores.

Signed-off-by: Shlomo Pongratz 
---
  hw/arm/Makefile.objs   |2 +-
  hw/arm/virtv2.c|  774 +

Hi,

I think here you want to introduce GICv3 in this patch. So is this necessary to
add a new virtv2 machine? And the codes of this machine mostly are same with 
virt.

Maybe we can add a parameter such as -GICv3 for machine virt to choose GICv3 
for it
and choose GICv2 without this parameter. Then we can reuse more codes.


Hi Shannon,

Using a parameter and configuring the virtual machine makes the core
unreadable.
There are to many if then...else statements.

Best regards.
-
This email and any files transmitted and/or attachments with it are 
confidential and proprietary information of
Toga Networks Ltd., and intended solely for the use of the individual or entity 
to whom they are addressed.
If you have received this email in error please notify the system manager. This 
message contains confidential
information of Toga Networks Ltd., and is intended only for the individual 
named. If you are not the named
addressee you should not disseminate, distribute or copy this e-mail. Please 
notify the sender immediately
by e-mail if you have received this e-mail by mistake and delete this e-mail 
from your system. If you are not
the intended recipient you are notified that disclosing, copying, distributing 
or taking any action in reliance on
the contents of this information is strictly prohibited.





Re: [Qemu-devel] [PATCH RFC] Implement GIC-500 from GICv3 family for arm64

2015-03-10 Thread Shlomo Pongratz


On 09 آذار, 2015 م 05:13, Peter Maydell wrote:

On 9 March 2015 at 23:41,   wrote:

From: Shlomo Pongratz 

This patch is a first step toward 128 cores support for arm64.

At first only 64 cores are supported for two reasons:
First the largest integer type has the size of 64 bits and modifying
essential data structures in order to support 128 cores will require
the usage of bitops.
Second currently the Linux (kernel) can be configured to support
up to 64 cores thus there is no urgency with 128 cores support.

Things left to do:

Currently the booting Linux may got stuck. The probability of getting stuck
increases with the number of cores. I'll appreciate core review.

There is a need to support flexible clusters size. The GIC-500 can support
up to 128 cores, up to 32 clusters and up to 8 cores is a cluster.
So for example, if one wishes to have 16 cores, the options are:
2 clusters of 8 cores each, 4 clusters with 4 cores each
Currently only the first option is supported.
There is an issue of passing clock affinity to via the dtb. In the dtb

interrupt section there are only 24 bit left to affinity since the
variable is a 32 bit entity and 8 bits are reserved for flags.
See Documentation/devicetree/bindings/arm/arch_timer.txt.
Note that this issue is not seems to be critical as when checking
/proc/irq/3/smp_affinity with 32 cores all 32 bits are one.

The last issue is to add support for 128 cores. This requires the usage
of bitops and currently can be tested up to 64 cores.

Signed-off-by: Shlomo Pongratz 
---
  hw/arm/Makefile.objs   |2 +-
  hw/arm/virtv2.c|  774 +
  hw/intc/Makefile.objs  |2 +
  hw/intc/arm_gic_common.c   |2 +
  hw/intc/arm_gicv3.c| 1596 
  hw/intc/arm_gicv3_common.c |  188 +
  hw/intc/gicv3_internal.h   |  153 
  include/hw/intc/arm_gicv3.h|   44 +
  include/hw/intc/arm_gicv3_common.h |  136 +++
  target-arm/cpu.c   |1 +
  target-arm/cpu.h   |6 +
  target-arm/cpu64.c |   92 +++
  target-arm/helper.c|   12 +-
  target-arm/psci.c  |   18 +-
  target-arm/translate-a64.c |   14 +
  15 files changed, 3034 insertions(+), 6 deletions(-)
  create mode 100644 hw/arm/virtv2.c
  create mode 100644 hw/intc/arm_gicv3.c
  create mode 100644 hw/intc/arm_gicv3_common.c
  create mode 100644 hw/intc/gicv3_internal.h
  create mode 100644 include/hw/intc/arm_gicv3.h
  create mode 100644 include/hw/intc/arm_gicv3_common.h

This is way too big to review as a single patch; you should
find a way to split it into a series of multiple coherent patches.

thanks
-- PMM

Hi Peter,

Thanks I'll do that.

Best regards,

S.P.
-
This email and any files transmitted and/or attachments with it are 
confidential and proprietary information of
Toga Networks Ltd., and intended solely for the use of the individual or entity 
to whom they are addressed.
If you have received this email in error please notify the system manager. This 
message contains confidential
information of Toga Networks Ltd., and is intended only for the individual 
named. If you are not the named
addressee you should not disseminate, distribute or copy this e-mail. Please 
notify the sender immediately
by e-mail if you have received this e-mail by mistake and delete this e-mail 
from your system. If you are not
the intended recipient you are notified that disclosing, copying, distributing 
or taking any action in reliance on
the contents of this information is strictly prohibited.





Re: [Qemu-devel] GICv3/GIC-500

2015-03-05 Thread Shlomo Pongratz
Hi Peter,

Thank you for your response.
You are correct, I'm implementing fully emulated GIC-500.
I assume that you are correct and indeed I have a bug in the implementation, I 
think it is related to timing somehow as by adding debug printouts the system 
is more likely to boot.

I'll prepare a RFC patch and send it at the beginning of next week, and I hope 
you can spare the time to review it.

Best regards,

S.P.
 


[Qemu-devel] GICv3/GIC-500

2015-03-05 Thread Shlomo Pongratz
Hi,

I'm trying to implement GICv3 (actually GIC-500) in order to support more than 
8 cores for ARM64.
Up to 24 cores I didn't notice any significant problems (just slow boot) but 
with 64 or 32 cores the Linux kernel is usually got stuck, seldom it completes 
the boot.
When examining the registers I see that all cores except one is stuck in ether 
"PC=ffc000108a18" or "PC=ffc000108a18" which is according to objdump in:
kernel/stop_machine.c::multi_cpu_stop

msdata->state = newstate;
ffc000108a10:   b9002261str w1, [x19,#32]
ffc000108a14:   2a1403e1mov w1, w20
default:
break;
}
ack_state(msdata);
}
} while (curstate != MULTI_STOP_EXIT);
ffc000108a18:   7100103fcmp w1, #0x4
ffc000108a1c:   54000120b.eqffc000108a40 


/* Simple state machine */
do {
/* Chill out and ensure we re-read multi_stop_state. */
cpu_relax();
if (msdata->state != curstate) {
ffc000108a20:   b9402274ldr w20, [x19,#32]
ffc000108a24:   6b01029fcmp w20, w1

There is one CPU however (and there is always such CPU) is stuck in 
"PC=ffc0002cd9f4" which is in

drivers/irqchip/irq-gic-v3.c ::gic_eoi_irq

static void gic_eoi_irq(struct irq_data *d)
{
gic_write_eoir(gic_irq(d));
ffc0002cd9ec:   b9400800ldr w0, [x0,#8]
ffc0002cd9f0:   d518cc20msr s3_0_c12_c12_1, x0
ffc0002cd9f4:   d5033fdfisb
}
ffc0002cd9f8:   d65f03c0ret

But according to target-arm/translate-a64.c::handle_sync "isb" is translated as 
no-op!
BTW X00=001b is the virtual timer IRQ-27, so it seems that only 
this core (number 7 at this point)  is getting clock.

Dose anyone can give me an advice of how to further debug this issue?

Best regards,

S.P.


[Qemu-devel] QEMU PEX HW device

2012-02-23 Thread Shlomo Pongratz
Hi,

I want to add a new PEX HW device emulation to QEMU, but I can't find a 
skeleton/template driver or documentation that explains how to do it.
Are there any guidelines for this task?

Best regards,

S.P.