On Thu, 4 Aug 2022, Mark Cave-Ayland wrote:
On 04/08/2022 00:04, BALATON Zoltan wrote:
On Wed, 3 Aug 2022, Cédric Le Goater wrote:
Reviewed-by: Daniel Henrique Barboza <danielhb...@gmail.com>
Signed-off-by: Cédric Le Goater <c...@kaod.org>
---
hw/ppc/ppc405.h | 16 +++++++++++
hw/ppc/ppc405_uc.c | 71 +++++++++++++++++++++++++++++++---------------
2 files changed, 64 insertions(+), 23 deletions(-)
diff --git a/hw/ppc/ppc405.h b/hw/ppc/ppc405.h
index 1da34a7f10f3..1c7fe07b8084 100644
--- a/hw/ppc/ppc405.h
+++ b/hw/ppc/ppc405.h
@@ -65,7 +65,22 @@ struct ppc4xx_bd_info_t {
typedef struct Ppc405SoCState Ppc405SoCState;
+/* Peripheral controller */
+#define TYPE_PPC405_EBC "ppc405-ebc"
+OBJECT_DECLARE_SIMPLE_TYPE(Ppc405EbcState, PPC405_EBC);
+struct Ppc405EbcState {
+ DeviceState parent_obj;
+
+ PowerPCCPU *cpu;
+ uint32_t addr;
+ uint32_t bcr[8];
+ uint32_t bap[8];
+ uint32_t bear;
+ uint32_t besr0;
+ uint32_t besr1;
+ uint32_t cfg;
+};
/* DMA controller */
#define TYPE_PPC405_DMA "ppc405-dma"
@@ -203,6 +218,7 @@ struct Ppc405SoCState {
Ppc405OcmState ocm;
Ppc405GpioState gpio;
Ppc405DmaState dma;
+ Ppc405EbcState ebc;
};
/* PowerPC 405 core */
diff --git a/hw/ppc/ppc405_uc.c b/hw/ppc/ppc405_uc.c
index 6bd93c1cb90c..0166f3fc36da 100644
--- a/hw/ppc/ppc405_uc.c
+++ b/hw/ppc/ppc405_uc.c
@@ -393,17 +393,6 @@ static void ppc4xx_opba_init(hwaddr base)
/*****************************************************************************/
/* Peripheral controller */
-typedef struct ppc4xx_ebc_t ppc4xx_ebc_t;
-struct ppc4xx_ebc_t {
- uint32_t addr;
- uint32_t bcr[8];
- uint32_t bap[8];
- uint32_t bear;
- uint32_t besr0;
- uint32_t besr1;
- uint32_t cfg;
-};
-
enum {
EBC0_CFGADDR = 0x012,
EBC0_CFGDATA = 0x013,
@@ -411,10 +400,9 @@ enum {
static uint32_t dcr_read_ebc (void *opaque, int dcrn)
{
- ppc4xx_ebc_t *ebc;
+ Ppc405EbcState *ebc = PPC405_EBC(opaque);
uint32_t ret;
- ebc = opaque;
I think QOM casts are kind of expensive (maybe because we have
qom-cast-debug enabled by default even without --enable-debug and it does
additional checks; I tried to change this default once but it was thought
to be better to have it enabled). So it's advisable to use QOM casts
sparingly, e.g. store the result in a local variable if you need it more
than once. Therefore I tend to consider these read/write callbacks, which
the object registers with itself as the opaque pointer, to be internal to
the object and guaranteed to be passed the object pointer, so no QOM cast
is necessary and the direct assignment can be kept. This avoids potential
overhead on every register access. Not sure if it's measurable but I think
if an overhead can be avoided it probably should be.
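
To make the three options discussed above concrete, here is a minimal
sketch in plain C. It is a toy model, not QEMU code: ToyObject,
ToyEbcState, toy_ebc_checked_cast() and the toy_cast_checks counter are
all hypothetical names invented for illustration, and the string compare
only stands in for whatever a real checked cast does. The counter simply
shows how many run-time checks each style performs per call.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a QOM object: every object carries a type name. */
typedef struct ToyObject {
    const char *type_name;
} ToyObject;

#define TYPE_TOY_EBC "toy-ebc"

typedef struct ToyEbcState {
    ToyObject parent_obj;   /* first member, so pointer casts are valid */
    uint32_t addr;
    uint32_t cfg;
} ToyEbcState;

/* Counts how many run-time type checks have been performed. */
static unsigned toy_cast_checks;

/* Hypothetical checked cast: verify the type name, then cast. */
static ToyEbcState *toy_ebc_checked_cast(void *obj)
{
    ToyObject *o = obj;

    toy_cast_checks++;
    assert(strcmp(o->type_name, TYPE_TOY_EBC) == 0);
    return (ToyEbcState *)obj;
}

/* Checked cast at every use: two checks per call. */
static uint32_t toy_sum_cast_each_time(void *opaque)
{
    return toy_ebc_checked_cast(opaque)->addr +
           toy_ebc_checked_cast(opaque)->cfg;
}

/* Checked cast once, result kept in a local: one check per call. */
static uint32_t toy_sum_cast_once(void *opaque)
{
    ToyEbcState *ebc = toy_ebc_checked_cast(opaque);

    return ebc->addr + ebc->cfg;
}

/* Direct assignment from the opaque pointer: no check at all. */
static uint32_t toy_sum_direct(void *opaque)
{
    ToyEbcState *ebc = opaque;

    return ebc->addr + ebc->cfg;
}
```

All three functions return the same value for a valid object; they differ
only in how often the check runs, which is the overhead being debated.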
Can you provide any evidence for this? IIRC the efficiency of the QOM cast
macros without --enable-debug was improved several years ago to the point
where their impact is minimal (note: this does not include
object_dynamic_cast()). From memory the previous discussions concluded that
It could probably be measured on a slower machine when something does a
lot of register accesses, but I do not have any concrete numbers to prove
it, and in this particular case I'm not sure how often this device is
accessed, if it does anything at all. But this is a general remark for all
devices. An IDE device could be accessed many times, for example, so I
generally try to avoid unnecessary overhead.
AFAIK (which could well be wrong) a QOM cast is optimised down to a simple
cast if qom-cast-debug is disabled. The problem is that it's never disabled
unless somebody explicitly compiles with --disable-qom-cast-debug, as it is
enabled by default, even in release builds without --enable-debug. At
least that was the case when this was in configure; I don't know where it
went during the meson conversion but I think the default hasn't changed.
With qom-cast-debug a QOM cast ultimately calls
object_dynamic_cast_assert() via OBJECT_CHECK.
Here is the discussion from when I tried to change this:
https://lists.nongnu.org/archive/html/qemu-devel/2018-07/msg03371.html
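
As an illustration of a build-time switch of this kind, the sketch below
shows a checked cast whose type check is compiled out when the switch is
off. It is only a toy under stated assumptions: TOY_CAST_DEBUG, ToyObject,
toy_dynamic_cast_assert() and TOY_CHECK() are hypothetical names, not the
QEMU macros or functions, and the toy returns NULL on a mismatch purely so
the effect can be observed, where real code would be expected to abort.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical build switch, loosely modelled on the option discussed
 * above. It defaults to enabled unless the build overrides it.
 */
#ifndef TOY_CAST_DEBUG
#define TOY_CAST_DEBUG 1
#endif

/* Toy stand-in for a QOM object: every object carries a type name. */
typedef struct ToyObject {
    const char *type_name;
} ToyObject;

/*
 * Return obj if its type matches type_name. With the switch off the
 * check is compiled out and obj is returned unconditionally.
 */
static void *toy_dynamic_cast_assert(void *obj, const char *type_name)
{
#if TOY_CAST_DEBUG
    ToyObject *o = obj;

    if (strcmp(o->type_name, type_name) != 0) {
        return NULL;    /* toy behaviour only; see the note above */
    }
#else
    (void)type_name;    /* unused when the check is compiled out */
#endif
    return obj;
}

/* Cast macro in the style of a checked cast wrapper. */
#define TOY_CHECK(type, obj, name) \
    ((type *)toy_dynamic_cast_assert((obj), (name)))
```

Building the same file with -DTOY_CAST_DEBUG=0 leaves only the plain
pointer cast, so a mismatched object would pass through unnoticed.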
whilst the QOM cast did add some runtime overhead, it was dwarfed by the cost
of breaking out of emulation to handle the MMIO access itself. If something
has changed here then that sounds like a bug.
Not saying it has changed but having something already slow is not an
argument to make it even slower if that additional overhead can be
avoided. Maybe that makes it a little less slow even if the main reason
for slowness is not this.
I think it's worth keeping the QOM casts in place unless there is a good
reason not to, simply because they have helped me many times in the past to
catch refactoring mistakes. For example I can certainly imagine that the
recent PHB series would have been a lot more painful without having them.
A good reason in my opinion is that these are read/write callbacks of the
object which are registered in the realize method with the object itself
as the opaque parameter, which was already QOM cast from the realize
method's device parameter, so there's no way these read/write callbacks are
called with an unchecked object. Therefore the QOM cast with check is
unnecessary here and we can safely assign it to the appropriate type
without checking it again at every register access. Because of this, I
always avoid QOM casts in these callback functions, as this can only make
things better and is unlikely to make them worse.
The QOM casts are warranted in object methods such as realize or init
that could somehow be called with a wrong object (I'm not sure how,
if these are object methods, but maybe through a subclass or something) but
are not needed in register access callbacks that are internal to the object
and are only passed already checked objects.
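
The pattern argued for here can be sketched as follows. This is a toy
model in plain C, not the actual ppc405 code: ToyObject, ToyEbcState,
ToyDcrSlot, toy_ebc_realize() and toy_dcr_read_ebc() are hypothetical
names, the single slot stands in for whatever table the callbacks get
registered in, and the register decoding is simplified. The checked cast
runs once in realize; the callback then trusts its opaque pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a QOM object: every object carries a type name. */
typedef struct ToyObject {
    const char *type_name;
} ToyObject;

#define TYPE_TOY_EBC "toy-ebc"

typedef struct ToyEbcState {
    ToyObject parent_obj;   /* first member, so pointer casts are valid */
    uint32_t addr;
    uint32_t cfg;
} ToyEbcState;

enum {
    TOY_EBC0_CFGADDR = 0x012,
    TOY_EBC0_CFGDATA = 0x013,
};

/* One registration slot: a read callback plus its opaque pointer. */
typedef uint32_t (*toy_dcr_read_fn)(void *opaque, int dcrn);

typedef struct ToyDcrSlot {
    toy_dcr_read_fn read;
    void *opaque;
} ToyDcrSlot;

/* Hypothetical checked cast, used where the object type is not yet known. */
static ToyEbcState *toy_ebc_checked_cast(ToyObject *obj)
{
    assert(strcmp(obj->type_name, TYPE_TOY_EBC) == 0);
    return (ToyEbcState *)obj;
}

/* Register read callback: direct assignment, no check per access. */
static uint32_t toy_dcr_read_ebc(void *opaque, int dcrn)
{
    ToyEbcState *ebc = opaque;

    switch (dcrn) {
    case TOY_EBC0_CFGADDR:
        return ebc->addr;
    case TOY_EBC0_CFGDATA:
        return ebc->cfg;    /* simplified: no indirect register select */
    default:
        return 0;
    }
}

/* Realize: check the type once, then register the object as opaque. */
static void toy_ebc_realize(ToyObject *dev, ToyDcrSlot *slot)
{
    ToyEbcState *ebc = toy_ebc_checked_cast(dev);

    slot->read = toy_dcr_read_ebc;
    slot->opaque = ebc;
}
```

The safety of the unchecked assignment rests entirely on realize being the
only place that registers the callback; a later refactoring that passes a
different opaque pointer would go unnoticed, which is the trade-off the
other side of this thread is pointing at.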
Regards,
BALATON Zoltan