Re: [Qemu-devel] [RFC PATCH 02/34] tcg+qom: QOMify core CPU defintions

2015-05-12 Thread Peter Crosthwaite
On Mon, May 11, 2015 at 1:18 PM, Richard Henderson r...@twiddle.net wrote:
 On 05/11/2015 03:24 AM, Paolo Bonzini wrote:


 On 11/05/2015 12:18, Andreas Färber wrote:
 +int (*cpu_mmu_index)(CPUState *cpu);
 +void (*cpu_get_tb_cpu_state)(CPUState *cpu,
 + void *pc, /* target_long * */
 + void *cs_base, /* target_long */
 + int *flags);
 +void (*gen_intermediate_code)(void *env, struct TranslationBlock 
 *tb);
 +void (*gen_intermediate_code_pc)(void *env, struct TranslationBlock 
 *tb);
 +void (*restore_state_to_opc)(void *env, struct TranslationBlock *tb,
 + int pc_pos);
 +void (*tlb_fill)(CPUState *cs, uint64_t addr, int is_write, int 
 mmu_idx,
 + uintptr_t retaddr);
  } CPUClass;

  #ifdef HOST_WORDS_BIGENDIAN
 [snip]

 Paolo had objected to this when I tried it. The counter-suggestion was
 something about reworking how the cputlb code is built per target -
 please check the archives.

 Right.  My point was that these functions are not polymorphic.  Each
 call to these should know exactly which function to call.

 That's some major surgery you have planned there.


Just tried this. It's ... rather hard.

 Especially the path via the qemu_ld/st helpers, where function to call is
 currently hard-coded into the tcg backend.


Yes. I actually seemed to make decent progress on the multi-compile
approach until I hit this brick wall:

tcg/i386/tcg-target.c:1131:17: error: ‘helper_ret_ldub_mmu’ undeclared
here (not in a function)
 [MO_UB]   = helper_ret_ldub_mmu,

In my multi-compile approach helper_*[ld|st]* needs to be renamed
per-arch for the multiple compiled cputlb.o. Hence I have no symbol
with the unqualified name. But even if I do solve my namespacing
problem, I still have an ambiguity of which cputlb.o provided
helper_*[ld|st]* to use from the TCG backend. This would mean all
those APIs would have to virtualised. The big question for Paolo, is
what complete set of APIs defines the common-code/non-common-code
boundary? tlb_fill does seem to do the job nicely and looking at the
architecture implementations it's not a super fast path (falling back
to a page table faulter).

Somewhere along the call path from the qemu_st_helpers uses
(tcg/i386/tcg-target.c) through to tlb_fill there has to be a
virtualised function unless I am missing something?

 I think that this is a decent step forward, modulo the conditionals along the
 use paths.  I think we ought to clean up all of the translators to the new QOM
 hooks.


So the conditional can be ditched by having the CPU base class
defaulting the hook to the globally defined function. Then arches can
be brought online one-by-one.

 I can't imagine that most of these hooks are called frequently enough that the
 indirect call really matters.  Certainly gen_intermediate_code need not use 
 the
 hook when initializing the mmu_idx in the DisasContext.


Ok so the solution to this is to opt-out of the hook via a re-#define
when we have a target-specific cpu.h handy. This will actually mean no
change to single-arch builds but multi-arch will use the hook from
core code only.

 That said, I'd approve of a goal to arrange for the correct qemu_ld/st helpers
 to be called, and a direct call to the proper tlb_fill.  But, one step at a 
 time...


I don't know what this means exactly. tlb_fill is called by functions
that are linked to common code (TCG backends) so I don't see a non
virtualized solution. Is this refactoring to move tlb_fill?

Regards,
Peter


 r~




Re: [Qemu-devel] [RFC PATCH 02/34] tcg+qom: QOMify core CPU defintions

2015-05-12 Thread Peter Crosthwaite
On Mon, May 11, 2015 at 3:39 AM, Paolo Bonzini pbonz...@redhat.com wrote:


 On 11/05/2015 12:36, Andreas Färber wrote:
  Right.  My point was that these functions are not polymorphic.  Each
  call to these should know exactly which function to call.
 
  cputlb.c, cpu-exec.c and parts of translate-all.c should be the moral
  equivalent of C++ templates.  I wouldn't mind switching to C++, but if
  we want to make them polymorphic we should do it at compile time through
  multiple compilation and/or inclusion from target-*.

 I think we got more concrete than that, possibly on IRC only? I believe
 #include'ing cputlb.c from some target-* file was one of your proposed
 solutions?

 Yes (but Peter is making more functions virtual, so he'd have to do the
 same also for cpu-exec.c and parts of translate-all.c).

 And splitting some of the inline functions that I was getting
 rid of into some new (or existing?) file?

 If I remember correctly, another solution could be to keep the virtual
 functions, but ensure that all the hot paths were devirtualizing them
 and calling the CPU-specific function directly.


What does this mean exactly? Is there still a common code for cputlb.c
and friends?

Regards,
Peter

 Paolo




Re: [Qemu-devel] [RFC PATCH 02/34] tcg+qom: QOMify core CPU defintions

2015-05-12 Thread Richard Henderson
On 05/12/2015 12:23 AM, Peter Crosthwaite wrote:
 In my multi-compile approach helper_*[ld|st]* needs to be renamed
 per-arch for the multiple compiled cputlb.o. Hence I have no symbol
 with the unqualified name. But even if I do solve my namespacing
 problem, I still have an ambiguity of which cputlb.o provided
 helper_*[ld|st]* to use from the TCG backend. This would mean all
 those APIs would have to virtualised. The big question for Paolo, is
 what complete set of APIs defines the common-code/non-common-code
 boundary? tlb_fill does seem to do the job nicely and looking at the
 architecture implementations it's not a super fast path (falling back
 to a page table faulter).
 
 Somewhere along the call path from the qemu_st_helpers uses
 (tcg/i386/tcg-target.c) through to tlb_fill there has to be a
 virtualised function unless I am missing something?

I think both cpu_unaligned_access and tlb_fill need to be hooked.

 I think that this is a decent step forward, modulo the conditionals along the
 use paths.  I think we ought to clean up all of the translators to the new 
 QOM
 hooks.

 
 So the conditional can be ditched by having the CPU base class
 defaulting the hook to the globally defined function. Then arches can
 be brought online one-by-one.

Yes, exactly.

 Ok so the solution to this is to opt-out of the hook via a re-#define
 when we have a target-specific cpu.h handy. This will actually mean no
 change to single-arch builds but multi-arch will use the hook from
 core code only.

Err... not via #defines, please.  Just use the arch_foo name all spelled out
from target-specific code.

 I don't know what this means exactly. tlb_fill is called by functions
 that are linked to common code (TCG backends) so I don't see a non
 virtualized solution. Is this refactoring to move tlb_fill?

It means if we do find a way to parameterize the tcg backend, e.g. by putting
the whole table of functions into the class, then we can revisit generating
cpu-specific versions of the memory helpers.


r~




[Qemu-devel] [RFC PATCH 02/34] tcg+qom: QOMify core CPU defintions

2015-05-11 Thread Peter Crosthwaite
These definitions are defined per-target and globall linked/defined
between core code and target-foo. QOMify them. Provide weakly linked
conditional default implementations for the non-qomified global fns.
This means converted architectures which install a QOM hook do not need
to define a function for the old globals even the common code expects
to link against something.

The top level definition of some functions is still left up to the
individual target cpu.h files, making the QOMified code paths opt-in
per target.

Signed-off-by: Peter Crosthwaite crosthwaite.pe...@gmail.com
---
 include/qom/cpu.h  | 24 
 qom/cpu.c  |  6 ++
 softmmu_template.h |  6 ++
 translate-all.c| 47 ---
 4 files changed, 80 insertions(+), 3 deletions(-)

diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 363c928..2cb89ab 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -119,6 +119,7 @@ struct TranslationBlock;
  * @cpu_exec_exit: Callback for cpu_exec cleanup.
  * @cpu_exec_interrupt: Callback for processing interrupts in cpu_exec.
  * @disas_set_info: Setup architecture specific components of disassembly info
+ * @cpu_mmu_index: Get MMU index for normal load stores
  *
  * Represents a CPU family or model.
  */
@@ -176,6 +177,17 @@ typedef struct CPUClass {
 bool (*cpu_exec_interrupt)(CPUState *cpu, int interrupt_request);
 
 void (*disas_set_info)(CPUState *cpu, disassemble_info *info);
+int (*cpu_mmu_index)(CPUState *cpu);
+void (*cpu_get_tb_cpu_state)(CPUState *cpu,
+ void *pc, /* target_long * */
+ void *cs_base, /* target_long */
+ int *flags);
+void (*gen_intermediate_code)(void *env, struct TranslationBlock *tb);
+void (*gen_intermediate_code_pc)(void *env, struct TranslationBlock *tb);
+void (*restore_state_to_opc)(void *env, struct TranslationBlock *tb,
+ int pc_pos);
+void (*tlb_fill)(CPUState *cs, uint64_t addr, int is_write, int mmu_idx,
+ uintptr_t retaddr);
 } CPUClass;
 
 #ifdef HOST_WORDS_BIGENDIAN
@@ -319,6 +331,18 @@ struct CPUState {
(absolute value) offset as small as possible.  This reduces code
size, especially for hosts without large memory offsets.  */
 volatile sig_atomic_t tcg_exit_req;
+
+int (*cpu_mmu_index)(CPUState *cpu);
+void (*cpu_get_tb_cpu_state)(CPUState *cpu,
+ void *pc, /* target_long * */
+ void *cs_base, /* target_long */
+ int *flags);
+void (*gen_intermediate_code)(void *env, struct TranslationBlock *tb);
+void (*gen_intermediate_code_pc)(void *env, struct TranslationBlock *tb);
+void (*restore_state_to_opc)(void *env, struct TranslationBlock *tb,
+ int pc_pos);
+void (*tlb_fill)(CPUState *cs, uint64_t addr, int is_write, int mmu_idx,
+ uintptr_t retaddr);
 };
 
 QTAILQ_HEAD(CPUTailQ, CPUState);
diff --git a/qom/cpu.c b/qom/cpu.c
index 108bfa2..3fd7869 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -313,6 +313,12 @@ static void cpu_common_initfn(Object *obj)
 CPUClass *cc = CPU_GET_CLASS(obj);
 
 cpu-gdb_num_regs = cpu-gdb_num_g_regs = cc-gdb_num_core_regs;
+cpu-cpu_mmu_index = cc-cpu_mmu_index;
+cpu-cpu_get_tb_cpu_state = cc-cpu_get_tb_cpu_state;
+cpu-gen_intermediate_code = cc-gen_intermediate_code;
+cpu-gen_intermediate_code_pc = cc-gen_intermediate_code_pc;
+cpu-restore_state_to_opc = cc-restore_state_to_opc;
+cpu-tlb_fill = cc-tlb_fill;
 }
 
 static int64_t cpu_common_get_arch_id(CPUState *cpu)
diff --git a/softmmu_template.h b/softmmu_template.h
index 16b0852..dea12d2 100644
--- a/softmmu_template.h
+++ b/softmmu_template.h
@@ -141,6 +141,10 @@
 vidx = 0;\
 })
 
+#define tlb_fill(cpu, a, f, i, r) \
+((cpu)-tlb_fill ? (cpu)-tlb_fill((cpu), (a), (f), (i), (r)) \
+ : tlb_fill((cpu), (a), (f), (i), (r)))
+
 #ifndef SOFTMMU_CODE_ACCESS
 static inline DATA_TYPE glue(io_read, SUFFIX)(CPUArchState *env,
   CPUIOTLBEntry *iotlbentry,
@@ -576,3 +580,5 @@ glue(glue(helper_st, SUFFIX), MMUSUFFIX)(CPUArchState *env, 
target_ulong addr,
 #undef helper_be_st_name
 #undef helper_te_ld_name
 #undef helper_te_st_name
+
+#undef tlb_fill
diff --git a/translate-all.c b/translate-all.c
index 65a76c5..1b9a405 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -144,6 +144,7 @@ void cpu_gen_init(void)
 */
 int cpu_gen_code(CPUArchState *env, TranslationBlock *tb, int 
*gen_code_size_ptr)
 {
+CPUState *cs = ENV_GET_CPU(env);
 TCGContext *s = tcg_ctx;
 tcg_insn_unit *gen_code_buf;
 int gen_code_size;
@@ -158,7 +159,11 @@ int cpu_gen_code(CPUArchState *env, 

Re: [Qemu-devel] [RFC PATCH 02/34] tcg+qom: QOMify core CPU defintions

2015-05-11 Thread Andreas Färber
Am 11.05.2015 um 12:24 schrieb Paolo Bonzini:
 On 11/05/2015 12:18, Andreas Färber wrote:
 +int (*cpu_mmu_index)(CPUState *cpu);
 +void (*cpu_get_tb_cpu_state)(CPUState *cpu,
 + void *pc, /* target_long * */
 + void *cs_base, /* target_long */
 + int *flags);
 +void (*gen_intermediate_code)(void *env, struct TranslationBlock *tb);
 +void (*gen_intermediate_code_pc)(void *env, struct TranslationBlock 
 *tb);
 +void (*restore_state_to_opc)(void *env, struct TranslationBlock *tb,
 + int pc_pos);
 +void (*tlb_fill)(CPUState *cs, uint64_t addr, int is_write, int 
 mmu_idx,
 + uintptr_t retaddr);
  } CPUClass;
  
  #ifdef HOST_WORDS_BIGENDIAN
 [snip]

 Paolo had objected to this when I tried it. The counter-suggestion was
 something about reworking how the cputlb code is built per target -
 please check the archives.
 
 Right.  My point was that these functions are not polymorphic.  Each
 call to these should know exactly which function to call.
 
 cputlb.c, cpu-exec.c and parts of translate-all.c should be the moral
 equivalent of C++ templates.  I wouldn't mind switching to C++, but if
 we want to make them polymorphic we should do it at compile time through
 multiple compilation and/or inclusion from target-*.

I think we got more concrete than that, possibly on IRC only? I believe
#include'ing cputlb.c from some target-* file was one of your proposed
solutions? And splitting some of the inline functions that I was getting
rid of into some new (or existing?) file?

Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)



Re: [Qemu-devel] [RFC PATCH 02/34] tcg+qom: QOMify core CPU defintions

2015-05-11 Thread Paolo Bonzini


On 11/05/2015 12:18, Andreas Färber wrote:
  +int (*cpu_mmu_index)(CPUState *cpu);
  +void (*cpu_get_tb_cpu_state)(CPUState *cpu,
  + void *pc, /* target_long * */
  + void *cs_base, /* target_long */
  + int *flags);
  +void (*gen_intermediate_code)(void *env, struct TranslationBlock *tb);
  +void (*gen_intermediate_code_pc)(void *env, struct TranslationBlock 
  *tb);
  +void (*restore_state_to_opc)(void *env, struct TranslationBlock *tb,
  + int pc_pos);
  +void (*tlb_fill)(CPUState *cs, uint64_t addr, int is_write, int 
  mmu_idx,
  + uintptr_t retaddr);
   } CPUClass;
   
   #ifdef HOST_WORDS_BIGENDIAN
 [snip]
 
 Paolo had objected to this when I tried it. The counter-suggestion was
 something about reworking how the cputlb code is built per target -
 please check the archives.

Right.  My point was that these functions are not polymorphic.  Each
call to these should know exactly which function to call.

cputlb.c, cpu-exec.c and parts of translate-all.c should be the moral
equivalent of C++ templates.  I wouldn't mind switching to C++, but if
we want to make them polymorphic we should do it at compile time through
multiple compilation and/or inclusion from target-*.

Paolo



Re: [Qemu-devel] [RFC PATCH 02/34] tcg+qom: QOMify core CPU defintions

2015-05-11 Thread Paolo Bonzini


On 11/05/2015 12:36, Andreas Färber wrote:
  Right.  My point was that these functions are not polymorphic.  Each
  call to these should know exactly which function to call.
  
  cputlb.c, cpu-exec.c and parts of translate-all.c should be the moral
  equivalent of C++ templates.  I wouldn't mind switching to C++, but if
  we want to make them polymorphic we should do it at compile time through
  multiple compilation and/or inclusion from target-*.
 
 I think we got more concrete than that, possibly on IRC only? I believe
 #include'ing cputlb.c from some target-* file was one of your proposed
 solutions?

Yes (but Peter is making more functions virtual, so he'd have to do the
same also for cpu-exec.c and parts of translate-all.c).

 And splitting some of the inline functions that I was getting
 rid of into some new (or existing?) file?

If I remember correctly, another solution could be to keep the virtual
functions, but ensure that all the hot paths were devirtualizing them
and calling the CPU-specific function directly.

Paolo



Re: [Qemu-devel] [RFC PATCH 02/34] tcg+qom: QOMify core CPU defintions

2015-05-11 Thread Andreas Färber
Am 11.05.2015 um 08:29 schrieb Peter Crosthwaite:
 These definitions are defined per-target and globall linked/defined
 between core code and target-foo. QOMify them. Provide weakly linked
 conditional default implementations for the non-qomified global fns.
 This means converted architectures which install a QOM hook do not need
 to define a function for the old globals even the common code expects
 to link against something.
 
 The top level definition of some functions is still left up to the
 individual target cpu.h files, making the QOMified code paths opt-in
 per target.
 
 Signed-off-by: Peter Crosthwaite crosthwaite.pe...@gmail.com
 ---
  include/qom/cpu.h  | 24 
  qom/cpu.c  |  6 ++
  softmmu_template.h |  6 ++
  translate-all.c| 47 ---
  4 files changed, 80 insertions(+), 3 deletions(-)
 
 diff --git a/include/qom/cpu.h b/include/qom/cpu.h
 index 363c928..2cb89ab 100644
 --- a/include/qom/cpu.h
 +++ b/include/qom/cpu.h
 @@ -119,6 +119,7 @@ struct TranslationBlock;
   * @cpu_exec_exit: Callback for cpu_exec cleanup.
   * @cpu_exec_interrupt: Callback for processing interrupts in cpu_exec.
   * @disas_set_info: Setup architecture specific components of disassembly 
 info
 + * @cpu_mmu_index: Get MMU index for normal load stores
   *
   * Represents a CPU family or model.
   */
 @@ -176,6 +177,17 @@ typedef struct CPUClass {
  bool (*cpu_exec_interrupt)(CPUState *cpu, int interrupt_request);
  
  void (*disas_set_info)(CPUState *cpu, disassemble_info *info);
 +int (*cpu_mmu_index)(CPUState *cpu);
 +void (*cpu_get_tb_cpu_state)(CPUState *cpu,
 + void *pc, /* target_long * */
 + void *cs_base, /* target_long */
 + int *flags);
 +void (*gen_intermediate_code)(void *env, struct TranslationBlock *tb);
 +void (*gen_intermediate_code_pc)(void *env, struct TranslationBlock *tb);
 +void (*restore_state_to_opc)(void *env, struct TranslationBlock *tb,
 + int pc_pos);
 +void (*tlb_fill)(CPUState *cs, uint64_t addr, int is_write, int mmu_idx,
 + uintptr_t retaddr);
  } CPUClass;
  
  #ifdef HOST_WORDS_BIGENDIAN
[snip]

Paolo had objected to this when I tried it. The counter-suggestion was
something about reworking how the cputlb code is built per target -
please check the archives.

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)



Re: [Qemu-devel] [RFC PATCH 02/34] tcg+qom: QOMify core CPU defintions

2015-05-11 Thread Peter Maydell
On 11 May 2015 at 07:29, Peter Crosthwaite crosthwaitepe...@gmail.com wrote:
 These definitions are defined per-target and globall linked/defined
 between core code and target-foo. QOMify them. Provide weakly linked
 conditional default implementations for the non-qomified global fns.
 This means converted architectures which install a QOM hook do not need
 to define a function for the old globals even the common code expects
 to link against something.

 +void __attribute__((weak)) gen_intermediate_code(CPUArchState *env,
 + struct TranslationBlock *tb)
 +{
 +abort();
 +}

__attribute__((weak)) doesn't work on all the platforms we need
to support, I'm afraid. The approach we've settled on instead
is to provide stub functions in source files in the stubs/ directory
which get built into a libqemustub.a library; then the linker will
fish out the .o files from that as needed to satisfy otherwise
missing symbols.

-- PMM



Re: [Qemu-devel] [RFC PATCH 02/34] tcg+qom: QOMify core CPU defintions

2015-05-11 Thread Richard Henderson
On 05/11/2015 03:24 AM, Paolo Bonzini wrote:
 
 
 On 11/05/2015 12:18, Andreas Färber wrote:
 +int (*cpu_mmu_index)(CPUState *cpu);
 +void (*cpu_get_tb_cpu_state)(CPUState *cpu,
 + void *pc, /* target_long * */
 + void *cs_base, /* target_long */
 + int *flags);
 +void (*gen_intermediate_code)(void *env, struct TranslationBlock *tb);
 +void (*gen_intermediate_code_pc)(void *env, struct TranslationBlock 
 *tb);
 +void (*restore_state_to_opc)(void *env, struct TranslationBlock *tb,
 + int pc_pos);
 +void (*tlb_fill)(CPUState *cs, uint64_t addr, int is_write, int 
 mmu_idx,
 + uintptr_t retaddr);
  } CPUClass;
  
  #ifdef HOST_WORDS_BIGENDIAN
 [snip]

 Paolo had objected to this when I tried it. The counter-suggestion was
 something about reworking how the cputlb code is built per target -
 please check the archives.
 
 Right.  My point was that these functions are not polymorphic.  Each
 call to these should know exactly which function to call.

That's some major surgery you have planned there.

Especially the path via the qemu_ld/st helpers, where function to call is
currently hard-coded into the tcg backend.

I think that this is a decent step forward, modulo the conditionals along the
use paths.  I think we ought to clean up all of the translators to the new QOM
hooks.

I can't imagine that most of these hooks are called frequently enough that the
indirect call really matters.  Certainly gen_intermediate_code need not use the
hook when initializing the mmu_idx in the DisasContext.

That said, I'd approve of a goal to arrange for the correct qemu_ld/st helpers
to be called, and a direct call to the proper tlb_fill.  But, one step at a 
time...


r~