Re: [PATCH v5 03/14] target/hexagon: import README for idef-parser

2021-06-24 Thread Alessandro Di Federico via
On Wed, 23 Jun 2021 15:46:22 +
Taylor Simpson  wrote:

> The output isn't actually indented, but it would be great if it were.
>  This is especially true for instructions where an "if" or "for" show
> up in the emitted code.

Emitting whitespace in the parser is a bit annoying and fragile, but
it can be done.

We post-process the parser output with `indent` internally. We could
run it after the output is produced, but that would mean a new
dependency.

I'd go for opportunistic indentation through `indent` if installed.
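
A minimal sketch of what "opportunistic" could look like, assuming
`indent` reads from stdin and writes to stdout when given no file
arguments. `maybe_indent` is a hypothetical helper name, not something
in this series:

```python
import shutil
import subprocess


def maybe_indent(source: str, tool: str = "indent") -> str:
    """Return `source` re-indented by `tool`, or unchanged if that fails.

    `maybe_indent` is a hypothetical helper for illustration only.
    """
    path = shutil.which(tool)
    if path is None:
        # Tool not installed: keep the generated code as-is.
        return source
    try:
        result = subprocess.run(
            [path],
            input=source,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # Never let a formatting failure break the build.
        return source
    # Guard against a tool that exits successfully but prints nothing.
    return result.stdout or source
```

Falling back to the unformatted text keeps `indent` an optional
nicety rather than a build dependency.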

> Is there a way to force the parser not to emit a particular
> instruction (i.e., fall back on the reference implementation)?

Yes, see `gen_idef_parser_funcs.py`.

-- 
Alessandro Di Federico
rev.ng



RE: [PATCH v5 03/14] target/hexagon: import README for idef-parser

2021-06-23 Thread Taylor Simpson


> -----Original Message-----
> From: Alessandro Di Federico 
> Sent: Saturday, June 19, 2021 3:37 AM
> To: qemu-devel@nongnu.org
> Cc: Taylor Simpson ; Brian Cain
> ; bab...@rev.ng; ni...@rev.ng; phi...@redhat.com;
> richard.hender...@linaro.org; Alessandro Di Federico 
> Subject: [PATCH v5 03/14] target/hexagon: import README for idef-parser
> 
> From: Alessandro Di Federico 
> 
> Signed-off-by: Alessandro Di Federico 
> ---
>  target/hexagon/README |   5 +
>  target/hexagon/idef-parser/README.rst | 447 ++
>  2 files changed, 452 insertions(+)
>  create mode 100644 target/hexagon/idef-parser/README.rst
> 
> diff --git a/target/hexagon/README b/target/hexagon/README
> index b0b2435070..2f2814380c 100644
> --- a/target/hexagon/README
> +++ b/target/hexagon/README
> @@ -43,6 +47,7 @@ header files in /target/hexagon
>  gen_tcg_funcs.py-> tcg_funcs_generated.c.inc
>  gen_tcg_func_table.py   -> tcg_func_table_generated.c.inc
>  gen_helper_funcs.py -> helper_funcs_generated.c.inc
> +gen_idef_parser_funcs.py-> idef_parser_input.h

The output file is actually named idef_parser_input.h.inc


> diff --git a/target/hexagon/idef-parser/README.rst b/target/hexagon/idef-parser/README.rst
> new file mode 100644
> index 00..f4cb416e8b
> --- /dev/null
> +++ b/target/hexagon/idef-parser/README.rst
> @@ -0,0 +1,447 @@
> +Hexagon ISA instruction definitions to tinycode generator compiler
> +------------------------------------------------------------------
> +
> +idef-parser is a small compiler that translates the Hexagon ISA
> +description language into tinycode generator code that can be easily
> +integrated into QEMU.
> +
> +Compilation Example
> +-------------------
> +
> +To better understand the scope of idef-parser, we'll walk through a
> +practical example. Let's start with one of the simplest Hexagon
> +instructions: ``add``.
> +
> +The ISA description language represents the ``add`` instruction as
> +follows:
> +
> +.. code:: c
> +
> +   A2_add(RdV, in RsV, in RtV) {
> +       { RdV=RsV+RtV;}
> +   }
> +
> +idef-parser will compile the above code into the following code:
> +
> +.. code:: c
> +
> +   /* A2_add */
> +   void emit_A2_add(DisasContext *ctx, Insn *insn, Packet *pkt, TCGv_i32 RdV,
> +                    TCGv_i32 RsV, TCGv_i32 RtV)
> +   /*  { RdV=RsV+RtV;} */
> +   {
> +       tcg_gen_movi_i32(RdV, 0);
> +       TCGv_i32 tmp_0 = tcg_temp_new_i32();
> +       tcg_gen_add_i32(tmp_0, RsV, RtV);
> +       tcg_gen_mov_i32(RdV, tmp_0);
> +       tcg_temp_free_i32(tmp_0);
> +   }

The output isn't actually indented, but it would be great if it were.  This is 
especially true for instructions where an "if" or "for" show up in the emitted 
code.

> +
> +Another approach to fixing a QEMU system test in which many instructions
> +might fail is to compare the execution trace of your implementation with
> +that of the reference implementations already present in QEMU. To do so,
> +obtain a QEMU build in which the instructions pass the test, and run it
> +with the following command:
> +
> +::
> +
> +   sudo unshare -p sudo -u  bash -c \
> +   'env -i  -d cpu '
> +
> +Then do the same for your implementation. The generated execution traces
> +will be inherently aligned and can be inspected for behavioral
> +differences using the ``diff`` tool.

Is there a way to force the parser not to emit a particular instruction (i.e., 
fall back on the reference implementation)?
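
On the trace comparison described in the README excerpt above: a
minimal sketch of locating the first point where two traces diverge,
assuming each trace has already been captured as a list of text lines.
The function names are hypothetical, and this only mirrors what the
``diff`` tool does; it is not part of the patch:

```python
import difflib


def first_divergence(reference, candidate):
    """Return (index, ref_line, cand_line) for the first mismatch, or None.

    A missing line (one trace is shorter) is reported as None.
    """
    for i, (ref, cand) in enumerate(zip(reference, candidate)):
        if ref != cand:
            return i, ref, cand
    if len(reference) != len(candidate):
        i = min(len(reference), len(candidate))
        ref = reference[i] if i < len(reference) else None
        cand = candidate[i] if i < len(candidate) else None
        return i, ref, cand
    return None


def trace_diff(reference, candidate):
    """Return a unified diff of the two traces as a list of lines."""
    return list(
        difflib.unified_diff(
            reference,
            candidate,
            fromfile="reference",
            tofile="candidate",
            lineterm="",
        )
    )
```

Because the traces are aligned, the first mismatching line usually
points at the first instruction whose behavior differs.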




[PATCH v5 03/14] target/hexagon: import README for idef-parser

2021-06-19 Thread Alessandro Di Federico via
From: Alessandro Di Federico 

Signed-off-by: Alessandro Di Federico 
---
 target/hexagon/README |   5 +
 target/hexagon/idef-parser/README.rst | 447 ++
 2 files changed, 452 insertions(+)
 create mode 100644 target/hexagon/idef-parser/README.rst

diff --git a/target/hexagon/README b/target/hexagon/README
index b0b2435070..2f2814380c 100644
--- a/target/hexagon/README
+++ b/target/hexagon/README
@@ -23,6 +23,10 @@ Hexagon-specific code are
 encode*.def Encoding patterns for each instruction
 iclass.def  Instruction class definitions used to determine
 legal VLIW slots for each instruction
+qemu/target/hexagon/idef-parser
+Parser that, given the high-level definitions of an instruction,
+produces a C function generating equivalent tiny code instructions.
+See README.rst.
 qemu/linux-user/hexagon
 Helpers for loading the ELF file and making Linux system calls,
 signals, etc
@@ -43,6 +47,7 @@ header files in /target/hexagon
 gen_tcg_funcs.py-> tcg_funcs_generated.c.inc
 gen_tcg_func_table.py   -> tcg_func_table_generated.c.inc
 gen_helper_funcs.py -> helper_funcs_generated.c.inc
+gen_idef_parser_funcs.py-> idef_parser_input.h
 
 Qemu helper functions have 3 parts
 DEF_HELPER declaration indicates the signature of the helper
diff --git a/target/hexagon/idef-parser/README.rst b/target/hexagon/idef-parser/README.rst
new file mode 100644
index 00..f4cb416e8b
--- /dev/null
+++ b/target/hexagon/idef-parser/README.rst
@@ -0,0 +1,447 @@
+Hexagon ISA instruction definitions to tinycode generator compiler
+------------------------------------------------------------------
+
+idef-parser is a small compiler that translates the Hexagon ISA description
+language into tinycode generator code that can be easily integrated into QEMU.
+
+Compilation Example
+-------------------
+
+To better understand the scope of idef-parser, we'll walk through a practical
+example. Let's start with one of the simplest Hexagon instructions: ``add``.
+
+The ISA description language represents the ``add`` instruction as
+follows:
+
+.. code:: c
+
+   A2_add(RdV, in RsV, in RtV) {
+       { RdV=RsV+RtV;}
+   }
+
+idef-parser will compile the above code into the following code:
+
+.. code:: c
+
+   /* A2_add */
+   void emit_A2_add(DisasContext *ctx, Insn *insn, Packet *pkt, TCGv_i32 RdV,
+                    TCGv_i32 RsV, TCGv_i32 RtV)
+   /*  { RdV=RsV+RtV;} */
+   {
+       tcg_gen_movi_i32(RdV, 0);
+       TCGv_i32 tmp_0 = tcg_temp_new_i32();
+       tcg_gen_add_i32(tmp_0, RsV, RtV);
+       tcg_gen_mov_i32(RdV, tmp_0);
+       tcg_temp_free_i32(tmp_0);
+   }
+
+The output of the compilation process is a function containing the tinycode
+generator code that implements the correct semantics. That function does not
+access any global variable, because all the data structures it uses are passed
+explicitly as function parameters. Among the parameters are TCGv (tinycode
+variables) representing the input and output registers of the architecture,
+integers representing the immediates that come from the code, and other data
+structures that hold information about the disassembly context
+(``DisasContext`` struct).
+
+Let's begin by describing the input code. The ``add`` instruction is associated
+with a unique identifier, in this case ``A2_add``, which distinguishes
+variants of the same instruction and expresses the class to which the
+instruction belongs: here, ``A2`` corresponds to the Hexagon
+``ALU32/ALU`` instruction subclass.
+
+After the instruction identifier comes a series of parameters that represent
+TCG variables that will be passed to the generated function. Parameters marked
+with ``in`` are already initialized, while the others are output parameters.
+
+We leverage this information to:
+
+-  Fill in the output function signature with the correct TCGv registers
+-  Fill in the output function signature with the immediate integers
+-  Keep track of which registers, among the declared ones, have been
+   initialized
+
+Let's now observe the actual instruction description code, in this case:
+
+.. code:: c
+
+   { RdV=RsV+RtV;}
+
+This code uses a subset of C syntax and is the result of applying the macro
+definitions contained in the ``macros.h`` file.
+
+This file reduces the complexity of the input language: complex variants of
+similar constructs are mapped to a single primitive, so that idef-parser has
+to handle fewer computation primitives.
+
+As you may notice, the description code modifies the registers which have been
+declared by the declaration statements. In this case all three registers
+will be declared, ``RsV`` and ``RtV`` wil