Diego and I seek your comments on the following (loose) proposal.

Generating gimple and tree expressions require lots of detail,
which is hard to remember and easy to get wrong.  There is some
amount of boilerplate code that can, in most cases, be reduced and
managed automatically.

We will add a set of helper classes to be used as local variables
to manage the details of handling the existing types.  That is,
a layer over 'gimple_build_*'. We intend to provide helpers for
those facilities that are both commonly used and have room for
significant simplification.


Generating an Expression

Suppose one wants to generate the expression (shadow != 0) &
(((base_addr & 7) + offset) >= shadow), where offset is a value and
the other identifiers are variables.  The current code to generate
this expression is as follows.

/* t = shadow != 0 */
g = gimple_build_assign_with_ops (NE_EXPR,
            make_ssa_name (boolean_type_node, NULL),
            shadow,
            build_int_cst (shadow_type, 0));
gimple_set_location (g, location);
gsi_insert_after (&gsi, g, GSI_NEW_STMT);
t = gimple_assign_lhs (g);

/* a = base_addr & 7 */
g = gimple_build_assign_with_ops (BIT_AND_EXPR,
            make_ssa_name (uintptr_type, NULL),
            base_addr,
            build_int_cst (uintptr_type, 7));
gimple_set_location (g, location);
gsi_insert_after (&gsi, g, GSI_NEW_STMT);

/* b = (shadow_type)a */
g = gimple_build_assign_with_ops (NOP_EXPR,
            make_ssa_name (shadow_type, NULL),
            gimple_assign_lhs (g),
            NULL_TREE);
gimple_set_location (g, location);
gsi_insert_after (&gsi, g, GSI_NEW_STMT);

/* c = b + offset */
g = gimple_build_assign_with_ops (PLUS_EXPR,
            make_ssa_name (shadow_type, NULL),
            gimple_assign_lhs (g),
            build_int_cst (shadow_type, offset));
gimple_set_location (g, location);
gsi_insert_after (&gsi, g, GSI_NEW_STMT);

/* d = c >= shadow */
g = gimple_build_assign_with_ops (GE_EXPR,
            make_ssa_name (boolean_type_node, NULL),
            gimple_assign_lhs (g),
            shadow);
gimple_set_location (g, location);
gsi_insert_after (&gsi, g, GSI_NEW_STMT);

/* r = t & d */
g = gimple_build_assign_with_ops (BIT_AND_EXPR,
            make_ssa_name (boolean_type_node, NULL),
            t,
            gimple_assign_lhs (g));
gimple_set_location (g, location);
gsi_insert_after (&gsi, g, GSI_NEW_STMT);
r = gimple_assign_lhs (g);

We propose a simplified form using new build helper classes ssa_seq
and ssa_stmt that would allow the above code to be written as
follows.

ssa_seq q;
ssa_stmt t = q.stmt (NE_EXPR, shadow, 0);
ssa_stmt a = q.stmt (BIT_AND_EXPR, base_addr, 7);
ssa_stmt b = q.stmt (shadow_type, a);
ssa_stmt c = q.stmt (PLUS_EXPR, b, offset);
ssa_stmt d = q.stmt (GE_EXPR, c, shadow);
ssa_stmt e = q.stmt (BIT_AND_EXPR, t, d);
q.set_location (location);
r = e.lhs ();

There are a few important things to note about this example.

.. We have a new class (ssa_seq) that knows how to sequence
statements automatically and can build expressions out of types.

.. Every statement created produces an SSA name.  Referencing an
ssa_stmt instance in a an argument to ssa_seq::stmt retrieves the
SSA name generated by that statement.

.. The statement result type is that of the arguments.

.. The type of integral constant arguments is that of the other
argument.  (Which implies that only one argument can be constant.)

.. The 'stmt' method handles linking the statement into the sequence.

.. The 'set_location' method iterates over all statements.

There will be another class of builders for generating GIMPLE
in normal form (gimple_stmt).  We expect that this will mostly
affect all transformations that need to generate new expressions
and statements, like instrumentation passes.

We also expect to reduce calls to tree expression builders by
allowing the use of numeric and string constants to be converted
to the appropriate tree _CST node.  This will only work when the
type of the constant can be deduced from the other argument in some
expressions, of course.


Generating a Type

Consider the generation of the following type.

struct __asan_global {
  const_ptr_type_node __beg;
  inttype __size;
  inttype __size_with_redzone;
  const_ptr_type_node __name;
  inttype __has_dynamic_init;
};

The current code to generate it is as follows.

tree inttype = build_nonstandard_integer_type (POINTER_SIZE, 1);
tree ret = make_node (RECORD_TYPE);
TYPE_NAME (ret) = get_identifier ("__asan_global");
tree beg = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
                       get_identifier ("__beg"), const_ptr_type_node);
DECL_CONTEXT (beg) = ret;
TYPE_FIELDS (ret) = beg;
tree size = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
                        get_identifier ("__size"), inttype);
DECL_CONTEXT (size) = ret;
DECL_CHAIN (beg) = size;
tree red = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
                       get_identifier ("__size_with_redzone"), inttype);
DECL_CONTEXT (red) = ret;
DECL_CHAIN (size) = red;
tree name = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
                        get_identifier ("__name"), const_ptr_type_node);
DECL_CONTEXT (name) = ret;
DECL_CHAIN (red) = name;
tree init = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
                        get_identifier ("__has_dynamic_init"), inttype);
DECL_CONTEXT (init) = ret;
DECL_CHAIN (name) = init;
layout_type (ret);

We propose a form as follows.

tree inttype = build_nonstandard_integer_type (POINTER_SIZE, 1);
record_builder rec ("__asan_global");
rec.field ("__beg", const_ptr_type_node);
rec.field ("__size", inttype);
rec.field ("__size_with_redzone", inttype);
rec.field ("__name", const_ptr_type_node);
rec.field ("__has_dynamic_init", inttype);
rec.finish ();
tree ret = rec.as_tree ();

There are a few important things to note about this example.

.. The 'field' method will add context and chain links.

.. The 'field' method is overloaded on both strings and identifiers.

.. The 'finish' method lays out the struct.


Proposal

Create a set of IL builder classes that provide a simplified IL
building interface.  Essentially, these builder classes will abstract
most of the bookkeeping code required by the current interfaces.

These classes will not replace the existing interfaces.  We do not
expect that all the IL generation done in current transformations
will be able to use the simplified interfaces.  The goal is to
simplify most of them, however.

-- 
Lawrence Crowl

Reply via email to