On Mon, Oct 15, 2012 at 5:47 PM, Gary Funck <g...@intrepid.com> wrote:
> We have maintained the gupc (GNU Unified Parallel C) branch for
> a couple of years now, and would like to merge these changes into
> the GCC trunk.
>
> It is our goal to integrate the GUPC changes into the GCC 4.8
> trunk, in order to provide a UPC (Unified Parallel C) capability
> in the subsequent GCC 4.8 release.
>
> The purpose of this note is to introduce the GUPC project,
> provide an overview of the UPC-related changes and to introduce
> the subsequent sets of patches which merge the GUPC branch into
> GCC 4.8.
>
> For reference,
>
> The GUPC project page is here:
> http://gcc.gnu.org/projects/gupc.html
>
> The current GUPC release is distributed here:
> http://gccupc.org
>
> Roughly a year ago, we described the front-end related
> changes as they stood at that time:
> http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00081.html
>
> We merge the GCC trunk into the gupc branch on approximately
> a weekly basis.  The current GUPC branch is based upon a recent
> version of the GCC trunk (192449 dated 2012-10-15), and has
> been bootstrapped on x86_64/i686 Linux, PPC/POWER7/Linux and
> IA64/Altix Linux. In earlier versions, GUPC was successfully
> ported to SGI/MIPS (big endian) and SiCortex/MIPS (little endian).
>
> The UPC-related source code differences
> can be viewed here in various formats:
>   http://gccupc.org/gupc-changes
>
> In the discussion below, the changes are
> excerpted in order to highlight their
> important aspects.  The version used in
> this presentation is 190707.
>
> UPC's Shared Qualifier and Layout Qualifier
> -------------------------------------------
>
> The UPC language specification describes
> the language syntax and semantics:
>   http://upc.gwu.edu/docs/upc_specs_1.2.pdf
>
> UPC introduces a new qualifier, "shared",
> which indicates that the qualified object
> is located in a global shared address space
> that is accessible by all UPC threads.
> Additional qualifiers ("strict" and "relaxed")
> further specify the semantics of accesses to
> UPC shared objects.
>
> In UPC, a shared qualified array can further
> specify a "layout qualifier" that indicates
> how the shared data is blocked and distributed.
>
> There are two predefined identifiers:
> THREADS, the number of threads that
> will be created when the program starts,
> and MYTHREAD, the current (zero-based)
> thread number.  Typically, a UPC thread is implemented
> as an operating system process.  Access to UPC
> shared memory may be implemented locally via
> OS-provided facilities (for example, mmap),
> or across nodes via a high-speed network
> interconnect (for example, InfiniBand).
>
> GUPC provides a runtime (libgupc) that targets
> an SMP-based system and uses mmap() to implement
> global shared memory.
>
> Optionally, GUPC can use the more general and
> more capable Berkeley UPCR runtime:
>   http://upc.lbl.gov/download/source.shtml#runtime
> The UPCR runtime supports a number of network
> topologies, and has been ported to most of the
> current High Performance Computing (HPC) systems.
>
> The following example illustrates
> the use of the UPC "shared" qualifier
> combined with a layout qualifier.
>
>     #define BLKSIZE 5
>     #define N_PER_THREAD (4 * BLKSIZE)
>     shared [BLKSIZE] double A[N_PER_THREAD*THREADS];
>
> Above, the "[BLKSIZE]" construct is the UPC
> layout qualifier; it specifies that the elements
> of the shared array, A, are distributed across
> the threads round-robin in blocks of 5 elements.  If the
> program is run with two threads, then A is
> distributed as shown below:
>
>     Thread 0    Thread 1
>     --------    ---------
>     A[ 0.. 4]   A[ 5.. 9]
>     A[10..14]   A[15..19]
>     A[20..24]   A[25..29]
>     A[30..34]   A[35..39]
>
> Above, the elements shown for thread 0
> are defined as having "affinity" to thread 0.
> Similarly, those elements shown for thread 1
> have affinity to thread 1.  In UPC, a pointer
> to a shared object can be cast to a thread-local
> pointer (a "C" pointer) when the
> designated shared object has affinity
> to the referencing thread.
>
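The block-cyclic mapping described above can be sketched in plain C.
This is an illustration only, not code from GUPC or its runtime; the
names elem_location and locate_element are hypothetical, and the
sketch assumes a one-dimensional array distributed in whole blocks.

```c
#include <assert.h>
#include <stddef.h>

/* Where element `index` of a `shared [block_size]` array lives when
   blocks are dealt out round-robin over `threads` UPC threads.  */
typedef struct {
  size_t thread;  /* thread the element has affinity to */
  size_t block;   /* which of that thread's local blocks holds it */
  size_t phase;   /* position of the element inside its block */
} elem_location;

static elem_location
locate_element (size_t index, size_t block_size, size_t threads)
{
  elem_location loc;
  assert (block_size > 0 && threads > 0);
  loc.thread = (index / block_size) % threads;
  loc.block = index / (block_size * threads);
  loc.phase = index % block_size;
  return loc;
}
```

For the two-thread example, locate_element (17, 5, 2) yields thread 1,
local block 1, phase 2, which matches the A[15..19] entry in the
second row of the table.
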
> A UPC "pointer-to-shared" (PTS) is a pointer
> that references a UPC shared object.
> A UPC pointer-to-shared is a "fat" pointer
> with the following logical fields:
>    (virt_addr, thread, offset)
>
> The virtual address (virt_addr) field is combined with
> the thread number (thread) and offset within the
> block (offset), to derive the location of the
> referenced object within the UPC shared address space.
>
> GUPC implements pointer-to-shared objects using
> either a "packed" representation or a "struct"
> representation.  The user can select the
> pointer-to-shared representation with a "configure"
> parameter.  The packed representation is the default.
>
> The "packed" pointer-to-shared representation
> limits the range of the various fields within
> the pointer-to-shared in order to gain efficiency.
> Packed pointer-to-shared values encode the three-part
> shared address (described above) as a 64-bit
> value (on both 64-bit and 32-bit platforms).
>
> The "struct" representation provides a wider
> addressing range at the expense of requiring
> twice the number of bits (128) needed to encode
> the pointer-to-shared value.
>
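The packed representation can be illustrated with a small C sketch
that stores the three logical fields in one 64-bit word.  The field
widths (34/10/20 bits), their ordering, and the names pts_fields,
pts_pack and pts_unpack are assumptions chosen for illustration; the
text above does not specify GUPC's actual packed layout.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field widths (illustrative only); they sum to 64 bits.  */
#define PTS_VADDR_BITS  34u
#define PTS_THREAD_BITS 10u
#define PTS_OFFSET_BITS 20u

/* The three logical fields of a pointer-to-shared.  */
typedef struct {
  uint64_t virt_addr;
  uint32_t thread;
  uint32_t offset;
} pts_fields;

/* Mask with the low `bits` bits set (bits must be < 64).  */
static uint64_t
pts_mask (unsigned bits)
{
  return (UINT64_C (1) << bits) - 1;
}

/* Pack the fields as [virt_addr | thread | offset], high to low.  */
static uint64_t
pts_pack (pts_fields f)
{
  assert (f.virt_addr <= pts_mask (PTS_VADDR_BITS));
  assert (f.thread <= pts_mask (PTS_THREAD_BITS));
  assert (f.offset <= pts_mask (PTS_OFFSET_BITS));
  return (f.virt_addr << (PTS_THREAD_BITS + PTS_OFFSET_BITS))
         | ((uint64_t) f.thread << PTS_OFFSET_BITS)
         | (uint64_t) f.offset;
}

/* Recover the three fields from a packed value.  */
static pts_fields
pts_unpack (uint64_t p)
{
  pts_fields f;
  f.offset = (uint32_t) (p & pts_mask (PTS_OFFSET_BITS));
  f.thread = (uint32_t) ((p >> PTS_OFFSET_BITS) & pts_mask (PTS_THREAD_BITS));
  f.virt_addr = p >> (PTS_THREAD_BITS + PTS_OFFSET_BITS);
  return f;
}
```

The range limits are the trade-off mentioned above: with these assumed
widths no more than 1024 threads can be encoded, whereas a struct
representation has room for wider fields.
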
> UPC-Related Front-End Changes
> -----------------------------
>
> GCC's internal tree representation is
> extended to record the UPC "shared",
> "strict", "relaxed" qualifiers,
> and the layout qualifier.

What immediately comes to my mind is that, apart from parsing,
the core machinery should be shareable with Cilk+, no?

Richard.

> Index: gcc/tree.h
> ===================================================================
> --- gcc/tree.h  (.../trunk)     (revision 190707)
> +++ gcc/tree.h  (.../branches/gupc)     (revision 190736)
> @@ -458,7 +458,10 @@ struct GTY(()) tree_base {
>        unsigned packed_flag : 1;
>        unsigned user_align : 1;
>        unsigned nameless_flag : 1;
> -      unsigned spare0 : 4;
> +      unsigned upc_shared_flag : 1;
> +      unsigned upc_strict_flag : 1;
> +      unsigned upc_relaxed_flag : 1;
> +      unsigned spare0 : 1;
>
>        unsigned spare1 : 8;
>
>
> UPC defines a few additional tree node types:
>
> +++ gcc/upc/upc-tree.def        (.../branches/gupc)     (revision 190736)
> +/* UPC statements */
> +
> +/* Used to represent a `upc_forall' statement. The operands are
> +   UPC_FORALL_INIT_STMT, UPC_FORALL_COND, UPC_FORALL_EXPR,
> +   UPC_FORALL_BODY, and UPC_FORALL_AFFINITY respectively. */
> +
> +DEFTREECODE (UPC_FORALL_STMT, "upc_forall_stmt", tcc_statement, 5)
> +
> +/* Used to represent a UPC synchronization statement. The first
> +   operand is the synchronization operation, UPC_SYNC_OP:
> +   UPC_SYNC_NOTIFY_OP  1       Notify operation
> +   UPC_SYNC_WAIT_OP    2       Wait operation
> +   UPC_SYNC_BARRIER_OP 3       Barrier operation
> +
> +   The second operand, UPC_SYNC_ID is the (optional) expression
> +   whose value specifies the barrier identifier which is checked
> +   by the various synchronization operations. */
> +
> +DEFTREECODE (UPC_SYNC_STMT, "upc_sync_stmt", tcc_statement, 2)
>
> The "C" parser is extended to recognize UPC's syntactic
> extensions.
>
> --- gcc/c-family/c-common.c     (.../trunk)     (revision 190707)
> +++ gcc/c-family/c-common.c     (.../branches/gupc)     (revision 190736)
> @@ -30,6 +30,7 @@ along with GCC; see the file COPYING3.
>  #include "ggc.h"
>  #include "c-common.h"
>  #include "c-objc.h"
> +#include "c-upc.h"
>  #include "tm_p.h"
>  #include "obstack.h"
>  #include "cpplib.h"
> @@ -193,6 +194,24 @@ const char *pch_file;
>     user's namespace.  */
>  int flag_iso;
>
> +/* Nonzero whenever UPC -fupc-threads-N is asserted.
> +   The value N gives the number of UPC threads to be
> +   defined at compile-time. */
> +int flag_upc_threads;
> +
> +/* Nonzero whenever UPC -fupc-pthreads-model-* is asserted. */
> +int flag_upc_pthreads;
> +
> +/* The -fupc-pthreads-per-process-N switch tells the UPC compiler
> +   and runtime to map N UPC threads per process onto
> +   N POSIX threads running inside the process. */
> +int flag_upc_pthreads_per_process;
> +
> +/* The implementation model for UPC threads that
> +   are mapped to POSIX threads, specified at compilation
> +   time by the -fupc-pthreads-model-* switch. */
> +upc_pthreads_model_kind upc_pthreads_model;
> +
>  /* Warn about #pragma directives that are not recognized.  */
>
>  int warn_unknown_pragmas; /* Tri state variable.  */
> @@ -389,8 +408,9 @@ static int resort_field_decl_cmp (const
>     C --std=c89: D_C99 | D_CXXONLY | D_OBJC | D_CXX_OBJC
>     C --std=c99: D_CXXONLY | D_OBJC
>     ObjC is like C except that D_OBJC and D_CXX_OBJC are not set
> -   C++ --std=c98: D_CONLY | D_CXXOX | D_OBJC
> -   C++ --std=c0x: D_CONLY | D_OBJC
> +   UPC is like C except that D_UPC is not set
> +   C++ --std=c98: D_CONLY | D_CXXOX | D_OBJC | D_UPC
> +   C++ --std=c0x: D_CONLY | D_OBJC | D_UPC
>     ObjC++ is like C++ except that D_OBJC is not set
>
>     If -fno-asm is used, D_ASM is added to the mask.  If
> @@ -583,6 +603,19 @@ const struct c_common_resword c_common_r
>    { "inout",           RID_INOUT,              D_OBJC },
>    { "oneway",          RID_ONEWAY,             D_OBJC },
>    { "out",             RID_OUT,                D_OBJC },
> +
> +  /* UPC keywords */
> +  { "shared",          RID_SHARED,             D_UPC },
> +  { "relaxed",         RID_RELAXED,            D_UPC },
> +  { "strict",          RID_STRICT,             D_UPC },
> +  { "upc_barrier",     RID_UPC_BARRIER,        D_UPC },
> +  { "upc_blocksizeof", RID_UPC_BLOCKSIZEOF,    D_UPC },
> +  { "upc_elemsizeof",  RID_UPC_ELEMSIZEOF,     D_UPC },
> +  { "upc_forall",      RID_UPC_FORALL,         D_UPC },
> +  { "upc_localsizeof", RID_UPC_LOCALSIZEOF,    D_UPC },
> +  { "upc_notify",      RID_UPC_NOTIFY,         D_UPC },
> +  { "upc_wait",                RID_UPC_WAIT,           D_UPC },
> +
>
> --- gcc/c/c-parser.c    (.../trunk)     (revision 190707)
> +++ gcc/c/c-parser.c    (.../branches/gupc)     (revision 190736)
> [...]
> @@ -498,6 +504,11 @@ c_token_starts_typename (c_token *token)
>         case RID_ACCUM:
>         case RID_SAT:
>           return true;
> +        /* UPC qualifiers */
> +       case RID_SHARED:
> +       case RID_STRICT:
> +       case RID_RELAXED:
> +         return true;
> [...]
> @@ -1224,6 +1245,14 @@ static void c_parser_objc_at_dynamic_dec
>  static bool c_parser_objc_diagnose_bad_element_prefix
>    (c_parser *, struct c_declspecs *);
>
> +/* These UPC parser functions are only ever called when
> +   compiling UPC.  */
> +static void c_parser_upc_forall_statement (c_parser *);
> +static void c_parser_upc_sync_statement (c_parser *, int);
> +static void c_parser_upc_shared_qual (source_location,
> +                                      c_parser *,
> +                                     struct c_declspecs *);
> +
> [...]
> +        /* UPC qualifiers */
> +       case RID_SHARED:
> +         attrs_ok = true;
> +          c_parser_upc_shared_qual (loc, parser, specs);
> +         break;
> +       case RID_STRICT:
> +       case RID_RELAXED:
> +         attrs_ok = true;
> +         declspecs_add_qual (loc, specs, c_parser_peek_token (parser)->value);
> +         c_parser_consume_token (parser);
> +         break;
>         case RID_ATTRIBUTE:
>           if (!attrs_ok)
>             goto out;
> [...]
> @@ -4558,6 +4612,22 @@ c_parser_statement_after_labels (c_parse
>           gcc_assert (c_dialect_objc ());
>           c_parser_objc_synchronized_statement (parser);
>           break;
> +       case RID_UPC_FORALL:
> +          gcc_assert (c_dialect_upc ());
> +         c_parser_upc_forall_statement (parser);
> +         break;
> +        case RID_UPC_NOTIFY:
> +          gcc_assert (c_dialect_upc ());
> +         c_parser_upc_sync_statement (parser, UPC_SYNC_NOTIFY_OP);
> +         goto expect_semicolon;
> +        case RID_UPC_WAIT:
> +          gcc_assert (c_dialect_upc ());
> +         c_parser_upc_sync_statement (parser, UPC_SYNC_WAIT_OP);
> +         goto expect_semicolon;
> +        case RID_UPC_BARRIER:
> +          gcc_assert (c_dialect_upc ());
> +         c_parser_upc_sync_statement (parser, UPC_SYNC_BARRIER_OP);
> +         goto expect_semicolon;
>         default:
>           goto expr_stmt;
>         }
>
> --- gcc/c-family/c-pragma.c     (.../trunk)     (revision 190707)
> +++ gcc/c-family/c-pragma.c     (.../branches/gupc)     (revision 190736)
> @@ -30,6 +30,7 @@ along with GCC; see the file COPYING3.
>  #include "c-pragma.h"
>  #include "flags.h"
>  #include "c-common.h"
> +#include "c-upc.h"
>  #include "tm_p.h"              /* For REGISTER_TARGET_PRAGMAS (why is
>                                    this not a target hook?).  */
>  #include "vec.h"
> @@ -507,6 +508,242 @@ add_to_renaming_pragma_list (tree oldnam
>  /* The current prefix set by #pragma extern_prefix.  */
>  GTY(()) tree pragma_extern_prefix;
>
> +/* variables used to implement #pragma upc semantics */
> +#ifndef UPC_CMODE_STACK_INCREMENT
> +#define UPC_CMODE_STACK_INCREMENT 32
> +#endif
> +static int pragma_upc_permitted;
> +static int upc_cmode;
> +static int *upc_cmode_stack;
> +static int upc_cmode_stack_in_use;
> +static int upc_cmode_stack_allocated;
> +
> +static void init_pragma_upc (void);
> +static void handle_pragma_upc (cpp_reader * ARG_UNUSED (dummy));
>
> c-decl.c handles the additional UPC qualifiers
> and declspecs.  The layout qualifier is handled here:
>
> --- gcc/c/c-decl.c      (.../trunk)     (revision 190707)
> +++ gcc/c/c-decl.c      (.../branches/gupc)     (revision 190736)
> [...]
> @@ -8857,6 +9046,23 @@ declspecs_add_qual (source_location loc,
>    bool dupe = false;
>    specs->non_sc_seen_p = true;
>    specs->declspecs_seen_p = true;
> +
> +  /* A UPC layout qualifier is encoded as an ARRAY_REF,
> +     further, it implies the presence of the 'shared' keyword. */
> +  if (TREE_CODE (qual) == ARRAY_REF)
> +    {
> +      if (specs->upc_layout_qualifier)
> +        {
> +          error ("two or more layout qualifiers specified");
> +         return specs;
> +        }
> +      else
> +        {
> +          specs->upc_layout_qualifier = qual;
> +          qual = ridpointers[RID_SHARED];
> +        }
> +    }
> +
>
> In UPC, a qualifier includes both the traditional
> "C" qualifier flags and the UPC "layout qualifier".
> Thus, the pointer_quals field of a declarator node
> is replaced by a struct that holds both the qualifier
> flags and the UPC layout qualifier, as shown below.
>
> @@ -5702,7 +5835,9 @@ grokdeclarator (const struct c_declarato
>
>             /* Process type qualifiers (such as const or volatile)
>                that were given inside the `*'.  */
> -           type_quals = declarator->u.pointer_quals;
> +           type_quals = declarator->u.pointer.quals;
> +           upc_layout_qualifier = declarator->u.pointer.upc_layout_qual;
> +           sharedp = ((type_quals & TYPE_QUAL_SHARED) != 0);
>
> UPC shared variables are allocated at runtime in the global
> memory that is allocated and managed by the UPC runtime.
> A separate link section is used as a method of assigning
> virtual addresses to UPC shared variables.  The UPC
> shared variable section is designated as a "no load"
> section on systems that support that facility; in that
> case, the linkage section begins at virtual address zero.
> The logic below assigns UPC shared variables to
> their own linkage section.
>
> @@ -6235,6 +6409,13 @@ grokdeclarator (const struct c_declarato
> [...]
> +    /* Shared variables are given their own link section on
> +       most target platforms, and if compiling in pthreads mode
> +       regular local file scope variables are made thread local. */
> +    if ((TREE_CODE(decl) == VAR_DECL)
> +        && !threadp && (TREE_SHARED (decl) || flag_upc_pthreads))
> +      upc_set_decl_section (decl);
> +
>
> Various UPC language related checks and operations
> are called in the "C" front-end and middle-end.
> To ensure that these operations are defined
> when linked with the other language front-ends
> and compilers, these functions are stubbed
> in a fashion similar to Objective C:
>
> --- gcc/c-family/c-upc.h        (.../trunk)     (revision 0)
> +++ gcc/c-family/c-upc.h        (.../branches/gupc)     (revision 190736)
> [...]
> +
> +/* UPC entry points.  */
> +
> +/* The following UPC functions are called by the C front-end;
> + * they all must have corresponding stubs in stub-upc.c.  */
> +
> +extern int count_upc_threads_refs (tree);
> +extern void deny_pragma_upc (void);
> +extern int get_upc_consistency_mode (void);
> [...]
> +extern tree upc_rts_forall_depth_var (void);
> +extern void upc_set_decl_section (tree);
> +extern void upc_write_global_declarations (void);
>
> A few command line option flags must also be
> stubbed out in order to link the other
> language front-ends.
>
> --- gcc/c-family/stub-upc.c     (.../trunk)     (revision 0)
> +++ gcc/c-family/stub-upc.c     (.../branches/gupc)     (revision 190736)
> [...]
> +int compiling_upc;
> +int flag_upc;
> +int use_upc_dwarf2_extensions;
>
> The complete set of GUPC-related patches will be provided for
> review in a collection of 16 patch sets.  A listing of those
> patch sets is attached.
>
> Each patch set will be sent in a separate email following
> this one for the purposes of review.
>
>                       -- end --
