On Mon, Oct 15, 2012 at 5:47 PM, Gary Funck <g...@intrepid.com> wrote:
> We have maintained the gupc (GNU Unified Parallel C) branch for
> a couple of years now, and would like to merge these changes into
> the GCC trunk.
>
> It is our goal to integrate the GUPC changes into the GCC 4.8
> trunk, in order to provide a UPC (Unified Parallel C) capability
> in the subsequent GCC 4.8 release.
>
> The purpose of this note is to introduce the GUPC project,
> provide an overview of the UPC-related changes and to introduce
> the subsequent sets of patches which merge the GUPC branch into
> GCC 4.8.
>
> For reference,
>
> The GUPC project page is here:
> http://gcc.gnu.org/projects/gupc.html
>
> The current GUPC release is distributed here:
> http://gccupc.org
>
> Roughly a year ago, we described the front-end related
> changes at the time:
> http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00081.html
>
> We merge the GCC trunk into the gupc branch on approximately
> a weekly basis. The current GUPC branch is based upon a recent
> version of the GCC trunk (192449 dated 2012-10-15), and has
> been bootstrapped on x86_64/i686 Linux, PPC/POWER7/Linux and
> IA64/Altix Linux. In earlier versions, GUPC was successfully
> ported to SGI/MIPS (big endian) and SiCortex/MIPS (little endian).
>
> The UPC-related source code differences
> can be viewed here in various formats:
> http://gccupc.org/gupc-changes
>
> In the discussion below, the changes are
> excerpted in order to highlight important
> aspects of the UPC-related changes. The version used in
> this presentation is 190707.
>
> UPC's Shared Qualifier and Layout Qualifier
> -------------------------------------------
>
> The UPC language specification describes
> the language syntax and semantics:
> http://upc.gwu.edu/docs/upc_specs_1.2.pdf
>
> UPC introduces a new qualifier, "shared"
> that indicates that the qualified object
> is located in a global shared address space
> that is accessible by all UPC threads.
> Additional qualifiers ("strict" and "relaxed")
> further specify the semantics of accesses to
> UPC shared objects.
>
> In UPC, a shared-qualified array can further
> specify a "layout qualifier" that indicates
> how the shared data is blocked and distributed.
>
> There are two language pre-defined identifiers
> that indicate the number of threads that
> will be created when the program starts (THREADS)
> and the current (zero-based) thread number
> (MYTHREAD). Typically, a UPC thread is implemented
> as an operating system process. Access to UPC
> shared memory may be implemented locally via
> OS-provided facilities (for example, mmap),
> or across nodes via a high-speed network
> interconnect (for example, InfiniBand).
>
> GUPC provides a runtime (libgupc) that targets
> an SMP-based system and uses mmap() to implement
> global shared memory.
>
> Optionally, GUPC can use the more general and
> more capable Berkeley UPCR runtime:
> http://upc.lbl.gov/download/source.shtml#runtime
> The UPCR runtime supports a number of network
> topologies, and has been ported to most of the
> current High Performance Computing (HPC) systems.
>
> The following example illustrates
> the use of the UPC "shared" qualifier
> combined with a layout qualifier.
>
>     #define BLKSIZE 5
>     #define N_PER_THREAD (4 * BLKSIZE)
>     shared [BLKSIZE] double A[N_PER_THREAD*THREADS];
>
> Above, the "[BLKSIZE]" construct is the UPC
> layout qualifier; this specifies that the shared
> array, A, distributes its elements across
> the threads in blocks of 5 elements. If the
> program is run with two threads, then A is
> distributed as shown below:
>
>     Thread 0    Thread 1
>     --------    ---------
>     A[ 0.. 4]   A[ 5.. 9]
>     A[10..14]   A[15..19]
>     A[20..24]   A[25..29]
>     A[30..34]   A[35..39]
>
> Above, the elements shown for thread 0
> are defined as having "affinity" to thread 0.
> Similarly, those elements shown for thread 1
> have affinity to thread 1.
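
As an illustrative aside, the block-cyclic distribution in the table above can be reproduced with a little integer arithmetic. The following is a minimal sketch in plain C, not GUPC's implementation; the function names affinity_thread, phase_in_block and local_index are hypothetical and chosen only for this example.

```c
#include <assert.h>

/* Thread that element i of a "shared [blksize]" array has affinity to,
   when the program runs with the given number of threads.
   (Hypothetical helper, for illustration only.)  */
static int affinity_thread (int i, int blksize, int threads)
{
  assert (i >= 0 && blksize > 0 && threads > 0);
  /* Blocks are dealt out to the threads in round-robin order.  */
  return (i / blksize) % threads;
}

/* Position of element i within its block.  */
static int phase_in_block (int i, int blksize)
{
  assert (i >= 0 && blksize > 0);
  return i % blksize;
}

/* Index of element i within the portion of the array that is
   local to its owning thread.  */
static int local_index (int i, int blksize, int threads)
{
  assert (i >= 0 && blksize > 0 && threads > 0);
  /* Each full round of (blksize * threads) elements contributes
     one block of blksize elements to every thread.  */
  return (i / (blksize * threads)) * blksize + i % blksize;
}
```

For the two-thread case shown above, affinity_thread (12, 5, 2) yields 0 and affinity_thread (7, 5, 2) yields 1, which matches the rows A[10..14] and A[ 5.. 9] of the table.
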
> In UPC, a pointer
> to a shared object can be cast to a thread-local
> pointer (a "C" pointer), when the
> designated shared object has affinity
> to the referencing thread.
>
> A UPC "pointer-to-shared" (PTS) is a pointer
> that references a UPC shared object.
> A UPC pointer-to-shared is a "fat" pointer
> with the following logical fields:
>     (virt_addr, thread, offset)
>
> The virtual address (virt_addr) field is combined with
> the thread number (thread) and offset within the
> block (offset), to derive the location of the
> referenced object within the UPC shared address space.
>
> GUPC implements pointer-to-shared objects using
> either a "packed" representation or a "struct"
> representation. The user can select the
> pointer-to-shared representation with a "configure"
> parameter. The packed representation is the default.
>
> The "packed" pointer-to-shared representation
> limits the range of the various fields within
> the pointer-to-shared in order to gain efficiency.
> Packed pointer-to-shared values encode the three-part
> shared address (described above) as a 64-bit
> value (on both 64-bit and 32-bit platforms).
>
> The "struct" representation provides a wider
> addressing range at the expense of requiring
> twice the number of bits (128) needed to encode
> the pointer-to-shared value.
>
> UPC-Related Front-End Changes
> -----------------------------
>
> GCC's internal tree representation is
> extended to record the UPC "shared",
> "strict", "relaxed" qualifiers,
> and the layout qualifier.
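
To make the "packed" pointer-to-shared representation described in the quoted text more concrete, here is a minimal sketch in plain C that encodes the three logical fields (virt_addr, thread, offset) into one 64-bit value. The field widths, the field order, and the names packed_pts, pts_pack, pts_vaddr, pts_thread and pts_offset are assumptions made for illustration; the message above does not state how GUPC actually lays out the bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field widths, chosen only so that they sum to 64.  */
#define PTS_OFFSET_BITS 20
#define PTS_THREAD_BITS 10
#define PTS_VADDR_BITS  34

typedef uint64_t packed_pts;

/* Pack the three logical fields into a single 64-bit value.
   Each field must fit in its (assumed) width.  */
static packed_pts pts_pack (uint64_t vaddr, uint64_t thread, uint64_t offset)
{
  assert (vaddr < (UINT64_C (1) << PTS_VADDR_BITS));
  assert (thread < (UINT64_C (1) << PTS_THREAD_BITS));
  assert (offset < (UINT64_C (1) << PTS_OFFSET_BITS));
  return (vaddr << (PTS_THREAD_BITS + PTS_OFFSET_BITS))
         | (thread << PTS_OFFSET_BITS)
         | offset;
}

/* Extract the virtual address field.  */
static uint64_t pts_vaddr (packed_pts p)
{
  return p >> (PTS_THREAD_BITS + PTS_OFFSET_BITS);
}

/* Extract the thread number field.  */
static uint64_t pts_thread (packed_pts p)
{
  return (p >> PTS_OFFSET_BITS) & ((UINT64_C (1) << PTS_THREAD_BITS) - 1);
}

/* Extract the offset-within-block field.  */
static uint64_t pts_offset (packed_pts p)
{
  return p & ((UINT64_C (1) << PTS_OFFSET_BITS) - 1);
}
```

The sketch shows the trade-off the text mentions: narrower fields let the whole value fit in a single 64-bit word, while a "struct" representation would instead keep each field in its own, wider member.
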

What immediately comes to my mind is that, apart from parsing, the core
machinery should be shareable with Cilk+, no?

Richard.

> Index: gcc/tree.h
> ===================================================================
> --- gcc/tree.h (.../trunk) (revision 190707)
> +++ gcc/tree.h (.../branches/gupc) (revision 190736)
> @@ -458,7 +458,10 @@ struct GTY(()) tree_base {
>    unsigned packed_flag : 1;
>    unsigned user_align : 1;
>    unsigned nameless_flag : 1;
> -  unsigned spare0 : 4;
> +  unsigned upc_shared_flag : 1;
> +  unsigned upc_strict_flag : 1;
> +  unsigned upc_relaxed_flag : 1;
> +  unsigned spare0 : 1;
>
>    unsigned spare1 : 8;
>
>
> UPC defines a few additional tree node types:
>
> +++ gcc/upc/upc-tree.def (.../branches/gupc) (revision 190736)
> +/* UPC statements */
> +
> +/* Used to represent a `upc_forall' statement. The operands are
> +   UPC_FORALL_INIT_STMT, UPC_FORALL_COND, UPC_FORALL_EXPR,
> +   UPC_FORALL_BODY, and UPC_FORALL_AFFINITY respectively. */
> +
> +DEFTREECODE (UPC_FORALL_STMT, "upc_forall_stmt", tcc_statement, 5)
> +
> +/* Used to represent a UPC synchronization statement. The first
> +   operand is the synchronization operation, UPC_SYNC_OP:
> +     UPC_SYNC_NOTIFY_OP   1  Notify operation
> +     UPC_SYNC_WAIT_OP     2  Wait operation
> +     UPC_SYNC_BARRIER_OP  3  Barrier operation
> +
> +   The second operand, UPC_SYNC_ID is the (optional) expression
> +   whose value specifies the barrier identifier which is checked
> +   by the various synchronization operations. */
> +
> +DEFTREECODE (UPC_SYNC_STMT, "upc_sync_stmt", tcc_statement, 2)
>
> The "C" parser is extended to recognize UPC's syntactic
> extensions.
>
> --- gcc/c-family/c-common.c (.../trunk) (revision 190707)
> +++ gcc/c-family/c-common.c (.../branches/gupc) (revision 190736)
> @@ -30,6 +30,7 @@ along with GCC; see the file COPYING3.
>  #include "ggc.h"
>  #include "c-common.h"
>  #include "c-objc.h"
> +#include "c-upc.h"
>  #include "tm_p.h"
>  #include "obstack.h"
>  #include "cpplib.h"
> @@ -193,6 +194,24 @@ const char *pch_file;
>     user's namespace. */
>  int flag_iso;
>
> +/* Nonzero whenever UPC -fupc-threads-N is asserted.
> +   The value N gives the number of UPC threads to be
> +   defined at compile-time. */
> +int flag_upc_threads;
> +
> +/* Nonzero whenever UPC -fupc-pthreads-model-* is asserted. */
> +int flag_upc_pthreads;
> +
> +/* The -fupc-pthreads-per-process-N switch tells the UPC compiler
> +   and runtime to map N UPC threads per process onto
> +   N POSIX threads running inside the process. */
> +int flag_upc_pthreads_per_process;
> +
> +/* The implementation model for UPC threads that
> +   are mapped to POSIX threads, specified at compilation
> +   time by the -fupc-pthreads-model-* switch. */
> +upc_pthreads_model_kind upc_pthreads_model;
> +
>  /* Warn about #pragma directives that are not recognized. */
>
>  int warn_unknown_pragmas; /* Tri state variable. */
> @@ -389,8 +408,9 @@ static int resort_field_decl_cmp (const
>     C --std=c89: D_C99 | D_CXXONLY | D_OBJC | D_CXX_OBJC
>     C --std=c99: D_CXXONLY | D_OBJC
>     ObjC is like C except that D_OBJC and D_CXX_OBJC are not set
> -   C++ --std=c98: D_CONLY | D_CXXOX | D_OBJC
> -   C++ --std=c0x: D_CONLY | D_OBJC
> +   UPC is like C except that D_UPC is not set
> +   C++ --std=c98: D_CONLY | D_CXXOX | D_OBJC | D_UPC
> +   C++ --std=c0x: D_CONLY | D_OBJC | D_UPC
>     ObjC++ is like C++ except that D_OBJC is not set
>
>     If -fno-asm is used, D_ASM is added to the mask. If
If > @@ -583,6 +603,19 @@ const struct c_common_resword c_common_r > { "inout", RID_INOUT, D_OBJC }, > { "oneway", RID_ONEWAY, D_OBJC }, > { "out", RID_OUT, D_OBJC }, > + > + /* UPC keywords */ > + { "shared", RID_SHARED, D_UPC }, > + { "relaxed", RID_RELAXED, D_UPC }, > + { "strict", RID_STRICT, D_UPC }, > + { "upc_barrier", RID_UPC_BARRIER, D_UPC }, > + { "upc_blocksizeof", RID_UPC_BLOCKSIZEOF, D_UPC }, > + { "upc_elemsizeof", RID_UPC_ELEMSIZEOF, D_UPC }, > + { "upc_forall", RID_UPC_FORALL, D_UPC }, > + { "upc_localsizeof", RID_UPC_LOCALSIZEOF, D_UPC }, > + { "upc_notify", RID_UPC_NOTIFY, D_UPC }, > + { "upc_wait", RID_UPC_WAIT, D_UPC }, > + > > --- gcc/c/c-parser.c (.../trunk) (revision 190707) > +++ gcc/c/c-parser.c (.../branches/gupc) (revision 190736) > [...] > @@ -498,6 +504,11 @@ c_token_starts_typename (c_token *token) > case RID_ACCUM: > case RID_SAT: > return true; > + /* UPC qualifiers */ > + case RID_SHARED: > + case RID_STRICT: > + case RID_RELAXED: > + return true; > [...] > @@ -1224,6 +1245,14 @@ static void c_parser_objc_at_dynamic_dec > static bool c_parser_objc_diagnose_bad_element_prefix > (c_parser *, struct c_declspecs *); > > +/* These UPC parser functions are only ever called when > + compiling UPC. */ > +static void c_parser_upc_forall_statement (c_parser *); > +static void c_parser_upc_sync_statement (c_parser *, int); > +static void c_parser_upc_shared_qual (source_location, > + c_parser *, > + struct c_declspecs *); > + > [...] > + /* UPC qualifiers */ > + case RID_SHARED: > + attrs_ok = true; > + c_parser_upc_shared_qual (loc, parser, specs); > + break; > + case RID_STRICT: > + case RID_RELAXED: > + attrs_ok = true; > + declspecs_add_qual (loc, specs, c_parser_peek_token > (parser)->value); > + c_parser_consume_token (parser); > + break; > case RID_ATTRIBUTE: > if (!attrs_ok) > goto out; > [...] 
> @@ -4558,6 +4612,22 @@ c_parser_statement_after_labels (c_parse
>           gcc_assert (c_dialect_objc ());
>           c_parser_objc_synchronized_statement (parser);
>           break;
> +       case RID_UPC_FORALL:
> +         gcc_assert (c_dialect_upc ());
> +         c_parser_upc_forall_statement (parser);
> +         break;
> +       case RID_UPC_NOTIFY:
> +         gcc_assert (c_dialect_upc ());
> +         c_parser_upc_sync_statement (parser, UPC_SYNC_NOTIFY_OP);
> +         goto expect_semicolon;
> +       case RID_UPC_WAIT:
> +         gcc_assert (c_dialect_upc ());
> +         c_parser_upc_sync_statement (parser, UPC_SYNC_WAIT_OP);
> +         goto expect_semicolon;
> +       case RID_UPC_BARRIER:
> +         gcc_assert (c_dialect_upc ());
> +         c_parser_upc_sync_statement (parser, UPC_SYNC_BARRIER_OP);
> +         goto expect_semicolon;
>         default:
>           goto expr_stmt;
>         }
>
> --- gcc/c-family/c-pragma.c (.../trunk) (revision 190707)
> +++ gcc/c-family/c-pragma.c (.../branches/gupc) (revision 190736)
> @@ -30,6 +30,7 @@ along with GCC; see the file COPYING3.
>  #include "c-pragma.h"
>  #include "flags.h"
>  #include "c-common.h"
> +#include "c-upc.h"
>  #include "tm_p.h" /* For REGISTER_TARGET_PRAGMAS (why is this not a target hook?). */
>  #include "vec.h"
> @@ -507,6 +508,242 @@ add_to_renaming_pragma_list (tree oldnam
>  /* The current prefix set by #pragma extern_prefix. */
>  GTY(()) tree pragma_extern_prefix;
>
> +/* variables used to implement #pragma upc semantics */
> +#ifndef UPC_CMODE_STACK_INCREMENT
> +#define UPC_CMODE_STACK_INCREMENT 32
> +#endif
> +static int pragma_upc_permitted;
> +static int upc_cmode;
> +static int *upc_cmode_stack;
> +static int upc_cmode_stack_in_use;
> +static int upc_cmode_stack_allocated;
> +
> +static void init_pragma_upc (void);
> +static void handle_pragma_upc (cpp_reader * ARG_UNUSED (dummy));
>
> c-decl.c handles the additional UPC qualifiers
> and declspecs. The layout qualifier is handled here:
>
> --- gcc/c/c-decl.c (.../trunk) (revision 190707)
> +++ gcc/c/c-decl.c (.../branches/gupc) (revision 190736)
> [...]
> @@ -8857,6 +9046,23 @@ declspecs_add_qual (source_location loc,
>    bool dupe = false;
>    specs->non_sc_seen_p = true;
>    specs->declspecs_seen_p = true;
> +
> +  /* A UPC layout qualifier is encoded as an ARRAY_REF,
> +     further, it implies the presence of the 'shared' keyword. */
> +  if (TREE_CODE (qual) == ARRAY_REF)
> +    {
> +      if (specs->upc_layout_qualifier)
> +        {
> +          error ("two or more layout qualifiers specified");
> +          return specs;
> +        }
> +      else
> +        {
> +          specs->upc_layout_qualifier = qual;
> +          qual = ridpointers[RID_SHARED];
> +        }
> +    }
> +
>
> In UPC, a qualifier includes both the traditional
> "C" qualifier flags and the UPC "layout qualifier".
> Thus, the pointer_quals field of a declarator node
> is defined as a struct including both qualifier
> flags and the UPC type qualifier, as shown below.
>
> @@ -5702,7 +5835,9 @@ grokdeclarator (const struct c_declarato
>
>    /* Process type qualifiers (such as const or volatile)
>       that were given inside the `*'. */
> -  type_quals = declarator->u.pointer_quals;
> +  type_quals = declarator->u.pointer.quals;
> +  upc_layout_qualifier = declarator->u.pointer.upc_layout_qual;
> +  sharedp = ((type_quals & TYPE_QUAL_SHARED) != 0);
>
> UPC shared variables are allocated at runtime in the global
> memory that is allocated and managed by the UPC runtime.
> A separate link section is used as a method of assigning
> virtual addresses to UPC shared variables. The UPC
> shared variable section is designated as a "no load"
> section on systems that support that facility; in that
> case, the linkage section begins at virtual address zero.
> The logic below assigns UPC shared variables to
> their own linkage section.
>
> @@ -6235,6 +6409,13 @@ grokdeclarator (const struct c_declarato
> [...]
> +  /* Shared variables are given their own link section on
> +     most target platforms, and if compiling in pthreads mode
> +     regular local file scope variables are made thread local. */
> +  if ((TREE_CODE(decl) == VAR_DECL)
> +      && !threadp && (TREE_SHARED (decl) || flag_upc_pthreads))
> +    upc_set_decl_section (decl);
> +
>
> Various UPC language related checks and operations
> are called in the "C" front-end and middle-end.
> To ensure that these operations are defined
> when linked with the other language front-ends
> and compilers, these functions are stubbed,
> in a fashion similar to Objective C:
>
> --- gcc/c-family/c-upc.h (.../trunk) (revision 0)
> +++ gcc/c-family/c-upc.h (.../branches/gupc) (revision 190736)
> [...]
> +
> +/* UPC entry points. */
> +
> +/* The following UPC functions are called by the C front-end;
> + * they all must have corresponding stubs in stub-upc.c. */
> +
> +extern int count_upc_threads_refs (tree);
> +extern void deny_pragma_upc (void);
> +extern int get_upc_consistency_mode (void);
> [...]
> +extern tree upc_rts_forall_depth_var (void);
> +extern void upc_set_decl_section (tree);
> +extern void upc_write_global_declarations (void);
>
> A few command line option flags must also be
> stubbed out in order to link the other
> language front-ends.
>
> --- gcc/c-family/stub-upc.c (.../trunk) (revision 0)
> +++ gcc/c-family/stub-upc.c (.../branches/gupc) (revision 190736)
> [...]
> +int compiling_upc;
> +int flag_upc;
> +int use_upc_dwarf2_extensions;
>
> The complete set of GUPC-related patches will be provided for
> review in a collection of 16 patch sets. A listing of those
> patch sets is attached.
>
> Each patch set will be sent in a separate email following
> this one for the purposes of review.
>
> -- end --