Chapel community --

Cray Inc. and the Chapel developer community are pleased to announce the release of version 1.9.0 of Chapel. This release's highlights include:
* Improved operator precedence for bitwise operators, '..' and 'in'. See the Quick Reference guide in the release or on the web (http://chapel.cray.com/spec/quickReference.pdf) to view the revised precedence table.
* The ability for users to provide their own overloads for the assignment operator. For example:
    proc =(ref lhs:mytype, rhs:mytype) { ... }
* Improved the mapping of Chapel's atomic variables to intrinsic hardware capabilities when using Intel and Cray compilers as the back-end compiler.
* Significant performance improvements and reductions in compiler-introduced memory leaks. To see or track changes in Chapel performance over time, refer to our newly-available-to-the-public performance tracking graphs at:
    http://chapel.sourceforge.net/perf/
  See the 'Performance Improvements' and 'Memory Improvements' sections of the CHANGES file (excerpted at the bottom of this message) for more details on what has improved.
* Several new Chapel versions of the Computer Language Benchmarks Game (http://benchmarksgame.alioth.debian.org/) computations are now available in the examples/benchmarks/shootout/ directory.
* Improved the performance of most nested data parallel idioms by flipping the default value of dataParIgnoreRunningTasks from 'true' to 'false'. This causes inner data parallel loops to be sensitive to the number of tasks that are already running rather than assuming that they are the only thing running. See the "Controlling Degree of Data Parallelism" section in doc/README.executing for more details.
* Significantly improved the quality and production-readiness of the LLVM back-end. See doc/technotes/README.llvm for more information.
* Improved the symmetry of program startup and execution across the locales a program is running on. Historically, there has been an asymmetry in which locale #0 has set up its tasks and threads in a manner that was distinct from all other locales.
* Added a 'make check' rule to the top-level Makefile that can be used to validate that a build seems to be compiling and running correctly.
* Made significant improvements to the testing system, particularly for performance testing scenarios.
* Implemented numerous bug fixes. See the "Bug Fixes" section in the CHANGES file, excerpted at the bottom of this message, for details.

...and much more! See below for a more complete list of changes in version 1.9.0, or refer to $CHPL_HOME/CHANGES within the release itself.

Contributors to this release include:

  Kyle Brady, Cray Inc.
  Brad Chamberlain, Cray Inc.
  Sung-Eun Choi, Cray Inc.
  Lydia Duncan, Cray Inc.
  Michael Ferguson, LTS
  Akihiro Hayashi, Rice University
  Tom Hildebrandt, Cray Inc.
  David Iten, Cray Inc.
  Rafael Larrosa Jiminez, University of Malaga
  Vassily Litvinov, Cray Inc.
  Jun Nakashima, University of Tokyo
  Elliot Ronaghan, Cray Inc./Moravian College
  Brandon Ross, University at Buffalo
  Greg Titus, Cray Inc.
  Thomas Van Doren, Cray Inc.
  Chris Wailes, Indiana University

To download the release, visit our SourceForge page at:

  http://sourceforge.net/projects/chapel

At this site, you can also browse the mailing list archives and track our progress day-by-day. Our main project page continues to be hosted at:

  http://chapel.cray.com

and it remains the best place to browse papers, presentations, documents, tutorials, and news items; or to read about collaborations with researchers and educators.

As always, we're interested in your feedback on how we can make the Chapel language and implementation more useful to you.
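
As a small illustration of the assignment-overload highlight above, here is a minimal sketch written in the 'proc =' form shown in this announcement. The record 'Celsius' and its field 'degrees' are hypothetical names invented for this example; they are not part of the release. Later Chapel releases may spell operator overloads differently, so treat this as a sketch against the 1.9-era syntax rather than a definitive reference.

```chapel
// Hypothetical record type, used only for illustration.
record Celsius {
  var degrees: real;
}

// User-defined assignment overload. The left-hand side is taken by 'ref'
// so that the overload can modify it in place.
proc =(ref lhs: Celsius, rhs: Celsius) {
  lhs.degrees = rhs.degrees;
}

// A second overload that permits assigning a plain 'real' to a Celsius.
proc =(ref lhs: Celsius, rhs: real) {
  lhs.degrees = rhs;
}

var a, b: Celsius;
b.degrees = 21.5;

a = b;      // resolves to the Celsius-from-Celsius overload
a = 30.0;   // resolves to the Celsius-from-real overload

writeln(a.degrees);
```

Taking the left-hand side by 'ref' matches the signature given in the highlight; without it the overload would only update a copy of the left-hand side.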

On behalf of the Chapel Team,
-Brad Chamberlain

=============
version 1.9.0
=============

Twelfth public release of Chapel, April 17, 2014

Highlights (see entries in subsequent categories for details)
-------------------------------------------------------------
* numerous performance improvements (see 'Performance Improvements' below)
* significant reductions in compiler-introduced memory leaks
* added five new Computer Language Benchmark Games to the examples/ directory
* improved operator precedence for '|', '^', '&', '<<', '>>', '..', and 'in'
* added the ability for a user to create overloads of the assignment operator
* implemented atomic variables using intrinsics for Intel and Cray compilers
* flipped the default nested parallelism policy via dataParIgnoreRunningTasks
* significantly improved the stability/generality of the LLVM back-end
* added a slurm-srun launcher for use with native SLURM and improved pbs-aprun
* added a 'make check' rule to the top-level Makefile to validate a build
* improved the symmetry of program startup and polling across the locales
* significant improvements to the testing system, esp. performance testing
* updates to the GASNet and GMP packages and new snapshots of hwloc and re2
* improved the code base's portability w.r.t.
  clang, gcc, Mac OS X, Debian 7.4
* numerous bug fixes (see 'Bug Fixes' below)

Packaging Changes
-----------------
* added a 'make check' rule to the top-level Makefile to validate a build
* removed the half-hearted support for Chapel syntax highlighting in emacs 21

Environment Changes
-------------------
* added a new CHPL_HWLOC environment variable to control the use of 'hwloc'
  (see doc/README.chplenv)
* made CHPL_*_COMPILER default to 'clang' for CHPL_*_PLATFORM 'darwin'
* made CHPL_TASKS default to 'qthreads' when CHPL_LOCALE_MODEL == 'numa'
* made CHPL_HWLOC default to 'hwloc' when CHPL_TASKS = 'qthreads'
* established a 1:1 correspondence between CHPL_TASKS and CHPL_THREADS options
* deprecated the user-controlled CHPL_THREADS environment variable
* removed support for CHPL_TASKS=none due to lack of significant utility
* made GASNet use the 'large' segment by default for the 'ibv' conduit
* made CHPL_LAUNCHER default to 'gasnetrun_ibv' when using GASNet's mxm conduit
  (see doc/README.launcher)

Semantic Changes / Changes to Chapel Language
---------------------------------------------
* improved operator precedence for '|', '^', '&', '<<', '>>', '..', and 'in'
  (see precedence tables in the 'Expressions' spec chapter or quick ref card)
* added the ability for a user to create overloads of the assignment operator
  (see 'Statements' chapter in the language specification)
* added a 'noinit' capability to squash default initialization for basic types
  (see 'Variables' chapter in the language specification)
* for a domain D, previously {D} == D; now it interprets it as 'domain(D.type)'
* added support for an expression-less 'serial' statement
  (i.e., 'serial do' == 'serial true do')
* added support for dynamic casts of the 'nil' value, producing 'nil'
  (see 'Conversions' chapter of the language specification)
* clarified that deleting a 'nil' value is OK and will have no effect
  (see 'Classes' chapter of the language specification)
* added the ability to mark the 'this' as
  having 'ref' intent for methods
  (see 'Classes' chapter of the language specification)

New Features
------------
* implemented support for the 'break' statement within param loops

Changes to the Implementation
-----------------------------
* dataParIgnoreRunningTasks is now 'false' by default for locale model 'flat'
  (see doc/README.executing for details)
* changed the default call stack size to 8 MiB for all task options
  (see doc/README.tasks for details)

New Interoperability Features
-----------------------------
* extended c_ptrTo() to support 1D rectangular arrays
* added support for casts between c_ptr types

Standard Modules
----------------
* added support for abs() on imaginary types
* added isSubtype() and isProperSubtype() queries to the standard Types module
  (see 'Standard Modules' chapter of the spec for details)

Documentation
-------------
* added descriptions of 'atomic' variables and 'noinit' expressions to the spec
  (see the 'Task Parallelism and Synchronization' and 'Variables' sections)
* clarified specification of casting from numeric types to 'bool'
* reworked LICENSE files to clarify third-party licenses and isolate BSD text
  (see LICENSE and LICENSE.chapel)
* refreshed and reorganized README.tasks
* documented that 'clang' is available as a CHPL_*_COMPILER option
  (see doc/README.chplenv)
* improved description of Cray-specific runtime environment variables
  (see doc/platforms/README.cray)
* clarified formatted I/O documentation regarding width/precision
  (see doc/technotes/README.io)
* added a performance notes file (PERFORMANCE)
* removed the user agreement (AGREEMENT)
* generally refreshed README-based documentation
* general updates and improvements to the language specification

Example Codes
-------------
* added new Chapel ports of several Computer Language Benchmark Games (CLBG)
  (see spectralnorm.chpl, mandelbrot.chpl, fannkuchredux.chpl, meteor.chpl,
  and pidigits.chpl in benchmarks/shootout/)
* added an improved/simplified version
  of the CLBG chameneos-redux example
  (see benchmarks/shootout/chameneosredux.chpl)
* improved the release versions of RA to use atomic rather than sync vars
* made the examples/programs/tree.chpl example reclaim its memory
* made minor improvements to the MiniMD example and primer examples
* fixed a few incorrect statements in comments within primers/syncsingle.chpl

Cray-specific Notes
-------------------
* changed Cray XC Systems(TM) to use GASNet over the aries conduit by default
* added a slurm-srun launcher for use with Cray systems supporting native SLURM

Launcher-specific Notes
-----------------------
* added a slurm-srun launcher for use with native SLURM systems
* improved the pbs-aprun launcher for use with Moab/Torque
* made the gasnetrun_ibv launcher forward all CHPL_ environment variables
* made the 'aprun' launcher more careful about when it can correctly use '-j'

Portability of code base
------------------------
* improved code base's portability to newer versions of gcc
* improved code base's portability to Darwin/Mac OS X and Debian 7.4
* improved code base's portability for compilation with 'clang'
* enabled 'tcmalloc' to be built on Cray XC systems

Compiler Flags (see 'man chpl' for details)
-------------------------------------------
* removed the --serial and --serial-forall flags; use serial statements instead
* started ignoring --static for Mac OS X since it isn't well-supported
* added a --print-passes-file flag to print passes to a specified filename

Error Message Improvements
--------------------------
* changed the wording of internal errors for succinctness

Performance Improvements
------------------------
* implemented atomic variables using intrinsics for Intel and Cray compilers
* optimized whole-array binary read/write operations for default domains/arrays
* extended global constant replication to additional types
* improved the compiler's ability to remote value forward values
* optimized away sublocale-related code for the 'flat'
  locale model
* improved 'numa' locale model performance for --local compilations
* optimized blocking on sync variables for 'fifo' tasking based on # of tasks
* within serial sections, optimized forall loops over ranges, domains, arrays
* improved the task accounting for loops over Block-distributed arrays
* improved the loop-invariant code motion optimization's use of alias analysis
* removed unnecessary copies for formal arguments
* optimized program startup times in several ways

Locale Model Improvements
-------------------------
* improved the 'numa' locale model to reduce unnecessary changes in sublocale
* improved the 'numa' locale model's range, domain, and array iterators

Memory Improvements
-------------------
* reduced compiler-introduced memory leaks, particularly in I/O code
* reduced memory usage due to compiler-introduced copies on primitive types
* improved the reclamation of arrays of sync/single variables
* moved the end-of-program memory reporting to a point after the runtime exits

Third-Party Software Changes
----------------------------
* updated GASNet to version 1.22.0 with a patch to fix 'aries' conduit bugs
* added a snapshot of hwloc version 1.7.2 for compute node introspection
* added a snapshot of re2 (20140111) to the third-party directory
* added a snapshot of dygraphs to the third-party directory for perf graphs
* updated our snapshot of GMP to version 6.0.0
* various minor improvements to the Qthreads tasking layer
* disabled pshm for all non-udp GASNet conduits

Bug Fixes / New Semantic Checks (for old semantics)
---------------------------------------------------
* improved const-ness checking for const fields and const records/unions
* added a semantic check for tuple size mismatch when destructuring a tuple
* fixed a bug in which [u]int & imag literals were represented incorrectly
* fixed a bug where iterators with complex control flow could yield bad values
* fixed a bug in which timer.clear() did not reset the timer if it was
  running
* fixed a bug in which abs() on reals incorrectly called the complex version
* fixed a bug in converting between Chapel and C strings
* fixed a bug in which casts from ints/uints to bools were not always correct
* fixed some problems with the GMP random number routines
* fixed a bug on Cygwin for usernames with spaces in them
* extended global constant replication to additional types
* fixed a "read after freed" bug on domains used in nonblocking on calls
* fixed bug in loop invariant code motion related to aliasing in records/tuples
* fixed a subtle/minor race condition regarding accounting of tasks
* fixed Qthreads tasking layer bug resulting in incorrect task placement
* fixed a bug in which Qthreads was asked for task-local storage prematurely
* fixed a potential race in task reporting (-t/--taskreport)
* fixed an optimization shortcut for array reindexing
* fixed a bug in which while loops warned about testing against local consts
* improved 'printchplenv' to avoid perl problems in unexpected cases

Runtime Library Changes
-----------------------
* improved the symmetry of program startup and polling across the locales
* improved descriptions of the runtime communication interface in chpl-comm.h
* simplified the implementation of the registry of global variables
* added ability for fifo tasks to implement thread-level storage using __thread

Generated Code Cleanups
-----------------------
* simplified the implementation of operations on homogeneous tuples
* removed the passing of wide strings by ref by default
* squashed the code generation of an unused program initialization function
* squashed redundant calls to initialize the ChapelStandard module
* folded out tautological comparisons between uints and 0s (e.g., myuint >= 0)

Compiler Performance
--------------------
* short-circuited the beautify pass if --savec is not specified
* short-circuited some logic in the parallel pass for --local compiles

Compiler Improvements
---------------------
* significantly improved the stability/generality of the LLVM back-end
* re-implemented copy propagation to handle aliases better
* made de-nested functions use 'const' arguments in more cases

Testing System
--------------
* added the ability to run multiple trials of each test to the testing system
* added support for sweeping current performance tests across past releases
* added the ability to track compiler performance during testing
* added a regexp-based correctness check capability for performance tests
* changed performance testing to support --fast by default
* added a script to splice .dat files created by performance testing
* permit SLURM to control the node list for parallel testing
* replaced 'paratest.server's -duplex flag with -nodepara for oversubscription
* added a capability to add annotations to performance graphs
* made performance testing compile with --static by default
* in generating performance graphs, the previous directory is now removed
* added a capability for the performance graphs to be rsync'd to SourceForge
* added a logarithmic/linear toggle to the generated performance graphs
* added a capability for the 'nightly' script to svn commit performance data
* added additional print messages to 'nightly' to better describe progress
* added a -retaintree option to 'nightly' to use the existing writable tree
* added support for testing the '--fast' flag in the 'nightly' script
* worked on making the testing system less Cray-centric in its design
* made test scripts send mail to the sourceforge mailing lists by default
* added options for naming test suites, specifying recipients, etc.
* unified the naming and structure of the cron job testing scripts
* removed reliance on tcsh-specific features for improved csh portability

Makefile Changes
----------------
* made all builds update Makefile dependences, not just developer builds
* made Makefiles propagate CFLAGS/CXXFLAGS to third-party builds

Internal/Developer-oriented
---------------------------
* added dataPar* arguments to these() iterators for ranges/default rectangular
* made Block's leader iterator forward to DefaultRectangular
* made sure that arguments to exported functions are always local/narrow
* changed most assignment operator signatures to take the LHS by 'ref'
* added support for a "minimal modules" compilation mode for core language work
* added a developer flag --report-optimized-loop-iterators to track loop opts
* made internal errors appear as such in developer mode
* refactored the reference count code for domain maps, domains, and arrays
* switched to a symbolic initializer for locale IDs to improve flexibility
* refactored QIO and Regexp reference counting
* deprecated the internal InitPrivateGlobals module
* added config params to support program startup communication diagnostics
  (see 'printInitVerboseComm' and 'printInitCommCounts')
* added a verification pass to ensure ref types for return types are available
* renamed NUM_KIND_FLOAT to NUM_KIND_REAL
* renamed 'Class'->'Aggregate' in the compiler sources when record|class|union
* switched to a bulk copy of flags when copying Symbol classes
* changed the representation of intents to support bitmasks
* moved the initialization of memory tracking from the modules to the runtime
* removed user-level memory leaks from example codes in the language spec
* created FLAG_COMPILER_GENERATED to separate functions/labels from FLAG_TEMP
* added PRIM_IS_ATOMIC_TYPE to query whether something is an atomic type
* added new primitives for querying whether a type is a tuple, sync, or single
* improved the internal use cases of
  'printchplenv'
* removed redundant overloads of ==/!= for syserr and err_t
* improved the implementation when 'noRefCount' is true
* removed no-longer-necessary _ensure_reference_type() feature
* changed extern routines that take string arguments to take 'c_string' instead
* changed extern routines that take 'inout' arguments to 'ref' when appropriate
* numerous refactorings of the compiler code for clarity and/or efficiency

_______________________________________________
Chapel-announce mailing list
https://lists.sourceforge.net/lists/listinfo/chapel-announce
_______________________________________________
Chapel-developers mailing list
https://lists.sourceforge.net/lists/listinfo/chapel-developers
