On 11-nov-10, at 09:58, Russel Winder wrote:

On Thu, 2010-11-11 at 02:24 +0000, jfd wrote:
Any thoughts on parallel programming? I was looking at the Chapel and X10 languages etc. for parallelism, and it looks interesting. I know that it is still an area of active research, and it is not yet (far from?) done,
but does anyone have thoughts on this as a future direction? Thank you.

Any programming language that cannot be used to program applications
running on a heterogeneous collection of processors, including CPUs and
GPUs as computational devices, on a single chip, with there being many
such chips on a board, possibly clustered, doesn't have much of a
future.  Timescale 5--10 years.

On this I am not so sure: heterogeneous clusters are more difficult to program, and GPUs & co. are slowly becoming more and more general purpose. Being able to take advantage of them is useful, but I am not convinced they are necessarily the future.

Intel's 80-core, 48-core and 50-core devices show the way server,
workstation and laptop architectures are going.  There may be a large
central memory unit as now, but it will be secondary storage not primary storage. All the chip architectures are shifting to distributed memory -- basically cache coherence is too hard a problem to solve, so instead
of solving it, they are getting rid of it.  Also the memory bus stops
being the bottleneck for computations, which is actually the biggest
problem with current architectures.

Yes, many-core is the future, I agree on this, and also that a distributed approach is the only way to scale to a really large number of processors. But distributed systems *are* more complex, so I think that for the foreseeable future one will have a hybrid approach.

Windows, Linux and Mac OS X have a serious problem and will either die
or be revolutionized.  Apple at least recognize the issue, hence they
pushed OpenCL.

Again, I am not sure the situation is as dire as you paint it; Linux does quite well in the HPC field... but I agree that to be the ideal OS for these architectures it will need more changes.

Actor model, CSP, dataflow, and similar distributed memory/process-based
architectures will become increasingly important for software.  There
will be an increasing move to declarative expression, but I doubt
functional languages will ever make the mainstream.  The issue here is that parallelism generally requires programmers not to try and tell the computer every detail of how to do something, but instead to specify the start
and end conditions and allow the runtime system to handle the
realization of the transformation.  Hence the move in Fortran from lots
of "do" loops to "whole array" operations.

Whole-array operations are useful, and when they apply one gains much by using them; unfortunately not all problems can be reduced to a few large array operations, which is why data-parallel languages have not become the main type of language.
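
To make the contrast concrete, here is a minimal sketch in C with OpenMP (in Fortran the whole-array form would simply be c = a + b): instead of prescribing the order of every step, one declares the iterations independent and lets the runtime handle the realization.

    /* explicit loop vs. declaring the iterations independent:
     * the pragma leaves scheduling to the runtime (compile with -fopenmp) */
    #include <stdio.h>

    #define N 1000000
    static double a[N], b[N], c[N];

    int main(void) {
        for (int i = 0; i < N; ++i) { a[i] = i; b[i] = 2.0 * i; }

        /* "whole array" style: we state WHAT (c = a + b, iterations
         * independent); the runtime decides HOW to split the work */
        #pragma omp parallel for
        for (int i = 0; i < N; ++i)
            c[i] = a[i] + b[i];

        printf("c[42] = %g\n", c[42]);
        return 0;
    }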

MPI and all the SPMD approaches have a severely limited future, but I
bet the HPC codes are still using Fortran and MPI in 50 years time.

Well, whole-array operations are a generalization of the SPMD approach, so in this sense you are saying that that kind of approach will have a future (though with more difficult optimization, as the hardware is more complex).

About MPI, I think that many don't see what MPI really does: MPI offers a simplified parallel model. The main weakness of this model is that it assumes some kind of reliability, but in exchange it offers a clear computational model, with processors ordered in a linear or higher-dimensional structure, and efficient collective communication primitives. Yes, MPI is not the right choice for all problems, but when usable it is very powerful, often superior to the alternatives, and programming with it is *simpler* than thinking about a generic distributed system. So I think that for problems that are not trivially parallel or easily parallelizable, MPI will remain the best choice.
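
For illustration, a minimal sketch in C of one of those collective primitives (generic MPI, not tied to any code discussed in this thread): a single call combines a value from every processor and hands the result back to all of them, typically in O(log p) communication steps.

    /* minimal MPI collective example: every rank contributes a partial
     * value and all ranks receive the global total */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* each processor computes a local partial result... */
        double local = (double)rank + 1.0;
        double total = 0.0;

        /* ...and one collective call combines them efficiently
         * and returns the result on every rank */
        MPI_Allreduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM,
                      MPI_COMM_WORLD);

        printf("rank %d of %d: global sum = %g\n", rank, size, total);
        MPI_Finalize();
        return 0;
    }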

You mentioned Chapel and X10, but don't forget the other one of the
original three HPCS projects, Fortress.  Whilst all three are PGAS
(partitioned global address space) languages, Fortress takes a very
different viewpoint compared to Chapel and X10.

It might be a personal thing, but I am kind of "suspicious" toward PGAS; I find a generalized MPI model better than PGAS when you want to have separate address spaces. Using MPI one can define a PGAS-like object by wrapping local storage with an object that sends requests to access remote memory pieces. This means having a local server where these wrapped objects can be "published" and which can respond at any moment to external requests. I call this rpc (remote procedure call), and it can be realized easily on top of MPI.

As not all objects are distributed, and in a complex program it does not always make sense to distribute these objects on all processors or on none, I find the robust partitioning and collective communication primitives of MPI superior to PGAS. With enough effort you can probably get everything from PGAS too, but then you lose all its simplicity.
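
A rough sketch in C of that idea (the names remote_get and serve_one are mine, purely illustrative; a real server would answer requests concurrently, e.g. from a thread or an MPI_Iprobe loop, instead of one matched pair): each processor owns a chunk of a global array and answers GET requests for its local part, giving PGAS-like access on top of plain MPI point-to-point messages.

    /* sketch of "rpc on top of MPI": run with at least 2 ranks */
    #include <mpi.h>
    #include <stdio.h>

    enum { TAG_REQ = 1, TAG_ANS = 2 };

    #define LOCAL_N 100
    static double local_chunk[LOCAL_N];   /* the locally owned piece */

    /* answer one pending request for a local element */
    static void serve_one(void) {
        int index;
        MPI_Status st;
        MPI_Recv(&index, 1, MPI_INT, MPI_ANY_SOURCE, TAG_REQ,
                 MPI_COMM_WORLD, &st);
        MPI_Send(&local_chunk[index], 1, MPI_DOUBLE,
                 st.MPI_SOURCE, TAG_ANS, MPI_COMM_WORLD);
    }

    /* PGAS-like access to element `index` owned by rank `owner` */
    static double remote_get(int owner, int index) {
        double value;
        MPI_Send(&index, 1, MPI_INT, owner, TAG_REQ, MPI_COMM_WORLD);
        MPI_Recv(&value, 1, MPI_DOUBLE, owner, TAG_ANS,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        return value;
    }

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        for (int i = 0; i < LOCAL_N; ++i)
            local_chunk[i] = rank * LOCAL_N + i;

        /* toy interaction: rank 0 reads one element owned by rank 1 */
        if (rank == 0)
            printf("element 7 of rank 1: %g\n", remote_get(1, 7));
        else if (rank == 1)
            serve_one();

        MPI_Finalize();
        return 0;
    }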

The summary of the summary is: programmers will either be developing
parallel systems or they will be unemployed.

The situation is not so dire: some problems are trivially parallel or can be solved with simple parallel patterns, and others don't need to be solved in parallel, as the sequential solution is fast enough. But I do agree that being able to develop parallel systems is increasingly important.
In fact it is something that I like to do, and that I have thought about a lot.
I have programmed parallel systems, and out of that experience I tried to build something to write parallel programs "the way it should be", or at least the way I would like it to be ;)

The result is what I did with blip, http://dsource.org/projects/blip .
I don't think that (excluding some simple examples) fully automatic (transparent) parallelization is really feasible. At some point being parallel is more complex, and it puts an extra burden on the programmer. Still, it is possible to have several levels of parallelization: if you write a fully parallel program it should still be possible to use it relatively efficiently locally, but a local program will not automatically become fully parallel.

What I did is a basic smp parallelization for programs with shared memory. This level tries to schedule independent recursive tasks on all processors as efficiently as possible (using the topology detected by libhwloc). It leverages an event-based framework (libev) to avoid blocking while waiting for external tasks. The ability to describe complex asynchronous processes can also be very useful for working with GPUs.
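
blip itself is D code, but the kind of recursive, independent task decomposition this level schedules can be sketched in C with OpenMP tasks (an illustration of the idea only, not blip's API):

    /* recursive independent tasks: the runtime distributes the spawned
     * tasks over all processors (compile with -fopenmp) */
    #include <stdio.h>

    /* sum arr[lo..hi) by recursively splitting into independent tasks */
    static double rsum(const double *arr, int lo, int hi) {
        if (hi - lo < 1024) {            /* small enough: do it inline */
            double s = 0.0;
            for (int i = lo; i < hi; ++i) s += arr[i];
            return s;
        }
        int mid = lo + (hi - lo) / 2;
        double left, right;
        #pragma omp task shared(left)    /* spawn an independent subtask */
        left = rsum(arr, lo, mid);
        right = rsum(arr, mid, hi);      /* current thread takes the rest */
        #pragma omp taskwait             /* join before combining */
        return left + right;
    }

    int main(void) {
        enum { N = 1 << 20 };
        static double arr[N];
        for (int i = 0; i < N; ++i) arr[i] = 1.0;
        double total;
        #pragma omp parallel             /* create the worker pool */
        #pragma omp single               /* one thread seeds the recursion */
        total = rsum(arr, 0, N);
        printf("total = %g\n", total);
        return 0;
    }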

mpi parallelization is part of the hierarchy of parallelization, for the reasons I described before; it is wrapped so that on a single processor one can use a "pseudo" mpi.

rpc (remote procedure call), which might be better described as distributed objects, offers a server that can respond to external requests at any moment, and the possibility to publish objects that are then identified by urls. These urls can be used to create local proxies that call the remote object and get results from it.
This can be done using mpi, or directly with sockets.
If one uses sockets one has the whole flexibility (but also the whole complexity) of a fully distributed system. The basic building blocks of this can also be used in a distributed protocol like distributed hash tables.
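
As a purely illustrative sketch in C (plain POSIX sockets; not blip's actual protocol, and the object name, port, and proxy_call helper are invented for the example): a local proxy connects to the host and port named in such a url, sends a request for a published object, and reads the answer.

    /* hypothetical local proxy for a published remote object */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <netdb.h>
    #include <sys/socket.h>

    /* send request `req` to the object behind host:port, reply into buf */
    static int proxy_call(const char *host, const char *port,
                          const char *req, char *buf, size_t len) {
        struct addrinfo hints = {0}, *res;
        hints.ai_socktype = SOCK_STREAM;
        if (getaddrinfo(host, port, &hints, &res) != 0) return -1;
        int fd = socket(res->ai_family, res->ai_socktype,
                        res->ai_protocol);
        if (fd < 0) { freeaddrinfo(res); return -1; }
        if (connect(fd, res->ai_addr, res->ai_addrlen) < 0) {
            close(fd); freeaddrinfo(res); return -1;
        }
        freeaddrinfo(res);
        write(fd, req, strlen(req));        /* marshal the request */
        ssize_t n = read(fd, buf, len - 1); /* wait for the answer */
        buf[n > 0 ? n : 0] = '\0';
        close(fd);
        return 0;
    }

    int main(void) {
        char reply[256];
        /* invented example: object "obj1" published at localhost:7777 */
        if (proxy_call("localhost", "7777", "obj1.getValue\n",
                       reply, sizeof reply) == 0)
            printf("remote answered: %s\n", reply);
        return 0;
    }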

blip is available now, and works on OS X and Linux. It should be possible to port it to Windows (both libhwloc and libev work on Windows), but I didn't do it. It needs D1 and Tango; Tango trunk can be compiled using the scripts in blip/buildTango, and then programs using blip can be compiled more easily with the dbuild script (which uses xfbuild behind the scenes).

I planned to make an official release this weekend, but you can already take a look now; the code is all there...

Fawzi

-----------------------------------------------------
Dr. Fawzi Mohamed,                      Office: 3'322
Humboldt-Universitaet zu Berlin, Institut fuer Chemie
Post:               Unter den Linden 6, 10099, Berlin
Besucher/Pakete:    Brook-Taylor-Str. 2, 12489 Berlin
Tel: +49 30 20 93 7140          Fax: +49 30 2093 7136
-----------------------------------------------------
