Re: Thoughts on parallel programming?

2010-11-13 Thread sybrandy

Distributed programming is essentially a bunch of little sequential
programs that interact, which is basically how people cooperate in the
real world. I think that is by far the most intuitive of any
concurrent programming model, though it's still a significant
conceptual shift from the traditional monolithic imperative program.


The Erlang people seem to say that a lot. The thing they omit to say,
though, is that it is very, very difficult in the real world!
Consider managing a team of ten people. Getting them to be ten times as
productive as a single person is extremely difficult -- virtually
impossible, in fact.


That's only part of the reasoning behind all of the little programs in
Erlang.  One of the more important aspects is the concept of
supervisor trees, where you have processes that monitor* other processes.
 In the event that a child process fails, the parent process will try
to perform a simpler version of what needs to occur until it is successful.


The other aspect is the concept of failing fast.  It is assumed that a 
process that fails does not know how to resolve the issue, therefore it 
should just stop running and allow the parent process to do the right thing.
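
For readers who know D better than Erlang, a very rough analogue of this
supervision/fail-fast idea can be sketched with std.concurrency (spawnLinked
and LinkTerminated are the relevant primitives).  This is only an
illustration of the concept, not how Erlang/OTP supervisors actually work,
and the restart policy below is invented for the example:

import std.concurrency;

// The child "fails fast": when it hits a problem it does not try to be
// clever, it simply stops and lets its parent decide what to do next.
void child()
{
    // ... attempt the work; on any unrecoverable problem, just return ...
}

// A toy "supervisor": restart the child a bounded number of times.  A real
// supervisor would back off, try a simpler strategy, or escalate to its
// own parent, as described above.
void supervisor()
{
    foreach (attempt; 0 .. 3)
    {
        spawnLinked(&child);             // link so we hear about its death
        receive((LinkTerminated _) {});  // block until the child terminates
    }
}

void main()
{
    supervisor();
}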


If you build your software the Erlang way, then you implicitly build 
software that is multi-core friendly.  How well it uses multiple cores 
depends on the software that is written; however, I believe that Erlang
is supposed to be better than most other languages at obtaining 
something close to linear scaling across cores.  Not 100% sure, though.


Does this mean that I believe distributed programming is easy in Erlang? 
 Well, that depends on what you're doing, but I will say that being 
able to spawn functions on different machines is dirt simple.  Doing it 
efficiently...well, that's where I think the programmer needs to know 
what they're doing.


Casey

* The monitoring is something implicit to the language.


Re: Thoughts on parallel programming?

2010-11-13 Thread sybrandy

True enough.  But it's certainly more natural to think about than mutex-based 
concurrency, automatic parallelization, etc.  In the long term there may turn 
out to be better models, but I don't know of one today.

Also, there are other goals for such a design than increasing computation 
speed: decreased maintenance cost, system reliability, etc.  Erlang processes 
are equivalent to objects in C++ or Java with the added benefit of asynchronous 
execution in instances where an immediate response (ie. RPC) is not required.  
Performance gain is a direct function of how often this is true.  But even 
where it's not, the other benefits exist.



I like that description!

Casey


Re: Thoughts on parallel programming?

2010-11-12 Thread Tobias Pfaff

On 11/12/2010 12:44 AM, dsimcha wrote:

== Quote from Tobias Pfaff (nos...@spam.no)'s article

On 11/11/2010 08:10 PM, Russel Winder wrote:

On Thu, 2010-11-11 at 18:24 +0100, Tobias Pfaff wrote:
[ . . . ]

Unfortunately I only know about the standard stuff, OpenMP/OpenCL...
Speaking of which: Are there any attempts to support lightweight
multithreading in D, that is, something like OpenMP ?


I'd hardly call OpenMP lightweight.  I agree that as a meta-notation for
directing the compiler how to insert appropriate code to force
multithreading of certain classes of code, using OpenMP generally beats
manual coding of the threads.  But OpenMP is very Fortran oriented even
though it can be useful for C, and indeed C++ as well.

However, given things like Threading Building Blocks (TBB) and the
functional programming inspired techniques used by Chapel, OpenMP
increasingly looks like a hack rather than a solution.

Using parallel versions of for, map, filter, reduce in the language is
probably a better way forward.

Having a D binding to OpenCL (and OpenGL, MPI, etc.) is probably going
to be a good thing.


Well, I am looking for an easy & efficient way to perform parallel
numerical calculations on our 4-8 core machines. With C++, that's OpenMP
(or GPGPU stuff using CUDA/OpenCL) for us now. Maybe lightweight was the
wrong word; what I meant is that OpenMP is easy to use, and efficient
for the problems we are solving. There might actually be better tools
for that, but honestly we didn't look into that many options -- we are no
HPC guys, 1000-CPU clusters are not a relevant scenario, and we are happy
that we even started parallelizing our code at all :)
Anyways, I was thinking about the logical thing to use in D for this
scenario. It's nothing super-fancy; in most cases just a parallel_for will
do, and sometimes a map/reduce operation...
Cheers,
Tobias


I think you'll be very pleased with std.parallelism when/if it gets into Phobos.
The design philosophy is exactly what you're looking for:  Simple shared memory
parallelism on multicore computers, assuming no fancy/unusual OS-, compiler- or
hardware-level infrastructure.  Basically, it's got parallel foreach, parallel
map, parallel reduce and parallel tasks.  All you need to fully utilize it is
DMD and a multicore PC.

As a reminder, the docs are at
http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html and the code is at
http://dsource.org/projects/scrapple/browser/trunk/parallelFuture/std_parallelism.d .
If this doesn't meet your needs in its current form, I'd like as much
constructive criticism as possible, as long as it's within the scope of simple,
everyday parallelism without fancy infrastructure.


I did a quick test of the module, looks really good so far, thanks for 
providing this ! (Is this module scheduled for inclusion in phobos2 ?)

If I find issues with it I'll let you know.


Re: Thoughts on parallel programming?

2010-11-12 Thread Fawzi Mohamed

On 12-nov-10, at 00:29, Tobias Pfaff wrote:


[...]
Well, I am looking for an easy  efficient way to perform parallel  
numerical calculations on our 4-8 core machines. With C++, that's  
OpenMP (or GPGPU stuff using CUDA/OpenCL) for us now. Maybe  
lightweight was the wrong word, what I meant is that OpenMP is easy  
to use, and efficient for the problems we are solving. There  
actually might be better tools for that, honestly we didn't look  
into that much options -- we are no HPC guys, 1000-cpu clusters are  
not a relevant scenario and we are happy that we even started  
parallelizing our code at all :)


Anyways, I was thinking about the logical thing to use in D for this  
scenario. It's nothing super-fancy, in cases just a parallel_for we  
will, and sometimes a map/reduce operation...


If you use D1, blip.parallel.smp offers that, and it does scale well to
4-8 cores.




Re: Thoughts on parallel programming?

2010-11-12 Thread Fawzi Mohamed


On 11-nov-10, at 20:41, Russel Winder wrote:


On Thu, 2010-11-11 at 15:16 +0100, Fawzi Mohamed wrote:
[ . . . ]
on this I am not so sure, heterogeneous clusters are more difficult  
to
program, and GPU  co are slowly becoming more and more general  
purpose.

Being able to take advantage of those is useful, but I am not
convinced they are necessarily the future.


The Intel roadmap is for processor chips that have a number of cores
with different architectures.  Heterogeneity is not going going to  
be a
choice, it is going to be an imposition.  And this is at bus level,  
not

at cluster level.


Vector co-processors, yes, I see that, and in the short term the effect of
things like AMD Fusion (CPU/GPU merging).
Is this necessarily the future? I don't know; neither does Intel, I
think, as they are still evaluating Larrabee.

But CPU/GPU will stay around for some time more, for sure.


[ . . . ]
yes many core is the future I agree on this, and also that  
distributed

approach is the only way to scale to a really large number of
processors.
But distributed systems *are* more complex, so I think that for the
foreseeable future one will have a hybrid approach.


Hybrid is what I am saying is the future whether we like it or not.   
SMP

as the whole system is the past.




I disagree that distributed systems are more complex per se.  I  
suspect
comments are getting so general here that anything anyone writes can  
be

seen as both true and false simultaneously.  My perception is that
shared memory multithreading is less and less a tool that applications
programmers should be thinking in terms of.  Multiple processes with  
an

hierarchy of communications costs is the overarching architecture with
each process potentially being SMP or CSP or . . .


I agree that on not-too-large shared memory machines a hierarchy of
tasks is the correct approach.
This is what I did in blip.parallel.smp. Using it one can have
fairly efficient automatic scheduling, and so forget most of the
complexities and the actual hardware configuration.



again not sure the situation is as dire as you paint it, Linux does
quite well in the HPC field... but I agree that to be the ideal OS  
for

these architectures it will need more changes.


The Linux driver architecture is already creaking at the seams, it
implies a central monolithic approach to operating system.  This falls
down in a multiprocessor shared memory context.  The fact that the Top
500 generally use Linux is because it is the least worst option.  M$
despite throwing large amounts of money at the problem, and indeed
bought some very high profile names to try and do something about the
lack of traction, have failed to make any headway in the HPC operating
system stakes.  Do you want to have to run a virus checker on your HPC
system?

My gut reaction is that we are going to see a rise of hypervisors as  
per

Tilera chips, at least in the short to medium term, simply as a bridge
from the now OSes to the future.  My guess is that L4 microkernels
and/or nanokernels, exokernels, etc. will find a central place in  
future
systems.  The problem to be solved is ensuring that the appropriate  
ABI
is available on the appropriate core at the appropriate time.   
Mobility

of ABI is the critical factor here.


Yes, microkernels & co will be more and more important (but I wonder how
much this will be the case for the desktop).
ABI mobility? Not so sure; for HPC I can imagine having to compile to
different ABIs (but maybe that is what you mean by ABI mobility).



[ . . . ]

Whole array operation are useful, and when possible one gains much
using them, unfortunately not all problems can be reduced to few  
large

array operations, data parallel languages are not the main type of
language for these reasons.


Agreed.  My point was that in 1960s code people explicitly handled  
array

operations using do loops because they had to.  Nowadays such code is
anathema to efficient execution.  My complaint here is that people  
have
put effort into compiler technology instead of rewriting the codes  
in a
better language and/or idiom.  Clearly whole array operations only  
apply

to algorithms that involve arrays!

[ . . . ]
well whole array operations are a generalization of the SPMD  
approach,
so I this sense you said that that kind of approach will have a  
future
(but with a more difficult optimization as the hardware is more  
complex.


I guess this is where the PGAS people are challenging things.
Applications can be couched in terms of array algorithms which can be
scattered across distributed memory systems.  Inappropriate operations
lead to huge inefficiencies, but handles correctly, code runs very
fast.

About MPI I think that many don't see what MPI really does, mpi  
offers

a simplified parallel model.
The main weakness of this model is that it assumes some kind of
reliability, but then it offers
a clear computational model with processors ordered in a linear of
higher 

Re: Thoughts on parallel programming?

2010-11-12 Thread Fawzi Mohamed

On 11-nov-10, at 20:10, Russel Winder wrote:


On Thu, 2010-11-11 at 18:24 +0100, Tobias Pfaff wrote:
[ . . . ]

Unfortunately I only know about the standard stuff, OpenMP/OpenCL...
Speaking of which: Are there any attempts to support lightweight
multithreading in D, that is, something like OpenMP ?


I'd hardly call OpenMP lightweight.  I agree that as a meta-notation  
for

directing the compiler how to insert appropriate code to force
multithreading of certain classes of code, using OpenMP generally  
beats
manual coding of the threads.  But OpenMP is very Fortran oriented  
even

though it can be useful for C, and indeed C++ as well.

However, given things like Threading Building Blocks (TBB) and the
functional programming inspired techniques used by Chapel, OpenMP
increasingly looks like a hack rather than a solution.


I agree. I think that TBB offers primitives for many kinds of
parallelization, and is cleaner and more flexible than OpenMP, but in my
opinion it has a big weakness: it cannot cope well with independent tasks.
Coping well with both nested parallelism and independent tasks is crucial
to having a generic solution that can be applied to several problems.

As far as I know this is also missing from Chapel.
I think that a solution that copes well with both nested parallelism and
independent tasks is an excellent starting point on which to build almost
all other higher-level parallelization schemes.
It is important to handle this centrally, because the number of threads
one spawns should ideally stay limited to the number of execution units.
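
As a concrete illustration of that last point (and not of how TBB, Chapel
or blip actually schedule things), here is a small D sketch in which one
fixed pool, sized to the machine's execution units, serves both an
independent background task and a data-parallel loop; the names follow the
std.parallelism module discussed elsewhere in this thread, as documented
for the version that later went to Phobos:

import std.parallelism;
import std.range : iota;

// A stand-in for an unrelated, independent piece of work.
int backgroundJob()
{
    int sum = 0;
    foreach (i; 0 .. 1_000_000)
        sum += i & 1;
    return sum;
}

void main()
{
    // One shared pool, bounded by the number of execution units, serves both
    // the independent task and the data-parallel loop below, so mixing the
    // two kinds of work does not blow up the thread count.
    auto bg = task!backgroundJob();
    taskPool.put(bg);                              // independent task

    auto squares = new ulong[](100_000);
    foreach (i; parallel(iota(squares.length)))    // data-parallel work
        squares[i] = cast(ulong) i * i;

    auto bgResult = bg.yieldForce();               // join the independent task
}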




Re: Thoughts on parallel programming?

2010-11-12 Thread Don

Sean Kelly wrote:

Walter Bright Wrote:


Russel Winder wrote:

At the heart of all this is that programmers are taught that algorithm
is a sequence of actions to achieve a goal.  Programmers are trained to
think sequentially and this affects their coding.  This means that
parallelism has to be expressed at a sufficiently high level that
programmers can still reason about algorithms as sequential things. 
I think it's more than being trained to think sequentially. I think it is in the 
inherent nature of how we think.


Distributed programming is essentially a bunch of little sequential program 
that interact, which is basically how people cooperate in the real world.  I 
think that is by far the most intuitive of any concurrent programming model, 
though it's still a significant conceptual shift from the traditional 
monolithic imperative program.


The Erlang people seem to say that a lot. The thing they omit to say, 
though, is that it is very, very difficult in the real world!
Consider managing a team of ten people. Getting them to be ten times as 
productive as a single person is extremely difficult -- virtually 
impossible, in fact.


I agree with Walter -- I don't think it's got much to do with programmer 
training. It's a problem that hasn't been solved in the real world in 
the general case.


The analogy with the real world suggests to me that there are three 
cases that work well:

* massively parallel;
* _completely_ independent tasks; and
* very small teams.

Large teams are a management nightmare, and I see no reason to believe 
that wouldn't hold true for a large number of cores as well.


Re: Thoughts on parallel programming?

2010-11-11 Thread Russel Winder
On Thu, 2010-11-11 at 02:24 +, jfd wrote:
 Any thoughts on parallel programming.  I was looking at something about Chapel
 and X10 languages etc. for parallelism, and it looks interesting.  I know that
 it is still an area of active research, and it is not yet (far from?) done,
 but anyone have thoughts on this as future direction?  Thank you.

Any programming language that cannot be used to program applications
running on a heterogeneous collection of processors, including CPUs and
GPUs as computational devices, on a single chip, with there being many
such chips on a board, possibly clustered, doesn't have much of a
future.  Timescale 5--10 years.

Intel's 80-core, 48-core and 50-core devices show the way server,
workstation and laptop architectures are going.  There may be a large
central memory unit as now, but it will be secondary storage not primary
storage.  All the chip architectures are shifting to distributed memory
-- basically cache coherence is too hard a problem to solve, so instead
of solving it, they are getting rid of it.  Also the memory bus stops
being the bottleneck for computations, which is actually the biggest
problem with current architectures.

Windows, Linux and Mac OS X have a serious problem and will either die
or be revolutionized.  Apple at least recognize the issue, hence they
pushed OpenCL.

Actor model, CSP, dataflow, and similar distributed memory/process-based
architectures will become increasingly important for software.  There
will be an increasing move to declarative expression, but I doubt
functional languages will ever make the mainstream.  The issue here is
that parallelism generally requires programmers not to try and tell the
computer every detail of how to do something, but instead specify the start
and end conditions and allow the runtime system to handle the
realization of the transformation.  Hence the move in Fortran from lots
of do loops to whole array operations.
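
D already has a whole-array notation for exactly this kind of code.  As a
small illustration (making no claim about how well any particular compiler
vectorises or parallelises it today), compare the explicit loop with the
array-operation form:

// Explicit 1960s-style element loop: the compiler has to rediscover that
// there is no loop-carried dependence before it can do anything clever.
void axpyLoop(double alpha, const(double)[] x, double[] y)
{
    foreach (i; 0 .. y.length)
        y[i] += x[i] * alpha;
}

// Whole-array form: the independence of the elements is part of the
// notation, leaving the implementation free to vectorise or parallelise.
void axpyArrayOp(double alpha, const(double)[] x, double[] y)
{
    y[] += x[] * alpha;
}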

MPI and all the SPMD approaches have a severely limited future, but I
bet the HPC codes are still using Fortran and MPI in 50 years time.

You mentioned Chapel and X10, but don't forget the other one of the
original three HPCS projects, Fortress.  Whilst all three are PGAS
(partitioned global address space) languages, Fortress takes a very
different viewpoint compared to Chapel and X10.

The summary of the summary is:  programmers will either be developing
parallelism systems or they will be unemployed.

<shameless-plug>
To hear more, I am doing a session on all this stuff for ACCU London
2010-11-18 18:30+00:00
http://skillsmatter.com/event/java-jee/java-python-ruby-linux-windows-are-all-doomed
</shameless-plug>

-- 
Russel.
=
Dr Russel Winder  t: +44 20 7585 2200   voip: sip:russel.win...@ekiga.net
41 Buckmaster Roadm: +44 7770 465 077   xmpp: rus...@russel.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder




Re: Thoughts on parallel programming?

2010-11-11 Thread Fawzi Mohamed

On 11-nov-10, at 09:58, Russel Winder wrote:


On Thu, 2010-11-11 at 02:24 +, jfd wrote:
Any thoughts on parallel programming.  I was looking at something  
about Chapel
and X10 languages etc. for parallelism, and it looks interesting.   
I know that
it is still an area of active research, and it is not yet (far  
from?) done,

but anyone have thoughts on this as future direction?  Thank you.


Any programming language that cannot be used to program applications
running on a heterogeneous collection of processors, including CPUs  
and

GPUs as computational devices, on a single chip, with there being many
such chips on a board, possibly clustered, doesn't have much of a
future.  Timescale 5--10 years.


On this I am not so sure: heterogeneous clusters are more difficult to
program, and GPUs & co are slowly becoming more and more general purpose.
Being able to take advantage of them is useful, but I am not
convinced they are necessarily the future.



Intel's 80-core, 48-core and 50-core devices show the way server,
workstation and laptop architectures are going.  There may be a large
central memory unit as now, but it will be secondary storage not  
primary
storage.  All the chip architectures are shifting to distributed  
memory
-- basically cache coherence is too hard a problem to solve, so  
instead

of solving it, they are getting rid of it.  Also the memory bus stops
being the bottleneck for computations, which is actually the biggest
problem with current architectures.


Yes, many-core is the future, I agree on this, and also that a distributed
approach is the only way to scale to a really large number of
processors.
But distributed systems *are* more complex, so I think that for the
foreseeable future one will have a hybrid approach.



Windows, Linux and Mac OS X have a serious problem and will either die
or be revolutionized.  Apple at least recognize the issue, hence they
pushed OpenCL.


Again, I'm not sure the situation is as dire as you paint it; Linux does
quite well in the HPC field... but I agree that to be the ideal OS for
these architectures it will need more changes.


Actor model, CSP, dataflow, and similar distributed memory/process- 
based

architectures will become increasingly important for software.  There
will be an increasing move to declarative expression, but I doubt
functional languages will ever make the main stream.  The issue here  
is
that parallelism generally requires programmers not to try and tell  
the
computer every detail how to do something, but instead specify the  
start

and end conditions and allow the runtime system to handle the
realization of the transformation.  Hence the move in Fortran from  
lots

of do loops to whole array operations.


Whole array operations are useful, and when possible one gains much by
using them; unfortunately not all problems can be reduced to a few large
array operations, which is why data-parallel languages are not the main
type of language.



MPI and all the SPMD approaches have a severely limited future, but I
bet the HPC codes are still using Fortran and MPI in 50 years time.


Well, whole array operations are a generalization of the SPMD approach,
so in this sense you said that that kind of approach will have a future
(but with a more difficult optimization, as the hardware is more complex).


About MPI, I think that many don't see what MPI really does: MPI offers
a simplified parallel model.
The main weakness of this model is that it assumes some kind of
reliability, but in return it offers
a clear computational model, with processors ordered in a linear or
higher-dimensional structure, and efficient collective communication
primitives.
Yes, MPI is not the right choice for all problems, but when usable it
is very powerful, often superior to the alternatives, and programming
with it is *simpler* than thinking about a generic distributed system.
So I think that for problems that are not trivially parallel or
easily parallelizable, MPI will remain the best choice.



You mentioned Chapel and X10, but don't forget the other one of the
original three HPCS projects, Fortress.  Whilst all three are PGAS
(partitioned global address space) languages, Fortress takes a very
different viewpoint compared to Chapel and X10.


It might be a personal thing, but I am kind of suspicious toward
PGAS; I find a generalized MPI model better than PGAS when you want to
have separate address spaces.
Using MPI one can define a PGAS-like object, wrapping local storage
with an object that sends remote requests to access remote memory
pieces.
This means having a local server where these wrapped objects can be
published and which can respond at any moment to external requests. I
call this RPC (remote procedure call) and it can be realized easily on
top of MPI.
As not all objects are distributed, and in a complex program it does
not always make sense to distribute these objects on all processors
or none, I find 

Re: Thoughts on parallel programming?

2010-11-11 Thread Fawzi Mohamed


On 11-nov-10, at 15:16, Fawzi Mohamed wrote:


On 11-nov-10, at 09:58, Russel Winder wrote:


On Thu, 2010-11-11 at 02:24 +, jfd wrote:
Any thoughts on parallel programming.  I was looking at something  
about Chapel
and X10 languages etc. for parallelism, and it looks interesting.   
I know that
it is still an area of active research, and it is not yet (far  
from?) done,

but anyone have thoughts on this as future direction?  Thank you.


I just finished reading "Parallel Programmability and the Chapel
Language" by Chamberlain, Callahan and Zima.

A very nice read, and an overview of several languages and approaches.
Still, I stand by my earlier view: an MPI-like approach is more
flexible, but indeed having a nice parallel implementation of
distributed arrays (which on MPI one can have using Global Arrays, for
example) can be very useful.
I think that a language like D can hide these behind wrapper objects,
and reach, for these objects (which are not the only ones present in a
complex parallel program), an expressivity similar to Chapel using the
approach I have in blip.
A direct implementation might be more efficient on shared memory
machines, though.


Re: Thoughts on parallel programming?

2010-11-11 Thread Fawzi Mohamed


On 11-nov-10, at 15:16, Fawzi Mohamed wrote:


On 11-nov-10, at 09:58, Russel Winder wrote:


MPI and all the SPMD approaches have a severely limited future, but I
bet the HPC codes are still using Fortran and MPI in 50 years time.


well whole array operations are a generalization of the SPMD  
approach, so I this sense you said that that kind of approach will  
have a future (but with a more difficult optimization as the  
hardware is more complex.


Sorry, I read that as SIMD, not SPMD, but the answer below still
holds in my opinion: if one has a complex parallel problem, MPI is a
worthy contender; the thing is that on many occasions one doesn't need
all its power.
If a client/server, a distributed, or a map/reduce approach works, then
simpler and more flexible solutions are superior.
That (and its reliability problem, which PGAS also shares) is, in my
opinion, the reason MPI is not much used outside the computational
community.
Being able to also tackle MPMD in a good way can be useful, and that
is what the RPC level does between computers, and the event-based
scheduling within a single computer (ensuring that one processor can
do meaningful work while another waits).


About MPI I think that many don't see what MPI really does, mpi  
offers a simplified parallel model.
The main weakness of this model is that it assumes some kind of  
reliability, but then it offers
a clear computational model with processors ordered in a linear of  
higher dimensional structure and efficient collective communication  
primitives.
Yes MPI is not the right choice for all problems, but when usable it  
is very powerful, often superior to the alternatives, and  
programming with it is *simpler* than thinking about a generic  
distributed system.
So I think that for problems that are not trivially parallel, or  
easily parallelizable MPI will remain as the best choice.




Re: Thoughts on parallel programming?

2010-11-11 Thread Tobias Pfaff

On 11/11/2010 03:24 AM, jfd wrote:

Any thoughts on parallel programming.  I was looking at something about Chapel
and X10 languages etc. for parallelism, and it looks interesting.  I know that
it is still an area of active research, and it is not yet (far from?) done,
but anyone have thoughts on this as future direction?  Thank you.


Unfortunately I only know about the standard stuff, OpenMP/OpenCL...
Speaking of which: Are there any attempts to support lightweight 
multithreading in D, that is, something like OpenMP ?


Thanks!


Re: Thoughts on parallel programming?

2010-11-11 Thread Trass3r

Unfortunately I only know about the standard stuff, OpenMP/OpenCL...
Speaking of which: Are there any attempts to support lightweight  
multithreading in D, that is, something like OpenMP ?


That would require compiler support for it.
Other than that, there only seems to be dsimcha's std.parallelism.


Re: Thoughts on parallel programming?

2010-11-11 Thread Tobias Pfaff

On 11/11/2010 07:01 PM, Trass3r wrote:

Unfortunately I only know about the standard stuff, OpenMP/OpenCL...
Speaking of which: Are there any attempts to support lightweight
multithreading in D, that is, something like OpenMP ?


That would require compiler support for it.
Other than that there only seems to be dsimcha's std.parallelism


Ok, that's what I suspected.
std.parallelism doesn't look too bad though, I'll play around with that...


Re: Thoughts on parallel programming?

2010-11-11 Thread Russel Winder
On Thu, 2010-11-11 at 18:24 +0100, Tobias Pfaff wrote:
[ . . . ]
 Unfortunately I only know about the standard stuff, OpenMP/OpenCL...
 Speaking of which: Are there any attempts to support lightweight 
 multithreading in D, that is, something like OpenMP ?

I'd hardly call OpenMP lightweight.  I agree that as a meta-notation for
directing the compiler how to insert appropriate code to force
multithreading of certain classes of code, using OpenMP generally beats
manual coding of the threads.  But OpenMP is very Fortran oriented even
though it can be useful for C, and indeed C++ as well.

However, given things like Threading Building Blocks (TBB) and the
functional programming inspired techniques used by Chapel, OpenMP
increasingly looks like a hack rather than a solution.

Using parallel versions of for, map, filter, reduce in the language is
probably a better way forward.

Having a D binding to OpenCL (and OpenGL, MPI, etc.) is probably going
to be a good thing.

-- 
Russel.
=
Dr Russel Winder  t: +44 20 7585 2200   voip: sip:russel.win...@ekiga.net
41 Buckmaster Roadm: +44 7770 465 077   xmpp: rus...@russel.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder




Re: Thoughts on parallel programming?

2010-11-11 Thread Russel Winder
On Thu, 2010-11-11 at 15:16 +0100, Fawzi Mohamed wrote:
[ . . . ]
 on this I am not so sure, heterogeneous clusters are more difficult to  
 program, and GPU  co are slowly becoming more and more general purpose.
 Being able to take advantage of those is useful, but I am not  
 convinced they are necessarily the future.

The Intel roadmap is for processor chips that have a number of cores
with different architectures.  Heterogeneity is not going to be a
choice, it is going to be an imposition.  And this is at bus level, not
at cluster level. 

[ . . . ]
 yes many core is the future I agree on this, and also that distributed  
 approach is the only way to scale to a really large number of  
 processors.
 Bud distributed systems *are* more complex, so I think that for the  
 foreseeable future one will have a hybrid approach.

Hybrid is what I am saying is the future whether we like it or not.  SMP
as the whole system is the past.

I disagree that distributed systems are more complex per se.  I suspect
comments are getting so general here that anything anyone writes can be
seen as both true and false simultaneously.  My perception is that
shared memory multithreading is less and less a tool that applications
programmers should be thinking in terms of.  Multiple processes with an
hierarchy of communications costs is the overarching architecture with
each process potentially being SMP or CSP or . . .  

 again not sure the situation is as dire as you paint it, Linux does  
 quite well in the HPC field... but I agree that to be the ideal OS for  
 these architectures it will need more changes.

The Linux driver architecture is already creaking at the seams, it
implies a central monolithic approach to operating system.  This falls
down in a multiprocessor shared memory context.  The fact that the Top
500 generally use Linux is because it is the least worst option.  M$
despite throwing large amounts of money at the problem, and indeed
bought some very high profile names to try and do something about the
lack of traction, have failed to make any headway in the HPC operating
system stakes.  Do you want to have to run a virus checker on your HPC
system?

My gut reaction is that we are going to see a rise of hypervisors as per
Tilera chips, at least in the short to medium term, simply as a bridge
from the now OSes to the future.  My guess is that L4 microkernels
and/or nanokernels, exokernels, etc. will find a central place in future
systems.  The problem to be solved is ensuring that the appropriate ABI
is available on the appropriate core at the appropriate time.  Mobility
of ABI is the critical factor here.  

[ . . . ]
 Whole array operation are useful, and when possible one gains much  
 using them, unfortunately not all problems can be reduced to few large  
 array operations, data parallel languages are not the main type of  
 language for these reasons.

Agreed.  My point was that in 1960s code people explicitly handled array
operations using do loops because they had to.  Nowadays such code is
anathema to efficient execution.  My complaint here is that people have
put effort into compiler technology instead of rewriting the codes in a
better language and/or idiom.  Clearly whole array operations only apply
to algorithms that involve arrays!

[ . . . ]
 well whole array operations are a generalization of the SPMD approach,  
 so I this sense you said that that kind of approach will have a future  
 (but with a more difficult optimization as the hardware is more complex.

I guess this is where the PGAS people are challenging things.
Applications can be couched in terms of array algorithms which can be
scattered across distributed memory systems.  Inappropriate operations
lead to huge inefficiencies, but handled correctly, code runs very
fast. 

 About MPI I think that many don't see what MPI really does, mpi offers  
 a simplified parallel model.
 The main weakness of this model is that it assumes some kind of  
 reliability, but then it offers
 a clear computational model with processors ordered in a linear of  
 higher dimensional structure and efficient collective communication  
 primitives.
 Yes MPI is not the right choice for all problems, but when usable it  
 is very powerful, often superior to the alternatives, and programming  
 with it is *simpler* than thinking about a generic distributed system.
 So I think that for problems that are not trivially parallel, or  
 easily parallelizable MPI will remain as the best choice.

I guess my main irritant with MPI is that I have to run the same
executable on every node and, perhaps more importantly, the message
passing structure is founded on Fortran primitive data types.  OK so you
can hack up some element of abstraction so as to send complex messages,
but it would be far better if the MPI standard provided better
abstractions. 

[ . . . ]
 It might be a personal thing, but I am kind of suspicious toward  
 PGAS, I find a generalized MPI model 

Re: Thoughts on parallel programming?

2010-11-11 Thread Sean Kelly
Tobias Pfaff Wrote:

 On 11/11/2010 03:24 AM, jfd wrote:
  Any thoughts on parallel programming.  I was looking at something about 
  Chapel
  and X10 languages etc. for parallelism, and it looks interesting.  I know 
  that
  it is still an area of active research, and it is not yet (far from?) done,
  but anyone have thoughts on this as future direction?  Thank you.
 
 Unfortunately I only know about the standard stuff, OpenMP/OpenCL...
 Speaking of which: Are there any attempts to support lightweight 
 multithreading in D, that is, something like OpenMP ?

I've considered backing spawn() calls by fibers multiplexed by a thread pool 
(receive() calls would cause the fiber to yield) instead of having each call 
generate a new kernel thread.  The only issue is that TLS (i.e. non-shared
static storage) is thread-local, not fiber-local.  One idea, however, is to do 
OSX-style manual TLS inside Fiber, so each fiber would have its own automatic 
local storage.  Perhaps as an experiment I'll create a new derivative of Fiber 
that does this and see how it works.
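
Not the OSX-style TLS design described above, but as a back-of-the-envelope
sketch of what per-fiber state can look like with today's core.thread.Fiber
-- the LocalFiber class and its logContext field are invented purely for
illustration:

import core.thread;
import std.stdio;

// A fiber that carries its own state instead of relying on thread-local
// (static) storage, which is shared by every fiber running on that thread.
class LocalFiber : Fiber
{
    string logContext;   // hypothetical per-fiber value

    this(void function() fn) { super(fn); }
    this(void delegate() dg) { super(dg); }

    // The LocalFiber currently executing on this thread, if any.
    static LocalFiber current()
    {
        return cast(LocalFiber) Fiber.getThis();
    }
}

void worker()
{
    auto self = LocalFiber.current();
    writeln("running with context: ", self.logContext);
    Fiber.yield();
    writeln("resumed with context: ", self.logContext);
}

void main()
{
    auto a = new LocalFiber(&worker);
    auto b = new LocalFiber(&worker);
    a.logContext = "request-1";
    b.logContext = "request-2";

    // Both fibers are multiplexed on this one thread, each seeing only
    // its own state.
    a.call(); b.call();
    a.call(); b.call();
}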


Re: Thoughts on parallel programming?

2010-11-11 Thread retard
Thu, 11 Nov 2010 19:41:56 +, Russel Winder wrote:

 On Thu, 2010-11-11 at 15:16 +0100, Fawzi Mohamed wrote: [ . . . ]
 on this I am not so sure, heterogeneous clusters are more difficult to
 program, and GPU  co are slowly becoming more and more general
 purpose. Being able to take advantage of those is useful, but I am not
 convinced they are necessarily the future.
 
 The Intel roadmap is for processor chips that have a number of cores
 with different architectures.  Heterogeneity is not going going to be a
 choice, it is going to be an imposition.  And this is at bus level, not
 at cluster level.
 
 [ . . . ]
 yes many core is the future I agree on this, and also that distributed
 approach is the only way to scale to a really large number of
 processors.
 Bud distributed systems *are* more complex, so I think that for the
 foreseeable future one will have a hybrid approach.
 
 Hybrid is what I am saying is the future whether we like it or not.  SMP
 as the whole system is the past.
 
 I disagree that distributed systems are more complex per se.  I suspect
 comments are getting so general here that anything anyone writes can be
 seen as both true and false simultaneously.  My perception is that
 shared memory multithreading is less and less a tool that applications
 programmers should be thinking in terms of.  Multiple processes with an
 hierarchy of communications costs is the overarching architecture with
 each process potentially being SMP or CSP or . . .
 
 again not sure the situation is as dire as you paint it, Linux does
 quite well in the HPC field... but I agree that to be the ideal OS for
 these architectures it will need more changes.
 
 The Linux driver architecture is already creaking at the seams, it
 implies a central monolithic approach to operating system.  This falls
 down in a multiprocessor shared memory context.  The fact that the Top
 500 generally use Linux is because it is the least worst option.  M$
 despite throwing large amounts of money at the problem, and indeed
 bought some very high profile names to try and do something about the
 lack of traction, have failed to make any headway in the HPC operating
 system stakes.  Do you want to have to run a virus checker on your HPC
 system?
 
 My gut reaction is that we are going to see a rise of hypervisors as per
 Tilera chips, at least in the short to medium term, simply as a bridge
 from the now OSes to the future.  My guess is that L4 microkernels
 and/or nanokernels, exokernels, etc. will find a central place in future
 systems.  The problem to be solved is ensuring that the appropriate ABI
 is available on the appropriate core at the appropriate time.  Mobility
 of ABI is the critical factor here.
 
 [ . . . ]
 Whole array operation are useful, and when possible one gains much
 using them, unfortunately not all problems can be reduced to few large
 array operations, data parallel languages are not the main type of
 language for these reasons.
 
 Agreed.  My point was that in 1960s code people explicitly handled array
 operations using do loops because they had to.  Nowadays such code is
 anathema to efficient execution.  My complaint here is that people have
 put effort into compiler technology instead of rewriting the codes in a
 better language and/or idiom.  Clearly whole array operations only apply
 to algorithms that involve arrays!
 
 [ . . . ]
 well whole array operations are a generalization of the SPMD approach,
 so I this sense you said that that kind of approach will have a future
 (but with a more difficult optimization as the hardware is more
 complex.
 
 I guess this is where the PGAS people are challenging things.
 Applications can be couched in terms of array algorithms which can be
 scattered across distributed memory systems.  Inappropriate operations
 lead to huge inefficiencies, but handles correctly, code runs very fast.
 
 About MPI I think that many don't see what MPI really does, mpi offers
 a simplified parallel model.
 The main weakness of this model is that it assumes some kind of
 reliability, but then it offers
 a clear computational model with processors ordered in a linear of
 higher dimensional structure and efficient collective communication
 primitives.
 Yes MPI is not the right choice for all problems, but when usable it is
 very powerful, often superior to the alternatives, and programming with
 it is *simpler* than thinking about a generic distributed system. So I
 think that for problems that are not trivially parallel, or easily
 parallelizable MPI will remain as the best choice.
 
 I guess my main irritant with MPI is that I have to run the same
 executable on every node and, perhaps more importantly, the message
 passing structure is founded on Fortran primitive data types.  OK so you
 can hack up some element of abstraction so as to send complex messages,
 but it would be far better if the MPI standard provided better
 abstractions.
 
 [ . . . ]
 It might be a personal thing, but I 

Re: Thoughts on parallel programming?

2010-11-11 Thread retard
Thu, 11 Nov 2010 20:01:09 +, retard wrote:

 in CPUs the
 problems with programmability are slowing things down and many laptops
 are still dual-core despite multiple cores are more energy efficient
 than higher GHz and my home PC has 8 virtual cores in a single CPU.

At least it seems so to me. My last 1- and 2-core systems had a TDP of 65
and 105W. Now it's 130W, and the next gen has 12 cores and a 130W TDP.

So I currently have 8 CPU cores and 480 GPU cores. Unfortunately many
open source applications don't use the GPU (maybe OpenGL 1.0, but usually
software rendering; the GPU-accelerated desktops are still buggy and
crash-prone) and are single threaded. Even some heavier tasks like video
encoding use cores very inefficiently. Would MPI help?


Re: Thoughts on parallel programming?

2010-11-11 Thread sybrandy

On 11/11/2010 02:41 PM, Sean Kelly wrote:

Tobias Pfaff Wrote:


On 11/11/2010 03:24 AM, jfd wrote:

Any thoughts on parallel programming.  I was looking at something about Chapel
and X10 languages etc. for parallelism, and it looks interesting.  I know that
it is still an area of active research, and it is not yet (far from?) done,
but anyone have thoughts on this as future direction?  Thank you.


Unfortunately I only know about the standard stuff, OpenMP/OpenCL...
Speaking of which: Are there any attempts to support lightweight
multithreading in D, that is, something like OpenMP ?


I've considered backing spawn() calls by fibers multiplexed by a thread pool 
(receive() calls would cause the fiber to yield) instead of having each call 
generate a new kernel thread.  The only issue is that TLS (ie. non-shared 
static storage) is thread-local, not fiber-local.  One idea, however, is to do 
OSX-style manual TLS inside Fiber, so each fiber would have its own automatic 
local storage.  Perhaps as an experiment I'll create a new derivative of Fiber 
that does this and see how it works.


I actually did something similar for a very simple web server I was
experimenting with.  It is similar to how Erlang works, in that Erlang
processes are, at least to me, similar to fibers, and they are run
in one of several threads in the interpreter.


The only problem I had was ensuring that my logging was thread-safe.  If 
you could implement a TLS-like system for Fibers, I think that would 
help prevent that issue.
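
One way to sidestep that particular locking problem (just a sketch, not what
the web server above did) is to give logging its own thread and have
everything else send it messages, so only one context ever touches the log:

import std.concurrency;
import std.stdio;

// A single logger thread owns the log; workers only ever send() to it,
// so no explicit locking is needed in the rest of the program.
void loggerLoop()
{
    bool done = false;
    while (!done)
    {
        receive(
            (string line) { writeln(line); },     // real code would write to a file
            (OwnerTerminated _) { done = true; }  // stop when the owner goes away
        );
    }
}

void main()
{
    auto logger = spawn(&loggerLoop);
    logger.send("worker 1: starting");
    logger.send("worker 2: starting");
    // ... any thread can keep send()ing lines from here on ...
}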


Casey


Re: Thoughts on parallel programming?

2010-11-11 Thread Trass3r

Having a D binding to OpenCL is probably going to be a good thing.


http://bitbucket.org/trass3r/cl4d/wiki/Home


Re: Thoughts on parallel programming?

2010-11-11 Thread Walter Bright

Russel Winder wrote:

Agreed.  My point was that in 1960s code people explicitly handled array
operations using do loops because they had to.  Nowadays such code is
anathema to efficient execution.  My complaint here is that people have
put effort into compiler technology instead of rewriting the codes in a
better language and/or idiom.  Clearly whole array operations only apply
to algorithms that involve arrays!


Yup. I am bemused by the efforts put into analyzing loops so that they can (by 
the compiler) be re-written into a higher level construct, and then the higher 
level construct is compiled.


It is just backwards from what the compiler should be doing. The high-level
construct is what the programmer should be writing; it shouldn't be something
the compiler reconstructs from low-level source code.


Re: Thoughts on parallel programming?

2010-11-11 Thread Gary Whatmore
retard Wrote:

 Thu, 11 Nov 2010 19:41:56 +, Russel Winder wrote:
 
  On Thu, 2010-11-11 at 15:16 +0100, Fawzi Mohamed wrote: [ . . . ]
  on this I am not so sure, heterogeneous clusters are more difficult to
  program, and GPU  co are slowly becoming more and more general
  purpose. Being able to take advantage of those is useful, but I am not
  convinced they are necessarily the future.
  
  The Intel roadmap is for processor chips that have a number of cores
  with different architectures.  Heterogeneity is not going going to be a
  choice, it is going to be an imposition.  And this is at bus level, not
  at cluster level.
  
  [ . . . ]
  yes many core is the future I agree on this, and also that distributed
  approach is the only way to scale to a really large number of
  processors.
  Bud distributed systems *are* more complex, so I think that for the
  foreseeable future one will have a hybrid approach.
  
  Hybrid is what I am saying is the future whether we like it or not.  SMP
  as the whole system is the past.
  
  I disagree that distributed systems are more complex per se.  I suspect
  comments are getting so general here that anything anyone writes can be
  seen as both true and false simultaneously.  My perception is that
  shared memory multithreading is less and less a tool that applications
  programmers should be thinking in terms of.  Multiple processes with an
  hierarchy of communications costs is the overarching architecture with
  each process potentially being SMP or CSP or . . .
  
  again not sure the situation is as dire as you paint it, Linux does
  quite well in the HPC field... but I agree that to be the ideal OS for
  these architectures it will need more changes.
  
  The Linux driver architecture is already creaking at the seams, it
  implies a central monolithic approach to operating system.  This falls
  down in a multiprocessor shared memory context.  The fact that the Top
  500 generally use Linux is because it is the least worst option.  M$
  despite throwing large amounts of money at the problem, and indeed
  bought some very high profile names to try and do something about the
  lack of traction, have failed to make any headway in the HPC operating
  system stakes.  Do you want to have to run a virus checker on your HPC
  system?
  
  My gut reaction is that we are going to see a rise of hypervisors as per
  Tilera chips, at least in the short to medium term, simply as a bridge
  from the now OSes to the future.  My guess is that L4 microkernels
  and/or nanokernels, exokernels, etc. will find a central place in future
  systems.  The problem to be solved is ensuring that the appropriate ABI
  is available on the appropriate core at the appropriate time.  Mobility
  of ABI is the critical factor here.
  
  [ . . . ]
  Whole array operation are useful, and when possible one gains much
  using them, unfortunately not all problems can be reduced to few large
  array operations, data parallel languages are not the main type of
  language for these reasons.
  
  Agreed.  My point was that in 1960s code people explicitly handled array
  operations using do loops because they had to.  Nowadays such code is
  anathema to efficient execution.  My complaint here is that people have
  put effort into compiler technology instead of rewriting the codes in a
  better language and/or idiom.  Clearly whole array operations only apply
  to algorithms that involve arrays!
  
  [ . . . ]
  well whole array operations are a generalization of the SPMD approach,
  so I this sense you said that that kind of approach will have a future
  (but with a more difficult optimization as the hardware is more
  complex.
  
  I guess this is where the PGAS people are challenging things.
  Applications can be couched in terms of array algorithms which can be
  scattered across distributed memory systems.  Inappropriate operations
  lead to huge inefficiencies, but handles correctly, code runs very fast.
  
  About MPI I think that many don't see what MPI really does, mpi offers
  a simplified parallel model.
  The main weakness of this model is that it assumes some kind of
  reliability, but then it offers
  a clear computational model with processors ordered in a linear of
  higher dimensional structure and efficient collective communication
  primitives.
  Yes MPI is not the right choice for all problems, but when usable it is
  very powerful, often superior to the alternatives, and programming with
  it is *simpler* than thinking about a generic distributed system. So I
  think that for problems that are not trivially parallel, or easily
  parallelizable MPI will remain as the best choice.
  
  I guess my main irritant with MPI is that I have to run the same
  executable on every node and, perhaps more importantly, the message
  passing structure is founded on Fortran primitive data types.  OK so you
  can hack up some element of abstraction so as to send complex messages,
  but it would be 

Re: Thoughts on parallel programming?

2010-11-11 Thread Walter Bright

Russel Winder wrote:

At the heart of all this is that programmers are taught that an algorithm
is a sequence of actions to achieve a goal.  Programmers are trained to
think sequentially and this affects their coding.  This means that
parallelism has to be expressed at a sufficiently high level that
programmers can still reason about algorithms as sequential things. 


I think it's more than being trained to think sequentially. I think it is in the 
inherent nature of how we think.


Re: Thoughts on parallel programming?

2010-11-11 Thread bearophile
Walter:

 Yup. I am bemused by the efforts put into analyzing loops so that they can 
 (by 
 the compiler) be re-written into a higher level construct, and then the 
 higher 
 level construct is compiled.
 
 It just is backwards what the compiler should be doing. The high level 
 construct 
 is what the programmer should be writing. It shouldn't be something the 
 compiler 
 reconstructs from low level source code.

I agree a lot. The language has to offer the means to express all the semantics and
constraints: that the arrays are disjoint, that the operations done on them
are pure or not pure, that the operations are not pure but determined only by a
small window into the arrays, and so on. And then the compiler has to
optimize the code according to the presence of SIMD registers, multiple cores,
etc. This may not be enough for maximum-performance applications, but in most
situations it's plenty. (Incidentally, this is largely what the Chapel
language does (and D doesn't), and what I have explained in two past posts
about Chapel, that were mostly ignored.)
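
To make that concrete with what D can already express today (only a partial
illustration: the SIMD part would still be up to the compiler, and the
parallel loop uses the std.parallelism module discussed elsewhere in this
thread):

import std.parallelism;
import std.range : iota;

// pure + const: the kernel reads only a small window of the input and has
// no side effects, so every output element is independent of the others.
pure float smooth3(const(float)[] a, size_t i)
{
    immutable lo = i == 0 ? i : i - 1;
    immutable hi = i + 1 < a.length ? i + 1 : i;
    return (a[lo] + a[i] + a[hi]) / 3;
}

float[] smoothAll(const(float)[] a)
{
    auto result = new float[](a.length);
    // The independence expressed above is what makes this loop safe to run
    // across cores; a smarter compiler could also vectorise the kernel.
    foreach (i; parallel(iota(a.length)))
        result[i] = smooth3(a, i);
    return result;
}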

Bye,
bearophile


Re: Thoughts on parallel programming?

2010-11-11 Thread retard
Thu, 11 Nov 2010 16:32:03 -0500, bearophile wrote:

 Walter:
 
 Yup. I am bemused by the efforts put into analyzing loops so that they
 can (by the compiler) be re-written into a higher level construct, and
 then the higher level construct is compiled.
 
 It just is backwards what the compiler should be doing. The high level
 construct is what the programmer should be writing. It shouldn't be
 something the compiler reconstructs from low level source code.
 
 I agree a lot. The language has to offer means to express all the
 semantics and constraints, that the arrays are disjointed, that the
 operations done on them are pure or not pure, that the operations are
 not pure but determined only by a small window in the arrays, and so on
 and on. And then the compiler has to optimize the code according to the
 presence of SIMD registers, multi-cores, etc. This maybe is not enough
 for max performance applications, but in most situations it's plenty
 enough. (Incidentally, this is a lot what the Chapel language does (and
 D doesn't), and what I have explained in two past posts about Chapel,
 that were mostly ignored.)

How does Chapel work when I need to sort data (just basic quicksort
on 12 cores, for instance), or e.g. compile many files in parallel, or
encode Xvid? What is the content of the array with Xvid files?
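
I can't speak for how Chapel expresses it, but in D terms one answer for the
sorting case is a task-based divide-and-conquer sort.  The following is only
a rough sketch on top of std.parallelism (naive pivot choice, no production
hardening):

import std.algorithm : sort, swap;
import std.parallelism;

// Rough parallel quicksort: one side of each partition goes onto the task
// pool, the other is handled by the current thread, and small slices fall
// back to the serial sort.
void parallelSort(T)(T[] a, size_t grainSize = 10_000)
{
    if (a.length <= grainSize)
    {
        sort(a);
        return;
    }

    // Lomuto partition around the last element (naive pivot choice).
    auto pivot = a[$ - 1];
    size_t i = 0;
    foreach (j; 0 .. a.length - 1)
        if (a[j] < pivot) { swap(a[i], a[j]); ++i; }
    swap(a[i], a[$ - 1]);

    auto left = task!(parallelSort!T)(a[0 .. i], grainSize);
    taskPool.put(left);                      // left half runs on the pool
    parallelSort(a[i + 1 .. $], grainSize);  // right half runs here
    left.yieldForce();                       // join before returning
}

Calling parallelSort(data) then uses however many workers the default
taskPool provides (normally one per core); independent jobs such as
compiling many files could map more naturally onto a plain parallel foreach
over the work items.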


Re: Thoughts on parallel programming?

2010-11-11 Thread Sean Kelly
Walter Bright Wrote:

 Russel Winder wrote:
  At the heart of all this is that programmers are taught that algorithm
  is a sequence of actions to achieve a goal.  Programmers are trained to
  think sequentially and this affects their coding.  This means that
  parallelism has to be expressed at a sufficiently high level that
  programmers can still reason about algorithms as sequential things. 
 
 I think it's more than being trained to think sequentially. I think it is in 
 the 
 inherent nature of how we think.

Distributed programming is essentially a bunch of little sequential programs
that interact, which is basically how people cooperate in the real world.  I
think that is by far the most intuitive of any concurrent programming model, 
though it's still a significant conceptual shift from the traditional 
monolithic imperative program.
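
In D terms, std.concurrency is the closest expression of that model; a
minimal sketch (the squarer helper is invented here purely for illustration):

import std.concurrency;
import std.stdio;

// Each spawned function is its own little sequential program; the programs
// interact only through messages, never through shared mutable state.
void squarer()
{
    // handle one request, then finish
    receive((Tid requester, int x) { requester.send(x * x); });
}

void main()
{
    auto helper = spawn(&squarer);
    helper.send(thisTid, 6);            // ask the helper to square 6
    auto answer = receiveOnly!int();    // wait, sequentially, for its reply
    writeln("6 squared is ", answer);
}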


Re: Thoughts on parallel programming?

2010-11-11 Thread %u
Sean Kelly Wrote:

 Walter Bright Wrote:
 
  Russel Winder wrote:
   At the heart of all this is that programmers are taught that algorithm
   is a sequence of actions to achieve a goal.  Programmers are trained to
   think sequentially and this affects their coding.  This means that
   parallelism has to be expressed at a sufficiently high level that
   programmers can still reason about algorithms as sequential things. 
  
  I think it's more than being trained to think sequentially. I think it is 
  in the 
  inherent nature of how we think.
 
 Distributed programming is essentially a bunch of little sequential program 
 that interact, which is basically how people cooperate in the real world.  I 
 think that is by far the most intuitive of any concurrent programming model, 
 though it's still a significant conceptual shift from the traditional 
 monolithic imperative program.

Intel promised this AVX instruction set next year. Does it also work like
distributed processes? I hear it doubles your FLOPS. These are exciting times
for parallel computing. Lots of new media for distributed message passing
programming. Lots of little fibers filling the multimedia pipelines with
parallel data. Might even beat the GPU soon if Larrabee comes.


Re: Thoughts on parallel programming?

2010-11-11 Thread Tobias Pfaff

On 11/11/2010 08:10 PM, Russel Winder wrote:

On Thu, 2010-11-11 at 18:24 +0100, Tobias Pfaff wrote:
[ . . . ]

Unfortunately I only know about the standard stuff, OpenMP/OpenCL...
Speaking of which: Are there any attempts to support lightweight
multithreading in D, that is, something like OpenMP ?


I'd hardly call OpenMP lightweight.  I agree that as a meta-notation for
directing the compiler how to insert appropriate code to force
multithreading of certain classes of code, using OpenMP generally beats
manual coding of the threads.  But OpenMP is very Fortran oriented even
though it can be useful for C, and indeed C++ as well.

However, given things like Threading Building Blocks (TBB) and the
functional programming inspired techniques used by Chapel, OpenMP
increasingly looks like a hack rather than a solution.

Using parallel versions of for, map, filter, reduce in the language is
probably a better way forward.

Having a D binding to OpenCL (and OpenGL, MPI, etc.) is probably going
to be a good thing.



Well, I am looking for an easy & efficient way to perform parallel
numerical calculations on our 4-8 core machines. With C++, that's OpenMP
(or GPGPU stuff using CUDA/OpenCL) for us now. Maybe lightweight was the
wrong word; what I meant is that OpenMP is easy to use, and efficient
for the problems we are solving. There might actually be better tools
for that, but honestly we didn't look into that many options -- we are no
HPC guys, 1000-CPU clusters are not a relevant scenario, and we are happy
that we even started parallelizing our code at all :)


Anyways, I was thinking about the logical thing to use in D for this
scenario. It's nothing super-fancy; in most cases just a parallel_for will
do, and sometimes a map/reduce operation...


Cheers,
Tobias


Re: Thoughts on parallel programming?

2010-11-11 Thread dsimcha
== Quote from Tobias Pfaff (nos...@spam.no)'s article
 On 11/11/2010 08:10 PM, Russel Winder wrote:
  On Thu, 2010-11-11 at 18:24 +0100, Tobias Pfaff wrote:
  [ . . . ]
  Unfortunately I only know about the standard stuff, OpenMP/OpenCL...
  Speaking of which: Are there any attempts to support lightweight
  multithreading in D, that is, something like OpenMP ?
 
  I'd hardly call OpenMP lightweight.  I agree that as a meta-notation for
  directing the compiler how to insert appropriate code to force
  multithreading of certain classes of code, using OpenMP generally beats
  manual coding of the threads.  But OpenMP is very Fortran oriented even
  though it can be useful for C, and indeed C++ as well.
 
  However, given things like Threading Building Blocks (TBB) and the
  functional programming inspired techniques used by Chapel, OpenMP
  increasingly looks like a hack rather than a solution.
 
  Using parallel versions of for, map, filter, reduce in the language is
  probably a better way forward.
 
  Having a D binding to OpenCL (and OpenGL, MPI, etc.) is probably going
  to be a good thing.
 
 Well, I am looking for an easy  efficient way to perform parallel
 numerical calculations on our 4-8 core machines. With C++, that's OpenMP
 (or GPGPU stuff using CUDA/OpenCL) for us now. Maybe lightweight was the
 wrong word, what I meant is that OpenMP is easy to use, and efficient
 for the problems we are solving. There actually might be better tools
 for that, honestly we didn't look into that much options -- we are no
 HPC guys, 1000-cpu clusters are not a relevant scenario and we are happy
 that we even started parallelizing our code at all :)
 Anyways, I was thinking about the logical thing to use in D for this
 scenario. It's nothing super-fancy, in cases just a parallel_for we
 will, and sometimes a map/reduce operation...
 Cheers,
 Tobias

I think you'll be very pleased with std.parallelism when/if it gets into Phobos.
The design philosophy is exactly what you're looking for:  Simple shared memory
parallelism on multicore computers, assuming no fancy/unusual OS-, compiler- or
hardware-level infrastructure.  Basically, it's got parallel foreach, parallel
map, parallel reduce and parallel tasks.  All you need to fully utilize it is
DMD and a multicore PC.

As a reminder, the docs are at
http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html and the code is at
http://dsource.org/projects/scrapple/browser/trunk/parallelFuture/std_parallelism.d .
If this doesn't meet your needs in its current form, I'd like as much
constructive criticism as possible, as long as it's within the scope of simple,
everyday parallelism without fancy infrastructure.
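
Going by the documentation linked above, everyday use looks roughly like this
(method names follow the version that eventually landed in Phobos, so the
review-era module may differ in detail):

import std.math : sqrt;
import std.parallelism;

void main()
{
    auto numbers = new double[](1_000_000);

    // Parallel foreach: chunks of the array are processed by pool workers.
    foreach (i, ref x; parallel(numbers))
        x = sqrt(cast(double) i);

    // Parallel map (eager, into a new array) and parallel reduce.
    auto roots = taskPool.amap!sqrt(numbers);
    auto total = taskPool.reduce!"a + b"(numbers);

    // A one-off task: runs on the pool while the caller keeps working.
    auto t = task!((double[] a) => a.length)(numbers);
    taskPool.put(t);
    auto n = t.yieldForce();
}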


Re: Thoughts on parallel programming?

2010-11-11 Thread %u
Gary Whatmore Wrote:

 %u Wrote:
 
  Sean Kelly Wrote:
  
   Walter Bright Wrote:
   
Russel Winder wrote:
 At the heart of all this is that programmers are taught that algorithm
 is a sequence of actions to achieve a goal.  Programmers are trained 
 to
 think sequentially and this affects their coding.  This means that
 parallelism has to be expressed at a sufficiently high level that
 programmers can still reason about algorithms as sequential things. 

I think it's more than being trained to think sequentially. I think it 
is in the 
inherent nature of how we think.
   
   Distributed programming is essentially a bunch of little sequential 
   program that interact, which is basically how people cooperate in the 
   real world.  I think that is by far the most intuitive of any concurrent 
   programming model, though it's still a significant conceptual shift from 
   the traditional monolithic imperative program.
  
  Intel promised this AVX instruction set next year. Does it also work like 
  distributed processes? I hear it doubles your FLOPS. These are exciting 
  times parallel computing. Lots of new medias for distributed message 
  passing programming. Lots of little fibers filling the multimedia pipelines 
  with parallel data. Might even beat GPU soon if Larrabee comes.
 
 AVX isn't parallel programming, it's vector processing. A dying breed of 
 paradigms. Parallel programming deals with concurrency. OpenMP and MPI. 
 Chapel (don't know it, but heard it here). Fortran. These are all good 
 examples. AVX is just a cpu intrinsics stuff in std.intrinsics

Currently the amount of information available is scarce. I have no idea how I 
use AVX or SSE in D. Auto-vectorization? Does it cover all use cases?

So..

SSE & autovectorization & intrinsics = loops, hand-written inline assembly
parts, very small scale
local worker threads / fibers = dsimcha's lib, medium scale
local area network = the great flagship distributed message passing system,
huge clusters with 1000+ computers?

Why is the message passing system so important? Assume I have a dual-core laptop with
AVX instructions next year. Use of 2 threads doubles my processor power. Use of
AVX gives 8 times more power in good loops. I have no cluster, so the flagship
system provides zero benefit.


Re: Thoughts on parallel programming?

2010-11-10 Thread bearophile
jfd:

 Any thoughts on parallel programming.  I was looking at something about Chapel
 and X10 languages etc. for parallelism, and it looks interesting.  I know that
 it is still an area of active research, and it is not yet (far from?) done,
 but anyone have thoughts on this as future direction?  Thank you.

In the past I have posted here two large posts about Chapel; it's a language that
contains several good ideas worth stealing, but my posts were mostly ignored.

Chapel is designed for heavy numerical computing on multi-cores or multi-CPUs;
it has good ideas about CPU-localization of the work, while D isn't very serious
about that kind of parallelism (yet). So far D has instead embraced
message passing, which fits other purposes.

Bye,
bearophile


Re: Thoughts on parallel programming?

2010-11-10 Thread dsimcha
== Quote from jfd (j...@nospam.com)'s article
 Any thoughts on parallel programming.  I was looking at something about Chapel
 and X10 languages etc. for parallelism, and it looks interesting.  I know that
 it is still an area of active research, and it is not yet (far from?) done,
 but anyone have thoughts on this as future direction?  Thank you.

Well, there's my std.parallelism library, which is in review for inclusion in
Phobos.  (http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html,
http://www.dsource.org/projects/scrapple/browser/trunk/parallelFuture/std_parallelism.d)


One unfortunate thing about it is that it doesn't use (and actually bypasses) 
D's
thread isolation system and allows unchecked sharing.  I couldn't think of any 
way
to create a pedal-to-metal parallelism library that was simultaneously useful,
safe and worked with the language as-is, and I wanted something that worked
**now**, not next year or in D3 or whatever, so I decided to omit safety.

Given that the library is in review, now would be the perfect time to offer any
suggestions on how it can be improved.