Hello Wei,

I speak from experience, having written many HPC distributed applications
using Open MPI (C/C++) on x86, PowerPC and Cell B.E. processors, and
Parallel Virtual Machine (PVM) well before that, back in the '90s.  I can
say with absolute certainty:

*Any gains you believe you'll get because "C++ is faster than Java/Scala"
will be completely wiped out by the inordinate amount of time you spend
debugging your code and/or reinventing the wheel for even basic tasks
like linear regression.*
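
To make that concrete: with Spark's MLlib, a distributed linear regression
is a few lines typed into the Scala shell.  This is only a sketch (the HDFS
path and the "label,feature1,feature2,..." data format are made up), but
the MLlib calls themselves are real:

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.{LabeledPoint, LinearRegressionWithSGD}

    // Hypothetical CSV of "label,feature1,feature2,..." sitting on HDFS;
    // sc is the SparkContext the shell creates for you
    val data = sc.textFile("hdfs:///data/points.csv").map { line =>
      val parts = line.split(',').map(_.toDouble)
      LabeledPoint(parts.head, Vectors.dense(parts.tail))
    }.cache()

    // Train a linear model with stochastic gradient descent (100 iterations)
    val model = LinearRegressionWithSGD.train(data, 100)

Compare that with hand-rolling a distributed gradient descent on top of MPI.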


There are undoubtedly some very specialised use cases where MPI and its
brethren still dominate High Performance Computing, for example the nuclear
decay simulations the US Department of Energy runs on supercomputers, where
billions have been invested in solving that one problem.

Spark is part of the wider "Big Data" ecosystem, and its biggest advantages
are traction amongst internet-scale companies, hundreds of developers
contributing to it, and a community of thousands using it.

Need a distributed, fault-tolerant file system? Use HDFS.  Need a
distributed, fault-tolerant message queue? Use Kafka.  Need to coordinate
between your worker processes? Use ZooKeeper.  Need to run it all on a
flexible grid of computing resources and handle failures? Run it on Mesos!
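
And Spark plugs straight into those systems.  Reading from HDFS, for
instance, is just a matter of the URI (the namenode host and path below are
hypothetical):

    // Globs work, and gzip-compressed files are decompressed transparently
    val logs = sc.textFile("hdfs://namenode:8020/logs/2014-06-16/*.gz")
    println(logs.count())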

The barrier to entry for Spark is very low: download the latest
distribution and start the Spark shell.  The language bindings for Scala /
Java / Python are excellent, meaning you spend less time writing
boilerplate code and more time solving problems.
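
For example, here is the classic word count, typed straight into the shell
(sc is the SparkContext the shell creates for you, and README.md ships with
the distribution):

    val lines  = sc.textFile("README.md")
    val counts = lines.flatMap(_.split("\\s+"))
                      .filter(_.nonEmpty)
                      .map(word => (word, 1))
                      .reduceByKey(_ + _)
    counts.take(10).foreach(println)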

Even if you believe you *need* native code for something specific, like
fetching HD video frames from satellite video capture cards, wrap it in a
small native library and use the Java Native Access (JNA) interface to call
it from your Java/Scala code.
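
A rough sketch, in Scala, of what that can look like with JNA (the library
name "framegrab" and its single C function are made up for illustration):

    import com.sun.jna.{Library, Native}

    // Hypothetical native library libframegrab.so exposing:
    //   int grab_frame(int card_id, char *buf, int buf_len);
    trait FrameGrab extends Library {
      def grab_frame(cardId: Int, buf: Array[Byte], bufLen: Int): Int
    }

    val lib = Native.loadLibrary("framegrab", classOf[FrameGrab])
                    .asInstanceOf[FrameGrab]

    val buf = new Array[Byte](1920 * 1080 * 3)   // one 1080p RGB frame
    val n   = lib.grab_frame(0, buf, buf.length)

No JNI stubs, no hand-written glue code to debug.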

Have fun, and if you get stuck we're here to help!

MC


On 16 June 2014 08:17, Wei Da <xwd0...@gmail.com> wrote:

> Hi guys,
> We are making choices between C++ MPI and Spark. Is there any official
> comparation between them? Thanks a lot!
>
> Wei
>
