Hi Anthony,

other than for testing, we usually don't recommend running Spark in local mode, as it breaks SystemML's memory model and can thus lead to OOMs. Regarding the memory configuration of a single-node setup, you're usually best served by allocating most of your memory to the driver: we support multi-threaded single-node operations, and this avoids the (unnecessary) overhead of distributed operations (partial aggregation, constraints, and numerous other costs).
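As a rough sketch of what that looks like in practice (the memory sizes and script name below are placeholders, not recommendations; tune them to your machine):

```shell
# Hypothetical single-node submission: give most of the available
# memory to the driver, since single-node operations run multi-threaded
# inside the driver JVM.
spark-submit \
  --master local[*] \
  --driver-memory 20g \
  SystemML.jar -f myscript.dml
```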
The only exception is nodes with very large memory, where you have dense matrices that exceed 16GB. Our current dense matrix blocks (for single-node operations, the entire matrix is represented as a single block) use a linearized array, which is limited to 2B elements in Java. In this situation, it can be beneficial to spend most of the memory on the executors, to exploit all available memory and perform multi-threaded operations via pseudo-distributed operations. Note that there is an open task, SYSTEMML-1312, to support large dense matrix blocks of >16GB, but it's unclear whether it will make it into the SystemML 1.0 release.

Regards,
Matthias

On Sun, Jul 16, 2017 at 3:25 PM, Anthony Thomas <ahtho...@eng.ucsd.edu> wrote:
> Hi SystemML folks,
>
> Are there any recommended Spark configurations when running SystemML on a
> single machine? I.e., is there a difference between launching Spark with
> master=local[*] and running SystemML as a standard process in the JVM, as
> opposed to launching a single-node Spark cluster? If the latter, is there a
> recommended balance between driver and executor memory?
>
> Thanks,
>
> Anthony
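P.S.: For reference, the 16GB figure above follows directly from Java's int-based array indexing; a quick back-of-the-envelope check:

```shell
# Java arrays are indexed by int, so a linearized double[] holds
# at most 2^31 - 1 (~2.1 billion) elements, at 8 bytes per double.
MAX_ELEMENTS=$(( 2**31 - 1 ))
BYTES_PER_DOUBLE=8
MAX_BYTES=$(( MAX_ELEMENTS * BYTES_PER_DOUBLE ))
echo "$MAX_BYTES"   # 17179869176 bytes, i.e. just under 16 GiB
```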