Hi Anthony,

other than for testing, we usually don't recommend running Spark in local mode, as it breaks SystemML's memory model and can thus lead to OOMs. Regarding the memory configuration of a single-node setup, you're usually best served by allocating most of your memory to the driver: we support multi-threaded single-node operations, and this avoids the (unnecessary) overhead of distributed operations (partial aggregation, constraints, and numerous other costs).
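As a rough sketch of what that looks like in practice (the memory sizes and script name below are placeholders, not recommendations; tune them to your machine):

```shell
# Hypothetical single-node submission: give most of the available
# memory to the driver, since single-node operations run multi-threaded
# inside the driver JVM.
spark-submit \
  --master local[*] \
  --driver-memory 20g \
  SystemML.jar -f myscript.dml
```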
The only exception is nodes with very large memory, where you have dense matrices that exceed 16GB. Our current dense matrix blocks (for single-node operations, the entire matrix is represented as a single block) use a linearized array, which is limited to 2B elements in Java. In this situation, it can be beneficial to spend most of the memory on the executors, to exploit all available memory and perform multi-threaded operations via pseudo-distributed operations. Note that there is an open task, SYSTEMML-1312, to support large dense matrix blocks of >16GB, but it's unclear whether it will make it into the SystemML 1.0 release.

Regards,
Matthias

On Sun, Jul 16, 2017 at 3:25 PM, Anthony Thomas <ahtho...@eng.ucsd.edu> wrote:
> Hi SystemML folks,
>
> Are there any recommended Spark configurations when running SystemML on a
> single machine? I.e., is there a difference between launching Spark with
> master=local[*] and running SystemML as a standard process in the JVM, as
> opposed to launching a single-node Spark cluster? If the latter, is there a
> recommended balance between driver and executor memory?
>
> Thanks,
>
> Anthony
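P.S.: For reference, the 16GB figure above follows directly from Java's int-based array indexing; a quick back-of-the-envelope check:

```shell
# Java arrays are indexed by int, so a linearized double[] holds
# at most 2^31 - 1 (~2.1 billion) elements, at 8 bytes per double.
MAX_ELEMENTS=$(( 2**31 - 1 ))
BYTES_PER_DOUBLE=8
MAX_BYTES=$(( MAX_ELEMENTS * BYTES_PER_DOUBLE ))
echo "$MAX_BYTES"   # 17179869176 bytes, i.e. just under 16 GiB
```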