Hi Chamath,

ad 1: Yes, this is absolutely correct. However, it is important to realize that within the workers we want to run DML functions, and for these we'll reuse our existing compiler, runtime, operations, and data structures.
ad 2: Yes, this is also correct. Indeed, we can use an existing parfor (with local execution mode) to emulate a local, synchronous parameter server. However, it would be very hard - and conflicting with our functional and thus stateless execution semantics - to incorporate asynchronous updates and strategies such as Hogwild!. Furthermore, such a local parameter server might also be useful for very large models and batches, because it would enable distributed data-parallel operations spawned from each local worker.

ad 3: Unfortunately, there is no single detailed architecture diagram because the system evolves over time. I would recommend looking at the following two papers, where especially [1] (the parfor paper, and its extensions for Spark in [2]) might give you a better idea of the parameter server and its workers, which are primarily meant to handle the orchestration and efficient parameter updates/exchange. If you're looking for coarse-grained components, then [3], slide 8, might be a starting point. At a high level, each operation and some constructs like parfor have physical operators for CP, SPARK, MR, and some for GPU. Similarly, this project aims to introduce a new paramserv builtin function (most similar to parfor) and its different physical operators.

ad 4: Since this paramserv function has similarities with parfor, we will be able to reuse key primitives for bringing up local/remote workers and shipping the compiled functions and input data. The major extensions will be to call the shipped functions per batch, get the returned (i.e., updated) parameters, and handle the exchange according to the paramserv configuration. However, since paramserv as an operation is implemented from scratch, we can customize it as needed and are not restricted by script-level semantics, which makes the problem simpler than the general-purpose parfor construct. Both have their use cases.

In case this did not clarify your questions, let us know and we'll sort it out.
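To make the synchronous vs. asynchronous distinction from ad 2 (and the per-batch push/pull loop from ad 4) more concrete, here is a minimal, illustrative Python sketch of a local, multi-threaded parameter server. All names here (push/pull, the toy gradient function, the synchronized flag) are assumptions for illustration only - they are not the actual SystemML paramserv API or semantics:

```python
import threading

class LocalParamServer:
    """Toy local parameter server; not the SystemML implementation."""
    def __init__(self, params, synchronized=True):
        self.params = dict(params)        # model parameters, e.g. {"w": 0.0}
        self.synchronized = synchronized  # True: serialized updates; False: Hogwild!-style
        self.lock = threading.Lock()

    def pull(self):
        # Synchronous mode returns a consistent snapshot; Hogwild!-style
        # mode reads lock-free and accepts benign races.
        if self.synchronized:
            with self.lock:
                return dict(self.params)
        return dict(self.params)

    def push(self, grads, lr=0.1):
        # Synchronous mode serializes the parameter update; Hogwild!-style
        # mode applies it without locking for better scalability.
        if self.synchronized:
            with self.lock:
                self._apply(grads, lr)
        else:
            self._apply(grads, lr)

    def _apply(self, grads, lr):
        for k, g in grads.items():
            self.params[k] -= lr * g

def worker(ps, batches):
    # In SystemML, each worker would execute a compiled DML gradient
    # function per batch; a toy gradient of (w - x)^2 stands in for it here.
    for x in batches:
        w = ps.pull()["w"]
        ps.push({"w": 2 * (w - x)})

ps = LocalParamServer({"w": 0.0}, synchronized=True)
threads = [threading.Thread(target=worker, args=(ps, [1.0] * 50)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(round(ps.params["w"], 2))  # prints 1.0 (converges to the data value)
```

Flipping `synchronized=False` gives the lock-free Hogwild!-style behavior; the final result may then differ slightly from run to run, which is exactly the kind of non-determinism that conflicts with parfor's stateless, script-level semantics.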
[1] http://www.vldb.org/pvldb/vol7/p553-boehm.pdf, 2014
[2] http://www.vldb.org/pvldb/vol9/p1425-boehm.pdf, 2016
[3] http://boss.dima.tu-berlin.de/media/BOSS16-Tutorial-mboehm.pdf, 2016

Regards,
Matthias

On Thu, Mar 8, 2018 at 10:28 PM, Chamath Abeysinghe <abeysinghecham...@gmail.com> wrote:
> Hi,
> I am trying to understand the purpose and work needed for different sub
> projects in SYSTEMML-2083. And I got few questions,
>
> * In the JIRA it was mentioned that we are not integrating off the shelf
> Parameter Server, but rather develop language and run time support from
> scratch. As far as I understand, this means creating syntax for DML to
> interact with the parameter server. And the parameter server implementation
> is in different back-ends. So for example in Spark back end we have to
> create a some kind of parameter server implementation with different
> strategies, and it should be invoked by the syntax in DML. Is this
> understanding correct?
>
> * In the JIRA there is a sub project for local multi threaded back-end. In
> this project does "local" mean executing on single node similar to
> ExecType.CP? If it is the case why use a parameter server for a single
> node?
>
> * I was unable to find a architecture diagram for SystemML, is there any
> that kind of diagram to understand the interaction between different
> back-ends and language API or can you point me to those classes?
>
> * And those new run times, are they going to be completely new separate
> run times or improvements to the existing ones?
>
> Please help me understand these issues. Thanks in advance.
>
> Regards,
> Chamath