Hi Felix,

This is very good work.
I've played around a bit with Niketan's Python-based internal/embedded DSL.
It seems like it's meant for interactive work, as if in a notebook or a
REPL.
This work, on the other hand, could look similar to the OpenMP/OpenACC
paradigm. In its current form, and in the one you are suggesting with the
Algorithm instance, the user is responsible for "executing" the
"parallelized" snippet of code.

Maybe we could make it look like OpenMP/OpenACC.
If my code looked like this:

/* Setup ....  */
for ( a <- 1 to 10000) { /* Expensive Computation */ }
/* Cleanup .... */


I could change it to:

/* Setup ....  */
parallelize {
    for ( a <- 1 to 10000) { /* Expensive Computation */ }
}
/* Cleanup .... */

The code in "parallelize" would be translated to DML and sent to SystemML. The
appropriate conversions between data types in Scala and those supported by
SystemML would happen automatically.
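
To make the shape of this concrete, here is a minimal sketch of what the call site could look like. It is only an illustration under stated assumptions: the object name ParallelizeSketch and this version of parallelize are hypothetical and are not the macro from Felix's prototype. The block is taken as a by-name parameter, so the wrapper (not the caller) decides when and where it runs. The stand-in simply evaluates the block locally; a real implementation would inspect the block at compile time, generate DML, and hand it to SystemML.

```scala
// Hypothetical sketch only: this is NOT the SystemML/Emma `parallelize` macro.
object ParallelizeSketch {

  // `body` is a by-name parameter: the caller's block is passed unevaluated,
  // so this wrapper controls when (and, in a real system, where) it executes.
  def parallelize[A](body: => A): A = {
    // Assumption: a real implementation would translate `body` to DML and
    // submit it to SystemML. This stand-in just runs the block locally.
    body
  }

  def main(args: Array[String]): Unit = {
    /* Setup .... */
    val n = 10000

    val sumOfSquares: Long = parallelize {
      var acc = 0L
      for (a <- 1 to n) { acc += a.toLong * a } // stand-in for the expensive computation
      acc
    }

    /* Cleanup .... */
    println(s"sum of squares up to $n = $sumOfSquares")
  }
}
```

The point of the sketch is only that the user's code stays where it is and gets wrapped, much like adding an OpenMP pragma, instead of being moved into a separate Algorithm instance that the user has to execute explicitly.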

Thoughts?



-Nakul







On Sat, Sep 24, 2016 at 10:39 AM, Niketan Pansare <npan...@us.ibm.com>
wrote:

> Hi Felix,
>
> Thanks for the summary. The document is extremely useful. I particularly
> like the idea of parallelizing the code with the 'breeze' library. I would like
> to pitch in a few ideas which would enable your code to be reused by other
> DSLs:
> 1. The Scala DSL/parallelize macro remains the same as described in your
> documentation, but instead of generating DML directly, it calls an
> intermediate representation (IR). This IR then generates the DML. The same
> IR would then be reused by the Python DSL and R DSL.
> 2. As an example, the IR could be a lazy Matrix class (which would be part of
> SystemML). It could have an awkward syntax/mechanism for pushing down control
> structures, for example beginWhile and endWhile. Since the IR will not be
> exposed to the end user, that should be fine.
>
> Example:
>
> https://github.com/apache/incubator-systemml/blob/master/src/main/python/systemml/defmatrix.py#L537
> will call IR's add() method. At the end of parallelize or when the user
> wants result (i.e. eval() ), IR could generate DML code and execute it.
>
> Again, this is just a proposal, and I am fine with dropping the idea of
> integrating the different DSLs if it makes the implementation of the Scala DSL
> complicated. Also, please feel free to correct me if I am missing anything.
>
> Thanks,
>
> Niketan Pansare
> IBM Almaden Research Center
> E-mail: npansar At us.ibm.com
> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
>
> From: Matthias Boehm/Almaden/IBM@IBMUS
> To: dev@systemml.incubator.apache.org
> Date: 09/24/2016 01:11 AM
> Subject: Re: Proof of Concept: Embedded Scala DSL
> ------------------------------
>
>
>
> thanks for sharing the summary - this is very nice. While looking over the
> example, I had the following questions:
>
> 1) Output handling: It would be great to see an example of how the results of
> Algorithm.execute() are consumed. Do you intend to hand out our binary
> matrix representation or MLContext's Matrix from which the user then
> requests specific output formats? Also if there are multiple Algorithm
> instances, how is the MLContext (with its internal state of lazily
> evaluated intermediates) reused?
>
> 2) Scala-breeze prototyping: How do you intend to support operations that
> are not supported in breeze? Examples are removeEmpty, table, aggregate,
> rowIndexMax, quantile/centralmoment, cummin/cummax, and DNN operations.
>
> 3) Frame data type and operations: Do you also intend to add a frame type
> and its operations? I think for this initial prototype it is not
> necessarily required but please make the scope explicit.
>
> Regards,
> Matthias
>
>
> From: fschue...@posteo.de
> To: dev@systemml.incubator.apache.org
> Date: 09/23/2016 04:36 PM
> Subject: Proof of Concept: Embedded Scala DSL
> ------------------------------
>
>
>
> As discussed in the related Jira (SYSTEMML-451) I have started to
> implement a prototype/proof of concept for an embedded DSL in Scala.
>
> I have summarized the current approach in a short document that you can
> find on github together with the code:
> https://github.com/fschueler/emma/blob/sysml-dsl/emma-sysml-dsl/README.md
> Please note that current development happens in the Emma project but
> will move to an independent module in the SystemML project once the
> necessary additions to Emma are merged. By having the DSL in a separate
> module, we can include Scala and Emma dependencies only for the users
> that actually want to use the Scala DSL.
>
> The current code serves as a proof of concept to discuss further
> development with the SystemML community. I especially welcome input from
> SystemML Scala users on the usability of the API design.
> Next steps will include the translation from Scala code to DML with
> support of all features currently supported in DML, including control
> flow structures.
> Also, a coherent way of executing the generated scripts from Scala and
> interacting with outside data formats (such as Spark DataFrames)
> will be integrated.
>
> I am happy to answer your questions and discuss the described approach
> here!
>
> Felix
>
>
>
>
>
>
