[jira] [Comment Edited] (SYSTEMML-1444) UDFs w/ single output in expressions

Matthias Boehm (JIRA) Thu, 27 Jul 2017 18:59:31 -0700

    [ 
https://issues.apache.org/jira/browse/SYSTEMML-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104312#comment-16104312
 ]


Matthias Boehm edited comment on SYSTEMML-1444 at 7/28/17 1:58 AM:
-------------------------------------------------------------------

Great. After thinking about the design a little more, I'd recommend going 
with approach (2), which would handle functions with a single output like 
any other hop, while multi-output functions would keep using today's 
mechanism.

In detail, this would entail the following steps (which can be created as 
subtasks and addressed via PRs individually):

a) Hop/Lop extensions: Extend the existing {{FunctionOp}} to be used in two 
modes (single and multi output). The list of outputs is used only in 
multi-output mode (always as DAG outputs), while in single-output mode the 
{{FunctionOp}} can be used as input to any other HOP and hence in expressions. 
Besides changing the construction of hops, this also requires some minor 
extensions to the lop construction and instruction generation (e.g., using the 
compiler-provided name of temporary outputs when generating single-output 
instructions). At this point, all {{FunctionOps}} would still be created as 
multi-output functions at language level.

b) Language changes / tests: Building on the HOP and LOP changes, we can 
then construct differently configured HOPs for single-output functions at 
language level (see {{DMLTranslator}}). In order to use single-output 
functions, we likely also need some changes to validation. This step should 
also introduce a couple of tests for functions in expressions.

c) Size propagation and IPA: Having functions in expressions poses a challenge 
to size propagation because there is no natural recompilation point after the 
function call anymore. We should address this as follows: First, flag 
dimension-preserving {{FunctionOps}} during {{InterProceduralAnalysis}} and 
accordingly modify {{FunctionOp.refreshSizeInformation}} and 
{{FunctionOp.inferOutputCharacteristics}} to allow size propagation over 
{{FunctionOps}} during dynamic recompilation. Second, introduce a rewrite to 
split DAGs after {{FunctionOps}} that return matrices/frames and are not 
dimension-preserving (see {{RewriteSplitDagDataDependentOperators}} for an 
example). 
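
To make steps a) and c) a bit more concrete, here is a minimal, self-contained 
sketch in Java. It is a toy model and not SystemML code: the classes {{Hop}} 
and {{FunctionOp}} below are simplified stand-ins, and the names {{Mode}}, 
{{dimensionPreserving}}, {{returnsMatrixOrFrame}}, {{requiresDagSplit}}, and 
{{refreshBottomUp}} are hypothetical, as is the convention of using -1 for 
unknown dimensions. The sketch only illustrates the idea that a single-output 
{{FunctionOp}} can sit inside an expression DAG, that sizes can flow through it 
when it has been flagged as dimension-preserving, and that a DAG split is 
needed otherwise.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only; none of these classes are the real SystemML classes.
final class FunctionOpSketch {

    /** Simplified stand-in for a HOP: a name, inputs, and output dimensions. */
    static class Hop {
        final String name;
        final List<Hop> inputs = new ArrayList<>();
        long rows = -1, cols = -1; // -1 = unknown (assumption of this sketch)

        Hop(String name, Hop... in) {
            this.name = name;
            inputs.addAll(Arrays.asList(in));
        }

        Hop dims(long r, long c) { rows = r; cols = c; return this; }

        boolean dimsKnown() { return rows >= 0 && cols >= 0; }

        /** Toy default: leaves keep their dims, other ops copy the first input. */
        void refreshSizeInformation() {
            if (!inputs.isEmpty()) {
                rows = inputs.get(0).rows;
                cols = inputs.get(0).cols;
            }
        }
    }

    /** Function call hop with two modes, as proposed in step a). */
    static class FunctionOp extends Hop {
        enum Mode { SINGLE_OUTPUT, MULTI_OUTPUT }

        final Mode mode;
        final List<String> outputNames;      // only used in multi-output mode
        final boolean returnsMatrixOrFrame;
        boolean dimensionPreserving = false; // would be flagged by an IPA-like pass

        private FunctionOp(String name, Mode mode, List<String> outs,
                           boolean returnsMatrixOrFrame, Hop... args) {
            super(name, args);
            this.mode = mode;
            this.outputNames = outs;
            this.returnsMatrixOrFrame = returnsMatrixOrFrame;
        }

        static FunctionOp single(String name, boolean returnsMatrixOrFrame, Hop... args) {
            return new FunctionOp(name, Mode.SINGLE_OUTPUT, new ArrayList<>(),
                                  returnsMatrixOrFrame, args);
        }

        static FunctionOp multi(String name, List<String> outs, Hop... args) {
            return new FunctionOp(name, Mode.MULTI_OUTPUT, outs, true, args);
        }

        /** Step c), first part: propagate sizes only if flagged as preserving. */
        @Override
        void refreshSizeInformation() {
            if (mode == Mode.SINGLE_OUTPUT && dimensionPreserving && !inputs.isEmpty()) {
                rows = inputs.get(0).rows;
                cols = inputs.get(0).cols;
            } else {
                rows = -1;
                cols = -1;
            }
        }

        /** Step c), second part: candidates for a DAG split after the call. */
        boolean requiresDagSplit() {
            return mode == Mode.SINGLE_OUTPUT && returnsMatrixOrFrame && !dimensionPreserving;
        }
    }

    /** Refresh sizes bottom-up, inputs first, as a recompiler pass might. */
    static void refreshBottomUp(Hop root) {
        for (Hop in : root.inputs) refreshBottomUp(in);
        root.refreshSizeInformation();
    }

    public static void main(String[] args) {
        Hop x = new Hop("X").dims(100, 10);
        FunctionOp f = FunctionOp.single("f", true, x);
        f.dimensionPreserving = true;        // pretend IPA flagged this function
        Hop expr = new Hop("abs", f);        // function call used inside an expression
        refreshBottomUp(expr);
        System.out.println("dims known: " + expr.dimsKnown()
            + ", split needed: " + f.requiresDagSplit());
    }
}
```

In this toy model, a function that is not flagged leaves the consuming 
operator with unknown dimensions, which is exactly the case where the proposed 
rewrite would cut the DAG to recreate a recompilation point.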

Finally, let's separate the discussion on "structs" (or "tuples") as it's not 
really related. We would most likely implement structs as syntactic sugar at 
parser level. In contrast, the discussion above refers to multiple outputs 
in HOP and LOP DAGs, which is much more involved.  


> UDFs w/ single output in expressions
> ------------------------------------
>
>                 Key: SYSTEMML-1444
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1444
>             Project: SystemML
>          Issue Type: Sub-task
>          Components: APIs, Compiler, Runtime
>            Reporter: Matthias Boehm
>            Assignee: Janardhan
>             Fix For: SystemML 1.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
