[ https://issues.apache.org/jira/browse/SYSTEMML-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104312#comment-16104312 ]
Matthias Boehm edited comment on SYSTEMML-1444 at 7/28/17 1:58 AM:
-------------------------------------------------------------------

great - after thinking about the design a little more, I'd recommend going with approach (2), which would handle functions with a single output like any other hop, while multi-output functions would keep using the same mechanism as today. In detail, this would entail the following steps (which can be created as subtasks and addressed via individual PRs):

a) Hop/Lop extensions: Extend the existing {{FunctionOp}} so it can be used in two modes (single and multi output). Only in multi-output mode is the list of outputs used (always DAG outputs), while in single-output mode the {{FunctionOp}} can serve as input to any other HOP and hence be used in expressions. Besides changing the construction of hops, this also requires some minor extensions to lop construction and instruction generation (e.g., using the compiler-provided name of temporary outputs when generating single-output instructions). At this point, all {{FunctionOps}} would still be created as multi-output functions at language level.

b) Language changes / tests: Building on the HOP and LOP changes, we can then construct differently configured HOPs for single-output functions at language level (see {{DMLTranslator}}). In order to use single-output functions, we likely also need some changes to validation. This step should also introduce a couple of tests for functions in expressions.

c) Size propagation and IPA: Having functions in expressions poses a challenge to size propagation because there is no longer a natural recompilation point after the function call. We should address this as follows: First, flag dimension-preserving {{FunctionOps}} during {{InterProceduralAnalysis}} and accordingly modify {{FunctionOp.refreshSizeInformation}} and {{FunctionOp.inferOutputCharacteristics}} to allow size propagation over {{FunctionOps}} during dynamic recompilation.
Second, introduce a rewrite to split DAGs after {{FunctionOps}} that return matrices/frames and are not dimension-preserving (see {{RewriteSplitDagDataDependentOperators}} for an example).

Finally, let's separate the discussion on "structs" (or "tuples") as it's not really related. We would most likely implement structs as syntactic sugar at parser level. In contrast, the discussion above referred to multiple outputs in HOP and LOP DAGs, which is much more involved.

> UDFs w/ single output in expressions
> ------------------------------------
>
>                 Key: SYSTEMML-1444
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1444
>             Project: SystemML
>          Issue Type: Sub-task
>          Components: APIs, Compiler, Runtime
>            Reporter: Matthias Boehm
>            Assignee: Janardhan
>             Fix For: SystemML 1.0
>
>

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)