[ https://issues.apache.org/jira/browse/SYSTEMML-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988311#comment-15988311 ]
Mike Dusenberry commented on SYSTEMML-1561:
-------------------------------------------

Yeah, that's a good point; in these cases the functions are indeed inlined. Also, to be clear, the IPA scalar propagation causes the scalar leaf nodes of the {{Hout}} and {{Wout}} sub-DAGs to be replaced with literals, but {{Hout}} and {{Wout}} themselves are still not evaluated. That is, for each there is still a DAG of basic scalar operations that needs to be evaluated; the good news is that it does not depend on anything except the literals at its leaves.

> Improve constant folding during compilation
> -------------------------------------------
>
>                 Key: SYSTEMML-1561
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1561
>             Project: SystemML
>          Issue Type: Improvement
>            Reporter: Mike Dusenberry
>     Attachments: scenario1_plan.txt, scenario1.py, scenario2_plan.txt, scenario2.py
>
> In our `nn` library, our convolution and pooling layers have to pass around the spatial dimensions (height and width) of the images that are stretched out into rows of the input/output matrices. These output dimensions are computed within the forward functions of those layers as small scalar equations. From a mathematical standpoint, these sizes can be determined at compile time, and it is nice to have these size equations in DML (vs. hiding them inside the engine within built-in functions). However, we do not currently evaluate these expressions during compilation, and thus we are left with unknown sizes even during recompilation. This naturally leads to worst-case memory estimates, and thus often to unnecessary distributed runtime ops rather than simple CP ones.
> I have two related scenarios for which this is a problem. Both involve the {{Houtc1}} & {{Woutc1}} values that are returned from a `conv2d::forward(...)` function. These represent the spatial dimensions of the volume associated with each row of the output {{outc1}} of the function; the third dimension is {{F1}}. Thus, {{outc1}} has a number of columns equal to {{F1*Houtc1*Woutc1}}.
> In the first scenario ({{scenario1.py}}), a random matrix {{doutc1}} is created that should have the same dimensions as {{outc1}}. For the columns, if I use {{cols=ncol(outc1)}} in this rand statement, the size will be propagated and CP ops will be compiled and run. If I instead use {{cols=F1*Houtc1*Woutc1}}, the size will remain unknown forever, even during recompilation, and thus Spark ops will be compiled and run. I have included the recompile hops plan ({{scenario1_plan.txt}}).
> In the second scenario ({{scenario2.py}}), a {{max_pool2d::forward(...)}} function is inserted after the {{conv2d::forward(...)}} function and requires the {{Houtc1}} and {{Woutc1}} variables to be supplied as arguments. Since those variables are not evaluated at compile time, the max-pooling sizes remain unknown, even during recompilation, and thus Spark ops will be compiled and run. I have included the recompile hops plan ({{scenario2_plan.txt}}).
> We should improve or fix our constant-folding rewrites so that these scenarios are handled correctly, as that is necessary for performant deep learning applications. Note that this issue will be present in other, non-deep-learning scenarios as well.
> Mailing list thread: https://www.mail-archive.com/dev@systemml.incubator.apache.org/msg01657.html
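
To make the above concrete, here is a minimal DML sketch of the kind of scalar size expressions at issue, assuming the usual convolution output-size formula; the variable names ({{Hin}}, {{Hf}}, {{strideh}}, {{padh}}, etc.) are illustrative and may not match the `nn` library exactly. After IPA scalar propagation, all of the leaves are literals, so constant folding could in principle reduce {{Hout}} and {{Wout}} to literals at compile time:

{code}
# All leaves are literals after IPA scalar propagation; without constant
# folding, Hout and Wout remain small scalar DAGs instead of literals.
Hin = 28; Win = 28        # input spatial dimensions
Hf = 3; Wf = 3            # filter dimensions
strideh = 1; stridew = 1  # strides
padh = 1; padw = 1        # zero-padding
Hout = as.integer(floor((Hin + 2*padh - Hf) / strideh + 1))
Wout = as.integer(floor((Win + 2*padw - Wf) / stridew + 1))
print("Hout=" + Hout + ", Wout=" + Wout)  # prints Hout=28, Wout=28
{code}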
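
Similarly, a sketch of the two {{rand}} variants in the first scenario (a DML fragment; {{N}}, {{outc1}}, {{F1}}, {{Houtc1}}, and {{Woutc1}} are assumed to be defined by the surrounding script):

{code}
# Size known at compile time -> CP ops are compiled:
doutc1 = rand(rows=N, cols=ncol(outc1))
# Size unknown until F1*Houtc1*Woutc1 is constant-folded -> Spark ops:
doutc1 = rand(rows=N, cols=F1*Houtc1*Woutc1)
{code}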
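
And the shape of the call in the second scenario (the signature is approximated from the `nn` library's {{max_pool2d}} layer; the exact parameter names and order may differ):

{code}
source("nn/layers/max_pool2d.dml") as max_pool2d
# Houtc1 and Woutc1 are unevaluated scalar expressions at compile time, so
# the pooling output sizes (Houtp1, Woutp1) also remain unknown.
[outp1, Houtp1, Woutp1] = max_pool2d::forward(outc1, F1, Houtc1, Woutc1,
                                              2, 2, 2, 2, 0, 0)
{code}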