[ https://issues.apache.org/jira/browse/MXNET-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463208#comment-16463208 ]
Anirudh Subramanian commented on MXNET-323:
-------------------------------------------

Since the bottleneck seems to be the binary_broadcast_kernel, I tried to fuse unravel and dot inside the kernel. This doesn't seem to help much. Results after the run:

{quote}
('PASS: {}...', 0)
func1: 0.764457941055
func2: 2.51862192154
func3: 0.547819852829
('PASS: {}...', 1)
func1: 0.487478017807
func2: 1.92227315903
func3: 0.447080135345
('PASS: {}...', 2)
func1: 0.491612911224
func2: 1.92092394829
func3: 0.447576999664
('PASS: {}...', 3)
func1: 0.490720033646
func2: 1.92054891586
func3: 0.446537017822
{quote}

Building without the debug flag (that is, with -O3) gives a better speedup for func2.

{quote}
('PASS: {}...', 0)
func1: 0.143939971924
func2: 0.392518043518
func3: 0.147929906845
('PASS: {}...', 1)
func1: 0.129691123962
func2: 0.389477968216
func3: 0.145953178406
('PASS: {}...', 2)
func1: 0.128004074097
func2: 0.388610124588
func3: 0.144158840179
('PASS: {}...', 3)
func1: 0.130708932877
func2: 0.292946100235
func3: 0.10039305687
{quote}

> Broadcasting ops are slow
> -------------------------
>
>                 Key: MXNET-323
>                 URL: https://issues.apache.org/jira/browse/MXNET-323
>             Project: Apache MXNet
>          Issue Type: Improvement
>            Reporter: Lai Wei
>            Assignee: Anirudh Subramanian
>            Priority: Major
>
> https://github.com/apache/incubator-mxnet/issues/8219

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)