Caenorst commented on a change in pull request #16408: Add MXNet Ops for fast 
multihead attention
URL: https://github.com/apache/incubator-mxnet/pull/16408#discussion_r341182985
 
 

 ##########
 File path: src/operator/contrib/transformer-inl.h
 ##########
 @@ -34,6 +34,19 @@
 namespace mxnet {
 namespace op {
 
+struct InterleavedMatMulParam : public dmlc::Parameter<InterleavedMatMulParam> {
+  int heads;
+  bool bwd_ignore_zero_init;
+  DMLC_DECLARE_PARAMETER(InterleavedMatMulParam) {
+    DMLC_DECLARE_FIELD(heads)
+    .describe("Set number of heads");
+    DMLC_DECLARE_FIELD(bwd_ignore_zero_init)
+    .describe("Make backward pass ignore AddTo and not init to 0. "
+              " /!\\ Only enable with MXNET_ENABLE_EXEC_ADDTO fonctionality")
 
 Review comment:
   The problem is that MXNET_ENABLE_EXEC_ADDTO is read only during binding. A
user could bind one symbol with MXNET_ENABLE_EXEC_ADDTO=0 and then switch it to
MXNET_ENABLE_EXEC_ADDTO=1 (for instance, to bind another symbol). If I checked
the flag inside the operator, both networks would see it set to 1, while only
the second network would actually use the functionality.
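
   To make the scenario concrete, here is a minimal, self-contained sketch. It
does not use MXNet: BoundNetwork, Bind, ReadAddToFlag and Demo are hypothetical
names, the flag parsing is simplified, and the variable name is spelled as in
this comment. It assumes a POSIX setenv. It only shows why a run-time check of
the environment can disagree with what was captured at bind time.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <stdlib.h>  // setenv (POSIX; assumed available)

// Hypothetical stand-in for a bound executor: it snapshots the flag once,
// at bind time, and never looks at the environment again.
struct BoundNetwork {
  bool addto_enabled_at_bind;
};

// Reads the environment variable as it is right now (simplified parsing:
// only the exact string "1" counts as enabled).
inline bool ReadAddToFlag() {
  const char* v = std::getenv("MXNET_ENABLE_EXEC_ADDTO");
  return v != nullptr && std::strcmp(v, "1") == 0;
}

// Hypothetical bind step: captures the current value of the flag.
inline BoundNetwork Bind() { return BoundNetwork{ReadAddToFlag()}; }

inline void Demo() {
  setenv("MXNET_ENABLE_EXEC_ADDTO", "0", 1);
  BoundNetwork first = Bind();   // bound without AddTo

  setenv("MXNET_ENABLE_EXEC_ADDTO", "1", 1);
  BoundNetwork second = Bind();  // bound with AddTo

  // A check performed later, inside an operator, sees only the current
  // value and reports "enabled" for both networks.
  bool runtime_check = ReadAddToFlag();
  assert(runtime_check);

  // The snapshots taken at bind time tell a different story.
  assert(!first.addto_enabled_at_bind);
  assert(second.addto_enabled_at_bind);
}
```

   In this sketch the run-time check is wrong for the first network, which is
why an explicit operator parameter is used instead of reading the flag.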

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
