[GitHub] [incubator-tvm] cbalint13 opened a new pull request #5805: [QUANTIZE] Add nn.batch_flatten as quantizable.

GitBox Sun, 14 Jun 2020 00:15:06 -0700


cbalint13 opened a new pull request #5805:
URL: https://github.com/apache/incubator-tvm/pull/5805



   This PR adds ```nn.batch_flatten``` as a quantizable layer.
   
   **Description**
   * ```nn.batch_flatten``` is commonly used before ```nn.dense``` in the final layers.
   * This PR allows it to be included in the quantization process, avoiding a cast back to ```float32```.
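
A minimal sketch of why a flatten can stay in the integer domain: it only rearranges elements, so flattening before or after dequantization gives the same values. The helpers `batch_flatten` and `dequantize` below are hypothetical pure-Python stand-ins written for illustration, not TVM APIs, and the tensor values are made up; only the `0.0625` scale is taken from the dump in this PR.

```python
# Hypothetical stand-ins (NOT TVM APIs), using nested lists as tensors.

def _flatten(item):
    """Recursively collect every scalar of a nested list, in order."""
    if not isinstance(item, list):
        return [item]
    out = []
    for sub in item:
        out.extend(_flatten(sub))
    return out


def batch_flatten(tensor):
    """Keep the batch axis, collapse all remaining axes into one."""
    return [_flatten(sample) for sample in tensor]


def dequantize(tensor, scale):
    """Element-wise integer -> float conversion (multiply by the scale)."""
    if isinstance(tensor, list):
        return [dequantize(t, scale) for t in tensor]
    return tensor * scale


# Made-up int8-valued tensor of shape (1, 2, 2, 2).
q = [[[[1, -2], [3, 4]], [[-5, 6], [7, -8]]]]
scale = 0.0625  # the multiplier that appears in the "Before" dump

flatten_then_dequantize = dequantize(batch_flatten(q), scale)
dequantize_then_flatten = batch_flatten(dequantize(q, scale))

# Both orders agree, so the flatten does not need float32 inputs.
assert flatten_then_dequantize == dequantize_then_flatten
```

Under this assumption the cast to ```float32``` can be deferred until an operator actually needs real values.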
   
   **Outcome**
   * Before
   ```
     %19 = nn.max_pool2d(%18, pool_size=[2, 2], strides=[2, 2], padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 50, 4, 4), int8] */;
     %20 = cast(%19, dtype="int8") /* ty=Tensor[(1, 50, 4, 4), int8] */;
     %21 = annotation.stop_fusion(%20) /* ty=Tensor[(1, 50, 4, 4), int8] */;
     %22 = cast(%21, dtype="float32") /* ty=Tensor[(1, 50, 4, 4), float32] */;
     %23 = multiply(%22, 0.0625f /* ty=float32 */) /* ty=Tensor[(1, 50, 4, 4), float32] */;
     %24 = nn.batch_flatten(%23) /* ty=Tensor[(1, 800), float32] */;
     %25 = nn.batch_flatten(%24) /* ty=Tensor[(1, 800), float32] */;
     %26 = nn.batch_flatten(%25) /* ty=Tensor[(1, 800), float32] */;
     %27 = nn.dense(%26, meta[relay.Constant][2] /* ty=Tensor[(512, 800), float32] */ /* ty=Tensor[(512, 800), float32] */, units=512) /* ty=Tensor[(1, 512), float32] */;
     %28 = nn.relu(%27) /* ty=Tensor[(1, 512), float32] */;
     %29 = nn.batch_flatten(%28) /* ty=Tensor[(1, 512), float32] */;
     %30 = nn.batch_flatten(%29) /* ty=Tensor[(1, 512), float32] */;
     nn.dense(%30, meta[relay.Constant][3] /* ty=Tensor[(10, 512), float32] */ /* ty=Tensor[(10, 512), float32] */, units=10) /* ty=Tensor[(1, 10), float32] */
   ```
   * After
   ```
     %19 = nn.max_pool2d(%18, pool_size=[2, 2], strides=[2, 2], padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 50, 4, 4), int8] */;
     %20 = cast(%19, dtype="int8") /* ty=Tensor[(1, 50, 4, 4), int8] */;
     %21 = annotation.stop_fusion(%20) /* ty=Tensor[(1, 50, 4, 4), int8] */;
     %22 = nn.batch_flatten(%21) /* ty=Tensor[(1, 800), int8] */;
     %23 = nn.batch_flatten(%22) /* ty=Tensor[(1, 800), int8] */;
     %24 = nn.batch_flatten(%23) /* ty=Tensor[(1, 800), int8] */;
     %25 = clip(%24, a_min=-127f, a_max=127f) /* ty=Tensor[(1, 800), int8] */;
     %26 = nn.dense(%25, meta[relay.Constant][2] /* ty=Tensor[(512, 800), int8] */ /* ty=Tensor[(512, 800), int8] */, units=512, out_dtype="int32") /* ty=Tensor[(1, 512), int32] */;
     %27 = nn.relu(%26) /* ty=Tensor[(1, 512), int32] */;
     %28 = nn.batch_flatten(%27) /* ty=Tensor[(1, 512), int32] */;
     %29 = nn.batch_flatten(%28) /* ty=Tensor[(1, 512), int32] */;
     %30 = add(%29, 512 /* ty=int32 */) /* ty=Tensor[(1, 512), int32] */;
     %31 = right_shift(%30, 10 /* ty=int32 */) /* ty=Tensor[(1, 512), int32] */;
     %32 = clip(%31, a_min=-127f, a_max=127f) /* ty=Tensor[(1, 512), int32] */;
     %33 = cast(%32, dtype="int8") /* ty=Tensor[(1, 512), int8] */;
     %34 = nn.dense(%33, meta[relay.Constant][3] /* ty=Tensor[(10, 512), int8] */ /* ty=Tensor[(10, 512), int8] */, units=10, out_dtype="int32") /* ty=Tensor[(1, 10), int32] */;
   ```
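
For readers unfamiliar with the ```add``` / ```right_shift``` / ```clip``` / ```cast``` sequence in ```%30```..```%33``` of the "After" dump, here is a rough sketch of the arithmetic on a single value. The function name `requantize` and the sample accumulator values are assumptions made for illustration; this is not TVM's implementation, only the same integer steps written in plain Python.

```python
def requantize(acc, shift=10, qmin=-127, qmax=127):
    """Scale an int32 accumulator down by 2**shift and clip to the int8 range.

    Mirrors the steps in the dump: add(512), right_shift(10), clip(-127, 127).
    Adding half of the divisor (512 = 2**9) before shifting rounds to the
    nearest integer instead of always rounding down.
    """
    rounded = (acc + (1 << (shift - 1))) >> shift  # add 512, then >> 10
    return max(qmin, min(qmax, rounded))           # clip; result fits in int8


# Made-up accumulator values, as might come out of nn.dense + nn.relu.
samples = [0, 511, 512, 1024, 1535, 1536, 200000]
results = [requantize(v) for v in samples]
print(results)
```

Dividing by 1024 with a shift keeps the whole path in integer arithmetic, which is what lets the second ```nn.dense``` consume ```int8``` input.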
   @vinx13, @siju-samuel, @masahi, @FrozenGene, @ZihengJiang please help with the review.
   
   Thank you!


