mikepapadim opened a new pull request, #12287:
URL: https://github.com/apache/tvm/pull/12287

   The prior implementation caused the FQI pass to produce the following graph when 
`abs` was used:
   
   ```
     %14 = qnn.dequantize(%13, 6.4712f /* ty=float32 */, 119 /* ty=int32 */, 
axis=1) /* ty=Tensor[(1, 960, 512, 64), float32] */;
     %15 = abs(%14) /* ty=Tensor[(1, 960, 512, 64), float32] */;
     %16 = qnn.quantize(%15, 3.46196f /* ty=float32 */, 0 /* ty=int32 */, 
out_dtype="uint8", axis=1) /* ty=Tensor[(1, 960, 512, 64), uint8] */;
   ```
   The output scale and zero point differ from the input scale and zero point, so 
the pass fell back to leaving `abs` between a dequantize/quantize pair, performing 
the computation at a higher (float32) precision.
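   The float fallback path above can be sketched in NumPy (scales and zero points are taken from the graph; the helper names are illustrative, not TVM APIs):

```python
import numpy as np

# Affine quantization convention: real_value = scale * (quant_value - zero_point)
def dequantize(q, scale, zp):
    return scale * (q.astype(np.int32) - zp)

def quantize(r, scale, zp, qmin=0, qmax=255):
    # Round to nearest, shift by the zero point, clamp to the uint8 range.
    q = np.round(r / scale) + zp
    return np.clip(q, qmin, qmax).astype(np.uint8)

# dequantize -> abs (in float32) -> quantize, as in the graph above.
x = np.array([0, 100, 119, 200], dtype=np.uint8)   # input zero point 119
y = quantize(np.abs(dequantize(x, 6.4712, 119)), 3.46196, 0)
```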
   
   By adding a qnn op for `abs`, the pass can requantize with the correct values:
   
   ```
    def @main(%x: Tensor[(1, 960, 512, 64), int8] /* ty=Tensor[(1, 960, 512, 
64), int8] */) -> Tensor[(1, 960, 512, 64), int8] {
     %0 = qnn.abs(%x, 6.4712f /* ty=float32 */, 119 /* ty=int32 */, 6.4712f /* 
ty=float32 */, 0 /* ty=int32 */) /* ty=Tensor[(1, 960, 512, 64), int8] */;
     qnn.requantize(%0, 6.4712f /* ty=float32 */, 119 /* ty=int32 */, 3.46196f 
/* ty=float32 */, 0 /* ty=int32 */, out_dtype="int8") /* ty=Tensor[(1, 960, 
512, 64), int8] */
   }
   ```
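   A minimal NumPy sketch of what the rewritten graph computes: `abs` stays in the integer domain (since `|scale * (q - zp)| = scale * |q - zp|`, the result keeps the input scale with a zero point of 0), followed by a requantize. The requantize here is a float reference; TVM's `qnn.requantize` uses fixed-point arithmetic, and the helper names are illustrative:

```python
import numpy as np

def qnn_abs(q, zp):
    # Subtract the zero point, take the absolute value, clamp back to int8.
    # The output keeps the input scale and gets a zero point of 0.
    return np.clip(np.abs(q.astype(np.int32) - zp), -128, 127).astype(np.int8)

def requantize(q, in_scale, in_zp, out_scale, out_zp):
    # Float reference for rescaling between quantization parameters.
    r = in_scale * (q.astype(np.int32) - in_zp)
    return np.clip(np.round(r / out_scale) + out_zp, -128, 127).astype(np.int8)

x = np.array([-128, 0, 119, 127], dtype=np.int8)   # input zero point 119
y = requantize(qnn_abs(x, 119), 6.4712, 0, 3.46196, 0)
```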
   
   @sfvaroglu @mbrookhart @AndrewZhaoLuo @anwang2009 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
