mikepapadim opened a new pull request, #12287: URL: https://github.com/apache/tvm/pull/12287
The prior implementation caused the `FakeQuantizationToInteger` (FQI) pass to produce the following graph when `abs` was used:

```
%14 = qnn.dequantize(%13, 6.4712f /* ty=float32 */, 119 /* ty=int32 */, axis=1) /* ty=Tensor[(1, 960, 512, 64), float32] */;
%15 = abs(%14) /* ty=Tensor[(1, 960, 512, 64), float32] */;
%16 = qnn.quantize(%15, 3.46196f /* ty=float32 */, 0 /* ty=int32 */, out_dtype="uint8", axis=1) /* ty=Tensor[(1, 960, 512, 64), uint8] */;
```

Because the output scale and zero point differ from the input scale and zero point, the pass fell back to computing `abs` in float32, needlessly doubling the precision of the computation. By adding a QNN op for `abs`, we can requantize on the right values and keep the whole graph in integer types:

```
def @main(%x: Tensor[(1, 960, 512, 64), int8] /* ty=Tensor[(1, 960, 512, 64), int8] */) -> Tensor[(1, 960, 512, 64), int8] {
  %0 = qnn.abs(%x, 6.4712f /* ty=float32 */, 119 /* ty=int32 */, 6.4712f /* ty=float32 */, 0 /* ty=int32 */) /* ty=Tensor[(1, 960, 512, 64), int8] */;
  qnn.requantize(%0, 6.4712f /* ty=float32 */, 119 /* ty=int32 */, 3.46196f /* ty=float32 */, 0 /* ty=int32 */, out_dtype="int8") /* ty=Tensor[(1, 960, 512, 64), int8] */
}
```

@sfvaroglu @mbrookhart @AndrewZhaoLuo @anwang2009
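For intuition, here is a minimal NumPy sketch (not the TVM implementation; the helper names are hypothetical) of why `abs` can stay in the integer domain: since `|scale * (q - zp)| = scale * |q - zp|`, an integer `abs` can reuse the input scale with a zero point of 0, and a subsequent requantize maps the result onto the desired output scale/zero point. The scales and zero points below are taken from the graphs above.

```python
import numpy as np

def qnn_abs(q, scale, zp):
    # |real| = |scale * (q - zp)| = scale * |q - zp|, so the result
    # keeps the input scale and has zero point 0.
    out = np.abs(q.astype(np.int32) - zp)
    return np.clip(out, -128, 127).astype(np.int8), scale, 0

def requantize(q, in_scale, in_zp, out_scale, out_zp):
    # Map from one (scale, zero point) pair to another, with
    # round-to-nearest and saturation to the int8 range.
    real = in_scale * (q.astype(np.int32) - in_zp)
    out = np.round(real / out_scale) + out_zp
    return np.clip(out, -128, 127).astype(np.int8)

x = np.array([-128, -20, 119, 127], dtype=np.int8)
y, s, zp = qnn_abs(x, 6.4712, 119)          # integer abs, same scale, zp -> 0
z = requantize(y, s, zp, 3.46196, 0)        # onto the output quantization
```

Note that `np.abs(q - zp)` can exceed 127 for int8 inputs (e.g. `q = -128`, `zp = 119` gives 247), which is why the intermediate is widened to int32 before saturating; the actual TVM lowering has to handle the same overflow concern.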