anijain2305 opened a new pull request #4611: [QNN] Making scale/zero_points as
expr instead of attrs.
URL: https://github.com/apache/incubator-tvm/pull/4611
Currently, the QNN dialect only deals with uniform (per-tensor) quantization,
which means each tensor has just one scale and one zero point. Because of this
restriction, the QNN design had scale and zero point as op attributes. However,
as we move towards channel quantization, scale will become a vector (and behave
similarly to something like bias for bias_add in terms of op inputs).
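To make the scalar-versus-vector distinction concrete, here is a minimal sketch in plain Python. It is not TVM or QNN code: the helper names (`quantize`, `quantize_per_tensor`, `quantize_per_channel`), the nested-list tensor layout, and the uint8 range are assumptions made for illustration only.

```python
# Illustrative sketch only: hypothetical helpers, not part of TVM/QNN.
# A "tensor" here is a list of channels, each channel a list of floats.

def quantize(value, scale, zero_point, qmin=0, qmax=255):
    """Affine-quantize one float: clamp(round(value / scale) + zero_point)."""
    q = round(value / scale) + zero_point
    return max(qmin, min(qmax, q))


def quantize_per_tensor(tensor, scale, zero_point):
    """Per-tensor quantization: one scalar scale/zero point for all channels."""
    return [[quantize(v, scale, zero_point) for v in channel]
            for channel in tensor]


def quantize_per_channel(tensor, scales, zero_points):
    """Per-channel quantization: scale/zero point are vectors, one per channel.

    Like the bias in bias_add, the vectors must match the channel dimension.
    """
    if not (len(tensor) == len(scales) == len(zero_points)):
        raise ValueError("need one scale and one zero point per channel")
    return [[quantize(v, s, z) for v in channel]
            for channel, s, z in zip(tensor, scales, zero_points)]
```

In the per-channel case the scale is a tensor-valued input whose length is tied to the data shape, which is why it fits more naturally as an operator input than as a scalar attribute.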
Before moving to channel quantization, this PR makes the necessary changes
to pass the scale and zero points as input exprs to operators (instead of making
them op attrs). This PR therefore does not bring any functional or performance
change to QNN graphs. The new type checks still require that scale and zero
points be const scalar values. Once this PR is merged and we start moving
towards channel-wise quantization, we can start relaxing the checks and
modifying the lowering.
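The shape of this change can be sketched with a toy expression IR in plain Python. Everything below is hypothetical and for illustration only: `Var`, `Const`, `QuantizeCall`, and `check_quantize` are invented names, not the Relay/QNN classes or the type relation this PR actually touches. The sketch only shows the idea that scale and zero point arrive as input expressions while a type check still restricts them to constant scalars.

```python
# Toy IR sketch (hypothetical names, not the TVM/Relay API).
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    """A non-constant input expression with a known shape."""
    name: str
    shape: Tuple[int, ...]


@dataclass(frozen=True)
class Const:
    """A constant expression; shape () denotes a scalar."""
    value: float
    shape: Tuple[int, ...] = ()


Expr = Union[Var, Const]


@dataclass(frozen=True)
class QuantizeCall:
    """Scale and zero point are operator inputs (exprs), not attributes."""
    data: Expr
    output_scale: Expr
    output_zero_point: Expr


def check_quantize(call: QuantizeCall) -> Tuple[int, ...]:
    """Type check: scale/zero point must still be constant scalars.

    Returns the output shape, which equals the data shape.
    """
    for name in ("output_scale", "output_zero_point"):
        arg = getattr(call, name)
        if not isinstance(arg, Const):
            raise TypeError(f"{name} must be a constant expression")
        if arg.shape != ():
            raise TypeError(f"{name} must be a scalar, got shape {arg.shape}")
    return call.data.shape
```

Under this sketch, supporting channel-wise quantization later would mean relaxing the `arg.shape != ()` check to also accept a vector matching the channel dimension, without changing the operator's signature again.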