anijain2305 opened a new pull request #4611: [QNN] Making scale/zero_points as 
expr instead of attrs.
URL: https://github.com/apache/incubator-tvm/pull/4611
 
 
   Currently, the QNN dialect handles only uniform quantization, which means 
each tensor has a single scale and zero point. Because of this restriction, the 
QNN design stored scale and zero point as op attributes. However, as we move 
towards channel-wise quantization, scale will become a vector and will behave 
like an op input, much as bias does for bias_add.
   
   Before moving to channel-wise quantization, this PR makes the necessary 
changes to pass scale and zero point as input exprs to the operators (instead 
of as op attrs). This PR therefore brings no functional or performance change 
to QNN graphs. The new type checks still require scale and zero point to be 
constant scalar values. Once this PR is merged and we start moving towards 
channel-wise quantization, we can relax the checks and modify the lowering.
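
   To illustrate why a per-channel scale behaves like an operator input rather 
than an attribute, here is a minimal sketch in plain Python. It is not TVM or 
QNN code: the `quantize` helper, its parameters, and the int8 clipping range 
are assumptions made for illustration only. A scalar scale reproduces today's 
uniform behaviour, while a sequence of scales is applied one per channel.

```python
from typing import List, Sequence, Union

Scalar = Union[int, float]
ScalarOrVector = Union[Scalar, Sequence[Scalar]]


def _per_channel(value: ScalarOrVector, num_channels: int) -> List[Scalar]:
    """Expand a scalar to one entry per channel, or validate a vector."""
    if isinstance(value, (list, tuple)):
        if len(value) != num_channels:
            raise ValueError("expected one value per channel")
        return list(value)
    return [value] * num_channels


def quantize(
    data: Sequence[Sequence[float]],
    scale: ScalarOrVector,
    zero_point: ScalarOrVector,
    qmin: int = -128,
    qmax: int = 127,
) -> List[List[int]]:
    """Hypothetical helper: quantize data laid out as [channel][element].

    q = clip(round(x / scale) + zero_point, qmin, qmax)
    """
    scales = _per_channel(scale, len(data))
    zero_points = _per_channel(zero_point, len(data))
    return [
        [max(qmin, min(qmax, round(x / s) + zp)) for x in channel]
        for channel, s, zp in zip(data, scales, zero_points)
    ]


data = [[0.5, -1.0], [10.0, 20.0]]

# Uniform quantization: one scale and zero point for the whole tensor.
uniform = quantize(data, 0.5, 0)

# Channel-wise quantization: scale is a vector, one entry per channel.
channel_wise = quantize(data, [0.5, 10.0], [0, 0])
```

   In the uniform case a single number describes the whole tensor, so storing 
it as an attribute is enough. In the channel-wise case the scale is itself a 
small tensor whose length must match the channel dimension, which is the kind 
of constraint a type check on op inputs can express.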
