[GitHub] [tvm] jverma-quic commented on a diff in pull request #12340: [TOPI][Hexagon] Implement quantized avgpool

2022-08-19 Thread GitBox


jverma-quic commented on code in PR #12340:
URL: https://github.com/apache/tvm/pull/12340#discussion_r950513114


##
tests/python/contrib/test_hexagon/test_fixed_point_conversion.py:
##
@@ -0,0 +1,58 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import math
+import struct
+import numpy as np
+import tvm.topi.hexagon.utils as utils
+
+"""
+Test float to fixed-point conversion. We do it by constructing a numpy array with a
+wide range of floating-point values. These values are converted into
+fixed-point values using topi.hexagon.utils.get_fixed_point_value. Then, these values are
+converted back into float using the scale_factor provided by the function. These converted
+floating-point values are then compared against the original values, and an assertion is
+raised if they happen to be outside of the expected tolerance.
+"""
+
+
+class TestFixedPointConversion:
+    def test_fixed_point_conversion(self):
+        # Construct array with wide range of values
+        fp1 = np.random.uniform(0.1, 0.0002, size=(10))
+        fp2 = np.random.uniform(0.001, 0.02, size=(10))
+        fp3 = np.random.uniform(1, 20, size=(10))
+        fp4 = np.random.uniform(900, 1000, size=(10))
+        fp5 = np.random.uniform(1e9, 1e10, size=(10))
+        fp6 = np.random.uniform(2.44885652993e38, 2.54885652993e38, size=(1))
+        fp7 = np.random.uniform(1.46711479073e-34, 1.76098837843e-34, size=(1))

Review Comment:
   I agree. I'll add some comments to make it explicit. 
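
The promised comments might look something like the following sketch; the annotations are one possible reading of why each range is present, not the author's final wording:

    import numpy as np

    fp1 = np.random.uniform(0.1, 0.0002, size=(10))  # small fractional values
    fp2 = np.random.uniform(0.001, 0.02, size=(10))  # more small fractional values
    fp3 = np.random.uniform(1, 20, size=(10))        # moderate values
    fp4 = np.random.uniform(900, 1000, size=(10))    # values in the hundreds
    fp5 = np.random.uniform(1e9, 1e10, size=(10))    # very large values
    # close to the float32 maximum (~3.4e38)
    fp6 = np.random.uniform(2.44885652993e38, 2.54885652993e38, size=(1))
    # very small, but still normalized, float32 values (smallest normal ~1.18e-38)
    fp7 = np.random.uniform(1.46711479073e-34, 1.76098837843e-34, size=(1))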






[GitHub] [tvm] jverma-quic commented on a diff in pull request #12340: [TOPI][Hexagon] Implement quantized avgpool

2022-08-19 Thread GitBox


jverma-quic commented on code in PR #12340:
URL: https://github.com/apache/tvm/pull/12340#discussion_r950511701


##
python/tvm/topi/hexagon/qnn/avg_pool2d.py:
##
@@ -0,0 +1,205 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# pylint: disable=invalid-name, unused-variable, unused-argument, too-many-locals
+
+""" Compute and schedule for quantized avg_pool2d op
+
+Please note the following assumptions made by the implementation:
+
+1) The input must be padded in advance to account for 'padding'. In addition,
+   both input and output must be padded as per the physical buffer layout.
+2) The current implementation assumes 'count_include_pad' to be 'True'. It can be
+   modified to support the 'False' case, but the element count for the pooling window
+   must be pre-computed and provided as an input to reduce the run-time overhead.
+3) 'padding' is ignored. It must be handled outside of the sliced op.
+4) Please note that this implementation will not work if the output includes any
+   physical layout related padding, as it can result in out-of-bound accesses
+   to the input.
+"""
+
+from tvm import te
+from tvm import tir
+from ..utils import get_layout_transform_fn, get_fixed_point_value
+
+
+def validate_out_shape(out_shape: list, in_shape: list, kernel: list, stride: list, dilation: list):
+    """Validate output shape"""
+    _, oh, ow, _ = out_shape
+    _, ih, iw, _ = in_shape
+    kh, kw = kernel
+    sh, sw = stride
+    dh, dw = dilation
+    if ih < (oh - 1) * sh + dh * (kh - 1) + 1:
+        raise RuntimeError("Output height is too large")
+    if iw < (ow - 1) * sw + dw * (kw - 1) + 1:
+        raise RuntimeError("Output width is too large")
+
+
+def saturate(x: te.Tensor, dtype: str):
+    """Saturate value for the specified data type"""
+    return te.max(te.min_value(dtype), te.min(x, te.max_value(dtype)))

Review Comment:
   I agree. I think it will be better to do it in TVM, as we can generate the
   appropriate LLVM saturating instruction during TVM codegen, which can then be
   lowered into target-specific instructions in the LLVM backend.
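
For reference, LLVM provides saturating arithmetic intrinsics (e.g. llvm.sadd.sat) that targets like Hexagon can lower to their native ':sat' forms. A minimal Python model of the semantics being discussed, an illustration only, not TVM code:

    I16_MIN, I16_MAX = -(1 << 15), (1 << 15) - 1

    def sadd_sat_i16(a: int, b: int) -> int:
        # Semantics of llvm.sadd.sat.i16: add, then clamp to the int16
        # range instead of wrapping around.
        return max(I16_MIN, min(a + b, I16_MAX))

    assert sadd_sat_i16(30000, 10000) == I16_MAX  # a plain int16 add would wrap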






[GitHub] [tvm] jverma-quic commented on a diff in pull request #12340: [TOPI][Hexagon] Implement quantized avgpool

2022-08-19 Thread GitBox


jverma-quic commented on code in PR #12340:
URL: https://github.com/apache/tvm/pull/12340#discussion_r950462450


##
tests/python/contrib/test_hexagon/test_fixed_point_conversion.py:
##
@@ -0,0 +1,58 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import math
+import struct
+import numpy as np
+import tvm.topi.hexagon.utils as utils
+
+"""
+Test float to fixed-point conversion. We do it by constructing a numpy array with a
+wide range of floating-point values. These values are converted into
+fixed-point values using topi.hexagon.utils.get_fixed_point_value. Then, these values are
+converted back into float using the scale_factor provided by the function. These converted
+floating-point values are then compared against the original values, and an assertion is
+raised if they happen to be outside of the expected tolerance.
+"""
+
+
+class TestFixedPointConversion:
+    def test_fixed_point_conversion(self):
+        # Construct array with wide range of values
+        fp1 = np.random.uniform(0.1, 0.0002, size=(10))
+        fp2 = np.random.uniform(0.001, 0.02, size=(10))
+        fp3 = np.random.uniform(1, 20, size=(10))
+        fp4 = np.random.uniform(900, 1000, size=(10))
+        fp5 = np.random.uniform(1e9, 1e10, size=(10))
+        fp6 = np.random.uniform(2.44885652993e38, 2.54885652993e38, size=(1))
+        fp7 = np.random.uniform(1.46711479073e-34, 1.76098837843e-34, size=(1))

Review Comment:
   I didn't really think about it since this is just a small unit test and we're
   just constructing, at most, 10-element-long arrays. If you're really concerned
   about the complexity aspect of it, then I don't mind doing what you're
   suggesting, but otherwise I would prefer leaving it as is.
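
The suggestion itself isn't quoted; presumably it is a table-driven construction along these lines, a sketch with the ranges taken verbatim from the test above:

    import numpy as np

    # (low, high, count) for each sampled range
    RANGES = [
        (0.1, 0.0002, 10),
        (0.001, 0.02, 10),
        (1, 20, 10),
        (900, 1000, 10),
        (1e9, 1e10, 10),
        (2.44885652993e38, 2.54885652993e38, 1),
        (1.46711479073e-34, 1.76098837843e-34, 1),
    ]
    float_arr = np.concatenate(
        [np.random.uniform(lo, hi, size=(n,)) for lo, hi, n in RANGES]
    )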






[GitHub] [tvm] jverma-quic commented on a diff in pull request #12340: [TOPI][Hexagon] Implement quantized avgpool

2022-08-19 Thread GitBox


jverma-quic commented on code in PR #12340:
URL: https://github.com/apache/tvm/pull/12340#discussion_r950320325


##
python/tvm/topi/hexagon/qnn/avg_pool2d.py:
##
@@ -0,0 +1,205 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# pylint: disable=invalid-name, unused-variable, unused-argument, too-many-locals
+
+""" Compute and schedule for quantized avg_pool2d op
+
+Please note the following assumptions made by the implementation:
+
+1) The input must be padded in advance to account for 'padding'. In addition,
+   both input and output must be padded as per the physical buffer layout.
+2) The current implementation assumes 'count_include_pad' to be 'True'. It can be
+   modified to support the 'False' case, but the element count for the pooling window
+   must be pre-computed and provided as an input to reduce the run-time overhead.
+3) 'padding' is ignored. It must be handled outside of the sliced op.
+4) Please note that this implementation will not work if the output includes any
+   physical layout related padding, as it can result in out-of-bound accesses
+   to the input.
+"""
+
+from tvm import te
+from tvm import tir
+from ..utils import get_layout_transform_fn, get_fixed_point_value
+
+
+def validate_out_shape(out_shape: list, in_shape: list, kernel: list, stride: list, dilation: list):
+    """Validate output shape"""
+    _, oh, ow, _ = out_shape
+    _, ih, iw, _ = in_shape
+    kh, kw = kernel
+    sh, sw = stride
+    dh, dw = dilation
+    if ih < (oh - 1) * sh + dh * (kh - 1) + 1:
+        raise RuntimeError("Output height is too large")
+    if iw < (ow - 1) * sw + dw * (kw - 1) + 1:
+        raise RuntimeError("Output width is too large")
+
+
+def saturate(x: te.Tensor, dtype: str):
+    """Saturate value for the specified data type"""
+    return te.max(te.min_value(dtype), te.min(x, te.max_value(dtype)))

Review Comment:
   > When I looked at several of the Hexagon `.so` files produced by this PR's unit tests, I didn't see any indication that Hexagon's `saturate` or `:sat` instructions were being used.
   > 
   > This isn't a critique of the PR; I'm just mentioning it as a point of interest for future work.
   
   That's very likely. Thanks for looking into it!
   
   Unless we generate saturating LLVM instructions through TVM, we will have to
   add additional code in LLVM to recognize the sequence of min, max as saturate.
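
Concretely, the pattern such an LLVM peephole would have to recognize is the clamp pair TVM currently emits. A scalar Python model of that sequence, a sketch rather than the generated code itself:

    def clamp_i16(v: int) -> int:
        # The max(min(...)) pair produced by the saturate() helper above;
        # a backend pattern-match would need to spot exactly this shape to
        # emit a single hardware saturate instead.
        return max(-32768, min(v, 32767))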






[GitHub] [tvm] jverma-quic commented on a diff in pull request #12340: [TOPI][Hexagon] Implement quantized avgpool

2022-08-19 Thread GitBox


jverma-quic commented on code in PR #12340:
URL: https://github.com/apache/tvm/pull/12340#discussion_r950318328


##
python/tvm/topi/hexagon/qnn/avg_pool2d.py:
##
@@ -0,0 +1,205 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# pylint: disable=invalid-name, unused-variable, unused-argument, too-many-locals
+
+""" Compute and schedule for quantized avg_pool2d op
+
+Please note the following assumptions made by the implementation:
+
+1) The input must be padded in advance to account for 'padding'. In addition,
+   both input and output must be padded as per the physical buffer layout.
+2) The current implementation assumes 'count_include_pad' to be 'True'. It can be
+   modified to support the 'False' case, but the element count for the pooling window
+   must be pre-computed and provided as an input to reduce the run-time overhead.
+3) 'padding' is ignored. It must be handled outside of the sliced op.
+4) Please note that this implementation will not work if the output includes any
+   physical layout related padding, as it can result in out-of-bound accesses
+   to the input.
+"""
+
+from tvm import te
+from tvm import tir
+from ..utils import get_layout_transform_fn, get_fixed_point_value
+
+
+def validate_out_shape(out_shape: list, in_shape: list, kernel: list, stride: list, dilation: list):
+    """Validate output shape"""
+    _, oh, ow, _ = out_shape
+    _, ih, iw, _ = in_shape
+    kh, kw = kernel
+    sh, sw = stride
+    dh, dw = dilation
+    if ih < (oh - 1) * sh + dh * (kh - 1) + 1:
+        raise RuntimeError("Output height is too large")
+    if iw < (ow - 1) * sw + dw * (kw - 1) + 1:
+        raise RuntimeError("Output width is too large")
+
+
+def saturate(x: te.Tensor, dtype: str):
+    """Saturate value for the specified data type"""
+    return te.max(te.min_value(dtype), te.min(x, te.max_value(dtype)))

Review Comment:
   Thanks for the comment, @cconvey! You're correct about saturate not being
   needed for the float16 dtype. Please note that the functions in this file,
   qnn/avg_pool2d.py, are meant to be used only for quantized models and
   therefore should have uint8 and int8 dtypes.






[GitHub] [tvm] jverma-quic commented on a diff in pull request #12340: [TOPI][Hexagon] Implement quantized avgpool

2022-08-19 Thread GitBox


jverma-quic commented on code in PR #12340:
URL: https://github.com/apache/tvm/pull/12340#discussion_r950295180


##
python/tvm/topi/hexagon/utils.py:
##
@@ -150,4 +157,126 @@ def get_layout_transform_fn(layout):
         return nc_2048_2d
     if layout == "nhwc-8h8w32c-2d":
         return nhwc_8h8w32c_2d
+    if layout == "n11c-2048c-2d":
+        return n11c_2048c_2d
     raise RuntimeError(f"Unexpected layout '{layout}'")
+
+
+def get_fixed_point_value(flp: float, dtype: str = "int16"):
+    """
+    Return fixed-point value and the corresponding log2 of the scale factor used to compute
+    this value.
+
+    Parameters
+    ----------
+    flp : float
+        Floating-point value to be converted
+    dtype : str
+        Type of the resulting fixed-point value. By default, it's set to "int16"
+
+    Returns
+    -------
+    fixed_point_value : int
+        Fixed-point value for the given floating-point value
+    exp_scale_factor : int
+        log2 of the scale factor
+
+    Convert floating-point value into fixed-point number. This is done by
+    multiplying the value by a scaling factor and then rounding it to the nearest
+    integer value.
+
+    As per the IEEE-754 standard, a floating-point value can be represented as follows
+    [see: https://en.wikipedia.org/wiki/IEEE_754-1985]:
+        (-1)^S * M * 2^(E-Bias)
+
+    Here,
+    * S is the sign bit (0 or 1).
+    * M is the mantissa. It's composed of an implicit 1 for the normalized floating-point
+      values or 0 for the denormalized values, and the fraction part. This ensures that
+      the mantissa is always within the [0, 2) range. Please note that this function doesn't
+      handle denormalized values.
+    * E is the exponent.
+
+    In single precision, 23 bits are used to represent the fraction part of
+    the mantissa (and therefore, '23' shows up in one of the computations below) and
+    8 bits are used for the exponent. Since the exponent field needs to represent both
+    positive and negative values, a bias (127 for single precision) is added to the actual
+    value. Therefore, to compute the actual exponent, 127 must be subtracted from the stored
+    value.
+
+    As mentioned above, to find the corresponding fixed-point number, we multiply the
+    value with a scaling factor and then round it to the nearest integer. The scaling factor
+    is chosen to be a power of 2, and it's the largest value that can be safely multiplied
+    with the floating-point value without causing the result to overflow the range
+    of the integer type used to represent the fixed-point value.
+
+    So, if we assume the scaling factor to be 2^x, the resulting fixed-point value will be:
+        round((-1)^S * M * 2^(E-Bias) * 2^x)
+
+    This can be simplified to:
+        round((-1)^S * M * 2^(E-Bias+x))
+
+    Now, if 'int16' is used for the fixed-point value, then it has to be >= -(2 * 2^14)
+    and <= (2 * 2^14) - 1. Since M (the mantissa) is always < 2, in order for the fixed-point
+    value to be within this range, 2^(E-Bias+x) must be <= 2^14 - 1.
+    And, if we ignore -1, (E-Bias+x) should be <= 14. Note: if the mantissa gets too close to 2,
+    this will cause the resulting value to go out of range and require it to be saturated.
+    In the following implementation, we perform a range check and adjust the scale to avoid
+    saturation.
+    For most cases, 2^x, where x = 14 - (E-Bias), or 14 - (E-127) for single precision, is the
+    best scaling factor for the 'int16' type that can be used to convert the floating-point
+    value to fixed-point with the least amount of precision loss.
+
+    Additional notes on various floating-point values:
+
+    1) Denormalized values: can't be represented as fixed-point - causes an assertion failure
+    2) NaN and INF: assertion failure
+    """
+
+    def within_range(val, dtype):
+        if dtype == "int16":
+            return -32768 <= val <= 32767
+        raise RuntimeError(f"Unsupported dtype, {dtype}'")
+
+    # Make sure that 'flp' isn't NaN or infinity
+    if math.isnan(flp) or math.isinf(flp):
+        raise RuntimeError("Can not handle NaN or INF")
+
+    flp_f = struct.pack("f", flp)
+    flp_i = struct.unpack("I", flp_f)
+    exp_stored_value = (flp_i[0] >> 23) & 0xFF
+
+    if exp_stored_value == 0:
+        raise RuntimeError("Can not handle denormalized values")

Review Comment:
   Sure, I'll elaborate on this. Thanks!
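
For readers following the math in the docstring, here is a re-implementation of the conversion it describes, a sketch based on the documentation above rather than the upstream function:

    import math
    import struct

    def float_to_fixed_point_i16(flp: float):
        # Returns (fixed_point_value, rsh) with flp ~= fixed_point_value / 2**rsh
        if math.isnan(flp) or math.isinf(flp):
            raise RuntimeError("Can not handle NaN or INF")
        bits = struct.unpack("I", struct.pack("f", flp))[0]
        exp_stored_value = (bits >> 23) & 0xFF
        if exp_stored_value == 0:
            raise RuntimeError("Can not handle denormalized values")
        exp = exp_stored_value - 127  # E - Bias
        # Largest power-of-two scale keeping M * 2**(exp + rsh) within int16:
        # (exp + rsh) <= 14, i.e. rsh = 14 - exp
        rsh = 14 - exp
        fxp = int(round(flp * 2.0**rsh))
        # Range check: if the mantissa is too close to 2, the rounded value
        # can overflow; back the scale off by one bit in that case.
        if not -32768 <= fxp <= 32767:
            rsh -= 1
            fxp = int(round(flp * 2.0**rsh))
        return fxp, rsh

    # Worked example: 0.75 = 1.5 * 2**-1, so rsh = 14 - (-1) = 15 and
    # fxp = round(0.75 * 2**15) = 24576; 24576 / 2**15 == 0.75 exactly.
    assert float_to_fixed_point_i16(0.75) == (24576, 15)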



##
python/tvm/topi/hexagon/utils.py:
##
@@ -150,4 +157,126 @@ def get_layout_transform_fn(layout):
         return nc_2048_2d
     if layout == "nhwc-8h8w32c-2d":
         return nhwc_8h8w32c_2d
+    if layout == "n11c-2048c-2d":
+        return n11c_2048c_2d
     raise RuntimeError(f"Unexpected layout '{layout}'")
+
+
+def get_fixed_point_value(flp: float, dtype: str = "int16"):
+    """
+    Return fixed-point value and the corresponding log2 of the scale factor used to compute
+    this value.

[GitHub] [tvm] jverma-quic commented on a diff in pull request #12340: [TOPI][Hexagon] Implement quantized avgpool

2022-08-19 Thread GitBox


jverma-quic commented on code in PR #12340:
URL: https://github.com/apache/tvm/pull/12340#discussion_r950287893


##
tests/python/contrib/test_hexagon/test_fixed_point_conversion.py:
##
@@ -0,0 +1,58 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import math
+import struct
+import numpy as np
+import tvm.topi.hexagon.utils as utils
+
+"""
+Test float to fixed-point conversion. We do it by constructing a numpy array with a
+wide range of floating-point values. These values are converted into
+fixed-point values using topi.hexagon.utils.get_fixed_point_value. Then, these values are
+converted back into float using the scale_factor provided by the function. These converted
+floating-point values are then compared against the original values, and an assertion is
+raised if they happen to be outside of the expected tolerance.
+"""
+
+
+class TestFixedPointConversion:
+    def test_fixed_point_conversion(self):
+        # Construct array with wide range of values
+        fp1 = np.random.uniform(0.1, 0.0002, size=(10))
+        fp2 = np.random.uniform(0.001, 0.02, size=(10))
+        fp3 = np.random.uniform(1, 20, size=(10))
+        fp4 = np.random.uniform(900, 1000, size=(10))
+        fp5 = np.random.uniform(1e9, 1e10, size=(10))
+        fp6 = np.random.uniform(2.44885652993e38, 2.54885652993e38, size=(1))
+        fp7 = np.random.uniform(1.46711479073e-34, 1.76098837843e-34, size=(1))
+        float_arr = np.concatenate((fp1, fp2, fp3, fp4, fp5, fp6, fp7))
+        for flp in float_arr:
+            fxp, rsh = utils.get_fixed_point_value(flp, "int16")
+            # Compute scale_factor using rsh (rsh is log2 of the scale_factor). While doing this,
+            # we use IEEE-754 floating-point representation since rsh can be negative or positive.
+
+            scale = ((rsh + 127) & 0xFF) << 23  # Add bias (127) and position it into exponent bits
+            scale_i = struct.pack("I", scale)  # Pack it as integer
+            scale_f = struct.unpack("f", scale_i)  # Unpack as float
+
+            converted_flp = fxp / scale_f[0]

Review Comment:
   That's what I had earlier, but I decided not to do it, mainly because it's
   needed just for testing and doesn't provide any additional value. I would
   prefer to keep it that way unless this is a major concern.
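
The earlier variant referred to isn't quoted; presumably it computed the scale directly instead of assembling IEEE-754 bits by hand, e.g. (a sketch):

    import math

    def to_float(fxp: int, rsh: int) -> float:
        # scale_factor == 2**rsh; math.ldexp handles negative rsh as well,
        # avoiding the manual exponent-field packing used in the test.
        return fxp / math.ldexp(1.0, rsh)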






[GitHub] [tvm] jverma-quic commented on a diff in pull request #12340: [TOPI][Hexagon] Implement quantized avgpool

2022-08-19 Thread GitBox


jverma-quic commented on code in PR #12340:
URL: https://github.com/apache/tvm/pull/12340#discussion_r950284384


##
tests/python/contrib/test_hexagon/test_fixed_point_conversion.py:
##
@@ -0,0 +1,58 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import math
+import struct
+import numpy as np
+import tvm.topi.hexagon.utils as utils
+
+"""
+Test float to fixed-point conversion. We do it by constructing a numpy array with a
+wide range of floating-point values. These values are converted into
+fixed-point values using topi.hexagon.utils.get_fixed_point_value. Then, these values are
+converted back into float using the scale_factor provided by the function. These converted
+floating-point values are then compared against the original values, and an assertion is
+raised if they happen to be outside of the expected tolerance.
+"""
+
+
+class TestFixedPointConversion:
+    def test_fixed_point_conversion(self):
+        # Construct array with wide range of values
+        fp1 = np.random.uniform(0.1, 0.0002, size=(10))
+        fp2 = np.random.uniform(0.001, 0.02, size=(10))
+        fp3 = np.random.uniform(1, 20, size=(10))
+        fp4 = np.random.uniform(900, 1000, size=(10))
+        fp5 = np.random.uniform(1e9, 1e10, size=(10))
+        fp6 = np.random.uniform(2.44885652993e38, 2.54885652993e38, size=(1))
+        fp7 = np.random.uniform(1.46711479073e-34, 1.76098837843e-34, size=(1))

Review Comment:
   The numbers don't really mean anything, but I just wanted to test with some
   very large and small floating-point values to make sure that the conversion
   function handles them properly, i.e., doesn't introduce large errors.






[GitHub] [tvm] jverma-quic commented on a diff in pull request #12340: [TOPI][Hexagon] Implement quantized avgpool

2022-08-10 Thread GitBox


jverma-quic commented on code in PR #12340:
URL: https://github.com/apache/tvm/pull/12340#discussion_r942649464


##
python/tvm/topi/hexagon/qnn/avg_pool2d.py:
##
@@ -0,0 +1,180 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# pylint: disable=invalid-name, unused-variable, unused-argument, too-many-locals
+
+""" Compute and schedule for quantized avg_pool2d op
+
+Please note the following assumptions made by the implementation:
+
+1) The input must be padded in advance to account for 'padding'. In addition,
+   both input and output must be padded as per the physical buffer layout.
+2) The current implementation assumes 'count_include_pad' to be 'True'. It can be
+   modified to support the 'False' case, but the element count for the pooling window
+   must be pre-computed and provided as an input to reduce the run-time overhead.
+3) 'padding' is ignored. It must be handled outside of the sliced op.
+4) Please note that this implementation will not work if the output includes any
+   physical layout related padding, as it can result in out-of-bound accesses
+   to the input.
+"""
+
+from tvm import te
+from tvm import tir
+from ..utils import get_layout_transform_fn, get_fixed_point_value
+
+
+def validate_out_shape(out_shape, in_shape, kernel, stride, dilation):
+    """Validate output shape"""
+    _, oh, ow, _ = out_shape
+    _, ih, iw, _ = in_shape
+    kh, kw = kernel
+    sh, sw = stride
+    dh, dw = dilation
+    if ih < (oh - 1) * sh + dh * (kh - 1) + 1:
+        raise RuntimeError("Output height is too large")
+    if iw < (ow - 1) * sw + dw * (kw - 1) + 1:
+        raise RuntimeError("Output width is too large")
+
+
+def saturate(x, dtype):
+    """Saturate value for the specified data type"""
+    if dtype == "uint8":
+        return te.max(0, te.min(x, 255))
+    elif dtype == "int8":
+        # int8 range is [-128, 127]
+        return te.max(-128, te.min(x, 127))
+    return x
+
+
+def qnn_avg_pool2d_compute(
+    data,
+    kernel,
+    stride,
+    dilation,
+    oshape,
+    odtype,
+    # quantization params:
+    input_zero_point,
+    input_scale,
+    output_zero_point,
+    output_scale,
+):
+    """Compute for quantized avg_pool2d"""
+    kh, kw = kernel
+    rh = te.reduce_axis((0, kh), name="rh")
+    rw = te.reduce_axis((0, kw), name="rw")
+    ob, oh, ow, oc = oshape
+    if isinstance(ob, int):
+        validate_out_shape(oshape, data.shape, kernel, stride, dilation)
+
+    if odtype == "uint8":
+        temp_dtype = "uint16"
+    elif odtype == "int8":
+        temp_dtype = "int16"
+    else:
+        raise RuntimeError(f"Unsupported output dtype, {odtype}'")
+
+    sh, sw = stride
+    dh, dw = dilation
+
+    PoolArea = kh * kw
+
+    scale = input_scale / output_scale
+    scale_fixed_point, rsh = get_fixed_point_value(scale, "int16")
+    scale_with_area = scale_fixed_point // PoolArea
+    corr = (output_zero_point << rsh) - input_zero_point * scale_fixed_point
+
+    Sum = te.compute(
+        oshape,
+        lambda b, h, w, c: te.sum(
+            data[b, h * sh + dh * rh, w * sw + dw * rw, c].astype(temp_dtype), axis=[rh, rw]
+        ),
+        name="sum",
+    )
+
+    Avg = te.compute(
+        oshape,
+        lambda b, h, w, c: saturate(
+            ((Sum[b, h, w, c] * scale_with_area) + corr) >> rsh, odtype
+        ).astype(odtype),
+        name="avg",
+    )
+    return Avg
+
+
+def schedule_nhwc_8h8w32c(outs, ins, output_layout: str, input_layout: str):
+    """Schedule for input and output layout nhwc-8h8w32c"""
+    func = te.create_prim_func([ins, outs])
+    s = tir.Schedule(func)
+    Sum = s.get_block("sum")
+    Avg = s.get_block("avg")
+
+    input_transform_fn = get_layout_transform_fn(input_layout)
+    output_transform_fn = get_layout_transform_fn(output_layout)
+    s.transform_layout(Sum, ("read", 0), input_transform_fn)
+    s.transform_layout(Avg, ("write", 0), output_transform_fn)
+
+    # Schedule 'Avg'
+    n, h, w, c = s.get_loops(Avg)
+    ho, hi = s.split(h, [None, 8])
+    wo, wi = s.split(w, [None, 8])
+    wio, wii = s.split(wi, [None, 4])
+    co, ci = s.split(c, [None, 32])
+    s.reorder(n, ho, wo, co, hi, wio, wii, ci)
+    wii_ci = s.fuse(wii, ci)
+    s.vectorize(wii_ci)
+
+    # Schedule 'Sum'