ibsidorenko commented on code in PR #13854:
URL: https://github.com/apache/tvm/pull/13854#discussion_r1092167862


##########
python/tvm/topi/hexagon/qnn/nn.py:
##########
@@ -874,6 +916,98 @@ def schedule_qnn_dense(outs):
     return default_schedule(outs)
 
 
def qnn_dense_pack_vrmpy(
    data: te.Tensor,
    weight: te.Tensor,
    # Dense quantization params:
    input_zero_point: te.Tensor,
    kernel_zero_point: te.Tensor,
    _input_scale: te.Tensor,
    _kernel_scale: te.Tensor,
    # bias
    bias: te.Tensor,
    # Requantization params:
    rq_input_scale: te.Tensor,
    rq_input_zero_point: te.Tensor,
    rq_output_scale: te.Tensor,
    rq_output_zero_point: te.Tensor,
    out_dtype: str,
):
    """Compute for qnn.dense_pack targeting the Hexagon vrmpy intrinsic.

    Output data type should be specified through the 'out_dtype' parameter.
    qnn.dense leverages int32 type to store intermediate results. If
    'out_dtype' differs from int32, you need to specify requantization
    parameters (rq_input_scale / rq_output_scale).

    Parameters
    ----------
    data : te.Tensor
        2-D activations of shape (M, K); must have an int8-family dtype
        (required by the vrmpy intrinsic).
    weight : te.Tensor
        4-D packed weights. The indexing below assumes the layout
        (N // 32, K // 4, 32, 4) — TODO confirm against the packing pass.
    input_zero_point, kernel_zero_point : te.Tensor
        Zero points subtracted from data / weight before the multiply.
    _input_scale, _kernel_scale : te.Tensor
        Unused here; scales only participate in requantization.
    bias : te.Tensor or None
        Optional 2-D bias, indexed as bias[0, c] (presumably shape (1, N)).
    rq_input_scale, rq_input_zero_point, rq_output_scale, rq_output_zero_point
        Requantization parameters; scales may be scalar (ndim == 0) or
        1-D per-channel (ndim == 1).
    out_dtype : str
        Requested output dtype for the requantized result.
    """
    # Subtract zero point from input and weights.
    weight = subtract_zero_point(weight, kernel_zero_point, "weight_zp")
    data = subtract_zero_point(data, input_zero_point, "data_zp")

    # Required for vrmpy intrinsic
    assert "int8" in weight.dtype and "int8" in data.dtype

    M, K = get_const_tuple(data.shape)
    N_O, _, N_I, _ = get_const_tuple(weight.shape)
    k = te.reduce_axis((0, K), "k")
    # Accumulate in int32 to avoid int8 overflow; the weight index maps the
    # flat output channel n and reduction index k into the packed 4-D layout.
    out = te.compute(
        (M, N_O * N_I),
        lambda m, n: te.sum(
            data[m, k].astype("int32")
            * weight[
                tvm.tir.indexdiv(n, 32),
                tvm.tir.indexdiv(k, 4),
                tvm.tir.indexmod(n, 32),
                tvm.tir.indexmod(k, 4),
            ].astype("int32"),
            axis=k,
        ),
        name="qnn_dense_pack",
    )

    # Add bias
    if bias is not None:
        assert bias.ndim == 2
        out = te.compute(out.shape, lambda n, c: out[n, c] + bias[0, c])

    # Requantize output of qnn.dense_pack
    if rq_input_scale is not None and rq_output_scale is not None:
        # Now supported only scalar and 1D quantization parameters
        assert rq_input_scale.ndim == 0 or rq_input_scale.ndim == 1
        assert rq_output_scale.ndim == 0 or rq_output_scale.ndim == 1
        axis = -1
        if rq_input_scale.ndim == 1 or rq_output_scale.ndim == 1:
            axis = 1  # Axis param should correspond to 'C' dimension.

        return qnn_requantize(
            out,
            rq_input_scale,
            rq_input_zero_point,
            rq_output_scale,
            rq_output_zero_point,
            axis,
            out_dtype,
        )

    return out
+
+
def schedule_qnn_dense_pack_vrmpy(outs):
    """Create a schedule for qnn.dense_pack (vrmpy variant).

    Parameters
    ----------
    outs: Array of Tensor
          The computation graph description of qnn.dense
          in the format of an array of tensors.

    Returns
    -------
    sch: Schedule
        The computation schedule for the op.
    """
    # No op-specific tuning yet; fall back to the generic default schedule.
    sch = default_schedule(outs)
    return sch

Review Comment:
   I have such plans, but they are low priority. For now, the most
interesting work is on making the MetaScheduler work for this case.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to