[tvm] branch main updated (f3873d7717 -> f3ffc32482)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from f3873d7717 [skip ci] Add Janet and Thomas to triagers to help with Issue Triage RFC (#13141) add f3ffc32482 [Hexagon] [runtime] Remove released buffer check for post-ReleaseResources calls to FreeDataSpace (#13139) No new revisions were added by this update. Summary of changes: src/runtime/hexagon/hexagon_buffer_manager.h | 24 ++-- src/runtime/hexagon/hexagon_device_api.cc| 11 +++ src/runtime/hexagon/hexagon_device_api.h | 10 -- 3 files changed, 9 insertions(+), 36 deletions(-)
[tvm] branch ci-docker-staging updated: fixed tag
This is an automated email from the ASF dual-hosted git repository.

masahi pushed a commit to branch ci-docker-staging
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/ci-docker-staging by this push:
     new 123134e1b9  fixed tag
123134e1b9 is described below

commit 123134e1b94d503c7f3fdfb9c3545163547b6117
Author: Masahiro Masuda
AuthorDate: Wed Oct 19 17:20:18 2022 +0900

    fixed tag
---
 Jenkinsfile               | 2 +-
 ci/jenkins/Jenkinsfile.j2 | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Jenkinsfile b/Jenkinsfile
index 7e27698f15..2fd707a570 100755
--- a/Jenkinsfile
+++ b/Jenkinsfile
@@ -50,7 +50,7 @@ import org.jenkinsci.plugins.pipeline.modeldefinition.Utils
 // NOTE: these lines are scanned by docker/dev_common.sh. Please update the regex as needed. -->
 ci_lint = 'tlcpack/ci-lint:20221013-060115-61c9742ea'
-ci_gpu = 'tlcpackstaging/ci_gpu:20221019-060125-0b4836739'
+ci_gpu = 'tlcpack/ci-gpu:20221019-060125-0b4836739'
 ci_cpu = 'tlcpack/ci-cpu:20221013-060115-61c9742ea'
 ci_minimal = 'tlcpack/ci-minimal:20221013-060115-61c9742ea'
 ci_wasm = 'tlcpack/ci-wasm:20221013-060115-61c9742ea'
diff --git a/ci/jenkins/Jenkinsfile.j2 b/ci/jenkins/Jenkinsfile.j2
index 60735aa4f9..bca7034938 100644
--- a/ci/jenkins/Jenkinsfile.j2
+++ b/ci/jenkins/Jenkinsfile.j2
@@ -52,7 +52,7 @@ import org.jenkinsci.plugins.pipeline.modeldefinition.Utils
 // NOTE: these lines are scanned by docker/dev_common.sh. Please update the regex as needed. -->
 ci_lint = 'tlcpack/ci-lint:20221013-060115-61c9742ea'
-ci_gpu = 'tlcpackstaging/ci_gpu:20221019-060125-0b4836739'
+ci_gpu = 'tlcpack/ci-gpu:20221019-060125-0b4836739'
 ci_cpu = 'tlcpack/ci-cpu:20221013-060115-61c9742ea'
 ci_minimal = 'tlcpack/ci-minimal:20221013-060115-61c9742ea'
 ci_wasm = 'tlcpack/ci-wasm:20221013-060115-61c9742ea'
[tvm] 01/01: Testing a new GPU image
This is an automated email from the ASF dual-hosted git repository.

masahi pushed a commit to branch ci-docker-staging
in repository https://gitbox.apache.org/repos/asf/tvm.git

commit ab914915c997750d59de71b6f61f7c70320eb6e3
Author: Masahiro Masuda
AuthorDate: Wed Oct 19 16:16:11 2022 +0900

    Testing a new GPU image
---
 Jenkinsfile               | 2 +-
 ci/jenkins/Jenkinsfile.j2 | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Jenkinsfile b/Jenkinsfile
index d48e02cf13..7e27698f15 100755
--- a/Jenkinsfile
+++ b/Jenkinsfile
@@ -50,7 +50,7 @@ import org.jenkinsci.plugins.pipeline.modeldefinition.Utils
 // NOTE: these lines are scanned by docker/dev_common.sh. Please update the regex as needed. -->
 ci_lint = 'tlcpack/ci-lint:20221013-060115-61c9742ea'
-ci_gpu = 'tlcpack/ci-gpu:20221013-060115-61c9742ea'
+ci_gpu = 'tlcpackstaging/ci_gpu:20221019-060125-0b4836739'
 ci_cpu = 'tlcpack/ci-cpu:20221013-060115-61c9742ea'
 ci_minimal = 'tlcpack/ci-minimal:20221013-060115-61c9742ea'
 ci_wasm = 'tlcpack/ci-wasm:20221013-060115-61c9742ea'
diff --git a/ci/jenkins/Jenkinsfile.j2 b/ci/jenkins/Jenkinsfile.j2
index f480f08b2b..60735aa4f9 100644
--- a/ci/jenkins/Jenkinsfile.j2
+++ b/ci/jenkins/Jenkinsfile.j2
@@ -52,7 +52,7 @@ import org.jenkinsci.plugins.pipeline.modeldefinition.Utils
 // NOTE: these lines are scanned by docker/dev_common.sh. Please update the regex as needed. -->
 ci_lint = 'tlcpack/ci-lint:20221013-060115-61c9742ea'
-ci_gpu = 'tlcpack/ci-gpu:20221013-060115-61c9742ea'
+ci_gpu = 'tlcpackstaging/ci_gpu:20221019-060125-0b4836739'
 ci_cpu = 'tlcpack/ci-cpu:20221013-060115-61c9742ea'
 ci_minimal = 'tlcpack/ci-minimal:20221013-060115-61c9742ea'
 ci_wasm = 'tlcpack/ci-wasm:20221013-060115-61c9742ea'
[tvm] branch ci-docker-staging updated (869a8f9591 -> ab914915c9)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch ci-docker-staging in repository https://gitbox.apache.org/repos/asf/tvm.git omit 869a8f9591 [CI] Add Zephyr-SDK binaries to PATH env. in ci_cortexm add e8aeb4adf3 [CI] Add Zephyr-SDK binaries to PATH env. in ci_cortexm (#12884) add eba75e4640 [METASCHEDULE] Mark work_dir as not optional in docs (#12888) add 428269f80c [FIX,PROFILING] Fix PAPI docs (#12861) add fbb500e92f [TIR][Schedule] Relax cache read/write's restriction and fix unexpected behavior (#12766) add 71f25b3d6c [IR] Use TVM_DEFINE_OBJECT_REF_METHODS macro for Op (#12893) add a61c1ad0f0 [TIR] Fix plan buffer allocation location for loop carried dependencies (#12757) add c8423a6843 [Meta Schedule][XGBoost] Update the custom callback function of xgboost in meta schedule (#12141) add 46ea2ed42e [MetaSchedule][UX] User Interface for Jupyter Notebook (#12866) add cc6e01edc6 [frontend][pytorch]support aten::zero_ operator (#12872) add 87085b0e0d [frontend][pytorch]Support aten::Tensor_split operator (#12871) add 4ef1465d40 [skip ci] Temporarily disable comments bot (#12903) add b6a660be58 [BUILD] Re-enable ccache by default (#12839) add 8711ba44b9 [TVMScript] Import TIR methods into the IRBuilder (#12900) add fd26813723 [TVMScript] Infer T.match_buffer parameters for region (#12890) add e1f3f90588 [TOPI][Hexagon] Implement quantize op for hexagon (#12820) add f25a702a1f [TOPI][Hexagon] Add schedule and test for maxpool uint8 layout (#12826) add d4fb957ae1 [microTVM][ARM] Improve dense DSP micro kernel (#12908) add 830ebc4ec8 [TIR] Refactor IndexMap::Inverse in terms of NonSurjectiveInverse (#12904) add 5ddd35c377 [Relay][TE] Add default param name if needed (#12912) add 4d5ed07325 [TIR] Fix GetProducer/Consumer for duplicating dep edges (#12910) add f64e933246 [LLVM] Emit fp16/fp32 builtins directly into target module (#12877) add b61f633e10 [TVM PyTorch Integration] optimized_torch & as_torch how-to guide (#12318) add 7a4c10c44a [TIR][Transform] Remove num_unpacked_args from MakePackedAPI (#12892) add 7dbc68d108 [ONNX] Fix test_roi_align failure (#12906) add 77d8eef514 [Runtime][Bugfix] Added type-checking for Array::insert (#12691) add 9a673faa74 [ci] Initialize git during deploys (#12909) add 332b1469b7 [Hexagon] depth_to_space slice op (#12669) add 5a807e27c0 [Hexagon] [runtime] Add thread manager to resource management (#12905) add 82e6fc41f8 [microTVM] add the option to open a saved micro project for debugging (#12495) add a07a46ed19 [TIR] add unit-tests for upcoming primfunc-slicing (#12794) add bec9f16d42 [TIR][Transform] Clear buffer_map during MakeUnpackedAPI (#12891) add c89a8baeeb [usmp] Also remap VarNode to USMP-allocated buffer (#12880) add 178f82dc48 [TOPI] Implement Einsum with reduction axes (#12913) add d1c9febeca [ETHOSN] Remove support for 22.05 version of the driver stack (#12770) add 17e4644019 [TIR][MetaSchedule] Add regression test for layout_rewrite extent=1 (#12916) add e3a6cb6a1b [microTVM] Generalize depthwise_conv2d schedule (#12856) add 9d1fe6d8d1 [Target] Add Ampere GPUs CUDA tags (#12930) add 8af43d3c11 [Hexagon] [runtime] Add user DMA to device API resource management (#12918) add 68f9509b0c [TIR] Fix int64 dtype mismatch in Reindex (#12934) add 8c88aab778 [Bugfix][CMake] Update the minimum CMake version to 3.18 (#12682) add 5f132fd6c1 [ETHOSN] Support conversion of add/mul to requantize where possible (#12887) add 5634a1a17a [CODEGEN][OPENCL] Compatibility for OpenCL version 3.0 (#12938) add 
0d8c9cef72 [Relay] Extend split for blocked ConvertLayout pass (#12886) add 9a45141165 [TIR] Use buffer's dtype when converting pad_value to TIR (#12925) add 3e3d900c66 [Virtual Machine] Implementation of 'set_output_zero_copy' (#11358) add ea01e3ffb4 [TIR] Preserve loop annotations in inject_software_pipeline pass (#12937) add 2379917985 [MetaSchedule] Add Script for TorchBench Model Tuning & Benchmarking (#12914) add 595f0b3975 [HEXAGON][QHL] Clippling the inputs of HVX version of QHL Sigmoid operation (#12919) add 25a54fb791 [TIR] Remove unused iters from the result of reindex (#12946) add 77c8b6e163 [Support] Add fallback definition of ccache in libinfo (#12945) add 4e4089edda [MetaSchedule] Fix XGBoost Import Issue (#12936) add e9eb0bc660 [LLVM] Change CHECK_NE(x, nullptr) to CHECK(x != nullptr), NFC (#12943) add dedf6393f1 [Hexagon] Change NULL to nullptr, NFC (#12944) add d4bf9ecf55 [Target] Add target_device_type attribute to override default device_type (#12509) add bf5637dc32 [DOCS][COMMUNITY] Elaborate Independence Principle for Project Participation (#12962) add c3357f68
[tvm] branch main updated (9c9f32536a -> 0b4836739c)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 9c9f32536a Update Pytorch to version 1.12.0 and TorchVision to 0.13.0 (#13126) add 0b4836739c Skip stride check if shape is 1 in IsContiguous (#13121) No new revisions were added by this update. Summary of changes: include/tvm/runtime/ndarray.h | 9 ++ tests/cpp/ndarray_test.cc | 73 +++ 2 files changed, 82 insertions(+) create mode 100644 tests/cpp/ndarray_test.cc
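For reference, a minimal Python sketch (not the C++ code in include/tvm/runtime/ndarray.h) of the contiguity rule this change relaxes: a dimension of extent 1 is never stepped over, so its stride value is irrelevant and should not fail the check.

    def is_contiguous(shape, strides):
        # Row-major contiguity check; strides of unit dimensions are ignored,
        # mirroring the "skip stride check if shape is 1" behaviour.
        if strides is None:
            return True
        expected = 1
        for extent, stride in zip(reversed(shape), reversed(strides)):
            if extent == 1:
                continue  # a size-1 axis never affects addressing
            if stride != expected:
                return False
            expected *= extent
        return True

    assert is_contiguous((4, 1, 8), (8, 999, 1))      # odd stride on the unit axis is fine
    assert not is_contiguous((4, 2, 8), (8, 16, 1))   # genuine stride mismatch still fails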
[tvm] branch main updated (e3b722b70d -> 9c9f32536a)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from e3b722b70d [Hexagon] [runtime] Use malloc/free for RPC buffers (#13125) add 9c9f32536a Update Pytorch to version 1.12.0 and TorchVision to 0.13.0 (#13126) No new revisions were added by this update. Summary of changes: docker/install/ubuntu_install_onnx.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
[tvm] branch main updated (010d05c680 -> 3d22dbffd0)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 010d05c680 [QNN][Hexagon] Disable QNN canonicalization pass (#12398) add 3d22dbffd0 [Relay] fix: add compute tag for trilu (#13120) No new revisions were added by this update. Summary of changes: python/tvm/topi/transform.py | 2 +- tests/python/relay/test_op_level3.py | 19 +++ 2 files changed, 20 insertions(+), 1 deletion(-)
[tvm] branch main updated (6056e13db9 -> 010d05c680)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 6056e13db9 [Adreno] Fix winograd accuracy (#13117) add 010d05c680 [QNN][Hexagon] Disable QNN canonicalization pass (#12398) No new revisions were added by this update. Summary of changes: include/tvm/runtime/data_type.h| 10 + python/tvm/relay/backend/te_compiler.py| 33 +- python/tvm/relay/qnn/op/_qnn.py| 35 +- python/tvm/relay/qnn/op/qnn.py | 7 - .../vision/ssd => relay/qnn/strategy}/__init__.py | 5 +- python/tvm/relay/qnn/strategy/generic.py | 249 python/tvm/relay/qnn/strategy/hexagon.py | 136 + python/tvm/te/__init__.py | 1 + python/tvm/tir/__init__.py | 1 + python/tvm/topi/hexagon/qnn/__init__.py| 1 + python/tvm/topi/hexagon/qnn/nn.py | 667 + src/relay/backend/te_compiler_cache.cc | 111 +++- src/relay/qnn/pass/legalize.cc | 2 +- src/relay/transforms/fuse_ops.cc | 4 +- .../test_hexagon/test_wo_qnn_canonicalization.py | 185 ++ 15 files changed, 1411 insertions(+), 36 deletions(-) copy python/tvm/{topi/vision/ssd => relay/qnn/strategy}/__init__.py (92%) create mode 100644 python/tvm/relay/qnn/strategy/generic.py create mode 100644 python/tvm/relay/qnn/strategy/hexagon.py create mode 100644 python/tvm/topi/hexagon/qnn/nn.py create mode 100644 tests/python/contrib/test_hexagon/test_wo_qnn_canonicalization.py
[tvm] branch main updated (64975a425f -> 6056e13db9)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 64975a425f [skip ci][COMMUNITY] gigiblender -> Reviewer (#13122) add 6056e13db9 [Adreno] Fix winograd accuracy (#13117) No new revisions were added by this update. Summary of changes: python/tvm/topi/adreno/conv2d_alter_op.py | 3 + .../opencl_texture/test_conv2d_nchw_texture.py | 33 +++ .../opencl_texture/test_conv2d_nhwc_texture.py | 101 + 3 files changed, 137 insertions(+)
[tvm] branch main updated (48be4ff344 -> 64975a425f)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 48be4ff344 [Docs] Add instructions on downloads page updating on release process (#13106) add 64975a425f [skip ci][COMMUNITY] gigiblender -> Reviewer (#13122) No new revisions were added by this update. Summary of changes: CONTRIBUTORS.md | 1 + 1 file changed, 1 insertion(+)
[tvm] branch main updated (468732c6b3 -> 9f047c0627)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 468732c6b3 [PopenPool] Enable Stdout & Stderr Redirect in PopenPool & PopenWorker (#13112) add 9f047c0627 [DOCKER][Adreno]Docker infra for Adreno target with CLML support (#12833) No new revisions were added by this update. Summary of changes: .../{Dockerfile.ci_jekyll => Dockerfile.ci_adreno} | 15 +++-- docker/bash.sh | 13 +++- docker/install/ubuntu_install_cmake_source.sh | 2 +- python/tvm/testing/utils.py| 13 +++- .../stm32 => python/contrib/test_clml}/conftest.py | 9 ++- tests/python/contrib/test_clml/infrastructure.py | 43 + tests/python/contrib/test_clml/test_network.py | 52 --- tests/python/contrib/test_clml/test_ops.py | 74 +- .../opencl_texture/test_conv2d_nchw_texture.py | 42 +++- tests/scripts/ci.py| 34 +- tests/scripts/task_build_adreno_bins.sh| 53 ...ig_build_jvm.sh => task_config_build_adreno.sh} | 17 ++--- tests/scripts/task_python_adreno.sh| 65 +++ 13 files changed, 258 insertions(+), 174 deletions(-) copy docker/{Dockerfile.ci_jekyll => Dockerfile.ci_adreno} (70%) copy tests/{micro/stm32 => python/contrib/test_clml}/conftest.py (85%) create mode 100755 tests/scripts/task_build_adreno_bins.sh copy tests/scripts/{task_config_build_jvm.sh => task_config_build_adreno.sh} (75%) create mode 100755 tests/scripts/task_python_adreno.sh
[tvm] branch main updated (8ccc43445a -> 8d2e887dbb)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 8ccc43445a [Hexagon] Async DMA pipelining test suite (#13005) add 8d2e887dbb [HotFix] Fix python import (#13099) No new revisions were added by this update. Summary of changes: python/tvm/topi/utils.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
[tvm] branch main updated: [Hexagon] Async DMA pipelining test suite (#13005)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 8ccc43445a [Hexagon] Async DMA pipelining test suite (#13005) 8ccc43445a is described below commit 8ccc43445a50df1b8f3c886113c379cc132a90c4 Author: Noah Verke AuthorDate: Mon Oct 17 14:45:27 2022 -0700 [Hexagon] Async DMA pipelining test suite (#13005) * [Hexagon] Add tests to show how to properly utilize async dma pipelining on hexagon. * Formatting updates. * Update comments and reformatting. * Skip long tests in CI. --- .../test_hexagon/test_async_dma_pipeline.py| 353 + 1 file changed, 353 insertions(+) diff --git a/tests/python/contrib/test_hexagon/test_async_dma_pipeline.py b/tests/python/contrib/test_hexagon/test_async_dma_pipeline.py new file mode 100644 index 00..d05e0a6e92 --- /dev/null +++ b/tests/python/contrib/test_hexagon/test_async_dma_pipeline.py @@ -0,0 +1,353 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +""" Test different strategies for loading data into vtcm before running HVX workloads. """ + +import numpy as np +import tvm +import pytest + +from tvm.script import tir as T +from numpy.random import default_rng + +from tvm.tir.function import TensorIntrin + +VRMPY_SIZE_B = 128 +VRMPY_SIZE_INT32 = 32 + + +def conv_approximation(size_a, size_w): +a_shape = (size_a, VRMPY_SIZE_B) +w_shape = (size_w, VRMPY_SIZE_B) +out_shape = (size_a, VRMPY_SIZE_INT32) + +@T.prim_func +def operator(a: T.handle, b: T.handle, c: T.handle) -> None: +T.func_attr({"global_symbol": "main", "tir.noalias": True}) +A = T.match_buffer(a, a_shape, dtype="uint8") +W = T.match_buffer(b, w_shape, dtype="uint8") +C = T.match_buffer(c, out_shape, dtype="int32") +for n, i in T.grid(size_a, size_w): +with T.block("C"): +vn, vi = T.axis.remap("SR", [n, i]) +T.reads(A[vn, 0:VRMPY_SIZE_B], W[vi, 0:VRMPY_SIZE_B], C[vn, 0:VRMPY_SIZE_INT32]) +T.writes(C[vn, 0:VRMPY_SIZE_INT32]) +with T.init(): +for x in T.serial(VRMPY_SIZE_INT32): +C[vn, x] = 0 +C[vn, T.ramp(0, 1, 32)] = T.call_llvm_intrin( + T.llvm_lookup_intrinsic_id("llvm.hexagon.V6.vrmpyubv.acc.128B"), +T.uint32(3), +C[vn, T.ramp(0, 1, 32)], +T.reinterpret(A[vn, T.ramp(0, 1, 128)], dtype="int32x32"), +T.reinterpret(W[vi, T.ramp(0, 1, 128)], dtype="int32x32"), +dtype="int32x32", +) +# Currently async DMA lowering does not add any wait to the end of schedules so +# for timing purposes we are manually adding a wait to ensure that all copies +# are complete when the schedule exits. 
+T.evaluate( +T.tvm_call_packed( +"device_api.hexagon.dma_wait", +0, # QueueId +0, # Wait for 0 in flight +dtype="int32", +) +) + +return tvm.tir.Schedule(operator) + + +def evaluate(hexagon_session, sch, a, b, size_a, expected_output, use_async_copy=0): +target_hexagon = tvm.target.hexagon("v68", link_params=True) +with tvm.transform.PassContext(config={"tir.use_async_copy": use_async_copy}): +func_tir = tvm.build( +sch.mod["main"], target=tvm.target.Target(target_hexagon, host=target_hexagon) +) +module = hexagon_session.load_module(func_tir) + +a_hexagon = tvm.runtime.ndarray.array(a, device=hexagon_session.device) +b_hexagon = tvm.runtime.ndarray.array(b, device=hexagon_session.device) +c_hexagon = tvm.runtime.ndarray.array( +np.zeros((size_a, VRMPY_SIZE_INT32), dtype="int32&qu
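A condensed sketch of the build step driven by the evaluate() helper above: the "tir.use_async_copy" PassContext option shown in the snippet is what lowers the software-pipeline copies to Hexagon async DMA. conv_approximation and hexagon_session are the names used in the test; this is a recap under those assumptions, not additional code from the commit.

    import tvm

    sch = conv_approximation(size_a=1024, size_w=1)   # schedule built by the test above
    target_hexagon = tvm.target.hexagon("v68", link_params=True)
    with tvm.transform.PassContext(config={"tir.use_async_copy": 1}):
        built = tvm.build(
            sch.mod["main"], target=tvm.target.Target(target_hexagon, host=target_hexagon)
        )
    # With a live Hexagon session the module is then loaded and timed:
    # module = hexagon_session.load_module(built)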
[tvm] branch main updated (5dd786b3c9 -> 4074127b71)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 5dd786b3c9 [Hexagon] [runtime] VTCM bugfix, runtime buffer clarification (#13066) add 4074127b71 quic-sanirudh -> Reviewer (#13098) No new revisions were added by this update. Summary of changes: CONTRIBUTORS.md | 1 + 1 file changed, 1 insertion(+)
[tvm] branch main updated (ec5c692148 -> 5e862d4e41)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from ec5c692148 Add include directory for OpenBLAS on RedHat (#13087) add 5e862d4e41 [Frontend][PyTorch]Fix keywords to canonicalize scale and zero point access for FX-quantized graphs (#13071) No new revisions were added by this update. Summary of changes: python/tvm/relay/frontend/qnn_torch.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
[tvm] branch main updated (71f32ca4e8 -> ec5c692148)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 71f32ca4e8 [MetaSchedule][UX] Support Interactive Performance Table Printing in Notebook (#13006) add ec5c692148 Add include directory for OpenBLAS on RedHat (#13087) No new revisions were added by this update. Summary of changes: cmake/modules/contrib/BLAS.cmake | 5 + 1 file changed, 5 insertions(+)
[tvm] branch main updated (5eab64885a -> 5ed94eefad)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 5eab64885a [ROOFLINE] Add support for different dtypes (#13003) add 5ed94eefad [Node] Fix structural equal path tracing pointer usage (#13082) No new revisions were added by this update. Summary of changes: src/node/structural_equal.cc | 20 1 file changed, 8 insertions(+), 12 deletions(-)
[tvm] branch main updated (44c35dcd96 -> 342ffb91d6)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 44c35dcd96 [TVMScript] Fix parsing int64 loop with optional loop start (#13068) add 342ffb91d6 [Hexagon]Register fast softmax schedule with default schedule (#13083) No new revisions were added by this update. Summary of changes: python/tvm/relay/op/strategy/hexagon.py | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-)
[tvm] branch main updated (29a8f06066 -> b389d4dac4)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 29a8f06066 [Arith] Optional rewriting and simplification into AND of ORs (#12972) add b389d4dac4 [Torch] Fix torch contrib issues (#13061) No new revisions were added by this update. Summary of changes: apps/pt_tvmdsoop/tests/test_as_torch.py| 20 +--- python/tvm/contrib/torch/as_torch.py | 7 +-- python/tvm/contrib/torch/optimize_torch.py | 4 +++- python/tvm/meta_schedule/__init__.py | 1 + 4 files changed, 18 insertions(+), 14 deletions(-)
[tvm] branch main updated (90c666f860 -> 61c9742ea7)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 90c666f860 [Relay][Pass] ConcretizeCastLikeRewrite for SimplifyExpr (#12923) add 61c9742ea7 [Hexagon] Enable multi input Async DMA; same queue / stage (#13037) No new revisions were added by this update. Summary of changes: src/driver/driver_api.cc | 1 + src/tir/transforms/inject_software_pipeline.cc | 51 -- src/tir/transforms/lower_async_dma.cc | 7 +- .../test_hexagon/test_software_pipeline_async.py | 180 +++-- 4 files changed, 173 insertions(+), 66 deletions(-)
[tvm] branch main updated (20aa0cf2f7 -> 189338c919)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 20aa0cf2f7 [ci] Re-enable roofline test (#13007) add 189338c919 [MetaSchedule] Support RewriteLayout postproc on AllocateConst (#12991) No new revisions were added by this update. Summary of changes: python/tvm/meta_schedule/relay_integration.py | 14 +- src/meta_schedule/postproc/postproc.cc | 5 +- src/meta_schedule/postproc/rewrite_layout.cc | 56 +-- .../multi_level_tiling_wide_vector.cc | 6 + src/relay/backend/te_compiler_cache.cc | 147 +- .../transforms/meta_schedule_layout_rewrite.cc | 11 +- src/te/operation/create_primfunc.cc| 24 +-- .../remove_weight_layout_rewrite_block.cc | 172 +++-- .../contrib/test_hexagon/test_meta_schedule.py | 88 +-- tests/python/integration/test_auto_tensorize.py| 5 +- .../test_meta_schedule_relay_integration.py| 78 ++ .../test_meta_schedule_vnni_integration.py | 4 +- tests/python/unittest/test_te_create_primfunc.py | 2 + 13 files changed, 539 insertions(+), 73 deletions(-)
[tvm] branch main updated (2d50979606 -> 7fc35da3b9)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 2d50979606 [TVMScript] Allow T.bool type annotations (#12975) add 7fc35da3b9 [TEST] CPU feature detection for x86 and ARM dot product instructions (#12980) No new revisions were added by this update. Summary of changes: python/tvm/testing/utils.py| 45 +++ .../test_meta_schedule_auto_tensorize.py | 7 ++- tests/python/relay/test_op_level1.py | 2 +- tests/python/relay/test_op_level10.py | 2 +- tests/python/relay/test_op_level2.py | 65 -- .../unittest/test_meta_schedule_tune_relay.py | 9 +-- 6 files changed, 91 insertions(+), 39 deletions(-)
[tvm] branch main updated (af01526ae2 -> a997c23e94)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from af01526ae2 [skip ci] Edits to the Bug & Flaky test Issue templates to reduce word count (#12985) add a997c23e94 [CODEGEN][OPENCL] Sampler definition should be at outermost scope (#12951) No new revisions were added by this update. Summary of changes: src/target/source/codegen_opencl.cc | 13 - src/target/source/codegen_opencl.h | 1 + 2 files changed, 13 insertions(+), 1 deletion(-)
[tvm] branch main updated (f3d3ecebe1 -> 4e260d183f)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from f3d3ecebe1 [Hexagon] vrmpy tensorization for e2e compilation of int8 models (#12911) add 4e260d183f [BugFix][Pattern] Fixed a bug in PatternGrouper (#12901) No new revisions were added by this update. Summary of changes: .../contrib/cmsisnn/scalar_to_tensor_constant.cc | 6 +++-- src/relay/ir/dataflow_matcher.cc | 5 tests/python/relay/test_dataflow_pattern.py| 30 ++ tests/python/relay/test_pass_merge_composite.py| 14 +- 4 files changed, 45 insertions(+), 10 deletions(-)
[tvm] branch main updated (2379917985 -> 595f0b3975)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 2379917985 [MetaSchedule] Add Script for TorchBench Model Tuning & Benchmarking (#12914) add 595f0b3975 [HEXAGON][QHL] Clippling the inputs of HVX version of QHL Sigmoid operation (#12919) No new revisions were added by this update. Summary of changes: src/target/llvm/intrin_rule_hexagon.cc | 10 ++- src/tir/op/op.cc | 2 +- .../{topi/test_relu_slice.py => test_sigmoid.py} | 88 ++ 3 files changed, 51 insertions(+), 49 deletions(-) copy tests/python/contrib/test_hexagon/{topi/test_relu_slice.py => test_sigmoid.py} (53%)
[tvm] branch main updated: [Relay] Extend split for blocked ConvertLayout pass (#12886)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 0d8c9cef72 [Relay] Extend split for blocked ConvertLayout pass (#12886) 0d8c9cef72 is described below commit 0d8c9cef7212e62c18814f1632613fb04de6d290 Author: Andrey Malyshev AuthorDate: Thu Sep 29 16:50:59 2022 +0400 [Relay] Extend split for blocked ConvertLayout pass (#12886) * [Relay] Extend split for blocked ConvertLayout pass * Fix lint hits * Fix spelling --- src/relay/op/tensor/transform.cc | 24 ++- tests/python/relay/test_pass_convert_op_layout.py | 49 +++ 2 files changed, 72 insertions(+), 1 deletion(-) diff --git a/src/relay/op/tensor/transform.cc b/src/relay/op/tensor/transform.cc index deb05e8877..985222307a 100644 --- a/src/relay/op/tensor/transform.cc +++ b/src/relay/op/tensor/transform.cc @@ -2982,10 +2982,32 @@ InferCorrectLayoutOutput SplitInferCorrectLayout(const Attrs& attrs, // If new_in_layouts are defined, this code tries to modify the layout. if (new_in_layouts.defined() && old_in_layouts.defined()) { +bool divisible = true; const auto& sp_dim = old_in_layouts[0][axis]; auto new_index = new_in_layouts[0].IndexOf(sp_dim); param->axis = new_index; -ret = new_in_layouts[0]; +int factor = new_in_layouts[0].FactorOf(sp_dim); +if (factor > 1) { + if (!param->indices_or_sections.as()) { +auto ios = Downcast>(param->indices_or_sections); +Array new_ios; +for (const auto& v : ios) { + const IntImmNode* vint = v.as(); + new_ios.push_back(vint->value / factor); + if (vint->value % factor) { +divisible = false; + } +} +if (divisible) { + param->indices_or_sections = new_ios; +} + } +} +if (divisible) { + ret = new_in_layouts[0]; +} else { + ret = old_in_layouts[0]; +} } else if (old_in_layouts.defined()) { ret = old_in_layouts[0]; } diff --git a/tests/python/relay/test_pass_convert_op_layout.py b/tests/python/relay/test_pass_convert_op_layout.py index 3d5af83b8c..223926a877 100644 --- a/tests/python/relay/test_pass_convert_op_layout.py +++ b/tests/python/relay/test_pass_convert_op_layout.py @@ -1760,9 +1760,58 @@ def test_conv_split_convert_layout(): assert tvm.ir.structural_equal(a, b), "Actual = \n" + str(a) +def _test_conv_split_convert_layout_blocking(): +def before(): +x = relay.var("x", shape=(1, 512, 38, 38)) +weight = relay.var("weight", shape=(512, 512, 3, 3)) +y = relay.nn.conv2d( +x, +weight, +channels=512, +kernel_size=(3, 3), +data_layout="NCHW", +kernel_layout="OIHW", +) +y = relay.nn.relu(y) +y = relay.op.split(y, indices_or_sections=[256], axis=1).astuple() +a = relay.TupleGetItem(y, 0) +b = relay.TupleGetItem(y, 1) +out = relay.Tuple([a, b]) +return relay.Function(analysis.free_vars(out), out) + +def expected(): +x = relay.var("x", shape=(1, 512, 38, 38)) +weight = relay.var("weight", shape=(512, 512, 3, 3)) +weight = relay.layout_transform(weight, "OIHW", "OIHW4o") +x = relay.layout_transform(x, "NCHW", "NCHW4c") +y = relay.op.nn.contrib_conv2d_nchwc( +x, +weight, +channels=512, +kernel_size=(3, 3), +padding=(0, 0), +data_layout="NCHW4c", +kernel_layout="OIHW4o", +) +y = relay.nn.relu(y) +y = relay.op.split(y, indices_or_sections=[64], axis=1).astuple() +a = relay.TupleGetItem(y, 0) +b = relay.TupleGetItem(y, 1) +a = relay.layout_transform(a, "NCHW4c", "NCHW") +b = relay.layout_transform(b, "NCHW4c", "NCHW") +out = relay.Tuple([a, b]) +return relay.Function(analysis.free_vars(out), out) + +a = before() +a = run_opt_pass(a, 
transform.ConvertLayout({"nn.conv2d": ["NCHW4c", "OIHW4o"]})) +b = run_opt_pass(expected(), transform.InferType()) + +assert tvm.ir.structural_equal(a, b), "Actual = \n" + str(a) + _test_conv_split_convert_layout1() _test_conv_split_convert_layout2() _test_conv_split_convert_layout3() +_test_conv_split_convert_layout_blocking() def test_conv_strided_slice_axes_convert_layout():
[tvm] branch main updated: [CODEGEN][OPENCL] Compatibility for OpenCL version 3.0 (#12938)
This is an automated email from the ASF dual-hosted git repository.

masahi pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new 5634a1a17a  [CODEGEN][OPENCL] Compatibility for OpenCL version 3.0 (#12938)
5634a1a17a is described below

commit 5634a1a17a3d337728bdc375183c9aee71c40b29
Author: Siva
AuthorDate: Thu Sep 29 15:31:00 2022 +0530

    [CODEGEN][OPENCL] Compatibility for OpenCL version 3.0 (#12938)
---
 src/target/source/codegen_opencl.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/target/source/codegen_opencl.cc b/src/target/source/codegen_opencl.cc
index e8d47b720b..73a064bc80 100644
--- a/src/target/source/codegen_opencl.cc
+++ b/src/target/source/codegen_opencl.cc
@@ -139,7 +139,8 @@ std::string CodeGenOpenCL::Finish() {
   // For now we rely on OpenCL preprocessor directives to utilize the correct behavior
   // depending on the OpenCL version detected at OpenCL compile time.
   decl_stream << "#ifdef __OPENCL_VERSION__\n"
-              << "#if __OPENCL_VERSION__ == CL_VERSION_2_0\n"
+              << "#if __OPENCL_VERSION__ == CL_VERSION_2_0"
+              << " || __OPENCL_VERSION__ == CL_VERSION_3_0 \n"
               << "#define READ_IMAGEH(image, sampler, coord) "
               << "read_imageh(image, sampler, coord)\n"
               << "#define READ_IMAGEF(image, sampler, coord) "
[tvm] branch main updated (f64e933246 -> b61f633e10)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from f64e933246 [LLVM] Emit fp16/fp32 builtins directly into target module (#12877) add b61f633e10 [TVM PyTorch Integration] optimized_torch & as_torch how-to guide (#12318) No new revisions were added by this update. Summary of changes: gallery/how_to/work_with_pytorch/using_as_torch.py | 159 + .../work_with_pytorch/using_optimized_torch.py | 149 +++ python/tvm/contrib/torch/as_torch.py | 9 +- python/tvm/contrib/torch/optimize_torch.py | 4 +- 4 files changed, 316 insertions(+), 5 deletions(-) create mode 100644 gallery/how_to/work_with_pytorch/using_as_torch.py create mode 100644 gallery/how_to/work_with_pytorch/using_optimized_torch.py
[tvm] branch main updated (4d5ed07325 -> f64e933246)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 4d5ed07325 [TIR] Fix GetProducer/Consumer for duplicating dep edges (#12910) add f64e933246 [LLVM] Emit fp16/fp32 builtins directly into target module (#12877) No new revisions were added by this update. Summary of changes: src/runtime/builtin_fp16.cc | 3 - src/target/llvm/codegen_llvm.cc | 227 ++ src/target/llvm/codegen_llvm.h| 8 + tests/python/unittest/test_target_codegen_llvm.py | 7 +- tests/python/unittest/test_target_codegen_x86.py | 74 +-- 5 files changed, 298 insertions(+), 21 deletions(-)
[tvm] branch main updated (87085b0e0d -> 4ef1465d40)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 87085b0e0d [frontend][pytorch]Support aten::Tensor_split operator (#12871) add 4ef1465d40 [skip ci] Temporarily disable comments bot (#12903) No new revisions were added by this update. Summary of changes: .github/{workflows => disabled_workflows}/pr_comment_bot.yml | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename .github/{workflows => disabled_workflows}/pr_comment_bot.yml (100%)
[tvm] branch main updated: [frontend][pytorch]Support aten::Tensor_split operator (#12871)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 87085b0e0d [frontend][pytorch]Support aten::Tensor_split operator (#12871) 87085b0e0d is described below commit 87085b0e0dad2a422993472e35431d4f22fd69d8 Author: chengven027-intellif AuthorDate: Mon Sep 26 17:14:33 2022 +0800 [frontend][pytorch]Support aten::Tensor_split operator (#12871) Support aten::Tensor_split operator --- python/tvm/relay/frontend/pytorch.py | 54 +++ tests/python/frontend/pytorch/test_forward.py | 22 +++ 2 files changed, 76 insertions(+) diff --git a/python/tvm/relay/frontend/pytorch.py b/python/tvm/relay/frontend/pytorch.py index c1bf69502b..1b86b120df 100644 --- a/python/tvm/relay/frontend/pytorch.py +++ b/python/tvm/relay/frontend/pytorch.py @@ -559,6 +559,59 @@ class PyTorchOpConverter: return _op.split(data, indices, dim) +def tensor_split(self, inputs, input_types): +# Reference: https://pytorch.org/docs/stable/generated/torch.tensor_split.html +import torch + +if not isinstance(inputs[1], (int, list, tuple, torch.Tensor)): +msg = "indices_or_sections type %s could not be parsed in tensor_split op" % ( +type(inputs[1]) +) +raise AssertionError(msg) + +if isinstance(inputs[1], torch.Tensor) and not ( +list(inputs[1].shape) == [] or list(inputs[1].shape) == 1 +): +msg = "indices_or_sections must be a zero-dimensional or one-dimensional long tensor" +raise AssertionError(msg) + +if isinstance(inputs[1], int) or ( +isinstance(inputs[1], torch.Tensor) and list(inputs[1].shape) == [] +): +data = inputs[0] +n = int(inputs[1]) +dim = int(inputs[2]) + +split_size = int(self.infer_shape(data)[dim] / n) +split_rest = int(self.infer_shape(data)[dim] % n) + +indices = [] +split_index = split_size +if split_rest == 0: +for i in range(n - 1): +indices.append(split_index) +split_index += split_size +else: +for i in range(split_rest): +indices.append(split_index + 1) +split_index = (i + 1) * (split_index + 1) +for i in range(n - split_rest - 1): +split_index += split_size +indices.append(split_index) + +return _op.split(data, indices, dim) +else: +data = inputs[0] +sections = inputs[1] +dim = int(inputs[2]) + +if isinstance(sections, tuple): +sections = list(sections) +elif isinstance(sections, torch.Tensor): +sections = sections.cpu().numpy().tolist() + +return _op.split(data, sections, dim) + def select(self, inputs, input_types): data = inputs[0] dim = int(inputs[1]) @@ -3484,6 +3537,7 @@ class PyTorchOpConverter: "aten::slice": self.slice, "aten::narrow": self.narrow, "aten::split": self.split, +"aten::tensor_split": self.tensor_split, "aten::split_with_sizes": self.split_with_sizes, "aten::select": self.select, "aten::take": self.take, diff --git a/tests/python/frontend/pytorch/test_forward.py b/tests/python/frontend/pytorch/test_forward.py index 33c70a4d74..3c8bd5efd8 100755 --- a/tests/python/frontend/pytorch/test_forward.py +++ b/tests/python/frontend/pytorch/test_forward.py @@ -959,6 +959,28 @@ def test_forward_split(): verify_model(Split([2, 3, 5], 1).float().eval(), input_data=input_data) +@tvm.testing.uses_gpu +def test_forward_tensor_split(): +"""test_forward_tensor_split""" +torch.set_grad_enabled(False) +input_shape = [4, 10] + +class Tensor_Split(Module): +def __init__(self, split_size_or_sections, dim): +super().__init__() +self.split_size_or_sections = split_size_or_sections +self.dim = dim + +def forward(self, *args): +return 
torch.tensor_split(args[0], self.split_size_or_sections, self.dim) + +input_data = torch.rand(input_shape).float() +verify_model(Tensor_Split(2, 0).float().eval(), input_data=input_data) +verify_model(Tensor_Split(torch.tensor(3), 1).float().eval(), input_data=input_data) +verify_model(Tensor_Split([2, 3, 5], 1).float().eval(), input_data=input_data) +verify_model(Tensor_Split((2, 3, 5), 1).float().eval(), input_data=input_data) + + @tvm.testing.uses_gpu def test_forward_avgpool1d(): """test_forward_avgpool1d"""
[tvm] branch main updated (46ea2ed42e -> cc6e01edc6)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 46ea2ed42e [MetaSchedule][UX] User Interface for Jupyter Notebook (#12866) add cc6e01edc6 [frontend][pytorch]support aten::zero_ operator (#12872) No new revisions were added by this update. Summary of changes: python/tvm/relay/frontend/pytorch.py | 5 + tests/python/frontend/pytorch/test_forward.py | 7 +++ 2 files changed, 12 insertions(+)
[tvm] branch main updated (39f71ae288 -> fe75f00991)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 39f71ae288 [frontend][pytorch] Add a new test case for torch aten::fill_ operator implementation (#12857) add fe75f00991 [AutoTVM] Introducing multi_filter into ConfigSpace autotvm (#12545) No new revisions were added by this update. Summary of changes: python/tvm/autotvm/task/space.py | 330 - python/tvm/autotvm/tuner/ga_tuner.py | 108 +++ python/tvm/autotvm/tuner/index_based_tuner.py | 73 ++--- python/tvm/autotvm/tuner/model_based_tuner.py | 40 +-- python/tvm/autotvm/tuner/sa_model_optimizer.py | 39 +-- python/tvm/autotvm/tuner/tuner.py | 1 + python/tvm/autotvm/utils.py| 32 -- python/tvm/topi/adreno/conv2d_nchw.py | 10 +- python/tvm/topi/adreno/conv2d_nhwc.py | 10 +- python/tvm/topi/adreno/conv2d_winograd_common.py | 7 +- python/tvm/topi/adreno/depthwise_conv2d_nchw.py| 9 + python/tvm/topi/adreno/depthwise_conv2d_nhwc.py| 9 + .../python/test_topi_conv2d_hwnc_tensorcore.py | 4 +- tests/python/unittest/test_autotvm_ga_tuner.py | 89 ++ tests/python/unittest/test_autotvm_index_tuner.py | 77 - tests/python/unittest/test_autotvm_space.py| 167 ++- 16 files changed, 758 insertions(+), 247 deletions(-) create mode 100644 tests/python/unittest/test_autotvm_ga_tuner.py
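A hypothetical sketch of how the new ConfigSpace multi_filter hook might be used inside an AutoTVM template. The lambda-over-entity interface and the keyword name shown here are assumptions inferred from the commit title and diffstat, not taken from the change itself.

    from tvm import autotvm, te

    @autotvm.template("example/matmul")
    def matmul(N, M, K):
        A = te.placeholder((N, K), name="A")
        B = te.placeholder((K, M), name="B")
        k = te.reduce_axis((0, K), name="k")
        C = te.compute((N, M), lambda i, j: te.sum(A[i, k] * B[k, j], axis=k), name="C")
        s = te.create_schedule(C.op)

        cfg = autotvm.get_config()
        y, x = s[C].op.axis
        cfg.define_split("tile_y", y, num_outputs=2)
        cfg.define_split("tile_x", x, num_outputs=2)
        # Joint constraint across knobs (assumed interface): drop configurations
        # whose combined inner tile exceeds a 256-element budget.
        cfg.multi_filter(
            filter=lambda e: e["tile_y"].size[1] * e["tile_x"].size[1] <= 256
        )

        yo, yi = cfg["tile_y"].apply(s, C, y)
        xo, xi = cfg["tile_x"].apply(s, C, x)
        s[C].reorder(yo, xo, yi, xi)
        return s, [A, B, C]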
[tvm] branch main updated (7aef584c0f -> 39f71ae288)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 7aef584c0f [Hybrid] Fix sys version check (#12837) add 39f71ae288 [frontend][pytorch] Add a new test case for torch aten::fill_ operator implementation (#12857) No new revisions were added by this update. Summary of changes: python/tvm/relay/frontend/pytorch.py | 8 ++-- tests/python/frontend/pytorch/test_forward.py | 10 ++ 2 files changed, 16 insertions(+), 2 deletions(-)
[tvm] branch main updated (3c8a94bd4e -> c0c7569529)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 3c8a94bd4e [frontend][torch] Support aten::relu6 operator (#12855) add c0c7569529 Allow failures in pr_comment_bot for now (#12860) No new revisions were added by this update. Summary of changes: .github/workflows/pr_comment_bot.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[tvm] branch main updated (da0e5e3be2 -> 3c8a94bd4e)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from da0e5e3be2 [Utils] Disable automatic move constructor for tvm::With (#12822) add 3c8a94bd4e [frontend][torch] Support aten::relu6 operator (#12855) No new revisions were added by this update. Summary of changes: python/tvm/relay/frontend/pytorch.py | 5 + tests/python/frontend/pytorch/test_forward.py | 9 + 2 files changed, 14 insertions(+)
[tvm] branch main updated (52dbf102cd -> fa5045bf69)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 52dbf102cd Fix caffe, boost install in Python venvs by creating python3.X link (#12828) add fa5045bf69 [Metaschedule] MultiLevelTiling for wide vector architectures (#12845) No new revisions were added by this update. Summary of changes: include/tvm/meta_schedule/schedule_rule.h | 15 +++ python/tvm/meta_schedule/schedule_rule/__init__.py | 1 + .../schedule_rule/multi_level_tiling.py| 37 +++ .../schedule_rule/multi_level_tiling.cc| 35 -- .../schedule_rule/multi_level_tiling.h | 3 + .../multi_level_tiling_wide_vector.cc | 120 + .../test_meta_schedule_schedule_rule_mlt.py| 108 ++- 7 files changed, 307 insertions(+), 12 deletions(-) create mode 100644 src/meta_schedule/schedule_rule/multi_level_tiling_wide_vector.cc
[tvm] branch main updated (a75dcabd3f -> e18b48bed8)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from a75dcabd3f [MetaSchedule] PyDatabase Complete Function Reload Support (#12838) add e18b48bed8 [Fix] naming outputs of graph nodes by op_name:output_index (#12809) No new revisions were added by this update. Summary of changes: src/runtime/graph_executor/graph_executor.cc | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
[tvm] 03/03: support constant folding on ndarray_size
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch torchbench in repository https://gitbox.apache.org/repos/asf/tvm.git commit bacf3946c727682e7aad82f03e34abbbd9f120a2 Author: Masahiro Masuda AuthorDate: Wed Sep 14 13:09:45 2022 +0900 support constant folding on ndarray_size --- python/tvm/relay/frontend/pytorch.py | 2 +- src/relay/transforms/fold_constant.cc | 10 -- 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/python/tvm/relay/frontend/pytorch.py b/python/tvm/relay/frontend/pytorch.py index e2badaabf7..722b2889d3 100644 --- a/python/tvm/relay/frontend/pytorch.py +++ b/python/tvm/relay/frontend/pytorch.py @@ -2489,7 +2489,7 @@ class PyTorchOpConverter: ) def numel(self, inputs, input_types): -return _op.ndarray_size(inputs[0]) +return fold_constant(_op.ndarray_size(inputs[0])) def empty(self, inputs, input_types): shape = inputs[0] diff --git a/src/relay/transforms/fold_constant.cc b/src/relay/transforms/fold_constant.cc index 9dec840be0..f484dfc700 100644 --- a/src/relay/transforms/fold_constant.cc +++ b/src/relay/transforms/fold_constant.cc @@ -188,8 +188,7 @@ class ConstantFolder : public MixedModeMutator { if (is_no_computational && (is_no_qnn_canonicalized || !fold_qnn_)) { return std::move(post_call); } -if (op == device_copy_op_ || op == shape_of_op_ || op == vm_shape_of_op_ || -op == ndarray_size_op_) { +if (op == device_copy_op_ || op == shape_of_op_ || op == vm_shape_of_op_) { // We should think about potentially constant evaluation over these ops too. return std::move(post_call); } @@ -383,6 +382,13 @@ class ConstantFolder : public MixedModeMutator { // TODO(mbs): This is not necessary since we only ever ask for the shapes for // pre-rewritten expressions which will always have a checked_type. return const_node->tensor_type()->shape; + //} else if (auto ttype = input->type_as()) { +} else if (const auto* var = input.as()) { + auto ty = var->type_annotation; + if (ty->IsInstance()) { +return Downcast(ty)->shape; + } + return {}; } else if (input->checked_type_.defined()) { return input->checked_type().as()->shape; } else {
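A minimal sketch (not part of the commit) of the behaviour the fold_constant.cc change enables: with the input's static shape available from its type annotation, FoldConstant should now collapse ndarray_size into a constant instead of leaving the call in the graph.

    import tvm
    from tvm import relay

    x = relay.var("x", shape=(2, 3, 4), dtype="float32")
    mod = tvm.IRModule.from_expr(relay.Function([x], relay.ndarray_size(x)))
    mod = relay.transform.InferType()(mod)
    mod = relay.transform.FoldConstant()(mod)
    print(mod)  # the body should fold to the constant 24 (total number of elements)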
[tvm] 01/03: add copy_ and embedding_bag
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch torchbench in repository https://gitbox.apache.org/repos/asf/tvm.git commit 292f55b59b7e82381cb339bfb6f0885b866f097d Author: YJ Shi AuthorDate: Wed Jul 6 00:52:34 2022 -0700 add copy_ and embedding_bag --- python/tvm/relay/frontend/pytorch.py | 65 +++ tests/python/frontend/pytorch/test_forward.py | 9 2 files changed, 74 insertions(+) diff --git a/python/tvm/relay/frontend/pytorch.py b/python/tvm/relay/frontend/pytorch.py index 0e6d4caae0..9255c42383 100644 --- a/python/tvm/relay/frontend/pytorch.py +++ b/python/tvm/relay/frontend/pytorch.py @@ -39,7 +39,11 @@ from ..loops import while_loop from ..prelude import Prelude, StaticTensorArrayOps from ..ty import Any, TensorType, TupleType from . import qnn_torch +<<<<<<< HEAD from .common import AttrCvt, get_relay_op, gru_cell, logger, rnn_cell +=== +from .common import AttrCvt, fold_constant, get_relay_op, gru_cell, infer_shape, logger +>>>>>>> dfcf28b5d... add copy_ and embedding_bag from .common import infer_shape as _infer_shape from .common import infer_value as _infer_value from .common import infer_value_simulated as _infer_value_simulated @@ -811,6 +815,10 @@ class PyTorchOpConverter: fill_value = inputs[1] return self.full_impl(self.infer_shape(data), fill_value, input_types[0]) +def copy_(self, inputs, input_types): +src = inputs[1] +return _op.tensor.copy(src) + def linspace(self, inputs, input_types): start = inputs[0] stop = inputs[1] @@ -3407,6 +3415,61 @@ class PyTorchOpConverter: output = _op.random.multinomial(key, probs, num_samples) _, indices = _expr.TupleWrapper(output, 2) return indices + +def embedding_bag(self, inputs, _): +assert len(inputs) == 9, "embedding_bag needs 9 arguments" +( +weights, +indices, +offsets_1d, +scale_grad_by_freq, +mode, +sparse, +per_sample_weights, +include_last_offset, +padding_idx, +) = inputs + +assert scale_grad_by_freq == 0, "scale_grad_by_freq not supported in embedding_bag." +assert padding_idx == None, "padding_idx not supported in embedding_bag." + +assert len(infer_shape(indices)) == 1, "Expects 1D indices for aten::embedding_bag." + +offsets_const_fold = fold_constant(offsets_1d) + +assert isinstance( +offsets_const_fold, _expr.Constant +), "Only constant offsets are supported." + +offsets_np = offsets_const_fold.data.numpy() +if include_last_offset == 1: +offsets_np = offsets_np[..., 0] # exclude last dimension +offsets_diff = np.diff(offsets_np) + +assert np.all(offsets_diff[1:] == offsets_diff[0]), "Only 2D cases supported for now." + +indices_2d = _op.reshape(indices, (-1, offsets_diff[0])) + +mode_map = {0: _op.sum, 1: _op.mean, 2: _op.max} +assert mode in mode_map, "unsupported reduction op mode %d." % mode + +reduce_op = mode_map[mode] + +# TOOD(masahi): Implementing embedding_bag in terms of gather and reduce defeats the +# purpose of using this op. Implement Relay / topi op for fused gather and reduce. +gather = _op.take(weights, indices_2d, axis=0) +if per_sample_weights is not None: +if mode != 0: +raise NotImplementedError( +"Only mode 'sum' is supported when per_sample_weights is passed." +) +gather = gather * per_sample_weights +reduced = reduce_op(gather, 1) +# pytorch/aten/src/ATen/native/EmbeddingBag.cpp shows that aten::embedding_bag returns +# 4 outputs: output, offset2bag, bag_size, max_indices +# The Python version of the op only returns the first output, so we also support only the +# first output. If the model uses other outputs, the conversion would fail. 
+return reduced, None, None, None # Operator mappings def create_convert_map(self): @@ -3444,6 +3507,7 @@ class PyTorchOpConverter: "aten::full_like": self.full_like, "aten::new_full": self.new_full, "aten::fill_": self.fill_, +"aten::copy_": self.copy_, "aten::linspace": self.linspace, "aten::reciprocal": self.reciprocal, "aten::repeat": self.repeat, @@ -3670,6 +3734,7 @@ class PyTorchOpConverter: "aten::__lshift__": self.make_elemwise("left_shift"), "aten::__rshift__": self.make_ele
[tvm] branch torchbench created (now bacf3946c7)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch torchbench in repository https://gitbox.apache.org/repos/asf/tvm.git at bacf3946c7 support constant folding on ndarray_size This branch includes the following new commits: new 292f55b59b add copy_ and embedding_bag new 5aa6d7c253 fix rebase new bacf3946c7 support constant folding on ndarray_size The 3 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.
[tvm] 02/03: fix rebase
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch torchbench in repository https://gitbox.apache.org/repos/asf/tvm.git commit 5aa6d7c25360ee5b339c42f8c9f5c655943a4333 Author: YJ Shi AuthorDate: Tue Sep 13 16:15:09 2022 -0700 fix rebase --- python/tvm/relay/frontend/pytorch.py | 12 tests/python/frontend/pytorch/test_forward.py | 7 +++ 2 files changed, 7 insertions(+), 12 deletions(-) diff --git a/python/tvm/relay/frontend/pytorch.py b/python/tvm/relay/frontend/pytorch.py index 9255c42383..e2badaabf7 100644 --- a/python/tvm/relay/frontend/pytorch.py +++ b/python/tvm/relay/frontend/pytorch.py @@ -39,11 +39,7 @@ from ..loops import while_loop from ..prelude import Prelude, StaticTensorArrayOps from ..ty import Any, TensorType, TupleType from . import qnn_torch -<<<<<<< HEAD -from .common import AttrCvt, get_relay_op, gru_cell, logger, rnn_cell -=== -from .common import AttrCvt, fold_constant, get_relay_op, gru_cell, infer_shape, logger ->>>>>>> dfcf28b5d... add copy_ and embedding_bag +from .common import AttrCvt, fold_constant, get_relay_op, gru_cell, logger from .common import infer_shape as _infer_shape from .common import infer_value as _infer_value from .common import infer_value_simulated as _infer_value_simulated @@ -3415,7 +3411,7 @@ class PyTorchOpConverter: output = _op.random.multinomial(key, probs, num_samples) _, indices = _expr.TupleWrapper(output, 2) return indices - + def embedding_bag(self, inputs, _): assert len(inputs) == 9, "embedding_bag needs 9 arguments" ( @@ -3433,10 +3429,10 @@ class PyTorchOpConverter: assert scale_grad_by_freq == 0, "scale_grad_by_freq not supported in embedding_bag." assert padding_idx == None, "padding_idx not supported in embedding_bag." -assert len(infer_shape(indices)) == 1, "Expects 1D indices for aten::embedding_bag." +assert len(_infer_shape(indices)) == 1, "Expects 1D indices for aten::embedding_bag." offsets_const_fold = fold_constant(offsets_1d) - +print(offsets_const_fold) assert isinstance( offsets_const_fold, _expr.Constant ), "Only constant offsets are supported." diff --git a/tests/python/frontend/pytorch/test_forward.py b/tests/python/frontend/pytorch/test_forward.py index f9ff4a212c..58a4dfbe94 100755 --- a/tests/python/frontend/pytorch/test_forward.py +++ b/tests/python/frontend/pytorch/test_forward.py @@ -4608,7 +4608,6 @@ def test_mod(): verify_model(test_fn, [torch.tensor([1, 2, 3, 4, 5]), torch.tensor(-1.5)]) -<<<<<<< HEAD def test_softmax_fuse(): # https://github.com/apache/tvm/issues/12001 class Model(torch.nn.Module): @@ -4686,15 +4685,15 @@ def test_multinomial(): _test_multinomial(1), [torch.rand(size=[4, 5]).float()], cpu_only=True, -check_correctness=False, -=== +) + + def test_embedding_bag(): embedding_matrix = torch.rand(10, 3) inp = torch.tensor([[1, 2, 4, 5], [4, 3, 2, 9], [6, 7, 8, 9]]) verify_model( F.embedding_bag, [inp, embedding_matrix], ->>>>>>> dfcf28b5d... add copy_ and embedding_bag )
[tvm] branch main updated (76f91b42b9 -> 286fadecb8)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 76f91b42b9 [ETHOSN] Update driver stack version to 22.08 (#12650) add 286fadecb8 [TF] Add Bincount support (#12751) No new revisions were added by this update. Summary of changes: python/tvm/relay/frontend/tensorflow_ops.py| 41 ++- tests/python/frontend/tensorflow/test_forward.py | 35 + .../frontend/tensorflow2/test_functional_models.py | 60 ++ 3 files changed, 135 insertions(+), 1 deletion(-)
[tvm] branch main updated (14999f8add -> 574794e915)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 14999f8add [TVMScript][TIR] Clarify scope of BlockNode::iter_vars (#12726) add 574794e915 [OpenCL] Enable OpenCL for GPU tests (#12490) No new revisions were added by this update. Summary of changes: src/runtime/opencl/opencl_common.h | 2 +- tests/cpp-runtime/opencl/opencl_timer_test.cc | 1 + tests/cpp-runtime/opencl/run_gtests.cc | 2 +- .../python/contrib/test_opencl/test_run_gtests.py | 1 + tests/python/driver/tvmc/test_compiler.py | 3 +- .../test_conv2d_nchw_texture.py| 107 - .../test_conv2d_nhwc_texture.py| 92 +++--- .../test_depthwise_conv2d_nchw_texture.py | 26 ++--- .../test_depthwise_conv2d_nhwc_texture.py | 32 +++--- .../{ => opencl_texture}/utils/adreno_utils.py | 0 .../python/unittest/test_target_codegen_vulkan.py | 3 + tests/scripts/task_config_build_gpu.sh | 1 + tests/scripts/task_python_integration.sh | 6 +- tests/scripts/task_python_integration_gpuonly.sh | 3 +- 14 files changed, 112 insertions(+), 167 deletions(-) rename tests/python/relay/{ => opencl_texture}/test_conv2d_nchw_texture.py (90%) rename tests/python/relay/{ => opencl_texture}/test_conv2d_nhwc_texture.py (87%) rename tests/python/relay/{ => opencl_texture}/test_depthwise_conv2d_nchw_texture.py (91%) rename tests/python/relay/{ => opencl_texture}/test_depthwise_conv2d_nhwc_texture.py (91%) rename tests/python/relay/{ => opencl_texture}/utils/adreno_utils.py (100%)
[tvm] branch main updated (1c5ffc67ad -> cb08a1251f)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 1c5ffc67ad [ci][docker] Use CMake 3.20.0 for cortexm (#12744) add cb08a1251f [TF] Add DenseBincount support (#12728) No new revisions were added by this update. Summary of changes: python/tvm/relay/frontend/tensorflow_ops.py | 55 tests/python/frontend/tensorflow/test_forward.py | 41 ++ 2 files changed, 96 insertions(+)
[tvm] branch main updated (abb2aa062f -> 6be04d72c2)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from abb2aa062f [TIR] Add unroll_loop_with_partition_hint_no_interval attr in LoopPartitionConfig to unroll loop (#12631) add 6be04d72c2 [OpenCLML] CLML Profiling fixes corresponding to OpenCL Timer recent … (#12711) No new revisions were added by this update. Summary of changes: src/runtime/contrib/clml/clml_runtime.cc | 161 +++ tests/python/contrib/test_clml/infrastructure.py | 6 +- tests/python/contrib/test_clml/test_network.py | 4 +- tests/python/contrib/test_clml/test_ops.py | 2 +- 4 files changed, 80 insertions(+), 93 deletions(-)
[tvm] branch main updated: support false-positive fast math (#12702)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 291dd2f063 support false-positive fast math (#12702) 291dd2f063 is described below commit 291dd2f06331342f5c89216d5d211cb61fe3d19f Author: cery999 <112694109+cery...@users.noreply.github.com> AuthorDate: Wed Sep 7 15:06:31 2022 +0800 support false-positive fast math (#12702) --- include/tvm/topi/elemwise.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/tvm/topi/elemwise.h b/include/tvm/topi/elemwise.h index fc9ab13988..f26105cb18 100644 --- a/include/tvm/topi/elemwise.h +++ b/include/tvm/topi/elemwise.h @@ -81,7 +81,7 @@ TOPI_DECLARE_UNARY_OP(isinf); inline Tensor fast_tanh_float(const Tensor& in, std::string name, std::string tag) { // Clamp the inputs to the range [-9, 9] since anything outside // this range is +/-1.0f in single-precision. - auto x = maximum(minimum(in, make_const(in->dtype, 9.0)), make_const(in->dtype, -9.0)); + auto x = maximum(make_const(in->dtype, -9.0), minimum(make_const(in->dtype, 9.0), in)); // The monomial coefficients of the numerator polynomial (odd). auto alpha_1 = make_const(in->dtype, 4.89352455891786e-03);
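The patch above only reorders the operands of the clamp: for finite inputs the two forms are mathematically identical, and the reordering presumably targets how non-finite values behave once fast math is enabled (the cryptic commit title suggests as much). A minimal NumPy sketch of the reordered clamp, illustrative only and not the TOPI implementation:

```python
# Minimal sketch of the clamp used by fast_tanh_float after the patch:
# constants first, input last. For finite inputs this equals
# max(min(x, 9), -9); any difference would only show up for non-finite
# inputs under fast-math settings.
import numpy as np

def clamp_for_fast_tanh(x):
    return np.maximum(-9.0, np.minimum(9.0, x))

x = np.array([-20.0, -0.5, 0.0, 3.0, 20.0], dtype=np.float32)
print(clamp_for_fast_tanh(x))  # -> [-9.  -0.5  0.   3.   9. ]
```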
[tvm] branch main updated (d4201a9d8e -> 141b17b23a)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from d4201a9d8e [COMMUNITY] ekalda -> Committer (#12715) add 141b17b23a [Hexagon] Add optimized schedule for nn.pad (#12714) No new revisions were added by this update. Summary of changes: python/tvm/relay/op/nn/_nn.py | 2 +- python/tvm/relay/op/strategy/generic.py| 8 +++ python/tvm/relay/op/strategy/hexagon.py| 7 +++ python/tvm/topi/hexagon/__init__.py| 1 + python/tvm/topi/hexagon/{pooling.py => pad.py} | 22 + tests/python/contrib/test_hexagon/topi/test_pad.py | 57 ++ 6 files changed, 87 insertions(+), 10 deletions(-) copy python/tvm/topi/hexagon/{pooling.py => pad.py} (71%) create mode 100644 tests/python/contrib/test_hexagon/topi/test_pad.py
[tvm] branch main updated (4acddb1d03 -> b2d6600064)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 4acddb1d03 [COMMUNITY] Yaxing Cai -> Reviewer (#12683) add b2d6600064 [PyTorch] Fix aten::arange for pytorch (#12681) No new revisions were added by this update. Summary of changes: python/tvm/relay/frontend/pytorch.py | 32 ++-- 1 file changed, 14 insertions(+), 18 deletions(-)
[tvm] branch main updated: [Relay] Extract intermediate node by its expression ID (#12646)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 38ba8c0bb6 [Relay] Extract intermediate node by its expression ID (#12646) 38ba8c0bb6 is described below commit 38ba8c0bb69dd76203a96ba6b2a5c067fe0b2ba0 Author: sisleyli <43139237+sisle...@users.noreply.github.com> AuthorDate: Thu Sep 1 18:32:42 2022 +0800 [Relay] Extract intermediate node by its expression ID (#12646) [Relay] Extract Intermediate Expr by relay expr ID for analysis modify doc comments Co-authored-by: Bin Li --- python/tvm/relay/analysis/analysis.py | 38 ++ src/relay/analysis/extract_intermediate_expr.cc| 88 ++ .../test_analysis_extract_intermediate_expr.py | 130 + 3 files changed, 256 insertions(+) diff --git a/python/tvm/relay/analysis/analysis.py b/python/tvm/relay/analysis/analysis.py index 3b38c07a0a..12f659f003 100644 --- a/python/tvm/relay/analysis/analysis.py +++ b/python/tvm/relay/analysis/analysis.py @@ -431,3 +431,41 @@ def get_calibration_data(mod, data): calib_data[gvar] = value return calib_data + + +def extract_intermdeiate_expr(mod, expr_id): +"""Extract Relay Expr by its expression ID + +This function is used for extracting Relay Expr +by its expression ID of the main function +that we can see in `print(mod["main"])`. + +Parameters +-- +mod : tvm.IRModule + +expr_id : the Expr ID that we want to extract + +Returns +--- +ret : Extracted IRModule + +Examples + +.. code-block:: python + +# Suppose our module is printed like this: +# def @main(%x: Tensor[(1, 1, 5, 1), float32], %w1, %w2) { +# %0 = nn.conv2d(%x, %w1, padding=[1, 1, 1, 1], channels=1, kernel_size=[3, 3]); +# %1 = nn.conv2d(%0, %w2, padding=[1, 1, 1, 1], channels=1, kernel_size=[3, 3]); +# %2 = add(%0, %1); +# %3 = split(%2, indices_or_sections=1); +# %4 = %3.0; +# add(%4, 1f) +# } +# if we want to extract `%1 = nn.conv2d` +from tvm import relay + +relay.analysis.extract_intermdeiate_expr(mod, 1) +""" +return _ffi_api.ExtractIntermediateExpr(mod, expr_id) diff --git a/src/relay/analysis/extract_intermediate_expr.cc b/src/relay/analysis/extract_intermediate_expr.cc new file mode 100644 index 00..d7466e2729 --- /dev/null +++ b/src/relay/analysis/extract_intermediate_expr.cc @@ -0,0 +1,88 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +/*! + * \file extract_intermediate_expr.cc + * \brief Used for extracting Relay Expr +by the expression ID of the main function +that we can see in `print(mod["main"])`. 
+ */ +#include +#include +#include +#include + +namespace tvm { +namespace relay { + +class ExtractIntermediateExprWrapper : private MixedModeVisitor { + public: + explicit ExtractIntermediateExprWrapper(const IRModule& mod, const int expr_id) + : mod_(mod), target_expr_id_(expr_id), counter_(0) {} + + IRModule Extract() { +VisitExpr(this->mod_->Lookup("main")); + +// ensure the target expr_id we want to extract is valid. +ICHECK(target_expr_id_ >= 0 && target_expr_id_ < counter_); + +return IRModule::FromExpr(target_op_, {}); + } + + private: + using MixedModeVisitor::VisitExpr_; + + const IRModule mod_; + /*! \brief the expr id that we want to extract. */ + const int target_expr_id_; + int counter_; + Expr target_op_; + + void VisitExpr_(const CallNode* n) final { +CheckCounterAndIncrease(GetRef(n)); +MixedModeVisitor::VisitExpr_(n); + } + + void VisitExpr_(const TupleNode* n) final { +CheckCounterAndIncrease(GetRef(n)); +MixedModeVisitor::VisitExpr_(n); + } + + void VisitExpr_(const TupleGetItemNode* n) final { +CheckCounterAndIncrease(GetRef(n)); +MixedModeVisitor::VisitExpr_(n); + } + + voi
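The email above is cut off mid-file, but the docstring earlier in the diff is complete enough to reconstruct an end-to-end usage sketch. The module below follows the docstring example; note that the public name keeps the misspelling (`extract_intermdeiate_expr`) exactly as committed:

```python
# Hedged usage sketch based on the docstring example in the commit above.
import tvm
from tvm import relay

x = relay.var("x", shape=(1, 1, 5, 1), dtype="float32")
w1 = relay.var("w1")
w2 = relay.var("w2")
conv0 = relay.nn.conv2d(x, w1, padding=[1, 1, 1, 1], channels=1, kernel_size=[3, 3])
conv1 = relay.nn.conv2d(conv0, w2, padding=[1, 1, 1, 1], channels=1, kernel_size=[3, 3])
summed = relay.add(conv0, conv1)
split = relay.split(summed, indices_or_sections=1)
out = relay.add(split[0], relay.const(1.0))
mod = tvm.IRModule.from_expr(relay.Function([x, w1, w2], out))

# Extract `%1 = nn.conv2d(...)` -- expression ID 1 as shown by print(mod["main"]).
# The function name is spelled exactly as in the committed code.
sub_mod = relay.analysis.extract_intermdeiate_expr(mod, 1)
print(sub_mod)
```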
[tvm] branch main updated (3d41ac3a9a -> c5c99a4b52)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 3d41ac3a9a [Refactor] Replace std::tie with structured bindings (#12610) add c5c99a4b52 [QNN] Align output_scale/zero_point of sigmoid to Torch (#12624) No new revisions were added by this update. Summary of changes: python/tvm/relay/frontend/pytorch.py | 6 ++--- python/tvm/relay/frontend/qnn_torch.py | 40 +++--- 2 files changed, 39 insertions(+), 7 deletions(-)
[tvm] branch main updated (648a29a53a -> 3d41ac3a9a)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 648a29a53a [MetaSchedule] Introduce `ScheduleFnDatabase` (#12626) add 3d41ac3a9a [Refactor] Replace std::tie with structured bindings (#12610) No new revisions were added by this update. Summary of changes: src/auto_scheduler/auto_schedule.cc| 4 +- src/auto_scheduler/compute_dag.cc | 17 +++- src/auto_scheduler/feature.cc | 9 +--- src/auto_scheduler/search_policy/search_policy.cc | 4 +- .../search_policy/sketch_policy_rules.cc | 3 +- src/ir/instrument.cc | 5 +-- src/meta_schedule/database/json_database.cc| 4 +- .../mutator/mutate_compute_location.cc | 4 +- .../schedule_rule/cross_thread_reduction.cc| 6 +-- .../space_generator/post_order_apply.cc| 4 +- src/relay/collage/partition_rule.cc| 12 ++--- src/relay/collage/sub_graph.cc | 8 +--- src/relay/qnn/op/convolution.cc| 4 +- src/relay/qnn/op/leaky_relu.cc | 6 +-- src/relay/qnn/op/requantize.cc | 3 +- src/relay/qnn/utils.cc | 6 +-- src/relay/quantize/realize.cc | 6 +-- src/relay/transforms/combine_parallel_conv2d.cc| 4 +- src/relay/transforms/combine_parallel_dense.cc | 4 +- src/runtime/graph_executor/graph_executor.cc | 4 +- src/target/source/ptx.cc | 12 ++--- src/te/autodiff/ad_simplify.cc | 10 ++--- src/te/autodiff/ad_utils.cc| 8 +--- src/te/autodiff/jacobian.cc| 4 +- src/tir/schedule/analysis/analysis.cc | 11 +++-- src/tir/schedule/primitive/block_annotate.cc | 4 +- .../schedule/primitive/layout_transformation.cc| 12 ++--- src/tir/schedule/primitive/loop_transformation.cc | 4 +- src/tir/schedule/primitive/reduction.cc| 12 ++--- src/tir/schedule/primitive/sampling.cc | 4 +- src/tir/transforms/loop_partition.cc | 51 ++ src/tir/transforms/lower_cross_thread_reduction.cc | 15 ++- src/tir/transforms/lower_thread_allreduce.cc | 5 +-- src/tir/transforms/lower_warp_memory.cc| 11 ++--- .../manifest_shared_memory_local_stage.cc | 10 + 35 files changed, 105 insertions(+), 185 deletions(-)
[tvm] branch main updated: [MetaSchedule] Introduce `ScheduleFnDatabase` (#12626)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 648a29a53a [MetaSchedule] Introduce `ScheduleFnDatabase` (#12626) 648a29a53a is described below commit 648a29a53a641f1e923220600dce9c9215104879 Author: Junru Shao AuthorDate: Mon Aug 29 00:34:11 2022 -0700 [MetaSchedule] Introduce `ScheduleFnDatabase` (#12626) Following #12520, this PR introduces `ScheduleFnDatabase`, a mocked database to allow injecting handcrafted schedules provided by a schedule function. The schedule function comes with the following signature: ```python def schedule_fn( sch: tir.Schedule, ) -> bool: task_name = sch.mod.attrs["task_name"] # ^^^ provides an optional name of the task queried ... ``` This mocked database helps incorporate the existing testing utility `apply_fixed_schedule` more formally into the MetaSchedule-Relay build pipeline, and allows further extension to Relax with the same interface. Next as another follow-up, we will introduce ConcatDatabase that allows mixing multiple databases, including the mocked and ones from JSON files. --- include/tvm/meta_schedule/database.h | 19 +++- python/tvm/meta_schedule/database/__init__.py | 1 + python/tvm/meta_schedule/database/database.py | 41 ++-- .../{__init__.py => schedule_fn_database.py} | 29 -- python/tvm/meta_schedule/testing/utils.py | 83 - src/meta_schedule/database/database.cc | 13 ++- src/meta_schedule/database/memory_database.cc | 10 +- src/meta_schedule/database/schedule_fn_database.cc | 103 + src/relay/backend/te_compiler_cache.cc | 5 +- tests/python/unittest/test_link_params.py | 15 ++- .../unittest/test_meta_schedule_multi_anchor.py| 8 +- .../test_meta_schedule_relay_tir_compute.py| 18 ++-- .../unittest/test_meta_schedule_tune_relay.py | 7 +- 13 files changed, 210 insertions(+), 142 deletions(-) diff --git a/include/tvm/meta_schedule/database.h b/include/tvm/meta_schedule/database.h index 0e7f45d393..88db2e2277 100644 --- a/include/tvm/meta_schedule/database.h +++ b/include/tvm/meta_schedule/database.h @@ -207,23 +207,29 @@ class DatabaseNode : public runtime::Object { * \brief Query the best record of the given workload from the database. * \param mod The IRModule to be searched for. * \param target The target to be searched for. + * \param workload_name The name of the workload to be searched for. * \return The best record of the given workload; NullOpt if not found. */ - virtual Optional QueryTuningRecord(IRModule mod, Target target); + virtual Optional QueryTuningRecord(const IRModule& mod, const Target& target, + const String& workload_name); /*! * \brief Query the best schedule of the given workload from the database. * \param mod The IRModule to be searched for. * \param target The target to be searched for. + * \param workload_name The name of the workload to be searched for. * \return The schedule in the best schedule of the given workload; NullOpt if not found. */ - virtual Optional QuerySchedule(IRModule mod, Target target); + virtual Optional QuerySchedule(const IRModule& mod, const Target& target, +const String& workload_name); /*! * \brief Query the best IRModule of the given workload from the database. * \param mod The IRModule to be searched for. * \param target The target to be searched for. + * \param workload_name The name of the workload to be searched for. * \return The IRModule in the best IRModule of the given workload; NullOpt if not found. 
*/ - virtual Optional QueryIRModule(IRModule mod, Target target); + virtual Optional QueryIRModule(const IRModule& mod, const Target& target, + const String& workload_name); static constexpr const char* _type_key = "meta_schedule.Database"; TVM_DECLARE_BASE_OBJECT_INFO(DatabaseNode, runtime::Object); @@ -336,6 +342,13 @@ class Database : public runtime::ObjectRef { public: /*! An in-memory database. */ TVM_DLL static Database MemoryDatabase(); + /*! + * \brief A database for injecting handcrafted schedule functions. + * \param schedule_fn The function to do scheduling, which takes a TIR schedule, + * and returns a boolean indicating if the schedule is successful. + */ + TVM_DLL static Database ScheduleFnDatabase( + runtime::TypedPackedFunc schedule_fn); /*! * \brief Create a default database that uses JSON file
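A minimal sketch of how the new database might be constructed from Python, assuming the class is exposed as `tvm.meta_schedule.database.ScheduleFnDatabase` (consistent with the `schedule_fn_database.py` file added above); the surrounding Relay build wiring is omitted:

```python
# Hedged sketch: injecting a handcrafted schedule through ScheduleFnDatabase.
from tvm import meta_schedule as ms
from tvm import tir


def schedule_fn(sch: tir.Schedule) -> bool:
    # The queried task name is attached to the module, as described above.
    task_name = sch.mod.attrs["task_name"]
    if "dense" in task_name:
        # ... apply a handcrafted schedule to `sch` here ...
        return True   # schedule injected for this task
    return False      # fall back to the normal pipeline for other tasks


database = ms.database.ScheduleFnDatabase(schedule_fn)
# `database` can then be passed wherever a MetaSchedule database is expected,
# e.g. when querying best records during compilation.
```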
[tvm] branch main updated (a9f7c32e42 -> 3224817d08)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from a9f7c32e42 [skip ci][Community] Wuwei Lin -> PMC (#12605) add 3224817d08 [TOPI][Bugfix] Make semantics of empty `axis` in `squeeze` consistent with Relay (#12596) No new revisions were added by this update. Summary of changes: include/tvm/topi/transform.h| 4 ++-- tests/python/topi/python/test_topi_transform.py | 5 +++-- 2 files changed, 5 insertions(+), 4 deletions(-)
[tvm] branch docker-hex-update created (now f845432e18)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch docker-hex-update in repository https://gitbox.apache.org/repos/asf/tvm.git at f845432e18 [CI] Update Hexagon image to install boost This branch includes the following new commits: new f845432e18 [CI] Update Hexagon image to install boost The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.
[tvm] 01/01: [CI] Update Hexagon image to install boost
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch docker-hex-update in repository https://gitbox.apache.org/repos/asf/tvm.git commit f845432e1803309e4ede9f89b6faf103ff4519ad Author: Masahiro Masuda AuthorDate: Fri Aug 26 15:09:50 2022 +0900 [CI] Update Hexagon image to install boost --- Jenkinsfile | 4 ++-- ci/jenkins/Jenkinsfile.j2 | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/Jenkinsfile b/Jenkinsfile index 8c1ce9ed50..3278e83098 100755 --- a/Jenkinsfile +++ b/Jenkinsfile @@ -45,7 +45,7 @@ // 'python3 jenkins/generate.py' // Note: This timestamp is here to ensure that updates to the Jenkinsfile are // always rebased on main before merging: -// Generated at 2022-08-19T15:38:38.311410 +// Generated at 2022-08-26T15:09:39.104767 import org.jenkinsci.plugins.pipeline.modeldefinition.Utils // NOTE: these lines are scanned by docker/dev_common.sh. Please update the regex as needed. --> @@ -57,7 +57,7 @@ ci_wasm = 'tlcpack/ci-wasm:20220810-060142-fae79bbc3' ci_i386 = 'tlcpack/ci-i386:20220810-060142-fae79bbc3' ci_cortexm = 'tlcpack/ci-cortexm:20220810-060142-fae79bbc3' ci_arm = 'tlcpack/ci-arm:20220810-060142-fae79bbc3' -ci_hexagon = 'tlcpack/ci-hexagon:20220810-060142-fae79bbc3' +ci_hexagon = 'tlcpack/ci-hexagon:20220825-145056-fb7cf97f' ci_riscv = 'tlcpack/ci-riscv:20220810-060142-fae79bbc3' // <--- End of regex-scanned config. diff --git a/ci/jenkins/Jenkinsfile.j2 b/ci/jenkins/Jenkinsfile.j2 index be2776c6d9..c932431a44 100644 --- a/ci/jenkins/Jenkinsfile.j2 +++ b/ci/jenkins/Jenkinsfile.j2 @@ -59,7 +59,7 @@ ci_wasm = 'tlcpack/ci-wasm:20220810-060142-fae79bbc3' ci_i386 = 'tlcpack/ci-i386:20220810-060142-fae79bbc3' ci_cortexm = 'tlcpack/ci-cortexm:20220810-060142-fae79bbc3' ci_arm = 'tlcpack/ci-arm:20220810-060142-fae79bbc3' -ci_hexagon = 'tlcpack/ci-hexagon:20220810-060142-fae79bbc3' +ci_hexagon = 'tlcpack/ci-hexagon:20220825-145056-fb7cf97f' ci_riscv = 'tlcpack/ci-riscv:20220810-060142-fae79bbc3' // <--- End of regex-scanned config.
[tvm] branch ci-docker-staging updated: pull from tlcpack
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch ci-docker-staging in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/ci-docker-staging by this push: new 2315195ff2 pull from tlcpack 2315195ff2 is described below commit 2315195ff2ca5fc579aa608923981e61ad6e7e40 Author: Masahiro Masuda AuthorDate: Fri Aug 26 10:31:52 2022 +0900 pull from tlcpack --- Jenkinsfile | 4 ++-- ci/jenkins/Jenkinsfile.j2 | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/Jenkinsfile b/Jenkinsfile index 2dbf78a08d..ffe051c2de 100755 --- a/Jenkinsfile +++ b/Jenkinsfile @@ -45,7 +45,7 @@ // 'python3 jenkins/generate.py' // Note: This timestamp is here to ensure that updates to the Jenkinsfile are // always rebased on main before merging: -// Generated at 2022-08-26T05:56:52.524372 +// Generated at 2022-08-26T10:31:38.252653 import org.jenkinsci.plugins.pipeline.modeldefinition.Utils // NOTE: these lines are scanned by docker/dev_common.sh. Please update the regex as needed. --> @@ -57,7 +57,7 @@ ci_wasm = 'tlcpack/ci-wasm:20220810-060142-fae79bbc3' ci_i386 = 'tlcpack/ci-i386:20220810-060142-fae79bbc3' ci_cortexm = 'tlcpack/ci-cortexm:20220810-060142-fae79bbc3' ci_arm = 'tlcpack/ci-arm:20220810-060142-fae79bbc3' -ci_hexagon = 'tlcpackstaging/ci_hexagon:20220825-145056-fb7cf97f' +ci_hexagon = 'tlcpack/ci-hexagon:20220825-145056-fb7cf97f' ci_riscv = 'tlcpack/ci-riscv:20220810-060142-fae79bbc3' // <--- End of regex-scanned config. diff --git a/ci/jenkins/Jenkinsfile.j2 b/ci/jenkins/Jenkinsfile.j2 index 7f188cda92..c932431a44 100644 --- a/ci/jenkins/Jenkinsfile.j2 +++ b/ci/jenkins/Jenkinsfile.j2 @@ -59,7 +59,7 @@ ci_wasm = 'tlcpack/ci-wasm:20220810-060142-fae79bbc3' ci_i386 = 'tlcpack/ci-i386:20220810-060142-fae79bbc3' ci_cortexm = 'tlcpack/ci-cortexm:20220810-060142-fae79bbc3' ci_arm = 'tlcpack/ci-arm:20220810-060142-fae79bbc3' -ci_hexagon = 'tlcpackstaging/ci_hexagon:20220825-145056-fb7cf97f' +ci_hexagon = 'tlcpack/ci-hexagon:20220825-145056-fb7cf97f' ci_riscv = 'tlcpack/ci-riscv:20220810-060142-fae79bbc3' // <--- End of regex-scanned config.
[tvm] 01/01: testing new hexagon image
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch ci-docker-staging in repository https://gitbox.apache.org/repos/asf/tvm.git commit f4f550f0cca98e86740fe90c20034778b8d0fdfc Author: Masahiro Masuda AuthorDate: Fri Aug 26 05:57:10 2022 +0900 testing new hexagon image --- Jenkinsfile | 4 ++-- ci/jenkins/Jenkinsfile.j2 | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/Jenkinsfile b/Jenkinsfile index 8c1ce9ed50..2dbf78a08d 100755 --- a/Jenkinsfile +++ b/Jenkinsfile @@ -45,7 +45,7 @@ // 'python3 jenkins/generate.py' // Note: This timestamp is here to ensure that updates to the Jenkinsfile are // always rebased on main before merging: -// Generated at 2022-08-19T15:38:38.311410 +// Generated at 2022-08-26T05:56:52.524372 import org.jenkinsci.plugins.pipeline.modeldefinition.Utils // NOTE: these lines are scanned by docker/dev_common.sh. Please update the regex as needed. --> @@ -57,7 +57,7 @@ ci_wasm = 'tlcpack/ci-wasm:20220810-060142-fae79bbc3' ci_i386 = 'tlcpack/ci-i386:20220810-060142-fae79bbc3' ci_cortexm = 'tlcpack/ci-cortexm:20220810-060142-fae79bbc3' ci_arm = 'tlcpack/ci-arm:20220810-060142-fae79bbc3' -ci_hexagon = 'tlcpack/ci-hexagon:20220810-060142-fae79bbc3' +ci_hexagon = 'tlcpackstaging/ci_hexagon:20220825-145056-fb7cf97f' ci_riscv = 'tlcpack/ci-riscv:20220810-060142-fae79bbc3' // <--- End of regex-scanned config. diff --git a/ci/jenkins/Jenkinsfile.j2 b/ci/jenkins/Jenkinsfile.j2 index be2776c6d9..7f188cda92 100644 --- a/ci/jenkins/Jenkinsfile.j2 +++ b/ci/jenkins/Jenkinsfile.j2 @@ -59,7 +59,7 @@ ci_wasm = 'tlcpack/ci-wasm:20220810-060142-fae79bbc3' ci_i386 = 'tlcpack/ci-i386:20220810-060142-fae79bbc3' ci_cortexm = 'tlcpack/ci-cortexm:20220810-060142-fae79bbc3' ci_arm = 'tlcpack/ci-arm:20220810-060142-fae79bbc3' -ci_hexagon = 'tlcpack/ci-hexagon:20220810-060142-fae79bbc3' +ci_hexagon = 'tlcpackstaging/ci_hexagon:20220825-145056-fb7cf97f' ci_riscv = 'tlcpack/ci-riscv:20220810-060142-fae79bbc3' // <--- End of regex-scanned config.
[tvm] branch ci-docker-staging updated (491738649a -> f4f550f0cc)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch ci-docker-staging in repository https://gitbox.apache.org/repos/asf/tvm.git discard 491738649a [CI] Set test python.contrib.test_onnx.test_resize as xfail discard 228dfa9f6b [CI] Update Docker image tags to 20220822-105603-52f5c155f discard 95ec5c3c13 Update TensorFlow to release 2.9 add 1ec2c36912 [TIR][CompactBufferAllocation] Improve upperbound estimation of buffer compaction (#12527) add 592148abf6 [Target] Replace IsaAnalyzer with Target Features (#12322) add 6e79f64108 [CI] Set test python.contrib.test_onnx.test_resize as xfail (#12568) add a0fe74b3c3 [ETHOSN] Support multiply conversion to depthwise (#12403) add 038523e5a2 [TIR] Expose Vector-related API in Python (#12571) add bf65b396c1 [Hexagon] Add support to run on multiple devices (#12504) add f53ee0cecf [Hexagon] Fix missing pytest import (#12565) add 1afd059395 [TOPI][Hexagon] Implement quantized avgpool (#12340) add 17989e8ab5 [microTVM] Fix `build` directory exists error (#12575) add b8fbfe26ae [MicroTVM] fix compile error when the compiler implements char as unsigned (#12519) add cd8fd9121d [TIR] Expose `shift_left` and `shift_right` to Python (#12584) add 9aac161a46 [MetaSchedule] Add software pipeline in CUDA tensor core auto tensorization (#12544) add b38738434b [TIR] Expose WMMA-related TensorCore builtins (#12589) add 40bdea8d7a [PyTorch] Add aten::new_empty (#12591) add fb7cf97fbc [CI] Install xgboost in Hexagon image (#12592) add cc19cdd711 [microTVM][Zephyr] Add recommended heap size for NRF and qemu_x86 (#12585) add 56b7c8ae96 [CI] Assert some unittests are not skipped in CI (#12436) add 61c034ae27 [DOC] fix code-block error in debuggging TVM part (#12597) add b547106fde [CI] github_cc_reviewers: Catch all exceptions so all reviewers can be processed (#12578) add 399f2e9b70 [microNPU] Remove xfail from tests relating to #12511 (#12570) add f7c143608f [ETHOSN] Support conversion of add to depthwise (#12531) add 21db1eb586 [F2QI] Fix a rounding error on AvgPool when input and output affine scales differ (#12577) add bb00a15c26 [CUDA][CodeGen] Fix cuda codegen's fp16 inf literal (#12581) add 01fcdfcf5f [ci] Default to n=2 for test parallelism (#12414) add 8d60b3cbbc [Runtime] Change default alignment to 64 bytes. (#12586) add 5db38ba899 [COMMUNITY] @cconvey -> Reviewer (#12598) new f4f550f0cc testing new hexagon image This update added new revisions after undoing existing revisions. That is to say, some revisions that were in the old version of the branch are not in the new version. This situation occurs when a user --force pushes a change and generates a repository containing something like this: * -- * -- B -- O -- O -- O (491738649a) \ N -- N -- N refs/heads/ci-docker-staging (f4f550f0cc) You should already have received notification emails for all of the O revisions, and so the following emails describe only the N revisions from the common base, B. Any revisions marked "omit" are not gone; other references still refer to them. Any revisions marked "discard" are gone forever. The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. 
Summary of changes: .github/workflows/{docs_bot.yml => tests_bot.yml} | 11 +- CONTRIBUTORS.md| 1 + Jenkinsfile| 80 - apps/microtvm/zephyr/template_project/boards.json | 8 +- .../zephyr/template_project/microtvm_api_server.py | 2 + ci/jenkins/Jenkinsfile.j2 | 22 +- ci/jenkins/macros.j2 | 3 + cmake/modules/contrib/TFLite.cmake | 2 - docker/Dockerfile.ci_cortexm | 3 - docker/Dockerfile.ci_cpu | 7 - docker/Dockerfile.ci_gpu | 3 - docker/Dockerfile.ci_hexagon | 4 + docker/Dockerfile.ci_riscv | 3 - docker/install/ubuntu_install_cmake_source.sh | 4 +- docker/install/ubuntu_install_python_package.sh| 2 +- docker/install/ubuntu_install_tensorflow.sh| 5 +- .../install/ubuntu_install_tensorflow_aarch64.sh | 23 +- docker/install/ubuntu_install_tflite.sh| 13 +- docker/install/ubuntu_install_vela.sh | 2 +- docs/dev/how_to/debugging_tvm.rst | 2 +- include/tvm/arith/int_set.h| 39 ++- include/tvm/meta_schedule/schedule_rule.h | 3 +- include/tvm/runtime/device_api.h
[tvm] branch main updated (40bdea8d7a -> fb7cf97fbc)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 40bdea8d7a [PyTorch] Add aten::new_empty (#12591) add fb7cf97fbc [CI] Install xgboost in Hexagon image (#12592) No new revisions were added by this update. Summary of changes: docker/Dockerfile.ci_hexagon | 4 1 file changed, 4 insertions(+)
[tvm] branch main updated (b38738434b -> 40bdea8d7a)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from b38738434b [TIR] Expose WMMA-related TensorCore builtins (#12589) add 40bdea8d7a [PyTorch] Add aten::new_empty (#12591) No new revisions were added by this update. Summary of changes: python/tvm/relay/frontend/pytorch.py | 16 tests/python/frontend/pytorch/test_forward.py | 17 + 2 files changed, 33 insertions(+)
[tvm] branch main updated (9aac161a46 -> b38738434b)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 9aac161a46 [MetaSchedule] Add software pipeline in CUDA tensor core auto tensorization (#12544) add b38738434b [TIR] Expose WMMA-related TensorCore builtins (#12589) No new revisions were added by this update. Summary of changes: python/tvm/tir/__init__.py | 7 + python/tvm/tir/op.py | 236 + tests/python/unittest/test_tir_op_types.py | 43 ++ 3 files changed, 286 insertions(+)
[tvm] branch main updated: [MetaSchedule] Add software pipeline in CUDA tensor core auto tensorization (#12544)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 9aac161a46 [MetaSchedule] Add software pipeline in CUDA tensor core auto tensorization (#12544) 9aac161a46 is described below commit 9aac161a46e5aca4c433ccb901c1bb84e6c8bd0c Author: Wuwei Lin AuthorDate: Wed Aug 24 23:28:54 2022 -0700 [MetaSchedule] Add software pipeline in CUDA tensor core auto tensorization (#12544) cc @Hzfengsy @junrushao @junrushao1994 @masahi @spectrometerHBH --- include/tvm/meta_schedule/schedule_rule.h | 3 +- python/tvm/meta_schedule/default_config.py | 1 + .../schedule_rule/multi_level_tiling.py| 4 + python/tvm/meta_schedule/testing/schedule_rule.py | 2 + .../multi_level_tiling_tensor_core.cc | 122 +++- ...ta_schedule_schedule_rule_multi_level_tiling.py | 125 + 6 files changed, 255 insertions(+), 2 deletions(-) diff --git a/include/tvm/meta_schedule/schedule_rule.h b/include/tvm/meta_schedule/schedule_rule.h index b5f4a17b69..2da441c95e 100644 --- a/include/tvm/meta_schedule/schedule_rule.h +++ b/include/tvm/meta_schedule/schedule_rule.h @@ -190,13 +190,14 @@ class ScheduleRule : public runtime::ObjectRef { * NullOpt means disable vectorization * \param reuse_read Data reuse configuration for reading. NullOpt means no reuse. * \param reuse_write Data reuse configuration for writing. NullOpt means no reuse. + * \param use_software_pipeline Whether use the software pipeline. * \return The schedule rule created */ TVM_DLL static ScheduleRule MultiLevelTilingTensorCore( Array> intrin_groups, String structure, Optional> tile_binds, Optional max_innermost_factor, Optional> vector_load_lens, Optional> reuse_read, - Optional> reuse_write); + Optional> reuse_write, bool use_software_pipeline); /*! * \brief Create a rule: add-rfactor to some blocks if needed diff --git a/python/tvm/meta_schedule/default_config.py b/python/tvm/meta_schedule/default_config.py index 105b3467de..0f1f7d3c2c 100644 --- a/python/tvm/meta_schedule/default_config.py +++ b/python/tvm/meta_schedule/default_config.py @@ -381,6 +381,7 @@ class _DefaultCUDATensorCore: levels=[2], scope="shared", ), +use_software_pipeline=False, ), *_DefaultCUDA.schedule_rules(), ] diff --git a/python/tvm/meta_schedule/schedule_rule/multi_level_tiling.py b/python/tvm/meta_schedule/schedule_rule/multi_level_tiling.py index a728a91eb7..6703bc5716 100644 --- a/python/tvm/meta_schedule/schedule_rule/multi_level_tiling.py +++ b/python/tvm/meta_schedule/schedule_rule/multi_level_tiling.py @@ -161,6 +161,8 @@ class MultiLevelTilingTensorCore(ScheduleRule): Data reuse configuration for reading. None means no reuse. reuse_write : Optional[ReuseType] Data reuse configuration for writing. None means no reuse. +use_software_pipeline : bool +Whether to use the software pipeline. 
""" def __init__( @@ -172,6 +174,7 @@ class MultiLevelTilingTensorCore(ScheduleRule): vector_load_lens: Optional[List[int]] = None, reuse_read: Optional[ReuseType] = None, reuse_write: Optional[ReuseType] = None, +use_software_pipeline: bool = False, ) -> None: self.__init_handle_by_constructor__( _ffi_api.ScheduleRuleMultiLevelTilingTensorCore, # type: ignore # pylint: disable=no-member @@ -182,4 +185,5 @@ class MultiLevelTilingTensorCore(ScheduleRule): vector_load_lens, reuse_read.as_dict() if reuse_read is not None else None, reuse_write.as_dict() if reuse_write is not None else None, +use_software_pipeline, ) diff --git a/python/tvm/meta_schedule/testing/schedule_rule.py b/python/tvm/meta_schedule/testing/schedule_rule.py index 441ca930f8..46df4b95ce 100644 --- a/python/tvm/meta_schedule/testing/schedule_rule.py +++ b/python/tvm/meta_schedule/testing/schedule_rule.py @@ -119,6 +119,7 @@ def multi_level_tiling_tensor_core( in_dtype: Union[str, List[str]] = "float16", out_dtype: Union[str, List[str]] = "float32", trans_b: Union[bool, List[bool]] = False, +use_software_pipeline: bool = False, ) -> ScheduleRule: """Default schedule rules for with multi-level tiling reuse for tensor core""" assert write_reuse_scope in ["shared", "global"] @@ -154,6 +155,7 @@ def multi_level_tiling_tensor_core( levels=[2], scope=write_reuse_scope, ), +use_software_pipeline=use_software_pipel
[tvm] branch main updated (c15cc5ef6d -> 577826182f)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from c15cc5ef6d [Target] Remove deprecated parameters from target (#12416) add 577826182f [PyTorch][Fix] Fix for numerically unstable logsigmoid (#12563) No new revisions were added by this update. Summary of changes: python/tvm/relay/frontend/pytorch.py | 4 +++- tests/python/frontend/pytorch/test_forward.py | 2 ++ 2 files changed, 5 insertions(+), 1 deletion(-)
[tvm] branch main updated (8174d082e8 -> c15cc5ef6d)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 8174d082e8 Add using directives for otherwise hidden virtual functions, NFC (#12561) add c15cc5ef6d [Target] Remove deprecated parameters from target (#12416) No new revisions were added by this update. Summary of changes: apps/hexagon_launcher/README.md| 4 +- apps/howto_deploy/prepare_test_libs.py | 2 +- apps/sgx/src/build_model.py| 7 +- .../wasm-graph/tools/build_graph_lib.py| 9 ++- .../tune_with_autoscheduler/ci_logs/matmul.json| 2 +- .../ci_logs/resnet-50-NHWC-B1-llvm.json| 52 +++ .../ci_logs/sparse_dense.json | 2 +- gallery/how_to/tune_with_autotvm/tune_relay_x86.py | 4 +- gallery/how_to/work_with_microtvm/micro_tvmc.sh| 4 +- gallery/tutorial/auto_scheduler_matmul_x86.py | 2 - python/tvm/contrib/hexagon/pytest_plugin.py| 2 +- python/tvm/relay/build_module.py | 78 -- python/tvm/target/target.py| 19 +- src/target/target_kind.cc | 38 ++- tests/cpp/c_codegen_test.cc| 10 +-- tests/cpp/target_test.cc | 4 +- .../test_hexagon/topi/test_softmax_slice.py| 1 - tests/python/driver/tvmc/test_target.py| 6 +- tests/python/driver/tvmc/test_target_options.py| 2 +- tests/python/relay/aot/test_cpp_aot.py | 2 +- tests/python/relay/aot/test_crt_aot.py | 36 -- tests/python/relay/test_build_module.py| 47 +++-- .../test_tir_transform_common_subexpr_elim.py | 2 +- tests/python/unittest/test_tvmscript_roundtrip.py | 4 +- tests/scripts/task_python_docs.sh | 2 +- 25 files changed, 80 insertions(+), 261 deletions(-)
[tvm] branch main updated (262906516a -> e9aad35cf3)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 262906516a [TVMScript] Printer: add boolean operators to OperationDoc (#12518) add e9aad35cf3 fix group conv3d pack kernel shape error (#12523) No new revisions were added by this update. Summary of changes: python/tvm/topi/x86/conv3d.py | 12 ++-- tests/python/frontend/onnx/test_forward.py | 11 +++ 2 files changed, 21 insertions(+), 2 deletions(-)
[tvm] branch main updated (c83ee08c10 -> 1985c0153b)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from c83ee08c10 fix pytest (#12483) add 1985c0153b [Relay][Layout] Add FInferCorrectLayout for L2 norm layout transform. (#12497) No new revisions were added by this update. Summary of changes: src/relay/op/nn/nn.cc | 38 +- tests/python/relay/test_pass_convert_op_layout.py | 39 +++ 2 files changed, 76 insertions(+), 1 deletion(-)
[tvm] branch main updated (41be1b4533 -> 9d6039b879)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 41be1b4533 [TOPI]fix scatterND large shape problem (#12200) add 9d6039b879 fix group_conv3d caculate error (#12500) No new revisions were added by this update. Summary of changes: python/tvm/topi/x86/conv3d.py | 4 ++-- tests/python/frontend/onnx/test_forward.py | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-)
[tvm] branch main updated (8b3401ce6b -> 41be1b4533)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 8b3401ce6b [microTVM] Add config space to dense_dsp schedule (#12444) add 41be1b4533 [TOPI]fix scatterND large shape problem (#12200) No new revisions were added by this update. Summary of changes: src/target/llvm/codegen_cpu.cc | 3 ++- tests/python/relay/test_op_level3.py | 20 2 files changed, 22 insertions(+), 1 deletion(-)
[tvm] branch main updated (e140a27495 -> aa97f4afb5)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from e140a27495 [COMMUNITY] Adam Straw -> Reviewer (#12480) add aa97f4afb5 [TIR] Disallow vectorization with strides in VerifyGPUCode (#12477) No new revisions were added by this update. Summary of changes: src/target/source/codegen_cuda.cc | 1 + src/tir/analysis/verify_gpu_code.cc| 17 ++ .../unittest/test_tir_analysis_verify_gpu_code.py | 37 -- 3 files changed, 46 insertions(+), 9 deletions(-)
[tvm] branch main updated (436c17f885 -> e140a27495)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 436c17f885 [HEXAGON][TOPI] This PR adjusts schedules so >64 length vector loads/stores are not generated at LLVM level. This is a workaround for an instruction selection issue in current version of llvm for hexagon (#12471) add e140a27495 [COMMUNITY] Adam Straw -> Reviewer (#12480) No new revisions were added by this update. Summary of changes: CONTRIBUTORS.md | 1 + 1 file changed, 1 insertion(+)
[tvm] branch main updated: [HEXAGON][TOPI] This PR adjusts schedules so >64 length vector loads/stores are not generated at LLVM level. This is a workaround for an instruction selection issue in current version of llvm for hexagon (#12471)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 436c17f885 [HEXAGON][TOPI] This PR adjusts schedules so >64 length vector loads/stores are not generated at LLVM level. This is a workaround for an instruction selection issue in current version of llvm for hexagon (#12471) 436c17f885 is described below commit 436c17f88527406fa1b014b2431609aa217dee48 Author: arangasa <76030063+arang...@users.noreply.github.com> AuthorDate: Thu Aug 18 14:25:44 2022 +0530 [HEXAGON][TOPI] This PR adjusts schedules so >64 length vector loads/stores are not generated at LLVM level. This is a workaround for an instruction selection issue in current version of llvm for hexagon (#12471) --- python/tvm/topi/hexagon/slice_ops/cast.py | 6 -- tests/python/contrib/test_hexagon/topi/test_cast_slice.py | 4 ++-- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/python/tvm/topi/hexagon/slice_ops/cast.py b/python/tvm/topi/hexagon/slice_ops/cast.py index b4984763e0..ac2e4c32e3 100644 --- a/python/tvm/topi/hexagon/slice_ops/cast.py +++ b/python/tvm/topi/hexagon/slice_ops/cast.py @@ -68,9 +68,10 @@ def cast_f16_f32_stir_schedule_nc(func, in_layout, out_layout, c_split_factor): block_name = "CastF16F32" _, c_orig = sch.get_loops(sch.get_block(block_name)) _, c_inner = sch.split(c_orig, [None, c_split_factor]) +_, c_inner_inner = sch.split(c_inner, [None, 64]) sch.transform_layout(block_name, "A", in_layout) sch.transform_layout(block_name, block_name, out_layout) -sch.vectorize(c_inner) +sch.vectorize(c_inner_inner) return sch @@ -122,9 +123,10 @@ def cast_f32_f16_stir_schedule_nc(func, in_layout, out_layout, c_split_factor): block_name = "CastF32F16" _, c_orig = sch.get_loops(sch.get_block(block_name)) _, c_inner = sch.split(c_orig, [None, c_split_factor]) +_, c_inner_inner = sch.split(c_inner, [None, 64]) sch.transform_layout(block_name, "A", in_layout) sch.transform_layout(block_name, block_name, out_layout) -sch.vectorize(c_inner) +sch.vectorize(c_inner_inner) return sch diff --git a/tests/python/contrib/test_hexagon/topi/test_cast_slice.py b/tests/python/contrib/test_hexagon/topi/test_cast_slice.py index 30ea4c94b8..6569ce36bb 100644 --- a/tests/python/contrib/test_hexagon/topi/test_cast_slice.py +++ b/tests/python/contrib/test_hexagon/topi/test_cast_slice.py @@ -75,7 +75,7 @@ class TestCastF16F32Slice2d: """ if hexagon_session._launcher._serial_number != "simulator": pytest.skip(msg="Due to https://github.com/apache/tvm/issues/11957";) -target_hexagon = tvm.target.hexagon("v68") +target_hexagon = tvm.target.hexagon("v69") target = tvm.target.Target(target_hexagon, host=target_hexagon) cast_input = te.placeholder(input_shape, name="A", dtype=dtype) cast_output = sl.cast_f16_f32_compute(cast_input) @@ -161,7 +161,7 @@ class TestCastF32F16Slice2d: if hexagon_session._launcher._serial_number != "simulator": pytest.skip(msg="Due to https://github.com/apache/tvm/issues/11957";) -target_hexagon = tvm.target.hexagon("v68") +target_hexagon = tvm.target.hexagon("v69") target = tvm.target.Target(target_hexagon, host=target_hexagon) cast_input = te.placeholder(input_shape, name="A", dtype=dtype) cast_output = sl.cast_f32_f16_compute(cast_input)
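The workaround is simply one extra split so the vectorized extent never exceeds 64 lanes. A standalone sketch of the same pattern on a toy cast operator (names and shapes are illustrative, not the slice ops changed above):

```python
# Hedged sketch: cap the vectorized extent at 64 by splitting twice before
# vectorizing, mirroring the change to the Hexagon cast schedules above.
import tvm
from tvm import te


def toy_cast_schedule(shape=(1, 1024), c_split_factor=256):
    a = te.placeholder(shape, name="A", dtype="float16")
    b = te.compute(shape, lambda n, c: a[n, c].astype("float32"), name="CastF16F32")
    sch = tvm.tir.Schedule(te.create_prim_func([a, b]))
    _, c_orig = sch.get_loops(sch.get_block("CastF16F32"))
    _, c_inner = sch.split(c_orig, [None, c_split_factor])
    # Extra split: the innermost loop is now at most 64 iterations, so LLVM
    # never sees >64-lane vector loads/stores.
    _, c_inner_inner = sch.split(c_inner, [None, 64])
    sch.vectorize(c_inner_inner)
    return sch


print(toy_cast_schedule().mod.script())
```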
[tvm] branch main updated (c9a350c800 -> f64a3bda25)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from c9a350c800 [docs] Add instructions for uploading CI resources to S3 (#12476) add f64a3bda25 [Frontend][Pytorch] Add axis N when maxpool3d layout is (C,D,H,W) (#12467) No new revisions were added by this update. Summary of changes: python/tvm/relay/frontend/pytorch.py | 7 ++- tests/python/frontend/pytorch/test_forward.py | 16 ++-- 2 files changed, 16 insertions(+), 7 deletions(-)
[tvm] branch main updated: [TVM PyTorch Integration] libstdc++ CXX11 ABI Compatibility & boolean tensor support (#12232)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 073304dadb [TVM PyTorch Integration] libstdc++ CXX11 ABI Compatibility & boolean tensor support (#12232) 073304dadb is described below commit 073304dadb91ce70b3198cab8b3ae98ee4061b26 Author: Yaoda Zhou AuthorDate: Wed Aug 17 16:33:37 2022 +0800 [TVM PyTorch Integration] libstdc++ CXX11 ABI Compatibility & boolean tensor support (#12232) * first commit * rename * cmake * deprecated * newline * config * config * typo * skip tvm_class * rename * delete ptr * delete ptr * save progress * boolean support * cmake file * polish code * compile config * improving the codes * format * doc&errormsg * zero-cost copy * one step * to ndarray * extra output * delete extra codes * update test * boolean support * strong test * decrease memory copy * polish * reformat * polish * remove redundant import Co-authored-by: juda --- apps/pt_tvmdsoop/tests/test_as_torch.py| 7 +- apps/pt_tvmdsoop/tests/test_boolean_tensor.py | 129 ++ cmake/modules/contrib/PT_TVMDSOOP.cmake| 68 -- python/tvm/contrib/torch/__init__.py | 25 +- python/tvm/contrib/torch/module.py | 17 ++ python/tvm/contrib/torch/pytorch_tvm.py| 21 ++ .../torch/pt_call_tvm/RuntimeModuleWrapper.cc | 259 .../tvm_module_wrapper/RuntimeModuleWrapperTVM.cc | 266 + .../RuntimeModuleWrapperTorch.cc | 215 + .../torch/tvm_module_wrapper/runtime_bridge.h | 116 + 10 files changed, 844 insertions(+), 279 deletions(-) diff --git a/apps/pt_tvmdsoop/tests/test_as_torch.py b/apps/pt_tvmdsoop/tests/test_as_torch.py index 2c454e9454..a13d669e7f 100644 --- a/apps/pt_tvmdsoop/tests/test_as_torch.py +++ b/apps/pt_tvmdsoop/tests/test_as_torch.py @@ -17,6 +17,8 @@ # specific language governing permissions and limitations # under the License. """Test script for tvm torch module""" +import tempfile + import numpy as np import torch @@ -190,7 +192,10 @@ def test_tvmscript_torch_gpu(): q1 = torch.arange(8, device=cuda0).type(torch.float32) q2 = torch.zeros((8,), dtype=torch.float32, device=cuda0) -ModuleGPU(q1, q2) +with tempfile.NamedTemporaryFile(suffix=".pt") as tmp: +torch.save(ModuleGPU, tmp.name) +loaded_mod = torch.load(tmp.name) +loaded_mod(q1, q2) tvm.testing.assert_allclose(q2.cpu().numpy(), (q1 + 1).cpu().numpy(), atol=1e-5, rtol=1e-5) diff --git a/apps/pt_tvmdsoop/tests/test_boolean_tensor.py b/apps/pt_tvmdsoop/tests/test_boolean_tensor.py new file mode 100644 index 00..4718b40439 --- /dev/null +++ b/apps/pt_tvmdsoop/tests/test_boolean_tensor.py @@ -0,0 +1,129 @@ +#!/usr/bin/env python + +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. 
+"""Test script for boolean tensor support""" +import tempfile + +import torch + +import tvm +import tvm.testing +from tvm.contrib.torch import as_torch, optimize_torch +from tvm.script import tir as T + + +def negate(x): +return x.logical_not() + + +def sum_up_tensor(x): +return x.size(dim=0) - torch.sum(x.int()) + + +def tensor_boolean_operation(x): +arr1 = (x + 0.3).floor().bool() +arr2 = (~((x + 0.7).int().bool())).bool() +ret = ((arr1 & arr2).byte() + 0.5).half() +return ~(ret.bool()) + + +def test_bool_tensor_negate(): +input = torch.ones(1, dtype=torch.bool) +optimized_negate = optimize_torch( +negate, +input, +) +with tempfile.NamedTemporaryFile(suffix=".pt") as tmp: +torch.save(optimized_negate, tmp.name) +loaded
[tvm] branch main updated (247c54b97d -> a1ddfb592f)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 247c54b97d Use std::make_unique instead of std::unique_ptr(new ...), NFC (#12459) add a1ddfb592f Remove uses of std::iterator, NFC (#12461) No new revisions were added by this update. Summary of changes: include/tvm/support/span.h | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-)
[tvm] branch main updated (57a02f7e26 -> 478b672f2b)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 57a02f7e26 Update hexagon max_concurrency to be at most equal to the number of HVX units available. (#12394) add 478b672f2b [skip ci] Revert "[ci] Default to n=2 for test parallelism (#12376)" (#12413) No new revisions were added by this update. Summary of changes: Jenkinsfile | 59 ++- ci/jenkins/Jenkinsfile.j2 | 2 +- ci/jenkins/macros.j2 | 3 -- tests/scripts/setup-pytest-env.sh | 8 +- 4 files changed, 4 insertions(+), 68 deletions(-)
[tvm] branch main updated: [PyTorch] Fix all_any_common with no default input (#12395)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new a1c371f46c [PyTorch] Fix all_any_common with no default input (#12395) a1c371f46c is described below commit a1c371f46cf77dcdffa6f0ab55f5036bff1c5624 Author: Yuanjing Shi AuthorDate: Thu Aug 11 23:09:22 2022 -1000 [PyTorch] Fix all_any_common with no default input (#12395) * fix all_any_common with no default input * work around * better naming --- python/tvm/relay/frontend/pytorch.py | 10 -- tests/python/frontend/pytorch/test_forward.py | 5 + 2 files changed, 13 insertions(+), 2 deletions(-) diff --git a/python/tvm/relay/frontend/pytorch.py b/python/tvm/relay/frontend/pytorch.py index ffe4b313c5..0e6d4caae0 100644 --- a/python/tvm/relay/frontend/pytorch.py +++ b/python/tvm/relay/frontend/pytorch.py @@ -3253,8 +3253,14 @@ class PyTorchOpConverter: return (output, _op.stack(hy, 0), _op.stack(cy, 0)) def all_any_common(self, op, inputs, input_types): -dim = inputs[1] -keepdim = inputs[2] +if len(inputs) >= 2: +dim = inputs[1] +else: +dim = None +if len(inputs) >= 3: +keepdim = inputs[2] +else: +keepdim = False if self.infer_type(inputs[0]).dtype != "bool": # The input dtype can be uint8. inp = _op.cast(inputs[0], "bool") diff --git a/tests/python/frontend/pytorch/test_forward.py b/tests/python/frontend/pytorch/test_forward.py index 6b1eb30a56..4c78ba4b85 100755 --- a/tests/python/frontend/pytorch/test_forward.py +++ b/tests/python/frontend/pytorch/test_forward.py @@ -4385,11 +4385,16 @@ def test_all_any(): def test_fn(f, dim=None, keepdim=False): return lambda x: f(x, dim=dim, keepdim=keepdim) +def test_fn_no_arg(f): +return lambda x: f(x) + for f in [torch.all, torch.any]: verify_model(test_fn(f, 0), [torch.rand(1, 2).bool()]) verify_model(test_fn(f, 0), [torch.arange(0, 3).to(torch.uint8)]) verify_model(test_fn(f, 1), [torch.rand(4, 2).bool()]) verify_model(test_fn(f, 0, keepdim=True), [torch.rand(4, 2).bool()]) +verify_model(test_fn_no_arg(f), [torch.rand(1, 2).bool()]) +verify_model(test_fn_no_arg(f), [torch.arange(0, 3).to(torch.uint8)]) @tvm.testing.uses_gpu
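For context, the fixed path is the no-argument form of `torch.all`/`torch.any`, which previously assumed `dim` and `keepdim` were always present. A hedged sketch of driving that form through the PyTorch frontend (the model and input names are illustrative):

```python
# Hedged sketch: tracing torch.all with no dim/keepdim and importing it,
# which exercises the default handling added in the patch above.
import torch
from tvm import relay


class AllNoArgs(torch.nn.Module):
    def forward(self, x):
        return torch.all(x)


inp = torch.rand(1, 2) > 0.5          # boolean input tensor
traced = torch.jit.trace(AllNoArgs().eval(), inp)
mod, params = relay.frontend.from_pytorch(traced, [("x", (list(inp.shape), "bool"))])
print(mod["main"])
```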
[tvm] branch main updated: [PyTorch] Fix pad_common for float pad_value (#12134)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 22dcf4490d [PyTorch] Fix pad_common for float pad_value (#12134) 22dcf4490d is described below commit 22dcf4490dacc7813f5ef3d700ab0b64171c7662 Author: Yuanjing Shi AuthorDate: Thu Aug 11 21:02:48 2022 -1000 [PyTorch] Fix pad_common for float pad_value (#12134) * fix pad * fix constant padding and handle float infinity * revert change to pad_width * fix constant pad value --- python/tvm/relay/frontend/pytorch.py | 11 - tests/python/frontend/pytorch/test_forward.py | 32 +-- 2 files changed, 36 insertions(+), 7 deletions(-) diff --git a/python/tvm/relay/frontend/pytorch.py b/python/tvm/relay/frontend/pytorch.py index 0fe8d57464..ffe4b313c5 100644 --- a/python/tvm/relay/frontend/pytorch.py +++ b/python/tvm/relay/frontend/pytorch.py @@ -1905,7 +1905,7 @@ class PyTorchOpConverter: # initialize paddings based on input len pad_len = len(self.infer_shape(data)) * 2 -paddings = [pad_value] * pad_len +paddings = [0] * pad_len if len(pad_list) >= 2: paddings[-1] = pad_list[1] @@ -1925,8 +1925,10 @@ class PyTorchOpConverter: for pad in paddings: const_paddings.append([]) for p in pad: -if not isinstance(p, int): +if isinstance(p, _expr.Expr): p = int(_infer_value(p, {}).numpy()) +elif not isinstance(p, int): +raise NotImplementedError("pad width should be int/expr") const_paddings[-1].append(p) if p != 0: non_zero_found = True @@ -1934,12 +1936,11 @@ class PyTorchOpConverter: if not non_zero_found: return data elif mode == "constant": -return _op.nn.pad(data, const_paddings, pad_value=inputs[2], pad_mode=mode) +return _op.nn.pad(data, const_paddings, pad_value=pad_value, pad_mode=mode) else: return _op.nn.pad(data, const_paddings, pad_mode=mode) def pad(self, inputs, input_types): - # mode: Optional default "constant" if len(inputs) > 2 and inputs[2] is not None: mode = inputs[2] @@ -1960,7 +1961,7 @@ class PyTorchOpConverter: return self.pad_common(mode, pad_value, inputs, input_types) def constant_pad_nd(self, inputs, input_types): -return self.pad_common("constant", 0, inputs, input_types) +return self.pad_common("constant", _expr.const(inputs[2]), inputs, input_types) def reflection_pad1d(self, inputs, input_types): return self.pad_common("reflect", 0, inputs, input_types) diff --git a/tests/python/frontend/pytorch/test_forward.py b/tests/python/frontend/pytorch/test_forward.py index bc848f90b3..6b1eb30a56 100755 --- a/tests/python/frontend/pytorch/test_forward.py +++ b/tests/python/frontend/pytorch/test_forward.py @@ -2010,6 +2010,34 @@ def test_forward_functional_pad(): pad = (0, 1, 2, 1, 3, 3) verify_model(Pad1().float().eval(), input_data=input_data) +class Pad2(Module): +def forward(self, *args): +return torch.nn.functional.pad(args[0], pad, "constant", 1) + +input_data = torch.rand((3, 3, 4, 2)) +pad = (1, 1) +verify_model(Pad2().float().eval(), input_data=input_data) + +pad = (1, 1, 2, 2) +verify_model(Pad2().float().eval(), input_data=input_data) + +pad = (0, 1, 2, 1, 3, 3) +verify_model(Pad2().float().eval(), input_data=input_data) + +class Pad3(Module): +def forward(self, *args): +return torch.nn.functional.pad(args[0], pad, "constant", 1.0) + +input_data = torch.rand((3, 3, 4, 2)) +pad = (1, 1) +verify_model(Pad3().float().eval(), input_data=input_data) + +pad = (1, 1, 2, 2) +verify_model(Pad3().float().eval(), input_data=input_data) + +pad = (0, 1, 2, 1, 
3, 3) +verify_model(Pad3().float().eval(), input_data=input_data) + @tvm.testing.uses_gpu def test_forward_zero_pad2d(): @@ -2021,10 +2049,10 @@ def test_forward_zero_pad2d(): @tvm.testing.uses_gpu def test_forward_constant_pad1d(): inp = torch.rand((1, 2, 4)) -verify_model(torch.nn.ConstantPad2d(2, 3.5).eval(), inp) +verify_model(torch.nn.ConstantPad1d(2, 3.5).eval(), inp) inp = torch.rand((1, 2, 3)) -verify_model(torch.nn.ConstantPad2d((3, 1), 3.5).eval(), inp) +verify_model(torch.nn.ConstantPad1d((3, 1), 3.5).eval(), inp) @tvm.testing.uses_gpu
[tvm] branch main updated: [Adreno][OpenCL] Get rid of extra memory copy (#12286)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 5deb95a947 [Adreno][OpenCL] Get rid of extra memory copy (#12286) 5deb95a947 is described below commit 5deb95a9472002c9fa36a150f9e348f4276d63c5 Author: Egor Churaev AuthorDate: Fri Aug 12 05:53:39 2022 +0300 [Adreno][OpenCL] Get rid of extra memory copy (#12286) * Add annotation pass for device_copy where we get buffers but expect textures * Fix issues with running device_copy * Get rid of extra memory copy * Fix build after cherry-picking * Fix lint * Fix CI * Apply comments Co-authored-by: Andrey Malyshev --- python/tvm/relay/op/strategy/adreno.py | 18 +++ python/tvm/topi/adreno/__init__.py | 1 + python/tvm/topi/adreno/conv2d_nchw.py| 29 +++-- python/tvm/topi/adreno/conv2d_nhwc.py| 29 +++-- python/tvm/topi/adreno/depthwise_conv2d_nchw.py | 10 +- python/tvm/topi/adreno/depthwise_conv2d_nhwc.py | 10 +- python/tvm/topi/adreno/injective.py | 66 ++ python/tvm/topi/adreno/utils.py | 23 +++- src/relay/transforms/annotate_texture_storage.cc | 152 +++ src/runtime/opencl/opencl_device_api.cc | 29 - tests/python/relay/test_conv2d_nchw_texture.py | 8 +- 11 files changed, 312 insertions(+), 63 deletions(-) diff --git a/python/tvm/relay/op/strategy/adreno.py b/python/tvm/relay/op/strategy/adreno.py index cc082c9d61..a537fa1e7b 100644 --- a/python/tvm/relay/op/strategy/adreno.py +++ b/python/tvm/relay/op/strategy/adreno.py @@ -257,3 +257,21 @@ def schedule_pool_adreno(attrs, outs, target): if attrs.layout == "NCHW4c": return topi.adreno.schedule_pool(outs, attrs.layout) return topi.cuda.schedule_pool(outs, attrs.layout) + + +@schedule_injective.register(["adreno"]) +def schedule_injective_adreno(attrs, outs, target): +"""schedule injective ops for adreno""" +with target: +return topi.adreno.schedule_injective(outs) + + +@concatenate_strategy.register(["adreno"]) +def concatenate_strategy_adreno(attrs, inputs, out_type, target): +strategy = _op.OpStrategy() +strategy.add_implementation( +wrap_compute_concat(topi.transform.concatenate), +wrap_topi_schedule(topi.adreno.schedule_injective), +name="concatenate.adreno", +) +return strategy diff --git a/python/tvm/topi/adreno/__init__.py b/python/tvm/topi/adreno/__init__.py index 57a9013b1a..227ca6aa9a 100644 --- a/python/tvm/topi/adreno/__init__.py +++ b/python/tvm/topi/adreno/__init__.py @@ -25,3 +25,4 @@ from .pooling import * from .conv2d_alter_op import * from .conv2d_nchw_winograd import * from .conv2d_nhwc_winograd import * +from .injective import schedule_injective diff --git a/python/tvm/topi/adreno/conv2d_nchw.py b/python/tvm/topi/adreno/conv2d_nchw.py index 16ecaa84d0..65cd8e0150 100644 --- a/python/tvm/topi/adreno/conv2d_nchw.py +++ b/python/tvm/topi/adreno/conv2d_nchw.py @@ -279,28 +279,35 @@ def schedule_conv2d_NCHWc_KCRSk(cfg, s, output): ): # len(latest.op.axis) == 4: # manage scheduling of datacopy pad_data, kernel = s[conv].op.input_tensors -pack_data = pad_data.op.input_tensors[0] -bind_data_copy(s[pack_data]) +if "pad_temp" in pad_data.op.name: +pack_data = pad_data.op.input_tensors[0] +bind_data_copy(s[pack_data]) +else: +bind_data_copy(s[pad_data]) bind_data_copy(s[kernel]) pad_data, kernel = s[conv].op.input_tensors -s[pad_data].compute_inline() - -s[conv].set_scope("local") -if latest_blocked == latest and output != latest: -s[output].compute_inline() - -# create cache stage -AT = s.cache_read(pad_data, 
get_texture_storage(pad_data.shape), [conv]) -bind_data_copy(s[AT]) if ( autotvm.GLOBAL_SCOPE.in_tuning or isinstance(kernel.op, tvm.te.ComputeOp) and "filter_pack" in kernel.op.tag ): +if "pad_temp" in pad_data.op.name: +s[pad_data].compute_inline() +AT = s.cache_read(pad_data, get_texture_storage(pad_data.shape), [conv]) +bind_data_copy(s[AT]) WT = s.cache_read(kernel, get_texture_storage(kernel.shape), [conv]) bind_data_copy(s[WT]) +elif "pad_temp" in pad_data.op.name: +s[pad_data].compute_inline() +# create cache stage +AT = s.cache_read(pad_data, get_texture_storage(pad_data.shape), [conv]) +bind_data_copy(s[AT]) + +s[conv].set_scope("local") +if latest_blocked == latest and output !
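For readers following the schedule rework above, the diff repeatedly uses one idiom: inline the padding stage and re-read its producer through a cached stage that is then bound as a data copy. Below is a minimal illustrative sketch of that idiom, not the patched Adreno schedule itself; the toy compute and the "local" scope stand in for the texture scopes that get_texture_storage() would pick on Adreno.

import tvm
from tvm import te

n = 64
data = te.placeholder((n, n), "float32", name="data")
# Stand-in for the "pad_temp" stage that conv2d padding produces.
pad_temp = te.compute((n, n), lambda i, j: data[i, j] + 0.0, name="pad_temp")
out = te.compute((n, n), lambda i, j: pad_temp[i, j] * 2.0, name="out")

s = te.create_schedule(out.op)
# Fold the padding stage into its consumer so no extra buffer is materialized ...
s[pad_temp].compute_inline()
# ... and re-read it through a cached stage; on Adreno the scope would be a
# texture scope and the new stage would then be bound with bind_data_copy().
AT = s.cache_read(pad_temp, "local", [out])
print(tvm.lower(s, [data, out], simple_mode=True))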
[tvm] branch main updated (de12486271 -> e8de88e4f5)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from de12486271 [ci][docker] Tag tlcpackstaging images to tlcpack (#11832) add e8de88e4f5 [BYOC] [DNNL] enable in-place post-op sum in dnnl json runtime (#12371) No new revisions were added by this update. Summary of changes: src/runtime/contrib/dnnl/dnnl_json_runtime.cc| 35 - src/runtime/contrib/dnnl/dnnl_tensor_requisite.h | 25 +++ tests/python/contrib/test_dnnl.py| 39 +++- 3 files changed, 76 insertions(+), 23 deletions(-)
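The in-place sum post-op targets graphs of the shape conv2d -> bias_add -> add(residual) -> relu, where the trailing add can write straight into the residual buffer instead of allocating a new one. A hedged sketch of building such a graph and partitioning it for the DNNL BYOC backend follows; partition_for_dnnl only runs the Relay partitioning passes, while actually executing the result assumes TVM built with DNNL support.

import numpy as np
import tvm
from tvm import relay
from tvm.relay.op.contrib.dnnl import partition_for_dnnl

x = relay.var("x", shape=(1, 32, 8, 8), dtype="float32")
w = relay.const(np.random.randn(16, 32, 3, 3).astype("float32"))
b = relay.var("bias", shape=(16,), dtype="float32")
residual = relay.var("residual", shape=(1, 16, 6, 6), dtype="float32")

y = relay.nn.conv2d(x, w, channels=16, kernel_size=(3, 3))
y = relay.nn.bias_add(y, b)
y = relay.nn.relu(relay.add(y, residual))  # candidate for the oneDNN sum post-op

mod = tvm.IRModule.from_expr(y)
mod = partition_for_dnnl(mod)
print(mod)  # any matched regions show up as functions annotated with Compiler="dnnl"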
[tvm] branch main updated: [BYOC-DNNL] add partition test on sum pattern (#12357)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 22ba659438 [BYOC-DNNL] add partition test on sum pattern (#12357) 22ba659438 is described below commit 22ba659438a317ca59c8201430c662f86e2550fd Author: Ivy Zhang AuthorDate: Wed Aug 10 19:16:03 2022 +0800 [BYOC-DNNL] add partition test on sum pattern (#12357) * add partition test on sum pattern * fix lint --- python/tvm/relay/op/contrib/dnnl.py | 4 +- tests/python/relay/test_pass_partition_graph.py | 54 + 2 files changed, 56 insertions(+), 2 deletions(-) diff --git a/python/tvm/relay/op/contrib/dnnl.py b/python/tvm/relay/op/contrib/dnnl.py index f76d4bd10d..4ef342a26b 100644 --- a/python/tvm/relay/op/contrib/dnnl.py +++ b/python/tvm/relay/op/contrib/dnnl.py @@ -249,8 +249,6 @@ def make_sum_pattren_predicate(checker): for e, op_name in zip([expr, expr.args[0]], ["sum", "bias_add"]): args = get_args(e) attrs = get_attrs(e.args[0]) -if attrs is None: -return False if not checker(attrs, args, op_name): return False return True @@ -284,6 +282,8 @@ def add_checker(attrs, args, op_name): if tuple(get_shape(args[0])) != tuple(get_shape(args[1])): return False if op_name == "bias_add": +if attrs is None: +return False if not isinstance(args[0].op, tvm.ir.op.Op): return False if args[0].op.name != "nn.conv2d": diff --git a/tests/python/relay/test_pass_partition_graph.py b/tests/python/relay/test_pass_partition_graph.py index 5e1a812fa3..f073a00c19 100644 --- a/tests/python/relay/test_pass_partition_graph.py +++ b/tests/python/relay/test_pass_partition_graph.py @@ -930,6 +930,10 @@ def test_dnnl_fuse(): conv2d_relu_pat = pattern elif pattern[0] == "dnnl.conv2d_sigmoid": conv2d_sigmoid_pat = pattern +elif pattern[0] == "dnnl.conv2d_bias_sum": +conv2d_bias_sum_pat = pattern +elif pattern[0] == "dnnl.conv2d_bias_sum_relu": +conv2d_bias_sum_relu_pat = pattern def get_blocks( prefix, @@ -1009,6 +1013,52 @@ def test_dnnl_fuse(): mod = get_partitoned_mod(mod, params, pattern_table) assert len(mod.functions) - 1 == num_expected_partition # -1 for main +def test_sum_pattern(pattern_table, num_expected_partition): +def get_conv2d_bn_sum_relu( +x_shape=(1, 32, 8, 8), +k_shape=(16, 32, 3, 3), +sum_shape=(1, 16, 6, 6), +dtype="float32", +): +x = relay.var("x", shape=(x_shape), dtype=dtype) +kernel = relay.const(np.random.randint(0, 1, k_shape).astype(dtype)) +bias = relay.var("bias", shape=(k_shape[0],), dtype=dtype) +beta = relay.const(np.zeros(k_shape[0]).astype(dtype)) +gamma = relay.const(np.ones(k_shape[0]).astype(dtype)) +moving_mean = relay.const(np.zeros(k_shape[0]).astype(dtype)) +moving_var = relay.const(np.ones(k_shape[0]).astype(dtype)) +sum_data = relay.var("data1", shape=sum_shape, dtype=dtype) + +dic = {"x": x_shape, "bias": (k_shape[0],), "sum_data": sum_shape} +param_lst = ["bias", "sum_data"] + +conv = relay.nn.conv2d( +x, +kernel, +channels=k_shape[0], +kernel_size=k_shape[2:4], +) +conv_bias = relay.nn.bias_add(conv, bias) +conv_bias_bn, _, _ = relay.nn.batch_norm( +conv_bias, +gamma=gamma, +beta=beta, +moving_mean=moving_mean, +moving_var=moving_var, +axis=1, +center=True, +scale=True, +epsilon=1e-5, +) +conv_bias_bn_sum = relay.add(conv_bias_bn, sum_data) +return relay.nn.relu(conv_bias_bn_sum), dic, param_lst + +net, dic, param_lst = get_conv2d_bn_sum_relu() +net = tvm.IRModule.from_expr(net) +params = {x: np.random.uniform(-1, 1, dic[x]).astype("float32") 
for x in param_lst} +mod = get_partitoned_mod(net, params, pattern_table) +assert len(mod.functions) - 1 == num_expected_partition # -1 for main + def test_partition(): # conv + bn + relu, conv + relu -> fused conv_bias_relu, conv, and relu test_detect_pattern([conv2d_bias_relu_pat], False, True, False, 3) @@ -1033,6 +1083,10 @@ def
[tvm] branch main updated (bd763d3c23 -> 151d6ab8ac)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from bd763d3c23 [Topi] add x86 schedule for batch_norm (#12321) add 151d6ab8ac [FIX,ROOFLINE] Only save tir functions for roofline (#12339) No new revisions were added by this update. Summary of changes: python/tvm/utils/roofline/__init__.py | 13 + 1 file changed, 5 insertions(+), 8 deletions(-)
[tvm] branch main updated (d6be6940bd -> bd763d3c23)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from d6be6940bd [BYOC-DNNL] Bug Fix (#12314) add bd763d3c23 [Topi] add x86 schedule for batch_norm (#12321) No new revisions were added by this update. Summary of changes: python/tvm/relay/op/strategy/x86.py | 12 ++ python/tvm/topi/x86/nn.py| 30 tests/python/topi/python/test_topi_batch_norm.py | 1 + 3 files changed, 43 insertions(+)
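The new schedule covers the TOPI batch_norm compute that the x86 strategy now dispatches to. As a point of reference, the sketch below exercises topi.nn.batch_norm through the portable te.create_schedule fallback, which is the generic path this commit replaces on x86; the name of the new x86 entry point comes from the strategy registration and is deliberately not called here.

import numpy as np
import tvm
from tvm import te, topi

shape = (1, 16, 32, 32)
x = te.placeholder(shape, name="x")
gamma = te.placeholder((shape[1],), name="gamma")
beta = te.placeholder((shape[1],), name="beta")
mean = te.placeholder((shape[1],), name="moving_mean")
var = te.placeholder((shape[1],), name="moving_var")

# topi.nn.batch_norm returns [normalized, moving_mean, moving_var]; inference only
# needs the first output.
out, _, _ = topi.nn.batch_norm(x, gamma, beta, mean, var)
s = te.create_schedule(out.op)
f = tvm.build(s, [x, gamma, beta, mean, var, out], target="llvm")

a = tvm.nd.array(np.random.randn(*shape).astype("float32"))
ones = tvm.nd.array(np.ones(shape[1], dtype="float32"))
zeros = tvm.nd.array(np.zeros(shape[1], dtype="float32"))
res = tvm.nd.empty(shape)
f(a, ones, zeros, zeros, ones, res)  # gamma=1, beta=0, mean=0, var=1 -> near identity
print(res.numpy().std())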
[tvm] branch main updated (ef39e46a1d -> d6be6940bd)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from ef39e46a1d [microTVM][Zephyr] Fix missing BOARD in CMakeCache file (#12338) add d6be6940bd [BYOC-DNNL] Bug Fix (#12314) No new revisions were added by this update. Summary of changes: python/tvm/relay/op/contrib/dnnl.py | 46 + tests/python/relay/test_pass_partition_graph.py | 42 ++ 2 files changed, 50 insertions(+), 38 deletions(-)
[tvm] branch main updated (4a6d655561 -> d3d1038e15)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 4a6d655561 [Pylint] Making hexagon tests pylint compliant Part 2 of N (#12176) add d3d1038e15 [ci][docker] Update GPU image (#12265) No new revisions were added by this update. Summary of changes: Jenkinsfile | 4 ++-- ci/jenkins/Jenkinsfile.j2 | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-)
[tvm] branch main updated: Enable conv family fused with mish (#12228)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new a49273e050 Enable conv family fused with mish (#12228) a49273e050 is described below commit a49273e05092480bde8593c6a137bb251b5dee6c Author: billishyahao AuthorDate: Mon Aug 1 16:56:49 2022 +0800 Enable conv family fused with mish (#12228) --- python/tvm/relay/op/contrib/dnnl.py | 11 +-- src/runtime/contrib/dnnl/dnnl_json_runtime.cc | 4 tests/python/contrib/test_dnnl.py | 15 +++ 3 files changed, 24 insertions(+), 6 deletions(-) diff --git a/python/tvm/relay/op/contrib/dnnl.py b/python/tvm/relay/op/contrib/dnnl.py index f17b325dce..46c20e947f 100644 --- a/python/tvm/relay/op/contrib/dnnl.py +++ b/python/tvm/relay/op/contrib/dnnl.py @@ -53,7 +53,7 @@ from .register import register_pattern_table logger = logging.getLogger("DNNL") -supported_post_elts = ["nn.relu", "tanh", "sigmoid", "clip", "gelu", "swish", None] +supported_post_elts = ["nn.relu", "tanh", "sigmoid", "clip", "gelu", "swish", "mish", None] def _register_external_op_helper(op_name, supported=True): @@ -137,6 +137,13 @@ def append_eltwise_ops(op, eltwise): elif eltwise == "swish": sig_out = is_op("sigmoid")(op) op = is_op("multiply")(op, sig_out) +elif eltwise == "mish": +const1 = wildcard() +exp = is_op("exp")(op) +add = is_op("add")(exp, const1) +log = is_op("log")(add) +tanh = is_op("tanh")(log) +op = is_op("multiply")(op, tanh) elif eltwise: op = is_op(eltwise)(op) return op @@ -411,7 +418,7 @@ def pattern_table(): ) ) -elt_list = ["nn.relu", "tanh", "sigmoid", "clip", "gelu", "swish", None] +elt_list = ["nn.relu", "tanh", "sigmoid", "clip", "gelu", "swish", "mish", None] for with_bias in [True, False]: for elt in elt_list: if not with_bias and not elt: diff --git a/src/runtime/contrib/dnnl/dnnl_json_runtime.cc b/src/runtime/contrib/dnnl/dnnl_json_runtime.cc index 1fe8fccc77..d019f4e811 100644 --- a/src/runtime/contrib/dnnl/dnnl_json_runtime.cc +++ b/src/runtime/contrib/dnnl/dnnl_json_runtime.cc @@ -191,6 +191,7 @@ class DNNLJSONRuntime : public JSONRuntimeBase { std::regex gelu_pat(".*_gelu.*"); std::regex swish_pat(".*_swish.*"); std::regex sum_pat(".*_sum.*"); +std::regex mish_pat(".*_mish.*"); // parsing of name to extract attributes auto op_name = nodes_[nid].GetOpName(); @@ -220,6 +221,9 @@ class DNNLJSONRuntime : public JSONRuntimeBase { if (std::regex_match(op_name, gelu_pat)) { ops.append_eltwise(1.f, dnnl::algorithm::eltwise_gelu_erf, 0.f, 0.f); } +if (std::regex_match(op_name, mish_pat)) { + ops.append_eltwise(1.f, dnnl::algorithm::eltwise_mish, 1.f, 0.f); +} if (ops.len() != 0) { attr.set_post_ops(ops); } diff --git a/tests/python/contrib/test_dnnl.py b/tests/python/contrib/test_dnnl.py index 74d0da1238..8de8bd9ce6 100755 --- a/tests/python/contrib/test_dnnl.py +++ b/tests/python/contrib/test_dnnl.py @@ -252,6 +252,13 @@ def add_activation(activation, out, dic, param_lst): elif activation == "gelu": out = gelu_helper(out) return out, dic, param_lst +elif activation == "mish": +exp = relay.exp(out) +add = relay.add(exp, relay.const(1.0)) +log = relay.log(add) +tanh = relay.tanh(log) +out = relay.multiply(out, tanh) +return out, dic, param_lst else: return out, dic, param_lst @@ -765,7 +772,7 @@ def test_conv2d_weights_const(run_module, dtype="float32"): def test_conv2d_pattern(run_module, dtype="float32"): x_shape = (1, 32, 8, 8) k_shape = (16, 32, 3, 3) -activation_lst = [None, 
"relu", "tanh", "sigmoid", "clip", "swish", "gelu"] +activation_lst = [None, "relu", "tanh", "sigmoid", "clip", "swish", "gelu", "mish"] for a in activation_lst: conv2d, dic, param_lst = get_conv2d(x_shape, k_shape, activation=a, dtype=dtype) conv2d = tvm.IRModule.from_expr(conv2d) @@ -849,7 +856,7 @@ def test_conv2d_transpose(run_module, dtype="float32"): def test_conv2d_transpose_pattern(run_module, dtype="
[tvm] branch main updated: [BYOC-DNNL] add post_sum pattern (#12151)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new c07d77f99c [BYOC-DNNL] add post_sum pattern (#12151) c07d77f99c is described below commit c07d77f99c024b9e2c162b574482dbbbd71d4680 Author: Ivy Zhang AuthorDate: Mon Aug 1 09:32:52 2022 +0800 [BYOC-DNNL] add post_sum pattern (#12151) * add post_sum pattern * add checkers for sum pattern * fix lint * fix error in test_pass_partition_graph * fix lint error --- python/tvm/relay/op/contrib/dnnl.py | 106 +++- src/runtime/contrib/dnnl/dnnl_json_runtime.cc | 14 +++- tests/python/contrib/test_dnnl.py | 42 ++ tests/python/relay/test_pass_partition_graph.py | 6 +- 4 files changed, 164 insertions(+), 4 deletions(-) diff --git a/python/tvm/relay/op/contrib/dnnl.py b/python/tvm/relay/op/contrib/dnnl.py index fa98ed002c..f17b325dce 100644 --- a/python/tvm/relay/op/contrib/dnnl.py +++ b/python/tvm/relay/op/contrib/dnnl.py @@ -33,8 +33,10 @@ it is supported. For example: check the attributes of the op and decide if it should be offloaded to DNNL. """ import logging +from functools import reduce import tvm.ir +from tvm.ir import Op from tvm import relay from tvm.relay import transform from tvm.relay.expr import GlobalVar @@ -44,7 +46,7 @@ from tvm.relay.expr import const from tvm.relay.analysis import analysis as _analysis from tvm.relay import expr as _expr - +from tvm.relay.expr import Call, TupleGetItem from ... import _ffi_api from ...dataflow_pattern import wildcard, is_op, is_constant, is_expr, rewrite, DFPatternCallback from .register import register_pattern_table @@ -167,6 +169,94 @@ def make_conv_pattern(conv_name, with_bias=True, with_eltwise=None): return append_eltwise_ops(conv_out, with_eltwise) +def make_conv_bias_sum_relu_pattern(conv_type, has_relu=True): +"""Create patterns with sum op. + +Parameters +-- +conv_type : str +Should be nn.conv1d / nn.conv2d / nn.conv3d. +has_relu : bool +Whether attach relu. +Returns +--- +out : CallPattern +Call node sequence. 
+""" +data1 = wildcard() +weight = wildcard() +bias = wildcard() +data2 = wildcard() +out = is_op(conv_type)(data1, weight) +out = is_op("add")(out, bias) +out = is_op("add")(out, data2) +if has_relu: +out = is_op("nn.relu")(out) +return out + + +def get_op_name(expr): +"""Get the operator name from an expression.""" +if isinstance(expr, Op): +return expr.name +if isinstance(expr, Call): +return get_op_name(expr.op) +if isinstance(expr, TupleGetItem): +return get_op_name(expr.tuple_value) +if isinstance(expr, relay.Tuple): +return get_op_name(expr.fields[0]) +return "" + + +def get_args(expr): +"""Get the arguments from an expression.""" +if isinstance(expr, Call): +return expr.args +if isinstance(expr, TupleGetItem): +return get_args(expr.tuple_value) +if isinstance(expr, relay.Tuple): +return [arg for args in map(get_args, expr.fields) for arg in args] +return [] + + +def get_attrs(expr): +"""Get the attributes from an expression.""" +if isinstance(expr, Call): +return expr.attrs +if isinstance(expr, TupleGetItem): +return get_attrs(expr.tuple_value) +return {} + + +def make_predicate(checker): +"""Check whether the conv_bias_add_sum pattern is as expected.""" + +def predicate(expr): +if get_op_name(expr) == "nn.relu": +expr = expr.args[0] +for e, op_name in zip([expr, expr.args[0]], ["sum", "bias_add"]): +args = get_args(e) +attrs = get_attrs(e.args[0]) +if not checker(attrs, args, op_name): +return False +return True + +return predicate + + +def add_checker(attrs, args, op_name): +"""Check if add is supported by DNNL.""" +if op_name == "sum": +if tuple(get_shape(args[0])) != tuple(get_shape(args[1])): +return False +if op_name == "bias_add": +channel = dict(attrs)["channels"] +const_shape = get_shape(args[1]) +if channel != reduce(lambda x, y: x * y, const_shape): +return False +return True + + def make_dense_pattern(with_bias=True, with_eltwise=None): """Create patterns related to nn.dense. @@ -306,6 +396,20 @@ def pattern_table():
[tvm] branch main updated: remove duplicated cast op when lowering qnn.requantize op in float mode (#12234)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new fb87c21bf8 remove duplicated cast op when lowering qnn.requantize op in float mode (#12234) fb87c21bf8 is described below commit fb87c21bf8d0fa5edec96a054a57a6d37c11289f Author: paperplanet AuthorDate: Sat Jul 30 16:28:39 2022 +0800 remove duplicated cast op when lowering qnn.requantize op in float mode (#12234) --- src/relay/qnn/op/requantize.cc | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/src/relay/qnn/op/requantize.cc b/src/relay/qnn/op/requantize.cc index 2a6153e810..5bf53a95ed 100644 --- a/src/relay/qnn/op/requantize.cc +++ b/src/relay/qnn/op/requantize.cc @@ -303,10 +303,7 @@ Expr RequantizeLowerFP(const Expr& input_tensor, const Expr& input_scale, -1, }), rank, {axis}); -tensor = Subtract(Cast(tensor, DataType::Float(Bits)), - Cast(input_zero_broadcast, DataType::Float(Bits))); - } else { -tensor = Cast(tensor, DataType::Float(Bits)); +tensor = Subtract(tensor, Cast(input_zero_broadcast, DataType::Float(Bits))); } // 2) If the input and output scales are same, we can skip the multiplication. Check
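For context, qnn.requantize rescales an integer tensor from one (scale, zero point) pair to another; the float-mode lowering touched here casts once to float, subtracts the input zero point, rescales, and rounds back to the output dtype, and the patch drops a second, redundant cast. A small sketch of the op itself using the standard QNN API (independent of the patch):

import tvm
from tvm import relay

x = relay.var("x", shape=(1, 64), dtype="int32")
out = relay.qnn.op.requantize(
    x,
    input_scale=relay.const(0.05, "float32"),
    input_zero_point=relay.const(3, "int32"),
    output_scale=relay.const(0.1, "float32"),
    output_zero_point=relay.const(0, "int32"),
    out_dtype="int8",
)
mod = tvm.IRModule.from_expr(out)
# QnnCanonicalize runs during the build and lowers requantize into
# cast / subtract / multiply / round primitives.
lib = relay.build(mod, target="llvm")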
[tvm] branch main updated (ebbce649f0 -> cfefa90a96)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from ebbce649f0 [microTVM][tutorial] AOT host-driven tutorial with TFLite model (#12182) add cfefa90a96 [Adreno] Update conv2d_nhwc test to use winograd (#12214) No new revisions were added by this update. Summary of changes: tests/python/relay/test_conv2d_nhwc_texture.py | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-)
[tvm] branch main updated: [MetaSchedule] Integration test for CUDA AutoTensorization (#12142)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new ee319d9d23 [MetaSchedule] Integration test for CUDA AutoTensorization (#12142) ee319d9d23 is described below commit ee319d9d23c80091da9c4fb764b1e6d49d462714 Author: Wuwei Lin AuthorDate: Thu Jul 28 02:03:27 2022 -0700 [MetaSchedule] Integration test for CUDA AutoTensorization (#12142) * [MetaSchedule] Integration test for CUDA AutoTensorization * cleanup * fix --- python/tvm/meta_schedule/default_config.py | 52 +++ src/meta_schedule/schedule_rule/auto_bind.cc | 3 + .../test_meta_schedule_auto_tensorize.py | 74 -- 3 files changed, 123 insertions(+), 6 deletions(-) diff --git a/python/tvm/meta_schedule/default_config.py b/python/tvm/meta_schedule/default_config.py index e99dd1383a..dc021e1731 100644 --- a/python/tvm/meta_schedule/default_config.py +++ b/python/tvm/meta_schedule/default_config.py @@ -349,3 +349,55 @@ class _DefaultCUDA: M.MutateUnroll(): 0.08, M.MutateThreadBinding(): 0.02, } + + +class _DefaultCUDATensorCore: +"""Default tuning configuration for CUDA TensorCore.""" + +@staticmethod +def schedule_rules(): +from tvm.meta_schedule import schedule_rule as M +from tvm.tir.tensor_intrin import get_wmma_intrin_group + +return [ +M.MultiLevelTilingTensorCore( +intrin_groups=[ +get_wmma_intrin_group( +store_scope="shared", +in_dtype="float16", +out_dtype="float16", +trans_b=trans_b, +) +for trans_b in [False, True] +], +structure="SSSRRSRS", +tile_binds=["blockIdx.y", "blockIdx.x", "threadIdx.y"], +max_innermost_factor=4, +vector_load_lens=[1, 2, 3, 4], +reuse_read=M.ReuseType(req="must", levels=[4], scope="shared"), +reuse_write=M.ReuseType( +req="must", +levels=[2], +scope="shared", +), +), +*_DefaultCUDA.schedule_rules(), +] + +@staticmethod +def postprocs() -> List[Postproc]: +from tvm.meta_schedule import postproc as M + +return [ +M.DisallowDynamicLoop(), +M.RewriteCooperativeFetch(), +M.RewriteUnboundBlock(), +M.RewriteParallelVectorizeUnroll(), +M.RewriteReductionBlock(), +M.RewriteTensorize(), +M.VerifyGPUCode(), +] + +@staticmethod +def mutator_probs() -> Dict[Mutator, float]: +return _DefaultCUDA.mutator_probs() diff --git a/src/meta_schedule/schedule_rule/auto_bind.cc b/src/meta_schedule/schedule_rule/auto_bind.cc index a67432ebc5..ff4d26084e 100644 --- a/src/meta_schedule/schedule_rule/auto_bind.cc +++ b/src/meta_schedule/schedule_rule/auto_bind.cc @@ -34,6 +34,9 @@ void BindBlockThreadIdx(const tir::Schedule& sch, const tir::BlockRV& block_rv, if (block_sref->parent == nullptr) { return; } + if (tir::HasBeenMultiLevelTiled(block_sref)) { +return; + } Array loops = tir::GetLoops(block_sref); int n = loops.size(); int i_block_idx = -1; diff --git a/tests/python/integration/test_meta_schedule_auto_tensorize.py b/tests/python/integration/test_meta_schedule_auto_tensorize.py index b855dc6fa0..b1525df10e 100644 --- a/tests/python/integration/test_meta_schedule_auto_tensorize.py +++ b/tests/python/integration/test_meta_schedule_auto_tensorize.py @@ -27,6 +27,7 @@ from tvm import meta_schedule as ms from tvm import relay from tvm.meta_schedule import ApplyHistoryBest, postproc, schedule_rule from tvm.meta_schedule.relay_integration import extract_task_from_relay +from tvm.meta_schedule.testing import relay_workload from tvm.meta_schedule.testing.tlcbench import load_quantized_bert_base from tvm.meta_schedule.tune import tune_extracted_tasks from 
tvm.tir.tensor_intrin import AMDGPU_SDOT4_INTRIN, DP4A_INTRIN @@ -337,10 +338,71 @@ def test_dp4a_bert_int8(): # _test_bert_int8("rocm", sch_rules_for_sdot4, postprocs_for_dp4a) +@tvm.testing.requires_gpu +@pytest.mark.skip("Slow on CI") +@pytest.mark.parametrize( +["model_name", "input_shape"], +[("bert_base", (8, 128)), ("resnet_18", (16, 3, 224, 224)), ("resnet_50", (16, 3, 224, 224))], +) +def test_cuda_tensor_core(model_name, input_sh
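The test above drives MetaSchedule end to end on CUDA tensor cores. For orientation, here is a hedged sketch of only the first step, extracting tunable tasks from a Relay model; extract_task_from_relay is the helper imported in the test, the CPU target keeps the sketch runnable without a GPU (on a tensor-core target the _DefaultCUDATensorCore rules above apply instead), and the helper's exact keyword handling may differ slightly between versions.

import numpy as np
import tvm
from tvm import relay
from tvm.meta_schedule.relay_integration import extract_task_from_relay

x = relay.var("x", shape=(1, 16, 32, 32), dtype="float32")
w = relay.var("w", shape=(32, 16, 3, 3), dtype="float32")
y = relay.nn.relu(relay.nn.conv2d(x, w, channels=32, kernel_size=(3, 3)))
mod = tvm.IRModule.from_expr(y)
params = {"w": np.random.randn(32, 16, 3, 3).astype("float32")}

tasks = extract_task_from_relay(mod, target=tvm.target.Target("llvm"), params=params)
for task in tasks:
    print(task.task_name, task.weight)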
[tvm] branch main updated: [Adreno] Fix winograd tests and accuracy (#12202)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 4bcaecf979 [Adreno] Fix winograd tests and accuracy (#12202) 4bcaecf979 is described below commit 4bcaecf979fb17eaec4df80da534c8bc82933fb3 Author: Egor Churaev AuthorDate: Thu Jul 28 08:00:50 2022 +0300 [Adreno] Fix winograd tests and accuracy (#12202) * [Adreno] Fix winograd tests and accuracy * Fix lint * Fix test on cpu --- python/tvm/topi/adreno/conv2d_winograd_common.py | 16 --- tests/python/relay/test_conv2d_nchw_texture.py | 60 +++- tests/python/relay/utils/adreno_utils.py | 25 -- 3 files changed, 88 insertions(+), 13 deletions(-) diff --git a/python/tvm/topi/adreno/conv2d_winograd_common.py b/python/tvm/topi/adreno/conv2d_winograd_common.py index 6d11c1fe73..b0cec0f702 100644 --- a/python/tvm/topi/adreno/conv2d_winograd_common.py +++ b/python/tvm/topi/adreno/conv2d_winograd_common.py @@ -90,6 +90,7 @@ def conv2d_winograd_comp( convert_from4d = False if len(data.shape) == 4: +convert_from4d = True if layout == "NCHW": N, DCI, H, W = get_const_tuple(data.shape) else: @@ -120,7 +121,6 @@ def conv2d_winograd_comp( data = tvm.te.placeholder(dshape, data.dtype, name="data_placeholder") kernel = tvm.te.placeholder(kshape, kernel.dtype, name="kernel_placeholder") else: -convert_from4d = True data = pack_input( data, layout, N, in_channel_chunks, in_channel_block, in_channel_tail, H, W ) @@ -220,9 +220,9 @@ def conv2d_winograd_comp( idxdiv = tvm.tir.indexdiv idxmod = tvm.tir.indexmod if layout == "NCHW": -N, CI, H, W, CB = get_const_tuple(data.shape) +N, CI, _, _, CB = get_const_tuple(data.shape) else: -N, H, W, CI, CB = get_const_tuple(data.shape) +N, _, _, CI, CB = get_const_tuple(data.shape) # pack input tile if layout == "NCHW": @@ -494,16 +494,18 @@ def schedule_conv2d_winograd(cfg, s, output, pre_computed): s[OL].set_scope("local") output = s.outputs[0] -m = alpha - 3 + 1 if len(s[output].op.axis) == 4: n, co, h, w = s[output].op.axis +cb = None else: -n, co, h, w, _ = s[output].op.axis -ho, wo, hi, wi = s[output].tile(h, w, m, m) +n, co, h, w, cb = s[output].op.axis inverse_scope, n = s[output].split(n, nparts=1) -fused = s[output].fuse(n, co, ho, wo) +fused = s[output].fuse(n, co, h, w) bb, tt = s[output].split(fused, 128) +if cb is not None: +s[output].reorder(bb, tt, cb) +s[output].vectorize(cb) s[output].bind(bb, te.thread_axis("blockIdx.x")) s[output].bind(tt, te.thread_axis("threadIdx.x")) diff --git a/tests/python/relay/test_conv2d_nchw_texture.py b/tests/python/relay/test_conv2d_nchw_texture.py index 89f68dacbd..2dd88f6118 100644 --- a/tests/python/relay/test_conv2d_nchw_texture.py +++ b/tests/python/relay/test_conv2d_nchw_texture.py @@ -20,6 +20,7 @@ import tvm import numpy as np from tvm import relay from tvm.relay import testing +from tvm.contrib import utils from utils.adreno_utils import gpu_preprocess, build_run_compare @@ -432,6 +433,63 @@ def test_conv2d_vgg16_winograd_4d(): "bias": tvm.nd.array(bias_data), } -graph = build_run_compare(mod, params1, {"data": input_shape}, dtype, target) +temp = utils.tempdir() +stat_file = temp.relpath("stat.log") +with open(stat_file, "w") as f: +f.write( +'{"input": ["opencl -keys=adreno,opencl,gpu -device=adreno -max_num_threads=256", "conv2d_nchw_winograd_acc32.image2d", [["TENSOR", [1, 512, 28, 28], "float16"], ["TENSOR", [512, 512, 3, 3], "float16"], [1, 1], [1, 1, 1, 1], [1, 1], "float16"], {}], 
"config": {"index": 1591, "code_hash": null, "entity": [["auto_unroll_max_step", "ot", 4], ["tile_y", "sp", [-1, 1, 32]], ["tile_x", "sp", [-1, 4, 2]], ["tile_rc", "sp", [-1, 8]]]}, "result": [[0.0037244], 0, 7.06374192237854, 165 [...] +) +graph = build_run_compare( +mod, params1, {"data": input_shape}, dtype, target, stat_file=stat_file +) +matches = re.findall("winograd", graph) +assert len(matches) > 0 + + +@tvm.testing.requires_opencl +def test_conv2d_winograd_conv(): +target = "opencl --device=adreno" +dtype = "float1
[tvm] branch main updated: [Relay][PyTorch] Add aten::lerp (#12167)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new c35c9fd3a5 [Relay][PyTorch] Add aten::lerp (#12167) c35c9fd3a5 is described below commit c35c9fd3a5249cfb01093b08b35979db846dfa33 Author: xndcn AuthorDate: Thu Jul 28 12:59:30 2022 +0800 [Relay][PyTorch] Add aten::lerp (#12167) --- python/tvm/relay/frontend/pytorch.py | 11 +++ tests/python/frontend/pytorch/test_forward.py | 15 +++ 2 files changed, 26 insertions(+) diff --git a/python/tvm/relay/frontend/pytorch.py b/python/tvm/relay/frontend/pytorch.py index b88e08b719..1bd3232871 100644 --- a/python/tvm/relay/frontend/pytorch.py +++ b/python/tvm/relay/frontend/pytorch.py @@ -343,6 +343,16 @@ class PyTorchOpConverter: diag_input = _op.zeros(input_shape, dtype=input_types[0]) return _op.matrix_set_diag(data, diag_input, k=(k1, k2)) +def lerp(self, inputs, input_types): +if len(inputs) != 3: +msg = "Wrong number of arguments (%d) to parse." % (len(inputs)) +raise AssertionError(msg) + +start = inputs[0] +end = inputs[1] +weight = inputs[2] +return start + weight * (end - start) + def arange(self, inputs, input_types): def _get_value(val, dtype): # dtype is a tvm dtype @@ -3412,6 +3422,7 @@ class PyTorchOpConverter: "aten::stft": self.stft, "aten::mul": self.make_elemwise("multiply"), "aten::pow": self.make_elemwise("power"), +"aten::lerp": self.lerp, "aten::arange": self.arange, "aten::meshgrid": self.meshgrid, "aten::div": self.make_elemwise("divide"), diff --git a/tests/python/frontend/pytorch/test_forward.py b/tests/python/frontend/pytorch/test_forward.py index 6d7926396a..4332f3efe5 100644 --- a/tests/python/frontend/pytorch/test_forward.py +++ b/tests/python/frontend/pytorch/test_forward.py @@ -4596,5 +4596,20 @@ def test_softmax_fuse(): tvm.testing.assert_allclose(out, output_torch, rtol=1e-5, atol=1e-5) +@tvm.testing.uses_gpu +def test_lerp(): +def test_fn(x, y, w): +return torch.lerp(x, y, w) + +input_shape = [16] +x = torch.rand(input_shape).float() +y = torch.rand(input_shape).float() +w = torch.rand(input_shape).float() + +# weight can be tensor or scalar +verify_model(test_fn, [x, y, w]) +verify_model(test_fn, [x, y, w[0]]) + + if __name__ == "__main__": pytest.main([__file__])
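aten::lerp(start, end, weight) is translated directly to start + weight * (end - start) in Relay. A short illustrative sketch of importing a traced PyTorch module that uses it (requires the torch package; not part of the patch):

import torch
import tvm
from tvm import relay

class Lerp(torch.nn.Module):
    def forward(self, x, y, w):
        return torch.lerp(x, y, w)

shape = (16,)
example = (torch.rand(shape), torch.rand(shape), torch.rand(shape))
traced = torch.jit.trace(Lerp(), example)

shape_list = [("x", shape), ("y", shape), ("w", shape)]
mod, params = relay.frontend.from_pytorch(traced, shape_list)
print(mod["main"])  # the body is built from subtract / multiply / add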
[tvm] branch main updated (96b151751a -> 85aa597245)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 96b151751a [OpenCL] Fix profiling hang for OpenCL device (#12173) add 85aa597245 [JIT] Force finalization of JITed code, expose sf/hf runtime functions (#12187) No new revisions were added by this update. Summary of changes: include/tvm/runtime/builtin_fp16.h | 2 ++ src/runtime/builtin_fp16.cc| 3 +++ src/target/llvm/llvm_module.cc | 6 ++ 3 files changed, 11 insertions(+)
[tvm] branch main updated (98d5feb297 -> 96b151751a)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 98d5feb297 [docs] Update tlcpack-sphinx-addon (#12188) add 96b151751a [OpenCL] Fix profiling hang for OpenCL device (#12173) No new revisions were added by this update. Summary of changes: .../graph_executor/debug/graph_executor_debug.cc | 2 +- src/runtime/opencl/opencl_common.h | 33 +--- src/runtime/opencl/opencl_device_api.cc| 2 + tests/cpp-runtime/opencl/opencl_timer_test.cc | 61 ++ 4 files changed, 90 insertions(+), 8 deletions(-) create mode 100644 tests/cpp-runtime/opencl/opencl_timer_test.cc
[tvm] branch main updated: [ci] Skip broken android_rpc failures (#12192)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 421f9d756a [ci] Skip broken android_rpc failures (#12192) 421f9d756a is described below commit 421f9d756a6ac48b9c3b886f7941a14dae133f5d Author: driazati <9407960+driaz...@users.noreply.github.com> AuthorDate: Tue Jul 26 18:52:27 2022 -0700 [ci] Skip broken android_rpc failures (#12192) See #12191 Co-authored-by: driazati --- .github/workflows/main.yml | 6 ++ 1 file changed, 6 insertions(+) diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 313c440cbd..eb346e4605 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -121,26 +121,31 @@ jobs: make jvmpkg - name: Build android_rpc working-directory: apps/android_rpc +continue-on-error: true run: | export PATH="${ANDROID_NDK_HOME}:$PATH" gradle clean build - name: Upload android_rpc APK uses: actions/upload-artifact@v2 +continue-on-error: true with: name: android_rpc-debug.apk path: ./apps/android_rpc/app/build/outputs/apk/debug/app-debug.apk - name: Build android_deploy working-directory: apps/android_deploy +continue-on-error: true run: | export PATH="${ANDROID_NDK_HOME}:$PATH" gradle clean build - name: Upload android_deploy APK uses: actions/upload-artifact@v2 +continue-on-error: true with: name: android_deploy-debug.apk path: ./apps/android_deploy/app/build/outputs/apk/debug/app-debug.apk - name: Build android_camera working-directory: apps/android_camera +continue-on-error: true run: | mkdir -p app/src/main/assets/models/ export TVM_NDK_CC=${ANDROID_NDK_HOME}/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android30-clang++ @@ -156,6 +161,7 @@ jobs: gradle clean build - name: Upload android_camera APK uses: actions/upload-artifact@v2 +continue-on-error: true with: name: android_camera-debug.apk path: ./apps/android_camera/app/build/outputs/apk/debug/app-debug.apk \ No newline at end of file
[tvm] branch main updated: TVM Vertical Integration with PyTorch (#11911)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new ea6ea42757 TVM Vertical Integration with PyTorch (#11911) ea6ea42757 is described below commit ea6ea42757f275573391eb3ff67034b2749948ae Author: Yaoda Zhou AuthorDate: Tue Jul 26 16:00:44 2022 +0800 TVM Vertical Integration with PyTorch (#11911) * optimize_torch & as_torch * split files * code formatting * optimizing optimized_torch * scrap your boilerplate * as_torch polished * configuration fixed * Apply suggestions from code review Co-authored-by: Lite Ye * more document * file deleter * optimize deleter * drop how-to guides * clang-format-10 * formatter changes * reformat * reformat * reformat * reformatting * fixed * auto setting * fixed * split long string * tune_tir * upgrade as_torch * optimize as_torch * as_torch * fixed typo Co-authored-by: juda Co-authored-by: Lite Ye --- apps/pt_tvmdsoop/tests/test_as_torch.py| 257 apps/pt_tvmdsoop/tests/test_optimize_torch.py | 161 + python/tvm/contrib/torch/__init__.py | 12 +- python/tvm/contrib/torch/as_torch.py | 124 ++ python/tvm/contrib/torch/optimize_torch.py | 198 python/tvm/script/parser.py| 16 +- src/contrib/torch/base64.h | 75 ++ .../torch/pt_call_tvm/RuntimeModuleWrapper.cc | 259 + 8 files changed, 1099 insertions(+), 3 deletions(-) diff --git a/apps/pt_tvmdsoop/tests/test_as_torch.py b/apps/pt_tvmdsoop/tests/test_as_torch.py new file mode 100644 index 00..2c454e9454 --- /dev/null +++ b/apps/pt_tvmdsoop/tests/test_as_torch.py @@ -0,0 +1,257 @@ +#!/usr/bin/env python + +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. 
+"""Test script for tvm torch module""" +import numpy as np + +import torch +import torch.nn + +import tvm +from tvm.meta_schedule.tune import TuneConfig +from tvm.target.target import Target +import tvm.testing +from tvm.contrib.torch import as_torch +from tvm.script import tir as T + + +@as_torch +def matmul(M: int, N: int, K: int, dtype: str): +@T.prim_func +def main(a: T.handle, b: T.handle, c: T.handle) -> None: +A = T.match_buffer(a, [M, K], dtype=dtype) +B = T.match_buffer(b, [N, K], dtype=dtype) +C = T.match_buffer(c, [M, N], dtype=dtype) +for i, j, k in T.grid(M, N, K): +with T.block(): +vi, vj, vk = T.axis.remap("SSR", [i, j, k]) +with T.init(): +C[vi, vj] = T.float32(0) +C[vi, vj] = C[vi, vj] + A[vi, vk] * B[vj, vk] + +return main + + +@as_torch +@tvm.script.ir_module +class ModuleGPU: +@T.prim_func +def main(A: T.Buffer[8, "float32"], B: T.Buffer[8, "float32"]) -> None: +T.func_attr({"global_symbol": "main", "tir.noalias": True}) +for i_0 in T.thread_binding(2, thread="blockIdx.x"): +for i_2 in T.thread_binding(2, thread="threadIdx.x"): +for i_1 in T.serial(2): +with T.block("B"): +vi = T.axis.spatial(8, i_0 * 4 + i_1 * 2 + i_2) +T.reads(A[vi]) +T.writes(B[vi]) +B[vi] = A[vi] + T.float32(1) + + +@as_torch +@T.prim_func +def func_with_part_access_region(a: T.handle, b: T.handle, c: T.handle) -> None: +A = T.match_buffer(a, [128, 128]) +B = T.match_buffer(b, [128, 128]) +C = T.match_buffer(c, [128, 128]) + +with T.block(): +for i, j in T.grid(128, 128): +with T.block("s1"): +v
[tvm] branch main updated (9963b59ffa -> 9bef7de9f0)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 9963b59ffa fix typo (#12183) add 9bef7de9f0 [Doc] Fix link error in pipeline executor tutorial (#12185) No new revisions were added by this update. Summary of changes: gallery/how_to/work_with_relay/using_pipeline_executor.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[tvm] branch main updated (19e5ec6576 -> 21d54f9880)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 19e5ec6576 [hexagon][testing] sequential input tensors (#12168) add 21d54f9880 [PyTorch] Add aten::numpy_T (#12179) No new revisions were added by this update. Summary of changes: python/tvm/relay/frontend/pytorch.py | 14 -- tests/python/frontend/pytorch/test_forward.py | 12 2 files changed, 24 insertions(+), 2 deletions(-)
[tvm] branch main updated: [BYOC-DNNL] suppport more dnnl ops (#11823)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 6eb3a1fc36 [BYOC-DNNL] suppport more dnnl ops (#11823) 6eb3a1fc36 is described below commit 6eb3a1fc3688bede69efd7f14b313b35497ccf02 Author: Ivy Zhang AuthorDate: Mon Jul 25 15:16:16 2022 +0800 [BYOC-DNNL] suppport more dnnl ops (#11823) * support dnnl.global_avg_pooling2d * fuse pad-avg_pool2d * fix lint --- python/tvm/relay/op/contrib/dnnl.py | 14 ++ src/runtime/contrib/dnnl/dnnl_json_runtime.cc | 36 + tests/python/contrib/test_dnnl.py | 25 + tests/python/relay/test_pass_partition_graph.py | 4 +-- 4 files changed, 66 insertions(+), 13 deletions(-) diff --git a/python/tvm/relay/op/contrib/dnnl.py b/python/tvm/relay/op/contrib/dnnl.py index 228619e0ef..fa98ed002c 100644 --- a/python/tvm/relay/op/contrib/dnnl.py +++ b/python/tvm/relay/op/contrib/dnnl.py @@ -89,6 +89,7 @@ _register_external_op_helper("nn.conv3d_transpose") _register_external_op_helper("nn.dense") _register_external_op_helper("nn.max_pool2d") _register_external_op_helper("nn.avg_pool2d") +_register_external_op_helper("nn.global_avg_pool2d") _register_external_op_helper("nn.max_pool3d") _register_external_op_helper("nn.avg_pool3d") _register_external_op_helper("abs") @@ -459,6 +460,18 @@ def tag2layout(input_data, is_weight=False, conv_type="Conv1D"): return res +def legalize_pad_avg_pool(attrs, inputs, types): +"""Legalize pad->avg_pool2d pattern. +Fuse this pattern into one avg_pool2d with padding = (1, 1), +and count_include_pad = True""" +data = inputs[0] +new_attrs = dict(attrs) +if isinstance(data, relay.expr.Call) and data.op.name == "nn.pad": +new_attrs["padding"] = (1, 1) +new_attrs["count_include_pad"] = True +return relay.nn.avg_pool2d(data.args[0], **new_attrs) + + def legalize_group_conv(attrs, inputs, types): """Legalize group conv / conv_transpose calculation. Alter weight layout from OIHW to GOIHW / IOHW to GIOHW""" @@ -575,6 +588,7 @@ class IsComputeIntensiveGraph(ExprVisitor): "nn.dense", "nn.layer_norm", "nn.batch_matmul", +"nn.global_avg_pool2d", ] ) if isinstance(call.op, tvm.tir.op.Op): diff --git a/src/runtime/contrib/dnnl/dnnl_json_runtime.cc b/src/runtime/contrib/dnnl/dnnl_json_runtime.cc index 93c53dda16..dcf1a86785 100644 --- a/src/runtime/contrib/dnnl/dnnl_json_runtime.cc +++ b/src/runtime/contrib/dnnl/dnnl_json_runtime.cc @@ -624,32 +624,46 @@ class DNNLJSONRuntime : public JSONRuntimeBase { void Pooling(const size_t& nid, dnnl::algorithm algo) { auto node = nodes_[nid]; +auto op_name = node.GetOpName(); +bool is_global = op_name.find("global") != std::string::npos; auto src_tr = GetInput(nid, 0); auto dst_tr = GetOutput(nid, 0); -// Setup attributes. -auto strides = GetNodeAttr>(node, "strides"); -auto dilates = GetNodeAttr>(node, "dilation"); -auto padding = GetNodeAttr>(node, "padding"); -std::vector padding_l(padding.begin(), padding.begin() + padding.size() / 2); -std::vector padding_r(padding.begin() + padding.size() / 2, padding.end()); -auto kernel = GetNodeAttr>(node, "pool_size"); +// Get layout. auto src_layout = GetNodeAttr(node, "layout"); auto dst_layout = GetNodeAttr(node, "out_layout"); // dst_layout == "" means to use data_layout if (dst_layout.empty()) dst_layout = src_layout; -// Minus one for DNNL representation. No dilation for DNNL is 0, for relay is 1. 
-for (auto& d : dilates) d--; - // Take into account provided layout strings src_tr = src_tr.TreatAs(src_layout); dst_tr = dst_tr.TreatAs(dst_layout); +ICHECK(src_tr.dims().size() > 2); + +std::vector feature_size; +for (size_t i = 2; i < src_tr.dims().size(); i++) { + feature_size.push_back(int64_t(src_tr.dims()[i])); +} + +// Set attributes. +auto kernel = is_global ? feature_size : GetNodeAttr>(node, "pool_size"); +auto strides = is_global ? std::vector(src_tr.dims().size() - 2, 1) + : GetNodeAttr>(node, "strides"); +auto dilates = is_global ? std::vector(src_tr.dims().size() - 2, 1) + : GetNodeAttr>(node, "dilation"); +auto padding = is_global
[tvm] branch main updated: [Runtime][PipelineExecutor] Tutorial of using pipeline executor. (#11557)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new ecd3c884de [Runtime][PipelineExecutor] Tutorial of using pipeline executor. (#11557) ecd3c884de is described below commit ecd3c884de6b37d10b766bc9300bc71ee3776402 Author: Hua Jiang AuthorDate: Fri Jul 22 12:54:24 2022 -0700 [Runtime][PipelineExecutor] Tutorial of using pipeline executor. (#11557) * [Runtime][PipelineExecutor] Tutorial of using pipeline executor. Tutorial of using pipeline executor including the byoc use case. * fix ci issue * document change. * triger build * fix doc issue * fix ci issue * doc issue * fix ci issue * fix ci issue. * fix __file__ not found problem. this is a known issue of sphinx-gallery https://github.com/sphinx-gallery/sphinx-gallery/issues/211 * fix byoc with dnnl issue * enable dnnl and pipeline executor * trigger build * trigger build * fix build issue * trigger build * oneflow cause crash, do test with change * add sphinx skip * plint * remove from_oneflow change test. * remove pipeline executor change for test * plint * enable DNNL and pipeline * disable DNNL * enable DNNL without pipeline * remove dnnl and add cutlass * use cutlass with byoc * change into cutlass * fix doc convention issue * remove duplicate variable * fix plint issue. * address review comments. * address review comments * fix bug. * polish the document * fix plint issue * address review comments. * address review comments * address review comments --- .../work_with_relay/using_pipeline_executor.py | 248 + python/tvm/contrib/pipeline_executor.py| 26 ++- python/tvm/contrib/pipeline_executor_build.py | 14 +- tests/scripts/task_config_build_gpu.sh | 2 + 4 files changed, 281 insertions(+), 9 deletions(-) diff --git a/gallery/how_to/work_with_relay/using_pipeline_executor.py b/gallery/how_to/work_with_relay/using_pipeline_executor.py new file mode 100755 index 00..5496058265 --- /dev/null +++ b/gallery/how_to/work_with_relay/using_pipeline_executor.py @@ -0,0 +1,248 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +""" +Using Pipeline Executor in Relay += +**Author**: `Hua Jiang <https://https://github.com/huajsj>`_ + +This is a short tutorial on how to use "Pipeline Executor" with Relay. 
+""" +import tvm +from tvm import te +import numpy as np +from tvm.contrib import graph_executor as runtime +from tvm.relay.op.contrib.cutlass import partition_for_cutlass +from tvm import relay +from tvm.relay import testing +import tvm.testing +from tvm.contrib.cutlass import ( +has_cutlass, +num_cutlass_partitions, +finalize_modules, +finalize_modules_vm, +) + +img_size = 8 +### +# Create a simple network, this network can be a pre-trained model too. +# - +# Let's create a very simple network for demonstration. +# It consists of convolution, batch normalization, dense, and ReLU activation. +def get_network(): +out_channels = 16 +batch_size = 1 +data = relay.var("data", relay.TensorType((batch_size, 3, img_size, img_size), "float16")) +dense_weight = relay.var( +"dweight", relay.TensorType((batch_size, 16 * img_size * img_size), "float16") +) +weight = relay.var("weight") +second_weight = relay.var("second_weight") +bn_gamma = relay.var("bn_gamma") +bn_beta = relay.var("bn_beta") +bn_mmean = relay.var("