Re: [PR] [3rdparty] AUTO mode for custom all-reduce strategy [tvm]
yongwww merged PR #16797: URL: https://github.com/apache/tvm/pull/16797 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
(tvm) branch main updated: [3rdparty] AUTO mode for custom all-reduce strategy (#16797)
This is an automated email from the ASF dual-hosted git repository.

yongwww pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new 2f889774ec [3rdparty] AUTO mode for custom all-reduce strategy (#16797)
2f889774ec is described below

commit 2f889774ec10b56ebfac89f78698e06eb200db46
Author: Ruihang Lai
AuthorDate: Wed Mar 27 01:30:09 2024 -0400

    [3rdparty] AUTO mode for custom all-reduce strategy (#16797)

    This PR adds automatic mode selection for the customized all-reduce
    kernels, following TensorRT-LLM.

    Meanwhile, this PR fixes a bug that may cause the customized all-reduce
    kernel to hang forever. Prior to this PR, each worker reset its
    barrier values to 0 *after using all-gather to exchange the barrier
    handles*. Afterwards, the customized all-reduce kernels update the
    barriers of all workers. So it was possible for worker 0 to update
    worker 1's barrier *before* worker 1 reset its barrier to 0. This
    led to the all-reduce kernel hanging forever.

    This PR changes the behavior to resetting barriers before all-gather,
    and forcing a device synchronization after the reset.
---
 3rdparty/tensorrt_llm/custom_allreduce_kernels.h | 33 ++
 .../tvm/relax/transform/ipc_allreduce_rewrite.py | 2 --
 src/runtime/disco/cuda_ipc/cuda_ipc_memory.cc | 26 +++--
 src/runtime/disco/cuda_ipc/custom_allreduce.cc | 12 ++--
 tests/python/disco/test_custom_allreduce.py | 4 +++
 5 files changed, 63 insertions(+), 14 deletions(-)

diff --git a/3rdparty/tensorrt_llm/custom_allreduce_kernels.h b/3rdparty/tensorrt_llm/custom_allreduce_kernels.h
index 7fd66e5d10..7c515a03ac 100644
--- a/3rdparty/tensorrt_llm/custom_allreduce_kernels.h
+++ b/3rdparty/tensorrt_llm/custom_allreduce_kernels.h
@@ -25,8 +25,10 @@ constexpr size_t MAX_RANKS_PER_NODE = 8;
 constexpr size_t DEFAULT_BLOCK_SIZE = 1024;
 
 enum class AllReduceStrategyType : int8_t {
+  RING = 0,
   ONESHOT = 1,
   TWOSHOT = 2,
+  AUTO = 3,
 };
 
 struct AllReduceParams {
@@ -42,6 +44,37 @@ struct AllReduceParams {
   void* local_output_buffer_ptr;
 };
 
+inline size_t GetMaxRequiredWorkspaceSize(int world_size) {
+  if (world_size <= 2) {
+    return 16 * 1000 * 1000;
+  }
+  return 8 * 1000 * 1000;
+}
+
+inline AllReduceStrategyType SelectImplementation(size_t message_size, int world_size) {
+  const size_t maxWorkspaceSize = GetMaxRequiredWorkspaceSize(world_size);
+
+  if (message_size > maxWorkspaceSize) {
+    return AllReduceStrategyType::RING;
+  }
+
+  if (world_size <= 2) {
+    return AllReduceStrategyType::ONESHOT;
+  }
+
+  if (world_size <= 4) {
+    if (message_size < 1 * 1000 * 1000) {
+      return AllReduceStrategyType::ONESHOT;
+    }
+    return AllReduceStrategyType::TWOSHOT;
+  }
+
+  if (message_size < 500 * 1000) {
+    return AllReduceStrategyType::ONESHOT;
+  }
+  return AllReduceStrategyType::TWOSHOT;
+}
+
 void customAllReduce(AllReduceParams& params, void* data, size_t elts, DLDataType dataType,
                      AllReduceStrategyType strat, cudaStream_t stream);
 
diff --git a/python/tvm/relax/transform/ipc_allreduce_rewrite.py b/python/tvm/relax/transform/ipc_allreduce_rewrite.py
index 3e7b005a60..df40181cb9 100644
--- a/python/tvm/relax/transform/ipc_allreduce_rewrite.py
+++ b/python/tvm/relax/transform/ipc_allreduce_rewrite.py
@@ -40,8 +40,6 @@ class IPCAllReduceRewrite:
             The all-reduce strategy. Only "1" and "2" are supported.
             "1" stands for one-shot, and "2" stands for two-shot.
         """
-        if allreduce_strategy not in [1, 2]:
-            raise ValueError(f"All-reduce strategy {allreduce_strategy} is not supported.")
         self.allreduce_strategy = allreduce_strategy
 
     def transform_module(self, mod: IRModule, _ctx: tvm.transform.PassContext) -> IRModule:
diff --git a/src/runtime/disco/cuda_ipc/cuda_ipc_memory.cc b/src/runtime/disco/cuda_ipc/cuda_ipc_memory.cc
index 451c3df0cb..fec5abec86 100644
--- a/src/runtime/disco/cuda_ipc/cuda_ipc_memory.cc
+++ b/src/runtime/disco/cuda_ipc/cuda_ipc_memory.cc
@@ -91,15 +91,13 @@ class CUDAIPCMemoryAllocator final : public memory::PooledAllocator {
  private:
   void* DeviceAllocDataSpace(Device dev, size_t size, size_t alignment,
                              DLDataType type_hint) final {
-    auto [data_ptr, data_comm_ptrs] = AllocIPCMemory(dev, size, alignment, type_hint);
+    auto [data_ptr, data_comm_ptrs] =
+        AllocIPCMemory(dev, size, alignment, type_hint, /*reset_memory_to_zero=*/false);
     int barrier_ptr_size = sizeof(uint32_t) * (MAX_ALL_REDUCE_BLOCKS + 2) * MAX_RANKS_PER_NODE;
-    auto [barrier_in_ptr, barrier_in_comm_ptrs] =
-        AllocIPCMemory(dev, barrier_ptr_size, alignment, DataType::UInt(32));
-    auto [barrier_out_ptr, barrier_out_comm_ptrs] =
-        AllocIPCMemory(dev, barrier_ptr_size,
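The selection heuristic added to `custom_allreduce_kernels.h` in this commit can be mirrored in a few lines of Python to see which strategy a given message size and world size would pick. This is only an illustrative port of the C++ shown in the diff; the Python names (`Strategy`, `max_required_workspace_size`, `select_implementation`) are made up for this sketch and are not part of TVM.

```python
from enum import IntEnum


class Strategy(IntEnum):
    """Illustrative mirror of AllReduceStrategyType from the diff."""
    RING = 0
    ONESHOT = 1
    TWOSHOT = 2
    AUTO = 3


def max_required_workspace_size(world_size: int) -> int:
    # Same thresholds as GetMaxRequiredWorkspaceSize in the diff.
    return 16 * 1000 * 1000 if world_size <= 2 else 8 * 1000 * 1000


def select_implementation(message_size: int, world_size: int) -> Strategy:
    # Messages larger than the workspace fall back to RING.
    if message_size > max_required_workspace_size(world_size):
        return Strategy.RING
    # With at most two workers, one-shot is always chosen.
    if world_size <= 2:
        return Strategy.ONESHOT
    # For up to four workers the one-shot cutoff is 1,000,000.
    if world_size <= 4:
        if message_size < 1 * 1000 * 1000:
            return Strategy.ONESHOT
        return Strategy.TWOSHOT
    # For more workers the one-shot cutoff drops to 500,000.
    if message_size < 500 * 1000:
        return Strategy.ONESHOT
    return Strategy.TWOSHOT
```

For example, `select_implementation(1_000_000, 4)` returns `Strategy.TWOSHOT`, while the same message size with two workers returns `Strategy.ONESHOT`.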
Re: [PR] [TIR] Modify IntImmNode deep_equal to match regardless of type [tvm]
quic-sanirudh commented on PR #16795:
URL: https://github.com/apache/tvm/pull/16795#issuecomment-2021948537

> Yeah I think fixing the dtype is a good idea, it would hopefully avoid this kind of problem in the future as well. Out of interest, what were the mismatching dtypes of the two compared `IntImmNode`s that you observed @quic-sanirudh?

Thanks @ekalda. I'll update the PR to fix the dtypes in RampNode (and perhaps the broadcast node as well).

The dtypes in my case were `int32` and `int64`. The expression I saw was something like this (slightly simplified):

`T.Broadcast(c, 128) + T.Ramp(T.int64(0), T.int64(1), T.int64(128))`

The RampNode seems to get the int64 lanes because all the iterators in our case are int64 by default, but the broadcast seems to be inserted during the [evaluation of AddNode in op.cc here](https://github.com/apache/tvm/blob/d43e1ab71d5d9e16bbc962d4d7952dcc7a1cdbca/src/tir/op/op.cc#L126-L139).
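The mismatch described in this comment can be illustrated without TVM. The sketch below uses a hypothetical `IntImm` dataclass (a stand-in, not the TVM class) and assumes the inserted broadcast carries `int32` lanes while the ramp carries `int64` lanes, as reported above. It shows why a dtype-sensitive comparison treats the two lane counts of 128 as different, whereas a value-only comparison does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntImm:
    """Hypothetical stand-in for an integer immediate: a dtype plus a value."""
    dtype: str
    value: int


def strict_equal(a: IntImm, b: IntImm) -> bool:
    # Compares both dtype and value, so int32(128) != int64(128).
    return a.dtype == b.dtype and a.value == b.value


def value_equal(a: IntImm, b: IntImm) -> bool:
    # Compares the value only, ignoring the dtype.
    return a.value == b.value


broadcast_lanes = IntImm("int32", 128)  # assumed dtype of the inserted broadcast
ramp_lanes = IntImm("int64", 128)       # lanes built from int64 iterators

print(strict_equal(broadcast_lanes, ramp_lanes))  # False
print(value_equal(broadcast_lanes, ramp_lanes))   # True
```

Fixing the lanes to a single dtype, as proposed in the thread, would make the strict comparison succeed without relaxing it.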
(tvm) branch nightly updated (b2204ae698 -> d43e1ab71d)
github-bot pushed a change to branch nightly
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from b2204ae698 [IR] Default to empty attributes, instead of NULL (#16745)
     add 69c091400a [Fix] Fix build errors with VS2022 (#16790)
     add ae7b8d9aed [Codegen, Cuda] Add overload for fp8x4 e5m2 <-> half4 conversion (#16787)
     add 72f0326a88 [Analysis] Allow calls to GlobalVar in @R.function (#16778)
     add bf2d43e314 [IR][Relax] Improve highlighting in assert_structural_equal (#16756)
     add bcfbcabff8 [Bugfix][Cutlass] Remove a typo in cutlass build (#16789)
     add 016b512ad4 [Relax] Refactor PatternRewriter into separate Block/Expr mutators (#16730)
     add 8274d142a3 [Relax] Implement operators to inspec DLTensor::strides and offset (#16721)
     add 571fdaf1eb [Web] Add `kv_state` and `rnn_state` to wasm_runtime (#16791)
     add 4f3a863c1f [Cutlass] Add check for group gemm param shapes (#16788)
     add ac2f47867f [SME] Add support for inserting processor state annotations (#16761)
     add a768ee4900 [Fix] fix for numpy 2.0 compatibility (#16793)
     add d43e1ab71d [Doc] Fix set_axis_separator example (#16792)

No new revisions were added by this update.

Summary of changes:
 include/tvm/relax/analysis.h | 6 +-
 include/tvm/relax/dataflow_matcher.h | 4 +-
 include/tvm/relax/expr.h | 24 +-
 python/tvm/_ffi/runtime_ctypes.py | 2 +-
 python/tvm/contrib/cutlass/build.py | 2 +-
 python/tvm/relax/analysis/analysis.py | 8 +-
 python/tvm/relax/expr.py | 97
 .../tvm/relax/transform/legalize_ops/__init__.py | 1 +
 .../tvm/relax/transform/legalize_ops/inspect_op.py | 128 +++
 python/tvm/relay/frontend/paddlepaddle.py | 2 +-
 python/tvm/relay/frontend/pytorch.py | 4 +-
 python/tvm/script/parser/core/entry.py | 26 ++-
 python/tvm/tir/schedule/schedule.py | 2 +-
 python/tvm/topi/arm_cpu/pstate_attributes.py | 84 +++
 src/node/structural_equal.cc | 45 ++--
 src/relax/analysis/well_formed.cc | 47 ++--
 src/relax/ir/dataflow_matcher.cc | 238 ++-
 src/relax/ir/expr.cc | 50
 src/relax/op/tensor/inspect.cc | 180 ---
 src/relax/op/tensor/inspect.h | 39
 src/runtime/contrib/cutlass/fp8_group_gemm.cu | 4 +-
 src/runtime/metadata.cc | 3 +-
 src/target/llvm/codegen_aarch64.cc | 102 +
 src/target/source/literal/cuda_half_t.h | 23 +-
 src/tir/analysis/identify_memcpy.cc | 2 +-
 src/tir/contrib/ethosu/passes.cc | 2 +-
 src/tir/transforms/lower_tvm_builtin.cc | 36 ++-
 .../python/codegen/test_target_codegen_aarch64.py | 116 +-
 .../contrib/test_msc/test_translate_tensorflow.py | 2 +-
 tests/python/frontend/pytorch/test_forward.py | 4 +-
 tests/python/frontend/tensorflow/test_forward.py | 2 +-
 tests/python/relax/test_analysis_well_formed.py | 34 +++
 tests/python/relax/test_op_inspect.py | 252 +
 tests/python/relax/test_op_unpack.py | 127 ---
 tests/python/relax/test_tvmscript_parser.py | 37 +++
 tests/python/relax/test_utils.py | 63 +-
 tests/python/relay/test_op_level3.py | 4 +-
 .../test_tir_transform_lower_tvm_builtin.py | 37 ++-
 tests/python/topi/test_topi_math.py | 4 +-
 web/emcc/wasm_runtime.cc | 2 +
 40 files changed, 1480 insertions(+), 365 deletions(-)
 create mode 100644 python/tvm/relax/transform/legalize_ops/inspect_op.py
 create mode 100644 python/tvm/topi/arm_cpu/pstate_attributes.py
 create mode 100644 src/target/llvm/codegen_aarch64.cc
 create mode 100644 tests/python/relax/test_op_inspect.py
 delete mode 100644 tests/python/relax/test_op_unpack.py
[I] [Bug] [VTA, RPC] Can’t upload custom bit file by RPC on ZCU104 [tvm]
muonkmu opened a new issue, #16799:
URL: https://github.com/apache/tvm/issues/16799

I am testing VTA in the following environment.

Target: ZCU104 (PYNQ 2.7)
Host: Ubuntu 20.04 + TVM (v0.16.dev0)
Xilinx tools: Vivado 2020.1

I successfully synthesized the `vta.bit` file for the ZCU104 and successfully launched the RPC server on the ZCU104. However, when I try to upload `vta.bit` using `vta.program_fpga(remote, bitstream="vta.bit")`, the following error occurs.

Which versions of TVM and PYNQ are known to be compatible? Is there a solution for this?

```bash
Traceback (most recent call last):
  File "Simple_Matrix_Multiply.py", line 24, in <module>
    vta.program_fpga(remote, bitstream="vta.bit")
  File "/home/minwook/Workspace/Study_lab/71_tvm/tvm/vta/python/vta/rpc_client.py", line 66, in program_fpga
    fprogram(os.path.basename(bitstream))
  File "/home/minwook/Workspace/Study_lab/71_tvm/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 239, in __call__
    raise_last_ffi_error()
  File "/home/minwook/Workspace/Study_lab/71_tvm/tvm/python/tvm/_ffi/base.py", line 481, in raise_last_ffi_error
    raise py_err
tvm.error.RPCError: Traceback (most recent call last):
  3: tvm::runtime::RPCWrappedFunc::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
  2: tvm::runtime::RPCClientSession::CallFunc(void*, TVMValue const*, int const*, int, std::function const&)
  1: tvm::runtime::RPCEndpoint::CallFunc(void*, TVMValue const*, int const*, int, std::function)
  0: tvm::runtime::RPCEndpoint::HandleUntilReturnEvent(bool, std::function)
  File "/home/minwook/Workspace/Study_lab/71_tvm/tvm/src/runtime/rpc/rpc_endpoint.cc", line 427
RPCError: Error caught from RPC call:
```
Re: [PR] [RELAX] Tuning capability for external cuBLAS codegen [tvm]
vinx13 commented on code in PR #16764:
URL: https://github.com/apache/tvm/pull/16764#discussion_r1539988421

## src/runtime/contrib/cublas/cublas_json_runtime.cc: ##
@@ -129,14 +132,50 @@ class CublasJSONRuntime : public JSONRuntimeBase {
         auto [a_ptr, b_ptr, bias_ptr] = get_inputs(node, epilogue != CUBLASLT_EPILOGUE_DEFAULT);
 
+        const cublasLtMatmulAlgo_t* predef_algo_ptr = nullptr;
+        int64_t dyn_dim_val = dl_tensors[std::get<0>(dyn_dim_position)]->shape[std::get<1>(dyn_dim_position)];
+        auto algo_desc = algo_collection(dyn_dim_val);
+        if (algo_desc.defined())
+          predef_algo_ptr = &algo_desc->algo;
+
         tvm::contrib::CallCublasLt(entry_ptr->handle, stream, entry_ptr->matmul_pref_desc, a_ptr,
                                    b_ptr, bias_ptr, out_ptr, transa, transb,
-                                   entry_ptr->workspace_ptr, entry_ptr->workspace_size, epilogue);
+                                   entry_ptr->workspace_ptr, entry_ptr->workspace_size, epilogue,
+                                   predef_algo_ptr);
       }
     }
   }
 
   void Run() override { LOG(FATAL) << "Unreachable"; }
+
+ protected:
+  void LoadPredefAlgoCollection() {
+    for (const auto& node : nodes_) {
+      if (node.GetOpType() == "kernel" && node.HasAttr("predefined_algos")) {
+        // Load algo collection
+        auto predef_algos_str = node.GetAttr>("predefined_algos");
+        ICHECK_EQ(predef_algos_str.size(), 1);
+        algo_collection = tvm::contrib::AlgoCollection::FromJSON(predef_algos_str[0]);
+
+        // Define dynamic dimension position
+        for (const auto& ne : node.GetInputs()) {
+          auto shape = nodes_[ne.id_].GetOpShape()[ne.index_];
+          auto found = std::find(shape.begin(), shape.end(), -1);
+          if (found != shape.end()) {
+            uint32_t dyn_dim_idx = std::distance(shape.begin(), found);
+            uint32_t dyn_dim_eid = EntryID(ne);
+            dyn_dim_position = {dyn_dim_eid, dyn_dim_idx};

Review Comment:
When there are multiple nodes with predefined algos, does this overwrite the results of previous iterations?

## src/relax/backend/contrib/cublas/algo_db.h: ##
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+/*!
+ * \brief Codegen part of tuning capabilities for cublas matmul primitives.
+ */
+
+#include
+
+#include "../../../../runtime/contrib/cublas/cublas_algo.h"
+
+namespace tvm {
+namespace relax {
+namespace contrib {
+
+using AlgoCollection = tvm::contrib::AlgoCollection;
+using AlgoDesc = tvm::contrib::AlgoDesc;
+
+/*! \brief Algo database with predefined Algo objects. */
+class AlgoDatabaseNode: public runtime::Object {
+  /*! \brief Mapping of compisite func struct hash to algo colelction. */
+  std::map collections;
+
+public:
+  void VisitAttrs(tvm::AttrVisitor* v) {
+// v->Visit("collections", );

Review Comment:
Remove this.

## src/runtime/contrib/cublas/cublas_json_runtime.cc: ##
@@ -129,14 +132,50 @@ class CublasJSONRuntime : public JSONRuntimeBase {
         auto [a_ptr, b_ptr, bias_ptr] = get_inputs(node, epilogue != CUBLASLT_EPILOGUE_DEFAULT);
 
+        const cublasLtMatmulAlgo_t* predef_algo_ptr = nullptr;
+        int64_t dyn_dim_val = dl_tensors[std::get<0>(dyn_dim_position)]->shape[std::get<1>(dyn_dim_position)];
+        auto algo_desc = algo_collection(dyn_dim_val);
+        if (algo_desc.defined())
+          predef_algo_ptr = &algo_desc->algo;

Review Comment:
nit
```suggestion
        if (algo_desc.defined()) {
          predef_algo_ptr = &algo_desc->algo;
        }
```
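The concern in the first review comment, a single member being reassigned inside a loop over nodes, can be shown in isolation. The sketch below is a simplified, hypothetical model rather than the runtime's actual data structures: each node is reduced to an id plus one shape, and `-1` marks the dynamic dimension as in the quoted C++. It contrasts a single slot, where the last matching node wins, with a per-node map that keeps every result.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical nodes: (node_id, shape), where -1 marks a dynamic dimension.
nodes: List[Tuple[int, List[int]]] = [(0, [-1, 64]), (1, [32, -1])]


def find_dynamic_dim(shape: List[int]) -> Optional[int]:
    """Return the index of the first dynamic dimension, or None."""
    return shape.index(-1) if -1 in shape else None


# Single slot: every matching node overwrites the previous result.
single_position: Optional[Tuple[int, int]] = None
for node_id, shape in nodes:
    idx = find_dynamic_dim(shape)
    if idx is not None:
        single_position = (node_id, idx)

# Per-node map: each node keeps its own dynamic-dimension position.
positions: Dict[int, Tuple[int, int]] = {}
for node_id, shape in nodes:
    idx = find_dynamic_dim(shape)
    if idx is not None:
        positions[node_id] = (node_id, idx)

print(single_position)  # (1, 1)
print(positions)        # {0: (0, 0), 1: (1, 1)}
```

Whether the overwrite matters in the PR depends on whether more than one kernel node can carry `predefined_algos`, which is exactly what the reviewer is asking.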
(tvm) branch main updated (a768ee4900 -> d43e1ab71d)
wuwei pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from a768ee4900 [Fix] fix for numpy 2.0 compatibility (#16793)
     add d43e1ab71d [Doc] Fix set_axis_separator example (#16792)

No new revisions were added by this update.

Summary of changes:
 python/tvm/tir/schedule/schedule.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Re: [PR] [Doc] Fix set_axis_separator example [tvm]
vinx13 merged PR #16792: URL: https://github.com/apache/tvm/pull/16792
(tvm) branch main updated (ac2f47867f -> a768ee4900)
tqchen pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from ac2f47867f [SME] Add support for inserting processor state annotations (#16761)
     add a768ee4900 [Fix] fix for numpy 2.0 compatibility (#16793)

No new revisions were added by this update.

Summary of changes:
 python/tvm/_ffi/runtime_ctypes.py | 2 +-
 python/tvm/relay/frontend/paddlepaddle.py | 2 +-
 python/tvm/relay/frontend/pytorch.py | 4 ++--
 tests/python/contrib/test_msc/test_translate_tensorflow.py | 2 +-
 tests/python/frontend/pytorch/test_forward.py | 4 ++--
 tests/python/frontend/tensorflow/test_forward.py | 2 +-
 tests/python/relay/test_op_level3.py | 4 +---
 tests/python/topi/test_topi_math.py | 4 +---
 8 files changed, 10 insertions(+), 14 deletions(-)
Re: [PR] [Fix] fix for numpy 2.0 compatibility [tvm]
tqchen merged PR #16793: URL: https://github.com/apache/tvm/pull/16793
Re: [PR] [Target] Use LLVM target parser for determining Arm(R) A-Profile Architecture features [tvm]
cbalint13 commented on PR #16425:
URL: https://github.com/apache/tvm/pull/16425#issuecomment-2021120932

> > Here is a reproducer:
> > mem_leak.cpp
>
> Thanks a lot for this, I start to look at it now.

@lhutton1, here is a patch: [tvm-llvm-memleak.diff.gz](https://github.com/apache/tvm/files/14762596/tvm-llvm-memleak.diff.gz)

Can you confirm that it works on your side?
Re: [PR] [SME] Add support for inserting processor state annotations [tvm]
ekalda merged PR #16761: URL: https://github.com/apache/tvm/pull/16761
(tvm) branch main updated (4f3a863c1f -> ac2f47867f)
ekalda pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from 4f3a863c1f [Cutlass] Add check for group gemm param shapes (#16788)
     add ac2f47867f [SME] Add support for inserting processor state annotations (#16761)

No new revisions were added by this update.

Summary of changes:
 python/tvm/topi/arm_cpu/pstate_attributes.py | 84 +++
 src/target/llvm/codegen_aarch64.cc | 102 ++
 .../python/codegen/test_target_codegen_aarch64.py | 116 -
 3 files changed, 300 insertions(+), 2 deletions(-)
 create mode 100644 python/tvm/topi/arm_cpu/pstate_attributes.py
 create mode 100644 src/target/llvm/codegen_aarch64.cc
Re: [PR] [SME] Add support for inserting processor state annotations [tvm]
ekalda commented on PR #16761: URL: https://github.com/apache/tvm/pull/16761#issuecomment-2021063350 Thanks @lhutton1!
Re: [PR] [TIR] Modify IntImmNode deep_equal to match regardless of type [tvm]
ekalda commented on PR #16795: URL: https://github.com/apache/tvm/pull/16795#issuecomment-2021055007 Yeah I think fixing the dtype is a good idea, it would hopefully avoid this kind of problem in the future as well. Out of interest, what were the mismatching dtypes of the two compared `IntImmNode`s that you observed @quic-sanirudh?
Re: [PR] [Relax] Improve CanonicalizeBindings in DataflowVar edge case [tvm]
Lunderberg commented on PR #16783: URL: https://github.com/apache/tvm/pull/16783#issuecomment-2021040212 @tvm-bot rerun
Re: [PR] [Relax][Transform] Provide callback versions of LazyTransformParams [tvm]
Lunderberg commented on PR #16798: URL: https://github.com/apache/tvm/pull/16798#issuecomment-2021025223 This PR is currently marked as a draft, as the unit tests depend on functionality introduced in https://github.com/apache/tvm/pull/16642.
[PR] [Relax][Transform] Provide callback versions of LazyTransformParams [tvm]
Lunderberg opened a new pull request, #16798: URL: https://github.com/apache/tvm/pull/16798 Prior to this commit, the `LazyTransformParams` function could be used to load model parameters on demand. However, the function used to load or set parameters needed to be registered within the global registry of `PackedFunc`s. This PR provides `LazyGetInput` and `LazySetOutput` transforms, which perform the lazy-loading through an `R.Callable` callback argument, rather than through a globally-registered `PackedFunc`.
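The difference between the two styles described in this PR can be sketched in plain Python. Nothing below uses the TVM API: `_GLOBAL_REGISTRY`, `register`, `run_with_registry`, and `run_with_callback` are hypothetical names chosen to contrast a lookup in process-wide state with a loader passed in as an explicit argument.

```python
from typing import Callable, Dict, List

# Hypothetical process-wide registry, standing in for a global function table.
_GLOBAL_REGISTRY: Dict[str, Callable[[int], str]] = {}


def register(name: str, func: Callable[[int], str]) -> None:
    _GLOBAL_REGISTRY[name] = func


def run_with_registry(num_params: int) -> List[str]:
    # The loader is found by name in global state, so it must be
    # registered before this function runs.
    get_item = _GLOBAL_REGISTRY["get_item"]
    return [get_item(i) for i in range(num_params)]


def run_with_callback(num_params: int, fget_param: Callable[[int], str]) -> List[str]:
    # The loader is an explicit argument, so two callers can supply
    # different loaders without touching shared state.
    return [fget_param(i) for i in range(num_params)]


register("get_item", lambda i: f"registry-weight-{i}")
print(run_with_registry(2))                               # ['registry-weight-0', 'registry-weight-1']
print(run_with_callback(2, lambda i: f"local-weight-{i}"))  # ['local-weight-0', 'local-weight-1']
```

The callback form keeps the dependency visible in the function signature, which is the property the PR description attributes to the `R.Callable` argument.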
Re: [PR] [SVE] Support scalable vectors in LoopVectorizer [tvm]
ekalda commented on PR #16782:
URL: https://github.com/apache/tvm/pull/16782#issuecomment-2020983646

Thank you for your feedback @Lunderberg, much appreciated!

> The implementation looks reasonable, though I have one main question for it: What is the behavior of the updated pass for a target that doesn't support SVE? Prior SVE-commits enabled the functionality, but didn't produce SVE in any of the default lowering passes.
>
> From [this line](https://github.com/apache/tvm/pull/16696/files#diff-f61b04b100f5145f2681340c81d3f2af221239594ed01e2e24896522329ce92cR598-R600), versions of LLVM before 11.0 do not support SVE, nor from my brief reading of the CUDA codegen [here](https://github.com/apache/tvm/blob/main/src/target/source/codegen_cuda.cc#L253) does CUDA.

When it comes to targets that don't support SVE, I'd expect these targets not to trigger the creation of scalable vectors. In the current plan the creation of scalable vectors has to be intentional, i.e. it comes from splitting an axis by a `vscale`-dependent expression in the (target-dependent) schedules and vectorizing the resulting axis. If the `LoopVectorizer` is trying to create scalable vectors for a target that doesn't support them, something has gone wrong and the compilation will fall over at some point:

* If by some mistake a schedule that doesn't support VLA programming contains `vscale`, it will fall over in the target-dependent codegen at the latest.
* If there is an attempt to vectorize loops with a non-int extent that doesn't contain `vscale`, the "scalable ramp" creation will error, since it expects the `PrimExpr lanes` to be in the form `vscale * int`.

I realize though that this is a weird deviation from the current behaviour of

```
if (!extent_as_int || extent_as_int->value < 1) {
  LOG(FATAL) << "Failed to vectorize loop with extent " << op->extent;
}
```

so I'll modify the patch such that it checks for a target and fails as before if the extent is not an int.

> Since `VectorizeLoop` occurs after the `BindTarget` pass, we can check the function attribute to know which target will be executing each function. I think we should have the loop vectorization apply only to fixed-extent loops by default, but enable the scalable vectorization for targets that support it.

In principle I'm not against making vectorization for scalable vectors more explicitly target-specific, but it is not obvious to me what that would mean in terms of code. `ICHECK`s for the appropriate targets at the places where scalable vectors are created?
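One possible shape for the target-aware check discussed in this comment is sketched below. It is not the patch itself: the `Extent` encoding (a plain `int`, or a `("vscale", factor)` tuple standing for `vscale * factor`), the `classify_extent` function, and the `target_supports_sve` flag are all hypothetical simplifications of what the pass would read from the loop and the bound target.

```python
from typing import Tuple, Union

# Hypothetical encoding: 8 is a fixed extent, ("vscale", 4) means vscale * 4.
Extent = Union[int, Tuple[str, int]]


def classify_extent(extent: Extent, target_supports_sve: bool) -> str:
    """Return "fixed" or "scalable", or raise if the loop cannot be vectorized."""
    if isinstance(extent, int):
        # Existing behaviour: integer extents must be at least 1.
        if extent < 1:
            raise ValueError(f"Failed to vectorize loop with extent {extent}")
        return "fixed"
    kind, factor = extent
    # Scalable vectors are only created for `vscale * int` extents,
    # and only when the target is known to support them.
    if kind == "vscale" and isinstance(factor, int) and factor >= 1 and target_supports_sve:
        return "scalable"
    raise ValueError(f"Failed to vectorize loop with extent {extent}")


print(classify_extent(8, target_supports_sve=False))              # fixed
print(classify_extent(("vscale", 4), target_supports_sve=True))   # scalable
```

With this shape, a `vscale`-dependent extent on a target without SVE fails in the vectorizer with the same message as a non-integer extent did before, instead of falling over later in codegen.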
Re: [PR] [Cutlass] Add check for group gemm param shapes [tvm]
tqchen merged PR #16788: URL: https://github.com/apache/tvm/pull/16788
(tvm) branch main updated (571fdaf1eb -> 4f3a863c1f)
tqchen pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from 571fdaf1eb [Web] Add `kv_state` and `rnn_state` to wasm_runtime (#16791)
     add 4f3a863c1f [Cutlass] Add check for group gemm param shapes (#16788)

No new revisions were added by this update.

Summary of changes:
 src/runtime/contrib/cutlass/fp8_group_gemm.cu | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
Re: [PR] [TIR] Modify IntImmNode deep_equal to match regardless of type [tvm]
tqchen commented on PR #16795: URL: https://github.com/apache/tvm/pull/16795#issuecomment-2020842450 Ah OK, I think in this case we should try to come up with a rule for lanes. I think having a fixed dtype probably makes sense; then we handle casts for the related cases.
Re: [PR] [Web] Add `kv_state` and `rnn_state` to wasm_runtime [tvm]
tqchen merged PR #16791: URL: https://github.com/apache/tvm/pull/16791
(tvm) branch main updated (8274d142a3 -> 571fdaf1eb)
tqchen pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from 8274d142a3 [Relax] Implement operators to inspec DLTensor::strides and offset (#16721)
     add 571fdaf1eb [Web] Add `kv_state` and `rnn_state` to wasm_runtime (#16791)

No new revisions were added by this update.

Summary of changes:
 web/emcc/wasm_runtime.cc | 2 ++
 1 file changed, 2 insertions(+)
Re: [PR] [Relax] Allow R.Prim('bool') in relax::If and assert_op [tvm]
Lunderberg commented on PR #16642: URL: https://github.com/apache/tvm/pull/16642#issuecomment-2020783444 Rebased onto main to resolve a merge conflict.
Re: [PR] [SLM] Allow modules to define pre-processing of weights [tvm]
Lunderberg commented on code in PR #16785:
URL: https://github.com/apache/tvm/pull/16785#discussion_r1539406497

## python/tvm/relax/frontend/nn/op.py: ##
@@ -676,12 +676,31 @@ def permute_dims(x: Tensor, axes: Optional[List[int]] = None, name: str = None)
     result : Tensor
         The transposed result.
     """
+
+    # TODO(Lunderberg): This is a more extensive auto-naming than
+    # intended here. Is this still worth it?

Review Comment:
Long-term, I want to move this automatic naming from the `nn.Module` side to the Relax side, since it could then be performed after removal of trivial bindings.

I don't expect these chains to be deep, as it only tracks trivial bindings. The trivial binding from the Relax function parameter to the parameter's `param._expr` field should be the only one that would be tracked.
Re: [PR] [SLM] Allow modules to define pre-processing of weights [tvm]
Lunderberg commented on code in PR #16785:
URL: https://github.com/apache/tvm/pull/16785#discussion_r1539402961

## tests/python/relax/test_frontend_nn_packing.py: ##
@@ -25,7 +25,9 @@ def _iter_binding_names(mod):
     """Helper function to compare the names of relax variables"""
     for block in mod["forward"].body.blocks:
         for binding in block.bindings:
-            yield binding.var.name_hint
+            # Relax variable names may contain '.' even though it
+            # cannot be expressed in TVMScript.

Review Comment:
I could go either way. It's nice to have the 1:1 mapping between Relax and TVMScript, which would forbid the period within a Relax variable name. However, it's also nice to have a 1:1 mapping between a Relax function parameter and a weight tensor's name in a PyTorch or safetensors file, which is usually written with a period in the name.
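The tension described in this comment can be seen with Python's own identifier rules. The sketch assumes that a name is expressible as a variable in TVMScript only if it is a valid Python identifier, which is an assumption based on TVMScript using Python syntax; the helper name and the example weight names are hypothetical.

```python
def is_expressible_as_identifier(name: str) -> bool:
    """Assumption: a printable variable name must be a valid Python identifier."""
    return name.isidentifier()


# Typical checkpoint-style weight names use periods as separators.
checkpoint_name = "layers.0.attention.weight"
flattened_name = "layers_0_attention_weight"

print(is_expressible_as_identifier(checkpoint_name))  # False
print(is_expressible_as_identifier(flattened_name))   # True
```

Keeping the period preserves the 1:1 mapping to the checkpoint file, while replacing it preserves the 1:1 mapping to the script, and this is the trade-off the comment weighs.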
Re: [PR] [SLM] Allow modules to define pre-processing of weights [tvm]
Lunderberg commented on code in PR #16785: URL: https://github.com/apache/tvm/pull/16785#discussion_r1539397202 ## python/tvm/relax/frontend/nn/core.py: ## @@ -591,7 +609,22 @@ def wrap_nested(expr: rx.Expr, name: str) -> Union[Tensor, Sequence[Tensor]]: The computed result. """ if not isinstance(expr, rx.DataflowVar): -expr = BlockBuilder.current().emit(expr, name) +block_builder = BlockBuilder.current() +if block_builder is None: +# Normalize to make sure we have valid StructInfo, but +# wait until we are actually building the function to +# flatten nested expressions. +# +# TODO(Lunderberg): Make this easier to call. Infering +# struct info for a nested expression should be doable in +# a free function, without requiring an active +# BlockBuilder and an active FunctionFrame. Review Comment: Long-term, I think it would be nice to distinguish between local struct inference and non-local struct inference. The local inference could be applied when a relax object is constructed, which would avoid the current two-phase initialization of relax objects. Since this step can only perform local struct inference, which would be applied by default, this entire conditional could be removed. There are some kinks that would need to be worked out first. Some of the struct inference for tensor operations currently throws errors more often than I think it should. (e.g. `R.matmul` throws an exception if the arguments are not `R.Tensor`. If the arguments are `R.Object`, the exception is still thrown, even though `R.Tensor` is a subtype of `R.Object`.) These fallbacks would probably get more exercise with local inference, as there may be less information available.
Re: [PR] [SLM] Allow modules to define pre-processing of weights [tvm]
Lunderberg commented on code in PR #16785: URL: https://github.com/apache/tvm/pull/16785#discussion_r1539399718 ## python/tvm/relax/frontend/nn/exporter.py: ## @@ -190,34 +207,64 @@ def _convert_input(arg): def _params(mode: str) -> typing.List[rx.Var]: inputs: typing.List[rx.Var] = [] -def _get_var(shape_var: tir.Var) -> tir.Var: -name = shape_var.name -if name in str2var_params: -return str2var_params[name] -var = tir.Var(name, "int64") -str2var_params[name] = var -return var +def _normalize_dim(dim: typing.Union[int, str, tir.Var]) -> tir.PrimExpr: +if isinstance(dim, int): +return tir.IntImm("int64", dim) +elif isinstance(dim, str): +if dim in str2var_params: +return str2var_params[dim] +else: +new_var = tir.Var(dim, "int64") +str2var_params[dim] = new_var +return new_var +elif isinstance(dim, tir.Var): +return dim +else: +raise TypeError( +f"Expected dim to be int, str, or tir.Var, " +f"but {dim} was of type {type(dim)}." +) for name, param in params: # Make sure the a symbolic shape is not re-registered (same as _method_spec_to_inputs) # e.g. we do not see `vocab_size` for `lm_head` and `vocab_size_1` for `embed_tokens` -new_shape = [_get_var(x) if isinstance(x, tir.Var) else x for x in param.shape] -var = core.Tensor.placeholder(new_shape, param.dtype, name)._expr +new_shape = [_normalize_dim(dim) for dim in param._shape] +# var_cls = rx.DataflowVar if mode == "packed" else rx.Var Review Comment: Whoops, that was a test during dev work. Removing the commented-out `var_cls` line. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [SLM] Allow modules to define pre-processing of weights [tvm]
Lunderberg commented on code in PR #16785: URL: https://github.com/apache/tvm/pull/16785#discussion_r1539376002 ## python/tvm/relax/frontend/nn/exporter.py: ## @@ -135,9 +136,18 @@ def _effects() -> typing.List[typing.Tuple[str, core.Effect]]: with self.builder.dataflow(): outputs, inputs = _emit_method(self.builder, method_spec, params, effects) self.builder.emit_func_output(outputs, inputs) + +# TODO(Lunderberg): Make a `ir.transform.ConvertSSA`, +# similar to the existing `tir.transform.ConvertSSA`, +# that converts an entire module to SSA, including TIR +# variable definitions used in either TIR or Relax. Review Comment: Both Relax and TIR require SSA to be well-formed. However, there are a number of cases where a module could be unambiguously converted to SSA. (e.g. Two functions use the same `relax.Var` as a parameter, which can be fixed by substituting a new variable in one of the functions.) So, it wouldn't be a pass that would be called directly by end users, but would be for internal use. If a pass is most easily written in a way that results in the same symbolic variable occurring in multiple different functions, then this would be used as a post-processing pass. (e.g. Apply `BindSymbolicVars` to one variable in a function, then save the result as a new function in the same IRModule. Useful, but would duplicate all other symbolic variables.)
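The SSA fix-up described in this comment (two functions sharing one parameter variable, repaired by substituting a fresh variable in one of them) can be illustrated with a toy model. This is only a sketch under stated assumptions: `Var` and `convert_to_ssa` are hypothetical stand-ins, not TVM classes or passes, and a real pass would also have to substitute the fresh variable throughout the function body.

```python
from typing import Dict, List


class Var:
    """Toy stand-in for a variable; object identity (not name) distinguishes variables."""

    def __init__(self, name: str):
        self.name = name


def convert_to_ssa(functions: Dict[str, List[Var]]) -> Dict[str, List[Var]]:
    """Give each function its own copy of any parameter already defined elsewhere."""
    seen = set()  # ids of variables already defined by an earlier function
    result = {}
    for func_name, params in functions.items():
        new_params = []
        for var in params:
            if id(var) in seen:
                # Second definition of the same variable: substitute a fresh
                # variable that keeps the original name.
                var = Var(var.name)
            seen.add(id(var))
            new_params.append(var)
        result[func_name] = new_params
    return result


# Hypothetical module in which "main" and "helper" share the same variable object.
shared = Var("n")
module = {"main": [shared, Var("x")], "helper": [shared]}
ssa_module = convert_to_ssa(module)
```

After the conversion, `main` keeps the original variable while `helper` receives a distinct variable with the same name, so each variable has exactly one definition site.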
Re: [PR] [SVE] Support scalable vectors in LoopVectorizer [tvm]
lhutton1 commented on code in PR #16782: URL: https://github.com/apache/tvm/pull/16782#discussion_r1539232691 ## src/tir/ir/expr.cc: ## @@ -196,7 +196,9 @@ TVM_REGISTER_NODE_TYPE(StringImmNode); // Cast Cast::Cast(DataType t, PrimExpr value, Span span) { ICHECK(value.defined()); - ICHECK_EQ(t.lanes(), value.dtype().lanes()); + ICHECK_EQ(t.get_lanes_or_vscale_factor(), value.dtype().get_lanes_or_vscale_factor()); + ICHECK((t.is_scalable_vector() == value.dtype().is_scalable_vector()) || + (!t.is_scalable_vector() && !value.dtype().is_scalable_vector())); Review Comment: I think `a == b` already implies `!a && !b`, so the expression could be simplified to just `t.is_scalable_vector() == value.dtype().is_scalable_vector()` ## src/tir/transforms/vectorize_loop.cc: ## @@ -37,19 +37,36 @@ namespace tvm { namespace tir { -// TODO(ekalda): P5 in https://github.com/apache/tvm/issues/16455 -inline PrimExpr BroadcastTo(PrimExpr e, int lanes) { - if (e.dtype().lanes() == lanes) return e; +inline PrimExpr CreateNewLanes(bool is_scalable, int lanes_or_vscale_factor) { + if (is_scalable) { +return Mul(Call(DataType::Int(32), builtin::vscale(), {}), lanes_or_vscale_factor); + } else { +return lanes_or_vscale_factor; + } +} + +inline PrimExpr BroadcastTo(PrimExpr e, int lanes, bool is_scalable) { + // Check if e is already in the expected form + if (e.dtype().get_lanes_or_vscale_factor() == lanes && + e.dtype().is_scalable_vector() == is_scalable) +return e; + if (const BroadcastNode* op = e.as()) { -ICHECK(!e.dtype().is_scalable_vector()); -int broadcast_lanes = static_cast(Downcast(op->lanes)->value); -if (lanes % broadcast_lanes == 0) { - return Broadcast(op->value, lanes); +ICHECK(op->dtype.is_scalable_vector() == is_scalable) +<< "Can't broadcast between scalable and fixed length vectors."; +int e_lanes = is_scalable ? 
op->dtype.vscale_factor() : op->dtype.lanes(); Review Comment: nit: `get_lanes_or_vscale_factor()` ## src/tir/transforms/vectorize_loop.cc: ## @@ -433,20 +488,27 @@ class Vectorizer : public StmtMutator, public ExprFunctorVisitExpr(op->value); if (!indices.same_as(op->indices) || !value.same_as(op->value)) { + ICHECK(!op->buffer->dtype.is_scalable_vector()) + << "Vectorizing over scalable buffer elements is not supported in vectorizer."; // How many lanes of indexing are present in the index and - // buffer element type, excluding the last index. T + // buffer element type, excluding the last index. int other_index_lanes = op->buffer->dtype.lanes(); for (size_t i = 0; i < indices.size() - 1; i++) { other_index_lanes *= indices[i].dtype().lanes(); +// Only allow the last index to be scalable +ICHECK(!indices[i].dtype().is_scalable_vector()) << "Only the last index can be scalable."; } // The total number of lanes of indexing, including the last index. - int index_lanes = other_index_lanes * indices[indices.size() - 1].dtype().lanes(); + int lanes_in_last_index = indices[indices.size() - 1].dtype().get_lanes_or_vscale_factor(); + int index_lanes = other_index_lanes * lanes_in_last_index; // The total number of lanes in this store operation. Either // the index or the value will be broadcast out to this number // of lanes, depending on which has more lanes. 
- int total_lanes = std::max(index_lanes, value.dtype().lanes()); + int value_dtype_lanes = value.dtype().get_lanes_or_vscale_factor(); + bool is_last_index_scalable = indices[indices.size() - 1].dtype().is_scalable_vector(); Review Comment: nit: might be nicer to replace uses of `indices[indices.size() - 1].dtype()` with a `last_index_dtype` variable ## src/tir/transforms/vectorize_loop.cc: ## @@ -635,19 +701,22 @@ class Vectorizer : public StmtMutator, public ExprFunctora) && b.same_as(op->b)) { return GetRef(op); } else { - int lanes = std::max(a.dtype().lanes(), b.dtype().lanes()); + int a_lanes = a.dtype().get_lanes_or_vscale_factor(); + int b_lanes = b.dtype().get_lanes_or_vscale_factor(); + int lanes = std::max(a_lanes, b_lanes); if (lanes != 1) { const RampNode* b_ramp = b.as(); const RampNode* a_ramp = a.as(); -if (a.dtype().lanes() == 1 && b_ramp) { +if (!a.dtype().is_scalable_or_fixed_length_vector() && b_ramp) { Review Comment: `is_scalar`? ## tests/python/tir-transform/test_tir_transform_vectorize.py: ## @@ -64,28 +61,86 @@ def test_vectorize_vector(): assert isinstance(stmt.body.value, tvm.tir.Broadcast) -def test_vectorize_with_if(): -n = te.var("n") -x = te.var("x") -ib = tvm.tir.ir_builder.create() -A = ib.pointer("float32", name="A") -with ib.for_range(0, 4, kind="vectorize") as i: -
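The simplification suggested in the first review comment of this message can be checked exhaustively. The second disjunct, `!a && !b`, is the case where both flags are false, which equality already covers, so `(a == b) || (!a && !b)` reduces to `a == b`. A minimal sketch, with hypothetical function names standing in for the two forms of the `ICHECK` condition:

```python
from itertools import product


def original_check(t_scalable: bool, v_scalable: bool) -> bool:
    # Mirrors the shape of the condition in the diff: (a == b) || (!a && !b)
    return (t_scalable == v_scalable) or (not t_scalable and not v_scalable)


def simplified_check(t_scalable: bool, v_scalable: bool) -> bool:
    # The form proposed in the review comment: a == b
    return t_scalable == v_scalable


# Compare the two forms over every combination of the two booleans.
all_agree = all(
    original_check(a, b) == simplified_check(a, b)
    for a, b in product([False, True], repeat=2)
)
```

Both forms reject a cast between a scalable and a fixed-length vector and accept the other combinations, so the shorter expression is a drop-in replacement.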
Re: [PR] [TIR] LowerTVMBuiltin may use device_type from PrimFunc annotation [tvm]
Lunderberg commented on code in PR #16727: URL: https://github.com/apache/tvm/pull/16727#discussion_r1539352746 ## tests/python/tir-transform/test_tir_transform_lower_tvm_builtin.py: ## @@ -260,11 +260,13 @@ def expected(): class TestLowerAllocateRequiresDeviceID(tvm.testing.CompareBeforeAfter): +"""If device id is missing, error.""" + transform = tvm.tir.transform.LowerTVMBuiltin() def before(): T.func_attr({"target": T.target("llvm")}) -T.attr("dummy", "device_id", 0) +T.attr("dummy", "device_type", 2) # kDLCuda Review Comment: Good question, and it looks like it is defined in the `tvm.runtime.Device` struct. I've updated the usage here, and throughout this unit test file.
Re: [PR] [TIR] Fix segfaults from ordering of Let/Assert in MakePackedAPI [tvm]
Lunderberg commented on code in PR #16543: URL: https://github.com/apache/tvm/pull/16543#discussion_r1539344129 ## src/tir/transforms/arg_binder.cc: ## @@ -186,18 +191,8 @@ void ArgBinder::BindDLTensor(const Buffer& buffer, const PrimExpr& device_type, if (!(buffer->dtype == DataType::Int(1) || buffer->dtype == DataType::Int(4) || buffer->dtype == DataType::UInt(4))) { auto type_msg = tvm::tir::StringImm(type_err_msg.str()); -asserts_.emplace_back(AssertStmt(a_ndim == v_ndim, msg, nop)); Review Comment: Yup. The buffer's dimensionality is checked earlier, so this is entirely a duplicate check on the dimensionality.
Re: [PR] [TIR] Fix segfaults from ordering of Let/Assert in MakePackedAPI [tvm]
Lunderberg commented on code in PR #16543: URL: https://github.com/apache/tvm/pull/16543#discussion_r1539338859 ## rust/tvm-graph-rt/tests/test_tvm_basic/build.rs: ## @@ -48,10 +48,6 @@ fn main() -> Result<()> { obj_file.exists(), "Could not build tvm lib: {}", String::from_utf8(output.stderr)? -.trim() -.split("\n") -.last() -.unwrap_or("") Review Comment: Oh, that's really weird. I'm guessing it was from bouncing over to the PR branch of https://github.com/apache/tvm/pull/16183, which touched a number of the FFI bindings. I've removed this delta from the PR.
Re: [PR] [SVE] Support scalable vectors in LoopVectorizer [tvm]
Lunderberg commented on PR #16782: URL: https://github.com/apache/tvm/pull/16782#issuecomment-2020540959 The implementation looks reasonable, though I have one main question for it: What is the behavior of the updated pass for a target that doesn't support SVE? Prior SVE commits enabled the functionality, but didn't produce SVE in any of the default lowering passes. From [this line](https://github.com/apache/tvm/pull/16696/files#diff-f61b04b100f5145f2681340c81d3f2af221239594ed01e2e24896522329ce92cR598-R600), versions of LLVM before 11.0 do not support SVE, and from my brief reading of the CUDA codegen [here](https://github.com/apache/tvm/blob/main/src/target/source/codegen_cuda.cc#L253), neither does CUDA. Since `VectorizeLoop` occurs after the `BindTarget` pass, we can check the function attribute to know which target will be executing each function. I think we should have the loop vectorization apply only to fixed-extent loops by default, but enable the scalable vectorization for targets that support it.
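The gating proposed in this comment could look roughly like the sketch below. Everything in it is an assumption made for illustration: `ToyTarget`, `allow_scalable_vectorization`, and `choose_vectorization` are hypothetical names rather than TVM APIs, the LLVM version threshold is taken from the comment above, and leaving an unsupported scalable loop untouched is just one possible policy.

```python
from dataclasses import dataclass


@dataclass
class ToyTarget:
    """Hypothetical stand-in for the target attached to a function by BindTarget."""

    kind: str             # e.g. "llvm" or "cuda"
    llvm_major: int = 0   # only meaningful for LLVM-based targets
    has_sve: bool = False


def allow_scalable_vectorization(target: ToyTarget) -> bool:
    """Return True only when the target is known to handle scalable vectors."""
    if target.kind != "llvm":
        # Per the comment above, the CUDA codegen does not support SVE.
        return False
    return target.llvm_major >= 11 and target.has_sve


def choose_vectorization(target: ToyTarget, loop_extent_is_scalable: bool) -> str:
    """Pick a vectorization mode for one loop on one target."""
    if not loop_extent_is_scalable:
        return "fixed"  # fixed-extent loops are vectorized by default
    # Scalable loops are vectorized only on targets that opt in.
    return "scalable" if allow_scalable_vectorization(target) else "skip"
```

The design choice here is that fixed-extent vectorization stays the default for every target, and the scalable path is an opt-in decided per function from its bound target.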
Re: [PR] [Relax] Implement operators to inspec DLTensor::strides and offset [tvm]
Lunderberg commented on PR #16721: URL: https://github.com/apache/tvm/pull/16721#issuecomment-2020500989 > We may want to revisit the default inferred PrimStructInfo for some of these calls in the future, namely if we handle offsets/strides more systematically later, though the approach here is correct for the present. Sounds like a plan. I think the biggest use of `strides` would be in exposing a view of a tensor to a compute kernel, without requiring the entire tensor to be exposed. (e.g. Improved `R.split` legalization) That said, there are enough kernels that assume contiguous tensors, as are currently provided by Relax, that for now I'd want to keep that requirement.
Re: [PR] [Relax] Implement operators to inspec DLTensor::strides and offset [tvm]
Lunderberg merged PR #16721: URL: https://github.com/apache/tvm/pull/16721
Re: [PR] [Relax] Improve CanonicalizeBindings in DataflowVar edge case [tvm]
Lunderberg commented on code in PR #16783: URL: https://github.com/apache/tvm/pull/16783#discussion_r1539285615 ## src/relax/transform/canonicalize_bindings.cc: ## @@ -91,18 +91,20 @@ class CanonicalizePlanner : public ExprVisitor { bound_to = opt.value(); } - if (bound_var.as() || !bound_to.as()) { + if (bound_var.as() || !bound_to.as() || + !visitor.used_outside_home_dataflow_.count(bound_var)) { // Case 1: Var = Var // Case 2: DataflowVar = Var // Case 3: DataflowVar = DataflowVar +// Case 4a: Var = DataflowVar, but used outside this DataflowBlock Review Comment: Thank you, and I've updated the comment to be more explicit. I've also changed "this DataflowBlock" to "the DataflowBlock containing the binding", since this function is called after the entire function is visited, not during the visit of any specific dataflow block.
(tvm) branch main updated: [Relax] Implement operators to inspec DLTensor::strides and offset (#16721)
This is an automated email from the ASF dual-hosted git repository. lunderberg pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 8274d142a3 [Relax] Implement operators to inspec DLTensor::strides and offset (#16721) 8274d142a3 is described below commit 8274d142a3c229eb664d041c5a8034c3638f8c0f Author: Eric Lunderberg AuthorDate: Tue Mar 26 08:55:10 2024 -0500 [Relax] Implement operators to inspec DLTensor::strides and offset (#16721) * [TIR] LowerTVMBuiltin may use device_type from PrimFunc annotation If an allocation occurs within a host function, it may not have a device/host split. * lint fix * [Relax] Implement operators to inspec DLTensor::strides and offset A follow-up PR to https://github.com/apache/tvm/pull/16563. This PR implements similar operators to inspect the runtime values of `DLTensor::strides` and `DLTensor::byte_offset`. In addition, while the element offset is not explicitly present in the `DLTensor` struct, a Relax operator is implemented to infer it from the `byte_offset` and `data_type` fields, for use when interacting with the TIR `BufferNode::elem_offset` field. 
--- python/tvm/relax/expr.py | 97 .../tvm/relax/transform/legalize_ops/__init__.py | 1 + .../tvm/relax/transform/legalize_ops/inspect_op.py | 128 +++ src/relax/op/tensor/inspect.cc | 180 --- src/relax/op/tensor/inspect.h | 39 src/tir/transforms/lower_tvm_builtin.cc| 36 ++- tests/python/relax/test_op_inspect.py | 252 + tests/python/relax/test_op_unpack.py | 127 --- .../test_tir_transform_lower_tvm_builtin.py| 37 ++- 9 files changed, 727 insertions(+), 170 deletions(-) diff --git a/python/tvm/relax/expr.py b/python/tvm/relax/expr.py index 12f08f4dbf..4dca710e77 100644 --- a/python/tvm/relax/expr.py +++ b/python/tvm/relax/expr.py @@ -280,6 +280,33 @@ class ExprWithOp(Expr, Scriptable): self._check_for_tensor_struct_info() return _DLTensorShapeProxy(self) +@property +def strides(self) -> "_DLTensorStrideProxy": +"""Returns a proxy object for accessing DLTensor::strides""" +self._check_for_tensor_struct_info() +return _DLTensorStrideProxy(self) + +@property +def byte_offset(self) -> "Expr": +"""Returns a proxy object for accessing DLTensor::byte_offset""" +self._check_for_tensor_struct_info() +op = tvm.ir.Op.get("relax.inspect.tensor_byte_offset") +return tvm.relax.Call(op, [self]) + +@property +def elem_offset(self) -> "Expr": +"""Returns a proxy object for accessing a DLTensor's elem_offset + +This parameter is not stored in the DLTensor, but is instead +derived from the DLTensor's byte offset and datatype. This is +exposed in Relax for ease of use, and for translation into the +`tir::BufferNode::elem_offset` field when interacting with TIR +buffers. 
+""" +self._check_for_tensor_struct_info() +op = tvm.ir.Op.get("relax.inspect.tensor_elem_offset") +return tvm.relax.Call(op, [self]) + class _DLTensorDTypeProxy(tvm.runtime.ObjectGeneric): """A proxy object for unpacking DLDatatype from DLTensor @@ -431,6 +458,76 @@ class _DLTensorShapeProxy(tvm.runtime.ObjectGeneric): return tvm.relax.Call(op, [self.tensor, axis]) +class _DLTensorStrideProxy(tvm.runtime.ObjectGeneric): +"""A proxy object for unpacking the strides from DLTensor + +Exposes accessors for the `DLTensor::strides` field. Accessing +these fields will produce `relax.Call` expressions, representing +the field's runtime value. If the datatype of the tensor is known +at compile-time, the `relax.Call` will be normalized into a +`relax.PrimValue`, with no runtime cost. + +Parameters +-- +tensor: relax.Expr + +The relax tensor (or a variable referring to a relax tensor), +whose runtime strides is being inspected. +""" + +def __init__(self, tensor): +self.tensor = tensor + +def asobject(self): +"""Provide expected in error message + +This method is called when `_DLTensorStrideProxy` is used in a +context that requires a `relax.Expr`. This usage is not +supported, and raising an error here can provide suggested +fixes that are not present in the default error message from +`tvm.runtime.convert_to_object`. +""" +raise TypeError( +f"{self.tensor}.strides cannot be converted to a relax expression, " +f"and should be used as a proxy object to access the runtime strides of the DLTensor. " +f"The DLTensor::ndim
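The commit message above says the element offset is not stored in the `DLTensor` but is inferred from its `byte_offset` and data type. The arithmetic can be sketched as follows. This is an assumption about the computation, not a copy of the legalization in `inspect_op.py`: the helper name is hypothetical, and the rounding and alignment check are choices made for this example.

```python
def elem_offset_from_byte_offset(byte_offset: int, bits: int, lanes: int = 1) -> int:
    """Convert a byte offset into an offset counted in elements.

    `bits` and `lanes` correspond to the fields of a DLDataType.
    """
    # Bytes per element, rounded up for sub-byte types.
    bytes_per_elem = (bits * lanes + 7) // 8
    if byte_offset % bytes_per_elem != 0:
        raise ValueError(
            f"byte_offset {byte_offset} is not a multiple of the "
            f"element size {bytes_per_elem}"
        )
    return byte_offset // bytes_per_elem


# A float32 tensor whose data starts 64 bytes into its allocation
# begins at element 16.
float32_offset = elem_offset_from_byte_offset(64, bits=32)
```

This is the quantity that would be handed to the TIR `BufferNode::elem_offset` field mentioned in the commit message.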
(tvm) branch p0-install-testing-infra deleted (was 02c6cad99e)
This is an automated email from the ASF dual-hosted git repository. lukhut pushed a change to branch p0-install-testing-infra in repository https://gitbox.apache.org/repos/asf/tvm.git was 02c6cad99e [SME][Docker] Add Fixed Virtual Platform (FVP) and toolchain install The revisions that were on this branch are still contained in other references; therefore, this change does not discard any commits from the repository.
Re: [PR] [Relax] Improve CanonicalizeBindings in DataflowVar edge case [tvm]
Lunderberg commented on code in PR #16783: URL: https://github.com/apache/tvm/pull/16783#discussion_r1539278031 ## src/relax/transform/canonicalize_bindings.cc: ## @@ -91,18 +91,20 @@ class CanonicalizePlanner : public ExprVisitor { bound_to = opt.value(); } - if (bound_var.as() || !bound_to.as()) { + if (bound_var.as() || !bound_to.as() || + !visitor.used_outside_home_dataflow_.count(bound_var)) { // Case 1: Var = Var // Case 2: DataflowVar = Var // Case 3: DataflowVar = DataflowVar +// Case 4a: Var = DataflowVar, but used outside this DataflowBlock // // For these three cases, the trivial binding can be Review Comment: Off-by-one errors, my ~~two~~ one nemesis! (Thank you, and fixed.)
Re: [PR] [Relax] Allow composition of DFPattern replacements [tvm]
Lunderberg commented on PR #16732: URL: https://github.com/apache/tvm/pull/16732#issuecomment-2020480480 The prerequisite PR https://github.com/apache/tvm/pull/16730 has landed, so this PR is now rebased on top of `main` and marked as ready. Thank you @slyubomirsky for the review; I think it's just waiting on CI now.
Re: [PR] [Web] Add `kv_state` and `rnn_state` to wasm_runtime [tvm]
CharlieFRuan commented on PR #16791: URL: https://github.com/apache/tvm/pull/16791#issuecomment-2020477526 Thank you so much @Hzfengsy!
Re: [PR] [Relax] Allow composition of DFPattern replacements [tvm]
Lunderberg commented on code in PR #16732: URL: https://github.com/apache/tvm/pull/16732#discussion_r1539272553 ## src/relax/ir/dataflow_matcher.cc: ## @@ -1140,34 +1071,173 @@ class PatternRewriter : ExprMutator { return block; } - /*! \brief The pattern for rewriting call nodes */ - Optional pattern_; /*! \brief The pattern constraint contexts for rewriting dataflow blocks */ - Optional ctx_; + PatternContext ctx_; /*! * \brief The user-provided rewriter function. Its signature and semantics are: - * - (Call, Map) -> Call for call node rewriting. Given the matched - *call node and the map of patterns and matched expressions, it should return a new call node - *to replace the original one or the original matched call node as is. - * - (Map, Map) -> Map for dataflow block rewriting. - *Given the map of patterns and corresponding variables (bound variables or parameters), - *it should return a map that specifies new values for matched bound variables. It can refer + * + * - (Map, Map) -> Map + * + *Given the map of patterns and corresponding variables (bound + *variables or parameters), it should return a map that + *specifies new values for matched bound variables. It can refer *to the passed bindings to create the replacement expressions. */ - PackedFunc rewriter_func_; - std::unordered_set params_; + TypedPackedFunc(Map, Map)> rewriter_func_; +}; + +/*! + * \brief Apply pattern matching to each expression, replacing + * matches with the output of a user-provided rewriter function. 
+ */ +class ExprPatternRewriter : ExprMutator { + public: + using ExprMutator::VisitBindingBlock_; + using ExprMutator::VisitExpr_; + + ExprPatternRewriter(DFPattern pat, + TypedPackedFunc)> rewriter_func) + : pattern_(pat), rewriter_func_(rewriter_func) {} + + template + static Function Run(PatternType pat, + TypedPackedFunc)> rewriter_func, + Function func) { +ExprPatternRewriter rewriter(pat, rewriter_func); +func = Downcast(rewriter(func)); +func = Downcast(RemoveAllUnused(func)); +return func; + } + + Expr VisitExpr_(const SeqExprNode* seq) override { +auto cache = bindings_; +SeqExpr prev = GetRef(seq); + +StructuralEqual struct_equal; + +while (true) { + SeqExpr next = Downcast(builder_->Normalize(ExprMutator::VisitExpr_(prev.get(; + if (struct_equal(prev, next)) { +return std::move(next); + } + + // Canonicalization may result in two previously-different + // expressions being recognized as identical. Elimination of + // common subexpressions may result in trival var-to-var + // bindings that can be canonicalized. Therefore, iterate the + // simplification steps until converged. + while (true) { +auto start_of_loop = next; +next = Downcast(CanonicalizeBindings(next)); +next = Downcast(EliminateCommonSubexpr(next)); +next = Downcast(RemoveAllUnused(next)); +if (struct_equal(start_of_loop, next)) { + break; +} + } + + if (struct_equal(prev, next)) { +return std::move(next); + } + + // Reset all knowledge of bindings that were collected from + // this SeqExpr. The collected bindings are only after + // the point where they were collected, and we are repeating + // the mutation of this SeqExpr. 
+ bindings_ = cache; + prev = next; +} + } + + void VisitBinding_(const VarBindingNode* binding) override { +auto expr = VisitExpr(binding->value); +bindings_.Set(binding->var, expr); +ReEmitBinding(binding, expr); + } + + Expr VisitExpr(const Expr& expr) override { +auto node = ExprMutator::VisitExpr(expr); + +std::vector matches_top_level; +if (auto rewritten = TryRewrite(node, pattern_, _top_level)) { + return builder_->Normalize(rewritten.value()); +} + +return node; + } + + private: + Optional TryRewrite(const Expr& expr, const DFPattern& pattern, +std::vector* matches_top_level) { +ICHECK(matches_top_level); + +// Special handling if the user-supplied pattern is a `OrPattern`. +// While the `ExtractMatchedExpr` can handle match the Review Comment: Whoops, this was a typo. It should be "handle matching", and I've updated the PR with the correction. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
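The inner loop in `ExprPatternRewriter::VisitExpr_` above repeats a group of simplification steps until one full round leaves the result unchanged. The same control flow can be sketched on a toy value. The names below are hypothetical, and the two steps only mimic how one simplification can expose work for another; they are not the Relax passes.

```python
from typing import Callable, Sequence, Tuple

Value = Tuple[int, ...]


def iterate_to_fixed_point(
    value: Value, steps: Sequence[Callable[[Value], Value]], max_iters: int = 100
) -> Value:
    """Apply every step in order, repeating until a full round changes nothing."""
    for _ in range(max_iters):
        start_of_loop = value
        for step in steps:
            value = step(value)
        if value == start_of_loop:
            return value
    raise RuntimeError("simplification did not converge")


def dedupe(xs: Value) -> Value:
    # Plays the role of common-subexpression elimination: drop duplicates.
    return tuple(sorted(set(xs)))


def halve_evens(xs: Value) -> Value:
    # Plays the role of canonicalization: may make distinct entries identical.
    return tuple(x // 2 if x % 2 == 0 else x for x in xs)


result = iterate_to_fixed_point((8, 4, 3), [dedupe, halve_evens])
```

Halving can create new duplicates and removing duplicates can leave new even entries in place, so a single round is not enough; the loop stops only once a complete round is a no-op, which matches the `struct_equal(start_of_loop, next)` check in the diff.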
Re: [PR] [Relax] Refactor PatternRewriter into separate Block/Expr mutators [tvm]
Lunderberg merged PR #16730: URL: https://github.com/apache/tvm/pull/16730
(tvm) branch main updated: [Relax] Refactor PatternRewriter into separate Block/Expr mutators (#16730)
This is an automated email from the ASF dual-hosted git repository. lunderberg pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 016b512ad4 [Relax] Refactor PatternRewriter into separate Block/Expr mutators (#16730) 016b512ad4 is described below commit 016b512ad4950cba32eaf81be0cfe3c0321851f7 Author: Eric Lunderberg AuthorDate: Tue Mar 26 08:43:36 2024 -0500 [Relax] Refactor PatternRewriter into separate Block/Expr mutators (#16730) Prior to this commit, the `PatternRewriter` mutator handled pattern rewriting at either the expression level (`rewrite_call`) or the dataflow block level (`rewrite_bindings`). These two functionalities had different external APIs, defined different member variables, and visited different IR nodes. In effect, it had two entirely independent implementations, which just happened to be implemented within the same class. This commit refactors the single `PatternRewriter` mutator into separate `BlockPatternRewriter` and `ExprPatternRewriter` mutators. --- include/tvm/relax/dataflow_matcher.h | 4 +- src/relax/ir/dataflow_matcher.cc | 238 --- 2 files changed, 140 insertions(+), 102 deletions(-) diff --git a/include/tvm/relax/dataflow_matcher.h b/include/tvm/relax/dataflow_matcher.h index bbc8e9382e..8f2024f264 100644 --- a/include/tvm/relax/dataflow_matcher.h +++ b/include/tvm/relax/dataflow_matcher.h @@ -67,7 +67,9 @@ TVM_DLL Optional> MatchGraph(const PatternContext& ctx, * \param f The function to rewrite * \return The rewritten or the input function, depending on the pattern matching result. */ -TVM_DLL Function RewriteBindings(const PatternContext& ctx, PackedFunc rewriter, Function f); +TVM_DLL Function RewriteBindings( +const PatternContext& ctx, +TypedPackedFunc(Map, Map)> rewriter, Function f); /** * \brief Rewrite a function with the given pattern and the rewriter function.
diff --git a/src/relax/ir/dataflow_matcher.cc b/src/relax/ir/dataflow_matcher.cc index a14d43f6d3..531971d3db 100644 --- a/src/relax/ir/dataflow_matcher.cc +++ b/src/relax/ir/dataflow_matcher.cc @@ -973,102 +973,33 @@ TVM_REGISTER_GLOBAL("relax.dpl.match_dfb") }); /*! - * \brief Apply pattern matching to each call node and dataflow block, and replace matching ones + * \brief Apply pattern matching to each dataflow block, replacing matches * with the output of a user-provided rewriter function. */ -class PatternRewriter : ExprMutator { +class BlockPatternRewriter : ExprMutator { public: using ExprMutator::VisitBindingBlock_; using ExprMutator::VisitExpr_; - PatternRewriter(DFPattern pat, PackedFunc rewriter_func, - const std::unordered_set& params) - : pattern_(pat), rewriter_func_(rewriter_func), params_(params) {} - - PatternRewriter(const PatternContext& ctx, PackedFunc rewriter_func, - const std::unordered_set& params) - : ctx_(ctx), rewriter_func_(rewriter_func), params_(params) {} + BlockPatternRewriter( + const PatternContext& ctx, + TypedPackedFunc(Map, Map)> rewriter_func) + : ctx_(ctx), rewriter_func_(rewriter_func) {} template - static Function Run(PatternType pat, PackedFunc rewriter_func, Function f) { -std::unordered_set params; -for (const auto& p : f->params) { - params.insert(p.get()); -} -PatternRewriter rewriter(pat, rewriter_func, params); -return Downcast(RemoveAllUnused(rewriter.VisitExpr(f))); - } - - Expr VisitExpr_(const SeqExprNode* seq) override { -if (ctx_) { - return ExprMutator::VisitExpr_(seq); -} - -auto cache = bindings_; -SeqExpr prev = GetRef(seq); - -StructuralEqual struct_equal; - -while (true) { - SeqExpr next = Downcast(builder_->Normalize(ExprMutator::VisitExpr_(prev.get(; - if (struct_equal(prev, next)) { -return std::move(next); - } - - // Canonicalization may result in two previously-different - // expressions being recognized as identical. 
Elimination of - // common subexpressions may result in trival var-to-var - // bindings that can be canonicalized. Therefore, iterate the - // simplification steps until converged. - while (true) { -auto start_of_loop = next; -next = Downcast(CanonicalizeBindings(next)); -next = Downcast(EliminateCommonSubexpr(next)); -next = Downcast(RemoveAllUnused(next)); -if (struct_equal(start_of_loop, next)) { - break; -} - } - - if (struct_equal(prev, next)) { -return std::move(next); - } - - // Reset all knowledge of bindings that were collected from - // this DataflowBlock. The collected bindings are only after - // the point where they were collected, and we are repeating - // the mutation of
Re: [PR] [Bugfix][Cutlass] Remove a typo in cutlass build [tvm]
Lunderberg merged PR #16789: URL: https://github.com/apache/tvm/pull/16789 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [IR][Relax] Improve highlighting in assert_structural_equal [tvm]
Lunderberg commented on PR #16756: URL: https://github.com/apache/tvm/pull/16756#issuecomment-2020467097 And the additional unit test is added in https://github.com/apache/tvm/pull/16796.
(tvm) branch main updated (bf2d43e314 -> bcfbcabff8)
This is an automated email from the ASF dual-hosted git repository. lunderberg pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from bf2d43e314 [IR][Relax] Improve highlighting in assert_structural_equal (#16756) add bcfbcabff8 [Bugfix][Cutlass] Remove a typo in cutlass build (#16789) No new revisions were added by this update. Summary of changes: python/tvm/contrib/cutlass/build.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[PR] [Relax] Unit-test for structural equal of recursive function [tvm]
Lunderberg opened a new pull request, #16796: URL: https://github.com/apache/tvm/pull/16796 A follow-up PR to https://github.com/apache/tvm/pull/16756, adding an explicit unit test for `tvm.ir.assert_structural_equal` of two distinct recursive functions.
Re: [PR] [TIR] Modify IntImmNode deep_equal to match regardless of type [tvm]
quic-sanirudh commented on PR #16795: URL: https://github.com/apache/tvm/pull/16795#issuecomment-2020458701 > In this particular case (deep equality), I think the type does matter, so it would be better to fix the cases that depend on the relaxed behavior. > > I know we had some i64/i32 issues, and the general rule of thumb now is to be as explicit as possible, which helps to reduce errors. Oh okay, thanks for the feedback @tqchen. We started seeing cases where some expressions were not getting simplified properly after [`RampNode` lanes were changed to PrimExpr](https://github.com/apache/tvm/commit/a6157a6369c184b6fa5f66654feb685e58726737#diff-046cdcb6494a6719465080bb9156cd4620828af4b18f7018e5b443d6c7c1c1d0L792-R792). I narrowed the failing simplification down to a [rewrite simplify rule here](https://github.com/apache/tvm/blob/main/src/arith/rewrite_simplify.cc#L401). The simplification was not happening because the lanes of the broadcast and of the RampNode had different types in this case. A couple of other solutions I thought would apply here are either to fix the RampNode constructor to stick to some fixed dtype for lanes (something like int32/int16, since `DLDataType` anyways only supports int16 dtype), or to update the simplify rules here to try the same rules with a `PVar` lanes type in case of fixed length vectors. But if dtype does matter, then should we update the `PEqualChecker` to also check for dtypes?
(tvm) branch main updated: [IR][Relax] Improve highlighting in assert_structural_equal (#16756)
This is an automated email from the ASF dual-hosted git repository. lunderberg pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new bf2d43e314 [IR][Relax] Improve highlighting in assert_structural_equal (#16756) bf2d43e314 is described below commit bf2d43e314ca7e682ae26dca70ada657054f8786 Author: Eric Lunderberg AuthorDate: Tue Mar 26 08:26:52 2024 -0500 [IR][Relax] Improve highlighting in assert_structural_equal (#16756) * [IR][Relax] Improve highlighting in assert_structural_equal Prior to this commit, `tvm.ir.assert_structural_equal` would highlight an entire `relax::BindingBlock` if the number of elements in the binding block differs. This can result in the entire Relax function being highlighted, making it difficult to identify the location of the mismatch. This commit makes the following changes, to improve the error messages that occur when `tvm.ir.assert_structural_equal` raises an exception. - In `"node.StructuralEqual"`, set `defer_fails = true` when `assert_mode` is true. This highlights the first mismatch of an `Array`, rather than the entire array, in cases where the LHS and RHS have different sizes. - In the `SHashReduce` for `VarBinding` and `MatchCast`, visit the value first, and then the variable to which it is bound. This highlights the mismatched expression, rather than mismatches in the resulting struct info. - In `SEqualHandlerDefault::Impl::SEqualReduce`, defer the failure if enabled. This highlights the first mismatch, which may also have been deferred, rather than an early return a later mismatch occurs involving `NullOpt`. 
* DeferFail should follow assert_mode * Handle recursively defined lambda functions --- include/tvm/relax/expr.h | 24 --- src/node/structural_equal.cc | 45 +++- src/relax/ir/expr.cc | 50 +++ tests/python/relax/test_utils.py | 63 +++- 4 files changed, 149 insertions(+), 33 deletions(-) diff --git a/include/tvm/relax/expr.h b/include/tvm/relax/expr.h index 4634d1e228..40707675fe 100644 --- a/include/tvm/relax/expr.h +++ b/include/tvm/relax/expr.h @@ -780,18 +780,8 @@ class MatchCastNode : public BindingNode { v->Visit("span", ); } - bool SEqualReduce(const MatchCastNode* other, SEqualReducer equal) const { -// NOTE: pattern can contain ShapeExpr which defines the vars -return equal.DefEqual(var, other->var) && equal.DefEqual(struct_info, other->struct_info) && - equal(value, other->value); - } - - void SHashReduce(SHashReducer hash_reduce) const { -// NOTE: pattern can contain ShapeExpr which defines the vars -hash_reduce.DefHash(var); -hash_reduce.DefHash(struct_info); -hash_reduce(value); - } + bool SEqualReduce(const MatchCastNode* other, SEqualReducer equal) const; + void SHashReduce(SHashReducer hash_reduce) const; static constexpr const char* _type_key = "relax.expr.MatchCast"; static constexpr const bool _type_has_method_sequal_reduce = true; @@ -822,13 +812,9 @@ class VarBindingNode : public BindingNode { v->Visit("span", ); } - bool SEqualReduce(const VarBindingNode* other, SEqualReducer equal) const { -return equal.DefEqual(var, other->var) && equal(value, other->value); - } - void SHashReduce(SHashReducer hash_reduce) const { -hash_reduce.DefHash(var); -hash_reduce(value); - } + bool SEqualReduce(const VarBindingNode* other, SEqualReducer equal) const; + void SHashReduce(SHashReducer hash_reduce) const; + static constexpr const char* _type_key = "relax.expr.VarBinding"; static constexpr const bool _type_has_method_sequal_reduce = true; static constexpr const bool _type_has_method_shash_reduce = true; diff --git a/src/node/structural_equal.cc 
b/src/node/structural_equal.cc index 66a347f6b8..e0de514122 100644 --- a/src/node/structural_equal.cc +++ b/src/node/structural_equal.cc @@ -27,6 +27,7 @@ #include #include +#include #include #include "ndarray_hash_equal.h" @@ -249,15 +250,30 @@ class SEqualHandlerDefault::Impl { // in which case we can use same_as for quick checking, // or we have to run deep comparison and avoid to use same_as checks. auto run = [=]() { - if (!lhs.defined() && !rhs.defined()) return true; - if (!lhs.defined() && rhs.defined()) return false; - if (!rhs.defined() && lhs.defined()) return false; - if (lhs->type_index() != rhs->type_index()) return false; - auto it = equal_map_lhs_.find(lhs); - if (it != equal_map_lhs_.end()) { -return it->second.same_as(rhs); + std::optional early_result = [&]() -> std::optional { +if (!lhs.defined() && !rhs.defined()) return true; +
Re: [PR] [IR][Relax] Improve highlighting in assert_structural_equal [tvm]
Lunderberg commented on PR #16756: URL: https://github.com/apache/tvm/pull/16756#issuecomment-2020428978 > The only change that might be warranted would be a test case of a recursive function that does not match. Good call, and I'll add one in a follow-up PR. The failure mode that occurred for the recursive functions was an error being thrown for use of an undefined variable. Triggering it only required performing any comparison at all, which is tested in `test_structural_equal_with_recursive_lambda_function`.
Re: [PR] [IR][Relax] Improve highlighting in assert_structural_equal [tvm]
Lunderberg merged PR #16756: URL: https://github.com/apache/tvm/pull/16756
Re: [PR] [Disco] Propagate structlog/logging config to workers [tvm]
Lunderberg commented on PR #16715: URL: https://github.com/apache/tvm/pull/16715#issuecomment-2020421377 > This seems straightforward enough, though I'm not aware of the wider context. Thank you. For context, some applications use `structlog` to provide more flexible logging than Python's stdlib `logging`. The `structlog` configuration determines what pre-processing is done for the log statements (e.g. appending contextual information to a log statement). When starting child processes using `multiprocessing`, it would be useful for the child processes to format/save their logs in the same manner as the parent, but this doesn't occur by default. In [PR#16618](https://github.com/apache/tvm/pull/16618), I added handling to forward the `structlog` configuration from the main process to the `tvm.runtime.disco` worker processes. However, some configurations ([example from `structlog`'s documentation](https://www.structlog.org/en/stable/standard-library.html#rendering-using-structlog-based-formatters-within-logging)) integrate `structlog` with the stdlib `logging`. The previous implementation only forwarded the configuration held by `structlog`, and didn't forward the configuration within the stdlib `logging`. This PR closes that gap.
Re: [PR] [SLM] Add unit tests for SLM to Relax exporter [tvm]
Lunderberg commented on PR #16784: URL: https://github.com/apache/tvm/pull/16784#issuecomment-2020396730 > It's good to have test cases, especially when they explain the intended functionality very well. A tutorial based on these examples (perhaps literally generated from the test cases) might be a good investment of time as well, especially if it can be made more visible. Thank you, and that was in part my goal with the sequence of unit tests. Whenever possible, I prefer test cases that double as mini-tutorials, since then they are less likely to become stale than full tutorials. (Though I agree that it would be beneficial to have a tutorial that follows a user-focused flow, rather than the feature-focused flow of these unit tests.)
Re: [PR] [SLM] Add unit tests for SLM to Relax exporter [tvm]
Lunderberg commented on code in PR #16784: URL: https://github.com/apache/tvm/pull/16784#discussion_r1539197565 ## tests/python/relax/test_frontend_nn_exporter.py: ## @@ -0,0 +1,632 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +import pytest + +import tvm +import tvm.testing + +from tvm import relax, tir +from tvm.ir import assert_structural_equal +from tvm.relax.frontend import nn +from tvm.script import ir as I, relax as R, tir as T + + +def test_simple(): +"""A module may be exported from nn.Module to Relax""" + +slm_mod = nn.modules.ReLU() +exported_mod, _ = slm_mod.export_tvm( +spec={"forward": {"x": nn.spec.Tensor((3, 3), "float32")}}, +debug=False, +) + +@I.ir_module +class Expected: +@R.function +def forward(x: R.Tensor([3, 3], dtype="float32")): +R.func_attr({"num_input": 1}) +with R.dataflow(): +relu = R.nn.relu(x) +relu = relu +R.output(relu) +return relu + +assert_structural_equal(exported_mod, Expected) + + +def test_custom_module(): +"""A module may be exported from nn.Module to Relax""" Review Comment: Good call, and I've updated the docstring to remove the copy/paste duplicate.
Re: [PR] [TIR] Modify IntImmNode deep_equal to match regardless of type [tvm]
tqchen commented on PR #16795: URL: https://github.com/apache/tvm/pull/16795#issuecomment-2020375413 In this particular case, I think the type does matter, so it would be better to fix the cases that depend on the relaxed behavior.
Re: [PR] [Analysis] Allow calls to GlobalVar in @R.function [tvm]
Lunderberg merged PR #16778: URL: https://github.com/apache/tvm/pull/16778
(tvm) branch main updated: [Analysis] Allow calls to GlobalVar in @R.function (#16778)
This is an automated email from the ASF dual-hosted git repository. lunderberg pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 72f0326a88 [Analysis] Allow calls to GlobalVar in @R.function (#16778) 72f0326a88 is described below commit 72f0326a889b60a146fb51aca4041abf0fb0fbb9 Author: Eric Lunderberg AuthorDate: Tue Mar 26 08:03:33 2024 -0500 [Analysis] Allow calls to GlobalVar in @R.function (#16778) * [Analysis] Allow calls to GlobalVar in @R.function Prior to this commit, the post-parsing well-formed check performed by TVMScript allowed a call to `GlobalVar` in a `@R.function`, but only if it occurred within the context of a `@I.ir_module`. If `@R.function` appeared on its own, calls to a `GlobalVar` would be treated as calls to an undefined function. * Use appropriate well-formed checks for TIR/Relax functions * Lint fix * Import order fix --- include/tvm/relax/analysis.h| 6 ++-- python/tvm/relax/analysis/analysis.py | 8 ++--- python/tvm/script/parser/core/entry.py | 26 +- src/relax/analysis/well_formed.cc | 47 ++--- tests/python/relax/test_analysis_well_formed.py | 34 ++ tests/python/relax/test_tvmscript_parser.py | 37 +++ 6 files changed, 122 insertions(+), 36 deletions(-) diff --git a/include/tvm/relax/analysis.h b/include/tvm/relax/analysis.h index 0c43732813..fa928d082d 100644 --- a/include/tvm/relax/analysis.h +++ b/include/tvm/relax/analysis.h @@ -547,15 +547,15 @@ TVM_DLL bool ContainsImpureCall(const Expr& expr, /*! * \brief Check if the IRModule is well formed. * - * \param m the IRModule to check. + * \param obj The IRModule or relax::Function to check. * \param check_struct_info A boolean flag indicating if the property "every Expr * must have defined structure info" will be checked. - * \return true if the IRModule is well formed, false if not. + * \return true if the object is well formed, false if not. 
* \note By default the structure info is always checked. It is only in test cases * where `check_struct_info` might be false, so that other well-formed requirements * will be well tested and will not be blocked by not having structure info. */ -TVM_DLL bool WellFormed(IRModule m, bool check_struct_info = true); +TVM_DLL bool WellFormed(Variant obj, bool check_struct_info = true); /*! * \brief Using the layout transforms on the outputs, suggest layout transformation on the blocks diff --git a/python/tvm/relax/analysis/analysis.py b/python/tvm/relax/analysis/analysis.py index 83286c0980..e6eaff3711 100644 --- a/python/tvm/relax/analysis/analysis.py +++ b/python/tvm/relax/analysis/analysis.py @@ -434,13 +434,13 @@ def remove_all_unused(func: Function) -> Function: return _ffi_api.remove_all_unused(func) # type: ignore -def well_formed(mod: IRModule, check_struct_info: bool = True) -> bool: +def well_formed(obj: Union[IRModule, Function], check_struct_info: bool = True) -> bool: """Check if the IRModule is well formed. Parameters -- -mod : tvm.IRModule -The input IRModule. +obj : Union[tvm.IRModule, Function] +The input IRModule or relax.Function. check_struct_info : bool A boolean flag indicating if the property "every Expr must @@ -457,7 +457,7 @@ def well_formed(mod: IRModule, check_struct_info: bool = True) -> bool: where `check_struct_info` might be false, so that other well-formed requirements will be well tested and will not be blocked by not having structure info. 
""" -return _ffi_api.well_formed(mod, check_struct_info) # type: ignore +return _ffi_api.well_formed(obj, check_struct_info) # type: ignore def _get_prim_func_default_dtype(func: PrimFunc): diff --git a/python/tvm/script/parser/core/entry.py b/python/tvm/script/parser/core/entry.py index 0c88cacf8a..e7a7f98b76 100644 --- a/python/tvm/script/parser/core/entry.py +++ b/python/tvm/script/parser/core/entry.py @@ -18,6 +18,7 @@ import inspect from typing import Any, Dict, Union +import tvm from ir.module import IRModule from ...ir_builder import IRBuilder from . import doc @@ -34,12 +35,19 @@ WELL_FORMED_ERROR_MESSAGE = ( def _default_globals() -> Dict[str, Any]: -import tvm # pylint: disable=import-outside-toplevel from tvm.script.parser import ir # pylint: disable=import-outside-toplevel from tvm.script.parser import relax # pylint: disable=import-outside-toplevel from tvm.script.parser import tir # pylint: disable=import-outside-toplevel -extra_vars = {"tvm": tvm, "I": ir, "ir": ir, "T": tir, "tir": tir, "R": relax, "relax": relax} +extra_vars = { +"tvm": tvm, +"I": ir, +"ir": ir, +"T": tir, +
(tvm) branch main updated: [Codegen, Cuda] Add overload for fp8x4 e5m2 <-> half4 conversion (#16787)
This is an automated email from the ASF dual-hosted git repository. tqchen pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new ae7b8d9aed [Codegen, Cuda] Add overload for fp8x4 e5m2 <-> half4 conversion (#16787) ae7b8d9aed is described below commit ae7b8d9aeddd81c862e03255b7628bf5932c24ec Author: Wuwei Lin AuthorDate: Tue Mar 26 05:58:18 2024 -0700 [Codegen, Cuda] Add overload for fp8x4 e5m2 <-> half4 conversion (#16787) --- src/target/source/literal/cuda_half_t.h | 23 ++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/src/target/source/literal/cuda_half_t.h b/src/target/source/literal/cuda_half_t.h index bf3e83928e..27d44d9f7f 100644 --- a/src/target/source/literal/cuda_half_t.h +++ b/src/target/source/literal/cuda_half_t.h @@ -410,7 +410,28 @@ struct __align__(8) half4 { result.__x = (static_cast<__uint32_t>(lo_part.__x) | (static_cast<__uint32_t>(hi_part.__x) << 16)); return result; - })"; + } + __host__ __device__ explicit half4(const __nv_fp8x4_e5m2& fp8x4) { +__nv_fp8x2_e5m2 lo_part, hi_part; +lo_part.__x = static_cast<__nv_fp8x2_storage_t>(fp8x4.__x & 0xFFFF); +hi_part.__x = static_cast<__nv_fp8x2_storage_t>((fp8x4.__x >> 16) & 0xFFFF); +__half2 lo_half2 = static_cast<__half2>(lo_part); +__half2 hi_half2 = static_cast<__half2>(hi_part); +x = reinterpret_cast<__half*>(&lo_half2)[0]; +y = reinterpret_cast<__half*>(&lo_half2)[1]; +z = reinterpret_cast<__half*>(&hi_half2)[0]; +w = reinterpret_cast<__half*>(&hi_half2)[1]; + } + __host__ __device__ explicit operator __nv_fp8x4_e5m2() const { +__nv_fp8x4_e5m2 result; +__half2 lo_half2 = *reinterpret_cast<const __half2*>(&x); +__half2 hi_half2 = *reinterpret_cast<const __half2*>(&z); +__nv_fp8x2_e5m2 lo_part(lo_half2), hi_part(hi_half2); +result.__x = +(static_cast<__uint32_t>(lo_part.__x) | (static_cast<__uint32_t>(hi_part.__x) << 16)); +return result; + } + )"; } stream << R"( };
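The overloads in this diff move a 32-bit packed word through two 16-bit halves: shift and mask to split, shift and OR to recombine. The bit manipulation alone can be sketched in Python as below. This is an illustration only, assuming the mask selects the low 16 bits; `split_u32` and `join_u32` are hypothetical helper names, and no fp8 or half-precision conversion is modelled.

```python
from typing import Tuple

MASK16 = 0xFFFF  # selects one 16-bit half of a 32-bit word


def split_u32(word: int) -> Tuple[int, int]:
    """Return the (low, high) 16-bit halves of a 32-bit word."""
    return word & MASK16, (word >> 16) & MASK16


def join_u32(lo: int, hi: int) -> int:
    """Recombine two 16-bit halves into one 32-bit word."""
    return (lo & MASK16) | ((hi & MASK16) << 16)


packed = 0xAABBCCDD  # four bytes standing in for four packed 8-bit values
lo, hi = split_u32(packed)
print(hex(lo), hex(hi))             # low half holds the first two bytes
print(hex(join_u32(lo, hi)))        # round-trips to the original word
```

The low half carries the first two elements and the high half the last two, which mirrors how `x`/`y` and `z`/`w` are paired in the diff.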
Re: [PR] [Fix] Fix build errors with VS2022 [tvm]
tqchen commented on PR #16790: URL: https://github.com/apache/tvm/pull/16790#issuecomment-2020360881 Thank you @Jiawei-Shao
Re: [PR] [Fix] Fix build errors with VS2022 [tvm]
tqchen merged PR #16790: URL: https://github.com/apache/tvm/pull/16790
Re: [PR] [Codegen, Cuda] Add overload for fp8x4 e5m2 <-> half4 conversion [tvm]
tqchen merged PR #16787: URL: https://github.com/apache/tvm/pull/16787
(tvm) branch main updated (b2204ae698 -> 69c091400a)
This is an automated email from the ASF dual-hosted git repository. tqchen pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from b2204ae698 [IR] Default to empty attributes, instead of NULL (#16745) add 69c091400a [Fix] Fix build errors with VS2022 (#16790) No new revisions were added by this update. Summary of changes: src/runtime/metadata.cc | 3 +-- src/tir/analysis/identify_memcpy.cc | 2 +- src/tir/contrib/ethosu/passes.cc| 2 +- 3 files changed, 3 insertions(+), 4 deletions(-)
[PR] [TIR] Modify IntImmNode deep_equal to match regardless of type [tvm]
quic-sanirudh opened a new pull request, #16795: URL: https://github.com/apache/tvm/pull/16795 This patch makes a small change to compare the values of IntImmNode to see if they're equal when performing a deep_equal of expressions. This is to try and align it with how the [`PEqualChecker`](https://github.com/apache/tvm/blob/b2204ae6988c7745ea9736340ccd900bc21ae821/src/arith/pattern_match.h#L168) works where we only compare the values if both are IntImm. This caused some simplifications to be inconsistent based on whether we used IntImmNode or PrimExpr to pass an integer between different passes, and it seemed to make more sense to say that if the values are equal, then we can conclude the immediates are equal.
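The difference between the two comparison policies discussed in this PR can be shown with a toy model. The sketch below does not use TVM; `ToyIntImm`, `strict_equal`, and `value_equal` are hypothetical names that only mimic an integer immediate carrying a value and a dtype.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyIntImm:
    """Hypothetical stand-in for an integer immediate: a value plus a dtype."""

    value: int
    dtype: str


def strict_equal(a: ToyIntImm, b: ToyIntImm) -> bool:
    """Type-sensitive comparison: dtype and value must both match."""
    return a.dtype == b.dtype and a.value == b.value


def value_equal(a: ToyIntImm, b: ToyIntImm) -> bool:
    """Relaxed comparison: only the integer values are compared."""
    return a.value == b.value


lanes_i32 = ToyIntImm(4, "int32")
lanes_i64 = ToyIntImm(4, "int64")
print(strict_equal(lanes_i32, lanes_i64))  # False
print(value_equal(lanes_i32, lanes_i64))   # True
```

Under the strict policy, a rule that expects two equal lane counts fails to fire when one side is built as `int32` and the other as `int64`, which is the kind of inconsistency the PR description reports.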
Re: [PR] [Frontend][PaddlePaddle] Update the export method of PaddlePaddle Softmax [tvm]
Zheng-Bicheng commented on PR #16653: URL: https://github.com/apache/tvm/pull/16653#issuecomment-2020098015 > Hello, @leandron. I found in [cmsis.py](https://github.com/apache/tvm/blob/ff3716b83a72c2ff261c492f259e1fcd260600ce/python/tvm/relay/op/contrib/cmsisnn.py#L90) that the scale of softmax must be 1/256 and the zero point must be -128. Why is that? According to the formula Q(x_fp32, scale, zero_point) = round(x_fp32/scale) + zero_point, scale and zp should be adjustable (for example, in the case where scale is 1/128 and zp is 0, it should still meet the conditions for int8), right? By the way, in my testing of the Paddle model, the scale is 0.0078649195 (close to 1/127), and the zero point is 0.
Re: [PR] [Frontend][PaddlePaddle] Update the export method of PaddlePaddle Softmax [tvm]
Zheng-Bicheng commented on PR #16653: URL: https://github.com/apache/tvm/pull/16653#issuecomment-2020092045 Hello, @leandron. I found in [cmsis.py](https://github.com/apache/tvm/blob/ff3716b83a72c2ff261c492f259e1fcd260600ce/python/tvm/relay/op/contrib/cmsisnn.py#L90) that the scale of softmax must be 1/256 and the zero point must be -128. Why is that? According to the formula Q(x_fp32, scale, zero_point) = round(x_fp32/scale) + zero_point, scale and zp should be adjustable (for example, in the case where scale is 1/128 and zp is 0, it should still meet the conditions for int8), right?
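The arithmetic behind the question can be worked through with the quoted formula. The sketch below applies Q(x, scale, zero_point) = round(x / scale) + zero_point to a few softmax-style outputs in [0, 1], with saturation to the int8 range added as an assumption. `quantize_int8` is a hypothetical helper; the example shows the numbers only and does not explain why the CMSIS-NN pattern fixes these values.

```python
def quantize_int8(x: float, scale: float, zero_point: int) -> int:
    """Q(x, scale, zp) = round(x / scale) + zp, saturated to the int8 range."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))


for scale, zp in [(1 / 256, -128), (1 / 128, 0)]:
    codes = [quantize_int8(p, scale, zp) for p in (0.0, 0.5, 1.0)]
    print(f"scale={scale}, zero_point={zp}: {codes}")
```

With scale 1/256 and zero point -128, the inputs 0.0, 0.5, and 1.0 map to -128, 0, and 127, so the full int8 range is used. With scale 1/128 and zero point 0, they map to 0, 64, and 127: still valid int8 values, but only the non-negative half of the range is used.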
[PR] [SME] Target parser support for SME [tvm]
lhutton1 opened a new pull request, #16794: URL: https://github.com/apache/tvm/pull/16794 This commit adds support for recognising when the SME architecture feature is available based on the target string. A Python user can use `target.features.has_sme` to check availability. This PR relies on #16425.
Re: [I] [Bug] Tensorization Failure During Multilevel Tiling with Tensor Intrin [tvm]
krishnab30 commented on issue #16614: URL: https://github.com/apache/tvm/issues/16614#issuecomment-2020074885 Hi @zxybazh, I am facing the same issue.
Re: [PR] [Target] Use LLVM target parser for determining Arm(R) A-Profile Architecture features [tvm]
cbalint13 commented on PR #16425: URL: https://github.com/apache/tvm/pull/16425#issuecomment-2019978336 > Here is a reproducer: > mem_leak.cpp Thanks a lot for this, I'll start looking at it now.
Re: [PR] [Target] Use LLVM target parser for determining Arm(R) A-Profile Architecture features [tvm]
lhutton1 commented on PR #16425: URL: https://github.com/apache/tvm/pull/16425#issuecomment-2019958120

Here is a reproducer:

mem_leak.cpp
```c++
#include "tvm/runtime/registry.h"
#include "tvm/target/target.h"

int main() {
  auto pf = tvm::runtime::Registry::Get("target.llvm_get_cpu_archlist");
  (*pf)(tvm::Target("llvm"));
}
```

Compile:
```bash
g++ -std=c++17 -O2 -fPIC -I{TVM_DIR}/include -I{TVM_DIR}/3rdparty/dmlc-core/include -I{TVM_DIR}/tvm/3rdparty/dlpack/include -DDMLC_USE_LOGGING_LIBRARY=\ -o mem_leak_exec mem_leak.cpp -L{TVM_BUILD_DIR} -ldl -ltvm -pthread
```

Run with valgrind:
```bash
LD_PRELOAD="{TVM_BUILD_DIR}/libtvm.so" valgrind --leak-check=full -v --track-origins=yes ./mem_leak_exec
```

Output:
```
...
==475237== 12,369 (1,560 direct, 10,809 indirect) bytes in 1 blocks are definitely lost in loss record 42,596 of 42,630
==475237==    at 0x4849013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==475237==    by 0x12244479: ??? (in /usr/lib/x86_64-linux-gnu/libLLVM-17.so.1)
==475237==    by 0xBC0131B: llvm::Target::createTargetMachine(llvm::StringRef, llvm::StringRef, llvm::StringRef, llvm::TargetOptions const&, std::optional, std::optional, llvm::CodeGenOpt::Level, bool) const (TargetRegistry.h:488)
==475237==    by 0xBBFBC05: tvm::codegen::CreateLLVMTargetMachine(llvm::Target const*, std::__cxx11::basic_string, std::allocator > const&, std::__cxx11::basic_string, std::allocator > const&, std::__cxx11::basic_string, std::allocator > const&, llvm::TargetOptions const&, llvm::Reloc::Model const&, llvm::CodeModel::Model const&, llvm::CodeGenOpt::Level const&) (llvm_instance.cc:393)
==475237==    by 0xBBFBD8A: tvm::codegen::GetLLVMSubtargetInfo(std::__cxx11::basic_string, std::allocator > const&, std::__cxx11::basic_string, std::allocator > const&, std::__cxx11::basic_string, std::allocator > const&) (llvm_instance.cc:408)
==475237==    by 0xBBFEAE1: tvm::codegen::LLVMTargetInfo::GetAllLLVMTargetArches() const (llvm_instance.cc:835)
==475237==    by 0xBBFA2BB: tvm::codegen::LLVMTargetInfo::LLVMTargetInfo(tvm::codegen::LLVMInstance&, tvm::Target const&) (llvm_instance.cc:218)
==475237==    by 0xBC0EE16: tvm::codegen::__mk_TVM8::{lambda(tvm::Target const&)#1}::operator()(tvm::Target const) const (llvm_module.cc:695)
==475237==    by 0xBC188B0: tvm::runtime::TypedPackedFunc (tvm::Target const&)>::AssignTypedLambda(tvm::codegen::__mk_TVM8::{lambda(tvm::Target const&)#1}, std::__cxx11::basic_string, std::allocator >)::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}::operator()(tvm::runtime::TVMArgs const, tvm::runtime::TVMRetValue) const (packed_func.h:1826)
==475237==    by 0xBC233EE: tvm::runtime::PackedFuncObj::Extractor (tvm::Target const&)>::AssignTypedLambda(tvm::codegen::__mk_TVM8::{lambda(tvm::Target const&)#1}, std::__cxx11::basic_string, std::allocator >)::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, std::__cxx11::basic_string, std::allocator >, tvm::runtime::TVMRetValue) (packed_func.h:1252)
==475237==    by 0x1092F8: main (in /workspaces/tvm-ethosn/src/tvm/test_mem_leak/cpp_deploy)
...
```
[PR] [Fix] fix for numpy 2.0 Compatibility [tvm]
mshr-h opened a new pull request, #16793: URL: https://github.com/apache/tvm/pull/16793

I checked the entire tvm codebase with `ruff check . --select NPY201` and no other deprecations were detected. Changes are below.

- use `-np.inf` instead of `np.NINF`
- use `np.inf` instead of `np.infty`
- better attribute existence check #16780

ref: [NumPy 2.0 migration guide — NumPy v2.1.dev0 Manual](https://numpy.org/devdocs/numpy_2_0_migration_guide.html)
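For illustration only, here is a minimal sketch of the two spelling changes described in this PR; it is not the actual diff. `np.inf` and `-np.inf` are valid on both NumPy 1.x and 2.x, while `np.NINF` and `np.infty` were removed in NumPy 2.0. The names `NEG_INF`, `POS_INF`, `has_removed_alias`, and `masked_max` are hypothetical and exist only for this example; in particular, `has_removed_alias` merely shows a generic `hasattr` check and is not claimed to be what #16780 does.

```python
import numpy as np

# Portable spellings: valid on both NumPy 1.x and NumPy 2.x.
NEG_INF = -np.inf  # replaces the removed alias np.NINF
POS_INF = np.inf   # replaces the removed alias np.infty


def has_removed_alias(name: str) -> bool:
    """Hypothetical helper: report whether the installed NumPy still exposes `name`."""
    return hasattr(np, name)


def masked_max(values, mask):
    """Maximum over entries where `mask` is True; masked-out entries count as -inf."""
    values = np.asarray(values, dtype="float64")
    mask = np.asarray(mask, dtype=bool)
    return float(np.max(np.where(mask, values, NEG_INF)))
```

Code written this way does not depend on which NumPy major version is installed, which may be why the PR description prefers the plain `np.inf` spelling over a version check.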
Re: [PR] [CI][AArch64] Enable ONNX and PyTorch tests on AArch64 [tvm]
Liam-Sturge commented on PR #16747: URL: https://github.com/apache/tvm/pull/16747#issuecomment-2019939171 OK, thanks for your feedback @tqchen. I see your point relating to overdoing the testing. Do we currently have any implementation of nightly tests set up that I could append these tests on to? I didn't see anything when I looked, but I may have missed something.
Re: [PR] [Fix] Fix build errors with VS2022 [tvm]
Jiawei-Shao commented on PR #16790: URL: https://github.com/apache/tvm/pull/16790#issuecomment-2019840094 @tqchen PTAL, thanks!
[PR] [Doc] Fix set_axis_separator example [tvm]
quic-sanirudh opened a new pull request, #16792: URL: https://github.com/apache/tvm/pull/16792 Minor fix to update the `set_axis_separator` example to match the definition.