[GitHub] [tvm] masahi commented on pull request #7128: Add `is_floating_point` and `div_` PyTorch ops
masahi commented on pull request #7128: URL: https://github.com/apache/tvm/pull/7128#issuecomment-747928318 Thanks @TylerADavis This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[tvm] branch main updated (6b9b898 -> 833f238)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git.

from 6b9b898 Add ACL testing to the CI for AArch64. (#7122)
add 833f238 Add `is_floating_point` and `div_` PyTorch ops (#7128)

No new revisions were added by this update.

Summary of changes:
 python/tvm/relay/frontend/pytorch.py | 13 +
 1 file changed, 13 insertions(+)
[GitHub] [tvm] masahi merged pull request #7128: Add `is_floating_point` and `div_` PyTorch ops
masahi merged pull request #7128: URL: https://github.com/apache/tvm/pull/7128
[GitHub] [tvm] masahi closed issue #6239: [Frontend][PyTorch] NotImplementedError: The following operators are not implemented: ['aten::is_floating_point', 'aten::true_divide']
masahi closed issue #6239: URL: https://github.com/apache/tvm/issues/6239
[GitHub] [tvm] dlexplorer edited a comment on issue #7104: Segmentation fault with ACL integration on onnx inception v1
dlexplorer edited a comment on issue #7104: URL: https://github.com/apache/tvm/issues/7104#issuecomment-747878618

I double-checked it and can still confirm that Inception v1 compiled with ACL crashes when loaded on the device, while the version compiled with the pure LLVM approach works well. This is how I compiled the network:

```
import argparse
import sys

import onnx

import tvm
from tvm import relay
from tvm.contrib import ndk

parser = argparse.ArgumentParser(description="Converts and compiles ONNX model")
required = parser.add_argument_group('required arguments')
required.add_argument('-m', '--input_model', required=True, type=str, help="path to ONNX model")
args = parser.parse_args()

onnx_model = onnx.load(args.input_model)
mod, params = relay.frontend.from_onnx(onnx_model)

target = "llvm -model=snapdragon835 -mtriple=arm-linux-android -mattr=+neon"
target_host = "llvm -mtriple=aarch64-linux-android-g++"

# The next two lines are the ACL/pure-LLVM differentiation
from tvm.relay.op.contrib.arm_compute_lib import partition_for_arm_compute_lib
mod = partition_for_arm_compute_lib(mod)

with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, target_host=target_host, params=params)

print(args.input_model + ".so")
lib.export_library(args.input_model + ".so", ndk.create_shared)
```
[GitHub] [tvm] insop edited a comment on pull request #7111: Fix a bug in batch_matmul that te.max should be used
insop edited a comment on pull request #7111: URL: https://github.com/apache/tvm/pull/7111#issuecomment-747915428 @comaniac, @zhiics PTAL.
[GitHub] [tvm] insop commented on pull request #7111: Fix a bug in batch_matmul that te.max should be used
insop commented on pull request #7111: URL: https://github.com/apache/tvm/pull/7111#issuecomment-747915428 @comaniac, @zhiics PTAL.
[GitHub] [tvm] hzfan opened a new pull request #7131: [TOPI] cuda reduction schedule
hzfan opened a new pull request #7131: URL: https://github.com/apache/tvm/pull/7131

Fix CUDA reduction schedules to support diamond-like compute, where a reduce becomes an input twice, like:

```
   reduce
   /    \
 add    multiply
   \    /
    add
```
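The diamond dataflow above can be sketched in plain NumPy (a hedged illustration of the pattern, not the TOPI schedule itself): one reduction result is consumed by two branches whose outputs are then combined again.

```python
import numpy as np

def diamond(x):
    """Diamond-shaped compute: one reduce feeds two consumers."""
    r = x.sum(axis=-1)   # the reduce at the top of the diamond
    a = r + 1.0          # left branch (add)
    m = r * 2.0          # right branch (multiply)
    return a + m         # final add rejoins the two branches

out = diamond(np.ones((2, 3)))  # r == [3, 3], so out == [10, 10]
```

The scheduling difficulty is that the reduce's result is reused by both branches, so it cannot simply be inlined into a single consumer.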
[GitHub] [tvm] manupa-arm commented on pull request #7002: Created CSourceMetaData module for model metadata
manupa-arm commented on pull request #7002: URL: https://github.com/apache/tvm/pull/7002#issuecomment-747900398 @tqchen @areusch @zhiics, I've addressed all the comments and aligned with what we discussed in the RFC. Can you have another look when you have time? If it's OK, please approve explicitly. Thanks!
[GitHub] [tvm] manupa-arm commented on a change in pull request #7002: Created CSourceMetaData module for model metadata
manupa-arm commented on a change in pull request #7002: URL: https://github.com/apache/tvm/pull/7002#discussion_r545600389

## File path: src/relay/backend/contrib/codegen_c/codegen.cc

```
@@ -259,18 +263,13 @@ class CSourceCodegen : public CSourceModuleCodegenBase {
 )op_macro";
 code_stream_ << operator_macro << "\n\n";
-
-ICHECK(ref->IsInstance());
-auto res = GenCFunc(Downcast(ref));
+code_stream_ << std::get<3>(res);
 std::string code = code_stream_.str();
-String sym = std::get<0>(res);
-Array variables = std::get<1>(res);
-
 // Create a CSource module
 const auto* pf = runtime::Registry::Get("runtime.CSourceModuleCreate");
 ICHECK(pf != nullptr) << "Cannot find csource module to create the external runtime module";
-return (*pf)(code, "c", sym, variables);
+return (*pf)(code, "c", Array{func_name}, sym, variables);
```

Review comment: Used `func_name` as it is the clearer term, and removed `sym`.
[GitHub] [tvm] tsupei edited a comment on pull request #7111: Fix a bug in batch_matmul that te.max should be used
tsupei edited a comment on pull request #7111: URL: https://github.com/apache/tvm/pull/7111#issuecomment-747881146 @insop Device `nvptx` will also have the `Memory Verification` issue. I added a commit to skip that.
[GitHub] [tvm] tsupei commented on pull request #7111: Fix a bug in batch_matmul that te.max should be used
tsupei commented on pull request #7111: URL: https://github.com/apache/tvm/pull/7111#issuecomment-747881146 @insop Device `nvptx` will also have the `Memory Verification` issue; I added a commit to skip that.
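For context on the `te.max` fix under review, here is a hedged NumPy sketch (not the TOPI code) of why the output batch dimension of a broadcasting `batch_matmul` is the maximum of the two input batch sizes — with dynamic shapes those dimensions are symbolic expressions, so a symbolic `te.max` is needed rather than plain Python `max`.

```python
import numpy as np

def batch_matmul(x, y):
    """Batched matmul with batch broadcasting: x is (B1, M, K),
    y is (B2, N, K); a batch of 1 broadcasts against the other,
    so the output batch is max(B1, B2) -- te.max in symbolic form."""
    b = max(x.shape[0], y.shape[0])
    out = np.empty((b, x.shape[1], y.shape[1]))
    for i in range(b):
        xi = x[min(i, x.shape[0] - 1)]  # index 0 repeats when B1 == 1
        yi = y[min(i, y.shape[0] - 1)]
        out[i] = xi @ yi.T              # y's last axis is K, so transpose
    return out

x = np.ones((1, 2, 3))   # batch 1 broadcasts against batch 4
y = np.ones((4, 5, 3))
z = batch_matmul(x, y)   # z.shape == (4, 2, 5)
```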
[GitHub] [tvm] dlexplorer commented on issue #7104: Segmentation fault with ACL integration on onnx inception v1
dlexplorer commented on issue #7104: URL: https://github.com/apache/tvm/issues/7104#issuecomment-747878618

I double-checked it and can still confirm that Inception v1 compiled with ACL crashes when loaded on the device, while the version compiled with the pure LLVM approach works well. This is how I compiled the network:

```
import argparse
import sys

import onnx

import tvm
from tvm import relay
from tvm.contrib import ndk

parser = argparse.ArgumentParser(description="Converts and compiles ONNX model")
required = parser.add_argument_group('required arguments')
required.add_argument('-m', '--input_model', required=True, type=str, help="path to ONNX model")
args = parser.parse_args()

onnx_model = onnx.load(args.input_model)
mod, params = relay.frontend.from_onnx(onnx_model)

target = "llvm -model=snapdragon835 -mtriple=arm-linux-android -mattr=+neon"
target_host = "llvm -mtriple=aarch64-linux-android-g++"

with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, target_host=target_host, params=params)

print(args.input_model + ".so")
lib.export_library(args.input_model + ".so", ndk.create_shared)
```
[GitHub] [tvm] slyubomirsky commented on issue #7129: [Relay][Documentation] Behavior of relay.stack does not appear to match documentation
slyubomirsky commented on issue #7129: URL: https://github.com/apache/tvm/issues/7129#issuecomment-747872960 In retrospect, it is conceivable that the documentation did correctly note that the input must be a list or a tuple literal, but I would argue that restricting the possible input in this way defies user expectations. Also, the fix was very easy.
[GitHub] [tvm] slyubomirsky opened a new pull request #7130: [Relay][fix] Stack should take exprs that evaluate to tuples
slyubomirsky opened a new pull request #7130: URL: https://github.com/apache/tvm/pull/7130 As noted in issue #7129, the documentation for `relay.stack` suggests it should be able to take a Relay expression that evaluates to a tuple, but in reality it only works for tuple literals or list literals. It was, in fact, easy to fix `relay.stack` to work on Relay expressions that evaluate to tuples. I have added test cases to show this off, and I would appreciate suggestions for further test cases if possible. I realize the documentation for `stack` may have been specifically intended to mean that its input should always be a tuple literal, but I think it is reasonable to expect that it should be able to take an expression that evaluates to a tuple as input. (The operator implementation and type relation can handle that just fine.)
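The dispatch pattern behind such a fix can be sketched in plain Python. This is a hedged illustration, not the actual Relay binding: the names `Expr`, `Tuple`, and the returned placeholder stand in for the real relay internals. The idea is that the binding should only coerce genuine Python sequences into a tuple node and pass Relay expressions (which may themselves evaluate to tuples) straight through.

```python
class Expr:
    """Stand-in for a Relay expression."""

class Tuple(Expr):
    """Stand-in for a Relay tuple node."""
    def __init__(self, fields):
        self.fields = list(fields)

def stack(data, axis=0):
    if isinstance(data, (list, tuple)):
        data = Tuple(data)            # Python literal sequence: wrap it
    elif not isinstance(data, Expr):
        raise TypeError("data must be a list, tuple, or Expr")
    return ("stack", data, axis)      # placeholder for the C++ op maker

lit = stack([Expr(), Expr()])          # list literal: still works
expr = stack(Tuple([Expr(), Expr()]))  # tuple-valued expression: now works too
```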
[GitHub] [tvm] masahi commented on pull request #7123: Parallelize cumsum in get_valid_counts
masahi commented on pull request #7123: URL: https://github.com/apache/tvm/pull/7123#issuecomment-747865155 @mbrookhart Can you revive the disabled topi `get_valid_count` test? It seems this test needs some updating. https://github.com/apache/tvm/blob/76b4ad09386f26ff360d0276745fe882d3ba6b0d/tests/python/topi/python/test_topi_vision.py#L124-L129
[GitHub] [tvm] insop edited a comment on pull request #7111: Fix a bug in batch_matmul that te.max should be used
insop edited a comment on pull request #7111: URL: https://github.com/apache/tvm/pull/7111#issuecomment-747862949 @tsupei, I didn't see it on my system, where only the `llvm` test ran. I have this [patch](https://github.com/insop/incubator-tvm/commit/9760c73dd0144bfbdd09f1dcbc7746efe269a823.patch) to skip the dynamic test on `cuda`, which you could apply. However, I am not sure whether the `Memory verification` error shown in your previous post and on [jenkins](https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-7111/6/pipeline/) for cuda is expected or not. @comaniac, @zhiics, thoughts?
[GitHub] [tvm] masahi commented on pull request #7099: [CUDA] Parallel Cuda Mergesort
masahi commented on pull request #7099: URL: https://github.com/apache/tvm/pull/7099#issuecomment-747864475 @mbrookhart I think we can revive some tests that are currently disabled due to flaky sort. See https://github.com/apache/tvm/blob/bad149ed8a555444d813537608ee5cea9e95e97e/tests/python/relay/test_any.py#L253-L256 https://github.com/apache/tvm/blob/76b4ad09386f26ff360d0276745fe882d3ba6b0d/tests/python/topi/python/test_topi_argwhere.py#L63-L71
[GitHub] [tvm] insop commented on pull request #7111: Fix a bug in batch_matmul that te.max should be used
insop commented on pull request #7111: URL: https://github.com/apache/tvm/pull/7111#issuecomment-747862949 @tsupei, I didn't see it on my system, where only the `llvm` test ran. I have this [patch](https://github.com/insop/incubator-tvm/commit/8c0e1d7351cb5860e159cc29a51204525aa02833.patch) to skip the dynamic test on `cuda`, which you could apply. However, I am not sure whether the `Memory verification` error shown in your previous post and on [jenkins](https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-7111/6/pipeline/) for cuda is expected or not. @comaniac, @zhiics, thoughts?
[GitHub] [tvm] masahi commented on pull request #7128: Add `is_floating_point` and `div_` PyTorch ops
masahi commented on pull request #7128: URL: https://github.com/apache/tvm/pull/7128#issuecomment-747862332 Have you looked at `verify_script_model` in pytorch/test_forward.py? It might help for adding tests for `is_floating_point`. I also encountered an empty graph when writing a test for `torch.numel` with tracing. See `test_numel()` for its usage.
[GitHub] [tvm] slyubomirsky commented on issue #7129: [Relay][Documentation] Behavior of relay.stack does not appear to match documentation
slyubomirsky commented on issue #7129: URL: https://github.com/apache/tvm/issues/7129#issuecomment-747862152 It appears this can be easily fixed with some changes to the Python binding for `stack`. I will PR a fix.
[GitHub] [tvm] slyubomirsky opened a new issue #7129: [Relay][Documentation] Documentation for relay.stack appears incorrect
slyubomirsky opened a new issue #7129: URL: https://github.com/apache/tvm/issues/7129 The documentation for [`tvm.relay.stack`](https://tvm.apache.org/docs/api/python/relay/index.html#tvm.relay.stack) states that its input can be either a list of Relay expressions or a tuple. However, if you give a Relay expression with a tuple output as an input to `tvm.relay.stack`, this will result in a Python type error, as the function [first tries to coerce the input into a Python list](https://github.com/apache/tvm/blob/6b9b8986bdf94834f3c19c0dc3c2841ac77d46fc/python/tvm/relay/op/tensor.py#L1119). The functionality should either be made to match the documentation or the documentation should be corrected to reflect the intended use. @hogepodge
[GitHub] [tvm] TylerADavis commented on pull request #7128: Add `is_floating_point` and `div_` PyTorch ops
TylerADavis commented on pull request #7128: URL: https://github.com/apache/tvm/pull/7128#issuecomment-747854049 Hi, I just ran `black` on `pytorch.py` and pushed the changes.
[GitHub] [tvm] masahi commented on pull request #7128: Add `is_floating_point` and `div_` PyTorch ops
masahi commented on pull request #7128: URL: https://github.com/apache/tvm/pull/7128#issuecomment-747848521 Thanks, please run `black` to fix formatting.
[GitHub] [tvm] TylerADavis opened a new pull request #7128: Add `is_floating_point` and `div_` PyTorch ops
TylerADavis opened a new pull request #7128: URL: https://github.com/apache/tvm/pull/7128 This PR adds support for the `is_floating_point` and `div_` PyTorch operations, and closes #6239. This PR does not include a unit test for `div_`, as there is currently no unit test for `div` and the two share an implementation. Additionally, while I wrote a unit test for `is_floating_point()` (see https://github.com/apache/tvm/compare/main...TylerADavis:tyler_add_ops_incl_tests?expand=1), it is not included in this PR because models consisting solely of `is_floating_point` are compiled to constant graphs with no inputs, and the current testing infrastructure assumes that all graphs have inputs. The unit test for `is_floating_point()` exercises the operator with `torch.jit.script` and `torch.jit.trace` on inputs of dtype `int8`, `float64`, `float32`, and `float16`. Each of these test cases passes.
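A hedged sketch of what an `aten::is_floating_point` converter boils down to (the dtype set below is an assumption, not the exact list in the PR): the answer depends only on the input's dtype, so the op folds to a compile-time constant — which is exactly why models consisting solely of this op become constant graphs with no inputs.

```python
# Illustrative only: the result is known from the dtype alone,
# so a graph containing just this op constant-folds away.
FLOAT_DTYPES = {"float16", "float32", "float64", "bfloat16"}

def is_floating_point(dtype):
    return dtype in FLOAT_DTYPES

results = [is_floating_point(d) for d in ("int8", "float64", "float32", "float16")]
```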
[GitHub] [tvm] codeislife99 commented on a change in pull request #7125: Sparse reshape op
codeislife99 commented on a change in pull request #7125: URL: https://github.com/apache/tvm/pull/7125#discussion_r545533821

## File path: include/tvm/topi/transform.h

(on the `tvm::te::compute(Array{1}, ...)` line in the new `SparseReshape`)

Review comment: Yeah, I was thinking of that too, but since the value of each multiplier depends on a later index, it might be a bit non-trivial to compute over multipliers. Am I missing something? I am open to pseudo-code ideas to make it cleaner.
[GitHub] [tvm] trevor-m commented on a change in pull request #7125: Sparse reshape op
trevor-m commented on a change in pull request #7125: URL: https://github.com/apache/tvm/pull/7125#discussion_r545520606

## File path: include/tvm/topi/transform.h

```
@@ -1386,6 +1386,85 @@ inline Array meshgrid(const Array& inputs, const std::string& in
   return result;
 }

+/*!
+ * \brief Compute new sparse indices and return them after the sparsereshape operation
+ *
+ * \param sparse_indices Indices where values of the dense tensor exist
+ * \param sparse_values Values at the above indices respectively
+ * \param prev_shape Old Shape of the sparse tensor corresponding to sparse_indices
+ * \param new_shape Desired Shape of the sparse tensor which will correspond to output
+ * \param name The name of the operation
+ * \param tag The tag to mark the operation
+ *
+ * \return A Tensor whose op member is the sparsereshape operation
+ */
+inline Array SparseReshape(const Tensor& sparse_indices, const Tensor& sparse_values,
+                           const Tensor& prev_shape, const Tensor& new_shape,
+                           const std::string name = "T_sparsereshape",
+                           std::string tag = kInjective) {
+  Array result;
+  Array new_sparse_indices_shape{sparse_indices->shape[0], new_shape->shape[0]};
+  std::vector multipliers(GetConstInt(prev_shape->shape[0]), 1);
+  std::vector dividers(GetConstInt(new_shape->shape[0]), 1);
+
+  tvm::te::compute(Array{1}, [&](const Array& indices) {
```

Review comment: This compute looks weird to me since it isn't returning anything. I think it's better to determine the multipliers shape and do a compute over that shape to initialize multipliers, rather than a compute over [1] which has internal loops. And the same for dividers. Can someone with more experience here chime in?
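The multipliers/dividers scheme under discussion can be sketched in plain Python (a hedged illustration, not the TOPI implementation): the multipliers are the row-major strides of the old shape, used to flatten each sparse index; the dividers are the strides of the new shape, used to unflatten. Each stride depends on the dimensions after it, which is the dependency codeislife99 mentions. The example values follow TensorFlow's SparseReshape documentation.

```python
def strides(shape):
    """Row-major strides: each entry depends on the later dimensions."""
    s = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        s[i] = s[i + 1] * shape[i + 1]
    return s

def sparse_reshape_indices(sparse_indices, prev_shape, new_shape):
    mult = strides(prev_shape)  # "multipliers": flatten old indices
    div = strides(new_shape)    # "dividers": unflatten into new indices
    out = []
    for idx in sparse_indices:
        flat = sum(i * m for i, m in zip(idx, mult))
        out.append([(flat // d) % s for d, s in zip(div, new_shape)])
    return out

new_idx = sparse_reshape_indices(
    [[0, 0, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0], [1, 2, 3]],
    prev_shape=[2, 3, 6],
    new_shape=[9, 4],
)  # -> [[0, 0], [0, 1], [1, 2], [4, 2], [8, 1]]
```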
[GitHub] [tvm] comaniac commented on pull request #7122: [CI] Add ACL to the CI for AArch64
comaniac commented on pull request #7122: URL: https://github.com/apache/tvm/pull/7122#issuecomment-747811368 Thanks @u99127 @zhiics.
[tvm] branch main updated (7cbc453 -> 6b9b898)
This is an automated email from the ASF dual-hosted git repository. comaniac pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git.

from 7cbc453 Fix spelling in some comments (#7124)
add 6b9b898 Add ACL testing to the CI for AArch64. (#7122)

No new revisions were added by this update.

Summary of changes:
 docker/Dockerfile.ci_arm | 4
 tests/scripts/task_config_build_arm.sh | 1 +
 ...task_python_ethosn_tests.sh => task_python_arm_compute_library.sh} | 2 +-
 3 files changed, 6 insertions(+), 1 deletion(-)
 copy tests/scripts/{task_python_ethosn_tests.sh => task_python_arm_compute_library.sh} (92%)
[GitHub] [tvm] comaniac merged pull request #7122: [CI] Add ACL to the CI for AArch64
comaniac merged pull request #7122: URL: https://github.com/apache/tvm/pull/7122
[GitHub] [tvm] comaniac commented on a change in pull request #7125: Sparse reshape op
comaniac commented on a change in pull request #7125: URL: https://github.com/apache/tvm/pull/7125#discussion_r545511543

## File path: include/tvm/topi/transform.h

(on the blank line after the `SparseReshape` doc comment)

Review comment: remove this line.

## File paths: python/tvm/relay/op/transform.py and python/tvm/topi/transform.py

Both files add the same `sparsereshape` docstring:

```
def sparsereshape(sparse_indices, sparse_values, prev_shape, new_shape):
    """
    Reshape a Sparse Tensor

    Parameters
    ----------
    inputs : List[relay.Expr]
        Input tensor and indices.
        The first tensor is input data and rests are indices.

    Returns
    -------
    result: relay.Expr
        Output tensor.

    Examples
    --------
    .. code-block:: python

        sparse_indices = [[0, 0, 0],
                          [0, 0, 1],
                          [0, 1, 0],
                          [1, 0, 0],
                          [1, 2, 3]]
        sparse_values = [7, 5, 6, 3, 9]
        prev_shape = [2, 3, 4]
        new_shape = [9, -1]
        relay.sparsereshape(sparse_indices,
                            sparse_values,
                            prev_shape,
                            new_shape)
        = [[0, 0],
           [0, 1],
           [1, 2],
           [4, 2],
           [8, 1]]
    """
```

Review comments: remove the trailing blank line after the example in each docstring, and a suggestion to fix the `Examples` section heading in each.
[GitHub] [tvm] codeislife99 commented on pull request #7125: Sparse reshape op
codeislife99 commented on pull request #7125: URL: https://github.com/apache/tvm/pull/7125#issuecomment-747790790 @trevor-m @comaniac @zhiics PTAL
[GitHub] [tvm] tqchen commented on pull request #6959: [CONTRIB] PopenPoolExecutor
tqchen commented on pull request #6959: URL: https://github.com/apache/tvm/pull/6959#issuecomment-747780063 Going to merge after two days.
[GitHub] [tvm] tqchen merged pull request #7124: Fix spelling in some comments
tqchen merged pull request #7124: URL: https://github.com/apache/tvm/pull/7124
[tvm] branch main updated (9e766d9 -> 7cbc453)
This is an automated email from the ASF dual-hosted git repository. tqchen pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git.

from 9e766d9 [Relay][Topi] Add Sort Op to Relay (#6978)
add 7cbc453 Fix spelling in some comments (#7124)

No new revisions were added by this update.

Summary of changes:
 include/tvm/runtime/data_type.h | 8
 1 file changed, 4 insertions(+), 4 deletions(-)
[GitHub] [tvm] hogepodge edited a comment on pull request #7127: Added additional information to the from_onnx tutorial
hogepodge edited a comment on pull request #7127: URL: https://github.com/apache/tvm/pull/7127#issuecomment-747750771 @jwfromm @mbrookhart Taking a first run at updating some of the onnx tutorial information. Feedback greatly appreciated.
[GitHub] [tvm] hogepodge commented on pull request #7127: Added additional information to the from_onnx tutorial
hogepodge commented on pull request #7127: URL: https://github.com/apache/tvm/pull/7127#issuecomment-747750771 /assign @jwfromm @mbrookhart Taking a first run at updating some of the onnx tutorial information. Feedback greatly appreciated.
[GitHub] [tvm] hogepodge opened a new pull request #7127: Added additional information to the from_onnx tutorial
hogepodge opened a new pull request #7127: URL: https://github.com/apache/tvm/pull/7127 As ONNX support improves, add more information about how the model importer works to aid new users.
[GitHub] [tvm] codeislife99 opened a new pull request #7126: Sparse fill empty rows op
codeislife99 opened a new pull request #7126: URL: https://github.com/apache/tvm/pull/7126 Thanks for contributing to TVM! Please refer to guideline https://tvm.apache.org/docs/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from [Reviewers](https://github.com/apache/incubator-tvm/blob/master/CONTRIBUTORS.md#reviewers) by @ them in the pull request thread.
[GitHub] [tvm] tqchen commented on a change in pull request #7083: [RELAY,TOPI] Threefry PRNG: splittable and stateless
tqchen commented on a change in pull request #7083: URL: https://github.com/apache/tvm/pull/7083#discussion_r545449345

## File path: src/relay/op/random/kernel.cc
## @@ -25,21 +25,23 @@ namespace relay {
 TVM_REGISTER_NODE_TYPE(ThreefryGenerateAttrs);
+static const TensorType THREEFRY_KEY_TYPE = TensorType({10}, tvm::DataType::UInt(64));

Review comment: Usually CamelCase is preferred for a global or static function
[GitHub] [tvm] altanh commented on a change in pull request #7083: [RELAY,TOPI] Threefry PRNG: splittable and stateless
altanh commented on a change in pull request #7083: URL: https://github.com/apache/tvm/pull/7083#discussion_r545434990

## File path: src/relay/op/random/kernel.cc
## @@ -25,21 +25,23 @@ namespace relay {
 TVM_REGISTER_NODE_TYPE(ThreefryGenerateAttrs);
+static const TensorType THREEFRY_KEY_TYPE = TensorType({10}, tvm::DataType::UInt(64));

Review comment: Got it, is the naming style OK for a static function? `THREEFRY_KEY_TYPE()`?
[GitHub] [tvm] tqchen commented on a change in pull request #7083: [RELAY,TOPI] Threefry PRNG: splittable and stateless
tqchen commented on a change in pull request #7083: URL: https://github.com/apache/tvm/pull/7083#discussion_r545434253

## File path: src/relay/op/random/kernel.cc
## @@ -25,21 +25,23 @@ namespace relay {
 TVM_REGISTER_NODE_TYPE(ThreefryGenerateAttrs);
+static const TensorType THREEFRY_KEY_TYPE = TensorType({10}, tvm::DataType::UInt(64));

Review comment: As a rule of thumb, try to avoid static variables, as they sometimes have static-variable construction-order issues. Use static functions that return these variables instead, since the construction is not that costly.
[tvm] branch main updated (bad149e -> 9e766d9)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git.

from bad149e [TOPI] Fix GPU Dynamic Op Schedule (#7117)
add 9e766d9 [Relay][Topi]Add Sort Op to Relay (#6978)

No new revisions were added by this update.

Summary of changes:
 python/tvm/relay/op/_algorithm.py              |   9 ++-
 python/tvm/relay/op/algorithm.py               |  22 ++
 python/tvm/relay/op/strategy/cuda.py           |  20 +
 python/tvm/relay/op/strategy/generic.py        |  24 ++
 python/tvm/topi/cuda/sort.py                   | 103 +
 python/tvm/topi/generic/sort.py                |  17
 python/tvm/topi/sort.py                        |  42 ++
 src/relay/op/algorithm/{argsort.cc => sort.cc} |  26 +++
 src/runtime/contrib/sort/sort.cc               |  63 ++-
 tests/python/relay/test_op_level6.py           |  60 +++---
 tests/python/topi/python/test_topi_sort.py     |  53 +
 11 files changed, 411 insertions(+), 28 deletions(-)
 copy src/relay/op/algorithm/{argsort.cc => sort.cc} (69%)
[GitHub] [tvm] masahi commented on pull request #6978: [Relay][Topi]Add Sort Op to Relay
masahi commented on pull request #6978: URL: https://github.com/apache/tvm/pull/6978#issuecomment-747728047 Thanks @mbrookhart @zhiics @kevinthesun @jwfromm
[GitHub] [tvm] masahi merged pull request #6978: [Relay][Topi]Add Sort Op to Relay
masahi merged pull request #6978: URL: https://github.com/apache/tvm/pull/6978
[GitHub] [tvm] masahi commented on a change in pull request #7123: Parallelize cumsum in get_valid_counts
masahi commented on a change in pull request #7123: URL: https://github.com/apache/tvm/pull/7123#discussion_r545429638

## File path: python/tvm/topi/cuda/nms.py
## @@ -211,28 +211,119 @@ def get_valid_indices_ir(valid_boxes, valid_count, valid_indices):
     valid_count = ib.buffer_ptr(valid_count)
     valid_indices = ib.buffer_ptr(valid_indices)
 
+    max_threads = int(tvm.target.Target.current(allow_none=False).max_num_threads)
+
+    def ceil_div(a, b):
+        return tvm.tir.indexdiv(a + b - 1, b)
+
+    # Copy boxes to valid_indices
+    with ib.new_scope():
+        nthread_tx = max_threads
+        nthread_bx = ceil_div(batch_size * num_anchors, max_threads)
+        tx = te.thread_axis("threadIdx.x")
+        bx = te.thread_axis("blockIdx.x")
+        ib.scope_attr(tx, "thread_extent", nthread_tx)
+        ib.scope_attr(bx, "thread_extent", nthread_bx)
+        tid = bx * max_threads + tx
+        with ib.if_scope(tid < batch_size * num_anchors):
+            valid_indices[tid] = valid_boxes[tid]
+
+    nthread_tx = max_threads
+    nthread_bx = ceil_div(num_anchors, max_threads)
+    nthread_by = batch_size
+
+    ## The following algorithm performs parallel prefix sum to get
+    ## a tensor that can later be used to select valid indices
+    # Up Sweep of prefix sum
+    lim = tvm.tir.generic.cast(
+        tvm.tir.ceil(tvm.tir.log2(tvm.tir.generic.cast(num_anchors, "float64"))), "int64"
+    )
+    with ib.for_range(0, lim, dtype="int64") as l2_width:
+        width = 2 << l2_width
+
+        with ib.new_scope():
+            tx = te.thread_axis("threadIdx.x")
+            bx = te.thread_axis("blockIdx.x")
+            ib.scope_attr(tx, "thread_extent", nthread_tx)
+            ib.scope_attr(
+                bx,
+                "thread_extent",
+                tvm.tir.generic.cast(ceil_div(num_anchors, max_threads * width), "int32"),
+            )
+            tid = bx * nthread_tx + tx
+
+            by = te.thread_axis("blockIdx.y")
+            ib.scope_attr(by, "thread_extent", nthread_by)
+            start = ib.allocate("int64", (1,), name="start", scope="local")
+            middle = ib.allocate("int64", (1,), name="middle", scope="local")
+            end = ib.allocate("int64", (1,), name="end", scope="local")
+            start[0] = width * tid
+            with ib.if_scope(start[0] < num_anchors):
+                middle[0] = start[0] + tvm.tir.indexdiv(width, 2)
+                end[0] = tvm.te.min(start[0] + width, num_anchors)
+                with ib.if_scope(middle[0] < num_anchors):
+                    valid_indices[by * num_anchors + end[0] - 1] += valid_indices[
+                        by * num_anchors + middle[0] - 1
+                    ]
+
+    # Down Sweep of prefix sum
+    with ib.for_range(0, lim - 1, dtype="int64") as l2_width:
+        width = 2 << (lim - l2_width - 2)
+
+        with ib.new_scope():
+            tx = te.thread_axis("threadIdx.x")
+            bx = te.thread_axis("blockIdx.x")
+            ib.scope_attr(tx, "thread_extent", nthread_tx)
+            ib.scope_attr(
+                bx,
+                "thread_extent",
+                tvm.tir.generic.cast(ceil_div(num_anchors, max_threads * width), "int32"),
+            )
+            tid = bx * nthread_tx + tx
+
+            by = te.thread_axis("blockIdx.y")
+            ib.scope_attr(by, "thread_extent", nthread_by)
+            start = ib.allocate("int64", (1,), name="start", scope="local")
+            middle = ib.allocate("int64", (1,), name="middle", scope="local")
+            end = ib.allocate("int64", (1,), name="end", scope="local")
+            start[0] = width * tid
+            with ib.if_scope(tvm.tir.all(start[0] > 0, start[0] < num_anchors)):
+                middle[0] = start[0] + tvm.tir.indexdiv(width, 2)
+                with ib.if_scope(middle[0] < num_anchors):
+                    valid_indices[by * num_anchors + middle[0] - 1] += valid_indices[
+                        by * num_anchors + start[0] - 1
+                    ]
+
+    ## Write Sum to valid_count
     max_threads = int(tvm.target.Target.current(allow_none=False).max_num_threads)
     with ib.new_scope():
         nthread_tx = max_threads
-        nthread_bx = batch_size // max_threads + 1
+        nthread_bx = ceil_div(batch_size, max_threads)

Review comment: What is the typical size of `batch_size` here? Do we need to launch this many threads (1024 minimum)? How about fusing this kernel (populating `valid_counts`) into the last iteration of the down sweep phase above? You have plenty of threads there, and only some of them need to write to `valid_counts`.
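The up-sweep/down-sweep scheme in the quoted diff can be simulated sequentially to see what it computes. Below is a hedged sketch (not the TVM kernel itself): each iteration of the inner `start` loops corresponds to one independent GPU thread in the real kernel, and the result is an inclusive prefix sum, including for lengths that are not powers of two:

```python
import math


def inclusive_scan(values):
    """Sequential simulation of the up-sweep/down-sweep parallel prefix sum."""
    a = list(values)
    n = len(a)
    if n < 2:
        return a
    lim = math.ceil(math.log2(n))
    # Up sweep: each pass folds the midpoint partial sum of a block of size
    # `width` into the block's last element.
    for l2_width in range(lim):
        width = 2 << l2_width
        for start in range(0, n, width):
            middle = start + width // 2
            end = min(start + width, n)
            if middle < n:
                a[end - 1] += a[middle - 1]
    # Down sweep: propagate block sums back into block midpoints.
    for l2_width in range(lim - 1):
        width = 2 << (lim - l2_width - 2)
        for start in range(width, n, width):
            middle = start + width // 2
            if middle < n:
                a[middle - 1] += a[start - 1]
    return a
```

For example, `inclusive_scan([1, 1, 1, 1, 1])` yields `[1, 2, 3, 4, 5]`, which is exactly the cumsum the kernel needs for selecting valid indices.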
[tvm] branch main updated: [TOPI] Fix GPU Dynamic Op Schedule (#7117)
This is an automated email from the ASF dual-hosted git repository. kevinthesun pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
new bad149e [TOPI] Fix GPU Dynamic Op Schedule (#7117)
bad149e is described below

commit bad149ed8a555444d813537608ee5cea9e95e97e
Author: Yao Wang
AuthorDate: Thu Dec 17 13:54:44 2020 -0800

    [TOPI] Fix GPU Dynamic Op Schedule (#7117)

    * Fix GPU dynamic op schedules
    * Fix dynamic shape nms
    * Fix
    * Fix test format
---
 python/tvm/topi/cuda/conv2d_transpose_nchw.py |  7 +++-
 python/tvm/topi/cuda/injective.py             | 13 ++-
 python/tvm/topi/cuda/nms.py                   | 49 +--
 python/tvm/topi/cuda/sort.py                  |  3 ++
 src/runtime/contrib/thrust/thrust.cu          |  9 +
 src/runtime/vm/vm.cc                          | 17 --
 tests/python/relay/test_any.py                | 29 ++--
 7 files changed, 117 insertions(+), 10 deletions(-)

diff --git a/python/tvm/topi/cuda/conv2d_transpose_nchw.py b/python/tvm/topi/cuda/conv2d_transpose_nchw.py
index 609d1ac..3b70417 100644
--- a/python/tvm/topi/cuda/conv2d_transpose_nchw.py
+++ b/python/tvm/topi/cuda/conv2d_transpose_nchw.py
@@ -179,7 +179,10 @@ def schedule_conv2d_transpose_nchw(cfg, outs):
     # space definition begin #
     n, f, y, x = s[conv].op.axis
     rc = s[conv].op.reduce_axis[0]
-    cfg.define_split("tile_n", cfg.axis(n), num_outputs=4)
+    # TODO(@kevinthesun): Support tuning/optimization for dynamic shape.
+    bs = pad_data.shape[0]
+    n_tuning_axis = n if isinstance(bs, tvm.tir.IntImm) else 1
+    cfg.define_split("tile_n", cfg.axis(n_tuning_axis), num_outputs=4)
     cfg.define_split("tile_f", cfg.axis(f), num_outputs=4)
     cfg.define_split("tile_y", cfg.axis(y), num_outputs=4)
     cfg.define_split("tile_x", cfg.axis(x), num_outputs=4)
@@ -194,6 +197,8 @@ def schedule_conv2d_transpose_nchw(cfg, outs):
     if cfg.is_fallback:
         N, F, Y, X = get_const_tuple(conv.shape)
+        if not isinstance(N, int):
+            N = 1
         _fallback_schedule(N, F, Y, X)
     # space definition end #

diff --git a/python/tvm/topi/cuda/injective.py b/python/tvm/topi/cuda/injective.py
index 60fb12e..7f0790a 100644
--- a/python/tvm/topi/cuda/injective.py
+++ b/python/tvm/topi/cuda/injective.py
@@ -44,8 +44,16 @@ def schedule_injective_from_existing(sch, out):
     # bandwidth.
     vector_width = 4 if out.dtype == "float16" else 1
 
+    is_dynamic_output = False
+    for dim in out.shape:
+        if not isinstance(dim, tvm.tir.IntImm):
+            is_dynamic_output = True
+            break
+
+    out_len = utils.prod(out.shape)
+
     try:
-        const_size = utils.get_const_int(utils.prod(out.shape))
+        const_size = utils.get_const_int(out_len)
         need_block_split = const_size > max_block * num_thread * vector_width
     except ValueError:
         need_block_split = False
@@ -61,6 +69,9 @@ def schedule_injective_from_existing(sch, out):
         sch[out].bind(bx, te.thread_axis("blockIdx.x"))
         sch[out].bind(tx, te.thread_axis("threadIdx.x"))
     else:
+        # Use less threads for dynamic shape ops to avoid runtime error.
+        if is_dynamic_output:
+            num_thread //= 2
         bx, tx = sch[out].split(fused, factor=num_thread)
         sch[out].bind(tx, te.thread_axis("threadIdx.x"))
         sch[out].bind(bx, te.thread_axis("blockIdx.x"))

diff --git a/python/tvm/topi/cuda/nms.py b/python/tvm/topi/cuda/nms.py
index d0915d9..2733970 100644
--- a/python/tvm/topi/cuda/nms.py
+++ b/python/tvm/topi/cuda/nms.py
@@ -22,7 +22,6 @@
 from tvm import te
 from tvm.tir import if_then_else
 from .sort import argsort, argsort_thrust
-from .. import tag
 
 def cuda_atomic_add_rule(op):
@@ -95,7 +94,7 @@ def rearrange_indices_out_ir(data, output, valid_box_count):
     with ib.new_scope():
         i = te.thread_axis("blockIdx.x")
         ib.scope_attr(i, "thread_extent", batch_size)
-        valid_idx = ib.allocate("int32", (1), name="valid_idx", scope="local")
+        valid_idx = ib.allocate("int32", (1,), name="valid_idx", scope="local")
         valid_idx[0] = 0
         with ib.for_range(0, num_anchors, name="j") as j:
             with ib.if_scope(data[i, j] >= 0):
@@ -654,6 +653,35 @@
+def _fetch_score_ir(data, score, axis):
+    """
+    Fetch score from data.
+    This routine is required for dynamic shape nms.
+    """
+    batch_size = data.shape[0]
+    num_anchors = data.shape[1]
+    elem_length = data.shape[2]
+
+    ib = tvm.tir.ir_builder.create()
+
+    data = ib.buffer_ptr(data)
+    score = ib.buffer_ptr(score)
+    with ib.if_scope(num_anchors > 0):
[GitHub] [tvm] kevinthesun commented on pull request #7117: [TOPI] Fix GPU Dynamic Op Schedule
kevinthesun commented on pull request #7117: URL: https://github.com/apache/tvm/pull/7117#issuecomment-747725450 Thanks @mbrookhart
[GitHub] [tvm] kevinthesun merged pull request #7117: [TOPI] Fix GPU Dynamic Op Schedule
kevinthesun merged pull request #7117: URL: https://github.com/apache/tvm/pull/7117
[GitHub] [tvm] u99127 commented on pull request #7122: [CI] Add ACL to the CI for AArch64
u99127 commented on pull request #7122: URL: https://github.com/apache/tvm/pull/7122#issuecomment-747703831

> > > As I indicated in the covering letter, turning on task_python_integration.sh was failing in my test run with failures with tolerances that I suspect will need more time than I have this side of the holidays. It also means we can get quicker to first getting ACL runtime tests in the CI first.

> > I must be missing the description before. Then it makes sense to me to have a separate script first until `task_python_integration.sh` fully works on AArch64 platform.

Thanks for the swift review. Yes, sorry, I don't think I was explicit enough in saying that.

Ramana
[GitHub] [tvm] tqchen commented on a change in pull request #6959: [CONTRIB] PopenPoolExecutor
tqchen commented on a change in pull request #6959: URL: https://github.com/apache/tvm/pull/6959#discussion_r545401076

## File path: python/tvm/exec/popen_worker.py
## @@ -0,0 +1,104 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# pylint: disable=invalid-name
+"""Internal PopenWorker for PopenPool."""
+import sys
+import os
+import struct
+import threading
+import traceback
+import pickle
+import cloudpickle
+
+from tvm.contrib.popen_pool import StatusKind
+
+
+class TimeoutStatus:
+    __slot__ = ["status"]
+
+    def __init__(self):
+        self.status = StatusKind.RUNNING
+
+
+def main():
+    """Main worker function"""
+    if len(sys.argv) != 3:
+        print("Usage: ")
+        return
+    if sys.platform == "win32":
+        # pylint: disable=import-outside-toplevel
+        import msvcrt
+
+        reader = os.fdopen(msvcrt.open_osfhandle(int(sys.argv[1]), os.O_BINARY), "rb")
+        writer = os.fdopen(msvcrt.open_osfhandle(int(sys.argv[2]), os.O_BINARY), "wb")
+    else:
+        reader = os.fdopen(int(sys.argv[1]), "rb")
+        writer = os.fdopen(int(sys.argv[2]), "wb")
+
+    lock = threading.Lock()
+
+    def _respond(ret_value):
+        """Send data back to the client."""
+        data = cloudpickle.dumps(ret_value, protocol=pickle.HIGHEST_PROTOCOL)
+        writer.write(struct.pack(
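The worker in the quoted file responds over a pipe by writing a serialized payload behind a fixed-size length prefix, so the reader always knows how many bytes to consume. A minimal sketch of that framing pattern follows; the `struct` format character is cut off in the quote, so the little-endian 32-bit `"<i"` prefix below is an assumption, and plain `pickle` stands in for `cloudpickle`:

```python
import pickle
import struct


def write_message(writer, value):
    # Serialize, then send the payload length ("<i" assumed: little-endian
    # 32-bit int) followed by the payload itself.
    data = pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL)
    writer.write(struct.pack("<i", len(data)))
    writer.write(data)
    writer.flush()


def read_message(reader):
    # Read exactly the advertised number of bytes and deserialize.
    (length,) = struct.unpack("<i", reader.read(4))
    return pickle.loads(reader.read(length))
```

A round trip through an in-memory buffer (`io.BytesIO`) behaves the same way as the pipe between the pool and the worker process.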
[GitHub] [tvm] areusch edited a comment on pull request #6953: Add retry to sockets on EINTR error
areusch edited a comment on pull request #6953: URL: https://github.com/apache/tvm/pull/6953#issuecomment-747679635 that is correct. this problem is not specific to sockets--it is just the only place we see it now. it may appear in other contexts, such as in cases where a GPU driver hangs and you cannot ctrl+c it. it is better to solve this broadly using the EINTR hook rather than by requiring all languages to implement sockets on the frontend. I do agree that is an approach, but it just solves the one case rather than this issue more generally. @rkimball is right in saying that Python handles retrying internally--here is how it does it: when you call e.g. `os.read`, an internal CPython function (written in C) enters a loop. Each time, it calls read, and if it gets back `EINTR`, it calls `PyErr_CheckSignals` in a normal (i.e. not a signal handler) context. It may get `EINTR` for many reasons, but one could be that a signal handler was invoked. When this happens and CPython is listening for said signal handler, it will have registered a C signal handler by calling `signal`. The OS now aborts `read` and calls this C signal handler. Notably, inside the C signal handler, the Python interpreter can't be resumed to handle the signal in Python, because the signal handler could be invoked at any time and that would be a multithreading catastrophe. So instead CPython sets an internal "signal handler pending" flag. Now two things could happen depending on the value of `siginterrupt`. If `siginterrupt` is set, then the syscall will return `EINTR`. If not, the OS will resume the syscall internally as if nothing happened. In the `EINTR` case, C functions called from Python should call `PyErr_CheckSignals`. Though it is called from a "C function call instruction," `PyErr_CheckSignals` checks if there are any pending Python-side signal handlers and jumps the Python interpreter to those Python-side signal handlers.
If one of those raises an exception, the signal handler terminates and `PyErr_CheckSignals` returns 1. It is a bit of a hack to call this function from cython/ctypes, but it worked in my prototype. Note that in the case that `siginterrupt` is 0, a C extension probably should call `PyErr_CheckSignals` if it is not merely wrapping an OS syscall, because if `SIGINT` happened and was swallowed by `siginterrupt`, the C extension does need to handle the side effect of the syscall (i.e. the data returned from os.read should go somewhere), but it should not issue *another* syscall which may block and make Ctrl+C appear to do nothing. I believe this is why `siginterrupt` is automatically set when you set a custom Python signal handler. from Python's perspective, calling TVM's C functions through cython or ctypes is *the same* as calling one of its internal syscall wrapper functions. Python is expecting us to also handle `EINTR` ourselves. so if we add this hook, we are holding up our end of the contract, and solving this not just for socket ops, but providing an easy way to handle this for e.g. GPU syscalls. Finally, new language frontends need only implement this hook, rather than solve this in each separate e.g. socket, GPU, etc case.
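The retry-with-check-signals contract areusch describes can be sketched generically. The sketch below uses hypothetical names, not TVM's actual API: a syscall-like callable is retried on EINTR, but each retry first gives the frontend a chance to run pending signal handlers (for CPython, the hook would call `PyErr_CheckSignals`), so an exception raised by a handler aborts the retry loop instead of being swallowed:

```python
import errno


def call_with_retry_on_eintr(blocking_call, check_signals):
    """Retry a syscall-like callable on EINTR, consulting a frontend hook.

    `check_signals` mirrors PyErr_CheckSignals: it runs any pending
    frontend signal handlers; if one of them raises, the exception
    propagates out of this loop instead of the call being retried.
    """
    while True:
        try:
            return blocking_call()
        except InterruptedError:
            # EINTR surfaces as InterruptedError (an OSError subclass);
            # let the frontend handle its pending signals before retrying.
            check_signals()
```

With a stub that fails twice with EINTR and then succeeds, the loop retries transparently and returns the final result.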
[GitHub] [tvm] comaniac commented on pull request #7122: [CI] Add ACL to the CI for AArch64
comaniac commented on pull request #7122: URL: https://github.com/apache/tvm/pull/7122#issuecomment-747698590

> That is the next step once this is working. Turning that on probably also needs a bit of tweaking in the infra for the tests which I'll try and fix up next. The idea was to get the docker changes in and the testing on par with what happens with the other pipeline but natively.

I see. Thanks for the clarification.

> As I indicated in the covering letter, turning on task_python_integration.sh was failing in my test run with failures with tolerances that I suspect will need more time than I have this side of the holidays. It also means we can get quicker to first getting ACL runtime tests in the CI first.

I must be missing the description before. Then it makes sense to me to have a separate script first until `task_python_integration.sh` fully works on AArch64 platform.
[GitHub] [tvm] u99127 edited a comment on pull request #7122: [CI] Add ACL to the CI for AArch64
u99127 edited a comment on pull request #7122: URL: https://github.com/apache/tvm/pull/7122#issuecomment-747688337 > LGTM to the dockerfile and build config. I expect the follow-up PR would also enable `USE_ARM_COMPUTE_LIB_GRAPH_RUNTIME` to cover ACL runtime unit tests. That is the next step once this is working. Turning that on probably also needs a bit of tweaking in the infra for the tests which I'll try and fix up next. The idea was to get the docker changes in and the testing on par with what happens with the other pipeline but natively. > On the other hand, I don't think we need to add a new task script for ACL. The current Jenkinsfile already covers the ACL unit test in L280 (currently commented out). Specifically, `task_python_integration.sh` covers all the unit tests under `tests/python/contrib`, including ACL tests. As a result, I would prefer to bring L280 back in the Jenkinsfile to enable all the tests. As I indicated in the covering letter, turning on task_python_integration.sh was failing in my test run with tolerance failures that I suspect will need more time than I have this side of the holidays. It also means we can get ACL runtime tests into the CI sooner. > Note: Docker image has to be updated after this PR. cc @tqchen regards Ramana
[GitHub] [tvm] u99127 commented on pull request #7122: [CI] Add ACL to the CI for AArch64
u99127 commented on pull request #7122: URL: https://github.com/apache/tvm/pull/7122#issuecomment-747688337 > LGTM to the dockerfile and build config. I expect the follow-up PR would also enable `USE_ARM_COMPUTE_LIB_GRAPH_RUNTIME` to cover ACL runtime unit tests. That is the next step once this is working. Turning that on probably also needs a bit of tweaking in the infra for the tests which I'll try and fix up next. The idea was to get the docker changes in and the testing on par with what happens with the other pipeline but natively. > On the other hand, I don't think we need to add a new task script for ACL. The current Jenkinsfile already covers the ACL unit test in L280 (currently commented out). Specifically, `task_python_integration.sh` covers all the unit tests under `tests/python/contrib`, including ACL tests. As a result, I would prefer to bring L280 back in the Jenkinsfile to enable all the tests. As I indicated in the covering letter, turning on task_python_integration.sh was failing in my test run with tolerance failures that I suspect will need more time than I have this side of the holidays. > Note: Docker image has to be updated after this PR. cc @tqchen regards Ramana
[GitHub] [tvm] areusch commented on pull request #6953: Add retry to sockets on EINTR error
areusch commented on pull request #6953: URL: https://github.com/apache/tvm/pull/6953#issuecomment-747679635 that is correct. this problem is not specific to sockets--it is just the only place we see it now. it may appear in other contexts, such as in cases where a GPU driver hangs and you cannot ctrl+c it. it is better to solve this broadly using the EINTR hook rather than by requiring all languages to implement sockets on the frontend. I do agree that is an approach, but it just solves the one case rather than this issue more generally. @rkimball is right in saying that Python handles retrying internally--here is how it does it: when you call e.g. `os.read`, an internal CPython function (written in C) enters a loop. Each time, it calls read, and if it gets back `EINTR`, it calls `PyErr_CheckSignals` in a normal (i.e. not a signal handler) context. It may get `EINTR` for many reasons, but one could be that a signal handler was invoked. When this happens and CPython is listening for that signal, it calls `signal` passing an internal function. Notably, inside this function, it can't resume the Python interpreter to handle the signal in Python, because the signal handler could be invoked at any time and that would be a multithreading catastrophe. So instead CPython sets an internal "signal handler pending" flag. Now two things could happen depending on the value of `siginterrupt`. If `siginterrupt` is set, then the syscall will return `EINTR`. If not, the OS will resume the syscall internally as if nothing happened. In the `EINTR` case, C functions called from Python should call `PyErr_CheckSignals`. Though it is called from a "c function call instruction," `PyErr_CheckSignals` checks if there are any pending Python-side signal handlers and jumps the Python interpreter to those Python-side signal handlers. If one of those raises an exception, the signal handler terminates and `PyErr_CheckSignals` returns 1. 
It is a bit of a hack to call this function from cython/ctypes, but it worked in my prototype. Note that in the case that `siginterrupt` is 0, a C extension probably should call `PyErr_CheckSignals` if it is not merely wrapping an OS syscall, because if `SIGINT` happened and was swallowed by `siginterrupt`, the C extension does need to handle the side effect of the syscall (i.e. the data returned from os.read should go somewhere), but it should not issue *another* syscall which may block and make Ctrl+C appear to do nothing. I believe this is why `siginterrupt` is automatically set when you set a custom Python signal handler. From Python's perspective, calling TVM's C functions through cython or ctypes is *the same* as calling one of its internal syscall wrapper functions. Python is expecting us to also handle `EINTR` ourselves. So if we add this hook, we are holding up our end of the contract, and solving this not just for socket ops, but providing an easy way to handle this for e.g. GPU syscalls. Finally, new language frontends need only implement this hook, rather than solve this in each separate e.g. socket, GPU, etc case.
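The retry contract described above can be modeled in Python (the real implementation lives in C inside CPython; all names in this sketch are illustrative): retry on `EINTR`, but first give pending Python-level signal handlers a chance to raise.

```python
import errno

def call_with_retry(syscall, check_signals):
    """Retry a syscall-like callable on EINTR, honoring pending signal handlers.

    `syscall` raises OSError(EINTR) when interrupted. `check_signals` mirrors
    PyErr_CheckSignals: it runs any pending Python-level signal handlers and
    returns True if one of them raised (meaning we must stop, not retry).
    """
    while True:
        try:
            return syscall()
        except OSError as err:
            if err.errno != errno.EINTR:
                raise
            # A signal interrupted the call: give Python-level handlers a
            # chance to run. If one raised (e.g. KeyboardInterrupt), stop
            # retrying instead of issuing another blocking call.
            if check_signals():
                raise KeyboardInterrupt

# A call that is interrupted twice before succeeding on the third attempt:
attempts = {"n": 0}
def flaky_read():
    attempts["n"] += 1
    if attempts["n"] % 3 != 0:
        raise OSError(errno.EINTR, "interrupted")
    return b"data"

assert call_with_retry(flaky_read, check_signals=lambda: False) == b"data"
```

When `check_signals` reports a pending exception, the loop stops instead of blocking again, which is exactly the "Ctrl+C appears to do nothing" failure mode being avoided.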
[GitHub] [tvm] tkonolige commented on a change in pull request #7107: [Tutorial] Add output validation to sparse tutorial
tkonolige commented on a change in pull request #7107: URL: https://github.com/apache/tvm/pull/7107#discussion_r545374179 ## File path: tests/scripts/task_ci_python_setup.sh ## @@ -31,3 +31,4 @@ set -o pipefail echo "Addtiional setup in" ${CI_IMAGE_NAME} python3 -m pip install --user tlcpack-sphinx-addon==0.1.3 synr==0.2.1 +python3 -m pip install --user tokenizers==0.9.4 transformers==4.0.1 Review comment: Is this necessary?
[GitHub] [tvm] tqchen commented on pull request #6953: Add retry to sockets on EINTR error
tqchen commented on pull request #6953: URL: https://github.com/apache/tvm/pull/6953#issuecomment-747663520 I believe @areusch's point is that we can register a special function, e.g. `PyErr_CheckSignals` (via ctypes or cython), to check for a pending Python error. @areusch perhaps you can elaborate? So it behaves along the same lines as a Python socket.
[GitHub] [tvm] rkimball commented on pull request #6953: Add retry to sockets on EINTR error
rkimball commented on pull request #6953: URL: https://github.com/apache/tvm/pull/6953#issuecomment-747661773 The problem with python is that it handles signals internally. With PackedFunc we can't properly wrap the tvm socket calls with signal handlers to make them operate the way that native python sockets operate. Another alternative is to wrap the c++ rpc server in python, not using FFI. Within the wrapper we could handle signals and make the python wrapped RPC server operate just like other python network functions. What do other languages like Rust do with signals? Is there an issue with Rust?
[GitHub] [tvm] codeislife99 opened a new pull request #7125: Sparse reshape op
codeislife99 opened a new pull request #7125: URL: https://github.com/apache/tvm/pull/7125 Thanks for contributing to TVM! Please refer to guideline https://tvm.apache.org/docs/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from [Reviewers](https://github.com/apache/incubator-tvm/blob/master/CONTRIBUTORS.md#reviewers) by @ them in the pull request thread.
[GitHub] [tvm] comaniac commented on pull request #7070: Add autoscheduler support to tvmc
comaniac commented on pull request #7070: URL: https://github.com/apache/tvm/pull/7070#issuecomment-747655979 Thanks @giuseros @leandron @insop.
[tvm] branch main updated (829be98 -> fb8de5a)
This is an automated email from the ASF dual-hosted git repository. comaniac pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git. from 829be98 [CI][ACL] Switched to ACL 20.11 (#7106) add fb8de5a Add autoscheduler support to tvmc (#7070) No new revisions were added by this update. Summary of changes: python/tvm/driver/tvmc/autotuner.py | 259 ++--- python/tvm/driver/tvmc/compiler.py | 29 ++- tests/python/driver/tvmc/test_autoscheduler.py | 101 ++ tests/python/driver/tvmc/test_autotuner.py | 2 +- 4 files changed, 353 insertions(+), 38 deletions(-) create mode 100644 tests/python/driver/tvmc/test_autoscheduler.py
[GitHub] [tvm] comaniac merged pull request #7070: Add autoscheduler support to tvmc
comaniac merged pull request #7070: URL: https://github.com/apache/tvm/pull/7070
[GitHub] [tvm] tqchen edited a comment on pull request #6953: Add retry to sockets on EINTR error
tqchen edited a comment on pull request #6953: URL: https://github.com/apache/tvm/pull/6953#issuecomment-747652309 @rkimball I agree that switching to a python socket is another way (e.g. creating a customized channel that goes through a python socket). The main reason to push the callback approach is that we may want the same libtvm runtime to function with and without Python, so a handler function is slightly easier.
[GitHub] [tvm] tqchen commented on pull request #6953: Add retry to sockets on EINTR error
tqchen commented on pull request #6953: URL: https://github.com/apache/tvm/pull/6953#issuecomment-747652309 @rkimball I agree that switching to python socket is another way (e.g. creating a customize channel), the main reason to push the callback approach is that we may want the same libtvm runtime to function with and without the python. So ideally having a handler function is slightly easier.
[GitHub] [tvm] rkimball commented on pull request #6953: Add retry to sockets on EINTR error
rkimball commented on pull request #6953: URL: https://github.com/apache/tvm/pull/6953#issuecomment-747650777 I think if we really want a python RPC server then we should write it using python socket calls which already properly handle signals. The problem we have is that we are calling c++ socket calls via PackedFunc and those c++ calls are not specifically designed to handle python. If we call python socket calls like os.recv then we should have no issues.
[GitHub] [tvm] areusch commented on pull request #6953: Add retry to sockets on EINTR error
areusch commented on pull request #6953: URL: https://github.com/apache/tvm/pull/6953#issuecomment-74764 the only thing I would change is "default to nullptr **and retry**." By default, I think we do want to retry interrupted syscalls; we should stop only when a non-traditional signal handler, such as a Python-based SIGINT handler, wants us to exit. Traditionally, the C SIGINT handler would call abort() in such cases, which means we don't need to exit by default on an interrupted syscall. The rest makes sense to me!
[GitHub] [tvm] tqchen commented on pull request #6953: Add retry to sockets on EINTR error
tqchen commented on pull request #6953: URL: https://github.com/apache/tvm/pull/6953#issuecomment-747647080 I put some time into thinking about this; thanks to @rkimball @areusch for great discussions. Here is one possible solution: - Add a TypedPackedFunc field `eintr_handle` to the Socket, which can be called to check whether we should retry when facing EINTR; default to nullptr and no retry. - Set up a global PackedFunc registry `runtime.socket.SignalHandler` that returns a PackedFunc to set into the socket; call it during socket construction in the RPC server. - Register it when we are in a Python env during initialization. Let me know if that makes sense.
[GitHub] [tvm] tqchen commented on pull request #7084: [TIR] Support Return in TIR
tqchen commented on pull request #7084: URL: https://github.com/apache/tvm/pull/7084#issuecomment-747644045 Thanks @tkonolige , I agree, and perhaps https://github.com/apache/tvm/blob/main/python/tvm/tir/op.py is a better place given other intrinsics are in there
[GitHub] [tvm] ZihengJiang commented on pull request #7084: [TIR] Support Return in TIR
ZihengJiang commented on pull request #7084: URL: https://github.com/apache/tvm/pull/7084#issuecomment-747634376 @tkonolige Happy to address your comment, and I hope it helps you understand TIR. Yes, before this PR a zero value would usually be returned, and now a value can be explicitly returned.
[GitHub] [tvm] tkonolige commented on pull request #7084: [TIR] Support Return in TIR
tkonolige commented on pull request #7084: URL: https://github.com/apache/tvm/pull/7084#issuecomment-747632715 I'm just trying to understand things better. There's not really any documentation on TIR. I had thought that the last statement in every TIR expression was implicitly returned. Am I correct in saying that before this PR calling a TIR function would always return nullptr/None? And that now you can actually return a value?
[GitHub] [tvm] ZihengJiang commented on pull request #7084: [TIR] Support Return in TIR
ZihengJiang commented on pull request #7084: URL: https://github.com/apache/tvm/pull/7084#issuecomment-747613626 @tkonolige This will generate a function that only evaluates but does not return:
```python
def func(a, b):
    c = a + b
    # return c
```
Thinking it through and trying it yourself would help in understanding the PR.
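The Python analogy can be made concrete: without an explicit `return`, the computed value is discarded and the function yields `None`, which mirrors a TIR function that only evaluates an expression.

```python
def evaluate_only(a, b):
    c = a + b  # value is computed, then discarded: the function returns None

def with_return(a, b):
    c = a + b
    return c   # value is explicitly returned to the caller

assert evaluate_only(1.0, 2.0) is None
assert with_return(1.0, 2.0) == 3.0
```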
[GitHub] [tvm] tkonolige commented on a change in pull request #7107: [Tutorial] Add output validation to sparse tutorial
tkonolige commented on a change in pull request #7107: URL: https://github.com/apache/tvm/pull/7107#discussion_r545294924 ## File path: tutorials/frontend/deploy_sparse.py ## @@ -119,6 +117,9 @@ # determines how sparse the generated weights should be. The higher # the sparsity, the faster the result. sparsity = 0.85 +# Running benchmarking mode might overload CI, +# so it is disabled by default. +benchmark = False Review comment: Just benchmark
[GitHub] [tvm] junrushao1994 commented on pull request #7124: Fix spelling in some comments
junrushao1994 commented on pull request #7124: URL: https://github.com/apache/tvm/pull/7124#issuecomment-747604778 yeah there are too many typos in the codebase...
[GitHub] [tvm] tkonolige commented on pull request #7084: [TIR] Support Return in TIR
tkonolige commented on pull request #7084: URL: https://github.com/apache/tvm/pull/7084#issuecomment-747598083 I'm a little confused on how we use return in TIR. How is the example you give different if we just remove the return?
```python
a = tir.Var("a", "float32")
b = tir.Var("b", "float32")
c = a + b
# c = tir.call_intrin("float32", "tir.myreturn", c)
c = tir.Evaluate(c)  # doesn't this return a + b?
```
[GitHub] [tvm] mbrookhart commented on a change in pull request #7120: [PatternLang] Add Syntatic Sugar to the C++ pattern API and support DataType Attribute Matching
mbrookhart commented on a change in pull request #7120: URL: https://github.com/apache/tvm/pull/7120#discussion_r545285509 ## File path: include/tvm/relay/dataflow_pattern.h ## @@ -46,6 +48,29 @@ class DFPatternNode : public Object { */ class DFPattern : public ObjectRef { public: + /*! \brief Syntatic Sugar for creating a CallPattern */ DFPattern operator()(const std::vector& args); + /*! \brief Syntatic Sugar for creating a CallPattern with an "add" op */ Review comment: Hmm. I did the +-*/ overloads in python because Relay's IR does the same thing at the python level, and I wanted to keep operation on the pattern API in line with Relay's API. I continued that here, trying to keep the C++ pattern API in line with the Python Pattern API. So...for consistency I'd argue to leave it as it is.
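The consistency argument can be illustrated with a toy sketch of how such operator sugar desugars; all class and op names here are illustrative stand-ins, not the actual TVM pattern API.

```python
class DFPattern:
    """Base pattern: calling it, or using +/*, builds a CallPattern."""
    def __call__(self, *args):
        return CallPattern(self, list(args))
    def __add__(self, other):
        # Sugar: a + b desugars to CallPattern(add, [a, b])
        return OpPattern("add")(self, other)
    def __mul__(self, other):
        return OpPattern("multiply")(self, other)

class OpPattern(DFPattern):
    def __init__(self, name):
        self.name = name

class CallPattern(DFPattern):
    def __init__(self, op, args):
        self.op, self.args = op, args

# Stand-ins for leaf patterns (wildcards in the real API):
x, y = OpPattern("x"), OpPattern("y")
p = x + y * x  # builds add(x, multiply(y, x)) via the overloads
assert isinstance(p, CallPattern) and p.op.name == "add"
assert isinstance(p.args[1], CallPattern) and p.args[1].op.name == "multiply"
```

Because every overload funnels through `__call__`, testing the call path exercises all of the `+ - * /` sugar at once, which is the point made in the next comment about test coverage.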
[GitHub] [tvm] mbrookhart commented on a change in pull request #7120: [PatternLang] Add Syntatic Sugar to the C++ pattern API and support DataType Attribute Matching
mbrookhart commented on a change in pull request #7120: URL: https://github.com/apache/tvm/pull/7120#discussion_r545282998 ## File path: tests/cpp/dataflow_pattern_test.cc ## @@ -0,0 +1,200 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +#include +#include +#include + Review comment: I didn't add one because all of the operator +-*/ ops go through it.
[GitHub] [tvm] ANSHUMAN87 commented on a change in pull request #7107: [Tutorial] Add output validation to sparse tutorial
ANSHUMAN87 commented on a change in pull request #7107: URL: https://github.com/apache/tvm/pull/7107#discussion_r545279781 ## File path: tutorials/frontend/deploy_sparse.py ## @@ -119,6 +117,9 @@ # determines how sparse the generated weights should be. The higher # the sparsity, the faster the result. sparsity = 0.85 +# Running benchmarking mode might overload CI, +# so it is disabled by default. +benchmark = False Review comment: Okay, I agree with your point, thanks for the clarification! Just to confirm: should we move all the config or only the benchmark config to the bottom?
[GitHub] [tvm] rkimball opened a new pull request #7124: Fix spelling in some comments
rkimball opened a new pull request #7124: URL: https://github.com/apache/tvm/pull/7124 Rather than have these tag along in a larger PR, I just made this simple PR.
[GitHub] [tvm] tkonolige commented on a change in pull request #7107: [Tutorial] Add output validation to sparse tutorial
tkonolige commented on a change in pull request #7107: URL: https://github.com/apache/tvm/pull/7107#discussion_r545271417 ## File path: tutorials/frontend/deploy_sparse.py ## @@ -119,6 +117,9 @@ # determines how sparse the generated weights should be. The higher # the sparsity, the faster the result. sparsity = 0.85 +# Running benchmarking mode might overload CI, +# so it is disabled by default. +benchmark = False Review comment: All the other tutorials put it at the bottom, so we should be consistent. It also makes sense for the reader. The last step they want to do is run everything.
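The flag under discussion gates an opt-in benchmarking step so CI stays fast; a simplified, hypothetical sketch of the pattern (not the tutorial's actual code):

```python
# Running benchmarking mode might overload CI, so it is disabled by default.
benchmark = False

def run_tutorial(benchmark=False):
    # Correctness validation always runs; the timing pass is opt-in.
    steps = ["build", "run", "validate"]
    if benchmark:
        steps.append("benchmark")
    return steps

assert run_tutorial() == ["build", "run", "validate"]
assert run_tutorial(benchmark=True)[-1] == "benchmark"
```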
[GitHub] [tvm] tqchen edited a comment on pull request #7086: [Relay][Op] Consolidate reshape and reverse_reshape operators.
tqchen edited a comment on pull request #7086: URL: https://github.com/apache/tvm/pull/7086#issuecomment-747572747 As a rule of thumb (standardization over flexibility), we should always keep an API definition consistent with the numpy API when there is a related case in numpy, even though it could mean additional duplication, per https://tvm.apache.org/docs/contribute/code_review.html#deliberate-on-api-and-data-structures We can, however, introduce additional attrs to reverse reshape given it is non-standard. I understand that there could be inconvenience here, but the benefits offered by standardization outweigh the slight additional eng cost.
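For reference, the numpy `reshape` semantics being standardized on (including the single inferred `-1` dimension) behave like this:

```python
import numpy as np

x = np.arange(12)
assert np.reshape(x, (3, 4)).shape == (3, 4)
# At most one -1 is allowed; numpy infers it from the total element count.
assert np.reshape(x, (3, -1)).shape == (3, 4)
# reverse_reshape (inferring dimensions from the right) has no numpy
# counterpart, which is why non-standard attrs can live there instead.
```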
[GitHub] [tvm] mbrookhart opened a new pull request #7123: Parallelize cumsum in get_valid_counts
mbrookhart opened a new pull request #7123: URL: https://github.com/apache/tvm/pull/7123 As a followup to #6839, this parallelizes the cumsum in get_valid_counts using an upsweep/downsweep tree-based prefix sum algorithm, similar to what I did in #7099. On my 1070 Ti, testing deploy_ssd_gluoncv.py, I previously reported that get_valid_counts took 3674.62 microseconds; this reduces that to 495.8 microseconds. @masahi has expressed interest in implementing a more general prefix scan for other ops; as future work I expect we'll refactor this and do possible cache optimization. Thanks cc @Laurawly @zhiics @kevinthesun
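The upsweep/downsweep (Blelloch) prefix sum mentioned above can be sketched sequentially in Python; the GPU kernel runs each tree level in parallel, while this simplified reference version assumes a power-of-two input length:

```python
def exclusive_scan(data):
    """Blelloch work-efficient exclusive prefix sum (power-of-two length)."""
    a = list(data)
    n = len(a)
    # Upsweep: build partial sums up the binary tree.
    d = 1
    while d < n:
        for i in range(0, n, 2 * d):
            a[i + 2 * d - 1] += a[i + d - 1]
        d *= 2
    # Downsweep: clear the root, then push prefixes back down the tree.
    a[n - 1] = 0
    d = n // 2
    while d >= 1:
        for i in range(0, n, 2 * d):
            left = a[i + d - 1]
            a[i + d - 1] = a[i + 2 * d - 1]
            a[i + 2 * d - 1] += left
        d //= 2
    return a

assert exclusive_scan([3, 1, 7, 0, 4, 1, 6, 3]) == [0, 3, 4, 11, 11, 15, 16, 22]
```

Each while-loop level touches independent pairs, which is what makes the tree traversal parallelizable across GPU threads.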
[GitHub] [tvm] tqchen commented on pull request #7086: [Relay][Op] Consolidate reshape and reverse_reshape operators.
tqchen commented on pull request #7086: URL: https://github.com/apache/tvm/pull/7086#issuecomment-747572747 As a rule of thumb, we should always keep reshape consistent with the numpy API. We can, however, introduce additional attrs for reverse reshape given it is non-standard.
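For context on why reverse reshape is non-standard: numpy resolves special shape values left-to-right, while a reverse reshape matches them right-to-left. Below is a hedged sketch of the shape-inference rule following MXNet-style semantics (0 copies the corresponding input dimension, a single -1 is inferred); `reshape_infer` is a hypothetical helper for illustration, not a TVM API.

```python
import math

def _infer(in_shape, newshape):
    # 0 copies the positionally matching input dimension;
    # a single -1 absorbs whatever extent remains.
    out = [in_shape[i] if d == 0 else d for i, d in enumerate(newshape)]
    if -1 in out:
        known = math.prod(d for d in out if d != -1)
        out[out.index(-1)] = math.prod(in_shape) // known
    return tuple(out)

def reshape_infer(in_shape, newshape, reverse=False):
    """Resolve special values left-to-right (numpy-style), or
    right-to-left when reverse=True, by reversing both shapes,
    applying the same rule, and reversing the result."""
    if reverse:
        return _infer(in_shape[::-1], newshape[::-1])[::-1]
    return _infer(in_shape, newshape)
```

For instance, with input shape (10, 5, 4) and newshape (-1, 0), the forward rule yields (40, 5) while the reverse rule yields (50, 4), so the two behaviors cannot share a single attribute-free definition.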
[GitHub] [tvm] u99127 opened a new pull request #7122: Add ACL testing to the CI for AArch64.
u99127 opened a new pull request #7122: URL: https://github.com/apache/tvm/pull/7122 Add testing for ACL to the CI for AArch64. A follow-up PR will add this to the Jenkinsfile once the docker changes land. We also need a separate script to run the tests, as the full python integration tests are currently broken. [I've tried this on an AArch64 Linux machine and it appears to work ok.] I think we'll need to bake a new docker image before the Jenkinsfile changes land. @zhiics @comaniac @lhutton1
[GitHub] [tvm] tqchen merged pull request #7106: [CI][ACL] Switched to ACL 20.11
tqchen merged pull request #7106: URL: https://github.com/apache/tvm/pull/7106
[tvm] branch main updated (4060b4f -> 829be98)
This is an automated email from the ASF dual-hosted git repository. tqchen pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git. from 4060b4f [BYOC] Added "include_non_call_ops" parameter to AnnotateTarget pass (#6655) add 829be98 [CI][ACL] Switched to ACL 20.11 (#7106) No new revisions were added by this update. Summary of changes: docker/install/ubuntu_install_arm_compute_lib.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
[GitHub] [tvm] d-smirnov edited a comment on pull request #7121: [BYOC] [ACL] include_non_call_ops = False
d-smirnov edited a comment on pull request #7121: URL: https://github.com/apache/tvm/pull/7121#issuecomment-747512494 cc @mbaret
[GitHub] [tvm] d-smirnov commented on pull request #7121: [BYOC] [ACL] include_non_call_ops = False
d-smirnov commented on pull request #7121: URL: https://github.com/apache/tvm/pull/7121#issuecomment-747512494 @mbaret
[GitHub] [tvm] d-smirnov opened a new pull request #7121: [BYOC] [ACL] include_non_call_ops = False
d-smirnov opened a new pull request #7121: URL: https://github.com/apache/tvm/pull/7121 This PR allows the ACL codegen to use the AnnotateTarget pass with include_non_call_ops = False, preventing non-call ops from being promoted to the target of their arguments. A Squeezenet unit test is added. This PR resolves https://discuss.tvm.apache.org/t/arm-compute-library-segv-with-inception-v1-squeezenet/7985/2
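To illustrate what include_non_call_ops controls, here is a deliberately simplified toy model of target annotation on a straight-line graph. Everything here (the node encoding, the `annotate_targets` function, the "acl"/"default" labels) is a hypothetical sketch for exposition, not the Relay AnnotateTarget implementation.

```python
def annotate_targets(nodes, supported, include_non_call_ops=True):
    """Toy model of target annotation for a straight-line graph.

    `nodes` is a list of ("call", op_name) or ("tuple", None) entries
    in dataflow order, each consuming the previous node. Calls get the
    "acl" target when their op is supported; non-call nodes (tuples
    here) inherit the target of their argument only when
    include_non_call_ops is True, otherwise they stay on the default
    target.
    """
    targets = []
    for kind, name in nodes:
        if kind == "call":
            targets.append("acl" if name in supported else "default")
        elif include_non_call_ops and targets:
            # Promoted to its argument's target even though it is not a call.
            targets.append(targets[-1])
        else:
            targets.append("default")
    return targets
```

In this model, a tuple sitting between a supported call and an unsupported consumer is pulled onto the "acl" target by default, but stays on the default target when include_non_call_ops=False, which mirrors the behavior the PR relies on.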
[GitHub] [tvm] u99127 commented on pull request #7106: [CI][ACL] Switched to ACL 20.11
u99127 commented on pull request #7106: URL: https://github.com/apache/tvm/pull/7106#issuecomment-747487390 > can we enable the runtime tests since I already added an ARM instance to the CI? I'll follow up with a separate PR to add this to the AArch64 CI instance.
[GitHub] [tvm] d-smirnov commented on pull request #7106: [CI][ACL] Switched to ACL 20.11
d-smirnov commented on pull request #7106: URL: https://github.com/apache/tvm/pull/7106#issuecomment-747487250 @tqchen Bump