Lunderberg commented on issue #13508: URL: https://github.com/apache/tvm/issues/13508#issuecomment-1330823429
Looks like the performance degradation comes from `RemoveNoOp`. Even though the data-flow analysis is disabled by default, the analyzer of `IRMutatorWithAnalyzer` still collects scoped information. Simplifications done by that analyzer don't show up in the output TIR unless they are used to prove that a statement is a no-op (e.g. by having a negative loop extent), but they still cost time while the pass runs.

A quick fix may be to disable `arith::RewriteSimplifier::kApplyConstraintsToBooleanBranches`, which is currently enabled for the analyzer in `RemoveNoOp`. Disabling it restored the performance in this test case. Can you check whether removing `kApplyConstraintsToBooleanBranches` from [this line](https://github.com/apache/tvm/blob/main/src/tir/transforms/remove_no_op.cc#L309) also improves the performance on your side? I'm continuing to investigate whether this flag should be disabled or whether something else is wrong with the simplifications.

The lowered TIR has a lot of expressions that I would expect to be simplified. For example, the first `@tir.floor` in the if condition is `tir.floor((((cast(float32, n0_n0_k2.shifted.shifted) + 0.5f32)*0.5f32) - 0.5f32), dtype=float32)`, which for an integer `n0_n0_k2.shifted.shifted` is equivalent to `floordiv(n0_n0_k2.shifted.shifted - 1, 2)`.
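
As a sanity check on that last equivalence, here is a minimal plain-Python sketch (not TVM code). The function names `float_form`, `floordiv_form` and `find_mismatches` are made up for illustration. It uses Python's double-precision floats instead of float32, which should make no difference for the small integers tested here because every intermediate value is exactly representable in both.

```python
import math


def float_form(x: int) -> int:
    """Mirror of floor((float(x) + 0.5) * 0.5 - 0.5) from the lowered TIR."""
    return math.floor((float(x) + 0.5) * 0.5 - 0.5)


def floordiv_form(x: int) -> int:
    """Integer form floordiv(x - 1, 2); Python's // rounds toward -inf."""
    return (x - 1) // 2


def find_mismatches(lo: int, hi: int) -> list:
    """Return every integer in [lo, hi] where the two forms disagree."""
    return [x for x in range(lo, hi + 1) if float_form(x) != floordiv_form(x)]


if __name__ == "__main__":
    mismatches = find_mismatches(-1000, 1000)
    print("mismatches:", mismatches)
```

Algebraically, `(x + 0.5) * 0.5 - 0.5` is `x/2 - 0.25`, so the floor is `k - 1` for even `x = 2k` and `k` for odd `x = 2k + 1`, which matches `floordiv(x - 1, 2)` in both cases.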