Lunderberg commented on issue #13508:
URL: https://github.com/apache/tvm/issues/13508#issuecomment-1330823429

   Looks like the performance degradation is from `RemoveNoOp`.  Even though 
the data-flow analysis is disabled by default, the analyzer of 
`IRMutatorWithAnalyzer` still collects scoped information.  Simplifications 
done by that analyzer don't show up in the output TIR unless they are used to 
prove a statement to be a no-op (e.g. by having a negative loop extent), but 
they still cost time to compute.
   
   It looks like a quick fix may be to disable 
`arith::RewriteSimplifier::kApplyConstraintsToBooleanBranches`, which is 
currently enabled for the analyzer in `RemoveNoOp`.  Disabling it restored the 
performance in this test case.  Can you check if it also improves the 
performance on your side by removing `kApplyConstraintsToBooleanBranches` from 
[this 
line](https://github.com/apache/tvm/blob/main/src/tir/transforms/remove_no_op.cc#L309)?
   
   I'm continuing to investigate, to see if this should be disabled, or if 
something else is wrong with the simplifications.  The lowered TIR has a lot 
of expressions that I would expect to be simplified.  For example, the first 
`@tir.floor` in the if condition is `tir.floor((((cast(float32, 
n0_n0_k2.shifted.shifted) + 0.5f32)*0.5f32) - 0.5f32), dtype=float32)`, which 
is equivalent to `floordiv(n0_n0_k2.shifted.shifted - 1, 2)` for an integer 
`n0_n0_k2.shifted.shifted`.
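
   As a quick sanity check of that equivalence, here is a minimal sketch in 
plain Python rather than TIR.  The function names `float_form` and `int_form` 
are made up for illustration, Python floats are float64 rather than float32, 
and only a small range of integers is tested, so this is evidence for the 
claim and not a proof.

```python
import math


def float_form(x: int) -> int:
    # Mirrors floor(((float(x) + 0.5) * 0.5) - 0.5) from the lowered TIR.
    # Assumption: float64 stands in for float32; both are exact for small x.
    return math.floor((float(x) + 0.5) * 0.5 - 0.5)


def int_form(x: int) -> int:
    # Mirrors floordiv(x - 1, 2); Python's // also rounds toward -infinity.
    return (x - 1) // 2


# Collect every integer in the tested range where the two forms disagree.
mismatches = [x for x in range(-1000, 1001) if float_form(x) != int_form(x)]
print(mismatches)
```

   The float expression reduces to `floor(x/2 - 0.25)`, which is `k - 1` for 
`x = 2k` and `k` for `x = 2k + 1`, the same values as `floordiv(x - 1, 2)`.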

