According to the comment in `include/vta/hw_spec.h` and the hardware
implementations (simulation, Xilinx, etc.), if `use_imm` is `True`, the input
index is the same as the output index, i.e., both are `dst_idx`.
```
// include/vta/hw_spec.h
 * // Perform ALU operation
 * if (use_imm) {
 *   acc_mem[dst_idx] = alu_op(alu_opcode, acc_mem[dst_idx], imm);
 * } else {
 *   acc_mem[dst_idx] = alu_op(alu_opcode, acc_mem[dst_idx], acc_mem[src_idx]);
 * }
```
However, I think this does not hold in some cases, e.g., when the input
variable is referenced more than once. In that case the input value must be
preserved, because a later op still needs it.
For example, suppose I have the following fused operation to run on VTA:
```
1 b = right_shift(a, 5) // use_imm = True; a@acc_mem[idx_0], b@acc_mem[idx_0]
2 c = mul(b, 5) // use_imm = True; c@acc_mem[idx_1]
3 d = max(b, c) // use_imm = False; d@acc_mem[idx_0]
```
In line 2, even though `use_imm` = True, the indexes of the input (i.e., `b`)
and the output (i.e., `c`) have to be different, because the `max` op in line 3
references both `b` and `c`. In other words, `b` is referenced twice (in line 2
and line 3), so line 2 cannot update `b`'s data in place.
So in the above example, even with `use_imm` = True, the ALU has to read its
input from `src_idx`.
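To make the difference concrete, here is a minimal Python sketch of the three-op sequence above. It is a toy model, not VTA code: `alu`, `run_fused`, and the `imm_reads_src` flag are hypothetical names I made up for illustration, and `acc_mem` is just a two-element list. With `imm_reads_src=False` the immediate path reads `acc_mem[dst_idx]`, as the `hw_spec.h` comment describes; with `imm_reads_src=True` it reads `acc_mem[src_idx]`, as proposed here.

```python
# Toy model of the ALU semantics discussed above (illustrative names only).
ALU_OPS = {
    "shr": lambda x, y: x >> y,
    "mul": lambda x, y: x * y,
    "max": max,
}


def alu(acc_mem, opcode, dst_idx, src_idx, use_imm, imm=0, imm_reads_src=False):
    """Apply one ALU instruction, writing the result to acc_mem[dst_idx]."""
    op = ALU_OPS[opcode]
    if use_imm:
        # Documented behaviour reads dst_idx; the proposed fix reads src_idx.
        lhs = acc_mem[src_idx] if imm_reads_src else acc_mem[dst_idx]
        acc_mem[dst_idx] = op(lhs, imm)
    else:
        acc_mem[dst_idx] = op(acc_mem[dst_idx], acc_mem[src_idx])


def run_fused(a, imm_reads_src):
    """Run b = a >> 5; c = b * 5; d = max(b, c) and return d."""
    idx_0, idx_1 = 0, 1
    acc_mem = [a, 0]  # idx_1 holds unrelated data (0 in this sketch)
    alu(acc_mem, "shr", idx_0, idx_0, True, 5, imm_reads_src)   # b -> idx_0
    alu(acc_mem, "mul", idx_1, idx_0, True, 5, imm_reads_src)   # c -> idx_1
    alu(acc_mem, "max", idx_0, idx_1, False, 0, imm_reads_src)  # d -> idx_0
    return acc_mem[idx_0]


print(run_fused(64, imm_reads_src=True))   # input read from src_idx
print(run_fused(64, imm_reads_src=False))  # input read from dst_idx
```

For `a = 64` the expected result is `max(2, 10) = 10`. In this sketch only the `src_idx` variant produces it; the `dst_idx` variant multiplies whatever was already stored at `idx_1` and returns 2.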
The existing codebase and the ResNet models do not contain op sequences like
the one above. So whenever `use_imm = True`, we see `dst_idx = src_idx`, and the
bug is not triggered. For other, customized models, however, that may not be
the case.
What do you think? @thierry @liangfu and others?