TVM deals with these in the Relay IR directly. For example, the IR with NCHW16c
and NCHW4c may look like:
```
%1 = nn.conv2d(...) // output layout: NCHW16c
%2 = layout_transform(%1, "NCHW4c") // output layout: NCHW4c
...
```
When compiling the above IR, `layout_transform` is just an operator like
`conv2d`, so `%1` and `%2` are individual tensors. As a result, the runtime
only needs to execute the compiled graph/bytecode and doesn't have to worry
about layout transforms.
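As a concrete illustration (this is plain NumPy, not TVM code), here is a minimal sketch of what the data movement behind such a `layout_transform` amounts to, for NCHW → NCHW4c, assuming the channel count divides evenly by the block size:

```python
import numpy as np

# NCHW -> NCHW4c: split the channel axis into blocks of 4 and move the
# sub-channel axis innermost, i.e. (N, C, H, W) -> (N, C//4, H, W, 4).
def to_nchw4c(x):
    n, c, h, w = x.shape
    assert c % 4 == 0, "C must be divisible by the block size"
    return x.reshape(n, c // 4, 4, h, w).transpose(0, 1, 3, 4, 2)

x = np.arange(2 * 8 * 3 * 3).reshape(2, 8, 3, 3)
y = to_nchw4c(x)
print(y.shape)  # (2, 2, 3, 3, 4)
```

Element `x[n, c, h, w]` ends up at `y[n, c // 4, h, w, c % 4]`, which is exactly the blocked-channel interpretation of the `NCHW4c` layout string.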
Weights can be handled in the same way, but for model inference, where the
weights are already constants, we usually simplify/fold the layout transform:
```
def @main(%data) {
%1 = layout_transform(%const[0], "target_layout"); // %const[0] is the weights
%2 = nn.conv2d(%data, %1);
...
}
```
becomes:
```
def @main(%data) {
%1 = nn.conv2d(%data, %const[0]); // %const[0] is the weights in target_layout
...
}
```
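The folding step can be sketched as a toy compile-time pass. This is not the real Relay `FoldConstant` implementation; the `Var`/`Const`/`Call` classes and the `pack_o4` transform are simplified stand-ins to show the idea of pre-transforming constant weights so the runtime graph contains no `layout_transform` for them:

```python
import numpy as np
from dataclasses import dataclass

# Toy IR nodes (simplified stand-ins, not the Relay classes).
@dataclass
class Var:
    name: str

@dataclass
class Const:
    data: np.ndarray

@dataclass
class Call:
    op: str
    args: list

def pack_o4(w):
    # Stand-in for a weight layout transform: block axis 0 by 4
    # and move the sub-block axis innermost.
    return np.moveaxis(w.reshape(w.shape[0] // 4, 4, *w.shape[1:]), 1, -1)

def fold_constants(expr):
    """Replace layout_transform(Const) with a pre-transformed Const,
    doing the data movement once at compile time."""
    if isinstance(expr, Call):
        args = [fold_constants(a) for a in expr.args]
        if expr.op == "layout_transform" and isinstance(args[0], Const):
            return Const(pack_o4(args[0].data))  # folded away
        return Call(expr.op, args)
    return expr

weights = Const(np.ones((8, 16, 3, 3)))  # pretend OIHW conv weights
ir = Call("nn.conv2d", [Var("data"), Call("layout_transform", [weights])])
folded = fold_constants(ir)
print(type(folded.args[1]).__name__)  # Const
```

After folding, the conv2d's weight argument is a plain constant in the target layout, matching the second IR snippet above: the runtime sees no transform at all.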