locnd182644 opened a new issue, #18698:
URL: https://github.com/apache/tvm/issues/18698

   ### Description
   The ONNX frontend incorrectly maps operators to their implementation versions. Specifically, the frontend can select an operator implementation with a higher version than the model's opset.
   
   ### Steps to Reproduce
   - Graph: ReduceMean Node
   <img width="1000" height="500" alt="Image" src="https://github.com/user-attachments/assets/79dac965-7d62-468b-803f-3797b0b8d682" />
   
   - Converting the model with tvm.relax.frontend.onnx.from_onnx (letting the opset be taken from the model) produces a Relax module with the wrong output shape.
   
   ### Expected behavior
   The Relax mean op should have axis=[2] and output shape (1, 68, 18):
   ```
   . . .

   lv90: R.Tensor((1, 516, 4, 18), dtype="float32") = R.concat((lv86, lv87, lv88, lv89), axis=1)
   lv91: R.Tensor((1, 68, 4, 18), dtype="float32") = R.nn.conv2d(lv90, metadata["relax.expr.Constant"][30], strides=[1, 1], padding=[0, 0, 0, 0], dilation=[1, 1], groups=1, data_layout="NCHW", kernel_layout="OIHW", out_layout="NCHW", out_dtype="void")
   lv92: R.Tensor((1, 68, 1, 1), dtype="float32") = R.reshape(metadata["relax.expr.Constant"][31], R.shape([1, 68, 1, 1]))
   lv93: R.Tensor((1, 68, 4, 18), dtype="float32") = R.add(lv91, lv92)
   gv: R.Tensor((1, 68, 18), dtype="float32") = R.mean(lv93, axis=[2], keepdims=False)
   R.output(gv)
   ```
   
   ### Actual behavior
   The Relax mean op has axis=None and output shape ():
   ```
   . . .

   lv90: R.Tensor((1, 516, 4, 18), dtype="float32") = R.concat((lv86, lv87, lv88, lv89), axis=1)
   lv91: R.Tensor((1, 68, 4, 18), dtype="float32") = R.nn.conv2d(lv90, metadata["relax.expr.Constant"][30], strides=[1, 1], padding=[0, 0, 0, 0], dilation=[1, 1], groups=1, data_layout="NCHW", kernel_layout="OIHW", out_layout="NCHW", out_dtype="void")
   lv92: R.Tensor((1, 68, 1, 1), dtype="float32") = R.reshape(metadata["relax.expr.Constant"][31], R.shape([1, 68, 1, 1]))
   lv93: R.Tensor((1, 68, 4, 18), dtype="float32") = R.add(lv91, lv92)
   gv: R.Tensor((), dtype="float32") = R.mean(lv93, axis=None, keepdims=False)
   R.output(gv)
   ```
   
   ### Debug
   
   <img width="600" height="41" alt="Image" src="https://github.com/user-attachments/assets/211a6a8c-24ae-4585-b9ac-4a3f7fc9acb9" />
   
   - In the get_converter method of OnnxOpConverter, the version-mapping algorithm takes the highest implemented version among all versions <= the model's opset. If no implemented version is <= the opset, the computed index becomes negative.
   - Python's negative indexing wraps around (index -1 is the last element), so the method silently selects the latest implementation of the operator -> wrong operator implementation.
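   A minimal stand-alone sketch of this selection logic (a simplified stand-in for get_converter, not the actual TVM code) shows how the negative index wraps around to the newest implementation:

```python
def select_impl_version(impl_versions, opset):
    """Mimic get_converter's selection: highest implemented version <= opset."""
    versions = sorted(impl_versions + [opset])
    # The index of the opset entry minus one should point at the highest
    # implemented version <= opset ...
    idx = max(i for i, v in enumerate(versions) if v == opset) - 1
    # ... but when no implemented version is <= opset, idx is -1 and
    # Python's negative indexing silently wraps to the LAST (newest) entry.
    return versions[idx]

# ReduceMean has _impl_v13 and _impl_v18:
print(select_impl_version([13, 18], opset=15))  # 13 (correct)
print(select_impl_version([13, 18], opset=9))   # 18 (bug: newer than opset 9)
```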
   
   Back to the example, the ReduceMean operator in ONNX:
    - Currently, TVM has two implementations of ReduceMean (_impl_v13 and _impl_v18).
    - With a model opset of 9, the frontend picks _impl_v18; the correct choice would be a version-1 implementation (_impl_v1).
    - ReduceMean version 1 and version 18 differ in their inputs and attributes. (Ref: https://onnx.ai/onnx/operators/text_diff_ReduceMean_1_18.html)
   
   ### Possible fix method
   - Suggestion 1: use onnx.defs.get_schema
      + Use "from onnx import defs" to look up the operator schema based on the model's opset, then dispatch to the implementation matching that exact version (ai.onnx v1 in the image below).
      + Todo: implement all versions of each ONNX operator.
   
   <img width="300" height="200" alt="Image" src="https://github.com/user-attachments/assets/d37fc101-9d69-436b-a9e6-69a25e726d69" />
   
   - Suggestion 2: check for an empty implementation list and raise an error
     + In the get_converter method of OnnxOpConverter, filter the implemented versions to those <= the model's opset, then take the maximum of that list as the implementation version. If the filtered list is empty, raise "No implementation of this operator is compatible with the model's opset".
     + Todo: implement operator versions whose schemas differ (inputs, attributes).
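   A minimal sketch of Suggestion 2, again using a simplified stand-in for get_converter's selection logic:

```python
def select_impl_version(impl_versions, opset):
    """Proposed fix: filter first, then fail loudly when nothing fits."""
    compatible = [v for v in impl_versions if v <= opset]
    if not compatible:
        raise NotImplementedError(
            f"No implementation of this operator is compatible "
            f"with the model's opset {opset}"
        )
    return max(compatible)

print(select_impl_version([13, 18], opset=15))  # 13
# select_impl_version([13, 18], opset=9) now raises NotImplementedError
# instead of silently picking _impl_v18.
```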
   
   ### Triage
   * needs-triage
   * type: bug
   * frontend:onnx
   
   ---
   What do you think about this issue? I would be happy to hear different 
opinions from everyone.
   
   Thank you!
   

