| Field | Value |
| --- | --- |
| Issue | 184302 |
| Summary | [WebAssembly][Fast-ISel] generates inefficient shift sequence for extending i8/i16 to i32 |
| Labels | new issue |
| Assignees | |
| Reporter | ParkHanbum |

### Description
When FastISel (`-fast-isel`) is used for the WebAssembly target, a `sext` of a loaded `i8` or `i16` value to `i32` is lowered inefficiently: FastISel emits an unsigned load followed by a `shl` and `shr_s` pair.
On `wasm32`, WebAssembly has native sign-extending loads for exactly this case: `i32.load8_s` and `i32.load16_s`. Emitting these loads directly performs the load and the extension in a single instruction instead of a load followed by a shift sequence.
This reduces code size (four fewer instructions per extending load) and should reduce the work done at run time.
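To put a rough number on the size difference, the sketch below compares the binary encodings of the two `i16` instruction sequences. The opcode bytes follow the WebAssembly core binary format as I understand it. The array names are illustrative, and the alignment hint of 1 (log2 of a 2-byte alignment) is an assumption about what the backend would emit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Instruction bytes for the shift-based sequence (function framing omitted).
constexpr std::uint8_t kShiftSequence[] = {
    0x20, 0x00,        // local.get 0
    0x2F, 0x01, 0x00,  // i32.load16_u, align hint 1, offset 0
    0x41, 0x10,        // i32.const 16
    0x74,              // i32.shl
    0x41, 0x10,        // i32.const 16
    0x75,              // i32.shr_s
};

// Instruction bytes for the sign-extending load.
constexpr std::uint8_t kSignedLoad[] = {
    0x20, 0x00,        // local.get 0
    0x2E, 0x01, 0x00,  // i32.load16_s, align hint 1, offset 0
};

constexpr std::size_t kBytesSaved =
    sizeof(kShiftSequence) - sizeof(kSignedLoad);

static_assert(kBytesSaved == 6, "two i32.const plus two shifts are removed");
```

Under these assumptions, each folded extension saves 6 bytes. The `i8` case is the same size, since the shift amount 24 also fits in a single LEB128 byte.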
### Steps to Reproduce
Create an LLVM IR file (for example, `sext.ll`) with the following contents:
```llvm
target datalayout = "e-m:e-p:32:32-i64:64-n32:64-S128"
target triple = "wasm32-unknown-unknown"
define i32 @sext_i16_i32(ptr %p) {
  %v = load atomic i16, ptr %p seq_cst, align 2
  %e = sext i16 %v to i32
  ret i32 %e
}

define i32 @sext_i8_i32(ptr %p) {
  %v = load atomic i8, ptr %p seq_cst, align 1
  %e = sext i8 %v to i32
  ret i32 %e
}
```
Compile the file with `llc` with FastISel enabled (`-fast-isel`).
### Current Output
FastISel currently sign-extends the loaded value with a shift-left followed by an arithmetic shift-right:
```wasm
sext_i16_i32:                           # @sext_i16_i32
    .functype sext_i16_i32 (i32) -> (i32)
    local.get 0
    i32.load16_u 0
    # -- inefficient shift sequence --
    i32.const 16
    i32.shl
    i32.const 16
    i32.shr_s
    # --------------------------------
    end_function
sext_i8_i32:                            # @sext_i8_i32
    .functype sext_i8_i32 (i32) -> (i32)
    local.get 0
    i32.load8_u 0
    # -- inefficient shift sequence --
    i32.const 24
    i32.shl
    i32.const 24
    i32.shr_s
    # --------------------------------
    end_function
```
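As a sanity check that the shift pair and a sign-extending load compute the same value, here is a small model in C++. It is only a sketch: the function names are made up for illustration, and the helpers model the WebAssembly operations on plain 32-bit patterns rather than calling anything from LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Model of i32.shr_s: arithmetic shift right on a 32-bit pattern (0 < n < 32).
constexpr std::uint32_t shrS(std::uint32_t x, unsigned n) {
    return (x >> n) | ((x & 0x80000000u) ? ~(0xFFFFFFFFu >> n) : 0u);
}

// Current lowering: zero-extending load, then shl / shr_s by (32 - bits).
constexpr std::uint32_t extendViaShifts(std::uint32_t zeroExtended,
                                        unsigned bits) {
    return shrS(zeroExtended << (32 - bits), 32 - bits);
}

// Model of i32.load8_s / i32.load16_s on the same value: flip the sign bit,
// then subtract it, which propagates the sign into the upper bits.
constexpr std::uint32_t extendViaSignedLoad(std::uint32_t zeroExtended,
                                            unsigned bits) {
    return (zeroExtended ^ (1u << (bits - 1))) - (1u << (bits - 1));
}

static_assert(extendViaShifts(0xFFu, 8) == 0xFFFFFFFFu, "i8 -1");
static_assert(extendViaShifts(0x7Fu, 8) == 0x0000007Fu, "i8 127");
static_assert(extendViaShifts(0x8000u, 16) == 0xFFFF8000u, "i16 minimum");
```

Both models agree for every `i8` and `i16` input, which is why replacing the shift sequence with `i32.load8_s` / `i32.load16_s` preserves behavior.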
### Expected Output
FastISel should fold the load and the sign extension into the native WebAssembly instructions `i32.load8_s` and `i32.load16_s`, which removes the shift sequence entirely:
```wasm
test_sext_i8:
    local.get 0
    i32.load8_s 0     # <--- load and sign extension folded
    end_function
test_sext_i16:
    local.get 0
    i32.load16_s 0    # <--- load and sign extension folded
    end_function
```
### Additional Context
- Target: WebAssembly (wasm32)
- Affected file: `WebAssemblyFastISel.cpp`
- Existing test: `load-ext.ll`

The current implementation of `selectSExt` always falls back to the generic shift-based expansion. It should instead detect when the operand is an `i8` or `i16` load and emit `i32.load8_s` or `i32.load16_s` directly.
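The decision this asks for can be sketched in isolation. The code below is a hypothetical, self-contained model and not LLVM code: `SExtOperand` and `pickSignExtendingLoad` are invented names, and the single-use condition is my assumption about when folding the load would be safe.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical, simplified description of the operand of a `sext`.
// This is NOT an LLVM data structure; it only models the checks.
struct SExtOperand {
    bool isLoad;        // operand is produced by a load instruction
    bool hasSingleUse;  // the sext is the load's only user (assumed requirement)
    unsigned bits;      // width of the loaded value in bits
};

// Returns the sign-extending load mnemonic to emit, or nullptr to keep
// the existing shl / shr_s expansion.
inline const char *pickSignExtendingLoad(const SExtOperand &op) {
    if (!op.isLoad || !op.hasSingleUse)
        return nullptr;
    switch (op.bits) {
    case 8:
        return "i32.load8_s";
    case 16:
        return "i32.load16_s";
    default:
        return nullptr;
    }
}
```

For example, `pickSignExtendingLoad({true, true, 16})` yields `"i32.load16_s"`, while an operand that is not a load yields `nullptr` and keeps the shift-based path. A real fix would also need to respect whatever other conditions FastISel imposes on load folding.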