Issue 180777
Summary [WebAssembly][FastISel] emits inefficient shift sequence for sext instead of extend instructions
Labels new issue
Assignees
Reporter ParkHanbum
Description

When using FastISel (-fast-isel) for the WebAssembly target, sext (sign-extension) operations from i8 or i16 to i32 are lowered into a sequence of bitwise shift instructions (shl followed by shr_s).

WebAssembly has dedicated instructions for this purpose: i32.extend8_s and i32.extend16_s. A single extend instruction replaces the four-instruction shift sequence (i32.const, i32.shl, i32.const, i32.shr_s), which reduces both code size and the number of instructions to execute.
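To put a rough number on the code-size claim, the sketch below writes out the binary encoding of both sequences for the i16 case. The opcode bytes follow the WebAssembly binary format as I understand it (0x41 for i32.const, 0x74 for i32.shl, 0x75 for i32.shr_s, 0xC1 for i32.extend16_s); the array names are made up for this illustration and do not come from LLVM.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Encoded bytes for: i32.const 16; i32.shl; i32.const 16; i32.shr_s.
// The immediate 16 fits in a single signed LEB128 byte (0x10).
constexpr std::array<std::uint8_t, 6> kShiftPair16 = {
    0x41, 0x10,  // i32.const 16
    0x74,        // i32.shl
    0x41, 0x10,  // i32.const 16
    0x75         // i32.shr_s
};

// Encoded byte for: i32.extend16_s.
constexpr std::array<std::uint8_t, 1> kExtend16S = {0xC1};

// Bytes saved per sign extension when the dedicated instruction is used.
constexpr std::size_t kBytesSaved = kShiftPair16.size() - kExtend16S.size();
```

Under these assumptions, each sign extension shrinks from six bytes to one in the function body. The i8 case has the same shape, with 24 as the shift amount.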

Reproduction Steps

Create a file named sext.ll with the following content and compile it with llc, passing -fast-isel:
```
target datalayout = "e-m:e-p:32:32-i64:64-n32:64-S128"
target triple = "wasm32-unknown-unknown"
define i32 @sext_i16_i32(ptr %p) {
  %v = load atomic i16, ptr %p seq_cst, align 2
  %e = sext i16 %v to i32
  ret i32 %e
}

define i32 @sext_i8_i32(ptr %p) {
  %v = load atomic i8, ptr %p seq_cst, align 1
  %e = sext i8 %v to i32
  ret i32 %e
}
```
Current Behavior (Output)

FastISel generates a shift sequence (Left Shift + Arithmetic Right Shift) to sign-extend the value.
```
sext_i16_i32:                           # @sext_i16_i32
        .functype       sext_i16_i32 (i32) -> (i32)
        local.get       0
        i32.load16_u    0
# === inefficient code ===
        i32.const       16
        i32.shl
        i32.const       16
        i32.shr_s
# ========================
        end_function
sext_i8_i32:                            # @sext_i8_i32
        .functype       sext_i8_i32 (i32) -> (i32)
        local.get       0
        i32.load8_u     0
# === inefficient code ===
        i32.const       24
        i32.shl
        i32.const       24
        i32.shr_s
# ========================
        end_function
```

Expected Behavior

FastISel should emit i32.extend8_s and i32.extend16_s instead. These instructions sign-extend the low 8 or 16 bits of their operand and ignore the upper bits, so they give the correct result directly after i32.load8_u or i32.load16_u.

```
sext_i8_i32:
    local.get 0
    i32.load8_u 0
    i32.extend8_s   # <--- Optimized
    end_function

sext_i16_i32:
    local.get 0
    i32.load16_u 0
    i32.extend16_s  # <--- Optimized
    end_function
```
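As a sanity check that swapping the shift pair for an extend instruction does not change results, here is a small standalone C++ model of both lowerings. The function names (sextViaShifts, extend8S, extend16S, loweringsAgree) are hypothetical helpers for this sketch and are not part of LLVM or any WebAssembly runtime. It assumes two's-complement conversions and an arithmetic right shift on signed values, which C++20 guarantees and mainstream compilers provide in earlier modes.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Models the current output: shl by (32 - bits), then shr_s by the same amount.
inline std::int32_t sextViaShifts(std::uint32_t value, unsigned bits) {
    const unsigned shift = 32 - bits;
    // Shift left as unsigned, then shift right as signed (arithmetic shift).
    return static_cast<std::int32_t>(value << shift) >> shift;
}

// Models i32.extend8_s: sign-extend the low 8 bits, ignoring bits 8..31.
inline std::int32_t extend8S(std::uint32_t value) {
    return static_cast<std::int8_t>(static_cast<std::uint8_t>(value));
}

// Models i32.extend16_s: sign-extend the low 16 bits, ignoring bits 16..31.
inline std::int32_t extend16S(std::uint32_t value) {
    return static_cast<std::int16_t>(static_cast<std::uint16_t>(value));
}

// Returns true if both lowerings agree for every 16-bit low pattern,
// with both clean and dirty upper bits.
inline bool loweringsAgree() {
    for (std::uint32_t low = 0; low <= 0xFFFFu; ++low) {
        for (std::uint32_t high : {0x00000000u, 0xABCD0000u}) {
            const std::uint32_t v = high | low;
            if (sextViaShifts(v, 8) != extend8S(v)) return false;
            if (sextViaShifts(v, 16) != extend16S(v)) return false;
        }
    }
    return true;
}
```

Calling loweringsAgree() checks every 16-bit pattern twice, once with zeroed upper bits (as after a zero-extending load) and once with arbitrary upper bits, which is the property the replacement relies on.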
Additional Context

Target: WebAssembly (wasm32)

Component: WebAssemblyFastISel.cpp

Function: SelectSExt

Currently, SelectSExt seems to fall back to the default expansion behavior. It should explicitly handle i8 and i16 source types by emitting WebAssembly::I32_EXTEND8_S and WebAssembly::I32_EXTEND16_S.
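The decision being asked for can be sketched as a small standalone model. This is not LLVM code: SExtLowering and chooseSExtLowering are hypothetical names invented for illustration, and the hasSignExt parameter reflects my assumption that a real fix would also have to check that the sign-extension operators feature is enabled before emitting the extend instructions.

```cpp
#include <cassert>

// Hypothetical stand-in for the lowering choice; not an LLVM type.
enum class SExtLowering { Extend8S, Extend16S, ShiftPair };

// Picks a lowering for a sign extension to i32 from a source of fromBits bits.
// hasSignExt models whether the dedicated extend instructions may be used.
constexpr SExtLowering chooseSExtLowering(unsigned fromBits, bool hasSignExt) {
    if (hasSignExt) {
        if (fromBits == 8) return SExtLowering::Extend8S;
        if (fromBits == 16) return SExtLowering::Extend16S;
    }
    // Other widths, or targets without the feature, keep the generic shl + shr_s pair.
    return SExtLowering::ShiftPair;
}
```

Keeping the shift pair as the fallback means the output changes only for the two source widths that have a dedicated instruction.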

