Issue 91555
Summary [flang] Debug information with MLIR inlining
Labels flang
Assignees
Reporter vzakhari
This affects the design for debug information generation in Flang: https://github.com/llvm/llvm-project/blob/main/flang/docs/DebugGeneration.md

Right now, the `AddDebugInfo` pass is run pretty late in the pipeline.  The MLIR inliner pass runs before it.  Even though the MLIR inlining is not fully functional, there is a way to force inlining for some simple functions and demonstrate the problem.

Reproducer:
```
subroutine test(x, y)
  real :: x, y
  x = x + y ! may alias
  call inner(x, y)
contains
  subroutine inner(x, y)
    real :: x, y
    x = x + y ! may not alias
  end subroutine inner
end subroutine test
```

Compile with: `flang-new -g -O3 alias.f90 -c -mmlir -mlir-print-ir-after-all -mmlir -inline-all=true -mmlir -mlir-print-debuginfo -v -mllvm -print-after-all -mllvm -print-module-scope`

Here is the location information that is attached to the operations after the MLIR inliner pass:
```
#loc1 = loc("/path/alias.f90":1:1)
module attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry<i32, dense<32> : vector<2xi64>>, #dlti.dl_entry<f64, dense<64> : vector<2xi64>>, #dlti.dl_entry<f128, dense<128> : vector<2xi64>>, #dlti.dl_entry<!llvm.ptr<270>, dense<32> : vector<4xi64>>, #dlti.dl_entry<f16, dense<16> : vector<2xi64>>, #dlti.dl_entry<i1, dense<8> : vector<2xi64>>, #dlti.dl_entry<!llvm.ptr, dense<64> : vector<4xi64>>, #dlti.dl_entry<i16, dense<16> : vector<2xi64>>, #dlti.dl_entry<i8, dense<8> : vector<2xi64>>, #dlti.dl_entry<i128, dense<128> : vector<2xi64>>, #dlti.dl_entry<f80, dense<128> : vector<2xi64>>, #dlti.dl_entry<!llvm.ptr<272>, dense<64> : vector<4xi64>>, #dlti.dl_entry<!llvm.ptr<271>, dense<32> : vector<4xi64>>, #dlti.dl_entry<i64, dense<64> : vector<2xi64>>, #dlti.dl_entry<"dlti.endianness", "little">, #dlti.dl_entry<"dlti.stack_alignment", 128 : i64>>, fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", fir.target_cpu = "x86-64", llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128", llvm.target_triple = "x86_64-unknown-linux-gnu"} {
  func.func @_QPtest(%arg0: !fir.ref<f32> {fir.bindc_name = "x"} loc("/path/alias.f90":1:1), %arg1: !fir.ref<f32> {fir.bindc_name = "y"} loc("/path/alias.f90":1:1)) {
    %0 = fir.dummy_scope : !fir.dscope loc(#loc1)
    %1 = fir.declare %arg0 dummy_scope %0 {uniq_name = "_QFtestEx"} : (!fir.ref<f32>, !fir.dscope) -> !fir.ref<f32> loc(#loc2)
    %2 = fir.declare %arg1 dummy_scope %0 {uniq_name = "_QFtestEy"} : (!fir.ref<f32>, !fir.dscope) -> !fir.ref<f32> loc(#loc3)
    %3 = fir.load %1 : !fir.ref<f32> loc(#loc4)
    %4 = fir.load %2 : !fir.ref<f32> loc(#loc4)
    %5 = arith.addf %3, %4 fastmath<contract> : f32 loc(#loc4)
    fir.store %5 to %1 : !fir.ref<f32> loc(#loc4)
    %6 = fir.dummy_scope : !fir.dscope loc(#loc11)
    %7 = fir.declare %1 dummy_scope %6 {uniq_name = "_QFtestFinnerEx"} : (!fir.ref<f32>, !fir.dscope) -> !fir.ref<f32> loc(#loc12)
    %8 = fir.declare %2 dummy_scope %6 {uniq_name = "_QFtestFinnerEy"} : (!fir.ref<f32>, !fir.dscope) -> !fir.ref<f32> loc(#loc13)
    %9 = fir.load %7 : !fir.ref<f32> loc(#loc14)
    %10 = fir.load %8 : !fir.ref<f32> loc(#loc14)
    %11 = arith.addf %9, %10 fastmath<contract> : f32 loc(#loc14)
    fir.store %11 to %7 : !fir.ref<f32> loc(#loc14)
    return loc(#loc10)
  } loc(#loc1)
} loc(#loc)
#loc = loc("/path/alias.f90":0:0)
#loc2 = loc("/path/alias.f90":2:11)
#loc3 = loc("/path/alias.f90":2:14)
#loc4 = loc("/path/alias.f90":3:3)
#loc5 = loc("/path/alias.f90":6:3)
#loc6 = loc("/path/alias.f90":4:3)
#loc7 = loc("/path/alias.f90":7:13)
#loc8 = loc("/path/alias.f90":7:16)
#loc9 = loc("/path/alias.f90":8:5)
#loc10 = loc("/path/alias.f90":10:1)
#loc11 = loc(callsite(#loc5 at #loc6))
#loc12 = loc(callsite(#loc7 at #loc6))
#loc13 = loc(callsite(#loc8 at #loc6))
#loc14 = loc(callsite(#loc9 at #loc6))
```

The MLIR inliner implementation produces a callsite location for each operation inlined from `inner`.  After conversion to LLVM IR it looks like this:
```
define void @test_(ptr %0, ptr %1) #0 !dbg !5 {
  %3 = load float, ptr %0, align 4, !dbg !9, !tbaa !10
  %4 = load float, ptr %1, align 4, !dbg !9, !tbaa !16
  %5 = fadd contract float %3, %4, !dbg !9
  store float %5, ptr %0, align 4, !dbg !9, !tbaa !10
  %6 = load float, ptr %0, align 4, !dbg !18, !tbaa !10
  %7 = load float, ptr %1, align 4, !dbg !18, !tbaa !16
  %8 = fadd contract float %6, %7, !dbg !18
  store float %8, ptr %0, align 4, !dbg !18, !tbaa !10
  ret void, !dbg !19
}
!3 = distinct !DICompileUnit(language: DW_LANG_Fortran95, file: !4, producer: "flang version 19.0.0 (ssh://g...@gitlab-master.nvidia.com:12051/fortran/llvm-project-mirror.git e08863ed78aa96096f04f0856069da8645944ab1)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug)
!4 = !DIFile(filename: "alias.f90", directory: "/path")
!5 = distinct !DISubprogram(name: "test_", linkageName: "test_", scope: !4, file: !4, line: 1, type: !6, scopeLine: 1, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !3)
!6 = !DISubroutineType(cc: DW_CC_normal, types: !7)
!7 = !{!8, !8}
!8 = !DIBasicType(name: "real", size: 32, encoding: DW_ATE_float)
!9 = !DILocation(line: 3, column: 3, scope: !5)
!18 = !DILocation(line: 4, column: 3, scope: !5)
```

So all instructions inlined from `inner` end up on the same line as the call site.  This is a problem for debugging, because the inlined instructions can no longer be mapped back to the source lines of `inner`.
Compare it with LLVM inlining in this example: https://godbolt.org/z/x81d5vsxo - the inlined instructions inherit their line number information from the callee (look for the last instance of `; *** IR Dump After InlinerPass on (test) ***`).

I believe the problem is that the MLIR Locations attached to the operations of `inner` *before* MLIR inlining do not have a proper `LLVM::DILocalScopeAttr` attached to them.  So there is no scope attribute on the callee Locations in the callsite Locations created by the MLIR inliner.  One can trace the call chain from https://github.com/llvm/llvm-project/blob/f958a7348fcb27c3c6b07f1c8bdb902c7525b845/mlir/lib/Target/LLVMIR/DebugTranslation.cpp#L418 to https://github.com/llvm/llvm-project/blob/f958a7348fcb27c3c6b07f1c8bdb902c7525b845/mlir/lib/Target/LLVMIR/DebugTranslation.cpp#L426, which results in the callsite Locations being translated to LLVM debug metadata that corresponds to the call operation location.  I think that, instead, the callee Locations inside the callsite Locations should follow the call chain from https://github.com/llvm/llvm-project/blob/f958a7348fcb27c3c6b07f1c8bdb902c7525b845/mlir/lib/Target/LLVMIR/DebugTranslation.cpp#L418 to https://github.com/llvm/llvm-project/blob/f958a7348fcb27c3c6b07f1c8bdb902c7525b845/mlir/lib/Target/LLVMIR/DebugTranslation.cpp#L437, so that the callee's Locations are used for the cloned operations with the proper scope.

This seems to mean that, before the MLIR inliner runs, Locations must already have a proper scope (e.g. `LLVM::DISubprogramAttr`) attached to them. This cannot be achieved with the current ordering of the MLIR inliner and `AddDebugInfo` passes.
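
To make the fallback concrete, here is a small toy model in Python of how I understand the callsite translation to behave. It is only a sketch based on the description above, not the actual `DebugTranslation.cpp` code: all class and function names (`FileLoc`, `ScopedLoc`, `CallSiteLoc`, `DILocation`, `translate`) are hypothetical, and scopes are modeled as plain strings.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class FileLoc:
    """Plain file:line:col location, e.g. loc("/path/alias.f90":8:5)."""
    line: int
    col: int


@dataclass(frozen=True)
class ScopedLoc:
    """A location carrying a debug scope (stand-in for a scope attribute)."""
    inner: FileLoc
    scope: str


@dataclass(frozen=True)
class CallSiteLoc:
    """loc(callsite(callee at caller)), as created by the inliner."""
    callee: "Loc"
    caller: "Loc"


Loc = Union[FileLoc, ScopedLoc, CallSiteLoc]


@dataclass(frozen=True)
class DILocation:
    """Toy stand-in for LLVM's !DILocation metadata."""
    line: int
    col: int
    scope: str
    inlined_at: Optional["DILocation"] = None


def translate(loc, scope=None, inlined_at=None):
    """Translate a toy MLIR location into a toy DILocation (or None)."""
    if isinstance(loc, CallSiteLoc):
        # The caller location becomes the inlinedAt of the callee location.
        caller = translate(loc.caller, scope, inlined_at)
        callee = translate(loc.callee, None, caller)
        # Fallback: without a callee scope, only the caller location survives.
        return callee if callee is not None else caller
    if isinstance(loc, ScopedLoc):
        # A scope attached to the location overrides the incoming scope.
        return translate(loc.inner, loc.scope, inlined_at)
    # Plain file location: it cannot be translated without a scope.
    if scope is None:
        return None
    return DILocation(loc.line, loc.col, scope, inlined_at)


call = FileLoc(4, 3)   # call inner(x, y)
body = FileLoc(8, 5)   # x = x + y inside inner

# Current situation: the callee location has no scope when inlining happens.
unscoped = translate(CallSiteLoc(body, call), scope="test_")
# Desired situation: the callee location already carries its own scope.
scoped = translate(CallSiteLoc(ScopedLoc(body, "inner"), call), scope="test_")

print(unscoped.line, scoped.line, scoped.inlined_at.line)  # prints "4 8 4"
```

In the first case the result collapses to the call site (line 4), which matches `!18` in the LLVM IR above. In the second case the callee line (8) is kept and the call site is recorded as the inlined-at location, which is what LLVM inlining produces.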

@kiranchandramohan, @abidh, @jeanPerier, @VijayKandiah, FYI.