================
@@ -0,0 +1,385 @@
+# ClangIR ABI Lowering - Design Document
+
+## 1. Introduction
+
+This document proposes a comprehensive design for creating an MLIR-agnostic calling convention lowering framework. The framework will enable CIR to perform ABI-compliant calling convention lowering, be reusable by other MLIR dialects (particularly FIR), achieve parity with the CIR incubator implementation for x86_64 and AArch64, and integrate with or inform the GSoC ABI Lowering Library project.
+
+### 1.1 Problem Statement
+
+Calling convention lowering is currently implemented separately for each MLIR dialect that needs it. The CIR incubator has a partial implementation, but it's tightly coupled to CIR-specific types and operations, making it unsuitable for reuse by other dialects. This means that FIR (Fortran IR) and future MLIR dialects would need to duplicate this complex logic. While classic Clang codegen contains mature ABI lowering code, it cannot be reused directly because it's tightly coupled to Clang's AST representation and LLVM IR generation.
+
+### 1.2 Proposed Solution
+
+This design proposes a shared MLIR ABI lowering infrastructure that multiple dialects can leverage. The framework sits at the top, providing common interfaces and target-specific ABI classification logic. Each MLIR dialect (CIR, FIR, and future dialects) implements a small amount of dialect-specific glue code to connect to this infrastructure. At the bottom, target-specific implementations handle the complex ABI rules for architectures like x86_64 and AArch64. This approach enables code reuse while maintaining the flexibility for each dialect to handle its own operation creation patterns.
+
+```
+┌─────────────────────────────────────────────────────────┐
+│          MLIR ABI Lowering Infrastructure               │
+│          mlir/include/mlir/Interfaces/ABI/              │
+└─────────────────────────────────────────────────────────┘
+                            │
+          ┌─────────────────┼─────────────────┐
+          │                 │                 │
+          ▼                 ▼                 ▼
+   ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
+   │ CIR Dialect  │  │ FIR Dialect  │  │   Future     │
+   │              │  │              │  │   Dialects   │
+   └──────────────┘  └──────────────┘  └──────────────┘
+          │                 │                 │
+          └─────────────────┴─────────────────┘
+                            │
+                            ▼
+                ┌───────────────────────┐
+                │   Target ABI Logic    │
+                │  X86, AArch64, etc.   │
+                └───────────────────────┘
+```
+
+### 1.3 Key Benefits
+
+This architecture avoids duplicating complex ABI logic across MLIR dialects, reducing the maintenance burden and risk of inconsistencies. It maintains correct ABI compliance for all targets by reusing proven classification algorithms. The clear separation of concerns enables easier testing and validation, as each layer can be tested independently. Additionally, the design provides a straightforward migration path from the existing CIR incubator implementation.
+
+### 1.4 Success Criteria
+
+The framework will be considered successful when CIR can correctly lower x86_64 and AArch64 calling conventions with full ABI compliance. FIR should be able to adopt the same infrastructure with minimal dialect-specific adaptation. ABI compliance will be validated through differential testing, comparing output against classic Clang codegen to ensure correct calling convention implementation. Finally, the performance overhead should remain under 5% compared to a direct, dialect-specific implementation. Initial scope focuses on fixed-argument functions; variadic function support (varargs) is deferred as future work given its complexity and the need to establish the core framework first.
+
+## 2. Background and Context
+
+### 2.1 What is Calling Convention Lowering?
+
+Calling convention lowering transforms high-level function signatures to match target ABI (Application Binary Interface) requirements. When a function is declared at the source level with convenient, language-level types, these types must be translated into the specific register assignments, memory layouts, and calling sequences that the target architecture expects. For example, on x86_64 System V ABI, a struct containing two 64-bit integers might be "expanded" into two separate arguments passed in registers, rather than being passed as a single aggregate:
+
+```
+// High-level CIR
+func @foo(i32, struct<i64, i64>) -> i32
+
+// After ABI lowering
+func @foo(i32 %arg0, i64 %arg1, i64 %arg2) -> i32
+//        ^          ^          ^        ^
+//        |          |          +--------+---- struct expanded into fields
+//        |          +---- first field passed in register
+//        +---- small integer passed in register
+```
+
+### 2.2 Why It's Complex
+
+Calling convention lowering is complex for several reasons. First, it's highly target-specific: each architecture (x86_64, AArch64, RISC-V, etc.) has different rules for how arguments are passed in registers versus memory. Second, it's type-dependent: the rules differ significantly for integers, floating-point values, structs, unions, and arrays. Third, it's context-sensitive: special handling is required for varargs functions, virtual method calls, and alternative calling conventions like vectorcall or preserve_most. Finally, the same target may have multiple ABI variants (e.g., x86_64 System V vs. Windows x64), adding another dimension of complexity.
+
+### 2.3 Existing Implementations
+
+#### Classic Clang CodeGen
+
+Classic Clang codegen (located in `clang/lib/CodeGen/`) transforms calling conventions during the AST-to-LLVM-IR lowering process. This implementation is mature and well-tested, handling all supported targets with comprehensive ABI coverage. However, it's tightly coupled to both Clang's AST representation and LLVM IR, making it difficult to reuse for MLIR-based frontends.
+
+#### CIR Incubator
+
+The CIR incubator includes a calling convention lowering pass in `clang/lib/CIR/Dialect/Transforms/TargetLowering/` that transforms CIR operations into ABI-lowered CIR operations as an MLIR pass. This implementation successfully adapted logic from classic codegen to work within the MLIR framework. However, it relies on CIR-specific types and operations, preventing reuse by other MLIR dialects.
+
+#### GSoC ABI Lowering Library
+
+A 2024 Google Summer of Code project produced PR #140112, which proposes extracting Clang's ABI logic into a reusable library. The design centers on a shadow type system (`abi::Type*`) separate from both Clang's AST types and LLVM IR types, enabling the ABI classification algorithms to work independently of any specific frontend representation. The library includes abstract `ABIInfo` base classes and target-specific implementations for platforms like x86_64 and BPF.
+
+While this work represents valuable progress toward making Clang's ABI knowledge reusable, several factors make it unsuitable as a foundation for MLIR dialect support. First, the PR is incomplete: it lacks AArch64 implementation (a primary target for CIR) and has been inactive since the GSoC program concluded. Second, and more fundamentally, the shadow type system creates an architectural mismatch with MLIR. Using the GSoC library from MLIR would require converting `mlir::Type` → `abi::Type*` → performing classification → converting results back to `mlir::Type`, introducing both complexity and runtime overhead. MLIR's TypeInterface mechanism already provides a native solution for type abstraction, eliminating the need for a shadow type system.

----------------
nikic wrote:

> has been inactive since the GSoC program concluded.

Also to clarify this point: The referenced PR has indeed been inactive, but this is because it's being upstreamed in smaller parts (e.g. https://github.com/llvm/llvm-project/pull/158329). The speed at which this can be upstreamed is primarily limited by reviewer bandwidth.

https://github.com/llvm/llvm-project/pull/178326
