Hi all,
I'd like to propose adding a new, optional debug information format to
FPC called OPDF (Object Pascal Debug Format). This is not a
replacement for DWARF — it is an additional debug format that can be
selected with the new -gO compiler flag, similar to how -gw selects
DWARF.
The compiler changes are available at:
https://github.com/graemeg/fpc
Branch: feature/opdf-support
To review or test against your local FPC checkout:
git remote add graemeg https://github.com/graemeg/fpc.git
git fetch graemeg feature/opdf-support
git diff main...graemeg/feature/opdf-support
A standalone reference debugger (PDR), the OPDF binary format library,
and all research/design documentation are available at:
https://github.com/graemeg/opdebugger
NOTE: Some of the early analysis and design documentation predates the
current implementation and may be slightly dated. I have kept it in to
show the rationale and design evolution behind the format. This was my
third attempt at getting this implementation right.
Why a new format?
-----------------
DWARF is a mature and capable standard, but it was designed around
C/C++ concepts. Object Pascal has language features that DWARF does
not natively represent:
- Properties (field-backed and method-backed with getter/setter)
- String types (ShortString, and the reference-counted AnsiString
  and UnicodeString)
- Sets (bitfield over ordinal/enum base types)
- Class inheritance with VMT, published RTTI, and
interfaces (COM/CORBA)
- Dynamic arrays with SizeInt-based length metadata
Today, FPC encodes these into DWARF using workarounds and conventions
that external debuggers (GDB, LLDB) do not understand. The result is
that debugging Object Pascal with GDB often shows strings as raw byte
pointers, properties as inaccessible, and sets as opaque integers.
OPDF encodes these concepts directly. A property record carries its
getter method name and read/write kind. An AnsiString record carries
its type ID so the debugger knows to dereference the pointer and read
the length header. A set record carries its base enum type ID and
lower bound so the debugger can decode the bitfield into member names.
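To illustrate the set case, here is a minimal sketch (in Python for brevity; it is not PDR or compiler code) of how a debugger could turn raw set bytes into member names once it knows the lower bound and the enum's names. The function name, the TDay enum, and the assumption that bit i of the bitfield maps to ordinal lower_bound + i are illustrative only and are not taken from the OPDF specification.

```python
def decode_set(raw: bytes, lower_bound: int, member_names: dict) -> list:
    """Return the names of the members whose bits are set in `raw`.

    Assumption: bit i of the bitfield (least significant bit of the first
    byte is bit 0) corresponds to ordinal value lower_bound + i.
    """
    members = []
    for byte_index, byte in enumerate(raw):
        for bit in range(8):
            if byte & (1 << bit):
                ordinal = lower_bound + byte_index * 8 + bit
                # Fall back to the ordinal if the enum has no name for it.
                members.append(member_names.get(ordinal, f"#{ordinal}"))
    return members


# Hypothetical enum: TDay = (Monday, Tuesday, ..., Sunday)
DAY_NAMES = {0: "Monday", 1: "Tuesday", 2: "Wednesday", 3: "Thursday",
             4: "Friday", 5: "Saturday", 6: "Sunday"}

print(decode_set(bytes([0b00000101]), 0, DAY_NAMES))  # ['Monday', 'Wednesday']
```

Without the base type ID and lower bound, the same byte could only be shown as the integer 5.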
The full rationale and analysis, including a comparison with DWARF and
a look at Kylix and Delphi's approach, is documented at:
https://github.com/graemeg/opdebugger/blob/master/docs/analysis.adoc
What the compiler changes include
----------------------------------
- New debug writer: compiler/dbgopdf.pas (TOPDFDebugWriter)
- Type ID allocation: compiler/dbgopdf_typemap.pas (TTypeMapper)
- New -gO flag in compiler/options.pas
- Registration in compiler/systems.inc (dbg_opdf enum)
- Target registration in x86_64/cputarg.pas and i386/cputarg.pas
The debug writer emits an .opdf ELF section with all code addresses
resolved at link time by the assembler/linker. Each compilation unit
gets its own OPDF header, and the linker concatenates them. A
cross-module type deduplication system (via global TTypeMapper +
G_EmittedTypeIDs) ensures that shared types are emitted once, using
mangled type names for cross-unit dedup (following the same pattern as
DWARF in dbgdwarf.pas). A unit directory record at the end of the main
module lists all contributing units.
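The deduplication idea can be sketched as follows. This is a simplified Python illustration, not the code in dbgopdf_typemap.pas: the class layout, method names, and the placeholder mangled names are assumptions, and only the general approach (one ID per mangled name, emit each ID once) comes from the description above.

```python
class TypeMapper:
    """Assigns one type ID per mangled name and tracks what was emitted."""

    def __init__(self):
        self._ids = {}         # mangled type name -> type ID
        self._emitted = set()  # type IDs already written to the section
        self._next_id = 1

    def type_id(self, mangled_name):
        # The same mangled name always maps to the same ID, across units.
        if mangled_name not in self._ids:
            self._ids[mangled_name] = self._next_id
            self._next_id += 1
        return self._ids[mangled_name]

    def should_emit(self, mangled_name):
        # True only the first time a given type is seen.
        tid = self.type_id(mangled_name)
        if tid in self._emitted:
            return False
        self._emitted.add(tid)
        return True


# Two units that both reference a shared type (placeholder names).
mapper = TypeMapper()
unit_a = ["SYSTEM_TOBJECT", "UNITA_TFOO"]
unit_b = ["SYSTEM_TOBJECT", "UNITB_TBAR"]
emitted = [n for unit in (unit_a, unit_b) for n in unit if mapper.should_emit(n)]
print(emitted)  # ['SYSTEM_TOBJECT', 'UNITA_TFOO', 'UNITB_TBAR']
```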
This has been tested with a 104-unit project producing a 6.4 MB .opdf
section with a 1.0x dedup ratio. The debugger also handles TypeID
collisions between separate compiler invocations, which arise when
build systems like PasBuild invoke ppcx64 separately per module;
colliding TypeIDs are remapped at load time.
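A rough sketch of what load-time remapping could look like is shown below, in Python. How PDR actually detects and resolves collisions is not described here, so this sketch makes two assumptions: two modules collide when they use the same ID for differently named types, and a colliding type is given a fresh ID above every ID seen in any module. The function and type names are hypothetical.

```python
def load_modules(modules):
    """modules: list of (module_name, {local_type_id: type_name}).

    Returns (global_types, remaps), where remaps[module][local_id] is the
    ID the debugger should use for that module's type after loading.
    """
    # Fresh IDs start above every ID used by any module, so a remapped
    # type can never collide with an ID that is loaded later.
    next_free = 1 + max((tid for _, types in modules for tid in types), default=0)
    global_types = {}  # global type ID -> type name
    remaps = {}        # module name -> {local ID: global ID}
    for module_name, local_types in modules:
        remap = {}
        for local_id, type_name in local_types.items():
            global_id = local_id
            # Same ID but a different type name: treat it as a collision.
            if global_types.get(global_id, type_name) != type_name:
                global_id = next_free
                next_free += 1
            global_types[global_id] = type_name
            remap[local_id] = global_id
        remaps[module_name] = remap
    return global_types, remaps


types, remaps = load_modules([
    ("modA", {1: "TFoo"}),
    ("modB", {1: "TBar", 2: "TBaz"}),  # ID 1 collides with modA's TFoo
])
print(remaps["modB"])  # {1: 3, 2: 2}
```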
The OPDF binary format specification (v0.3.0, 20 record types) is at:
https://github.com/graemeg/opdebugger/blob/master/docs/opdf-specification.adoc
IMPORTANT: dbgopdf.pas does NOT import any units from the opdebugger
repo. The REC_* constants are duplicated locally with "{ must match
opdf_types.pas }" comments. This keeps the compiler fully
self-contained.
What the reference debugger can do today
-----------------------------------------
The PDR debugger is a standalone CLI tool that reads OPDF data and
controls the debuggee via Linux ptrace. It currently supports:
- Breakpoints by file:line, hex address, or variable name
- Hit-count conditional breakpoints (break ... if count=N)
- Source-level stepping (step over and step into)
- Call stack display with function names and source locations
- Variable evaluation for the following Object Pascal types:
  primitives, floats, enums, sets, ShortString, AnsiString,
  UnicodeString, pointers, static arrays, dynamic arrays,
  records, classes (with field display), interfaces
- Compile-time constant evaluation
- Property evaluation: field-backed (automatic) and method-backed
(via call injection on explicit request)
- Array slice display: print arr[2..5]
- In-process variable assignment: set x = 42, set day = Monday
- Local variable listing with scope awareness (including
nested procedures)
- Structured inspect command for records, classes, and interfaces
- Auto-print display list (display/undisplay)
- Hardware watchpoints via x86_64 debug registers (DR0-DR3)
- Break on exception raise with class name and message display
- Method call injection for getter property evaluation
(x86_64 SysV ABI, register save/restore, INT3 sentinel, red zone
handling, managed return types)
27 automated integration tests cover all of the above.
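As background for the hardware watchpoint item in the list above, the following Python sketch shows how the x86 DR7 control register is composed for one of the four slots, following the bit layout in the Intel manuals. It is an illustration only and not PDR's code; the function and constant names are made up for this example.

```python
# DR7 R/W field values (two bits per slot)
RW_EXECUTE, RW_WRITE, RW_READ_WRITE = 0b00, 0b01, 0b11
# DR7 LEN field values, keyed by watched size in bytes
LEN_ENCODING = {1: 0b00, 2: 0b01, 8: 0b10, 4: 0b11}


def dr7_enable(dr7, slot, rw, length):
    """Return dr7 with a watchpoint enabled in DR<slot> (0..3)."""
    if not 0 <= slot <= 3:
        raise ValueError("x86 has only four debug address registers")
    shift = 16 + 4 * slot
    dr7 &= ~(0xF << shift)                              # clear old R/W and LEN
    dr7 |= (rw | (LEN_ENCODING[length] << 2)) << shift  # new R/W and LEN
    dr7 |= 1 << (2 * slot)                              # local enable bit L<slot>
    return dr7


# Watch writes to a 4-byte variable using DR0.
print(hex(dr7_enable(0, 0, RW_WRITE, 4)))  # 0xd0001
```

The watched address itself goes into the matching DR0-DR3 register; on Linux a debugger would typically write both through ptrace.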
The debugger uses a hexagonal architecture (ports and adapters). The
core engine depends only on interfaces (IProcessController,
IDebugInfoReader, IArchAdapter) defined in pdr_ports.pas. This means
integrating OPDF into an IDE like Lazarus or MSEide requires only
implementing the adapter interfaces — the type evaluation, property
resolution, breakpoint logic, call injection, and all display
formatting are reusable as-is.
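The ports-and-adapters idea can be shown with a small Python sketch. Only the interface names IProcessController and IDebugInfoReader come from pdr_ports.pas; the method names, the Engine class, and the fake adapters are hypothetical and far simpler than the real interfaces.

```python
from typing import Protocol, Tuple


class IProcessController(Protocol):
    def read_memory(self, address: int, size: int) -> bytes: ...


class IDebugInfoReader(Protocol):
    def lookup_variable(self, name: str) -> Tuple[int, int]: ...  # (address, size)


class Engine:
    """Core logic: depends only on the two ports, never on ptrace or OPDF."""

    def __init__(self, process: IProcessController, debug_info: IDebugInfoReader):
        self.process = process
        self.debug_info = debug_info

    def print_integer(self, name: str) -> str:
        address, size = self.debug_info.lookup_variable(name)
        raw = self.process.read_memory(address, size)
        return f"{name} = {int.from_bytes(raw, 'little', signed=True)}"


# Fake adapters, standing in for a ptrace backend and an OPDF reader.
class FakeProcess:
    def __init__(self, memory: bytes):
        self.memory = memory

    def read_memory(self, address, size):
        return self.memory[address:address + size]


class FakeDebugInfo:
    def lookup_variable(self, name):
        return {"x": (4, 4)}[name]


engine = Engine(FakeProcess(bytes(4) + (42).to_bytes(4, "little")), FakeDebugInfo())
print(engine.print_integer("x"))  # x = 42
```

An IDE integration would replace the fake adapters with its own implementations and leave the engine untouched.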
Architecture documentation:
https://github.com/graemeg/opdebugger/blob/master/docs/architecture.adoc
Design decisions log:
https://github.com/graemeg/opdebugger/blob/master/docs/design-decisions.adoc
Relationship with DWARF
------------------------
To be clear: this proposal does not suggest removing DWARF support.
DWARF remains valuable for interoperability with standard tools (GDB,
LLDB, Valgrind, perf), for platforms where OPDF is not yet supported,
and for features that OPDF does not yet cover (e.g. register-allocated
variables in optimised code, .eh_frame-based stack unwinding).
OPDF and DWARF can coexist in the same binary — they use separate ELF
sections. A user who needs GDB compatibility continues using -gw. A
user who wants first-class Object Pascal debugging uses -gO. Over
time, OPDF coverage will expand, but there is no pressure to deprecate
anything.
Current limitations
--------------------
- ELF64 only (Linux x86_64). Windows PE/COFF and macOS Mach-O are
planned.
- Stack unwinding relies on RBP frame pointer chains, which breaks
with optimised RTL compiled without frame pointers. OPDF unwind
info records or .eh_frame fallback are planned.
- Float return types from injected method calls are not yet handled;
  reading XMM0 requires PTRACE_GETFPREGS.
- Variant records, Variant type, and generics are not yet covered.
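To make the frame pointer limitation above concrete, here is a minimal Python sketch of an RBP chain walk on x86_64. It assumes the conventional frame layout (saved RBP at [RBP], return address at [RBP+8]); the function name and the read_memory callback are hypothetical and this is not PDR's unwinder.

```python
import struct


def walk_rbp_chain(read_memory, rbp, rip, max_frames=64):
    """Return code addresses for the call stack, innermost frame first.

    read_memory(address, size) must return `size` bytes of debuggee memory.
    """
    frames = [rip]
    while rbp != 0 and len(frames) < max_frames:
        # Conventional layout: [rbp] = caller's RBP, [rbp+8] = return address.
        saved_rbp, return_address = struct.unpack("<QQ", read_memory(rbp, 16))
        if return_address == 0:
            break
        frames.append(return_address)
        # The stack grows down, so caller frames must sit at higher addresses.
        if saved_rbp <= rbp:
            break
        rbp = saved_rbp
    return frames


# Fake two-frame stack mapped at address 0x1000.
stack = bytearray(0x30)
struct.pack_into("<QQ", stack, 0x00, 0x1020, 0x401000)
struct.pack_into("<QQ", stack, 0x20, 0x0, 0x402000)
read = lambda addr, size: bytes(stack[addr - 0x1000:addr - 0x1000 + size])

print([hex(a) for a in walk_rbp_chain(read, 0x1000, 0x400500)])
# ['0x400500', '0x401000', '0x402000']
```

When code is compiled without frame pointers, RBP holds an arbitrary value, so the walk either stops early or follows garbage; that is the failure mode described in the limitation.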
The full progress tracker and roadmap:
https://github.com/graemeg/opdebugger/blob/master/docs/progress.adoc
After many years of building my career on Free Pascal and Lazarus, I
wanted to give something substantial back to the community beyond
individual bug fixes. OPDF and PDR have become a passion project that
I intend to continue developing, and I hope the FPC community finds
them useful.
I welcome any feedback on the approach, the compiler integration, or
the format design. Happy to answer questions.
Regards,
- Graeme -
_______________________________________________
fpc-devel maillist
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel