Issue 182907
Summary Libclang and its Python bindings lack a holistic approach to error reporting
Labels clang:as-a-library
Assignees
Reporter Endilll
Error reporting in both libclang and its Python bindings is a patchwork on many levels, even including naming.

In libclang:
1. `CXErrorCode`, despite its generic name, is used by just 12 functions across the C API.
2. `CXResult`, despite its generic name, is used by 4 search functions that take an AST visitor.
3. `clang_saveTranslationUnit` has its own `CXSaveError`.
4. Five functions that query various layout properties have their own `CXTypeLayoutError`.
5. `clang_CompilationDatabase_fromDirectory` has its own `CXCompilationDatabase_Error`.
6. `clang_loadDiagnostics` has its own `CXLoadDiag_Error`.
7. Many functions go for nullptr, empty strings, `-1`, etc. when any error occurs.

On top of that, the C APIs don't agree on whether errors should be reported via the return value or an out-parameter.
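To make the mismatch concrete, here is a minimal Python sketch of two of the conventions listed above. The enumerator values are meant to mirror `CXErrorCode` and `CXTypeLayoutError` as declared in the `clang-c` headers, but treat them as illustrative and check them against your headers. The helpers `check_error_code` and `check_layout_result` are hypothetical and are not part of libclang or its bindings.

```python
from enum import IntEnum


class CXErrorCode(IntEnum):
    # Convention A: zero is success, any other value is an error.
    SUCCESS = 0
    FAILURE = 1
    CRASHED = 2
    INVALID_ARGUMENTS = 3
    AST_READ_ERROR = 4


class CXTypeLayoutError(IntEnum):
    # Convention B: errors are negative and share the return value
    # with a legitimate non-negative result (a size, alignment, or offset).
    INVALID = -1
    INCOMPLETE = -2
    DEPENDENT = -3
    NOT_CONSTANT_SIZE = -4
    INVALID_FIELD_NAME = -5
    UNDEDUCED = -6


def check_error_code(result: int) -> None:
    """Hypothetical helper for functions that return CXErrorCode."""
    if result != CXErrorCode.SUCCESS:
        raise RuntimeError(f"libclang call failed: {CXErrorCode(result).name}")


def check_layout_result(result: int) -> int:
    """Hypothetical helper for layout queries: negative means error."""
    if result < 0:
        raise ValueError(f"layout query failed: {CXTypeLayoutError(result).name}")
    return result
```

A caller has to know which family a function belongs to before it can even tell success from failure, and each remaining family (`CXSaveError`, `CXLoadDiag_Error`, sentinel values such as `-1`) would need yet another helper.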

In Python bindings:
1. `LibclangError`, despite its generic name, is used when Python bindings fail to load the shared library or fail to find a specific function there.
2. `TranslationUnitLoadError` is a type derived from `Exception` that seems to be used for functions that return `CXErrorCode`, but doesn't contain any additional information.
3. `TranslationUnitSaveError` is a type derived from `Exception` that seems to mirror `CXSaveError`, including its enumerators. The result of C API call is stored in `save_error`.
4. `CompilationDatabaseError` mirrors `CXCompilationDatabase_Error` with its enumerators, and is used both for reporting C API errors and errors within Python bindings.
5. Various built-in Python exception types are used to report invalid inputs before and after hitting C APIs, including some thoughtful mappings: for example, an error returned by `clang_getDiagnostic` (by index) is raised as `IndexError`.

Generally, I think that Python bindings should throw exceptions that are as useful as possible, including built-in exceptions where they fit the semantics. This implies more exception types, embedded in a single hierarchy, that carry as much additional information as we can get. That brings us to the C API, which likely doesn't provide as much useful information. Rectifying this situation on top of a patchwork-y foundation doesn't sound like a good idea to me. On top of that, we're bound by hard ABI guarantees on the C side, and a somewhat softer source compatibility intent on both the C and Python sides.
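As a rough illustration of what "a single hierarchy" could look like, here is a minimal sketch. It is not a proposal for the actual design: the base class `ClangError`, the `function` and `code` attributes, and the `DiagnosticIndexError` class are all hypothetical, and the existing names `TranslationUnitLoadError` and `TranslationUnitSaveError` are reused only to show where they might sit.

```python
class ClangError(Exception):
    """Hypothetical common base for every error raised by the bindings."""

    def __init__(self, message, *, function=None, code=None):
        super().__init__(message)
        self.function = function  # name of the C API function that failed, if any
        self.code = code          # raw value returned by the C API, if any


class TranslationUnitError(ClangError):
    """Hypothetical intermediate class for translation unit failures."""


class TranslationUnitLoadError(TranslationUnitError):
    """Loading or parsing a translation unit failed."""


class TranslationUnitSaveError(TranslationUnitError):
    """Saving a translation unit failed."""


class DiagnosticIndexError(ClangError, IndexError):
    """Hypothetical: stays catchable as the built-in IndexError."""
```

With multiple inheritance, callers that already catch a built-in exception keep working, while new code can catch the common base and inspect the extra fields:

```python
try:
    raise DiagnosticIndexError(
        "no diagnostic at index 5", function="clang_getDiagnostic"
    )
except IndexError as e:
    # The same object is also a ClangError and carries the function name.
    assert isinstance(e, ClangError)
    assert e.function == "clang_getDiagnostic"
```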

Designing this given the goals and constraints will take considerable time and effort. In the meantime, we should stop digging the hole with new changes in this area.