bruns created this revision.
bruns added a reviewer: Frameworks.
Herald added projects: Frameworks, Build System.
Herald added subscribers: kde-buildsystem, kde-frameworks-devel.
bruns requested review of this revision.

REVISION SUMMARY
  Depending on the locale, Python 3 may try to decode the source as ASCII
  when the file is opened in text mode. This fails as soon as the code
  contains UTF-8, e.g. (c) symbols.

  While it is possible to specify the encoding when reading the file, this
  is bad for several reasons:
  - only a very small part of the source is processed via _read_source, so
    there is no need to decode the complete source and store it as string
    objects
  - the clang Cursor.extent.{start,end}.column refers to bytes, not
    multibyte characters

  Python 2 processes sources containing UTF-8 without error messages, but
  the wrong extent borders are an issue there as well. The practical impact
  is low, as the issue only manifests if there is a multibyte character in
  front of *and* on the same line as the token being read.

TEST PLAN
  Python 3: Build any bindings whose sources contain non-ASCII codepoints,
  e.g. kcoreaddons. The unpatched version fails when using e.g. LANG=C.
  Python 2: Both versions generate sources successfully.

REPOSITORY
  R240 Extra CMake Modules

REVISION DETAIL
  https://phabricator.kde.org/D15068

AFFECTED FILES
  find-modules/sip_generator.py

To: bruns, #frameworks
Cc: kde-frameworks-devel, kde-buildsystem, michaelh, ngraham, bruns
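
As an illustration of the byte-column point in the revision summary, here is a minimal sketch. It is not the code from sip_generator.py: the sample line, the helper names slice_bytes and slice_text, and the column values are hypothetical, and it assumes 1-based byte columns with an exclusive end, which is how clang extents are understood here. It shows that slicing raw bytes and decoding only the result yields the token, that reusing the same columns on a decoded string is off by one per extra byte, and that ASCII decoding fails outright.

```python
# Hypothetical sample line: a copyright sign (2 bytes in UTF-8) sits in
# front of the token on the same line, the case described in the summary.
line_text = "/* \u00a9 */ int answer;"
line_bytes = line_text.encode("utf-8")

token = b"answer"
# Assumed convention: 1-based byte columns, end column is exclusive.
start_col = line_bytes.index(token) + 1
end_col = start_col + len(token)


def slice_bytes(data, start, end):
    """Slice raw bytes by byte columns, then decode only the small result."""
    return data[start - 1:end - 1].decode("utf-8")


def slice_text(text, start, end):
    """Slice an already decoded string with the same byte-based columns."""
    return text[start - 1:end - 1]


from_bytes = slice_bytes(line_bytes, start_col, end_col)
from_text = slice_text(line_text, start_col, end_col)

# Decoding as ASCII, as a text-mode open may do under LANG=C, fails.
try:
    line_bytes.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(from_bytes, from_text, ascii_ok)
```

Reading the file in binary mode (open(path, "rb")) and decoding only the extracted slice avoids both problems in this sketch.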