Script 'mail_helper' called by obssrc
Hello community,

here is the log from the commit of package python-charset-normalizer for openSUSE:Factory checked in at 2022-02-17 00:29:57
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/python-charset-normalizer (Old)
 and      /work/SRC/openSUSE:Factory/.python-charset-normalizer.new.1956 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "python-charset-normalizer" Thu Feb 17 00:29:57 2022 rev:12 rq:954654 version:2.0.12 Changes: -------- --- /work/SRC/openSUSE:Factory/python-charset-normalizer/python-charset-normalizer.changes 2022-01-11 21:20:37.289015879 +0100 +++ /work/SRC/openSUSE:Factory/.python-charset-normalizer.new.1956/python-charset-normalizer.changes 2022-02-17 00:30:05.709437803 +0100 @@ -1,0 +2,9 @@ +Tue Feb 15 08:42:30 UTC 2022 - Dirk M??ller <dmuel...@suse.com> + +- update to 2.0.12: + * ASCII miss-detection on rare cases (PR #170) + * Explicit support for Python 3.11 (PR #164) + * The logging behavior have been completely reviewed, now using only TRACE + and DEBUG levels + +------------------------------------------------------------------- Old: ---- charset_normalizer-2.0.10.tar.gz New: ---- charset_normalizer-2.0.12.tar.gz ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Other differences: ------------------ ++++++ python-charset-normalizer.spec ++++++ --- /var/tmp/diff_new_pack.UHOZtE/_old 2022-02-17 00:30:06.421437680 +0100 +++ /var/tmp/diff_new_pack.UHOZtE/_new 2022-02-17 00:30:06.425437679 +0100 @@ -19,7 +19,7 @@ %{?!python_module:%define python_module() python-%{**} python3-%{**}} %define skip_python2 1 Name: python-charset-normalizer -Version: 2.0.10 +Version: 2.0.12 Release: 0 Summary: Python Universal Charset detector License: MIT ++++++ charset_normalizer-2.0.10.tar.gz -> charset_normalizer-2.0.12.tar.gz ++++++ diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/charset_normalizer-2.0.10/.github/workflows/detector-coverage.yml new/charset_normalizer-2.0.12/.github/workflows/detector-coverage.yml --- old/charset_normalizer-2.0.10/.github/workflows/detector-coverage.yml 2022-01-04 21:14:06.000000000 +0100 +++ new/charset_normalizer-2.0.12/.github/workflows/detector-coverage.yml 2022-02-12 15:24:47.000000000 +0100 @@ -31,7 +31,7 @@ git clone https://github.com/Ousret/char-dataset.git - name: Coverage WITH preemptive run: | - python ./bin/coverage.py --coverage 98 --with-preemptive + python ./bin/coverage.py --coverage 97 --with-preemptive - name: Coverage WITHOUT preemptive run: | python ./bin/coverage.py --coverage 95 diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/charset_normalizer-2.0.10/.github/workflows/python-publish.yml new/charset_normalizer-2.0.12/.github/workflows/python-publish.yml --- old/charset_normalizer-2.0.10/.github/workflows/python-publish.yml 2022-01-04 21:14:06.000000000 +0100 +++ new/charset_normalizer-2.0.12/.github/workflows/python-publish.yml 2022-02-12 15:24:47.000000000 +0100 @@ -101,7 +101,7 @@ git clone https://github.com/Ousret/char-dataset.git - name: Coverage WITH preemptive run: | - python ./bin/coverage.py --coverage 98 --with-preemptive + python ./bin/coverage.py --coverage 97 --with-preemptive - name: Coverage WITHOUT preemptive run: | python ./bin/coverage.py --coverage 95 diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/charset_normalizer-2.0.10/.github/workflows/run-tests.yml new/charset_normalizer-2.0.12/.github/workflows/run-tests.yml --- old/charset_normalizer-2.0.10/.github/workflows/run-tests.yml 2022-01-04 21:14:06.000000000 +0100 +++ new/charset_normalizer-2.0.12/.github/workflows/run-tests.yml 2022-02-12 15:24:47.000000000 +0100 @@ -9,7 +9,7 @@ strategy: fail-fast: false matrix: - python-version: [3.5, 3.6, 3.7, 3.8, 3.9, "3.10"] + python-version: [3.5, 3.6, 3.7, 3.8, 3.9, "3.10", 
"3.11.0-alpha.4"] os: [ubuntu-latest] steps: diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/charset_normalizer-2.0.10/CHANGELOG.md new/charset_normalizer-2.0.12/CHANGELOG.md --- old/charset_normalizer-2.0.10/CHANGELOG.md 2022-01-04 21:14:06.000000000 +0100 +++ new/charset_normalizer-2.0.12/CHANGELOG.md 2022-02-12 15:24:47.000000000 +0100 @@ -2,6 +2,19 @@ All notable changes to charset-normalizer will be documented in this file. This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/). +## [2.0.12](https://github.com/Ousret/charset_normalizer/compare/2.0.11...2.0.12) (2022-02-12) + +### Fixed +- ASCII miss-detection on rare cases (PR #170) + +## [2.0.11](https://github.com/Ousret/charset_normalizer/compare/2.0.10...2.0.11) (2022-01-30) + +### Added +- Explicit support for Python 3.11 (PR #164) + +### Changed +- The logging behavior have been completely reviewed, now using only TRACE and DEBUG levels (PR #163 #165) + ## [2.0.10](https://github.com/Ousret/charset_normalizer/compare/2.0.9...2.0.10) (2022-01-04) ### Fixed diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/charset_normalizer-2.0.10/README.md new/charset_normalizer-2.0.12/README.md --- old/charset_normalizer-2.0.10/README.md 2022-01-04 21:14:06.000000000 +0100 +++ new/charset_normalizer-2.0.12/README.md 2022-02-12 15:24:47.000000000 +0100 @@ -33,12 +33,13 @@ | `License` | LGPL-2.1 | MIT | MPL-1.1 | `Native Python` | :heavy_check_mark: | :heavy_check_mark: | ??? | | `Detect spoken language` | ??? | :heavy_check_mark: | N/A | -| `Supported Encoding` | 30 | :tada: [93](https://charset-normalizer.readthedocs.io/en/latest/support.html) | 40 +| `Supported Encoding` | 30 | :tada: [93](https://charset-normalizer.readthedocs.io/en/latest/user/support.html#supported-encodings) | 40 <p align="center"> <img src="https://i.imgflip.com/373iay.gif" alt="Reading Normalized Text" width="226"/><img src="https://media.tenor.com/images/c0180f70732a18b4965448d33adba3d0/tenor.gif" alt="Cat Reading Text" width="200"/> *\*\* : They are clearly using specific code for a specific encoding even if covering most of used one*<br> +Did you got there because of the logs? See [https://charset-normalizer.readthedocs.io/en/latest/user/miscellaneous.html](https://charset-normalizer.readthedocs.io/en/latest/user/miscellaneous.html) ## ??? 
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/charset_normalizer-2.0.10/bin/bc.py new/charset_normalizer-2.0.12/bin/bc.py
--- old/charset_normalizer-2.0.10/bin/bc.py	2022-01-04 21:14:06.000000000 +0100
+++ new/charset_normalizer-2.0.12/bin/bc.py	2022-02-12 15:24:47.000000000 +0100
@@ -43,7 +43,7 @@
     success_count = 0
     total_count = 0
 
-    for tbt_path in glob("./char-dataset/**/*.*"):
+    for tbt_path in sorted(glob("./char-dataset/**/*.*")):
         total_count += 1
 
         with open(tbt_path, "rb") as fp:
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/charset_normalizer-2.0.10/bin/coverage.py new/charset_normalizer-2.0.12/bin/coverage.py
--- old/charset_normalizer-2.0.10/bin/coverage.py	2022-01-04 21:14:06.000000000 +0100
+++ new/charset_normalizer-2.0.12/bin/coverage.py	2022-02-12 15:24:47.000000000 +0100
@@ -43,7 +43,7 @@
     success_count = 0
     total_count = 0
 
-    for tbt_path in glob("./char-dataset/**/*.*"):
+    for tbt_path in sorted(glob("./char-dataset/**/*.*")):
         expected_encoding = tbt_path.split(sep)[-2]
         total_count += 1
 
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/charset_normalizer-2.0.10/bin/performance.py new/charset_normalizer-2.0.12/bin/performance.py
--- old/charset_normalizer-2.0.10/bin/performance.py	2022-01-04 21:14:06.000000000 +0100
+++ new/charset_normalizer-2.0.12/bin/performance.py	2022-02-12 15:24:47.000000000 +0100
@@ -37,7 +37,7 @@
     chardet_results = []
     charset_normalizer_results = []
 
-    for tbt_path in glob("./char-dataset/**/*.*"):
+    for tbt_path in sorted(glob("./char-dataset/**/*.*")):
         print(tbt_path)
 
         # Read Bin file
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/charset_normalizer-2.0.10/bin/serve.py new/charset_normalizer-2.0.12/bin/serve.py
--- old/charset_normalizer-2.0.10/bin/serve.py	2022-01-04 21:14:06.000000000 +0100
+++ new/charset_normalizer-2.0.12/bin/serve.py	2022-02-12 15:24:47.000000000 +0100
@@ -13,7 +13,7 @@
 def read_targets():
     return jsonify(
         [
-            el.replace("./char-dataset", "/raw").replace("\\", "/") for el in glob("./char-dataset/**/*")
+            el.replace("./char-dataset", "/raw").replace("\\", "/") for el in sorted(glob("./char-dataset/**/*"))
         ]
    )
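
[All four bin/ scripts receive the same one-line change: wrapping glob() in sorted(). glob() yields paths in filesystem order, which varies between machines, so sorting makes the coverage and benchmark runs deterministic. A minimal sketch of the pattern, assuming a char-dataset checkout like the scripts above use:

    from glob import glob

    # Filesystem enumeration order is not stable across OSes/filesystems;
    # sorted() pins the iteration order so successive runs visit files identically.
    for tbt_path in sorted(glob("./char-dataset/**/*.*")):
        print(tbt_path)
]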
intention.") + logger.debug("Encoding detection on empty bytes, assuming utf_8 intention.") if explain: logger.removeHandler(explain_handler) logger.setLevel(previous_logger_level or logging.WARNING) return CharsetMatches([CharsetMatch(sequences, "utf_8", 0.0, False, [], "")]) if cp_isolation is not None: - logger.debug( + logger.log( + TRACE, "cp_isolation is set. use this flag for debugging purpose. " "limited list of encoding allowed : %s.", ", ".join(cp_isolation), @@ -92,7 +95,8 @@ cp_isolation = [] if cp_exclusion is not None: - logger.debug( + logger.log( + TRACE, "cp_exclusion is set. use this flag for debugging purpose. " "limited list of encoding excluded : %s.", ", ".join(cp_exclusion), @@ -102,7 +106,8 @@ cp_exclusion = [] if length <= (chunk_size * steps): - logger.debug( + logger.log( + TRACE, "override steps (%i) and chunk_size (%i) as content does not fit (%i byte(s) given) parameters.", steps, chunk_size, @@ -118,16 +123,18 @@ is_too_large_sequence = len(sequences) >= TOO_BIG_SEQUENCE # type: bool if is_too_small_sequence: - logger.warning( + logger.log( + TRACE, "Trying to detect encoding from a tiny portion of ({}) byte(s).".format( length - ) + ), ) elif is_too_large_sequence: - logger.info( + logger.log( + TRACE, "Using lazy str decoding because the payload is quite large, ({}) byte(s).".format( length - ) + ), ) prioritized_encodings = [] # type: List[str] @@ -138,7 +145,8 @@ if specified_encoding is not None: prioritized_encodings.append(specified_encoding) - logger.info( + logger.log( + TRACE, "Detected declarative mark in sequence. Priority +1 given for %s.", specified_encoding, ) @@ -157,7 +165,8 @@ if sig_encoding is not None: prioritized_encodings.append(sig_encoding) - logger.info( + logger.log( + TRACE, "Detected a SIG or BOM mark on first %i byte(s). Priority +1 given for %s.", len(sig_payload), sig_encoding, @@ -188,7 +197,8 @@ ) # type: bool if encoding_iana in {"utf_16", "utf_32"} and not bom_or_sig_available: - logger.debug( + logger.log( + TRACE, "Encoding %s wont be tested as-is because it require a BOM. Will try some sub-encoder LE/BE.", encoding_iana, ) @@ -197,8 +207,10 @@ try: is_multi_byte_decoder = is_multi_byte_encoding(encoding_iana) # type: bool except (ModuleNotFoundError, ImportError): - logger.debug( - "Encoding %s does not provide an IncrementalDecoder", encoding_iana + logger.log( + TRACE, + "Encoding %s does not provide an IncrementalDecoder", + encoding_iana, ) continue @@ -219,7 +231,8 @@ ) except (UnicodeDecodeError, LookupError) as e: if not isinstance(e, LookupError): - logger.debug( + logger.log( + TRACE, "Code page %s does not fit given bytes sequence at ALL. %s", encoding_iana, str(e), @@ -235,7 +248,8 @@ break if similar_soft_failure_test: - logger.debug( + logger.log( + TRACE, "%s is deemed too similar to code page %s and was consider unsuited already. Continuing!", encoding_iana, encoding_soft_failed, @@ -255,7 +269,8 @@ ) # type: bool if multi_byte_bonus: - logger.debug( + logger.log( + TRACE, "Code page %s is a multi byte encoding table and it appear that at least one character " "was encoded using n-bytes.", encoding_iana, @@ -285,7 +300,8 @@ errors="ignore" if is_multi_byte_decoder else "strict", ) # type: str except UnicodeDecodeError as e: # Lazy str loading may have missed something there - logger.debug( + logger.log( + TRACE, "LazyStr Loading: After MD chunk decode, code page %s does not fit given bytes sequence at ALL. 
%s", encoding_iana, str(e), @@ -337,7 +353,8 @@ try: sequences[int(50e3) :].decode(encoding_iana, errors="strict") except UnicodeDecodeError as e: - logger.debug( + logger.log( + TRACE, "LazyStr Loading: After final lookup, code page %s does not fit given bytes sequence at ALL. %s", encoding_iana, str(e), @@ -350,7 +367,8 @@ ) # type: float if mean_mess_ratio >= threshold or early_stop_count >= max_chunk_gave_up: tested_but_soft_failure.append(encoding_iana) - logger.info( + logger.log( + TRACE, "%s was excluded because of initial chaos probing. Gave up %i time(s). " "Computed mean chaos is %f %%.", encoding_iana, @@ -373,7 +391,8 @@ fallback_u8 = fallback_entry continue - logger.info( + logger.log( + TRACE, "%s passed initial chaos probing. Mean measured chaos is %f %%", encoding_iana, round(mean_mess_ratio * 100, ndigits=3), @@ -385,10 +404,11 @@ target_languages = mb_encoding_languages(encoding_iana) if target_languages: - logger.debug( + logger.log( + TRACE, "{} should target any language(s) of {}".format( encoding_iana, str(target_languages) - ) + ), ) cd_ratios = [] @@ -406,10 +426,11 @@ cd_ratios_merged = merge_coherence_ratios(cd_ratios) if cd_ratios_merged: - logger.info( + logger.log( + TRACE, "We detected language {} using {}".format( cd_ratios_merged, encoding_iana - ) + ), ) results.append( @@ -427,8 +448,8 @@ encoding_iana in [specified_encoding, "ascii", "utf_8"] and mean_mess_ratio < 0.1 ): - logger.info( - "%s is most likely the one. Stopping the process.", encoding_iana + logger.debug( + "Encoding detection: %s is most likely the one.", encoding_iana ) if explain: logger.removeHandler(explain_handler) @@ -436,8 +457,9 @@ return CharsetMatches([results[encoding_iana]]) if encoding_iana == sig_encoding: - logger.info( - "%s is most likely the one as we detected a BOM or SIG within the beginning of the sequence.", + logger.debug( + "Encoding detection: %s is most likely the one as we detected a BOM or SIG within " + "the beginning of the sequence.", encoding_iana, ) if explain: @@ -447,13 +469,15 @@ if len(results) == 0: if fallback_u8 or fallback_ascii or fallback_specified: - logger.debug( - "Nothing got out of the detection process. Using ASCII/UTF-8/Specified fallback." + logger.log( + TRACE, + "Nothing got out of the detection process. Using ASCII/UTF-8/Specified fallback.", ) if fallback_specified: logger.debug( - "%s will be used as a fallback match", fallback_specified.encoding + "Encoding detection: %s will be used as a fallback match", + fallback_specified.encoding, ) results.append(fallback_specified) elif ( @@ -465,12 +489,21 @@ ) or (fallback_u8 is not None) ): - logger.warning("utf_8 will be used as a fallback match") + logger.debug("Encoding detection: utf_8 will be used as a fallback match") results.append(fallback_u8) elif fallback_ascii: - logger.warning("ascii will be used as a fallback match") + logger.debug("Encoding detection: ascii will be used as a fallback match") results.append(fallback_ascii) + if results: + logger.debug( + "Encoding detection: Found %s as plausible (best-candidate) for content. 
With %i alternatives.", + results.best().encoding, # type: ignore + len(results) - 1, + ) + else: + logger.debug("Encoding detection: Unable to determine any suitable charset.") + if explain: logger.removeHandler(explain_handler) logger.setLevel(previous_logger_level) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/charset_normalizer-2.0.10/charset_normalizer/constant.py new/charset_normalizer-2.0.12/charset_normalizer/constant.py --- old/charset_normalizer-2.0.10/charset_normalizer/constant.py 2022-01-04 21:14:06.000000000 +0100 +++ new/charset_normalizer-2.0.12/charset_normalizer/constant.py 2022-02-12 15:24:47.000000000 +0100 @@ -498,3 +498,6 @@ NOT_PRINTABLE_PATTERN = re_compile(r"[0-9\W\n\r\t]+") LANGUAGE_SUPPORTED_COUNT = len(FREQUENCIES) # type: int + +# Logging LEVEL bellow DEBUG +TRACE = 5 # type: int diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/charset_normalizer-2.0.10/charset_normalizer/md.py new/charset_normalizer-2.0.12/charset_normalizer/md.py --- old/charset_normalizer-2.0.10/charset_normalizer/md.py 2022-01-04 21:14:06.000000000 +0100 +++ new/charset_normalizer-2.0.12/charset_normalizer/md.py 2022-02-12 15:24:47.000000000 +0100 @@ -314,7 +314,7 @@ self._buffer = "" self._buffer_accent_count = 0 elif ( - character not in {"<", ">", "-", "="} + character not in {"<", ">", "-", "=", "~", "|", "_"} and character.isdigit() is False and is_symbol(character) ): diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/charset_normalizer-2.0.10/charset_normalizer/version.py new/charset_normalizer-2.0.12/charset_normalizer/version.py --- old/charset_normalizer-2.0.10/charset_normalizer/version.py 2022-01-04 21:14:06.000000000 +0100 +++ new/charset_normalizer-2.0.12/charset_normalizer/version.py 2022-02-12 15:24:47.000000000 +0100 @@ -2,5 +2,5 @@ Expose version """ -__version__ = "2.0.10" +__version__ = "2.0.12" VERSION = __version__.split(".") diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/charset_normalizer-2.0.10/dev-requirements.txt new/charset_normalizer-2.0.12/dev-requirements.txt --- old/charset_normalizer-2.0.10/dev-requirements.txt 2022-01-04 21:14:06.000000000 +0100 +++ new/charset_normalizer-2.0.12/dev-requirements.txt 2022-02-12 15:24:47.000000000 +0100 @@ -4,7 +4,7 @@ chardet==4.0.* Flask>=2.0,<3.0; python_version >= '3.6' requests>=2.26,<3.0; python_version >= '3.6' -black==21.12b0; python_version >= '3.6' +black==22.1.0; python_version >= '3.6' flake8==4.0.1; python_version >= '3.6' -mypy==0.930; python_version >= '3.6' +mypy==0.931; python_version >= '3.6' isort; python_version >= '3.6' diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/charset_normalizer-2.0.10/docs/user/miscellaneous.rst new/charset_normalizer-2.0.12/docs/user/miscellaneous.rst --- old/charset_normalizer-2.0.10/docs/user/miscellaneous.rst 2022-01-04 21:14:06.000000000 +0100 +++ new/charset_normalizer-2.0.12/docs/user/miscellaneous.rst 2022-02-12 15:24:47.000000000 +0100 @@ -18,3 +18,29 @@ # This should print '????????????????????????????????????????????????' print(str(result)) + + +Logging +------- + +Prior to the version 2.0.10 you may encounter some unexpected logs in your streams. +Something along the line of: + + :: + + ... | WARNING | override steps (5) and chunk_size (512) as content does not fit (465 byte(s) given) parameters. + ... | INFO | ascii passed initial chaos probing. 
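
[The constant.py hunk above defines TRACE = 5, one notch below DEBUG (10), and the api.py hunk deliberately leaves `logging.addLevelName(TRACE, "TRACE")` commented out so that importing the library never mutates global logging state. An application that wants those records labeled "TRACE" instead of the stdlib's default "Level 5" can register the name itself; a minimal stdlib-only sketch, with the level value taken from the diff:

    import logging

    TRACE = 5  # matches charset_normalizer.constant.TRACE in 2.0.11+

    # Opt-in, application-side registration of the level name.
    logging.addLevelName(TRACE, "TRACE")

    logger = logging.getLogger("charset_normalizer")
    logger.setLevel(TRACE)
    logger.addHandler(logging.StreamHandler())

    # Records emitted at level 5 now render as "TRACE" rather than "Level 5".
    logger.log(TRACE, "trace-level record, named on the application side")
]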
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/charset_normalizer-2.0.10/docs/user/miscellaneous.rst new/charset_normalizer-2.0.12/docs/user/miscellaneous.rst
--- old/charset_normalizer-2.0.10/docs/user/miscellaneous.rst	2022-01-04 21:14:06.000000000 +0100
+++ new/charset_normalizer-2.0.12/docs/user/miscellaneous.rst	2022-02-12 15:24:47.000000000 +0100
@@ -18,3 +18,29 @@
 
     # This should print '????????????????????????????????????????????????'
     print(str(result))
+
+
+Logging
+-------
+
+Prior to the version 2.0.10 you may encounter some unexpected logs in your streams.
+Something along the line of:
+
+ ::
+
+    ... | WARNING | override steps (5) and chunk_size (512) as content does not fit (465 byte(s) given) parameters.
+    ... | INFO | ascii passed initial chaos probing. Mean measured chaos is 0.000000 %
+    ... | INFO | ascii should target any language(s) of ['Latin Based']
+
+
+It is most likely because you altered the root getLogger instance. The package has its own logic behind logging and why
+it is useful. See https://docs.python.org/3/howto/logging.html to learn the basics.
+
+If you are looking to silence and/or reduce drastically the amount of logs, please upgrade to the latest version
+available for `charset-normalizer` using your package manager or by `pip install charset-normalizer -U`.
+
+The latest version will no longer produce any entry greater than `DEBUG`.
+On `DEBUG` only one entry will be observed and that is about the detection result.
+
+Then regarding the others log entries, they will be pushed as `Level 5`. Commonly known as TRACE level, but we do
+not register it globally.
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/charset_normalizer-2.0.10/setup.py new/charset_normalizer-2.0.12/setup.py
--- old/charset_normalizer-2.0.10/setup.py	2022-01-04 21:14:06.000000000 +0100
+++ new/charset_normalizer-2.0.12/setup.py	2022-02-12 15:24:47.000000000 +0100
@@ -73,6 +73,7 @@
         'Programming Language :: Python :: 3.8',
         'Programming Language :: Python :: 3.9',
         'Programming Language :: Python :: 3.10',
+        'Programming Language :: Python :: 3.11',
         'Topic :: Text Processing :: Linguistic',
         'Topic :: Utilities',
         'Programming Language :: Python :: Implementation :: PyPy',
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/charset_normalizer-2.0.10/tests/test_logging.py new/charset_normalizer-2.0.12/tests/test_logging.py
--- old/charset_normalizer-2.0.10/tests/test_logging.py	2022-01-04 21:14:06.000000000 +0100
+++ new/charset_normalizer-2.0.12/tests/test_logging.py	2022-02-12 15:24:47.000000000 +0100
@@ -3,6 +3,7 @@
 
 from charset_normalizer.utils import set_logging_handler
 from charset_normalizer.api import from_bytes, explain_handler
+from charset_normalizer.constant import TRACE
 
 
 class TestLogBehaviorClass:
@@ -17,16 +18,16 @@
         from_bytes(test_sequence, steps=1, chunk_size=50, explain=True)
         assert explain_handler not in self.logger.handlers
         for record in caplog.records:
-            assert record.levelname in ["INFO", "DEBUG"]
+            assert record.levelname in ["Level 5", "DEBUG"]
 
     def test_explain_false_handler_set_behavior(self, caplog):
         test_sequence = b'This is a test sequence of bytes that should be sufficient'
-        set_logging_handler(level=logging.INFO, format_string="%(message)s")
+        set_logging_handler(level=TRACE, format_string="%(message)s")
         from_bytes(test_sequence, steps=1, chunk_size=50, explain=False)
         assert any(isinstance(hdl, logging.StreamHandler) for hdl in self.logger.handlers)
         for record in caplog.records:
-            assert record.levelname in ["INFO", "DEBUG"]
-        assert "ascii is most likely the one. Stopping the process." in caplog.text
+            assert record.levelname in ["Level 5", "DEBUG"]
+        assert "Encoding detection: ascii is most likely the one." in caplog.text
 
     def test_set_stream_handler(self, caplog):
         set_logging_handler(
@@ -34,7 +35,7 @@
         )
         self.logger.debug("log content should log with default format")
         for record in caplog.records:
-            assert record.levelname in ["INFO", "DEBUG"]
+            assert record.levelname in ["Level 5", "DEBUG"]
         assert "log content should log with default format" in caplog.text
 
     def test_set_stream_handler_format(self, caplog):
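
[For readers who landed here because of noisy logs: with 2.0.11+ the practical options follow from the docs and tests above. A short sketch of both directions — silencing the library entirely, or opting into full verbosity via the set_logging_handler helper that test_logging.py exercises (the level/format_string arguments are the ones shown in that diff):

    import logging
    from charset_normalizer.utils import set_logging_handler

    # Silence: keep only WARNING and above from the library's own logger.
    logging.getLogger("charset_normalizer").setLevel(logging.WARNING)

    # Verbose: attach a stream handler down to level 5 (TRACE), as the tests do.
    set_logging_handler(level=5, format_string="%(message)s")
]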