Meno25 created this task.
Meno25 added projects: Pywikibot, Pywikibot-Scripts, Pywikibot-cosmetic-changes.py.
Restricted Application added subscribers: pywikibot-bugs-list, alaa, Aklapper.
TASK DESCRIPTION

**Command line:**

    python pwb.py cosmetic_changes -page:"نقاش:السلفية/أرشيف 1" -lang:ar

**Output:**

    Retrieving 1 pages from wikipedia:ar.

    >>> نقاش:السلفية/أرشيف 1 <<<
    1 read operation
    Execution time: 1 seconds
    Read operation time: 1.0 seconds
    Script terminated by exception:

    ERROR: 'utf-8' codec can't decode byte 0xd9 in position 6: invalid continuation byte (UnicodeDecodeError)
    Traceback (most recent call last):
      File "C:\Users\Mohammed\Downloads\core\pwb.py", line 39, in <module>
        sys.exit(main())
                 ^^^^^^
      File "C:\Users\Mohammed\Downloads\core\pwb.py", line 35, in main
        runpy.run_path(str(path), run_name='__main__')
      File "<frozen runpy>", line 291, in run_path
      File "<frozen runpy>", line 98, in _run_module_code
      File "<frozen runpy>", line 88, in _run_code
      File "C:\Users\Mohammed\Downloads\core\pywikibot\scripts\wrapper.py", line 513, in <module>
        main()
      File "C:\Users\Mohammed\Downloads\core\pywikibot\scripts\wrapper.py", line 497, in main
        if not execute():
               ^^^^^^^^^
      File "C:\Users\Mohammed\Downloads\core\pywikibot\scripts\wrapper.py", line 484, in execute
        run_python_file(filename, script_args, module)
      File "C:\Users\Mohammed\Downloads\core\pywikibot\scripts\wrapper.py", line 147, in run_python_file
        exec(compile(source, filename, 'exec', dont_inherit=True),
      File "C:\Users\Mohammed\Downloads\core\scripts\cosmetic_changes.py", line 131, in <module>
        main()
      File "C:\Users\Mohammed\Downloads\core\scripts\cosmetic_changes.py", line 127, in main
        bot.run()
      File "C:\Users\Mohammed\Downloads\core\pywikibot\bot.py", line 1671, in run
        self.treat(page)
      File "C:\Users\Mohammed\Downloads\core\pywikibot\bot.py", line 1924, in treat
        self.treat_page()
      File "C:\Users\Mohammed\Downloads\core\scripts\cosmetic_changes.py", line 84, in treat_page
        new_text = cc_toolkit.change(old_text)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\Mohammed\Downloads\core\pywikibot\cosmetic_changes.py", line 302, in change
        new_text = self._change(text)
                   ^^^^^^^^^^^^^^^^^^
      File "C:\Users\Mohammed\Downloads\core\pywikibot\cosmetic_changes.py", line 296, in _change
        text = self.safe_execute(method, text)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\Mohammed\Downloads\core\pywikibot\cosmetic_changes.py", line 283, in safe_execute
        result = method(text)
                 ^^^^^^^^^^^^
      File "C:\Users\Mohammed\Downloads\core\pywikibot\cosmetic_changes.py", line 645, in cleanUpLinks
        text = textlib.replaceExcept(text, linkR, handleOneLink,
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\Mohammed\Downloads\core\pywikibot\textlib.py", line 452, in replaceExcept
        replacement = new(match)
                      ^^^^^^^^^^
      File "C:\Users\Mohammed\Downloads\core\pywikibot\cosmetic_changes.py", line 527, in handleOneLink
        is_interwiki = self.site.isInterwikiLink(titleWithSection)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\Mohammed\Downloads\core\pywikibot\site\_basesite.py", line 336, in isInterwikiLink
        linkfam, linkcode = pywikibot.Link(text, self).parse_site()
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\Mohammed\Downloads\core\pywikibot\page\_links.py", line 300, in __init__
        self._text = pywikibot.tools.chars.url2string(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\Mohammed\Downloads\core\pywikibot\tools\chars.py", line 136, in url2string
        raise first_exception
      File "C:\Users\Mohammed\Downloads\core\pywikibot\tools\chars.py", line 128, in url2string
        result = t.decode(enc)
                 ^^^^^^^^^^^^^
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd9 in position 6: invalid continuation byte
    CRITICAL: Exiting due to uncaught exception UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd9 in position 6: invalid continuation byte

**What should have happened instead?**:

When it encounters such an error, the bot should skip the page and continue with the remaining pages. Instead it crashes, which forces me to restart the bot run.
**Note:**

This task is similar to T304288: reflinks.py: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc2 in position 18: invalid continuation byte <https://phabricator.wikimedia.org/T304288>, which was fixed in rPWBC4298a6cd362f82fdb3af2794cafa9aea13ad7859 <https://phabricator.wikimedia.org/rPWBC4298a6cd362f82fdb3af2794cafa9aea13ad7859>, but for the script `cosmetic_changes.py` instead of `reflinks.py`.

**Software version:**

    Pywikibot: [https] r-pywikibot-core (6ef2645, g17994, 2023/07/20, 13:19:10, master)
    Release version: 8.3.0.dev0
    setuptools version: 68.0.0
    mwparserfromhell version: 0.6.4
    wikitextparser version: n/a
    requests version: 2.31.0
    certificate test: ok
    Python: 3.11.4 (tags/v3.11.4:d2340ef, Jun 7 2023, 05:45:37) [MSC v.1934 64 bit (AMD64)]

TASK DETAIL
  https://phabricator.wikimedia.org/T342470

To: Meno25
Cc: Aklapper, alaa, pywikibot-bugs-list, Meno25, PotsdamLamb, Jyoo1011, JohnsonLee01, SHEKH, Dijkstra, Khutuck, Zkhalido, Viztor, Wenyi, Tbscho, MayS, Mdupont, JJMC89, Dvorapa, Altostratus, Avicennasis, mys_721tx, Xqt, jayvdb, Masti, Alchimista
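
For illustration only, the skip-and-continue behaviour requested under "What should have happened instead?" could look roughly like the sketch below. It is not Pywikibot code and not the fix applied in T304288. `decode_percent_encoded` and `treat_all` are hypothetical names, the standard library's `urllib.parse.unquote_to_bytes` stands in for the strict decoding step that fails in the traceback, and the sample titles are made up. The second title is built so that byte 0xD9 at position 6 is followed by a non-continuation byte, which is the same kind of failure reported above.

```python
from urllib.parse import unquote_to_bytes


def decode_percent_encoded(title: str, encoding: str = "utf-8") -> str:
    """Percent-decode a link title strictly (hypothetical helper).

    Raises UnicodeDecodeError if the decoded bytes are not valid
    in the given encoding.
    """
    return unquote_to_bytes(title).decode(encoding)


def treat_all(titles, treat):
    """Apply `treat` to every title, skipping titles that fail to decode."""
    done, skipped = [], []
    for title in titles:
        try:
            done.append(treat(title))
        except UnicodeDecodeError as exc:
            # Report the problem and move on instead of aborting the run.
            print(f"Skipping {title!r}: {exc}")
            skipped.append(title)
    return done, skipped


# Made-up link targets. In the second one the lead byte 0xD9 (position 6)
# is followed by "x" rather than a continuation byte, so strict UTF-8
# decoding raises UnicodeDecodeError.
titles = ["%D8%A7%D9%84", "%D8%A7%D9%84%D8%B3%D9x", "Main_Page"]
done, skipped = treat_all(titles, decode_percent_encoded)
```

With this shape, one bad link costs only the page it appears on, and the list of skipped titles can be reviewed after the run. Whether the exception is best caught per page or per link inside `cleanUpLinks` is a design choice this sketch does not settle.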