Matěj Suchánek has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/329761 )
Change subject: Fix and improve default regexes
......................................................................

Fix and improve default regexes

- Remove unnecessary flags.
- Clean up 'header' using multiline.
- Expand 'pre' to support HTML attributes (mostly 'style').
- Update 'property' to support parameters (currently, it supports
  "|from=" but it might support more in the future).
- Localize 'property' and 'invoke' using magic words.

Change-Id: Ib805bf70cb1cc99711138d7d6c7e40971f31b602
---
M pywikibot/textlib.py
1 file changed, 9 insertions(+), 7 deletions(-)

  git pull ssh://gerrit.wikimedia.org:29418/pywikibot/core refs/changes/61/329761/2

diff --git a/pywikibot/textlib.py b/pywikibot/textlib.py
index 9f7782e..2908ee1 100644
--- a/pywikibot/textlib.py
+++ b/pywikibot/textlib.py
@@ -221,13 +221,13 @@
     _regex_cache.update({
         'comment': re.compile(r'(?s)<!--.*?-->'),
         # section headers
-        'header': re.compile(r'\r?\n=+.+=+ *\r?\n'),
+        'header': re.compile(r'(?m)^=+.+=+ *$'),
         # preformatted text
-        'pre': re.compile(r'(?ism)<pre>.*?</pre>'),
+        'pre': re.compile(r'(?is)<pre[ >].*?</pre>'),
         'source': re.compile(r'(?is)<source .*?</source>'),
-        'score': re.compile(r'(?ism)<score[ >].*?</score>'),
+        'score': re.compile(r'(?is)<score[ >].*?</score>'),
         # inline references
-        'ref': re.compile(r'(?ism)<ref[ >].*?</ref>'),
+        'ref': re.compile(r'(?is)<ref[ >].*?</ref>'),
         'template': NESTED_TEMPLATE_REGEX,
         # lines that start with a space are shown in a monospace font and
         # have whitespace preserved.
@@ -247,11 +247,13 @@
                 site.validLanguageLinks() +
                 list(site.family.obsolete.keys()))),
         # Wikibase property inclusions
-        'property': re.compile(r'(?i)\{\{\s*#property:\s*p\d+\s*\}\}'),
+        'property': (r'(?i)\{\{\s*#(%s):\s*p\d+.*?\}\}',
+                     lambda site: '|'.join(site.getmagicwords('property'))),
         # Module invocations (currently only Lua)
-        'invoke': re.compile(r'(?i)\{\{\s*#invoke:.*?}\}'),
+        'invoke': (r'(?i)\{\{\s*#(%s):.*?\}\}',
+                   lambda site: '|'.join(site.getmagicwords('invoke'))),
         # categories
-        'category': ('\[\[ *(?:%s)\s*:.*?\]\]',
+        'category': (r'\[\[ *(?:%s)\s*:.*?\]\]',
                      lambda site: '|'.join(site.namespaces[14])),
         # files
         'file': (FILE_LINK_REGEX,

-- 
To view, visit https://gerrit.wikimedia.org/r/329761
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib805bf70cb1cc99711138d7d6c7e40971f31b602
Gerrit-PatchSet: 2
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Matěj Suchánek <matejsuchane...@gmail.com>
Gerrit-Reviewer: jenkins-bot <>

_______________________________________________
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
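
For reference, a minimal standalone sketch of how the old and new patterns from the diff above behave. It uses only the standard `re` module, not pywikibot. The sample wikitext strings are hypothetical, and the magic-word list is assumed to be just `['property']`; in the actual change that list comes from `site.getmagicwords('property')`, which needs a live site object.

```python
import re

# Old and new patterns, copied from the diff.
OLD_HEADER = re.compile(r'\r?\n=+.+=+ *\r?\n')
NEW_HEADER = re.compile(r'(?m)^=+.+=+ *$')
OLD_PRE = re.compile(r'(?ism)<pre>.*?</pre>')
NEW_PRE = re.compile(r'(?is)<pre[ >].*?</pre>')
OLD_PROPERTY = re.compile(r'(?i)\{\{\s*#property:\s*p\d+\s*\}\}')

# Assumption: only the English magic word is known. The change itself
# builds this alternation from site.getmagicwords('property').
magic_words = ['property']
NEW_PROPERTY = re.compile(
    r'(?i)\{\{\s*#(%s):\s*p\d+.*?\}\}' % '|'.join(magic_words))

# Hypothetical sample: two adjacent headers at the very start of a page.
header_text = '== First ==\n== Second ==\nbody\n'
# The old pattern needs a newline before the header, so it misses the first.
old_headers = [m.group().strip() for m in OLD_HEADER.finditer(header_text)]
# The multiline pattern anchors on line boundaries and finds both.
new_headers = [m.group() for m in NEW_HEADER.finditer(header_text)]

# Hypothetical sample: a <pre> tag carrying an HTML attribute.
pre_text = '<pre style="color:red">x</pre>'
old_pre = OLD_PRE.search(pre_text)
new_pre = NEW_PRE.search(pre_text)

# Hypothetical sample: a property inclusion with a "|from=" parameter.
prop_text = '{{#property:P31|from=Q42}}'
old_prop = OLD_PROPERTY.search(prop_text)
new_prop = NEW_PROPERTY.search(prop_text)

print(old_headers, new_headers)
print(old_pre, new_pre)
print(old_prop, new_prop)
```

In this sketch the old 'header' pattern skips a header on the first line of the text, and the old 'pre' and 'property' patterns reject input with attributes or parameters, while the new patterns match all three samples.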