Cscott has uploaded a new change for review. https://gerrit.wikimedia.org/r/179185

Change subject: Don't break autolinks by stripping the final semicolon from an entity.
......................................................................

Don't break autolinks by stripping the final semicolon from an entity.

Autolinking free external links is clever about making sure that
trailing punctuation isn't included in the link. But if an HTML
entity happens to terminate the URL, the semicolon from the entity
is stripped from the url, breaking it. Fix this corner case.

This also unifies autolink parsing with Parsoid.
See: I5ae8435322c78dd1df170d7a3543fff3642759b1

Change-Id: I5482782c25e12283030b0fd2150ac55092f7979b
---
M includes/parser/Parser.php
M tests/parser/parserTests.txt
2 files changed, 21 insertions(+), 0 deletions(-)

  git pull ssh://gerrit.wikimedia.org:29418/mediawiki/core refs/changes/85/179185/1

diff --git a/includes/parser/Parser.php b/includes/parser/Parser.php
index 5c8253a..2d684ba 100644
--- a/includes/parser/Parser.php
+++ b/includes/parser/Parser.php
@@ -1485,6 +1485,13 @@
 			}
 
 			$numSepChars = strspn( strrev( $url ), $sep );
+			# Don't break a trailing HTML entity
+			if ( $numSepChars && substr( $url, -$numSepChars, 1 ) === ';') {
+				$chopped = substr( $url, 0, -$numSepChars );
+				if ( preg_match( '/&([a-z]+|#x[\da-f]+|#\d+)$/i', $chopped ) ) {
+					$numSepChars--;
+				}
+			}
 			if ( $numSepChars ) {
 				$trail = substr( $url, -$numSepChars ) . $trail;
 				$url = substr( $url, 0, -$numSepChars );
diff --git a/tests/parser/parserTests.txt b/tests/parser/parserTests.txt
index 5f19e8b..0e62459 100644
--- a/tests/parser/parserTests.txt
+++ b/tests/parser/parserTests.txt
@@ -4171,6 +4171,13 @@
 http://example.com?
 http://example.com)
 http://example.com/url_with_(brackets)
+(http://example.com/url_without_brackets)
+http://example.com/url_with_entity
+http://example.com/url_with_entity 
+http://example.com/url_with_entity 
+http://example.com/url_with_entity<
+http://example.com/url_with_entity<
+http://example.com/url_with_entity<
 !! html
 <p><a rel="nofollow" class="external free" href="http://example.com">http://example.com</a>,
 <a rel="nofollow" class="external free" href="http://example.com">http://example.com</a>;
@@ -4181,6 +4188,13 @@
 <a rel="nofollow" class="external free" href="http://example.com">http://example.com</a>?
 <a rel="nofollow" class="external free" href="http://example.com">http://example.com</a>)
 <a rel="nofollow" class="external free" href="http://example.com/url_with_(brackets)">http://example.com/url_with_(brackets)</a>
+(<a rel="nofollow" class="external free" href="http://example.com/url_without_brackets">http://example.com/url_without_brackets</a>)
+<a rel="nofollow" class="external free" href="http://example.com/url_with_entity ">http://example.com/url_with_entity </a>
+<a rel="nofollow" class="external free" href="http://example.com/url_with_entity ">http://example.com/url_with_entity </a>
+<a rel="nofollow" class="external free" href="http://example.com/url_with_entity ">http://example.com/url_with_entity </a>
+<a rel="nofollow" class="external free" href="http://example.com/url_with_entity">http://example.com/url_with_entity</a><
+<a rel="nofollow" class="external free" href="http://example.com/url_with_entity%3C">http://example.com/url_with_entity%3C</a>
+<a rel="nofollow" class="external free" href="http://example.com/url_with_entity%3C">http://example.com/url_with_entity%3C</a>
 </p>
 !! end

-- 
To view, visit https://gerrit.wikimedia.org/r/179185
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I5482782c25e12283030b0fd2150ac55092f7979b
Gerrit-PatchSet: 1
Gerrit-Project: mediawiki/core
Gerrit-Branch: master
Gerrit-Owner: Cscott <canan...@wikimedia.org>

_______________________________________________
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
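
To try the separator handling from the Parser.php hunk in isolation, the following is a minimal sketch and not the actual parser code. The function name splitTrailingSeparators and the default separator set are assumptions made for illustration; only the entity check is taken from the patch. It returns the trimmed URL together with the punctuation that was moved out of it.

```php
<?php
// Hypothetical helper, not part of MediaWiki: moves trailing punctuation
// out of a free URL, but keeps the ';' that terminates an HTML entity.
// The default $sep is an assumed separator set, chosen for illustration.
function splitTrailingSeparators( $url, $sep = ',;.:!?)' ) {
	// Length of the run of separator characters at the end of the URL.
	$numSepChars = strspn( strrev( $url ), $sep );

	// If that run starts with ';', check whether the text before it
	// looks like an unterminated entity (named, hex, or decimal).
	if ( $numSepChars && substr( $url, -$numSepChars, 1 ) === ';' ) {
		$chopped = substr( $url, 0, -$numSepChars );
		if ( preg_match( '/&([a-z]+|#x[\da-f]+|#\d+)$/i', $chopped ) ) {
			// Keep the semicolon as part of the URL.
			$numSepChars--;
		}
	}

	$trail = '';
	if ( $numSepChars ) {
		$trail = substr( $url, -$numSepChars );
		$url = substr( $url, 0, -$numSepChars );
	}
	return [ $url, $trail ];
}

// An ordinary trailing semicolon is still stripped...
list( $url, $trail ) = splitTrailingSeparators( 'http://example.com;' );
// $url === 'http://example.com', $trail === ';'

// ...but the semicolon that closes an entity stays with the URL.
list( $url, $trail ) = splitTrailingSeparators( 'http://example.com/a&nbsp;.' );
// $url === 'http://example.com/a&nbsp;', $trail === '.'
```

Because the check only runs when the first character of the trailing run is ';', URLs that end in plain punctuation follow the same path as before the patch.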