Both of these changes should be deployed to WMF wikis with 1.28.0-wmf.17;
see https://www.mediawiki.org/wiki/MediaWiki_1.28/Roadmap for the schedule.

== Alternative multi-value separator ==

With the merging of Gerrit change 305126,[1] if a value for a multi-valued
parameter must contain pipe characters (U+007C, "|"), it will now be
possible to use the Unit Separator character (U+001F) instead. As this
character is not otherwise valid input for any strings in MediaWiki, its
use here should not conflict with any valid input. To signal that you're
using this feature, the whole multi-value parameter must also be prefixed
with the Unit Separator character.

For example, you can now use meta=allmessages to parse the
"search-rewritten" message with the text "foo|bar" as $1 and "foo|baz" as
$2:
https://en.wikipedia.beta.wmflabs.org/w/api.php?action=query&meta=allmessages&ammessages=search-rewritten&amenableparser=1&amargs=%1Ffoo|bar%1Ffoo|baz
<https://en.wikipedia.beta.wmflabs.org/w/api.php?action=query&meta=allmessages&ammessages=search-rewritten&amenableparser=1&amargs=%1Ffoo%7Cbar%1Ffoo%7Cbaz>

Client libraries should consider updating to use this feature when asked to
send a multi-valued parameter with one or more values containing pipe
characters.
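
As a rough illustration of what such a client-side change might look like,
here is a minimal Python sketch. The function name join_multivalue and the
choice to reject values that already contain U+001F are assumptions made for
this example, not part of any existing library.

```python
from urllib.parse import urlencode

UNIT_SEPARATOR = "\x1f"  # U+001F


def join_multivalue(values):
    """Join values for a multi-valued API parameter (hypothetical helper).

    Uses the usual pipe separator unless some value contains a pipe, in
    which case the whole parameter is prefixed with, and separated by,
    the Unit Separator character.
    """
    values = list(values)
    if any(UNIT_SEPARATOR in v for v in values):
        # Assumption for this sketch: refuse rather than guess, since
        # U+001F is not valid inside a value.
        raise ValueError("values must not contain U+001F")
    if any("|" in v for v in values):
        return UNIT_SEPARATOR + UNIT_SEPARATOR.join(values)
    return "|".join(values)


params = {
    "action": "query",
    "meta": "allmessages",
    "ammessages": "search-rewritten",
    "amenableparser": "1",
    "amargs": join_multivalue(["foo|bar", "foo|baz"]),
}
# urlencode percent-encodes both U+001F (%1F) and the pipe (%7C).
query = urlencode(params)
```

Values without pipes still go out in the familiar "a|b" form, so requests
that never needed the new separator are unchanged.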


== Unicode normalization warnings ==

The API has always expected input as NFC-normalized Unicode represented as
UTF-8. Non-NFC-normalized UTF-8 input would be silently normalized, and in
the query string non-UTF-8 input would be interpreted as being in a
fallback encoding (such as Windows-1252) and converted to Unicode. This
sometimes led to subtle bugs when input was unexpectedly converted.

With the merging of Gerrit change 306491,[2] the API will now issue a
warning whenever input is subject to such conversion, so that clients can
more easily notice when what they sent was changed.
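
Clients that want to avoid the warning for the NFC case can normalize
before sending. The following is a minimal Python sketch using the standard
unicodedata module; the helper name to_nfc is an assumption for this
example, and it does not address the separate non-UTF-8 case.

```python
import unicodedata


def to_nfc(text):
    """Return (normalized_text, changed) for a string about to be sent.

    'changed' is True when the input was not already NFC-normalized,
    which is the situation in which the API would have to convert it.
    """
    normalized = unicodedata.normalize("NFC", text)
    return normalized, normalized != text


# "a" followed by U+030A COMBINING RING ABOVE composes to U+00E5.
title, changed = to_nfc("a\u030a")
```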

When this happens in the 'titles' parameter for ApiPageSet-using modules,
the response will also include the conversion of each title in the existing
'normalized' element. Since the API result cannot directly represent
non-normalized data, these entries will have the 'from' element
percent-encoded and a 'fromencoded' boolean will be included alongside to
indicate this. This normalization step is separate from the existing Title
normalization (uppercasing the first letter and replacing underscores with
spaces), so two entries may be generated in the 'normalized' element.

See
https://en.wikipedia.beta.wmflabs.org/w/api.php?action=query&titles=a%CC%8A&formatversion=2
for an example showing the new warning and normalization entries.
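
A client reading these entries might recover the originally submitted title
along the following lines. This is only a sketch: the sample entries below
are hand-written to match the description above (assuming formatversion=2,
where 'fromencoded' is a real boolean) rather than copied from a live
response, and the helper name original_title_bytes is hypothetical.

```python
from urllib.parse import unquote_to_bytes

# Hand-written sample of a 'normalized' list, NOT actual API output.
# First entry: Unicode normalization; second: Title normalization.
sample_normalized = [
    {"fromencoded": True, "from": "a%CC%8A", "to": "\u00e5"},
    {"fromencoded": False, "from": "\u00e5", "to": "\u00c5"},
]


def original_title_bytes(entry):
    """Return the raw bytes of the 'from' value of one entry.

    When 'fromencoded' is true, 'from' is percent-encoded and is decoded
    to bytes, since the original input may not have been valid UTF-8.
    Otherwise 'from' is ordinary text and is simply encoded as UTF-8.
    """
    if entry.get("fromencoded"):
        return unquote_to_bytes(entry["from"])
    return entry["from"].encode("utf-8")


raw_inputs = [original_title_bytes(e) for e in sample_normalized]
```

Returning bytes rather than a string is a deliberate choice in this sketch,
because the pre-conversion input is exactly the data that could not be
represented directly in the API result.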


 [1]: https://gerrit.wikimedia.org/r/#/c/305126/
 [2]: https://gerrit.wikimedia.org/r/#/c/306491/


-- 
Brad Jorsch (Anomie)
Senior Software Engineer
Wikimedia Foundation
_______________________________________________
Mediawiki-api-announce mailing list
mediawiki-api-annou...@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce
_______________________________________________
Mediawiki-api mailing list
Mediawiki-api@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-api
