Thanks Brian,

Thanks for the link to Php72ToUpper.php!
I think I understand it now: for example, the first line says 'ƀ' => 'ƀ',
which should mean that this letter is not converted to uppercase by
MediaWiki?
That's one of the letters I found that wasn't converted to uppercase and
that was generating a false positive in my code, so it's because specific
MediaWiki code is preventing the conversion :-)
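
To check my understanding, here is a minimal sketch in Python of how I read
it, not MediaWiki's actual code. The names OVERRIDES and ucfirst_title are
made up for illustration. Only the 'ƀ' entry comes from the file as quoted
above; the 'ɽ' entry is my assumption, based on the frwiki behaviour
described in my first message below.

```python
# Hypothetical override table: characters that are mapped to themselves, so
# the first-letter uppercasing leaves them unchanged.
# Only the 'ƀ' entry is quoted in this thread; 'ɽ' is an assumption.
OVERRIDES = {
    "\u0180": "\u0180",  # ƀ stays ƀ
    "\u027d": "\u027d",  # ɽ stays ɽ (assumed, not checked against the file)
}


def ucfirst_title(title: str) -> str:
    """Uppercase the first character of a title unless an override applies."""
    if not title:
        return title
    first, rest = title[0], title[1:]
    # An override wins over the runtime's own Unicode uppercase mapping.
    return OVERRIDES.get(first, first.upper()) + rest
```

If that is right, WPCleaner should treat two titles as the same page only
when they are equal after this kind of normalization, using the same table
as the wiki.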

Nico

On Sun, Aug 4, 2019 at 1:32 AM bawolff <bawolff...@gmail.com> wrote:

> MediaWiki uses PHP's mb_strtoupper.
>
> I believe this uses the normal Unicode uppercase mapping. However, this
> can vary depending on the version of Unicode. We are currently in the
> process of switching to PHP 7, but for the moment we are still using HHVM's
> uppercasing code. There's a list of differences between HHVM and PHP 7.2
> uppercasing at
>
> https://github.com/wikimedia/operations-mediawiki-config/blob/master/wmf-config/Php72ToUpper.php
> [All this is probably subject to change]
>
> However, I am at a loss as to why HHVM & PHP < 5.6 [1] wouldn't map that
> character, since the ɽ -> Ɽ mapping has been present since Unicode 5
> (2006). I guess it was using really old Unicode data or something.
>
> See also bug T219279 [2]
>
> --
> Brian
>
> [1] https://3v4l.org/GHt3b
> [2] https://phabricator.wikimedia.org/T219279
>
> On Sat, Aug 3, 2019 at 7:57 AM Nicolas Vervelle <nverve...@gmail.com>
> wrote:
>
> > Hello,
> >
> > On most wikis, MediaWiki is configured to convert the first letter of a
> > title to uppercase, but apparently it does not convert every Unicode
> > character: for example, on frwiki ɽ
> > <https://fr.wikipedia.org/w/index.php?title=%C9%BD&redirect=no> is a
> > different article than Ɽ <https://fr.wikipedia.org/wiki/%E2%B1%A4>, even
> > though the second character is the uppercase version of the first one in
> > Unicode.
> >
> > So, which characters are actually converted to uppercase by title
> > normalization?
> >
> > I need to know this information to stop reporting some false positives in
> > WPCleaner <https://fr.wikipedia.org/wiki/Wikip%C3%A9dia:WPCleaner>.
> >
> > Thanks, Nico
> > _______________________________________________
> > Wikitech-l mailing list
> > Wikitech-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
