2017-11-13 21:48 GMT+01:00 James Kass <jameskass...@gmail.com>:

> Peter Constable wrote,
>
> >> May be this test page ?
> >>
> >> http://www.i18nguy.com/unicode/supplementary-test.html
> >
> > Thanks. I’d need to know _at least something_ about what the characters
> > signify, though, to have a sense of whether there’s anything potentially
> > offensive.
>
> The Plane 2 characters on that page appear to be random.
>

That's probable, but the authors claim these are common characters. It's
possible they collected statistics from some corpus to find some of the
most widely used characters in Plane 2, without needing to understand what
they would mean when put side by side. (I had already noted that there is
no punctuation at all and that the collection shown is too long for a
typical Chinese text; I would in fact expect some CJK punctuation to be
present.)
Maybe we could compile a list of Chinese toponyms that use these characters,
select those containing more than one Plane 2 character, then separate the
names with CJK commas and end with a CJK full stop.
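
As a rough sketch of that selection step (in Python, and only under some
assumptions of mine): I take "Plane 2" to mean code points U+20000..U+2FFFF,
"CJK comma" to mean the ideographic comma U+3001, and "CJK full stop" to mean
U+3002. The function names are hypothetical, and the sample strings are
arbitrary placeholders, not real toponyms.

```python
def plane2_count(text):
    """Count the code points of `text` that lie in Plane 2 (U+20000..U+2FFFF)."""
    return sum(1 for ch in text if 0x20000 <= ord(ch) <= 0x2FFFF)


def build_test_sentence(toponyms, min_plane2=2):
    """Keep names with at least `min_plane2` Plane 2 characters, then join them.

    Names are separated by U+3001 (ideographic comma) and the result ends
    with U+3002 (ideographic full stop). Both choices are assumptions.
    """
    selected = [name for name in toponyms if plane2_count(name) >= min_plane2]
    if not selected:
        return ""
    return "\u3001".join(selected) + "\u3002"


# Placeholder input only: these strings are NOT real place names.
sample = [
    "\U00020000\U00020001",        # two Plane 2 characters: kept
    "\u5317\u4eac",                # BMP only: dropped
    "\U0002A6D6\u5c71",            # one Plane 2 character: dropped
    "\U00020002\u6c34\U00020003",  # two Plane 2 characters: kept
]

print(build_test_sentence(sample))
```

The real list would of course have to come from actual data, and the
threshold could be lowered to 1 if too few names qualify.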

A Wikidata or OSM data search could be used to compile such a list. I
think these toponyms are more likely to be found in Cantonese or
Taiwanese-related sources, using the zh-Hant variant. Note, however, that
Wikidata does not distinguish zh-Hans from zh-Hant, because Wikimedia wikis
use a transliterator. I doubt this transliterator transforms Plane 2
characters, though: they should remain unchanged, since most of them are
kept for both traditional and simplified use.
