On 8/19/2012 4:05 PM, Manuel Strehl wrote:
Hello,
I'm looking for a data source that maps countries to the scripts used in
them. The target application is a visualization in the context of my
codepoints.net site, namely http://codepoints.net/scripts.
At the moment I've extracted the preferred scripts from CLDR (e.g., Cyrl
for Russia, Latn for Germany and so on). Then I added some historic
scripts by looking at the corresponding Wikipedia articles and did some
manual updating. However, the result is not really satisfactory.
For example, Russia has only Cyrl associated with it, while, as far as I can
tell, at least Latn and Arab should also be listed, and perhaps some
historic scripts as well.
I'd appreciate any pointers to data sets that would help me complete and
error-proof this mapping.
Cheers,
Manuel
Heck, my utility bill in the US has Thai and Chinese characters (for the
fine print, not the statement itself). There's one more script, which could
be Cyrillic; I don't have a bill in front of me right now. In some areas of town
you'll find a mixture of scripts on shop signs as well.
The point is, it's easy to identify a majority script, but to get an
accurate handle on "other" scripts is going to be tricky, if not
impossible. And it all depends on your arbitrary decision of what other
scripts to include and on what basis.
A./