Hi Brenda,

On Tue, July 28, 2009 12:22 pm, Brenda Wallace wrote:
> I don't understand what ascii representations are needed here.
> Can you give an example of how some utf8 string tags would be
> represented in ascii?

In a discussion on IRC, the city of Aarhus was mentioned as an example.
While it is sad for some people not to be able to use „Århus“ as a
hashtag, „Aarhus“ allows foreigners (visitors, for example) to find the
tag. The German party „Die Grünen“ is often hashtagged as „#GRUENE“, but
if we supported umlauts we would see „#GRÜNE“ as well. The same applies
to nearly every Austrian party („FPÖ“, „SPÖ“) … Hashtags in different
writing systems are a whole further group of problems.

Generally speaking, Unicode hashtags will greatly increase the number of
different versions of the same tag. As a writer of German notices, I
really understand the desire for umlaut (and hence other Unicode
character) support in hashtags, but by being really old-fashioned and
not supporting Unicode in hashtags, we ensure that they can do what they
are meant to do: collect notices.

I think the links in [1] could give an idea of a possible solution. PHP
supports the Internationalization extension [2] (a wrapper for
International Components for Unicode), whose Normalizer class looks
quite good. Maybe we could add this as an optional dependency?
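
As a rough illustration of the idea (not Laconica code), folding tags to
ASCII could look something like the Python sketch below. It uses
Python's standard unicodedata module as a stand-in for the PHP
Normalizer class; the function name canonical_tag and the
transliteration table are hypothetical, and the table assumes
Danish/German spelling conventions only. Normalization alone would turn
„Århus“ into „Arhus“ and „GRÜNE“ into „GRUNE“, so the conventional
spellings need an explicit mapping applied first.

```python
import unicodedata

# Hypothetical, language-specific replacements (Danish/German conventions).
# Other languages would need different rules, e.g. Swedish usually maps å to a.
TRANSLITERATIONS = {
    "å": "aa", "Å": "Aa",
    "ä": "ae", "Ä": "Ae",
    "ö": "oe", "Ö": "Oe",
    "ü": "ue", "Ü": "Ue",
    "ß": "ss",
}


def canonical_tag(tag: str) -> str:
    """Fold a hashtag to a lowercase ASCII form usable for grouping notices."""
    # Compose first, so that "U" + combining diaeresis matches the table too.
    composed = unicodedata.normalize("NFC", tag)
    # Apply the conventional spellings before any accents are stripped.
    replaced = "".join(TRANSLITERATIONS.get(ch, ch) for ch in composed)
    # Decompose what is left and drop everything that is not ASCII
    # (combining marks, and any character without an ASCII base letter).
    decomposed = unicodedata.normalize("NFKD", replaced)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return ascii_only.lower()


if __name__ == "__main__":
    for tag in ("Århus", "Aarhus", "GRÜNE", "GRUENE", "FPÖ"):
        print(tag, "->", canonical_tag(tag))
```

With this, „Århus“ and „Aarhus“ both end up as "aarhus", and „GRÜNE“ and
„GRUENE“ both end up as "gruene", so the notices collect under one tag
while the notice text itself can keep the original spelling.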

Regards,
Adrian Lang / Codeispoetry

[1] http://teddziuba.com/2009/07/this-is-america-take-your-unic.html
[2] http://docs.php.net/manual/en/book.intl.php

_______________________________________________
Laconica-dev mailing list
[email protected]
http://mail.laconi.ca/mailman/listinfo/laconica-dev
