> where 'transliteratorX' is the name of a transliteration class to use
> (i.e. Unicode::Transliterate::ISO_8859_15::ASCII for the ISO_8859_15
> to ASCII transliteration table).
>
Your main example was accent-stripping. That operation isn't really about encodings: it is a general operation that simply happens to yield an ASCII subset for certain input. Maybe you want to define your module as something both more specific and more inclusive: a Unicode-to-human-readable-URI transliterator. That would be a non-trivial module to implement in full! Yet something pretty useful could be done fairly simply using nothing more than regexes.

Script-to-script transliterators (Japanese->Latin, for example) would be useful, but encoding-to-encoding transliterators are not really so useful. There are too many dimensions to the problem. Fallback characters or routines are probably the best design for generating useful output when mismatched encodings are being cross-converted.

Here's an interesting article on transliteration in general and ICU's implementation in particular:

http://oss.software.ibm.com/icu/userguide/Transliteration.html

Transliteration itself pre-dates computers by centuries. It is a fascinating topic for anyone interested in linguistics.

=Ed
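
P.S. To make the "nothing more than regexes" remark concrete, here is a rough sketch of an accent-stripping URI transliterator with a fallback character. It is written in Python only to keep it short. The names strip_accents, to_uri_slug and FALLBACKS are made up for illustration, the fallback table is a tiny hypothetical sample, and the regex covers only the basic combining-marks block, so treat it as a starting point rather than a finished design.

```python
import re
import unicodedata

# Hypothetical sample table for letters that have no canonical
# decomposition; a real module would need a much larger one.
FALLBACKS = {"ß": "ss", "æ": "ae", "Æ": "AE", "ø": "o", "Ø": "O"}

# Assumption: only the Combining Diacritical Marks block (U+0300-U+036F)
# is stripped. Other scripts use combining marks outside this range.
COMBINING_MARKS = re.compile("[\u0300-\u036f]")


def strip_accents(text):
    """Decompose to NFD, then delete the combining marks with a regex."""
    decomposed = unicodedata.normalize("NFD", text)
    return COMBINING_MARKS.sub("", decomposed)


def to_uri_slug(text, fallback_char="-"):
    """Turn arbitrary text into a lowercase ASCII slug for use in a URI."""
    # Apply the table first, so "ß" becomes "ss" before anything else.
    text = "".join(FALLBACKS.get(ch, ch) for ch in text)
    text = strip_accents(text).lower()
    # Whatever is still outside [a-z0-9] collapses to the fallback character.
    text = re.sub(r"[^a-z0-9]+", fallback_char, text)
    return text.strip(fallback_char)
```

For example, to_uri_slug("Crème brûlée à la carte") gives "creme-brulee-a-la-carte". Note that this works on characters, not on encodings: the input is assumed to have been decoded to Unicode already.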