On Fri, Jan 17, 2003 at 01:25:59PM -0000, J Dinesh wrote:
> I am developing xml2xml conversion tool.
> The XML document contains utf-8, symbol font and dingbats font character
> value.
> I need to convert UTF-8, symbol font and dingbats font to entity.
I did something similar for converting UTF-8 into HTML entities. I decided
to do it the "proper perl 5.8" way: make it into a plugin for the wonderful
Encode module. What I did was:

- Write a simple script which reads the HTML entity definitions and writes
  a .ucm file. See "perldoc enc2xs" for the format.
- Use enc2xs to turn it into a module.
- Just use Encode in any way you want to do the encoding.

My use case was also the reason that the fallback mechanisms to numeric
character reference notation (&#NNNN; and &#xHHHH;) got added to Encode.

The whole thing is small enough that I'll just attach it. It contains the
HTML entity files and my script to parse them (ent2ucm). I did do some
manual editing of the result.

-- Bart.
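For readers without the attachment, here is a rough sketch of the two ideas above. It is not the attached ent2ucm script. The helper names (entity_to_ucm_line, to_html_cref, to_xml_cref) are made up for illustration. I am also assuming that each entity is mapped to the raw bytes of its "&name;" form in the CHARMAP section; the real module may do it differently. A complete .ucm file additionally needs the header fields and the CHARMAP / END CHARMAP markers described in "perldoc enc2xs". The last two functions only use the fallback constants that ship with Encode.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper (NOT the attached ent2ucm): turn one entity
# declaration such as
#   <!ENTITY nbsp CDATA "&#160;" -- no-break space -->
# into a .ucm CHARMAP line that maps the code point to the bytes of "&nbsp;".
# Assumption: entities are stored as a multi-byte sequence per character.
sub entity_to_ucm_line {
    my ($dtd_line) = @_;
    my ($name, $code) =
        $dtd_line =~ /<!ENTITY\s+(\w+)\s+(?:CDATA\s+)?"&#(\d+);"/
        or return;
    # Spell out every byte of "&name;" in \xHH notation.
    my $bytes = join '',
        map { sprintf('\x%02X', ord($_)) } split //, "&$name;";
    return sprintf('<U%04X> %s |0 # %s', $code, $bytes, $name);
}

# Built-in fallback: characters missing from the target encoding
# become decimal references (&#NNNN;).
sub to_html_cref {
    my ($text) = @_;    # work on a copy; encode() may modify its input
    return encode('ascii', $text, Encode::FB_HTMLCREF);
}

# Built-in fallback: same idea with hexadecimal references (&#xHHHH;).
sub to_xml_cref {
    my ($text) = @_;
    return encode('ascii', $text, Encode::FB_XMLCREF);
}

print entity_to_ucm_line(
    '<!ENTITY nbsp CDATA "&#160;" -- no-break space -->'), "\n";
print to_html_cref("caf\x{e9} \x{20ac}"), "\n";
print to_xml_cref("caf\x{e9} \x{20ac}"), "\n";
```

The first print should give a line of the form "<U00A0> \x26\x6E\x62\x73\x70\x3B |0 # nbsp", and the second "caf&#233; &#8364;". With a module generated by enc2xs you would pass its encoding name to encode() in place of 'ascii', so that named entities are used where they exist and the numeric fallback covers the rest.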
Encode-HTMLEntities-0.01.tar.gz
Description: Binary data