On Fri, Jan 17, 2003 at 01:25:59PM -0000, J Dinesh wrote:
> I am developing xml2xml conversion tool.
> The XML document contains utf-8, symbol font and dingbats font character
> value.
> I need to convert UTF-8, symbol font and dingbats font to entity.

I did something similar for converting UTF-8 into HTML entities. I
decided to do it the "proper perl 5.8" way: make it into a plugin for
the wonderful Encode module.
What I did was:

- Write a simple script which reads the HTML entity definitions and
  writes a .ucm file. See "perldoc enc2xs" for the format.
- Use enc2xs to turn it into a module.
- Just use Encode in any way you want to do the encoding. My use case
  was also the reason that the fallback mechanisms to decimal (&#NNN;)
  and hexadecimal (&#xHHHH;) notation got added to Encode.
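
A minimal sketch of the fallback mechanisms mentioned in the last step,
assuming only the stock Encode module and not the attached plugin. The
sample string and variable names are made up for illustration; the
FB_HTMLCREF, FB_XMLCREF and LEAVE_SRC constants are part of Encode.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Illustrative input: U+20AC EURO SIGN cannot be represented in ASCII,
# so encode() has to fall back to a character reference.
my $text = "price: \x{20ac}5";

# FB_HTMLCREF substitutes a decimal reference for each unmappable
# character. LEAVE_SRC stops encode() from modifying $text in place.
my $html = encode("ascii", $text,
                  Encode::FB_HTMLCREF() | Encode::LEAVE_SRC());

# FB_XMLCREF does the same with a hexadecimal reference.
my $xml = encode("ascii", $text,
                 Encode::FB_XMLCREF() | Encode::LEAVE_SRC());

print "$html\n";    # price: &#8364;5
print "$xml\n";
```

With a plugin encoding built by enc2xs, the encoding name passed to
encode() would be the one declared in the .ucm file rather than "ascii".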

The whole thing is small enough that I'll just attach it. It contains
the HTML entity files and my script to parse them (ent2ucm). I did do
some manual editing of the result.
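
For readers without the attachment, here is a rough sketch of what the
conversion step might look like. It is not the attached ent2ucm script.
The helper name entity_to_ucm_line is hypothetical, and the input is
assumed to be an entity declaration in the style of the HTML 4 DTD
entity files. It emits one CHARMAP line mapping a code point to the
bytes of the corresponding "&name;" sequence.

```perl
use strict;
use warnings;

# Hypothetical helper: convert one HTML 4 style entity declaration into
# a single .ucm CHARMAP line. Returns nothing if the line does not match.
sub entity_to_ucm_line {
    my ($decl) = @_;

    # Assumed input shape:
    #   <!ENTITY eacute CDATA "&#233;" -- latin small letter e with acute -->
    my ($name, $code) = $decl =~ /<!ENTITY\s+(\w+)\s+CDATA\s+"&#(\d+);"/
        or return;

    # Spell out the bytes of "&name;" in the \xHH form used by .ucm files.
    my $bytes = join '', map { sprintf '\\x%02X', ord } split //, "&$name;";

    # "|0" marks a round-trip mapping; the entity name is kept as a comment.
    return sprintf '<U%04X> %s |0 # %s', $code, $bytes, $name;
}

my $line = entity_to_ucm_line(
    '<!ENTITY eacute CDATA "&#233;" -- latin small letter e with acute -->'
);
print "$line\n";
```

A complete .ucm file also needs the header lines and the CHARMAP /
END CHARMAP wrapper described in "perldoc enc2xs".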

-- 
Bart.

Attachment: Encode-HTMLEntities-0.01.tar.gz
Description: Binary data
