Hi -
I have some questions about converting MARC8 records, exported from a Horizon
system, to both UTF8 and XML format for import into Zebra. I am having trouble
converting records from our statewide union catalog that contain characters
from the alternate graphic character sets. This affects most of our CJK,
Russian, Sanskrit, and Persian records, as well as records with subscripts or
superscripts in their titles. The MARC8 records from Horizon contain embedded
ALA hex characters, which do not seem to convert to the appropriate UTF8 or
XML characters. Many of these records do not contain a tag 066 or tag 880
that defines the alternate graphic sets.

For those of you whose catalogs contain titles in many languages and who have
gone through a conversion from MARC8 to UTF8 or XML: can you suggest conversion
utilities that can handle multiscript MARC8 conversion? Ideally, I would like a
utility that is UNIX/Linux based, preferably written in Perl. I have been using
MARC::Record/MARC::Charset to iterate through my exported Horizon MARC8 file,
but I get mapping errors at the embedded ALA hex codes, indicating that the
ALA code cannot be mapped to UTF8. It is easy to identify these records, but I
would like to convert them. Without the information from tags 066/880, can
these records be converted at all?

Any help with this problem will be greatly appreciated!

Elizabeth J. Forney
SILO Systems Support Specialist
[EMAIL PROTECTED]
Iowa State University Library
Room 034
Ames, IA  50011
(515)294-2955
FAX:  (515)294-5525