Michael,

In a MARC UCS/Unicode UTF-8 environment, the Esc (0x1B) character doesn't
serve any purpose.

Correct re. the Esc character. The presence of an Esc is probably a good
indication that the record is in MARC-8.

So, I'm wondering if, for MARC record testing, it would make sense to
tighten up the ASCII part of the regexp a bit to this:

        [\x1D-\x7E]

That would almost certainly do. I don't think I've ever seen a newline or
a tab in a MARC record. However, knowing the amount of c**p we do get in
records, it wouldn't surprise me if one did appear somewhere.
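
For what it's worth, here is a rough sketch in Python of how the two checks
discussed above might fit together. Only the ASCII class [\x1D-\x7E] comes
from this thread; the multi-byte alternatives are an assumption, borrowed
from the commonly circulated UTF-8 validation pattern, and the function names
are made up for illustration. The Esc test is only a heuristic, as noted
above.

```python
import re

# ASCII class as proposed in the thread: 0x1D-0x7E keeps the MARC record
# terminator (0x1D), field terminator (0x1E) and subfield delimiter (0x1F),
# and excludes Esc (0x1B), tab, newline and carriage return.
# ASSUMPTION: the multi-byte branches below follow the usual well-formed
# UTF-8 byte ranges; they are not quoted from the original regexp.
MARC_UTF8 = re.compile(
    rb"""\A(?:
        [\x1D-\x7E]                          # tightened ASCII range
      | [\xC2-\xDF][\x80-\xBF]               # 2-byte sequences
      | \xE0[\xA0-\xBF][\x80-\xBF]           # 3-byte, no overlongs
      | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}    # 3-byte, general
      | \xED[\x80-\x9F][\x80-\xBF]           # 3-byte, no surrogates
      | \xF0[\x90-\xBF][\x80-\xBF]{2}        # 4-byte, planes 1-3
      | [\xF1-\xF3][\x80-\xBF]{3}            # 4-byte, planes 4-15
      | \xF4[\x80-\x8F][\x80-\xBF]{2}        # 4-byte, plane 16
    )*\Z""",
    re.VERBOSE,
)


def looks_like_marc8(record: bytes) -> bool:
    """Heuristic only: an Esc byte suggests MARC-8 escape sequences."""
    return b"\x1b" in record


def is_marc_utf8(record: bytes) -> bool:
    """True if every byte sequence fits the tightened pattern above."""
    return MARC_UTF8.match(record) is not None
```

With this pattern a record containing a stray tab or newline is rejected
rather than silently accepted, so it may be worth logging such records
instead of discarding them.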

Ashley.
--
Ashley Sanders               [EMAIL PROTECTED]
Copac http://copac.ac.uk A MIMAS Service funded by JISC
