On Tue, Dec 16, 2003 at 03:52:56PM +0100, Tajoli Zeno <[EMAIL PROTECTED]> wrote:
>
> 2) In MARC-8 character set a letter like "è" [e grave] is done with TWO
> bytes one for the sign [the grave accent] and one for the letter [the
> letter e].
>
> 3)In the leader, position 0-4 you have the number of character, NOT the
> number of bytes. In your record there are 901 characters and 903 bytes.
>

No, it should be the number of bytes (LOC has clarified this in their spec by saying "number of octets"). It has always been the length in bytes.

In the example it looks like the non-spacing diacritic has been converted to two bytes, which sounds almost as if it was assumed to be Latin-1 and got incorrectly MARC-8'd somewhere along the line. But the weird characters may result from the screen display; you need to ascertain what the actual values are there. Dumping the record in hex may reveal something.

Cheers
Colin
--
Colin Campbell
Technical Services Consultant
Sirsi Ltd
[EMAIL PROTECTED]
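
For anyone wanting to try the hex-dump check, here is a minimal Python sketch. It assumes the raw record is already in memory as bytes. The function names (`declared_length`, `hex_dump`) and the nine-byte `sample` are made up for illustration: `sample` is not a valid MARC record, only a five-digit length field, a MARC-8 "è" (grave accent 0xE1 followed by the letter e) and the two terminator bytes.

```python
def declared_length(record):
    """Return the record length stated in leader positions 0-4."""
    return int(record[0:5].decode("ascii"))


def hex_dump(data, width=16):
    """Format bytes as offset, hex values, and printable ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join("%02x" % b for b in chunk)
        # Non-printable bytes are shown as "." so the terminal cannot mangle them.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append("%06x  %s %s" % (offset, hex_part.ljust(width * 3), text_part))
    return "\n".join(lines)


# Hypothetical toy data, NOT a real record: length field, MARC-8 e-grave
# (0xE1 combining grave + "e"), field terminator 0x1E, record terminator 0x1D.
sample = b"00009" + b"\xe1e" + b"\x1e\x1d"

if __name__ == "__main__":
    print("declared:", declared_length(sample), "actual bytes:", len(sample))
    print(hex_dump(sample))
```

If the declared length and `len()` of the raw bytes disagree, the leader was probably computed from a character count or the data was re-encoded after the leader was written. The dump shows the actual byte values regardless of how the screen renders them.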