On Tue, Dec 16, 2003 at 03:52:56PM +0100, Tajoli Zeno wrote:
> 1) When you call LOC without a specific character set you receive data in
> the MARC-8 character set.
>
> 2) In the MARC-8 character set a letter like "è" [e grave] is encoded as TWO
> bytes, one for the sign [the grave accent] and one for the letter [the
> letter e].
>
> 3) In the leader, positions 0-4, you have the number of characters, NOT the
> number of bytes. In your record there are 901 characters and 903 bytes.
>
> In fact the "length" function of Perl reads the number of bytes. The best
> option, now, is to use a charset where 1 character is always 1 byte, for
> example ISO 8859_1

While this is certainly part of the answer, we still don't know why the
record length is off. The way I see it, there are two possible explanations:

1. Net::Z3950 is doing on-the-fly conversion of MARC-8 to Latin1.
2. LC's Z39.50 server is emitting the records that way, and not updating
   the record length.

I guess one way to test which one is true would be to query another Z39.50
server for the same record and see if the same problem exists, in which
case 1 is probably the cause.

//Ed
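
P.S. Here is a rough sketch of the comparison I have in mind, written in Python only to keep it short. The function name `length_report` and the sample bytes are made up for illustration and are not part of Net::Z3950 or any MARC library. It assumes you already have the raw record as bytes, reads the declared length from leader positions 0-4, and compares it with the number of bytes actually received. It also counts bytes with the high bit set, on the assumption that each one is a MARC-8 combining mark like the grave accent above.

```python
def length_report(raw: bytes) -> dict:
    """Compare the length declared in the leader with the bytes received.

    Hypothetical helper: 'raw' is assumed to be one complete record,
    exactly as it came off the wire.
    """
    # Leader positions 0-4 hold the record length as five ASCII digits.
    declared = int(raw[:5].decode("ascii"))
    actual = len(raw)
    return {
        "declared": declared,
        "actual_bytes": actual,
        "difference": actual - declared,
        # Bytes >= 0x80 are outside plain ASCII; assumed here to be
        # combining diacritics, one extra byte per accented letter.
        "high_bit_bytes": sum(1 for b in raw if b >= 0x80),
    }


if __name__ == "__main__":
    # Made-up stand-in for a record, NOT real LC data: the leader claims
    # 12, and the body holds one accent byte (0xE1) followed by 'e'.
    sample = b"00012" + b"abc\xe1e" + b"\x1e\x1d"
    print(length_report(sample))
```

Running this on the same record fetched from two different servers should show whether the difference follows the client or the server. If the difference equals the count of high-bit bytes, that would fit the characters-versus-bytes explanation quoted above, though it would not prove it.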