On Tue, Dec 16, 2003 at 03:52:56PM +0100, Tajoli Zeno wrote:
> 1) When you query LOC without specifying a character set, you receive data
> in the MARC-8 character set.
> 
> 2) In the MARC-8 character set, a letter like "è" [e grave] is encoded as
> TWO bytes: one for the sign [the grave accent] and one for the letter [the
> letter e].
> 
> 3) In the leader, positions 0-4 hold the number of characters, NOT the
> number of bytes. In your record there are 901 characters and 903 bytes.
> 
> In fact, the "length" function of Perl returns the number of bytes. The best
> option, for now, is to use a charset where 1 character is always 1 byte, for
> example ISO 8859-1.

While this is certainly part of the answer, we still don't know why the 
record length is off. The way I see it, there are two possibilities: 

1. Net::Z3950 is doing on-the-fly conversion of MARC-8 to Latin1
2. LC's Z39.50 server is emitting the records that way, and not updating the 
   record length.

I guess one way to test which one is true would be to query another Z39.50 
server for the same record and see whether the same problem exists. If it
does, 1 is probably the cause; if it doesn't, 2 looks more likely.
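
For what it's worth, here is a rough sketch of the kind of mismatch either scenario would produce. It is in Python rather than Perl, purely for illustration, and it uses a toy record rather than real MARC: `toy_record` and `length_mismatch` are made-up helper names, and the only thing borrowed from MARC is the 5-digit length in positions 0-4. It assumes the MARC-8 grave accent is byte 0xE1 placed before the base letter.

```python
# Assumption: in MARC-8 the combining grave accent is byte 0xE1 and it
# precedes the base letter, so e-grave takes two bytes.
MARC8_E_GRAVE = b"\xe1e"
# In Latin-1 the same letter is a single precomposed byte.
LATIN1_E_GRAVE = "\u00e8".encode("latin-1")


def toy_record(body: bytes) -> bytes:
    """Hypothetical helper: prefix the body with a 5-digit total length,
    mimicking leader positions 0-4 (this is NOT a real MARC record)."""
    total = len(body) + 5
    return f"{total:05d}".encode("ascii") + body


def leader_length(record: bytes) -> int:
    """Read the length declared in positions 0-4."""
    return int(record[0:5].decode("ascii"))


def length_mismatch(record: bytes) -> int:
    """Actual byte count minus the declared length (0 means consistent)."""
    return len(record) - leader_length(record)


# A record as it might be stored in MARC-8.
marc8 = toy_record(b"caff" + MARC8_E_GRAVE)

# Simulate a conversion that rewrites the body to Latin-1 but leaves the
# declared length untouched, as in either scenario above.
converted = marc8.replace(MARC8_E_GRAVE, LATIN1_E_GRAVE)

print(length_mismatch(marc8))
print(length_mismatch(converted))
```

A nonzero result only says that something rewrote the bytes without updating the length; it does not say whether the client or the server did it, which is why comparing against a second server is still needed. The sign would be reversed if the conversion made the body longer instead of shorter.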

//Ed
