If your software can be changed to not add 0x1a to the end of a MARC file
then by all means do so.  (I don't believe you'll find a specific
prohibition against having it there in the standard, but neither will you
find language anticipating or allowing it.  I'm ignoring padding issues for
records on magnetic tape.)  Even on a DOS/Windows system, adding the 0x1a
to the end of a non-text file, which MARC surely qualifies as, is wrong.

That being said, the code in _next() you cite is there to make up for
various known data creation problems in the MARC universe.  I see no harm
in making \x1a one of the characters treated as ignorable inter-record
fill.  Make sure that there's a test case for it as well.  (BTW, when
modifying existing code, use the same conventions, in this case "\x1a"
rather than "\x1A".)
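
As a rough illustration of the kind of test case meant above, here is a
minimal, self-contained sketch.  It does not use MARC::File::USMARC itself:
strip_inter_record_fill() is a hypothetical helper that only mirrors the
substitution under discussion, and the "record" is a made-up placeholder
ending in the MARC record terminator (\x1d), not valid MARC data.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical helper (not part of MARC::File::USMARC): strips leading
# filler bytes (space, NUL, LF, CR, and the DOS EOF byte 0x1a).
sub strip_inter_record_fill {
    my ($chunk) = @_;
    $chunk =~ s/^[ \x00\x0a\x0d\x1a]+//;
    return $chunk;
}

# Placeholder standing in for a record; only the trailing \x1d
# (record terminator) matters for this sketch.
my $record = "LEADER-AND-FIELDS\x1d";

# Simulate a file whose writer appended a DOS EOF byte at the end.
my $data = $record . "\x1a";

# Split after each record terminator, then drop chunks that are
# nothing but filler once the leading fill has been stripped.
my @chunks  = split /(?<=\x1d)/, $data;
my @records = grep { length } map { strip_inter_record_fill($_) } @chunks;

is( scalar @records, 1,       'trailing \x1a does not yield an extra record' );
is( $records[0],     $record, 'the record itself is left untouched' );

done_testing();
```

A real test for the module would instead read a small MARC file with a
trailing 0x1a through the module's normal reading interface and check that
the expected number of records comes back without an error.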

Mike O'Regan



                                                                                       
                                           
Bryan Baldus <[EMAIL PROTECTED]-books.com>
To:       [EMAIL PROTECTED]
cc:
Subject:  DOS EOF character in MARC files
09/10/2004 12:34 PM




Our MARC loading and extracting software (internally developed for storage
in a SQL database) adds a DOS end-of-file character (hex 1A) at the end of
each file of (extracted) MARC records. Are there other systems that add this
character? The presence or absence of the character does not seem to cause
problems for our cataloging software (TLC's ITS for Windows and older
Bibliofile (DOS-based) software), but it does cause
MARC::Record/MARC::File::USMARC's _next method to fail. As a workaround, I
added the following to my local versions of MARC::File::USMARC:

#...

sub _next {
#...
    # remove illegal garbage that sometimes occurs between records
    # Added hex 1A in _next method, for files with that DOS EOF character
    # at the end of the MARC data.
    $usmarc =~ s/^[ \x00\x0a\x0d\x1A]+//;
#...
}

Could the \x1A removal be added to the official MARC::File::USMARC version?
Does having hex 1A at the end of MARC files pose problems for other
software? Does not having hex 1A at the end of files cause problems for any
software? Should I recommend that our software be revised not to add this
character?

Thank you for your assistance,

Bryan Baldus
Cataloger
Quality Books, Inc.
[EMAIL PROTECTED]


