file size estimation

2006-09-15 Thread Zhao Peng

Hi,

Suppose an EBCDIC file on a tape from an IBM mainframe is read onto a Linux
server, and the file has 100 records, each 13054 bytes long. Is it correct
to estimate that the file on the Linux server would be 1,305,400 bytes? Is
block size information also needed to calculate the size?


Please correct me if I have used these terms incorrectly. Hopefully
this question is not too OT.


Thanks,
Zhao
___
gnhlug-discuss mailing list
gnhlug-discuss@mail.gnhlug.org
http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss


Re: file size estimation

2006-09-15 Thread Bill Ricker

> Suppose an EBCDIC file on a tape from an IBM mainframe is read onto a Linux
> server, and the file has 100 records, each 13054 bytes long. Is it correct
> to estimate that the file on the Linux server would be 1,305,400 bytes?


Maybe.

[Last time I did this, I out-sourced it to a boutique conversion shop
in Cambridge ... he had a VMS system with one of every tape drive
known to man, and set a custom conversion table since the tapes I had
were Mutant International EBCDIC from NLM. Sorry, I don't have name
handy, this was 10 years ago.]


> Is block size information also needed to
> calculate the size?


Probably not for the size calculation, although it will probably be
needed to read the tape, depending on the utility used. E.g., dd(1)
has to be told the block size (bs=) and the lrecl (cbs=).
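
A minimal sketch of why block size drops out of the arithmetic. It
assumes fixed-length blocked records (RECFM=FB) with no record or block
descriptor words, which the original question does not state. The
function names are hypothetical, chosen only for this illustration.

```python
def estimate_size(num_records, lrecl):
    """Bytes on disk for fixed-length records copied without conversion."""
    return num_records * lrecl


def blocks_on_tape(num_records, lrecl, blksize):
    """Number of tape blocks, assuming blksize is a whole multiple of lrecl."""
    if blksize % lrecl != 0:
        raise ValueError("blksize must be a multiple of lrecl for RECFM=FB")
    records_per_block = blksize // lrecl
    # Ceiling division: the last block may be short.
    return -(-num_records // records_per_block)


# The figures from the question: 100 records of 13054 bytes.
size = estimate_size(100, 13054)

# Two possible blocking factors change the block count, not the byte count.
one_per_block = blocks_on_tape(100, 13054, 13054)
two_per_block = blocks_on_tape(100, 13054, 2 * 13054)
```

The blocking factor only changes how many physical blocks the reader
has to fetch; the payload is the same number of bytes either way.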


> Please correct me if I have used these terms incorrectly. Hopefully
> this question is not too OT.


Terminology seems correct.

Normally, if doing an EBCDIC=>ASCII conversion for use on Unix later, I
would also do an LRECL=>NL conversion, i.e. append a newline to each
record. This would add 100 bytes (one per record) to the size computed
in your example, for 1,305,500 bytes. If you really only plan to read
the file with sysread(2) in LRECL-sized chunks, you don't need to do
this, but to view it with more(1) or anything else, it's highly
desirable, even though the lrecl is rather long by Unix standards and
will crush any old fixed 1000-byte buffers.
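
A rough sketch of that conversion, to show where the extra 100 bytes
come from. It assumes the data is plain code page 037 EBCDIC (Python's
"cp037" codec); a tape in some other EBCDIC variant would need a
different or custom table. The function name is hypothetical.

```python
def ebcdic_records_to_lines(raw, lrecl, codec="cp037"):
    """Convert fixed-length EBCDIC records to ASCII, one newline per record."""
    if len(raw) % lrecl != 0:
        raise ValueError("input is not a whole number of records")
    out = bytearray()
    for offset in range(0, len(raw), lrecl):
        record = raw[offset:offset + lrecl]
        text = record.decode(codec)
        # Characters with no ASCII equivalent become "?", so each
        # input byte still produces exactly one output byte.
        out += text.encode("ascii", errors="replace")
        out += b"\n"
    return bytes(out)


# Small example: two 5-byte records.
small = ebcdic_records_to_lines("HELLOWORLD".encode("cp037"), 5)

# The figures from the question: 100 records of 13054 EBCDIC spaces (0x40).
big = ebcdic_records_to_lines(b"\x40" * (100 * 13054), 13054)
```

Here `len(big)` is 100 * (13054 + 1), i.e. the original size plus one
newline per record.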

(If by some disaster you convert it into Unicode, the size could be a
bit larger in UTF-8, where each non-ASCII character appearing in the
EBCDIC takes more than one byte, or doubled if you convert it to
UTF-16. I wouldn't recommend that unless you had compelling reasons!)
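
To illustrate the size effect, here is a small sketch comparing
encodings. It again assumes code page 037 via Python's "cp037" codec,
and uses "utf-16-le" so that no byte order mark is counted; with a BOM
the UTF-16 figure would be 2 bytes larger. The function name is
hypothetical.

```python
def encoded_sizes(raw, codec="cp037"):
    """Byte counts for the same text in EBCDIC, UTF-8 and UTF-16 (no BOM)."""
    text = raw.decode(codec)
    return {
        "ebcdic": len(raw),
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


# Two ASCII letters plus a cent sign, which exists in cp037 but not in ASCII.
sizes = encoded_sizes("AB\u00a2".encode("cp037"))
```

Only the cent sign grows in UTF-8 (2 bytes instead of 1), while UTF-16
doubles every character.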

--
Bill