The last time I looked at this, my impression was that you can't portably find the file size in C without actually reading the whole file through to the end. (You could set up #ifdef'd sections for various OSes and use non-standard system calls, but then you might as well just use a separate C library that provides such functionality.)
I guess the right way to do this is to read the file in chunks of a certain size. You can then increase the size of your final buffer as needed, copying each chunk into the final buffer as you read through the file. This way you don't make any assumptions about memory usage, but you still only read through the file once, and only a constant amount of extra storage space is used (plus the amount for the final buffer, of course.)

-Nicholas

On Tue, Sep 15, 2009 at 6:32 PM, Howard Sanner <[email protected]> wrote:
> Is there a *portable* way in C to determine the size of a binary file?
>
> It would seem that
>
>     fseek(stream, 0L, SEEK_END);
>     fsize = ftell(stream);
>
> would do the trick. However, the standard does not require SEEK_END to be
> implemented for binary files. (I looked it up last night.)
>
> Here's what I'm trying to do. I have a bunch of ASCII files that are the
> result of running an OCR engine against scans of catalog cards. It would be
> a great boon to be able to extract the Library of Congress catalog card no.
> (LCCN) from these files for further processing.
>
> The simplest way to do this is to fread() the whole file into a buffer, scan
> backwards for a hyphen using strrchr(), and then do some sanity checks &
> massaging.
>
> The files are small, but I don't like the idea of just allocating a large
> buffer and assuming that the files are smaller than the buffer, since I'm
> more pessimistic than Murphy. I'd like to check the file size against the
> buffer length and realloc() the buffer in the (unlikely, but still...) event
> that the file is bigger than the buffer.
>
> I suspect this could be done quite easily in PERL or suchlike. The problems
> are 1) I don't know PERL and don't have time or inclination to learn it, and
> 2) the program has to run on a Windoze machine (= no PERL). I'll be lucky to
> get access to VC++. (I'll be developing at home using GCC, since my home is
> a Windows-free zone, then compiling at the office with VC++.) One hopes that
> if I stick to ANSI standard C, VC++ will swallow it, but given that it's a
> Microsoft product, that glass may be half empty, too.
>
> BTW, I don't know C++ (at least not well enough for this), which is why I've
> couched this in C terms.
>
> Thanks for any pointers. (Couldn't resist.)
>
> Howard Sanner
> [email protected]
>
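
For reference, the chunked-read approach suggested in the reply could look roughly like the following. This is only a sketch under assumptions that are not in the original messages: the helper name read_whole_file, the 4096-byte CHUNK size, and the choice to grow the buffer by exactly one chunk at a time are all made up for illustration. It sticks to standard library calls (fread, realloc, memcpy, ferror), so it should not depend on SEEK_END or any OS-specific call.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK 4096  /* arbitrary chunk size, chosen only for illustration */

/* Hypothetical helper: reads everything remaining in fp into a malloc'd,
   NUL-terminated buffer. Returns NULL on read or allocation failure.
   On success, stores the number of bytes read in *len; the caller
   must free() the returned buffer. */
char *read_whole_file(FILE *fp, size_t *len)
{
    char chunk[CHUNK];
    char *buf = NULL;
    size_t used = 0;
    size_t n;

    while ((n = fread(chunk, 1, sizeof chunk, fp)) > 0) {
        /* Grow the final buffer to hold this chunk plus a terminator. */
        char *tmp = realloc(buf, used + n + 1);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        memcpy(buf + used, chunk, n);
        used += n;
    }

    if (ferror(fp)) {          /* fread stopped on an error, not on EOF */
        free(buf);
        return NULL;
    }

    if (buf == NULL) {         /* empty file: still return a valid string */
        buf = malloc(1);
        if (buf == NULL)
            return NULL;
    }

    buf[used] = '\0';
    *len = used;
    return buf;
}
```

Because the result is NUL-terminated, strrchr(buf, '-') can be applied to it directly, as in the original question, provided the OCR output contains no embedded NUL bytes. Opening the file with fopen(name, "rb") keeps the byte count exact on Windows, where text mode translates line endings.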
