Greets,

InStream_Buf() and InStream_Advance_Buf(), as described at...

  http://mail-archives.apache.org/mod_mbox/lucene-lucy-dev/200810.mbox/%[email protected]%3e

... were implemented for Unixen a while ago in our KS prototype, using mmap(). A fallback implementation using streamed IO was left in place; Windows had been using that fallback until a few days ago.

However, I've now finally finished the memory-mapping implementation for Windows, which uses CreateFile, CreateFileMapping, and MapViewOfFile -- and it seems to be working great.

On 64-bit systems, we map the whole file as soon as it's opened. On 32-bit systems, Buf() and Advance_Buf() use a windowing technique to conserve addressable space. Each InStream can be asked to provide at most one "buf" at a time. By default, the size of that buf is sysconf(_SC_PAGESIZE) on Unixen -- typically 4k -- and dwAllocationGranularity from GetSystemInfo() on Windows -- typically 64k.

For certain files, callers may request that Buf() provide them with more than that -- for example, SortReader maps all sort cache files in their entirety. Nevertheless, since we aren't mapping whole postings files, we shouldn't run out of addressable space for indexes of any practical size.

We can keep the streaming IO fallback around for systems which don't provide either mman.h or windows.h, though I don't imagine there are too many of those around these days.

Marvin Humphrey
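
For anyone who wants to see the windowing idea in concrete terms, here is a minimal sketch of the Unix side. It is not the KS code: WindowPlan, Window, window_plan(), window_map() and window_release() are hypothetical names made up for illustration. The one hard constraint it shows is that the file offset handed to mmap() must be a multiple of the page size (on Windows, the offset given to MapViewOfFile must likewise be a multiple of the allocation granularity), so a window is mapped from the aligned offset below the requested position and the caller's pointer is bumped forward by the difference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical: where to start the mapping and how long to make it. */
typedef struct {
    int64_t map_offset; /* aligned file offset to pass to mmap() */
    size_t  slack;      /* bytes between map_offset and the requested offset */
    size_t  map_len;    /* total length to map: slack + requested bytes */
} WindowPlan;

/* Hypothetical: one live window onto a file. */
typedef struct {
    void   *base;       /* address returned by mmap() */
    size_t  map_len;    /* length to hand back to munmap() */
    char   *buf;        /* points at the byte the caller asked for */
    size_t  buf_len;    /* bytes readable starting at buf */
} Window;

/* Pure arithmetic: round the requested offset down to the granularity.
 * Assumes offset >= 0 and granularity > 0. */
WindowPlan
window_plan(int64_t offset, size_t want, int64_t granularity) {
    WindowPlan plan;
    assert(offset >= 0 && granularity > 0);
    plan.map_offset = offset - (offset % granularity);
    plan.slack      = (size_t)(offset - plan.map_offset);
    plan.map_len    = plan.slack + want;
    return plan;
}

/* Map `want` bytes of `fd` starting at `offset`, read-only.
 * Returns 0 on success, -1 on failure.  The caller must make sure the
 * requested range lies within the file. */
int
window_map(Window *win, int fd, int64_t offset, size_t want) {
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0) { return -1; }
    WindowPlan plan = window_plan(offset, want, (int64_t)page_size);
    void *base = mmap(NULL, plan.map_len, PROT_READ, MAP_SHARED, fd,
                      (off_t)plan.map_offset);
    if (base == MAP_FAILED) { return -1; }
    win->base    = base;
    win->map_len = plan.map_len;
    win->buf     = (char*)base + plan.slack;
    win->buf_len = want;
    return 0;
}

/* Unmap the current window, if any, so the next one can be mapped. */
void
window_release(Window *win) {
    if (win->base != NULL) {
        munmap(win->base, win->map_len);
        win->base    = NULL;
        win->buf     = NULL;
        win->map_len = 0;
        win->buf_len = 0;
    }
}
```

Under these assumptions, a stream that honors the "at most one buf at a time" rule would call window_release() on its current window before calling window_map() for the next position, so the address space consumed per stream stays bounded by the largest window requested.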
