I'm not an expert on memory, so I used Process Explorer to look at the process. The Working Set of the current run is 11GB and the Private Bytes figure is 708MB. All the info is in this screenshot: https://www.dropbox.com/s/tzoud028pzdkfi7/screenshot_TURING_2018-10-08_133355.jpg?dl=0

I've got 16GB of RAM on this computer, and Process Explorer says it's almost full, with only ~150MB left. This is physical memory.

To your question: the loop does iterate, i.e. it finds multiple matches.

On Mon, Oct 8, 2018 at 1:20 PM Cameron Simpson <c...@cskk.id.au> wrote:

> On 08Oct2018 10:56, Ram Rachum <r...@rachum.com> wrote:
> >That's incredibly interesting. I've never used mmap before.
> >However, there's a problem.
> >I did a few experiments with mmap now, this is the latest:
> >
> >path = pathlib.Path(r'P:\huge_file')
> >
> >with path.open('r') as file:
> >    mmap = mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ)
>
> Just a remark: don't tromp on the "mmap" name. Maybe "mapped"?
>
> >    for match in re.finditer(b'.', mmap):
> >        pass
> >
> >The file is 338GB in size, and it seems that Python is trying to load it
> >into memory. The process is now taking 4GB RAM and it's growing. I saw the
> >same behavior when searching for a non-existing match.
> >
> >Should I open a Python bug for this?
>
> Probably not. First figure out what is going on. BTW, how much RAM have
> you got?
>
> As you access the mapped file the OS will try to keep it in memory in case
> you need that again. In the absence of competition, most stuff will get
> paged out to accommodate it. That's normal. All the data are "clean"
> (unmodified) so the OS can simply release the older pages instantly if
> something else needs the RAM.
>
> However, another possibility is that the regexp is consuming lots of memory.
>
> The regexp seems simple enough (b'.'), so I doubt it is leaking memory like
> mad; I'm guessing you're just seeing the OS page in as much of the file as
> it can.
>
> Also, does the loop iterate? i.e. does it find multiple matches as the
> memory gets consumed, or is the first iteration blocking and consuming gobs
> of memory before the first match comes back? A print() call will tell you
> that.
>
> Cheers,
> Cameron Simpson <c...@cskk.id.au>
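
For reference, here is a minimal sketch of the experiment quoted above, with the two suggestions from the thread applied: the mapped object is called "mapped" so it no longer shadows the mmap module, and a print() call shows whether the loop is actually iterating. The function name count_matches and the report_every parameter are hypothetical names made up for this illustration; they are not from the original code. The file is opened in binary mode here, which is an assumption on my part, since the pattern is a bytes pattern.

```python
import mmap
import re


def count_matches(path, pattern=b'.', report_every=1_000_000):
    """Count regex matches in a memory-mapped file, printing progress.

    Hypothetical helper for illustration only. Note that mapping an
    empty file with length 0 raises ValueError.
    """
    count = 0
    # Binary mode: the pattern is bytes and mmap exposes the raw file.
    with open(path, 'rb') as file:
        # Length 0 maps the whole file; ACCESS_READ makes it read-only,
        # so every page stays "clean" and the OS can drop it at will.
        with mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            for match in re.finditer(pattern, mapped):
                count += 1
                # Periodic output shows that matches arrive incrementally
                # rather than after one long blocking scan.
                if count % report_every == 0:
                    print(f'{count} matches, at offset {match.start()}')
    return count
```

Calling count_matches(r'P:\huge_file') should print a line every million matches if the loop iterates as described. If the output keeps advancing while Private Bytes stays small and only the Working Set grows, that would be consistent with the OS caching clean mapped pages rather than the regexp holding on to memory.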