On 07/20/2012 10:24 AM, gthomas wrote:
You should mmap the file:
import mmap

fh = open(filename, 'rb')
text = mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)
Then you can use "text" anywhere you can use a str, and it consumes no additional
memory!
Not entirely: you still claim a chunk of address space, so for very big files you can still run out of it
(on 32-bit systems mostly due to the low limit of a few GB).
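As a minimal, self-contained sketch of the mmap approach (the sample file and word-count pattern are only for illustration): `re` accepts the mapped buffer directly, though in Python 3 the pattern must then be a bytes pattern, so "use it like a str" holds only approximately.

```python
import mmap
import re

# Create a small sample file so the example is self-contained.
with open("sample.txt", "wb") as f:
    f.write(b"alpha beta gamma\n" * 3)

with open("sample.txt", "rb") as fh:
    text = mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)
    # re can scan the mmap object without copying it into a str;
    # note the bytes pattern (rb"...").
    words = re.findall(rb"\w+", text)
    text.close()

print(len(words))  # 3 lines x 3 words
```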
On Tuesday, 17 July 2012 11:54:58 UTC+2, PyRate wrote:
tokens sometimes fall between the file chunks. This led me to add some code
to the lex and yacc modules of ply so that it loads the next file chunk when it
reaches a threshold (number of bytes). Is this the way it is normally done? Is
there a better way to do this?
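The chunk-boundary problem described here is commonly handled by carrying the unconsumed tail of one chunk over into the next read, so that each yielded chunk ends on a safe boundary. A minimal sketch, assuming tokens never span a newline (the separator and chunk size are illustrative, not from the thread):

```python
import io

def read_chunks(fh, chunk_size=8192, sep=b"\n"):
    """Yield chunks that always end at `sep`, carrying any partial
    trailing line over into the next chunk."""
    tail = b""
    while True:
        block = fh.read(chunk_size)
        if not block:
            if tail:
                yield tail          # final piece with no trailing sep
            return
        block = tail + block
        cut = block.rfind(sep)
        if cut == -1:
            tail = block            # no separator seen yet; keep accumulating
        else:
            yield block[:cut + 1]
            tail = block[cut + 1:]

data = b"one two\nthree four\nfive\n"
# A tiny chunk size forces reads to split lines mid-way.
chunks = list(read_chunks(io.BytesIO(data), chunk_size=5))
print(chunks)
```

Each yielded chunk is then safe to hand to a regex-based scanner, because no token straddles two chunks.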
The core of the problem is that ply uses regular expressions for scanning, which
assume that the input is in memory.
One better way of doing this is thus to extend the RE engine to accept data from
a file stream.
Alternatively, you can write your own scanner that loads its input from a file. I think someone also
wrote a lex-like generator that you could attach to ply as a scanner. I don't remember what it assumed
as input, but it may be worth checking.
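A hand-rolled streaming scanner along these lines can refill its buffer whenever a match touches the end of the buffer, since such a match might be a truncated token. The token pattern below is purely illustrative; to attach something like this to ply you would additionally need to wrap it in an object exposing a `token()` method returning LexToken-like objects.

```python
import io
import re

# Hypothetical token set for illustration: integers, identifiers, whitespace.
TOKEN = re.compile(rb"(?P<NUM>\d+)|(?P<ID>[A-Za-z_]\w*)|(?P<WS>\s+)")

def scan(stream, chunk_size=4096):
    """Yield (kind, text) pairs, refilling the buffer so a token that
    straddles two chunks is still matched in one piece."""
    buf = b""
    eof = False
    pos = 0
    while True:
        m = TOKEN.match(buf, pos)
        # A match reaching the end of the buffer may be truncated:
        # read more input first, unless we have already hit EOF.
        if (m is None or m.end() == len(buf)) and not eof:
            block = stream.read(chunk_size)
            if block:
                buf = buf[pos:] + block   # drop consumed prefix, append new data
                pos = 0
                continue
            eof = True
            m = TOKEN.match(buf, pos)     # re-match now that input is complete
        if m is None:
            if pos < len(buf):
                raise SyntaxError("bad input at offset %d" % pos)
            return
        if m.lastgroup != "WS":           # skip whitespace tokens
            yield (m.lastgroup, m.group())
        pos = m.end()

# A tiny chunk size forces every token to straddle a chunk boundary.
toks = list(scan(io.BytesIO(b"abc 123 foo99"), chunk_size=2))
print(toks)
```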
Albert
--
You received this message because you are subscribed to the Google Groups
"ply-hack" group.