On Fri, Mar 16, 2012 at 02:03:38PM +0100, Philippe Veber wrote:
> Dear camlers,
>
> Say that you'd like to search a regexp on a file with lines so long that
> you'd rather not load them entirely at once. If you can bound the size of a
> match by k << length of a line, then you know that you can only keep a
> small portion of the line in memory to search the regexp. Typically you'd
> like to access substrings of size k from left to right. I guess such a
> thing should involve buffered inputs and avoid copying strings as much as
> possible. My question is as follows: has anybody written a library to
> access these substrings gracefully and with decent performance?
> Cheers,
> Philippe.

To your question: I don't know of such a library.
I wonder whether your lines would really fill several GB of RAM?

This may not be quite what you asked for, but if there is no such library,
you may want to implement the regexp search yourself:
  ==> http://swtch.com/~rsc/regexp/regexp1.html

For fast input, the Buffer module is a real performance boost compared to
ordinary string-appending operations:
  ==> http://caml.inria.fr/pub/docs/manual-ocaml/libref/Buffer.html

Ciao,
   Oliver
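
For illustration only, here is a minimal sketch of the windowing idea from the
question, using nothing but the standard library. It reads the channel in
chunks and carries the last k - 1 bytes over to the next window, so a match of
length k that straddles a chunk boundary is still seen whole. The name
find_all and the chunk parameter are hypothetical, and a literal pattern
stands in for the regexp: this is an assumption to keep the sketch
self-contained, not an existing library.

```ocaml
(* Sketch: return the stream offsets of every occurrence of the literal
   [pat] in [ic], holding at most [String.length pat - 1 + chunk] bytes
   in memory at any time. [find_all] and [chunk] are hypothetical names. *)
let find_all ?(chunk = 65536) pat ic =
  let k = String.length pat in
  if k = 0 || chunk < 1 then invalid_arg "find_all";
  let overlap = k - 1 in
  (* One reusable window: carried-over bytes first, then the fresh chunk. *)
  let buf = Bytes.create (overlap + chunk) in
  (* Does [pat] occur in [buf] starting at index [i]? *)
  let matches_at i =
    let rec go j = j = k || (Bytes.get buf (i + j) = pat.[j] && go (j + 1)) in
    go 0
  in
  (* [offset] is the stream position of buf.[0]; [kept] is the number of
     bytes carried over from the previous window. *)
  let rec loop acc offset kept =
    let n = input ic buf kept chunk in
    if n = 0 then List.rev acc
    else begin
      let len = kept + n in
      let acc = ref acc in
      for i = 0 to len - k do
        if matches_at i then acc := (offset + i) :: !acc
      done;
      (* Keep the tail in place; no new string is allocated. *)
      let keep = min overlap len in
      Bytes.blit buf (len - keep) buf 0 keep;
      loop !acc (offset + len - keep) keep
    end
  in
  loop [] 0 0
```

Because the pattern is exactly k bytes long and only k - 1 bytes are carried
over, no match fits entirely inside the overlap, so nothing is reported twice.
With a real regexp whose matches can be shorter than k, one would also have to
skip matches that end inside the carried-over part of the window.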