On Sun, 10 Nov 2002 Richard Gaskin <[EMAIL PROTECTED]> wrote:

> My hunch is that reading for lines is slower than reading a
> specified number of chars, since with lines it needs to evaluate
> each incoming character to determine if it's a return -- Scott, am I
> right or should they be about the same?

You're right, though I wouldn't think it would make *that* much difference.

As for my guess as to the fastest way to do this, it'd probably be a hybrid approach, using both "read for x" and "repeat for each line". You'd start by opening the file for binary read (faster than other modes). Then read for X characters, where X would be some large number determined experimentally for each system (it'd probably be some large percentage of the free RAM, and so probably on the order of a few MB), and then use "repeat for each line l in it".

The trick is that the last line of each chunk will usually be incomplete, so for the second and subsequent reads you subtract the length of that partial last line from the running total, and do "read for X at Y", where Y is the running total of what's been read, after subtracting the partial lines of course. Some extra bookkeeping will be required in this case (e.g., if the tag you're looking for is in the partial last line, you need to subtract 1 from the count so you don't count it twice). Exactly how to do this part most efficiently is left as an exercise for the reader ;-)

  Regards,
    Scott

> --
> Richard Gaskin
> Fourth World Media Corporation
> Developer of WebMerge 2.0: Publish any database on any site
> ___________________________________________________________
> [EMAIL PROTECTED] http://www.FourthWorld.com
> Tel: 323-225-3717 AIM: FourthWorldInc

********************************************************
Scott Raney [EMAIL PROTECTED] http://www.metacard.com
MetaCard: You know, there's an easier way to do that...

_______________________________________________
metacard mailing list
[EMAIL PROTECTED]
http://lists.runrev.com/mailman/listinfo/metacard
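
For anyone who wants to see the bookkeeping spelled out, here is a rough sketch of the hybrid approach described in the message above. It is written in Python rather than MetaTalk, purely as an illustration. The function name `count_lines_with_tag`, the 4 MB default chunk size, and the assumption that lines end in a linefeed are all mine, not anything from the original post.

```python
def count_lines_with_tag(path, tag, chunk_size=4 * 1024 * 1024):
    """Count the lines of a file that contain `tag` (a bytes value).

    Hypothetical helper: reads large binary chunks ("read for X at Y")
    and scans only the complete lines in each chunk.
    """
    count = 0
    offset = 0          # Y: bytes fully processed so far
    size = chunk_size   # X: bytes to request on the next read
    with open(path, "rb") as f:          # binary read, as suggested
        while True:
            f.seek(offset)
            chunk = f.read(size)
            if not chunk:
                break
            at_eof = len(chunk) < size
            if at_eof:
                # Nothing follows, so the last line is complete as it is.
                complete = chunk
            else:
                cut = chunk.rfind(b"\n")
                if cut == -1:
                    # One line is longer than the chunk: retry with more.
                    size *= 2
                    continue
                # Drop the partial last line; the next read starts there.
                complete = chunk[:cut + 1]
            for line in complete.splitlines():
                if tag in line:
                    count += 1
            offset += len(complete)
            size = chunk_size
            if at_eof:
                break
    return count
```

Calling it would look something like `count_lines_with_tag("data.txt", b"<item>")`, where both the file name and the tag are placeholders. Note that this variant never scans the partial last line at all (it is simply re-read as the start of the next chunk), so the "subtract 1 from the count" correction is not needed here; a version that scans the whole chunk and backs up afterwards would need it.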