Hi Clancy,
Have you tried converting your content to an .ini file? I hear that the
PHP function parse_ini_file() is very fast for reading this kind of file,
although for writing you have to roll your own code or use some sort of
PHP class, which may be slower than your method. A rough sketch of both
directions is below.
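Something along these lines (a minimal sketch; the file name and key are
made up, and the writing loop is hand-rolled since PHP has no built-in
.ini writer):

<?php
// Read the whole file into an associative array in one call.
$entries = parse_ini_file('index.ini');
if ($entries !== false && isset($entries['ASDF'])) {
    echo $entries['ASDF'], "\n";
}

// Writing it back means rebuilding the text yourself (or using a class).
// This assumes the values contain no double quotes.
$out = '';
foreach ($entries as $key => $value) {
    $out .= $key . ' = "' . $value . '"' . "\n";
}
file_put_contents('index.ini', $out);
?>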
Depends on what you're trying to do, I guess.
Regards
Clancy wrote:
I have been pondering whether it would be feasible to work with a
100,000-entry index file, and had put yesterday aside to do some timing
tests. I first generated some sample index files of various lengths. Each
entry consisted of a single line of the form

ASDF;rhubarb, rhubarb, ....

where ASDF is a randomly generated four-character index, and the rest of
the line is filling, which varies slightly in length and contents from
line to line, just in case something tried to get smart and cache the
line. The average length of a line is about 80 bytes.
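In outline, such a generator might look like this (a sketch; the file
name, entry count, and exact filler wording are arbitrary):

<?php
// Generate test entries: a random four-character key, a semicolon,
// then filling of slightly varying length (about 80 bytes per line).
$fp    = fopen('index.txt', 'w');
$chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
for ($i = 0; $i < 100000; $i++) {
    $key = '';
    for ($j = 0; $j < 4; $j++) {
        $key .= $chars[mt_rand(0, 25)];
    }
    // Vary the filling a little so nothing can cache the line.
    $filling = str_repeat('rhubarb, ', mt_rand(7, 9)) . $i;
    fwrite($fp, $key . ';' . $filling . "\n");
}
fclose($fp);
?>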
Then I wrote another program which read the file into an array, using the
four-character index as the key and the filling as the contents, sorted
the array, and then rewrote it to another file, reporting the elapsed
time after each step.
My first version used fgets() to read the source file a line at a time,
and fwrite() to write the new file. This version performed quite
consistently, taking approximately 1.3 seconds to read in a 100,000-entry
7.86 MB file, and another 5 seconds to write it out again.
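The shape of that version, as a sketch (file names are arbitrary, and
ksort() stands in for whatever sort was actually used):

<?php
$start = microtime(true);

// Read line by line with fgets(), keying the array on the index.
$index = array();
$in    = fopen('index.txt', 'r');
while (($line = fgets($in)) !== false) {
    list($key, $filling) = explode(';', rtrim($line, "\r\n"), 2);
    $index[$key] = $filling;
}
fclose($in);
echo 'read:    ', microtime(true) - $start, " s\n";

ksort($index);                       // sort on the four-character key
echo 'sorted:  ', microtime(true) - $start, " s\n";

// Write back out a line at a time with fwrite().
$out = fopen('sorted.txt', 'w');
foreach ($index as $key => $filling) {
    fwrite($out, $key . ';' . $filling . "\n");
}
fclose($out);
echo 'written: ', microtime(true) - $start, " s\n";
?>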
I then read the discussion following fschnittke's post "File write
operation slows to a crawl ..." and wondered if the suggestions made
there would help.
First I used file() to read the entire file into memory, then processed
each line into the form required to set up my array. This gave a useful
improvement for small files, halving the time required to read and
process a 10,000-entry 815 kB file, but for a 30,000-entry file the gain
had dropped to about 15%, and it made little difference for a
300,000-entry file.
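The file() variant differs only in how the array is built; roughly:

<?php
// Slurp the whole file into memory in one call, then split each line
// into key and filling.
$index = array();
foreach (file('index.txt', FILE_IGNORE_NEW_LINES) as $line) {
    list($key, $filling) = explode(';', $line, 2);
    $index[$key] = $filling;
}
?>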
Then I tried writing my whole array into a single horrendous string, and
using file_put_contents() to write out the whole string in one bang. I
started testing on a short file, and thought I was onto a good thing, as
it halved the time to write out a 10,000-entry 800 kB file. But as I
increased the file size it began to fail dismally. With a 30,000-entry
file it was 20% slower, and at 100,000 entries it was three times slower.
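That is, instead of fwrite() in a loop, something like this (a sketch,
continuing from the $index array built above):

<?php
// Concatenate every line into one string, then write it in one call.
ksort($index);
$blob = '';
foreach ($index as $key => $filling) {
    $blob .= $key . ';' . $filling . "\n";
}
file_put_contents('sorted.txt', $blob);
?>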
On Shawn McKenzie's suggestion, I also tried replacing fgets() with
stream_get_line(). As I had anticipated, any difference was well below
the timing noise level.
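The substitution is a one-line change in the reading loop (sketch; the
4096-byte maximum is an arbitrary choice, and stream_get_line() strips
the delimiter itself, so no rtrim() is needed):

<?php
$index = array();
$in    = fopen('index.txt', 'r');
while (($line = stream_get_line($in, 4096, "\n")) !== false) {
    list($key, $filling) = explode(';', $line, 2);
    $index[$key] = $filling;
}
fclose($in);
?>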
In conclusion, for short files (around 1 MB), using file() to read the
whole file into memory is substantially better than using fgets() to read
the file a line at a time, but the advantage rapidly diminishes for
longer files. Similarly, using file_put_contents() in place of fwrite()
to write the file out again is better for short files (up to perhaps
1 MB), but the performance deteriorates rapidly above this size.
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php