On 01/03/2014 02:28 PM, Rob Dixon wrote:
On 02/01/2014 15:21, mani kandan wrote:
Hi,

We have a huge file of 500MB that we need to manipulate: make some
replacements and then write it back out. I have used File::Slurp, and it
works for a file of 300MB (thanks Uri), but for this huge 500MB file it
fails with an error. I have also used the Tie::File module with the same
result: it does not process the file. Any guidance?

Slurping entire files into memory is usually overkill, and you should
only do it if you can afford the memory and *really need* random access
to the entire file at once. Most of the time a simple sequential
read/modify/write is appropriate, and Perl will take care of buffering
the input and output files in reasonable amounts.


Of course I differ on that opinion. Slurping is almost always faster, and in many cases the code is simpler than line-by-line I/O. It is also much easier to parse and process a whole file held in a single scalar than to work line by line. And what counts as a reasonable size has shifted dramatically over the decades: in the olden days line-by-line was mandated by small amounts of RAM, but while the typical file (code, configs, text, markup, HTML, etc.) has not grown much since then, RAM has gotten large and cheap. Slurping is the way to go today, except for genetics data, logs, and similar super-large files.
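
For a file that does fit comfortably in memory, a slurp-based edit can look like this minimal sketch, assuming File::Slurp's read_file/write_file, with hypothetical file names and patterns:

use strict;
use warnings;
use File::Slurp qw(read_file write_file);

# slurp the whole file into a single scalar
my $text = read_file( 'myfile.txt' );

# one substitution over the entire contents; unlike line-by-line
# code, this can also match across line boundaries
$text =~ s/from string/to string/g;

# write the modified text back out in one call
write_file( 'outfile.txt', $text );

__END__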


According to your later posts you have just 2GB of memory, and although
Windows XP *can* run in 500MB I wouldn't like to see a program that
slurped a quarter of the entire memory.

I haven't seen you describe what processing you want to do on the file.
If the input is a text file and the changes can be done line by line,
then you are much better off with a program that looks like this:

use strict;
use warnings;

# open the input and output files, reporting the file name on failure
open my $in,  '<', 'myfile.txt'  or die "Unable to open myfile.txt: $!";
open my $out, '>', 'outfile.txt' or die "Unable to open outfile.txt: $!";

# read, modify, and write one line at a time; Perl buffers the I/O
while (<$in>) {
   s/from string/to string/g;
   print $out $_;
}

__END__

But if you need more, then I would guess that Tie::File is your best
bet. You don't say what problems you are getting using this module, so
please explain.

Tie::File will be horrible for editing a large file like that. Your line-by-line code (or something similar) would be much better. Tie::File does a great deal of seeking and I/O, far more than linear buffered access would, and when lines wrap over block boundaries (much more likely than not) it does extra I/O on top of that.
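
For reference, the Tie::File approach under discussion looks something like this minimal sketch (hypothetical file name and patterns); every read or write of an element goes through the tie layer to the file on disk, which is where the extra seeking and I/O comes from:

use strict;
use warnings;
use Tie::File;

# present the file as an array of lines without slurping it
tie my @lines, 'Tie::File', 'myfile.txt'
    or die "Unable to tie myfile.txt: $!";

# editing a line rewrites it in place on disk; if its length
# changes, everything after it must be shifted, which is very
# expensive in a 500MB file
for (@lines) {
    s/from string/to string/g;
}

untie @lines;

__END__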

uri

--
Uri Guttman - The Perl Hunter
The Best Perl Jobs, The Best Perl Hackers
http://PerlHunter.com


