linux...@gmx.net writes:

> I have a large file which has millions of lines.
> They are text only lines.
> If I have to change one line in the file, what's the efficient way?
> I don't want to slurp the whole file into memory, change that line and
> write the full content back to disk again.

Since the editing seems to be line-based, I'd recommend checking out 'perl -i' for in-place editing, if you are OK with regexp-based search & replace. It's basically the same as doing it in vim:

    perl -p -i.orig -e 's/hello/wow/' input.txt

This goes through input.txt line by line, replaces the first "hello" on each line with "wow", and writes everything back to input.txt with the remainder of the file shifted correctly, while keeping a copy of the original at input.txt.orig.

However, that may match on multiple lines. If you know the line number in advance, you can check the line number variable $. instead:

    perl -p -i -e 's/^.+$/wow/ if $. == 2' input.txt

The single quotes keep the shell from trying to expand anything that starts with $.

Note that this still scans the entire input.txt line by line. Adding `exit` or `last` to the body of `-e` would make the program finish early but would also truncate input.txt, which is probably not what we want. (A plain `next` does not help either: with -p the line is still printed and the loop carries on.)

----

Alternatively, if you want to do this from your own code rather than with the "perl" command, read on...

How efficient it can be depends a bit on how the target line is identified and on the kind of editing that's required. You can certainly avoid slurping by making the changes line by line:

    while (defined(my $line = <$fh>)) {
        ...
    }

If the edit replaces $line with something of equal length, it can be pretty efficient: just print the new content over the old and the file is modified in place. If, say, we want to change just the 42nd line in the file, here's how I would do it:

    # Open in read-write mode.
    open my $fh, "+<", "input.txt" or die "Cannot open input.txt: $!";

    # Read up to the beginning of the 42nd line.
    my $lineno = 1;
    while (defined(my $line = <$fh>)) {
        $lineno += 1;
        last if $lineno == 42;
    }

    # Switching from reading to writing on the same handle needs a
    # seek in between; this one stays at the current position.
    seek $fh, 0, 1;

    # Print the new content at the beginning of the 42nd line.
    print $fh $newcontent;

This is the most efficient scenario because the program can end here without reading the remainder of input.txt.

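As a minimal sketch of the same overwrite in self-contained form: the helper name overwrite_line_in_place and its arguments are made up for illustration, and it assumes plain byte-oriented text (no encoding layers) and "\n" line endings. It refuses to write unless the replacement is exactly as long as the line it replaces.

```perl
use strict;
use warnings;

# Hypothetical helper: overwrite line $target of $path in place.
# Only writes when $newcontent has the same length as the old line
# (not counting the trailing newline); otherwise it dies.
sub overwrite_line_in_place {
    my ($path, $target, $newcontent) = @_;

    open my $fh, '+<', $path or die "Cannot open $path: $!";

    my $lineno = 0;
    my $start  = tell $fh;    # offset where the next line begins
    while (defined(my $line = <$fh>)) {
        $lineno++;
        if ($lineno == $target) {
            chomp(my $old = $line);
            die "Length mismatch: refusing to overwrite\n"
                if length($old) != length($newcontent);

            # Go back to the start of the line and write over it.
            seek $fh, $start, 0 or die "Cannot seek in $path: $!";
            print {$fh} $newcontent or die "Cannot write to $path: $!";
            close $fh or die "Cannot close $path: $!";
            return 1;
        }
        $start = tell $fh;
    }

    close $fh;
    return 0;    # the file has fewer than $target lines
}
```

Called as overwrite_line_in_place("input.txt", 42, $newcontent), it stops as soon as the target line has been rewritten and never reads the rest of the file.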
However, if $newcontent is longer than the 42nd line, the program would still finish, and when we inspect the file we'll see that the text in $newcontent has bled over into the 43rd line and maybe further. Similarly, if $newcontent is shorter, the original content of the 42nd line will only be partially replaced. Most likely that's not the kind of editing we want.

That means that if $newcontent is longer or shorter, the remainder of the file has to be shifted a few characters forward or backward, so we need to re-print those lines back to $fh, which also requires a lot of bookkeeping code just to get every corner case right.

If the editing we want is rather generic, I'd say we probably want to write the output to a different file instead of editing in place. We will still end up reading the entire file, but only keeping one line at a time in memory.

--
Cheers,
Kang-min Liu
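
For reference, a rough sketch of the write-to-a-different-file approach mentioned above. The helper name replace_line_via_tempfile is hypothetical, and the sketch assumes "\n" line endings; it does not try to preserve the original file's permissions or ownership.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Hypothetical helper: replace line $target of $path with $newcontent
# by streaming through a temporary file, one line in memory at a time.
sub replace_line_via_tempfile {
    my ($path, $target, $newcontent) = @_;

    open my $in, '<', $path or die "Cannot open $path: $!";

    # Keep the temp file in the same directory so that rename()
    # does not have to cross filesystems.
    my ($out, $tmp) = tempfile(DIR => dirname($path), UNLINK => 0);

    my $lineno = 0;
    while (defined(my $line = <$in>)) {
        $lineno++;
        $line = "$newcontent\n" if $lineno == $target;
        print {$out} $line or die "Cannot write to $tmp: $!";
    }
    close $in;
    close $out or die "Cannot close $tmp: $!";

    # Swap the new file into place.
    rename $tmp, $path or die "Cannot rename $tmp to $path: $!";

    return $lineno >= $target;    # true if the target line existed
}
```

Because the new line can be any length, no shifting or bookkeeping is needed; the cost is one full read and one full write of the file.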