Google Docs won't handle a 4GB file... so I will have to try Unix tools... http://support.google.com/docs/bin/answer.py?hl=en&answer=37603
On Thu, Jun 7, 2012 at 1:19 PM, Dominique Pellé <dominique.pe...@gmail.com> wrote:

> Gary Johnson <garyj...@spocom.com> wrote:
>
> > On 2012-06-07, Dominique Pellé wrote:
> >> Marc Weber wrote:
> >>
> >> > forget about vim, on linux just do:
> >> > tail -n +10 file.sql | head -n +10 > trimmed.sql
> >>
> >> Many people posted solutions with head and tail that don't work.
> >>
> >> Here is one that works:
> >>
> >> $ sed 1,10d < input.txt | tac | sed 1,10d | tac > output.txt
> >
> > Are you sure that tac works in this case? I thought that tac pushed
> > all the input lines onto a stack in memory, then popped each line as
> > it was output. That means having to put the entire file into
> > memory, which we were trying to avoid.
>
> Yes, I was wondering about that too. But I checked, and tac uses
> only very little memory, even on huge files.
>
> Actually, tac behaves differently with an input file and with an
> input stream:
>
> * With an input file, it outputs results immediately. I suppose tac
>   reads the file in blocks from the end and reverses each block.
> * With an input stream it can't do that, so it has to read the full
>   stream before it can output anything. Yet it still uses very little
>   memory: it uses temporary files (confirmed by looking at
>   /proc/<pid>/fd).
>
> So the tac solution works. But given that tac is more efficient on an
> input file than on an input stream, changing the order should be
> better. In other words, this...
>
> $ tac input.txt | sed 1,10d | tac | sed 1,10d > output.txt
>
> ... should be faster than this:
>
> $ sed 1,10d input.txt | tac | sed 1,10d | tac > output.txt
>
> I measured it on a big file to confirm: the first solution took
> 9.2 sec, the second 12.2 sec.
>
> In any case, the Perl solution that I gave, which uses a rotating
> buffer, makes only one pass, does not use much memory, and does not
> use temporary files either:
>
> $ perl -ne 'print $l[$.%10] if ($. > 10*2); $l[$.%10] = $_' input.txt > output.txt
>
> (Note the condition must be "$. > 10*2", not ">=", otherwise line 10
> is printed as well.)
>
> Yet this Perl solution is slower than tac: it takes 14.8 sec on the
> same input file.
>
> The strange-looking solution...
>
> $ sed -e :a -e '$d;N;2,10ba' -e 'P;D' input.txt > output.txt
>
> ... takes 6.0 sec.
>
> Tim Chase wrote:
>
> > I think you're reading it backwards, as head/tail (at least GNU
> > versions; for other flavors, YMMV) allow for a "+" in front of the
> > number, so
> >
> > tail -n +20
> >
> > chops off the first 19 lines in the file; similarly, "-" in front of
> > the number with head does all but the last N lines of the file. The
> > example above should likely read something like
> >
> > tail -n +11 file.sql | head -n -10 > trimmed.sql
>
> Right. My apologies. That works indeed, and it's much faster; I
> suppose it does not have to parse the file line by line. With the
> same large input as above, it took only 2.6 sec.
>
> -- Dominique
>
> --
> You received this message from the "vim_use" maillist.
> Do not top-post! Type your reply below the text you are replying to.
> For more information, visit http://www.vim.org/maillist.php
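A quick sanity check of the tail/head pipeline Tim Chase suggested (the fastest approach in the thread), using a small numbered file so the result is easy to verify. This is a sketch assuming GNU coreutils: `head -n -10` (drop the last 10 lines) is a GNU extension, and the file names are just placeholders.

```shell
# Build a toy 100-line file, one number per line.
seq 1 100 > input.txt

# Drop the first 10 lines (start output at line 11)
# and the last 10 lines.
tail -n +11 input.txt | head -n -10 > trimmed.txt

# trimmed.txt should now hold lines 11..90 (80 lines).
head -n 1 trimmed.txt   # prints 11
tail -n 1 trimmed.txt   # prints 90
wc -l < trimmed.txt     # prints 80
```

The same check scales to the 4GB case, since both tail and head stream the file without loading it into memory.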