Google Docs won't handle a 4GB file...
So I will have to try unix tools:
http://support.google.com/docs/bin/answer.py?hl=en&answer=37603

On Thu, Jun 7, 2012 at 1:19 PM, Dominique Pellé
<dominique.pe...@gmail.com> wrote:

> Gary Johnson <garyj...@spocom.com> wrote:
>
> > On 2012-06-07, Dominique Pellé wrote:
> >> Marc Weber wrote:
> >>
> >> > forget about vim; on linux just do:
> >> > tail -n +10 file.sql  | head -n +10 > trimmed.sql
> >>
> >> Many people posted solutions with head and tail that don't work.
> >>
> >> Here is one that works:
> >>
> >> $ sed 1,10d < input.txt | tac | sed 1,10d | tac > output.txt
> >
> > Are you sure that tac works in this case?  I thought that tac pushed
> > all the input lines onto a stack in memory, then popped each line as
> > it was output.  That means having to put the entire file into
> > memory, which we were trying to avoid.
>
> Yes, I was wondering about that too.  But I checked and
> tac only uses very little memory even on huge files.
>
> Actually, tac behaves differently with an input file and with
> an input stream:
>
> * with an input file, it outputs results immediately. I suppose
>  tac reads the file block by block from the end and reverses
>  each block.
> * with an input stream, it can't do that, so it has to read the
>  full stream before it can output anything. Yet it still uses very
>  little memory: it uses temporary files (confirmed by looking at
>  /proc/<pid>/fd).
>
> So the tac solution works.  But given that tac is more efficient
> on an input file than on an input stream, changing the order
> should be faster.  In other words, this...
>
> $ tac input.txt | sed 1,10d | tac | sed 1,10d > output.txt
>
> ... should be faster than this:
>
> $ sed 1,10d input.txt | tac | sed 1,10d | tac > output.txt
>
> I measured it on a big file to confirm it:
>
> first solution took 9.2 sec, second solution took 12.2 sec
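
(A quick sanity check of my own, not from the original thread: both orderings
should leave exactly lines 11 through 90 of a 100-line file.)

```shell
# Generate a 100-line test file, then trim 10 lines from each end
# with both pipe orderings and confirm the results match.
seq 1 100 > input.txt

tac input.txt | sed 1,10d | tac | sed 1,10d > out1.txt   # tac-first ordering
sed 1,10d input.txt | tac | sed 1,10d | tac > out2.txt   # sed-first ordering

diff out1.txt out2.txt && echo "identical"
head -n 1 out1.txt   # 11
tail -n 1 out1.txt   # 90
```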
>
> In any case, the Perl solution that I gave, which uses a
> rotating buffer, does one pass only, does not use much memory
> and does not use temporary files either.
>
> $ perl -ne 'print $l[$.%10] if $. > 10*2; $l[$.%10] = $_' input.txt > output.txt
>
> Yet this Perl solution is slower than tac.  It takes 14.8 sec on
> the same input file.
>
> The strange-looking solution...
>
> $ sed -e :a -e '$d;N;2,10ba' -e 'P;D' input.txt > output.txt
>
> ... takes 6.0 sec.
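
(A side note of mine, not from the thread: as far as I can tell that sed idiom
only trims the tail. It keeps a sliding 10-line window in the pattern space and
discards whatever remains at end of input, so the head still needs a separate
`1,10d` pass.)

```shell
# The idiom alone keeps lines 1..90 of a 100-line input:
seq 1 100 | sed -e :a -e '$d;N;2,10ba' -e 'P;D' | tail -n 1   # 90
# Combined with a head trim, it matches the other solutions:
seq 1 100 | sed 1,10d | sed -e :a -e '$d;N;2,10ba' -e 'P;D' | head -n 1   # 11
```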
>
> Tim Chase wrote:
>
> > I think you're reading it backwards, as head/tail (at least GNU
> > versions; for other flavors, YMMV) allow for a "+" in front of the
> > number so
> >
> > tail -n +20
> >
> > chops off the first 19 lines in the file; similarly, "-" in front of
> > the number with head does all but the N last lines of the file.  The
> > example above should likely read something like
> >
> >  tail -n +11 file.sql | head -n -10 > trimmed.sql
>
> Right, my apologies. That indeed works and it's much faster.
> I suppose it does not have to parse line by line with this solution.
> With the same large input as above, it only took 2.6 sec.
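
(Same toy check, mine: with GNU coreutils, `tail -n +11` starts output at
line 11 and `head -n -10` drops the final 10 lines.)

```shell
seq 1 100 > file.sql
tail -n +11 file.sql | head -n -10 > trimmed.sql   # GNU head supports -n -N
wc -l < trimmed.sql     # 80
head -n 1 trimmed.sql   # 11
tail -n 1 trimmed.sql   # 90
```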
>
> -- Dominique
>
> --
> You received this message from the "vim_use" maillist.
> Do not top-post! Type your reply below the text you are replying to.
> For more information, visit http://www.vim.org/maillist.php
>
