On Wed, May 01, 2002 at 09:45:39AM -0400, Cox, Mark wrote:
> I would try a combo of grep, tail and head. If you have a file like
>
> Chris
> Mark
> Markus
> Marko
> Donna
> Jim
>
> So if you are looking for the 2 and 3 occurrence of Ma in the file you would
> do this
>
> doing 'grep Ma test.txt | head -3 | tail -2'
> would return Markus and Marko. Where 3 is the Max number you want to start
> with and 2 is the number of occurrences you would like.
This works, but only on a constrained or small data set. If, for instance,
the lines containing the substring 'Ma' were interspersed within the file,
and you wanted to get each occurrence AND the line before it, your solution
would fail.
Just for grins, I changed the order of a few lines so the data now looks like:
Chris
Mark
Marko
Donna
Markus
Jim
grep -n Ma /tmp/test.txt | cut -f1 -d: | xargs -i sh -c 'head -n{} /tmp/test.txt | tail -n2'
which works, but too rigidly (and, of course, inefficiently), and prints
'Mark' twice: once as a match, and once as the context line preceding
'Marko'.
grep -B1 Ma /tmp/test.txt
only prints one line for Mark.
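For the record, here's that run against the reshuffled data (assuming GNU
grep; -B is not in every vendor's grep). Note that grep never reprints a
line it has already emitted, which is exactly the state-keeping the pipeline
above lacks:

```shell
# Recreate the reshuffled test data.
printf '%s\n' Chris Mark Marko Donna Markus Jim > /tmp/test.txt

# -B1 prints one line of leading context per match, but suppresses
# lines already output, so Mark appears only once: as a match, which
# also serves as the context for Marko.
grep -B1 Ma /tmp/test.txt
# Chris
# Mark
# Marko
# --
# Donna
# Markus
```

The '--' is GNU grep's separator between non-contiguous groups of output.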
The fix would require us to maintain state from line to line and omit any
output lines that have already been printed. It would be nice if one could
create and manipulate hashes from the shell.
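Lacking shell hashes, awk's associative arrays will carry that state; a
sketch against the test file above (file name is just my example):

```shell
# Print each match plus its preceding line, skipping anything already
# printed. The associative array 'seen' tracks which line numbers have
# been emitted -- the per-line state a bare pipeline can't keep.
awk '/Ma/ { if (NR > 1 && !seen[NR-1]) print prev
            if (!seen[NR]) print
            seen[NR-1] = seen[NR] = 1 }
     { prev = $0 }' /tmp/test.txt
```

On the reshuffled data this prints Chris, Mark, Marko, Donna, Markus, with
Mark appearing only once.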
I seem to recall seeing references to a set of shell dbm* tools out there somewhere.
Of course, you'd be
responsible for removing the temporary files when you're done.
This is a classic case of a problem that *could* be handled by perl, but
for which more suitable, specialized tools exist.
-Gyepi