Hello,

I tried the parallel grep example on a sample file (1.4G, 3,153,199
lines). The parallel grep (
http://www.gnu.org/software/parallel/man.html#example__parallel_grep) is
considerably slower, and I am trying to see where the bottleneck is. Is
there an easy way to guess which approach would be faster based on the
size of the file?

# Regular grep
$ time cat testfile | grep -F test_pattern | wc -l
117

real    0m1.208s
user    0m0.543s
sys     0m1.704s

# Parallel grep
$ time cat testfile | parallel --pipe grep -F test_pattern | wc -l
117

real    0m18.815s
user    0m11.807s
sys     0m16.945s

I repeated the test several times to rule out disk speed. The parallel
version was similarly slow the first time I ran it as well.

NOTE: It is not that I am trying to make a 1 second grep any faster :). I
am just trying to find out why 'parallel --pipe' was more than 10x slower
(about 15x by the timings above). I would have expected similar or slightly
worse performance for a 1.4G file.
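
For anyone who wants to reproduce the comparison on a smaller file, here is a
minimal sketch. The generated file, its size, and the block sizes tried are
my own assumptions for illustration, not the 1.4G testfile above. It only
checks that plain grep and 'parallel --pipe' agree on the match count for a
couple of --block values, and it skips the parallel part if GNU parallel is
not installed.

```shell
#!/bin/sh
# Build a small stand-in for testfile (hypothetical data, not the 1.4G file).
tmp=$(mktemp)

# 200000 lines; every 1000th line contains test_pattern.
awk 'BEGIN {
  for (i = 1; i <= 200000; i++) {
    if (i % 1000 == 0) print "line " i " test_pattern"
    else               print "line " i " filler"
  }
}' > "$tmp"

# Baseline: count matching lines with plain grep.
plain_count=$(grep -F -c test_pattern "$tmp")
echo "plain grep: $plain_count"

# Only run the parallel variant if the installed 'parallel' is GNU parallel.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
  for bs in 1M 10M; do
    # --block sets the chunk size that --pipe hands to each grep job.
    par_count=$(parallel --pipe --block "$bs" grep -F test_pattern < "$tmp" | wc -l)
    echo "parallel --pipe --block $bs: $par_count"
  done
fi

rm -f "$tmp"
```

Prefixing each pipeline with 'time' in an interactive shell, as in the runs
above, would show how the gap changes with the block size.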

-- 
Harry
