On Tue, Jul 16, 2013 at 3:02 PM, Ole Tange <[email protected]> wrote:
> On Tue, Jul 16, 2013 at 2:20 PM, Diaa Sami <[email protected]> wrote:
>
>> Hi,
>> I'm using gnu parallel with a custom python script that processes lines, one
>> line in, one or more lines out, and this script happens to have a long
>> startup time because of the kind of processing it has to perform on the
>> input(it has to load a dictionary in memory first).
>> I was wondering if gnu parallel can just keep the processes running and just
>> feed them records rather than starting a process for each block.
>
> So you are doing something like:
>
>   cat bigfile | parallel --pipe yourprogram > output
>
> And no: GNU Parallel currently does not have an option for feeding
> more blocks to a running instance.

Now I have made a first version in git. It is highly inefficient, but
give it a spin and see if you can find errors.

  cat bigfile | parallel --round-robin --pipe yourprogram > output

  git clone git://git.sv.gnu.org/parallel.git

/Ole
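
For this to pay off, yourprogram should do its slow startup once and then
keep reading standard input until end-of-file. A minimal sketch of such a
script, assuming one line in and one or more lines out; load_dictionary,
process_line and the sample entries are hypothetical stand-ins, not Diaa's
actual code:

```python
#!/usr/bin/env python3
"""Hypothetical stand-in for `yourprogram`: slow startup, then a line loop."""
import sys


def load_dictionary():
    # Placeholder for the expensive startup step (assumption: the real
    # script loads a large dictionary into memory here).
    return {"colour": ["color"], "grey": ["gray", "gray-ish"]}


def process_line(line, dictionary):
    # One line in, one or more lines out; unknown words pass through.
    key = line.strip()
    return dictionary.get(key, [key])


def main(stdin=sys.stdin, stdout=sys.stdout):
    dictionary = load_dictionary()  # paid once per process, not per line
    for line in stdin:              # keep reading until end-of-file
        for out in process_line(line, dictionary):
            stdout.write(out + "\n")


if __name__ == "__main__":
    main()
```

Saved as an executable file, it would take the place of yourprogram in the
command above.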
