On Wed, Jun 27, 2012 at 10:12 AM, Rahul <[email protected]> wrote:
> How can I run Pipeline in synchronous mode ?  I have built junit tests but
> they all run in async mode over MRPipeline.
> I need to make sure that the pipeline gets finished so I have to continually
> monitor the output folder.  Is there a better way for this ?

Hi Rahul,

Pipelines themselves are always run synchronously; that is, a call to
Pipeline.run or Pipeline.done will block until all underlying jobs have
run (or failed).

There shouldn't be a need to monitor output folders; you can count
on output directories being populated once Pipeline.run or
Pipeline.done returns (and you shouldn't count on the folders being
populated before that point).

Of course, the underlying MapReduce job(s) are run asynchronously.
The coordination of the asynchronous MR jobs and the blocking
of the MRPipeline class is handled by the CrunchJobControl class.
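
To make the "asynchronous jobs, blocking caller" idea concrete, here is a
small toy model in plain Java. It is not Crunch code and does not reflect
how CrunchJobControl is actually implemented; the class name
BlockingPipelineSketch and the method runAndWait are made up for
illustration, and each "job" is just a Supplier that reports success or
failure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

/** Toy model only (hypothetical names); this is not Crunch's implementation. */
final class BlockingPipelineSketch {

    /**
     * Starts every job in the background, then blocks until all of them
     * have finished. Returns true only if every job reported success.
     */
    static boolean runAndWait(List<Supplier<Boolean>> jobs) {
        ExecutorService executor =
                Executors.newFixedThreadPool(Math.max(1, jobs.size()));
        try {
            List<CompletableFuture<Boolean>> running = new ArrayList<>();
            for (Supplier<Boolean> job : jobs) {
                // supplyAsync returns immediately; the job runs on the pool.
                // A job that throws is treated as a failed job.
                running.add(CompletableFuture.supplyAsync(job, executor)
                        .exceptionally(t -> false));
            }
            boolean allSucceeded = true;
            for (CompletableFuture<Boolean> future : running) {
                // join() blocks the calling thread until this job is done.
                allSucceeded &= future.join();
            }
            return allSucceeded;
        } finally {
            executor.shutdown();
        }
    }
}
```

The caller of runAndWait sees a plain blocking call, much as a test sees
Pipeline.run or Pipeline.done: by the time it returns, every background
job has either completed or failed, so nothing needs to be polled.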

Hope this helps.

Regards,

Gabriel
