I've run the example code (using two Shakespeare plays as input) and
everything worked as expected (the pipeline produced wordcounts and then
shut down). Can you attach a debugger to
ExecutorServiceParallelExecutor:431 and see where the runner appears to
hang?

The log statement printed at that line is followed by a call to
ExecutorService#shutdown(), which should cause the worker threads to be
terminated (relatively quickly, as no additional work will be accepted
after that call, and all previous work has completed), which should allow
the JVM to shut down once it's run through #awaitCompletion() (this is
automatically invoked based on the default isBlockOnRun value); an explicit
call to `awaitCompletion()` should return immediately if this is the case.
There is admittedly a bit of additional defensive work that could be added
to ExecutorServiceParallelExecutor, which I've published in github PR 399

The fact that you're getting output files (and the write operations are
finalized) suggests that everything is being at least invoked as expected.
What are you using as input files?

https://github.com/apache/incubator-beam/pull/399

On Thu, May 26, 2016 at 8:33 AM, David Olsen <[email protected]>
wrote:

> Yes, the behavior is the same even without result.awaitCompletion().
>
> After turning debug on, looks like it's cleaning the temp files
> (result-0000?-of-00004-*) produced. And within the output folder only 4
> files are left.
>
> result-00000-of-00004  result-00002-of-00004
> result-00001-of-00004  result-00003-of-00004
>
> But only 4 lines are written to these files (each file only contains 1
> line)
>
> $cat result-0000* | wc -l
> 4
>
> Below is the end of messages after the debug level is turned on:
>
> /tmp/out/result-00001-of-00004-temp-e5c20f19-f6dc-49f7-afe5-dfd168dd6353
> 2016-05-26 23:20:28 DEBUG FileBasedSink$LocalFileOperations:654 - Removing
> file
> /tmp/out/result-00001-of-00004-temp-d2b6bd5f-ca22-402c-8fbb-9445059cb626
> 2016-05-26 23:20:28 DEBUG FileBasedSink$LocalFileOperations:654 - Removing
> file
> /tmp/out/result-00001-of-00004-temp-1b256ecc-e0f9-43fa-ad37-c0920db1aaf7
> 2016-05-26 23:20:28 DEBUG FileBasedSink$LocalFileOperations:654 - Removing
> file
> /tmp/out/result-00001-of-00004-temp-27a03df8-50c8-44e2-ac7b-c4c9b424ddb1
> 2016-05-26 23:20:28 DEBUG FileBasedSink$LocalFileOperations:654 - Removing
> file
> /tmp/out/result-00001-of-00004-temp-70857308-2faf-4e38-98d1-4dc0c0b14b2a
> 2016-05-26 23:20:28 DEBUG FileBasedSink$LocalFileOperations:654 - Removing
> file
> /tmp/out/result-00001-of-00004-temp-ae226ef3-a7b3-4aa6-9caf-dff60a2e3f1e
> 2016-05-26 23:20:28 DEBUG FileBasedSink$LocalFileOperations:654 - Removing
> file
> /tmp/out/result-00001-of-00004-temp-4b19e860-2505-4aa9-b779-a2ae12b5c781
> 2016-05-26 23:20:28 DEBUG FileBasedSink$LocalFileOperations:654 - Removing
> file
> /tmp/out/result-00001-of-00004-temp-c9ac3c39-2812-4fc7-92f6-804717ab9dbd
> 2016-05-26 23:20:28 DEBUG FileBasedSink$LocalFileOperations:654 - Removing
> file
> /tmp/out/result-00001-of-00004-temp-68041765-c800-4116-82eb-2f3874b6d0c7
> 2016-05-26 23:20:28 DEBUG Write:234 - Done finalizing write operation
> org.apache.beam.sdk.io.TextIO$TextSink$TextWriteOperation@ac61f9
> 2016-05-26 23:20:28 DEBUG ExecutorServiceParallelExecutor:431 - Pipeline
> is finished. Shutting down. {}
>
> Although the console prints 'Shutting down', it still hangs indefinitely
> without back to the command prompt.
>
> Thanks for help.
>
>
> On 26 May 2016 at 04:53, Thomas Groh <[email protected]> wrote:
>
>> Additionally, if you set the logging to Debug (with the same beam
>> version) there should be two additional log lines printed for each Write
>> operation, which would help significantly.
>>
>> On Wed, May 25, 2016 at 10:52 AM, Thomas Groh <[email protected]> wrote:
>>
>>> If you inspect the output result (at /tmp/out/result-...-of-00004),
>>> once the pipeline appears to be hung, has the pipeline produced output? The
>>> logs that claim that the write has been finalized suggest that the pipeline
>>> is complete, and the failure is occurring between the underlying executor
>>> and the InProcessPipelineResult, rather than the Pipeline execution. My
>>> initial expectation is that this may be due to an issue between the
>>> InProcessPipelineResult and the underlying executor rather than the
>>> executor proper.
>>>
>>> When the call to `result.awaitCompletion()` is removed, you see the same
>>> behavior, correct?
>>>
>>> On Wed, May 25, 2016 at 8:58 AM, David Olsen <[email protected]>
>>> wrote:
>>>
>>>> I try word count with InProcessPipelineRunner and it basically works.
>>>> But I am not sure if I use the correct way to stop the pipeline running. My
>>>> code is at http://paste.debian.net/702925
>>>>
>>>> The execution prints the following messages and then seems to hang
>>>> forever with thread not being terminated.
>>>>
>>>> 2016-05-25 22:37:12 INFO  Write:183 - Opening writer for write
>>>> operation org.apache.beam.sdk.io.TextIO$TextSink$TextWriteOperation@1b1a53a
>>>> 2016-05-25 22:37:12 INFO  Write:183 - Opening writer for write
>>>> operation org.apache.beam.sdk.io.TextIO$TextSink$TextWriteOperation@40b01b
>>>> 2016-05-25 22:37:16 INFO  Write:230 - Finalizing write operation
>>>> org.apache.beam.sdk.io.TextIO$TextSink$TextWriteOperation@40b01b
>>>> 2016-05-25 22:37:16 INFO  Write:230 - Finalizing write operation
>>>> org.apache.beam.sdk.io.TextIO$TextSink$TextWriteOperation@1e9d9fe
>>>> 2016-05-25 22:37:17 INFO  Write:230 - Finalizing write operation
>>>> org.apache.beam.sdk.io.TextIO$TextSink$TextWriteOperation@1b1a53a
>>>> 2016-05-25 22:37:17 INFO  Write:230 - Finalizing write operation
>>>> org.apache.beam.sdk.io.TextIO$TextSink$TextWriteOperation@ff69ad
>>>>
>>>> Following code snippet is used to terminate the execution, but I don't
>>>> know whether it's correct usage or not.
>>>>
>>>> final InProcessPipelineResult result = (InProcessPipelineResult)
>>>> p.run();
>>>> result.awaitCompletion();
>>>>
>>>> Also calling 'p.run();' only without obtaining result then
>>>> awaitCompletion() has the same issue.
>>>>
>>>> beam version is git commit 78c8c528e23dba4f41e6ffc98b4c0d78bcea5c08
>>>> scala 2.11.x
>>>> sbt 0.13.x
>>>>
>>>> I appreciate any suggestions. Thanks.
>>>>
>>>
>>>
>>
>

Reply via email to