Yes, the behavior is the same even without result.awaitCompletion().

After turning debug on, looks like it's cleaning the temp files
(result-0000?-of-00004-*) produced. And within the output folder only 4
files are left.

result-00000-of-00004  result-00002-of-00004
result-00001-of-00004  result-00003-of-00004

But only 4 lines are written to these files (each file only contains 1 line)

$cat result-0000* | wc -l
4

Below is the end of messages after the debug level is turned on:

/tmp/out/result-00001-of-00004-temp-e5c20f19-f6dc-49f7-afe5-dfd168dd6353
2016-05-26 23:20:28 DEBUG FileBasedSink$LocalFileOperations:654 - Removing
file
/tmp/out/result-00001-of-00004-temp-d2b6bd5f-ca22-402c-8fbb-9445059cb626
2016-05-26 23:20:28 DEBUG FileBasedSink$LocalFileOperations:654 - Removing
file
/tmp/out/result-00001-of-00004-temp-1b256ecc-e0f9-43fa-ad37-c0920db1aaf7
2016-05-26 23:20:28 DEBUG FileBasedSink$LocalFileOperations:654 - Removing
file
/tmp/out/result-00001-of-00004-temp-27a03df8-50c8-44e2-ac7b-c4c9b424ddb1
2016-05-26 23:20:28 DEBUG FileBasedSink$LocalFileOperations:654 - Removing
file
/tmp/out/result-00001-of-00004-temp-70857308-2faf-4e38-98d1-4dc0c0b14b2a
2016-05-26 23:20:28 DEBUG FileBasedSink$LocalFileOperations:654 - Removing
file
/tmp/out/result-00001-of-00004-temp-ae226ef3-a7b3-4aa6-9caf-dff60a2e3f1e
2016-05-26 23:20:28 DEBUG FileBasedSink$LocalFileOperations:654 - Removing
file
/tmp/out/result-00001-of-00004-temp-4b19e860-2505-4aa9-b779-a2ae12b5c781
2016-05-26 23:20:28 DEBUG FileBasedSink$LocalFileOperations:654 - Removing
file
/tmp/out/result-00001-of-00004-temp-c9ac3c39-2812-4fc7-92f6-804717ab9dbd
2016-05-26 23:20:28 DEBUG FileBasedSink$LocalFileOperations:654 - Removing
file
/tmp/out/result-00001-of-00004-temp-68041765-c800-4116-82eb-2f3874b6d0c7
2016-05-26 23:20:28 DEBUG Write:234 - Done finalizing write operation
org.apache.beam.sdk.io.TextIO$TextSink$TextWriteOperation@ac61f9
2016-05-26 23:20:28 DEBUG ExecutorServiceParallelExecutor:431 - Pipeline is
finished. Shutting down. {}

Although the console prints 'Shutting down', it still hangs indefinitely
without back to the command prompt.

Thanks for help.


On 26 May 2016 at 04:53, Thomas Groh <[email protected]> wrote:

> Additionally, if you set the logging to Debug (with the same beam version)
> there should be two additional log lines printed for each Write operation,
> which would help significantly.
>
> On Wed, May 25, 2016 at 10:52 AM, Thomas Groh <[email protected]> wrote:
>
>> If you inspect the output result (at /tmp/out/result-...-of-00004), once
>> the pipeline appears to be hung, has the pipeline produced output? The logs
>> that claim that the write has been finalized suggest that the pipeline is
>> complete, and the failure is occurring between the underlying executor and
>> the InProcessPipelineResult, rather than the Pipeline execution. My initial
>> expectation is that this may be due to an issue between the
>> InProcessPipelineResult and the underlying executor rather than the
>> executor proper.
>>
>> When the call to `result.awaitCompletion()` is removed, you see the same
>> behavior, correct?
>>
>> On Wed, May 25, 2016 at 8:58 AM, David Olsen <[email protected]>
>> wrote:
>>
>>> I try word count with InProcessPipelineRunner and it basically works.
>>> But I am not sure if I use the correct way to stop the pipeline running. My
>>> code is at http://paste.debian.net/702925
>>>
>>> The execution prints the following messages and then seems to hang
>>> forever with thread not being terminated.
>>>
>>> 2016-05-25 22:37:12 INFO  Write:183 - Opening writer for write operation
>>> org.apache.beam.sdk.io.TextIO$TextSink$TextWriteOperation@1b1a53a
>>> 2016-05-25 22:37:12 INFO  Write:183 - Opening writer for write operation
>>> org.apache.beam.sdk.io.TextIO$TextSink$TextWriteOperation@40b01b
>>> 2016-05-25 22:37:16 INFO  Write:230 - Finalizing write operation
>>> org.apache.beam.sdk.io.TextIO$TextSink$TextWriteOperation@40b01b
>>> 2016-05-25 22:37:16 INFO  Write:230 - Finalizing write operation
>>> org.apache.beam.sdk.io.TextIO$TextSink$TextWriteOperation@1e9d9fe
>>> 2016-05-25 22:37:17 INFO  Write:230 - Finalizing write operation
>>> org.apache.beam.sdk.io.TextIO$TextSink$TextWriteOperation@1b1a53a
>>> 2016-05-25 22:37:17 INFO  Write:230 - Finalizing write operation
>>> org.apache.beam.sdk.io.TextIO$TextSink$TextWriteOperation@ff69ad
>>>
>>> Following code snippet is used to terminate the execution, but I don't
>>> know whether it's correct usage or not.
>>>
>>> final InProcessPipelineResult result = (InProcessPipelineResult)
>>> p.run();
>>> result.awaitCompletion();
>>>
>>> Also calling 'p.run();' only without obtaining result then
>>> awaitCompletion() has the same issue.
>>>
>>> beam version is git commit 78c8c528e23dba4f41e6ffc98b4c0d78bcea5c08
>>> scala 2.11.x
>>> sbt 0.13.x
>>>
>>> I appreciate any suggestions. Thanks.
>>>
>>
>>
>

Reply via email to