[ 
https://issues.apache.org/jira/browse/BEAM-9651?focusedWorklogId=419849&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-419849
 ]

ASF GitHub Bot logged work on BEAM-9651:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 09/Apr/20 22:39
            Start Date: 09/Apr/20 22:39
    Worklog Time Spent: 10m 
      Work Description: reuvenlax commented on issue #11364: [BEAM-9651] 
Prevent StreamPool and stream initialization livelock
URL: https://github.com/apache/beam/pull/11364#issuecomment-611787817
 
 
   run dataflow validatesrunner
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 419849)
    Time Spent: 1h  (was: 50m)

> StreamingDataflowWorker stuck waiting for 
> org.apache.beam.runners.dataflow.worker.windmill.DirectStreamObserver.onNext
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-9651
>                 URL: https://issues.apache.org/jira/browse/BEAM-9651
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>            Reporter: Sam Whittle
>            Assignee: Sam Whittle
>            Priority: Major
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Operation ongoing in step <redacted> for at least 28h10m00s without
> outputting or completing in state windmill-read
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at java.util.concurrent.Phaser$QNode.block(Phaser.java:1140)
>   at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
>   at java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
>   at java.util.concurrent.Phaser.awaitAdvanceInterruptibly(Phaser.java:758)
>   at org.apache.beam.runners.dataflow.worker.windmill.DirectStreamObserver.onNext(DirectStreamObserver.java:49)
>   at org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer$AbstractWindmillStream.send(GrpcWindmillServer.java:615)
>   at org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer$GrpcGetDataStream.onNewStream(GrpcWindmillServer.java:946)
>   at org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer$AbstractWindmillStream.startStream(GrpcWindmillServer.java:628)
>   at org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer$GrpcGetDataStream.<init>(GrpcWindmillServer.java:941)
>   at org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer.getDataStream(GrpcWindmillServer.java:506)
>   at org.apache.beam.runners.dataflow.worker.MetricTrackingWindmillServerStub$$Lambda$129/665137804.get(Unknown Source)
>   at org.apache.beam.runners.dataflow.worker.windmill.WindmillServerStub$StreamPool$StreamData.<init>(WindmillServerStub.java:159)
>   at org.apache.beam.runners.dataflow.worker.windmill.WindmillServerStub$StreamPool$StreamData.<init>(WindmillServerStub.java:158)
>   at org.apache.beam.runners.dataflow.worker.windmill.WindmillServerStub$StreamPool.getStream(WindmillServerStub.java:191)
>   at org.apache.beam.runners.dataflow.worker.MetricTrackingWindmillServerStub.getStateData(MetricTrackingWindmillServerStub.java:199)
>   at org.apache.beam.runners.dataflow.worker.WindmillStateReader.startBatchAndBlock(WindmillStateReader.java:433)
>   at org.apache.beam.runners.dataflow.worker.WindmillStateReader$WrappedFuture.get(WindmillStateReader.java:328)
>   at org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillValue.read(WindmillStateInternals.java:389)
>   at <redacted>
>
> Because the stream is started inside a synchronized block of StreamPool, all
> other threads that try to get or release streams through StreamPool block as
> well. It is unclear whether the stream never became usable and therefore
> blocked forever, or whether a race in the use of the Phaser causes the hang.
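
The locking pattern described above can be illustrated with a minimal sketch. It is not Beam's StreamPool and not necessarily what the linked pull request does; the class NonBlockingInitPool and its methods are hypothetical names chosen for illustration. The idea shown is to hold the pool lock only for bookkeeping and to run the potentially blocking stream creation outside it, so a stalled initialization delays only the calling thread.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

/** Hypothetical pool (not Beam's StreamPool): the lock guards only the idle queue. */
final class NonBlockingInitPool<S> {
  private final Deque<S> idle = new ArrayDeque<>();
  // Assumed to be slow or blocking, e.g. while a new stream sends its first message.
  private final Supplier<S> factory;

  NonBlockingInitPool(Supplier<S> factory) {
    this.factory = factory;
  }

  /** Returns an idle stream if one exists, otherwise creates one outside the lock. */
  S acquire() {
    synchronized (this) {
      S cached = idle.pollFirst();
      if (cached != null) {
        return cached;
      }
    }
    // No lock is held here, so other threads can still call acquire() or release().
    return factory.get();
  }

  /** Returns a stream to the pool; holds the lock only for the queue update. */
  synchronized void release(S stream) {
    idle.addFirst(stream);
  }

  synchronized int idleCount() {
    return idle.size();
  }
}
```

With this shape, a thread stuck inside factory.get() holds no pool lock, so the other threads described in the report would not be blocked by it. The trade-off is that the pool no longer bounds the number of streams being created concurrently; a real implementation would need its own policy for that.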



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
