Oleg Skovpen created BEAM-5387:
----------------------------------

             Summary: Processing stuck in StreamingFn
                 Key: BEAM-5387
                 URL: https://issues.apache.org/jira/browse/BEAM-5387
             Project: Beam
          Issue Type: Bug
          Components: io-java-gcp
    Affects Versions: 2.6.0
         Environment: GCP Dataflow, Scio SDK (version 0.6.0), Apache Beam (version 2.5.6)
            Reporter: Oleg Skovpen
            Assignee: Chamikara Jayalath


Sometimes, after a few "java.net.SocketTimeoutException: Read timed out" errors (1), 
we get this message in the logs: "Processing stuck in step 
saveAsTypedBigQuery$extension@{RawJob.scala:57}1/StreamingInserts/StreamingWriteTables/StreamingWrite 
for at least 5m00s without outputting or completing in state finish" (2).

After this error, the threads on which it occurred run for a while and then 
stop writing any information to the logs.

The Dataflow job also continues to run, but with significantly reduced throughput.

The job on which this problem reproduces uses Dynamic Destinations and writes to 
dozens of BigQuery tables.
Other jobs that write to a single table do not show this problem.

Could this be due to some internal deadlock inside Beam?

Full stack trace (1) (logger "com.google.api.client.http.HttpTransport"):
{code:java}
exception: "java.net.SocketTimeoutException: Read timed out
at org.conscrypt.NativeCrypto.SSL_read(Native Method)
at org.conscrypt.NativeSsl.read(NativeSsl.java:416)
at org.conscrypt.ConscryptFileDescriptorSocket$SSLInputStream.read(ConscryptFileDescriptorSocket.java:547)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:704)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1536)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:338)
at com.google.api.client.http.javanet.NetHttpResponse.<init>(NetHttpResponse.java:37)
at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:105)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:981)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.lambda$insertAll$0(BigQueryServicesImpl.java:724)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
"{code}
Full stack trace (2) (logger "com.google.cloud.dataflow.worker.DataflowOperationContext"):
{code:java}
"Processing stuck in step saveAsTypedBigQuery$extension@{RawJob.scala:57}1/StreamingInserts/StreamingWriteTables/StreamingWrite for at least 15m00s without outputting or completing in state finish
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
at java.util.concurrent.FutureTask.get(FutureTask.java:191)
at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:752)
at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:813)
at org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.flushRows(StreamingWriteFn.java:122)
at org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.finishBundle(StreamingWriteFn.java:94)
at org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn$DoFnInvoker.invokeFinishBundle(Unknown Source)
"{code}
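
For illustration only: trace (2) shows the bundle thread parked in FutureTask.get() with no timeout, waiting on a task submitted by insertAll. The sketch below is not Beam code and does not claim to identify the root cause; it only shows, under stated assumptions, how an untimed wait on a future can block for as long as the task stays stalled, and how a bounded wait would surface the stall instead. The class BoundedWaitSketch, the methods hangingCall, awaitWithTimeout and demo, and the 200 ms timeout are all hypothetical names and values chosen for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BoundedWaitSketch {

    /** Hypothetical stand-in for a remote call that stalls and never returns. */
    static String hangingCall(CountDownLatch release) throws InterruptedException {
        release.await(); // blocks until released; here it is never released
        return "done";
    }

    /**
     * Waits for the task for at most timeoutMillis. Returns the task's result,
     * or "timed out" after cancelling the task if the wait expires.
     */
    static String awaitWithTimeout(Future<String> future, long timeoutMillis)
            throws InterruptedException, ExecutionException {
        try {
            // An untimed future.get() here would park the caller indefinitely.
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // requests interruption of the worker thread
            return "timed out";
        }
    }

    /** Submits the stalled task and waits for it with a 200 ms bound. */
    static String demo() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch never = new CountDownLatch(1);
        try {
            Callable<String> task = () -> hangingCall(never);
            Future<String> future = executor.submit(task);
            return awaitWithTimeout(future, 200);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "timed out"
    }
}
```

In this sketch the stalled task waits on a latch, so interruption unblocks it. A thread blocked in a socket read, as in trace (1), may not respond to interruption the same way, so a bounded wait would likely need to be paired with transport-level timeouts.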



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)