mdlnr commented on PR #23234:
URL: https://github.com/apache/beam/pull/23234#issuecomment-1305416050

   > This PR causes the BigQuery client to hang on shutdown.
   
   @reuvenlax We have observed threads hanging in `StreamWriter.close` in our 
Dataflow jobs with Beam version 1.41.0, so this PR may not cause the problem 
but only make it more visible.
   
   In our pipelines we see a continuously increasing number of threads with 
stack traces like this one:
   
   ```
   "pool-3-thread-61" #1767 prio=5 os_prio=0 cpu=0.22ms elapsed=337556.83s tid=0x00007f2d982459d0 nid=0x6f7 in Object.wait()  [0x00007f2d886f1000]
      java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait([email protected]/Native Method)
        - waiting on <no object reference available>
        at java.lang.Thread.join([email protected]/Thread.java:1304)
        - locked <merged>(a java.lang.Thread)
        at java.lang.Thread.join([email protected]/Thread.java:1372)
        at com.google.cloud.bigquery.storage.v1.StreamWriter.close(StreamWriter.java:369)
        at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl$1.close(BigQueryServicesImpl.java:1339)
        at org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$$Lambda$692/0x00000008015dd4c8.run(Unknown Source)
        at org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords.lambda$runAsyncIgnoreFailure$1(StorageApiWritesShardedRecords.java:138)
        at org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$$Lambda$685/0x00000008015d6b58.run(Unknown Source)
        at java.util.concurrent.Executors$RunnableAdapter.call([email protected]/Executors.java:539)
        at java.util.concurrent.FutureTask.run([email protected]/FutureTask.java:264)
        at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1136)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:635)
        at java.lang.Thread.run([email protected]/Thread.java:833)
   ```
   
   We think this is related to using a fairly old version of the dependency 
`com.google.cloud:google-cloud-bigquerystorage:2.12.2`. The `StreamWriter` 
class has changed considerably in later releases of the library, which might 
fix this issue. Is there anything that prevents updating to a newer release?


