oliverw1 opened a new issue, #13863:
URL: https://github.com/apache/iceberg/issues/13863

   ### Apache Iceberg version
   
   1.9.1
   
   ### Query engine
   
   Spark
   
   ### Please describe the bug 🐞
   
   A PySpark Structured Streaming job that reads from an Iceberg source (stored 
on AWS S3) eventually crashes after some time with a "Connection pool shut 
down" error, as shown in the attached stacktrace. 
   
   The job in question receives only one row per minute on average. 
Increasing `spark.sql.catalog.iceberg.s3.retry.max-wait-ms` to 65000 
delayed the failure, but it is only a workaround: the job still eventually 
fails (stacktrace2 has almost the same content).
   
   Switching `spark.sql.catalog.iceberg.io-impl` from 
`org.apache.iceberg.aws.s3.S3FileIO` to 
`org.apache.iceberg.hadoop.HadoopFileIO` has let the job run at least 20x as 
long without failing so far, so we assume the problem is specific to `S3FileIO`.
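
   For reference, here is a minimal sketch of the catalog configuration involved. The catalog name `iceberg` and the property values come from the description above; the catalog implementation class and the idea of collecting the settings in a dict before applying them are assumptions for illustration:

   ```python
   # Catalog settings involved in this report, collected in a plain dict so they
   # can be applied via SparkSession.builder.config(...) or spark-submit --conf.
   # The catalog name "iceberg" matches the property keys in the description.
   conf = {
       "spark.sql.catalog.iceberg": "org.apache.iceberg.spark.SparkCatalog",
       # Workaround 1: raise the S3 retry max wait; delays but does not
       # prevent the "Connection pool shut down" failure.
       "spark.sql.catalog.iceberg.s3.retry.max-wait-ms": "65000",
       # Workaround 2: use HadoopFileIO instead of the default S3FileIO;
       # with this setting the job has not failed yet.
       "spark.sql.catalog.iceberg.io-impl": "org.apache.iceberg.hadoop.HadoopFileIO",
   }

   # Applying the settings (sketch):
   # builder = SparkSession.builder
   # for key, value in conf.items():
   #     builder = builder.config(key, value)
   ```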
   
   Iceberg version: 1.9.1 (iceberg-spark-runtime-3.5_2.12-1.9.1.jar).
   
   I've looked at many of the previous reports of "Connection pool shut 
down", but most of them are older and appear to have been fixed. 
   
   
[stacktrace.txt](https://github.com/user-attachments/files/21853378/stacktrace.txt)
   
[stacktrace2.txt](https://github.com/user-attachments/files/21853526/stacktrace2.txt)
   
   
   ### Willingness to contribute
   
   - [ ] I can contribute a fix for this bug independently
   - [ ] I would be willing to contribute a fix for this bug with guidance from 
the Iceberg community
   - [x] I cannot contribute a fix for this bug at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
