Claudenw commented on PR #13357: URL: https://github.com/apache/iceberg/pull/13357#issuecomment-3001179448
This is not a fix for the race condition in JdbcCatalog. It is a fix to avoid the problem in other connectors. It also gives the connector the opportunity to do expensive initialization once, before each task attempts to do the initialization itself.

For example, even with the fixed JdbcCatalog, without this change each task in the JdbcCatalog may attempt the expensive table creation. So if the connector attempts to start 50 threads, it is possible that there will be 50 attempts to create tables, followed by 49 calls to see if the tables were created.

I am certain that there are other catalogs that could implement a costly configuration on the first call, and that would be faster if the first attempt was made before a number of tasks were started. I think that in most cases this will be more efficient, and in cases where it is not, the performance degradation is minimal.

LinkedIn: http://www.linkedin.com/in/claudewarren

On Tue 24 Jun 2025, 17:01 Bryan Keller, ***@***.***> wrote:

> *bryanck* left a comment (apache/iceberg#13357)
> <https://github.com/apache/iceberg/pull/13357#issuecomment-3001068917>
>
> I feel this is a workaround that doesn't address the core issue in
> JdbcCatalog. When multiple workers are launched, each loads the catalog.
> The JdbcCatalog will try to initialize its schema if it hasn't been, and
> this causes the race condition, as multiple workers try to initialize the
> schema concurrently. JdbcCatalog should handle this case, as this can
> happen in any concurrent scenario. For example, two different connectors
> could load the same JdbcCatalog, and this fix won't help.
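To illustrate the point about initializing once before the tasks start, here is a minimal Java sketch. Nothing in it is Iceberg or Kafka Connect code: `PreInitSketch`, `FakeBackend`, and `run` are hypothetical names, and the "table creation" is only a counter. It assumes a catalog whose initialization is "check whether the tables exist, create them if not", with no locking between the check and the create.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class PreInitSketch {

  /** Hypothetical stand-in for a catalog backend; not an Iceberg class. */
  static final class FakeBackend {
    final AtomicBoolean tablesExist = new AtomicBoolean(false);
    final AtomicInteger createAttempts = new AtomicInteger();
    final AtomicInteger existenceChecks = new AtomicInteger();

    /** Unsynchronized check-then-create, so concurrent callers can all attempt the create. */
    void initialize() {
      existenceChecks.incrementAndGet();
      if (!tablesExist.get()) {
        createAttempts.incrementAndGet(); // stands in for the expensive table creation
        tablesExist.set(true);
      }
    }
  }

  /** Starts taskCount tasks that each initialize the backend, optionally after one up-front call. */
  static FakeBackend run(int taskCount, boolean preInitialize) {
    FakeBackend backend = new FakeBackend();
    if (preInitialize) {
      backend.initialize(); // done once by the "connector" before any task exists
    }
    ExecutorService pool = Executors.newFixedThreadPool(taskCount);
    CountDownLatch start = new CountDownLatch(1);
    for (int i = 0; i < taskCount; i++) {
      pool.submit(() -> {
        start.await(); // release all tasks together to provoke contention
        backend.initialize();
        return null;
      });
    }
    start.countDown();
    pool.shutdown();
    try {
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("tasks did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return backend;
  }

  public static void main(String[] args) {
    FakeBackend cold = run(50, false);
    FakeBackend warm = run(50, true);
    System.out.println("no pre-init:   create attempts = " + cold.createAttempts.get());
    System.out.println("with pre-init: create attempts = " + warm.createAttempts.get());
  }
}
```

With the up-front call, the create is attempted exactly once and the 50 tasks only perform the existence check. Without it, the number of create attempts depends on thread timing and can be anywhere from 1 to 50.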
