ngsg commented on PR #5771: URL: https://github.com/apache/hive/pull/5771#issuecomment-3017913821
@dengzhhu653 > My idea is to make the client thread-safe in each lawyer, so the HS2, Spark or any other engines don't need to take special caution on this. imaging thrift -> B -> A, if we only put the safety guard on A, it doesn't mean the thrift -> B is thread safe. Thanks for the explanation, I now understand what you are worrying about. My suggestion is to introduce a new layer (apologies if you meant lawyer) that guards all method invocations, just like the current logic in `HiveMetaStoreClient.newSynchronizedClient`. Although each proxy is marked as public, I intended each layer to be used internally, not exposed to the outside. (Proxies are marked as public because they are split between `standalone-metastore/metastore-common` and `ql`.) So I believe that users should access `HiveMetaStore` only via `HiveMetaStoreClient` or `SessionHiveMetaStoreClient`, not by directly composing proxies. Wrapping each layer with a guard would be a redundant check under this design, and only a single guard would be required in `HMSC` or `SHMSC`. I considered two other approaches: refactoring all layers to be thread-safe, and wrapping each layer with the existing logic. However, I think both are either too complex or redundant under the intended design as I mentioned above. Also, if we really need thread-safe proxy chains later, it would be sufficient to wrap each layer with the new synchronizing proxy. I would appreciate it if you could share your thoughts on my suggestion. Also, I'll update the patch as I suggested above in the next commit, so it would be helpful if you could review it. FYI, I quickly checked the usage of `HiveMetaStoreClient` in Spark. In fact, Spark may create `HiveMetaStoreClient` directly, not via `Hive.java`, which is possibly thread-unsafe. I didn't check whether Spark has an extra guard or lock for such a thread-unsafe client, but as you pointed out, it would be helpful if we make `HiveMetaStoreClient` thread-safe by default. FYI2, just for note. I noticed that there are some redundant synchronization logics, `SynchronizedMetaStoreClient` and `HiveMetaStoreClient.newSynchronizedClient`, in `Hive.java`. Maybe we could simplify them once we refactor the MSC families. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org