ngsg commented on PR #5771:
URL: https://github.com/apache/hive/pull/5771#issuecomment-3017913821

   @dengzhhu653 
   
   > My idea is to make the client thread-safe in each lawyer, so the HS2, 
Spark or any other engines don't need to take special caution on this. imaging 
thrift -> B -> A, if we only put the safety guard on A, it doesn't mean the 
thrift -> B is thread safe.
   
   Thanks for the explanation, I now understand what you are worrying about.
   
   My suggestion is to introduce a new layer (apologies if you meant lawyer) 
that guards all method invocations, just like the current logic in 
`HiveMetaStoreClient.newSynchronizedClient`. Although each proxy is marked as 
public, I intended each layer to be used internally, not exposed to the 
outside. (Proxies are marked as public because they are split between 
`standalone-metastore/metastore-common` and `ql`.) So I believe that users 
should access `HiveMetaStore` only via `HiveMetaStoreClient` or 
`SessionHiveMetaStoreClient`, not by directly composing proxies. Wrapping each 
layer with a guard would be a redundant check under this design, and only a 
single guard would be required in `HMSC` or `SHMSC`.
   
   I considered two other approaches: refactoring all layers to be thread-safe, 
and wrapping each layer with the existing logic. However, I think both are 
either too complex or redundant under the intended design as I mentioned above. 
Also, if we really need thread-safe proxy chains later, it would be sufficient 
to wrap each layer with the new synchronizing proxy.
   
   I would appreciate it if you could share your thoughts on my suggestion. 
Also, I'll update the patch as I suggested above in the next commit, so it 
would be helpful if you could review it.
   
   FYI, I quickly checked the usage of `HiveMetaStoreClient` in Spark. In fact, 
Spark may create `HiveMetaStoreClient` directly, not via `Hive.java`, which is 
possibly thread-unsafe. I didn't check whether Spark has an extra guard or lock 
for such a thread-unsafe client, but as you pointed out, it would be helpful if 
we make `HiveMetaStoreClient` thread-safe by default.
   
   FYI2, just for note. I noticed that there are some redundant synchronization 
logics, `SynchronizedMetaStoreClient` and 
`HiveMetaStoreClient.newSynchronizedClient`, in `Hive.java`. Maybe we could 
simplify them once we refactor the MSC families.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org
For additional commands, e-mail: gitbox-h...@hive.apache.org

Reply via email to