belugabehr commented on a change in pull request #807: URL: https://github.com/apache/hive/pull/807#discussion_r435273094
########## File path: serde/src/java/org/apache/hadoop/hive/serde2/avro/InstanceCache.java ########## @@ -51,20 +51,19 @@ public Instance retrieve(SeedObject hv) throws AvroSerdeException { * Retrieve (or create if it doesn't exist) the correct Instance for this * SeedObject using 'seenSchemas' to resolve circular references */ - public synchronized Instance retrieve(SeedObject hv, - Set<SeedObject> seenSchemas) throws AvroSerdeException { + public Instance retrieve(SeedObject hv, Set<SeedObject> seenSchemas) + throws AvroSerdeException { if(LOG.isDebugEnabled()) LOG.debug("Checking for hv: " + hv.toString()); if(cache.containsKey(hv)) { Review comment: OK. That makes sense. Thanks for running that down. If you have a schema composed of several sub-schemas, then storing the root schema in the cache will trigger the creation of all the sub-schemas and those will also be inserted into the cache while the the initial insert still has the lock. With that said, I don't think it's worth changing the current construct. Unless there is some sort of demonstrable performance value here, I think it's confusing to use a `ConcurrentHashMap` but not be able to use the `computeX()` routine. If you would like, you can clean up the logging, shrink the synchronized block (instead of the entire method), and make some simplifications like... ``` Instance instance = cache.get(hv); if (o == null) { instance = makeInstance(hv, seenSchemas); cache.put(hv, instance); } return instance; ``` In that way, the synchronized lock will be held for shorter intervals. Oh, and obviously need JavaDoc here to explain the reentrant piece. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org