[ https://issues.apache.org/jira/browse/HDFS-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shilun Fan updated HDFS-12667:
------------------------------
    Target Version/s: 3.5.0  (was: 3.4.0)

> KMSClientProvider#ValueQueue does synchronous fetch of edeks in background async thread.
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-12667
>                 URL: https://issues.apache.org/jira/browse/HDFS-12667
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: encryption, kms
>    Affects Versions: 3.0.0-alpha4
>            Reporter: Rushabh Shah
>            Assignee: Rushabh Shah
>            Priority: Major
>         Attachments: HDFS-12667-001.patch, HDFS-12667-002.patch
>
>
> There are a couple of issues in KMSClientProvider#ValueQueue.
> 1.
> {code:title=ValueQueue.java|borderStyle=solid}
>   private final LoadingCache<String, LinkedBlockingQueue<E>> keyQueues;
>   // Striped rwlocks based on key name to synchronize the queue from
>   // the sync'ed rw-thread and the background async refill thread.
>   private final List<ReadWriteLock> lockArray =
>       new ArrayList<>(LOCK_ARRAY_SIZE);
> {code}
> It hashes the key name into 16 buckets.
> In the code chunk below,
> {code:title=ValueQueue.java|borderStyle=solid}
>   public List<E> getAtMost(String keyName, int num) throws IOException,
>       ExecutionException {
>     ...
>     ...
>     readLock(keyName);
>     E val = keyQueue.poll();
>     readUnlock(keyName);
>     ...
>   }
>   private void submitRefillTask(final String keyName,
>       final Queue<E> keyQueue) throws InterruptedException {
>     ...
>     ...
>     // It holds the write lock while the key is being asynchronously
>     // fetched, so the read requests for all the keys that hash to this
>     // bucket will essentially be blocked.
>     writeLock(keyName);
>     try {
>       if (keyQueue.size() < threshold && !isCanceled()) {
>         refiller.fillQueueForKey(name, keyQueue,
>             cacheSize - keyQueue.size());
>       }
>       ...
>     } finally {
>       writeUnlock(keyName);
>     }
>   }
> }
> {code}
> According to the above code chunk, if two keys (let's say key1 and key2) hash to
> the same bucket (between 1 and 16), then while key1 is being asynchronously
> refetched, all getKey calls for key2 will be blocked.
> 2. Due to the striped rw locks, the asynchronous refill of keys is now
> effectively synchronous with respect to the other handler threads.
> I understand that the locks were added so that we don't kick off multiple
> asynchronous refill threads for the same key.
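
The blocking described in point 1 can be reproduced in isolation. The following is a minimal sketch, not the actual ValueQueue code: the class name StripedLockDemo, the indexFor bucket function (hashCode modulo 16), and the helper methods are all assumptions made for illustration. It holds a bucket's write lock on one thread, as a refill would, and checks whether a reader of a different key can take its read lock in the meantime.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the lock striping described in the issue.
class StripedLockDemo {
  static final int LOCK_ARRAY_SIZE = 16;
  private final List<ReadWriteLock> lockArray = new ArrayList<>(LOCK_ARRAY_SIZE);

  StripedLockDemo() {
    for (int i = 0; i < LOCK_ARRAY_SIZE; i++) {
      lockArray.add(new ReentrantReadWriteLock());
    }
  }

  // Assumed bucket function; the real ValueQueue may compute the index differently.
  static int indexFor(String keyName) {
    return Math.floorMod(keyName.hashCode(), LOCK_ARRAY_SIZE);
  }

  // Searches generated names until one lands in the same bucket as keyName.
  static String findCollidingKey(String keyName) {
    for (int i = 0; ; i++) {
      String candidate = "other" + i;
      if (indexFor(candidate) == indexFor(keyName)) {
        return candidate;
      }
    }
  }

  // Holds the write lock for refillKey's bucket on the calling thread (simulating
  // a refill in progress) and reports whether a second thread can acquire the
  // read lock for readKey's bucket within 100 ms.
  boolean canReadDuringRefill(String refillKey, String readKey) throws Exception {
    Lock refillLock = lockArray.get(indexFor(refillKey)).writeLock();
    refillLock.lock();
    ExecutorService reader = Executors.newSingleThreadExecutor();
    try {
      return reader.submit(() -> {
        Lock readLock = lockArray.get(indexFor(readKey)).readLock();
        boolean acquired = readLock.tryLock(100, TimeUnit.MILLISECONDS);
        if (acquired) {
          readLock.unlock();
        }
        return acquired;
      }).get();
    } finally {
      reader.shutdown();
      refillLock.unlock();
    }
  }
}
```

Under these assumptions, canReadDuringRefill("key1", StripedLockDemo.findCollidingKey("key1")) returns false, because both keys share one lock, while canReadDuringRefill("key1", "key2") returns true, because the two hash codes differ by one and therefore fall in different buckets.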