[ https://issues.apache.org/jira/browse/RANGER-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448815#comment-17448815 ]
Pavi Subenderan commented on RANGER-3442:
-----------------------------------------

[~dhavalshah9131] Here is the script used to create a key and then get the key's metadata: [^kms-key.py]

Make sure you update the "KMS_PROTOCOL", "KMS_HOST", and "KMS_PORT" variables in that Python script to match the protocol, host, and port used by your environment. (A rough Java sketch of the REST calls such a script makes is appended at the end of this message.)

We then combined the script with this loop to generate many keys:
{code:bash}
for year in {2010..2020}; do
  for month in {01..12}; do
    cat src_categorization_level_out | xargs -I {} ./kms-key.py -c v2_{}_$year$month -d "data categorization key" --commit
  done
done
{code}
"src_categorization_level_out" in the loop above is just a file with one key name per line, for example:
{noformat}
testkey1
privatekey
usernamekey
securitykey
laptopsec
category5
category6
cat7
{noformat}
Running this should create hundreds of keys very quickly.

> Ranger KMS DAO memory issues when many new keys are created
> ------------------------------------------------------------
>
>                 Key: RANGER-3442
>                 URL: https://issues.apache.org/jira/browse/RANGER-3442
>             Project: Ranger
>          Issue Type: Bug
>          Components: kms
>    Affects Versions: 2.0.0
>            Reporter: Pavi Subenderan
>            Priority: Major
>         Attachments: RANGER-3442-entity-manager-clear.patch, kms-key.py
>
>
> We have many keys in our KMS keystore, and recently we found that when we create new keys, the KMS instances quickly hit their memory limit.
> We can reproduce this with a script that calls the KMS createKey and then getMetadata operations for new keys in a loop. After a restart, memory usage of our instances is approximately 1.5 GB out of 8 GB, but after running this script for 1-5 minutes we get close to the 8 GB limit, and memory usage does not go back down after that.
> I took a heap dump and saw that most of the memory was being retained by XXRangerKeyStore and EclipseLink's EntityManagerImpl:
> * org.eclipse.persistence.internal.jpa.EntityManagerImpl
> * org.eclipse.persistence.internal.sessions.RepeatableWriteUnitOfWork
> The object type with the largest shallow size was char[], at over 4 GB.
>
> *My fix*
> I was ultimately able to solve this issue by adding a getEntityManager().clear() call in BaseDao.java's getAllKeys().
> With this fix in place, we can run as many KMS createKey / getMetadata calls as we want without any increase in memory usage whatsoever; it now stays constant below 1.7 GB.
> My understanding is that Ranger KMS keeps many thread-local EntityManager instances (160+ according to my heap dump), each of which held a cached copy of the getAllKeys results. Since we have so many keys in our KMS, this quickly pushed us to the memory limit.
> I'm not sure whether there are any drawbacks to clearing the EntityManager in BaseDao.getAllKeys(), but in our case performance is greatly improved since we are no longer constantly hitting the memory limit.
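For reference, since the attached kms-key.py is not reproduced here, below is a minimal sketch of the two calls such a script would make against the standard Hadoop KMS REST API that Ranger KMS exposes: creating a key and then reading its metadata back. The host, port, user, cipher, and key length are placeholder assumptions, not values taken from the attachment.
{code:java}
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class KmsKeySketch {

    // Placeholder endpoint and user -- adjust to your environment, like the
    // KMS_PROTOCOL/KMS_HOST/KMS_PORT variables in the attached Python script.
    static final String KMS_BASE = "http://kms-host.example.com:9292/kms/v1";
    static final String USER = "keyadmin"; // simple auth; a kerberized KMS needs SPNEGO instead

    public static void main(String[] args) throws Exception {
        String keyName = args.length > 0 ? args[0] : "testkey1";
        HttpClient client = HttpClient.newHttpClient();

        // POST /kms/v1/keys -- create the key (cipher/length are illustrative defaults)
        String createBody = "{\"name\":\"" + keyName + "\","
                + "\"cipher\":\"AES/CTR/NoPadding\",\"length\":128,"
                + "\"description\":\"data categorization key\"}";
        HttpRequest create = HttpRequest.newBuilder(URI.create(KMS_BASE + "/keys?user.name=" + USER))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(createBody))
                .build();
        HttpResponse<String> created = client.send(create, HttpResponse.BodyHandlers.ofString());
        System.out.println("create: " + created.statusCode() + " " + created.body());

        // GET /kms/v1/key/<name>/_metadata -- fetch the key's metadata
        HttpRequest metadata = HttpRequest.newBuilder(URI.create(KMS_BASE + "/key/" + keyName + "/_metadata?user.name=" + USER))
                .GET()
                .build();
        HttpResponse<String> meta = client.send(metadata, HttpResponse.BodyHandlers.ofString());
        System.out.println("metadata: " + meta.statusCode() + " " + meta.body());
    }
}
{code}
Looping over key names with this, as the bash loop above does with kms-key.py, reproduces the createKey/getMetadata traffic pattern that triggers the memory growth.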
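On the fix side, the description above mentions adding a getEntityManager().clear() call to getAllKeys() in BaseDao.java; the actual change is in the attached RANGER-3442-entity-manager-clear.patch. The snippet below is only a generic JPA sketch of that shape of change, with class and query names invented for illustration, not Ranger's code.
{code:java}
import java.util.List;
import javax.persistence.EntityManager;
import javax.persistence.TypedQuery;

// Illustrative DAO, not Ranger's BaseDao: load all key entities, then clear
// the persistence context so the loaded copies are no longer retained by the
// EntityManager's unit of work.
public abstract class KeyDaoSketch<T> {

    private final Class<T> entityClass;

    protected KeyDaoSketch(Class<T> entityClass) {
        this.entityClass = entityClass;
    }

    // Supplied by the surrounding DAO layer (per-thread in Ranger KMS, per the report above).
    protected abstract EntityManager getEntityManager();

    public List<T> getAllKeys() {
        EntityManager em = getEntityManager();
        TypedQuery<T> query = em.createQuery(
                "SELECT k FROM " + entityClass.getSimpleName() + " k", entityClass);
        List<T> keys = query.getResultList();

        // Without this, every loaded key stays cached in this EntityManager; with
        // many keys and many long-lived EntityManagers, that retained state is what
        // the heap dump showed filling the JVM.
        em.clear();

        return keys;
    }
}
{code}
The trade-off hinted at in the last paragraph of the description is that clear() detaches everything managed by that EntityManager, so it should only be safe where callers treat the results as read-only and the same persistence context has no pending, unflushed changes.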