[ https://issues.apache.org/jira/browse/IGNITE-8299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ivan Rakov updated IGNITE-8299:
-------------------------------
Description:
Ignite performance decreases significantly when the total size of local data is much greater than the amount of RAM. This can be explained by the change in disk access pattern (mixed random reads and random writes are hard even for SSDs), but analysis of the persistence code and JFR recordings shows there is still room for optimization.

The following possible optimizations should be investigated:
1) PageMemoryImpl.Segment#partGeneration allocates a GroupPartitionId on every HashMap.get - we can get rid of this allocation
2) LoadedPagesMap#getNearestAt is invoked at least 5 times in PageMemoryImpl.Segment#removePageForReplacement and performs two allocations per call - we can get rid of them
3) If one of the 5 eviction candidates turns out to be erroneous, we currently search for 5 new ones - we can reuse the remaining 4 instead

A JFR recording that highlights excessive CPU usage by the page replacement code is attached. See the 1st and 3rd entries in the "Hot Methods" section:

Stack Trace                                                                                                      Sample Count   Percentage(%)
org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(int, long, boolean)   4 963          19.73
scala.Some.equals(Object)                                                                                        4 932          19.606
java.util.HashMap.getNode(int, Object)                                                                           3 236          12.864

> Optimize allocations and CPU consumption in active page replacement scenario
> ----------------------------------------------------------------------------
>
>                 Key: IGNITE-8299
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8299
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Ivan Rakov
>            Assignee: Ivan Rakov
>            Priority: Major
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
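Item 1 of the list could look roughly like the following. This is a hypothetical sketch, not Ignite's actual code: the idea is to pack the (groupId, partId) pair into a single primitive long and key the generation map on that, so the hot-path lookup constructs no key object at all.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: avoid allocating a GroupPartitionId key object on
// every partGeneration lookup by packing (groupId, partId) into one long.
public class PartGenerations {
    // A real fix would use a primitive long-keyed map (no Long boxing);
    // a plain HashMap keeps this sketch short.
    private final Map<Long, Integer> generations = new HashMap<>();

    // Packs a (groupId, partId) pair into a single long key.
    static long pack(int grpId, int partId) {
        return ((long) grpId << 32) | (partId & 0xFFFF_FFFFL);
    }

    void setGeneration(int grpId, int partId, int gen) {
        generations.put(pack(grpId, partId), gen);
    }

    // Returns the stored generation, or 1 (the initial generation) if absent.
    // No key object is constructed here; only the packed long is computed.
    int generation(int grpId, int partId) {
        Integer gen = generations.get(pack(grpId, partId));
        return gen == null ? 1 : gen;
    }
}
```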
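Item 2 is the classic "reusable result holder" pattern: instead of the lookup allocating a fresh result object on each of its ~5 invocations, the caller passes one mutable holder that the hot loop reuses. A hedged sketch, where the array-backed table is only a stand-in for LoadedPagesMap:

```java
// Hypothetical sketch of an allocation-free nearest-entry lookup.
public class NearestLookup {
    // Mutable holder reused across lookups, filled in place of a new result.
    public static final class Entry {
        public long pageId;
        public long pointer;
    }

    private final long[] pageIds;
    private final long[] pointers;

    public NearestLookup(long[] pageIds, long[] pointers) {
        this.pageIds = pageIds;
        this.pointers = pointers;
    }

    // Writes the entry at index idx (wrapping around) into the
    // caller-provided holder - zero allocations per call.
    public void getNearestAt(int idx, Entry out) {
        int i = idx % pageIds.length;
        out.pageId = pageIds[i];
        out.pointer = pointers[i];
    }
}
```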
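Item 3 amounts to refilling only the slots whose candidate failed validation, instead of discarding all 5 and sampling again. In this hypothetical sketch, PageSampler stands in for whatever candidate sampling and validation the real replacement code performs:

```java
// Hypothetical sketch: when one sampled eviction candidate turns out to be
// unusable, resample only that slot and keep the surviving candidates.
public class CandidateReuse {
    public interface PageSampler {
        long samplePage();            // pick a random loaded page
        boolean isUsable(long page);  // false for pinned/erroneous pages
    }

    // Ensures every slot of cand holds a usable candidate, resampling only
    // the slots that are currently unusable. Returns the number of sampling
    // calls made, to show how much work reuse saves.
    public static int refill(long[] cand, PageSampler s) {
        int samples = 0;
        for (int i = 0; i < cand.length; i++) {
            // Surviving candidates are kept as-is; only bad slots refill.
            while (!s.isUsable(cand[i])) {
                cand[i] = s.samplePage();
                samples++;
            }
        }
        return samples;
    }
}
```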