[ https://issues.apache.org/jira/browse/HDDS-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Glen Geng updated HDDS-4186:
----------------------------
    Summary: Adjust RetryPolicy of SCMConnectionManager  (was: CLONE - Improve performance of the BufferPool management of Ozone client)

> Adjust RetryPolicy of SCMConnectionManager
> ------------------------------------------
>
>                 Key: HDDS-4186
>                 URL: https://issues.apache.org/jira/browse/HDDS-4186
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>            Reporter: Glen Geng
>            Assignee: Glen Geng
>            Priority: Blocker
>              Labels: pull-request-available
>
> Teragen was reported to be slow with a low number of mappers compared to HDFS.
> In my test (one pipeline, 3 YARN nodes), a 10 GB teragen took ~3 minutes with HDFS
> but 6 minutes with Ozone. This could be fixed by using more mappers, but
> when I investigated the execution I found a few problems regarding the
> BufferPool management.
> 1. IncrementalChunkBuffer is slow, and it might not be required because BufferPool
> itself is incremental.
> 2. For each write operation, bufferPool.allocateBufferIfNeeded is called,
> which can be a slow operation (positions have to be calculated).
> 3. There is no explicit support for write(byte) operations.
> In the flamegraph it is clearly visible that, with a low number of mappers, the
> client is busy with buffer operations. After the patch, the RPC call and the
> checksum calculation account for the majority of the time.