[ https://issues.apache.org/jira/browse/HDFS-16891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17678009#comment-17678009 ]
ASF GitHub Bot commented on HDFS-16891:
---------------------------------------

sodonnel commented on PR #5300:
URL: https://github.com/apache/hadoop/pull/5300#issuecomment-1386198922

   I don't recall my reason for using a CopyOnWrite list, but the list is only used in the case of an exception, which will result in the image failing to load and the NN aborting, so it's an exception that we really don't expect to happen. Therefore, as it stands, the CopyOnWrite list has basically zero overhead. Even if there are exceptions, the total number of entries is equal to the number of parallel loading threads, so low tens of entries at the most.

   Using `Collections.synchronizedList` does seem simpler. Alternatively, synchronizing on the exceptions object rather than having a separate lock object probably makes sense to simplify this change further.

> Avoid the overhead of copy-on-write exception list while loading inodes sub sections in parallel
> ------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-16891
>                 URL: https://issues.apache.org/jira/browse/HDFS-16891
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>   Affects Versions: 3.3.4
>            Reporter: Viraj Jasani
>            Assignee: Viraj Jasani
>            Priority: Major
>              Labels: pull-request-available
>
> If we enable parallel loading and persisting of inodes from/to fs image, we
> get the benefit of improved performance. However, while loading sub-sections
> INODE_DIR_SUB and INODE_SUB, if we encounter any errors, we use a copy-on-write
> list to maintain the list of exceptions. Since our use case is not to iterate
> over this list while executor threads are adding new elements to the list,
> using copy-on-write is a bit of an overhead for this use case.
> It would be better to synchronize adding new elements to the list rather than
> having the list copy all elements over every time a new element is added to the
> list.
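For illustration only, here is a minimal sketch of the `Collections.synchronizedList` approach mentioned in the comment above. It is not the code from PR #5300: the class name `ExceptionListSketch`, the `loadSubSection` helper (which always fails here to simulate an error), and the thread count are all hypothetical. Worker threads add to a synchronized `ArrayList`, and the list is read only after every worker has finished, so there is no concurrent iteration and no per-add array copy.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ExceptionListSketch {

  // Hypothetical stand-in for loading one sub-section; it always fails
  // so that every worker records exactly one exception.
  static void loadSubSection(int id) throws IOException {
    throw new IOException("simulated failure in sub-section " + id);
  }

  static List<IOException> loadInParallel(int threads)
      throws InterruptedException {
    // Synchronized wrapper: each add() takes the list's lock, but the
    // backing array is not copied on every add as it would be with
    // CopyOnWriteArrayList.
    List<IOException> exceptions =
        Collections.synchronizedList(new ArrayList<>());
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    CountDownLatch latch = new CountDownLatch(threads);
    try {
      for (int i = 0; i < threads; i++) {
        final int id = i;
        pool.execute(() -> {
          try {
            loadSubSection(id);
          } catch (IOException e) {
            exceptions.add(e);
          } finally {
            latch.countDown();
          }
        });
      }
      // Wait for all workers; the list is only read after this point.
      latch.await();
    } finally {
      pool.shutdown();
    }
    return exceptions;
  }

  public static void main(String[] args) throws InterruptedException {
    List<IOException> errors = loadInParallel(8);
    System.out.println("collected " + errors.size() + " exceptions");
  }
}
```

With at most one entry per loader thread, either list type is cheap in practice; the synchronized wrapper mainly avoids needing a separate lock object.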