[ https://issues.apache.org/jira/browse/HDFS-12225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16130299#comment-16130299 ]
Rakesh R commented on HDFS-12225: --------------------------------- Good work [~surendrasingh]. Adding few comments on the patch, please take a look at it. # Please remove unused variable in {{BlockManager.java}} {code} /** * Whether HA is enabled. */ private final boolean haEnabled; {code} # Please add log message to know the thread exit path, would be helpful to admins. # Name the thread -> {{new Daemon(new PendingSPSTaskScanner())}} # Rehrase {{// Maybe file got deleted, from the queue}} to {{// File doesn't exists (maybe got deleted), remove trackId from the queue}} # Typo: {{stratify the policy.}} -> {{satisfy the policy.}} # Would be good to unify the names for better code maintenance. So far, used trackId or trackinfo to represent the storage movement needed item. How about renaming {{Candidate}} class, we could use {{StorageMovementTrackInfo}} or {{SatisfyTrackInfo}} or any better name. Also, IMHO to use {{isDir}} bool flag to classify dir/file and make it explicit. {code} static class SatisfyTrackInfo { private final Long trackId; // Value will be 0 for a file path. Value will be >= 0 for a dir path. private final Long childCount; // true represents dir path, false otherwise. private final boolean isDir; } {code} # Please ensure the {{pendingWorkForDirectory}} is cleaned up during {{SPS#postBlkStorageMovementCleanup}}. # Please take care and update existing javadocs, below are few occurrences: {code} * @param id * - file block collection id. */ public void satisfyStoragePolicy(Long inodeId, List<Long> candidates) { * @param blockCollectionID * - tracking id / block collection id * @param allBlockLocsAttemptedToSatisfy * - failed to find matching target nodes to satisfy storage type for * all the block locations of the given blockCollectionID */ public void add(Candidate candidate, boolean allBlockLocsAttemptedToSatisfy) { private ItemInfo(long lastAttemptedOrReportedTime, Long parentId, boolean allBlockLocsAttemptedToSatisfy) { {code} # Make unit test more stable by replacing constant sleeping {{Thread.sleep(6000);}} by sliced sleeping of 250/100 millis in a loop and recheck for null. Maybe, you could try {{DFSTestUtil.waitForXattrRemoved()}} and wait for 10secs. > [SPS]: Optimize extended attributes for tracking SPS movements > -------------------------------------------------------------- > > Key: HDFS-12225 > URL: https://issues.apache.org/jira/browse/HDFS-12225 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, namenode > Reporter: Uma Maheswara Rao G > Assignee: Surendra Singh Lilhore > Attachments: HDFS-12225-HDFS-10285-01.patch, > HDFS-12225-HDFS-10285-02.patch > > > We have discussed to optimize number extended attributes and asked to report > separate JIRA while implementing [HDFS-11150 | > https://issues.apache.org/jira/browse/HDFS-11150?focusedCommentId=15766127&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15766127] > This is the JIRA to track that work > For the context, comment copied from HDFS-11150 > {quote} > [~yuanbo] wrote : I've tried that before. There is an issue here if we only > mark the directory. When recovering from FsImage, the InodeMap isn't built > up, so we don't know the sub-inode of a given inode, in the end, We cannot > add these inodes to movement queue in FSDirectory#addToInodeMap, any > thoughts?{quote} > {quote} > [~umamaheswararao] wrote: I got what you are saying. Ok for simplicity we can > add for all Inodes now. For this to handle 100%, we may need intermittent > processing, like first we should add them to some intermittentList while > loading fsImage, once fully loaded and when starting active services, we > should process that list and do required stuff. But it would add some > additional complexity may be. Let's do with all file inodes now and we can > revisit later if it is really creating issues. How about you raise a JIRA for > it and think to optimize separately? > {quote} > {quote} > [~andrew.wang] wrote in HDFS-10285 merge time review comment : HDFS-10899 > also the cursor of the iterator in the EZ root xattr to track progress and > handle restarts. I wonder if we can do something similar here to avoid having > an xattr-per-file being moved. > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org