[ https://issues.apache.org/jira/browse/HDFS-729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771715#action_12771715 ]
dhruba borthakur commented on HDFS-729:
---------------------------------------

I am planning to follow Raghu's advice and add the following API to the namenode:

{quote}
/**
 * Returns a list of files that are corrupted.
 * <p>
 * Returns a list of files that each have at least one block with no valid
 * replicas. The returned list contains at most numExpectedFiles files. If the
 * number of files returned is smaller than numExpectedFiles, then no more
 * corrupted files remain in the system. Listing starts at the
 * startingNumber-th corrupted file in the system.
 *
 * @param numExpectedFiles the maximum number of files to be returned
 * @param startingNumber list files from the startingNumber-th to the
 *                       (startingNumber + numExpectedFiles)-th entry in the
 *                       list of corrupted files
 * @throws AccessControlException if the superuser privilege is violated.
 * @throws IOException if unable to retrieve information about a corrupt file
 */
public LocatedBlocks[] getCorruptFiles(int numExpectedFiles, int startingNumber) throws IOException;
{quote}

This will be used by fsck (or any other application) to quickly detect corrupted files.

> fsck option to list only corrupted files
> ----------------------------------------
>
>                 Key: HDFS-729
>                 URL: https://issues.apache.org/jira/browse/HDFS-729
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>
> An option to fsck to list only corrupted files will be very helpful for
> frequent monitoring.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
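The paging contract in the proposed javadoc (ask for a batch; a short batch means no more corrupted files) can be sketched as below. This is an illustrative stand-in only: NameNodeStub, its backing list of paths, and the use of String instead of LocatedBlocks are assumptions for the sketch, not real HDFS code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CorruptFileLister {

    // Hypothetical stand-in for the namenode side of the proposed call.
    static class NameNodeStub {
        private final List<String> corruptFiles;

        NameNodeStub(List<String> corruptFiles) {
            this.corruptFiles = corruptFiles;
        }

        // Mirrors the proposed signature: return at most numExpectedFiles
        // entries starting at startingNumber (0-based here for simplicity).
        String[] getCorruptFiles(int numExpectedFiles, int startingNumber) {
            int from = Math.min(startingNumber, corruptFiles.size());
            int to = Math.min(from + numExpectedFiles, corruptFiles.size());
            return corruptFiles.subList(from, to).toArray(new String[0]);
        }
    }

    // Client-side loop, as fsck might use it: keep asking until a batch
    // comes back short, which per the javadoc means no more corrupted
    // files are available.
    static List<String> listAllCorrupt(NameNodeStub nn, int batchSize) {
        List<String> all = new ArrayList<>();
        int start = 0;
        while (true) {
            String[] batch = nn.getCorruptFiles(batchSize, start);
            all.addAll(Arrays.asList(batch));
            if (batch.length < batchSize) {
                break;
            }
            start += batch.length;
        }
        return all;
    }

    public static void main(String[] args) {
        NameNodeStub nn = new NameNodeStub(Arrays.asList(
                "/data/a", "/data/b", "/logs/c", "/tmp/d", "/tmp/e"));
        List<String> corrupt = listAllCorrupt(nn, 2);
        // Three calls are made (batches of 2, 2, 1); the final short
        // batch terminates the loop.
        System.out.println(corrupt.size());   // 5
    }
}
```

A real fsck integration would page with a larger batch size and print each returned entry instead of accumulating them, but the termination condition is the same.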