[
https://issues.apache.org/jira/browse/HDFS-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893537#action_12893537
]
Sriram Rao commented on HDFS-1111:
----------------------------------
After further discussions with Konstantin, a way to address this problem with
the API is to have the client provide the starting block id:
1. As long as there are corrupt blocks, have fsck return pairs of the form
<block id>, <pathname>.
2. On each subsequent call, the client passes back the last corrupt block id
it received; fsck then uses that block id as the starting point for the next
list.
3. This process iterates until there are no more corrupt blocks, at which
point fsck returns "There are no more corrupt blocks".
This is similar in spirit to getListing().
I'll provide a patch which also addresses the synchronization problem that
Konstantin is referring to.
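To make the iteration concrete, here is a minimal client-side sketch of the
proposed cursor-style protocol. The names below (Namenode, listCorruptBlocks,
CorruptBlock) are illustrative assumptions for this sketch, not the actual
HDFS interface:

    import java.util.ArrayList;
    import java.util.List;

    public class CorruptBlockScanner {

      /** One <block id, pathname> pair, as returned in step 1. */
      static final class CorruptBlock {
        final long blockId;
        final String path;
        CorruptBlock(long blockId, String path) {
          this.blockId = blockId;
          this.path = path;
        }
      }

      /** Hypothetical stand-in for the namenode call: returns at most
          'limit' pairs whose block ids come after 'startAfterBlockId'. */
      interface Namenode {
        List<CorruptBlock> listCorruptBlocks(long startAfterBlockId, int limit);
      }

      /** Drains the full list by passing back the last block id seen,
          per steps 2 and 3 above. */
      static List<CorruptBlock> listAll(Namenode nn, int limit) {
        List<CorruptBlock> all = new ArrayList<CorruptBlock>();
        long cursor = -1;  // start before the first possible block id
        while (true) {
          List<CorruptBlock> batch = nn.listCorruptBlocks(cursor, limit);
          if (batch.isEmpty()) {
            break;  // "There are no more corrupt blocks"
          }
          all.addAll(batch);
          cursor = batch.get(batch.size() - 1).blockId;  // resume point
        }
        return all;
      }
    }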
> getCorruptFiles() should give some hint that the list is not complete
> ---------------------------------------------------------------------
>
> Key: HDFS-1111
> URL: https://issues.apache.org/jira/browse/HDFS-1111
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Rodrigo Schmidt
> Assignee: Rodrigo Schmidt
> Attachments: HADFS-1111.0.patch
>
>
> The list of corrupt files returned by the namenode says nothing when the
> number of corrupted files is larger than the call output limit (which
> means the list is not complete). There should be a way to hint at
> incompleteness to clients.
> A simple hack would be to append an extra entry with the value null to the
> returned array. Clients could interpret this as a sign that there are other
> corrupt files in the system.
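> A minimal sketch of how a client might detect such a sentinel; the
> getCorruptFiles() signature (returning String[]) is assumed here for
> illustration:
>
>     String[] files = namenode.getCorruptFiles();
>     int n = files.length;
>     // A trailing null entry signals that the list was truncated.
>     boolean incomplete = n > 0 && files[n - 1] == null;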
> We should also do some rephrasing of the fsck output so that it reads more
> confident when the list is known to be complete and less confident when the
> list is known to be incomplete.