[jira] Updated: (HDFS-235) Add support for byte-ranges to hftp
[ https://issues.apache.org/jira/browse/HDFS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Zeller updated HDFS-235:
-----------------------------

    Status: Open  (was: Patch Available)

> Add support for byte-ranges to hftp
> -----------------------------------
>
>                 Key: HDFS-235
>                 URL: https://issues.apache.org/jira/browse/HDFS-235
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 0.21.0
>            Reporter: Venkatesh S
>            Assignee: Bill Zeller
>             Fix For: 0.21.0
>         Attachments: hdfs-235-1.patch, hdfs-235-2.patch, hdfs-235-3.patch
>
> Support should be similar to http byte-serving.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
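For context, HTTP byte-serving of the kind the description refers to revolves around the `Range: bytes=start-end` request header. The sketch below is illustrative only (the class and method names are not from the attached patches, and only single-range headers are handled); it shows how such a header can be reduced to an offset and a length against a known file size:

```java
// Illustrative sketch of single-range "Range: bytes=..." parsing.
// Not code from hdfs-235-*.patch; names are hypothetical.
public class RangeHeader {
    // Parses a header value such as "bytes=100-199" into {offset, length}.
    // Returns null for syntactically invalid, multi-range, or unsatisfiable values.
    public static long[] parse(String header, long fileLength) {
        if (header == null || !header.startsWith("bytes=")) return null;
        String spec = header.substring("bytes=".length());
        int dash = spec.indexOf('-');
        if (dash < 0 || spec.indexOf(',') >= 0) return null;  // multi-range not handled here
        try {
            long start = Long.parseLong(spec.substring(0, dash));
            String endPart = spec.substring(dash + 1);
            // An open-ended range ("bytes=950-") runs to the end of the file.
            long end = endPart.isEmpty() ? fileLength - 1 : Long.parseLong(endPart);
            end = Math.min(end, fileLength - 1);
            if (start < 0 || start > end) return null;
            return new long[] { start, end - start + 1 };
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        long[] r = parse("bytes=100-199", 1000);
        System.out.println("offset=" + r[0] + " length=" + r[1]);
    }
}
```

A server implementing this contract would then stream `length` bytes starting at `offset` with a `206 Partial Content` status, which is presumably what StreamFile does on the namenode side.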
[jira] Updated: (HDFS-235) Add support for byte-ranges to hftp
[ https://issues.apache.org/jira/browse/HDFS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Zeller updated HDFS-235:
-----------------------------

    Attachment: hdfs-235-3.patch

> Add support for byte-ranges to hftp
> -----------------------------------
>
>                 Key: HDFS-235
>                 URL: https://issues.apache.org/jira/browse/HDFS-235
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 0.21.0
>            Reporter: Venkatesh S
>            Assignee: Bill Zeller
>             Fix For: 0.21.0
>         Attachments: hdfs-235-1.patch, hdfs-235-2.patch, hdfs-235-3.patch
>
> Support should be similar to http byte-serving.
[jira] Updated: (HDFS-235) Add support for byte-ranges to hftp
[ https://issues.apache.org/jira/browse/HDFS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Zeller updated HDFS-235:
-----------------------------

    Status: Patch Available  (was: Open)

> Add support for byte-ranges to hftp
> -----------------------------------
>
>                 Key: HDFS-235
>                 URL: https://issues.apache.org/jira/browse/HDFS-235
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 0.21.0
>            Reporter: Venkatesh S
>            Assignee: Bill Zeller
>             Fix For: 0.21.0
>         Attachments: hdfs-235-1.patch, hdfs-235-2.patch, hdfs-235-3.patch
>
> Support should be similar to http byte-serving.
[jira] Commented: (HDFS-235) Add support for byte-ranges to hftp
[ https://issues.apache.org/jira/browse/HDFS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751622#action_12751622 ]

Bill Zeller commented on HDFS-235:
----------------------------------

Addressed Jakob's six issues.

> Add support for byte-ranges to hftp
> -----------------------------------
>
>                 Key: HDFS-235
>                 URL: https://issues.apache.org/jira/browse/HDFS-235
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 0.21.0
>            Reporter: Venkatesh S
>            Assignee: Bill Zeller
>             Fix For: 0.21.0
>         Attachments: hdfs-235-1.patch, hdfs-235-2.patch, hdfs-235-3.patch
>
> Support should be similar to http byte-serving.
[jira] Commented: (HDFS-567) Two contrib tools to facilitate searching for block history information
[ https://issues.apache.org/jira/browse/HDFS-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751194#action_12751194 ]

Bill Zeller commented on HDFS-567:
----------------------------------

I addressed Suresh's six issues above.

> Two contrib tools to facilitate searching for block history information
> -----------------------------------------------------------------------
>
>                 Key: HDFS-567
>                 URL: https://issues.apache.org/jira/browse/HDFS-567
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: tools
>    Affects Versions: 0.21.0
>            Reporter: Bill Zeller
>            Assignee: Bill Zeller
>            Priority: Minor
>             Fix For: 0.21.0
>         Attachments: hdfs-567-1.patch, hdfs-567-2.patch, hdfs-567-3.patch, hdfs-567-4.patch, hdfs-567-5.patch
>
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> Includes a java program to query the namenode for corrupt replica information at some interval. If a corrupt replica is found, a map reduce job is launched that will search (supplied) log files for one or more block ids. The mapred job can be used independently of the java client program and can also be used for arbitrary text searches.
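The core of the map-reduce job described above is a line filter: emit any log line that mentions one of the requested block ids (or, in the general case, any supplied search string). The helper below is a hypothetical sketch of that matching step; the class and method names are not taken from the hdfs-567 patches:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the line-matching step such a log-search job performs.
// Names and structure are illustrative, not taken from hdfs-567-*.patch.
public class BlockLogSearch {
    // Returns the input lines that contain at least one of the given patterns
    // (block ids, or arbitrary strings for general text searches).
    public static List<String> grep(List<String> lines, List<String> patterns) {
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            for (String p : patterns) {
                if (line.contains(p)) {
                    hits.add(line);
                    break;  // one match suffices; avoid emitting a line twice
                }
            }
        }
        return hits;
    }
}
```

In the actual tool this filter would run inside a mapper over the supplied log files; the sequential version here only shows the matching semantics.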
[jira] Updated: (HDFS-567) Two contrib tools to facilitate searching for block history information
[ https://issues.apache.org/jira/browse/HDFS-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Zeller updated HDFS-567:
-----------------------------

    Attachment: hdfs-567-5.patch

> Two contrib tools to facilitate searching for block history information
> -----------------------------------------------------------------------
>
>                 Key: HDFS-567
>                 URL: https://issues.apache.org/jira/browse/HDFS-567
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: tools
>    Affects Versions: 0.21.0
>            Reporter: Bill Zeller
>            Assignee: Bill Zeller
>            Priority: Minor
>             Fix For: 0.21.0
>         Attachments: hdfs-567-1.patch, hdfs-567-2.patch, hdfs-567-3.patch, hdfs-567-4.patch, hdfs-567-5.patch
>
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> Includes a java program to query the namenode for corrupt replica information at some interval. If a corrupt replica is found, a map reduce job is launched that will search (supplied) log files for one or more block ids. The mapred job can be used independently of the java client program and can also be used for arbitrary text searches.
[jira] Updated: (HDFS-567) Two contrib tools to facilitate searching for block history information
[ https://issues.apache.org/jira/browse/HDFS-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Zeller updated HDFS-567:
-----------------------------

    Status: Patch Available  (was: Open)

> Two contrib tools to facilitate searching for block history information
> -----------------------------------------------------------------------
>
>                 Key: HDFS-567
>                 URL: https://issues.apache.org/jira/browse/HDFS-567
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: tools
>    Affects Versions: 0.21.0
>            Reporter: Bill Zeller
>            Assignee: Bill Zeller
>            Priority: Minor
>             Fix For: 0.21.0
>         Attachments: hdfs-567-1.patch, hdfs-567-2.patch, hdfs-567-3.patch, hdfs-567-4.patch, hdfs-567-5.patch
>
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> Includes a java program to query the namenode for corrupt replica information at some interval. If a corrupt replica is found, a map reduce job is launched that will search (supplied) log files for one or more block ids. The mapred job can be used independently of the java client program and can also be used for arbitrary text searches.
[jira] Updated: (HDFS-567) Two contrib tools to facilitate searching for block history information
[ https://issues.apache.org/jira/browse/HDFS-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Zeller updated HDFS-567:
-----------------------------

    Status: Open  (was: Patch Available)

> Two contrib tools to facilitate searching for block history information
> -----------------------------------------------------------------------
>
>                 Key: HDFS-567
>                 URL: https://issues.apache.org/jira/browse/HDFS-567
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: tools
>    Affects Versions: 0.21.0
>            Reporter: Bill Zeller
>            Assignee: Bill Zeller
>            Priority: Minor
>             Fix For: 0.21.0
>         Attachments: hdfs-567-1.patch, hdfs-567-2.patch, hdfs-567-3.patch, hdfs-567-4.patch, hdfs-567-5.patch
>
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> Includes a java program to query the namenode for corrupt replica information at some interval. If a corrupt replica is found, a map reduce job is launched that will search (supplied) log files for one or more block ids. The mapred job can be used independently of the java client program and can also be used for arbitrary text searches.
[jira] Created: (HDFS-594) Add support for byte-ranges to hsftp
Add support for byte-ranges to hsftp
------------------------------------

                 Key: HDFS-594
                 URL: https://issues.apache.org/jira/browse/HDFS-594
             Project: Hadoop HDFS
          Issue Type: New Feature
          Components: hdfs client
    Affects Versions: 0.21.0
            Reporter: Bill Zeller
             Fix For: 0.21.0


HsftpFileSystem should be modified to support byte-ranges so it has the same semantics as HftpFileSystem after committing HDFS-235.
[jira] Commented: (HDFS-235) Add support for byte-ranges to hftp
[ https://issues.apache.org/jira/browse/HDFS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751197#action_12751197 ]

Bill Zeller commented on HDFS-235:
----------------------------------

Since this patch does not modify HsftpFileSystem, I've filed HDFS-594. I will not have time to implement this because my internship ends tomorrow.

> Add support for byte-ranges to hftp
> -----------------------------------
>
>                 Key: HDFS-235
>                 URL: https://issues.apache.org/jira/browse/HDFS-235
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 0.21.0
>            Reporter: Venkatesh S
>            Assignee: Bill Zeller
>             Fix For: 0.21.0
>         Attachments: hdfs-235-1.patch
>
> Support should be similar to http byte-serving.
[jira] Updated: (HDFS-235) Add support for byte-ranges to hftp
[ https://issues.apache.org/jira/browse/HDFS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Zeller updated HDFS-235:
-----------------------------

    Status: Open  (was: Patch Available)

> Add support for byte-ranges to hftp
> -----------------------------------
>
>                 Key: HDFS-235
>                 URL: https://issues.apache.org/jira/browse/HDFS-235
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 0.21.0
>            Reporter: Venkatesh S
>            Assignee: Bill Zeller
>             Fix For: 0.21.0
>         Attachments: hdfs-235-1.patch
>
> Support should be similar to http byte-serving.
[jira] Updated: (HDFS-235) Add support for byte-ranges to hftp
[ https://issues.apache.org/jira/browse/HDFS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Zeller updated HDFS-235:
-----------------------------

        Fix Version/s: 0.21.0
    Affects Version/s: 0.21.0
         Release Note: Added support for byte ranges in HftpFileSystem and support for serving byte ranges of files in StreamFile.
               Status: Patch Available  (was: Open)

HsftpFileSystem has not been modified.

> Add support for byte-ranges to hftp
> -----------------------------------
>
>                 Key: HDFS-235
>                 URL: https://issues.apache.org/jira/browse/HDFS-235
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 0.21.0
>            Reporter: Venkatesh S
>            Assignee: Bill Zeller
>             Fix For: 0.21.0
>         Attachments: hdfs-235-1.patch
>
> Support should be similar to http byte-serving.
[jira] Updated: (HDFS-567) Two contrib tools to facilitate searching for block history information
[ https://issues.apache.org/jira/browse/HDFS-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Zeller updated HDFS-567:
-----------------------------

    Attachment: hdfs-567-3.patch

> Two contrib tools to facilitate searching for block history information
> -----------------------------------------------------------------------
>
>                 Key: HDFS-567
>                 URL: https://issues.apache.org/jira/browse/HDFS-567
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: tools
>    Affects Versions: 0.21.0
>            Reporter: Bill Zeller
>            Assignee: Bill Zeller
>            Priority: Minor
>             Fix For: 0.21.0
>         Attachments: hdfs-567-1.patch, hdfs-567-2.patch, hdfs-567-3.patch
>
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> Includes a java program to query the namenode for corrupt replica information at some interval. If a corrupt replica is found, a map reduce job is launched that will search (supplied) log files for one or more block ids. The mapred job can be used independently of the java client program and can also be used for arbitrary text searches.
[jira] Updated: (HDFS-567) Two contrib tools to facilitate searching for block history information
[ https://issues.apache.org/jira/browse/HDFS-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Zeller updated HDFS-567:
-----------------------------

    Status: Open  (was: Patch Available)

> Two contrib tools to facilitate searching for block history information
> -----------------------------------------------------------------------
>
>                 Key: HDFS-567
>                 URL: https://issues.apache.org/jira/browse/HDFS-567
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: tools
>    Affects Versions: 0.21.0
>            Reporter: Bill Zeller
>            Assignee: Bill Zeller
>            Priority: Minor
>             Fix For: 0.21.0
>         Attachments: hdfs-567-1.patch, hdfs-567-2.patch, hdfs-567-3.patch
>
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> Includes a java program to query the namenode for corrupt replica information at some interval. If a corrupt replica is found, a map reduce job is launched that will search (supplied) log files for one or more block ids. The mapred job can be used independently of the java client program and can also be used for arbitrary text searches.
[jira] Updated: (HDFS-567) Two contrib tools to facilitate searching for block history information
[ https://issues.apache.org/jira/browse/HDFS-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Zeller updated HDFS-567:
-----------------------------

    Status: Patch Available  (was: Open)

> Two contrib tools to facilitate searching for block history information
> -----------------------------------------------------------------------
>
>                 Key: HDFS-567
>                 URL: https://issues.apache.org/jira/browse/HDFS-567
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: tools
>    Affects Versions: 0.21.0
>            Reporter: Bill Zeller
>            Assignee: Bill Zeller
>            Priority: Minor
>             Fix For: 0.21.0
>         Attachments: hdfs-567-1.patch, hdfs-567-2.patch, hdfs-567-3.patch
>
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> Includes a java program to query the namenode for corrupt replica information at some interval. If a corrupt replica is found, a map reduce job is launched that will search (supplied) log files for one or more block ids. The mapred job can be used independently of the java client program and can also be used for arbitrary text searches.
[jira] Updated: (HDFS-492) Expose corrupt replica/block information
[ https://issues.apache.org/jira/browse/HDFS-492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Zeller updated HDFS-492:
-----------------------------

    Status: Open  (was: Patch Available)

> Expose corrupt replica/block information
> ----------------------------------------
>
>                 Key: HDFS-492
>                 URL: https://issues.apache.org/jira/browse/HDFS-492
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node, name-node
>    Affects Versions: 0.21.0
>            Reporter: Bill Zeller
>            Assignee: Bill Zeller
>            Priority: Minor
>             Fix For: 0.21.0
>         Attachments: hdfs-492-10.patch, hdfs-492-4.patch, hdfs-492-5.patch, hdfs-492-8.patch, hdfs-492-9.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> This adds two additional functions to FSNamesystem to provide more information about corrupt replicas. It also adds two servlets to the namenode that provide information (in JSON) about all blocks with corrupt replicas as well as information about a specific block. It also changes the file browsing servlet by adding a link from block ids to the above mentioned block information page. These JSON pages are designed to be used by client side tools which wish to analyze corrupt block/replicas. The only change to an existing (non-servlet) class is described below. Currently, CorruptReplicasMap stores a map of corrupt replica information and allows insertion and deletion. It also gives information about the corrupt replicas for a specific block. It does not allow iteration over all corrupt blocks. Two additional functions will be added to FSNamesystem (which will call BlockManager which will call CorruptReplicasMap). The first will return the size of the corrupt replicas map, which represents the number of blocks that have corrupt replicas (but less than the number of corrupt replicas if a block has multiple corrupt replicas).
> The second will allow paging through a list of block ids that contain corrupt replicas: {{public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId)}}
> {{n}} is the number of block ids to return and {{startingBlockId}} is the block id offset. To prevent a large number of items being returned at one time, {{n}} is constrained to 0 <= {{n}} <= 100. If {{startingBlockId}} is null, up to {{n}} items are returned starting at the beginning of the list. Ordering is enforced through the internal use of TreeMap in CorruptReplicasMap.
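The paging contract described in that issue (TreeMap ordering, {{startingBlockId}} as an offset, {{n}} clamped to at most 100) can be sketched as below. This is an illustrative reimplementation, not the code in the attached patches, and it assumes the offset is exclusive (return ids strictly after {{startingBlockId}}), which the description does not state explicitly:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Illustrative sketch of the described paging behavior over a TreeMap of
// corrupt-replica entries; not the actual CorruptReplicasMap/FSNamesystem code.
public class CorruptBlockPager {
    // TreeMap keeps block ids sorted, which is what makes stable paging possible.
    private final TreeMap<Long, String> corruptReplicasMap = new TreeMap<>();

    public void markCorrupt(long blockId, String datanode) {
        corruptReplicasMap.put(blockId, datanode);
    }

    // Returns up to n block ids (n clamped to 0 <= n <= 100), starting after
    // startingBlockId, or from the smallest id when startingBlockId is null.
    public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId) {
        int limit = Math.max(0, Math.min(n, 100));
        List<Long> out = new ArrayList<>(limit);
        Iterable<Long> keys = (startingBlockId == null)
            ? corruptReplicasMap.keySet()
            : corruptReplicasMap.tailMap(startingBlockId, false).keySet();  // exclusive offset
        for (Long id : keys) {
            if (out.size() >= limit) break;
            out.add(id);
        }
        return out;
    }
}
```

A client pages through the full list by passing the last id of each page as the {{startingBlockId}} of the next call until an empty page comes back.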
[jira] Updated: (HDFS-492) Expose corrupt replica/block information
[ https://issues.apache.org/jira/browse/HDFS-492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Zeller updated HDFS-492:
-----------------------------

    Status: Patch Available  (was: Open)

I addressed Nicholas' issues.

> Expose corrupt replica/block information
> ----------------------------------------
>
>                 Key: HDFS-492
>                 URL: https://issues.apache.org/jira/browse/HDFS-492
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node, name-node
>    Affects Versions: 0.21.0
>            Reporter: Bill Zeller
>            Assignee: Bill Zeller
>            Priority: Minor
>             Fix For: 0.21.0
>         Attachments: hdfs-492-10.patch, hdfs-492-4.patch, hdfs-492-5.patch, hdfs-492-8.patch, hdfs-492-9.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> This adds two additional functions to FSNamesystem to provide more information about corrupt replicas. It also adds two servlets to the namenode that provide information (in JSON) about all blocks with corrupt replicas as well as information about a specific block. It also changes the file browsing servlet by adding a link from block ids to the above mentioned block information page. These JSON pages are designed to be used by client side tools which wish to analyze corrupt block/replicas. The only change to an existing (non-servlet) class is described below. Currently, CorruptReplicasMap stores a map of corrupt replica information and allows insertion and deletion. It also gives information about the corrupt replicas for a specific block. It does not allow iteration over all corrupt blocks. Two additional functions will be added to FSNamesystem (which will call BlockManager which will call CorruptReplicasMap). The first will return the size of the corrupt replicas map, which represents the number of blocks that have corrupt replicas (but less than the number of corrupt replicas if a block has multiple corrupt replicas).
> The second will allow paging through a list of block ids that contain corrupt replicas: {{public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId)}}
> {{n}} is the number of block ids to return and {{startingBlockId}} is the block id offset. To prevent a large number of items being returned at one time, {{n}} is constrained to 0 <= {{n}} <= 100. If {{startingBlockId}} is null, up to {{n}} items are returned starting at the beginning of the list. Ordering is enforced through the internal use of TreeMap in CorruptReplicasMap.
[jira] Updated: (HDFS-492) Expose corrupt replica/block information
[ https://issues.apache.org/jira/browse/HDFS-492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Zeller updated HDFS-492:
-----------------------------

    Attachment: hdfs-492-11.patch

> Expose corrupt replica/block information
> ----------------------------------------
>
>                 Key: HDFS-492
>                 URL: https://issues.apache.org/jira/browse/HDFS-492
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node, name-node
>    Affects Versions: 0.21.0
>            Reporter: Bill Zeller
>            Assignee: Bill Zeller
>            Priority: Minor
>             Fix For: 0.21.0
>         Attachments: hdfs-492-10.patch, hdfs-492-11.patch, hdfs-492-4.patch, hdfs-492-5.patch, hdfs-492-8.patch, hdfs-492-9.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> This adds two additional functions to FSNamesystem to provide more information about corrupt replicas. It also adds two servlets to the namenode that provide information (in JSON) about all blocks with corrupt replicas as well as information about a specific block. It also changes the file browsing servlet by adding a link from block ids to the above mentioned block information page. These JSON pages are designed to be used by client side tools which wish to analyze corrupt block/replicas. The only change to an existing (non-servlet) class is described below. Currently, CorruptReplicasMap stores a map of corrupt replica information and allows insertion and deletion. It also gives information about the corrupt replicas for a specific block. It does not allow iteration over all corrupt blocks. Two additional functions will be added to FSNamesystem (which will call BlockManager which will call CorruptReplicasMap). The first will return the size of the corrupt replicas map, which represents the number of blocks that have corrupt replicas (but less than the number of corrupt replicas if a block has multiple corrupt replicas).
> The second will allow paging through a list of block ids that contain corrupt replicas: {{public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId)}}
> {{n}} is the number of block ids to return and {{startingBlockId}} is the block id offset. To prevent a large number of items being returned at one time, {{n}} is constrained to 0 <= {{n}} <= 100. If {{startingBlockId}} is null, up to {{n}} items are returned starting at the beginning of the list. Ordering is enforced through the internal use of TreeMap in CorruptReplicasMap.
[jira] Updated: (HDFS-492) Expose corrupt replica/block information
[ https://issues.apache.org/jira/browse/HDFS-492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Zeller updated HDFS-492:
-----------------------------

    Attachment: hdfs-492-13.patch

> Expose corrupt replica/block information
> ----------------------------------------
>
>                 Key: HDFS-492
>                 URL: https://issues.apache.org/jira/browse/HDFS-492
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node, name-node
>    Affects Versions: 0.21.0
>            Reporter: Bill Zeller
>            Assignee: Bill Zeller
>            Priority: Minor
>             Fix For: 0.21.0
>         Attachments: hdfs-492-10.patch, hdfs-492-11.patch, hdfs-492-13.patch, hdfs-492-4.patch, hdfs-492-5.patch, hdfs-492-8.patch, hdfs-492-9.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> This adds two additional functions to FSNamesystem to provide more information about corrupt replicas. It also adds two servlets to the namenode that provide information (in JSON) about all blocks with corrupt replicas as well as information about a specific block. It also changes the file browsing servlet by adding a link from block ids to the above mentioned block information page. These JSON pages are designed to be used by client side tools which wish to analyze corrupt block/replicas. The only change to an existing (non-servlet) class is described below. Currently, CorruptReplicasMap stores a map of corrupt replica information and allows insertion and deletion. It also gives information about the corrupt replicas for a specific block. It does not allow iteration over all corrupt blocks. Two additional functions will be added to FSNamesystem (which will call BlockManager which will call CorruptReplicasMap). The first will return the size of the corrupt replicas map, which represents the number of blocks that have corrupt replicas (but less than the number of corrupt replicas if a block has multiple corrupt replicas).
> The second will allow paging through a list of block ids that contain corrupt replicas: {{public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId)}}
> {{n}} is the number of block ids to return and {{startingBlockId}} is the block id offset. To prevent a large number of items being returned at one time, {{n}} is constrained to 0 <= {{n}} <= 100. If {{startingBlockId}} is null, up to {{n}} items are returned starting at the beginning of the list. Ordering is enforced through the internal use of TreeMap in CorruptReplicasMap.
[jira] Commented: (HDFS-567) Two contrib tools to facilitate searching for block history information
[ https://issues.apache.org/jira/browse/HDFS-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748147#action_12748147 ]

Bill Zeller commented on HDFS-567:
----------------------------------

The failed test is unrelated to this patch. See: https://issues.apache.org/jira/browse/HDFS-568

> Two contrib tools to facilitate searching for block history information
> -----------------------------------------------------------------------
>
>                 Key: HDFS-567
>                 URL: https://issues.apache.org/jira/browse/HDFS-567
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: tools
>    Affects Versions: 0.21.0
>            Reporter: Bill Zeller
>            Assignee: Bill Zeller
>            Priority: Minor
>             Fix For: 0.21.0
>         Attachments: hdfs-567-1.patch, hdfs-567-2.patch
>
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> Includes a java program to query the namenode for corrupt replica information at some interval. If a corrupt replica is found, a map reduce job is launched that will search (supplied) log files for one or more block ids. The mapred job can be used independently of the java client program and can also be used for arbitrary text searches.
[jira] Updated: (HDFS-492) Expose corrupt replica/block information
[ https://issues.apache.org/jira/browse/HDFS-492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Zeller updated HDFS-492:
-----------------------------

    Status: Open  (was: Patch Available)

> Expose corrupt replica/block information
> ----------------------------------------
>
>                 Key: HDFS-492
>                 URL: https://issues.apache.org/jira/browse/HDFS-492
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node, name-node
>    Affects Versions: 0.21.0
>            Reporter: Bill Zeller
>            Assignee: Bill Zeller
>            Priority: Minor
>             Fix For: 0.21.0
>         Attachments: hdfs-492-4.patch, hdfs-492-5.patch, hdfs-492-8.patch, hdfs-492-9.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> This adds two additional functions to FSNamesystem to provide more information about corrupt replicas. It also adds two servlets to the namenode that provide information (in JSON) about all blocks with corrupt replicas as well as information about a specific block. It also changes the file browsing servlet by adding a link from block ids to the above mentioned block information page. These JSON pages are designed to be used by client side tools which wish to analyze corrupt block/replicas. The only change to an existing (non-servlet) class is described below. Currently, CorruptReplicasMap stores a map of corrupt replica information and allows insertion and deletion. It also gives information about the corrupt replicas for a specific block. It does not allow iteration over all corrupt blocks. Two additional functions will be added to FSNamesystem (which will call BlockManager which will call CorruptReplicasMap). The first will return the size of the corrupt replicas map, which represents the number of blocks that have corrupt replicas (but less than the number of corrupt replicas if a block has multiple corrupt replicas).
> The second will allow paging through a list of block ids that contain corrupt replicas: {{public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId)}}
> {{n}} is the number of block ids to return and {{startingBlockId}} is the block id offset. To prevent a large number of items being returned at one time, {{n}} is constrained to 0 <= {{n}} <= 100. If {{startingBlockId}} is null, up to {{n}} items are returned starting at the beginning of the list. Ordering is enforced through the internal use of TreeMap in CorruptReplicasMap.
[jira] Updated: (HDFS-492) Expose corrupt replica/block information
[ https://issues.apache.org/jira/browse/HDFS-492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Zeller updated HDFS-492:
-----------------------------

    Status: Patch Available  (was: Open)

> Expose corrupt replica/block information
> ----------------------------------------
>
>                 Key: HDFS-492
>                 URL: https://issues.apache.org/jira/browse/HDFS-492
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node, name-node
>    Affects Versions: 0.21.0
>            Reporter: Bill Zeller
>            Assignee: Bill Zeller
>            Priority: Minor
>             Fix For: 0.21.0
>         Attachments: hdfs-492-4.patch, hdfs-492-5.patch, hdfs-492-8.patch, hdfs-492-9.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> This adds two additional functions to FSNamesystem to provide more information about corrupt replicas. It also adds two servlets to the namenode that provide information (in JSON) about all blocks with corrupt replicas as well as information about a specific block. It also changes the file browsing servlet by adding a link from block ids to the above mentioned block information page. These JSON pages are designed to be used by client side tools which wish to analyze corrupt block/replicas. The only change to an existing (non-servlet) class is described below. Currently, CorruptReplicasMap stores a map of corrupt replica information and allows insertion and deletion. It also gives information about the corrupt replicas for a specific block. It does not allow iteration over all corrupt blocks. Two additional functions will be added to FSNamesystem (which will call BlockManager which will call CorruptReplicasMap). The first will return the size of the corrupt replicas map, which represents the number of blocks that have corrupt replicas (but less than the number of corrupt replicas if a block has multiple corrupt replicas).
> The second will allow paging through a list of block ids that contain corrupt replicas: {{public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId)}}
> {{n}} is the number of block ids to return and {{startingBlockId}} is the block id offset. To prevent a large number of items being returned at one time, {{n}} is constrained to 0 <= {{n}} <= 100. If {{startingBlockId}} is null, up to {{n}} items are returned starting at the beginning of the list. Ordering is enforced through the internal use of TreeMap in CorruptReplicasMap.
[jira] Created: (HDFS-567) Two contrib tools to facilitate searching for block history information
Two contrib tools to facilitate searching for block history information
-----------------------------------------------------------------------

                 Key: HDFS-567
                 URL: https://issues.apache.org/jira/browse/HDFS-567
             Project: Hadoop HDFS
          Issue Type: New Feature
          Components: tools
            Reporter: Bill Zeller
            Assignee: Bill Zeller
            Priority: Minor
             Fix For: 0.21.0


Includes a java program to query the namenode for corrupt replica information at some interval. If a corrupt replica is found, a map reduce job is launched that will search (supplied) log files for one or more block ids. The mapred job can be used independently of the java client program and can also be used for arbitrary text searches.
[jira] Updated: (HDFS-567) Two contrib tools to facilitate searching for block history information
[ https://issues.apache.org/jira/browse/HDFS-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Zeller updated HDFS-567:
-----------------------------

    Affects Version/s: 0.21.0
         Release Note: Adds a contrib that includes two programs to facilitate searching for block history information in log files.
               Status: Patch Available  (was: Open)

> Two contrib tools to facilitate searching for block history information
> -----------------------------------------------------------------------
>
>                 Key: HDFS-567
>                 URL: https://issues.apache.org/jira/browse/HDFS-567
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: tools
>    Affects Versions: 0.21.0
>            Reporter: Bill Zeller
>            Assignee: Bill Zeller
>            Priority: Minor
>             Fix For: 0.21.0
>         Attachments: hdfs-567-1.patch
>
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> Includes a java program to query the namenode for corrupt replica information at some interval. If a corrupt replica is found, a map reduce job is launched that will search (supplied) log files for one or more block ids. The mapred job can be used independently of the java client program and can also be used for arbitrary text searches.
[jira] Updated: (HDFS-567) Two contrib tools to facilitate searching for block history information
[ https://issues.apache.org/jira/browse/HDFS-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Zeller updated HDFS-567: - Attachment: hdfs-567-1.patch Two contrib tools to facilitate searching for block history information Key: HDFS-567 URL: https://issues.apache.org/jira/browse/HDFS-567 Project: Hadoop HDFS Issue Type: New Feature Components: tools Reporter: Bill Zeller Assignee: Bill Zeller Priority: Minor Fix For: 0.21.0 Attachments: hdfs-567-1.patch Original Estimate: 5h Remaining Estimate: 5h Includes a java program to query the namenode for corrupt replica information at some interval. If a corrupt replica is found, a map reduce job is launched that will search (supplied) log files for one or more block ids. The mapred job can be used independently of the java client program and can also be used for arbitrary text searches. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-567) Two contrib tools to facilitate searching for block history information
[ https://issues.apache.org/jira/browse/HDFS-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Zeller updated HDFS-567: - Status: Patch Available (was: Open) Two contrib tools to facilitate searching for block history information Key: HDFS-567 URL: https://issues.apache.org/jira/browse/HDFS-567 Project: Hadoop HDFS Issue Type: New Feature Components: tools Affects Versions: 0.21.0 Reporter: Bill Zeller Assignee: Bill Zeller Priority: Minor Fix For: 0.21.0 Attachments: hdfs-567-1.patch, hdfs-567-2.patch Original Estimate: 5h Remaining Estimate: 5h Includes a java program to query the namenode for corrupt replica information at some interval. If a corrupt replica is found, a map reduce job is launched that will search (supplied) log files for one or more block ids. The mapred job can be used independently of the java client program and can also be used for arbitrary text searches. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-567) Two contrib tools to facilitate searching for block history information
[ https://issues.apache.org/jira/browse/HDFS-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Zeller updated HDFS-567: - Attachment: hdfs-567-2.patch Two contrib tools to facilitate searching for block history information Key: HDFS-567 URL: https://issues.apache.org/jira/browse/HDFS-567 Project: Hadoop HDFS Issue Type: New Feature Components: tools Affects Versions: 0.21.0 Reporter: Bill Zeller Assignee: Bill Zeller Priority: Minor Fix For: 0.21.0 Attachments: hdfs-567-1.patch, hdfs-567-2.patch Original Estimate: 5h Remaining Estimate: 5h Includes a java program to query the namenode for corrupt replica information at some interval. If a corrupt replica is found, a map reduce job is launched that will search (supplied) log files for one or more block ids. The mapred job can be used independently of the java client program and can also be used for arbitrary text searches. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-492) Expose corrupt replica/block information
[ https://issues.apache.org/jira/browse/HDFS-492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12747753#action_12747753 ] Bill Zeller commented on HDFS-492: -- In regards to (1), doesn't toArray() require an object array? I don't believe this method supports primitive data types. Expose corrupt replica/block information Key: HDFS-492 URL: https://issues.apache.org/jira/browse/HDFS-492 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, name-node Affects Versions: 0.21.0 Reporter: Bill Zeller Assignee: Bill Zeller Priority: Minor Fix For: 0.21.0 Attachments: hdfs-492-4.patch, hdfs-492-5.patch, hdfs-492-8.patch Original Estimate: 48h Remaining Estimate: 48h This adds two additional functions to FSNamesystem to provide more information about corrupt replicas. It also adds two servlets to the namenode that provide information (in JSON) about all blocks with corrupt replicas as well as information about a specific block. It also changes the file browsing servlet by adding a link from block ids to the above mentioned block information page. These JSON pages are designed to be used by client side tools which wish to analyze corrupt block/replicas. The only change to an existing (non-servlet) class is described below. Currently, CorruptReplicasMap stores a map of corrupt replica information and allows insertion and deletion. It also gives information about the corrupt replicas for a specific block. It does not allow iteration over all corrupt blocks. Two additional functions will be added to FSNamesystem (which will call BlockManager which will call CorruptReplicasMap). The first will return the size of the corrupt replicas map, which represents the number of blocks that have corrupt replicas (but less than the number of corrupt replicas if a block has multiple corrupt replicas). 
The second will allow paging through a list of block ids that contain corrupt replicas: {{public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId)}} {{n}} is the number of block ids to return and {{startingBlockId}} is the block id offset. To prevent a large number of items being returned at one time, n is constrained to 0 <= {{n}} <= 100. If {{startingBlockId}} is null, up to {{n}} items are returned starting at the beginning of the list. Ordering is enforced through the internal use of TreeMap in CorruptReplicasMap. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
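The paging contract described above can be sketched in a minimal, self-contained form. This is a hypothetical simplification (class name `CorruptBlockPager` and the `Long -> String` value type are invented for illustration; the real CorruptReplicasMap tracks a collection of datanodes per Block), but it shows how a TreeMap gives both the sorted ordering and the "resume after startingBlockId" behavior:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Simplified sketch of the paging behavior described above: a TreeMap keeps
// block ids sorted, so each page is just a bounded walk over a tail view.
public class CorruptBlockPager {
    private final TreeMap<Long, String> corruptReplicas = new TreeMap<>();

    public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId) {
        if (n < 0 || n > 100) {
            throw new IllegalArgumentException("n must satisfy 0 <= n <= 100");
        }
        // null means "start from the beginning"; otherwise resume at the
        // block following startingBlockId (exclusive tail view).
        SortedMap<Long, String> view = (startingBlockId == null)
            ? corruptReplicas
            : corruptReplicas.tailMap(startingBlockId, false);
        List<Long> page = new ArrayList<>();
        for (Long id : view.keySet()) {
            if (page.size() >= n) break;
            page.add(id);
        }
        return page;
    }

    public synchronized void markCorrupt(long blockId, String datanode) {
        corruptReplicas.put(blockId, datanode);
    }
}
```

Callers page through the full set by passing the last id of the previous page as the next call's `startingBlockId`.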
[jira] Updated: (HDFS-492) Expose corrupt replica/block information
[ https://issues.apache.org/jira/browse/HDFS-492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Zeller updated HDFS-492: - Attachment: hdfs-492-8.patch Ran Hudson tests locally: [exec] There appear to be 147 release audit warnings before the patch and 147 release audit warnings after applying the patch. [exec] [exec] +1 overall. [exec] +1 @author. The patch does not contain any @author tags. [exec] +1 tests included. The patch appears to include 2 new or modified tests. [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] == [exec] == [exec] Finished build. [exec] == [exec] == [exec] BUILD SUCCESSFUL (some blank lines removed) Running ant run-test-hdfs: BUILD SUCCESSFUL Total time: 36 minutes 1 second Expose corrupt replica/block information Key: HDFS-492 URL: https://issues.apache.org/jira/browse/HDFS-492 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, name-node Affects Versions: 0.21.0 Reporter: Bill Zeller Assignee: Bill Zeller Priority: Minor Fix For: 0.21.0 Attachments: hdfs-492-4.patch, hdfs-492-5.patch, hdfs-492-8.patch Original Estimate: 48h Remaining Estimate: 48h This adds two additional functions to FSNamesystem to provide more information about corrupt replicas. It also adds two servlets to the namenode that provide information (in JSON) about all blocks with corrupt replicas as well as information about a specific block. It also changes the file browsing servlet by adding a link from block ids to the above mentioned block information page. These JSON pages are designed to be used by client side tools which wish to analyze corrupt block/replicas. 
The only change to an existing (non-servlet) class is described below. Currently, CorruptReplicasMap stores a map of corrupt replica information and allows insertion and deletion. It also gives information about the corrupt replicas for a specific block. It does not allow iteration over all corrupt blocks. Two additional functions will be added to FSNamesystem (which will call BlockManager which will call CorruptReplicasMap). The first will return the size of the corrupt replicas map, which represents the number of blocks that have corrupt replicas (but less than the number of corrupt replicas if a block has multiple corrupt replicas). The second will allow paging through a list of block ids that contain corrupt replicas: {{public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId)}} {{n}} is the number of block ids to return and {{startingBlockId}} is the block id offset. To prevent a large number of items being returned at one time, n is constrained to 0 <= {{n}} <= 100. If {{startingBlockId}} is null, up to {{n}} items are returned starting at the beginning of the list. Ordering is enforced through the internal use of TreeMap in CorruptReplicasMap. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-235) Add support for byte-ranges to hftp
[ https://issues.apache.org/jira/browse/HDFS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12746094#action_12746094 ] Bill Zeller commented on HDFS-235: -- I believe you're asking if seek() could be called on the FSDataInputStream object returned by HftpFileSystem::open if byte-ranges are implemented. Seek could be called, but it would only allow seeking within the byte-range initially specified when making the open() call. I don't believe byte-ranges could be used to optimize seek, because the seek happens after the HTTP response returns. This forces seek() to work within the confines of the bytes already requested. Add support for byte-ranges to hftp --- Key: HDFS-235 URL: https://issues.apache.org/jira/browse/HDFS-235 Project: Hadoop HDFS Issue Type: New Feature Reporter: Venkatesh S Assignee: Bill Zeller Support should be similar to http byte-serving. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-235) Add support for byte-ranges to hftp
[ https://issues.apache.org/jira/browse/HDFS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12746235#action_12746235 ] Bill Zeller commented on HDFS-235: -- Supporting seek() would be more generic. Perhaps the initial open() call should do nothing, with no HTTP requests made until one attempts to read a file. Then the servlet could accept a start position, but not an end position. Seeking would close any open HTTP connection and open a new one. Yes, the end position is not required. I believe HftpFileSystem could be easily modified to support the above semantics. Add support for byte-ranges to hftp --- Key: HDFS-235 URL: https://issues.apache.org/jira/browse/HDFS-235 Project: Hadoop HDFS Issue Type: New Feature Reporter: Venkatesh S Assignee: Bill Zeller Support should be similar to http byte-serving. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
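The semantics proposed in that comment (defer any HTTP request until the first read, and make seek() drop the connection and re-request from the new offset) can be sketched roughly as follows. This is a hypothetical standalone illustration, not HftpFileSystem code; the class name `LazyHttpSeekableStream` is invented, and a real implementation would wrap this in Hadoop's FSInputStream:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical sketch of the lazy-open seek semantics described above:
// open() sends no request; seek() closes any live connection; the next
// read reconnects with an open-ended Range header (start only, no end).
public class LazyHttpSeekableStream {
    private final URL url;
    private long pos = 0;
    private InputStream in;          // null until the first read
    private HttpURLConnection conn;

    public LazyHttpSeekableStream(URL url) { this.url = url; }

    // "bytes=N-" requests everything from offset N; no end position needed.
    public static String rangeHeader(long start) {
        return "bytes=" + start + "-";
    }

    private void connect() throws IOException {
        conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Range", rangeHeader(pos));
        in = conn.getInputStream();
    }

    public void seek(long newPos) throws IOException {
        if (in != null) { in.close(); conn.disconnect(); in = null; }
        pos = newPos;                // next read reconnects from here
    }

    public int read() throws IOException {
        if (in == null) connect();
        int b = in.read();
        if (b >= 0) pos++;
        return b;
    }
}
```

With this shape, a seek costs one dropped connection and one new ranged request, instead of being confined to bytes already fetched.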
[jira] Assigned: (HDFS-235) Add support for byte-ranges to hftp
[ https://issues.apache.org/jira/browse/HDFS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Zeller reassigned HDFS-235: Assignee: Bill Zeller Add support for byte-ranges to hftp --- Key: HDFS-235 URL: https://issues.apache.org/jira/browse/HDFS-235 Project: Hadoop HDFS Issue Type: New Feature Reporter: Venkatesh S Assignee: Bill Zeller Support should be similar to http byte-serving. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-492) Expose corrupt replica/block information
[ https://issues.apache.org/jira/browse/HDFS-492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12739258#action_12739258 ] Bill Zeller commented on HDFS-492: -- The second argument is the startingBlockId, not the starting block offset. Passing 0 indicates a block id of 0. This function could be special cased to treat 0 to mean to start from the beginning and assume that no real block id would ever have the value 0, but this seems to be dirtier than just using {{null}}. Expose corrupt replica/block information Key: HDFS-492 URL: https://issues.apache.org/jira/browse/HDFS-492 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, name-node Affects Versions: 0.21.0 Reporter: Bill Zeller Assignee: Bill Zeller Priority: Minor Fix For: 0.21.0 Attachments: hdfs-492-4.patch, hdfs-492-5.patch Original Estimate: 48h Remaining Estimate: 48h This adds two additional functions to FSNamesystem to provide more information about corrupt replicas. It also adds two servlets to the namenode that provide information (in JSON) about all blocks with corrupt replicas as well as information about a specific block. It also changes the file browsing servlet by adding a link from block ids to the above mentioned block information page. These JSON pages are designed to be used by client side tools which wish to analyze corrupt block/replicas. The only change to an existing (non-servlet) class is described below. Currently, CorruptReplicasMap stores a map of corrupt replica information and allows insertion and deletion. It also gives information about the corrupt replicas for a specific block. It does not allow iteration over all corrupt blocks. Two additional functions will be added to FSNamesystem (which will call BlockManager which will call CorruptReplicasMap). 
The first will return the size of the corrupt replicas map, which represents the number of blocks that have corrupt replicas (but less than the number of corrupt replicas if a block has multiple corrupt replicas). The second will allow paging through a list of block ids that contain corrupt replicas: {{public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId)}} {{n}} is the number of block ids to return and {{startingBlockId}} is the block id offset. To prevent a large number of items being returned at one time, n is constrained to 0 <= {{n}} <= 100. If {{startingBlockId}} is null, up to {{n}} items are returned starting at the beginning of the list. Ordering is enforced through the internal use of TreeMap in CorruptReplicasMap. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-167) DFSClient continues to retry indefinitely
[ https://issues.apache.org/jira/browse/HDFS-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12738619#action_12738619 ] Bill Zeller commented on HDFS-167: -- If an end-user has the appropriate ClientProtocol objects, I have no problem with him using this constructor. Is there a reason to restrict dependency injection? DFSClient continues to retry indefinitely - Key: HDFS-167 URL: https://issues.apache.org/jira/browse/HDFS-167 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Reporter: Derek Wollenstein Assignee: Bill Zeller Priority: Minor Fix For: 0.21.0 Attachments: hdfs-167-4.patch, hdfs-167-5.patch, hdfs-167-6.patch, hdfs-167-for-20-1.patch I encountered a bug when trying to upload data using the Hadoop DFS Client. After receiving a NotReplicatedYetException, the DFSClient will normally retry its upload up to some limited number of times. In this case, I found that this retry loop continued indefinitely, to the point that the number of tries remaining was negative: 2009-03-25 16:20:02 [INFO] 2009-03-25 16:20:02 [INFO] 09/03/25 16:20:02 INFO hdfs.DFSClient: Waiting for replication for 21 seconds 2009-03-25 16:20:03 [INFO] 09/03/25 16:20:02 WARN hdfs.DFSClient: NotReplicatedYetException sleeping /apollo/env/SummaryMySQL/var/logstore/fiorello_logs_2009 0325_us/logs_20090325_us_13 retries left -1 The stack trace for the failure that's retrying is: 2009-03-25 16:20:02 [INFO] 09/03/25 16:20:02 INFO hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.NotReplicated YetException: Not replicated yet:filename 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1266) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351) 2009-03-25 16:20:02 [INFO] at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source) 2009-03-25 16:20:02 [INFO] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 2009-03-25 16:20:02 [INFO] at java.lang.reflect.Method.invoke(Method.java:597) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894) 2009-03-25 16:20:02 [INFO] 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.Client.call(Client.java:697) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216) 2009-03-25 16:20:02 [INFO] at $Proxy0.addBlock(Unknown Source) 2009-03-25 16:20:02 [INFO] at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source) 2009-03-25 16:20:02 [INFO] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 2009-03-25 16:20:02 [INFO] at java.lang.reflect.Method.invoke(Method.java:597) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) 2009-03-25 16:20:02 [INFO] at $Proxy0.addBlock(Unknown Source) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2814) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2696) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1996) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-492) Expose corrupt replica/block information
[ https://issues.apache.org/jira/browse/HDFS-492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12738752#action_12738752 ] Bill Zeller commented on HDFS-492: -- I made all the above changes except for (4). The reason I take a Long as the startingBlockId parameter (instead of a long), is to allow {{null}} to be passed. If {{null}} is passed as startingBlockId, the iteration begins at the beginning of the corrupt replica list. If {{null}} is not passed, the iteration begins at the block following startingBlockId. I couldn't think of a clean way to do this other than to create another method which duplicates some of the functionality. Another option would be to create a third method which takes an iterator and n and returns up to n blocks wherever that iterator begins. Two methods could be created calling this third method (one which advanced the iterator to the right block and the other which passed an iterator which hadn't been advanced at all). Adding another method would require propagating it to BlockManager and FSNameSystem, which I thought might be too heavyweight. Expose corrupt replica/block information Key: HDFS-492 URL: https://issues.apache.org/jira/browse/HDFS-492 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, name-node Affects Versions: 0.21.0 Reporter: Bill Zeller Assignee: Bill Zeller Priority: Minor Fix For: 0.21.0 Attachments: hdfs-492-4.patch, hdfs-492-5.patch Original Estimate: 48h Remaining Estimate: 48h This adds two additional functions to FSNamesystem to provide more information about corrupt replicas. It also adds two servlets to the namenode that provide information (in JSON) about all blocks with corrupt replicas as well as information about a specific block. It also changes the file browsing servlet by adding a link from block ids to the above mentioned block information page. 
These JSON pages are designed to be used by client side tools which wish to analyze corrupt block/replicas. The only change to an existing (non-servlet) class is described below. Currently, CorruptReplicasMap stores a map of corrupt replica information and allows insertion and deletion. It also gives information about the corrupt replicas for a specific block. It does not allow iteration over all corrupt blocks. Two additional functions will be added to FSNamesystem (which will call BlockManager which will call CorruptReplicasMap). The first will return the size of the corrupt replicas map, which represents the number of blocks that have corrupt replicas (but less than the number of corrupt replicas if a block has multiple corrupt replicas). The second will allow paging through a list of block ids that contain corrupt replicas: {{public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId)}} {{n}} is the number of block ids to return and {{startingBlockId}} is the block id offset. To prevent a large number of items being returned at one time, n is constrained to 0 <= {{n}} <= 100. If {{startingBlockId}} is null, up to {{n}} items are returned starting at the beginning of the list. Ordering is enforced through the internal use of TreeMap in CorruptReplicasMap. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
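The "third method" refactor weighed in that comment (one shared helper that drains up to n ids from wherever an iterator stands, plus two public entry points that position the iterator) can be sketched like this. The names here (`CorruptReplicaIds`, `firstIds`, `idsAfter`, `take`) are invented for illustration; the point is the shape of the refactor, not the actual HDFS API:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch of the considered refactor: one private helper does
// the bounded draining, and two public methods decide where it starts.
public class CorruptReplicaIds {
    private final TreeMap<Long, Object> map = new TreeMap<>();

    // Drain up to n ids from wherever the iterator currently stands.
    private static List<Long> take(Iterator<Long> it, int n) {
        List<Long> out = new ArrayList<>();
        while (it.hasNext() && out.size() < n) out.add(it.next());
        return out;
    }

    // Start from the beginning of the sorted id list.
    public synchronized List<Long> firstIds(int n) {
        return take(map.keySet().iterator(), n);
    }

    // Start at the block following startingBlockId (exclusive tail view).
    public synchronized List<Long> idsAfter(int n, long startingBlockId) {
        return take(map.tailMap(startingBlockId, false).keySet().iterator(), n);
    }

    public synchronized void add(long blockId) { map.put(blockId, Boolean.TRUE); }
}
```

Splitting the entry points this way avoids the Long/null sentinel entirely, at the cost of a wider surface to propagate through BlockManager and FSNamesystem, which is the trade-off the comment describes.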
[jira] Updated: (HDFS-167) DFSClient continues to retry indefinitely
[ https://issues.apache.org/jira/browse/HDFS-167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Zeller updated HDFS-167: - Attachment: hdfs-167-for-20-1.patch hdfs-167-for-20-1.patch is a patch for .20. Running test-patch: [exec] +1 overall. [exec] +1 @author. The patch does not contain any @author tags. [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity. == == [exec] Finished build. == == Running ant test passed all test except one (org.apache.hadoop.cli.TestCLI) which I believe is a configuration error on my part (I get the same error when testing the trunk with no modifications). DFSClient continues to retry indefinitely - Key: HDFS-167 URL: https://issues.apache.org/jira/browse/HDFS-167 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Reporter: Derek Wollenstein Assignee: Bill Zeller Priority: Minor Fix For: 0.21.0 Attachments: hdfs-167-4.patch, hdfs-167-5.patch, hdfs-167-6.patch, hdfs-167-for-20-1.patch I encountered a bug when trying to upload data using the Hadoop DFS Client. After receiving a NotReplicatedYetException, the DFSClient will normally retry its upload up to some limited number of times. 
In this case, I found that this retry loop continued indefinitely, to the point that the number of tries remaining was negative: 2009-03-25 16:20:02 [INFO] 2009-03-25 16:20:02 [INFO] 09/03/25 16:20:02 INFO hdfs.DFSClient: Waiting for replication for 21 seconds 2009-03-25 16:20:03 [INFO] 09/03/25 16:20:02 WARN hdfs.DFSClient: NotReplicatedYetException sleeping /apollo/env/SummaryMySQL/var/logstore/fiorello_logs_2009 0325_us/logs_20090325_us_13 retries left -1 The stack trace for the failure that's retrying is: 2009-03-25 16:20:02 [INFO] 09/03/25 16:20:02 INFO hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.NotReplicated YetException: Not replicated yet:filename 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1266) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351) 2009-03-25 16:20:02 [INFO] at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source) 2009-03-25 16:20:02 [INFO] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 2009-03-25 16:20:02 [INFO] at java.lang.reflect.Method.invoke(Method.java:597) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894) 2009-03-25 16:20:02 [INFO] 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.Client.call(Client.java:697) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216) 2009-03-25 16:20:02 [INFO] at $Proxy0.addBlock(Unknown Source) 2009-03-25 16:20:02 [INFO] at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source) 2009-03-25 16:20:02 [INFO] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 2009-03-25 16:20:02 [INFO] at java.lang.reflect.Method.invoke(Method.java:597) 2009-03-25 16:20:02 [INFO] at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) 2009-03-25 16:20:02 [INFO] at $Proxy0.addBlock(Unknown Source) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2814) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2696) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1996) 2009-03-25 16:20:02 [INFO] at
[jira] Commented: (HDFS-492) Expose corrupt replica/block information
[ https://issues.apache.org/jira/browse/HDFS-492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12737192#action_12737192 ] Bill Zeller commented on HDFS-492: -- The change to BlockInfo broke this patch. I need to make it work with the current code. Expose corrupt replica/block information Key: HDFS-492 URL: https://issues.apache.org/jira/browse/HDFS-492 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, name-node Affects Versions: 0.21.0 Reporter: Bill Zeller Assignee: Bill Zeller Priority: Minor Fix For: 0.21.0 Attachments: hdfs-492-4.patch, hdfs-492-5.patch Original Estimate: 48h Remaining Estimate: 48h This adds two additional functions to FSNamesystem to provide more information about corrupt replicas. It also adds two servlets to the namenode that provide information (in JSON) about all blocks with corrupt replicas as well as information about a specific block. It also changes the file browsing servlet by adding a link from block ids to the above mentioned block information page. These JSON pages are designed to be used by client side tools which wish to analyze corrupt block/replicas. The only change to an existing (non-servlet) class is described below. Currently, CorruptReplicasMap stores a map of corrupt replica information and allows insertion and deletion. It also gives information about the corrupt replicas for a specific block. It does not allow iteration over all corrupt blocks. Two additional functions will be added to FSNamesystem (which will call BlockManager which will call CorruptReplicasMap). The first will return the size of the corrupt replicas map, which represents the number of blocks that have corrupt replicas (but less than the number of corrupt replicas if a block has multiple corrupt replicas). 
The second will allow paging through a list of block ids that contain corrupt replicas: {{public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId)}} {{n}} is the number of block ids to return and {{startingBlockId}} is the block id offset. To prevent a large number of items being returned at one time, n is constrained to 0 <= {{n}} <= 100. If {{startingBlockId}} is null, up to {{n}} items are returned starting at the beginning of the list. Ordering is enforced through the internal use of TreeMap in CorruptReplicasMap. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HDFS-514) DFSClient.namenode is a public field. Should be private.
DFSClient.namenode is a public field. Should be private. Key: HDFS-514 URL: https://issues.apache.org/jira/browse/HDFS-514 Project: Hadoop HDFS Issue Type: Bug Reporter: Bill Zeller Assignee: Bill Zeller -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-514) DFSClient.namenode is a public field. Should be private.
[ https://issues.apache.org/jira/browse/HDFS-514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12737224#action_12737224 ] Bill Zeller commented on HDFS-514: -- *They cannot be private final due to the changes that need to be made in HDFS-167. DFSClient.namenode is a public field. Should be private. Key: HDFS-514 URL: https://issues.apache.org/jira/browse/HDFS-514 Project: Hadoop HDFS Issue Type: Bug Reporter: Bill Zeller Assignee: Bill Zeller Attachments: hdfs-514-2.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-514) DFSClient.namenode is a public field. Should be private.
[ https://issues.apache.org/jira/browse/HDFS-514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12737246#action_12737246 ] Bill Zeller commented on HDFS-514: -- I posted a new patch that only makes DFSClient.namenode private. DFSClient.namenode is a public field. Should be private. Key: HDFS-514 URL: https://issues.apache.org/jira/browse/HDFS-514 Project: Hadoop HDFS Issue Type: Bug Reporter: Bill Zeller Assignee: Bill Zeller Attachments: hdfs-514-2.patch, hdfs-514-3.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-514) DFSClient.namenode is a public field. Should be private.
[ https://issues.apache.org/jira/browse/HDFS-514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Zeller updated HDFS-514: - Release Note: Changes DFSClient.namenode to be private, adds getter and changes referencing code to use getter. (was: Changes DFSClient.namenode to be private and nonfinal (also rpcNamenode), adds getter and changes referencing code to use getter. ) Status: Patch Available (was: Open) DFSClient.namenode is a public field. Should be private. Key: HDFS-514 URL: https://issues.apache.org/jira/browse/HDFS-514 Project: Hadoop HDFS Issue Type: Bug Reporter: Bill Zeller Assignee: Bill Zeller Attachments: hdfs-514-2.patch, hdfs-514-3.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-514) DFSClient.namenode is a public field. Should be private.
[ https://issues.apache.org/jira/browse/HDFS-514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Zeller updated HDFS-514: - Attachment: hdfs-514-3.patch DFSClient.namenode is a public field. Should be private. Key: HDFS-514 URL: https://issues.apache.org/jira/browse/HDFS-514 Project: Hadoop HDFS Issue Type: Bug Reporter: Bill Zeller Assignee: Bill Zeller Attachments: hdfs-514-2.patch, hdfs-514-3.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-514) DFSClient.namenode is a public field. Should be private.
[ https://issues.apache.org/jira/browse/HDFS-514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737364#action_12737364 ] Bill Zeller commented on HDFS-514: -- [exec] There appear to be 148 release audit warnings before the patch and 148 release audit warnings after applying the patch. [exec] +1 overall. [exec] +1 @author. The patch does not contain any @author tags. [exec] +1 tests included. The patch appears to include 36 new or modified tests. [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] Finished build. BUILD SUCCESSFUL Total time: 9 minutes 47 seconds ant test passed all tests. DFSClient.namenode is a public field. Should be private. Key: HDFS-514 URL: https://issues.apache.org/jira/browse/HDFS-514 Project: Hadoop HDFS Issue Type: Bug Reporter: Bill Zeller Assignee: Bill Zeller Attachments: hdfs-514-2.patch, hdfs-514-3.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-167) DFSClient continues to retry indefinitely
[ https://issues.apache.org/jira/browse/HDFS-167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Zeller updated HDFS-167: - Attachment: hdfs-167-4.patch DFSClient continues to retry indefinitely - Key: HDFS-167 URL: https://issues.apache.org/jira/browse/HDFS-167 Project: Hadoop HDFS Issue Type: Bug Reporter: Derek Wollenstein Assignee: Bill Zeller Priority: Minor Attachments: hdfs-167-4.patch I encountered a bug when trying to upload data using the Hadoop DFS Client. After receiving a NotReplicatedYetException, the DFSClient will normally retry its upload up to some limited number of times. In this case, I found that this retry loop continued indefinitely, to the point that the number of tries remaining was negative: 2009-03-25 16:20:02 [INFO] 2009-03-25 16:20:02 [INFO] 09/03/25 16:20:02 INFO hdfs.DFSClient: Waiting for replication for 21 seconds 2009-03-25 16:20:03 [INFO] 09/03/25 16:20:02 WARN hdfs.DFSClient: NotReplicatedYetException sleeping /apollo/env/SummaryMySQL/var/logstore/fiorello_logs_2009 0325_us/logs_20090325_us_13 retries left -1 The stack trace for the failure that's retrying is: 2009-03-25 16:20:02 [INFO] 09/03/25 16:20:02 INFO hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.NotReplicated YetException: Not replicated yet:filename 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1266) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351) 2009-03-25 16:20:02 [INFO] at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source) 2009-03-25 16:20:02 [INFO] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 2009-03-25 16:20:02 [INFO] at java.lang.reflect.Method.invoke(Method.java:597) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481) 2009-03-25 16:20:02 [INFO] at 
org.apache.hadoop.ipc.Server$Handler.run(Server.java:894) 2009-03-25 16:20:02 [INFO] 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.Client.call(Client.java:697) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216) 2009-03-25 16:20:02 [INFO] at $Proxy0.addBlock(Unknown Source) 2009-03-25 16:20:02 [INFO] at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source) 2009-03-25 16:20:02 [INFO] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 2009-03-25 16:20:02 [INFO] at java.lang.reflect.Method.invoke(Method.java:597) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) 2009-03-25 16:20:02 [INFO] at $Proxy0.addBlock(Unknown Source) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2814) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2696) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1996) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-167) DFSClient continues to retry indefinitely
[ https://issues.apache.org/jira/browse/HDFS-167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Zeller updated HDFS-167: - Release Note: Fixes logical error in DFSClient::DFSOutputStream::DataStreamer::locateFollowingBlock that caused infinite retries on write. Modified DFSClient constructor to allow unit testing of locateFollowingBlock and added unit tests. Status: Patch Available (was: Open) DFSClient continues to retry indefinitely - Key: HDFS-167 URL: https://issues.apache.org/jira/browse/HDFS-167 Project: Hadoop HDFS Issue Type: Bug Reporter: Derek Wollenstein Assignee: Bill Zeller Priority: Minor Attachments: hdfs-167-4.patch I encountered a bug when trying to upload data using the Hadoop DFS Client. After receiving a NotReplicatedYetException, the DFSClient will normally retry its upload up to some limited number of times. In this case, I found that this retry loop continued indefinitely, to the point that the number of tries remaining was negative: 2009-03-25 16:20:02 [INFO] 2009-03-25 16:20:02 [INFO] 09/03/25 16:20:02 INFO hdfs.DFSClient: Waiting for replication for 21 seconds 2009-03-25 16:20:03 [INFO] 09/03/25 16:20:02 WARN hdfs.DFSClient: NotReplicatedYetException sleeping /apollo/env/SummaryMySQL/var/logstore/fiorello_logs_2009 0325_us/logs_20090325_us_13 retries left -1 The stack trace for the failure that's retrying is: 2009-03-25 16:20:02 [INFO] 09/03/25 16:20:02 INFO hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.NotReplicated YetException: Not replicated yet:filename 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1266) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351) 2009-03-25 16:20:02 [INFO] at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source) 2009-03-25 16:20:02 [INFO] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 2009-03-25 16:20:02 [INFO] at java.lang.reflect.Method.invoke(Method.java:597) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894) 2009-03-25 16:20:02 [INFO] 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.Client.call(Client.java:697) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216) 2009-03-25 16:20:02 [INFO] at $Proxy0.addBlock(Unknown Source) 2009-03-25 16:20:02 [INFO] at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source) 2009-03-25 16:20:02 [INFO] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 2009-03-25 16:20:02 [INFO] at java.lang.reflect.Method.invoke(Method.java:597) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) 2009-03-25 16:20:02 [INFO] at $Proxy0.addBlock(Unknown Source) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2814) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2696) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1996) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-167) DFSClient continues to retry indefinitely
[ https://issues.apache.org/jira/browse/HDFS-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734265#action_12734265 ] Bill Zeller commented on HDFS-167: -- The offending code: {quote} if (--retries == 0 && !NotReplicatedYetException.class.getName(). equals(e.getClassName())) { throw e; } {quote} This code attempts to retry until the above condition is met. The above condition says to {{throw e}} if the number of retries is 0 and the exception thrown is not a {{NotReplicatedYetException}}. However, the code later assumes that any exception not thrown is a {{NotReplicatedYetException}}. The intent seems to be to retry a certain number of times if a NotReplicatedYetException is thrown and to throw any other type of exception. The {{&&}} in the if statement should be changed to an {{||}}. DFSClient continues to retry indefinitely - Key: HDFS-167 URL: https://issues.apache.org/jira/browse/HDFS-167 Project: Hadoop HDFS Issue Type: Bug Reporter: Derek Wollenstein Priority: Minor I encountered a bug when trying to upload data using the Hadoop DFS Client. After receiving a NotReplicatedYetException, the DFSClient will normally retry its upload up to some limited number of times. 
In this case, I found that this retry loop continued indefinitely, to the point that the number of tries remaining was negative: 2009-03-25 16:20:02 [INFO] 2009-03-25 16:20:02 [INFO] 09/03/25 16:20:02 INFO hdfs.DFSClient: Waiting for replication for 21 seconds 2009-03-25 16:20:03 [INFO] 09/03/25 16:20:02 WARN hdfs.DFSClient: NotReplicatedYetException sleeping /apollo/env/SummaryMySQL/var/logstore/fiorello_logs_2009 0325_us/logs_20090325_us_13 retries left -1 The stack trace for the failure that's retrying is: 2009-03-25 16:20:02 [INFO] 09/03/25 16:20:02 INFO hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.NotReplicated YetException: Not replicated yet:filename 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1266) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351) 2009-03-25 16:20:02 [INFO] at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source) 2009-03-25 16:20:02 [INFO] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 2009-03-25 16:20:02 [INFO] at java.lang.reflect.Method.invoke(Method.java:597) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894) 2009-03-25 16:20:02 [INFO] 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.Client.call(Client.java:697) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216) 2009-03-25 16:20:02 [INFO] at $Proxy0.addBlock(Unknown Source) 2009-03-25 16:20:02 [INFO] at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source) 2009-03-25 16:20:02 [INFO] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 2009-03-25 16:20:02 [INFO] at java.lang.reflect.Method.invoke(Method.java:597) 2009-03-25 16:20:02 [INFO] at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) 2009-03-25 16:20:02 [INFO] at $Proxy0.addBlock(Unknown Source) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2814) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2696) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1996) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
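The effect of the operator can be seen in a standalone sketch (the class and method names below are illustrative, not the actual DFSClient code). With {{&&}}, a NotReplicatedYetException never satisfies the condition, so the counter decrements past zero forever; with {{||}}, the loop gives up once retries are exhausted and rethrows any other exception immediately:

```java
// Standalone illustration of the retry condition discussed above.
// "retries" is the value after the --retries decrement has run.
public class RetryConditionDemo {
    // Buggy form: throws only when retries are exhausted AND the exception
    // is not a NotReplicatedYetException -- so NRYE retries forever.
    static boolean shouldThrowBuggy(int retries, boolean isNotReplicatedYet) {
        return retries == 0 && !isNotReplicatedYet;
    }

    // Fixed form: throws when retries are exhausted OR the exception is
    // of any other type.
    static boolean shouldThrowFixed(int retries, boolean isNotReplicatedYet) {
        return retries == 0 || !isNotReplicatedYet;
    }

    public static void main(String[] args) {
        // Retries exhausted while still seeing NotReplicatedYetException:
        // the buggy condition keeps retrying, the fixed one gives up.
        System.out.println(shouldThrowBuggy(0, true));  // false
        System.out.println(shouldThrowFixed(0, true));  // true
    }
}
```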
[jira] Commented: (HDFS-167) DFSClient continues to retry indefinitely
[ https://issues.apache.org/jira/browse/HDFS-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734275#action_12734275 ] Bill Zeller commented on HDFS-167: -- The above code should be: {code:title=org.apache.hadoop.hdfs.DFSClient::locateFollowingBlock|borderStyle=solid} if (--retries == 0 || !NotReplicatedYetException.class.getName(). equals(e.getClassName())) { throw e; } {code} (Sorry about the repost) DFSClient continues to retry indefinitely - Key: HDFS-167 URL: https://issues.apache.org/jira/browse/HDFS-167 Project: Hadoop HDFS Issue Type: Bug Reporter: Derek Wollenstein Priority: Minor I encountered a bug when trying to upload data using the Hadoop DFS Client. After receiving a NotReplicatedYetException, the DFSClient will normally retry its upload up to some limited number of times. In this case, I found that this retry loop continued indefinitely, to the point that the number of tries remaining was negative: 2009-03-25 16:20:02 [INFO] 2009-03-25 16:20:02 [INFO] 09/03/25 16:20:02 INFO hdfs.DFSClient: Waiting for replication for 21 seconds 2009-03-25 16:20:03 [INFO] 09/03/25 16:20:02 WARN hdfs.DFSClient: NotReplicatedYetException sleeping /apollo/env/SummaryMySQL/var/logstore/fiorello_logs_2009 0325_us/logs_20090325_us_13 retries left -1 The stack trace for the failure that's retrying is: 2009-03-25 16:20:02 [INFO] 09/03/25 16:20:02 INFO hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.NotReplicated YetException: Not replicated yet:filename 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1266) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351) 2009-03-25 16:20:02 [INFO] at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source) 2009-03-25 16:20:02 [INFO] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 
2009-03-25 16:20:02 [INFO] at java.lang.reflect.Method.invoke(Method.java:597) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894) 2009-03-25 16:20:02 [INFO] 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.Client.call(Client.java:697) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216) 2009-03-25 16:20:02 [INFO] at $Proxy0.addBlock(Unknown Source) 2009-03-25 16:20:02 [INFO] at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source) 2009-03-25 16:20:02 [INFO] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 2009-03-25 16:20:02 [INFO] at java.lang.reflect.Method.invoke(Method.java:597) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) 2009-03-25 16:20:02 [INFO] at $Proxy0.addBlock(Unknown Source) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2814) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2696) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1996) 2009-03-25 16:20:02 [INFO] at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-492) Expose corrupt replica/block information
[ https://issues.apache.org/jira/browse/HDFS-492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Zeller updated HDFS-492: - Fix Version/s: 0.21.0 Status: Patch Available (was: Open) Expose corrupt replica/block information Key: HDFS-492 URL: https://issues.apache.org/jira/browse/HDFS-492 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, name-node Affects Versions: 0.21.0 Reporter: Bill Zeller Priority: Minor Fix For: 0.21.0 Attachments: hdfs-492-4.patch, hdfs-492-5.patch Original Estimate: 48h Remaining Estimate: 48h This adds two additional functions to FSNamesystem to provide more information about corrupt replicas. It also adds two servlets to the namenode that provide information (in JSON) about all blocks with corrupt replicas as well as information about a specific block. It also changes the file browsing servlet by adding a link from block ids to the above mentioned block information page. These JSON pages are designed to be used by client side tools which wish to analyze corrupt block/replicas. The only change to an existing (non-servlet) class is described below. Currently, CorruptReplicasMap stores a map of corrupt replica information and allows insertion and deletion. It also gives information about the corrupt replicas for a specific block. It does not allow iteration over all corrupt blocks. Two additional functions will be added to FSNamesystem (which will call BlockManager which will call CorruptReplicasMap). The first will return the size of the corrupt replicas map, which represents the number of blocks that have corrupt replicas (but less than the number of corrupt replicas if a block has multiple corrupt replicas). The second will allow paging through a list of block ids that contain corrupt replicas: {{public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId)}} {{n}} is the number of block ids to return and {{startingBlockId}} is the block id offset. 
To prevent a large number of items being returned at one time, n is constrained to 0 <= {{n}} <= 100. If {{startingBlockId}} is null, up to {{n}} items are returned starting at the beginning of the list. Ordering is enforced through the internal use of TreeMap in CorruptReplicasMap. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
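The paging scheme described above can be sketched with a plain TreeMap. The class and method bodies below are illustrative, not the actual CorruptReplicasMap code, and treating {{startingBlockId}} as an exclusive offset is an assumption of this sketch:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Illustrative sketch of TreeMap-backed paging over corrupt block ids.
public class CorruptBlockPager {
    // TreeMap keeps block ids in sorted order, which provides the stable
    // ordering the paging scheme relies on.
    private final TreeMap<Long, String> corruptReplicas = new TreeMap<>();

    public synchronized void markCorrupt(long blockId, String reason) {
        corruptReplicas.put(blockId, reason);
    }

    // Returns up to n block ids; n is clamped to 0 <= n <= 100. A null
    // startingBlockId starts from the beginning; otherwise paging resumes
    // strictly after the given id (an assumption in this sketch).
    public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId) {
        n = Math.max(0, Math.min(n, 100));
        Iterable<Long> keys = (startingBlockId == null)
                ? corruptReplicas.keySet()
                : corruptReplicas.tailMap(startingBlockId, false).keySet();
        List<Long> page = new ArrayList<>(n);
        for (Long id : keys) {
            if (page.size() == n) break;
            page.add(id);
        }
        return page;
    }
}
```

A caller pages through the whole map by passing the last id of one page as the offset for the next, which stays correct even if entries are added or removed between calls.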
[jira] Assigned: (HDFS-492) Expose corrupt replica/block information
[ https://issues.apache.org/jira/browse/HDFS-492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Zeller reassigned HDFS-492: Assignee: Bill Zeller Expose corrupt replica/block information Key: HDFS-492 URL: https://issues.apache.org/jira/browse/HDFS-492 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, name-node Affects Versions: 0.21.0 Reporter: Bill Zeller Assignee: Bill Zeller Priority: Minor Fix For: 0.21.0 Attachments: hdfs-492-4.patch, hdfs-492-5.patch Original Estimate: 48h Remaining Estimate: 48h This adds two additional functions to FSNamesystem to provide more information about corrupt replicas. It also adds two servlets to the namenode that provide information (in JSON) about all blocks with corrupt replicas as well as information about a specific block. It also changes the file browsing servlet by adding a link from block ids to the above mentioned block information page. These JSON pages are designed to be used by client side tools which wish to analyze corrupt block/replicas. The only change to an existing (non-servlet) class is described below. Currently, CorruptReplicasMap stores a map of corrupt replica information and allows insertion and deletion. It also gives information about the corrupt replicas for a specific block. It does not allow iteration over all corrupt blocks. Two additional functions will be added to FSNamesystem (which will call BlockManager which will call CorruptReplicasMap). The first will return the size of the corrupt replicas map, which represents the number of blocks that have corrupt replicas (but less than the number of corrupt replicas if a block has multiple corrupt replicas). The second will allow paging through a list of block ids that contain corrupt replicas: {{public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId)}} {{n}} is the number of block ids to return and {{startingBlockId}} is the block id offset. 
To prevent a large number of items being returned at one time, n is constrained to 0 <= {{n}} <= 100. If {{startingBlockId}} is null, up to {{n}} items are returned starting at the beginning of the list. Ordering is enforced through the internal use of TreeMap in CorruptReplicasMap. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-492) Expose corrupt replica/block information
[ https://issues.apache.org/jira/browse/HDFS-492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Zeller updated HDFS-492: - Attachment: hdfs-492-4.patch Expose corrupt replica/block information Key: HDFS-492 URL: https://issues.apache.org/jira/browse/HDFS-492 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, name-node Affects Versions: 0.21.0 Reporter: Bill Zeller Priority: Minor Attachments: hdfs-492-4.patch Original Estimate: 48h Remaining Estimate: 48h This adds two additional functions to FSNamesystem to provide more information about corrupt replicas. It also adds two servlets to the namenode that provide information (in JSON) about all blocks with corrupt replicas as well as information about a specific block. It also changes the file browsing servlet by adding a link from block ids to the above mentioned block information page. These JSON pages are designed to be used by client side tools which wish to analyze corrupt block/replicas. The only change to an existing (non-servlet) class is described below. Currently, CorruptReplicasMap stores a map of corrupt replica information and allows insertion and deletion. It also gives information about the corrupt replicas for a specific block. It does not allow iteration over all corrupt blocks. Two additional functions will be added to FSNamesystem (which will call BlockManager which will call CorruptReplicasMap). The first will return the size of the corrupt replicas map, which represents the number of blocks that have corrupt replicas (but less than the number of corrupt replicas if a block has multiple corrupt replicas). The second will allow paging through a list of block ids that contain corrupt replicas: {{public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId)}} {{n}} is the number of block ids to return and {{startingBlockId}} is the block id offset. 
To prevent a large number of items being returned at one time, n is constrained to 0 <= {{n}} <= 100. If {{startingBlockId}} is null, up to {{n}} items are returned starting at the beginning of the list. Ordering is enforced through the internal use of TreeMap in CorruptReplicasMap. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-492) Expose corrupt replica/block information
[ https://issues.apache.org/jira/browse/HDFS-492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Zeller updated HDFS-492: - Release Note: Exposes data about corrupt blocks/replicas in FSNamesystem and adds two JSON JSP pages that provide block information. Status: Patch Available (was: Open) Expose corrupt replica/block information Key: HDFS-492 URL: https://issues.apache.org/jira/browse/HDFS-492 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, name-node Affects Versions: 0.21.0 Reporter: Bill Zeller Priority: Minor Attachments: hdfs-492-4.patch Original Estimate: 48h Remaining Estimate: 48h This adds two additional functions to FSNamesystem to provide more information about corrupt replicas. It also adds two servlets to the namenode that provide information (in JSON) about all blocks with corrupt replicas as well as information about a specific block. It also changes the file browsing servlet by adding a link from block ids to the above mentioned block information page. These JSON pages are designed to be used by client side tools which wish to analyze corrupt block/replicas. The only change to an existing (non-servlet) class is described below. Currently, CorruptReplicasMap stores a map of corrupt replica information and allows insertion and deletion. It also gives information about the corrupt replicas for a specific block. It does not allow iteration over all corrupt blocks. Two additional functions will be added to FSNamesystem (which will call BlockManager which will call CorruptReplicasMap). The first will return the size of the corrupt replicas map, which represents the number of blocks that have corrupt replicas (but less than the number of corrupt replicas if a block has multiple corrupt replicas). 
The second will allow paging through a list of block ids that contain corrupt replicas: {{public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId)}} {{n}} is the number of block ids to return and {{startingBlockId}} is the block id offset. To prevent a large number of items being returned at one time, n is constrained to 0 <= {{n}} <= 100. If {{startingBlockId}} is null, up to {{n}} items are returned starting at the beginning of the list. Ordering is enforced through the internal use of TreeMap in CorruptReplicasMap. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HDFS-492) Expose corrupt replica/block information
Expose corrupt replica/block information Key: HDFS-492 URL: https://issues.apache.org/jira/browse/HDFS-492 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, name-node Affects Versions: 0.21.0 Reporter: Bill Zeller Priority: Minor This adds two additional functions to FSNamesystem to provide more information about corrupt replicas. It also adds two servlets to the namenode that provide information (in JSON) about all blocks with corrupt replicas as well as information about a specific block. It also changes the file browsing servlet by adding a link from block ids to the above mentioned block information page. These JSON pages are designed to be used by client side tools which wish to analyze corrupt block/replicas. The only change to an existing (non-servlet) class is described below). Currently, CorruptReplicasMap stores a map of corrupt replica information and allows insertion and deletion. It also gives information about the corrupt replicas for a specific block. It does not allow iteration over all corrupt blocks. Two additional functions will be added to FSNamesystem (which will call BlockManager which will call CorruptReplicasMap). The first will return the size of the corrupt replicas map, which represents the number of blocks that have corrupt replicas (but less than the number of corrupt replicas if a block has multiple corrupt replicas). The second will allow paging through a list of block ids that contain corrupt replicas: {{ public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId) }} {{n}} is the number of block ids to return and {{startingBlockId}} is the block id offset. To prevent a large number of items being returned at one time, n is constrained to 0 <= {{n}} <= 100. If {{startingBlockId}} is null, up to {{n}} items are returned starting at the beginning of the list. Ordering is enforced through the internal use of TreeMap in CorruptReplicasMap. -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-492) Expose corrupt replica/block information
[ https://issues.apache.org/jira/browse/HDFS-492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Zeller updated HDFS-492: - Description: This adds two additional functions to FSNamesystem to provide more information about corrupt replicas. It also adds two servlets to the namenode that provide information (in JSON) about all blocks with corrupt replicas as well as information about a specific block. It also changes the file browsing servlet by adding a link from block ids to the above mentioned block information page. These JSON pages are designed to be used by client side tools which wish to analyze corrupt block/replicas. The only change to an existing (non-servlet) class is described below. Currently, CorruptReplicasMap stores a map of corrupt replica information and allows insertion and deletion. It also gives information about the corrupt replicas for a specific block. It does not allow iteration over all corrupt blocks. Two additional functions will be added to FSNamesystem (which will call BlockManager which will call CorruptReplicasMap). The first will return the size of the corrupt replicas map, which represents the number of blocks that have corrupt replicas (but less than the number of corrupt replicas if a block has multiple corrupt replicas). The second will allow paging through a list of block ids that contain corrupt replicas: {{public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId)}} {{n}} is the number of block ids to return and {{startingBlockId}} is the block id offset. To prevent a large number of items being returned at one time, n is constrained to 0 <= {{n}} <= 100. If {{startingBlockId}} is null, up to {{n}} items are returned starting at the beginning of the list. Ordering is enforced through the internal use of TreeMap in CorruptReplicasMap. was: This adds two additional functions to FSNamesystem to provide more information about corrupt replicas. 
It also adds two servlets to the namenode that provide information (in JSON) about all blocks with corrupt replicas as well as information about a specific block. It also changes the file browsing servlet by adding a link from block ids to the above mentioned block information page. These JSON pages are designed to be used by client side tools which wish to analyze corrupt block/replicas. The only change to an existing (non-servlet) class is described below). Currently, CorruptReplicasMap stores a map of corrupt replica information and allows insertion and deletion. It also gives information about the corrupt replicas for a specific block. It does not allow iteration over all corrupt blocks. Two additional functions will be added to FSNamesystem (which will call BlockManager which will call CorruptReplicasMap). The first will return the size of the corrupt replicas map, which represents the number of blocks that have corrupt replicas (but less than the number of corrupt replicas if a block has multiple corrupt replicas). The second will allow paging through a list of block ids that contain corrupt replicas: {{public synchronized List<Long> getCorruptReplicaBlockIds(int n, Long startingBlockId)}} {{n}} is the number of block ids to return and {{startingBlockId}} is the block id offset. To prevent a large number of items being returned at one time, n is constrained to 0 <= {{n}} <= 100. If {{startingBlockId}} is null, up to {{n}} items are returned starting at the beginning of the list. Ordering is enforced through the internal use of TreeMap in CorruptReplicasMap. Expose corrupt replica/block information Key: HDFS-492 URL: https://issues.apache.org/jira/browse/HDFS-492 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, name-node Affects Versions: 0.21.0 Reporter: Bill Zeller Priority: Minor Original Estimate: 48h Remaining Estimate: 48h This adds two additional functions to FSNamesystem to provide more information about corrupt replicas. 
It also adds two servlets to the namenode that provide information (in JSON) about all blocks with corrupt replicas as well as information about a specific block. It also changes the file browsing servlet by adding a link from block ids to the above mentioned block information page. These JSON pages are designed to be used by client side tools which wish to analyze corrupt block/replicas. The only change to an existing (non-servlet) class is described below. Currently, CorruptReplicasMap stores a map of corrupt replica information and allows insertion and deletion. It also gives information about the corrupt replicas for a specific block. It does not allow iteration over all corrupt blocks. Two additional functions will be added to FSNamesystem (which will call BlockManager