[jira] [Updated] (HDFS-10737) disk balance reporter print null for the volume's path
[ https://issues.apache.org/jira/browse/HDFS-10737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanbo Liu updated HDFS-10737: -- Description: reproduction steps: 1. hdfs diskbalancer -plan xxx.xx(host name of datanode) 2. If plan json is created successfully, run hdfs diskbalancer -report -node xxx.xx the output info is here: {noformat} [DISK: volume-null] - 0.00 used: 45997/101122146304, 1.00 free: 101122100307/101122146304, isFailed: False, isReadOnly: False, isSkip: False, isTransient: False. {noformat} {{vol.getPath()}} returns null in {{ReportCommand#handleTopReport}} was: reproduction steps: 1. hdfs diskbalancer -plan xxx.xx(host name of datanode) 2. If plan json is created successfully, run hdfs diskbalancer -report xxx.xx the output info is here: {noformat} [DISK: volume-null] - 0.00 used: 45997/101122146304, 1.00 free: 101122100307/101122146304, isFailed: False, isReadOnly: False, isSkip: False, isTransient: False. {noformat} {{vol.getPath()}} returns null in {{ReportCommand#handleTopReport}} > disk balance reporter print null for the volume's path > -- > > Key: HDFS-10737 > URL: https://issues.apache.org/jira/browse/HDFS-10737 > Project: Hadoop HDFS > Issue Type: Bug > Components: diskbalancer, hdfs >Reporter: Yuanbo Liu >Assignee: Yuanbo Liu > > reproduction steps: > 1. hdfs diskbalancer -plan xxx.xx(host name of datanode) > 2. If plan json is created successfully, run > hdfs diskbalancer -report -node xxx.xx > the output info is here: > {noformat} > [DISK: volume-null] - 0.00 used: 45997/101122146304, 1.00 free: > 101122100307/101122146304, isFailed: False, isReadOnly: False, isSkip: False, > isTransient: False. > {noformat} > {{vol.getPath()}} returns null in {{ReportCommand#handleTopReport}} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
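The report prints "volume-null" because {{vol.getPath()}} comes back null when the report line is formatted. A minimal, self-contained sketch of the defensive-formatting idea (class and method names here are illustrative stand-ins, not the actual ReportCommand code):

```java
// Hypothetical sketch: format a disk-balancer report line while guarding
// against a volume whose path was never populated on the reporting side.
class VolumeInfo {
    private final String path;   // may be null if the datanode did not report it
    private final long used;
    private final long capacity;

    VolumeInfo(String path, long used, long capacity) {
        this.path = path;
        this.used = used;
        this.capacity = capacity;
    }

    String reportLine() {
        // Fall back to an explicit placeholder instead of printing "volume-null".
        String p = (path != null) ? path : "<unknown>";
        return String.format("[DISK: volume-%s] - used: %d/%d", p, used, capacity);
    }
}
```

The real fix would instead ensure the path is populated before the report is rendered; the placeholder only makes the failure visible rather than misleading.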
[jira] [Commented] (HDFS-9530) ReservedSpace is not cleared for abandoned Blocks
[ https://issues.apache.org/jira/browse/HDFS-9530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414657#comment-15414657 ] Arpit Agarwal commented on HDFS-9530: - Hi [~srikanth.sampath], thanks for the report. I just did a dry run cherry-pick to branch-2.6 and there was a single conflict that looks straightforward to resolve. [~brahmareddy], do you want to take a crack at backporting this to branch-2.6? If not I can I do so. > ReservedSpace is not cleared for abandoned Blocks > - > > Key: HDFS-9530 > URL: https://issues.apache.org/jira/browse/HDFS-9530 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Fei Hui >Assignee: Brahma Reddy Battula >Priority: Critical > Fix For: 2.7.3 > > Attachments: HDFS-9530-01.patch, HDFS-9530-02.patch, > HDFS-9530-03.patch, HDFS-9530-branch-2.7-001.patch, > HDFS-9530-branch-2.7-002.patch > > > i think there are bugs in HDFS > === > here is config > > dfs.datanode.data.dir > > > file:///mnt/disk4,file:///mnt/disk1,file:///mnt/disk3,file:///mnt/disk2 > > > here is dfsadmin report > [hadoop@worker-1 ~]$ hadoop dfsadmin -report > DEPRECATED: Use of this script to execute hdfs command is deprecated. > Instead use the hdfs command for it. 
> Configured Capacity: 240769253376 (224.23 GB) > Present Capacity: 238604832768 (222.22 GB) > DFS Remaining: 215772954624 (200.95 GB) > DFS Used: 22831878144 (21.26 GB) > DFS Used%: 9.57% > Under replicated blocks: 4 > Blocks with corrupt replicas: 0 > Missing blocks: 0 > - > Live datanodes (3): > Name: 10.117.60.59:50010 (worker-2) > Hostname: worker-2 > Decommission Status : Normal > Configured Capacity: 80256417792 (74.74 GB) > DFS Used: 7190958080 (6.70 GB) > Non DFS Used: 721473536 (688.05 MB) > DFS Remaining: 72343986176 (67.38 GB) > DFS Used%: 8.96% > DFS Remaining%: 90.14% > Configured Cache Capacity: 0 (0 B) > Cache Used: 0 (0 B) > Cache Remaining: 0 (0 B) > Cache Used%: 100.00% > Cache Remaining%: 0.00% > Xceivers: 1 > Last contact: Wed Dec 09 15:55:02 CST 2015 > Name: 10.168.156.0:50010 (worker-3) > Hostname: worker-3 > Decommission Status : Normal > Configured Capacity: 80256417792 (74.74 GB) > DFS Used: 7219073024 (6.72 GB) > Non DFS Used: 721473536 (688.05 MB) > DFS Remaining: 72315871232 (67.35 GB) > DFS Used%: 9.00% > DFS Remaining%: 90.11% > Configured Cache Capacity: 0 (0 B) > Cache Used: 0 (0 B) > Cache Remaining: 0 (0 B) > Cache Used%: 100.00% > Cache Remaining%: 0.00% > Xceivers: 1 > Last contact: Wed Dec 09 15:55:03 CST 2015 > Name: 10.117.15.38:50010 (worker-1) > Hostname: worker-1 > Decommission Status : Normal > Configured Capacity: 80256417792 (74.74 GB) > DFS Used: 8421847040 (7.84 GB) > Non DFS Used: 721473536 (688.05 MB) > DFS Remaining: 71113097216 (66.23 GB) > DFS Used%: 10.49% > DFS Remaining%: 88.61% > Configured Cache Capacity: 0 (0 B) > Cache Used: 0 (0 B) > Cache Remaining: 0 (0 B) > Cache Used%: 100.00% > Cache Remaining%: 0.00% > Xceivers: 1 > Last contact: Wed Dec 09 15:55:03 CST 2015 > > when running hive job , dfsadmin report as follows > [hadoop@worker-1 ~]$ hadoop dfsadmin -report > DEPRECATED: Use of this script to execute hdfs command is deprecated. > Instead use the hdfs command for it. 
> Configured Capacity: 240769253376 (224.23 GB) > Present Capacity: 108266011136 (100.83 GB) > DFS Remaining: 80078416384 (74.58 GB) > DFS Used: 28187594752 (26.25 GB) > DFS Used%: 26.04% > Under replicated blocks: 7 > Blocks with corrupt replicas: 0 > Missing blocks: 0 > - > Live datanodes (3): > Name: 10.117.60.59:50010 (worker-2) > Hostname: worker-2 > Decommission Status : Normal > Configured Capacity: 80256417792 (74.74 GB) > DFS Used: 9015627776 (8.40 GB) > Non DFS Used: 44303742464 (41.26 GB) > DFS Remaining: 26937047552 (25.09 GB) > DFS Used%: 11.23% > DFS Remaining%: 33.56% > Configured Cache Capacity: 0 (0 B) > Cache Used: 0 (0 B) > Cache Remaining: 0 (0 B) > Cache Used%: 100.00% > Cache Remaining%: 0.00% > Xceivers: 693 > Last contact: Wed Dec 09 15:37:35 CST 2015 > Name: 10.168.156.0:50010 (worker-3) > Hostname: worker-3 > Decommission Status : Normal > Configured Capacity: 80256417792 (74.74 GB) > DFS Used: 9163116544 (8.53 GB) > Non DFS Used: 47895897600 (44.61 GB) > DFS Remaining: 23197403648 (21.60 GB) > DFS Used%: 11.42% > DFS Remaining%: 28.90% > Configured Cache Capacity: 0 (0 B) > Cache Used: 0 (0 B) > Cache Remaining: 0 (0 B) > Cache Used%: 100.00% > Cache Remaining%: 0.00% > Xceivers: 750 > Last contact: Wed Dec 09 15:37:36 CST 2015 > Name: 10
[jira] [Comment Edited] (HDFS-8957) Consolidate client striping input stream codes for stateful read and positional read
[ https://issues.apache.org/jira/browse/HDFS-8957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414620#comment-15414620 ] Youwei Wang edited comment on HDFS-8957 at 8/10/16 2:48 AM: Note: this patch "HDFS-8957.v3.patch" is based on the Hadoop trunk. Corresponding commitid is: commit 7992c0b42ceb10fd3ca6c4ced4f59b8e8998e046 Author: Karthik Kambatla Date: Tue Aug 9 16:50:57 2016 -0700 was (Author: hayabusa): Note: this patch is based on the Hadoop trunk. Corresponding commitid is: commit 7992c0b42ceb10fd3ca6c4ced4f59b8e8998e046 Author: Karthik Kambatla Date: Tue Aug 9 16:50:57 2016 -0700 > Consolidate client striping input stream codes for stateful read and > positional read > > > Key: HDFS-8957 > URL: https://issues.apache.org/jira/browse/HDFS-8957 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Reporter: Kai Zheng >Assignee: Youwei Wang > Attachments: HDFS-8957-v1.patch, HDFS-8957.v2.patch, > HDFS-8957.v3.patch > > > Currently we have different implementations for client striping read, having > both *StatefulStripeReader* and *PositionStripeReader*. I attempted to > consolidate the two implementations into one, and it results in much simpler > codes, and also better performance. Now in both read paths, it will: > * Use pooled ByteBuffers, as currently stateful read does; > * Read directly into application's buffer, as currently positional read does; > * Try to align and merge multiple stripes, as currently positional read does; > * Use *ECChunk* version decode API. > The resultant *StripeReader* is approaching very near now to the ideal state > desired by next step, employing *ErasureCoder* API instead of > *RawErasureCoder* API. > Will upload an initial patch to illustrate the rough change, even though it > depends on other issues. 
[jira] [Updated] (HDFS-8957) Consolidate client striping input stream codes for stateful read and positional read
[ https://issues.apache.org/jira/browse/HDFS-8957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Youwei Wang updated HDFS-8957: -- Attachment: HDFS-8957.v3.patch Note: this patch is based on the Hadoop trunk. Corresponding commitid is: commit 7992c0b42ceb10fd3ca6c4ced4f59b8e8998e046 Author: Karthik Kambatla Date: Tue Aug 9 16:50:57 2016 -0700 > Consolidate client striping input stream codes for stateful read and > positional read > > > Key: HDFS-8957 > URL: https://issues.apache.org/jira/browse/HDFS-8957 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Reporter: Kai Zheng >Assignee: Youwei Wang > Attachments: HDFS-8957-v1.patch, HDFS-8957.v2.patch, > HDFS-8957.v3.patch > > > Currently we have different implementations for client striping read, having > both *StatefulStripeReader* and *PositionStripeReader*. I attempted to > consolidate the two implementations into one, and it results in much simpler > codes, and also better performance. Now in both read paths, it will: > * Use pooled ByteBuffers, as currently stateful read does; > * Read directly into application's buffer, as currently positional read does; > * Try to align and merge multiple stripes, as currently positional read does; > * Use *ECChunk* version decode API. > The resultant *StripeReader* is approaching very near now to the ideal state > desired by next step, employing *ErasureCoder* API instead of > *RawErasureCoder* API. > Will upload an initial patch to illustrate the rough change, even though it > depends on other issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10738) Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test failure
[ https://issues.apache.org/jira/browse/HDFS-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414615#comment-15414615 ] Rakesh R commented on HDFS-10738: - Thanks [~kihwal] for final reviews and committing the patch. > Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test > failure > > > Key: HDFS-10738 > URL: https://issues.apache.org/jira/browse/HDFS-10738 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Rakesh R >Assignee: Rakesh R > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: HDFS-10738-00.patch, HDFS-10738-01.patch > > > This jira is to analyse and fix the test case failure, which is failing in > Jenkins build, > [Build_16326|https://builds.apache.org/job/PreCommit-HDFS-Build/16326/testReport/org.apache.hadoop.security/TestRefreshUserMappings/testRefreshSuperUserGroupsConfiguration/] > very frequently. > {code} > Error Message > first auth for user2 should've succeeded: User: super_userL is not allowed to > impersonate userL2 > Stacktrace > java.lang.AssertionError: first auth for user2 should've succeeded: User: > super_userL is not allowed to impersonate userL2 > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.security.TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration(TestRefreshUserMappings.java:200) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-8957) Consolidate client striping input stream codes for stateful read and positional read
[ https://issues.apache.org/jira/browse/HDFS-8957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414607#comment-15414607 ] Youwei Wang edited comment on HDFS-8957 at 8/10/16 2:40 AM: The uploaded patch "HDFS-8957.v2.patch" depends on the HDFS-8901.v14.patch at this link: https://issues.apache.org/jira/browse/HDFS-8901 was (Author: hayabusa): This patch depends on the HDFS-8901.v14.patch at this link: https://issues.apache.org/jira/browse/HDFS-8901 > Consolidate client striping input stream codes for stateful read and > positional read > > > Key: HDFS-8957 > URL: https://issues.apache.org/jira/browse/HDFS-8957 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Reporter: Kai Zheng >Assignee: Youwei Wang > Attachments: HDFS-8957-v1.patch, HDFS-8957.v2.patch > > > Currently we have different implementations for client striping read, having > both *StatefulStripeReader* and *PositionStripeReader*. I attempted to > consolidate the two implementations into one, and it results in much simpler > codes, and also better performance. Now in both read paths, it will: > * Use pooled ByteBuffers, as currently stateful read does; > * Read directly into application's buffer, as currently positional read does; > * Try to align and merge multiple stripes, as currently positional read does; > * Use *ECChunk* version decode API. > The resultant *StripeReader* is approaching very near now to the ideal state > desired by next step, employing *ErasureCoder* API instead of > *RawErasureCoder* API. > Will upload an initial patch to illustrate the rough change, even though it > depends on other issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8957) Consolidate client striping input stream codes for stateful read and positional read
[ https://issues.apache.org/jira/browse/HDFS-8957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Youwei Wang updated HDFS-8957: -- Attachment: HDFS-8957.v2.patch This patch depends on the HDFS-8901.v14.patch at this link: https://issues.apache.org/jira/browse/HDFS-8901 > Consolidate client striping input stream codes for stateful read and > positional read > > > Key: HDFS-8957 > URL: https://issues.apache.org/jira/browse/HDFS-8957 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Reporter: Kai Zheng >Assignee: Youwei Wang > Attachments: HDFS-8957-v1.patch, HDFS-8957.v2.patch > > > Currently we have different implementations for client striping read, having > both *StatefulStripeReader* and *PositionStripeReader*. I attempted to > consolidate the two implementations into one, and it results in much simpler > codes, and also better performance. Now in both read paths, it will: > * Use pooled ByteBuffers, as currently stateful read does; > * Read directly into application's buffer, as currently positional read does; > * Try to align and merge multiple stripes, as currently positional read does; > * Use *ECChunk* version decode API. > The resultant *StripeReader* is approaching very near now to the ideal state > desired by next step, employing *ErasureCoder* API instead of > *RawErasureCoder* API. > Will upload an initial patch to illustrate the rough change, even though it > depends on other issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10645) Make block report size as a metric and add this metric to datanode web ui
[ https://issues.apache.org/jira/browse/HDFS-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanbo Liu updated HDFS-10645: -- Attachment: HDFS-10645.009.patch > Make block report size as a metric and add this metric to datanode web ui > - > > Key: HDFS-10645 > URL: https://issues.apache.org/jira/browse/HDFS-10645 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, ui >Reporter: Yuanbo Liu >Assignee: Yuanbo Liu > Attachments: HDFS-10645.001.patch, HDFS-10645.002.patch, > HDFS-10645.003.patch, HDFS-10645.004.patch, HDFS-10645.005.patch, > HDFS-10645.006.patch, HDFS-10645.007.patch, HDFS-10645.008.patch, > HDFS-10645.009.patch, Selection_047.png, Selection_048.png > > > Record block report size as a metric and show it on datanode UI. It's > important for administrators to know the bottleneck of block report, and the > metric is also a good tuning metric. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10645) Make block report size as a metric and add this metric to datanode web ui
[ https://issues.apache.org/jira/browse/HDFS-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414601#comment-15414601 ] Yuanbo Liu commented on HDFS-10645: --- Uploaded v9 patch to address [~ajisakaa]'s comment. > Make block report size as a metric and add this metric to datanode web ui > - > > Key: HDFS-10645 > URL: https://issues.apache.org/jira/browse/HDFS-10645 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, ui >Reporter: Yuanbo Liu >Assignee: Yuanbo Liu > Attachments: HDFS-10645.001.patch, HDFS-10645.002.patch, > HDFS-10645.003.patch, HDFS-10645.004.patch, HDFS-10645.005.patch, > HDFS-10645.006.patch, HDFS-10645.007.patch, HDFS-10645.008.patch, > Selection_047.png, Selection_048.png > > > Record block report size as a metric and show it on datanode UI. It's > important for administrators to know the bottleneck of block report, and the > metric is also a good tuning metric. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10645) Make block report size as a metric and add this metric to datanode web ui
[ https://issues.apache.org/jira/browse/HDFS-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanbo Liu updated HDFS-10645: -- Attachment: (was: HDFS-10645.009.patch) > Make block report size as a metric and add this metric to datanode web ui > - > > Key: HDFS-10645 > URL: https://issues.apache.org/jira/browse/HDFS-10645 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, ui >Reporter: Yuanbo Liu >Assignee: Yuanbo Liu > Attachments: HDFS-10645.001.patch, HDFS-10645.002.patch, > HDFS-10645.003.patch, HDFS-10645.004.patch, HDFS-10645.005.patch, > HDFS-10645.006.patch, HDFS-10645.007.patch, HDFS-10645.008.patch, > Selection_047.png, Selection_048.png > > > Record block report size as a metric and show it on datanode UI. It's > important for administrators to know the bottleneck of block report, and the > metric is also a good tuning metric. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10645) Make block report size as a metric and add this metric to datanode web ui
[ https://issues.apache.org/jira/browse/HDFS-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanbo Liu updated HDFS-10645: -- Attachment: HDFS-10645.009.patch > Make block report size as a metric and add this metric to datanode web ui > - > > Key: HDFS-10645 > URL: https://issues.apache.org/jira/browse/HDFS-10645 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, ui >Reporter: Yuanbo Liu >Assignee: Yuanbo Liu > Attachments: HDFS-10645.001.patch, HDFS-10645.002.patch, > HDFS-10645.003.patch, HDFS-10645.004.patch, HDFS-10645.005.patch, > HDFS-10645.006.patch, HDFS-10645.007.patch, HDFS-10645.008.patch, > HDFS-10645.009.patch, Selection_047.png, Selection_048.png > > > Record block report size as a metric and show it on datanode UI. It's > important for administrators to know the bottleneck of block report, and the > metric is also a good tuning metric. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-8224) Schedule a block for scanning if its metadata file is corrupt
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414585#comment-15414585 ] Hudson commented on HDFS-8224: -- SUCCESS: Integrated in Hadoop-trunk-Commit #10251 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10251/]) HDFS-8224. Schedule a block for scanning if its metadata file is (weichiu: rev d00d3add9e3c7ac7e79bb99b615bcfaeed892b96) * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/InvalidChecksumSizeException.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDiskError.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/DataChecksum.java > Schedule a block for scanning if its metadata file is corrupt > - > > Key: HDFS-8224 > URL: https://issues.apache.org/jira/browse/HDFS-8224 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-8224-trunk-1.patch, HDFS-8224-trunk-2.patch, > HDFS-8224-trunk-3.patch, HDFS-8224-trunk.patch > > > This happened in our 2.6 cluster. > One of the block and its metadata file were corrupted. > The disk was healthy in this case. > Only the block was corrupt. 
> Namenode tried to copy that block to another datanode but failed with the > following stack trace: > 2015-04-20 01:04:04,421 > [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN > datanode.DataNode: DatanodeRegistration(a.b.c.d, > datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, > infoSecurePort=0, ipcPort=8020, > storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed > to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to > a1.b1.c1.d1:1004 got > java.io.IOException: Could not create DataChecksum of type 0 with > bytesPerChecksum 0 > at > org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102) > at > org.apache.hadoop.hdfs.server.datanode.BlockSender.(BlockSender.java:287) > at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989) > at java.lang.Thread.run(Thread.java:722) > The following catch block in DataTransfer#run method will treat every > IOException as disk error fault and run disk errror > {noformat} > catch (IOException ie) { > LOG.warn(bpReg + ":Failed to transfer " + b + " to " + > targets[0] + " got ", ie); > // check if there are any disk problem > checkDiskErrorAsync(); > } > {noformat} > This block was never scanned by BlockPoolSliceScanner otherwise it would have > reported as corrupt block. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
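The commit above adds InvalidChecksumSizeException so that a corrupt metadata header (checksum type 0, bytesPerChecksum 0) can be told apart from a genuine disk fault in that catch block. A simplified sketch of the dispatch idea, using local stand-in classes rather than the actual DataNode code:

```java
// Sketch: route a corrupt-metadata failure to a block rescan instead of
// treating every IOException as a disk error. The exception class mirrors
// the one added by the commit; the handler fields are illustrative.
class InvalidChecksumSizeException extends java.io.IOException {
    InvalidChecksumSizeException(String msg) { super(msg); }
}

class TransferFailureHandler {
    boolean rescanScheduled = false;     // block metadata corrupt: scan the block
    boolean diskCheckScheduled = false;  // anything else: check the disk

    void handleFailure(java.io.IOException ie) {
        if (ie instanceof InvalidChecksumSizeException) {
            rescanScheduled = true;
        } else {
            diskCheckScheduled = true;
        }
    }
}
```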
[jira] [Commented] (HDFS-10645) Make block report size as a metric and add this metric to datanode web ui
[ https://issues.apache.org/jira/browse/HDFS-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414581#comment-15414581 ] Yuanbo Liu commented on HDFS-10645: --- [~ajisakaa] Yes I agree with you, since {{maxDataLength}} is part of {{BPServiceActorInfo}}, we should document high level metric. > Make block report size as a metric and add this metric to datanode web ui > - > > Key: HDFS-10645 > URL: https://issues.apache.org/jira/browse/HDFS-10645 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, ui >Reporter: Yuanbo Liu >Assignee: Yuanbo Liu > Attachments: HDFS-10645.001.patch, HDFS-10645.002.patch, > HDFS-10645.003.patch, HDFS-10645.004.patch, HDFS-10645.005.patch, > HDFS-10645.006.patch, HDFS-10645.007.patch, HDFS-10645.008.patch, > Selection_047.png, Selection_048.png > > > Record block report size as a metric and show it on datanode UI. It's > important for administrators to know the bottleneck of block report, and the > metric is also a good tuning metric. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10682) Replace FsDatasetImpl object lock with a separate lock object
[ https://issues.apache.org/jira/browse/HDFS-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414579#comment-15414579 ] Hadoop QA commented on HDFS-10682: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 18m 57s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 45s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s{color} | {color:green} branch-2 passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s{color} | {color:green} branch-2 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 1s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 4s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s{color} | {color:green} branch-2 passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 39s{color} | {color:green} branch-2 passed with JDK v1.7.0_101 
{color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 30s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 508 unchanged - 11 fixed = 509 total (was 519) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 38s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 73m 0s{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_101. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}193m 7s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_101 Failed junit tests | hadoop.hdfs.server.namenode.ha.TestBootstrapStandby | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl | | JDK v1.7.0_101 Failed junit tests | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure | | | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes | | JDK v1.7.0_101 Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:b59b8b7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12822840/HDFS-10682-branch-2.003.patch | | JIRA Issu
[jira] [Comment Edited] (HDFS-10690) Optimize insertion/removal of replica in ShortCircuitCache.java
[ https://issues.apache.org/jira/browse/HDFS-10690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15407012#comment-15407012 ] Fenghua Hu edited comment on HDFS-10690 at 8/10/16 1:30 AM: Xiaoyu [~xyao], I tried to replace TreeMap with LinkedHashMap, but found that LinkedHashMap lacks a "ceilingEntry" method or a similar alternative, which is key to implementing an LRU-based replacement algorithm. LinkedHashMap also provides no getYoungest, getEldest, or similar functions. That is to say, if we want to use LinkedHashMap, we would actually need to rewrite it. Any comments? Thanks. Finally I found the correct email for you :-) > Optimize insertion/removal of replica in ShortCircuitCache.java > --- > > Key: HDFS-10690 > URL: https://issues.apache.org/jira/browse/HDFS-10690 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client > Affects Versions: 3.0.0-alpha2 > Reporter: Fenghua Hu > Assignee: Fenghua Hu > Attachments: HDFS-10690.001.patch, HDFS-10690.002.patch > > Original Estimate: 336h > Remaining Estimate: 336h > > Currently in ShortCircuitCache, two TreeMap objects are used to track the cached replicas: > private final TreeMap<Long, ShortCircuitReplica> evictable = new TreeMap<>(); > private final TreeMap<Long, ShortCircuitReplica> evictableMmapped = new TreeMap<>(); > TreeMap uses a Red-Black tree for sorting. This isn't an issue with traditional HDDs, but with high-performance SSD/PCIe flash the cost of inserting/removing an entry becomes considerable. > To mitigate this, we designed a new list-based structure for replica tracking.
> The list is a doubly-linked FIFO. FIFO order is time-based, so insertion is a very low-cost operation. On the other hand, a list is not lookup-friendly. To address this, we introduce two references into the ShortCircuitReplica object: > ShortCircuitReplica next = null; > ShortCircuitReplica prev = null; > This way, no lookup is needed when removing a replica from the list; we only need to modify its predecessor's and successor's references. > Our tests showed a 15-50% performance improvement when using PCIe flash as the storage media. > The original patch is against 2.6.4; I am now porting it to Hadoop trunk, and the patch will be posted soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
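The doubly-linked FIFO with embedded prev/next references described in the issue can be sketched as follows. This is an illustrative, self-contained sketch, not the actual patch: the class names (Replica, FifoEvictionList) stand in for ShortCircuitReplica and the cache's internal eviction list.

```java
// Intrusive doubly-linked FIFO: the links live inside the entries themselves,
// so removal needs no lookup -- just relink the neighbors. O(1) insert/remove.
class Replica {
    final String blockId;
    Replica prev, next; // embedded links, analogous to the prev/next fields above
    Replica(String blockId) { this.blockId = blockId; }
}

class FifoEvictionList {
    private Replica head, tail; // head = eldest, tail = youngest

    void append(Replica r) {            // O(1): new entries are always youngest
        r.prev = tail; r.next = null;
        if (tail != null) tail.next = r; else head = r;
        tail = r;
    }

    void remove(Replica r) {            // O(1): relink neighbors, no traversal
        if (r.prev != null) r.prev.next = r.next; else head = r.next;
        if (r.next != null) r.next.prev = r.prev; else tail = r.prev;
        r.prev = r.next = null;
    }

    Replica eldest() { return head; }   // eviction candidate (oldest entry)
}
```

Because FIFO order is insertion order, the eldest entry is simply the head, which is what a TreeMap's firstEntry()/ceilingEntry() navigation was previously used for.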
[jira] [Commented] (HDFS-8224) Schedule a block for scanning if its metadata file is corrupt
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414567#comment-15414567 ] Wei-Chiu Chuang commented on HDFS-8224: --- Committed to trunk. Thanks a lot [~shahrs87] for the patch and [~kihwal] for the comments. Can you also upload a branch-2 patch? There are conflicts when cherry-picking from trunk. > Schedule a block for scanning if its metadata file is corrupt > - > > Key: HDFS-8224 > URL: https://issues.apache.org/jira/browse/HDFS-8224 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode > Affects Versions: 2.6.0 > Reporter: Rushabh S Shah > Assignee: Rushabh S Shah > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-8224-trunk-1.patch, HDFS-8224-trunk-2.patch, HDFS-8224-trunk-3.patch, HDFS-8224-trunk.patch > > > This happened in our 2.6 cluster. > One block and its metadata file were corrupted. > The disk was healthy in this case; only the block was corrupt. > The Namenode tried to copy that block to another datanode but failed with the following stack trace: > 2015-04-20 01:04:04,421 [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN datanode.DataNode: DatanodeRegistration(a.b.c.d, datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, infoSecurePort=0, ipcPort=8020, storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to a1.b1.c1.d1:1004 got > java.io.IOException: Could not create DataChecksum of type 0 with bytesPerChecksum 0 > at org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125) > at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175) > at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140) > at
org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102) > at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:287) > at org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989) > at java.lang.Thread.run(Thread.java:722) > The following catch block in the DataTransfer#run method treats every IOException as a disk fault and runs a disk error check: > {noformat} > catch (IOException ie) { > LOG.warn(bpReg + ":Failed to transfer " + b + " to " + targets[0] + " got ", ie); > // check if there are any disk problem > checkDiskErrorAsync(); > } > {noformat} > This block was never scanned by the BlockPoolSliceScanner; otherwise it would have been reported as a corrupt block. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
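The core idea of the fix can be sketched as follows. This is an illustrative sketch only, not the actual patch: the class and method names (TransferFailureClassifier, classify) are invented for this example, and matching on the exception message is a stand-in for whatever mechanism the real code uses to recognize an unreadable checksum header. The point is the control flow: a failure caused by a corrupt metadata header should schedule the block for scanning rather than trigger a disk-error check, since the disk itself may be healthy.

```java
import java.io.IOException;

// Sketch: classify a block-transfer failure so that corrupt metadata leads to
// a block scan instead of being treated as a disk fault.
class TransferFailureClassifier {
    enum Action { SCHEDULE_BLOCK_SCAN, CHECK_DISK }

    static Action classify(IOException ie) {
        // "Could not create DataChecksum" is the error quoted in this report:
        // the metadata header itself is unreadable, i.e. the block is corrupt.
        String msg = ie.getMessage();
        if (msg != null && msg.startsWith("Could not create DataChecksum")) {
            return Action.SCHEDULE_BLOCK_SCAN;
        }
        return Action.CHECK_DISK; // other IOExceptions may still be disk trouble
    }
}
```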
[jira] [Updated] (HDFS-8224) Schedule a block for scanning if its metadata file is corrupt
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-8224: -- Fix Version/s: 3.0.0-alpha2 > Schedule a block for scanning if its metadata file is corrupt > - > > Key: HDFS-8224 > URL: https://issues.apache.org/jira/browse/HDFS-8224 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-8224-trunk-1.patch, HDFS-8224-trunk-2.patch, > HDFS-8224-trunk-3.patch, HDFS-8224-trunk.patch > > > This happened in our 2.6 cluster. > One of the block and its metadata file were corrupted. > The disk was healthy in this case. > Only the block was corrupt. > Namenode tried to copy that block to another datanode but failed with the > following stack trace: > 2015-04-20 01:04:04,421 > [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN > datanode.DataNode: DatanodeRegistration(a.b.c.d, > datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, > infoSecurePort=0, ipcPort=8020, > storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed > to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to > a1.b1.c1.d1:1004 got > java.io.IOException: Could not create DataChecksum of type 0 with > bytesPerChecksum 0 > at > org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102) > at > org.apache.hadoop.hdfs.server.datanode.BlockSender.(BlockSender.java:287) > at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989) > 
at java.lang.Thread.run(Thread.java:722) > The following catch block in the DataTransfer#run method treats every IOException as a disk fault and runs a disk error check: > {noformat} > catch (IOException ie) { > LOG.warn(bpReg + ":Failed to transfer " + b + " to " + targets[0] + " got ", ie); > // check if there are any disk problem > checkDiskErrorAsync(); > } > {noformat} > This block was never scanned by the BlockPoolSliceScanner; otherwise it would have been reported as a corrupt block. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8224) Schedule a block for scanning if its metadata file is corrupt
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-8224: -- Target Version/s: 2.8.0 Hadoop Flags: Reviewed Fix Version/s: (was: 2.8.0) > Schedule a block for scanning if its metadata file is corrupt > - > > Key: HDFS-8224 > URL: https://issues.apache.org/jira/browse/HDFS-8224 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Attachments: HDFS-8224-trunk-1.patch, HDFS-8224-trunk-2.patch, > HDFS-8224-trunk-3.patch, HDFS-8224-trunk.patch > > > This happened in our 2.6 cluster. > One of the block and its metadata file were corrupted. > The disk was healthy in this case. > Only the block was corrupt. > Namenode tried to copy that block to another datanode but failed with the > following stack trace: > 2015-04-20 01:04:04,421 > [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN > datanode.DataNode: DatanodeRegistration(a.b.c.d, > datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, > infoSecurePort=0, ipcPort=8020, > storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed > to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to > a1.b1.c1.d1:1004 got > java.io.IOException: Could not create DataChecksum of type 0 with > bytesPerChecksum 0 > at > org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102) > at > org.apache.hadoop.hdfs.server.datanode.BlockSender.(BlockSender.java:287) > at > 
org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989) > at java.lang.Thread.run(Thread.java:722) > The following catch block in the DataTransfer#run method treats every IOException as a disk fault and runs a disk error check: > {noformat} > catch (IOException ie) { > LOG.warn(bpReg + ":Failed to transfer " + b + " to " + targets[0] + " got ", ie); > // check if there are any disk problem > checkDiskErrorAsync(); > } > {noformat} > This block was never scanned by the BlockPoolSliceScanner; otherwise it would have been reported as a corrupt block. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-8224) Schedule a block for scanning if its metadata file is corrupt
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414545#comment-15414545 ] Wei-Chiu Chuang commented on HDFS-8224: --- The test failures are unrelated. Committing the v3 patch. > Schedule a block for scanning if its metadata file is corrupt > - > > Key: HDFS-8224 > URL: https://issues.apache.org/jira/browse/HDFS-8224 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 2.8.0 > > Attachments: HDFS-8224-trunk-1.patch, HDFS-8224-trunk-2.patch, > HDFS-8224-trunk-3.patch, HDFS-8224-trunk.patch > > > This happened in our 2.6 cluster. > One of the block and its metadata file were corrupted. > The disk was healthy in this case. > Only the block was corrupt. > Namenode tried to copy that block to another datanode but failed with the > following stack trace: > 2015-04-20 01:04:04,421 > [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN > datanode.DataNode: DatanodeRegistration(a.b.c.d, > datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, > infoSecurePort=0, ipcPort=8020, > storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed > to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to > a1.b1.c1.d1:1004 got > java.io.IOException: Could not create DataChecksum of type 0 with > bytesPerChecksum 0 > at > org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102) > at > org.apache.hadoop.hdfs.server.datanode.BlockSender.(BlockSender.java:287) > at > 
org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989) > at java.lang.Thread.run(Thread.java:722) > The following catch block in the DataTransfer#run method treats every IOException as a disk fault and runs a disk error check: > {noformat} > catch (IOException ie) { > LOG.warn(bpReg + ":Failed to transfer " + b + " to " + targets[0] + " got ", ie); > // check if there are any disk problem > checkDiskErrorAsync(); > } > {noformat} > This block was never scanned by the BlockPoolSliceScanner; otherwise it would have been reported as a corrupt block. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10742) Measurement of lock held time in FsDatasetImpl
[ https://issues.apache.org/jira/browse/HDFS-10742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414539#comment-15414539 ] Hadoop QA commented on HDFS-10742: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s{color} | 
{color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 4s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 83m 51s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12822912/HDFS-10742.003.patch | | JIRA Issue | HDFS-10742 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux f25f05411bc2 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 9c6a438 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/16372/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/16372/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/16372/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Measurement of lock held time in FsDatasetImpl > -- > > Key: HDFS-10742 > URL: https://issues.apache.org/jira/browse/HDFS-10742 > Project: Hadoop HDFS > Issue T
[jira] [Commented] (HDFS-10742) Measurement of lock held time in FsDatasetImpl
[ https://issues.apache.org/jira/browse/HDFS-10742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414535#comment-15414535 ] Hadoop QA commented on HDFS-10742: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | 
{color:green} 1m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 37s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 28s{color} | {color:orange} root: The patch generated 1 new + 137 unchanged - 0 fixed = 138 total (was 137) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 42s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 46s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 86m 54s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}149m 52s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.security.TestGroupsCaching | | | hadoop.hdfs.security.TestDelegationTokenForProxyUser | | Timed out junit tests | org.apache.hadoop.http.TestHttpServerLifecycle | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12822891/HDFS-10742.001.patch | | JIRA Issue | HDFS-10742 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 28184fae56fe 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 85422bb | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/16369/artifact/patchprocess/diff-checkstyle-root.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/16369/artifact/patch
[jira] [Updated] (HDFS-10638) Modifications to remove the assumption that StorageLocation is associated with java.io.File.
[ https://issues.apache.org/jira/browse/HDFS-10638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-10638: -- Attachment: HDFS-10638.002.patch Updated patch to work with the new patches of HDFS-10636 and HDFS-10637. > Modifications to remove the assumption that StorageLocation is associated > with java.io.File. > > > Key: HDFS-10638 > URL: https://issues.apache.org/jira/browse/HDFS-10638 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, fs >Reporter: Virajith Jalaparti > Attachments: HDFS-10638.001.patch, HDFS-10638.002.patch > > > Changes to ensure that {{StorageLocation}} need not be associated with a > {{java.io.File}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
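The direction described in HDFS-10638 can be sketched as identifying a storage location by URI rather than by java.io.File, so that non-file-backed volumes can also be represented. The class below is purely illustrative (the real StorageLocation has a different API); the method name asFileOrNull is invented for this example.

```java
import java.io.File;
import java.net.URI;

// Sketch: a location keyed by URI, with File access as an optional view that
// only applies to file:// locations.
class StorageLocationSketch {
    private final URI baseUri;

    StorageLocationSketch(URI uri) { this.baseUri = uri; }

    URI getUri() { return baseUri; }

    // Valid only for file:// locations; other schemes have no File equivalent.
    File asFileOrNull() {
        return "file".equals(baseUri.getScheme())
            ? new File(baseUri.getPath()) : null;
    }
}
```

Callers that previously assumed a File would then go through the URI, falling back to File only when the scheme allows it.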
[jira] [Commented] (HDFS-8224) Schedule a block for scanning if its metadata file is corrupt
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414520#comment-15414520 ] Hadoop QA commented on HDFS-8224: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 42s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | 
{color:green} 1m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 27s{color} | {color:green} root: The patch generated 0 new + 310 unchanged - 2 fixed = 310 total (was 312) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 21s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 57s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}114m 5s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeLifeline | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery | | | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12822898/HDFS-8224-trunk-3.patch | | JIRA Issue | HDFS-8224 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 0c910e4273b6 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 85422bb | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/16371/artifact/patchprocess/whitespace-eol.txt | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/16371/artifact/
[jira] [Updated] (HDFS-10637) Modifications to remove the assumption that FsVolumes are backed by java.io.File.
[ https://issues.apache.org/jira/browse/HDFS-10637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-10637: -- Attachment: HDFS-10637.005.patch Posting a modified patch to work with the latest patch on HDFS-10636. > Modifications to remove the assumption that FsVolumes are backed by > java.io.File. > - > > Key: HDFS-10637 > URL: https://issues.apache.org/jira/browse/HDFS-10637 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, fs >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Attachments: HDFS-10637.001.patch, HDFS-10637.002.patch, > HDFS-10637.003.patch, HDFS-10637.004.patch, HDFS-10637.005.patch > > > Modifications to {{FsVolumeSpi}} and {{FsVolumeImpl}} to remove references to > {{java.io.File}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10742) Measurement of lock held time in FsDatasetImpl
[ https://issues.apache.org/jira/browse/HDFS-10742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Liang updated HDFS-10742: -- Attachment: HDFS-10742.003.patch > Measurement of lock held time in FsDatasetImpl > -- > > Key: HDFS-10742 > URL: https://issues.apache.org/jira/browse/HDFS-10742 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode > Affects Versions: 3.0.0-alpha2 > Reporter: Chen Liang > Assignee: Chen Liang > Attachments: HDFS-10742.001.patch, HDFS-10742.002.patch, HDFS-10742.003.patch > > > This JIRA proposes to measure the time the lock of {{FsDatasetImpl}} is held by a thread. Doing so will allow us to collect lock statistics. > This can be done by extending the {{AutoCloseableLock}} lock object in {{FsDatasetImpl}}. In the future we can also consider replacing the lock with a read-write lock. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
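The proposal above can be sketched with a minimal, self-contained AutoCloseable lock wrapper. This only mirrors the try-with-resources usage pattern of Hadoop's AutoCloseableLock; the class name InstrumentedLock and the single "last held" counter are illustrative stand-ins for the real metric plumbing.

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch: record how long each holder kept the lock, measured on close().
class InstrumentedLock implements AutoCloseable {
    private final ReentrantLock lock = new ReentrantLock();
    private long acquiredAtNanos;
    private volatile long lastHeldNanos;

    InstrumentedLock acquire() {
        lock.lock();
        acquiredAtNanos = System.nanoTime(); // start the clock once we hold it
        return this;
    }

    @Override
    public void close() {
        // Measure before releasing; a real implementation would feed this into
        // a metric and warn when it exceeds a threshold.
        lastHeldNanos = System.nanoTime() - acquiredAtNanos;
        lock.unlock();
    }

    long lastHeldNanos() { return lastHeldNanos; }
}
```

Usage stays a one-liner: `try (InstrumentedLock held = lock.acquire()) { ... }` — the hold time is recorded automatically when the try block exits.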
[jira] [Updated] (HDFS-10742) Measurement of lock held time in FsDatasetImpl
[ https://issues.apache.org/jira/browse/HDFS-10742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Liang updated HDFS-10742: -- Attachment: HDFS-10742.002.patch > Measurement of lock held time in FsDatasetImpl > -- > > Key: HDFS-10742 > URL: https://issues.apache.org/jira/browse/HDFS-10742 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode > Affects Versions: 3.0.0-alpha2 > Reporter: Chen Liang > Assignee: Chen Liang > Attachments: HDFS-10742.001.patch, HDFS-10742.002.patch > > > This JIRA proposes to measure the time the lock of {{FsDatasetImpl}} is held by a thread. Doing so will allow us to collect lock statistics. > This can be done by extending the {{AutoCloseableLock}} lock object in {{FsDatasetImpl}}. In the future we can also consider replacing the lock with a read-write lock. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10457) DataNode should not auto-format block pool directory if VERSION is missing
[ https://issues.apache.org/jira/browse/HDFS-10457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414429#comment-15414429 ] Hudson commented on HDFS-10457: --- SUCCESS: Integrated in Hadoop-trunk-Commit #10249 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10249/]) HDFS-10457. DataNode should not auto-format block pool directory if (lei: rev cc48251bfdef3d38ca5658da5a3624ef8941858d) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailureReporting.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceStorage.java > DataNode should not auto-format block pool directory if VERSION is missing > -- > > Key: HDFS-10457 > URL: https://issues.apache.org/jira/browse/HDFS-10457 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode > Reporter: Wei-Chiu Chuang > Assignee: Wei-Chiu Chuang > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: HDFS-10457.001.patch, HDFS-10457.002.patch > > > HDFS-10360 prevents the DN from auto-formatting a volume directory if current/VERSION is missing. However, if the current/VERSION in a block pool directory is missing instead, the DN still auto-formats the directory. > Filing this jira to fix the bug. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10681) DiskBalancer: query command should report Plan file path apart from PlanID
[ https://issues.apache.org/jira/browse/HDFS-10681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414428#comment-15414428 ] Hudson commented on HDFS-10681: --- SUCCESS: Integrated in Hadoop-trunk-Commit #10249 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10249/]) HDFS-10681. DiskBalancer: query command should report Plan file path (lei: rev 9c6a4383cac29b2893ce14e6c9a75705fabfd522) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/diskbalancer/TestDiskBalancer.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/diskbalancer/TestDiskBalancerRPC.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/server/datanode/DiskBalancerWorkStatus.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DiskBalancer.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/diskbalancer/TestDiskBalancerWithMockMover.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/diskbalancer/command/ExecuteCommand.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/diskbalancer/command/QueryCommand.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolServerSideTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/ClientDatanodeProtocol.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/ClientDatanodeProtocol.proto > DiskBalancer: query command should report Plan file path apart from PlanID > -- > > Key: HDFS-10681 > URL: https://issues.apache.org/jira/browse/HDFS-10681 > Project: Hadoop HDFS 
> Issue Type: Sub-task > Components: diskbalancer >Affects Versions: 3.0.0-alpha1 >Reporter: Manoj Govindassamy >Assignee: Manoj Govindassamy >Priority: Minor > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-10681.001.patch, HDFS-10681.002.patch > > > DiskBalancer query command currently reports planID (SHA512 hex) only. > Currently ongoing disk balancing activity in a datanode can be cancelled > either by planID + datanode_address or just by pointing to the right plan > file. Since there could be many plan files, to avoid ambiguity it's better if > the query command can report the plan file path as well. > {noformat} > $ hdfs diskbalancer --help query > usage: hdfs diskbalancer -query [options] > Query Plan queries a given data node about the current state of disk > balancer execution. > --queryQueries the disk balancer status of a given datanode. > Query command retrieves *the plan ID* and the current running state. > {noformat} > Sample query command output: > {noformat} > 16/06/20 15:42:16 INFO command.Command: Executing "query plan" command. > Plan ID: > 04f41e2e1fa2d63558284be85155ea68154fb6ab435f1078c642d605d06626f176da16b321b35c99f1f6cd0cd77090c8743bb9a19190c4a01b5f8c51a515e240 > Result: PLAN_UNDER_PROGRESS > or > 16/06/20 15:46:09 INFO command.Command: Executing "query plan" command. > Plan ID: > 04f41e2e1fa2d63558284be85155ea68154fb6ab435f1078c642d605d06626f176da16b321b35c99f1f6cd0cd77090c8743bb9a19190c4a01b5f8c51a515e240 > Result: PLAN_DONE > {noformat} > Cancel command syntax: > {noformat} > $ hdfs diskbalancer --help cancel > *usage: hdfs diskbalancer -cancel | -cancel -node > * > Cancel command cancels a running disk balancer operation. > --cancelCancels a running plan using a plan file. > --node Cancels a running plan using a plan ID and hostName > Cancel command can be run via pointing to a plan file, or by reading the > plan ID using the query command and then using planID and hostname.
> Examples of how to run this command are > hdfs diskbalancer -cancel > hdfs diskbalancer -cancel -node > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10681) DiskBalancer: query command should report Plan file path apart from PlanID
[ https://issues.apache.org/jira/browse/HDFS-10681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-10681: - Resolution: Fixed Fix Version/s: 3.0.0-alpha2 Status: Resolved (was: Patch Available) Committed to {{trunk}}. Thanks for the hard work, [~manojg] > DiskBalancer: query command should report Plan file path apart from PlanID > -- > > Key: HDFS-10681 > URL: https://issues.apache.org/jira/browse/HDFS-10681 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: diskbalancer >Affects Versions: 3.0.0-alpha1 >Reporter: Manoj Govindassamy >Assignee: Manoj Govindassamy >Priority: Minor > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-10681.001.patch, HDFS-10681.002.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-8224) Schedule a block for scanning if its metadata file is corrupt
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414396#comment-15414396 ] Wei-Chiu Chuang commented on HDFS-8224: --- +1 pending Jenkins. > Schedule a block for scanning if its metadata file is corrupt > - > > Key: HDFS-8224 > URL: https://issues.apache.org/jira/browse/HDFS-8224 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 2.8.0 > > Attachments: HDFS-8224-trunk-1.patch, HDFS-8224-trunk-2.patch, > HDFS-8224-trunk-3.patch, HDFS-8224-trunk.patch > > > This happened in our 2.6 cluster. > One of the blocks and its metadata file were corrupted. > The disk was healthy in this case. > Only the block was corrupt. > Namenode tried to copy that block to another datanode but failed with the > following stack trace: > 2015-04-20 01:04:04,421 > [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN > datanode.DataNode: DatanodeRegistration(a.b.c.d, > datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, > infoSecurePort=0, ipcPort=8020, > storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed > to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to > a1.b1.c1.d1:1004 got > java.io.IOException: Could not create DataChecksum of type 0 with > bytesPerChecksum 0 > at > org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102) > at > org.apache.hadoop.hdfs.server.datanode.BlockSender.(BlockSender.java:287) > at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989) > at java.lang.Thread.run(Thread.java:722) > The following catch block in the DataTransfer#run method will treat every > IOException as a disk error and run the disk error check: > {noformat} > catch (IOException ie) { > LOG.warn(bpReg + ":Failed to transfer " + b + " to " + > targets[0] + " got ", ie); > // check if there are any disk problem > checkDiskErrorAsync(); > } > {noformat} > This block was never scanned by BlockPoolSliceScanner; otherwise it would have > been reported as a corrupt block. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
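As a hedged illustration of the failure mode above (not the Hadoop code; the header layout and names here are simplified assumptions), a metadata file whose header bytes have been zeroed out parses to checksum type 0 and bytesPerChecksum 0, which is exactly the IOException in the stack trace. The point of this JIRA is that such a failure should schedule the block for a scan rather than trigger the disk-error check:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Illustrative sketch: a zeroed-out metadata header yields an invalid
// checksum descriptor, so header parsing throws. The old DataTransfer
// catch block ran checkDiskErrorAsync() for every IOException; the fix
// instead hands the block to the scanner so it is reported as corrupt.
public class MetaHeaderSketch {
    // Simplified stand-in for parsing a block metadata header.
    static void readDataChecksum(byte[] meta) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(meta));
        in.readShort();                      // header version
        byte type = in.readByte();           // checksum type
        int bytesPerChecksum = in.readInt(); // chunk size
        if (type == 0 || bytesPerChecksum == 0) {
            throw new IOException("Could not create DataChecksum of type " + type
                + " with bytesPerChecksum " + bytesPerChecksum);
        }
    }

    public static void main(String[] args) {
        byte[] corrupt = new byte[7]; // all zeros, as if the file were wiped
        try {
            readDataChecksum(corrupt);
            System.out.println("ok");
        } catch (IOException ie) {
            // Corrupt metadata, healthy disk: schedule a block scan here,
            // not a disk-error check.
            System.out.println("corrupt header");
        }
    }
}
```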
[jira] [Commented] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode
[ https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414394#comment-15414394 ] Xiao Chen commented on HDFS-9276: - Linking HADOOP-8751. Without it, {{TestUserGroupInformation#testPrivateTokenExclusion}} will fail with NPE. > Failed to Update HDFS Delegation Token for long running application in HA mode > -- > > Key: HDFS-9276 > URL: https://issues.apache.org/jira/browse/HDFS-9276 > Project: Hadoop HDFS > Issue Type: Bug > Components: fs, ha, security >Affects Versions: 2.7.1 >Reporter: Liangliang Gu >Assignee: Liangliang Gu > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch, > HDFS-9276.03.patch, HDFS-9276.04.patch, HDFS-9276.05.patch, > HDFS-9276.06.patch, HDFS-9276.07.patch, HDFS-9276.08.patch, > HDFS-9276.09.patch, HDFS-9276.10.patch, HDFS-9276.11.patch, > HDFS-9276.12.patch, HDFS-9276.13.patch, HDFS-9276.14.patch, > HDFS-9276.15.patch, HDFS-9276.16.patch, HDFS-9276.17.patch, > HDFS-9276.18.patch, HDFS-9276.19.patch, HDFS-9276.20.patch, > HDFSReadLoop.scala, debug1.PNG, debug2.PNG > > > The Scenario is as follows: > 1. NameNode HA is enabled. > 2. Kerberos is enabled. > 3. HDFS Delegation Token (not Keytab or TGT) is used to communicate with > NameNode. > 4. We want to update the HDFS Delegation Token for long running applicatons. > HDFS Client will generate private tokens for each NameNode. When we update > the HDFS Delegation Token, these private tokens will not be updated, which > will cause token expired. 
> This bug can be reproduced by the following program: > {code} > import java.security.PrivilegedExceptionAction > import org.apache.hadoop.conf.Configuration > import org.apache.hadoop.fs.{FileSystem, Path} > import org.apache.hadoop.security.UserGroupInformation > object HadoopKerberosTest { > def main(args: Array[String]): Unit = { > val keytab = "/path/to/keytab/xxx.keytab" > val principal = "x...@abc.com" > val creds1 = new org.apache.hadoop.security.Credentials() > val ugi1 = > UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab) > ugi1.doAs(new PrivilegedExceptionAction[Void] { > // Get a copy of the credentials > override def run(): Void = { > val fs = FileSystem.get(new Configuration()) > fs.addDelegationTokens("test", creds1) > null > } > }) > val ugi = UserGroupInformation.createRemoteUser("test") > ugi.addCredentials(creds1) > ugi.doAs(new PrivilegedExceptionAction[Void] { > // Get a copy of the credentials > override def run(): Void = { > var i = 0 > while (true) { > val creds1 = new org.apache.hadoop.security.Credentials() > val ugi1 = > UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab) > ugi1.doAs(new PrivilegedExceptionAction[Void] { > // Get a copy of the credentials > override def run(): Void = { > val fs = FileSystem.get(new Configuration()) > fs.addDelegationTokens("test", creds1) > null > } > }) > UserGroupInformation.getCurrentUser.addCredentials(creds1) > val fs = FileSystem.get( new Configuration()) > i += 1 > println() > println(i) > println(fs.listFiles(new Path("/user"), false)) > Thread.sleep(60 * 1000) > } > null > } > }) > } > } > {code} > To reproduce the bug, please set the following configuration to Name Node: > {code} > dfs.namenode.delegation.token.max-lifetime = 10min > dfs.namenode.delegation.key.update-interval = 3min > dfs.namenode.delegation.token.renew-interval = 3min > {code} > The bug will occure after 3 minutes. 
> The stacktrace is: > {code} > Exception in thread "main" > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): > token (HDFS_DELEGATION_TOKEN token 330156 for test) is expired > at org.apache.hadoop.ipc.Client.call(Client.java:1347) > at org.apache.hadoop.ipc.Client.call(Client.java:1300) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) > at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.M
[jira] [Updated] (HDFS-8224) Schedule a block for scanning if its metadata file is corrupt
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-8224: - Status: Patch Available (was: Open) > Schedule a block for scanning if its metadata file is corrupt > - > > Key: HDFS-8224 > URL: https://issues.apache.org/jira/browse/HDFS-8224 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 2.8.0 > > Attachments: HDFS-8224-trunk-1.patch, HDFS-8224-trunk-2.patch, > HDFS-8224-trunk-3.patch, HDFS-8224-trunk.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8224) Schedule a block for scanning if its metadata file is corrupt
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-8224: - Attachment: HDFS-8224-trunk-3.patch Created a new patch addressing all the previous comments. [~jojochuang]: please review. > Schedule a block for scanning if its metadata file is corrupt > - > > Key: HDFS-8224 > URL: https://issues.apache.org/jira/browse/HDFS-8224 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 2.8.0 > > Attachments: HDFS-8224-trunk-1.patch, HDFS-8224-trunk-2.patch, > HDFS-8224-trunk-3.patch, HDFS-8224-trunk.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8224) Schedule a block for scanning if its metadata file is corrupt
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-8224: - Status: Open (was: Patch Available) [~jojochuang]: Thanks a lot for your valuable reviews. Cancelling the patch to address your comments. > Schedule a block for scanning if its metadata file is corrupt > - > > Key: HDFS-8224 > URL: https://issues.apache.org/jira/browse/HDFS-8224 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 2.8.0 > > Attachments: HDFS-8224-trunk-1.patch, HDFS-8224-trunk-2.patch, > HDFS-8224-trunk.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10682) Replace FsDatasetImpl object lock with a separate lock object
[ https://issues.apache.org/jira/browse/HDFS-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Liang updated HDFS-10682: -- Status: Patch Available (was: In Progress) > Replace FsDatasetImpl object lock with a separate lock object > - > > Key: HDFS-10682 > URL: https://issues.apache.org/jira/browse/HDFS-10682 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Chen Liang >Assignee: Chen Liang > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-10682-branch-2.001.patch, > HDFS-10682-branch-2.002.patch, HDFS-10682-branch-2.003.patch, > HDFS-10682.001.patch, HDFS-10682.002.patch, HDFS-10682.003.patch, > HDFS-10682.004.patch, HDFS-10682.005.patch, HDFS-10682.006.patch, > HDFS-10682.007.patch, HDFS-10682.008.patch, HDFS-10682.009.patch, > HDFS-10682.010.patch > > > This Jira proposes to replace the FsDatasetImpl object lock with a separate > lock object. Doing so will make it easier to measure lock statistics like > lock held time and warn about potential lock contention due to slow disk > operations. > Right now we can use org.apache.hadoop.util.AutoCloseableLock. In the future > we can also consider replacing the lock with a read-write lock. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
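The try-with-resources pattern this JIRA moves to can be sketched as follows. The real {{AutoCloseableLock}} lives in {{org.apache.hadoop.util}}; this standalone version only mirrors its shape and is not the patch itself.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Rough sketch of the AutoCloseableLock idea: a synchronized (this) block
// becomes a try-with-resources block over a dedicated lock object, which
// makes the lock replaceable (e.g. by a read-write lock) and measurable.
public class AutoCloseableLockSketch implements AutoCloseable {
    private final Lock lock;

    public AutoCloseableLockSketch(Lock lock) {
        this.lock = lock;
    }

    public AutoCloseableLockSketch acquire() {
        lock.lock();
        return this;
    }

    @Override
    public void close() {
        lock.unlock();
    }

    public static void main(String[] args) {
        AutoCloseableLockSketch datasetLock =
            new AutoCloseableLockSketch(new ReentrantLock());
        // Instead of: synchronized (this) { ... }
        try (AutoCloseableLockSketch l = datasetLock.acquire()) {
            System.out.println("critical section");
        } // unlock happens automatically here, even on exceptions
    }
}
```

Because the lock is now an object rather than a monitor on `this`, swapping in an instrumented or read-write variant later requires no change at the call sites.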
[jira] [Updated] (HDFS-10742) Measurement of lock held time in FsDatasetImpl
[ https://issues.apache.org/jira/browse/HDFS-10742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Liang updated HDFS-10742: -- Status: Patch Available (was: Open) > Measurement of lock held time in FsDatasetImpl > -- > > Key: HDFS-10742 > URL: https://issues.apache.org/jira/browse/HDFS-10742 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.0.0-alpha2 >Reporter: Chen Liang >Assignee: Chen Liang > Attachments: HDFS-10742.001.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10742) Measurement of lock held time in FsDatasetImpl
[ https://issues.apache.org/jira/browse/HDFS-10742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Liang updated HDFS-10742: -- Attachment: HDFS-10742.001.patch > Measurement of lock held time in FsDatasetImpl > -- > > Key: HDFS-10742 > URL: https://issues.apache.org/jira/browse/HDFS-10742 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.0.0-alpha2 >Reporter: Chen Liang >Assignee: Chen Liang > Attachments: HDFS-10742.001.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10740) libhdfs++: Implement recursive directory generator
[ https://issues.apache.org/jira/browse/HDFS-10740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414326#comment-15414326 ] James Clampffer commented on HDFS-10740: Cool stuff, relatively minor comments. A little more picky than usual since this is an API usage example. You might want to rename the "width" argument in the usage message to something like "fanout". To me, "width" could mean the number of directories in each subdir or the number of all leaf nodes at the top of the tree. This is just my preference, I'm fine with the current comment. The comment below is no longer applicable; could you please replace it with "man"-style instructions on usage and any optional flags? At some point we need to start commenting in the same markup as the rest of the HDFS native code as well. {code} /* A stripped-down version of unix's "cat". Doesn't deal with any flags for now, will just attempt to read the whole file. */ {code} Since this is at least partly intended to be an example of how to take advantage of the async operations, some more comments would go a long way. Just basic things about how the recursion doesn't block and how everything waits on the promises later on. Bringing whole namespaces into scope can lead to weirdness. You could also just add the definitions you want: "using std::string" etc. Not a blocker since it's a standalone util, but let's try to avoid this outside of the examples and test directories. {code} using namespace std; using namespace hdfs; {code} Generally, taking references to any kind of smart_ptr as arguments is a bad idea. Taking references to unique_ptr is a really bad idea because it's sidestepping the contract that it's supposed to be unique and will lead to confusion. Taking a raw pointer using .get() and passing it around is slightly better because the loss of uniqueness is explicit.
{code} unique_ptr & fs {code} get_port will return an optional; if it hasn't been set, it's undefined behavior to use the dereference operator or otherwise retrieve the value. That might end up making the error message useless depending on what libstdc++ feels like doing. {code} cerr << "Could not connect to " << uri->get_host() << ":" << *(uri->get_port()) << endl; {code} > libhdfs++: Implement recursive directory generator > -- > > Key: HDFS-10740 > URL: https://issues.apache.org/jira/browse/HDFS-10740 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Anatoli Shein >Assignee: Anatoli Shein > Attachments: HDFS-10740.HDFS-8707.000.patch > > > This tool will allow us to do benchmarking/testing of our find functionality, and > will be a good example showing how to call a large number of namenode > operations recursively. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10457) DataNode should not auto-format block pool directory if VERSION is missing
[ https://issues.apache.org/jira/browse/HDFS-10457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414322#comment-15414322 ] Wei-Chiu Chuang commented on HDFS-10457: Thanks [~eddyxu]! > DataNode should not auto-format block pool directory if VERSION is missing > -- > > Key: HDFS-10457 > URL: https://issues.apache.org/jira/browse/HDFS-10457 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: HDFS-10457.001.patch, HDFS-10457.002.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8224) Schedule a block for scanning if its metadata file is corrupt
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-8224: -- Summary: Schedule a block for scanning if its metadata file is corrupt (was: Any IOException in DataTransfer#run() will run diskError thread even if it is not disk error) > Schedule a block for scanning if its metadata file is corrupt > - > > Key: HDFS-8224 > URL: https://issues.apache.org/jira/browse/HDFS-8224 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 2.8.0 > > Attachments: HDFS-8224-trunk-1.patch, HDFS-8224-trunk-2.patch, > HDFS-8224-trunk.patch > > > This happened in our 2.6 cluster. > One of the block and its metadata file were corrupted. > The disk was healthy in this case. > Only the block was corrupt. > Namenode tried to copy that block to another datanode but failed with the > following stack trace: > 2015-04-20 01:04:04,421 > [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN > datanode.DataNode: DatanodeRegistration(a.b.c.d, > datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, > infoSecurePort=0, ipcPort=8020, > storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed > to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to > a1.b1.c1.d1:1004 got > java.io.IOException: Could not create DataChecksum of type 0 with > bytesPerChecksum 0 > at > org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102) > at > 
org.apache.hadoop.hdfs.server.datanode.BlockSender.&lt;init&gt;(BlockSender.java:287) > at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989) > at java.lang.Thread.run(Thread.java:722) > The following catch block in the DataTransfer#run method will treat every > IOException as a disk error and run the disk error check > {noformat} > catch (IOException ie) { > LOG.warn(bpReg + ":Failed to transfer " + b + " to " + > targets[0] + " got ", ie); > // check if there are any disk problem > checkDiskErrorAsync(); > } > {noformat} > This block was never scanned by BlockPoolSliceScanner otherwise it would have > been reported as a corrupt block. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8224) Schedule a block for scanning if its metadata file is corrupt
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-8224: -- Component/s: datanode > Schedule a block for scanning if its metadata file is corrupt > - > > Key: HDFS-8224 > URL: https://issues.apache.org/jira/browse/HDFS-8224 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 2.8.0 > > Attachments: HDFS-8224-trunk-1.patch, HDFS-8224-trunk-2.patch, > HDFS-8224-trunk.patch > > > This happened in our 2.6 cluster. > One of the blocks and its metadata file were corrupted. > The disk was healthy in this case. > Only the block was corrupt. > Namenode tried to copy that block to another datanode but failed with the > following stack trace: > 2015-04-20 01:04:04,421 > [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN > datanode.DataNode: DatanodeRegistration(a.b.c.d, > datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, > infoSecurePort=0, ipcPort=8020, > storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed > to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to > a1.b1.c1.d1:1004 got > java.io.IOException: Could not create DataChecksum of type 0 with > bytesPerChecksum 0 > at > org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102) > at > org.apache.hadoop.hdfs.server.datanode.BlockSender.&lt;init&gt;(BlockSender.java:287) > at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989) > at 
java.lang.Thread.run(Thread.java:722) > The following catch block in the DataTransfer#run method will treat every > IOException as a disk error and run the disk error check > {noformat} > catch (IOException ie) { > LOG.warn(bpReg + ":Failed to transfer " + b + " to " + > targets[0] + " got ", ie); > // check if there are any disk problem > checkDiskErrorAsync(); > } > {noformat} > This block was never scanned by BlockPoolSliceScanner otherwise it would have > been reported as a corrupt block.
[jira] [Commented] (HDFS-9271) Implement basic NN operations
[ https://issues.apache.org/jira/browse/HDFS-9271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414311#comment-15414311 ] Hadoop QA commented on HDFS-9271: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 40s{color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 38s{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 41s{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 17s{color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 40s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 40s{color} | {color:green} the patch passed {color} | | 
{color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 40s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 40s{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK v1.7.0_101. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 50m 28s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0cf5e66 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12822880/HDFS-9271.HDFS-8707.008.patch | | JIRA Issue | HDFS-9271 | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux ed873562c7de 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | HDFS-8707 / 0a29537 | | Default Java | 1.7.0_101 | | Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_101 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_101 | | JDK v1.7.0_101 Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/16368/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: hadoop-hdfs-project/hadoop-hdfs-native-client | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/16368/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Implement basic NN operations > - > > Key: HDFS-9271 > URL: https://issues.apache.org/jira/browse/HDFS-9271 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Bob Hansen >Assignee: Anatoli Shein > Attachments: HDFS-9271.HDFS-8707.000.patch, > HDFS-92
[jira] [Updated] (HDFS-10739) libhdfs++: In RPC engine replace vector with deque for pending requests
[ https://issues.apache.org/jira/browse/HDFS-10739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-10739: --- Resolution: Fixed Status: Resolved (was: Patch Available) > libhdfs++: In RPC engine replace vector with deque for pending requests > --- > > Key: HDFS-10739 > URL: https://issues.apache.org/jira/browse/HDFS-10739 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Anatoli Shein >Assignee: Anatoli Shein > Attachments: HDFS-10739.HDFS-8707.000.patch > > > Needs to be added in order to improve performance
[jira] [Updated] (HDFS-9271) Implement basic NN operations
[ https://issues.apache.org/jira/browse/HDFS-9271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anatoli Shein updated HDFS-9271: Attachment: HDFS-9271.HDFS-8707.008.patch New patch attached. > Implement basic NN operations > - > > Key: HDFS-9271 > URL: https://issues.apache.org/jira/browse/HDFS-9271 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Bob Hansen >Assignee: Anatoli Shein > Attachments: HDFS-9271.HDFS-8707.000.patch, > HDFS-9271.HDFS-8707.001.patch, HDFS-9271.HDFS-8707.002.patch, > HDFS-9271.HDFS-8707.003.patch, HDFS-9271.HDFS-8707.004.patch, > HDFS-9271.HDFS-8707.005.patch, HDFS-9271.HDFS-8707.006.patch, > HDFS-9271.HDFS-8707.007.patch, HDFS-9271.HDFS-8707.008.patch > > > Expose via C and C++ API: > * mkdirs > * rename > * delete > * stat > * chmod > * chown > * getListing > * setOwner
[jira] [Commented] (HDFS-9271) Implement basic NN operations
[ https://issues.apache.org/jira/browse/HDFS-9271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414239#comment-15414239 ] Anatoli Shein commented on HDFS-9271: - [~James C], thank you for the review. I have addressed your comments as follows: 1) nit: weird space between namespace and type in hdfs.cc void set_working_directory(std:: string new_directory) { working_directory = new_directory; } (/) Done. 2) Right now hdfsFile_internal tracks bytes read. Should this get pushed into FileHandle? (/) Done. I moved it to the FileHandle, it does make more sense to keep it there. 3) pedantic nit: 2 vs 4 space tabs in CheckSystemAndHandle (/) Fixed. 4) Might still be worth calling CheckSystem in hdfsAvailable even if it's not implemented so latent bugs don't show up in surprising ways once it is implemented. Same thing for any other functions that are just stubbed in for now like hdfsUnbufferFile. (/) Done. 5) Might be worth turning NameNodeOperations::IsHighBitSet into a simple function in common/util.h. There are other places where that would be nice to use, like checking before casting an unsigned into a signed type (related to HDFS-10554). Templating the function so it could check any integral type would be nice too but I can do that later on when I work on the conversion stuff. Also if you're as bad at counting a long row of 0s as I am it might be easier to replace the hex literal with something that makes your intention very clear: uint64_t firstBit = 0x8000000000000000; with uint64_t firstBit = 0x1ULL << 63; (/) I moved it to common/util.h and fixed the literal. Please review the new patch (attached). 
> Implement basic NN operations > - > > Key: HDFS-9271 > URL: https://issues.apache.org/jira/browse/HDFS-9271 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Bob Hansen >Assignee: Anatoli Shein > Attachments: HDFS-9271.HDFS-8707.000.patch, > HDFS-9271.HDFS-8707.001.patch, HDFS-9271.HDFS-8707.002.patch, > HDFS-9271.HDFS-8707.003.patch, HDFS-9271.HDFS-8707.004.patch, > HDFS-9271.HDFS-8707.005.patch, HDFS-9271.HDFS-8707.006.patch, > HDFS-9271.HDFS-8707.007.patch > > > Expose via C and C++ API: > * mkdirs > * rename > * delete > * stat > * chmod > * chown > * getListing > * setOwner
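The IsHighBitSet helper discussed in item 5 above lives in libhdfs++ (C++, common/util.h). As an illustration of the underlying bit test and the readable shift-based literal, here is a minimal, hypothetical Java sketch; the class and method names are not from the Hadoop codebase.

```java
// Hypothetical illustration of the "is the high bit set" check discussed in
// the review comment above. Names are illustrative only.
public class HighBit {
    // 1L << 63 sets only the most significant bit of a 64-bit value,
    // avoiding a hard-to-count row of zeros in a hex literal.
    private static final long HIGH_BIT = 1L << 63;

    public static boolean isHighBitSet(long value) {
        return (value & HIGH_BIT) != 0;
    }

    public static void main(String[] args) {
        System.out.println(isHighBitSet(-1L));            // sign bit is set
        System.out.println(isHighBitSet(Long.MAX_VALUE)); // high bit is clear
    }
}
```

The same check is what guards, for example, casting an unsigned 64-bit value into a signed type without silently producing a negative number.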
[jira] [Created] (HDFS-10742) Measurement of lock held time in FsDatasetImpl
Chen Liang created HDFS-10742: - Summary: Measurement of lock held time in FsDatasetImpl Key: HDFS-10742 URL: https://issues.apache.org/jira/browse/HDFS-10742 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 3.0.0-alpha2 Reporter: Chen Liang Assignee: Chen Liang This JIRA proposes to measure the time the lock of {{FsDatasetImpl}} is held by a thread. Doing so will allow us to collect lock statistics. This can be done by extending the {{AutoCloseableLock}} lock object in {{FsDatasetImpl}}. In the future we can also consider replacing the lock with a read-write lock.
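The idea above — a lock that is released via try-with-resources and records how long it was held — can be sketched as follows. This is a self-contained illustration under assumed names, not Hadoop's actual {{AutoCloseableLock}} API.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of HDFS-10742's idea: wrap a lock so that
// try-with-resources both releases it and accumulates hold time.
// Class and method names are illustrative, not Hadoop's API.
public class TimedAutoCloseableLock implements AutoCloseable {
    private final ReentrantLock lock = new ReentrantLock();
    private long acquiredAtNanos;
    private long totalHeldNanos; // accumulated hold time, exposable as a metric

    // Lock and start the timer; returns this so it fits try-with-resources.
    public TimedAutoCloseableLock acquire() {
        lock.lock();
        acquiredAtNanos = System.nanoTime();
        return this;
    }

    @Override
    public void close() {
        // Stop the timer, then release. (A fuller version would handle
        // reentrant nesting; this sketch assumes non-nested use.)
        totalHeldNanos += System.nanoTime() - acquiredAtNanos;
        lock.unlock();
    }

    public long getTotalHeldNanos() {
        return totalHeldNanos;
    }
}
```

Call sites then look like `try (TimedAutoCloseableLock held = dataSetLock.acquire()) { ... }`, so the measurement is added without changing the shape of the critical sections.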
[jira] [Commented] (HDFS-8224) Any IOException in DataTransfer#run() will run diskError thread even if it is not disk error
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414187#comment-15414187 ] Hadoop QA commented on HDFS-8224: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 42s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | 
{color:green} 1m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 30s{color} | {color:green} root: The patch generated 0 new + 497 unchanged - 2 fixed = 497 total (was 499) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 12s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 71m 40s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}122m 32s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.tracing.TestTracing | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped | | | hadoop.security.TestRefreshUserMappings | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12822850/HDFS-8224-trunk-2.patch | | JIRA Issue | HDFS-8224 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 59cb6e669104 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / c4b77ae | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/16365/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/16365/testReport/ | | modules | C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdf
[jira] [Commented] (HDFS-10738) Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test failure
[ https://issues.apache.org/jira/browse/HDFS-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414172#comment-15414172 ] Hudson commented on HDFS-10738: --- SUCCESS: Integrated in Hadoop-trunk-Commit #10247 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10247/]) HDFS-10738. Fix (kihwal: rev 0f701f433dd3be233bf53e856864c82349e8274e) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/security/TestRefreshUserMappings.java > Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test > failure > > > Key: HDFS-10738 > URL: https://issues.apache.org/jira/browse/HDFS-10738 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Rakesh R >Assignee: Rakesh R > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: HDFS-10738-00.patch, HDFS-10738-01.patch > > > This jira is to analyse and fix the test case failure, which is failing in > Jenkins build, > [Build_16326|https://builds.apache.org/job/PreCommit-HDFS-Build/16326/testReport/org.apache.hadoop.security/TestRefreshUserMappings/testRefreshSuperUserGroupsConfiguration/] > very frequently. > {code} > Error Message > first auth for user2 should've succeeded: User: super_userL is not allowed to > impersonate userL2 > Stacktrace > java.lang.AssertionError: first auth for user2 should've succeeded: User: > super_userL is not allowed to impersonate userL2 > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.security.TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration(TestRefreshUserMappings.java:200) > {code}
[jira] [Commented] (HDFS-8224) Any IOException in DataTransfer#run() will run diskError thread even if it is not disk error
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414165#comment-15414165 ] Wei-Chiu Chuang commented on HDFS-8224: --- Thanks for the patch! Overall it looks good to me. I would just like to add a few nits: * Can you move the test to TestDiskError.java? In this way, you do not need to make {{BlockScanner#markSuspectBlock}}, {{DataNode#setBlockScanner}} and {{DataNode#transferBlock}} public. I also think it is more natural to place this test in that test class. * {code:title= InvalidChecksumSizeException.java} * Thrown when bytesPerChecksum field in the meta file is less than * or equal to 0. {code} To be more precise, the exception can also be thrown if the type is invalid. * The following line should be removed. {code:title= TestDataTransferProtocol.java} //config.setLong(DFS_DATANODE_SCAN_PERIOD_HOURS_KEY, -1); {code} * Finally, could you add a comment here that basically says: if the peer disconnects, the block has already been added to the BlockScanner, so do not add it to the scan queue again; however, an InvalidChecksumSizeException is thrown because the metafile is corrupt (caused by a flaky disk), so do add it to the scan queue here. {code:title=DataNode.java} } catch (IOException ie) { if (ie instanceof InvalidChecksumSizeException) { {code} > Any IOException in DataTransfer#run() will run diskError thread even if it is > not disk error > > > Key: HDFS-8224 > URL: https://issues.apache.org/jira/browse/HDFS-8224 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 2.8.0 > > Attachments: HDFS-8224-trunk-1.patch, HDFS-8224-trunk-2.patch, > HDFS-8224-trunk.patch > > > This happened in our 2.6 cluster. > One of the blocks and its metadata file were corrupted. > The disk was healthy in this case. > Only the block was corrupt. 
> Namenode tried to copy that block to another datanode but failed with the > following stack trace: > 2015-04-20 01:04:04,421 > [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN > datanode.DataNode: DatanodeRegistration(a.b.c.d, > datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, > infoSecurePort=0, ipcPort=8020, > storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed > to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to > a1.b1.c1.d1:1004 got > java.io.IOException: Could not create DataChecksum of type 0 with > bytesPerChecksum 0 > at > org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102) > at > org.apache.hadoop.hdfs.server.datanode.BlockSender.&lt;init&gt;(BlockSender.java:287) > at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989) > at java.lang.Thread.run(Thread.java:722) > The following catch block in the DataTransfer#run method will treat every > IOException as a disk error and run the disk error check > {noformat} > catch (IOException ie) { > LOG.warn(bpReg + ":Failed to transfer " + b + " to " + > targets[0] + " got ", ie); > // check if there are any disk problem > checkDiskErrorAsync(); > } > {noformat} > This block was never scanned by BlockPoolSliceScanner otherwise it would have > been reported as a corrupt block.
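The control flow the review above asks to document can be sketched as follows. This is a hypothetical, self-contained illustration (the class and method names are stand-ins, not the actual DataNode code): a corrupt metadata header signalled by InvalidChecksumSizeException should route the block to the scanner's suspect queue, while any other IOException keeps the old behavior of running the async disk check.

```java
import java.io.IOException;

// Self-contained sketch of the error routing discussed in the review above.
// All names here are illustrative, not the real Hadoop DataNode API.
public class TransferErrorHandling {
    // Stand-in for the exception introduced by the patch: thrown when the
    // checksum type or bytesPerChecksum in the meta file header is invalid.
    static class InvalidChecksumSizeException extends IOException {
        InvalidChecksumSizeException(String msg) { super(msg); }
    }

    // Returns which recovery action a DataTransfer-style failure handler
    // would take for a given IOException.
    static String handleTransferFailure(IOException ie) {
        if (ie instanceof InvalidChecksumSizeException) {
            // Metadata header is corrupt (e.g. a flaky disk wrote bad bytes).
            // The block was never handed to the scanner, so queue it now.
            return "markSuspectBlock";
        }
        // Anything else (peer disconnect, socket reset, ...) keeps the
        // existing behavior of checking the volumes for disk errors.
        return "checkDiskErrorAsync";
    }
}
```

The point of the special case is that a healthy disk holding one corrupt block should not trigger a disk-error sweep; it should trigger a scan of that one block.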
[jira] [Updated] (HDFS-10738) Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test failure
[ https://issues.apache.org/jira/browse/HDFS-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-10738: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0-alpha2 2.9.0 Status: Resolved (was: Patch Available) Thanks for reporting and fixing this, [~rakeshr]. I've committed this to trunk and branch-2. > Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test > failure > > > Key: HDFS-10738 > URL: https://issues.apache.org/jira/browse/HDFS-10738 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Rakesh R >Assignee: Rakesh R > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: HDFS-10738-00.patch, HDFS-10738-01.patch > > > This jira is to analyse and fix the test case failure, which is failing in > Jenkins build, > [Build_16326|https://builds.apache.org/job/PreCommit-HDFS-Build/16326/testReport/org.apache.hadoop.security/TestRefreshUserMappings/testRefreshSuperUserGroupsConfiguration/] > very frequently. > {code} > Error Message > first auth for user2 should've succeeded: User: super_userL is not allowed to > impersonate userL2 > Stacktrace > java.lang.AssertionError: first auth for user2 should've succeeded: User: > super_userL is not allowed to > impersonate userL2 > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.security.TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration(TestRefreshUserMappings.java:200) > {code}
[jira] [Commented] (HDFS-10739) libhdfs++: In RPC engine replace vector with deque for pending requests
[ https://issues.apache.org/jira/browse/HDFS-10739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414140#comment-15414140 ] James Clampffer commented on HDFS-10739: Committed this to HDFS-8707. Thanks Anatoli! This simply substitutes one data structure with another that has the same API but is more appropriate for the way it's being used. Looks like vector::resize was killing tons of time shifting elements around and reallocating (including on optimized builds). > libhdfs++: In RPC engine replace vector with deque for pending requests > --- > > Key: HDFS-10739 > URL: https://issues.apache.org/jira/browse/HDFS-10739 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Anatoli Shein >Assignee: Anatoli Shein > Attachments: HDFS-10739.HDFS-8707.000.patch > > > Needs to be added in order to improve performance
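The libhdfs++ change itself is C++ (std::vector to std::deque), but the asymmetry James describes — erasing from the front of a contiguous array shifts every remaining element — exists in any language. As an illustration only (not the libhdfs++ code), the same contrast in Java between ArrayList and ArrayDeque for a FIFO pending-request queue:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Illustrative comparison: draining a FIFO queue from the front.
// ArrayList.remove(0) shifts all remaining elements, so the full drain is
// O(n^2); ArrayDeque.pollFirst() is amortized O(1), so the drain is O(n).
public class PendingQueueDemo {
    static long drainListNanos(int n) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < n; i++) list.add(i);
        long t0 = System.nanoTime();
        while (!list.isEmpty()) list.remove(0); // shifts every element left
        return System.nanoTime() - t0;
    }

    static long drainDequeNanos(int n) {
        ArrayDeque<Integer> deque = new ArrayDeque<>();
        for (int i = 0; i < n; i++) deque.add(i);
        long t0 = System.nanoTime();
        while (!deque.isEmpty()) deque.pollFirst(); // just advances the head
        return System.nanoTime() - t0;
    }

    public static void main(String[] args) {
        int n = 50_000;
        System.out.println("ArrayList drain:  " + drainListNanos(n) / 1_000_000 + " ms");
        System.out.println("ArrayDeque drain: " + drainDequeNanos(n) / 1_000_000 + " ms");
    }
}
```

This is the same structural reason std::deque beats std::vector for the RPC engine's pending-request list: the deque never has to shift or reallocate the whole backing store when requests are consumed from the front.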
[jira] [Commented] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414133#comment-15414133 ] Hadoop QA commented on HDFS-8897: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s{color} | 
{color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 130 unchanged - 7 fixed = 130 total (was 137) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 45s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 75m 38s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.tracing.TestTracing | | | hadoop.security.TestRefreshUserMappings | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12822855/HDFS-8897.006.patch | | JIRA Issue | HDFS-8897 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 6e6d12c17333 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / c4b77ae | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/16367/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/16367/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/16367/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Balancer should handle fs.defaultFS trailing slash in HA > > > Key: HDFS-8897 > URL: https://issues.apache.org/jira/browse/HDFS-8897 > Project: Hadoop HDFS > Issue Type
[jira] [Commented] (HDFS-10738) Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test failure
[ https://issues.apache.org/jira/browse/HDFS-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414127#comment-15414127 ] Kihwal Lee commented on HDFS-10738: --- +1 the patch looks good. {{TestTracing}} was just fixed by HADOOP-13473. > Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test > failure > > > Key: HDFS-10738 > URL: https://issues.apache.org/jira/browse/HDFS-10738 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-10738-00.patch, HDFS-10738-01.patch > > > This jira is to analyse and fix the test case failure, which is failing in > Jenkins build, > [Build_16326|https://builds.apache.org/job/PreCommit-HDFS-Build/16326/testReport/org.apache.hadoop.security/TestRefreshUserMappings/testRefreshSuperUserGroupsConfiguration/] > very frequently. > {code} > Error Message > first auth for user2 should've succeeded: User: super_userL is not allowed to > impersonate userL2 > Stacktrace > java.lang.AssertionError: first auth for user2 should've succeeded: User: > super_userL is not allowed to impersonate userL2 > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.security.TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration(TestRefreshUserMappings.java:200) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10738) Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test failure
[ https://issues.apache.org/jira/browse/HDFS-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414121#comment-15414121 ] Hadoop QA commented on HDFS-10738: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 1s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s{color} | 
{color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 25 unchanged - 6 fixed = 25 total (was 31) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 58m 35s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 78m 38s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.tracing.TestTracing | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12822851/HDFS-10738-01.patch | | JIRA Issue | HDFS-10738 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux e8a5e2804d0b 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / c4b77ae | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/16366/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/16366/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/16366/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test > failure > > > Key: HDFS-10738 > URL: https://issues.apache.org/jira/browse/HDFS-10738 > Project: Hadoop HDFS > Issue Type: B
[jira] [Commented] (HDFS-10679) libhdfs++: Implement parallel find with wildcards tool
[ https://issues.apache.org/jira/browse/HDFS-10679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414085#comment-15414085 ] Hadoop QA commented on HDFS-10679: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 3s{color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 8s{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 0s{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 16s{color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 9s{color} | {color:green} the patch passed 
{color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 4s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 6s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 7s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 8s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 13s{color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_101. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 53m 19s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_101 Failed CTEST tests | test_libhdfs_threaded_hdfspp_test_shim_static | | | test_hdfs_ext_hdfspp_test_shim_static | | JDK v1.7.0_101 Failed CTEST tests | test_libhdfs_threaded_hdfspp_test_shim_static | | | test_hdfs_ext_hdfspp_test_shim_static | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0cf5e66 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12822849/HDFS-10679.HDFS-8707.005.patch | | JIRA Issue | HDFS-10679 | | Optional Tests | asflicense compile cc mvnsite javac unit javadoc mvninstall | | uname | Linux f80f14ac333d 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | HDFS-8707 / 8e40027 | | Default Java | 1.7.0_101 | | Multi-JDK versi
[jira] [Commented] (HDFS-9038) DFS reserved space is erroneously counted towards non-DFS used.
[ https://issues.apache.org/jira/browse/HDFS-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414078#comment-15414078 ] Arpit Agarwal commented on HDFS-9038: - Hi [~brahmareddy], thanks for the updated patch. I will try to review this next week. > DFS reserved space is erroneously counted towards non-DFS used. > --- > > Key: HDFS-9038 > URL: https://issues.apache.org/jira/browse/HDFS-9038 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.7.1 >Reporter: Chris Nauroth >Assignee: Brahma Reddy Battula > Attachments: GetFree.java, HDFS-9038-002.patch, HDFS-9038-003.patch, > HDFS-9038-004.patch, HDFS-9038-005.patch, HDFS-9038-006.patch, > HDFS-9038-007.patch, HDFS-9038-008.patch, HDFS-9038-009.patch, HDFS-9038.patch > > > HDFS-5215 changed the DataNode volume available space calculation to consider > the reserved space held by the {{dfs.datanode.du.reserved}} configuration > property. As a side effect, reserved space is now counted towards non-DFS > used. I don't believe it was intentional to change the definition of non-DFS > used. This issue proposes restoring the prior behavior: do not count > reserved space towards non-DFS used. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
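The accounting change HDFS-9038 argues for can be illustrated with a small sketch. This is not the DataNode's actual code; the class and field names below are assumptions. It shows how subtracting the configured reserved space keeps it out of the non-DFS-used figure:

```java
// Hypothetical sketch of the accounting HDFS-9038 argues for; this is NOT
// the actual DataNode code, and the class/field names are illustrative.
public class VolumeUsage {
    private final long capacity;   // raw capacity of the volume, in bytes
    private final long dfsUsed;    // bytes consumed by HDFS block files
    private final long remaining;  // bytes still available to HDFS writers
    private final long reserved;   // dfs.datanode.du.reserved, in bytes

    public VolumeUsage(long capacity, long dfsUsed, long remaining, long reserved) {
        this.capacity = capacity;
        this.dfsUsed = dfsUsed;
        this.remaining = remaining;
        this.reserved = reserved;
    }

    // Space consumed by anything other than HDFS data. Subtracting the
    // reserved headroom keeps it from being reported as if some other
    // application had consumed it; the result is clamped at zero.
    public long nonDfsUsed() {
        return Math.max(0L, capacity - reserved - dfsUsed - remaining);
    }
}
```

With this arithmetic, a volume whose only "missing" bytes are the reserved headroom reports zero non-DFS used, which matches the pre-HDFS-5215 definition the issue wants restored.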
[jira] [Commented] (HDFS-10740) libhdfs++: Implement recursive directory generator
[ https://issues.apache.org/jira/browse/HDFS-10740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414066#comment-15414066 ] Hadoop QA commented on HDFS-10740: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m 19s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 27s{color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 34s{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 36s{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 17s{color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s{color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 10s{color} | {color:green} the patch 
passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 37s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 39s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 6s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 8s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 48s{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK v1.7.0_101. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 66m 6s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0cf5e66 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12822847/HDFS-10740.HDFS-8707.000.patch | | JIRA Issue | HDFS-10740 | | Optional Tests | asflicense compile cc mvnsite javac unit javadoc mvninstall | | uname | Linux 7e11a7a28847 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | HDFS-8707 / 8e40027 | | Default Java | 1.7.0_101 | | Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_101 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_101 | | JDK v1.7.0_101 Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/16363/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: hadoop-hdfs-project/hadoo
[jira] [Commented] (HDFS-10608) Include event for AddBlock in Inotify Event Stream
[ https://issues.apache.org/jira/browse/HDFS-10608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414057#comment-15414057 ] churro morales commented on HDFS-10608: --- Hi [~surendrasingh] Thank you for the review. Our original goal for this patch was to use inotify to replicate blocks to a remote data center. Thus, when we had a new penultimateBlock, we knew it would be complete and could replicate it. The latest block is not ready to be replicated yet, so that is why I added that info. I'll fix items 2, 3, and 4, no problem. Are you okay with keeping penultimateBlock in the event? > Include event for AddBlock in Inotify Event Stream > -- > > Key: HDFS-10608 > URL: https://issues.apache.org/jira/browse/HDFS-10608 > Project: Hadoop HDFS > Issue Type: Task >Reporter: churro morales >Priority: Minor > Attachments: HDFS-10608.patch, HDFS-10608.v1.patch > > > It would be nice to have an AddBlockEvent in the INotify pipeline. Based on > discussions from mailing list: > http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201607.mbox/%3C1467743792.4040080.657624289.7BE240AD%40webmail.messagingengine.com%3E
[jira] [Commented] (HDFS-10739) libhdfs++: In RPC engine replace vector with deque for pending requests
[ https://issues.apache.org/jira/browse/HDFS-10739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414033#comment-15414033 ] Hadoop QA commented on HDFS-10739: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 58s{color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 50s{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 39s{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 16s{color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 39s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 39s{color} | {color:green} the patch passed {color} | | 
{color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 40s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 45s{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK v1.7.0_101. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 49m 37s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0cf5e66 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12822845/HDFS-10739.HDFS-8707.000.patch | | JIRA Issue | HDFS-10739 | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux 3677479aa4bc 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | HDFS-8707 / 8e40027 | | Default Java | 1.7.0_101 | | Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_101 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_101 | | JDK v1.7.0_101 Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/16362/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: hadoop-hdfs-project/hadoop-hdfs-native-client | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/16362/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > libhdfs++: In RPC engine replace vector with deque for pending requests > --- > > Key: HDFS-10739 > URL: https://issues.apache.org/jira/browse/HDFS-10739 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Anatoli Shein >
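The data-structure change proposed in HDFS-10739 is in the C++ client, but the rationale is easy to sketch in Java (the language of the code quoted elsewhere in this digest): pending RPC requests are appended at the tail and retired from the head, which a deque supports in O(1) at both ends, whereas removing from the front of a vector shifts every remaining element. The class and method names below are stand-ins, not the libhdfs++ API:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustration of the FIFO access pattern behind HDFS-10739. The names
// here are hypothetical stand-ins, not the actual libhdfs++ classes.
class PendingRequests {
    private final Deque<String> pending = new ArrayDeque<>();

    void enqueue(String callId) {
        pending.addLast(callId);    // new request joins the tail
    }

    String nextToSend() {
        return pending.pollFirst(); // oldest request leaves the head, O(1)
    }
}
```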
[jira] [Resolved] (HDFS-10741) TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration fails consistently.
[ https://issues.apache.org/jira/browse/HDFS-10741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang resolved HDFS-10741. Resolution: Duplicate Looks like it's a dup of HDFS-10738 where a patch is uploaded. Let's move over and review that patch. Thanks! > TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration fails > consistently. > --- > > Key: HDFS-10741 > URL: https://issues.apache.org/jira/browse/HDFS-10741 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Rushabh S Shah > > Following test is failing consistently in trunk. > Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.783 sec <<< > FAILURE! - in org.apache.hadoop.security.TestRefreshUserMappings > testRefreshSuperUserGroupsConfiguration(org.apache.hadoop.security.TestRefreshUserMappings) > Time elapsed: 3.942 sec <<< FAILURE! > java.lang.AssertionError: first auth for user2 should've succeeded: User: > super_userL is not allowed to impersonate userL2 > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.security.TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration(TestRefreshUserMappings.java:200) > Results : > Failed tests: > TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration:200 first > auth for user2 should've succeeded: User: super_userL is not allowed to > impersonate userL2 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10608) Include event for AddBlock in Inotify Event Stream
[ https://issues.apache.org/jira/browse/HDFS-10608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414010#comment-15414010 ] Surendra Singh Lilhore commented on HDFS-10608: --- Thanks [~churromorales] for the patch. Some review comments from my side: 1. I feel the event should give only the info of the newly added block: name (blk_), generationStamp, file path, and blockPoolId. No need to give {{penultimateBlock}} info. 2. Remove the unused imports from {{InotifyFSEditLogOpTranslator.java}}: {code} +import org.apache.hadoop.hdfs.protocol.ClientProtocol; +import org.apache.hadoop.hdfs.protocol.ExtendedBlock; +import org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo; {code} 3. In the test class, don't change assertions that are not related to this patch. 4. Please fix the checkstyle issue. > Include event for AddBlock in Inotify Event Stream > -- > > Key: HDFS-10608 > URL: https://issues.apache.org/jira/browse/HDFS-10608 > Project: Hadoop HDFS > Issue Type: Task >Reporter: churro morales >Priority: Minor > Attachments: HDFS-10608.patch, HDFS-10608.v1.patch > > > It would be nice to have an AddBlockEvent in the INotify pipeline. Based on > discussions from mailing list: > http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201607.mbox/%3C1467743792.4040080.657624289.7BE240AD%40webmail.messagingengine.com%3E
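A rough sketch of the event payload the review above suggests: only the new block's identity, not the penultimate block. All names here are hypothetical assumptions for illustration and do not reflect the actual patch:

```java
// Hypothetical shape of an AddBlock event carrying only the fields the
// review suggests. Class, constructor, and accessor names are assumptions.
class AddBlockEventSketch {
    private final String path;            // file the block was added to
    private final String blockPoolId;
    private final long blockId;
    private final long generationStamp;

    AddBlockEventSketch(String path, String blockPoolId,
                        long blockId, long generationStamp) {
        this.path = path;
        this.blockPoolId = blockPoolId;
        this.blockId = blockId;
        this.generationStamp = generationStamp;
    }

    // Name of the on-disk block file, e.g. blk_1073741825.
    String blockName() {
        return "blk_" + blockId;
    }

    String getPath() { return path; }
    String getBlockPoolId() { return blockPoolId; }
    long getGenerationStamp() { return generationStamp; }
}
```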
[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-8897: - Attachment: HDFS-8897.006.patch Patch 006: * Fix checkstyle Unit test failures in 005 were unrelated. > Balancer should handle fs.defaultFS trailing slash in HA > > > Key: HDFS-8897 > URL: https://issues.apache.org/jira/browse/HDFS-8897 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover >Affects Versions: 2.7.1 > Environment: Centos 6.6 >Reporter: LINTE >Assignee: John Zhuge > Attachments: HDFS-8897.001.patch, HDFS-8897.002.patch, > HDFS-8897.003.patch, HDFS-8897.004.patch, HDFS-8897.005.patch, > HDFS-8897.006.patch > > > When balancer is launched, it should test if there is already a > /system/balancer.id file in HDFS. > When the file doesn't exist, the balancer don't want to run : > 15/08/14 16:35:12 INFO balancer.Balancer: namenodes = [hdfs://sandbox/, > hdfs://sandbox] > 15/08/14 16:35:12 INFO balancer.Balancer: parameters = > Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration > = 5, number of nodes to be excluded = 0, number of nodes to be included = 0] > Time Stamp Iteration# Bytes Already Moved Bytes Left To Move > Bytes Being Moved > 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from > NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec > 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys > 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, > 30mins, 0sec > 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys > 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from > NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec > 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys > 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, > 30mins, 0sec > java.io.IOException: Another Balancer is running.. 
Exiting ...
> Aug 14, 2015 4:35:14 PM Balancing took 2.408 seconds
> Looking at the audit log file when trying to run the balancer, the balancer creates the /system/balancer.id file and then deletes it on exiting ...
> 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=create src=/system/balancer.id dst=null perm=hdfs:hadoop:rw-r- proto=rpc
> 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo src=/system/balancer.id dst=null perm=null proto=rpc
> 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=delete src=/system/balancer.id dst=null perm=null proto=rpc
> The error seems to be located in org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java.
> The function checkAndMarkRunning returns null even if the /system/balancer.id file doesn't exist before entering this function; if it exists, then it is deleted and the balancer exits with the same error.
>
> private OutputStream checkAndMarkRunning() throws IOException {
>   try {
>     if (fs.exists(idPath)) {
>       // try appending to it so that it will fail fast if another balancer is
>       // running.
>       IOUtils.closeStream(fs.append(idPath));
>       fs.delete(idPath, true);
>     }
>     final FSDataOutputStream fsout = fs.create(idPath, false);
>     // mark balancer idPath to be deleted during filesystem closure
>     fs.deleteOnExit(idPath);
>     if (write2IdFile) {
>       fsout.writeBytes(InetAddress.getLocalHost().getHostName());
>       fsout.hflush();
>     }
>     return fsout;
>   } catch (RemoteException e) {
>     if (AlreadyBeingCreatedException.class.getName().equals(e.getClassName())) {
>       return null;
>     } else {
>       throw e;
>     }
>   }
> }
>
> Regards
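The trailing-slash symptom above (namenodes = [hdfs://sandbox/, hdfs://sandbox]) is essentially a URI-normalization problem; below is a minimal sketch of the kind of normalization that would collapse the duplicates. The class and method names are assumptions for illustration, not the actual fix in the patch:

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;

// Sketch of trailing-slash normalization so that hdfs://sandbox/ and
// hdfs://sandbox resolve to the same namenode entry. Names are assumed.
class NameNodeUris {
    // Strip the trailing slash from an authority-only HDFS URI.
    static URI normalize(URI uri) {
        String s = uri.toString();
        if (s.endsWith("/") && "/".equals(uri.getPath())) {
            return URI.create(s.substring(0, s.length() - 1));
        }
        return uri;
    }

    // Deduplicate a list of namenode URIs after normalization,
    // preserving the original order.
    static Set<URI> dedupe(Iterable<URI> uris) {
        Set<URI> out = new LinkedHashSet<>();
        for (URI u : uris) {
            out.add(normalize(u));
        }
        return out;
    }
}
```

With the two URIs collapsed into one entry, only a single balancer instance would try to create /system/balancer.id, avoiding the "Another Balancer is running" self-collision.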
[jira] [Updated] (HDFS-10741) TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration fails consistently.
[ https://issues.apache.org/jira/browse/HDFS-10741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-10741: -- Description: Following test is failing consistently in trunk. Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.783 sec <<< FAILURE! - in org.apache.hadoop.security.TestRefreshUserMappings testRefreshSuperUserGroupsConfiguration(org.apache.hadoop.security.TestRefreshUserMappings) Time elapsed: 3.942 sec <<< FAILURE! java.lang.AssertionError: first auth for user2 should've succeeded: User: super_userL is not allowed to impersonate userL2 at org.junit.Assert.fail(Assert.java:88) at org.apache.hadoop.security.TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration(TestRefreshUserMappings.java:200) Results : Failed tests: TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration:200 first auth for user2 should've succeeded: User: super_userL is not allowed to impersonate userL2 was: Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.783 sec <<< FAILURE! - in org.apache.hadoop.security.TestRefreshUserMappings testRefreshSuperUserGroupsConfiguration(org.apache.hadoop.security.TestRefreshUserMappings) Time elapsed: 3.942 sec <<< FAILURE! java.lang.AssertionError: first auth for user2 should've succeeded: User: super_userL is not allowed to impersonate userL2 at org.junit.Assert.fail(Assert.java:88) at org.apache.hadoop.security.TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration(TestRefreshUserMappings.java:200) Results : Failed tests: TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration:200 first auth for user2 should've succeeded: User: super_userL is not allowed to impersonate userL2 > TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration fails > consistently. 
> --- > > Key: HDFS-10741 > URL: https://issues.apache.org/jira/browse/HDFS-10741 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Rushabh S Shah > > Following test is failing consistently in trunk. > Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.783 sec <<< > FAILURE! - in org.apache.hadoop.security.TestRefreshUserMappings > testRefreshSuperUserGroupsConfiguration(org.apache.hadoop.security.TestRefreshUserMappings) > Time elapsed: 3.942 sec <<< FAILURE! > java.lang.AssertionError: first auth for user2 should've succeeded: User: > super_userL is not allowed to impersonate userL2 > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.security.TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration(TestRefreshUserMappings.java:200) > Results : > Failed tests: > TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration:200 first > auth for user2 should've succeeded: User: super_userL is not allowed to > impersonate userL2 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
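The "is not allowed to impersonate" failure above is Hadoop's standard proxy-user check. For reference, impersonation rights are granted in core-site.xml via the hadoop.proxyuser.* keys; the user and group names in this sketch are illustrative only:

```xml
<!-- core-site.xml sketch: let super_user impersonate members of group1
     from any host; user/group names here are illustrative only. -->
<property>
  <name>hadoop.proxyuser.super_user.groups</name>
  <value>group1</value>
</property>
<property>
  <name>hadoop.proxyuser.super_user.hosts</name>
  <value>*</value>
</property>
```

These mappings are reloaded at runtime with `hdfs dfsadmin -refreshSuperUserGroupsConfiguration`, which is exactly what the failing test exercises.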
[jira] [Commented] (HDFS-8224) Any IOException in DataTransfer#run() will run diskError thread even if it is not disk error
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414003#comment-15414003 ] Rushabh S Shah commented on HDFS-8224: -- {quote} TestRefreshUserMappings is failing with and without my patch. Will open a ticket to track this. {quote} Filed HDFS-10741 to track this. > Any IOException in DataTransfer#run() will run diskError thread even if it is > not disk error > > > Key: HDFS-8224 > URL: https://issues.apache.org/jira/browse/HDFS-8224 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 2.8.0 > > Attachments: HDFS-8224-trunk-1.patch, HDFS-8224-trunk-2.patch, > HDFS-8224-trunk.patch > > > This happened in our 2.6 cluster. > One of the block and its metadata file were corrupted. > The disk was healthy in this case. > Only the block was corrupt. > Namenode tried to copy that block to another datanode but failed with the > following stack trace: > 2015-04-20 01:04:04,421 > [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN > datanode.DataNode: DatanodeRegistration(a.b.c.d, > datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, > infoSecurePort=0, ipcPort=8020, > storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed > to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to > a1.b1.c1.d1:1004 got > java.io.IOException: Could not create DataChecksum of type 0 with > bytesPerChecksum 0 > at > org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102) > at > 
org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:287) > at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989) > at java.lang.Thread.run(Thread.java:722) > The following catch block in the DataTransfer#run method will treat every > IOException as a disk error and run the disk error check > {noformat} > catch (IOException ie) { > LOG.warn(bpReg + ":Failed to transfer " + b + " to " + > targets[0] + " got ", ie); > // check if there are any disk problem > checkDiskErrorAsync(); > } > {noformat} > This block was never scanned by BlockPoolSliceScanner, otherwise it would have > been reported as a corrupt block. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
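The fix direction implied by HDFS-8224 is to stop treating every IOException as a disk fault. A hypothetical classifier along those lines (the predicate name and the exception list are assumptions, not the actual patch):

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.SyncFailedException;

class DiskErrorFilter {
    // Hypothetical predicate: decide whether an IOException from a block
    // transfer plausibly indicates a failing disk.
    static boolean looksLikeDiskError(IOException ie) {
        String msg = ie.getMessage();
        // A corrupt checksum header ("Could not create DataChecksum ...")
        // is data corruption, not a bad disk; skip the disk checker.
        if (msg != null && msg.contains("Could not create DataChecksum")) {
            return false;
        }
        // Missing block files or failed syncs are more plausibly disk trouble.
        return ie instanceof FileNotFoundException
            || ie instanceof SyncFailedException;
    }
}
```

With such a guard, the catch block in DataTransfer#run would call checkDiskErrorAsync() only when the exception actually suggests hardware trouble.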
[jira] [Created] (HDFS-10741) TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration fails consistently.
Rushabh S Shah created HDFS-10741: - Summary: TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration fails consistently. Key: HDFS-10741 URL: https://issues.apache.org/jira/browse/HDFS-10741 Project: Hadoop HDFS Issue Type: Bug Reporter: Rushabh S Shah Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.783 sec <<< FAILURE! - in org.apache.hadoop.security.TestRefreshUserMappings testRefreshSuperUserGroupsConfiguration(org.apache.hadoop.security.TestRefreshUserMappings) Time elapsed: 3.942 sec <<< FAILURE! java.lang.AssertionError: first auth for user2 should've succeeded: User: super_userL is not allowed to impersonate userL2 at org.junit.Assert.fail(Assert.java:88) at org.apache.hadoop.security.TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration(TestRefreshUserMappings.java:200) Results : Failed tests: TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration:200 first auth for user2 should've succeeded: User: super_userL is not allowed to impersonate userL2 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10738) Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test failure
[ https://issues.apache.org/jira/browse/HDFS-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-10738: Attachment: HDFS-10738-01.patch Attached patch to fix checkstyle warnings. > Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test > failure > > > Key: HDFS-10738 > URL: https://issues.apache.org/jira/browse/HDFS-10738 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-10738-00.patch, HDFS-10738-01.patch > > > This jira is to analyse and fix the test case failure, which is failing in > Jenkins build, > [Build_16326|https://builds.apache.org/job/PreCommit-HDFS-Build/16326/testReport/org.apache.hadoop.security/TestRefreshUserMappings/testRefreshSuperUserGroupsConfiguration/] > very frequently. > {code} > Error Message > first auth for user2 should've succeeded: User: super_userL is not allowed to > impersonate userL2 > Stacktrace > java.lang.AssertionError: first auth for user2 should've succeeded: User: > super_userL is not allowed to impersonate userL2 > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.security.TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration(TestRefreshUserMappings.java:200) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-10738) Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test failure
[ https://issues.apache.org/jira/browse/HDFS-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413953#comment-15413953 ] Rakesh R edited comment on HDFS-10738 at 8/9/16 6:22 PM: - Attached patch by fixing checkstyle warnings. was (Author: rakeshr): Attached patch to fix checkstyle warnings. > Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test > failure > > > Key: HDFS-10738 > URL: https://issues.apache.org/jira/browse/HDFS-10738 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-10738-00.patch, HDFS-10738-01.patch > > > This jira is to analyse and fix the test case failure, which is failing in > Jenkins build, > [Build_16326|https://builds.apache.org/job/PreCommit-HDFS-Build/16326/testReport/org.apache.hadoop.security/TestRefreshUserMappings/testRefreshSuperUserGroupsConfiguration/] > very frequently. > {code} > Error Message > first auth for user2 should've succeeded: User: super_userL is not allowed to > impersonate userL2 > Stacktrace > java.lang.AssertionError: first auth for user2 should've succeeded: User: > super_userL is not allowed to impersonate userL2 > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.security.TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration(TestRefreshUserMappings.java:200) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8224) Any IOException in DataTransfer#run() will run diskError thread even if it is not disk error
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-8224: - Attachment: HDFS-8224-trunk-2.patch Attaching a patch for trunk. Will post a patch for branch-2.8 and branch-2.9 once the trunk one gets reviewed. > Any IOException in DataTransfer#run() will run diskError thread even if it is > not disk error > > > Key: HDFS-8224 > URL: https://issues.apache.org/jira/browse/HDFS-8224 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 2.8.0 > > Attachments: HDFS-8224-trunk-1.patch, HDFS-8224-trunk-2.patch, > HDFS-8224-trunk.patch > > > This happened in our 2.6 cluster. > One of the block and its metadata file were corrupted. > The disk was healthy in this case. > Only the block was corrupt. > Namenode tried to copy that block to another datanode but failed with the > following stack trace: > 2015-04-20 01:04:04,421 > [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN > datanode.DataNode: DatanodeRegistration(a.b.c.d, > datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, > infoSecurePort=0, ipcPort=8020, > storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed > to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to > a1.b1.c1.d1:1004 got > java.io.IOException: Could not create DataChecksum of type 0 with > bytesPerChecksum 0 > at > org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102) > at > org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:287) > at > 
org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989) > at java.lang.Thread.run(Thread.java:722) > The following catch block in the DataTransfer#run method will treat every > IOException as a disk error and run the disk error check > {noformat} > catch (IOException ie) { > LOG.warn(bpReg + ":Failed to transfer " + b + " to " + > targets[0] + " got ", ie); > // check if there are any disk problem > checkDiskErrorAsync(); > } > {noformat} > This block was never scanned by BlockPoolSliceScanner, otherwise it would have > been reported as a corrupt block. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8224) Any IOException in DataTransfer#run() will run diskError thread even if it is not disk error
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-8224: - Status: Patch Available (was: Open) > Any IOException in DataTransfer#run() will run diskError thread even if it is > not disk error > > > Key: HDFS-8224 > URL: https://issues.apache.org/jira/browse/HDFS-8224 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 2.8.0 > > Attachments: HDFS-8224-trunk-1.patch, HDFS-8224-trunk-2.patch, > HDFS-8224-trunk.patch > > > This happened in our 2.6 cluster. > One of the block and its metadata file were corrupted. > The disk was healthy in this case. > Only the block was corrupt. > Namenode tried to copy that block to another datanode but failed with the > following stack trace: > 2015-04-20 01:04:04,421 > [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN > datanode.DataNode: DatanodeRegistration(a.b.c.d, > datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, > infoSecurePort=0, ipcPort=8020, > storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed > to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to > a1.b1.c1.d1:1004 got > java.io.IOException: Could not create DataChecksum of type 0 with > bytesPerChecksum 0 > at > org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102) > at > org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:287) > at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989) > at 
java.lang.Thread.run(Thread.java:722) > The following catch block in the DataTransfer#run method will treat every > IOException as a disk error and run the disk error check > {noformat} > catch (IOException ie) { > LOG.warn(bpReg + ":Failed to transfer " + b + " to " + > targets[0] + " got ", ie); > // check if there are any disk problem > checkDiskErrorAsync(); > } > {noformat} > This block was never scanned by BlockPoolSliceScanner, otherwise it would have > been reported as a corrupt block. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10679) libhdfs++: Implement parallel find with wildcards tool
[ https://issues.apache.org/jira/browse/HDFS-10679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anatoli Shein updated HDFS-10679: - Attachment: HDFS-10679.HDFS-8707.005.patch Fixed locking to use lock_guards and a time-out in the RPC request. This patch also includes patch HDFS-10739. > libhdfs++: Implement parallel find with wildcards tool > -- > > Key: HDFS-10679 > URL: https://issues.apache.org/jira/browse/HDFS-10679 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Anatoli Shein >Assignee: Anatoli Shein > Attachments: HDFS-10679.HDFS-8707.000.patch, > HDFS-10679.HDFS-8707.001.patch, HDFS-10679.HDFS-8707.002.patch, > HDFS-10679.HDFS-8707.003.patch, HDFS-10679.HDFS-8707.004.patch, > HDFS-10679.HDFS-8707.005.patch > > > The find tool will issue the GetListing namenode operation on a given > directory, and filter the results using a POSIX globbing library. > If the recursive option is selected, for each returned entry that is a > directory the tool will issue another asynchronous GetListing call and repeat > the result processing in a recursive fashion. > One implementation issue that needs to be addressed is how results > are returned to the user: we can either buffer the results and return > them to the user in bulk, or we can return results continuously as they > arrive. While buffering would be an easier solution, returning results as > they arrive would be more beneficial to the user in terms of performance, > since the result processing can start as soon as the first results arrive > without any delay. In order to do that we need the user to use a loop to > process arriving results, and we need to send a special message back to the > user when the search is over. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
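The list-filter-recurse loop described in HDFS-10679 can be sketched against a local filesystem with java.nio. The real tool is asynchronous C++ against the NameNode's GetListing; this synchronous Java version only mirrors the shape, and all names are illustrative:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.ArrayList;
import java.util.List;

class FindSketch {
    // List a directory, keep entries matching a POSIX-style glob, and
    // recurse into subdirectories (the analogue of repeated GetListing).
    static List<Path> find(Path dir, String glob) throws IOException {
        PathMatcher m = dir.getFileSystem().getPathMatcher("glob:" + glob);
        List<Path> hits = new ArrayList<>();
        try (DirectoryStream<Path> ds = Files.newDirectoryStream(dir)) {
            for (Path p : ds) {
                if (m.matches(p.getFileName())) {
                    hits.add(p);                    // emit result as soon as it is seen
                }
                if (Files.isDirectory(p)) {
                    hits.addAll(find(p, glob));     // recurse like GetListing
                }
            }
        }
        return hits;
    }

    // Build a small temporary tree and count the "*.txt" matches in it.
    static int demo() {
        try {
            Path root = Files.createTempDirectory("findsketch");
            Files.createFile(root.resolve("a.txt"));
            Files.createDirectory(root.resolve("sub"));
            Files.createFile(root.resolve("sub").resolve("b.txt"));
            Files.createFile(root.resolve("sub").resolve("c.log"));
            return find(root, "*.txt").size();      // a.txt and sub/b.txt
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

The buffering question raised in the description shows up here too: this sketch buffers hits into a list, whereas the streaming design would hand each hit to a callback as soon as it is added.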
[jira] [Updated] (HDFS-8224) Any IOException in DataTransfer#run() will run diskError thread even if it is not disk error
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-8224: - Status: Open (was: Patch Available) Cancelling the patch since I forgot to add a new file to the patch. > Any IOException in DataTransfer#run() will run diskError thread even if it is > not disk error > > > Key: HDFS-8224 > URL: https://issues.apache.org/jira/browse/HDFS-8224 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 2.8.0 > > Attachments: HDFS-8224-trunk-1.patch, HDFS-8224-trunk.patch > > > This happened in our 2.6 cluster. > One of the block and its metadata file were corrupted. > The disk was healthy in this case. > Only the block was corrupt. > Namenode tried to copy that block to another datanode but failed with the > following stack trace: > 2015-04-20 01:04:04,421 > [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN > datanode.DataNode: DatanodeRegistration(a.b.c.d, > datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, > infoSecurePort=0, ipcPort=8020, > storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed > to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to > a1.b1.c1.d1:1004 got > java.io.IOException: Could not create DataChecksum of type 0 with > bytesPerChecksum 0 > at > org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102) > at > org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:287) > at > 
org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989) > at java.lang.Thread.run(Thread.java:722) > The following catch block in the DataTransfer#run method will treat every > IOException as a disk error and run the disk error check > {noformat} > catch (IOException ie) { > LOG.warn(bpReg + ":Failed to transfer " + b + " to " + > targets[0] + " got ", ie); > // check if there are any disk problem > checkDiskErrorAsync(); > } > {noformat} > This block was never scanned by BlockPoolSliceScanner, otherwise it would have > been reported as a corrupt block. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10738) Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test failure
[ https://issues.apache.org/jira/browse/HDFS-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413936#comment-15413936 ] Hadoop QA commented on HDFS-10738: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s{color} | 
{color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 23s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 25 unchanged - 6 fixed = 27 total (was 31) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 81m 5s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}101m 4s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.tracing.TestTracing | | | hadoop.hdfs.server.datanode.TestLargeBlockReport | | | hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation | | | hadoop.hdfs.server.namenode.ha.TestHASafeMode | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12822825/HDFS-10738-00.patch | | JIRA Issue | HDFS-10738 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 2516fbd092d7 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 4aba858 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/16360/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/16360/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/16360/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/16360/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generate
[jira] [Commented] (HDFS-10739) libhdfs++: In RPC engine replace vector with deque for pending requests
[ https://issues.apache.org/jira/browse/HDFS-10739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413934#comment-15413934 ] James Clampffer commented on HDFS-10739: +1 Thanks for finding/fixing this. I'll commit as soon as it passes CI tests. > libhdfs++: In RPC engine replace vector with deque for pending requests > --- > > Key: HDFS-10739 > URL: https://issues.apache.org/jira/browse/HDFS-10739 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Anatoli Shein >Assignee: Anatoli Shein > Attachments: HDFS-10739.HDFS-8707.000.patch > > > Needs to be added in order to improve performance -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10739) libhdfs++: In RPC engine replace vector with deque for pending requests
[ https://issues.apache.org/jira/browse/HDFS-10739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413929#comment-15413929 ] Bob Hansen commented on HDFS-10739: --- Looks good. +1 if it passes tests. > libhdfs++: In RPC engine replace vector with deque for pending requests > --- > > Key: HDFS-10739 > URL: https://issues.apache.org/jira/browse/HDFS-10739 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Anatoli Shein >Assignee: Anatoli Shein > Attachments: HDFS-10739.HDFS-8707.000.patch > > > Needs to be added in order to improve performance -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
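The motivation behind the vector-to-deque swap in HDFS-10739 is that pending RPC requests form a FIFO queue: popping the head of a vector shifts every remaining element (O(n)), while a deque pops it in O(1). A minimal Java analogue of the same design choice (class and method names invented):

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PendingRequests {
    // ArrayDeque gives O(1) at both ends, unlike ArrayList.remove(0),
    // which shifts the whole backing array on every head removal.
    private final Deque<String> pending = new ArrayDeque<>();

    void enqueue(String callId) {
        pending.addLast(callId);        // new RPC request joins the tail
    }

    String dispatchNext() {
        return pending.pollFirst();     // oldest request leaves the head, O(1)
    }

    int size() {
        return pending.size();
    }
}
```

Requests are dispatched in arrival order, so under load the head-removal cost dominates; that is the performance gap the patch closes.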
[jira] [Updated] (HDFS-10740) libhdfs++: Implement recursive directory generator
[ https://issues.apache.org/jira/browse/HDFS-10740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anatoli Shein updated HDFS-10740: - Attachment: HDFS-10740.HDFS-8707.000.patch Generator implemented. This also has the HDFS-10739 fix applied. > libhdfs++: Implement recursive directory generator > -- > > Key: HDFS-10740 > URL: https://issues.apache.org/jira/browse/HDFS-10740 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Anatoli Shein >Assignee: Anatoli Shein > Attachments: HDFS-10740.HDFS-8707.000.patch > > > This tool will allow us to do benchmarking/testing of our find functionality, and > will be a good example showing how to call a large number of namenode > operations recursively. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10740) libhdfs++: Implement recursive directory generator
[ https://issues.apache.org/jira/browse/HDFS-10740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anatoli Shein updated HDFS-10740: - Status: Patch Available (was: Open) > libhdfs++: Implement recursive directory generator > -- > > Key: HDFS-10740 > URL: https://issues.apache.org/jira/browse/HDFS-10740 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Anatoli Shein >Assignee: Anatoli Shein > > This tool will allow us to do benchmarking/testing of our find functionality, and > will be a good example showing how to call a large number of namenode > operations recursively. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-10740) libhdfs++: Implement recursive directory generator
[ https://issues.apache.org/jira/browse/HDFS-10740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anatoli Shein reassigned HDFS-10740: Assignee: Anatoli Shein > libhdfs++: Implement recursive directory generator > -- > > Key: HDFS-10740 > URL: https://issues.apache.org/jira/browse/HDFS-10740 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Anatoli Shein >Assignee: Anatoli Shein > > This tool will allow us to do benchmarking/testing of our find functionality, and > will be a good example showing how to call a large number of namenode > operations recursively. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-10740) libhdfs++: Implement recursive directory generator
Anatoli Shein created HDFS-10740: Summary: libhdfs++: Implement recursive directory generator Key: HDFS-10740 URL: https://issues.apache.org/jira/browse/HDFS-10740 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Anatoli Shein This tool will allow us to do benchmarking/testing of our find functionality, and will be a good example showing how to call a large number of namenode operations recursively. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
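A directory generator of the kind HDFS-10740 describes can be sketched recursively: each level creates a fixed fanout of subdirectories down to a given depth, so the total directory count is predictable for benchmarking a find tool. This local-filesystem sketch is not the libhdfs++ tool itself; method and parameter names are illustrative:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class TreeGenerator {
    // Create `fanout` subdirectories under root, recursing until `depth`
    // levels are built; returns the number of directories created.
    static int generate(Path root, int depth, int fanout) throws IOException {
        if (depth == 0) {
            return 0;
        }
        int created = 0;
        for (int i = 0; i < fanout; i++) {
            Path child = root.resolve("dir" + i);
            Files.createDirectories(child);
            created += 1 + generate(child, depth - 1, fanout);
        }
        return created;
    }

    // Build a small tree in a temp directory: depth 2, fanout 3
    // gives 3 + 3*3 = 12 directories.
    static int demo() {
        try {
            Path root = Files.createTempDirectory("treegen");
            return generate(root, 2, 3);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Against HDFS the same shape would issue one mkdir RPC per node, which is exactly the kind of namenode-operation fan-out the ticket wants to exercise.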
[jira] [Updated] (HDFS-10739) libhdfs++: In RPC engine replace vector with deque for pending requests
[ https://issues.apache.org/jira/browse/HDFS-10739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anatoli Shein updated HDFS-10739: - Attachment: HDFS-10739.HDFS-8707.000.patch Attached. > libhdfs++: In RPC engine replace vector with deque for pending requests > --- > > Key: HDFS-10739 > URL: https://issues.apache.org/jira/browse/HDFS-10739 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Anatoli Shein >Assignee: Anatoli Shein > Attachments: HDFS-10739.HDFS-8707.000.patch > > > Needs to be added in order to improve performance -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-10739) libhdfs++: In RPC engine replace vector with deque for pending requests
[ https://issues.apache.org/jira/browse/HDFS-10739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anatoli Shein reassigned HDFS-10739: Assignee: Anatoli Shein > libhdfs++: In RPC engine replace vector with deque for pending requests > --- > > Key: HDFS-10739 > URL: https://issues.apache.org/jira/browse/HDFS-10739 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Anatoli Shein >Assignee: Anatoli Shein > > Needs to be added in order to improve performance
[jira] [Created] (HDFS-10739) libhdfs++: In RPC engine replace vector with deque for pending requests
Anatoli Shein created HDFS-10739: Summary: libhdfs++: In RPC engine replace vector with deque for pending requests Key: HDFS-10739 URL: https://issues.apache.org/jira/browse/HDFS-10739 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Anatoli Shein Needs to be added in order to improve performance
[jira] [Updated] (HDFS-10739) libhdfs++: In RPC engine replace vector with deque for pending requests
[ https://issues.apache.org/jira/browse/HDFS-10739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anatoli Shein updated HDFS-10739: - Status: Patch Available (was: Open) > libhdfs++: In RPC engine replace vector with deque for pending requests > --- > > Key: HDFS-10739 > URL: https://issues.apache.org/jira/browse/HDFS-10739 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Anatoli Shein >Assignee: Anatoli Shein > > Needs to be added in order to improve performance
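[Editor's note] HDFS-10739 targets the libhdfs++ (C++) RPC engine, where std::deque gives O(1) removal at the front of the pending-request queue, while std::vector shifts every remaining element on each front removal. The same reasoning in miniature, sketched here in Java with ArrayDeque; the class and method names below are illustrative, not from the actual patch:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Pending RPC requests are handled FIFO: new calls are appended at the tail
// and dispatched from the head. A vector/ArrayList makes each head removal
// O(n) (every remaining element shifts down); a deque makes both ends O(1).
public class PendingRequestQueue {
  private final Deque<String> pending = new ArrayDeque<>();

  public void enqueue(String callId) {
    pending.addLast(callId);    // O(1) append at the tail
  }

  public String dispatchNext() {
    return pending.pollFirst(); // O(1) removal at the head; null if empty
  }

  public static void main(String[] args) {
    PendingRequestQueue q = new PendingRequestQueue();
    q.enqueue("call-1");
    q.enqueue("call-2");
    System.out.println(q.dispatchNext()); // call-1
    System.out.println(q.dispatchNext()); // call-2
  }
}
```

With many in-flight calls, turning each dispatch from O(n) into O(1) is exactly the performance win the issue describes.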
[jira] [Commented] (HDFS-8406) Lease recovery continually failed
[ https://issues.apache.org/jira/browse/HDFS-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413905#comment-15413905 ] Ravi Prakash commented on HDFS-8406: It may not be related to HDFS-9194. I am seeing this on a 2.7.1 cluster (which has HDFS-6651 already (HDFS-9194 is duplicating HDFS-6651)) > Lease recovery continually failed > - > > Key: HDFS-8406 > URL: https://issues.apache.org/jira/browse/HDFS-8406 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Keith Turner > Labels: Accumulo, HBase, SolrCloud > > While testing Accumulo on a cluster and killing processes, I ran into a > situation where the lease on an accumulo write ahead log in HDFS could not be > recovered. Even restarting HDFS and Accumulo would not fix the problem. > The following message was seen in an Accumulo tablet server log immediately > before the tablet server was killed. > {noformat} > 2015-05-14 17:12:37,466 [hdfs.DFSClient] WARN : DFSOutputStream > ResponseProcessor exception for block > BP-802741494-10.1.5.6-1431557089849:blk_1073932823_192060 > java.io.IOException: Bad response ERROR for block > BP-802741494-10.1.5.6-1431557089849:blk_1073932823_192060 from datanode > 10.1.5.9:50010 > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:897) > 2015-05-14 17:12:37,466 [hdfs.DFSClient] WARN : Error Recovery for block > BP-802741494-10.1.5.6-1431557089849:blk_1073932823_192060 in pipeline > 10.1.5.55:50010, 10.1.5.9:5 > {noformat} > Before recovering data from a write ahead log, the Accumulo master attempts > to recover the lease. This repeatedly failed with messages like the > following. 
> {noformat} > 2015-05-14 17:14:54,301 [recovery.HadoopLogCloser] WARN : Error recovering > lease on > hdfs://10.1.5.6:1/accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): > failed to create file > /accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 for > DFSClient_NONMAPREDUCE_950713214_16 for client 10.1.5.158 because > pendingCreates is non-null but no leases found. > {noformat} > Below is some info from the NN logs for the problematic file. > {noformat} > [ec2-user@leader2 logs]$ grep 3a731759-3594-4535-8086-245 > hadoop-ec2-user-namenode-leader2.log > 2015-05-14 17:10:46,299 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > allocateBlock: > /accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2. > BP-802741494-10.1.5.6-1431557089849 > blk_1073932823_192060{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, > replicas=[ReplicaUnderConstruction[[DISK]DS-ffe07d7d-0e68-45b8-b3d5-c976f1716481:NORMAL:10.1.5.55:50010|RBW], > > ReplicaUnderConstruction[[DISK]DS-6efec702-3f1f-4ec0-a31f-de947e7e6097:NORMAL:10.1.5.9:50010|RBW], > > ReplicaUnderConstruction[[DISK]DS-5e27df17-abf8-47df-b4bc-c38d0cd426ea:NORMAL:10.1.5.45:50010|RBW]]} > 2015-05-14 17:10:46,628 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > fsync: /accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 for > DFSClient_NONMAPREDUCE_-1128465883_16 > 2015-05-14 17:14:49,288 INFO > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: recoverLease: [Lease. > Holder: DFSClient_NONMAPREDUCE_-1128465883_16, pendingcreates: 1], > src=/accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 from > client DFSClient_NONMAPREDUCE_-1128465883_16 > 2015-05-14 17:14:49,288 INFO > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering [Lease. 
> Holder: DFSClient_NONMAPREDUCE_-1128465883_16, pendingcreates: 1], > src=/accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 > 2015-05-14 17:14:49,289 WARN org.apache.hadoop.hdfs.StateChange: DIR* > NameSystem.internalReleaseLease: File > /accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 has not been > closed. Lease recovery is in progress. RecoveryId = 192257 for block > blk_1073932823_192060{blockUCState=UNDER_RECOVERY, primaryNodeIndex=2, > replicas=[ReplicaUnderConstruction[[DISK]DS-ffe07d7d-0e68-45b8-b3d5-c976f1716481:NORMAL:10.1.5.55:50010|RBW], > > ReplicaUnderConstruction[[DISK]DS-6efec702-3f1f-4ec0-a31f-de947e7e6097:NORMAL:10.1.5.9:50010|RBW], > > ReplicaUnderConstruction[[DISK]DS-5e27df17-abf8-47df-b4bc-c38d0cd426ea:NORMAL:10.1.5.45:50010|RBW]]} > java.lang.IllegalStateException: Failed to finalize INodeFile > 3a731759-3594-4535-8086-245eed7cd4c2 since blocks[0] is non-complete, where > blocks=[blk_1073932823_192257{blockUCState=COMMITTED, primaryNodeIndex=2, > replicas=[ReplicaUnderConstruction[[DISK]DS-ffe07d7d-0e68-45b8-b3d5-c976f1716481:NORMAL:10.1.5.55:50010|RBW], > > ReplicaUnderConstruction[[DISK]DS-5e27df17-a
[jira] [Updated] (HDFS-10682) Replace FsDatasetImpl object lock with a separate lock object
[ https://issues.apache.org/jira/browse/HDFS-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Liang updated HDFS-10682: -- Attachment: HDFS-10682-branch-2.003.patch Somehow one call was missed in a method, which broke all the places calling it. Fixed in this patch. > Replace FsDatasetImpl object lock with a separate lock object > - > > Key: HDFS-10682 > URL: https://issues.apache.org/jira/browse/HDFS-10682 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Chen Liang >Assignee: Chen Liang > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-10682-branch-2.001.patch, > HDFS-10682-branch-2.002.patch, HDFS-10682-branch-2.003.patch, > HDFS-10682.001.patch, HDFS-10682.002.patch, HDFS-10682.003.patch, > HDFS-10682.004.patch, HDFS-10682.005.patch, HDFS-10682.006.patch, > HDFS-10682.007.patch, HDFS-10682.008.patch, HDFS-10682.009.patch, > HDFS-10682.010.patch > > > This Jira proposes to replace the FsDatasetImpl object lock with a separate > lock object. Doing so will make it easier to measure lock statistics like > lock held time and warn about potential lock contention due to slow disk > operations. > Right now we can use org.apache.hadoop.util.AutoCloseableLock. In the future > we can also consider replacing the lock with a read-write lock.
[jira] [Updated] (HDFS-10682) Replace FsDatasetImpl object lock with a separate lock object
[ https://issues.apache.org/jira/browse/HDFS-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Liang updated HDFS-10682: -- Status: In Progress (was: Patch Available) > Replace FsDatasetImpl object lock with a separate lock object > - > > Key: HDFS-10682 > URL: https://issues.apache.org/jira/browse/HDFS-10682 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Chen Liang >Assignee: Chen Liang > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-10682-branch-2.001.patch, > HDFS-10682-branch-2.002.patch, HDFS-10682.001.patch, HDFS-10682.002.patch, > HDFS-10682.003.patch, HDFS-10682.004.patch, HDFS-10682.005.patch, > HDFS-10682.006.patch, HDFS-10682.007.patch, HDFS-10682.008.patch, > HDFS-10682.009.patch, HDFS-10682.010.patch > > > This Jira proposes to replace the FsDatasetImpl object lock with a separate > lock object. Doing so will make it easier to measure lock statistics like > lock held time and warn about potential lock contention due to slow disk > operations. > Right now we can use org.apache.hadoop.util.AutoCloseableLock. In the future > we can also consider replacing the lock with a read-write lock.
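[Editor's note] The HDFS-10682 description mentions org.apache.hadoop.util.AutoCloseableLock. A minimal, self-contained sketch of that pattern follows; Hadoop's real class differs in detail, and the metrics hooks noted in the comments are hypothetical:

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch of the AutoCloseableLock idea: wrapping an explicit lock object lets
// FsDatasetImpl guard its state with try-with-resources instead of
// synchronized(this), and gives a single choke point where held-time metrics
// or slow-disk contention warnings could be added later. This standalone
// version only mirrors the pattern, not the real Hadoop class.
class AutoCloseableLock implements AutoCloseable {
  private final ReentrantLock lock = new ReentrantLock();

  public AutoCloseableLock acquire() {
    lock.lock();     // hypothetical metrics hook: record acquire timestamp
    return this;
  }

  @Override
  public void close() {
    lock.unlock();   // hypothetical metrics hook: warn if held too long
  }

  public boolean isLocked() {
    return lock.isLocked();
  }
}

public class DatasetLockDemo {
  public static void main(String[] args) {
    AutoCloseableLock datasetLock = new AutoCloseableLock();
    try (AutoCloseableLock l = datasetLock.acquire()) {
      // critical section: mutate dataset state while the lock is held
      System.out.println("held: " + datasetLock.isLocked()); // held: true
    }
    // close() released the lock automatically at the end of the try block
    System.out.println("held: " + datasetLock.isLocked());   // held: false
  }
}
```

Because the lock is a separate object rather than `this`, it can later be swapped for a read-write lock without changing any caller, which is exactly the follow-up the description anticipates.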
[jira] [Commented] (HDFS-8224) Any IOException in DataTransfer#run() will run diskError thread even if it is not disk error
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413883#comment-15413883 ] Hadoop QA commented on HDFS-8224: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 40s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | 
{color:red} 0m 19s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 24s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 31s{color} | {color:red} root in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 31s{color} | {color:red} root in the patch failed. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 26s{color} | {color:green} root: The patch generated 0 new + 497 unchanged - 2 fixed = 497 total (was 499) {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 21s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 25s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 18s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 24s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 58s{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs generated 1 new + 7 unchanged - 0 fixed = 8 total (was 7) {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 21s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 26s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 29m 56s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12822831/HDFS-8224-trunk-1.patch | | JIRA Issue | HDFS-8224 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux d0d3f2244de6 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / b10c936 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | mvni
[jira] [Commented] (HDFS-10342) BlockManager#createLocatedBlocks should not check corrupt replicas if none are corrupt
[ https://issues.apache.org/jira/browse/HDFS-10342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413874#comment-15413874 ] Hudson commented on HDFS-10342: --- SUCCESS: Integrated in Hadoop-trunk-Commit #10244 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10244/]) HDFS-10342. BlockManager#createLocatedBlocks should not check corrupt (kihwal: rev b10c936020e2616609dcb3b2126e8c34328c10ca) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java > BlockManager#createLocatedBlocks should not check corrupt replicas if none > are corrupt > -- > > Key: HDFS-10342 > URL: https://issues.apache.org/jira/browse/HDFS-10342 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: Kuhu Shukla > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-10342.001.patch, HDFS-10342.002.patch, > HDFS-10342.003.patch > > > {{corruptReplicas#isReplicaCorrupt(block, node)}} is called for every node > while populating the machines array. There's no need to invoke the method if > {{corruptReplicas#numCorruptReplicas(block)}} returned 0.
[jira] [Commented] (HDFS-8224) Any IOException in DataTransfer#run() will run diskError thread even if it is not disk error
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413845#comment-15413845 ] Wei-Chiu Chuang commented on HDFS-8224: --- Sure I'll review soon. Thx for the patch. > Any IOException in DataTransfer#run() will run diskError thread even if it is > not disk error > > > Key: HDFS-8224 > URL: https://issues.apache.org/jira/browse/HDFS-8224 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 2.8.0 > > Attachments: HDFS-8224-trunk-1.patch, HDFS-8224-trunk.patch > > > This happened in our 2.6 cluster. > One of the block and its metadata file were corrupted. > The disk was healthy in this case. > Only the block was corrupt. > Namenode tried to copy that block to another datanode but failed with the > following stack trace: > 2015-04-20 01:04:04,421 > [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN > datanode.DataNode: DatanodeRegistration(a.b.c.d, > datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, > infoSecurePort=0, ipcPort=8020, > storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed > to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to > a1.b1.c1.d1:1004 got > java.io.IOException: Could not create DataChecksum of type 0 with > bytesPerChecksum 0 > at > org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102) > at > org.apache.hadoop.hdfs.server.datanode.BlockSender.(BlockSender.java:287) > at > 
org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989) > at java.lang.Thread.run(Thread.java:722) > The following catch block in the DataTransfer#run method will treat every > IOException as a disk error fault and run the disk error check > {noformat} > catch (IOException ie) { > LOG.warn(bpReg + ":Failed to transfer " + b + " to " + > targets[0] + " got ", ie); > // check if there are any disk problem > checkDiskErrorAsync(); > } > {noformat} > This block was never scanned by BlockPoolSliceScanner, otherwise it would have > been reported as a corrupt block.
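[Editor's note] The report above argues that a checksum-header parse failure is data corruption, not a failing volume, so it should not trigger the expensive disk check. A hypothetical sketch of that narrower handling follows; the helper names (looksLikeDiskFault, handleTransferFailure) are invented for illustration and are not the HDFS-8224 patch:

```java
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical narrowing of the catch block quoted in the issue: only an
// IOException that plausibly indicates bad media should kick off the
// asynchronous volume scan; a malformed checksum header (a corrupt block
// file on a healthy disk) should be left to corruption reporting instead.
public class TransferErrorDemo {

  static boolean looksLikeDiskFault(IOException ie) {
    // Crude illustrative heuristic: missing files or raw I/O errors suggest
    // a failing volume; anything else is treated as data-level corruption.
    return ie instanceof FileNotFoundException
        || (ie.getMessage() != null
            && ie.getMessage().contains("Input/output error"));
  }

  static void handleTransferFailure(IOException ie) {
    if (looksLikeDiskFault(ie)) {
      checkDiskErrorAsync(); // expensive volume scan, only when warranted
    }
    // otherwise: just log; the block scanner/reader will flag corruption
  }

  static void checkDiskErrorAsync() { /* stub for the real async check */ }

  public static void main(String[] args) {
    // The exception from the issue's stack trace is corruption, not a disk
    // fault, so it should not trigger the disk checker:
    System.out.println(looksLikeDiskFault(new IOException(
        "Could not create DataChecksum of type 0 with bytesPerChecksum 0")));
  }
}
```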
[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413846#comment-15413846 ] Daryn Sharp commented on HDFS-10301: My main objections (other than the fatal bug) are the incompatible change to the protocol coupled with essentially a malformed block report buffer. It's an attempt to shoehorn into the block report processing what should be handled by a heartbeat's storage reports. I think when you say my compatibility concern was addressed, it wasn't code fixed, but stated as don't-do-that? Won't the empty storage reports in the last rpc cause an older NN to go into a replication storm? Full downtime on a ~5k cluster to rollback, then ~40 mins to go active, is unacceptable when a failover to the prior release would have worked if not for this patch. This approach will also negate asynchronously processing FBRs (like I did with IBRs). Zombies should be handled by the heartbeat's pruning of excess storages. As an illustration, shouldn't something close to this work? 
{code}
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
@@ -466,11 +466,16 @@ public void updateHeartbeatState(StorageReport[] reports, long cacheCapacity,
     setLastUpdateMonotonic(Time.monotonicNow());
     this.volumeFailures = volFailures;
     this.volumeFailureSummary = volumeFailureSummary;
+
+    boolean storagesUpToDate = true;
     for (StorageReport report : reports) {
       DatanodeStorageInfo storage = updateStorage(report.getStorage());
       if (checkFailedStorages) {
         failedStorageInfos.remove(storage);
       }
+      // don't prune unless block reports for all the storages in the
+      // heartbeat have been processed
+      storagesUpToDate &= (storage.getLastBlockReportId() == curBlockReportId);
       storage.receivedHeartbeat(report);
       totalCapacity += report.getCapacity();
@@ -492,7 +497,8 @@ public void updateHeartbeatState(StorageReport[] reports, long cacheCapacity,
     synchronized (storageMap) {
       storageMapSize = storageMap.size();
     }
-    if (storageMapSize != reports.length) {
+    if (curBlockReportId != 0
+        ? storagesUpToDate : storageMapSize != reports.length) {
       pruneStorageMap(reports);
     }
   }
@@ -527,6 +533,7 @@ private void pruneStorageMap(final StorageReport[] reports) {
         // This can occur until all block reports are received.
         LOG.debug("Deferring removal of stale storage {} with {} blocks",
             storageInfo, storageInfo.numBlocks());
+        storageInfo.setState(DatanodeStorage.State.FAILED);
       }
     }
   }
{code}
The next heartbeat after all reports are sent triggers the pruning. Other changes are required, such as removal of much of the context processing code, similar to the current patch.
> BlockReport retransmissions may lead to storages falsely being declared > zombie if storage report processing happens out of order > > > Key: HDFS-10301 > URL: https://issues.apache.org/jira/browse/HDFS-10301 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.1 >Reporter: Konstantin Shvachko >Assignee: Vinitha Reddy Gankidi >Priority: Critical > Fix For: 2.7.4 > > Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, > HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.006.patch, > HDFS-10301.007.patch, HDFS-10301.008.patch, HDFS-10301.009.patch, > HDFS-10301.01.patch, HDFS-10301.010.patch, HDFS-10301.011.patch, > HDFS-10301.012.patch, HDFS-10301.013.patch, HDFS-10301.branch-2.7.patch, > HDFS-10301.branch-2.patch, HDFS-10301.sample.patch, zombieStorageLogs.rtf > > > When the NameNode is busy, a DataNode can time out sending a block report. Then it > sends the block report again. The NameNode, while processing these two reports > at the same time, can interleave processing storages from different reports. > This corrupts the blockReportId field, which makes the NameNode think that some > storages are zombies. Replicas from zombie storages are immediately removed, > causing missing blocks.
[jira] [Updated] (HDFS-8224) Any IOException in DataTransfer#run() will run diskError thread even if it is not disk error
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-8224: - Status: Open (was: Patch Available) Cancelling the patch to address checkstyle issues. Following are the failing tests: TestRollingUpgrade#testRollback is failing and is tracked via HDFS-9664 TestTracing#testTracing is failing and is tracked via HADOOP-13473 TestHttpServerLifecycle is running fine on my machine. TestRefreshUserMappings is failing with and without my patch. Will open a ticket to track this. > Any IOException in DataTransfer#run() will run diskError thread even if it is > not disk error > > > Key: HDFS-8224 > URL: https://issues.apache.org/jira/browse/HDFS-8224 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 2.8.0 > > Attachments: HDFS-8224-trunk.patch > > > This happened in our 2.6 cluster. > One of the blocks and its metadata file were corrupted. > The disk was healthy in this case. > Only the block was corrupt. 
> Namenode tried to copy that block to another datanode but failed with the > following stack trace: > 2015-04-20 01:04:04,421 > [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN > datanode.DataNode: DatanodeRegistration(a.b.c.d, > datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, > infoSecurePort=0, ipcPort=8020, > storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed > to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to > a1.b1.c1.d1:1004 got > java.io.IOException: Could not create DataChecksum of type 0 with > bytesPerChecksum 0 > at > org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102) > at > org.apache.hadoop.hdfs.server.datanode.BlockSender.(BlockSender.java:287) > at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989) > at java.lang.Thread.run(Thread.java:722) > The following catch block in the DataTransfer#run method will treat every > IOException as a disk error fault and run the disk error check > {noformat} > catch (IOException ie) { > LOG.warn(bpReg + ":Failed to transfer " + b + " to " + > targets[0] + " got ", ie); > // check if there are any disk problem > checkDiskErrorAsync(); > } > {noformat} > This block was never scanned by BlockPoolSliceScanner, otherwise it would have > been reported as a corrupt block.
[jira] [Updated] (HDFS-10342) BlockManager#createLocatedBlocks should not check corrupt replicas if none are corrupt
[ https://issues.apache.org/jira/browse/HDFS-10342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-10342: -- Target Version/s: 2.8.0 > BlockManager#createLocatedBlocks should not check corrupt replicas if none > are corrupt > -- > > Key: HDFS-10342 > URL: https://issues.apache.org/jira/browse/HDFS-10342 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: Kuhu Shukla > Attachments: HDFS-10342.001.patch, HDFS-10342.002.patch, > HDFS-10342.003.patch > > > {{corruptReplicas#isReplicaCorrupt(block, node)}} is called for every node > while populating the machines array. There's no need to invoke the method if > {{corruptReplicas#numCorruptReplicas(block)}} returned 0.
[jira] [Updated] (HDFS-10342) BlockManager#createLocatedBlocks should not check corrupt replicas if none are corrupt
[ https://issues.apache.org/jira/browse/HDFS-10342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-10342: -- Fix Version/s: 3.0.0-alpha2 > BlockManager#createLocatedBlocks should not check corrupt replicas if none > are corrupt > -- > > Key: HDFS-10342 > URL: https://issues.apache.org/jira/browse/HDFS-10342 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: Kuhu Shukla > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-10342.001.patch, HDFS-10342.002.patch, > HDFS-10342.003.patch > > > {{corruptReplicas#isReplicaCorrupt(block, node)}} is called for every node > while populating the machines array. There's no need to invoke the method if > {{corruptReplicas#numCorruptReplicas(block)}} returned 0.
[jira] [Commented] (HDFS-10342) BlockManager#createLocatedBlocks should not check corrupt replicas if none are corrupt
[ https://issues.apache.org/jira/browse/HDFS-10342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413838#comment-15413838 ] Kihwal Lee commented on HDFS-10342: --- Committed this to trunk. > BlockManager#createLocatedBlocks should not check corrupt replicas if none > are corrupt > -- > > Key: HDFS-10342 > URL: https://issues.apache.org/jira/browse/HDFS-10342 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: Kuhu Shukla > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-10342.001.patch, HDFS-10342.002.patch, > HDFS-10342.003.patch > > > {{corruptReplicas#isReplicaCorrupt(block, node)}} is called for every node > while populating the machines array. There's no need to invoke the method if > {{corruptReplicas#numCorruptReplicas(block)}} returned 0.
[jira] [Commented] (HDFS-10342) BlockManager#createLocatedBlocks should not check corrupt replicas if none are corrupt
[ https://issues.apache.org/jira/browse/HDFS-10342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413835#comment-15413835 ] Kihwal Lee commented on HDFS-10342: --- +1 lgtm. This is a simple optimization for the common case, but since it is called for every open-read operation, the overall positive impact will be meaningful. Would you provide a branch-2 version of the patch? > BlockManager#createLocatedBlocks should not check corrupt replicas if none > are corrupt > -- > > Key: HDFS-10342 > URL: https://issues.apache.org/jira/browse/HDFS-10342 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: Kuhu Shukla > Attachments: HDFS-10342.001.patch, HDFS-10342.002.patch, > HDFS-10342.003.patch > > > {{corruptReplicas#isReplicaCorrupt(block, node)}} is called for every node > while populating the machines array. There's no need to invoke the method if > {{corruptReplicas#numCorruptReplicas(block)}} returned 0.
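[Editor's note] The HDFS-10342 description boils down to a short-circuit: fetch the corrupt-replica count once per block and only consult the per-node map when it is nonzero. A sketch of that shape follows; CorruptReplicasStub and countCorrupt are illustrative stand-ins, not BlockManager code:

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for BlockManager's corrupt-replica bookkeeping, instrumented to
// count how many per-node lookups actually happen.
public class LocatedBlocksDemo {
  static class CorruptReplicasStub {
    private final Map<String, Boolean> corruptByNode = new HashMap<>();
    int perNodeLookups = 0; // demo instrumentation only

    int numCorruptReplicas(String block) {
      return corruptByNode.size(); // one cheap count per block
    }

    boolean isReplicaCorrupt(String block, String node) {
      perNodeLookups++;            // the call the optimization avoids
      return corruptByNode.getOrDefault(node, false);
    }
  }

  static int countCorrupt(CorruptReplicasStub m, String block, String[] nodes) {
    int numCorrupt = m.numCorruptReplicas(block); // fetched once up front
    int found = 0;
    for (String node : nodes) {
      // Short-circuit: in the common, no-corruption case the per-node
      // lookup is never invoked at all.
      if (numCorrupt != 0 && m.isReplicaCorrupt(block, node)) {
        found++;
      }
    }
    return found;
  }

  public static void main(String[] args) {
    CorruptReplicasStub m = new CorruptReplicasStub();
    countCorrupt(m, "blk_1", new String[] {"dn1", "dn2", "dn3"});
    System.out.println(m.perNodeLookups); // 0: lookups were short-circuited
  }
}
```

Since createLocatedBlocks runs on every open-for-read, skipping the per-node map probe when no replica is corrupt is exactly why the reviewers expect a meaningful aggregate win.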
[jira] [Updated] (HDFS-8224) Any IOException in DataTransfer#run() will run diskError thread even if it is not disk error
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-8224: - Status: Patch Available (was: Open) [~jojochuang]: Since you know the most context, do you mind reviewing the patch ? > Any IOException in DataTransfer#run() will run diskError thread even if it is > not disk error > > > Key: HDFS-8224 > URL: https://issues.apache.org/jira/browse/HDFS-8224 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 2.8.0 > > Attachments: HDFS-8224-trunk-1.patch, HDFS-8224-trunk.patch > > > This happened in our 2.6 cluster. > One of the block and its metadata file were corrupted. > The disk was healthy in this case. > Only the block was corrupt. > Namenode tried to copy that block to another datanode but failed with the > following stack trace: > 2015-04-20 01:04:04,421 > [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN > datanode.DataNode: DatanodeRegistration(a.b.c.d, > datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, > infoSecurePort=0, ipcPort=8020, > storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed > to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to > a1.b1.c1.d1:1004 got > java.io.IOException: Could not create DataChecksum of type 0 with > bytesPerChecksum 0 > at > org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140) > at > org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102) > at > org.apache.hadoop.hdfs.server.datanode.BlockSender.(BlockSender.java:287) > at > 
org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989) > at java.lang.Thread.run(Thread.java:722) > The following catch block in the DataTransfer#run method treats every > IOException as a disk fault and triggers the disk error check: > {noformat} > catch (IOException ie) { > LOG.warn(bpReg + ":Failed to transfer " + b + " to " + > targets[0] + " got ", ie); > // check if there are any disk problem > checkDiskErrorAsync(); > } > {noformat} > This block was never scanned by BlockPoolSliceScanner; otherwise it would have > been reported as a corrupt block. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
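A minimal sketch of the idea behind the report: escalate to a disk check only when the IOException plausibly indicates a bad disk. This is an illustrative heuristic, not the actual HDFS-8224 patch; the class and method names here are made up for the example.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.FileSystemException;

// Illustrative filter (hypothetical, not Hadoop code): a checksum-header
// parse failure like "Could not create DataChecksum of type 0" should not
// trigger a disk error scan, while file-system level failures should.
public class DiskErrorFilter {

    static boolean isLikelyDiskError(IOException ie) {
        // File-system level exceptions usually point at the volume itself.
        if (ie instanceof FileNotFoundException || ie instanceof FileSystemException) {
            return true;
        }
        // Fallback heuristic on the kernel's classic EIO message text.
        String msg = ie.getMessage();
        return msg != null && msg.contains("Input/output error");
    }

    public static void main(String[] args) {
        IOException checksumFailure =
            new IOException("Could not create DataChecksum of type 0 with bytesPerChecksum 0");
        IOException eio = new IOException("Input/output error");

        // Corrupt metadata is not a disk fault; a raw I/O error is.
        System.out.println("checksum failure -> disk check: " + isLikelyDiskError(checksumFailure));
        System.out.println("EIO              -> disk check: " + isLikelyDiskError(eio));
    }
}
```

With a filter of this shape, the corrupt-block case in this report would be logged but would not spawn the disk error thread.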
[jira] [Updated] (HDFS-8224) Any IOException in DataTransfer#run() will run diskError thread even if it is not disk error
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-8224: - Attachment: HDFS-8224-trunk-1.patch > Any IOException in DataTransfer#run() will run diskError thread even if it is > not disk error > > > Key: HDFS-8224 > URL: https://issues.apache.org/jira/browse/HDFS-8224 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 2.8.0 > > Attachments: HDFS-8224-trunk-1.patch, HDFS-8224-trunk.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10733) NameNode terminated after full GC thinking QJM is unresponsive.
[ https://issues.apache.org/jira/browse/HDFS-10733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413811#comment-15413811 ] Kihwal Lee commented on HDFS-10733: --- Can we do something similar to HDFS-9107? > NameNode terminated after full GC thinking QJM is unresponsive. > --- > > Key: HDFS-10733 > URL: https://issues.apache.org/jira/browse/HDFS-10733 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, qjm >Affects Versions: 2.6.4 >Reporter: Konstantin Shvachko > > NameNode went into full GC while in {{AsyncLoggerSet.waitForWriteQuorum()}}. > After completing GC it checks if the timeout for quorum is reached. If the GC > was long enough the timeout can expire, and {{QuorumCall.waitFor()}} will > throw {{TimeoutException}}. Finally {{FSEditLog.logSync()}} catches the > exception and terminates the NameNode. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
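For context, the usual way to tell "the peer was slow" apart from "we were paused in GC" is sleep-drift detection: a thread sleeps a fixed interval and flags a pause when the observed gap is much larger than requested. This is only a sketch of that idea; the class and method names are illustrative and are not Hadoop's JvmPauseMonitor API.

```java
// Sketch of GC-pause detection via sleep drift. If a monitor thread asked to
// sleep 50 ms actually observes seconds of wall-clock time, the JVM was
// almost certainly paused, and a quorum timeout measured across that gap
// should not be trusted.
public class PauseSketch {

    // Pure helper: a pause is "detected" when the observed elapsed time
    // exceeds the requested sleep by more than the threshold.
    static boolean pausedLongerThan(long sleepMs, long observedMs, long thresholdMs) {
        return (observedMs - sleepMs) > thresholdMs;
    }

    public static void main(String[] args) throws InterruptedException {
        final long sleepMs = 50;
        long start = System.nanoTime();
        Thread.sleep(sleepMs);
        long observedMs = (System.nanoTime() - start) / 1_000_000;
        // Without a GC pause the drift should stay well under one second.
        System.out.println("paused: " + pausedLongerThan(sleepMs, observedMs, 1000));
    }
}
```

A timeout check that consults such a detector before declaring the QJM unresponsive would avoid terminating the NameNode after its own full GC.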
[jira] [Updated] (HDFS-10738) Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test failure
[ https://issues.apache.org/jira/browse/HDFS-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-10738: Target Version/s: 2.8.0 (was: 2.9.0) > Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test > failure > > > Key: HDFS-10738 > URL: https://issues.apache.org/jira/browse/HDFS-10738 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-10738-00.patch > > > This jira is to analyse and fix the test case failure that is failing very > frequently in the Jenkins build > [Build_16326|https://builds.apache.org/job/PreCommit-HDFS-Build/16326/testReport/org.apache.hadoop.security/TestRefreshUserMappings/testRefreshSuperUserGroupsConfiguration/]. > {code} > Error Message > first auth for user2 should've succeeded: User: super_userL is not allowed to > impersonate userL2 > Stacktrace > java.lang.AssertionError: first auth for user2 should've succeeded: User: > super_userL is not allowed to impersonate userL2 > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.security.TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration(TestRefreshUserMappings.java:200) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
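For background on the failure message ("User: super_userL is not allowed to impersonate userL2"): Hadoop impersonation is governed by the proxyuser settings in core-site.xml, which is exactly the configuration this test refreshes. A typical fragment looks like the following (the user name is taken from the error message; the group name and host wildcard are illustrative values):

```xml
<!-- Allow super_userL to impersonate members of group1 from any host. -->
<property>
  <name>hadoop.proxyuser.super_userL.groups</name>
  <value>group1</value>
</property>
<property>
  <name>hadoop.proxyuser.super_userL.hosts</name>
  <value>*</value>
</property>
```

If the group mapping has not been refreshed when authorization runs, the impersonated user is not yet in an allowed group, producing exactly the AssertionError quoted above.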
[jira] [Updated] (HDFS-10738) Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test failure
[ https://issues.apache.org/jira/browse/HDFS-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-10738: Target Version/s: 2.9.0 Status: Patch Available (was: Open) > Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test > failure > > > Key: HDFS-10738 > URL: https://issues.apache.org/jira/browse/HDFS-10738 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-10738-00.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10738) Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test failure
[ https://issues.apache.org/jira/browse/HDFS-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413791#comment-15413791 ] Rakesh R commented on HDFS-10738: - Thank you [~kihwal] for the advice; it worked. Attached a patch with the suggested approach. > Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test > failure > > > Key: HDFS-10738 > URL: https://issues.apache.org/jira/browse/HDFS-10738 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-10738-00.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org