[jira] [Updated] (HDFS-16808) HDFS metrics will hold the previous value if there is no new call
[ https://issues.apache.org/jira/browse/HDFS-16808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HDFS-16808:
----------------------------------
    Labels: pull-request-available  (was: )

> HDFS metrics will hold the previous value if there is no new call
> -----------------------------------------------------------------
>
>                 Key: HDFS-16808
>                 URL: https://issues.apache.org/jira/browse/HDFS-16808
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>            Reporter: leo sun
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2022-10-19-23-59-19-673.png
>
> According to the implementation of MutableStat.snapshot(), HDFS metrics will
> always hold the previous value if there are no new calls.
> As a result, even after a user switches active and standby, the previous
> ANN (now standby) will keep reporting the old value, as the picture shows:
> !image-2022-10-19-23-59-19-673.png!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
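For readers unfamiliar with the behavior described above: a minimal, self-contained sketch (not the actual Hadoop MutableStat class; names here are illustrative) of how a snapshot that only recomputes its published value when new samples have arrived keeps emitting a stale number.

```java
// Illustrative sketch of a "sticky" interval metric: snapshotMean() only
// recomputes the published value when add() was called since the last
// snapshot; otherwise the previously published value is returned again.
public class StickyStatSketch {
    private double intervalSum = 0;
    private long intervalCount = 0;
    private double lastMean = 0;     // value published by the previous snapshot
    private boolean changed = false;

    // record one sample (e.g. the latency of one RPC call)
    public void add(double value) {
        intervalSum += value;
        intervalCount++;
        changed = true;
    }

    // publish the interval mean; with no new samples, the old value sticks
    public double snapshotMean() {
        if (changed) {
            lastMean = intervalSum / intervalCount;
            intervalSum = 0;
            intervalCount = 0;
            changed = false;
        }
        return lastMean;
    }

    public static void main(String[] args) {
        StickyStatSketch stat = new StickyStatSketch();
        stat.add(4.0);
        stat.add(6.0);
        System.out.println(stat.snapshotMean()); // 5.0
        // no new calls arrive, e.g. the NameNode just became standby:
        System.out.println(stat.snapshotMean()); // still 5.0, not 0.0
    }
}
```

This is why the former active NameNode keeps exporting its last observed rates after failover: nothing resets the published value between snapshots when the call volume drops to zero.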
[jira] [Commented] (HDFS-16808) HDFS metrics will hold the previous value if there is no new call
[ https://issues.apache.org/jira/browse/HDFS-16808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620779#comment-17620779 ]

ASF GitHub Bot commented on HDFS-16808:
---------------------------------------

ted12138 opened a new pull request, #5049:
URL: https://github.com/apache/hadoop/pull/5049

   ### Description of PR
   ### How was this patch tested?
   ### For code changes:
   - [ ] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?
[jira] [Resolved] (HDFS-16806) EC data balancer: block blk_id index error, data cannot be moved
[ https://issues.apache.org/jira/browse/HDFS-16806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ruiliang resolved HDFS-16806.
-----------------------------
    Hadoop Flags: Reviewed
      Resolution: Fixed

> EC data balancer: block blk_id index error, data cannot be moved
> ----------------------------------------------------------------
>
>                 Key: HDFS-16806
>                 URL: https://issues.apache.org/jira/browse/HDFS-16806
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 3.1.0
>            Reporter: ruiliang
>            Priority: Critical
>         Attachments: image-2022-10-20-11-32-35-833.png
>
> EC data balancer: the block blk_id index is wrong, so data cannot be moved.
> DataNode 10.12.15.149 is at 100% disk usage.
>
> {code:java}
> echo 10.12.15.149 > sourcehost
> balancer -fs hdfs://xxcluster06 -threshold 10 -source -f sourcehost 2>>~/balancer.log &
> {code}
>
> The DataNode logs contain a large amount of this output:
> {code:java}
> 2022-10-19 14:43:02,031 ERROR datanode.DataNode (DataXceiver.java:run(321)) -
> fs-hiido-dn-12-15-149.xx.com:1019:DataXceiver error processing COPY_BLOCK
> operation src: /10.12.65.216:58214 dst: /10.12.15.149:1019
> org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException: Replica not found for
> BP-1822992414-10.12.65.48-1660893388633:blk_-9223372036799576592_4218617
>     at org.apache.hadoop.hdfs.server.datanode.BlockSender.getReplica(BlockSender.java:492)
>     at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:256)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.copyBlock(DataXceiver.java:1089)
>     at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opCopyBlock(Receiver.java:291)
>     at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:113)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
>     at java.lang.Thread.run(Thread.java:748)
> ...
>
> hdfs fsck -fs hdfs://xxcluster06 -blockId blk_-9223372036799576592
> Connecting to namenode via http://fs-hiido-xxcluster06-yynn2.xx.com:50070/fsck?ugi=hdfs&blockId=blk_-9223372036799576592+&path=%2F
> FSCK started by hdfs (auth:KERBEROS_SSL) from /10.12.19.4 at Wed Oct 19 14:47:15 CST 2022
> Block Id: blk_-9223372036799576592
> Block belongs to: /hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz
> No. of Expected Replica: 5
> No. of live Replica: 5
> No. of excess Replica: 0
> No. of stale Replica: 5
> No. of decommissioned Replica: 0
> No. of decommissioning Replica: 0
> No. of corrupted Replica: 0
> Block replica on datanode/rack: fs-hiido-dn-12-66-4.xx.com/4F08-01-09 is HEALTHY
> Block replica on datanode/rack: fs-hiido-dn-12-65-244.xx.com/4F08-01-08 is HEALTHY
> Block replica on datanode/rack: fs-hiido-dn-12-15-149.xx.com/4F08-05-13 is HEALTHY
> Block replica on datanode/rack: fs-hiido-dn-12-65-218.xx.com/4F08-12-04 is HEALTHY
> Block replica on datanode/rack: fs-hiido-dn-12-17-35.xx.com/4F08-03-03 is HEALTHY
>
> hdfs fsck -fs hdfs://xxcluster06 /hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz -files -blocks -locations
> Connecting to namenode via http://xx.com:50070/fsck?ugi=hdfs&files=1&blocks=1&locations=1&path=%2Fhive_warehouse%2Fwarehouse_old_snapshots%2Fyy_mbsdkevent_original%2Fdt%3D20210505%2Fpost_202105052129_33.log.gz
> FSCK started by hdfs (auth:KERBEROS_SSL) from /10.12.19.4 for path /hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz at Wed Oct 19 14:48:42 CST 2022
> /hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz 500582412 bytes, erasure-coded: policy=RS-3-2-1024k, 1 block(s): OK
> 0. BP-1822992414-10.12.65.48-1660893388633:blk_-9223372036799576592_4218617 len=500582412 Live_repl=5
> [blk_-9223372036799576592:DatanodeInfoWithStorage[10.12.17.35:1019,DS-3ccebf8d-5f05-45b5-ac7f-96d1cfb48608,DISK],
> blk_-9223372036799576591:DatanodeInfoWithStorage[10.12.65.218:1019,DS-4f8e3114-7566-4cf1-ad5a-e454c8ea8805,DISK],
> blk_-9223372036799576590:DatanodeInfoWithStorage[10.12.15.149:1019,DS-1dd55c27-8f47-46a6-935b-1d9024ca9188,DISK],
> blk_-9223372036799576589:DatanodeInfoWithStorage[10.12.65.244:1019,DS-a9ffd747-c427-4aaa-8559-04cded7d9d5f,DISK],
> blk_-9223372036799576588:DatanodeInfoWithStorage[10.12.66.4:1019,DS-d88f94db-6db1-4753-a652-780d7cd7f081,DISK]]
> Status: HEALTHY
> Number of data-nodes: 62
> Number of racks: 19
> Total dirs: 0
> Total symlinks: 0
>
> Replicated Blocks:
> Total size: 0 B
> Total files: 0
> Total blocks (validated): 0
> Minimally replicated blocks: 0
> Over-replica
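A note on the IDs in the fsck output above: for an erasure-coded block group, HDFS derives each internal block's ID from the (negative) block group ID plus the replica's index within the group, which is why this RS-3-2 group lists blk_-...592 through blk_-...588. A small sketch of that arithmetic (the method name is illustrative, simplified from HDFS internals):

```java
// Illustrative sketch of striped-block ID arithmetic: each internal block
// of an EC block group gets the group ID plus its index in the group.
public class EcBlockIdSketch {
    // RS-3-2 => 3 data + 2 parity internal blocks per block group
    static final int BLOCKS_PER_GROUP = 5;

    // internal block id = block group id + index within the group
    static long internalBlockId(long blockGroupId, int index) {
        return blockGroupId + index;
    }

    public static void main(String[] args) {
        long group = -9223372036799576592L; // block group from the fsck output
        for (int i = 0; i < BLOCKS_PER_GROUP; i++) {
            System.out.println("blk_" + internalBlockId(group, i));
        }
    }
}
```

If the balancer computes the wrong index (the bug fixed by HDFS-16333, which the reporter backported below), it asks a DataNode for an internal block ID that the node does not store, producing exactly the ReplicaNotFoundException flood in the log above.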
[jira] [Commented] (HDFS-3570) Balancer shouldn't rely on "DFS Space Used %" as that ignores non-DFS used space
[ https://issues.apache.org/jira/browse/HDFS-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620751#comment-17620751 ]

ASF GitHub Bot commented on HDFS-3570:
--------------------------------------

hadoop-yetus commented on PR #5044:
URL: https://github.com/apache/hadoop/pull/5044#issuecomment-1284967160

   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime | Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: | reexec | 0m 56s | | Docker mode activated. |
   | _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
   | +0 :ok: | codespell | 0m 0s | | codespell was not available. |
   | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
   | +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. |
   | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
   | -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
   | _ trunk Compile Tests _ |
   | +1 :green_heart: | mvninstall | 42m 15s | | trunk passed |
   | +1 :green_heart: | compile | 1m 44s | | trunk passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 |
   | +1 :green_heart: | compile | 1m 30s | | trunk passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   | +1 :green_heart: | checkstyle | 1m 26s | | trunk passed |
   | +1 :green_heart: | mvnsite | 1m 42s | | trunk passed |
   | +1 :green_heart: | javadoc | 1m 26s | | trunk passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 |
   | +1 :green_heart: | javadoc | 1m 39s | | trunk passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   | +1 :green_heart: | spotbugs | 3m 40s | | trunk passed |
   | +1 :green_heart: | shadedclient | 23m 2s | | branch has no errors when building and testing our client artifacts. |
   | _ Patch Compile Tests _ |
   | +1 :green_heart: | mvninstall | 1m 24s | | the patch passed |
   | +1 :green_heart: | compile | 1m 23s | | the patch passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 |
   | +1 :green_heart: | javac | 1m 23s | | the patch passed |
   | +1 :green_heart: | compile | 1m 19s | | the patch passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   | +1 :green_heart: | javac | 1m 19s | | the patch passed |
   | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
   | +1 :green_heart: | checkstyle | 0m 58s | | the patch passed |
   | +1 :green_heart: | mvnsite | 1m 25s | | the patch passed |
   | +1 :green_heart: | javadoc | 0m 56s | | the patch passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 |
   | +1 :green_heart: | javadoc | 1m 30s | | the patch passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   | +1 :green_heart: | spotbugs | 3m 16s | | the patch passed |
   | +1 :green_heart: | shadedclient | 22m 28s | | patch has no errors when building and testing our client artifacts. |
   | _ Other Tests _ |
   | +1 :green_heart: | unit | 243m 56s | | hadoop-hdfs in the patch passed. |
   | +1 :green_heart: | asflicense | 1m 7s | | The patch does not generate ASF License warnings. |
   | | | | 357m 41s | | |

   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5044/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5044 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint |
   | uname | Linux 35a3ff40da0a 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 43e802df586a2e1d8e8a429d64b9163b20d927f9 |
   | Default Java | Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5044/2/testReport/ |
   | Max. process+thread count | 3023 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5044/2/conso
[jira] [Commented] (HDFS-16806) EC data balancer: block blk_id index error, data cannot be moved
[ https://issues.apache.org/jira/browse/HDFS-16806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620704#comment-17620704 ]

ruiliang commented on HDFS-16806:
---------------------------------

After backporting HDFS-16333, I updated only hadoop-hdfs.jar on the balancer client, and the problem was solved. The figure below compares behavior before and after the update.

!image-2022-10-20-11-32-35-833.png!
[jira] [Updated] (HDFS-16806) EC data balancer: block blk_id index error, data cannot be moved
[ https://issues.apache.org/jira/browse/HDFS-16806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ruiliang updated HDFS-16806:
----------------------------
    Attachment: image-2022-10-20-11-32-35-833.png
[jira] [Commented] (HDFS-16803) Improve some annotations in hdfs module
[ https://issues.apache.org/jira/browse/HDFS-16803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620690#comment-17620690 ]

ASF GitHub Bot commented on HDFS-16803:
---------------------------------------

ZanderXu commented on PR #5031:
URL: https://github.com/apache/hadoop/pull/5031#issuecomment-1284845545

   Merged into trunk. Thanks @jianghuazhu for your contribution, and thanks @ashutoshcipher @DaveTeng0 for your review.

> Improve some annotations in hdfs module
> ---------------------------------------
>
>                 Key: HDFS-16803
>                 URL: https://issues.apache.org/jira/browse/HDFS-16803
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: documentation, namenode
>    Affects Versions: 2.9.2, 3.3.4
>            Reporter: JiangHua Zhu
>            Assignee: JiangHua Zhu
>            Priority: Major
>              Labels: pull-request-available
>
> In the hdfs module, some annotations (javadoc comments) are out of date. E.g.:
> {code:java}
> FSDirRenameOp:
>   /**
>    * @see {@link #unprotectedRenameTo(FSDirectory, String, String, INodesInPath,
>    *      INodesInPath, long, BlocksMapUpdateInfo, Options.Rename...)}
>    */
>   static RenameResult renameTo(FSDirectory fsd, FSPermissionChecker pc,
>       String src, String dst, BlocksMapUpdateInfo collectedBlocks,
>       boolean logRetryCache, Options.Rename... options)
>       throws IOException {
> {code}
> We should improve these annotations to make the documentation more readable.
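As a concrete instance of the cleanup: `@see` already takes a reference directly, so wrapping it in `{@link}` as in the quoted example is redundant and is flagged by the javadoc tool. A sketch of the corrected comment (signature abbreviated; not necessarily the exact text of the merged patch):

```java
/**
 * @see #unprotectedRenameTo(FSDirectory, String, String, INodesInPath,
 *      INodesInPath, long, BlocksMapUpdateInfo, Options.Rename...)
 */
static RenameResult renameTo(FSDirectory fsd, FSPermissionChecker pc,
    String src, String dst, BlocksMapUpdateInfo collectedBlocks,
    boolean logRetryCache, Options.Rename... options)
    throws IOException {
```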
[jira] [Resolved] (HDFS-16803) Improve some annotations in hdfs module
[ https://issues.apache.org/jira/browse/HDFS-16803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ZanderXu resolved HDFS-16803.
-----------------------------
    Fix Version/s: 3.4.0
     Hadoop Flags: Reviewed
       Resolution: Fixed
[jira] [Commented] (HDFS-16803) Improve some annotations in hdfs module
[ https://issues.apache.org/jira/browse/HDFS-16803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620689#comment-17620689 ]

ASF GitHub Bot commented on HDFS-16803:
---------------------------------------

ZanderXu merged PR #5031:
URL: https://github.com/apache/hadoop/pull/5031
[jira] [Commented] (HDFS-16803) Improve some annotations in hdfs module
[ https://issues.apache.org/jira/browse/HDFS-16803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620678#comment-17620678 ]

ASF GitHub Bot commented on HDFS-16803:
---------------------------------------

jianghuazhu commented on PR #5031:
URL: https://github.com/apache/hadoop/pull/5031#issuecomment-1284803556

   @ashutoshcipher, thank you for helping review this PR. @ZanderXu, could you help merge it into the trunk branch? Thanks.
[jira] [Commented] (HDFS-3570) Balancer shouldn't rely on "DFS Space Used %" as that ignores non-DFS used space
[ https://issues.apache.org/jira/browse/HDFS-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620568#comment-17620568 ]

ASF GitHub Bot commented on HDFS-3570:
--------------------------------------

hadoop-yetus commented on PR #5044:
URL: https://github.com/apache/hadoop/pull/5044#issuecomment-1284516102

   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime | Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: | reexec | 0m 56s | | Docker mode activated. |
   | _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
   | +0 :ok: | codespell | 0m 0s | | codespell was not available. |
   | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
   | +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. |
   | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
   | -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
   | _ trunk Compile Tests _ |
   | +1 :green_heart: | mvninstall | 42m 8s | | trunk passed |
   | +1 :green_heart: | compile | 1m 38s | | trunk passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 |
   | +1 :green_heart: | compile | 1m 31s | | trunk passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   | +1 :green_heart: | checkstyle | 1m 17s | | trunk passed |
   | +1 :green_heart: | mvnsite | 1m 40s | | trunk passed |
   | +1 :green_heart: | javadoc | 1m 16s | | trunk passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 |
   | +1 :green_heart: | javadoc | 1m 38s | | trunk passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   | +1 :green_heart: | spotbugs | 3m 44s | | trunk passed |
   | +1 :green_heart: | shadedclient | 25m 59s | | branch has no errors when building and testing our client artifacts. |
   | _ Patch Compile Tests _ |
   | +1 :green_heart: | mvninstall | 1m 25s | | the patch passed |
   | +1 :green_heart: | compile | 1m 28s | | the patch passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 |
   | +1 :green_heart: | javac | 1m 28s | | the patch passed |
   | +1 :green_heart: | compile | 1m 19s | | the patch passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   | +1 :green_heart: | javac | 1m 19s | | the patch passed |
   | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
   | -0 :warning: | checkstyle | 1m 0s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5044/1/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 21 unchanged - 0 fixed = 22 total (was 21) |
   | +1 :green_heart: | mvnsite | 1m 26s | | the patch passed |
   | +1 :green_heart: | javadoc | 0m 56s | | the patch passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 |
   | +1 :green_heart: | javadoc | 1m 25s | | the patch passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   | +1 :green_heart: | spotbugs | 3m 33s | | the patch passed |
   | +1 :green_heart: | shadedclient | 25m 44s | | patch has no errors when building and testing our client artifacts. |
   | _ Other Tests _ |
   | -1 :x: | unit | 353m 14s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5044/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
   | +1 :green_heart: | asflicense | 0m 56s | | The patch does not generate ASF License warnings. |
   | | | | 471m 51s | | |

   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.ha.TestObserverNode |

   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5044/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5044 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint |
   | uname | Linux f6778e909231 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / e0e1a60554aa05ff878fc9685e6cb4b3ec01f618 |
   | Default Java | Private Build-1.8.0_342
[jira] [Commented] (HDFS-16771) JN should tersely print logs about NewerTxnIdException
[ https://issues.apache.org/jira/browse/HDFS-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620476#comment-17620476 ] Erik Krogen commented on HDFS-16771: Thanks for catching my mistake [~ferhui] ! > JN should tersely print logs about NewerTxnIdException > -- > > Key: HDFS-16771 > URL: https://issues.apache.org/jira/browse/HDFS-16771 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: ZanderXu >Assignee: ZanderXu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > JournalNode should tersely print some logs about NewerTxnIdException. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16808) HDFS metrics will hold the previous value if there is no new call
leo sun created HDFS-16808:
--
Summary: HDFS metrics will hold the previous value if there is no new call
Key: HDFS-16808
URL: https://issues.apache.org/jira/browse/HDFS-16808
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs
Reporter: leo sun
Attachments: image-2022-10-19-23-59-19-673.png

According to the implementation of MutableStat.snapshot(), HDFS metrics will keep reporting the previous value when no new calls arrive. As a result, even after the user switches active and standby, the previous ANN (now standby) will keep emitting the old value, as the picture shows: !image-2022-10-19-23-59-19-673.png!
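The behavior described above can be sketched with a toy model (a hypothetical class, not Hadoop's actual org.apache.hadoop.metrics2.lib.MutableStat): the snapshot re-publishes the last computed interval statistic whenever no new samples have arrived since the previous collection.

```java
// Simplified model of the stale-snapshot behavior reported in HDFS-16808.
// Hypothetical class; Hadoop's real MutableStat is more elaborate, but the
// key point is the same: lastAvg survives intervals with no new samples.
import java.util.ArrayList;
import java.util.List;

class SketchStat {
    private final List<Long> interval = new ArrayList<>();
    private double lastAvg = 0.0;   // survives across collection intervals
    private boolean changed = false;

    void add(long value) {         // a new "call" sample
        interval.add(value);
        changed = true;
    }

    // Called on each metrics collection, mirroring snapshot() semantics.
    double snapshot() {
        if (changed) {
            lastAvg = interval.stream().mapToLong(Long::longValue)
                    .average().orElse(0.0);
            interval.clear();
            changed = false;
        }
        // With no new call, the previous average is re-emitted: this is why
        // a NameNode that became standby keeps reporting its old values.
        return lastAvg;
    }

    public static void main(String[] args) {
        SketchStat rpcTime = new SketchStat();
        rpcTime.add(10);
        rpcTime.add(30);
        System.out.println(rpcTime.snapshot()); // fresh interval: 20.0
        System.out.println(rpcTime.snapshot()); // no new call: still 20.0
    }
}
```

In this model, resetting `lastAvg` when an interval has no samples (or clearing stats on failover) would avoid emitting the stale value, which is the direction the issue proposes.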
[jira] [Updated] (HDFS-3570) Balancer shouldn't rely on "DFS Space Used %" as that ignores non-DFS used space
[ https://issues.apache.org/jira/browse/HDFS-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDFS-3570: - Labels: pull-request-available (was: ) > Balancer shouldn't rely on "DFS Space Used %" as that ignores non-DFS used > space > > > Key: HDFS-3570 > URL: https://issues.apache.org/jira/browse/HDFS-3570 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover >Affects Versions: 2.0.0-alpha >Reporter: Harsh J >Assignee: Ashutosh Gupta >Priority: Minor > Labels: pull-request-available > Attachments: HDFS-3570.003.patch, HDFS-3570.2.patch, > HDFS-3570.aash.1.patch > > > Report from a user here: > https://groups.google.com/a/cloudera.org/d/msg/cdh-user/pIhNyDVxdVY/b7ENZmEvBjIJ, > post archived at http://pastebin.com/eVFkk0A0 > This user had a specific DN that had a large non-DFS usage among > dfs.data.dirs, and very little DFS usage (which is computed against total > possible capacity). > Balancer apparently only looks at the usage, and ignores to consider that > non-DFS usage may also be high on a DN/cluster. Hence, it thinks that if a > DFS Usage report from DN is 8% only, its got a lot of free space to write > more blocks, when that isn't true as shown by the case of this user. It went > on scheduling writes to the DN to balance it out, but the DN simply can't > accept any more blocks as a result of its disks' state. > I think it would be better if we _computed_ the actual utilization based on > {{(100-(actual remaining space))/(capacity)}}, as opposed to the current > {{(dfs used)/(capacity)}}. Thoughts? > This isn't very critical, however, cause it is very rare to see DN space > being used for non DN data, but it does expose a valid bug. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
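The numeric effect the reporter describes is easy to reproduce. A sketch with made-up numbers (hypothetical helper names, not the Balancer's actual code) comparing the current formula with the proposed one:

```java
// Illustrative comparison of the two utilization formulas discussed in
// HDFS-3570. Numbers are invented; this is not the Balancer's implementation.
class UtilizationSketch {
    // Current approach: (dfs used) / (capacity).
    static double utilizationByDfsUsed(long dfsUsed, long capacity) {
        return 100.0 * dfsUsed / capacity;
    }

    // Proposed approach: derive utilization from actually remaining space.
    static double utilizationByRemaining(long remaining, long capacity) {
        return 100.0 * (capacity - remaining) / capacity;
    }

    public static void main(String[] args) {
        long capacity = 100L;    // total disk capacity (arbitrary units)
        long dfsUsed = 8L;       // blocks stored by the DataNode
        long nonDfsUsed = 80L;   // other data sharing the same disks
        long remaining = capacity - dfsUsed - nonDfsUsed; // only 12 free

        // Looks nearly empty (8%) even though the disk is 88% full,
        // so the Balancer keeps scheduling writes the DN cannot accept.
        System.out.println(utilizationByDfsUsed(dfsUsed, capacity));
        System.out.println(utilizationByRemaining(remaining, capacity));
    }
}
```

The gap between the two results (8% vs 88% here) is exactly the non-DFS usage the current formula ignores.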
[jira] [Commented] (HDFS-3570) Balancer shouldn't rely on "DFS Space Used %" as that ignores non-DFS used space
[ https://issues.apache.org/jira/browse/HDFS-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620309#comment-17620309 ] ASF GitHub Bot commented on HDFS-3570: -- ashutoshcipher opened a new pull request, #5044: URL: https://github.com/apache/hadoop/pull/5044 ### Description of PR **Balancer shouldn't rely on "DFS Space Used %" as that ignores non-DFS used space** Report from a user here: https://groups.google.com/a/cloudera.org/d/msg/cdh-user/pIhNyDVxdVY/b7ENZmEvBjIJ (Not available now) , post archived at http://pastebin.com/eVFkk0A0 This user had a specific DN that had a large non-DFS usage among dfs.data.dirs, and very little DFS usage (which is computed against total possible capacity). Balancer apparently only looks at the usage, and ignores to consider that non-DFS usage may also be high on a DN/cluster. Hence, it thinks that if a DFS Usage report from DN is 8% only, its got a lot of free space to write more blocks, when that isn't true as shown by the case of this user. It went on scheduling writes to the DN to balance it out, but the DN simply can't accept any more blocks as a result of its disks' state. It would be better if we computed the actual utilization based on (100-(actual remaining space))/(capacity), as opposed to the current (dfs used)/(capacity). Thoughts? This isn't very critical, however, cause it is very rare to see DN space being used for non DN data, but it does expose a valid bug. ### How was this patch tested? UT ### For code changes: - [X] Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')? - [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation? - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?
[jira] [Commented] (HDFS-3570) Balancer shouldn't rely on "DFS Space Used %" as that ignores non-DFS used space
[ https://issues.apache.org/jira/browse/HDFS-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620302#comment-17620302 ] Ashutosh Gupta commented on HDFS-3570: -- I have gone through the discussion. Taking it for fix.
[jira] [Commented] (HDFS-16807) Improve legacy ClientProtocol#rename2() interface
[ https://issues.apache.org/jira/browse/HDFS-16807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620206#comment-17620206 ] JiangHua Zhu commented on HDFS-16807: - Can you guys post some suggestions? [~weichiu] [~aajisaka] [~hexiaoqiao] [~steve_l] [~ayushtkn]. Any suggestion is fine. > Improve legacy ClientProtocol#rename2() interface > - > > Key: HDFS-16807 > URL: https://issues.apache.org/jira/browse/HDFS-16807 > Project: Hadoop HDFS > Issue Type: Improvement > Components: dfsclient >Affects Versions: 3.3.3 >Reporter: JiangHua Zhu >Priority: Major > > In HDFS-2298, rename2() replaced rename(), which is a very meaningful > improvement. It looks like some old customs are still preserved, they are: > 1. When using the shell to execute the mv command, rename() is still used. > ./bin/hdfs dfs -mv [source] [target] > {code:java} > In MoveCommands#Rename: > protected void processPath(PathData src, PathData target) throws > IOException { > .. > if (!target.fs.rename(src.path, target.path)) { > // we have no way to know the actual error... > throw new PathIOException(src.toString()); > } > } > {code} > 2. When NNThroughputBenchmark verifies the rename. > In NNThroughputBenchmark#RenameFileStats: > {code:java} > long executeOp(int daemonId, int inputIdx, String ignore) > throws IOException { > long start = Time.now(); > clientProto.rename(fileNames[daemonId][inputIdx], > destNames[daemonId][inputIdx]); > long end = Time.now(); > return end-start; > } > {code} > I think the interface should be kept uniform since rename() is deprecated. > For NNThroughputBenchmark, it's easy. But it is not easy to improve > MoveCommands, because it involves the transformation of FileSystem. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
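The core difference between the two interfaces can be illustrated with a self-contained toy (a hypothetical MiniFs, not Hadoop's ClientProtocol): a boolean-returning rename() discards the failure reason, which is exactly the "we have no way to know the actual error" problem quoted above, while a rename2-style void method that throws exceptions surfaces it.

```java
// Toy filesystem contrasting rename() and rename2() error reporting.
// Hypothetical mini-API for illustration; not Hadoop's actual interfaces.
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

class MiniFs {
    private final Set<String> paths = new HashSet<>();

    void create(String p) { paths.add(p); }

    // Old style: the caller only learns "it failed", not why.
    boolean rename(String src, String dst) {
        if (!paths.contains(src) || paths.contains(dst)) {
            return false; // reason is lost
        }
        paths.remove(src);
        paths.add(dst);
        return true;
    }

    // New style: failures carry a reason, in the spirit of rename2().
    void rename2(String src, String dst) throws IOException {
        if (!paths.contains(src)) {
            throw new FileNotFoundException("source does not exist: " + src);
        }
        if (paths.contains(dst)) {
            throw new IOException("destination already exists: " + dst);
        }
        paths.remove(src);
        paths.add(dst);
    }
}
```

This is why moving the remaining rename() call sites (shell mv, NNThroughputBenchmark) over to rename2() gives users an actual error instead of a bare PathIOException.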
[jira] [Created] (HDFS-16807) Improve legacy ClientProtocol#rename2() interface
JiangHua Zhu created HDFS-16807: --- Summary: Improve legacy ClientProtocol#rename2() interface Key: HDFS-16807 URL: https://issues.apache.org/jira/browse/HDFS-16807 Project: Hadoop HDFS Issue Type: Improvement Components: dfsclient Affects Versions: 3.3.3 Reporter: JiangHua Zhu In HDFS-2298, rename2() replaced rename(), which is a very meaningful improvement. It looks like some old customs are still preserved, they are: 1. When using the shell to execute the mv command, rename() is still used. ./bin/hdfs dfs -mv [source] [target] {code:java} In MoveCommands#Rename: protected void processPath(PathData src, PathData target) throws IOException { .. if (!target.fs.rename(src.path, target.path)) { // we have no way to know the actual error... throw new PathIOException(src.toString()); } } {code} 2. When NNThroughputBenchmark verifies the rename. In NNThroughputBenchmark#RenameFileStats: {code:java} long executeOp(int daemonId, int inputIdx, String ignore) throws IOException { long start = Time.now(); clientProto.rename(fileNames[daemonId][inputIdx], destNames[daemonId][inputIdx]); long end = Time.now(); return end-start; } {code} I think the interface should be kept uniform since rename() is deprecated. For NNThroughputBenchmark, it's easy. But it is not easy to improve MoveCommands, because it involves the transformation of FileSystem. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16806) ec data balancer block blk_id The index error ,Data cannot be moved
[ https://issues.apache.org/jira/browse/HDFS-16806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620098#comment-17620098 ] Takanobu Asanuma commented on HDFS-16806: - Thanks for reporting the issue, [~ruilaing]. * You need to apply HDFS-16333 to the balancer client, and you don't need to apply it to NameNode. However, I'm not sure whether HDFS-16333 fixes this problem. * I think the priority of Blocker is too much for now. Changed the priority to Critical. > ec data balancer block blk_id The index error ,Data cannot be moved > --- > > Key: HDFS-16806 > URL: https://issues.apache.org/jira/browse/HDFS-16806 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.1.0 >Reporter: ruiliang >Priority: Critical > > ec data balancer block blk_id The index error ,Data cannot be moved > dn->10.12.15.149 use disk 100% > > {code:java} > echo 10.12.15.149>sorucehost > balancer -fs hdfs://xxcluster06 -threshold 10 -source -f sorucehost > 2>>~/balancer.log & {code} > > datanode logs > A lot of this log output > {code:java} > datanode logs > ... 
> 2022-10-19 14:43:02,031 ERROR datanode.DataNode (DataXceiver.java:run(321)) - > fs-hiido-dn-12-15-149.xx.com:1019:DataXceiver error processing COPY_BLOCK > operation src: /10.12.65.216:58214 dst: /10.12.15.149:1019 > org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException: Replica not > found for > BP-1822992414-10.12.65.48-1660893388633:blk_-9223372036799576592_4218617 > at > org.apache.hadoop.hdfs.server.datanode.BlockSender.getReplica(BlockSender.java:492) > at > org.apache.hadoop.hdfs.server.datanode.BlockSender.(BlockSender.java:256) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.copyBlock(DataXceiver.java:1089) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opCopyBlock(Receiver.java:291) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:113) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290) > at java.lang.Thread.run(Thread.java:748) > ... > > hdfs fsck -fs hdfs://xxcluster06 -blockId blk_-9223372036799576592 > Connecting to namenode via > http://fs-hiido-xxcluster06-yynn2.xx.com:50070/fsck?ugi=hdfs&blockId=blk_-9223372036799576592+&path=%2F > FSCK started by hdfs (auth:KERBEROS_SSL) from /10.12.19.4 at Wed Oct 19 > 14:47:15 CST 2022Block Id: blk_-9223372036799576592 > Block belongs to: > /hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz > No. of Expected Replica: 5 > No. of live Replica: 5 > No. of excess Replica: 0 > No. of stale Replica: 5 > No. of decommissioned Replica: 0 > No. of decommissioning Replica: 0 > No. 
of corrupted Replica: 0 > Block replica on datanode/rack: fs-hiido-dn-12-66-4.xx.com/4F08-01-09 is > HEALTHY > Block replica on datanode/rack: fs-hiido-dn-12-65-244.xx.com/4F08-01-08 is > HEALTHY > Block replica on datanode/rack: fs-hiido-dn-12-15-149.xx.com/4F08-05-13 is > HEALTHY > Block replica on datanode/rack: fs-hiido-dn-12-65-218.xx.com/4F08-12-04 is > HEALTHY > Block replica on datanode/rack: fs-hiido-dn-12-17-35.xx.com/4F08-03-03 is > HEALTHY > hdfs fsck -fs hdfs://xxcluster06 > /hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz > -files -blocks -locations > Connecting to namenode via > http://xx.com:50070/fsck?ugi=hdfs&files=1&blocks=1&locations=1&path=%2Fhive_warehouse%2Fwarehouse_old_snapshots%2Fyy_mbsdkevent_original%2Fdt%3D20210505%2Fpost_202105052129_33.log.gz > FSCK started by hdfs (auth:KERBEROS_SSL) from /10.12.19.4 for path > /hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz > at Wed Oct 19 14:48:42 CST 2022 > /hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz > 500582412 bytes, erasure-coded: policy=RS-3-2-1024k, 1 block(s): OK > 0. BP-1822992414-10.12.65.48-1660893388633:blk_-9223372036799576592_4218617 > len=500582412 Live_repl=5 > [blk_-9223372036799576592:DatanodeInfoWithStorage[10.12.17.35:1019,DS-3ccebf8d-5f05-45b5-ac7f-96d1cfb48608,DISK], > > blk_-9223372036799576591:DatanodeInfoWithStorage[10.12.65.218:1019,DS-4f8e3114-7566-4cf1-ad5a-e454c8ea8805,DISK], > > blk_-9223372036799576590:DatanodeInfoWithStorage[10.12.15.149:1019,DS-1dd55c27-8f47-46a6-935b-1d9024ca9188,DISK], > > blk_-9223372036799576589:DatanodeInfoWithStorage[10.12.65.244:1019,DS-a9ffd747-c427-4aaa-8559-04cded7d9d5f,DISK], > > blk_-9223372036799576588:DatanodeInfoWithStorage[10.12.66.4:1019,DS-d88f94db-6db1-4753-a652-780d7cd7f081,DISK]] > Status: HEALTHY > Number of data-nodes:
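The fsck output above lists internal EC blocks blk_-9223372036799576592 through blk_-9223372036799576588 for one block group: striped block group IDs are negative, and each internal block's ID is the group ID plus its index in the group, with the low bits of the ID identifying which internal block a DataNode should hold. A sketch of that arithmetic (illustrative only; the real logic is believed to live in HDFS's BlockIdManager):

```java
// Sketch of striped (EC) block ID arithmetic as seen in the fsck output.
// Assumes the low 4 bits of a block group ID are reserved for the internal
// block index; hypothetical helper, not Hadoop's BlockIdManager itself.
class EcBlockIdSketch {
    static final long INDEX_MASK = 0xF; // low 4 bits carry the index

    static long internalBlockId(long groupId, int index) {
        return groupId + index;
    }

    static int blockIndex(long internalId) {
        return (int) (internalId & INDEX_MASK);
    }

    public static void main(String[] args) {
        long groupId = -9223372036799576592L; // block group from the report
        // RS-3-2 has 5 internal blocks: 3 data + 2 parity.
        for (int i = 0; i < 5; i++) {
            System.out.println("index " + i + " -> blk_"
                    + internalBlockId(groupId, i));
        }
    }
}
```

Under this scheme, a balancer or mover that asks a DataNode for the group ID instead of the correct internal ID gets a ReplicaNotFoundException like the one in the DataNode log above, since the DN only holds its own indexed internal block.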
[jira] [Updated] (HDFS-16806) ec data balancer block blk_id The index error ,Data cannot be moved
[ https://issues.apache.org/jira/browse/HDFS-16806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takanobu Asanuma updated HDFS-16806: Priority: Critical (was: Blocker)
[jira] [Commented] (HDFS-16806) ec data balancer block blk_id The index error ,Data cannot be moved
[ https://issues.apache.org/jira/browse/HDFS-16806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620065#comment-17620065 ] ruiliang commented on HDFS-16806: - https://issues.apache.org/jira/browse/HDFS-16333 Is that the question? All I have to do is join the balancer client, right? Or pull it to the namenode server
[jira] [Updated] (HDFS-16806) ec data balancer block blk_id The index error ,Data cannot be moved
[ https://issues.apache.org/jira/browse/HDFS-16806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ruiliang updated HDFS-16806: Description: ec data balancer block blk_id The index error ,Data cannot be moved dn->10.12.15.149 use disk 100% {code:java} echo 10.12.15.149>sorucehost balancer -fs hdfs://xxcluster06 -threshold 10 -source -f sorucehost 2>>~/balancer.log & {code} datanode logs A lot of this log output {code:java} datanode logs ... 2022-10-19 14:43:02,031 ERROR datanode.DataNode (DataXceiver.java:run(321)) - fs-hiido-dn-12-15-149.xx.com:1019:DataXceiver error processing COPY_BLOCK operation src: /10.12.65.216:58214 dst: /10.12.15.149:1019 org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException: Replica not found for BP-1822992414-10.12.65.48-1660893388633:blk_-9223372036799576592_4218617 at org.apache.hadoop.hdfs.server.datanode.BlockSender.getReplica(BlockSender.java:492) at org.apache.hadoop.hdfs.server.datanode.BlockSender.(BlockSender.java:256) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.copyBlock(DataXceiver.java:1089) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opCopyBlock(Receiver.java:291) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:113) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290) at java.lang.Thread.run(Thread.java:748) ... hdfs fsck -fs hdfs://xxcluster06 -blockId blk_-9223372036799576592 Connecting to namenode via http://fs-hiido-xxcluster06-yynn2.xx.com:50070/fsck?ugi=hdfs&blockId=blk_-9223372036799576592+&path=%2F FSCK started by hdfs (auth:KERBEROS_SSL) from /10.12.19.4 at Wed Oct 19 14:47:15 CST 2022Block Id: blk_-9223372036799576592 Block belongs to: /hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz No. of Expected Replica: 5 No. of live Replica: 5 No. of excess Replica: 0 No. of stale Replica: 5 No. of decommissioned Replica: 0 No. of decommissioning Replica: 0 No. 
of corrupted Replica: 0 Block replica on datanode/rack: fs-hiido-dn-12-66-4.xx.com/4F08-01-09 is HEALTHY Block replica on datanode/rack: fs-hiido-dn-12-65-244.xx.com/4F08-01-08 is HEALTHY Block replica on datanode/rack: fs-hiido-dn-12-15-149.xx.com/4F08-05-13 is HEALTHY Block replica on datanode/rack: fs-hiido-dn-12-65-218.xx.com/4F08-12-04 is HEALTHY Block replica on datanode/rack: fs-hiido-dn-12-17-35.xx.com/4F08-03-03 is HEALTHY hdfs fsck -fs hdfs://xxcluster06 /hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz -files -blocks -locations Connecting to namenode via http://xx.com:50070/fsck?ugi=hdfs&files=1&blocks=1&locations=1&path=%2Fhive_warehouse%2Fwarehouse_old_snapshots%2Fyy_mbsdkevent_original%2Fdt%3D20210505%2Fpost_202105052129_33.log.gz FSCK started by hdfs (auth:KERBEROS_SSL) from /10.12.19.4 for path /hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz at Wed Oct 19 14:48:42 CST 2022 /hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz 500582412 bytes, erasure-coded: policy=RS-3-2-1024k, 1 block(s): OK 0. 
BP-1822992414-10.12.65.48-1660893388633:blk_-9223372036799576592_4218617 len=500582412 Live_repl=5 [blk_-9223372036799576592:DatanodeInfoWithStorage[10.12.17.35:1019,DS-3ccebf8d-5f05-45b5-ac7f-96d1cfb48608,DISK], blk_-9223372036799576591:DatanodeInfoWithStorage[10.12.65.218:1019,DS-4f8e3114-7566-4cf1-ad5a-e454c8ea8805,DISK], blk_-9223372036799576590:DatanodeInfoWithStorage[10.12.15.149:1019,DS-1dd55c27-8f47-46a6-935b-1d9024ca9188,DISK], blk_-9223372036799576589:DatanodeInfoWithStorage[10.12.65.244:1019,DS-a9ffd747-c427-4aaa-8559-04cded7d9d5f,DISK], blk_-9223372036799576588:DatanodeInfoWithStorage[10.12.66.4:1019,DS-d88f94db-6db1-4753-a652-780d7cd7f081,DISK]] Status: HEALTHY Number of data-nodes: 62 Number of racks: 19 Total dirs: 0 Total symlinks: 0Replicated Blocks: Total size: 0 B Total files: 0 Total blocks (validated): 0 Minimally replicated blocks: 0 Over-replicated blocks: 0 Under-replicated blocks: 0 Mis-replicated blocks: 0 Default replication factor: 3 Average block replication: 0.0 Missing blocks: 0 Corrupt blocks: 0 Missing replicas: 0Erasure Coded Block Groups: Total size: 500582412 B Total files: 1 Total block groups (validated): 1 (avg. block group size 500582412 B) Minimally erasure-coded block groups: 1 (100.0 %) Over-erasure-coded block groups: 0 (0.0 %) Under-erasure-coded block groups: 0 (0.0 %) Unsatisfactory placement block groups: 0 (0.0 %) Average block group size: 5.0 Missing block groups: 0 Corrupt block gr
[jira] [Created] (HDFS-16806) EC data balancer: block blk_id index error, data cannot be moved
ruiliang created HDFS-16806:
-------------------------------

             Summary: EC data balancer: block blk_id index error, data cannot be moved
                 Key: HDFS-16806
                 URL: https://issues.apache.org/jira/browse/HDFS-16806
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs
    Affects Versions: 3.1.0
            Reporter: ruiliang

When balancing EC data, the balancer requests the wrong blk_id (the block group ID rather than the internal block's ID), so the data cannot be moved.

DataNode 10.12.15.149 is at 100% disk usage, so it was used as the balancer source:
{code:java}
echo 10.12.15.149 > sorucehost
balancer -fs hdfs://xxcluster06 -threshold 10 -source -f sorucehost 2>>~/balancer.log &
{code}

datanode logs:
{code:java}
...
2022-10-19 14:43:02,031 ERROR datanode.DataNode (DataXceiver.java:run(321)) - fs-hiido-dn-12-15-149.xx.com:1019:DataXceiver error processing COPY_BLOCK operation  src: /10.12.65.216:58214 dst: /10.12.15.149:1019
org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException: Replica not found for BP-1822992414-10.12.65.48-1660893388633:blk_-9223372036799576592_4218617
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.getReplica(BlockSender.java:492)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:256)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.copyBlock(DataXceiver.java:1089)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opCopyBlock(Receiver.java:291)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:113)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
        at java.lang.Thread.run(Thread.java:748)
...
{code}

{code:java}
hdfs fsck -fs hdfs://xxcluster06 -blockId blk_-9223372036799576592
Connecting to namenode via http://fs-hiido-xxcluster06-yynn2.xx.com:50070/fsck?ugi=hdfs&blockId=blk_-9223372036799576592+&path=%2F
FSCK started by hdfs (auth:KERBEROS_SSL) from /10.12.19.4 at Wed Oct 19 14:47:15 CST 2022

Block Id: blk_-9223372036799576592
Block belongs to: /hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz
No. of Expected Replica: 5
No. of live Replica: 5
No. of excess Replica: 0
No. of stale Replica: 5
No. of decommissioned Replica: 0
No. of decommissioning Replica: 0
No. of corrupted Replica: 0
Block replica on datanode/rack: fs-hiido-dn-12-66-4.xx.com/4F08-01-09 is HEALTHY
Block replica on datanode/rack: fs-hiido-dn-12-65-244.xx.com/4F08-01-08 is HEALTHY
Block replica on datanode/rack: fs-hiido-dn-12-15-149.xx.com/4F08-05-13 is HEALTHY
Block replica on datanode/rack: fs-hiido-dn-12-65-218.xx.com/4F08-12-04 is HEALTHY
Block replica on datanode/rack: fs-hiido-dn-12-17-35.xx.com/4F08-03-03 is HEALTHY
{code}

{code:java}
hdfs fsck -fs hdfs://xxcluster06 /hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz -files -blocks -locations
Connecting to namenode via http://xx.com:50070/fsck?ugi=hdfs&files=1&blocks=1&locations=1&path=%2Fhive_warehouse%2Fwarehouse_old_snapshots%2Fyy_mbsdkevent_original%2Fdt%3D20210505%2Fpost_202105052129_33.log.gz
FSCK started by hdfs (auth:KERBEROS_SSL) from /10.12.19.4 for path /hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz at Wed Oct 19 14:48:42 CST 2022

/hive_warehouse/warehouse_old_snapshots/yy_mbsdkevent_original/dt=20210505/post_202105052129_33.log.gz 500582412 bytes, erasure-coded: policy=RS-3-2-1024k, 1 block(s): OK
0. BP-1822992414-10.12.65.48-1660893388633:blk_-9223372036799576592_4218617 len=500582412 Live_repl=5
   [blk_-9223372036799576592:DatanodeInfoWithStorage[10.12.17.35:1019,DS-3ccebf8d-5f05-45b5-ac7f-96d1cfb48608,DISK],
    blk_-9223372036799576591:DatanodeInfoWithStorage[10.12.65.218:1019,DS-4f8e3114-7566-4cf1-ad5a-e454c8ea8805,DISK],
    blk_-9223372036799576590:DatanodeInfoWithStorage[10.12.15.149:1019,DS-1dd55c27-8f47-46a6-935b-1d9024ca9188,DISK],
    blk_-9223372036799576589:DatanodeInfoWithStorage[10.12.65.244:1019,DS-a9ffd747-c427-4aaa-8559-04cded7d9d5f,DISK],
    blk_-9223372036799576588:DatanodeInfoWithStorage[10.12.66.4:1019,DS-d88f94db-6db1-4753-a652-780d7cd7f081,DISK]]

Status: HEALTHY
 Number of data-nodes:  62
 Number of racks:       19
 Total dirs:            0
 Total symlinks:        0

Replicated Blocks:
 Total size:    0 B
 Total files:   0
 Total blocks (validated):      0
 Minimally replicated blocks:   0
 Over-replicated blocks:        0
 Under-replicated blocks:       0
 Mis-replicated blocks:         0
 Default replication factor:    3
 Average block replication:     0.0
 Missing blocks:                0
 Corrupt blocks:                0
 Missing replicas:              0

Erasure Coded Block Groups:
 Total size:    500582412 B
 Total files:   1
 Total block groups (validated):        1 (avg. block group size 500582412 B)
 Minimally erasure-coded block groups:  1 (100.0 %)
 Over-erasure-coded block groups:       0 (0.0 %)
 Under-erasure-coded block groups:      0 (0.0 %)
 Unsatisfactory placement block groups: 0 (0.0 %)
 Average block group size: 5.0
 Missing block groups: 0
 ...
{code}
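For reference, the internal block IDs listed by fsck follow the HDFS striped-block ID layout: a block group ID reserves its low bits for the index within the group, so the i-th internal block has ID groupId + i. A minimal sketch of that arithmetic, assuming this layout (the class and method names are illustrative, not Hadoop APIs):

```java
// Illustrative sketch, not a Hadoop API: in HDFS erasure coding the i-th
// internal block of a striped block group has ID = groupId + i.
public class EcBlockIds {
    /** ID of the i-th internal block of a striped block group. */
    static long internalBlockId(long groupId, int indexInGroup) {
        return groupId + indexInGroup;
    }

    public static void main(String[] args) {
        // Block group from the fsck output above; RS-3-2 stores 3 data + 2
        // parity internal blocks, indices 0..4.
        long groupId = -9223372036799576592L;
        for (int i = 0; i < 5; i++) {
            System.out.println("blk_" + internalBlockId(groupId, i));
        }
    }
}
```

This matches the fsck listing: the group blk_-9223372036799576592 expands to internal blocks blk_...592 through blk_...588. The DataXceiver error above shows DN 10.12.15.149 being asked to copy the group ID (blk_...592) while, per fsck, it actually stores internal block blk_...590 (index 2), hence the ReplicaNotFoundException.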