[jira] [Commented] (HDFS-16756) RBF proxies the client's user by the login user to enable CacheEntry

2022-09-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600567#comment-17600567
 ] 

ASF GitHub Bot commented on HDFS-16756:
---

hadoop-yetus commented on PR #4853:
URL: https://github.com/apache/hadoop/pull/4853#issuecomment-1237627230

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   | :---: | ---: | :--- | :---: | :---: |
   | +0 :ok: |  reexec  |   0m 53s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  42m 19s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 51s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |   0m 46s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   0m 38s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 51s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m  4s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 39s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 23s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 37s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 41s |  |  the patch passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |   0m 41s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 35s |  |  the patch passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 20s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 38s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 37s |  |  the patch passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  |  the patch passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 39s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  32m 57s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 43s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 138m 47s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4853/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4853 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 0390337bfa3e 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 
01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 9b8d1f332d125e30434b0714762ea92ba26d6fdc |
   | Default Java | Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4853/2/testReport/ |
   | Max. process+thread count | 2360 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4853/2/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> RBF proxies the client's user 

[jira] [Commented] (HDFS-2139) Fast copy for HDFS.

2022-09-05 Thread Hui Fei (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600565#comment-17600565
 ] 

Hui Fei commented on HDFS-2139:
---

[~xuzq_zander] A feature branch HDFS-2139 has been cut from trunk; you can start 
your work, thanks!

> Fast copy for HDFS.
> ---
>
> Key: HDFS-2139
> URL: https://issues.apache.org/jira/browse/HDFS-2139
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Pritam Damania
>Assignee: Rituraj
>Priority: Major
> Attachments: HDFS-2139-For-2.7.1.patch, HDFS-2139.patch, 
> HDFS-2139.patch, image-2022-08-11-11-48-17-994.png
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> There is a need to perform fast file copy on HDFS. The fast copy mechanism 
> for a file works as follows:
> 1) Query metadata for all blocks of the source file.
> 2) For each block 'b' of the file, find out its datanode locations.
> 3) For each block of the file, add an empty block to the namesystem for
> the destination file.
> 4) For each location of the block, instruct the datanode to make a local
> copy of that block.
> 5) Once each datanode has copied over its respective blocks, they
> report to the namenode about it.
> 6) Wait for all blocks to be copied and exit.
> This would speed up the copying process considerably by removing top-of-rack
> data transfers.
> Note: An extra improvement would be to instruct the datanode to create a
> hardlink of the block file if we are copying a block on the same datanode.
> [~xuzq_zander] provided a design doc: 
> https://docs.google.com/document/d/1OHdUpQmKD3TZ3xdmQsXNmlXJetn2QFPinMH31Q4BqkI/edit?usp=sharing
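
The quoted flow can be sketched in code. The sketch below is illustrative only: 
LocatedBlock and DatanodeInfo are real HDFS types, but every helper method is a 
hypothetical placeholder rather than an existing HDFS API.

{code:java}
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.hdfs.protocol.DatanodeInfo;
import org.apache.hadoop.hdfs.protocol.LocatedBlock;

// Illustrative sketch of the fast-copy flow; every abstract helper below is a
// hypothetical placeholder, not an existing HDFS API.
abstract class FastCopySketch {
  abstract List<LocatedBlock> getBlockLocations(String src) throws IOException;
  abstract LocatedBlock addEmptyBlock(String dst) throws IOException;
  abstract void copyBlockLocally(DatanodeInfo dn, LocatedBlock srcBlk,
      LocatedBlock dstBlk) throws IOException;
  abstract void waitForAllBlocksReported(String dst, int expected)
      throws IOException;

  void fastCopy(String src, String dst) throws IOException {
    // 1) Query metadata for all blocks of the source file.
    List<LocatedBlock> blocks = getBlockLocations(src);
    for (LocatedBlock blk : blocks) {
      // 3) Add an empty block to the namesystem for the destination file.
      LocatedBlock dstBlk = addEmptyBlock(dst);
      // 2) + 4) For each location, instruct that datanode to make a local copy
      // (or a hardlink when source and destination share the same datanode).
      for (DatanodeInfo dn : blk.getLocations()) {
        copyBlockLocally(dn, blk, dstBlk);
      }
    }
    // 5) + 6) Wait until every datanode has reported its copied blocks.
    waitForAllBlocksReported(dst, blocks.size());
  }
}
{code}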



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16737) Fix number of threads in FsDatasetAsyncDiskService#addExecutorForVolume

2022-09-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600562#comment-17600562
 ] 

ASF GitHub Bot commented on HDFS-16737:
---

ZanderXu commented on PR #4784:
URL: https://github.com/apache/hadoop/pull/4784#issuecomment-1237614839

   @ayushtkn @jianghuazhu Sir, I added one UT to test the active count and 
core pool size of the executor. Can you help me review it again?
   I found this issue because we encountered many stacked pending deletions in 
our prod environment, which caused ReplicationNotFoundException. So this PR is 
used to increase the active thread count of the executor, and we will raise 
another PR to resolve ReplicationNotFoundException.
   
   @ferhui @haiyang1987 Masters, can you help me review this PR?
   




> Fix number of threads in FsDatasetAsyncDiskService#addExecutorForVolume
> ---
>
> Key: HDFS-16737
> URL: https://issues.apache.org/jira/browse/HDFS-16737
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> The number of threads in FsDatasetAsyncDiskService#addExecutorForVolume is 
> elastic right now; make it fixed.
> Presently corePoolSize is set to 1 and maximumPoolSize is set to 
> maxNumThreadsPerVolume, but since the queue size is Integer.MAX_VALUE, the 
> queue never tends to get full and the thread count stays confined to 1, 
> irrespective of maxNumThreadsPerVolume.
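
The behavior described above is standard ThreadPoolExecutor semantics: with an 
effectively unbounded queue, the pool never grows past corePoolSize. A minimal 
sketch of that effect and of a genuinely fixed pool (illustrative only, not the 
actual FsDatasetAsyncDiskService code; maxNumThreadsPerVolume stands in for the 
real configuration value):

{code:java}
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class VolumeExecutorSketch {
  public static void main(String[] args) {
    int maxNumThreadsPerVolume = 4;  // placeholder for the configured value

    // Elastic pool as described in the issue: the unbounded queue means extra
    // threads are never created, so this behaves like a single worker thread.
    ThreadPoolExecutor elastic = new ThreadPoolExecutor(
        1, maxNumThreadsPerVolume,
        60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());

    // Fixed pool: corePoolSize == maximumPoolSize, so the configured number of
    // worker threads can actually be used.
    ThreadPoolExecutor fixed = new ThreadPoolExecutor(
        maxNumThreadsPerVolume, maxNumThreadsPerVolume,
        60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
    fixed.allowCoreThreadTimeOut(true);  // optionally let idle core threads exit

    elastic.shutdown();
    fixed.shutdown();
  }
}
{code}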



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16703) Enable RPC Timeout for some protocols of NameNode.

2022-09-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600560#comment-17600560
 ] 

ASF GitHub Bot commented on HDFS-16703:
---

ZanderXu commented on PR #4660:
URL: https://github.com/apache/hadoop/pull/4660#issuecomment-1237612512

   @slfan1989 Sir, sorry to ping you again. I still think this is a useful PR 
for Hadoop Admin. So I hope you can give me some more suggestions and push this 
PR forward. Thanks again.




> Enable RPC Timeout for some protocols of NameNode.
> --
>
> Key: HDFS-16703
> URL: https://issues.apache.org/jira/browse/HDFS-16703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> While reading the protocol-related code, I found that only the 
> ClientNamenodeProtocolPB proxy is created with an RPC timeout; other protocolPB 
> proxies are created without one, such as RefreshAuthorizationPolicyProtocolPB, 
> RefreshUserMappingsProtocolPB, RefreshCallQueueProtocolPB, 
> GetUserMappingsProtocolPB and NamenodeProtocolPB.
>  
> If a proxy has no RPC timeout, calls can block for a long time when the NN 
> machine crashes or the network degrades while reading from or writing to the NN. 
>  
> So I feel that we should enable an RPC timeout for all ProtocolPB proxies.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16697) Randomly setting “dfs.namenode.resource.checked.volumes.minimum” will always prevent safe mode from being turned off

2022-09-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600517#comment-17600517
 ] 

ASF GitHub Bot commented on HDFS-16697:
---

hadoop-yetus commented on PR #4849:
URL: https://github.com/apache/hadoop/pull/4849#issuecomment-1237421411

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   | :---: | ---: | :--- | :---: | :---: |
   | +0 :ok: |  reexec  |   0m 46s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  39m 34s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 34s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |   1m 36s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 20s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 43s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 15s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 41s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 34s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 57s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 27s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 28s |  |  the patch passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |   1m 28s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  the patch passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   1m 23s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   1m  0s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4849/2/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 2 unchanged - 
0 fixed = 5 total (was 2)  |
   | +1 :green_heart: |  mvnsite  |   1m 29s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 58s |  |  the patch passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  the patch passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 27s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m  8s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 238m 59s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   1m  4s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 349m 54s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4849/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4849 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 16b29e1f2be6 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 
01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / a59baba5cc09202aa35a4df296164c16adf5d534 |
   | Default Java | Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4849/2/testReport/ |
   | Max. process+thread count | 3084 (vs. ulimit of 5500) |
   | modules | C: 

[jira] [Commented] (HDFS-16756) RBF proxies the client's user by the login user to enable CacheEntry

2022-09-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600495#comment-17600495
 ] 

ASF GitHub Bot commented on HDFS-16756:
---

hadoop-yetus commented on PR #4853:
URL: https://github.com/apache/hadoop/pull/4853#issuecomment-1237303695

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   | :---: | ---: | :--- | :---: | :---: |
   | +0 :ok: |  reexec  |   0m 54s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  40m 57s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 52s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |   0m 45s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   0m 37s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 49s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 58s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m  2s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 38s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m  2s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 38s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 42s |  |  the patch passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |   0m 42s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 34s |  |  the patch passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   0m 34s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 21s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 38s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 36s |  |  the patch passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  |  the patch passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  26m  7s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 130m 58s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4853/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 42s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 238m 14s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.federation.fairness.TestRouterRefreshFairnessPolicyController
 |
   |   | hadoop.fs.contract.router.TestRouterHDFSContractCreate |
   |   | hadoop.fs.contract.router.web.TestRouterWebHDFSContractSeek |
   |   | hadoop.hdfs.server.federation.metrics.TestNameserviceRPCMetrics |
   |   | hadoop.fs.contract.router.TestRouterHDFSContractSetTimes |
   |   | hadoop.hdfs.server.federation.router.TestRouterUserMappings |
   |   | 
hadoop.hdfs.server.federation.router.TestRouterFederationRenamePermission |
   |   | hadoop.hdfs.server.federation.router.TestRouterAdminCLI |
   |   | hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |
   |   | hadoop.hdfs.server.federation.router.TestRouterMountTable |
   |   | 
hadoop.hdfs.server.federation.router.TestRouterFederationRenameInKerberosEnv |
   |   | hadoop.hdfs.server.federation.router.TestRouterClientRejectOverload |
   |   | hadoop.fs.contract.router.TestRouterHDFSContractConcatSecure |
   |   | hadoop.hdfs.server.federation.router.TestRouterWebHdfsMethods |
   |   | hadoop.hdfs.server.federation.router.TestRouterMultiRack |
   |   | hadoop.fs.contract.router.TestRouterHDFSContractRootDirectorySecure |
   |   | hadoop.hdfs.server.federation.router.TestRouterNamenodeMonitoring |
   |   | 

[jira] [Commented] (HDFS-16697) Randomly setting “dfs.namenode.resource.checked.volumes.minimum” will always prevent safe mode from being turned off

2022-09-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600445#comment-17600445
 ] 

ASF GitHub Bot commented on HDFS-16697:
---

Likkey commented on PR #4849:
URL: https://github.com/apache/hadoop/pull/4849#issuecomment-1237088816

   > @Likkey Thanks a lot for your contribution, but the checkstyle issue needs 
to be fixed.
   
@slfan1989  Thanks for your reply :), I have fixed the checkstyle issue.




> Randomly setting “dfs.namenode.resource.checked.volumes.minimum” will always 
> prevent safe mode from being turned off
> 
>
> Key: HDFS-16697
> URL: https://issues.apache.org/jira/browse/HDFS-16697
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.3
> Environment: Linux version 4.15.0-142-generic 
> (buildd@lgw01-amd64-039) (gcc version 5.4.0 20160609 (Ubuntu 
> 5.4.0-6ubuntu1~16.04.12))
> java version "1.8.0_162"
> Java(TM) SE Runtime Environment (build 1.8.0_162-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.162-b12, mixed mode)
>Reporter: Jingxuan Fu
>Assignee: Jingxuan Fu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> {code:java}
> <property>
>   <name>dfs.namenode.resource.checked.volumes.minimum</name>
>   <value>1</value>
>   <description>
>     The minimum number of redundant NameNode storage volumes required.
>   </description>
> </property>
> {code}
> I found that when the value of 
> “dfs.namenode.resource.checked.volumes.minimum” is set greater than the total 
> number of storage volumes on the NameNode, safe mode can never be turned off. 
> While in safe mode, the file system only accepts read requests and rejects 
> delete, modify and other change requests, so functionality is severely limited.
> The default value of the configuration item is 1; we set it to 2 as an example 
> for illustration. After starting HDFS, the logs and the client show the 
> following messages.
> {code:java}
> 2022-07-27 17:37:31,772 WARN 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: NameNode low on 
> available disk space. Already in safe mode.
> 2022-07-27 17:37:31,772 INFO org.apache.hadoop.hdfs.StateChange: STATE* Safe 
> mode is ON.
> Resources are low on NN. Please add or free up more resourcesthen turn off 
> safe mode manually. NOTE:  If you turn off safe mode before adding resources, 
> the NN will immediately return to safe mode. Use "hdfs dfsadmin -safemode 
> leave" to turn safe mode off.
> {code}
> {code:java}
> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create 
> directory /hdfsapi/test. Name node is in safe mode.
> Resources are low on NN. Please add or free up more resourcesthen turn off 
> safe mode manually. NOTE:  If you turn off safe mode before adding resources, 
> the NN will immediately return to safe mode. Use "hdfs dfsadmin -safemode 
> leave" to turn safe mode off. NamenodeHostName:192.168.1.167
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.newSafemodeException(FSNamesystem.java:1468)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1455)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3174)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1145)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:714)
>         at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:527)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1000)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
>         at java.base/java.security.AccessController.doPrivileged(Native 
> Method)
>         at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2916){code}
> According to the prompt, one would believe there is not enough resource 
> space to meet the conditions for leaving safe mode, but after adding or 
> freeing up more resources and lowering the resource condition 
> threshold "dfs.namenode.resource.du.reserved", it still fails to leave safe 
> mode and throws the same prompt.
> According to the source code, we 
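
The check described above boils down to counting the volumes whose free space 
exceeds the reserved threshold and comparing that count against the configured 
minimum; if the minimum exceeds the number of volumes that exist, the condition 
can never hold and safe mode never clears. A simplified illustration with 
hypothetical names (not the actual NameNode code):

{code:java}
import java.util.List;

public class VolumeCheckSketch {
  // Simplified stand-in for the NameNode resource check described above.
  static boolean hasAvailableDiskSpace(List<Long> freeSpacePerVolume,
      long duReserved, int checkedVolumesMinimum) {
    int available = 0;
    for (long free : freeSpacePerVolume) {
      if (free > duReserved) {
        available++;  // this volume still has more than the reserved space
      }
    }
    // Always false when checkedVolumesMinimum > freeSpacePerVolume.size(),
    // regardless of how much space is added or how low duReserved is set.
    return available >= checkedVolumesMinimum;
  }
}
{code}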

[jira] [Commented] (HDFS-16721) Improve the check code of the important configuration item “dfs.client.socket-timeout”.

2022-09-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600442#comment-17600442
 ] 

ASF GitHub Bot commented on HDFS-16721:
---

Likkey commented on PR #4847:
URL: https://github.com/apache/hadoop/pull/4847#issuecomment-1237070768

   > Thank you very much for your contribution! From my personal point of view, 
it is unreasonable to configure a negative number for a timeout. I don't think 
this change is necessary; the possibility of the timeout being configured as a 
negative number is very low.
   > 
   > I think there should be a sound reason before we modify the parameter 
verification.
   
   I also endorse your statement; thank you very much for the advice :)




> Improve the check code of the important configuration item 
> “dfs.client.socket-timeout”.
> ---
>
> Key: HDFS-16721
> URL: https://issues.apache.org/jira/browse/HDFS-16721
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: dfsclient
>Affects Versions: 3.1.3
> Environment: Linux version 4.15.0-142-generic 
> (buildd@lgw01-amd64-039) (gcc version 5.4.0 20160609 (Ubuntu 
> 5.4.0-6ubuntu1~16.04.12))
> java version "1.8.0_162"
> Java(TM) SE Runtime Environment (build 1.8.0_162-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.162-b12, mixed mode)
>Reporter: Jingxuan Fu
>Assignee: Jingxuan Fu
>Priority: Major
>  Labels: pull-request-available
>
> {code:java}
> 
>   dfs.client.socket-timeout
>   6
>   
>     Default timeout value in milliseconds for all sockets.
>   
> {code}
> "dfs.client.socket-timeout" as the default timeout value for all sockets is 
> applied in multiple places, it is a configuration item with significant 
> impact, but the value of this configuration item is not checked in the source 
> code and when it is set to an abnormal value just throw an overgeneralized 
> exception and cannot be corrected in time , which affects the normal use of 
> the program.
> {code:java}
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File 
> /hdfsapi/test/testhdfs.txt could only be written to 0 of the 1 minReplication 
> nodes. There are 1 datanode(s) running and 1 node(s) are excluded in this 
> operation.
>         at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2205)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2731)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:892)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:568)
>         at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:527)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1000)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
>         at java.base/java.security.AccessController.doPrivileged(Native 
> Method)
>         at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2916){code}
> So I used Preconditions.checkArgument() to refine the code that checks this 
> configuration item.
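
A minimal sketch of the kind of check proposed (not the actual patch; the 
60000 ms default shown here is an assumption for illustration):

{code:java}
import com.google.common.base.Preconditions;
import org.apache.hadoop.conf.Configuration;

public class SocketTimeoutCheckSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // The 60000 ms fallback is an assumption, not the authoritative default.
    int socketTimeout = conf.getInt("dfs.client.socket-timeout", 60_000);
    // Fail fast with a message that names the misconfigured key, instead of
    // letting an unrelated exception surface much later.
    Preconditions.checkArgument(socketTimeout >= 0,
        "dfs.client.socket-timeout must be a non-negative number of"
            + " milliseconds, but was %s", socketTimeout);
  }
}
{code}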



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16653) Safe mode related operations cannot be performed when “dfs.client.mmap.cache.size” is set to a negative number

2022-09-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600439#comment-17600439
 ] 

ASF GitHub Bot commented on HDFS-16653:
---

Likkey commented on PR #4848:
URL: https://github.com/apache/hadoop/pull/4848#issuecomment-1237061518

   > @Likkey Thank you very much for your contribution. Can we provide the 
stack before and after the modification?
   
   Thank you very much for your reply.
   Before the modification, the terminal output looked like this:
   ```
   hadoop@ljq1:~/hadoop-3.1.3-work/sbin$ hdfs dfsadmin -safemode leave 
   safemode: null
   Usage: hdfs dfsadmin [-safemode enter | leave | get | wait | forceExit]
   hadoop@ljq1:~/hadoop-3.1.3-work/sbin$ hdfs dfsadmin -safemode enter
   safemode: null
   Usage: hdfs dfsadmin [-safemode enter | leave | get | wait | forceExit]
   hadoop@ljq1:~/hadoop-3.1.3-work/sbin$ hdfs dfsadmin -safemode get
   safemode: null
   Usage: hdfs dfsadmin [-safemode enter | leave | get | wait | forceExit]
   hadoop@ljq1:~/hadoop-3.1.3-work/sbin$ hdfs dfsadmin -safemode forceExit
   safemode: null
   Usage: hdfs dfsadmin [-safemode enter | leave | get | wait | forceExit]
   ```
   And after the modification, it provides a clear message:
   ```
   hadoop@ljq1:~/hadoop-3.1.3-work/sbin$ hdfs dfsadmin -safemode leave
   safemode: Invalid argument: dfs.client.mmap.cache.size must be greater than 
zero.
   Usage: hdfs dfsadmin [-safemode enter | leave | get | wait | forceExit]
   ```




> Safe mode related operations cannot be performed when 
> “dfs.client.mmap.cache.size” is set to a negative number
> --
>
> Key: HDFS-16653
> URL: https://issues.apache.org/jira/browse/HDFS-16653
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: dfsadmin
>Affects Versions: 3.1.3
> Environment: Linux version 4.15.0-142-generic 
> (buildd@lgw01-amd64-039) (gcc version 5.4.0 20160609 (Ubuntu 
> 5.4.0-6ubuntu1~16.04.12))
>Reporter: Jingxuan Fu
>Assignee: Jingxuan Fu
>Priority: Major
>  Labels: pull-request-available
>
>  
> {code:java}
> <property>
>   <name>dfs.client.mmap.cache.size</name>
>   <value>256</value>
>   <description>
>     When zero-copy reads are used, the DFSClient keeps a cache of recently used
>     memory mapped regions.  This parameter controls the maximum number of
>     entries that we will keep in that cache.
>     The larger this number is, the more file descriptors we will potentially
>     use for memory-mapped files.  mmaped files also use virtual address space.
>     You may need to increase your ulimit virtual address space limits before
>     increasing the client mmap cache size.
> 
>     Note that you can still do zero-copy reads when this size is set to 0.
>   </description>
> </property>
> {code}
> When the configuration item “dfs.client.mmap.cache.size” is set to a negative 
> number, all operation options of hdfs dfsadmin -safemode, including enter, 
> leave, get, wait and forceExit, become invalid: the terminal just returns 
> "safemode: null" and no exception is thrown.
> {code:java}
> hadoop@ljq1:~/hadoop-3.1.3-work/etc/hadoop$ hdfs dfsadmin -safemode leave
> safemode: null
> Usage: hdfs dfsadmin [-safemode enter | leave | get | wait | forceExit] {code}
> In summary, I think we need to improve the check mechanism for this 
> configuration item: add a Preconditions check with a suitable error message for 
> maxEvictableMmapedSize (which backs "dfs.client.mmap.cache.size") and give a 
> clear indication when the configuration is abnormal, so the problem can be 
> fixed in time and the impact on safe-mode-related operations is reduced.
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16748) RBF: DFSClient should uniquely identify writing files by namespace id and iNodeId

2022-09-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600393#comment-17600393
 ] 

ASF GitHub Bot commented on HDFS-16748:
---

ZanderXu commented on PR #4813:
URL: https://github.com/apache/hadoop/pull/4813#issuecomment-1236986107

   @ayushtkn @Hexiaoqiao Masters, thank you very much for helping me to review 
this patch.




> RBF: DFSClient should uniquely identify writing files by namespace id and 
> iNodeId
> -
>
> Key: HDFS-16748
> URL: https://issues.apache.org/jira/browse/HDFS-16748
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> DFSClient should distinguish the files being written by namespaceId and 
> iNodeId, because files being written may belong to different namespaces yet 
> share the same iNodeId.
> The related code is as below:
> {code:java}
> public void putFileBeingWritten(final long inodeId,
>   final DFSOutputStream out) {
> synchronized(filesBeingWritten) {
>   filesBeingWritten.put(inodeId, out);
>   // update the last lease renewal time only when there was no
>   // writes. once there is one write stream open, the lease renewer
>   // thread keeps it updated well with in anyone's expiration time.
>   if (lastLeaseRenewal == 0) {
> updateLastLeaseRenewal();
>   }
> }
>   }
> {code}
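
One way to realize the distinction described above is to key the map of files 
being written by a (namespaceId, inodeId) pair instead of the bare inodeId. The 
sketch below is illustrative only and not necessarily the merged implementation:

{code:java}
import java.util.Objects;

// Illustrative sketch, not necessarily the merged change: a composite key so
// that identical inode ids from different namespaces cannot collide.
final class WritingFileKey {
  private final String namespaceId;
  private final long inodeId;

  WritingFileKey(String namespaceId, long inodeId) {
    this.namespaceId = namespaceId;
    this.inodeId = inodeId;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof WritingFileKey)) {
      return false;
    }
    WritingFileKey other = (WritingFileKey) o;
    return inodeId == other.inodeId && namespaceId.equals(other.namespaceId);
  }

  @Override
  public int hashCode() {
    return Objects.hash(namespaceId, inodeId);
  }
}

// filesBeingWritten would then become a Map<WritingFileKey, DFSOutputStream>
// rather than a Map<Long, DFSOutputStream>.
{code}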



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16756) RBF proxies the client's user by the login user to enable CacheEntry

2022-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16756:
--
Labels: pull-request-available  (was: )

> RBF proxies the client's user by the login user to enable CacheEntry
> 
>
> Key: HDFS-16756
> URL: https://issues.apache.org/jira/browse/HDFS-16756
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> RBF only proxies the client's user via the login user for Kerberos 
> authentication. If the cluster uses the SIMPLE authentication method, RBF 
> will not proxy the client's user via the login user, so the downstream 
> namespace will not use the real clientIp, clientPort, clientId and callId 
> even if the namenode has configured dfs.namenode.ip-proxy-users.
>  
> The related code is as below:
> {code:java}
> UserGroupInformation connUGI = ugi;
> if (UserGroupInformation.isSecurityEnabled()) {
>   UserGroupInformation routerUser = UserGroupInformation.getLoginUser();
>   connUGI = UserGroupInformation.createProxyUser(
>   ugi.getUserName(), routerUser);
> } {code}
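
A sketch of the behavior change under discussion, mirroring the quoted fragment 
(illustrative, not necessarily the exact patch): create the proxy user whenever 
the caller differs from the Router's login user, rather than only when Kerberos 
security is enabled, so the downstream NameNode can still recover the real 
client via dfs.namenode.ip-proxy-users under SIMPLE authentication.

{code:java}
// Illustrative sketch, not necessarily the exact patch.
UserGroupInformation connUGI = ugi;
UserGroupInformation routerUser = UserGroupInformation.getLoginUser();
if (!ugi.getUserName().equals(routerUser.getUserName())) {
  // Proxy regardless of whether Kerberos is enabled, so SIMPLE-auth clusters
  // also pass the real caller through to the downstream namespace.
  connUGI = UserGroupInformation.createProxyUser(ugi.getUserName(), routerUser);
}
{code}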



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16756) RBF proxies the client's user by the login user to enable CacheEntry

2022-09-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600388#comment-17600388
 ] 

ASF GitHub Bot commented on HDFS-16756:
---

ZanderXu opened a new pull request, #4853:
URL: https://github.com/apache/hadoop/pull/4853

   RBF only proxies the client's user via the login user for Kerberos 
authentication. 
   
   If the cluster uses the SIMPLE authentication method, RBF will not 
proxy the client's user via the login user, so the downstream namespace will not 
be able to use the real clientIp, clientPort, clientId and callId even if the 
namenode has configured `dfs.namenode.ip-proxy-users`.
   
The related code of RBF is as below:
   ```
   UserGroupInformation connUGI = ugi;
   if (UserGroupInformation.isSecurityEnabled()) {
 UserGroupInformation routerUser = UserGroupInformation.getLoginUser();
 connUGI = UserGroupInformation.createProxyUser(
 ugi.getUserName(), routerUser);
   }
   ``` 




> RBF proxies the client's user by the login user to enable CacheEntry
> 
>
> Key: HDFS-16756
> URL: https://issues.apache.org/jira/browse/HDFS-16756
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>
> RBF only proxies the client's user via the login user for Kerberos 
> authentication. If the cluster uses the SIMPLE authentication method, RBF 
> will not proxy the client's user via the login user, so the downstream 
> namespace will not use the real clientIp, clientPort, clientId and callId 
> even if the namenode has configured dfs.namenode.ip-proxy-users.
>  
> The related code is as below:
> {code:java}
> UserGroupInformation connUGI = ugi;
> if (UserGroupInformation.isSecurityEnabled()) {
>   UserGroupInformation routerUser = UserGroupInformation.getLoginUser();
>   connUGI = UserGroupInformation.createProxyUser(
>   ugi.getUserName(), routerUser);
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16756) RBF proxies the client's user by the login user to enable CacheEntry

2022-09-05 Thread ZanderXu (Jira)
ZanderXu created HDFS-16756:
---

 Summary: RBF proxies the client's user by the login user to enable 
CacheEntry
 Key: HDFS-16756
 URL: https://issues.apache.org/jira/browse/HDFS-16756
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: ZanderXu
Assignee: ZanderXu


RBF only proxies the client's user via the login user for Kerberos 
authentication. If the cluster uses the SIMPLE authentication method, RBF 
will not proxy the client's user via the login user, so the downstream namespace 
will not use the real clientIp, clientPort, clientId and callId even if the 
namenode has configured dfs.namenode.ip-proxy-users.

 

The related code is as below:
{code:java}
UserGroupInformation connUGI = ugi;
if (UserGroupInformation.isSecurityEnabled()) {
  UserGroupInformation routerUser = UserGroupInformation.getLoginUser();
  connUGI = UserGroupInformation.createProxyUser(
  ugi.getUserName(), routerUser);
} {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16748) RBF: DFSClient should uniquely identify writing files by namespace id and iNodeId

2022-09-05 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-16748:

Summary: RBF: DFSClient should uniquely identify writing files by namespace 
id and iNodeId  (was: DFSClient should uniquely identify writing files by 
namespace id and iNodeId)

> RBF: DFSClient should uniquely identify writing files by namespace id and 
> iNodeId
> -
>
> Key: HDFS-16748
> URL: https://issues.apache.org/jira/browse/HDFS-16748
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Critical
>  Labels: pull-request-available
>
> DFSClient should distinguish the files being written by namespaceId and 
> iNodeId, because files being written may belong to different namespaces yet 
> share the same iNodeId.
> The related code is as below:
> {code:java}
> public void putFileBeingWritten(final long inodeId,
>   final DFSOutputStream out) {
> synchronized(filesBeingWritten) {
>   filesBeingWritten.put(inodeId, out);
>   // update the last lease renewal time only when there was no
>   // writes. once there is one write stream open, the lease renewer
>   // thread keeps it updated well with in anyone's expiration time.
>   if (lastLeaseRenewal == 0) {
> updateLastLeaseRenewal();
>   }
> }
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16748) RBF: DFSClient should uniquely identify writing files by namespace id and iNodeId

2022-09-05 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600365#comment-17600365
 ] 

Ayush Saxena commented on HDFS-16748:
-

Committed to trunk.

Thanx [~xuzq_zander] for the contribution!!!

> RBF: DFSClient should uniquely identify writing files by namespace id and 
> iNodeId
> -
>
> Key: HDFS-16748
> URL: https://issues.apache.org/jira/browse/HDFS-16748
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Critical
>  Labels: pull-request-available
>
> DFSClient should distinguish the files being written by namespaceId and 
> iNodeId, because files being written may belong to different namespaces yet 
> share the same iNodeId.
> The related code is as below:
> {code:java}
> public void putFileBeingWritten(final long inodeId,
>   final DFSOutputStream out) {
> synchronized(filesBeingWritten) {
>   filesBeingWritten.put(inodeId, out);
>   // update the last lease renewal time only when there was no
>   // writes. once there is one write stream open, the lease renewer
>   // thread keeps it updated well with in anyone's expiration time.
>   if (lastLeaseRenewal == 0) {
> updateLastLeaseRenewal();
>   }
> }
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16748) RBF: DFSClient should uniquely identify writing files by namespace id and iNodeId

2022-09-05 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved HDFS-16748.
-
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> RBF: DFSClient should uniquely identify writing files by namespace id and 
> iNodeId
> -
>
> Key: HDFS-16748
> URL: https://issues.apache.org/jira/browse/HDFS-16748
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> DFSClient should distinguish the files being written by namespaceId and 
> iNodeId, because files being written may belong to different namespaces yet 
> share the same iNodeId.
> The related code is as below:
> {code:java}
> public void putFileBeingWritten(final long inodeId,
>   final DFSOutputStream out) {
> synchronized(filesBeingWritten) {
>   filesBeingWritten.put(inodeId, out);
>   // update the last lease renewal time only when there was no
>   // writes. once there is one write stream open, the lease renewer
>   // thread keeps it updated well with in anyone's expiration time.
>   if (lastLeaseRenewal == 0) {
> updateLastLeaseRenewal();
>   }
> }
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16748) DFSClient should uniquely identify writing files by namespace id and iNodeId

2022-09-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600364#comment-17600364
 ] 

ASF GitHub Bot commented on HDFS-16748:
---

ayushtkn merged PR #4813:
URL: https://github.com/apache/hadoop/pull/4813




> DFSClient should uniquely identify writing files by namespace id and iNodeId
> 
>
> Key: HDFS-16748
> URL: https://issues.apache.org/jira/browse/HDFS-16748
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Critical
>  Labels: pull-request-available
>
> DFSClient should distinguish the files being written by namespaceId and 
> iNodeId, because files being written may belong to different namespaces yet 
> share the same iNodeId.
> The related code is as below:
> {code:java}
> public void putFileBeingWritten(final long inodeId,
>   final DFSOutputStream out) {
> synchronized(filesBeingWritten) {
>   filesBeingWritten.put(inodeId, out);
>   // update the last lease renewal time only when there was no
>   // writes. once there is one write stream open, the lease renewer
>   // thread keeps it updated well with in anyone's expiration time.
>   if (lastLeaseRenewal == 0) {
> updateLastLeaseRenewal();
>   }
> }
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16593) Correct inaccurate BlocksRemoved metric on DataNode side

2022-09-05 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He resolved HDFS-16593.

Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

Committed to trunk.

> Correct inaccurate BlocksRemoved metric on DataNode side
> 
>
> Key: HDFS-16593
> URL: https://issues.apache.org/jira/browse/HDFS-16593
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When tracing the root cause of a production issue, I found that the 
> BlocksRemoved metric on the DataNode side was inaccurate.
> {code:java}
> case DatanodeProtocol.DNA_INVALIDATE:
>   //
>   // Some local block(s) are obsolete and can be 
>   // safely garbage-collected.
>   //
>   Block toDelete[] = bcmd.getBlocks();
>   try {
> // using global fsdataset
> dn.getFSDataset().invalidate(bcmd.getBlockPoolId(), toDelete);
>   } catch(IOException e) {
> // Exceptions caught here are not expected to be disk-related.
> throw e;
>   }
>   dn.metrics.incrBlocksRemoved(toDelete.length);
>   break;
> {code}
> Because even if the invalidate method throws an exception, some blocks may 
> have been successfully deleted internally.
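
One way to make the counter track reality, sketched under the assumption that 
blocks can be invalidated one at a time (the merged patch may differ):

{code:java}
// Illustrative sketch, not necessarily the merged patch: count only the blocks
// that were actually handed to invalidate() successfully, even if the loop
// aborts part-way through with an exception.
Block[] toDelete = bcmd.getBlocks();
int removed = 0;
try {
  for (Block b : toDelete) {
    dn.getFSDataset().invalidate(bcmd.getBlockPoolId(), new Block[] { b });
    removed++;
  }
} finally {
  dn.metrics.incrBlocksRemoved(removed);
}
{code}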



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16593) Correct inaccurate BlocksRemoved metric on DataNode side

2022-09-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600338#comment-17600338
 ] 

ASF GitHub Bot commented on HDFS-16593:
---

Hexiaoqiao commented on PR #4353:
URL: https://github.com/apache/hadoop/pull/4353#issuecomment-1236883561

   Committed to trunk. Thanks @ZanderXu .




> Correct inaccurate BlocksRemoved metric on DataNode side
> 
>
> Key: HDFS-16593
> URL: https://issues.apache.org/jira/browse/HDFS-16593
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When tracing the root cause of a production issue, I found that the 
> BlocksRemoved metric on the DataNode side was inaccurate.
> {code:java}
> case DatanodeProtocol.DNA_INVALIDATE:
>   //
>   // Some local block(s) are obsolete and can be 
>   // safely garbage-collected.
>   //
>   Block toDelete[] = bcmd.getBlocks();
>   try {
> // using global fsdataset
> dn.getFSDataset().invalidate(bcmd.getBlockPoolId(), toDelete);
>   } catch(IOException e) {
> // Exceptions caught here are not expected to be disk-related.
> throw e;
>   }
>   dn.metrics.incrBlocksRemoved(toDelete.length);
>   break;
> {code}
> Because even if the invalidate method throws an exception, some blocks may 
> have been successfully deleted internally.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16593) Correct inaccurate BlocksRemoved metric on DataNode side

2022-09-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600337#comment-17600337
 ] 

ASF GitHub Bot commented on HDFS-16593:
---

Hexiaoqiao merged PR #4353:
URL: https://github.com/apache/hadoop/pull/4353




> Correct inaccurate BlocksRemoved metric on DataNode side
> 
>
> Key: HDFS-16593
> URL: https://issues.apache.org/jira/browse/HDFS-16593
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When tracing the root cause of a production issue, I found that the 
> BlocksRemoved metric on the DataNode side was inaccurate.
> {code:java}
> case DatanodeProtocol.DNA_INVALIDATE:
>   //
>   // Some local block(s) are obsolete and can be 
>   // safely garbage-collected.
>   //
>   Block toDelete[] = bcmd.getBlocks();
>   try {
> // using global fsdataset
> dn.getFSDataset().invalidate(bcmd.getBlockPoolId(), toDelete);
>   } catch(IOException e) {
> // Exceptions caught here are not expected to be disk-related.
> throw e;
>   }
>   dn.metrics.incrBlocksRemoved(toDelete.length);
>   break;
> {code}
> Because even if the invalidate method throws an exception, some blocks may 
> have been successfully deleted internally.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16752) 2.7.2 Rolling Upgrade 3.3.4 Datanode cannot be Degraded due to version Inconsistency

2022-09-05 Thread yuyanlei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yuyanlei resolved HDFS-16752.
-
Resolution: Fixed

> 2.7.2 Rolling Upgrade 3.3.4 Datanode cannot be Degraded due to version 
> Inconsistency
> 
>
> Key: HDFS-16752
> URL: https://issues.apache.org/jira/browse/HDFS-16752
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, rolling upgrades
>Affects Versions: 2.7.2, 3.3.4
>Reporter: yuyanlei
>Priority: Blocker
>
> I am following the official documentation: HDFS Rolling 
> Upgrade([https://hadoop.apache.org/docs/r3.3.4/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html]):
>  After the rolling upgrade ("hdfs dfsadmin -rollingUpgrade finalize" is not 
> executed), I start downgrading the Datanode. However, the Datanode downgrade 
> fails. The Datanode log shows the following:
> {panel:title=myhost001.datanode.log}
> INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on 
> /data01/block/in_use.lock acquired by nodename 37422@myhost001
> WARN org.apache.hadoop.hdfs.server.common.Storage: 
> org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected 
> version of storage directory /data01/block. Reported: -57. Expecting = -56.
> INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on 
> /data02/block/in_use.lock acquired by nodename 37422@myhost001
> WARN org.apache.hadoop.hdfs.server.common.Storage: 
> org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected 
> version of storage directory /data02/block. Reported: -57. Expecting = -56.
> INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on 
> /data03/block/in_use.lock acquired by nodename 37422@myhost001
> WARN org.apache.hadoop.hdfs.server.common.Storage: 
> org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected 
> version of storage directory /data03/block. Reported: -57. Expecting = -56.
> INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on 
> /data01/block/in_use.lock acquired by nodename 37422@myhost001
> WARN org.apache.hadoop.hdfs.server.common.Storage: 
> org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected 
> version of storage directory /data01/block. Reported: -57. Expecting = -56.
> INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on 
> /data02/block/in_use.lock acquired by nodename 37422@myhost001
> WARN org.apache.hadoop.hdfs.server.common.Storage: 
> org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected 
> version of storage directory /data02/block. Reported: -57. Expecting = -56.
> INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on 
> /data03/block/in_use.lock acquired by nodename 37422@myhost001
> WARN org.apache.hadoop.hdfs.server.common.Storage: 
> org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected 
> version of storage directory /data03/block. Reported: -57. Expecting = -56.
> FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed 
> for Block pool  (Datanode Uuid unassigned) service to 
> myhost002/***:9002. Exiting.
> java.io.IOException: All specified directories are failed to load.
>         at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:478)
>         at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1393)
>         at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1358)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:313)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:219)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:644)
>         at java.lang.Thread.run(Thread.java:748)
> FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed 
> for Block pool  (Datanode Uuid unassigned) service to 
> myhost002/:9002. Exiting.
> java.io.IOException: All specified directories are failed to load.
>         at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:478)
>         at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1393)
>         at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1358)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:313)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:219)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:644)
>         at 

[jira] [Commented] (HDFS-16752) 2.7.2 Rolling Upgrade 3.3.4 Datanode cannot be Degraded due to version Inconsistency

2022-09-05 Thread yuyanlei (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600278#comment-17600278
 ] 

yuyanlei commented on HDFS-16752:
-

After rolling back https://issues.apache.org/jira/browse/HDFS-8791, the 
datanode can be downgraded successfully.

> 2.7.2 Rolling Upgrade 3.3.4 Datanode cannot be Degraded due to version 
> Inconsistency
> 
>
> Key: HDFS-16752
> URL: https://issues.apache.org/jira/browse/HDFS-16752
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, rolling upgrades
>Affects Versions: 2.7.2, 3.3.4
>Reporter: yuyanlei
>Priority: Blocker
>
> I am following the official documentation: HDFS Rolling 
> Upgrade([https://hadoop.apache.org/docs/r3.3.4/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html]):
>  After the rolling upgrade ("hdfs dfsadmin -rollingUpgrade finalize" is not 
> executed), I start downgrading the Datanode. However, the Datanode downgrade 
> fails. The Datanode log shows the following:
> {panel:title=myhost001.datanode.log}
> INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on 
> /data01/block/in_use.lock acquired by nodename 37422@myhost001
> WARN org.apache.hadoop.hdfs.server.common.Storage: 
> org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected 
> version of storage directory /data01/block. Reported: -57. Expecting = -56.
> INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on 
> /data02/block/in_use.lock acquired by nodename 37422@myhost001
> WARN org.apache.hadoop.hdfs.server.common.Storage: 
> org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected 
> version of storage directory /data02/block. Reported: -57. Expecting = -56.
> INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on 
> /data03/block/in_use.lock acquired by nodename 37422@myhost001
> WARN org.apache.hadoop.hdfs.server.common.Storage: 
> org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected 
> version of storage directory /data03/block. Reported: -57. Expecting = -56.
> INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on 
> /data01/block/in_use.lock acquired by nodename 37422@myhost001
> WARN org.apache.hadoop.hdfs.server.common.Storage: 
> org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected 
> version of storage directory /data01/block. Reported: -57. Expecting = -56.
> INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on 
> /data02/block/in_use.lock acquired by nodename 37422@myhost001
> WARN org.apache.hadoop.hdfs.server.common.Storage: 
> org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected 
> version of storage directory /data02/block. Reported: -57. Expecting = -56.
> INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on 
> /data03/block/in_use.lock acquired by nodename 37422@myhost001
> WARN org.apache.hadoop.hdfs.server.common.Storage: 
> org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected 
> version of storage directory /data03/block. Reported: -57. Expecting = -56.
> FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed 
> for Block pool  (Datanode Uuid unassigned) service to 
> myhost002/***:9002. Exiting.
> java.io.IOException: All specified directories are failed to load.
>         at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:478)
>         at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1393)
>         at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1358)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:313)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:219)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:644)
>         at java.lang.Thread.run(Thread.java:748)
> FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed 
> for Block pool  (Datanode Uuid unassigned) service to 
> myhost002/:9002. Exiting.
> java.io.IOException: All specified directories are failed to load.
>         at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:478)
>         at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1393)
>         at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1358)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:313)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:219)
>         at 
>