[jira] [Assigned] (HDFS-16789) Update FileSystem class to read disk usage from actual usage value instead of file's length

2022-09-30 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang reassigned HDFS-16789:
--

Assignee: Dave Teng

> Update FileSystem class to read disk usage from actual usage value instead of 
> file's length
> ---
>
> Key: HDFS-16789
> URL: https://issues.apache.org/jira/browse/HDFS-16789
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfs
>Reporter: Dave Teng
>Assignee: Dave Teng
>Priority: Major
>
> Currently, the FileSystem class retrieves the disk usage value directly from 
> the [file's 
> length|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L1895]
>  instead of the actual usage value.
> We need to update FileSystem and related classes to read disk usage from the 
> actual usage value.
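
To illustrate the gap the ticket describes: a file's length is the number of bytes a reader sees, while the space it occupies depends on how the underlying storage allocates it. The sketch below is not Hadoop code. `DiskUsageSketch`, `allocatedBytes`, and the 4096-byte block size are hypothetical, and it assumes a store that allocates whole blocks. The ticket does not say which source of "actual usage" FileSystem should read.

```java
public class DiskUsageSketch {

    /**
     * Hypothetical estimate of on-disk usage for a store that allocates
     * whole blocks: the length rounded up to a multiple of blockSize.
     */
    public static long allocatedBytes(long length, long blockSize) {
        if (length < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("length must be >= 0 and blockSize > 0");
        }
        long blocks = (length + blockSize - 1) / blockSize; // ceiling division
        return blocks * blockSize;
    }

    public static void main(String[] args) {
        long blockSize = 4096; // assumed allocation unit, for illustration only
        long[] lengths = {0, 1, 4096, 4097};
        for (long length : lengths) {
            // Length and allocated size agree only when length is a block multiple.
            System.out.println("length=" + length
                + " allocated=" + allocatedBytes(length, blockSize));
        }
    }
}
```

Under this assumption the two values agree only when the length is an exact multiple of the block size, which is why reporting the length as disk usage can be misleading.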



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16789) Update FileSystem class to read disk usage from actual usage value instead of file's length

2022-09-30 Thread Dave Teng (Jira)
Dave Teng created HDFS-16789:


 Summary: Update FileSystem class to read disk usage from actual 
usage value instead of file's length
 Key: HDFS-16789
 URL: https://issues.apache.org/jira/browse/HDFS-16789
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: dfs
Reporter: Dave Teng


Currently, the FileSystem class retrieves the disk usage value directly from the [file's 
length|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L1895]
 instead of the actual usage value.

Update FileSystem and related classes to read disk usage from the actual usage value.






[jira] [Updated] (HDFS-16789) Update FileSystem class to read disk usage from actual usage value instead of file's length

2022-09-30 Thread Dave Teng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Teng updated HDFS-16789:
-
Description: 
Currently, the FileSystem class retrieves the disk usage value directly from the [file's 
length|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L1895]
 instead of the actual usage value.

We need to update FileSystem and related classes to read disk usage from the 
actual usage value.

  was:
Currently, the FileSystem class retrieves the disk usage value directly from the [file's 
length|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L1895]
 instead of the actual usage value.

Update FileSystem and related classes to read disk usage from the actual usage value.


> Update FileSystem class to read disk usage from actual usage value instead of 
> file's length
> ---
>
> Key: HDFS-16789
> URL: https://issues.apache.org/jira/browse/HDFS-16789
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfs
>Reporter: Dave Teng
>Priority: Major
>
> Currently, the FileSystem class retrieves the disk usage value directly from 
> the [file's 
> length|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L1895]
>  instead of the actual usage value.
> We need to update FileSystem and related classes to read disk usage from the 
> actual usage value.






[jira] [Commented] (HDFS-13369) FSCK Report broken with RequestHedgingProxyProvider

2022-09-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611650#comment-17611650
 ] 

ASF GitHub Bot commented on HDFS-13369:
---

ChenSammi merged PR #4917:
URL: https://github.com/apache/hadoop/pull/4917




> FSCK Report broken with RequestHedgingProxyProvider 
> 
>
> Key: HDFS-13369
> URL: https://issues.apache.org/jira/browse/HDFS-13369
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.3, 3.0.0, 3.1.0
>Reporter: Harshakiran Reddy
>Assignee: Ranith Sardar
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13369.001.patch, HDFS-13369.002.patch, 
> HDFS-13369.003.patch, HDFS-13369.004.patch, HDFS-13369.005.patch, 
> HDFS-13369.006.patch, HDFS-13369.007.patch
>
>
> Scenario:
> 1. Configure the RequestHedgingProxy
> 2. Write some files to the file system
> 3. Run an FSCK report on those files
>  
> {noformat}
> bin> hdfs fsck /file1 -locations -files -blocks
> Exception in thread "main" java.lang.ClassCastException: org.apache.hadoop.hdfs.server.namenode.ha.RequestHedgingProxyProvider$RequestHedgingInvocationHandler cannot be cast to org.apache.hadoop.ipc.RpcInvocationHandler
>     at org.apache.hadoop.ipc.RPC.getConnectionIdForProxy(RPC.java:626)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.getConnectionId(RetryInvocationHandler.java:438)
>     at org.apache.hadoop.ipc.RPC.getConnectionIdForProxy(RPC.java:628)
>     at org.apache.hadoop.ipc.RPC.getServerAddress(RPC.java:611)
>     at org.apache.hadoop.hdfs.HAUtil.getAddressOfActive(HAUtil.java:263)
>     at org.apache.hadoop.hdfs.tools.DFSck.getCurrentNamenodeAddress(DFSck.java:257)
>     at org.apache.hadoop.hdfs.tools.DFSck.doWork(DFSck.java:319)
>     at org.apache.hadoop.hdfs.tools.DFSck.access$000(DFSck.java:72)
>     at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:156)
>     at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:153)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
>     at org.apache.hadoop.hdfs.tools.DFSck.run(DFSck.java:152)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>     at org.apache.hadoop.hdfs.tools.DFSck.main(DFSck.java:385)
> {noformat}
>  
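
The trace shows an invocation handler taken from a dynamic proxy being cast to a handler interface it does not implement. The sketch below reproduces that failure mode with plain JDK proxies and shows one defensive alternative. It is not Hadoop code and not the change merged in PR #4917. `ConnectionAware`, `Service`, `DirectHandler`, `HedgingHandler`, and the `nn1:8020` value are hypothetical stand-ins.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class HandlerCastSketch {

    /** Hypothetical stand-in for a handler interface that exposes a connection id. */
    public interface ConnectionAware {
        String connectionId();
    }

    /** Hypothetical service interface that the proxies implement. */
    public interface Service {
        String ping();
    }

    /** Handler that talks to one endpoint and can report its connection. */
    public static final class DirectHandler implements InvocationHandler, ConnectionAware {
        @Override
        public Object invoke(Object proxy, Method method, Object[] args) {
            return "pong";
        }

        @Override
        public String connectionId() {
            return "nn1:8020";
        }
    }

    /** Handler that does NOT implement ConnectionAware, like a fan-out wrapper. */
    public static final class HedgingHandler implements InvocationHandler {
        @Override
        public Object invoke(Object proxy, Method method, Object[] args) {
            return "pong";
        }
    }

    public static Service proxyFor(InvocationHandler handler) {
        return (Service) Proxy.newProxyInstance(
            Service.class.getClassLoader(), new Class<?>[] {Service.class}, handler);
    }

    /** Blind cast: throws ClassCastException for handlers that lack the interface. */
    public static String connectionIdUnchecked(Object proxy) {
        return ((ConnectionAware) Proxy.getInvocationHandler(proxy)).connectionId();
    }

    /** Checked variant: returns null instead of throwing when the handler cannot answer. */
    public static String connectionIdChecked(Object proxy) {
        InvocationHandler handler = Proxy.getInvocationHandler(proxy);
        if (handler instanceof ConnectionAware) {
            return ((ConnectionAware) handler).connectionId();
        }
        return null;
    }
}
```

Whether a real fix should check the type, make the hedging handler implement the interface, or resolve the address another way is a design decision for the patch; the sketch only shows why the blind cast fails.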






[jira] [Commented] (HDFS-13369) FSCK Report broken with RequestHedgingProxyProvider

2022-09-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611649#comment-17611649
 ] 

ASF GitHub Bot commented on HDFS-13369:
---

ChenSammi commented on PR #4917:
URL: https://github.com/apache/hadoop/pull/4917#issuecomment-1263717827

   +1.  Thanks @navinko for the contribution. 










[jira] [Commented] (HDFS-16785) DataNode hold BP write lock to scan disk

2022-09-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611503#comment-17611503
 ] 

ASF GitHub Bot commented on HDFS-16785:
---

ZanderXu commented on PR #4945:
URL: https://github.com/apache/hadoop/pull/4945#issuecomment-1263411206

   @MingXiangLi Sir, thanks for your review.
   
   > Case2: It's wrong, but not caused by this lock.
   
   I will create a new PR to fix this case.
   
   > And can avoid conflict cause by volume/block pool remove or add.
   
   We can add a synchronized lock to `addBlockPool` and `shutdownBlockPool`, as
   before, to avoid conflicts caused by adding or removing a volume or block
   pool. This synchronized lock only serializes volume and block pool additions
   and removals, so it will not block other operations such as client reads or
   writes. If that is OK, I will fix it in a new ticket together with Case2.
   
   @Hexiaoqiao, looking forward to your ideas.




> DataNode hold BP write lock to scan disk
> 
>
> Key: HDFS-16785
> URL: https://issues.apache.org/jira/browse/HDFS-16785
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> When patching the fine-grained locking of the DataNode, I found that `addVolume` 
> holds the write lock of the BP (block pool) while it scans the new volume to 
> get its blocks. If we add a full volume that was previously repaired offline, 
> it will hold the write lock for a long time.
> The related code is as follows:
> {code:java}
> for (final NamespaceInfo nsInfo : nsInfos) {
>   String bpid = nsInfo.getBlockPoolID();
>   try (AutoCloseDataSetLock l = lockManager.writeLock(LockLevel.BLOCK_POOl, bpid)) {
>     fsVolume.addBlockPool(bpid, this.conf, this.timer);
>     fsVolume.getVolumeMap(bpid, tempVolumeMap, ramDiskReplicaTracker);
>   } catch (IOException e) {
>     LOG.warn("Caught exception when adding " + fsVolume +
>         ". Will throw later.", e);
>     exceptions.add(e);
>   }
> }
> {code}
> I also noticed that this lock was added by HDFS-15382, which means this logic 
> was not under a lock before.
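
One general way to shorten a write-lock hold like the one described above is to do the slow scan into a temporary structure without the lock, then take the write lock only to publish the result. The sketch below shows that pattern with JDK locks. It is not the change proposed in PR #4945, and `ScanOutsideLockSketch`, `addVolume`, `blockCount`, and the block-id lists are hypothetical. It also assumes nothing else needs to observe the volume while it is being scanned.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

public class ScanOutsideLockSketch {

    // Stand-in for a per-block-pool lock; a single lock keeps the sketch short.
    private final ReentrantReadWriteLock bpLock = new ReentrantReadWriteLock();

    // Block pool id -> block ids known to the dataset.
    private final Map<String, List<Long>> volumeMap = new HashMap<>();

    /** Scans first without the lock, then publishes under a short write lock. */
    public void addVolume(String bpid, Supplier<List<Long>> slowScan) {
        // Slow part: runs without holding the write lock.
        List<Long> found = new ArrayList<>(slowScan.get());

        // Fast part: only the merge into shared state is locked.
        bpLock.writeLock().lock();
        try {
            volumeMap.computeIfAbsent(bpid, k -> new ArrayList<>()).addAll(found);
        } finally {
            bpLock.writeLock().unlock();
        }
    }

    /** Readers take the read lock, so they wait only for the short merge. */
    public int blockCount(String bpid) {
        bpLock.readLock().lock();
        try {
            List<Long> blocks = volumeMap.get(bpid);
            return blocks == null ? 0 : blocks.size();
        } finally {
            bpLock.readLock().unlock();
        }
    }

    /** Exposed so callers can observe whether the scan ran under the write lock. */
    public boolean isWriteLockedByCurrentThread() {
        return bpLock.isWriteLockedByCurrentThread();
    }
}
```

The trade-off is that concurrent additions and removals of the same volume or block pool are no longer serialized by this lock, which is the conflict the review comments discuss.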


