[jira] [Commented] (HDFS-13093) Quota set don't compute usage of unspecified storage policy content

2018-01-31 Thread liaoyuxiangqin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348128#comment-16348128
 ] 

liaoyuxiangqin commented on HDFS-13093:
---

Thanks [~xyao] for reviewing this and giving detailed explanations and several 
possible solutions. The three options you proposed above can resolve the 
inconsistent result returned by the "hdfs dfs -count" CLI. However, because the 
incorrect remaining quota cannot stop a client from continuing to write data to 
HDFS, the "hdfs dfs -count" CLI may return a negative remaining quota after the 
NN is restarted. 
{noformat}
 SSD_QUOTA REM_SSD_QUOTA DISK_QUOTA REM_DISK_QUOTA ARCHIVE_QUOTA 
REM_ARCHIVE_QUOTA PROVIDED_QUOTA REM_PROVIDED_QUOTA PATHNAME
 none inf 6 G -3 G none inf none inf /hot
{noformat}
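The negative remaining quota above can be reproduced arithmetically. A minimal 
sketch of the bookkeeping (a hypothetical simplification for illustration; the 
function name and the no-quota sentinel are assumptions, not the actual 
NameNode code):

```python
LONG_MAX = 2**63 - 1  # stand-in sentinel for "no quota set" (assumption)
GB = 1024**3

def remaining_quota(quota, consumed):
    """Remaining space quota for one storage type; None renders as 'inf'."""
    if quota == LONG_MAX:
        return None
    return quota - consumed  # may go negative if writes were not limited

# DISK quota of 6 GB, but usage kept growing because the incorrect remaining
# quota did not stop the client: e.g. 3 GB of data x 3 replicas = 9 GB used.
print(remaining_quota(6 * GB, 9 * GB) // GB)  # -3, i.e. the "-3 G" above
```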

> Quota set don't compute usage of unspecified storage policy content
> ---
>
> Key: HDFS-13093
> URL: https://issues.apache.org/jira/browse/HDFS-13093
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.0
> Environment: hdfs: hadoop-3.1.0-SNAPSHOT
> node:1 namenode, 9 datanodes
>Reporter: liaoyuxiangqin
>Priority: Major
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Test with the following steps:
>  1. hdfs dfs -mkdir /hot
>  2. hdfs dfs -put 1G.img /hot/file1
>  3. hdfs dfsadmin -setSpaceQuota 6442450944 -storageType DISK /hot
>  4. hdfs storagepolicies -setStoragePolicy -path /hot -policy HOT
>  5. hdfs dfs -count -q -h -v -t DISK /hot
> {code:java}
> SSD_QUOTA REM_SSD_QUOTA DISK_QUOTA REM_DISK_QUOTA ARCHIVE_QUOTA 
> REM_ARCHIVE_QUOTA PROVIDED_QUOTA REM_PROVIDED_QUOTA PATHNAME
>  none inf 6 G 6 G none inf none inf /hot{code}
> In step 5 I expected the remaining quota to be 3 GB (6 GB quota - 1 GB * 3 
> replicas), but it is actually 6 GB.
>  If I swap the order of step 3 and step 4, the remaining quota equals the 
> expected 3 GB.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12897) getErasureCodingPolicy should handle .snapshot dir better

2018-01-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348129#comment-16348129
 ] 

Hudson commented on HDFS-12897:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13597 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13597/])
HDFS-12897. getErasureCodingPolicy should handle .snapshot dir better. (xiao: 
rev ae2177d296a322d13708b85aaa8a971b8dcce128)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirErasureCodingOp.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestErasureCodingPolicies.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestErasureCodingPolicyWithSnapshot.java


> getErasureCodingPolicy should handle .snapshot dir better
> -
>
> Key: HDFS-12897
> URL: https://issues.apache.org/jira/browse/HDFS-12897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding, hdfs, snapshots
>Affects Versions: 3.0.0-alpha1, 3.1.0
>Reporter: Harshakiran Reddy
>Assignee: LiXin Ge
>Priority: Major
> Fix For: 3.1.0, 3.0.1
>
> Attachments: HDFS-12897.001.patch, HDFS-12897.002.patch, 
> HDFS-12897.003.patch, HDFS-12897.004.patch, HDFS-12897.005.patch
>
>
> Scenario:-
> ---
> Operation on snapshot dir.
> *EC policy*
> bin> ./hdfs ec -getPolicy -path /dir/
> RS-3-2-1024k
> bin> ./hdfs ec -getPolicy -path /dir/.snapshot/
> {{FileNotFoundException: Path not found: /dir/.snapshot}}
> bin> ./hdfs dfs -ls /dir/.snapshot/
> Found 2 items
> drwxr-xr-x   - user group  0 2017-12-05 12:27 /dir/.snapshot/s1
> drwxr-xr-x   - user group  0 2017-12-05 12:28 /dir/.snapshot/s2
> *Storagepolicies*
> bin> ./hdfs storagepolicies -getStoragePolicy -path /dir/.snapshot/
> {{The storage policy of /dir/.snapshot/ is unspecified}}
> bin> ./hdfs storagepolicies -getStoragePolicy -path /dir/
> The storage policy of /dir/:
> BlockStoragePolicy{COLD:2, storageTypes=[ARCHIVE], creationFallbacks=[], 
> replicationFallbacks=[]}
> *Which is the correct behavior ?*
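The fix needs to treat the dot-snapshot root as a special path component 
rather than a real inode. A rough sketch of that normalization (a hypothetical 
Python helper for illustration only; the actual patch lives in 
FSDirErasureCodingOp.java):

```python
DOT_SNAPSHOT = ".snapshot"

def resolve_dot_snapshot(path):
    """Map '/dir/.snapshot' (the snapshot root, which has no inode of its own)
    to '/dir' so a policy lookup does not throw FileNotFoundException.
    Paths inside a concrete snapshot, e.g. '/dir/.snapshot/s1', are kept."""
    p = path.rstrip("/") or "/"
    if p.endswith("/" + DOT_SNAPSHOT):
        return p[: -len("/" + DOT_SNAPSHOT)] or "/"
    return p

print(resolve_dot_snapshot("/dir/.snapshot/"))   # /dir
print(resolve_dot_snapshot("/dir/.snapshot/s1")) # /dir/.snapshot/s1
```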






[jira] [Commented] (HDFS-13060) Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver

2018-01-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348115#comment-16348115
 ] 

Hudson commented on HDFS-13060:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13596 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13596/])
HDFS-13060. Adding a BlacklistBasedTrustedChannelResolver for (xyao: rev 
af015c0b2359be317132e2cf35735429f4f34ea7)
* (add) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/CombinedIPList.java
* (add) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/package-info.java
* (add) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/datatransfer/sasl/TestBlackListBasedTrustedChannelResolver.java
* (add) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/BlackListBasedTrustedChannelResolver.java


> Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver
> 
>
> Key: HDFS-13060
> URL: https://issues.apache.org/jira/browse/HDFS-13060
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, security
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HDFS-13060.000.patch, HDFS-13060.001.patch, 
> HDFS-13060.002.patch, HDFS-13060.003.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both client and server side. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted even if 
> dfs.encrypt.data.transfer is set to true. 
> The default trust channel resolver implementation returns false, indicating 
> that the channel is not trusted, which always enables encryption. HDFS-5910 
> also added a built-in whitelist-based trust channel resolver. It allows you 
> to put the IP address/network mask of trusted clients/servers in whitelist 
> files to skip encryption for certain traffic. 
> This ticket is opened to add a blacklist-based trust channel resolver for 
> cases where only certain machines (IPs) are untrusted, without adding each 
> trusted IP individually.
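The blacklist idea inverts the whitelist default: every peer is trusted unless 
explicitly listed. A minimal sketch (Python, using CIDR networks for 
illustration; the real resolver reads IP/network-mask list files, and the 
class and method names here are assumptions, not the committed API):

```python
import ipaddress

class BlackListBasedTrustedChannelResolver:
    """Trust every peer (skip encryption) unless its IP is blacklisted."""

    def __init__(self, blacklist_cidrs):
        self.blacklist = [ipaddress.ip_network(c) for c in blacklist_cidrs]

    def is_trusted(self, peer_ip):
        # Untrusted peers fall back to encrypted data transfer.
        ip = ipaddress.ip_address(peer_ip)
        return not any(ip in net for net in self.blacklist)

r = BlackListBasedTrustedChannelResolver(["10.0.0.0/24"])
print(r.is_trusted("10.0.0.5"))     # False -> channel will be encrypted
print(r.is_trusted("192.168.1.2"))  # True  -> encryption skipped
```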






[jira] [Updated] (HDFS-12897) getErasureCodingPolicy should handle .snapshot dir better

2018-01-31 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-12897:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.1
   3.1.0
   Status: Resolved  (was: Patch Available)

> getErasureCodingPolicy should handle .snapshot dir better
> -
>
> Key: HDFS-12897
> URL: https://issues.apache.org/jira/browse/HDFS-12897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding, hdfs, snapshots
>Affects Versions: 3.0.0-alpha1, 3.1.0
>Reporter: Harshakiran Reddy
>Assignee: LiXin Ge
>Priority: Major
> Fix For: 3.1.0, 3.0.1
>
> Attachments: HDFS-12897.001.patch, HDFS-12897.002.patch, 
> HDFS-12897.003.patch, HDFS-12897.004.patch, HDFS-12897.005.patch
>
>
> Scenario:-
> ---
> Operation on snapshot dir.
> *EC policy*
> bin> ./hdfs ec -getPolicy -path /dir/
> RS-3-2-1024k
> bin> ./hdfs ec -getPolicy -path /dir/.snapshot/
> {{FileNotFoundException: Path not found: /dir/.snapshot}}
> bin> ./hdfs dfs -ls /dir/.snapshot/
> Found 2 items
> drwxr-xr-x   - user group  0 2017-12-05 12:27 /dir/.snapshot/s1
> drwxr-xr-x   - user group  0 2017-12-05 12:28 /dir/.snapshot/s2
> *Storagepolicies*
> bin> ./hdfs storagepolicies -getStoragePolicy -path /dir/.snapshot/
> {{The storage policy of /dir/.snapshot/ is unspecified}}
> bin> ./hdfs storagepolicies -getStoragePolicy -path /dir/
> The storage policy of /dir/:
> BlockStoragePolicy{COLD:2, storageTypes=[ARCHIVE], creationFallbacks=[], 
> replicationFallbacks=[]}
> *Which is the correct behavior ?*






[jira] [Commented] (HDFS-12897) getErasureCodingPolicy should handle .snapshot dir better

2018-01-31 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348107#comment-16348107
 ] 

Xiao Chen commented on HDFS-12897:
--

+1 on patch 5, failed tests are not related to the change here.

Committed to trunk and branch-3.0. Thanks [~Harsha1206] for reporting the 
issue, [~GeLiXin] for the fix and [~rakeshr] for the review!

> getErasureCodingPolicy should handle .snapshot dir better
> -
>
> Key: HDFS-12897
> URL: https://issues.apache.org/jira/browse/HDFS-12897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding, hdfs, snapshots
>Affects Versions: 3.0.0-alpha1, 3.1.0
>Reporter: Harshakiran Reddy
>Assignee: LiXin Ge
>Priority: Major
> Attachments: HDFS-12897.001.patch, HDFS-12897.002.patch, 
> HDFS-12897.003.patch, HDFS-12897.004.patch, HDFS-12897.005.patch
>
>
> Scenario:-
> ---
> Operation on snapshot dir.
> *EC policy*
> bin> ./hdfs ec -getPolicy -path /dir/
> RS-3-2-1024k
> bin> ./hdfs ec -getPolicy -path /dir/.snapshot/
> {{FileNotFoundException: Path not found: /dir/.snapshot}}
> bin> ./hdfs dfs -ls /dir/.snapshot/
> Found 2 items
> drwxr-xr-x   - user group  0 2017-12-05 12:27 /dir/.snapshot/s1
> drwxr-xr-x   - user group  0 2017-12-05 12:28 /dir/.snapshot/s2
> *Storagepolicies*
> bin> ./hdfs storagepolicies -getStoragePolicy -path /dir/.snapshot/
> {{The storage policy of /dir/.snapshot/ is unspecified}}
> bin> ./hdfs storagepolicies -getStoragePolicy -path /dir/
> The storage policy of /dir/:
> BlockStoragePolicy{COLD:2, storageTypes=[ARCHIVE], creationFallbacks=[], 
> replicationFallbacks=[]}
> *Which is the correct behavior ?*






[jira] [Updated] (HDFS-12897) Path not found when getErasureCodingPolicy on a .snapshot dir

2018-01-31 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-12897:
-
Summary: Path not found when getErasureCodingPolicy on a .snapshot dir  
(was: Path not found when we get the ec policy for a .snapshot dir)

> Path not found when getErasureCodingPolicy on a .snapshot dir
> -
>
> Key: HDFS-12897
> URL: https://issues.apache.org/jira/browse/HDFS-12897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding, hdfs, snapshots
>Affects Versions: 3.0.0-alpha1, 3.1.0
>Reporter: Harshakiran Reddy
>Assignee: LiXin Ge
>Priority: Major
> Attachments: HDFS-12897.001.patch, HDFS-12897.002.patch, 
> HDFS-12897.003.patch, HDFS-12897.004.patch, HDFS-12897.005.patch
>
>
> Scenario:-
> ---
> Operation on snapshot dir.
> *EC policy*
> bin> ./hdfs ec -getPolicy -path /dir/
> RS-3-2-1024k
> bin> ./hdfs ec -getPolicy -path /dir/.snapshot/
> {{FileNotFoundException: Path not found: /dir/.snapshot}}
> bin> ./hdfs dfs -ls /dir/.snapshot/
> Found 2 items
> drwxr-xr-x   - user group  0 2017-12-05 12:27 /dir/.snapshot/s1
> drwxr-xr-x   - user group  0 2017-12-05 12:28 /dir/.snapshot/s2
> *Storagepolicies*
> bin> ./hdfs storagepolicies -getStoragePolicy -path /dir/.snapshot/
> {{The storage policy of /dir/.snapshot/ is unspecified}}
> bin> ./hdfs storagepolicies -getStoragePolicy -path /dir/
> The storage policy of /dir/:
> BlockStoragePolicy{COLD:2, storageTypes=[ARCHIVE], creationFallbacks=[], 
> replicationFallbacks=[]}
> *Which is the correct behavior ?*






[jira] [Updated] (HDFS-12897) getErasureCodingPolicy should handle .snapshot dir better

2018-01-31 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-12897:
-
Summary: getErasureCodingPolicy should handle .snapshot dir better  (was: 
Path not found when getErasureCodingPolicy on a .snapshot dir)

> getErasureCodingPolicy should handle .snapshot dir better
> -
>
> Key: HDFS-12897
> URL: https://issues.apache.org/jira/browse/HDFS-12897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding, hdfs, snapshots
>Affects Versions: 3.0.0-alpha1, 3.1.0
>Reporter: Harshakiran Reddy
>Assignee: LiXin Ge
>Priority: Major
> Attachments: HDFS-12897.001.patch, HDFS-12897.002.patch, 
> HDFS-12897.003.patch, HDFS-12897.004.patch, HDFS-12897.005.patch
>
>
> Scenario:-
> ---
> Operation on snapshot dir.
> *EC policy*
> bin> ./hdfs ec -getPolicy -path /dir/
> RS-3-2-1024k
> bin> ./hdfs ec -getPolicy -path /dir/.snapshot/
> {{FileNotFoundException: Path not found: /dir/.snapshot}}
> bin> ./hdfs dfs -ls /dir/.snapshot/
> Found 2 items
> drwxr-xr-x   - user group  0 2017-12-05 12:27 /dir/.snapshot/s1
> drwxr-xr-x   - user group  0 2017-12-05 12:28 /dir/.snapshot/s2
> *Storagepolicies*
> bin> ./hdfs storagepolicies -getStoragePolicy -path /dir/.snapshot/
> {{The storage policy of /dir/.snapshot/ is unspecified}}
> bin> ./hdfs storagepolicies -getStoragePolicy -path /dir/
> The storage policy of /dir/:
> BlockStoragePolicy{COLD:2, storageTypes=[ARCHIVE], creationFallbacks=[], 
> replicationFallbacks=[]}
> *Which is the correct behavior ?*






[jira] [Updated] (HDFS-13060) Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver

2018-01-31 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-13060:
--
Component/s: security
 datanode

> Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver
> 
>
> Key: HDFS-13060
> URL: https://issues.apache.org/jira/browse/HDFS-13060
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, security
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HDFS-13060.000.patch, HDFS-13060.001.patch, 
> HDFS-13060.002.patch, HDFS-13060.003.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both client and server side. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted even if 
> dfs.encrypt.data.transfer is set to true. 
> The default trust channel resolver implementation returns false, indicating 
> that the channel is not trusted, which always enables encryption. HDFS-5910 
> also added a built-in whitelist-based trust channel resolver. It allows you 
> to put the IP address/network mask of trusted clients/servers in whitelist 
> files to skip encryption for certain traffic. 
> This ticket is opened to add a blacklist-based trust channel resolver for 
> cases where only certain machines (IPs) are untrusted, without adding each 
> trusted IP individually.






[jira] [Updated] (HDFS-13060) Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver

2018-01-31 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-13060:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.1.0
   Status: Resolved  (was: Patch Available)

Thanks [~ajayydv] for the contribution. I've committed the patch to trunk and 
branch-3.0.

> Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver
> 
>
> Key: HDFS-13060
> URL: https://issues.apache.org/jira/browse/HDFS-13060
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, security
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HDFS-13060.000.patch, HDFS-13060.001.patch, 
> HDFS-13060.002.patch, HDFS-13060.003.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both client and server side. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted even if 
> dfs.encrypt.data.transfer is set to true. 
> The default trust channel resolver implementation returns false, indicating 
> that the channel is not trusted, which always enables encryption. HDFS-5910 
> also added a built-in whitelist-based trust channel resolver. It allows you 
> to put the IP address/network mask of trusted clients/servers in whitelist 
> files to skip encryption for certain traffic. 
> This ticket is opened to add a blacklist-based trust channel resolver for 
> cases where only certain machines (IPs) are untrusted, without adding each 
> trusted IP individually.






[jira] [Commented] (HDFS-13095) Improve slice tree traversal implementation

2018-01-31 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348072#comment-16348072
 ] 

Xiao Chen commented on HDFS-13095:
--

I think it's also arguable whether snapshots should be supported for sps. 
What's the use case where we need to support snapshots in sps?

The major considerations for re-encryption were that snapshots are supposed to 
be immutable (at least we should not further complicate the semantics), and 
simplicity of code / test / support.

> Improve slice tree traversal implementation
> ---
>
> Key: HDFS-13095
> URL: https://issues.apache.org/jira/browse/HDFS-13095
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Major
>
> This task is to refine the existing slice tree traversal logic in 
> [ReencryptionHandler|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ReencryptionHandler.java#L74]
>  class.
> Please refer Daryn's review comments
> {quote}*FSTreeTraverser*
>  I need to study this more but I have grave concerns this will work correctly 
> in a mutating namesystem.  Ex. renames and deletes esp. in combination with 
> snapshots. Looks like there's a chance it will go off in the weeds when 
> backtracking out of a renamed directory.
> traverseDir may NPE if it's traversing a tree in a snapshot and one of the 
> ancestors is deleted.
> Not sure why it's bothering to re-check permissions during the crawl.  The 
> storage policy is inherited by the entire tree, regardless of whether the 
> sub-contents are accessible.  The effect of this patch is the storage policy 
> is enforced for all readable files, non-readable violate the new storage 
> policy, new non-readable will conform to the new storage policy.  Very 
> convoluted.  Since new files will conform, should just process the entire 
> tree.
> {quote}






[jira] [Updated] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2018-01-31 Thread He Xiaoqiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Xiaoqiao updated HDFS-10453:
---
Fix Version/s: 2.7.6

> ReplicationMonitor thread could stuck for long time due to the race between 
> replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Fix For: 2.7.6
>
> Attachments: HDFS-10453-branch-2.001.patch, 
> HDFS-10453-branch-2.003.patch, HDFS-10453-branch-2.7.004.patch, 
> HDFS-10453-branch-2.7.005.patch, HDFS-10453-branch-2.7.006.patch, 
> HDFS-10453.001.patch
>
>
> The ReplicationMonitor thread could get stuck for a long time and, with low 
> probability, lose data. Consider the typical scenario:
> (1) create and close a file with the default replication factor (3);
> (2) increase the file's replication factor to 10;
> (3) delete the file while the ReplicationMonitor is scheduling blocks that 
> belong to that file for replication.
> When the ReplicationMonitor gets stuck, the NameNode prints logs such as:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough 
> replicas: expected size is 7 but only 0 storage types can be selected 
> (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, 
> DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) All required storage types are unavailable:  
> unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This happens because two threads (#NameNodeRpcServer and #ReplicationMonitor) 
> process the same block at the same moment:
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets blocks to 
> replicate and leaves the global lock.
> (2) FSNamesystem#delete is invoked to delete the blocks, then clears the 
> references in the blocksmap, needReplications, etc. The block's NumBytes is 
> set to NO_ACK (Long.MAX_VALUE), which indicates that the block deletion does 
> not need an explicit ACK from the node.
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to 
> chooseTargets for the same blocks, and no node is selected after traversing 
> the whole cluster, because no node satisfies the goodness criteria (remaining 
> space must reach the required size of Long.MAX_VALUE).
> During stage (3) the ReplicationMonitor is stuck for a long time, especially 
> in a large cluster. invalidateBlocks and neededReplications keep growing with 
> nothing consuming them; at worst, data is lost.
> This can mostly be avoided by skipping chooseTarget for BlockCommand.NO_ACK 
> blocks and removing them from neededReplications.
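The proposed guard can be sketched as follows (a hypothetical Python 
simplification for illustration; the real logic lives in the NameNode's 
ReplicationMonitor, and the data shapes here are assumptions):

```python
NO_ACK = 2**63 - 1  # Long.MAX_VALUE: marks a block already scheduled for deletion

def compute_replication_work(blocks, choose_targets):
    """Skip blocks marked deleted (NumBytes == NO_ACK) so the monitor never
    scans the whole cluster looking for targets that cannot exist."""
    scheduled = []
    for b in blocks:
        if b["num_bytes"] == NO_ACK:
            continue  # drop from neededReplications instead of choosing targets
        scheduled.append((b["id"], choose_targets(b)))
    return scheduled

blocks = [{"id": 1, "num_bytes": 1024}, {"id": 2, "num_bytes": NO_ACK}]
work = compute_replication_work(blocks, lambda b: ["dn1", "dn2"])
print([bid for bid, _ in work])  # only block 1 is scheduled: [1]
```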






[jira] [Commented] (HDFS-12897) Path not found when we get the ec policy for a .snapshot dir

2018-01-31 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348058#comment-16348058
 ] 

Rakesh R commented on HDFS-12897:
-

Thanks [~GeLiXin], good unit test cases. +1 latest patch looks good to me.
{quote}sure, I will create a jira soon and try to fix it.
{quote}
Makes sense, good to handle separately.

> Path not found when we get the ec policy for a .snapshot dir
> 
>
> Key: HDFS-12897
> URL: https://issues.apache.org/jira/browse/HDFS-12897
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding, hdfs, snapshots
>Affects Versions: 3.0.0-alpha1, 3.1.0
>Reporter: Harshakiran Reddy
>Assignee: LiXin Ge
>Priority: Major
> Attachments: HDFS-12897.001.patch, HDFS-12897.002.patch, 
> HDFS-12897.003.patch, HDFS-12897.004.patch, HDFS-12897.005.patch
>
>
> Scenario:-
> ---
> Operation on snapshot dir.
> *EC policy*
> bin> ./hdfs ec -getPolicy -path /dir/
> RS-3-2-1024k
> bin> ./hdfs ec -getPolicy -path /dir/.snapshot/
> {{FileNotFoundException: Path not found: /dir/.snapshot}}
> bin> ./hdfs dfs -ls /dir/.snapshot/
> Found 2 items
> drwxr-xr-x   - user group  0 2017-12-05 12:27 /dir/.snapshot/s1
> drwxr-xr-x   - user group  0 2017-12-05 12:28 /dir/.snapshot/s2
> *Storagepolicies*
> bin> ./hdfs storagepolicies -getStoragePolicy -path /dir/.snapshot/
> {{The storage policy of /dir/.snapshot/ is unspecified}}
> bin> ./hdfs storagepolicies -getStoragePolicy -path /dir/
> The storage policy of /dir/:
> BlockStoragePolicy{COLD:2, storageTypes=[ARCHIVE], creationFallbacks=[], 
> replicationFallbacks=[]}
> *Which is the correct behavior ?*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13060) Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver

2018-01-31 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348057#comment-16348057
 ] 

Xiaoyu Yao commented on HDFS-13060:
---

+1 for v3 patch. I'll fix the minor checkstyle comment issue upon commit.

> Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver
> 
>
> Key: HDFS-13060
> URL: https://issues.apache.org/jira/browse/HDFS-13060
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-13060.000.patch, HDFS-13060.001.patch, 
> HDFS-13060.002.patch, HDFS-13060.003.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both client and server side. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted even if 
> dfs.encrypt.data.transfer is set to true. 
> The default trust channel resolver implementation returns false indicating 
> that the channel is not trusted, which always enables encryption. HDFS-5910 
> also added a build-int whitelist based trust channel resolver. It allows you 
> to put IP address/Network Mask of trusted client/server in whitelist files to 
> skip encryption for certain traffics. 
> This ticket is opened to add a blacklist based trust channel resolver for 
> cases only certain machines (IPs) are untrusted without adding each trusted 
> IP individually.
>   






[jira] [Commented] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2018-01-31 Thread He Xiaoqiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348056#comment-16348056
 ] 

He Xiaoqiao commented on HDFS-10453:


[~ajayydv] Thank you for your suggestion. I just attached a new patch 
[#HDFS-10453-branch-2.7.006.patch] for branch-2.7 that first checks whether the 
{{block}} is abandoned or reopened for append, so it can avoid an endless loop 
of failing to choose targets for deleted blocks. FYI.
Please correct me if I am wrong. Thanks again.

> ReplicationMonitor thread could stuck for long time due to the race between 
> replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-10453-branch-2.001.patch, 
> HDFS-10453-branch-2.003.patch, HDFS-10453-branch-2.7.004.patch, 
> HDFS-10453-branch-2.7.005.patch, HDFS-10453-branch-2.7.006.patch, 
> HDFS-10453.001.patch
>
>
> The ReplicationMonitor thread could get stuck for a long time and, with low 
> probability, lose data. Consider the typical scenario:
> (1) create and close a file with the default replication factor (3);
> (2) increase the file's replication factor to 10;
> (3) delete the file while the ReplicationMonitor is scheduling blocks that 
> belong to that file for replication.
> When the ReplicationMonitor gets stuck, the NameNode prints logs such as:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough 
> replicas: expected size is 7 but only 0 storage types can be selected 
> (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, 
> DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) All required storage types are unavailable:  
> unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This is because two threads (#NameNodeRpcServer and #ReplicationMonitor) 
> process the same block at the same moment.
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets blocks to 
> replicate and leaves the global lock.
> (2) FSNamesystem#delete is invoked to delete the blocks and clears the 
> references in the blocksmap, neededReplications, etc. The block's numBytes is 
> set to NO_ACK (Long.MAX_VALUE), which indicates that the block deletion does 
> not need an explicit ACK from the node.
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to 
> chooseTargets for the same blocks, and no node is selected even after 
> traversing the whole cluster, because no node can satisfy the goodness 
> criteria (remaining space must reach the required size Long.MAX_VALUE).
> During stage (3) the ReplicationMonitor is stuck for a long time, especially 
> in a large cluster. invalidateBlocks and neededReplications keep growing 
> without being consumed; in the worst case this loses data.
> This can mostly be avoided by skipping chooseTarget for BlockCommand.NO_ACK 
> blocks and removing them from neededReplications.
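The proposed guard can be sketched as follows. This is illustrative Java only, not the actual Hadoop patch: the class, the `Block` holder, and `filterDeleted` are invented names; only the NO_ACK constant mirrors the real BlockCommand.NO_ACK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Minimal sketch, not the Hadoop source: blocks whose numBytes was set to
// NO_ACK (Long.MAX_VALUE) by a concurrent delete are dropped before
// chooseTarget runs, so the monitor never scans the cluster for them.
public class NoAckFilterSketch {
    static final long NO_ACK = Long.MAX_VALUE; // mirrors BlockCommand.NO_ACK

    static class Block {
        final long numBytes;
        Block(long numBytes) { this.numBytes = numBytes; }
    }

    // Returns only the blocks still worth scheduling for replication.
    static List<Block> filterDeleted(List<Block> neededReplications) {
        List<Block> toReplicate = new ArrayList<>();
        for (Block b : neededReplications) {
            if (b.numBytes == NO_ACK) {
                continue; // deleted concurrently: skip and drop it
            }
            toReplicate.add(b);
        }
        return toReplicate;
    }

    public static void main(String[] args) {
        List<Block> needed = Arrays.asList(
            new Block(134217728L),  // live 128 MB block
            new Block(NO_ACK));     // deleted while being scheduled
        System.out.println(filterDeleted(needed).size()); // prints 1
    }
}
```

The real fix additionally removes such blocks from neededReplications, which this sketch models only by omitting them from the returned list.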



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: 

[jira] [Commented] (HDFS-13068) RBF: Add router admin option to manage safe mode

2018-01-31 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348053#comment-16348053
 ] 

genericqa commented on HDFS-13068:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 36s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 58s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}116m  5s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m  7s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-13068 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908712/HDFS-13068.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux b41cacb647f2 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 0bee384 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22914/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22914/testReport/ |
| Max. process+thread count | 2853 (vs. ulimit of 5000) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 

[jira] [Updated] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2018-01-31 Thread He Xiaoqiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Xiaoqiao updated HDFS-10453:
---
Attachment: HDFS-10453-branch-2.7.006.patch







[jira] [Commented] (HDFS-13093) Quota set don't compute usage of unspecified storage policy content

2018-01-31 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348042#comment-16348042
 ] 

Xiaoyu Yao commented on HDFS-13093:
---

Thanks [~liaoyuxiangqin] for reporting this. This could relate to HDFS-8898, 
which returns a cached quota usage directly from the INode via the new 
getQuotaUsage() API, without the recursive traversal and recalculation done by 
the getContentSummary() API.

Before HDFS-8898, the "hdfs dfs -count" CLI used getContentSummary(), which is 
expensive because it always walks the whole sub-tree to recalculate quota and 
usage. This guarantees correctness regardless of the order of steps 3 and 4.

After HDFS-8898, the "hdfs dfs -count" CLI switched to the getQuotaUsage() 
API; the cached INode quota usage for a storage type will be 0 if you do not 
set the storage policy before setting the storage type quota.

This is because storage type usage is strongly tied to the storage policy. If 
no storage policy is set first, we cannot determine the quota usage for the 
different storage types, so 0 is used as the default.

I believe this is a transient inconsistency that can be fixed by one of the 
following three options.

1. This is a corner case. Document the procedure: set the storage policy 
before setting the storage type quota. Otherwise, the getQuotaUsage() API and 
the "hdfs dfs -count" CLI will return an inconsistent result. No fix needed.

2. The cached quota usage will be recalculated correctly anyway upon the next 
NN restart; no fix needed.

3. If we really want to allow setting the storage type quota before setting 
the storage policy, we can provide an option for the "hdfs dfs -count" CLI to 
use getContentSummary() to report the accurate usage.

cc: [~mingma], [~kihwal] for additional comments.
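The arithmetic behind the inconsistency can be sketched as follows. This is illustrative Java only (class and method names are invented); it merely models "remaining type quota = quota - cached per-type usage", which is what the CLI reports.

```java
// Illustrative arithmetic only, not HDFS code: when the policy is set after
// the quota, the cached DISK usage stays 0 until it is recalculated
// (e.g. on NameNode restart), so the CLI shows the full quota remaining.
public class QuotaSketch {
    static long remaining(long quota, long cachedUsage) {
        return quota - cachedUsage;
    }

    public static void main(String[] args) {
        long quota = 6L << 30;        // 6 G DISK quota from the repro steps
        long perReplica = 1L << 30;   // the 1 G file
        int replication = 3;
        // Stale cache (usage 0): the full 6 G appears to remain.
        System.out.println(remaining(quota, 0L) >> 30);
        // Recalculated usage (1 G x 3 replicas): 3 G remain, as expected.
        System.out.println(remaining(quota, perReplica * replication) >> 30);
    }
}
```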

> Quota set don't compute usage of unspecified storage policy content
> ---
>
> Key: HDFS-13093
> URL: https://issues.apache.org/jira/browse/HDFS-13093
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.0
> Environment: hdfs: hadoop-3.1.0-SNAPSHOT
> node:1 namenode, 9 datanodes
>Reporter: liaoyuxiangqin
>Priority: Major
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Test with the following steps:
>  1. hdfs dfs -mkdir /hot
>  2. hdfs dfs -put 1G.img /hot/file1
>  3. hdfs dfsadmin -setSpaceQuota 6442450944 -storageType DISK /hot
>  4. hdfs storagepolicies -setStoragePolicy -path /hot -policy HOT
>  5. hdfs dfs -count -q -h -v -t DISK /hot
> {code:java}
> SSD_QUOTA REM_SSD_QUOTA DISK_QUOTA REM_DISK_QUOTA ARCHIVE_QUOTA 
> REM_ARCHIVE_QUOTA PROVIDED_QUOTA REM_PROVIDED_QUOTA PATHNAME
>  none inf 6 G 6 G none inf none inf /hot{code}
> In step 5 I expected the remaining quota to be 3 G (quota - 1 G * 3 
> replicas), but it is actually 6 G.
>  If I swap the order of steps 3 and 4, the remaining quota equals the 
> expected 3 G.






[jira] [Commented] (HDFS-13095) Improve slice tree traversal implementation

2018-01-31 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348021#comment-16348021
 ] 

Rakesh R commented on HDFS-13095:
-

Thank you [~xiaochen] for the quick reply. I understand that {{snapshot}} 
handling is not required in the EDEK logic, so we will handle this condition 
specifically for SPS.

> Improve slice tree traversal implementation
> ---
>
> Key: HDFS-13095
> URL: https://issues.apache.org/jira/browse/HDFS-13095
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Major
>
> This task is to refine the existing slice tree traversal logic in 
> [ReencryptionHandler|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ReencryptionHandler.java#L74]
>  class.
> Please refer Daryn's review comments
> {quote}*FSTreeTraverser*
>  I need to study this more but I have grave concerns this will work correctly 
> in a mutating namesystem.  Ex. renames and deletes esp. in combination with 
> snapshots. Looks like there's a chance it will go off in the weeds when 
> backtracking out of a renamed directory.
> traverseDir may NPE if it's traversing a tree in a snapshot and one of the 
> ancestors is deleted.
> Not sure why it's bothering to re-check permissions during the crawl.  The 
> storage policy is inherited by the entire tree, regardless of whether the 
> sub-contents are accessible.  The effect of this patch is the storage policy 
> is enforced for all readable files, non-readable violate the new storage 
> policy, new non-readable will conform to the new storage policy.  Very 
> convoluted.  Since new files will conform, should just process the entire 
> tree.
> {quote}






[jira] [Updated] (HDFS-12512) RBF: Add WebHDFS

2018-01-31 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated HDFS-12512:
---
Status: Patch Available  (was: Open)

> RBF: Add WebHDFS
> 
>
> Key: HDFS-12512
> URL: https://issues.apache.org/jira/browse/HDFS-12512
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Íñigo Goiri
>Assignee: Wei Yan
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-12512.000.patch, HDFS-12512.001.patch, 
> HDFS-12512.002.patch, HDFS-12512.003.patch, HDFS-12512.004.patch
>
>
> The Router currently does not support WebHDFS. It needs to implement 
> something similar to {{NamenodeWebHdfsMethods}}.






[jira] [Updated] (HDFS-12512) RBF: Add WebHDFS

2018-01-31 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated HDFS-12512:
---
Status: Open  (was: Patch Available)







[jira] [Comment Edited] (HDFS-10285) Storage Policy Satisfier in Namenode

2018-01-31 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347472#comment-16347472
 ] 

Rakesh R edited comment on HDFS-10285 at 2/1/18 5:07 AM:
-

Thank you very much [~daryn] for your time and the useful comments/thoughts. 
My reply follows; please take a look.

+Comment-1)+
{quote}BlockManager
 Shouldn’t spsMode be volatile? Although I question why it’s here.
{quote}
[Rakesh's reply] Agreed, will do the changes.

+Comment-2)+
{quote}Adding SPS methods to this class implies an unexpected coupling of the 
SPS service to the block manager. Please move them out to prove it’s not 
tightly coupled.
{quote}
[Rakesh's reply] Agreed. We are planning to create 
{{StoragePolicySatisfyManager}} and keep all the related APIs there.

+Comment-3)+
{quote}BPServiceActor
 Is it actually sending back the moved blocks? Aren’t IBRs sufficient?

BlockStorageMovementCommand/BlocksStorageMoveAttemptFinished
 Again, not sure that a new DN command is necessary, and why does it 
specifically report back successful moves instead of relying on IBRs? I would 
actually expect the DN to be completely ignorant of a SPS move vs any other 
move.
{quote}
[Rakesh's reply] We have explored the IBR approach and the required code 
changes. If SPS relied on it, an *extra* check would be needed to know whether 
a new block arrived due to an SPS move or some other operation, and that check 
would run quite often, since other operations are far more frequent than SPS 
block moves. Currently, the DN sends back the {{blksMovementsFinished}} list 
separately, so each finished block movement can be easily and quickly 
recognized by the Satisfier on the NN side, which updates the tracking 
details. If you agree this *extra* check is not an issue, we would be happy to 
implement the IBR approach. Secondly, BlockStorageMovementCommand was added to 
carry the block vs src/target pairs needed for the move operation, and we 
tried to decouple the SPS code using this command. 

+Comment-4)+
{quote}DataNode
 Why isn’t this just a block transfer? How is transferring between DNs any 
different than across storages?
{quote}
[Rakesh's reply] I can see the Mover also uses the {{REPLACE_BLOCK}} call, and 
we just followed the same approach in SPS. Am I missing anything here?

+Comment-5)+
{quote}DatanodeDescriptor
 Why use a synchronized linked list to offer/poll instead of BlockingQueue?
{quote}
[Rakesh's reply] Agreed, will do the changes.
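For reference, a {{BlockingQueue}} such as {{LinkedBlockingQueue}} provides thread-safe offer/poll without wrapping a linked list in external synchronization. This is a generic JDK sketch, not the actual DatanodeDescriptor fields:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Generic JDK sketch: offer/poll are safe to call from multiple threads
// without an external lock, unlike a manually synchronized LinkedList.
public class QueueSketch {
    public static void main(String[] args) {
        BlockingQueue<String> pending = new LinkedBlockingQueue<>();
        pending.offer("blk_1001");           // non-blocking enqueue
        System.out.println(pending.poll());  // prints blk_1001
        System.out.println(pending.poll());  // queue empty: prints null
    }
}
```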

+Comment-6)+
{quote}DatanodeManager
 I know it’s configurable, but realistically, when would you ever want to give 
storage movement tasks equal footing with under-replication? Is there really a 
use case for not valuing durability?
{quote}
[Rakesh's reply] We don't have a particular use case. One scenario we 
considered: a user configures SSDs and they fill up quickly; in that case, 
cleaning them up could be considered high priority. If you feel this is not a 
real case, I'm OK with removing this config so that SPS always uses only the 
remaining slots.

+Comment-7)+
{quote}Adding getDatanodeStorageReport is concerning. getDatanodeListForReport 
is already a very bad method that should be avoided for anything but jmx – even 
then it’s a concern. I eliminated calls to it years ago. All it takes is a 
nscd/dns hiccup and you’re left holding the fsn lock for an excessive length of 
time. Beyond that, the response is going to be pretty large and tagging all the 
storage reports is not going to be cheap.

verifyTargetDatanodeHasSpaceForScheduling does it really need the namesystem 
lock? Can’t DatanodeDescriptor#chooseStorage4Block synchronize on its 
storageMap?

Appears to be calling getLiveDatanodeStorageReport for every file. As mentioned 
earlier, this is NOT cheap. The SPS should be able to operate on a fuzzy/cached 
state of the world. Then it gets another datanode report to determine the 
number of live nodes to decide if it should sleep before processing the next 
path. The number of nodes from the prior cached view of the world should 
suffice.
{quote}
[Rakesh's reply] Good point. Some time back Uma and I thought about the 
caching part. We depend on this API for the datanode storage types and 
remaining-space details. I think it requires two different mechanisms for 
internal and external SPS. For internal SPS, how about referring directly to 
{{DatanodeManager#datanodeMap}} for every file? For external SPS, IIUC you are 
suggesting a cache mechanism. How about getting the storage report once and 
caching it in ExternalContext? This local cache can be refreshed periodically: 
say after every 5 minutes (just an arbitrary number; if you have a period in 
mind, please suggest it), a getDatanodeStorageReport call treats the cache as 
expired and fetches fresh data, and within the 5 minutes it serves from the 
cache. Does this make sense to you?
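The periodically refreshed cache described above could be sketched like this. All names here are hypothetical (the ExternalContext integration is omitted); the point is only that the expensive report fetch runs at most once per refresh interval.

```java
import java.util.function.Supplier;

// Hypothetical sketch of the discussed cache: the expensive fetch (e.g. a
// wrapper around getDatanodeStorageReport) runs at most once per refresh
// interval; calls within the interval reuse the cached copy.
public class ExpiringReportCache<T> {
    private final Supplier<T> fetcher;      // wraps the expensive RPC
    private final long refreshIntervalMs;   // e.g. 5 minutes, as suggested
    private T cached;
    private long fetchedAtMs;
    private boolean hasValue;

    public ExpiringReportCache(Supplier<T> fetcher, long refreshIntervalMs) {
        this.fetcher = fetcher;
        this.refreshIntervalMs = refreshIntervalMs;
    }

    public synchronized T get(long nowMs) {
        if (!hasValue || nowMs - fetchedAtMs >= refreshIntervalMs) {
            cached = fetcher.get();  // expired: fetch a fresh report
            fetchedAtMs = nowMs;
            hasValue = true;
        }
        return cached;               // otherwise serve the cached view
    }
}
```

Passing the clock in (`nowMs`) keeps the sketch testable; a real version would use a monotonic timer.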

Another point we thought of is, right now for checking whether 

[jira] [Updated] (HDFS-13092) Reduce verbosity for ThrottledAsyncChecker.java:schedule

2018-01-31 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-13092:
-
Affects Version/s: 3.0.0
 Priority: Minor  (was: Major)

> Reduce verbosity for ThrottledAsyncChecker.java:schedule
> 
>
> Key: HDFS-13092
> URL: https://issues.apache.org/jira/browse/HDFS-13092
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.0.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Minor
> Fix For: 3.1.0, 3.0.1
>
> Attachments: HDFS-13092.001.patch
>
>
> ThrottledAsyncChecker.java:schedule prints a log message every time a disk 
> check is scheduled. However, if the previous check was triggered less than 
> "minMsBetweenChecks" ago, the task is not scheduled. This JIRA reduces the 
> log verbosity by printing the message only when the task will actually be 
> scheduled.
> {code}
> 2018-01-29 00:51:44,467 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/2/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,470 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/2/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,477 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/4/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,480 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/4/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,486 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/11/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,501 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/13/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,507 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/11/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,533 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/2/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,536 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/12/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,543 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/10/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,544 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/2/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,548 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/3/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,549 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/5/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,550 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/6/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,551 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/2/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,552 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/10/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,552 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/8/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,552 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/12/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,554 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/9/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,555 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/8/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,555 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/14/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,560 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/12/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,560 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - 
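A minimal sketch of the proposed behavior (illustrative only; the real ThrottledAsyncChecker tracks per-target completion times and returns futures): the "Scheduling a check" message is logged only after the throttle decides a new check will really run.

```java
// Illustrative sketch, not the Hadoop source: log "Scheduling a check"
// only after deciding the check will actually be scheduled.
public class ThrottleSketch {
    private final long minMsBetweenChecks;
    private long lastScheduledMs;
    private boolean scheduledBefore;

    public ThrottleSketch(long minMsBetweenChecks) {
        this.minMsBetweenChecks = minMsBetweenChecks;
    }

    // Returns true (and logs) only when a check is really scheduled.
    public synchronized boolean schedule(String volume, long nowMs) {
        if (scheduledBefore && nowMs - lastScheduledMs < minMsBetweenChecks) {
            return false; // throttled: previously this path still logged
        }
        lastScheduledMs = nowMs;
        scheduledBefore = true;
        System.out.println("Scheduling a check for " + volume);
        return true;
    }
}
```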

[jira] [Updated] (HDFS-13092) Reduce verbosity for ThrottledAsyncChecker.java:schedule

2018-01-31 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-13092:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.1
   3.1.0
   Status: Resolved  (was: Patch Available)

Seems this JIRA can be resolved.

> Reduce verbosity for ThrottledAsyncChecker.java:schedule
> 
>
> Key: HDFS-13092
> URL: https://issues.apache.org/jira/browse/HDFS-13092
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.0.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 3.1.0, 3.0.1
>
> Attachments: HDFS-13092.001.patch
>
>
> ThrottledAsyncChecker.java:schedule prints a log message every time a disk 
> check is scheduled. However, if the previous check was triggered less than 
> "minMsBetweenChecks" ago, the task is not scheduled. This JIRA reduces the 
> log verbosity by printing the message only when the task will actually be 
> scheduled.
> {code}
> 2018-01-29 00:51:44,467 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/2/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,470 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/2/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,477 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/4/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,480 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/4/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,486 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/11/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,501 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/13/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,507 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/11/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,533 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/2/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,536 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/12/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,543 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/10/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,544 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/2/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,548 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/3/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,549 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/5/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,550 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/6/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,551 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/2/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,552 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/10/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,552 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/8/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,552 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/12/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,554 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/9/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,555 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/8/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,555 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/14/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,560 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> 
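The intended fix can be sketched as follows (a simplified, hypothetical illustration — the class and method names here are invented, not the actual ThrottledAsyncChecker code): a per-volume check is scheduled only when at least minMsBetweenChecks have elapsed since the previous one, and the "Scheduling a check" message is emitted only on that path, so throttled requests stay silent.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Simplified, hypothetical sketch: a check for a volume is scheduled only if
 * minMsBetweenChecks have elapsed since the last one, and the "Scheduling a
 * check" message is printed only when a check is actually scheduled.
 */
public class ThrottledCheckSketch {
    private final long minMsBetweenChecks;
    private final Map<String, Long> lastScheduledMs = new HashMap<>();

    public ThrottledCheckSketch(long minMsBetweenChecks) {
        this.minMsBetweenChecks = minMsBetweenChecks;
    }

    /** Returns true only when the check is really scheduled (and logged). */
    public synchronized boolean maybeSchedule(String volume, long nowMs) {
        Long last = lastScheduledMs.get(volume);
        if (last != null && nowMs - last < minMsBetweenChecks) {
            return false; // throttled: no log line emitted
        }
        lastScheduledMs.put(volume, nowMs);
        System.out.println("Scheduling a check for " + volume);
        return true;
    }
}
```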

[jira] [Commented] (HDFS-7134) Replication count for a block should not update till the blocks have settled on Datanodes

2018-01-31 Thread liaoyuxiangqin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347959#comment-16347959
 ] 

liaoyuxiangqin commented on HDFS-7134:
--

[~gurmukhd] I have tested this on Hadoop 3.1.0; the issue no longer appears.

> Replication count for a block should not update till the blocks have settled 
> on Datanodes
> -
>
> Key: HDFS-7134
> URL: https://issues.apache.org/jira/browse/HDFS-7134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Affects Versions: 1.2.1, 2.6.0, 2.7.3
> Environment: Linux nn1.cluster1.com 2.6.32-431.20.3.el6.x86_64 #1 SMP 
> Thu Jun 19 21:14:45 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
> [hadoop@nn1 conf]$ cat /etc/redhat-release
> CentOS release 6.5 (Final)
>Reporter: gurmukh singh
>Priority: Critical
>  Labels: HDFS
>
> The replica count for a block should not change till the blocks have settled
> on the datanodes.
> Test Case:
> Hadoop Cluster with 1 namenode and 3 datanodes.
> nn1.cluster1.com(192.168.1.70)
> dn1.cluster1.com(192.168.1.72)
> dn2.cluster1.com(192.168.1.73)
> dn3.cluster1.com(192.168.1.74)
> Cluster is up and running fine with replication set to "1" for the parameter
> "dfs.replication" on all nodes:
> <property>
>   <name>dfs.replication</name>
>   <value>1</value>
> </property>
> To reduce the wait time, I have reduced the dfs.heartbeat and recheck
> parameters.
> On datanode2 (192.168.1.73):
> [hadoop@dn2 ~]$ hadoop fs -Ddfs.replication=2 -put from_dn2 /
> [hadoop@dn2 ~]$ hadoop fs -ls /from_dn2
> Found 1 items
> -rw-r--r--   2 hadoop supergroup 17 2014-09-23 13:33 /from_dn2
> On Namenode
> ===
> As expected, since the copy was done from datanode2, one copy went locally.
> [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations
> FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 
> 13:53:16 IST 2014
> /from_dn2 17 bytes, 1 block(s):  OK
> 0. blk_8132629811771280764_1175 len=17 repl=2 [192.168.1.74:50010, 
> 192.168.1.73:50010]
> We can see the blocks on the datanodes' disks as well, under the "current"
> directory.
> Now, shut down datanode2 (192.168.1.73); as expected, the block moves to
> another datanode to maintain a replication of 2.
> [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations
> FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 
> 13:54:21 IST 2014
> /from_dn2 17 bytes, 1 block(s):  OK
> 0. blk_8132629811771280764_1175 len=17 repl=2 [192.168.1.74:50010, 
> 192.168.1.72:50010]
> But now if I bring back datanode2, the namenode sees that this block is in 3
> places and fires an invalidate command for datanode1 (192.168.1.72), yet the
> replication count on the namenode is bumped to 3 immediately.
> [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations
> FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 
> 13:56:12 IST 2014
> /from_dn2 17 bytes, 1 block(s):  OK
> 0. blk_8132629811771280764_1175 len=17 repl=3 [192.168.1.74:50010, 
> 192.168.1.72:50010, 192.168.1.73:50010]
> on Datanode1 - The invalidate command has been fired immediately and the 
> block deleted.
> =
> 2014-09-23 13:54:17,483 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Receiving blk_8132629811771280764_1175 src: /192.168.1.74:38099 dest: 
> /192.168.1.72:50010
> 2014-09-23 13:54:17,502 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Received blk_8132629811771280764_1175 src: /192.168.1.74:38099 dest: 
> /192.168.1.72:50010 size 17
> 2014-09-23 13:55:28,720 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Scheduling blk_8132629811771280764_1175 file 
> /space/disk1/current/blk_8132629811771280764 for deletion
> 2014-09-23 13:55:28,721 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Deleted blk_8132629811771280764_1175 at file 
> /space/disk1/current/blk_8132629811771280764
> The namenode still shows 3 replicas, even though one has been deleted, even
> after more than 30 minutes.
> [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations
> FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 
> 14:21:27 IST 2014
> /from_dn2 17 bytes, 1 block(s):  OK
> 0. blk_8132629811771280764_1175 len=17 repl=3 [192.168.1.74:50010, 
> 192.168.1.72:50010, 192.168.1.73:50010]
> This could be dangerous if someone removes the block or the other 2
> datanodes fail.
> On Datanode 1
> =
> Before, the datanode1 is brought back
> [hadoop@dn1 conf]$ ls -l /space/disk*/current
> /space/disk1/current:
> total 28
> -rw-rw-r-- 1 hadoop hadoop   13 Sep 21 09:09 blk_2278001646987517832
> -rw-rw-r-- 1 hadoop hadoop   11 Sep 21 09:09 blk_2278001646987517832_1171.meta
> -rw-rw-r-- 1 hadoop hadoop   17 Sep 23 13:54 blk_8132629811771280764
> -rw-rw-r-- 1 

[jira] [Commented] (HDFS-13043) RBF: Expose the state of the Routers in the federation

2018-01-31 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347949#comment-16347949
 ] 

Yiqun Lin commented on HDFS-13043:
--

Failed unit tests are not related. LGTM, +1.
[~elgoiri], now that the work for tracking Router state is all done in trunk,
what's the next plan for RBF phase 2? Subcluster rebalancer or something else?

> RBF: Expose the state of the Routers in the federation
> --
>
> Key: HDFS-13043
> URL: https://issues.apache.org/jira/browse/HDFS-13043
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-13043.000.patch, HDFS-13043.001.patch, 
> HDFS-13043.002.patch, HDFS-13043.003.patch, HDFS-13043.004.patch, 
> HDFS-13043.005.patch, HDFS-13043.006.patch, HDFS-13043.007.patch, 
> HDFS-13043.008.patch, HDFS-13043.009.patch, router-info.png
>
>
> The Router should expose the state of the other Routers in the federation 
> through a user UI.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13068) RBF: Add router admin option to manage safe mode

2018-01-31 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-13068:
-
Attachment: HDFS-13068.003.patch

> RBF: Add router admin option to manage safe mode
> 
>
> Key: HDFS-13068
> URL: https://issues.apache.org/jira/browse/HDFS-13068
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Yiqun Lin
>Priority: Major
> Attachments: HDFS-13068.001.patch, HDFS-13068.002.patch, 
> HDFS-13068.003.patch
>
>
> HDFS-13044 adds a safe mode to reject requests. We should have an option to 
> manually set the Router into safe mode.
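The shape of the feature can be sketched as follows (hypothetical, simplified names — not the patch code): the Router keeps a safe-mode flag that every client-facing operation checks, and the manual admin option simply flips that flag.

```java
/**
 * Hypothetical, simplified sketch: a Router-side safe-mode flag that rejects
 * client operations while set, toggled manually by an admin command.
 */
public class RouterSafeModeSketch {
    private volatile boolean safeMode = false;

    /** Manual toggle, e.g. driven by an admin "-safemode enter|leave" option. */
    public void setSafeMode(boolean on) {
        safeMode = on;
    }

    public boolean isInSafeMode() {
        return safeMode;
    }

    /** Every client-facing operation checks the flag before proceeding. */
    public String invoke(String op) {
        if (safeMode) {
            throw new IllegalStateException(
                "Router is in safe mode and rejects " + op);
        }
        return "ok: " + op;
    }
}
```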






[jira] [Commented] (HDFS-13068) RBF: Add router admin option to manage safe mode

2018-01-31 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347945#comment-16347945
 ] 

Yiqun Lin commented on HDFS-13068:
--

Thanks for the review, [~elgoiri].
Attaching the updated patch to address the comments.

> RBF: Add router admin option to manage safe mode
> 
>
> Key: HDFS-13068
> URL: https://issues.apache.org/jira/browse/HDFS-13068
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Yiqun Lin
>Priority: Major
> Attachments: HDFS-13068.001.patch, HDFS-13068.002.patch, 
> HDFS-13068.003.patch
>
>
> HDFS-13044 adds a safe mode to reject requests. We should have an option to 
> manually set the Router into safe mode.






[jira] [Commented] (HDFS-13060) Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver

2018-01-31 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347931#comment-16347931
 ] 

genericqa commented on HDFS-13060:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 56s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 13m 
41s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 43s{color} | {color:orange} root: The patch generated 1 new + 0 unchanged - 
0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 10s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  9m  9s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}119m 33s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  2m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}221m 24s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.security.TestRaceWhenRelogin |
|   | hadoop.hdfs.server.namenode.ha.TestInitializeSharedEdits |
|   | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-13060 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908673/HDFS-13060.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux b5b3365fff81 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HDFS-12997) Move logging to slf4j in BlockPoolSliceStorage and Storage

2018-01-31 Thread Akira Ajisaka (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347903#comment-16347903
 ] 

Akira Ajisaka commented on HDFS-12997:
--

+1, thanks Ajay.

> Move logging to slf4j in BlockPoolSliceStorage and Storage 
> ---
>
> Key: HDFS-12997
> URL: https://issues.apache.org/jira/browse/HDFS-12997
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-12997.001.patch, HDFS-12997.002.patch, 
> HDFS-12997.003.patch, HDFS-12997.004.patch, HDFS-12997.005.patch, 
> HDFS-12997.006.patch, HDFS-12997.007.patch
>
>
> Move logging to slf4j in BlockPoolSliceStorage and Storage classes.






[jira] [Updated] (HDFS-13056) Expose file-level composite CRCs in HDFS which are comparable across different instances/layouts

2018-01-31 Thread Dennis Huo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Huo updated HDFS-13056:
--
Attachment: hdfs-file-composite-crc32-v3.pdf

> Expose file-level composite CRCs in HDFS which are comparable across 
> different instances/layouts
> 
>
> Key: HDFS-13056
> URL: https://issues.apache.org/jira/browse/HDFS-13056
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, distcp, erasure-coding, federation, hdfs
>Affects Versions: 3.0.0
>Reporter: Dennis Huo
>Priority: Major
> Attachments: HDFS-13056-branch-2.8.001.patch, 
> HDFS-13056-branch-2.8.poc1.patch, HDFS-13056.001.patch, 
> Reference_only_zhen_PPOC_hadoop2.6.X.diff, hdfs-file-composite-crc32-v1.pdf, 
> hdfs-file-composite-crc32-v2.pdf, hdfs-file-composite-crc32-v3.pdf
>
>
> FileChecksum was first introduced in 
> [https://issues-test.apache.org/jira/browse/HADOOP-3981] and ever since then 
> has remained defined as MD5-of-MD5-of-CRC, where per-512-byte chunk CRCs are 
> already stored as part of datanode metadata, and the MD5 approach is used to 
> compute an aggregate value in a distributed manner, with individual datanodes 
> computing the MD5-of-CRCs per-block in parallel, and the HDFS client 
> computing the second-level MD5.
>  
> A shortcoming of this approach which is often brought up is the fact that 
> this FileChecksum is sensitive to the internal block-size and chunk-size 
> configuration, and thus different HDFS files with different block/chunk 
> settings cannot be compared. More commonly, one might have different HDFS 
> clusters which use different block sizes, in which case any data migration 
> won't be able to use the FileChecksum for distcp's rsync functionality or for 
> verifying end-to-end data integrity (on top of low-level data integrity 
> checks applied at data transfer time).
>  
> This was also revisited in https://issues.apache.org/jira/browse/HDFS-8430 
> during the addition of checksum support for striped erasure-coded files; 
> while there was some discussion of using CRC composability, it still 
> ultimately settled on the hierarchical MD5 approach, which also adds the problem 
> that checksums of basic replicated files are not comparable to striped files.
>  
> This feature proposes to add a "COMPOSITE-CRC" FileChecksum type which uses 
> CRC composition to remain completely chunk/block agnostic, and allows 
> comparison between striped vs replicated files, between different HDFS 
> instances, and possibly even between HDFS and other external storage systems. 
> This feature can also be added in-place to be compatible with existing block 
> metadata, and doesn't need to change the normal path of chunk verification, 
> so is minimally invasive. This also means even large preexisting HDFS 
> deployments could adopt this feature to retroactively sync data. A detailed 
> design document can be found here: 
> https://storage.googleapis.com/dennishuo/hdfs-file-composite-crc32-v1.pdf
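The chunk/block agnosticism of a composite CRC rests on the fact that the CRC of a concatenation A+B can be computed from crc(A), crc(B), and len(B) alone. A minimal Java sketch of that combine step — a port of the well-known zlib-style crc32_combine, shown only to illustrate the principle, not the code proposed in this patch:

```java
/**
 * Sketch of CRC-32 composition (zlib-style crc32_combine), illustrating why a
 * composite CRC is independent of chunk/block boundaries. Not the patch code.
 */
public class Crc32CombineSketch {
    // Multiply a 32x32 GF(2) matrix by a 32-bit vector.
    private static long times(long[] mat, long vec) {
        long sum = 0;
        for (int i = 0; vec != 0; vec >>>= 1, i++) {
            if ((vec & 1) != 0) sum ^= mat[i];
        }
        return sum;
    }

    // square[] = mat[] * mat[] in GF(2).
    private static void square(long[] square, long[] mat) {
        for (int n = 0; n < 32; n++) square[n] = times(mat, mat[n]);
    }

    /** Combine finalized CRCs: crc(A+B) from crc(A), crc(B) and len(B) bytes. */
    public static long combine(long crc1, long crc2, long len2) {
        if (len2 <= 0) return crc1; // degenerate case
        long[] even = new long[32];
        long[] odd = new long[32];
        odd[0] = 0xedb88320L;       // reflected CRC-32 polynomial
        long row = 1;
        for (int n = 1; n < 32; n++) { odd[n] = row; row <<= 1; }
        square(even, odd);          // operator for 2 zero bits
        square(odd, even);          // operator for 4 zero bits
        do {                        // apply len2 zero bytes to crc1
            square(even, odd);      // first pass: operator for one zero byte
            if ((len2 & 1) != 0) crc1 = times(even, crc1);
            len2 >>= 1;
            if (len2 == 0) break;
            square(odd, even);
            if ((len2 & 1) != 0) crc1 = times(odd, crc1);
            len2 >>= 1;
        } while (len2 != 0);
        return (crc1 ^ crc2) & 0xffffffffL;
    }
}
```

Combining the per-part CRCs this way gives the same value regardless of how the stream was split, which is exactly the property a boundary-agnostic composite file checksum needs.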






[jira] [Commented] (HDFS-13056) Expose file-level composite CRCs in HDFS which are comparable across different instances/layouts

2018-01-31 Thread Dennis Huo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347851#comment-16347851
 ] 

Dennis Huo commented on HDFS-13056:
---

Uploaded an initial end-to-end working draft against trunk which supports
CRC32/CRC32C, partial file prefixes, arbitrary bytes-per-crc or blocksize, and
replicated vs striped encodings.

Still a TODO to support the striped-reconstruction path, and adding stripe
support made everything a lot messier, so some refactoring is in order. Unit
tests are also still pending, but manual testing in a real setup works:

 

 
{code:java}
$ hdfs dfs -cp gs://hadoop-cloud-dev-dhuo/random-crctest.dat 
hdfs:///tmp/random-crctest-default1.dat
$ hdfs dfs -cp gs://hadoop-cloud-dev-dhuo/random-crctest.dat 
hdfs:///tmp/random-crctest-default2.dat
$ hdfs dfs -Ddfs.bytes-per-checksum=1024 -cp 
gs://hadoop-cloud-dev-dhuo/random-crctest.dat 
hdfs:///tmp/random-crctest-bpc1024.dat
$ hdfs dfs -Ddfs.blocksize=67108864 -cp 
gs://hadoop-cloud-dev-dhuo/random-crctest.dat 
hdfs:///tmp/random-crctest-blocksize64mb.dat
$ hdfs dfs -cp gs://hadoop-cloud-dev-dhuo/random-crctest-unaligned.dat 
hdfs:///tmp/random-crctest-unaligned1.dat
$ hdfs dfs -Ddfs.bytes-per-checksum=1024 -cp 
gs://hadoop-cloud-dev-dhuo/random-crctest-unaligned.dat 
hdfs:///tmp/random-crctest-unaligned2.dat
$ hdfs dfs -Ddfs.checksum.type=CRC32 -cp 
gs://hadoop-cloud-dev-dhuo/random-crctest.dat 
hdfs:///tmp/random-crctest-gzipcrc32-1.dat
$ hdfs dfs -Ddfs.checksum.type=CRC32 -Ddfs.bytes-per-checksum=1024 -cp 
gs://hadoop-cloud-dev-dhuo/random-crctest.dat 
hdfs:///tmp/random-crctest-gzipcrc32-2.dat


$ hdfs dfs -mkdir hdfs:///tmpec
$ hdfs ec -enablePolicy -policy XOR-2-1-1024k
$ hdfs ec -setPolicy -path hdfs:///tmpec -policy XOR-2-1-1024k


$ hdfs dfs -cp gs://hadoop-cloud-dev-dhuo/random-crctest.dat 
hdfs:///tmpec/random-crctest-default1.dat
$ hdfs dfs -cp gs://hadoop-cloud-dev-dhuo/random-crctest.dat 
hdfs:///tmpec/random-crctest-default2.dat
$ hdfs dfs -Ddfs.bytes-per-checksum=1024 -cp 
gs://hadoop-cloud-dev-dhuo/random-crctest.dat 
hdfs:///tmpec/random-crctest-bpc1024.dat
$ hdfs dfs -Ddfs.blocksize=67108864 -cp 
gs://hadoop-cloud-dev-dhuo/random-crctest.dat 
hdfs:///tmpec/random-crctest-blocksize64mb.dat
$ hdfs dfs -cp gs://hadoop-cloud-dev-dhuo/random-crctest-unaligned.dat 
hdfs:///tmpec/random-crctest-unaligned1.dat
$ hdfs dfs -Ddfs.bytes-per-checksum=1024 -cp 
gs://hadoop-cloud-dev-dhuo/random-crctest-unaligned.dat 
hdfs:///tmpec/random-crctest-unaligned2.dat
$ hdfs dfs -Ddfs.checksum.type=CRC32 -cp 
gs://hadoop-cloud-dev-dhuo/random-crctest.dat 
hdfs:///tmpec/random-crctest-gzipcrc32-1.dat
$ hdfs dfs -Ddfs.checksum.type=CRC32 -Ddfs.bytes-per-checksum=1024 -cp 
gs://hadoop-cloud-dev-dhuo/random-crctest.dat 
hdfs:///tmpec/random-crctest-gzipcrc32-2.dat

$ hdfs dfs -checksum hdfs:///tmp/random-crctest*.dat
hdfs:///tmp/random-crctest-blocksize64mb.datMD5-of-131072MD5-of-512CRC32C   
02028baa940ef6ed21fb4bd6224ce917d127
hdfs:///tmp/random-crctest-bpc1024.dat  MD5-of-131072MD5-of-1024CRC32C  
0402930b0d7ad333786a839b044ed8d18d2d
hdfs:///tmp/random-crctest-default1.dat MD5-of-262144MD5-of-512CRC32C   
0204c0baeeacbc4b5a3c8af5152944fe2d79
hdfs:///tmp/random-crctest-default2.dat MD5-of-262144MD5-of-512CRC32C   
0204c0baeeacbc4b5a3c8af5152944fe2d79
hdfs:///tmp/random-crctest-gzipcrc32-1.dat  MD5-of-262144MD5-of-512CRC32
020449d52fdd25aa08559e20536acc34d51d
hdfs:///tmp/random-crctest-gzipcrc32-2.dat  MD5-of-131072MD5-of-1024CRC32   
04021d5468ea4093ddb3741790b8dc3b9a57
hdfs:///tmp/random-crctest-unaligned1.dat   MD5-of-262144MD5-of-512CRC32C   
02040da665dadca0df00456206f234d5f8b0
hdfs:///tmp/random-crctest-unaligned2.dat   MD5-of-131072MD5-of-1024CRC32C  
040227c2198f48224a0ddb92c4dc4addd28b

$ hdfs dfs -checksum hdfs:///tmpec/random-crctest*.dat
18/02/01 01:15:54 INFO gcs.GoogleHadoopFileSystemBase: GHFS version: 
1.6.2-hadoop2
hdfs:///tmpec/random-crctest-blocksize64mb.dat  MD5-of-131072MD5-of-512CRC32C   
02025b54faaa368ed81b25984a746c767d39
hdfs:///tmpec/random-crctest-bpc1024.datMD5-of-131072MD5-of-1024CRC32C  
040289a128b1e1995256bdb34fb95720dafc
hdfs:///tmpec/random-crctest-default1.dat   MD5-of-262144MD5-of-512CRC32C   
020407ee18e8f4909647adf085ec0f464d1a
hdfs:///tmpec/random-crctest-default2.dat   MD5-of-262144MD5-of-512CRC32C   
020407ee18e8f4909647adf085ec0f464d1a
hdfs:///tmpec/random-crctest-gzipcrc32-1.datMD5-of-262144MD5-of-512CRC32
0204d79ad1fa00fad2f0adb18f49f2e90bb3
hdfs:///tmpec/random-crctest-gzipcrc32-2.datMD5-of-131072MD5-of-1024CRC32   
0402126ac7bc467c59942734bd8ebf690440

[jira] [Updated] (HDFS-13056) Expose file-level composite CRCs in HDFS which are comparable across different instances/layouts

2018-01-31 Thread Dennis Huo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Huo updated HDFS-13056:
--
Attachment: HDFS-13056.001.patch

> Expose file-level composite CRCs in HDFS which are comparable across 
> different instances/layouts
> 
>
> Key: HDFS-13056
> URL: https://issues.apache.org/jira/browse/HDFS-13056
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, distcp, erasure-coding, federation, hdfs
>Affects Versions: 3.0.0
>Reporter: Dennis Huo
>Priority: Major
> Attachments: HDFS-13056-branch-2.8.001.patch, 
> HDFS-13056-branch-2.8.poc1.patch, HDFS-13056.001.patch, 
> Reference_only_zhen_PPOC_hadoop2.6.X.diff, hdfs-file-composite-crc32-v1.pdf, 
> hdfs-file-composite-crc32-v2.pdf
>
>
> FileChecksum was first introduced in 
> [https://issues-test.apache.org/jira/browse/HADOOP-3981] and ever since then 
> has remained defined as MD5-of-MD5-of-CRC, where per-512-byte chunk CRCs are 
> already stored as part of datanode metadata, and the MD5 approach is used to 
> compute an aggregate value in a distributed manner, with individual datanodes 
> computing the MD5-of-CRCs per-block in parallel, and the HDFS client 
> computing the second-level MD5.
>  
> A shortcoming of this approach which is often brought up is the fact that 
> this FileChecksum is sensitive to the internal block-size and chunk-size 
> configuration, and thus different HDFS files with different block/chunk 
> settings cannot be compared. More commonly, one might have different HDFS 
> clusters which use different block sizes, in which case any data migration 
> won't be able to use the FileChecksum for distcp's rsync functionality or for 
> verifying end-to-end data integrity (on top of low-level data integrity 
> checks applied at data transfer time).
>  
> This was also revisited in https://issues.apache.org/jira/browse/HDFS-8430 
> during the addition of checksum support for striped erasure-coded files; 
> while there was some discussion of using CRC composability, it still 
> ultimately settled on the hierarchical MD5 approach, which also adds the problem 
> that checksums of basic replicated files are not comparable to striped files.
>  
> This feature proposes to add a "COMPOSITE-CRC" FileChecksum type which uses 
> CRC composition to remain completely chunk/block agnostic, and allows 
> comparison between striped vs replicated files, between different HDFS 
> instances, and possibly even between HDFS and other external storage systems. 
> This feature can also be added in-place to be compatible with existing block 
> metadata, and doesn't need to change the normal path of chunk verification, 
> so is minimally invasive. This also means even large preexisting HDFS 
> deployments could adopt this feature to retroactively sync data. A detailed 
> design document can be found here: 
> https://storage.googleapis.com/dennishuo/hdfs-file-composite-crc32-v1.pdf






[jira] [Commented] (HDFS-13062) Provide support for JN to use separate journal disk per namespace

2018-01-31 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347786#comment-16347786
 ] 

genericqa commented on HDFS-13062:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 44s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 17s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}129m 25s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}174m 13s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-13062 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908648/HDFS-13062.06.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 3d0f1b19df7f 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 
19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3ce2190 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22911/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22911/testReport/ |
| Max. process+thread count | 3986 (vs. ulimit of 5000) |
| modules | C: 

[jira] [Commented] (HDFS-13098) RBF: Datanodes interacting with Routers

2018-01-31 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-13098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347783#comment-16347783
 ] 

Íñigo Goiri commented on HDFS-13098:


Currently, we do the assignment of DNs to subclusters using external tools that 
generate {{hdfs-site.xml}}. These tools could be moved into the RBF 
infrastructure.

I had some initial conversation about this topic with [~curino].
One of his concerns was to avoid passing every single heartbeat through the 
Routers.
To solve this, we could have the DNs register only the first time through the 
Router and afterwards switch to heartbeating directly to the actual Namenodes.

I think this could also apply to YARN federation and we could share some 
infrastructure; [~subru], [~giovanni.fumarola], any thoughts here?

This is still initial brainstorming with not much design done yet, so feedback 
is welcome.
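The register-once-then-heartbeat-directly flow described above could look roughly like the following. This is a minimal standalone sketch, not actual RBF code: {{RouterDatanodeRegistration}}, {{chooseSubcluster}}, and the hash-based assignment policy are all hypothetical names and placeholder logic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: the Router handles only the first DN registration,
// assigns the DN to a subcluster, and hands back the Namenode addresses the
// DN should heartbeat to directly from then on.
public class RouterDatanodeRegistration {
  // Subcluster id -> Namenode RPC addresses (placeholder for the state store).
  private final Map<String, List<String>> subclusterNamenodes =
      new ConcurrentHashMap<>();

  public RouterDatanodeRegistration(Map<String, List<String>> view) {
    subclusterNamenodes.putAll(view);
  }

  // Placeholder assignment policy: stable hash of the DN id over the sorted
  // subcluster ids. A real policy could consider capacity, locality, etc.
  String chooseSubcluster(String datanodeId) {
    List<String> ids = new ArrayList<>(subclusterNamenodes.keySet());
    Collections.sort(ids);
    return ids.get(Math.floorMod(datanodeId.hashCode(), ids.size()));
  }

  // First registration goes through the Router; the returned Namenode list
  // lets the DN switch to heartbeating into the Namenodes directly, so
  // steady-state heartbeats never pass through the Router.
  public List<String> register(String datanodeId) {
    return subclusterNamenodes.get(chooseSubcluster(datanodeId));
  }
}
```

A DN would call {{register}} once at startup and cache the returned addresses; only a re-registration (e.g. after a policy change) would touch the Router again.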


> RBF: Datanodes interacting with Routers
> ---
>
> Key: HDFS-13098
> URL: https://issues.apache.org/jira/browse/HDFS-13098
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Priority: Major
>
> Datanodes talk to particular Namenodes. We could use the Router 
> infrastructure to have the Datanodes register with and heartbeat to the 
> Routers, which would forward these messages to particular Namenodes. This 
> would make the assignment of Datanodes to subclusters potentially more 
> dynamic.
> The implementation would potentially include:
> * Router to implement part of DatanodeProtocol
> * Forwarding DN messages into Routers
> * Policies to assign datanodes to subclusters
> * Datanodes to make blockpool configuration dynamic



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-13098) RBF: Datanodes interacting with Routers

2018-01-31 Thread JIRA
Íñigo Goiri created HDFS-13098:
--

 Summary: RBF: Datanodes interacting with Routers
 Key: HDFS-13098
 URL: https://issues.apache.org/jira/browse/HDFS-13098
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Íñigo Goiri


Datanodes talk to particular Namenodes. We could use the Router infrastructure 
to have the Datanodes register with and heartbeat to the Routers, which would 
forward these messages to particular Namenodes. This would make the assignment 
of Datanodes to subclusters potentially more dynamic.

The implementation would potentially include:
* Router to implement part of DatanodeProtocol
* Forwarding DN messages into Routers
* Policies to assign datanodes to subclusters
* Datanodes to make blockpool configuration dynamic



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13097) [SPS]: Fix the branch review comments(Part1)

2018-01-31 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347764#comment-16347764
 ] 

genericqa commented on HDFS-13097:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-10285 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
28s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 25s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
48s{color} | {color:green} HDFS-10285 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} HDFS-10285 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 42s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 14 new + 1057 unchanged - 0 fixed = 1071 total (was 1057) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  1s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
52s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 2 new + 0 
unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}103m 27s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}154m 57s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Synchronization performed on java.util.concurrent.BlockingQueue in 
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor.addBlocksToMoveStorage(BlockStorageMovementCommand$BlockMovingInfo)
  At 
DatanodeDescriptor.java:org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor.addBlocksToMoveStorage(BlockStorageMovementCommand$BlockMovingInfo)
  At DatanodeDescriptor.java:[line 1087] |
|  |  Synchronization performed on java.util.concurrent.BlockingQueue in 
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor.getBlocksToMoveStorages(int)
  At 
DatanodeDescriptor.java:org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor.getBlocksToMoveStorages(int)
  At DatanodeDescriptor.java:[line 1109] |
| Failed junit tests | hadoop.hdfs.TestDistributedFileSystemWithECFile |
|   | hadoop.hdfs.server.namenode.TestQuotaByStorageType |
|   | hadoop.hdfs.TestReadWhileWriting |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
|   | 
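The two FindBugs warnings above flag explicit synchronization on a {{java.util.concurrent.BlockingQueue}}. The pattern is worth illustrating in isolation: a {{BlockingQueue}} is already thread-safe, and a {{synchronized}} block on the queue object does not interact with the queue's internal lock, so it adds cost and confusion without adding safety. The class below is a hypothetical standalone demo, not the patched Hadoop code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueueSyncDemo {
  private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(16);

  // Anti-pattern FindBugs flags: synchronizing on the queue object does not
  // interact with the queue's own internal lock, so it adds no safety for
  // single operations and misleads readers into thinking it does.
  public void addFlagged(String item) {
    synchronized (queue) {
      queue.offer(item);
    }
  }

  // Fix: rely on the queue's built-in thread safety for single operations.
  public void addClean(String item) {
    queue.offer(item);
  }

  public String poll() {
    return queue.poll();
  }
}
```

Both methods behave the same here; the point is that the explicit monitor in {{addFlagged}} is redundant, which is exactly what the warning reports.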

[jira] [Comment Edited] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2018-01-31 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347753#comment-16347753
 ] 

Ajay Kumar edited comment on HDFS-10453 at 1/31/18 11:12 PM:
-

Hi [~hexiaoqiao], thanks for working on this. The patch looks good to me. One 
minor suggestion: I think we can simplify the patch a bit by merging the new 
check {{if (rw.block.getNumBytes() == BlockCommand.NO_ACK)}} with {{if (bc == 
null || (bc.isUnderConstruction() && block.equals(bc.getLastBlock())))}} inside 
{{BlockManager#computeReplicationWorkForBlocks}} at L1501.

 


was (Author: ajayydv):
Hi [~hexiaoqiao], Thanks for working on this. Patch looks good to me. One minor 
suggestion, I think we can simplify the patch a bit my merging the new check 
{{if (rw.block.getNumBytes() == BlockCommand.NO_ACK)}}{{ with \{{if(bc == null 
|| (bc.isUnderConstruction() && block.equals(bc.getLastBlock( inside 
{{computeReplicationWorkForBlocks}} L1501.

 

> ReplicationMonitor thread could stuck for long time due to the race between 
> replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-10453-branch-2.001.patch, 
> HDFS-10453-branch-2.003.patch, HDFS-10453-branch-2.7.004.patch, 
> HDFS-10453-branch-2.7.005.patch, HDFS-10453.001.patch
>
>
> The ReplicationMonitor thread can get stuck for a long time and, with low 
> probability, lose data. Consider the typical scenario:
> (1) create and close a file with the default replication factor (3);
> (2) increase the replication of the file to 10;
> (3) delete the file while the ReplicationMonitor is scheduling blocks 
> belonging to that file for replication.
> When the ReplicationMonitor gets stuck, the NameNode prints logs such as:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough 
> replicas: expected size is 7 but only 0 storage types can be selected 
> (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, 
> DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) All required storage types are unavailable:  
> unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This happens because two threads (#NameNodeRpcServer and #ReplicationMonitor) 
> process the same block at the same moment:
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets the blocks to 
> replicate and releases the global lock.
> (2) FSNamesystem#delete is invoked to delete the blocks and clears the 
> references in the blocksmap, neededReplications, etc.; the block's numBytes 
> is set to NO_ACK (Long.MAX_VALUE), which indicates that the block deletion 
> does not need an explicit ACK from the node.
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to 
> chooseTargets for the same blocks, and no node is selected even after 
> traversing the whole cluster, because no candidate satisfies the goodness 
> criteria (remaining space must reach the required size, Long.MAX_VALUE).
> During stage (3) the ReplicationMonitor is stuck

[jira] [Comment Edited] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2018-01-31 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347753#comment-16347753
 ] 

Ajay Kumar edited comment on HDFS-10453 at 1/31/18 11:11 PM:
-

Hi [~hexiaoqiao], thanks for working on this. The patch looks good to me. One 
minor suggestion: I think we can simplify the patch a bit by merging the new 
check {{if (rw.block.getNumBytes() == BlockCommand.NO_ACK)}} with {{if (bc == 
null || (bc.isUnderConstruction() && block.equals(bc.getLastBlock())))}} inside 
{{computeReplicationWorkForBlocks}} at L1501.

 


was (Author: ajayydv):
Hi [~hexiaoqiao], Thanks for working on this. Patch looks good to me. One minor 
suggestion, I think we can simplify the patch a bit my merging the new check 
\{{if (rw.block.getNumBytes() == BlockCommand.NO_ACK)}}{{ with {{}}if(bc == 
null || (bc.isUnderConstruction() && 
block.equals(bc.getLastBlock({{inside 
\{{computeReplicationWorkForBlocks}} L1501.

 

> ReplicationMonitor thread could stuck for long time due to the race between 
> replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-10453-branch-2.001.patch, 
> HDFS-10453-branch-2.003.patch, HDFS-10453-branch-2.7.004.patch, 
> HDFS-10453-branch-2.7.005.patch, HDFS-10453.001.patch
>
>
> The ReplicationMonitor thread can get stuck for a long time and, with low 
> probability, lose data. Consider the typical scenario:
> (1) create and close a file with the default replication factor (3);
> (2) increase the replication of the file to 10;
> (3) delete the file while the ReplicationMonitor is scheduling blocks 
> belonging to that file for replication.
> When the ReplicationMonitor gets stuck, the NameNode prints logs such as:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough 
> replicas: expected size is 7 but only 0 storage types can be selected 
> (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, 
> DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) All required storage types are unavailable:  
> unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This happens because two threads (#NameNodeRpcServer and #ReplicationMonitor) 
> process the same block at the same moment:
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets the blocks to 
> replicate and releases the global lock.
> (2) FSNamesystem#delete is invoked to delete the blocks and clears the 
> references in the blocksmap, neededReplications, etc.; the block's numBytes 
> is set to NO_ACK (Long.MAX_VALUE), which indicates that the block deletion 
> does not need an explicit ACK from the node.
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to 
> chooseTargets for the same blocks, and no node is selected even after 
> traversing the whole cluster, because no candidate satisfies the goodness 
> criteria (remaining space must reach the required size, Long.MAX_VALUE).
> During stage (3) the ReplicationMonitor is stuck for

[jira] [Commented] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2018-01-31 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347753#comment-16347753
 ] 

Ajay Kumar commented on HDFS-10453:
---

Hi [~hexiaoqiao], thanks for working on this. The patch looks good to me. One 
minor suggestion: I think we can simplify the patch a bit by merging the new 
check {{if (rw.block.getNumBytes() == BlockCommand.NO_ACK)}} with {{if (bc == 
null || (bc.isUnderConstruction() && block.equals(bc.getLastBlock())))}} inside 
{{computeReplicationWorkForBlocks}} at L1501.

 

> ReplicationMonitor thread could stuck for long time due to the race between 
> replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-10453-branch-2.001.patch, 
> HDFS-10453-branch-2.003.patch, HDFS-10453-branch-2.7.004.patch, 
> HDFS-10453-branch-2.7.005.patch, HDFS-10453.001.patch
>
>
> The ReplicationMonitor thread can get stuck for a long time and, with low 
> probability, lose data. Consider the typical scenario:
> (1) create and close a file with the default replication factor (3);
> (2) increase the replication of the file to 10;
> (3) delete the file while the ReplicationMonitor is scheduling blocks 
> belonging to that file for replication.
> When the ReplicationMonitor gets stuck, the NameNode prints logs such as:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough 
> replicas: expected size is 7 but only 0 storage types can be selected 
> (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, 
> DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) All required storage types are unavailable:  
> unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This happens because two threads (#NameNodeRpcServer and #ReplicationMonitor) 
> process the same block at the same moment:
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets the blocks to 
> replicate and releases the global lock.
> (2) FSNamesystem#delete is invoked to delete the blocks and clears the 
> references in the blocksmap, neededReplications, etc.; the block's numBytes 
> is set to NO_ACK (Long.MAX_VALUE), which indicates that the block deletion 
> does not need an explicit ACK from the node.
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to 
> chooseTargets for the same blocks, and no node is selected even after 
> traversing the whole cluster, because no candidate satisfies the goodness 
> criteria (remaining space must reach the required size, Long.MAX_VALUE).
> During stage (3) the ReplicationMonitor is stuck for a long time, especially 
> in a large cluster; invalidateBlocks and neededReplications keep growing with 
> no consumers, and in the worst case data is lost.
> This can mostly be avoided by skipping chooseTarget for BlockCommand.NO_ACK 
> blocks and removing them from neededReplications.
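The proposed skip can be illustrated with a small standalone sketch (hypothetical types, not the actual BlockManager code): while computing replication work, blocks whose size was set to NO_ACK by a concurrent delete are dropped from the needed-replication list instead of going through cluster-wide target selection.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ReplicationSkipDemo {
  // Stand-in for BlockCommand.NO_ACK (Long.MAX_VALUE in HDFS).
  static final long NO_ACK = Long.MAX_VALUE;

  // Minimal stand-in for a block: just an id and a size.
  static class Block {
    final String id;
    final long numBytes;
    Block(String id, long numBytes) { this.id = id; this.numBytes = numBytes; }
  }

  /**
   * Sketch of the proposed fix: while computing replication work, drop blocks
   * whose size was set to NO_ACK by a concurrent delete instead of trying
   * (and repeatedly failing) to choose targets for them across the cluster.
   */
  static List<Block> filterDeleted(List<Block> neededReplications) {
    List<Block> work = new ArrayList<>();
    for (Iterator<Block> it = neededReplications.iterator(); it.hasNext(); ) {
      Block b = it.next();
      if (b.numBytes == NO_ACK) {
        it.remove(); // deleted block: remove from the queue, choose no targets
        continue;
      }
      work.add(b); // live block: proceed to chooseTargets as before
    }
    return work;
  }
}
```

With this check in place, a concurrently deleted block can no longer pin the ReplicationMonitor in a fruitless cluster-wide target search.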



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: 

[jira] [Commented] (HDFS-13043) RBF: Expose the state of the Routers in the federation

2018-01-31 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347752#comment-16347752
 ] 

genericqa commented on HDFS-13043:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 54s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  5s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 85m 38s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}139m  2s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
|   | hadoop.hdfs.TestDFSClientRetries |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-13043 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908652/HDFS-13043.009.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux e60a4d8092ec 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3ce2190 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22912/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22912/testReport/ |
| Max. process+thread count | 4713 (vs. ulimit of 5000) |
| modules | C: 

[jira] [Commented] (HDFS-13062) Provide support for JN to use separate journal disk per namespace

2018-01-31 Thread Hanisha Koneru (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347747#comment-16347747
 ] 

Hanisha Koneru commented on HDFS-13062:
---

Thanks [~bharatviswa]. +1 for patch v06.
Test failures are unrelated. The FindBugs error is inaccurate; we do use 
{{validateAndCreateJournalDir}}.

 

> Provide support for JN to use separate journal disk per namespace
> -
>
> Key: HDFS-13062
> URL: https://issues.apache.org/jira/browse/HDFS-13062
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDFS-13062.00.patch, HDFS-13062.01.patch, 
> HDFS-13062.02.patch, HDFS-13062.03.patch, HDFS-13062.04.patch, 
> HDFS-13062.05.patch, HDFS-13062.06.patch
>
>
> In Federated HA setup, provide support for separate journal disk for each 
> namespace.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13060) Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver

2018-01-31 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347728#comment-16347728
 ] 

Ajay Kumar commented on HDFS-13060:
---

Added a package-info file in patch v3 to address the checkstyle issue.

> Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver
> 
>
> Key: HDFS-13060
> URL: https://issues.apache.org/jira/browse/HDFS-13060
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-13060.000.patch, HDFS-13060.001.patch, 
> HDFS-13060.002.patch, HDFS-13060.003.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both client and server side. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted even if 
> dfs.encrypt.data.transfer is set to true. 
> The default trust channel resolver implementation returns false, indicating 
> that the channel is not trusted, which always enables encryption. HDFS-5910 
> also added a built-in whitelist-based trust channel resolver, which lets you 
> put the IP addresses/network masks of trusted clients/servers in whitelist 
> files to skip encryption for certain traffic.
> This ticket adds a blacklist-based trust channel resolver for cases where 
> only certain machines (IPs) are untrusted, without having to list each 
> trusted IP individually.
>   
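The blacklist idea described above inverts the whitelist logic: trust every peer except the listed ones. The class below is a standalone illustration of that semantics, not the resolver added by the patch; the class and method names are hypothetical.

```java
import java.util.Set;

// Hypothetical sketch of blacklist semantics: trust (and therefore skip
// encryption for) every peer except those explicitly blacklisted. This
// inverts the whitelist resolver, which trusts only listed peers.
public class BlacklistResolverSketch {
  private final Set<String> blacklistedIps;

  public BlacklistResolverSketch(Set<String> blacklistedIps) {
    this.blacklistedIps = blacklistedIps;
  }

  // In the TrustedChannelResolver contract, returning true means the channel
  // is trusted and data transfer encryption can be skipped.
  public boolean isTrusted(String peerIp) {
    return !blacklistedIps.contains(peerIp);
  }
}
```

A real implementation would also need to load the blacklist from configured files and handle network masks, as the whitelist resolver does.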



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13060) Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver

2018-01-31 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDFS-13060:
--
Attachment: HDFS-13060.003.patch

> Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver
> 
>
> Key: HDFS-13060
> URL: https://issues.apache.org/jira/browse/HDFS-13060
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-13060.000.patch, HDFS-13060.001.patch, 
> HDFS-13060.002.patch, HDFS-13060.003.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both the client and server sides. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted even if 
> dfs.encrypt.data.transfer is set to true. 
> The default trust channel resolver implementation returns false, indicating 
> that the channel is not trusted, which always enables encryption. HDFS-5910 
> also added a built-in whitelist-based trust channel resolver. It allows you 
> to put the IP addresses/network masks of trusted clients/servers in whitelist files to 
> skip encryption for certain traffic. 
> This ticket is opened to add a blacklist-based trust channel resolver for 
> cases where only certain machines (IPs) are untrusted, without adding each trusted 
> IP individually.
>   






[jira] [Commented] (HDFS-13073) Cleanup code in InterQJournalProtocol.proto

2018-01-31 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347712#comment-16347712
 ] 

genericqa commented on HDFS-13073:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 38s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 54s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}140m  1s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}206m 26s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.server.namenode.ha.TestHAAppend |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-13073 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908631/HDFS-13073.01.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux 5287a3a05d0e 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3ce2190 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22909/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 

[jira] [Commented] (HDFS-13062) Provide support for JN to use separate journal disk per namespace

2018-01-31 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347677#comment-16347677
 ] 

genericqa commented on HDFS-13062:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
36s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 58s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 43s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m  
2s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}139m  6s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}185m 43s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Private method 
org.apache.hadoop.hdfs.qjournal.server.JournalNode.validateAndCreateJournalDir(File)
 is never called  At JournalNode.java:called  At JournalNode.java:[lines 
202-207] |
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.TestDFSUpgradeFromImage |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.TestSafeModeWithStripedFile |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.hdfs.TestErasureCodingMultipleRacks |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
|   | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.TestDFSStorageStateRecovery |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
|   | hadoop.hdfs.web.TestFSMainOperationsWebHdfs |
|   | hadoop.hdfs.TestDFSStripedOutputStream |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140 |
|   | hadoop.hdfs.TestFileAppend4 |
|   | hadoop.hdfs.TestReadStripedFileWithDNFailure |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure120 |
|   | hadoop.hdfs.TestPread |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | 

[jira] [Assigned] (HDFS-12545) Autotune NameNode RPC handler threads according to number of datanodes in cluster

2018-01-31 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar reassigned HDFS-12545:
-

Assignee: (was: Ajay Kumar)

> Autotune NameNode RPC handler threads according to number of datanodes in 
> cluster
> -
>
> Key: HDFS-12545
> URL: https://issues.apache.org/jira/browse/HDFS-12545
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ajay Kumar
>Priority: Major
>
> Autotune NameNode RPC handler threads according to the number of datanodes in 
> the cluster. 
> Currently, the number of RPC handlers is controlled by {{dfs.namenode.handler.count}} at 
> cluster start. This Jira is to discuss the best way to auto-tune it according to the number of 
> datanodes and any other relevant input. Updating this to 
> {{max(dfs.namenode.handler.count, min(200, 20 * log2(number of DataNodes)))}} 
> on NameNode start is one possible way. (This heuristic is from the [Hadoop 
> Operations|http://shop.oreilly.com/product/0636920025085.do] book.)
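The heuristic quoted above can be sketched as follows. The class and method names are hypothetical; only the formula itself comes from the ticket. The configured handler count acts as a floor and 200 as a cap on the datanode-derived value.

```java
// Sketch of the proposed heuristic:
//   max(dfs.namenode.handler.count, min(200, 20 * log2(number of DataNodes)))
// Names are illustrative, not actual Hadoop code.
public class HandlerCountHeuristic {
    static int autotunedHandlerCount(int configuredCount, int numDataNodes) {
        // log2(n) = ln(n) / ln(2); java.lang.Math has no log2 directly.
        double log2 = Math.log(numDataNodes) / Math.log(2);
        return Math.max(configuredCount, (int) Math.min(200, 20 * log2));
    }

    public static void main(String[] args) {
        // 100 datanodes: 20 * log2(100) is about 133, above the configured 10.
        System.out.println(autotunedHandlerCount(10, 100));
        // 4000 datanodes: 20 * log2(4000) is about 239, capped at 200.
        System.out.println(autotunedHandlerCount(10, 4000));
        // A large configured value always wins as the floor.
        System.out.println(autotunedHandlerCount(300, 4000));
    }
}
```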






[jira] [Commented] (HDFS-13060) Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver

2018-01-31 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347625#comment-16347625
 ] 

genericqa commented on HDFS-13060:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 17s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
38s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m  
6s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 46s{color} | {color:orange} root: The patch generated 1 new + 0 unchanged - 
0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
8m 38s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
11s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}143m 49s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}232m 42s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.TestDistributedFileSystemWithECFileWithRandomECPolicy |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.hdfs.server.namenode.TestNameNodeMXBean |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure130 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
|   | hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.server.federation.router.TestRouterSafemode |
|   | hadoop.hdfs.server.mover.TestMover |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.TestReplaceDatanodeOnFailure |
|   | 

[jira] [Commented] (HDFS-13095) Improve slice tree traversal implementation

2018-01-31 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347616#comment-16347616
 ] 

Xiao Chen commented on HDFS-13095:
--

Thanks [~rakeshr] for creating the Jira and [~daryn] for reviewing 
(re-encryption, essentially :)).

On re-encryption we chose not to change snapshots due to the immutable nature 
of snapshots (an old EDEK can still work if the EZ key version is still there). 
Good point about permissions; perhaps, since this is enforced to be the hdfs 
superuser role, we can skip perm checks...

> Improve slice tree traversal implementation
> ---
>
> Key: HDFS-13095
> URL: https://issues.apache.org/jira/browse/HDFS-13095
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Major
>
> This task is to refine the existing slice tree traversal logic in the 
> [ReencryptionHandler|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ReencryptionHandler.java#L74]
>  class.
> Please refer to Daryn's review comments:
> {quote}*FSTreeTraverser*
>  I need to study this more but I have grave concerns this will work correctly 
> in a mutating namesystem.  Ex. renames and deletes esp. in combination with 
> snapshots. Looks like there's a chance it will go off in the weeds when 
> backtracking out of a renamed directory.
> traverseDir may NPE if it's traversing a tree in a snapshot and one of the 
> ancestors is deleted.
> Not sure why it's bothering to re-check permissions during the crawl.  The 
> storage policy is inherited by the entire tree, regardless of whether the 
> sub-contents are accessible.  The effect of this patch is the storage policy 
> is enforced for all readable files, non-readable violate the new storage 
> policy, new non-readable will conform to the new storage policy.  Very 
> convoluted.  Since new files will conform, should just process the entire 
> tree.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13060) Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver

2018-01-31 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347614#comment-16347614
 ] 

Ajay Kumar commented on HDFS-13060:
---

[~xyao], thanks for the review; created [HADOOP-15202] for the "deprecation of 
CombinedIPWhiteList".

> Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver
> 
>
> Key: HDFS-13060
> URL: https://issues.apache.org/jira/browse/HDFS-13060
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-13060.000.patch, HDFS-13060.001.patch, 
> HDFS-13060.002.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both the client and server sides. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted even if 
> dfs.encrypt.data.transfer is set to true. 
> The default trust channel resolver implementation returns false, indicating 
> that the channel is not trusted, which always enables encryption. HDFS-5910 
> also added a built-in whitelist-based trust channel resolver. It allows you 
> to put the IP addresses/network masks of trusted clients/servers in whitelist files to 
> skip encryption for certain traffic. 
> This ticket is opened to add a blacklist-based trust channel resolver for 
> cases where only certain machines (IPs) are untrusted, without adding each trusted 
> IP individually.
>   






[jira] [Commented] (HDFS-13061) SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted channel

2018-01-31 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347601#comment-16347601
 ] 

Ajay Kumar commented on HDFS-13061:
---

[~xyao], thanks for the review and commit.

> SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted 
> channel
> -
>
> Key: HDFS-13061
> URL: https://issues.apache.org/jira/browse/HDFS-13061
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HDFS-13061.000.patch, HDFS-13061.001.patch, 
> HDFS-13061.002.patch, HDFS-13061.003.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both the client and server sides. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted even if 
> dfs.encrypt.data.transfer is set to true. 
> SaslDataTransferClient#checkTrustAndSend asks the channel resolver whether the 
> client and server addresses are trusted, respectively. It decides the channel 
> is untrusted (and thus enforces encryption) only if both the client and the 
> server are untrusted. *This ticket is opened to change it to not trust (and 
> encrypt) the channel if either the client or the server address is not trusted.*
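The change described above boils down to how the two per-endpoint trust answers are combined. A minimal sketch of the before/after logic (class and method names are hypothetical; the real logic lives in SaslDataTransferClient#checkTrustAndSend):

```java
// Sketch of the trust-combination change. "Before" treats a channel as trusted
// if EITHER end is trusted, so a partially trusted channel skips encryption.
// "After" requires BOTH ends to be trusted before encryption is skipped.
public class ChannelTrustSketch {
    static boolean trustedBefore(boolean clientTrusted, boolean serverTrusted) {
        return clientTrusted || serverTrusted; // old behavior: too permissive
    }

    static boolean trustedAfter(boolean clientTrusted, boolean serverTrusted) {
        return clientTrusted && serverTrusted; // partial trust => encrypt
    }

    public static void main(String[] args) {
        // Partially trusted channel: old logic skips encryption, new logic does not.
        System.out.println(trustedBefore(true, false)); // true
        System.out.println(trustedAfter(true, false));  // false
    }
}
```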






[jira] [Updated] (HDFS-13043) RBF: Expose the state of the Routers in the federation

2018-01-31 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HDFS-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-13043:
---
Attachment: HDFS-13043.009.patch

> RBF: Expose the state of the Routers in the federation
> --
>
> Key: HDFS-13043
> URL: https://issues.apache.org/jira/browse/HDFS-13043
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-13043.000.patch, HDFS-13043.001.patch, 
> HDFS-13043.002.patch, HDFS-13043.003.patch, HDFS-13043.004.patch, 
> HDFS-13043.005.patch, HDFS-13043.006.patch, HDFS-13043.007.patch, 
> HDFS-13043.008.patch, HDFS-13043.009.patch, router-info.png
>
>
> The Router should expose the state of the other Routers in the federation 
> through a user UI.






[jira] [Comment Edited] (HDFS-10285) Storage Policy Satisfier in Namenode

2018-01-31 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347575#comment-16347575
 ] 

Surendra Singh Lilhore edited comment on HDFS-10285 at 1/31/18 8:38 PM:


Thanks [~daryn] for the reviews.

Created Part1 Jira HDFS-13097 to fix a few comments.


was (Author: surendrasingh):
Thanks [~daryn] for reviews.

Create Part1 Jira HDFS-13097 to fix few comments

> Storage Policy Satisfier in Namenode
> 
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
> Attachments: HDFS-10285-consolidated-merge-patch-00.patch, 
> HDFS-10285-consolidated-merge-patch-01.patch, 
> HDFS-10285-consolidated-merge-patch-02.patch, 
> HDFS-10285-consolidated-merge-patch-03.patch, 
> HDFS-10285-consolidated-merge-patch-04.patch, 
> HDFS-10285-consolidated-merge-patch-05.patch, 
> HDFS-SPS-TestReport-20170708.pdf, SPS Modularization.pdf, 
> Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, 
> Storage-Policy-Satisfier-in-HDFS-May10.pdf, 
> Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policies. These 
> policies can be set on a directory/file to specify the user's preference for where 
> the physical blocks should be stored. When the user sets the storage policy before 
> writing data, the blocks can take advantage of the storage policy preference and 
> the physical blocks are stored accordingly. 
> If the user sets the storage policy after writing and completing the file, then 
> the blocks will already have been written with the default storage policy (namely 
> DISK). The user has to run the ‘Mover tool’ explicitly, specifying all such 
> file names as a list. In some distributed-system scenarios (e.g. HBase) it 
> would be difficult to collect all the files and run the tool, as different 
> nodes can write files separately and the files can have different paths.
> Another scenario: when the user renames a file from a directory with one effective 
> storage policy (inherited from the parent directory) into a directory with a 
> different storage policy, the inherited storage policy is not copied from the 
> source, so the file takes effect from the destination parent's storage policy. 
> This rename operation is just a metadata change in the Namenode; the physical 
> blocks still remain with the source storage policy.
> So, tracking all such business-logic-driven file names from distributed nodes 
> (e.g. region servers) and running the Mover tool could be difficult for admins. 
> The proposal here is to provide an API from the Namenode itself to trigger 
> storage policy satisfaction. A daemon thread inside the Namenode should track 
> such calls and send movement commands to the DNs. 
> Will post the detailed design thoughts document soon. 






[jira] [Updated] (HDFS-13097) [SPS]: Fix the branch review comments(Part1)

2018-01-31 Thread Surendra Singh Lilhore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-13097:
--
Status: Patch Available  (was: Open)

> [SPS]: Fix the branch review comments(Part1)
> 
>
> Key: HDFS-13097
> URL: https://issues.apache.org/jira/browse/HDFS-13097
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13097-HDFS-10285.01.patch
>
>
> Fix the branch review comments.






[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode

2018-01-31 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347575#comment-16347575
 ] 

Surendra Singh Lilhore commented on HDFS-10285:
---

Thanks [~daryn] for the reviews.

Created Part1 Jira HDFS-13097 to fix a few comments.

> Storage Policy Satisfier in Namenode
> 
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
> Attachments: HDFS-10285-consolidated-merge-patch-00.patch, 
> HDFS-10285-consolidated-merge-patch-01.patch, 
> HDFS-10285-consolidated-merge-patch-02.patch, 
> HDFS-10285-consolidated-merge-patch-03.patch, 
> HDFS-10285-consolidated-merge-patch-04.patch, 
> HDFS-10285-consolidated-merge-patch-05.patch, 
> HDFS-SPS-TestReport-20170708.pdf, SPS Modularization.pdf, 
> Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, 
> Storage-Policy-Satisfier-in-HDFS-May10.pdf, 
> Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policies. These 
> policies can be set on a directory/file to specify the user's preference for where 
> the physical blocks should be stored. When the user sets the storage policy before 
> writing data, the blocks can take advantage of the storage policy preference and 
> the physical blocks are stored accordingly. 
> If the user sets the storage policy after writing and completing the file, then 
> the blocks will already have been written with the default storage policy (namely 
> DISK). The user has to run the ‘Mover tool’ explicitly, specifying all such 
> file names as a list. In some distributed-system scenarios (e.g. HBase) it 
> would be difficult to collect all the files and run the tool, as different 
> nodes can write files separately and the files can have different paths.
> Another scenario: when the user renames a file from a directory with one effective 
> storage policy (inherited from the parent directory) into a directory with a 
> different storage policy, the inherited storage policy is not copied from the 
> source, so the file takes effect from the destination parent's storage policy. 
> This rename operation is just a metadata change in the Namenode; the physical 
> blocks still remain with the source storage policy.
> So, tracking all such business-logic-driven file names from distributed nodes 
> (e.g. region servers) and running the Mover tool could be difficult for admins. 
> The proposal here is to provide an API from the Namenode itself to trigger 
> storage policy satisfaction. A daemon thread inside the Namenode should track 
> such calls and send movement commands to the DNs. 
> Will post the detailed design thoughts document soon. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13097) [SPS]: Fix the branch review comments(Part1)

2018-01-31 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347573#comment-16347573
 ] 

Surendra Singh Lilhore commented on HDFS-13097:
---

Attached v1 patch

> [SPS]: Fix the branch review comments(Part1)
> 
>
> Key: HDFS-13097
> URL: https://issues.apache.org/jira/browse/HDFS-13097
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13097-HDFS-10285.01.patch
>
>
> Fix the branch review comment.






[jira] [Updated] (HDFS-13097) [SPS]: Fix the branch review comments(Part1)

2018-01-31 Thread Surendra Singh Lilhore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-13097:
--
Attachment: HDFS-13097-HDFS-10285.01.patch

> [SPS]: Fix the branch review comments(Part1)
> 
>
> Key: HDFS-13097
> URL: https://issues.apache.org/jira/browse/HDFS-13097
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13097-HDFS-10285.01.patch
>
>
> Fix the branch review comment.






[jira] [Commented] (HDFS-13062) Provide support for JN to use separate journal disk per namespace

2018-01-31 Thread Bharat Viswanadham (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347571#comment-16347571
 ] 

Bharat Viswanadham commented on HDFS-13062:
---

Hi [~hanishakoneru]

Thanks for the offline discussion about updating getLogDir to use 
validateAndCreateJournalDir(dir) instead of validateAndCreateJournalDir(). I 
had mistakenly changed this in the v05 patch; it is now reverted to match the 
v04 patch.

I also removed the no-arg validateAndCreateJournalDir() and instead call 
validateAndCreateJournalDir(dir) for each localDir in a loop in the start() method.
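A minimal self-contained sketch of the shape of that change (the method names follow the comment above; the real JournalNode code differs, and the validation body here is illustrative):

```java
import java.io.File;
import java.io.IOException;

public class JournalDirs {
    // Validate a single journal dir, creating it if missing.
    static void validateAndCreateJournalDir(File dir) throws IOException {
        if (!dir.isDirectory() && !dir.mkdirs()) {
            throw new IOException("Cannot create journal dir " + dir);
        }
        if (!dir.canWrite()) {
            throw new IOException("Journal dir " + dir + " is not writable");
        }
    }

    // start() iterates the configured dirs and validates each one,
    // instead of a no-arg method handling every dir internally.
    static void start(File[] localDirs) throws IOException {
        for (File dir : localDirs) {
            validateAndCreateJournalDir(dir);  // one journal disk per namespace
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = new File(System.getProperty("java.io.tmpdir"), "jn-ns1");
        start(new File[] { tmp });
        System.out.println(tmp.isDirectory());
    }
}
```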

> Provide support for JN to use separate journal disk per namespace
> -
>
> Key: HDFS-13062
> URL: https://issues.apache.org/jira/browse/HDFS-13062
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDFS-13062.00.patch, HDFS-13062.01.patch, 
> HDFS-13062.02.patch, HDFS-13062.03.patch, HDFS-13062.04.patch, 
> HDFS-13062.05.patch, HDFS-13062.06.patch
>
>
> In Federated HA setup, provide support for separate journal disk for each 
> namespace.






[jira] [Commented] (HDFS-13043) RBF: Expose the state of the Routers in the federation

2018-01-31 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347569#comment-16347569
 ] 

genericqa commented on HDFS-13043:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 21s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 30s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 49s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}127m 17s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}173m 58s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-13043 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908615/HDFS-13043.008.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux 53766c00bf50 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 
19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d481344 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22907/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 

[jira] [Updated] (HDFS-13062) Provide support for JN to use separate journal disk per namespace

2018-01-31 Thread Bharat Viswanadham (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDFS-13062:
--
Attachment: HDFS-13062.06.patch

> Provide support for JN to use separate journal disk per namespace
> -
>
> Key: HDFS-13062
> URL: https://issues.apache.org/jira/browse/HDFS-13062
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDFS-13062.00.patch, HDFS-13062.01.patch, 
> HDFS-13062.02.patch, HDFS-13062.03.patch, HDFS-13062.04.patch, 
> HDFS-13062.05.patch, HDFS-13062.06.patch
>
>
> In Federated HA setup, provide support for separate journal disk for each 
> namespace.






[jira] [Commented] (HDFS-13097) [SPS]: Fix the branch review comments(Part1)

2018-01-31 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347565#comment-16347565
 ] 

Surendra Singh Lilhore commented on HDFS-13097:
---

Fixing the comments below:

*Comment-1)*
{quote}BlockManager
 Shouldn’t spsMode be volatile? Although I question why it’s here.
{quote}
[Rakesh's reply] Agreed, will do the changes.

*Comment-2)*
{quote}Adding SPS methods to this class implies an unexpected coupling of the 
SPS service to the block manager. Please move them out to prove it’s not 
tightly coupled.
{quote}
[Rakesh's reply] Agreed. I'm planning to create {{StoragePolicySatisfyManager}} 
and keep all the related APIs there.

*Comment-5)*
{quote}DatanodeDescriptor
 Why use a synchronized linked list to offer/poll instead of BlockingQueue?
{quote}
[Rakesh's reply] Agreed, will do the changes.

*Comment-8)*
{quote}DFSUtil
 DFSUtil.removeOverlapBetweenStorageTypes and {{DFSUtil.getSPSWorkMultiplier}}. 
These aren’t generally useful methods so why are they in DFSUtil? Why 
aren’t they in the only calling class StoragePolicySatisfier?
[Rakesh's reply] Agreed, will do the changes.

*Comment-11)*
{quote}HdfsServerConstants
 The xattr is called user.hdfs.sps.xattr. Why does the xattr name actually 
contain the word “xattr”?
{quote}
[Rakesh's reply] Sure, will remove the “xattr” word.

*Comment-12)*
{quote}NameNode
 Super trivial but using the plural pronoun “we” in this exception message is 
odd. Changing the value isn’t a joint activity.

For enabling or disabling storage policy satisfier, we must pass either 
none/internal/external string value only
{quote}
[Rakesh's reply] oops, sorry for the mistake. Will change it.

 
*Comment-16)*
{quote}FSDirStatAndListOp
Not sure why javadoc was changed to add needLocation. It's already present and 
now doubled up.{quote}
[Rakesh's reply] Agreed, will correct it.

 
*Comment-18)*
{quote}DFS_MOVER_MOVERTHREADS_DEFAULT is 1000 per DN? If the DN is concurrently 
doing 1000 moves, it's not in a good state, disk io is probably saturated, and 
this will only make it much worse. 10 is probably more than sufficient.{quote}
[Rakesh's reply] Agreed, will reduce it to the smaller value of 10.
 

> [SPS]: Fix the branch review comments(Part1)
> 
>
> Key: HDFS-13097
> URL: https://issues.apache.org/jira/browse/HDFS-13097
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
>
> Fix the branch review comment.






[jira] [Created] (HDFS-13097) [SPS]: Fix the branch review comments(Part1)

2018-01-31 Thread Surendra Singh Lilhore (JIRA)
Surendra Singh Lilhore created HDFS-13097:
-

 Summary: [SPS]: Fix the branch review comments(Part1)
 Key: HDFS-13097
 URL: https://issues.apache.org/jira/browse/HDFS-13097
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: HDFS-10285
Reporter: Surendra Singh Lilhore
Assignee: Surendra Singh Lilhore


Fix the branch review comment.






[jira] [Resolved] (HDFS-11419) BlockPlacementPolicyDefault is choosing datanode in an inefficient way

2018-01-31 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDFS-11419.
--
Resolution: Done

> BlockPlacementPolicyDefault is choosing datanode in an inefficient way
> --
>
> Key: HDFS-11419
> URL: https://issues.apache.org/jira/browse/HDFS-11419
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
>
> Currently in {{BlockPlacementPolicyDefault}}, {{chooseTarget}} will end up 
> calling into {{chooseRandom}}, which will first find a random datanode by 
> calling
> {code}DatanodeDescriptor chosenNode = chooseDataNode(scope, 
> excludedNodes);{code}, then it checks whether that returned datanode 
> satisfies storage type requirement
> {code}storage = chooseStorage4Block(
>   chosenNode, blocksize, results, entry.getKey());{code}
> If yes, {{numOfReplicas--;}}, otherwise, the node is added to excluded nodes, 
> and runs the loop again until {{numOfReplicas}} is down to 0.
> A problem here is that storage type is not considered until after 
> a random node has already been returned.  We've seen a case where a cluster has a 
> large number of datanodes, while only a few satisfy the storage type 
> condition. So, for the most part, this code blindly picks random datanodes 
> that do not satisfy the storage type requirement.
> To make matters worse, the way {{NetworkTopology#chooseRandom}} works is 
> that, given a set of excluded nodes, it first finds a random datanodes, then 
> if it is in excluded nodes set, try find another random nodes. So the more 
> excluded nodes there are, the more likely a random node will be in the 
> excluded set, in which case we basically wasted one iteration.
> Therefore, this JIRA proposes to augment/modify the relevant classes in a way 
> that datanodes can be found more efficiently. There are currently two 
> different high level solutions we are considering:
> 1. add some field to Node base types to describe the storage type info, and 
> when searching for a node, we take into account such field(s), and do not 
> return node that does not meet the storage type requirement.
> 2. change {{NetworkTopology}} class to be aware of storage types, e.g. for 
> one storage type, there is one tree subset that connects all the nodes with 
> that type. And one search happens on only one such subset. So unexpected 
> storage types are simply not in the search space. 
> Thanks [~szetszwo] for the offline discussion, and thanks [~linyiqun] for 
> pointing out a wrong statement (corrected now) in the description. Any 
> further comments are more than welcome.
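The cost of the blind retry loop described above can be illustrated with a small stand-alone simulation (a sketch, not HDFS code; the node counts are made up for illustration):

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class RandomPickSim {
    // Models NetworkTopology#chooseRandom-style selection: pick uniformly
    // from all n nodes, retry whenever the pick is in the excluded set.
    static int pickWithRetries(int n, Set<Integer> excluded, Random rnd) {
        int attempts = 0;
        while (true) {
            attempts++;
            int node = rnd.nextInt(n);
            if (!excluded.contains(node)) {
                return attempts;
            }
        }
    }

    public static void main(String[] args) {
        int n = 1000;       // total datanodes
        int eligible = 10;  // nodes satisfying the storage type requirement
        Set<Integer> excluded = new HashSet<>();
        for (int i = eligible; i < n; i++) {
            excluded.add(i); // 990 nodes fail the storage type check
        }
        Random rnd = new Random(42);
        long total = 0;
        int trials = 10_000;
        for (int t = 0; t < trials; t++) {
            total += pickWithRetries(n, excluded, rnd);
        }
        // With 10 eligible nodes out of 1000, roughly 100 attempts are needed
        // on average, versus exactly 1 if ineligible storage types were simply
        // not in the search space (solution 2 above).
        System.out.printf("average attempts: %.1f%n", (double) total / trials);
    }
}
```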






[jira] [Commented] (HDFS-12512) RBF: Add WebHDFS

2018-01-31 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347525#comment-16347525
 ] 

Wei Yan commented on HDFS-12512:


Saw "java.lang.OutOfMemoryError: unable to create new native thread" exceptions 
in the test log; the VM crashed, which generated the two error log files (and 
Yetus complained about the ASF license of these files). Will wait for a while 
and then retrigger the job.

> RBF: Add WebHDFS
> 
>
> Key: HDFS-12512
> URL: https://issues.apache.org/jira/browse/HDFS-12512
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Íñigo Goiri
>Assignee: Wei Yan
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-12512.000.patch, HDFS-12512.001.patch, 
> HDFS-12512.002.patch, HDFS-12512.003.patch, HDFS-12512.004.patch
>
>
> The Router currently does not support WebHDFS. It needs to implement 
> something similar to {{NamenodeWebHdfsMethods}}.






[jira] [Commented] (HDFS-12512) RBF: Add WebHDFS

2018-01-31 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347510#comment-16347510
 ] 

genericqa commented on HDFS-12512:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 13 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 27s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 34s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 14 new + 121 unchanged - 8 fixed = 135 total (was 129) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 38s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 89m 42s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
24s{color} | {color:red} The patch generated 2 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}146m 40s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.metrics2.sink.TestRollingFileSystemSinkWithSecureHdfs |
|   | hadoop.fs.contract.hdfs.TestHDFSContractDelete |
|   | hadoop.fs.permission.TestStickyBit |
|   | hadoop.hdfs.TestFileAppend |
|   | hadoop.fs.contract.router.web.TestRouterWebHDFSContractRename |
|   | hadoop.fs.TestSymlinkHdfsFileSystem |
|   | hadoop.fs.contract.router.web.TestRouterWebHDFSContractMkdir |
|   | hadoop.fs.contract.router.web.TestRouterWebHDFSContractRootDirectory |
|   | hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs |
|   | hadoop.hdfs.TestMaintenanceState |
|   | hadoop.fs.contract.router.web.TestRouterWebHDFSContractCreate |
|   | hadoop.hdfs.TestDFSStartupVersions |
|   | hadoop.fs.contract.hdfs.TestHDFSContractCreate |
|   | hadoop.hdfs.TestFileCreation |
|   | hadoop.hdfs.TestAppendSnapshotTruncate |
|   | hadoop.security.TestPermission |
|   | hadoop.hdfs.TestExternalBlockReader |
|   | hadoop.hdfs.TestAbandonBlock |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 |
|   | hadoop.fs.contract.router.web.TestRouterWebHDFSContractDelete |
|   

[jira] [Created] (HDFS-13096) HDFS group quota

2018-01-31 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created HDFS-13096:


 Summary: HDFS group quota
 Key: HDFS-13096
 URL: https://issues.apache.org/jira/browse/HDFS-13096
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode, fs, hdfs, nn
Affects Versions: 3.0.0, 2.7.5, 2.8.3
Reporter: Ruslan Dautkhanov


We have groups of people that have their own set of HDFS directories. 
For example, they have HDFS staging place for new files:
/datascience
/analysts 
... 
but at the same time they have Hive warehouse directory 
/hivewarehouse/datascience
/hivewarehouse/analysts 
... 
on top of that they also have some files stored under /user/${username}/ 

It has always been a challenge to maintain a combined quota across all the HDFS 
locations a particular group owns, as we're currently forced to set a separate 
quota on each directory independently.

It would be great if HDFS had a quota tied either
- to a set of HDFS locations;
- or to a group of people (where `group` is defined as the HDFS group a 
particular file/directory belongs to).

Linux allows defining quotas at the group level (e.g. `edquota -g devel`); it 
would be great to have the same at the HDFS level.

Other thoughts and ideas?






[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode

2018-01-31 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347472#comment-16347472
 ] 

Rakesh R commented on HDFS-10285:
-

Thank you very much [~daryn] for your time and useful comments/thoughts. My 
replies follow; please take a look.

+Comment-1)+
{quote}BlockManager
 Shouldn’t spsMode be volatile? Although I question why it’s here.
{quote}
[Rakesh's reply] Agreed, will do the changes.

+Comment-2)+
{quote}Adding SPS methods to this class implies an unexpected coupling of the 
SPS service to the block manager. Please move them out to prove it’s not 
tightly coupled.
{quote}
[Rakesh's reply] Agreed. We are planning to create 
{{StoragePolicySatisfyManager}} and keep all the related APIs there.

+Comment-3)+
{quote}BPServiceActor
 Is it actually sending back the moved blocks? Aren’t IBRs sufficient?

BlockStorageMovementCommand/BlocksStorageMoveAttemptFinished
 Again, not sure that a new DN command is necessary, and why does it 
specifically report back successful moves instead of relying on IBRs? I would 
actually expect the DN to be completely ignorant of a SPS move vs any other 
move.
{quote}
[Rakesh's reply] We have explored the IBR approach and the required code 
changes. If SPS relied on IBRs, it would require an *extra* check to determine 
whether a new block arrived due to an SPS move or something else, and that 
check would run quite often, since other operations far outnumber SPS block 
moves. Currently the DN sends back the {{blksMovementsFinished}} list 
separately, so each finished movement can be quickly recognized by the 
Satisfier on the NN side, which then updates its tracking details. If you agree 
this *extra* check is not an issue, we would be happy to implement the IBR 
approach. Secondly, BlockStorageMovementCommand was added to carry the block 
with its src/target pairs, which is needed for the move operation, and we used 
this command to decouple the SPS code. 

+Comment-4)+
{quote}DataNode
 Why isn’t this just a block transfer? How is transferring between DNs any 
different than across storages?
{quote}
[Rakesh's reply] I see that Mover also uses the {{REPLACE_BLOCK}} call, and we 
simply followed the same approach in SPS. Am I missing anything here?

+Comment-5)+
{quote}DatanodeDescriptor
 Why use a synchronized linked list to offer/poll instead of BlockingQueue?
{quote}
[Rakesh's reply] Agreed, will do the changes.
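For reference, a minimal JDK-only sketch of the suggested switch to a {{BlockingQueue}} (the queue name is illustrative, not the actual field in DatanodeDescriptor):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BlockQueueDemo {
    public static void main(String[] args) {
        // A BlockingQueue provides thread-safe offer/poll without the external
        // synchronization a synchronized LinkedList would need, and poll()
        // simply returns null when the queue is empty.
        BlockingQueue<Long> pendingBlockIds = new LinkedBlockingQueue<>();
        pendingBlockIds.offer(1001L);
        pendingBlockIds.offer(1002L);
        System.out.println(pendingBlockIds.poll()); // 1001
        System.out.println(pendingBlockIds.poll()); // 1002
        System.out.println(pendingBlockIds.poll()); // null
    }
}
```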

+Comment-6)+
{quote}DatanodeManager
 I know it’s configurable, but realistically, when would you ever want to give 
storage movement tasks equal footing with under-replication? Is there really a 
use case for not valuing durability?
{quote}
[Rakesh's reply] We don't have a particular use case, though. One scenario we 
considered is a user-configured SSD tier that fills up quickly; in that case, 
cleaning it up could be considered high priority. If you feel this is not a 
real case, I'm OK with removing this config so that SPS always uses only the 
remaining slots.

+Comment-7)+
{quote}Adding getDatanodeStorageReport is concerning. getDatanodeListForReport 
is already a very bad method that should be avoided for anything but jmx – even 
then it’s a concern. I eliminated calls to it years ago. All it takes is a 
nscd/dns hiccup and you’re left holding the fsn lock for an excessive length of 
time. Beyond that, the response is going to be pretty large and tagging all the 
storage reports is not going to be cheap.

verifyTargetDatanodeHasSpaceForScheduling does it really need the namesystem 
lock? Can’t DatanodeDescriptor#chooseStorage4Block synchronize on its 
storageMap?

Appears to be calling getLiveDatanodeStorageReport for every file. As mentioned 
earlier, this is NOT cheap. The SPS should be able to operate on a fuzzy/cached 
state of the world. Then it gets another datanode report to determine the 
number of live nodes to decide if it should sleep before processing the next 
path. The number of nodes from the prior cached view of the world should 
suffice.
{quote}
[Rakesh's reply] Good point. Some time back, Uma and I thought about the 
caching part. We depend on this API for the datanode storage types and 
remaining-space details. I think it requires two different mechanisms for 
internal and external SPS. For internal SPS, how about referring directly to 
{{DatanodeManager#datanodeMap}} for every file? For external SPS, IIUC you are 
suggesting a cache mechanism: get the storage report once and cache it in 
ExternalContext, refreshing it periodically. Say, after every 5 minutes (just 
an arbitrary number; if you have a period in mind, please suggest it), a 
getDatanodeStorageReport call would treat the cache as expired and fetch fresh 
data; within the 5 minutes it would serve from the cache. Does 
this make sense to you?
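A minimal sketch of that refresh-on-expiry idea (the class is illustrative, the 5-minute TTL is the arbitrary figure from the comment, and the ExternalContext integration is not shown):

```java
import java.util.function.Supplier;

public class ExpiringCache<T> {
    private final Supplier<T> loader;  // e.g. () -> getDatanodeStorageReport()
    private final long ttlMillis;
    private T cached;
    private long loadedAt;

    ExpiringCache(Supplier<T> loader, long ttlMillis) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
    }

    // Reload only when the cached value is missing or older than the TTL;
    // otherwise serve the cached copy.
    synchronized T get() {
        long now = System.currentTimeMillis();
        if (cached == null || now - loadedAt >= ttlMillis) {
            cached = loader.get();
            loadedAt = now;
        }
        return cached;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        ExpiringCache<Integer> c =
            new ExpiringCache<>(() -> ++calls[0], 5 * 60_000L);
        c.get();
        c.get(); // second call within the TTL: loader is not re-invoked
        System.out.println(calls[0]); // 1
    }
}
```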

Another point we considered: right now, checking whether a node has enough 
space goes to the NN. With 

[jira] [Created] (HDFS-13095) Improve slice tree traversal implementation

2018-01-31 Thread Rakesh R (JIRA)
Rakesh R created HDFS-13095:
---

 Summary: Improve slice tree traversal implementation
 Key: HDFS-13095
 URL: https://issues.apache.org/jira/browse/HDFS-13095
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Rakesh R
Assignee: Rakesh R


This task is to refine the existing slice tree traversal logic in 
[ReencryptionHandler|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ReencryptionHandler.java#L74]
 class.

Please refer to Daryn's review comments:
{quote}*FSTreeTraverser*
 I need to study this more but I have grave concerns this will work correctly 
in a mutating namesystem.  Ex. renames and deletes esp. in combination with 
snapshots. Looks like there's a chance it will go off in the weeds when 
backtracking out of a renamed directory.

traverseDir may NPE if it's traversing a tree in a snapshot and one of the 
ancestors is deleted.

Not sure why it's bothering to re-check permissions during the crawl.  The 
storage policy is inherited by the entire tree, regardless of whether the 
sub-contents are accessible.  The effect of this patch is the storage policy is 
enforced for all readable files, non-readable violate the new storage policy, 
new non-readable will conform to the new storage policy.  Very convoluted.  
Since new files will conform, should just process the entire tree.
{quote}






[jira] [Commented] (HDFS-13092) Reduce verbosity for ThrottledAsyncChecker.java:schedule

2018-01-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347445#comment-16347445
 ] 

Hudson commented on HDFS-13092:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13593 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13593/])
HDFS-13092. Reduce verbosity for ThrottledAsyncChecker#schedule. 
(hanishakoneru: rev 3ce2190b581526ad2d49e8c3a47be1547037310c)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/checker/ThrottledAsyncChecker.java
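The committed fix can be sketched with a simplified stand-alone model (not the actual ThrottledAsyncChecker code): the INFO line is emitted only on the branch where a check is actually scheduled, not on every call.

```java
import java.util.HashMap;
import java.util.Map;

public class ThrottledScheduler {
    private final long minMsBetweenChecks;
    private final Map<String, Long> lastCheck = new HashMap<>();

    ThrottledScheduler(long minMsBetweenChecks) {
        this.minMsBetweenChecks = minMsBetweenChecks;
    }

    boolean schedule(String target, long nowMs) {
        Long last = lastCheck.get(target);
        if (last != null && nowMs - last < minMsBetweenChecks) {
            return false; // throttled: no check scheduled, and no log line
        }
        // Logged only when a check is actually scheduled.
        System.out.println("Scheduling a check for " + target);
        lastCheck.put(target, nowMs);
        return true;
    }

    public static void main(String[] args) {
        ThrottledScheduler s = new ThrottledScheduler(10_000);
        System.out.println(s.schedule("/grid/2/hadoop/hdfs/data", 0));     // true
        System.out.println(s.schedule("/grid/2/hadoop/hdfs/data", 3_000)); // false
    }
}
```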


> Reduce verbosity for ThrottledAsyncChecker.java:schedule
> 
>
> Key: HDFS-13092
> URL: https://issues.apache.org/jira/browse/HDFS-13092
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: HDFS-13092.001.patch
>
>
> ThrottledAsyncChecker.java:schedule prints a log message every time a disk 
> check is scheduled. However, if the previous check ran less than 
> "minMsBetweenChecks" milliseconds earlier, the task is not actually 
> scheduled. This jira reduces the log verbosity by printing the message only 
> when the task will actually be scheduled.
> {code}
> 2018-01-29 00:51:44,467 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/2/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,470 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/2/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,477 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/4/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,480 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/4/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,486 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/11/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,501 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/13/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,507 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/11/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,533 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/2/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,536 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/12/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,543 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/10/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,544 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/2/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,548 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/3/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,549 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/5/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,550 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/6/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,551 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/2/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,552 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/10/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,552 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/8/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,552 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/12/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,554 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/9/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,555 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/8/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,555 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> 
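The throttling idea above can be sketched in a few lines. This is a minimal illustration with hypothetical names, not the actual ThrottledAsyncChecker code: a check is scheduled, and the message logged, only when at least minMsBetweenChecks have elapsed for that target:

```java
import java.util.HashMap;
import java.util.Map;

public class ThrottleSketch {
    private final long minMsBetweenChecks;
    private final Map<String, Long> lastCheckMs = new HashMap<>();

    public ThrottleSketch(long minMsBetweenChecks) {
        this.minMsBetweenChecks = minMsBetweenChecks;
    }

    /** Returns true if a check was actually scheduled for the target. */
    public synchronized boolean schedule(String target, long nowMs) {
        Long last = lastCheckMs.get(target);
        if (last != null && nowMs - last < minMsBetweenChecks) {
            // Throttled: skip silently instead of logging "Scheduling a check".
            return false;
        }
        lastCheckMs.put(target, nowMs);
        System.out.println("Scheduling a check for " + target);
        return true;
    }

    public static void main(String[] args) {
        ThrottleSketch t = new ThrottleSketch(1000);
        System.out.println(t.schedule("/grid/2/hadoop/hdfs/data/current", 0));    // true
        System.out.println(t.schedule("/grid/2/hadoop/hdfs/data/current", 500));  // false: throttled
        System.out.println(t.schedule("/grid/2/hadoop/hdfs/data/current", 1500)); // true
    }
}
```

With the patch's approach, the second call logs nothing, which is what removes the duplicate "Scheduling a check" lines above.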

[jira] [Updated] (HDFS-13092) Reduce verbosity for ThrottledAsyncChecker.java:schedule

2018-01-31 Thread Hanisha Koneru (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru updated HDFS-13092:
--
Issue Type: Improvement  (was: Bug)

> Reduce verbosity for ThrottledAsyncChecker.java:schedule
> 
>
> Key: HDFS-13092
> URL: https://issues.apache.org/jira/browse/HDFS-13092
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: HDFS-13092.001.patch
>
>

[jira] [Comment Edited] (HDFS-13073) Cleanup code in InterQJournalProtocol.proto

2018-01-31 Thread Bharat Viswanadham (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347408#comment-16347408
 ] 

Bharat Viswanadham edited comment on HDFS-13073 at 1/31/18 7:08 PM:


Fixed checkstyle issues in patch v01.

Test cases are not added because the added method calls the existing 
getEditLogManifest; this path is already covered by the TestJournalNodeSync 
test cases.


was (Author: bharatviswa):
Fixed checkstyle issues in patch v01.

> Cleanup code in InterQJournalProtocol.proto
> ---
>
> Key: HDFS-13073
> URL: https://issues.apache.org/jira/browse/HDFS-13073
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDFS-13073.00.patch, HDFS-13073.01.patch
>
>
> We can reuse the messages in QJournalProtocol.proto, instead of redefining 
> again in InterQJournalProtocol.proto.






[jira] [Commented] (HDFS-13092) Reduce verbosity for ThrottledAsyncChecker.java:schedule

2018-01-31 Thread Hanisha Koneru (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347416#comment-16347416
 ] 

Hanisha Koneru commented on HDFS-13092:
---

Committed to trunk and branch-3.0.

> Reduce verbosity for ThrottledAsyncChecker.java:schedule
> 
>
> Key: HDFS-13092
> URL: https://issues.apache.org/jira/browse/HDFS-13092
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: HDFS-13092.001.patch
>
>

[jira] [Commented] (HDFS-13061) SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted channel

2018-01-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347412#comment-16347412
 ] 

Hudson commented on HDFS-13061:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13592 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13592/])
HDFS-13061. SaslDataTransferClient#checkTrustAndSend should not trust a (xyao: 
rev 37b753656849d0864ed3c8858edf3b85515cbf39)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/sasl/SaslDataTransferClient.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/datatransfer/sasl/TestSaslDataTransfer.java


> SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted 
> channel
> -
>
> Key: HDFS-13061
> URL: https://issues.apache.org/jira/browse/HDFS-13061
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HDFS-13061.000.patch, HDFS-13061.001.patch, 
> HDFS-13061.002.patch, HDFS-13061.003.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both client and server side. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted even if 
> dfs.encrypt.data.transfer is set to true. 
> SaslDataTransferClient#checkTrustAndSend asks the channel resolver whether 
> the client and server addresses are trusted, respectively. It decides the 
> channel is untrusted, and enforces encryption, only if both the client and 
> the server are untrusted. *This ticket is opened to change it to not trust 
> (and to encrypt) if either the client or the server address is not trusted.*
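The decision change can be reduced to a boolean sketch. The class and method names below are hypothetical, not the real SaslDataTransferClient API; they only illustrate the either/both flip:

```java
import java.util.Set;

public class TrustDecisionSketch {
    private final Set<String> trustedAddrs;

    public TrustDecisionSketch(Set<String> trustedAddrs) {
        this.trustedAddrs = trustedAddrs;
    }

    /** Old behavior: channel trusted (encryption skipped) if EITHER endpoint is trusted. */
    public boolean trustedBefore(String client, String server) {
        return trustedAddrs.contains(client) || trustedAddrs.contains(server);
    }

    /** Fixed behavior: channel trusted only if BOTH endpoints are trusted. */
    public boolean trustedAfter(String client, String server) {
        return trustedAddrs.contains(client) && trustedAddrs.contains(server);
    }

    public static void main(String[] args) {
        TrustDecisionSketch t = new TrustDecisionSketch(Set.of("10.0.0.1"));
        // Partially trusted channel: old logic skipped encryption, new logic encrypts.
        System.out.println(t.trustedBefore("10.0.0.1", "10.0.0.9")); // true
        System.out.println(t.trustedAfter("10.0.0.1", "10.0.0.9"));  // false
    }
}
```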






[jira] [Commented] (HDFS-13056) Expose file-level composite CRCs in HDFS which are comparable across different instances/layouts

2018-01-31 Thread Dennis Huo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347407#comment-16347407
 ] 

Dennis Huo commented on HDFS-13056:
---

I ended up going with modifying the existing protocol, since otherwise the same 
splitting of the BlockGroupChecksum method for striped encodings ends up 
getting unwieldy. I've uploaded an amended v2 design doc outlining the pros and 
cons we've discussed for the DataTransferProtocol. It turns out this approach 
is also useful for dealing with merging stripe cells; I'm still finalizing that 
piece of the design.

> Expose file-level composite CRCs in HDFS which are comparable across 
> different instances/layouts
> 
>
> Key: HDFS-13056
> URL: https://issues.apache.org/jira/browse/HDFS-13056
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, distcp, erasure-coding, federation, hdfs
>Affects Versions: 3.0.0
>Reporter: Dennis Huo
>Priority: Major
> Attachments: HDFS-13056-branch-2.8.001.patch, 
> HDFS-13056-branch-2.8.poc1.patch, Reference_only_zhen_PPOC_hadoop2.6.X.diff, 
> hdfs-file-composite-crc32-v1.pdf, hdfs-file-composite-crc32-v2.pdf
>
>
> FileChecksum was first introduced in 
> [https://issues-test.apache.org/jira/browse/HADOOP-3981] and ever since then 
> has remained defined as MD5-of-MD5-of-CRC, where per-512-byte chunk CRCs are 
> already stored as part of datanode metadata, and the MD5 approach is used to 
> compute an aggregate value in a distributed manner, with individual datanodes 
> computing the MD5-of-CRCs per-block in parallel, and the HDFS client 
> computing the second-level MD5.
>  
> A shortcoming of this approach which is often brought up is the fact that 
> this FileChecksum is sensitive to the internal block-size and chunk-size 
> configuration, and thus different HDFS files with different block/chunk 
> settings cannot be compared. More commonly, one might have different HDFS 
> clusters which use different block sizes, in which case any data migration 
> won't be able to use the FileChecksum for distcp's rsync functionality or for 
> verifying end-to-end data integrity (on top of low-level data integrity 
> checks applied at data transfer time).
>  
> This was also revisited in https://issues.apache.org/jira/browse/HDFS-8430 
> during the addition of checksum support for striped erasure-coded files; 
> while there was some discussion of using CRC composability, it still 
> ultimately settled on the hierarchical MD5 approach, which also adds the 
> problem that checksums of basic replicated files are not comparable to 
> striped files.
>  
> This feature proposes to add a "COMPOSITE-CRC" FileChecksum type which uses 
> CRC composition to remain completely chunk/block agnostic, and allows 
> comparison between striped vs replicated files, between different HDFS 
> instances, and possibly even between HDFS and other external storage systems. 
> This feature can also be added in-place to be compatible with existing block 
> metadata, and doesn't need to change the normal path of chunk verification, 
> so it is minimally invasive. This also means even large preexisting HDFS 
> deployments could adopt this feature to retroactively sync data. A detailed 
> design document can be found here: 
> https://storage.googleapis.com/dennishuo/hdfs-file-composite-crc32-v1.pdf
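The CRC composition the proposal relies on can be demonstrated independently of HDFS. The sketch below is a port of zlib's well-known crc32_combine algorithm (not code from the attached patches): CRC32(A+B) is computable from CRC32(A), CRC32(B), and len(B) alone, which is what makes a composite checksum agnostic to chunk and block boundaries:

```java
import java.util.zip.CRC32;

public class Crc32Combine {
    private static final long POLY = 0xEDB88320L; // reflected CRC-32 polynomial

    // Multiply a GF(2) matrix (32 rows) by a 32-bit vector.
    private static long times(long[] mat, long vec) {
        long sum = 0;
        int i = 0;
        while (vec != 0) {
            if ((vec & 1) != 0) sum ^= mat[i];
            vec >>>= 1;
            i++;
        }
        return sum;
    }

    private static void square(long[] sq, long[] mat) {
        for (int n = 0; n < 32; n++) sq[n] = times(mat, mat[n]);
    }

    /** Combine CRC32(a) and CRC32(b) into CRC32(a+b), given len2 = length of b. */
    public static long combine(long crc1, long crc2, long len2) {
        if (len2 <= 0) return crc1;
        long[] even = new long[32];
        long[] odd = new long[32];
        odd[0] = POLY;                 // operator for one zero bit
        long row = 1;
        for (int n = 1; n < 32; n++) { odd[n] = row; row <<= 1; }
        square(even, odd);             // operator for two zero bits
        square(odd, even);             // operator for four zero bits
        do {                           // append len2 zero bytes to crc1
            square(even, odd);
            if ((len2 & 1) != 0) crc1 = times(even, crc1);
            len2 >>= 1;
            if (len2 == 0) break;
            square(odd, even);
            if ((len2 & 1) != 0) crc1 = times(odd, crc1);
            len2 >>= 1;
        } while (len2 != 0);
        return crc1 ^ crc2;
    }

    public static void main(String[] args) {
        byte[] a = "hello ".getBytes();
        byte[] b = "world".getBytes();
        CRC32 ca = new CRC32(); ca.update(a);
        CRC32 cb = new CRC32(); cb.update(b);
        CRC32 cab = new CRC32(); cab.update(a); cab.update(b);
        long combined = combine(ca.getValue(), cb.getValue(), b.length);
        System.out.println(combined == cab.getValue());
    }
}
```

Because combining only needs each part's CRC and length, per-block CRCs computed under any chunk/block layout can be folded into one file-level value.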






[jira] [Commented] (HDFS-13073) Cleanup code in InterQJournalProtocol.proto

2018-01-31 Thread Bharat Viswanadham (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347408#comment-16347408
 ] 

Bharat Viswanadham commented on HDFS-13073:
---

Fixed checkstyle issues in patch v01.

> Cleanup code in InterQJournalProtocol.proto
> ---
>
> Key: HDFS-13073
> URL: https://issues.apache.org/jira/browse/HDFS-13073
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDFS-13073.00.patch, HDFS-13073.01.patch
>
>
> We can reuse the messages in QJournalProtocol.proto, instead of redefining 
> again in InterQJournalProtocol.proto.






[jira] [Updated] (HDFS-13073) Cleanup code in InterQJournalProtocol.proto

2018-01-31 Thread Bharat Viswanadham (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDFS-13073:
--
Attachment: HDFS-13073.01.patch

> Cleanup code in InterQJournalProtocol.proto
> ---
>
> Key: HDFS-13073
> URL: https://issues.apache.org/jira/browse/HDFS-13073
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDFS-13073.00.patch, HDFS-13073.01.patch
>
>
> We can reuse the messages in QJournalProtocol.proto, instead of redefining 
> again in InterQJournalProtocol.proto.






[jira] [Updated] (HDFS-13056) Expose file-level composite CRCs in HDFS which are comparable across different instances/layouts

2018-01-31 Thread Dennis Huo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Huo updated HDFS-13056:
--
Attachment: hdfs-file-composite-crc32-v2.pdf

> Expose file-level composite CRCs in HDFS which are comparable across 
> different instances/layouts
> 
>
> Key: HDFS-13056
> URL: https://issues.apache.org/jira/browse/HDFS-13056
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, distcp, erasure-coding, federation, hdfs
>Affects Versions: 3.0.0
>Reporter: Dennis Huo
>Priority: Major
> Attachments: HDFS-13056-branch-2.8.001.patch, 
> HDFS-13056-branch-2.8.poc1.patch, Reference_only_zhen_PPOC_hadoop2.6.X.diff, 
> hdfs-file-composite-crc32-v1.pdf, hdfs-file-composite-crc32-v2.pdf
>
>






[jira] [Commented] (HDFS-13058) Fix dfs.namenode.shared.edits.dir in TestJournalNode

2018-01-31 Thread Bharat Viswanadham (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347402#comment-16347402
 ] 

Bharat Viswanadham commented on HDFS-13058:
---

Test failures are not related.

Ran them locally; the tests passed.

> Fix dfs.namenode.shared.edits.dir in TestJournalNode
> 
>
> Key: HDFS-13058
> URL: https://issues.apache.org/jira/browse/HDFS-13058
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDFS-13058.00.patch, HDFS-13058.01.patch
>
>
> In TestJournalNode.java,
> dfs.namenode.shared.edits.dir is set as below:
> conf.set(DFSConfigKeys.DFS_NAMENODE_SHARED_EDITS_DIR_KEY +".ns1" +".nn1",
>  "qjournal://journalnode0:9900;journalnode1:9901");
>  
> From the HDFS documentation:
> [https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html]
> The URI should be of the form: 
> {{qjournal://*host1:port1*;*host2:port2*;*host3:port3*/*journalId*}}. 
>  
> Found this while working on another jira: parsing this 
> dfs.namenode.shared.edits.dir property threw an exception.
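The difference is visible with plain java.net.URI parsing (an illustration only, not the NameNode's actual parsing code): the journalId is the URI's path component, and the form used in the test leaves it empty:

```java
import java.net.URI;

public class QjmUriSketch {
    public static void main(String[] args) {
        // Correct form per the HA-with-QJM docs: host list plus a journalId path.
        URI good = URI.create("qjournal://journalnode0:9900;journalnode1:9901/ns1");
        System.out.println(good.getAuthority()); // semicolon-separated host list
        System.out.println(good.getPath());      // "/ns1" -- the journalId

        // Form used in TestJournalNode: no journalId, so the path is empty.
        URI bad = URI.create("qjournal://journalnode0:9900;journalnode1:9901");
        System.out.println(bad.getPath().isEmpty());
    }
}
```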






[jira] [Commented] (HDFS-13060) Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver

2018-01-31 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347397#comment-16347397
 ] 

Xiaoyu Yao commented on HDFS-13060:
---

Thanks [~ajayydv] for the update. Patch v2 LGTM, +1 pending Jenkins. 
Please also file a ticket to deprecate CombinedIPWhiteList in favor of 
CombinedIPList for the whitelist-based resolver as well, to reduce the 
duplicated code.

> Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver
> 
>
> Key: HDFS-13060
> URL: https://issues.apache.org/jira/browse/HDFS-13060
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-13060.000.patch, HDFS-13060.001.patch, 
> HDFS-13060.002.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both client and server side. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted even if 
> dfs.encrypt.data.transfer is set to true. 
> The default trust channel resolver implementation returns false, indicating 
> that the channel is not trusted, which always enables encryption. HDFS-5910 
> also added a built-in whitelist-based trust channel resolver. It allows you 
> to put the IP address/network mask of trusted clients/servers in whitelist 
> files to skip encryption for certain traffic. 
> This ticket is opened to add a blacklist-based trust channel resolver for 
> cases where only certain machines (IPs) are untrusted, without adding each 
> trusted IP individually.
>   
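A blacklist resolver inverts the whitelist default. The sketch below uses hypothetical names (not the attached patch, which also supports network masks and file-based configuration): every peer is trusted, and encryption skipped, unless its IP is explicitly listed:

```java
import java.util.Set;

public class BlacklistTrustSketch {
    private final Set<String> untrustedIps;

    public BlacklistTrustSketch(Set<String> untrustedIps) {
        this.untrustedIps = untrustedIps;
    }

    /** Trusted (encryption may be skipped) unless the peer IP is blacklisted. */
    public boolean isTrusted(String peerIp) {
        return !untrustedIps.contains(peerIp);
    }

    public static void main(String[] args) {
        BlacklistTrustSketch r = new BlacklistTrustSketch(Set.of("10.0.0.66"));
        System.out.println(r.isTrusted("10.0.0.5"));  // true: not blacklisted
        System.out.println(r.isTrusted("10.0.0.66")); // false: encrypt this channel
    }
}
```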






[jira] [Updated] (HDFS-13061) SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted channel

2018-01-31 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-13061:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.1.0
   Status: Resolved  (was: Patch Available)

Thanks [~ajayydv] for the contribution. I've committed the patch to trunk and 
branch-3.0.

> SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted 
> channel
> -
>
> Key: HDFS-13061
> URL: https://issues.apache.org/jira/browse/HDFS-13061
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HDFS-13061.000.patch, HDFS-13061.001.patch, 
> HDFS-13061.002.patch, HDFS-13061.003.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both client and server side. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted even if 
> dfs.encrypt.data.transfer is set to true. 
> SaslDataTransferClient#checkTrustAndSend asks the channel resolver whether 
> the client and server addresses are trusted, respectively. It decides the 
> channel is untrusted, and enforces encryption, only if both the client and 
> the server are untrusted. *This ticket is opened to change it to not trust 
> (and to encrypt) if either the client or the server address is not trusted.*






[jira] [Commented] (HDFS-13092) Reduce verbosity for ThrottledAsyncChecker.java:schedule

2018-01-31 Thread Hanisha Koneru (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347383#comment-16347383
 ] 

Hanisha Koneru commented on HDFS-13092:
---

Thanks for the patch, [~msingh].
LGTM. Test failures are unrelated and pass locally. +1.

> Reduce verbosity for ThrottledAsyncChecker.java:schedule
> 
>
> Key: HDFS-13092
> URL: https://issues.apache.org/jira/browse/HDFS-13092
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: HDFS-13092.001.patch
>
>
> ThrottledAsyncChecker.java:schedule prints a log message every time a disk 
> check is scheduled. However, if the previous check ran less than 
> "minMsBetweenChecks" milliseconds ago, the task is not actually scheduled. 
> This jira reduces the log verbosity by printing the message only when the 
> task will actually be scheduled.
> {code}
> 2018-01-29 00:51:44,467 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/2/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,470 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/2/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,477 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/4/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,480 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/4/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,486 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/11/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,501 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/13/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,507 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/11/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,533 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/2/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,536 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/12/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,543 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/10/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,544 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/2/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,548 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/3/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,549 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/5/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,550 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/6/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,551 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/2/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,552 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/10/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,552 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/8/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,552 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/12/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,554 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/9/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,555 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/8/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,555 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/14/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,560 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for 
> /grid/12/hadoop/hdfs/data/current
> 2018-01-29 00:51:44,560 INFO  checker.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(107)) - 
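The throttling rule behind this change can be sketched as follows (a simplified model for this thread, not the Hadoop implementation; class and method names are made up):

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Minimal sketch of the throttling rule described above: a check for a target
 * is scheduled (and a log line emitted) only if no check ran for that target
 * within the last minMsBetweenChecks milliseconds.
 */
public class ThrottleSketch {
    private final long minMsBetweenChecks;
    private final Map<String, Long> lastCheckMs = new HashMap<>();

    public ThrottleSketch(long minMsBetweenChecks) {
        this.minMsBetweenChecks = minMsBetweenChecks;
    }

    /** Returns true only when a new check is actually scheduled. */
    public boolean schedule(String target, long nowMs) {
        Long last = lastCheckMs.get(target);
        if (last != null && nowMs - last < minMsBetweenChecks) {
            return false; // throttled: with the fix, no log line is printed here
        }
        lastCheckMs.put(target, nowMs);
        return true;      // scheduled: this is the only case worth logging
    }

    public static void main(String[] args) {
        ThrottleSketch t = new ThrottleSketch(10_000);
        System.out.println(t.schedule("/grid/2/hadoop/hdfs/data/current", 0));      // true: scheduled
        System.out.println(t.schedule("/grid/2/hadoop/hdfs/data/current", 3));      // false: throttled
        System.out.println(t.schedule("/grid/2/hadoop/hdfs/data/current", 20_000)); // true: interval elapsed
    }
}
```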

[jira] [Created] (HDFS-13094) Refactor TestJournalNode

2018-01-31 Thread Bharat Viswanadham (JIRA)
Bharat Viswanadham created HDFS-13094:
-

 Summary: Refactor TestJournalNode
 Key: HDFS-13094
 URL: https://issues.apache.org/jira/browse/HDFS-13094
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Bharat Viswanadham
Assignee: Bharat Viswanadham


This Jira is created from a review comment by [~arpitagarwal] in HDFS-13062:
We have used this testName to add testcase-specific behavior in the past, but 
it is fragile.

Perhaps we should open a separate Jira to move this behavior into 
testcase-specific init routines, using a test base class and derived classes 
for individual unit tests.






[jira] [Commented] (HDFS-13062) Provide support for JN to use separate journal disk per namespace

2018-01-31 Thread Bharat Viswanadham (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347375#comment-16347375
 ] 

Bharat Viswanadham commented on HDFS-13062:
---

Hi [~hanishakoneru]

Thanks for review.

Addressed review comments in patch v05.
 * 
{quote}In {{setConf()}}, why are we getting the nameserviceIds from the config 
key {{DFS_INTERNAL_NAMESERVICES_KEY}} before {{DFS_NAMESERVICES}}? IIUC from 
HDFS-6376, which introduced the internal nameservices key, it is meant for 
datanodes to distinguish between which nameservices to connect to. JournalNodes 
should not be using this configuration to deduce the nameservice Ids. Please 
correct me if I am wrong.{quote}If DFS_INTERNAL_NAMESERVICES_KEY is set, those 
nameservices belong to the running cluster, whereas DFS_NAMESERVICES can list 
both the cluster's own nameservices and the external cluster nameservices it 
will connect to. That is the reason for checking DFS_INTERNAL_NAMESERVICES_KEY 
first.
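As an illustration of the distinction (a hypothetical hdfs-site.xml fragment; the nameservice IDs are made up):

```xml
<!-- dfs.nameservices may list local and external nameservices. -->
<property>
  <name>dfs.nameservices</name>
  <value>ns1,ns2,remote-ns</value>
</property>
<!-- dfs.internal.nameservices names only this cluster's nameservices,
     which is why it is consulted first when present. -->
<property>
  <name>dfs.internal.nameservices</name>
  <value>ns1,ns2</value>
</property>
```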

> Provide support for JN to use separate journal disk per namespace
> -
>
> Key: HDFS-13062
> URL: https://issues.apache.org/jira/browse/HDFS-13062
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDFS-13062.00.patch, HDFS-13062.01.patch, 
> HDFS-13062.02.patch, HDFS-13062.03.patch, HDFS-13062.04.patch, 
> HDFS-13062.05.patch
>
>
> In Federated HA setup, provide support for separate journal disk for each 
> namespace.






[jira] [Updated] (HDFS-13062) Provide support for JN to use separate journal disk per namespace

2018-01-31 Thread Bharat Viswanadham (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDFS-13062:
--
Attachment: HDFS-13062.05.patch

> Provide support for JN to use separate journal disk per namespace
> -
>
> Key: HDFS-13062
> URL: https://issues.apache.org/jira/browse/HDFS-13062
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDFS-13062.00.patch, HDFS-13062.01.patch, 
> HDFS-13062.02.patch, HDFS-13062.03.patch, HDFS-13062.04.patch, 
> HDFS-13062.05.patch
>
>
> In Federated HA setup, provide support for separate journal disk for each 
> namespace.






[jira] [Comment Edited] (HDFS-13056) Expose file-level composite CRCs in HDFS which are comparable across different instances/layouts

2018-01-31 Thread zhenzhao wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347281#comment-16347281
 ] 

zhenzhao wang edited comment on HDFS-13056 at 1/31/18 6:44 PM:
---

Thanks for the detailed info. Both adding a new method and modifying the 
existing protocol sound good to me.

Part of the reason I added a data checksum option at the top level is that 
CRC32/CRC32C is a more generic checksum method (name) that is easy to 
understand. E.g., if I want to copy a file from HDFS to GCS with distcp, the 
checksum type from GCS is CRC32C, and I would like to use the same checksum 
type/name to get a checksum from HDFS for verification. But I understand your 
concern too; as you said, it's difficult to come up with an entirely 
satisfactory approach. Both approaches make sense to me. I now have a patch to 
verify data integrity in distcp by specifying the source and target fs 
checksum types explicitly, and will modify it accordingly once this feature is 
complete.

As for the CRC, your approach is much faster. CRC(concat(A, B)) = 
CRC(concat(A, \{len(B) zero bytes})) ^ CRC(B). Shift-right is faster than the 
matrix approach for computing the concat(A, \{len(B) zeros}) term, though both 
are O(log(len(B))).

 

 


was (Author: wzzdreamer):
Thanks for the detailed info. Both adding new method and modifying existing 
protocol sounds good to me.

Part of the reason I added a data checksum option in top level is because 
CRC32/CRC32C is more generic checksum method (name) which is easy to 
understand. E.g. if I want copy a file from HDFS to GCS in distcp, the file 
checksum type or algorithm from GCS is CRC32C. And I hope I could use a same 
checksum type/name to get checksum from HDFS for verification. But I understand 
your concern too, as you said, it's difficult to come up with an entirely 
satisfactory approach. Both approach make sense to me. Now I got a patch to 
verify the data integrity in distcp by specifying the source and target fs 
checksum type explicitly, will modify it according once this feature is 
accomplished.

As for the CRC, your approach is much faster. CRC(concatenate(A, B)) = 
CRC(concatenate(A, \{length of B}))^CRC(B). Shift-right is faster than the 
matrix approach while calculating concatenate(A, \{length of B}) though the 
complexity are all O Log(\{length of B}).

 

 

> Expose file-level composite CRCs in HDFS which are comparable across 
> different instances/layouts
> 
>
> Key: HDFS-13056
> URL: https://issues.apache.org/jira/browse/HDFS-13056
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, distcp, erasure-coding, federation, hdfs
>Affects Versions: 3.0.0
>Reporter: Dennis Huo
>Priority: Major
> Attachments: HDFS-13056-branch-2.8.001.patch, 
> HDFS-13056-branch-2.8.poc1.patch, Reference_only_zhen_PPOC_hadoop2.6.X.diff, 
> hdfs-file-composite-crc32-v1.pdf
>
>
> FileChecksum was first introduced in 
> [https://issues-test.apache.org/jira/browse/HADOOP-3981] and ever since then 
> has remained defined as MD5-of-MD5-of-CRC, where per-512-byte chunk CRCs are 
> already stored as part of datanode metadata, and the MD5 approach is used to 
> compute an aggregate value in a distributed manner, with individual datanodes 
> computing the MD5-of-CRCs per-block in parallel, and the HDFS client 
> computing the second-level MD5.
>  
> A shortcoming of this approach which is often brought up is the fact that 
> this FileChecksum is sensitive to the internal block-size and chunk-size 
> configuration, and thus different HDFS files with different block/chunk 
> settings cannot be compared. More commonly, one might have different HDFS 
> clusters which use different block sizes, in which case any data migration 
> won't be able to use the FileChecksum for distcp's rsync functionality or for 
> verifying end-to-end data integrity (on top of low-level data integrity 
> checks applied at data transfer time).
>  
> This was also revisited in https://issues.apache.org/jira/browse/HDFS-8430 
> during the addition of checksum support for striped erasure-coded files; 
> while there was some discussion of using CRC composability, it still 
> ultimately settled on hierarchical MD5 approach, which also adds the problem 
> that checksums of basic replicated files are not comparable to striped files.
>  
> This feature proposes to add a "COMPOSITE-CRC" FileChecksum type which uses 
> CRC composition to remain completely chunk/block agnostic, and allows 
> comparison between striped vs replicated files, between different HDFS 
> instances, and possibly even between HDFS and other external storage systems. 
> This feature can also be added in-place to be compatible with existing block 
> metadata, and doesn't need to change the normal path of chunk verification, 
> so is minimally invasive. This also means even large preexisting HDFS 
> deployments could adopt this feature to retroactively sync data. A detailed 
> design document can be found here: 
> https://storage.googleapis.com/dennishuo/hdfs-file-composite-crc32-v1.pdf

[jira] [Assigned] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2018-01-31 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reassigned HDFS-10453:


Assignee: He Xiaoqiao

> ReplicationMonitor thread could stuck for long time due to the race between 
> replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-10453-branch-2.001.patch, 
> HDFS-10453-branch-2.003.patch, HDFS-10453-branch-2.7.004.patch, 
> HDFS-10453-branch-2.7.005.patch, HDFS-10453.001.patch
>
>
> The ReplicationMonitor thread can get stuck for a long time and, with low 
> probability, lose data. Consider the typical scenario:
> (1) create and close a file with the default replicas (3);
> (2) increase the replication (to 10) of the file;
> (3) delete the file while ReplicationMonitor is scheduling blocks belonging 
> to that file for replication.
> When the ReplicationMonitor gets stuck, the NameNode prints logs such as:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough 
> replicas: expected size is 7 but only 0 storage types can be selected 
> (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, 
> DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) All required storage types are unavailable:  
> unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This is because two threads (#NameNodeRpcServer and #ReplicationMonitor) 
> process the same block at the same moment:
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets blocks to 
> replicate and leaves the global lock.
> (2) FSNamesystem#delete is invoked to delete the blocks and then clears the 
> references in the blocksmap, neededReplications, etc. The block's NumBytes is 
> set to NO_ACK (Long.MAX_VALUE), which indicates that the block deletion does 
> not need an explicit ACK from the node.
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to 
> chooseTargets for the same blocks, and no node is selected after traversing 
> the whole cluster because no candidate satisfies the goodness criteria (the 
> remaining space must reach the required size Long.MAX_VALUE).
> During stage #3 the ReplicationMonitor is stuck for a long time, especially 
> in a large cluster. invalidateBlocks and neededReplications keep growing with 
> no consumers; in the worst case data is lost.
> This can mostly be avoided by skipping chooseTarget for BlockCommand.NO_ACK 
> blocks and removing them from neededReplications.
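The proposed skip can be sketched as follows (the NO_ACK constant mirrors BlockCommand.NO_ACK, but the list of block sizes is a simplified stand-in for the real NameNode data structures, and the helper name is made up):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/**
 * Sketch of the proposed fix: blocks whose NumBytes was set to NO_ACK by
 * deletion are removed from the replication queue instead of being handed to
 * chooseTarget, which could never find a node with Long.MAX_VALUE of
 * remaining space.
 */
public class NoAckSkipSketch {
    static final long NO_ACK = Long.MAX_VALUE; // mirrors BlockCommand.NO_ACK

    /** Drops deleted (NO_ACK) entries in place; returns the blocks to schedule. */
    static List<Long> filterNeededReplications(List<Long> neededReplications) {
        List<Long> toSchedule = new ArrayList<>();
        for (Iterator<Long> it = neededReplications.iterator(); it.hasNext(); ) {
            long numBytes = it.next();
            if (numBytes == NO_ACK) {
                it.remove(); // remove the deleted block from neededReplications
                continue;    // and skip chooseTarget for it
            }
            toSchedule.add(numBytes);
        }
        return toSchedule;
    }

    public static void main(String[] args) {
        List<Long> needed = new ArrayList<>(List.of(134217728L, NO_ACK, 67108864L));
        System.out.println(filterNeededReplications(needed)); // [134217728, 67108864]
        System.out.println(needed); // the deleted entry is gone from the queue
    }
}
```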






[jira] [Comment Edited] (HDFS-13060) Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver

2018-01-31 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347201#comment-16347201
 ] 

Ajay Kumar edited comment on HDFS-13060 at 1/31/18 6:08 PM:


[~xyao], thanks for review. Updated patch v2 to address suggestions. Created 
[HDFS-13090] to support composite trusted channel resolver.


was (Author: ajayydv):
[~xyao], thanks for review. Updated patch v2 to address suggestions.

> Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver
> 
>
> Key: HDFS-13060
> URL: https://issues.apache.org/jira/browse/HDFS-13060
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-13060.000.patch, HDFS-13060.001.patch, 
> HDFS-13060.002.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both client and server side. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted even if 
> dfs.encrypt.data.transfer is set to true. 
> The default trust channel resolver implementation returns false, indicating 
> that the channel is not trusted, which always enables encryption. HDFS-5910 
> also added a built-in whitelist-based trust channel resolver. It allows you 
> to put the IP address/network mask of trusted clients/servers in whitelist 
> files to skip encryption for certain traffic. 
> This ticket is opened to add a blacklist-based trust channel resolver for 
> cases where only certain machines (IPs) are untrusted, without adding each 
> trusted IP individually.
>   






[jira] [Commented] (HDFS-13056) Expose file-level composite CRCs in HDFS which are comparable across different instances/layouts

2018-01-31 Thread zhenzhao wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347281#comment-16347281
 ] 

zhenzhao wang commented on HDFS-13056:
--

Thanks for the detailed info. Both adding a new method and modifying the 
existing protocol sound good to me.

Part of the reason I added a data checksum option at the top level is that 
CRC32/CRC32C is a more generic checksum method (name) that is easy to 
understand. E.g., if I want to copy a file from HDFS to GCS with distcp, the 
checksum type from GCS is CRC32C, and I would like to use the same checksum 
type/name to get a checksum from HDFS for verification. But I understand your 
concern too; as you said, it's difficult to come up with an entirely 
satisfactory approach. Both approaches make sense to me. I now have a patch to 
verify data integrity in distcp by specifying the source and target fs 
checksum types explicitly, and will modify it accordingly once this feature is 
complete.

As for the CRC, your approach is much faster. CRC(concat(A, B)) = 
CRC(concat(A, \{len(B) zero bytes})) ^ CRC(B). Shift-right is faster than the 
matrix approach for computing the concat(A, \{len(B) zeros}) term, though both 
are O(log(len(B))).
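The composability being discussed can be illustrated with a Java port of zlib's crc32_combine (a sketch for this thread, not the HDFS patch itself): CRC(A || B) is computed from CRC(A), CRC(B), and len(B) alone, by applying the "append one zero byte" operator len(B) times via GF(2) matrix squaring.

```java
import java.util.zip.CRC32;

/** Combine two CRC-32 values without re-reading the underlying data. */
public class Crc32Combine {

    // Multiply a 32-bit vector by a 32x32 matrix over GF(2).
    static long gf2MatrixTimes(long[] mat, long vec) {
        long sum = 0;
        int i = 0;
        while (vec != 0) {
            if ((vec & 1) != 0) sum ^= mat[i];
            vec >>>= 1;
            i++;
        }
        return sum;
    }

    // Square a matrix over GF(2): square = mat * mat.
    static void gf2MatrixSquare(long[] square, long[] mat) {
        for (int i = 0; i < 32; i++) square[i] = gf2MatrixTimes(mat, mat[i]);
    }

    /** Combine crc1 = CRC(A) with crc2 = CRC(B), where len2 = length of B in bytes. */
    static long combine(long crc1, long crc2, long len2) {
        if (len2 <= 0) return crc1;
        long[] even = new long[32];
        long[] odd = new long[32];
        odd[0] = 0xedb88320L; // reflected CRC-32 polynomial: operator for one zero bit
        long row = 1;
        for (int n = 1; n < 32; n++) { odd[n] = row; row <<= 1; }
        gf2MatrixSquare(even, odd); // operator for two zero bits
        gf2MatrixSquare(odd, even); // operator for four zero bits
        // Apply zero-byte operators for each set bit of len2 (zero-extend A by len2 bytes).
        do {
            gf2MatrixSquare(even, odd);
            if ((len2 & 1) != 0) crc1 = gf2MatrixTimes(even, crc1);
            len2 >>= 1;
            if (len2 == 0) break;
            gf2MatrixSquare(odd, even);
            if ((len2 & 1) != 0) crc1 = gf2MatrixTimes(odd, crc1);
            len2 >>= 1;
        } while (len2 != 0);
        return crc1 ^ crc2;
    }

    public static void main(String[] args) {
        byte[] a = "hello".getBytes();
        byte[] b = "world".getBytes();
        CRC32 ca = new CRC32(); ca.update(a);
        CRC32 cb = new CRC32(); cb.update(b);
        CRC32 cab = new CRC32(); cab.update(a); cab.update(b);
        long combined = combine(ca.getValue(), cb.getValue(), b.length);
        System.out.println(combined == cab.getValue()); // should print true
    }
}
```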

 

 

> Expose file-level composite CRCs in HDFS which are comparable across 
> different instances/layouts
> 
>
> Key: HDFS-13056
> URL: https://issues.apache.org/jira/browse/HDFS-13056
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, distcp, erasure-coding, federation, hdfs
>Affects Versions: 3.0.0
>Reporter: Dennis Huo
>Priority: Major
> Attachments: HDFS-13056-branch-2.8.001.patch, 
> HDFS-13056-branch-2.8.poc1.patch, Reference_only_zhen_PPOC_hadoop2.6.X.diff, 
> hdfs-file-composite-crc32-v1.pdf
>
>
> FileChecksum was first introduced in 
> [https://issues-test.apache.org/jira/browse/HADOOP-3981] and ever since then 
> has remained defined as MD5-of-MD5-of-CRC, where per-512-byte chunk CRCs are 
> already stored as part of datanode metadata, and the MD5 approach is used to 
> compute an aggregate value in a distributed manner, with individual datanodes 
> computing the MD5-of-CRCs per-block in parallel, and the HDFS client 
> computing the second-level MD5.
>  
> A shortcoming of this approach which is often brought up is the fact that 
> this FileChecksum is sensitive to the internal block-size and chunk-size 
> configuration, and thus different HDFS files with different block/chunk 
> settings cannot be compared. More commonly, one might have different HDFS 
> clusters which use different block sizes, in which case any data migration 
> won't be able to use the FileChecksum for distcp's rsync functionality or for 
> verifying end-to-end data integrity (on top of low-level data integrity 
> checks applied at data transfer time).
>  
> This was also revisited in https://issues.apache.org/jira/browse/HDFS-8430 
> during the addition of checksum support for striped erasure-coded files; 
> while there was some discussion of using CRC composability, it still 
> ultimately settled on hierarchical MD5 approach, which also adds the problem 
> that checksums of basic replicated files are not comparable to striped files.
>  
> This feature proposes to add a "COMPOSITE-CRC" FileChecksum type which uses 
> CRC composition to remain completely chunk/block agnostic, and allows 
> comparison between striped vs replicated files, between different HDFS 
> instances, and possibly even between HDFS and other external storage systems. 
> This feature can also be added in-place to be compatible with existing block 
> metadata, and doesn't need to change the normal path of chunk verification, 
> so is minimally invasive. This also means even large preexisting HDFS 
> deployments could adopt this feature to retroactively sync data. A detailed 
> design document can be found here: 
> https://storage.googleapis.com/dennishuo/hdfs-file-composite-crc32-v1.pdf






[jira] [Comment Edited] (HDFS-13061) SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted channel

2018-01-31 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347274#comment-16347274
 ] 

Xiaoyu Yao edited comment on HDFS-13061 at 1/31/18 5:55 PM:


Thanks [~ajayydv] for the update. +1 for the v3 patch. 
The test failures are unrelated. I will commit it shortly.


was (Author: xyao):
Thanks [~ajayydv] for the update. +1 for the v4 patch. 
The test failures are unrelated. I will commit it shortly.

> SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted 
> channel
> -
>
> Key: HDFS-13061
> URL: https://issues.apache.org/jira/browse/HDFS-13061
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-13061.000.patch, HDFS-13061.001.patch, 
> HDFS-13061.002.patch, HDFS-13061.003.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both client and server side. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted even if 
> dfs.encrypt.data.transfer is set to true. 
> SaslDataTransferClient#checkTrustAndSend asks the channel resolver whether the 
> client and server addresses are trusted, respectively. It treats the channel 
> as untrusted, and therefore enforces encryption, only if both the client and 
> the server are untrusted. *This ticket is opened to change it to distrust (and 
> encrypt) the channel if either the client or the server address is untrusted.*






[jira] [Commented] (HDFS-13061) SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted channel

2018-01-31 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347274#comment-16347274
 ] 

Xiaoyu Yao commented on HDFS-13061:
---

Thanks [~ajayydv] for the update. +1 for the v4 patch. 
The test failures are unrelated. I will commit it shortly.

> SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted 
> channel
> -
>
> Key: HDFS-13061
> URL: https://issues.apache.org/jira/browse/HDFS-13061
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-13061.000.patch, HDFS-13061.001.patch, 
> HDFS-13061.002.patch, HDFS-13061.003.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both client and server side. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted even if 
> dfs.encrypt.data.transfer is set to true. 
> SaslDataTransferClient#checkTrustAndSend asks the channel resolver whether the 
> client and server addresses are trusted, respectively. It treats the channel 
> as untrusted, and therefore enforces encryption, only if both the client and 
> the server are untrusted. *This ticket is opened to change it to distrust (and 
> encrypt) the channel if either the client or the server address is untrusted.*






[jira] [Commented] (HDFS-12512) RBF: Add WebHDFS

2018-01-31 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-12512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347266#comment-16347266
 ] 

Íñigo Goiri commented on HDFS-12512:


 [^HDFS-12512.004.patch] looks good.
Waiting for Yetus to come back.

> RBF: Add WebHDFS
> 
>
> Key: HDFS-12512
> URL: https://issues.apache.org/jira/browse/HDFS-12512
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Íñigo Goiri
>Assignee: Wei Yan
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-12512.000.patch, HDFS-12512.001.patch, 
> HDFS-12512.002.patch, HDFS-12512.003.patch, HDFS-12512.004.patch
>
>
> The Router currently does not support WebHDFS. It needs to implement 
> something similar to {{NamenodeWebHdfsMethods}}.






[jira] [Commented] (HDFS-13068) RBF: Add router admin option to manage safe mode

2018-01-31 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-13068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347260#comment-16347260
 ] 

Íñigo Goiri commented on HDFS-13068:


[^HDFS-13068.002.patch] looks good other than the checkstyle issues.
 Minor comments:
 * In {{HDFSCommands.md}}, you have {{enter}} twice instead of {{leave}} in the 
table.
 * In {{HDFSRouterFederation.md}}
 ** Change {{There is a manuall way provided to manage Safe Mode for the 
Router.}} to {{There is a manual way to manage the Safe Mode for the Router.}}
 ** Change {{by following command}} to {{using the following command}}

> RBF: Add router admin option to manage safe mode
> 
>
> Key: HDFS-13068
> URL: https://issues.apache.org/jira/browse/HDFS-13068
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Yiqun Lin
>Priority: Major
> Attachments: HDFS-13068.001.patch, HDFS-13068.002.patch
>
>
> HDFS-13044 adds a safe mode to reject requests. We should have an option to 
> manually set the Router into safe mode.






[jira] [Updated] (HDFS-13043) RBF: Expose the state of the Routers in the federation

2018-01-31 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HDFS-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-13043:
---
Attachment: HDFS-13043.008.patch

> RBF: Expose the state of the Routers in the federation
> --
>
> Key: HDFS-13043
> URL: https://issues.apache.org/jira/browse/HDFS-13043
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-13043.000.patch, HDFS-13043.001.patch, 
> HDFS-13043.002.patch, HDFS-13043.003.patch, HDFS-13043.004.patch, 
> HDFS-13043.005.patch, HDFS-13043.006.patch, HDFS-13043.007.patch, 
> HDFS-13043.008.patch, router-info.png
>
>
> The Router should expose the state of the other Routers in the federation 
> through a user UI.






[jira] [Updated] (HDFS-12512) RBF: Add WebHDFS

2018-01-31 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated HDFS-12512:
---
Attachment: HDFS-12512.004.patch

> RBF: Add WebHDFS
> 
>
> Key: HDFS-12512
> URL: https://issues.apache.org/jira/browse/HDFS-12512
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Íñigo Goiri
>Assignee: Wei Yan
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-12512.000.patch, HDFS-12512.001.patch, 
> HDFS-12512.002.patch, HDFS-12512.003.patch, HDFS-12512.004.patch
>
>
> The Router currently does not support WebHDFS. It needs to implement 
> something similar to {{NamenodeWebHdfsMethods}}.






[jira] [Updated] (HDFS-13044) RBF: Add a safe mode for the Router

2018-01-31 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HDFS-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-13044:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.1
   2.9.1
   2.10.0
   3.1.0
   Status: Resolved  (was: Patch Available)

Thanks [~linyiqun] for the review. I committed this to {{branch-3.0}}, 
{{branch-2}}, and {{branch-2.9}}. [~linyiqun] had already committed it to {{trunk}}.

> RBF: Add a safe mode for the Router
> ---
>
> Key: HDFS-13044
> URL: https://issues.apache.org/jira/browse/HDFS-13044
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1
>
> Attachments: HDFS-13004.000.patch, HDFS-13044-branch-3.0.000.patch, 
> HDFS-13044.001.patch, HDFS-13044.002.patch, HDFS-13044.003.patch, 
> HDFS-13044.004.patch, HDFS-13044.005.patch
>
>
> When a Router cannot communicate with the State Store, it should enter into a 
> safe mode that disallows certain operations.






[jira] [Commented] (HDFS-13060) Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver

2018-01-31 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347201#comment-16347201
 ] 

Ajay Kumar commented on HDFS-13060:
---

[~xyao], thanks for the review. Updated patch v2 to address your suggestions.

> Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver
> 
>
> Key: HDFS-13060
> URL: https://issues.apache.org/jira/browse/HDFS-13060
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-13060.000.patch, HDFS-13060.001.patch, 
> HDFS-13060.002.patch
>
>
> HDFS-5910 introduces encryption negotiation between the client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both the client and server side. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted, even if 
> dfs.encrypt.data.transfer is set to true. 
> The default trust channel resolver implementation returns false, indicating 
> that the channel is not trusted, which always enables encryption. HDFS-5910 
> also added a built-in whitelist-based trust channel resolver. It allows you 
> to put the IP address/network mask of trusted clients/servers in whitelist files to 
> skip encryption for certain traffic. 
> This ticket is opened to add a blacklist-based trust channel resolver for 
> cases where only certain machines (IPs) are untrusted, without having to add each 
> trusted IP individually.
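The blacklist idea described above can be sketched roughly as follows. This is an illustrative, hypothetical class (the names and constructor are assumptions, not the actual patch's API, which extends Hadoop's TrustedChannelResolver and loads addresses from files): the channel is trusted, and encryption skipped, unless the peer appears in the blacklist.

```java
import java.net.InetAddress;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of a blacklist-based trust channel resolver.
// Unlike the whitelist resolver (trust only listed peers), this one
// trusts every peer EXCEPT those explicitly blacklisted.
public class BlacklistResolverSketch {
  private final Set<String> blacklist;

  public BlacklistResolverSketch(String... untrustedIps) {
    this.blacklist = new HashSet<>(Arrays.asList(untrustedIps));
  }

  // Trusted (encryption skipped) unless the peer IP is blacklisted.
  public boolean isTrusted(String peerIp) {
    return !blacklist.contains(peerIp);
  }

  public boolean isTrusted(InetAddress peer) {
    return isTrusted(peer.getHostAddress());
  }

  public static void main(String[] args) {
    BlacklistResolverSketch r = new BlacklistResolverSketch("10.0.0.5");
    System.out.println(r.isTrusted("10.0.0.5")); // false: blacklisted, so encrypt
    System.out.println(r.isTrusted("10.0.0.6")); // true: not listed, skip encryption
  }
}
```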






[jira] [Updated] (HDFS-13060) Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver

2018-01-31 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDFS-13060:
--
Attachment: HDFS-13060.002.patch

> Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver
> 
>
> Key: HDFS-13060
> URL: https://issues.apache.org/jira/browse/HDFS-13060
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-13060.000.patch, HDFS-13060.001.patch, 
> HDFS-13060.002.patch
>
>
> HDFS-5910 introduces encryption negotiation between the client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both the client and server side. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted, even if 
> dfs.encrypt.data.transfer is set to true. 
> The default trust channel resolver implementation returns false, indicating 
> that the channel is not trusted, which always enables encryption. HDFS-5910 
> also added a built-in whitelist-based trust channel resolver. It allows you 
> to put the IP address/network mask of trusted clients/servers in whitelist files to 
> skip encryption for certain traffic. 
> This ticket is opened to add a blacklist-based trust channel resolver for 
> cases where only certain machines (IPs) are untrusted, without having to add each 
> trusted IP individually.






[jira] [Commented] (HDFS-11187) Optimize disk access for last partial chunk checksum of Finalized replica

2018-01-31 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346967#comment-16346967
 ] 

Kihwal Lee commented on HDFS-11187:
---

bq.  Not sure why yetus said it failed tests.
Some tests failed to terminate normally, so Surefire did not report any 
failures, but Maven did. Look at the output.
 
[https://builds.apache.org/job/PreCommit-HDFS-Build/22888/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]

> Optimize disk access for last partial chunk checksum of Finalized replica
> -
>
> Key: HDFS-11187
> URL: https://issues.apache.org/jira/browse/HDFS-11187
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HDFS-11187.001.patch, HDFS-11187.002.patch, 
> HDFS-11187.003.patch, HDFS-11187.004.patch, HDFS-11187.005.patch
>
>
> The patch at HDFS-11160 ensures BlockSender reads the correct version of the 
> metafile when there are concurrent writers.
> However, the implementation is not optimal, because it must always read the 
> last partial chunk checksum from disk while holding the FsDatasetImpl lock for 
> every reader. It is possible to optimize this by keeping an up-to-date 
> version of the last partial checksum in memory and reducing disk access.
> I am separating the optimization into a new jira, because maintaining the 
> state of the in-memory checksum requires a lot more work.
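The in-memory optimization described above could look roughly like the following. This is a hypothetical sketch (class and method names are illustrative, not the actual patch): the finalized replica caches the last partial chunk checksum so readers hit memory instead of rereading the meta file under the FsDatasetImpl lock.

```java
// Hypothetical sketch: cache the last partial chunk checksum on the
// finalized replica so BlockSender does not have to reread the on-disk
// meta file for every reader. Names are illustrative, not the patch's API.
public class FinalizedReplicaSketch {
  private byte[] lastPartialChunkChecksum;  // null until cached

  // Called when the replica is finalized or the partial chunk changes.
  public synchronized void setLastPartialChunkChecksum(byte[] checksum) {
    this.lastPartialChunkChecksum = (checksum == null) ? null : checksum.clone();
  }

  // Readers get the cached value; a null return means a cache miss,
  // in which case the caller would fall back to reading the meta file.
  public synchronized byte[] getLastPartialChunkChecksum() {
    return (lastPartialChunkChecksum == null)
        ? null : lastPartialChunkChecksum.clone();
  }

  public static void main(String[] args) {
    FinalizedReplicaSketch replica = new FinalizedReplicaSketch();
    replica.setLastPartialChunkChecksum(new byte[] {1, 2, 3, 4});
    System.out.println(replica.getLastPartialChunkChecksum().length); // 4
  }
}
```

The tricky part, as the description notes, is keeping this cached state consistent while a concurrent writer is still appending to the block.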






[jira] [Commented] (HDFS-11187) Optimize disk access for last partial chunk checksum of Finalized replica

2018-01-31 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346942#comment-16346942
 ] 

Wei-Chiu Chuang commented on HDFS-11187:


The latest patch actually passed all tests. I am not sure why Yetus reported 
test failures.

[~yzhangal], could you please review the patch? Thank you.

> Optimize disk access for last partial chunk checksum of Finalized replica
> -
>
> Key: HDFS-11187
> URL: https://issues.apache.org/jira/browse/HDFS-11187
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HDFS-11187.001.patch, HDFS-11187.002.patch, 
> HDFS-11187.003.patch, HDFS-11187.004.patch, HDFS-11187.005.patch
>
>
> The patch at HDFS-11160 ensures BlockSender reads the correct version of the 
> metafile when there are concurrent writers.
> However, the implementation is not optimal, because it must always read the 
> last partial chunk checksum from disk while holding the FsDatasetImpl lock for 
> every reader. It is possible to optimize this by keeping an up-to-date 
> version of the last partial checksum in memory and reducing disk access.
> I am separating the optimization into a new jira, because maintaining the 
> state of the in-memory checksum requires a lot more work.






[jira] [Commented] (HDFS-12504) Ozone: Improve SQLCLI performance

2018-01-31 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346858#comment-16346858
 ] 

genericqa commented on HDFS-12504:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 10m 
22s{color} | {color:red} Docker failed to build yetus/hadoop:d11161b. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-12504 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12891263/HDFS-12504-HDFS-7240.001.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22904/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Ozone: Improve SQLCLI performance
> -
>
> Key: HDFS-12504
> URL: https://issues.apache.org/jira/browse/HDFS-12504
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Weiwei Yang
>Assignee: Yuanbo Liu
>Priority: Major
>  Labels: performance
> Attachments: HDFS-12504-HDFS-7240.001.patch
>
>
> In my test, my {{ksm.db}} has *3017660* entries with a total size of *128 MB*; 
> the SQLCLI tool runs for over *2 hours* but still does not finish exporting the 
> DB. This is because it iterates over each entry and inserts it into another 
> SQLite DB file one at a time, which is not efficient. We need to improve this 
> so it runs more efficiently on large DB files.
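One common way to speed up this kind of export is to buffer rows and write them in large batches (e.g. one transaction per batch) instead of one insert per entry. A minimal, hypothetical sketch of that pattern, with the actual SQLite write abstracted behind a callback since the exact SQLCLI internals are not shown here:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Illustrative sketch only: buffer entries and flush them in large batches
// (each flush would map to one JDBC executeBatch + commit), rather than
// issuing one INSERT and one implicit transaction per DB entry.
public class BatchWriter<T> {
  private final int batchSize;
  private final Consumer<List<T>> flushFn;  // e.g. batched SQLite insert
  private final List<T> buffer = new ArrayList<>();
  private int flushCount = 0;

  public BatchWriter(int batchSize, Consumer<List<T>> flushFn) {
    this.batchSize = batchSize;
    this.flushFn = flushFn;
  }

  public void add(T row) {
    buffer.add(row);
    if (buffer.size() >= batchSize) {
      flush();
    }
  }

  public void flush() {
    if (!buffer.isEmpty()) {
      flushFn.accept(new ArrayList<>(buffer));
      buffer.clear();
      flushCount++;
    }
  }

  public int getFlushCount() { return flushCount; }

  public static void main(String[] args) {
    BatchWriter<String> w = new BatchWriter<>(1000, batch -> { /* write batch */ });
    for (int i = 0; i < 3017; i++) {
      w.add("entry-" + i);
    }
    w.flush();  // flush the trailing partial batch
    System.out.println(w.getFlushCount());  // 4 flushes instead of 3017 inserts
  }
}
```

With ~3 million entries, collapsing per-entry commits into a few thousand batched transactions is typically where most of the time is recovered.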






[jira] [Assigned] (HDFS-12504) Ozone: Improve SQLCLI performance

2018-01-31 Thread Lokesh Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain reassigned HDFS-12504:
--

Assignee: Yuanbo Liu  (was: Weiwei Yang)

> Ozone: Improve SQLCLI performance
> -
>
> Key: HDFS-12504
> URL: https://issues.apache.org/jira/browse/HDFS-12504
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Weiwei Yang
>Assignee: Yuanbo Liu
>Priority: Major
>  Labels: performance
> Attachments: HDFS-12504-HDFS-7240.001.patch
>
>
> In my test, my {{ksm.db}} has *3017660* entries with a total size of *128 MB*; 
> the SQLCLI tool runs for over *2 hours* but still does not finish exporting the 
> DB. This is because it iterates over each entry and inserts it into another 
> SQLite DB file one at a time, which is not efficient. We need to improve this 
> so it runs more efficiently on large DB files.





