[jira] [Comment Edited] (HDFS-13915) replace datanode failed because of NameNodeRpcServer#getAdditionalDatanode returning excessive datanodeInfo

2018-11-14 Thread Jiandan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686177#comment-16686177
 ] 

Jiandan Yang  edited comment on HDFS-13915 at 11/14/18 8:01 AM:


I added a test case in [^HDFS-13915.001.patch], based on trunk, to reproduce the issue.
Hi [~szetszwo], BlockStoragePolicy#chooseStorageTypes may return excess
storage types, and I do not understand why after looking through the related code.
Can we remove the excess storage types?

{code:java}
if (storageTypes.size() < expectedSize) {
  LOG.warn("Failed to place enough replicas: expected size is {}"
  + " but only {} storage types can be selected (replication={},"
  + " selected={}, unavailable={}" + ", removed={}" + ", policy={}"
  + ")", expectedSize, storageTypes.size(), replication, storageTypes,
  unavailables, removed, this);
} else if (storageTypes.size() > expectedSize) {
  // should remove excess storageType to return expectedSize storageType
  int storageTypesSize = storageTypes.size();
  int excessiveStorageTypeNum = storageTypesSize - expectedSize;
  for (int i = 0; i < excessiveStorageTypeNum; i++) {
    storageTypes.remove(storageTypesSize - 1 - i);
  }
}
{code}
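
The trimming step proposed above can be tried outside Hadoop. The following is only a minimal sketch: ExcessTrimSketch and trimToSize are hypothetical names, and plain strings stand in for StorageType values. It drops the tail of the list in one call through a subList view, which has the same effect as the index-based loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone sketch; not part of BlockStoragePolicy.
final class ExcessTrimSketch {

  /** Removes trailing elements so that at most expectedSize remain. */
  static <T> void trimToSize(List<T> items, int expectedSize) {
    if (items.size() > expectedSize) {
      // subList returns a view, so clearing it removes the tail
      // from the backing list.
      items.subList(expectedSize, items.size()).clear();
    }
  }

  public static void main(String[] args) {
    // Strings stand in for StorageType values (assumption for brevity).
    List<String> storageTypes =
        new ArrayList<>(Arrays.asList("SSD", "SSD"));
    trimToSize(storageTypes, 1);
    System.out.println(storageTypes); // prints [SSD]
  }
}
```

Whether dropping the last entries is the right choice for every policy is a separate question; the sketch only shows the mechanics of the trim.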




was (Author: yangjiandan):
I added a test case in [^HDFS-13915.001.patch], based on trunk, to reproduce the issue.
Hi [~szetszwo], BlockStoragePolicy#chooseStorageTypes may return excess
storage types, and I do not understand why after looking through the related code.
Can we remove the excess storage types?

{code:java}
if (storageTypes.size() < expectedSize) {
  LOG.warn("Failed to place enough replicas: expected size is {}"
  + " but only {} storage types can be selected (replication={},"
  + " selected={}, unavailable={}" + ", removed={}" + ", policy={}"
  + ")", expectedSize, storageTypes.size(), replication, storageTypes,
  unavailables, removed, this);
} else if (storageTypes.size() > expectedSize) {
  // should remove excess storageType to return expectedSize storageType
}
{code}



> replace datanode failed because of  NameNodeRpcServer#getAdditionalDatanode 
> returning excessive datanodeInfo
> 
>
> Key: HDFS-13915
> URL: https://issues.apache.org/jira/browse/HDFS-13915
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
> Environment: 
>Reporter: Jiandan Yang 
>Priority: Major
>

[jira] [Updated] (HDFS-13915) replace datanode failed because of NameNodeRpcServer#getAdditionalDatanode returning excessive datanodeInfo

2018-11-14 Thread Jiandan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiandan Yang  updated HDFS-13915:
-
Attachment: HDFS-13915.001.patch

> replace datanode failed because of  NameNodeRpcServer#getAdditionalDatanode 
> returning excessive datanodeInfo
> 
>
> Key: HDFS-13915
> URL: https://issues.apache.org/jira/browse/HDFS-13915
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
> Environment: 
>Reporter: Jiandan Yang 
>Priority: Major
> Attachments: HDFS-13915.001.patch
>
>
> Consider the following situation:
> 1. Create a file with the ALLSSD policy.
> 2. [SSD,SSD,DISK] is returned due to a lack of SSD space.
> 3. The client calls NameNodeRpcServer#getAdditionalDatanode when recovering
> the write pipeline and replacing a bad datanode.
> 4. BlockPlacementPolicyDefault#chooseTarget calls
> StoragePolicy#chooseStorageTypes(3, [SSD,DISK], none, false), but
> chooseStorageTypes returns [SSD,SSD].
> {code:java}
>   @Test
>   public void testAllSSDFallbackAndNonNewBlock() {
>     final BlockStoragePolicy allSSD = POLICY_SUITE.getPolicy(ALLSSD);
>     List<StorageType> storageTypes = allSSD.chooseStorageTypes((short) 3,
>         Arrays.asList(StorageType.DISK, StorageType.SSD),
>         EnumSet.noneOf(StorageType.class), false);
>     assertEquals(2, storageTypes.size());
>     assertEquals(StorageType.SSD, storageTypes.get(0));
>     assertEquals(StorageType.SSD, storageTypes.get(1));
>   }
> {code}
> 5. numOfReplicas = requiredStorageTypes.size() sets numOfReplicas to 2, so
> two additional datanodes are chosen.
> 6. BlockPlacementPolicyDefault#chooseTarget returns four datanodes to the client.
> 7. DataStreamer#findNewDatanode finds nodes.length != original.length + 1 and
> throws an IOException, which finally causes the write to fail.
> {code:java}
>   private int findNewDatanode(final DatanodeInfo[] original
>       ) throws IOException {
>     if (nodes.length != original.length + 1) {
>       throw new IOException(
>           "Failed to replace a bad datanode on the existing pipeline "
>           + "due to no more good datanodes being available to try. "
>           + "(Nodes: current=" + Arrays.asList(nodes)
>           + ", original=" + Arrays.asList(original) + "). "
>           + "The current failed datanode replacement policy is "
>           + dfsClient.dtpReplaceDatanodeOnFailure
>           + ", and a client may configure this via '"
>           + BlockWrite.ReplaceDatanodeOnFailure.POLICY_KEY
>           + "' in its configuration.");
>     }
>     for(int i = 0; i < nodes.length; i++) {
>       int j = 0;
>       for(; j < original.length && !nodes[i].equals(original[j]); j++);
>       if (j == original.length) {
>         return i;
>       }
>     }
>     throw new IOException("Failed: new datanode not found: nodes="
>         + Arrays.asList(nodes) + ", original=" + Arrays.asList(original));
>   }
> {code}
> The client warning log is:
>  {code:java}
> WARN [DataStreamer for file 
> /home/yarn/opensearch/in/data/120141286/0_65535/table/ucs_process/MANIFEST-093545
>  block BP-1742758844-11.138.8.184-1483707043031:blk_7086344902_6012765313] 
> org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception
> java.io.IOException: Failed to replace a bad datanode on the existing 
> pipeline due to no more good datanodes being available to try. (Nodes: 
> current=[DatanodeInfoWithStorage[11.138.5.4:50010,DS-04826cfc-1885-4213-a58b-8606845c5c42,SSD],
>  
> DatanodeInfoWithStorage[11.138.5.9:50010,DS-f6d8eb8b-2550-474b-a692-c991d7a6f6b3,SSD],
>  
> DatanodeInfoWithStorage[11.138.5.153:50010,DS-f5d77ca0-6fe3-4523-8ca8-5af975f845b6,SSD],
>  
> DatanodeInfoWithStorage[11.138.9.156:50010,DS-0d15ea12-1bad--84f7-1a4917a1e194,DISK]],
>  
> original=[DatanodeInfoWithStorage[11.138.5.4:50010,DS-04826cfc-1885-4213-a58b-8606845c5c42,SSD],
>  
> DatanodeInfoWithStorage[11.138.9.156:50010,DS-0d15ea12-1bad--84f7-1a4917a1e194,DISK]]).
>  The current failed datanode replacement policy is DEFAULT, and a client may 
> configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
> {code}
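
The length check in step 7 can be reproduced in isolation. The sketch below is a simplified stand-in, not the DataStreamer code: PipelineReplacementSketch is a hypothetical class, datanodes are plain strings (dn1..dn4 are made-up names), and it throws IllegalStateException instead of IOException to stay short. It shows that a pipeline of two original nodes accepts exactly three returned nodes and rejects four, which matches the current/original lists in the warning log.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone sketch of the check in findNewDatanode.
final class PipelineReplacementSketch {

  /** Returns the index of the single new node, or throws if the count is off. */
  static int findNewDatanode(List<String> nodes, List<String> original) {
    // Exactly one node must have been added to the original pipeline.
    if (nodes.size() != original.size() + 1) {
      throw new IllegalStateException("expected " + (original.size() + 1)
          + " nodes but got " + nodes.size());
    }
    for (int i = 0; i < nodes.size(); i++) {
      if (!original.contains(nodes.get(i))) {
        return i;
      }
    }
    throw new IllegalStateException("new datanode not found");
  }

  public static void main(String[] args) {
    List<String> original = Arrays.asList("dn1", "dn4");

    // One replacement node: accepted, index of the new node is returned.
    System.out.println(
        findNewDatanode(Arrays.asList("dn1", "dn2", "dn4"), original));

    // Two additional nodes, as in the reported scenario: rejected.
    try {
      findNewDatanode(Arrays.asList("dn1", "dn2", "dn3", "dn4"), original);
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```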



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14064) WEBHDFS: Support Enable/Disable EC Policy

2018-11-14 Thread Ayush Saxena (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686188#comment-16686188
 ] 

Ayush Saxena commented on HDFS-14064:
-

The test failures are due to "unable to create new native thread" and are not
related to the patch; I just refactored the test from the previous v3.
Verified locally too.

> WEBHDFS: Support Enable/Disable EC Policy
> -
>
> Key: HDFS-14064
> URL: https://issues.apache.org/jira/browse/HDFS-14064
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-14064-01.patch, HDFS-14064-02.patch, 
> HDFS-14064-03.patch, HDFS-14064-04.patch, HDFS-14064-04.patch
>
>







[jira] [Updated] (HDFS-13915) replace datanode failed because of NameNodeRpcServer#getAdditionalDatanode returning excessive datanodeInfo

2018-11-14 Thread Jiandan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiandan Yang  updated HDFS-13915:
-
Assignee: Jiandan Yang 
  Status: Patch Available  (was: Open)

> replace datanode failed because of  NameNodeRpcServer#getAdditionalDatanode 
> returning excessive datanodeInfo
> 
>
> Key: HDFS-13915
> URL: https://issues.apache.org/jira/browse/HDFS-13915
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
> Environment: 
>Reporter: Jiandan Yang 
>Assignee: Jiandan Yang 
>Priority: Major
> Attachments: HDFS-13915.001.patch
>
>






[jira] [Updated] (HDFS-13915) replace datanode failed because of NameNodeRpcServer#getAdditionalDatanode returning excessive datanodeInfo

2018-11-14 Thread Jiandan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiandan Yang  updated HDFS-13915:
-
Attachment: HDFS-13915.001.patch

> replace datanode failed because of  NameNodeRpcServer#getAdditionalDatanode 
> returning excessive datanodeInfo
> 
>
> Key: HDFS-13915
> URL: https://issues.apache.org/jira/browse/HDFS-13915
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
> Environment: 
>Reporter: Jiandan Yang 
>Priority: Major
> Attachments: HDFS-13915.001.patch
>
>






[jira] [Updated] (HDFS-13915) replace datanode failed because of NameNodeRpcServer#getAdditionalDatanode returning excessive datanodeInfo

2018-11-14 Thread Jiandan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiandan Yang  updated HDFS-13915:
-
Attachment: (was: HDFS-13915.001.patch)

> replace datanode failed because of  NameNodeRpcServer#getAdditionalDatanode 
> returning excessive datanodeInfo
> 
>
> Key: HDFS-13915
> URL: https://issues.apache.org/jira/browse/HDFS-13915
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
> Environment: 
>Reporter: Jiandan Yang 
>Priority: Major
> Attachments: HDFS-13915.001.patch
>
>






[jira] [Updated] (HDDS-834) Datanode goes OOM based because of segment size

2018-11-14 Thread Mukul Kumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDDS-834:
---
Attachment: HDDS-834-ozone-0.3.001.patch

> Datanode goes OOM based because of segment size
> ---
>
> Key: HDDS-834
> URL: https://issues.apache.org/jira/browse/HDDS-834
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.3.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: HDDS-834-ozone-0.3.001.patch, HDDS-834.001.patch
>
>
> Currently the Ratis segment size is set to 1GB. After RATIS-253, the size of
> a WriteChunk payload is not counted towards the entry being written to the
> Raft log. This jira limits the segment size to 16KB, which ensures that the
> number of WriteChunk entries is limited to 64. This means that with a 16MB
> chunk size, the total data pending in the segment is 1GB.
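
The 1GB figure above follows from multiplying the entry limit by the chunk size. A small sketch of that arithmetic, using the 64-entry and 16MB values from the description; SegmentBudgetSketch and pendingBytes are hypothetical names, and how the 16KB segment size yields the 64-entry limit is not modeled here:

```java
// Hypothetical stand-alone sketch of the arithmetic in the description.
final class SegmentBudgetSketch {

  static final long MB = 1024L * 1024L;

  /** Upper bound on chunk data referenced by one segment. */
  static long pendingBytes(int writeChunkEntries, long chunkSizeBytes) {
    return writeChunkEntries * chunkSizeBytes;
  }

  public static void main(String[] args) {
    long pending = pendingBytes(64, 16 * MB);
    System.out.println(pending / MB + " MB"); // prints 1024 MB, i.e. 1GB
  }
}
```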






[jira] [Updated] (HDDS-801) Quasi close the container when close is not executed via Ratis

2018-11-14 Thread Nanda kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nanda kumar updated HDDS-801:
-
Attachment: HDDS-801.002.patch

> Quasi close the container when close is not executed via Ratis
> --
>
> Key: HDDS-801
> URL: https://issues.apache.org/jira/browse/HDDS-801
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Affects Versions: 0.3.0
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Attachments: HDDS-801.000.patch, HDDS-801.001.patch, 
> HDDS-801.002.patch
>
>
> When a datanode receives a CloseContainerCommand and the replication type is
> not RATIS, we should quasi-close the container. After quasi-closing the
> container, an ICR has to be sent to SCM.






[jira] [Commented] (HDDS-801) Quasi close the container when close is not executed via Ratis

2018-11-14 Thread Nanda kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686242#comment-16686242
 ] 

Nanda kumar commented on HDDS-801:
--

/cc [~jnp] [~arpitagarwal] [~msingh] [~hanishakoneru]

> Quasi close the container when close is not executed via Ratis
> --
>
> Key: HDDS-801
> URL: https://issues.apache.org/jira/browse/HDDS-801
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Affects Versions: 0.3.0
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Attachments: HDDS-801.000.patch, HDDS-801.001.patch, 
> HDDS-801.002.patch
>
>
> When a datanode receives a CloseContainerCommand and the replication type is
> not RATIS, we should quasi-close the container. After quasi-closing the
> container, an ICR has to be sent to SCM.






[jira] [Created] (HDFS-14077) DFSAdmin Report datanode Count was not matched when datanode in Decommissioned state

2018-11-14 Thread Harshakiran Reddy (JIRA)
Harshakiran Reddy created HDFS-14077:


 Summary: DFSAdmin  Report datanode Count was not matched when 
datanode in Decommissioned state
 Key: HDFS-14077
 URL: https://issues.apache.org/jira/browse/HDFS-14077
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 3.1.1
Reporter: Harshakiran Reddy


{noformat}
The live datanode count shown in the DFSAdmin report is incorrect when some
datanodes are in the Decommissioned state.
{noformat}






[jira] [Assigned] (HDFS-14077) DFSAdmin Report datanode Count was not matched when datanode in Decommissioned state

2018-11-14 Thread Ranith Sardar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ranith Sardar reassigned HDFS-14077:


Assignee: Ranith Sardar

> DFSAdmin  Report datanode Count was not matched when datanode in 
> Decommissioned state
> -
>
> Key: HDFS-14077
> URL: https://issues.apache.org/jira/browse/HDFS-14077
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.1.1
>Reporter: Harshakiran Reddy
>Assignee: Ranith Sardar
>Priority: Major
>
> {noformat}
> The live datanode count shown in the DFSAdmin report is incorrect when some
> datanodes are in the Decommissioned state.
> {noformat}






[jira] [Commented] (HDFS-14045) Use different metrics in DataNode to better measure latency of heartbeat/blockReports/incrementalBlockReports of Active/Standby NN

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686313#comment-16686313
 ] 

Hadoop QA commented on HDFS-14045:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 20s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  8m 47s{color} | {color:red} root in trunk failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  3m 43s{color} | {color:red} root in trunk failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 48s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 38s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 15s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 16s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 16s{color} | {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 54s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 60 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  1s{color} | {color:red} The patch 600 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 40s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  7m 50s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}139m 22s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 30s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}207m 57s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.snapshot.TestSnapshotRename |
|   | hadoop.hdfs.TestRollingUpgradeRollback |
|   | hadoop.hdfs.server.namenode.TestAuditLoggerWithCommands |
|   | hadoop.hdfs.server.namenode.TestReconstructStripedBlocks |
|   | hadoop.hdfs.TestModTime |
|   | hadoop.hdfs.TestDFSClientFailover |
|   | hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup |
|   | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
|   | hadoop.hdfs.TestDFSStripedOutputStream |
|   | hadoop.hdfs.server.blockmanagement.TestSequentialBlockGroupId |
|   | hadoop.hdfs.server.mover.TestMover |
|   | hadoop.hdfs.server.s

[jira] [Commented] (HDDS-834) Datanode goes OOM based because of segment size

2018-11-14 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686320#comment-16686320
 ] 

Shashikant Banerjee commented on HDDS-834:
--

Thanks [~msingh] for reporting and working on this. The patch looks good to me. 
I am +1 on the patch with some minor changes:

1) containerCommandCompletionMap renamed to applyTransactionCompletionMap in 
ContainerStateMachine.

2) Adding some more comments while adding dummy entries in 
applyTransactionCompletionMap.

I will take care of these while committing.

> Datanode goes OOM based because of segment size
> ---
>
> Key: HDDS-834
> URL: https://issues.apache.org/jira/browse/HDDS-834
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.3.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Attachments: HDDS-834-ozone-0.3.001.patch, HDDS-834.001.patch
>
>
> Currently ratis segment size is set to 1GB. After RATIS-253, the entry size 
> for a write Chunk is not  counted towards the entry being written to Raft Log.
> This jira controls the segment size to 16KB which makes sure that the number 
> of entries with WriteChunk is limited to 64. This means with 16MB chunk, the 
> total data pending in the segment is 1GB.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-834) Datanode goes OOM based because of segment size

2018-11-14 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-834:
-
   Resolution: Fixed
Fix Version/s: 0.4.0
   0.3.0
   Status: Resolved  (was: Patch Available)

Thanks [~msingh] for working on this. I have committed this change to trunk as 
well as ozone-0.3.

> Datanode goes OOM based because of segment size
> ---
>
> Key: HDDS-834
> URL: https://issues.apache.org/jira/browse/HDDS-834
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.3.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.3.0, 0.4.0
>
> Attachments: HDDS-834-ozone-0.3.001.patch, HDDS-834.001.patch
>
>
> Currently ratis segment size is set to 1GB. After RATIS-253, the entry size 
> for a write Chunk is not  counted towards the entry being written to Raft Log.
> This jira controls the segment size to 16KB which makes sure that the number 
> of entries with WriteChunk is limited to 64. This means with 16MB chunk, the 
> total data pending in the segment is 1GB.






[jira] [Created] (HDFS-14078) Admin helper fails to prettify NullPointerExceptions

2018-11-14 Thread Elek, Marton (JIRA)
Elek, Marton created HDFS-14078:
---

 Summary: Admin helper fails to prettify NullPointerExceptions
 Key: HDFS-14078
 URL: https://issues.apache.org/jira/browse/HDFS-14078
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Elek, Marton
Assignee: Elek, Marton


org.apache.hadoop.hdfs.tools.AdminHelper has a method to prettifyExceptions:

{code}
  static String prettifyException(Exception e) {
    return e.getClass().getSimpleName() + ": "
        + e.getLocalizedMessage().split("\n")[0];
  }
{code}

But if e is an NPE, e.getLocalizedMessage() could be null. In that case a new NPE 
will be thrown and the original error message will be lost.
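
A minimal sketch of a null-safe variant, assuming the desired behaviour is to fall back to the exception's class name when there is no message. The class name PrettifySketch is hypothetical, and this is not necessarily what the attached patch does.

```java
final class PrettifySketch {
  /**
   * Null-safe variant of the method quoted above (illustrative only).
   * Falls back to the simple class name when the exception has no message.
   */
  static String prettifyException(Exception e) {
    String message = e.getLocalizedMessage();
    if (message == null) {
      // Typical for exceptions created without a message, e.g. new NullPointerException().
      return e.getClass().getSimpleName();
    }
    // Keep only the first line of a multi-line message.
    int newline = message.indexOf('\n');
    String firstLine = newline >= 0 ? message.substring(0, newline) : message;
    return e.getClass().getSimpleName() + ": " + firstLine;
  }

  public static void main(String[] args) {
    System.out.println(prettifyException(new IllegalStateException()));
    System.out.println(prettifyException(new IllegalStateException("first line\nsecond line")));
  }
}
```

With this shape the original exception type is still reported even when no message is available.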






[jira] [Updated] (HDFS-14078) Admin helper fails to prettify NullPointerExceptions

2018-11-14 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDFS-14078:

Attachment: HDFS-14078.001.patch

> Admin helper fails to prettify NullPointerExceptions
> 
>
> Key: HDFS-14078
> URL: https://issues.apache.org/jira/browse/HDFS-14078
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HDFS-14078.001.patch
>
>
> org.apache.hadoop.hdfs.tools.AdminHelper has a method to prettifyExceptions:
> {code}
>   static String prettifyException(Exception e) {
>     return e.getClass().getSimpleName() + ": "
>         + e.getLocalizedMessage().split("\n")[0];
>   }
> {code}
> But if e is an NPE, e.getLocalizedMessage() could be null. In that case a new NPE 
> will be thrown and the original error message will be lost.






[jira] [Updated] (HDFS-14078) Admin helper fails to prettify NullPointerExceptions

2018-11-14 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDFS-14078:

Status: Patch Available  (was: Open)

> Admin helper fails to prettify NullPointerExceptions
> 
>
> Key: HDFS-14078
> URL: https://issues.apache.org/jira/browse/HDFS-14078
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HDFS-14078.001.patch
>
>
> org.apache.hadoop.hdfs.tools.AdminHelper has a method to prettifyExceptions:
> {code}
>   static String prettifyException(Exception e) {
>     return e.getClass().getSimpleName() + ": "
>         + e.getLocalizedMessage().split("\n")[0];
>   }
> {code}
> But if e is an NPE, e.getLocalizedMessage() could be null. In that case a new NPE 
> will be thrown and the original error message will be lost.






[jira] [Updated] (HDFS-12309) Incorrect null check in the AdminHelper.java

2018-11-14 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDFS-12309:

Resolution: Implemented
Status: Resolved  (was: Patch Available)

> Incorrect null check in the AdminHelper.java
> 
>
> Key: HDFS-12309
> URL: https://issues.apache.org/jira/browse/HDFS-12309
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Oleg Danilov
>Priority: Trivial
> Attachments: HDFS-12309.patch
>
>
> '!= null' is not required there:
> line 147-150:
> {code:java}
> public HelpCommand(Command[] commands) {
>   Preconditions.checkNotNull(commands != null);
>   this.commands = commands;
> }
> {code}
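
To illustrate why the quoted check is ineffective, here is a small sketch. It uses java.util.Objects.requireNonNull as a stand-in for Guava's Preconditions.checkNotNull (both take a reference and throw if it is null); the class and method names are hypothetical.

```java
import java.util.Objects;

final class NullCheckSketch {
  /** Mirrors the reported pattern: the argument is a boolean expression. */
  static boolean passesBrokenCheck(Object[] commands) {
    try {
      // "commands != null" is autoboxed to a Boolean, which is never null,
      // so this check cannot fail even when commands is null.
      Objects.requireNonNull(commands != null);
      return true;
    } catch (NullPointerException e) {
      return false;
    }
  }

  /** The check as presumably intended: pass the reference itself. */
  static boolean passesIntendedCheck(Object[] commands) {
    try {
      Objects.requireNonNull(commands);
      return true;
    } catch (NullPointerException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("broken check accepts null:   " + passesBrokenCheck(null));
    System.out.println("intended check accepts null: " + passesIntendedCheck(null));
  }
}
```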






[jira] [Commented] (HDFS-12309) Incorrect null check in the AdminHelper.java

2018-11-14 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686348#comment-16686348
 ] 

Elek, Marton commented on HDFS-12309:
-

Thank you [~olegd] for posting this patch. Unfortunately I only found it just now, and in 
the meantime it has been fixed in HDFS-13261.

I will close this issue for now. 

> Incorrect null check in the AdminHelper.java
> 
>
> Key: HDFS-12309
> URL: https://issues.apache.org/jira/browse/HDFS-12309
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Oleg Danilov
>Priority: Trivial
> Attachments: HDFS-12309.patch
>
>
> '!= null' is not required there:
> line 147-150:
> {code:java}
> public HelpCommand(Command[] commands) {
>   Preconditions.checkNotNull(commands != null);
>   this.commands = commands;
> }
> {code}






[jira] [Commented] (HDDS-834) Datanode goes OOM based because of segment size

2018-11-14 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686381#comment-16686381
 ] 

Hudson commented on HDDS-834:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15425 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/15425/])
HDDS-834. Datanode goes OOM based because of segment size. Contributed 
(shashikant: rev a94828170684793b80efdd76dc8a3167e324c0ea)
* (edit) 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java
* (edit) hadoop-hdds/common/src/main/resources/ozone-default.xml
* (edit) 
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/ScmConfigKeys.java


> Datanode goes OOM based because of segment size
> ---
>
> Key: HDDS-834
> URL: https://issues.apache.org/jira/browse/HDDS-834
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.3.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.3.0, 0.4.0
>
> Attachments: HDDS-834-ozone-0.3.001.patch, HDDS-834.001.patch
>
>
> Currently ratis segment size is set to 1GB. After RATIS-253, the entry size 
> for a write Chunk is not  counted towards the entry being written to Raft Log.
> This jira controls the segment size to 16KB which makes sure that the number 
> of entries with WriteChunk is limited to 64. This means with 16MB chunk, the 
> total data pending in the segment is 1GB.






[jira] [Commented] (HDFS-14056) Fix error messages in HDFS-12716

2018-11-14 Thread Vinayakumar B (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686410#comment-16686410
 ] 

Vinayakumar B commented on HDFS-14056:
--

LGTM +1

> Fix error messages in HDFS-12716
> 
>
> Key: HDFS-14056
> URL: https://issues.apache.org/jira/browse/HDFS-14056
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.10.0, 3.2.0, 3.0.4, 3.1.2
>Reporter: Adam Antal
>Assignee: Ayush Saxena
>Priority: Minor
> Attachments: HDFS-14056-01.patch, HDFS-14056-02.patch
>
>
> There are misleading error messages in the committed HDFS-12716 patch.
> As I saw in the code in DataNode.java:startDataNode
> {code:java}
> throw new DiskErrorException("Invalid value configured for "
> + "dfs.datanode.failed.volumes.tolerated - " + volFailuresTolerated
> + ". Value configured is either greater than -1 or >= "
> + "to the number of configured volumes (" + volsConfigured + ").");
>   }
> {code}
> Here the error message seems a bit misleading. The error comes up when the 
> given quantity in the configuration set to volsConfigured is set lower than 
> -1 but in that case the error should say something like "Value configured is 
> either _less_ than -1 or >= ...".
> Also the general error message in DataNode.java
> {code:java}
> public static final String MAX_VOLUME_FAILURES_TOLERATED_MSG = "should be 
> greater than -1";
> {code}
> May be better changed to "should be greater than _or equal to_ -1" to be 
> precise, as -1 is a valid choice.
> In hdfs-default.xml I couldn't understand the phrase "The range of the value 
> is -1 now, -1 represents the minimum of volume valids is 1." It might be 
> better to write something clearer like "The minimum is -1 representing 1 
> valid remaining volume".
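
The range check described above can be sketched as follows. This is a sketch inferred from the messages quoted in the issue, not code copied from DataNode.java; the class, method and constant names are hypothetical.

```java
final class FailedVolumesToleratedSketch {
  /** Per the issue text, -1 is a valid value meaning one valid volume must remain. */
  static final int MIN_TOLERATED = -1;

  /** Valid when -1 <= volFailuresTolerated < volsConfigured. */
  static boolean isValid(int volFailuresTolerated, int volsConfigured) {
    return volFailuresTolerated >= MIN_TOLERATED
        && volFailuresTolerated < volsConfigured;
  }

  /** Error text with the wording the reporter suggests ("less than -1"). */
  static String describeError(int volFailuresTolerated, int volsConfigured) {
    return "Invalid value configured for dfs.datanode.failed.volumes.tolerated - "
        + volFailuresTolerated + ". Value configured is either less than "
        + MIN_TOLERATED + " or >= to the number of configured volumes ("
        + volsConfigured + ").";
  }

  public static void main(String[] args) {
    int tolerated = -2;
    int configured = 3;
    if (!isValid(tolerated, configured)) {
      System.out.println(describeError(tolerated, configured));
    }
  }
}
```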






[jira] [Created] (HDDS-837) Persist originNodeId as part of .container file in datanode

2018-11-14 Thread Nanda kumar (JIRA)
Nanda kumar created HDDS-837:


 Summary: Persist originNodeId as part of .container file in 
datanode
 Key: HDDS-837
 URL: https://issues.apache.org/jira/browse/HDDS-837
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: Ozone Datanode
Reporter: Nanda kumar
Assignee: Nanda kumar


To differentiate the replicas of QUASI_CLOSED containers we need an 
{{originNodeId}} field. With this field, we can uniquely identify a 
QUASI_CLOSED container replica. This will be needed when we want to CLOSE a 
QUASI_CLOSED container.

This field will be set by the node where the container is created, stored as 
part of the {{.container}} file, and sent as part of ContainerReport to SCM.






[jira] [Commented] (HDFS-13915) replace datanode failed because of NameNodeRpcServer#getAdditionalDatanode returning excessive datanodeInfo

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686419#comment-16686419
 ] 

Hadoop QA commented on HDFS-13915:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 58s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
18s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m  
9s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  3s{color} | {color:orange} hadoop-hdfs-project: The patch generated 1 new + 
203 unchanged - 0 fixed = 204 total (was 203) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  2s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
34s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 54s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}153m 56s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestBlockStoragePolicy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-13915 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12948098/HDFS-13915.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 68ffd073f883 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3fade86 |
| 

[jira] [Commented] (HDFS-13963) NN UI is broken with IE11

2018-11-14 Thread Vinayakumar B (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686427#comment-16686427
 ] 

Vinayakumar B commented on HDFS-13963:
--

[~daisuke.kobayashi]
bq. Sorry for late here. I have checked if it works on my Windows env. in VM 
and found that the page is still broken with IE9 mode.
Yes, this test is not the same as verifying with "ie=edge". You used IE 11 
and changed the mode to IE 9 in the developer tools.

Changing to 'ie=edge' does not change anything (neither break more nor fix 
existing issue) on IE 9, but *it fixes the issue on later versions*.

So I feel this change in [^HDFS-13963-02.patch] is not an incompatible 
change, and we can go ahead and commit it. 

[~elek], do you agree with this?

[~ayushtkn], please make a similar change in the other HTML files of HDFS as well.

> NN UI is broken with IE11
> -
>
> Key: HDFS-13963
> URL: https://issues.apache.org/jira/browse/HDFS-13963
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, ui
>Affects Versions: 3.1.1
>Reporter: Daisuke Kobayashi
>Assignee: Ayush Saxena
>Priority: Minor
>  Labels: newbie
> Attachments: Document-mode-IE9.png, HDFS-13963-01.patch, 
> HDFS-13963-02.patch, Screen Shot 2018-10-05 at 20.22.20.png, 
> test-with-edge-mode.png
>
>
> Internet Explorer 11 cannot correctly display Namenode Web UI while the NN 
> itself starts successfully. I have confirmed this over 3.1.1 (latest release) 
> and 3.3.0-SNAPSHOT (current trunk) that the following message is shown.
> {code}
> Failed to retrieve data from /jmx?qry=java.lang:type=Memory, cause: 
> SyntaxError: Invalid character
> {code}
> Apparently, this is because {{dfshealth.html}} runs as IE9 mode by default.
> {code}
> 
> {code}
> Once the compatible mode is changed to IE11 through developer tool, it's 
> rendered correctly.






[jira] [Commented] (HDFS-13911) HDFS - Inconsistency in get and put syntax if filename/dirname contains space

2018-11-14 Thread Vinayakumar B (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686432#comment-16686432
 ] 

Vinayakumar B commented on HDFS-13911:
--

[~ayushtkn], Thanks for the patch.

I think the change should fix the above-mentioned issue.

Please include one test verifying the fix.

> HDFS - Inconsistency in get and put syntax if filename/dirname contains space
> -
>
> Key: HDFS-13911
> URL: https://issues.apache.org/jira/browse/HDFS-13911
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.1.1
>Reporter: vivek kumar
>Assignee: Ayush Saxena
>Priority: Minor
> Attachments: HDFS-13911-01.patch
>
>
> Inconsistency in get and put syntax if the file/dir name contains a space. 
> While copying a file/dir from local to HDFS, a space needs to be represented as 
> %20. However, the same representation does not work for copying a file to 
> local. The expectation is to have the same syntax for both get and put.
> test:/ # mkdir /opt/
>  test:/ # mkdir /opt/test\ space
>  test:/ # vi /opt/test\ space/test\ file.txt
>  test:/ # ll /opt/test\ space/
>  total 4
>  -rw-r--r-- 1 root root 7 Sep 12 18:37 test file.txt
>  test:/ #
>  *test:/ # hadoop fs -put /opt/test\ space/ /tmp/*
>  *put: unexpected URISyntaxException*
>  test:/ #
>  *test:/ # hadoop fs -put /opt/test%20space/ /tmp/*
>  test:/ #
>  test:/ # hadoop fs -ls /tmp
>  drwxr-xr-x - user1 hadoop 0 2018-09-12 18:38 /tmp/test space
>  test:/ #
>  *test:/ # hadoop fs -get /tmp/test%20space /srv/*
>  *get: `/tmp/test%20space': No such file or directory*
>  test:/ #
>  *test:/ # hadoop fs -get /tmp/test\ space /srv/*
>  test:/ # ll /srv/test\ space/
>  total 4
>  -rw-r--r-- 1 root root 7 Sep 12 18:39 test file.txt
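
The put behaviour in the transcript above is consistent with how java.net.URI treats spaces, sketched below. That the shell parses the local argument through java.net.URI is an assumption suggested by the URISyntaxException in the output, not something confirmed in this thread; the class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class UriSpaceSketch {
  /**
   * Parses the argument as a URI and returns its decoded path,
   * or null when the argument is not a syntactically valid URI.
   */
  static String parsePath(String arg) {
    try {
      return new URI(arg).getPath();
    } catch (URISyntaxException e) {
      return null;
    }
  }

  public static void main(String[] args) {
    // A raw space is an illegal character in a URI, so parsing fails.
    System.out.println(parsePath("/opt/test space/"));
    // The percent-encoded form parses, and getPath() decodes %20 back to a space.
    System.out.println(parsePath("/opt/test%20space/"));
  }
}
```

If get resolves its source as a plain path rather than a URI, %20 would be taken literally, which would match the "No such file or directory" error above.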






[jira] [Commented] (HDDS-801) Quasi close the container when close is not executed via Ratis

2018-11-14 Thread Mukul Kumar Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686450#comment-16686450
 ] 

Mukul Kumar Singh commented on HDDS-801:


Thanks for working on this [~nandakumar131]. The patch looks really good to me. 
Please find my comments below:

1) CloseContainerCommandHandler:94 -> contaienrState to containerState
2) CloseContainerCommandHandler:99 -> updateContainerState should be changed to 
an appropriate type, like closing or stopContainer.
3) CloseContainerCommandHandler:103-104, I feel this can be moved inside 
XceiverServerRatis as a function doesPipelineExists. Adding a call in Ratis to 
check whether a pipeline exists is trivial, and this code can be removed once 
this issue is fixed in Ratis.
4) ContainerData.java:244, let's add a getter function.
5) KeyValueContainer:272-308, all three functions perform similar steps. 
Should this be encapsulated in one function, with a supplier provided to 
perform the actual transitions, i.e. CLOSING, CLOSED and QUASI_CLOSED?
6) KeyValueContainer:306, both quasiClose and close perform the compactDB 
operation. When the transition from QUASI_CLOSED to CLOSED is allowed later, 
we should not compact the DB again.
7) KeyValueHandler:390, the container should already be in CLOSING state. Let's 
add a precondition here that the container is already in closing state.
8) TestCloseContainerByPipeline:181, let's change the assertion here to 
isQuasiClosed.
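
Comments 5 and 6 could look roughly like the sketch below. Everything in it is hypothetical: the class, the State enum, the transition helper and the compaction counter are stand-ins, not the actual KeyValueContainer code, and a Runnable is used where the comment mentions a supplier.

```java
final class ContainerCloseSketch {
  enum State { OPEN, CLOSING, QUASI_CLOSED, CLOSED }

  private State state = State.OPEN;
  private int compactions = 0; // stands in for the real compactDB operation

  synchronized State getState() { return state; }
  synchronized int getCompactions() { return compactions; }

  void markContainerForClose() {
    transition(() -> state = State.CLOSING, false);
  }

  void quasiClose() {
    transition(() -> state = State.QUASI_CLOSED, true);
  }

  synchronized void close() {
    // Comment 6: skip compaction if quasiClose already compacted the DB.
    boolean alreadyCompacted = state == State.QUASI_CLOSED;
    transition(() -> state = State.CLOSED, !alreadyCompacted);
  }

  /** Comment 5: one shared helper; the caller supplies only the state update. */
  private synchronized void transition(Runnable update, boolean compactDb) {
    if (compactDb) {
      compactions++;
    }
    update.run();
  }

  public static void main(String[] args) {
    ContainerCloseSketch container = new ContainerCloseSketch();
    container.markContainerForClose();
    container.quasiClose();
    container.close();
    System.out.println(container.getState() + ", compactions=" + container.getCompactions());
  }
}
```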

> Quasi close the container when close is not executed via Ratis
> --
>
> Key: HDDS-801
> URL: https://issues.apache.org/jira/browse/HDDS-801
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Affects Versions: 0.3.0
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Attachments: HDDS-801.000.patch, HDDS-801.001.patch, 
> HDDS-801.002.patch
>
>
> When a datanode receives a CloseContainerCommand and the replication type is not 
> RATIS, we should QUASI close the container. After quasi-closing the container 
> an ICR has to be sent to SCM.






[jira] [Created] (HDFS-14079) RBF : RouterAdmin should have failover concept for router

2018-11-14 Thread Surendra Singh Lilhore (JIRA)
Surendra Singh Lilhore created HDFS-14079:
-

 Summary: RBF : RouterAdmin should have failover concept for router
 Key: HDFS-14079
 URL: https://issues.apache.org/jira/browse/HDFS-14079
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.1.1
Reporter: Surendra Singh Lilhore
Assignee: Surendra Singh Lilhore


Currently {{RouterAdmin}} connects to only one router for admin operations. If 
the configured router is down, the router admin command fails. It should be 
possible to configure all the router admin addresses.

{code}
// Initialize RouterClient
try {
  String address = getConf().getTrimmed(
  RBFConfigKeys.DFS_ROUTER_ADMIN_ADDRESS_KEY,
  RBFConfigKeys.DFS_ROUTER_ADMIN_ADDRESS_DEFAULT);
  InetSocketAddress routerSocket = NetUtils.createSocketAddr(address);
  client = new RouterClient(routerSocket, getConf());
} catch (RPC.VersionMismatch v) {
  System.err.println(
  "Version mismatch between client and server... command aborted");
  return exitCode;
}
{code}
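
One possible shape for the requested failover is sketched below, assuming a list of admin addresses is available from configuration. The helper is generic and hypothetical; it does not use the real RouterClient or RBFConfigKeys APIs, and the connector function stands in for whatever creates the client.

```java
import java.util.List;
import java.util.function.Function;

final class RouterFailoverSketch {
  /**
   * Tries each configured admin address in order and returns the first
   * client that can be created. The connector is expected to throw a
   * RuntimeException when an address is unreachable.
   */
  static <C> C connectToFirstAvailable(List<String> addresses,
                                       Function<String, C> connector) {
    RuntimeException lastFailure = null;
    for (String address : addresses) {
      try {
        return connector.apply(address);
      } catch (RuntimeException e) {
        lastFailure = e; // remember the failure and try the next router
      }
    }
    throw new IllegalStateException(
        "No router admin address reachable: " + addresses, lastFailure);
  }
}
```

A caller would pass the configured addresses and a function that builds the client for one address, so a router that is down is skipped rather than aborting the command.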






[jira] [Commented] (HDDS-801) Quasi close the container when close is not executed via Ratis

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686462#comment-16686462
 ] 

Hadoop QA commented on HDDS-801:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
32s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 14 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
34s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
20m 45s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-ozone/integration-test hadoop-ozone/dist {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
16s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
17s{color} | {color:red} dist in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 18m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 18m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 18m  
4s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
3m 47s{color} | {color:orange} root: The patch generated 6 new + 11 unchanged - 
0 fixed = 17 total (was 11) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 27s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-ozone/integration-test hadoop-ozone/dist {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 42s{color} 
| {color:red} common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 42s{color} 
| {color:red} container-service in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 38s{color} 
| {color:red} server-scm in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 40s{color} 
| {color:red} integration-test in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 38s{color} 
| {color:red} tools in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
29s{color} | {color:green} dist in the patch passed. {color} |
| {color:green}+1{color} | {color

[jira] [Commented] (HDDS-774) Remove OpenContainerBlockMap from datanode

2018-11-14 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686484#comment-16686484
 ] 

Shashikant Banerjee commented on HDDS-774:
--

Thanks [~shashikant], for the review. I will hold off committing this till 
HDDS-801 gets committed as it may create conflicts.

> Remove OpenContainerBlockMap from datanode
> --
>
> Key: HDDS-774
> URL: https://issues.apache.org/jira/browse/HDDS-774
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-774.000.patch, HDDS-774.001.patch
>
>
> With HDDS-675, partial flush of uncommitted keys on Datanodes is not 
> required. OpenContainerBlockMap hence serves no purpose anymore.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-774) Remove OpenContainerBlockMap from datanode

2018-11-14 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686484#comment-16686484
 ] 

Shashikant Banerjee edited comment on HDDS-774 at 11/14/18 1:07 PM:


Thanks [~jnp], for the review. I will hold off committing this till HDDS-801 
gets committed as it may create conflicts.


was (Author: shashikant):
Thanks [~shashikant], for the review. I will hold off committing this till 
HDDS-801 gets committed as it may create conflicts.







[jira] [Commented] (HDFS-13963) NN UI is broken with IE11

2018-11-14 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686495#comment-16686495
 ] 

Elek, Marton commented on HDFS-13963:
-

[~vinayrpet] Sure, I agree. Thanks for the explanation. I understand now that 
this change only makes the IE browser use the latest rendering engine; it has 
nothing to do with backward compatibility.

> NN UI is broken with IE11
> -
>
> Key: HDFS-13963
> URL: https://issues.apache.org/jira/browse/HDFS-13963
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, ui
>Affects Versions: 3.1.1
>Reporter: Daisuke Kobayashi
>Assignee: Ayush Saxena
>Priority: Minor
>  Labels: newbie
> Attachments: Document-mode-IE9.png, HDFS-13963-01.patch, 
> HDFS-13963-02.patch, Screen Shot 2018-10-05 at 20.22.20.png, 
> test-with-edge-mode.png
>
>
> Internet Explorer 11 cannot correctly display the Namenode Web UI even though 
> the NN itself starts successfully. I have confirmed on 3.1.1 (latest release) 
> and 3.3.0-SNAPSHOT (current trunk) that the following message is shown.
> {code}
> Failed to retrieve data from /jmx?qry=java.lang:type=Memory, cause: 
> SyntaxError: Invalid character
> {code}
> Apparently, this is because {{dfshealth.html}} runs in IE9 mode by default.
> {code}
> 
> {code}
> Once the compatibility mode is changed to IE11 through the developer tools, 
> it's rendered correctly.






[jira] [Updated] (HDFS-14075) NPE while Edit Logging

2018-11-14 Thread Ayush Saxena (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-14075:

Attachment: HDFS-14075-01.patch

> NPE while Edit Logging
> --
>
> Key: HDFS-14075
> URL: https://issues.apache.org/jira/browse/HDFS-14075
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Critical
> Attachments: HDFS-14075-01.patch
>
>
> {noformat}
> 2018-11-10 18:59:38,427 FATAL 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog: Exception while edit 
> logging: null
> java.lang.NullPointerException
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.doEditTransaction(FSEditLog.java:481)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync$Edit.logEdit(FSEditLogAsync.java:288)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.run(FSEditLogAsync.java:232)
>  at java.lang.Thread.run(Thread.java:745)
> 2018-11-10 18:59:38,532 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: Exception while edit logging: null
> 2018-11-10 18:59:38,552 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
> SHUTDOWN_MSG:
> {noformat}
> Before the NPE, the following exception was received:
> {noformat}
> INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 65110, call 
> Call#23241 Retry#0 
> org.apache.hadoop.hdfs.server.protocol.NamenodeProtocol.rollEditLog from 
> 
> java.io.IOException: Unable to start log segment 7964819: too few journals 
> successfully started.
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.startLogSegment(FSEditLog.java:1385)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.startLogSegmentAndWriteHeaderTxn(FSEditLog.java:1395)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.rollEditLog(FSEditLog.java:1319)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.rollEditLog(FSImage.java:1352)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:4669)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1293)
>   at 
> org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:146)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos$NamenodeProtocolService$2.callBlockingMethod(NamenodeProtocolProtos.java:12974)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:878)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:824)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2684)
> Caused by: java.io.IOException: starting log segment 7964819 failed for too 
> many journals
>   at 
> org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:412)
>   at 
> org.apache.hadoop.hdfs.server.namenode.JournalSet.startLogSegment(JournalSet.java:207)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.startLogSegment(FSEditLog.java:1383)
>   ... 15 more
> {noformat}






[jira] [Commented] (HDDS-801) Quasi close the container when close is not executed via Ratis

2018-11-14 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686506#comment-16686506
 ] 

Shashikant Banerjee commented on HDDS-801:
--

Thanks [~nandakumar131] for working on this. Some comments:

1. KeyValueHandler.java : 865 -> update the comment to say the container is 
getting "quasi closed" rather than getting closed.

2. KeyValueHandler.java : 865 -> closeContainer is exposed to clients in 
ContainerProtocolCalls.java. With SCMCLI as well, close container can be 
invoked so that a client closes the container directly (closeContainer in 
ContainerOperationClient). In such cases, the container may still be in the 
OPEN state, and hence the exception will be thrown:
{code:java}
// The container has to be in CLOSING state.
if (state != State.CLOSING) {
  ContainerProtos.Result error = state == State.INVALID ?
  INVALID_CONTAINER_STATE : CONTAINER_INTERNAL_ERROR;
  throw new StorageContainerException("Cannot close container #" +
  container.getContainerData().getContainerID() + " while in " +
  state + " state.", error);
}{code}
Should we disallow/remove the closeContainer call exposed to clients/SCMCLI?

3. Any state change in ContainerState should trigger an ICR. In that case, 
closeContainer/quasiCloseContainer should call updateContainerState 
internally to send the ICR rather than each sending it individually.
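
A minimal sketch of what point 3 asks for, assuming a toy model rather than the 
real datanode code: every name below (ContainerStateSketch, 
markContainerForClose, the in-memory list standing in for ICRs sent to SCM) is 
hypothetical and only illustrates routing all state changes through one 
updateContainerState method so that each transition produces exactly one 
report.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch; none of these names are taken from the HDDS code base. */
public class ContainerStateSketch {
  public enum State { OPEN, CLOSING, QUASI_CLOSED, CLOSED }

  private State state = State.OPEN;
  // Stand-in for the incremental container reports (ICRs) sent to SCM.
  private final List<State> sentReports = new ArrayList<>();

  // The only method that mutates the state, so every change emits one ICR.
  private void updateContainerState(State newState) {
    if (newState == state) {
      return; // no transition, nothing to report
    }
    state = newState;
    sentReports.add(newState);
  }

  public void markContainerForClose() {
    updateContainerState(State.CLOSING);
  }

  public void closeContainer() {
    requireClosing("close");
    updateContainerState(State.CLOSED);
  }

  public void quasiCloseContainer() {
    requireClosing("quasi close");
    updateContainerState(State.QUASI_CLOSED);
  }

  // Mirrors the precondition quoted in point 2: the container must be CLOSING.
  private void requireClosing(String operation) {
    if (state != State.CLOSING) {
      throw new IllegalStateException(
          "Cannot " + operation + " container while in " + state + " state.");
    }
  }

  public State getState() {
    return state;
  }

  public List<State> getSentReports() {
    return Collections.unmodifiableList(sentReports);
  }
}
```

With this shape, neither closeContainer nor quasiCloseContainer has to remember 
to send a report itself; the reporting is a side effect of the single 
state-changing method.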

 

> Quasi close the container when close is not executed via Ratis
> --
>
> Key: HDDS-801
> URL: https://issues.apache.org/jira/browse/HDDS-801
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Affects Versions: 0.3.0
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Attachments: HDDS-801.000.patch, HDDS-801.001.patch, 
> HDDS-801.002.patch
>
>
> When datanode received CloseContainerCommand and the replication type is not 
> RATIS, we should QUASI close the container. After quasi-closing the container 
> an ICR has to be sent to SCM.






[jira] [Commented] (HDFS-13972) RBF: Support for Delegation Token (WebHDFS)

2018-11-14 Thread Brahma Reddy Battula (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686508#comment-16686508
 ] 

Brahma Reddy Battula commented on HDFS-13972:
-

bq. Could you help rebase HDFS-13891 branch with trunk. 

Done. Please pull the branch before you work. You might get conflicts, as 
HDFS-13834 was missed (sorry for this); I have pushed that commit.

> RBF: Support for Delegation Token (WebHDFS)
> ---
>
> Key: HDFS-13972
> URL: https://issues.apache.org/jira/browse/HDFS-13972
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: CR Hota
>Priority: Major
>
> HDFS Router should support issuing HDFS delegation tokens through WebHDFS.






[jira] [Comment Edited] (HDDS-801) Quasi close the container when close is not executed via Ratis

2018-11-14 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686506#comment-16686506
 ] 

Shashikant Banerjee edited comment on HDDS-801 at 11/14/18 1:42 PM:


Thanks [~nandakumar131] for working on this. In addition to Mukul's comments, 
some more comments:

1. KeyValueHandler.java : 865 -> update the comment to say the container is 
getting "quasi closed" rather than getting closed.

2. KeyValueHandler.java : 865 -> closeContainer is exposed to clients in 
ContainerProtocolCalls.java. With SCMCLI as well, close container can be 
invoked so that a client closes the container directly (closeContainer in 
ContainerOperationClient). In such cases, the container may still be in the 
OPEN state, and hence the exception will be thrown:
{code:java}
// The container has to be in CLOSING state.
if (state != State.CLOSING) {
  ContainerProtos.Result error = state == State.INVALID ?
  INVALID_CONTAINER_STATE : CONTAINER_INTERNAL_ERROR;
  throw new StorageContainerException("Cannot close container #" +
  container.getContainerData().getContainerID() + " while in " +
  state + " state.", error);
}{code}
Should we disallow/remove the closeContainer call exposed to clients/SCMCLI?

3. Any state change in ContainerState should trigger an ICR. In that case, 
closeContainer/quasiCloseContainer should call updateContainerState 
internally to send the ICR rather than each sending it individually.

 








[jira] [Commented] (HDFS-14075) NPE while Edit Logging

2018-11-14 Thread Ayush Saxena (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686499#comment-16686499
 ] 

Ayush Saxena commented on HDFS-14075:
-

Uploaded patch v1 with fix.







[jira] [Updated] (HDFS-14054) TestLeaseRecovery2: testHardLeaseRecoveryAfterNameNodeRestart2 and testHardLeaseRecoveryWithRenameAfterNameNodeRestart are flaky

2018-11-14 Thread Zsolt Venczel (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zsolt Venczel updated HDFS-14054:
-
Status: Patch Available  (was: In Progress)

> TestLeaseRecovery2: testHardLeaseRecoveryAfterNameNodeRestart2 and 
> testHardLeaseRecoveryWithRenameAfterNameNodeRestart are flaky
> 
>
> Key: HDFS-14054
> URL: https://issues.apache.org/jira/browse/HDFS-14054
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.3, 2.6.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
>  Labels: flaky-test
> Attachments: HDFS-14054.01.patch
>
>
> ---
>  T E S T S
> ---
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support 
> was removed in 8.0
> Running org.apache.hadoop.hdfs.TestLeaseRecovery2
> Tests run: 7, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 68.971 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.TestLeaseRecovery2
> testHardLeaseRecoveryAfterNameNodeRestart2(org.apache.hadoop.hdfs.TestLeaseRecovery2)
>   Time elapsed: 4.375 sec  <<< FAILURE!
> java.lang.AssertionError: lease holder should now be the NN
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.checkLease(TestLeaseRecovery2.java:568)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.hardLeaseRecoveryRestartHelper(TestLeaseRecovery2.java:520)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart2(TestLeaseRecovery2.java:437)
> testHardLeaseRecoveryWithRenameAfterNameNodeRestart(org.apache.hadoop.hdfs.TestLeaseRecovery2)
>   Time elapsed: 4.339 sec  <<< FAILURE!
> java.lang.AssertionError: lease holder should now be the NN
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.checkLease(TestLeaseRecovery2.java:568)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.hardLeaseRecoveryRestartHelper(TestLeaseRecovery2.java:520)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryWithRenameAfterNameNodeRestart(TestLeaseRecovery2.java:443)
> Results :
> Failed tests: 
>   
> TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart2:437->hardLeaseRecoveryRestartHelper:520->checkLease:568
>  lease holder should now be the NN
>   
> TestLeaseRecovery2.testHardLeaseRecoveryWithRenameAfterNameNodeRestart:443->hardLeaseRecoveryRestartHelper:520->checkLease:568
>  lease holder should now be the NN
> Tests run: 7, Failures: 2, Errors: 0, Skipped: 0






[jira] [Updated] (HDFS-14054) TestLeaseRecovery2: testHardLeaseRecoveryAfterNameNodeRestart2 and testHardLeaseRecoveryWithRenameAfterNameNodeRestart are flaky

2018-11-14 Thread Zsolt Venczel (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zsolt Venczel updated HDFS-14054:
-
Attachment: HDFS-14054.01.patch







[jira] [Commented] (HDFS-14054) TestLeaseRecovery2: testHardLeaseRecoveryAfterNameNodeRestart2 and testHardLeaseRecoveryWithRenameAfterNameNodeRestart are flaky

2018-11-14 Thread Zsolt Venczel (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686550#comment-16686550
 ] 

Zsolt Venczel commented on HDFS-14054:
--

The failure happened because FSEditLog.endCurrentLogSegment was not mocked 
early enough, which caused the edit log finalization to fail.

In very rare cases I've seen an NPE at line 573; that is handled as well.

Also, in very rare cases the timeout at line 575 was not enough.
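
As an illustration of the last two points only, and not the content of the 
attached patch, here is a minimal sketch of a polling helper that retries a 
check until a deadline and treats a null result as "not ready yet" instead of 
failing with an NPE. The class name WaitForSketch and its parameters are 
assumptions made for this example.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical helper; not part of TestLeaseRecovery2 or the patch. */
public class WaitForSketch {

  /**
   * Polls the check every intervalMs until it returns true, or throws
   * TimeoutException once timeoutMs has elapsed.
   */
  public static void waitFor(Supplier<Boolean> check, long intervalMs,
      long timeoutMs) throws TimeoutException, InterruptedException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      // Boolean.TRUE.equals(...) is null-safe: a null result (for example,
      // a lease that is not visible yet) counts as "not ready".
      if (Boolean.TRUE.equals(check.get())) {
        return;
      }
      if (System.nanoTime() >= deadline) {
        throw new TimeoutException(
            "Condition not met within " + timeoutMs + " ms");
      }
      Thread.sleep(intervalMs);
    }
  }
}
```

A test would wrap its lease-holder assertion in such a loop with a generous 
timeout, so that a slow run delays the test instead of failing it.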







[jira] [Comment Edited] (HDDS-801) Quasi close the container when close is not executed via Ratis

2018-11-14 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686506#comment-16686506
 ] 

Shashikant Banerjee edited comment on HDDS-801 at 11/14/18 2:12 PM:


Thanks [~nandakumar131] for working on this. In addition to Mukul's comments, 
some more comments:

1. KeyValueHandler.java : 865 -> update the comment to say the container is 
getting "quasi closed" rather than getting closed.

2. KeyValueHandler.java : 865 -> closeContainer is exposed to clients in 
ContainerProtocolCalls.java. With SCMCLI as well, close container can be 
invoked so that a client closes the container directly (closeContainer in 
ContainerOperationClient). In such cases, the container may still be in the 
OPEN state, and hence the exception will be thrown:
{code:java}
// The container has to be in CLOSING state.
if (state != State.CLOSING) {
  ContainerProtos.Result error = state == State.INVALID ?
  INVALID_CONTAINER_STATE : CONTAINER_INTERNAL_ERROR;
  throw new StorageContainerException("Cannot close container #" +
  container.getContainerData().getContainerID() + " while in " +
  state + " state.", error);
}{code}
Should we disallow/remove the closeContainer call exposed to clients/SCMCLI?

3. Any state change in ContainerState should trigger an ICR. In that case, 
closeContainer/quasiCloseContainer should call updateContainerState 
internally to send the ICR rather than each sending it individually.

4. Consider the case where the SCM gets network-partitioned from a follower 
before sending a closeCommand while the Ratis ring is still operational. The 
leader will execute the closeContainer transaction successfully and the 
follower will try to replicate it, but that will fail because the container 
was never put into the CLOSING state on the follower, which was not 
communicating with SCM. The assumption that a container will be in the CLOSING 
state before closeContainer is called is therefore not necessarily true.
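
The scenario in point 4 can be reproduced in a toy model. This is a sketch 
under stated assumptions, not the datanode implementation: CloseReplaySketch, 
Replica, onCloseCommandFromScm and applyCloseTransaction are hypothetical 
names, and Ratis replication is reduced to calling the same method on two 
objects.

```java
/** Hypothetical toy model of the leader/follower scenario in point 4. */
public class CloseReplaySketch {
  public enum State { OPEN, CLOSING, CLOSED }

  /** One replica of a container on a datanode. */
  public static final class Replica {
    private State state = State.OPEN;

    // Delivered by SCM; a partitioned follower never receives this.
    public void onCloseCommandFromScm() {
      if (state == State.OPEN) {
        state = State.CLOSING;
      }
    }

    // Applied on every replica when the close transaction is replicated.
    public void applyCloseTransaction() {
      if (state != State.CLOSING) {
        throw new IllegalStateException(
            "Cannot close container while in " + state + " state.");
      }
      state = State.CLOSED;
    }

    public State getState() {
      return state;
    }
  }

  public static void main(String[] args) {
    Replica leader = new Replica();
    Replica follower = new Replica();

    // SCM reaches only the leader; the follower is partitioned from SCM.
    leader.onCloseCommandFromScm();
    leader.applyCloseTransaction();

    try {
      // The replication ring still works, so the follower applies the close.
      follower.applyCloseTransaction();
    } catch (IllegalStateException e) {
      System.out.println("follower: " + e.getMessage());
    }
  }
}
```

Running main prints `follower: Cannot close container while in OPEN state.`, 
which is the failure the comment describes: the leader ends up CLOSED while the 
follower stays OPEN.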

 


was (Author: shashikant):
Thanks [~nandakumar131] for working on this. In addition to Mukul's comments, 
some more comments:

1. KeyValueHandler.java : 865 -> update the comment to say the container is 
getting "quasi closed" rather than closed.

2. KeyValueHandler.java : 865 -> closeContainer is exposed to clients in 
ContainerProtocolCalls.java. With SCMCLI as well, close container can be 
invoked such that a client closes the container directly (closeContainer in 
ContainerOperationClient). In such cases, the container may be in just the 
open state, and hence the exception will be thrown:
{code:java}
// The container has to be in CLOSING state.
if (state != State.CLOSING) {
  ContainerProtos.Result error = state == State.INVALID ?
  INVALID_CONTAINER_STATE : CONTAINER_INTERNAL_ERROR;
  throw new StorageContainerException("Cannot close container #" +
  container.getContainerData().getContainerID() + " while in " +
  state + " state.", error);
}{code}
Should we disallow/remove the closeContainer call exposed to clients/SCMCLI?

3. Any state change in ContainerState should trigger an ICR. In that case, the 
closeContainer/quasiCloseContainer calls should call updateContainerState 
internally to send the ICR instead of executing individually.

 

> Quasi close the container when close is not executed via Ratis
> --
>
> Key: HDDS-801
> URL: https://issues.apache.org/jira/browse/HDDS-801
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Affects Versions: 0.3.0
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Attachments: HDDS-801.000.patch, HDDS-801.001.patch, 
> HDDS-801.002.patch
>
>
> When a datanode receives a CloseContainerCommand and the replication type is 
> not RATIS, we should QUASI close the container. After quasi-closing the 
> container, an ICR has to be sent to SCM.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-801) Quasi close the container when close is not executed via Ratis

2018-11-14 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686506#comment-16686506
 ] 

Shashikant Banerjee edited comment on HDDS-801 at 11/14/18 2:17 PM:


Thanks [~nandakumar131] for working on this. In addition to Mukul's comments, 
some more comments:

1. KeyValueHandler.java : 865 -> update the comment to say the container is 
getting "quasi closed" rather than closed.

2. KeyValueHandler.java : 865 -> closeContainer is exposed to clients in 
ContainerProtocolCalls.java. With SCMCLI as well, close container can be 
invoked such that a client closes the container directly (closeContainer in 
ContainerOperationClient). In such cases, the container may be in just the 
open state, and hence the exception will be thrown:
{code:java}
// The container has to be in CLOSING state.
if (state != State.CLOSING) {
  ContainerProtos.Result error = state == State.INVALID ?
  INVALID_CONTAINER_STATE : CONTAINER_INTERNAL_ERROR;
  throw new StorageContainerException("Cannot close container #" +
  container.getContainerData().getContainerID() + " while in " +
  state + " state.", error);
}{code}
Should we disallow/remove the closeContainer call exposed to clients/SCMCLI?

3. Any state change in ContainerState should trigger an ICR. In that case, the 
closeContainer/quasiCloseContainer calls should call updateContainerState 
internally to send the ICR instead of executing individually.

4. There can be a case where, say, the SCM gets network-separated from a 
follower before sending a closeCommand, but the Ratis ring is operational. In 
that case, the leader will execute the closeContainer transaction successfully 
and the follower will try to replicate it, but it will fail because the 
container was never put into the closing state on the follower, which was not 
communicating with the SCM. The assumption that the container will be in the 
closing state before closeContainer is called may not always be true.
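Point 4 can be illustrated with a small sketch. The names below (ReplicaState, ContainerReplica, closeStrict, closeTolerant) are hypothetical and are not Ozone classes; the sketch only shows why a strict CLOSING precondition fails on a follower that never received the close command, and one possible way a replicated close could tolerate a replica that is still open. It is an assumption for illustration, not what the patch does.

```java
// Hypothetical sketch; none of these types are Ozone classes.
enum ReplicaState { OPEN, CLOSING, CLOSED }

class ContainerReplica {
  private ReplicaState state = ReplicaState.OPEN;

  ReplicaState getState() {
    return state;
  }

  // Stands in for the SCM close command reaching this replica.
  void markClosing() {
    if (state == ReplicaState.OPEN) {
      state = ReplicaState.CLOSING;
    }
  }

  // Strict variant: same precondition as the check quoted in point 2.
  void closeStrict() {
    if (state != ReplicaState.CLOSING) {
      throw new IllegalStateException(
          "Cannot close container while in " + state + " state.");
    }
    state = ReplicaState.CLOSED;
  }

  // Tolerant variant (an assumption, not the patch): a replica that is still
  // OPEN is moved to CLOSING first, so a close replicated through the Ratis
  // log does not depend on the SCM command having arrived.
  void closeTolerant() {
    markClosing();
    if (state == ReplicaState.CLOSING) {
      state = ReplicaState.CLOSED;
    }
  }
}
```

With this sketch, a follower that was partitioned from the SCM throws in closeStrict() but ends up CLOSED after closeTolerant().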

 


was (Author: shashikant):
Thanks [~nandakumar131] for working on this. In addition to Mukul's comments, 
some more comments:

1. KeyValueHandler.java : 865 -> update the comment to say the container is 
getting "quasi closed" rather than closed.

2. KeyValueHandler.java : 865 -> closeContainer is exposed to clients in 
ContainerProtocolCalls.java. With SCMCLI as well, close container can be 
invoked such that a client closes the container directly (closeContainer in 
ContainerOperationClient). In such cases, the container may be in just the 
open state, and hence the exception will be thrown:
{code:java}
// The container has to be in CLOSING state.
if (state != State.CLOSING) {
  ContainerProtos.Result error = state == State.INVALID ?
  INVALID_CONTAINER_STATE : CONTAINER_INTERNAL_ERROR;
  throw new StorageContainerException("Cannot close container #" +
  container.getContainerData().getContainerID() + " while in " +
  state + " state.", error);
}{code}
Should we disallow/remove the closeContainer call exposed to clients/SCMCLI?

3. Any state change in ContainerState should trigger an ICR. In that case, the 
closeContainer/quasiCloseContainer calls should call updateContainerState 
internally to send the ICR instead of executing individually.

4. There can be a case where, say, the SCM gets network-separated from a 
follower before sending a closeCommand, but the Ratis ring is operational. In 
that case, the leader will execute the closeContainer transaction successfully 
and the follower will try to replicate it, but it will fail because the 
container was never put into the closing state on the follower, which was not 
communicating with the SCM. The assumption that the container will be in the 
closing state before closeContainer is called may not necessarily be true.

 







[jira] [Comment Edited] (HDDS-801) Quasi close the container when close is not executed via Ratis

2018-11-14 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686506#comment-16686506
 ] 

Shashikant Banerjee edited comment on HDDS-801 at 11/14/18 2:24 PM:


Thanks [~nandakumar131] for working on this. In addition to Mukul's comments, 
some more comments:

1. KeyValueHandler.java : 865 -> update the comment to say the container is 
getting "quasi closed" rather than closed.

2. KeyValueHandler.java : 865 -> closeContainer is exposed to clients in 
ContainerProtocolCalls.java. With SCMCLI as well, close container can be 
invoked such that a client closes the container directly (closeContainer in 
ContainerOperationClient). In such cases, the container may be in just the 
open state, and hence the exception will be thrown:
{code:java}
// The container has to be in CLOSING state.
if (state != State.CLOSING) {
  ContainerProtos.Result error = state == State.INVALID ?
  INVALID_CONTAINER_STATE : CONTAINER_INTERNAL_ERROR;
  throw new StorageContainerException("Cannot close container #" +
  container.getContainerData().getContainerID() + " while in " +
  state + " state.", error);
}{code}
Should we disallow/remove the closeContainer call exposed to clients/SCMCLI?

3. Any state change in ContainerState should trigger an ICR. In that case, the 
closeContainer/quasiCloseContainer calls should call updateContainerState 
internally to send the ICR instead of executing individually.

4. There can be a case where, say, the SCM gets network-separated from a 
follower before sending a closeCommand, but the Ratis ring is operational. In 
that case, the leader will execute the closeContainer transaction successfully 
and the follower will try to replicate it, but it will fail because the 
container was never put into the closing state on the follower, which was not 
communicating with the SCM. The assumption that the container will be in the 
closing state before closeContainer is called may not always be true.

5. KeyValueContainer.java : 310 ->

The comments look contradictory here. The first comment says that compaction 
should be done asynchronously, as otherwise the call will be a lot slower. The 
next comment says it is OK if the operation is slow. Can you please check?
{code:java}
@Override
public void close() throws StorageContainerException {

  //TODO: writing .container file and compaction can be done
  // asynchronously, otherwise rpc call for this will take a lot of time to
  // complete this action
  ContainerDataProto.State oldState = null;
  try {
    writeLock();
    oldState = containerData.getState();
    containerData.closeContainer();
    File containerFile = getContainerFile();
    // update the new container data to .container File
    updateContainerFile(containerFile);

  } catch (StorageContainerException ex) {
    if (oldState != null) {
      // Failed to update .container file. Reset the state to CLOSING
      containerData.setState(oldState);
    }
    throw ex;
  } finally {
    writeUnlock();
  }
  }
  // It is ok if this operation takes a bit of time.
  // Close container is not expected to be instantaneous.
  compactDB();
}

{code}
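One way to reconcile the two comments is to finish the state change under the lock and hand the compaction to a background executor. The following is a minimal sketch under that assumption; AsyncCloseSketch and all of its members are hypothetical placeholders, not the KeyValueContainer implementation, and the flags merely stand in for the real container-file update and DB compaction.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch; not the KeyValueContainer implementation.
class AsyncCloseSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final ExecutorService compactionPool =
      Executors.newSingleThreadExecutor();
  private volatile boolean closed = false;
  private volatile boolean compacted = false;

  // Changes state under the write lock, then returns without waiting for
  // compaction. The returned future completes once compaction has finished.
  CompletableFuture<Void> close() {
    lock.writeLock().lock();
    try {
      // Stands in for closeContainer() plus updateContainerFile().
      closed = true;
    } finally {
      lock.writeLock().unlock();
    }
    return CompletableFuture.runAsync(this::compactDB, compactionPool);
  }

  // Placeholder for the slow DB compaction.
  private void compactDB() {
    compacted = true;
  }

  boolean isClosed() {
    return closed;
  }

  boolean isCompacted() {
    return compacted;
  }

  // Releases the background thread so the JVM can exit.
  void shutdown() {
    compactionPool.shutdown();
  }
}
```

A caller that needs the old blocking behaviour can still call close().join(); an RPC handler can return as soon as close() returns.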


was (Author: shashikant):
Thanks [~nandakumar131] for working on this. In addition to Mukul's comments, 
some more comments:

1. KeyValueHandler.java : 865 -> update the comment to say the container is 
getting "quasi closed" rather than closed.

2. KeyValueHandler.java : 865 -> closeContainer is exposed to clients in 
ContainerProtocolCalls.java. With SCMCLI as well, close container can be 
invoked such that a client closes the container directly (closeContainer in 
ContainerOperationClient). In such cases, the container may be in just the 
open state, and hence the exception will be thrown:
{code:java}
// The container has to be in CLOSING state.
if (state != State.CLOSING) {
  ContainerProtos.Result error = state == State.INVALID ?
  INVALID_CONTAINER_STATE : CONTAINER_INTERNAL_ERROR;
  throw new StorageContainerException("Cannot close container #" +
  container.getContainerData().getContainerID() + " while in " +
  state + " state.", error);
}{code}
Should we disallow/remove the closeContainer call exposed to clients/SCMCLI?

3. Any state change in ContainerState should trigger an ICR. In that case, the 
closeContainer/quasiCloseContainer calls should call updateContainerState 
internally to send the ICR instead of executing individually.

4. There can be a case where, say, the SCM gets network-separated from a 
follower before sending a closeCommand, but the Ratis ring is operational. In 
that case, the leader will execute the closeContainer transaction successfully 
and the follower will try to replicate it, but it will fail because the 
container was never put into the closing state on the follower, which was not 
communicating with the SCM. The assumption that the container will be in the 
closing state before closeContainer is called may not always be true.

5. K

[jira] [Comment Edited] (HDDS-801) Quasi close the container when close is not executed via Ratis

2018-11-14 Thread Shashikant Banerjee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686506#comment-16686506
 ] 

Shashikant Banerjee edited comment on HDDS-801 at 11/14/18 2:24 PM:


Thanks [~nandakumar131] for working on this. In addition to Mukul's comments, 
some more comments:

1. KeyValueHandler.java : 865 -> update the comment to say the container is 
getting "quasi closed" rather than closed.

2. KeyValueHandler.java : 865 -> closeContainer is exposed to clients in 
ContainerProtocolCalls.java. With SCMCLI as well, close container can be 
invoked such that a client closes the container directly (closeContainer in 
ContainerOperationClient). In such cases, the container may be in just the 
open state, and hence the exception will be thrown:
{code:java}
// The container has to be in CLOSING state.
if (state != State.CLOSING) {
  ContainerProtos.Result error = state == State.INVALID ?
  INVALID_CONTAINER_STATE : CONTAINER_INTERNAL_ERROR;
  throw new StorageContainerException("Cannot close container #" +
  container.getContainerData().getContainerID() + " while in " +
  state + " state.", error);
}{code}
Should we disallow/remove the closeContainer call exposed to clients/SCMCLI?

3. Any state change in ContainerState should trigger an ICR. In that case, the 
closeContainer/quasiCloseContainer calls should call updateContainerState 
internally to send the ICR instead of executing individually.

4. There can be a case where, say, the SCM gets network-separated from a 
follower before sending a closeCommand, but the Ratis ring is operational. In 
that case, the leader will execute the closeContainer transaction successfully 
and the follower will try to replicate it, but it will fail because the 
container was never put into the closing state on the follower, which was not 
communicating with the SCM. The assumption that the container will be in the 
closing state before closeContainer is called may not always be true.

5. KeyValueContainer.java : 310 ->

The comments look contradictory here. The first comment says that compaction 
should be done asynchronously, as otherwise the call will be a lot slower. The 
next comment says it is OK if the operation is slow. Can you please check?
{code:java}
@Override
public void close() throws StorageContainerException {

  //TODO: writing .container file and compaction can be done
  // asynchronously, otherwise rpc call for this will take a lot of time to
  // complete this action
  ContainerDataProto.State oldState = null;
  try {
    writeLock();
    oldState = containerData.getState();
    containerData.closeContainer();
    File containerFile = getContainerFile();
    // update the new container data to .container File
    updateContainerFile(containerFile);

  } catch (StorageContainerException ex) {
    if (oldState != null) {
      // Failed to update .container file. Reset the state to CLOSING
      containerData.setState(oldState);
    }
    throw ex;
  } finally {
    writeUnlock();
  }
  }
  // It is ok if this operation takes a bit of time.
  // Close container is not expected to be instantaneous.
  compactDB();
}

{code}


was (Author: shashikant):
Thanks [~nandakumar131] for working on this. In addition to Mukul's comments, 
some more comments:

1. KeyValueHandler.java : 865 -> update the comment to say the container is 
getting "quasi closed" rather than closed.

2. KeyValueHandler.java : 865 -> closeContainer is exposed to clients in 
ContainerProtocolCalls.java. With SCMCLI as well, close container can be 
invoked such that a client closes the container directly (closeContainer in 
ContainerOperationClient). In such cases, the container may be in just the 
open state, and hence the exception will be thrown:
{code:java}
// The container has to be in CLOSING state.
if (state != State.CLOSING) {
  ContainerProtos.Result error = state == State.INVALID ?
  INVALID_CONTAINER_STATE : CONTAINER_INTERNAL_ERROR;
  throw new StorageContainerException("Cannot close container #" +
  container.getContainerData().getContainerID() + " while in " +
  state + " state.", error);
}{code}
Should we disallow/remove the closeContainer call exposed to clients/SCMCLI?

3. Any state change in ContainerState should trigger an ICR. In that case, the 
closeContainer/quasiCloseContainer calls should call updateContainerState 
internally to send the ICR instead of executing individually.

4. There can be a case where, say, the SCM gets network-separated from a 
follower before sending a closeCommand, but the Ratis ring is operational. In 
that case, the leader will execute the closeContainer transaction successfully 
and the follower will try to replicate it, but it will fail because the 
container was never put into the closing state on the follower, which was not 
communicating with the SCM. The assumption that the container will be in the 
closing state before closeContainer is called may not always be true.

 


[jira] [Commented] (HDFS-14078) Admin helper fails to prettify NullPointerExceptions

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686585#comment-16686585
 ] 

Hadoop QA commented on HDFS-14078:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m  5s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 50s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 51s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 44s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 6 unchanged - 0 fixed = 7 total (was 6) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 18s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}158m  6s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}214m 47s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.protocol.datatransfer.sasl.TestSaslDataTransfer |
|   | hadoop.hdfs.server.namenode.TestAuditLoggerWithCommands |
|   | hadoop.hdfs.server.namenode.TestReconstructStripedBlocks |
|   | hadoop.hdfs.TestModTime |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitCache |
|   | hadoop.hdfs.TestHAAuxiliaryPort |
|   | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
|   | hadoop.hdfs.TestDFSStripedOutputStream |
|   | hadoop.hdfs.server.blockmanagement.TestSequentialBlockGroupId |
|   | hadoop.hdfs.server.mover.TestMover |
|   | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
|   | hadoop.hdfs.server.namenode.TestFsck |
|   | hadoop.hdfs.server.mover.TestStorageMover |
|   | hadoop.hdfs.server.namenode.ha.TestQuotasWithHA |
|   | hadoop.hdfs.TestUnsetAndChangeDirectoryEcPolicy |
|   | hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots |
|   | hadoop.hdfs.server.namenode.TestCheckpoint |
|   | hadoop.hdfs.server.namenode.TestFSImage |
|   | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.TestErasureCodingMultipleRacks |

[jira] [Commented] (HDDS-813) [JDK11] mvn javadoc:javadoc -Phdds fails

2018-11-14 Thread Dinesh Chitlangia (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686592#comment-16686592
 ] 

Dinesh Chitlangia commented on HDDS-813:


Test failures are unrelated to the patch, and the checkstyle issue can be 
ignored as described in the previous comment.

> [JDK11] mvn javadoc:javadoc -Phdds fails
> 
>
> Key: HDDS-813
> URL: https://issues.apache.org/jira/browse/HDDS-813
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Akira Ajisaka
>Assignee: Dinesh Chitlangia
>Priority: Major
>  Labels: javadoc
> Attachments: HDDS-813.001.patch
>
>
> {{mvn javadoc:javadoc -Phdds}} fails on Java 11
> {noformat}
> [ERROR] /Users/aajisaka/git/hadoop/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/client/ScmClient.java:107: error: bad use of '>'
> [ERROR]* @param count count must be > 0.
> [ERROR] /Users/aajisaka/git/hadoop/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/protocol/LocatedContainer.java:85: error: unknown tag: DatanodeInfo
> [ERROR]   * @return Set nodes that currently host the container
> [ERROR] /Users/aajisaka/git/hadoop/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/protocol/ScmLocatedBlock.java:71: error: unknown tag: DatanodeInfo
> [ERROR]   * @return List nodes that currently host the block
> [ERROR] /Users/aajisaka/git/hadoop/hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/audit/Auditable.java:28: error: malformed HTML
> [ERROR]   * @return Map with values to be logged in audit.
> [ERROR]                 ^
> [ERROR] /Users/aajisaka/git/hadoop/hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/audit/Auditable.java:28: error: bad use of '>'
> [ERROR]   * @return Map with values to be logged in audit.
> {noformat}
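Errors of this kind usually come from raw angle brackets in Javadoc text. The "unknown tag: DatanodeInfo" messages presumably stem from a generic type written directly in the comment, with the bracketed part lost in this mail. A common way to write such comments so that strict Javadoc checking accepts them is sketched below; JavadocSketch and its method are hypothetical stand-ins, not the HDDS classes, and this is not necessarily how the attached patch fixes the problem.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical stand-in class; only the Javadoc style matters here.
class JavadocSketch {
  /**
   * Lists container names.
   *
   * @param count number of entries to return; must be {@literal >} 0.
   * @return {@code Set<String>} of names, always empty in this sketch.
   */
  static Set<String> list(int count) {
    if (count <= 0) {
      throw new IllegalArgumentException("count must be > 0");
    }
    return Collections.emptySet();
  }
}
```

{@literal} escapes a single character such as '>', while {@code} renders a generic type in code font without it being parsed as an HTML tag.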






[jira] [Commented] (HDDS-774) Remove OpenContainerBlockMap from datanode

2018-11-14 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686595#comment-16686595
 ] 

Hudson commented on HDDS-774:
-

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #15426 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/15426/])
HDDS-774. Remove OpenContainerBlockMap from datanode. Contributed by 
(shashikant: rev b57cc73f837ecb79ed275fc6e50ffce684baf573)
* (edit) hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/scm/TestGetCommittedBlockLengthAndPutKey.java
* (edit) hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
* (delete) hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/common/impl/TestCloseContainerHandler.java


> Remove OpenContainerBlockMap from datanode
> --
>
> Key: HDDS-774
> URL: https://issues.apache.org/jira/browse/HDDS-774
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-774.000.patch, HDDS-774.001.patch
>
>
> With HDDS-675, partial flush of uncommitted keys on Datanodes is not 
> required. OpenContainerBlockMap hence serves no purpose anymore.






[jira] [Updated] (HDDS-774) Remove OpenContainerBlockMap from datanode

2018-11-14 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-774:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

As per offline discussion with [~msingh], committed this change to trunk. 
HDDS-801 will be rebased on top of it.

Thanks [~jnp] and [~msingh] for the reviews.







[jira] [Updated] (HDFS-13963) NN UI is broken with IE11

2018-11-14 Thread Ayush Saxena (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-13963:

Attachment: HDFS-13963-03.patch

> NN UI is broken with IE11
> -
>
> Key: HDFS-13963
> URL: https://issues.apache.org/jira/browse/HDFS-13963
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, ui
>Affects Versions: 3.1.1
>Reporter: Daisuke Kobayashi
>Assignee: Ayush Saxena
>Priority: Minor
>  Labels: newbie
> Attachments: Document-mode-IE9.png, HDFS-13963-01.patch, 
> HDFS-13963-02.patch, HDFS-13963-03.patch, Screen Shot 2018-10-05 at 
> 20.22.20.png, test-with-edge-mode.png
>
>
> Internet Explorer 11 cannot correctly display Namenode Web UI while the NN 
> itself starts successfully. I have confirmed this over 3.1.1 (latest release) 
> and 3.3.0-SNAPSHOT (current trunk) that the following message is shown.
> {code}
> Failed to retrieve data from /jmx?qry=java.lang:type=Memory, cause: 
> SyntaxError: Invalid character
> {code}
> Apparently, this is because {{dfshealth.html}} runs as IE9 mode by default.
> {code}
> 
> {code}
> Once the compatibility mode is changed to IE11 through the developer tools, 
> it's rendered correctly.






[jira] [Commented] (HDFS-13963) NN UI is broken with IE11

2018-11-14 Thread Ayush Saxena (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686607#comment-16686607
 ] 

Ayush Saxena commented on HDFS-13963:
-

Thanks [~vinayrpet] and [~elek] for the comments.

I have uploaded v3 with the said changes.

Please review :)







[jira] [Comment Edited] (HDFS-14054) TestLeaseRecovery2: testHardLeaseRecoveryAfterNameNodeRestart2 and testHardLeaseRecoveryWithRenameAfterNameNodeRestart are flaky

2018-11-14 Thread Zsolt Venczel (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686550#comment-16686550
 ] 

Zsolt Venczel edited comment on HDFS-14054 at 11/14/18 2:50 PM:


The failure happened because FSEditLog.endCurrentLogSegment was not mocked 
early enough, which caused the edit log finalization to fail.

In very rare cases I've seen an NPE at line 573; that is handled as well.

Also, in very rare cases the waitForMillis at line 575 was not enough.
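The last point is about how long a test polls for a condition before giving up. A minimal sketch of such a polling wait follows; WaitSketch.waitFor is a hypothetical helper written for illustration, not the utility used in TestLeaseRecovery2, and the interval and timeout values in the usage note are assumptions.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration only.
class WaitSketch {
  // Polls check every intervalMillis until it returns true or until
  // timeoutMillis has elapsed. Returns whether the condition was met.
  static boolean waitFor(BooleanSupplier check, long intervalMillis,
      long timeoutMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (true) {
      if (check.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() >= deadline) {
        return false;
      }
      try {
        Thread.sleep(intervalMillis);
      } catch (InterruptedException e) {
        // Preserve the interrupt status and report the wait as unsuccessful.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }
}
```

A call such as WaitSketch.waitFor(condition, 100, 10000) checks every 100 ms for up to 10 seconds; raising the timeout only lengthens the failing case, because the wait returns as soon as the condition holds.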


was (Author: zvenczel):
The failure happened because FSEditLog.endCurrentLogSegment was not mocked 
early enough, which caused the edit log finalization to fail.

In very rare cases I've seen an NPE at line 573; that is handled as well.

Also, in very rare cases the timeout at line 575 was not enough.

> TestLeaseRecovery2: testHardLeaseRecoveryAfterNameNodeRestart2 and 
> testHardLeaseRecoveryWithRenameAfterNameNodeRestart are flaky
> 
>
> Key: HDFS-14054
> URL: https://issues.apache.org/jira/browse/HDFS-14054
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.3
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
>  Labels: flaky-test
> Attachments: HDFS-14054.01.patch
>
>
> ---
>  T E S T S
> ---
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
> Running org.apache.hadoop.hdfs.TestLeaseRecovery2
> Tests run: 7, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 68.971 sec <<< FAILURE! - in org.apache.hadoop.hdfs.TestLeaseRecovery2
> testHardLeaseRecoveryAfterNameNodeRestart2(org.apache.hadoop.hdfs.TestLeaseRecovery2)  Time elapsed: 4.375 sec  <<< FAILURE!
> java.lang.AssertionError: lease holder should now be the NN
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.apache.hadoop.hdfs.TestLeaseRecovery2.checkLease(TestLeaseRecovery2.java:568)
>   at org.apache.hadoop.hdfs.TestLeaseRecovery2.hardLeaseRecoveryRestartHelper(TestLeaseRecovery2.java:520)
>   at org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart2(TestLeaseRecovery2.java:437)
> testHardLeaseRecoveryWithRenameAfterNameNodeRestart(org.apache.hadoop.hdfs.TestLeaseRecovery2)  Time elapsed: 4.339 sec  <<< FAILURE!
> java.lang.AssertionError: lease holder should now be the NN
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.apache.hadoop.hdfs.TestLeaseRecovery2.checkLease(TestLeaseRecovery2.java:568)
>   at org.apache.hadoop.hdfs.TestLeaseRecovery2.hardLeaseRecoveryRestartHelper(TestLeaseRecovery2.java:520)
>   at org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryWithRenameAfterNameNodeRestart(TestLeaseRecovery2.java:443)
> Results :
> Failed tests: 
>   TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart2:437->hardLeaseRecoveryRestartHelper:520->checkLease:568 lease holder should now be the NN
>   TestLeaseRecovery2.testHardLeaseRecoveryWithRenameAfterNameNodeRestart:443->hardLeaseRecoveryRestartHelper:520->checkLease:568 lease holder should now be the NN
> Tests run: 7, Failures: 2, Errors: 0, Skipped: 0



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-14080) DFS usage metrics reported in incorrect prefix

2018-11-14 Thread Greg Phillips (JIRA)
Greg Phillips created HDFS-14080:


 Summary: DFS usage metrics reported in incorrect prefix
 Key: HDFS-14080
 URL: https://issues.apache.org/jira/browse/HDFS-14080
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode, ui
Reporter: Greg Phillips


The NameNode webapp reports DFS usage metrics using standard SI prefixes (MB, 
GB, etc.). The number reported in the UI is calculated to be the binary size 
which should be noted using binary prefixes (MiB, GiB, etc.). The NameNode 
webapp should be modified to use the correct binary prefixes.
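
As a rough illustration of the distinction described above, the following Java sketch formats a byte count by dividing by 1024 and labelling the result with binary prefixes. The class name BinaryPrefixFormatter and the method format are hypothetical and are not taken from the attached patch; the webapp's own formatting code is not shown in this message and may look quite different.

```java
import java.util.Locale;

// Hypothetical helper, not part of Hadoop: labels power-of-1024 sizes
// with binary prefixes (KiB, MiB, ...) instead of SI prefixes (KB, MB, ...).
final class BinaryPrefixFormatter {
  private static final String[] UNITS =
      {"B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"};

  private BinaryPrefixFormatter() {
  }

  static String format(long bytes) {
    if (bytes < 0) {
      throw new IllegalArgumentException("bytes must be non-negative");
    }
    double value = bytes;
    int unit = 0;
    // Each step divides by 1024, so the matching label is a binary prefix.
    while (value >= 1024 && unit < UNITS.length - 1) {
      value /= 1024;
      unit++;
    }
    if (unit == 0) {
      // Plain byte counts are shown without a fractional part.
      return bytes + " B";
    }
    return String.format(Locale.ROOT, "%.2f %s", value, UNITS[unit]);
  }
}
```

With this sketch, BinaryPrefixFormatter.format(1536) yields "1.50 KiB". Labelling the same number "1.50 KB" would suggest 1500 bytes, which is the mismatch this issue reports.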






[jira] [Updated] (HDFS-14080) DFS usage metrics reported in incorrect prefix

2018-11-14 Thread Greg Phillips (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Phillips updated HDFS-14080:
-
Attachment: HDFS-14080.001.patch

> DFS usage metrics reported in incorrect prefix
> --
>
> Key: HDFS-14080
> URL: https://issues.apache.org/jira/browse/HDFS-14080
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, ui
>Reporter: Greg Phillips
>Priority: Trivial
> Attachments: HDFS-14080.001.patch
>
>
> The NameNode webapp reports DFS usage metrics using standard SI prefixes (MB, 
> GB, etc.). The number reported in the UI is calculated to be the binary size 
> which should be noted using binary prefixes (MiB, GiB, etc.). The NameNode 
> webapp should be modified to use the correct binary prefixes.






[jira] [Updated] (HDFS-14080) DFS usage metrics reported in incorrect prefix

2018-11-14 Thread Greg Phillips (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Phillips updated HDFS-14080:
-
Status: Patch Available  (was: Open)

> DFS usage metrics reported in incorrect prefix
> --
>
> Key: HDFS-14080
> URL: https://issues.apache.org/jira/browse/HDFS-14080
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, ui
>Reporter: Greg Phillips
>Priority: Trivial
> Attachments: HDFS-14080.001.patch
>
>
> The NameNode webapp reports DFS usage metrics using standard SI prefixes (MB, 
> GB, etc.). The number reported in the UI is calculated to be the binary size 
> which should be noted using binary prefixes (MiB, GiB, etc.). The NameNode 
> webapp should be modified to use the correct binary prefixes.






[jira] [Updated] (HDFS-14078) Admin helper fails to prettify NullPointerExceptions

2018-11-14 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDFS-14078:

Attachment: HDFS-14078.002.patch

> Admin helper fails to prettify NullPointerExceptions
> 
>
> Key: HDFS-14078
> URL: https://issues.apache.org/jira/browse/HDFS-14078
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HDFS-14078.001.patch, HDFS-14078.002.patch
>
>
> org.apache.hadoop.hdfs.tools.AdminHelper has a method to prettify exceptions:
> {code}
>   static String prettifyException(Exception e) {
>     return e.getClass().getSimpleName() + ": "
>         + e.getLocalizedMessage().split("\n")[0];
>   }
> {code}
> But if e is an NPE, e.getLocalizedMessage() can be null. In that case a second 
> NPE is thrown and the original error message is lost.
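
One possible null-safe variant is sketched below in Java. It is an assumption about how the problem could be avoided, not necessarily what the attached patches do, and the wrapper class name PrettifyExceptionSketch is hypothetical. When the exception has no message, the sketch falls back to the simple class name alone.

```java
// Hypothetical stand-alone sketch; the real method lives in
// org.apache.hadoop.hdfs.tools.AdminHelper and may be fixed differently.
final class PrettifyExceptionSketch {
  private PrettifyExceptionSketch() {
  }

  static String prettifyException(Exception e) {
    String name = e.getClass().getSimpleName();
    String message = e.getLocalizedMessage();
    if (message == null) {
      // Exceptions such as a bare NullPointerException carry no message.
      return name;
    }
    // Keep only the first line; indexOf avoids the empty array that
    // String.split returns when the message is just "\n".
    int newline = message.indexOf('\n');
    String firstLine = newline >= 0 ? message.substring(0, newline) : message;
    return name + ": " + firstLine;
  }
}
```

Called with new NullPointerException(), this returns "NullPointerException" instead of throwing, so the caller still learns which exception occurred.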






[jira] [Commented] (HDFS-13963) NN UI is broken with IE11

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686724#comment-16686724
 ] 

Hadoop QA commented on HDFS-13963:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
56s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
35m 13s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 12s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m 21s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-13963 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12948149/HDFS-13963-03.patch |
| Optional Tests |  dupname  asflicense  shadedclient  |
| uname | Linux 5b1ce8955c22 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b57cc73 |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 462 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs 
hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25521/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> NN UI is broken with IE11
> -
>
> Key: HDFS-13963
> URL: https://issues.apache.org/jira/browse/HDFS-13963
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, ui
>Affects Versions: 3.1.1
>Reporter: Daisuke Kobayashi
>Assignee: Ayush Saxena
>Priority: Minor
>  Labels: newbie
> Attachments: Document-mode-IE9.png, HDFS-13963-01.patch, 
> HDFS-13963-02.patch, HDFS-13963-03.patch, Screen Shot 2018-10-05 at 
> 20.22.20.png, test-with-edge-mode.png
>
>
> Internet Explorer 11 cannot display the NameNode Web UI correctly, even though 
> the NN itself starts successfully. I have confirmed on both 3.1.1 (latest 
> release) and 3.3.0-SNAPSHOT (current trunk) that the following message is shown.
> {code}
> Failed to retrieve data from /jmx?qry=java.lang:type=Memory, cause: 
> SyntaxError: Invalid character
> {code}
> Apparently, this is because {{dfshealth.html}} runs in IE9 mode by default.
> {code}
> 
> {code}
> Once the compatibility mode is changed to IE11 through the developer tools, 
> the page is rendered correctly.






[jira] [Commented] (HDFS-14080) DFS usage metrics reported in incorrect prefix

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686760#comment-16686760
 ] 

Hadoop QA commented on HDFS-14080:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
29m 40s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 14s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 44m  6s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-14080 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12948151/HDFS-14080.001.patch |
| Optional Tests |  dupname  asflicense  shadedclient  |
| uname | Linux f14fda47455b 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b57cc73 |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 451 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25522/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> DFS usage metrics reported in incorrect prefix
> --
>
> Key: HDFS-14080
> URL: https://issues.apache.org/jira/browse/HDFS-14080
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, ui
>Reporter: Greg Phillips
>Priority: Trivial
> Attachments: HDFS-14080.001.patch
>
>
> The NameNode webapp reports DFS usage metrics using standard SI prefixes (MB, 
> GB, etc.). The number reported in the UI is calculated to be the binary size 
> which should be noted using binary prefixes (MiB, GiB, etc.). The NameNode 
> webapp should be modified to use the correct binary prefixes.






[jira] [Commented] (HDFS-14054) TestLeaseRecovery2: testHardLeaseRecoveryAfterNameNodeRestart2 and testHardLeaseRecoveryWithRenameAfterNameNodeRestart are flaky

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686804#comment-16686804
 ] 

Hadoop QA commented on HDFS-14054:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 14s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 53s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m  4s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}139m  6s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
|   | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-14054 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12948138/HDFS-14054.01.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux eed1fd22a2a2 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a948281 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25520/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25520/testReport/ |
| Max. process+thread count | 4377 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://bu

[jira] [Commented] (HDDS-223) Create acceptance test for using datanode plugin

2018-11-14 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686808#comment-16686808
 ] 

Elek, Marton commented on HDDS-223:
---

Thanks [~Sandeep Nemuri] for the patch. I tested it and it worked well. It 
looks good to me.

Only one question: can you please help me understand why you added the s3 
gateway to the ozone-hdfs cluster definition? The s3 tests are not executed 
there, so I can't see any reason for that.

> Create acceptance test for using datanode plugin
> 
>
> Key: HDDS-223
> URL: https://issues.apache.org/jira/browse/HDDS-223
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Assignee: Sandeep Nemuri
>Priority: Major
>  Labels: alpha2, newbie
> Attachments: HDDS-223.001.patch, HDDS-223.002.patch
>
>
> In the current docker-compose files (both in hadoop-dist and 
> acceptance-test) we use simplified ozone clusters: there is no namenode and 
> we use standalone hdds datanode processes.
> To test ozone/hdds as a datanode plugin we need to create separate 
> acceptance tests which use hadoop:3.1 and hadoop:3.0 + the ozone hdds datanode 
> plugin artifact.






[jira] [Updated] (HDDS-836) Create Ozone identifier for delegation token and block token

2018-11-14 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-836:

Attachment: HDDS-836-HDDS-4.02.patch

> Create Ozone identifier for delegation token and block token
> 
>
> Key: HDDS-836
> URL: https://issues.apache.org/jira/browse/HDDS-836
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-836-HDDS-4.00.patch, HDDS-836-HDDS-4.01.patch, 
> HDDS-836-HDDS-4.02.patch
>
>
> Create Ozone identifier for delegation token and block token.






[jira] [Updated] (HDDS-836) Create Ozone identifier for delegation token and block token

2018-11-14 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-836:

Attachment: (was: HDDS-836-HDDS-4.02.patch)

> Create Ozone identifier for delegation token and block token
> 
>
> Key: HDDS-836
> URL: https://issues.apache.org/jira/browse/HDDS-836
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-836-HDDS-4.00.patch, HDDS-836-HDDS-4.01.patch
>
>
> Create Ozone identifier for delegation token and block token.






[jira] [Commented] (HDDS-836) Create Ozone identifier for delegation token and block token

2018-11-14 Thread Ajay Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686840#comment-16686840
 ] 

Ajay Kumar commented on HDDS-836:
-

Patch v2 addresses the Jenkins issues.

> Create Ozone identifier for delegation token and block token
> 
>
> Key: HDDS-836
> URL: https://issues.apache.org/jira/browse/HDDS-836
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-836-HDDS-4.00.patch, HDDS-836-HDDS-4.01.patch, 
> HDDS-836-HDDS-4.02.patch
>
>
> Create Ozone identifier for delegation token and block token.






[jira] [Updated] (HDDS-836) Create Ozone identifier for delegation token and block token

2018-11-14 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-836:

Attachment: HDDS-836-HDDS-4.02.patch

> Create Ozone identifier for delegation token and block token
> 
>
> Key: HDDS-836
> URL: https://issues.apache.org/jira/browse/HDDS-836
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-836-HDDS-4.00.patch, HDDS-836-HDDS-4.01.patch, 
> HDDS-836-HDDS-4.02.patch
>
>
> Create Ozone identifier for delegation token and block token.






[jira] [Comment Edited] (HDDS-837) Persist originNodeId as part of .container file in datanode

2018-11-14 Thread Jitendra Nath Pandey (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686848#comment-16686848
 ] 

Jitendra Nath Pandey edited comment on HDDS-837 at 11/14/18 4:47 PM:
-

I think we should also add originPipelineId along with the originNodeId. Even 
if it is not used actively, it will be useful for debugging or looking at the 
history of a container. In case of multi-raft, with multiple pipelines being 
served from the same DN, it will be helpful to be able to distinguish which 
container belongs to which pipeline.


was (Author: jnp):
I think we should also add pipelineId along with the originNodeId. Even if it 
is not used actively, it will be useful for debugging or looking at the history 
of a container. In case of multi-raft, with multiple pipelines being served 
from the same DN, it will be helpful to be able to distinguish which container 
belongs to which pipeline.

> Persist originNodeId as part of .container file in datanode
> ---
>
> Key: HDDS-837
> URL: https://issues.apache.org/jira/browse/HDDS-837
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
>
> To differentiate the replicas of QUASI_CLOSED containers we need an 
> {{originNodeId}} field. With this field, we can uniquely identify a 
> QUASI_CLOSED container replica. This will be needed when we want to CLOSE a 
> QUASI_CLOSED container.
> This field will be set by the node where the container is created, stored 
> as part of the {{.container}} file, and sent as part of the ContainerReport to 
> SCM.






[jira] [Commented] (HDDS-837) Persist originNodeId as part of .container file in datanode

2018-11-14 Thread Jitendra Nath Pandey (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686848#comment-16686848
 ] 

Jitendra Nath Pandey commented on HDDS-837:
---

I think we should also add pipelineId along with the originNodeId. Even if it 
is not used actively, it will be useful for debugging or looking at the history 
of a container. In case of multi-raft, with multiple pipelines being served 
from the same DN, it will be helpful to be able to distinguish which container 
belongs to which pipeline.

> Persist originNodeId as part of .container file in datanode
> ---
>
> Key: HDDS-837
> URL: https://issues.apache.org/jira/browse/HDDS-837
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
>
> To differentiate the replicas of QUASI_CLOSED containers we need an 
> {{originNodeId}} field. With this field, we can uniquely identify a 
> QUASI_CLOSED container replica. This will be needed when we want to CLOSE a 
> QUASI_CLOSED container.
> This field will be set by the node where the container is created, stored 
> as part of the {{.container}} file, and sent as part of the ContainerReport to 
> SCM.






[jira] [Commented] (HDFS-14075) NPE while Edit Logging

2018-11-14 Thread Íñigo Goiri (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686927#comment-16686927
 ] 

Íñigo Goiri commented on HDFS-14075:


Is there a unit test we can extend to cover this?

> NPE while Edit Logging
> --
>
> Key: HDFS-14075
> URL: https://issues.apache.org/jira/browse/HDFS-14075
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Critical
> Attachments: HDFS-14075-01.patch
>
>
> {noformat}
> 2018-11-10 18:59:38,427 FATAL 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog: Exception while edit 
> logging: null
> java.lang.NullPointerException
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.doEditTransaction(FSEditLog.java:481)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync$Edit.logEdit(FSEditLogAsync.java:288)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.run(FSEditLogAsync.java:232)
>  at java.lang.Thread.run(Thread.java:745)
> 2018-11-10 18:59:38,532 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: Exception while edit logging: null
> 2018-11-10 18:59:38,552 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
> SHUTDOWN_MSG:
> {noformat}
> Before the NPE, the following exception was received:
> {noformat}
> INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 65110, call 
> Call#23241 Retry#0 
> org.apache.hadoop.hdfs.server.protocol.NamenodeProtocol.rollEditLog from 
> 
> java.io.IOException: Unable to start log segment 7964819: too few journals 
> successfully started.
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.startLogSegment(FSEditLog.java:1385)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.startLogSegmentAndWriteHeaderTxn(FSEditLog.java:1395)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.rollEditLog(FSEditLog.java:1319)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.rollEditLog(FSImage.java:1352)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:4669)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1293)
>   at 
> org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:146)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos$NamenodeProtocolService$2.callBlockingMethod(NamenodeProtocolProtos.java:12974)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:878)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:824)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2684)
> Caused by: java.io.IOException: starting log segment 7964819 failed for too 
> many journals
>   at 
> org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:412)
>   at 
> org.apache.hadoop.hdfs.server.namenode.JournalSet.startLogSegment(JournalSet.java:207)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.startLogSegment(FSEditLog.java:1383)
>   ... 15 more
> {noformat}






[jira] [Commented] (HDFS-14080) DFS usage metrics reported in incorrect prefix

2018-11-14 Thread Íñigo Goiri (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686928#comment-16686928
 ] 

Íñigo Goiri commented on HDFS-14080:


[~hfyang20071], you've made changes in this area recently in HDFS-13844.
Any thoughts on this change?

> DFS usage metrics reported in incorrect prefix
> --
>
> Key: HDFS-14080
> URL: https://issues.apache.org/jira/browse/HDFS-14080
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, ui
>Reporter: Greg Phillips
>Priority: Trivial
> Attachments: HDFS-14080.001.patch
>
>
> The NameNode webapp reports DFS usage metrics using standard SI prefixes (MB, 
> GB, etc.). The number reported in the UI is calculated as a binary size, 
> which should be denoted using binary prefixes (MiB, GiB, etc.). The NameNode 
> webapp should be modified to use the correct binary prefixes.
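
To make the distinction concrete, here is a minimal sketch of formatting a byte count with binary (IEC) prefixes. The class and method names are hypothetical and this is not the NameNode webapp's code; it only illustrates that dividing by 1024 goes with KiB/MiB/GiB labels, assuming two decimal places are wanted.

```java
import java.util.Locale;

/** Hypothetical helper for illustration; not taken from the HDFS-14080 patch. */
final class BinaryPrefixFormat {
  private static final String[] UNITS = {"B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"};

  /** Formats a non-negative byte count using powers of 1024 and IEC unit names. */
  static String format(long bytes) {
    if (bytes < 0) {
      throw new IllegalArgumentException("bytes must be non-negative");
    }
    double value = bytes;
    int unit = 0;
    // Divide by 1024 until the value fits the current unit or units run out.
    while (value >= 1024 && unit < UNITS.length - 1) {
      value /= 1024;
      unit++;
    }
    return unit == 0
        ? bytes + " B"
        : String.format(Locale.ROOT, "%.2f %s", value, UNITS[unit]);
  }

  public static void main(String[] args) {
    System.out.println(format(1_073_741_824L)); // 2^30 bytes
    System.out.println(format(1_000_000_000L)); // 10^9 bytes, i.e. 1 GB in SI terms
  }
}
```

With this convention 2^30 bytes is labelled 1.00 GiB, while 10^9 bytes (1 GB in SI terms) comes out below 1 GiB, which is the mismatch the issue describes.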






[jira] [Updated] (HDFS-14017) ObserverReadProxyProviderWithIPFailover should work with HA configuration

2018-11-14 Thread Chen Liang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-14017:
--
Attachment: HDFS-14017-HDFS-12943.011.patch

> ObserverReadProxyProviderWithIPFailover should work with HA configuration
> -
>
> Key: HDFS-14017
> URL: https://issues.apache.org/jira/browse/HDFS-14017
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-14017-HDFS-12943.001.patch, 
> HDFS-14017-HDFS-12943.002.patch, HDFS-14017-HDFS-12943.003.patch, 
> HDFS-14017-HDFS-12943.004.patch, HDFS-14017-HDFS-12943.005.patch, 
> HDFS-14017-HDFS-12943.006.patch, HDFS-14017-HDFS-12943.008.patch, 
> HDFS-14017-HDFS-12943.009.patch, HDFS-14017-HDFS-12943.010.patch, 
> HDFS-14017-HDFS-12943.011.patch
>
>
> Currently {{ObserverReadProxyProviderWithIPFailover}} extends 
> {{ObserverReadProxyProvider}}, and the only difference is that it changes the 
> proxy factory to use {{IPFailoverProxyProvider}}. However, this is not enough 
> because when the constructor of {{ObserverReadProxyProvider}} is called via 
> super(...), the following line:
> {code:java}
> nameNodeProxies = getProxyAddresses(uri,
> HdfsClientConfigKeys.DFS_NAMENODE_RPC_ADDRESS_KEY);
> {code}
> will try to resolve all the configured NN addresses to do configured 
> failover. But in the case of IPFailover, this does not really apply.
>  
> A second, closely related issue concerns the delegation token (DT). For 
> example, in the current IPFailover setup, say we have a virtual host 
> nn.xyz.com, which points to either of two physical nodes, nn1.xyz.com or 
> nn2.xyz.com. In current HDFS, there is always only one DT being exchanged, 
> which has hostname nn.xyz.com. The server only issues this DT, and the client 
> only knows the host nn.xyz.com, so all is good. But with Observer reads, even 
> with IPFailover, the client will no longer contact nn.xyz.com, but will 
> actively reach out to nn1.xyz.com and nn2.xyz.com. During this process, the 
> current code will look for a DT associated with hostname nn1.xyz.com or 
> nn2.xyz.com, which is different from the DT given by the NN, causing token 
> authentication to fail. This happens in 
> {{AbstractDelegationTokenSelector#selectToken}}. The new IPFailover proxy 
> provider will need to resolve this as well.
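
The token mismatch described above can be modelled in a few lines. This is only a sketch: it does not use Hadoop's Token or TokenSelector classes, every class and method name is hypothetical, the port 8020 is an assumption, and the fallback shown is one possible approach rather than what the patch does.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Illustrative model only; it does not use Hadoop's token classes. */
final class TokenLookupSketch {
  // Tokens held by the client, keyed by the service (host:port) they were issued for.
  private final Map<String, String> tokensByService = new HashMap<>();
  // Hypothetical mapping from each physical NameNode address to the virtual address.
  private final Map<String, String> physicalToVirtual = new HashMap<>();

  void addToken(String service, String token) {
    tokensByService.put(service, token);
  }

  void addAlias(String physical, String virtual) {
    physicalToVirtual.put(physical, virtual);
  }

  /** Exact-match lookup, mirroring the failure mode described in the issue. */
  Optional<String> selectExact(String service) {
    return Optional.ofNullable(tokensByService.get(service));
  }

  /** Lookup that falls back to the virtual address when the physical one has no token. */
  Optional<String> selectWithFallback(String service) {
    Optional<String> exact = selectExact(service);
    if (exact.isPresent()) {
      return exact;
    }
    String virtual = physicalToVirtual.get(service);
    return virtual == null ? Optional.empty() : selectExact(virtual);
  }

  public static void main(String[] args) {
    TokenLookupSketch client = new TokenLookupSketch();
    client.addToken("nn.xyz.com:8020", "dt-for-virtual-host");
    client.addAlias("nn1.xyz.com:8020", "nn.xyz.com:8020");
    System.out.println(client.selectExact("nn1.xyz.com:8020").isPresent());
    System.out.println(client.selectWithFallback("nn1.xyz.com:8020").isPresent());
  }
}
```

The exact-match lookup for nn1.xyz.com finds nothing because the only DT was issued for nn.xyz.com; the fallback variant finds it by first mapping the physical address back to the virtual one.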






[jira] [Commented] (HDFS-14078) Admin helper fails to prettify NullPointerExceptions

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686942#comment-16686942
 ] 

Hadoop QA commented on HDFS-14078:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 10s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 33s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 96m 26s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}164m 46s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestReconstructStripedBlocks |
|   | hadoop.hdfs.tools.TestAdminHelper |
|   | hadoop.hdfs.TestMaintenanceState |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-14078 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12948154/HDFS-14078.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 691cc810cb90 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b57cc73 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25523/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25523/testReport/ |
| Max. process+thread count | 2564 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org

[jira] [Commented] (HDFS-14017) ObserverReadProxyProviderWithIPFailover should work with HA configuration

2018-11-14 Thread Chen Liang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686936#comment-16686936
 ] 

Chen Liang commented on HDFS-14017:
---

v011 patch to fix checkstyle issues.

> ObserverReadProxyProviderWithIPFailover should work with HA configuration
> -
>
> Key: HDFS-14017
> URL: https://issues.apache.org/jira/browse/HDFS-14017
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-14017-HDFS-12943.001.patch, 
> HDFS-14017-HDFS-12943.002.patch, HDFS-14017-HDFS-12943.003.patch, 
> HDFS-14017-HDFS-12943.004.patch, HDFS-14017-HDFS-12943.005.patch, 
> HDFS-14017-HDFS-12943.006.patch, HDFS-14017-HDFS-12943.008.patch, 
> HDFS-14017-HDFS-12943.009.patch, HDFS-14017-HDFS-12943.010.patch, 
> HDFS-14017-HDFS-12943.011.patch
>
>
> Currently {{ObserverReadProxyProviderWithIPFailover}} extends 
> {{ObserverReadProxyProvider}}, and the only difference is that it changes the 
> proxy factory to use {{IPFailoverProxyProvider}}. However, this is not enough 
> because when the constructor of {{ObserverReadProxyProvider}} is called via 
> super(...), the following line:
> {code:java}
> nameNodeProxies = getProxyAddresses(uri,
> HdfsClientConfigKeys.DFS_NAMENODE_RPC_ADDRESS_KEY);
> {code}
> will try to resolve all the configured NN addresses to do configured 
> failover. But in the case of IPFailover, this does not really apply.
>  
> A second, closely related issue concerns the delegation token (DT). For 
> example, in the current IPFailover setup, say we have a virtual host 
> nn.xyz.com, which points to either of two physical nodes, nn1.xyz.com or 
> nn2.xyz.com. In current HDFS, there is always only one DT being exchanged, 
> which has hostname nn.xyz.com. The server only issues this DT, and the client 
> only knows the host nn.xyz.com, so all is good. But with Observer reads, even 
> with IPFailover, the client will no longer contact nn.xyz.com, but will 
> actively reach out to nn1.xyz.com and nn2.xyz.com. During this process, the 
> current code will look for a DT associated with hostname nn1.xyz.com or 
> nn2.xyz.com, which is different from the DT given by the NN, causing token 
> authentication to fail. This happens in 
> {{AbstractDelegationTokenSelector#selectToken}}. The new IPFailover proxy 
> provider will need to resolve this as well.






[jira] [Commented] (HDDS-836) Create Ozone identifier for delegation token and block token

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686963#comment-16686963
 ] 

Hadoop QA commented on HDDS-836:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} HDDS-4 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  7m 
11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
14s{color} | {color:green} HDDS-4 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
25s{color} | {color:green} HDDS-4 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 1s{color} | {color:green} HDDS-4 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
47s{color} | {color:green} HDDS-4 passed {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 15m 
33s{color} | {color:red} branch has errors when building and testing our client 
artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
54s{color} | {color:green} HDDS-4 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
55s{color} | {color:green} HDDS-4 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 15m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 10m 
24s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 38s{color} 
| {color:red} common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 33s{color} 
| {color:red} common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 31s{color} 
| {color:red} ozone-manager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}110m 14s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDDS-836 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12948185/HDDS-836-HDDS-4.02.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux e72f49975624 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | HDDS-4 / 629347b |
| maven | version: Apache Maven 3.3.9 

[jira] [Commented] (HDDS-836) Create Ozone identifier for delegation token and block token

2018-11-14 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686952#comment-16686952
 ] 

Xiaoyu Yao commented on HDDS-836:
-

Thanks [~ajayydv] for the update. Two more minor issues; +1 once they are 
fixed, pending Jenkins.

 

OzoneBlockTokenIdentifier.java

Line 67-68: NIT: we can eliminate the local variable user and use the blockId 
directly.

 

Lines 142-155: these comments come from the HDFS BlockTokenIdentifier class and 
don't apply here.

 

> Create Ozone identifier for delegation token and block token
> 
>
> Key: HDDS-836
> URL: https://issues.apache.org/jira/browse/HDDS-836
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-836-HDDS-4.00.patch, HDDS-836-HDDS-4.01.patch, 
> HDDS-836-HDDS-4.02.patch
>
>
> Create Ozone identifier for delegation token and block token.






[jira] [Updated] (HDDS-836) Create Ozone identifier for delegation token and block token

2018-11-14 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-836:

Attachment: HDDS-836-HDDS-4.03.patch

> Create Ozone identifier for delegation token and block token
> 
>
> Key: HDDS-836
> URL: https://issues.apache.org/jira/browse/HDDS-836
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-836-HDDS-4.00.patch, HDDS-836-HDDS-4.01.patch, 
> HDDS-836-HDDS-4.02.patch, HDDS-836-HDDS-4.03.patch
>
>
> Create Ozone identifier for delegation token and block token.






[jira] [Commented] (HDDS-836) Create Ozone identifier for delegation token and block token

2018-11-14 Thread Ajay Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686977#comment-16686977
 ] 

Ajay Kumar commented on HDDS-836:
-

[~xyao] removed the user local variable and javadoc comment in patch v3.

> Create Ozone identifier for delegation token and block token
> 
>
> Key: HDDS-836
> URL: https://issues.apache.org/jira/browse/HDDS-836
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-836-HDDS-4.00.patch, HDDS-836-HDDS-4.01.patch, 
> HDDS-836-HDDS-4.02.patch, HDDS-836-HDDS-4.03.patch
>
>
> Create Ozone identifier for delegation token and block token.






[jira] [Commented] (HDFS-14045) Use different metrics in DataNode to better measure latency of heartbeat/blockReports/incrementalBlockReports of Active/Standby NN

2018-11-14 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-14045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686969#comment-16686969
 ] 

Íñigo Goiri commented on HDFS-14045:


It looks like Jenkins is having a hard time.
Thanks for tackling my comments in [^HDFS-14045.010.patch].
For the Unknown-Unknown case, I'm not sure it is worth showing them; we should 
just not store those. I thought we had already covered this.
For the ns0-Unknown case, we should just make it ns0.
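
The naming rule suggested above could be sketched as follows. This is a hypothetical helper, not code from [^HDFS-14045.010.patch]; the class and method names are invented, and the handling of an unknown nameservice with a known NameNode ID is an assumption the comment does not cover.

```java
import java.util.Optional;

/** Hypothetical naming helper; not the DataNode metrics code from the patch. */
final class NnMetricSuffix {
  private static final String UNKNOWN = "Unknown";

  /**
   * Builds a metric-name suffix from a nameservice ID and a NameNode ID.
   * Unknown parts are dropped; if nothing is known, no suffix is produced
   * so the caller can skip storing the metric.
   */
  static Optional<String> suffix(String nsId, String nnId) {
    boolean nsKnown = isKnown(nsId);
    boolean nnKnown = isKnown(nnId);
    if (!nsKnown && !nnKnown) {
      return Optional.empty();      // the Unknown-Unknown case: do not store
    }
    if (!nnKnown) {
      return Optional.of(nsId);     // the ns0-Unknown case becomes ns0
    }
    if (!nsKnown) {
      return Optional.of(nnId);     // assumption: keep whichever part is known
    }
    return Optional.of(nsId + "-" + nnId);
  }

  private static boolean isKnown(String id) {
    return id != null && !id.isEmpty() && !UNKNOWN.equals(id);
  }

  public static void main(String[] args) {
    System.out.println(suffix("ns0", "Unknown"));
    System.out.println(suffix("Unknown", "Unknown"));
  }
}
```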

> Use different metrics in DataNode to better measure latency of 
> heartbeat/blockReports/incrementalBlockReports of Active/Standby NN
> --
>
> Key: HDFS-14045
> URL: https://issues.apache.org/jira/browse/HDFS-14045
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Jiandan Yang 
>Assignee: Jiandan Yang 
>Priority: Major
> Attachments: HDFS-14045.001.patch, HDFS-14045.002.patch, 
> HDFS-14045.003.patch, HDFS-14045.004.patch, HDFS-14045.005.patch, 
> HDFS-14045.006.patch, HDFS-14045.007.patch, HDFS-14045.008.patch, 
> HDFS-14045.009.patch, HDFS-14045.010.patch
>
>
> Currently the DataNode uses the same metrics to measure RPC latency for every 
> NameNode, but the Active and Standby usually perform differently at any given 
> time, especially in a large cluster. For example, the RPC latency of the 
> Standby is very high when the Standby is catching up on the edit log, so we 
> may misjudge the state of HDFS. Using different metrics for Active and 
> Standby can help us obtain more precise metric data.






[jira] [Commented] (HDFS-14054) TestLeaseRecovery2: testHardLeaseRecoveryAfterNameNodeRestart2 and testHardLeaseRecoveryWithRenameAfterNameNodeRestart are flaky

2018-11-14 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-14054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686973#comment-16686973
 ] 

Íñigo Goiri commented on HDFS-14054:


Thanks [~zvenczel] for the investigation.
[^HDFS-14054.01.patch] LGTM.
+1

> TestLeaseRecovery2: testHardLeaseRecoveryAfterNameNodeRestart2 and 
> testHardLeaseRecoveryWithRenameAfterNameNodeRestart are flaky
> 
>
> Key: HDFS-14054
> URL: https://issues.apache.org/jira/browse/HDFS-14054
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.3
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
>  Labels: flaky-test
> Attachments: HDFS-14054.01.patch
>
>
> ---
>  T E S T S
> ---
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support 
> was removed in 8.0
> Running org.apache.hadoop.hdfs.TestLeaseRecovery2
> Tests run: 7, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 68.971 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.TestLeaseRecovery2
> testHardLeaseRecoveryAfterNameNodeRestart2(org.apache.hadoop.hdfs.TestLeaseRecovery2)
>   Time elapsed: 4.375 sec  <<< FAILURE!
> java.lang.AssertionError: lease holder should now be the NN
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.checkLease(TestLeaseRecovery2.java:568)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.hardLeaseRecoveryRestartHelper(TestLeaseRecovery2.java:520)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart2(TestLeaseRecovery2.java:437)
> testHardLeaseRecoveryWithRenameAfterNameNodeRestart(org.apache.hadoop.hdfs.TestLeaseRecovery2)
>   Time elapsed: 4.339 sec  <<< FAILURE!
> java.lang.AssertionError: lease holder should now be the NN
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.checkLease(TestLeaseRecovery2.java:568)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.hardLeaseRecoveryRestartHelper(TestLeaseRecovery2.java:520)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryWithRenameAfterNameNodeRestart(TestLeaseRecovery2.java:443)
> Results :
> Failed tests: 
>   
> TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart2:437->hardLeaseRecoveryRestartHelper:520->checkLease:568
>  lease holder should now be the NN
>   
> TestLeaseRecovery2.testHardLeaseRecoveryWithRenameAfterNameNodeRestart:443->hardLeaseRecoveryRestartHelper:520->checkLease:568
>  lease holder should now be the NN
> Tests run: 7, Failures: 2, Errors: 0, Skipped: 0






[jira] [Commented] (HDFS-14063) Support noredirect param for CREATE/APPEND/OPEN in HttpFS

2018-11-14 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-14063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686986#comment-16686986
 ] 

Íñigo Goiri commented on HDFS-14063:


[~cheersyang], you have experience here.
Can you take a look?

> Support noredirect param for CREATE/APPEND/OPEN in HttpFS
> -
>
> Key: HDFS-14063
> URL: https://issues.apache.org/jira/browse/HDFS-14063
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-14063.000.patch, HDFS-14063.001.patch, 
> HDFS-14063.002.patch
>
>
> Currently HttpFS always redirects the URI. However, the WebUI uses 
> noredirect=true which means it only wants a response with the location. This 
> is properly done in {{NamenodeWebHDFSMethods}}. HttpFS should do the same.
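
The two response shapes described above can be sketched without any web framework. Everything here is a hypothetical model, not HttpFS or {{NamenodeWebHDFSMethods}} code: the class names are invented, and the sketch assumes the location string needs no JSON escaping.

```java
/** Hypothetical model of the two response shapes; not HttpFS or WebHDFS code. */
final class NoRedirectSketch {

  /** Minimal stand-in for an HTTP response. */
  static final class Reply {
    final int status;
    final String locationHeader; // null when no redirect is issued
    final String body;

    Reply(int status, String locationHeader, String body) {
      this.status = status;
      this.locationHeader = locationHeader;
      this.body = body;
    }
  }

  /**
   * With noredirect=true the location is returned in a JSON body with status 200;
   * otherwise the client is redirected with status 307 and a Location header.
   * Assumes the location contains no characters that need JSON escaping.
   */
  static Reply respond(String location, boolean noRedirect) {
    if (noRedirect) {
      return new Reply(200, null, "{\"Location\":\"" + location + "\"}");
    }
    return new Reply(307, location, "");
  }

  public static void main(String[] args) {
    Reply reply = respond("http://dn1.example.com:9864/webhdfs/v1/f?op=CREATE", true);
    System.out.println(reply.status + " " + reply.body);
  }
}
```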






[jira] [Commented] (HDFS-14054) TestLeaseRecovery2: testHardLeaseRecoveryAfterNameNodeRestart2 and testHardLeaseRecoveryWithRenameAfterNameNodeRestart are flaky

2018-11-14 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-14054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686990#comment-16686990
 ] 

Íñigo Goiri commented on HDFS-14054:


Anybody else available for review? I'd like to get some additional feedback; 
otherwise, I would commit tomorrow or so.

> TestLeaseRecovery2: testHardLeaseRecoveryAfterNameNodeRestart2 and 
> testHardLeaseRecoveryWithRenameAfterNameNodeRestart are flaky
> 
>
> Key: HDFS-14054
> URL: https://issues.apache.org/jira/browse/HDFS-14054
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.3
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
>  Labels: flaky-test
> Attachments: HDFS-14054.01.patch
>
>
> ---
>  T E S T S
> ---
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support 
> was removed in 8.0
> Running org.apache.hadoop.hdfs.TestLeaseRecovery2
> Tests run: 7, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 68.971 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.TestLeaseRecovery2
> testHardLeaseRecoveryAfterNameNodeRestart2(org.apache.hadoop.hdfs.TestLeaseRecovery2)
>   Time elapsed: 4.375 sec  <<< FAILURE!
> java.lang.AssertionError: lease holder should now be the NN
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.checkLease(TestLeaseRecovery2.java:568)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.hardLeaseRecoveryRestartHelper(TestLeaseRecovery2.java:520)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart2(TestLeaseRecovery2.java:437)
> testHardLeaseRecoveryWithRenameAfterNameNodeRestart(org.apache.hadoop.hdfs.TestLeaseRecovery2)
>   Time elapsed: 4.339 sec  <<< FAILURE!
> java.lang.AssertionError: lease holder should now be the NN
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.checkLease(TestLeaseRecovery2.java:568)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.hardLeaseRecoveryRestartHelper(TestLeaseRecovery2.java:520)
>   at 
> org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryWithRenameAfterNameNodeRestart(TestLeaseRecovery2.java:443)
> Results :
> Failed tests: 
>   
> TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart2:437->hardLeaseRecoveryRestartHelper:520->checkLease:568
>  lease holder should now be the NN
>   
> TestLeaseRecovery2.testHardLeaseRecoveryWithRenameAfterNameNodeRestart:443->hardLeaseRecoveryRestartHelper:520->checkLease:568
>  lease holder should now be the NN
> Tests run: 7, Failures: 2, Errors: 0, Skipped: 0






[jira] [Commented] (HDFS-6874) Add GETFILEBLOCKLOCATIONS operation to HttpFS

2018-11-14 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686985#comment-16686985
 ] 

Íñigo Goiri commented on HDFS-6874:
---

The situation with GET_BLOCK_LOCATIONS and GETFILEBLOCKLOCATIONS is pretty 
messy.
HDFS-11156 has been reverted, so this is somewhat in between now.
I would like to get that sorted out and then tackle this issue.
[~cheersyang], you are involved in both; what would be the best course of 
action here?

GET_BLOCK_LOCATIONS is used by explorer.js to get the block locations.
Internally, I had to add this method by casting to DistributedFileSystem; that 
could be an option.

> Add GETFILEBLOCKLOCATIONS operation to HttpFS
> -
>
> Key: HDFS-6874
> URL: https://issues.apache.org/jira/browse/HDFS-6874
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: httpfs
>Affects Versions: 2.4.1, 2.7.3
>Reporter: Gao Zhong Liang
>Assignee: Weiwei Yang
>Priority: Major
>  Labels: BB2015-05-TBR
> Attachments: HDFS-6874-1.patch, HDFS-6874-branch-2.6.0.patch, 
> HDFS-6874.02.patch, HDFS-6874.03.patch, HDFS-6874.04.patch, 
> HDFS-6874.05.patch, HDFS-6874.06.patch, HDFS-6874.07.patch, 
> HDFS-6874.08.patch, HDFS-6874.09.patch, HDFS-6874.10.patch, HDFS-6874.patch
>
>
> The GETFILEBLOCKLOCATIONS operation, which is already supported in WebHDFS, 
> is missing in HttpFS. For a GETFILEBLOCKLOCATIONS request, 
> org.apache.hadoop.fs.http.server.HttpFSServer currently returns BAD_REQUEST:
> ...
> case GETFILEBLOCKLOCATIONS: {
>   response = Response.status(Response.Status.BAD_REQUEST).build();
>   break;
> }
>  






[jira] [Commented] (HDFS-14015) Improve error handling in hdfsThreadDestructor in native thread local storage

2018-11-14 Thread Daniel Templeton (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687004#comment-16687004
 ] 

Daniel Templeton commented on HDFS-14015:
-

I don't see any evidence of a test failure, so I'm not sure what's up with the 
-1.  [~pranay_singh] or [~xiaochen], would one of you care to review this patch?

> Improve error handling in hdfsThreadDestructor in native thread local storage
> -
>
> Key: HDFS-14015
> URL: https://issues.apache.org/jira/browse/HDFS-14015
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: native
>Affects Versions: 3.0.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Major
> Attachments: HDFS-14015.001.patch, HDFS-14015.002.patch, 
> HDFS-14015.003.patch, HDFS-14015.004.patch, HDFS-14015.005.patch, 
> HDFS-14015.006.patch, HDFS-14015.007.patch, HDFS-14015.008.patch, 
> HDFS-14015.009.patch, HDFS-14015.010.patch
>
>
> In the hdfsThreadDestructor() function, we ignore the return value from the 
> DetachCurrentThread() call.  We are seeing cases where a native thread dies 
> while holding a JVM monitor, and it doesn't release the monitor.  We're 
> hoping that logging this error instead of ignoring it will shed some light on 
> the issue.  In any case, it's good programming practice.






[jira] [Commented] (HDFS-14017) ObserverReadProxyProviderWithIPFailover should work with HA configuration

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687006#comment-16687006
 ] 

Hadoop QA commented on HDFS-14017:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} HDFS-12943 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
12s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} HDFS-12943 passed {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 11m 
51s{color} | {color:red} branch has errors when building and testing our client 
artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
36s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} HDFS-12943 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs-client: The patch 
generated 0 new + 4 unchanged - 3 fixed = 4 total (was 7) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 11m 
57s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 44s{color} 
| {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 54m 22s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-14017 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12948191/HDFS-14017-HDFS-12943.011.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f1821418cc4d 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-12943 / 8b5277f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25524/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-client.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25524/testReport/ |
| Max. process+thread count | 97 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-client U: 
hadoop

[jira] [Commented] (HDDS-836) Create Ozone identifier for delegation token and block token

2018-11-14 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687013#comment-16687013
 ] 

Xiaoyu Yao commented on HDDS-836:
-

There is an unexpected change in HDDSKeyPEMWriter.java. Can you remove it from 
patch v3? 

 

> Create Ozone identifier for delegation token and block token
> 
>
> Key: HDDS-836
> URL: https://issues.apache.org/jira/browse/HDDS-836
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-836-HDDS-4.00.patch, HDDS-836-HDDS-4.01.patch, 
> HDDS-836-HDDS-4.02.patch, HDDS-836-HDDS-4.03.patch
>
>
> Create Ozone identifier for delegation token and block token.






[jira] [Commented] (HDFS-14067) Allow manual failover between standby and observer

2018-11-14 Thread Plamen Jeliazkov (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687015#comment-16687015
 ] 

Plamen Jeliazkov commented on HDFS-14067:
-

Seems we may need to re-purpose this JIRA.

I am unable to make use of `--forcemanual` properly. The 
`-transitionToObserver` command fails to make use of it and if I try to I get 
"Incorrect number of arguments". Likely this is because of lines 477-479 of 
HAAdmin (without .000 patch) where `-transitionToObserver` is not listed as a 
command that accepts `--forcemanual`.

{code:java}
// Mutative commands take FORCEMANUAL option
if ("-transitionToActive".equals(cmd) ||
"-transitionToStandby".equals(cmd) ||
"-failover".equals(cmd)) {
  opts.addOption(FORCEMANUAL, false,
  "force manual control even if auto-failover is enabled");
}
{code}
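
As a rough illustration of the gap described above, the sketch below models the option check with plain Java collections instead of the real HAAdmin and commons-cli classes. The class name ForceManualSketch and the method acceptsForceManual are hypothetical; only the command strings come from the snippet above. It shows how adding `-transitionToObserver` to the list would let that command accept the flag.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ForceManualSketch {
  // Commands that the existing check treats as accepting the force-manual option.
  static final Set<String> CURRENT = new HashSet<>(Arrays.asList(
      "-transitionToActive", "-transitionToStandby", "-failover"));

  // Hypothetical extended list that also covers the observer transition.
  static final Set<String> PROPOSED = new HashSet<>(CURRENT);
  static {
    PROPOSED.add("-transitionToObserver");
  }

  // Stand-in for the decision to register the option for a given command.
  static boolean acceptsForceManual(Set<String> mutativeCommands, String cmd) {
    return mutativeCommands.contains(cmd);
  }

  public static void main(String[] args) {
    String cmd = "-transitionToObserver";
    System.out.println("current list accepts flag:  "
        + acceptsForceManual(CURRENT, cmd));
    System.out.println("proposed list accepts flag: "
        + acceptsForceManual(PROPOSED, cmd));
  }
}
```

With the current list the flag is never registered for the observer transition, which would be consistent with the "Incorrect number of arguments" error when the extra argument is passed.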

> Allow manual failover between standby and observer
> --
>
> Key: HDFS-14067
> URL: https://issues.apache.org/jira/browse/HDFS-14067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-14067-HDFS-12943.000.patch
>
>
> Currently if automatic failover is enabled in a HA environment, transition 
> from standby to observer would be blocked:
> {code}
> [hdfs@*** hadoop-3.3.0-SNAPSHOT]$ bin/hdfs haadmin -transitionToObserver ha2
> Automatic failover is enabled for NameNode at 
> Refusing to manually manage HA state, since it may cause
> a split-brain scenario or other incorrect state.
> If you are very sure you know what you are doing, please
> specify the --forcemanual flag.
> {code}
> We should allow manual transition between standby and observer in this case.






[jira] [Commented] (HDDS-120) Adding HDDS datanode Audit Log

2018-11-14 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687011#comment-16687011
 ] 

Xiaoyu Yao commented on HDDS-120:
-

Thanks [~dineshchitlangia] for the update. Patch v5 looks pretty good to me. 
Two more comments:

HddsDispatcher.java

Line 86: dnActionTypeMap can be eliminated by leveraging 
HddsUtils.isReadOnly(msg) with a helper function like the one below:

{code}
private EventType getActionType(ContainerCommandRequestProto msg) {
  return HddsUtils.isReadOnly(msg) ? EventType.READ : EventType.WRITE;
}
{code}
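
A self-contained sketch of the same idea, for illustration only. `Request`, `isReadOnly`, and the `EventType` enum below are simplified stand-ins, not the real ContainerCommandRequestProto, HddsUtils, or audit classes, and the command names used to decide what is read-only are assumptions of the sketch.

```java
public class ActionTypeSketch {
  enum EventType { READ, WRITE }

  // Stand-in for ContainerCommandRequestProto: only a command name is modeled.
  static final class Request {
    final String cmd;

    Request(String cmd) {
      this.cmd = cmd;
    }
  }

  // Stand-in for HddsUtils.isReadOnly(msg); the prefixes are hypothetical.
  static boolean isReadOnly(Request msg) {
    return msg.cmd.startsWith("Read")
        || msg.cmd.startsWith("Get")
        || msg.cmd.startsWith("List");
  }

  // Derives the audit action type from the request instead of a lookup map.
  static EventType getActionType(Request msg) {
    return isReadOnly(msg) ? EventType.READ : EventType.WRITE;
  }

  public static void main(String[] args) {
    System.out.println(getActionType(new Request("ReadChunk")));
    System.out.println(getActionType(new Request("WriteChunk")));
  }
}
```

Deriving the type from a single predicate means a newly added command cannot be forgotten in a separate map; it only has to be classified once as read-only or not.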

 

Line 460/476: for now, let's use null or a fixed INVALID user/ip instead of 
using the routines from Hadoop RPC Server. We will provide a GRPC Security 
Context later so that they can be retrieved properly. 

 

> Adding HDDS datanode Audit Log
> --
>
> Key: HDDS-120
> URL: https://issues.apache.org/jira/browse/HDDS-120
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Dinesh Chitlangia
>Priority: Major
>  Labels: alpha2
> Attachments: HDDS-120.001.patch, HDDS-120.002.patch, 
> HDDS-120.003.patch, HDDS-120.004.patch, HDDS-120.005.patch
>
>
> This can be useful to find users who overload the DNs. 






[jira] [Comment Edited] (HDFS-14067) Allow manual failover between standby and observer

2018-11-14 Thread Plamen Jeliazkov (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687015#comment-16687015
 ] 

Plamen Jeliazkov edited comment on HDFS-14067 at 11/14/18 7:32 PM:
---

Seems we may need to re-purpose this JIRA.

I am unable to make use of `-forcemanual` properly. The `-transitionToObserver` 
command fails to make use of it and if I try to I get "Incorrect number of 
arguments". Likely this is because of lines 477-479 of HAAdmin (without .000 
patch) where `-transitionToObserver` is not listed as a command that accepts 
`-forcemanual`.

FWIW, the .000 patch didn't address it either.

{code:java}
// Mutative commands take FORCEMANUAL option
if ("-transitionToActive".equals(cmd) ||
"-transitionToStandby".equals(cmd) ||
"-failover".equals(cmd)) {
  opts.addOption(FORCEMANUAL, false,
  "force manual control even if auto-failover is enabled");
}
{code}


was (Author: zero45):
Seems we may need to re-purpose this JIRA.

I am unable to make use of `-forcemanual` properly. The `-transitionToObserver` 
command fails to make use of it and if I try to I get "Incorrect number of 
arguments". Likely this is because of lines 477-479 of HAAdmin (without .000 
patch) where `-transitionToObserver` is not listed as a command that accepts 
`-forcemanual`.

{code:java}
// Mutative commands take FORCEMANUAL option
if ("-transitionToActive".equals(cmd) ||
"-transitionToStandby".equals(cmd) ||
"-failover".equals(cmd)) {
  opts.addOption(FORCEMANUAL, false,
  "force manual control even if auto-failover is enabled");
}
{code}

> Allow manual failover between standby and observer
> --
>
> Key: HDFS-14067
> URL: https://issues.apache.org/jira/browse/HDFS-14067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-14067-HDFS-12943.000.patch
>
>
> Currently if automatic failover is enabled in a HA environment, transition 
> from standby to observer would be blocked:
> {code}
> [hdfs@*** hadoop-3.3.0-SNAPSHOT]$ bin/hdfs haadmin -transitionToObserver ha2
> Automatic failover is enabled for NameNode at 
> Refusing to manually manage HA state, since it may cause
> a split-brain scenario or other incorrect state.
> If you are very sure you know what you are doing, please
> specify the --forcemanual flag.
> {code}
> We should allow manual transition between standby and observer in this case.






[jira] [Comment Edited] (HDFS-14067) Allow manual failover between standby and observer

2018-11-14 Thread Plamen Jeliazkov (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687015#comment-16687015
 ] 

Plamen Jeliazkov edited comment on HDFS-14067 at 11/14/18 7:28 PM:
---

Seems we may need to re-purpose this JIRA.

I am unable to make use of `-forcemanual` properly. The `-transitionToObserver` 
command fails to make use of it and if I try to I get "Incorrect number of 
arguments". Likely this is because of lines 477-479 of HAAdmin (without .000 
patch) where `-transitionToObserver` is not listed as a command that accepts 
`-forcemanual`.

{code:java}
// Mutative commands take FORCEMANUAL option
if ("-transitionToActive".equals(cmd) ||
"-transitionToStandby".equals(cmd) ||
"-failover".equals(cmd)) {
  opts.addOption(FORCEMANUAL, false,
  "force manual control even if auto-failover is enabled");
}
{code}


was (Author: zero45):
Seems we may need to re-purpose this JIRA.

I am unable to make use of `--forcemanual` properly. The 
`-transitionToObserver` command fails to make use of it and if I try to I get 
"Incorrect number of arguments". Likely this is because of lines 477-479 of 
HAAdmin (without .000 patch) where `-transitionToObserver` is not listed as a 
command that accepts `--forcemanual`.

{code:java}
// Mutative commands take FORCEMANUAL option
if ("-transitionToActive".equals(cmd) ||
"-transitionToStandby".equals(cmd) ||
"-failover".equals(cmd)) {
  opts.addOption(FORCEMANUAL, false,
  "force manual control even if auto-failover is enabled");
}
{code}

> Allow manual failover between standby and observer
> --
>
> Key: HDFS-14067
> URL: https://issues.apache.org/jira/browse/HDFS-14067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-14067-HDFS-12943.000.patch
>
>
> Currently if automatic failover is enabled in a HA environment, transition 
> from standby to observer would be blocked:
> {code}
> [hdfs@*** hadoop-3.3.0-SNAPSHOT]$ bin/hdfs haadmin -transitionToObserver ha2
> Automatic failover is enabled for NameNode at 
> Refusing to manually manage HA state, since it may cause
> a split-brain scenario or other incorrect state.
> If you are very sure you know what you are doing, please
> specify the --forcemanual flag.
> {code}
> We should allow manual transition between standby and observer in this case.






[jira] [Updated] (HDFS-14015) Improve error handling in hdfsThreadDestructor in native thread local storage

2018-11-14 Thread Daniel Templeton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated HDFS-14015:

Attachment: HDFS-14015.011.patch

> Improve error handling in hdfsThreadDestructor in native thread local storage
> -
>
> Key: HDFS-14015
> URL: https://issues.apache.org/jira/browse/HDFS-14015
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: native
>Affects Versions: 3.0.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Major
> Attachments: HDFS-14015.001.patch, HDFS-14015.002.patch, 
> HDFS-14015.003.patch, HDFS-14015.004.patch, HDFS-14015.005.patch, 
> HDFS-14015.006.patch, HDFS-14015.007.patch, HDFS-14015.008.patch, 
> HDFS-14015.009.patch, HDFS-14015.010.patch, HDFS-14015.011.patch
>
>
> In the hdfsThreadDestructor() function, we ignore the return value from the 
> DetachCurrentThread() call.  We are seeing cases where a native thread dies 
> while holding a JVM monitor, and it doesn't release the monitor.  We're 
> hoping that logging this error instead of ignoring it will shed some light on 
> the issue.  In any case, it's good programming practice.






[jira] [Commented] (HDFS-14015) Improve error handling in hdfsThreadDestructor in native thread local storage

2018-11-14 Thread Daniel Templeton (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687029#comment-16687029
 ] 

Daniel Templeton commented on HDFS-14015:
-

I just did my own review of my patch and caught some issues which are now 
addressed in patch 11.

> Improve error handling in hdfsThreadDestructor in native thread local storage
> -
>
> Key: HDFS-14015
> URL: https://issues.apache.org/jira/browse/HDFS-14015
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: native
>Affects Versions: 3.0.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Major
> Attachments: HDFS-14015.001.patch, HDFS-14015.002.patch, 
> HDFS-14015.003.patch, HDFS-14015.004.patch, HDFS-14015.005.patch, 
> HDFS-14015.006.patch, HDFS-14015.007.patch, HDFS-14015.008.patch, 
> HDFS-14015.009.patch, HDFS-14015.010.patch, HDFS-14015.011.patch
>
>
> In the hdfsThreadDestructor() function, we ignore the return value from the 
> DetachCurrentThread() call.  We are seeing cases where a native thread dies 
> while holding a JVM monitor, and it doesn't release the monitor.  We're 
> hoping that logging this error instead of ignoring it will shed some light on 
> the issue.  In any case, it's good programming practice.






[jira] [Created] (HDFS-14081) hdfs dfsadmin -metasave metasave_test results NPE

2018-11-14 Thread Shweta (JIRA)
Shweta created HDFS-14081:
-

 Summary: hdfs dfsadmin -metasave metasave_test results NPE
 Key: HDFS-14081
 URL: https://issues.apache.org/jira/browse/HDFS-14081
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs
Affects Versions: 3.2.1
Reporter: Shweta
Assignee: Shweta
 Fix For: 3.2.1


A race condition is encountered while adding a Block to 
postponedMisreplicatedBlocks, which in turn tries to retrieve the Block from 
the BlockManager, where it may not be present. 

This happens in HA: metasave succeeded on the first NN but failed on the 
second NN. The stack trace showing the NPE is as follows:

2018-07-12 21:39:09,783 WARN org.apache.hadoop.ipc.Server: IPC Server handler 24 on 8020, call Call#1 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.metaSave from 172.26.9.163:60234
java.lang.NullPointerException
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseSourceDatanodes(BlockManager.java:2175)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.dumpBlockMeta(BlockManager.java:830)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.metaSave(BlockManager.java:762)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1782)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1766)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.metaSave(NameNodeRpcServer.java:1320)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.metaSave(ClientNamenodeProtocolServerSideTranslatorPB.java:928)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
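
One way to picture the failure and a defensive guard, as a sketch only. The names below loosely echo the description and the stack trace, but the types are simplified stand-ins for the BlockManager structures, and the null check is an assumed mitigation for illustration, not the actual fix for this issue.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MetaSaveSketch {
  // blocksMap stands in for the block map (block id -> stored info);
  // postponed stands in for postponedMisreplicatedBlocks.
  static List<String> dumpPostponed(Map<Long, String> blocksMap,
                                    Set<Long> postponed) {
    List<String> out = new ArrayList<>();
    for (Long id : postponed) {
      String stored = blocksMap.get(id);
      if (stored == null) {
        // The block was postponed but is not (or no longer) in the map.
        // Without this guard, stored.trim() below would throw an NPE.
        out.add("Block " + id + ": not found in block map, skipped");
        continue;
      }
      out.add("Block " + id + ": " + stored.trim());
    }
    return out;
  }

  public static void main(String[] args) {
    Map<Long, String> blocksMap = new HashMap<>();
    blocksMap.put(1L, "replicas=3");
    Set<Long> postponed = new LinkedHashSet<>();
    postponed.add(1L);
    postponed.add(2L); // present in the postponed set only
    for (String line : dumpPostponed(blocksMap, postponed)) {
      System.out.println(line);
    }
  }
}
```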








[jira] [Commented] (HDFS-14067) Allow manual failover between standby and observer

2018-11-14 Thread Chao Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687063#comment-16687063
 ] 

Chao Sun commented on HDFS-14067:
-

Thanks [~zero45]. Yes, let's re-purpose this JIRA to allow manual transition 
with the `--forcemanual` flag. Will post a patch for this soon.

> Allow manual failover between standby and observer
> --
>
> Key: HDFS-14067
> URL: https://issues.apache.org/jira/browse/HDFS-14067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-14067-HDFS-12943.000.patch
>
>
> Currently if automatic failover is enabled in a HA environment, transition 
> from standby to observer would be blocked:
> {code}
> [hdfs@*** hadoop-3.3.0-SNAPSHOT]$ bin/hdfs haadmin -transitionToObserver ha2
> Automatic failover is enabled for NameNode at 
> Refusing to manually manage HA state, since it may cause
> a split-brain scenario or other incorrect state.
> If you are very sure you know what you are doing, please
> specify the --forcemanual flag.
> {code}
> We should allow manual transition between standby and observer in this case.






[jira] [Commented] (HDDS-813) [JDK11] mvn javadoc:javadoc -Phdds fails

2018-11-14 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687065#comment-16687065
 ] 

Anu Engineer commented on HDDS-813:
---

I know that Jenkins says javadoc passed. However, when I run mvn 
javadoc:javadoc -Phdds I get lots of errors. 

> [JDK11] mvn javadoc:javadoc -Phdds fails
> 
>
> Key: HDDS-813
> URL: https://issues.apache.org/jira/browse/HDDS-813
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Akira Ajisaka
>Assignee: Dinesh Chitlangia
>Priority: Major
>  Labels: javadoc
> Attachments: HDDS-813.001.patch
>
>
> {{mvn javadoc:javadoc -Phdds}} fails on Java 11
> {noformat}
> [ERROR] 
> /Users/aajisaka/git/hadoop/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/client/ScmClient.java:107:
>  error: bad use of '>'
> [ERROR]* @param count count must be > 0.
> [ERROR] 
> /Users/aajisaka/git/hadoop/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/protocol/LocatedContainer.java:85:
>  error: unknown tag: DatanodeInfo
> [ERROR]   * @return Set<DatanodeInfo> nodes that currently host the container
> [ERROR] 
> /Users/aajisaka/git/hadoop/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/protocol/ScmLocatedBlock.java:71:
>  error: unknown tag: DatanodeInfo
> [ERROR]   * @return List<DatanodeInfo> nodes that currently host the block
> [ERROR] 
> /Users/aajisaka/git/hadoop/hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/audit/Auditable.java:28:
>  error: malformed HTML
> [ERROR]   * @return Map with values to be logged in audit.
> [ERROR]                 ^
> [ERROR] 
> /Users/aajisaka/git/hadoop/hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/audit/Auditable.java:28:
>  error: bad use of '>'
> [ERROR]   * @return Map with values to be logged in audit.
> {noformat}
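
For illustration, these errors are the kind doclint reports for a bare `>` or an unescaped generic type in a Javadoc comment. The sketch below shows two common ways to write such comments so that they are accepted: `{@literal ...}` for a comparison operator and `{@code ...}` for a generic type. The class and method names are hypothetical and are not taken from the attached patch.

```java
import java.util.Arrays;
import java.util.List;

public class JavadocSketch {
  /**
   * Returns the first {@code count} node names.
   *
   * @param count count must be {@literal >} 0.
   * @return {@code List<String>} of node names, at most {@code count} long.
   */
  static List<String> firstNames(int count) {
    if (count <= 0) {
      throw new IllegalArgumentException("count must be > 0");
    }
    List<String> all = Arrays.asList("dn1", "dn2", "dn3");
    return all.subList(0, Math.min(count, all.size()));
  }

  public static void main(String[] args) {
    System.out.println(firstNames(2));
  }
}
```

Writing `&gt;` and `&lt;` is an equivalent alternative to `{@literal}` when only a single character needs escaping.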






[jira] [Commented] (HDFS-14067) Allow manual failover between standby and observer

2018-11-14 Thread Konstantin Shvachko (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687067#comment-16687067
 ] 

Konstantin Shvachko commented on HDFS-14067:


Looks like we need either 
# to permit {{-forcemanual}} option for {{-transitionToObserver}} command, or
# always call {{transitionToObserver()}} with {{FORCEMANUAL = true}}, since 
there is no other way to make the transition work.
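
The two options can be sketched roughly as follows. This is a simplified model, not HAAdmin itself: the guard mirrors the refusal message quoted in the issue description, but the class, method names, and signatures are assumptions of the sketch.

```java
public class ObserverTransitionSketch {
  // Models the guard: a manual transition proceeds only if auto-failover
  // is disabled or the caller forces manual control.
  static boolean manualTransitionAllowed(boolean autoFailoverEnabled,
                                         boolean forceManual) {
    return !autoFailoverEnabled || forceManual;
  }

  // Option 1: honor the flag as given on the command line.
  static boolean optionOne(boolean autoFailoverEnabled, boolean flagGiven) {
    return manualTransitionAllowed(autoFailoverEnabled, flagGiven);
  }

  // Option 2: always treat the observer transition as forced.
  static boolean optionTwo(boolean autoFailoverEnabled) {
    return manualTransitionAllowed(autoFailoverEnabled, true);
  }

  public static void main(String[] args) {
    System.out.println("option 1, no flag:   " + optionOne(true, false));
    System.out.println("option 1, with flag: " + optionOne(true, true));
    System.out.println("option 2:            " + optionTwo(true));
  }
}
```

Option 1 keeps the operator's explicit acknowledgement in the loop, while option 2 removes a step that, per the comment above, has no alternative anyway.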

> Allow manual failover between standby and observer
> --
>
> Key: HDFS-14067
> URL: https://issues.apache.org/jira/browse/HDFS-14067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-14067-HDFS-12943.000.patch
>
>
> Currently if automatic failover is enabled in a HA environment, transition 
> from standby to observer would be blocked:
> {code}
> [hdfs@*** hadoop-3.3.0-SNAPSHOT]$ bin/hdfs haadmin -transitionToObserver ha2
> Automatic failover is enabled for NameNode at 
> Refusing to manually manage HA state, since it may cause
> a split-brain scenario or other incorrect state.
> If you are very sure you know what you are doing, please
> specify the --forcemanual flag.
> {code}
> We should allow manual transition between standby and observer in this case.






[jira] [Updated] (HDDS-832) Docs folder is missing from the Ozone distribution package

2018-11-14 Thread Anu Engineer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-832:
--
   Resolution: Fixed
Fix Version/s: 0.3.0
   Status: Resolved  (was: Patch Available)

Thanks for the fix. Resolving this now.

> Docs folder is missing from the Ozone distribution package
> --
>
> Key: HDDS-832
> URL: https://issues.apache.org/jira/browse/HDDS-832
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Blocker
> Fix For: 0.3.0
>
> Attachments: HDDS-832-ozone-0.3.001.patch
>
>
> After the 0.2.1 release, the creation of the dist package (together with the 
> classpath generation) was changed. 
> Problems: 
> 1. /docs folder is missing from the dist package
> 2. /docs is missing from the scm/om ui






[jira] [Commented] (HDDS-539) ozone datanode ignores the invalid options

2018-11-14 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687069#comment-16687069
 ] 

Anu Engineer commented on HDDS-539:
---

Hi, you might need to rebase the patch. Thanks. If you need help please let me 
know.

> ozone datanode ignores the invalid options
> --
>
> Key: HDDS-539
> URL: https://issues.apache.org/jira/browse/HDDS-539
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Namit Maheshwari
>Assignee: Vinicius Higa Murakami
>Priority: Major
>  Labels: newbie
> Attachments: HDDS-539.003.patch, HDDS-539.patch
>
>
> The ozone datanode command starts the datanode and ignores any invalid 
> option, apart from help:
> {code:java}
> [root@ctr-e138-1518143905142-481027-01-02 bin]# ./ozone datanode -help
> Starts HDDS Datanode
> {code}
> For all other invalid options, it ignores the option and starts the DN, as 
> shown below:
> {code:java}
> [root@ctr-e138-1518143905142-481027-01-02 bin]# ./ozone datanode -ABC
> 2018-09-22 00:59:34,462 [main] INFO - STARTUP_MSG:
> /
> STARTUP_MSG: Starting HddsDatanodeService
> STARTUP_MSG: host = 
> ctr-e138-1518143905142-481027-01-02.hwx.site/172.27.54.20
> STARTUP_MSG: args = [-ABC]
> STARTUP_MSG: version = 3.2.0-SNAPSHOT
> STARTUP_MSG: classpath = 
> /root/ozone-0.3.0-SNAPSHOT/etc/hadoop:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/commons-cli-1.2.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/kerb-crypto-1.0.1.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/guava-11.0.2.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/hadoop-auth-3.2.0-SNAPSHOT.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/jcip-annotations-1.0-1.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/jsr305-3.0.0.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/commons-compress-1.4.1.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/commons-beanutils-1.9.3.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/commons-collections-3.2.2.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/jsp-api-2.1.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/kerb-simplekdc-1.0.1.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/htrace-core4-4.1.0-incubating.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/zookeeper-3.4.9.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/gson-2.2.4.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/token-provider-1.0.1.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/dnsjava-2.1.7.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/avro-1.7.7.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/jsr311-api-1.1.1.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/jersey-json-1.19.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/stax2-api-3.1.4.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/log4j-1.2.17.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/accessors-smart-1.2.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/commons-lang3-3.7.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/
lib/jersey-server-1.19.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/netty-3.10.5.Final.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/snappy-java-1.0.5.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/kerby-config-1.0.1.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/kerby-util-1.0.1.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/httpclient-4.5.2.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/jetty-security-9.3.19.v20170502.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/hadoop-annotations-3.2.0-SNAPSHOT.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/re2j-1.1.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/jackson-databind-2.9.5.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/commons-math3-3.1.1.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/commons-logging-1.1.3.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/jersey-core-1.19.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/kerb-client-1.0.1.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/jsch-0.1.54.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/jersey-servlet-1.19.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/asm-5.0.4.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/jackson-core-2.9.5.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/jetty-util-9.3.19.v20170502.jar:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar:/

[jira] [Updated] (HDDS-836) Create Ozone identifier for delegation token and block token

2018-11-14 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-836:

Attachment: HDDS-836-HDDS-4.04.patch

> Create Ozone identifier for delegation token and block token
> 
>
> Key: HDDS-836
> URL: https://issues.apache.org/jira/browse/HDDS-836
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-836-HDDS-4.00.patch, HDDS-836-HDDS-4.01.patch, 
> HDDS-836-HDDS-4.02.patch, HDDS-836-HDDS-4.03.patch, HDDS-836-HDDS-4.04.patch
>
>
> Create Ozone identifier for delegation token and block token.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-836) Create Ozone identifier for delegation token and block token

2018-11-14 Thread Ajay Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687084#comment-16687084
 ] 

Ajay Kumar commented on HDDS-836:
-

Addressed in new patch.

> Create Ozone identifier for delegation token and block token
> 
>
> Key: HDDS-836
> URL: https://issues.apache.org/jira/browse/HDDS-836
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-836-HDDS-4.00.patch, HDDS-836-HDDS-4.01.patch, 
> HDDS-836-HDDS-4.02.patch, HDDS-836-HDDS-4.03.patch, HDDS-836-HDDS-4.04.patch
>
>
> Create Ozone identifier for delegation token and block token.






[jira] [Commented] (HDDS-813) [JDK11] mvn javadoc:javadoc -Phdds fails

2018-11-14 Thread Dinesh Chitlangia (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687088#comment-16687088
 ] 

Dinesh Chitlangia commented on HDDS-813:


[~anu] - Do you mind sharing the error.log?

 

> [JDK11] mvn javadoc:javadoc -Phdds fails
> 
>
> Key: HDDS-813
> URL: https://issues.apache.org/jira/browse/HDDS-813
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Akira Ajisaka
>Assignee: Dinesh Chitlangia
>Priority: Major
>  Labels: javadoc
> Attachments: HDDS-813.001.patch
>
>
> {{mvn javadoc:javadoc -Phdds}} fails on Java 11
> {noformat}
> [ERROR] 
> /Users/aajisaka/git/hadoop/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/client/ScmClient.java:107:
>  error: bad use of '>'
> [ERROR]* @param count count must be > 0.
> [ERROR] 
> /Users/aajisaka/git/hadoop/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/protocol/LocatedContainer.java:85:
>  error: unknown tag: DatanodeInfo
> [ERROR]   * @return Set<DatanodeInfo> nodes that currently host the container
> [ERROR] 
> /Users/aajisaka/git/hadoop/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/protocol/ScmLocatedBlock.java:71:
>  error: unknown tag: DatanodeInfo
> [ERROR]   * @return List<DatanodeInfo> nodes that currently host the block
> [ERROR] 
> /Users/aajisaka/git/hadoop/hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/audit/Auditable.java:28:
>  error: malformed HTML
> [ERROR]   * @return Map with values to be logged in audit.
> [ERROR]                 ^
> [ERROR] 
> /Users/aajisaka/git/hadoop/hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/audit/Auditable.java:28:
>  error: bad use of '>'
> [ERROR]   * @return Map with values to be logged in audit.
> {noformat}
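
The "bad use of '>'" and "unknown tag" errors above come from raw angle
brackets in doc comments, which the stricter doclint treats as HTML. A minimal
sketch of one common way to write such comments so that they lint cleanly is
shown below. The class and method names are hypothetical and are not taken
from the HDDS sources, and the attached patch may well fix the comments
differently.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in class; not part of the Hadoop code base. */
final class JavadocLintExample {

  /**
   * Returns at most {@code count} of the given node names.
   *
   * @param nodes node names to pick from
   * @param count count must be {@literal >} 0.
   * @return {@code List<String>} of nodes that currently host the block
   */
  static List<String> listNodes(List<String> nodes, int count) {
    if (count <= 0) {
      throw new IllegalArgumentException("count must be > 0");
    }
    // Wrapping the generic type in {@code ...} and the comparison sign in
    // {@literal ...} keeps doclint from parsing '<' and '>' as HTML.
    return Collections.unmodifiableList(
        nodes.subList(0, Math.min(count, nodes.size())));
  }
}
```

Writing the sign as the HTML entity for "greater than" instead of using
{@literal} should work as well; the inline tags are simply easier to read in
the source.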






[jira] [Updated] (HDDS-813) [JDK11] mvn javadoc:javadoc -Phdds fails

2018-11-14 Thread Anu Engineer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-813:
--
Attachment: HDDS-813.javadoc.output

> [JDK11] mvn javadoc:javadoc -Phdds fails
> 
>
> Key: HDDS-813
> URL: https://issues.apache.org/jira/browse/HDDS-813
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Akira Ajisaka
>Assignee: Dinesh Chitlangia
>Priority: Major
>  Labels: javadoc
> Attachments: HDDS-813.001.patch, HDDS-813.javadoc.output
>
>
> {{mvn javadoc:javadoc -Phdds}} fails on Java 11
> {noformat}
> [ERROR] 
> /Users/aajisaka/git/hadoop/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/client/ScmClient.java:107:
>  error: bad use of '>'
> [ERROR]* @param count count must be > 0.
> [ERROR] 
> /Users/aajisaka/git/hadoop/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/protocol/LocatedContainer.java:85:
>  error: unknown tag: DatanodeInfo
> [ERROR]   * @return Set<DatanodeInfo> nodes that currently host the container
> [ERROR] 
> /Users/aajisaka/git/hadoop/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/protocol/ScmLocatedBlock.java:71:
>  error: unknown tag: DatanodeInfo
> [ERROR]   * @return List<DatanodeInfo> nodes that currently host the block
> [ERROR] 
> /Users/aajisaka/git/hadoop/hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/audit/Auditable.java:28:
>  error: malformed HTML
> [ERROR]   * @return Map with values to be logged in audit.
> [ERROR]                 ^
> [ERROR] 
> /Users/aajisaka/git/hadoop/hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/audit/Auditable.java:28:
>  error: bad use of '>'
> [ERROR]   * @return Map with values to be logged in audit.
> {noformat}






[jira] [Commented] (HDFS-14059) Test reads from standby on a secure cluster with Configured failover

2018-11-14 Thread Plamen Jeliazkov (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687093#comment-16687093
 ] 

Plamen Jeliazkov commented on HDFS-14059:
-

New finding. Let's call it (4):

With `dfs.ha.automatic-failover.enabled=true` still set, I notice that when I 
manually transition a Standby that has a co-located ZKFC to Observer, the ZKFC 
automatically tries to convert the Observer back to Standby. The logs end up 
looking like this:
{code}
2018-11-14 12:29:00,466 ERROR org.apache.hadoop.ha.ZKFailoverController: Local service NameNode at instance-3.pp-devcos-myhadoop.us-central1.gcp.dev.paypalinc.com/10.176.1.207:8030 has changed the serviceState to observer. Expected was standby. Quitting election marking fencing necessary.
2018-11-14 12:29:00,466 INFO org.apache.hadoop.ha.ActiveStandbyElector: Yielding from election
2018-11-14 12:29:00,468 INFO org.apache.zookeeper.ZooKeeper: Session: 0x1000acb2b350012 closed
2018-11-14 12:29:00,468 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down for session: 0x1000acb2b350012
2018-11-14 12:29:01,469 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=instance-3.pp-devcos-myhadoop.us-central1.gcp.dev.paypalinc.com:2181 sessionTimeout=1 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@2992f4e4
2018-11-14 12:29:01,471 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server instance-3.pp-devcos-myhadoop.us-central1.gcp.dev.paypalinc.com/10.176.1.207:2181. Will not attempt to authenticate using SASL (unknown error)
2018-11-14 12:29:01,471 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to instance-3.pp-devcos-myhadoop.us-central1.gcp.dev.paypalinc.com/10.176.1.207:2181, initiating session
2018-11-14 12:29:01,474 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server instance-3.pp-devcos-myhadoop.us-central1.gcp.dev.paypalinc.com/10.176.1.207:2181, sessionid = 0x1000acb2b350013, negotiated timeout = 1
2018-11-14 12:29:01,475 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session connected.
2018-11-14 12:29:01,479 INFO org.apache.hadoop.ha.ZKFailoverController: ZK Election indicated that NameNode at instance-3.pp-devcos-myhadoop.us-central1.gcp.dev.paypalinc.com/10.176.1.207:8030 should become standby
2018-11-14 12:29:01,503 INFO org.apache.hadoop.ha.ZKFailoverController: Successfully transitioned NameNode at instance-3.pp-devcos-myhadoop.us-central1.gcp.dev.paypalinc.com/10.176.1.207:8030 to standby state
{code}

With the ZKFC on the Standby killed, I am able to transition it to Observer and 
able to create directories, files, and then list status, cat, etc., as usual.

It seems we need to decide whether to support automatic failover, which means 
digging into ZKFC and possibly the ZK states, or to not support automatic 
failover but still support ConfiguredFailoverProxyProvider.
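
The sequence in the log can be summarized with a small toy model, sketched
below. It only mirrors what the log lines above show and is not the actual
ZKFailoverController code; the class, enum and method names are hypothetical.
It also assumes that another NameNode already holds the active lock, so
rejoining the election ends in standby.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the behavior seen in the log; not real ZKFC code. */
final class ZkfcObserverModel {

  enum HaState { ACTIVE, STANDBY, OBSERVER }

  /**
   * Returns the state the NameNode ends up in after the ZKFC compares the
   * state it expects with the state the NameNode reports.
   */
  static HaState reconcile(HaState expected, HaState reported,
      List<String> events) {
    if (reported == expected) {
      return reported; // nothing to do
    }
    // Mismatch: the ZKFC quits the election, then rejoins it.
    events.add("serviceState changed to " + reported
        + ", expected " + expected + ": quitting election");
    events.add("rejoining election");
    // Assumption: another node is active, so the election result is standby.
    events.add("transitioned to " + HaState.STANDBY);
    return HaState.STANDBY;
  }

  public static void main(String[] args) {
    List<String> events = new ArrayList<>();
    HaState result = reconcile(HaState.STANDBY, HaState.OBSERVER, events);
    events.forEach(System.out::println);
    System.out.println("final state: " + result);
  }
}
```

In this model a manual transition to Observer cannot stick while a ZKFC is
watching the node, which matches the observation that killing the ZKFC makes
the transition work.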

> Test reads from standby on a secure cluster with Configured failover
> 
>
> Key: HDFS-14059
> URL: https://issues.apache.org/jira/browse/HDFS-14059
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Konstantin Shvachko
>Assignee: Plamen Jeliazkov
>Priority: Major
>
> Run standard HDFS tests to verify reading from ObserverNode on a secure HA 
> cluster with {{ConfiguredFailoverProxyProvider}}.






[jira] [Commented] (HDFS-14067) Allow manual failover between standby and observer

2018-11-14 Thread Chao Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687092#comment-16687092
 ] 

Chao Sun commented on HDFS-14067:
-

I think option 2 is more or less what we were trying with patch v0, although 
the solution seems to be to hard-code a {{--forcemanual}} flag with the 
{{-transitionToObserver}} command, which is different from the patch.

IMHO we had better go with option 1 for now, since {{-transitionToObserver}} 
currently carries a certain risk: the observer may get elected active later, 
and it is better to make the user aware of this. We can remove the 
{{--forcemanual}} requirement later, once we have figured out HDFS-13182.
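
The rule that option 1 keeps in place can be sketched as a single guard, shown
below. This is only an illustration under stated assumptions and not the
HAAdmin implementation: the class and method names are hypothetical, and the
automatic-failover setting is hard-coded instead of being read from the
configuration. The refusal text is the one quoted in the issue description.

```java
import java.util.Arrays;

/** Hypothetical sketch of the --forcemanual guard; not HAAdmin code. */
final class ManualFailoverGuard {

  static final String REFUSAL =
      "Refusing to manually manage HA state, since it may cause\n"
      + "a split-brain scenario or other incorrect state.\n"
      + "If you are very sure you know what you are doing, please\n"
      + "specify the --forcemanual flag.";

  /** A manual transition is allowed unless automatic failover blocks it. */
  static boolean mayTransitionManually(boolean autoFailoverEnabled,
      boolean forceManual) {
    return !autoFailoverEnabled || forceManual;
  }

  public static void main(String[] args) {
    boolean forceManual = Arrays.asList(args).contains("--forcemanual");
    boolean autoFailoverEnabled = true; // assumption for this sketch
    if (mayTransitionManually(autoFailoverEnabled, forceManual)) {
      System.out.println("Proceeding with manual transition");
    } else {
      System.err.println(REFUSAL);
    }
  }
}
```

With this guard the user has to pass the flag explicitly, so the risk of the
observer later being elected active stays visible at the command line.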

> Allow manual failover between standby and observer
> --
>
> Key: HDFS-14067
> URL: https://issues.apache.org/jira/browse/HDFS-14067
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-14067-HDFS-12943.000.patch
>
>
> Currently if automatic failover is enabled in a HA environment, transition 
> from standby to observer would be blocked:
> {code}
> [hdfs@*** hadoop-3.3.0-SNAPSHOT]$ bin/hdfs haadmin -transitionToObserver ha2
> Automatic failover is enabled for NameNode at 
> Refusing to manually manage HA state, since it may cause
> a split-brain scenario or other incorrect state.
> If you are very sure you know what you are doing, please
> specify the --forcemanual flag.
> {code}
> We should allow manual transition between standby and observer in this case.






[jira] [Created] (HDDS-838) Basic operations like create volume, freon are not working

2018-11-14 Thread Dinesh Chitlangia (JIRA)
Dinesh Chitlangia created HDDS-838:
--

 Summary: Basic operations like create volume, freon are not working
 Key: HDDS-838
 URL: https://issues.apache.org/jira/browse/HDDS-838
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Client, Ozone Manager
Reporter: Dinesh Chitlangia


After pulling the latest from trunk, simple operations such as create volume 
and freon rk fail with the following exception:

 
{code:java}
MYBOX:ozone-0.4.0-SNAPSHOT dchitlangia$ bin/ozone sh volume create /test
2018-11-14 15:30:59,918 [main] ERROR - Couldn't create protocol class org.apache.hadoop.ozone.client.rpc.RpcClient exception:
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:291)
at org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:169)
at org.apache.hadoop.ozone.web.ozShell.OzoneAddress.createClient(OzoneAddress.java:111)
at org.apache.hadoop.ozone.web.ozShell.volume.CreateVolumeHandler.call(CreateVolumeHandler.java:70)
at org.apache.hadoop.ozone.web.ozShell.volume.CreateVolumeHandler.call(CreateVolumeHandler.java:38)
at picocli.CommandLine.execute(CommandLine.java:919)
at picocli.CommandLine.access$700(CommandLine.java:104)
at picocli.CommandLine$RunLast.handle(CommandLine.java:1083)
at picocli.CommandLine$RunLast.handle(CommandLine.java:1051)
at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:959)
at picocli.CommandLine.parseWithHandlers(CommandLine.java:1242)
at picocli.CommandLine.parseWithHandler(CommandLine.java:1181)
at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:61)
at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:52)
at org.apache.hadoop.ozone.web.ozShell.Shell.main(Shell.java:80)
Caused by: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException
at org.apache.hadoop.ozone.om.OzoneManager.getServiceList(OzoneManager.java:1118)
at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.getServiceList(OzoneManagerProtocolServerSideTranslatorPB.java:580)
at org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java:39227)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)

at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511)
at org.apache.hadoop.ipc.Client.call(Client.java:1457)
at org.apache.hadoop.ipc.Client.call(Client.java:1367)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy10.getServiceList(Unknown Source)
at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.getServiceList(OzoneManagerProtocolClientSideTranslatorPB.java:766)
at org.apache.hadoop.ozone.client.rpc.RpcClient.getScmAddressForClient(RpcClient.java:169)
at org.apache.hadoop.ozone.client.rpc.RpcClient.<init>(RpcClient.java:130)
... 19 more
java.lang.NullPointerException
{code}
Also verified using _jps_ that SCM, Datanode & OM are up and running.

 





