[jira] [Commented] (HDDS-2607) DeadNodeHandler should not remove replica for a dead maintenance node

2019-11-29 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985106#comment-16985106
 ] 

Stephen O'Donnell commented on HDDS-2607:
-

The NodeStateManager is responsible for firing a "dead node" event, but it 
currently only does this if the node is "IN_SERVICE". It will not do it if it 
is DECOMMISSIONING, DECOMMISSIONED, ENTERING_MAINTENANCE or IN_MAINTENANCE.

As part of this Jira we need to fix this, as the only time a dead node should 
not have the dead node event fired is when it is IN_MAINTENANCE. At all other 
times, the dead node event should clear the node's container replicas as usual. 

It is also important that the DatanodeAdminMonitor aborts its workflow for any 
node which goes dead while maintenance is in progress (unless it has already 
reached IN_MAINTENANCE), for several reasons:

1. The dead node event will delete all the container replicas for the node, so 
it is impossible to track them correctly for replication.
2. This could result in a node which has not completed decom / maintenance 
getting marked as completed.
3. If the node returns to service, the state of the cluster may have changed 
and new pipelines should be created etc, meaning the admin workflow needs to 
restart.

In this Jira, we should therefore consider:

1. Resetting the node's OperationalState to "IN_SERVICE" as part of the dead 
node handling.
2. Ensuring the dead node event gets triggered for all operational states 
except IN_MAINTENANCE.
3. Aborting the maintenance workflow if the health of any node becomes "DEAD".
4. How to trigger a dead node event for a node which is dead and was 
IN_MAINTENANCE, once maintenance has ended either automatically or manually.
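
The "fire unless IN_MAINTENANCE" rule in points 2 and 3 could be sketched as 
follows (hypothetical enum and method names for illustration only, not the 
actual NodeStateManager code):

```java
// Sketch only: a dead node should fire the dead node event in every
// operational state except IN_MAINTENANCE, which keeps its replicas.
// The enum mirrors the states named above; the real SCM classes differ.
enum OperationalState {
  IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
  ENTERING_MAINTENANCE, IN_MAINTENANCE
}

class DeadNodeRuleSketch {
  static boolean shouldFireDeadNodeEvent(OperationalState state) {
    // Only a node that has fully entered maintenance skips the event.
    return state != OperationalState.IN_MAINTENANCE;
  }
}
```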

> DeadNodeHandler should not remove replica for a dead maintenance node
> -
>
> Key: HDDS-2607
> URL: https://issues.apache.org/jira/browse/HDDS-2607
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Affects Versions: 0.5.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>
> Normally, when a node goes dead, the DeadNodeHandler removes all the 
> containers and replicas associated with the node from the ContainerManager.
> If a node is IN_MAINTENANCE and goes dead, we do not want to remove its 
> replicas. They should remain present in the system to prevent the containers 
> being marked as under-replicated.
> We also need to consider the case where the node is dead and then 
> maintenance expires automatically. In that case, the replicas associated with 
> the node must be removed and the affected containers will become 
> under-replicated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Moved] (HDDS-2652) BlockOutputStreamEntryPool.java:allocateNewBlock spams the log with ExcludeList information

2019-11-29 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh moved RATIS-764 to HDDS-2652:
---

Component/s: (was: client)
 Ozone Client
Key: HDDS-2652  (was: RATIS-764)
   Workflow: patch-available, re-open possible  (was: no-reopen-closed, 
patch-avail)
Project: Hadoop Distributed Data Store  (was: Ratis)

> BlockOutputStreamEntryPool.java:allocateNewBlock spams the log with 
> ExcludeList information
> ---
>
> Key: HDDS-2652
> URL: https://issues.apache.org/jira/browse/HDDS-2652
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Reporter: Mukul Kumar Singh
>Priority: Major
>  Labels: MiniOzoneChaosCluster, newbie
>
> BlockOutputStreamEntryPool.java:allocateNewBlock spams the log with 
> ExcludeList information with the following log lines
> {code}
> 2019-11-29 20:22:51,590 [pool-244-thread-9] INFO  
> io.BlockOutputStreamEntryPool 
> (BlockOutputStreamEntryPool.java:allocateNewBlock(257)) - Allocating block 
> with ExcludeList {datanodes = [], containerIds = [], pipelineIds = []}
> {code}






[GitHub] [hadoop-ozone] elek commented on a change in pull request #282: HDDS-2646. Start acceptance tests only if at least one THREE pipeline is available

2019-11-29 Thread GitBox
elek commented on a change in pull request #282: HDDS-2646. Start acceptance 
tests only if at least one THREE pipeline is available
URL: https://github.com/apache/hadoop-ozone/pull/282#discussion_r352132216
 
 

 ##
 File path: hadoop-ozone/dist/src/main/compose/ozone/docker-config
 ##
 @@ -25,6 +25,7 @@ OZONE-SITE.XML_ozone.scm.client.address=scm
 OZONE-SITE.XML_ozone.replication=3
 OZONE-SITE.XML_hdds.datanode.dir=/data/hdds
 OZONE-SITE.XML_hdds.profiler.endpoint.enabled=true
+OZONE-SITE.XML_hdds.scm.safemode.min.datanode=3
 
 Review comment:
   Yes, this is the problem I tried to describe in #238. I am fine with the 
suggested approach, but it makes the definition more complex. 
   
   What I am thinking is to create a `simple` compose definition which can work 
with one datanode (and there we can adjust the replication factor and the s3 
storage type as well).
   
   Since almost all the tested functionality requires datanode=3, it seems to 
be enough to have one cluster which can work with one datanode...


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #272: HDDS-2629. Ozone CLI: CreationTime/modifyTime of volume/bucket/key in…

2019-11-29 Thread GitBox
adoroszlai commented on a change in pull request #272: HDDS-2629. Ozone CLI: 
CreationTime/modifyTime of volume/bucket/key in…
URL: https://github.com/apache/hadoop-ozone/pull/272#discussion_r351807634
 
 

 ##
 File path: hadoop-hdds/common/pom.xml
 ##
 @@ -147,6 +147,11 @@ https://maven.apache.org/xsd/maven-4.0.0.xsd">
       <artifactId>snakeyaml</artifactId>
       <version>1.16</version>
     </dependency>
+    <dependency>
+      <groupId>com.fasterxml.jackson.datatype</groupId>
+      <artifactId>jackson-datatype-jsr310</artifactId>
+      <version>2.9.8</version>
+    </dependency>
 
 Review comment:
   Can it use `${jackson2.version}` (currently 2.9.9) defined in root `pom.xml`?





[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #272: HDDS-2629. Ozone CLI: CreationTime/modifyTime of volume/bucket/key in…

2019-11-29 Thread GitBox
adoroszlai commented on a change in pull request #272: HDDS-2629. Ozone CLI: 
CreationTime/modifyTime of volume/bucket/key in…
URL: https://github.com/apache/hadoop-ozone/pull/272#discussion_r352098332
 
 

 ##
 File path: 
hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/web/ozShell/TestObjectPrinter.java
 ##
 @@ -40,11 +40,13 @@ public void printObjectAsJson() throws IOException {
 OzoneConfiguration conf = new OzoneConfiguration();
 OzoneVolume volume =
 new OzoneVolume(conf, Mockito.mock(ClientProtocol.class), "name",
-"admin", "owner", 1L, 0L,
+"admin", "owner", 1L, Instant.ofEpochMilli(0).toEpochMilli(),
 
 Review comment:
   This seems a complex way of writing `0`.  If the intention is to avoid 
hard-coding the value, I think this one would be simpler:
   
   ```
   Instant.EPOCH.toEpochMilli()
   ```





[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #272: HDDS-2629. Ozone CLI: CreationTime/modifyTime of volume/bucket/key in…

2019-11-29 Thread GitBox
adoroszlai commented on a change in pull request #272: HDDS-2629. Ozone CLI: 
CreationTime/modifyTime of volume/bucket/key in…
URL: https://github.com/apache/hadoop-ozone/pull/272#discussion_r352096478
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOzoneManagerHA.java
 ##
 @@ -228,7 +228,8 @@ public void testAllBucketOperations() throws Exception {
 Assert.assertEquals(bucketName, ozoneBucket.getName());
 Assert.assertTrue(ozoneBucket.getVersioning());
 Assert.assertEquals(StorageType.DISK, ozoneBucket.getStorageType());
-Assert.assertTrue(ozoneBucket.getCreationTime() <= Time.now());
+Assert.assertTrue(ozoneBucket.getCreationTime().compareTo(
+Instant.now()) <= 0);
 
 Review comment:
   I think using `isAfter`/`isBefore` would be easier to understand at a glance.
   
   ```
   assertFalse(ozoneBucket.getCreationTime().isAfter(Instant.now()))
   ```
   
   (I hope I got the conversion right).
   
   Similarly in all other places where `compareTo()` is used.
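
   For what it's worth, the two forms are equivalent for this check; a 
standalone sketch (not project code):

   ```java
   import java.time.Instant;

   // Standalone illustration: a creation time taken at or before "now"
   // is never after Instant.now(), so the isAfter form and the
   // compareTo form assert the same thing.
   class InstantCheckSketch {
     static boolean notInFuture(Instant creationTime) {
       return !creationTime.isAfter(Instant.now());
     }
   }
   ```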





[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #282: HDDS-2646. Start acceptance tests only if at least one THREE pipeline is available

2019-11-29 Thread GitBox
adoroszlai commented on a change in pull request #282: HDDS-2646. Start 
acceptance tests only if at least one THREE pipeline is available
URL: https://github.com/apache/hadoop-ozone/pull/282#discussion_r352093132
 
 

 ##
 File path: hadoop-ozone/dist/src/main/compose/ozone/docker-config
 ##
 @@ -25,6 +25,7 @@ OZONE-SITE.XML_ozone.scm.client.address=scm
 OZONE-SITE.XML_ozone.replication=3
 OZONE-SITE.XML_hdds.datanode.dir=/data/hdds
 OZONE-SITE.XML_hdds.profiler.endpoint.enabled=true
+OZONE-SITE.XML_hdds.scm.safemode.min.datanode=3
 
 Review comment:
   This will make it impossible to use these environments with a single 
datanode without modifying the config locally.
   
   I would like to propose an alternative solution:
   
   1. define this config in the `environment` section in `docker-compose.yaml` 
using a variable that defaults to 1:
  `- 
"OZONE-SITE.XML_hdds.scm.safemode.min.datanode=${SAFEMODE_MIN_DATANODES:-1}"`
   2. set the variable to 3 in `testlib.sh`:
  `export SAFEMODE_MIN_DATANODES=3`





[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #283: HDDS-2651 Make startId parameter non-mandatory while listing containe…

2019-11-29 Thread GitBox
adoroszlai commented on a change in pull request #283: HDDS-2651 Make startId 
parameter non-mandatory while listing containe…
URL: https://github.com/apache/hadoop-ozone/pull/283#discussion_r352084455
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/SCMContainerManager.java
 ##
 @@ -217,7 +217,7 @@ public ContainerInfo getContainer(final ContainerID containerID)
   Collections.sort(containersIds);
 
   return containersIds.stream()
-  .filter(id -> id.getId() > startId)
+  .filter(id -> id.getId() >= startId)
 
 Review comment:
   The interface doc says `startId` is exclusive:
   
   
https://github.com/apache/hadoop-ozone/blob/cd3f8c9f8316ae742ffeb497ab77581a4519173c/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerManager.java#L77-L92
   I don't know how "backward compatible" the interface needs to be at this 
stage, but I think it would be better to change the default `startId` to 0 in 
`scmcli` instead.  Otherwise, please update the javadoc, too.
   
   CC @anuengineer
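
   The exclusive vs inclusive difference on a sorted id list, as a standalone 
sketch (hypothetical helper, not the SCMContainerManager code):

   ```java
   import java.util.List;
   import java.util.stream.Collectors;

   // Standalone sketch: the PR changes ">" to ">=", turning an
   // exclusive startId into an inclusive one.
   class StartIdSketch {
     static List<Long> list(List<Long> sortedIds, long startId,
         boolean inclusive) {
       return sortedIds.stream()
           .filter(id -> inclusive ? id >= startId : id > startId)
           .collect(Collectors.toList());
     }
   }
   ```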





[jira] [Updated] (HDDS-2651) Make startId parameter non-mandatory while listing containers through shell command

2019-11-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-2651:
-
Labels: pull-request-available  (was: )

> Make startId parameter non-mandatory while listing containers through shell 
> command
> ---
>
> Key: HDDS-2651
> URL: https://issues.apache.org/jira/browse/HDDS-2651
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Nilotpal Nandi
>Assignee: Nilotpal Nandi
>Priority: Minor
>  Labels: pull-request-available
>
> ozone scmcli container list -s=2
> Here the startId parameter ("--start | -s") is mandatory.
> It should be made non-mandatory, as the default value for startId is 
> already defined.






[GitHub] [hadoop-ozone] nilotpalnandi opened a new pull request #283: HDDS-2651 Make startId parameter non-mandatory while listing containe…

2019-11-29 Thread GitBox
nilotpalnandi opened a new pull request #283: HDDS-2651 Make startId parameter 
non-mandatory while listing containe…
URL: https://github.com/apache/hadoop-ozone/pull/283
 
 
   …rs through shell command
   
   ## What changes were proposed in this pull request?
   
   Currently the startId parameter ("--start | -s") is mandatory for the 
container list shell command.
   
   It should be made non-mandatory, as the default value for the startId 
parameter is already defined.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-2651
   
   ## How was this patch tested?
   
   Applied the patch and rebuilt ozone and then tested it by creating docker 
cluster using docker-compose
   





[jira] [Created] (HDDS-2651) Make startId parameter non-mandatory while listing containers through shell command

2019-11-29 Thread Nilotpal Nandi (Jira)
Nilotpal Nandi created HDDS-2651:


 Summary: Make startId parameter non-mandatory while listing 
containers through shell command
 Key: HDDS-2651
 URL: https://issues.apache.org/jira/browse/HDDS-2651
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Nilotpal Nandi
Assignee: Nilotpal Nandi


ozone scmcli container list -s=2

Here the startId parameter ("--start | -s") is mandatory.

It should be made non-mandatory, as the default value for startId is already 
defined.






[GitHub] [hadoop-ozone] ChenSammi commented on a change in pull request #282: HDDS-2646. Start acceptance tests only if at least one THREE pipeline is available

2019-11-29 Thread GitBox
ChenSammi commented on a change in pull request #282: HDDS-2646. Start 
acceptance tests only if at least one THREE pipeline is available
URL: https://github.com/apache/hadoop-ozone/pull/282#discussion_r352032450
 
 

 ##
 File path: hadoop-ozone/dist/src/main/compose/ozonescripts/docker-config
 ##
 @@ -26,6 +26,8 @@ OZONE-SITE.XML_ozone.scm.client.address=scm
 OZONE-SITE.XML_ozone.replication=1
 OZONE-SITE.XML_hdds.datanode.dir=/data/hdds
 OZONE-SITE.XML_hdds.datanode.plugins=org.apache.hadoop.ozone.web.OzoneHddsDatanodeService
+OZONE-SITE.XML_hdds.scm.safemode.min.datanode=3
 
 Review comment:
   Same as ozone-hdfs/docker-config.





[GitHub] [hadoop-ozone] ChenSammi commented on a change in pull request #282: HDDS-2646. Start acceptance tests only if at least one THREE pipeline is available

2019-11-29 Thread GitBox
ChenSammi commented on a change in pull request #282: HDDS-2646. Start 
acceptance tests only if at least one THREE pipeline is available
URL: https://github.com/apache/hadoop-ozone/pull/282#discussion_r352026041
 
 

 ##
 File path: hadoop-ozone/dist/src/main/compose/ozone-hdfs/docker-config
 ##
 @@ -23,6 +23,7 @@ OZONE-SITE.XML_ozone.metadata.dirs=/data/metadata
 OZONE-SITE.XML_ozone.scm.client.address=scm
 OZONE-SITE.XML_ozone.replication=1
 OZONE-SITE.XML_hdds.datanode.dir=/data/hdds
+OZONE-SITE.XML_hdds.scm.safemode.min.datanode=3
 
 Review comment:
   This docker-compose starts 1 datanode currently and doesn't have test.sh. 
So either we add more datanodes to docker-compose.yaml or we can skip this file.





[GitHub] [hadoop-ozone] ChenSammi commented on a change in pull request #282: HDDS-2646. Start acceptance tests only if at least one THREE pipeline is available

2019-11-29 Thread GitBox
ChenSammi commented on a change in pull request #282: HDDS-2646. Start 
acceptance tests only if at least one THREE pipeline is available
URL: https://github.com/apache/hadoop-ozone/pull/282#discussion_r352029424
 
 

 ##
 File path: hadoop-ozone/dist/src/main/compose/ozone-mr/common-config
 ##
 @@ -22,6 +22,7 @@ OZONE-SITE.XML_ozone.scm.block.client.address=scm
 OZONE-SITE.XML_ozone.metadata.dirs=/data/metadata
 OZONE-SITE.XML_ozone.scm.client.address=scm
 OZONE-SITE.XML_ozone.replication=3
+OZONE-SITE.XML_hdds.scm.safemode.min.datanode=3
 
 Review comment:
   Not very familiar with docker-compose. Where do we tell docker-compose to 
start three datanodes with all these configurations?





[jira] [Created] (HDDS-2650) Fix createPipeline scmcli due to annotation changes in master

2019-11-29 Thread Li Cheng (Jira)
Li Cheng created HDDS-2650:
--

 Summary: Fix createPipeline scmcli due to annotation changes in 
master
 Key: HDDS-2650
 URL: https://issues.apache.org/jira/browse/HDDS-2650
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
Reporter: Li Cheng
Assignee: Li Cheng


Caused by: java.lang.IllegalArgumentException: Can not set 
org.apache.hadoop.hdds.scm.cli.SCMCLI field 
org.apache.hadoop.hdds.scm.cli.pipeline.CreatePipelineSubcommand.parent to 
org.apache.hadoop.hdds.scm.cli.pipeline.PipelineCommands
 at sun.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException(UnsafeFieldAccessorImpl.java:167)
 at sun.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException(UnsafeFieldAccessorImpl.java:171)
 at sun.reflect.UnsafeObjectFieldAccessorImpl.set(UnsafeObjectFieldAccessorImpl.java:81)
 at java.lang.reflect.Field.set(Field.java:764)
 at picocli.CommandLine$Model$CommandReflection.initParentCommand(CommandLine.java:6476)






[jira] [Reopened] (HDDS-1564) Ozone multi-raft support

2019-11-29 Thread Li Cheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Cheng reopened HDDS-1564:


Not sure what happened, but HDDS-1564 is still under implementation.

> Ozone multi-raft support
> 
>
> Key: HDDS-1564
> URL: https://issues.apache.org/jira/browse/HDDS-1564
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>  Components: Ozone Datanode, SCM
>Reporter: Siddharth Wagle
>Assignee: Li Cheng
>Priority: Major
> Attachments: Ozone Multi-Raft Support.pdf
>
>
> Apache Ratis supports multi-raft by allowing the same node to be a part of 
> multiple raft groups. The proposal is to allow datanodes to be a part of 
> multiple raft groups. The attached design doc explains the reasons for doing 
> this as well a few initial design decisions. 
> Some of the work in this feature also related to HDDS-700 which implements 
> rack-aware container placement for closed containers.






[jira] [Resolved] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-29 Thread Li Cheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Cheng resolved HDDS-2356.

Release Note: MPU is steady after changes made by Bharat. 
  Resolution: Fixed

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: 2018-11-15-OM-logs.txt, 2019-11-06_18_13_57_422_ERROR, 
> hs_err_pid9340.log, image-2019-10-31-18-56-56-177.png, 
> om-audit-VM_50_210_centos.log, om_audit_log_plc_1570863541668_9278.txt
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say 
> it's VM0.
> I use goofys as a fuse and enable ozone S3 gateway to mount ozone to a path 
> on VM0, while reading data from VM0 local disk and write to mount path. The 
> dataset has various sizes of files from 0 byte to GB-level and it has a 
> number of ~50,000 files. 
> The writing is slow (1GB for ~10 mins) and it stops after around 4GB. As I 
> look at hadoop-root-om-VM_50_210_centos.out log, I see OM throwing errors 
> related to Multipart upload. This error eventually causes the writing to 
> terminate and OM to be closed. 
>  
> Updated on 11/06/2019:
> See new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR and full logs 
> are in the attachment.
>  2019-11-05 18:12:37,766 ERROR 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest:
>  MultipartUpload Commit is failed for Key:./20191012/plc_1570863541668_9278 in Volume/Bucket 
> s325d55ad283aa400af464c76d713c07ad/ozone-test
> NO_SUCH_MULTIPART_UPLOAD_ERROR 
> org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload 
> is with specified uploadId fcda8608-b431-48b7-8386-0a332f1a709a-103084683261641950
> at 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:156)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.java:217)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132)
> at 
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100)
> at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  
> Updated on 10/28/2019:
> See MISMATCH_MULTIPART_LIST error.
>  
> 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete 
> Multipart Upload Request for bucket: ozone-test, key: 
> 20191012/plc_1570863541668_9278
>  MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: 
> Complete Multipart Upload Failed: volume: 
> s3c89e813c80ffcea9543004d57b2a1239bucket:
>  ozone-testkey: 20191012/plc_1570863541668_9278
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732)
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.completeMultipartUpload(OzoneManagerProtocolClientSideTranslatorPB.java:1104)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:497)
>  at 
> org.apache.hadoop.hdds.tracing.TraceAllMethod.invoke(TraceAllMethod.java:66)
>  at 

[jira] [Assigned] (HDDS-2636) Refresh pipeline information in OzoneManager lookupFile call

2019-11-29 Thread Nanda kumar (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nanda kumar reassigned HDDS-2636:
-

Assignee: Nanda kumar

> Refresh pipeline information in OzoneManager lookupFile call
> 
>
> Key: HDDS-2636
> URL: https://issues.apache.org/jira/browse/HDDS-2636
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
>
> {{lookupFile}} call in OzoneManager doesn't refresh the pipeline information. 
> As a result of this, wrong pipeline information is returned to the client and 
> the client fails eventually.






[GitHub] [hadoop-ozone] lokeshj1703 commented on issue #281: HDDS-2640 Add leaderID information in pipeline list subcommand

2019-11-29 Thread GitBox
lokeshj1703 commented on issue #281: HDDS-2640 Add leaderID information in 
pipeline list subcommand
URL: https://github.com/apache/hadoop-ozone/pull/281#issuecomment-559705110
 
 
   @nilotpalnandi Thanks for working on the PR! I have merged it to master 
branch.





[GitHub] [hadoop-ozone] lokeshj1703 merged pull request #281: HDDS-2640 Add leaderID information in pipeline list subcommand

2019-11-29 Thread GitBox
lokeshj1703 merged pull request #281: HDDS-2640 Add leaderID information in 
pipeline list subcommand
URL: https://github.com/apache/hadoop-ozone/pull/281
 
 
   





[jira] [Moved] (HDDS-2649) TestOzoneManagerHttpServer#testHttpPolicy fails intermittently

2019-11-29 Thread Lokesh Jain (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain moved RATIS-763 to HDDS-2649:
-

 Key: HDDS-2649  (was: RATIS-763)
Workflow: patch-available, re-open possible  (was: no-reopen-closed, 
patch-avail)
 Project: Hadoop Distributed Data Store  (was: Ratis)

> TestOzoneManagerHttpServer#testHttpPolicy fails intermittently
> --
>
> Key: HDDS-2649
> URL: https://issues.apache.org/jira/browse/HDDS-2649
> Project: Hadoop Distributed Data Store
>  Issue Type: Test
>Reporter: Lokesh Jain
>Priority: Major
>
> TestOzoneManagerHttpServer#testHttpPolicy fails with the following exception.
> {code:java}
> [ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.42 
> s <<< FAILURE! - in org.apache.hadoop.ozone.om.TestOzoneManagerHttpServer
> [ERROR] 
> testHttpPolicy[1](org.apache.hadoop.ozone.om.TestOzoneManagerHttpServer)  
> Time elapsed: 0.343 s  <<< FAILURE!
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.hadoop.ozone.om.TestOzoneManagerHttpServer.testHttpPolicy(TestOzoneManagerHttpServer.java:110)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> at org.junit.runners.Suite.runChild(Suite.java:127)
> at org.junit.runners.Suite.runChild(Suite.java:26)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Created] (HDDS-2648) TestOzoneManagerDoubleBufferWithOMResponse

2019-11-29 Thread Marton Elek (Jira)
Marton Elek created HDDS-2648:
-

 Summary: TestOzoneManagerDoubleBufferWithOMResponse
 Key: HDDS-2648
 URL: https://issues.apache.org/jira/browse/HDDS-2648
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Manager
Reporter: Marton Elek


The test is flaky:

 

Example run: [https://github.com/apache/hadoop-ozone/runs/325281277]

 

Failure:
{code:java}
---
Test set: 
org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse
---
Tests run: 3, Failures: 1, Errors: 0, Skipped: 1, Time elapsed: 5.31 s <<< 
FAILURE! - in 
org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse
testDoubleBufferWithMixOfTransactionsParallel(org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse)
  Time elapsed: 0.282 s  <<< FAILURE!
java.lang.AssertionError: expected:<32> but was:<29>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at 
org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.testDoubleBufferWithMixOfTransactionsParallel(TestOzoneManagerDoubleBufferWithOMResponse.java:247)
 {code}
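The `expected:<32> but was:<29>` mismatch is characteristic of a test thread asserting a count before an asynchronous flusher has caught up. A common remedy for this class of flakiness is to poll for the expected value with a deadline instead of asserting immediately. The sketch below is illustrative only: the `waitForCount` helper and the `AtomicLong` counter stand in for whatever the double-buffer test actually tracks, and are not part of the OzoneManagerDoubleBuffer API.

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

public class AwaitCount {

  // Polls the supplied counter until it reaches the expected value,
  // throwing TimeoutException if it does not get there in time.
  static void waitForCount(LongSupplier counter, long expected,
                           long timeoutMillis) throws Exception {
    long deadline = System.currentTimeMillis() + timeoutMillis;
    while (counter.getAsLong() != expected) {
      if (System.currentTimeMillis() > deadline) {
        throw new TimeoutException("expected " + expected
            + " but was " + counter.getAsLong());
      }
      Thread.sleep(50);
    }
  }

  public static void main(String[] args) throws Exception {
    AtomicLong flushed = new AtomicLong();
    // Simulate an asynchronous flusher that finishes slightly after the
    // submitting thread: an immediate assertEquals here could observe,
    // say, 29 of 32 flushes and fail intermittently.
    Thread flusher = new Thread(() -> {
      for (int i = 0; i < 32; i++) {
        try {
          Thread.sleep(5);
        } catch (InterruptedException e) {
          return;
        }
        flushed.incrementAndGet();
      }
    });
    flusher.start();
    // Polling with a deadline tolerates the lag without masking a
    // genuine failure (a real bug still times out and fails the test).
    waitForCount(flushed::get, 32, 5000);
    System.out.println("flushed=" + flushed.get());
    flusher.join();
  }
}
```

Hadoop's own test utilities provide an equivalent polling helper, so in-tree tests would typically use that rather than a hand-rolled loop.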


