[jira] [Commented] (HDDS-2607) DeadNodeHandler should not remove replica for a dead maintenance node
[ https://issues.apache.org/jira/browse/HDDS-2607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16985106#comment-16985106 ] Stephen O'Donnell commented on HDDS-2607: - The NodeStateManager is responsible for firing a "dead node" event, but it currently only does this if the node is IN_SERVICE. It will not fire the event if the node is DECOMMISSIONING, DECOMMISSIONED, ENTERING_MAINTENANCE or IN_MAINTENANCE. As part of this Jira we need to fix this, as the only time a dead node should not have the dead node event fired is when it is IN_MAINTENANCE. At all other times, a "dead node event" should clear the node's container replicas as usual. It is also important that the DatanodeAdminMonitor aborts its workflow for any node which goes dead while maintenance is in progress (unless it has already reached IN_MAINTENANCE), for several reasons:
1. The dead node event will delete all the container replicas for the node, so it is impossible to track them for replication correctly.
2. This could result in a node which has not completed decommission / maintenance getting marked as completed.
3. If the node returns to service, the state of the cluster may have changed and new pipelines should be created etc, meaning the admin workflow needs to restart.
In this Jira, we should therefore consider:
1. Resetting the node's OperationalState to "IN_SERVICE" as part of the dead node handling.
2. Ensuring the dead node event gets triggered for all operational states except IN_MAINTENANCE.
3. Aborting the maintenance workflow if the health of any node becomes "DEAD".
4. How to trigger a dead node event for a node which is dead and was IN_MAINTENANCE, once maintenance has ended either automatically or manually.
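The gating rule described in the comment above can be sketched as follows. This is a minimal illustration only: the enum values mirror the operational states named in the comment, but the class and method names here are hypothetical, not the actual NodeStateManager API.

```java
public class DeadNodeGating {

    // Illustrative copy of the operational states named in the comment.
    enum NodeOperationalState {
        IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
        ENTERING_MAINTENANCE, IN_MAINTENANCE
    }

    /**
     * Per the proposal: a dead node should trigger the dead-node event
     * (which clears its container replicas) in every operational state
     * except IN_MAINTENANCE.
     */
    static boolean shouldFireDeadNodeEvent(NodeOperationalState state) {
        return state != NodeOperationalState.IN_MAINTENANCE;
    }

    public static void main(String[] args) {
        for (NodeOperationalState s : NodeOperationalState.values()) {
            System.out.println(s + " -> fire dead node event: "
                + shouldFireDeadNodeEvent(s));
        }
    }
}
```

Under this sketch, only IN_MAINTENANCE suppresses the event, which matches point 2 of the proposed changes.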
> DeadNodeHandler should not remove replica for a dead maintenance node > - > > Key: HDDS-2607 > URL: https://issues.apache.org/jira/browse/HDDS-2607 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Affects Versions: 0.5.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > > Normally, when a node goes dead, the DeadNodeHandler removes all the > containers and replicas associated with the node from the ContainerManager. > If a node is IN_MAINTENANCE and goes dead, then we do not want to remove its > replicas. They should remain present in the system to prevent the containers > being marked as under-replicated. > We also need to consider the case where the node is dead, and then > maintenance expires automatically. In that case, the replicas associated with > the node must be removed and the affected containers will become > under-replicated. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Moved] (HDDS-2652) BlockOutputStreamEntryPool.java:allocateNewBlock spams the log with ExcludeList information
[ https://issues.apache.org/jira/browse/HDDS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh moved RATIS-764 to HDDS-2652: --- Component/s: (was: client) Ozone Client Key: HDDS-2652 (was: RATIS-764) Workflow: patch-available, re-open possible (was: no-reopen-closed, patch-avail) Project: Hadoop Distributed Data Store (was: Ratis) > BlockOutputStreamEntryPool.java:allocateNewBlock spams the log with > ExcludeList information > --- > > Key: HDDS-2652 > URL: https://issues.apache.org/jira/browse/HDDS-2652 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Mukul Kumar Singh >Priority: Major > Labels: MiniOzoneChaosCluster, newbie > > BlockOutputStreamEntryPool.java:allocateNewBlock spams the log with > ExcludeList information with the following log lines > {code} > 2019-11-29 20:22:51,590 [pool-244-thread-9] INFO > io.BlockOutputStreamEntryPool > (BlockOutputStreamEntryPool.java:allocateNewBlock(257)) - Allocating block > with ExcludeList {datanodes = [], containerIds = [], pipelineIds = []} > {code}
[GitHub] [hadoop-ozone] elek commented on a change in pull request #282: HDDS-2646. Start acceptance tests only if at least one THREE pipeline is available
elek commented on a change in pull request #282: HDDS-2646. Start acceptance tests only if at least one THREE pipeline is available URL: https://github.com/apache/hadoop-ozone/pull/282#discussion_r352132216 ## File path: hadoop-ozone/dist/src/main/compose/ozone/docker-config ## @@ -25,6 +25,7 @@ OZONE-SITE.XML_ozone.scm.client.address=scm OZONE-SITE.XML_ozone.replication=3 OZONE-SITE.XML_hdds.datanode.dir=/data/hdds OZONE-SITE.XML_hdds.profiler.endpoint.enabled=true +OZONE-SITE.XML_hdds.scm.safemode.min.datanode=3 Review comment: Yes, this is the problem that I tried to describe in #238. I am fine with the suggested approach, but it makes the definition more complex. What I am thinking is to create a `simple` compose definition which can work with one datanode (and we can adjust the replication factor and the s3 storage type there as well). As almost all the tested functionality requires datanode=3, it seems to be enough to have one cluster which can work with one datanode... This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #272: HDDS-2629. Ozone CLI: CreationTime/modifyTime of volume/bucket/key in…
adoroszlai commented on a change in pull request #272: HDDS-2629. Ozone CLI: CreationTime/modifyTime of volume/bucket/key in… URL: https://github.com/apache/hadoop-ozone/pull/272#discussion_r351807634 ## File path: hadoop-hdds/common/pom.xml ## @@ -147,6 +147,11 @@ https://maven.apache.org/xsd/maven-4.0.0.xsd;> snakeyaml 1.16 + + com.fasterxml.jackson.datatype + jackson-datatype-jsr310 + 2.9.8 Review comment: Can it use `${jackson2.version}` (currently 2.9.9) defined in root `pom.xml`?
[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #272: HDDS-2629. Ozone CLI: CreationTime/modifyTime of volume/bucket/key in…
adoroszlai commented on a change in pull request #272: HDDS-2629. Ozone CLI: CreationTime/modifyTime of volume/bucket/key in… URL: https://github.com/apache/hadoop-ozone/pull/272#discussion_r352098332 ## File path: hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/web/ozShell/TestObjectPrinter.java ## @@ -40,11 +40,13 @@ public void printObjectAsJson() throws IOException { OzoneConfiguration conf = new OzoneConfiguration(); OzoneVolume volume = new OzoneVolume(conf, Mockito.mock(ClientProtocol.class), "name", -"admin", "owner", 1L, 0L, +"admin", "owner", 1L, Instant.ofEpochMilli(0).toEpochMilli(), Review comment: This seems a complex way of writing `0`. If the intention is to avoid hard-coding the value, I think this one would be simpler: ``` Instant.EPOCH.toEpochMilli() ```
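The reviewer's suggestion is easy to sanity-check in isolation: `Instant.EPOCH` is the constant for 1970-01-01T00:00:00Z, so both spellings resolve to the same epoch-millisecond value. A standalone check (not project code):

```java
import java.time.Instant;

public class EpochCheck {
    public static void main(String[] args) {
        // Both spellings of "epoch millis" resolve to the same value, 0.
        long verbose = Instant.ofEpochMilli(0).toEpochMilli();
        long concise = Instant.EPOCH.toEpochMilli();
        System.out.println(verbose + " == " + concise); // prints "0 == 0"
    }
}
```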
[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #272: HDDS-2629. Ozone CLI: CreationTime/modifyTime of volume/bucket/key in…
adoroszlai commented on a change in pull request #272: HDDS-2629. Ozone CLI: CreationTime/modifyTime of volume/bucket/key in… URL: https://github.com/apache/hadoop-ozone/pull/272#discussion_r352096478 ## File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOzoneManagerHA.java ## @@ -228,7 +228,8 @@ public void testAllBucketOperations() throws Exception { Assert.assertEquals(bucketName, ozoneBucket.getName()); Assert.assertTrue(ozoneBucket.getVersioning()); Assert.assertEquals(StorageType.DISK, ozoneBucket.getStorageType()); -Assert.assertTrue(ozoneBucket.getCreationTime() <= Time.now()); +Assert.assertTrue(ozoneBucket.getCreationTime().compareTo( +Instant.now()) <= 0); Review comment: I think using `isAfter`/`isBefore` would be easier to understand at a glance. ``` assertFalse(ozoneBucket.getCreationTime().isAfter(Instant.now())) ``` (I hope I got the conversion right). Similarly in all other places where `compareTo()` is used. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
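The reviewer wasn't sure the conversion is right; it is, and it can be verified directly: for any two `Instant` values `a` and `b`, `a.compareTo(b) <= 0` holds exactly when `!a.isAfter(b)` does. A small standalone check (illustrative, not project code):

```java
import java.time.Instant;

public class InstantCompareCheck {
    // The spelling used in the PR.
    static boolean viaCompareTo(Instant a, Instant b) {
        return a.compareTo(b) <= 0;
    }

    // The reviewer's suggested spelling.
    static boolean viaIsAfter(Instant a, Instant b) {
        return !a.isAfter(b);
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        // Check the three possible orderings: before, equal, after.
        for (Instant a : new Instant[] {now.minusSeconds(60), now, now.plusSeconds(60)}) {
            if (viaCompareTo(a, now) != viaIsAfter(a, now)) {
                throw new AssertionError("conversion mismatch for " + a);
            }
        }
        System.out.println("compareTo(..) <= 0 is equivalent to !isAfter(..)");
    }
}
```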
[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #282: HDDS-2646. Start acceptance tests only if at least one THREE pipeline is available
adoroszlai commented on a change in pull request #282: HDDS-2646. Start acceptance tests only if at least one THREE pipeline is available URL: https://github.com/apache/hadoop-ozone/pull/282#discussion_r352093132 ## File path: hadoop-ozone/dist/src/main/compose/ozone/docker-config ## @@ -25,6 +25,7 @@ OZONE-SITE.XML_ozone.scm.client.address=scm OZONE-SITE.XML_ozone.replication=3 OZONE-SITE.XML_hdds.datanode.dir=/data/hdds OZONE-SITE.XML_hdds.profiler.endpoint.enabled=true +OZONE-SITE.XML_hdds.scm.safemode.min.datanode=3 Review comment: This will make it impossible to use these environments with a single datanode without modifying the config locally. I would like to propose an alternative solution: 1. define this config in the `environment` section in `docker-compose.yaml` using a variable that defaults to 1: `- "OZONE-SITE.XML_hdds.scm.safemode.min.datanode=${SAFEMODE_MIN_DATANODES:-1}"` 2. set the variable to 3 in `testlib.sh`: `export SAFEMODE_MIN_DATANODES=3`
[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #283: HDDS-2651 Make startId parameter non-mandatory while listing containe…
adoroszlai commented on a change in pull request #283: HDDS-2651 Make startId parameter non-mandatory while listing containe… URL: https://github.com/apache/hadoop-ozone/pull/283#discussion_r352084455 ## File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/SCMContainerManager.java ## @@ -217,7 +217,7 @@ public ContainerInfo getContainer(final ContainerID containerID) Collections.sort(containersIds); return containersIds.stream() - .filter(id -> id.getId() > startId) + .filter(id -> id.getId() >= startId) Review comment: The interface doc says `startId` is exclusive: https://github.com/apache/hadoop-ozone/blob/cd3f8c9f8316ae742ffeb497ab77581a4519173c/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerManager.java#L77-L92 I don't know how "backward compatible" the interface needs to be at this stage, but I think it would be better to change the default `startId` to 0 in `scmcli` instead. Otherwise, please update the javadoc, too. CC @anuengineer
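The one-character change the reviewer flags (`>` to `>=`) silently shifts `startId` from exclusive to inclusive semantics. A simplified sketch over plain `Long` ids (the real code filters `ContainerID` objects; the class and method names here are illustrative):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

public class StartIdFilter {
    // Documented behavior: startId is exclusive, listing starts *after* it.
    static List<Long> listExclusive(List<Long> ids, long startId) {
        return ids.stream()
            .filter(id -> id > startId)
            .collect(Collectors.toList());
    }

    // Behavior after the proposed diff: startId itself is included.
    static List<Long> listInclusive(List<Long> ids, long startId) {
        return ids.stream()
            .filter(id -> id >= startId)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Long> ids = LongStream.rangeClosed(1, 5).boxed()
            .collect(Collectors.toList());
        System.out.println(listExclusive(ids, 2)); // [3, 4, 5]
        System.out.println(listInclusive(ids, 2)); // [2, 3, 4, 5]
    }
}
```

This is why the reviewer suggests defaulting `startId` to 0 in `scmcli` instead: with an exclusive filter and a 0 default, all containers (ids starting at 1) are still listed without changing the documented contract.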
[jira] [Updated] (HDDS-2651) Make startId parameter non-mandatory while listing containers through shell command
[ https://issues.apache.org/jira/browse/HDDS-2651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2651: - Labels: pull-request-available (was: ) > Make startId parameter non-mandatory while listing containers through shell > command > --- > > Key: HDDS-2651 > URL: https://issues.apache.org/jira/browse/HDDS-2651 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Nilotpal Nandi >Assignee: Nilotpal Nandi >Priority: Minor > Labels: pull-request-available > > ozone scmcli container list -s=2 > Here the startId "--start | -s" is a mandatory parameter. > It needs to be made non-mandatory, as the default value for startId > is already defined.
[GitHub] [hadoop-ozone] nilotpalnandi opened a new pull request #283: HDDS-2651 Make startId parameter non-mandatory while listing containe…
nilotpalnandi opened a new pull request #283: HDDS-2651 Make startId parameter non-mandatory while listing containe… URL: https://github.com/apache/hadoop-ozone/pull/283 …rs through shell command ## What changes were proposed in this pull request? Currently the startId "--start | -s" is a mandatory parameter for the container list shell command. It needs to be made non-mandatory, as the default value for the startId parameter is already defined. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2651 ## How was this patch tested? Applied the patch and rebuilt ozone, then tested it by creating a docker cluster using docker-compose
[jira] [Created] (HDDS-2651) Make startId parameter non-mandatory while listing containers through shell command
Nilotpal Nandi created HDDS-2651: Summary: Make startId parameter non-mandatory while listing containers through shell command Key: HDDS-2651 URL: https://issues.apache.org/jira/browse/HDDS-2651 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Nilotpal Nandi Assignee: Nilotpal Nandi ozone scmcli container list -s=2 Here the startId "--start | -s" is a mandatory parameter. It needs to be made non-mandatory, as the default value for startId is already defined.
[GitHub] [hadoop-ozone] ChenSammi commented on a change in pull request #282: HDDS-2646. Start acceptance tests only if at least one THREE pipeline is available
ChenSammi commented on a change in pull request #282: HDDS-2646. Start acceptance tests only if at least one THREE pipeline is available URL: https://github.com/apache/hadoop-ozone/pull/282#discussion_r352032450 ## File path: hadoop-ozone/dist/src/main/compose/ozonescripts/docker-config ## @@ -26,6 +26,8 @@ OZONE-SITE.XML_ozone.scm.client.address=scm OZONE-SITE.XML_ozone.replication=1 OZONE-SITE.XML_hdds.datanode.dir=/data/hdds OZONE-SITE.XML_hdds.datanode.plugins=org.apache.hadoop.ozone.web.OzoneHddsDatanodeService +OZONE-SITE.XML_hdds.scm.safemode.min.datanode=3 Review comment: Same as ozone-hdfs/docker-config.
[GitHub] [hadoop-ozone] ChenSammi commented on a change in pull request #282: HDDS-2646. Start acceptance tests only if at least one THREE pipeline is available
ChenSammi commented on a change in pull request #282: HDDS-2646. Start acceptance tests only if at least one THREE pipeline is available URL: https://github.com/apache/hadoop-ozone/pull/282#discussion_r352026041 ## File path: hadoop-ozone/dist/src/main/compose/ozone-hdfs/docker-config ## @@ -23,6 +23,7 @@ OZONE-SITE.XML_ozone.metadata.dirs=/data/metadata OZONE-SITE.XML_ozone.scm.client.address=scm OZONE-SITE.XML_ozone.replication=1 OZONE-SITE.XML_hdds.datanode.dir=/data/hdds +OZONE-SITE.XML_hdds.scm.safemode.min.datanode=3 Review comment: This docker-compose currently starts 1 datanode and doesn't have a test.sh. So either we add more datanodes to docker-compose.yaml, or we can skip this file.
[GitHub] [hadoop-ozone] ChenSammi commented on a change in pull request #282: HDDS-2646. Start acceptance tests only if at least one THREE pipeline is available
ChenSammi commented on a change in pull request #282: HDDS-2646. Start acceptance tests only if at least one THREE pipeline is available URL: https://github.com/apache/hadoop-ozone/pull/282#discussion_r352029424 ## File path: hadoop-ozone/dist/src/main/compose/ozone-mr/common-config ## @@ -22,6 +22,7 @@ OZONE-SITE.XML_ozone.scm.block.client.address=scm OZONE-SITE.XML_ozone.metadata.dirs=/data/metadata OZONE-SITE.XML_ozone.scm.client.address=scm OZONE-SITE.XML_ozone.replication=3 +OZONE-SITE.XML_hdds.scm.safemode.min.datanode=3 Review comment: Not very familiar with docker-compose. Where do we tell docker-compose to start three datanodes with all these configurations?
[jira] [Created] (HDDS-2650) Fix createPipeline scmcli due to annotation changes in master
Li Cheng created HDDS-2650: -- Summary: Fix createPipeline scmcli due to annotation changes in master Key: HDDS-2650 URL: https://issues.apache.org/jira/browse/HDDS-2650 Project: Hadoop Distributed Data Store Issue Type: Sub-task Reporter: Li Cheng Assignee: Li Cheng
Caused by: java.lang.IllegalArgumentException: Can not set org.apache.hadoop.hdds.scm.cli.SCMCLI field org.apache.hadoop.hdds.scm.cli.pipeline.CreatePipelineSubcommand.parent to org.apache.hadoop.hdds.scm.cli.pipeline.PipelineCommands
 at sun.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException(UnsafeFieldAccessorImpl.java:167)
 at sun.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException(UnsafeFieldAccessorImpl.java:171)
 at sun.reflect.UnsafeObjectFieldAccessorImpl.set(UnsafeObjectFieldAccessorImpl.java:81)
 at java.lang.reflect.Field.set(Field.java:764)
 at picocli.CommandLine$Model$CommandReflection.initParentCommand(CommandLine.java:6476)
[jira] [Reopened] (HDDS-1564) Ozone multi-raft support
[ https://issues.apache.org/jira/browse/HDDS-1564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Cheng reopened HDDS-1564: Not sure what happened, but HDDS-1564 is still under implementation. > Ozone multi-raft support > > > Key: HDDS-1564 > URL: https://issues.apache.org/jira/browse/HDDS-1564 > Project: Hadoop Distributed Data Store > Issue Type: New Feature > Components: Ozone Datanode, SCM >Reporter: Siddharth Wagle >Assignee: Li Cheng >Priority: Major > Attachments: Ozone Multi-Raft Support.pdf > > > Apache Ratis supports multi-raft by allowing the same node to be a part of > multiple raft groups. The proposal is to allow datanodes to be a part of > multiple raft groups. The attached design doc explains the reasons for doing > this as well a few initial design decisions. > Some of the work in this feature also related to HDDS-700 which implements > rack-aware container placement for closed containers.
[jira] [Resolved] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline
[ https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Cheng resolved HDDS-2356. Release Note: MPU is steady after changes made by Bharat. Resolution: Fixed > Multipart upload report errors while writing to ozone Ratis pipeline > > > Key: HDDS-2356 > URL: https://issues.apache.org/jira/browse/HDDS-2356 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.4.1 > Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM > on a separate VM >Reporter: Li Cheng >Assignee: Bharat Viswanadham >Priority: Blocker > Fix For: 0.5.0 > > Attachments: 2018-11-15-OM-logs.txt, 2019-11-06_18_13_57_422_ERROR, > hs_err_pid9340.log, image-2019-10-31-18-56-56-177.png, > om-audit-VM_50_210_centos.log, om_audit_log_plc_1570863541668_9278.txt > > > Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say > it's VM0. > I use goofys as a fuse and enable ozone S3 gateway to mount ozone to a path > on VM0, while reading data from VM0 local disk and write to mount path. The > dataset has various sizes of files from 0 byte to GB-level and it has a > number of ~50,000 files. > The writing is slow (1GB for ~10 mins) and it stops after around 4GB. As I > look at hadoop-root-om-VM_50_210_centos.out log, I see OM throwing errors > related with Multipart upload. This error eventually causes the writing to > terminate and OM to be closed. > > Updated on 11/06/2019: > See new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR and full logs > are in the attachment. 
> 2019-11-05 18:12:37,766 ERROR org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest: MultipartUpload Commit is failed for Key: ./20191012/plc_1570863541668_9278 in Volume/Bucket s325d55ad283aa400af464c76d713c07ad/ozone-test
> NO_SUCH_MULTIPART_UPLOAD_ERROR org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload is with specified uploadId fcda8608-b431-48b7-8386-0a332f1a709a-103084683261641950
> at org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:156)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.java:217)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132)
> at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100)
> at org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>
> Updated on 10/28/2019: See MISMATCH_MULTIPART_LIST error. 
> 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete Multipart Upload Request for bucket: ozone-test, key: 20191012/plc_1570863541668_9278
> MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: Complete Multipart Upload Failed: volume: s3c89e813c80ffcea9543004d57b2a1239 bucket: ozone-test key: 20191012/plc_1570863541668_9278
> at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732)
> at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.completeMultipartUpload(OzoneManagerProtocolClientSideTranslatorPB.java:1104)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.apache.hadoop.hdds.tracing.TraceAllMethod.invoke(TraceAllMethod.java:66)
> at
[jira] [Assigned] (HDDS-2636) Refresh pipeline information in OzoneManager lookupFile call
[ https://issues.apache.org/jira/browse/HDDS-2636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar reassigned HDDS-2636: - Assignee: Nanda kumar > Refresh pipeline information in OzoneManager lookupFile call > > > Key: HDDS-2636 > URL: https://issues.apache.org/jira/browse/HDDS-2636 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > > {{lookupFile}} call in OzoneManager doesn't refresh the pipeline information. > As a result of this, wrong pipeline information is returned to the client and > the client fails eventually.
[GitHub] [hadoop-ozone] lokeshj1703 commented on issue #281: HDDS-2640 Add leaderID information in pipeline list subcommand
lokeshj1703 commented on issue #281: HDDS-2640 Add leaderID information in pipeline list subcommand URL: https://github.com/apache/hadoop-ozone/pull/281#issuecomment-559705110 @nilotpalnandi Thanks for working on the PR! I have merged it to master branch.
[GitHub] [hadoop-ozone] lokeshj1703 merged pull request #281: HDDS-2640 Add leaderID information in pipeline list subcommand
lokeshj1703 merged pull request #281: HDDS-2640 Add leaderID information in pipeline list subcommand URL: https://github.com/apache/hadoop-ozone/pull/281
[jira] [Moved] (HDDS-2649) TestOzoneManagerHttpServer#testHttpPolicy fails intermittently
[ https://issues.apache.org/jira/browse/HDDS-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lokesh Jain moved RATIS-763 to HDDS-2649: - Key: HDDS-2649 (was: RATIS-763) Workflow: patch-available, re-open possible (was: no-reopen-closed, patch-avail) Project: Hadoop Distributed Data Store (was: Ratis) > TestOzoneManagerHttpServer#testHttpPolicy fails intermittently > -- > > Key: HDDS-2649 > URL: https://issues.apache.org/jira/browse/HDDS-2649 > Project: Hadoop Distributed Data Store > Issue Type: Test >Reporter: Lokesh Jain >Priority: Major > > TestOzoneManagerHttpServer#testHttpPolicy fails with the following exception. > {code:java} > [ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.42 > s <<< FAILURE! - in org.apache.hadoop.ozone.om.TestOzoneManagerHttpServer > [ERROR] > testHttpPolicy[1](org.apache.hadoop.ozone.om.TestOzoneManagerHttpServer) > Time elapsed: 0.343 s <<< FAILURE! > java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.ozone.om.TestOzoneManagerHttpServer.testHttpPolicy(TestOzoneManagerHttpServer.java:110) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > at org.junit.runners.ParentRunner.run(ParentRunner.java:309) > at org.junit.runners.Suite.runChild(Suite.java:127) > at org.junit.runners.Suite.runChild(Suite.java:26) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.run(ParentRunner.java:309) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > 
{code}
[jira] [Created] (HDDS-2648) TestOzoneManagerDoubleBufferWithOMResponse
Marton Elek created HDDS-2648: - Summary: TestOzoneManagerDoubleBufferWithOMResponse Key: HDDS-2648 URL: https://issues.apache.org/jira/browse/HDDS-2648 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Manager Reporter: Marton Elek The test is flaky: Example run: [https://github.com/apache/hadoop-ozone/runs/325281277] Failure: {code:java} --- Test set: org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse --- Tests run: 3, Failures: 1, Errors: 0, Skipped: 1, Time elapsed: 5.31 s <<< FAILURE! - in org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse testDoubleBufferWithMixOfTransactionsParallel(org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse) Time elapsed: 0.282 s <<< FAILURE! java.lang.AssertionError: expected:<32> but was:<29> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.testDoubleBufferWithMixOfTransactionsParallel(TestOzoneManagerDoubleBufferWithOMResponse.java:247) {code}