[jira] [Updated] (HDFS-13972) RBF: Support for Delegation Token (WebHDFS)
[ https://issues.apache.org/jira/browse/HDFS-13972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] CR Hota updated HDFS-13972: --- Attachment: HDFS-13972-HDFS-13891.002.patch > RBF: Support for Delegation Token (WebHDFS) > --- > > Key: HDFS-13972 > URL: https://issues.apache.org/jira/browse/HDFS-13972 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: CR Hota >Priority: Major > Attachments: HDFS-13972-HDFS-13891.001.patch, > HDFS-13972-HDFS-13891.002.patch > > > HDFS Router should support issuing HDFS delegation tokens through WebHDFS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-891) Create customized yetus personality for ozone
[ https://issues.apache.org/jira/browse/HDDS-891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772771#comment-16772771 ] Elek, Marton commented on HDDS-891: --- Thanks for the answer, [~aw]. Sorry for the confusion. I think there are two questions which are mixed here. # I would like to fix HDDS-146. It was not clear to me what was wrong. If I understood correctly, you suggest double-quoting the $DOCKER_INTERACTIVE_RUN variable in the docker run line. I think it's a false positive report, as DOCKER_INTERACTIVE_RUN should be either an empty string or unset (in which case it is replaced by "-i -t"). I think it shouldn't be quoted, because in that case, instead of -i -t, we would add just one argument ('-i -t'). But please let me know if I am wrong. (I can add a check if its initial value is anything other than an empty string.) # "_Cutting back on options just because you think they don't apply to Ozone is almost entirely incorrect._" Please note that I am totally on the _opposite_ side. I would like to add MORE tests and MORE strict tests _in addition_ to the existing tests. (Especially: running docker-based acceptance tests, running ALL the hdds/ozone unit tests all the time, not just for the changed projects, and checking ALL the findbugs/checkstyle issues, not just the new ones.) # I am convinced we should run the more strict tests in addition to the existing yetus tests. Please let me know if I can do something to get Yetus results for the PRs. > Create customized yetus personality for ozone > - > > Key: HDDS-891 > URL: https://issues.apache.org/jira/browse/HDDS-891 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > > Ozone pre-commit builds (such as > https://builds.apache.org/job/PreCommit-HDDS-Build/) use the official hadoop > personality from yetus. > Yetus personalities are bash scripts which contain personalization for > specific builds. 
> The hadoop personality tries to identify which project should be built and > uses a partial build to build only the required subprojects, because the full > build is very time consuming. > But in Ozone: > 1.) The build + unit tests are very fast > 2.) We don't need all the checks (for example the hadoop-specific shading > test) > 3.) We prefer to do a full build and a full unit test run for the hadoop-ozone and > hadoop-hdds subprojects (for example the hadoop-ozone integration test should > always be executed, as it contains many generic unit tests)
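The quoting question discussed above comes down to shell word splitting. A minimal sketch (the variable name is the one from the thread; the rest is illustrative): unquoted, the expansion of DOCKER_INTERACTIVE_RUN splits into two separate arguments, while quoting it would pass the literal string "-i -t" as a single argument, which docker run would not accept as a flag.

```shell
# DOCKER_INTERACTIVE_RUN is assumed to be either unset/empty or "-i -t".
DOCKER_INTERACTIVE_RUN="-i -t"

# Unquoted: default IFS splits the expansion into two arguments, -i and -t.
set -- $DOCKER_INTERACTIVE_RUN
echo "unquoted: $# args"   # prints "unquoted: 2 args"

# Quoted: the expansion stays one argument, the literal '-i -t'.
set -- "$DOCKER_INTERACTIVE_RUN"
echo "quoted: $# args"     # prints "quoted: 1 args"
```

This is why leaving the expansion unquoted (and asserting the variable is empty or exactly "-i -t") is the working pattern here, even though shellcheck flags it.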
[jira] [Commented] (HDFS-13972) RBF: Support for Delegation Token (WebHDFS)
[ https://issues.apache.org/jira/browse/HDFS-13972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772776#comment-16772776 ] CR Hota commented on HDFS-13972: [~brahmareddy] [~elgoiri] Uploaded a new patch after rebasing with the latest changes, for early feedback. Working on wiring up the tests. Realized that we did not add any web-based tests as part of the kerberos patch in HDFS-12284. > RBF: Support for Delegation Token (WebHDFS) > --- > > Key: HDFS-13972 > URL: https://issues.apache.org/jira/browse/HDFS-13972 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: CR Hota >Priority: Major > Attachments: HDFS-13972-HDFS-13891.001.patch, > HDFS-13972-HDFS-13891.002.patch > > > HDFS Router should support issuing HDFS delegation tokens through WebHDFS.
[jira] [Updated] (HDDS-1086) Remove RaftClient from OM
[ https://issues.apache.org/jira/browse/HDDS-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanisha Koneru updated HDDS-1086: - Status: Open (was: Patch Available) > Remove RaftClient from OM > - > > Key: HDDS-1086 > URL: https://issues.apache.org/jira/browse/HDDS-1086 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: HA, OM >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Major > Attachments: HDDS-1086.001.patch > > > Currently we run a RaftClient in OM, which takes the incoming client requests > and submits them to the OM's Ratis server. This hop can be avoided if OM > submits the incoming client requests directly to its Ratis server.
[jira] [Commented] (HDFS-14297) Add cache for getContentSummary() result
[ https://issues.apache.org/jira/browse/HDFS-14297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772780#comment-16772780 ] Tao Jie commented on HDFS-14297: Thank you [~xkrogen]. In our environment, {{getContentSummary}} is invoked from several peripheral systems, not only for monitoring quotas. We can replace {{getContentSummary}} with {{getQuotaUsage}} in some places. I still think we should do some improvement on the server side: if we have a new user who calls {{getContentSummary}} very frequently, it will cause a lot of load on the NameNode RPC server. > Add cache for getContentSummary() result > > > Key: HDFS-14297 > URL: https://issues.apache.org/jira/browse/HDFS-14297 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Tao Jie >Priority: Major > > In a large HDFS cluster, calling {{getContentSummary}} for a directory with a > large number of files is very expensive. In a certain cluster with more than > 100 million files, calling {{getContentSummary}} may take more than 10s, and > it will hold the fsnamesystem lock for that long. > In our cluster, there are several peripheral systems calling > {{getContentSummary}} periodically to monitor the status of dirs. Actually we > don't need a very accurate result in most cases. We could keep a cache of > those contentSummary results in the namenode, with which we could avoid > repeated heavy requests within a span. We should also add restrictions to this > cache: 1) its size should be limited and it should be LRU; 2) only results of > heavy requests would be added to this cache, e.g., rpc time over 1000ms. > We may create a new RPC method or add a flag to the current method, so that we > do not modify the current behavior and callers can choose between an accurate > but expensive method and a fast but inaccurate one. > Any thoughts? 
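The restrictions proposed above (bounded LRU size, admitting only "heavy" results) can be sketched as follows. This is purely illustrative, assuming hypothetical names; it is not code from any HDFS patch. `LinkedHashMap` with `accessOrder=true` plus an overridden `removeEldestEntry` gives a minimal LRU.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the proposed cache: LRU-bounded, and an entry is
// admitted only if computing it took longer than a threshold (rpc time).
class ContentSummaryCache {
  private static final int MAX_ENTRIES = 1024;  // restriction 1: bounded, LRU
  private static final long HEAVY_MS = 1000;    // restriction 2: heavy calls only

  // accessOrder=true makes iteration order = access order, i.e. LRU eviction.
  private final Map<String, String> cache =
      new LinkedHashMap<String, String>(16, 0.75f, true) {
        @Override
        protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
          return size() > MAX_ENTRIES;
        }
      };

  synchronized String get(String path) {
    return cache.get(path);  // null = miss, caller falls back to the real RPC
  }

  synchronized void maybePut(String path, String summary, long rpcTimeMs) {
    if (rpcTimeMs >= HEAVY_MS) {  // cheap calls are never cached
      cache.put(path, summary);
    }
  }
}
```

A real implementation would also need an expiry span and invalidation, which this sketch omits.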
[jira] [Commented] (HDFS-13762) Support non-volatile storage class memory(SCM) in HDFS cache directives
[ https://issues.apache.org/jira/browse/HDFS-13762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772797#comment-16772797 ] Uma Maheswara Rao G commented on HDFS-13762: Thanks [~PhiloHe] for updating the patch. Please find the following feedback. I am still reviewing it and will post the remaining comments tomorrow. # {code:java} this.pmemManager = new PmemVolumeManager(); this.pmemManager.load(pmemVolumes); PmemMappedBlock.setPersistentMemoryManager(this.pmemManager); PmemMappedBlock.setDataset(dataset); {code} I think the interface design can be cleaner. Now we have interfaces for MappableBlock, but we do not really depend on the interface; we actually use the classes directly to load blocks, as the load methods are static. I am thinking something like this: let's have a MemoryVolumeManager interface which is implemented by two separate implementations, PmemVolumeManager and MemMappedVolumeManager. They can have init methods and getMappableBlockLoader, and they can return the respective MappableBlockLoader classes, which are PmemMappedBlock and MemoryMappedBlock. This loader would then have the interface methods loadBlock, getLength, and afterCache. Let's make the current static load methods private static methods in PmemMappedBlock and MemoryMappedBlock, and use them in the loadBlock interface implementation. Does this make sense to you? This will avoid the if check every time at {code:java} if (pmemManager == null) { mappableBlock = MemoryMappedBlock.load(length, blockIn, metaIn, blockFileName, key); } else { mappableBlock = PmemMappedBlock.load(length, blockIn, metaIn, blockFileName, key); } {code} MemoryVolumeManager#getMappableBlockLoader will get the right loader and load the block. # "Fail to msync region." --> "Failed to msync region." # // Load Intel ISA-L : you may want to remove this in the pmdk file. I saw a couple of other places referring to ISA-L. Please remove them, as they are not related to this. 
# If some DNs are configured with pmem volumes and some are not, then blocks will be cached both memory-mapped and pmem-mapped? Probably you would like to document your recommendation? # You may want to update the documentation for this feature: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html # Since it's persistent, how does this handle or remember cached blocks when the DN restarts? # Do you want to have different metrics for different cache types? I am not sure it's really necessary, but people may be interested to know how many blocks are cached in the pmem area. (Low priority) # {noformat} Build FAILED. "C:\Users\umgangum\Work\hadoop\hadoop-common-project\hadoop-common\src\main\native\native.sln" (default target) (1) -> "C:\Users\umgangum\Work\hadoop\hadoop-common-project\hadoop-common\src\main\native\native.vcxproj" (default target) (2) -> (ClCompile target) -> src\org\apache\hadoop\util\bulk_crc32.c(59): warning C4091: 'static ' : ignored on left of 'int' when no variable is declared [C:\Users\umgangum\Work\hadoop\hadoop-common-project\hadoop-common\src\main\native\native.vcxproj] "C:\Users\umgangum\Work\hadoop\hadoop-common-project\hadoop-common\src\main\native\native.sln" (default target) (1) -> "C:\Users\umgangum\Work\hadoop\hadoop-common-project\hadoop-common\src\main\native\native.vcxproj" (default target) (2) -> (ClCompile target) -> c:\users\umgangum\work\hadoop\hadoop-common-project\hadoop-common\src\main\native\src\org\apache\hadoop\io\nativeio\pmdk_load.h(52): error C2061: syntax error : identifier '__d_pmem_map_file' [C:\Users\umgangum\Work\hadoop\hadoop-common-project\hadoop-common\src\main\native\native.vcxproj] {noformat} You need to fix this for Windows as well. Am I missing something? 
# From pmdk_load.h {code} __d_pmem_map_file pmem_map_file; __d_pmem_unmap pmem_unmap; __d_pmem_is_pmem pmem_is_pmem; __d_pmem_drain pmem_drain; __d_pmem_memcpy_nodrain pmem_memcpy_nodrain; __d_pmem_msync pmem_msync; {code} Quick question: are these types available in a non-Unix env? I think you did the typedef only in the Unix env, right? > Support non-volatile storage class memory(SCM) in HDFS cache directives > --- > > Key: HDFS-13762 > URL: https://issues.apache.org/jira/browse/HDFS-13762 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Sammi Chen >Assignee: Feilong He >Priority: Major > Attachments: HDFS-13762.000.patch, HDFS-13762.001.patch, > HDFS-13762.002.patch, HDFS-13762.003.patch, HDFS-13762.004.patch, > HDFS-13762.005.patch, HDFS-13762.006.patch, HDFS-13762.007.patch, > HDFS-13762.008.patch, SCMCacheDesign-2018-11-08.pdf, SCMCacheTestPlan
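The interface split suggested in the review above could look roughly like this. All type and method names are hypothetical stand-ins echoing the comment, not the committed HDFS API; the loaders are trivial stubs just to show the shape that removes the `if (pmemManager == null)` branch.

```java
// Minimal MappableBlock abstraction (stub: only length is modeled here).
interface MappableBlock { long getLength(); }

// The loader interface the review proposes: loadBlock/getLength/afterCache
// collapsed to a single load method for this sketch.
interface MappableBlockLoader {
  MappableBlock load(long length, String blockFileName, String key);
}

// Each volume manager hands back its own loader, so the caller never
// branches on which cache backend is configured.
interface MemoryVolumeManager {
  MappableBlockLoader getMappableBlockLoader();
}

// Two stand-in implementations, mirroring the proposed
// MemMappedVolumeManager / PmemVolumeManager split.
class MemMappedVolumeManager implements MemoryVolumeManager {
  public MappableBlockLoader getMappableBlockLoader() {
    return (len, name, key) -> () -> len;  // DRAM mmap loader stub
  }
}

class PmemVolumeManager implements MemoryVolumeManager {
  public MappableBlockLoader getMappableBlockLoader() {
    return (len, name, key) -> () -> len;  // persistent-memory loader stub
  }
}
```

The caller would hold a single `MemoryVolumeManager` chosen at init time and always call `getMappableBlockLoader().load(...)`, regardless of backend.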
[jira] [Comment Edited] (HDDS-1095) OzoneManager#openKey should do multiple block allocations in a single SCM rpc call
[ https://issues.apache.org/jira/browse/HDDS-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772802#comment-16772802 ] Shashikant Banerjee edited comment on HDDS-1095 at 2/20/19 9:20 AM: [~msingh], thanks for updating the patch. I have two minor comments; the rest all looks good to me. 1) ScmBlockLocationProtocolClientSideTranslatorPB.java: the javadoc needs to be updated here accordingly. 2) Let's have a block allocation limit (equal to a container size, as discussed), so that we never hit the hadoop rpc payload limit. was (Author: shashikant): [~msingh],Thanks for updating the patch. I have two minor comments. Rest all looks good to me. 1) ScmBlockLocationProtocolClientSideTranslatorPB.java : The javadoc needs to be updated here accordingly. 2) Let's have the block allocation limit (equal to a container size), so that we don't ever hit the hadoop rpc payload limit in case. > OzoneManager#openKey should do multiple block allocations in a single SCM rpc > call > -- > > Key: HDDS-1095 > URL: https://issues.apache.org/jira/browse/HDDS-1095 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Fix For: 0.4.0 > > Attachments: HDDS-1095.001.patch, HDDS-1095.002.patch > > > Currently in KeyManagerImpl#openKey, for a large key allocation, multiple > blocks are allocated in different rpc calls. If the key length is already > known, then multiple blocks can be allocated in one rpc call to SCM.
[jira] [Resolved] (HDDS-891) Create customized yetus personality for ozone
[ https://issues.apache.org/jira/browse/HDDS-891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elek, Marton resolved HDDS-891. --- Resolution: Won't Fix > Create customized yetus personality for ozone > -
[jira] [Commented] (HDDS-1095) OzoneManager#openKey should do multiple block allocations in a single SCM rpc call
[ https://issues.apache.org/jira/browse/HDDS-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772802#comment-16772802 ] Shashikant Banerjee commented on HDDS-1095: --- [~msingh], thanks for updating the patch. I have two minor comments; the rest all looks good to me. 1) ScmBlockLocationProtocolClientSideTranslatorPB.java: the javadoc needs to be updated here accordingly. 2) Let's have a block allocation limit (equal to a container size), so that we never hit the hadoop rpc payload limit. > OzoneManager#openKey should do multiple block allocations in a single SCM rpc > call > --
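The batching-with-a-cap idea above reduces to simple arithmetic: request all blocks for the key in one SCM call, but never more than fit in one container. The method and size values below are hypothetical, just to make the rule concrete.

```java
// Illustrative sketch of the proposed single-RPC allocation count:
// min(blocks needed for the key, blocks per container). Not Ozone code.
class BlockAllocationMath {
  static long blocksToRequest(long keyLen, long blockSize, long containerSize) {
    long needed = (keyLen + blockSize - 1) / blockSize;  // ceil(keyLen / blockSize)
    long perContainerCap = containerSize / blockSize;    // keeps RPC payload bounded
    return Math.min(needed, perContainerCap);
  }
}
```

A key larger than one container would then take ceil(keyLen / containerSize) SCM calls instead of one call per block.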
[jira] [Updated] (HDDS-1126) datanode is trying to quasi-close a container which is already closed
[ https://issues.apache.org/jira/browse/HDDS-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-1126: -- Component/s: Ozone Datanode > datanode is trying to quasi-close a container which is already closed > - > > Key: HDDS-1126 > URL: https://issues.apache.org/jira/browse/HDDS-1126 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Nilotpal Nandi >Assignee: Nanda kumar >Priority: Major > > steps taken : > > # created a 12-datanode cluster and ran workload on all the nodes > # ran failure injection/restart on 1 datanode at a time, periodically and > randomly. > > Error seen in ozone.log : > -- > > {noformat} > 2019-02-18 06:06:32,780 [Datanode State Machine Thread - 0] DEBUG > (DatanodeStateMachine.java:176) - Executing cycle Number : 30 > 2019-02-18 06:06:32,784 [Command processor thread] DEBUG > (CloseContainerCommandHandler.java:71) - Processing Close Container command. > 2019-02-18 06:06:32,785 [Datanode State Machine Thread - 0] DEBUG > (DatanodeStateMachine.java:176) - Executing cycle Number : 31 > 2019-02-18 06:06:32,785 [Command processor thread] ERROR > (CloseContainerCommandHandler.java:118) - Can't close container #37 > org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: > Cannot quasi close container #37 while in CLOSED state. 
> at > org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.quasiCloseContainer(KeyValueHandler.java:903) > at > org.apache.hadoop.ozone.container.ozoneimpl.ContainerController.quasiCloseContainer(ContainerController.java:93) > at > org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CloseContainerCommandHandler.handle(CloseContainerCommandHandler.java:110) > at > org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CommandDispatcher.handle(CommandDispatcher.java:93) > at > org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$initCommandHandlerThread$1(DatanodeStateMachine.java:413) > at java.lang.Thread.run(Thread.java:748) > 2019-02-18 06:06:32,785 [Command processor thread] DEBUG > (CloseContainerCommandHandler.java:71) - Processing Close Container command. > 2019-02-18 06:06:32,788 [Command processor thread] DEBUG > (CloseContainerCommandHandler.java:71) - Processing Close Container command. > 2019-02-18 06:06:32,788 [Datanode State Machine Thread - 0] DEBUG > (DatanodeStateMachine.java:176) - Executing cycle Number : 32 > 2019-02-18 06:06:34,430 [main] DEBUG (OzoneClientFactory.java:287) - Using > org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol. > 2019-02-18 06:06:36,608 [main] DEBUG (OzoneClientFactory.java:287) - Using > org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol. > 2019-02-18 06:06:38,876 [main] DEBUG (OzoneClientFactory.java:287) - Using > org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol. > 2019-02-18 06:06:41,084 [main] DEBUG (OzoneClientFactory.java:287) - Using > org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol. > 2019-02-18 06:06:43,297 [main] DEBUG (OzoneClientFactory.java:287) - Using > org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol. > 2019-02-18 06:06:45,469 [main] DEBUG (OzoneClientFactory.java:287) - Using > org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol. 
> 2019-02-18 06:06:47,684 [main] DEBUG (OzoneClientFactory.java:287) - Using > org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol. > 2019-02-18 06:06:49,958 [main] DEBUG (OzoneClientFactory.java:287) - Using > org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol. > 2019-02-18 06:06:52,124 [main] DEBUG (OzoneClientFactory.java:287) - Using > org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol. > 2019-02-18 06:06:54,344 [main] DEBUG (OzoneClientFactory.java:287) - Using > org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol. > 2019-02-18 06:06:56,499 [main] DEBUG (OzoneClientFactory.java:287) - Using > org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol. > 2019-02-18 06:06:58,764 [main] DEBUG (OzoneClientFactory.java:287) - Using > org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol. > 2019-02-18 06:07:00,969 [main] DEBUG (OzoneClientFactory.java:287) - Using > org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol. > 2019-02-18 06:07:02,788 [Datanode State Machine Thread - 0] DEBUG > (DatanodeStateMachine.java:176) - Executing cycle Number : 33 > 2019-02-18 06:07:03,240 [main] DEBUG (OzoneClientFactory.java:287) - Using > org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol. > 2019-02-18 06:07:05,486 [main] DEBUG (OzoneClientFactory.ja
[jira] [Updated] (HDDS-1126) datanode is trying to quasi-close a container which is already closed
[ https://issues.apache.org/jira/browse/HDDS-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-1126: -- Target Version/s: 0.4.0 > datanode is trying to quasi-close a container which is already closed > -
[jira] [Commented] (HDFS-3246) pRead equivalent for direct read path
[ https://issues.apache.org/jira/browse/HDFS-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772810#comment-16772810 ] Zheng Hu commented on HDFS-3246: [~stakiar], any thoughts about those failed UTs and checkstyle issues? Thanks. > pRead equivalent for direct read path > - > > Key: HDFS-3246 > URL: https://issues.apache.org/jira/browse/HDFS-3246 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client, performance >Affects Versions: 3.0.0-alpha1 >Reporter: Henry Robinson >Assignee: Sahil Takiar >Priority: Major > Attachments: HDFS-3246.001.patch > > > There is no pread equivalent in ByteBufferReadable. We should consider adding > one. It would be relatively easy to implement for the distributed case > (certainly compared to HDFS-2834), since DFSInputStream does most of the > heavy lifting.
[jira] [Assigned] (HDDS-1126) datanode is trying to quasi-close a container which is already closed
[ https://issues.apache.org/jira/browse/HDDS-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar reassigned HDDS-1126: - Assignee: Nanda kumar > datanode is trying to quasi-close a container which is already closed > -
[jira] [Comment Edited] (HDDS-891) Create customized yetus personality for ozone
[ https://issues.apache.org/jira/browse/HDDS-891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772771#comment-16772771 ] Elek, Marton edited comment on HDDS-891 at 2/20/19 9:23 AM: Thanks for the answer [~aw]. Sorry for the confusion. I think there are two questions mixed together here. # I would like to fix HDDS-146. It was not clear to me what was wrong. If I understood correctly, you suggest double-quoting the $DOCKER_INTERACTIVE_RUN variable in the docker run line. I think it's a false-positive report, as DOCKER_INTERACTIVE_RUN should be either an empty string or unset (in which case it is replaced by "-i -t"). I think it shouldn't be quoted, because in that case instead of -i -t we would add just one argument ('-i -t'). But please let me know if I am wrong. (I can add a check in case its initial value is anything other than an empty string.) # "_Cutting back on options just because you think they don't apply to Ozone is almost entirely incorrect._" Please note that I am totally on the _opposite_ side: I would like to add MORE tests and MORE strict tests _in addition_ to the existing tests (especially: running the docker-based acceptance tests, running ALL the hdds/ozone unit tests all the time, not just for the changed projects, and checking ALL the findbugs/checkstyle issues, not just the new ones). # I am convinced we should run the stricter tests in addition to the existing yetus tests. Please let me know if I can do something to get Yetus results for the PRs. was (Author: elek): (earlier revision of the same comment, with minor wording differences) > Create customized yetus personality for ozone > - > > Key: HDDS-891 > URL: https://issues.apache.org/jira/browse/HDDS-891 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > > Ozone pre-commit builds (such as > https://builds.apache.org/job/PreCommit-HDDS-Build/) use the official hadoop > personality from the yetus repository. > Yetus personalities are bash scripts which contain personalization for > specific builds. > The hadoop personality tries to identify which projects should be built and > uses a partial build to build only the required subprojects, because the full > build is very time consuming. > But in Ozone: > 1.) The build + unit tests are very fast > 2.) We don't need all the checks (for example the hadoop-specific shading > test) > 3.)
We prefer to do a full build and full unit test run for the hadoop-ozone and > hadoop-hdds subprojects (for example, the hadoop-ozone integration test should always > be executed, as it contains many generic unit tests) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
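The quoting question in the comment above can be illustrated with a small sketch (the variable name is taken from the discussion; the surrounding script is hypothetical, not the actual yetus code). Unquoted, the expansion is word-split into the two separate arguments -i and -t; quoted, it would become the single argument '-i -t', which docker would reject as an unknown flag:

```shell
# Default: when DOCKER_INTERACTIVE_RUN is unset, fall back to "-i -t".
DOCKER_INTERACTIVE_RUN=${DOCKER_INTERACTIVE_RUN:-"-i -t"}

# Unquoted expansion: word splitting yields two arguments (-i and -t),
# so the simulated command line has 5 words: docker run -i -t image
set -- docker run $DOCKER_INTERACTIVE_RUN image
unquoted_argc=$#

# Quoted expansion: the whole value is one argument ('-i -t'),
# so the simulated command line has only 4 words.
set -- docker run "$DOCKER_INTERACTIVE_RUN" image
quoted_argc=$#

echo "unquoted=$unquoted_argc quoted=$quoted_argc"   # prints: unquoted=5 quoted=4
```

A shellcheck-clean alternative that keeps the splitting explicit is a bash array: `DOCKER_INTERACTIVE_RUN=(-i -t)` passed as `"${DOCKER_INTERACTIVE_RUN[@]}"`.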
[jira] [Commented] (HDFS-14235) Handle ArrayIndexOutOfBoundsException in DataNodeDiskMetrics#slowDiskDetectionDaemon
[ https://issues.apache.org/jira/browse/HDFS-14235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772815#comment-16772815 ] Hadoop QA commented on HDFS-14235: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 10s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 47s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 3 unchanged - 1 fixed = 3 total (was 4) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 19s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 98m 21s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}153m 56s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.qjournal.server.TestJournalNodeSync | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14235 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12959383/HDFS-14235.003.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 5fd4b1bbafad 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 1d30fd9 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/26266/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/26266/testReport/ | | Max. proce
[jira] [Commented] (HDFS-14235) Handle ArrayIndexOutOfBoundsException in DataNodeDiskMetrics#slowDiskDetectionDaemon
[ https://issues.apache.org/jira/browse/HDFS-14235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772817#comment-16772817 ] Ranith Sardar commented on HDFS-14235: -- [~surendrasingh], Fixed the checkstyle issue. Please review it once. > Handle ArrayIndexOutOfBoundsException in > DataNodeDiskMetrics#slowDiskDetectionDaemon > - > > Key: HDFS-14235 > URL: https://issues.apache.org/jira/browse/HDFS-14235 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Surendra Singh Lilhore >Assignee: Ranith Sardar >Priority: Major > Attachments: HDFS-14235.000.patch, HDFS-14235.001.patch, > HDFS-14235.002.patch, HDFS-14235.003.patch, NPE.png, exception.png > > > The code below throws an exception because {{volumeIterator.next()}} is called twice without checking {{hasNext()}}: > {code:java} > while (volumeIterator.hasNext()) { > FsVolumeSpi volume = volumeIterator.next(); > DataNodeVolumeMetrics metrics = volumeIterator.next().getMetrics(); > String volumeName = volume.getBaseURI().getPath(); > metadataOpStats.put(volumeName, > metrics.getMetadataOperationMean()); > readIoStats.put(volumeName, metrics.getReadIoMean()); > writeIoStats.put(volumeName, metrics.getWriteIoMean()); > }{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14235) Handle ArrayIndexOutOfBoundsException in DataNodeDiskMetrics#slowDiskDetectionDaemon
[ https://issues.apache.org/jira/browse/HDFS-14235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772828#comment-16772828 ] lindongdong commented on HDFS-14235: Why the changes? The first patch was perfect :D -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14235) Handle ArrayIndexOutOfBoundsException in DataNodeDiskMetrics#slowDiskDetectionDaemon
[ https://issues.apache.org/jira/browse/HDFS-14235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772864#comment-16772864 ] Surendra Singh Lilhore commented on HDFS-14235: --- [~lindongdong], the v1 patch calls sleep() twice for the same sleep interval; it can be handled with one simple if check, without creating a sleep() method. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
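The fix discussed above reduces to calling {{next()}} exactly once per {{hasNext()}} check and reading the metrics from that same element. A minimal, self-contained sketch (the Volume class and readIoMean field are stand-ins for FsVolumeSpi/DataNodeVolumeMetrics, not the actual HDFS types):

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SlowDiskSketch {
    // Stand-in for FsVolumeSpi: each volume knows its name and one metric.
    static final class Volume {
        final String name;
        final double readIoMean;
        Volume(String name, double readIoMean) {
            this.name = name;
            this.readIoMean = readIoMean;
        }
    }

    // Corrected loop shape: next() is called exactly once per hasNext() check,
    // and the metric is read from the SAME volume object. With the original
    // double-next() pattern, an odd-sized volume list would run past the end
    // of the iterator, and metrics would come from the *following* volume.
    static Map<String, Double> readIoStats(List<Volume> volumes) {
        Map<String, Double> stats = new LinkedHashMap<>();
        Iterator<Volume> it = volumes.iterator();
        while (it.hasNext()) {
            Volume volume = it.next();                 // single next() per iteration
            stats.put(volume.name, volume.readIoMean); // not it.next().readIoMean
        }
        return stats;
    }

    public static void main(String[] args) {
        List<Volume> vols = Arrays.asList(
            new Volume("/data1", 1.5), new Volume("/data2", 2.5), new Volume("/data3", 3.5));
        System.out.println(readIoStats(vols));
    }
}
```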
[jira] [Updated] (HDFS-14259) RBF: Fix safemode message for Router
[ https://issues.apache.org/jira/browse/HDFS-14259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ranith Sardar updated HDFS-14259: - Attachment: HDFS-14259-HDFS-13891.001.patch > RBF: Fix safemode message for Router > > > Key: HDFS-14259 > URL: https://issues.apache.org/jira/browse/HDFS-14259 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Ranith Sardar >Priority: Major > Attachments: HDFS-14259-HDFS-13891.000.patch, > HDFS-14259-HDFS-13891.001.patch > > > Currently, the {{getSafemode()}} bean checks the state of the Router but > reports "Safe mode is ON" when the status is different from SAFEMODE: > {code} > public String getSafemode() { > if (!getRouter().isRouterState(RouterServiceState.SAFEMODE)) { > return "Safe mode is ON. " + this.getSafeModeTip(); > } > } catch (IOException e) { > return "Failed to get safemode status. Please check router" > + "log for more detail."; > } > return ""; > } > {code} > The condition should be reversed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
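The intended behavior can be sketched with the condition un-negated (the enum value and return strings follow the quoted snippet, but the class and method shape are illustrative, not the actual Router MBean; note also that the quoted error string concatenates "router" and "log" without a space):

```java
public class SafemodeMessageSketch {
    enum RouterServiceState { SAFEMODE, RUNNING }

    // Report "Safe mode is ON" only when the Router actually IS in safe mode;
    // the quoted snippet had this check negated, so a running Router claimed
    // to be in safe mode and a safe-mode Router returned "".
    static String getSafemode(RouterServiceState state) {
        if (state == RouterServiceState.SAFEMODE) {
            return "Safe mode is ON.";
        }
        return "";
    }

    public static void main(String[] args) {
        System.out.println(getSafemode(RouterServiceState.SAFEMODE)); // Safe mode is ON.
        System.out.println(getSafemode(RouterServiceState.RUNNING));  // (empty line)
    }
}
```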
[jira] [Commented] (HDFS-14259) RBF: Fix safemode message for Router
[ https://issues.apache.org/jira/browse/HDFS-14259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772868#comment-16772868 ] Ranith Sardar commented on HDFS-14259: -- [~elgoiri], Attached the patch. Please review it once. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14254) RBF: Getfacl gives a wrong acl entries when the order of the mount table set to HASH_ALL or RANDOM
[ https://issues.apache.org/jira/browse/HDFS-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ranith Sardar updated HDFS-14254: - Attachment: HDFS-14254-HDFS-13891.002.patch > RBF: Getfacl gives a wrong acl entries when the order of the mount table set > to HASH_ALL or RANDOM > -- > > Key: HDFS-14254 > URL: https://issues.apache.org/jira/browse/HDFS-14254 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Shubham Dewan >Assignee: Ranith Sardar >Priority: Major > Attachments: HDFS-14254-HDFS-13891.000.patch, > HDFS-14254-HDFS-13891.001.patch, HDFS-14254-HDFS-13891.002.patch > > > ACL entries are missing when Order is set to HASH_ALL or RANDOM -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14254) RBF: Getfacl gives a wrong acl entries when the order of the mount table set to HASH_ALL or RANDOM
[ https://issues.apache.org/jira/browse/HDFS-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772901#comment-16772901 ] Ranith Sardar commented on HDFS-14254: -- Added a new patch for the UT failure. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1135) Ozone jars are missing in the Ozone Snapshot tar
[ https://issues.apache.org/jira/browse/HDDS-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772913#comment-16772913 ] Elek, Marton commented on HDDS-1135: +1. Thanks [~dineshchitlangia] for the quick fix. I extracted the new tar file and started ozone with docker; it worked well. > Ozone jars are missing in the Ozone Snapshot tar > > > Key: HDDS-1135 > URL: https://issues.apache.org/jira/browse/HDDS-1135 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Affects Versions: 0.4.0 >Reporter: Shashikant Banerjee >Assignee: Dinesh Chitlangia >Priority: Major > Fix For: 0.4.0 > > Attachments: HDDS-1135.00.patch > > > After executing an ozone dist build, the library jars are missing from the > created tar file. > The problem is on the maven side: the tar file creation is called before the > jar copies. > {code:java} > cd hadoop-ozone/dist > mvn clean package | grep "\-\-\-"{code} > {code:java} > [INFO] < org.apache.hadoop:hadoop-ozone-dist > >- > [INFO] [ pom > ]- > [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hadoop-ozone-dist > --- > [INFO] --- maven-antrun-plugin:1.7:run (create-testdirs) @ hadoop-ozone-dist > --- > [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ > hadoop-ozone-dist --- > [INFO] --- exec-maven-plugin:1.3.1:exec (dist) @ hadoop-ozone-dist --- > [INFO] --- exec-maven-plugin:1.3.1:exec (tar-ozone) @ hadoop-ozone-dist --- > [INFO] --- maven-site-plugin:3.6:attach-descriptor (attach-descriptor) @ > hadoop-ozone-dist --- > [INFO] --- maven-dependency-plugin:3.0.2:build-classpath > (add-classpath-descriptor) @ hadoop-ozone-dist --- > [INFO] --- maven-dependency-plugin:3.0.2:copy (copy-classpath-files) @ > hadoop-ozone-dist --- > [INFO] --- maven-dependency-plugin:3.0.2:copy-dependencies (copy-jars) @ > hadoop-ozone-dist --- > [INFO] --- maven-jar-plugin:2.5:test-jar (default) @ hadoop-ozone-dist > ---{code} > The right order of the plugin executions is: > * Call 'dist' (dist-layout-stitching; it cleans the destination directory) > * Copy the jar files (copy-classpath-files, copy-jars) > * Create the tar package (tar-ozone) > It could be done by adjusting the maven phases in the pom.xml. > I would suggest moving 'dist' to the 'compile' phase, moving > 'copy-classpath-files' and 'copy-jars' to the 'prepare-package' phase, and > keeping 'tar-ozone' at the 'package' phase. > With this setup we can be sure that the steps are executed in the right order. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
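The suggested ordering can be sketched as pom.xml phase bindings. This is a hedged fragment, not the actual hadoop-ozone/dist pom: the execution ids match the Maven log quoted above, but the surrounding plugin declarations are elided and the real pom may structure them differently.

```xml
<!-- exec-maven-plugin: run the dist layout stitching early, in 'compile' -->
<execution>
  <id>dist</id>
  <phase>compile</phase>
</execution>

<!-- maven-dependency-plugin: copy the jars before packaging -->
<execution>
  <id>copy-classpath-files</id>
  <phase>prepare-package</phase>
</execution>
<execution>
  <id>copy-jars</id>
  <phase>prepare-package</phase>
</execution>

<!-- exec-maven-plugin: build the tar last, in 'package' -->
<execution>
  <id>tar-ozone</id>
  <phase>package</phase>
</execution>
```

Since Maven runs a phase's executions in declaration order, binding the copies to 'prepare-package' guarantees they finish before anything bound to 'package'.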
[jira] [Updated] (HDDS-1135) Ozone jars are missing in the Ozone Snapshot tar
[ https://issues.apache.org/jira/browse/HDDS-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elek, Marton updated HDDS-1135: --- Resolution: Fixed Status: Resolved (was: Patch Available) > Ozone jars are missing in the Ozone Snapshot tar > > > Key: HDDS-1135 > URL: https://issues.apache.org/jira/browse/HDDS-1135 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Affects Versions: 0.4.0 >Reporter: Shashikant Banerjee >Assignee: Dinesh Chitlangia >Priority: Major > Fix For: 0.4.0 > > Attachments: HDDS-1135.00.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14235) Handle ArrayIndexOutOfBoundsException in DataNodeDiskMetrics#slowDiskDetectionDaemon
[ https://issues.apache.org/jira/browse/HDFS-14235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772923#comment-16772923 ] Surendra Singh Lilhore commented on HDFS-14235: --- +1 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13972) RBF: Support for Delegation Token (WebHDFS)
[ https://issues.apache.org/jira/browse/HDFS-13972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772924#comment-16772924 ] Hadoop QA commented on HDFS-13972: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} HDFS-13891 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 1s{color} | {color:green} HDFS-13891 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 57s{color} | {color:green} HDFS-13891 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 59s{color} | {color:green} HDFS-13891 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 26s{color} | {color:green} HDFS-13891 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 51s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 42s{color} | {color:green} HDFS-13891 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s{color} | {color:green} HDFS-13891 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 57s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 56s{color} | {color:orange} hadoop-hdfs-project: The patch generated 2 new + 132 unchanged - 0 fixed = 134 total (was 132) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 26s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 1s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m 32s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}159m 46s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.qjournal.server.TestJournalNodeSync | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-13972 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12959406/HDFS-13972-HDFS-13891.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 737018ce8229 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 G
[jira] [Updated] (HDDS-919) Enable prometheus endpoints for Ozone datanodes
[ https://issues.apache.org/jira/browse/HDDS-919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elek, Marton updated HDDS-919: -- Status: Patch Available (was: Open) > Enable prometheus endpoints for Ozone datanodes > --- > > Key: HDDS-919 > URL: https://issues.apache.org/jira/browse/HDDS-919 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Labels: newbie, pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HDDS-846 provides a new metric endpoint which publishes the available Hadoop > metrics in prometheus friendly format with a new servlet. > Unfortunately it's enabled only on the scm/om side. It would be great to > enable it in the Ozone/HDDS datanodes on the web server of the HDDS Rest > endpoint. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-1124) java.lang.IllegalStateException exception in datanode log
[ https://issues.apache.org/jira/browse/HDDS-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee reassigned HDDS-1124: - Assignee: Shashikant Banerjee > java.lang.IllegalStateException exception in datanode log > - > > Key: HDDS-1124 > URL: https://issues.apache.org/jira/browse/HDDS-1124 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Nilotpal Nandi >Assignee: Shashikant Banerjee >Priority: Major > > steps taken : > > # created 12 datanodes cluster and running workload on all the nodes > exception seen : > --- > > {noformat} > 2019-02-15 10:15:53,355 INFO org.apache.ratis.server.storage.RaftLogWorker: > 943007c8-4fdd-4926-89e2-2c8c52c05073-RaftLogWorker: Rolled log segment from > /data/disk1/ozone/meta/ratis/01d3ef2a-912c-4fc0-80b6-012343d76adb/current/log_inprogress_3036 > to > /data/disk1/ozone/meta/ratis/01d3ef2a-912c-4fc0-80b6-012343d76adb/current/log_3036-3047 > 2019-02-15 10:15:53,367 INFO org.apache.ratis.server.impl.RaftServerImpl: > 943007c8-4fdd-4926-89e2-2c8c52c05073: set configuration 3048: > [a40a7b01-a30b-469c-b373-9fcb20a126ed:172.27.54.212:9858, > 8c77b16b-8054-49e3-b669-1ff759cfd271:172.27.23.196:9858, > 943007c8-4fdd-4926-89e2-2c8c52c05073:172.27.76.72:9858], old=null at 3048 > 2019-02-15 10:15:53,523 INFO org.apache.ratis.server.storage.RaftLogWorker: > 943007c8-4fdd-4926-89e2-2c8c52c05073-RaftLogWorker: created new log segment > /data/disk1/ozone/meta/ratis/01d3ef2a-912c-4fc0-80b6-012343d76adb/current/log_inprogress_3048 > 2019-02-15 10:15:53,580 ERROR org.apache.ratis.grpc.server.GrpcLogAppender: > Failed onNext serverReply { > requestorId: "943007c8-4fdd-4926-89e2-2c8c52c05073" > replyId: "a40a7b01-a30b-469c-b373-9fcb20a126ed" > raftGroupId { > id: "\001\323\357*\221,O\300\200\266\001#C\327j\333" > } > success: true > } > term: 3 > nextIndex: 3049 > followerCommit: 3047 > java.lang.IllegalStateException: reply's next index is 3049, request's > previous is term: 1 > index: 3047 > at 
org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:60) > at > org.apache.ratis.grpc.server.GrpcLogAppender.onSuccess(GrpcLogAppender.java:285) > at > org.apache.ratis.grpc.server.GrpcLogAppender$AppendLogResponseHandler.onNextImpl(GrpcLogAppender.java:230) > at > org.apache.ratis.grpc.server.GrpcLogAppender$AppendLogResponseHandler.onNext(GrpcLogAppender.java:215) > at > org.apache.ratis.grpc.server.GrpcLogAppender$AppendLogResponseHandler.onNext(GrpcLogAppender.java:197) > at > org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:421) > at > org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33) > at > org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33) > at > org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:519) > at > org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) > at > org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 2019-02-15 10:15:56,442 INFO org.apache.ratis.server.storage.RaftLogWorker: > 943007c8-4fdd-4926-89e2-2c8c52c05073-RaftLogWorker: Rolling segment > log-3048_3066 to index:3066 > 2019-02-15 10:15:56,442 INFO org.apache.ratis.server.storage.RaftLogWorker: > 943007c8-4fdd-4926-89e2-2c8c52c05073-RaftLogWorker: Rolled log segment from > /data/disk1/ozone/meta/ratis/01d3ef2a-912c-4fc0-80b6-012343d76adb/current/log_inprogress_3048 > to > /data/disk1/ozone/meta/ratis/01d3ef2a-912c-4fc0-80b6-012343d76adb/current/log_3048-3066 > 2019-02-15 10:15:56,564 INFO 
org.apache.ratis.server.storage.RaftLogWorker: > 943007c8-4fdd-4926-89e2-2c8c52c05073-RaftLogWorker: created new log segment > /data/disk1/ozone/meta/ratis/01d3ef2a-912c-4fc0-80b6-012343d76adb/current/log_inprogress_3067 > 2019-02-15 10:16:45,420 INFO org.apache.ratis.server.storage.RaftLogWorker: > 943007c8-4fdd-4926-89e2-2c8c52c05073-RaftLogWorker: Rolling segment > log-3067_3077 to index:3077 > {noformat} > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1116) Add java profiler servlet to the Ozone web servers
[ https://issues.apache.org/jira/browse/HDDS-1116?focusedWorklogId=201239&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201239 ] ASF GitHub Bot logged work on HDDS-1116: Author: ASF GitHub Bot Created on: 20/Feb/19 11:42 Start Date: 20/Feb/19 11:42 Worklog Time Spent: 10m Work Description: elek commented on pull request #491: HDDS-1116. Add java profiler servlet to the Ozone web servers URL: https://github.com/apache/hadoop/pull/491 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201239) Time Spent: 50m (was: 40m) > Add java profiler servlet to the Ozone web servers > -- > > Key: HDDS-1116 > URL: https://issues.apache.org/jira/browse/HDDS-1116 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Labels: pull-request-available > Fix For: 0.4.0 > > Time Spent: 50m > Remaining Estimate: 0h > > Thanks to [~gopalv] we learned that [~prasanth_j] implemented a helper > servlet in Hive to initialize new [async > profiler|https://github.com/jvm-profiling-tools/async-profiler] sessions and > provide the svg based flame graph over HTTP (see HIVE-20202). > It seems to be very useful, as with this approach profiling becomes very > easy. > This patch imports the servlet from the Hive code base to the Ozone code base > with minor modifications (to make it work with our servlet containers): > * The two servlets are unified into one > * The svg is streamed to the browser via IOUtils.copy > * The output message is improved > By default the profile servlet is turned off, but you can enable it with > the 'hdds.profiler.endpoint.enabled=true' setting in ozone-site.xml. In that case > you can access the /prof endpoint from scm, om and s3g. 
> You should install the async profiler first > (https://github.com/jvm-profiling-tools/async-profiler) and set the > ASYNC_PROFILER_HOME environment variable so the servlet can find it. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
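Based on the description above, enabling the endpoint is a single ozone-site.xml property. This fragment is a sketch: the property name and default come from the text above, and the surrounding layout is standard Hadoop configuration:

```xml
<!-- ozone-site.xml fragment (sketch). Enables the /prof endpoint on
     scm/om/s3g as described above; the servlet is off by default.
     ASYNC_PROFILER_HOME must additionally point at an async-profiler
     installation on each host. -->
<configuration>
  <property>
    <name>hdds.profiler.endpoint.enabled</name>
    <value>true</value>
  </property>
</configuration>
```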
[jira] [Work logged] (HDDS-1116) Add java profiler servlet to the Ozone web servers
[ https://issues.apache.org/jira/browse/HDDS-1116?focusedWorklogId=201238&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201238 ] ASF GitHub Bot logged work on HDDS-1116: Author: ASF GitHub Bot Created on: 20/Feb/19 11:42 Start Date: 20/Feb/19 11:42 Worklog Time Spent: 10m Work Description: elek commented on pull request #491: HDDS-1116. Add java profiler servlet to the Ozone web servers URL: https://github.com/apache/hadoop/pull/491#discussion_r258444609 ## File path: hadoop-ozone/dist/src/main/compose/ozone/docker-compose.yaml ## @@ -18,6 +18,7 @@ version: "3" services: datanode: image: apache/hadoop-runner + privileged: true #required by the profiler Review comment: Thanks for the comment. I think we can assume that the kernel parameters are adjusted. I will test it without the privileged flag and remove those lines. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201238) Time Spent: 40m (was: 0.5h) > Add java profiler servlet to the Ozone web servers > -- > > Key: HDDS-1116 > URL: https://issues.apache.org/jira/browse/HDDS-1116 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Labels: pull-request-available > Fix For: 0.4.0 > > Time Spent: 40m > Remaining Estimate: 0h > > Thanks to [~gopalv] we learned that [~prasanth_j] implemented a helper > servlet in Hive to initialize new [async > profiler|https://github.com/jvm-profiling-tools/async-profiler] sessions and > provide the svg based flame graph over HTTP (see HIVE-20202). > It seems to be very useful, as with this approach profiling becomes very > easy. 
> This patch imports the servlet from the Hive code base to the Ozone code base > with minor modifications (to make it work with our servlet containers): > * The two servlets are unified into one > * The svg is streamed to the browser via IOUtils.copy > * The output message is improved > By default the profile servlet is turned off, but you can enable it with > the 'hdds.profiler.endpoint.enabled=true' setting in ozone-site.xml. In that case > you can access the /prof endpoint from scm, om and s3g. > You should install the async profiler first > (https://github.com/jvm-profiling-tools/async-profiler) and set the > ASYNC_PROFILER_HOME environment variable so the servlet can find it. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14259) RBF: Fix safemode message for Router
[ https://issues.apache.org/jira/browse/HDFS-14259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772938#comment-16772938 ] Hadoop QA commented on HDFS-14259: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} HDFS-13891 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 36s{color} | {color:green} HDFS-13891 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green} HDFS-13891 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s{color} | {color:green} HDFS-13891 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} HDFS-13891 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 53s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 53s{color} | {color:green} HDFS-13891 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s{color} | {color:green} HDFS-13891 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 29s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 24m 27s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 38s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 79m 15s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14259 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12959418/HDFS-14259-HDFS-13891.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 4c7e662628f7 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | HDFS-13891 / f94b6e3 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/26268/artifact/out/whitespace-eol.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/26268/testReport/ | | Max. process+thread count | 975 (vs. ulimit of 1) | | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/26268/console | | Powered by | Apache
[jira] [Updated] (HDFS-14235) Handle ArrayIndexOutOfBoundsException in DataNodeDiskMetrics#slowDiskDetectionDaemon
[ https://issues.apache.org/jira/browse/HDFS-14235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Surendra Singh Lilhore updated HDFS-14235: -- Resolution: Fixed Fix Version/s: 3.2.1 3.3.0 Status: Resolved (was: Patch Available) Thanks [~RANith] for the contribution. Committed to branch-3.2, trunk. > Handle ArrayIndexOutOfBoundsException in > DataNodeDiskMetrics#slowDiskDetectionDaemon > - > > Key: HDFS-14235 > URL: https://issues.apache.org/jira/browse/HDFS-14235 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Surendra Singh Lilhore >Assignee: Ranith Sardar >Priority: Major > Fix For: 3.3.0, 3.2.1 > > Attachments: HDFS-14235.000.patch, HDFS-14235.001.patch, > HDFS-14235.002.patch, HDFS-14235.003.patch, NPE.png, exception.png > > > the code below throws an exception because {{volumeIterator.next()}} is called > twice without checking hasNext(). > {code:java} > while (volumeIterator.hasNext()) { > FsVolumeSpi volume = volumeIterator.next(); > DataNodeVolumeMetrics metrics = volumeIterator.next().getMetrics(); > String volumeName = volume.getBaseURI().getPath(); > metadataOpStats.put(volumeName, > metrics.getMetadataOperationMean()); > readIoStats.put(volumeName, metrics.getReadIoMean()); > writeIoStats.put(volumeName, metrics.getWriteIoMean()); > }{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
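One way to avoid the double next() call shown in the quoted loop is to call next() exactly once per iteration and reuse the element. A minimal, self-contained sketch: VolumeStub is a hypothetical stand-in for the real FsVolumeSpi/DataNodeVolumeMetrics types, not the HDFS API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for FsVolumeSpi + DataNodeVolumeMetrics, for
// illustration only.
class VolumeStub {
    private final String path;
    private final double readIoMean;

    VolumeStub(String path, double readIoMean) {
        this.path = path;
        this.readIoMean = readIoMean;
    }

    String getPath() { return path; }
    double getReadIoMean() { return readIoMean; }
}

public class SingleNextDemo {
    // Call next() exactly once per iteration and reuse the element. The
    // buggy version called next() twice per hasNext() check, which skips
    // every other volume and runs past the end of the iterator when the
    // number of remaining elements is odd.
    static Map<String, Double> collectReadIoStats(List<VolumeStub> volumes) {
        Map<String, Double> readIoStats = new HashMap<>();
        Iterator<VolumeStub> it = volumes.iterator();
        while (it.hasNext()) {
            VolumeStub volume = it.next();   // the single next() call
            readIoStats.put(volume.getPath(), volume.getReadIoMean());
        }
        return readIoStats;
    }

    public static void main(String[] args) {
        List<VolumeStub> volumes = Arrays.asList(
            new VolumeStub("/data/disk1", 1.5),
            new VolumeStub("/data/disk2", 2.5),
            new VolumeStub("/data/disk3", 3.5));
        // All three volumes are recorded; none are skipped.
        System.out.println(collectReadIoStats(volumes).size());
    }
}
```

With an odd-length list, the original two-next() pattern would throw from the second next() on the last element, which matches the out-of-bounds symptom in the issue title.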
[jira] [Assigned] (HDFS-14216) NullPointerException happens in NamenodeWebHdfs
[ https://issues.apache.org/jira/browse/HDFS-14216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Surendra Singh Lilhore reassigned HDFS-14216: - Assignee: lujie > NullPointerException happens in NamenodeWebHdfs > --- > > Key: HDFS-14216 > URL: https://issues.apache.org/jira/browse/HDFS-14216 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Critical > Attachments: HDFS-14216_1.patch, HDFS-14216_2.patch, > HDFS-14216_3.patch, HDFS-14216_4.patch, hadoop-hires-namenode-hadoop11.log > > > workload > {code:java} > curl -i -X PUT -T $HOMEPARH/test.txt > "http://hadoop1:9870/webhdfs/v1/input?op=CREATE&excludedatanodes=hadoop2"; > {code} > the method > {code:java} > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.chooseDatanode(String > excludeDatanodes){ > HashSet excludes = new HashSet(); > if (excludeDatanodes != null) { >for (String host : StringUtils > .getTrimmedStringCollection(excludeDatanodes)) { > int idx = host.indexOf(":"); >if (idx != -1) { > excludes.add(bm.getDatanodeManager().getDatanodeByXferAddr( >host.substring(0, idx), Integer.parseInt(host.substring(idx + > 1))));
>} else { > > excludes.add(bm.getDatanodeManager().getDatanodeByHost(host));//line280 >} > } > } > } > {code} > when a datanode (e.g. hadoop2) is {color:#d04437}just wiped before > line 280{color}, or we give a wrong DN > name, then bm.getDatanodeManager().getDatanodeByHost(host) will > return null, and *_excludes_* *contains null*. 
while *_excludes_* are used > later, NPE happens: > {code:java} > java.lang.NullPointerException > at org.apache.hadoop.net.NodeBase.getPath(NodeBase.java:113) > at > org.apache.hadoop.net.NetworkTopology.countNumOfAvailableNodes(NetworkTopology.java:672) > at > org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:533) > at > org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:491) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.chooseDatanode(NamenodeWebHdfsMethods.java:323) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.redirectURI(NamenodeWebHdfsMethods.java:384) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.put(NamenodeWebHdfsMethods.java:652) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods$2.run(NamenodeWebHdfsMethods.java:600) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods$2.run(NamenodeWebHdfsMethods.java:597) > at org.apache.hadoop.ipc.ExternalCall.run(ExternalCall.java:73) > at org.apache.hadoop.ipc.ExternalCall.run(ExternalCall.java:30) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2830) > {code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14216) NullPointerException happens in NamenodeWebHdfs
[ https://issues.apache.org/jira/browse/HDFS-14216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772962#comment-16772962 ] Surendra Singh Lilhore commented on HDFS-14216: --- Added [~xiaoheipangzi] as a contributor. > NullPointerException happens in NamenodeWebHdfs > --- > > Key: HDFS-14216 > URL: https://issues.apache.org/jira/browse/HDFS-14216 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Critical > Attachments: HDFS-14216_1.patch, HDFS-14216_2.patch, > HDFS-14216_3.patch, HDFS-14216_4.patch, hadoop-hires-namenode-hadoop11.log > > > workload > {code:java} > curl -i -X PUT -T $HOMEPARH/test.txt > "http://hadoop1:9870/webhdfs/v1/input?op=CREATE&excludedatanodes=hadoop2"; > {code} > the method > {code:java} > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.chooseDatanode(String > excludeDatanodes){ > HashSet excludes = new HashSet(); > if (excludeDatanodes != null) { >for (String host : StringUtils > .getTrimmedStringCollection(excludeDatanodes)) { > int idx = host.indexOf(":"); >if (idx != -1) { > excludes.add(bm.getDatanodeManager().getDatanodeByXferAddr( >host.substring(0, idx), Integer.parseInt(host.substring(idx + > 1))));
>} else { > > excludes.add(bm.getDatanodeManager().getDatanodeByHost(host));//line280 >} > } > } > } > {code} > when a datanode (e.g. hadoop2) is {color:#d04437}just wiped before > line 280{color}, or we give a wrong DN > name, then bm.getDatanodeManager().getDatanodeByHost(host) will > return null, and *_excludes_* *contains null*. 
while *_excludes_* are used > later, NPE happens: > {code:java} > java.lang.NullPointerException > at org.apache.hadoop.net.NodeBase.getPath(NodeBase.java:113) > at > org.apache.hadoop.net.NetworkTopology.countNumOfAvailableNodes(NetworkTopology.java:672) > at > org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:533) > at > org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:491) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.chooseDatanode(NamenodeWebHdfsMethods.java:323) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.redirectURI(NamenodeWebHdfsMethods.java:384) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.put(NamenodeWebHdfsMethods.java:652) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods$2.run(NamenodeWebHdfsMethods.java:600) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods$2.run(NamenodeWebHdfsMethods.java:597) > at org.apache.hadoop.ipc.ExternalCall.run(ExternalCall.java:73) > at org.apache.hadoop.ipc.ExternalCall.run(ExternalCall.java:30) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2830) > {code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14216) NullPointerException happens in NamenodeWebHdfs
[ https://issues.apache.org/jira/browse/HDFS-14216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772975#comment-16772975 ] Surendra Singh Lilhore commented on HDFS-14216: --- Thanks [~xiaoheipangzi] for the patch. {quote} * I think the log should be at a INFO rather than an ERROR, given that it can happen under normal circumstances and the specified DataNode will still not be considered (so it is still, in a way, excluded). More explanation would probably be nice as well, something like "DataNode {} was requested to be excluded, but it was not found." (please use slf4j style statement){quote} I feel this log itself is not required. Why fill the namenode log file because of a client's wrong input? > NullPointerException happens in NamenodeWebHdfs > --- > > Key: HDFS-14216 > URL: https://issues.apache.org/jira/browse/HDFS-14216 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Critical > Attachments: HDFS-14216_1.patch, HDFS-14216_2.patch, > HDFS-14216_3.patch, HDFS-14216_4.patch, hadoop-hires-namenode-hadoop11.log > > > workload > {code:java} > curl -i -X PUT -T $HOMEPARH/test.txt > "http://hadoop1:9870/webhdfs/v1/input?op=CREATE&excludedatanodes=hadoop2"; > {code} > the method > {code:java} > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.chooseDatanode(String > excludeDatanodes){ > HashSet excludes = new HashSet(); > if (excludeDatanodes != null) { >for (String host : StringUtils > .getTrimmedStringCollection(excludeDatanodes)) { > int idx = host.indexOf(":"); >if (idx != -1) { > excludes.add(bm.getDatanodeManager().getDatanodeByXferAddr( >host.substring(0, idx), Integer.parseInt(host.substring(idx + > 1))));
>} else { > > excludes.add(bm.getDatanodeManager().getDatanodeByHost(host));//line280 >} > } > } > } > {code} > when a datanode (e.g. hadoop2) is {color:#d04437}just wiped before > line 280{color}, or we give a wrong DN > name, then bm.getDatanodeManager().getDatanodeByHost(host) will > return null, and *_excludes_* *contains null*. while *_excludes_* are used > later, NPE happens: > {code:java} > java.lang.NullPointerException > at org.apache.hadoop.net.NodeBase.getPath(NodeBase.java:113) > at > org.apache.hadoop.net.NetworkTopology.countNumOfAvailableNodes(NetworkTopology.java:672) > at > org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:533) > at > org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:491) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.chooseDatanode(NamenodeWebHdfsMethods.java:323) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.redirectURI(NamenodeWebHdfsMethods.java:384) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.put(NamenodeWebHdfsMethods.java:652) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods$2.run(NamenodeWebHdfsMethods.java:600) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods$2.run(NamenodeWebHdfsMethods.java:597) > at org.apache.hadoop.ipc.ExternalCall.run(ExternalCall.java:73) > at org.apache.hadoop.ipc.ExternalCall.run(ExternalCall.java:30) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2830) > {code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
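The NPE discussed in this thread comes down to adding a null lookup result to the exclude set. A minimal, self-contained sketch of the null-skipping fix: lookupByHost is a hypothetical stand-in for DatanodeManager#getDatanodeByHost, and the String return type replaces the real DatanodeDescriptor for illustration only.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

public class ExcludeListDemo {
    // Build the exclude set, skipping hosts whose lookup returns null (a
    // just-removed or mistyped datanode). The exclude set then never
    // contains null, so later topology code cannot hit the NPE.
    static Set<String> buildExcludes(String excludeDatanodes,
                                     Function<String, String> lookupByHost) {
        Set<String> excludes = new HashSet<>();
        if (excludeDatanodes != null) {
            for (String host : excludeDatanodes.split(",")) {
                String node = lookupByHost.apply(host.trim());
                if (node != null) {          // the fix: never add null
                    excludes.add(node);
                }
            }
        }
        return excludes;
    }

    public static void main(String[] args) {
        // hadoop2 is unknown here, simulating the wiped/mistyped datanode.
        Map<String, String> registered = new HashMap<>();
        registered.put("hadoop1", "dn-hadoop1");
        System.out.println(buildExcludes("hadoop1,hadoop2", registered::get));
    }
}
```

An unknown host simply drops out of the exclude list, which matches the reviewer's observation that the node is "still, in a way, excluded" without any extra logging being strictly necessary.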
[jira] [Commented] (HDDS-1135) Ozone jars are missing in the Ozone Snapshot tar
[ https://issues.apache.org/jira/browse/HDDS-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772978#comment-16772978 ] Hudson commented on HDDS-1135: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16005 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/16005/]) HDDS-1135. Ozone jars are missing in the Ozone Snapshot tar. Contributed (elek: rev 642fe6a2604c107070476b45aeab6cce09dfef1f) * (edit) hadoop-ozone/dist/pom.xml > Ozone jars are missing in the Ozone Snapshot tar > > > Key: HDDS-1135 > URL: https://issues.apache.org/jira/browse/HDDS-1135 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Affects Versions: 0.4.0 >Reporter: Shashikant Banerjee >Assignee: Dinesh Chitlangia >Priority: Major > Fix For: 0.4.0 > > Attachments: HDDS-1135.00.patch > > > After executing an ozone dist build the library jars are missing from the > created tar file. > The problem is on the maven side. The tar file creation is > called before the jars are copied. 
> {code:java} > cd hadoop-ozone/dist > mvn clean package | grep "\-\-\-"{code} > {code:java} > [INFO] < org.apache.hadoop:hadoop-ozone-dist > >- > [INFO] [ pom > ]- > [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hadoop-ozone-dist > --- > [INFO] --- maven-antrun-plugin:1.7:run (create-testdirs) @ hadoop-ozone-dist > --- > [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ > hadoop-ozone-dist --- > [INFO] --- exec-maven-plugin:1.3.1:exec (dist) @ hadoop-ozone-dist --- > [INFO] --- exec-maven-plugin:1.3.1:exec (tar-ozone) @ hadoop-ozone-dist --- > [INFO] --- maven-site-plugin:3.6:attach-descriptor (attach-descriptor) @ > hadoop-ozone-dist --- > [INFO] --- maven-dependency-plugin:3.0.2:build-classpath > (add-classpath-descriptor) @ hadoop-ozone-dist --- > [INFO] --- maven-dependency-plugin:3.0.2:copy (copy-classpath-files) @ > hadoop-ozone-dist --- > [INFO] --- maven-dependency-plugin:3.0.2:copy-dependencies (copy-jars) @ > hadoop-ozone-dist --- > [INFO] --- maven-jar-plugin:2.5:test-jar (default) @ hadoop-ozone-dist > ---{code} > The right order of the plugin executions is: > * Call 'dist' (dist-layout-stitching, it cleans the destination directory) > * Copy the jar files (copy-classpath-files, copy-jars) > * Create the tar package (tar-ozone) > It can be done by adjusting the maven phases in the pom.xml. > I would suggest moving 'dist' to the 'compile' phase, moving > 'copy-classpath-files' and 'copy-jars' to the 'prepare-package' phase, and > keeping 'tar-ozone' at the 'package' phase. > With this setup we can be sure that the steps are executed in the right order. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
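The phase rebinding proposed above could look roughly like this in hadoop-ozone/dist/pom.xml. This is a sketch only: the execution ids match those shown in the build output, but all other configuration (commands, artifact items) is omitted and the actual plugin blocks carry more settings.

```xml
<!-- Sketch of the proposed phase bindings for hadoop-ozone-dist.
     Only the <phase> elements matter here; everything else is elided. -->
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <executions>
    <execution>
      <id>dist</id>
      <phase>compile</phase>          <!-- layout stitching runs first -->
    </execution>
    <execution>
      <id>tar-ozone</id>
      <phase>package</phase>          <!-- tar is created last -->
    </execution>
  </executions>
</plugin>
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <executions>
    <execution>
      <id>copy-classpath-files</id>
      <phase>prepare-package</phase>  <!-- jars land before packaging -->
    </execution>
    <execution>
      <id>copy-jars</id>
      <phase>prepare-package</phase>
    </execution>
  </executions>
</plugin>
```

Because Maven runs compile before prepare-package and prepare-package before package within the default lifecycle, this ordering guarantees the jars are in place before the tar is built.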
[jira] [Created] (HDDS-1142) Inconsistency in Ozone log rollovers
Shashikant Banerjee created HDDS-1142: - Summary: Inconsistency in Ozone log rollovers Key: HDDS-1142 URL: https://issues.apache.org/jira/browse/HDDS-1142 Project: Hadoop Distributed Data Store Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Shashikant Banerjee Fix For: 0.4.0 HW15685:nodes-ozone-logs-1550481382 sbanerjee$ head ozone-log-1550481382-172.27.76.72/root/hadoop_trunk/ozone-0.4.0-SNAPSHOT/logs/ozone.log.2019-02-15 2019-02-17 00:00:17,817 [Datanode State Machine Thread - 0] DEBUG (DatanodeStateMachine.java:176) - Executing cycle Number : 4634 2019-02-17 00:00:47,819 [Datanode State Machine Thread - 0] DEBUG (DatanodeStateMachine.java:176) - Executing cycle Number : 4635 2019-02-17 00:00:57,662 [Datanode ReportManager Thread - 0] DEBUG (ContainerSet.java:192) - Starting container report iteration. 2019-02-17 00:00:57,887 [BlockDeletingService#7] DEBUG (TopNOrderedContainerDeletionChoosingPolicy.java:79) - Stop looking for next container, there is no pending deletion block contained in remaining containers. The ozone.log.2019-02-15 file contains logs for 2019-02-17. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14235) Handle ArrayIndexOutOfBoundsException in DataNodeDiskMetrics#slowDiskDetectionDaemon
[ https://issues.apache.org/jira/browse/HDFS-14235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772979#comment-16772979 ] Hudson commented on HDFS-14235: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16005 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/16005/]) HDFS-14235. Handle ArrayIndexOutOfBoundsException in (surendralilhore: rev 41e18feda3f5ff924c87c4bed5b5cbbaecb19ae1) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodeDiskMetrics.java > Handle ArrayIndexOutOfBoundsException in > DataNodeDiskMetrics#slowDiskDetectionDaemon > - > > Key: HDFS-14235 > URL: https://issues.apache.org/jira/browse/HDFS-14235 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Surendra Singh Lilhore >Assignee: Ranith Sardar >Priority: Major > Fix For: 3.3.0, 3.2.1 > > Attachments: HDFS-14235.000.patch, HDFS-14235.001.patch, > HDFS-14235.002.patch, HDFS-14235.003.patch, NPE.png, exception.png > > > the code below throws an exception because {{volumeIterator.next()}} is called > twice without checking hasNext(). > {code:java} > while (volumeIterator.hasNext()) { > FsVolumeSpi volume = volumeIterator.next(); > DataNodeVolumeMetrics metrics = volumeIterator.next().getMetrics(); > String volumeName = volume.getBaseURI().getPath(); > metadataOpStats.put(volumeName, > metrics.getMetadataOperationMean()); > readIoStats.put(volumeName, metrics.getReadIoMean()); > writeIoStats.put(volumeName, metrics.getWriteIoMean()); > }{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14298) Improve log messages of ECTopologyVerifier
[ https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14298: Attachment: HDFS-14298.002.patch > Improve log messages of ECTopologyVerifier > -- > > Key: HDFS-14298 > URL: https://issues.apache.org/jira/browse/HDFS-14298 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Minor > Attachments: HDFS-14298.001.patch, HDFS-14298.002.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14298) Improve log messages of ECTopologyVerifier
[ https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772985#comment-16772985 ] Kitti Nanasi commented on HDFS-14298: - Thanks for the review [~shwetayakkali]! TestNameNodeMXBean failed because of the patch v001, so I fixed that in patch v002. The other test failures are not related. > Improve log messages of ECTopologyVerifier > -- > > Key: HDFS-14298 > URL: https://issues.apache.org/jira/browse/HDFS-14298 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Minor > Attachments: HDFS-14298.001.patch, HDFS-14298.002.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14216) NullPointerException happens in NamenodeWebHdfs
[ https://issues.apache.org/jira/browse/HDFS-14216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772975#comment-16772975 ] Surendra Singh Lilhore edited comment on HDFS-14216 at 2/20/19 12:32 PM: - Thanks [~xiaoheipangzi] for the patch. {quote} * I think the log should be at a INFO rather than an ERROR, given that it can happen under normal circumstances and the specified DataNode will still not be considered (so it is still, in a way, excluded). More explanation would probably be nice as well, something like "DataNode {} was requested to be excluded, but it was not found." (please use slf4j style statement){quote} I feel this log itself is not required. Why fill the namenode log file because of a client's wrong input?
> NullPointerException happens in NamenodeWebHdfs > --- > > Key: HDFS-14216 > URL: https://issues.apache.org/jira/browse/HDFS-14216 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Critical > Attachments: HDFS-14216_1.patch, HDFS-14216_2.patch, > HDFS-14216_3.patch, HDFS-14216_4.patch, hadoop-hires-namenode-hadoop11.log > > > workload > {code:java} > curl -i -X PUT -T $HOMEPARH/test.txt > "http://hadoop1:9870/webhdfs/v1/input?op=CREATE&excludedatanodes=hadoop2"; > {code} > the method > {code:java} > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.chooseDatanode(String > excludeDatanodes){ > HashSet excludes = new HashSet(); > if (excludeDatanodes != null) { >for (String host : StringUtils > .getTrimmedStringCollection(excludeDatanodes)) { > int idx = host.indexOf(":"); >if (idx != -1) { > excludes.add(bm.getDatanodeManager().getDatanodeByXferAddr( >host.substring(0, idx), Integer.parseInt(host.substring(idx + 1)))); >} else { > > excludes.add(bm.getDatanodeManager().getDatanodeByHost(host));//line280 >} > } > } > } > {code} > when a datanode (e.g. hadoop2) is {color:#d04437}just wiped before line 280{color}, or we give a wrong DN name, then bm.getDatanodeManager().getDatanodeByHost(host) will return null, so *_excludes_* *contains null*. When *_excludes_* is used > later, an NPE happens: > {code:java} > java.lang.NullPointerException > at org.apache.hadoop.net.NodeBase.getPath(NodeBase.java:113) > at > org.apache.hadoop.net.NetworkTopology.countNumOfAvailableNodes(NetworkTopology.java:672) > at > org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:533) > at > org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:491) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.chooseDatanode(NamenodeWebHdfsMethods.java:323) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.redirectURI(NamenodeWebHdfsMethods.java:384) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.put(NamenodeWebHdfsMethods.java:652) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods$2.run(NamenodeWebHdfsMethods.java:600) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods$2.run(NamenodeWebHdfsMethods.java:597) > at org.apache.hadoop.ipc.ExternalCall.run(ExternalCall.java:73) > at org.apache.hadoop.ipc.ExternalCall.run(ExternalCall.java:30) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2830) > {code} >
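The null-guard fix being discussed can be sketched as follows. This is illustrative, not the actual patch: `lookupDatanode` is a hypothetical stand-in for the DatanodeManager lookup, which returns null for an unknown host; the guard simply skips unresolved hosts so `excludes` never contains null.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ExcludeDemo {
    // Hypothetical stand-in for bm.getDatanodeManager().getDatanodeByHost(host):
    // returns null when the host is not registered, just like the real lookup.
    static String lookupDatanode(Map<String, String> registered, String host) {
        return registered.get(host);
    }

    static Set<String> buildExcludes(Map<String, String> registered, String... hosts) {
        Set<String> excludes = new HashSet<>();
        for (String host : hosts) {
            String dn = lookupDatanode(registered, host);
            if (dn != null) { // guard: skip unknown hosts instead of adding null
                excludes.add(dn);
            }
        }
        return excludes;
    }
}
```

With this guard the later topology code never sees a null entry, so `NodeBase.getPath` cannot NPE, and a wrong exclude name simply has no effect.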
[jira] [Commented] (HDFS-14216) NullPointerException happens in NamenodeWebHdfs
[ https://issues.apache.org/jira/browse/HDFS-14216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772994#comment-16772994 ] lujie commented on HDFS-14216: -- Hi [~surendrasingh], if we don't want to fill the namenode log file, we need to throw an IOException to indicate that the client used a wrong input, because we must give a reason message for debugging. But the exception will prevent "chooseDatanode" from continuing to run, and the user request will fail.
[jira] [Commented] (HDFS-14216) NullPointerException happens in NamenodeWebHdfs
[ https://issues.apache.org/jira/browse/HDFS-14216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772996#comment-16772996 ] Surendra Singh Lilhore commented on HDFS-14216: --- {quote} if we don't want to fill the namenode log file, we need throw a IOException to indicate that the client uses a wrong input, because we must give the reason message for debug. But the exception will prevent the "chooseDatanode" continue to run and the user request fails. {quote} So better to change it to DEBUG. [~xkrogen], what is your opinion?
[jira] [Created] (HDFS-14300) Fix A Typo In WebHDFS Document.
Ayush Saxena created HDFS-14300: --- Summary: Fix A Typo In WebHDFS Document. Key: HDFS-14300 URL: https://issues.apache.org/jira/browse/HDFS-14300 Project: Hadoop HDFS Issue Type: Bug Reporter: Ayush Saxena {noformat} Unset Storage Policy Submit a HTTP POT request.{noformat} POT needs to be changed.
[jira] [Assigned] (HDFS-14300) Fix A Typo In WebHDFS Document.
[ https://issues.apache.org/jira/browse/HDFS-14300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] venkata ramkumar reassigned HDFS-14300: --- Assignee: venkata ramkumar > Fix A Typo In WebHDFS Document. > --- > > Key: HDFS-14300 > URL: https://issues.apache.org/jira/browse/HDFS-14300 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ayush Saxena >Assignee: venkata ramkumar >Priority: Trivial > > {noformat} > Unset Storage Policy > Submit a HTTP POT request.{noformat} > POT needs to be changed.
[jira] [Commented] (HDFS-14300) Fix A Typo In WebHDFS Document.
[ https://issues.apache.org/jira/browse/HDFS-14300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773007#comment-16773007 ] venkata ramkumar commented on HDFS-14300: - Thanks [~ayushtkn] for putting this up. Will upload patch soon. :) > Fix A Typo In WebHDFS Document. > --- > > Key: HDFS-14300 > URL: https://issues.apache.org/jira/browse/HDFS-14300 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ayush Saxena >Assignee: venkata ramkumar >Priority: Trivial > > {noformat} > Unset Storage Policy > Submit a HTTP POT request.{noformat} > POT needs to be changed.
[jira] [Commented] (HDDS-1127) Fix failing and intermittent Ozone unit tests
[ https://issues.apache.org/jira/browse/HDDS-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773010#comment-16773010 ] Steve Loughran commented on HDDS-1127: -- If you look at the hadoop-aws and hadoop-azure test suites, we run them in parallel through some javascript magic in the pom and the use of per-fork temporary directories. Can't you do the same here? > Fix failing and intermittent Ozone unit tests > - > > Key: HDDS-1127 > URL: https://issues.apache.org/jira/browse/HDDS-1127 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Blocker > > A full Ozone build with acceptance + unit tests takes ~1.5 hours. > In the last 30 hours I executed a new full build every 2 hours and > collected all the results. > We have ~1200 test methods, and ~15 failed more than 4 times out of the 17 > runs. > I propose the following method to fix them: > # Turn them off immediately (@Skip) to get real data for the pre-commits > # Create a Jira for every failing test *with an assignee* (I would choose an > assignee based on the history of the unit test). We can adjust the assignee > later, but I would prefer to use a default person instead of creating unassigned > JIRAs. > # Fix them and enable the tests again.
> Failing tests are the following: > |Package|Class|Test|84|83|82|81|80|79|78|77|76|75|74|73|72|71|70|69|68|67|66|65|FAILED| > |org.apache.hadoop.hdds.scm.chillmode|TestSCMChillModeManager|testDisableChillMode|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|N/A|17| > |org.apache.hadoop.hdds.scm.node|TestDeadNodeHandler|testOnMessage|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|N/A|17| > |org.apache.hadoop.hdds.scm.node|TestDeadNodeHandler|testOnMessageReplicaFailure|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|N/A|17| > |org.apache.hadoop.ozone|TestSecureOzoneCluster|testDelegationToken|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|N/A|17| > |org.apache.hadoop.ozone|TestSecureOzoneCluster|testDelegationTokenRenewal|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|N/A|17| > |org.apache.hadoop.ozone.container.ozoneimpl|TestOzoneContainerWithTLS|testCreateOzoneContainer[0]|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|N/A|17| > |org.apache.hadoop.ozone.container.ozoneimpl|TestOzoneContainerWithTLS|testCreateOzoneContainer[1]|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|N/A|17| > |[org.apache.hadoop.ozone.om|http://org.apache.hadoop.ozone.om/]|TestOzoneManager|testAccessVolume|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|N/A|17| > 
|[org.apache.hadoop.ozone.om|http://org.apache.hadoop.ozone.om/]|TestOzoneManager|testOmInitializationFailure|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|N/A|17| > |[org.apache.hadoop.ozone.om|http://org.apache.hadoop.ozone.om/]|TestOzoneManager|testRenameKey|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|N/A|17| > |[org.apache.hadoop.ozone.om|http://org.apache.hadoop.ozone.om/]|TestOzoneManagerHA|testTwoOMNodesDown|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|N/A|17| > |org.apache.hadoop.ozone.om.ratis|TestOzoneManagerRatisServer|testSubmitRatisRequest|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|FAILED|FAILED|FAILED|FAILED|FAILED|FAILED|N/A|N/A|17| > |org.apache.hadoop.ozone.freon|TestFreonWithDatanodeFastRestart|testRestart|FAILED|FAILED|FAILED|PASSED|FAILED|PASSED|FAILED|PASSED|FAILED|FAILED|FAILED|N/A|FAILED|PASSED|PASSED|FAILED|FAILED|PASSED|N/A|N/A|11| > |org.apache.hadoop.hdds.scm.container|TestContainerStateManagerIntegration|testGetMatchingContainerMultipleThreads|PASSED|PASSED|PASSED|PASSED|FAILED|PASSED|PASSED|PASSED|PASSED|PASSED|PASSED|N/A|PASSED|FAILED|PASSED|PASSED|FAILED|FAILED|N/A|N/A|4| > |org.apache.hadoop.ozone.client.rpc|TestFailureHandlingByClient|testMultiBlockWritesWithIntermittentDnFailures|PASSED|PASSED|FAILED|PASSED|PASSED|PASSED|FAILED|PASSED|PASSED|PASSED|PASSED|N
[jira] [Commented] (HDFS-14254) RBF: Getfacl gives a wrong acl entries when the order of the mount table set to HASH_ALL or RANDOM
[ https://issues.apache.org/jira/browse/HDFS-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773015#comment-16773015 ] Hadoop QA commented on HDFS-14254: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 31s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} HDFS-13891 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 19s{color} | {color:green} HDFS-13891 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} HDFS-13891 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} HDFS-13891 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} HDFS-13891 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 50s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 56s{color} | {color:green} HDFS-13891 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s{color} | {color:green} HDFS-13891 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 44s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 23m 37s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 79m 25s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14254 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12959422/HDFS-14254-HDFS-13891.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 6c9cb4dc439e 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | HDFS-13891 / f94b6e3 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/26269/testReport/ | | Max. process+thread count | 1041 (vs. ulimit of 1) | | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/26269/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > RBF: Getfacl gives a wrong acl entries when the order of the mount table set > to HASH_ALL
[jira] [Commented] (HDDS-1124) java.lang.IllegalStateException exception in datanode log
[ https://issues.apache.org/jira/browse/HDDS-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773027#comment-16773027 ] Shashikant Banerjee commented on HDDS-1124: --- 172.27.76.72, being the leader, sends an append request to follower node 172.27.54.212. The follower's next index should be 3048. Meanwhile, a leader election gets triggered, and hence 172.27.54.212 writes a config entry to its log at index 3048, as does 172.27.76.72.
{code:java}
2019-02-15 10:15:53,319 INFO org.apache.ratis.server.impl.RoleInfo: a40a7b01-a30b-469c-b373-9fcb20a126ed: start FollowerState
2019-02-15 10:15:53,319 INFO org.apache.ratis.server.impl.FollowerState: a40a7b01-a30b-469c-b373-9fcb20a126ed: FollowerState was interrupted: java.lang.InterruptedException: sleep interrupted
2019-02-15 10:15:53,463 INFO org.apache.ratis.server.impl.RaftServerImpl: a40a7b01-a30b-469c-b373-9fcb20a126ed: change Leader from null to 943007c8-4fdd-4926-89e2-2c8c52c05073 at term 3 for appendEntries, leader elected after 1364ms
2019-02-15 10:15:53,591 INFO org.apache.ratis.server.impl.RaftServerImpl: a40a7b01-a30b-469c-b373-9fcb20a126ed: set configuration 3048: [a40a7b01-a30b-469c-b373-9fcb20a126ed:172.27.54.212:9858, 8c77b16b-8054-49e3-b669-1ff759cfd271:172.27.23.196:9858, 943007c8-4fdd-4926-89e2-2c8c52c05073:172.27.76.72:9858], old=null at 3048
2019-02-15 10:15:53,594 INFO org.apache.ratis.server.storage.RaftLogWorker: a40a7b01-a30b-469c-b373-9fcb20a126ed-RaftLogWorker: Truncating segments toTruncate: (3036, 3048) isOpen? true, length=11529, newEndIndex=3047 toDelete: [], start index 3048
2019-02-15 10:15:53,594 INFO org.apache.ratis.server.storage.RaftLogWorker: a40a7b01-a30b-469c-b373-9fcb20a126ed-RaftLogWorker: Starting segment from index:3048
2019-02-15 10:15:53,597 INFO org.apache.ratis.server.storage.RaftLogWorker: a40a7b01-a30b-469c-b373-9fcb20a126ed-RaftLogWorker: Truncated log file /data/disk1/ozone/meta/ratis/01d3ef2a-912c-4fc0-80b6-012343d76adb/current/log_inprogress_3036 to length 11529 and moved it to /data/disk1/ozone/meta/ratis/01d3ef2a-912c-4fc0-80b6-012343d76adb/current/log_3036-3047{code}
When the appendEntries reply is received on 172.27.76.72, the follower node has incremented its next index to 3049 (as a conf entry was written at index 3048) while the requestor still assumes it to be 3048, so the assertion is hit.
{code:java}
final long replyNextIndex = reply.getNextIndex();
final long lastIndex = replyNextIndex - 1;
final boolean updateMatchIndex;
if (request.getEntriesCount() == 0) {
  Preconditions.assertTrue(!request.hasPreviousLog() || lastIndex == request.getPreviousLog().getIndex(),
      "reply's next index is %s, request's previous is %s", replyNextIndex, request.getPreviousLog());
  updateMatchIndex = request.hasPreviousLog() && follower.getMatchIndex() < lastIndex;
}
{code}
This is quite possible in the system and not a fatal bug.
> java.lang.IllegalStateException exception in datanode log > - > > Key: HDDS-1124 > URL: https://issues.apache.org/jira/browse/HDDS-1124 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Nilotpal Nandi >Assignee: Shashikant Banerjee >Priority: Major > > steps taken : > > # created 12 datanodes cluster and running workload on all the nodes > exception seen : > --- > > {noformat} > 2019-02-15 10:15:53,355 INFO org.apache.ratis.server.storage.RaftLogWorker: > 943007c8-4fdd-4926-89e2-2c8c52c05073-RaftLogWorker: Rolled log segment from > /data/disk1/ozone/meta/ratis/01d3ef2a-912c-4fc0-80b6-012343d76adb/current/log_inprogress_3036 > to > /data/disk1/ozone/meta/ratis/01d3ef2a-912c-4fc0-80b6-012343d76adb/current/log_3036-3047 > 2019-02-15 10:15:53,367 INFO org.apache.ratis.server.impl.RaftServerImpl: > 943007c8-4fdd-4926-89e2-2c8c52c05073: set configuration 3048: > [a40a7b01-a30b-469c-b373-9fcb20a126ed:172.27.54.212:9858, > 8c77b16b-8054-49e3-b669-1ff759cfd271:172.27.23.196:9858, > 943007c8-4fdd-4926-89e2-2c8c52c05073:172.27.76.72:9858], old=null at 3048 > 2019-02-15 10:15:53,523 INFO org.apache.ratis.server.storage.RaftLogWorker: > 943007c8-4fdd-4926-89e2-2c8c52c05073-RaftLogWorker: created new log segment > /data/disk1/ozone/meta/ratis/01d3ef2a-912c-4fc0-80b6-012343d76adb/current/log_inprogress_3048 > 2019-02-15 10:15:53,580 ERROR org.apache.ratis.grpc.server.GrpcLogAppender: > Failed onNext serverReply { > requestorId: "943007c8-4fdd-4926-89e2-2c8c52c05073" > replyId: "a40a7b01-a30b-469c-b373-9fcb20a126ed" > raftGroupId { > id: "\001\323\357*\221,O\300\200\266\001#C\327j\333" > } > success: true > } > term: 3 > nextIndex: 3049 > followerCommit: 3047 > java.lang.IllegalStateException: rep
[jira] [Commented] (HDDS-1125) java.lang.InterruptedException seen in datanode logs
[ https://issues.apache.org/jira/browse/HDDS-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773033#comment-16773033 ] Shashikant Banerjee commented on HDDS-1125: --- There are GC pauses in the leader, because of which the readStateMachine thread hangs and times out. This also results in a leader election getting triggered, as a result of which readStateMachine gets interrupted when the transition from leader to follower state happens on node 172.27.76.72. ReadStateMachine times out for the two followers owing to the GC pause:
{code:java}
2019-02-15 10:16:48,693 WARN org.apache.ratis.server.impl.LogAppender: GrpcLogAppender(943007c8-4fdd-4926-89e2-2c8c52c05073 -> 8c77b16b-8054-49e3-b669-1ff759cfd271): Failed get (t:3, i:3084), STATEMACHINELOGENTRY, client-632E77ADA885, cid=6232 in 140998012ns
java.util.concurrent.TimeoutException
at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1771)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
at org.apache.ratis.server.storage.RaftLog$EntryWithData.getEntry(RaftLog.java:433)
2019-02-15 10:16:48,697 WARN org.apache.ratis.server.impl.LogAppender: GrpcLogAppender(943007c8-4fdd-4926-89e2-2c8c52c05073 -> a40a7b01-a30b-469c-b373-9fcb20a126ed): Failed get (t:3, i:3084), STATEMACHINELOGENTRY, client-632E77ADA885, cid=6232 in 192996036ns
java.util.concurrent.TimeoutException
at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1771)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
at org.apache.ratis.server.storage.RaftLog$EntryWithData.getEntry(RaftLog.java:433)
2019-02-15 10:16:48,704 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1663ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=1669ms
GC pool 'PS Scavenge' had collection(s): count=1 time=373ms
{code}
The readStateMachine thread is interrupted when the current leader transitions to follower state:
{code:java}
2019-02-15 10:16:48,710 INFO org.apache.ratis.server.impl.RaftServerImpl: 943007c8-4fdd-4926-89e2-2c8c52c05073: change Leader from 943007c8-4fdd-4926-89e2-2c8c52c05073 to null at term 4 for updateCurrentTerm
2019-02-15 10:16:48,710 INFO org.apache.ratis.server.impl.RaftServerImpl: 943007c8-4fdd-4926-89e2-2c8c52c05073 changes role from LEADER to FOLLOWER at term 4 for stepDown
2019-02-15 10:16:48,710 INFO org.apache.ratis.server.impl.RoleInfo: 943007c8-4fdd-4926-89e2-2c8c52c05073: shutdown LeaderState
2019-02-15 10:16:48,712 INFO org.apache.ratis.server.impl.PendingRequests: 943007c8-4fdd-4926-89e2-2c8c52c05073-PendingRequests: sendNotLeaderResponses
2019-02-15 10:16:48,713 ERROR org.apache.ratis.server.impl.LogAppender: 943007c8-4fdd-4926-89e2-2c8c52c05073: Failed readStateMachineData for (t:3, i:3084), STATEMACHINELOGENTRY, client-632E77ADA885, cid=6232
java.lang.InterruptedException
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:347)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
at org.apache.ratis.server.storage.RaftLog$EntryWithData.getEntry(RaftLog.java:433)
{code}
The system recovers from this eventually. The issue to be looked at here is why the GC pause happened, which led to ReadStateMachineData timing out.
> java.lang.InterruptedException seen in datanode logs > > > Key: HDDS-1125 > URL: https://issues.apache.org/jira/browse/HDDS-1125 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Nilotpal Nandi >Assignee: Shashikant Banerjee >Priority: Major > > steps taken : > > # created 12 datanodes cluster and running workload on all the nodes > > exception seen : > - > > {noformat} > 2019-02-15 10:16:48,713 ERROR org.apache.ratis.server.impl.LogAppender: > 943007c8-4fdd-4926-89e2-2c8c52c05073: Failed readStateMachineData for (t:3, > i:3084), STATEMACHINELOGENTRY, client-632E77ADA885, cid=6232 > java.lang.InterruptedException > at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:347) > at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915) > at > org.apache.ratis.server.storage.RaftLog$EntryWithData.getEntry(RaftLog.java:433) > at org.apache.ratis.util.DataQueue.pollList(DataQueue.java:133) > at > org.apache.ratis.server.impl.LogAppender.createRequest(LogAppender.java:171) > at > org.apache.ratis.grpc.server.GrpcLogAppender.appendLog(GrpcLogAppender.java:152) > at > org.apache.ratis.grpc.server.GrpcLogAppender.runAppenderImpl(GrpcLogAppender.java:96) > at org.apache.ratis.server.impl.LogAppender.runAp
[jira] [Created] (HDDS-1143) Ensure stateMachineData to be evicted only after writeStateMachineData completes in ContainerStateMachine cache
Shashikant Banerjee created HDDS-1143: - Summary: Ensure stateMachineData to be evicted only after writeStateMachineData completes in ContainerStateMachine cache Key: HDDS-1143 URL: https://issues.apache.org/jira/browse/HDDS-1143 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: Ozone Datanode Affects Versions: 0.4.0 Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.4.0 Currently, when we write StateMachineData, we first write to the cache, followed by a write to disk. If the disk write is very slow, the cache entry can get evicted while the actual write is still in progress. The purpose of this Jira is to ensure that cache eviction happens only after writeChunk completes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
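The behavior described above can be sketched as a cache that refuses to evict an entry while its disk write is still in flight. A rough illustration — the class and method names below are hypothetical, not the actual ContainerStateMachine API:

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: an entry becomes eligible for eviction only once its asynchronous
// disk write (writeChunk) has completed.
public class StateMachineDataCache {
    private final Map<Long, byte[]> cache = new ConcurrentHashMap<>();
    private final Map<Long, CompletableFuture<Void>> pendingWrites = new ConcurrentHashMap<>();

    // Write-through: put in the cache first, then track the async disk write.
    public CompletableFuture<Void> writeStateMachineData(long index, byte[] data,
                                                         CompletableFuture<Void> diskWrite) {
        cache.put(index, data);
        pendingWrites.put(index, diskWrite);
        // Once the disk write completes, the entry becomes evictable.
        return diskWrite.thenRun(() -> pendingWrites.remove(index));
    }

    // Evict only if no disk write is still in flight for this index.
    public boolean tryEvict(long index) {
        if (pendingWrites.containsKey(index)) {
            return false; // writeChunk not done yet; keep the entry
        }
        cache.remove(index);
        return true;
    }

    public byte[] get(long index) { return cache.get(index); }

    public static void main(String[] args) {
        StateMachineDataCache cacheSketch = new StateMachineDataCache();
        CompletableFuture<Void> diskWrite = new CompletableFuture<>();
        cacheSketch.writeStateMachineData(3084L, new byte[]{42}, diskWrite);
        System.out.println("evicted while write pending: " + cacheSketch.tryEvict(3084L));
        diskWrite.complete(null); // disk write finishes
        System.out.println("evicted after write completed: " + cacheSketch.tryEvict(3084L));
    }
}
```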
[jira] [Created] (HDDS-1144) Introduce an option to optionally save the ratis log data when the pipeline gets destroyed
Shashikant Banerjee created HDDS-1144: - Summary: Introduce an option to optionally save the ratis log data when the pipeline gets destroyed Key: HDDS-1144 URL: https://issues.apache.org/jira/browse/HDDS-1144 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: Ozone Datanode Affects Versions: 0.4.0 Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.4.0 Currently, when the pipeline gets destroyed, the associated ratis log dirs are cleaned up. These logs may be useful for debugging purposes. The purpose of this Jira is to introduce a config option to optionally save the ratis log data rather than deleting it when the pipeline is destroyed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
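The proposed option amounts to a simple branch at pipeline-destruction time: move the log directory aside when a config flag is set, delete it otherwise. A rough sketch — the config key name and class are hypothetical, not actual Ozone code:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Sketch: on pipeline destruction, either delete the Ratis log directory
// (current behavior) or move it aside for later debugging.
public class RatisLogCleanup {
    // Hypothetical property name, for illustration only.
    public static final String SAVE_LOG_KEY = "dfs.container.ratis.log.save-on-destroy";

    // Returns the path the logs were saved to, or null if they were deleted.
    public static Path destroyPipelineLogs(Path logDir, boolean saveLogs) {
        try {
            if (saveLogs) {
                // Preserve the directory under a new name instead of deleting it.
                Path saved = logDir.resolveSibling(logDir.getFileName() + ".saved");
                return Files.move(logDir, saved);
            }
            try (Stream<Path> walk = Files.walk(logDir)) {
                // delete children before parents
                walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                    try { Files.delete(p); } catch (IOException e) { throw new UncheckedIOException(e); }
                });
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Self-contained demo: save the dir first, then delete the saved copy.
    public static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("ratis-log");
            Files.write(dir.resolve("segment-0"), new byte[]{1, 2, 3});
            Path saved = destroyPipelineLogs(dir, true);   // save mode
            boolean ok = Files.exists(saved.resolve("segment-0")) && !Files.exists(dir);
            destroyPipelineLogs(saved, false);             // delete mode
            return ok && !Files.exists(saved);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("demo ok: " + demo());
    }
}
```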
[jira] [Commented] (HDFS-13762) Support non-volatile storage class memory(SCM) in HDFS cache directives
[ https://issues.apache.org/jira/browse/HDFS-13762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773040#comment-16773040 ] Sammi Chen commented on HDFS-13762: --- Thanks [~PhiloHe] for continuing to work on this. Here is some feedback on the 008 patch. 1. NativeIO.java: suggest defining the different PMDK support codes and their meanings using an enum, so it will be easy to map a code to its description. 2. NativeIO.c: it would be nice to refactor the error message handling to avoid the message being truncated if pmem_errormsg() returns overly long content. There are several similar pieces of code: {code:java} char msg[1000]; snprintf(msg, sizeof(msg), "Fail to unmap region. address: %x, length: %x, error msg: %s", address, length, pmem_errormsg()); {code} 3. Supporting the persistent memory cache on Windows can be a follow-up task, since Windows is not commonly used in user environments. If building this patch on Windows is a goal of this JIRA, please update the "Building on Windows" section in BUILDING.txt with any specific steps users need to take. If it's not a goal of this JIRA, make sure a build on Windows fails with a clear message for this case. 4. CMakeLists.txt: "#Require ISA-L" is for ISA-L; better to keep it. 5. Make PmemVolumeManager pmemManager final. 6. This piece of code could accidentally delete user files. Better to give the user a choice whether to auto-delete or not: {code:java} // Remove all files under the volume. Files may have been left after an // unexpected datanode restart. FileUtils.cleanDirectory(locFile); {code} 7. Make count in PmemVolumeManager final. 8. fillBuffer in MemoryMappedBlock and PmemMappedBlock are the same.
Refactor to keep only one copy of the code. > Support non-volatile storage class memory(SCM) in HDFS cache directives > --- > > Key: HDFS-13762 > URL: https://issues.apache.org/jira/browse/HDFS-13762 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Sammi Chen >Assignee: Feilong He >Priority: Major > Attachments: HDFS-13762.000.patch, HDFS-13762.001.patch, > HDFS-13762.002.patch, HDFS-13762.003.patch, HDFS-13762.004.patch, > HDFS-13762.005.patch, HDFS-13762.006.patch, HDFS-13762.007.patch, > HDFS-13762.008.patch, SCMCacheDesign-2018-11-08.pdf, SCMCacheTestPlan.pdf > > > Non-volatile storage class memory is a type of memory that keeps its data > content after a power failure or across power cycles. Non-volatile storage > class memory devices usually have access speed near that of a memory DIMM > while costing less than memory. So today it is usually used as a supplement to > memory to hold long-term persistent data, such as data in cache. > Currently in HDFS, we have OS page cache backed read only cache and RAMDISK > based lazy write cache. Non-volatile memory suits both these functions. > This Jira aims to enable storage class memory first in the read cache. Although > storage class memory has non-volatile characteristics, to keep the same > behavior as the current read only cache, we don't use its persistent > characteristics currently. > > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1135) Ozone jars are missing in the Ozone Snapshot tar
[ https://issues.apache.org/jira/browse/HDDS-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773054#comment-16773054 ] Dinesh Chitlangia commented on HDDS-1135: - [~elek] Thanks for the review and commit. > Ozone jars are missing in the Ozone Snapshot tar > > > Key: HDDS-1135 > URL: https://issues.apache.org/jira/browse/HDDS-1135 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Affects Versions: 0.4.0 >Reporter: Shashikant Banerjee >Assignee: Dinesh Chitlangia >Priority: Major > Fix For: 0.4.0 > > Attachments: HDDS-1135.00.patch > > > After executing an Ozone dist build, the library jars are missing from the > created tar file. > The problem is on the maven side. The tar file creation is called before the > jar copies. > {code:java} > cd hadoop-ozone/dist > mvn clean package | grep "\-\-\-"{code} > {code:java} > [INFO] < org.apache.hadoop:hadoop-ozone-dist > >- > [INFO] [ pom > ]- > [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hadoop-ozone-dist > --- > [INFO] --- maven-antrun-plugin:1.7:run (create-testdirs) @ hadoop-ozone-dist > --- > [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ > hadoop-ozone-dist --- > [INFO] --- exec-maven-plugin:1.3.1:exec (dist) @ hadoop-ozone-dist --- > [INFO] --- exec-maven-plugin:1.3.1:exec (tar-ozone) @ hadoop-ozone-dist --- > [INFO] --- maven-site-plugin:3.6:attach-descriptor (attach-descriptor) @ > hadoop-ozone-dist --- > [INFO] --- maven-dependency-plugin:3.0.2:build-classpath > (add-classpath-descriptor) @ hadoop-ozone-dist --- > [INFO] --- maven-dependency-plugin:3.0.2:copy (copy-classpath-files) @ > hadoop-ozone-dist --- > [INFO] --- maven-dependency-plugin:3.0.2:copy-dependencies (copy-jars) @ > hadoop-ozone-dist --- > [INFO] --- maven-jar-plugin:2.5:test-jar (default) @ hadoop-ozone-dist > ---{code} > The right order of the plugin executions is: > * Call 'dist' (dist-layout-stitching, it cleans the destination directory) > * Copy the jar
files (copy-classpath-files, copy-jars) > * Create the tar package (tar-ozone) > It could be done by adjusting the maven phases in the pom.xml. > I would suggest moving 'dist' to the 'compile' phase, moving > 'copy-classpath-files' and 'copy-jars' to the 'prepare-package' phase, and > keeping 'tar-ozone' at the 'package' phase. > With this setup we can be sure that the steps are executed in the right order. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
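Taking the phase suggestion above literally, the bindings in hadoop-ozone/dist/pom.xml might look roughly like this (plugin coordinates and goals elided; a sketch of the idea, not the actual patch):

```xml
<!-- dist-layout-stitching: runs early and cleans/lays out the dest dir -->
<execution>
  <id>dist</id>
  <phase>compile</phase>
</execution>
<!-- jar copies: run after compile but before packaging -->
<execution>
  <id>copy-classpath-files</id>
  <phase>prepare-package</phase>
</execution>
<execution>
  <id>copy-jars</id>
  <phase>prepare-package</phase>
</execution>
<!-- tar creation: runs last, once all jars are in place -->
<execution>
  <id>tar-ozone</id>
  <phase>package</phase>
</execution>
```

Because Maven runs lifecycle phases in a fixed order (compile before prepare-package before package), this ordering guarantees the jars are copied before the tar is created.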
[jira] [Created] (HDFS-14301) libhdfs should expose a method to get conf parameters from a hdfsFS instance
Sahil Takiar created HDFS-14301: --- Summary: libhdfs should expose a method to get conf parameters from a hdfsFS instance Key: HDFS-14301 URL: https://issues.apache.org/jira/browse/HDFS-14301 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, libhdfs, native Reporter: Sahil Takiar Assignee: Sahil Takiar libhdfs currently exposes a few methods for getting the values of configuration parameters: {{hdfsConfGetStr}} and {{hdfsConfGetInt}}. The issue is that in {{hdfs.c}} the implementation of these methods simply calls {{new Configuration()}} and fetches values using {{get}}. The problem is that calling {{new Configuration}} simply loads the current {{hdfs-site.xml}}, {{core-site.xml}}, etc., which does not take into account the scenario where the default configuration has been modified for specific filesystem instances. For example, the {{hdfsBuilder}} exposes a {{hdfsBuilderConfSetStr}} method that allows setting non-default configuration parameters. This could lead to issues such as: {code:java} struct hdfsBuilder *bld; bld = hdfsNewBuilder(); hdfsBuilderSetForceNewInstance(bld); hdfsBuilderConfSetStr(bld, "hello", "world"); hdfs = hdfsBuilderConnect(bld); char* value = NULL; hdfsConfGetStr("hello", &value); // Value is NULL! {code} This JIRA proposes adding a new set of methods to libhdfs that take in a {{hdfsFS}} object and get the value of the key from the {{hdfsFS}} object rather than using {{new Configuration}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-3246) pRead equivalent for direct read path
[ https://issues.apache.org/jira/browse/HDFS-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773105#comment-16773105 ] Sahil Takiar commented on HDFS-3246: The patch doesn't actually compile because it is dependent on HDFS-14267 being merged first. I will fix the checkstyle issues. > pRead equivalent for direct read path > - > > Key: HDFS-3246 > URL: https://issues.apache.org/jira/browse/HDFS-3246 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client, performance >Affects Versions: 3.0.0-alpha1 >Reporter: Henry Robinson >Assignee: Sahil Takiar >Priority: Major > Attachments: HDFS-3246.001.patch > > > There is no pread equivalent in ByteBufferReadable. We should consider adding > one. It would be relatively easy to implement for the distributed case > (certainly compared to HDFS-2834), since DFSInputStream does most of the > heavy lifting. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
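The missing API discussed in HDFS-3246 can be illustrated with a small sketch: a positioned read ("pread") that does not observably move the stream offset. The interface and method names below are made up for illustration, not Hadoop's actual types; in HDFS, DFSInputStream could implement the positioned read directly without the seek/seek-back dance:

```java
import java.io.IOException;
import java.nio.ByteBuffer;

// Sketch of a positioned-read companion to ByteBufferReadable.
public interface ByteBufferPositionedRead {
    int read(ByteBuffer buf) throws IOException;   // stateful read, as today
    void seek(long pos) throws IOException;
    long getPos() throws IOException;

    // pread: read from an absolute position without (observably) moving the
    // stream offset. A naive default built from seek/read/seek-back; a real
    // implementation would avoid mutating shared stream state.
    default int read(long position, ByteBuffer buf) throws IOException {
        long oldPos = getPos();
        try {
            seek(position);
            return read(buf);
        } finally {
            seek(oldPos);
        }
    }

    // Tiny in-memory demo of the default pread.
    static boolean demo() {
        byte[] data = {10, 20, 30, 40};
        ByteBufferPositionedRead r = new ByteBufferPositionedRead() {
            private long pos = 0;
            public int read(ByteBuffer buf) {
                int n = 0;
                while (buf.hasRemaining() && pos < data.length) {
                    buf.put(data[(int) pos++]);
                    n++;
                }
                return n == 0 ? -1 : n;
            }
            public void seek(long p) { pos = p; }
            public long getPos() { return pos; }
        };
        try {
            ByteBuffer b = ByteBuffer.allocate(2);
            // read bytes at offsets 2..3 without disturbing the stream position
            return r.read(2L, b) == 2 && b.get(0) == 30 && r.getPos() == 0;
        } catch (IOException e) {
            return false;
        }
    }

    static void main(String[] args) {
        System.out.println("pread demo ok: " + demo());
    }
}
```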
[jira] [Commented] (HDFS-14267) Add test_libhdfs_ops to libhdfs tests, mark libhdfs_read/write.c as examples
[ https://issues.apache.org/jira/browse/HDFS-14267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773106#comment-16773106 ] Sahil Takiar commented on HDFS-14267: - [~mackrorysd] were you able to get the tests running? What tests are you planning to run exactly? Just the existing libhdfs tests or do you have your own set of tests you run? > Add test_libhdfs_ops to libhdfs tests, mark libhdfs_read/write.c as examples > > > Key: HDFS-14267 > URL: https://issues.apache.org/jira/browse/HDFS-14267 > Project: Hadoop HDFS > Issue Type: Improvement > Components: libhdfs, native, test >Reporter: Sahil Takiar >Assignee: Sahil Takiar >Priority: Major > Attachments: HDFS-14267.001.patch, HDFS-14267.002.patch > > > {{test_libhdfs_ops.c}} provides test coverage for basic operations against > libhdfs, but currently has to be run manually (e.g. {{mvn install}} does not > run these tests). The goal of this patch is to add {{test_libhdfs_ops.c}} to > the list of tests that are automatically run for libhdfs. > It looks like {{test_libhdfs_ops.c}} was used in conjunction with > {{hadoop-hdfs-project/hadoop-hdfs/src/main/native/tests/test-libhdfs.sh}} to > run some tests against a mini DFS cluster. Now that the > {{NativeMiniDfsCluster}} exists, it makes more sense to use that rather than > rely on an external bash script to start a mini DFS cluster. > The {{libhdfs-tests}} directory (which contains {{test_libhdfs_ops.c}}) > contains two other files: {{test_libhdfs_read.c}} and > {{test_libhdfs_write.c}}. At some point, these files might have been used in > conjunction with {{test-libhdfs.sh}} to run some tests manually. However, > they (1) largely overlap with the test coverage provided by > {{test_libhdfs_ops.c}} and (2) are not designed to be run as unit tests. Thus > I suggest we move these two files into a new folder called > {{libhdfs-examples}} and use them to further document how users of libhdfs > can use the API. 
We can move {{test-libhdfs.sh}} into the examples folder as > well given that example files probably require the script to actually work. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14298) Improve log messages of ECTopologyVerifier
[ https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773122#comment-16773122 ] Hadoop QA commented on HDFS-14298: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 21s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 38s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 16s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}132m 30s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks | | | hadoop.hdfs.tools.TestECAdmin | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.qjournal.server.TestJournalNodeSync | | | hadoop.hdfs.server.namenode.TestPersistentStoragePolicySatisfier | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14298 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12959437/HDFS-14298.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux d61202eb509a 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / aa3ad36 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/26270/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/26270/testReport/ | | Max. process+thread count | 4076 (vs. ulimit of 1) |
[jira] [Commented] (HDFS-14249) RBF: Tooling to identify the subcluster location of a file
[ https://issues.apache.org/jira/browse/HDFS-14249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773131#comment-16773131 ] Ayush Saxena commented on HDFS-14249: - Thanx [~elgoiri] for updating. Nothing to add from my side. :) > RBF: Tooling to identify the subcluster location of a file > -- > > Key: HDFS-14249 > URL: https://issues.apache.org/jira/browse/HDFS-14249 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-14249-HDFS-13891.000.patch, > HDFS-14249-HDFS-13891.001.patch, HDFS-14249-HDFS-13891.002.patch > > > Mount points can spread files across multiple subclusters depending on a > policy (e.g., HASH, HASH_ALL). Administrators would need a way to identify > the location. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14216) NullPointerException happens in NamenodeWebHdfs
[ https://issues.apache.org/jira/browse/HDFS-14216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773146#comment-16773146 ] Erik Krogen commented on HDFS-14216: Sure, I see the logic for using DEBUG level. I'm not sure that throwing an IOE is the right move, given that it is non-fatal. For example, if a well-behaved client decides to exclude {{nodeA}} and then {{nodeA}} disappears, the client's subsequent request shouldn't fail. I wonder how this is handled on the RPC side? > NullPointerException happens in NamenodeWebHdfs > --- > > Key: HDFS-14216 > URL: https://issues.apache.org/jira/browse/HDFS-14216 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Critical > Attachments: HDFS-14216_1.patch, HDFS-14216_2.patch, > HDFS-14216_3.patch, HDFS-14216_4.patch, hadoop-hires-namenode-hadoop11.log > > > workload > {code:java} > curl -i -X PUT -T $HOMEPARH/test.txt > "http://hadoop1:9870/webhdfs/v1/input?op=CREATE&excludedatanodes=hadoop2"; > {code} > the method > {code:java} > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.chooseDatanode(String > excludeDatanodes){ > HashSet excludes = new HashSet(); > if (excludeDatanodes != null) { >for (String host : StringUtils > .getTrimmedStringCollection(excludeDatanodes)) { > int idx = host.indexOf(":"); >if (idx != -1) { > excludes.add(bm.getDatanodeManager().getDatanodeByXferAddr( >host.substring(0, idx), Integer.parseInt(host.substring(idx + > 1)))); >} else { > > excludes.add(bm.getDatanodeManager().getDatanodeByHost(host));//line280 >} > } > } > } > {code} > when the datanode (e.g. hadoop2) is just wiped before > line 280, or we give the wrong DN > name, then bm.getDatanodeManager().getDatanodeByHost(host) will > return null, and *_excludes_* *contains null*.
while *_excludes_* are used > later, NPE happens: > {code:java} > java.lang.NullPointerException > at org.apache.hadoop.net.NodeBase.getPath(NodeBase.java:113) > at > org.apache.hadoop.net.NetworkTopology.countNumOfAvailableNodes(NetworkTopology.java:672) > at > org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:533) > at > org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:491) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.chooseDatanode(NamenodeWebHdfsMethods.java:323) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.redirectURI(NamenodeWebHdfsMethods.java:384) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.put(NamenodeWebHdfsMethods.java:652) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods$2.run(NamenodeWebHdfsMethods.java:600) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods$2.run(NamenodeWebHdfsMethods.java:597) > at org.apache.hadoop.ipc.ExternalCall.run(ExternalCall.java:73) > at org.apache.hadoop.ipc.ExternalCall.run(ExternalCall.java:30) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2830) > {code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
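The fix under discussion amounts to a null guard when building the excludes set. A minimal sketch — the resolver function below stands in for bm.getDatanodeManager().getDatanodeByHost(host), and this is an illustration of the guard, not the actual patch:

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Function;

// Sketch: if the datanode lookup returns null (the DN just died, or a wrong
// name was given), skip it instead of adding null to the excludes set, which
// later causes the NPE in NetworkTopology.
public class ExcludeDatanodes {
    public static Set<String> buildExcludes(Iterable<String> hosts,
                                            Function<String, String> resolver) {
        Set<String> excludes = new HashSet<>();
        for (String host : hosts) {
            String dn = resolver.apply(host); // may be null
            if (dn != null) {                 // guard: never add null
                excludes.add(dn);
            }
        }
        return excludes;
    }

    public static void main(String[] args) {
        // hadoop2 cannot be resolved; it is skipped rather than added as null
        Set<String> ex = buildExcludes(java.util.List.of("hadoop1", "hadoop2"),
                h -> h.equals("hadoop2") ? null : h + ":9866");
        System.out.println(ex);
    }
}
```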
[jira] [Commented] (HDDS-1038) Datanode fails to connect with secure SCM
[ https://issues.apache.org/jira/browse/HDDS-1038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773148#comment-16773148 ] Xiaoyu Yao commented on HDDS-1038: -- Attached patch v5 to fix the unit test issue and also to guard refreshServiceAcl so it is called only if hadoop security authorization is enabled. > Datanode fails to connect with secure SCM > - > > Key: HDDS-1038 > URL: https://issues.apache.org/jira/browse/HDDS-1038 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Labels: Security > Fix For: 0.4.0 > > Attachments: HDDS-1038.00.patch, HDDS-1038.01.patch, > HDDS-1038.02.patch, HDDS-1038.03.patch, HDDS-1038.04.patch, HDDS-1038.05.patch > > > In a secure Ozone cluster, datanodes fail to connect to SCM on > {{StorageContainerDatanodeProtocol}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14298) Improve log messages of ECTopologyVerifier
[ https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-14298: Attachment: HDFS-14298.003.patch > Improve log messages of ECTopologyVerifier > -- > > Key: HDFS-14298 > URL: https://issues.apache.org/jira/browse/HDFS-14298 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Minor > Attachments: HDFS-14298.001.patch, HDFS-14298.002.patch, > HDFS-14298.003.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14298) Improve log messages of ECTopologyVerifier
[ https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773156#comment-16773156 ] Kitti Nanasi commented on HDFS-14298: - Added patch v003 to fix the TestECAdmin tests affected by the latest changes on trunk. > Improve log messages of ECTopologyVerifier > -- > > Key: HDFS-14298 > URL: https://issues.apache.org/jira/browse/HDFS-14298 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Minor > Attachments: HDFS-14298.001.patch, HDFS-14298.002.patch, > HDFS-14298.003.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14297) Add cache for getContentSummary() result
[ https://issues.apache.org/jira/browse/HDFS-14297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773160#comment-16773160 ] Erik Krogen commented on HDFS-14297: Sure, it seems reasonable to me. I see the value in having something that can be controlled by the server, instead of needing to force users to pick a specific API. I have a few comments on the design: * I'm not sure that the total processing time is the right way to determine whether or not to cache, given that this can be subject to lock queue delays (since the lock may be released and re-acquired multiple times in the course of a large content summary). Maybe we should just set a threshold based off of the number of entries. * I would probably like to see some server flag controlling cache expiration time, and an optional flag added to the {{getContentSummary}} op which can skip the cache if desired. [~kihwal], any thoughts here? > Add cache for getContentSummary() result > > > Key: HDFS-14297 > URL: https://issues.apache.org/jira/browse/HDFS-14297 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Tao Jie >Priority: Major > > In a large HDFS cluster, calling {{getContentSummary}} for a directory with a > large amount of files is very expensive. In a certain cluster with more than > 100 million files, calling {{getContentSummary}} may take more than 10s and > it will hold the fsnamesystem lock for that long. > In our cluster, there are several peripheral systems calling > {{getContentSummary}} periodically to monitor the status of dirs. Actually we > don't need a very accurate result in most cases. We could keep a cache of > those contentSummary results in the namenode, with which we could avoid repeated > heavy requests within a time span. Also we should add more restrictions to this cache: > 1) its size should be limited and it should be LRU; 2) only results of heavy > requests would be added to this cache, e.g. rpctime over 1000ms.
> We may create a new RPC method or add a flag to the current method so that we > will not modify the current behavior and we can have a choice between an > accurate but expensive method and a fast but inaccurate one. > Any thoughts? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
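The proposed cache can be sketched as a size-bounded, access-ordered (LRU) map that only admits results whose processing time exceeded a threshold. A rough illustration — class/field names and the String summary type are placeholders, not a NameNode patch:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: size-limited LRU cache that only stores "heavy" results
// (processing time over a threshold, e.g. 1000ms as suggested above).
public class ContentSummaryCache {
    private final long heavyThresholdMs;
    private final Map<String, String> cache; // path -> cached summary

    public ContentSummaryCache(int maxEntries, long heavyThresholdMs) {
        this.heavyThresholdMs = heavyThresholdMs;
        // An access-ordered LinkedHashMap gives LRU eviction in a few lines.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    public synchronized String get(String path) {
        return cache.get(path); // also marks the entry as recently used
    }

    // Cache only results that were expensive to compute.
    public synchronized void maybePut(String path, String summary, long processingMs) {
        if (processingMs >= heavyThresholdMs) {
            cache.put(path, summary);
        }
    }

    public static void main(String[] args) {
        ContentSummaryCache c = new ContentSummaryCache(2, 1000);
        c.maybePut("/cheap", "summary", 200);  // too cheap: not cached
        c.maybePut("/big1", "summary1", 5000); // heavy: cached
        System.out.println("cheap cached: " + (c.get("/cheap") != null));
        System.out.println("heavy cached: " + (c.get("/big1") != null));
    }
}
```

A server-side expiration time, as suggested in the comment above, could be layered on top by storing an insertion timestamp alongside each summary.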
[jira] [Updated] (HDDS-1038) Datanode fails to connect with secure SCM
[ https://issues.apache.org/jira/browse/HDDS-1038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDDS-1038: - Attachment: HDDS-1038.05.patch > Datanode fails to connect with secure SCM > - > > Key: HDDS-1038 > URL: https://issues.apache.org/jira/browse/HDDS-1038 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Labels: Security > Fix For: 0.4.0 > > Attachments: HDDS-1038.00.patch, HDDS-1038.01.patch, > HDDS-1038.02.patch, HDDS-1038.03.patch, HDDS-1038.04.patch, HDDS-1038.05.patch > > > In a secure Ozone cluster, datanodes fail to connect to SCM on > {{StorageContainerDatanodeProtocol}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14216) NullPointerException happens in NamenodeWebHdfs
[ https://issues.apache.org/jira/browse/HDFS-14216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773172#comment-16773172 ] Surendra Singh Lilhore commented on HDFS-14216: --- Anyway, the client is trying to exclude a DN; if it is not found, that is also fine, no need to throw any exception. {quote} I wonder how this is handled on the RPC side? {quote} If the client is doing any operation related to a DN, then it should get the latest datanode report from the namenode periodically. One more thing: asserting NullPointerException in a UT is not a good idea; the UT should verify the functional failure case. > NullPointerException happens in NamenodeWebHdfs > --- > > Key: HDFS-14216 > URL: https://issues.apache.org/jira/browse/HDFS-14216 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Critical > Attachments: HDFS-14216_1.patch, HDFS-14216_2.patch, > HDFS-14216_3.patch, HDFS-14216_4.patch, hadoop-hires-namenode-hadoop11.log > > > workload > {code:java} > curl -i -X PUT -T $HOMEPARH/test.txt > "http://hadoop1:9870/webhdfs/v1/input?op=CREATE&excludedatanodes=hadoop2"; > {code} > the method > {code:java} > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.chooseDatanode(String > excludeDatanodes){ > HashSet excludes = new HashSet(); > if (excludeDatanodes != null) { >for (String host : StringUtils > .getTrimmedStringCollection(excludeDatanodes)) { > int idx = host.indexOf(":"); >if (idx != -1) { > excludes.add(bm.getDatanodeManager().getDatanodeByXferAddr( >host.substring(0, idx), Integer.parseInt(host.substring(idx + > 1)))); >} else { > > excludes.add(bm.getDatanodeManager().getDatanodeByHost(host));//line280 >} > } > } > } > {code} > when the datanode (e.g. hadoop2) is just wiped before > line 280, or we give the wrong DN > name, then bm.getDatanodeManager().getDatanodeByHost(host) will > return null, and *_excludes_* *contains
null*. while *_excludes_* are used > later, NPE happens: > {code:java} > java.lang.NullPointerException > at org.apache.hadoop.net.NodeBase.getPath(NodeBase.java:113) > at > org.apache.hadoop.net.NetworkTopology.countNumOfAvailableNodes(NetworkTopology.java:672) > at > org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:533) > at > org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:491) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.chooseDatanode(NamenodeWebHdfsMethods.java:323) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.redirectURI(NamenodeWebHdfsMethods.java:384) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.put(NamenodeWebHdfsMethods.java:652) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods$2.run(NamenodeWebHdfsMethods.java:600) > at > org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods$2.run(NamenodeWebHdfsMethods.java:597) > at org.apache.hadoop.ipc.ExternalCall.run(ExternalCall.java:73) > at org.apache.hadoop.ipc.ExternalCall.run(ExternalCall.java:30) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2830) > {code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
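The null guard the discussion points toward can be sketched with nothing but the JDK. This is a hypothetical illustration, not the committed patch: the plain `Map` stands in for `bm.getDatanodeManager()`, and `buildExcludes` stands in for the exclude-list part of `chooseDatanode`.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ExcludeListSketch {
    // Stand-in for bm.getDatanodeManager().getDatanodeByHost(host):
    // returns null when the host is unknown, exactly the case that
    // produced the NPE in chooseDatanode().
    static final Map<String, String> KNOWN_NODES = new HashMap<>();
    static {
        KNOWN_NODES.put("hadoop1", "dn-1");
    }

    static Set<String> buildExcludes(String excludeDatanodes) {
        Set<String> excludes = new HashSet<>();
        if (excludeDatanodes != null) {
            for (String host : excludeDatanodes.split(",")) {
                String dn = KNOWN_NODES.get(host.trim());
                if (dn != null) { // skip unknown hosts instead of adding null
                    excludes.add(dn);
                }
            }
        }
        return excludes;
    }

    public static void main(String[] args) {
        // "hadoop2" is unknown (wiped or misspelled): it is silently skipped.
        Set<String> excludes = buildExcludes("hadoop1,hadoop2");
        System.out.println(excludes.size());           // 1
        System.out.println(excludes.contains("dn-1")); // true
    }
}
```

This matches Surendra's stance above: an unknown DN in the exclude list is simply ignored rather than turned into a null entry that blows up later in `NetworkTopology`.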
[jira] [Created] (HDDS-1145) Add optional web server to the Ozone freon test tool
Elek, Marton created HDDS-1145:
--
Summary: Add optional web server to the Ozone freon test tool
Key: HDDS-1145
URL: https://issues.apache.org/jira/browse/HDDS-1145
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: Tools
Reporter: Elek, Marton
Assignee: Elek, Marton
Recently we improved the default HttpServer to support Prometheus monitoring and Java profiling. It would be very useful to enable the same options for freon testing:
1. We need a simple way to profile freon and check for problems
2. Long-running freon runs should be monitored
We can create a new optional FreonHttpServer which includes all the required servlets by default.
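The kind of optional monitoring endpoint proposed here can be sketched with the JDK's built-in HTTP server. This is only an assumption-laden stand-in: `FreonHttpSketch`, the `/prom` path, and the `freon_keys_written_total` counter name are all hypothetical; the real FreonHttpServer would reuse the existing Ozone servlets.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class FreonHttpSketch {
    // Starts a tiny HTTP endpoint exposing one Prometheus-style counter.
    // Port 0 asks the OS for a free port, which is handy for tests.
    static HttpServer start() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/prom", exchange -> {
            byte[] body = "freon_keys_written_total 42\n"
                    .getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = start();
        int port = server.getAddress().getPort();
        // Scrape the endpoint once, the way Prometheus would.
        String body = new String(new URL("http://localhost:" + port + "/prom")
                .openStream().readAllBytes(), StandardCharsets.UTF_8);
        System.out.print(body);
        server.stop(0);
    }
}
```

The point of the sketch is the shape, not the servlet set: a long-running freon process binds one extra port and anything scraping it gets live counters without touching the load generator itself.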
[jira] [Updated] (HDDS-1145) Add optional web server to the Ozone freon test tool
[ https://issues.apache.org/jira/browse/HDDS-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-1145: - Labels: pull-request-available (was: ) > Add optional web server to the Ozone freon test tool > > > Key: HDDS-1145 > URL: https://issues.apache.org/jira/browse/HDDS-1145 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Tools >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Labels: pull-request-available > > Recently we improved the default HttpServer to support prometheus monitoring > and java profiling. > It would be very useful to enable the same options for freon testing: > 1. We need a simple way to profile freon and check the problems > 2. Long running freons should be monitored > We can create a new optional FreonHttpServer which includes all the required > servlets by default. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1145) Add optional web server to the Ozone freon test tool
[ https://issues.apache.org/jira/browse/HDDS-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elek, Marton updated HDDS-1145: --- Status: Patch Available (was: Open) > Add optional web server to the Ozone freon test tool > > > Key: HDDS-1145 > URL: https://issues.apache.org/jira/browse/HDDS-1145 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Tools >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Recently we improved the default HttpServer to support prometheus monitoring > and java profiling. > It would be very useful to enable the same options for freon testing: > 1. We need a simple way to profile freon and check the problems > 2. Long running freons should be monitored > We can create a new optional FreonHttpServer which includes all the required > servlets by default. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14292) Introduce Java ExecutorService to DataXceiverServer
[ https://issues.apache.org/jira/browse/HDFS-14292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773187#comment-16773187 ] BELUGA BEHR edited comment on HDFS-14292 at 2/20/19 4:52 PM: - [~xkrogen] Thanks for the feedback. Well, I don't know the history, but I too seemed thought it was the obvious choice, but I will tell you that it's not easy to do... unit tests are failing and I haven't pinned it down quite yet (ugh), but I think some of the code is expecting that threads are not-reused via ThreadLocal and other mechanisms. I'm still trying to hunt it down exactly, but the unit tests pass when I use a thread pool that does not re-use threads. was (Author: belugabehr): [~xkrogen] Thanks for the feedback. Well, I don't know the history, but I too seemed like it was the obvious choice, but I will tell you that it's not easy to do... unit tests are failing and I haven't pinned it down quite yet (ugh), but I think some of the code is expecting that threads are not-reused via ThreadLocal and other mechanisms. I'm still trying to hunt it down exactly, but the unit tests pass when I use a thread pool that does not re-use threads. > Introduce Java ExecutorService to DataXceiverServer > --- > > Key: HDFS-14292 > URL: https://issues.apache.org/jira/browse/HDFS-14292 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.2.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Major > Attachments: HDFS-14292.1.patch, HDFS-14292.2.patch, > HDFS-14292.3.patch > > > I wanted to investigate {{dfs.datanode.max.transfer.threads}} from > {{hdfs-site.xml}}. It is described as "Specifies the maximum number of > threads to use for transferring data in and out of the DN." The default > value is 4096. I found it interesting because 4096 threads sounds like a lot > to me. I'm not sure how a system with 8-16 cores would react to this large a > thread count. 
Intuitively, I would say that the overhead of context switching would be immense.
> During my investigation, I discovered the [following|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiverServer.java#L203-L216] setup in the {{DataXceiverServer}} class:
> # A peer connects to a DataNode
> # A new thread is spun up to service this connection
> # The thread runs to completion
> # The thread dies
> It would perhaps be better if we used a thread pool to better manage the lifecycle of the service threads and to allow the DataNode to re-use existing threads, saving on the need to create and spin up threads on demand.
> In this JIRA, I have added a couple of things:
> # Added a thread pool to the {{DataXceiverServer}} class that, on demand, will create up to {{dfs.datanode.max.transfer.threads}} threads. A thread that has completed its prior duties will stay idle for up to 60 seconds (configurable); it will be retired if no new work has arrived.
> # Added new methods to the {{Peer}} interface to allow for better logging and less code within each thread ({{DataXceiver}}).
> # Updated the thread code ({{DataXceiver}}) regarding its interactions with the {{blockReceiver}} instance variable
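The pool described in the JIRA (grow on demand up to the transfer-thread limit, retire workers idle for 60 seconds) maps directly onto a stock `ThreadPoolExecutor`. A hypothetical sketch under that reading, not the actual patch:

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class XceiverPoolSketch {
    // Core size 0 + SynchronousQueue: every submission either reuses an
    // idle worker or spins up a new one, up to the maximum; workers idle
    // longer than the keep-alive time are retired.
    static ThreadPoolExecutor newTransferPool(int maxTransferThreads) {
        return new ThreadPoolExecutor(
                0, maxTransferThreads,
                60L, TimeUnit.SECONDS,
                new SynchronousQueue<>());
    }

    public static void main(String[] args) throws Exception {
        // 4096 is the dfs.datanode.max.transfer.threads default.
        ThreadPoolExecutor pool = newTransferPool(4096);
        AtomicInteger served = new AtomicInteger();
        for (int i = 0; i < 8; i++) {
            pool.execute(served::incrementAndGet);
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(served.get()); // 8
    }
}
```

With this configuration an idle DataNode holds no transfer threads at all, while a burst of peers still gets one thread each up to the configured ceiling; the 60-second keep-alive is exactly the "stay idle, then retire" behavior the description asks for.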
[jira] [Work logged] (HDDS-1145) Add optional web server to the Ozone freon test tool
[ https://issues.apache.org/jira/browse/HDDS-1145?focusedWorklogId=201442&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201442 ] ASF GitHub Bot logged work on HDDS-1145: Author: ASF GitHub Bot Created on: 20/Feb/19 16:50 Start Date: 20/Feb/19 16:50 Worklog Time Spent: 10m Work Description: elek commented on pull request #505: HDDS-1145. Add optional web server to the Ozone freon test tool URL: https://github.com/apache/hadoop/pull/505 Recently we improved the default HttpServer to support prometheus monitoring and java profiling. It would be very useful to enable the same options for freon testing: 1. We need a simple way to profile freon and check the problems 2. Long running freons should be monitored We can create a new optional FreonHttpServer which includes all the required servlets by default. See: https://issues.apache.org/jira/browse/HDDS-1145 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201442) Time Spent: 10m Remaining Estimate: 0h > Add optional web server to the Ozone freon test tool > > > Key: HDDS-1145 > URL: https://issues.apache.org/jira/browse/HDDS-1145 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Tools >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Recently we improved the default HttpServer to support prometheus monitoring > and java profiling. > It would be very useful to enable the same options for freon testing: > 1. We need a simple way to profile freon and check the problems > 2. Long running freons should be monitored > We can create a new optional FreonHttpServer which includes all the required > servlets by default. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14292) Introduce Java ExecutorService to DataXceiverServer
[ https://issues.apache.org/jira/browse/HDFS-14292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773187#comment-16773187 ] BELUGA BEHR commented on HDFS-14292: [~xkrogen] Thanks for the feedback. Well, I don't know the history, but I too seemed like it was the obvious choice, but I will tell you that it's not easy to do... unit tests are failing and I haven't pinned it down quite yet (ugh), but I think some of the code is expecting that threads are not-reused via ThreadLocal and other mechanisms. I'm still trying to hunt it down exactly, but the unit tests pass when I use a thread pool that does not re-use threads. > Introduce Java ExecutorService to DataXceiverServer > --- > > Key: HDFS-14292 > URL: https://issues.apache.org/jira/browse/HDFS-14292 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.2.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Major > Attachments: HDFS-14292.1.patch, HDFS-14292.2.patch, > HDFS-14292.3.patch > > > I wanted to investigate {{dfs.datanode.max.transfer.threads}} from > {{hdfs-site.xml}}. It is described as "Specifies the maximum number of > threads to use for transferring data in and out of the DN." The default > value is 4096. I found it interesting because 4096 threads sounds like a lot > to me. I'm not sure how a system with 8-16 cores would react to this large a > thread count. Intuitively, I would say that the overhead of context > switching would be immense. 
> During my investigation, I discovered the [following|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiverServer.java#L203-L216] setup in the {{DataXceiverServer}} class:
> # A peer connects to a DataNode
> # A new thread is spun up to service this connection
> # The thread runs to completion
> # The thread dies
> It would perhaps be better if we used a thread pool to better manage the lifecycle of the service threads and to allow the DataNode to re-use existing threads, saving on the need to create and spin up threads on demand.
> In this JIRA, I have added a couple of things:
> # Added a thread pool to the {{DataXceiverServer}} class that, on demand, will create up to {{dfs.datanode.max.transfer.threads}} threads. A thread that has completed its prior duties will stay idle for up to 60 seconds (configurable); it will be retired if no new work has arrived.
> # Added new methods to the {{Peer}} interface to allow for better logging and less code within each thread ({{DataXceiver}}).
> # Updated the thread code ({{DataXceiver}}) regarding its interactions with the {{blockReceiver}} instance variable
[jira] [Commented] (HDFS-14292) Introduce Java ExecutorService to DataXceiverServer
[ https://issues.apache.org/jira/browse/HDFS-14292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773208#comment-16773208 ] BELUGA BEHR commented on HDFS-14292:
Ya, finally found it. I'm not sure that this is the issue with all of my failing unit tests, but it's a start:
{code:java|title=LocalReplicaInPipeline}
Thread thread = writer.get();
if ((thread == null) || (thread == Thread.currentThread()) ||
    (!thread.isAlive())) {
  if (writer.compareAndSet(thread, null)) {
    return; // Done
  }
  // The writer changed. Go back to the start of the loop and attempt to
  // stop the new writer.
  continue;
}

thread.interrupt();
try {
  thread.join(xceiverStopTimeout);
  if (thread.isAlive()) {
    // Our thread join timed out.
    final String msg = "Join on writer thread " + thread + " timed out";
    DataNode.LOG.warn(msg + "\n" + StringUtils.getStackTrace(thread));
    throw new IOException(msg);
  }
{code}
[https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/LocalReplicaInPipeline.java#L265-L268]
This interrupts the thread and waits for it to die ({{join}}); but now I've got a thread pool in my branch, so the thread is re-used and does not die. The code waits here until the timeout and then fails. I'm not sure what the fix is, but I'm thinking about it.
> Introduce Java ExecutorService to DataXceiverServer
> ---
>
> Key: HDFS-14292
> URL: https://issues.apache.org/jira/browse/HDFS-14292
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 3.2.0
> Reporter: BELUGA BEHR
> Assignee: BELUGA BEHR
> Priority: Major
> Attachments: HDFS-14292.1.patch, HDFS-14292.2.patch, HDFS-14292.3.patch
>
> I wanted to investigate {{dfs.datanode.max.transfer.threads}} from {{hdfs-site.xml}}. It is described as "Specifies the maximum number of threads to use for transferring data in and out of the DN." The default value is 4096.
I found it interesting because 4096 threads sounds like a lot to me. I'm not sure how a system with 8-16 cores would react to this large a thread count. Intuitively, I would say that the overhead of context switching would be immense.
> During my investigation, I discovered the [following|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiverServer.java#L203-L216] setup in the {{DataXceiverServer}} class:
> # A peer connects to a DataNode
> # A new thread is spun up to service this connection
> # The thread runs to completion
> # The thread dies
> It would perhaps be better if we used a thread pool to better manage the lifecycle of the service threads and to allow the DataNode to re-use existing threads, saving on the need to create and spin up threads on demand.
> In this JIRA, I have added a couple of things:
> # Added a thread pool to the {{DataXceiverServer}} class that, on demand, will create up to {{dfs.datanode.max.transfer.threads}} threads. A thread that has completed its prior duties will stay idle for up to 60 seconds (configurable); it will be retired if no new work has arrived.
> # Added new methods to the {{Peer}} interface to allow for better logging and less code within each thread ({{DataXceiver}}).
> # Updated the thread code ({{DataXceiver}}) regarding its interactions with the {{blockReceiver}} instance variable
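The failure mode described above, where `join()` on a pooled worker times out because an interrupted task ends but its thread goes back to the pool, can be reproduced with nothing but the JDK. A hypothetical illustration, not Hadoop code:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

public class PooledJoinSketch {
    // Returns true when join() on the pooled worker times out even though
    // the task was interrupted, i.e. the thread survived its task.
    static boolean pooledThreadSurvivesJoin() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        AtomicReference<Thread> writer = new AtomicReference<>();
        CountDownLatch started = new CountDownLatch(1);
        pool.execute(() -> {
            writer.set(Thread.currentThread());
            started.countDown();
            try {
                Thread.sleep(Long.MAX_VALUE); // stand-in for the writer loop
            } catch (InterruptedException e) {
                // The *task* ends here, but the pooled thread goes back
                // to the pool; it does not die.
            }
        });
        started.await();
        Thread t = writer.get();
        t.interrupt();
        t.join(500); // the old stopWriter() pattern: wait for the thread to die
        boolean alive = t.isAlive();
        pool.shutdownNow();
        return alive;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(pooledThreadSurvivesJoin()); // true
    }
}
```

This is exactly why the `LocalReplicaInPipeline` snippet above hits its timeout: with pooled threads, "writer stopped" has to be signaled by task completion (a latch or a `Future`) rather than by thread death.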
[jira] [Updated] (HDFS-14299) ViewFs: Error message when operation on mount point
[ https://issues.apache.org/jira/browse/HDFS-14299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-14299:
---
Summary: ViewFs: Error message when operation on mount point (was: Error message when operation mount point on ViewFs)
> ViewFs: Error message when operation on mount point
> ---
>
> Key: HDFS-14299
> URL: https://issues.apache.org/jira/browse/HDFS-14299
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: federation
> Affects Versions: 3.1.2
> Reporter: hu xiaodong
> Assignee: hu xiaodong
> Priority: Minor
> Attachments: .PNG, HDFS-14299.001.patch
>
> When an error occurs while operating on a mount point, the error message looks like this:
> !.PNG!
> I think a separator should be included between "operation" and "Path".
[jira] [Updated] (HDFS-14299) Error message when operation mount point on ViewFs
[ https://issues.apache.org/jira/browse/HDFS-14299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-14299:
---
Summary: Error message when operation mount point on ViewFs (was: error message when Operation mount point)
> Error message when operation mount point on ViewFs
> --
>
> Key: HDFS-14299
> URL: https://issues.apache.org/jira/browse/HDFS-14299
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: federation
> Affects Versions: 3.1.2
> Reporter: hu xiaodong
> Assignee: hu xiaodong
> Priority: Minor
> Attachments: .PNG, HDFS-14299.001.patch
>
> When an error occurs while operating on a mount point, the error message looks like this:
> !.PNG!
> I think a separator should be included between "operation" and "Path".
[jira] [Commented] (HDFS-14259) RBF: Fix safemode message for Router
[ https://issues.apache.org/jira/browse/HDFS-14259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773227#comment-16773227 ] Íñigo Goiri commented on HDFS-14259:
Thanks [~RANith], in addition to fixing the checkstyle, I think we can make the messages a little more intuitive:
{code}
assertTrue("Wrong safe mode message: " + safeModeMsg,
    safeModeMsg.startsWith("Safe mode is ON."));
...
assertEquals("Wrong safe mode message: " + safeModeMsg, "", safeModeMsg);
{code}
Let's also make the line breaks consistent.
> RBF: Fix safemode message for Router
>
> Key: HDFS-14259
> URL: https://issues.apache.org/jira/browse/HDFS-14259
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Íñigo Goiri
> Assignee: Ranith Sardar
> Priority: Major
> Attachments: HDFS-14259-HDFS-13891.000.patch, HDFS-14259-HDFS-13891.001.patch
>
> Currently, the {{getSafemode()}} bean checks the state of the Router but returns the error if the status is different than SAFEMODE:
> {code}
> public String getSafemode() {
>   try {
>     if (!getRouter().isRouterState(RouterServiceState.SAFEMODE)) {
>       return "Safe mode is ON. " + this.getSafeModeTip();
>     }
>   } catch (IOException e) {
>     return "Failed to get safemode status. Please check router"
>         + "log for more detail.";
>   }
>   return "";
> }
> {code}
> The condition should be reversed.
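The reversed condition the JIRA asks for can be sketched against a stand-in router. Everything here is hypothetical scaffolding (`RouterStub`, the `State` enum); only the flipped `isRouterState` check mirrors the proposed fix:

```java
public class SafemodeSketch {
    enum State { SAFEMODE, RUNNING }

    // Minimal stand-in for the Router; the real one can also throw IOException.
    static class RouterStub {
        final State state;
        RouterStub(State state) { this.state = state; }
        boolean isRouterState(State s) { return state == s; }
    }

    // Fixed version: report "Safe mode is ON" only when the Router
    // actually IS in SAFEMODE (the original negated the check).
    static String getSafemode(RouterStub router) {
        if (router.isRouterState(State.SAFEMODE)) {
            return "Safe mode is ON.";
        }
        return "";
    }

    public static void main(String[] args) {
        System.out.println(getSafemode(new RouterStub(State.SAFEMODE)));
        System.out.println(getSafemode(new RouterStub(State.RUNNING)).isEmpty()); // true
    }
}
```

The two assertions Íñigo suggests above fall straight out of this: a SAFEMODE router's message starts with "Safe mode is ON." and any other state yields the empty string.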
[jira] [Commented] (HDFS-14244) refactor the libhdfs++ build system
[ https://issues.apache.org/jira/browse/HDFS-14244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773238#comment-16773238 ] James Clampffer commented on HDFS-14244: [~owen.omalley] anything that trims down the libhdfspp/third-party code sounds good to me. I tried applying the patch and ran into a couple issues doing the build in the container start_build_env.sh uses. -It looks like the dockerfile used by start_build_env.sh needs to install automake and autoconf. Doing a clean build in the container errors out when it tries to run those. After installing I make it to the next issue. -It looks like there's a dependency issue when building the whole tree at once. Things that use the minidfscluster wrapper code complain about missing symbols (e.g. nmdCreate). It looks like libhdfs.so has been built by the time it hits the error but cmake isn't picking up the library path while building tests. -I haven't made it far enough but it seems like the container will also need asio installed, or is this something the cmake external project approach handles? > refactor the libhdfs++ build system > --- > > Key: HDFS-14244 > URL: https://issues.apache.org/jira/browse/HDFS-14244 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs++, hdfs-client >Reporter: Owen O'Malley >Assignee: Owen O'Malley >Priority: Major > > The current cmake for libhdfs++ has the source code for the dependent > libraries. By refactoring we can remove 150kloc of third party code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13972) RBF: Support for Delegation Token (WebHDFS)
[ https://issues.apache.org/jira/browse/HDFS-13972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773242#comment-16773242 ] Íñigo Goiri commented on HDFS-13972: Thanks [~crh] for the update; this looks pretty self-contained. * I see this includes HDFS-14052. * Extend javadoc for the new methods in RouterSecurityManager. * Why do we need the Router to implement the TokenVerifier? Where is this used? * The changes in NamenodeWebHdfsMethods can be done in a separate JIRA to regular HDFS. * The log in {{RouterSecurityManager#verifyToken()}} should use logger style {}. * As you mentioned, it needs some tests. > RBF: Support for Delegation Token (WebHDFS) > --- > > Key: HDFS-13972 > URL: https://issues.apache.org/jira/browse/HDFS-13972 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: CR Hota >Priority: Major > Attachments: HDFS-13972-HDFS-13891.001.patch, > HDFS-13972-HDFS-13891.002.patch > > > HDFS Router should support issuing HDFS delegation tokens through WebHDFS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-891) Create customized yetus personality for ozone
[ https://issues.apache.org/jira/browse/HDDS-891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773255#comment-16773255 ] Allen Wittenauer commented on HDDS-891: --- 1. bq. If I understood well, you suggest to double quote the $DOCKER_INTERACTIVE_RUN variable in the docker run line. It's not even that. To do what you are wanting to do should really just be an extra flag to set the extra options rather than a full-blown 'set it from the outside' variable. bq. But please let me know if I am wrong. Let me do one better and give you an example. DOCKER_INTERACTIVE_RUN opens the door for users to set command line options to docker. Most notably, -c and -v and a few others that share one particular characteristic: they reference the file system. As soon as shell code hits the file system, it is no longer safe to assume space delimited options. In other words, -c /My Cool Filesystem/Docker Files/config.json or -v /c_drive/Program Files/Data:/data may be something a user wants to do, but the script now breaks because of the IFS assumptions. This bug is exactly why shellcheck is correctly flagging it as busted code. 2. bq. running docker based acceptance tests, While it is not well tested, it should be doable with 0.9.0+ with --dockerind mode. If external volumes are required to be mounted, things might get wonky though. Just be aware that users and other ASF build server patrons get annoyed when jobs take too long during peak hours. precommit response should be quick checks with better, post-commit checks happening at full build time. If they can't be triggered from maven test, then a custom test needs to be defined, either in the personality (recommended) or in the --user-plugins directory (not recommended, mainly because people will forget to set this option when they run test-patch interactively). bq. run ALL the hdds/ozone unit tests all the time, not just for the changed projects See above. 
Full tests are run as part of the nightlies due to time constraints.
bq. check ALL the findbugs/checkstyle issues not just the new ones
For findbugs, that's the --findbugs-strict-precheck option which AFAIK most/all of the Hadoop jobs have enabled. It will fail the patch if there are pre-existing findbugs issues. Adding a similar option to checkstyle wouldn't be hard, but a reminder that this info is also presented in the nightlies. Also, if the source tree is already clean, then new checkstyle failures should technically be 'all' already. Experience has shown, though, that users tend to blow right past precheck failures and commit code anyway. [Hell, many PMC members ignore errors that their own patches generated, blaming the Jenkins nodes when it's pretty clear that their Java code has e.g., javadoc errors.]
3. bq. I am convinced to run the more strict tests in addition to the existing yetus tests.
It sounds like everything you want is either already there or is fairly trivial to implement.
bq. Please let me know If I can do something to get Yetus results for the PR-s.
I think [~ste...@apache.org] just needs to edit the user/pw for the hadoop-multibranch job credentials and get HADOOP-16035 committed. I do practically zero Java these days, so it may not be 100% and probably needs a few more tweaks after it is implemented. (A definite flaw with Jenkins' multibranch pipelines.)
> Create customized yetus personality for ozone
> -
>
> Key: HDDS-891
> URL: https://issues.apache.org/jira/browse/HDDS-891
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Reporter: Elek, Marton
> Assignee: Elek, Marton
> Priority: Major
>
> Ozone pre commit builds (such as https://builds.apache.org/job/PreCommit-HDDS-Build/) use the official hadoop personality from the yetus personality.
> Yetus personalities are bash scripts which contain personalization for specific builds.
> The hadoop personality tries to identify which project should be built and > use partial build to build only the required subprojects because the full > build is very time consuming. > But in Ozone: > 1.) The build + unit tests are very fast > 2.) We don't need all the checks (for example the hadoop specific shading > test) > 3.) We prefer to do a full build and full unit test for hadoop-ozone and > hadoop-hdds subrojects (for example the hadoop-ozone integration test always > should be executed as it contains many generic unit test) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
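Allen's IFS point above can be sidestepped entirely with a bash array: each element expands as its own word, so flags whose values contain spaces survive intact. A sketch under the assumption that the script controls how the extra docker options are collected (the variable name `docker_run_opts` is hypothetical, not from the actual start-build-env.sh):

```shell
#!/usr/bin/env bash
# Collect optional docker flags in an array instead of one unquoted string.
docker_run_opts=(-i -t)                                    # the defaults
docker_run_opts+=(-v "/c_drive/Program Files/Data:/data")  # space-safe

# "${docker_run_opts[@]}" expands to exactly one word per element,
# so a later `docker run "${docker_run_opts[@]}" ...` keeps the path whole.
echo "${#docker_run_opts[@]}"          # 4 words: -i, -t, -v, and the volume spec
printf '%s\n' "${docker_run_opts[3]}"
```

With an unquoted scalar, the same volume spec would be split on the space into two broken arguments; the array form is also what shellcheck's SC2086 guidance recommends for optional flag lists.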
[jira] [Commented] (HDFS-14272) [SBN read] HDFS command line tools does not guarantee consistency
[ https://issues.apache.org/jira/browse/HDFS-14272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773262#comment-16773262 ] Erik Krogen commented on HDFS-14272:
[~shv], [~vagarychen], [~csun], I'd appreciate it if any of you could help review.
> [SBN read] HDFS command line tools does not guarantee consistency
> -
>
> Key: HDFS-14272
> URL: https://issues.apache.org/jira/browse/HDFS-14272
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: tools
> Environment: CDH6.1 (Hadoop 3.0.x) + Consistency Reads from Standby + SSL + Kerberos + RPC encryption
> Reporter: Wei-Chiu Chuang
> Assignee: Erik Krogen
> Priority: Major
> Attachments: HDFS-14272.000.patch, HDFS-14272.001.patch
>
> It is typical for integration tests to create some files and then check their existence. For example, like the following simple bash script:
> {code:java}
> # hdfs dfs -touchz /tmp/abc
> # hdfs dfs -ls /tmp/abc
> {code}
> The test executes the HDFS bash commands sequentially, but it may fail with Consistent Standby Read because the -ls does not find the file.
> Analysis: the second bash command, while launched sequentially after the first one, is not aware of the state id returned from the first bash command, so the ObserverNode wouldn't wait for the edits to get propagated, and thus fails.
> I've got a cluster where the Observer has tens of seconds of RPC latency, and this becomes very annoying. (I am still trying to figure out why this Observer has such a long RPC latency. But that's another story.)
[jira] [Updated] (HDFS-14272) [SBN read] ObserverReadProxyProvider should sync with active txnID on startup
[ https://issues.apache.org/jira/browse/HDFS-14272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen updated HDFS-14272: --- Summary: [SBN read] ObserverReadProxyProvider should sync with active txnID on startup (was: [SBN read] HDFS command line tools does not guarantee consistency) > [SBN read] ObserverReadProxyProvider should sync with active txnID on startup > - > > Key: HDFS-14272 > URL: https://issues.apache.org/jira/browse/HDFS-14272 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools > Environment: CDH6.1 (Hadoop 3.0.x) + Consistency Reads from Standby + > SSL + Kerberos + RPC encryption >Reporter: Wei-Chiu Chuang >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14272.000.patch, HDFS-14272.001.patch > > > It is typical for integration tests to create some files and then check their > existence. For example, like the following simple bash script: > {code:java} > # hdfs dfs -touchz /tmp/abc > # hdfs dfs -ls /tmp/abc > {code} > The test executes HDFS bash command sequentially, but it may fail with > Consistent Standby Read because the -ls does not find the file. > Analysis: the second bash command, while launched sequentially after the > first one, is not aware of the state id returned from the first bash command. > So ObserverNode wouldn't wait for the the edits to get propagated, and thus > fails. > I've got a cluster where the Observer has tens of seconds of RPC latency, and > this becomes very annoying. (I am still trying to figure out why this > Observer has such a long RPC latency. But that's another story.) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14272) [SBN read] ObserverReadProxyProvider should sync with active txnID on startup
[ https://issues.apache.org/jira/browse/HDFS-14272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773262#comment-16773262 ] Erik Krogen edited comment on HDFS-14272 at 2/20/19 6:37 PM: - [~shv], [~vagarychen], [~csun], I'd appreciate it if any of you could help review. [~jojochuang], I changed the title to reflect the broader change I would like to make as a fix for your original issue; let me know if you have any concerns with this was (Author: xkrogen): [~shv], [~vagarychen], [~csun], appreciate if any of you can help review > [SBN read] ObserverReadProxyProvider should sync with active txnID on startup > - > > Key: HDFS-14272 > URL: https://issues.apache.org/jira/browse/HDFS-14272 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools > Environment: CDH6.1 (Hadoop 3.0.x) + Consistency Reads from Standby + > SSL + Kerberos + RPC encryption >Reporter: Wei-Chiu Chuang >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14272.000.patch, HDFS-14272.001.patch > > > It is typical for integration tests to create some files and then check their > existence. For example, the following simple bash script: > {code:java} > # hdfs dfs -touchz /tmp/abc > # hdfs dfs -ls /tmp/abc > {code} > The test executes the HDFS bash commands sequentially, but it may fail with > Consistent Standby Read because the -ls does not find the file. > Analysis: the second bash command, while launched sequentially after the > first one, is not aware of the state id returned from the first bash command. > So ObserverNode wouldn't wait for the edits to get propagated, and thus > fails. > I've got a cluster where the Observer has tens of seconds of RPC latency, and > this becomes very annoying. (I am still trying to figure out why this > Observer has such a long RPC latency. But that's another story.) 
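The retitled fix can be illustrated with a toy state-id model: if a client syncs with the Active NameNode's latest transaction id once at startup, an Observer that lags behind must catch up before it may serve that client. This is a hedged sketch with invented class names and numbers, not the actual Hadoop classes:

```java
// Toy model of "sync with active txnID on startup": a client that learns
// the Active's txid first can never be served pre-existing state it has
// already "seen". Numbers and names are illustrative only.
class StateIdSketch {
    long activeTxId = 100;   // latest transaction id on the Active NN
    long observerTxId = 90;  // the Observer has not replayed that far yet
    long clientSeenTxId;     // what the client believes it has seen

    void clientStartup() {
        // Without this sync, a fresh CLI process starts at 0 and the
        // Observer happily serves it state that predates the touchz.
        clientSeenTxId = activeTxId;
    }

    boolean observerCanServe() {
        // An Observer only answers once it has replayed far enough.
        return observerTxId >= clientSeenTxId;
    }
}
```

In the bash example from the description, the second `hdfs dfs -ls` process would sync to the Active's txid at startup and therefore wait for (or bypass) the stale Observer instead of missing the file.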
[jira] [Created] (HDFS-14302) Refactor NameNodeWebHdfsMethods to allow better extensibility 2
CR Hota created HDFS-14302: -- Summary: Refactor NameNodeWebHdfsMethods to allow better extensibility 2 Key: HDFS-14302 URL: https://issues.apache.org/jira/browse/HDFS-14302 Project: Hadoop HDFS Issue Type: Improvement Reporter: CR Hota Assignee: CR Hota Refactor NameNodeWebHdfsMethods to allow components such as hdfs routers to extend/override methods cleanly. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
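Refactorings like this usually amount to widening method visibility so a subclass (here, a Router-side implementation) can override selected hooks. A purely illustrative sketch of that pattern — the class and method names below are invented stand-ins, not taken from the patch:

```java
// Illustrative only: the protected-hook pattern such a refactor enables.
// These are simplified stand-ins, not the Hadoop sources.
class WebHdfsMethodsBase {
    // A subclass can override how the target host for a path is chosen.
    protected String chooseDatanodeHost(String path) {
        return "datanode.example.com";
    }

    public String redirectFor(String path) {
        return "http://" + chooseDatanodeHost(path) + "/webhdfs/v1" + path;
    }
}

class RouterWebHdfsMethods extends WebHdfsMethodsBase {
    @Override
    protected String chooseDatanodeHost(String path) {
        // A router would resolve the owning subcluster first;
        // hard-coded here for illustration.
        return "router-resolved.example.com";
    }
}
```

The base class keeps the request flow; the Router overrides only the resolution step, which is the "extend/override methods cleanly" goal of the Jira.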
[jira] [Updated] (HDFS-14279) [SBN Read] Race condition in ObserverReadProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-14279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen updated HDFS-14279: --- Attachment: HDFS-14279.001.patch > [SBN Read] Race condition in ObserverReadProxyProvider > -- > > Key: HDFS-14279 > URL: https://issues.apache.org/jira/browse/HDFS-14279 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, namenode >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14279.000.patch, HDFS-14279.001.patch > > > There is a race condition in {{ObserverReadProxyProvider#getCurrentProxy()}}: > {code} > private NNProxyInfo getCurrentProxy() { > if (currentProxy == null) { > changeProxy(null); > } > return currentProxy; > } > {code} > {{currentProxy}} is a {{volatile}}. Another {{changeProxy()}} could occur > after the {{changeProxy()}} and before the {{return}}, thus making the return > value incorrect. I have seen this result in an NPE. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14279) [SBN Read] Race condition in ObserverReadProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-14279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773276#comment-16773276 ] Erik Krogen commented on HDFS-14279: I ran some benchmarks offline and found that the use of {{volatile}} instead of {{synchronized}} wasn't helping performance at all. I ran with a client with 100/1000/5000 threads all doing {{getFileInfo()}} via the ObserverReadProxyProvider with my v000 (using volatile) and v001 (using synchronized always) patches and found no substantial speedup from v000. So, go with the simpler approach of always synchronizing. Attaching v001 patch accordingly. > [SBN Read] Race condition in ObserverReadProxyProvider > -- > > Key: HDFS-14279 > URL: https://issues.apache.org/jira/browse/HDFS-14279 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, namenode >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14279.000.patch, HDFS-14279.001.patch > > > There is a race condition in {{ObserverReadProxyProvider#getCurrentProxy()}}: > {code} > private NNProxyInfo getCurrentProxy() { > if (currentProxy == null) { > changeProxy(null); > } > return currentProxy; > } > {code} > {{currentProxy}} is a {{volatile}}. Another {{changeProxy()}} could occur > after the {{changeProxy()}} and before the {{return}}, thus making the return > value incorrect. I have seen this result in an NPE. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
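The "always synchronize" approach described above can be sketched as follows — a simplified stand-in for the class, not the v001 patch itself:

```java
// Minimal sketch of the v001 direction: both the read and the change
// path take the same lock, so a reader can never observe a proxy that a
// concurrent changeProxy() is replacing. NNProxyInfo is simplified.
class ObserverProxySketch {
    static class NNProxyInfo {
        final String address;
        NNProxyInfo(String address) { this.address = address; }
    }

    private NNProxyInfo currentProxy;  // now guarded by "this", not volatile
    private int changeCount = 0;

    synchronized NNProxyInfo getCurrentProxy() {
        if (currentProxy == null) {
            changeProxy(null);
        }
        // Holding the lock for the whole method closes the window between
        // the null check and the return that caused the NPE.
        return currentProxy;
    }

    synchronized void changeProxy(NNProxyInfo oldProxy) {
        changeCount++;
        currentProxy = new NNProxyInfo("nn-" + changeCount);
    }
}
```

The benchmark result quoted in the comment is the reason this is acceptable: with no measurable win from `volatile`, the simpler locking discipline is preferable.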
[jira] [Updated] (HDFS-14302) Refactor NameNodeWebHdfsMethods to allow better extensibility 2
[ https://issues.apache.org/jira/browse/HDFS-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] CR Hota updated HDFS-14302: --- Attachment: HDFS-14302.001.patch > Refactor NameNodeWebHdfsMethods to allow better extensibility 2 > --- > > Key: HDFS-14302 > URL: https://issues.apache.org/jira/browse/HDFS-14302 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: CR Hota >Assignee: CR Hota >Priority: Major > Attachments: HDFS-14302.001.patch > > > Refactor NameNodeWebHdfsMethods to allow components such as hdfs routers to > extend/override methods cleanly. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14298) Improve log messages of ECTopologyVerifier
[ https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773287#comment-16773287 ] Hadoop QA commented on HDFS-14298: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 42s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 43s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 23s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 99m 12s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}160m 32s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestRollingUpgrade | | | hadoop.hdfs.TestSafeMode | | | hadoop.hdfs.qjournal.server.TestJournalNodeSync | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14298 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12959456/HDFS-14298.003.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 385cd2ce548a 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / aa3ad36 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/26271/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/26271/testReport/ | | Max. process+thread count | 3174 (vs. ulimit of 1) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-B
[jira] [Commented] (HDDS-1060) Token: Add api to get OM certificate from SCM
[ https://issues.apache.org/jira/browse/HDDS-1060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773290#comment-16773290 ] Xiaoyu Yao commented on HDDS-1060: -- Thanks [~ajayydv] for the update. Patch v3 LGTM. I'd like to have unit tests to cover the new RPC message getCertificate(). We can address that in the follow up ticket. +1 for the v3 and I will commit it shortly. > Token: Add api to get OM certificate from SCM > - > > Key: HDDS-1060 > URL: https://issues.apache.org/jira/browse/HDDS-1060 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Labels: Blocker, Security > Fix For: 0.4.0 > > Attachments: HDDS-1060.00.patch, HDDS-1060.01.patch, > HDDS-1060.02.patch, HDDS-1060.03.patch, HDDS-1060.04.patch > > > Datanodes/OM need OM certificate to validate block tokens and delegation > tokens. > Add API for: > 1. getCertificate(String certSerialId): To get certificate from SCM based on > certificate serial id. > 2. getCACertificate(): To get CA certificate. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
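The two calls named in the description could take roughly this shape; the signatures and return types below are illustrative guesses (PEM strings instead of real certificate objects), not the committed API:

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical shape of the SCM security additions described above.
interface ScmSecuritySketch {
    /** Fetch a certificate by serial id, e.g. for block-token checks. */
    String getCertificate(String certSerialId) throws IOException;

    /** Fetch the SCM CA certificate used to validate the chain. */
    String getCACertificate() throws IOException;
}

// In-memory stand-in showing how OM/Datanodes would consume the API
// when validating block tokens and delegation tokens.
class InMemoryScmSecurity implements ScmSecuritySketch {
    private final Map<String, String> store = new HashMap<>();

    void register(String serialId, String pem) { store.put(serialId, pem); }

    @Override
    public String getCertificate(String certSerialId) throws IOException {
        String pem = store.get(certSerialId);
        if (pem == null) throw new IOException("unknown serial " + certSerialId);
        return pem;
    }

    @Override
    public String getCACertificate() { return "-----CA CERT-----"; }
}
```

A unit test along these lines (exercising both the hit and the unknown-serial path) is exactly the coverage the comment asks to add in the follow-up ticket.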
[jira] [Updated] (HDFS-13894) Access HDFS through a proxy and natively
[ https://issues.apache.org/jira/browse/HDFS-13894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-13894: --- Status: Patch Available (was: Open) > Access HDFS through a proxy and natively > > > Key: HDFS-13894 > URL: https://issues.apache.org/jira/browse/HDFS-13894 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-13894.000.patch > > > HDFS deployments are usually behind a firewall where one can access the > Namenode but not the Datanodes. To mitigate this situation there are proxies > that catch the DN requests (e.g., HttpFS). However, if a user submits a job > using the HttpFS endpoint, all the workers will use that endpoint, which will > usually be a bottleneck. > We should create a new filesystem that supports accessing both: > * HttpFS for submission from outside the firewall > * HDFS from within the cluster -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14249) RBF: Tooling to identify the subcluster location of a file
[ https://issues.apache.org/jira/browse/HDFS-14249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated HDFS-14249: Resolution: Fixed Status: Resolved (was: Patch Available) > RBF: Tooling to identify the subcluster location of a file > -- > > Key: HDFS-14249 > URL: https://issues.apache.org/jira/browse/HDFS-14249 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-14249-HDFS-13891.000.patch, > HDFS-14249-HDFS-13891.001.patch, HDFS-14249-HDFS-13891.002.patch > > > Mount points can spread files across multiple subclusters depending on a > policy (e.g., HASH, HASH_ALL). Administrators would need a way to identify > the location. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14249) RBF: Tooling to identify the subcluster location of a file
[ https://issues.apache.org/jira/browse/HDFS-14249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773294#comment-16773294 ] Giovanni Matteo Fumarola commented on HDFS-14249: - LGTM +1. Committed to the branch. Thanks [~elgoiri] for working on this and [~ayushtkn] for reviewing it. > RBF: Tooling to identify the subcluster location of a file > -- > > Key: HDFS-14249 > URL: https://issues.apache.org/jira/browse/HDFS-14249 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-14249-HDFS-13891.000.patch, > HDFS-14249-HDFS-13891.001.patch, HDFS-14249-HDFS-13891.002.patch > > > Mount points can spread files across multiple subclusters depending on a > policy (e.g., HASH, HASH_ALL). Administrators would need a way to identify > the location. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14302) Refactor NameNodeWebHdfsMethods to allow better extensibility 2
[ https://issues.apache.org/jira/browse/HDFS-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] CR Hota updated HDFS-14302: --- Status: Patch Available (was: Open) > Refactor NameNodeWebHdfsMethods to allow better extensibility 2 > --- > > Key: HDFS-14302 > URL: https://issues.apache.org/jira/browse/HDFS-14302 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: CR Hota >Assignee: CR Hota >Priority: Major > Attachments: HDFS-14302.001.patch > > > Refactor NameNodeWebHdfsMethods to allow components such as hdfs routers to > extend/override methods cleanly. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1060) Add API to get OM certificate from SCM CA
[ https://issues.apache.org/jira/browse/HDDS-1060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDDS-1060: - Summary: Add API to get OM certificate from SCM CA (was: Token: Add api to get OM certificate from SCM) > Add API to get OM certificate from SCM CA > - > > Key: HDDS-1060 > URL: https://issues.apache.org/jira/browse/HDDS-1060 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Labels: Blocker, Security > Fix For: 0.4.0 > > Attachments: HDDS-1060.00.patch, HDDS-1060.01.patch, > HDDS-1060.02.patch, HDDS-1060.03.patch, HDDS-1060.04.patch > > > Datanodes/OM need OM certificate to validate block tokens and delegation > tokens. > Add API for: > 1. getCertificate(String certSerialId): To get certificate from SCM based on > certificate serial id. > 2. getCACertificate(): To get CA certificate. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1109) Setup Failover Proxy Provider for client
[ https://issues.apache.org/jira/browse/HDDS-1109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanisha Koneru updated HDDS-1109: - Description: We need to implement an OM Proxy Provider for the RPC Client. This Jira adds support for configuring one OM proxy on the client for each OM Node in the cluster. Client Request Retry and Failover implementation will be handled separately in another Jira. (was: We need to implement a OM Proxy Provider for the RPC Client. This Jira adds support for configuring one OM proxy on the client for each OM Node in the cluster. This patch does not include Client Request Retry and Failover implementation.) > Setup Failover Proxy Provider for client > > > Key: HDDS-1109 > URL: https://issues.apache.org/jira/browse/HDDS-1109 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Major > Attachments: HDDS-1109.001.patch, HDDS-1109.002.patch > > > We need to implement an OM Proxy Provider for the RPC Client. This Jira adds > support for configuring one OM proxy on the client for each OM Node in the > cluster. Client Request Retry and Failover implementation will be handled > separately in another Jira. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1109) Setup Failover Proxy Provider for client
[ https://issues.apache.org/jira/browse/HDDS-1109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanisha Koneru updated HDDS-1109: - Description: We need to implement an OM Proxy Provider for the RPC Client. This Jira adds support for configuring one OM proxy on the client for each OM Node in the cluster. This patch does not include Client Request Retry and Failover implementation. (was: We need to implement a OM Proxy Provider for the RPC Client. This Jira adds support for ) > Setup Failover Proxy Provider for client > > > Key: HDDS-1109 > URL: https://issues.apache.org/jira/browse/HDDS-1109 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Major > Attachments: HDDS-1109.001.patch, HDDS-1109.002.patch > > > We need to implement an OM Proxy Provider for the RPC Client. This Jira adds > support for configuring one OM proxy on the client for each OM Node in the > cluster. This patch does not include Client Request Retry and Failover > implementation. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
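The per-OM-node proxy setup described here — with the retry/failover policy deliberately deferred to a later Jira — can be sketched as a toy provider; all names below are invented, not the HDDS patch:

```java
// Toy failover proxy provider: one proxy slot per configured OM node,
// with a trivial round-robin failover hook. The real retry policy is out
// of scope here, matching the Jira split described above.
class OMProxySketch {
    private final String[] omNodes;  // one entry per OM node in the cluster
    private int current = 0;

    OMProxySketch(String... omNodes) {
        if (omNodes.length == 0) throw new IllegalArgumentException("no OM nodes");
        this.omNodes = omNodes;
    }

    /** The proxy the client is currently talking to. */
    String getProxyAddress() { return omNodes[current]; }

    /** Advance to the next configured OM node. */
    void performFailover() { current = (current + 1) % omNodes.length; }
}
```

Configuring one proxy per OM node up front is what lets a later retry policy fail over without re-reading configuration.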
[jira] [Created] (HDDS-1146) Adding container related metrics in SCM
Nanda kumar created HDDS-1146: - Summary: Adding container related metrics in SCM Key: HDDS-1146 URL: https://issues.apache.org/jira/browse/HDDS-1146 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: SCM Reporter: Nanda kumar This jira aims to add more container related metrics to SCM. Following metrics will be added as part of this jira: * Number of containers * Number of open containers * Number of closed containers * Number of quasi closed containers * Number of closing containers * Number of successful create container calls * Number of failed create container calls * Number of successful delete container calls * Number of failed delete container calls * Number of successful container report processing * Number of failed container report processing * Number of successful incremental container report processing * Number of failed incremental container report processing -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-1146) Adding container related metrics in SCM
[ https://issues.apache.org/jira/browse/HDDS-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar reassigned HDDS-1146: - Assignee: Supratim Deka > Adding container related metrics in SCM > --- > > Key: HDDS-1146 > URL: https://issues.apache.org/jira/browse/HDDS-1146 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Nanda kumar >Assignee: Supratim Deka >Priority: Major > > This jira aims to add more container related metrics to SCM. > Following metrics will be added as part of this jira: > * Number of containers > * Number of open containers > * Number of closed containers > * Number of quasi closed containers > * Number of closing containers > * Number of successful create container calls > * Number of failed create container calls > * Number of successful delete container calls > * Number of failed delete container calls > * Number of successful container report processing > * Number of failed container report processing > * Number of successful incremental container report processing > * Number of failed incremental container report processing -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
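The listed counters map naturally onto a metrics holder with one monotonically increasing counter per event. Below is a minimal sketch using plain AtomicLongs rather than Hadoop's metrics2 annotations; the field names are hypothetical, not what the eventual SCM patch will use:

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative container-metrics holder covering a subset of the
// counters listed in the Jira; names are invented for this sketch.
class ScmContainerMetricsSketch {
    final AtomicLong numContainers = new AtomicLong();
    final AtomicLong numOpenContainers = new AtomicLong();
    final AtomicLong successfulCreateCalls = new AtomicLong();
    final AtomicLong failedCreateCalls = new AtomicLong();

    /** Record one create-container call and its outcome. */
    void onCreateContainer(boolean success) {
        if (success) {
            numContainers.incrementAndGet();
            numOpenContainers.incrementAndGet();
            successfulCreateCalls.incrementAndGet();
        } else {
            failedCreateCalls.incrementAndGet();
        }
    }
}
```

The remaining counters (closed/quasi-closed/closing containers, delete calls, container-report processing) would follow the same pattern: one counter per state or outcome, bumped at the event site.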
[jira] [Updated] (HDFS-14081) hdfs dfsadmin -metasave metasave_test results NPE
[ https://issues.apache.org/jira/browse/HDFS-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shweta updated HDFS-14081: -- Attachment: HDFS-14081.010.patch > hdfs dfsadmin -metasave metasave_test results NPE > - > > Key: HDFS-14081 > URL: https://issues.apache.org/jira/browse/HDFS-14081 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.2.1 >Reporter: Shweta >Assignee: Shweta >Priority: Major > Attachments: HDFS-14081.001.patch, HDFS-14081.002.patch, > HDFS-14081.003.patch, HDFS-14081.004.patch, HDFS-14081.005.patch, > HDFS-14081.006.patch, HDFS-14081.007.patch, HDFS-14081.008.patch, > HDFS-14081.009.patch, HDFS-14081.010.patch > > > Race condition is encountered while adding Block to > postponedMisreplicatedBlocks which in turn tried to retrieve Block from > BlockManager in which it may not be present. > This happens in HA, metasave in first NN succeeded but failed in second NN, > StackTrace showing NPE is as follows: > {code} > 2018-07-12 21:39:09,783 WARN org.apache.hadoop.ipc.Server: IPC Server handler > 24 on 8020, call Call#1 Retry#0 > org.apache.hadoop.hdfs.protocol.ClientProtocol.metaSave from > 172.26.9.163:602342018-07-12 21:39:09,783 WARN org.apache.hadoop.ipc.Server: > IPC Server handler 24 on 8020, call Call#1 Retry#0 > org.apache.hadoop.hdfs.protocol.ClientProtocol.metaSave from > 172.26.9.163:60234java.lang.NullPointerException at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseSourceDatanodes(BlockManager.java:2175) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.dumpBlockMeta(BlockManager.java:830) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.metaSave(BlockManager.java:762) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1782) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1766) > at > 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.metaSave(NameNodeRpcServer.java:1320) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.metaSave(ClientNamenodeProtocolServerSideTranslatorPB.java:928) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at > org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) at > org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) at > java.security.AccessController.doPrivileged(Native Method) at > javax.security.auth.Subject.doAs(Subject.java:422) at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14081) hdfs dfsadmin -metasave metasave_test results NPE
[ https://issues.apache.org/jira/browse/HDFS-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773309#comment-16773309 ] Shweta commented on HDFS-14081: --- posted new patch 010 to address checkstyle issues. Please review. Thanks. > hdfs dfsadmin -metasave metasave_test results NPE > - > > Key: HDFS-14081 > URL: https://issues.apache.org/jira/browse/HDFS-14081 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.2.1 >Reporter: Shweta >Assignee: Shweta >Priority: Major > Attachments: HDFS-14081.001.patch, HDFS-14081.002.patch, > HDFS-14081.003.patch, HDFS-14081.004.patch, HDFS-14081.005.patch, > HDFS-14081.006.patch, HDFS-14081.007.patch, HDFS-14081.008.patch, > HDFS-14081.009.patch, HDFS-14081.010.patch > > > Race condition is encountered while adding Block to > postponedMisreplicatedBlocks which in turn tried to retrieve Block from > BlockManager in which it may not be present. > This happens in HA, metasave in first NN succeeded but failed in second NN, > StackTrace showing NPE is as follows: > {code} > 2018-07-12 21:39:09,783 WARN org.apache.hadoop.ipc.Server: IPC Server handler > 24 on 8020, call Call#1 Retry#0 > org.apache.hadoop.hdfs.protocol.ClientProtocol.metaSave from > 172.26.9.163:602342018-07-12 21:39:09,783 WARN org.apache.hadoop.ipc.Server: > IPC Server handler 24 on 8020, call Call#1 Retry#0 > org.apache.hadoop.hdfs.protocol.ClientProtocol.metaSave from > 172.26.9.163:60234java.lang.NullPointerException at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseSourceDatanodes(BlockManager.java:2175) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.dumpBlockMeta(BlockManager.java:830) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.metaSave(BlockManager.java:762) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1782) > at > 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1766) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.metaSave(NameNodeRpcServer.java:1320) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.metaSave(ClientNamenodeProtocolServerSideTranslatorPB.java:928) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at > org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) at > org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) at > java.security.AccessController.doPrivileged(Native Method) at > javax.security.auth.Subject.doAs(Subject.java:422) at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
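One defensive direction for this class of race is to tolerate a block that disappears between the two lookups instead of dereferencing a null result. The following is a hypothetical illustration of that guard, not the attached patch:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of guarding a metasave-style dump against a block
// that was removed concurrently; not the actual BlockManager code.
class MetaSaveSketch {
    private final Map<Long, String> blocks = new HashMap<>();

    void addBlock(long id, String info) { blocks.put(id, info); }

    /** Returns the block's description, or a placeholder if it vanished. */
    String dumpBlockMeta(long blockId) {
        String stored = blocks.get(blockId);
        if (stored == null) {
            // Previously this path dereferenced the missing block and
            // threw the NullPointerException shown in the stack trace.
            return "block " + blockId + " no longer tracked";
        }
        return stored;
    }
}
```

The key point is that metasave is a diagnostic dump: skipping or annotating a vanished block is harmless, while an unguarded dereference kills the whole RPC.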
[jira] [Commented] (HDFS-14267) Add test_libhdfs_ops to libhdfs tests, mark libhdfs_read/write.c as examples
[ https://issues.apache.org/jira/browse/HDFS-14267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773308#comment-16773308 ] Wei-Chiu Chuang commented on HDFS-14267: +1 Will commit soon. Sorry I'm late to this. I was blocked by YARN-9319 for a while. Just for clarification: it looks like the Hadoop precommit does not trigger native tests, so I had to run the tests manually. Prior to the patch: {noformat} (at hadoop-hdfs-project/hadoop-hdfs-native-client) $ mvn -Pnative test ... main: [exec] Test project /data/4/weichiu/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/target [exec] Start 1: test_test_libhdfs_threaded_hdfs_static [exec] 1/3 Test #1: test_test_libhdfs_threaded_hdfs_static ... Passed 25.06 sec [exec] Start 2: test_test_libhdfs_zerocopy_hdfs_static [exec] 2/3 Test #2: test_test_libhdfs_zerocopy_hdfs_static ... Passed 7.26 sec [exec] Start 3: test_test_native_mini_dfs [exec] 3/3 Test #3: test_test_native_mini_dfs Passed 6.46 sec [exec] [exec] 100% tests passed, 0 tests failed out of 3 [exec] [exec] Total Test time (real) = 38.77 sec {noformat} After: {noformat} main: [exec] Test project /data/4/weichiu/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/target [exec] Start 1: test_test_libhdfs_ops_hdfs_static [exec] 1/4 Test #1: test_test_libhdfs_ops_hdfs_static Passed 9.36 sec [exec] Start 2: test_test_libhdfs_threaded_hdfs_static [exec] 2/4 Test #2: test_test_libhdfs_threaded_hdfs_static ... Passed 24.90 sec [exec] Start 3: test_test_libhdfs_zerocopy_hdfs_static [exec] 3/4 Test #3: test_test_libhdfs_zerocopy_hdfs_static ... 
Passed 6.80 sec [exec] Start 4: test_test_native_mini_dfs [exec] 4/4 Test #4: test_test_native_mini_dfs Passed 6.47 sec [exec] [exec] 100% tests passed, 0 tests failed out of 4 [exec] [exec] Total Test time (real) = 47.53 sec {noformat} > Add test_libhdfs_ops to libhdfs tests, mark libhdfs_read/write.c as examples > > > Key: HDFS-14267 > URL: https://issues.apache.org/jira/browse/HDFS-14267 > Project: Hadoop HDFS > Issue Type: Improvement > Components: libhdfs, native, test >Reporter: Sahil Takiar >Assignee: Sahil Takiar >Priority: Major > Attachments: HDFS-14267.001.patch, HDFS-14267.002.patch > > > {{test_libhdfs_ops.c}} provides test coverage for basic operations against > libhdfs, but currently has to be run manually (e.g. {{mvn install}} does not > run these tests). The goal of this patch is to add {{test_libhdfs_ops.c}} to > the list of tests that are automatically run for libhdfs. > It looks like {{test_libhdfs_ops.c}} was used in conjunction with > {{hadoop-hdfs-project/hadoop-hdfs/src/main/native/tests/test-libhdfs.sh}} to > run some tests against a mini DFS cluster. Now that the > {{NativeMiniDfsCluster}} exists, it makes more sense to use that rather than > rely on an external bash script to start a mini DFS cluster. > The {{libhdfs-tests}} directory (which contains {{test_libhdfs_ops.c}}) > contains two other files: {{test_libhdfs_read.c}} and > {{test_libhdfs_write.c}}. At some point, these files might have been used in > conjunction with {{test-libhdfs.sh}} to run some tests manually. However, > they (1) largely overlap with the test coverage provided by > {{test_libhdfs_ops.c}} and (2) are not designed to be run as unit tests. Thus > I suggest we move these two files into a new folder called > {{libhdfs-examples}} and use them to further document how users of libhdfs > can use the API. We can move {{test-libhdfs.sh}} into the examples folder as > well given that example files probably require the script to actually work. 