Improvement on DN HB lock
Hi everyone,

Here is a JIRA discussing improving the DataNode heartbeat lock: https://issues.apache.org/jira/browse/HDFS-7060 . It seems the folks who participated in the discussion all agreed on reducing the use of the dataset lock, but the ticket has gone stale for a while. Recently we hit this issue on a 450-node cluster with a very heavy write load; after applying the patch, heartbeat latency improved a lot. So I would appreciate it if someone could help review this and move it forward. Appreciate your comments on this.

Thanks!

-- Weiwei
[jira] [Resolved] (HDFS-12757) DeadLock Happened Between DFSOutputStream and LeaseRenewer when LeaseRenewer#renew SocketTimeException
[ https://issues.apache.org/jira/browse/HDFS-12757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weiwei Yang resolved HDFS-12757.
    Resolution: Duplicate

> DeadLock Happened Between DFSOutputStream and LeaseRenewer when LeaseRenewer#renew SocketTimeException
>
>                 Key: HDFS-12757
>                 URL: https://issues.apache.org/jira/browse/HDFS-12757
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>            Reporter: Jiandan Yang
>            Priority: Major
>         Attachments: HDFS-12757.patch
>
> Java stack is:
> {code:java}
> Found one Java-level deadlock:
> =============================
> "Topology-2 (735/2000)":
>   waiting to lock monitor 0x7fff4523e6e8 (object 0x0005d3521078, a org.apache.hadoop.hdfs.client.impl.LeaseRenewer),
>   which is held by "LeaseRenewer:admin@na61storage"
> "LeaseRenewer:admin@na61storage":
>   waiting to lock monitor 0x7fff5d41e838 (object 0x0005ec0dfa88, a org.apache.hadoop.hdfs.DFSOutputStream),
>   which is held by "Topology-2 (735/2000)"
>
> Java stack information for the threads listed above:
> ===================================================
> "Topology-2 (735/2000)":
>     at org.apache.hadoop.hdfs.client.impl.LeaseRenewer.addClient(LeaseRenewer.java:227)
>     - waiting to lock <0x0005d3521078> (a org.apache.hadoop.hdfs.client.impl.LeaseRenewer)
>     at org.apache.hadoop.hdfs.client.impl.LeaseRenewer.getInstance(LeaseRenewer.java:86)
>     at org.apache.hadoop.hdfs.DFSClient.getLeaseRenewer(DFSClient.java:467)
>     at org.apache.hadoop.hdfs.DFSClient.endFileLease(DFSClient.java:479)
>     at org.apache.hadoop.hdfs.DFSOutputStream.setClosed(DFSOutputStream.java:776)
>     at org.apache.hadoop.hdfs.DFSOutputStream.closeThreads(DFSOutputStream.java:791)
>     at org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:848)
>     - locked <0x0005ec0dfa88> (a org.apache.hadoop.hdfs.DFSOutputStream)
>     at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:805)
>     - locked <0x0005ec0dfa88> (a org.apache.hadoop.hdfs.DFSOutputStream)
>     at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>     at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
>     ..
> "LeaseRenewer:admin@na61storage":
>     at org.apache.hadoop.hdfs.DFSOutputStream.abort(DFSOutputStream.java:750)
>     - waiting to lock <0x0005ec0dfa88> (a org.apache.hadoop.hdfs.DFSOutputStream)
>     at org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:586)
>     at org.apache.hadoop.hdfs.client.impl.LeaseRenewer.run(LeaseRenewer.java:453)
>     - locked <0x0005d3521078> (a org.apache.hadoop.hdfs.client.impl.LeaseRenewer)
>     at org.apache.hadoop.hdfs.client.impl.LeaseRenewer.access$700(LeaseRenewer.java:76)
>     at org.apache.hadoop.hdfs.client.impl.LeaseRenewer$1.run(LeaseRenewer.java:310)
>     at java.lang.Thread.run(Thread.java:834)
>
> Found 1 deadlock.
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/578/

[Nov 2, 2017 8:25:19 AM] (rohithsharmaks) addendum patch for YARN-7289.
[Nov 2, 2017 8:43:08 AM] (aajisaka) MAPREDUCE-6983. Moving logging APIs over to slf4j in
[Nov 2, 2017 9:12:04 AM] (sammi.chen) HADOOP-14997. Add hadoop-aliyun as dependency of hadoop-cloud-storage.
[Nov 2, 2017 9:32:24 AM] (aajisaka) MAPREDUCE-6999. Fix typo onf in DynamicInputChunk.java. Contributed by
[Nov 2, 2017 2:37:17 PM] (jlowe) YARN-7286. Add support for docker to have no capabilities. Contributed
[Nov 2, 2017 4:51:28 PM] (wangda) YARN-7364. Queue dash board in new YARN UI has incorrect values. (Sunil
[Nov 2, 2017 5:37:33 PM] (epayne) YARN-7370: Preemption properties should be refreshable. Contributed by
[jira] [Created] (HDFS-12769) TestReadStripedFileWithDecodingCorruptData and TestReadStripedFileWithDecodingDeletedData timeout in trunk
Lei (Eddy) Xu created HDFS-12769:

Summary: TestReadStripedFileWithDecodingCorruptData and TestReadStripedFileWithDecodingDeletedData timeout in trunk
Key: HDFS-12769
URL: https://issues.apache.org/jira/browse/HDFS-12769
Project: Hadoop HDFS
Issue Type: Bug
Components: erasure-coding
Affects Versions: 3.0.0-beta1
Reporter: Lei (Eddy) Xu
Priority: Major

Recently, TestReadStripedFileWithDecodingCorruptData and TestReadStripedFileWithDecodingDeletedData fail frequently. For example, in HDFS-12725.
[jira] [Resolved] (HDFS-12760) Use PoolingHttpClientConnectionManager for OzoneRestClient
[ https://issues.apache.org/jira/browse/HDFS-12760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaoyu Yao resolved HDFS-12760.
    Resolution: Duplicate

> Use PoolingHttpClientConnectionManager for OzoneRestClient
>
>          Key: HDFS-12760
>          URL: https://issues.apache.org/jira/browse/HDFS-12760
>      Project: Hadoop HDFS
>   Issue Type: Sub-task
>     Reporter: Xiaoyu Yao
>     Assignee: Xiaoyu Yao
>     Priority: Major
>
> This is based on [~ste...@apache.org]'s comments on HDFS-7240. I thought our client already used it after fixing the server side issue with HDFS-11873.
[jira] [Created] (HDFS-12768) Ozone: Qualify OzoneFileSystem paths in Filesystem APIs
Mukul Kumar Singh created HDFS-12768:

Summary: Ozone: Qualify OzoneFileSystem paths in Filesystem APIs
Key: HDFS-12768
URL: https://issues.apache.org/jira/browse/HDFS-12768
Project: Hadoop HDFS
Issue Type: Sub-task
Components: ozone
Affects Versions: HDFS-7240
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh
Priority: Major
Fix For: HDFS-7240

This is based on [~ste...@apache.org]'s comments on HDFS-7240. This jira will be used to qualify filesystem paths before they are used; it will also address the rest of the review comments for the filesystem API.

General
1) Various places use LOG.info("text " + something); they should all move to LOG.info("text {}", something).
2) Once OzoneException -> IOE, you can cut the catch-and-translate here.
3) Qualify paths before all uses. That's needed to stop them being relative, and to catch things like someone calling ozfs.rename("o3://bucket/src", "s3a://bucket/dest"), delete("s3a://bucket/path"), etc., as well as problems with validation happening before paths are made absolute.
4) RenameIterator.iterate() is going to log @ warn whenever it can't delete a temp file because it doesn't exist, which may be a distraction in failures. Better: if (!tmpFile.delete() && tmpFile.exists()), as that will only warn if the temp file is actually there.

OzoneFileSystem.rename()
1) Qualify all the paths before doing directory validation. Otherwise you can defeat the "don't rename into self" checks with rename("/path/src", "/path/../path/src/dest").
2) Log @ debug all the paths taken before returning so you can debug if needed.
3) S3A rename ended up having a special RenameFailedException() which innerRename() raises, with text and a return code. The outer rename logs the text and returns the return code. This means that all failing paths have an exception clearly thrown, and when we eventually make rename/3 public, it's lined up to throw exceptions back to the caller. Consider copying this code.

OzoneFileSystem.delete
1) Qualify the path before use.
2) Don't log at error if you can't delete a nonexistent path; delete is used everywhere for silent cleanup. Cut it.

OzoneFileSystem.ListStatusIterator
1) Make the status field final.

OzoneFileSystem.mkdir
1) Qualify the path first.

OzoneFileSystem.getFileStatus
1) getKeyInfo() catches all exceptions and maps them to null, which is interpreted as "not found" and eventually surfaces as FNFE. This is misleading if the failure is for any other reason.
2) Once OzoneException -> IOException, getKeyInfo() should only catch & downgrade the explicit not-found (404?) responses.

OzoneFileSystem.listKeys()
Unless this needs to be tagged as @VisibleForTesting, make it private.

OzoneFileSystem.getDefaultBlockSize()
Implement getDefaultBlockSize(); add a config option to let people set it, with a sensible default like 64 or 128 MB.
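The temp-file cleanup idiom suggested in point 4 of the General section can be sketched as follows. This is a minimal illustration, not the actual RenameIterator code; the class and method names here are made up.

```java
import java.io.File;
import java.io.IOException;

public class CleanupSketch {
    // Returns true when cleanup is effectively done: either the file was
    // deleted, or it was never there. delete() returns false in both the
    // "could not remove" and the "did not exist" cases, so the extra
    // exists() check limits the warning to the case that actually matters.
    static boolean cleanedUp(File tmpFile) {
        if (!tmpFile.delete() && tmpFile.exists()) {
            System.err.println("WARN: could not delete temp file " + tmpFile);
            return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        File present = File.createTempFile("rename", ".tmp");
        cleanedUp(present);                              // deletes the file, no warning
        cleanedUp(new File("/nonexistent-dir/x.tmp"));   // already absent, no warning
    }
}
```

With a plain `if (!tmpFile.delete()) warn(...)` the second call would log a spurious warning every time the temp file had already been cleaned up.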
[jira] [Created] (HDFS-12767) Ozone: Handle larger file sizes for OzoneOutputStream & OzoneInputStream
Mukul Kumar Singh created HDFS-12767:

Summary: Ozone: Handle larger file sizes for OzoneOutputStream & OzoneInputStream
Key: HDFS-12767
URL: https://issues.apache.org/jira/browse/HDFS-12767
Project: Hadoop HDFS
Issue Type: Sub-task
Components: ozone
Affects Versions: HDFS-7240
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh
Priority: Major
Fix For: HDFS-7240

This is based on [~ste...@apache.org]'s comments on HDFS-7240. This jira will add capabilities to OzoneOutputStream and OzoneInputStream to handle larger file sizes. It will also address the other stream-related review comments.

OzoneOutputStream
1) Implement StreamCapabilities and declare that hsync/hflush are not supported.
2) Unless there is no limit on the size of a PUT request, or multipart uploads are supported, consider having the stream's write(int) method fail when the limit is reached. That way, things will at least fail fast.
3) After close, set backupStream = null.
4) flush() should be a no-op if called on a closed stream, so if (closed) return.
5) write() must fail if called on a closed stream.
6) Again, OzoneException -> IOE translation which could/should be eliminated.

OzoneInputStream
1) You have chosen an interesting solution to the "efficient seek" problem here: D/L the entire file and then seek around. While this probably works for the first release, larger files will have problems in both disk space and size of
2) Again, OzoneException -> IOE translation which could/should be eliminated.
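The closed-stream semantics requested above (flush() a no-op after close, write() failing fast) can be sketched as a minimal OutputStream. This is an illustration of the pattern only, not the actual OzoneOutputStream code.

```java
import java.io.IOException;
import java.io.OutputStream;

// Minimal stream demonstrating the requested close semantics.
public class ClosedCheckStream extends OutputStream {
    private boolean closed = false;

    @Override
    public void write(int b) throws IOException {
        // write() must fail on a closed stream.
        if (closed) {
            throw new IOException("Stream is closed");
        }
        // ... buffer the byte (elided) ...
    }

    @Override
    public void flush() throws IOException {
        // flush() is a silent no-op on a closed stream.
        if (closed) {
            return;
        }
        // ... push buffered data (elided) ...
    }

    @Override
    public void close() throws IOException {
        if (!closed) {
            flush();
            closed = true; // also the point to drop references, e.g. backupStream = null
        }
    }
}
```

Double-close and flush-after-close are common in FS client code (wrappers like FSDataOutputStream close their inner stream too), which is why flush() tolerates a closed stream while write() does not.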
[jira] [Created] (HDFS-12766) Ozone: Ensure usage of parameterized slf4j log syntax for ozone
Xiaoyu Yao created HDFS-12766:

Summary: Ozone: Ensure usage of parameterized slf4j log syntax for ozone
Key: HDFS-12766
URL: https://issues.apache.org/jira/browse/HDFS-12766
Project: Hadoop HDFS
Issue Type: Sub-task
Reporter: Xiaoyu Yao

Various places use LOG.info("text " + something); they should all move to LOG.info("text {}", something).
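The reasoning behind this request is the standard slf4j idiom: string concatenation builds the message eagerly even when the log level is disabled, while the parameterized form only formats when the level is enabled. A minimal before/after sketch, assuming slf4j-api on the classpath (the class and values are illustrative):

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LogStyle {
    private static final Logger LOG = LoggerFactory.getLogger(LogStyle.class);

    void report(String volume, long bytes) {
        // Before: the concatenation (and any toString() calls) run even
        // when INFO logging is turned off.
        LOG.info("created volume " + volume + " with " + bytes + " bytes");

        // After: the template is only expanded if INFO is enabled.
        LOG.info("created volume {} with {} bytes", volume, bytes);
    }
}
```

Both lines log the same text; the second costs nothing when INFO is disabled, which matters on hot paths.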
[jira] [Created] (HDFS-12765) Ozone: Add proper timeout for all ozone tests
Xiaoyu Yao created HDFS-12765:

Summary: Ozone: Add proper timeout for all ozone tests
Key: HDFS-12765
URL: https://issues.apache.org/jira/browse/HDFS-12765
Project: Hadoop HDFS
Issue Type: Sub-task
Reporter: Xiaoyu Yao
Priority: Major

Add proper timeout annotations/rules to guarantee ozone tests will not hold the Jenkins machines too long.
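For JUnit 4 (which Hadoop tests used at this point), the two usual ways to bound test time are a class-wide Timeout rule and a per-test timeout attribute. The test class and method below are hypothetical, just to show the shape:

```java
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.Timeout;

public class TestOzoneExample {
    // Applies to every test method in the class; any test running
    // longer than 300 seconds fails instead of hanging the build.
    @Rule
    public Timeout globalTimeout = Timeout.seconds(300);

    // Per-test override, in milliseconds.
    @Test(timeout = 120000)
    public void testVolumeCreation() {
        // test body elided
    }
}
```

(JUnit 5 later introduced an actual @Timeout annotation; on JUnit 4 the rule above is the equivalent.)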
[jira] [Created] (HDFS-12764) Ozone: Implement efficient copyFromLocalFile and copyToLocalFile using bucket operations.
Mukul Kumar Singh created HDFS-12764:

Summary: Ozone: Implement efficient copyFromLocalFile and copyToLocalFile using bucket operations.
Key: HDFS-12764
URL: https://issues.apache.org/jira/browse/HDFS-12764
Project: Hadoop HDFS
Issue Type: Bug
Components: ozone
Affects Versions: HDFS-7240
Reporter: Mukul Kumar Singh
Priority: Minor
Fix For: HDFS-7240

This is based on [~ste...@apache.org]'s comments on HDFS-7240. This jira will provide an efficient version of these OzoneFileSystem operations using bucket operations.

{code}
you could implement copyFromLocalFile and copyToLocalFile trivially using bucket.putKey(dst, path) & bucket.getKey(). This lines you up for HADOOP-14766, which is a high performance upload from the local FS to a store
{code}
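The shape of the suggestion can be sketched with a stand-in in-memory bucket. The Bucket class and its putKey/getKey signatures here are illustrative stand-ins for the Ozone bucket client, not its real API; the point is that each copy direction becomes a single whole-file put or get rather than a stream copy through the FileSystem layer.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

public class BucketCopySketch {
    // Hypothetical bucket with whole-file put/get operations.
    static class Bucket {
        private final Map<String, byte[]> keys = new HashMap<>();

        void putKey(String key, Path localFile) throws Exception {
            keys.put(key, Files.readAllBytes(localFile));
        }

        void getKey(String key, Path localFile) throws Exception {
            Files.write(localFile, keys.get(key));
        }
    }

    public static void main(String[] args) throws Exception {
        Path src = Files.createTempFile("src", ".txt");
        Files.write(src, "hello".getBytes(StandardCharsets.UTF_8));

        Bucket bucket = new Bucket();
        bucket.putKey("dst", src);      // copyFromLocalFile: one put of the file

        Path out = Files.createTempFile("out", ".txt");
        bucket.getKey("dst", out);      // copyToLocalFile: one get back to local FS
        System.out.println(new String(Files.readAllBytes(out), StandardCharsets.UTF_8)); // hello
    }
}
```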
[jira] [Created] (HDFS-12763) DataStreamer should heartbeat during flush
Kuhu Shukla created HDFS-12763:

Summary: DataStreamer should heartbeat during flush
Key: HDFS-12763
URL: https://issues.apache.org/jira/browse/HDFS-12763
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.8.1
Reporter: Kuhu Shukla
Assignee: Kuhu Shukla
Priority: Major

From HDFS-5032:
bq. Absence of heartbeat during flush will be fixed in a separate jira by Daryn Sharp

This JIRA tracks the case where the absence of heartbeats can cause the pipeline to fail if operations like flush take some time to complete.
[jira] [Created] (HDFS-12762) Ozone: Enhance tests for OzoneFileSystem
Mukul Kumar Singh created HDFS-12762:

Summary: Ozone: Enhance tests for OzoneFileSystem
Key: HDFS-12762
URL: https://issues.apache.org/jira/browse/HDFS-12762
Project: Hadoop HDFS
Issue Type: Sub-task
Components: ozone
Affects Versions: HDFS-7240
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh
Priority: Major
Fix For: HDFS-7240

This is based on [~ste...@apache.org]'s comments on HDFS-7240. This jira will address the testing section of the comments.

{code}
Testing

Implement something like AbstractSTestS3AHugeFiles for scale tests, again with the ability to spec on the maven build how big the files to be created are. Developers should be able to ask for a test run with an 8GB test write, read and seek, to see what happens.

Add a subclass of org.apache.hadoop.fs.FileSystemContractBaseTest, ideally org.apache.hadoop.fs.FSMainOperationsBaseTest. These test things which the newer contract tests haven't yet reimplemented.

TestOzoneFileInterfaces

Needs a Timeout rule for test timeouts.

all your assertEquals strings are the wrong way round. sorry.
{code}
[jira] [Created] (HDFS-12761) Ozone: Merge Ozone to trunk
Anu Engineer created HDFS-12761:

Summary: Ozone: Merge Ozone to trunk
Key: HDFS-12761
URL: https://issues.apache.org/jira/browse/HDFS-12761
Project: Hadoop HDFS
Issue Type: Sub-task
Components: ozone
Affects Versions: HDFS-7240
Reporter: Anu Engineer
Assignee: Anu Engineer
Fix For: HDFS-7240

Based on the discussion in HDFS-7240, this JIRA is a place where we can discuss low level code/design/architecture details of Ozone. I expect comments here to spawn work items for ozone.

cc: [~ste...@apache.org], [~cheersyang], [~linyiqun], [~yuanbo], [~xyao], [~vagarychen], [~jnp], [~arpitagarwal], [~msingh], [~elek], [~nandakumar131], [~szetszwo], [~ljain], [~shashikant]
[jira] [Created] (HDFS-12760) Use PoolingHttpClientConnectionManager for OzoneRestClient
Xiaoyu Yao created HDFS-12760:

Summary: Use PoolingHttpClientConnectionManager for OzoneRestClient
Key: HDFS-12760
URL: https://issues.apache.org/jira/browse/HDFS-12760
Project: Hadoop HDFS
Issue Type: Sub-task
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
Priority: Major

This is based on [~ste...@apache.org]'s comments on HDFS-7240. I thought our client already used it after fixing the server side issue with HDFS-11873.
[jira] [Created] (HDFS-12759) Ozone: web: integrate configuration reader page to the SCM/KSM web ui.
Elek, Marton created HDFS-12759:

Summary: Ozone: web: integrate configuration reader page to the SCM/KSM web ui.
Key: HDFS-12759
URL: https://issues.apache.org/jira/browse/HDFS-12759
Project: Hadoop HDFS
Issue Type: Sub-task
Components: ozone
Affects Versions: HDFS-7240
Reporter: Elek, Marton
Assignee: Elek, Marton

In the current SCM/KSM web UI the configuration page is:

* hidden under the Common Tools menu
* opened as a different type of web page (different menu and style).

In this patch I integrate the configuration page into the existing web UI.

From the user's point of view:

* The Configuration page is moved to a separate main menu item.
* The menu of the Configuration page is the same as all the others.
* Metrics are also moved to separate pages/menus.
* As the configuration page requires the full width, all the pages use a full-width layout.

From the technical point of view:

* To support multiple pages I enabled the angular router (which had already been added as a component).
* Now it's supported to create multiple pages and navigate between them, so I also moved the metrics pages to different pages, making the main overview page cleaner.
* The layout changed to use the full width.

TESTING: It's a client-side-only change. The easiest way to test is to do a full build, start SCM/KSM and check the menu items:

* All the menu items should work.
* The Configuration page (from the main menu) should use the same header.
* The configuration item of the Common Tools menu shows the good old raw configuration page.
Re: Jdiff for 2.9.0 (was: Cutting branch-2.9)
Hello folks,

Quick update regarding this. It looks like the baseline jdiff xml file for v2.8.2 of the hadoop-yarn-api project was missing. We recreated it from branch-2.8.2 and I've committed it into branch-2.9.

We re-ran jdiff on branch-2.9 after the above commit and went through the generated html. I have uploaded the html files for each project here: http://home.apache.org/~asuresh/2.9.0-jdiff/html/

We found it useful to go to the changes-summary.html for each sub-project (for eg. http://home.apache.org/~asuresh/2.9.0-jdiff/html/hadoop-hdfs/changes-summary.html in the case of hdfs), as it gives a pretty concise view.

As we mentioned earlier, we did not find anything majorly amiss. One thing we did notice was that quite a lot of method changes (especially in interfaces) were reported as "Changed from non-abstract to abstract". Not entirely sure why, especially given that the methods/classes in question have actually not been touched in years.

@Junping, kindly do take a look and let us know if something is amiss.

Thanks
-Arun/Subru

On Tue, Oct 31, 2017 at 1:47 PM, Arun Suresh wrote:

> Hi Junping / Wangda,
>
> We just ran jdiff on branch-2.9.
> I have copied the output here: http://home.apache.org/~asuresh/2.9.0-jdiff/
>
> We are going thru it but it would be great if you guys can please take a look and let us know if you spot anything amiss.
>
> Thanks
> -Arun
>
> On Mon, Oct 30, 2017 at 4:52 PM, Arun Suresh wrote:
>
>> Hello Junping,
>>
>> Thanks for chiming in.. Appreciate the offer to help. I did run jdiff on branch-2 today.. did not find any red flags. Will post the report here shortly for review.
>>
>> Cheers
>> -Arun
>>
>> On Oct 30, 2017 4:44 PM, "Junping Du" wrote:
>>
>>> Hi Subru and Arun,
>>> Thanks for moving forward with the 2.9 release. Is the first cut of the 2.9 release supposed to be a stable version or just an alpha version? If it is supposed to be a stable version, we should run the jdiff test and check for API compatibility before releasing. Please let me know if you need any help here.
>>>
>>> Thanks,
>>>
>>> Junping
>>>
>>> From: Subru Krishnan
>>> Sent: Monday, October 30, 2017 12:39 PM
>>> To: common-...@hadoop.apache.org; yarn-...@hadoop.apache.org; hdfs-dev@hadoop.apache.org; mapreduce-...@hadoop.apache.org
>>> Cc: Arun Suresh
>>> Subject: Cutting branch-2.9
>>>
>>> We want to give a heads up that we are going to cut branch-2.9 tomorrow morning.
>>>
>>> We are down to 3 blockers and they all are close to being committed (thanks everyone):
>>> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.9+Release
>>>
>>> There are 4 other non-blocker JIRAs that are targeted for 2.9.0 which are close to completion.
>>>
>>> Folks who are working/reviewing these, kindly prioritize accordingly so that we can make the release on time.
>>> https://issues.apache.org/jira/browse/YARN-7398?filter=12342468
>>>
>>> Thanks in advance!
>>>
>>> -Subru/Arun
[jira] [Created] (HDFS-12758) Ozone: Correcting assertEquals argument order in test cases
Nanda kumar created HDFS-12758:

Summary: Ozone: Correcting assertEquals argument order in test cases
Key: HDFS-12758
URL: https://issues.apache.org/jira/browse/HDFS-12758
Project: Hadoop HDFS
Issue Type: Sub-task
Components: ozone
Reporter: Nanda kumar
Priority: Minor

In a few test cases, the arguments to {{Assert.assertEquals}} are swapped. Below is the list of classes and test cases where this has to be corrected.

{noformat}
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/ksm/TestKeySpaceManager.java
  testChangeVolumeQuota - line: 187, 197 & 204

hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/web/TestDistributedOzoneVolumes.java
  testCreateVolumes - line: 91
  testCreateVolumesWithQuota - line: 103
  testCreateVolumesWithInvalidQuota - line: 115
  testCreateVolumesWithInvalidUser - line: 129
  testCreateVolumesWithOutAdminRights - line: 144
  testCreateVolumesInLoop - line: 156

hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/web/client/TestKeys.java
  runTestPutKey - line: 239 & 246
  runTestPutAndListKey - line: 228, 229, 451, 452, 458 & 459

hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/container/transport/server/TestContainerServer.java
  testClientServerWithContainerDispatcher - line: 219

hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/container/ContainerTestHelper.java
  verifyGetKey - line: 491

hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/container/common/impl/TestContainerPersistence.java
  testUpdateContainer - line: 776, 778, 794, 796, 821 & 823

hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/container/common/TestEndPoint.java
  testGetVersion - line: 122 & 124
  testRegister - line: 215

hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/container/replication/TestContainerReplicationManager.java
  testDetectSingleContainerReplica - line: 168

hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/scm/TestXceiverClientManager.java
  testCaching - line: 82, 91, 96 & 97
  testFreeByReference - line: 120, 130 & 137
  testFreeByEviction - line: 165, 170, 177 & 185

hadoop-hdfs-project/hadoop-hdfs-client/src/test/java/org/apache/hadoop/ozone/TestOzoneAcls.java
  testAclValues - line: 111, 112, 113, 116, 117, 118, 121, 122, 123, 126, 127, 128, 131, 132, 133, 136, 137 & 138

hadoop-tools/hadoop-ozone/src/test/java/org/apache/hadoop/fs/ozone/TestOzoneFileInterfaces.java
  testFileSystemInit - line: 102
  testOzFsReadWrite - line: 123
  testDirectory - line: 135, 138 & 139
{noformat}
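For reference, JUnit's Assert.assertEquals takes the expected value first and the actual value second; swapping them does not change pass/fail, but it makes failure messages misleading. A minimal illustration (the method name and values are made up, not taken from the files listed above):

```java
import static org.junit.Assert.assertEquals;

public class AssertOrderExample {
    // Pretend this is the value produced by the code under test.
    static int countVolumes() {
        return 2;
    }

    public static void main(String[] args) {
        int actualVolumes = countVolumes();

        // Wrong order: assertEquals(actualVolumes, 2) passes here too, but
        // if actualVolumes were 3 the failure would read
        // "expected:<3> but was:<2>", i.e. the two values swapped.

        // Right order: expected value first, actual value second.
        assertEquals(2, actualVolumes);
    }
}
```

Fixing the order is purely about diagnosability: when one of these tests fails, the message should name the constant as "expected" and the computed value as "was".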
[jira] [Created] (HDFS-12757) DeadLock Happened Between DFSOutputStream and LeaseRenewer when LeaseRenewer#renew SocketTimeException
Jiandan Yang created HDFS-12757:

Summary: DeadLock Happened Between DFSOutputStream and LeaseRenewer when LeaseRenewer#renew SocketTimeException
Key: HDFS-12757
URL: https://issues.apache.org/jira/browse/HDFS-12757
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs-client
Reporter: Jiandan Yang
Priority: Major

Java stack is:

{code:java}
Found one Java-level deadlock:
=============================
"Topology-2 (735/2000)":
  waiting to lock monitor 0x7fff4523e6e8 (object 0x0005d3521078, a org.apache.hadoop.hdfs.client.impl.LeaseRenewer),
  which is held by "LeaseRenewer:admin@na61storage"
"LeaseRenewer:admin@na61storage":
  waiting to lock monitor 0x7fff5d41e838 (object 0x0005ec0dfa88, a org.apache.hadoop.hdfs.DFSOutputStream),
  which is held by "Topology-2 (735/2000)"

Java stack information for the threads listed above:
===================================================
"Topology-2 (735/2000)":
	at org.apache.hadoop.hdfs.client.impl.LeaseRenewer.addClient(LeaseRenewer.java:227)
	- waiting to lock <0x0005d3521078> (a org.apache.hadoop.hdfs.client.impl.LeaseRenewer)
	at org.apache.hadoop.hdfs.client.impl.LeaseRenewer.getInstance(LeaseRenewer.java:86)
	at org.apache.hadoop.hdfs.DFSClient.getLeaseRenewer(DFSClient.java:467)
	at org.apache.hadoop.hdfs.DFSClient.endFileLease(DFSClient.java:479)
	at org.apache.hadoop.hdfs.DFSOutputStream.setClosed(DFSOutputStream.java:776)
	at org.apache.hadoop.hdfs.DFSOutputStream.closeThreads(DFSOutputStream.java:791)
	at org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:848)
	- locked <0x0005ec0dfa88> (a org.apache.hadoop.hdfs.DFSOutputStream)
	at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:805)
	- locked <0x0005ec0dfa88> (a org.apache.hadoop.hdfs.DFSOutputStream)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
	at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
	..
"LeaseRenewer:admin@na61storage":
	at org.apache.hadoop.hdfs.DFSOutputStream.abort(DFSOutputStream.java:750)
	- waiting to lock <0x0005ec0dfa88> (a org.apache.hadoop.hdfs.DFSOutputStream)
	at org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:586)
	at org.apache.hadoop.hdfs.client.impl.LeaseRenewer.run(LeaseRenewer.java:453)
	- locked <0x0005d3521078> (a org.apache.hadoop.hdfs.client.impl.LeaseRenewer)
	at org.apache.hadoop.hdfs.client.impl.LeaseRenewer.access$700(LeaseRenewer.java:76)
	at org.apache.hadoop.hdfs.client.impl.LeaseRenewer$1.run(LeaseRenewer.java:310)
	at java.lang.Thread.run(Thread.java:834)

Found 1 deadlock.
{code}