[jira] [Resolved] (HDDS-1705) Recon: Add estimatedTotalCount to the response of containers and containers/{id} endpoints
[ https://issues.apache.org/jira/browse/HDDS-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal resolved HDDS-1705. - Resolution: Fixed Fix Version/s: 0.4.1 Target Version/s: (was: 0.5.0) I've committed this. Thanks for the contribution [~vivekratnavel] and thanks for the review [~swagle]. > Recon: Add estimatedTotalCount to the response of containers and > containers/{id} endpoints > -- > > Key: HDDS-1705 > URL: https://issues.apache.org/jira/browse/HDDS-1705 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Recon >Affects Versions: 0.4.0 >Reporter: Vivek Ratnavel Subramanian >Assignee: Vivek Ratnavel Subramanian >Priority: Major > Labels: pull-request-available > Fix For: 0.4.1 > > Time Spent: 3h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Reopened] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
[ https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang reopened HDFS-12748: > NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY > > > Key: HDFS-12748 > URL: https://issues.apache.org/jira/browse/HDFS-12748 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Assignee: Weiwei Yang >Priority: Major > Fix For: 3.3.0, 3.2.1 > > Attachments: HDFS-12748-branch-3.1.01.patch, HDFS-12748.001.patch, > HDFS-12748.002.patch, HDFS-12748.003.patch, HDFS-12748.004.patch, > HDFS-12748.005.patch > > > In our production environment, the standby NN often does full GC; through MAT we > found the largest object is FileSystem$Cache, which contains 7,844,890 > DistributedFileSystem instances. > By viewing the call hierarchy of FileSystem.get(), I found that only > NamenodeWebHdfsMethods#get calls FileSystem.get(). I don't know why it creates a > different DistributedFileSystem every time instead of getting a FileSystem from the > cache. > {code:java} > case GETHOMEDIRECTORY: { > final String js = JsonUtil.toJsonString("Path", > FileSystem.get(conf != null ? conf : new Configuration()) > .getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } > {code} > When we close the FileSystem after GETHOMEDIRECTORY, the NN doesn't do full GC. > {code:java} > case GETHOMEDIRECTORY: { > FileSystem fs = null; > try { > fs = FileSystem.get(conf != null ? conf : new Configuration()); > final String js = JsonUtil.toJsonString("Path", > fs.getHomeDirectory().toUri().getPath()); > return Response.ok(js).type(MediaType.APPLICATION_JSON).build(); > } finally { > if (fs != null) { > fs.close(); > } > } > } > {code}
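The leak pattern in that report can be seen in miniature with a toy cache. This is a hedged sketch, not Hadoop's actual FileSystem$Cache code: the key class, field names, and `get` helper are invented for illustration. The point it demonstrates is that when the cache key includes the requesting identity and every request carries a distinct identity, each call inserts a new entry and nothing is ever evicted.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Toy model (NOT Hadoop's actual code) of an identity-keyed filesystem cache.
public class CacheLeakSketch {
  // Hypothetical simplified cache key: scheme + authority + user.
  static final class Key {
    final String scheme, authority, user;
    Key(String scheme, String authority, String user) {
      this.scheme = scheme;
      this.authority = authority;
      this.user = user;
    }
    @Override public boolean equals(Object o) {
      if (!(o instanceof Key)) {
        return false;
      }
      Key k = (Key) o;
      return scheme.equals(k.scheme) && authority.equals(k.authority)
          && user.equals(k.user);
    }
    @Override public int hashCode() {
      return Objects.hash(scheme, authority, user);
    }
  }

  static final Map<Key, Object> CACHE = new HashMap<>();

  // Simulates a cached get(): one instance per key, created on demand;
  // entries are only removed by an explicit close(), which never happens here.
  static Object get(String user) {
    return CACHE.computeIfAbsent(new Key("hdfs", "nn:8020", user),
        k -> new Object());
  }

  public static void main(String[] args) {
    for (int i = 0; i < 1000; i++) {
      get("user-" + i);  // each distinct caller adds one cache entry
    }
    System.out.println(CACHE.size());  // prints 1000: unbounded growth
  }
}
```

With one entry per distinct caller and no eviction, the cache size tracks the number of unique identities seen, which is the FileSystem$Cache growth the full GCs were fighting.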
[jira] [Created] (HDDS-1775) Make OM KeyDeletingService compatible with HA model
Hanisha Koneru created HDDS-1775: Summary: Make OM KeyDeletingService compatible with HA model Key: HDDS-1775 URL: https://issues.apache.org/jira/browse/HDDS-1775 Project: Hadoop Distributed Data Store Issue Type: Sub-task Reporter: Hanisha Koneru Assignee: Hanisha Koneru Currently, the OM KeyDeletingService directly deletes all the keys in the DeletedTable after deleting the corresponding blocks through SCM. For HA compatibility, the key purging should happen through the OM Ratis server. This Jira introduces a PurgeKeys request in the OM protocol. This request will be submitted to the OM's Ratis server after SCM deletes the blocks corresponding to the deleted keys.
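The proposed flow can be sketched with a toy replicated log. This is a hedged illustration only: the class and method names (`PurgeKeysSketch`, `submitPurgeKeys`, `OmReplica`) are invented, and a plain list stands in for the Ratis log. It shows the shape of the change: the deleting service submits a PurgeKeys entry instead of mutating its table, and every replica applies the same entries in order.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hedged sketch of the HA-compatible purge flow (illustrative names,
// not the actual OM code).
public class PurgeKeysSketch {
  // Stand-in for the Ratis log: an ordered list of PurgeKeys batches.
  static final List<List<String>> RAFT_LOG = new ArrayList<>();

  static final class OmReplica {
    final Set<String> deletedTable = new HashSet<>();
    int applied = 0;

    // Apply any log entries this replica has not yet seen, in order.
    void catchUp() {
      while (applied < RAFT_LOG.size()) {
        deletedTable.removeAll(RAFT_LOG.get(applied++));
      }
    }
  }

  // Called once SCM confirms the blocks for these keys are deleted:
  // replicate the purge rather than deleting from the local table in place.
  static void submitPurgeKeys(List<String> purgedKeys) {
    RAFT_LOG.add(purgedKeys);
  }
}
```

Because every replica applies the same log, the DeletedTable converges to the same contents on all OMs, which direct in-place deletion by one node cannot guarantee.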
[jira] [Created] (HDDS-1774) Add disk hang test to fault injection test
Eric Yang created HDDS-1774: --- Summary: Add disk hang test to fault injection test Key: HDDS-1774 URL: https://issues.apache.org/jira/browse/HDDS-1774 Project: Hadoop Distributed Data Store Issue Type: Improvement Reporter: Eric Yang When a disk is corrupted, it may appear to hang when accessing data. One simulation that can be performed is to set the disk IO throughput to 0 bytes/sec to simulate a disk hang. The Ozone file system client can detect the disk access timeout and proceed to read/write data on another datanode.
[jira] [Created] (HDFS-14637) Namenode may not replicate blocks to meet the policy after enabling upgradeDomain
Stephen O'Donnell created HDFS-14637: Summary: Namenode may not replicate blocks to meet the policy after enabling upgradeDomain Key: HDFS-14637 URL: https://issues.apache.org/jira/browse/HDFS-14637 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.3.0 Reporter: Stephen O'Donnell Assignee: Stephen O'Donnell After changing the network topology or placement policy on a cluster and restarting the namenode, the namenode will scan all blocks on the cluster at startup, and check if they meet the current placement policy. If they do not, they are added to the replication queue and the namenode will arrange for them to be replicated to ensure the placement policy is used. If you start with a cluster with no UpgradeDomain, and then enable UpgradeDomain, then on restart the NN does notice all the blocks violate the placement policy and it adds them to the replication queue. I believe there are some issues in the logic that prevent the blocks from replicating depending on the setup: With UD enabled, but no racks configured, and possibly on a 2-rack cluster, the queued replication work never makes any progress, as in blockManager.validateReconstructionWork(), it checks to see if the new replica increases the number of racks, and if it does not, it skips it and tries again later. {code:java} DatanodeStorageInfo[] targets = rw.getTargets(); if ((numReplicas.liveReplicas() >= requiredRedundancy) && (!isPlacementPolicySatisfied(block)) ) { if (!isInNewRack(rw.getSrcNodes(), targets[0].getDatanodeDescriptor())) { // No use continuing, unless a new rack in this case return false; } // mark that the reconstruction work is to replicate internal block to a // new rack. 
rw.setNotEnoughRack(); } {code} Additionally, in blockManager.scheduleReconstruction() there is some logic that sets the number of new replicas required to one, if the live replicas >= requiredRedundancy: {code:java} int additionalReplRequired; if (numReplicas.liveReplicas() < requiredRedundancy) { additionalReplRequired = requiredRedundancy - numReplicas.liveReplicas() - pendingNum; } else { additionalReplRequired = 1; // Needed on a new rack }{code} With UD, it is possible for 2 new replicas to be needed to meet the block placement policy, if all existing replicas are on nodes in the same upgrade domain. For traditional '2 rack redundancy', only 1 new replica would ever have been needed in this scenario.
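The under-scheduling point can be made concrete with a small helper. This is an illustrative sketch, not the actual BlockManager code: `additionalReplicasNeeded` is an invented name, and it only models the domain-counting arithmetic, not the real placement policy.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch (not the actual BlockManager code): how many new
// replicas a block needs to reach the required number of distinct
// upgrade domains, given the domains of its current live replicas.
public class UpgradeDomainSketch {
  static int additionalReplicasNeeded(List<String> replicaDomains,
                                      int requiredDomains) {
    Set<String> distinct = new HashSet<>(replicaDomains);
    return Math.max(0, requiredDomains - distinct.size());
  }
}
```

For three live replicas all in domain `d1` with three distinct domains required, the answer is 2, which is exactly the case where the hard-coded `additionalReplRequired = 1` falls short; under the traditional two-rack policy the deficit can never exceed 1.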
[jira] [Created] (HDDS-1773) Add intermittent IO disk test to fault injection test
Eric Yang created HDDS-1773: --- Summary: Add intermittent IO disk test to fault injection test Key: HDDS-1773 URL: https://issues.apache.org/jira/browse/HDDS-1773 Project: Hadoop Distributed Data Store Issue Type: Improvement Reporter: Eric Yang Disk errors can also be simulated by setting the cgroup blkio rate to 0 while the Ozone cluster is running. This test will be added to the corruption test project, and it will only be performed if there is write access to the host cgroup to control disk IO throttling. Expected result: when a datanode becomes unresponsive due to slow IO, SCM must flag the node as unhealthy.
[jira] [Created] (HDDS-1772) Add disk full test to fault injection test
Eric Yang created HDDS-1772: --- Summary: Add disk full test to fault injection test Key: HDDS-1772 URL: https://issues.apache.org/jira/browse/HDDS-1772 Project: Hadoop Distributed Data Store Issue Type: Improvement Reporter: Eric Yang In the read-only test, one of the simulations to verify is the data disk becoming full. This can be tested by using a small Docker data disk to simulate a full disk. When the data disk is full, Ozone should continue to operate and provide read access to the Ozone file system.
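One way to provision the small Docker data disk is a size-capped tmpfs mount in the Compose file. This fragment is illustrative only: the service name and mount path are assumptions, not the project's actual compose files.

```yaml
# Hypothetical Compose fragment: cap the datanode's data dir at 16 MiB
# so that write load fills the "disk" quickly during the test.
services:
  datanode:
    volumes:
      - type: tmpfs
        target: /data/hdds     # assumed Ozone data directory
        tmpfs:
          size: 16777216       # 16 MiB
```

A tmpfs cap keeps the fault contained to the container, so the host filesystem is never actually filled by the test.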
[jira] [Created] (HDDS-1771) Add slow IO disk test to fault injection test
Eric Yang created HDDS-1771: --- Summary: Add slow IO disk test to fault injection test Key: HDDS-1771 URL: https://issues.apache.org/jira/browse/HDDS-1771 Project: Hadoop Distributed Data Store Issue Type: Improvement Reporter: Eric Yang In fault injection testing, one possible simulation to run is slow disk IO, to assist in developing a set of timing profiles that work for an Ozone cluster. When we write to a file, the data travels across a number of buffers and caches before it is effectively written to the disk. By controlling the cgroup blkio rate in the Linux kernel, we can simulate slow disk reads and writes. Docker provides the following parameters to control the cgroup: {code} --device-read-bps="" --device-write-bps="" --device-read-iops="" --device-write-iops="" {code} The test will be added to the read/write test, with a Docker Compose file as parameters to test the timing profiles.
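Since the plan is to drive the test through a Compose file, the same throttles the `docker run` flags expose can be written as a `blkio_config` section. This fragment is a sketch: the service name, device path, and rates are assumptions chosen for illustration.

```yaml
# Hypothetical Compose fragment: throttle the datanode's block device to
# 1 MB/s in each direction to exercise the slow-IO timing profiles.
services:
  datanode:
    blkio_config:
      device_read_bps:
        - path: /dev/sda     # assumed backing device on the host
          rate: '1mb'
      device_write_bps:
        - path: /dev/sda
          rate: '1mb'
```

Varying the `rate` values per compose file gives one timing profile per file, which matches the "compose file as parameters" approach described above.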
[jira] [Resolved] (HDDS-1338) ozone shell commands are throwing InvocationTargetException
[ https://issues.apache.org/jira/browse/HDDS-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elek, Marton resolved HDDS-1338. Resolution: Duplicate > ozone shell commands are throwing InvocationTargetException > --- > > Key: HDDS-1338 > URL: https://issues.apache.org/jira/browse/HDDS-1338 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Nilotpal Nandi >Priority: Major > > ozone version > {noformat} > Source code repository g...@github.com:hortonworks/ozone.git -r > 310ebf5dc83b6c9e68d09246ed6c6f7cf6370fde > Compiled by jenkins on 2019-03-21T22:06Z > Compiled with protoc 2.5.0 > From source with checksum 9c367143ad43b81ca84bfdaafd1c3f > Using HDDS 0.4.0.3.0.100.0-388 > Source code repository g...@github.com:hortonworks/ozone.git -r > 310ebf5dc83b6c9e68d09246ed6c6f7cf6370fde > Compiled by jenkins on 2019-03-21T22:06Z > Compiled with protoc 2.5.0 > From source with checksum f3297cbd3a5f59fb4e5fd551afa05ba9 > {noformat} > Here is the ozone volume create failure output : > {noformat} > hdfs@ctr-e139-1542663976389-91321-01-02 ~]$ ozone sh volume create > testvolume11 > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/usr/hdp/3.0.100.0-388/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/usr/hdp/3.0.100.0-388/hadoop-ozone/share/ozone/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. 
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] > 19/03/26 17:31:37 ERROR client.OzoneClientFactory: Couldn't create protocol > class org.apache.hadoop.ozone.client.rpc.RpcClient exception: > java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:291) > at > org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:169) > at > org.apache.hadoop.ozone.web.ozShell.OzoneAddress.createClient(OzoneAddress.java:111) > at > org.apache.hadoop.ozone.web.ozShell.volume.CreateVolumeHandler.call(CreateVolumeHandler.java:70) > at > org.apache.hadoop.ozone.web.ozShell.volume.CreateVolumeHandler.call(CreateVolumeHandler.java:38) > at picocli.CommandLine.execute(CommandLine.java:919) > at picocli.CommandLine.access$700(CommandLine.java:104) > at picocli.CommandLine$RunLast.handle(CommandLine.java:1083) > at picocli.CommandLine$RunLast.handle(CommandLine.java:1051) > at > picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:959) > at picocli.CommandLine.parseWithHandlers(CommandLine.java:1242) > at picocli.CommandLine.parseWithHandler(CommandLine.java:1181) > at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:61) > at org.apache.hadoop.ozone.web.ozShell.Shell.execute(Shell.java:82) > at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:52) > at org.apache.hadoop.ozone.web.ozShell.Shell.main(Shell.java:93) > Caused by: java.lang.VerifyError: Cannot inherit from final class > at java.lang.ClassLoader.defineClass1(Native Method) > at 
java.lang.ClassLoader.defineClass(ClassLoader.java:763) > at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) > at java.net.URLClassLoader.defineClass(URLClassLoader.java:468) > at java.net.URLClassLoader.access$100(URLClassLoader.java:74) > at java.net.URLClassLoader$1.run(URLClassLoader.java:369) > at java.net.URLClassLoader$1.run(URLClassLoader.java:363) > at java.security.AccessController.doPrivileged(Native Method) > at java.net.URLClassLoader.findClass(URLClassLoader.java:362) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > at > org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.(OzoneManagerProtocolClientSideTranslatorPB.java:169) > at org.apache.hadoop.ozone.client.rpc.RpcClient.(RpcClient.java:142) > ... 20 more > Couldn't create protocol class org.apache.hadoop.ozone.client.rpc.RpcClient > {noformat} >
[jira] [Resolved] (HDDS-1305) Robot test containers: hadoop client can't access o3fs
[ https://issues.apache.org/jira/browse/HDDS-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elek, Marton resolved HDDS-1305. Resolution: Duplicate Thanks for reporting this issue. It will be fixed in HDDS-1717 (Based on the timeline that one is the duplicate, but we have a working patch there, so I am closing this one). > Robot test containers: hadoop client can't access o3fs > -- > > Key: HDDS-1305 > URL: https://issues.apache.org/jira/browse/HDDS-1305 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Sandeep Nemuri >Assignee: Anu Engineer >Priority: Major > Attachments: run.log > > > Run the robot test using: > {code:java} > ./test.sh --keep --env ozonefs > {code} > Log in to the OM container and check whether the desired volume/bucket/key got > created by the robot tests. > {code:java} > [root@o3new ~]$ docker exec -it ozonefs_om_1 /bin/bash > bash-4.2$ ozone fs -ls o3fs://bucket1.fstest/ > Found 3 items > -rw-rw-rw- 1 hadoop hadoop 22990 2019-03-15 17:28 > o3fs://bucket1.fstest/KEY.txt > drwxrwxrwx - hadoop hadoop 0 1970-01-01 00:00 > o3fs://bucket1.fstest/testdir > drwxrwxrwx - hadoop hadoop 0 2019-03-15 17:27 > o3fs://bucket1.fstest/testdir1 > {code} > {code:java} > [root@o3new ~]$ docker exec -it ozonefs_hadoop3_1 /bin/bash > bash-4.4$ hadoop classpath > /opt/hadoop/etc/hadoop:/opt/hadoop/share/hadoop/common/lib/*:/opt/hadoop/share/hadoop/common/*:/opt/hadoop/share/hadoop/hdfs:/opt/hadoop/share/hadoop/hdfs/lib/*:/opt/hadoop/share/hadoop/hdfs/*:/opt/hadoop/share/hadoop/mapreduce/*:/opt/hadoop/share/hadoop/yarn:/opt/hadoop/share/hadoop/yarn/lib/*:/opt/hadoop/share/hadoop/yarn/*:/opt/ozone/share/ozone/lib/hadoop-ozone-filesystem-lib-current-0.5.0-SNAPSHOT.jar > bash-4.4$ hadoop fs -ls o3fs://bucket1.fstest/ > 2019-03-18 19:12:42 INFO Configuration:3204 - Removed undeclared tags: > 2019-03-18 19:12:42 ERROR OzoneClientFactory:294 - Couldn't create protocol > class 
org.apache.hadoop.ozone.client.rpc.RpcClient exception: > java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:291) > at > org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:169) > at > org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl.(OzoneClientAdapterImpl.java:127) > at > org.apache.hadoop.fs.ozone.OzoneFileSystem.initialize(OzoneFileSystem.java:189) > at > org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3354) > at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124) > at > org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3403) > at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3371) > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:477) > at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361) > at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:325) > at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:249) > at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:232) > at > org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:104) > at org.apache.hadoop.fs.shell.Command.run(Command.java:176) > at org.apache.hadoop.fs.FsShell.run(FsShell.java:328) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at org.apache.hadoop.fs.FsShell.main(FsShell.java:391) > Caused by: java.lang.VerifyError: Cannot inherit from final class > at java.lang.ClassLoader.defineClass1(Native Method) > at 
java.lang.ClassLoader.defineClass(ClassLoader.java:763) > at > java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) > at java.net.URLClassLoader.defineClass(URLClassLoader.java:467) > at java.net.URLClassLoader.access$100(URLClassLoader.java:73) > at java.net.URLClassLoader$1.run(URLClassLoader.java:368) > at java.net.URLClassLoader$1.run(URLClassLoader.java:362) > at java.security.AccessController.doPrivileged(Native Method) > at java.net.URLClassLoader.findClass(URLClassLoader.java:361) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppC
[jira] [Resolved] (HDDS-1644) Overload RpcClient#createKey to pass non-default acls
[ https://issues.apache.org/jira/browse/HDDS-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar resolved HDDS-1644. -- Resolution: Won't Fix > Overload RpcClient#createKey to pass non-default acls > - > > Key: HDDS-1644 > URL: https://issues.apache.org/jira/browse/HDDS-1644 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Ajay Kumar >Assignee: Anu Engineer >Priority: Major > > Overload RpcClient#createKey to pass default acls as function parameters.
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/ No changes -1 overall The following subsystems voted -1: asflicense findbugs hadolint pathlen unit The following subsystems voted -1 but were configured to be filtered/ignored: cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace The following subsystems are considered long running: (runtime bigger than 1h 0m 0s) unit Specific tests: FindBugs : module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-documentstore Unread field:TimelineEventSubDoc.java:[line 56] Unread field:TimelineMetricSubDoc.java:[line 44] FindBugs : module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-mawo/hadoop-yarn-applications-mawo-core Class org.apache.hadoop.applications.mawo.server.common.TaskStatus implements Cloneable but does not define or use clone method At TaskStatus.java:does not define or use clone method At TaskStatus.java:[lines 39-346] Equals method for org.apache.hadoop.applications.mawo.server.worker.WorkerId assumes the argument is of type WorkerId At WorkerId.java:the argument is of type WorkerId At WorkerId.java:[line 114] org.apache.hadoop.applications.mawo.server.worker.WorkerId.equals(Object) does not check for null argument At WorkerId.java:null argument At WorkerId.java:[lines 114-115] FindBugs : module:hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-infra org.apache.hadoop.tools.dynamometer.Client.addFileToZipRecursively(File, File, ZipOutputStream) may fail to clean up java.io.InputStream on checked exception Obligation to clean up resource created at Client.java:to clean up java.io.InputStream on checked exception Obligation to clean up resource created at Client.java:[line 859] is not discharged Exceptional return value of java.io.File.mkdirs() ignored in org.apache.hadoop.tools.dynamometer.DynoInfraUtils.fetchHadoopTarball(File, String, Configuration, Logger) At 
DynoInfraUtils.java:ignored in org.apache.hadoop.tools.dynamometer.DynoInfraUtils.fetchHadoopTarball(File, String, Configuration, Logger) At DynoInfraUtils.java:[line 138] Found reliance on default encoding in org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(String[]):in org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(String[]): new java.io.InputStreamReader(InputStream) At SimulatedDataNodes.java:[line 149] org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(String[]) invokes System.exit(...), which shuts down the entire virtual machine At SimulatedDataNodes.java:down the entire virtual machine At SimulatedDataNodes.java:[line 123] org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(String[]) may fail to close stream At SimulatedDataNodes.java:stream At SimulatedDataNodes.java:[line 149] FindBugs : module:hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-blockgen Self assignment of field BlockInfo.replication in new org.apache.hadoop.tools.dynamometer.blockgenerator.BlockInfo(BlockInfo) At BlockInfo.java:in new org.apache.hadoop.tools.dynamometer.blockgenerator.BlockInfo(BlockInfo) At BlockInfo.java:[line 78] Failed junit tests : hadoop.util.TestDiskCheckerWithDiskIo hadoop.hdfs.server.datanode.TestDirectoryScanner hadoop.hdfs.web.TestWebHdfsTimeouts hadoop.hdfs.server.federation.router.TestRouterWithSecureStartup hadoop.hdfs.server.federation.security.TestRouterHttpDelegationToken hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination hadoop.ozone.container.ozoneimpl.TestOzoneContainer hadoop.ozone.om.TestOzoneManagerHA hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis hadoop.ozone.client.rpc.TestOzoneAtRestEncryption hadoop.ozone.client.rpc.TestOzoneRpcClient hadoop.ozone.client.rpc.TestSecureOzoneRpcClient hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures cc: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/diff-compile-cc-root.txt [4.0K] javac: 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/diff-compile-javac-root.txt [336K] checkstyle: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/diff-checkstyle-root.txt [17M] hadolint: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/diff-patch-hadolint.txt [8.0K] pathlen: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/pathlen.txt [12K] pylint: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/diff-patch-pylint.txt [120K] shellcheck: https://builds.apache.org/
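The WorkerId findings in the report above ("assumes the argument is of type WorkerId", "does not check for null argument") are a classic equals-contract bug. A minimal corrected shape, using a hypothetical simplified class rather than the actual MaWo `WorkerId`, looks like this:

```java
import java.util.Objects;

// Hypothetical simplified WorkerId: equals must tolerate null and
// non-WorkerId arguments via an instanceof check rather than assuming
// the argument's type, and hashCode must agree with equals.
public class WorkerIdSketch {
  private final String hostname;
  private final String ipAddress;

  public WorkerIdSketch(String hostname, String ipAddress) {
    this.hostname = hostname;
    this.ipAddress = ipAddress;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof WorkerIdSketch)) {  // also returns false for null
      return false;
    }
    WorkerIdSketch other = (WorkerIdSketch) o;
    return Objects.equals(hostname, other.hostname)
        && Objects.equals(ipAddress, other.ipAddress);
  }

  @Override
  public int hashCode() {
    return Objects.hash(hostname, ipAddress);
  }
}
```

The `instanceof` check subsumes the null check, since `null instanceof X` is always false, which addresses both FindBugs complaints at once.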
[jira] [Created] (HDDS-1770) SCM crashes when ReplicationManager is trying to re-replicate under-replicated containers
Nanda kumar created HDDS-1770: - Summary: SCM crashes when ReplicationManager is trying to re-replicate under replicated containers Key: HDDS-1770 URL: https://issues.apache.org/jira/browse/HDDS-1770 Project: Hadoop Distributed Data Store Issue Type: Bug Components: SCM Reporter: Nanda kumar SCM crashes with the following exception when ReplicationManager is trying to re-replicate under replicated containers {noformat} 2019-07-08 12:46:36 ERROR ReplicationManager:215 - Exception in Replication Monitor Thread. java.lang.IllegalArgumentException: Affinity node /default-rack/aab15e2d07cc is not a member of topology at org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.checkAffinityNode(NetworkTopologyImpl.java:767) at org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.chooseRandom(NetworkTopologyImpl.java:407) at org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware.chooseNode(SCMContainerPlacementRackAware.java:242) at org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware.chooseDatanodes(SCMContainerPlacementRackAware.java:168) at org.apache.hadoop.hdds.scm.container.ReplicationManager.handleUnderReplicatedContainer(ReplicationManager.java:487) at org.apache.hadoop.hdds.scm.container.ReplicationManager.processContainer(ReplicationManager.java:293) at java.base/java.util.concurrent.ConcurrentHashMap$KeySetView.forEach(ConcurrentHashMap.java:4698) at java.base/java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1083) at org.apache.hadoop.hdds.scm.container.ReplicationManager.run(ReplicationManager.java:205) at java.base/java.lang.Thread.run(Thread.java:834) 2019-07-08 12:46:36 INFO ExitUtil:210 - Exiting with status 1: java.lang.IllegalArgumentException: Affinity node /default-rack/aab15e2d07cc is not a member of topology 2019-07-08 12:46:36 INFO StorageContainerManagerStarter:51 - SHUTDOWN_MSG: / SHUTDOWN_MSG: Shutting down StorageContainerManager at 8c763563f672/192.168.112.2 / 
{noformat}
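The crash above comes from `checkAffinityNode` throwing when the affinity node has left the topology, which takes down the whole Replication Monitor thread. A defensive shape for the caller can be sketched as follows; this is a toy placement model with invented names, not `NetworkTopologyImpl` or `SCMContainerPlacementRackAware`:

```java
import java.util.Set;
import java.util.TreeSet;

// Hedged sketch: guard the affinity-node lookup so a node missing from
// the topology degrades to a placement without affinity, instead of an
// IllegalArgumentException that exits the SCM.
public class AffinityGuardSketch {
  // Assumes a non-empty topology; returns the affinity node when it is
  // still a member, otherwise falls back to a deterministic pick.
  static String chooseNode(Set<String> topology, String affinityNode) {
    if (affinityNode != null && topology.contains(affinityNode)) {
      return affinityNode;  // honour affinity only while the node exists
    }
    // Toy fallback: smallest member stands in for real rack-aware choice.
    return new TreeSet<>(topology).first();
  }
}
```

Whether to fall back silently or log and skip the container is a policy choice; the point of the sketch is only that membership is validated before the affinity constraint is applied.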
Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/ No changes -1 overall The following subsystems voted -1: asflicense findbugs hadolint pathlen unit xml The following subsystems voted -1 but were configured to be filtered/ignored: cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace The following subsystems are considered long running: (runtime bigger than 1h 0m 0s) unit Specific tests: XML : Parsing Error(s): hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml FindBugs : module:hadoop-common-project/hadoop-common Class org.apache.hadoop.fs.GlobalStorageStatistics defines non-transient non-serializable instance field map In GlobalStorageStatistics.java:instance field map In GlobalStorageStatistics.java FindBugs : module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client Boxed value is unboxed and then immediately reboxed in org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result, byte[], byte[], KeyConverter, ValueConverter, boolean) At ColumnRWHelper.java:then immediately reboxed in org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result, byte[], byte[], KeyConverter, ValueConverter, boolean) At ColumnRWHelper.java:[line 335] Failed junit tests : hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys hadoop.hdfs.server.datanode.TestDirectoryScanner hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead hadoop.registry.secure.TestSecureLogins hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer 
hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2 hadoop.yarn.sls.TestSLSRunner cc: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-compile-cc-root-jdk1.7.0_95.txt [4.0K] javac: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-compile-javac-root-jdk1.7.0_95.txt [328K] cc: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-compile-cc-root-jdk1.8.0_212.txt [4.0K] javac: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-compile-javac-root-jdk1.8.0_212.txt [308K] checkstyle: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-checkstyle-root.txt [16M] hadolint: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-patch-hadolint.txt [4.0K] pathlen: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/pathlen.txt [12K] pylint: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-patch-pylint.txt [24K] shellcheck: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-patch-shellcheck.txt [72K] shelldocs: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-patch-shelldocs.txt [8.0K] whitespace: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/whitespace-eol.txt [12M] https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/whitespace-tabs.txt [1.2M] xml: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/xml.txt [12K] findbugs: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/branch-findbugs-hadoop-common-project_hadoop-common-warnings.html [8.0K] 
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-hbase_hadoop-yarn-server-timelineservice-hbase-client-warnings.html [8.0K] javadoc: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-javadoc-javadoc-root-jdk1.7.0_95.txt [16K] https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-javadoc-javadoc-root-jdk1.8.0_212.txt [1.1M] unit: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt [228K] https://builds.apache.org/job/hado