[jira] [Commented] (HADOOP-17209) Erasure Coding: Native library memory leak
[ https://issues.apache.org/jira/browse/HADOOP-17209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181624#comment-17181624 ] Takanobu Asanuma commented on HADOOP-17209: --- Thanks for finding the issue and uploading the patch, [~seanlook]. Although I'm not very familiar with JNI either, the changes seem to be right. It should call ReleaseIntArrayElements. I ran TestNativeRSRawCoder and TestNativeXORRawCoder with the patch in my local environment since Jenkins had skipped them, and they succeeded. {noformat} [INFO] Running org.apache.hadoop.io.erasurecode.rawcoder.TestNativeRSRawCoder [INFO] [INFO] Results: [INFO] Tests run: 17, Failures: 0, Errors: 0, Skipped: 0 [INFO] Running org.apache.hadoop.io.erasurecode.rawcoder.TestNativeXORRawCoder [INFO] [INFO] Results: [INFO] Tests run: 8, Failures: 0, Errors: 0, Skipped: 0 {noformat} Now I'm +1 on [^HADOOP-17209.001.patch]. I'll commit it in a few days if there are no objections. > Erasure Coding: Native library memory leak > -- > > Key: HADOOP-17209 > URL: https://issues.apache.org/jira/browse/HADOOP-17209 > Project: Hadoop Common > Issue Type: Bug > Components: native >Affects Versions: 3.3.0, 3.2.1, 3.1.3 >Reporter: Sean Chow >Assignee: Sean Chow >Priority: Major > Attachments: HADOOP-17209.001.patch, > datanode.202137.detail_diff.5.txt, image-2020-08-15-18-26-44-744.png, > image-2020-08-20-12-35-39-906.png > > > We use both {{apache-hadoop-3.1.3}} and {{CDH-6.1.1-1.cdh6.1.1.p0.875250}} > HDFS in production, and both of them show memory usage increasing beyond the {{-Xmx}} > value. > !image-2020-08-15-18-26-44-744.png! > > We use the EC strategy to save storage costs. > These are the JVM options: > {code:java} > -Dproc_datanode -Dhdfs.audit.logger=INFO,RFAAUDIT > -Dsecurity.audit.logger=INFO,RFAS -Djava.net.preferIPv4Stack=true > -Xms8589934592 -Xmx8589934592 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC > -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled > -XX:+HeapDumpOnOutOfMemoryError ...{code} > The max JVM heap size is 8 GB, but we can see the datanode RSS memory is 48g. > All the other datanodes in this HDFS cluster have the same issue. > {code:java} > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 226044 hdfs 20 0 50.6g 48g 4780 S 90.5 77.0 14728:27 > /usr/java/jdk1.8.0_162/bin/java -Dproc_datanode{code} > > This excessive memory use makes my machine unresponsive (if swap is enabled), > or triggers the oom-killer. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
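For context on the fix under review: JNI's {{GetIntArrayElements}} either pins the Java array or hands back a native copy of it, and that buffer is reclaimed only by a matching {{ReleaseIntArrayElements}}. Memory leaked this way never appears in JVM heap accounting, which is why the datanode RSS can grow far beyond {{-Xmx}}. A minimal sketch of the required pairing (illustrative only; {{Java_Example_sumOffsets}} is a hypothetical native method, not the actual Hadoop erasure-coding code):

{code:c}
#include <jni.h>

/*
 * Hypothetical native method illustrating the Get/Release pairing.
 * GetIntArrayElements may pin the array or allocate a native copy;
 * either way, the buffer lives outside the Java heap until the
 * matching ReleaseIntArrayElements call, so omitting the release
 * leaks native memory that -Xmx never bounds.
 */
JNIEXPORT jint JNICALL
Java_Example_sumOffsets(JNIEnv *env, jobject obj, jintArray offsets)
{
  jint sum = 0;
  jsize i;
  jsize len = (*env)->GetArrayLength(env, offsets);
  jint *tmp = (*env)->GetIntArrayElements(env, offsets, NULL);
  if (tmp == NULL) {
    return 0; /* OutOfMemoryError has already been thrown */
  }
  for (i = 0; i < len; i++) {
    sum += tmp[i];
  }
  /* JNI_ABORT frees (or unpins) the buffer without copying it back,
   * the cheapest correct mode when the array is only read. */
  (*env)->ReleaseIntArrayElements(env, offsets, tmp, JNI_ABORT);
  return sum;
}
{code}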
[jira] [Updated] (HADOOP-17216) hadoop-aws having FileNotFoundException when accessing AWS S3 occasionally
[ https://issues.apache.org/jira/browse/HADOOP-17216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Wei updated HADOOP-17216: --- Description: Hi, When using Spark Streaming with Delta Lake, I got the following exception occasionally, something like 1 out of 100. Thanks. {code:java} Caused by: java.io.FileNotFoundException: No such file or directory: s3a://[pathToFolder]/date=2020-07-29/part-5-046af631-7198-422c-8cc8-8d3adfb4413e.c000.snappy.parquet at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2255) at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2149) at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2088) at org.apache.spark.sql.delta.files.DelayedCommitProtocol$$anonfun$8.apply(DelayedCommitProtocol.scala:141) at org.apache.spark.sql.delta.files.DelayedCommitProtocol$$anonfun$8.apply(DelayedCommitProtocol.scala:139) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.spark.sql.delta.files.DelayedCommitProtocol.commitTask(DelayedCommitProtocol.scala:139) at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.commit(FileFormatDataWriter.scala:78) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:247) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:242){code} -Environment hadoop = "3.1.2" hadoop-aws = "3.1.2" spark = "2.4.5" spark-on-k8s-operator = "v1beta2-1.1.2-2.4.5" deployed into AWS EKS Kubernetes. Version information below: Server Version: version.Info\{Major:"1", Minor:"16+", GitVersion:"v1.16.8-eks-e16311", GitCommit:"e163110a04dcb2f39c3325af96d019b4925419eb", GitTreeState:"clean", BuildDate:"2020-03-27T22:37:12Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"} was: Hi, When using Spark Streaming with Delta Lake, I got the following exception occasionally, something like 1 out of 100. Thanks. 
Caused by: java.io.FileNotFoundException: No such file or directory: s3a://[pathToFolder]/date=2020-07-29/part-5-046af631-7198-422c-8cc8-8d3adfb4413e.c000.snappy.parquet at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2255) at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2149) at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2088) at org.apache.spark.sql.delta.files.DelayedCommitProtocol$$anonfun$8.apply(DelayedCommitProtocol.scala:141) at org.apache.spark.sql.delta.files.DelayedCommitProtocol$$anonfun$8.apply(DelayedCommitProtocol.scala:139) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.spark.sql.delta.files.DelayedCommitProtocol.commitTask(DelayedCommitProtocol.scala:139) at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.commit(FileFormatDataWriter.scala:78) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:247) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:242) > hadoop-aws having FileNotFoundException when accessing AWS S3 occasionally > --- > > Key: HADOOP-17216 > URL: https://issues.apache.org/jira/browse/HADOOP-17216 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 3.1.2 > Environment: hadoop = "3.1.2" > hadoop-aws = "3.1.2" > spark = "2.4.5" > spark-on-k8s-operator = "v1beta2-1.1.2-2.4.5" > deployed into AWS
[jira] [Created] (HADOOP-17216) hadoop-aws having FileNotFoundException when accessing AWS S3 occasionally
Cheng Wei created HADOOP-17216: -- Summary: hadoop-aws having FileNotFoundException when accessing AWS S3 occasionally Key: HADOOP-17216 URL: https://issues.apache.org/jira/browse/HADOOP-17216 Project: Hadoop Common Issue Type: Bug Components: fs/s3 Affects Versions: 3.1.2 Environment: hadoop = "3.1.2" hadoop-aws = "3.1.2" spark = "2.4.5" spark-on-k8s-operator = "v1beta2-1.1.2-2.4.5" deployed into AWS EKS Kubernetes. Version information below: Server Version: version.Info\{Major:"1", Minor:"16+", GitVersion:"v1.16.8-eks-e16311", GitCommit:"e163110a04dcb2f39c3325af96d019b4925419eb", GitTreeState:"clean", BuildDate:"2020-03-27T22:37:12Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"} Reporter: Cheng Wei Hi, When using Spark Streaming with Delta Lake, I got the following exception occasionally, something like 1 out of 100. Thanks. Caused by: java.io.FileNotFoundException: No such file or directory: s3a://[pathToFolder]/date=2020-07-29/part-5-046af631-7198-422c-8cc8-8d3adfb4413e.c000.snappy.parquet at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2255) at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2149) at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2088) at org.apache.spark.sql.delta.files.DelayedCommitProtocol$$anonfun$8.apply(DelayedCommitProtocol.scala:141) at org.apache.spark.sql.delta.files.DelayedCommitProtocol$$anonfun$8.apply(DelayedCommitProtocol.scala:139) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.spark.sql.delta.files.DelayedCommitProtocol.commitTask(DelayedCommitProtocol.scala:139) at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.commit(FileFormatDataWriter.scala:78) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:247) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:242) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] yangagile opened a new pull request #2235: HDFS-15484 Add new method batchRename for DistributedFileSystem and W…
yangagile opened a new pull request #2235: URL: https://github.com/apache/hadoop/pull/2235 …ebHdfsFileSystem ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HADOOP-X. Fix a typo in YYY.) For more details, please see https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] umamaheswararao opened a new pull request #2234: HDFS-15529: getChildFilesystems should include fallback fs as well
umamaheswararao opened a new pull request #2234: URL: https://github.com/apache/hadoop/pull/2234 https://issues.apache.org/jira/browse/HDFS-15529 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] hadoop-yetus commented on pull request #2210: HADOOP-17199. S3A Directory Marker backport to branch-3.2
hadoop-yetus commented on pull request #2210: URL: https://github.com/apache/hadoop/pull/2210#issuecomment-677845882 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 1m 13s | Docker mode activated. | ||| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. | | +0 :ok: | markdownlint | 0m 0s | markdownlint was not available. | | +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 7 new or modified test files. | ||| _ branch-3.2 Compile Tests _ | | +0 :ok: | mvndep | 93m 38s | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 33m 19s | branch-3.2 passed | | +1 :green_heart: | compile | 17m 25s | branch-3.2 passed | | +1 :green_heart: | checkstyle | 2m 44s | branch-3.2 passed | | +1 :green_heart: | mvnsite | 2m 2s | branch-3.2 passed | | +1 :green_heart: | shadedclient | 19m 19s | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 1m 39s | branch-3.2 passed | | +0 :ok: | spotbugs | 1m 3s | Used deprecated FindBugs config; considering switching to SpotBugs. | | +1 :green_heart: | findbugs | 3m 22s | branch-3.2 passed | ||| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 23s | Maven dependency ordering for patch | | -1 :x: | mvninstall | 0m 18s | hadoop-aws in the patch failed. | | -1 :x: | compile | 15m 26s | root in the patch failed. | | -1 :x: | javac | 15m 26s | root in the patch failed. | | -0 :warning: | checkstyle | 2m 44s | root: The patch generated 20 new + 9 unchanged - 0 fixed = 29 total (was 9) | | -1 :x: | mvnsite | 0m 29s | hadoop-aws in the patch failed. | | -1 :x: | whitespace | 0m 0s | The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply | | +1 :green_heart: | shadedclient | 13m 31s | patch has no errors when building and testing our client artifacts. | | -1 :x: | javadoc | 0m 32s | hadoop-aws in the patch failed. | | -1 :x: | findbugs | 0m 29s | hadoop-aws in the patch failed. | ||| _ Other Tests _ | | +1 :green_heart: | unit | 10m 10s | hadoop-common in the patch passed. | | -1 :x: | unit | 0m 27s | hadoop-aws in the patch failed. | | +1 :green_heart: | asflicense | 0m 42s | The patch does not generate ASF License warnings. 
| | | | 225m 31s | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/3/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2210 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle markdownlint | | uname | Linux 98c586545eab 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | branch-3.2 / 94723bf | | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~16.04-b01 | | mvninstall | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/3/artifact/out/patch-mvninstall-hadoop-tools_hadoop-aws.txt | | compile | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/3/artifact/out/patch-compile-root.txt | | javac | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/3/artifact/out/patch-compile-root.txt | | checkstyle | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/3/artifact/out/diff-checkstyle-root.txt | | mvnsite | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/3/artifact/out/patch-mvnsite-hadoop-tools_hadoop-aws.txt | | whitespace | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/3/artifact/out/whitespace-eol.txt | | javadoc | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/3/artifact/out/patch-javadoc-hadoop-tools_hadoop-aws.txt | | findbugs | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/3/artifact/out/patch-findbugs-hadoop-tools_hadoop-aws.txt | | unit | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/3/artifact/out/patch-unit-hadoop-tools_hadoop-aws.txt | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/3/testReport/ | | Max. process+thread count | 1346 (vs. ulimit of 5500) | | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . | | Console output |
[GitHub] [hadoop] hadoop-yetus commented on pull request #2210: HADOOP-17199. S3A Directory Marker backport to branch-3.2
hadoop-yetus commented on pull request #2210: URL: https://github.com/apache/hadoop/pull/2210#issuecomment-677845435 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 1m 10s | Docker mode activated. | ||| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. | | +0 :ok: | markdownlint | 0m 0s | markdownlint was not available. | | +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 7 new or modified test files. | ||| _ branch-3.2 Compile Tests _ | | +0 :ok: | mvndep | 87m 15s | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 32m 43s | branch-3.2 passed | | +1 :green_heart: | compile | 17m 14s | branch-3.2 passed | | +1 :green_heart: | checkstyle | 2m 46s | branch-3.2 passed | | +1 :green_heart: | mvnsite | 2m 1s | branch-3.2 passed | | +1 :green_heart: | shadedclient | 19m 14s | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 1m 38s | branch-3.2 passed | | +0 :ok: | spotbugs | 1m 3s | Used deprecated FindBugs config; considering switching to SpotBugs. | | +1 :green_heart: | findbugs | 3m 19s | branch-3.2 passed | ||| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 23s | Maven dependency ordering for patch | | -1 :x: | mvninstall | 0m 17s | hadoop-aws in the patch failed. | | -1 :x: | compile | 15m 36s | root in the patch failed. | | -1 :x: | javac | 15m 36s | root in the patch failed. | | -0 :warning: | checkstyle | 2m 45s | root: The patch generated 20 new + 9 unchanged - 0 fixed = 29 total (was 9) | | -1 :x: | mvnsite | 0m 31s | hadoop-aws in the patch failed. | | -1 :x: | whitespace | 0m 0s | The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply | | +1 :green_heart: | shadedclient | 13m 24s | patch has no errors when building and testing our client artifacts. | | -1 :x: | javadoc | 0m 32s | hadoop-aws in the patch failed. | | -1 :x: | findbugs | 0m 28s | hadoop-aws in the patch failed. | ||| _ Other Tests _ | | +1 :green_heart: | unit | 10m 14s | hadoop-common in the patch passed. | | -1 :x: | unit | 0m 29s | hadoop-aws in the patch failed. | | +1 :green_heart: | asflicense | 0m 42s | The patch does not generate ASF License warnings. 
| | | | 218m 12s | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/4/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2210 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle markdownlint | | uname | Linux 968346e9e383 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | branch-3.2 / 94723bf | | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~16.04-b01 | | mvninstall | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/4/artifact/out/patch-mvninstall-hadoop-tools_hadoop-aws.txt | | compile | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/4/artifact/out/patch-compile-root.txt | | javac | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/4/artifact/out/patch-compile-root.txt | | checkstyle | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/4/artifact/out/diff-checkstyle-root.txt | | mvnsite | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/4/artifact/out/patch-mvnsite-hadoop-tools_hadoop-aws.txt | | whitespace | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/4/artifact/out/whitespace-eol.txt | | javadoc | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/4/artifact/out/patch-javadoc-hadoop-tools_hadoop-aws.txt | | findbugs | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/4/artifact/out/patch-findbugs-hadoop-tools_hadoop-aws.txt | | unit | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/4/artifact/out/patch-unit-hadoop-tools_hadoop-aws.txt | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/4/testReport/ | | Max. process+thread count | 1347 (vs. ulimit of 5500) | | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . | | Console output |
[jira] [Updated] (HADOOP-17214) Allow file system caching to be disabled for all file systems
[ https://issues.apache.org/jira/browse/HADOOP-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-17214: Component/s: (was: common) fs > Allow file system caching to be disabled for all file systems > - > > Key: HADOOP-17214 > URL: https://issues.apache.org/jira/browse/HADOOP-17214 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Affects Versions: 3.3.0 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > > Right now, FileSystem.get(URI uri, Configuration conf) allows caching of file > systems to be disabled per scheme. > We can introduce a new global conf to disable caching for all FileSystems; the > default would be false (i.e., do not disable the cache globally). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17214) Allow file system caching to be disabled for all file systems
[ https://issues.apache.org/jira/browse/HADOOP-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181252#comment-17181252 ] Steve Loughran commented on HADOOP-17214: - Any particular reason? > Allow file system caching to be disabled for all file systems > - > > Key: HADOOP-17214 > URL: https://issues.apache.org/jira/browse/HADOOP-17214 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Affects Versions: 3.3.0 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > > Right now, FileSystem.get(URI uri, Configuration conf) allows caching of file > systems to be disabled per scheme. > We can introduce a new global conf to disable caching for all FileSystems; the > default would be false (i.e., do not disable the cache globally). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-17214) Allow file system caching to be disabled for all file systems
[ https://issues.apache.org/jira/browse/HADOOP-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-17214: Affects Version/s: 3.3.0 > Allow file system caching to be disabled for all file systems > - > > Key: HADOOP-17214 > URL: https://issues.apache.org/jira/browse/HADOOP-17214 > Project: Hadoop Common > Issue Type: Improvement > Components: common >Affects Versions: 3.3.0 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > > Right now, FileSystem.get(URI uri, Configuration conf) allows caching of file > systems to be disabled per scheme. > We can introduce a new global conf to disable caching for all FileSystems; the > default would be false (i.e., do not disable the cache globally). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17215) ABFS: Excessive Create Overwrites lead to race conditions
[ https://issues.apache.org/jira/browse/HADOOP-17215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181249#comment-17181249 ] Steve Loughran commented on HADOOP-17215: - The usual: tag affected versions, component, etc. > ABFS: Excessive Create Overwrites lead to race conditions > -- > > Key: HADOOP-17215 > URL: https://issues.apache.org/jira/browse/HADOOP-17215 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Sneha Vijayarajan >Assignee: Sneha Vijayarajan >Priority: Major > > Filesystem Create APIs that do not accept an argument for the overwrite flag end > up defaulting it to true. > We are observing a high request count for creates with overwrite=true, primarily > because the called Create API defaults the flag to true. When a create with > overwrite times out, we have observed that it can lead to race conditions between > the first create and the retried one running almost in parallel. > To avoid this scenario for create requests with overwrite=true, the ABFS driver > will always attempt the create without overwrite first. If the create fails due to > fileAlreadyPresent, it will resend the request with overwrite=true. > > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] hadoop-yetus commented on pull request #2210: HADOOP-17199. S3A Directory Marker backport to branch-3.2
hadoop-yetus commented on pull request #2210: URL: https://github.com/apache/hadoop/pull/2210#issuecomment-677705568 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 15m 21s | Docker mode activated. | ||| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. | | +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 7 new or modified test files. | ||| _ branch-3.2 Compile Tests _ | | +0 :ok: | mvndep | 3m 19s | Maven dependency ordering for branch | | -1 :x: | mvninstall | 10m 39s | root in branch-3.2 failed. | | -1 :x: | compile | 6m 1s | root in branch-3.2 failed. | | -0 :warning: | checkstyle | 1m 58s | The patch fails to run checkstyle in root | | -1 :x: | mvnsite | 0m 27s | hadoop-aws in branch-3.2 failed. | | -1 :x: | shadedclient | 7m 42s | branch has errors when building and testing our client artifacts. | | -1 :x: | javadoc | 0m 12s | hadoop-aws in branch-3.2 failed. | | +0 :ok: | spotbugs | 2m 7s | Used deprecated FindBugs config; considering switching to SpotBugs. | | -1 :x: | findbugs | 0m 12s | hadoop-aws in branch-3.2 failed. | ||| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 22s | Maven dependency ordering for patch | | -1 :x: | mvninstall | 0m 10s | hadoop-aws in the patch failed. | | -1 :x: | compile | 6m 1s | root in the patch failed. | | -1 :x: | javac | 6m 1s | root in the patch failed. | | -0 :warning: | checkstyle | 1m 52s | The patch fails to run checkstyle in root | | -1 :x: | mvnsite | 0m 13s | hadoop-aws in the patch failed. | | -1 :x: | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply | | -1 :x: | shadedclient | 3m 41s | patch has errors when building and testing our client artifacts. | | -1 :x: | javadoc | 0m 12s | hadoop-aws in the patch failed. | | -1 :x: | findbugs | 0m 12s | hadoop-aws in the patch failed. | ||| _ Other Tests _ | | +1 :green_heart: | unit | 9m 6s | hadoop-common in the patch passed. | | -1 :x: | unit | 0m 12s | hadoop-aws in the patch failed. | | +1 :green_heart: | asflicense | 0m 27s | The patch does not generate ASF License warnings. 
| | | | 76m 47s | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/2/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2210 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 2ae72db1e4f4 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | branch-3.2 / 42c71a5 | | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~16.04-b01 | | mvninstall | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/2/artifact/out/branch-mvninstall-root.txt | | compile | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/2/artifact/out/branch-compile-root.txt | | checkstyle | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/2/artifact/out/buildtool-branch-checkstyle-root.txt | | mvnsite | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/2/artifact/out/branch-mvnsite-hadoop-tools_hadoop-aws.txt | | javadoc | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/2/artifact/out/branch-javadoc-hadoop-tools_hadoop-aws.txt | | findbugs | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/2/artifact/out/branch-findbugs-hadoop-tools_hadoop-aws.txt | | mvninstall | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/2/artifact/out/patch-mvninstall-hadoop-tools_hadoop-aws.txt | | compile | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/2/artifact/out/patch-compile-root.txt | | javac | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/2/artifact/out/patch-compile-root.txt | | checkstyle | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/2/artifact/out/buildtool-patch-checkstyle-root.txt | | mvnsite | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/2/artifact/out/patch-mvnsite-hadoop-tools_hadoop-aws.txt | | whitespace | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2210/2/artifact/out/whitespace-eol.txt | | javadoc
[GitHub] [hadoop] steveloughran commented on pull request #2133: HADOOP-17122: Preserving Directory Attributes in DistCp with Atomic Copy
steveloughran commented on pull request #2133: URL: https://github.com/apache/hadoop/pull/2133#issuecomment-677648486 I'm on holiday this week. Don't panic. Will look at it when I return. I am not reviewing any patches this week. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] hadoop-yetus commented on pull request #2189: HDFS-15025. Applying NVDIMM storage media to HDFS
hadoop-yetus commented on pull request #2189: URL: https://github.com/apache/hadoop/pull/2189#issuecomment-677511383 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 33m 16s | Docker mode activated. | ||| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. | | +0 :ok: | buf | 0m 0s | buf was not available. | | +0 :ok: | markdownlint | 0m 0s | markdownlint was not available. | | +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 16 new or modified test files. | ||| _ trunk Compile Tests _ | | +0 :ok: | mvndep | 3m 23s | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 33m 9s | trunk passed | | +1 :green_heart: | compile | 25m 37s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | -1 :x: | compile | 18m 42s | root in trunk failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. | | +1 :green_heart: | checkstyle | 4m 13s | trunk passed | | -1 :x: | mvnsite | 0m 48s | hadoop-common in trunk failed. | | +1 :green_heart: | shadedclient | 24m 50s | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 2m 15s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 3m 46s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +0 :ok: | spotbugs | 2m 36s | Used deprecated FindBugs config; considering switching to SpotBugs. | | +1 :green_heart: | findbugs | 8m 5s | trunk passed | ||| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 27s | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 2m 48s | the patch passed | | +1 :green_heart: | compile | 20m 36s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | -1 :x: | cc | 20m 36s | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 15 new + 147 unchanged - 15 fixed = 162 total (was 162) | | +1 :green_heart: | javac | 20m 36s | the patch passed | | +1 :green_heart: | compile | 18m 52s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | -1 :x: | cc | 18m 52s | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 39 new + 123 unchanged - 39 fixed = 162 total (was 162) | | -1 :x: | javac | 18m 52s | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 187 new + 1766 unchanged - 0 fixed = 1953 total (was 1766) | | -0 :warning: | checkstyle | 2m 57s | root: The patch generated 3 new + 727 unchanged - 4 fixed = 730 total (was 731) | | +1 :green_heart: | mvnsite | 3m 59s | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. | | +1 :green_heart: | xml | 0m 2s | The patch has no ill-formed XML file. | | +1 :green_heart: | shadedclient | 14m 4s | patch has no errors when building and testing our client artifacts. 
| | +1 :green_heart: | javadoc | 2m 32s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 4m 10s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | findbugs | 9m 2s | the patch passed | ||| _ Other Tests _ | | +1 :green_heart: | unit | 9m 27s | hadoop-common in the patch passed. | | +1 :green_heart: | unit | 2m 21s | hadoop-hdfs-client in the patch passed. | | -1 :x: | unit | 96m 38s | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 1m 7s | The patch does not generate ASF License warnings. | | | | 348m 8s | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped | | | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier | | | hadoop.fs.contract.hdfs.TestHDFSContractMultipartUploader | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2189/4/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2189 | | Optional Tests | dupname asflicense compile javac javadoc
[jira] [Comment Edited] (HADOOP-17209) Erasure Coding: Native library memory leak
[ https://issues.apache.org/jira/browse/HADOOP-17209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181077#comment-17181077 ] Sean Chow edited comment on HADOOP-17209 at 8/20/20, 9:51 AM: -- Hi [~sodonnell], the patch has 3 {{ReleaseIntArrayElements}}. :P Yeah, I've replaced the native library with the recompiled one on five datanodes in my production cluster. All of them work fine when getting and putting files (using EC RS-3-2-1024K), and the memory usage looks very promising. was (Author: seanlook): Hi [~sodonnell] , the patch has 3 {{ReleaseIntArrayElements}}.:P Yeah, I I've replaced with the recompiled native library in five datanodes from my production. All of them works fine when getting and putting files, and the memory used seems very promising. > Erasure Coding: Native library memory leak > -- > > Key: HADOOP-17209 > URL: https://issues.apache.org/jira/browse/HADOOP-17209 > Project: Hadoop Common > Issue Type: Bug > Components: native >Affects Versions: 3.3.0, 3.2.1, 3.1.3 >Reporter: Sean Chow >Assignee: Sean Chow >Priority: Major > Attachments: HADOOP-17209.001.patch, > datanode.202137.detail_diff.5.txt, image-2020-08-15-18-26-44-744.png, > image-2020-08-20-12-35-39-906.png > > > We use both {{apache-hadoop-3.1.3}} and {{CDH-6.1.1-1.cdh6.1.1.p0.875250}} > HDFS in production, and both of them show memory usage increasing beyond the {{-Xmx}} > value. > !image-2020-08-15-18-26-44-744.png! > > We use the EC strategy to save storage costs. > These are the JVM options: > {code:java} > -Dproc_datanode -Dhdfs.audit.logger=INFO,RFAAUDIT > -Dsecurity.audit.logger=INFO,RFAS -Djava.net.preferIPv4Stack=true > -Xms8589934592 -Xmx8589934592 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC > -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled > -XX:+HeapDumpOnOutOfMemoryError ...{code} > The max JVM heap size is 8 GB, but we can see the datanode RSS memory is 48g. > All the other datanodes in this HDFS cluster have the same issue. > {code:java} > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 226044 hdfs 20 0 50.6g 48g 4780 S 90.5 77.0 14728:27 > /usr/java/jdk1.8.0_162/bin/java -Dproc_datanode{code} > > This excessive memory use makes my machine unresponsive (if swap is enabled), > or triggers the oom-killer. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-17209) Erasure Coding: Native library memory leak
[ https://issues.apache.org/jira/browse/HADOOP-17209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181077#comment-17181077 ] Sean Chow edited comment on HADOOP-17209 at 8/20/20, 9:49 AM: -- Hi [~sodonnell], the patch has 3 {{ReleaseIntArrayElements}}. :P Yeah, I've replaced the native library with the recompiled one on five datanodes in my production cluster. All of them work fine when getting and putting files, and the memory usage looks very promising. was (Author: seanlook): Hi [~sodonnell] , the patch has 3 {{ReleaseIntArrayElements}}.:P Yeah, I I've replace with the recompiled native library in five datanode from my production. All of them works fine when getting and putting files, and the memory used seems very promising. > Erasure Coding: Native library memory leak > -- > > Key: HADOOP-17209 > URL: https://issues.apache.org/jira/browse/HADOOP-17209 > Project: Hadoop Common > Issue Type: Bug > Components: native >Affects Versions: 3.3.0, 3.2.1, 3.1.3 >Reporter: Sean Chow >Assignee: Sean Chow >Priority: Major > Attachments: HADOOP-17209.001.patch, > datanode.202137.detail_diff.5.txt, image-2020-08-15-18-26-44-744.png, > image-2020-08-20-12-35-39-906.png > > > We use both {{apache-hadoop-3.1.3}} and {{CDH-6.1.1-1.cdh6.1.1.p0.875250}} > HDFS in production, and both of them show memory usage increasing beyond the {{-Xmx}} > value. > !image-2020-08-15-18-26-44-744.png! > > We use the EC strategy to save storage costs. > These are the JVM options: > {code:java} > -Dproc_datanode -Dhdfs.audit.logger=INFO,RFAAUDIT > -Dsecurity.audit.logger=INFO,RFAS -Djava.net.preferIPv4Stack=true > -Xms8589934592 -Xmx8589934592 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC > -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled > -XX:+HeapDumpOnOutOfMemoryError ...{code} > The max JVM heap size is 8 GB, but we can see the datanode RSS memory is 48g. > All the other datanodes in this HDFS cluster have the same issue. > {code:java} > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 226044 hdfs 20 0 50.6g 48g 4780 S 90.5 77.0 14728:27 > /usr/java/jdk1.8.0_162/bin/java -Dproc_datanode{code} > > This excessive memory use makes my machine unresponsive (if swap is enabled), > or triggers the oom-killer. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17209) Erasure Coding: Native library memory leak
[ https://issues.apache.org/jira/browse/HADOOP-17209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181077#comment-17181077 ] Sean Chow commented on HADOOP-17209: Hi [~sodonnell], the patch has 3 {{ReleaseIntArrayElements}}. :P Yeah, I've replaced the native library with the recompiled one on five datanodes in my production cluster. All of them work fine when getting and putting files, and the memory usage looks very promising. > Erasure Coding: Native library memory leak > -- > > Key: HADOOP-17209 > URL: https://issues.apache.org/jira/browse/HADOOP-17209 > Project: Hadoop Common > Issue Type: Bug > Components: native >Affects Versions: 3.3.0, 3.2.1, 3.1.3 >Reporter: Sean Chow >Assignee: Sean Chow >Priority: Major > Attachments: HADOOP-17209.001.patch, > datanode.202137.detail_diff.5.txt, image-2020-08-15-18-26-44-744.png, > image-2020-08-20-12-35-39-906.png > > > We use both {{apache-hadoop-3.1.3}} and {{CDH-6.1.1-1.cdh6.1.1.p0.875250}} > HDFS in production, and both of them show memory usage increasing beyond the {{-Xmx}} > value. > !image-2020-08-15-18-26-44-744.png! > > We use the EC strategy to save storage costs. > These are the JVM options: > {code:java} > -Dproc_datanode -Dhdfs.audit.logger=INFO,RFAAUDIT > -Dsecurity.audit.logger=INFO,RFAS -Djava.net.preferIPv4Stack=true > -Xms8589934592 -Xmx8589934592 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC > -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled > -XX:+HeapDumpOnOutOfMemoryError ...{code} > The max JVM heap size is 8 GB, but we can see the datanode RSS memory is 48g. > All the other datanodes in this HDFS cluster have the same issue. > {code:java} > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 226044 hdfs 20 0 50.6g 48g 4780 S 90.5 77.0 14728:27 > /usr/java/jdk1.8.0_162/bin/java -Dproc_datanode{code} > > This excessive memory use makes my machine unresponsive (if swap is enabled), > or triggers the oom-killer. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-17209) Erasure Coding: Native library memory leak
[ https://issues.apache.org/jira/browse/HADOOP-17209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181071#comment-17181071 ] Stephen O'Donnell edited comment on HADOOP-17209 at 8/20/20, 9:43 AM: -- From the tutorial posted here: [http://www.iitk.ac.in/esc101/05Aug/tutorial/native1.1/implementing/array.html] It does indeed seem that you must call ReleaseIntArrayElements each time you call GetIntArrayElements, so the change makes sense to me. However, I have never used JNI, so my knowledge in this area is very small. Grepping the code for GetIntArrayElements, I see there are 3 occurrences of this currently: {code:java} $ pwd /Users/sodonnell/source/upstream_hadoop/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/erasurecode $ grep GetIntArrayElements *.c jni_common.c: tmpInputOffsets = (int*)(*env)->GetIntArrayElements(env, jni_common.c: tmpOutputOffsets = (int*)(*env)->GetIntArrayElements(env, jni_rs_decoder.c: int* tmpErasedIndexes = (int*)(*env)->GetIntArrayElements(env, {code} -This patch address 2 of them - in jni_common.c in the function getOutputs, do we need a call to ReleaseIntArrayElements there too?- This patch seems to address all 3. Earlier I missed the 3rd change in the patch file. [~seanlook] Have you been running with this patch in production for some time, and all EC operations are working fine with it? was (Author: sodonnell): From the tutorial posted here: http://www.iitk.ac.in/esc101/05Aug/tutorial/native1.1/implementing/array.html It does indeed seem that you must call ReleaseIntArrayElements each time you call GetIntArrayElements, so the change makes sense to me. However, I have never used JNI, so my knowledge in this area is very small. Grepping the code for GetIntArrayElements, I see there are 3 occurrences of this currently: {code} $ pwd /Users/sodonnell/source/upstream_hadoop/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/erasurecode $ grep GetIntArrayElements *.c jni_common.c: tmpInputOffsets = (int*)(*env)->GetIntArrayElements(env, jni_common.c: tmpOutputOffsets = (int*)(*env)->GetIntArrayElements(env, jni_rs_decoder.c: int* tmpErasedIndexes = (int*)(*env)->GetIntArrayElements(env, {code} This patch address 2 of them - in jni_common.c in the function getOutputs, do we need a call to ReleaseIntArrayElements there too? {code} void getOutputs(JNIEnv *env, jobjectArray outputs, jintArray outputOffsets, unsigned char** destOutputs, int num) { int numOutputs = (*env)->GetArrayLength(env, outputs); int i, *tmpOutputOffsets; jobject byteBuffer; if (numOutputs != num) { THROW(env, "java/lang/InternalError", "Invalid outputs"); } tmpOutputOffsets = (int*)(*env)->GetIntArrayElements(env, outputOffsets, NULL); for (i = 0; i < numOutputs; i++) { byteBuffer = (*env)->GetObjectArrayElement(env, outputs, i); destOutputs[i] = (unsigned char *)((*env)->GetDirectBufferAddress(env, byteBuffer)); destOutputs[i] += tmpOutputOffsets[i]; } } {code} [~seanlook] Have you been running with this patch in production for some time, and all EC operations are working fine with it? 
> Erasure Coding: Native library memory leak > -- > > Key: HADOOP-17209 > URL: https://issues.apache.org/jira/browse/HADOOP-17209 > Project: Hadoop Common > Issue Type: Bug > Components: native >Affects Versions: 3.3.0, 3.2.1, 3.1.3 >Reporter: Sean Chow >Assignee: Sean Chow >Priority: Major > Attachments: HADOOP-17209.001.patch, > datanode.202137.detail_diff.5.txt, image-2020-08-15-18-26-44-744.png, > image-2020-08-20-12-35-39-906.png > > > We use both {{apache-hadoop-3.1.3}} and {{CDH-6.1.1-1.cdh6.1.1.p0.875250}} > HDFS in production, and both of them show memory usage increasing beyond the {{-Xmx}} > value. > !image-2020-08-15-18-26-44-744.png! > > We use the EC strategy to save storage costs. > These are the JVM options: > {code:java} > -Dproc_datanode -Dhdfs.audit.logger=INFO,RFAAUDIT > -Dsecurity.audit.logger=INFO,RFAS -Djava.net.preferIPv4Stack=true > -Xms8589934592 -Xmx8589934592 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC > -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled > -XX:+HeapDumpOnOutOfMemoryError ...{code} > The max JVM heap size is 8 GB, but we can see the datanode RSS memory is 48g. > All the other datanodes in this HDFS cluster have the same issue. > {code:java} > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 226044 hdfs 20 0 50.6g 48g 4780 S 90.5 77.0 14728:27 > /usr/java/jdk1.8.0_162/bin/java -Dproc_datanode{code} > > This excessive memory use makes my machine unresponsive (if swap is enabled), > or triggers the oom-killer. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17209) Erasure Coding: Native library memory leak
[ https://issues.apache.org/jira/browse/HADOOP-17209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181071#comment-17181071 ] Stephen O'Donnell commented on HADOOP-17209: From the tutorial posted here: http://www.iitk.ac.in/esc101/05Aug/tutorial/native1.1/implementing/array.html It does indeed seem that you must call ReleaseIntArrayElements each time you call GetIntArrayElements, so the change makes sense to me. However, I have never used JNI, so my knowledge in this area is very small. Grepping the code for GetIntArrayElements, I see there are 3 occurrences of this currently: {code} $ pwd /Users/sodonnell/source/upstream_hadoop/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/erasurecode $ grep GetIntArrayElements *.c jni_common.c: tmpInputOffsets = (int*)(*env)->GetIntArrayElements(env, jni_common.c: tmpOutputOffsets = (int*)(*env)->GetIntArrayElements(env, jni_rs_decoder.c: int* tmpErasedIndexes = (int*)(*env)->GetIntArrayElements(env, {code} This patch addresses 2 of them - in jni_common.c in the function getOutputs, do we need a call to ReleaseIntArrayElements there too? {code} void getOutputs(JNIEnv *env, jobjectArray outputs, jintArray outputOffsets, unsigned char** destOutputs, int num) { int numOutputs = (*env)->GetArrayLength(env, outputs); int i, *tmpOutputOffsets; jobject byteBuffer; if (numOutputs != num) { THROW(env, "java/lang/InternalError", "Invalid outputs"); } tmpOutputOffsets = (int*)(*env)->GetIntArrayElements(env, outputOffsets, NULL); for (i = 0; i < numOutputs; i++) { byteBuffer = (*env)->GetObjectArrayElement(env, outputs, i); destOutputs[i] = (unsigned char *)((*env)->GetDirectBufferAddress(env, byteBuffer)); destOutputs[i] += tmpOutputOffsets[i]; } } {code} [~seanlook] Have you been running with this patch in production for some time, and all EC operations are working fine with it? > Erasure Coding: Native library memory leak > -- > > Key: HADOOP-17209 > URL: https://issues.apache.org/jira/browse/HADOOP-17209 > Project: Hadoop Common > Issue Type: Bug > Components: native >Affects Versions: 3.3.0, 3.2.1, 3.1.3 >Reporter: Sean Chow >Assignee: Sean Chow >Priority: Major > Attachments: HADOOP-17209.001.patch, > datanode.202137.detail_diff.5.txt, image-2020-08-15-18-26-44-744.png, > image-2020-08-20-12-35-39-906.png > > > We use both {{apache-hadoop-3.1.3}} and {{CDH-6.1.1-1.cdh6.1.1.p0.875250}} > HDFS in production, and both of them show memory usage increasing beyond the {{-Xmx}} > value. > !image-2020-08-15-18-26-44-744.png! > > We use the EC strategy to save storage costs. > These are the JVM options: > {code:java} > -Dproc_datanode -Dhdfs.audit.logger=INFO,RFAAUDIT > -Dsecurity.audit.logger=INFO,RFAS -Djava.net.preferIPv4Stack=true > -Xms8589934592 -Xmx8589934592 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC > -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled > -XX:+HeapDumpOnOutOfMemoryError ...{code} > The max JVM heap size is 8 GB, but we can see the datanode RSS memory is 48g. > All the other datanodes in this HDFS cluster have the same issue. > {code:java} > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 226044 hdfs 20 0 50.6g 48g 4780 S 90.5 77.0 14728:27 > /usr/java/jdk1.8.0_162/bin/java -Dproc_datanode{code} > > This excessive memory use makes my machine unresponsive (if swap is enabled), 
> -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
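As the edited version of this comment above concludes, the patch does add the third release in {{getOutputs}}. For reference, the pairing would look roughly as follows; this is a sketch written against the snippet quoted above, not the committed [^HADOOP-17209.001.patch], and the release mode chosen here is an assumption:

{code:c}
/* Sketch of getOutputs with the missing release added; not the committed
 * patch. THROW is the Hadoop native helper macro used in the original file. */
void getOutputs(JNIEnv *env, jobjectArray outputs, jintArray outputOffsets,
    unsigned char** destOutputs, int num) {
  int numOutputs = (*env)->GetArrayLength(env, outputs);
  int i, *tmpOutputOffsets;
  jobject byteBuffer;

  if (numOutputs != num) {
    THROW(env, "java/lang/InternalError", "Invalid outputs");
  }

  tmpOutputOffsets = (int*)(*env)->GetIntArrayElements(env,
      outputOffsets, NULL);
  for (i = 0; i < numOutputs; i++) {
    byteBuffer = (*env)->GetObjectArrayElement(env, outputs, i);
    destOutputs[i] = (unsigned char *)((*env)->GetDirectBufferAddress(env,
        byteBuffer));
    destOutputs[i] += tmpOutputOffsets[i];
  }
  /* Pair the Get above: free (or unpin) the native offset buffer. JNI_ABORT
   * skips the copy-back, which is safe because the offsets are only read. */
  (*env)->ReleaseIntArrayElements(env, outputOffsets,
      (jint*)tmpOutputOffsets, JNI_ABORT);
}
{code}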
[jira] [Work stopped] (HADOOP-16206) Migrate from Log4j1 to Log4j2
[ https://issues.apache.org/jira/browse/HADOOP-16206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HADOOP-16206 stopped by Akira Ajisaka. -- > Migrate from Log4j1 to Log4j2 > - > > Key: HADOOP-16206 > URL: https://issues.apache.org/jira/browse/HADOOP-16206 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Major > Attachments: HADOOP-16206-wip.001.patch > > > This sub-task is to remove the log4j1 dependency and add a log4j2 dependency. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-16206) Migrate from Log4j1 to Log4j2
[ https://issues.apache.org/jira/browse/HADOOP-16206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka reassigned HADOOP-16206: -- Assignee: (was: Akira Ajisaka) > Migrate from Log4j1 to Log4j2 > - > > Key: HADOOP-16206 > URL: https://issues.apache.org/jira/browse/HADOOP-16206 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: Akira Ajisaka >Priority: Major > Attachments: HADOOP-16206-wip.001.patch > > > This sub-task is to remove the log4j1 dependency and add a log4j2 dependency. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-16206) Migrate from Log4j1 to Log4j2
[ https://issues.apache.org/jira/browse/HADOOP-16206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka reassigned HADOOP-16206: -- Assignee: Akira Ajisaka > Migrate from Log4j1 to Log4j2 > - > > Key: HADOOP-16206 > URL: https://issues.apache.org/jira/browse/HADOOP-16206 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Major > Attachments: HADOOP-16206-wip.001.patch > > > This sub-task is to remove the log4j1 dependency and add a log4j2 dependency. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16206) Migrate from Log4j1 to Log4j2
[ https://issues.apache.org/jira/browse/HADOOP-16206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181033#comment-17181033 ] Akira Ajisaka commented on HADOOP-16206: [~ahussein] I don't have time to work on this issue now. Sorry for the very late response. > Migrate from Log4j1 to Log4j2 > - > > Key: HADOOP-16206 > URL: https://issues.apache.org/jira/browse/HADOOP-16206 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Major > Attachments: HADOOP-16206-wip.001.patch > > > This sub-task is to remove the log4j1 dependency and add a log4j2 dependency. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-16206) Migrate from Log4j1 to Log4j2
[ https://issues.apache.org/jira/browse/HADOOP-16206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka reassigned HADOOP-16206: -- Assignee: (was: Akira Ajisaka) > Migrate from Log4j1 to Log4j2 > - > > Key: HADOOP-16206 > URL: https://issues.apache.org/jira/browse/HADOOP-16206 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: Akira Ajisaka >Priority: Major > Attachments: HADOOP-16206-wip.001.patch > > > This sub-task is to remove the log4j1 dependency and add a log4j2 dependency. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] hadoop-yetus commented on pull request #2185: HADOOP-15891. provide Regex Based Mount Point In Inode Tree
hadoop-yetus commented on pull request #2185: URL: https://github.com/apache/hadoop/pull/2185#issuecomment-677413520 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 26m 25s | Docker mode activated. | ||| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. | | +0 :ok: | markdownlint | 0m 0s | markdownlint was not available. | | +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. | ||| _ trunk Compile Tests _ | | +0 :ok: | mvndep | 3m 22s | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 25m 44s | trunk passed | | +1 :green_heart: | compile | 19m 25s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | compile | 17m 6s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | checkstyle | 2m 56s | trunk passed | | +1 :green_heart: | mvnsite | 2m 49s | trunk passed | | +1 :green_heart: | shadedclient | 20m 1s | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 1m 38s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 3m 6s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +0 :ok: | spotbugs | 3m 57s | Used deprecated FindBugs config; considering switching to SpotBugs. | | +1 :green_heart: | findbugs | 6m 45s | trunk passed | ||| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 29s | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 2m 22s | the patch passed | | +1 :green_heart: | compile | 24m 54s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javac | 24m 54s | the patch passed | | +1 :green_heart: | compile | 21m 58s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | javac | 21m 58s | the patch passed | | -0 :warning: | checkstyle | 3m 37s | root: The patch generated 1 new + 182 unchanged - 1 fixed = 183 total (was 183) | | +1 :green_heart: | mvnsite | 3m 23s | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. | | +1 :green_heart: | shadedclient | 17m 13s | patch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 1m 35s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 3m 27s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | findbugs | 7m 10s | the patch passed | ||| _ Other Tests _ | | +1 :green_heart: | unit | 10m 53s | hadoop-common in the patch passed. | | -1 :x: | unit | 111m 10s | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 1m 7s | The patch does not generate ASF License warnings. 
| | | | 338m 11s | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.TestDFSInotifyEventInputStream | | | hadoop.hdfs.TestErasureCodingPolicyWithSnapshot | | | hadoop.hdfs.TestReadStripedFileWithDNFailure | | | hadoop.hdfs.server.balancer.TestBalancerRPCDelay | | | hadoop.hdfs.TestDatanodeDeath | | | hadoop.hdfs.TestGetFileChecksum | | | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks | | | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy | | | hadoop.hdfs.TestReadStripedFileWithDecoding | | | hadoop.hdfs.TestStripedFileAppend | | | hadoop.hdfs.TestDecommissionWithStripedBackoffMonitor | | | hadoop.fs.contract.hdfs.TestHDFSContractMultipartUploader | | | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics | | | hadoop.hdfs.TestDFSStorageStateRecovery | | | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier | | | hadoop.hdfs.server.datanode.TestBPOfferService | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2185/11/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2185 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle markdownlint | | uname | Linux 9aaa4225ec86 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
[jira] [Updated] (HADOOP-17215) ABFS: Excessive Create Overwrites lead to race conditions
[ https://issues.apache.org/jira/browse/HADOOP-17215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sneha Vijayarajan updated HADOOP-17215: --- Summary: ABFS: Excessive Create Overwrites lead to race conditions (was: ABFS: Excessive Create overwrites leading to unnecessary extra transactions) > ABFS: Excessive Create Overwrites lead to race conditions > -- > > Key: HADOOP-17215 > URL: https://issues.apache.org/jira/browse/HADOOP-17215 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Sneha Vijayarajan >Assignee: Sneha Vijayarajan >Priority: Major > > Filesystem Create APIs that do not accept an argument for the overwrite flag end > up defaulting it to true. > We are observing a high request count for creates with overwrite=true, primarily > because the called Create API defaults the flag to true. When a create with > overwrite times out, we have observed that it can lead to race conditions between > the first create and the retried one running almost in parallel. > To avoid this scenario for create requests with overwrite=true, the ABFS driver > will always attempt the create without overwrite first. If the create fails due to > fileAlreadyPresent, it will resend the request with overwrite=true. > > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-17215) ABFS: Excessive Create overwrites leading to unnecessary extra transactions
Sneha Vijayarajan created HADOOP-17215: -- Summary: ABFS: Excessive Create overwrites leading to unnecessary extra transactions Key: HADOOP-17215 URL: https://issues.apache.org/jira/browse/HADOOP-17215 Project: Hadoop Common Issue Type: Sub-task Reporter: Sneha Vijayarajan Assignee: Sneha Vijayarajan Filesystem Create APIs that do not accept an argument for the overwrite flag end up defaulting it to true. We are observing a high request count for creates with overwrite=true, primarily because the called Create API defaults the flag to true. When a create with overwrite times out, we have observed that it can lead to race conditions between the first create and the retried one running almost in parallel. To avoid this scenario for create requests with overwrite=true, the ABFS driver will always attempt the create without overwrite first. If the create fails due to fileAlreadyPresent, it will resend the request with overwrite=true. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org