[jira] [Commented] (HADOOP-16918) Dependency update for Hadoop 2.10
[ https://issues.apache.org/jira/browse/HADOOP-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192681#comment-17192681 ]

Masatake Iwasaki commented on HADOOP-16918:
-------------------------------------------

Since YARN-8936 upgraded hbase.one.version from 1.2.6 to 1.4.8, there is no commit to cherry-pick. I filed HADOOP-17254 to upgrade hbase.

> Dependency update for Hadoop 2.10
> ---------------------------------
>
>                 Key: HADOOP-16918
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16918
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Wei-Chiu Chuang
>            Priority: Major
>              Labels: release-blocker
>         Attachments: dependency-check-report.html, dependency-check-report.html
>
> A number of dependencies can be updated:
> * nimbus-jose-jwt
> * jetty
> * netty
> * zookeeper
> * hbase-common
> * jackson-databind
>
> and many more. They should be updated in the 2.10.1 release.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] hadoop-yetus commented on pull request #2287: HDFS-15552. Let DeadNode Detector also work for EC cases
hadoop-yetus commented on pull request #2287:
URL: https://github.com/apache/hadoop/pull/2287#issuecomment-689348278

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| +0 :ok: | reexec | 28m 48s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 21s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 29m 55s | trunk passed |
| +1 :green_heart: | compile | 4m 47s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 4m 18s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 1m 1s | trunk passed |
| +1 :green_heart: | mvnsite | 2m 19s | trunk passed |
| +1 :green_heart: | shadedclient | 19m 4s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 33s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 1s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 2m 43s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 6m 8s | trunk passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 26s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 2m 5s | the patch passed |
| +1 :green_heart: | compile | 4m 48s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javac | 4m 48s | the patch passed |
| +1 :green_heart: | compile | 4m 11s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | javac | 4m 11s | the patch passed |
| +1 :green_heart: | checkstyle | 0m 56s | the patch passed |
| +1 :green_heart: | mvnsite | 2m 6s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | shadedclient | 15m 45s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 28s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 1m 58s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | findbugs | 7m 14s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 12s | hadoop-hdfs-client in the patch passed. |
| -1 :x: | unit | 116m 30s | hadoop-hdfs in the patch failed. |
| +1 :green_heart: | asflicense | 1m 4s | The patch does not generate ASF License warnings. |
| | | 264m 41s | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
| | hadoop.hdfs.TestDeadNodeDetection |
| | hadoop.hdfs.server.datanode.TestBPOfferService |
| | hadoop.hdfs.TestFileChecksum |
| | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
| | hadoop.hdfs.TestSafeModeWithStripedFile |
| | hadoop.hdfs.TestFileChecksumCompositeCrc |
| | hadoop.hdfs.server.mover.TestMover |
| | hadoop.hdfs.TestGetFileChecksum |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2287/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2287 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 5484194563c1 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 0d855159f09 |
| Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| unit | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2287/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/j
[jira] [Updated] (HADOOP-17254) Upgrade hbase to 1.2.6.1 on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated HADOOP-17254:
--------------------------------------
    Summary: Upgrade hbase to 1.2.6.1 on branch-2.10  (was: Upgrade hbase to 1.2.6.1 on branch-2)

> Upgrade hbase to 1.2.6.1 on branch-2.10
> ---------------------------------------
>
>                 Key: HADOOP-17254
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17254
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Masatake Iwasaki
>            Assignee: Masatake Iwasaki
>            Priority: Major
[jira] [Updated] (HADOOP-17254) Upgrade hbase to 1.2.6.1 on branch-2
[ https://issues.apache.org/jira/browse/HADOOP-17254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated HADOOP-17254:
--------------------------------------
    Target Version/s: 2.10.1

> Upgrade hbase to 1.2.6.1 on branch-2
> ------------------------------------
>
>                 Key: HADOOP-17254
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17254
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Masatake Iwasaki
>            Assignee: Masatake Iwasaki
>            Priority: Major
[jira] [Created] (HADOOP-17254) Upgrade hbase to 1.2.6.1 on branch-2
Masatake Iwasaki created HADOOP-17254:
-----------------------------------------

             Summary: Upgrade hbase to 1.2.6.1 on branch-2
                 Key: HADOOP-17254
                 URL: https://issues.apache.org/jira/browse/HADOOP-17254
             Project: Hadoop Common
          Issue Type: Improvement
            Reporter: Masatake Iwasaki
            Assignee: Masatake Iwasaki
[jira] [Updated] (HADOOP-17253) Upgrade zookeeper to 3.4.14 on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HADOOP-17253:
------------------------------------
    Labels: pull-request-available  (was: )

> Upgrade zookeeper to 3.4.14 on branch-2.10
> ------------------------------------------
>
>                 Key: HADOOP-17253
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17253
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Masatake Iwasaki
>            Assignee: Masatake Iwasaki
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Since the versions of zookeeper and curator have different histories between
> branch-2.10 and trunk, I filed this to upgrade both zookeeper and curator on
> branch-2.10.
[jira] [Updated] (HADOOP-17253) Upgrade zookeeper to 3.4.14 on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated HADOOP-17253:
--------------------------------------
    Status: Patch Available  (was: Open)

> Upgrade zookeeper to 3.4.14 on branch-2.10
> ------------------------------------------
>
>                 Key: HADOOP-17253
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17253
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Masatake Iwasaki
>            Assignee: Masatake Iwasaki
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Since the versions of zookeeper and curator have different histories between
> branch-2.10 and trunk, I filed this to upgrade both zookeeper and curator on
> branch-2.10.
[jira] [Work logged] (HADOOP-17253) Upgrade zookeeper to 3.4.14 on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17253?focusedWorklogId=480620&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480620 ]

ASF GitHub Bot logged work on HADOOP-17253:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 09/Sep/20 06:35
            Start Date: 09/Sep/20 06:35
    Worklog Time Spent: 10m
      Work Description: iwasakims opened a new pull request #2289:
URL: https://github.com/apache/hadoop/pull/2289

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------
            Worklog Id: (was: 480620)
    Remaining Estimate: 0h
            Time Spent: 10m

> Upgrade zookeeper to 3.4.14 on branch-2.10
> ------------------------------------------
>
>                 Key: HADOOP-17253
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17253
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Masatake Iwasaki
>            Assignee: Masatake Iwasaki
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Since the versions of zookeeper and curator have different histories between
> branch-2.10 and trunk, I filed this to upgrade both zookeeper and curator on
> branch-2.10.
[GitHub] [hadoop] iwasakims opened a new pull request #2289: HADOOP-17253. Upgrade zookeeper to 3.4.14 on branch-2.10.
iwasakims opened a new pull request #2289:
URL: https://github.com/apache/hadoop/pull/2289
[jira] [Commented] (HADOOP-17217) S3A FileSystem does not correctly delete directories with fake entries
[ https://issues.apache.org/jira/browse/HADOOP-17217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192666#comment-17192666 ]

Kaya Kupferschmidt commented on HADOOP-17217:
---------------------------------------------

Thanks a lot [~ste...@apache.org]. I will check Hadoop branch-3.3 and inform you about the results. I am confident that the problem will be solved.

> S3A FileSystem does not correctly delete directories with fake entries
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-17217
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17217
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 3.2.0
>            Reporter: Kaya Kupferschmidt
>            Priority: Major
>
> h3. Summary
> We are facing an issue where the Hadoop S3A FileSystem gets confused by fake
> directory objects in S3. Specifically, trying to recursively remove a whole
> Hadoop directory (i.e. all objects with the same S3 prefix) doesn't work as
> expected.
> h2. Background
> We are using Alluxio together with S3 as our deep store. For some
> infrastructure reasons we decided to write directly to S3 (bypassing Alluxio)
> with our Spark applications, and only use Alluxio for reading data.
> When we write our results directly into S3 with Spark, everything is fine.
> But once Alluxio accesses S3, it creates these fake directory entries.
> When we then try to overwrite existing data in S3 with a Spark application,
> the result is incorrect, since Spark will only remove the fake directory
> entry, but not all other objects below that prefix in S3.
> Of course it is questionable whether Alluxio is doing the right thing, but on
> the other hand it seems that Hadoop also does not behave as we expected.
> h2. Steps to Reproduce
> The following steps only require the AWS CLI and Hadoop CLI to reproduce the
> issue we are facing:
> h3. Initial setup
> {code:bash}
> # First step: Create a new and empty bucket in S3
> $ aws s3 mb s3://dimajix-tmp
> make_bucket: dimajix-tmp
> $ aws s3 ls
> 2020-08-21 11:19:50 dimajix-tmp
> # Upload some data
> $ aws s3 cp some_file.txt s3://dimajix-tmp/tmp/
> upload: ./some_file.txt to s3://dimajix-tmp/tmp/some_file.txt
> $ aws s3 ls s3://dimajix-tmp/tmp/
> 2020-08-21 11:23:35          0 some_file.txt
> # Check that Hadoop can list the file
> $ /opt/hadoop/bin/hdfs dfs -ls s3a://dimajix-tmp/
> Found 1 items
> drwxrwxrwx   - kaya kaya          0 2020-08-21 11:24 s3a://dimajix-tmp/tmp
> $ /opt/hadoop/bin/hdfs dfs -ls s3a://dimajix-tmp/tmp/
> -rw-rw-rw-   1 kaya kaya          0 2020-08-21 11:23 s3a://dimajix-tmp/tmp/some_file.txt
> # Evil step: Create fake directory entry in S3
> $ aws s3api put-object --bucket dimajix-tmp --key tmp/
> {
>     "ETag": "\"d41d8cd98f00b204e9800998ecf8427e\""
> }
> # Look into S3, ensure that the fake directory entry was created
> $ aws s3 ls s3://dimajix-tmp/tmp/
> 2020-08-21 11:25:40          0
> 2020-08-21 11:23:35          0 some_file.txt
> # Look into S3 using the Hadoop CLI, ensure that everything looks okay
> # (which is the case)
> $ /opt/hadoop/bin/hdfs dfs -ls s3a://dimajix-tmp/tmp/
> Found 1 items
> -rw-rw-rw-   1 kaya kaya          0 2020-08-21 11:23 s3a://dimajix-tmp/tmp/some_file.txt
> {code}
> h3. Reproduce questionable behaviour: Try to recursively delete directory
> {code:bash}
> # Bug: Now try to delete the directory with the Hadoop CLI
> $ /opt/hadoop/bin/hdfs dfs -rm s3a://dimajix-tmp/tmp/
> rm: `s3a://dimajix-tmp/tmp': Is a directory
> # Okay, that didn't work out; Hadoop interprets the prefix as a directory
> # (which is fine). It also did not delete anything, as we can see in S3:
> $ aws s3 ls s3://dimajix-tmp/tmp/
> 2020-08-21 11:25:40          0
> 2020-08-21 11:23:35          0 some_file.txt
> # Now let's try a little bit harder by trying to recursively delete the
> # directory
> $ /opt/hadoop/bin/hdfs dfs -rm -r s3a://dimajix-tmp/tmp/
> Deleted s3a://dimajix-tmp/tmp
> # Everything looked fine so far. But let's inspect S3 directly. We'll find
> # that only the prefix (fake directory entry) has been removed. The file in
> # the directory is still there.
> $ aws s3 ls s3://dimajix-tmp/tmp/
> 2020-08-21 11:23:35          0 some_file.txt
> # We can also use the Hadoop CLI to check that the directory containing the
> # file is still present, although we wanted to delete it above.
> $ /opt/hadoop/bin/hdfs dfs -ls s3a://dimajix-tmp/tmp/
> Found 1 items
> -rw-rw-rw-   1 kaya kaya          0 2020-08-21 11:23 s3a://dimajix-tmp/tmp/some_file.txt
> {code}
> h3. Remedy by perform
[jira] [Commented] (HADOOP-16918) Dependency update for Hadoop 2.10
[ https://issues.apache.org/jira/browse/HADOOP-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192659#comment-17192659 ]

Masatake Iwasaki commented on HADOOP-16918:
-------------------------------------------

Since the versions of zookeeper and curator have different histories between branch-2.10 and trunk, I filed HADOOP-17253 to upgrade both zookeeper and curator on branch-2.10.

> Dependency update for Hadoop 2.10
> ---------------------------------
>
>                 Key: HADOOP-16918
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16918
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Wei-Chiu Chuang
>            Priority: Major
>              Labels: release-blocker
>         Attachments: dependency-check-report.html, dependency-check-report.html
>
> A number of dependencies can be updated:
> * nimbus-jose-jwt
> * jetty
> * netty
> * zookeeper
> * hbase-common
> * jackson-databind
>
> and many more. They should be updated in the 2.10.1 release.
[jira] [Created] (HADOOP-17253) Upgrade zookeeper to 3.4.14 on branch-2.10
Masatake Iwasaki created HADOOP-17253:
-----------------------------------------

             Summary: Upgrade zookeeper to 3.4.14 on branch-2.10
                 Key: HADOOP-17253
                 URL: https://issues.apache.org/jira/browse/HADOOP-17253
             Project: Hadoop Common
          Issue Type: Improvement
            Reporter: Masatake Iwasaki
            Assignee: Masatake Iwasaki

Since the versions of zookeeper and curator have different histories between branch-2.10 and trunk, I filed this to upgrade both zookeeper and curator on branch-2.10.
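[Editor's note] A dependency upgrade like this one is typically a one-line version bump in the project's parent pom. A hypothetical sketch of what the change might look like: the `zookeeper.version` property name follows Hadoop's pom convention, and the Curator version shown is an assumption (the issue only states that both ZooKeeper and Curator will move together):

```xml
<!-- Illustrative fragment of hadoop-project/pom.xml on branch-2.10.
     Property names follow Hadoop's convention; the curator version
     shown here is an assumption, not taken from the actual patch. -->
<properties>
  <zookeeper.version>3.4.14</zookeeper.version>
  <curator.version>2.13.0</curator.version>
</properties>
```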
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=480616&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480616 ]

ASF GitHub Bot logged work on HADOOP-17125:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 09/Sep/20 06:22
            Start Date: 09/Sep/20 06:22
    Worklog Time Spent: 10m
      Work Description: dbtsai commented on a change in pull request #2201:
URL: https://github.com/apache/hadoop/pull/2201#discussion_r485364827

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java

@@ -276,13 +258,27 @@ public void end() {
     // do nothing
   }

-  private native static void initIDs();
+  private int decompressBytesDirect() throws IOException {
+    if (compressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `compressedDirectBuf` for reading
+      compressedDirectBuf.position(0).limit(compressedDirectBufLen);
+      // There is compressed input, decompress it now.
+      int size = Snappy.uncompressedLength((ByteBuffer) compressedDirectBuf);
+      if (size > uncompressedDirectBuf.capacity()) {

Review comment: Before calling `decompressBytesDirect`, we always reset `uncompressedDirectBuf`, so both methods return the same result. But you are right, I think `.remaining()` is better.

Issue Time Tracking
-------------------
    Worklog Id: (was: 480616)
    Time Spent: 1h  (was: 50m)

> Using snappy-java in SnappyCodec
> --------------------------------
>
>                 Key: HADOOP-17125
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17125
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: common
>    Affects Versions: 3.3.0
>            Reporter: DB Tsai
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for the snappy codec, which has several
> disadvantages:
> * It requires native *libhadoop* and *libsnappy* to be installed in the system
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of
> the clusters, container images, or local test environments, which adds huge
> complexity from a deployment point of view. In some environments it requires
> compiling the natives from source, which is non-trivial. Also, this approach
> is platform dependent; the binary may not work on a different platform, so it
> requires recompilation.
> * It requires extra configuration of *java.library.path* to load the
> natives, which results in higher application deployment and maintenance cost
> for users.
> Projects such as *Spark* and *Parquet* use
> [snappy-java|https://github.com/xerial/snappy-java], which is a JNI-based
> implementation. It contains native binaries for Linux, Mac, and IBM platforms
> in the jar file, and it can automatically load the native binaries into the
> JVM from the jar without any setup. If a native implementation cannot be found
> for a platform, it can fall back to a pure-Java implementation of snappy based
> on [aircompressor|https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy].
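[Editor's note] The pure-Java fallback path described in the issue above can be illustrated with a minimal sketch. The JDK's built-in `Deflater`/`Inflater` stand in here for a JNI-free codec round trip; this is an assumption for illustration only, not snappy-java's actual API, but it shows the key property being argued for: no native library on `LD_LIBRARY_PATH` and no `java.library.path` configuration is needed.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class PureJavaCodecSketch {
    public static void main(String[] args) throws Exception {
        byte[] input = "hello hello hello hello".getBytes(StandardCharsets.UTF_8);

        // Compress entirely inside the JVM: no libhadoop/libsnappy required.
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] compressed = new byte[256];
        int compressedLen = deflater.deflate(compressed);
        deflater.end();

        // Decompress and verify the round trip restores the original bytes.
        Inflater inflater = new Inflater();
        inflater.setInput(compressed, 0, compressedLen);
        byte[] restored = new byte[input.length];
        int restoredLen = inflater.inflate(restored);
        inflater.end();

        if (restoredLen != input.length) {
            throw new AssertionError("round trip lost data");
        }
        System.out.println(new String(restored, 0, restoredLen, StandardCharsets.UTF_8));
    }
}
```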
[GitHub] [hadoop] dbtsai commented on a change in pull request #2201: HADOOP-17125. Using snappy-java in SnappyCodec
dbtsai commented on a change in pull request #2201:
URL: https://github.com/apache/hadoop/pull/2201#discussion_r485364827

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java

@@ -276,13 +258,27 @@ public void end() {
     // do nothing
   }

-  private native static void initIDs();
+  private int decompressBytesDirect() throws IOException {
+    if (compressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `compressedDirectBuf` for reading
+      compressedDirectBuf.position(0).limit(compressedDirectBufLen);
+      // There is compressed input, decompress it now.
+      int size = Snappy.uncompressedLength((ByteBuffer) compressedDirectBuf);
+      if (size > uncompressedDirectBuf.capacity()) {

Review comment: Before calling `decompressBytesDirect`, we always reset `uncompressedDirectBuf`, so both methods return the same result. But you are right, I think `.remaining()` is better.
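[Editor's note] The distinction the reviewers are discussing (`capacity()` vs `remaining()`) comes from `java.nio.Buffer` semantics. A minimal, self-contained sketch of why the two bounds diverge once the buffer has been partially consumed:

```java
import java.nio.ByteBuffer;

public class RemainingVsCapacity {
    public static void main(String[] args) {
        ByteBuffer uncompressedDirectBuf = ByteBuffer.allocateDirect(64);

        // A cleared buffer has position=0 and limit=capacity,
        // so capacity() and remaining() agree.
        uncompressedDirectBuf.clear();
        System.out.println(uncompressedDirectBuf.capacity());   // 64
        System.out.println(uncompressedDirectBuf.remaining());  // 64

        // After a partial put, capacity() is unchanged but remaining()
        // shrinks to the space actually left. This is why remaining()
        // is the safer bound when checking whether decompressed output fits.
        uncompressedDirectBuf.put(new byte[10]);
        System.out.println(uncompressedDirectBuf.capacity());   // 64
        System.out.println(uncompressedDirectBuf.remaining());  // 54
    }
}
```

As the review comment notes, the two checks only coincide because the decompressor always resets the buffer first; `remaining()` stays correct even if that invariant is later broken.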
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=480615&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480615 ]

ASF GitHub Bot logged work on HADOOP-17125:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 09/Sep/20 06:19
            Start Date: 09/Sep/20 06:19
    Worklog Time Spent: 10m
      Work Description: dbtsai commented on a change in pull request #2201:
URL: https://github.com/apache/hadoop/pull/2201#discussion_r485363709

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java

@@ -276,13 +258,27 @@ public void end() {
     // do nothing
   }

-  private native static void initIDs();
+  private int decompressBytesDirect() throws IOException {
+    if (compressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `compressedDirectBuf` for reading
+      compressedDirectBuf.position(0).limit(compressedDirectBufLen);

Review comment: `decompressBytesDirect` is the only call that reads the data from `compressedDirectBuf`, so I think we should read it from the beginning and fully decompress `compressedDirectBuf`.

Issue Time Tracking
-------------------
    Worklog Id: (was: 480615)
    Time Spent: 50m  (was: 40m)

> Using snappy-java in SnappyCodec
> --------------------------------
>
>                 Key: HADOOP-17125
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17125
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: common
>    Affects Versions: 3.3.0
>            Reporter: DB Tsai
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for the snappy codec, which has several
> disadvantages:
> * It requires native *libhadoop* and *libsnappy* to be installed in the system
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of
> the clusters, container images, or local test environments, which adds huge
> complexity from a deployment point of view. In some environments it requires
> compiling the natives from source, which is non-trivial. Also, this approach
> is platform dependent; the binary may not work on a different platform, so it
> requires recompilation.
> * It requires extra configuration of *java.library.path* to load the
> natives, which results in higher application deployment and maintenance cost
> for users.
> Projects such as *Spark* and *Parquet* use
> [snappy-java|https://github.com/xerial/snappy-java], which is a JNI-based
> implementation. It contains native binaries for Linux, Mac, and IBM platforms
> in the jar file, and it can automatically load the native binaries into the
> JVM from the jar without any setup. If a native implementation cannot be found
> for a platform, it can fall back to a pure-Java implementation of snappy based
> on [aircompressor|https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy].
[GitHub] [hadoop] dbtsai commented on a change in pull request #2201: HADOOP-17125. Using snappy-java in SnappyCodec
dbtsai commented on a change in pull request #2201:
URL: https://github.com/apache/hadoop/pull/2201#discussion_r485363709

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java

@@ -276,13 +258,27 @@ public void end() {
     // do nothing
   }

-  private native static void initIDs();
+  private int decompressBytesDirect() throws IOException {
+    if (compressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `compressedDirectBuf` for reading
+      compressedDirectBuf.position(0).limit(compressedDirectBufLen);

Review comment: `decompressBytesDirect` is the only call that reads the data from `compressedDirectBuf`, so I think we should read it from the beginning and fully decompress `compressedDirectBuf`.
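[Editor's note] The `position(0).limit(len)` idiom under review exposes exactly the valid bytes of a larger direct buffer as a read window. A minimal sketch of the pattern (variable names mirror the patch; the string payload is an illustrative stand-in for compressed data):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ReadWindowSketch {
    public static void main(String[] args) {
        // Producer side: write 5 meaningful bytes into a larger direct buffer
        // and track the valid length separately (the compressedDirectBufLen
        // pattern from the patch).
        ByteBuffer compressedDirectBuf = ByteBuffer.allocateDirect(64);
        byte[] payload = "hello".getBytes(StandardCharsets.UTF_8);
        compressedDirectBuf.put(payload);
        int compressedDirectBufLen = payload.length;

        // Reader side: restrict the buffer to [0, len) before consuming,
        // so a reader never sees stale bytes beyond the valid region.
        compressedDirectBuf.position(0).limit(compressedDirectBufLen);
        byte[] out = new byte[compressedDirectBuf.remaining()];
        compressedDirectBuf.get(out);
        System.out.println(new String(out, StandardCharsets.UTF_8)); // prints: hello
    }
}
```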
[GitHub] [hadoop] JohnZZGithub commented on a change in pull request #2185: HADOOP-15891. provide Regex Based Mount Point In Inode Tree
JohnZZGithub commented on a change in pull request #2185:
URL: https://github.com/apache/hadoop/pull/2185#discussion_r485356795

## File path: hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ViewFs.md

@@ -366,6 +366,69 @@ Don't want to change scheme or difficult to copy mount-table configurations to a

 Please refer to the [View File System Overload Scheme Guide](./ViewFsOverloadScheme.html)

+Regex Pattern Based Mount Points
+--------------------------------
+
+The view file system mount points are a key-value based mapping system. This is not friendly for use cases whose mapping config could be abstracted into rules. E.g. users want to provide a GCS bucket per user, and there might be thousands of users in total. The old key-value based approach won't work well, for several reasons:
+
+1. The mount table is used by FileSystem clients. There's a cost to spreading the config to all clients, and we should avoid it if possible. The [View File System Overload Scheme Guide](./ViewFsOverloadScheme.html) could help the distribution via central mount table management, but the mount table still has to be updated on every change. Such changes could be largely avoided with a rule-based mount table.
+
+2. The client has to understand all the KVs in the mount table. This is not ideal when the mount table grows to thousands of items. E.g. thousands of file systems might be initialized even if users only need one, and the config itself will become bloated at scale.
+
+### Understand the Difference
+
+In the key-value based mount table, the view file system treats every mount point as a partition. There are several file system APIs which lead to an operation on all partitions. E.g. given an HDFS cluster with multiple mount points, users want to run "hadoop fs -put file viewfs://hdfs.namenode.apache.org/tmp/" to copy data from local disk to our HDFS cluster. The command will trigger ViewFileSystem to call the setVerifyChecksum() method, which initializes the file system for every mount point.
+For a regex-based mount table entry, we can't know the corresponding path until parsing, so the regex-based mount table entry will be ignored in such cases. The file system (ChRootedFileSystem) will be created upon access, but the underlying file system will be cached by the inner cache of ViewFileSystem.

Review comment: Good idea. I guess this patch didn't add the parsed fs to the mount point yet. Maybe it's better if we modify the code and doc at the same time. I created https://issues.apache.org/jira/browse/HADOOP-17247 to track the issue. Does it make sense? Thanks!
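[Editor's note] The "GCS bucket per user" motivation above can be sketched as a single rule instead of thousands of link entries. The property name and capture-group syntax below are hypothetical assumptions modeled on the existing `fs.viewfs.mounttable` convention, not the exact syntax introduced by this patch:

```xml
<!-- Hypothetical sketch: one regex rule maps every /user/<name> path to a
     per-user GCS bucket. Property name and ${username} interpolation are
     illustrative assumptions, not the patch's actual configuration keys. -->
<property>
  <name>fs.viewfs.mounttable.cluster.linkRegex.^/user/(?&lt;username&gt;\w+)</name>
  <value>gs://bucket-${username}/</value>
</property>
```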
[jira] [Work logged] (HADOOP-15891) Provide Regex Based Mount Point In Inode Tree
[ https://issues.apache.org/jira/browse/HADOOP-15891?focusedWorklogId=480610&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480610 ] ASF GitHub Bot logged work on HADOOP-15891: --- Author: ASF GitHub Bot Created on: 09/Sep/20 06:00 Start Date: 09/Sep/20 06:00 Worklog Time Spent: 10m Work Description: JohnZZGithub commented on a change in pull request #2185: URL: https://github.com/apache/hadoop/pull/2185#discussion_r485356795 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ViewFs.md ## @@ -366,6 +366,69 @@ Don't want to change scheme or difficult to copy mount-table configurations to a Please refer to the [View File System Overload Scheme Guide](./ViewFsOverloadScheme.html) +### Regex Pattern Based Mount Points + + +The view file system mount points have been a key-value based mapping system. That is not friendly for use cases where the mapping config could be abstracted into rules. E.g. users may want to provide a GCS bucket per user, and there might be thousands of users in total. The old key-value based approach won't work well for several reasons: + +1. The mount table is used by FileSystem clients. There's a cost to spreading the config to all clients, and we should avoid it if possible. The [View File System Overload Scheme Guide](./ViewFsOverloadScheme.html) can help the distribution through central mount table management, but the mount table still has to be updated on every change. Most of these updates could be avoided with a rule-based mount table. + +2. The client has to understand all the KVs in the mount table. This is not ideal when the mount table grows to thousands of items. E.g. thousands of file systems might be initialized even if a user only needs one, and the config itself becomes bloated at scale. + +### Understand the Difference + +In the key-value based mount table, the view file system treats every mount point as a partition. There are several file system APIs which lead to an operation on all partitions. E.g. 
suppose there is an HDFS cluster with multiple mount points. Users run the “hadoop fs -put file viewfs://hdfs.namenode.apache.org/tmp/” command to copy data from local disk to the HDFS cluster. The command triggers ViewFileSystem to call the setVerifyChecksum() method, which initializes the file system for every mount point. +For a regex-based mount table entry, we cannot know the corresponding target path until the source path is parsed, so regex-based mount table entries are ignored in such cases. The file system (ChRootedFileSystem) is created upon access, but the underlying file system is cached by the inner cache of ViewFileSystem. Review comment: Good idea. I guess this patch doesn't add the parsed fs to the mount point yet. Maybe it's better if we modify the code and the doc at the same time. Created https://issues.apache.org/jira/browse/HADOOP-17247 to track the issue. Does it make sense? Thanks This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 480610) Time Spent: 5h 50m (was: 5h 40m) > Provide Regex Based Mount Point In Inode Tree > - > > Key: HADOOP-15891 > URL: https://issues.apache.org/jira/browse/HADOOP-15891 > Project: Hadoop Common > Issue Type: New Feature > Components: viewfs >Reporter: zhenzhao wang >Assignee: zhenzhao wang >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15891.015.patch, HDFS-13948.001.patch, > HDFS-13948.002.patch, HDFS-13948.003.patch, HDFS-13948.004.patch, > HDFS-13948.005.patch, HDFS-13948.006.patch, HDFS-13948.007.patch, > HDFS-13948.008.patch, HDFS-13948.009.patch, HDFS-13948.011.patch, > HDFS-13948.012.patch, HDFS-13948.013.patch, HDFS-13948.014.patch, HDFS-13948_ > Regex Link Type In Mont Table-V0.pdf, HDFS-13948_ Regex Link Type In Mount > Table-v1.pdf > > Time Spent: 5h 50m > Remaining Estimate: 0h > > This jira is created to support regex based mount points in the Inode Tree. We > noticed that mount points only support fixed target paths. However, we might > have use cases where the target needs to refer to some fields from the source. e.g. we > might want a mapping of /cluster1/user1 => /cluster1-dc1/user-nn-user1, i.e. we > want to refer to the `cluster` and `user` fields in the source to construct the target. It's > impossible to achieve this with the current link types. Though we could set up a > one-to-one mapping, the mount table would become bloated
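The key-value vs. regex contrast described in the ViewFs.md excerpt above can be sketched with plain java.util.regex. This is an illustration only, not the actual ViewFs InodeTree/RegexMountPoint code; the method names and the per-user GCS bucket naming are hypothetical, following the bucket-per-user example in the doc.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: not the actual ViewFs code. Bucket names are hypothetical.
class MountResolveSketch {

    // Key-value style: every user needs an explicit mount entry.
    static final Map<String, String> KV_TABLE = new HashMap<>();
    static {
        KV_TABLE.put("/user/alice", "gs://bucket-alice");
        KV_TABLE.put("/user/bob", "gs://bucket-bob");
        // ...one entry per user; thousands of entries at scale
    }

    // Regex style: a single rule covers every user.
    static final Pattern RULE_SRC = Pattern.compile("^/user/(?<username>\\w+)");
    static final String RULE_DST = "gs://bucket-${username}";

    static String resolveKv(String path) {
        return KV_TABLE.get(path); // null unless explicitly listed
    }

    static String resolveRegex(String path) {
        Matcher m = RULE_SRC.matcher(path);
        if (!m.find()) {
            return null; // rule does not apply to this path
        }
        // Substitute the captured group into the target template.
        return RULE_DST.replace("${username}", m.group("username"));
    }
}
```

A user never listed in the table still resolves under the regex rule, which is why one rule can replace thousands of key-value entries.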
[jira] [Commented] (HADOOP-17247) Support Non-Path Based FileSystem API in Regex Mount Points
[ https://issues.apache.org/jira/browse/HADOOP-17247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192645#comment-17192645 ] zhenzhao wang commented on HADOOP-17247: [~ste...@apache.org] Thanks, the parent patch is not merged yet. So I left the affected version. As for target version, I chose the next unreleased version, let me know whether it makes sense to you. > Support Non-Path Based FileSystem API in Regex Mount Points > --- > > Key: HADOOP-17247 > URL: https://issues.apache.org/jira/browse/HADOOP-17247 > Project: Hadoop Common > Issue Type: Sub-task > Components: viewfs >Reporter: zhenzhao wang >Priority: Major > > Regex mount points create a ChRootedFileSystem on access, and we > cannot know the underlying filesystems in advance. This won't work with non-path > based APIs such as getAdditionalTokenIssuers. Instead of totally unsupporting them, > we should support such APIs to some extent. This could be done by recording each > FileSystem as it is created and performing the APIs over the FileSystem instances created for > the ViewFileSystem. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
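The "record the FileSystem created, then perform the API over the recorded instances" idea in the HADOOP-17247 description can be sketched generically. Everything below is hypothetical (the class and method names are mine, not Hadoop's), a minimal sketch of the proposed bookkeeping:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of the approach proposed in HADOOP-17247; not Hadoop code.
class CreatedFsRegistry<FS> {
    private final Map<String, FS> created = new ConcurrentHashMap<>();

    // Record each child filesystem at the moment the regex mount point creates it.
    void record(String uri, FS fs) {
        created.put(uri, fs);
    }

    // Apply a non-path-based operation (e.g. collecting token issuers) to every
    // filesystem created so far. Filesystems that were never accessed are simply
    // absent, which is the "support to some extent" caveat in the description.
    <R> List<R> applyToAll(Function<FS, R> op) {
        List<R> results = new ArrayList<>();
        for (FS fs : created.values()) {
            results.add(op.apply(fs));
        }
        return results;
    }
}
```

The design trade-off the Jira hints at: because regex mount points materialize filesystems lazily, any aggregate API can only cover the instances that already exist at call time.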
[jira] [Updated] (HADOOP-17247) Support Non-Path Based FileSystem API in Regex Mount Points
[ https://issues.apache.org/jira/browse/HADOOP-17247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhenzhao wang updated HADOOP-17247: --- Target Version/s: 3.3.1 > Support Non-Path Based FileSystem API in Regex Mount Points > --- > > Key: HADOOP-17247 > URL: https://issues.apache.org/jira/browse/HADOOP-17247 > Project: Hadoop Common > Issue Type: Sub-task > Components: viewfs >Reporter: zhenzhao wang >Priority: Major > > Regex mount points create a ChRootedFileSystem on access, and we > cannot know the underlying filesystems in advance. This won't work with non-path > based APIs such as getAdditionalTokenIssuers. Instead of totally unsupporting them, > we should support such APIs to some extent. This could be done by recording each > FileSystem as it is created and performing the APIs over the FileSystem instances created for > the ViewFileSystem. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-17247) Support Non-Path Based FileSystem API in Regex Mount Points
[ https://issues.apache.org/jira/browse/HADOOP-17247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhenzhao wang updated HADOOP-17247: --- Component/s: viewfs > Support Non-Path Based FileSystem API in Regex Mount Points > --- > > Key: HADOOP-17247 > URL: https://issues.apache.org/jira/browse/HADOOP-17247 > Project: Hadoop Common > Issue Type: Sub-task > Components: viewfs >Reporter: zhenzhao wang >Priority: Major > > Regex mount points create a ChRootedFileSystem on access, and we > cannot know the underlying filesystems in advance. This won't work with non-path > based APIs such as getAdditionalTokenIssuers. Instead of totally unsupporting them, > we should support such APIs to some extent. This could be done by recording each > FileSystem as it is created and performing the APIs over the FileSystem instances created for > the ViewFileSystem. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] LeonGao91 opened a new pull request #2288: HDFS-15548. Allow configuring DISK/ARCHIVE storage types on same device mount
LeonGao91 opened a new pull request #2288: URL: https://github.com/apache/hadoop/pull/2288 ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HADOOP-X. Fix a typo in YYY.) For more details, please see https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16918) Dependency update for Hadoop 2.10
[ https://issues.apache.org/jira/browse/HADOOP-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192638#comment-17192638 ] Masatake Iwasaki commented on HADOOP-16918: --- HADOOP-15816 upgraded zookeeper from 3.4.9 to 3.4.13. HADOOP-15974 is the follow-up to upgrade curator. I'm cherry-picking and testing them on my local. > Dependency update for Hadoop 2.10 > - > > Key: HADOOP-16918 > URL: https://issues.apache.org/jira/browse/HADOOP-16918 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Wei-Chiu Chuang >Priority: Major > Labels: release-blocker > Attachments: dependency-check-report.html, > dependency-check-report.html > > > A number of dependencies can be updated. > nimbus-jose-jwt > jetty > netty > zookeeper > hbase-common > jackson-databind > and many more. They should be updated in the 2.10.1 release. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work logged] (HADOOP-17230) JAVA_HOME with spaces not supported
[ https://issues.apache.org/jira/browse/HADOOP-17230?focusedWorklogId=480604&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480604 ] ASF GitHub Bot logged work on HADOOP-17230: --- Author: ASF GitHub Bot Created on: 09/Sep/20 05:35 Start Date: 09/Sep/20 05:35 Worklog Time Spent: 10m Work Description: goiri commented on pull request #2248: URL: https://github.com/apache/hadoop/pull/2248#issuecomment-689315668 This looks interesting, I've been having to fight the path limit for a while. It kind of works, but I get: ``` %%I was unexpected at this time. ``` For my cmd, I actually need: ``` for %I in ("%JAVA_HOME%") do set JAVA_HOME=%%~sI ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 480604) Time Spent: 40m (was: 0.5h) > JAVA_HOME with spaces not supported > --- > > Key: HADOOP-17230 > URL: https://issues.apache.org/jira/browse/HADOOP-17230 > Project: Hadoop Common > Issue Type: Bug > Components: common >Affects Versions: 3.3.0 > Environment: Windows 10 > Hadoop 3.1.0, 3.2.1, 3.3.0, etc >Reporter: Wayne Seguin >Priority: Minor > Labels: pull-request-available > Attachments: image-2020-08-26-12-24-04-118.png, > image-2020-08-26-12-49-40-335.png > > Time Spent: 40m > Remaining Estimate: 0h > > When running on Windows, if JAVA_HOME contains a space (which is frequently > since the default Java install path is "C:\Program Files\Java", running > Hadoop fails to run. > !image-2020-08-26-12-24-04-118.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] goiri commented on pull request #2248: HADOOP-17230. Fix JAVA_HOME with spaces
goiri commented on pull request #2248: URL: https://github.com/apache/hadoop/pull/2248#issuecomment-689315668 This looks interesting, I've been having to fight the path limit for a while. It kind of works, but I get: ``` %%I was unexpected at this time. ``` For my cmd, I actually need: ``` for %I in ("%JAVA_HOME%") do set JAVA_HOME=%%~sI ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16918) Dependency update for Hadoop 2.10
[ https://issues.apache.org/jira/browse/HADOOP-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192635#comment-17192635 ] Wei-Chiu Chuang commented on HADOOP-16918: -- [^dependency-check-report.html] this is a new report. Looks good to me for the most part. Some of the known ones wouldn't be resolved in 2.10 and so we will ignore them. Other than that we're almost ready. > Dependency update for Hadoop 2.10 > - > > Key: HADOOP-16918 > URL: https://issues.apache.org/jira/browse/HADOOP-16918 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Wei-Chiu Chuang >Priority: Major > Labels: release-blocker > Attachments: dependency-check-report.html, > dependency-check-report.html > > > A number of dependencies can be updated. > nimbus-jose-jwt > jetty > netty > zookeeper > hbase-common > jackson-databind > and many more. They should be updated in the 2.10.1 release. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work logged] (HADOOP-15891) Provide Regex Based Mount Point In Inode Tree
[ https://issues.apache.org/jira/browse/HADOOP-15891?focusedWorklogId=480603&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480603 ] ASF GitHub Bot logged work on HADOOP-15891: --- Author: ASF GitHub Bot Created on: 09/Sep/20 05:29 Start Date: 09/Sep/20 05:29 Worklog Time Spent: 10m Work Description: JohnZZGithub commented on a change in pull request #2185: URL: https://github.com/apache/hadoop/pull/2185#discussion_r485346954 ## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/Constants.java ## @@ -86,12 +86,21 @@ */ String CONFIG_VIEWFS_LINK_MERGE_SLASH = "linkMergeSlash"; + /** + * Config variable for specifying a regex link, which uses a regular expression + * as the source; the target can use groups captured in the source. + * E.g. (^/(?<firstDir>\\w+), /prefix-${firstDir}) => + * (/path1/file1 => /prefix-path1/file1) + */ + String CONFIG_VIEWFS_LINK_REGEX = "linkRegex"; + FsPermission PERMISSION_555 = new FsPermission((short) 0555); String CONFIG_VIEWFS_RENAME_STRATEGY = "fs.viewfs.rename.strategy"; /** * Enable ViewFileSystem to cache all children filesystems in inner cache. Review comment: Nice catch. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 480603) Time Spent: 5h 40m (was: 5.5h) > Provide Regex Based Mount Point In Inode Tree > - > > Key: HADOOP-15891 > URL: https://issues.apache.org/jira/browse/HADOOP-15891 > Project: Hadoop Common > Issue Type: New Feature > Components: viewfs >Reporter: zhenzhao wang >Assignee: zhenzhao wang >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15891.015.patch, HDFS-13948.001.patch, > HDFS-13948.002.patch, HDFS-13948.003.patch, HDFS-13948.004.patch, > HDFS-13948.005.patch, HDFS-13948.006.patch, HDFS-13948.007.patch, > HDFS-13948.008.patch, HDFS-13948.009.patch, HDFS-13948.011.patch, > HDFS-13948.012.patch, HDFS-13948.013.patch, HDFS-13948.014.patch, HDFS-13948_ > Regex Link Type In Mont Table-V0.pdf, HDFS-13948_ Regex Link Type In Mount > Table-v1.pdf > > Time Spent: 5h 40m > Remaining Estimate: 0h > > This jira is created to support regex based mount points in the Inode Tree. We > noticed that mount points only support fixed target paths. However, we might > have use cases where the target needs to refer to some fields from the source. e.g. we > might want a mapping of /cluster1/user1 => /cluster1-dc1/user-nn-user1, i.e. we > want to refer to the `cluster` and `user` fields in the source to construct the target. It's > impossible to achieve this with the current link types. Though we could set up a > one-to-one mapping, the mount table would become bloated if we had thousands > of users. Besides, a regex mapping gives us more flexibility. So we > are going to build a regex based mount point whose target can refer to groups > from the source regex mapping. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-16918) Dependency update for Hadoop 2.10
[ https://issues.apache.org/jira/browse/HADOOP-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HADOOP-16918: - Attachment: dependency-check-report.html > Dependency update for Hadoop 2.10 > - > > Key: HADOOP-16918 > URL: https://issues.apache.org/jira/browse/HADOOP-16918 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Wei-Chiu Chuang >Priority: Major > Labels: release-blocker > Attachments: dependency-check-report.html, > dependency-check-report.html > > > A number of dependencies can be updated. > nimbus-jose-jwt > jetty > netty > zookeeper > hbase-common > jackson-databind > and many more. They should be updated in the 2.10.1 release. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] JohnZZGithub commented on a change in pull request #2185: HADOOP-15891. provide Regex Based Mount Point In Inode Tree
JohnZZGithub commented on a change in pull request #2185: URL: https://github.com/apache/hadoop/pull/2185#discussion_r485346954 ## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/Constants.java ## @@ -86,12 +86,21 @@ */ String CONFIG_VIEWFS_LINK_MERGE_SLASH = "linkMergeSlash"; + /** + * Config variable for specifying a regex link, which uses a regular expression + * as the source; the target can use groups captured in the source. + * E.g. (^/(?<firstDir>\\w+), /prefix-${firstDir}) => + * (/path1/file1 => /prefix-path1/file1) + */ + String CONFIG_VIEWFS_LINK_REGEX = "linkRegex"; + FsPermission PERMISSION_555 = new FsPermission((short) 0555); String CONFIG_VIEWFS_RENAME_STRATEGY = "fs.viewfs.rename.strategy"; /** * Enable ViewFileSystem to cache all children filesystems in inner cache. Review comment: Nice catch. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
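The linkRegex javadoc example above, `^/(?<firstDir>\w+)` mapped to `/prefix-${firstDir}`, can be checked with plain java.util.regex. A minimal sketch, hardcoded to the single `${firstDir}` placeholder; this is not the real RegexMountPoint resolution logic:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of the linkRegex javadoc example; not the RegexMountPoint implementation.
class LinkRegexExample {
    static String resolve(String srcRegex, String dstTemplate, String path) {
        Matcher m = Pattern.compile(srcRegex).matcher(path);
        if (!m.find()) {
            return path; // entry does not apply to this path
        }
        // Substitute the captured group into the placeholder, then append
        // the remainder of the path after the matched prefix.
        return dstTemplate.replace("${firstDir}", m.group("firstDir"))
            + path.substring(m.end());
    }
}
```

Resolving `/path1/file1` against the rule reproduces the `/prefix-path1/file1` result given in the javadoc.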
[jira] [Work logged] (HADOOP-15891) Provide Regex Based Mount Point In Inode Tree
[ https://issues.apache.org/jira/browse/HADOOP-15891?focusedWorklogId=480602&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480602 ] ASF GitHub Bot logged work on HADOOP-15891: --- Author: ASF GitHub Bot Created on: 09/Sep/20 05:28 Start Date: 09/Sep/20 05:28 Worklog Time Spent: 10m Work Description: JohnZZGithub commented on a change in pull request #2185: URL: https://github.com/apache/hadoop/pull/2185#discussion_r485346691 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/viewfs/TestViewFileSystemLinkRegex.java ## @@ -0,0 +1,473 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.hadoop.fs.viewfs; + +import java.io.File; +import java.io.IOException; +import java.net.URI; +import java.net.URISyntaxException; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.FSDataOutputStream; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.FileSystemTestHelper; +import org.apache.hadoop.fs.FsConstants; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.hdfs.DFSConfigKeys; +import org.apache.hadoop.hdfs.MiniDFSCluster; +import org.apache.hadoop.hdfs.MiniDFSNNTopology; +import org.apache.hadoop.test.GenericTestUtils; +import org.junit.AfterClass; +import org.junit.Assert; +import org.junit.Before; +import org.junit.BeforeClass; +import org.junit.Test; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import static org.apache.hadoop.fs.viewfs.RegexMountPoint.INTERCEPTOR_INTERNAL_SEP; +import static org.junit.Assert.assertSame; + +/** + * Test linkRegex node type for view file system. 
+ */ +public class TestViewFileSystemLinkRegex extends ViewFileSystemBaseTest { + public static final Logger LOGGER = + LoggerFactory.getLogger(TestViewFileSystemLinkRegex.class); + + private static FileSystem fsDefault; + private static MiniDFSCluster cluster; + private static Configuration clusterConfig; + private static final int NAME_SPACES_COUNT = 3; + private static final int DATA_NODES_COUNT = 3; + private static final int FS_INDEX_DEFAULT = 0; + private static final FileSystem[] FS_HDFS = new FileSystem[NAME_SPACES_COUNT]; + private static final String CLUSTER_NAME = + "TestViewFileSystemLinkRegexCluster"; + private static final File TEST_DIR = GenericTestUtils + .getTestDir(TestViewFileSystemLinkRegex.class.getSimpleName()); + private static final String TEST_BASE_PATH = + "/tmp/TestViewFileSystemLinkRegex"; + + @Override + protected FileSystemTestHelper createFileSystemHelper() { +return new FileSystemTestHelper(TEST_BASE_PATH); + } + + @BeforeClass + public static void clusterSetupAtBeginning() throws IOException { +SupportsBlocks = true; +clusterConfig = ViewFileSystemTestSetup.createConfig(); +clusterConfig.setBoolean( +DFSConfigKeys.DFS_NAMENODE_DELEGATION_TOKEN_ALWAYS_USE_KEY, +true); +cluster = new MiniDFSCluster.Builder(clusterConfig).nnTopology( +MiniDFSNNTopology.simpleFederatedTopology(NAME_SPACES_COUNT)) +.numDataNodes(DATA_NODES_COUNT).build(); +cluster.waitClusterUp(); + +for (int i = 0; i < NAME_SPACES_COUNT; i++) { + FS_HDFS[i] = cluster.getFileSystem(i); +} +fsDefault = FS_HDFS[FS_INDEX_DEFAULT]; + } + + @AfterClass + public static void clusterShutdownAtEnd() throws Exception { +if (cluster != null) { + cluster.shutdown(); +} + } + + @Override + @Before + public void setUp() throws Exception { +fsTarget = fsDefault; +super.setUp(); + } + + /** + * Override this so that we don't set the targetTestRoot to any path under the + * root of the FS, and so that we don't try to delete the test dir, but rather + * only its contents. 
+ */ + @Override + void initializeTargetTestRoot() throws IOException { +targetTestRoot = fsDefault.makeQualified(new Path("/")); +for (FileStatus status : fsDefault.listStatus(targetTestRoot))
[jira] [Work logged] (HADOOP-15891) Provide Regex Based Mount Point In Inode Tree
[ https://issues.apache.org/jira/browse/HADOOP-15891?focusedWorklogId=480601&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480601 ] ASF GitHub Bot logged work on HADOOP-15891: --- Author: ASF GitHub Bot Created on: 09/Sep/20 05:27 Start Date: 09/Sep/20 05:27 Worklog Time Spent: 10m Work Description: JohnZZGithub commented on a change in pull request #2185: URL: https://github.com/apache/hadoop/pull/2185#discussion_r485346542 ## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/TestRegexMountPoint.java ## @@ -0,0 +1,160 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.fs.viewfs; + +import java.io.IOException; +import java.net.URI; +import java.util.Map; +import java.util.Set; + +import org.apache.hadoop.conf.Configuration; +import org.junit.After; +import org.junit.Assert; +import org.junit.Before; +import org.junit.Test; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Test Regex Mount Point. 
+ */ +public class TestRegexMountPoint { + private static final Logger LOGGER = + LoggerFactory.getLogger(TestRegexMountPoint.class.getName()); + + private InodeTree inodeTree; + private Configuration conf; + + class TestRegexMountPointFileSystem {
+public URI getUri() { + return uri; +} + +private URI uri; + +TestRegexMountPointFileSystem(URI uri) { + String uriStr = uri == null ? "null" : uri.toString(); + LOGGER.info("Create TestRegexMountPointFileSystem Via URI:" + uriStr); + this.uri = uri; +} + } + + @Before + public void setUp() throws Exception { +conf = new Configuration(); +ConfigUtil.addLink(conf, TestRegexMountPoint.class.getName(), "/mnt", +URI.create("file:///")); + +inodeTree = new InodeTree(conf, +TestRegexMountPoint.class.getName(), null, false) { + @Override + protected TestRegexMountPointFileSystem getTargetFileSystem( + final URI uri) { +return new TestRegexMountPointFileSystem(uri); + } + + @Override + protected TestRegexMountPointFileSystem getTargetFileSystem( + final INodeDir dir) { +return new TestRegexMountPointFileSystem(null); + } + + @Override + protected TestRegexMountPointFileSystem getTargetFileSystem( + final String settings, final URI[] mergeFsURIList) { +return new TestRegexMountPointFileSystem(null); + } +}; + } + + @After + public void tearDown() throws Exception { +inodeTree = null; + } + + @Test + public void testGetVarListInString() throws IOException { +String srcRegex = "/(\\w+)"; +String target = "/$0/${1}/$1/${2}/${2}"; +RegexMountPoint regexMountPoint = +new RegexMountPoint(inodeTree, srcRegex, target, null); +regexMountPoint.initialize(); +Map<String, Set<String>> varMap = regexMountPoint.getVarInDestPathMap(); +Assert.assertEquals(varMap.size(), 3); +Assert.assertEquals(varMap.get("0").size(), 1); +Assert.assertTrue(varMap.get("0").contains("$0")); +Assert.assertEquals(varMap.get("1").size(), 2); +Assert.assertTrue(varMap.get("1").contains("${1}")); +Assert.assertTrue(varMap.get("1").contains("$1")); 
+Assert.assertEquals(varMap.get("2").size(), 1); +Assert.assertTrue(varMap.get("2").contains("${2}")); + } + + @Test + public void testResolve() throws IOException { +String regexStr = "^/user/(?<username>\\w+)"; +String dstPathStr = "/namenode1/testResolve/$username"; +String settingsStr = null; +RegexMountPoint regexMountPoint = +new RegexMountPoint(inodeTree, regexStr, dstPathStr, settingsStr); +regexMountPoint.initialize(); +InodeTree.ResolveResult resolveResult = +regexMountPoint.resolve("/user/hadoop/file1", true); +Assert.assertEquals(resolveResult.kind, InodeTree.ResultKind.EXTERNAL_DIR); +Assert.assertTrue( +resolveResult.targetFileSystem +instanceof TestRegexMountPointFileSystem); +Assert.assertTrue(resolveResult.resolvedPath.equals("/user/hadoop")); +Assert.assertTrue( +
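The `$0`/`${1}` expectations in testGetVarListInString above imply a parser that collects both variable spellings from the destination template. Below is a rough reimplementation of what getVarInDestPathMap appears to compute; this is my own sketch for illustration, not the Hadoop RegexMountPoint code:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Rough sketch of what getVarInDestPathMap appears to compute; not the Hadoop code.
class DestPathVarParser {
    // Matches the ${name} form first, then the bare $name form.
    private static final Pattern VAR = Pattern.compile("\\$\\{(\\w+)\\}|\\$(\\w+)");

    // Maps each variable name to the distinct literal spellings used for it,
    // e.g. "1" -> {"$1", "${1}"} for the template "/$0/${1}/$1/${2}/${2}".
    static Map<String, Set<String>> varsIn(String dst) {
        Map<String, Set<String>> vars = new HashMap<>();
        Matcher m = VAR.matcher(dst);
        while (m.find()) {
            String name = m.group(1) != null ? m.group(1) : m.group(2);
            vars.computeIfAbsent(name, k -> new HashSet<>()).add(m.group());
        }
        return vars;
    }
}
```

Keeping the literal spellings per variable lets the resolver later substitute each occurrence, whether braced or bare, with the matching capture group.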
[GitHub] [hadoop] JohnZZGithub commented on a change in pull request #2185: HADOOP-15891. provide Regex Based Mount Point In Inode Tree
JohnZZGithub commented on a change in pull request #2185: URL: https://github.com/apache/hadoop/pull/2185#discussion_r485346691 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/viewfs/TestViewFileSystemLinkRegex.java ## @@ -0,0 +1,473 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.hadoop.fs.viewfs; + +import java.io.File; +import java.io.IOException; +import java.net.URI; +import java.net.URISyntaxException; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.FSDataOutputStream; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.FileSystemTestHelper; +import org.apache.hadoop.fs.FsConstants; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.hdfs.DFSConfigKeys; +import org.apache.hadoop.hdfs.MiniDFSCluster; +import org.apache.hadoop.hdfs.MiniDFSNNTopology; +import org.apache.hadoop.test.GenericTestUtils; +import org.junit.AfterClass; +import org.junit.Assert; +import org.junit.Before; +import org.junit.BeforeClass; +import org.junit.Test; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import static org.apache.hadoop.fs.viewfs.RegexMountPoint.INTERCEPTOR_INTERNAL_SEP; +import static org.junit.Assert.assertSame; + +/** + * Test linkRegex node type for view file system. 
+ */ +public class TestViewFileSystemLinkRegex extends ViewFileSystemBaseTest { + public static final Logger LOGGER = + LoggerFactory.getLogger(TestViewFileSystemLinkRegex.class); + + private static FileSystem fsDefault; + private static MiniDFSCluster cluster; + private static Configuration clusterConfig; + private static final int NAME_SPACES_COUNT = 3; + private static final int DATA_NODES_COUNT = 3; + private static final int FS_INDEX_DEFAULT = 0; + private static final FileSystem[] FS_HDFS = new FileSystem[NAME_SPACES_COUNT]; + private static final String CLUSTER_NAME = + "TestViewFileSystemLinkRegexCluster"; + private static final File TEST_DIR = GenericTestUtils + .getTestDir(TestViewFileSystemLinkRegex.class.getSimpleName()); + private static final String TEST_BASE_PATH = + "/tmp/TestViewFileSystemLinkRegex"; + + @Override + protected FileSystemTestHelper createFileSystemHelper() { +return new FileSystemTestHelper(TEST_BASE_PATH); + } + + @BeforeClass + public static void clusterSetupAtBeginning() throws IOException { +SupportsBlocks = true; +clusterConfig = ViewFileSystemTestSetup.createConfig(); +clusterConfig.setBoolean( +DFSConfigKeys.DFS_NAMENODE_DELEGATION_TOKEN_ALWAYS_USE_KEY, +true); +cluster = new MiniDFSCluster.Builder(clusterConfig).nnTopology( +MiniDFSNNTopology.simpleFederatedTopology(NAME_SPACES_COUNT)) +.numDataNodes(DATA_NODES_COUNT).build(); +cluster.waitClusterUp(); + +for (int i = 0; i < NAME_SPACES_COUNT; i++) { + FS_HDFS[i] = cluster.getFileSystem(i); +} +fsDefault = FS_HDFS[FS_INDEX_DEFAULT]; + } + + @AfterClass + public static void clusterShutdownAtEnd() throws Exception { +if (cluster != null) { + cluster.shutdown(); +} + } + + @Override + @Before + public void setUp() throws Exception { +fsTarget = fsDefault; +super.setUp(); + } + + /** + * Override this so that we don't set the targetTestRoot to any path under the + * root of the FS, and so that we don't try to delete the test dir, but rather + * only its contents. 
+ */ + @Override + void initializeTargetTestRoot() throws IOException { +targetTestRoot = fsDefault.makeQualified(new Path("/")); +for (FileStatus status : fsDefault.listStatus(targetTestRoot)) { + fsDefault.delete(status.getPath(), true); +} + } + + @Override + void setupMountPoints() { +super.setupMountPoints(); + } + + @Override + int getExpectedDelegationTokenCount() { +return 1; // all point to the same fs so 1 unique token + } + + @Override + int getExpectedDelegationTokenCountWithCredentials() { +return 1; + } + + public String buildReplaceInterceptorSettingString(String srcRegex, +
[jira] [Work logged] (HADOOP-15891) Provide Regex Based Mount Point In Inode Tree
[ https://issues.apache.org/jira/browse/HADOOP-15891?focusedWorklogId=480600&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480600 ] ASF GitHub Bot logged work on HADOOP-15891: --- Author: ASF GitHub Bot Created on: 09/Sep/20 05:27 Start Date: 09/Sep/20 05:27 Worklog Time Spent: 10m Work Description: JohnZZGithub commented on a change in pull request #2185: URL: https://github.com/apache/hadoop/pull/2185#discussion_r485346458 ## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/ViewFileSystem.java ## @@ -217,6 +239,7 @@ public Path getMountedOnPath() { Path homeDir = null; private boolean enableInnerCache = false; private InnerCache cache; + private boolean evictCacheOnClose = false; Review comment: @umamaheswararao Thanks for the kindness. I didn't see a problem with making it true. Just mean to be more cautious, let me remove it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 480600) Time Spent: 5h 10m (was: 5h) > Provide Regex Based Mount Point In Inode Tree > - > > Key: HADOOP-15891 > URL: https://issues.apache.org/jira/browse/HADOOP-15891 > Project: Hadoop Common > Issue Type: New Feature > Components: viewfs >Reporter: zhenzhao wang >Assignee: zhenzhao wang >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15891.015.patch, HDFS-13948.001.patch, > HDFS-13948.002.patch, HDFS-13948.003.patch, HDFS-13948.004.patch, > HDFS-13948.005.patch, HDFS-13948.006.patch, HDFS-13948.007.patch, > HDFS-13948.008.patch, HDFS-13948.009.patch, HDFS-13948.011.patch, > HDFS-13948.012.patch, HDFS-13948.013.patch, HDFS-13948.014.patch, HDFS-13948_ > Regex Link Type In Mont Table-V0.pdf, HDFS-13948_ Regex Link Type In Mount > Table-v1.pdf > > Time Spent: 5h 10m > Remaining Estimate: 0h > > This jira is created to support regex based mount points in the Inode Tree. We noticed that mount points only support fixed target paths. However, we might have use cases where the target needs to refer to fields from the source. E.g. we might want a mapping of /cluster1/user1 => /cluster1-dc1/user-nn-user1, referring to the `cluster` and `user` fields in the source to construct the target. It's impossible to achieve this with the current link types. Though we could set up a one-to-one mapping, the mount table would become bloated if we have thousands of users. Besides, a regex mapping would give us more flexibility. So we are going to build a regex based mount point whose target can refer to groups from the source regex mapping. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
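The mapping the issue describes — /cluster1/user1 => /cluster1-dc1/user-nn-user1, with the `cluster` and `user` fields captured from the source path — can be sketched with plain java.util.regex. This is an illustration of the idea only, not the actual RegexMountPoint implementation; the class name and group names below are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMountSketch {
    public static void main(String[] args) {
        // Source regex captures the cluster and user fields of the mount path.
        Pattern src = Pattern.compile("^/(?<cluster>\\w+)/(?<user>\\w+)");
        Matcher m = src.matcher("/cluster1/user1");
        if (m.find()) {
            // The target template refers back to the captured groups.
            String target = "/" + m.group("cluster") + "-dc1/user-nn-" + m.group("user");
            System.out.println(target); // prints /cluster1-dc1/user-nn-user1
        }
    }
}
```

A regex link type then needs only the source pattern and the target template in the mount table, instead of one key-value entry per user.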
[GitHub] [hadoop] JohnZZGithub commented on a change in pull request #2185: HADOOP-15891. provide Regex Based Mount Point In Inode Tree
JohnZZGithub commented on a change in pull request #2185: URL: https://github.com/apache/hadoop/pull/2185#discussion_r485346542 ## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/TestRegexMountPoint.java ## @@ -0,0 +1,160 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.fs.viewfs; + +import java.io.IOException; +import java.net.URI; +import java.util.Map; +import java.util.Set; + +import org.apache.hadoop.conf.Configuration; +import org.junit.After; +import org.junit.Assert; +import org.junit.Before; +import org.junit.Test; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Test Regex Mount Point. + */ +public class TestRegexMountPoint { + private static final Logger LOGGER = + LoggerFactory.getLogger(TestRegexMountPoint.class.getName()); + + private InodeTree inodeTree; + private Configuration conf; + + class TestRegexMountPointFileSystem { +public URI getUri() { + return uri; +} + +private URI uri; + +TestRegexMountPointFileSystem(URI uri) { + String uriStr = uri == null ? 
"null" : uri.toString(); + LOGGER.info("Create TestRegexMountPointFileSystem Via URI:" + uriStr); + this.uri = uri; +} + } + + @Before + public void setUp() throws Exception { +conf = new Configuration(); +ConfigUtil.addLink(conf, TestRegexMountPoint.class.getName(), "/mnt", +URI.create("file:///")); + +inodeTree = new InodeTree(conf, +TestRegexMountPoint.class.getName(), null, false) { + @Override + protected TestRegexMountPointFileSystem getTargetFileSystem( + final URI uri) { +return new TestRegexMountPointFileSystem(uri); + } + + @Override + protected TestRegexMountPointFileSystem getTargetFileSystem( + final INodeDir dir) { +return new TestRegexMountPointFileSystem(null); + } + + @Override + protected TestRegexMountPointFileSystem getTargetFileSystem( + final String settings, final URI[] mergeFsURIList) { +return new TestRegexMountPointFileSystem(null); + } +}; + } + + @After + public void tearDown() throws Exception { +inodeTree = null; + } + + @Test + public void testGetVarListInString() throws IOException { +String srcRegex = "/(\\w+)"; +String target = "/$0/${1}/$1/${2}/${2}"; +RegexMountPoint regexMountPoint = +new RegexMountPoint(inodeTree, srcRegex, target, null); +regexMountPoint.initialize(); +Map> varMap = regexMountPoint.getVarInDestPathMap(); +Assert.assertEquals(varMap.size(), 3); +Assert.assertEquals(varMap.get("0").size(), 1); +Assert.assertTrue(varMap.get("0").contains("$0")); +Assert.assertEquals(varMap.get("1").size(), 2); +Assert.assertTrue(varMap.get("1").contains("${1}")); +Assert.assertTrue(varMap.get("1").contains("$1")); +Assert.assertEquals(varMap.get("2").size(), 1); +Assert.assertTrue(varMap.get("2").contains("${2}")); + } + + @Test + public void testResolve() throws IOException { +String regexStr = "^/user/(?\\w+)"; +String dstPathStr = "/namenode1/testResolve/$username"; +String settingsStr = null; +RegexMountPoint regexMountPoint = +new RegexMountPoint(inodeTree, regexStr, dstPathStr, settingsStr); +regexMountPoint.initialize(); 
+InodeTree.ResolveResult resolveResult = +regexMountPoint.resolve("/user/hadoop/file1", true); +Assert.assertEquals(resolveResult.kind, InodeTree.ResultKind.EXTERNAL_DIR); +Assert.assertTrue( +resolveResult.targetFileSystem +instanceof TestRegexMountPointFileSystem); +Assert.assertTrue(resolveResult.resolvedPath.equals("/user/hadoop")); +Assert.assertTrue( +resolveResult.targetFileSystem +instanceof TestRegexMountPointFileSystem); +Assert.assertTrue( +((TestRegexMountPointFileSystem) resolveResult.targetFileSystem) +.getUri().toString().equals("/namenode1/testResolve/hadoop")); +Assert.assertTrue(resolveResult.remainingPath.toString().equals("/file1")); + } + + @Test + public void testResolveWithInterceptor() throws IOException { +S
[jira] [Work logged] (HADOOP-17165) Implement service-user feature in DecayRPCScheduler
[ https://issues.apache.org/jira/browse/HADOOP-17165?focusedWorklogId=480596&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480596 ] ASF GitHub Bot logged work on HADOOP-17165: --- Author: ASF GitHub Bot Created on: 09/Sep/20 05:23 Start Date: 09/Sep/20 05:23 Worklog Time Spent: 10m Work Description: tasanuma commented on a change in pull request #2240: URL: https://github.com/apache/hadoop/pull/2240#discussion_r485345171 ## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/DecayRpcScheduler.java ## @@ -178,6 +188,7 @@ private static final double PRECISION = 0.0001; private MetricsProxy metricsProxy; private final CostProvider costProvider; + private Set<String> serviceUsernames; Review comment: Thanks for your review. Updated PR following `UserName`. Issue Time Tracking --- Worklog Id: (was: 480596) Time Spent: 1h 40m (was: 1.5h) > Implement service-user feature in DecayRPCScheduler > --- > > Key: HADOOP-17165 > URL: https://issues.apache.org/jira/browse/HADOOP-17165 > Project: Hadoop Common > Issue Type: New Feature >Reporter: Takanobu Asanuma >Assignee: Takanobu Asanuma >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-17165.001.patch, HADOOP-17165.002.patch, > after.png, before.png > > Time Spent: 1h 40m > Remaining Estimate: 0h > > In our cluster, we want to use FairCallQueue to limit heavy users, but do not want to restrict certain users who submit important requests. This jira proposes to implement a service-user feature so that such users are always scheduled into the high-priority queue. > According to HADOOP-9640, the initial concept of FCQ included this feature, but it was never implemented.
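For context, the feature under review is driven by per-port scheduler configuration in core-site.xml. A sketch of what enabling it might look like — the property name and port below follow the naming style of the other decay-scheduler keys and are assumptions for illustration, not confirmed keys:

```xml
<!-- Hypothetical example: users listed here are always scheduled into the
     highest-priority queue, bypassing the decay-based cost accounting.
     The key name and port (8020) are assumptions, not a confirmed API. -->
<property>
  <name>ipc.8020.decay-scheduler.service-users</name>
  <value>hbase,hive</value>
</property>
```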
[jira] [Work logged] (HADOOP-15891) Provide Regex Based Mount Point In Inode Tree
[ https://issues.apache.org/jira/browse/HADOOP-15891?focusedWorklogId=480590&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480590 ] ASF GitHub Bot logged work on HADOOP-15891: --- Author: ASF GitHub Bot Created on: 09/Sep/20 05:02 Start Date: 09/Sep/20 05:02 Worklog Time Spent: 10m Work Description: umamaheswararao commented on a change in pull request #2185: URL: https://github.com/apache/hadoop/pull/2185#discussion_r485338866 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ViewFs.md ## @@ -366,6 +366,69 @@ Don't want to change scheme or difficult to copy mount-table configurations to a Please refer to the [View File System Overload Scheme Guide](./ViewFsOverloadScheme.html) +Regex Pattern Based Mount Points + + +The view file system mount points are a key-value based mapping system. That is not friendly for use cases where the mapping config could be abstracted into rules. E.g. users may want a GCS bucket per user, and there might be thousands of users in total. The old key-value based approach won't work well, for several reasons: + +1. The mount table is used by FileSystem clients. There's a cost to spreading the config to all clients, and we should avoid it if possible. The [View File System Overload Scheme Guide](./ViewFsOverloadScheme.html) can help the distribution through central mount table management, but the mount table still has to be updated on every change. That churn can be largely avoided with a rule-based mount table. + +2. The client has to understand all the KVs in the mount table. This is not ideal when the mount table grows to thousands of items. E.g. thousands of file systems might be initialized even if users only need one, and the config itself becomes bloated at scale. + +### Understand the Difference + +In the key-value based mount table, the view file system treats every mount point as a partition. Several file system APIs lead to an operation on all partitions. E.g. suppose there's an HDFS cluster with multiple mount points. Users want to run a “hadoop fs -put file viewfs://hdfs.namenode.apache.org/tmp/” command to copy data from local disk to the HDFS cluster. The command triggers ViewFileSystem to call the setVerifyChecksum() method, which initializes the file system for every mount point. +For a regex-based rule mount table entry, we can't know the corresponding path until parsing, so regex-based mount table entries are ignored in such cases. The file system (ChRootedFileSystem) will be created upon access, but the underlying file system will be cached by the inner cache of ViewFileSystem. Review comment: >For a regex-based rule mount table entry, we can't know the corresponding path until parsing. Whatever we know should be added to mountPoints? So that getMountPoints will return the known fs-es? It may be a good idea to add Javadoc at the API level. I am ok to have this in a followup JIRA to cover all of these aspects. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 480590) Time Spent: 5h (was: 4h 50m) > Provide Regex Based Mount Point In Inode Tree > - > > Key: HADOOP-15891 > URL: https://issues.apache.org/jira/browse/HADOOP-15891 > Project: Hadoop Common > Issue Type: New Feature > Components: viewfs >Reporter: zhenzhao wang >Assignee: zhenzhao wang >Priority: Major > Labels: pull-request-available > Attachments: HADOOP-15891.015.patch, HDFS-13948.001.patch, > HDFS-13948.002.patch, HDFS-13948.003.patch, HDFS-13948.004.patch, > HDFS-13948.005.patch, HDFS-13948.006.patch, HDFS-13948.007.patch, > HDFS-13948.008.patch, HDFS-13948.009.patch, HDFS-13948.011.patch, > HDFS-13948.012.patch, HDFS-13948.013.patch, HDFS-13948.014.patch, HDFS-13948_ > Regex Link Type In Mont Table-V0.pdf, HDFS-13948_ Regex Link Type In Mount > Table-v1.pdf > > Time Spent: 5h > Remaining Estimate: 0h > > This jira is created to support regex based mount point in Inode Tree. We > noticed that mount point only support fixed target path. However, we might > have user cases when target needs to refer some fields from source. e.g. We > might want a mapping of /cluster1/user1 => /cluster1-dc1/user-nn-user1, we > want to refer `cluster` and `user` field in source to construct target. It's > impossible to archive this with current link type.
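The resolve behavior this feature's tests exercise — a named capture group in the source regex filling a variable in the destination path, with the unmatched suffix kept as the remaining path — can be sketched with plain java.util.regex. This is an illustration only, not the RegexMountPoint code itself; the class name and paths are taken from the test case quoted above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ResolveSketch {
    public static void main(String[] args) {
        // Source regex with a named capture group, as in the test case.
        Pattern src = Pattern.compile("^/user/(?<username>\\w+)");
        String template = "/namenode1/testResolve/${username}";
        String path = "/user/hadoop/file1";

        Matcher m = src.matcher(path);
        if (m.find()) {
            String resolvedPath = m.group();                // matched prefix: /user/hadoop
            String remainingPath = path.substring(m.end()); // leftover suffix: /file1
            String target = template.replace("${username}", m.group("username"));
            // prints: /user/hadoop -> /namenode1/testResolve/hadoop (remaining /file1)
            System.out.println(resolvedPath + " -> " + target
                + " (remaining " + remainingPath + ")");
        }
    }
}
```

This matches the assertions in testResolve above: the resolved path is the matched prefix, the target file system URI is the substituted template, and "/file1" is carried over to the target.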
[jira] [Work logged] (HADOOP-15891) Provide Regex Based Mount Point In Inode Tree
[ https://issues.apache.org/jira/browse/HADOOP-15891?focusedWorklogId=480589&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480589 ] ASF GitHub Bot logged work on HADOOP-15891: --- Author: ASF GitHub Bot Created on: 09/Sep/20 05:01 Start Date: 09/Sep/20 05:01 Worklog Time Spent: 10m Work Description: umamaheswararao commented on a change in pull request #2185: URL: https://github.com/apache/hadoop/pull/2185#discussion_r485332518 ## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/ViewFileSystem.java ## @@ -217,6 +239,7 @@ public Path getMountedOnPath() { Path homeDir = null; private boolean enableInnerCache = false; private InnerCache cache; + private boolean evictCacheOnClose = false; Review comment: Do you see any issue if we make it true? If no issues, we can simply clean it on close right instead of having another config? Seems like this is an improvement to existing code. If you want, I am ok to file small JIRA and fix this cleanup thing.( I am assuming it's not necessarily needed with this.) ## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/TestRegexMountPoint.java ## @@ -0,0 +1,160 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.fs.viewfs; + +import java.io.IOException; +import java.net.URI; +import java.util.Map; +import java.util.Set; + +import org.apache.hadoop.conf.Configuration; +import org.junit.After; +import org.junit.Assert; +import org.junit.Before; +import org.junit.Test; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Test Regex Mount Point. + */ +public class TestRegexMountPoint { + private static final Logger LOGGER = + LoggerFactory.getLogger(TestRegexMountPoint.class.getName()); + + private InodeTree inodeTree; + private Configuration conf; + + class TestRegexMountPointFileSystem { +public URI getUri() { + return uri; +} + +private URI uri; + +TestRegexMountPointFileSystem(URI uri) { + String uriStr = uri == null ? "null" : uri.toString(); + LOGGER.info("Create TestRegexMountPointFileSystem Via URI:" + uriStr); + this.uri = uri; +} + } + + @Before + public void setUp() throws Exception { +conf = new Configuration(); +ConfigUtil.addLink(conf, TestRegexMountPoint.class.getName(), "/mnt", +URI.create("file:///")); + +inodeTree = new InodeTree(conf, +TestRegexMountPoint.class.getName(), null, false) { + @Override + protected TestRegexMountPointFileSystem getTargetFileSystem( + final URI uri) { +return new TestRegexMountPointFileSystem(uri); + } + + @Override + protected TestRegexMountPointFileSystem getTargetFileSystem( + final INodeDir dir) { +return new TestRegexMountPointFileSystem(null); + } + + @Override + protected TestRegexMountPointFileSystem getTargetFileSystem( + final String settings, final URI[] mergeFsURIList) { +return new TestRegexMountPointFileSystem(null); + } +}; + } + + @After + public void tearDown() throws Exception { +inodeTree = null; + } + + @Test + public void testGetVarListInString() throws IOException { +String srcRegex = "/(\\w+)"; +String target = "/$0/${1}/$1/${2}/${2}"; +RegexMountPoint 
regexMountPoint = +new RegexMountPoint(inodeTree, srcRegex, target, null); +regexMountPoint.initialize(); +Map> varMap = regexMountPoint.getVarInDestPathMap(); +Assert.assertEquals(varMap.size(), 3); +Assert.assertEquals(varMap.get("0").size(), 1); +Assert.assertTrue(varMap.get("0").contains("$0")); +Assert.assertEquals(varMap.get("1").size(), 2); +Assert.assertTrue(varMap.get("1").contains("${1}")); +Assert.assertTrue(varMap.get("1").contains("$1")); +Assert.assertEquals(varMap.get("2").size(), 1); +Assert.assertTrue(varMap.get("2").contains("${2}")); + } + + @Test + public void testResolve() throws IOException { +String regexStr = "^/
[GitHub] [hadoop] hadoop-yetus commented on pull request #2281: HDFS-15516. Add info for create flags in NameNode audit logs.
hadoop-yetus commented on pull request #2281: URL: https://github.com/apache/hadoop/pull/2281#issuecomment-689294424

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|::|--:|:|:|
| +0 :ok: | reexec | 1m 9s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 31m 31s | trunk passed |
| +1 :green_heart: | compile | 1m 20s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 1m 11s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 0m 51s | trunk passed |
| +1 :green_heart: | mvnsite | 1m 18s | trunk passed |
| +1 :green_heart: | shadedclient | 17m 50s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 50s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 1m 21s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 3m 12s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 3m 9s | trunk passed |
||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 12s | the patch passed |
| +1 :green_heart: | compile | 1m 23s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javac | 1m 23s | the patch passed |
| +1 :green_heart: | compile | 1m 14s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | javac | 1m 14s | the patch passed |
| -0 :warning: | checkstyle | 0m 47s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 184 unchanged - 1 fixed = 187 total (was 185) |
| +1 :green_heart: | mvnsite | 1m 16s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | shadedclient | 15m 36s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 49s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 1m 18s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | findbugs | 3m 35s | the patch passed |
||| _ Other Tests _ |
| -1 :x: | unit | 116m 21s | hadoop-hdfs in the patch failed. |
| +1 :green_heart: | asflicense | 0m 40s | The patch does not generate ASF License warnings. |
| | | | 206m 5s | |

| Reason | Tests |
|---:|:--|
| Failed junit tests | hadoop.hdfs.TestFileChecksum |
| | hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes |
| | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
| | hadoop.hdfs.TestDFSOutputStream |
| | hadoop.hdfs.TestGetFileChecksum |
| | hadoop.hdfs.TestFileChecksumCompositeCrc |
| | hadoop.hdfs.server.namenode.TestFsck |
| | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2281/3/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2281 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 415ef7f732eb 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 0d855159f09 |
| Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| checkstyle | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2281/3/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2281/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2281/3/testReport/ |
| Max. process+thread count | 28
[jira] [Work logged] (HADOOP-17230) JAVA_HOME with spaces not supported
[ https://issues.apache.org/jira/browse/HADOOP-17230?focusedWorklogId=480581&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480581 ] ASF GitHub Bot logged work on HADOOP-17230: --- Author: ASF GitHub Bot Created on: 09/Sep/20 04:17 Start Date: 09/Sep/20 04:17 Worklog Time Spent: 10m Work Description: liuml07 commented on pull request #2248: URL: https://github.com/apache/hadoop/pull/2248#issuecomment-689290756 Not sure if @goiri or @abmodi is available to give a quick review and commit. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 480581) Time Spent: 0.5h (was: 20m) > JAVA_HOME with spaces not supported > --- > > Key: HADOOP-17230 > URL: https://issues.apache.org/jira/browse/HADOOP-17230 > Project: Hadoop Common > Issue Type: Bug > Components: common >Affects Versions: 3.3.0 > Environment: Windows 10 > Hadoop 3.1.0, 3.2.1, 3.3.0, etc >Reporter: Wayne Seguin >Priority: Minor > Labels: pull-request-available > Attachments: image-2020-08-26-12-24-04-118.png, > image-2020-08-26-12-49-40-335.png > > Time Spent: 0.5h > Remaining Estimate: 0h > > When running on Windows, if JAVA_HOME contains a space (which is frequent, > since the default Java install path is "C:\Program Files\Java"), Hadoop > fails to run. > !image-2020-08-26-12-24-04-118.png!
[GitHub] [hadoop] liuml07 commented on pull request #2248: HADOOP-17230. Fix JAVA_HOME with spaces
liuml07 commented on pull request #2248: URL: https://github.com/apache/hadoop/pull/2248#issuecomment-689290756 Not sure if @goiri or @abmodi is available to give a quick review and commit.
[jira] [Commented] (HADOOP-17252) Website to link to latest Hadoop wiki
[ https://issues.apache.org/jira/browse/HADOOP-17252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192606#comment-17192606 ] Mingliang Liu commented on HADOOP-17252: Thanks [~aajisaka]. If the current wiki page is https://cwiki.apache.org/confluence/display/HADOOP, then I think the current website is fine, because its existing link already redirects there automatically. I will close this JIRA shortly. > Website to link to latest Hadoop wiki > - > > Key: HADOOP-17252 > URL: https://issues.apache.org/jira/browse/HADOOP-17252 > Project: Hadoop Common > Issue Type: Bug > Components: site >Reporter: Mingliang Liu >Priority: Major > > Currently the website links to the [old wiki|https://wiki.apache.org/hadoop]. > Shall we update that to the latest one: > https://cwiki.apache.org/confluence/display/HADOOP2/Home > Or am I confused about which one is latest, https://wiki.apache.org/hadoop or > https://cwiki.apache.org/confluence/display/HADOOP2?
[GitHub] [hadoop] Hexiaoqiao commented on pull request #2265: HDFS-15551. Tiny Improve for DeadNode detector
Hexiaoqiao commented on pull request #2265: URL: https://github.com/apache/hadoop/pull/2265#issuecomment-689287274 Thanks @imbajin for involving me here, and thanks @leosunli for the reviews. LGTM, +1. Will merge in 3 business days if no other comments are left.
[jira] [Updated] (HADOOP-17230) JAVA_HOME with spaces not supported
[ https://issues.apache.org/jira/browse/HADOOP-17230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HADOOP-17230: Labels: pull-request-available (was: )
[jira] [Work logged] (HADOOP-17230) JAVA_HOME with spaces not supported
[ https://issues.apache.org/jira/browse/HADOOP-17230?focusedWorklogId=480571&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480571 ] ASF GitHub Bot logged work on HADOOP-17230: --- Author: ASF GitHub Bot Created on: 09/Sep/20 03:42 Start Date: 09/Sep/20 03:42 Worklog Time Spent: 10m Work Description: aajisaka edited a comment on pull request #2248: URL: https://github.com/apache/hadoop/pull/2248#issuecomment-689281191 > If it's a no-op instead of a potential bug, then the current code seems fine. Agreed. I haven't used Windows for more than 2 years, so I don't have a strong opinion. Sorry for the late response. Issue Time Tracking --- Worklog Id: (was: 480571) Time Spent: 20m (was: 10m)
[jira] [Work logged] (HADOOP-17230) JAVA_HOME with spaces not supported
[ https://issues.apache.org/jira/browse/HADOOP-17230?focusedWorklogId=480570&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480570 ] ASF GitHub Bot logged work on HADOOP-17230: --- Author: ASF GitHub Bot Created on: 09/Sep/20 03:42 Start Date: 09/Sep/20 03:42 Worklog Time Spent: 10m Work Description: aajisaka commented on pull request #2248: URL: https://github.com/apache/hadoop/pull/2248#issuecomment-689281191 bq. If it's a no-op instead of a potential bug, then the current code seems fine. Agreed. I haven't used Windows for more than 2 years, so I don't have a strong opinion. Sorry for the late response. Issue Time Tracking --- Worklog Id: (was: 480570) Remaining Estimate: 0h Time Spent: 10m
[GitHub] [hadoop] aajisaka edited a comment on pull request #2248: HADOOP-17230. Fix JAVA_HOME with spaces
aajisaka edited a comment on pull request #2248: URL: https://github.com/apache/hadoop/pull/2248#issuecomment-689281191 > If it's a no-op instead of a potential bug, then the current code seems fine. Agreed. I haven't used Windows for more than 2 years, so I don't have a strong opinion. Sorry for the late response.
[GitHub] [hadoop] aajisaka commented on pull request #2248: HADOOP-17230. Fix JAVA_HOME with spaces
aajisaka commented on pull request #2248: URL: https://github.com/apache/hadoop/pull/2248#issuecomment-689281191 bq. If it's a no-op instead of a potential bug, then the current code seems fine. Agreed. I haven't used Windows for more than 2 years, so I don't have a strong opinion. Sorry for the late response.
[jira] [Work logged] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode
[ https://issues.apache.org/jira/browse/HADOOP-17222?focusedWorklogId=480566&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480566 ] ASF GitHub Bot logged work on HADOOP-17222: --- Author: ASF GitHub Bot Created on: 09/Sep/20 03:11 Start Date: 09/Sep/20 03:11 Worklog Time Spent: 10m Work Description: 1996fanrui commented on pull request #2241: URL: https://github.com/apache/hadoop/pull/2241#issuecomment-689272323 > Was waiting for a clean QA run but could not trigger it manually. If I get no help from community, we can upload the patch file to the JIRA and from there I can trigger multiple runs to get a clean pre-commit build. > > I assume failing tests are not related to this patch but it's always nice to confirm. Hi @liuml07 , I ran all the failed unit tests on my Mac. Only one failed: TestHDFSContractMultipartUploader#testConcurrentUploads. The failure is not related to this patch; the community is fixing it: https://issues.apache.org/jira/browse/HDFS-15471 All other unit tests pass on my Mac. Issue Time Tracking --- Worklog Id: (was: 480566) Time Spent: 1.5h (was: 1h 20m) > Create socket address combined with cache to speed up hdfs client choose > DataNode > - > > Key: HADOOP-17222 > URL: https://issues.apache.org/jira/browse/HADOOP-17222 > Project: Hadoop Common > Issue Type: Improvement > Components: common, hdfs-client > Environment: HBase version: 2.1.0 > JVM: -Xmx2g -Xms2g > hadoop hdfs version: 2.7.4 > disk:SSD > OS:CentOS Linux release 7.4.1708 (Core) > JMH Benchmark: @Fork(value = 1) > @Warmup(iterations = 300) > @Measurement(iterations = 300) >Reporter: fanrui >Assignee: fanrui >Priority: Major > Labels: pull-request-available > Attachments: After Optimization remark.png, After optimization.svg, > Before Optimization remark.png, Before optimization.svg > > Time Spent: 1.5h > Remaining Estimate: 0h > > Note: not only the HDFS client gets this benefit; all callers of > NetUtils.createSocketAddr do. The HDFS client is just used as an example. > > The HDFS client selects the best DN for an HDFS block. Method call stack: > DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> > NetUtils.createSocketAddr > NetUtils.createSocketAddr creates the corresponding InetSocketAddress based > on the host and port. There are some heavier operations in the > NetUtils.createSocketAddr method, for example URI.create(target), so > NetUtils.createSocketAddr takes more time to execute. > The following is my performance report. The report is based on HBase calling > HDFS. HBase is a high-frequency client of HDFS, because HBase read > operations often access a small DataBlock (about 64k) instead of the entire > HFile. In the case of high-frequency access, the NetUtils.createSocketAddr > method is time-consuming. > h3. Test Environment: > > {code:java} > HBase version: 2.1.0 > JVM: -Xmx2g -Xms2g > hadoop hdfs version: 2.7.4 > disk:SSD > OS:CentOS Linux release 7.4.1708 (Core) > JMH Benchmark: @Fork(value = 1) > @Warmup(iterations = 300) > @Measurement(iterations = 300) > {code} > h4. Before Optimization FlameGraph: > In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts > for 4.86% of the entire CPU, and the creation of URIs accounts for a larger > proportion. > !Before Optimization remark.png! > h3. Optimization ideas: > NetUtils.createSocketAddr creates InetSocketAddress based on host and port. > Here we can add a cache for InetSocketAddress. The key of the cache is host and > port, and the value is InetSocketAddress. > h4. After Optimization FlameGraph: > In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts > for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the cache, > and the ConcurrentHashMap.get() method gets data from the cache. The CPU > usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% > to 0.54%. > !After Optimization remark.png! > h3. Original FlameGraph link: > [Before > Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing] > [After Optimization > FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
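The caching idea described in HADOOP-17222 can be sketched in a few lines of Java. This is a minimal, hypothetical `SocketAddrCache` (not the actual Hadoop patch): a `ConcurrentHashMap` keyed by `host:port` memoizes the `InetSocketAddress`, so repeated lookups skip the URI-parsing work done inside `NetUtils.createSocketAddr`. `createUnresolved` is used here only to keep the sketch free of DNS calls.

```java
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the HADOOP-17222 idea: cache InetSocketAddress
// objects keyed by "host:port" so hot paths such as
// DFSInputStream.getBestNodeDNAddrPair avoid re-creating them every call.
public class SocketAddrCache {
    private static final Map<String, InetSocketAddress> CACHE =
            new ConcurrentHashMap<>();

    public static InetSocketAddress get(String host, int port) {
        // computeIfAbsent pays the construction cost only on first access;
        // subsequent calls are a single concurrent map lookup.
        return CACHE.computeIfAbsent(host + ":" + port,
                k -> InetSocketAddress.createUnresolved(host, port));
    }
}
```

A real implementation has an extra wrinkle this sketch ignores: a cached entry also freezes any resolved IP, so the production code needs a way to invalidate or re-resolve entries when a DataNode's address changes.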
[GitHub] [hadoop] 1996fanrui commented on pull request #2241: HADOOP-17222. Create socket address combined with URI cache
1996fanrui commented on pull request #2241: URL: https://github.com/apache/hadoop/pull/2241#issuecomment-689272438 > @1996fanrui could you try the above command shared by @dineshchitlangia to trigger a new PreCommit QA run? Hopefully it will be clean next time. Thanks, done
[jira] [Work logged] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode
[ https://issues.apache.org/jira/browse/HADOOP-17222?focusedWorklogId=480567&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480567 ] ASF GitHub Bot logged work on HADOOP-17222: --- Author: ASF GitHub Bot Created on: 09/Sep/20 03:11 Start Date: 09/Sep/20 03:11 Worklog Time Spent: 10m Work Description: 1996fanrui commented on pull request #2241: URL: https://github.com/apache/hadoop/pull/2241#issuecomment-689272438 > @1996fanrui could you try the above command shared by @dineshchitlangia to trigger a new PreCommit QA run? Hopefully it will be clean next time. Thanks, done Issue Time Tracking --- Worklog Id: (was: 480567) Time Spent: 1h 40m (was: 1.5h)
[GitHub] [hadoop] 1996fanrui commented on pull request #2241: HADOOP-17222. Create socket address combined with URI cache
1996fanrui commented on pull request #2241: URL: https://github.com/apache/hadoop/pull/2241#issuecomment-689272323 > Was waiting for a clean QA run but could not trigger it manually. If I get no help from community, we can upload the patch file to the JIRA and from there I can trigger multiple runs to get a clean pre-commit build. > > I assume failing tests are not related to this patch but it's always nice to confirm. Hi @liuml07 , I ran all the failed unit tests on my Mac. Only one failed: TestHDFSContractMultipartUploader#testConcurrentUploads. The failure is not related to this patch; the community is fixing it: https://issues.apache.org/jira/browse/HDFS-15471 All other unit tests pass on my Mac.
[jira] [Commented] (HADOOP-17252) Website to link to latest Hadoop wiki
[ https://issues.apache.org/jira/browse/HADOOP-17252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192581#comment-17192581 ] Akira Ajisaka commented on HADOOP-17252: Thank you [~liuml07] for the report. The latest Hadoop wiki is https://cwiki.apache.org/confluence/display/HADOOP, so please update the link to that page. The old wiki has been migrated and archived to https://cwiki.apache.org/confluence/display/HADOOP2 (read-only). I've just updated https://cwiki.apache.org/confluence/display/HADOOP/Home to add a link to the archive pages.
[GitHub] [hadoop] imbajin opened a new pull request #2287: HDFS-15552. Let DeadNode Detector also work for EC cases
imbajin opened a new pull request #2287: URL: https://github.com/apache/hadoop/pull/2287 Use the detector to deal with dead nodes under EC, to avoid read failures. Refer: [JIRA URL](https://issues.apache.org/jira/browse/HDFS-15552) NOTE: This PR's `unit-test` is not ready for review; I'll finish it soon.
[GitHub] [hadoop] imbajin commented on pull request #2265: HDFS-15551. Tiny Improve for DeadNode detector
imbajin commented on pull request #2265: URL: https://github.com/apache/hadoop/pull/2265#issuecomment-689258241 @Hexiaoqiao Could you help me check if there are any other questions?
[jira] [Work logged] (HADOOP-15891) Provide Regex Based Mount Point In Inode Tree
[ https://issues.apache.org/jira/browse/HADOOP-15891?focusedWorklogId=480549&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480549 ] ASF GitHub Bot logged work on HADOOP-15891: --- Author: ASF GitHub Bot Created on: 09/Sep/20 01:22 Start Date: 09/Sep/20 01:22 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2185: URL: https://github.com/apache/hadoop/pull/2185#issuecomment-689240050

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|::|--:|:|:|
| +0 :ok: | reexec | 0m 28s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +0 :ok: | markdownlint | 0m 1s | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 21s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 26m 6s | trunk passed |
| +1 :green_heart: | compile | 19m 32s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 16m 59s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 2m 45s | trunk passed |
| +1 :green_heart: | mvnsite | 2m 59s | trunk passed |
| +1 :green_heart: | shadedclient | 20m 34s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 42s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 3m 12s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 3m 9s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 5m 24s | trunk passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 27s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 58s | the patch passed |
| +1 :green_heart: | compile | 18m 49s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javac | 18m 49s | the patch passed |
| +1 :green_heart: | compile | 16m 53s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | javac | 16m 53s | the patch passed |
| -0 :warning: | checkstyle | 2m 44s | root: The patch generated 1 new + 182 unchanged - 1 fixed = 183 total (was 183) |
| +1 :green_heart: | mvnsite | 2m 57s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | shadedclient | 14m 12s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 41s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 3m 13s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | findbugs | 5m 39s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 9m 29s | hadoop-common in the patch passed. |
| -1 :x: | unit | 94m 24s | hadoop-hdfs in the patch failed. |
| +1 :green_heart: | asflicense | 1m 5s | The patch does not generate ASF License warnings. |
| | | | 276m 23s | |

| Reason | Tests |
|---:|:--|
| Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
| | hadoop.hdfs.TestMultipleNNPortQOP |
| | hadoop.hdfs.TestFileChecksumCompositeCrc |
| | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
| | hadoop.hdfs.TestFileChecksum |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2185/17/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2185 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle markdownlint |
| uname | Linux 45b80267deee 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 0d855159f09 |
| Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| Multi-JDK versions | /usr/lib/jvm/ja
[GitHub] [hadoop] hadoop-yetus commented on pull request #2185: HADOOP-15891. provide Regex Based Mount Point In Inode Tree
hadoop-yetus commented on pull request #2185: URL: https://github.com/apache/hadoop/pull/2185#issuecomment-689240050

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| +0 :ok: | reexec | 0m 28s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +0 :ok: | markdownlint | 0m 1s | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 21s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 26m 6s | trunk passed |
| +1 :green_heart: | compile | 19m 32s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 16m 59s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 2m 45s | trunk passed |
| +1 :green_heart: | mvnsite | 2m 59s | trunk passed |
| +1 :green_heart: | shadedclient | 20m 34s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 42s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 3m 12s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 3m 9s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 5m 24s | trunk passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 27s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 58s | the patch passed |
| +1 :green_heart: | compile | 18m 49s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javac | 18m 49s | the patch passed |
| +1 :green_heart: | compile | 16m 53s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | javac | 16m 53s | the patch passed |
| -0 :warning: | checkstyle | 2m 44s | root: The patch generated 1 new + 182 unchanged - 1 fixed = 183 total (was 183) |
| +1 :green_heart: | mvnsite | 2m 57s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | shadedclient | 14m 12s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 41s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 3m 13s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | findbugs | 5m 39s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 9m 29s | hadoop-common in the patch passed. |
| -1 :x: | unit | 94m 24s | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 1m 5s | The patch does not generate ASF License warnings. |
| | | | 276m 23s | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
|  | hadoop.hdfs.TestMultipleNNPortQOP |
|  | hadoop.hdfs.TestFileChecksumCompositeCrc |
|  | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
|  | hadoop.hdfs.TestFileChecksum |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2185/17/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2185 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle markdownlint |
| uname | Linux 45b80267deee 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 0d855159f09 |
| Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| checkstyle | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2185/17/artifact/out/diff-checkstyle-root.txt |
| unit | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2185/17/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| T
[jira] [Updated] (HADOOP-17249) Upgrade jackson-databind to 2.9.10.6 on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HADOOP-17249:

Fix Version/s: 2.10.1
Hadoop Flags: Reviewed
Resolution: Fixed
Status: Resolved (was: Patch Available)

> Upgrade jackson-databind to 2.9.10.6 on branch-2.10
> Key: HADOOP-17249
> URL: https://issues.apache.org/jira/browse/HADOOP-17249
> Project: Hadoop Common
> Issue Type: Improvement
> Affects Versions: 2.10.0
> Reporter: Masatake Iwasaki
> Assignee: Masatake Iwasaki
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.10.1
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> This is filed to test backporting HADOOP-16905 to branch-2.10.
[jira] [Work logged] (HADOOP-17249) Upgrade jackson-databind to 2.9.10.6 on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17249?focusedWorklogId=480542&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480542 ] ASF GitHub Bot logged work on HADOOP-17249:

Author: ASF GitHub Bot
Created on: 09/Sep/20 00:55
Worklog Time Spent: 10m
Work Description: iwasakims merged pull request #2279: URL: https://github.com/apache/hadoop/pull/2279

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking: Worklog Id: (was: 480542); Time Spent: 1h (was: 50m)
[jira] [Work logged] (HADOOP-17249) Upgrade jackson-databind to 2.9.10.6 on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17249?focusedWorklogId=480543&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480543 ] ASF GitHub Bot logged work on HADOOP-17249:

Author: ASF GitHub Bot
Created on: 09/Sep/20 00:55
Worklog Time Spent: 10m
Work Description: iwasakims commented on pull request #2279: URL: https://github.com/apache/hadoop/pull/2279#issuecomment-689232475
"I merged this. Thanks, @jojochuang."

Issue Time Tracking: Worklog Id: (was: 480543); Time Spent: 1h 10m (was: 1h)
[GitHub] [hadoop] iwasakims commented on pull request #2279: HADOOP-17249. Upgrade jackson-databind to 2.9.10.6 on branch-2.10.
iwasakims commented on pull request #2279: URL: https://github.com/apache/hadoop/pull/2279#issuecomment-689232475

I merged this. Thanks, @jojochuang.
[GitHub] [hadoop] iwasakims merged pull request #2279: HADOOP-17249. Upgrade jackson-databind to 2.9.10.6 on branch-2.10.
iwasakims merged pull request #2279: URL: https://github.com/apache/hadoop/pull/2279
[jira] [Work logged] (HADOOP-17249) Upgrade jackson-databind to 2.9.10.6 on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17249?focusedWorklogId=480541&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480541 ] ASF GitHub Bot logged work on HADOOP-17249:

Author: ASF GitHub Bot
Created on: 09/Sep/20 00:54
Worklog Time Spent: 10m
Work Description: iwasakims edited a comment on pull request #2279: URL: https://github.com/apache/hadoop/pull/2279#issuecomment-689164946
"changed the target version of jackson-databind based on the comment in [HADOOP-17249](https://issues.apache.org/jira/browse/HADOOP-17249)."

Issue Time Tracking: Worklog Id: (was: 480541); Time Spent: 50m (was: 40m)
[GitHub] [hadoop] iwasakims edited a comment on pull request #2279: HADOOP-17249. Upgrade jackson-databind to 2.9.10.6 on branch-2.10.
iwasakims edited a comment on pull request #2279: URL: https://github.com/apache/hadoop/pull/2279#issuecomment-689164946

changed the target version of jackson-databind based on the comment in [HADOOP-17249](https://issues.apache.org/jira/browse/HADOOP-17249).
[jira] [Work logged] (HADOOP-17181) ITestS3AContractUnbuffer failure -stream.read didn't return all data
[ https://issues.apache.org/jira/browse/HADOOP-17181?focusedWorklogId=480499&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480499 ] ASF GitHub Bot logged work on HADOOP-17181:

Author: ASF GitHub Bot
Created on: 09/Sep/20 00:11
Worklog Time Spent: 10m
Work Description: liuml07 edited a comment on pull request #2286: URL: https://github.com/apache/hadoop/pull/2286#issuecomment-689202660

The PR title (and commit subject) can be a bit more verbose like:
```
Fix transient stream read failures in FileSystem contract tests
```

Issue Time Tracking: Worklog Id: (was: 480499); Time Spent: 50m (was: 40m)

> ITestS3AContractUnbuffer failure -stream.read didn't return all data
> Key: HADOOP-17181
> URL: https://issues.apache.org/jira/browse/HADOOP-17181
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.3.0
> Reporter: Steve Loughran
> Priority: Minor
> Labels: pull-request-available
>
> Seen 2x recently, failure in ITestS3AContractUnbuffer as not enough data came back in the read.
> The contract test assumes that stream.read() will return everything, but it could be some buffering problem. Proposed: switch to ReadFully to see if it is a quirk of the read/get or is something actually wrong with the production code.
[jira] [Work logged] (HADOOP-17181) ITestS3AContractUnbuffer failure -stream.read didn't return all data
[ https://issues.apache.org/jira/browse/HADOOP-17181?focusedWorklogId=480498&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480498 ] ASF GitHub Bot logged work on HADOOP-17181:

Author: ASF GitHub Bot
Created on: 09/Sep/20 00:11
Worklog Time Spent: 10m
Work Description: liuml07 commented on pull request #2286: URL: https://github.com/apache/hadoop/pull/2286#issuecomment-689202660

The PR title can be a bit more verbose like:
```
Fix transient stream read failures in FileSystem contract tests
```

Issue Time Tracking: Worklog Id: (was: 480498); Time Spent: 40m (was: 0.5h)
[GitHub] [hadoop] liuml07 edited a comment on pull request #2286: HADOOP-17181. transient stream read failures
liuml07 edited a comment on pull request #2286: URL: https://github.com/apache/hadoop/pull/2286#issuecomment-689202660

The PR title (and commit subject) can be a bit more verbose like:
```
Fix transient stream read failures in FileSystem contract tests
```
[GitHub] [hadoop] liuml07 commented on pull request #2286: HADOOP-17181. transient stream read failures
liuml07 commented on pull request #2286: URL: https://github.com/apache/hadoop/pull/2286#issuecomment-689202660

The PR title can be a bit more verbose like:
```
Fix transient stream read failures in FileSystem contract tests
```
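The fix under discussion hinges on the `InputStream` contract: a single `read(byte[], int, int)` call may legally return fewer bytes than requested, which is why the contract tests switch to a readFully-style loop. The sketch below is illustrative only; `ReadFullyDemo` and `ChunkedStream` are hypothetical names, not part of the Hadoop patch, and the `readFully` helper merely mimics the semantics of a fill-or-fail read loop.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/**
 * Shows why a contract test that checks a single read() for "all the data"
 * is fragile: short reads are legal. ChunkedStream is a hypothetical
 * stand-in for a stream (e.g. an S3 input stream) that hands back data in
 * small chunks.
 */
public class ReadFullyDemo {
  /** A stream that never returns more than 'chunk' bytes per read call. */
  static class ChunkedStream extends InputStream {
    private final InputStream in;
    private final int chunk;
    ChunkedStream(byte[] data, int chunk) {
      this.in = new ByteArrayInputStream(data);
      this.chunk = chunk;
    }
    @Override public int read() throws IOException { return in.read(); }
    @Override public int read(byte[] b, int off, int len) throws IOException {
      return in.read(b, off, Math.min(len, chunk));
    }
  }

  /** readFully-style loop: keeps reading until 'len' bytes arrive or EOF. */
  static void readFully(InputStream in, byte[] buf, int off, int len)
      throws IOException {
    int total = 0;
    while (total < len) {
      int n = in.read(buf, off + total, len - total);
      if (n < 0) {
        throw new EOFException("EOF after " + total + " of " + len + " bytes");
      }
      total += n;
    }
  }

  public static void main(String[] args) throws IOException {
    byte[] data = new byte[128];
    for (int i = 0; i < data.length; i++) data[i] = (byte) i;

    // A single read() may legally return fewer bytes than the buffer size.
    byte[] buf = new byte[128];
    int n = new ChunkedStream(data, 3).read(buf, 0, buf.length);
    System.out.println("single read returned " + n + " bytes");

    // The readFully loop always fills the buffer (or throws EOFException).
    readFully(new ChunkedStream(data, 3), buf, 0, buf.length);
    System.out.println("readFully filled all " + buf.length + " bytes");
  }
}
```

Against a stream like this, the single read returns at most the chunk size while the loop fills the whole buffer, which is exactly the distinction the proposed test change relies on.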
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=480490&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480490 ] ASF GitHub Bot logged work on HADOOP-17125:

Author: ASF GitHub Bot
Created on: 08/Sep/20 23:55
Worklog Time Spent: 10m
Work Description: viirya commented on a change in pull request #2201: URL: https://github.com/apache/hadoop/pull/2201#discussion_r485253169

File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java

```
@@ -276,13 +258,27 @@ public void end() {
     // do nothing
   }

-  private native static void initIDs();
+  private int decompressBytesDirect() throws IOException {
+    if (compressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `compressedDirectBuf` for reading
+      compressedDirectBuf.position(0).limit(compressedDirectBufLen);
+      // There is compressed input, decompress it now.
+      int size = Snappy.uncompressedLength((ByteBuffer) compressedDirectBuf);
+      if (size > uncompressedDirectBuf.capacity()) {
```

Review comment: Should we check with `uncompressedDirectBuf.remaining` instead of `capacity`?

File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java

```
-  private native static void initIDs();
+  private int decompressBytesDirect() throws IOException {
+    if (compressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `compressedDirectBuf` for reading
+      compressedDirectBuf.position(0).limit(compressedDirectBufLen);
```

Review comment: I'm not sure we need to set position and limit here? If `compressedDirectBuf` has already been set with position and limit before calling `decompressBytesDirect`, won't we read wrong data from this buffer?

Issue Time Tracking: Worklog Id: (was: 480490); Time Spent: 40m (was: 0.5h)

> Using snappy-java in SnappyCodec
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
> Issue Type: New Feature
> Components: common
> Affects Versions: 3.3.0
> Reporter: DB Tsai
> Priority: Major
> Labels: pull-request-available
>
> In Hadoop, we use native libs for the snappy codec, which has several disadvantages:
> * It requires native *libhadoop* and *libsnappy* to be installed in the system *LD_LIBRARY_PATH*, and they have to be installed separately on each node of the clusters, container images, or local test environments, which adds huge complexity from a deployment point of view. In some environments, it requires compiling the natives from sources, which is non-trivial. Also, this approach is platform dependent; the binary may not work on a different platform, so it requires recompilation.
> * It requires extra configuration of *java.library.path* to load the natives, and it results in higher application deployment and maintenance cost for users.
> Projects such as *Spark* and *Parquet* use [snappy-java|https://github.com/xerial/snappy-java], which is a JNI-based implementation. It contains native binaries for Linux, Mac, and IBM in the jar file, and it can automatically load the native binaries into the JVM from the jar without any setup. If a native implementation can not be found for a platform, it can fall back to a pure-java implementation of snappy based on [aircompressor|https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy].
[GitHub] [hadoop] viirya commented on a change in pull request #2201: HADOOP-17125. Using snappy-java in SnappyCodec
viirya commented on a change in pull request #2201: URL: https://github.com/apache/hadoop/pull/2201#discussion_r485253169

File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java

```
@@ -276,13 +258,27 @@ public void end() {
     // do nothing
   }

-  private native static void initIDs();
+  private int decompressBytesDirect() throws IOException {
+    if (compressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `compressedDirectBuf` for reading
+      compressedDirectBuf.position(0).limit(compressedDirectBufLen);
+      // There is compressed input, decompress it now.
+      int size = Snappy.uncompressedLength((ByteBuffer) compressedDirectBuf);
+      if (size > uncompressedDirectBuf.capacity()) {
```

Review comment: Should we check with `uncompressedDirectBuf.remaining` instead of `capacity`?

File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java

```
-  private native static void initIDs();
+  private int decompressBytesDirect() throws IOException {
+    if (compressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `compressedDirectBuf` for reading
+      compressedDirectBuf.position(0).limit(compressedDirectBufLen);
```

Review comment: I'm not sure we need to set position and limit here? If `compressedDirectBuf` has already been set with position and limit before calling `decompressBytesDirect`, won't we read wrong data from this buffer?
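The position/limit question in this review comes down to plain `java.nio.Buffer` semantics: after writing into a buffer, its position sits just past the last valid byte, so a consumer reading from the current position would see nothing unless the readable window is reset first. The standalone sketch below (hypothetical class name `BufferFlipDemo`, using only `java.nio` with no snappy-java dependency) illustrates the pattern the patched code applies before handing `compressedDirectBuf` to the decompressor.

```java
import java.nio.ByteBuffer;

/**
 * A pure java.nio sketch of the position/limit handling discussed in the
 * review. The variable names mirror those in SnappyDecompressor for
 * readability, but this class is illustrative only.
 */
public class BufferFlipDemo {
  public static void main(String[] args) {
    ByteBuffer compressedDirectBuf = ByteBuffer.allocateDirect(64);
    byte[] input = {1, 2, 3, 4, 5};
    compressedDirectBuf.put(input);
    // After the put(), position == 5: a reader starting here sees no data.
    int compressedDirectBufLen = compressedDirectBuf.position();

    // The pattern under review makes the readable window [0, len) explicit:
    compressedDirectBuf.position(0).limit(compressedDirectBufLen);

    // Note: remaining() (limit - position) is the readable window, while
    // capacity() is the whole backing buffer; that distinction is what the
    // "remaining instead of capacity" review comment is about.
    byte[] out = new byte[compressedDirectBuf.remaining()];
    compressedDirectBuf.get(out); // reads exactly the 5 valid bytes
    System.out.println("read " + out.length + " bytes");
  }
}
```

Whether the explicit `position(0).limit(len)` is redundant, as the second review comment asks, depends on whether callers already flip the buffer before invoking `decompressBytesDirect`; resetting it defensively is harmless as long as `compressedDirectBufLen` is accurate.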
[jira] [Work logged] (HADOOP-17249) Upgrade jackson-databind to 2.9.10.6 on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17249?focusedWorklogId=480483&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480483 ] ASF GitHub Bot logged work on HADOOP-17249:

Author: ASF GitHub Bot
Created on: 08/Sep/20 23:11
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on pull request #2279: URL: https://github.com/apache/hadoop/pull/2279#issuecomment-689183641

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| +0 :ok: | reexec | 1m 16s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
||| _ branch-2.10 Compile Tests _ |
| +0 :ok: | mvndep | 2m 22s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 15m 1s | branch-2.10 passed |
| +1 :green_heart: | compile | 13m 50s | branch-2.10 passed |
| +1 :green_heart: | mvnsite | 1m 10s | branch-2.10 passed |
| +1 :green_heart: | javadoc | 0m 54s | branch-2.10 passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 24s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 0m 46s | the patch passed |
| +1 :green_heart: | compile | 13m 1s | the patch passed |
| +1 :green_heart: | javac | 13m 1s | the patch passed |
| +1 :green_heart: | mvnsite | 1m 10s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 :green_heart: | javadoc | 0m 53s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 0m 23s | hadoop-project in the patch passed. |
| +1 :green_heart: | unit | 4m 34s | hadoop-yarn-server-applicationhistoryservice in the patch passed. |
| +1 :green_heart: | asflicense | 0m 48s | The patch does not generate ASF License warnings. |
| | | | 59m 50s | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2279/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2279 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml |
| uname | Linux 15f171a308e5 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-2.10 / 43c6c54 |
| Default Java | Oracle Corporation-1.7.0_95-b00 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2279/2/testReport/ |
| Max. process+thread count | 125 (vs. ulimit of 5500) |
| modules | C: hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice U: . |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2279/2/console |
| versions | git=2.7.4 maven=3.3.9 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |

This message was automatically generated.

Issue Time Tracking: Worklog Id: (was: 480483); Time Spent: 40m (was: 0.5h)
[GitHub] [hadoop] hadoop-yetus commented on pull request #2279: HADOOP-17249. Upgrade jackson-databind to 2.9.10.6 on branch-2.10.
hadoop-yetus commented on pull request #2279: URL: https://github.com/apache/hadoop/pull/2279#issuecomment-689183641

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| +0 :ok: | reexec | 1m 16s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
||| _ branch-2.10 Compile Tests _ |
| +0 :ok: | mvndep | 2m 22s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 15m 1s | branch-2.10 passed |
| +1 :green_heart: | compile | 13m 50s | branch-2.10 passed |
| +1 :green_heart: | mvnsite | 1m 10s | branch-2.10 passed |
| +1 :green_heart: | javadoc | 0m 54s | branch-2.10 passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 24s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 0m 46s | the patch passed |
| +1 :green_heart: | compile | 13m 1s | the patch passed |
| +1 :green_heart: | javac | 13m 1s | the patch passed |
| +1 :green_heart: | mvnsite | 1m 10s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 :green_heart: | javadoc | 0m 53s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 0m 23s | hadoop-project in the patch passed. |
| +1 :green_heart: | unit | 4m 34s | hadoop-yarn-server-applicationhistoryservice in the patch passed. |
| +1 :green_heart: | asflicense | 0m 48s | The patch does not generate ASF License warnings. |
| | | | 59m 50s | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2279/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2279 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml |
| uname | Linux 15f171a308e5 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-2.10 / 43c6c54 |
| Default Java | Oracle Corporation-1.7.0_95-b00 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2279/2/testReport/ |
| Max. process+thread count | 125 (vs. ulimit of 5500) |
| modules | C: hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice U: . |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2279/2/console |
| versions | git=2.7.4 maven=3.3.9 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |

This message was automatically generated.
[GitHub] [hadoop] hadoop-yetus commented on pull request #2286: HADOOP-17181. transient stream read failures
hadoop-yetus commented on pull request #2286: URL: https://github.com/apache/hadoop/pull/2286#issuecomment-689173534

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| +0 :ok: | reexec | 0m 35s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 30m 59s | trunk passed |
| +1 :green_heart: | compile | 19m 24s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 16m 46s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 0m 52s | trunk passed |
| +1 :green_heart: | mvnsite | 1m 29s | trunk passed |
| +1 :green_heart: | shadedclient | 16m 38s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 39s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 1m 33s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 2m 15s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 2m 13s | trunk passed |
||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 51s | the patch passed |
| +1 :green_heart: | compile | 18m 41s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javac | 18m 41s | the patch passed |
| +1 :green_heart: | compile | 16m 55s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | javac | 16m 55s | the patch passed |
| +1 :green_heart: | checkstyle | 0m 52s | the patch passed |
| +1 :green_heart: | mvnsite | 1m 26s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | shadedclient | 14m 9s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 38s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 1m 32s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | findbugs | 2m 20s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 9m 36s | hadoop-common in the patch passed. |
| +1 :green_heart: | asflicense | 0m 55s | The patch does not generate ASF License warnings. |
| | | | 161m 44s | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2286/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2286 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 41d49a58369e 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 0d855159f09 |
| Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2286/1/testReport/ |
| Max. process+thread count | 3326 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2286/1/console |
| versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
| Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.
[jira] [Work logged] (HADOOP-17181) ITestS3AContractUnbuffer failure -stream.read didn't return all data
[ https://issues.apache.org/jira/browse/HADOOP-17181?focusedWorklogId=480480&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480480 ] ASF GitHub Bot logged work on HADOOP-17181: --- Author: ASF GitHub Bot Created on: 08/Sep/20 22:40 Start Date: 08/Sep/20 22:40 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2286: URL: https://github.com/apache/hadoop/pull/2286#issuecomment-689173534 :confetti_ball: **+1 overall** This message was automatically generated.
[jira] [Updated] (HADOOP-17251) Upgrade netty-all to 4.1.50.Final on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HADOOP-17251: -- Fix Version/s: 2.10.1 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) I merged this. Thanks, [~Jim_Brennan]. > Upgrade netty-all to 4.1.50.Final on branch-2.10 > > > Key: HADOOP-17251 > URL: https://issues.apache.org/jira/browse/HADOOP-17251 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Fix For: 2.10.1 > > Time Spent: 50m > Remaining Estimate: 0h > > netty-all seems to be easily updated to fix HADOOP-16918. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
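For reference, a dependency bump like the netty-all upgrade above is typically a small change to the managed version in hadoop-project/pom.xml. The fragment below is an illustrative sketch only; the actual property name and pom layout in branch-2.10 may differ from what is shown here:

```xml
<!-- Sketch only: property name is hypothetical; the real patch may set the
     version directly in dependencyManagement of hadoop-project/pom.xml. -->
<properties>
  <netty.all.version>4.1.50.Final</netty.all.version>
</properties>
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>io.netty</groupId>
      <artifactId>netty-all</artifactId>
      <version>${netty.all.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```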
[GitHub] [hadoop] iwasakims commented on pull request #2285: HADOOP-17251. Upgrade netty-all to 4.1.50.Final on branch-2.10.
iwasakims commented on pull request #2285: URL: https://github.com/apache/hadoop/pull/2285#issuecomment-689167263 Thanks, @jojochuang. I merged this.
[jira] [Work logged] (HADOOP-17251) Upgrade netty-all to 4.1.50.Final on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17251?focusedWorklogId=480477&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480477 ] ASF GitHub Bot logged work on HADOOP-17251: --- Author: ASF GitHub Bot Created on: 08/Sep/20 22:22 Start Date: 08/Sep/20 22:22 Worklog Time Spent: 10m Work Description: iwasakims commented on pull request #2285: URL: https://github.com/apache/hadoop/pull/2285#issuecomment-689167263 Thanks, @jojochuang. I merged this. Issue Time Tracking --- Worklog Id: (was: 480477) Time Spent: 50m (was: 40m) > Upgrade netty-all to 4.1.50.Final on branch-2.10
[jira] [Work logged] (HADOOP-17251) Upgrade netty-all to 4.1.50.Final on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17251?focusedWorklogId=480476&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480476 ] ASF GitHub Bot logged work on HADOOP-17251: --- Author: ASF GitHub Bot Created on: 08/Sep/20 22:19 Start Date: 08/Sep/20 22:19 Worklog Time Spent: 10m Work Description: iwasakims merged pull request #2285: URL: https://github.com/apache/hadoop/pull/2285 Issue Time Tracking --- Worklog Id: (was: 480476) Time Spent: 40m (was: 0.5h) > Upgrade netty-all to 4.1.50.Final on branch-2.10
[GitHub] [hadoop] iwasakims merged pull request #2285: HADOOP-17251. Upgrade netty-all to 4.1.50.Final on branch-2.10.
iwasakims merged pull request #2285: URL: https://github.com/apache/hadoop/pull/2285
[jira] [Work logged] (HADOOP-17249) Upgrade jackson-databind to 2.9.10.6 on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17249?focusedWorklogId=480474&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480474 ] ASF GitHub Bot logged work on HADOOP-17249: --- Author: ASF GitHub Bot Created on: 08/Sep/20 22:15 Start Date: 08/Sep/20 22:15 Worklog Time Spent: 10m Work Description: iwasakims commented on pull request #2279: URL: https://github.com/apache/hadoop/pull/2279#issuecomment-689164946 changed the target version of jackson-databind to stay on the 2.9.10 line, based on the comment in [HADOOP-17249](https://issues.apache.org/jira/browse/HADOOP-17249). Issue Time Tracking --- Worklog Id: (was: 480474) Time Spent: 0.5h (was: 20m) > Upgrade jackson-databind to 2.9.10.6 on branch-2.10 > --- > > Key: HADOOP-17249 > URL: https://issues.apache.org/jira/browse/HADOOP-17249 > Project: Hadoop Common > Issue Type: Improvement >Affects Versions: 2.10.0 >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > This is filed to test backporting HADOOP-16905 to branch-2.10.
[GitHub] [hadoop] iwasakims commented on pull request #2279: HADOOP-17249. Upgrade jackson-databind to 2.9.10.6 on branch-2.10.
iwasakims commented on pull request #2279: URL: https://github.com/apache/hadoop/pull/2279#issuecomment-689164946 changed the target version of jackson-databind to stay on the 2.9.10 line, based on the comment in [HADOOP-17249](https://issues.apache.org/jira/browse/HADOOP-17249).
[jira] [Work logged] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode
[ https://issues.apache.org/jira/browse/HADOOP-17222?focusedWorklogId=480471&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480471 ] ASF GitHub Bot logged work on HADOOP-17222: --- Author: ASF GitHub Bot Created on: 08/Sep/20 22:14 Start Date: 08/Sep/20 22:14 Worklog Time Spent: 10m Work Description: liuml07 commented on pull request #2241: URL: https://github.com/apache/hadoop/pull/2241#issuecomment-689164399 @1996fanrui could you try the above command shared by @dineshchitlangia to trigger a new PreCommit QA run? Hopefully it will be clean next time. Thanks, This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 480471) Time Spent: 1h 20m (was: 1h 10m) > Create socket address combined with cache to speed up hdfs client choose > DataNode > - > > Key: HADOOP-17222 > URL: https://issues.apache.org/jira/browse/HADOOP-17222 > Project: Hadoop Common > Issue Type: Improvement > Components: common, hdfs-client > Environment: HBase version: 2.1.0 > JVM: -Xmx2g -Xms2g > hadoop hdfs version: 2.7.4 > disk:SSD > OS:CentOS Linux release 7.4.1708 (Core) > JMH Benchmark: @Fork(value = 1) > @Warmup(iterations = 300) > @Measurement(iterations = 300) >Reporter: fanrui >Assignee: fanrui >Priority: Major > Labels: pull-request-available > Attachments: After Optimization remark.png, After optimization.svg, > Before Optimization remark.png, Before optimization.svg > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Note:Not only the hdfs client can get the current benefit, all callers of > NetUtils.createSocketAddr will get the benefit. Just use hdfs client as an > example. > > Hdfs client selects best DN for hdfs Block. 
method call stack: > DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> > NetUtils.createSocketAddr > NetUtils.createSocketAddr creates the corresponding InetSocketAddress based > on the host and port. There are some heavier operations in the > NetUtils.createSocketAddr method, for example: URI.create(target), so > NetUtils.createSocketAddr takes more time to execute. > The following is my performance report. The report is based on HBase calling > hdfs. HBase is a high-frequency access client for hdfs, because HBase read > operations often access a small DataBlock (about 64k) instead of the entire > HFile. In the case of high frequency access, the NetUtils.createSocketAddr > method is time-consuming. > h3. Test Environment: > > {code:java} > HBase version: 2.1.0 > JVM: -Xmx2g -Xms2g > hadoop hdfs version: 2.7.4 > disk:SSD > OS:CentOS Linux release 7.4.1708 (Core) > JMH Benchmark: @Fork(value = 1) > @Warmup(iterations = 300) > @Measurement(iterations = 300) > {code} > h4. Before Optimization FlameGraph: > In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts > for 4.86% of the entire CPU, and the creation of URIs accounts for a larger > proportion. > !Before Optimization remark.png! > h3. Optimization ideas: > NetUtils.createSocketAddr creates InetSocketAddress based on host and port. > Here we can add Cache to InetSocketAddress. The key of Cache is host and > port, and the value is InetSocketAddress. > h4. After Optimization FlameGraph: > In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts > for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, > and the ConcurrentHashMap.get() method gets data from the Cache. The CPU > usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% > to 0.54%. > !After Optimization remark.png! > h3. 
Original FlameGraph link: > [Before Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing] > [After Optimization FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
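The caching idea described in this report can be sketched as follows. This is a minimal illustration of the technique only, not the actual HADOOP-17222 patch (which adds the cache inside NetUtils.createSocketAddr); the class name SocketAddressCache is hypothetical:

```java
import java.net.InetSocketAddress;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Minimal sketch of the optimization in HADOOP-17222: cache
 * InetSocketAddress objects keyed by "host:port" so hot callers such as
 * DFSInputStream.getBestNodeDNAddrPair avoid the relatively expensive
 * parsing work inside NetUtils.createSocketAddr on every call.
 */
public final class SocketAddressCache {
    private static final ConcurrentHashMap<String, InetSocketAddress> CACHE =
        new ConcurrentHashMap<>();

    private SocketAddressCache() {}

    public static InetSocketAddress get(String host, int port) {
        // computeIfAbsent creates the address only on a cache miss;
        // later lookups for the same host:port reuse the cached instance.
        return CACHE.computeIfAbsent(host + ":" + port,
            k -> new InetSocketAddress(host, port));
    }
}
```

Repeated lookups for the same DataNode then return the same cached object instead of re-parsing the address, which is where the reported CPU reduction (4.86% to 0.54% in getBestNodeDNAddrPair) comes from.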
[GitHub] [hadoop] liuml07 commented on pull request #2241: HADOOP-17222. Create socket address combined with URI cache
liuml07 commented on pull request #2241: URL: https://github.com/apache/hadoop/pull/2241#issuecomment-689164399 @1996fanrui could you try the above command shared by @dineshchitlangia to trigger a new PreCommit QA run? Hopefully it will be clean next time. Thanks.
[jira] [Commented] (HADOOP-17249) Upgrade jackson-databind to 2.10 on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192507#comment-17192507 ] Masatake Iwasaki commented on HADOOP-17249: --- Thanks for the comment, [~Jim_Brennan] and [~jeagles]. Let's stay on 2.9.10 here then. Since there will be no 2.11 release of Hadoop, we should take an incompatible change only if it is critical, and I don't think jackson 2.10 is worth it. I'm going to update the title of this JIRA and PR. > Upgrade jackson-databind to 2.10 on branch-2.10 > --- > > Key: HADOOP-17249 > URL: https://issues.apache.org/jira/browse/HADOOP-17249 > Project: Hadoop Common > Issue Type: Improvement >Affects Versions: 2.10.0 >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > This is filed to test backporting HADOOP-16905 to branch-2.10.
[jira] [Updated] (HADOOP-17249) Upgrade jackson-databind to 2.9.10.6 on branch-2.10
[ https://issues.apache.org/jira/browse/HADOOP-17249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HADOOP-17249: -- Summary: Upgrade jackson-databind to 2.9.10.6 on branch-2.10 (was: Upgrade jackson-databind to 2.10 on branch-2.10) > Upgrade jackson-databind to 2.9.10.6 on branch-2.10 > --- > > Key: HADOOP-17249 > URL: https://issues.apache.org/jira/browse/HADOOP-17249 > Project: Hadoop Common > Issue Type: Improvement >Affects Versions: 2.10.0 >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > This is filed to test backporting HADOOP-16905 to branch-2.10. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-17252) Website to link to latest Hadoop wiki
[ https://issues.apache.org/jira/browse/HADOOP-17252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HADOOP-17252: --- Description: Currently the website links to the [old wiki|https://wiki.apache.org/hadoop]. Shall we update that to the latest one: https://cwiki.apache.org/confluence/display/HADOOP2/Home Or am I confused which one is latest, https://wiki.apache.org/hadoop or https://cwiki.apache.org/confluence/display/HADOOP2? was:Currently the website links to the [old wiki|https://wiki.apache.org/hadoop]. Shall we update that to the latest one: https://cwiki.apache.org/confluence/display/HADOOP2/Home > Website to link to latest Hadoop wiki > - > > Key: HADOOP-17252 > URL: https://issues.apache.org/jira/browse/HADOOP-17252 > Project: Hadoop Common > Issue Type: Bug > Components: site >Reporter: Mingliang Liu >Priority: Major > > Currently the website links to the [old wiki|https://wiki.apache.org/hadoop]. > Shall we update that to the latest one: > https://cwiki.apache.org/confluence/display/HADOOP2/Home > Or am I confused which one is latest, https://wiki.apache.org/hadoop or > https://cwiki.apache.org/confluence/display/HADOOP2? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work logged] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode
[ https://issues.apache.org/jira/browse/HADOOP-17222?focusedWorklogId=480463&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480463 ] ASF GitHub Bot logged work on HADOOP-17222: --- Author: ASF GitHub Bot Created on: 08/Sep/20 21:58 Start Date: 08/Sep/20 21:58 Worklog Time Spent: 10m Work Description: dineshchitlangia commented on pull request #2241: URL: https://github.com/apache/hadoop/pull/2241#issuecomment-689158906 @liuml07 You could do an empty commit on the PR branch and push it. `git commit --allow-empty -m 'trigger new CI check' && git push` This should trigger CI. Issue Time Tracking --- Worklog Id: (was: 480463) Time Spent: 1h 10m (was: 1h) > Create socket address combined with cache to speed up hdfs client choose > DataNode
[GitHub] [hadoop] dineshchitlangia commented on pull request #2241: HADOOP-17222. Create socket address combined with URI cache
dineshchitlangia commented on pull request #2241: URL: https://github.com/apache/hadoop/pull/2241#issuecomment-689158906 @liuml07 You could do an empty commit on the PR branch and push it: `git commit --allow-empty -m 'trigger new CI check' && git push`. This should trigger CI.
[jira] [Commented] (HADOOP-17252) Website to link to latest Hadoop wiki
[ https://issues.apache.org/jira/browse/HADOOP-17252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192502#comment-17192502 ] Mingliang Liu commented on HADOOP-17252: CC [~aajisaka] > Website to link to latest Hadoop wiki > - > > Key: HADOOP-17252 > URL: https://issues.apache.org/jira/browse/HADOOP-17252 > Project: Hadoop Common > Issue Type: Bug > Components: site >Reporter: Mingliang Liu >Priority: Major > > Currently the website links to the [old wiki|https://wiki.apache.org/hadoop]. > Shall we update that to the latest one: > https://cwiki.apache.org/confluence/display/HADOOP2/Home -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-17252) Website to link to latest Hadoop wiki
Mingliang Liu created HADOOP-17252: -- Summary: Website to link to latest Hadoop wiki Key: HADOOP-17252 URL: https://issues.apache.org/jira/browse/HADOOP-17252 Project: Hadoop Common Issue Type: Bug Reporter: Mingliang Liu Currently the website links to the [old wiki|https://wiki.apache.org/hadoop]. Shall we update that to the latest one: https://cwiki.apache.org/confluence/display/HADOOP2/Home
[jira] [Updated] (HADOOP-17252) Website to link to latest Hadoop wiki
[ https://issues.apache.org/jira/browse/HADOOP-17252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HADOOP-17252: --- Component/s: site > Website to link to latest Hadoop wiki > - > > Key: HADOOP-17252 > URL: https://issues.apache.org/jira/browse/HADOOP-17252 > Project: Hadoop Common > Issue Type: Bug > Components: site >Reporter: Mingliang Liu >Priority: Major > > Currently the website links to the [old wiki|https://wiki.apache.org/hadoop]. > Shall we update that to the latest one: > https://cwiki.apache.org/confluence/display/HADOOP2/Home -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17252) Website to link to latest Hadoop wiki
[ https://issues.apache.org/jira/browse/HADOOP-17252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192500#comment-17192500 ] Mingliang Liu commented on HADOOP-17252: Code: https://github.com/apache/hadoop-site/blob/asf-site/layouts/partials/navbar.html#L31 > Website to link to latest Hadoop wiki > - > > Key: HADOOP-17252 > URL: https://issues.apache.org/jira/browse/HADOOP-17252 > Project: Hadoop Common > Issue Type: Bug > Components: site >Reporter: Mingliang Liu >Priority: Major > > Currently the website links to the [old wiki|https://wiki.apache.org/hadoop]. > Shall we update that to the latest one: > https://cwiki.apache.org/confluence/display/HADOOP2/Home -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work logged] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode
[ https://issues.apache.org/jira/browse/HADOOP-17222?focusedWorklogId=480458&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480458 ] ASF GitHub Bot logged work on HADOOP-17222: --- Author: ASF GitHub Bot Created on: 08/Sep/20 21:45 Start Date: 08/Sep/20 21:45 Worklog Time Spent: 10m Work Description: liuml07 commented on pull request #2241: URL: https://github.com/apache/hadoop/pull/2241#issuecomment-689153364 Was waiting for a clean QA run but could not trigger it manually. If I get no help from community, we can upload the patch file to the JIRA and from there I can trigger multiple runs to get a clean pre-commit build. I assume failing tests are not related to this but it's always nice to confirm. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 480458) Time Spent: 50m (was: 40m) > Create socket address combined with cache to speed up hdfs client choose > DataNode > - > > Key: HADOOP-17222 > URL: https://issues.apache.org/jira/browse/HADOOP-17222 > Project: Hadoop Common > Issue Type: Improvement > Components: common, hdfs-client > Environment: HBase version: 2.1.0 > JVM: -Xmx2g -Xms2g > hadoop hdfs version: 2.7.4 > disk:SSD > OS:CentOS Linux release 7.4.1708 (Core) > JMH Benchmark: @Fork(value = 1) > @Warmup(iterations = 300) > @Measurement(iterations = 300) >Reporter: fanrui >Assignee: fanrui >Priority: Major > Labels: pull-request-available > Attachments: After Optimization remark.png, After optimization.svg, > Before Optimization remark.png, Before optimization.svg > > Time Spent: 50m > Remaining Estimate: 0h > > Note:Not only the hdfs client can get the current benefit, all callers of > NetUtils.createSocketAddr will get the benefit. Just use hdfs client as an > example. 
> > The HDFS client selects the best DataNode for an HDFS block. Method call stack: > DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> > NetUtils.createSocketAddr > NetUtils.createSocketAddr creates the corresponding InetSocketAddress from a > host and port. The method performs some heavyweight operations, for example > URI.create(target), so NetUtils.createSocketAddr takes noticeable time to execute. > The following is my performance report. The report is based on HBase calling > HDFS; HBase is a high-frequency client of HDFS, because HBase read > operations often access a small data block (about 64 KB) instead of an entire > HFile. Under high-frequency access, the NetUtils.createSocketAddr > method becomes time-consuming. > h3. Test Environment: > > {code:java} > HBase version: 2.1.0 > JVM: -Xmx2g -Xms2g > hadoop hdfs version: 2.7.4 > disk: SSD > OS: CentOS Linux release 7.4.1708 (Core) > JMH Benchmark: @Fork(value = 1) > @Warmup(iterations = 300) > @Measurement(iterations = 300) > {code} > h4. Before Optimization FlameGraph: > In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts > for 4.86% of total CPU, with URI creation accounting for a large share of that. > !Before Optimization remark.png! > h3. Optimization idea: > NetUtils.createSocketAddr builds an InetSocketAddress from a host and port, > so we can cache InetSocketAddress objects, keyed by host and port. > h4. After Optimization FlameGraph: > In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts > for only 0.54% of total CPU. A ConcurrentHashMap is used as the cache, and > ConcurrentHashMap.get() retrieves entries from it. The CPU usage of > DFSInputStream.getBestNodeDNAddrPair thus drops from 4.86% to 0.54%. > !After Optimization remark.png! > h3. 
Original FlameGraph link: > [Before > Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing] > [After Optimization > FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
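The cached-lookup idea described in the issue can be sketched with a small, self-contained class. SocketAddressCache and its get method are hypothetical names for illustration, not the actual HADOOP-17222 patch; createUnresolved is used here so the sketch performs no DNS lookups.

```java
import java.net.InetSocketAddress;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the optimization: cache InetSocketAddress objects
// keyed by "host:port", so repeated lookups skip the heavier creation path
// (e.g. URI.create) that NetUtils.createSocketAddr would otherwise take.
public class SocketAddressCache {
    private static final ConcurrentHashMap<String, InetSocketAddress> CACHE =
            new ConcurrentHashMap<>();

    public static InetSocketAddress get(String host, int port) {
        // computeIfAbsent builds the address at most once per key; later
        // calls for the same host:port return the cached instance.
        return CACHE.computeIfAbsent(host + ":" + port,
                k -> InetSocketAddress.createUnresolved(host, port));
    }
}
```

A production cache would also need an invalidation story (a DataNode hostname can start resolving to a different IP), which the real patch has to consider; this sketch ignores it.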
[GitHub] [hadoop] liuml07 commented on pull request #2241: HADOOP-17222. Create socket address combined with URI cache
liuml07 commented on pull request #2241: URL: https://github.com/apache/hadoop/pull/2241#issuecomment-689153364 Was waiting for a clean QA run but could not trigger it manually. If I get no help from the community, we can upload the patch file to the JIRA, and from there I can trigger multiple runs to get a clean pre-commit build. I assume the failing tests are not related to this, but it's always nice to confirm. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work logged] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode
[ https://issues.apache.org/jira/browse/HADOOP-17222?focusedWorklogId=480459&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480459 ] ASF GitHub Bot logged work on HADOOP-17222: --- Author: ASF GitHub Bot Created on: 08/Sep/20 21:45 Start Date: 08/Sep/20 21:45 Worklog Time Spent: 10m Work Description: liuml07 edited a comment on pull request #2241: URL: https://github.com/apache/hadoop/pull/2241#issuecomment-689153364 Was waiting for a clean QA run but could not trigger it manually. If I get no help from the community, we can upload the patch file to the JIRA, and from there I can trigger multiple runs to get a clean pre-commit build. I assume the failing tests are not related to this patch, but it's always nice to confirm. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 480459) Time Spent: 1h (was: 50m) > Create socket address combined with cache to speed up hdfs client choose > DataNode
[GitHub] [hadoop] liuml07 edited a comment on pull request #2241: HADOOP-17222. Create socket address combined with URI cache
liuml07 edited a comment on pull request #2241: URL: https://github.com/apache/hadoop/pull/2241#issuecomment-689153364 Was waiting for a clean QA run but could not trigger it manually. If I get no help from the community, we can upload the patch file to the JIRA, and from there I can trigger multiple runs to get a clean pre-commit build. I assume the failing tests are not related to this patch, but it's always nice to confirm. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode
[ https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HADOOP-17222: --- Hadoop Flags: Reviewed Status: Patch Available (was: Open) > Create socket address combined with cache to speed up hdfs client choose > DataNode > - > > Key: HADOOP-17222 > URL: https://issues.apache.org/jira/browse/HADOOP-17222 > Project: Hadoop Common > Issue Type: Improvement > Components: common, hdfs-client > Reporter: fanrui > Assignee: fanrui > Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h