[ https://issues.apache.org/jira/browse/HDFS-17302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17803764#comment-17803764 ]
ASF GitHub Bot commented on HDFS-17302:
---------------------------------------

hadoop-yetus commented on PR #6380:
URL: https://github.com/apache/hadoop/pull/6380#issuecomment-1879579502

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 49s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 46m 33s | | trunk passed |
| +1 :green_heart: | compile | 0m 41s | | trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | compile | 0m 37s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 0m 29s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 40s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 42s | | trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javadoc | 0m 30s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 1m 22s | | trunk passed |
| +1 :green_heart: | shadedclient | 38m 14s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 38m 34s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 32s | | the patch passed |
| +1 :green_heart: | compile | 0m 32s | | the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javac | 0m 32s | | the patch passed |
| +1 :green_heart: | compile | 0m 29s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 0m 29s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 18s | | the patch passed |
| +1 :green_heart: | mvnsite | 0m 31s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 28s | | the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javadoc | 0m 23s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 1m 21s | | the patch passed |
| +1 :green_heart: | shadedclient | 38m 36s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 22m 57s | | hadoop-hdfs-rbf in the patch passed. |
| +1 :green_heart: | asflicense | 0m 35s | | The patch does not generate ASF License warnings. |
| | | 161m 34s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6380/3/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6380 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 58649d70f692 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 64fb454c2b447cbda6d0e134edd4b18cebaa29cf |
| Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6380/3/testReport/ |
| Max. process+thread count | 2376 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6380/3/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

> RBF: ProportionRouterRpcFairnessPolicyController-Sharing and isolation.
> -----------------------------------------------------------------------
>
> Key: HDFS-17302
> URL: https://issues.apache.org/jira/browse/HDFS-17302
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: rbf
> Reporter: Jian Zhang
> Assignee: Jian Zhang
> Priority: Major
> Labels: pull-request-available
> Attachments: HDFS-17302.001.patch, HDFS-17302.002.patch, HDFS-17302.003.patch
>
>
> h2. Current shortcomings
> [HDFS-14090|https://issues.apache.org/jira/browse/HDFS-14090] provides a StaticRouterRpcFairnessPolicyController that supports configuring a different number of handlers for each nameservice (ns). With it, the router can isolate nameservices from one another, so an ns under high load does not affect the router's access to an ns under normal load. But the StaticRouterRpcFairnessPolicyController still falls short in several ways:
> 1. *Configuration is inconvenient and error-prone*: To use StaticRouterRpcFairnessPolicyController, I first need to know how many handlers the router has in total and how many nameservices it currently serves. I then have to carefully work out how many handlers to allocate to each ns so that the sum across all ns does not exceed the router's total, while also giving each ns enough handlers to perform well. If I configure even one handler too many for some ns, the sum exceeds the number of handlers owned by the router and the router fails to start. I then have to investigate why the router failed to start and, once I find the cause, rework the number of handlers for each ns. In addition, whenever I change the total number of handlers on the router, I have to re-allocate handlers to every ns, which adds to the operation and maintenance burden.
> 2. *Adding a new ns is not supported*: If a new ns is added to the cluster while the router is running and a mount point is added for it, the ns cannot be accessed through the router because no handlers have been allocated to it. We must reconfigure the number of handlers and then refresh the configuration; only then can the router access the ns normally.
> When we reconfigure the number of handlers, we again face shortcoming 1: configuration is inconvenient and error-prone.
> 3. *Wasted handlers*: The main purpose of RouterRpcFairnessPolicyController is to let the router access ns under normal load without being affected by ns under higher load. But not every ns has a high load, and an ns with a high load is not busy 24 hours a day; it may be busy only during certain periods, such as 0 to 8 o'clock, and normally loaded the rest of the time. Assume there are 2 ns and each is allocated half of the handlers. Assume that ns1 has many requests from 0 to 14 o'clock and almost none from 14 to 24 o'clock, while ns2 has many requests from 12 to 24 o'clock and almost none from 0 to 12 o'clock. Between 0 and 12 o'clock and between 14 and 24 o'clock, only one ns is busy and the other is almost idle, so half of the handlers are wasted.
> 4. *Only isolation, no sharing*: The StaticRouterRpcFairnessPolicyController supports only isolation, not sharing. I think isolation is just a means of improving the router's performance when accessing normally loaded ns, not an end in itself. It is very unlikely that all ns in the cluster are under high load at once; in most scenarios only a few ns are heavily loaded and the rest are normal. We need to isolate the handlers of heavily loaded ns from those of normally loaded ns, so that the former do not degrade the performance of the latter. However, ns of the same nature (all under normal load, or all under higher load) do not need to be isolated from each other and can share the router's handlers. This performs better than assigning a fixed number of handlers to each ns, because each of them can use all of the router's handlers.
> h2. New features
> Because the StaticRouterRpcFairnessPolicyController has the usability and performance deficiencies described above, I provide a new RouterRpcFairnessPolicyController: ProportionRouterRpcFairnessPolicyController (maybe there is a better name) to address these major shortcomings.
> 1. *More user-friendly configuration*: Supports allocating handlers to each ns proportionally. For example, if we give ns1 a handler ratio of 0.2, then ns1 will use 0.2 of the total number of handlers on the router. With this method, we do not need to know in advance how many handlers the router has.
> 2. *Sharing and isolation*: Sharing is as important as isolation. We allow the sum of the handlers allocated to all ns to exceed the total number of handlers. For example, assuming we have 10 handlers and 3 ns, we can allocate 5 (0.5) handlers to ns1, 5 (0.5) handlers to ns2, and 5 (0.5) handlers to ns3. This feature is very important. Consider the following scenarios:
> - Only one ns is busy during a period of time: Assume that ns1 has more requests from 0 to 8 o'clock, ns2 has more requests from 8 to 16 o'clock, and ns3 has more requests from 16 to 24 o'clock. Then, in any time period, the ns with more requests uses at most half of the handlers, and the other two normal ns share the remaining half. In this way, isolation is still satisfied, and compared with StaticRouterRpcFairnessPolicyController, we can use more handlers to handle requests for both busy and normal ns (with StaticRouterRpcFairnessPolicyController, each ns uses 3 handlers, [ns1:3 ns2:3 ns3:3]; now we can let each ns use 5 handlers).
> - Only ns1 is busy: Assume that only ns1 is ever busy and the requests for ns2 and ns3 are normal (requests to ns2 and ns3 are few and fast because the downstream namenodes are under no pressure). We can give ns1 5 (0.5) handlers and give ns2 and ns3 10 (1) handlers each.
> Since the number of requests for ns2 and ns3 is very small and their processing time is very short, they will not have a major impact on the performance of ns1; and because we stipulate that ns1 uses at most half of the handlers, isolation is still met.
> 3. *Transparent extension*: Adding a new ns does not require refreshing the configuration. If no handlers are explicitly assigned to an ns, it is assigned a default proportion of the handlers.
> 4. *Fully compatible*: The new RouterRpcFairnessPolicyController covers everything the StaticRouterRpcFairnessPolicyController does. If we want isolation only, without sharing, we can allocate 0.3 to ns2, 0.3 to ns3, and 0.4 to ns1. This is also more convenient than using the original StaticRouterRpcFairnessPolicyController, because we don't need to know how many handlers the router has in total.
> Therefore, the new RouterRpcFairnessPolicyController is more flexible, performs better, and is better suited to production environments.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
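
As a rough illustration of the proportional allocation described in the issue above, the following Python sketch converts per-ns ratios into handler counts. The function name `allocate_handlers`, the default ratio of 0.1, the round-down rule, and the minimum of one handler per ns are all assumptions made for this example; they are not taken from the patch.

```python
def allocate_handlers(total_handlers, ratios, nameservices, default_ratio=0.1):
    """Map each ns to a handler count derived from its configured ratio.

    Hypothetical sketch: rounding down, the minimum of 1 handler, and
    default_ratio=0.1 are illustrative assumptions, not the patch's behavior.
    """
    allocation = {}
    for ns in nameservices:
        # An ns without an explicit ratio falls back to the default,
        # so a newly added ns gets handlers without reconfiguration.
        ratio = ratios.get(ns, default_ratio)
        if not 0.0 < ratio <= 1.0:
            raise ValueError(f"ratio for {ns} must be in (0, 1], got {ratio}")
        allocation[ns] = max(1, int(ratio * total_handlers))
    return allocation


if __name__ == "__main__":
    names = ["ns1", "ns2", "ns3"]
    # Sharing: ratios sum to 1.5, so the allocations may exceed the total.
    print(allocate_handlers(10, {"ns1": 0.5, "ns2": 0.5, "ns3": 0.5}, names))
    # Isolation only: ratios sum to 1.0, as in the "fully compatible" case.
    print(allocate_handlers(10, {"ns1": 0.4, "ns2": 0.3, "ns3": 0.3}, names))
```

Because the counts are computed from ratios, changing the router's total handler count does not require touching the per-ns settings, which is the convenience the description argues for.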