Re: [PR] HADOOP-19140. [ABFS, S3A] Add IORateLimiter API [hadoop]
steveloughran commented on code in PR #6703: URL: https://github.com/apache/hadoop/pull/6703#discussion_r1743444084

## hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/IORateLimiter.java: ##

@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs;
+
+import java.time.Duration;
+import javax.annotation.Nullable;
+
+import org.apache.hadoop.classification.InterfaceAudience;
+import org.apache.hadoop.classification.InterfaceStability;
+
+/**
+ * An optional interface for classes that provide rate limiters.
+ * For a filesystem source, the operation name SHOULD be one of
+ * those listed in
+ * {@link org.apache.hadoop.fs.statistics.StoreStatisticNames}
+ * if the operation is listed there.
+ *
+ * This interface is intended to be exported by FileSystems so that
+ * applications wishing to perform bulk operations may request access
+ * to a rate limiter which is shared across all threads interacting
+ * with the store.
+ * That is: the rate limiting is global to the specific instance of the
+ * object implementing this interface.
+ *
+ * It is not expected to be shared with other instances of the same
+ * class, or across processes.
+ *
+ * This means it is primarily of benefit when limiting bulk operations
+ * which can overload an (object) store from a small pool of threads.
+ * Examples of this can include:
+ *
+ * Bulk delete operations
+ * Bulk rename operations
+ * Completing many in-progress uploads
+ * Deep and wide recursive treewalks
+ * Reading/prefetching many blocks within a file
+ *
+ * In cluster applications, it is more likely that rate limiting is
+ * useful during job commit operations, or processes with many threads.
+ */
+@InterfaceAudience.Public
+@InterfaceStability.Unstable
+public interface IORateLimiter {
+
+  /**
+   * Acquire IO capacity.
+   *
+   * The implementation may assign different costs to the different
+   * operations.
+   *
+   * If there is not enough space, the permits will be acquired,
+   * but the subsequent call will block until the capacity has been
+   * refilled.
+   *
+   * The path parameter is used to support stores where there may be different throttling
+   * under different paths.
+   * @param operation operation being performed. Must not be null, may be "",
+   * should be from {@link org.apache.hadoop.fs.statistics.StoreStatisticNames}
+   * where there is a matching operation.
+   * @param source path for operations.
+   * Use "/" for root/store-wide operations.
+   * @param dest destination path for rename operations or any other operation which
+   * takes two paths.
+   * @param requestedCapacity capacity to acquire.
+   * Must be greater than or equal to 0.
+   * @return time spent waiting for capacity.
+   */
+  Duration acquireIOCapacity(
+      String operation,
+      Path source,

Review Comment: really good q. will comment below

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
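The javadoc above specifies unusual semantics: an oversized request is granted immediately, but it leaves the limiter in debt, so it is the *next* caller that blocks until the capacity has been refilled. A minimal, self-contained token-bucket sketch of those semantics follows; the class and method names are illustrative only, not the Hadoop API:

```java
import java.time.Duration;

/**
 * Token-bucket sketch of the acquire-then-block semantics described above:
 * a request larger than the remaining capacity is still granted, but it
 * drives the bucket negative ("debt"), so the next caller waits until
 * the refill rate has repaid the debt.
 */
public class TokenBucketSketch {
    private final double permitsPerSecond;
    private double available;          // may go negative: debt from an oversized grant
    private long lastRefillNanos = System.nanoTime();

    public TokenBucketSketch(double permitsPerSecond) {
        this.permitsPerSecond = permitsPerSecond;
        this.available = permitsPerSecond;   // start with one second of capacity
    }

    /** Acquire capacity; returns the time spent waiting, as in acquireIOCapacity(). */
    public synchronized Duration acquire(int requested) throws InterruptedException {
        if (requested < 0) {
            throw new IllegalArgumentException("requested < 0");
        }
        if (requested == 0) {
            return Duration.ZERO;            // zero-cost requests are short-cut
        }
        refill();
        long waitedNanos = 0;
        while (available <= 0) {             // pay off debt left by earlier callers
            Thread.sleep(10);
            waitedNanos += 10 * 1_000_000L;
            refill();
        }
        available -= requested;              // grant even if this overdraws the bucket
        return Duration.ofNanos(waitedNanos);
    }

    private void refill() {
        long now = System.nanoTime();
        available = Math.min(permitsPerSecond,
            available + permitsPerSecond * (now - lastRefillNanos) / 1e9);
        lastRefillNanos = now;
    }

    public static void main(String[] args) throws Exception {
        TokenBucketSketch bucket = new TokenBucketSketch(10);
        Duration first = bucket.acquire(20);   // oversized, but granted at once
        Duration second = bucket.acquire(10);  // pays for the earlier overdraft
        System.out.println("first-immediate=" + first.isZero());
        System.out.println("second-delayed=" + (second.toMillis() >= 500));
    }
}
```

This mirrors the behavior the tests later in this thread check with `assertNotDelayed`/`assertDelayed`: the first excess acquisition returns immediately, the follow-up blocks for roughly debt/rate seconds.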
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] HADOOP-19140. [ABFS, S3A] Add IORateLimiter API [hadoop]
anujmodi2021 commented on code in PR #6703: URL: https://github.com/apache/hadoop/pull/6703#discussion_r1741499620

## hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestIORateLimiter.java: ##

@@ -0,0 +1,213 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs;
+
+import java.time.Duration;
+
+import org.assertj.core.api.Assertions;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.fs.impl.IORateLimiterSupport;
+import org.apache.hadoop.test.AbstractHadoopTestBase;
+import org.apache.hadoop.util.RateLimiting;
+import org.apache.hadoop.util.RateLimitingFactory;
+
+import static org.apache.hadoop.fs.statistics.StoreStatisticNames.OP_DELETE;
+import static org.apache.hadoop.fs.statistics.StoreStatisticNames.OP_DELETE_BULK;
+import static org.apache.hadoop.fs.statistics.StoreStatisticNames.OP_DELETE_DIR;
+import static org.apache.hadoop.test.LambdaTestUtils.intercept;
+
+/**
+ * Test IO rate limiting in {@link RateLimiting} and {@link IORateLimiter}.
+ *
+ * This includes: illegal arguments, and what if more capacity
+ * is requested than is available.
+ */
+public class TestIORateLimiter extends AbstractHadoopTestBase {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+      TestIORateLimiter.class);
+
+  public static final Path ROOT = new Path("/");
+
+  @Test
+  public void testAcquireCapacity() {
+    final int size = 10;
+    final RateLimiting limiter = RateLimitingFactory.create(size);
+    // do a chain of requests
+    limiter.acquire(0);
+    limiter.acquire(1);
+    limiter.acquire(2);
+
+    // now ask for more than is allowed. This MUST work.
+    final int excess = size * 2;
+    limiter.acquire(excess);
+    assertDelayed(limiter, excess);
+  }
+
+  @Test
+  public void testNegativeCapacityRejected() throws Throwable {
+    final RateLimiting limiter = RateLimitingFactory.create(1);
+    intercept(IllegalArgumentException.class, () ->
+        limiter.acquire(-1));
+  }
+
+  @Test
+  public void testNegativeLimiterCapacityRejected() throws Throwable {
+    intercept(IllegalArgumentException.class, () ->
+        RateLimitingFactory.create(-1));
+  }
+
+  /**
+   * This is a key behavior: it is acceptable to ask for more capacity
+   * than the caller has, the initial request must be granted,
+   * but the followup request must be delayed until enough capacity
+   * has been restored.
+   */
+  @Test
+  public void testAcquireExcessCapacity() {
+
+    // create a small limiter
+    final int size = 10;
+    final RateLimiting limiter = RateLimitingFactory.create(size);
+
+    // now ask for more than is allowed. This MUST work.
+    final int excess = size * 2;
+    // first attempt gets more capacity than arrives every second.
+    assertNotDelayed(limiter, excess);
+    // second attempt will block
+    assertDelayed(limiter, excess);
+    // third attempt will block
+    assertDelayed(limiter, size);
+    // as these are short-cut, no delays.
+    assertNotDelayed(limiter, 0);
+  }
+
+  @Test
+  public void testIORateLimiterWithLimitedCapacity() {
+    final int size = 10;
+    final IORateLimiter limiter = IORateLimiterSupport.createIORateLimiter(size);
+    // this size will use more than can be allocated in a second.
+    final int excess = size * 2;
+    // first attempt gets more capacity than arrives every second.
+    assertNotDelayed(limiter, OP_DELETE_DIR, excess);
+    // second attempt will block
+    assertDelayed(limiter, OP_DELETE_BULK, excess);
+    // third attempt will block
+    assertDelayed(limiter, OP_DELETE, size);
+    // as zero capacity requests are short-cut, no delays, ever.
+    assertNotDelayed(limiter, "", 0);
+  }
+
+  /**
+   * Verify the unlimited rate limiter really is unlimited.
+   */
+  @Test
+  public void testIORateLimiterWithUnlimitedCapacity() {
+    final IORateLimiter limiter = IORateLimiterSupport.unlimited();
+    // this size will use more than can be allocated in a second.
+
+    assertNotDelayed(limiter, "1", 100_000);
+    assertNotDelayed(limiter, "2", 100_000);
+  }
+
+  @Test
+  public void testUnlimitedRejectsNegativeCapacity() th
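The quoted hunk uses `assertNotDelayed` and `assertDelayed` helpers whose bodies fall outside the quote. A plausible shape for them, measuring only whether the acquire reported any wait; everything here is a hypothetical reconstruction, not the PR's actual helpers:

```java
import java.time.Duration;
import java.util.function.IntFunction;

/** Hypothetical shape of the assertNotDelayed/assertDelayed test helpers. */
public class DelayAssertionsSketch {

    /** Assert the acquisition returned without waiting. */
    static Duration assertNotDelayed(IntFunction<Duration> acquire, int capacity) {
        Duration wait = acquire.apply(capacity);
        if (!wait.isZero()) {
            throw new AssertionError("expected no delay, waited " + wait);
        }
        return wait;
    }

    /** Assert the acquisition reported a nonzero wait. */
    static Duration assertDelayed(IntFunction<Duration> acquire, int capacity) {
        Duration wait = acquire.apply(capacity);
        if (wait.isZero()) {
            throw new AssertionError("expected a delay acquiring " + capacity);
        }
        return wait;
    }

    public static void main(String[] args) {
        // stand-in limiter: pretends any request over 10 permits had to wait
        IntFunction<Duration> fake =
            c -> c > 10 ? Duration.ofMillis(100) : Duration.ZERO;
        assertNotDelayed(fake, 5);
        assertDelayed(fake, 20);
        System.out.println("helpers-ok");
    }
}
```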
Re: [PR] HADOOP-19140. [ABFS, S3A] Add IORateLimiter API [hadoop]
anujmodi2021 commented on code in PR #6703: URL: https://github.com/apache/hadoop/pull/6703#discussion_r1741496480

## hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/IORateLimiter.java: ##

(quoting the same `acquireIOCapacity` diff as above)

Review Comment: Just to understand this better... If we have a list of paths on which we are attempting a bulk operation, and the only common prefix for them is the root itself, should we acquire IO capacity for each individual path or for the root path itself?

-- This is an automated message from the Apache Git Service.
Re: [PR] HADOOP-19140. [ABFS, S3A] Add IORateLimiter API [hadoop]
hadoop-yetus commented on PR #6703: URL: https://github.com/apache/hadoop/pull/6703#issuecomment-2073517148

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 0m 31s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 46m 17s | | trunk passed |
| +1 :green_heart: | compile | 17m 52s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 17m 12s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 1m 15s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 39s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 14s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 50s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 2m 35s | | trunk passed |
| +1 :green_heart: | shadedclient | 38m 40s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 55s | | the patch passed |
| +1 :green_heart: | compile | 16m 46s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 16m 46s | | the patch passed |
| +1 :green_heart: | compile | 16m 7s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 16m 7s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 1m 14s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 34s | | the patch passed |
| +1 :green_heart: | javadoc | 1m 4s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 49s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 2m 52s | | the patch passed |
| +1 :green_heart: | shadedclient | 38m 43s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 19m 54s | | hadoop-common in the patch passed. |
| +1 :green_heart: | asflicense | 0m 58s | | The patch does not generate ASF License warnings. |
| | | 232m 44s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6703/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6703 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 1e9683e47802 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / d2e146e4180311a52a94240922e3daf8f94ec8bd |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6703/2/testReport/ |
| Max. process+thread count | 2038 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6703/2/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

-- This is an automated message from the Apache Git Service.
Re: [PR] HADOOP-19140. [ABFS, S3A] Add IORateLimiter API [hadoop]
steveloughran commented on code in PR #6703: URL: https://github.com/apache/hadoop/pull/6703#discussion_r1561220101

## hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/impl/IORateLimiterSupport.java: ##

@@ -0,0 +1,73 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.impl;
+
+import org.apache.hadoop.fs.IORateLimiter;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.util.RateLimiting;
+import org.apache.hadoop.util.RateLimitingFactory;
+
+import static org.apache.hadoop.util.Preconditions.checkArgument;
+
+/**
+ * Implementation support for {@link IORateLimiter}.
+ */
+public final class IORateLimiterSupport {

Review Comment: with the op name and path you can be clever:
* limit by path
* use operation name and have a "multiplier" of actual io, to include extra operations made (rename: list, copy, delete). for s3, separate read/write io capacities would need to be requested.
* consider some free and give a cost of 0

-- This is an automated message from the Apache Git Service.
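The "multiplier" idea in the comment above maps a logical operation to the number of store requests it really makes. A sketch of such a cost table; the operation names and multiplier values are hypothetical, not taken from the PR:

```java
import java.util.Map;

/**
 * Hypothetical per-operation cost table: the requested capacity is
 * scaled by how many store requests the logical operation implies.
 */
public class OperationCostSketch {

    // illustrative multipliers: a rename is really list + copy + delete;
    // a status probe might be considered free (cost 0)
    private static final Map<String, Integer> COST = Map.of(
        "op_rename", 3,
        "op_delete", 1,
        "op_get_file_status", 0);

    /** Scale the requested capacity by the operation's multiplier (default 1). */
    static int cost(String operation, int requestedCapacity) {
        return COST.getOrDefault(operation, 1) * requestedCapacity;
    }

    public static void main(String[] args) {
        System.out.println("rename(5) -> " + cost("op_rename", 5));
        System.out.println("status(5) -> " + cost("op_get_file_status", 5));
    }
}
```

A real implementation for S3 would, as the comment notes, likely split this into separate read and write capacity pools rather than a single multiplier.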
Re: [PR] HADOOP-19140. [ABFS, S3A] Add IORateLimiter API [hadoop]
steveloughran commented on code in PR #6703: URL: https://github.com/apache/hadoop/pull/6703#discussion_r1561216836

## hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/IORateLimiter.java: ##

(quoting the same `acquireIOCapacity` diff as above)

Review Comment: s3 throttling does, as it is per-prefix.

-- This is an automated message from the Apache Git Service.
Re: [PR] HADOOP-19140. [ABFS, S3A] Add IORateLimiter API [hadoop]
mukund-thakur commented on code in PR #6703: URL: https://github.com/apache/hadoop/pull/6703#discussion_r1558044600

## hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/IORateLimiter.java: ##

(quoting the same `acquireIOCapacity` diff as above)

Review Comment: A multi-delete operation takes a list of paths. Although we have a concept of the base path, I don't think the S3 client requires every path to be under the base path.

-- This is an automated message from the Apache Git Service.
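One way to reason about the question raised in this thread: acquire capacity against the deepest common parent of the page of paths, which naturally degrades to "/" (a store-wide acquisition) when the paths share nothing. A hypothetical helper, not part of the PR:

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: deepest common parent of slash-separated paths. */
public class CommonPrefixSketch {

    static String commonParent(List<String> paths) {
        String prefix = paths.get(0);
        for (String p : paths) {
            // walk the candidate prefix up until it is an ancestor of p (or root)
            while (!(p.equals(prefix) || p.startsWith(prefix + "/"))
                    && !prefix.equals("/")) {
                int slash = prefix.lastIndexOf('/');
                prefix = slash <= 0 ? "/" : prefix.substring(0, slash);
            }
        }
        return prefix;
    }

    public static void main(String[] args) {
        // shared ancestor exists
        System.out.println("parent1="
            + commonParent(Arrays.asList("/a/b/x", "/a/b/y", "/a/c")));
        // nothing shared: degrades to root, i.e. a store-wide acquisition
        System.out.println("parent2="
            + commonParent(Arrays.asList("/a/b", "/z")));
    }
}
```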
Re: [PR] HADOOP-19140. [ABFS, S3A] Add IORateLimiter API [hadoop]
mukund-thakur commented on code in PR #6703: URL: https://github.com/apache/hadoop/pull/6703#discussion_r1554373273

## hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/impl/IORateLimiterSupport.java: ##

(quoting the same `IORateLimiterSupport` diff as above)

Review Comment: This is just a wrapper on top of RestrictedRateLimiting with extra operation name validation, right? I think this can be extended to limit per operation.

-- This is an automated message from the Apache Git Service.
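The "limit per operation" extension suggested in this thread could keep an independent budget per operation name. The sketch below uses a `Semaphore` as a crude, non-refilling stand-in for a rate limiter just to show the per-operation keying; all names are hypothetical:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

/**
 * Per-operation limiting sketch: each operation name gets its own
 * budget, so exhausting delete capacity does not block renames.
 */
public class PerOperationLimiterSketch {

    private final Map<String, Semaphore> limiters = new ConcurrentHashMap<>();
    private final int permitsPerOperation;

    PerOperationLimiterSketch(int permitsPerOperation) {
        this.permitsPerOperation = permitsPerOperation;
    }

    /** True if capacity for this operation was available without blocking. */
    boolean tryAcquire(String operation, int capacity) {
        return limiters
            .computeIfAbsent(operation, op -> new Semaphore(permitsPerOperation))
            .tryAcquire(capacity);
    }

    public static void main(String[] args) {
        PerOperationLimiterSketch l = new PerOperationLimiterSketch(10);
        System.out.println("delete-first=" + l.tryAcquire("op_delete", 8));
        System.out.println("delete-second=" + l.tryAcquire("op_delete", 8)); // budget exhausted
        System.out.println("rename-first=" + l.tryAcquire("op_rename", 8)); // independent budget
    }
}
```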
Re: [PR] HADOOP-19140. [ABFS, S3A] Add IORateLimiter API [hadoop]
hadoop-yetus commented on PR #6703:
URL: https://github.com/apache/hadoop/pull/6703#issuecomment-2035351221

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 0m 19s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 32m 47s | | trunk passed |
| +1 :green_heart: | compile | 8m 56s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 8m 7s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 44s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 3s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 48s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 34s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 23s | | trunk passed |
| +1 :green_heart: | shadedclient | 20m 53s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 30s | | the patch passed |
| +1 :green_heart: | compile | 8m 30s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 8m 30s | | the patch passed |
| +1 :green_heart: | compile | 8m 6s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 8m 6s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 35s | | the patch passed |
| +1 :green_heart: | mvnsite | 0m 56s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 43s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 37s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 35s | | the patch passed |
| +1 :green_heart: | shadedclient | 21m 19s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 16m 31s | | hadoop-common in the patch passed. |
| +1 :green_heart: | asflicense | 0m 42s | | The patch does not generate ASF License warnings. |
| | | 138m 38s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6703/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6703 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux e24358cb7c53 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 58fb6a3036d824f0c201c7dbdf18b542cc6576d8 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6703/1/testReport/ |
| Max. process+thread count | 2150 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6703/1/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.
[PR] HADOOP-19140. [ABFS, S3A] Add IORateLimiter API [hadoop]
steveloughran opened a new pull request, #6703:
URL: https://github.com/apache/hadoop/pull/6703

Adds an API (pulled from #6596) to allow callers to request IO capacity for a named operation, with optional source and destination paths.

The first use of this would be the bulk delete operation of #6494: the S3A code would apply some throttling which sets the maximum number of writes per bucket, and for a bulk delete the caller would request as much capacity as there were entries to delete.

Adds new store operations for delete_bulk and delete_dir.

### How was this patch tested?

New tests.

### For code changes:

- [X] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
- [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?
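The caller-side pattern the PR description outlines — a bulk delete asking the store's limiter for one unit of write capacity per entry before issuing the request — could look roughly like this. This is a self-contained sketch: the nested `IORateLimiter` interface here is a stand-in with an assumed shape (operation name plus requested capacity), not the exact signature committed in the PR, and `requestDeleteCapacity` is an invented helper for illustration.

```java
import java.time.Duration;
import java.util.List;

/**
 * Hypothetical caller-side sketch of the rate-limiting pattern described
 * in the PR: before a bulk delete, ask the filesystem's shared limiter
 * for as much capacity as there are entries, then wait out any delay
 * the limiter imposes.
 */
public class BulkDeleteSketch {

  /** Stand-in for the filesystem-provided limiter; shape is assumed. */
  interface IORateLimiter {
    Duration acquireIOCapacity(String operation, int requestedCapacity);
  }

  /**
   * Request one unit of write capacity per path to be deleted.
   * "delete_bulk" matches the new store operation name added by this PR.
   */
  static Duration requestDeleteCapacity(IORateLimiter limiter,
      List<String> paths) {
    return limiter.acquireIOCapacity("delete_bulk", paths.size());
  }

  public static void main(String[] args) throws InterruptedException {
    // A trivial limiter that never delays, for demonstration only.
    IORateLimiter unlimited = (op, n) -> Duration.ZERO;

    Duration wait = requestDeleteCapacity(unlimited,
        List.of("a", "b", "c"));
    if (!wait.isZero()) {
      Thread.sleep(wait.toMillis()); // honour the delay the limiter asked for
    }
    System.out.println("acquired capacity for 3 deletes");
  }
}
```

The key point of the design is that the limiter is owned by the filesystem instance, not the caller: all threads doing bulk deletes against the same store draw from one shared capacity pool, so the aggregate write rate to the bucket stays bounded no matter how many callers there are.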