Erik Krogen created SPARK-35263: ----------------------------------- Summary: Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code Key: SPARK-35263 URL: https://issues.apache.org/jira/browse/SPARK-35263 Project: Spark Issue Type: Improvement Components: Shuffle, Tests Affects Versions: 3.1.1 Reporter: Erik Krogen
{{ShuffleFetcherBlockIteratorSuite}} has tons of duplicate code, like: {code} val iterator = new ShuffleBlockFetcherIterator( taskContext, transfer, blockManager, blocksByAddress, (_, in) => in, 48 * 1024 * 1024, Int.MaxValue, Int.MaxValue, Int.MaxValue, true, false, metrics, false) {code} It's challenging to tell what the interesting parts are vs. what is just being set to some default/unused value. Similarly but not as bad, there are 10 calls like: {code} verify(transfer, times(1)).fetchBlocks(any(), any(), any(), any(), any(), any()) {code} and 7 like {code} when(transfer.fetchBlocks(any(), any(), any(), any(), any(), any())).thenAnswer ... {code} This can result in about 10% reduction in both lines and characters in the file: {code} # Before > wc > core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala 1063 3950 43201 core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala # After > wc > core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala 928 3609 39053 core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala {code} It also helps readability: {code} val iterator = createShuffleBlockIteratorWithDefaults( transfer, blocksByAddress, maxBytesInFlight = 1000L ) {code} Now I can clearly tell that {{maxBytesInFlight}} is the main parameter we're interested in here. -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org