Erik Krogen created SPARK-35263:
-----------------------------------

             Summary: Refactor ShuffleBlockFetcherIteratorSuite to reduce 
duplicated code
                 Key: SPARK-35263
                 URL: https://issues.apache.org/jira/browse/SPARK-35263
             Project: Spark
          Issue Type: Improvement
          Components: Shuffle, Tests
    Affects Versions: 3.1.1
            Reporter: Erik Krogen


{{ShuffleFetcherBlockIteratorSuite}} has tons of duplicate code, like:
{code}
    val iterator = new ShuffleBlockFetcherIterator(
      taskContext,
      transfer,
      blockManager,
      blocksByAddress,
      (_, in) => in,
      48 * 1024 * 1024,
      Int.MaxValue,
      Int.MaxValue,
      Int.MaxValue,
      true,
      false,
      metrics,
      false)
{code}
It's challenging to tell what the interesting parts are vs. what is just being 
set to some default/unused value.

Similarly but not as bad, there are 10 calls like:
{code}
verify(transfer, times(1)).fetchBlocks(any(), any(), any(), any(), any(), any())
{code}
and 7 like
{code}
when(transfer.fetchBlocks(any(), any(), any(), any(), any(), any())).thenAnswer 
...
{code}

This can result in about 10% reduction in both lines and characters in the file:
{code}
# Before
> wc 
> core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala
    1063    3950   43201 
core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala

# After
> wc 
> core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala
     928    3609   39053 
core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala
{code}

It also helps readability:
{code}
    val iterator = createShuffleBlockIteratorWithDefaults(
      transfer,
      blocksByAddress,
      maxBytesInFlight = 1000L
    )
{code}
Now I can clearly tell that {{maxBytesInFlight}} is the main parameter we're 
interested in here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to