GitHub user aokolnychyi opened a pull request:

    https://github.com/apache/spark/pull/23171

    [SPARK-26205][SQL] Optimize In for bytes, shorts, ints

    ## What changes were proposed in this pull request?
    
    This PR optimizes `In` expressions for byte, short, integer types. It is a 
follow-up on PR #21442 from @dbtsai.  
    
    Currently, `In` expressions are compiled into a sequence of if-else 
statement, which results in O(n) time complexity. `InSet` is an optimized 
version of `In`, which is supposed to improve the performance if the number of 
elements is big enough. However, `InSet` actually degrades the performance in 
many cases due to various reasons (`InSet` will be handled in 
[SPARK-26203](https://issues.apache.org/jira/browse/SPARK-26203) and 
[SPARK-26204](https://issues.apache.org/jira/browse/SPARK-26204)).
    
    The main idea of this PR is to make use of `tableswitch` and `lookupswitch` 
bytecode instructions to speed up `In` expressions. In short, we can improve 
our time complexity from O(n) to O(1) or at least O(log n) by using Java 
`switch` statements. We will have O(1) time complexity if our case values are 
compact and `tableswitch` can be used. Otherwise, `lookupswitch` will give us 
O(log n). See 
[here](https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-3.html#jvms-3.10)
 and 
[here](https://stackoverflow.com/questions/10287700/difference-between-jvms-lookupswitch-and-tableswitch)
 for more information.
    
    An important benefit of the proposed approach is that we do not have to pay 
an extra cost for autoboxing as in case of `InSet`. As a consequence, we can 
substantially outperform `InSet` even on 250+ elements.
    
    This PR does not cover long values as Java `switch` statements cannot be 
used on them. However, we can have a follow-up PR with an approach similar to 
binary search.
    
    ## How was this patch tested?
    
    ### Correctness
    
    There is a new test that verifies the logic of the proposed optimization.
    
    ### Performance Benchmarks
    
    The performance impact was measured by the benchmarks below (an appropriate 
part of them will be included in 
[SPARK-26203](https://issues.apache.org/jira/browse/SPARK-26203)).
    
    ```
      def compactBytesBenchmark(numItems: Int, numRows: Long, minNumIters: 
Int): Benchmark = {
        val name = s"$numItems compact bytes"
        val values = (1 to numItems).map(v => s"CAST($v AS tinyint)")
        val df = spark.range(1, numRows).select($"id".cast(ByteType))
        benchmark(name, df, values, numRows, minNumIters)
      }
    
      def nonCompactBytesBenchmark(numItems: Int, numRows: Long, minNumIters: 
Int): Benchmark = {
        val step = (Byte.MaxValue.toInt - Byte.MinValue.toInt) / numItems
        val maxValue = Byte.MinValue + numItems * step
        val rangeSize = maxValue - Byte.MinValue
        require(isLookupSwitch(rangeSize, numItems))
        val name = s"$numItems non-compact bytes"
        val values = (Byte.MinValue until maxValue by step).map(v => s"CAST($v 
AS tinyint)")
        val df = spark.range(1, numRows).select($"id".cast(ByteType))
        benchmark(name, df, values, numRows, minNumIters)
      }
    
      def compactShortsBenchmark(numItems: Int, numRows: Long, minNumIters: 
Int): Benchmark = {
        val name = s"$numItems compact shorts"
        val values = (1 to numItems).map(v => s"CAST($v AS smallint)")
        val df = spark.range(1, numRows).select($"id".cast(ShortType))
        benchmark(name, df, values, numRows, minNumIters)
      }
    
      def nonCompactShortsBenchmark(numItems: Int, numRows: Long, minNumIters: 
Int): Benchmark = {
        val step = (Short.MaxValue.toInt - Short.MinValue.toInt) / numItems
        val maxValue = Short.MinValue + numItems * step
        val rangeSize = maxValue - Short.MinValue
        require(isLookupSwitch(rangeSize, numItems))
        val name = s"$numItems non-compact shorts"
        val values = (Short.MinValue until maxValue by step).map(v => s"CAST($v 
AS smallint)")
        val df = spark.range(1, numRows).select($"id".cast(ShortType))
        benchmark(name, df, values, numRows, minNumIters)
      }
    
      def compactIntsBenchmark(numItems: Int, numRows: Long, minNumIters: Int): 
Benchmark = {
        val name = s"$numItems compact ints"
        val values = (1 to numItems).map(v => 10000000 + v)
        val df = spark.range(1, numRows).select($"id".cast(IntegerType))
        benchmark(name, df, values, numRows, minNumIters)
      }
    
      def nonCompactIntsBenchmark(numItems: Int, numRows: Long, minNumIters: 
Int): Benchmark = {
        val step = (Int.MaxValue.toLong - Int.MinValue.toLong) / numItems
        val maxValue = Int.MinValue + numItems * step
        val rangeSize = maxValue - Int.MinValue
        require(isLookupSwitch(rangeSize, numItems))
        val name = s"$numItems non-compact ints"
        val values = Int.MinValue until maxValue.toInt by step.toInt
        val df = spark.range(1, numRows).select($"id".cast(IntegerType))
        benchmark(name, df, values, numRows, minNumIters)
      }
    
      def benchmark(
          name: String,
          df: DataFrame,
          values: Seq[Any],
          numRows: Long,
          minNumIters: Int): Benchmark = {
    
        val benchmark = new Benchmark(name, numRows, minNumIters, output = 
output)
    
        df.createOrReplaceTempView("t")
    
        val testClosure = () => {
          val df = spark.sql(s"SELECT * FROM t WHERE id IN (${values mkString 
","})")
          df.queryExecution.toRdd.foreach(_ => Unit)
        }
    
        benchmark.addCase("In expression") { _ =>
          withSQLConf(SQLConf.OPTIMIZER_INSET_CONVERSION_THRESHOLD.key -> 
values.size.toString) {
            testClosure()
          }
        }
    
        benchmark.addCase("InSet expression") { _ =>
          withSQLConf(SQLConf.OPTIMIZER_INSET_CONVERSION_THRESHOLD.key -> "1") {
            testClosure()
          }
        }
    
        benchmark
      }
    
      // this logic is derived from com.sun.tools.javac.jvm.Gen.java
      private def isLookupSwitch(rangeSize: Long, numLabels: Int): Boolean = {
        val tableSpaceCost = 4 + rangeSize
        val tableTimeCost = 3
        val lookupSpaceCost = 3 + 2 * numLabels
        val lookupTimeCost = numLabels
        lookupSpaceCost + 3 * lookupTimeCost < tableSpaceCost + 3 * 
tableTimeCost
      }
    
    ```
    
    The actual benchmarking code looks like:
    
    ```
    val numRows = 100000000
    val minNumIters = 10
    
    compactBytesBenchmark(numItems = 10, numRows = numRows, minNumIters).run()
    nonCompactBytesBenchmark(numItems = 10, numRows = numRows, 
minNumIters).run()
    compactBytesBenchmark(numItems = 25, numRows = numRows, minNumIters).run()
    nonCompactBytesBenchmark(numItems = 25, numRows = numRows, 
minNumIters).run()
    
    compactShortsBenchmark(numItems = 10, numRows = numRows, minNumIters).run()
    nonCompactShortsBenchmark(numItems = 10, numRows = numRows, 
minNumIters).run()
    compactShortsBenchmark(numItems = 50, numRows = numRows, minNumIters).run()
    nonCompactShortsBenchmark(numItems = 50, numRows = numRows, 
minNumIters).run()
    compactShortsBenchmark(numItems = 100, numRows = numRows, minNumIters).run()
    nonCompactShortsBenchmark(numItems = 100, numRows = numRows, 
minNumIters).run()
    
    compactIntsBenchmark(numItems = 10, numRows = numRows, minNumIters).run()
    nonCompactIntsBenchmark(numItems = 10, numRows = numRows, minNumIters).run()
    compactIntsBenchmark(numItems = 100, numRows = numRows, minNumIters).run()
    nonCompactIntsBenchmark(numItems = 100, numRows = numRows, 
minNumIters).run()
    compactIntsBenchmark(numItems = 250, numRows = numRows, minNumIters).run()
    nonCompactIntsBenchmark(numItems = 250, numRows = numRows, 
minNumIters).run()
    ```
    
    #### Byte
    
    **_Performance Summary_**
    
    Proposed Solution
    
    |  # values       | InSet           | In with compact data | In with 
non-compact data |
    | ------------- |:-------------:| -----:| ---:|
    | 10      | 230 ms | 73 ms | 87 ms |
    | 25      | 290 ms | 110 ms | 117 ms |
    
    Current Solution
    
    |  # values       | InSet           | In with compact data | In with 
non-compact data |
    | ------------- |:-------------:| -----:| ---:|
    | 10      | 220 ms | 91 ms | 91 ms |
    | 25      | 291 ms | 167 ms | 168 ms |
    
    **_Raw Results_**
    
    Proposed Solution
    
    ```
    Running benchmark: 10 compact bytes
      Running case: In expression
      Stopped after 28 iterations, 2053 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3277 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    10 compact bytes:                        Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                   68 /   73       1474.6      
     0.7       1.0X
    InSet expression                               216 /  218        463.9      
     2.2       0.3X
    
    Running benchmark: 10 non-compact bytes
      Running case: In expression
      Stopped after 24 iterations, 2076 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3340 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    10 non-compact bytes:                    Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                   84 /   87       1186.6      
     0.8       1.0X
    InSet expression                               217 /  223        460.7      
     2.2       0.4X
    
    Running benchmark: 25 compact bytes
      Running case: In expression
      Stopped after 19 iterations, 2080 ms
      Running case: InSet expression
      Stopped after 15 iterations, 4285 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    25 compact bytes:                        Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                  105 /  110        955.0      
     1.0       1.0X
    InSet expression                               273 /  286        365.8      
     2.7       0.4X
    
    Running benchmark: 25 non-compact bytes
      Running case: In expression
      Stopped after 18 iterations, 2104 ms
      Running case: InSet expression
      Stopped after 15 iterations, 4449 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    25 non-compact bytes:                    Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                  107 /  117        938.5      
     1.1       1.0X
    InSet expression                               284 /  297        352.3      
     2.8       0.4X
    ```
    
    Current Solution
    
    ```
    Running benchmark: 10 compact bytes
      Running case: In expression
      Stopped after 22 iterations, 2002 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3601 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    10 compact bytes:                        Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                   86 /   91       1168.9      
     0.9       1.0X
    InSet expression                               226 /  240        442.5      
     2.3       0.4X
    
    Running benchmark: 10 non-compact bytes
      Running case: In expression
      Stopped after 23 iterations, 2085 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3265 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    10 non-compact bytes:                    Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                   86 /   91       1169.1      
     0.9       1.0X
    InSet expression                               213 /  218        469.5      
     2.1       0.4X
    
    Running benchmark: 25 compact bytes
      Running case: In expression
      Stopped after 15 iterations, 2499 ms
      Running case: InSet expression
      Stopped after 15 iterations, 4326 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    25 compact bytes:                        Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                  164 /  167        610.7      
     1.6       1.0X
    InSet expression                               282 /  288        354.2      
     2.8       0.6X
    
    Running benchmark: 25 non-compact bytes
      Running case: In expression
      Stopped after 15 iterations, 2518 ms
      Running case: InSet expression
      Stopped after 15 iterations, 4404 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    25 non-compact bytes:                    Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                  164 /  168        611.5      
     1.6       1.0X
    InSet expression                               282 /  294        354.4      
     2.8       0.6X
    ```
    
    **_Conclusion_**
    
    As expected, we see that the proposed logic improves the performance on 
compact and non-compact sequences of byte values. While the performance on 
compact elements is better than on non-compact ones, we do not see that it is 
stable independently of the number of elements inside `In`. Hence, we cannot 
conclude O(1) time performance on compact byte values. Spark generates `(byte) 
1`, `(byte) 100` to represent byte literals, so it can potentially be a reason 
for the non-linear time complexity.
    
    We also see that the performance of the old approach on bytes is the same 
independently of data compactness (as expected).
    
    #### Short
    
    **_Performance Summary_**
    
    Proposed Solution
    
    |  # values       | InSet           | In with compact data | In with 
non-compact data |
    | ------------- |:-------------:| -----:| ---:|
    | 10      | 243 ms | 45 ms | 56 ms |
    | 50      | 294 ms | 45 ms | 71 ms |
    | 100      | 260 ms | 46 ms | 82 ms |
    
    Current Solution
    
    |  # values       | InSet           | In with compact data | In with 
non-compact data |
    | ------------- |:-------------:| -----:| ---:|
    | 10      | 248 ms | 71 ms | 69 ms |
    | 50      | 295 ms | 169 ms | 173 ms |
    | 100      | 265 ms | 335 ms | 370 ms |
    
    **_Raw Results_**
    
    Proposed Solution
    
    ```
    Running benchmark: 10 compact shorts
      Running case: In expression
      Stopped after 45 iterations, 2032 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3633 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    10 compact shorts:                       Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                   41 /   45       2430.9      
     0.4       1.0X
    InSet expression                               232 /  242        430.4      
     2.3       0.2X
    
    Running benchmark: 10 non-compact shorts
      Running case: In expression
      Stopped after 36 iterations, 2012 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3645 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    10 non-compact shorts:                   Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                   53 /   56       1880.2      
     0.5       1.0X
    InSet expression                               233 /  243        428.4      
     2.3       0.2X
    
    Running benchmark: 50 compact shorts
      Running case: In expression
      Stopped after 45 iterations, 2007 ms
      Running case: InSet expression
      Stopped after 15 iterations, 4366 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    50 compact shorts:                       Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                   42 /   45       2368.9      
     0.4       1.0X
    InSet expression                               280 /  291        357.4      
     2.8       0.2X
    
    Running benchmark: 50 non-compact shorts
      Running case: In expression
      Stopped after 29 iterations, 2053 ms
      Running case: InSet expression
      Stopped after 15 iterations, 4447 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    50 non-compact shorts:                   Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                   69 /   71       1451.0      
     0.7       1.0X
    InSet expression                               292 /  297        342.3      
     2.9       0.2X
    
    Running benchmark: 100 compact shorts
      Running case: In expression
      Stopped after 44 iterations, 2032 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3967 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    100 compact shorts:                      Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                   44 /   46       2248.6      
     0.4       1.0X
    InSet expression                               259 /  264        385.6      
     2.6       0.2X
    
    Running benchmark: 100 non-compact shorts
      Running case: In expression
      Stopped after 25 iterations, 2060 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3832 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    100 non-compact shorts:                  Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                   79 /   82       1267.6      
     0.8       1.0X
    InSet expression                               250 /  256        399.4      
     2.5       0.3X
    ```
    Current Solution
    
    ```
    Running benchmark: 10 compact shorts
      Running case: In expression
      Stopped after 29 iterations, 2051 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3788 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    10 compact shorts:                       Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                   64 /   71       1553.8      
     0.6       1.0X
    InSet expression                               243 /  253        411.4      
     2.4       0.3X
    
    Running benchmark: 10 non-compact shorts
      Running case: In expression
      Stopped after 30 iterations, 2055 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3658 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    10 non-compact shorts:                   Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                   64 /   69       1572.3      
     0.6       1.0X
    InSet expression                               238 /  244        420.0      
     2.4       0.3X
    
    Running benchmark: 50 compact shorts
      Running case: In expression
      Stopped after 15 iterations, 2532 ms
      Running case: InSet expression
      Stopped after 15 iterations, 4394 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    50 compact shorts:                       Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                  163 /  169        614.5      
     1.6       1.0X
    InSet expression                               288 /  293        346.8      
     2.9       0.6X
    
    Running benchmark: 50 non-compact shorts
      Running case: In expression
      Stopped after 15 iterations, 2588 ms
      Running case: InSet expression
      Stopped after 15 iterations, 4554 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    50 non-compact shorts:                   Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                  168 /  173        594.0      
     1.7       1.0X
    InSet expression                               290 /  304        345.0      
     2.9       0.6X
    
    Running benchmark: 100 compact shorts
      Running case: In expression
      Stopped after 15 iterations, 5021 ms
      Running case: InSet expression
      Stopped after 15 iterations, 4086 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    100 compact shorts:                      Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                  330 /  335        303.3      
     3.3       1.0X
    InSet expression                               259 /  272        386.8      
     2.6       1.3X
    
    Running benchmark: 100 non-compact shorts
      Running case: In expression
      Stopped after 15 iterations, 5542 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3921 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    100 non-compact shorts:                  Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                  351 /  370        284.9      
     3.5       1.0X
    InSet expression                               252 /  261        396.6      
     2.5       1.4X
    ```
    
    **_Conclusion_**
    
    The proposed optimization gives us even more benefits on shorts. As opposed 
to bytes, we see that the performance on compact data stays the same 
independently of the number of values inside `In`. Therefore, we see O(1) time 
complexity for compact data and a significant improvement even on non-compact 
data.
    
    As on bytes, the performance of the old approach on shorts is the same 
independently of data compactness.
    
    
    #### Int
    
    **_Performance Summary_**
    
    Proposed Solution
    
    |  # values       | InSet           | In with compact data | In with 
non-compact data |
    | ------------- |:-------------:| -----:| ---:|
    | 10      | 230 ms | 43 ms | 36 ms |
    | 100      | 240 ms | 45 ms | 99 ms |
    | 250      | 250 ms | 49 ms | 109 ms |
    
    Current Solution
    
    |  # values       | InSet           | In with compact data | In with 
non-compact data |
    | ------------- |:-------------:| -----:| ---:|
    | 10      | 225 ms | 93 ms | 74 ms |
    | 100      | 240 ms | 624 ms | 608 ms |
    | 250      | 250 ms | 1131 ms | 1120 ms |
    
    **_Raw Results_**
    
    Proposed Solution
    
    ```
    Running benchmark: 10 compact ints
      Running case: In expression
      Stopped after 47 iterations, 2018 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3404 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    10 compact ints:                         Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                   41 /   43       2457.4      
     0.4       1.0X
    InSet expression                               224 /  227        446.5      
     2.2       0.2X
    
    Running benchmark: 10 non-compact ints
      Running case: In expression
      Stopped after 56 iterations, 2021 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3470 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    10 non-compact ints:                     Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                   35 /   36       2853.6      
     0.4       1.0X
    InSet expression                               226 /  231        442.9      
     2.3       0.2X
    
    Running benchmark: 100 compact ints
      Running case: In expression
      Stopped after 45 iterations, 2007 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3766 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    100 compact ints:                        Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                   43 /   45       2344.0      
     0.4       1.0X
    InSet expression                               239 /  251        417.8      
     2.4       0.2X
    
    Running benchmark: 100 non-compact ints
      Running case: In expression
      Stopped after 21 iterations, 2077 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3776 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    100 non-compact ints:                    Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                   96 /   99       1040.2      
     1.0       1.0X
    InSet expression                               248 /  252        403.6      
     2.5       0.4X
    
    Running benchmark: 250 compact ints
      Running case: In expression
      Stopped after 43 iterations, 2023 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3673 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    250 compact ints:                        Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                   45 /   47       2201.2      
     0.5       1.0X
    InSet expression                               241 /  245        414.7      
     2.4       0.2X
    
    Running benchmark: 250 non-compact ints
      Running case: In expression
      Stopped after 19 iterations, 2073 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3802 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    250 non-compact ints:                    Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                  108 /  109        929.2      
     1.1       1.0X
    InSet expression                               247 /  254        405.4      
     2.5       0.4X
    ```
    
    Current Solution
    
    ```
    Running benchmark: 10 compact ints
      Running case: In expression
      Stopped after 22 iterations, 2050 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3387 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    10 compact ints:                         Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                   89 /   93       1123.1      
     0.9       1.0X
    InSet expression                               215 /  226        464.9      
     2.2       0.4X
    
    Running benchmark: 10 non-compact ints
      Running case: In expression
      Stopped after 28 iterations, 2059 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3364 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    10 non-compact ints:                     Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                   72 /   74       1397.9      
     0.7       1.0X
    InSet expression                               216 /  224        463.3      
     2.2       0.3X
    
    Running benchmark: 100 compact ints
      Running case: In expression
      Stopped after 15 iterations, 9360 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3672 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    100 compact ints:                        Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                  621 /  624        161.2      
     6.2       1.0X
    InSet expression                               234 /  245        428.2      
     2.3       2.7X
    
    Running benchmark: 100 non-compact ints
      Running case: In expression
      Stopped after 15 iterations, 9113 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3563 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    100 non-compact ints:                    Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                  593 /  608        168.6      
     5.9       1.0X
    InSet expression                               230 /  238        435.5      
     2.3       2.6X
    
    Running benchmark: 250 compact ints
      Running case: In expression
      Stopped after 15 iterations, 16970 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3724 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    250 compact ints:                        Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                 1123 / 1131         89.0      
    11.2       1.0X
    InSet expression                               236 /  248        423.6      
     2.4       4.8X
    
    Running benchmark: 250 non-compact ints
      Running case: In expression
      Stopped after 15 iterations, 16807 ms
      Running case: InSet expression
      Stopped after 15 iterations, 3788 ms
    
    Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.14
    Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
    250 non-compact ints:                    Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
    
------------------------------------------------------------------------------------------------
    In expression                                 1097 / 1120         91.1      
    11.0       1.0X
    InSet expression                               246 /  253        407.2      
     2.5       4.5X
    ```
    
    **_Conclusion_**
    
    Similarly to previous types, Spark performs better on ints with the 
proposed optimization (up to 23x in some cases). Interestingly, the new 
approach can perform 5 times faster than `InSet` even on 250 elements. The 
reason for that is autoboxing, which will be addressed in a separate PR.
    
    As on other types, the performance of the old approach on ints is the same 
independently of data compactness.
    
    ## Notes
    
    - The performance difference was confirmed on OpenJDK 64-Bit Server VM 
1.8.0_181-b13 on Linux 4.9.93-linuxkit-aufs.
    - [This link] contains source code that decides between `tableswitch` and 
`lookupswitch`. The logic was re-used in the benchmarks. See the 
`isLookupSwitch` method.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/aokolnychyi/spark spark-26205

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23171.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23171
    
----
commit 1477f101951ed39b96f884ddd2d451e164c0cc43
Author: Anton Okolnychyi <aokolnychyi@...>
Date:   2018-11-13T01:48:19Z

    [SPARK-26205][SQL] Optimize In for bytes, shorts, ints

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to