Github user wangyum commented on the issue:

    https://github.com/apache/spark/pull/22420
  
    @hellodengfei Could you open this PR against the `master` branch? This change LGTM. I ran a benchmark comparing `contains` on a `Set` and on an `Array`:
    ```scala
    def benchmark(func: () => Unit): Long = {
      val start = System.currentTimeMillis()
      func()
      val end = System.currentTimeMillis()
      end - start
    }
    
    val range = Range(1, 1000000)
    val set = range.toSet
    val array = range.toArray
    
    for (i <- 0 until 5) {
      val setExecutionTime =
        benchmark(() => for (i <- 0 until 500) { set.contains(scala.util.Random.nextInt()) })
      val arrayExecutionTime =
        benchmark(() => for (i <- 0 until 500) { array.contains(scala.util.Random.nextInt()) })
      println(s"set execution time: $setExecutionTime, array execution time: $arrayExecutionTime")
    }
    ```
    Benchmark result (times in milliseconds, 500 lookups per measurement):
    ```
    set execution time: 4, array execution time: 2760
    set execution time: 1, array execution time: 1911
    set execution time: 3, array execution time: 2043
    set execution time: 12, array execution time: 2214
    set execution time: 6, array execution time: 1770
    ```
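
    A rough way to see where the gap comes from, as a sketch rather than a measurement: `scala.util.Random.nextInt()` draws from the full `Int` range, so almost every probe misses, and a miss forces `array.contains` to scan every element, while `set.contains` is a single hash lookup. The helper `linearSteps` below is a hypothetical name introduced only for illustration; it counts how many elements a linear scan has to inspect. Like the benchmark above, it is meant for the REPL or a script.

    ```scala
    // Hypothetical helper (not part of the PR): number of elements a linear
    // scan inspects before it finds `target` or reaches the end of the array.
    def linearSteps(xs: Array[Int], target: Int): Int = {
      val idx = xs.indexOf(target)
      if (idx >= 0) idx + 1 else xs.length
    }

    val array = Range(1, 1000000).toArray // 999999 elements: 1 to 999999
    val set = array.toSet

    // A miss walks the whole array; a hit stops at the matching index.
    val missSteps = linearSteps(array, -1)
    val hitSteps = linearSteps(array, 500000)

    // The set answers the same questions without scanning.
    val setMiss = set.contains(-1)
    val setHit = set.contains(500000)

    println(s"miss steps: $missSteps, hit steps: $hitSteps, set miss: $setMiss, set hit: $setHit")
    ```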


