GitHub user wangyum commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22427#discussion_r217918733
  
    --- Diff: sql/core/benchmarks/FilterPushdownBenchmark-results.txt ---
    @@ -2,737 +2,669 @@
     Pushdown for many distinct value case
     
================================================================================================
     
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    -
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select 0 string row (value IS NULL):     Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                            8970 / 9122          1.8     
    570.3       1.0X
    -Parquet Vectorized (Pushdown)                  471 /  491         33.4     
     30.0      19.0X
    -Native ORC Vectorized                         7661 / 7853          2.1     
    487.0       1.2X
    -Native ORC Vectorized (Pushdown)              1134 / 1161         13.9     
     72.1       7.9X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          11405 / 11485          1.4     
    725.1       1.0X
    +Parquet Vectorized (Pushdown)                  675 /  690         23.3     
     42.9      16.9X
    +Native ORC Vectorized                         7127 / 7170          2.2     
    453.1       1.6X
    +Native ORC Vectorized (Pushdown)               519 /  541         30.3     
     33.0      22.0X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select 0 string row ('7864320' < value < '7864320'): Best/Avg Time(ms)    
Rate(M/s)   Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                            9246 / 9297          1.7     
    587.8       1.0X
    -Parquet Vectorized (Pushdown)                  480 /  488         32.8     
     30.5      19.3X
    -Native ORC Vectorized                         7838 / 7850          2.0     
    498.3       1.2X
    -Native ORC Vectorized (Pushdown)              1054 / 1118         14.9     
     67.0       8.8X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          11457 / 11473          1.4     
    728.4       1.0X
    +Parquet Vectorized (Pushdown)                  656 /  686         24.0     
     41.7      17.5X
    +Native ORC Vectorized                         7328 / 7342          2.1     
    465.9       1.6X
    +Native ORC Vectorized (Pushdown)               539 /  565         29.2     
     34.2      21.3X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select 1 string row (value = '7864320'): Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                            8989 / 9100          1.7     
    571.5       1.0X
    -Parquet Vectorized (Pushdown)                  448 /  467         35.1     
     28.5      20.1X
    -Native ORC Vectorized                         7680 / 7768          2.0     
    488.3       1.2X
    -Native ORC Vectorized (Pushdown)              1067 / 1118         14.7     
     67.8       8.4X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          11878 / 11888          1.3     
    755.2       1.0X
    +Parquet Vectorized (Pushdown)                  630 /  654         25.0     
     40.1      18.9X
    +Native ORC Vectorized                         7342 / 7362          2.1     
    466.8       1.6X
    +Native ORC Vectorized (Pushdown)               519 /  537         30.3     
     33.0      22.9X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select 1 string row (value <=> '7864320'): Best/Avg Time(ms)    Rate(M/s)  
 Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                            9115 / 9266          1.7     
    579.5       1.0X
    -Parquet Vectorized (Pushdown)                  466 /  492         33.7     
     29.7      19.5X
    -Native ORC Vectorized                         7800 / 7914          2.0     
    495.9       1.2X
    -Native ORC Vectorized (Pushdown)              1075 / 1102         14.6     
     68.4       8.5X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          11423 / 11440          1.4     
    726.2       1.0X
    +Parquet Vectorized (Pushdown)                  625 /  643         25.2     
     39.7      18.3X
    +Native ORC Vectorized                         7315 / 7335          2.2     
    465.1       1.6X
    +Native ORC Vectorized (Pushdown)               507 /  520         31.0     
     32.2      22.5X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select 1 string row ('7864320' <= value <= '7864320'): Best/Avg Time(ms)   
 Rate(M/s)   Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                            9099 / 9237          1.7     
    578.5       1.0X
    -Parquet Vectorized (Pushdown)                  462 /  475         34.1     
     29.3      19.7X
    -Native ORC Vectorized                         7847 / 7925          2.0     
    498.9       1.2X
    -Native ORC Vectorized (Pushdown)              1078 / 1114         14.6     
     68.5       8.4X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          11440 / 11478          1.4     
    727.3       1.0X
    +Parquet Vectorized (Pushdown)                  634 /  652         24.8     
     40.3      18.0X
    +Native ORC Vectorized                         7311 / 7324          2.2     
    464.8       1.6X
    +Native ORC Vectorized (Pushdown)               517 /  548         30.4     
     32.8      22.1X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select all string rows (value IS NOT NULL): Best/Avg Time(ms)    Rate(M/s) 
  Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                          19303 / 19547          0.8     
   1227.3       1.0X
    -Parquet Vectorized (Pushdown)               19924 / 20089          0.8     
   1266.7       1.0X
    -Native ORC Vectorized                       18725 / 19079          0.8     
   1190.5       1.0X
    -Native ORC Vectorized (Pushdown)            19310 / 19492          0.8     
   1227.7       1.0X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          20750 / 20872          0.8     
   1319.3       1.0X
    +Parquet Vectorized (Pushdown)               21002 / 21032          0.7     
   1335.3       1.0X
    +Native ORC Vectorized                       16714 / 16742          0.9     
   1062.6       1.2X
    +Native ORC Vectorized (Pushdown)            16926 / 16965          0.9     
   1076.1       1.2X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select 0 int row (value IS NULL):        Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                            8117 / 8323          1.9     
    516.1       1.0X
    -Parquet Vectorized (Pushdown)                  484 /  494         32.5     
     30.8      16.8X
    -Native ORC Vectorized                         6811 / 7036          2.3     
    433.0       1.2X
    -Native ORC Vectorized (Pushdown)              1061 / 1082         14.8     
     67.5       7.6X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          10510 / 10532          1.5     
    668.2       1.0X
    +Parquet Vectorized (Pushdown)                  642 /  665         24.5     
     40.8      16.4X
    +Native ORC Vectorized                         6609 / 6618          2.4     
    420.2       1.6X
    +Native ORC Vectorized (Pushdown)               502 /  512         31.4     
     31.9      21.0X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select 0 int row (7864320 < value < 7864320): Best/Avg Time(ms)    
Rate(M/s)   Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                            8105 / 8140          1.9     
    515.3       1.0X
    -Parquet Vectorized (Pushdown)                  478 /  505         32.9     
     30.4      17.0X
    -Native ORC Vectorized                         6914 / 7211          2.3     
    439.6       1.2X
    -Native ORC Vectorized (Pushdown)              1044 / 1064         15.1     
     66.4       7.8X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          10505 / 10514          1.5     
    667.9       1.0X
    +Parquet Vectorized (Pushdown)                  659 /  673         23.9     
     41.9      15.9X
    +Native ORC Vectorized                         6634 / 6641          2.4     
    421.8       1.6X
    +Native ORC Vectorized (Pushdown)               513 /  526         30.7     
     32.6      20.5X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select 1 int row (value = 7864320):      Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                            7983 / 8116          2.0     
    507.6       1.0X
    -Parquet Vectorized (Pushdown)                  464 /  487         33.9     
     29.5      17.2X
    -Native ORC Vectorized                         6703 / 6774          2.3     
    426.1       1.2X
    -Native ORC Vectorized (Pushdown)              1017 / 1058         15.5     
     64.6       7.9X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          10555 / 10570          1.5     
    671.1       1.0X
    +Parquet Vectorized (Pushdown)                  651 /  668         24.2     
     41.4      16.2X
    +Native ORC Vectorized                         6721 / 6728          2.3     
    427.3       1.6X
    +Native ORC Vectorized (Pushdown)               508 /  519         31.0     
     32.3      20.8X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select 1 int row (value <=> 7864320):    Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                            7942 / 7983          2.0     
    504.9       1.0X
    -Parquet Vectorized (Pushdown)                  468 /  479         33.6     
     29.7      17.0X
    -Native ORC Vectorized                         6677 / 6779          2.4     
    424.5       1.2X
    -Native ORC Vectorized (Pushdown)              1021 / 1068         15.4     
     64.9       7.8X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          10556 / 10566          1.5     
    671.1       1.0X
    +Parquet Vectorized (Pushdown)                  647 /  654         24.3     
     41.1      16.3X
    +Native ORC Vectorized                         6716 / 6728          2.3     
    427.0       1.6X
    +Native ORC Vectorized (Pushdown)               510 /  521         30.9     
     32.4      20.7X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select 1 int row (7864320 <= value <= 7864320): Best/Avg Time(ms)    
Rate(M/s)   Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                            7909 / 7958          2.0     
    502.8       1.0X
    -Parquet Vectorized (Pushdown)                  485 /  494         32.4     
     30.8      16.3X
    -Native ORC Vectorized                         6751 / 6846          2.3     
    429.2       1.2X
    -Native ORC Vectorized (Pushdown)              1043 / 1077         15.1     
     66.3       7.6X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          10556 / 10565          1.5     
    671.1       1.0X
    +Parquet Vectorized (Pushdown)                  649 /  654         24.2     
     41.3      16.3X
    +Native ORC Vectorized                         6700 / 6712          2.3     
    426.0       1.6X
    +Native ORC Vectorized (Pushdown)               509 /  520         30.9     
     32.3      20.8X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select 1 int row (7864319 < value < 7864321): Best/Avg Time(ms)    
Rate(M/s)   Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                            8010 / 8033          2.0     
    509.2       1.0X
    -Parquet Vectorized (Pushdown)                  472 /  489         33.3     
     30.0      17.0X
    -Native ORC Vectorized                         6655 / 6808          2.4     
    423.1       1.2X
    -Native ORC Vectorized (Pushdown)              1015 / 1067         15.5     
     64.5       7.9X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          10547 / 10566          1.5     
    670.5       1.0X
    +Parquet Vectorized (Pushdown)                  649 /  653         24.2     
     41.3      16.3X
    +Native ORC Vectorized                         6703 / 6713          2.3     
    426.2       1.6X
    +Native ORC Vectorized (Pushdown)               510 /  520         30.8     
     32.5      20.7X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select 10% int rows (value < 1572864):   Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                            8983 / 9035          1.8     
    571.1       1.0X
    -Parquet Vectorized (Pushdown)                 2204 / 2231          7.1     
    140.1       4.1X
    -Native ORC Vectorized                         7864 / 8011          2.0     
    500.0       1.1X
    -Native ORC Vectorized (Pushdown)              2674 / 2789          5.9     
    170.0       3.4X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          11478 / 11525          1.4     
    729.7       1.0X
    +Parquet Vectorized (Pushdown)                 2576 / 2587          6.1     
    163.8       4.5X
    +Native ORC Vectorized                         7633 / 7657          2.1     
    485.3       1.5X
    +Native ORC Vectorized (Pushdown)              2076 / 2096          7.6     
    132.0       5.5X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select 50% int rows (value < 7864320):   Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                          12723 / 12903          1.2     
    808.9       1.0X
    -Parquet Vectorized (Pushdown)                 9112 / 9282          1.7     
    579.3       1.4X
    -Native ORC Vectorized                       12090 / 12230          1.3     
    768.7       1.1X
    -Native ORC Vectorized (Pushdown)              9242 / 9372          1.7     
    587.6       1.4X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          14785 / 14802          1.1     
    940.0       1.0X
    +Parquet Vectorized (Pushdown)                 9971 / 9977          1.6     
    633.9       1.5X
    +Native ORC Vectorized                       11082 / 11107          1.4     
    704.6       1.3X
    +Native ORC Vectorized (Pushdown)              8061 / 8073          2.0     
    512.5       1.8X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select 90% int rows (value < 14155776):  Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                          16453 / 16678          1.0     
   1046.1       1.0X
    -Parquet Vectorized (Pushdown)               15997 / 16262          1.0     
   1017.0       1.0X
    -Native ORC Vectorized                       16652 / 17070          0.9     
   1058.7       1.0X
    -Native ORC Vectorized (Pushdown)            15843 / 16112          1.0     
   1007.2       1.0X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          18174 / 18214          0.9     
   1155.5       1.0X
    +Parquet Vectorized (Pushdown)               17387 / 17403          0.9     
   1105.5       1.0X
    +Native ORC Vectorized                       14465 / 14492          1.1     
    919.7       1.3X
    +Native ORC Vectorized (Pushdown)            14024 / 14041          1.1     
    891.6       1.3X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select all int rows (value IS NOT NULL): Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                          17098 / 17254          0.9     
   1087.1       1.0X
    -Parquet Vectorized (Pushdown)               17302 / 17529          0.9     
   1100.1       1.0X
    -Native ORC Vectorized                       16790 / 17098          0.9     
   1067.5       1.0X
    -Native ORC Vectorized (Pushdown)            17329 / 17914          0.9     
   1101.7       1.0X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          19004 / 19014          0.8     
   1208.2       1.0X
    +Parquet Vectorized (Pushdown)               19219 / 19232          0.8     
   1221.9       1.0X
    +Native ORC Vectorized                       15266 / 15290          1.0     
    970.6       1.2X
    +Native ORC Vectorized (Pushdown)            15469 / 15482          1.0     
    983.5       1.2X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select all int rows (value > -1):        Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                          17088 / 17392          0.9     
   1086.4       1.0X
    -Parquet Vectorized (Pushdown)               17609 / 17863          0.9     
   1119.5       1.0X
    -Native ORC Vectorized                       18334 / 69831          0.9     
   1165.7       0.9X
    -Native ORC Vectorized (Pushdown)            17465 / 17629          0.9     
   1110.4       1.0X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          19036 / 19052          0.8     
   1210.3       1.0X
    +Parquet Vectorized (Pushdown)               19287 / 19306          0.8     
   1226.2       1.0X
    +Native ORC Vectorized                       15311 / 15371          1.0     
    973.5       1.2X
    +Native ORC Vectorized (Pushdown)            15517 / 15590          1.0     
    986.5       1.2X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select all int rows (value != -1):       Best/Avg Time(ms)    Rate(M/s)   
Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                          16903 / 17233          0.9     
   1074.6       1.0X
    -Parquet Vectorized (Pushdown)               16945 / 17032          0.9     
   1077.3       1.0X
    -Native ORC Vectorized                       16377 / 16762          1.0     
   1041.2       1.0X
    -Native ORC Vectorized (Pushdown)            16950 / 17212          0.9     
   1077.7       1.0X
    +Parquet Vectorized                          19072 / 19102          0.8     
   1212.6       1.0X
    +Parquet Vectorized (Pushdown)               19288 / 19318          0.8     
   1226.3       1.0X
    +Native ORC Vectorized                       15277 / 15293          1.0     
    971.3       1.2X
    +Native ORC Vectorized (Pushdown)            15479 / 15499          1.0     
    984.1       1.2X
     
     
     
================================================================================================
     Pushdown for few distinct value case (use dictionary encoding)
     
================================================================================================
     
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    -
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select 0 distinct string row (value IS NULL): Best/Avg Time(ms)    
Rate(M/s)   Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                            7245 / 7322          2.2     
    460.7       1.0X
    -Parquet Vectorized (Pushdown)                  378 /  389         41.6     
     24.0      19.2X
    -Native ORC Vectorized                         6720 / 6778          2.3     
    427.2       1.1X
    -Native ORC Vectorized (Pushdown)              1009 / 1032         15.6     
     64.2       7.2X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          10250 / 10274          1.5     
    651.7       1.0X
    +Parquet Vectorized (Pushdown)                  571 /  576         27.5     
     36.3      17.9X
    +Native ORC Vectorized                         8651 / 8660          1.8     
    550.0       1.2X
    +Native ORC Vectorized (Pushdown)               909 /  933         17.3     
     57.8      11.3X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select 0 distinct string row ('100' < value < '100'): Best/Avg Time(ms)    
Rate(M/s)   Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                            7627 / 7795          2.1     
    484.9       1.0X
    -Parquet Vectorized (Pushdown)                  384 /  406         41.0     
     24.4      19.9X
    -Native ORC Vectorized                         6724 / 7824          2.3     
    427.5       1.1X
    -Native ORC Vectorized (Pushdown)               968 /  986         16.3     
     61.5       7.9X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          10420 / 10426          1.5     
    662.5       1.0X
    +Parquet Vectorized (Pushdown)                  574 /  579         27.4     
     36.5      18.2X
    +Native ORC Vectorized                         8973 / 8982          1.8     
    570.5       1.2X
    +Native ORC Vectorized (Pushdown)               916 /  955         17.2     
     58.2      11.4X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select 1 distinct string row (value = '100'): Best/Avg Time(ms)    
Rate(M/s)   Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                            7157 / 7534          2.2     
    455.0       1.0X
    -Parquet Vectorized (Pushdown)                  542 /  565         29.0     
     34.5      13.2X
    -Native ORC Vectorized                         6716 / 7214          2.3     
    427.0       1.1X
    -Native ORC Vectorized (Pushdown)              1212 / 1288         13.0     
     77.0       5.9X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          10428 / 10441          1.5     
    663.0       1.0X
    +Parquet Vectorized (Pushdown)                  789 /  809         19.9     
     50.2      13.2X
    +Native ORC Vectorized                         9042 / 9055          1.7     
    574.9       1.2X
    +Native ORC Vectorized (Pushdown)              1130 / 1145         13.9     
     71.8       9.2X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select 1 distinct string row (value <=> '100'): Best/Avg Time(ms)    
Rate(M/s)   Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                            7368 / 7552          2.1     
    468.4       1.0X
    -Parquet Vectorized (Pushdown)                  544 /  556         28.9     
     34.6      13.5X
    -Native ORC Vectorized                         6740 / 6867          2.3     
    428.5       1.1X
    -Native ORC Vectorized (Pushdown)              1230 / 1426         12.8     
     78.2       6.0X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          10402 / 10416          1.5     
    661.3       1.0X
    +Parquet Vectorized (Pushdown)                  791 /  806         19.9     
     50.3      13.2X
    +Native ORC Vectorized                         9042 / 9055          1.7     
    574.9       1.2X
    +Native ORC Vectorized (Pushdown)              1112 / 1145         14.1     
     70.7       9.4X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select 1 distinct string row ('100' <= value <= '100'): Best/Avg Time(ms)  
  Rate(M/s)   Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                            7427 / 7734          2.1     
    472.2       1.0X
    -Parquet Vectorized (Pushdown)                  556 /  568         28.3     
     35.4      13.3X
    -Native ORC Vectorized                         6847 / 7059          2.3     
    435.3       1.1X
    -Native ORC Vectorized (Pushdown)              1226 / 1230         12.8     
     77.9       6.1X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          10548 / 10563          1.5     
    670.6       1.0X
    +Parquet Vectorized (Pushdown)                  790 /  796         19.9     
     50.2      13.4X
    +Native ORC Vectorized                         9144 / 9153          1.7     
    581.3       1.2X
    +Native ORC Vectorized (Pushdown)              1117 / 1148         14.1     
     71.0       9.4X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     Select all distinct string rows (value IS NOT NULL): Best/Avg Time(ms)    
Rate(M/s)   Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                          16998 / 17311          0.9     
   1080.7       1.0X
    -Parquet Vectorized (Pushdown)               16977 / 17250          0.9     
   1079.4       1.0X
    -Native ORC Vectorized                       18447 / 19852          0.9     
   1172.8       0.9X
    -Native ORC Vectorized (Pushdown)            16614 / 17102          0.9     
   1056.3       1.0X
    +Parquet Vectorized                          20445 / 20469          0.8     
   1299.8       1.0X
    +Parquet Vectorized (Pushdown)               20686 / 20699          0.8     
   1315.2       1.0X
    +Native ORC Vectorized                       18851 / 18953          0.8     
   1198.5       1.1X
    +Native ORC Vectorized (Pushdown)            19255 / 19268          0.8     
   1224.2       1.1X
     
     
     
================================================================================================
     Pushdown benchmark for StringStartsWith
     
================================================================================================
     
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    -
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     StringStartsWith filter: (value like '10%'): Best/Avg Time(ms)    
Rate(M/s)   Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                           9705 / 10814          1.6     
    617.0       1.0X
    -Parquet Vectorized (Pushdown)                 3086 / 3574          5.1     
    196.2       3.1X
    -Native ORC Vectorized                       10094 / 10695          1.6     
    641.8       1.0X
    -Native ORC Vectorized (Pushdown)              9611 / 9999          1.6     
    611.0       1.0X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          14265 / 15213          1.1     
    907.0       1.0X
    +Parquet Vectorized (Pushdown)                 4228 / 4870          3.7     
    268.8       3.4X
    +Native ORC Vectorized                       10116 / 10977          1.6     
    643.2       1.4X
    +Native ORC Vectorized (Pushdown)            10653 / 11376          1.5     
    677.3       1.3X
     
    +OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
    +Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
     StringStartsWith filter: (value like '1000%'): Best/Avg Time(ms)    
Rate(M/s)   Per Row(ns)   Relative
     
------------------------------------------------------------------------------------------------
    -Parquet Vectorized                            8016 / 8183          2.0     
    509.7       1.0X
    -Parquet Vectorized (Pushdown)                  444 /  457         35.4     
     28.2      18.0X
    -Native ORC Vectorized                         6970 / 7169          2.3     
    443.2       1.2X
    -Native ORC Vectorized (Pushdown)              7447 / 7503          2.1     
    473.5       1.1X
    -
    -Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
    -Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    +Parquet Vectorized                          11499 / 11539          1.4     
    731.1       1.0X
    +Parquet Vectorized (Pushdown)                  669 /  672         23.5     
     42.5      17.2X
    +Native ORC Vectorized                         7343 / 7363          2.1     
    466.8       1.6X
    +Native ORC Vectorized (Pushdown)              7559 / 7568          2.1     
    480.6       1.5X
    --- End diff --
    
    It seems ORC doesn't support custom filters yet, which would explain why pushdown brings no speedup for ORC in the StringStartsWith benchmarks:
    https://github.com/apache/spark/pull/21623#issuecomment-401558357
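
    As a rough check of that reading, the sketch below compares the pushdown and non-pushdown best times for the new `value like '1000%'` results quoted in the diff. The numbers are copied from the diff; the `pushdown_speedup` helper is a hypothetical name used only for illustration and is not part of Spark or the benchmark.

```python
# Best times (ms) copied from the new "value like '1000%'" rows in the diff.
best_ms = {
    "Parquet Vectorized": 11499,
    "Parquet Vectorized (Pushdown)": 669,
    "Native ORC Vectorized": 7343,
    "Native ORC Vectorized (Pushdown)": 7559,
}


def pushdown_speedup(fmt: str) -> float:
    """Hypothetical helper: non-pushdown best time divided by pushdown best time."""
    return best_ms[fmt] / best_ms[f"{fmt} (Pushdown)"]


parquet_gain = pushdown_speedup("Parquet Vectorized")
orc_gain = pushdown_speedup("Native ORC Vectorized")

# Parquet benefits heavily from pushdown here; ORC is essentially unchanged.
print(f"Parquet: {parquet_gain:.1f}x, ORC: {orc_gain:.1f}x")
```

    A ratio near 1.0 for ORC is consistent with the filter not being pushed down at all, though it does not by itself prove the cause.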

