Github user jihoonson commented on the pull request:

    https://github.com/apache/tajo/pull/231#issuecomment-62916612
  
    These are the results of a performance test.
    * Data: synthetic data
     * Size: 74.4GB
     * # of rows: 357913940
     * DDL
      * create external table gen2 (id int4, sel_0_001 int4, sel_0_01 int4, sel_0_1 int4, sel_1 int4, sel_5 int4, sel_10 int4, sel_20 int4, f1 text, f2 text, f3 text, f4 text, f5 text, f6 text, f7 text, f8 text, f9 text, f10 text, f11 text, pad text) USING CSV WITH ('csvfile.delimiter'=',') LOCATION '/gen/50g_2';
       * The sel_0_001, sel_0_01, sel_0_1, sel_1, sel_5, sel_10, and sel_20 columns have selectivities of 0.001%, 0.01%, 0.1%, 1%, 5%, 10%, and 20%, respectively.
    * Query: select id from gen2 where $col = 1;
     * $col: sel_0_001, sel_0_01, sel_0_1, sel_1, sel_5, sel_10, sel_20
    * Result
     * Index creation time (sec)
    
    | Index name | Time | 
    |:-------------:|:-------------:| 
    |sel_0_001|159.237|
    |sel_0_01|137.716|
    |sel_0_1|135.772|
    |sel_1|121.511|
    |sel_5|124.585|
    |sel_10|123.593|
    |sel_20|120.621|
    
     * Query execution time (sec)
    
    | $col | Seq scan | Index scan |
    |:-------------:|:-------------:|:-------------:| 
    |sel_0_001|8.445|67.681|
    |sel_0_01|16.306|75.628|
    |sel_0_1|54.602|74.602|
    |sel_1|98.552|78.056|
    |sel_5|104.018|69.755|
    |sel_10|117.401|81.483|
    |sel_20|103.993|71.334|
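
One way to read the two tables together is to compute, per column, the ratio of sequential-scan time to index-scan time, and how many queries it would take for the time saved to repay the index creation cost. The sketch below is only arithmetic on the numbers reported above; the helper names `speedup` and `break_even_queries` are made up for illustration and are not part of Tajo, and the break-even figure assumes every query saves the same amount of time as the single measured run.

```python
import math

# Timings in seconds, copied from the tables above.
creation = {
    "sel_0_001": 159.237, "sel_0_01": 137.716, "sel_0_1": 135.772,
    "sel_1": 121.511, "sel_5": 124.585, "sel_10": 123.593, "sel_20": 120.621,
}
seq_scan = {
    "sel_0_001": 8.445, "sel_0_01": 16.306, "sel_0_1": 54.602,
    "sel_1": 98.552, "sel_5": 104.018, "sel_10": 117.401, "sel_20": 103.993,
}
index_scan = {
    "sel_0_001": 67.681, "sel_0_01": 75.628, "sel_0_1": 74.602,
    "sel_1": 78.056, "sel_5": 69.755, "sel_10": 81.483, "sel_20": 71.334,
}


def speedup(col):
    """Seq-scan time divided by index-scan time; above 1 means the index scan was faster."""
    return seq_scan[col] / index_scan[col]


def break_even_queries(col):
    """Queries needed before the saved time covers index creation.

    Returns None when the index scan was not faster in the measured run.
    """
    saved = seq_scan[col] - index_scan[col]
    if saved <= 0:
        return None
    return math.ceil(creation[col] / saved)


for col in seq_scan:
    print(f"{col:>10}  speedup={speedup(col):.2f}  break_even={break_even_queries(col)}")
```

In these measurements the ratio is below 1 for sel_0_001, sel_0_01, and sel_0_1, and above 1 from sel_1 onward, so the index scan only wins on the four less selective columns in this run.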

