[ 
https://issues.apache.org/jira/browse/TRAFODION-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14714425#comment-14714425
 ] 

Qifan Chen commented on TRAFODION-1449:
---------------------------------------

The goal of the fix is to remove the redundancy that was reported in the 
previous posting. 

1. The need for full key predicates that refer to any key column.

This set is required to compute the keys for HBase in 
SearchKey::makeHBaseSearchKey(). The normal key predicates (stored in 
SearchKey::keyPredicates_) are not suitable, since they are an intersection of 
commonPredicates and may not contain all the predicates. This is why the 
selection predicates were included as input to the makeHBaseSearchKey() 
call. However, including the selection predicates introduces 
redundancy. 

The full key predicates are computed in SearchKey::init() and stored in 
SearchKey as a new data member fullyKeyPredicates_.  The full key predicates 
are a set of predicates in which each predicate refers to some key column.
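
As a rough illustration of that definition (not the actual SearchKey::init() 
code), the filter below keeps every predicate that refers to at least one key 
column. The Pred struct, the fullKeyPredicates() helper, and the non-key column 
V are hypothetical names invented for this sketch; the real compiler works on 
its own predicate representation.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a predicate: its text plus the columns it refers to.
struct Pred {
    std::string text;
    std::set<std::string> columns;
};

// Keep every predicate that refers to at least one key column.
// Input order is preserved.
std::vector<Pred> fullKeyPredicates(const std::vector<Pred>& preds,
                                    const std::set<std::string>& keyColumns) {
    std::vector<Pred> result;
    for (const Pred& p : preds) {
        for (const std::string& c : p.columns) {
            if (keyColumns.count(c) > 0) {
                result.push_back(p);
                break;  // one key column reference is enough
            }
        }
    }
    return result;
}
```

With key column UNIQ and predicates (UNIQ > 2), (UNIQ < 5) and (V = 7), the 
sketch keeps the first two and drops the third.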

2. Removal of the redundancy elsewhere.

The two places in the code that introduce redundant representations of the same 
original predicates have been modified.  One is in the scan optimizer, before 
we call "new SearchKey" to include another representation of a RANGE SPEC.  The 
second is in the setup steps before calling SearchKey::makeHBaseSearchKey(). 

The final executor predicate used for HBase access or IUD operators will be the 
predicates that survive the makeHBaseSearchKey() call, plus the passed-in 
executor predicate, plus the selection predicate minus the full key predicates. 
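
The set arithmetic in that sentence can be sketched as follows, reading it as 
survivors + passed-in + (selection - full key). This is only an illustration: 
predicates are modeled as plain strings, and PredSet, finalExecutorPredicates() 
and the non-key predicate (V = 7) are hypothetical names, not part of the 
Trafodion code.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>

// Hypothetical model: a predicate set is a set of predicate texts.
using PredSet = std::set<std::string>;

// survivors: predicates left over after the search key has been built
// passedIn:  executor predicates handed in by the caller
// selection: selection predicates
// fullKey:   predicates that refer to some key column
PredSet finalExecutorPredicates(const PredSet& survivors,
                                const PredSet& passedIn,
                                const PredSet& selection,
                                const PredSet& fullKey) {
    PredSet result = survivors;
    result.insert(passedIn.begin(), passedIn.end());
    // Add only the selection predicates that are not full key predicates,
    // so a key predicate is never contributed a second time from this side.
    std::set_difference(selection.begin(), selection.end(),
                        fullKey.begin(), fullKey.end(),
                        std::inserter(result, result.end()));
    return result;
}
```

Subtracting the full key predicates from the selection predicates is what keeps 
a key predicate such as (UNIQ < 5) from appearing twice in the result.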

Evidence that the fix works.

1. The original query.  

TRAFODION_SCAN ============================  SEQ_NO 1        NO CHILDREN
TABLE_NAME ............... FOO
REQUESTS_IN .............. 1
ROWS_OUT ................ 11
EST_OPER_COST ............ 0.01
EST_TOTAL_COST ........... 0.01
DESCRIPTION
  max_card_est .......... 11
  fragment_id ............ 0
  parent_frag ............ (none)
  fragment_type .......... master
  scan_type .............. subset scan of table TRAFODION.SEABASE.FOO
  object_type ............ Trafodion
  columns ................ all
  begin_keys(excl) ....... 2
  end_keys(excl) ......... 5 <== the end key is present with the fix now. 
  cache_size ........... 100
  probes ................. 1
  rows_accessed ......... 11
  key_columns ............ UNIQ
  executor_predicates .... (UNIQ > 2) and (UNIQ < 5)

2. Query in regression SEABASE/TEST010

TRAFODION_SCAN ============================  SEQ_NO 1        NO CHILDREN
TABLE_NAME ............... T010T2
REQUESTS_IN .............. 1
ROWS_OUT ................. 3
EST_OPER_COST ............ 0.01
EST_TOTAL_COST ........... 0.01
DESCRIPTION
  max_card_est ........... 5
  fragment_id ............ 0
  parent_frag ............ (none)
  fragment_type .......... master
  scan_type .............. subset scan of table TRAFODION.SEABASE.T010T2
  object_type ............ Trafodion
  columns ................ all
  unique_rows ............ 1,a,1  <== we do not miss these two keys either. 
  unique_rows ............ 4,a,1
  cache_size ........... 100
  probes ................. 1
  rows_accessed ......... 10
  key_columns ............ A, B, C
  executor_predicates .... (B = 'a') and (C = 1) and ((A = 1) or (A = 4))

> End-key is not specified for a scan node when range spec feature is on. 
> ------------------------------------------------------------------------
>
>                 Key: TRAFODION-1449
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1449
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-cmp
>    Affects Versions: 1.1 (pre-incubation)
>            Reporter: Qifan Chen
>            Assignee: Qifan Chen
>              Labels: performance
>
> Problem: the end key is not specified for the scan operator and is evaluated 
> in the executor predicate. 
> MODULE_NAME .............. DYNAMICALLY COMPILED
> STATEMENT_NAME ........... XX
> PLAN_ID .................. 212306167959521210
> ROWS_OUT ................. 1
> EST_TOTAL_COST ........... 0.01
> STATEMENT ................ select *
>                            from foo << + cardinality 10e8 >>
>                            where uniq > 2 and uniq < 5;
> TRAFODION_SCAN ============================  SEQ_NO 1        NO CHILDREN
> TABLE_NAME ............... FOO
> REQUESTS_IN .............. 1
> ROWS_OUT ................ 11
> EST_OPER_COST ............ 0.01
> EST_TOTAL_COST ........... 0.01
> DESCRIPTION
>   max_card_est .......... 11
>   fragment_id ............ 0
>   parent_frag ............ (none)
>   fragment_type .......... master
>   scan_type .............. subset scan of table TRAFODION.SEABASE.FOO
>   object_type ............ Trafodion
>   columns ................ all
>   begin_keys(excl) ....... 2
>   end_keys(incl)
>   cache_size ........... 100
>   probes ................. 1
>   rows_accessed ......... 11
>   key_columns ............ UNIQ
>   executor_predicates .... (UNIQ < 5) and (UNIQ > 2) and (UNIQ < 5)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
