Hi Eric,
You probably know all of this, but just to recap the entire process: To
partition the scan into ranges for parallel execution, this "partition key
predicate" gets added:
(k1, k2) >= (:_sys_hostVarLo0, :_sys_hostVarLo1) and
case when :_sys_hostVarExclRange
then (k1, k2) < (:_sys_hostVarHi0, :_sys_HostVarHi1)
else (k1, k2) <= (:_sys_hostVarHi0, :_sys_HostVarHi1)
The case expression handles the case of the last range that needs to
include the highest key value, while all other ranges exclude the high
range. At runtime, each ESP gets a different set of these host variables,
so that it reads a different range.
These multi-valued comparisons then get converted into an equivalent form
of ANDs and ORs. That is shown as the executor predicate.
Then, that predicate, plus the user predicate (k2 between 10 and 20) get
transformed into disjunctive normal form. I assume it's possible that
during this process some disjuncts get created that always evaluate to TRUE
or FALSE. We could try to detect those, but it's probably not worth the
trouble. Dave may have more insights into this.
I didn't go though the exercise of doing these steps by hand and validating
them all.
Hans
On Tue, Mar 8, 2016 at 9:01 AM, Eric Owhadi <[email protected]> wrote:
> Hello Trafodioneers,
>
>
>
> Trying to learn more about mdam, so tried the following on my deb build…
>
>
>
> create table t132helper (a int not null, primary key(a));
>
> insert into t132helper values(1);
>
> create table t132 (k1 int not null, k2 int not null, a int not null, b int
> not null,
>
> c
> char(1000),
> primary key (k1,k2)) ATTRIBUTES ALIGNED FORMAT ;
>
> upsert using load
>
> into t132
>
> select x1000*1000+ x10000*10000 + x100000*100000 ,
>
> x1+x10*10+x100*100,
>
> x1+x10*10+x100*100+ x1000*1000+ x10000*10000 + x100000*100000 ,
>
> x1+x10*10+x100*100+ x1000*1000+ x10000*10000 + x100000*100000 ,
>
> 'yo bro'
>
>
>
> from t132helper
>
> transpose 0,1,2,3,4,5,6,7,8,9 as x1
>
> transpose 0,1,2,3,4,5,6,7,8,9 as x10
>
> transpose 0,1,2,3,4,5,6,7,8,9 as x100
>
> transpose 0,1,2,3,4,5,6,7,8,9 as x1000
>
> transpose 0,1,2,3,4,5,6,7,8,9 as x10000
>
> transpose 0,1,2,3,4,5,6,7,8,9 as x100000;
>
> update statistics for table t132 on every column;
>
>
>
>
>
> then I do:
>
> explain select avg(a) from t132 where k2 between 10 and 20;
>
> ------------------------------------------------------------------ PLAN
> SUMMARY
>
> MODULE_NAME .............. DYNAMICALLY COMPILED
>
> STATEMENT_NAME ........... XX
>
> PLAN_ID .................. 212324155011172566
>
> ROWS_OUT ................. 1
>
> EST_TOTAL_COST .......... 17.64
>
> STATEMENT ................ select avg(a) from t132 where k2 between 10 and
> 20;
>
>
>
>
>
> ------------------------------------------------------------------ NODE
> LISTING
>
> ROOT ====================================== SEQ_NO 5 ONLY CHILD 4
>
> REQUESTS_IN .............. 1
>
> ROWS_OUT ................. 1
>
> EST_OPER_COST ............ 0
>
> EST_TOTAL_COST .......... 17.64
>
> DESCRIPTION
>
> max_card_est ........... 1
>
> fragment_id ............ 0
>
> parent_frag ............ (none)
>
> fragment_type .......... master
>
> statement_index ........ 0
>
> affinity_value ......... 0
>
> max_max_cardinalit 11,990
>
> total_overflow_size .... 0.00 KB
>
> esp_2_node_map ......... (\NSK:-1:-1)
>
> xn_access_mode ......... read_only
>
> xn_autoabort_interval 0
>
> auto_query_retry ....... enabled
>
> plan_version ....... 2,600
>
> embedded_arkcmp ........ used
>
> IS_SQLCI ............... ON
>
> LDAP_USERNAME
>
> ObjectUIDs ............. 3134597362287934668
>
> select_list ............
> cast(cast(cast((cast((cast((cast(sum(sum(TRAFODION.S
>
> CH.T132.A))) * 10000 ...0)) / cast(sum(count(1
>
> ))))) / 10000 ...0))))
>
>
>
>
>
> SORT_PARTIAL_AGGR_ROOT ==================== SEQ_NO 4 ONLY CHILD 3
>
> REQUESTS_IN .............. 1
>
> ROWS_OUT ................. 1
>
> EST_OPER_COST ............ 0.01
>
> EST_TOTAL_COST .......... 17.64
>
> DESCRIPTION
>
> max_card_est ........... 1
>
> fragment_id ............ 0
>
> parent_frag ............ (none)
>
> fragment_type .......... master
>
> aggregates ............. sum(sum(TRAFODION.SCH.T132.A)), sum(count(1 ))
>
>
>
>
>
> ESP_EXCHANGE ============================== SEQ_NO 3 ONLY CHILD 2
>
> REQUESTS_IN .............. 1
>
> ROWS_OUT ................. 1
>
> EST_OPER_COST ............ 0.01
>
> EST_TOTAL_COST .......... 17.64
>
> DESCRIPTION
>
> max_card_est ........... 1
>
> fragment_id ............ 2
>
> parent_frag ............ 0
>
> fragment_type .......... esp
>
> buffer_size ........ 6,250
>
> record_length ......... 24
>
> parent_processes ....... 1
>
> child_processes ........ 2
>
> child_partitioning_func range partitioned 2 ways on
> (TRAFODION.SCH.T132.K1,
>
> TRAFODION.SCH.T132.K2) with
> boundaries(c(<min>)
>
> c(<min>) ;c(281000) c(571) )
>
> seamonster_query ....... no
>
> seamonster_exchange .... no
>
>
>
>
>
> SORT_PARTIAL_AGGR_LEAF ==================== SEQ_NO 2 ONLY CHILD 1
>
> REQUESTS_IN .............. 1
>
> ROWS_OUT ................. 1
>
> EST_OPER_COST ............ 0.01
>
> EST_TOTAL_COST .......... 17.64
>
> DESCRIPTION
>
> max_card_est ........... 1
>
> fragment_id ............ 2
>
> parent_frag ............ 0
>
> fragment_type .......... esp
>
> aggregates ............. sum(TRAFODION.SCH.T132.A), count(1 )
>
>
>
>
>
> TRAFODION_SCAN ============================ SEQ_NO 1 NO CHILDREN
>
> TABLE_NAME ............... T132
>
> REQUESTS_IN .............. 1
>
> ROWS_OUT ............ 11,990
>
> EST_OPER_COST ........... 17.64
>
> EST_TOTAL_COST .......... 17.64
>
> DESCRIPTION
>
> max_card_est ...... 11,990
>
> fragment_id ............ 2
>
> parent_frag ............ 0
>
> fragment_type .......... esp
>
> scan_type .............. subset scan limited by mdam of table
>
> TRAFODION.SCH.T132
>
> object_type ............ Trafodion
>
> cache_size ......... 5,995
>
> probes ................. 1
>
> rows_accessed ..... 11,990
>
> column_retrieved ....... #1:1
>
> key_columns ............ K1, K2
>
> executor_predicates .... ((K1 < \:_sys_HostVarHi0) or (K1 =
>
> \:_sys_HostVarHi0) and case(if_then_else((0
> <>
>
> \:_sys_hostVarExclRange),
>
> (K2 < \:_sys_HostVarHi1),
>
> (K2 <= \:_sys_HostVarHi1)))) and ((K1 >
>
> \:_sys_HostVarLo0) or (K1 = \:_sys_HostVarLo0)
> and
>
> (K2 >= \:_sys_HostVarLo1)) and ((K1 <
>
> \:_sys_HostVarHi0) or (K1 = \:_sys_HostVarHi0)
> and
>
> (K2 <= \:_sys_HostVarHi1))
>
> mdam_disjunct .......... (K1 < \:_sys_HostVarHi0) and (K1 >
>
> \:_sys_HostVarLo0) and (K1 <
> \:_sys_HostVarHi0)
> -> duplicate identical
>
> and (K2 >= 10) and (K2 <= 20)
>
> mdam_disjunct .......... (K1 < \:_sys_HostVarHi0) and (K1 >
>
> \:_sys_HostVarLo0) and (K1 =
> \:_sys_HostVarHi0)
> -> K1 cannot be < and = to the same value at the same time, so all this
> expression is always false
>
> and (K2 <= \:_sys_HostVarHi1) and (K2 >= 10)
> and
>
> (K2 <= 20)
>
> mdam_disjunct .......... (K1 < \:_sys_HostVarHi0) and (K1 = ->duplicate
> identical
>
> \:_sys_HostVarLo0) and (K2 >=
> \:_sys_HostVarLo1)
>
> and (K1 < \:_sys_HostVarHi0) and (K2 >= 10)
> and
>
> (K2 <= 20)
>
> mdam_disjunct .......... (K1 < \:_sys_HostVarHi0) and (K1 = -> K1 cannot
> be < and = to the same value at the same time, so all this expression is
> always false
>
> \:_sys_HostVarLo0) and (K2 >=
> \:_sys_HostVarLo1)
>
> and (K1 = \:_sys_HostVarHi0) and (K2 <=
>
> \:_sys_HostVarHi1) and (K2 >= 10) and (K2 <=
> 20)
>
> mdam_disjunct .......... (K1 = \:_sys_HostVarHi0) and (K1 >-> K1 cannot
> be < and = to the same value at the same time, so all this expression is
> always false
>
> \:_sys_HostVarLo0) and (K1 <
> \:_sys_HostVarHi0)
>
> and (K2 >= 10) and (K2 <= 20)
>
> mdam_disjunct .......... (K1 = \:_sys_HostVarHi0) and (K1 >
>
> \:_sys_HostVarLo0) and (K1 =
> \:_sys_HostVarHi0)->
> duplicate identical
>
> and (K2 <= \:_sys_HostVarHi1) and (K2 >= 10)
> and
>
> (K2 <= 20)
>
> mdam_disjunct .......... (K1 = \:_sys_HostVarHi0) and (K1 =
>
> \:_sys_HostVarLo0) and (K2 >=
> \:_sys_HostVarLo1)
>
> and (K1 < \:_sys_HostVarHi0) and (K2 >= 10)
> and->
> K1 cannot be < and = to the same value at the same time, so all this
> expression is always false
>
> (K2 <= 20)
>
> mdam_disjunct .......... (K1 = \:_sys_HostVarHi0) and (K1 = ->duplicate
> identical
>
> \:_sys_HostVarLo0) and (K2 >=
> \:_sys_HostVarLo1)
>
> and (K1 = \:_sys_HostVarHi0) and (K2 <=
>
> \:_sys_HostVarHi1) and (K2 >= 10) and (K2 <=
> 20)
>
> part_key_predicates .... (K2 >= 10) and (K2 <= 20)
>
>
>
> --- SQL operation complete.
>
>
>
>
>
> So I am not sure I fully understand how to read an mdam_disjunct, but I am
> sure I am struggling to make sense out of the stuff in red?
>
> Can someone help me understand please,
>
> Thanks,
> Eric
>