[ 
https://issues.apache.org/jira/browse/IMPALA-10647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17326180#comment-17326180
 ] 

ASF subversion and git services commented on IMPALA-10647:
----------------------------------------------------------

Commit a985e1134ee2a51cb44d0b6ccf83c77bcef64e83 in impala's branch 
refs/heads/master from Qifan Chen
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=a985e11 ]

IMPALA-10647 Improve always-true min/max filter handling in coordinator

This change improves how the coordinator behaves when a newly
arrived min/max filter is always true. A new member,
'always_true_filter_received_', records that fact. Similarly, a
new member, 'always_false_flipped_to_false_', indicates that the
always-false flag has been flipped from 'true' to 'false'. These
two members only affect how the min and max columns of the
"Filter routing table" and the "Final filter table" in the
profile are displayed. The possible values are:

  1. 'PartialUpdates' - The min and the max are partially updated;
  2. 'AlwaysTrue'     - One received filter is AlwaysTrue;
  3. 'AlwaysFalse'    - No filter is received or all received
                        filters are empty;
  4. 'Real values'    - The final accumulated min/max from all
                        received filters.
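
The display rules above can be sketched as a small function. This is
only an illustration, not the coordinator's actual code: the struct
MinMaxFilterState, the field all_filters_received, the function
MinMaxColumns() and the order in which the cases are checked are all
assumptions made for this example. Only the two flag names (without
their trailing underscore) and the four display strings come from the
commit message.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical, simplified view of the per-filter state kept by the
// coordinator. Not Impala's real data structure.
struct MinMaxFilterState {
  // Mirrors 'always_true_filter_received_'.
  bool always_true_filter_received = false;
  // Mirrors 'always_false_flipped_to_false_': set once a non-empty
  // filter has been merged in.
  bool always_false_flipped_to_false = false;
  // Assumed helper flag: true once every expected filter has arrived.
  bool all_filters_received = false;
  int64_t min = 0;
  int64_t max = 0;
};

// Returns the text for the min and max columns. The precedence of the
// checks is one plausible reading of the commit message.
std::pair<std::string, std::string> MinMaxColumns(const MinMaxFilterState& s) {
  if (s.always_true_filter_received) return {"AlwaysTrue", "AlwaysTrue"};
  if (!s.always_false_flipped_to_false) return {"AlwaysFalse", "AlwaysFalse"};
  if (!s.all_filters_received) return {"PartialUpdates", "PartialUpdates"};
  return {std::to_string(s.min), std::to_string(s.max)};
}
```

With this reading, a state in which no filter has arrived yet is shown
as 'AlwaysFalse', and real values appear only after every expected
filter has been merged.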

A second change records, in the scan node, the arrival time of
min/max filters (as a monotonic timestamp measured from system
boot, obtained by calling MonotonicMillis()). A timestamp of the
same kind is recorded by the HDFS Parquet scanners when a row
group is processed. Comparing the two timestamps makes it easier
to diagnose issues caused by late-arriving min/max filters.
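
The timestamp comparison can be sketched as follows. This is a minimal
illustration under stated assumptions: NowMonotonicMillis() is a
stand-in built on std::chrono::steady_clock, not Impala's
MonotonicMillis(), and the helpers FilterArrivedLate() and
FilterLatenessMillis() are hypothetical names introduced for this
example.

```cpp
#include <chrono>
#include <cstdint>

// Stand-in for a monotonic millisecond clock. steady_clock never goes
// backwards, so two readings can be compared safely.
int64_t NowMonotonicMillis() {
  using namespace std::chrono;
  return duration_cast<milliseconds>(
      steady_clock::now().time_since_epoch()).count();
}

// True when the row group was processed before the filter arrived,
// that is, the filter came too late to be applied to that row group.
bool FilterArrivedLate(int64_t filter_arrival_ms, int64_t row_group_ms) {
  return filter_arrival_ms > row_group_ms;
}

// How many milliseconds too late the filter was; 0 if it was on time.
int64_t FilterLatenessMillis(int64_t filter_arrival_ms, int64_t row_group_ms) {
  return FilterArrivedLate(filter_arrival_ms, row_group_ms)
      ? filter_arrival_ms - row_group_ms
      : 0;
}
```

Both readings must come from the same monotonic clock on the same host
for the comparison to be meaningful.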

This change also fixes a flaw that caused rows to be filtered
out unexpectedly: the always_true_ flag of a min/max filter,
when set, was ignored in the eval code path in
RuntimeFilter::Eval().
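
The effect of honoring the flag can be shown with a small stand-alone
sketch. SimpleMinMaxFilter below is a hypothetical type written for
this example; it is not Impala's RuntimeFilter or MinMaxFilter, and the
real Eval() code path is more involved.

```cpp
#include <cstdint>

// Hypothetical, simplified min/max filter over 64-bit integers.
struct SimpleMinMaxFilter {
  bool always_true = false;   // filter is disabled: every row passes
  bool always_false = false;  // filter is empty: no row passes
  int64_t min = 0;
  int64_t max = 0;

  // Returns true if the row must be kept.
  bool Eval(int64_t value) const {
    // Check always_true first. If this check is skipped, stale min/max
    // bounds can reject rows that should have passed.
    if (always_true) return true;
    if (always_false) return false;
    return value >= min && value <= max;
  }
};
```

For a filter with bounds [10, 20] and always_true set, Eval(5) returns
true; without the first check it would return false and the row would
be dropped.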

Testing:
  1. Added three new tests in overlap_min_max_filters.test to
     verify that the min/max are displayed correctly when the
     min/max filter in hash join builder is set to always true,
     always false, or a pair of meaningful min and max values.
  2. Ran unit tests.
  3. Ran runtime-filter-test.
  4. Ran core tests successfully.

Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964
Reviewed-on: http://gerrit.cloudera.org:8080/17252
Reviewed-by: Joe McDonnell <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Improve always-true min/max filter handling in coordinator
> ----------------------------------------------------------
>
>                 Key: IMPALA-10647
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10647
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Qifan Chen
>            Assignee: Qifan Chen
>            Priority: Major
>
> Currently, when a newly arrived min/max filter is the last one to arrive
> or is always true, the coordinator disables the corresponding filter
> representation by setting it to Always True. This makes it impossible to
> distinguish a genuine AlwaysTrue filter (say, one set in the
> hash join build step) from one that has been disabled.
> Better handling is needed in this area.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
