[jira] [Comment Edited] (ASTERIXDB-1030) Constant created between two data scans gets inserted into their join

Yingyi Bu (JIRA) Wed, 16 Dec 2015 12:06:07 -0800

    [ https://issues.apache.org/jira/browse/ASTERIXDB-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060665#comment-15060665 ]


Yingyi Bu edited comment on ASTERIXDB-1030 at 12/16/15 8:04 PM:
----------------------------------------------------------------

This is the optimized query plan with my patch:

{noformat}
distribute result [%0->$$14]
-- DISTRIBUTE_RESULT  |PARTITIONED|
  exchange 
  -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
    project ([$$14])
    -- STREAM_PROJECT  |PARTITIONED|
      assign [$$14] <- [function-call: asterix:closed-record-constructor, Args:[AString: {id}, %0->$$16]]
      -- ASSIGN  |PARTITIONED|
        project ([$$16])
        -- STREAM_PROJECT  |PARTITIONED|
          exchange 
          -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
            join (function-call: algebricks:eq, Args:[%0->$$19, %0->$$16])
            -- HYBRID_HASH_JOIN [$$16][$$19]  |PARTITIONED|
              exchange 
              -- HASH_PARTITION_EXCHANGE [$$16]  |PARTITIONED|
                project ([$$16])
                -- STREAM_PROJECT  |PARTITIONED|
                  assign [$$16] <- [function-call: asterix:field-access-by-index, Args:[%0->$$0, AInt32: {1}]]
                  -- ASSIGN  |PARTITIONED|
                    project ([$$0])
                    -- STREAM_PROJECT  |PARTITIONED|
                      exchange 
                      -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                        data-scan []<-[$$17, $$0] <- emergencyTest:NearbySheltersDuringTornadoDangerChannelSubscriptions
                        -- DATASOURCE_SCAN  |PARTITIONED|
                          exchange 
                          -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                            empty-tuple-source
                            -- EMPTY_TUPLE_SOURCE  |PARTITIONED|
              exchange 
              -- HASH_PARTITION_EXCHANGE [$$19]  |PARTITIONED|
                project ([$$19])
                -- STREAM_PROJECT  |PARTITIONED|
                  select (function-call: algebricks:ge, Args:[function-call: asterix:field-access-by-index, Args:[%0->$$2, AInt32: {4}], function-call: asterix:numeric-subtract, Args:[function-call: asterix:current-datetime, Args:[], org.apache.asterix.om.base.ADayTimeDuration@927c0]])
                  -- STREAM_SELECT  |PARTITIONED|
                    assign [$$19] <- [function-call: asterix:field-access-by-index, Args:[%0->$$2, AInt32: {5}]]
                    -- ASSIGN  |PARTITIONED|
                      project ([$$2])
                      -- STREAM_PROJECT  |PARTITIONED|
                        exchange 
                        -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                          unnest-map [$$18, $$2] <- function-call: asterix:index-search, Args:[AString: {CHPReports}, AInt32: {0}, AString: {emergencyTest}, AString: {CHPReports}, ABoolean: {false}, ABoolean: {false}, ABoolean: {false}, AInt32: {1}, %0->$$23, AInt32: {1}, %0->$$23, TRUE, TRUE, TRUE]
                          -- BTREE_SEARCH  |PARTITIONED|
                            exchange 
                            -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                              order (ASC, %0->$$23) 
                              -- STABLE_SORT [$$23(ASC)]  |PARTITIONED|
                                exchange 
                                -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                                  project ([$$23])
                                  -- STREAM_PROJECT  |PARTITIONED|
                                    exchange 
                                    -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                                      unnest-map [$$22, $$23] <- function-call: asterix:index-search, Args:[AString: {times}, AInt32: {0}, AString: {emergencyTest}, AString: {CHPReports}, ABoolean: {false}, ABoolean: {false}, ABoolean: {false}, AInt32: {1}, %0->$$21, AInt32: {0}, TRUE, TRUE, FALSE]
                                      -- BTREE_SEARCH  |PARTITIONED|
                                        exchange 
                                        -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                                          assign [$$21] <- [function-call: asterix:numeric-subtract, Args:[function-call: asterix:current-datetime, Args:[], org.apache.asterix.om.base.ADayTimeDuration@927c0]]
                                          -- ASSIGN  |PARTITIONED|
                                            empty-tuple-source
                                            -- EMPTY_TUPLE_SOURCE  |PARTITIONED|
{noformat}

The plan seems fine to me because it leverages the secondary index on the 
timestamp field of dataset CHPReports.
What are the downsides of pushing a constant through a join?  Is it a 
correctness issue or a performance issue?
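
To make the access path in the right-hand branch of the plan concrete, here is a rough Python sketch of what it does when read bottom-up: compute the cutoff, search the secondary index for timestamps at or after it, sort the resulting primary keys, look them up in the primary index, re-check the predicate, and then hash-join with the subscriptions. Everything named here is an assumption for illustration and is not AsterixDB code: the dict and sorted list standing in for the B-trees, the field names `timestamp` and `key`, the helper names `scan_recent` and `hash_join`, and the one-hour window (the plan does not show the actual duration).

```python
from bisect import bisect_left
from collections import defaultdict
from datetime import datetime, timedelta


def scan_recent(primary, secondary, cutoff):
    """Toy version of the right-hand branch of the plan.

    primary   -- dict: primary key -> record (stands in for the primary B-tree)
    secondary -- sorted list of (timestamp, primary key) pairs (the secondary index)
    cutoff    -- the constant, computed once by the caller (the bottom ASSIGN)
    """
    # BTREE_SEARCH on the secondary index: entries with timestamp >= cutoff.
    start = bisect_left(secondary, (cutoff,))
    # STABLE_SORT: order the primary keys before probing the primary index.
    pks = sorted(pk for _, pk in secondary[start:])
    # BTREE_SEARCH on the primary index, then STREAM_SELECT re-checks the predicate.
    return [primary[pk] for pk in pks if primary[pk]["timestamp"] >= cutoff]


def hash_join(subscriptions, reports):
    """Toy equi-join on the hypothetical field "key" (HYBRID_HASH_JOIN in the plan)."""
    build = defaultdict(list)
    for sub in subscriptions:
        build[sub["key"]].append(sub)
    return [(sub, rep) for rep in reports for sub in build.get(rep["key"], [])]


# Fixed "now" so the example is deterministic; the real plan calls current-datetime.
now = datetime(2015, 12, 16, 12, 0, 0)
cutoff = now - timedelta(hours=1)  # assumed window; the plan does not show the duration

primary = {
    1: {"id": 1, "timestamp": now - timedelta(minutes=10), "key": "a"},
    2: {"id": 2, "timestamp": now - timedelta(hours=3), "key": "b"},
    3: {"id": 3, "timestamp": cutoff, "key": "c"},
}
secondary = sorted((rec["timestamp"], pk) for pk, rec in primary.items())
subscriptions = [{"id": "s1", "key": "a"}, {"id": "s2", "key": "z"}]

joined = hash_join(subscriptions, scan_recent(primary, secondary, cutoff))
```

Note that the STREAM_SELECT in the plan spells out the current-datetime expression again, whereas the sketch simply reuses the same cutoff value in both places.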



> Constant created between two data scans gets inserted into their join
> ---------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1030
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1030
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: AsterixDB, Optimizer
>            Reporter: Steven Jacobs
>            Assignee: Yingyi Bu
>            Priority: Minor
>              Labels: FixedInBranch
>
> Constant created between two data scans gets inserted into their join



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
