cristianpop opened a new issue, #18726:
URL: https://github.com/apache/pinot/issues/18726

   Queries that look for rows with either nulls or specific values for a column 
don't return the rows with null values.
   
   Table schema excerpt:
   `{
        "schemaName": "test",
        "enableColumnBasedNullHandling": true,
        "dimensionFieldSpecs": [
                ...
                {
                  "name": "prop",
                  "dataType": "STRING",
                  "fieldType": "DIMENSION"
                },
                ...
        ]
   }`
   
   Using SSE the following query returns the correct count (102493):
   `SET enableNullHandling = true;
   
   SELECT COUNT(*)
   FROM test
   WHERE prop IS NULL
     OR prop = 'value'` 
   
   On the other hand, using MSE the same query doesn't return the correct count 
(80):
   `SET useMultistageEngine=true;
   SET enableNullHandling = true;
   
   SELECT COUNT(*)
   FROM test
   WHERE prop IS NULL
     OR prop = 'value'`
   
   If I query for rows with the given value for prop using MSE I get the 
correct count (80):
   `SET useMultistageEngine=true;
   SET enableNullHandling = true;
   
   SELECT COUNT(*)
   FROM test
   WHERE prop = 'value'` 
   
   If I query for rows with null values for prop using MSE I get the correct 
count (102413):
   `SET useMultistageEngine=true;
   SET enableNullHandling = true;
   
   SELECT COUNT(*)
   FROM test
   WHERE prop IS NULL`
   
   As can be seen from the queries above, even if I ask for rows with prop as 
null or with given value, MSE only returns the rows that have the given value, 
ignoring those with nulls.
   
   I looked at the execution plan and it looks fine since it has NULL AS TRUE 
set:
   `Execution Plan
   PinotLogicalAggregate(group=[{}], agg#0=[COUNT($0)], aggType=[FINAL])
     PinotLogicalExchange(distribution=[hash])
       PinotLogicalAggregate(group=[{}], agg#0=[COUNT()], aggType=[LEAF])
         LogicalFilter(condition=[SEARCH($5, Sarg[_UTF-8'value':VARCHAR 
CHARACTER SET "UTF-8"; NULL AS TRUE]:VARCHAR CHARACTER SET "UTF-8")])
           PinotLogicalTableScan(table=[[default, test]])`
   
   It seems like the null as true flag gets lost in 
RexExpressionUtils::handleSearch because it doesn't take it into account at 
all. I already have a commit right 
[here](https://github.com/apache/pinot/compare/master...cristianpop:pinot:wip-fix-mse-is-null-or-value-query-handling)
 that fixes this issue so I can open a PR for it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to