Anshuman Mishra created DRILL-8331:
--------------------------------------
Summary: UNION ALL for multiple sheets in excel giving unexpected
result when used with a subquery in not in
Key: DRILL-8331
URL: https://issues.apache.org/jira/browse/DRILL-8331
Project: Apache Drill
Issue Type: Bug
Components: Client - Java
Affects Versions: 1.20.2
Reporter: Anshuman Mishra
Attachments: TestingMerge.xlsx
Hi All,
I am having a query regarding a query I am firing up in drill and not getting
the expected result set. Basically what I am doing is merging the data from
three sheets and eliminating only those records present in the 4th sheet.
The query:
{quote}{_}select * from (SELECT EmpId,Test FROM
TABLE(dfs.{_}{{{}_[C://ATOP//all_files//XLS_XLSX//TestingMerge.xlsx|file:///C://ATOP//all_files//XLS_XLSX//TestingMerge.xlsx]_{}}}{_}(type
=> 'excel', sheetName =>'Sheet1')) UNION ALL SELECT EmpId,Test FROM
TABLE(dfs.{_}{{{}_[C://ATOP//all_files//XLS_XLSX//TestingMerge.xlsx|file:///C://ATOP//all_files//XLS_XLSX//TestingMerge.xlsx]_{}}}{_}(type
=> 'excel', sheetName =>'Sheet2')) UNION ALL SELECT EmpId,Test FROM
TABLE(dfs.{_}{{{}_[C://ATOP//all_files//XLS_XLSX//TestingMerge.xlsx|file:///C://ATOP//all_files//XLS_XLSX//TestingMerge.xlsx]_{}}}{_}(type
=> 'excel', sheetName =>'Sheet3'))) where EmpID not in (select EmpID from
TABLE(dfs.{_}{{{}_[C://ATOP//all_files//XLS_XLSX//TestingMerge.xlsx|file:///C://ATOP//all_files//XLS_XLSX//TestingMerge.xlsx]_{}}}{_}(type
=> 'excel', sheetName =>'Sheet4'))){_}{quote}
The query when we execute it, return no data, when it should just eliminate the
data present in sheet4 from the excel sheet. Strangely enough when we do a
union All on two sheets this query works perfectly fine and returns the
expected result. One more finding is, if we use static values in 'not in'
instead of 'subquery', the query works fine with all the 3 sheets as union all.
I even gave a null check in the subquery used in 'not in' as it might consider
null values as well, still no luck. Not sure where I am going wrong with this.
Would appreciate any inputs.
NOTE: Instead of UNION ALL if UNION is applied then it is giving the expected
result set.
I have attached a sample file I am using to recreate the issue.
[^TestingMerge.xlsx]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)