[ 
https://issues.apache.org/jira/browse/DRILL-8331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Turton updated DRILL-8331:
--------------------------------
    Priority: Major  (was: Critical)

> UNION ALL for multiple sheets in excel giving unexpected result when used 
> with a subquery in not in
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-8331
>                 URL: https://issues.apache.org/jira/browse/DRILL-8331
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Excel
>    Affects Versions: 1.20.2
>            Reporter: Anshuman Mishra
>            Priority: Major
>         Attachments: TestingMerge.xlsx
>
>
> Hi All,
> I am having a query regarding a query I am firing up in drill and not getting 
> the expected result set. Basically what I am doing is merging the data from 
> three sheets and eliminating only those records present in the 4th sheet.
> The query:
> {quote}{_}select * from (SELECT EmpId,Test FROM                         
> TABLE(dfs.{_}{{{}_[C://TestingMerge.xlsx|file:///C://ATOP//all_files//XLS_XLSX//TestingMerge.xlsx]_{}}}{_}(type
>  => 'excel', sheetName        =>'Sheet1')) UNION ALL SELECT EmpId,Test FROM  
> TABLE(dfs.{_}{{{}_[C://TestingMerge.xlsx|file:///C://ATOP//all_files//XLS_XLSX//TestingMerge.xlsx]_{}}}{_}(type
>  => 'excel', sheetName =>'Sheet2')) UNION ALL SELECT EmpId,Test FROM  
> TABLE(dfs.{_}{{{}_[C://TestingMerge.xlsx|file:///C://ATOP//all_files//XLS_XLSX//TestingMerge.xlsx]_{}}}{_}(type
>  => 'excel', sheetName =>'Sheet3'))) where EmpID not in (select EmpID from 
> TABLE(dfs.{_}{{{}_[C://TestingMerge.xlsx|file:///C://ATOP//all_files//XLS_XLSX//TestingMerge.xlsx]_{}}}{_}(type
>  => 'excel', sheetName    =>'Sheet4'))){_}
> {quote}
> The query when we execute it, returns no data, when it should just eliminate 
> the data present in sheet4 from the excel sheet. Strangely enough when we do 
> a union All on two sheets this query works perfectly fine and returns the 
> expected result. One more finding is, if we use static values in 'not in' 
> instead of 'subquery', the query works fine with all the 3 sheets as union 
> all. I even gave a null check in the subquery used in 'not in' as it might 
> consider null values as well, still no luck.
> NOTE: Instead of UNION ALL if UNION is applied then it is giving the expected 
> result set.
> I have attached a sample file I am using to recreate the issue. 
> [^TestingMerge.xlsx]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to