[ 
https://issues.apache.org/jira/browse/IMPALA-7015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474673#comment-16474673
 ] 

Thomas Tauber-Marshall commented on IMPALA-7015:
------------------------------------------------

I agree with the point from IMPALA-3710 that we don't want these types of 
errors to prematurely terminate the query, which would happen if we just 
returned them immediately. I think it would be reasonable to save the errors 
and return an error status once all of the rows have been sent to Kudu, 
e.g. in FlushFinal().

We could then return an error message with the counts of rows that hit various 
errors. That wouldn't be as good as the more structured approach suggested in 
IMPALA-4416 and IMPALA-1789, since clients would have to parse the message, but 
it would at least be something, rather than just returning no info about what 
happened.
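
As a rough illustration of the idea above, here is a minimal sketch in Python. 
It is not Impala code: the names Status, DeferredErrorSink, send() and 
flush_final() are hypothetical stand-ins (flush_final() only mirrors the role 
FlushFinal() would play), and the second error string is made up for the 
example. The sink saves per-row errors while rows are being sent and only 
reports them, as counts per error type, once everything has been flushed.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Status:
    """Hypothetical stand-in for a query status; not Impala's Status class."""
    ok: bool
    message: str = "OK"


@dataclass
class DeferredErrorSink:
    """Hypothetical sink that records per-row errors instead of failing fast."""
    rows_sent: int = 0
    error_counts: Counter = field(default_factory=Counter)

    def send(self, row_errors):
        """Process one batch; row_errors has one entry per row (None = success)."""
        for err in row_errors:
            self.rows_sent += 1
            if err is not None:
                # Save the error and keep going, so the query is not cut short.
                self.error_counts[err] += 1

    def flush_final(self):
        """Called once all rows are sent; only now surface the saved errors."""
        if not self.error_counts:
            return Status(ok=True)
        failed = sum(self.error_counts.values())
        details = "; ".join(
            f"{kind}: {count}" for kind, count in sorted(self.error_counts.items())
        )
        return Status(
            ok=False,
            message=f"{failed} of {self.rows_sent} rows hit errors ({details})",
        )


# Illustrative usage with made-up batches of per-row results.
sink = DeferredErrorSink()
sink.send([None, "Key already present", None])
sink.send(["Key already present", "Null in non-nullable column"])
status = sink.flush_final()
print(status.ok, status.message)
```

A client would still have to parse the message string to get at the counts, 
which is the limitation noted above compared to a structured approach.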

My main hesitation is that this would be a somewhat breaking change, and if 
we're going to do that it might be better to wait until everything is in place 
to do it the right way, rather than making breaking changes around this twice.

> Insert into Kudu table returns with Status OK even if there are Kudu errors
> ---------------------------------------------------------------------------
>
>                 Key: IMPALA-7015
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7015
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 2.12.0
>            Reporter: Mostafa Mokhtar
>            Priority: Major
>         Attachments: Insert into kudu profile with errors.txt
>
>
> DML statements against Kudu tables return status OK even if there are Kudu 
> errors.
> This behavior is misleading. 
> {code}
>   Summary:
>     Session ID: 18430b000e5dd8dc:e3e5dadb4a15d4b4
>     Session Type: BEESWAX
>     Start Time: 2018-05-11 10:10:07.314218000
>     End Time: 2018-05-11 10:10:07.434017000
>     Query Type: DML
>     Query State: FINISHED
>     Query Status: OK
>     Impala Version: impalad version 2.12.0-cdh5.15.0 RELEASE (build 2f9498d5c2f980aa7ff9505c56654c8e59e026ca)
>     User: mmokhtar
>     Connected User: mmokhtar
>     Delegated User: 
>     Network Address: ::ffff:10.17.234.27:60760
>     Default Db: tpcds_1000_kudu
>     Sql Statement: insert into store_2 select * from store
>     Coordinator: vd1317.foo:22000
>     Query Options (set by configuration): 
>     Query Options (set by configuration and planner): MT_DOP=0
>     Plan: 
> {code}
> {code}
> Operator          #Hosts   Avg Time  Max Time  #Rows  Est. #Rows  Peak Mem  Est. Peak Mem  Detail
> -------------------------------------------------------------------------------------------------------------------------------------------------
> 02:PARTIAL SORT        5  909.030us   1.025ms  1.00K       1.00K   6.14 MB        4.00 MB
> 01:EXCHANGE            5    6.262ms   7.232ms  1.00K       1.00K  75.50 KB              0  KUDU(KuduPartition(tpcds_1000_kudu.store.s_store_sk))
> 00:SCAN KUDU           5    3.694ms   4.137ms  1.00K       1.00K   4.34 MB              0  tpcds_1000_kudu.store
>     Errors: Key already present in Kudu table 'impala::tpcds_1000_kudu.store_2'. (1 of 1002 similar)
> {code}


