[ 
https://issues.apache.org/jira/browse/HIVE-14949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15843757#comment-15843757
 ] 

Eugene Koifman commented on HIVE-14949:
---------------------------------------

The "cardinality clause" is the last one in the text order of the generated 
multi-insert, and the parser preserves this order.  (I'm pretty sure we rely 
on this in other places.)

I agree that it has a non-trivial cost, but it seems that "training wheels on" 
should be the default.  If this condition is violated, we may write different 
events with the same ROW__ID to the same file (same op, same current txnid).  
It's not clear to me how the base/delta merge logic will react to that.  Maybe 
it will crash, maybe it will silently produce bad data.
So with this on, bugs in user-supplied ON clauses will hopefully be detected 
before it's too late.

> Enforce that target:source is not 1:N
> -------------------------------------
>
>                 Key: HIVE-14949
>                 URL: https://issues.apache.org/jira/browse/HIVE-14949
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Transactions
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>         Attachments: HIVE-14949.01.patch, HIVE-14949.02.patch, 
> HIVE-14949.03.patch, HIVE-14949.03.patch, HIVE-14949.04.patch, 
> HIVE-14949.05.patch
>
>
> If > 1 row on source side matches the same row on target side, that means 
> that we are forced to update (or delete) the same row in target more than 
> once as part of the same SQL statement.  This should raise an error per SQL Spec
> ISO/IEC 9075-2:2011(E)
> Section 14.2 under "General Rules" Item 6/Subitem a/Subitem 2/Subitem B
> There is no sure way to do this via static analysis of the query.
> Can we add something to ROJ operator to pay attention to ROW__ID of target 
> side row and compare it with ROW__ID of target side of previous row output?  
> If they are the same, that means > 1 source row matched.
> Or perhaps just mark each row in the hash table as matched.  And if it 
> matches again, throw an error.
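
The "mark each row as matched" idea from the description can be sketched 
roughly as follows.  This is only an illustration, not the code in the 
attached patches: the CardinalityCheck and RowId names are hypothetical, and 
RowId is a simplified stand-in for Hive's ROW__ID.  The check remembers every 
target row that has already been matched and fails as soon as the same one 
shows up a second time.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

/** Hypothetical sketch: detect more than one source row matching a target row. */
public class CardinalityCheck {

    /** Simplified stand-in for the ROW__ID of a target row (assumed fields). */
    static final class RowId {
        final long txnId;
        final int bucket;
        final long rowId;

        RowId(long txnId, int bucket, long rowId) {
            this.txnId = txnId;
            this.bucket = bucket;
            this.rowId = rowId;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof RowId)) {
                return false;
            }
            RowId other = (RowId) o;
            return txnId == other.txnId && bucket == other.bucket && rowId == other.rowId;
        }

        @Override
        public int hashCode() {
            return Objects.hash(txnId, bucket, rowId);
        }

        @Override
        public String toString() {
            return "{txnId=" + txnId + ", bucket=" + bucket + ", rowId=" + rowId + "}";
        }
    }

    // Target rows that have already been matched by some source row.
    private final Set<RowId> matched = new HashSet<>();

    /** Call once per joined (target, source) pair; throws on a repeated target row. */
    void recordMatch(RowId target) {
        // Set.add returns false when the element was already present.
        if (!matched.add(target)) {
            throw new IllegalStateException(
                "More than one source row matched target row " + target);
        }
    }
}
```

Keeping a set of every matched ROW__ID costs memory proportional to the 
number of matched target rows, which is one way to see where the non-trivial 
cost mentioned in the comment could come from.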



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
