[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected

Gopal V (JIRA) Thu, 05 Jul 2018 11:36:12 -0700


    [ https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16534007#comment-16534007 ]


Gopal V commented on HIVE-20076:
--------------------------------

The ORDER BY in the queries makes it very hard to tell whether the bug is 
happening or not, because this is about sequential row numbering.

{code}
+{"writeid":### Masked writeid ###,"bucketid":536870912,"rowid":513}    jessica garcia  59      3.52    20110926
+{"writeid":### Masked writeid ###,"bucketid":536870912,"rowid":6633}   jessica garcia  59      3.69    20110925
{code}

I think the fix to getRowNumber() is necessary, namely:

{code}
     if (rowInBatch >= batch.size) {
+      baseRow = super.getRowNumber();
+      rowInBatch = 0;
       return super.nextBatch(theirBatch);
     }
{code}

This has the side effect that batch.size is checked again the next time 
around: if batch.size != 0, rowInBatch (now 0) is below it, so the first batch 
will be repeated between every fast-path batch.

Ideally, it should also reset the batch at that point if batch.size > 0 (the 
invariant is that rowInBatch has already consumed it).
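
To make the suggestion concrete, here is a minimal, self-contained sketch of the pattern. It is not the Hive or ORC code: BaseReader, WrappingReader, next() and the List<Long> "batches" are hypothetical stand-ins, and only the names baseRow, rowInBatch, getRowNumber() and nextBatch() mirror the snippet above. It assumes getRowNumber() is computed as baseRow + rowInBatch.

```java
import java.util.ArrayList;
import java.util.List;

final class RowNumberSketch {
    /** Hypothetical stand-in for the underlying batch reader. */
    static class BaseReader {
        private final int totalRows;
        private long nextRow = 0;

        BaseReader(int totalRows) { this.totalRows = totalRows; }

        /** Row number of the next row this reader will produce. */
        long getRowNumber() { return nextRow; }

        /** Fills out with up to capacity row ids; false at end of data. */
        boolean nextBatch(List<Long> out, int capacity) {
            out.clear();
            while (out.size() < capacity && nextRow < totalRows) {
                out.add(nextRow++);
            }
            return !out.isEmpty();
        }
    }

    /** Wrapper offering a row-by-row slow path and a batch fast path. */
    static class WrappingReader extends BaseReader {
        private final List<Long> batch = new ArrayList<>(); // internal batch
        private final int capacity;
        private long baseRow = 0;   // row number of batch.get(0)
        private int rowInBatch = 0; // rows of batch already consumed

        WrappingReader(int totalRows, int capacity) {
            super(totalRows);
            this.capacity = capacity;
        }

        @Override
        long getRowNumber() { return baseRow + rowInBatch; }

        /** Slow path: one row at a time out of the internal batch. */
        Long next() {
            if (rowInBatch >= batch.size()) {
                baseRow = super.getRowNumber();
                rowInBatch = 0;
                if (!super.nextBatch(batch, capacity)) {
                    return null;
                }
            }
            return batch.get(rowInBatch++);
        }

        /** Fast path: hand the caller's batch straight to the base reader. */
        boolean nextBatch(List<Long> theirBatch) {
            if (rowInBatch >= batch.size()) {
                // The internal batch is fully consumed. Clear it, otherwise
                // rowInBatch = 0 below would make it look unread and it would
                // be replayed on the next call.
                batch.clear();
                baseRow = super.getRowNumber();
                rowInBatch = 0;
                return super.nextBatch(theirBatch, capacity);
            }
            // Rows left over from the slow path: drain them first.
            theirBatch.clear();
            theirBatch.addAll(batch.subList(rowInBatch, batch.size()));
            rowInBatch = batch.size();
            return true;
        }
    }

    public static void main(String[] args) {
        WrappingReader reader = new WrappingReader(10, 4);
        for (int i = 0; i < 4; i++) {
            reader.next(); // consume the first internal batch row by row
        }
        List<Long> theirBatch = new ArrayList<>();
        while (reader.nextBatch(theirBatch)) {
            System.out.println(reader.getRowNumber() + " -> " + theirBatch);
        }
    }
}
```

In this sketch, dropping the batch.clear() call reproduces the repetition described above: after a fast-path read, rowInBatch is 0 while the stale internal batch is still non-empty, so the following call copies it out again instead of reading new rows.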

> Delete on a partitioned table removes more rows than expected
> -------------------------------------------------------------
>
>                 Key: HIVE-20076
>                 URL: https://issues.apache.org/jira/browse/HIVE-20076
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>            Reporter: Teddy Choi
>            Assignee: Teddy Choi
>            Priority: Major
>         Attachments: HIVE-20076.2.patch, HIVE-20076.patch
>
>
> Delete on a partitioned table removes more rows than expected



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
