[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected

2018-07-05 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534099#comment-16534099
 ] 

Gopal V commented on HIVE-20076:


[~teddy.choi]: Added a test to TestVectorizedORCReader to check that the 
row-numbering is sequential and continuous (with a loop) to replace the qtest 
version.

And reset the batch within the fast-path batch.
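
For readers without the patch at hand, the kind of loop check described above can be sketched as follows. This is a minimal illustration, not the test added to TestVectorizedORCReader; the class `RowNumberCheck` and the method `firstGap` are hypothetical names, and the row numbers are made-up values rather than data read from an ORC file.

```java
// Hypothetical helper, not part of Hive: checks that row numbers are
// sequential and continuous by walking them in a loop.
class RowNumberCheck {
    /**
     * Returns the index of the first row number that does not directly
     * follow its predecessor, or -1 if the sequence is continuous.
     */
    static int firstGap(long[] rowNumbers) {
        for (int i = 1; i < rowNumbers.length; i++) {
            if (rowNumbers[i] != rowNumbers[i - 1] + 1) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long[] continuous = {0, 1, 2, 3, 4};
        long[] restarted = {0, 1, 2, 0, 1, 2}; // numbering restarts at a batch boundary
        System.out.println(firstGap(continuous));
        System.out.println(firstGap(restarted));
    }
}
```

A check like this does not depend on the ordering of query output, which is why it is easier to reason about than a .q.out diff.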

> Delete on a partitioned table removes more rows than expected
> -------------------------------------------------------------
>
> Key: HIVE-20076
> URL: https://issues.apache.org/jira/browse/HIVE-20076
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Reporter: Teddy Choi
> Assignee: Teddy Choi
> Priority: Major
> Attachments: HIVE-20076.2.patch, HIVE-20076.3.patch, HIVE-20076.patch
>
>
> Delete on a partitioned table removes more rows than expected



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected

2018-07-05 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534007#comment-16534007
 ] 

Gopal V commented on HIVE-20076:


The ORDER BY in the queries makes it very hard to tell whether the bug is 
happening or not, because this is about sequential numbering.

{code}
+{"writeid":### Masked writeid ###,"bucketid":536870912,"rowid":513}   jessica garcia  59  3.52  20110926
+{"writeid":### Masked writeid ###,"bucketid":536870912,"rowid":6633}  jessica garcia  59  3.69  20110925
{code}

I think the fix to getRowNumber() is necessary, which is 

{code}
 if (rowInBatch >= batch.size) {
+  baseRow = super.getRowNumber();
+  rowInBatch = 0;
   return super.nextBatch(theirBatch);
 }
{code}

This has the side effect that batch.size is read again the next time around: 
if batch.size != 0, the first batch will be repeated between every fast-path 
batch.

Ideally, at that point it should also reset the batch if batch.size > 0 (the 
invariant is that the batch has already been consumed, as tracked by 
rowInBatch).
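
To make the interaction concrete, here is a self-contained toy model of the behaviour described above. It is only a sketch: `FastPathSketch`, `BaseReader`, `WrappingReader`, `nextRow` and the `Mode` enum are hypothetical names that merely mimic the roles of `baseRow`, `rowInBatch` and `batch.size`, and rows are plain longs rather than a VectorizedRowBatch. It assumes the internal batch gets filled by row-mode reads before the vectorized reads start.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these classes are Hive or ORC classes.
class FastPathSketch {
    enum Mode { ORIGINAL, FIX_WITHOUT_RESET, FIX_WITH_RESET }

    /** Hypothetical underlying reader; row i simply carries the value i. */
    static class BaseReader {
        private final long totalRows;
        private long position = 0;

        BaseReader(long totalRows) { this.totalRows = totalRows; }

        /** Number of the next row this reader will hand out. */
        long getRowNumber() { return position; }

        /** Reads up to maxSize rows; an empty array means end of data. */
        long[] nextBatch(int maxSize) {
            int n = (int) Math.min(maxSize, totalRows - position);
            long[] rows = new long[n];
            for (int i = 0; i < n; i++) rows[i] = position + i;
            position += n;
            return rows;
        }
    }

    /** Wrapper with a row-mode path (internal batch) and a vectorized fast path. */
    static class WrappingReader extends BaseReader {
        private final Mode mode;
        private long[] batch = new long[0]; // internal batch; batch.length plays batch.size
        private int rowInBatch = 0;
        private long baseRow = 0;

        WrappingReader(long totalRows, Mode mode) { super(totalRows); this.mode = mode; }

        @Override
        long getRowNumber() { return baseRow + rowInBatch; }

        /** Row-mode read of one row (no end-of-data handling in this sketch). */
        long nextRow(int capacity) {
            if (rowInBatch >= batch.length) {
                baseRow = super.getRowNumber();
                rowInBatch = 0;
                batch = super.nextBatch(capacity);
            }
            return batch[rowInBatch++];
        }

        @Override
        long[] nextBatch(int maxSize) {
            if (rowInBatch >= batch.length) { // internal batch consumed: fast path
                if (mode != Mode.ORIGINAL) {
                    baseRow = super.getRowNumber();
                    rowInBatch = 0;
                }
                if (mode == Mode.FIX_WITH_RESET) {
                    batch = new long[0]; // reset the already-consumed internal batch
                }
                return super.nextBatch(maxSize);
            }
            // Slow path: hand out what is left of the internal batch.
            int n = Math.min(maxSize, batch.length - rowInBatch);
            long[] rows = Arrays.copyOfRange(batch, rowInBatch, rowInBatch + n);
            rowInBatch += n;
            return rows;
        }
    }

    /** Rows delivered when 2 rows are read in row mode, then batches of 4. */
    static List<Long> delivered(Mode mode) { return run(mode, true); }

    /** Value of getRowNumber() after each non-empty vectorized batch. */
    static List<Long> reported(Mode mode) { return run(mode, false); }

    private static List<Long> run(Mode mode, boolean wantRows) {
        WrappingReader r = new WrappingReader(12, mode);
        List<Long> out = new ArrayList<>();
        for (int i = 0; i < 2; i++) {
            long row = r.nextRow(2);
            if (wantRows) out.add(row);
        }
        for (long[] b = r.nextBatch(4); b.length > 0; b = r.nextBatch(4)) {
            if (wantRows) { for (long row : b) out.add(row); }
            else out.add(r.getRowNumber());
        }
        return out;
    }

    public static void main(String[] args) {
        for (Mode m : Mode.values()) {
            System.out.println(m + " reported=" + reported(m) + " delivered=" + delivered(m));
        }
    }
}
```

In this toy model, ORIGINAL keeps reporting row 2 after every fast-path batch, FIX_WITHOUT_RESET hands rows 0 and 1 out again after each fast-path batch, and FIX_WITH_RESET reports 2, 6, 10 while delivering each row once.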



[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected

2018-07-05 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533986#comment-16533986
 ] 

Eugene Koifman commented on HIVE-20076:
---

[~teddy.choi] the patch didn't apply.

Also, how long does it take to run these tests?  The q tests include multiple 
queries producing 30K rows each, all of which are recorded in the .q.out file.  



[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected

2018-07-05 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533313#comment-16533313
 ] 

Hive QA commented on HIVE-20076:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12930244/HIVE-20076.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12395/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12395/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12395/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12930244/HIVE-20076.2.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12930244 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected

2018-07-04 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533095#comment-16533095
 ] 

Hive QA commented on HIVE-20076:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12930244/HIVE-20076.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12382/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12382/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12382/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-07-04 23:07:42.568
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-12382/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-07-04 23:07:42.572
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   bb35d83..d7128cf  branch-3   -> origin/branch-3
+ git reset --hard HEAD
HEAD is now at 5e2a530 HIVE-20066 : hive.load.data.owner is compared to full 
principal (Daniel Voros via Zoltan Haindrich)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 5e2a530 HIVE-20066 : hive.load.data.owner is compared to full 
principal (Daniel Voros via Zoltan Haindrich)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-07-04 23:07:44.249
+ rm -rf ../yetus_PreCommit-HIVE-Build-12382
+ mkdir ../yetus_PreCommit-HIVE-Build-12382
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-12382
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-12382/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: cannot apply binary patch to 'data/files/student/ds=20110924/00_0' 
without full index line
Falling back to three-way merge...
error: cannot apply binary patch to 'data/files/student/ds=20110924/00_0' 
without full index line
error: data/files/student/ds=20110924/00_0: patch does not apply
error: cannot apply binary patch to 'data/files/student/ds=20110925/00_0' 
without full index line
Falling back to three-way merge...
error: cannot apply binary patch to 'data/files/student/ds=20110925/00_0' 
without full index line
error: data/files/student/ds=20110925/00_0: patch does not apply
error: cannot apply binary patch to 'data/files/student/ds=20110926/00_0' 
without full index line
Falling back to three-way merge...
error: cannot apply binary patch to 'data/files/student/ds=20110926/00_0' 
without full index line
error: data/files/student/ds=20110926/00_0: patch does not apply
error: cannot apply binary patch to 'files/student/ds=20110924/00_0' 
without full index line
Falling back to three-way merge...
error: cannot apply binary patch to 'files/student/ds=20110924/00_0' 
without full index line
error: files/student/ds=20110924/00_0: patch does not apply
error: cannot apply binary patch to 'files/student/ds=20110925/00_0' 
without full index line
Falling back to three-way merge...
error: cannot apply binary patch to 'files/student/ds=20110925/00_0' 
without full index line
error: files/student/ds=20110925/00_0: patch does not apply
error: cannot apply binary patch to 'files/student/ds=20110926/00_0' 
without full index line
Falling back to three-way merge...
error: cannot apply binary patch to 'files/student/ds=20110926/00_0' 
without full index line
error: files/student/ds=20110926/00_0: patch does not apply
error: src/test/resources/testconfiguration.properties: does not exist in index
error: 

[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected

2018-07-04 Thread Teddy Choi (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532352#comment-16532352
 ] 

Teddy Choi commented on HIVE-20076:
---

The second patch includes test data and a better fix for RecordReaderImpl.



[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected

2018-07-03 Thread Teddy Choi (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532200#comment-16532200
 ] 

Teddy Choi commented on HIVE-20076:
---

Explanation: org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.nextBatch needs 
to update the rowInBatch value even when it uses the fast path. However, I 
still need to check whether this has side effects.



[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected

2018-07-03 Thread Teddy Choi (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532198#comment-16532198
 ] 

Teddy Choi commented on HIVE-20076:
---

I guess that it's allowed. I will make the patch with the test data set I 
have.



[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected

2018-07-03 Thread Teddy Choi (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532197#comment-16532197
 ] 

Teddy Choi commented on HIVE-20076:
---

I have some test data, but I was not sure whether I'm allowed to share it. So 
I will find another data set that is already in Hive. It will make the 
difference clearer. Thanks for the feedback, [~sershe].



[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected

2018-07-03 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16531789#comment-16531789
 ] 

Gopal V commented on HIVE-20076:


Would a ROW__ID print in a .q show the issue?



[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected

2018-07-03 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16531785#comment-16531785
 ] 

Sergey Shelukhin commented on HIVE-20076:
-

What does this do? :)
Is it possible to add a regression test?



[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected

2018-07-03 Thread Teddy Choi (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16531783#comment-16531783
 ] 

Teddy Choi commented on HIVE-20076:
---

[~mmccline], [~sershe], could you review this issue? Thanks.
