[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected
[ https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534099#comment-16534099 ]

Gopal V commented on HIVE-20076:

[~teddy.choi]: I added a test to TestVectorizedORCReader that checks, in a loop, that the row numbering is sequential and continuous; it replaces the qtest version. The patch also resets the batch within the fast path.

> Delete on a partitioned table removes more rows than expected
> -------------------------------------------------------------
>
> Key: HIVE-20076
> URL: https://issues.apache.org/jira/browse/HIVE-20076
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Reporter: Teddy Choi
> Assignee: Teddy Choi
> Priority: Major
> Attachments: HIVE-20076.2.patch, HIVE-20076.3.patch, HIVE-20076.patch
>
>
> Delete on a partitioned table removes more rows than expected

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
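A loop-based check of the kind described in this comment might look like the sketch below. It is not the test that was added to TestVectorizedORCReader: the class `SequentialRowCheck`, the method `firstGap`, and the sample arrays are hypothetical names and values chosen for illustration. The sketch only assumes that, for each batch, the reader's reported starting row number and the batch size are available.

```java
class SequentialRowCheck {
    /**
     * Walks the batches in order and verifies that each batch starts exactly
     * where the previous one ended. Returns the index of the first batch that
     * breaks the sequence, or -1 if the numbering is sequential and continuous.
     */
    static int firstGap(long[] batchStart, int[] batchSize) {
        long expected = 0;
        for (int i = 0; i < batchStart.length; i++) {
            if (batchStart[i] != expected) {
                return i;
            }
            expected += batchSize[i];
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] sizes = {1024, 1024, 452};
        // Continuous numbering: every batch starts where the last one ended.
        System.out.println(firstGap(new long[] {0, 1024, 2048}, sizes));
        // Stale numbering: the second batch claims to start at row 0 again.
        System.out.println(firstGap(new long[] {0, 0, 0}, sizes));
    }
}
```

Returning the index of the first offending batch, instead of a plain boolean, makes a failing run point directly at the batch boundary where the numbering went wrong.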
[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected
[ https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534007#comment-16534007 ]

Gopal V commented on HIVE-20076:

The ORDER BY in the queries makes it very hard to tell whether the bug is happening or not, because this is about sequential numbering.

{code}
+{"writeid":### Masked writeid ###,"bucketid":536870912,"rowid":513} jessica garcia 59 3.52 20110926
+{"writeid":### Masked writeid ###,"bucketid":536870912,"rowid":6633} jessica garcia 59 3.69 20110925
{code}

I think the fix to getRowNumber() is necessary, which is

{code}
 if (rowInBatch >= batch.size) {
+  baseRow = super.getRowNumber();
+  rowInBatch = 0;
   return super.nextBatch(theirBatch);
 }
{code}

This has a side effect of reading batch.size again the next time around (if batch.size != 0, then the first batch will be repeated between every fast-path batch). Ideally, at that point it should reset the batch if batch.size is > 0 (the invariant is that it has already been consumed by rowInBatch).
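To make the effect of the two added lines concrete, here is a minimal, self-contained Java sketch. It is a toy model and not the Hive code: `BaseReader`, `WrappingReader`, the `applyFix` flag, and the 2500-row file with 1024-row batches are all assumptions made for illustration. Only the names `baseRow`, `rowInBatch`, `getRowNumber()` and `nextBatch()` come from the snippet above. The sketch also assumes that the caller only ever uses the fast path and reads `getRowNumber()` after each `nextBatch()` call.

```java
import java.util.ArrayList;
import java.util.List;

class RowNumberSketch {
    /** Stand-in for the parent reader: hands out batches and tracks its own position. */
    static class BaseReader {
        private final long totalRows;
        private final int batchSize;
        private long position = 0;

        BaseReader(long totalRows, int batchSize) {
            this.totalRows = totalRows;
            this.batchSize = batchSize;
        }

        long getRowNumber() {
            return position;
        }

        /** Returns the size of the next batch, or 0 at end of data. */
        int nextBatch() {
            int n = (int) Math.min(batchSize, totalRows - position);
            position += n;
            return n;
        }
    }

    /** Stand-in for the wrapping reader that reports baseRow + rowInBatch. */
    static class WrappingReader extends BaseReader {
        private final boolean applyFix;
        private final int localBatchSize = 0; // the private batch is never filled on the fast path
        private long baseRow = 0;
        private int rowInBatch = 0;

        WrappingReader(long totalRows, int batchSize, boolean applyFix) {
            super(totalRows, batchSize);
            this.applyFix = applyFix;
        }

        @Override
        long getRowNumber() {
            return baseRow + rowInBatch;
        }

        @Override
        int nextBatch() {
            if (rowInBatch >= localBatchSize) {
                if (applyFix) {
                    // The two added lines: remember where this batch starts.
                    baseRow = super.getRowNumber();
                    rowInBatch = 0;
                }
                return super.nextBatch();
            }
            throw new IllegalStateException("slow path is not modelled in this sketch");
        }
    }

    /** Collects the row number reported after each fast-path batch. */
    static List<Long> firstRowOfEachBatch(boolean applyFix) {
        WrappingReader reader = new WrappingReader(2500, 1024, applyFix);
        List<Long> starts = new ArrayList<>();
        while (reader.nextBatch() > 0) {
            starts.add(reader.getRowNumber());
        }
        return starts;
    }

    public static void main(String[] args) {
        System.out.println(firstRowOfEachBatch(true));  // prints [0, 1024, 2048]
        System.out.println(firstRowOfEachBatch(false)); // prints [0, 0, 0]
    }
}
```

In this model the unfixed variant reports row 0 for every batch, so rows in different batches would receive the same row numbers. That is consistent with a delete matching more rows than intended, although the sketch does not model the delete path itself.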
[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected
[ https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533986#comment-16533986 ]

Eugene Koifman commented on HIVE-20076:

[~teddy.choi] the patch didn't apply. Also, how long does it take to run these tests? The q tests include multiple queries producing 30K rows each, all of which are recorded in the .q.out file.
[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected
[ https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533313#comment-16533313 ]

Hive QA commented on HIVE-20076:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12930244/HIVE-20076.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12395/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12395/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12395/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12930244/HIVE-20076.2.patch was found in seen patch url's cache and a test was probably run already on it. Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12930244 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected
[ https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533095#comment-16533095 ]

Hive QA commented on HIVE-20076:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12930244/HIVE-20076.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12382/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12382/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12382/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-07-04 23:07:42.568
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-12382/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-07-04 23:07:42.572
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   bb35d83..d7128cf  branch-3 -> origin/branch-3
+ git reset --hard HEAD
HEAD is now at 5e2a530 HIVE-20066 : hive.load.data.owner is compared to full principal (Daniel Voros via Zoltan Haindrich)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 5e2a530 HIVE-20066 : hive.load.data.owner is compared to full principal (Daniel Voros via Zoltan Haindrich)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-07-04 23:07:44.249
+ rm -rf ../yetus_PreCommit-HIVE-Build-12382
+ mkdir ../yetus_PreCommit-HIVE-Build-12382
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-12382
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-12382/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
error: cannot apply binary patch to 'data/files/student/ds=20110924/00_0' without full index line
Falling back to three-way merge...
error: cannot apply binary patch to 'data/files/student/ds=20110924/00_0' without full index line
error: data/files/student/ds=20110924/00_0: patch does not apply
error: cannot apply binary patch to 'data/files/student/ds=20110925/00_0' without full index line
Falling back to three-way merge...
error: cannot apply binary patch to 'data/files/student/ds=20110925/00_0' without full index line
error: data/files/student/ds=20110925/00_0: patch does not apply
error: cannot apply binary patch to 'data/files/student/ds=20110926/00_0' without full index line
Falling back to three-way merge...
error: cannot apply binary patch to 'data/files/student/ds=20110926/00_0' without full index line
error: data/files/student/ds=20110926/00_0: patch does not apply
error: cannot apply binary patch to 'files/student/ds=20110924/00_0' without full index line
Falling back to three-way merge...
error: cannot apply binary patch to 'files/student/ds=20110924/00_0' without full index line
error: files/student/ds=20110924/00_0: patch does not apply
error: cannot apply binary patch to 'files/student/ds=20110925/00_0' without full index line
Falling back to three-way merge...
error: cannot apply binary patch to 'files/student/ds=20110925/00_0' without full index line
error: files/student/ds=20110925/00_0: patch does not apply
error: cannot apply binary patch to 'files/student/ds=20110926/00_0' without full index line
Falling back to three-way merge...
error: cannot apply binary patch to 'files/student/ds=20110926/00_0' without full index line
error: files/student/ds=20110926/00_0: patch does not apply
error: src/test/resources/testconfiguration.properties: does not exist in index
error:
[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected
[ https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532352#comment-16532352 ]

Teddy Choi commented on HIVE-20076:

The second patch includes test data and a better fix for RecordReaderImpl.
[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected
[ https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532200#comment-16532200 ]

Teddy Choi commented on HIVE-20076:

Explanation: org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.nextBatch needs to update the rowInBatch value even when it uses the fast path. However, I still need to check whether this change has side effects.
[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected
[ https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532198#comment-16532198 ]

Teddy Choi commented on HIVE-20076:

I guess that it's allowed. I will make the patch with the test data set I have.
[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected
[ https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532197#comment-16532197 ]

Teddy Choi commented on HIVE-20076:

I have some test data, but I was not sure whether I am allowed to share it. So I will find another data set that is already in Hive. It will make the difference clearer. Thanks for the feedback, [~sershe].
[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected
[ https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16531789#comment-16531789 ]

Gopal V commented on HIVE-20076:

Would a ROW__ID print in a .q show the issue?
[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected
[ https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16531785#comment-16531785 ]

Sergey Shelukhin commented on HIVE-20076:

What does this do? :) Is it possible to add a regression test?
[jira] [Commented] (HIVE-20076) Delete on a partitioned table removes more rows than expected
[ https://issues.apache.org/jira/browse/HIVE-20076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16531783#comment-16531783 ]

Teddy Choi commented on HIVE-20076:

[~mmccline], [~sershe], could you review this issue? Thanks.