[jira] [Commented] (HIVE-20199) Improved filtering performance for a large number of partitions in a single table.
[ https://issues.apache.org/jira/browse/HIVE-20199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16547691#comment-16547691 ]

tartarus commented on HIVE-20199:
---------------------------------

If a table has too many partitions, the JDO query performance is also not very good. Looking forward to more optimization! LGTM

> Improved filtering performance for a large number of partitions in a single table.
> -----------------------------------------------------------------------------------
>
> Key: HIVE-20199
> URL: https://issues.apache.org/jira/browse/HIVE-20199
> Project: Hive
> Issue Type: Improvement
> Components: Metastore
> Affects Versions: 1.2.1
> Reporter: Biao Wu
> Assignee: Biao Wu
> Priority: Major
> Attachments: 021-HIVE-20199.mysql.sql
>
>
> eg:
> {code:sql}
> select * from test where dt = '20180606'
> {code}
> The filter dt = '20180606' is pushed down to MySQL for execution, but when the
> test table contains a large number of partitions, this performs poorly.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20199) Improved filtering performance for a large number of partitions in a single table.
[ https://issues.apache.org/jira/browse/HIVE-20199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16548091#comment-16548091 ]

Vihang Karajgaonkar commented on HIVE-20199:
--------------------------------------------

Thanks for reporting the issue and providing a patch. Can you provide a patch that works for all of the supported databases? To do so, you will need to provide .sql files that follow the naming convention of the *.sql files in the metastore/scripts/upgrade/ directory. You will also have to modify the hive-schema-4.0.0.sql file to add the index above.
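As a rough illustration of why an index of the kind discussed here could help, the sketch below compares the query plan for a partition-value filter before and after adding an index. It is a sketch under stated assumptions, not the contents of the attached 021-HIVE-20199.mysql.sql: it uses SQLite through Python's sqlite3 module instead of MySQL, a simplified stand-in for the metastore's PARTITION_KEY_VALS table, synthetic partition values, and a hypothetical index name (IDX_PKV_VAL).

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the 'detail' strings of SQLite's EXPLAIN QUERY PLAN for sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable description is the fourth column of each plan row.
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")

# Simplified stand-in for the metastore table that stores partition key values.
conn.execute(
    "CREATE TABLE PARTITION_KEY_VALS ("
    " PART_ID INTEGER, PART_KEY_VAL TEXT, INTEGER_IDX INTEGER)"
)

# 10,000 synthetic partitions; the values are not all valid dates.
conn.executemany(
    "INSERT INTO PARTITION_KEY_VALS VALUES (?, ?, 0)",
    [(i, str(20180101 + i)) for i in range(10000)],
)

# Rough equivalent of the pushed-down filter dt = '20180606'.
FILTER = "SELECT PART_ID FROM PARTITION_KEY_VALS WHERE PART_KEY_VAL = ?"

plan_before = query_plan(conn, FILTER, ("20180606",))

# Hypothetical index; the name and column choice are assumptions.
conn.execute("CREATE INDEX IDX_PKV_VAL ON PARTITION_KEY_VALS (PART_KEY_VAL)")

plan_after = query_plan(conn, FILTER, ("20180606",))
matches = conn.execute(FILTER, ("20180606",)).fetchall()

print("before:", plan_before)
print("after: ", plan_after)
print("matching PART_IDs:", matches)
```

Without the index the plan is a scan of the whole table; with it, SQLite reports a search that uses IDX_PKV_VAL. MySQL's planner and the real metastore schema differ, so this only shows the general effect.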
[jira] [Commented] (HIVE-20199) Improved filtering performance for a large number of partitions in a single table.
[ https://issues.apache.org/jira/browse/HIVE-20199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16548523#comment-16548523 ]

Hive QA commented on HIVE-20199:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12932064/021-HIVE-20199.mysql.sql

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12680/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12680/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12680/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-07-18 22:49:05.294
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-12680/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-07-18 22:49:05.297
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at da1f758 HIVE-20172 : StatsUpdater failed with GSS Exception while trying to connect to remote metastore (Rajkumar Singh via Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at da1f758 HIVE-20172 : StatsUpdater failed with GSS Exception while trying to connect to remote metastore (Rajkumar Singh via Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-07-18 22:49:06.735
+ rm -rf ../yetus_PreCommit-HIVE-Build-12680
+ mkdir ../yetus_PreCommit-HIVE-Build-12680
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-12680
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-12680/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
fatal: unrecognized input
fatal: unrecognized input
fatal: unrecognized input
The patch does not appear to apply with p0, p1, or p2
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-12680
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12932064 - PreCommit-HIVE-Build