[jira] [Updated] (HIVE-20041) ResultsCache: Improve logging for concurrent queries
[ https://issues.apache.org/jira/browse/HIVE-20041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gopal V updated HIVE-20041:

    Resolution: Fixed
    Fix Version/s: 4.0.0
    Status: Resolved (was: Patch Available)

> ResultsCache: Improve logging for concurrent queries
>
> Key: HIVE-20041
> URL: https://issues.apache.org/jira/browse/HIVE-20041
> Project: Hive
> Issue Type: Improvement
> Components: Diagnosability
> Reporter: Gopal V
> Assignee: Laszlo Bodor
> Priority: Minor
> Labels: Branch3Candidate
> Fix For: 4.0.0
> Attachments: HIVE-20041.01.patch, HIVE-20041.02.patch, HIVE-20041.03.patch, HIVE-20041.04.patch, HIVE-20041.05.patch
>
> The logging for QueryResultsCache ends up printing information without context, like
> {code}
> 2018-06-30T17:48:45,502 INFO [HiveServer2-Background-Pool: Thread-166] results.QueryResultsCache: Waiting on pending cacheEntry
> {code}
> {code}
> 2018-06-30T17:50:17,963 INFO [HiveServer2-Background-Pool: Thread-145] ql.Driver: savedToCache: true
> {code}
> The previous lines for this are in DEBUG level, so the logging ends up being useless at INFO level to debug.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
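One way to make INFO lines like the ones quoted in HIVE-20041 self-describing is to attach the query identifier to every message. The following is only a minimal sketch of that idea, not the change committed for the issue: the class name, the `withContext` helper, and the query id value are all hypothetical, and it uses `java.util.logging` rather than Hive's own logging setup.

```java
import java.util.logging.Logger;

public class CacheLogContextSketch {
    // Logger name borrowed from the log excerpt; the real Hive logger is configured differently.
    private static final Logger LOG = Logger.getLogger("results.QueryResultsCache");

    // Hypothetical helper: prefix a message with the query id so that an INFO
    // line can be attributed to a query without the surrounding DEBUG lines.
    static String withContext(String queryId, String message) {
        return "[queryId=" + queryId + "] " + message;
    }

    public static void main(String[] args) {
        String queryId = "example_query_id"; // made-up value for illustration
        LOG.info(withContext(queryId, "Waiting on pending cacheEntry"));
        LOG.info(withContext(queryId, "savedToCache: true"));
    }
}
```

With concurrent queries, each line then carries enough context to be matched to its query even when lines from different threads interleave.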
[jira] [Assigned] (HIVE-20165) Enable ZLIB for streaming ingest
[ https://issues.apache.org/jira/browse/HIVE-20165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran reassigned HIVE-20165:

> Enable ZLIB for streaming ingest
>
> Key: HIVE-20165
> URL: https://issues.apache.org/jira/browse/HIVE-20165
> Project: Hive
> Issue Type: Bug
> Components: Streaming, Transactions
> Affects Versions: 4.0.0, 3.2.0
> Reporter: Prasanth Jayachandran
> Assignee: Prasanth Jayachandran
> Priority: Major
>
> Per [~gopalv]'s recommendation, I tried running streaming ingest with and without zlib. The numbers follow.
>
> *Compression: NONE*
> Total rows committed: 93800000
> Throughput: *1563333* rows/second
> [prasanth@cn105-10 culvert]$ hdfs dfs -du -s -h /apps/hive/warehouse/prasanth.db/culvert
> *14.1 G* /apps/hive/warehouse/prasanth.db/culvert
>
> *Compression: ZLIB*
> Total rows committed: 92100000
> Throughput: *1535000* rows/second
> [prasanth@cn105-10 culvert]$ hdfs dfs -du -s -h /apps/hive/warehouse/prasanth.db/culvert
> *7.4 G* /apps/hive/warehouse/prasanth.db/culvert
>
> ZLIB is getting us 2x compression and only 2% lower throughput. We should enable ZLIB by default for streaming ingest.
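The "2x compression, 2% throughput" trade-off in HIVE-20165 can be checked with a few lines of arithmetic. This is a small sketch under stated assumptions: the sizes (14.1 G and 7.4 G) and the ZLIB throughput (1535000 rows/second) come from the report, while the uncompressed baseline of 1563333 rows/second is an assumed value chosen to be consistent with the stated 2% difference. The class and method names are hypothetical.

```java
import java.util.Locale;

public class ZlibTradeoffSketch {
    // Ratio of uncompressed size to compressed size.
    static double compressionRatio(double uncompressed, double compressed) {
        return uncompressed / compressed;
    }

    // Relative throughput loss, as a percentage of the baseline.
    static double percentDrop(double baseline, double measured) {
        return (baseline - measured) / baseline * 100.0;
    }

    public static void main(String[] args) {
        double ratio = compressionRatio(14.1, 7.4);        // sizes in GB from the report
        double drop = percentDrop(1563333.0, 1535000.0);   // baseline is an assumption
        System.out.println(String.format(Locale.ROOT,
            "compression ratio: %.2fx, throughput drop: %.2f%%", ratio, drop));
    }
}
```

Under these inputs the ratio comes out slightly below 2x and the drop slightly below 2%, which matches the rounded figures in the report.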
[jira] [Commented] (HIVE-20090) Extend creation of semijoin reduction filters to be able to discover new opportunities
[ https://issues.apache.org/jira/browse/HIVE-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542604#comment-16542604 ] Hive QA commented on HIVE-20090: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 4s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 42s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 56s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 31s{color} | {color:blue} common in master has 64 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 3s{color} | {color:blue} ql in master has 2289 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 27s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 40s{color} | {color:red} ql: The patch generated 17 new + 35 unchanged - 9 fixed = 52 total (was 44) {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 4m 18s{color} | {color:red} ql generated 2 new + 2289 unchanged - 0 fixed = 2291 total (was 2289) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 27m 43s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:ql | | | Should org.apache.hadoop.hive.ql.parse.TezCompiler$RedundantSemijoinAndDppContext be a _static_ inner class? At TezCompiler.java:inner class? 
At TezCompiler.java:[lines 1222-1229] | | | Should org.apache.hadoop.hive.ql.parse.TezCompiler$SemiJoinRemovalProc be a _static_ inner class? At TezCompiler.java:inner class? At TezCompiler.java:[lines 1020-1132] | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12574/dev-support/hive-personality.sh | | git revision | master / 20eb7b5 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12574/yetus/diff-checkstyle-ql.txt | | whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-12574/yetus/whitespace-eol.txt | | findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-12574/yetus/new-findbugs-ql.html | | modules | C: common ql itests U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12574/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Extend creation of semijoin reduction filters to be able to discover new > opportunities > -- > > Key: HIVE-20090 > URL: https://issues.apache.org/jira/browse/HIVE-20090 > Project: Hive > Issue
[jira] [Commented] (HIVE-19829) Incremental replication load should create tasks in execution phase rather than semantic phase
[ https://issues.apache.org/jira/browse/HIVE-19829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542554#comment-16542554 ] Hive QA commented on HIVE-19829: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12931333/HIVE-19829.11-branch-3.patch {color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 96 failed/errored test(s), 14400 tests executed *Failed tests:* {noformat} TestAddPartitions - did not produce a TEST-*.xml file (likely timed out) (batchId=213) TestAddPartitionsFromPartSpec - did not produce a TEST-*.xml file (likely timed out) (batchId=215) TestAdminUser - did not produce a TEST-*.xml file (likely timed out) (batchId=221) TestAggregateStatsCache - did not produce a TEST-*.xml file (likely timed out) (batchId=215) TestAlterPartitions - did not produce a TEST-*.xml file (likely timed out) (batchId=215) TestAppendPartitions - did not produce a TEST-*.xml file (likely timed out) (batchId=215) TestBeeLineDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=271) TestCachedStore - did not produce a TEST-*.xml file (likely timed out) (batchId=221) TestCatalogCaching - did not produce a TEST-*.xml file (likely timed out) (batchId=221) TestCatalogNonDefaultClient - did not produce a TEST-*.xml file (likely timed out) (batchId=213) TestCatalogNonDefaultSvr - did not produce a TEST-*.xml file (likely timed out) (batchId=221) TestCatalogOldClient - did not produce a TEST-*.xml file (likely timed out) (batchId=213) TestCatalogs - did not produce a TEST-*.xml file (likely timed out) (batchId=215) TestCheckConstraint - did not produce a TEST-*.xml file (likely timed out) (batchId=213) TestDataSourceProviderFactory - did not produce a TEST-*.xml file (likely timed out) (batchId=223) TestDatabases - did not produce a TEST-*.xml file (likely timed out) (batchId=215) TestDeadline - did not produce a 
TEST-*.xml file (likely timed out) (batchId=221) TestDefaultConstraint - did not produce a TEST-*.xml file (likely timed out) (batchId=215) TestDropPartitions - did not produce a TEST-*.xml file (likely timed out) (batchId=213) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=271) TestEmbeddedHiveMetaStore - did not produce a TEST-*.xml file (likely timed out) (batchId=216) TestExchangePartitions - did not produce a TEST-*.xml file (likely timed out) (batchId=215) TestFMSketchSerialization - did not produce a TEST-*.xml file (likely timed out) (batchId=223) TestFilterHooks - did not produce a TEST-*.xml file (likely timed out) (batchId=213) TestForeignKey - did not produce a TEST-*.xml file (likely timed out) (batchId=215) TestFunctions - did not produce a TEST-*.xml file (likely timed out) (batchId=213) TestGetPartitions - did not produce a TEST-*.xml file (likely timed out) (batchId=213) TestGetTableMeta - did not produce a TEST-*.xml file (likely timed out) (batchId=213) TestHLLNoBias - did not produce a TEST-*.xml file (likely timed out) (batchId=223) TestHLLSerialization - did not produce a TEST-*.xml file (likely timed out) (batchId=223) TestHdfsUtils - did not produce a TEST-*.xml file (likely timed out) (batchId=221) TestHiveAlterHandler - did not produce a TEST-*.xml file (likely timed out) (batchId=213) TestHiveMetaStoreGetMetaConf - did not produce a TEST-*.xml file (likely timed out) (batchId=221) TestHiveMetaStorePartitionSpecs - did not produce a TEST-*.xml file (likely timed out) (batchId=215) TestHiveMetaStoreSchemaMethods - did not produce a TEST-*.xml file (likely timed out) (batchId=221) TestHiveMetaStoreTimeout - did not produce a TEST-*.xml file (likely timed out) (batchId=223) TestHiveMetaStoreTxns - did not produce a TEST-*.xml file (likely timed out) (batchId=223) TestHiveMetaStoreWithEnvironmentContext - did not produce a TEST-*.xml file (likely timed out) (batchId=218) TestHiveMetastoreCli - did not produce a 
TEST-*.xml file (likely timed out) (batchId=213) TestHyperLogLog - did not produce a TEST-*.xml file (likely timed out) (batchId=223) TestHyperLogLogDense - did not produce a TEST-*.xml file (likely timed out) (batchId=223) TestHyperLogLogMerge - did not produce a TEST-*.xml file (likely timed out) (batchId=223) TestHyperLogLogSparse - did not produce a TEST-*.xml file (likely timed out) (batchId=223) TestJSONMessageDeserializer - did not produce a TEST-*.xml file (likely timed out) (batchId=221) TestListPartitions - did not produce a TEST-*.xml file (likely timed out) (batchId=213) TestLockRequestBuilder - did not produce a TEST-*.xml file (likely timed out) (batchId=213) TestMarkPartition - did not produce a TEST-*.xml file (likely timed out) (batchId=221) TestMarkPartitionRemote - did not produce a TEST-*.xml file (likely timed out) (batchId=223) TestMetaStoreConnectionUrlHook - did not produce a TES
[jira] [Commented] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+
[ https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542550#comment-16542550 ]

Gopal V commented on HIVE-20153:

From a quick look, it looks like they are hashmaps with 0 items.

{code}
@Override
public void reset(AggregationBuffer agg) throws HiveException {
  ((CountAgg) agg).value = 0;
  ((CountAgg) agg).uniqueObjects = new HashSet();
}
{code}

> Count and Sum UDF consume more memory in Hive 2+
>
> Key: HIVE-20153
> URL: https://issues.apache.org/jira/browse/HIVE-20153
> Project: Hive
> Issue Type: Bug
> Components: UDF
> Affects Versions: 2.3.2
> Reporter: Szehon Ho
> Assignee: Aihua Xu
> Priority: Major
> Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png
>
> While playing with Hive2, we noticed that queries with a lot of count() and sum() aggregations run out of memory on the Hadoop side, where they worked before in Hive1.
> In many queries, we have to double the Mapper Memory settings (in our particular case mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M), which makes it not so easy to upgrade to Hive 2.
> Taking a heap dump, we see that one of the main culprits is the field 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to support Window functions.
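The memory cost discussed in HIVE-20153 comes from allocating a set for every aggregation buffer, even when no distinct tracking happens. One common mitigation is to allocate the set lazily. The sketch below illustrates that general technique only; it is not the fix committed for this issue, and `CountBuffer` is a hypothetical stand-in, not Hive's `CountAgg`.

```java
import java.util.HashSet;
import java.util.Set;

public class LazyUniqueSetSketch {
    // Hypothetical aggregation buffer; field names mirror the snippet above for readability.
    static class CountBuffer {
        long value;
        Set<Object> uniqueObjects; // stays null until a distinct value must be tracked

        // Resetting drops the set instead of allocating a fresh empty one.
        void reset() {
            value = 0;
            uniqueObjects = null;
        }

        // Counts the input; in distinct mode, duplicates are skipped.
        void add(Object input, boolean distinct) {
            if (distinct) {
                if (uniqueObjects == null) {
                    uniqueObjects = new HashSet<>();
                }
                if (!uniqueObjects.add(input)) {
                    return; // already seen, do not count again
                }
            }
            value++;
        }
    }

    public static void main(String[] args) {
        CountBuffer plain = new CountBuffer();
        plain.add("a", false);
        plain.add("a", false);
        System.out.println("plain count=" + plain.value
            + ", set allocated=" + (plain.uniqueObjects != null));

        CountBuffer distinct = new CountBuffer();
        distinct.add("a", true);
        distinct.add("a", true);
        distinct.add("b", true);
        System.out.println("distinct count=" + distinct.value);
    }
}
```

The design choice is that the common non-distinct path never pays for a set it does not use, at the cost of one null check per distinct insert.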
[jira] [Updated] (HIVE-19820) add ACID stats support to background stats updater and fix bunch of edge cases found in SU tests
[ https://issues.apache.org/jira/browse/HIVE-19820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-19820:

    Attachment: HIVE-19820.03.patch

> add ACID stats support to background stats updater and fix bunch of edge cases found in SU tests
>
> Key: HIVE-19820
> URL: https://issues.apache.org/jira/browse/HIVE-19820
> Project: Hive
> Issue Type: Sub-task
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Priority: Major
> Attachments: HIVE-19820.01-master-txnstats.patch, HIVE-19820.01.patch, HIVE-19820.02-master-txnstats.patch, HIVE-19820.03-master-txnstats.patch, HIVE-19820.03.patch, HIVE-19820.04-master-txnstats.patch, HIVE-19820.patch, branch-19820.02.nogen.patch, branch-19820.03.nogen.patch, branch-19820.nogen.patch, branch-19820.nogen.patch
>
> Follow-up from HIVE-19418.
> Right now it checks whether stats are valid in an old-fashioned way... and also gets ACID state, and discards it without using.
> When ACID stats are implemented, ACID state needs to be used to do version-aware valid stats checks.
[jira] [Updated] (HIVE-19820) add ACID stats support to background stats updater and fix bunch of edge cases found in SU tests
[ https://issues.apache.org/jira/browse/HIVE-19820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-19820:

    Attachment: (was: HIVE-19820.03.patch)

> add ACID stats support to background stats updater and fix bunch of edge cases found in SU tests
>
> Key: HIVE-19820
> URL: https://issues.apache.org/jira/browse/HIVE-19820
> Project: Hive
> Issue Type: Sub-task
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Priority: Major
> Attachments: HIVE-19820.01-master-txnstats.patch, HIVE-19820.01.patch, HIVE-19820.02-master-txnstats.patch, HIVE-19820.03-master-txnstats.patch, HIVE-19820.03.patch, HIVE-19820.04-master-txnstats.patch, HIVE-19820.patch, branch-19820.02.nogen.patch, branch-19820.03.nogen.patch, branch-19820.nogen.patch, branch-19820.nogen.patch
>
> Follow-up from HIVE-19418.
> Right now it checks whether stats are valid in an old-fashioned way... and also gets ACID state, and discards it without using.
> When ACID stats are implemented, ACID state needs to be used to do version-aware valid stats checks.
[jira] [Commented] (HIVE-19820) add ACID stats support to background stats updater and fix bunch of edge cases found in SU tests
[ https://issues.apache.org/jira/browse/HIVE-19820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542523#comment-16542523 ]

Sergey Shelukhin commented on HIVE-19820:

Fixed the directsql issue that affects partitioned views.

> add ACID stats support to background stats updater and fix bunch of edge cases found in SU tests
>
> Key: HIVE-19820
> URL: https://issues.apache.org/jira/browse/HIVE-19820
> Project: Hive
> Issue Type: Sub-task
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Priority: Major
> Attachments: HIVE-19820.01-master-txnstats.patch, HIVE-19820.01.patch, HIVE-19820.02-master-txnstats.patch, HIVE-19820.03-master-txnstats.patch, HIVE-19820.03.patch, HIVE-19820.04-master-txnstats.patch, HIVE-19820.patch, branch-19820.02.nogen.patch, branch-19820.03.nogen.patch, branch-19820.nogen.patch, branch-19820.nogen.patch
>
> Follow-up from HIVE-19418.
> Right now it checks whether stats are valid in an old-fashioned way... and also gets ACID state, and discards it without using.
> When ACID stats are implemented, ACID state needs to be used to do version-aware valid stats checks.
[jira] [Commented] (HIVE-19829) Incremental replication load should create tasks in execution phase rather than semantic phase
[ https://issues.apache.org/jira/browse/HIVE-19829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542510#comment-16542510 ]

Hive QA commented on HIVE-19829:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 13s{color} | {color:red} /data/hiveptest/logs/PreCommit-HIVE-Build-12573/patches/PreCommit-HIVE-Build-12573.patch does not apply to master. Rebase required? Wrong Branch? See http://cwiki.apache.org/confluence/display/Hive/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12573/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> Incremental replication load should create tasks in execution phase rather than semantic phase
>
> Key: HIVE-19829
> URL: https://issues.apache.org/jira/browse/HIVE-19829
> Project: Hive
> Issue Type: Task
> Components: repl
> Affects Versions: 3.1.0, 4.0.0
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
> Attachments: HIVE-19829.01.patch, HIVE-19829.02.patch, HIVE-19829.03.patch, HIVE-19829.04.patch, HIVE-19829.06.patch, HIVE-19829.07.patch, HIVE-19829.07.patch, HIVE-19829.08-branch-3.patch, HIVE-19829.08.patch, HIVE-19829.09.patch, HIVE-19829.10-branch-3.patch, HIVE-19829.10.patch, HIVE-19829.11-branch-3.patch
>
> Split the incremental load into multiple iterations. In each iteration create number of tasks equal to the configured value.
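The iteration scheme described in HIVE-19829 (split the load into iterations, each creating at most a configured number of tasks) can be sketched generically as list batching. This is an illustration of the idea only, not Hive's replication code: the class name, the `iterations` method, and the `maxTasksPerIteration` parameter are hypothetical, and the integers stand in for replication events.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IncrementalLoadBatchSketch {
    // Splits pending work items into consecutive iterations of at most
    // maxTasksPerIteration items each; the last iteration may be smaller.
    static <T> List<List<T>> iterations(List<T> pending, int maxTasksPerIteration) {
        if (maxTasksPerIteration <= 0) {
            throw new IllegalArgumentException("maxTasksPerIteration must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int start = 0; start < pending.size(); start += maxTasksPerIteration) {
            int end = Math.min(start + maxTasksPerIteration, pending.size());
            result.add(new ArrayList<>(pending.subList(start, end)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> events = Arrays.asList(1, 2, 3, 4, 5); // placeholder event ids
        List<List<Integer>> batches = iterations(events, 2);
        for (int i = 0; i < batches.size(); i++) {
            System.out.println("iteration " + i + ": " + batches.get(i));
        }
    }
}
```

Creating tasks one iteration at a time bounds how many task objects exist at once, which is the motivation for moving task creation out of a single up-front phase.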
[jira] [Commented] (HIVE-20135) Fix incompatible change in TimestampColumnVector to default to UTC
[ https://issues.apache.org/jira/browse/HIVE-20135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542509#comment-16542509 ] Hive QA commented on HIVE-20135: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12931330/HIVE-20135.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 14650 tests executed *Failed tests:* {noformat} org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=248) org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty (batchId=248) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=250) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12572/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12572/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12572/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12931330 - PreCommit-HIVE-Build > Fix incompatible change in TimestampColumnVector to default to UTC > -- > > Key: HIVE-20135 > URL: https://issues.apache.org/jira/browse/HIVE-20135 > Project: Hive > Issue Type: Improvement >Reporter: Owen O'Malley >Assignee: Jesus Camacho Rodriguez >Priority: Blocker > Fix For: 3.1.0, 4.0.0, storage-2.7.0 > > Attachments: HIVE-20135.01.patch, HIVE-20135.02.patch, > HIVE-20135.patch > > > HIVE-20007 changed the default for TimestampColumnVector to be to use UTC, > which breaks the API compatibility with storage-api 2.6. 
[jira] [Updated] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.
[ https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junjie Chen updated HIVE-17593:

    Attachment: HIVE-17593.4.patch

> DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
> Issue Type: Bug
> Affects Versions: 2.3.0, 3.0.0
> Reporter: Junjie Chen
> Assignee: Junjie Chen
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-17593.2.patch, HIVE-17593.3.patch, HIVE-17593.4.patch, HIVE-17593.patch
>
> DataWritableWriter strips spaces for the CHAR type before writing. However, when generating the predicate, it does NOT do the same stripping, which should cause missing data!
> In the current version, it doesn't cause missing data, since the predicate is not properly pushed down to Parquet due to HIVE-17261.
> Please see ConvertAstToSearchArg.java: getTypes treats CHAR and STRING as the same, which will build a predicate with trailing spaces.
[jira] [Commented] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.
[ https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542455#comment-16542455 ]

Junjie Chen commented on HIVE-17593:

The previous unit test failure (vectorized_parquet_types.q) occurs because different length UDFs are used for CHAR.

In non-vectorized mode, GenericUDFLength calculates the column length; it converts the primitive value to a string using PrimitiveObjectInspectorUtils.getString, which ignores trailing spaces for the CHAR type.

In vectorized mode, however, StringLength calculates the column length; it treats the column as a byte array and does not consider the column type.

> DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
> Issue Type: Bug
> Affects Versions: 2.3.0, 3.0.0
> Reporter: Junjie Chen
> Assignee: Junjie Chen
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-17593.2.patch, HIVE-17593.3.patch, HIVE-17593.patch
>
> DataWritableWriter strips spaces for the CHAR type before writing. However, when generating the predicate, it does NOT do the same stripping, which should cause missing data!
> In the current version, it doesn't cause missing data, since the predicate is not properly pushed down to Parquet due to HIVE-17261.
> Please see ConvertAstToSearchArg.java: getTypes treats CHAR and STRING as the same, which will build a predicate with trailing spaces.
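The mismatch described in the comment on HIVE-17593 can be reproduced in miniature: one length routine ignores CHAR padding, the other counts characters in the raw bytes without knowing the type. The sketch below only mimics the two behaviours as described; it does not call or reproduce Hive's GenericUDFLength or StringLength, and both method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class CharLengthSketch {
    // Mimics a type-aware length: trailing spaces (CHAR padding) are ignored.
    static int lengthIgnoringPadding(String value) {
        int end = value.length();
        while (end > 0 && value.charAt(end - 1) == ' ') {
            end--;
        }
        return value.codePointCount(0, end);
    }

    // Mimics a type-unaware length: counts UTF-8 characters in the raw bytes,
    // padding included, by skipping continuation bytes (10xxxxxx).
    static int lengthOfRawBytes(byte[] utf8) {
        int count = 0;
        for (byte b : utf8) {
            if ((b & 0xC0) != 0x80) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String padded = "abc   "; // "abc" stored in a CHAR(6) column, padded with spaces
        System.out.println("type-aware length:   " + lengthIgnoringPadding(padded));
        System.out.println("type-unaware length: "
            + lengthOfRawBytes(padded.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Any test that compares results across the two execution modes will see different lengths for padded CHAR values unless both paths agree on how padding is treated.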
[jira] [Commented] (HIVE-20135) Fix incompatible change in TimestampColumnVector to default to UTC
[ https://issues.apache.org/jira/browse/HIVE-20135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542450#comment-16542450 ] Hive QA commented on HIVE-20135: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 26s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 11s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 25s{color} | {color:blue} storage-api in master has 48 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 11m 11s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12572/dev-support/hive-personality.sh | | git revision | master / 20eb7b5 | | Default Java | 1.8.0_111 | | findbugs | v3.0.1 | | modules | C: storage-api U: storage-api | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12572/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Fix incompatible change in TimestampColumnVector to default to UTC > -- > > Key: HIVE-20135 > URL: https://issues.apache.org/jira/browse/HIVE-20135 > Project: Hive > Issue Type: Improvement >Reporter: Owen O'Malley >Assignee: Jesus Camacho Rodriguez >Priority: Blocker > Fix For: 3.1.0, 4.0.0, storage-2.7.0 > > Attachments: HIVE-20135.01.patch, HIVE-20135.02.patch, > HIVE-20135.patch > > > HIVE-20007 changed the default for TimestampColumnVector to be to use UTC, > which breaks the API compatibility with storage-api 2.6. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20006) Make materializations invalidation cache work with multiple active remote metastores
[ https://issues.apache.org/jira/browse/HIVE-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542443#comment-16542443 ] Hive QA commented on HIVE-20006: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12931331/HIVE-20006.06.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12571/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12571/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12571/ Messages: {noformat} This message was trimmed, see log for full details Removing standalone-metastore/src/ + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 20eb7b5 HIVE-20097 : Convert standalone-metastore to a submodule (Alexander Kolbasov reviewed by Vihang Karajgaonkar) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2018-07-13 02:54:18.459 + rm -rf ../yetus_PreCommit-HIVE-Build-12571 + mkdir ../yetus_PreCommit-HIVE-Build-12571 + git gc + cp -R . 
../yetus_PreCommit-HIVE-Build-12571 + mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-12571/yetus + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch error: a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java: does not exist in index error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/MaterializedViewTask.java: does not exist in index error: a/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java: does not exist in index error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java: does not exist in index error: a/ql/src/test/queries/clientpositive/materialized_view_create_rewrite_time_window.q: does not exist in index error: a/ql/src/test/results/clientpositive/druid/druidmini_mv.q.out: does not exist in index error: a/ql/src/test/results/clientpositive/llap/materialized_view_create_rewrite_5.q.out: does not exist in index error: a/ql/src/test/results/clientpositive/llap/materialized_view_create_rewrite_time_window.q.out: does not exist in index error: a/ql/src/test/results/clientpositive/llap/materialized_view_rewrite_empty.q.out: does not exist in index error: a/standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp: does not exist in index error: a/standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h: does not exist in index error: a/standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp: does not exist in index error: a/standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp: does not exist in index error: a/standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h: does not exist in index error: 
a/standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/CreationMetadata.java: does not exist in index error: a/standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/FindSchemasByColsResp.java: does not exist in index error: a/standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Materialization.java: does not exist in index error: a/standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/SchemaVersion.java: does not exist in index error: a/standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java: does not exist in index error: a/standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/WMFullResourcePlan.java: does not exist in index error: a/standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/WMGetAllResourcePlanResponse.java: does not exist in index error: a/standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/WMGetTriggersForResourePlanResponse.java: does not exist in index error: a/standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/WMValidateResourcePlanResponse.java: does not exist in index error: a/standalone-metastore/src/gen/thrift/gen-php/metastore/ThriftHiveMetastore.php: does not exist in index error: a/standalone-metastore/src/gen/thrift/gen-php/metastore/Types.php: does not exist in index error: a/standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote: does not exist in index error: a/standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py: does not exist in index
[jira] [Commented] (HIVE-18705) Improve HiveMetaStoreClient.dropDatabase
[ https://issues.apache.org/jira/browse/HIVE-18705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542441#comment-16542441 ] Hive QA commented on HIVE-18705: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12931322/HIVE-18705.9.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12570/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12570/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12570/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2018-07-13 02:52:52.708 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-12570/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2018-07-13 02:52:52.711 + cd apache-github-source-source + git fetch origin From https://github.com/apache/hive 57dd304..20eb7b5 master -> origin/master 04ea145..93b9cdd master-txnstats -> origin/master-txnstats + git reset --hard HEAD HEAD is now at 57dd304 HIVE-20037: Print root cause exception's toString() rather than getMessage() (Aihua Xu, reviewed by Sahil Takiar) + git clean -f -d + git checkout master Already on 'master' Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded. (use "git pull" to update your local branch) + git reset --hard origin/master HEAD is now at 20eb7b5 HIVE-20097 : Convert standalone-metastore to a submodule (Alexander Kolbasov reviewed by Vihang Karajgaonkar) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2018-07-13 02:52:56.190 + rm -rf ../yetus_PreCommit-HIVE-Build-12570 + mkdir ../yetus_PreCommit-HIVE-Build-12570 + git gc + cp -R . ../yetus_PreCommit-HIVE-Build-12570 + mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-12570/yetus + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch error: standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java: does not exist in index error: standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java: does not exist in index error: src/java/org/apache/hadoop/hive/ql/metadata/TableIterable.java: does not exist in index error: src/test/org/apache/hadoop/hive/ql/metadata/TestTableIterable.java: does not exist in index error: src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java: does not exist in index error:
src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java: does not exist in index error: src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java: does not exist in index error: java/org/apache/hadoop/hive/ql/metadata/TableIterable.java: does not exist in index error: test/org/apache/hadoop/hive/ql/metadata/TestTableIterable.java: does not exist in index error: java/org/apache/hive/service/cli/operation/GetColumnsOperation.java: does not exist in index error: main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java: does not exist in index error: main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java: does not exist in index The patch does not appear to apply with p0, p1, or p2 + result=1 + '[' 1 -ne 0 ']' + rm -rf yetus_PreCommit-HIVE-Build-12570 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12931322 - PreCommit-HIVE-Build > Improve HiveMetaStoreClient.dropDatabase > > > Key: HIVE-18705 > URL: https://issues.apache.org/jira/browse/HIVE-18705 > Project: Hive > Issue Type: Improvement >
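The log above reports that the patch does not apply with p0, p1, or p2, and the three error paths for HiveMetaStore.java differ only in how many leading directories were dropped. A minimal sketch of that stripping, assuming -pN simply removes N leading path components; `strip_components` is a hypothetical helper for illustration and is not part of smart-apply-patch.sh:

```python
def strip_components(path: str, level: int) -> str:
    """Drop `level` leading path components (illustrative model of -p<level>)."""
    parts = path.split("/")
    return "/".join(parts[level:])


# Path copied from the first error line of the build log.
path = "standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java"

# Candidate paths the apply step would look up for p0, p1, and p2.
candidates = [strip_components(path, n) for n in range(3)]
for n, candidate in enumerate(candidates):
    print(f"-p{n}: {candidate}")
```

Each candidate corresponds to one of the "does not exist in index" lines, which suggests that none of the three strip levels yields a path present in the checked-out tree.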
[jira] [Commented] (HIVE-19486) Discrepancy in HikariCP config naming
[ https://issues.apache.org/jira/browse/HIVE-19486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542440#comment-16542440 ] Hive QA commented on HIVE-19486: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12931305/HIVE-19486.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 14650 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12569/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12569/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12569/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12931305 - PreCommit-HIVE-Build > Discrepancy in HikariCP config naming > - > > Key: HIVE-19486 > URL: https://issues.apache.org/jira/browse/HIVE-19486 > Project: Hive > Issue Type: Bug >Reporter: Antal Sinkovits >Assignee: Antal Sinkovits >Priority: Major > Attachments: HIVE-19486.1.patch, HIVE-19486.2.patch > > > HiveConf hive.conf.restricted.list contains "hikari." instead of "hikaricp." -- This message was sent by Atlassian JIRA (v7.6.3#76005)
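The HIVE-19486 description says the restricted list contains "hikari." where "hikaricp." was intended. A small sketch of why that difference matters, assuming restricted entries are matched as key prefixes (the message does not state how the list is evaluated). The helper `is_restricted` and the property name `hikaricp.maximumPoolSize` are illustrative and not taken from Hive:

```python
from typing import Iterable


def is_restricted(key: str, restricted_prefixes: Iterable[str]) -> bool:
    """Return True if `key` starts with any restricted prefix (assumed matching rule)."""
    return any(key.startswith(prefix) for prefix in restricted_prefixes)


key = "hikaricp.maximumPoolSize"  # illustrative property name

# "hikari." is not a prefix of "hikaricp...": the character after "hikari" is "c", not ".".
blocked_by_old_entry = is_restricted(key, ["hikari."])
blocked_by_new_entry = is_restricted(key, ["hikaricp."])

print(blocked_by_old_entry, blocked_by_new_entry)
```

Under this assumption, the misnamed entry would leave keys that begin with "hikaricp." unrestricted.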
[jira] [Commented] (HIVE-19486) Discrepancy in HikariCP config naming
[ https://issues.apache.org/jira/browse/HIVE-19486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542417#comment-16542417 ] Hive QA commented on HIVE-19486: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 51s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 47s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 32s{color} | {color:blue} common in master has 64 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 43s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 17m 59s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12569/dev-support/hive-personality.sh | | git revision | master / 20eb7b5 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: common itests/hive-unit U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12569/yetus.txt | | Powered by | Apache Yetus http://yetus.apache.org | This message was automatically generated. > Discrepancy in HikariCP config naming > - > > Key: HIVE-19486 > URL: https://issues.apache.org/jira/browse/HIVE-19486 > Project: Hive > Issue Type: Bug >Reporter: Antal Sinkovits >Assignee: Antal Sinkovits >Priority: Major > Attachments: HIVE-19486.1.patch, HIVE-19486.2.patch > > > HiveConf hive.conf.restricted.list contains "hikari." instead of "hikaricp."
[jira] [Commented] (HIVE-20032) Don't serialize hashCode when groupByShuffle and RDD cacheing is disabled
[ https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542413#comment-16542413 ] Sahil Takiar commented on HIVE-20032: - As for benchmarking, I have done a lot of TPC-DS benchmarking, and I don't consistently get better performance. However, the amount of shuffled data is significantly reduced (as well as the amount of data spilled to disk). My guess is that latency doesn't improve much because I'm running my tests on an unloaded cluster. However, I expect cluster throughput to be better with this patch since fewer I/O resources are being used. I'll need to run some concurrent TPC-DS workloads to confirm this, though. > Don't serialize hashCode when groupByShuffle and RDD cacheing is disabled > - > > Key: HIVE-20032 > URL: https://issues.apache.org/jira/browse/HIVE-20032 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar >Priority: Major > Attachments: HIVE-20032.1.patch, HIVE-20032.2.patch, > HIVE-20032.3.patch, HIVE-20032.4.patch > > > Follow up on HIVE-15104, if we don't enable RDD cacheing or groupByShuffles, > then we don't need to serialize the hashCode when shuffling data in HoS.
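The issue description says the hashCode need not be serialized when shuffling, and the comment reports less shuffled data as a result. A toy sketch of that size effect, assuming a made-up wire format (4-byte length, key bytes, optional 4-byte hash code). This is not Hive's or Spark's actual serializer, and both function names are hypothetical:

```python
import struct
import zlib


def serialize_with_hash(key: bytes, hash_code: int) -> bytes:
    """Toy format: 4-byte length, key bytes, then a 4-byte signed hash code."""
    return struct.pack(">i", len(key)) + key + struct.pack(">i", hash_code)


def serialize_without_hash(key: bytes) -> bytes:
    """Toy format: 4-byte length and key bytes only."""
    return struct.pack(">i", len(key)) + key


keys = [f"key-{i}".encode() for i in range(1000)]

# crc32 stands in for a hash code; the mask keeps it within a signed 32-bit int.
with_hash = sum(len(serialize_with_hash(k, zlib.crc32(k) & 0x7FFFFFFF)) for k in keys)
without_hash = sum(len(serialize_without_hash(k)) for k in keys)
saved = with_hash - without_hash

print(f"bytes with hash: {with_hash}, without: {without_hash}, saved: {saved}")
```

In this toy format the saving is a fixed 4 bytes per record; the real saving depends on the actual serializer and key sizes.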
[jira] [Commented] (HIVE-20032) Don't serialize hashCode when groupByShuffle and RDD cacheing is disabled
[ https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542409#comment-16542409 ] Sahil Takiar commented on HIVE-20032: - [~lirui] thanks for taking a look. So I took a closer look at this, and I think there might be a way to specify custom serializers just for shuffles. However, it requires accessing some lower-level Spark APIs. The idea is that RDD operations such as {{sortByKey}} and {{repartitionAndSortWithinPartitions}} return a {{ShuffledRDD}}. The {{ShuffledRDD}} object has a method called {{setSerializer}} that allows users to set a custom serializer for that RDD. Certain RDD APIs such as {{combineByKey}} expose setting a custom serializer by invoking the {{ShuffledRDD#setSerializer}} method; however, it doesn't look like {{sortByKey}} or {{repartitionAndSortWithinPartitions}} do. I think this is probably better than my original approach. The other issue is that specifying a custom serializer doesn't work with the way we currently shade Kryo in {{hive-exec}} (I think you found similar issues while working on HIVE-15104). So I had to remove the relocation for Kryo (which was added in HIVE-5915). Hopefully that's OK since Spark and Hive use the same version of Kryo. I attached an updated patch (still a WIP) that implements this approach. > Don't serialize hashCode when groupByShuffle and RDD cacheing is disabled > - > > Key: HIVE-20032 > URL: https://issues.apache.org/jira/browse/HIVE-20032 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar >Priority: Major > Attachments: HIVE-20032.1.patch, HIVE-20032.2.patch, > HIVE-20032.3.patch, HIVE-20032.4.patch > > > Follow up on HIVE-15104, if we don't enable RDD cacheing or groupByShuffles, > then we don't need to serialize the hashCode when shuffling data in HoS.
[jira] [Updated] (HIVE-20032) Don't serialize hashCode when groupByShuffle and RDD cacheing is disabled
[ https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-20032: Attachment: HIVE-20032.4.patch > Don't serialize hashCode when groupByShuffle and RDD cacheing is disabled > - > > Key: HIVE-20032 > URL: https://issues.apache.org/jira/browse/HIVE-20032 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar >Priority: Major > Attachments: HIVE-20032.1.patch, HIVE-20032.2.patch, > HIVE-20032.3.patch, HIVE-20032.4.patch > > > Follow up on HIVE-15104, if we don't enable RDD cacheing or groupByShuffles, > then we don't need to serialize the hashCode when shuffling data in HoS.
[jira] [Commented] (HIVE-19820) add ACID stats support to background stats updater and fix bunch of edge cases found in SU tests
[ https://issues.apache.org/jira/browse/HIVE-19820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542405#comment-16542405 ] Sergey Shelukhin commented on HIVE-19820: - Fixed a bunch more tests and paths. A few tests still fail or have bad result changes. Most out file changes that remain are trivial. autoColumnStats_10, autoColumnStats_2, stats_analyze_decimal_compare - suspicious stats change. create_or_replace_view - has a very strange error where SQL and ORM have different results w.r.t. write ID; need to investigate, probably something stupid. > add ACID stats support to background stats updater and fix bunch of edge > cases found in SU tests > > > Key: HIVE-19820 > URL: https://issues.apache.org/jira/browse/HIVE-19820 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-19820.01-master-txnstats.patch, > HIVE-19820.01.patch, HIVE-19820.02-master-txnstats.patch, > HIVE-19820.03-master-txnstats.patch, HIVE-19820.03.patch, > HIVE-19820.04-master-txnstats.patch, HIVE-19820.patch, > branch-19820.02.nogen.patch, branch-19820.03.nogen.patch, > branch-19820.nogen.patch, branch-19820.nogen.patch > > > Follow-up from HIVE-19418. > Right now it checks whether stats are valid in an old-fashioned way... and > also gets ACID state, and discards it without using. > When ACID stats are implemented, ACID state needs to be used to do > version-aware valid stats checks.
[jira] [Updated] (HIVE-19820) add ACID stats support to background stats updater and fix bunch of edge cases found in SU tests
[ https://issues.apache.org/jira/browse/HIVE-19820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-19820: Attachment: HIVE-19820.03.patch > add ACID stats support to background stats updater and fix bunch of edge > cases found in SU tests > > > Key: HIVE-19820 > URL: https://issues.apache.org/jira/browse/HIVE-19820 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-19820.01-master-txnstats.patch, > HIVE-19820.01.patch, HIVE-19820.02-master-txnstats.patch, > HIVE-19820.03-master-txnstats.patch, HIVE-19820.03.patch, > HIVE-19820.04-master-txnstats.patch, HIVE-19820.patch, > branch-19820.02.nogen.patch, branch-19820.03.nogen.patch, > branch-19820.nogen.patch, branch-19820.nogen.patch > > > Follow-up from HIVE-19418. > Right now it checks whether stats are valid in an old-fashioned way... and > also gets ACID state, and discards it without using. > When ACID stats are implemented, ACID state needs to be used to do > version-aware valid stats checks.
[jira] [Updated] (HIVE-19820) add ACID stats support to background stats updater and fix bunch of edge cases found in SU tests
[ https://issues.apache.org/jira/browse/HIVE-19820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-19820: Attachment: branch-19820.03.nogen.patch > add ACID stats support to background stats updater and fix bunch of edge > cases found in SU tests > > > Key: HIVE-19820 > URL: https://issues.apache.org/jira/browse/HIVE-19820 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-19820.01-master-txnstats.patch, > HIVE-19820.01.patch, HIVE-19820.02-master-txnstats.patch, > HIVE-19820.03-master-txnstats.patch, HIVE-19820.04-master-txnstats.patch, > HIVE-19820.patch, branch-19820.02.nogen.patch, branch-19820.03.nogen.patch, > branch-19820.nogen.patch, branch-19820.nogen.patch > > > Follow-up from HIVE-19418. > Right now it checks whether stats are valid in an old-fashioned way... and > also gets ACID state, and discards it without using. > When ACID stats are implemented, ACID state needs to be used to do > version-aware valid stats checks.
[jira] [Commented] (HIVE-17852) remove support for list bucketing "stored as directories" in 3.0
[ https://issues.apache.org/jira/browse/HIVE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542400#comment-16542400 ] Hive QA commented on HIVE-17852: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 33s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 6s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 32s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 88m 1s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 18s{color} | {color:blue} standalone-metastore in master has 217 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 8s{color} | {color:blue} ql in master has 2289 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 40s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 10m 15s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 37s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 58s{color} | {color:red} hive-unit in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 58s{color} | {color:red} hive-unit in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 58s{color} | {color:red} hive-unit in the patch failed. {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 5m 18s{color} | {color:red} standalone-metastore: The patch generated 438 new + 19074 unchanged - 441 fixed = 19512 total (was 19515) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 27m 55s{color} | {color:red} ql: The patch generated 699 new + 129209 unchanged - 788 fixed = 129908 total (was 129997) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 59m 47s{color} | {color:red} root: The patch generated 1136 new + 246803 unchanged - 1228 fixed = 247939 total (was 248031) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 2m 34s{color} | {color:red} itests/hive-unit: The patch generated 6 new + 11887 unchanged - 6 fixed = 11893 total (was 11893) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 3m 15s{color} | {color:red} patch/standalone-metastore cannot run setBugDatabaseInfo from findbugs {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 23s{color} | {color:red} patch/metastore cannot run setBugDatabaseInfo from findbugs {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 4m 5s{color} | {color:red} patch/ql cannot run setBugDatabaseInfo from findbugs {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 43s{color} | {color:red} hive-unit in the patch failed. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 55s{color} | {color:red} ql generated 2 new + 98 unchanged - 2 fixed = 100 total (was 100) {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 7m 9s{color} | {color:red} root generated 2 new + 369 unchanged - 2 fixed = 371 total (was 371) {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}275m 7s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile xml | | uname | Linux hivep
[jira] [Commented] (HIVE-19924) Tag distcp jobs run by Repl Load
[ https://issues.apache.org/jira/browse/HIVE-19924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542398#comment-16542398 ] Hive QA commented on HIVE-19924: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12931299/HIVE-19924.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 37 failed/errored test(s), 14619 tests executed *Failed tests:* {noformat} TestJdbcWithMiniHS2 - did not produce a TEST-*.xml file (likely timed out) (batchId=250) TestJdbcWithMiniHS2ErasureCoding - did not produce a TEST-*.xml file (likely timed out) (batchId=250) TestNoSaslAuth - did not produce a TEST-*.xml file (likely timed out) (batchId=250) org.apache.hadoop.hive.ql.parse.TestMacroSemanticAnalyzer.testDropMacro (batchId=288) org.apache.hadoop.hive.ql.parse.TestMacroSemanticAnalyzer.testDropMacroDoesNotExist (batchId=288) org.apache.hadoop.hive.ql.parse.TestMacroSemanticAnalyzer.testDropMacroExistsDoNotIgnoreErrors (batchId=288) org.apache.hadoop.hive.ql.parse.TestMacroSemanticAnalyzer.testDropMacroNonExistentWithIfExists (batchId=288) org.apache.hadoop.hive.ql.parse.TestMacroSemanticAnalyzer.testDropMacroNonExistentWithIfExistsDoNotIgnoreNonExistent (batchId=288) org.apache.hadoop.hive.ql.parse.TestMacroSemanticAnalyzer.testOneInputParamters (batchId=288) org.apache.hadoop.hive.ql.parse.TestMacroSemanticAnalyzer.testThreeInputParamters (batchId=288) org.apache.hadoop.hive.ql.parse.TestMacroSemanticAnalyzer.testTwoInputParamters (batchId=288) org.apache.hadoop.hive.ql.parse.TestMacroSemanticAnalyzer.testZeroInputParamters (batchId=288) org.apache.hive.jdbc.TestJdbcDriver2.testGetQueryId (batchId=249) org.apache.hive.jdbc.TestJdbcDriver2.testReplErrorScenarios (batchId=249) org.apache.hive.jdbc.TestTriggersMoveWorkloadManager.testTriggerMoveAndKill (batchId=250) 
org.apache.hive.jdbc.TestTriggersMoveWorkloadManager.testTriggerMoveBackKill (batchId=250) org.apache.hive.jdbc.TestTriggersMoveWorkloadManager.testTriggerMoveConflictKill (batchId=250) org.apache.hive.jdbc.TestTriggersNoTezSessionPool.testTriggerSlowQueryExecutionTime (batchId=247) org.apache.hive.jdbc.TestTriggersNoTezSessionPool.testTriggerTotalLaunchedTasks (batchId=247) org.apache.hive.jdbc.TestTriggersNoTezSessionPool.testTriggerVertexTotalTasks (batchId=247) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testMultipleTriggers1 (batchId=250) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testMultipleTriggers2 (batchId=250) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitions (batchId=250) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitionsMultiInsert (batchId=250) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitionsUnionAll (batchId=250) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedFiles (batchId=250) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomReadOps (batchId=250) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerDagRawInputSplitsKill (batchId=250) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerDagTotalTasks (batchId=250) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerDefaultRawInputSplits (batchId=250) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighBytesRead (batchId=250) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=250) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerShortQueryElapsedTime (batchId=250) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerSlowQueryElapsedTime (batchId=250) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerSlowQueryExecutionTime (batchId=250) 
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerTotalTasks (batchId=250) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerVertexRawInputSplitsKill (batchId=250) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12568/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12568/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12568/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 37 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12931299 - PreCommit-HIVE-Build > Tag distcp jobs run by Repl Load > > > Key:
[jira] [Commented] (HIVE-20117) schema changes for txn stats
[ https://issues.apache.org/jira/browse/HIVE-20117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542391#comment-16542391 ] Jesus Camacho Rodriguez commented on HIVE-20117: [~sershe], I think you are referring to HIVE-19027 that landed in 3.1, and its clone HIVE-20006 that will land in master? If that is the case, these diffs should be fixed soon, HIVE-20006 has not landed yet because I did not get a clean QA... > schema changes for txn stats > > > Key: HIVE-20117 > URL: https://issues.apache.org/jira/browse/HIVE-20117 > Project: Hive > Issue Type: Bug > Components: Statistics, Transactions >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-20117.01.patch > >
[jira] [Commented] (HIVE-20117) schema changes for txn stats
[ https://issues.apache.org/jira/browse/HIVE-20117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542390#comment-16542390 ] Sergey Shelukhin commented on HIVE-20117: - Updated simple patch. As of now the stats tables don't need write ID; the flag is in the TBLS table anyway, so we have to check and update that. We might move it later. [~vgarg] the 3.0-to-3.1 upgrade script is currently inconsistent between branch-3 and master (some changes are only on master and I think should be reverted given that they are not actually going to be part of 3.1, cc [~sankarh]; some are only on branch-3 and must be committed to master together, cc [~jcamachorodriguez]). So, this won't apply to branch-3. I will update the patch once this situation is resolved, or feel free to update/commit where necessary for the 3.1 release. > schema changes for txn stats > > > Key: HIVE-20117 > URL: https://issues.apache.org/jira/browse/HIVE-20117 > Project: Hive > Issue Type: Bug > Components: Statistics, Transactions >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-20117.01.patch > >
[jira] [Updated] (HIVE-20117) schema changes for txn stats
[ https://issues.apache.org/jira/browse/HIVE-20117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-20117: Attachment: HIVE-20117.01.patch > schema changes for txn stats > > > Key: HIVE-20117 > URL: https://issues.apache.org/jira/browse/HIVE-20117 > Project: Hive > Issue Type: Bug > Components: Statistics, Transactions >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-20117.01.patch > >
[jira] [Updated] (HIVE-20117) schema changes for txn stats
[ https://issues.apache.org/jira/browse/HIVE-20117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-20117: Attachment: (was: HIVE-20117.patch) > schema changes for txn stats > > > Key: HIVE-20117 > URL: https://issues.apache.org/jira/browse/HIVE-20117 > Project: Hive > Issue Type: Bug > Components: Statistics, Transactions >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-20117.01.patch > >
[jira] [Updated] (HIVE-20097) Convert standalone-metastore to a submodule
[ https://issues.apache.org/jira/browse/HIVE-20097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-20097: --- Resolution: Fixed Fix Version/s: 4.0.0 Status: Resolved (was: Patch Available) Patch merged into master. Thanks [~akolb]! > Convert standalone-metastore to a submodule > --- > > Key: HIVE-20097 > URL: https://issues.apache.org/jira/browse/HIVE-20097 > Project: Hive > Issue Type: Sub-task > Components: Hive, Metastore, Standalone Metastore >Affects Versions: 3.1.0, 4.0.0 >Reporter: Alexander Kolbasov >Assignee: Alexander Kolbasov >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-20097.01.patch, HIVE-20097.02.patch, > HIVE-20097.03.patch, HIVE-20097.04.patch, HIVE-20097.05.patch, > HIVE-20097.06.patch, HIVE-20097.07.patch > > > This is a subtask to stage HIVE-17751 changes into several smaller phases. > The first part is moving existing code in hive-standalone-metastore to a > sub-module. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19924) Tag distcp jobs run by Repl Load
[ https://issues.apache.org/jira/browse/HIVE-19924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542386#comment-16542386 ] Hive QA commented on HIVE-19924:
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 1m 58s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 42s | master passed |
| +1 | compile | 2m 36s | master passed |
| +1 | checkstyle | 1m 21s | master passed |
| 0 | findbugs | 4m 39s | ql in master has 2289 extant Findbugs warnings. |
| 0 | findbugs | 0m 46s | service in master has 48 extant Findbugs warnings. |
| 0 | findbugs | 0m 48s | itests/hive-unit in master has 2 extant Findbugs warnings. |
| +1 | javadoc | 1m 58s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 29s | Maven dependency ordering for patch |
| +1 | mvninstall | 3m 1s | the patch passed |
| +1 | compile | 2m 36s | the patch passed |
| +1 | javac | 2m 36s | the patch passed |
| -1 | checkstyle | 0m 45s | ql: The patch generated 1 new + 55 unchanged - 12 fixed = 56 total (was 67) |
| -1 | checkstyle | 0m 16s | service: The patch generated 2 new + 123 unchanged - 0 fixed = 125 total (was 123) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 6m 48s | the patch passed |
| +1 | javadoc | 1m 59s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 39m 19s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12568/dev-support/hive-personality.sh |
| git revision | master / 57dd304 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12568/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12568/yetus/diff-checkstyle-service.txt |
| modules | C: ql service itests/hive-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12568/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
> Tag distcp jobs run by Repl Load > > > Key: HIVE-19924 > URL: https://issues.apache.org/jira/browse/HIVE-19924 > Project: Hive > Issue Type: Task > Components: repl >Affects Versions: 3.1.0, 4.0.0 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: DR, replication > Fix For: 4.0.0, 3.2.0 > > Attachments: HIVE-19924.01.patch, HIVE-19924.02.patch > > > Add tags in jobconf for distcp related job
[jira] [Updated] (HIVE-20164) Murmur Hash : Make sure CTAS and IAS use correct bucketing version
[ https://issues.apache.org/jira/browse/HIVE-20164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-20164: -- Attachment: HIVE-20164.1.patch > Murmur Hash : Make sure CTAS and IAS use correct bucketing version > -- > > Key: HIVE-20164 > URL: https://issues.apache.org/jira/browse/HIVE-20164 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal >Priority: Major > Attachments: HIVE-20164.1.patch > > > With the migration to Murmur hash, CTAS and IAS from old table version to new > table version does not work as intended and data is hashed using old hash > logic. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20164) Murmur Hash : Make sure CTAS and IAS use correct bucketing version
[ https://issues.apache.org/jira/browse/HIVE-20164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-20164: -- Status: Patch Available (was: Open) > Murmur Hash : Make sure CTAS and IAS use correct bucketing version > -- > > Key: HIVE-20164 > URL: https://issues.apache.org/jira/browse/HIVE-20164 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal >Priority: Major > Attachments: HIVE-20164.1.patch > > > With the migration to Murmur hash, CTAS and IAS from old table version to new > table version does not work as intended and data is hashed using old hash > logic. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-17683) Annotate Query Plan with locking information
[ https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542380#comment-16542380 ] Eugene Koifman edited comment on HIVE-17683 at 7/13/18 12:54 AM: - [~ikryvenko], sorry, it took a while to get back to this. Your implementation creates ExplainTask.getJsonLocks(), which duplicates a lot of the logic in DbTxnManager.acquireLocks(). This is problematic because they have to be kept in sync. Could you refactor it so that they share code? For example, create a {{LockRequest makeLockRequest(List, List)}} and use it in both places? Also, the refactoring in acquireLocks() lost {noformat} default: throw new IllegalArgumentException(String .format("Lock type [%s] for Database.Table [%s.%s] is unknown", lockType, t.getDbName(), t.getTableName() ));{noformat} This may change how errors are surfaced - not sure it's a good idea. I don't know if it's related to your changes, but in explain_locks.q.out {{explain locks drop table test_explain_locks}} doesn't acquire any locks - this is odd - I'd expect an X lock on the table for a drop command. Why did you choose to output the data as JSON? was (Author: ekoifman): [~ikryvenko], sorry, it took a while to get back to this. Your implementation creates ExplainTask.getJsonLocks(), which duplicates a lot of the logic in DbTxnManager.acquireLocks(). This is problematic because they have to be kept in sync. Could you refactor it so that they share code? For example, create a {{LockRequest makeLockRequest(List, List)}} and use it in both places? Also, the refactoring in acquireLocks() lost {noformat} default: throw new IllegalArgumentException(String .format("Lock type [%s] for Database.Table [%s.%s] is unknown", lockType, t.getDbName(), t.getTableName() ));{noformat} This may change how errors are surfaced - not sure it's a good idea. 
> Annotate Query Plan with locking information > > > Key: HIVE-17683 > URL: https://issues.apache.org/jira/browse/HIVE-17683 > Project: Hive > Issue Type: New Feature > Components: Transactions >Reporter: Eugene Koifman >Assignee: Igor Kryvenko >Priority: Critical > Attachments: HIVE-17683.01.patch, HIVE-17683.02.patch > > > Explore if it's possible to add info about what locks will be asked for to > the query plan. > Lock acquisition (for Acid Lock Manager) is done in > DbTxnManager.acquireLocks() which is called once the query starts running. > Would need to refactor that. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
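The refactoring suggested in the comment above could look roughly like the sketch below. It is not the actual Hive code: `LockSketch`, `Entity`, `WriteKind`, `LockLevel`, and `LockComponent` are hypothetical stand-ins for Hive's real classes, and the mapping from write kind to lock level is only illustrative. What matters is the shape: a single `makeLockRequest` helper that both the explain path and the acquire path call, and that keeps the `default:` branch throwing `IllegalArgumentException` for an unknown lock type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are Hive's real classes.
final class LockSketch {
    enum WriteKind { NONE, INSERT, UPDATE, DDL, UNKNOWN }
    enum LockLevel { SHARED_READ, SHARED_WRITE, EXCLUSIVE }

    // Stand-in for a read or write entity of a query (assumption).
    static final class Entity {
        final String db;
        final String table;
        final WriteKind kind;

        Entity(String db, String table, WriteKind kind) {
            this.db = db;
            this.table = table;
            this.kind = kind;
        }
    }

    // Stand-in for one component of a lock request (assumption).
    static final class LockComponent {
        final String db;
        final String table;
        final LockLevel level;

        LockComponent(String db, String table, LockLevel level) {
            this.db = db;
            this.table = table;
            this.level = level;
        }

        @Override
        public String toString() {
            return level + " " + db + "." + table;
        }
    }

    // The single place that turns query inputs and outputs into lock components.
    static List<LockComponent> makeLockRequest(List<Entity> inputs, List<Entity> outputs) {
        List<LockComponent> request = new ArrayList<>();
        for (Entity in : inputs) {
            request.add(new LockComponent(in.db, in.table, LockLevel.SHARED_READ));
        }
        for (Entity out : outputs) {
            LockLevel level;
            switch (out.kind) {
                case INSERT: level = LockLevel.SHARED_READ; break;
                case UPDATE: level = LockLevel.SHARED_WRITE; break;
                case DDL:    level = LockLevel.EXCLUSIVE; break;
                default:
                    // Keep the error path so that unknown types still fail loudly.
                    throw new IllegalArgumentException(String.format(
                        "Lock type [%s] for Database.Table [%s.%s] is unknown",
                        out.kind, out.db, out.table));
            }
            request.add(new LockComponent(out.db, out.table, level));
        }
        return request;
    }

    // Explain path: only describes the locks that would be requested.
    static List<String> explainLocks(List<Entity> inputs, List<Entity> outputs) {
        List<String> lines = new ArrayList<>();
        for (LockComponent c : makeLockRequest(inputs, outputs)) {
            lines.add(c.toString());
        }
        return lines;
    }

    // Acquire path: a real lock manager would submit the request; here we only count components.
    static int acquireLocks(List<Entity> inputs, List<Entity> outputs) {
        return makeLockRequest(inputs, outputs).size();
    }
}
```

Because both paths go through `makeLockRequest`, a change to the lock mapping cannot make the explain output drift from what is actually acquired.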
[jira] [Assigned] (HIVE-20164) Murmur Hash : Make sure CTAS and IAS use correct bucketing version
[ https://issues.apache.org/jira/browse/HIVE-20164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal reassigned HIVE-20164: - > Murmur Hash : Make sure CTAS and IAS use correct bucketing version > -- > > Key: HIVE-20164 > URL: https://issues.apache.org/jira/browse/HIVE-20164 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal >Priority: Major > > With the migration to Murmur hash, CTAS and IAS from old table version to new > table version does not work as intended and data is hashed using old hash > logic. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-17683) Annotate Query Plan with locking information
[ https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542380#comment-16542380 ] Eugene Koifman commented on HIVE-17683: --- [~ikryvenko], sorry, it took a while to get back to this. Your implementation creates ExplainTask.getJsonLocks(), which duplicates a lot of the logic in DbTxnManager.acquireLocks(). This is problematic because they have to be kept in sync. Could you refactor it so that they share code? For example, create a {{LockRequest makeLockRequest(List, List)}} and use it in both places? Also, the refactoring in acquireLocks() lost {noformat} default: throw new IllegalArgumentException(String .format("Lock type [%s] for Database.Table [%s.%s] is unknown", lockType, t.getDbName(), t.getTableName() ));{noformat} This may change how errors are surfaced - not sure it's a good idea. > Annotate Query Plan with locking information > > > Key: HIVE-17683 > URL: https://issues.apache.org/jira/browse/HIVE-17683 > Project: Hive > Issue Type: New Feature > Components: Transactions >Reporter: Eugene Koifman >Assignee: Igor Kryvenko >Priority: Critical > Attachments: HIVE-17683.01.patch, HIVE-17683.02.patch > > > Explore if it's possible to add info about what locks will be asked for to > the query plan. > Lock acquisition (for Acid Lock Manager) is done in > DbTxnManager.acquireLocks() which is called once the query starts running. > Would need to refactor that. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-12342) Set default value of hive.optimize.index.filter to true
[ https://issues.apache.org/jira/browse/HIVE-12342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542375#comment-16542375 ] Deepak Jaiswal commented on HIVE-12342: --- [~ikryvenko] I was looking at the change in ParseContext.java where HashMap is converted to LinkedHashMap. Can you please tell me why it was needed? > Set default value of hive.optimize.index.filter to true > --- > > Key: HIVE-12342 > URL: https://issues.apache.org/jira/browse/HIVE-12342 > Project: Hive > Issue Type: Task > Components: Configuration >Reporter: Ashutosh Chauhan >Assignee: Igor Kryvenko >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-12342.05.patch, HIVE-12342.06.patch, > HIVE-12342.07.patch, HIVE-12342.08.patch, HIVE-12342.09.patch, > HIVE-12342.1.patch, HIVE-12342.10.patch, HIVE-12342.11.patch, > HIVE-12342.12.patch, HIVE-12342.13.patch, HIVE-12342.14.patch, > HIVE-12342.15.patch, HIVE-12342.16.patch, HIVE-12342.17.patch, > HIVE-12342.18.patch, HIVE-12342.19.patch, HIVE-12342.2.patch, > HIVE-12342.20.patch, HIVE-12342.21.patch, HIVE-12342.22.patch, > HIVE-12342.23.patch, HIVE-12342.24.patch, HIVE-12342.3.patch, > HIVE-12342.4.patch, HIVE-12342.patch > > > This configuration governs ppd for storage layer. When applicable, it will > always help. It should be on by default. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-17896) TopNKey: Create a standalone vectorizable TopNKey operator
[ https://issues.apache.org/jira/browse/HIVE-17896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542352#comment-16542352 ] Hive QA commented on HIVE-17896: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12931291/HIVE-17896.11.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 14656 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12567/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12567/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12567/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12931291 - PreCommit-HIVE-Build > TopNKey: Create a standalone vectorizable TopNKey operator > -- > > Key: HIVE-17896 > URL: https://issues.apache.org/jira/browse/HIVE-17896 > Project: Hive > Issue Type: New Feature > Components: Operators >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Teddy Choi >Priority: Major > Attachments: HIVE-17896.1.patch, HIVE-17896.10.patch, > HIVE-17896.11.patch, HIVE-17896.3.patch, HIVE-17896.4.patch, > HIVE-17896.5.patch, HIVE-17896.6.patch, HIVE-17896.7.patch, > HIVE-17896.8.patch, HIVE-17896.9.patch > > > For TPC-DS Query27, the TopN operation is delayed by the group-by - the > group-by operator buffers up all the rows before discarding the 99% of the > rows in the TopN Hash within the ReduceSink Operator. 
> The RS TopN operator is very restrictive as it only supports doing the > filtering on the shuffle keys, but it is better to do this before breaking > the vectors into rows and losing the isRepeating properties. > Adding a TopN Key operator in the physical operator tree allows the following > to happen. > GBY->RS(Top=1) > can become > TNK(1)->GBY->RS(Top=1) > So that, the TopNKey can remove rows before they are buffered into the GBY > and consume memory. > Here's the equivalent implementation in Presto > https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/operator/TopNOperator.java#L35 > Adding this as a sub-feature of GroupBy prevents further optimizations if the > GBY is on keys "a,b,c" and the TopNKey is on just "a". -- This message was sent by Atlassian JIRA (v7.6.3#76005)
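To make the `TNK(1)->GBY->RS(Top=1)` idea concrete, here is a minimal row-at-a-time sketch of a top-N key filter. It is an illustration under assumptions, not the operator from the patch: `TopNKeyFilter` and `canForward` are hypothetical names, keys are plain Java values ordered by a supplied comparator, and vectorization is ignored. The filter tracks the N smallest distinct keys seen so far and drops any row whose key sorts after all of them, so such rows never reach the group-by buffer.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical sketch of a top-N key filter placed in front of a group-by.
final class TopNKeyFilter<K> {
    private final int topN;
    private final Comparator<? super K> order;
    private final TreeSet<K> keys; // the N best distinct keys seen so far

    TopNKeyFilter(int topN, Comparator<? super K> order) {
        if (topN <= 0) {
            throw new IllegalArgumentException("topN must be positive");
        }
        this.topN = topN;
        this.order = order;
        this.keys = new TreeSet<>(order);
    }

    /** Returns true if a row with this key may still belong to the top N and should be forwarded. */
    boolean canForward(K key) {
        if (keys.contains(key)) {
            return true; // key is already among the current top N
        }
        if (keys.size() < topN) {
            keys.add(key); // still room: every new key qualifies
            return true;
        }
        K worst = keys.last();
        if (order.compare(key, worst) < 0) {
            keys.pollLast(); // evict the current worst key
            keys.add(key);
            return true;
        }
        return false; // sorts after all N tracked keys: safe to drop
    }
}
```

The filter is conservative: rows forwarded early may carry keys that are evicted later, so a downstream TopN (the `RS(Top=1)` in the description) is still needed to produce the exact result. The filter only reduces how many rows the group-by has to buffer.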
[jira] [Commented] (HIVE-17896) TopNKey: Create a standalone vectorizable TopNKey operator
[ https://issues.apache.org/jira/browse/HIVE-17896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542341#comment-16542341 ] Hive QA commented on HIVE-17896:
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 2m 30s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 49s | master passed |
| +1 | compile | 2m 46s | master passed |
| +1 | checkstyle | 2m 6s | master passed |
| 0 | findbugs | 1m 3s | common in master has 64 extant Findbugs warnings. |
| 0 | findbugs | 1m 8s | serde in master has 194 extant Findbugs warnings. |
| 0 | findbugs | 5m 9s | ql in master has 2289 extant Findbugs warnings. |
| +1 | javadoc | 2m 7s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 30s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 22s | the patch passed |
| +1 | compile | 2m 0s | the patch passed |
| +1 | javac | 2m 0s | the patch passed |
| -1 | checkstyle | 0m 57s | ql: The patch generated 35 new + 426 unchanged - 0 fixed = 461 total (was 426) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| -1 | findbugs | 0m 58s | serde generated 1 new + 194 unchanged - 0 fixed = 195 total (was 194) |
| -1 | findbugs | 4m 55s | ql generated 8 new + 2289 unchanged - 0 fixed = 2297 total (was 2289) |
| +1 | javadoc | 1m 38s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 16s | The patch does not generate ASF License warnings. |
| | | 42m 15s | |

|| Reason || Tests ||
| FindBugs | module:serde |
| | org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(Object[], ObjectInspector[], Object[], ObjectInspector[], boolean[]) negates the return value of org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(Object, ObjectInspector, Object, ObjectInspector) At ObjectInspectorUtils.java:negates the return value of org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(Object, ObjectInspector, Object, ObjectInspector) At ObjectInspectorUtils.java:[line 956] |
| FindBugs | module:ql |
| | new org.apache.hadoop.hive.ql.exec.TopNKeyOperator$KeyWrapperComparator(ObjectInspector[], ObjectInspector[], boolean[]) may expose internal representation by storing an externally mutable object into TopNKeyOperator$KeyWrapperComparator.columnSortOrderIsDesc At TopNKeyOperator.java:expose internal representation by storing an externally mutable object into TopNKeyOperator$KeyWrapperComparator.columnSortOrderIsDesc At TopNKeyOperator.java:[line 71] |
| | new org.apache.hadoop.hive.ql.exec.TopNKeyOperator$KeyWrapperComparator(ObjectInspector[], ObjectInspector[], boolean[]) may expose internal representation by storing an externally mutable object into TopNKeyOperator$KeyWrapperComparator.objectInspectors1 At TopNKeyOperator.java:expose internal representation by storing an externally mutable object into TopNKeyOperator$KeyWrapperComparat
[jira] [Updated] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object
[ https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-19360: --- Attachment: (was: HIVE-19360.6.patch) > CBO: Add an "optimizedSQL" to QueryPlan object > --- > > Key: HIVE-19360 > URL: https://issues.apache.org/jira/browse/HIVE-19360 > Project: Hive > Issue Type: Improvement > Components: CBO, Diagnosability >Affects Versions: 3.1.0 >Reporter: Gopal V >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-19360.1.patch, HIVE-19360.2.patch, > HIVE-19360.3.patch, HIVE-19360.4.patch, HIVE-19360.5.patch, HIVE-19360.6.patch > > > Calcite RelNodes can be converted back into SQL (as the new JDBC storage > handler does), which allows Hive to print out the post CBO plan as a SQL > query instead of having to guess the join orders from the subsequent Tez plan. > The query generated might not be always valid SQL at this point, but is a > world ahead of DAG plans in readability. > Eg. 
tpc-ds Query4 CTEs gets expanded to > {code} > SELECT t16.$f3 customer_preferred_cust_flag > FROM > (SELECT t0.c_customer_id $f0, >SUM((t2.ws_ext_list_price - > t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / > CAST(2 AS DECIMAL(10, 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t0 >INNER JOIN ( > (SELECT ws_sold_date_sk, > ws_bill_customer_sk, > ws_ext_discount_amt, > ws_ext_sales_price, > ws_ext_wholesale_cost, > ws_ext_list_price > FROM default.web_sales > WHERE ws_bill_customer_sk IS NOT NULL > AND ws_sold_date_sk IS NOT NULL) t2 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = > t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk >GROUP BY t0.c_customer_id, > t0.c_first_name, > t0.c_last_name, > t0.c_preferred_cust_flag, > t0.c_birth_country, > t0.c_login, > t0.c_email_address) t7 > INNER JOIN ( > (SELECT t9.c_customer_id $f0, >t9.c_preferred_cust_flag $f3, > > SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - > t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, > 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t9 >INNER JOIN ( > (SELECT ss_sold_date_sk, > ss_customer_sk, > ss_ext_discount_amt, > ss_ext_sales_price, > ss_ext_wholesale_cost, > ss_ext_list_price > FROM default.store_sales > WHERE ss_customer_sk IS NOT NULL > AND ss_sold_date_sk IS NOT NULL) t11 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND 
d_date_sk IS NOT NULL) t13 ON > t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk >GROUP BY t9.c_customer_id, > t9.c_first_name, > t9.c_last_name, > t9.c_preferred_cust_flag, > t9.c_birth_country, >
[jira] [Updated] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object
[ https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-19360: --- Attachment: HIVE-19360.6.patch > CBO: Add an "optimizedSQL" to QueryPlan object > --- > > Key: HIVE-19360 > URL: https://issues.apache.org/jira/browse/HIVE-19360 > Project: Hive > Issue Type: Improvement > Components: CBO, Diagnosability >Affects Versions: 3.1.0 >Reporter: Gopal V >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-19360.1.patch, HIVE-19360.2.patch, > HIVE-19360.3.patch, HIVE-19360.4.patch, HIVE-19360.5.patch, HIVE-19360.6.patch > > > Calcite RelNodes can be converted back into SQL (as the new JDBC storage > handler does), which allows Hive to print out the post CBO plan as a SQL > query instead of having to guess the join orders from the subsequent Tez plan. > The query generated might not be always valid SQL at this point, but is a > world ahead of DAG plans in readability. > Eg. tpc-ds Query4 CTEs gets expanded to > {code} > SELECT t16.$f3 customer_preferred_cust_flag > FROM > (SELECT t0.c_customer_id $f0, >SUM((t2.ws_ext_list_price - > t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / > CAST(2 AS DECIMAL(10, 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t0 >INNER JOIN ( > (SELECT ws_sold_date_sk, > ws_bill_customer_sk, > ws_ext_discount_amt, > ws_ext_sales_price, > ws_ext_wholesale_cost, > ws_ext_list_price > FROM default.web_sales > WHERE ws_bill_customer_sk IS NOT NULL > AND ws_sold_date_sk IS NOT NULL) t2 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = > t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk 
>GROUP BY t0.c_customer_id, > t0.c_first_name, > t0.c_last_name, > t0.c_preferred_cust_flag, > t0.c_birth_country, > t0.c_login, > t0.c_email_address) t7 > INNER JOIN ( > (SELECT t9.c_customer_id $f0, >t9.c_preferred_cust_flag $f3, > > SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - > t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, > 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t9 >INNER JOIN ( > (SELECT ss_sold_date_sk, > ss_customer_sk, > ss_ext_discount_amt, > ss_ext_sales_price, > ss_ext_wholesale_cost, > ss_ext_list_price > FROM default.store_sales > WHERE ss_customer_sk IS NOT NULL > AND ss_sold_date_sk IS NOT NULL) t11 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t13 ON > t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk >GROUP BY t9.c_customer_id, > t9.c_first_name, > t9.c_last_name, > t9.c_preferred_cust_flag, > t9.c_birth_country, >
[jira] [Commented] (HIVE-20095) Fix jdbc external table feature
[ https://issues.apache.org/jira/browse/HIVE-20095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542339#comment-16542339 ] Jesus Camacho Rodriguez commented on HIVE-20095: [~daijy], the motivation for that is that currently we can automatically generate complex SQL statements from Calcite for the part of the query that we push to the storage handler. Hence, the types for that TableScan should come from the output of that query rather than from the Table schema. I think an easy fix would be that if the query has been generated by Hive, then we could use the ResultSet to determine the schema; otherwise we fall back to using the table schema, which is what [~msydoron] did. Since the number of dialects supported from Calcite will be limited, this means we should also include a map from the types in those databases to Hive types. We can do that later on. Does it make sense? [~msydoron], there still seem to be some test failures. > Fix jdbc external table feature > --- > > Key: HIVE-20095 > URL: https://issues.apache.org/jira/browse/HIVE-20095 > Project: Hive > Issue Type: Bug >Reporter: Jonathan Doron >Assignee: Jonathan Doron >Priority: Major > Attachments: HIVE-20095.1.patch, HIVE-20095.2.patch, > HIVE-20095.3.patch > > > It seems like the committed code for HIVE-19161 > (7584b3276bebf64aa006eaa162c0a6264d8fcb56) reverted some of HIVE-18423 > updates, and therefore some of the external table queries are not working > correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
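The fallback described in the comment above could be sketched as follows. This is only an illustration under assumptions: `JdbcSchemaSketch`, `resolveColumnTypes`, and the contents of the type map are hypothetical and not taken from the patch; only the `java.sql.Types` constants are real. If the pushed-down query was generated by Hive, the column types come from the query result metadata (represented here as a list of JDBC type codes, as `ResultSetMetaData.getColumnType` would return them); otherwise the declared table schema is used.

```java
import java.sql.Types;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: choose where a JDBC-backed scan gets its column types from.
final class JdbcSchemaSketch {
    // Illustrative and incomplete mapping from JDBC type codes to Hive type names (assumption).
    private static final Map<Integer, String> JDBC_TO_HIVE = new HashMap<>();
    static {
        JDBC_TO_HIVE.put(Types.INTEGER, "int");
        JDBC_TO_HIVE.put(Types.BIGINT, "bigint");
        JDBC_TO_HIVE.put(Types.DOUBLE, "double");
        JDBC_TO_HIVE.put(Types.VARCHAR, "string");
        JDBC_TO_HIVE.put(Types.TIMESTAMP, "timestamp");
    }

    static List<String> resolveColumnTypes(boolean queryGeneratedByHive,
                                           List<Integer> resultSetJdbcTypes,
                                           List<String> tableSchemaTypes) {
        if (!queryGeneratedByHive) {
            // Fallback: trust the declared table schema.
            return new ArrayList<>(tableSchemaTypes);
        }
        // Generated query: derive the types from the result set metadata.
        List<String> types = new ArrayList<>();
        for (int code : resultSetJdbcTypes) {
            String hiveType = JDBC_TO_HIVE.get(code);
            if (hiveType == null) {
                throw new IllegalArgumentException("No Hive type mapping for JDBC type code " + code);
            }
            types.add(hiveType);
        }
        return types;
    }
}
```

A per-dialect map, as the comment proposes, would replace the single static map here; failing on an unmapped type is one possible choice, and silently defaulting to `string` would be another.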
[jira] [Updated] (HIVE-20097) Convert standalone-metastore to a submodule
[ https://issues.apache.org/jira/browse/HIVE-20097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Kolbasov updated HIVE-20097: -- Attachment: HIVE-20097.07.patch > Convert standalone-metastore to a submodule > --- > > Key: HIVE-20097 > URL: https://issues.apache.org/jira/browse/HIVE-20097 > Project: Hive > Issue Type: Sub-task > Components: Hive, Metastore, Standalone Metastore >Affects Versions: 3.1.0, 4.0.0 >Reporter: Alexander Kolbasov >Assignee: Alexander Kolbasov >Priority: Major > Attachments: HIVE-20097.01.patch, HIVE-20097.02.patch, > HIVE-20097.03.patch, HIVE-20097.04.patch, HIVE-20097.05.patch, > HIVE-20097.06.patch, HIVE-20097.07.patch > > > This is a subtask to stage HIVE-17751 changes into several smaller phases. > The first part is moving existing code in hive-standalone-metastore to a > sub-module. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20097) Convert standalone-metastore to a submodule
[ https://issues.apache.org/jira/browse/HIVE-20097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542331#comment-16542331 ] Alexander Kolbasov commented on HIVE-20097: --- Looks like findbugs wasn't happy because it couldn't find findbugs-exclude.xml file - added it under standalonemetastore-metastore-common/findbugs. Patch 7 contains the change. > Convert standalone-metastore to a submodule > --- > > Key: HIVE-20097 > URL: https://issues.apache.org/jira/browse/HIVE-20097 > Project: Hive > Issue Type: Sub-task > Components: Hive, Metastore, Standalone Metastore >Affects Versions: 3.1.0, 4.0.0 >Reporter: Alexander Kolbasov >Assignee: Alexander Kolbasov >Priority: Major > Attachments: HIVE-20097.01.patch, HIVE-20097.02.patch, > HIVE-20097.03.patch, HIVE-20097.04.patch, HIVE-20097.05.patch, > HIVE-20097.06.patch, HIVE-20097.07.patch > > > This is a subtask to stage HIVE-17751 changes into several smaller phases. > The first part is moving existing code in hive-standalone-metastore to a > sub-module. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18038) org.apache.hadoop.hive.ql.session.OperationLog - Review
[ https://issues.apache.org/jira/browse/HIVE-18038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HIVE-18038: --- Status: Patch Available (was: Open) > org.apache.hadoop.hive.ql.session.OperationLog - Review > --- > > Key: HIVE-18038 > URL: https://issues.apache.org/jira/browse/HIVE-18038 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Attachments: HIVE-18038.1.patch, HIVE-18038.2.patch, > HIVE-18038.3.patch, HIVE-18038.4.patch, HIVE-18038.5.patch, > HIVE-18038.6.patch, HIVE-18038.7.patch, HIVE-18038.8.patch, HIVE-18038.9.patch > > > Simplifications, improve readability -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18038) org.apache.hadoop.hive.ql.session.OperationLog - Review
[ https://issues.apache.org/jira/browse/HIVE-18038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HIVE-18038: --- Attachment: HIVE-18038.9.patch > org.apache.hadoop.hive.ql.session.OperationLog - Review > --- > > Key: HIVE-18038 > URL: https://issues.apache.org/jira/browse/HIVE-18038 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Attachments: HIVE-18038.1.patch, HIVE-18038.2.patch, > HIVE-18038.3.patch, HIVE-18038.4.patch, HIVE-18038.5.patch, > HIVE-18038.6.patch, HIVE-18038.7.patch, HIVE-18038.8.patch, HIVE-18038.9.patch > > > Simplifications, improve readability -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18038) org.apache.hadoop.hive.ql.session.OperationLog - Review
[ https://issues.apache.org/jira/browse/HIVE-18038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HIVE-18038: --- Status: Open (was: Patch Available) > org.apache.hadoop.hive.ql.session.OperationLog - Review > --- > > Key: HIVE-18038 > URL: https://issues.apache.org/jira/browse/HIVE-18038 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Attachments: HIVE-18038.1.patch, HIVE-18038.2.patch, > HIVE-18038.3.patch, HIVE-18038.4.patch, HIVE-18038.5.patch, > HIVE-18038.6.patch, HIVE-18038.7.patch, HIVE-18038.8.patch > > > Simplifications, improve readability -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18038) org.apache.hadoop.hive.ql.session.OperationLog - Review
[ https://issues.apache.org/jira/browse/HIVE-18038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HIVE-18038: --- Attachment: (was: HIVE-18038.9.patch) > org.apache.hadoop.hive.ql.session.OperationLog - Review > --- > > Key: HIVE-18038 > URL: https://issues.apache.org/jira/browse/HIVE-18038 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Attachments: HIVE-18038.1.patch, HIVE-18038.2.patch, > HIVE-18038.3.patch, HIVE-18038.4.patch, HIVE-18038.5.patch, > HIVE-18038.6.patch, HIVE-18038.7.patch, HIVE-18038.8.patch > > > Simplifications, improve readability -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20163) Simplify StringSubstrColStart Initialization
[ https://issues.apache.org/jira/browse/HIVE-20163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HIVE-20163: --- Status: Patch Available (was: Open) > Simplify StringSubstrColStart Initialization > > > Key: HIVE-20163 > URL: https://issues.apache.org/jira/browse/HIVE-20163 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 3.0.0, 4.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HIVE-20163.1.patch > > > * Remove code > * Remove exception handling > * Remove {{printStackTrace}} call -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20163) Simplify StringSubstrColStart Initialization
[ https://issues.apache.org/jira/browse/HIVE-20163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HIVE-20163: --- Attachment: HIVE-20163.1.patch > Simplify StringSubstrColStart Initialization > > > Key: HIVE-20163 > URL: https://issues.apache.org/jira/browse/HIVE-20163 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 3.0.0, 4.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HIVE-20163.1.patch > > > * Remove code > * Remove exception handling > * Remove {{printStackTrace}} call -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-20163) Simplify StringSubstrColStart Initialization
[ https://issues.apache.org/jira/browse/HIVE-20163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR reassigned HIVE-20163: -- Assignee: BELUGA BEHR > Simplify StringSubstrColStart Initialization > > > Key: HIVE-20163 > URL: https://issues.apache.org/jira/browse/HIVE-20163 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 3.0.0, 4.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HIVE-20163.1.patch > > > * Remove code > * Remove exception handling > * Remove {{printStackTrace}} call -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18038) org.apache.hadoop.hive.ql.session.OperationLog - Review
[ https://issues.apache.org/jira/browse/HIVE-18038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HIVE-18038: --- Status: Patch Available (was: In Progress) Yup. This is getting embarrassing. HA. > org.apache.hadoop.hive.ql.session.OperationLog - Review > --- > > Key: HIVE-18038 > URL: https://issues.apache.org/jira/browse/HIVE-18038 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Attachments: HIVE-18038.1.patch, HIVE-18038.2.patch, > HIVE-18038.3.patch, HIVE-18038.4.patch, HIVE-18038.5.patch, > HIVE-18038.6.patch, HIVE-18038.7.patch, HIVE-18038.8.patch, HIVE-18038.9.patch > > > Simplifications, improve readability -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18038) org.apache.hadoop.hive.ql.session.OperationLog - Review
[ https://issues.apache.org/jira/browse/HIVE-18038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HIVE-18038: --- Attachment: HIVE-18038.9.patch > org.apache.hadoop.hive.ql.session.OperationLog - Review > --- > > Key: HIVE-18038 > URL: https://issues.apache.org/jira/browse/HIVE-18038 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Attachments: HIVE-18038.1.patch, HIVE-18038.2.patch, > HIVE-18038.3.patch, HIVE-18038.4.patch, HIVE-18038.5.patch, > HIVE-18038.6.patch, HIVE-18038.7.patch, HIVE-18038.8.patch, HIVE-18038.9.patch > > > Simplifications, improve readability -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19940) Push predicates with deterministic UDFs with RBO
[ https://issues.apache.org/jira/browse/HIVE-19940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Janaki Lahorani updated HIVE-19940: --- Attachment: HIVE-19940.4.patch > Push predicates with deterministic UDFs with RBO > > > Key: HIVE-19940 > URL: https://issues.apache.org/jira/browse/HIVE-19940 > Project: Hive > Issue Type: Improvement >Reporter: Janaki Lahorani >Assignee: Janaki Lahorani >Priority: Major > Attachments: HIVE-19940.1.patch, HIVE-19940.2.patch, > HIVE-19940.3.patch, HIVE-19940.4.patch > > > With RBO, predicates with any UDF doesn't get pushed down. It makes sense to > not pushdown the predicates with non-deterministic function as the meaning of > the query changes after the predicate is resolved to use the function. But > pushing a deterministic function is beneficial. > Test Case: > {code} > set hive.cbo.enable=false; > CREATE TABLE `testb`( >`cola` string COMMENT '', >`colb` string COMMENT '', >`colc` string COMMENT '') > PARTITIONED BY ( >`part1` string, >`part2` string, >`part3` string) > STORED AS AVRO; > CREATE TABLE `testa`( >`col1` string COMMENT '', >`col2` string COMMENT '', >`col3` string COMMENT '', >`col4` string COMMENT '', >`col5` string COMMENT '') > PARTITIONED BY ( >`part1` string, >`part2` string, >`part3` string) > STORED AS AVRO; > insert into testA partition (part1='US', part2='ABC', part3='123') > values ('12.34', '100', '200', '300', 'abc'), > ('12.341', '1001', '2001', '3001', 'abcd'); > insert into testA partition (part1='UK', part2='DEF', part3='123') > values ('12.34', '100', '200', '300', 'abc'), > ('12.341', '1001', '2001', '3001', 'abcd'); > insert into testA partition (part1='US', part2='DEF', part3='200') > values ('12.34', '100', '200', '300', 'abc'), > ('12.341', '1001', '2001', '3001', 'abcd'); > insert into testA partition (part1='CA', part2='ABC', part3='300') > values ('12.34', '100', '200', '300', 'abc'), > ('12.341', '1001', '2001', '3001', 'abcd'); > insert into testB partition 
(part1='CA', part2='ABC', part3='300') > values ('600', '700', 'abc'), ('601', '701', 'abcd'); > insert into testB partition (part1='CA', part2='ABC', part3='400') > values ( '600', '700', 'abc'), ( '601', '701', 'abcd'); > insert into testB partition (part1='UK', part2='PQR', part3='500') > values ('600', '700', 'abc'), ('601', '701', 'abcd'); > insert into testB partition (part1='US', part2='DEF', part3='200') > values ( '600', '700', 'abc'), ('601', '701', 'abcd'); > insert into testB partition (part1='US', part2='PQR', part3='123') > values ( '600', '700', 'abc'), ('601', '701', 'abcd'); > -- views with deterministic functions > create view viewDeterministicUDFA partitioned on (vpart1, vpart2, vpart3) as > select > cast(col1 as decimal(38,18)) as vcol1, > cast(col2 as decimal(38,18)) as vcol2, > cast(col3 as decimal(38,18)) as vcol3, > cast(col4 as decimal(38,18)) as vcol4, > cast(col5 as char(10)) as vcol5, > cast(part1 as char(2)) as vpart1, > cast(part2 as char(3)) as vpart2, > cast(part3 as char(3)) as vpart3 > from testa > where part1 in ('US', 'CA'); > create view viewDeterministicUDFB partitioned on (vpart1, vpart2, vpart3) as > select > cast(cola as decimal(38,18)) as vcolA, > cast(colb as decimal(38,18)) as vcolB, > cast(colc as char(10)) as vcolC, > cast(part1 as char(2)) as vpart1, > cast(part2 as char(3)) as vpart2, > cast(part3 as char(3)) as vpart3 > from testb > where part1 in ('US', 'CA'); > explain > select vcol1, vcol2, vcol3, vcola, vcolb > from viewDeterministicUDFA a inner join viewDeterministicUDFB b > on a.vpart1 = b.vpart1 > and a.vpart2 = b.vpart2 > and a.vpart3 = b.vpart3 > and a.vpart1 = 'US' > and a.vpart2 = 'DEF' > and a.vpart3 = '200'; > {code} > Plan where the CAST is not pushed down. 
> {code} > STAGE PLANS: > Stage: Stage-1 > Map Reduce > Map Operator Tree: > TableScan > alias: testa > filterExpr: (part1) IN ('US', 'CA') (type: boolean) > Statistics: Num rows: 6 Data size: 13740 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: CAST( col1 AS decimal(38,18)) (type: > decimal(38,18)), CAST( col2 AS decimal(38,18)) (type: decimal(38,18)), CAST( > col3 AS decimal(38,18)) (type: decimal(38,18)), CAST( part1 AS CHAR(2)) > (type: char(2)), CAST( part2 AS CHAR(3)) (type: char(3)), CAST( part3 AS > CHAR(3)) (type: char(3)) > outputColumnNames: _col0, _col1, _col2, _col5, _col6, _col7 > Statistics: Num rows: 6 Data size: 13740 Basic stats: COMPLETE > Column stats: NONE > Filter Operator > predicate: ((_col5 = 'US') and (_col6 = 'DEF') and (_col7 = > '200')) (type: boolean) >
[jira] [Commented] (HIVE-20097) Convert standalone-metastore to a submodule
[ https://issues.apache.org/jira/browse/HIVE-20097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542316#comment-16542316 ] Alexander Kolbasov commented on HIVE-20097: --- Patch 6 fixes rat violations - the only change is the addition of this block to {{standalone-metastore/pom.xml}}: {code} <plugin> <groupId>org.apache.rat</groupId> <artifactId>apache-rat-plugin</artifactId> <version>0.10</version> <configuration> <excludes> <exclude>binary-package-licenses/**</exclude> <exclude>DEV-README</exclude> <exclude>**/src/main/sql/**</exclude> <exclude>**/README.md</exclude> <exclude>**/*.iml</exclude> <exclude>**/*.txt</exclude> <exclude>**/*.log</exclude> <exclude>**/*.arcconfig</exclude> <exclude>**/package-info.java</exclude> <exclude>**/*.properties</exclude> <exclude>**/*.q</exclude> <exclude>**/*.q.out</exclude> <exclude>**/*.xml</exclude> <exclude>**/gen/**</exclude> <exclude>**/patchprocess/**</exclude> <exclude>**/metastore_db/**</exclude> </excludes> </configuration> </plugin> {code} The patch is merged to {code} * commit 57dd30441a708f9fe653aea1c54df678ed459c34 (origin/master, origin/HEAD) | Author: Aihua Xu | Date: Fri Jun 29 14:40:43 2018 -0700 | | HIVE-20037: Print root cause exception's toString() rather than getMessage() (Aihua Xu, reviewed by Sahil Takiar) {code} > Convert standalone-metastore to a submodule > --- > > Key: HIVE-20097 > URL: https://issues.apache.org/jira/browse/HIVE-20097 > Project: Hive > Issue Type: Sub-task > Components: Hive, Metastore, Standalone Metastore >Affects Versions: 3.1.0, 4.0.0 >Reporter: Alexander Kolbasov >Assignee: Alexander Kolbasov >Priority: Major > Attachments: HIVE-20097.01.patch, HIVE-20097.02.patch, > HIVE-20097.03.patch, HIVE-20097.04.patch, HIVE-20097.05.patch, > HIVE-20097.06.patch > > > This is a subtask to stage HIVE-17751 changes into several smaller phases. > The first part is moving existing code in hive-standalone-metastore to a > sub-module. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20090) Extend creation of semijoin reduction filters to be able to discover new opportunities
[ https://issues.apache.org/jira/browse/HIVE-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-20090: --- Attachment: HIVE-20090.07.patch > Extend creation of semijoin reduction filters to be able to discover new > opportunities > -- > > Key: HIVE-20090 > URL: https://issues.apache.org/jira/browse/HIVE-20090 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-20090.01.patch, HIVE-20090.02.patch, > HIVE-20090.04.patch, HIVE-20090.05.patch, HIVE-20090.06.patch, > HIVE-20090.07.patch > > > Assume the following plan: > {noformat} > TS[0] - RS[1] - JOIN[4] - RS[5] - JOIN[8] - FS[9] > TS[2] - RS[3] - JOIN[4] > TS[6] - RS[7] - JOIN[8] > {noformat} > Currently, {{TS\[6\]}} may only be reduced with the output of {{RS\[5\]}}, > i.e., input to join between both subplans. > However, it may be useful to consider other possibilities too, e.g., reduced > by the output of {{RS\[1\]}} or {{RS\[3\]}}. For instance, this is important > when, given a large plan, an edge between {{RS[5]}} and {{TS[0]}} would > create a cycle, while an edge between {{RS[1]}} and {{TS[6]}} would not. > This patch comprises two parts. First, it creates additional predicates when > possible. Secondly, it removes duplicate semijoin reduction > branches/predicates, e.g., if another semijoin that consumes the output of > the same expression already reduces a certain table scan operator (heuristic, > since this may not result in most efficient plan in all cases). Ultimately, > the decision on whether to use one or another should be cost-driven > (follow-up). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20097) Convert standalone-metastore to a submodule
[ https://issues.apache.org/jira/browse/HIVE-20097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Kolbasov updated HIVE-20097: -- Attachment: HIVE-20097.06.patch > Convert standalone-metastore to a submodule > --- > > Key: HIVE-20097 > URL: https://issues.apache.org/jira/browse/HIVE-20097 > Project: Hive > Issue Type: Sub-task > Components: Hive, Metastore, Standalone Metastore >Affects Versions: 3.1.0, 4.0.0 >Reporter: Alexander Kolbasov >Assignee: Alexander Kolbasov >Priority: Major > Attachments: HIVE-20097.01.patch, HIVE-20097.02.patch, > HIVE-20097.03.patch, HIVE-20097.04.patch, HIVE-20097.05.patch, > HIVE-20097.06.patch > > > This is a subtask to stage HIVE-17751 changes into several smaller phases. > The first part is moving existing code in hive-standalone-metastore to a > sub-module. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-19668: -- Status: Patch Available (was: In Progress) > Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and > duplicate strings > -- > > Key: HIVE-19668 > URL: https://issues.apache.org/jira/browse/HIVE-19668 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev >Priority: Major > Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, > HIVE-19668.03.patch, image-2018-05-22-17-41-39-572.png > > > I've recently analyzed a HS2 heap dump, obtained when there was a huge memory > spike during compilation of some big query. The analysis was done with jxray > ([www.jxray.com|http://www.jxray.com/]). It turns out that more than 90% of > the 20G heap was used by data structures associated with query parsing > ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple > opportunities for optimizations here. One of them is to stop the code from > creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See > a sample of these objects in the attached image: > !image-2018-05-22-17-41-39-572.png|width=879,height=399! > Looks like these particular {{CommonToken}} objects are constants, that don't > change once created. I see some code, e.g. in > {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are > apparently repeatedly created with e.g. {{new > CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds > are instead created once and reused, we will save more than 1/10th of the > heap in this scenario. Plus, since these objects are small but very numerous, > getting rid of them will remove a great deal of pressure from the GC. > Another source of waste are duplicate strings, that collectively waste 26.1% > of memory. 
Some of them come from CommonToken objects that have the same text > (i.e. for multiple CommonToken objects the contents of their 'text' Strings > are the same, but each has its own copy of that String). Other duplicate > strings come from other sources, that are easy enough to fix by adding > String.intern() calls. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
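The two ideas in the description above (create each constant token once and reuse it, and share one String instance per distinct text) can be sketched as follows. This is only an illustration under stated assumptions, not the attached patch: {{Token}} is a hypothetical immutable stand-in rather than {{org.antlr.runtime.CommonToken}}, {{TokenCache}} and {{of}} are made-up names, and the sketch assumes each token type always carries the same text.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class TokenCache {
    /** Hypothetical immutable stand-in for a parser token (not the ANTLR class). */
    static final class Token {
        final int type;
        final String text;

        Token(int type, String text) {
            this.type = type;
            // intern() returns the pooled copy, so equal texts share one String
            this.text = text.intern();
        }
    }

    // One shared instance per token type; assumes a type always has the same text.
    private static final Map<Integer, Token> CACHE = new ConcurrentHashMap<>();

    /** Returns the cached token for this type, creating it on first use. */
    static Token of(int type, String text) {
        return CACHE.computeIfAbsent(type, t -> new Token(t, text));
    }

    public static void main(String[] args) {
        Token first = of(1, "TOK_INSERT");
        Token second = of(1, new String("TOK_INSERT"));
        // Both calls hand back the same object instead of allocating a duplicate.
        System.out.println(first == second);
    }
}
```

Sharing instances like this is only safe if nothing mutates a token after creation, which is what the description suggests but would need to be checked against the real callers.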
[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-19668: -- Attachment: HIVE-19668.03.patch > Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and > duplicate strings > -- > > Key: HIVE-19668 > URL: https://issues.apache.org/jira/browse/HIVE-19668 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev >Priority: Major > Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, > HIVE-19668.03.patch, image-2018-05-22-17-41-39-572.png > > > I've recently analyzed a HS2 heap dump, obtained when there was a huge memory > spike during compilation of some big query. The analysis was done with jxray > ([www.jxray.com|http://www.jxray.com/]). It turns out that more than 90% of > the 20G heap was used by data structures associated with query parsing > ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple > opportunities for optimizations here. One of them is to stop the code from > creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See > a sample of these objects in the attached image: > !image-2018-05-22-17-41-39-572.png|width=879,height=399! > Looks like these particular {{CommonToken}} objects are constants, that don't > change once created. I see some code, e.g. in > {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are > apparently repeatedly created with e.g. {{new > CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds > are instead created once and reused, we will save more than 1/10th of the > heap in this scenario. Plus, since these objects are small but very numerous, > getting rid of them will remove a great deal of pressure from the GC. > Another source of waste are duplicate strings, that collectively waste 26.1% > of memory. 
Some of them come from CommonToken objects that have the same text > (i.e. for multiple CommonToken objects the contents of their 'text' Strings > are the same, but each has its own copy of that String). Other duplicate > strings come from other sources, that are easy enough to fix by adding > String.intern() calls. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-19668: -- Status: In Progress (was: Patch Available) > Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and > duplicate strings > -- > > Key: HIVE-19668 > URL: https://issues.apache.org/jira/browse/HIVE-19668 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev >Priority: Major > Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, > image-2018-05-22-17-41-39-572.png > > > I've recently analyzed a HS2 heap dump, obtained when there was a huge memory > spike during compilation of some big query. The analysis was done with jxray > ([www.jxray.com|http://www.jxray.com/]). It turns out that more than 90% of > the 20G heap was used by data structures associated with query parsing > ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple > opportunities for optimizations here. One of them is to stop the code from > creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See > a sample of these objects in the attached image: > !image-2018-05-22-17-41-39-572.png|width=879,height=399! > Looks like these particular {{CommonToken}} objects are constants, that don't > change once created. I see some code, e.g. in > {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are > apparently repeatedly created with e.g. {{new > CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds > are instead created once and reused, we will save more than 1/10th of the > heap in this scenario. Plus, since these objects are small but very numerous, > getting rid of them will remove a great deal of pressure from the GC. > Another source of waste are duplicate strings, that collectively waste 26.1% > of memory. 
Some of them come from CommonToken objects that have the same text > (i.e. for multiple CommonToken objects the contents of their 'text' Strings > are the same, but each has its own copy of that String). Other duplicate > strings come from other sources, that are easy enough to fix by adding > String.intern() calls. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-17852) remove support for list bucketing "stored as directories" in 3.0
[ https://issues.apache.org/jira/browse/HIVE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542311#comment-16542311 ] Hive QA commented on HIVE-17852: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12931284/HIVE-17852.20.patch {color:green}SUCCESS:{color} +1 due to 36 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14647 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=154) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12566/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12566/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12566/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12931284 - PreCommit-HIVE-Build > remove support for list bucketing "stored as directories" in 3.0 > > > Key: HIVE-17852 > URL: https://issues.apache.org/jira/browse/HIVE-17852 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Laszlo Bodor >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-17852.01.patch, HIVE-17852.02.patch, > HIVE-17852.03.patch, HIVE-17852.04.patch, HIVE-17852.05.patch, > HIVE-17852.06.patch, HIVE-17852.07.patch, HIVE-17852.08.patch, > HIVE-17852.09.patch, HIVE-17852.10.patch, HIVE-17852.11.patch, > HIVE-17852.12.patch, HIVE-17852.13.patch, HIVE-17852.14.patch, > HIVE-17852.15.patch, HIVE-17852.16.patch, HIVE-17852.17.patch, > HIVE-17852.18.patch, HIVE-17852.19.patch, HIVE-17852.20.patch > > > From the email thread: > 1) LB, when stored as directories, adds a lot of low-level complexity to Hive > tables that has to be accounted for in many places in the code where the > files are written or modified - from FSOP to ACID/replication/export. > 2) While working on some FSOP code I noticed that some of that logic is > broken - e.g. the duplicate file removal from tasks, a pretty fundamental > correctness feature in Hive, may be broken. LB also doesn’t appear to be > compatible with e.g. regular bucketing. > 3) The feature hasn’t seen development activity in a while; it also doesn’t > appear to be used a lot. > Keeping with the theme of cleaning up “legacy” code for 3.0, I was proposing > we remove it. > (2) also suggested that, if needed, it might be easier to implement similar > functionality by adding some flexibility to partitions (which LB directories > look like anyway); that would also keep the logic on a higher level of > abstraction (split generation, partition pruning) as opposed to many > low-level places like FSOP, etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20151) External table: exception while storing stats
[ https://issues.apache.org/jira/browse/HIVE-20151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542288#comment-16542288 ] Jaume M commented on HIVE-20151: Duplicate from https://issues.apache.org/jira/browse/HIVE-19316 [~kgyrtkirk]? > External table: exception while storing stats > - > > Key: HIVE-20151 > URL: https://issues.apache.org/jira/browse/HIVE-20151 > Project: Hive > Issue Type: Bug > Components: Metastore, Statistics >Reporter: Zoltan Haindrich >Priority: Major > > {code} > create external table e3(a integer,b string,c double); > -- goes well > insert into e3 values(1,'2',3); > -- takes a while: > insert into e3 values(1,'2',3); > -- after 2 minutes > -- > ERROR : FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.StatsTask > INFO : Completed executing > command(queryId=hive_20180712120342_6893e234-44a0-4e48-8320-f1699557bae3); > Time taken: 125.276 seconds > Error: Error while processing statement: FAILED: Execution Error, return code > 1 from org.apache.hadoop.hive.ql.exec.StatsTask (state=08S01,code=1) > {code} > exception in metastore logs: > {code} > java.lang.ClassCastException: > org.apache.hadoop.hive.metastore.api.LongColumnStatsData cannot be cast to > org.apache.hadoop.hive.metastore.columnstats.cache.LongColumnStatsDataInspector > at > org.apache.hadoop.hive.metastore.columnstats.merge.LongColumnStatsMerger.merge(LongColumnStatsMerger.java:30) > ~[hive-exec-3.1.0.3.0.0.0-1632.jar:3.1.0.3.0.0.0-1632] > at > org.apache.hadoop.hive.metastore.utils.MetaStoreUtils.mergeColStats(MetaStoreUtils.java:1084) > ~[hive-exec-3.1.0.3.0.0.0-1632.jar:3.1.0.3.0.0.0-1632] > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.set_aggr_stats_for(HiveMetaStore.java:7514) > ~[hive-exec-3.1.0.3.0.0.0-1632.jar:3.1.0.3.0.0.0-1632] > at sun.reflect.GeneratedMethodAccessor80.invoke(Unknown Source) ~[?:?] 
> at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_161] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_161] > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > ~[hive-exec-3.1.0.3.0.0.0-1632.jar:3.1.0.3.0.0.0-1632] > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) > ~[hive-exec-3.1.0.3.0.0.0-1632.jar:3.1.0.3.0.0.0-1632] > at com.sun.proxy.$Proxy34.set_aggr_stats_for(Unknown Source) ~[?:?] > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$set_aggr_stats_for.getResult(ThriftHiveMetastore.java:17017) > ~[hive-exec-3.1.0.3.0.0.0-1632.jar:3.1.0.3.0.0.0-1632] > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$set_aggr_stats_for.getResult(ThriftHiveMetastore.java:17001) > ~[hive-exec-3.1.0.3.0.0.0-1632.jar:3.1.0.3.0.0.0-1632] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20158) Do Not Print StackTraces to STDERR in Base64TextOutputFormat
[ https://issues.apache.org/jira/browse/HIVE-20158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542278#comment-16542278 ]

BELUGA BEHR commented on HIVE-20158:

The class {{Base64TextInputFormat}} has the same issue.

> Do Not Print StackTraces to STDERR in Base64TextOutputFormat
>
>                 Key: HIVE-20158
>                 URL: https://issues.apache.org/jira/browse/HIVE-20158
>             Project: Hive
>          Issue Type: Improvement
>          Components: Contrib
>    Affects Versions: 3.0.0, 4.0.0
>            Reporter: BELUGA BEHR
>            Priority: Trivial
>              Labels: newbie, noob
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/base64/Base64TextOutputFormat.java
> {code}
> try {
>   String signatureString = job.get("base64.text.output.format.signature");
>   if (signatureString != null) {
>     signature = signatureString.getBytes("UTF-8");
>   } else {
>     signature = new byte[0];
>   }
> } catch (UnsupportedEncodingException e) {
>   e.printStackTrace();
> }
> {code}
> The {{UnsupportedEncodingException}} comes from the {{getBytes(String)}} method call. Use the {{Charset}} overload of the method instead: it does not declare this checked exception, so the 'try' block can simply be removed. Every JVM is required to support UTF-8.
> https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#getBytes(java.nio.charset.Charset)
> https://docs.oracle.com/javase/7/docs/api/java/nio/charset/StandardCharsets.html#UTF_8

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
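The change suggested in HIVE-20158 might look roughly like the following. This is only a minimal sketch, not the committed patch: the class name {{SignatureBytes}} and the helper {{toSignature}} are hypothetical, and the null handling simply mirrors the quoted snippet.

```java
import java.nio.charset.StandardCharsets;

public class SignatureBytes {
    // Hypothetical helper mirroring the null check in the quoted snippet.
    static byte[] toSignature(String signatureString) {
        if (signatureString == null) {
            return new byte[0];
        }
        // getBytes(Charset) declares no checked exception, so no try/catch
        // (and no printStackTrace) is needed here.
        return signatureString.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(toSignature("abc").length);
        System.out.println(toSignature(null).length);
    }
}
```

With the {{Charset}} overload there is no exception path left to handle, which is why the issue proposes dropping the 'try' block entirely rather than replacing the stack trace with a log statement.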
[jira] [Updated] (HIVE-20159) Do Not Print StackTraces to STDERR in ConditionalResolverSkewJoin
[ https://issues.apache.org/jira/browse/HIVE-20159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HIVE-20159: --- Description: https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverSkewJoin.java#L121 {code} } catch (IOException e) { e.printStackTrace(); } {code} Introduce an SLF4J logger to this class and print a WARN level log message if the {{IOException}} from {{Utilities.listStatusIfExists}} is generated. I suggest WARN because the entire operation doesn't fail if this error happens. It continues on its way with the data that it was able to collect. I'm not sure if this is the intended behavior, but for now, a helpful warning message in the logging would be better. was: https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverSkewJoin.java#L121 {code} } catch (IOException e) { e.printStackTrace(); } {code} Introduce an SLF4J logger to this class and print a WARN level log message if the {{IOException}} from {{Utilities.listStatusIfExists}} is generated. I suggest WARN because the entire operation doesn't fail if this error happens. It continues on its way with the data that it was able to collect. I'm not sure if this is the intended behavior, but for now, an error message in the logging would be better. 
> Do Not Print StackTraces to STDERR in ConditionalResolverSkewJoin > - > > Key: HIVE-20159 > URL: https://issues.apache.org/jira/browse/HIVE-20159 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Affects Versions: 3.0.0, 4.0.0 >Reporter: BELUGA BEHR >Priority: Minor > Labels: newbie, noob > > https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverSkewJoin.java#L121 > {code} > } catch (IOException e) { > e.printStackTrace(); > } > {code} > Introduce an SLF4J logger to this class and print a WARN level log message if > the {{IOException}} from {{Utilities.listStatusIfExists}} is generated. I > suggest WARN because the entire operation doesn't fail if this error happens. > It continues on its way with the data that it was able to collect. I'm not > sure if this is the intended behavior, but for now, a helpful warning message > in the logging would be better. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542265#comment-16542265 ]

Misha Dmitriev commented on HIVE-19668:
---------------------------------------

Thank you for checking, [~vihangk1] [~aihuaxu] and [~stakiar]. In the end, it turns out that at least some failures are reproducible locally, and my changes are responsible. Not all {{CommonToken}}s can be made {{ImmutableToken}}s, because for some of them the type may be rewritten later in some special operators. I had already found one such type earlier, and am now eliminating the others. Will post the updated patch once I am done.

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
> ----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-19668
>                 URL: https://issues.apache.org/jira/browse/HIVE-19668
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>    Affects Versions: 3.0.0
>            Reporter: Misha Dmitriev
>            Assignee: Misha Dmitriev
>            Priority: Major
>         Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, image-2018-05-22-17-41-39-572.png
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory spike during compilation of some big query. The analysis was done with jxray ([www.jxray.com|http://www.jxray.com/]). It turns out that more than 90% of the 20G heap was used by data structures associated with query parsing ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple opportunities for optimizations here. One of them is to stop the code from creating duplicate instances of the {{org.antlr.runtime.CommonToken}} class. See a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> It looks like these particular {{CommonToken}} objects are constants that don't change once created. I see some code, e.g. in {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are apparently repeatedly created with e.g. {{new CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}}. If these 33 token kinds are instead created once and reused, we will save more than 1/10th of the heap in this scenario. Plus, since these objects are small but very numerous, getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% of memory. Some of them come from CommonToken objects that have the same text (i.e. for multiple CommonToken objects the contents of their 'text' Strings are the same, but each has its own copy of that String). Other duplicate strings come from other sources, which are easy enough to fix by adding String.intern() calls.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
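The two ideas in the description (create constant tokens once and reuse them, and share one copy of each distinct string) can be sketched as follows. This is an illustration only: {{Token}} is a hypothetical immutable stand-in for a constant {{CommonToken}}, {{TokenCache.of}} is an invented helper, and none of it is taken from the attached patches.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TokenCache {
    // Hypothetical immutable token; stands in for a constant CommonToken.
    static final class Token {
        final int type;
        final String text;

        Token(int type, String text) {
            this.type = type;
            // intern() makes all tokens with equal text share one String instance.
            this.text = text.intern();
        }
    }

    private static final Map<Integer, Token> CACHE = new ConcurrentHashMap<>();

    // Returns one shared instance per token type instead of allocating a new
    // object on every call. The text of the first call for a type wins.
    static Token of(int type, String text) {
        return CACHE.computeIfAbsent(type, t -> new Token(t, text));
    }

    public static void main(String[] args) {
        Token a = of(1, "TOK_INSERT");
        Token b = of(1, new String("TOK_INSERT"));
        System.out.println(a == b);
    }
}
```

Sharing like this is only safe for tokens that are never mutated after creation. As the comment above notes, some {{CommonToken}}s have their type rewritten later, and those cannot be replaced by a shared immutable instance.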
[jira] [Commented] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542258#comment-16542258 ]

Sahil Takiar commented on HIVE-19668:
-------------------------------------

[~mi...@cloudera.com] unless you can reproduce the test failures locally, it's unlikely the failed tests are related to your patch. If you can reproduce them locally, let me know the stack trace and I can help you debug. Otherwise, can you rebase the patch and post an updated version? This will re-trigger Hive QA and re-run all the tests.

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
> ----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-19668
>                 URL: https://issues.apache.org/jira/browse/HIVE-19668
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>    Affects Versions: 3.0.0
>            Reporter: Misha Dmitriev
>            Assignee: Misha Dmitriev
>            Priority: Major
>         Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, image-2018-05-22-17-41-39-572.png
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory spike during compilation of some big query. The analysis was done with jxray ([www.jxray.com|http://www.jxray.com/]). It turns out that more than 90% of the 20G heap was used by data structures associated with query parsing ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple opportunities for optimizations here. One of them is to stop the code from creating duplicate instances of the {{org.antlr.runtime.CommonToken}} class. See a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> It looks like these particular {{CommonToken}} objects are constants that don't change once created. I see some code, e.g. in {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are apparently repeatedly created with e.g. {{new CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}}. If these 33 token kinds are instead created once and reused, we will save more than 1/10th of the heap in this scenario. Plus, since these objects are small but very numerous, getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% of memory. Some of them come from CommonToken objects that have the same text (i.e. for multiple CommonToken objects the contents of their 'text' Strings are the same, but each has its own copy of that String). Other duplicate strings come from other sources, which are easy enough to fix by adding String.intern() calls.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20157) Do Not Print StackTraces to STDERR in ParseDriver
[ https://issues.apache.org/jira/browse/HIVE-20157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HIVE-20157: --- Summary: Do Not Print StackTraces to STDERR in ParseDriver (was: Do Not Print StackTraces to STDERR) > Do Not Print StackTraces to STDERR in ParseDriver > - > > Key: HIVE-20157 > URL: https://issues.apache.org/jira/browse/HIVE-20157 > Project: Hive > Issue Type: Improvement > Components: Parser >Affects Versions: 3.0.0, 4.0.0 >Reporter: BELUGA BEHR >Priority: Minor > Labels: newbie, noob > > {{org/apache/hadoop/hive/ql/parse/ParseDriver.java}} > {code} > catch (RecognitionException e) { > e.printStackTrace(); > throw new ParseException(parser.errors); > } > {code} > Do not use {{e.printStackTrace()}} and print to STDERR. Either remove or > replace with a debug-level log statement. I would vote to simply remove. > There are several occurrences of this pattern in this class. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
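For illustration, the "replace with a debug-level log statement" option from HIVE-20157 could be sketched as below. Several things here are assumptions: {{java.util.logging}} is swapped in for Hive's actual logging framework so the example is self-contained, {{ParseFailure}} is a hypothetical stand-in for {{ParseException}}, and {{Integer.parseInt}} stands in for the real parser call.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class QuietParse {
    private static final Logger LOG = Logger.getLogger(QuietParse.class.getName());

    // Hypothetical checked exception standing in for ParseException.
    static class ParseFailure extends Exception {
        ParseFailure(String message, Throwable cause) {
            super(message, cause);
        }
    }

    static int parse(String input) throws ParseFailure {
        try {
            return Integer.parseInt(input.trim());
        } catch (NumberFormatException e) {
            // Debug-level record instead of e.printStackTrace(); FINE is not
            // printed under the default java.util.logging configuration.
            LOG.log(Level.FINE, "Could not parse input", e);
            // The caller still gets the failure, with the original cause attached.
            throw new ParseFailure("Invalid input: " + input, e);
        }
    }

    public static void main(String[] args) throws ParseFailure {
        System.out.println(parse(" 42 "));
    }
}
```

Because the rethrown exception already carries the failure to the caller, the log line adds little, which fits the reporter's preference for simply removing the {{printStackTrace()}} call.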
[jira] [Commented] (HIVE-20147) Hive streaming ingest is contented on synchronized logging
[ https://issues.apache.org/jira/browse/HIVE-20147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542254#comment-16542254 ]

Prasanth Jayachandran commented on HIVE-20147:
----------------------------------------------

If clients of this API do not use async logging, there is high contention in the logging done by the streaming API, specifically logStats before/after commit and close. Changed the log levels to DEBUG. In a test application using async logging plus this patch, the log contention is no longer seen in the profiles.

> Hive streaming ingest is contented on synchronized logging
> ----------------------------------------------------------
>
>                 Key: HIVE-20147
>                 URL: https://issues.apache.org/jira/browse/HIVE-20147
>             Project: Hive
>          Issue Type: Bug
>          Components: Streaming, Transactions
>    Affects Versions: 4.0.0, 3.2.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Major
>         Attachments: HIVE-20147.1.patch, Screen Shot 2018-07-11 at 4.17.27 PM.png, sync-logger-contention.svg
>
> In one of the observed profiles, >30% of the time was spent on synchronized logging. See attachment.
> We should use async logging for hive streaming ingest by default.
> !Screen Shot 2018-07-11 at 4.17.27 PM.png!

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20147) Hive streaming ingest is contented on synchronized logging
[ https://issues.apache.org/jira/browse/HIVE-20147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-20147:
-----------------------------------------
    Attachment: HIVE-20147.1.patch

> Hive streaming ingest is contented on synchronized logging
> ----------------------------------------------------------
>
>                 Key: HIVE-20147
>                 URL: https://issues.apache.org/jira/browse/HIVE-20147
>             Project: Hive
>          Issue Type: Bug
>          Components: Streaming, Transactions
>    Affects Versions: 4.0.0, 3.2.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Major
>         Attachments: HIVE-20147.1.patch, Screen Shot 2018-07-11 at 4.17.27 PM.png, sync-logger-contention.svg
>
> In one of the observed profiles, >30% of the time was spent on synchronized logging. See attachment.
> We should use async logging for hive streaming ingest by default.
> !Screen Shot 2018-07-11 at 4.17.27 PM.png!

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20147) Hive streaming ingest is contented on synchronized logging
[ https://issues.apache.org/jira/browse/HIVE-20147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-20147:
-----------------------------------------
    Status: Patch Available  (was: Open)

> Hive streaming ingest is contented on synchronized logging
> ----------------------------------------------------------
>
>                 Key: HIVE-20147
>                 URL: https://issues.apache.org/jira/browse/HIVE-20147
>             Project: Hive
>          Issue Type: Bug
>          Components: Streaming, Transactions
>    Affects Versions: 4.0.0, 3.2.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Major
>         Attachments: HIVE-20147.1.patch, Screen Shot 2018-07-11 at 4.17.27 PM.png, sync-logger-contention.svg
>
> In one of the observed profiles, >30% of the time was spent on synchronized logging. See attachment.
> We should use async logging for hive streaming ingest by default.
> !Screen Shot 2018-07-11 at 4.17.27 PM.png!

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19886) Logs may be directed to 2 files if --hiveconf hive.log.file is used
[ https://issues.apache.org/jira/browse/HIVE-19886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jaume M updated HIVE-19886:
---------------------------
    Attachment: HIVE-19886.3.patch
        Status: Patch Available  (was: In Progress)

> Logs may be directed to 2 files if --hiveconf hive.log.file is used
> -------------------------------------------------------------------
>
>                 Key: HIVE-19886
>                 URL: https://issues.apache.org/jira/browse/HIVE-19886
>             Project: Hive
>          Issue Type: Bug
>          Components: Logging
>    Affects Versions: 3.1.0, 4.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Jaume M
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-19886.2.patch, HIVE-19886.2.patch, HIVE-19886.3.patch, HIVE-19886.patch
>
> The hive launch script explicitly specifies the log4j2 configuration file to use. The main() methods in HiveServer2 and HiveMetastore reconfigure the logger based on user input via --hiveconf hive.log.file. This may cause logs to end up in 2 different files: initial logs go to the file specified in hive-log4j2.properties, and after logger reconfiguration the rest of the logs go to the file specified via --hiveconf hive.log.file.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19886) Logs may be directed to 2 files if --hiveconf hive.log.file is used
[ https://issues.apache.org/jira/browse/HIVE-19886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jaume M updated HIVE-19886:
---------------------------
    Status: In Progress  (was: Patch Available)

> Logs may be directed to 2 files if --hiveconf hive.log.file is used
> -------------------------------------------------------------------
>
>                 Key: HIVE-19886
>                 URL: https://issues.apache.org/jira/browse/HIVE-19886
>             Project: Hive
>          Issue Type: Bug
>          Components: Logging
>    Affects Versions: 3.1.0, 4.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Jaume M
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-19886.2.patch, HIVE-19886.2.patch, HIVE-19886.patch
>
> The hive launch script explicitly specifies the log4j2 configuration file to use. The main() methods in HiveServer2 and HiveMetastore reconfigure the logger based on user input via --hiveconf hive.log.file. This may cause logs to end up in 2 different files: initial logs go to the file specified in hive-log4j2.properties, and after logger reconfiguration the rest of the logs go to the file specified via --hiveconf hive.log.file.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20037) Print root cause exception's toString() rather than getMessage()
[ https://issues.apache.org/jira/browse/HIVE-20037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aihua Xu updated HIVE-20037:
----------------------------
       Resolution: Fixed
    Fix Version/s: 4.0.0
           Status: Resolved  (was: Patch Available)

Pushed to master. Thanks [~stakiar] for reviewing.

> Print root cause exception's toString() rather than getMessage()
>
>                 Key: HIVE-20037
>                 URL: https://issues.apache.org/jira/browse/HIVE-20037
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: 3.0.0
>            Reporter: Aihua Xu
>            Assignee: Aihua Xu
>            Priority: Trivial
>             Fix For: 4.0.0
>
>         Attachments: HIVE-20037.1.patch, HIVE-20037.2.patch
>
> When a HoS job fails, we print the exception's message rather than the exception's toString(). For some exceptions, e.g. this java.lang.NoClassDefFoundError, that means we are missing the exception type information.
> {noformat}
> Failed to execute Spark task Stage-1, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark client for Spark session cf054497-b073-4327-a315-68c867ce3434: org/apache/spark/SparkConf)'
> {noformat}
> If we use the exception's toString(), the output is as follows, which makes more sense.
> {noformat}
> Failed to execute Spark task Stage-1, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark client for Spark session cf054497-b073-4327-a315-68c867ce3434: java.lang.NoClassDefFoundError: org/apache/spark/SparkConf)'
> {noformat}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
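The difference between the two calls is easy to reproduce in isolation. The sketch below is not the code from the patch: {{RootCauseMessage}} and {{rootCause}} are hypothetical names, and the wrapping {{RuntimeException}} only imitates the {{HiveException}} in the log lines above.

```java
public class RootCauseMessage {
    // Hypothetical helper: follows getCause() down to the innermost exception.
    static Throwable rootCause(Throwable t) {
        Throwable cur = t;
        while (cur.getCause() != null && cur.getCause() != cur) {
            cur = cur.getCause();
        }
        return cur;
    }

    public static void main(String[] args) {
        Throwable cause = new NoClassDefFoundError("org/apache/spark/SparkConf");
        Exception wrapped = new RuntimeException("Failed to create Spark client", cause);
        Throwable root = rootCause(wrapped);

        // getMessage() returns only the detail message, so the type is lost.
        System.out.println(root.getMessage());
        // prints "org/apache/spark/SparkConf"

        // toString() prefixes the class name, so the type is visible.
        System.out.println(root.toString());
        // prints "java.lang.NoClassDefFoundError: org/apache/spark/SparkConf"
    }
}
```

For an error such as {{NoClassDefFoundError}}, whose message is just a class path, the class-name prefix is what tells the reader what actually went wrong.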
[jira] [Commented] (HIVE-20149) TestHiveCli failing/timing out
[ https://issues.apache.org/jira/browse/HIVE-20149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542197#comment-16542197 ] Hive QA commented on HIVE-20149: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12931275/HIVE-20149.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 14650 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12565/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12565/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12565/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12931275 - PreCommit-HIVE-Build > TestHiveCli failing/timing out > -- > > Key: HIVE-20149 > URL: https://issues.apache.org/jira/browse/HIVE-20149 > Project: Hive > Issue Type: Bug > Components: CLI >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-20149.1.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+
[ https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aihua Xu reassigned HIVE-20153:
-------------------------------
    Assignee: Aihua Xu

> Count and Sum UDF consume more memory in Hive 2+
>
>                 Key: HIVE-20153
>                 URL: https://issues.apache.org/jira/browse/HIVE-20153
>             Project: Hive
>          Issue Type: Bug
>          Components: UDF
>    Affects Versions: 2.3.2
>            Reporter: Szehon Ho
>            Assignee: Aihua Xu
>            Priority: Major
>         Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png
>
> While playing with Hive2, we noticed that queries with a lot of count() and sum() aggregations run out of memory on the Hadoop side where they worked before in Hive1.
> In many queries, we have to double the mapper memory settings (in our particular case mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M), which makes it harder to upgrade to Hive 2.
> Taking a heap dump, we see that one of the main culprits is the field 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to support window functions.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20102) Add a couple of additional tests for query parsing
[ https://issues.apache.org/jira/browse/HIVE-20102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-20102: --- Resolution: Fixed Fix Version/s: 4.0.0 3.1.0 Status: Resolved (was: Patch Available) Pushed to master, branch-3, branch-3.1. Thanks [~ashutoshc] > Add a couple of additional tests for query parsing > -- > > Key: HIVE-20102 > URL: https://issues.apache.org/jira/browse/HIVE-20102 > Project: Hive > Issue Type: Improvement > Components: Parser >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Fix For: 3.1.0, 4.0.0 > > Attachments: HIVE-20102.01.patch, HIVE-20102.02.patch, > HIVE-20102.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+
[ https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542161#comment-16542161 ]

Aihua Xu commented on HIVE-20153:
---------------------------------

[~szehon] Nice to see you again. :) I will take a look. Do you have the full heap dump? If it's too big, you may try to use http://www.jxray.com/ to generate a small file.

> Count and Sum UDF consume more memory in Hive 2+
>
>                 Key: HIVE-20153
>                 URL: https://issues.apache.org/jira/browse/HIVE-20153
>             Project: Hive
>          Issue Type: Bug
>          Components: UDF
>    Affects Versions: 2.3.2
>            Reporter: Szehon Ho
>            Priority: Major
>         Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png
>
> While playing with Hive2, we noticed that queries with a lot of count() and sum() aggregations run out of memory on the Hadoop side where they worked before in Hive1.
> In many queries, we have to double the mapper memory settings (in our particular case mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M), which makes it harder to upgrade to Hive 2.
> Taking a heap dump, we see that one of the main culprits is the field 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to support window functions.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20149) TestHiveCli failing/timing out
[ https://issues.apache.org/jira/browse/HIVE-20149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542157#comment-16542157 ]

Hive QA commented on HIVE-20149:

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 31s{color} | {color:blue} beeline in master has 53 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s{color} | {color:green} beeline: The patch generated 0 new + 39 unchanged - 1 fixed = 39 total (was 40) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 9s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12565/dev-support/hive-personality.sh |
| git revision | master / 3fa7f0c |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: beeline U: beeline |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12565/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> TestHiveCli failing/timing out
> ------------------------------
>
>                 Key: HIVE-20149
>                 URL: https://issues.apache.org/jira/browse/HIVE-20149
>             Project: Hive
>          Issue Type: Bug
>          Components: CLI
>            Reporter: Vineet Garg
>            Assignee: Vineet Garg
>            Priority: Major
>         Attachments: HIVE-20149.1.patch
>

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19820) add ACID stats support to background stats updater and fix bunch of edge cases found in SU tests
[ https://issues.apache.org/jira/browse/HIVE-19820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542138#comment-16542138 ] Hive QA commented on HIVE-19820: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12931270/branch-19820.02.nogen.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12564/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12564/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12564/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12931270/branch-19820.02.nogen.patch was found in seen patch url's cache and a test was probably run already on it. Aborting... {noformat} This message is automatically generated. ATTACHMENT ID: 12931270 - PreCommit-HIVE-Build > add ACID stats support to background stats updater and fix bunch of edge > cases found in SU tests > > > Key: HIVE-19820 > URL: https://issues.apache.org/jira/browse/HIVE-19820 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-19820.01-master-txnstats.patch, > HIVE-19820.01.patch, HIVE-19820.02-master-txnstats.patch, > HIVE-19820.03-master-txnstats.patch, HIVE-19820.04-master-txnstats.patch, > HIVE-19820.patch, branch-19820.02.nogen.patch, branch-19820.nogen.patch, > branch-19820.nogen.patch > > > Follow-up from HIVE-19418. > Right now it checks whether stats are valid in an old-fashioned way... and > also gets ACID state, and discards it without using. > When ACID stats are implemented, ACID state needs to be used to do > version-aware valid stats checks. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20102) Add a couple of additional tests for query parsing
[ https://issues.apache.org/jira/browse/HIVE-20102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542136#comment-16542136 ] Hive QA commented on HIVE-20102: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12931248/HIVE-20102.02.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 14650 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12563/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12563/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12563/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12931248 - PreCommit-HIVE-Build > Add a couple of additional tests for query parsing > -- > > Key: HIVE-20102 > URL: https://issues.apache.org/jira/browse/HIVE-20102 > Project: Hive > Issue Type: Improvement > Components: Parser >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-20102.01.patch, HIVE-20102.02.patch, > HIVE-20102.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20102) Add a couple of additional tests for query parsing
[ https://issues.apache.org/jira/browse/HIVE-20102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542133#comment-16542133 ]

Hive QA commented on HIVE-20102:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 49s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 58s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 3s{color} | {color:blue} ql in master has 2288 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 50s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 24s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 23s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 46s{color} | {color:red} ql: The patch generated 1 new + 492 unchanged - 0 fixed = 493 total (was 492) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 2m 10s{color} | {color:red} root: The patch generated 1 new + 493 unchanged - 0 fixed = 494 total (was 493) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 63m 2s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12563/dev-support/hive-personality.sh |
| git revision | master / e0c2b9d |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12563/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12563/yetus/diff-checkstyle-root.txt |
| modules | C: ql . U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12563/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> Add a couple of additional tests for query parsing
> --------------------------------------------------
>
>                 Key: HIVE-20102
>                 URL: https://issues.apache.org/jira/browse/HIVE-20102
>             Project: Hive
>          Issue Type: Improvement
>          Components: Parser
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>            Priority: Major
>         Attachments: HIVE-20102.01.patch, HIVE-20102.02.patch, HIVE-20102.patch
>

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20079) Populate more accurate rawDataSize for parquet format
[ https://issues.apache.org/jira/browse/HIVE-20079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542117#comment-16542117 ] Sahil Takiar commented on HIVE-20079: - Looks similar to HIVE-16887 > Populate more accurate rawDataSize for parquet format > - > > Key: HIVE-20079 > URL: https://issues.apache.org/jira/browse/HIVE-20079 > Project: Hive > Issue Type: Improvement > Components: File Formats >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu >Priority: Major > Attachments: HIVE-20079.1.patch > > > Run the following queries and you will see that rawDataSize for the table is > incorrectly reported as 4 (that is, the number of fields). We need to populate > the correct data size so that data can be split properly. > {noformat} > SET hive.stats.autogather=true; > CREATE TABLE parquet_stats (id int,str string) STORED AS PARQUET; > INSERT INTO parquet_stats values(0, 'this is string 0'), (1, 'string 1'); > DESC FORMATTED parquet_stats; > {noformat} > {noformat} > Table Parameters: > COLUMN_STATS_ACCURATE true > numFiles 1 > numRows 2 > rawDataSize 4 > totalSize 373 > transient_lastDdlTime 1530660523 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-20155) Semijoin Reduction : Put all the min-max filters before all the bloom filters
[ https://issues.apache.org/jira/browse/HIVE-20155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal reassigned HIVE-20155: - > Semijoin Reduction : Put all the min-max filters before all the bloom filters > - > > Key: HIVE-20155 > URL: https://issues.apache.org/jira/browse/HIVE-20155 > Project: Hive > Issue Type: Task >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal >Priority: Major > > If there is more than one semijoin reduction filter, apply all min-max > filters before any of the bloom filters, since bloom filter lookup is > expensive. > > cc [~gopalv] [~jdere] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
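The ordering proposed in HIVE-20155 can be sketched roughly as follows. This is only an illustration, not Hive's optimizer code: the class `SemiJoinFilterOrdering`, its `Kind` enum, and the `Filter` type are hypothetical names invented for this example. The idea shown is a stable sort that places every min-max filter ahead of every bloom filter, so the cheap range checks can reject rows before any bloom filter lookup is attempted.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical types for illustration only; they are not Hive classes.
final class SemiJoinFilterOrdering {
    enum Kind { MIN_MAX, BLOOM }

    static final class Filter {
        final String column;
        final Kind kind;

        Filter(String column, Kind kind) {
            this.column = column;
            this.kind = kind;
        }

        @Override
        public String toString() {
            return kind + "(" + column + ")";
        }
    }

    // Returns a new list with all MIN_MAX filters before all BLOOM filters.
    // List.sort is stable, so the relative order within each kind is kept.
    static List<Filter> minMaxFirst(List<Filter> filters) {
        List<Filter> ordered = new ArrayList<>(filters);
        ordered.sort(Comparator.comparingInt((Filter f) -> f.kind == Kind.MIN_MAX ? 0 : 1));
        return ordered;
    }
}
```

Sorting by kind rather than hard-coding positions keeps the sketch independent of how many semijoin filters a query happens to have.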
[jira] [Updated] (HIVE-20142) Semijoin Reduction : Peform cost based removal after rule based removal.
[ https://issues.apache.org/jira/browse/HIVE-20142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-20142: -- Description: The semijoin reduction removal logic is spread out into multiple functions. Currently, the cost based removal logic is applied before the rule based(dumb) ones. Instead, apply the rule based removal logic and then apply the cost based removal. cc [~jdere] [~jcamachorodriguez] [~gopalv] was: The semijoin reduction removal logic is spread out into multiple functions. Currently, the cost based removal logic is applied before the rule based(dumb) ones. Instead, apply the rule based removal logic and then apply the cost based removal. cc [~jdere] [~jcamachorodriguez] > Semijoin Reduction : Peform cost based removal after rule based removal. > > > Key: HIVE-20142 > URL: https://issues.apache.org/jira/browse/HIVE-20142 > Project: Hive > Issue Type: Task >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal >Priority: Major > > The semijoin reduction removal logic is spread out into multiple functions. > Currently, the cost based removal logic is applied before the rule > based(dumb) ones. > Instead, apply the rule based removal logic and then apply the cost based > removal. > > cc [~jdere] [~jcamachorodriguez] [~gopalv] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
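The two-phase order requested in HIVE-20142 might look like the sketch below. It is a generic illustration under stated assumptions, not the actual Hive removal logic: `TwoPhaseRemoval`, the `removedByRule` predicate, the `benefit` function, and the `minBenefit` threshold are all hypothetical. It shows the rule-based pass running first, so that the cost-based pass only has to evaluate the candidates that survive the rules.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.function.ToDoubleFunction;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of Hive.
final class TwoPhaseRemoval {
    // Returns the candidates that survive both phases, in their original order.
    static <T> List<T> keep(List<T> candidates,
                            Predicate<T> removedByRule,
                            ToDoubleFunction<T> benefit,
                            double minBenefit) {
        // Phase 1: rule-based removal (cheap, no cost estimation needed).
        List<T> afterRules = candidates.stream()
                .filter(c -> !removedByRule.test(c))
                .collect(Collectors.toList());
        // Phase 2: cost-based removal, applied only to what the rules kept.
        return afterRules.stream()
                .filter(c -> benefit.applyAsDouble(c) >= minBenefit)
                .collect(Collectors.toList());
    }
}
```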
[jira] [Updated] (HIVE-20091) Tez: Add security credentials for FileSinkOperator output
[ https://issues.apache.org/jira/browse/HIVE-20091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-20091: Fix Version/s: 4.0.0 3.1.0 > Tez: Add security credentials for FileSinkOperator output > - > > Key: HIVE-20091 > URL: https://issues.apache.org/jira/browse/HIVE-20091 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 3.1.0, 4.0.0 > > Attachments: HIVE-20091.01.patch, HIVE-20091.02.patch, > HIVE-20091.03.patch, HIVE-20091.04.patch, HIVE-20091.05.patch, > HIVE-20091.06.patch, HIVE-20091.07.patch, HIVE-20091.08.patch > > > DagUtils needs to add security credentials for the output for the > FileSinkOperator. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20091) Tez: Add security credentials for FileSinkOperator output
[ https://issues.apache.org/jira/browse/HIVE-20091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-20091: Resolution: Fixed Status: Resolved (was: Patch Available) > Tez: Add security credentials for FileSinkOperator output > - > > Key: HIVE-20091 > URL: https://issues.apache.org/jira/browse/HIVE-20091 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 3.1.0, 4.0.0 > > Attachments: HIVE-20091.01.patch, HIVE-20091.02.patch, > HIVE-20091.03.patch, HIVE-20091.04.patch, HIVE-20091.05.patch, > HIVE-20091.06.patch, HIVE-20091.07.patch, HIVE-20091.08.patch > > > DagUtils needs to add security credentials for the output for the > FileSinkOperator. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20091) Tez: Add security credentials for FileSinkOperator output
[ https://issues.apache.org/jira/browse/HIVE-20091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542082#comment-16542082 ] Matt McCline commented on HIVE-20091: - Committed to master and branch-3. > Tez: Add security credentials for FileSinkOperator output > - > > Key: HIVE-20091 > URL: https://issues.apache.org/jira/browse/HIVE-20091 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 3.1.0, 4.0.0 > > Attachments: HIVE-20091.01.patch, HIVE-20091.02.patch, > HIVE-20091.03.patch, HIVE-20091.04.patch, HIVE-20091.05.patch, > HIVE-20091.06.patch, HIVE-20091.07.patch, HIVE-20091.08.patch > > > DagUtils needs to add security credentials for the output for the > FileSinkOperator. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20091) Tez: Add security credentials for FileSinkOperator output
[ https://issues.apache.org/jira/browse/HIVE-20091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542079#comment-16542079 ] Matt McCline commented on HIVE-20091: - Successful test run. > Tez: Add security credentials for FileSinkOperator output > - > > Key: HIVE-20091 > URL: https://issues.apache.org/jira/browse/HIVE-20091 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-20091.01.patch, HIVE-20091.02.patch, > HIVE-20091.03.patch, HIVE-20091.04.patch, HIVE-20091.05.patch, > HIVE-20091.06.patch, HIVE-20091.07.patch, HIVE-20091.08.patch > > > DagUtils needs to add security credentials for the output for the > FileSinkOperator. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20154) Improve unix_timestamp(args) to handle automatic DST-switching timezones
[ https://issues.apache.org/jira/browse/HIVE-20154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vincent Tran updated HIVE-20154: Summary: Improve unix_timestamp(args) to handle automatic DST-switching timezones (was: Improve unix_timestamp(args) to handle automatic-DST switching timezones) > Improve unix_timestamp(args) to handle automatic DST-switching timezones > > > Key: HIVE-20154 > URL: https://issues.apache.org/jira/browse/HIVE-20154 > Project: Hive > Issue Type: Improvement > Components: UDF >Affects Versions: 1.1.0 >Reporter: Vincent Tran >Priority: Major > > Currently the unix_timestamp(args) UDF handles only static timezone > specifiers. It does not recognize SystemV specifiers such as EST5EDT or > PST8PDT. > Based on this experiment, when z is used to parse a TZ string like UTC4PDT > (obviously not a valid SystemV specifier), it will parse the time as UTC. > When zz is used to parse a TZ string like UTC4PDT, it will parse the > timestamp as the TZ at the final z position. This is demonstrated by my final > query, where the format string z4z1z is used to parse UTC4PDT1EDT. > {noformat} > 0: jdbc:hive2://localhost:1/default> select > from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT", "-MM-dd > HH:mm:ss z"), "-MM-dd HH:mm:ss "); > ++--+ > |_c0 | > ++--+ > | 2018-01-31 16:00:00 Pacific Standard Time | > ++--+ > 1 row selected (0.041 seconds) > 0: jdbc:hive2://localhost:1/default> select > from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC", "-MM-dd HH:mm:ss > z"), "-MM-dd HH:mm:ss "); > ++--+ > |_c0 | > ++--+ > | 2018-01-31 16:00:00 Pacific Standard Time | > ++--+ > 1 row selected (0.041 seconds) > 0: jdbc:hive2://localhost:1/default> select > from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT", "-MM-dd > HH:mm:ss z4z"), "-MM-dd HH:mm:ss "); > ++--+ > |_c0 | > ++--+ > | 2018-01-31 23:00:00 Pacific Standard Time | > ++--+ > 1 row selected (0.047 seconds) > 0: jdbc:hive2://localhost:1/default> select > from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT1EDT", "-MM-dd > HH:mm:ss z4z1z"), "-MM-dd HH:mm:ss "); > ++--+ > |_c0 | > ++--+ > | 2018-01-31 20:00:00 Pacific Standard Time | > ++--+ > 1 row selected (0.055 seconds) > 0: jdbc:hive2://localhost:1/default> > {noformat} > So all in all, I don't think the SystemV specifiers EST5EDT or PST8PDT are > valid for unix_timestamp(args) at all. When parsed with the > zz format string, they will be read as whatever valid timezone is at the final > position (effectively EDT and PDT respectively when those valid SystemV TZ > specifiers above are used). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
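To illustrate what DST-aware handling of a SystemV-style ID such as EST5EDT means in practice, the sketch below uses `java.time` directly. It is not how Hive's `unix_timestamp` is implemented and is not a proposed patch; the class `DstAwareEpoch` and its methods are hypothetical, and the example assumes the JDK's time zone database recognizes the ID `EST5EDT`. It shows that the same zone ID yields a different UTC offset in winter and in summer, which a static specifier such as EST or EDT cannot express.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;

// Hypothetical helper for illustration only; not part of Hive.
final class DstAwareEpoch {
    // Interprets a local wall-clock time in the given zone and returns epoch seconds.
    static long toEpochSeconds(LocalDateTime local, String zoneId) {
        return local.atZone(ZoneId.of(zoneId)).toEpochSecond();
    }

    // UTC offset, in seconds, that the zone applies at the given local time.
    static int offsetSeconds(LocalDateTime local, String zoneId) {
        return local.atZone(ZoneId.of(zoneId)).getOffset().getTotalSeconds();
    }
}
```

For example, `DstAwareEpoch.offsetSeconds(LocalDateTime.of(2018, 2, 1, 0, 0), "EST5EDT")` falls in standard time, while the same call for a date in July falls in daylight time and returns an offset one hour closer to UTC.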
[jira] [Updated] (HIVE-20154) Improve unix_timestamp(args) to handle automatic-DST switching timezones
[ https://issues.apache.org/jira/browse/HIVE-20154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vincent Tran updated HIVE-20154: Component/s: UDF > Improve unix_timestamp(args) to handle automatic-DST switching timezones > > > Key: HIVE-20154 > URL: https://issues.apache.org/jira/browse/HIVE-20154 > Project: Hive > Issue Type: Improvement > Components: UDF >Affects Versions: 1.1.0 >Reporter: Vincent Tran >Priority: Major > > Currently the unix_timestamp(args) UDF handles only static timezone > specifiers. It does not recognize SystemV specifiers such as EST5EDT or > PST8PDT. > Based on this experiment, when z is used to parse a TZ string like UTC4PDT > (obviously not a valid SystemV specifier), it will parse the time as UTC. > When zz is used to parse a TZ string like UTC4PDT, it will parse the > timestamp as the TZ at the final z position. This is demonstrated by my final > query, where the format string z4z1z is used to parse UTC4PDT1EDT. > {noformat} > 0: jdbc:hive2://localhost:1/default> select > from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT", "-MM-dd > HH:mm:ss z"), "-MM-dd HH:mm:ss "); > ++--+ > |_c0 | > ++--+ > | 2018-01-31 16:00:00 Pacific Standard Time | > ++--+ > 1 row selected (0.041 seconds) > 0: jdbc:hive2://localhost:1/default> select > from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC", "-MM-dd HH:mm:ss > z"), "-MM-dd HH:mm:ss "); > ++--+ > |_c0 | > ++--+ > | 2018-01-31 16:00:00 Pacific Standard Time | > ++--+ > 1 row selected (0.041 seconds) > 0: jdbc:hive2://localhost:1/default> select > from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT", "-MM-dd > HH:mm:ss z4z"), "-MM-dd HH:mm:ss "); > ++--+ > |_c0 | > ++--+ > | 2018-01-31 23:00:00 Pacific Standard Time | > ++--+ > 1 row selected (0.047 seconds) > 0: jdbc:hive2://localhost:1/default> select > from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT1EDT", "-MM-dd > HH:mm:ss z4z1z"), "-MM-dd HH:mm:ss "); > ++--+ > |_c0 | > ++--+ > | 2018-01-31 20:00:00 Pacific Standard Time | > ++--+ > 1 row selected (0.055 seconds) > 0: jdbc:hive2://localhost:1/default> > {noformat} > So all in all, I don't think the SystemV specifiers EST5EDT or PST8PDT are > valid for unix_timestamp(args) at all. When parsed with the > zz format string, they will be read as whatever valid timezone is at the final > position (effectively EDT and PDT respectively when those valid SystemV TZ > specifiers above are used). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20154) Improve unix_timestamp(args) to handle automatic-DST switching timezones
[ https://issues.apache.org/jira/browse/HIVE-20154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vincent Tran updated HIVE-20154: Affects Version/s: 1.1.0 > Improve unix_timestamp(args) to handle automatic-DST switching timezones > > > Key: HIVE-20154 > URL: https://issues.apache.org/jira/browse/HIVE-20154 > Project: Hive > Issue Type: Improvement >Affects Versions: 1.1.0 >Reporter: Vincent Tran >Priority: Major > > Currently the unix_timestamp(args) UDF handles only static timezone > specifiers. It does not recognize SystemV specifiers such as EST5EDT or > PST8PDT. > Based on this experiment, when z is used to parse a TZ string like UTC4PDT > (obviously not a valid SystemV specifier), it will parse the time as UTC. > When zz is used to parse a TZ string like UTC4PDT, it will parse the > timestamp as the TZ at the final z position. This is demonstrated by my final > query, where the format string z4z1z is used to parse UTC4PDT1EDT. > {noformat} > 0: jdbc:hive2://localhost:1/default> select > from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT", "-MM-dd > HH:mm:ss z"), "-MM-dd HH:mm:ss "); > ++--+ > |_c0 | > ++--+ > | 2018-01-31 16:00:00 Pacific Standard Time | > ++--+ > 1 row selected (0.041 seconds) > 0: jdbc:hive2://localhost:1/default> select > from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC", "-MM-dd HH:mm:ss > z"), "-MM-dd HH:mm:ss "); > ++--+ > |_c0 | > ++--+ > | 2018-01-31 16:00:00 Pacific Standard Time | > ++--+ > 1 row selected (0.041 seconds) > 0: jdbc:hive2://localhost:1/default> select > from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT", "-MM-dd > HH:mm:ss z4z"), "-MM-dd HH:mm:ss "); > ++--+ > |_c0 | > ++--+ > | 2018-01-31 23:00:00 Pacific Standard Time | > ++--+ > 1 row selected (0.047 seconds) > 0: jdbc:hive2://localhost:1/default> select > from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT1EDT", "-MM-dd > HH:mm:ss z4z1z"), "-MM-dd HH:mm:ss "); > ++--+ > |_c0 | > ++--+ > | 2018-01-31 20:00:00 Pacific Standard Time | > ++--+ > 1 row selected (0.055 seconds) > 0: jdbc:hive2://localhost:1/default> > {noformat} > So all in all, I don't think the SystemV specifiers EST5EDT or PST8PDT are > valid for unix_timestamp(args) at all. When parsed with the > zz format string, they will be read as whatever valid timezone is at the final > position (effectively EDT and PDT respectively when those valid SystemV TZ > specifiers above are used). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20091) Tez: Add security credentials for FileSinkOperator output
[ https://issues.apache.org/jira/browse/HIVE-20091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542072#comment-16542072 ] Hive QA commented on HIVE-20091: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12931245/HIVE-20091.08.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 14649 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12562/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12562/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12562/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12931245 - PreCommit-HIVE-Build > Tez: Add security credentials for FileSinkOperator output > - > > Key: HIVE-20091 > URL: https://issues.apache.org/jira/browse/HIVE-20091 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-20091.01.patch, HIVE-20091.02.patch, > HIVE-20091.03.patch, HIVE-20091.04.patch, HIVE-20091.05.patch, > HIVE-20091.06.patch, HIVE-20091.07.patch, HIVE-20091.08.patch > > > DagUtils needs to add security credentials for the output for the > FileSinkOperator. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19375) Bad message: 'transactional'='false' is no longer a valid property and will be ignored
[ https://issues.apache.org/jira/browse/HIVE-19375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-19375: -- Resolution: Fixed Fix Version/s: 3.2.0 Status: Resolved (was: Patch Available) committed to branch-3 and master thanks Jason for the review > Bad message: 'transactional'='false' is no longer a valid property and will > be ignored > -- > > Key: HIVE-19375 > URL: https://issues.apache.org/jira/browse/HIVE-19375 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Minor > Fix For: 3.2.0 > > Attachments: HIVE-19375.01.patch > > > from {{TransactionalValidationListener.handleCreateTableTransactionalProp()}} > {noformat} > if ("false".equalsIgnoreCase(transactional)) { > // just drop transactional=false. For backward compatibility in case > someone has scripts > // with transactional=false > LOG.info("'transactional'='false' is no longer a valid property and > will be ignored: " + > Warehouse.getQualifiedName(newTable)); > return; > } > {noformat} > this msg is misleading since with metastore.create.as.acid=true, setting > transactional=false is valid to make a flat table -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19375) Bad message: 'transactional'='false' is no longer a valid property and will be ignored
[ https://issues.apache.org/jira/browse/HIVE-19375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-19375: -- Summary: Bad message: 'transactional'='false' is no longer a valid property and will be ignored (was: "'transactional'='false' is no longer a valid property and will be ignored: ) > Bad message: 'transactional'='false' is no longer a valid property and will > be ignored > -- > > Key: HIVE-19375 > URL: https://issues.apache.org/jira/browse/HIVE-19375 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Minor > Attachments: HIVE-19375.01.patch > > > from {{TransactionalValidationListener.handleCreateTableTransactionalProp()}} > {noformat} > if ("false".equalsIgnoreCase(transactional)) { > // just drop transactional=false. For backward compatibility in case > someone has scripts > // with transactional=false > LOG.info("'transactional'='false' is no longer a valid property and > will be ignored: " + > Warehouse.getQualifiedName(newTable)); > return; > } > {noformat} > this msg is misleading since with metastore.create.as.acid=true, setting > transactional=false is valid to make a flat table -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19027) Make materializations invalidation cache work with multiple active remote metastores
[ https://issues.apache.org/jira/browse/HIVE-19027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-19027: --- Description: The main points: - Only MVs that use transactional tables and are stored in transactional tables can have a time window value of 0. Those are the only MVs that can be guaranteed to not be outdated when a query is executed. - For MVs that +cannot be outdated+, comparison is based on valid write id lists. - For MVs that +can be outdated+: ** The window for valid outdated MVs can be specified in intervals of 1 minute. ** A materialized view is outdated if it was built before that time window and any source table has been modified since. A time window of -1 means to always use the materialized view for rewriting without any checks concerning its validity. If a materialized view uses an external table, the only way to trigger the rewriting would be to set the property to -1, since currently we do not capture for validation purposes whether the external source tables have been modified since the MV was created or not. was: The main points: - Only MVs that use transactional tables and are stored in transactional tables can have a time window value of 0. Those are the only MVs that can be guaranteed to not be outdated when a query is executed. - For MVs that +cannot be outdated+, comparison is based on valid write id lists. - For MVs that +can be outdated+: ** The window for valid outdated MVs can be specified in intervals of 1 minute. ** A materialized view is outdated if it was built before that time window and any source table has been modified since. A time window of -1 means to always use the materialized view for rewriting without any checks concerning its validity. 
> Make materializations invalidation cache work with multiple active remote > metastores > > > Key: HIVE-19027 > URL: https://issues.apache.org/jira/browse/HIVE-19027 > Project: Hive > Issue Type: Improvement > Components: Materialized views >Affects Versions: 3.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Critical > Fix For: 3.1.0 > > Attachments: HIVE-19027.01.patch, HIVE-19027.02.patch, > HIVE-19027.03.patch, HIVE-19027.04.patch > > > The main points: > - Only MVs that use transactional tables and are stored in transactional > tables can have a time window value of 0. Those are the only MVs that can be > guaranteed to not be outdated when a query is executed. > - For MVs that +cannot be outdated+, comparison is based on valid write id > lists. > - For MVs that +can be outdated+: > ** The window for valid outdated MVs can be specified in intervals of 1 > minute. > ** A materialized view is outdated if it was built before that time window > and any source table has been modified since. > A time window of -1 means to always use the materialized view for rewriting > without any checks concerning its validity. If a materialized view uses an > external table, the only way to trigger the rewriting would be to set the > property to -1, since currently we do not capture for validation purposes > whether the external source tables have been modified since the MV was > created or not. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
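The time-window rule in the description above can be sketched as a small predicate. This is a simplified illustration under stated assumptions, not the metastore's implementation: `MaterializedViewFreshness` and its parameters are hypothetical names, and a plain timestamp comparison stands in for the valid-write-id-list comparison that the description mentions for MVs that cannot be outdated. The sketch encodes two statements from the description: a window of -1 skips all validity checks, and otherwise an MV is outdated only if it was built before the time window and a source table has been modified since.

```java
// Hypothetical helper for illustration only; not part of Hive.
final class MaterializedViewFreshness {
    private static final long MILLIS_PER_MINUTE = 60_000L;

    // windowMinutes: allowed staleness in minutes; -1 disables all validity checks.
    // builtAtMillis: when the materialized view was last built.
    // lastSourceModificationMillis: latest modification time over all source tables
    // (an assumption; the description compares write id lists for the window-0 case).
    static boolean usableForRewriting(long windowMinutes,
                                      long builtAtMillis,
                                      long lastSourceModificationMillis,
                                      long nowMillis) {
        if (windowMinutes == -1) {
            return true; // always use the MV, no checks concerning its validity
        }
        boolean builtBeforeWindow =
                builtAtMillis < nowMillis - windowMinutes * MILLIS_PER_MINUTE;
        boolean sourceModifiedSinceBuild = lastSourceModificationMillis > builtAtMillis;
        // Outdated only when both conditions hold.
        return !(builtBeforeWindow && sourceModifiedSinceBuild);
    }
}
```

With a window of 0, the first condition holds for any MV built in the past, so the result depends only on whether a source table changed after the build.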
[jira] [Updated] (HIVE-20006) Make materializations invalidation cache work with multiple active remote metastores
[ https://issues.apache.org/jira/browse/HIVE-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-20006: --- Description: The main points: - Only MVs that use transactional tables and are stored in transactional tables can have a time window value of 0. Those are the only MVs that can be guaranteed to not be outdated when a query is executed. - For MVs that +cannot be outdated+, comparison is based on valid write id lists. - For MVs that +can be outdated+: ** The window for valid outdated MVs can be specified in intervals of 1 minute. ** A materialized view is outdated if it was built before that time window and any source table has been modified since. A time window of -1 means to always use the materialized view for rewriting without any checks concerning its validity. If a materialized view uses an external table, the only way to trigger the rewriting would be to set the property to -1, since currently we do not capture for validation purposes whether the external source tables have been modified since the MV was created or not. was: The main points: - Only MVs that use transactional tables and are stored in transactional tables can have a time window value of 0. Those are the only MVs that can be guaranteed to not be outdated when a query is executed. - For MVs that +cannot be outdated+, comparison is based on valid write id lists. - For MVs that +can be outdated+: ** The window for valid outdated MVs can be specified in intervals of 1 minute. ** A materialized view is outdated if it was built before that time window and any source table has been modified since. A time window of -1 means to always use the materialized view for rewriting without any checks concerning its validity. 
> Make materializations invalidation cache work with multiple active remote > metastores > > > Key: HIVE-20006 > URL: https://issues.apache.org/jira/browse/HIVE-20006 > Project: Hive > Issue Type: Improvement > Components: Materialized views >Affects Versions: 3.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Critical > Attachments: HIVE-19027.01.patch, HIVE-19027.02.patch, > HIVE-19027.03.patch, HIVE-19027.04.patch, HIVE-20006.01.patch, > HIVE-20006.02.patch, HIVE-20006.03.patch, HIVE-20006.04.patch, > HIVE-20006.05.patch, HIVE-20006.06.patch, HIVE-20006.patch > > > The main points: > - Only MVs that use transactional tables and are stored in transactional > tables can have a time window value of 0. Those are the only MVs that can be > guaranteed to not be outdated when a query is executed. > - For MVs that +cannot be outdated+, comparison is based on valid write id > lists. > - For MVs that +can be outdated+: > ** The window for valid outdated MVs can be specified in intervals of 1 > minute. > ** A materialized view is outdated if it was built before that time window > and any source table has been modified since. > A time window of -1 means to always use the materialized view for rewriting > without any checks concerning its validity. If a materialized view uses an > external table, the only way to trigger the rewriting would be to set the > property to -1, since currently we do not capture for validation purposes > whether the external source tables have been modified since the MV was > created or not. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20006) Make materializations invalidation cache work with multiple active remote metastores
[ https://issues.apache.org/jira/browse/HIVE-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542060#comment-16542060 ] Jesus Camacho Rodriguez commented on HIVE-20006: [~ashutoshc], done. Updated HIVE-19027 too. > Make materializations invalidation cache work with multiple active remote > metastores > > > Key: HIVE-20006 > URL: https://issues.apache.org/jira/browse/HIVE-20006 > Project: Hive > Issue Type: Improvement > Components: Materialized views >Affects Versions: 3.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Critical > Attachments: HIVE-19027.01.patch, HIVE-19027.02.patch, > HIVE-19027.03.patch, HIVE-19027.04.patch, HIVE-20006.01.patch, > HIVE-20006.02.patch, HIVE-20006.03.patch, HIVE-20006.04.patch, > HIVE-20006.05.patch, HIVE-20006.06.patch, HIVE-20006.patch > > > The main points: > - Only MVs that use transactional tables and are stored in transactional > tables can have a time window value of 0. Those are the only MVs that can be > guaranteed to not be outdated when a query is executed. > - For MVs that +cannot be outdated+, comparison is based on valid write id > lists. > - For MVs that +can be outdated+: > ** The window for valid outdated MVs can be specified in intervals of 1 > minute. > ** A materialized view is outdated if it was built before that time window > and any source table has been modified since. > A time window of -1 means to always use the materialized view for rewriting > without any checks concerning its validity. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19027) Make materializations invalidation cache work with multiple active remote metastores
[ https://issues.apache.org/jira/browse/HIVE-19027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-19027: --- Description: The main points: - Only MVs that use transactional tables and are stored in transactional tables can have a time window value of 0. Those are the only MVs that can be guaranteed to not be outdated when a query is executed. - For MVs that +cannot be outdated+, comparison is based on valid write id lists. - For MVs that +can be outdated+: ** The window for valid outdated MVs can be specified in intervals of 1 minute. ** A materialized view is outdated if it was built before that time window and any source table has been modified since. A time window of -1 means to always use the materialized view for rewriting without any checks concerning its validity. was: The main points: - Only MVs stored in transactional tables can have a time window value of 0. Those are the only MVs that can be guaranteed to not be outdated when a query is executed, if we use custom storage handlers to store the materialized view, we cannot make any promises. - For MVs that +cannot be outdated+, we do not check the metastore. Instead, comparison is based on valid write id lists. - For MVs that +can be outdated+, we still rely on the invalidation cache. ** The window for valid outdated MVs can be specified in intervals of 1 minute (less than that, it is difficult to have any guarantees about whether the MV is actually outdated by less than a minute or not). ** The async loading is done every interval / 2 (or probably better, we can make it configurable). 
> Make materializations invalidation cache work with multiple active remote > metastores > > > Key: HIVE-19027 > URL: https://issues.apache.org/jira/browse/HIVE-19027 > Project: Hive > Issue Type: Improvement > Components: Materialized views >Affects Versions: 3.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Critical > Fix For: 3.1.0 > > Attachments: HIVE-19027.01.patch, HIVE-19027.02.patch, > HIVE-19027.03.patch, HIVE-19027.04.patch > > > The main points: > - Only MVs that use transactional tables and are stored in transactional > tables can have a time window value of 0. Those are the only MVs that can be > guaranteed to not be outdated when a query is executed. > - For MVs that +cannot be outdated+, comparison is based on valid write id > lists. > - For MVs that +can be outdated+: > ** The window for valid outdated MVs can be specified in intervals of 1 > minute. > ** A materialized view is outdated if it was built before that time window > and any source table has been modified since. > A time window of -1 means to always use the materialized view for rewriting > without any checks concerning its validity. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20006) Make materializations invalidation cache work with multiple active remote metastores
[ https://issues.apache.org/jira/browse/HIVE-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez updated HIVE-20006:
---
    Description:
The main points:
- Only MVs that use transactional tables and are stored in transactional tables can have a time window value of 0. Those are the only MVs that can be guaranteed to not be outdated when a query is executed.
- For MVs that +cannot be outdated+, comparison is based on valid write id lists.
- For MVs that +can be outdated+:
** The window for valid outdated MVs can be specified in intervals of 1 minute.
** A materialized view is outdated if it was built before that time window and any source table has been modified since.

A time window of -1 means to always use the materialized view for rewriting without any checks concerning its validity.

  was:
The main points:
- Only MVs stored in transactional tables can have a time window value of 0. Those are the only MVs that can be guaranteed to not be outdated when a query is executed, if we use custom storage handlers to store the materialized view, we cannot make any promises.
- For MVs that +cannot be outdated+, we do not check the metastore. Instead, comparison is based on valid write id lists.
- For MVs that +can be outdated+, we still rely on the invalidation cache.
** The window for valid outdated MVs can be specified in intervals of 1 minute (less than that, it is difficult to have any guarantees about whether the MV is actually outdated by less than a minute or not).
** The async loading is done every interval / 2 (or probably better, we can make it configurable).

> Make materializations invalidation cache work with multiple active remote metastores
>
> Key: HIVE-20006
> URL: https://issues.apache.org/jira/browse/HIVE-20006
> Project: Hive
> Issue Type: Improvement
> Components: Materialized views
> Affects Versions: 3.0.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Priority: Critical
>
> Attachments: HIVE-19027.01.patch, HIVE-19027.02.patch, HIVE-19027.03.patch, HIVE-19027.04.patch, HIVE-20006.01.patch, HIVE-20006.02.patch, HIVE-20006.03.patch, HIVE-20006.04.patch, HIVE-20006.05.patch, HIVE-20006.06.patch, HIVE-20006.patch
>
>
> The main points:
> - Only MVs that use transactional tables and are stored in transactional tables can have a time window value of 0. Those are the only MVs that can be guaranteed to not be outdated when a query is executed.
> - For MVs that +cannot be outdated+, comparison is based on valid write id lists.
> - For MVs that +can be outdated+:
> ** The window for valid outdated MVs can be specified in intervals of 1 minute.
> ** A materialized view is outdated if it was built before that time window and any source table has been modified since.
> A time window of -1 means to always use the materialized view for rewriting without any checks concerning its validity.