[jira] [Commented] (HIVE-9941) sql std authorization on partitioned table: truncate and insert
[ https://issues.apache.org/jira/browse/HIVE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638331#comment-14638331 ]

Olaf Flebbe commented on HIVE-9941:
-----------------------------------

Just verified: it happens on 1.2.0 too.

> sql std authorization on partitioned table: truncate and insert
> ---------------------------------------------------------------
>
>                 Key: HIVE-9941
>                 URL: https://issues.apache.org/jira/browse/HIVE-9941
>             Project: Hive
>          Issue Type: Bug
>          Components: Authorization
>    Affects Versions: 1.0.0, 1.2.0
>            Reporter: Olaf Flebbe
>
> SQL standard authorization works as expected. However, if a table is partitioned, any user can truncate it.
>
> User foo:
> {code}
> create table bla (a string) partitioned by (b string);
> # .. loading values ...
> {code}
> Admin:
> {code}
> 0: jdbc:hive2://localhost:1/default> set role admin;
> No rows affected (0,074 seconds)
> 0: jdbc:hive2://localhost:1/default> show grant on bla;
> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+
> | database  | table  | partition  | column  | principal_name  | principal_type  | privilege  | grant_option  |   grant_time   | grantor  |
> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+
> | default   | bla    |            |         | foo             | USER            | DELETE     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | INSERT     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | SELECT     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | UPDATE     | true          | 1426158997000  | foo      |
> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+
> {code}
> Now user olaf:
> {code}
> 0: jdbc:hive2://localhost:1/default> select * from bla;
> Error: Error while compiling statement: FAILED: HiveAccessControlException
> Permission denied: Principal [name=olaf, type=USER] does not have following
> privileges for operation QUERY [[SELECT] on Object [type=TABLE_OR_VIEW,
> name=default.bla]] (state=42000,code=4)
> {code}
> This works as expected. _BUT_:
> {code}
> 0: jdbc:hive2://localhost:1/default> truncate table bla;
> No rows affected (0,18 seconds)
> {code}
> _And the table is empty afterwards._
> Similarly, {{insert into table}} works, too.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9941) sql std authorization on partitioned table: truncate and insert
[ https://issues.apache.org/jira/browse/HIVE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olaf Flebbe updated HIVE-9941:
------------------------------
    Affects Version/s:     (was: 0.14.0)
                       1.0.0
                       1.2.0

> sql std authorization on partitioned table: truncate and insert
> ---------------------------------------------------------------
>
>                 Key: HIVE-9941
>                 URL: https://issues.apache.org/jira/browse/HIVE-9941
>             Project: Hive
>          Issue Type: Bug
>          Components: Authorization
>    Affects Versions: 1.0.0, 1.2.0
>            Reporter: Olaf Flebbe
[jira] [Commented] (HIVE-11254) Process result sets returned by a stored procedure
[ https://issues.apache.org/jira/browse/HIVE-11254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638317#comment-14638317 ]

Dmitry Tolpeko commented on HIVE-11254:
---------------------------------------

Yes, and it is documented at http://www.plhql.org/allocate-cursor. Sorry, I will start porting it to Apache Confluence soon.

> Process result sets returned by a stored procedure
> --------------------------------------------------
>
>                 Key: HIVE-11254
>                 URL: https://issues.apache.org/jira/browse/HIVE-11254
>             Project: Hive
>          Issue Type: Improvement
>          Components: hpl/sql
>            Reporter: Dmitry Tolpeko
>            Assignee: Dmitry Tolpeko
>             Fix For: 2.0.0
>
>         Attachments: HIVE-11254.1.patch, HIVE-11254.2.patch, HIVE-11254.3.patch, HIVE-11254.4.patch
>
>
> A stored procedure can return one or more result sets. A caller should be able to process them.
[jira] [Commented] (HIVE-11347) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix CTAS
[ https://issues.apache.org/jira/browse/HIVE-11347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638316#comment-14638316 ]

Hive QA commented on HIVE-11347:
--------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12746661/HIVE-11347.01.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9256 tests executed

*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_auto_join1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_join0
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4699/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4699/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4699/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12746661 - PreCommit-HIVE-TRUNK-Build

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix CTAS
> ----------------------------------------------------------------------
>
>                 Key: HIVE-11347
>                 URL: https://issues.apache.org/jira/browse/HIVE-11347
>             Project: Hive
>          Issue Type: Sub-task
>          Components: CBO
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>         Attachments: HIVE-11347.01.patch
>
>
> Need to add a project on the final project.
[jira] [Commented] (HIVE-11055) HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)
[ https://issues.apache.org/jira/browse/HIVE-11055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638310#comment-14638310 ]

Dmitry Tolpeko commented on HIVE-11055:
---------------------------------------

Not yet, sorry. This functionality needs support from Hive core.

> HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)
> -------------------------------------------------------------------
>
>                 Key: HIVE-11055
>                 URL: https://issues.apache.org/jira/browse/HIVE-11055
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Dmitry Tolpeko
>            Assignee: Dmitry Tolpeko
>             Fix For: 2.0.0
>
>         Attachments: HIVE-11055.1.patch, HIVE-11055.2.patch, HIVE-11055.3.patch, HIVE-11055.4.patch, hplsql-site.xml
>
>
> There is a PL/HQL tool (www.plhql.org) that implements procedural SQL for Hive (actually for any SQL-on-Hadoop implementation and any JDBC source).
> Alan Gates offered to contribute it to Hive under the HPL/SQL name (org.apache.hive.hplsql package). This JIRA is to create a patch to contribute the PL/HQL code.
[jira] [Commented] (HIVE-11335) Multi-Join Inner Query producing incorrect results
[ https://issues.apache.org/jira/browse/HIVE-11335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638229#comment-14638229 ]

fatkun commented on HIVE-11335:
-------------------------------

Thanks. I tested the patch on 1.1.0; it's OK now.

> Multi-Join Inner Query producing incorrect results
> --------------------------------------------------
>
>                 Key: HIVE-11335
>                 URL: https://issues.apache.org/jira/browse/HIVE-11335
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>    Affects Versions: 1.1.0
>        Environment: CDH5.4.0
>            Reporter: fatkun
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: query1.txt, query2.txt
>
>
> Test steps:
> {code}
> create table log (uid string, uid2 string);
> insert into log values ('1', '1');
> create table user (uid string, name string);
> insert into user values ('1', "test1");
> {code}
> (Query 1)
> {code}
> select b.name, c.name from log a
> left outer join (select uid, name from user) b on (a.uid=b.uid)
> left outer join user c on (a.uid2=c.uid);
> {code}
> returns the wrong result:
> 1	test1
> Both columns should return test1.
> (Query 2) Trying to locate the error: this query (with a different join key) returns the right result.
> {code}
> select b.name, c.name from log a
> left outer join (select uid, name from user) b on (a.uid=b.uid)
> left outer join user c on (a.uid=c.uid);
> {code}
> The explain output is different: for Query 1, only one column is selected, but it should select both uid and name.
> {code}
> b:user
>   TableScan
>     alias: user
>     Statistics: Num rows: 1 Data size: 7 Basic stats: COMPLETE Column stats: NONE
>     Select Operator
>       expressions: uid (type: string)
>       outputColumnNames: _col0
> {code}
> It may be related to HIVE-10996.
> ==== UPDATE 1 ====
> (Query 3) This query returns the correct result:
> {code}
> select b.name, c.name from log a
> left outer join (select user.uid, user.name from user) b on (a.uid=b.uid)
> left outer join user c on (a.uid2=c.uid);
> {code}
> The operator tree:
> {noformat}
> TS[0]-SEL[1]-RS[5]-JOIN[6]-RS[7]-JOIN[9]-SEL[10]-FS[11]
> TS[2]-RS[4]-JOIN[6]
> TS[3]-RS[8]-JOIN[9]
> {noformat}
> In Query 1, the row schema of SEL[1] is wrong: it cannot detect the tabAlias.
[jira] [Updated] (HIVE-11350) LLAP: Fix API usage to work with evolving Tez APIs - TEZ-2005
[ https://issues.apache.org/jira/browse/HIVE-11350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated HIVE-11350:
----------------------------------
    Attachment: HIVE-11350.1.TEZ2005.txt

> LLAP: Fix API usage to work with evolving Tez APIs - TEZ-2005
> -------------------------------------------------------------
>
>                 Key: HIVE-11350
>                 URL: https://issues.apache.org/jira/browse/HIVE-11350
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>             Fix For: llap
>
>         Attachments: HIVE-11350.1.TEZ2005.txt
[jira] [Commented] (HIVE-11344) HIVE-9845 makes HCatSplit.write modify the split so that PartInfo objects are unusable after it
[ https://issues.apache.org/jira/browse/HIVE-11344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638172#comment-14638172 ]

Hive QA commented on HIVE-11344:
--------------------------------

{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12746652/HIVE-11344.patch

{color:green}SUCCESS:{color} +1 9257 tests passed

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4698/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4698/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4698/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12746652 - PreCommit-HIVE-TRUNK-Build

> HIVE-9845 makes HCatSplit.write modify the split so that PartInfo objects are unusable after it
> -----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-11344
>                 URL: https://issues.apache.org/jira/browse/HIVE-11344
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.2.0
>            Reporter: Sushanth Sowmyan
>            Assignee: Sushanth Sowmyan
>         Attachments: HIVE-11344.patch
>
>
> HIVE-9845 introduced a notion of compression for HCatSplits: when serializing, it finds commonalities between the PartInfo and TableInfo objects, and if the two are identical it nulls out that field in PartInfo, ensuring the info is not repeated when PartInfo is serialized. This, however, has the side effect of making the PartInfo object unusable once HCatSplit.write has been called.
> This does not affect M/R directly, since M/R does not know about the PartInfo objects, and once serialized the HCatSplit object is recreated by deserializing on the backend, which restores the split and its PartInfo objects. It does, however, affect framework users of HCat that mimic M/R and then use the PartInfo objects to instantiate distinct readers.
> Thus, we need to make PartInfo still usable after HCatSplit.write is called.
[jira] [Comment Edited] (HIVE-11294) Use HBase to cache aggregated stats
[ https://issues.apache.org/jira/browse/HIVE-11294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638169#comment-14638169 ]

Lefty Leverenz edited comment on HIVE-11294 at 7/23/15 5:14 AM:
----------------------------------------------------------------

Doc note: This creates four configuration parameters, so they will need to be documented in the wiki after hbase-metastore-branch gets merged to trunk. In the meantime, I'm linking this issue to HIVE-9752 (Documentation for HBase metastore).

The new parameters are:
* hive.metastore.hbase.aggr.stats.cache.entries
* hive.metastore.hbase.aggr.stats.memory.ttl
* hive.metastore.hbase.aggr.stats.invalidator.frequency
* hive.metastore.hbase.aggr.stats.hbase.ttl

was (Author: le...@hortonworks.com):
Doc note: This creates four configuration parameters, so they will need to be documented in the wiki after hbase-metastore-branch gets merged to trunk. In the meantime, I'm linking this issue to HIVE-9752 (Documentation for HBase metastore).

> Use HBase to cache aggregated stats
> -----------------------------------
>
>                 Key: HIVE-11294
>                 URL: https://issues.apache.org/jira/browse/HIVE-11294
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>    Affects Versions: hbase-metastore-branch
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>             Fix For: hbase-metastore-branch
>
>         Attachments: HIVE-11294.2.patch, HIVE-11294.patch
>
>
> Currently stats are cached only in the memory of the client. Given that HBase can easily manage the scale of caching aggregated stats, we should be using it to do so.
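For reference, the four parameters listed above would presumably be set in hive-site.xml like other metastore properties. The fragment below is only an illustrative sketch: the property names come from this issue, but the values and descriptions are hypothetical placeholders, not documented defaults.

{code}
<!-- Hypothetical hive-site.xml fragment; values are illustrative only. -->
<property>
  <name>hive.metastore.hbase.aggr.stats.cache.entries</name>
  <value>10000</value>
  <description>Example value: maximum entries in the aggregated-stats cache.</description>
</property>
<property>
  <name>hive.metastore.hbase.aggr.stats.memory.ttl</name>
  <value>60s</value>
  <description>Example value: how long an entry stays in the in-memory cache.</description>
</property>
<property>
  <name>hive.metastore.hbase.aggr.stats.invalidator.frequency</name>
  <value>5s</value>
  <description>Example value: how often the cache invalidator runs.</description>
</property>
<property>
  <name>hive.metastore.hbase.aggr.stats.hbase.ttl</name>
  <value>604800s</value>
  <description>Example value: how long aggregated stats are kept in HBase.</description>
</property>
{code}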
[jira] [Commented] (HIVE-10516) Measure Hive CLI's performance difference before and after implementation is switched
[ https://issues.apache.org/jira/browse/HIVE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638170#comment-14638170 ]

Ferdinand Xu commented on HIVE-10516:
-------------------------------------

Yes, exactly. And we can spend some time improving the performance.

> Measure Hive CLI's performance difference before and after implementation is switched
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-10516
>                 URL: https://issues.apache.org/jira/browse/HIVE-10516
>             Project: Hive
>          Issue Type: Sub-task
>          Components: CLI
>    Affects Versions: 0.10.0
>            Reporter: Xuefu Zhang
>            Assignee: Ferdinand Xu
>         Attachments: HIVE-10516-beeline-cli.patch, HIVE-10516.patch
[jira] [Commented] (HIVE-11294) Use HBase to cache aggregated stats
[ https://issues.apache.org/jira/browse/HIVE-11294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638169#comment-14638169 ]

Lefty Leverenz commented on HIVE-11294:
---------------------------------------

Doc note: This creates four configuration parameters, so they will need to be documented in the wiki after hbase-metastore-branch gets merged to trunk. In the meantime, I'm linking this issue to HIVE-9752 (Documentation for HBase metastore).

> Use HBase to cache aggregated stats
> -----------------------------------
>
>                 Key: HIVE-11294
>                 URL: https://issues.apache.org/jira/browse/HIVE-11294
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>    Affects Versions: hbase-metastore-branch
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>             Fix For: hbase-metastore-branch
>
>         Attachments: HIVE-11294.2.patch, HIVE-11294.patch
>
>
> Currently stats are cached only in the memory of the client. Given that HBase can easily manage the scale of caching aggregated stats, we should be using it to do so.
[jira] [Commented] (HIVE-11271) java.lang.IndexOutOfBoundsException when union all with if function
[ https://issues.apache.org/jira/browse/HIVE-11271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638168#comment-14638168 ]

Ashutosh Chauhan commented on HIVE-11271:
-----------------------------------------

A plan which is broken at compile time but patched up at runtime to work correctly is a *bad* idea, because the notion of brokenness is shared only between this piece of code and the runtime, and is opaque to everything in between. So any subsequent code which mutates the plan (e.g., logical optimizer rules or the physical compiler (MR/Tez/Spark compiler)) has to accommodate this special condition. In general, at any time the plan should be fully self-describing and not rely on subsequent patching.

> java.lang.IndexOutOfBoundsException when union all with if function
> -------------------------------------------------------------------
>
>                 Key: HIVE-11271
>                 URL: https://issues.apache.org/jira/browse/HIVE-11271
>             Project: Hive
>          Issue Type: Bug
>          Components: Logical Optimizer
>    Affects Versions: 0.14.0, 1.0.0, 1.2.0
>            Reporter: Yongzhi Chen
>            Assignee: Yongzhi Chen
>         Attachments: HIVE-11271.1.patch
>
>
> Some queries with UNION ALL as a subquery fail in the MapReduce task with this stack trace:
> {noformat}
> 15/07/15 14:19:30 [pool-13-thread-1]: INFO exec.UnionOperator: Initializing operator UNION[104]
> 15/07/15 14:19:30 [Thread-72]: INFO mapred.LocalJobRunner: Map task executor complete.
> 15/07/15 14:19:30 [Thread-72]: WARN mapred.LocalJobRunner: job_local826862759_0005
> java.lang.Exception: java.lang.RuntimeException: Error in configuring object
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
> Caused by: java.lang.RuntimeException: Error in configuring object
> 	at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> 	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> 	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> 	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
> 	at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> 	... 10 more
> Caused by: java.lang.RuntimeException: Error in configuring object
> 	at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> 	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> 	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> 	at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
> 	... 14 more
> Caused by: java.lang.reflect.InvocationTargetException
> 	at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> 	... 17 more
> Caused by: java.lang.RuntimeException: Map operator initialization failed
> 	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:140)
> 	... 21 more
> Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
> 	at java.util.ArrayList.rangeCheck(ArrayList.java:635)
> 	at java.util.ArrayList.get(ArrayList.java:411)
> 	at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
> 	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
> 	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
> 	at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
> 	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
> 	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
> 	at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
> 	at org.apach
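The design point in the comment above — that a plan should stay self-describing rather than be "broken now, patched at runtime" — can be illustrated with a small hypothetical sketch. This is plain Python, not Hive code; the `Operator` class and the union schema check are invented for illustration. Any pass that walks such a plan must either see consistent operator schemas or special-case every temporarily broken shape.

```python
# Hypothetical sketch (not Hive code): a plan validator that assumes
# operators are self-describing. A plan left "broken until runtime"
# forces every intermediate pass to carry knowledge of the broken state.

class Operator:
    def __init__(self, name, schema, children=None):
        self.name = name              # operator kind, e.g. "UNION"
        self.schema = schema          # list of output column names
        self.children = children or []

def check_self_describing(op):
    """Walk the plan; a UNION is valid only if all branches agree on schema."""
    for child in op.children:
        check_self_describing(child)
    if op.name == "UNION":
        schemas = [tuple(c.schema) for c in op.children]
        if len(set(schemas)) != 1:
            raise ValueError("UNION branches disagree: %s" % (schemas,))

# Well-formed plan: both branches expose the same columns.
good = Operator("UNION", ["uid", "name"], [
    Operator("SELECT", ["uid", "name"]),
    Operator("SELECT", ["uid", "name"]),
])
check_self_describing(good)  # passes silently

# "Patch it at runtime" plan: one branch was pruned to a single column on
# the assumption that runtime will fix it up. Any pass that mutates this
# plan in between now sees an inconsistent tree.
bad = Operator("UNION", ["uid", "name"], [
    Operator("SELECT", ["uid", "name"]),
    Operator("SELECT", ["uid"]),
])
try:
    check_self_describing(bad)
except ValueError as e:
    print("invalid plan:", e)
```

The sketch mirrors the stack trace above: `UnionOperator.initializeOp` fails at runtime with `IndexOutOfBoundsException` precisely because a branch no longer carries the columns the union's schema promises.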
[jira] [Commented] (HIVE-11254) Process result sets returned by a stored procedure
[ https://issues.apache.org/jira/browse/HIVE-11254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638162#comment-14638162 ]

Lefty Leverenz commented on HIVE-11254:
---------------------------------------

Does this need documentation?

> Process result sets returned by a stored procedure
> --------------------------------------------------
>
>                 Key: HIVE-11254
>                 URL: https://issues.apache.org/jira/browse/HIVE-11254
>             Project: Hive
>          Issue Type: Improvement
>          Components: hpl/sql
>            Reporter: Dmitry Tolpeko
>            Assignee: Dmitry Tolpeko
>             Fix For: 2.0.0
>
>         Attachments: HIVE-11254.1.patch, HIVE-11254.2.patch, HIVE-11254.3.patch, HIVE-11254.4.patch
>
>
> A stored procedure can return one or more result sets. A caller should be able to process them.
[jira] [Commented] (HIVE-10516) Measure Hive CLI's performance difference before and after implementation is switched
[ https://issues.apache.org/jira/browse/HIVE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638114#comment-14638114 ]

Xuefu Zhang commented on HIVE-10516:
------------------------------------

So, embedded beeline is 85% slower than Hive CLI?

> Measure Hive CLI's performance difference before and after implementation is switched
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-10516
>                 URL: https://issues.apache.org/jira/browse/HIVE-10516
>             Project: Hive
>          Issue Type: Sub-task
>          Components: CLI
>    Affects Versions: 0.10.0
>            Reporter: Xuefu Zhang
>            Assignee: Ferdinand Xu
>         Attachments: HIVE-10516-beeline-cli.patch, HIVE-10516.patch
[jira] [Commented] (HIVE-11321) Move OrcFile.OrcTableProperties from OrcFile into OrcConf.
[ https://issues.apache.org/jira/browse/HIVE-11321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638106#comment-14638106 ]

Hive QA commented on HIVE-11321:
--------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12746435/HIVE-11321.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9253 tests executed

*Failed tests:*
{noformat}
TestSchedulerQueue - did not produce a TEST-*.xml file
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4697/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4697/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4697/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12746435 - PreCommit-HIVE-TRUNK-Build

> Move OrcFile.OrcTableProperties from OrcFile into OrcConf.
> ----------------------------------------------------------
>
>                 Key: HIVE-11321
>                 URL: https://issues.apache.org/jira/browse/HIVE-11321
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>             Fix For: 2.0.0
>
>         Attachments: HIVE-11321.patch
>
>
> We should pull all of the configuration/table property knobs into a single list.
[jira] [Commented] (HIVE-10516) Measure Hive CLI's performance difference before and after implementation is switched
[ https://issues.apache.org/jira/browse/HIVE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638098#comment-14638098 ]

Ferdinand Xu commented on HIVE-10516:
-------------------------------------

Hi [~xuefuz], is this what you have in mind?

> Measure Hive CLI's performance difference before and after implementation is switched
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-10516
>                 URL: https://issues.apache.org/jira/browse/HIVE-10516
>             Project: Hive
>          Issue Type: Sub-task
>          Components: CLI
>    Affects Versions: 0.10.0
>            Reporter: Xuefu Zhang
>            Assignee: Ferdinand Xu
>         Attachments: HIVE-10516-beeline-cli.patch, HIVE-10516.patch
[jira] [Updated] (HIVE-11336) Support initial file option for new CLI [beeline-cli branch]
[ https://issues.apache.org/jira/browse/HIVE-11336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ferdinand Xu updated HIVE-11336:
--------------------------------
    Attachment: HIVE-11336-beeline-cli.1.patch

Thanks [~xuefuz] for your review. It's reasonable to have a space in the path, especially on Windows (e.g. "Personal Data"). Updated the patch to address this.

> Support initial file option for new CLI [beeline-cli branch]
> ------------------------------------------------------------
>
>                 Key: HIVE-11336
>                 URL: https://issues.apache.org/jira/browse/HIVE-11336
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Beeline
>    Affects Versions: beeline-cli-branch
>            Reporter: Ferdinand Xu
>            Assignee: Ferdinand Xu
>         Attachments: HIVE-11336-beeline-cli.1.patch, HIVE-11336-beeline-cli.patch
>
>
> Option 'i' needs to be enabled in the new CLI, which can support multiple initial files.
[jira] [Commented] (HIVE-11055) HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)
[ https://issues.apache.org/jira/browse/HIVE-11055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638072#comment-14638072 ]

wangchangchun commented on HIVE-11055:
--------------------------------------

Hello, I want to ask a question: does HPL/SQL support transactions? If so, which isolation levels: read uncommitted, read committed, repeatable read, or serializable? Can HPL/SQL support SAVEPOINT?

> HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)
> -------------------------------------------------------------------
>
>                 Key: HIVE-11055
>                 URL: https://issues.apache.org/jira/browse/HIVE-11055
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Dmitry Tolpeko
>            Assignee: Dmitry Tolpeko
>             Fix For: 2.0.0
>
>         Attachments: HIVE-11055.1.patch, HIVE-11055.2.patch, HIVE-11055.3.patch, HIVE-11055.4.patch, hplsql-site.xml
[jira] [Commented] (HIVE-11333) CBO: Calcite Operator To Hive Operator (Calcite Return Path): ColumnPruner prunes columns of UnionOperator that should be kept
[ https://issues.apache.org/jira/browse/HIVE-11333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638059#comment-14638059 ]

Hive QA commented on HIVE-11333:
--------------------------------

{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12746636/HIVE-11333.02.patch

{color:green}SUCCESS:{color} +1 9257 tests passed

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4696/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4696/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4696/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12746636 - PreCommit-HIVE-TRUNK-Build

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): ColumnPruner prunes columns of UnionOperator that should be kept
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-11333
>                 URL: https://issues.apache.org/jira/browse/HIVE-11333
>             Project: Hive
>          Issue Type: Sub-task
>          Components: CBO
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>         Attachments: HIVE-11333.01.patch, HIVE-11333.02.patch
>
>
> The UnionOperator takes its schema from the operator in the first branch. Because the ColumnPruner prunes columns based on internal names, a column in another branch may be pruned because its internal name differs from the one in the first branch. To reproduce, run rcfile_union.q with the return path turned on.
[jira] [Commented] (HIVE-8339) Job status not found after 100% succeded map&reduce
[ https://issues.apache.org/jira/browse/HIVE-8339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638055#comment-14638055 ]

xy7 commented on HIVE-8339:
---------------------------

My cluster runs Hadoop 2.6.0 and Hive 1.1.0 and shows the same bug. I tried recompiling the code from https://github.com/radimk/hive/commit/bf4d047274fb3fddd9bcfe8432154cda222e6582 and replacing Hive's output jar (hive-exec-1.1.0.jar), but it did not work; I don't know why.

{noformat}
2015-07-23 02:08:04,992 Stage-1 map = 100%, reduce = 98%, Cumulative CPU 38797.19 sec
2015-07-23 02:08:11,090 Stage-1 map = 100%, reduce = 99%, Cumulative CPU 38804.08 sec
2015-07-23 02:08:21,319 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 38815.55 sec
java.io.IOException: Could not find status of job:job_1437009840203_3838
	at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:295)
	at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:557)
	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:434)
	at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:75)
{noformat}

> Job status not found after 100% succeded map&reduce
> ---------------------------------------------------
>
>                 Key: HIVE-8339
>                 URL: https://issues.apache.org/jira/browse/HIVE-8339
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.13.1
>        Environment: Hadoop 2.4.0, Hive 0.13.1.
>                     Amazon EMR cluster of 9 i2.4xlarge nodes.
>                     800+GB of data in HDFS.
>            Reporter: Valera Chevtaev
>
> According to the logs, the job seems to 100% succeed for both map and reduce, but then Hive was not able to get the status of the job from the job history server.
> Hive logs:
> {noformat}
> 2014-10-03 07:57:26,593 INFO [main]: exec.Task (SessionState.java:printInfo(536)) - 2014-10-03 07:57:26,593 Stage-1 map = 100%, reduce = 99%, Cumulative CPU 872541.02 sec
> 2014-10-03 07:57:47,447 INFO [main]: exec.Task (SessionState.java:printInfo(536)) - 2014-10-03 07:57:47,446 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 872566.55 sec
> 2014-10-03 07:57:48,710 INFO [main]: mapred.ClientServiceDelegate (ClientServiceDelegate.java:getProxy(273)) - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
> 2014-10-03 07:57:48,716 ERROR [main]: exec.Task (SessionState.java:printError(545)) - Ended Job = job_1412263771568_0002 with exception 'java.io.IOException(Could not find status of job:job_1412263771568_0002)'
> java.io.IOException: Could not find status of job:job_1412263771568_0002
> 	at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:294)
> 	at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:547)
> 	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:426)
> 	at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
> 	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
> 	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
> 	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1503)
> 	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1270)
> 	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1088)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:275)
> 	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:227)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:430)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:366)
> 	at 
> org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:463) >at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:479) >at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:759) >at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:697) >at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:636) >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) >at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >at java.lang.reflect.Method.invoke(Method.java:606) >at org.apache.hadoop.util.RunJar.main(RunJa
[jira] [Updated] (HIVE-11316) Use datastructure that doesnt duplicate any part of string for ASTNode::toStringTree()
[ https://issues.apache.org/jira/browse/HIVE-11316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-11316: - Attachment: HIVE-11316.4.patch > Use datastructure that doesnt duplicate any part of string for > ASTNode::toStringTree() > -- > > Key: HIVE-11316 > URL: https://issues.apache.org/jira/browse/HIVE-11316 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-11316-branch-1.0.patch, > HIVE-11316-branch-1.2.patch, HIVE-11316.1.patch, HIVE-11316.2.patch, > HIVE-11316.3.patch, HIVE-11316.4.patch > > > HIVE-11281 uses an approach to memoize toStringTree() for ASTNode. This jira > is supposed to alter the string memoization to use a different data structure > that doesn't duplicate any part of the string so that we do not run into OOM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
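The non-duplicating memoization described above can be sketched as follows: instead of each node caching its own substring copy, nodes record offset ranges into one shared buffer, so the rendered tree string exists exactly once. This is an illustrative sketch under that assumption, not the attached patch; all class and method names here are hypothetical.

```java
// Hypothetical sketch: nodes remember [start, end) ranges into one shared
// StringBuilder instead of each holding a private substring copy.
public class SharedBufferMemo {
    private final StringBuilder buf = new StringBuilder();

    // Appends a piece once and returns its [start, end) range.
    int[] append(String piece) {
        int start = buf.length();
        buf.append(piece);
        return new int[] { start, buf.length() };
    }

    // Materializes a view only when a caller actually needs the text.
    String view(int[] range) {
        return buf.substring(range[0], range[1]);
    }

    public static void main(String[] args) {
        SharedBufferMemo m = new SharedBufferMemo();
        int[] a = m.append("(tok_select ");
        int[] b = m.append("(tok_table bla))");
        // Both views read from the same underlying buffer.
        System.out.println(m.view(a) + "|" + m.view(b));
    }
}
```

The point of the design is that repeated `toStringTree()` calls on subtrees share one backing buffer rather than duplicating overlapping substrings, which is what can drive the OOM the issue mentions.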
[jira] [Updated] (HIVE-9900) LLAP: Integrate MiniLLAPCluster into tests
[ https://issues.apache.org/jira/browse/HIVE-9900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-9900: --- Assignee: Vikram Dixit K > LLAP: Integrate MiniLLAPCluster into tests > -- > > Key: HIVE-9900 > URL: https://issues.apache.org/jira/browse/HIVE-9900 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Vikram Dixit K > Fix For: llap > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10117) LLAP: Use task number, attempt number to cache plans
[ https://issues.apache.org/jira/browse/HIVE-10117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638014#comment-14638014 ] Sergey Shelukhin commented on HIVE-10117: - Is this different from ObjectCache? > LLAP: Use task number, attempt number to cache plans > > > Key: HIVE-10117 > URL: https://issues.apache.org/jira/browse/HIVE-10117 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth > Fix For: llap > > > Instead of relying on thread locals only. This can be used to share the work > between Inputs / Processor / Outputs in Tez. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8678) Pig fails to correctly load DATE fields using HCatalog
[ https://issues.apache.org/jira/browse/HIVE-8678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638012#comment-14638012 ] Michael McLellan commented on HIVE-8678: I no longer have access to the system where this was an issue. This was about 9 months ago and we ended up working around it by just using Strings. I don't remember anything more - sorry I didn't write down instructions to reproduce when I created this. > Pig fails to correctly load DATE fields using HCatalog > -- > > Key: HIVE-8678 > URL: https://issues.apache.org/jira/browse/HIVE-8678 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 0.13.1 >Reporter: Michael McLellan >Assignee: Sushanth Sowmyan > > Using: > Hadoop 2.5.0-cdh5.2.0 > Pig 0.12.0-cdh5.2.0 > Hive 0.13.1-cdh5.2.0 > When using pig -useHCatalog to load a Hive table that has a DATE field, when > trying to DUMP the field, the following error occurs: > {code} > 2014-10-30 22:58:05,469 [main] ERROR > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - > org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error > converting read value to tuple > at > org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76) > at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553) > at > org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80) > at > org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) > at 
org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) > Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to > java.sql.Date > at > org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420) > at > org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457) > at > org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375) > at > org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64) > 2014-10-30 22:58:05,469 [main] ERROR > org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting > read value to tuple > {code} > It seems to be occuring here: > https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java#L433 > and that it should be: > {code}Date d = Date.valueOf(o);{code} > instead of > {code}Date d = (Date) o;{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
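The ClassCastException and the suggested `Date.valueOf` fix in the report above can be illustrated with a small standalone sketch; `toSqlDate` is a hypothetical helper written for this note, not the actual PigHCatUtil code.

```java
import java.sql.Date;

public class DateConversionSketch {
    // Hypothetical helper mirroring the suggested fix: a plain (Date) cast
    // throws ClassCastException when the value read back is a String, while
    // Date.valueOf parses "yyyy-[m]m-[d]d" strings into java.sql.Date.
    static Date toSqlDate(Object o) {
        if (o instanceof Date) {
            return (Date) o;                    // already a Date, cast is safe
        }
        return Date.valueOf(o.toString());      // parse the string form
    }

    public static void main(String[] args) {
        Object raw = "2015-02-28";              // what the loader receives
        Date d = toSqlDate(raw);
        System.out.println(d);                  // 2015-02-28
    }
}
```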
[jira] [Updated] (HIVE-11349) Update HBase metastore hbase version to 1.1.1
[ https://issues.apache.org/jira/browse/HIVE-11349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-11349: -- Attachment: HIVE-11349.patch Updated version of HBase to 1.1.1. This breaks Tephra, but we aren't testing with it at the moment anyway. > Update HBase metastore hbase version to 1.1.1 > - > > Key: HIVE-11349 > URL: https://issues.apache.org/jira/browse/HIVE-11349 > Project: Hive > Issue Type: Task > Components: Metastore >Affects Versions: hbase-metastore-branch >Reporter: Alan Gates >Assignee: Alan Gates > Attachments: HIVE-11349.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11304) Migrate to Log4j2 from Log4j 1.x
[ https://issues.apache.org/jira/browse/HIVE-11304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637995#comment-14637995 ] Thejas M Nair commented on HIVE-11304: -- [~prasanth_j], [~hsubramaniyan] added changes to be able to change logging level between queries in HIVE-10119. TestOperationLoggingAPI has tests for it. We would need to detect the current logging level for current operation in doAppend() (or its equivalent in log4j2) > Migrate to Log4j2 from Log4j 1.x > > > Key: HIVE-11304 > URL: https://issues.apache.org/jira/browse/HIVE-11304 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-11304.2.patch, HIVE-11304.patch > > > Log4J2 has some great benefits and can benefit hive significantly. Some > notable features include > 1) Performance (parametrized logging, performance when logging is disabled > etc.) More details can be found here > https://logging.apache.org/log4j/2.x/performance.html > 2) RoutingAppender - Route logs to different log files based on MDC context > (useful for HS2, LLAP etc.) > 3) Asynchronous logging > This is an umbrella jira to track changes related to Log4j2 migration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
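The performance benefit of parametrized logging mentioned in point 1 above comes from deferring message construction until the level check passes. A minimal stdlib-only sketch of that idea follows (an illustration of the principle, not the Log4j2 API itself):

```java
import java.util.function.Supplier;

public class LazyLogSketch {
    // Returns the rendered message if the level is enabled, else null,
    // without ever invoking the supplier when disabled. This mirrors how
    // parametrized logging skips formatting for levels that are turned off.
    static String debug(boolean debugEnabled, Supplier<String> msg) {
        return debugEnabled ? msg.get() : null;
    }

    public static void main(String[] args) {
        int[] big = new int[1000];
        // The expensive Arrays.toString(big) call never executes here,
        // because the debug level is disabled.
        String out = debug(false, () -> "state=" + java.util.Arrays.toString(big));
        System.out.println(out == null ? "skipped formatting" : out);
    }
}
```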
[jira] [Commented] (HIVE-11348) Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved keywords
[ https://issues.apache.org/jira/browse/HIVE-11348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637991#comment-14637991 ] Eugene Koifman commented on HIVE-11348: --- +1 pending tests > Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved > keywords > - > > Key: HIVE-11348 > URL: https://issues.apache.org/jira/browse/HIVE-11348 > Project: Hive > Issue Type: Sub-task > Components: SQL, Transactions >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11348.01.patch, HIVE-11348.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11348) Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved keywords
[ https://issues.apache.org/jira/browse/HIVE-11348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637974#comment-14637974 ] Pengcheng Xiong commented on HIVE-11348: [~ekoifman], Sure. Done. > Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved > keywords > - > > Key: HIVE-11348 > URL: https://issues.apache.org/jira/browse/HIVE-11348 > Project: Hive > Issue Type: Sub-task > Components: SQL, Transactions >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11348.01.patch, HIVE-11348.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11259) LLAP: clean up ORC dependencies part 1
[ https://issues.apache.org/jira/browse/HIVE-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637975#comment-14637975 ] Hive QA commented on HIVE-11259: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12746632/HIVE-11259.01.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4695/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4695/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4695/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4695/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + 
[[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at e57c360 HIVE-11077 Add support in parser and wire up to txn manager (Eugene Koifman, reviewed by Alan Gates) + git clean -f -d Removing ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java.orig Removing ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HybridHashTableContainer.java.orig + git checkout master Already on 'master' + git reset --hard origin/master HEAD is now at e57c360 HIVE-11077 Add support in parser and wire up to txn manager (Eugene Koifman, reviewed by Alan Gates) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12746632 - PreCommit-HIVE-TRUNK-Build > LLAP: clean up ORC dependencies part 1 > -- > > Key: HIVE-11259 > URL: https://issues.apache.org/jira/browse/HIVE-11259 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11259.01.patch, HIVE-11259.patch > > > Before there's storage handler module, we can clean some things up -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11348) Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved keywords
[ https://issues.apache.org/jira/browse/HIVE-11348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11348: --- Attachment: HIVE-11348.02.patch > Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved > keywords > - > > Key: HIVE-11348 > URL: https://issues.apache.org/jira/browse/HIVE-11348 > Project: Hive > Issue Type: Sub-task > Components: SQL, Transactions >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11348.01.patch, HIVE-11348.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills
[ https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637970#comment-14637970 ] Hive QA commented on HIVE-11306: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12746635/HIVE-11306.2.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9257 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_leftsemi_mapjoin {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4694/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4694/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4694/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12746635 - PreCommit-HIVE-TRUNK-Build > Add a bloom-1 filter for Hybrid MapJoin spills > -- > > Key: HIVE-11306 > URL: https://issues.apache.org/jira/browse/HIVE-11306 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-11306.1.patch, HIVE-11306.2.patch > > > HIVE-9277 implemented Spillable joins for Tez, which suffers from a > corner-case performance issue when joining wide small tables against a narrow > big table (like a user info table join events stream). 
> The fact that the wide table is spilled causes extra IO, even though the nDV > of the join key might be in the thousands. > A cheap bloom-1 filter would add a massive performance gain for such queries, > massively cutting down on the spill IO costs for the big-table spills. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
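A bloom-1 filter, as proposed above, is commonly understood as a blocked Bloom filter that touches only one machine word per key: a single hash selects the word, and a few bits derived from the same hash are set within it, so a probe costs one cache access. A hedged stdlib-only sketch of that structure (not the Hive implementation; the bit-derivation scheme here is an assumption):

```java
public class Bloom1Sketch {
    private final long[] words;
    private final int mask;

    // numWordsPow2 must be a power of two so we can mask instead of mod.
    Bloom1Sketch(int numWordsPow2) {
        words = new long[numWordsPow2];
        mask = numWordsPow2 - 1;
    }

    // Two bit positions inside the selected 64-bit word, both derived
    // from the one hash value (assumed scheme, for illustration).
    private long bitsFor(long hash) {
        int b1 = (int) (hash >>> 16) & 63;
        int b2 = (int) (hash >>> 24) & 63;
        return (1L << b1) | (1L << b2);
    }

    void add(long hash) {
        words[(int) hash & mask] |= bitsFor(hash);
    }

    // May report false positives, never false negatives: safe for
    // deciding whether a spilled partition could contain a key.
    boolean mightContain(long hash) {
        long bits = bitsFor(hash);
        return (words[(int) hash & mask] & bits) == bits;
    }

    public static void main(String[] args) {
        Bloom1Sketch f = new Bloom1Sketch(1024);
        f.add(123456789L);
        System.out.println(f.mightContain(123456789L)); // true
    }
}
```

For the spill case described in the issue, big-table rows whose keys the filter rejects can skip the spill write entirely, which is where the IO savings come from.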
[jira] [Updated] (HIVE-11341) Avoid expensive resizing of ASTNode tree
[ https://issues.apache.org/jira/browse/HIVE-11341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-11341: - Attachment: HIVE-11341.1.patch > Avoid expensive resizing of ASTNode tree > - > > Key: HIVE-11341 > URL: https://issues.apache.org/jira/browse/HIVE-11341 > Project: Hive > Issue Type: Bug > Components: Hive, Physical Optimizer >Affects Versions: 0.14.0 >Reporter: Mostafa Mokhtar >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-11341.1.patch > > > {code} > Stack TraceSample CountPercentage(%) > parse.BaseSemanticAnalyzer.analyze(ASTNode, Context) 1,605 90 >parse.CalcitePlanner.analyzeInternal(ASTNode) 1,605 90 > parse.SemanticAnalyzer.analyzeInternal(ASTNode, > SemanticAnalyzer$PlannerContext) 1,605 90 > parse.CalcitePlanner.genOPTree(ASTNode, > SemanticAnalyzer$PlannerContext) 1,604 90 > parse.SemanticAnalyzer.genOPTree(ASTNode, > SemanticAnalyzer$PlannerContext) 1,604 90 >parse.SemanticAnalyzer.genPlan(QB) 1,604 90 > parse.SemanticAnalyzer.genPlan(QB, boolean) 1,604 90 > parse.SemanticAnalyzer.genBodyPlan(QB, Operator, Map) > 1,604 90 > parse.SemanticAnalyzer.genFilterPlan(ASTNode, QB, > Operator, Map, boolean) 1,603 90 >parse.SemanticAnalyzer.genFilterPlan(QB, ASTNode, > Operator, boolean)1,603 90 > parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, > RowResolver, boolean)1,603 90 > > parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx) > 1,603 90 > > parse.SemanticAnalyzer.genAllExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx) > 1,603 90 > > parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx) 1,603 90 > > parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx, > TypeCheckProcFactory) 1,603 90 > > lib.DefaultGraphWalker.startWalking(Collection, HashMap) 1,579 89 > > lib.DefaultGraphWalker.walk(Node) 1,571 89 > > java.util.ArrayList.removeAll(Collection) 1,433 81 > > java.util.ArrayList.batchRemove(Collection, boolean) 1,433 81 > > 
java.util.ArrayList.contains(Object) 1,228 69 > > java.util.ArrayList.indexOf(Object)1,228 69 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
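The profile above bottoms out in ArrayList.removeAll, which performs a linear contains scan per element, O(n*m) overall. The usual remedy is to pass removeAll a HashSet so each membership test is O(1); a minimal sketch of that change (illustrative only, not the attached patch):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

public class RemoveAllSketch {
    // removeAll calls contains(...) on its argument once per element of
    // the receiver; an ArrayList argument makes that a linear scan each
    // time, while a HashSet argument makes each test O(1).
    static List<Integer> removeFast(List<Integer> from, List<Integer> toRemove) {
        List<Integer> result = new ArrayList<>(from);
        result.removeAll(new HashSet<>(toRemove)); // O(n + m) instead of O(n*m)
        return result;
    }

    public static void main(String[] args) {
        List<Integer> a = List.of(1, 2, 3, 4, 5);
        List<Integer> b = List.of(2, 4);
        System.out.println(removeFast(a, b)); // [1, 3, 5]
    }
}
```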
[jira] [Commented] (HIVE-11348) Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved keywords
[ https://issues.apache.org/jira/browse/HIVE-11348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637936#comment-14637936 ] Eugene Koifman commented on HIVE-11348: --- [~pxiong], could you also add some instructions in IdentifiersParser.g to explain why we have 2 lists and what rules should be followed when adding/not adding new ones? > Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved > keywords > - > > Key: HIVE-11348 > URL: https://issues.apache.org/jira/browse/HIVE-11348 > Project: Hive > Issue Type: Sub-task > Components: SQL, Transactions >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11348.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11348) Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved keywords
[ https://issues.apache.org/jira/browse/HIVE-11348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637932#comment-14637932 ] Pengcheng Xiong commented on HIVE-11348: [~sershe], that is why it is a subtask. Sorry to make you disappointed. :) > Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved > keywords > - > > Key: HIVE-11348 > URL: https://issues.apache.org/jira/browse/HIVE-11348 > Project: Hive > Issue Type: Sub-task > Components: SQL, Transactions >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11348.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11348) Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved keywords
[ https://issues.apache.org/jira/browse/HIVE-11348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637921#comment-14637921 ] Sergey Shelukhin commented on HIVE-11348: - This JIRA title sounds way too exciting for what the patch does :) > Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved > keywords > - > > Key: HIVE-11348 > URL: https://issues.apache.org/jira/browse/HIVE-11348 > Project: Hive > Issue Type: Sub-task > Components: SQL, Transactions >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11348.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8678) Pig fails to correctly load DATE fields using HCatalog
[ https://issues.apache.org/jira/browse/HIVE-8678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637888#comment-14637888 ] Sushanth Sowmyan commented on HIVE-8678: Also, unit tests exist since the introduction of DATE capability that have tested date interop between hive and pig through HCatalog, and that still succeeds for me when I try running them on hive 0.13.1. Could you please show me what hive commands and pig commands you're running to recreate this issue? > Pig fails to correctly load DATE fields using HCatalog > -- > > Key: HIVE-8678 > URL: https://issues.apache.org/jira/browse/HIVE-8678 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 0.13.1 >Reporter: Michael McLellan >Assignee: Sushanth Sowmyan > > Using: > Hadoop 2.5.0-cdh5.2.0 > Pig 0.12.0-cdh5.2.0 > Hive 0.13.1-cdh5.2.0 > When using pig -useHCatalog to load a Hive table that has a DATE field, when > trying to DUMP the field, the following error occurs: > {code} > 2014-10-30 22:58:05,469 [main] ERROR > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - > org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error > converting read value to tuple > at > org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76) > at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553) > at > org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80) > at > org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) > at 
org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) > Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to > java.sql.Date > at > org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420) > at > org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457) > at > org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375) > at > org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64) > 2014-10-30 22:58:05,469 [main] ERROR > org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting > read value to tuple > {code} > It seems to be occuring here: > https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java#L433 > and that it should be: > {code}Date d = Date.valueOf(o);{code} > instead of > {code}Date d = (Date) o;{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8678) Pig fails to correctly load DATE fields using HCatalog
[ https://issues.apache.org/jira/browse/HIVE-8678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637886#comment-14637886 ] Sushanth Sowmyan commented on HIVE-8678: I'm currently unable to reproduce this issue on hive-1.2 and pig-0.14.0, where I get the following: In hive: {noformat} hive> create table tdate(a string, b date) stored as orc; OK Time taken: 0.151 seconds hive> create table tsource(a string, b string) stored as orc; OK Time taken: 0.057 seconds hive> insert into table tsource values ("abc", "2015-02-28"); ... OK Time taken: 19.875 seconds hive> select * from tsource; OK abc 2015-02-28 Time taken: 0.143 seconds, Fetched: 1 row(s) hive> select a, cast(b as date) from tsource; OK abc 2015-02-28 Time taken: 0.092 seconds, Fetched: 1 row(s) hive> insert into table tdate select a, cast(b as date) from tsource; ... OK Time taken: 20.672 seconds hive> select * from tdate; OK abc 2015-02-28 Time taken: 0.051 seconds, Fetched: 1 row(s) hive> describe tdate; OK a string b date Time taken: 0.293 seconds, Fetched: 2 row(s) {noformat} In pig: {noformat} grunt> A = load 'tdate' using org.apache.hive.hcatalog.pig.HCatLoader(); grunt> describe A; 2015-07-22 15:42:26,367 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS A: {a: chararray,b: datetime} grunt> dump A; ... 
(abc,2015-02-28T00:00:00.000-08:00) grunt> {noformat} > Pig fails to correctly load DATE fields using HCatalog > -- > > Key: HIVE-8678 > URL: https://issues.apache.org/jira/browse/HIVE-8678 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 0.13.1 >Reporter: Michael McLellan >Assignee: Sushanth Sowmyan > > Using: > Hadoop 2.5.0-cdh5.2.0 > Pig 0.12.0-cdh5.2.0 > Hive 0.13.1-cdh5.2.0 > When using pig -useHCatalog to load a Hive table that has a DATE field, when > trying to DUMP the field, the following error occurs: > {code} > 2014-10-30 22:58:05,469 [main] ERROR > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - > org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error > converting read value to tuple > at > org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76) > at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553) > at > org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80) > at > org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) > Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to > java.sql.Date > at > 
org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420) > at > org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457) > at > org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375) > at > org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64) > 2014-10-30 22:58:05,469 [main] ERROR > org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting > read value to tuple > {code} > It seems to be occuring here: > https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java#L433 > and that it should be: > {code}Date d = Date.valueOf(o);{code} > instead of > {code}Date d = (Date) o;{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11348) Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved keywords
[ https://issues.apache.org/jira/browse/HIVE-11348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11348: --- Attachment: HIVE-11348.01.patch [~ekoifman], could you please review the patch? Thanks. :) > Support START TRANSACTION/COMMIT/ROLLBACK commands: support SQL2011 reserved > keywords > - > > Key: HIVE-11348 > URL: https://issues.apache.org/jira/browse/HIVE-11348 > Project: Hive > Issue Type: Sub-task > Components: SQL, Transactions >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11348.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10950) Unit test against HBase Metastore
[ https://issues.apache.org/jira/browse/HIVE-10950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637882#comment-14637882 ] Vaibhav Gumashta commented on HIVE-10950: - Assigning it to myself since [~daijy] is OOO. Will continue from where he left off. > Unit test against HBase Metastore > - > > Key: HIVE-10950 > URL: https://issues.apache.org/jira/browse/HIVE-10950 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Affects Versions: hbase-metastore-branch >Reporter: Daniel Dai >Assignee: Vaibhav Gumashta > Fix For: hbase-metastore-branch > > Attachments: HIVE-10950-1.patch, HIVE-10950-2.patch > > > We need to run the entire Hive UT against HBase Metastore and make sure they > pass. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11331) Doc Notes
[ https://issues.apache.org/jira/browse/HIVE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11331: -- Description: This ticket is to track various doc related issues for HIVE-9675 since the work is spread out over time. 1. calling set autocommit = true while a transaction is open will commit the transaction 2. document multi-statement transactions support 3. only Queries are allowed inside an open transaction (and commit/rollback) was: This ticket is to track various doc related issues for HIVE-9675 since the work is spread out over time. 1. calling set autocommit = true while a transaction is open will commit the transaction 2. document multi-statement transactions support > Doc Notes > - > > Key: HIVE-11331 > URL: https://issues.apache.org/jira/browse/HIVE-11331 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman > > This ticket is to track various doc related issues for HIVE-9675 since the > work is spread out over time. > 1. calling set autocommit = true while a transaction is open will commit the > transaction > 2. document multi-statement transactions support > 3. only Queries are allowed inside an open transaction (and commit/rollback) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-10950) Unit test against HBase Metastore
[ https://issues.apache.org/jira/browse/HIVE-10950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta reassigned HIVE-10950: --- Assignee: Vaibhav Gumashta (was: Daniel Dai) > Unit test against HBase Metastore > - > > Key: HIVE-10950 > URL: https://issues.apache.org/jira/browse/HIVE-10950 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Affects Versions: hbase-metastore-branch >Reporter: Daniel Dai >Assignee: Vaibhav Gumashta > Fix For: hbase-metastore-branch > > Attachments: HIVE-10950-1.patch, HIVE-10950-2.patch > > > We need to run the entire Hive UT against HBase Metastore and make sure they > pass. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11316) Use datastructure that doesnt duplicate any part of string for ASTNode::toStringTree()
[ https://issues.apache.org/jira/browse/HIVE-11316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637877#comment-14637877 ] Jesus Camacho Rodriguez commented on HIVE-11316: [~ekoifman], [~hsubramaniyan], that sounds good to me. > Use datastructure that doesnt duplicate any part of string for > ASTNode::toStringTree() > -- > > Key: HIVE-11316 > URL: https://issues.apache.org/jira/browse/HIVE-11316 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-11316-branch-1.0.patch, > HIVE-11316-branch-1.2.patch, HIVE-11316.1.patch, HIVE-11316.2.patch, > HIVE-11316.3.patch > > > HIVE-11281 uses an approach to memoize toStringTree() for ASTNode. This jira > is supposed to alter the string memoization to use a different data structure > that doesn't duplicate any part of the string so that we do not run into OOM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11304) Migrate to Log4j2 from Log4j 1.x
[ https://issues.apache.org/jira/browse/HIVE-11304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637875#comment-14637875 ] Hive QA commented on HIVE-11304: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12746633/HIVE-11304.2.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9256 tests executed *Failed tests:* {noformat} TestPigHBaseStorageHandler - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testNegativeCliDriver_case_with_row_sequence org.apache.hadoop.hive.ql.log.TestLog4j2Appenders.testHiveEventCounterAppender org.apache.hive.service.cli.operation.TestOperationLoggingAPIWithMr.testFetchResultsOfLogWithVerboseMode {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4693/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4693/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4693/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12746633 - PreCommit-HIVE-TRUNK-Build > Migrate to Log4j2 from Log4j 1.x > > > Key: HIVE-11304 > URL: https://issues.apache.org/jira/browse/HIVE-11304 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-11304.2.patch, HIVE-11304.patch > > > Log4J2 has some great benefits and can benefit hive significantly. 
Some > notable features include > 1) Performance (parametrized logging, performance when logging is disabled > etc.) More details can be found here > https://logging.apache.org/log4j/2.x/performance.html > 2) RoutingAppender - Route logs to different log files based on MDC context > (useful for HS2, LLAP etc.) > 3) Asynchronous logging > This is an umbrella jira to track changes related to Log4j2 migration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
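To illustrate point 1 above, a small stand-alone sketch (plain Java, no Log4j dependency; the method names are illustrative, not the Log4j2 API) of why parametrized logging is cheap when a level is disabled: the format arguments are only rendered after the level check passes, whereas string concatenation pays the rendering cost up front regardless.

```java
public class ParametrizedLoggingSketch {
    static boolean debugEnabled = false; // DEBUG level is off

    // Log4j 1.x style: the message string is fully built by the caller,
    // even when the level check inside the logger rejects it.
    static void debugConcat(String message) {
        if (debugEnabled) System.out.println(message);
    }

    // Log4j2 style: formatting is deferred until after the level check,
    // so a disabled level costs almost nothing.
    static void debugParam(String format, Object... args) {
        if (debugEnabled) System.out.println(String.format(format, args));
    }

    public static void main(String[] args) {
        Object expensive = new Object() {
            @Override public String toString() {
                throw new AssertionError("rendered even though debug is off");
            }
        };
        // debugConcat("value=" + expensive); // would render toString() eagerly and throw
        debugParam("value=%s", expensive);    // never renders: level check fails first
        System.out.println("no rendering cost when disabled");
    }
}
```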
[jira] [Updated] (HIVE-11347) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix CTAS
[ https://issues.apache.org/jira/browse/HIVE-11347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11347: --- Attachment: HIVE-11347.01.patch > CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix CTAS > -- > > Key: HIVE-11347 > URL: https://issues.apache.org/jira/browse/HIVE-11347 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11347.01.patch > > > need to add a project on the final project. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11077) Add support in parser and wire up to txn manager
[ https://issues.apache.org/jira/browse/HIVE-11077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637872#comment-14637872 ] Pengcheng Xiong commented on HIVE-11077: [~ekoifman], I will submit a tiny patch soon. Thanks. > Add support in parser and wire up to txn manager > > > Key: HIVE-11077 > URL: https://issues.apache.org/jira/browse/HIVE-11077 > Project: Hive > Issue Type: Sub-task > Components: SQL, Transactions >Affects Versions: 1.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Fix For: 1.3.0 > > Attachments: HIVE-11077.3.patch, HIVE-11077.5.patch, > HIVE-11077.6.patch, HIVE-11077.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11077) Add support in parser and wire up to txn manager
[ https://issues.apache.org/jira/browse/HIVE-11077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637793#comment-14637793 ] Pengcheng Xiong commented on HIVE-11077: SQL:2011 > Add support in parser and wire up to txn manager > > > Key: HIVE-11077 > URL: https://issues.apache.org/jira/browse/HIVE-11077 > Project: Hive > Issue Type: Sub-task > Components: SQL, Transactions >Affects Versions: 1.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Fix For: 1.3.0 > > Attachments: HIVE-11077.3.patch, HIVE-11077.5.patch, > HIVE-11077.6.patch, HIVE-11077.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11346) Fix Unit test failures when HBase Metastore is used
[ https://issues.apache.org/jira/browse/HIVE-11346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-11346: Issue Type: Sub-task (was: Bug) Parent: HIVE-9452 > Fix Unit test failures when HBase Metastore is used > --- > > Key: HIVE-11346 > URL: https://issues.apache.org/jira/browse/HIVE-11346 > Project: Hive > Issue Type: Sub-task >Affects Versions: hbase-metastore-branch >Reporter: Vaibhav Gumashta > > Umbrella jira to track HBase metastore UT failures -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11077) Add support in parser and wire up to txn manager
[ https://issues.apache.org/jira/browse/HIVE-11077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637785#comment-14637785 ] Eugene Koifman commented on HIVE-11077: --- in particular, which column in the table you referred to we should follow (i.e. which version of the standard) > Add support in parser and wire up to txn manager > > > Key: HIVE-11077 > URL: https://issues.apache.org/jira/browse/HIVE-11077 > Project: Hive > Issue Type: Sub-task > Components: SQL, Transactions >Affects Versions: 1.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Fix For: 1.3.0 > > Attachments: HIVE-11077.3.patch, HIVE-11077.5.patch, > HIVE-11077.6.patch, HIVE-11077.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11077) Add support in parser and wire up to txn manager
[ https://issues.apache.org/jira/browse/HIVE-11077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637782#comment-14637782 ] Eugene Koifman commented on HIVE-11077: --- [~pxiong], good catch. If you don't mind doing this, please go ahead. It may also be useful to add a more detailed comment in IdentifiersParser.g about how to add KW_ and to which list. > Add support in parser and wire up to txn manager > > > Key: HIVE-11077 > URL: https://issues.apache.org/jira/browse/HIVE-11077 > Project: Hive > Issue Type: Sub-task > Components: SQL, Transactions >Affects Versions: 1.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Fix For: 1.3.0 > > Attachments: HIVE-11077.3.patch, HIVE-11077.5.patch, > HIVE-11077.6.patch, HIVE-11077.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11344) HIVE-9845 makes HCatSplit.write modify the split so that PartInfo objects are unusable after it
[ https://issues.apache.org/jira/browse/HIVE-11344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-11344: Summary: HIVE-9845 makes HCatSplit.write modify the split so that PartInfo objects are unusable after it (was: HIVE-9845 makes HCatSplit.write modify the split so that PartitionInfo objects are unusable after it) > HIVE-9845 makes HCatSplit.write modify the split so that PartInfo objects are > unusable after it > --- > > Key: HIVE-11344 > URL: https://issues.apache.org/jira/browse/HIVE-11344 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-11344.patch > > > HIVE-9845 introduced a notion of compression for HCatSplits so that when > serializing, it finds commonalities between PartInfo and TableInfo objects, > and if the two are identical, it nulls out that field in PartInfo, thus > making sure that when PartInfo is then serialized, info is not repeated. > This, however, has the side effect of making the PartInfo object unusable if > HCatSplit.write has been called. > While this does not affect M/R directly, since they do not know about the > PartInfo objects and once serialized, the HCatSplit object is recreated by > deserializing on the backend, which does restore the split and its PartInfo > objects, this does, however, affect framework users of HCat that try to mimic > M/R and then use the PartInfo objects to instantiate distinct readers. > Thus, we need to make it so that PartInfo is still usable after > HCatSplit.write is called. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11259) LLAP: clean up ORC dependencies part 1
[ https://issues.apache.org/jira/browse/HIVE-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637759#comment-14637759 ] Sergey Shelukhin commented on HIVE-11259: - We were discussing putting it inside orc-encoded or some other module inside ORC. Would you want to clone Reader for that? > LLAP: clean up ORC dependencies part 1 > -- > > Key: HIVE-11259 > URL: https://issues.apache.org/jira/browse/HIVE-11259 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11259.01.patch, HIVE-11259.patch > > > Before there's storage handler module, we can clean some things up -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11259) LLAP: clean up ORC dependencies part 1
[ https://issues.apache.org/jira/browse/HIVE-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637749#comment-14637749 ] Owen O'Malley commented on HIVE-11259: -- I can't see any other application for EncodedReader other than LLAP and it clearly shouldn't be part of the core ORC API. Exposing the TreeReader API isn't great, but at least there are other applications for it. Certainly for a while, we'll need to mark the API as evolving instead of stable. From my point of view, enabling dependency injection isn't a bad thing for ORC. :) > LLAP: clean up ORC dependencies part 1 > -- > > Key: HIVE-11259 > URL: https://issues.apache.org/jira/browse/HIVE-11259 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11259.01.patch, HIVE-11259.patch > > > Before there's storage handler module, we can clean some things up -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11259) LLAP: clean up ORC dependencies part 1
[ https://issues.apache.org/jira/browse/HIVE-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637722#comment-14637722 ] Sergey Shelukhin commented on HIVE-11259: - It cannot be moved out of ORC unless ORC exposes a LOT of things to the public, for everyone to create custom readers outside of the main project, and then makes sure to keep backward compatibility with all these things. > LLAP: clean up ORC dependencies part 1 > -- > > Key: HIVE-11259 > URL: https://issues.apache.org/jira/browse/HIVE-11259 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11259.01.patch, HIVE-11259.patch > > > Before there's storage handler module, we can clean some things up -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11259) LLAP: clean up ORC dependencies part 1
[ https://issues.apache.org/jira/browse/HIVE-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637718#comment-14637718 ] Sergey Shelukhin commented on HIVE-11259: - EncodedReader is part of ORC, it's not related to LLAP, it's only influenced by LLAP in API design. Similar to how the fact that all Reader/RecordReader/etc. APIs were dictated by Hive doesn't make them part of Hive, they are still part of ORC. It's a different interface to read ORC files. How can a factory be passed in to create it? Also, if the dependency is removed, to avoid creating two different Reader-s with potentially 2 FS objects and files, I'd need to clone Reader/ReaderImpl and duplicate half the functionality, only changing the record reader type. Why can reader create recordReader but not encodedReader? > LLAP: clean up ORC dependencies part 1 > -- > > Key: HIVE-11259 > URL: https://issues.apache.org/jira/browse/HIVE-11259 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11259.01.patch, HIVE-11259.patch > > > Before there's storage handler module, we can clean some things up -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11344) HIVE-9845 makes HCatSplit.write modify the split so that PartitionInfo objects are unusable after it
[ https://issues.apache.org/jira/browse/HIVE-11344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-11344: Attachment: HIVE-11344.patch Patch implementing (a) attached. > HIVE-9845 makes HCatSplit.write modify the split so that PartitionInfo > objects are unusable after it > > > Key: HIVE-11344 > URL: https://issues.apache.org/jira/browse/HIVE-11344 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-11344.patch > > > HIVE-9845 introduced a notion of compression for HCatSplits so that when > serializing, it finds commonalities between PartInfo and TableInfo objects, > and if the two are identical, it nulls out that field in PartInfo, thus > making sure that when PartInfo is then serialized, info is not repeated. > This, however, has the side effect of making the PartInfo object unusable if > HCatSplit.write has been called. > While this does not affect M/R directly, since they do not know about the > PartInfo objects and once serialized, the HCatSplit object is recreated by > deserializing on the backend, which does restore the split and its PartInfo > objects, this does, however, affect framework users of HCat that try to mimic > M/R and then use the PartInfo objects to instantiate distinct readers. > Thus, we need to make it so that PartInfo is still usable after > HCatSplit.write is called. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11344) HIVE-9845 makes HCatSplit.write modify the split so that PartitionInfo objects are unusable after it
[ https://issues.apache.org/jira/browse/HIVE-11344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637699#comment-14637699 ] Sushanth Sowmyan commented on HIVE-11344: - [~mithun], could you please review? > HIVE-9845 makes HCatSplit.write modify the split so that PartitionInfo > objects are unusable after it > > > Key: HIVE-11344 > URL: https://issues.apache.org/jira/browse/HIVE-11344 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-11344.patch > > > HIVE-9845 introduced a notion of compression for HCatSplits so that when > serializing, it finds commonalities between PartInfo and TableInfo objects, > and if the two are identical, it nulls out that field in PartInfo, thus > making sure that when PartInfo is then serialized, info is not repeated. > This, however, has the side effect of making the PartInfo object unusable if > HCatSplit.write has been called. > While this does not affect M/R directly, since they do not know about the > PartInfo objects and once serialized, the HCatSplit object is recreated by > deserializing on the backend, which does restore the split and its PartInfo > objects, this does, however, affect framework users of HCat that try to mimic > M/R and then use the PartInfo objects to instantiate distinct readers. > Thus, we need to make it so that PartInfo is still usable after > HCatSplit.write is called. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11259) LLAP: clean up ORC dependencies part 1
[ https://issues.apache.org/jira/browse/HIVE-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637686#comment-14637686 ] Owen O'Malley commented on HIVE-11259: -- That will remove a lot of the entanglement to Allocator and DataCache. > LLAP: clean up ORC dependencies part 1 > -- > > Key: HIVE-11259 > URL: https://issues.apache.org/jira/browse/HIVE-11259 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11259.01.patch, HIVE-11259.patch > > > Before there's storage handler module, we can clean some things up -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11344) HIVE-9845 makes HCatSplit.write modify the split so that PartitionInfo objects are unusable after it
[ https://issues.apache.org/jira/browse/HIVE-11344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637678#comment-14637678 ] Sushanth Sowmyan commented on HIVE-11344: - There are three routes I see available here: a) There is decompress logic in PartInfo.setTableInfo, and compress logic in PartInfo.writeObject. We could make it so that PartInfo.writeObject does the "compression", writes itself, and then undoes the compression. b) We could decompress on demand - wherein if a user calls getInputFormatClassName(), we then fetch that info if it's not available, and always return values consistently. c) We could add a new conf parameter that controls whether or not we do compression - users with 100k splits would prefer compression, and be okay with the fact that PartInfo objects are not usable, and users that want to use the PartInfo objects will be okay with the fact that they are going to hog a little bit more serialized space. (c) is a bad solution all-round. [~ashutoshc] would be mad at me for adding another conf parameter, and it is entirely possible that those that are trying to implement other streaming interfaces/etc and are mimicking M/R will run into a large number of partitions as well. (b) is nifty, and I probably like the idea of it, but I'm not entirely certain if it will run afoul of other serialization methods in the future that call getters to get fields (some json serializers) which might result in a bloated serialized PartInfo object anyway. Also, it spreads the decompression logic across multiple getters, and pushes the assert statement into multiple places as well. (a) is probably the cleanest solution, although it makes a code reader wonder why we're going through the gymnastics we are. Some code comments might help with that. 
> HIVE-9845 makes HCatSplit.write modify the split so that PartitionInfo > objects are unusable after it > > > Key: HIVE-11344 > URL: https://issues.apache.org/jira/browse/HIVE-11344 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > > HIVE-9845 introduced a notion of compression for HCatSplits so that when > serializing, it finds commonalities between PartInfo and TableInfo objects, > and if the two are identical, it nulls out that field in PartInfo, thus > making sure that when PartInfo is then serialized, info is not repeated. > This, however, has the side effect of making the PartInfo object unusable if > HCatSplit.write has been called. > While this does not affect M/R directly, since they do not know about the > PartInfo objects and once serialized, the HCatSplit object is recreated by > deserializing on the backend, which does restore the split and its PartInfo > objects, this does, however, affect framework users of HCat that try to mimic > M/R and then use the PartInfo objects to instantiate distinct readers. > Thus, we need to make it so that PartInfo is still usable after > HCatSplit.write is called. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
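Route (a) from the comment above can be sketched with a hypothetical mini-model (class and field names are illustrative, not the actual PartInfo code): null the duplicated field only for the duration of serialization, then restore it in a finally block, so the in-memory object stays usable after write.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class WriteRestoreSketch {
    static class PartInfoLike implements Serializable {
        String sharedWithTableInfo; // stand-in for a field duplicated by TableInfo

        private void writeObject(ObjectOutputStream out) throws IOException {
            String saved = sharedWithTableInfo;
            sharedWithTableInfo = null;      // "compress": drop the duplicate on the wire
            try {
                out.defaultWriteObject();
            } finally {
                sharedWithTableInfo = saved; // "decompress": restore for later callers
            }
        }
    }

    public static void main(String[] args) throws Exception {
        PartInfoLike p = new PartInfoLike();
        p.sharedWithTableInfo = "tableInfo";
        new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(p);
        // Unlike the behavior reported in this issue, the object is intact after write:
        System.out.println(p.sharedWithTableInfo); // prints tableInfo
    }
}
```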
[jira] [Commented] (HIVE-10799) Refactor the SearchArgumentFactory to remove the dependence on ExprNodeGenericFuncDesc
[ https://issues.apache.org/jira/browse/HIVE-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637657#comment-14637657 ] Owen O'Malley commented on HIVE-10799: -- Given that ORC files currently have the value for char columns space-padded, we need to make the sarg code expand the literals to be padded to the right width. I don't think we need a CHAR type in the sarg API. > Refactor the SearchArgumentFactory to remove the dependence on > ExprNodeGenericFuncDesc > -- > > Key: HIVE-10799 > URL: https://issues.apache.org/jira/browse/HIVE-10799 > Project: Hive > Issue Type: Sub-task >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Attachments: HIVE-10799.patch, HIVE-10799.patch, HIVE-10799.patch, > HIVE-10799.patch, HIVE-10799.patch > > > SearchArgumentFactory and SearchArgumentImpl are high level and shouldn't > depend on the internals of Hive's AST model. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
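The padding Owen describes can be sketched as follows (a hypothetical helper; the name and placement in the sarg code are assumptions): expand the predicate literal to the CHAR(n) width so it compares equal to ORC's space-padded stored values.

```java
public class CharPaddingSketch {
    // Hypothetical helper: pad a sarg literal to CHAR(n) width so it
    // matches ORC's space-padded char column values.
    static String padToCharWidth(String literal, int n) {
        StringBuilder sb = new StringBuilder(literal);
        while (sb.length() < n) sb.append(' ');
        return sb.toString();
    }

    public static void main(String[] args) {
        // A CHAR(5) column stores "abc" as "abc  ", so the literal "abc"
        // must be expanded before comparison.
        String stored = "abc  ";
        String literal = padToCharWidth("abc", 5);
        System.out.println(stored.equals(literal)); // prints true
    }
}
```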
[jira] [Commented] (HIVE-11259) LLAP: clean up ORC dependencies part 1
[ https://issues.apache.org/jira/browse/HIVE-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637649#comment-14637649 ] Owen O'Malley commented on HIVE-11259: -- You need to remove the dependence between Reader and ReaderImpl and the EncodedReader and EncodedReaderImpl. I'd suggest adding passing a factory object into the OrcFile.ReaderOptions that can control the implementation of the TreeReaders and RecordReader. Basically, the goal is to make it so that LLAP can pass in a factory object that lets it control the behavior of the RecordReader and TreeReaders without making the ORC reader depend on LLAP. > LLAP: clean up ORC dependencies part 1 > -- > > Key: HIVE-11259 > URL: https://issues.apache.org/jira/browse/HIVE-11259 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11259.01.patch, HIVE-11259.patch > > > Before there's storage handler module, we can clean some things up -- This message was sent by Atlassian JIRA (v6.3.4#6332)
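The factory idea above can be sketched in miniature (all names here are illustrative, not the real ORC or LLAP API): the reader options carry a factory, the reader asks it for the RecordReader implementation, and the caller injects its own without the reader gaining a compile-time dependency on it.

```java
public class FactoryInjectionSketch {
    interface RecordReader { String next(); }

    // Hypothetical factory hook: OrcFile.ReaderOptions would hold one of
    // these, so the RecordReader implementation is pluggable.
    interface RecordReaderFactory { RecordReader create(); }

    static class ReaderOptions {
        RecordReaderFactory factory = () -> () -> "default-row";
        ReaderOptions recordReaderFactory(RecordReaderFactory f) {
            this.factory = f;
            return this;
        }
    }

    static class Reader {
        private final ReaderOptions opts;
        Reader(ReaderOptions opts) { this.opts = opts; }
        RecordReader rows() {
            // No compile-time dependency on whoever supplied the factory.
            return opts.factory.create();
        }
    }

    public static void main(String[] args) {
        // LLAP-side code injects its own reader without ORC importing LLAP:
        Reader r = new Reader(new ReaderOptions()
                .recordReaderFactory(() -> () -> "llap-cached-row"));
        System.out.println(r.rows().next()); // prints llap-cached-row
    }
}
```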
[jira] [Updated] (HIVE-11317) ACID: Improve transaction Abort logic due to timeout
[ https://issues.apache.org/jira/browse/HIVE-11317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11317: -- Description: the logic to Abort transactions that have stopped heartbeating is in TxnHandler.timeOutTxns() This is only called when DbTxnManager.getValidTxns() is called. So if there are a lot of txns that need to be timed out and there are no SQL clients talking to the system, there is nothing to abort dead transactions, and thus compaction can't clean them up so garbage accumulates in the system. Also, streaming api doesn't call DbTxnManager at all. Need to move this logic into Initiator (or some other metastore side thread). Also, make sure it is broken up into multiple small(er) transactions against metastore DB. Also move timeOutLocks() there as well. see about adding TXNS.COMMENT field which can be used for "Auto aborted due to timeout" for example. was: the logic to Abort transactions that have stopped heartbeating is in TxnHandler.timeOutTxns() This is only called when DbTxnManager.getValidTxns() is called. So if there are a lot of txns that need to be timed out and there are no SQL clients talking to the system, there is nothing to abort dead transactions, and thus compaction can't clean them up so garbage accumulates in the system. Also, streaming api doesn't call DbTxnManager at all. Need to move this logic into Initiator (or some other metastore side thread). Also, make sure it is broken up into multiple small(er) transactions against metastore DB. Also move timeOutLocks() there as well. 
> ACID: Improve transaction Abort logic due to timeout > > > Key: HIVE-11317 > URL: https://issues.apache.org/jira/browse/HIVE-11317 > Project: Hive > Issue Type: Bug > Components: Metastore, Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Labels: triage > > the logic to Abort transactions that have stopped heartbeating is in > TxnHandler.timeOutTxns() > This is only called when DbTxnManager.getValidTxns() is called. > So if there are a lot of txns that need to be timed out and there are no > SQL clients talking to the system, there is nothing to abort dead > transactions, and thus compaction can't clean them up so garbage accumulates > in the system. > Also, streaming api doesn't call DbTxnManager at all. > Need to move this logic into Initiator (or some other metastore side thread). > Also, make sure it is broken up into multiple small(er) transactions against > metastore DB. > Also move timeOutLocks() there as well. > see about adding TXNS.COMMENT field which can be used for "Auto aborted due > to timeout" for example. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11304) Migrate to Log4j2 from Log4j 1.x
[ https://issues.apache.org/jira/browse/HIVE-11304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637597#comment-14637597 ] Prasanth Jayachandran commented on HIVE-11304: -- [~gopalv] Can you please review the patch? > Migrate to Log4j2 from Log4j 1.x > > > Key: HIVE-11304 > URL: https://issues.apache.org/jira/browse/HIVE-11304 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-11304.2.patch, HIVE-11304.patch > > > Log4J2 has some great benefits and can benefit hive significantly. Some > notable features include > 1) Performance (parametrized logging, performance when logging is disabled > etc.) More details can be found here > https://logging.apache.org/log4j/2.x/performance.html > 2) RoutingAppender - Route logs to different log files based on MDC context > (useful for HS2, LLAP etc.) > 3) Asynchronous logging > This is an umbrella jira to track changes related to Log4j2 migration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11304) Migrate to Log4j2 from Log4j 1.x
[ https://issues.apache.org/jira/browse/HIVE-11304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637596#comment-14637596 ] Prasanth Jayachandran commented on HIVE-11304: -- [~thejas] Can you take a look at the changes to LogDivertAppender? I am not sure of the purpose of the doAppend() method that was there in the initial implementation. In my understanding, it seems to be checking for any changes in verbosity before writing every log line. If the verbosity changes, then it switches to a different layout. If that's the case, under what circumstances can the verbosity change? Is there a test case to verify changing verbosity? > Migrate to Log4j2 from Log4j 1.x > > > Key: HIVE-11304 > URL: https://issues.apache.org/jira/browse/HIVE-11304 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-11304.2.patch, HIVE-11304.patch > > > Log4J2 has some great benefits and can benefit hive significantly. Some > notable features include > 1) Performance (parametrized logging, performance when logging is disabled > etc.) More details can be found here > https://logging.apache.org/log4j/2.x/performance.html > 2) RoutingAppender - Route logs to different log files based on MDC context > (useful for HS2, LLAP etc.) > 3) Asynchronous logging > This is an umbrella jira to track changes related to Log4j2 migration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11077) Add support in parser and wire up to txn manager
[ https://issues.apache.org/jira/browse/HIVE-11077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637598#comment-14637598 ] Pengcheng Xiong commented on HIVE-11077: Hi [~ekoifman], I came across your patch when I was looking at Hive master. I saw that the following keywords are added to the non-reserved list. However, some of them are actually reserved ones (marked with R) following SQL2011 according to http://www.postgresql.org/docs/9.2/static/sql-keywords-appendix.html. Adding them to the non-reserved list will introduce ambiguity into the grammar. Would you mind removing them? If you agree, I can do it for you too. Thanks. :) {code} KW_WORK KW_START (R) KW_TRANSACTION KW_COMMIT (R) KW_ROLLBACK (R) KW_ONLY (R) KW_WRITE KW_ISOLATION KW_LEVEL KW_SNAPSHOT KW_AUTOCOMMIT {code} > Add support in parser and wire up to txn manager > > > Key: HIVE-11077 > URL: https://issues.apache.org/jira/browse/HIVE-11077 > Project: Hive > Issue Type: Sub-task > Components: SQL, Transactions >Affects Versions: 1.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Fix For: 1.3.0 > > Attachments: HIVE-11077.3.patch, HIVE-11077.5.patch, > HIVE-11077.6.patch, HIVE-11077.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11343) Merge trunk to hbase-metastore branch
[ https://issues.apache.org/jira/browse/HIVE-11343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates resolved HIVE-11343. --- Resolution: Fixed Fix Version/s: hbase-metastore-branch Done. > Merge trunk to hbase-metastore branch > - > > Key: HIVE-11343 > URL: https://issues.apache.org/jira/browse/HIVE-11343 > Project: Hive > Issue Type: Task > Components: Metastore >Affects Versions: hbase-metastore-branch >Reporter: Alan Gates >Assignee: Alan Gates > Fix For: hbase-metastore-branch > > > Periodic merge -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11333) CBO: Calcite Operator To Hive Operator (Calcite Return Path): ColumnPruner prunes columns of UnionOperator that should be kept
[ https://issues.apache.org/jira/browse/HIVE-11333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11333: --- Attachment: HIVE-11333.02.patch > CBO: Calcite Operator To Hive Operator (Calcite Return Path): ColumnPruner > prunes columns of UnionOperator that should be kept > -- > > Key: HIVE-11333 > URL: https://issues.apache.org/jira/browse/HIVE-11333 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11333.01.patch, HIVE-11333.02.patch > > > unionOperator will have the schema following the operator in the first > branch. Because ColumnPruner prunes columns based on the internal name, the > column in other branches may be pruned due to a different internal name from > the first branch. To repro, run rcfile_union.q with return path turned on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11331) Doc Notes
[ https://issues.apache.org/jira/browse/HIVE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11331: -- Description: This ticket is to track various doc related issues for HIVE-9675 since the work is spread out over time. 1. calling set autocommit = true while a transaction is open will commit the transaction 2. document multi-statement transactions support was: This ticket is to track various doc related issues for HIVE-9675 since the work is spread out over time. 1. calling set autocommit = true while a transaction is open will commit the transaction 2. > Doc Notes > - > > Key: HIVE-11331 > URL: https://issues.apache.org/jira/browse/HIVE-11331 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman > > This ticket is to track various doc related issues for HIVE-9675 since the > work is spread out over time. > 1. calling set autocommit = true while a transaction is open will commit the > transaction > 2. document multi-statement transactions support -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills
[ https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11306: --- Attachment: HIVE-11306.2.patch Fix assertion > Add a bloom-1 filter for Hybrid MapJoin spills > -- > > Key: HIVE-11306 > URL: https://issues.apache.org/jira/browse/HIVE-11306 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-11306.1.patch, HIVE-11306.2.patch > > > HIVE-9277 implemented Spillable joins for Tez, which suffers from a > corner-case performance issue when joining wide small tables against a narrow > big table (like a user info table join events stream). > The fact that the wide table is spilled causes extra IO, even though the nDV > of the join key might be in the thousands. > A cheap bloom-1 filter would add a massive performance gain for such queries, > massively cutting down on the spill IO costs for the big-table spills. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11304) Migrate to Log4j2 from Log4j 1.x
[ https://issues.apache.org/jira/browse/HIVE-11304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11304: - Attachment: HIVE-11304.2.patch Should fix test failures. > Migrate to Log4j2 from Log4j 1.x > > > Key: HIVE-11304 > URL: https://issues.apache.org/jira/browse/HIVE-11304 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-11304.2.patch, HIVE-11304.patch > > > Log4J2 has some great benefits and can benefit hive significantly. Some > notable features include > 1) Performance (parametrized logging, performance when logging is disabled > etc.) More details can be found here > https://logging.apache.org/log4j/2.x/performance.html > 2) RoutingAppender - Route logs to different log files based on MDC context > (useful for HS2, LLAP etc.) > 3) Asynchronous logging > This is an umbrella jira to track changes related to Log4j2 migration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills
[ https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11306: --- Attachment: (was: HIVE-11306.2.patch) > Add a bloom-1 filter for Hybrid MapJoin spills > -- > > Key: HIVE-11306 > URL: https://issues.apache.org/jira/browse/HIVE-11306 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-11306.1.patch > > > HIVE-9277 implemented Spillable joins for Tez, which suffers from a > corner-case performance issue when joining wide small tables against a narrow > big table (like a user info table join events stream). > The fact that the wide table is spilled causes extra IO, even though the nDV > of the join key might be in the thousands. > A cheap bloom-1 filter would add a massive performance gain for such queries, > massively cutting down on the spill IO costs for the big-table spills. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11259) LLAP: clean up ORC dependencies part 1
[ https://issues.apache.org/jira/browse/HIVE-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637575#comment-14637575 ] Sergey Shelukhin commented on HIVE-11259: - Got rid of TrackedCacheChunk, renamed confusingly named StreamBuffer, added better comments to it and ProcCacheChunk. > LLAP: clean up ORC dependencies part 1 > -- > > Key: HIVE-11259 > URL: https://issues.apache.org/jira/browse/HIVE-11259 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11259.01.patch, HIVE-11259.patch > > > Before there's storage handler module, we can clean some things up -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11259) LLAP: clean up ORC dependencies part 1
[ https://issues.apache.org/jira/browse/HIVE-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11259: Attachment: HIVE-11259.01.patch > LLAP: clean up ORC dependencies part 1 > -- > > Key: HIVE-11259 > URL: https://issues.apache.org/jira/browse/HIVE-11259 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11259.01.patch, HIVE-11259.patch > > > Before there's storage handler module, we can clean some things up -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills
[ https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11306: --- Attachment: (was: HIVE-11306.2.patch) > Add a bloom-1 filter for Hybrid MapJoin spills > -- > > Key: HIVE-11306 > URL: https://issues.apache.org/jira/browse/HIVE-11306 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-11306.1.patch, HIVE-11306.2.patch > > > HIVE-9277 implemented Spillable joins for Tez, which suffers from a > corner-case performance issue when joining wide small tables against a narrow > big table (like a user info table join events stream). > The fact that the wide table is spilled causes extra IO, even though the nDV > of the join key might be in the thousands. > A cheap bloom-1 filter would add a massive performance gain for such queries, > massively cutting down on the spill IO costs for the big-table spills. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills
[ https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11306: --- Attachment: HIVE-11306.2.patch > Add a bloom-1 filter for Hybrid MapJoin spills > -- > > Key: HIVE-11306 > URL: https://issues.apache.org/jira/browse/HIVE-11306 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-11306.1.patch, HIVE-11306.2.patch > > > HIVE-9277 implemented Spillable joins for Tez, which suffers from a > corner-case performance issue when joining wide small tables against a narrow > big table (like a user info table join events stream). > The fact that the wide table is spilled causes extra IO, even though the nDV > of the join key might be in the thousands. > A cheap bloom-1 filter would add a massive performance gain for such queries, > massively cutting down on the spill IO costs for the big-table spills. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills
[ https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11306: --- Attachment: HIVE-11306.2.patch [~wzheng]: Updated patch to make sure that the hashMapResult has 0 rows, for the NOMATCH case. > Add a bloom-1 filter for Hybrid MapJoin spills > -- > > Key: HIVE-11306 > URL: https://issues.apache.org/jira/browse/HIVE-11306 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-11306.1.patch, HIVE-11306.2.patch > > > HIVE-9277 implemented Spillable joins for Tez, which suffers from a > corner-case performance issue when joining wide small tables against a narrow > big table (like a user info table join events stream). > The fact that the wide table is spilled causes extra IO, even though the nDV > of the join key might be in the thousands. > A cheap bloom-1 filter would add a massive performance gain for such queries, > massively cutting down on the spill IO costs for the big-table spills. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11316) Use datastructure that doesnt duplicate any part of string for ASTNode::toStringTree()
[ https://issues.apache.org/jira/browse/HIVE-11316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637541#comment-14637541 ] Eugene Koifman commented on HIVE-11316: --- My concern was with the modification of the existing toStringTree() in a dangerous way. patch 3 keeps toStringTree() as is and adds a new "optimized" method. [~hsubramaniyan] and I just discussed and there is a best of both worlds approach. ASTNode only has 5 or 6 (inherited) methods that allow tree modification. We could overload each one to set a flag that says cached "toString" needs to be recomputed. This way no one even has to know about caching. [~jcamachorodriguez], does this seem reasonable? > Use datastructure that doesnt duplicate any part of string for > ASTNode::toStringTree() > -- > > Key: HIVE-11316 > URL: https://issues.apache.org/jira/browse/HIVE-11316 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-11316-branch-1.0.patch, > HIVE-11316-branch-1.2.patch, HIVE-11316.1.patch, HIVE-11316.2.patch, > HIVE-11316.3.patch > > > HIVE-11281 uses an approach to memoize toStringTree() for ASTNode. This jira > is suppose to alter the string memoization to use a different data structure > that doesn't duplicate any part of the string so that we do not run into OOM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
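[Editor's note] The "best of both worlds" scheme discussed above — memoize toStringTree() and have every tree-modifying method set a flag forcing recomputation — can be sketched as follows. All names here are illustrative, not Hive's ASTNode API, and a real implementation would also need to propagate invalidation up to parent nodes, which this sketch omits for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a tree node memoizes its string form, and every
// mutator marks the cache dirty, so callers never see a stale toStringTree().
public class CachedNode {
    private final String token;
    private final List<CachedNode> children = new ArrayList<>();
    private String cached;          // memoized toStringTree() result
    private boolean dirty = true;   // set by every tree-modifying method

    public CachedNode(String token) { this.token = token; }

    // The "overloaded mutator" idea: each method that can change the tree
    // invalidates the cache before doing the actual modification.
    public void addChild(CachedNode child) {
        dirty = true;
        children.add(child);
    }

    public String toStringTree() {
        if (!dirty && cached != null) {
            return cached;          // cache hit: no re-walk of the subtree
        }
        StringBuilder sb = new StringBuilder();
        if (children.isEmpty()) {
            sb.append(token);
        } else {
            sb.append('(').append(token);
            for (CachedNode c : children) {
                sb.append(' ').append(c.toStringTree());
            }
            sb.append(')');
        }
        cached = sb.toString();
        dirty = false;
        return cached;
    }
}
```

The appeal of this shape is exactly what the comment notes: callers never have to know about the caching, because the handful of inherited mutators are the only entry points that can make the cache stale.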
[jira] [Commented] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills
[ https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637538#comment-14637538 ] Wei Zheng commented on HIVE-11306: -- Only for the left outer join case, we will set joinNeeded to true if the return isn't SPILL. > Add a bloom-1 filter for Hybrid MapJoin spills > -- > > Key: HIVE-11306 > URL: https://issues.apache.org/jira/browse/HIVE-11306 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-11306.1.patch > > > HIVE-9277 implemented Spillable joins for Tez, which suffers from a > corner-case performance issue when joining wide small tables against a narrow > big table (like a user info table join events stream). > The fact that the wide table is spilled causes extra IO, even though the nDV > of the join key might be in the thousands. > A cheap bloom-1 filter would add a massive performance gain for such queries, > massively cutting down on the spill IO costs for the big-table spills. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills
[ https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637524#comment-14637524 ] Gopal V commented on HIVE-11306: Can't be, because the test-case that's broken is an inner join. All further checks are actually for {{if (joinNeeded) {}} So, the core issue is that there's no check for MATCH, so the inner join thinks it got a MATCH if the return isn't SPILL. > Add a bloom-1 filter for Hybrid MapJoin spills > -- > > Key: HIVE-11306 > URL: https://issues.apache.org/jira/browse/HIVE-11306 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-11306.1.patch > > > HIVE-9277 implemented Spillable joins for Tez, which suffers from a > corner-case performance issue when joining wide small tables against a narrow > big table (like a user info table join events stream). > The fact that the wide table is spilled causes extra IO, even though the nDV > of the join key might be in the thousands. > A cheap bloom-1 filter would add a massive performance gain for such queries, > massively cutting down on the spill IO costs for the big-table spills. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills
[ https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637516#comment-14637516 ] Wei Zheng commented on HIVE-11306: -- Looks like that's the problem. The bloom test early-determines nomatch, which is good, but broke the left outer join assumption. So maybe the right logic should be: {code} if (!bloom1.testLong(keyHash) && !isOnDisk(partitionId)) { ... return JoinUtil.JoinResult.NOMATCH; } // otherwise just pass along to the next round (join for spill partition) to decide what to do {code} > Add a bloom-1 filter for Hybrid MapJoin spills > -- > > Key: HIVE-11306 > URL: https://issues.apache.org/jira/browse/HIVE-11306 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-11306.1.patch > > > HIVE-9277 implemented Spillable joins for Tez, which suffers from a > corner-case performance issue when joining wide small tables against a narrow > big table (like a user info table join events stream). > The fact that the wide table is spilled causes extra IO, even though the nDV > of the join key might be in the thousands. > A cheap bloom-1 filter would add a massive performance gain for such queries, > massively cutting down on the spill IO costs for the big-table spills. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
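[Editor's note] For readers following the `bloom1.testLong(keyHash)` discussion above, here is a sketch of what a bloom-1-style filter looks like: one hash probe touches a single 64-bit word, so a negative test costs one memory access. This is an illustration of the technique, not Hive's actual class; the mixing function and bit layout are assumptions.

```java
// Minimal bloom-1-style filter: each key maps to one 64-bit word and two
// bit positions inside it. testLong() returning false means the key is
// definitely absent; true means possibly present (false positives allowed).
public class Bloom1 {
    private final long[] words;
    private final int mask;   // words.length - 1; length must be a power of two

    public Bloom1(int numWordsPow2) {
        words = new long[numWordsPow2];
        mask = numWordsPow2 - 1;
    }

    private static long mix(long h) {         // cheap 64-bit finalizer
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        return h;
    }

    private long wordBits(long keyHash) {     // two bit positions in one word
        long h = mix(keyHash);
        return (1L << (h & 63)) | (1L << ((h >>> 6) & 63));
    }

    public void addLong(long keyHash) {
        int w = (int) ((mix(keyHash) >>> 12) & mask);
        words[w] |= wordBits(keyHash);
    }

    public boolean testLong(long keyHash) {
        int w = (int) ((mix(keyHash) >>> 12) & mask);
        long bits = wordBits(keyHash);
        return (words[w] & bits) == bits;
    }
}
```

The proposed fix then reads naturally: a `false` from `testLong` is authoritative (the key was never inserted), so returning NOMATCH is safe only when the partition is also not spilled to disk.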
[jira] [Commented] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills
[ https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637461#comment-14637461 ] Gopal V commented on HIVE-11306: Looks like this might be due to the {{&& joinResult != JoinUtil.JoinResult.SPILL)}} in the MapJoinOperator::process(). {code} if (!noOuterJoin) { // For Hybrid Grace Hash Join, during the 1st round processing, // we only keep the LEFT side if the row is not spilled if (!conf.isHybridHashJoin() || hybridMapJoinLeftover || (!hybridMapJoinLeftover && joinResult != JoinUtil.JoinResult.SPILL)) { joinNeeded = true; storage[pos] = dummyObjVectors[pos]; } } else { storage[pos] = emptyList; } {code} > Add a bloom-1 filter for Hybrid MapJoin spills > -- > > Key: HIVE-11306 > URL: https://issues.apache.org/jira/browse/HIVE-11306 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-11306.1.patch > > > HIVE-9277 implemented Spillable joins for Tez, which suffers from a > corner-case performance issue when joining wide small tables against a narrow > big table (like a user info table join events stream). > The fact that the wide table is spilled causes extra IO, even though the nDV > of the join key might be in the thousands. > A cheap bloom-1 filter would add a massive performance gain for such queries, > massively cutting down on the spill IO costs for the big-table spills. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11305) LLAP: Hybrid Map-join cache returns invalid data
[ https://issues.apache.org/jira/browse/HIVE-11305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637457#comment-14637457 ] Hive QA commented on HIVE-11305: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12746436/HIVE-11305.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4690/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4690/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4690/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4690/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ 
-z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin >From https://github.com/apache/hive 2f0ae24..afab133 branch-1.1 -> origin/branch-1.1 1a1c0d8..a310524 hbase-metastore -> origin/hbase-metastore 72f97fc..2240dbd master -> origin/master + git reset --hard HEAD HEAD is now at 72f97fc HIVE-11303: Getting Tez LimitExceededException after dag execution on large query (Jason Dere, reviewed by Gopal V) + git clean -f -d + git checkout master Already on 'master' Your branch is behind 'origin/master' by 3 commits, and can be fast-forwarded. + git reset --hard origin/master HEAD is now at 2240dbd HIVE-11254 Process result sets returned by a stored procedure (Dmitry Tolpeko via gates) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12746436 - PreCommit-HIVE-TRUNK-Build > LLAP: Hybrid Map-join cache returns invalid data > - > > Key: HIVE-11305 > URL: https://issues.apache.org/jira/browse/HIVE-11305 > Project: Hive > Issue Type: Sub-task >Affects Versions: llap > Environment: TPC-DS 200 scale data >Reporter: Gopal V >Assignee: Sergey Shelukhin >Priority: Critical > Fix For: llap > > Attachments: HIVE-11305.patch, q55-test.sql > > > Start a 1-node LLAP cluster with 16 executors and run attached test-case on > the single node instance. 
> {code} > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer cannot be > cast to > org.apache.hadoop.hive.ql.exec.vector.mapjoin.hashtable.VectorMapJoinTableContainer > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.loadHashTable(VectorMapJoinCommonOperator.java:648) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:314) > at > org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1104) > at > org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1108) > at > org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1108) > at > org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileCha
[jira] [Updated] (HIVE-11300) HBase metastore: Support token and master key methods
[ https://issues.apache.org/jira/browse/HIVE-11300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-11300: -- Attachment: HIVE-11300.2.patch A new version of the patch rebased after HIVE-11294 was checked in. > HBase metastore: Support token and master key methods > - > > Key: HIVE-11300 > URL: https://issues.apache.org/jira/browse/HIVE-11300 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: hbase-metastore-branch >Reporter: Alan Gates >Assignee: Alan Gates > Attachments: HIVE-11300.2.patch, HIVE-11300.patch > > > The methods addToken, removeToken, getToken, getAllTokenIdentifiers, > addMasterKey, updateMasterKey, removeMasterKey, and getMasterKeys() need to > be implemented. They are all used in security. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11305) LLAP: Hybrid Map-join cache returns invalid data
[ https://issues.apache.org/jira/browse/HIVE-11305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637408#comment-14637408 ] Sergey Shelukhin commented on HIVE-11305: - ping? [~gopalv] > LLAP: Hybrid Map-join cache returns invalid data > - > > Key: HIVE-11305 > URL: https://issues.apache.org/jira/browse/HIVE-11305 > Project: Hive > Issue Type: Sub-task >Affects Versions: llap > Environment: TPC-DS 200 scale data >Reporter: Gopal V >Assignee: Sergey Shelukhin >Priority: Critical > Fix For: llap > > Attachments: HIVE-11305.patch, q55-test.sql > > > Start a 1-node LLAP cluster with 16 executors and run attached test-case on > the single node instance. > {code} > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer cannot be > cast to > org.apache.hadoop.hive.ql.exec.vector.mapjoin.hashtable.VectorMapJoinTableContainer > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.loadHashTable(VectorMapJoinCommonOperator.java:648) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:314) > at > org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1104) > at > org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1108) > at > org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1108) > at > org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1108) > at > org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1108) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:37) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:86) > ... 17 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11259) LLAP: clean up ORC dependencies part 1
[ https://issues.apache.org/jira/browse/HIVE-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637407#comment-14637407 ] Sergey Shelukhin commented on HIVE-11259: - Actually, nm, 3 is also impossible, Boolean cannot be set, so it cannot be used as an out parameter. > LLAP: clean up ORC dependencies part 1 > -- > > Key: HIVE-11259 > URL: https://issues.apache.org/jira/browse/HIVE-11259 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11259.patch > > > Before there's storage handler module, we can clean some things up -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11254) Process result sets returned by a stored procedure
[ https://issues.apache.org/jira/browse/HIVE-11254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637394#comment-14637394 ] Dmitry Tolpeko commented on HIVE-11254: --- Thanks, Alan. > Process result sets returned by a stored procedure > -- > > Key: HIVE-11254 > URL: https://issues.apache.org/jira/browse/HIVE-11254 > Project: Hive > Issue Type: Improvement > Components: hpl/sql >Reporter: Dmitry Tolpeko >Assignee: Dmitry Tolpeko > Fix For: 2.0.0 > > Attachments: HIVE-11254.1.patch, HIVE-11254.2.patch, > HIVE-11254.3.patch, HIVE-11254.4.patch > > > Stored procedure can return one or more result sets. A caller should be able > to process them. > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11341) Avoid expensive resizing of ASTNode tree
[ https://issues.apache.org/jira/browse/HIVE-11341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637282#comment-14637282 ] Mostafa Mokhtar commented on HIVE-11341: [~hagleitn] [~jcamachorodriguez] [~hsubramaniyan] FYI > Avoid expensive resizing of ASTNode tree > - > > Key: HIVE-11341 > URL: https://issues.apache.org/jira/browse/HIVE-11341 > Project: Hive > Issue Type: Bug > Components: Hive, Physical Optimizer >Affects Versions: 0.14.0 >Reporter: Mostafa Mokhtar >Assignee: Hari Sankar Sivarama Subramaniyan > > {code} > Stack TraceSample CountPercentage(%) > parse.BaseSemanticAnalyzer.analyze(ASTNode, Context) 1,605 90 >parse.CalcitePlanner.analyzeInternal(ASTNode) 1,605 90 > parse.SemanticAnalyzer.analyzeInternal(ASTNode, > SemanticAnalyzer$PlannerContext) 1,605 90 > parse.CalcitePlanner.genOPTree(ASTNode, > SemanticAnalyzer$PlannerContext) 1,604 90 > parse.SemanticAnalyzer.genOPTree(ASTNode, > SemanticAnalyzer$PlannerContext) 1,604 90 >parse.SemanticAnalyzer.genPlan(QB) 1,604 90 > parse.SemanticAnalyzer.genPlan(QB, boolean) 1,604 90 > parse.SemanticAnalyzer.genBodyPlan(QB, Operator, Map) > 1,604 90 > parse.SemanticAnalyzer.genFilterPlan(ASTNode, QB, > Operator, Map, boolean) 1,603 90 >parse.SemanticAnalyzer.genFilterPlan(QB, ASTNode, > Operator, boolean)1,603 90 > parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, > RowResolver, boolean)1,603 90 > > parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx) > 1,603 90 > > parse.SemanticAnalyzer.genAllExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx) > 1,603 90 > > parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx) 1,603 90 > > parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx, > TypeCheckProcFactory) 1,603 90 > > lib.DefaultGraphWalker.startWalking(Collection, HashMap) 1,579 89 > > lib.DefaultGraphWalker.walk(Node) 1,571 89 > > java.util.ArrayList.removeAll(Collection) 1,433 81 > > java.util.ArrayList.batchRemove(Collection, boolean) 
1,433 81 > > java.util.ArrayList.contains(Object) 1,228 69 > > java.util.ArrayList.indexOf(Object)1,228 69 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
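[Editor's note] The profile above shows 81% of samples inside `ArrayList.removeAll`/`batchRemove`, because `removeAll(c)` calls `c.contains()` once per element — O(n*m) when the argument is also an ArrayList. A common fix shape is to wrap the argument in a `HashSet` so each `contains()` is expected O(1). This is a hypothetical illustration of that fix, not Hive's DefaultGraphWalker code.

```java
import java.util.HashSet;
import java.util.List;

// Demonstrates the two call shapes: removeAll against a List scans the
// argument linearly for every element; against a HashSet each membership
// check is expected O(1), turning O(n*m) into O(n + m).
public class RemoveAllDemo {
    static List<Integer> slowShape(List<Integer> work, List<Integer> done) {
        work.removeAll(done);                 // contains() is O(m) per element
        return work;
    }

    static List<Integer> fastShape(List<Integer> work, List<Integer> done) {
        work.removeAll(new HashSet<>(done));  // contains() is expected O(1)
        return work;
    }
}
```

Both shapes produce the same result; only the membership-test cost differs, which is why the hotspot shows up in `ArrayList.contains`/`indexOf` rather than in the removal itself.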
[jira] [Commented] (HIVE-11077) Add support in parser and wire up to txn manager
[ https://issues.apache.org/jira/browse/HIVE-11077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637257#comment-14637257 ] Alan Gates commented on HIVE-11077: --- +1 > Add support in parser and wire up to txn manager > > > Key: HIVE-11077 > URL: https://issues.apache.org/jira/browse/HIVE-11077 > Project: Hive > Issue Type: Sub-task > Components: SQL, Transactions >Affects Versions: 1.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-11077.3.patch, HIVE-11077.5.patch, > HIVE-11077.6.patch, HIVE-11077.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11335) Multi-Join Inner Query producing incorrect results
[ https://issues.apache.org/jira/browse/HIVE-11335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez resolved HIVE-11335. Resolution: Duplicate
> Multi-Join Inner Query producing incorrect results
> ---
>
> Key: HIVE-11335
> URL: https://issues.apache.org/jira/browse/HIVE-11335
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Affects Versions: 1.1.0
> Environment: CDH5.4.0
> Reporter: fatkun
> Assignee: Jesus Camacho Rodriguez
> Attachments: query1.txt, query2.txt
>
> Test steps:
> {code}
> create table log (uid string, uid2 string);
> insert into log values ('1', '1');
> create table user (uid string, name string);
> insert into user values ('1', "test1");
> {code}
> (Query 1)
> {code}
> select b.name, c.name from log a
> left outer join (select uid, name from user) b on (a.uid=b.uid)
> left outer join user c on (a.uid2=c.uid);
> {code}
> returns the wrong result:
> 1	test1
> Both columns should return test1.
> (Query 2) While trying to isolate the error, I found that this query (with a different join key) returns the correct result:
> {code}
> select b.name, c.name from log a
> left outer join (select uid, name from user) b on (a.uid=b.uid)
> left outer join user c on (a.uid=c.uid);
> {code}
> The explain output is different: for Query 1 the subquery selects only one column, but it should select both uid and name.
> {code}
> b:user
>   TableScan
>     alias: user
>     Statistics: Num rows: 1 Data size: 7 Basic stats: COMPLETE Column stats: NONE
>     Select Operator
>       expressions: uid (type: string)
>       outputColumnNames: _col0
> {code}
> This may be related to HIVE-10996.
> ===UPDATE 1===
> (Query 3) This query returns the correct result:
> {code}
> select b.name, c.name from log a
> left outer join (select user.uid, user.name from user) b on (a.uid=b.uid)
> left outer join user c on (a.uid2=c.uid);
> {code}
> The operator tree:
> TS[0]-SEL[1]-RS[5]-JOIN[6]-RS[7]-JOIN[9]-SEL[10]-FS[11]
> TS[2]-RS[4]-JOIN[6]
> TS[3]-RS[8]-JOIN[9]
> The row schema of SEL[1] in Query 1 is wrong: it cannot detect the tabAlias.
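For reference, standard SQL semantics give the result the reporter expects. A minimal sketch using Python's sqlite3 (not Hive, but the same schema, data, and Query 1 from the reproduction steps above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Same schema and data as the reproduction steps above.
cur.execute("create table log (uid text, uid2 text)")
cur.execute("insert into log values ('1', '1')")
cur.execute("create table user (uid text, name text)")
cur.execute("insert into user values ('1', 'test1')")

# Query 1 from the report: both b.name and c.name should be 'test1'.
rows = cur.execute("""
    select b.name, c.name from log a
    left outer join (select uid, name from user) b on (a.uid = b.uid)
    left outer join user c on (a.uid2 = c.uid)
""").fetchall()
print(rows)  # [('test1', 'test1')]
```

The affected Hive versions instead return the uid value ('1') for b.name, matching the pruned single-column SEL in the explain output.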
[jira] [Updated] (HIVE-9613) Left join query plan outputs wrong column when using subquery
[ https://issues.apache.org/jira/browse/HIVE-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9613: -- Fix Version/s: 1.1.1
> Left join query plan outputs wrong column when using subquery
> ---
>
> Key: HIVE-9613
> URL: https://issues.apache.org/jira/browse/HIVE-9613
> Project: Hive
> Issue Type: Bug
> Components: Parser, Query Planning
> Affects Versions: 0.14.0, 1.0.0
> Environment: apache hadoop 2.5.1
> Reporter: Li Xin
> Assignee: Gunther Hagleitner
> Fix For: 1.2.0, 1.1.1
>
> Attachments: HIVE-9613.1.patch, test.sql
>
> I have a query that, when using a subquery, outputs a column with the wrong contents: the contents of that column are equal to another column's, not its own.
> I have three tables, as follows:
> Table 1, _hivetemp.category_city_rank_:
> ||category||city||rank||
> |jinrongfuwu|shanghai|1|
> |ktvjiuba|shanghai|2|
> Table 2, _hivetemp.category_match_:
> ||src_category_en||src_category_cn||dst_category_cn||dst_category_en||
> |danbaobaoxiantouzi|投资担保|担保/贷款|jinrongfuwu|
> |zpwentiyingshi|娱乐/休闲|KTV/酒吧|ktvjiuba|
> Table 3, _hivetemp.city_match_:
> ||src_city_name_en||dst_city_name_en||city_name_cn||
> |sh|shanghai|上海|
> And the query is:
> {code}
> select
>   a.category,
>   a.city,
>   a.rank,
>   b.src_category_en,
>   c.src_city_name_en
> from
>   hivetemp.category_city_rank a
> left outer join
>   (select
>     src_category_en,
>     dst_category_en
>   from
>     hivetemp.category_match) b
> on a.category = b.dst_category_en
> left outer join
>   (select
>     src_city_name_en,
>     dst_city_name_en
>   from
>     hivetemp.city_match) c
> on a.city = c.dst_city_name_en
> {code}
> which should output the following results (verified on Hive 0.13):
> ||category||city||rank||src_category_en||src_city_name_en||
> |jinrongfuwu|shanghai|1|danbaobaoxiantouzi|sh|
> |ktvjiuba|shanghai|2|zpwentiyingshi|sh|
> But in Hive 0.14, the contents of the column *src_category_en* are wrong; they are just the *city* contents:
> ||category||city||rank||src_category_en||src_city_name_en||
> |jinrongfuwu|shanghai|1|shanghai|sh|
> |ktvjiuba|shanghai|2|shanghai|sh|
> Using explain to examine the execution plan, I can see that the first subquery outputs only the column *dst_category_en*; *src_category_en* is missing:
> {quote}
> b:category_match
>   TableScan
>     alias: category_match
>     Statistics: Num rows: 131 Data size: 13149 Basic stats: COMPLETE Column stats: NONE
>     Select Operator
>       expressions: dst_category_en (type: string)
>       outputColumnNames: _col1
>       Statistics: Num rows: 131 Data size: 13149 Basic stats: COMPLETE Column stats: NONE
> {quote}
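Here too, standard SQL semantics give the Hive 0.13 output. A sketch with Python's sqlite3, reproducing only the columns the query touches (the Chinese-label columns are not needed for the join):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Only the columns the query actually touches are reproduced here.
cur.execute("create table category_city_rank (category text, city text, rank integer)")
cur.executemany("insert into category_city_rank values (?, ?, ?)",
                [("jinrongfuwu", "shanghai", 1), ("ktvjiuba", "shanghai", 2)])
cur.execute("create table category_match (src_category_en text, dst_category_en text)")
cur.executemany("insert into category_match values (?, ?)",
                [("danbaobaoxiantouzi", "jinrongfuwu"), ("zpwentiyingshi", "ktvjiuba")])
cur.execute("create table city_match (src_city_name_en text, dst_city_name_en text)")
cur.execute("insert into city_match values ('sh', 'shanghai')")

rows = cur.execute("""
    select a.category, a.city, a.rank, b.src_category_en, c.src_city_name_en
    from category_city_rank a
    left outer join (select src_category_en, dst_category_en
                     from category_match) b on a.category = b.dst_category_en
    left outer join (select src_city_name_en, dst_city_name_en
                     from city_match) c on a.city = c.dst_city_name_en
    order by a.rank
""").fetchall()
print(rows)
# [('jinrongfuwu', 'shanghai', 1, 'danbaobaoxiantouzi', 'sh'),
#  ('ktvjiuba', 'shanghai', 2, 'zpwentiyingshi', 'sh')]
```

The buggy plan instead feeds the join key (_col1) through where src_category_en should be, which is why the broken output repeats the city value.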
[jira] [Commented] (HIVE-11335) Multi-Join Inner Query producing incorrect results
[ https://issues.apache.org/jira/browse/HIVE-11335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637238#comment-14637238 ] Jesus Camacho Rodriguez commented on HIVE-11335: [~fatkun], this is a duplicate of HIVE-9613, which had not been committed to 1.1 (in fact, the issue cannot be reproduced in other versions). With that patch, the problem is solved. I have just backported it, thus I mark this one as duplicate.
> Multi-Join Inner Query producing incorrect results
> ---
>
> Key: HIVE-11335
> URL: https://issues.apache.org/jira/browse/HIVE-11335
[jira] [Commented] (HIVE-11271) java.lang.IndexOutOfBoundsException when union all with if function
[ https://issues.apache.org/jira/browse/HIVE-11271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637229#comment-14637229 ] Yongzhi Chen commented on HIVE-11271: - [~pxiong], thanks for your advice. Your suggestion should work; I just do not understand why changing the run-time code is not good. In your plan, you add an extra SEL, which does just the same as what the run-time map in this patch does, and the extra select should not have better performance than the Filter with an input-to-output map. Another good thing about my change is that it needs zero q-file changes. :) I would be happy to make the changes you suggest if you or [~ashutoshc] can explain why the change has to be at compile time. Thanks.
> java.lang.IndexOutOfBoundsException when union all with if function
> ---
>
> Key: HIVE-11271
> URL: https://issues.apache.org/jira/browse/HIVE-11271
> Project: Hive
> Issue Type: Bug
> Components: Logical Optimizer
> Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Reporter: Yongzhi Chen
> Assignee: Yongzhi Chen
> Attachments: HIVE-11271.1.patch
>
> Some queries with union all as a subquery fail in the MapReduce task with this stack trace:
> {noformat}
> 15/07/15 14:19:30 [pool-13-thread-1]: INFO exec.UnionOperator: Initializing operator UNION[104]
> 15/07/15 14:19:30 [Thread-72]: INFO mapred.LocalJobRunner: Map task executor complete.
> 15/07/15 14:19:30 [Thread-72]: WARN mapred.LocalJobRunner: > job_local826862759_0005 > java.lang.Exception: java.lang.RuntimeException: Error in configuring object > at > org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354) > Caused by: java.lang.RuntimeException: Error in configuring object > at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93) > at > org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64) > at > org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366) > at > org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88) > ... 10 more > Caused by: java.lang.RuntimeException: Error in configuring object > at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93) > at > org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64) > at > org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117) > at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34) > ... 
14 more > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88) > ... 17 more > Caused by: java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:140) > ... 21 more > Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 > at java.util.ArrayList.rangeCheck(ArrayList.java:635) > at java.util.ArrayList.get(ArrayList.java:411) > at > org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86) > at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362) > at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481) > at > org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438) > at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375) > at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481) > at > org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java
[jira] [Commented] (HIVE-11271) java.lang.IndexOutOfBoundsException when union all with if function
[ https://issues.apache.org/jira/browse/HIVE-11271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637177#comment-14637177 ] Pengcheng Xiong commented on HIVE-11271: [~ashutoshc], thanks a lot for your attention. I applied the patch in HIVE-11333 and it seems that it cannot solve the problem here. The problem here is that we have FIL-UNION, and the FIL has to have 2 columns (1 for the union and 1 for the predicate). The problem in HIVE-11333 is that we have SEL-UNION: because of the return path, the column in the SEL got wrongly pruned. However, I still agree with you that a better fix should be at compile time, not run time. [~ychena], it seems that this problem is similar to the issue mentioned in https://issues.apache.org/jira/browse/HIVE-10996, although that one deals with JOIN. A similar solution, adding a SEL, could work like this: (1) In ColumnPruner, when dealing with a FIL, check the needed columns (from its child) and the columns used in the predicate. (2) If the former contains the latter, continue (no problem); else insert a SEL, which just selects the needed columns, between the FIL and its child. (3) This solution happens at compile time, not run time, but it may involve many q-file updates. Thanks.
> java.lang.IndexOutOfBoundsException when union all with if function
> ---
>
> Key: HIVE-11271
> URL: https://issues.apache.org/jira/browse/HIVE-11271
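The compile-time proposal in steps (1)-(3) above boils down to a set-containment check during column pruning. A minimal illustration of just that logic, in plain Python with hypothetical dict-based operator nodes (not Hive's actual ColumnPruner or Operator API):

```python
def needs_projection(needed_cols, predicate_cols):
    """Step (2) above: a synthetic SEL is only required when the predicate
    reads columns that the consumer of the FIL does not need."""
    return not set(predicate_cols) <= set(needed_cols)

def plan_filter(child, needed_cols, predicate_cols):
    """Return the (possibly extended) operator chain below the FIL."""
    if not needs_projection(needed_cols, predicate_cols):
        return child
    keep = sorted(set(needed_cols) | set(predicate_cols))
    # Hypothetical dict-based operator node, standing in for Hive's Operator tree.
    return {"op": "SEL", "cols": keep, "child": child}

# The consumer needs only col_a, but the predicate reads col_b -> a SEL is inserted.
plan = plan_filter({"op": "TS"}, needed_cols=["col_a"], predicate_cols=["col_b"])
print(plan["op"], plan["cols"])  # SEL ['col_a', 'col_b']
```

Where exactly the SEL lands in Hive's operator DAG, and the q-file churn it causes, are the points debated in the comments above; the sketch only shows the containment test itself.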
[jira] [Updated] (HIVE-11328) Avoid String representation of expression nodes in ConstantPropagateProcFactory unless necessary
[ https://issues.apache.org/jira/browse/HIVE-11328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11328: --- Attachment: (was: HIVE-11310.4.branch-1.2.patch) > Avoid String representation of expression nodes in > ConstantPropagateProcFactory unless necessary > > > Key: HIVE-11328 > URL: https://issues.apache.org/jira/browse/HIVE-11328 > Project: Hive > Issue Type: Bug >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Fix For: 2.0.0 > > Attachments: HIVE-11328.branch-1.0.patch, > HIVE-11328.branch-1.2.patch, HIVE-11328.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11328) Avoid String representation of expression nodes in ConstantPropagateProcFactory unless necessary
[ https://issues.apache.org/jira/browse/HIVE-11328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11328: --- Attachment: HIVE-11328.branch-1.2.patch > Avoid String representation of expression nodes in > ConstantPropagateProcFactory unless necessary > > > Key: HIVE-11328 > URL: https://issues.apache.org/jira/browse/HIVE-11328 > Project: Hive > Issue Type: Bug >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Fix For: 2.0.0 > > Attachments: HIVE-11328.branch-1.0.patch, > HIVE-11328.branch-1.2.patch, HIVE-11328.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11328) Avoid String representation of expression nodes in ConstantPropagateProcFactory unless necessary
[ https://issues.apache.org/jira/browse/HIVE-11328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11328: --- Attachment: HIVE-11310.4.branch-1.2.patch > Avoid String representation of expression nodes in > ConstantPropagateProcFactory unless necessary > > > Key: HIVE-11328 > URL: https://issues.apache.org/jira/browse/HIVE-11328 > Project: Hive > Issue Type: Bug >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Fix For: 2.0.0 > > Attachments: HIVE-11310.4.branch-1.2.patch, > HIVE-11328.branch-1.0.patch, HIVE-11328.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11317) ACID: Improve transaction Abort logic due to timeout
[ https://issues.apache.org/jira/browse/HIVE-11317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated HIVE-11317: -- Labels: triage (was: )
> ACID: Improve transaction Abort logic due to timeout
> ---
>
> Key: HIVE-11317
> URL: https://issues.apache.org/jira/browse/HIVE-11317
> Project: Hive
> Issue Type: Bug
> Components: Metastore, Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Labels: triage
>
> The logic to abort transactions that have stopped heartbeating is in TxnHandler.timeOutTxns(). This is only called when DbTxnManager.getValidTxns() is called. So if there are a lot of txns that need to be timed out and there are no SQL clients talking to the system, nothing aborts the dead transactions, and thus compaction can't clean them up, so garbage accumulates in the system. Also, the streaming API doesn't call DbTxnManager at all.
> We need to move this logic into the Initiator (or some other metastore-side thread), and make sure it is broken up into multiple small(er) transactions against the metastore DB. The same applies to timeOutLocks().
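The proposed move — timing out dead transactions from a metastore-side background thread, in small batches — could be sketched roughly as below. This is an illustration only; `FakeStore`, `open_txns`, and `abort_batch` are hypothetical stand-ins, not Hive's TxnHandler API:

```python
from dataclasses import dataclass, field

HEARTBEAT_TIMEOUT = 300   # seconds without a heartbeat before a txn is considered dead
BATCH_SIZE = 50           # keep each metastore-DB transaction small

@dataclass
class Txn:
    txn_id: int
    last_heartbeat: float

@dataclass
class FakeStore:
    txns: list
    aborted: list = field(default_factory=list)
    def open_txns(self):
        return self.txns
    def abort_batch(self, batch):
        self.aborted.extend(t.txn_id for t in batch)

def reap_dead_txns(store, now):
    """One pass of the background reaper: abort timed-out txns in small batches,
    so no single pass holds one huge transaction against the metastore DB."""
    expired = [t for t in store.open_txns()
               if now - t.last_heartbeat > HEARTBEAT_TIMEOUT]
    for i in range(0, len(expired), BATCH_SIZE):
        store.abort_batch(expired[i:i + BATCH_SIZE])  # one small DB txn per batch
    return len(expired)

store = FakeStore([Txn(1, 0.0), Txn(2, 900.0)])
print(reap_dead_txns(store, now=1000.0))  # 1 -- only txn 1 stopped heartbeating long enough
```

Running this from a dedicated thread (as the description suggests for the Initiator) removes the dependency on clients calling getValidTxns(), which also covers the streaming API case.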
[jira] [Updated] (HIVE-11328) Avoid String representation of expression nodes in ConstantPropagateProcFactory unless necessary
[ https://issues.apache.org/jira/browse/HIVE-11328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11328: --- Attachment: HIVE-11328.branch-1.0.patch > Avoid String representation of expression nodes in > ConstantPropagateProcFactory unless necessary > > > Key: HIVE-11328 > URL: https://issues.apache.org/jira/browse/HIVE-11328 > Project: Hive > Issue Type: Bug >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Fix For: 2.0.0 > > Attachments: HIVE-11328.branch-1.0.patch, HIVE-11328.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11209) Clean up dependencies in HiveDecimalWritable
[ https://issues.apache.org/jira/browse/HIVE-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637099#comment-14637099 ] Owen O'Malley commented on HIVE-11209: -- For most types, it would be a problem. With decimal, we are doing so many allocations in the inner loop that this won't be noticeable. We really need to fix, or even better reimplement, Decimal128 to get good performance on decimal columns. If you really want me to fix this instance, how about a thread local? Hive's overuse of static caches has been a huge source of problems.
> Clean up dependencies in HiveDecimalWritable
> ---
>
> Key: HIVE-11209
> URL: https://issues.apache.org/jira/browse/HIVE-11209
> Project: Hive
> Issue Type: Sub-task
> Reporter: Owen O'Malley
> Assignee: Owen O'Malley
> Fix For: 2.0.0
>
> Attachments: HIVE-11209.patch, HIVE-11209.patch, HIVE-11209.patch, HIVE-11209.patch
>
> Currently HiveDecimalWritable depends on:
> * org.apache.hadoop.hive.serde2.ByteStream
> * org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils
> * org.apache.hadoop.hive.serde2.typeinfo.HiveDecimalUtils
> Since we need HiveDecimalWritable for the decimal VectorizedColumnBatch, breaking these dependencies will improve things.
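The thread-local alternative mentioned in the comment — a per-thread scratch object instead of a shared static cache — is a general pattern. A minimal Python sketch (not Hive's code; Java's java.lang.ThreadLocal plays the same role there):

```python
import threading

_scratch = threading.local()

def get_scratch_buffer(size=64):
    """Per-thread reusable buffer: no locking and no cross-thread sharing bugs,
    at the cost of one buffer per live thread."""
    buf = getattr(_scratch, "buf", None)
    if buf is None or len(buf) < size:
        buf = bytearray(size)
        _scratch.buf = buf  # cached for reuse by this thread only
    return buf

results = []
def worker():
    results.append(get_scratch_buffer())

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results[0] is results[1])  # False -- each thread got its own buffer
```

Compared with a static cache, this trades a little memory per thread for the elimination of the shared-state problems the comment complains about.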
[jira] [Updated] (HIVE-11310) Avoid expensive AST tree conversion to String for expressions in WHERE clause
[ https://issues.apache.org/jira/browse/HIVE-11310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11310: --- Attachment: HIVE-11310.4.branch-1.0.patch > Avoid expensive AST tree conversion to String for expressions in WHERE clause > - > > Key: HIVE-11310 > URL: https://issues.apache.org/jira/browse/HIVE-11310 > Project: Hive > Issue Type: Bug > Components: Parser >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Fix For: 2.0.0 > > Attachments: HIVE-11310.1.patch, HIVE-11310.2.patch, > HIVE-11310.3.patch, HIVE-11310.4.branch-1.0.patch, > HIVE-11310.4.branch-1.2.patch, HIVE-11310.4.patch, HIVE-11310.patch > > > We use the AST tree String representation of a condition in the WHERE clause > to identify its column in the RowResolver. This can lead to OOM Exceptions > when the condition is very large. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
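The general technique behind this fix — keying a lookup by node identity instead of by a rendered string of the whole condition subtree — can be sketched as follows. This is illustrative only, not Hive's RowResolver API:

```python
class Node:
    """Toy expression-tree node standing in for an ASTNode."""
    def __init__(self, op, children=()):
        self.op = op
        self.children = list(children)
    def __str__(self):
        # Rendering the whole subtree costs O(subtree size) in time and memory
        # per call, which is what blows up on very large WHERE clauses.
        return f"{self.op}({', '.join(map(str, self.children))})"

# Expensive: key = the full string of the condition subtree.
def lookup_by_string(resolver, node):
    return resolver.get(str(node))

# Cheap: key = the node object itself (identity), no rendering at all.
def lookup_by_identity(resolver, node):
    return resolver.get(id(node))

cond = Node("AND", [Node("x"), Node("y")])
by_id = {id(cond): "col_1"}
print(lookup_by_identity(by_id, cond))  # col_1
```

Identity keys assume the same node object is used for registration and lookup, which holds when both sides walk the same parsed tree; that is the assumption that makes the string materialization avoidable.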
[jira] [Updated] (HIVE-11341) Avoid expensive resizing of ASTNode tree
[ https://issues.apache.org/jira/browse/HIVE-11341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mostafa Mokhtar updated HIVE-11341: --- Summary: Avoid expensive resizing of ASTNode tree (was: Avoid resizing of ASTNode tree ) > Avoid expensive resizing of ASTNode tree > - > > Key: HIVE-11341 > URL: https://issues.apache.org/jira/browse/HIVE-11341 > Project: Hive > Issue Type: Bug > Components: Hive, Physical Optimizer >Affects Versions: 0.14.0 >Reporter: Mostafa Mokhtar >Assignee: Hari Sankar Sivarama Subramaniyan > > {code} > Stack TraceSample CountPercentage(%) > parse.BaseSemanticAnalyzer.analyze(ASTNode, Context) 1,605 90 >parse.CalcitePlanner.analyzeInternal(ASTNode) 1,605 90 > parse.SemanticAnalyzer.analyzeInternal(ASTNode, > SemanticAnalyzer$PlannerContext) 1,605 90 > parse.CalcitePlanner.genOPTree(ASTNode, > SemanticAnalyzer$PlannerContext) 1,604 90 > parse.SemanticAnalyzer.genOPTree(ASTNode, > SemanticAnalyzer$PlannerContext) 1,604 90 >parse.SemanticAnalyzer.genPlan(QB) 1,604 90 > parse.SemanticAnalyzer.genPlan(QB, boolean) 1,604 90 > parse.SemanticAnalyzer.genBodyPlan(QB, Operator, Map) > 1,604 90 > parse.SemanticAnalyzer.genFilterPlan(ASTNode, QB, > Operator, Map, boolean) 1,603 90 >parse.SemanticAnalyzer.genFilterPlan(QB, ASTNode, > Operator, boolean)1,603 90 > parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, > RowResolver, boolean)1,603 90 > > parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx) > 1,603 90 > > parse.SemanticAnalyzer.genAllExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx) > 1,603 90 > > parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx) 1,603 90 > > parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx, > TypeCheckProcFactory) 1,603 90 > > lib.DefaultGraphWalker.startWalking(Collection, HashMap) 1,579 89 > > lib.DefaultGraphWalker.walk(Node) 1,571 89 > > java.util.ArrayList.removeAll(Collection) 1,433 81 > > java.util.ArrayList.batchRemove(Collection, boolean) 1,433 81 > > 
java.util.ArrayList.contains(Object) 1,228 69
> java.util.ArrayList.indexOf(Object) 1,228 69
> {code}
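The hot frames in the profile above are the classic quadratic pattern: ArrayList.removeAll tests membership with contains/indexOf, a linear scan, once per element, so the pass is O(n*m). Switching the lookup side to a hash set makes it linear. A small Python illustration of the two shapes (not Hive's walker code):

```python
def remove_all_list(items, to_remove):
    # Mirrors ArrayList.removeAll over another list: each membership test
    # is a linear scan, so the whole pass is O(n * m).
    return [x for x in items if x not in to_remove]

def remove_all_set(items, to_remove):
    # Same result, but hashing gives O(1) membership tests: O(n + m) overall.
    removal = set(to_remove)
    return [x for x in items if x not in removal]

items = list(range(10))
to_remove = [2, 4, 6]
print(remove_all_list(items, to_remove) == remove_all_set(items, to_remove))  # True
```

The set variant assumes the elements are hashable; for the ASTNode case this would mean identity-hashed nodes (e.g. an IdentityHashMap-backed set in Java).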