[jira] [Updated] (HIVE-9169) UT: set hive.support.concurrency to true for spark UTs
[ https://issues.apache.org/jira/browse/HIVE-9169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bing Li updated HIVE-9169: -- Assignee: (was: Bing Li) > UT: set hive.support.concurrency to true for spark UTs > -- > > Key: HIVE-9169 > URL: https://issues.apache.org/jira/browse/HIVE-9169 > Project: Hive > Issue Type: Sub-task > Components: Tests >Affects Versions: spark-branch >Reporter: Thomas Friedrich >Priority: Minor > > The test cases > lock1 > lock2 > lock3 > lock4 > are failing because the flag hive.support.concurrency is set to false in the > hive-site.xml for the spark tests. > This value was set to true in trunk with HIVE-1293 when these test cases were > introduced to Hive. > After setting the value to true and generating the output files, the test > cases are successful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
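For reference, enabling the flag in the test configuration amounts to a hive-site.xml entry like the following (a sketch; the exact location of the Spark test hive-site.xml is not shown in this message):

```xml
<property>
  <name>hive.support.concurrency</name>
  <value>true</value> <!-- was false in the spark test hive-site.xml -->
</property>
```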
[jira] [Updated] (HIVE-11649) Hive UPDATE,INSERT,DELETE issue
[ https://issues.apache.org/jira/browse/HIVE-11649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Veerendra Nath Jasthi updated HIVE-11649: - Attachment: hive-site.xml Hi Alan, PFA. > Hive UPDATE,INSERT,DELETE issue > --- > > Key: HIVE-11649 > URL: https://issues.apache.org/jira/browse/HIVE-11649 > Project: Hive > Issue Type: Bug > Environment: Hadoop-2.2.0 , hive-1.2.0 ,operating system > ubuntu14.04lts (64-bit) & Java 1.7 >Reporter: Veerendra Nath Jasthi >Assignee: Hive QA > Attachments: afterChange.png, beforeChange.png, hive-site.xml, > hive.log > > > I have been trying to implement the UPDATE,INSERT,DELETE operations in a hive > table as per the link: > https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions- > > but whenever I try to include the properties which will do our work > i.e. > Configuration Values to Set for INSERT, UPDATE, DELETE > hive.support.concurrency true (default is false) > hive.enforce.bucketing true (default is false) > hive.exec.dynamic.partition.mode nonstrict (default is strict) > after that, if I run the show tables command on the hive shell it takes 65.15 > seconds, whereas it normally runs in 0.18 seconds without the above properties. > Apart from show tables, the rest of the commands give no output, i.e. they > keep running unless the process is killed. > Could you tell me the reason for this? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
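For reference, the three settings quoted above would look like this in hive-site.xml (a minimal sketch of just these properties; a full transactional setup may require additional configuration not shown in this thread):

```xml
<!-- Settings quoted in the report above; values per the Hive Transactions wiki -->
<property>
  <name>hive.support.concurrency</name>
  <value>true</value> <!-- default is false -->
</property>
<property>
  <name>hive.enforce.bucketing</name>
  <value>true</value> <!-- default is false -->
</property>
<property>
  <name>hive.exec.dynamic.partition.mode</name>
  <value>nonstrict</value> <!-- default is strict -->
</property>
```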
[jira] [Commented] (HIVE-11825) get_json_object(col,'$.a') is null in where clause didn`t work
[ https://issues.apache.org/jira/browse/HIVE-11825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746983#comment-14746983 ] Feng Yuan commented on HIVE-11825: -- Thanks for your detailed reply. I will try your 2nd way. If you have time, could you please commit your patch for "ALLOW_BACKSLASH_ESCAPING_ANY_CHARACTER"? Thank you! > get_json_object(col,'$.a') is null in where clause didn`t work > -- > > Key: HIVE-11825 > URL: https://issues.apache.org/jira/browse/HIVE-11825 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 0.14.0 >Reporter: Feng Yuan >Priority: Critical > Fix For: 0.14.1 > > > example: > select attr from raw_kafka_item_dt0 where l_date='2015-09-06' and > customer='Czgc_news' and get_json_object(attr,'$.title') is NULL limit 10; > but in the results, title is still not null! > {"title":"思科Q4收入估$79.2亿 > 前景阴云笼罩","ItemType":"NewsBase","keywords":"思科Q4收入估\$79.2亿 > 前景阴云笼罩","random":"1420253511075","callback":"BCore.instances[2].callbacks[1]","user_agent":"Mozilla/5.0 > (iPhone; U; CPU iPhone OS 4_2_1 like Mac OS X; en-us) AppleWebKit/533.17.9 > (KHTML; like Gecko) Version/5.0.2 Mobile/8C148 > Safari/6533.18.5","is_newgid":"false","uuid":"DS.Input:b56c782bcb75035d:2116:003dcd40:54a75947","ptime":"1.1549997E9"} > > attr is a dict -- This message was sent by Atlassian JIRA (v6.3.4#6332)
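The root cause referenced above is that `\$` (in the "keywords" field) is not a legal JSON escape, so a strict parser rejects the record and get_json_object returns NULL. Jackson's `JsonParser.Feature.ALLOW_BACKSLASH_ESCAPING_ANY_CHARACTER` relaxes exactly this rule. A minimal sketch of the spec-level check (the class and method names here are illustrative, not Hive code):

```java
// RFC 8259 permits only a fixed set of characters after a backslash in a
// JSON string. "\$" is not among them, which is why strict parsers reject
// the attr value quoted in this issue.
public class JsonEscapeCheck {
    // The characters that may legally follow '\' in a JSON string literal.
    static final String LEGAL = "\"\\/bfnrtu";

    static boolean isLegalEscape(char c) {
        return LEGAL.indexOf(c) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(isLegalEscape('n'));  // "\n" is valid JSON
        System.out.println(isLegalEscape('$'));  // "\$" is not, so strict parsing fails
    }
}
```

With a lenient parser (e.g. Jackson with the feature above enabled), `\$` would be read as a plain `$` and the title field would parse normally.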
[jira] [Updated] (HIVE-11110) Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, improve Filter selectivity estimation
[ https://issues.apache.org/jira/browse/HIVE-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-11110: -- Attachment: HIVE-11110.13.patch > Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, > improve Filter selectivity estimation > > > Key: HIVE-11110 > URL: https://issues.apache.org/jira/browse/HIVE-11110 > Project: Hive > Issue Type: Bug > Components: CBO >Reporter: Jesus Camacho Rodriguez >Assignee: Laljo John Pullokkaran > Attachments: HIVE-11110-10.patch, HIVE-11110-11.patch, > HIVE-11110-12.patch, HIVE-11110-branch-1.2.patch, HIVE-11110.1.patch, > HIVE-11110.13.patch, HIVE-11110.2.patch, HIVE-11110.4.patch, > HIVE-11110.5.patch, HIVE-11110.6.patch, HIVE-11110.7.patch, > HIVE-11110.8.patch, HIVE-11110.9.patch, HIVE-11110.91.patch, > HIVE-11110.92.patch, HIVE-11110.patch > > > Query > {code} > select count(*) > from store_sales > ,store_returns > ,date_dim d1 > ,date_dim d2 > where d1.d_quarter_name = '2000Q1' >and d1.d_date_sk = ss_sold_date_sk >and ss_customer_sk = sr_customer_sk >and ss_item_sk = sr_item_sk >and ss_ticket_number = sr_ticket_number >and sr_returned_date_sk = d2.d_date_sk >and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3'); > {code} > The store_sales table is partitioned on ss_sold_date_sk, which is also used > in a join clause. The join clause should add a filter "filterExpr: > ss_sold_date_sk is not null", which should get pushed to the MetaStore when > fetching the stats. Currently this is not done in CBO planning, which results > in the stats from __HIVE_DEFAULT_PARTITION__ being fetched and considered in > the optimization phase. In particular, this increases the NDV for the join > columns and may result in wrong planning. > Including HiveJoinAddNotNullRule in the optimization phase solves this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11838) Another positive test case for HIVE-11658
[ https://issues.apache.org/jira/browse/HIVE-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11838: - Attachment: HIVE-11838.patch [~deepesh] fyi.. > Another positive test case for HIVE-11658 > - > > Key: HIVE-11838 > URL: https://issues.apache.org/jira/browse/HIVE-11838 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Deepesh Khandelwal >Assignee: Prasanth Jayachandran > Attachments: HIVE-11838.patch > > > We can add additional positive test coverage for HIVE-11658 covering load > directory to text partition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11838) Another positive test case for HIVE-11658
[ https://issues.apache.org/jira/browse/HIVE-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747004#comment-14747004 ] Prasanth Jayachandran commented on HIVE-11838: -- This patch just adds an additional test. I don't think we need a full precommit run for this patch. > Another positive test case for HIVE-11658 > - > > Key: HIVE-11838 > URL: https://issues.apache.org/jira/browse/HIVE-11838 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Deepesh Khandelwal >Assignee: Prasanth Jayachandran > Attachments: HIVE-11838.patch > > > We can add additional positive test coverage for HIVE-11658 covering load > directory to text partition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-4243) Fix column names in FileSinkOperator
[ https://issues.apache.org/jira/browse/HIVE-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-4243: Affects Version/s: 2.0.0 1.3.0 > Fix column names in FileSinkOperator > > > Key: HIVE-4243 > URL: https://issues.apache.org/jira/browse/HIVE-4243 > Project: Hive > Issue Type: Bug > Components: File Formats >Affects Versions: 1.3.0, 2.0.0 >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Attachments: HIVE-4243.patch, HIVE-4243.patch, HIVE-4243.tmp.patch > > > All of the ObjectInspectors given to SerDe's by FileSinkOperator have virtual > column names. Since the files are part of tables, Hive knows the column > names. For self-describing file formats like ORC, having the real column > names will improve the understandability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-5623) ORC accessing array column that's empty will fail with java out of bound exception
[ https://issues.apache.org/jira/browse/HIVE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-5623: Fix Version/s: (was: 0.13) 1.3.0 > ORC accessing array column that's empty will fail with java out of bound > exception > -- > > Key: HIVE-5623 > URL: https://issues.apache.org/jira/browse/HIVE-5623 > Project: Hive > Issue Type: Bug > Components: File Formats >Affects Versions: 0.11.0 >Reporter: Eric Chu >Assignee: Prasanth Jayachandran >Priority: Critical > Labels: orcfile > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-5623.patch > > > In our ORC tests we saw that queries that work on RCFile failed on the > corresponding ORC version with Java IndexOutOfBoundsException in > OrcStruct.java. The queries failed b/c the table has an array type column and > there are rows with an empty array. We noticed that the getList(Object list, > int i) method in OrcStruct.java simply returns the i-th element from list > without checking if list is not null or if i is within valid range. After > fixing that the queries run fine. The fix is really simple, but maybe there > are other similar cases that need to be handled. > The fix is to check if listObj is null and if i falls within range: > {code} > public Object getListElement(Object listObj, int i) { > if (listObj == null) { > return null; > } > List list = ((List) listObj); > if (i < 0 || i >= list.size()) { > return null; > } > return list.get(i); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11700) exception in logs in Tez test with new logger
[ https://issues.apache.org/jira/browse/HIVE-11700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11700: - Affects Version/s: 2.0.0 > exception in logs in Tez test with new logger > - > > Key: HIVE-11700 > URL: https://issues.apache.org/jira/browse/HIVE-11700 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: Sergey Shelukhin >Assignee: Prasanth Jayachandran > Attachments: HIVE-11700.patch, HIVE-11700.patch > > > {noformat} > 2015-08-31 11:27:47,400 WARN Error while converting string > [${sys:hive.ql.log.PerfLogger.level}] to type [class > org.apache.logging.log4j.Level]. Using default value [null]. > java.lang.IllegalArgumentException: Unknown level constant > [${SYS:HIVE.QL.LOG.PERFLOGGER.LEVEL}]. >at org.apache.logging.log4j.Level.valueOf(Level.java:286) >at > org.apache.logging.log4j.core.config.plugins.convert.TypeConverters$LevelConverter.convert(TypeConverters.java:230) >at > org.apache.logging.log4j.core.config.plugins.convert.TypeConverters$LevelConverter.convert(TypeConverters.java:226) >at > org.apache.logging.log4j.core.config.plugins.convert.TypeConverters.convert(TypeConverters.java:336) >at > org.apache.logging.log4j.core.config.plugins.visitors.AbstractPluginVisitor.convert(AbstractPluginVisitor.java:130) >at > org.apache.logging.log4j.core.config.plugins.visitors.PluginAttributeVisitor.visit(PluginAttributeVisitor.java:45) >at > org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.generateParameters(PluginBuilder.java:247) >at > org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:136) >at > org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:766) >at > org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:706) >at > 
org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:698) >at > org.apache.logging.log4j.core.config.AbstractConfiguration.doConfigure(AbstractConfiguration.java:358) >at > org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:161) >at > org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:361) >at > org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:426) >at > org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:442) >at > org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:138) >at > org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:147) >at > org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:41) >at org.apache.logging.log4j.LogManager.getContext(LogManager.java:175) >at > org.apache.logging.log4j.spi.AbstractLoggerAdapter.getContext(AbstractLoggerAdapter.java:102) >at org.apache.logging.log4j.jcl.LogAdapter.getContext(LogAdapter.java:39) >at > org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:42) >at > org.apache.logging.log4j.jcl.LogFactoryImpl.getInstance(LogFactoryImpl.java:40) >at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:671) >at org.apache.hadoop.hive.ql.QTestUtil.(QTestUtil.java:122) >at > org.apache.hadoop.hive.cli.TestMiniTezCliDriver.(TestMiniTezCliDriver.java:33) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11732) LLAP: MiniLlapCluster integration broke hadoop-1 build
[ https://issues.apache.org/jira/browse/HIVE-11732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11732: - Fix Version/s: llap > LLAP: MiniLlapCluster integration broke hadoop-1 build > -- > > Key: HIVE-11732 > URL: https://issues.apache.org/jira/browse/HIVE-11732 > Project: Hive > Issue Type: Sub-task >Affects Versions: llap >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: llap > > Attachments: HIVE-11732.1.patch, HIVE-11732.2.patch > > > HIVE-9900 broke hadoop-1 build. Needs shimming for MiniLlapCluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11839) Vectorization wrong results with filter of (CAST AS CHAR)
[ https://issues.apache.org/jira/browse/HIVE-11839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11839: Attachment: HIVE-11839.01.patch > Vectorization wrong results with filter of (CAST AS CHAR) > - > > Key: HIVE-11839 > URL: https://issues.apache.org/jira/browse/HIVE-11839 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11839.01.patch > > > PROBLEM: > A query such as > select count(1) from table where CAST (id as CHAR(4))='1000'; > gives the wrong result 0 instead of the expected result. > STEPS TO REPRODUCE: > create table s1(id smallint) stored as orc; > insert into table s1 values (1000),(1001),(1002),(1003),(1000); > set hive.vectorized.execution.enabled=true; > select count(1) from s1 where cast(id as char(4))='1000'; > -- this gives 0 > set hive.vectorized.execution.enabled=false; > select count(1) from s1 where cast(id as char(4))='1000'; > -- this gives 2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
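The non-vectorized path above returns 2 because of standard CHAR(n) semantics: the cast truncates to n characters, pads with spaces for storage, and comparisons ignore the trailing pad. A hedged Java sketch of those semantics, reproducing the expected count from the repro data (class and method names are illustrative, not Hive's actual vectorized expressions):

```java
// Sketch of CHAR(4) cast-and-compare semantics assumed by the bug report:
// truncate to the declared width, space-pad for storage, and ignore the
// trailing pad when comparing.
public class CharCastRepro {
    static String castToChar(String s, int n) {
        String t = s.length() > n ? s.substring(0, n) : s;
        StringBuilder sb = new StringBuilder(t);
        while (sb.length() < n) sb.append(' ');   // pad to fixed width
        return sb.toString();
    }

    static boolean charEquals(String a, String b) {
        // CHAR comparison disregards trailing padding
        return a.replaceAll(" +$", "").equals(b.replaceAll(" +$", ""));
    }

    public static int countMatches(short[] ids, String literal) {
        int count = 0;
        for (short id : ids) {
            if (charEquals(castToChar(Short.toString(id), 4), literal)) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        short[] ids = {1000, 1001, 1002, 1003, 1000};  // the repro rows
        System.out.println(countMatches(ids, "1000")); // 2, as the non-vectorized path returns
    }
}
```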
[jira] [Updated] (HIVE-10819) SearchArgumentImpl for Timestamp is broken by HIVE-10286
[ https://issues.apache.org/jira/browse/HIVE-10819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-10819: - Affects Version/s: 1.3.0 1.2.0 > SearchArgumentImpl for Timestamp is broken by HIVE-10286 > > > Key: HIVE-10819 > URL: https://issues.apache.org/jira/browse/HIVE-10819 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.0, 1.3.0 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 1.2.1 > > Attachments: HIVE-10819.1.patch, HIVE-10819.2.patch, > HIVE-10819.3.patch, HIVE-10819.4.patch > > > The work around for kryo bug for Timestamp is accidentally removed by > HIVE-10286. Need to bring it back. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11822) vectorize NVL UDF
[ https://issues.apache.org/jira/browse/HIVE-11822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747076#comment-14747076 ] Takanobu Asanuma commented on HIVE-11822: - [~gopalv] Thanks for the assignment and kind advice. > vectorize NVL UDF > - > > Key: HIVE-11822 > URL: https://issues.apache.org/jira/browse/HIVE-11822 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Takanobu Asanuma > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11826) 'hadoop.proxyuser.hive.groups' configuration doesn't prevent unauthorized user to access metastore
[ https://issues.apache.org/jira/browse/HIVE-11826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747167#comment-14747167 ] Hive QA commented on HIVE-11826: {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12756028/HIVE-11826.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9445 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5292/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5292/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5292/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12756028 - PreCommit-HIVE-TRUNK-Build > 'hadoop.proxyuser.hive.groups' configuration doesn't prevent unauthorized > user to access metastore > -- > > Key: HIVE-11826 > URL: https://issues.apache.org/jira/browse/HIVE-11826 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-11826.patch > > > With 'hadoop.proxyuser.hive.groups' configured in core-site.xml to certain > groups, currently if you run the job with a user not belonging to those > groups, access to the metastore does not fail as it should. With the old > version Hive 0.13, it fails properly. > It seems HadoopThriftAuthBridge20S.java correctly calls ProxyUsers.authorize() > while HadoopThriftAuthBridge23 doesn't. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
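The configuration in question lives in Hadoop's core-site.xml; a sketch of the relevant entry (the group names here are illustrative values, not taken from this issue):

```xml
<!-- Restrict which groups the "hive" service user may impersonate.
     Requests on behalf of users outside these groups should be rejected
     by ProxyUsers.authorize(). -->
<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <value>hive,hadoop</value>
</property>
```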
[jira] [Updated] (HIVE-9523) For partitioned tables same optimizations should be available as for bucketed tables and vice versa: ①[Sort Merge] PARTITION Map join and ②BUCKET pruning
[ https://issues.apache.org/jira/browse/HIVE-9523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciek Kocon updated HIVE-9523: --- Description: Logically and functionally bucketing and partitioning are quite similar - both provide mechanism to segregate and separate the table's data based on its content. Thanks to that significant further optimisations like [partition] PRUNING or [bucket] MAP JOIN are possible. The difference seems to be imposed by design where the PARTITIONing is open/explicit while BUCKETing is discrete/implicit. Partitioning seems to be very common if not a standard feature in all current RDBMS while BUCKETING seems to be HIVE specific only. In a way BUCKETING could be also called by "hashing" or simply "IMPLICIT PARTITIONING". Regardless of the fact that these two are recognised as two separate features available in Hive there should be nothing to prevent leveraging same existing query/join optimisations across the two. ①[Sort Merge] PARTITION Map join (no progress yet) Enable Bucket Map Join or better, the Sort Merge Bucket Map Join equivalent optimisations when PARTITIONING is used exclusively or in combination with BUCKETING. For JOIN conditions where partitioning criteria are used respectively: ⋮ FROM TabA JOIN TabB ON TabA.partCol1 = TabB.partCol2 AND TabA.partCol2 = TabB.partCol2 the optimizer could/should choose to treat it the same way as with bucketed tables: ⋮ FROM TabC JOIN TabD ON TabC.clusteredByCol1 = TabD.clusteredByCol2 AND TabC.clusteredByCol2 = TabD.clusteredByCol2 and use either Bucket Map Join or better, the Sort Merge Bucket Map Join. The latter would require capability to create sorted partitions first. This is based on fact that same way as buckets translate to separate files, the partitions essentially provide the same mapping. When data locality is known the optimizer could focus only on joining corresponding partitions rather than whole data sets. 
②BUCKET pruning (taken care by [HIVE-11525|https://issues.apache.org/jira/browse/HIVE-11525]) Enable partition PRUNING equivalent optimisation for queries on BUCKETED tables Simplest example is for queries like: "SELECT … FROM x WHERE colA=123123" to read only the relevant bucket file rather than all file-buckets that belong to a table. was: Logically and functionally bucketing and partitioning are quite similar - both provide mechanism to segregate and separate the table's data based on its content. Thanks to that significant further optimisations like [partition] PRUNING or [bucket] MAP JOIN are possible. The difference seems to be imposed by design where the PARTITIONing is open/explicit while BUCKETing is discrete/implicit. Partitioning seems to be very common if not a standard feature in all current RDBMS while BUCKETING seems to be HIVE specific only. In a way BUCKETING could be also called by "hashing" or simply "IMPLICIT PARTITIONING". Regardless of the fact that these two are recognised as two separate features available in Hive there should be nothing to prevent leveraging same existing query/join optimisations across the two. ①[Sort Merge] PARTITION Map join Enable Bucket Map Join or better, the Sort Merge Bucket Map Join equivalent optimisations when PARTITIONING is used exclusively or in combination with BUCKETING. For JOIN conditions where partitioning criteria are used respectively: ⋮ FROM TabA JOIN TabB ON TabA.partCol1 = TabB.partCol2 AND TabA.partCol2 = TabB.partCol2 the optimizer could/should choose to treat it the same way as with bucketed tables: ⋮ FROM TabC JOIN TabD ON TabC.clusteredByCol1 = TabD.clusteredByCol2 AND TabC.clusteredByCol2 = TabD.clusteredByCol2 and use either Bucket Map Join or better, the Sort Merge Bucket Map Join. This is based on fact that same way as buckets translate to separate files, the partitions essentially provide the same mapping. 
When data locality is known the optimizer could focus only on joining corresponding partitions rather than whole data sets. ②BUCKET pruning Enable partition PRUNING equivalent optimisation for queries on BUCKETED tables Simplest example is for queries like: "SELECT … FROM x WHERE colA=123123" to read only the relevant bucket file rather than all file-buckets that belong to a table. > For partitioned tables same optimizations should be available as for bucketed > tables and vice versa: ①[Sort Merge] PARTITION Map join and ②BUCKET pruning > - > > Key: HIVE-9523 > URL: https://issues.apache.org/jira/browse/HIVE-9523 > Project: Hive > Issue Type: Improvement > Components: Logical Optimizer, Physical Optimizer, SQL >Affects Versions: 0.13.0, 0.14.0,
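The bucket-pruning idea in ② can be sketched as follows: if a table is CLUSTERED BY (colA) INTO N BUCKETS, an equality predicate pins the row to exactly one bucket file, so a reader only needs to open that file. This assumes Hive's usual bucket function, `(hash & Integer.MAX_VALUE) % numBuckets`, with an int's hash being the value itself; treat the exact formula as an assumption here:

```java
// Hedged sketch: which bucket file does "WHERE colA = 123123" touch?
public class BucketPruning {
    // Assumed Hive bucketing formula for an int column.
    static int bucketFor(int value, int numBuckets) {
        return (value & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        int numBuckets = 32; // hypothetical table layout
        // Only this one of the 32 bucket files needs to be read:
        System.out.println("bucket_" + bucketFor(123123, numBuckets));
    }
}
```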
[jira] [Commented] (HIVE-11825) get_json_object(col,'$.a') is null in where clause didn`t work
[ https://issues.apache.org/jira/browse/HIVE-11825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747226#comment-14747226 ] Feng Yuan commented on HIVE-11825: -- Thank you so much, it works! > get_json_object(col,'$.a') is null in where clause didn`t work > -- > > Key: HIVE-11825 > URL: https://issues.apache.org/jira/browse/HIVE-11825 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 0.14.0 >Reporter: Feng Yuan >Priority: Critical > Fix For: 0.14.1 > > Attachments: HIVE-11825.patch > > > example: > select attr from raw_kafka_item_dt0 where l_date='2015-09-06' and > customer='Czgc_news' and get_json_object(attr,'$.title') is NULL limit 10; > but in the results, title is still not null! > {"title":"思科Q4收入估$79.2亿 > 前景阴云笼罩","ItemType":"NewsBase","keywords":"思科Q4收入估\$79.2亿 > 前景阴云笼罩","random":"1420253511075","callback":"BCore.instances[2].callbacks[1]","user_agent":"Mozilla/5.0 > (iPhone; U; CPU iPhone OS 4_2_1 like Mac OS X; en-us) AppleWebKit/533.17.9 > (KHTML; like Gecko) Version/5.0.2 Mobile/8C148 > Safari/6533.18.5","is_newgid":"false","uuid":"DS.Input:b56c782bcb75035d:2116:003dcd40:54a75947","ptime":"1.1549997E9"} > > attr is a dict -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10651) ORC file footer cache should be bounded
[ https://issues.apache.org/jira/browse/HIVE-10651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-10651: - Affects Version/s: 2.0.0 > ORC file footer cache should be bounded > --- > > Key: HIVE-10651 > URL: https://issues.apache.org/jira/browse/HIVE-10651 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Mostafa Mokhtar >Assignee: Prasanth Jayachandran >Priority: Minor > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-10651.1.patch > > > ORC's file footer cache is currently unbounded and is a soft reference cache. > The cache size obtained from the config is only used to set the initial > capacity. We should bound the cache to keep it from growing too big and to > get predictable performance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
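One standard way to bound such a cache is an access-ordered LinkedHashMap that evicts the eldest entry once a fixed capacity is exceeded. A minimal sketch of the idea (class and key names are illustrative; Hive's actual implementation may use a different cache, e.g. Guava):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a bounded LRU cache: access-ordered LinkedHashMap plus
// removeEldestEntry() to enforce a hard size limit.
public class FooterCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public FooterCache(int maxEntries) {
        super(16, 0.75f, true);          // true => access-order, i.e. LRU
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;      // evict once the bound is exceeded
    }

    public static void main(String[] args) {
        FooterCache<String, String> cache = new FooterCache<>(2);
        cache.put("a.orc", "footerA");
        cache.put("b.orc", "footerB");
        cache.get("a.orc");              // touch "a.orc" so "b.orc" becomes eldest
        cache.put("c.orc", "footerC");   // evicts "b.orc"
        System.out.println(cache.keySet()); // [a.orc, c.orc]
    }
}
```

Unlike a soft-reference cache, this keeps both the memory footprint and the hit rate predictable regardless of GC pressure.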
[jira] [Commented] (HIVE-11758) Querying nested parquet columns is case sensitive
[ https://issues.apache.org/jira/browse/HIVE-11758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747054#comment-14747054 ] Hive QA commented on HIVE-11758: {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12756020/HIVE-11758.2.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9445 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation org.apache.hive.hcatalog.streaming.TestStreaming.testTimeOutReaper {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5291/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5291/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5291/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12756020 - PreCommit-HIVE-TRUNK-Build > Querying nested parquet columns is case sensitive > - > > Key: HIVE-11758 > URL: https://issues.apache.org/jira/browse/HIVE-11758 > Project: Hive > Issue Type: Bug > Components: File Formats >Affects Versions: 1.1.0, 1.1.1, 1.2.1 >Reporter: Jakub Kukul >Priority: Minor > Attachments: HIVE-11758.2.patch, HIVE-11758.patch > > > Querying nested parquet columns (columns within a {{STRUCT}}) is case > sensitive. It should be case insensitive, to be compatible with querying > non-nested columns and querying nested columns with other file formats. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11825) get_json_object(col,'$.a') is null in where clause didn`t work
[ https://issues.apache.org/jira/browse/HIVE-11825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cazen Lee updated HIVE-11825: - Attachment: HIVE-11825.patch > get_json_object(col,'$.a') is null in where clause didn`t work > -- > > Key: HIVE-11825 > URL: https://issues.apache.org/jira/browse/HIVE-11825 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 0.14.0 >Reporter: Feng Yuan >Priority: Critical > Fix For: 0.14.1 > > Attachments: HIVE-11825.patch > > > example: > select attr from raw_kafka_item_dt0 where l_date='2015-09-06' and > customer='Czgc_news' and get_json_object(attr,'$.title') is NULL limit 10; > but in results,title is still not null! > {"title":"思科Q4收入估$79.2亿 > 前景阴云笼罩","ItemType":"NewsBase","keywords":"思科Q4收入估\$79.2亿 > 前景阴云笼罩","random":"1420253511075","callback":"BCore.instances[2].callbacks[1]","user_agent":"Mozilla/5.0 > (iPhone; U; CPU iPhone OS 4_2_1 like Mac OS X; en-us) AppleWebKit/533.17.9 > (KHTML; like Gecko) Version/5.0.2 Mobile/8C148 > Safari/6533.18.5","is_newgid":"false","uuid":"DS.Input:b56c782bcb75035d:2116:003dcd40:54a75947","ptime":"1.1549997E9"} > > attr is a dict -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11198) Fix load data query file format check for partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-11198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11198: - Affects Version/s: 1.3.0 > Fix load data query file format check for partitioned tables > > > Key: HIVE-11198 > URL: https://issues.apache.org/jira/browse/HIVE-11198 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11198.patch > > > HIVE-11118 added a file format check for the ORC format. The check will throw an > exception when a non-ORC format is loaded into an ORC managed table. But it does > not work for partitioned tables. Partitioned tables are allowed to have some > partitions with a different file format. See this discussion for more details > https://issues.apache.org/jira/browse/HIVE-11118?focusedCommentId=14617271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14617271 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11705) refactor SARG stripe filtering for ORC into a separate method
[ https://issues.apache.org/jira/browse/HIVE-11705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11705: - Affects Version/s: 2.0.0 > refactor SARG stripe filtering for ORC into a separate method > - > > Key: HIVE-11705 > URL: https://issues.apache.org/jira/browse/HIVE-11705 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 2.0.0 > > Attachments: HIVE-11705.01.patch, HIVE-11705.02.patch, > HIVE-11705.03.patch, HIVE-11705.patch > > > For footer cache PPD to metastore, we'd need a method to do the PPD. Tiny > item to create it on OrcInputFormat. > For metastore path, these methods will be called from expression proxy > similar to current objectstore expr filtering; it will change to have > serialized sarg and column list to come from request instead of conf; > includedCols/etc. will also come from request instead of assorted java > objects. > -The types and stripe stats will need to be extracted from HBase. This is a > little bit of a problem, since ideally we want to be inside HBase > filter/coprocessor/ I'd need to take a look to see if this is possible... > since that filter would need to either deserialize orc, or we would need to > store types and stats information in some other, non-ORC manner on write. The > latter is probably a better idea, although it's dangerous because there's no > sync between this code and ORC itself.- > Meanwhile minimize dependencies for stripe picking to essentials (and conf > which is easy to remove). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11118) Load data query should validate file formats with destination tables
[ https://issues.apache.org/jira/browse/HIVE-11118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11118: - Affects Version/s: 1.3.0 > Load data query should validate file formats with destination tables > > > Key: HIVE-11118 > URL: https://issues.apache.org/jira/browse/HIVE-11118 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11118.2.patch, HIVE-11118.3.patch, > HIVE-11118.4.patch, HIVE-11118.patch > > > Load data local inpath queries do not do any validation of the file format. If > the destination table is ORC and we try to load files that are not ORC, > the load will succeed but querying such tables will result in runtime > exceptions. We can do some simple sanity checks to prevent loading files > that do not match the destination table's file format. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
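A minimal sketch of the failure mode described above (table name and input path are illustrative):

```sql
-- Without a format check, the load succeeds but reads fail later.
CREATE TABLE orc_t (key INT, value STRING) STORED AS ORC;

-- /tmp/plain_text.txt is an ordinary text file, not an ORC file,
-- yet the load goes through without any validation.
LOAD DATA LOCAL INPATH '/tmp/plain_text.txt' INTO TABLE orc_t;

-- Fails at read time with a runtime exception, because the reader expects
-- ORC data; a sanity check at load time would catch this much earlier.
SELECT * FROM orc_t;
```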
[jira] [Updated] (HIVE-11700) exception in logs in Tez test with new logger
[ https://issues.apache.org/jira/browse/HIVE-11700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11700: - Fix Version/s: 2.0.0 > exception in logs in Tez test with new logger > - > > Key: HIVE-11700 > URL: https://issues.apache.org/jira/browse/HIVE-11700 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: Sergey Shelukhin >Assignee: Prasanth Jayachandran > Fix For: 2.0.0 > > Attachments: HIVE-11700.patch, HIVE-11700.patch > > > {noformat} > 2015-08-31 11:27:47,400 WARN Error while converting string > [${sys:hive.ql.log.PerfLogger.level}] to type [class > org.apache.logging.log4j.Level]. Using default value [null]. > java.lang.IllegalArgumentException: Unknown level constant > [${SYS:HIVE.QL.LOG.PERFLOGGER.LEVEL}]. >at org.apache.logging.log4j.Level.valueOf(Level.java:286) >at > org.apache.logging.log4j.core.config.plugins.convert.TypeConverters$LevelConverter.convert(TypeConverters.java:230) >at > org.apache.logging.log4j.core.config.plugins.convert.TypeConverters$LevelConverter.convert(TypeConverters.java:226) >at > org.apache.logging.log4j.core.config.plugins.convert.TypeConverters.convert(TypeConverters.java:336) >at > org.apache.logging.log4j.core.config.plugins.visitors.AbstractPluginVisitor.convert(AbstractPluginVisitor.java:130) >at > org.apache.logging.log4j.core.config.plugins.visitors.PluginAttributeVisitor.visit(PluginAttributeVisitor.java:45) >at > org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.generateParameters(PluginBuilder.java:247) >at > org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:136) >at > org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:766) >at > org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:706) >at > 
org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:698) >at > org.apache.logging.log4j.core.config.AbstractConfiguration.doConfigure(AbstractConfiguration.java:358) >at > org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:161) >at > org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:361) >at > org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:426) >at > org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:442) >at > org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:138) >at > org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:147) >at > org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:41) >at org.apache.logging.log4j.LogManager.getContext(LogManager.java:175) >at > org.apache.logging.log4j.spi.AbstractLoggerAdapter.getContext(AbstractLoggerAdapter.java:102) >at org.apache.logging.log4j.jcl.LogAdapter.getContext(LogAdapter.java:39) >at > org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:42) >at > org.apache.logging.log4j.jcl.LogFactoryImpl.getInstance(LogFactoryImpl.java:40) >at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:671) >at org.apache.hadoop.hive.ql.QTestUtil.(QTestUtil.java:122) >at > org.apache.hadoop.hive.cli.TestMiniTezCliDriver.(TestMiniTezCliDriver.java:33) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11840) when multi insert the inputformat becomes OneNullRowInputFormat
[ https://issues.apache.org/jira/browse/HIVE-11840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feng Yuan updated HIVE-11840: - Attachment: single__insert multi insert > when multi insert the inputformat becomes OneNullRowInputFormat > --- > > Key: HIVE-11840 > URL: https://issues.apache.org/jira/browse/HIVE-11840 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 0.14.0 >Reporter: Feng Yuan >Priority: Critical > Fix For: 0.14.1 > > Attachments: multi insert, single__insert > > > example: > from portrait.rec_feature_feedback a > insert overwrite table portrait.test1 select iid, feedback_15day, > feedback_7day, feedback_5day, feedback_3day, feedback_1day where l_date = > '2015-09-09' and bid in ('949722CF_12F7_523A_EE21_E3D591B7E755') > insert overwrite table portrait.test2 select iid, feedback_15day, > feedback_7day, feedback_5day, feedback_3day, feedback_1day where l_date = > '2015-09-09' and bid in ('test') > insert overwrite table portrait.test3 select iid, feedback_15day, > feedback_7day, feedback_5day, feedback_3day, feedback_1day where l_date = > '2015-09-09' and bid in ('F7734668_CC49_8C4F_24C5_EA8B6728E394') > A single insert works, but after the multi insert, select * from test1 returns: > NULL NULL NULL NULL NULL NULL. 
> i see "explain extended" > Path -> Alias: > -mr-10006portrait.rec_feature_feedback{l_date=2015-09-09, > cid=Cyiyaowang, bid=F7734668_CC49_8C4F_24C5_EA8B6728E394} [a] > -mr-10007portrait.rec_feature_feedback{l_date=2015-09-09, > cid=Czgc_pc, bid=949722CF_12F7_523A_EE21_E3D591B7E755} [a] > Path -> Partition: > -mr-10006portrait.rec_feature_feedback{l_date=2015-09-09, > cid=Cyiyaowang, bid=F7734668_CC49_8C4F_24C5_EA8B6728E394} > Partition > base file name: bid=F7734668_CC49_8C4F_24C5_EA8B6728E394 > input format: org.apache.hadoop.hive.ql.io.OneNullRowInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > partition values: > bid F7734668_CC49_8C4F_24C5_EA8B6728E394 > cid Cyiyaowang > l_date 2015-09-09 > but when single insert: > Path -> Alias: > > hdfs://bfdhadoopcool/warehouse/portrait.db/rec_feature_feedback/l_date=2015-09-09/cid=Czgc_pc/bid=949722CF_12F7_523A_EE21_E3D591B7E755 > [a] > Path -> Partition: > > hdfs://bfdhadoopcool/warehouse/portrait.db/rec_feature_feedback/l_date=2015-09-09/cid=Czgc_pc/bid=949722CF_12F7_523A_EE21_E3D591B7E755 > > Partition > base file name: bid=949722CF_12F7_523A_EE21_E3D591B7E755 > input format: org.apache.hadoop.mapred.TextInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > partition values: > bid 949722CF_12F7_523A_EE21_E3D591B7E755 > cid Czgc_pc > l_date 2015-09-09 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
[ https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14747278#comment-14747278 ] Hive QA commented on HIVE-11634: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12756043/HIVE-11634.94.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9446 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5293/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5293/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5293/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12756043 - PreCommit-HIVE-TRUNK-Build > Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...) 
> -- > > Key: HIVE-11634 > URL: https://issues.apache.org/jira/browse/HIVE-11634 > Project: Hive > Issue Type: Bug > Components: CBO >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-11634.1.patch, HIVE-11634.2.patch, > HIVE-11634.3.patch, HIVE-11634.4.patch, HIVE-11634.5.patch, > HIVE-11634.6.patch, HIVE-11634.7.patch, HIVE-11634.8.patch, > HIVE-11634.9.patch, HIVE-11634.91.patch, HIVE-11634.92.patch, > HIVE-11634.93.patch, HIVE-11634.94.patch > > > Currently, we do not support partition pruning for the following scenario > {code} > create table pcr_t1 (key int, value string) partitioned by (ds string); > insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src > where key < 20 order by key; > insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src > where key < 20 order by key; > insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src > where key < 20 order by key; > explain extended select ds from pcr_t1 where struct(ds, key) in > (struct('2000-04-08',1), struct('2000-04-09',2)); > {code} > If we run the above query, we see that all the partitions of table pcr_t1 are > present in the filter predicate, whereas partition > (ds='2000-04-10') could be pruned. > The optimization is to rewrite the above query into the following. > {code} > explain extended select ds from pcr_t1 where (struct(ds)) IN > (struct('2000-04-08'), struct('2000-04-09')) and struct(ds, key) in > (struct('2000-04-08',1), struct('2000-04-09',2)); > {code} > The predicate (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09')) > is used by the partition pruner to prune partitions that would otherwise not be > pruned. > This is an extension of the idea presented in HIVE-11573. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11786) Deprecate the use of redundant columns in column stats related tables
[ https://issues.apache.org/jira/browse/HIVE-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14768865#comment-14768865 ] Chaoyu Tang commented on HIVE-11786: Patch has been uploaded to https://reviews.apache.org/r/38429/. [~sershe], [~xuefuz], [~ashutoshc], could you help review it? Thanks. > Deprecate the use of redundant columns in column stats related tables > > > Key: HIVE-11786 > URL: https://issues.apache.org/jira/browse/HIVE-11786 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-11786.patch > > > The stats tables such as TAB_COL_STATS and PART_COL_STATS have redundant columns > such as DB_NAME, TABLE_NAME, and PARTITION_NAME, since these tables already have > foreign keys such as TBL_ID or PART_ID referencing TBLS or PARTITIONS. > These redundant columns violate database normalization rules and cause a lot > of inconvenience (and are sometimes difficult to handle) in column stats related feature > implementation. For example, when renaming a table, we have to update the > TABLE_NAME column in these tables as well, which is unnecessary. > This JIRA is first to deprecate the use of these columns at the HMS code level. A > follow-up JIRA will be opened to focus on the DB schema change and upgrade. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11825) get_json_object(col,'$.a') is null in where clause didn`t work
[ https://issues.apache.org/jira/browse/HIVE-11825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14747371#comment-14747371 ] Cazen Lee commented on HIVE-11825: -- Good :) Have a good day! > get_json_object(col,'$.a') is null in where clause didn`t work > -- > > Key: HIVE-11825 > URL: https://issues.apache.org/jira/browse/HIVE-11825 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 0.14.0 >Reporter: Feng Yuan >Priority: Critical > Fix For: 0.14.1 > > Attachments: HIVE-11825.patch > > > example: > select attr from raw_kafka_item_dt0 where l_date='2015-09-06' and > customer='Czgc_news' and get_json_object(attr,'$.title') is NULL limit 10; > but in results,title is still not null! > {"title":"思科Q4收入估$79.2亿 > 前景阴云笼罩","ItemType":"NewsBase","keywords":"思科Q4收入估\$79.2亿 > 前景阴云笼罩","random":"1420253511075","callback":"BCore.instances[2].callbacks[1]","user_agent":"Mozilla/5.0 > (iPhone; U; CPU iPhone OS 4_2_1 like Mac OS X; en-us) AppleWebKit/533.17.9 > (KHTML; like Gecko) Version/5.0.2 Mobile/8C148 > Safari/6533.18.5","is_newgid":"false","uuid":"DS.Input:b56c782bcb75035d:2116:003dcd40:54a75947","ptime":"1.1549997E9"} > > attr is a dict -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-4243) Fix column names in FileSinkOperator
[ https://issues.apache.org/jira/browse/HIVE-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14768893#comment-14768893 ] Hive QA commented on HIVE-4243: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12756087/HIVE-4243.patch {color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 9443 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.ql.io.orc.TestColumnStatistics.testHasNull org.apache.hadoop.hive.ql.io.orc.TestFileDump.testBloomFilter org.apache.hadoop.hive.ql.io.orc.TestFileDump.testBloomFilter2 org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDictionaryThreshold org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDump org.apache.hadoop.hive.ql.io.orc.TestJsonFileDump.testJsonDump org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderNewBaseAndDelta org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5294/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5294/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5294/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 10 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12756087 - PreCommit-HIVE-TRUNK-Build > Fix column names in FileSinkOperator > > > Key: HIVE-4243 > URL: https://issues.apache.org/jira/browse/HIVE-4243 > Project: Hive > Issue Type: Bug > Components: File Formats >Affects Versions: 1.3.0, 2.0.0 >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Attachments: HIVE-4243.patch, HIVE-4243.patch, HIVE-4243.tmp.patch > > > All of the ObjectInspectors given to SerDe's by FileSinkOperator have virtual > column names. Since the files are part of tables, Hive knows the column > names. For self-describing file formats like ORC, having the real column > names will improve the understandability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11512) Hive LDAP Authenticator should also support full DN in Authenticate()
[ https://issues.apache.org/jira/browse/HIVE-11512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-11512: - Attachment: HIVE-11512.patch > Hive LDAP Authenticator should also support full DN in Authenticate() > -- > > Key: HIVE-11512 > URL: https://issues.apache.org/jira/browse/HIVE-11512 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam >Priority: Minor > Attachments: HIVE-11512.patch > > > In certain LDAP implementations, LDAP binding can occur using the full DN for > the user. Currently, the LDAP authentication provider assumes that the username > passed into Authenticate() is a short username and not a full DN. While the > initial bind works fine either way, the filter code relies on it being a > short name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11831) TXN tables in Oracle should be created with ROWDEPENDENCIES
[ https://issues.apache.org/jira/browse/HIVE-11831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14769034#comment-14769034 ] Hive QA commented on HIVE-11831: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12756135/HIVE-11831.01.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9445 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection org.apache.hive.hcatalog.streaming.TestStreaming.testTimeOutReaper org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5295/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5295/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5295/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12756135 - PreCommit-HIVE-TRUNK-Build > TXN tables in Oracle should be created with ROWDEPENDENCIES > --- > > Key: HIVE-11831 > URL: https://issues.apache.org/jira/browse/HIVE-11831 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11831.01.patch, HIVE-11831.patch > > > These frequently-updated tables may otherwise suffer from spurious deadlocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
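For illustration, the proposed change amounts to appending Oracle's ROWDEPENDENCIES clause to the transaction tables' DDL. The column list below is an abbreviated sketch, not the actual metastore schema:

```sql
-- ROWDEPENDENCIES tracks a commit SCN per row instead of per block, so
-- concurrent updates to different rows that happen to share a block are
-- less likely to produce spurious deadlocks.
CREATE TABLE TXNS (
  TXN_ID NUMBER(19) PRIMARY KEY,
  TXN_STATE CHAR(1) NOT NULL,
  TXN_STARTED NUMBER(19) NOT NULL
) ROWDEPENDENCIES;
```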
[jira] [Assigned] (HIVE-11841) KeyValuesInputMerger creates huge logs
[ https://issues.apache.org/jira/browse/HIVE-11841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan reassigned HIVE-11841: --- Assignee: Rajesh Balamohan > KeyValuesInputMerger creates huge logs > -- > > Key: HIVE-11841 > URL: https://issues.apache.org/jira/browse/HIVE-11841 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: HIVE-11841.1.patch > > > https://github.com/apache/hive/blob/ac755ebe26361a4647d53db2a28500f71697b276/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValuesInputMerger.java#L107 > When running tpc-ds q75 at relatively large scale, it ends up generating huge > logs due to this. > {noformat} > Log Type: syslog_attempt_1439860407967_1249_1_30_00_0 > Log Upload Time: Wed Sep 16 12:49:09 + 2015 > Log Length: 3992760053 > Showing 4096 bytes of 3992760053 total. Click here for the full log. > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11841) KeyValuesInputMerger creates huge logs
[ https://issues.apache.org/jira/browse/HIVE-11841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HIVE-11841: Attachment: HIVE-11841.1.patch [~vikram.dixit], [~gopalv] - Please review when you find time. > KeyValuesInputMerger creates huge logs > -- > > Key: HIVE-11841 > URL: https://issues.apache.org/jira/browse/HIVE-11841 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan > Attachments: HIVE-11841.1.patch > > > https://github.com/apache/hive/blob/ac755ebe26361a4647d53db2a28500f71697b276/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValuesInputMerger.java#L107 > When running tpc-ds q75 at relatively large scale, it ends up generating huge > logs due to this. > {noformat} > Log Type: syslog_attempt_1439860407967_1249_1_30_00_0 > Log Upload Time: Wed Sep 16 12:49:09 + 2015 > Log Length: 3992760053 > Showing 4096 bytes of 3992760053 total. Click here for the full log. > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11842) Improve RuleRegExp by caching some internal data structures
[ https://issues.apache.org/jira/browse/HIVE-11842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11842: --- Attachment: HIVE-11842.patch > Improve RuleRegExp by caching some internal data structures > --- > > Key: HIVE-11842 > URL: https://issues.apache.org/jira/browse/HIVE-11842 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Fix For: 2.0.0 > > Attachments: HIVE-11842.patch > > > Continuing work started in HIVE-11141. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11217) CTAS statements throws error, when the table is stored as ORC File format and select clause has NULL/VOID type column
[ https://issues.apache.org/jira/browse/HIVE-11217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790570#comment-14790570 ] Yongzhi Chen commented on HIVE-11217: - [~prasanth_j], I can not find a proper place in TypeCheckProcFactory to put the code. For it is related to TypeInfo, could I put it into hive/serde2/typeinfo/TypeInfoUtils ? Attached the third patch to use TypeInfoUtils. Please review to see if it makes sense. Thanks > CTAS statements throws error, when the table is stored as ORC File format and > select clause has NULL/VOID type column > -- > > Key: HIVE-11217 > URL: https://issues.apache.org/jira/browse/HIVE-11217 > Project: Hive > Issue Type: Bug > Components: File Formats >Affects Versions: 0.13.1 >Reporter: Gaurav Kohli >Assignee: Yongzhi Chen >Priority: Minor > Attachments: HIVE-11217.1.patch, HIVE-11217.2.patch, > HIVE-11217.3.patch > > > If you try to use create-table-as-select (CTAS) statement and create a ORC > File format based table, then you can't use NULL as a column value in select > clause > CREATE TABLE empty (x int); > CREATE TABLE orc_table_with_null > STORED AS ORC > AS > SELECT > x, > null > FROM empty; > Error: > {quote} > 347084 [main] ERROR hive.ql.exec.DDLTask - > org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.IllegalArgumentException: Unknown primitive type VOID > at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:643) > at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4242) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:285) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962) > at 
org.apache.hadoop.hive.ql.Driver.run(Driver.java:952) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367) > at > org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464) > at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633) > at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:323) > at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:284) > at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39) > at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:66) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:227) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) > Caused by: java.lang.IllegalArgumentException: Unknown primitive type VOID > at > 
org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:530) > at > org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.(OrcStruct.java:195) > at > org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:534) > at > org.apache.hadoop.hive.ql.io.orc.OrcSerde.initialize(OrcSerde.java:106) > at > org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:519) > at >
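A commonly suggested workaround for the VOID-column CTAS failure above (illustrative only; this is not what the attached patch does) is to cast the NULL to a concrete type so ORC never sees a VOID column:

```sql
CREATE TABLE empty (x INT);

-- Casting NULL to a concrete type keeps VOID out of the ORC schema,
-- so OrcStruct.createObjectInspector no longer throws.
CREATE TABLE orc_table_with_null
STORED AS ORC AS
SELECT x, CAST(NULL AS INT) AS y
FROM empty;
```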
[jira] [Updated] (HIVE-11217) CTAS statements throws error, when the table is stored as ORC File format and select clause has NULL/VOID type column
[ https://issues.apache.org/jira/browse/HIVE-11217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-11217: Attachment: (was: HIVE-11217.3.patch) > CTAS statements throws error, when the table is stored as ORC File format and > select clause has NULL/VOID type column > -- > > Key: HIVE-11217 > URL: https://issues.apache.org/jira/browse/HIVE-11217 > Project: Hive > Issue Type: Bug > Components: File Formats >Affects Versions: 0.13.1 >Reporter: Gaurav Kohli >Assignee: Yongzhi Chen >Priority: Minor > Attachments: HIVE-11217.1.patch, HIVE-11217.2.patch, > HIVE-11217.3.patch > > > If you try to use create-table-as-select (CTAS) statement and create a ORC > File format based table, then you can't use NULL as a column value in select > clause > CREATE TABLE empty (x int); > CREATE TABLE orc_table_with_null > STORED AS ORC > AS > SELECT > x, > null > FROM empty; > Error: > {quote} > 347084 [main] ERROR hive.ql.exec.DDLTask - > org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.IllegalArgumentException: Unknown primitive type VOID > at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:643) > at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4242) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:285) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:952) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221) > at 
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367) > at > org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464) > at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633) > at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:323) > at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:284) > at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39) > at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:66) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:227) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) > Caused by: java.lang.IllegalArgumentException: Unknown primitive type VOID > at > org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:530) > at > org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.(OrcStruct.java:195) > at > 
org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:534) > at > org.apache.hadoop.hive.ql.io.orc.OrcSerde.initialize(OrcSerde.java:106) > at > org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:519) > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:345) > at > org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:292) > at > org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:194) >
[jira] [Commented] (HIVE-11843) Add 'sort by c' to Parquet PPD q-tests to avoid different output issues with hadoop-1
[ https://issues.apache.org/jira/browse/HIVE-11843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14790593#comment-14790593 ] Sergio Peña commented on HIVE-11843: [~Ferd] Could you help me review this patch? It is easy; I just added the 'sort by c' to some queries that display mixed values. > Add 'sort by c' to Parquet PPD q-tests to avoid different output issues with > hadoop-1 > - > > Key: HIVE-11843 > URL: https://issues.apache.org/jira/browse/HIVE-11843 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Sergio Peña >Assignee: Sergio Peña > Attachments: HIVE-11843.1.patch > > > Parquet PPD tests have a different output when run against hadoop-1 because > mixed values appear in a different order. > To fix this, we should just add 'sort by c' to the queries that display > mixed values. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
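The fix can be illustrated with a hypothetical q-test query (the table and column names are made up; the real changes are in the Parquet PPD .q files):

```sql
-- Without a total order, rows surviving the PPD filter may be emitted in a
-- Hadoop-version-dependent order; SORT BY pins the output in the .q.out file.
SELECT c FROM parquet_ppd_t WHERE c > 10 SORT BY c;
```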
[jira] [Updated] (HIVE-11217) CTAS statements throws error, when the table is stored as ORC File format and select clause has NULL/VOID type column
[ https://issues.apache.org/jira/browse/HIVE-11217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-11217: Attachment: HIVE-11217.3.patch > CTAS statements throws error, when the table is stored as ORC File format and > select clause has NULL/VOID type column > -- > > Key: HIVE-11217 > URL: https://issues.apache.org/jira/browse/HIVE-11217 > Project: Hive > Issue Type: Bug > Components: File Formats >Affects Versions: 0.13.1 >Reporter: Gaurav Kohli >Assignee: Yongzhi Chen >Priority: Minor > Attachments: HIVE-11217.1.patch, HIVE-11217.2.patch, > HIVE-11217.3.patch > > > If you try to use create-table-as-select (CTAS) statement and create a ORC > File format based table, then you can't use NULL as a column value in select > clause > CREATE TABLE empty (x int); > CREATE TABLE orc_table_with_null > STORED AS ORC > AS > SELECT > x, > null > FROM empty; > Error: > {quote} > 347084 [main] ERROR hive.ql.exec.DDLTask - > org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.IllegalArgumentException: Unknown primitive type VOID > at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:643) > at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4242) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:285) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:952) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431) > 
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367) > at > org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464) > at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633) > at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:323) > at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:284) > at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39) > at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:66) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:227) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) > Caused by: java.lang.IllegalArgumentException: Unknown primitive type VOID > at > org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:530) > at > org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.(OrcStruct.java:195) > at > org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:534) > at > 
org.apache.hadoop.hive.ql.io.orc.OrcSerde.initialize(OrcSerde.java:106) > at > org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:519) > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:345) > at > org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:292) > at > org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:194) > at
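The stack trace above bottoms out in ORC's type mapping. As an illustrative sketch (plain Python, not Hive's actual OrcStruct/OrcSerde code; the type set and function name are invented), the failure is a type-lookup miss: a bare NULL in the select list types the new column as VOID, which the ORC writer has no mapping for, while casting the NULL (e.g. `CAST(null AS INT)`) gives it a storable type.

```python
# Illustrative sketch only -- not Hive code. ORC_PRIMITIVES and
# orc_type_for are invented names for illustration.
ORC_PRIMITIVES = {"boolean", "int", "bigint", "float", "double", "string"}

def orc_type_for(select_expr_type: str) -> str:
    """Map a CTAS select-list column type to a storable ORC type."""
    if select_expr_type == "void":
        # This is the condition behind "Unknown primitive type VOID":
        # a bare NULL column has no concrete primitive type.
        raise ValueError("Unknown primitive type VOID")
    if select_expr_type not in ORC_PRIMITIVES:
        raise ValueError(f"unsupported type: {select_expr_type}")
    return select_expr_type
```

Under this model, `select x, null from empty` fails at table creation, while `select x, cast(null as int) from empty` would pass the check because the column arrives typed as `int`.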
[jira] [Commented] (HIVE-11791) Add unit test for HIVE-10122
[ https://issues.apache.org/jira/browse/HIVE-11791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790582#comment-14790582 ] Illya Yalovyy commented on HIVE-11791: -- [~hagleitn], actually in the test case compactExpr(or(isNull(col1), false)) it returns an invalid result: or(isNull(col1)). OR is a binary operator. I'm looking into fixing it. Your suggestions would be appreciated. > Add unit test for HIVE-10122 > > > Key: HIVE-11791 > URL: https://issues.apache.org/jira/browse/HIVE-11791 > Project: Hive > Issue Type: Test > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Illya Yalovyy >Priority: Minor > Attachments: HIVE-11791.patch > > > Unit tests for PartitionPruner.compactExpr() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11841) KeyValuesInputMerger creates huge logs
[ https://issues.apache.org/jira/browse/HIVE-11841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790564#comment-14790564 ] Gopal V commented on HIVE-11841: Reported by a user as well. Can we get a list of versions affected by this & file backports? LGTM - +1 > KeyValuesInputMerger creates huge logs > -- > > Key: HIVE-11841 > URL: https://issues.apache.org/jira/browse/HIVE-11841 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: HIVE-11841.1.patch > > > https://github.com/apache/hive/blob/ac755ebe26361a4647d53db2a28500f71697b276/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValuesInputMerger.java#L107 > When running tpc-ds q75 at relatively large scale, it ends up generating huge > logs due to this. > {noformat} > Log Type: syslog_attempt_1439860407967_1249_1_30_00_0 > Log Upload Time: Wed Sep 16 12:49:09 + 2015 > Log Length: 3992760053 > Showing 4096 bytes of 3992760053 total. Click here for the full log. > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
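The linked line logs inside the merger's per-record comparison loop, which at TPC-DS scale multiplies into the multi-gigabyte syslog shown above. A hedged sketch of the general fix pattern (illustrative Python, not the actual HIVE-11841 patch): guard hot-loop log statements by level so the message is neither formatted nor written unless debug logging is enabled.

```python
import logging

log = logging.getLogger("KeyValuesInputMerger")

def merge_step(inputs):
    # Per-record logging in a hot loop is the failure mode. Guarding with
    # isEnabledFor (the Java analogue is isDebugEnabled) means the debug
    # line costs almost nothing when the level is INFO or higher.
    for item in inputs:
        if log.isEnabledFor(logging.DEBUG):
            log.debug("comparing input %s", item)
        yield item
```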
[jira] [Commented] (HIVE-11398) Parse wide OR and wide AND trees to flat OR/AND trees
[ https://issues.apache.org/jira/browse/HIVE-11398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790585#comment-14790585 ] Illya Yalovyy commented on HIVE-11398: -- [~gopalv], I have added unit tests for one affected method. Could you please review the results after this change? Some of them look suspicious. More details: HIVE-11791. > Parse wide OR and wide AND trees to flat OR/AND trees > - > > Key: HIVE-11398 > URL: https://issues.apache.org/jira/browse/HIVE-11398 > Project: Hive > Issue Type: New Feature > Components: Logical Optimizer, UDF >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11398.2.patch, HIVE-11398.3.patch, > HIVE-11398.4.patch, HIVE-11398.5.patch, HIVE-11398.patch > > > Deep trees of AND/OR are hard to traverse particularly when they are merely > the same structure in nested form as a version of the operator that takes an > arbitrary number of args. > One potential way to convert the DFS searches into a simpler BFS search is to > introduce a new Operator pair named ALL and ANY. > ALL(A, B, C, D, E) represents AND(AND(AND(AND(E, D), C), B), A) > ANY(A, B, C, D, E) represents OR(OR(OR(OR(E, D), C),B),A) > The SemanticAnalyser would be responsible for generating these operators and > this would mean that the depth and complexity of traversals for the simplest > case of wide AND/OR trees would be trivial. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
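The flattening described in HIVE-11398 can be sketched in a few lines (a minimal Python model of the idea; the tuple node shape is invented, not Hive's AST): a deep binary tree like AND(AND(AND(AND(E, D), C), B), A) collapses into a single flat ALL node, so traversal depth drops from the operand count to one.

```python
def flatten(op, node):
    """Collapse nested binary `op` nodes into one flat operand list.

    Nodes are modeled as tuples ("and", left, right); anything else is
    treated as a leaf. Operand order mirrors the nesting depth.
    """
    if isinstance(node, tuple) and node and node[0] == op:
        operands = []
        for child in node[1:]:
            operands.extend(flatten(op, child))
        return operands
    return [node]

# AND(AND(AND(AND(E, D), C), B), A) from the description:
deep = ("and", ("and", ("and", ("and", "E", "D"), "C"), "B"), "A")
flat = ("all",) + tuple(flatten("and", deep))
# One flat ALL node over all five operands, depth 1 instead of 4.
```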
[jira] [Updated] (HIVE-11843) Add 'sort by c' to Parquet PPD q-tests to avoid different output issues with hadoop-1
[ https://issues.apache.org/jira/browse/HIVE-11843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-11843: --- Attachment: HIVE-11843.1.patch > Add 'sort by c' to Parquet PPD q-tests to avoid different output issues with > hadoop-1 > - > > Key: HIVE-11843 > URL: https://issues.apache.org/jira/browse/HIVE-11843 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Sergio Peña >Assignee: Sergio Peña > Attachments: HIVE-11843.1.patch > > > Parquet PPD tests have different output when run against hadoop-1 because > mixed values are in a different order. > To fix this, we should just add 'sort by c' in the queries that will display > mixed values. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11512) Hive LDAP Authenticator should also support full DN in Authenticate()
[ https://issues.apache.org/jira/browse/HIVE-11512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-11512: - Attachment: (was: HIVE-11512.patch) > Hive LDAP Authenticator should also support full DN in Authenticate() > -- > > Key: HIVE-11512 > URL: https://issues.apache.org/jira/browse/HIVE-11512 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam >Priority: Minor > > In certain LDAP implementations, LDAP binding can occur using the full DN for > the user. Currently, the LDAP Authentication Provider assumes that the username > passed into Authenticate() is a short username & not a full DN. While the > initial bind works fine either way, the filter code is reliant on it being a > short name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11791) Add unit test for HIVE-10122
[ https://issues.apache.org/jira/browse/HIVE-11791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790699#comment-14790699 ] Gopal V commented on HIVE-11791: [~yalovyyi]: Good catch, that looks a bit odd - it should be returning just the isNull(col1). The missing case is inside {code} if (allFalse) { return new ExprNodeConstantDesc(Boolean.FALSE); } // Nothing to compact, update expr with compacted children. ((ExprNodeGenericFuncDesc) expr).setChildren(newChildren); {code} Also FYI, you can annotate the compactExpr method with an @VisibleForTesting annotation, so that use from a non-test will trigger a warning during findbugs (which I'll re-add today). > Add unit test for HIVE-10122 > > > Key: HIVE-11791 > URL: https://issues.apache.org/jira/browse/HIVE-11791 > Project: Hive > Issue Type: Test > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Illya Yalovyy >Assignee: Illya Yalovyy >Priority: Minor > Attachments: HIVE-11791.patch > > > Unit tests for PartitionPruner.compactExpr() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
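The behavior under discussion can be modeled compactly (a hedged Python sketch, not PartitionPruner.compactExpr itself; expressions are plain tuples): after dropping constant-false operands from an OR, a single surviving child must be returned directly, never wrapped in a unary or(...).

```python
def compact_or(children):
    """Compact an OR's children, as in compactExpr(or(isNull(col1), false))."""
    # Drop constant-false operands.
    kept = [c for c in children if c is not False]
    if not kept:
        return False          # or(false, false) -> false
    if len(kept) == 1:
        return kept[0]        # or(x, false) -> x, never a unary or(x)
    return ("or",) + tuple(kept)

# or(isNull(col1), false) compacts to the bare isNull(col1):
assert compact_or([("isNull", "col1"), False]) == ("isNull", "col1")
```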
[jira] [Updated] (HIVE-11844) Merge master to Spark branch 9/16/2015 [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-11844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-11844: --- Summary: Merge master to Spark branch 9/16/2015 [Spark Branch] (was: CMerge master to Spark branch 9/16/2015 [Spark Branch]) > Merge master to Spark branch 9/16/2015 [Spark Branch] > - > > Key: HIVE-11844 > URL: https://issues.apache.org/jira/browse/HIVE-11844 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Fix For: 1.2.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11512) Hive LDAP Authenticator should also support full DN in Authenticate()
[ https://issues.apache.org/jira/browse/HIVE-11512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-11512: - Attachment: HIVE-11512.patch > Hive LDAP Authenticator should also support full DN in Authenticate() > -- > > Key: HIVE-11512 > URL: https://issues.apache.org/jira/browse/HIVE-11512 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam >Priority: Minor > Attachments: HIVE-11512.patch > > > In certain LDAP implementations, LDAP binding can occur using the full DN for > the user. Currently, the LDAP Authentication Provider assumes that the username > passed into Authenticate() is a short username & not a full DN. While the > initial bind works fine either way, the filter code is reliant on it being a > short name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8327) mvn site -Pfindbugs
[ https://issues.apache.org/jira/browse/HIVE-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790686#comment-14790686 ] Gopal V commented on HIVE-8327: --- This patch seems to have been lost during a spark branch merge. {code} commit 714b3db65d41dd96db59ca1b9a6d1b6a4613072e Merge: 537114b 7df9d7a Author: xzhang Date: Thu Jul 30 17:41:17 2015 -0700 HIVE-10863: Merge master to Spark branch 7/29/2015 [Spark Branch] (reviewed by Chao) {code} > mvn site -Pfindbugs > --- > > Key: HIVE-8327 > URL: https://issues.apache.org/jira/browse/HIVE-8327 > Project: Hive > Issue Type: Test > Components: Diagnosability >Reporter: Gopal V >Assignee: Gopal V > Fix For: 1.1.0 > > Attachments: HIVE-8327.1.patch, HIVE-8327.2.patch, ql-findbugs.html > > > HIVE-3099 originally added findbugs into the old ant build. > Get basic findbugs working for the maven build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11791) Add unit test for HIVE-10122
[ https://issues.apache.org/jira/browse/HIVE-11791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11791: --- Assignee: Illya Yalovyy > Add unit test for HIVE-10122 > > > Key: HIVE-11791 > URL: https://issues.apache.org/jira/browse/HIVE-11791 > Project: Hive > Issue Type: Test > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Illya Yalovyy >Assignee: Illya Yalovyy >Priority: Minor > Attachments: HIVE-11791.patch > > > Unit tests for PartitionPruner.compactExpr() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11832) HIVE-11802 breaks compilation in JDK 8
[ https://issues.apache.org/jira/browse/HIVE-11832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790646#comment-14790646 ] Hive QA commented on HIVE-11832: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12756105/HIVE-11832.1.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9445 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation org.apache.hive.jdbc.TestSSL.testSSLFetchHttp {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5296/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5296/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5296/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12756105 - PreCommit-HIVE-TRUNK-Build > HIVE-11802 breaks compilation in JDK 8 > -- > > Key: HIVE-11832 > URL: https://issues.apache.org/jira/browse/HIVE-11832 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Sergio Peña > Attachments: HIVE-11832.1.patch > > > HIVE-11802 changes breaks JDK 8 compilation. FloatingDecimal constructor > accepting float is removed in JDK 8. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11678) Add AggregateProjectMergeRule
[ https://issues.apache.org/jira/browse/HIVE-11678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790683#comment-14790683 ] Jesus Camacho Rodriguez commented on HIVE-11678: I've gone through the patch and the plan changes. +1 > Add AggregateProjectMergeRule > - > > Key: HIVE-11678 > URL: https://issues.apache.org/jira/browse/HIVE-11678 > Project: Hive > Issue Type: New Feature > Components: CBO, Logical Optimizer >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-11678.2.patch, HIVE-11678.3.patch, > HIVE-11678.4.patch, HIVE-11678.5.patch, HIVE-11678.patch > > > This will help to get rid of extra projects on top of Aggregation, thus > compacting query plan. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-11835) Type decimal(1,1) reads 0.0, 0.00, etc from text file as NULL
[ https://issues.apache.org/jira/browse/HIVE-11835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790853#comment-14790853 ] Xuefu Zhang edited comment on HIVE-11835 at 9/16/15 6:42 PM: - The problem is caused by the fact that Hive trims zeros. In most cases this is harmless. However, if the value is 0.0, 0.00, 0.000, etc, trimming zeros changes the value to 0, which has a type decimal(1,0). Since type decimal(1, 1) allows no integer digits, 0 becomes NULL when being converted to decimal(1, 1). It seems that trimming trailing zeros doesn't do any good. It not only changes the data type, creating a problem like the one here, but also completely changes the semantic meaning of the number. The right fix is to trim trailing zeros only when the scale goes beyond what the datatype allows, which happens when scale is enforced. This will also keep the right number of decimal digits in query results, which is desirable and common practice in other DBs. Initial patch to have a test run. Expect some test results need to be updated. Will also add new tests. was (Author: xuefuz): Initial patch to have a test run. Expect some test results need to be updated. Will also add new tests. > Type decimal(1,1) reads 0.0, 0.00, etc from text file as NULL > - > > Key: HIVE-11835 > URL: https://issues.apache.org/jira/browse/HIVE-11835 > Project: Hive > Issue Type: Bug > Components: Types >Affects Versions: 1.2.0, 1.1.0, 2.0.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-11835.patch > > > Steps to reproduce: > 1. create a text file with values like 0.0, 0.00, etc. > 2. create table in hive with type decimal(1,1). > 3. run "load data local inpath ..." to load data into the table. > 4. run select * on the table. > You will see that NULL is displayed for 0.0, 0.00, .0, etc. Instead, these > should be read as 0.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
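The type arithmetic in the comment can be checked with a small sketch using Python's decimal module (the `fits` helper is invented for illustration and is not HiveDecimal's logic): decimal(p, s) allows p total digits with s of them after the point, so decimal(1,1) has room for zero integer digits, and the trimmed value 0 no longer fits.

```python
from decimal import Decimal

def fits(value: Decimal, precision: int, scale: int) -> bool:
    """Illustrative check: does `value` fit decimal(precision, scale)?"""
    sign, digits, exponent = value.as_tuple()
    frac_digits = max(0, -exponent)
    int_digits = max(0, len(digits) - frac_digits)
    # scale bounds the fractional digits; precision - scale bounds the
    # integer digits (zero of them for decimal(1, 1)).
    return frac_digits <= scale and int_digits <= precision - scale

# 0.0 fits decimal(1,1); trimming it to 0 makes it decimal(1,0), which
# has one integer digit and no longer fits -- hence the NULL.
assert fits(Decimal("0.0"), 1, 1)
assert not fits(Decimal("0"), 1, 1)
```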
[jira] [Commented] (HIVE-11836) ORC SARG creation throws NPE for null constants with void type
[ https://issues.apache.org/jira/browse/HIVE-11836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790836#comment-14790836 ] Hive QA commented on HIVE-11836: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12756132/HIVE-11836.1.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5300/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5300/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5300/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-5300/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + 
[[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at ce71355 HIVE-8327: (repeat) mvn site -Pfindbugs for hive (Gopal V reviewed by Ashutosh Chauhan) + git clean -f -d + git checkout master Already on 'master' + git reset --hard origin/master HEAD is now at ce71355 HIVE-8327: (repeat) mvn site -Pfindbugs for hive (Gopal V reviewed by Ashutosh Chauhan) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. 
ATTACHMENT ID: 12756132 - PreCommit-HIVE-TRUNK-Build > ORC SARG creation throws NPE for null constants with void type > -- > > Key: HIVE-11836 > URL: https://issues.apache.org/jira/browse/HIVE-11836 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-11836.1.patch > > > Queries like > {code} > select * from table where col = null > {code} > will throw the following exception > {code} > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.boxLiteral(SearchArgumentImpl.java:446) > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.getLiteral(SearchArgumentImpl.java:476) > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.createLeaf(SearchArgumentImpl.java:524) > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.createLeaf(SearchArgumentImpl.java:584) > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.parse(SearchArgumentImpl.java:629) > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.addChildren(SearchArgumentImpl.java:598) > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.parse(SearchArgumentImpl.java:621) > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.addChildren(SearchArgumentImpl.java:598) > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.parse(SearchArgumentImpl.java:621) > at >
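A hedged sketch of the failure shape in the trace above (illustrative Python, not SearchArgumentImpl): the leaf builder boxes the literal without a null check, so the null constant from `col = null` raises; a guarded builder can instead degrade the leaf to the YES_NO_NULL truth value so the predicate simply never prunes anything. The tuple leaf shape here is invented for illustration.

```python
def create_leaf(op, column, literal):
    """Build a SARG-style leaf; a null literal cannot prune any stripe."""
    if literal is None:
        # `col = null` is never true under SQL semantics; rather than
        # dereference the literal (the NPE above), emit a leaf the reader
        # treats as "cannot decide", i.e. scan the data anyway.
        return ("truth", "YES_NO_NULL")
    return (op, column, literal)
```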
[jira] [Commented] (HIVE-11815) Correct the column/table names in subquery expression when creating a view
[ https://issues.apache.org/jira/browse/HIVE-11815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790837#comment-14790837 ] Ashutosh Chauhan commented on HIVE-11815: - +1 > Correct the column/table names in subquery expression when creating a view > -- > > Key: HIVE-11815 > URL: https://issues.apache.org/jira/browse/HIVE-11815 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11815.01.patch, HIVE-11815.02.patch > > > Right now Hive does not quote column/table names in subquery expression when > create a view. For example > {code} > hive> > > create table tc (`@d` int); > OK > Time taken: 0.119 seconds > hive> create view tcv as select * from tc b where exists (select a.`@d` from > tc a where b.`@d`=a.`@d`); > OK > Time taken: 0.075 seconds > hive> describe extended tcv; > OK > @dint > Detailed Table InformationTable(tableName:tcv, dbName:default, > owner:pxiong, createTime:1442250005, lastAccessTime:0, retention:0, > sd:StorageDescriptor(cols:[FieldSchema(name:@d, type:int, comment:null)], > location:null, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, > outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, > compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, > serializationLib:null, parameters:{}), bucketCols:[], sortCols:[], > parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], > skewedColValueLocationMaps:{}), storedAsSubDirectories:false), > partitionKeys:[], parameters:{transient_lastDdlTime=1442250005}, > viewOriginalText:select * from tc b where exists (select a.@d from tc a where > b.@d=a.@d), viewExpandedText:select `b`.`@d` from `default`.`tc` `b` where > exists (select a.@d from tc a where b.@d=a.@d), tableType:VIRTUAL_VIEW) > Time taken: 0.063 seconds, Fetched: 3 row(s) > hive> select * from tcv; > FAILED: SemanticException line 1:63 character '@' not supported here > line 1:84 character '@' not 
supported here > line 1:89 character '@' not supported here in definition of VIEW tcv [ > select `b`.`@d` from `default`.`tc` `b` where exists (select a.@d from tc a > where b.@d=a.@d) > ] used as tcv at Line 1:14 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
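The failure above comes from the expanded view text quoting identifiers in the outer query (`` `b`.`@d` ``) but not inside the subquery expression (`a.@d`), so re-parsing the view hits the unsupported `@`. A hedged sketch of the quoting rule involved (an illustrative helper, not Hive's unparse machinery):

```python
def quote_identifier(name: str) -> str:
    # Backtick-quote, doubling any embedded backticks, so names like `@d`
    # survive re-parsing of the expanded view definition.
    return "`" + name.replace("`", "``") + "`"
```

Applying this uniformly would turn `select a.@d from tc a` in the stored view text into `` select `a`.`@d` from `tc` `a` ``, matching the already-quoted outer query.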
[jira] [Updated] (HIVE-11332) Unicode table comments do not work
[ https://issues.apache.org/jira/browse/HIVE-11332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11332: Affects Version/s: 2.0.0 0.13.1 1.1.0 > Unicode table comments do not work > -- > > Key: HIVE-11332 > URL: https://issues.apache.org/jira/browse/HIVE-11332 > Project: Hive > Issue Type: Bug >Affects Versions: 0.13.1, 1.1.0, 2.0.0 >Reporter: Sergey Shelukhin > > Noticed by accident. > {noformat} > select ' ', count(*) from moo; > Query ID = sershe_20150721190413_979e1b6f-86d6-436f-b8e6-d6785b9d3b83 > Total jobs = 1 > Launching Job 1 out of 1 > [snip] > OK > 0 > Time taken: 13.347 seconds, Fetched: 1 row(s) > hive> ALTER TABLE moo SET TBLPROPERTIES ('comment' = ' '); > OK > Time taken: 0.292 seconds > hive> desc extended moo; > OK > i int > > Detailed Table InformationTable(tableName:moo, dbName:default, > owner:sershe, createTime:1437519787, lastAccessTime:0, retention:0, > sd:StorageDescriptor(cols:[FieldSchema(name:i, type:int, comment:null)], > location:hdfs://cn108-10.l42scl.hortonworks.com:8020/apps/hive/warehouse/moo, > inputFormat:org.apache.hadoop.mapred.TextInputFormat, > outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, > compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, > serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, > parameters:{serialization.format=1}), bucketCols:[], sortCols:[], > parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], > skewedColValueLocationMaps:{}), storedAsSubDirectories:false), > partitionKeys:[], parameters:{last_modified_time=1437519883, totalSize=0, > numRows=-1, rawDataSize=-1, COLUMN_STATS_ACCURATE=false, numFiles=0, > transient_lastDdlTime=1437519883, comment=?? , last_modified_by=sershe}, > viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE) > Time taken: 0.347 seconds, Fetched: 3 row(s) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11846) CliDriver shutdown tries to drop index table again which was already dropped when dropping the original table
[ https://issues.apache.org/jira/browse/HIVE-11846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790907#comment-14790907 ] Xuefu Zhang commented on HIVE-11846: The description doesn't seem to describe the problem clearly. What's the symptom of the problem and how is it related to CliDriver shutdown? > CliDriver shutdown tries to drop index table again which was already dropped > when dropping the original table > -- > > Key: HIVE-11846 > URL: https://issues.apache.org/jira/browse/HIVE-11846 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong >Priority: Critical > > Steps to repro: > {code} > set hive.stats.dbclass=fs; > set hive.stats.autogather=true; > set hive.cbo.enable=true; > DROP TABLE IF EXISTS aa; > CREATE TABLE aa (L_ORDERKEY INT, > L_PARTKEY INT, > L_SUPPKEY INT, > L_LINENUMBER INT, > L_QUANTITY DOUBLE, > L_EXTENDEDPRICE DOUBLE, > L_DISCOUNT DOUBLE, > L_TAX DOUBLE, > L_RETURNFLAG STRING, > L_LINESTATUS STRING, > l_shipdate STRING, > L_COMMITDATE STRING, > L_RECEIPTDATE STRING, > L_SHIPINSTRUCT STRING, > L_SHIPMODE STRING, > L_COMMENT STRING) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '|'; > LOAD DATA LOCAL INPATH '../../data/files/lineitem.txt' OVERWRITE INTO TABLE > aa; > CREATE INDEX aa_lshipdate_idx ON TABLE aa(l_shipdate) AS > 'org.apache.hadoop.hive.ql.index.AggregateIndexHandler' WITH DEFERRED REBUILD > IDXPROPERTIES("AGGREGATES"="count(l_shipdate)"); > ALTER INDEX aa_lshipdate_idx ON aa REBUILD; > show tables; > explain select l_shipdate, count(l_shipdate) > from aa > group by l_shipdate; > {code} > The problem is that we create an index table default_aa_lshipdate_idx > (default is the database name) and it comes after the table aa. Then, it > first drops aa, which drops default_aa_lshipdate_idx as well, since it is > related to aa. It will not find the table default_aa_lshipdate_idx when it > tries to drop it again, and will throw an exception. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11847) Avoid expensive call to contains/containsAll in DefaultGraphWalker
[ https://issues.apache.org/jira/browse/HIVE-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11847: --- Attachment: HIVE-11847.patch > Avoid expensive call to contains/containsAll in DefaultGraphWalker > -- > > Key: HIVE-11847 > URL: https://issues.apache.org/jira/browse/HIVE-11847 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer, Physical Optimizer >Affects Versions: 1.3.0, 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11847.patch > > > Continuing work started in HIVE-11652. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11835) Type decimal(1,1) reads 0.0, 0.00, etc from text file as NULL
[ https://issues.apache.org/jira/browse/HIVE-11835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790853#comment-14790853 ] Xuefu Zhang commented on HIVE-11835: Initial patch to have a test run. Expect some test results need to be updated. Will also add new tests. > Type decimal(1,1) reads 0.0, 0.00, etc from text file as NULL > - > > Key: HIVE-11835 > URL: https://issues.apache.org/jira/browse/HIVE-11835 > Project: Hive > Issue Type: Bug > Components: Types >Affects Versions: 1.2.0, 1.1.0, 2.0.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-11835.patch > > > Steps to reproduce: > 1. create a text file with values like 0.0, 0.00, etc. > 2. create table in hive with type decimal(1,1). > 3. run "load data local inpath ..." to load data into the table. > 4. run select * on the table. > You will see that NULL is displayed for 0.0, 0.00, .0, etc. Instead, these > should be read as 0.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11791) Add unit test for HIVE-10122
[ https://issues.apache.org/jira/browse/HIVE-11791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790893#comment-14790893 ] Gopal V commented on HIVE-11791: [~yalovyyi]: I have added the find bugs changes, but the compactExpr is still broken. Can you test with the following fix? {code} --- ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java +++ ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java @@ -331,6 +331,9 @@ static private ExprNodeDesc compactExpr(ExprNodeDesc expr) { if (allFalse) { return new ExprNodeConstantDesc(Boolean.FALSE); } +if (newChildren.size() == 1) { + return newChildren.get(0); +} // Nothing to compact, update expr with compacted children. ((ExprNodeGenericFuncDesc) expr).setChildren(newChildren); } {code} > Add unit test for HIVE-10122 > > > Key: HIVE-11791 > URL: https://issues.apache.org/jira/browse/HIVE-11791 > Project: Hive > Issue Type: Test > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Illya Yalovyy >Assignee: Illya Yalovyy >Priority: Minor > Attachments: HIVE-11791.patch > > > Unit tests for PartitionPruner.compactExpr() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
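The fix above collapses an AND/OR node that is left with exactly one child after pruning, instead of keeping a one-argument AND/OR. As a rough illustration only (plain tuples standing in for ExprNodeDesc; this is not Hive's actual code), the rule looks like this:

```python
# Toy model of boolean-expression compaction: nodes are ('and'|'or', [children]),
# leaves are ('leaf', name) or ('const', bool). After dropping children that are
# neutral for the operator, a node with a single remaining child is replaced by
# that child -- the case Gopal's patch adds.

def compact(expr):
    op, payload = expr
    if op not in ('and', 'or'):
        return expr                         # leaves pass through unchanged
    children = [compact(c) for c in payload]
    neutral = (op == 'and')                 # True is neutral for AND, False for OR
    children = [c for c in children if c != ('const', neutral)]
    if not children:
        return ('const', neutral)           # everything was neutral
    if len(children) == 1:
        return children[0]                  # the fix: collapse single-child AND/OR
    return (op, children)
```

For example, `compact(('and', [('const', True), ('leaf', 'p')]))` collapses to `('leaf', 'p')` rather than a one-argument AND.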
[jira] [Commented] (HIVE-11791) Add unit test for HIVE-10122
[ https://issues.apache.org/jira/browse/HIVE-11791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790918#comment-14790918 ] Illya Yalovyy commented on HIVE-11791: -- I'm on it. If you could review/confirm expected result for my tests, I would try to fix the rest myself. > Add unit test for HIVE-10122 > > > Key: HIVE-11791 > URL: https://issues.apache.org/jira/browse/HIVE-11791 > Project: Hive > Issue Type: Test > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Illya Yalovyy >Assignee: Illya Yalovyy >Priority: Minor > Attachments: HIVE-11791.patch > > > Unit tests for PartitionPruner.compactExpr() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11786) Deprecate the use of redundant column in column stats related tables
[ https://issues.apache.org/jira/browse/HIVE-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791032#comment-14791032 ] Hive QA commented on HIVE-11786: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12756158/HIVE-11786.patch {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 9445 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter_partitioned org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.testStatsAfterCompactionPartTbl org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5301/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5301/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5301/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12756158 - PreCommit-HIVE-TRUNK-Build > Deprecate the use of redundant column in column stats related tables > > > Key: HIVE-11786 > URL: https://issues.apache.org/jira/browse/HIVE-11786 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-11786.patch > > > The stats tables such as TAB_COL_STATS and PART_COL_STATS have redundant columns > such as DB_NAME, TABLE_NAME, PARTITION_NAME, since these tables already have > foreign keys like TBL_ID or PART_ID referencing TBLS or PARTITIONS. > These redundant columns violate database normalization rules and cause a lot > of inconvenience (and sometimes difficulty) in implementing column stats related > features. For example, when renaming a table, we have to update the > TABLE_NAME column in these tables as well, which is unnecessary. > This JIRA first deprecates the use of these columns at the HMS code level. A > follow-up JIRA will be opened to focus on the DB schema change and upgrade. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11819) HiveServer2 catches OOMs on request threads
[ https://issues.apache.org/jira/browse/HIVE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790827#comment-14790827 ] Hive QA commented on HIVE-11819: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12756110/HIVE-11819.01.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5298/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5298/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5298/ Messages: {noformat} This message was trimmed, see log for full details [INFO] Excluding aopalliance:aopalliance:jar:1.0 from the shaded jar. [INFO] Excluding com.sun.jersey.contribs:jersey-guice:jar:1.9 from the shaded jar. [INFO] Excluding org.apache.commons:commons-collections4:jar:4.0 from the shaded jar. [INFO] Excluding org.apache.tez:tez-runtime-library:jar:0.5.2 from the shaded jar. [INFO] Excluding org.apache.tez:tez-common:jar:0.5.2 from the shaded jar. [INFO] Excluding org.apache.tez:tez-runtime-internals:jar:0.5.2 from the shaded jar. [INFO] Excluding org.apache.tez:tez-mapreduce:jar:0.5.2 from the shaded jar. [INFO] Excluding commons-collections:commons-collections:jar:3.2.1 from the shaded jar. [INFO] Excluding org.apache.spark:spark-core_2.10:jar:1.4.0 from the shaded jar. [INFO] Excluding com.twitter:chill_2.10:jar:0.5.0 from the shaded jar. [INFO] Excluding com.twitter:chill-java:jar:0.5.0 from the shaded jar. [INFO] Excluding org.apache.hadoop:hadoop-client:jar:1.2.1 from the shaded jar. [INFO] Excluding org.apache.spark:spark-launcher_2.10:jar:1.4.0 from the shaded jar. [INFO] Excluding org.apache.spark:spark-network-common_2.10:jar:1.4.0 from the shaded jar. [INFO] Excluding org.apache.spark:spark-network-shuffle_2.10:jar:1.4.0 from the shaded jar. 
[jira] [Commented] (HIVE-11842) Improve RuleRegExp by caching some internal data structures
[ https://issues.apache.org/jira/browse/HIVE-11842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790881#comment-14790881 ] Sergey Shelukhin commented on HIVE-11842: - +1 provided tests pass > Improve RuleRegExp by caching some internal data structures > --- > > Key: HIVE-11842 > URL: https://issues.apache.org/jira/browse/HIVE-11842 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Fix For: 2.0.0 > > Attachments: HIVE-11842.patch > > > Continuing work started in HIVE-11141. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11791) Add unit test for HIVE-10122
[ https://issues.apache.org/jira/browse/HIVE-11791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790943#comment-14790943 ] Illya Yalovyy commented on HIVE-11791: -- With this change test results look much better. The one which looks strange is: and(true, true) produces NULL. I would expect it to be TRUE. If this doesn't matter for the downstream logic then I'll update expected result for the test. Could you please clarify what NULL result means? > Add unit test for HIVE-10122 > > > Key: HIVE-11791 > URL: https://issues.apache.org/jira/browse/HIVE-11791 > Project: Hive > Issue Type: Test > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Illya Yalovyy >Assignee: Illya Yalovyy >Priority: Minor > Attachments: HIVE-11791.patch > > > Unit tests for PartitionPruner.compactExpr() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
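For context on the NULL result under discussion: partition-pruning predicates are evaluated under SQL three-valued logic, where sub-expressions the pruner cannot evaluate behave like NULL. A minimal sketch of three-valued AND (assumed semantics for illustration, not Hive code) shows why and(true, true) would be expected to yield TRUE, with NULL appearing only when an operand is unknown:

```python
# Three-valued AND: None models an unknown/unevaluable operand.
def and3(a, b):
    if a is False or b is False:   # FALSE dominates, even over unknown
        return False
    if a is None or b is None:     # otherwise unknown is contagious
        return None
    return True
```

Under these semantics `and3(True, True)` is `True`, so a NULL result for and(true, true) would suggest the operands had already been replaced by unknowns before the AND was evaluated.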
[jira] [Updated] (HIVE-11683) Hive Streaming may overload the metastore
[ https://issues.apache.org/jira/browse/HIVE-11683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11683: -- Component/s: Metastore > Hive Streaming may overload the metastore > - > > Key: HIVE-11683 > URL: https://issues.apache.org/jira/browse/HIVE-11683 > Project: Hive > Issue Type: Bug > Components: HCatalog, Hive, Metastore, Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Roshan Naik > > HiveEndPoint represents a way to write to a specific partition > transactionally. > Each HiveEndPoint creates TransactionBatch(es) and commits transactions. > Suppose you have 10 instances of a Storm Hive bolt using the Streaming API. > Each instance will create HiveEndPoints on demand when it sees an event for a > particular partition value. > If events are uniformly distributed wrt partition values and the table has > 1000 partitions (for example, it's partitioned by CustomerId), each of the 10 bolt > instances may create 1000 HiveEndPoints and thus > 10,000 (actually 10K * > num_txn_per_batch) concurrent transactions. > This creates a huge amount of Metastore traffic. > HIVE-11672 is investigating how some sort of "shuffle" phase can be added to > route events for a particular bucket to the same bolt instance. > The same idea should be explored to route events based on partition value. > cc [~alangates],[~sriharsha],[~rbains] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
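The transaction estimate in the description follows from quick arithmetic; the sketch below spells it out. The per-batch transaction count is an assumed example value, since the ticket leaves num_txn_per_batch unspecified:

```python
# Back-of-the-envelope check of the HiveEndPoint math in the description.
bolts = 10                 # Storm Hive bolt instances
partitions = 1000          # distinct partition values seen by each bolt
num_txn_per_batch = 5      # assumed example value; not given in the ticket

endpoints = bolts * partitions                  # one HiveEndPoint per (bolt, partition)
transactions = endpoints * num_txn_per_batch    # the "10K * num_txn_per_batch" figure
print(endpoints, transactions)                  # prints: 10000 50000
```

Even a modest batch size multiplies the 10,000 endpoints into tens of thousands of transactions hitting the metastore.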
[jira] [Commented] (HIVE-11846) CliDriver shutdown tries to drop index table again which was already dropped when dropping the original table
[ https://issues.apache.org/jira/browse/HIVE-11846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791046#comment-14791046 ] Pengcheng Xiong commented on HIVE-11846: [~xuefuz], thanks for your attention. The problem can be better understood by taking a look at my patch. When CliDriver shuts down, it tries to drop all the tables created during the q test. It iterates through all the tables in db.getAllTables() in QTestUtil.java and tries to drop every one of them. Assume there are two tables: A, an original table, and index_A, an index table created on A. If index_A comes before A in the iteration, there is no problem, because L674 in QTestUtil.java will skip it, and later, when A is dropped, index_A is dropped as well. However, if A comes before index_A in the iteration, dropping A also drops index_A; later, the iteration will not find index_A and will throw InvalidTableException. That is the symptom of the problem and why it is related to CliDriver shutdown. Thanks. 
> CliDriver shutdown tries to drop index table again which was already dropped > when dropping the original table > -- > > Key: HIVE-11846 > URL: https://issues.apache.org/jira/browse/HIVE-11846 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong >Priority: Critical > Attachments: HIVE-11846.01.patch > > > Steps to repro: > {code} > set hive.stats.dbclass=fs; > set hive.stats.autogather=true; > set hive.cbo.enable=true; > DROP TABLE IF EXISTS aa; > CREATE TABLE aa (L_ORDERKEY INT, > L_PARTKEY INT, > L_SUPPKEY INT, > L_LINENUMBER INT, > L_QUANTITY DOUBLE, > L_EXTENDEDPRICE DOUBLE, > L_DISCOUNT DOUBLE, > L_TAX DOUBLE, > L_RETURNFLAG STRING, > L_LINESTATUS STRING, > l_shipdate STRING, > L_COMMITDATE STRING, > L_RECEIPTDATE STRING, > L_SHIPINSTRUCT STRING, > L_SHIPMODE STRING, > L_COMMENT STRING) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '|'; > LOAD DATA LOCAL INPATH '../../data/files/lineitem.txt' OVERWRITE INTO TABLE > aa; > CREATE INDEX aa_lshipdate_idx ON TABLE aa(l_shipdate) AS > 'org.apache.hadoop.hive.ql.index.AggregateIndexHandler' WITH DEFERRED REBUILD > IDXPROPERTIES("AGGREGATES"="count(l_shipdate)"); > ALTER INDEX aa_lshipdate_idx ON aa REBUILD; > show tables; > explain select l_shipdate, count(l_shipdate) > from aa > group by l_shipdate; > {code} > The problem is that we create an index table default_aa_lshipdate_idx > (default is the database name) and it comes after the table aa. Then, it > first drops aa, which drops default_aa_lshipdate_idx as well, since it is > related to aa. It will not find the table default_aa_lshipdate_idx when it > tries to drop it again, and will throw an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
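The order dependence described in the comment above can be reproduced with a toy model (hypothetical code, not QTestUtil itself): dropping a base table cascades to its index table, so a loop over a pre-fetched table list can later look up a table that no longer exists.

```python
# Toy reproduction of the order-dependent cleanup failure. `tables` is the
# snapshot returned by a getAllTables()-style call; `index_of` maps a base
# table to its index table. Names are illustrative only.

def drop_all(tables, index_of):
    index_tables = set(index_of.values())
    existing = set(tables)
    errors = []
    for t in tables:
        if t in index_tables and t in existing:
            continue                        # mirrors QTestUtil skipping live index tables
        if t not in existing:
            errors.append(t)                # already cascade-dropped: Hive throws
            continue                        # InvalidTableException at this point
        existing.discard(t)
        existing.discard(index_of.get(t))   # dropping a base table drops its index table

    return errors

# Safe order (index table first): no error.
assert drop_all(['idx_A', 'A'], {'A': 'idx_A'}) == []
# Failing order (base table first): the later lookup of idx_A finds nothing.
assert drop_all(['A', 'idx_A'], {'A': 'idx_A'}) == ['idx_A']
```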
[jira] [Updated] (HIVE-11835) Type decimal(1,1) reads 0.0, 0.00, etc from text file as NULL
[ https://issues.apache.org/jira/browse/HIVE-11835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-11835: --- Attachment: HIVE-11835.patch > Type decimal(1,1) reads 0.0, 0.00, etc from text file as NULL > - > > Key: HIVE-11835 > URL: https://issues.apache.org/jira/browse/HIVE-11835 > Project: Hive > Issue Type: Bug > Components: Types >Affects Versions: 1.2.0, 1.1.0, 2.0.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-11835.patch > > > Steps to reproduce: > 1. create a text file with values like 0.0, 0.00, etc. > 2. create table in hive with type decimal(1,1). > 3. run "load data local inpath ..." to load data into the table. > 4. run select * on the table. > You will see that NULL is displayed for 0.0, 0.00, .0, etc. Instead, these > should be read as 0.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
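The values in the report should all fit the declared type: decimal(1,1) has precision 1 and scale 1, and 0.0, 0.00, and .0 all normalize to the single-digit value 0.0. A quick check with Python's decimal module (illustrative only, not Hive's parser) confirms the expectation in the description:

```python
# Check that textual values normalize into a decimal(precision, scale) type.
from decimal import Decimal

def fits(text, precision, scale):
    # Round the parsed value to the declared scale, then count its digits.
    d = Decimal(text).quantize(Decimal(1).scaleb(-scale))
    return len(d.as_tuple().digits) <= precision

# All of these should be readable as 0.0 by a decimal(1,1) column, not NULL.
assert fits('0.0', 1, 1)
assert fits('0.00', 1, 1)
assert fits('.0', 1, 1)
# A value that genuinely overflows the type, for contrast.
assert not fits('1.23', 1, 1)
```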
[jira] [Updated] (HIVE-11819) HiveServer2 catches OOMs on request threads
[ https://issues.apache.org/jira/browse/HIVE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11819: Attachment: HIVE-11819.02.patch This time, forgot the new file > HiveServer2 catches OOMs on request threads > --- > > Key: HIVE-11819 > URL: https://issues.apache.org/jira/browse/HIVE-11819 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11819.01.patch, HIVE-11819.02.patch, > HIVE-11819.patch > > > ThriftCLIService methods such as ExecuteStatement are apparently capable of > catching OOMs because they get wrapped in RTE by HiveSessionProxy. > This shouldn't happen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11839) Vectorization wrong results with filter of (CAST AS CHAR)
[ https://issues.apache.org/jira/browse/HIVE-11839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790886#comment-14790886 ] Sergey Shelukhin commented on HIVE-11839: - +1. Does it also affect 1.2, 1.3 etc.? It should be backported accordingly > Vectorization wrong results with filter of (CAST AS CHAR) > - > > Key: HIVE-11839 > URL: https://issues.apache.org/jira/browse/HIVE-11839 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11839.01.patch > > > PROBLEM: > A query such as > select count(1) from table where CAST (id as CHAR(4))='1000'; > gives the wrong result 0 instead of the expected result. > STEPS TO REPRODUCE: > create table s1(id smallint) stored as orc; > insert into table s1 values (1000),(1001),(1002),(1003),(1000); > set hive.vectorized.execution.enabled=true; > select count(1) from s1 where cast(id as char(4))='1000'; > -- this gives 0 > set hive.vectorized.execution.enabled=false; > select count(1) from s1 where cast(id as char(4))='1000'; > -- this gives 2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
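The expected count of 2 follows from CHAR(n) semantics as the repro assumes them: CHAR pads values to length n, but comparisons ignore trailing pad spaces, so cast(1000 as char(4)) should match '1000'. A sketch of those assumed semantics (illustrative, not Hive's vectorized code path):

```python
# Model of CHAR(n): values are truncated/padded to n characters, and equality
# treats trailing pad spaces as insignificant.

def to_char(value, n):
    return str(value)[:n].ljust(n)      # truncate to n, pad with spaces

def char_eq(a, b):
    return a.rstrip(' ') == b.rstrip(' ')

rows = [1000, 1001, 1002, 1003, 1000]   # the s1 table from the repro
matches = sum(1 for r in rows if char_eq(to_char(r, 4), '1000'))
print(matches)  # prints: 2, the correct (non-vectorized) result
```

The vectorized bug made the same predicate match zero rows.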
[jira] [Updated] (HIVE-11096) Bump the parquet version to 1.7.0
[ https://issues.apache.org/jira/browse/HIVE-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-11096: --- Fix Version/s: 1.3.0 > Bump the parquet version to 1.7.0 > - > > Key: HIVE-11096 > URL: https://issues.apache.org/jira/browse/HIVE-11096 > Project: Hive > Issue Type: Task >Affects Versions: 1.2.0 >Reporter: Sergio Peña >Assignee: Ferdinand Xu >Priority: Minor > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11096.1.patch > > > Parquet officially became an Apache project as of Parquet 1.7.0. > This new version has no bugfixes or improvements over the previous > 1.6.0 version, but all imports were changed to org.apache.parquet, and the > pom.xml must use org.apache.parquet instead of com.twitter. > This ticket should address only those import and pom changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11844) Merge master to Spark branch 9/16/2015 [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-11844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790916#comment-14790916 ] Hive QA commented on HIVE-11844: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12756302/HIVE-11844.1-spark.patch {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 7467 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.initializationError org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_inner_join org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join2 org.apache.hadoop.hive.cli.TestMinimrCliDriver.initializationError org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/949/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/949/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-949/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12756302 - PreCommit-HIVE-SPARK-Build > Merge master to Spark branch 9/16/2015 [Spark Branch] > - > > Key: HIVE-11844 > URL: https://issues.apache.org/jira/browse/HIVE-11844 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-11844.1-spark.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11842) Improve RuleRegExp by caching some internal data structures
[ https://issues.apache.org/jira/browse/HIVE-11842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790882#comment-14790882 ] Sergey Shelukhin commented on HIVE-11842: - is there perf test on this? > Improve RuleRegExp by caching some internal data structures > --- > > Key: HIVE-11842 > URL: https://issues.apache.org/jira/browse/HIVE-11842 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Fix For: 2.0.0 > > Attachments: HIVE-11842.patch > > > Continuing work started in HIVE-11141. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11819) HiveServer2 catches OOMs on request threads
[ https://issues.apache.org/jira/browse/HIVE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11819: Attachment: (was: HIVE-11819.02.patch) > HiveServer2 catches OOMs on request threads > --- > > Key: HIVE-11819 > URL: https://issues.apache.org/jira/browse/HIVE-11819 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11819.01.patch, HIVE-11819.patch > > > ThriftCLIService methods such as ExecuteStatement are apparently capable of > catching OOMs because they get wrapped in RTE by HiveSessionProxy. > This shouldn't happen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11844) Merge master to Spark branch 9/16/2015 [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-11844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790988#comment-14790988 ] Xuefu Zhang commented on HIVE-11844: Besides some test result diffs, there seems to be an issue with the test environment. Since there is only a minor conflict, I'm committing the merge now and will address the tests and environment as follow-ups. > Merge master to Spark branch 9/16/2015 [Spark Branch] > - > > Key: HIVE-11844 > URL: https://issues.apache.org/jira/browse/HIVE-11844 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-11844.1-spark.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11846) CliDriver shutdown tries to drop index table again which was already dropped when dropping the original table
[ https://issues.apache.org/jira/browse/HIVE-11846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11846: --- Attachment: HIVE-11846.01.patch > CliDriver shutdown tries to drop index table again which was already dropped > when dropping the original table > -- > > Key: HIVE-11846 > URL: https://issues.apache.org/jira/browse/HIVE-11846 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong >Priority: Critical > Attachments: HIVE-11846.01.patch > > > Steps to repro: > {code} > set hive.stats.dbclass=fs; > set hive.stats.autogather=true; > set hive.cbo.enable=true; > DROP TABLE IF EXISTS aa; > CREATE TABLE aa (L_ORDERKEY INT, > L_PARTKEY INT, > L_SUPPKEY INT, > L_LINENUMBER INT, > L_QUANTITY DOUBLE, > L_EXTENDEDPRICE DOUBLE, > L_DISCOUNT DOUBLE, > L_TAX DOUBLE, > L_RETURNFLAG STRING, > L_LINESTATUS STRING, > l_shipdate STRING, > L_COMMITDATE STRING, > L_RECEIPTDATE STRING, > L_SHIPINSTRUCT STRING, > L_SHIPMODE STRING, > L_COMMENT STRING) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '|'; > LOAD DATA LOCAL INPATH '../../data/files/lineitem.txt' OVERWRITE INTO TABLE > aa; > CREATE INDEX aa_lshipdate_idx ON TABLE aa(l_shipdate) AS > 'org.apache.hadoop.hive.ql.index.AggregateIndexHandler' WITH DEFERRED REBUILD > IDXPROPERTIES("AGGREGATES"="count(l_shipdate)"); > ALTER INDEX aa_lshipdate_idx ON aa REBUILD; > show tables; > explain select l_shipdate, count(l_shipdate) > from aa > group by l_shipdate; > {code} > The problem is that we create an index table default_aa_lshipdate_idx > (default is the database name) and it comes after the table aa. Then, it > first drops aa, which drops default_aa_lshipdate_idx as well, since it is > related to aa. It will not find the table default_aa_lshipdate_idx when it > tries to drop it again, and will throw an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11110) Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, improve Filter selectivity estimation
[ https://issues.apache.org/jira/browse/HIVE-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791037#comment-14791037 ] Hive QA commented on HIVE-11110: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12756178/HIVE-11110.13.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5302/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5302/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5302/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-5302/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d 
apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin From https://github.com/apache/hive 27eeadc..efd059c branch-1 -> origin/branch-1 ce71355..57158da master -> origin/master f78f663..70eeadd spark -> origin/spark + git reset --hard HEAD HEAD is now at ce71355 HIVE-8327: (repeat) mvn site -Pfindbugs for hive (Gopal V reviewed by Ashutosh Chauhan) + git clean -f -d + git checkout master Already on 'master' Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded. + git reset --hard origin/master HEAD is now at 57158da HIVE-11816 : Upgrade groovy to 2.4.4 (Szehon, reviewed by Xuefu) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. 
ATTACHMENT ID: 12756178 - PreCommit-HIVE-TRUNK-Build > Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, > improve Filter selectivity estimation > > > Key: HIVE-11110 > URL: https://issues.apache.org/jira/browse/HIVE-11110 > Project: Hive > Issue Type: Bug > Components: CBO >Reporter: Jesus Camacho Rodriguez >Assignee: Laljo John Pullokkaran > Attachments: HIVE-11110-10.patch, HIVE-11110-11.patch, > HIVE-11110-12.patch, HIVE-11110-branch-1.2.patch, HIVE-11110.1.patch, > HIVE-11110.13.patch, HIVE-11110.2.patch, HIVE-11110.4.patch, > HIVE-11110.5.patch, HIVE-11110.6.patch, HIVE-11110.7.patch, > HIVE-11110.8.patch, HIVE-11110.9.patch, HIVE-11110.91.patch, > HIVE-11110.92.patch, HIVE-11110.patch > > > Query > {code} > select count(*) > from store_sales > ,store_returns > ,date_dim d1 > ,date_dim d2 > where d1.d_quarter_name = '2000Q1' >and d1.d_date_sk = ss_sold_date_sk >and ss_customer_sk = sr_customer_sk >and ss_item_sk = sr_item_sk >and ss_ticket_number = sr_ticket_number >and sr_returned_date_sk = d2.d_date_sk >and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3'); > {code} > The store_sales table is partitioned on ss_sold_date_sk, which is also used > in a join clause. The join clause should add a filter "filterExpr: > ss_sold_date_sk is not null", which should get pushed to the MetaStore when > fetching the stats. Currently this is not done in
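The pushdown idea in the report above can be sketched in a few lines of Python: an inner equi-join condition on a column implies that the column is not NULL, so an "is not null" filter can be derived for any partitioned join key and pushed down when fetching stats. This is an illustrative toy, with hypothetical names, not Hive's optimizer code.

```python
# Derive "col is not null" filters implied by inner equi-join keys.
# Function and parameter names are illustrative stand-ins; Hive's actual
# transform lives in the CBO rules mentioned in the issue title.
def implied_not_null_filters(join_key_columns, partition_columns):
    """Return pushable IS NOT NULL filters for partitioned join keys."""
    return ["{0} is not null".format(col)
            for col in join_key_columns
            if col in partition_columns]
```

For the query above, the join key ss_sold_date_sk is the partition column, so the derived filter would be "ss_sold_date_sk is not null".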
[jira] [Commented] (HIVE-8327) mvn site -Pfindbugs
[ https://issues.apache.org/jira/browse/HIVE-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791054#comment-14791054 ] Lefty Leverenz commented on HIVE-8327: -- Again: should this be documented? > mvn site -Pfindbugs > --- > > Key: HIVE-8327 > URL: https://issues.apache.org/jira/browse/HIVE-8327 > Project: Hive > Issue Type: Test > Components: Diagnosability >Reporter: Gopal V >Assignee: Gopal V > Fix For: 1.1.0 > > Attachments: HIVE-8327.1.patch, HIVE-8327.2.patch, ql-findbugs.html > > > HIVE-3099 originally added findbugs into the old ant build. > Get basic findbugs working for the maven build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11791) Add unit test for HIVE-10122
[ https://issues.apache.org/jira/browse/HIVE-11791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791101#comment-14791101 ] Illya Yalovyy commented on HIVE-11791: -- [~gopalv], There is an inconsistency: compactExpr(or(true, NULL)) => true, but compactExpr(or(NULL, true)) => NULL. If true == NULL in this context, then this behavior is acceptable, but still inconsistent. > Add unit test for HIVE-10122 > > > Key: HIVE-11791 > URL: https://issues.apache.org/jira/browse/HIVE-11791 > Project: Hive > Issue Type: Test > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Illya Yalovyy >Assignee: Illya Yalovyy >Priority: Minor > Attachments: HIVE-11791.patch > > > Unit tests for PartitionPruner.compactExpr() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
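The asymmetry reported above can be modeled with a toy three-valued OR in Python: a compaction that short-circuits only on its left operand evaluates or(true, NULL) and or(NULL, true) differently, while a symmetric compaction does not. The names below are illustrative stand-ins, not Hive's actual PartitionPruner.compactExpr API.

```python
# TRUE/FALSE/NULL are string stand-ins for literal expression nodes.
TRUE, FALSE, NULL = "true", "false", "null"

def compact_or_left_biased(left, right):
    """Compaction that only inspects the left operand (the inconsistent shape)."""
    if left == TRUE:
        return TRUE   # or(true, x)  => true
    if left == NULL:
        return NULL   # or(null, x) => null, even when x is true
    return right

def compact_or_symmetric(left, right):
    """Compaction honoring SQL three-valued logic: true on either side wins."""
    if left == TRUE or right == TRUE:
        return TRUE
    if left == NULL or right == NULL:
        return NULL
    return FALSE
```

Under three-valued logic, or(NULL, true) is true, so the symmetric form is the consistent one.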
[jira] [Updated] (HIVE-11834) Lineage doesn't work with dynamic partitioning query
[ https://issues.apache.org/jira/browse/HIVE-11834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HIVE-11834: --- Attachment: HIVE-11834.1.patch > Lineage doesn't work with dynamic partitioning query > > > Key: HIVE-11834 > URL: https://issues.apache.org/jira/browse/HIVE-11834 > Project: Hive > Issue Type: Bug >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang > Attachments: HIVE-11834.1.patch > > > As Mark found out, > https://issues.apache.org/jira/browse/HIVE-11139?focusedCommentId=14745937=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14745937 > This is indeed a code bug. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11839) Vectorization wrong results with filter of (CAST AS CHAR)
[ https://issues.apache.org/jira/browse/HIVE-11839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791207#comment-14791207 ] Matt McCline commented on HIVE-11839: - Committed to trunk. > Vectorization wrong results with filter of (CAST AS CHAR) > - > > Key: HIVE-11839 > URL: https://issues.apache.org/jira/browse/HIVE-11839 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11839.01.patch > > > PROBLEM: > For query such as > select count(1) from table where CAST (id as CHAR(4))='1000'; > gives wrong results 0 than expected results. > STEPS TO REPRODUCE: > create table s1(id smallint) stored as orc; > insert into table s1 values (1000),(1001),(1002),(1003),(1000); > set hive.vectorized.execution.enabled=true; > select count(1) from s1 where cast(id as char(4))='1000'; > – this gives 0 > set hive.vectorized.execution.enabled=false; > select count(1) from s1 where cast(id as char(4))='1000'; > – this gives 2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11815) Correct the column/table names in subquery expression when creating a view
[ https://issues.apache.org/jira/browse/HIVE-11815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11815: --- Attachment: HIVE-11815.03.patch rebase the patch based on recent changes on master. > Correct the column/table names in subquery expression when creating a view > -- > > Key: HIVE-11815 > URL: https://issues.apache.org/jira/browse/HIVE-11815 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11815.01.patch, HIVE-11815.02.patch, > HIVE-11815.03.patch > > > Right now Hive does not quote column/table names in subquery expression when > create a view. For example > {code} > hive> > > create table tc (`@d` int); > OK > Time taken: 0.119 seconds > hive> create view tcv as select * from tc b where exists (select a.`@d` from > tc a where b.`@d`=a.`@d`); > OK > Time taken: 0.075 seconds > hive> describe extended tcv; > OK > @dint > Detailed Table InformationTable(tableName:tcv, dbName:default, > owner:pxiong, createTime:1442250005, lastAccessTime:0, retention:0, > sd:StorageDescriptor(cols:[FieldSchema(name:@d, type:int, comment:null)], > location:null, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, > outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, > compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, > serializationLib:null, parameters:{}), bucketCols:[], sortCols:[], > parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], > skewedColValueLocationMaps:{}), storedAsSubDirectories:false), > partitionKeys:[], parameters:{transient_lastDdlTime=1442250005}, > viewOriginalText:select * from tc b where exists (select a.@d from tc a where > b.@d=a.@d), viewExpandedText:select `b`.`@d` from `default`.`tc` `b` where > exists (select a.@d from tc a where b.@d=a.@d), tableType:VIRTUAL_VIEW) > Time taken: 0.063 seconds, Fetched: 3 row(s) > hive> select * from tcv; > FAILED: SemanticException line 
1:63 character '@' not supported here > line 1:84 character '@' not supported here > line 1:89 character '@' not supported here in definition of VIEW tcv [ > select `b`.`@d` from `default`.`tc` `b` where exists (select a.@d from tc a > where b.@d=a.@d) > ] used as tcv at Line 1:14 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
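The quoting fix being discussed can be sketched as a small Python helper: backtick-quoting every identifier in the expanded view text keeps characters such as '@' parseable when the view is later re-resolved. The helper names are hypothetical; Hive's real unparse logic lives in the semantic analyzer.

```python
def quote_ident(name):
    """Backtick-quote a Hive identifier, escaping embedded backticks."""
    return "`" + name.replace("`", "``") + "`"

def qualify(*parts):
    """Build a dotted, fully quoted reference such as `default`.`tc`."""
    return ".".join(quote_ident(p) for p in parts)
```

Applied to the example above, the subquery reference a.@d would be emitted as `a`.`@d`, matching the quoting already used for the outer query in viewExpandedText.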
[jira] [Commented] (HIVE-11839) Vectorization wrong results with filter of (CAST AS CHAR)
[ https://issues.apache.org/jira/browse/HIVE-11839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791178#comment-14791178 ] Matt McCline commented on HIVE-11839: - Thanks [~sershe] for quick review. Test failures are unrelated. > Vectorization wrong results with filter of (CAST AS CHAR) > - > > Key: HIVE-11839 > URL: https://issues.apache.org/jira/browse/HIVE-11839 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11839.01.patch > > > PROBLEM: > For query such as > select count(1) from table where CAST (id as CHAR(4))='1000'; > gives wrong results 0 than expected results. > STEPS TO REPRODUCE: > create table s1(id smallint) stored as orc; > insert into table s1 values (1000),(1001),(1002),(1003),(1000); > set hive.vectorized.execution.enabled=true; > select count(1) from s1 where cast(id as char(4))='1000'; > – this gives 0 > set hive.vectorized.execution.enabled=false; > select count(1) from s1 where cast(id as char(4))='1000'; > – this gives 2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11826) 'hadoop.proxyuser.hive.groups' configuration doesn't prevent unauthorized user to access metastore
[ https://issues.apache.org/jira/browse/HIVE-11826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-11826: Attachment: HIVE-11826.2.patch Change the TestHadoop20SAuthBridge.java to be the one for version23 since version23S is already removed from the code base. > 'hadoop.proxyuser.hive.groups' configuration doesn't prevent unauthorized > user to access metastore > -- > > Key: HIVE-11826 > URL: https://issues.apache.org/jira/browse/HIVE-11826 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-11826.2.patch, HIVE-11826.patch > > > With 'hadoop.proxyuser.hive.groups' configured in core-site.xml to certain > groups, currently if you run the job with a user not belonging to those > groups, it won't fail to access metastore. With old version hive 0.13, > actually it fails properly. > Seems HadoopThriftAuthBridge20S.java correctly call ProxyUsers.authorize() > while HadoopThriftAuthBridge23 doesn't. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11839) Vectorization wrong results with filter of (CAST AS CHAR)
[ https://issues.apache.org/jira/browse/HIVE-11839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791173#comment-14791173 ] Hive QA commented on HIVE-11839: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12756187/HIVE-11839.01.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9447 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5303/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5303/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5303/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12756187 - PreCommit-HIVE-TRUNK-Build > Vectorization wrong results with filter of (CAST AS CHAR) > - > > Key: HIVE-11839 > URL: https://issues.apache.org/jira/browse/HIVE-11839 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11839.01.patch > > > PROBLEM: > For query such as > select count(1) from table where CAST (id as CHAR(4))='1000'; > gives wrong results 0 than expected results. 
> STEPS TO REPRODUCE: > create table s1(id smallint) stored as orc; > insert into table s1 values (1000),(1001),(1002),(1003),(1000); > set hive.vectorized.execution.enabled=true; > select count(1) from s1 where cast(id as char(4))='1000'; > – this gives 0 > set hive.vectorized.execution.enabled=false; > select count(1) from s1 where cast(id as char(4))='1000'; > – this gives 2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11849) NPE in HiveHBaseTableShapshotInputFormat in query with just count(*)
[ https://issues.apache.org/jira/browse/HIVE-11849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791221#comment-14791221 ] Enis Soztutar commented on HIVE-11849: -- Offline conversation with Jason, we have noted a couple of things: - HiveHBaseTableSnapshotInputFormat.java uses the mapred API while HiveHBaseTableInputFormat uses the mapreduce API. [~ndimiduk] I remember you were talking about specifically that all Hive IFs use mapred. Is that changed? - mapred version of the HBase's TableMapreduceUtil does not have the utility methods to pass the Scan serialized inside the job configuration. The only supported way is to set the Scan through now-deprecated {{TableInputFormat.COLUMN_LIST}}. - Although the {{mapred.TableMapreduceUtil}} does not support setting the serialized scan, we can still manually set it and have the TSIF work correctly since the mapred and mapreduce versions use the same underlying implementation ({{TableSnapshotInputFormatImpl}}). > NPE in HiveHBaseTableShapshotInputFormat in query with just count(*) > > > Key: HIVE-11849 > URL: https://issues.apache.org/jira/browse/HIVE-11849 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 1.3.0 >Reporter: Jason Dere > > Adding the following example as a qfile test in hbase-handler fails. Looks > like this may have been introduced by HIVE-5277. > {noformat} > SET hive.hbase.snapshot.name=src_hbase_snapshot; > SET hive.hbase.snapshot.restoredir=/tmp; > select count(*) from src_hbase; > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11791) Add unit test for HIVE-10122
[ https://issues.apache.org/jira/browse/HIVE-11791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Illya Yalovyy updated HIVE-11791: - Attachment: HIVE-11791.2.patch Updated expected results and fixed some issues with expression compaction logic. > Add unit test for HIVE-10122 > > > Key: HIVE-11791 > URL: https://issues.apache.org/jira/browse/HIVE-11791 > Project: Hive > Issue Type: Test > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Illya Yalovyy >Assignee: Illya Yalovyy >Priority: Minor > Attachments: HIVE-11791.2.patch, HIVE-11791.patch > > > Unit tests for PartitionPruner.compactExpr() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11820) export tables with size of >32MB throws "java.lang.IllegalArgumentException: Skip CRC is valid only with update options"
[ https://issues.apache.org/jira/browse/HIVE-11820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takahiko Saito updated HIVE-11820: -- Attachment: HIVE-11820.patch > export tables with size of >32MB throws "java.lang.IllegalArgumentException: > Skip CRC is valid only with update options" > > > Key: HIVE-11820 > URL: https://issues.apache.org/jira/browse/HIVE-11820 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Takahiko Saito >Assignee: Takahiko Saito > Fix For: 1.2.1 > > Attachments: HIVE-11820.patch > > > Tested a patch of HIVE-11607 and seeing the following exception: > {noformat} > 2015-09-14 21:44:16,817 ERROR [main]: exec.Task > (SessionState.java:printError(960)) - Failed with exception Skip CRC is valid > only with update options > java.lang.IllegalArgumentException: Skip CRC is valid only with update options > at > org.apache.hadoop.tools.DistCpOptions.validate(DistCpOptions.java:556) > at > org.apache.hadoop.tools.DistCpOptions.setSkipCRC(DistCpOptions.java:311) > at > org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1147) > at org.apache.hadoop.hive.common.FileUtils.copy(FileUtils.java:553) > at org.apache.hadoop.hive.ql.exec.CopyTask.execute(CopyTask.java:82) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1655) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1414) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) > at > 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > {noformat} > A possible resolution is to reverse the order of the following two lines from > a patch of HIVE-11607: > {noformat} > +options.setSkipCRC(true); > +options.setSyncFolder(true); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
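The order dependency behind the proposed fix can be sketched as follows: per the stack trace, DistCpOptions.setSkipCRC() triggers validate(), which rejects skip-CRC unless the update/sync-folder option is already set. The Python class below is an illustrative stand-in, not Hadoop's real DistCpOptions API.

```python
class CopyOptions:
    """Toy model of options whose setters validate against each other."""
    def __init__(self):
        self.sync_folder = False
        self.skip_crc = False

    def set_sync_folder(self, value):
        self.sync_folder = value

    def set_skip_crc(self, value):
        # validate() analog: skip-CRC only makes sense with update/sync.
        if value and not self.sync_folder:
            raise ValueError("Skip CRC is valid only with update options")
        self.skip_crc = value
```

With this shape, calling set_skip_crc(True) before set_sync_folder(True) raises, while the reversed order, as suggested at the end of the report, succeeds.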
[jira] [Commented] (HIVE-11834) Lineage doesn't work with dynamic partitioning query
[ https://issues.apache.org/jira/browse/HIVE-11834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791244#comment-14791244 ] Jimmy Xiang commented on HIVE-11834: Patch is on RB: https://reviews.apache.org/r/38442/ > Lineage doesn't work with dynamic partitioning query > > > Key: HIVE-11834 > URL: https://issues.apache.org/jira/browse/HIVE-11834 > Project: Hive > Issue Type: Bug >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11834.1.patch > > > As Mark found out, > https://issues.apache.org/jira/browse/HIVE-11139?focusedCommentId=14745937=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14745937 > This is indeed a code bug. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11846) CliDriver shutdown tries to drop index table again which was already dropped when dropping the original table
[ https://issues.apache.org/jira/browse/HIVE-11846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791294#comment-14791294 ] Xuefu Zhang commented on HIVE-11846: Got it. Thanks for the explanation. > CliDriver shutdown tries to drop index table again which was already dropped > when dropping the original table > -- > > Key: HIVE-11846 > URL: https://issues.apache.org/jira/browse/HIVE-11846 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong >Priority: Critical > Attachments: HIVE-11846.01.patch > > > Steps to repro: > {code} > set hive.stats.dbclass=fs; > set hive.stats.autogather=true; > set hive.cbo.enable=true; > DROP TABLE IF EXISTS aa; > CREATE TABLE aa (L_ORDERKEY INT, > L_PARTKEY INT, > L_SUPPKEY INT, > L_LINENUMBERINT, > L_QUANTITY DOUBLE, > L_EXTENDEDPRICE DOUBLE, > L_DISCOUNT DOUBLE, > L_TAX DOUBLE, > L_RETURNFLAGSTRING, > L_LINESTATUSSTRING, > l_shipdate STRING, > L_COMMITDATESTRING, > L_RECEIPTDATE STRING, > L_SHIPINSTRUCT STRING, > L_SHIPMODE STRING, > L_COMMENT STRING) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '|'; > LOAD DATA LOCAL INPATH '../../data/files/lineitem.txt' OVERWRITE INTO TABLE > aa; > CREATE INDEX aa_lshipdate_idx ON TABLE aa(l_shipdate) AS > 'org.apache.hadoop.hive.ql.index.AggregateIndexHandler' WITH DEFERRED REBUILD > IDXPROPERTIES("AGGREGATES"="count(l_shipdate)"); > ALTER INDEX aa_lshipdate_idx ON aa REBUILD; > show tables; > explain select l_shipdate, count(l_shipdate) > from aa > group by l_shipdate; > {code} > The problem is that, we create an index table default_aa_lshipdate_idx, > (default is the database name) and it comes after the table aa. Then, it > first drop aa, which will drop default_aa_lshipdate_idx as well as it is > related to aa. It will not find the table default_aa_lshipdate_idx when it > tries to drop it again, which will throw an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
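The shutdown hazard described above can be illustrated with a toy catalog: dropping the base table cascades to its index table, so a later explicit drop of the index table must tolerate "table not found" rather than throw. All names are illustrative; this is not Hive's metastore code.

```python
class Catalog:
    """Toy catalog where dropping a base table cascades to its index table."""
    def __init__(self):
        self.tables = {"aa", "default_aa_lshipdate_idx"}
        self.index_tables = {"aa": "default_aa_lshipdate_idx"}

    def drop_table(self, name, if_exists=False):
        if name not in self.tables:
            if if_exists:
                return False        # tolerate the already-cascaded drop
            raise LookupError("Table not found: " + name)
        self.tables.remove(name)
        # Cascade: dropping a base table also drops its index table.
        idx = self.index_tables.pop(name, None)
        if idx in self.tables:
            self.tables.remove(idx)
        return True
```

A shutdown sequence that drops aa first and then default_aa_lshipdate_idx only succeeds if the second drop passes if_exists=True (or the equivalent guard).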
[jira] [Commented] (HIVE-11839) Vectorization wrong results with filter of (CAST AS CHAR)
[ https://issues.apache.org/jira/browse/HIVE-11839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791300#comment-14791300 ] Xuefu Zhang commented on HIVE-11839: Could we update the fix versions please? > Vectorization wrong results with filter of (CAST AS CHAR) > - > > Key: HIVE-11839 > URL: https://issues.apache.org/jira/browse/HIVE-11839 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11839.01.patch > > > PROBLEM: > For query such as > select count(1) from table where CAST (id as CHAR(4))='1000'; > gives wrong results 0 than expected results. > STEPS TO REPRODUCE: > create table s1(id smallint) stored as orc; > insert into table s1 values (1000),(1001),(1002),(1003),(1000); > set hive.vectorized.execution.enabled=true; > select count(1) from s1 where cast(id as char(4))='1000'; > – this gives 0 > set hive.vectorized.execution.enabled=false; > select count(1) from s1 where cast(id as char(4))='1000'; > – this gives 2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8342) Potential null dereference in ColumnTruncateMapper#jobClose()
[ https://issues.apache.org/jira/browse/HIVE-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791307#comment-14791307 ] Lars Francke commented on HIVE-8342: Hey [~tedyu] I get notifications about this issue every once in a while because you seemingly change something but it looks like you're not actually changing anything. Is this a JIRA problem? > Potential null dereference in ColumnTruncateMapper#jobClose() > - > > Key: HIVE-8342 > URL: https://issues.apache.org/jira/browse/HIVE-8342 > Project: Hive > Issue Type: Bug >Reporter: Ted Yu >Assignee: skrho >Priority: Minor > Attachments: HIVE-8342_001.patch, HIVE-8342_002.patch > > > {code} > Utilities.mvFileToFinalPath(outputPath, job, success, LOG, dynPartCtx, > null, > reporter); > {code} > Utilities.mvFileToFinalPath() calls createEmptyBuckets() where conf is > dereferenced: > {code} > boolean isCompressed = conf.getCompressed(); > TableDesc tableInfo = conf.getTableInfo(); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11852) numRows and rawDataSize table properties are not replicated
[ https://issues.apache.org/jira/browse/HIVE-11852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-11852: Reporter: Paul Isaychuk (was: Sushanth Sowmyan) > numRows and rawDataSize table properties are not replicated > --- > > Key: HIVE-11852 > URL: https://issues.apache.org/jira/browse/HIVE-11852 > Project: Hive > Issue Type: Bug > Components: Import/Export >Affects Versions: 1.2.1 >Reporter: Paul Isaychuk >Assignee: Sushanth Sowmyan > > numRows and rawDataSize table properties are not replicated when exported for > replication and re-imported. > {code} > Table drdbnonreplicatabletable.vanillatable has different TblProps from > drdbnonreplicatabletable.vanillatable expected [{numFiles=1, numRows=2, > totalSize=560, rawDataSize=440}] but found [{numFiles=1, totalSize=560}] > java.lang.AssertionError: Table drdbnonreplicatabletable.vanillatable has > different TblProps from drdbnonreplicatabletable.vanillatable expected > [{numFiles=1, numRows=2, totalSize=560, rawDataSize=440}] but found > [{numFiles=1, totalSize=560}] > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11839) Vectorization wrong results with filter of (CAST AS CHAR)
[ https://issues.apache.org/jira/browse/HIVE-11839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11839: Fix Version/s: 2.0.0 1.3.0 > Vectorization wrong results with filter of (CAST AS CHAR) > - > > Key: HIVE-11839 > URL: https://issues.apache.org/jira/browse/HIVE-11839 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11839.01.patch > > > PROBLEM: > For query such as > select count(1) from table where CAST (id as CHAR(4))='1000'; > gives wrong results 0 than expected results. > STEPS TO REPRODUCE: > create table s1(id smallint) stored as orc; > insert into table s1 values (1000),(1001),(1002),(1003),(1000); > set hive.vectorized.execution.enabled=true; > select count(1) from s1 where cast(id as char(4))='1000'; > – this gives 0 > set hive.vectorized.execution.enabled=false; > select count(1) from s1 where cast(id as char(4))='1000'; > – this gives 2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11842) Improve RuleRegExp by caching some internal data structures
[ https://issues.apache.org/jira/browse/HIVE-11842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791385#comment-14791385 ] Hive QA commented on HIVE-11842: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12756271/HIVE-11842.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9447 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5305/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5305/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5305/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12756271 - PreCommit-HIVE-TRUNK-Build > Improve RuleRegExp by caching some internal data structures > --- > > Key: HIVE-11842 > URL: https://issues.apache.org/jira/browse/HIVE-11842 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Fix For: 2.0.0 > > Attachments: HIVE-11842.patch > > > Continuing work started in HIVE-11141. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11553) use basic file metadata cache in ETLSplitStrategy-related paths
[ https://issues.apache.org/jira/browse/HIVE-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791429#comment-14791429 ] Sergey Shelukhin commented on HIVE-11553: - [~gopalv] [~prasanth_j] can you please review this? Note that this is stage 1, before PPD. PPD is stage 2 :) Unfortunately my local branches are a clusterfuck by now and everything now depends on this patch, so makes it hard to make progress. > use basic file metadata cache in ETLSplitStrategy-related paths > --- > > Key: HIVE-11553 > URL: https://issues.apache.org/jira/browse/HIVE-11553 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: hbase-metastore-branch > > Attachments: HIVE-11553.01.patch, HIVE-11553.02.patch, > HIVE-11553.patch > > > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8327) mvn site -Pfindbugs
[ https://issues.apache.org/jira/browse/HIVE-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791280#comment-14791280 ] Gopal V commented on HIVE-8327: --- Yes, I already added a doc. https://cwiki.apache.org/confluence/display/Hive/HiveDeveloperFAQ#HiveDeveloperFAQ-Howtorunfindbugsafterachange? > mvn site -Pfindbugs > --- > > Key: HIVE-8327 > URL: https://issues.apache.org/jira/browse/HIVE-8327 > Project: Hive > Issue Type: Test > Components: Diagnosability >Reporter: Gopal V >Assignee: Gopal V > Fix For: 1.1.0 > > Attachments: HIVE-8327.1.patch, HIVE-8327.2.patch, ql-findbugs.html > > > HIVE-3099 originally added findbugs into the old ant build. > Get basic findbugs working for the maven build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11512) Hive LDAP Authenticator should also support full DN in Authenticate()
[ https://issues.apache.org/jira/browse/HIVE-11512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791290#comment-14791290 ]

Hive QA commented on HIVE-11512:

{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12756286/HIVE-11512.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9445 tests executed

*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5304/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5304/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5304/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12756286 - PreCommit-HIVE-TRUNK-Build

> Hive LDAP Authenticator should also support full DN in Authenticate()
> ---------------------------------------------------------------------
>
>          Key: HIVE-11512
>          URL: https://issues.apache.org/jira/browse/HIVE-11512
>      Project: Hive
>   Issue Type: Improvement
>   Components: HiveServer2
> Affects Versions: 1.1.0
>     Reporter: Naveen Gangam
>     Assignee: Naveen Gangam
>     Priority: Minor
>  Attachments: HIVE-11512.patch
>
> In certain LDAP implementations, LDAP binding can occur using the full DN for the user. Currently, the LDAP authentication provider assumes that the username passed into Authenticate() is a short username and not a full DN. While the initial bind works fine either way, the filter code relies on the username being a short name.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
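The short-username-vs-full-DN distinction the description draws can be illustrated with a small standalone sketch. This is not Hive's actual LDAP provider code; the class and method names here are hypothetical. The idea: treat a bind name containing `=` as a DN, and extract the leading RDN value as the short name before running user/group filters.

```java
// Hypothetical sketch of normalizing a bind name to a short username
// before filter evaluation. NOT the actual Hive LDAP provider code.
public class LdapNameSketch {

    /** Heuristic: a value like "uid=user1,ou=people,dc=example,dc=com" is a full DN. */
    public static boolean isFullDn(String user) {
        return user != null && user.contains("=");
    }

    /** Extracts the short name from a DN ("uid=user1,ou=..." -> "user1"); passes a plain name through. */
    public static String shortName(String user) {
        if (!isFullDn(user)) {
            return user;
        }
        String firstRdn = user.split(",", 2)[0];                  // e.g. "uid=user1"
        return firstRdn.substring(firstRdn.indexOf('=') + 1).trim();
    }

    public static void main(String[] args) {
        System.out.println(shortName("uid=user1,ou=people,dc=example,dc=com")); // user1
        System.out.println(shortName("user1"));                                  // user1
    }
}
```

With this normalization, the initial bind can use whatever form the client supplied, while the filter code always sees a short name.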
[jira] [Comment Edited] (HIVE-11839) Vectorization wrong results with filter of (CAST AS CHAR)
[ https://issues.apache.org/jira/browse/HIVE-11839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791300#comment-14791300 ]

Xuefu Zhang edited comment on HIVE-11839 at 9/16/15 11:32 PM:
--------------------------------------------------------------

Could we update the fix versions please? Also, affected versions.

was (Author: xuefuz): Could we update the fix versions please?

> Vectorization wrong results with filter of (CAST AS CHAR)
> ---------------------------------------------------------
>
>          Key: HIVE-11839
>          URL: https://issues.apache.org/jira/browse/HIVE-11839
>      Project: Hive
>   Issue Type: Bug
>   Components: Hive
>     Reporter: Matt McCline
>     Assignee: Matt McCline
>     Priority: Critical
>  Attachments: HIVE-11839.01.patch
>
> PROBLEM:
> A query such as
> {noformat}
> select count(1) from table where CAST(id AS CHAR(4)) = '1000';
> {noformat}
> returns the wrong result (0) instead of the expected count.
> STEPS TO REPRODUCE:
> {noformat}
> create table s1(id smallint) stored as orc;
> insert into table s1 values (1000),(1001),(1002),(1003),(1000);
> set hive.vectorized.execution.enabled=true;
> select count(1) from s1 where cast(id as char(4))='1000';
> -- this gives 0
> set hive.vectorized.execution.enabled=false;
> select count(1) from s1 where cast(id as char(4))='1000';
> -- this gives 2
> {noformat}
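For reference, the literal '1000' is exactly four characters, and CHAR comparison disregards trailing pad spaces, so both rows with id = 1000 should match; the non-vectorized count of 2 is the correct answer. Below is a minimal plain-Java model of the expected filter semantics over the inserted rows (helper names are hypothetical, and this is not Hive's vectorized code path):

```java
import java.util.Arrays;

// Models the expected (non-vectorized) semantics of
//   select count(1) from s1 where cast(id as char(4)) = '1000'
// over the rows inserted in the repro. Helper names are hypothetical.
public class CastCharFilterSketch {

    /** CAST(id AS CHAR(4)): render the value and right-pad with spaces to length 4. */
    public static String castToChar4(int id) {
        StringBuilder sb = new StringBuilder(String.valueOf(id));
        while (sb.length() < 4) {
            sb.append(' ');
        }
        return sb.toString();
    }

    /** CHAR comparison disregards trailing pad spaces. */
    public static boolean charEquals(String a, String b) {
        return stripTrailingSpaces(a).equals(stripTrailingSpaces(b));
    }

    private static String stripTrailingSpaces(String s) {
        int end = s.length();
        while (end > 0 && s.charAt(end - 1) == ' ') {
            end--;
        }
        return s.substring(0, end);
    }

    public static long countMatching(int[] ids, String literal) {
        return Arrays.stream(ids)
                     .filter(id -> charEquals(castToChar4(id), literal))
                     .count();
    }

    public static void main(String[] args) {
        int[] ids = {1000, 1001, 1002, 1003, 1000};      // rows inserted in the repro
        System.out.println(countMatching(ids, "1000"));  // 2, matching the non-vectorized result
    }
}
```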
[jira] [Updated] (HIVE-11820) export tables with size of >32MB throws "java.lang.IllegalArgumentException: Skip CRC is valid only with update options"
[ https://issues.apache.org/jira/browse/HIVE-11820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Takahiko Saito updated HIVE-11820:
----------------------------------
Fix Version/s: (was: 1.2.1)

> export tables with size of >32MB throws "java.lang.IllegalArgumentException: Skip CRC is valid only with update options"
> ------------------------------------------------------------------------------------------------------------------------
>
>          Key: HIVE-11820
>          URL: https://issues.apache.org/jira/browse/HIVE-11820
>      Project: Hive
>   Issue Type: Bug
>   Components: Hive
>     Reporter: Takahiko Saito
>     Assignee: Takahiko Saito
>  Attachments: HIVE-11820.patch
>
> Tested a patch of HIVE-11607 and saw the following exception:
> {noformat}
> 2015-09-14 21:44:16,817 ERROR [main]: exec.Task (SessionState.java:printError(960)) - Failed with exception Skip CRC is valid only with update options
> java.lang.IllegalArgumentException: Skip CRC is valid only with update options
>         at org.apache.hadoop.tools.DistCpOptions.validate(DistCpOptions.java:556)
>         at org.apache.hadoop.tools.DistCpOptions.setSkipCRC(DistCpOptions.java:311)
>         at org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1147)
>         at org.apache.hadoop.hive.common.FileUtils.copy(FileUtils.java:553)
>         at org.apache.hadoop.hive.ql.exec.CopyTask.execute(CopyTask.java:82)
>         at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
>         at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
>         at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1655)
>         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1414)
>         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>         at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> A possible resolution is to reverse the order of the following two lines from the patch of HIVE-11607:
> {noformat}
> +options.setSkipCRC(true);
> +options.setSyncFolder(true);
> {noformat}
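The stack trace shows that setSkipCRC() itself triggers validate(), which rejects skip-CRC unless an update-style option (sync folder) has already been enabled; that is why reordering the two calls matters. A simplified, self-contained model of that order-sensitive validation (this is not the real org.apache.hadoop.tools.DistCpOptions class, only an illustration of the behavior implied by the trace):

```java
// Simplified model of DistCpOptions' order-sensitive validation, showing
// why setSyncFolder(true) must precede setSkipCRC(true). NOT the real
// org.apache.hadoop.tools.DistCpOptions class.
public class DistCpOptionsModel {
    private boolean syncFolder;
    private boolean skipCRC;

    public void setSyncFolder(boolean syncFolder) {
        this.syncFolder = syncFolder;
    }

    public void setSkipCRC(boolean skipCRC) {
        this.skipCRC = skipCRC;
        validate(); // per the stack trace, setSkipCRC() validates immediately
    }

    /** Mirrors the check that produced the exception in the stack trace above. */
    private void validate() {
        if (skipCRC && !syncFolder) {
            throw new IllegalArgumentException("Skip CRC is valid only with update options");
        }
    }

    public boolean shouldSkipCRC() {
        return skipCRC;
    }

    public static void main(String[] args) {
        DistCpOptionsModel options = new DistCpOptionsModel();
        options.setSyncFolder(true);   // update-style option first...
        options.setSkipCRC(true);      // ...then skip-CRC passes validation
        System.out.println(options.shouldSkipCRC()); // true

        DistCpOptionsModel broken = new DistCpOptionsModel();
        try {
            broken.setSkipCRC(true);   // the original order from the HIVE-11607 patch
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // Skip CRC is valid only with update options
        }
    }
}
```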
[jira] [Updated] (HIVE-11820) export tables with size of >32MB throws "java.lang.IllegalArgumentException: Skip CRC is valid only with update options"
[ https://issues.apache.org/jira/browse/HIVE-11820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Takahiko Saito updated HIVE-11820:
----------------------------------
Affects Version/s: (was: 1.2.1)
[jira] [Commented] (HIVE-11849) NPE in HiveHBaseTableShapshotInputFormat in query with just count(*)
[ https://issues.apache.org/jira/browse/HIVE-11849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791247#comment-14791247 ]

Nick Dimiduk commented on HIVE-11849:

Yeah, IIRC, this stuff is a big mess of the two different mapred APIs across the two projects. Have a look at some of the linked issues from HIVE-6584. Notable TODO items were HBASE-11179, HBASE-11163 and HIVE-7534.

> NPE in HiveHBaseTableShapshotInputFormat in query with just count(*)
> --------------------------------------------------------------------
>
>          Key: HIVE-11849
>          URL: https://issues.apache.org/jira/browse/HIVE-11849
>      Project: Hive
>   Issue Type: Bug
>   Components: HBase Handler
> Affects Versions: 1.3.0
>     Reporter: Jason Dere
>
> Adding the following example as a qfile test in hbase-handler fails. Looks like this may have been introduced by HIVE-5277.
> {noformat}
> SET hive.hbase.snapshot.name=src_hbase_snapshot;
> SET hive.hbase.snapshot.restoredir=/tmp;
> select count(*) from src_hbase;
> {noformat}