[jira] [Updated] (HIVE-9169) UT: set hive.support.concurrency to true for spark UTs

2015-09-16 Thread Bing Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bing Li updated HIVE-9169:
--
Assignee: (was: Bing Li)

> UT: set hive.support.concurrency to true for spark UTs
> --
>
> Key: HIVE-9169
> URL: https://issues.apache.org/jira/browse/HIVE-9169
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: spark-branch
>Reporter: Thomas Friedrich
>Priority: Minor
>
> The test cases 
> lock1
> lock2
> lock3
> lock4 
> are failing because the flag hive.support.concurrency is set to false in the 
> hive-site.xml for the spark tests.
> This value was set to true in trunk with HIVE-1293 when these test cases were 
> introduced to Hive.
> After setting the value to true and generating the output files, the test 
> cases are successful.
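As a sketch of the change described above, the flag would be enabled in the spark
tests' hive-site.xml roughly as follows (the property name comes from the report;
the surrounding file layout is assumed):

{code}
<!-- Enable concurrency support for the Spark test configuration -->
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
{code}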



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11649) Hive UPDATE,INSERT,DELETE issue

2015-09-16 Thread Veerendra Nath Jasthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Veerendra Nath Jasthi updated HIVE-11649:
-
Attachment: hive-site.xml

Hi Alan,

PFA.




> Hive UPDATE,INSERT,DELETE issue
> ---
>
> Key: HIVE-11649
> URL: https://issues.apache.org/jira/browse/HIVE-11649
> Project: Hive
>  Issue Type: Bug
> Environment: Hadoop-2.2.0 , hive-1.2.0 ,operating system 
> ubuntu14.04lts (64-bit) & Java 1.7
>Reporter: Veerendra Nath Jasthi
>Assignee: Hive QA
> Attachments: afterChange.png, beforeChange.png, hive-site.xml, 
> hive.log
>
>
> I have been trying to implement the UPDATE, INSERT, and DELETE operations on a 
> Hive table as per the link: 
> https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-
>  
> But whenever I include the properties required for this, i.e. the 
> configuration values to set for INSERT, UPDATE, DELETE: 
> hive.support.concurrency  true (default is false) 
> hive.enforce.bucketing  true (default is false) 
> hive.exec.dynamic.partition.mode  nonstrict (default is strict) 
> then running the show tables command on the Hive shell takes 65.15 
> seconds, whereas it normally runs in 0.18 seconds without the above properties. 
> Apart from show tables, the rest of the commands give no output, i.e. they 
> keep running until the process is killed.
> Could you tell me the reason for this?
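For reference, the three settings from the report can be sketched as
session-level commands in the Hive shell (equivalently they can be placed in
hive-site.xml):

{code}
set hive.support.concurrency=true;               -- default is false
set hive.enforce.bucketing=true;                 -- default is false
set hive.exec.dynamic.partition.mode=nonstrict;  -- default is strict
{code}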



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11825) get_json_object(col,'$.a') is null in where clause didn`t work

2015-09-16 Thread Feng Yuan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746983#comment-14746983
 ] 

Feng Yuan commented on HIVE-11825:
--

Thank you for the detailed reply. I will try your second way. If you have time, 
could you please commit your patch for "ALLOW_BACKSLASH_ESCAPING_ANY_CHARACTER"? 
Thank you!



> get_json_object(col,'$.a') is null in where clause didn`t work
> --
>
> Key: HIVE-11825
> URL: https://issues.apache.org/jira/browse/HIVE-11825
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 0.14.0
>Reporter: Feng Yuan
>Priority: Critical
> Fix For: 0.14.1
>
>
> example:
> select attr from raw_kafka_item_dt0 where l_date='2015-09-06' and 
> customer='Czgc_news' and get_json_object(attr,'$.title') is NULL limit 10;
> but in results,title is still not null!
> {"title":"思科Q4收入估$79.2亿 
> 前景阴云笼罩","ItemType":"NewsBase","keywords":"思科Q4收入估\$79.2亿 
> 前景阴云笼罩","random":"1420253511075","callback":"BCore.instances[2].callbacks[1]","user_agent":"Mozilla/5.0
>  (iPhone; U; CPU iPhone OS 4_2_1 like Mac OS X; en-us) AppleWebKit/533.17.9 
> (KHTML; like Gecko) Version/5.0.2 Mobile/8C148 
> Safari/6533.18.5","is_newgid":"false","uuid":"DS.Input:b56c782bcb75035d:2116:003dcd40:54a75947","ptime":"1.1549997E9"}
>  
> attr is a dict



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11110) Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, improve Filter selectivity estimation

2015-09-16 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-11110:
--
Attachment: HIVE-11110.13.patch

> Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, 
> improve Filter selectivity estimation
> 
>
> Key: HIVE-11110
> URL: https://issues.apache.org/jira/browse/HIVE-11110
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-11110-10.patch, HIVE-11110-11.patch, 
> HIVE-11110-12.patch, HIVE-11110-branch-1.2.patch, HIVE-11110.1.patch, 
> HIVE-11110.13.patch, HIVE-11110.2.patch, HIVE-11110.4.patch, 
> HIVE-11110.5.patch, HIVE-11110.6.patch, HIVE-11110.7.patch, 
> HIVE-11110.8.patch, HIVE-11110.9.patch, HIVE-11110.91.patch, 
> HIVE-11110.92.patch, HIVE-11110.patch
>
>
> Query
> {code}
> select  count(*)
>  from store_sales
>  ,store_returns
>  ,date_dim d1
>  ,date_dim d2
>  where d1.d_quarter_name = '2000Q1'
>and d1.d_date_sk = ss_sold_date_sk
>and ss_customer_sk = sr_customer_sk
>and ss_item_sk = sr_item_sk
>and ss_ticket_number = sr_ticket_number
>and sr_returned_date_sk = d2.d_date_sk
>and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3');
> {code}
> The store_sales table is partitioned on ss_sold_date_sk, which is also used 
> in a join clause. The join clause should add a filter "filterExpr: 
> ss_sold_date_sk is not null", which should get pushed to the MetaStore when 
> fetching the stats. Currently this is not done in CBO planning, which results 
> in the stats from __HIVE_DEFAULT_PARTITION__ being fetched and considered in 
> the optimization phase. In particular, this increases the NDV for the join 
> columns and may result in wrong planning.
> Including HiveJoinAddNotNullRule in the optimization phase solves this issue.
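As a sketch of the behavior described above, the not-null condition implied by
the equi-join condition d1.d_date_sk = ss_sold_date_sk would be added as a
filter on the partitioned input (column names come from the query; the exact
plan form is an assumption):

{code}
-- Null join keys cannot match, so the optimizer may add this filter and
-- push it to the MetaStore, excluding __HIVE_DEFAULT_PARTITION__ from the
-- stats that are fetched:
SELECT count(*)
FROM store_sales
WHERE ss_sold_date_sk IS NOT NULL;
{code}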



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11838) Another positive test case for HIVE-11658

2015-09-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11838:
-
Attachment: HIVE-11838.patch

[~deepesh] fyi..

> Another positive test case for HIVE-11658
> -
>
> Key: HIVE-11838
> URL: https://issues.apache.org/jira/browse/HIVE-11838
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Deepesh Khandelwal
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11838.patch
>
>
> We can add additional positive test coverage for HIVE-11658 covering load 
> directory to text partition. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11838) Another positive test case for HIVE-11658

2015-09-16 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747004#comment-14747004
 ] 

Prasanth Jayachandran commented on HIVE-11838:
--

This patch just adds an additional test. I don't think we need a full precommit 
run for this patch.

> Another positive test case for HIVE-11658
> -
>
> Key: HIVE-11838
> URL: https://issues.apache.org/jira/browse/HIVE-11838
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Deepesh Khandelwal
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11838.patch
>
>
> We can add additional positive test coverage for HIVE-11658 covering load 
> directory to text partition. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-4243) Fix column names in FileSinkOperator

2015-09-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-4243:

Affects Version/s: 2.0.0
   1.3.0

> Fix column names in FileSinkOperator
> 
>
> Key: HIVE-4243
> URL: https://issues.apache.org/jira/browse/HIVE-4243
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-4243.patch, HIVE-4243.patch, HIVE-4243.tmp.patch
>
>
> All of the ObjectInspectors given to SerDes by FileSinkOperator have virtual 
> column names. Since the files are part of tables, Hive knows the column 
> names. For self-describing file formats like ORC, having the real column 
> names will improve understandability.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-5623) ORC accessing array column that's empty will fail with java out of bound exception

2015-09-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-5623:

Fix Version/s: (was: 0.13)
   1.3.0

> ORC accessing array column that's empty will fail with java out of bound 
> exception
> --
>
> Key: HIVE-5623
> URL: https://issues.apache.org/jira/browse/HIVE-5623
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.11.0
>Reporter: Eric Chu
>Assignee: Prasanth Jayachandran
>Priority: Critical
>  Labels: orcfile
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-5623.patch
>
>
> In our ORC tests we saw that queries that work on RCFile failed on the 
> corresponding ORC version with a Java IndexOutOfBoundsException in 
> OrcStruct.java. The queries failed because the table has an array-type column 
> and there are rows with an empty array. We noticed that the getList(Object 
> list, int i) method in OrcStruct.java simply returns the i-th element from 
> the list without checking whether the list is null or whether i is within the 
> valid range. After fixing that, the queries run fine. The fix is really 
> simple, but maybe there are other similar cases that need to be handled.
> The fix is to check if listObj is null and if i falls within range:
> {code}
> public Object getListElement(Object listObj, int i) {
>   if (listObj == null) {
>     return null;
>   }
>   List list = (List) listObj;
>   if (i < 0 || i >= list.size()) {
>     return null;
>   }
>   return list.get(i);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11700) exception in logs in Tez test with new logger

2015-09-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11700:
-
Affects Version/s: 2.0.0

> exception in logs in Tez test with new logger
> -
>
> Key: HIVE-11700
> URL: https://issues.apache.org/jira/browse/HIVE-11700
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Sergey Shelukhin
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11700.patch, HIVE-11700.patch
>
>
> {noformat}
> 2015-08-31 11:27:47,400 WARN Error while converting string 
> [${sys:hive.ql.log.PerfLogger.level}] to type [class 
> org.apache.logging.log4j.Level]. Using default value [null]. 
> java.lang.IllegalArgumentException: Unknown level constant 
> [${SYS:HIVE.QL.LOG.PERFLOGGER.LEVEL}].
>at org.apache.logging.log4j.Level.valueOf(Level.java:286)
>at 
> org.apache.logging.log4j.core.config.plugins.convert.TypeConverters$LevelConverter.convert(TypeConverters.java:230)
>at 
> org.apache.logging.log4j.core.config.plugins.convert.TypeConverters$LevelConverter.convert(TypeConverters.java:226)
>at 
> org.apache.logging.log4j.core.config.plugins.convert.TypeConverters.convert(TypeConverters.java:336)
>at 
> org.apache.logging.log4j.core.config.plugins.visitors.AbstractPluginVisitor.convert(AbstractPluginVisitor.java:130)
>at 
> org.apache.logging.log4j.core.config.plugins.visitors.PluginAttributeVisitor.visit(PluginAttributeVisitor.java:45)
>at 
> org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.generateParameters(PluginBuilder.java:247)
>at 
> org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:136)
>at 
> org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:766)
>at 
> org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:706)
>at 
> org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:698)
>at 
> org.apache.logging.log4j.core.config.AbstractConfiguration.doConfigure(AbstractConfiguration.java:358)
>at 
> org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:161)
>at 
> org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:361)
>at 
> org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:426)
>at 
> org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:442)
>at 
> org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:138)
>at 
> org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:147)
>at 
> org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:41)
>at org.apache.logging.log4j.LogManager.getContext(LogManager.java:175)
>at 
> org.apache.logging.log4j.spi.AbstractLoggerAdapter.getContext(AbstractLoggerAdapter.java:102)
>at org.apache.logging.log4j.jcl.LogAdapter.getContext(LogAdapter.java:39)
>at 
> org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:42)
>at 
> org.apache.logging.log4j.jcl.LogFactoryImpl.getInstance(LogFactoryImpl.java:40)
>at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:671)
>at org.apache.hadoop.hive.ql.QTestUtil.(QTestUtil.java:122)
>at 
> org.apache.hadoop.hive.cli.TestMiniTezCliDriver.(TestMiniTezCliDriver.java:33)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11732) LLAP: MiniLlapCluster integration broke hadoop-1 build

2015-09-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11732:
-
Fix Version/s: llap

> LLAP: MiniLlapCluster integration broke hadoop-1 build
> --
>
> Key: HIVE-11732
> URL: https://issues.apache.org/jira/browse/HIVE-11732
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: llap
>
> Attachments: HIVE-11732.1.patch, HIVE-11732.2.patch
>
>
> HIVE-9900 broke hadoop-1 build. Needs shimming for MiniLlapCluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11839) Vectorization wrong results with filter of (CAST AS CHAR)

2015-09-16 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11839:

Attachment: HIVE-11839.01.patch

> Vectorization wrong results with filter of (CAST AS CHAR)
> -
>
> Key: HIVE-11839
> URL: https://issues.apache.org/jira/browse/HIVE-11839
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11839.01.patch
>
>
> PROBLEM:
> A query such as
> select count(1) from table where CAST (id as CHAR(4))='1000';
> gives a wrong result of 0 instead of the expected count.
> STEPS TO REPRODUCE:
> create table s1(id smallint) stored as orc;
> insert into table s1 values (1000),(1001),(1002),(1003),(1000);
> set hive.vectorized.execution.enabled=true;
> select count(1) from s1 where cast(id as char(4))='1000';
> -- this gives 0
> set hive.vectorized.execution.enabled=false;
> select count(1) from s1 where cast(id as char(4))='1000';
> -- this gives 2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10819) SearchArgumentImpl for Timestamp is broken by HIVE-10286

2015-09-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10819:
-
Affects Version/s: 1.3.0
   1.2.0

> SearchArgumentImpl for Timestamp is broken by HIVE-10286
> 
>
> Key: HIVE-10819
> URL: https://issues.apache.org/jira/browse/HIVE-10819
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0, 1.3.0
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 1.2.1
>
> Attachments: HIVE-10819.1.patch, HIVE-10819.2.patch, 
> HIVE-10819.3.patch, HIVE-10819.4.patch
>
>
> The workaround for the Kryo bug for Timestamp was accidentally removed by 
> HIVE-10286. Need to bring it back.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11822) vectorize NVL UDF

2015-09-16 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747076#comment-14747076
 ] 

Takanobu Asanuma commented on HIVE-11822:
-

[~gopalv]
Thanks for the assignment and kind advice.

> vectorize NVL UDF
> -
>
> Key: HIVE-11822
> URL: https://issues.apache.org/jira/browse/HIVE-11822
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Takanobu Asanuma
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11826) 'hadoop.proxyuser.hive.groups' configuration doesn't prevent unauthorized user to access metastore

2015-09-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747167#comment-14747167
 ] 

Hive QA commented on HIVE-11826:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12756028/HIVE-11826.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9445 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5292/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5292/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5292/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12756028 - PreCommit-HIVE-TRUNK-Build

> 'hadoop.proxyuser.hive.groups' configuration doesn't prevent unauthorized 
> user to access metastore
> --
>
> Key: HIVE-11826
> URL: https://issues.apache.org/jira/browse/HIVE-11826
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-11826.patch
>
>
> With 'hadoop.proxyuser.hive.groups' configured in core-site.xml to certain 
> groups, currently a job run with a user not belonging to those groups does 
> not fail to access the metastore as it should. With the old version Hive 
> 0.13, it actually fails properly. 
> It seems HadoopThriftAuthBridge20S.java correctly calls ProxyUsers.authorize() 
> while HadoopThriftAuthBridge23 does not. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9523) For partitioned tables same optimizations should be available as for bucketed tables and vice versa: ①[Sort Merge] PARTITION Map join and ②BUCKET pruning

2015-09-16 Thread Maciek Kocon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maciek Kocon updated HIVE-9523:
---
Description: 
Logically and functionally bucketing and partitioning are quite similar - both 
provide mechanism to segregate and separate the table's data based on its 
content. Thanks to that significant further optimisations like [partition] 
PRUNING or [bucket] MAP JOIN are possible.
The difference seems to be imposed by design where the PARTITIONing is 
open/explicit while BUCKETing is discrete/implicit.
Partitioning seems to be very common if not a standard feature in all current 
RDBMS while BUCKETING seems to be HIVE specific only.
In a way BUCKETING could be also called by "hashing" or simply "IMPLICIT 
PARTITIONING".

Regardless of the fact that these two are recognised as two separate features 
available in Hive there should be nothing to prevent leveraging same existing 
query/join optimisations across the two.


①[Sort Merge] PARTITION Map join (no progress yet)
Enable Bucket Map Join or better, the Sort Merge Bucket Map Join equivalent 
optimisations when PARTITIONING is used exclusively or in combination with 
BUCKETING.

For JOIN conditions where partitioning criteria are used respectively:
⋮ 
FROM TabA JOIN TabB
   ON TabA.partCol1 = TabB.partCol2
   AND TabA.partCol2 = TabB.partCol2

the optimizer could/should choose to treat it the same way as with bucketed 
tables: ⋮ 
FROM TabC
  JOIN TabD
 ON TabC.clusteredByCol1 = TabD.clusteredByCol2
   AND TabC.clusteredByCol2 = TabD.clusteredByCol2

and use either Bucket Map Join or better, the Sort Merge Bucket Map Join. The 
latter would require capability to create sorted partitions first.

This is based on fact that same way as buckets translate to separate files, the 
partitions essentially provide the same mapping.
When data locality is known the optimizer could focus only on joining 
corresponding partitions rather than whole data sets.

②BUCKET pruning (taken care by 
[HIVE-11525|https://issues.apache.org/jira/browse/HIVE-11525])
Enable partition PRUNING equivalent optimisation for queries on BUCKETED tables

Simplest example is for queries like:
"SELECT … FROM x WHERE colA=123123"
to read only the relevant bucket file rather than all file-buckets that belong 
to a table.

  was:
Logically and functionally bucketing and partitioning are quite similar - both 
provide mechanism to segregate and separate the table's data based on its 
content. Thanks to that significant further optimisations like [partition] 
PRUNING or [bucket] MAP JOIN are possible.
The difference seems to be imposed by design where the PARTITIONing is 
open/explicit while BUCKETing is discrete/implicit.
Partitioning seems to be very common if not a standard feature in all current 
RDBMS while BUCKETING seems to be HIVE specific only.
In a way BUCKETING could be also called by "hashing" or simply "IMPLICIT 
PARTITIONING".

Regardless of the fact that these two are recognised as two separate features 
available in Hive there should be nothing to prevent leveraging same existing 
query/join optimisations across the two.


①[Sort Merge] PARTITION Map join
Enable Bucket Map Join or better, the Sort Merge Bucket Map Join equivalent 
optimisations when PARTITIONING is used exclusively or in combination with 
BUCKETING.

For JOIN conditions where partitioning criteria are used respectively:
⋮ 
FROM TabA JOIN TabB
   ON TabA.partCol1 = TabB.partCol2
   AND TabA.partCol2 = TabB.partCol2

the optimizer could/should choose to treat it the same way as with bucketed 
tables: ⋮ 
FROM TabC
  JOIN TabD
 ON TabC.clusteredByCol1 = TabD.clusteredByCol2
   AND TabC.clusteredByCol2 = TabD.clusteredByCol2

and use either Bucket Map Join or better, the Sort Merge Bucket Map Join.

This is based on fact that same way as buckets translate to separate files, the 
partitions essentially provide the same mapping.
When data locality is known the optimizer could focus only on joining 
corresponding partitions rather than whole data sets.

②BUCKET pruning
Enable partition PRUNING equivalent optimisation for queries on BUCKETED tables

Simplest example is for queries like:
"SELECT … FROM x WHERE colA=123123"
to read only the relevant bucket file rather than all file-buckets that belong 
to a table.


> For partitioned tables same optimizations should be available as for bucketed 
> tables and vice versa: ①[Sort Merge] PARTITION Map join and ②BUCKET pruning
> -
>
> Key: HIVE-9523
> URL: https://issues.apache.org/jira/browse/HIVE-9523
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer, Physical Optimizer, SQL
>Affects Versions: 0.13.0, 0.14.0, 

[jira] [Commented] (HIVE-11825) get_json_object(col,'$.a') is null in where clause didn`t work

2015-09-16 Thread Feng Yuan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747226#comment-14747226
 ] 

Feng Yuan commented on HIVE-11825:
--

Thank you so much, it works!

> get_json_object(col,'$.a') is null in where clause didn`t work
> --
>
> Key: HIVE-11825
> URL: https://issues.apache.org/jira/browse/HIVE-11825
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 0.14.0
>Reporter: Feng Yuan
>Priority: Critical
> Fix For: 0.14.1
>
> Attachments: HIVE-11825.patch
>
>
> example:
> select attr from raw_kafka_item_dt0 where l_date='2015-09-06' and 
> customer='Czgc_news' and get_json_object(attr,'$.title') is NULL limit 10;
> but in results,title is still not null!
> {"title":"思科Q4收入估$79.2亿 
> 前景阴云笼罩","ItemType":"NewsBase","keywords":"思科Q4收入估\$79.2亿 
> 前景阴云笼罩","random":"1420253511075","callback":"BCore.instances[2].callbacks[1]","user_agent":"Mozilla/5.0
>  (iPhone; U; CPU iPhone OS 4_2_1 like Mac OS X; en-us) AppleWebKit/533.17.9 
> (KHTML; like Gecko) Version/5.0.2 Mobile/8C148 
> Safari/6533.18.5","is_newgid":"false","uuid":"DS.Input:b56c782bcb75035d:2116:003dcd40:54a75947","ptime":"1.1549997E9"}
>  
> attr is a dict



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10651) ORC file footer cache should be bounded

2015-09-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10651:
-
Affects Version/s: 2.0.0

> ORC file footer cache should be bounded
> ---
>
> Key: HIVE-10651
> URL: https://issues.apache.org/jira/browse/HIVE-10651
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Mostafa Mokhtar
>Assignee: Prasanth Jayachandran
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-10651.1.patch
>
>
> ORC's file footer cache is currently unbounded and is a soft-reference cache. 
> The cache size obtained from the config is only used to set the initial 
> capacity. We should bound the cache to keep it from growing too big and to 
> get predictable performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11758) Querying nested parquet columns is case sensitive

2015-09-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747054#comment-14747054
 ] 

Hive QA commented on HIVE-11758:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12756020/HIVE-11758.2.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9445 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.streaming.TestStreaming.testTimeOutReaper
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5291/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5291/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5291/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12756020 - PreCommit-HIVE-TRUNK-Build

> Querying nested parquet columns is case sensitive
> -
>
> Key: HIVE-11758
> URL: https://issues.apache.org/jira/browse/HIVE-11758
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.1.0, 1.1.1, 1.2.1
>Reporter: Jakub Kukul
>Priority: Minor
> Attachments: HIVE-11758.2.patch, HIVE-11758.patch
>
>
> Querying nested parquet columns (columns within a {{STRUCT}}) is case 
> sensitive. It should be case insensitive, to be compatible with querying 
> non-nested columns and querying nested columns with other file formats.
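A hypothetical illustration of the reported behavior (the table and field
names are invented for the example):

{code}
CREATE TABLE t (s STRUCT<myField: INT>) STORED AS PARQUET;

-- Matches the case used in the Parquet schema: works.
SELECT s.myField FROM t;

-- Differs only in case; on the affected versions this misbehaves for
-- Parquet, although Hive column resolution is otherwise case-insensitive.
SELECT s.myfield FROM t;
{code}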



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11825) get_json_object(col,'$.a') is null in where clause didn`t work

2015-09-16 Thread Cazen Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cazen Lee updated HIVE-11825:
-
Attachment: HIVE-11825.patch

> get_json_object(col,'$.a') is null in where clause didn`t work
> --
>
> Key: HIVE-11825
> URL: https://issues.apache.org/jira/browse/HIVE-11825
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 0.14.0
>Reporter: Feng Yuan
>Priority: Critical
> Fix For: 0.14.1
>
> Attachments: HIVE-11825.patch
>
>
> example:
> select attr from raw_kafka_item_dt0 where l_date='2015-09-06' and 
> customer='Czgc_news' and get_json_object(attr,'$.title') is NULL limit 10;
> but in results,title is still not null!
> {"title":"思科Q4收入估$79.2亿 
> 前景阴云笼罩","ItemType":"NewsBase","keywords":"思科Q4收入估\$79.2亿 
> 前景阴云笼罩","random":"1420253511075","callback":"BCore.instances[2].callbacks[1]","user_agent":"Mozilla/5.0
>  (iPhone; U; CPU iPhone OS 4_2_1 like Mac OS X; en-us) AppleWebKit/533.17.9 
> (KHTML; like Gecko) Version/5.0.2 Mobile/8C148 
> Safari/6533.18.5","is_newgid":"false","uuid":"DS.Input:b56c782bcb75035d:2116:003dcd40:54a75947","ptime":"1.1549997E9"}
>  
> attr is a dict



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11198) Fix load data query file format check for partitioned tables

2015-09-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11198:
-
Affects Version/s: 1.3.0

> Fix load data query file format check for partitioned tables
> 
>
> Key: HIVE-11198
> URL: https://issues.apache.org/jira/browse/HIVE-11198
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11198.patch
>
>
> HIVE-11118 added a file format check for the ORC format. The check throws an 
> exception when a non-ORC format is loaded into an ORC managed table. But it 
> does not work for partitioned tables. Partitioned tables are allowed to have 
> some partitions with a different file format. See this discussion for more 
> details:
> https://issues.apache.org/jira/browse/HIVE-11118?focusedCommentId=14617271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14617271



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11705) refactor SARG stripe filtering for ORC into a separate method

2015-09-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11705:
-
Affects Version/s: 2.0.0

> refactor SARG stripe filtering for ORC into a separate method
> -
>
> Key: HIVE-11705
> URL: https://issues.apache.org/jira/browse/HIVE-11705
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.0.0
>
> Attachments: HIVE-11705.01.patch, HIVE-11705.02.patch, 
> HIVE-11705.03.patch, HIVE-11705.patch
>
>
> For footer cache PPD to the metastore, we'd need a method to do the PPD. This 
> is a tiny item to create it on OrcInputFormat.
> For metastore path, these methods will be called from expression proxy 
> similar to current objectstore expr filtering; it will change to have 
> serialized sarg and column list to come from request instead of conf; 
> includedCols/etc. will also come from request instead of assorted java 
> objects. 
> -The types and stripe stats will need to be extracted from HBase. This is a 
> little bit of a problem, since ideally we want to be inside an HBase 
> filter/coprocessor. I'd need to take a look to see if this is possible... 
> since that filter would need to either deserialize orc, or we would need to 
> store types and stats information in some other, non-ORC manner on write. The 
> latter is probably a better idea, although it's dangerous because there's no 
> sync between this code and ORC itself.-
> Meanwhile minimize dependencies for stripe picking to essentials (and conf 
> which is easy to remove).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11118) Load data query should validate file formats with destination tables

2015-09-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11118:
-
Affects Version/s: 1.3.0

> Load data query should validate file formats with destination tables
> 
>
> Key: HIVE-11118
> URL: https://issues.apache.org/jira/browse/HIVE-11118
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11118.2.patch, HIVE-11118.3.patch, 
> HIVE-11118.4.patch, HIVE-11118.patch
>
>
> Load data local inpath queries do not do any validation wrt the file format. If 
> the destination table is ORC and we try to load files that are not ORC, 
> the load will succeed, but querying such tables will result in runtime 
> exceptions. We can do some simple sanity checks to prevent loading files 
> that do not match the destination table's file format.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
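The sanity check described in HIVE-11118 can be sketched outside Hive as a magic-byte probe: ORC files begin with the 3-byte magic `ORC`. The helper names below are hypothetical illustrations of the idea, not Hive's actual implementation.

```python
def looks_like_orc(path):
    """Heuristic pre-load check: ORC files begin with the magic bytes b'ORC'."""
    with open(path, "rb") as f:
        return f.read(3) == b"ORC"

def validate_load(paths, table_format):
    """Reject a LOAD DATA-style request up front when file magic disagrees
    with the destination table's declared format (illustrative only)."""
    if table_format == "ORC":
        bad = [p for p in paths if not looks_like_orc(p)]
        if bad:
            raise ValueError("files are not in ORC format: %s" % ", ".join(bad))
```

A check like this catches the mismatch at load time instead of surfacing it later as a runtime exception during a query.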


[jira] [Updated] (HIVE-11700) exception in logs in Tez test with new logger

2015-09-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11700:
-
Fix Version/s: 2.0.0

> exception in logs in Tez test with new logger
> -
>
> Key: HIVE-11700
> URL: https://issues.apache.org/jira/browse/HIVE-11700
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Sergey Shelukhin
>Assignee: Prasanth Jayachandran
> Fix For: 2.0.0
>
> Attachments: HIVE-11700.patch, HIVE-11700.patch
>
>
> {noformat}
> 2015-08-31 11:27:47,400 WARN Error while converting string 
> [${sys:hive.ql.log.PerfLogger.level}] to type [class 
> org.apache.logging.log4j.Level]. Using default value [null]. 
> java.lang.IllegalArgumentException: Unknown level constant 
> [${SYS:HIVE.QL.LOG.PERFLOGGER.LEVEL}].
>at org.apache.logging.log4j.Level.valueOf(Level.java:286)
>at 
> org.apache.logging.log4j.core.config.plugins.convert.TypeConverters$LevelConverter.convert(TypeConverters.java:230)
>at 
> org.apache.logging.log4j.core.config.plugins.convert.TypeConverters$LevelConverter.convert(TypeConverters.java:226)
>at 
> org.apache.logging.log4j.core.config.plugins.convert.TypeConverters.convert(TypeConverters.java:336)
>at 
> org.apache.logging.log4j.core.config.plugins.visitors.AbstractPluginVisitor.convert(AbstractPluginVisitor.java:130)
>at 
> org.apache.logging.log4j.core.config.plugins.visitors.PluginAttributeVisitor.visit(PluginAttributeVisitor.java:45)
>at 
> org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.generateParameters(PluginBuilder.java:247)
>at 
> org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:136)
>at 
> org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:766)
>at 
> org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:706)
>at 
> org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:698)
>at 
> org.apache.logging.log4j.core.config.AbstractConfiguration.doConfigure(AbstractConfiguration.java:358)
>at 
> org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:161)
>at 
> org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:361)
>at 
> org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:426)
>at 
> org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:442)
>at 
> org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:138)
>at 
> org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:147)
>at 
> org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:41)
>at org.apache.logging.log4j.LogManager.getContext(LogManager.java:175)
>at 
> org.apache.logging.log4j.spi.AbstractLoggerAdapter.getContext(AbstractLoggerAdapter.java:102)
>at org.apache.logging.log4j.jcl.LogAdapter.getContext(LogAdapter.java:39)
>at 
> org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:42)
>at 
> org.apache.logging.log4j.jcl.LogFactoryImpl.getInstance(LogFactoryImpl.java:40)
>at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:671)
>at org.apache.hadoop.hive.ql.QTestUtil.<clinit>(QTestUtil.java:122)
>at 
> org.apache.hadoop.hive.cli.TestMiniTezCliDriver.<clinit>(TestMiniTezCliDriver.java:33)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
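The warning above comes from Log4j 2 failing to resolve the `${sys:hive.ql.log.PerfLogger.level}` lookup when the system property is unset, so the raw string reaches `Level.valueOf()`. Log4j 2 lookups support a `:-` default value; a hedged sketch of a configuration line that avoids the warning (the logger property names here are illustrative, not necessarily those in Hive's shipped file):

```properties
# Resolve to INFO when hive.ql.log.PerfLogger.level is not set, instead of
# passing the unresolved ${sys:...} string to Level.valueOf().
logger.perf.name = org.apache.hadoop.hive.ql.log.PerfLogger
logger.perf.level = ${sys:hive.ql.log.PerfLogger.level:-INFO}
```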


[jira] [Updated] (HIVE-11840) when multi insert the inputformat becomes OneNullRowInputFormat

2015-09-16 Thread Feng Yuan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Yuan updated HIVE-11840:
-
Attachment: single__insert
multi insert

> when multi insert the inputformat becomes OneNullRowInputFormat
> ---
>
> Key: HIVE-11840
> URL: https://issues.apache.org/jira/browse/HIVE-11840
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 0.14.0
>Reporter: Feng Yuan
>Priority: Critical
> Fix For: 0.14.1
>
> Attachments: multi insert, single__insert
>
>
> example:
> from portrait.rec_feature_feedback a 
> insert overwrite table portrait.test1 select iid, feedback_15day, 
> feedback_7day, feedback_5day, feedback_3day, feedback_1day where l_date = 
> '2015-09-09' and bid in ('949722CF_12F7_523A_EE21_E3D591B7E755') 
> insert overwrite table portrait.test2 select iid, feedback_15day, 
> feedback_7day, feedback_5day, feedback_3day, feedback_1day where l_date = 
> '2015-09-09' and bid in ('test') 
> insert overwrite table portrait.test3 select iid, feedback_15day, 
> feedback_7day, feedback_5day, feedback_3day, feedback_1day where l_date = 
> '2015-09-09' and bid in ('F7734668_CC49_8C4F_24C5_EA8B6728E394')
> With a single insert it works, but with the multi insert, when I select * from 
> test1 I get:
> NULL NULL NULL NULL NULL NULL.
> In "explain extended" I see:
> Path -> Alias:
> -mr-10006portrait.rec_feature_feedback{l_date=2015-09-09, 
> cid=Cyiyaowang, bid=F7734668_CC49_8C4F_24C5_EA8B6728E394} [a]
> -mr-10007portrait.rec_feature_feedback{l_date=2015-09-09, 
> cid=Czgc_pc, bid=949722CF_12F7_523A_EE21_E3D591B7E755} [a]
>   Path -> Partition:
> -mr-10006portrait.rec_feature_feedback{l_date=2015-09-09, 
> cid=Cyiyaowang, bid=F7734668_CC49_8C4F_24C5_EA8B6728E394} 
>   Partition
> base file name: bid=F7734668_CC49_8C4F_24C5_EA8B6728E394
> input format: org.apache.hadoop.hive.ql.io.OneNullRowInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> partition values:
>   bid F7734668_CC49_8C4F_24C5_EA8B6728E394
>   cid Cyiyaowang
>   l_date 2015-09-09
> but when single insert:
> Path -> Alias:
> 
> hdfs://bfdhadoopcool/warehouse/portrait.db/rec_feature_feedback/l_date=2015-09-09/cid=Czgc_pc/bid=949722CF_12F7_523A_EE21_E3D591B7E755
>  [a]
>   Path -> Partition:
> 
> hdfs://bfdhadoopcool/warehouse/portrait.db/rec_feature_feedback/l_date=2015-09-09/cid=Czgc_pc/bid=949722CF_12F7_523A_EE21_E3D591B7E755
>  
>   Partition
> base file name: bid=949722CF_12F7_523A_EE21_E3D591B7E755
> input format: org.apache.hadoop.mapred.TextInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> partition values:
>   bid 949722CF_12F7_523A_EE21_E3D591B7E755
>   cid Czgc_pc
>   l_date 2015-09-09



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)

2015-09-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747278#comment-14747278
 ] 

Hive QA commented on HIVE-11634:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12756043/HIVE-11634.94.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9446 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5293/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5293/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5293/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12756043 - PreCommit-HIVE-TRUNK-Build

> Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
> --
>
> Key: HIVE-11634
> URL: https://issues.apache.org/jira/browse/HIVE-11634
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11634.1.patch, HIVE-11634.2.patch, 
> HIVE-11634.3.patch, HIVE-11634.4.patch, HIVE-11634.5.patch, 
> HIVE-11634.6.patch, HIVE-11634.7.patch, HIVE-11634.8.patch, 
> HIVE-11634.9.patch, HIVE-11634.91.patch, HIVE-11634.92.patch, 
> HIVE-11634.93.patch, HIVE-11634.94.patch
>
>
> Currently, we do not support partition pruning for the following scenario
> {code}
> create table pcr_t1 (key int, value string) partitioned by (ds string);
> insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src 
> where key < 20 order by key;
> explain extended select ds from pcr_t1 where struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> If we run the above query, we see that all the partitions of table pcr_t1 are 
> present in the filter predicate, whereas we could prune partition 
> (ds='2000-04-10'). 
> The optimization is to rewrite the above query into the following.
> {code}
> explain extended select ds from pcr_t1 where  (struct(ds)) IN 
> (struct('2000-04-08'), struct('2000-04-09')) and  struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> The predicate (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09'))  
> is used by the partition pruner to prune partitions that otherwise would not be 
> pruned.
> This is an extension of the idea presented in HIVE-11573.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
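The rewrite proposed in HIVE-11634 can be modeled as a small transformation: project the partition columns out of each struct in the IN list, producing a partition-only predicate the pruner can evaluate. The sketch below is an illustrative model of that idea, not Hive's optimizer code.

```python
def derive_partition_predicate(in_structs, columns, part_cols):
    """Given IN-list tuples over `columns`, keep only the partition columns.

    Returns the set of partition-column value tuples that can possibly match;
    a pruner can drop every partition whose values are not in this set."""
    idx = [columns.index(c) for c in part_cols]
    return {tuple(s[i] for i in idx) for s in in_structs}

# struct(ds, key) IN (struct('2000-04-08', 1), struct('2000-04-09', 2))
allowed = derive_partition_predicate(
    [("2000-04-08", 1), ("2000-04-09", 2)], ["ds", "key"], ["ds"])
# Partition ds='2000-04-10' is not in `allowed`, so it can be pruned.
```

This mirrors the rewritten query in the description: the derived `(struct(ds)) IN (...)` predicate is implied by the original one, so adding it preserves semantics while enabling pruning.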


[jira] [Commented] (HIVE-11786) Deprecate the use of redundant column in column stats related tables

2015-09-16 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14768865#comment-14768865
 ] 

Chaoyu Tang commented on HIVE-11786:


Patch has been uploaded to https://reviews.apache.org/r/38429/. [~sershe], 
[~xuefuz], [~ashutoshc], could you help review it? Thanks.

> Deprecate the use of redundant column in column stats related tables
> 
>
> Key: HIVE-11786
> URL: https://issues.apache.org/jira/browse/HIVE-11786
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-11786.patch
>
>
> The stats tables such as TAB_COL_STATS and PART_COL_STATS have redundant columns 
> such as DB_NAME, TABLE_NAME, and PARTITION_NAME, since these tables already have 
> foreign keys like TBL_ID or PART_ID referencing TBLS or PARTITIONS. 
> These redundant columns violate database normalization rules and cause a lot 
> of inconvenience (and sometimes difficulty) in implementing column stats related 
> features. For example, when renaming a table, we also have to update the 
> TABLE_NAME column in these tables, which should be unnecessary.
> This JIRA first deprecates the use of these columns at the HMS code level. A 
> follow-up JIRA will be opened to focus on the DB schema change and upgrade.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11825) get_json_object(col,'$.a') is null in where clause didn't work

2015-09-16 Thread Cazen Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747371#comment-14747371
 ] 

Cazen Lee commented on HIVE-11825:
--

Good :)

Have a good day!

> get_json_object(col,'$.a') is null in where clause didn't work
> --
>
> Key: HIVE-11825
> URL: https://issues.apache.org/jira/browse/HIVE-11825
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 0.14.0
>Reporter: Feng Yuan
>Priority: Critical
> Fix For: 0.14.1
>
> Attachments: HIVE-11825.patch
>
>
> example:
> select attr from raw_kafka_item_dt0 where l_date='2015-09-06' and 
> customer='Czgc_news' and get_json_object(attr,'$.title') is NULL limit 10;
> but in the results, title is still not null!
> {"title":"思科Q4收入估$79.2亿 
> 前景阴云笼罩","ItemType":"NewsBase","keywords":"思科Q4收入估\$79.2亿 
> 前景阴云笼罩","random":"1420253511075","callback":"BCore.instances[2].callbacks[1]","user_agent":"Mozilla/5.0
>  (iPhone; U; CPU iPhone OS 4_2_1 like Mac OS X; en-us) AppleWebKit/533.17.9 
> (KHTML; like Gecko) Version/5.0.2 Mobile/8C148 
> Safari/6533.18.5","is_newgid":"false","uuid":"DS.Input:b56c782bcb75035d:2116:003dcd40:54a75947","ptime":"1.1549997E9"}
>  
> attr is a dict



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
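The expected semantics of the filter in HIVE-11825 can be modeled outside Hive: a JSON-path lookup returns NULL only when the key is absent (or the value is JSON null), so rows whose `attr` contains a `title` key should not survive an `is NULL` filter. A minimal sketch of those semantics, with a hypothetical Python stand-in for `get_json_object`:

```python
import json

def get_json_object(attr, key):
    """Rough model of Hive's get_json_object(col, '$.key'):
    returns None when the document is invalid or the key is missing."""
    try:
        value = json.loads(attr).get(key)
    except (ValueError, AttributeError):
        return None
    return value

rows = [
    '{"title": "breaking news", "ItemType": "NewsBase"}',
    '{"ItemType": "NewsBase"}',
]
# WHERE get_json_object(attr, '$.title') IS NULL should keep only row 2.
null_title = [r for r in rows if get_json_object(r, "title") is None]
```

Under these semantics, the row quoted in the bug report (which has a `"title"` key) should never appear in the query results, which is what makes the reported behavior a bug.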


[jira] [Commented] (HIVE-4243) Fix column names in FileSinkOperator

2015-09-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14768893#comment-14768893
 ] 

Hive QA commented on HIVE-4243:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12756087/HIVE-4243.patch

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 9443 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.io.orc.TestColumnStatistics.testHasNull
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testBloomFilter2
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDictionaryThreshold
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDump
org.apache.hadoop.hive.ql.io.orc.TestJsonFileDump.testJsonDump
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderNewBaseAndDelta
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5294/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5294/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5294/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12756087 - PreCommit-HIVE-TRUNK-Build

> Fix column names in FileSinkOperator
> 
>
> Key: HIVE-4243
> URL: https://issues.apache.org/jira/browse/HIVE-4243
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-4243.patch, HIVE-4243.patch, HIVE-4243.tmp.patch
>
>
> All of the ObjectInspectors given to SerDes by FileSinkOperator have virtual 
> column names. Since the files are part of tables, Hive knows the column 
> names. For self-describing file formats like ORC, having the real column 
> names will improve understandability.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11512) Hive LDAP Authenticator should also support full DN in Authenticate()

2015-09-16 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-11512:
-
Attachment: HIVE-11512.patch

> Hive LDAP Authenticator should also support full DN in Authenticate() 
> --
>
> Key: HIVE-11512
> URL: https://issues.apache.org/jira/browse/HIVE-11512
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Attachments: HIVE-11512.patch
>
>
> In certain LDAP implementations, LDAP binding can occur using the full DN for 
> the user. Currently, the LDAP Authentication Provider assumes that the username 
> passed into Authenticate() is a short username and not a full DN. While the 
> initial bind works fine either way, the filter code relies on it being a 
> short name.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11831) TXN tables in Oracle should be created with ROWDEPENDENCIES

2015-09-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14769034#comment-14769034
 ] 

Hive QA commented on HIVE-11831:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12756135/HIVE-11831.01.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9445 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
org.apache.hive.hcatalog.streaming.TestStreaming.testTimeOutReaper
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5295/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5295/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5295/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12756135 - PreCommit-HIVE-TRUNK-Build

> TXN tables in Oracle should be created with ROWDEPENDENCIES
> ---
>
> Key: HIVE-11831
> URL: https://issues.apache.org/jira/browse/HIVE-11831
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11831.01.patch, HIVE-11831.patch
>
>
> These frequently-updated tables may otherwise suffer from spurious deadlocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
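For background, ROWDEPENDENCIES is a clause of Oracle's CREATE TABLE that stores a commit SCN per row instead of per block, which reduces spurious serialization conflicts between transactions touching different rows in the same block. A hedged sketch of the kind of DDL change involved (the column definitions are illustrative, not the actual Hive metastore schema script):

```sql
-- Illustrative only: per-row commit SCN tracking for a hot,
-- frequently updated transaction table.
CREATE TABLE TXNS (
  TXN_ID NUMBER(19) PRIMARY KEY,
  TXN_STATE CHAR(1) NOT NULL
) ROWDEPENDENCIES;
```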


[jira] [Assigned] (HIVE-11841) KeyValuesInputMerger creates huge logs

2015-09-16 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan reassigned HIVE-11841:
---

Assignee: Rajesh Balamohan

> KeyValuesInputMerger creates huge logs
> --
>
> Key: HIVE-11841
> URL: https://issues.apache.org/jira/browse/HIVE-11841
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HIVE-11841.1.patch
>
>
> https://github.com/apache/hive/blob/ac755ebe26361a4647d53db2a28500f71697b276/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValuesInputMerger.java#L107
> When running tpc-ds q75 at relatively large scale, it ends up generating huge 
> logs due to this.
> {noformat}
> Log Type: syslog_attempt_1439860407967_1249_1_30_00_0
> Log Upload Time: Wed Sep 16 12:49:09 +0000 2015
> Log Length: 3992760053
> Showing 4096 bytes of 3992760053 total. Click here for the full log.
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11841) KeyValuesInputMerger creates huge logs

2015-09-16 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-11841:

Attachment: HIVE-11841.1.patch

[~vikram.dixit], [~gopalv] - Please review when you find time.

> KeyValuesInputMerger creates huge logs
> --
>
> Key: HIVE-11841
> URL: https://issues.apache.org/jira/browse/HIVE-11841
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
> Attachments: HIVE-11841.1.patch
>
>
> https://github.com/apache/hive/blob/ac755ebe26361a4647d53db2a28500f71697b276/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValuesInputMerger.java#L107
> When running tpc-ds q75 at relatively large scale, it ends up generating huge 
> logs due to this.
> {noformat}
> Log Type: syslog_attempt_1439860407967_1249_1_30_00_0
> Log Upload Time: Wed Sep 16 12:49:09 +0000 2015
> Log Length: 3992760053
> Showing 4096 bytes of 3992760053 total. Click here for the full log.
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11842) Improve RuleRegExp by caching some internal data structures

2015-09-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11842:
---
Attachment: HIVE-11842.patch

> Improve RuleRegExp by caching some internal data structures
> ---
>
> Key: HIVE-11842
> URL: https://issues.apache.org/jira/browse/HIVE-11842
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.0.0
>
> Attachments: HIVE-11842.patch
>
>
> Continuing work started in HIVE-11141.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
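The optimization direction of HIVE-11842 (caching immutable derived structures inside RuleRegExp rather than rebuilding them per match) can be illustrated generically: compile a pattern once and memoize the result. This sketch shows the caching idea only, not the Hive patch itself.

```python
import re
from functools import lru_cache

@lru_cache(maxsize=None)
def compiled(pattern):
    # Expensive derived structure, built once per distinct pattern string
    # and reused on every subsequent rule match.
    return re.compile(pattern)

def rule_matches(pattern, stack_signature):
    """Check a rule pattern against an operator-stack signature string."""
    return compiled(pattern).search(stack_signature) is not None
```

Python's `re` module keeps a similar internal pattern cache; the point here is that hoisting the compile out of the hot matching path turns a repeated O(pattern) cost into a one-time one.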


[jira] [Commented] (HIVE-11217) CTAS statements throws error, when the table is stored as ORC File format and select clause has NULL/VOID type column

2015-09-16 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790570#comment-14790570
 ] 

Yongzhi Chen commented on HIVE-11217:
-

[~prasanth_j], I cannot find a proper place in TypeCheckProcFactory to put the 
code. Since it is related to TypeInfo, could I put it into
hive/serde2/typeinfo/TypeInfoUtils?
I attached the third patch, which uses TypeInfoUtils. Please review to see if it 
makes sense.
Thanks

> CTAS statements throws error, when the table is stored as ORC File format and 
> select clause has NULL/VOID type column 
> --
>
> Key: HIVE-11217
> URL: https://issues.apache.org/jira/browse/HIVE-11217
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.1
>Reporter: Gaurav Kohli
>Assignee: Yongzhi Chen
>Priority: Minor
> Attachments: HIVE-11217.1.patch, HIVE-11217.2.patch, 
> HIVE-11217.3.patch
>
>
> If you try to use a create-table-as-select (CTAS) statement to create an ORC 
> file format based table, then you can't use NULL as a column value in the select 
> clause: 
> CREATE TABLE empty (x int);
> CREATE TABLE orc_table_with_null 
> STORED AS ORC 
> AS 
> SELECT 
> x,
> null
> FROM empty;
> Error: 
> {quote}
> 347084 [main] ERROR hive.ql.exec.DDLTask  - 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IllegalArgumentException: Unknown primitive type VOID
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:643)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4242)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:285)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:952)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464)
>   at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)
>   at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:323)
>   at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:284)
>   at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
>   at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:66)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:227)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.IllegalArgumentException: Unknown primitive type VOID
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:530)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.<init>(OrcStruct.java:195)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:534)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcSerde.initialize(OrcSerde.java:106)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:519)
>   at 
> 

[jira] [Updated] (HIVE-11217) CTAS statements throws error, when the table is stored as ORC File format and select clause has NULL/VOID type column

2015-09-16 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-11217:

Attachment: (was: HIVE-11217.3.patch)

> CTAS statements throws error, when the table is stored as ORC File format and 
> select clause has NULL/VOID type column 
> --
>
> Key: HIVE-11217
> URL: https://issues.apache.org/jira/browse/HIVE-11217
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.1
>Reporter: Gaurav Kohli
>Assignee: Yongzhi Chen
>Priority: Minor
> Attachments: HIVE-11217.1.patch, HIVE-11217.2.patch, 
> HIVE-11217.3.patch
>
>
> If you try to use a create-table-as-select (CTAS) statement to create an ORC 
> file format based table, then you can't use NULL as a column value in the select 
> clause: 
> CREATE TABLE empty (x int);
> CREATE TABLE orc_table_with_null 
> STORED AS ORC 
> AS 
> SELECT 
> x,
> null
> FROM empty;
> Error: 
> {quote}
> 347084 [main] ERROR hive.ql.exec.DDLTask  - 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IllegalArgumentException: Unknown primitive type VOID
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:643)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4242)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:285)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:952)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464)
>   at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)
>   at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:323)
>   at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:284)
>   at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
>   at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:66)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:227)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.IllegalArgumentException: Unknown primitive type VOID
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:530)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.<init>(OrcStruct.java:195)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:534)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcSerde.initialize(OrcSerde.java:106)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:519)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:345)
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:292)
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:194)
>  

[jira] [Commented] (HIVE-11843) Add 'sort by c' to Parquet PPD q-tests to avoid different output issues with hadoop-1

2015-09-16 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-11843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790593#comment-14790593
 ] 

Sergio Peña commented on HIVE-11843:


[~Ferd] Could you help me review this patch? It is easy; I just added 'sort 
by c' to some queries that display mixed values.

> Add 'sort by c' to Parquet PPD q-tests to avoid different output issues with 
> hadoop-1
> -
>
> Key: HIVE-11843
> URL: https://issues.apache.org/jira/browse/HIVE-11843
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-11843.1.patch
>
>
> Parquet PPD tests produce different output when run against hadoop-1 because 
> mixed values appear in a different order.
> To fix this, we should just add 'sort by c' to the queries that display 
> mixed values.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11217) CTAS statements throws error, when the table is stored as ORC File format and select clause has NULL/VOID type column

2015-09-16 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-11217:

Attachment: HIVE-11217.3.patch

> CTAS statements throws error, when the table is stored as ORC File format and 
> select clause has NULL/VOID type column 
> --
>
> Key: HIVE-11217
> URL: https://issues.apache.org/jira/browse/HIVE-11217
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.1
>Reporter: Gaurav Kohli
>Assignee: Yongzhi Chen
>Priority: Minor
> Attachments: HIVE-11217.1.patch, HIVE-11217.2.patch, 
> HIVE-11217.3.patch
>
>
> If you try to use create-table-as-select (CTAS) statement and create a ORC 
> File format based table, then you can't use NULL as a column value in select 
> clause 
> CREATE TABLE empty (x int);
> CREATE TABLE orc_table_with_null 
> STORED AS ORC 
> AS 
> SELECT 
> x,
> null
> FROM empty;
> Error: 
> {quote}
> 347084 [main] ERROR hive.ql.exec.DDLTask  - 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IllegalArgumentException: Unknown primitive type VOID
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:643)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4242)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:285)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:952)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464)
>   at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)
>   at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:323)
>   at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:284)
>   at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
>   at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:66)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:227)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.IllegalArgumentException: Unknown primitive type VOID
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:530)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.<init>(OrcStruct.java:195)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:534)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcSerde.initialize(OrcSerde.java:106)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:519)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:345)
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:292)
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:194)
>   at 

[jira] [Commented] (HIVE-11791) Add unit test for HIVE-10122

2015-09-16 Thread Illya Yalovyy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790582#comment-14790582
 ] 

Illya Yalovyy commented on HIVE-11791:
--

[~hagleitn], actually for the test case compactExpr(or(isNull(col1), false)) it 
returns an invalid result: or(isNull(col1)), an OR with a single child, even 
though OR is a binary operator. I'm looking into fixing it. Your suggestions 
would be appreciated.

> Add unit test for HIVE-10122
> 
>
> Key: HIVE-11791
> URL: https://issues.apache.org/jira/browse/HIVE-11791
> Project: Hive
>  Issue Type: Test
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Illya Yalovyy
>Priority: Minor
> Attachments: HIVE-11791.patch
>
>
> Unit tests for PartitionPruner.compactExpr()



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11841) KeyValuesInputMerger creates huge logs

2015-09-16 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790564#comment-14790564
 ] 

Gopal V commented on HIVE-11841:


Reported by a user as well. Can we get a list of versions affected by this and 
file backports?

LGTM  - +1

> KeyValuesInputMerger creates huge logs
> --
>
> Key: HIVE-11841
> URL: https://issues.apache.org/jira/browse/HIVE-11841
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HIVE-11841.1.patch
>
>
> https://github.com/apache/hive/blob/ac755ebe26361a4647d53db2a28500f71697b276/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValuesInputMerger.java#L107
> When running tpc-ds q75 at relatively large scale, it ends up generating huge 
> logs due to this.
> {noformat}
> Log Type: syslog_attempt_1439860407967_1249_1_30_00_0
> Log Upload Time: Wed Sep 16 12:49:09 + 2015
> Log Length: 3992760053
> Showing 4096 bytes of 3992760053 total. Click here for the full log.
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11398) Parse wide OR and wide AND trees to flat OR/AND trees

2015-09-16 Thread Illya Yalovyy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790585#comment-14790585
 ] 

Illya Yalovyy commented on HIVE-11398:
--

[~gopalv], I have added unit tests for one of the affected methods. Could you 
please review the results after this change? Some of them look suspicious. More 
details: HIVE-11791.

> Parse wide OR and wide AND trees to flat OR/AND trees
> -
>
> Key: HIVE-11398
> URL: https://issues.apache.org/jira/browse/HIVE-11398
> Project: Hive
>  Issue Type: New Feature
>  Components: Logical Optimizer, UDF
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11398.2.patch, HIVE-11398.3.patch, 
> HIVE-11398.4.patch, HIVE-11398.5.patch, HIVE-11398.patch
>
>
> Deep trees of AND/OR are hard to traverse, particularly when they are merely 
> a nested form of an operator that logically takes an arbitrary number of 
> args.
> One potential way to convert the DFS searches into a simpler BFS search is to 
> introduce a new Operator pair named ALL and ANY.
> ALL(A, B, C, D, E) represents AND(AND(AND(AND(E, D), C), B), A)
> ANY(A, B, C, D, E) represents OR(OR(OR(OR(E, D), C),B),A)
> The SemanticAnalyser would be responsible for generating these operators and 
> this would mean that the depth and complexity of traversals for the simplest 
> case of wide AND/OR trees would be trivial.
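The flattening described above can be sketched in a few lines of Python, using a hypothetical tuple representation for expression nodes (not Hive's actual ExprNode classes):

```python
def flatten(op, node):
    """Collapse nested binary `op` nodes into a flat operand list.

    `node` is either a leaf (any non-matching value) or a tuple
    (op, left, right); the representation is illustrative only.
    """
    if isinstance(node, tuple) and node[0] == op:
        return flatten(op, node[1]) + flatten(op, node[2])
    return [node]

# and(and(and(and(E, D), C), B), A) -> ALL(E, D, C, B, A)
nested = ("and", ("and", ("and", ("and", "E", "D"), "C"), "B"), "A")
print(("ALL", flatten("and", nested)))
```

With the flat ALL/ANY node, a traversal visits each operand once instead of descending one level of nesting per operand.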



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11843) Add 'sort by c' to Parquet PPD q-tests to avoid different output issues with hadoop-1

2015-09-16 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-11843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-11843:
---
Attachment: HIVE-11843.1.patch

> Add 'sort by c' to Parquet PPD q-tests to avoid different output issues with 
> hadoop-1
> -
>
> Key: HIVE-11843
> URL: https://issues.apache.org/jira/browse/HIVE-11843
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-11843.1.patch
>
>
> Parquet PPD tests produce different output when run against hadoop-1 because 
> mixed values appear in a different order.
> To fix this, we should just add 'sort by c' to the queries that display 
> mixed values.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11512) Hive LDAP Authenticator should also support full DN in Authenticate()

2015-09-16 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-11512:
-
Attachment: (was: HIVE-11512.patch)

> Hive LDAP Authenticator should also support full DN in Authenticate() 
> --
>
> Key: HIVE-11512
> URL: https://issues.apache.org/jira/browse/HIVE-11512
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
>
> In certain LDAP implementations, LDAP binding can occur using the full DN for 
> the user. Currently, the LDAP authentication provider assumes that the 
> username passed into Authenticate() is a short username and not a full DN. 
> While the initial bind works fine either way, the filter code relies on it 
> being a short name.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11791) Add unit test for HIVE-10122

2015-09-16 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790699#comment-14790699
 ] 

Gopal V commented on HIVE-11791:


[~yalovyyi]: Good catch, that looks a bit odd - it should be returning just the 
isNull(col1).

The case missing is inside 

{code}
if (allFalse) {
  return new ExprNodeConstantDesc(Boolean.FALSE);
}
// Nothing to compact, update expr with compacted children.
((ExprNodeGenericFuncDesc) expr).setChildren(newChildren);
{code}

Also FYI, you can annotate the compactExpr method with @VisibleForTesting, so 
that use from non-test code will trigger a warning during findbugs (which I'll 
re-add today).
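A minimal sketch of the intended compaction, with plain Python values standing in for Hive's ExprNodeDesc objects (this is illustrative, not the actual PartitionPruner.compactExpr code):

```python
def compact_or(children):
    """Compact the children of an OR for partition pruning:
    drop literal False, short-circuit on literal True, and
    collapse to the lone child rather than emit a unary OR."""
    kept = [c for c in children if c is not False]
    if not kept:
        return False                # all children were literal False
    if any(c is True for c in kept):
        return True                 # OR containing literal True is True
    if len(kept) == 1:
        return kept[0]              # the missing case: no single-child OR
    return ("or", kept)

print(compact_or(["isNull(col1)", False]))
```

The third branch is the case the existing code misses: after dropping the False child it updates the OR's children in place instead of returning the remaining child directly.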

> Add unit test for HIVE-10122
> 
>
> Key: HIVE-11791
> URL: https://issues.apache.org/jira/browse/HIVE-11791
> Project: Hive
>  Issue Type: Test
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
>Priority: Minor
> Attachments: HIVE-11791.patch
>
>
> Unit tests for PartitionPruner.compactExpr()



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11844) Merge master to Spark branch 9/16/2015 [Spark Branch]

2015-09-16 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-11844:
---
Summary: Merge master to Spark branch 9/16/2015 [Spark Branch]  (was: 
CMerge master to Spark branch 9/16/2015 [Spark Branch])

> Merge master to Spark branch 9/16/2015 [Spark Branch]
> -
>
> Key: HIVE-11844
> URL: https://issues.apache.org/jira/browse/HIVE-11844
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Fix For: 1.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11512) Hive LDAP Authenticator should also support full DN in Authenticate()

2015-09-16 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-11512:
-
Attachment: HIVE-11512.patch

> Hive LDAP Authenticator should also support full DN in Authenticate() 
> --
>
> Key: HIVE-11512
> URL: https://issues.apache.org/jira/browse/HIVE-11512
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Attachments: HIVE-11512.patch
>
>
> In certain LDAP implementations, LDAP binding can occur using the full DN for 
> the user. Currently, the LDAP authentication provider assumes that the 
> username passed into Authenticate() is a short username and not a full DN. 
> While the initial bind works fine either way, the filter code relies on it 
> being a short name.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8327) mvn site -Pfindbugs

2015-09-16 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790686#comment-14790686
 ] 

Gopal V commented on HIVE-8327:
---

This patch seems to have been lost during a spark branch merge.

{code}
commit 714b3db65d41dd96db59ca1b9a6d1b6a4613072e
Merge: 537114b 7df9d7a
Author: xzhang 
Date:   Thu Jul 30 17:41:17 2015 -0700

HIVE-10863: Merge master to Spark branch 7/29/2015 [Spark Branch] (reviewed 
by Chao)
{code}

> mvn site -Pfindbugs
> ---
>
> Key: HIVE-8327
> URL: https://issues.apache.org/jira/browse/HIVE-8327
> Project: Hive
>  Issue Type: Test
>  Components: Diagnosability
>Reporter: Gopal V
>Assignee: Gopal V
> Fix For: 1.1.0
>
> Attachments: HIVE-8327.1.patch, HIVE-8327.2.patch, ql-findbugs.html
>
>
> HIVE-3099 originally added findbugs into the old ant build.
> Get basic findbugs working for the maven build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11791) Add unit test for HIVE-10122

2015-09-16 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11791:
---
Assignee: Illya Yalovyy

> Add unit test for HIVE-10122
> 
>
> Key: HIVE-11791
> URL: https://issues.apache.org/jira/browse/HIVE-11791
> Project: Hive
>  Issue Type: Test
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
>Priority: Minor
> Attachments: HIVE-11791.patch
>
>
> Unit tests for PartitionPruner.compactExpr()



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11832) HIVE-11802 breaks compilation in JDK 8

2015-09-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790646#comment-14790646
 ] 

Hive QA commented on HIVE-11832:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12756105/HIVE-11832.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9445 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5296/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5296/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5296/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12756105 - PreCommit-HIVE-TRUNK-Build

> HIVE-11802 breaks compilation in JDK 8
> --
>
> Key: HIVE-11832
> URL: https://issues.apache.org/jira/browse/HIVE-11832
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergio Peña
> Attachments: HIVE-11832.1.patch
>
>
> HIVE-11802 changes breaks JDK 8 compilation. FloatingDecimal constructor 
> accepting float is removed in JDK 8.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11678) Add AggregateProjectMergeRule

2015-09-16 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790683#comment-14790683
 ] 

Jesus Camacho Rodriguez commented on HIVE-11678:


I've gone through the patch and the plan changes.

+1

> Add AggregateProjectMergeRule
> -
>
> Key: HIVE-11678
> URL: https://issues.apache.org/jira/browse/HIVE-11678
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Logical Optimizer
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-11678.2.patch, HIVE-11678.3.patch, 
> HIVE-11678.4.patch, HIVE-11678.5.patch, HIVE-11678.patch
>
>
> This will help to get rid of extra projects on top of Aggregation, thus 
> compacting query plan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-11835) Type decimal(1,1) reads 0.0, 0.00, etc from text file as NULL

2015-09-16 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790853#comment-14790853
 ] 

Xuefu Zhang edited comment on HIVE-11835 at 9/16/15 6:42 PM:
-

The problem is caused by the fact that Hive trims trailing zeros. In most cases 
this is harmless. However, if the value is 0.0, 0.00, 0.000, etc., trimming the 
zeros changes the value to 0, which has type decimal(1,0). Since type decimal(1,1) 
allows no integer digits, 0 becomes NULL when being converted to decimal(1,1).

It seems that trimming trailing zeros doesn't do any good. It not only changes 
the data type, creating problems like the one here, but also changes the semantic 
meaning of the number. The right fix is to trim trailing zeros only when the 
scale goes beyond what the datatype allows, which happens when scale is enforced. 
This will also keep the right number of decimal digits in query results, which is 
desirable and common practice in other DBs.
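The precision/scale mismatch can be sketched with Python's decimal module (a standalone illustration of the arithmetic, not Hive's HiveDecimal code):

```python
from decimal import Decimal

def fits(value, precision, scale):
    """Return True if value fits a decimal(precision, scale) column.

    Mirrors the treatment described above: the literal 0 carries one
    integer digit (decimal(1,0)), so it cannot fit decimal(1,1),
    which allows precision - scale = 0 integer digits.
    """
    sign, digits, exponent = value.as_tuple()
    frac_digits = max(0, -exponent)               # digits after the point
    int_digits = max(0, len(digits) + exponent)   # digits before the point
    return frac_digits <= scale and int_digits <= precision - scale

trimmed = Decimal("0.00").normalize()   # trailing zeros trimmed -> Decimal('0')
print(fits(Decimal("0.0"), 1, 1))       # 0.0 fits decimal(1,1)
print(fits(trimmed, 1, 1))              # trimmed 0 does not -> read as NULL
```

Keeping the trailing zero preserves the scale, so the untrimmed value passes the check while the trimmed one fails.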

Initial patch to have a test run. Expect some test results need to be updated. 
Will also add new tests.


was (Author: xuefuz):
Initial patch to have a test run. Expect some test results need to be updated. 
Will also add new tests.

> Type decimal(1,1) reads 0.0, 0.00, etc from text file as NULL
> -
>
> Key: HIVE-11835
> URL: https://issues.apache.org/jira/browse/HIVE-11835
> Project: Hive
>  Issue Type: Bug
>  Components: Types
>Affects Versions: 1.2.0, 1.1.0, 2.0.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-11835.patch
>
>
> Steps to reproduce:
> 1. create a text file with values like 0.0, 0.00, etc.
> 2. create table in hive with type decimal(1,1).
> 3. run "load data local inpath ..." to load data into the table.
> 4. run select * on the table.
> You will see that NULL is displayed for 0.0, 0.00, .0, etc. Instead, these 
> should be read as 0.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11836) ORC SARG creation throws NPE for null constants with void type

2015-09-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790836#comment-14790836
 ] 

Hive QA commented on HIVE-11836:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12756132/HIVE-11836.1.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5300/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5300/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5300/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-5300/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at ce71355 HIVE-8327: (repeat) mvn site -Pfindbugs for hive (Gopal 
V reviewed by Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at ce71355 HIVE-8327: (repeat) mvn site -Pfindbugs for hive (Gopal 
V reviewed by Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12756132 - PreCommit-HIVE-TRUNK-Build

> ORC SARG creation throws NPE for null constants with void type
> --
>
> Key: HIVE-11836
> URL: https://issues.apache.org/jira/browse/HIVE-11836
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11836.1.patch
>
>
> Queries like
> {code}
> select * from table where col = null
> {code}
> will throw the following exception
> {code}
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.boxLiteral(SearchArgumentImpl.java:446)
>   at 
> org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.getLiteral(SearchArgumentImpl.java:476)
>   at 
> org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.createLeaf(SearchArgumentImpl.java:524)
>   at 
> org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.createLeaf(SearchArgumentImpl.java:584)
>   at 
> org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.parse(SearchArgumentImpl.java:629)
>   at 
> org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.addChildren(SearchArgumentImpl.java:598)
>   at 
> org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.parse(SearchArgumentImpl.java:621)
>   at 
> org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.addChildren(SearchArgumentImpl.java:598)
>   at 
> org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.parse(SearchArgumentImpl.java:621)
>   at 
> 

[jira] [Commented] (HIVE-11815) Correct the column/table names in subquery expression when creating a view

2015-09-16 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790837#comment-14790837
 ] 

Ashutosh Chauhan commented on HIVE-11815:
-

+1

> Correct the column/table names in subquery expression when creating a view
> --
>
> Key: HIVE-11815
> URL: https://issues.apache.org/jira/browse/HIVE-11815
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11815.01.patch, HIVE-11815.02.patch
>
>
> Right now Hive does not quote column/table names in subquery expressions when 
> creating a view. For example
> {code}
> hive>
> > create table tc (`@d` int);
> OK
> Time taken: 0.119 seconds
> hive> create view tcv as select * from tc b where exists (select a.`@d` from 
> tc a where b.`@d`=a.`@d`);
> OK
> Time taken: 0.075 seconds
> hive> describe extended tcv;
> OK
> @d  int
> Detailed Table InformationTable(tableName:tcv, dbName:default, 
> owner:pxiong, createTime:1442250005, lastAccessTime:0, retention:0, 
> sd:StorageDescriptor(cols:[FieldSchema(name:@d, type:int, comment:null)], 
> location:null, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, 
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
> serializationLib:null, parameters:{}), bucketCols:[], sortCols:[], 
> parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], 
> skewedColValueLocationMaps:{}), storedAsSubDirectories:false), 
> partitionKeys:[], parameters:{transient_lastDdlTime=1442250005}, 
> viewOriginalText:select * from tc b where exists (select a.@d from tc a where 
> b.@d=a.@d), viewExpandedText:select `b`.`@d` from `default`.`tc` `b` where 
> exists (select a.@d from tc a where b.@d=a.@d), tableType:VIRTUAL_VIEW)
> Time taken: 0.063 seconds, Fetched: 3 row(s)
> hive> select * from tcv;
> FAILED: SemanticException line 1:63 character '@' not supported here
> line 1:84 character '@' not supported here
> line 1:89 character '@' not supported here in definition of VIEW tcv [
> select `b`.`@d` from `default`.`tc` `b` where exists (select a.@d from tc a 
> where b.@d=a.@d)
> ] used as tcv at Line 1:14
> {code}
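The missing quoting in the stored view text amounts to wrapping each identifier in backticks wherever it is emitted. A minimal sketch of that quoting (the class and method names here are illustrative, not Hive's actual API), where embedded backticks are escaped by doubling:

```java
// Illustrative sketch only: quote an identifier for Hive's backtick syntax,
// doubling any embedded backticks. Not the real Hive helper.
public class QuoteSketch {
    static String quoteIdentifier(String name) {
        return "`" + name.replace("`", "``") + "`";
    }
}
```

Applying such quoting throughout the subquery text would store a.`@d` instead of a.@d, so the view definition re-parses cleanly instead of failing on the '@' character.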



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11332) Unicode table comments do not work

2015-09-16 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11332:

Affects Version/s: 2.0.0
   0.13.1
   1.1.0

> Unicode table comments do not work
> --
>
> Key: HIVE-11332
> URL: https://issues.apache.org/jira/browse/HIVE-11332
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1, 1.1.0, 2.0.0
>Reporter: Sergey Shelukhin
>
> Noticed by accident.
> {noformat}
> select ' ', count(*) from moo;
> Query ID = sershe_20150721190413_979e1b6f-86d6-436f-b8e6-d6785b9d3b83
> Total jobs = 1
> Launching Job 1 out of 1
> [snip]
> OK
>  0
> Time taken: 13.347 seconds, Fetched: 1 row(s)
> hive> ALTER TABLE moo SET TBLPROPERTIES ('comment' = ' ');
> OK
> Time taken: 0.292 seconds
> hive> desc extended moo;
> OK
> i int 
>
> Detailed Table InformationTable(tableName:moo, dbName:default, 
> owner:sershe, createTime:1437519787, lastAccessTime:0, retention:0, 
> sd:StorageDescriptor(cols:[FieldSchema(name:i, type:int, comment:null)], 
> location:hdfs://cn108-10.l42scl.hortonworks.com:8020/apps/hive/warehouse/moo, 
> inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
> serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
> parameters:{serialization.format=1}), bucketCols:[], sortCols:[], 
> parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], 
> skewedColValueLocationMaps:{}), storedAsSubDirectories:false), 
> partitionKeys:[], parameters:{last_modified_time=1437519883, totalSize=0, 
> numRows=-1, rawDataSize=-1, COLUMN_STATS_ACCURATE=false, numFiles=0, 
> transient_lastDdlTime=1437519883, comment=?? , last_modified_by=sershe}, 
> viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE) 
> Time taken: 0.347 seconds, Fetched: 3 row(s)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11846) CliDriver shutdown tries to drop index table again which was already dropped when dropping the original table

2015-09-16 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790907#comment-14790907
 ] 

Xuefu Zhang commented on HIVE-11846:


The description doesn't seem to describe the problem clearly. What's the symptom 
of the problem, and how is it related to CliDriver shutdown?

> CliDriver shutdown tries to drop index table again which was already dropped 
> when dropping the original table 
> --
>
> Key: HIVE-11846
> URL: https://issues.apache.org/jira/browse/HIVE-11846
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Critical
>
> Steps to repro:
> {code}
> set hive.stats.dbclass=fs;
> set hive.stats.autogather=true;
> set hive.cbo.enable=true;
> DROP TABLE IF EXISTS aa;
> CREATE TABLE aa (L_ORDERKEY  INT,
> L_PARTKEY   INT,
> L_SUPPKEY   INT,
> L_LINENUMBERINT,
> L_QUANTITY  DOUBLE,
> L_EXTENDEDPRICE DOUBLE,
> L_DISCOUNT  DOUBLE,
> L_TAX   DOUBLE,
> L_RETURNFLAGSTRING,
> L_LINESTATUSSTRING,
> l_shipdate  STRING,
> L_COMMITDATESTRING,
> L_RECEIPTDATE   STRING,
> L_SHIPINSTRUCT  STRING,
> L_SHIPMODE  STRING,
> L_COMMENT   STRING)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|';
> LOAD DATA LOCAL INPATH '../../data/files/lineitem.txt' OVERWRITE INTO TABLE 
> aa;
> CREATE INDEX aa_lshipdate_idx ON TABLE aa(l_shipdate) AS 
> 'org.apache.hadoop.hive.ql.index.AggregateIndexHandler' WITH DEFERRED REBUILD 
> IDXPROPERTIES("AGGREGATES"="count(l_shipdate)");
> ALTER INDEX aa_lshipdate_idx ON aa REBUILD;
> show tables;
> explain select l_shipdate, count(l_shipdate)
> from aa
> group by l_shipdate;
> {code}
> The problem is that we create an index table default_aa_lshipdate_idx 
> (default is the database name), and it comes after the table aa in the 
> iteration. The cleanup first drops aa, which drops default_aa_lshipdate_idx 
> as well since it is related to aa. It then cannot find the table 
> default_aa_lshipdate_idx when it tries to drop it again, which throws an 
> exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11847) Avoid expensive call to contains/containsAll in DefaultGraphWalker

2015-09-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11847:
---
Attachment: HIVE-11847.patch

> Avoid expensive call to contains/containsAll in DefaultGraphWalker
> --
>
> Key: HIVE-11847
> URL: https://issues.apache.org/jira/browse/HIVE-11847
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer, Physical Optimizer
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11847.patch
>
>
> Continuing work started in HIVE-11652.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11835) Type decimal(1,1) reads 0.0, 0.00, etc from text file as NULL

2015-09-16 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790853#comment-14790853
 ] 

Xuefu Zhang commented on HIVE-11835:


Initial patch to get a test run. Some test results are expected to need updating. 
Will also add new tests.
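For reference, the precision/scale check at issue can be sketched with java.math.BigDecimal (illustrative only, not HiveDecimal's actual code): rescaling "0.00" to the declared scale 1 yields 0.0, whose integer digit count fits decimal(1,1), so the value is representable and should not be read as NULL.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hedged sketch: does a textual value fit decimal(precision, scale)?
// Rescale to the declared scale, then require the integer digits to fit
// in (precision - scale). Zero values like 0.00 always fit.
public class DecimalFitSketch {
    static BigDecimal enforce(String text, int precision, int scale) {
        BigDecimal d = new BigDecimal(text).setScale(scale, RoundingMode.HALF_UP);
        int intDigits = d.precision() - d.scale(); // digits left of the point
        return (intDigits <= precision - scale) ? d : null;
    }
}
```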

> Type decimal(1,1) reads 0.0, 0.00, etc from text file as NULL
> -
>
> Key: HIVE-11835
> URL: https://issues.apache.org/jira/browse/HIVE-11835
> Project: Hive
>  Issue Type: Bug
>  Components: Types
>Affects Versions: 1.2.0, 1.1.0, 2.0.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-11835.patch
>
>
> Steps to reproduce:
> 1. create a text file with values like 0.0, 0.00, etc.
> 2. create table in hive with type decimal(1,1).
> 3. run "load data local inpath ..." to load data into the table.
> 4. run select * on the table.
> You will see that NULL is displayed for 0.0, 0.00, .0, etc. Instead, these 
> should be read as 0.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11791) Add unit test for HIVE-10122

2015-09-16 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790893#comment-14790893
 ] 

Gopal V commented on HIVE-11791:


[~yalovyyi]: I have added the findbugs changes, but compactExpr is still 
broken. Can you test with the following fix?

{code}
--- ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
+++ ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
@@ -331,6 +331,9 @@ static private ExprNodeDesc compactExpr(ExprNodeDesc expr) {
 if (allFalse) {
   return new ExprNodeConstantDesc(Boolean.FALSE);
 }
+if (newChildren.size() == 1) {
+  return newChildren.get(0);
+}
 // Nothing to compact, update expr with compacted children.
 ((ExprNodeGenericFuncDesc) expr).setChildren(newChildren);
   }
{code}
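The intent of the extra branch can be modeled outside Hive. Below is a hedged sketch with expression nodes as plain strings rather than the real ExprNodeDesc API: constant-TRUE children of an AND are dropped, an AND left with no children folds to TRUE (not NULL), and a lone surviving child is returned directly, which is what the added size()==1 branch does.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of AND-compaction; "TRUE"/"FALSE" stand in for constant
// nodes and any other string stands in for an opaque predicate subtree.
public class CompactAndSketch {
    static String compactAnd(List<String> children) {
        List<String> kept = new ArrayList<>();
        for (String c : children) {
            if ("FALSE".equals(c)) return "FALSE"; // any FALSE child wins
            if (!"TRUE".equals(c)) kept.add(c);    // drop constant TRUE
        }
        if (kept.isEmpty()) return "TRUE";         // and(true, true) -> TRUE
        if (kept.size() == 1) return kept.get(0);  // the single-child shortcut
        return "AND(" + String.join(", ", kept) + ")";
    }
}
```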

> Add unit test for HIVE-10122
> 
>
> Key: HIVE-11791
> URL: https://issues.apache.org/jira/browse/HIVE-11791
> Project: Hive
>  Issue Type: Test
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
>Priority: Minor
> Attachments: HIVE-11791.patch
>
>
> Unit tests for PartitionPruner.compactExpr()



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11791) Add unit test for HIVE-10122

2015-09-16 Thread Illya Yalovyy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790918#comment-14790918
 ] 

Illya Yalovyy commented on HIVE-11791:
--

I'm on it. If you could review/confirm the expected results for my tests, I 
would try to fix the rest myself.

> Add unit test for HIVE-10122
> 
>
> Key: HIVE-11791
> URL: https://issues.apache.org/jira/browse/HIVE-11791
> Project: Hive
>  Issue Type: Test
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
>Priority: Minor
> Attachments: HIVE-11791.patch
>
>
> Unit tests for PartitionPruner.compactExpr()



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11786) Deprecate the use of redundant column in column stats related tables

2015-09-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791032#comment-14791032
 ] 

Hive QA commented on HIVE-11786:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12756158/HIVE-11786.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 9445 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter_partitioned
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.testStatsAfterCompactionPartTbl
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5301/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5301/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5301/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12756158 - PreCommit-HIVE-TRUNK-Build

> Deprecate the use of redundant column in column stats related tables
> 
>
> Key: HIVE-11786
> URL: https://issues.apache.org/jira/browse/HIVE-11786
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-11786.patch
>
>
> The stats tables such as TAB_COL_STATS and PART_COL_STATS have redundant 
> columns such as DB_NAME, TABLE_NAME, and PARTITION_NAME, since these tables 
> already have foreign keys like TBL_ID or PART_ID referencing TBLS or 
> PARTITIONS. 
> These redundant columns violate database normalization rules and cause a lot 
> of inconvenience (and sometimes real difficulty) in column stats related 
> feature implementation. For example, when renaming a table, we have to update 
> the TABLE_NAME column in these tables as well, which is unnecessary.
> This JIRA is first to deprecate the use of these columns at HMS code level. A 
> followed JIRA is to be opened to focus on DB schema change and upgrade.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11819) HiveServer2 catches OOMs on request threads

2015-09-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790827#comment-14790827
 ] 

Hive QA commented on HIVE-11819:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12756110/HIVE-11819.01.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5298/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5298/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5298/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[INFO] Excluding aopalliance:aopalliance:jar:1.0 from the shaded jar.
[INFO] Excluding com.sun.jersey.contribs:jersey-guice:jar:1.9 from the shaded 
jar.
[INFO] Excluding org.apache.commons:commons-collections4:jar:4.0 from the 
shaded jar.
[INFO] Excluding org.apache.tez:tez-runtime-library:jar:0.5.2 from the shaded 
jar.
[INFO] Excluding org.apache.tez:tez-common:jar:0.5.2 from the shaded jar.
[INFO] Excluding org.apache.tez:tez-runtime-internals:jar:0.5.2 from the shaded 
jar.
[INFO] Excluding org.apache.tez:tez-mapreduce:jar:0.5.2 from the shaded jar.
[INFO] Excluding commons-collections:commons-collections:jar:3.2.1 from the 
shaded jar.
[INFO] Excluding org.apache.spark:spark-core_2.10:jar:1.4.0 from the shaded jar.
[INFO] Excluding com.twitter:chill_2.10:jar:0.5.0 from the shaded jar.
[INFO] Excluding com.twitter:chill-java:jar:0.5.0 from the shaded jar.
[INFO] Excluding org.apache.hadoop:hadoop-client:jar:1.2.1 from the shaded jar.
[INFO] Excluding org.apache.spark:spark-launcher_2.10:jar:1.4.0 from the shaded 
jar.
[INFO] Excluding org.apache.spark:spark-network-common_2.10:jar:1.4.0 from the 
shaded jar.
[INFO] Excluding org.apache.spark:spark-network-shuffle_2.10:jar:1.4.0 from the 
shaded jar.
[INFO] Excluding org.apache.spark:spark-unsafe_2.10:jar:1.4.0 from the shaded 
jar.
[INFO] Excluding net.java.dev.jets3t:jets3t:jar:0.7.1 from the shaded jar.
[INFO] Excluding org.apache.curator:curator-recipes:jar:2.6.0 from the shaded 
jar.
[INFO] Excluding org.eclipse.jetty.orbit:javax.servlet:jar:3.0.0.v201112011016 
from the shaded jar.
[INFO] Excluding org.apache.commons:commons-math3:jar:3.4.1 from the shaded jar.
[INFO] Excluding org.slf4j:jul-to-slf4j:jar:1.7.10 from the shaded jar.
[INFO] Excluding org.slf4j:jcl-over-slf4j:jar:1.7.10 from the shaded jar.
[INFO] Excluding com.ning:compress-lzf:jar:1.0.3 from the shaded jar.
[INFO] Excluding net.jpountz.lz4:lz4:jar:1.2.0 from the shaded jar.
[INFO] Excluding org.roaringbitmap:RoaringBitmap:jar:0.4.5 from the shaded jar.
[INFO] Excluding commons-net:commons-net:jar:2.2 from the shaded jar.
[INFO] Excluding org.spark-project.akka:akka-remote_2.10:jar:2.3.4-spark from 
the shaded jar.
[INFO] Excluding org.spark-project.akka:akka-actor_2.10:jar:2.3.4-spark from 
the shaded jar.
[INFO] Excluding com.typesafe:config:jar:1.2.1 from the shaded jar.
[INFO] Excluding org.spark-project.protobuf:protobuf-java:jar:2.5.0-spark from 
the shaded jar.
[INFO] Excluding org.uncommons.maths:uncommons-maths:jar:1.2.2a from the shaded 
jar.
[INFO] Excluding org.spark-project.akka:akka-slf4j_2.10:jar:2.3.4-spark from 
the shaded jar.
[INFO] Excluding org.scala-lang:scala-library:jar:2.10.4 from the shaded jar.
[INFO] Excluding org.json4s:json4s-jackson_2.10:jar:3.2.10 from the shaded jar.
[INFO] Excluding org.json4s:json4s-core_2.10:jar:3.2.10 from the shaded jar.
[INFO] Excluding org.json4s:json4s-ast_2.10:jar:3.2.10 from the shaded jar.
[INFO] Excluding org.scala-lang:scalap:jar:2.10.0 from the shaded jar.
[INFO] Excluding org.scala-lang:scala-compiler:jar:2.10.0 from the shaded jar.
[INFO] Excluding com.sun.jersey:jersey-server:jar:1.14 from the shaded jar.
[INFO] Excluding asm:asm:jar:3.1 from the shaded jar.
[INFO] Excluding com.sun.jersey:jersey-core:jar:1.14 from the shaded jar.
[INFO] Excluding org.apache.mesos:mesos:jar:shaded-protobuf:0.21.1 from the 
shaded jar.
[INFO] Excluding com.clearspring.analytics:stream:jar:2.7.0 from the shaded jar.
[INFO] Excluding io.dropwizard.metrics:metrics-graphite:jar:3.1.0 from the 
shaded jar.
[INFO] Excluding 
com.fasterxml.jackson.module:jackson-module-scala_2.10:jar:2.4.4 from the 
shaded jar.
[INFO] Excluding org.scala-lang:scala-reflect:jar:2.10.4 from the shaded jar.
[INFO] Excluding oro:oro:jar:2.0.8 from the shaded jar.
[INFO] Excluding org.tachyonproject:tachyon-client:jar:0.6.4 from the shaded 
jar.
[INFO] Excluding org.tachyonproject:tachyon:jar:0.6.4 from the shaded jar.
[INFO] Excluding net.razorvine:pyrolite:jar:4.4 from the shaded jar.
[INFO] Excluding net.sf.py4j:py4j:jar:0.8.2.1 from the shaded jar.
[INFO] Excluding org.spark-project.spark:unused:jar:1.0.0 from the shaded jar.
[INFO] 

[jira] [Commented] (HIVE-11842) Improve RuleRegExp by caching some internal data structures

2015-09-16 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790881#comment-14790881
 ] 

Sergey Shelukhin commented on HIVE-11842:
-

+1 provided tests pass

> Improve RuleRegExp by caching some internal data structures
> ---
>
> Key: HIVE-11842
> URL: https://issues.apache.org/jira/browse/HIVE-11842
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.0.0
>
> Attachments: HIVE-11842.patch
>
>
> Continuing work started in HIVE-11141.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11791) Add unit test for HIVE-10122

2015-09-16 Thread Illya Yalovyy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790943#comment-14790943
 ] 

Illya Yalovyy commented on HIVE-11791:
--

With this change the test results look much better. 

The one that looks strange is:
and(true, true) produces NULL. I would expect it to be TRUE.

If this doesn't matter for the downstream logic, then I'll update the expected 
result for the test.

Could you please clarify what a NULL result means?


> Add unit test for HIVE-10122
> 
>
> Key: HIVE-11791
> URL: https://issues.apache.org/jira/browse/HIVE-11791
> Project: Hive
>  Issue Type: Test
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
>Priority: Minor
> Attachments: HIVE-11791.patch
>
>
> Unit tests for PartitionPruner.compactExpr()



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11683) Hive Streaming may overload the metastore

2015-09-16 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-11683:
--
Component/s: Metastore

> Hive Streaming may overload the metastore
> -
>
> Key: HIVE-11683
> URL: https://issues.apache.org/jira/browse/HIVE-11683
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Hive, Metastore, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Roshan Naik
>
> HiveEndPoint represents a way to write to a specific partition 
> transactionally.
> Each HiveEndPoint creates TransactionBatch(es) and commits transactions.
> Suppose you have 10 instances of Storm Hive bolt using Streaming API.
> Each instance will create HiveEndPoints on demand when it sees an event for 
> particular partition value.
> If events are uniformly distributed wrt partition values and the table has 
> 1000 partitions (for example it's partitioned by CustomerId), each of 10 bolt 
> instances may create 1000 HiveEndPoints and thus > 10,000 (actually 10K * 
> num_txn_per_batch) concurrent transactions.
> This creates a huge amount of Metastore traffic.
> HIVE-11672 is investigating how some sort of "shuffle" phase can be added to 
> route events for a particular bucket to the same bolt instance.
> The same idea should be explored to route events based on partition value.
> cc [~alangates],[~sriharsha],[~rbains]
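The routing idea described above can be sketched with a stable hash over the partition value, so that all events for one partition land on the same writer instance and the number of open HiveEndPoints per partition stays bounded. The names below are illustrative, not from any Hive or Storm API:

```java
// Illustrative sketch: pick a bolt/writer instance for a partition value via
// a stable, non-negative hash. Every event with the same partition value is
// routed to the same instance, bounding concurrent transactions per partition.
public class RouteSketch {
    static int boltFor(String partitionValue, int numBolts) {
        return Math.floorMod(partitionValue.hashCode(), numBolts);
    }
}
```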



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11846) CliDriver shutdown tries to drop index table again which was already dropped when dropping the original table

2015-09-16 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791046#comment-14791046
 ] 

Pengcheng Xiong commented on HIVE-11846:


[~xuefuz], thanks for your attention. The problem can be better understood if 
you take a look at my patch. When CliDriver shuts down, it tries to drop all 
the tables that were created during the q test. It iterates through all the 
tables in db.getAllTables() in QTestUtil.java and tries to drop every one of 
them. Let's assume there are two tables: A, an original table, and index_A, an 
index table created based on A. If index_A comes before A in the iteration, 
there is no problem, because L674 in QTestUtil.java will skip it, and later 
when A is dropped, index_A is dropped as well. However, if A comes before 
index_A in the iteration, A will be dropped and index_A will be dropped along 
with it; later, the iteration will not find index_A and will throw 
InvalidTableException. That is the symptom of the problem and why it is 
related to CliDriver shutdown. Thanks.
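The interaction can be sketched abstractly (illustrative names, not QTestUtil's actual code): a drop loop over a stale snapshot of table names must either re-check existence before each drop or track which tables a drop cascades to.

```java
import java.util.*;

// Hedged sketch: dropping a base table implicitly drops its index table, so a
// naive loop over a snapshot can try to drop a table that no longer exists.
// Re-checking membership in the live set avoids the stray exception.
public class DropAllSketch {
    // base table -> index table implicitly dropped along with it
    static Map<String, String> indexOf = new HashMap<>();

    static void dropAll(List<String> snapshot, Set<String> existing) {
        for (String t : snapshot) {
            if (!existing.contains(t)) continue;   // already dropped via cascade
            existing.remove(t);                    // drop the table itself
            String idx = indexOf.get(t);
            if (idx != null) existing.remove(idx); // cascade drop of its index
        }
    }
}
```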

> CliDriver shutdown tries to drop index table again which was already dropped 
> when dropping the original table 
> --
>
> Key: HIVE-11846
> URL: https://issues.apache.org/jira/browse/HIVE-11846
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Critical
> Attachments: HIVE-11846.01.patch
>
>
> Steps to repro:
> {code}
> set hive.stats.dbclass=fs;
> set hive.stats.autogather=true;
> set hive.cbo.enable=true;
> DROP TABLE IF EXISTS aa;
> CREATE TABLE aa (L_ORDERKEY  INT,
> L_PARTKEY   INT,
> L_SUPPKEY   INT,
> L_LINENUMBERINT,
> L_QUANTITY  DOUBLE,
> L_EXTENDEDPRICE DOUBLE,
> L_DISCOUNT  DOUBLE,
> L_TAX   DOUBLE,
> L_RETURNFLAGSTRING,
> L_LINESTATUSSTRING,
> l_shipdate  STRING,
> L_COMMITDATESTRING,
> L_RECEIPTDATE   STRING,
> L_SHIPINSTRUCT  STRING,
> L_SHIPMODE  STRING,
> L_COMMENT   STRING)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|';
> LOAD DATA LOCAL INPATH '../../data/files/lineitem.txt' OVERWRITE INTO TABLE 
> aa;
> CREATE INDEX aa_lshipdate_idx ON TABLE aa(l_shipdate) AS 
> 'org.apache.hadoop.hive.ql.index.AggregateIndexHandler' WITH DEFERRED REBUILD 
> IDXPROPERTIES("AGGREGATES"="count(l_shipdate)");
> ALTER INDEX aa_lshipdate_idx ON aa REBUILD;
> show tables;
> explain select l_shipdate, count(l_shipdate)
> from aa
> group by l_shipdate;
> {code}
> The problem is that we create an index table default_aa_lshipdate_idx 
> (default is the database name), and it comes after the table aa in the 
> iteration. The cleanup first drops aa, which drops default_aa_lshipdate_idx 
> as well since it is related to aa. It then cannot find the table 
> default_aa_lshipdate_idx when it tries to drop it again, which throws an 
> exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11835) Type decimal(1,1) reads 0.0, 0.00, etc from text file as NULL

2015-09-16 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-11835:
---
Attachment: HIVE-11835.patch

> Type decimal(1,1) reads 0.0, 0.00, etc from text file as NULL
> -
>
> Key: HIVE-11835
> URL: https://issues.apache.org/jira/browse/HIVE-11835
> Project: Hive
>  Issue Type: Bug
>  Components: Types
>Affects Versions: 1.2.0, 1.1.0, 2.0.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-11835.patch
>
>
> Steps to reproduce:
> 1. create a text file with values like 0.0, 0.00, etc.
> 2. create table in hive with type decimal(1,1).
> 3. run "load data local inpath ..." to load data into the table.
> 4. run select * on the table.
> You will see that NULL is displayed for 0.0, 0.00, .0, etc. Instead, these 
> should be read as 0.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11819) HiveServer2 catches OOMs on request threads

2015-09-16 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11819:

Attachment: HIVE-11819.02.patch

This time, forgot the new file

> HiveServer2 catches OOMs on request threads
> ---
>
> Key: HIVE-11819
> URL: https://issues.apache.org/jira/browse/HIVE-11819
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11819.01.patch, HIVE-11819.02.patch, 
> HIVE-11819.patch
>
>
> ThriftCLIService methods such as ExecuteStatement are apparently capable of 
> catching OOMs because they get wrapped in RTE by HiveSessionProxy. 
> This shouldn't happen.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11839) Vectorization wrong results with filter of (CAST AS CHAR)

2015-09-16 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790886#comment-14790886
 ] 

Sergey Shelukhin commented on HIVE-11839:
-

+1. Does it also affect 1.2, 1.3 etc.? It should be backported accordingly

> Vectorization wrong results with filter of (CAST AS CHAR)
> -
>
> Key: HIVE-11839
> URL: https://issues.apache.org/jira/browse/HIVE-11839
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11839.01.patch
>
>
> PROBLEM:
> A query such as
> select count(1) from table where CAST (id as CHAR(4))='1000';
> gives the wrong result 0 instead of the expected result.
> STEPS TO REPRODUCE:
> create table s1(id smallint) stored as orc;
> insert into table s1 values (1000),(1001),(1002),(1003),(1000);
> set hive.vectorized.execution.enabled=true;
> select count(1) from s1 where cast(id as char(4))='1000';
> – this gives 0
> set hive.vectorized.execution.enabled=false;
> select count(1) from s1 where cast(id as char(4))='1000';
> – this gives 2
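A hedged model of the CHAR(n) semantics involved, to clarify why padding and trimming must be applied consistently on both sides of the comparison (the names below are illustrative, not Hive's vectorized cast code): the cast pads or truncates to length n, and CHAR comparison treats trailing spaces as insignificant, so comparing a padded value against an unpadded literal only works if both are trimmed the same way.

```java
// Illustrative sketch of CHAR(n) cast and comparison semantics.
public class CharCastSketch {
    // Pad with spaces to length n, or truncate if longer.
    static String castToChar(String v, int n) {
        if (v.length() >= n) return v.substring(0, n);
        StringBuilder sb = new StringBuilder(v);
        while (sb.length() < n) sb.append(' ');
        return sb.toString();
    }

    // CHAR equality ignores trailing spaces on both operands.
    static boolean charEquals(String a, String b) {
        return stripTrailing(a).equals(stripTrailing(b));
    }

    static String stripTrailing(String s) {
        int i = s.length();
        while (i > 0 && s.charAt(i - 1) == ' ') i--;
        return s.substring(0, i);
    }
}
```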



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11096) Bump the parquet version to 1.7.0

2015-09-16 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-11096:
---
Fix Version/s: 1.3.0

> Bump the parquet version to 1.7.0
> -
>
> Key: HIVE-11096
> URL: https://issues.apache.org/jira/browse/HIVE-11096
> Project: Hive
>  Issue Type: Task
>Affects Versions: 1.2.0
>Reporter: Sergio Peña
>Assignee: Ferdinand Xu
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11096.1.patch
>
>
> Parquet has officially become an Apache project as of parquet 1.7.0.
> This new version does not have any bugfixes or improvements over the previous 
> 1.6.0 version, but all imports were changed to org.apache.parquet, and the 
> pom.xml must use org.apache.parquet instead of com.twitter.
> This ticket should address those import and pom changes only.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11844) Merge master to Spark branch 9/16/2015 [Spark Branch]

2015-09-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790916#comment-14790916
 ] 

Hive QA commented on HIVE-11844:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12756302/HIVE-11844.1-spark.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 7467 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.initializationError
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_inner_join
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join2
org.apache.hadoop.hive.cli.TestMinimrCliDriver.initializationError
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/949/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/949/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-949/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12756302 - PreCommit-HIVE-SPARK-Build

> Merge master to Spark branch 9/16/2015 [Spark Branch]
> -
>
> Key: HIVE-11844
> URL: https://issues.apache.org/jira/browse/HIVE-11844
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-11844.1-spark.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11842) Improve RuleRegExp by caching some internal data structures

2015-09-16 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14790882#comment-14790882
 ] 

Sergey Shelukhin commented on HIVE-11842:
-

Is there a perf test for this?

> Improve RuleRegExp by caching some internal data structures
> ---
>
> Key: HIVE-11842
> URL: https://issues.apache.org/jira/browse/HIVE-11842
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.0.0
>
> Attachments: HIVE-11842.patch
>
>
> Continuing work started in HIVE-11141.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11819) HiveServer2 catches OOMs on request threads

2015-09-16 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11819:

Attachment: (was: HIVE-11819.02.patch)

> HiveServer2 catches OOMs on request threads
> ---
>
> Key: HIVE-11819
> URL: https://issues.apache.org/jira/browse/HIVE-11819
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11819.01.patch, HIVE-11819.patch
>
>
> ThriftCLIService methods such as ExecuteStatement are apparently capable of 
> catching OOMs because they get wrapped in RTE by HiveSessionProxy. 
> This shouldn't happen.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11844) Merge master to Spark branch 9/16/2015 [Spark Branch]

2015-09-16 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14790988#comment-14790988
 ] 

Xuefu Zhang commented on HIVE-11844:


Besides some test result diffs, there seems to be an issue with the test 
environment. Since there is only a minor conflict, I'm committing the merge now 
and will address the test and environment issues as follow-ups.

> Merge master to Spark branch 9/16/2015 [Spark Branch]
> -
>
> Key: HIVE-11844
> URL: https://issues.apache.org/jira/browse/HIVE-11844
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-11844.1-spark.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11846) CliDriver shutdown tries to drop index table again which was already dropped when dropping the original table

2015-09-16 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11846:
---
Attachment: HIVE-11846.01.patch

> CliDriver shutdown tries to drop index table again which was already dropped 
> when dropping the original table 
> --
>
> Key: HIVE-11846
> URL: https://issues.apache.org/jira/browse/HIVE-11846
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Critical
> Attachments: HIVE-11846.01.patch
>
>
> Steps to repro:
> {code}
> set hive.stats.dbclass=fs;
> set hive.stats.autogather=true;
> set hive.cbo.enable=true;
> DROP TABLE IF EXISTS aa;
> CREATE TABLE aa (L_ORDERKEY  INT,
> L_PARTKEY   INT,
> L_SUPPKEY   INT,
> L_LINENUMBERINT,
> L_QUANTITY  DOUBLE,
> L_EXTENDEDPRICE DOUBLE,
> L_DISCOUNT  DOUBLE,
> L_TAX   DOUBLE,
> L_RETURNFLAGSTRING,
> L_LINESTATUSSTRING,
> l_shipdate  STRING,
> L_COMMITDATESTRING,
> L_RECEIPTDATE   STRING,
> L_SHIPINSTRUCT  STRING,
> L_SHIPMODE  STRING,
> L_COMMENT   STRING)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|';
> LOAD DATA LOCAL INPATH '../../data/files/lineitem.txt' OVERWRITE INTO TABLE 
> aa;
> CREATE INDEX aa_lshipdate_idx ON TABLE aa(l_shipdate) AS 
> 'org.apache.hadoop.hive.ql.index.AggregateIndexHandler' WITH DEFERRED REBUILD 
> IDXPROPERTIES("AGGREGATES"="count(l_shipdate)");
> ALTER INDEX aa_lshipdate_idx ON aa REBUILD;
> show tables;
> explain select l_shipdate, count(l_shipdate)
> from aa
> group by l_shipdate;
> {code}
> The problem is that we create an index table default_aa_lshipdate_idx 
> (default is the database name), and it comes after the table aa. During 
> shutdown, aa is dropped first, which also drops default_aa_lshipdate_idx 
> because it belongs to aa. The subsequent attempt to drop 
> default_aa_lshipdate_idx then fails to find the table and throws an exception.
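The double drop described above is harmless if the cleanup is idempotent, i.e. dropping a table that was already cascaded away is a no-op rather than an error. The sketch below is a hypothetical illustration of that pattern (class and method names `IdempotentDrop`, `dropIfExists` are invented, not CliDriver's actual shutdown code):

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical catalog where dropping an already-removed table (e.g. an
// index table that was cascaded away with its base table) is a no-op.
public class IdempotentDrop {
    private final Set<String> tables = new HashSet<>();

    public void create(String name) {
        tables.add(name);
    }

    // Returns false instead of throwing when the table is already gone.
    public boolean dropIfExists(String name) {
        return tables.remove(name);
    }

    public static void main(String[] args) {
        IdempotentDrop catalog = new IdempotentDrop();
        catalog.create("aa");
        catalog.create("default_aa_lshipdate_idx");
        // Dropping the base table and its index table once succeeds.
        catalog.dropIfExists("aa");
        catalog.dropIfExists("default_aa_lshipdate_idx");
        // Shutdown tries the index table again: harmless no-op now.
        System.out.println(catalog.dropIfExists("default_aa_lshipdate_idx")); // false
    }
}
```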



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11110) Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, improve Filter selectivity estimation

2015-09-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791037#comment-14791037
 ] 

Hive QA commented on HIVE-11110:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12756178/HIVE-11110.13.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5302/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5302/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5302/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-5302/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   27eeadc..efd059c  branch-1   -> origin/branch-1
   ce71355..57158da  master -> origin/master
   f78f663..70eeadd  spark  -> origin/spark
+ git reset --hard HEAD
HEAD is now at ce71355 HIVE-8327: (repeat) mvn site -Pfindbugs for hive (Gopal 
V reviewed by Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
+ git reset --hard origin/master
HEAD is now at 57158da HIVE-11816 : Upgrade groovy to 2.4.4 (Szehon, reviewed 
by Xuefu)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12756178 - PreCommit-HIVE-TRUNK-Build

> Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, 
> improve Filter selectivity estimation
> 
>
> Key: HIVE-11110
> URL: https://issues.apache.org/jira/browse/HIVE-11110
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-11110-10.patch, HIVE-11110-11.patch, 
> HIVE-11110-12.patch, HIVE-11110-branch-1.2.patch, HIVE-11110.1.patch, 
> HIVE-11110.13.patch, HIVE-11110.2.patch, HIVE-11110.4.patch, 
> HIVE-11110.5.patch, HIVE-11110.6.patch, HIVE-11110.7.patch, 
> HIVE-11110.8.patch, HIVE-11110.9.patch, HIVE-11110.91.patch, 
> HIVE-11110.92.patch, HIVE-11110.patch
>
>
> Query
> {code}
> select  count(*)
>  from store_sales
>  ,store_returns
>  ,date_dim d1
>  ,date_dim d2
>  where d1.d_quarter_name = '2000Q1'
>and d1.d_date_sk = ss_sold_date_sk
>and ss_customer_sk = sr_customer_sk
>and ss_item_sk = sr_item_sk
>and ss_ticket_number = sr_ticket_number
>and sr_returned_date_sk = d2.d_date_sk
>and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3');
> {code}
> The store_sales table is partitioned on ss_sold_date_sk, which is also used 
> in a join clause. The join clause should add a filter “filterExpr: 
> ss_sold_date_sk is not null”, which should get pushed the MetaStore when 
> fetching the stats. Currently this is not done in 

[jira] [Commented] (HIVE-8327) mvn site -Pfindbugs

2015-09-16 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791054#comment-14791054
 ] 

Lefty Leverenz commented on HIVE-8327:
--

Again:  should this be documented?

> mvn site -Pfindbugs
> ---
>
> Key: HIVE-8327
> URL: https://issues.apache.org/jira/browse/HIVE-8327
> Project: Hive
>  Issue Type: Test
>  Components: Diagnosability
>Reporter: Gopal V
>Assignee: Gopal V
> Fix For: 1.1.0
>
> Attachments: HIVE-8327.1.patch, HIVE-8327.2.patch, ql-findbugs.html
>
>
> HIVE-3099 originally added findbugs into the old ant build.
> Get basic findbugs working for the maven build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11791) Add unit test for HIVE-10122

2015-09-16 Thread Illya Yalovyy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791101#comment-14791101
 ] 

Illya Yalovyy commented on HIVE-11791:
--

[~gopalv], There is an inconsistency:
compactExpr(or(true, NULL)) => true, but
compactExpr(or(NULL, true)) => NULL.

If true == NULL in this context, then this behavior is acceptable, but still 
inconsistent.
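The asymmetry reported above can be removed by making the OR-compaction rule symmetric in its operands. The sketch below is an illustrative model only, not Hive's actual `PartitionPruner.compactExpr()` code; it uses a `Boolean` with `null` standing for a subtree the pruner cannot evaluate:

```java
// Hypothetical three-valued OR compaction: null models an unknown
// predicate subtree. TRUE short-circuits in either operand position,
// so or(true, NULL) and or(NULL, true) compact identically.
public class OrCompaction {
    public static Boolean compactOr(Boolean left, Boolean right) {
        if (Boolean.TRUE.equals(left) || Boolean.TRUE.equals(right)) {
            return Boolean.TRUE;  // TRUE dominates OR regardless of order
        }
        if (Boolean.FALSE.equals(left)) {
            return right;         // FALSE is the identity element for OR
        }
        if (Boolean.FALSE.equals(right)) {
            return left;
        }
        return null;              // both sides unknown: stay unknown
    }

    public static void main(String[] args) {
        System.out.println(compactOr(Boolean.TRUE, null));  // true
        System.out.println(compactOr(null, Boolean.TRUE));  // true
        System.out.println(compactOr(null, null));          // null
    }
}
```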




> Add unit test for HIVE-10122
> 
>
> Key: HIVE-11791
> URL: https://issues.apache.org/jira/browse/HIVE-11791
> Project: Hive
>  Issue Type: Test
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
>Priority: Minor
> Attachments: HIVE-11791.patch
>
>
> Unit tests for PartitionPruner.compactExpr()



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11834) Lineage doesn't work with dynamic partitioning query

2015-09-16 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-11834:
---
Attachment: HIVE-11834.1.patch

> Lineage doesn't work with dynamic partitioning query
> 
>
> Key: HIVE-11834
> URL: https://issues.apache.org/jira/browse/HIVE-11834
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: HIVE-11834.1.patch
>
>
> As Mark found out,
> https://issues.apache.org/jira/browse/HIVE-11139?focusedCommentId=14745937&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14745937
> This is indeed a code bug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11839) Vectorization wrong results with filter of (CAST AS CHAR)

2015-09-16 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791207#comment-14791207
 ] 

Matt McCline commented on HIVE-11839:
-

Committed to trunk.

> Vectorization wrong results with filter of (CAST AS CHAR)
> -
>
> Key: HIVE-11839
> URL: https://issues.apache.org/jira/browse/HIVE-11839
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11839.01.patch
>
>
> PROBLEM:
> For a query such as
> select count(1) from table where CAST (id as CHAR(4))='1000';
> the vectorized path gives a wrong result of 0 instead of the expected count.
> STEPS TO REPRODUCE:
> create table s1(id smallint) stored as orc;
> insert into table s1 values (1000),(1001),(1002),(1003),(1000);
> set hive.vectorized.execution.enabled=true;
> select count(1) from s1 where cast(id as char(4))='1000';
> -- this gives 0
> set hive.vectorized.execution.enabled=false;
> select count(1) from s1 where cast(id as char(4))='1000';
> -- this gives 2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11815) Correct the column/table names in subquery expression when creating a view

2015-09-16 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11815:
---
Attachment: HIVE-11815.03.patch

rebase the patch based on recent changes on master.

> Correct the column/table names in subquery expression when creating a view
> --
>
> Key: HIVE-11815
> URL: https://issues.apache.org/jira/browse/HIVE-11815
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11815.01.patch, HIVE-11815.02.patch, 
> HIVE-11815.03.patch
>
>
> Right now Hive does not quote column/table names in subquery expression when 
> create a view. For example
> {code}
> hive>
> > create table tc (`@d` int);
> OK
> Time taken: 0.119 seconds
> hive> create view tcv as select * from tc b where exists (select a.`@d` from 
> tc a where b.`@d`=a.`@d`);
> OK
> Time taken: 0.075 seconds
> hive> describe extended tcv;
> OK
> @dint
> Detailed Table InformationTable(tableName:tcv, dbName:default, 
> owner:pxiong, createTime:1442250005, lastAccessTime:0, retention:0, 
> sd:StorageDescriptor(cols:[FieldSchema(name:@d, type:int, comment:null)], 
> location:null, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, 
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
> serializationLib:null, parameters:{}), bucketCols:[], sortCols:[], 
> parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], 
> skewedColValueLocationMaps:{}), storedAsSubDirectories:false), 
> partitionKeys:[], parameters:{transient_lastDdlTime=1442250005}, 
> viewOriginalText:select * from tc b where exists (select a.@d from tc a where 
> b.@d=a.@d), viewExpandedText:select `b`.`@d` from `default`.`tc` `b` where 
> exists (select a.@d from tc a where b.@d=a.@d), tableType:VIRTUAL_VIEW)
> Time taken: 0.063 seconds, Fetched: 3 row(s)
> hive> select * from tcv;
> FAILED: SemanticException line 1:63 character '@' not supported here
> line 1:84 character '@' not supported here
> line 1:89 character '@' not supported here in definition of VIEW tcv [
> select `b`.`@d` from `default`.`tc` `b` where exists (select a.@d from tc a 
> where b.@d=a.@d)
> ] used as tcv at Line 1:14
> {code}
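The failure above comes down to identifiers like `@d` losing their backtick quoting when the view's expanded text is generated, so re-parsing the view rejects the bare `@`. A minimal, hypothetical helper (`QuoteIdent.quote` is an invented name, not Hive's actual code) shows the quoting discipline the expanded text needs:

```java
// Hypothetical identifier quoter: wraps a name in backticks and escapes
// embedded backticks by doubling them, so names like @d survive
// re-parsing of the expanded view text.
public class QuoteIdent {
    public static String quote(String ident) {
        return "`" + ident.replace("`", "``") + "`";
    }

    public static void main(String[] args) {
        System.out.println(quote("@d"));        // `@d`
        System.out.println(quote("plain_col")); // `plain_col`
        System.out.println(quote("a`b"));       // `a``b`
    }
}
```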



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11839) Vectorization wrong results with filter of (CAST AS CHAR)

2015-09-16 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791178#comment-14791178
 ] 

Matt McCline commented on HIVE-11839:
-

Thanks [~sershe] for quick review.

Test failures are unrelated.

> Vectorization wrong results with filter of (CAST AS CHAR)
> -
>
> Key: HIVE-11839
> URL: https://issues.apache.org/jira/browse/HIVE-11839
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11839.01.patch
>
>
> PROBLEM:
> For a query such as
> select count(1) from table where CAST (id as CHAR(4))='1000';
> the vectorized path gives a wrong result of 0 instead of the expected count.
> STEPS TO REPRODUCE:
> create table s1(id smallint) stored as orc;
> insert into table s1 values (1000),(1001),(1002),(1003),(1000);
> set hive.vectorized.execution.enabled=true;
> select count(1) from s1 where cast(id as char(4))='1000';
> -- this gives 0
> set hive.vectorized.execution.enabled=false;
> select count(1) from s1 where cast(id as char(4))='1000';
> -- this gives 2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11826) 'hadoop.proxyuser.hive.groups' configuration doesn't prevent unauthorized user to access metastore

2015-09-16 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-11826:

Attachment: HIVE-11826.2.patch

Change the TestHadoop20SAuthBridge.java to be the one for version23 since 
version23S is already removed from the code base. 

> 'hadoop.proxyuser.hive.groups' configuration doesn't prevent unauthorized 
> user to access metastore
> --
>
> Key: HIVE-11826
> URL: https://issues.apache.org/jira/browse/HIVE-11826
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-11826.2.patch, HIVE-11826.patch
>
>
> With 'hadoop.proxyuser.hive.groups' configured in core-site.xml to certain 
> groups, currently if you run the job with a user not belonging to those 
> groups, it won't fail to access metastore. With old version hive 0.13, 
> actually it fails properly. 
> Seems HadoopThriftAuthBridge20S.java correctly call ProxyUsers.authorize() 
> while HadoopThriftAuthBridge23 doesn't. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11839) Vectorization wrong results with filter of (CAST AS CHAR)

2015-09-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791173#comment-14791173
 ] 

Hive QA commented on HIVE-11839:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12756187/HIVE-11839.01.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9447 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5303/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5303/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5303/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12756187 - PreCommit-HIVE-TRUNK-Build

> Vectorization wrong results with filter of (CAST AS CHAR)
> -
>
> Key: HIVE-11839
> URL: https://issues.apache.org/jira/browse/HIVE-11839
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11839.01.patch
>
>
> PROBLEM:
> For a query such as
> select count(1) from table where CAST (id as CHAR(4))='1000';
> the vectorized path gives a wrong result of 0 instead of the expected count.
> STEPS TO REPRODUCE:
> create table s1(id smallint) stored as orc;
> insert into table s1 values (1000),(1001),(1002),(1003),(1000);
> set hive.vectorized.execution.enabled=true;
> select count(1) from s1 where cast(id as char(4))='1000';
> -- this gives 0
> set hive.vectorized.execution.enabled=false;
> select count(1) from s1 where cast(id as char(4))='1000';
> -- this gives 2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11849) NPE in HiveHBaseTableShapshotInputFormat in query with just count(*)

2015-09-16 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791221#comment-14791221
 ] 

Enis Soztutar commented on HIVE-11849:
--

Offline conversation with Jason, we have noted a couple of things: 
 - HiveHBaseTableSnapshotInputFormat.java uses the mapred API while 
HiveHBaseTableInputFormat uses the mapreduce API. [~ndimiduk] I remember you 
specifically mentioned that all Hive IFs use mapred. Has that changed? 
 - The mapred version of HBase's TableMapReduceUtil does not have the utility 
methods to pass the Scan serialized inside the job configuration. The only 
supported way is to set the Scan through the now-deprecated 
{{TableInputFormat.COLUMN_LIST}}. 
 - Although the mapred {{TableMapReduceUtil}} does not support setting the 
serialized scan, we can still set it manually and have the TSIF work correctly 
since the mapred and mapreduce versions share the same underlying implementation 
 

> NPE in HiveHBaseTableShapshotInputFormat in query with just count(*)
> 
>
> Key: HIVE-11849
> URL: https://issues.apache.org/jira/browse/HIVE-11849
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 1.3.0
>Reporter: Jason Dere
>
> Adding the following example as a qfile test in hbase-handler fails. Looks 
> like this may have been introduced by HIVE-5277.
> {noformat}
> SET hive.hbase.snapshot.name=src_hbase_snapshot;
> SET hive.hbase.snapshot.restoredir=/tmp;
> select count(*) from src_hbase;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11791) Add unit test for HIVE-10122

2015-09-16 Thread Illya Yalovyy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Illya Yalovyy updated HIVE-11791:
-
Attachment: HIVE-11791.2.patch

Updated expected results and fixed some issues with expression compaction logic.

> Add unit test for HIVE-10122
> 
>
> Key: HIVE-11791
> URL: https://issues.apache.org/jira/browse/HIVE-11791
> Project: Hive
>  Issue Type: Test
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
>Priority: Minor
> Attachments: HIVE-11791.2.patch, HIVE-11791.patch
>
>
> Unit tests for PartitionPruner.compactExpr()



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11820) export tables with size of >32MB throws "java.lang.IllegalArgumentException: Skip CRC is valid only with update options"

2015-09-16 Thread Takahiko Saito (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takahiko Saito updated HIVE-11820:
--
Attachment: HIVE-11820.patch

> export tables with size of >32MB throws "java.lang.IllegalArgumentException: 
> Skip CRC is valid only with update options"
> 
>
> Key: HIVE-11820
> URL: https://issues.apache.org/jira/browse/HIVE-11820
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Takahiko Saito
>Assignee: Takahiko Saito
> Fix For: 1.2.1
>
> Attachments: HIVE-11820.patch
>
>
> Tested a patch of HIVE-11607 and seeing the following exception:
> {noformat}
> 2015-09-14 21:44:16,817 ERROR [main]: exec.Task 
> (SessionState.java:printError(960)) - Failed with exception Skip CRC is valid 
> only with update options
> java.lang.IllegalArgumentException: Skip CRC is valid only with update options
> at 
> org.apache.hadoop.tools.DistCpOptions.validate(DistCpOptions.java:556)
> at 
> org.apache.hadoop.tools.DistCpOptions.setSkipCRC(DistCpOptions.java:311)
> at 
> org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1147)
> at org.apache.hadoop.hive.common.FileUtils.copy(FileUtils.java:553)
> at org.apache.hadoop.hive.ql.exec.CopyTask.execute(CopyTask.java:82)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1655)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1414)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> A possible resolution is to reverse the order of the following two lines from 
> a patch of HIVE-11607:
> {noformat}
> +options.setSkipCRC(true);
> +options.setSyncFolder(true);
> {noformat}
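The stack trace explains the ordering sensitivity: `DistCpOptions.setSkipCRC()` calls `validate()` eagerly, and skipping CRC checks is only legal once sync/update mode is already enabled, so `setSyncFolder(true)` must run first. The sketch below only models that eager-validation dependency; `EagerOptions` is an invented stand-in, not Hadoop's `DistCpOptions`:

```java
// Hypothetical options object whose setter validates eagerly against
// the current state, mirroring why setSyncFolder(true) must precede
// setSkipCRC(true) in the HIVE-11607 patch.
public class EagerOptions {
    private boolean syncFolder;
    private boolean skipCRC;

    public void setSyncFolder(boolean v) {
        this.syncFolder = v;
    }

    public void setSkipCRC(boolean v) {
        // Eager validation: skipping CRC only makes sense in sync/update
        // mode, so reject the flag if syncFolder is not yet enabled.
        if (v && !syncFolder) {
            throw new IllegalArgumentException(
                "Skip CRC is valid only with update options");
        }
        this.skipCRC = v;
    }

    public boolean isSkipCRC() {
        return skipCRC;
    }

    public static void main(String[] args) {
        EagerOptions options = new EagerOptions();
        options.setSyncFolder(true); // must come first
        options.setSkipCRC(true);    // now passes validation
        System.out.println(options.isSkipCRC()); // true
    }
}
```

Reversing the two calls reproduces the `IllegalArgumentException` from the report.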



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11834) Lineage doesn't work with dynamic partitioning query

2015-09-16 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791244#comment-14791244
 ] 

Jimmy Xiang commented on HIVE-11834:


Patch is on RB: https://reviews.apache.org/r/38442/

> Lineage doesn't work with dynamic partitioning query
> 
>
> Key: HIVE-11834
> URL: https://issues.apache.org/jira/browse/HIVE-11834
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11834.1.patch
>
>
> As Mark found out,
> https://issues.apache.org/jira/browse/HIVE-11139?focusedCommentId=14745937&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14745937
> This is indeed a code bug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11846) CliDriver shutdown tries to drop index table again which was already dropped when dropping the original table

2015-09-16 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791294#comment-14791294
 ] 

Xuefu Zhang commented on HIVE-11846:


Got it. Thanks for the explanation.

> CliDriver shutdown tries to drop index table again which was already dropped 
> when dropping the original table 
> --
>
> Key: HIVE-11846
> URL: https://issues.apache.org/jira/browse/HIVE-11846
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Critical
> Attachments: HIVE-11846.01.patch
>
>
> Steps to repro:
> {code}
> set hive.stats.dbclass=fs;
> set hive.stats.autogather=true;
> set hive.cbo.enable=true;
> DROP TABLE IF EXISTS aa;
> CREATE TABLE aa (L_ORDERKEY  INT,
> L_PARTKEY   INT,
> L_SUPPKEY   INT,
> L_LINENUMBERINT,
> L_QUANTITY  DOUBLE,
> L_EXTENDEDPRICE DOUBLE,
> L_DISCOUNT  DOUBLE,
> L_TAX   DOUBLE,
> L_RETURNFLAGSTRING,
> L_LINESTATUSSTRING,
> l_shipdate  STRING,
> L_COMMITDATESTRING,
> L_RECEIPTDATE   STRING,
> L_SHIPINSTRUCT  STRING,
> L_SHIPMODE  STRING,
> L_COMMENT   STRING)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|';
> LOAD DATA LOCAL INPATH '../../data/files/lineitem.txt' OVERWRITE INTO TABLE 
> aa;
> CREATE INDEX aa_lshipdate_idx ON TABLE aa(l_shipdate) AS 
> 'org.apache.hadoop.hive.ql.index.AggregateIndexHandler' WITH DEFERRED REBUILD 
> IDXPROPERTIES("AGGREGATES"="count(l_shipdate)");
> ALTER INDEX aa_lshipdate_idx ON aa REBUILD;
> show tables;
> explain select l_shipdate, count(l_shipdate)
> from aa
> group by l_shipdate;
> {code}
> The problem is that we create an index table default_aa_lshipdate_idx 
> (default is the database name), and it comes after the table aa. During 
> shutdown, aa is dropped first, which also drops default_aa_lshipdate_idx 
> because it belongs to aa. The subsequent attempt to drop 
> default_aa_lshipdate_idx then fails to find the table and throws an exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11839) Vectorization wrong results with filter of (CAST AS CHAR)

2015-09-16 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791300#comment-14791300
 ] 

Xuefu Zhang commented on HIVE-11839:


Could we update the fix versions please?

> Vectorization wrong results with filter of (CAST AS CHAR)
> -
>
> Key: HIVE-11839
> URL: https://issues.apache.org/jira/browse/HIVE-11839
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11839.01.patch
>
>
> PROBLEM:
> For a query such as
> select count(1) from table where CAST (id as CHAR(4))='1000';
> the vectorized path gives a wrong result of 0 instead of the expected count.
> STEPS TO REPRODUCE:
> create table s1(id smallint) stored as orc;
> insert into table s1 values (1000),(1001),(1002),(1003),(1000);
> set hive.vectorized.execution.enabled=true;
> select count(1) from s1 where cast(id as char(4))='1000';
> -- this gives 0
> set hive.vectorized.execution.enabled=false;
> select count(1) from s1 where cast(id as char(4))='1000';
> -- this gives 2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8342) Potential null dereference in ColumnTruncateMapper#jobClose()

2015-09-16 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791307#comment-14791307
 ] 

Lars Francke commented on HIVE-8342:


Hey [~tedyu], I get notifications about this issue every once in a while because 
you seemingly change something, but it looks like nothing is actually being 
changed. Is this a JIRA problem?

> Potential null dereference in ColumnTruncateMapper#jobClose()
> -
>
> Key: HIVE-8342
> URL: https://issues.apache.org/jira/browse/HIVE-8342
> Project: Hive
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: skrho
>Priority: Minor
> Attachments: HIVE-8342_001.patch, HIVE-8342_002.patch
>
>
> {code}
> Utilities.mvFileToFinalPath(outputPath, job, success, LOG, dynPartCtx, 
> null,
>   reporter);
> {code}
> Utilities.mvFileToFinalPath() calls createEmptyBuckets() where conf is 
> dereferenced:
> {code}
> boolean isCompressed = conf.getCompressed();
> TableDesc tableInfo = conf.getTableInfo();
> {code}
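The shape of the fix is a null guard before the dereference. A minimal sketch of that pattern, with hypothetical names (the actual fix is in the attached patches, which operate on the Java classes above):

```python
def mv_file_to_final_path(conf):
    # The downstream createEmptyBuckets()-style logic dereferences conf
    # (conf.getCompressed(), conf.getTableInfo()), so a null/None conf must
    # be checked before that call is made.
    if conf is None:
        return None  # skip the empty-bucket step rather than dereference
    return (conf["compressed"], conf["table_info"])

print(mv_file_to_final_path(None))                              # no crash
print(mv_file_to_final_path({"compressed": True, "table_info": "t1"}))
```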





[jira] [Updated] (HIVE-11852) numRows and rawDataSize table properties are not replicated

2015-09-16 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-11852:

Reporter: Paul Isaychuk  (was: Sushanth Sowmyan)

> numRows and rawDataSize table properties are not replicated
> ---
>
> Key: HIVE-11852
> URL: https://issues.apache.org/jira/browse/HIVE-11852
> Project: Hive
>  Issue Type: Bug
>  Components: Import/Export
>Affects Versions: 1.2.1
>Reporter: Paul Isaychuk
>Assignee: Sushanth Sowmyan
>
> numRows and rawDataSize table properties are not replicated when exported for 
> replication and re-imported.
> {code}
> Table drdbnonreplicatabletable.vanillatable has different TblProps from 
> drdbnonreplicatabletable.vanillatable expected [{numFiles=1, numRows=2, 
> totalSize=560, rawDataSize=440}] but found [{numFiles=1, totalSize=560}]
> java.lang.AssertionError: Table drdbnonreplicatabletable.vanillatable has 
> different TblProps from drdbnonreplicatabletable.vanillatable expected 
> [{numFiles=1, numRows=2, totalSize=560, rawDataSize=440}] but found 
> [{numFiles=1, totalSize=560}]
> {code}





[jira] [Updated] (HIVE-11839) Vectorization wrong results with filter of (CAST AS CHAR)

2015-09-16 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11839:

Fix Version/s: 2.0.0
   1.3.0

> Vectorization wrong results with filter of (CAST AS CHAR)
> -
>
> Key: HIVE-11839
> URL: https://issues.apache.org/jira/browse/HIVE-11839
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11839.01.patch
>
>
> PROBLEM:
> For a query such as
> select count(1) from table where CAST (id as CHAR(4))='1000';
> the result is 0 rather than the expected 2.
> STEPS TO REPRODUCE:
> create table s1(id smallint) stored as orc;
> insert into table s1 values (1000),(1001),(1002),(1003),(1000);
> set hive.vectorized.execution.enabled=true;
> select count(1) from s1 where cast(id as char(4))='1000';
> -- this gives 0
> set hive.vectorized.execution.enabled=false;
> select count(1) from s1 where cast(id as char(4))='1000';
> -- this gives 2





[jira] [Commented] (HIVE-11842) Improve RuleRegExp by caching some internal data structures

2015-09-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791385#comment-14791385
 ] 

Hive QA commented on HIVE-11842:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12756271/HIVE-11842.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9447 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5305/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5305/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5305/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12756271 - PreCommit-HIVE-TRUNK-Build

> Improve RuleRegExp by caching some internal data structures
> ---
>
> Key: HIVE-11842
> URL: https://issues.apache.org/jira/browse/HIVE-11842
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.0.0
>
> Attachments: HIVE-11842.patch
>
>
> Continuing work started in HIVE-11141.





[jira] [Commented] (HIVE-11553) use basic file metadata cache in ETLSplitStrategy-related paths

2015-09-16 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791429#comment-14791429
 ] 

Sergey Shelukhin commented on HIVE-11553:
-

[~gopalv] [~prasanth_j] can you please review this? Note that this is stage 1, 
before PPD. PPD is stage 2 :)
Unfortunately my local branches are a clusterfuck by now and everything now 
depends on this patch, which makes it hard to make progress.

> use basic file metadata cache in ETLSplitStrategy-related paths
> ---
>
> Key: HIVE-11553
> URL: https://issues.apache.org/jira/browse/HIVE-11553
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hbase-metastore-branch
>
> Attachments: HIVE-11553.01.patch, HIVE-11553.02.patch, 
> HIVE-11553.patch
>
>
> NO PRECOMMIT TESTS





[jira] [Commented] (HIVE-8327) mvn site -Pfindbugs

2015-09-16 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791280#comment-14791280
 ] 

Gopal V commented on HIVE-8327:
---

Yes, I already added a doc.

https://cwiki.apache.org/confluence/display/Hive/HiveDeveloperFAQ#HiveDeveloperFAQ-Howtorunfindbugsafterachange?

> mvn site -Pfindbugs
> ---
>
> Key: HIVE-8327
> URL: https://issues.apache.org/jira/browse/HIVE-8327
> Project: Hive
>  Issue Type: Test
>  Components: Diagnosability
>Reporter: Gopal V
>Assignee: Gopal V
> Fix For: 1.1.0
>
> Attachments: HIVE-8327.1.patch, HIVE-8327.2.patch, ql-findbugs.html
>
>
> HIVE-3099 originally added findbugs into the old ant build.
> Get basic findbugs working for the maven build.





[jira] [Commented] (HIVE-11512) Hive LDAP Authenticator should also support full DN in Authenticate()

2015-09-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791290#comment-14791290
 ] 

Hive QA commented on HIVE-11512:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12756286/HIVE-11512.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9445 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5304/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5304/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5304/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12756286 - PreCommit-HIVE-TRUNK-Build

> Hive LDAP Authenticator should also support full DN in Authenticate() 
> --
>
> Key: HIVE-11512
> URL: https://issues.apache.org/jira/browse/HIVE-11512
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Attachments: HIVE-11512.patch
>
>
> In certain LDAP implementations, LDAP binding can occur using the full DN for 
> the user. Currently, the LDAP Authentication Provider assumes that the username 
> passed into Authenticate() is a short username and not a full DN. While the 
> initial bind works fine either way, the filter code relies on it being a 
> short name.
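The distinction can be sketched in a few lines: a full DN is a comma-separated list of `attr=value` RDNs, and the short name the filter code expects is the value of the leftmost RDN. This is a hypothetical helper for illustration, not the code in the attached patch:

```python
def is_full_dn(user: str) -> bool:
    # Heuristic: a full DN looks like "uid=jdoe,ou=people,dc=example,dc=com"
    return "=" in user and "," in user

def short_name(user: str) -> str:
    # Given a full DN, extract the leftmost RDN's value; otherwise the
    # username is already short and passes through unchanged.
    if is_full_dn(user):
        first_rdn = user.split(",", 1)[0]      # "uid=jdoe"
        return first_rdn.split("=", 1)[1]      # "jdoe"
    return user

print(short_name("uid=jdoe,ou=people,dc=example,dc=com"))  # jdoe
print(short_name("jdoe"))                                  # jdoe
```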





[jira] [Comment Edited] (HIVE-11839) Vectorization wrong results with filter of (CAST AS CHAR)

2015-09-16 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791300#comment-14791300
 ] 

Xuefu Zhang edited comment on HIVE-11839 at 9/16/15 11:32 PM:
--

Could we update the fix versions please? Also, affected versions.


was (Author: xuefuz):
Could we update the fix versions please?

> Vectorization wrong results with filter of (CAST AS CHAR)
> -
>
> Key: HIVE-11839
> URL: https://issues.apache.org/jira/browse/HIVE-11839
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11839.01.patch
>
>
> PROBLEM:
> For a query such as
> select count(1) from table where CAST (id as CHAR(4))='1000';
> the result is 0 rather than the expected 2.
> STEPS TO REPRODUCE:
> create table s1(id smallint) stored as orc;
> insert into table s1 values (1000),(1001),(1002),(1003),(1000);
> set hive.vectorized.execution.enabled=true;
> select count(1) from s1 where cast(id as char(4))='1000';
> -- this gives 0
> set hive.vectorized.execution.enabled=false;
> select count(1) from s1 where cast(id as char(4))='1000';
> -- this gives 2





[jira] [Updated] (HIVE-11820) export tables with size of >32MB throws "java.lang.IllegalArgumentException: Skip CRC is valid only with update options"

2015-09-16 Thread Takahiko Saito (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takahiko Saito updated HIVE-11820:
--
Fix Version/s: (was: 1.2.1)

> export tables with size of >32MB throws "java.lang.IllegalArgumentException: 
> Skip CRC is valid only with update options"
> 
>
> Key: HIVE-11820
> URL: https://issues.apache.org/jira/browse/HIVE-11820
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Takahiko Saito
>Assignee: Takahiko Saito
> Attachments: HIVE-11820.patch
>
>
> Tested a patch of HIVE-11607 and saw the following exception:
> {noformat}
> 2015-09-14 21:44:16,817 ERROR [main]: exec.Task 
> (SessionState.java:printError(960)) - Failed with exception Skip CRC is valid 
> only with update options
> java.lang.IllegalArgumentException: Skip CRC is valid only with update options
> at 
> org.apache.hadoop.tools.DistCpOptions.validate(DistCpOptions.java:556)
> at 
> org.apache.hadoop.tools.DistCpOptions.setSkipCRC(DistCpOptions.java:311)
> at 
> org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1147)
> at org.apache.hadoop.hive.common.FileUtils.copy(FileUtils.java:553)
> at org.apache.hadoop.hive.ql.exec.CopyTask.execute(CopyTask.java:82)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1655)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1414)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> A possible resolution is to reverse the order of the following two lines from 
> a patch of HIVE-11607:
> {noformat}
> +options.setSkipCRC(true);
> +options.setSyncFolder(true);
> {noformat}
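The ordering matters because setSkipCRC() is validated against the current update/sync flag. A toy model of that check makes the failure and the fix concrete (hypothetical class, not the Hadoop DistCpOptions API):

```python
class DistCpOptionsModel:
    """Toy model of DistCp's option validation, not the real Hadoop class."""

    def __init__(self):
        self.sync_folder = False
        self.skip_crc = False

    def set_sync_folder(self, value: bool) -> None:
        self.sync_folder = value

    def set_skip_crc(self, value: bool) -> None:
        # Mirrors the validate() behavior: skipCRC is only valid when
        # update (sync folder) semantics are already enabled.
        if value and not self.sync_folder:
            raise ValueError("Skip CRC is valid only with update options")
        self.skip_crc = value

# Original order reproduces the exception: skipCRC is set first.
opts = DistCpOptionsModel()
try:
    opts.set_skip_crc(True)
except ValueError as e:
    print(e)  # Skip CRC is valid only with update options

# Reversed order (sync folder first) succeeds.
opts = DistCpOptionsModel()
opts.set_sync_folder(True)
opts.set_skip_crc(True)
print(opts.skip_crc)  # True
```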





[jira] [Updated] (HIVE-11820) export tables with size of >32MB throws "java.lang.IllegalArgumentException: Skip CRC is valid only with update options"

2015-09-16 Thread Takahiko Saito (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takahiko Saito updated HIVE-11820:
--
Affects Version/s: (was: 1.2.1)

> export tables with size of >32MB throws "java.lang.IllegalArgumentException: 
> Skip CRC is valid only with update options"
> 
>
> Key: HIVE-11820
> URL: https://issues.apache.org/jira/browse/HIVE-11820
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Takahiko Saito
>Assignee: Takahiko Saito
> Attachments: HIVE-11820.patch
>
>
> Tested a patch of HIVE-11607 and saw the following exception:
> {noformat}
> 2015-09-14 21:44:16,817 ERROR [main]: exec.Task 
> (SessionState.java:printError(960)) - Failed with exception Skip CRC is valid 
> only with update options
> java.lang.IllegalArgumentException: Skip CRC is valid only with update options
> at 
> org.apache.hadoop.tools.DistCpOptions.validate(DistCpOptions.java:556)
> at 
> org.apache.hadoop.tools.DistCpOptions.setSkipCRC(DistCpOptions.java:311)
> at 
> org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1147)
> at org.apache.hadoop.hive.common.FileUtils.copy(FileUtils.java:553)
> at org.apache.hadoop.hive.ql.exec.CopyTask.execute(CopyTask.java:82)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1655)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1414)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> A possible resolution is to reverse the order of the following two lines from 
> a patch of HIVE-11607:
> {noformat}
> +options.setSkipCRC(true);
> +options.setSyncFolder(true);
> {noformat}





[jira] [Commented] (HIVE-11849) NPE in HiveHBaseTableShapshotInputFormat in query with just count(*)

2015-09-16 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791247#comment-14791247
 ] 

Nick Dimiduk commented on HIVE-11849:
-

Yeah, IIRC, this stuff is a big mess of the two different mapred APIs across 
the two projects. Have a look at some of the linked issues from HIVE-6584. 
Notable TODO items were HBASE-11179, HBASE-11163 and HIVE-7534.

> NPE in HiveHBaseTableShapshotInputFormat in query with just count(*)
> 
>
> Key: HIVE-11849
> URL: https://issues.apache.org/jira/browse/HIVE-11849
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 1.3.0
>Reporter: Jason Dere
>
> Adding the following example as a qfile test in hbase-handler fails. Looks 
> like this may have been introduced by HIVE-5277.
> {noformat}
> SET hive.hbase.snapshot.name=src_hbase_snapshot;
> SET hive.hbase.snapshot.restoredir=/tmp;
> select count(*) from src_hbase;
> {noformat}




