[jira] [Commented] (HIVE-11603) IndexOutOfBoundsException thrown when accessing a union all subquery and filtering on a column which does not exist in all underlying tables
[ https://issues.apache.org/jira/browse/HIVE-11603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702941#comment-14702941 ] Nicholas Brenwald commented on HIVE-11603: -- Query plan from branch-1: {code}
:~/$ hive
Logging initialized using configuration in jar:file:~/branch-1/hive/packaging/target/apache-hive-1.3.0-SNAPSHOT-bin/apache-hive-1.3.0-SNAPSHOT-bin/lib/hive-common-1.3.0-SNAPSHOT.jar!/hive-log4j.properties
hive> EXPLAIN SELECT COUNT(*) from v1 WHERE c2 = 0;
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t1
            Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
            Filter Operator
              predicate: false (type: boolean)
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
              Union
                Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: PARTIAL
                Select Operator
                  Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: PARTIAL
                  Group By Operator
                    aggregations: count()
                    mode: hash
                    outputColumnNames: _col0
                    Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
                    Reduce Output Operator
                      sort order:
                      Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
                      value expressions: _col0 (type: bigint)
          TableScan
            alias: t2
            Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
            Filter Operator
              predicate: (c2 = 0) (type: boolean)
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
              Select Operator
                Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                Union
                  Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: PARTIAL
                  Select Operator
                    Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: PARTIAL
                    Group By Operator
                      aggregations: count()
                      mode: hash
                      outputColumnNames: _col0
                      Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
                      Reduce Output Operator
                        sort order:
                        Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
                        value expressions: _col0 (type: bigint)
      Reduce Operator Tree:
        Group By Operator
          aggregations: count(VALUE._col0)
          mode: mergepartial
          outputColumnNames: _col0
          Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
          File Output Operator
            compressed: false
            Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
            table:
                input format: org.apache.hadoop.mapred.TextInputFormat
                output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
{code} Query plan from branch master {code}
:~/$ hive
Logging initialized using configuration in jar:file:~/master/hive/packaging/target/apache-hive-2.0.0-SNAPSHOT-bin/apache-hive-2.0.0-SNAPSHOT-bin/lib/hive-common-2.0.0-SNAPSHOT.jar!/hive-log4j2.xml
hive> EXPLAIN SELECT COUNT(*) from v1 WHERE c2 = 0;
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t1
            Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
            Filter Operator
              predicate: false (type: boolean)
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
              Select Operator
                expressions: c1 (type: string), null (type: int)
                outputColumnNames: _col0, _col1
                Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                Union
                  Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                  Select Operator
                    Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: NONE
[jira] [Commented] (HIVE-11579) Invoke the set command will close standard error output[beeline-cli]
[ https://issues.apache.org/jira/browse/HIVE-11579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703059#comment-14703059 ] Xuefu Zhang commented on HIVE-11579: Thanks. Then how does the client get the error output in non-embedded mode? Invoke the set command will close standard error output[beeline-cli] Key: HIVE-11579 URL: https://issues.apache.org/jira/browse/HIVE-11579 Project: Hive Issue Type: Sub-task Components: CLI Reporter: Ferdinand Xu Assignee: Ferdinand Xu Attachments: HIVE-11579-beeline-cli.patch, HIVE-11579.2-beeline-cli.patch We can easily reproduce the bug by the following steps: {code}
hive> set system:xx=yy;
hive> lss;
hive>
{code} The error output disappears because the error output stream is closed when the Hive statement is closed. This bug also occurs upstream when using the embedded mode, as the new CLI does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11579) Invoke the set command will close standard error output[beeline-cli]
[ https://issues.apache.org/jira/browse/HIVE-11579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703051#comment-14703051 ] Ferdinand Xu commented on HIVE-11579: - Thanks [~xuefuz] for the review. The tmp files are used for redirection on the server side; they are not used for the console (client side). That's the reason the system error output stream is not closed in non-embedded mode. Invoke the set command will close standard error output[beeline-cli] Key: HIVE-11579 URL: https://issues.apache.org/jira/browse/HIVE-11579 Project: Hive Issue Type: Sub-task Components: CLI Reporter: Ferdinand Xu Assignee: Ferdinand Xu Attachments: HIVE-11579-beeline-cli.patch, HIVE-11579.2-beeline-cli.patch We can easily reproduce the bug by the following steps: {code}
hive> set system:xx=yy;
hive> lss;
hive>
{code} The error output disappears because the error output stream is closed when the Hive statement is closed. This bug also occurs upstream when using the embedded mode, as the new CLI does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11375) Broken processing of queries containing NOT (x IS NOT NULL and x <> 0)
[ https://issues.apache.org/jira/browse/HIVE-11375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703080#comment-14703080 ] Aihua Xu commented on HIVE-11375: - Attached the new patch to fix the unit test failures. Broken processing of queries containing NOT (x IS NOT NULL and x <> 0) -- Key: HIVE-11375 URL: https://issues.apache.org/jira/browse/HIVE-11375 Project: Hive Issue Type: Bug Components: Logical Optimizer Affects Versions: 2.0.0 Reporter: Mariusz Sakowski Assignee: Aihua Xu Fix For: 2.0.0 Attachments: HIVE-11375.2.patch, HIVE-11375.3.patch, HIVE-11375.4.patch, HIVE-11375.patch When running a query like this: {code}explain select * from test where (val is not null and val <> 0);{code} hive will simplify the expression in parentheses and omit the is not null check: {code}
Filter Operator
  predicate: (val <> 0) (type: boolean)
{code} which is fine. But if we negate the condition using the NOT operator: {code}explain select * from test where not (val is not null and val <> 0);{code} hive will also simplify things, but now it will break stuff: {code}
Filter Operator
  predicate: (not (val <> 0)) (type: boolean)
{code} because the valid predicate should be *val == 0 or val is null*, while the row above is equivalent to *val == 0* only, filtering away rows where val is null. Simple example: {code}
CREATE TABLE example ( val bigint );
INSERT INTO example VALUES (1), (NULL), (0);
-- returns 2 rows - NULL and 0
select * from example where (val is null or val == 0);
-- returns 1 row - 0
select * from example where not (val is not null and val <> 0);
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
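The NULL-handling mistake reported above can be sketched outside Hive. The following is a hypothetical Python model (none of this is Hive code): `None` stands in for SQL NULL, and the helpers emulate SQL three-valued logic, showing why dropping the `IS NOT NULL` guard changes the result.

```python
# SQL three-valued logic modeled with None as NULL (hypothetical sketch).

def sql_not(a):
    # NOT NULL is NULL in SQL
    return None if a is None else (not a)

def sql_and(a, b):
    # FALSE AND anything = FALSE; otherwise NULL propagates
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def neq_zero(val):
    # val <> 0 is NULL when val is NULL
    return None if val is None else (val != 0)

def is_not_null(val):
    return val is not None

rows = [1, None, 0]

# Correct predicate: NOT (val IS NOT NULL AND val <> 0)
correct = [v for v in rows
           if sql_not(sql_and(is_not_null(v), neq_zero(v))) is True]

# Buggy simplification: NOT (val <> 0) -- the NULL row evaluates to
# NULL rather than TRUE, so it is filtered away.
buggy = [v for v in rows if sql_not(neq_zero(v)) is True]

print(correct)  # [None, 0] -> the 2 rows from the JIRA example
print(buggy)    # [0]       -> the 1-row result of the reported bug
```

A filter only keeps rows where the predicate is strictly TRUE, which is why the NULL row survives the correct predicate but not the simplified one.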
[jira] [Updated] (HIVE-11375) Broken processing of queries containing NOT (x IS NOT NULL and x <> 0)
[ https://issues.apache.org/jira/browse/HIVE-11375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-11375: Attachment: HIVE-11375.4.patch Broken processing of queries containing NOT (x IS NOT NULL and x <> 0) -- Key: HIVE-11375 URL: https://issues.apache.org/jira/browse/HIVE-11375 Project: Hive Issue Type: Bug Components: Logical Optimizer Affects Versions: 2.0.0 Reporter: Mariusz Sakowski Assignee: Aihua Xu Fix For: 2.0.0 Attachments: HIVE-11375.2.patch, HIVE-11375.3.patch, HIVE-11375.4.patch, HIVE-11375.patch When running a query like this: {code}explain select * from test where (val is not null and val <> 0);{code} hive will simplify the expression in parentheses and omit the is not null check: {code}
Filter Operator
  predicate: (val <> 0) (type: boolean)
{code} which is fine. But if we negate the condition using the NOT operator: {code}explain select * from test where not (val is not null and val <> 0);{code} hive will also simplify things, but now it will break stuff: {code}
Filter Operator
  predicate: (not (val <> 0)) (type: boolean)
{code} because the valid predicate should be *val == 0 or val is null*, while the row above is equivalent to *val == 0* only, filtering away rows where val is null. Simple example: {code}
CREATE TABLE example ( val bigint );
INSERT INTO example VALUES (1), (NULL), (0);
-- returns 2 rows - NULL and 0
select * from example where (val is null or val == 0);
-- returns 1 row - 0
select * from example where not (val is not null and val <> 0);
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11597) [CBO new return path] Handling of strings of zero-length
[ https://issues.apache.org/jira/browse/HIVE-11597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703045#comment-14703045 ] Hive QA commented on HIVE-11597: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751147/HIVE-11597.patch {color:green}SUCCESS:{color} +1 9370 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5009/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5009/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5009/ Messages: {noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat} This message is automatically generated. ATTACHMENT ID: 12751147 - PreCommit-HIVE-TRUNK-Build [CBO new return path] Handling of strings of zero-length Key: HIVE-11597 URL: https://issues.apache.org/jira/browse/HIVE-11597 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-11597.patch Exposed by load_dyn_part14.q -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11602) Support Struct with different field types in query
[ https://issues.apache.org/jira/browse/HIVE-11602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703230#comment-14703230 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-11602: -- Changes look fine to me. +1 pending unit test run. Support Struct with different field types in query -- Key: HIVE-11602 URL: https://issues.apache.org/jira/browse/HIVE-11602 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11602.patch Table: {code}
create table journal(`journal id1` string) partitioned by (`journal id2` string);
{code} Query: {code}
explain select * from journal where struct(`journal id1`, `journal id2`) IN (struct('2013-1000-0133878664', '3'), struct('2013-1000-0133878695', 1));
{code} Exception: {code} 15/08/18 14:52:55 [main]: ERROR ql.Driver: FAILED: SemanticException [Error 10014]: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type!
Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)}
org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type! Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)}
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1196)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:195)
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:148)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10595)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10551)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10519)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2681)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2662)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8841)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9713)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9607)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10093)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
    at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:417)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1069)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1131)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11502) Map side aggregation is extremely slow
[ https://issues.apache.org/jira/browse/HIVE-11502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703102#comment-14703102 ] Xuefu Zhang commented on HIVE-11502: For my understanding, it would be interesting to know why changing ListKeyWrapper's hashcode solves the problem. I originally thought the problem was with the hashcode for DoubleWritable. Any explanation would be appreciated. Map side aggregation is extremely slow -- Key: HIVE-11502 URL: https://issues.apache.org/jira/browse/HIVE-11502 Project: Hive Issue Type: Bug Components: Logical Optimizer, Physical Optimizer Affects Versions: 1.2.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-11502.1.patch, HIVE-11502.2.patch, HIVE-11502.3.patch For a query like the following: {noformat}
create table tbl2 as select col1, max(col2) as col2 from tbl1 group by col1;
{noformat} If the column for group by has many different values (for example 40) and it is of type double, the map-side aggregation is very slow. I ran the query for more than 3 hours, and after 3 hours I had to kill it. The same query can finish in 7 seconds if I turn off map-side aggregation with: {noformat}
set hive.map.aggr = false;
{noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
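One plausible mechanism behind the suspicion about DoubleWritable's hashcode can be sketched in Python. This is a hedged reconstruction, not Hive or Hadoop code: it assumes a hashcode that truncates the IEEE-754 bit pattern of the double to its low 32 bits (as a Java `(int)` cast of `Double.doubleToLongBits` would), which collides totally for whole-number doubles and would make a map-side hash aggregation degenerate.

```python
import struct

def double_to_long_bits(x):
    """IEEE-754 bit pattern of a float, as an unsigned 64-bit integer."""
    return struct.unpack('<Q', struct.pack('<d', x))[0]

def naive_hash(x):
    """Assumed behavior: keep only the low 32 bits of the bit pattern,
    like a Java (int) cast of Double.doubleToLongBits (hypothetical)."""
    return double_to_long_bits(x) & 0xFFFFFFFF

# Whole-number doubles below 2**20 have at least 32 trailing zero bits
# in their mantissa, so the low 32 bits are 0 for every one of them:
# all keys land in the same hash bucket.
hashes = {naive_hash(float(i)) for i in range(1, 100000)}
print(len(hashes))  # 1 -- total collision

def folded_hash(x):
    """XOR-folding the high word back in restores the spread."""
    bits = double_to_long_bits(x)
    return (bits ^ (bits >> 32)) & 0xFFFFFFFF

print(len({folded_hash(float(i)) for i in range(1, 100000)}))  # 99999
```

With every key hashing identically, the in-memory aggregation hash map degrades to a linked-list scan per row, which matches the hours-versus-seconds behavior reported in the issue.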
[jira] [Commented] (HIVE-11579) Invoke the set command will close standard error output[beeline-cli]
[ https://issues.apache.org/jira/browse/HIVE-11579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703099#comment-14703099 ] Ferdinand Xu commented on HIVE-11579: - When initializing the session error output stream (server side) and the beeline error output stream (client side), both are redirected to the standard error output. When invoking beeline.error, it outputs a message to the beeline error output stream; meanwhile, the server side logs this error as well. It's a little confusing since the error message is actually output twice. For example, if you try to use `lss`, which is an invalid command, the server side throws an exception which is written to the session's error output stream. When the client gets the exception, the method beeline.error outputs this error again. Invoke the set command will close standard error output[beeline-cli] Key: HIVE-11579 URL: https://issues.apache.org/jira/browse/HIVE-11579 Project: Hive Issue Type: Sub-task Components: CLI Reporter: Ferdinand Xu Assignee: Ferdinand Xu Attachments: HIVE-11579-beeline-cli.patch, HIVE-11579.2-beeline-cli.patch We can easily reproduce the bug by the following steps: {code}
hive> set system:xx=yy;
hive> lss;
hive>
{code} The error output disappears because the error output stream is closed when the Hive statement is closed. This bug also occurs upstream when using the embedded mode, as the new CLI does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
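The failure mode in this thread can be sketched abstractly. The following is a hypothetical Python model (none of these names correspond to the beeline code): in embedded mode the client and the "server" share one redirected error stream, so a close performed during statement cleanup silently swallows every later error message.

```python
import io

# Shared error stream that both "server" and "client" were redirected to
# (stand-in for the redirected standard error output; hypothetical model).
shared_err = io.StringIO()

def server_log_error(msg):
    shared_err.write("server: " + msg + "\n")

def close_statement():
    # The bug: statement cleanup closes a stream the client still needs.
    shared_err.close()

server_log_error("lss is not a valid command")
close_statement()

# The client now tries to report the same error and fails silently
# unless the write error is handled.
try:
    shared_err.write("client: lss is not a valid command\n")
    client_write_ok = True
except ValueError:  # "I/O operation on closed file"
    client_write_ok = False

print(client_write_ok)  # False -- subsequent error output disappears
```

The fix direction discussed above follows from the model: only close streams the closing component owns, or keep the console stream separate from the per-statement redirection.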
[jira] [Commented] (HIVE-11573) PointLookupOptimizer can be pessimistic at a low nDV
[ https://issues.apache.org/jira/browse/HIVE-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703198#comment-14703198 ] Hive QA commented on HIVE-11573: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751242/HIVE-11573.2.patch {color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 9370 tests executed *Failed tests:* {noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_flatten_and_or
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_transform
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_case
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_case
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_pcr
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ppd_transform
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_case
{noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5010/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5010/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5010/ Messages: {noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat} This message is automatically generated.
ATTACHMENT ID: 12751242 - PreCommit-HIVE-TRUNK-Build PointLookupOptimizer can be pessimistic at a low nDV Key: HIVE-11573 URL: https://issues.apache.org/jira/browse/HIVE-11573 Project: Hive Issue Type: Bug Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Gopal V Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11573.1.patch, HIVE-11573.2.patch The PointLookupOptimizer can turn off some of the optimizations due to its use of tuple IN() clauses. Limit the application of the optimizer for very low nDV cases and extract the sub-clause as a pre-condition during runtime, to trigger the simple column predicate index lookups. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11602) Support Struct with different field types in query
[ https://issues.apache.org/jira/browse/HIVE-11602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703397#comment-14703397 ] Hive QA commented on HIVE-11602: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751250/HIVE-11602.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9370 tests executed *Failed tests:* {noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_structin
{noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5011/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5011/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5011/ Messages: {noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat} This message is automatically generated. ATTACHMENT ID: 12751250 - PreCommit-HIVE-TRUNK-Build Support Struct with different field types in query -- Key: HIVE-11602 URL: https://issues.apache.org/jira/browse/HIVE-11602 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11602.patch Table: {code}
create table journal(`journal id1` string) partitioned by (`journal id2` string);
{code} Query: {code}
explain select * from journal where struct(`journal id1`, `journal id2`) IN (struct('2013-1000-0133878664', '3'), struct('2013-1000-0133878695', 1));
{code} Exception: {code} 15/08/18 14:52:55 [main]: ERROR ql.Driver: FAILED: SemanticException [Error 10014]: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type!
Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)}
org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type! Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)}
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1196)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:195)
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:148)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10595)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10551)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10519)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2681)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2662)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8841)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9713)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9607)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10093)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
    at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:417)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1069)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1131)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) at
[jira] [Resolved] (HIVE-10622) Hive doc error: 'from' is a keyword, when use it as a column name throw error.
[ https://issues.apache.org/jira/browse/HIVE-10622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anne Yu resolved HIVE-10622. Resolution: Fixed Hive doc error: 'from' is a keyword, when use it as a column name throw error. -- Key: HIVE-10622 URL: https://issues.apache.org/jira/browse/HIVE-10622 Project: Hive Issue Type: Bug Components: Documentation Affects Versions: 1.1.1 Reporter: Anne Yu https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML, Using from as a column name in create table throws an error. {code}
CREATE TABLE pageviews (userid VARCHAR(64), link STRING, from STRING)
PARTITIONED BY (datestamp STRING)
CLUSTERED BY (userid) INTO 256 BUCKETS
STORED AS ORC;

Error: Error while compiling statement: FAILED: ParseException line 1:57 cannot recognize input near 'from' 'STRING' ')' in column specification (state=42000,code=4)
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11596) nvl(x, y) throws NPE if type x and type y doesn't match, rather than throwing the meaningful error
[ https://issues.apache.org/jira/browse/HIVE-11596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703317#comment-14703317 ] Aihua Xu commented on HIVE-11596: - OK. Actually we are passing 0-based and 1-based argument ids inconsistently in different places. I will fix that. nvl(x, y) throws NPE if type x and type y doesn't match, rather than throwing the meaningful error -- Key: HIVE-11596 URL: https://issues.apache.org/jira/browse/HIVE-11596 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 2.0.0 Reporter: Aihua Xu Assignee: Aihua Xu Priority: Minor Fix For: 2.0.0 Attachments: HIVE-11596.patch {noformat}
create table test(key string);
select nvl(key, true) from test;
{noformat} The query above will throw an NPE rather than the meaningful error "The first and second arguments of function NVL should have the same type." -- This message was sent by Atlassian JIRA (v6.3.4#6332)
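The desired behavior in this issue, a type check that fails fast with a clear message instead of crashing later, can be sketched as follows. This is a hedged Python sketch, not Hive's UDF implementation; the `nvl` function and its error text are illustrative stand-ins.

```python
# Hypothetical nvl: validate argument types up front so a mismatch
# produces a meaningful error rather than a late null-related crash.

def nvl(x, y):
    # NULL (None) arguments carry no type information, so only compare
    # types when both values are present.
    if x is not None and y is not None and type(x) is not type(y):
        raise TypeError(
            "The first and second arguments of function NVL should have "
            "the same type: got %s and %s"
            % (type(x).__name__, type(y).__name__))
    return x if x is not None else y

print(nvl(None, "default"))  # default
print(nvl("abc", "def"))     # abc
try:
    nvl("abc", True)         # string vs boolean -> meaningful error
except TypeError as e:
    print("TypeError:", e)
```

The key design point mirrored here is that the check happens during argument validation, before any value is dereferenced, which is where the reported NPE would otherwise arise.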
[jira] [Commented] (HIVE-11584) Update committer list
[ https://issues.apache.org/jira/browse/HIVE-11584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703320#comment-14703320 ] Pengcheng Xiong commented on HIVE-11584: [~Ferd], thank you very much. It is very kind of you to remember me. :) Update committer list - Key: HIVE-11584 URL: https://issues.apache.org/jira/browse/HIVE-11584 Project: Hive Issue Type: Bug Reporter: Dmitry Tolpeko Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-11584.1.patch, HIVE-11584.patch Please update committer list: Name: Dmitry Tolpeko Apache ID: dmtolpeko Organization: EPAM (www.epam.com) Thank you, Dmitry -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7172) Potential resource leak in HiveSchemaTool#getMetaStoreSchemaVersion()
[ https://issues.apache.org/jira/browse/HIVE-7172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HIVE-7172: - Description: {code}
    ResultSet res = stmt.executeQuery(versionQuery);
    if (!res.next()) {
      throw new HiveMetaException("Didn't find version data in metastore");
    }
    String currentSchemaVersion = res.getString(1);
    metastoreConn.close();
{code} When HiveMetaException is thrown, metastoreConn.close() would be skipped. stmt is not closed upon return from the method. was: {code}
    ResultSet res = stmt.executeQuery(versionQuery);
    if (!res.next()) {
      throw new HiveMetaException("Didn't find version data in metastore");
    }
    String currentSchemaVersion = res.getString(1);
    metastoreConn.close();
{code} When HiveMetaException is thrown, metastoreConn.close() would be skipped. stmt is not closed upon return from the method. Potential resource leak in HiveSchemaTool#getMetaStoreSchemaVersion() - Key: HIVE-7172 URL: https://issues.apache.org/jira/browse/HIVE-7172 Project: Hive Issue Type: Bug Reporter: Ted Yu Priority: Minor Attachments: HIVE-7172.patch {code}
    ResultSet res = stmt.executeQuery(versionQuery);
    if (!res.next()) {
      throw new HiveMetaException("Didn't find version data in metastore");
    }
    String currentSchemaVersion = res.getString(1);
    metastoreConn.close();
{code} When HiveMetaException is thrown, metastoreConn.close() would be skipped. stmt is not closed upon return from the method. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
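The leak pattern above, close() on the happy path only, has a standard fix: tie cleanup to scope exit so it runs on every path, including the exception. Here is a hedged Python sketch of that pattern (Java's try-with-resources plays the same role); `FakeConn` and `FakeStmt` are hypothetical stand-ins for the JDBC connection and statement, not Hive classes.

```python
from contextlib import closing

class FakeStmt:
    """Minimal stand-in for a closeable JDBC resource (hypothetical)."""
    def __init__(self):
        self.closed = False
    def close(self):
        self.closed = True

class FakeConn(FakeStmt):
    pass

def get_schema_version(conn, stmt, rows):
    # closing() guarantees close() on both resources on every exit path,
    # mirroring Java's try-with-resources; the original code skipped
    # close() whenever the version check threw.
    with closing(conn), closing(stmt):
        if not rows:
            raise RuntimeError("Didn't find version data in metastore")
        return rows[0]

conn, stmt = FakeConn(), FakeStmt()
try:
    get_schema_version(conn, stmt, [])   # error path: no version row
except RuntimeError:
    pass
print(conn.closed, stmt.closed)  # True True -- no leak on the error path
```

In the Java original the equivalent fix is to open `stmt` and `metastoreConn` in a try-with-resources header so both are closed whether `HiveMetaException` is thrown or not.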
[jira] [Updated] (HIVE-11596) nvl(x, y) throws NPE if type x and type y doesn't match, rather than throwing the meaningful error
[ https://issues.apache.org/jira/browse/HIVE-11596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-11596: Attachment: (was: HIVE-11596.patch) nvl(x, y) throws NPE if type x and type y doesn't match, rather than throwing the meaningful error -- Key: HIVE-11596 URL: https://issues.apache.org/jira/browse/HIVE-11596 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 2.0.0 Reporter: Aihua Xu Assignee: Aihua Xu Priority: Minor Fix For: 2.0.0 Attachments: HIVE-11596.patch {noformat}
create table test(key string);
select nvl(key, true) from test;
{noformat} The query above will throw an NPE rather than the meaningful error "The first and second arguments of function NVL should have the same type." -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11606) Bucket map joins fail at hash table construction time
[ https://issues.apache.org/jira/browse/HIVE-11606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-11606: -- Attachment: HIVE-11606.1.patch Bucket map joins fail at hash table construction time - Key: HIVE-11606 URL: https://issues.apache.org/jira/browse/HIVE-11606 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.1, 1.2.1 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-11606.1.patch {code} info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a power of two at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a power of two at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91) at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:294) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
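The AssertionError above suggests the hash table sizes its bucket array under a power-of-two invariant, so that bucket indices can be computed with a bitmask instead of modulo, and the bucket-map-join path handed it a size violating that. A hedged sketch of the invariant only (not Hive's actual code; `nextPowerOfTwo` is a hypothetical helper):

```java
public class CapacityCheck {
    // A capacity is valid for mask-based indexing iff it is a power of two:
    // then (hash & (capacity - 1)) is always a legal bucket index.
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    // Defensive fix: round any requested size up to the next power of two,
    // the same sizing strategy java.util.HashMap uses for its table.
    static int nextPowerOfTwo(int n) {
        if (n <= 1) {
            return 1;
        }
        return Integer.highestOneBit(n - 1) << 1;
    }

    public static void main(String[] args) {
        System.out.println(nextPowerOfTwo(1000)); // prints 1024
        System.out.println(isPowerOfTwo(1000));   // prints false
    }
}
```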
[jira] [Commented] (HIVE-11572) Datanucleus loads Log4j1.x Logger from AppClassLoader
[ https://issues.apache.org/jira/browse/HIVE-11572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703523#comment-14703523 ] Prasanth Jayachandran commented on HIVE-11572: -- [~gopalv] But hive libs are appended to CLASSPATH, not to HIVE_CLASSPATH. {code} for f in ${HIVE_LIB}/*.jar; do CLASSPATH=${CLASSPATH}:$f; done {code} That's the reason why I prepended the CLASSPATH before HADOOP_CLASSPATH. Datanucleus loads Log4j1.x Logger from AppClassLoader - Key: HIVE-11572 URL: https://issues.apache.org/jira/browse/HIVE-11572 Project: Hive Issue Type: Sub-task Components: Logging Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Fix For: 2.0.0 Attachments: HIVE-11572.patch As part of HIVE-11304, we moved from Log4j1.x to Log4j2. But DataNucleus log messages get logged to the console when launching the Hive CLI. The reason is that DataNucleus tries to load the Log4j1.x Logger by traversing its class loader. Although we use the log4j-1.2-api bridge, we are loading the log4j-1.2.16 jar that was pulled in by ZooKeeper. We should make sure that there is no log4j-1.2.16 in the DataNucleus classloader hierarchy (classpath). The DataNucleus logger has this: {code} NucleusLogger.class.getClassLoader().loadClass("org.apache.log4j.Logger"); loggerClass = org.datanucleus.util.Log4JLogger.class; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
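For context, the DataNucleus snippet quoted above is a classloader probe: the library selects Log4j 1.x as its logging backend only if `org.apache.log4j.Logger` resolves in its classloader, which is exactly why classpath ordering (CLASSPATH vs HADOOP_CLASSPATH) decides the outcome. A rough, hypothetical sketch of that pattern (`chooseBackend` is illustrative, not DataNucleus's actual API):

```java
public class LoggerProbe {
    // Probe the classloader the way the quoted DataNucleus code does:
    // pick the Log4j backend only when org.apache.log4j.Logger is visible.
    // Which branch is taken depends purely on classpath contents/ordering.
    static String chooseBackend(ClassLoader cl) {
        try {
            cl.loadClass("org.apache.log4j.Logger");
            return "log4j";
        } catch (ClassNotFoundException e) {
            return "console";
        }
    }

    public static void main(String[] args) {
        // Prints "log4j" or "console" depending on what is on the classpath.
        System.out.println(chooseBackend(LoggerProbe.class.getClassLoader()));
    }
}
```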
[jira] [Updated] (HIVE-11600) Hive Parser to Support multi col in clause (x,y..) in ((..),..., ())
[ https://issues.apache.org/jira/browse/HIVE-11600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11600: --- Attachment: HIVE-11600.01.patch [~jpullokkaran], could you please take a look? Thanks. Hive Parser to Support multi col in clause (x,y..) in ((..),..., ()) Key: HIVE-11600 URL: https://issues.apache.org/jira/browse/HIVE-11600 Project: Hive Issue Type: New Feature Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11600.01.patch Currently, Hive only supports a single column in the IN clause, e.g., {code}select * from src where col0 in (v1,v2,v3);{code} We want it to support {code}select * from src where (col0,col1+3) in ((col0+v1,v2),(v3,v4-col1));{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
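The requested semantics are row-constructor membership: `(x, y) IN ((a, b), (c, d))` holds when the left tuple equals some right-hand tuple element-wise, i.e. it desugars to `(x = a AND y = b) OR (x = c AND y = d)`. A small sketch of those desugared semantics (illustrative only, not the parser change itself):

```java
import java.util.Arrays;
import java.util.List;

public class MultiColumnIn {
    // (x, y) IN ((a, b), (c, d)) is true iff the left tuple matches any
    // right-hand tuple element by element -- an OR of per-tuple ANDs.
    static boolean tupleIn(List<Object> left, List<List<Object>> right) {
        for (List<Object> candidate : right) {
            if (candidate.equals(left)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Object> row = Arrays.asList(1, "a");
        List<List<Object>> values =
                Arrays.asList(Arrays.asList(1, "a"), Arrays.asList(2, "b"));
        System.out.println(tupleIn(row, values)); // prints true
    }
}
```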
[jira] [Updated] (HIVE-11605) Incorrect results with bucket map join in tez.
[ https://issues.apache.org/jira/browse/HIVE-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-11605: -- Affects Version/s: 1.0.1 Incorrect results with bucket map join in tez. -- Key: HIVE-11605 URL: https://issues.apache.org/jira/browse/HIVE-11605 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.0, 1.2.0, 1.0.1 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Critical Attachments: HIVE-11605.1.patch In some cases, we aggressively try to convert to a bucket map join and this ends up producing incorrect results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11605) Incorrect results with bucket map join in tez.
[ https://issues.apache.org/jira/browse/HIVE-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-11605: -- Attachment: HIVE-11605.1.patch Incorrect results with bucket map join in tez. -- Key: HIVE-11605 URL: https://issues.apache.org/jira/browse/HIVE-11605 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.0, 1.2.0, 1.0.1 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Critical Attachments: HIVE-11605.1.patch In some cases, we aggressively try to convert to a bucket map join and this ends up producing incorrect results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11607) Export tables broken for data > 32 MB
[ https://issues.apache.org/jira/browse/HIVE-11607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703992#comment-14703992 ] Sushanth Sowmyan commented on HIVE-11607: - Also, Hadoop20Shims.runDistCp seems to refer to org.apache.hadoop.tools.distcp2 as a classname - since org.apache.hadoop.tools.distcp2.DistCp would be the appropriate class, I'm not sure it works for 1.0 either unless I'm reading this incorrectly. Export tables broken for data > 32 MB - Key: HIVE-11607 URL: https://issues.apache.org/jira/browse/HIVE-11607 Project: Hive Issue Type: Bug Components: Import/Export Affects Versions: 1.0.0, 1.2.0, 1.1.0 Reporter: Ashutosh Chauhan Broken for both the hadoop-1 and hadoop-2 lines -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11552) implement basic methods for getting/putting file metadata
[ https://issues.apache.org/jira/browse/HIVE-11552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703985#comment-14703985 ] Sergey Shelukhin commented on HIVE-11552: - [~alangates] ping? implement basic methods for getting/putting file metadata - Key: HIVE-11552 URL: https://issues.apache.org/jira/browse/HIVE-11552 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: hbase-metastore-branch Attachments: HIVE-11552.01.patch, HIVE-11552.nogen.patch, HIVE-11552.nogen.patch, HIVE-11552.patch NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11605) Incorrect results with bucket map join in tez.
[ https://issues.apache.org/jira/browse/HIVE-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703998#comment-14703998 ] Hive QA commented on HIVE-11605: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751340/HIVE-11605.1.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9370 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket_map_join_tez1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_tez1 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5015/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5015/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5015/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12751340 - PreCommit-HIVE-TRUNK-Build Incorrect results with bucket map join in tez. -- Key: HIVE-11605 URL: https://issues.apache.org/jira/browse/HIVE-11605 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.0, 1.2.0, 1.0.1 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Critical Attachments: HIVE-11605.1.patch In some cases, we aggressively try to convert to a bucket map join and this ends up producing incorrect results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9938) Add retry logic to DbTxnMgr instead of aborting transactions.
[ https://issues.apache.org/jira/browse/HIVE-9938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703879#comment-14703879 ] Eugene Koifman commented on HIVE-9938: -- The infrastructure for this is in place. TxnHandler.isRetryable() needs to have a clause added to check for this message/condition. Add retry logic to DbTxnMgr instead of aborting transactions. - Key: HIVE-9938 URL: https://issues.apache.org/jira/browse/HIVE-9938 Project: Hive Issue Type: Improvement Affects Versions: 0.14.0 Reporter: bharath v Sometimes parallel updates using DBTxnMgr results in the following error trace {noformat} 5325 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - PERFLOG method=acquireReadWriteLocks from=org.apache.hadoop.hive.ql.Driver 5351 [main] ERROR org.apache.hadoop.hive.ql.Driver - FAILED: Error in acquiring locks: Error communicating with the metastore org.apache.hadoop.hive.ql.lockmgr.LockException: Error communicating with the metastore at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.lock(DbLockManager.java:100) at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:194) {noformat} Internally looking at the postgres logs we see {noformat} 2015-02-02 06:36:05,632 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: org.apache.thrift.TException: MetaException(message:Unable to update transaction database org.postgresql.util.PSQLException: ERROR: could not serialize access due to concurrent update {noformat} Ideally we should add a retry logic to retry the failed transaction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
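The pattern being described: PostgreSQL's "could not serialize access due to concurrent update" is a transient serialization conflict, so the transaction should be re-run rather than aborted. A hedged sketch of that retry pattern (hypothetical names; the real `TxnHandler.isRetryable()` signature and logic may differ):

```java
public class RetrySketch {
    // Transient serialization conflicts (the Postgres message quoted in
    // the logs above) are safe to retry; anything else still propagates.
    static boolean isRetryable(Exception e) {
        String msg = e.getMessage();
        return msg != null && msg.contains("could not serialize access");
    }

    interface Txn {
        void run() throws Exception;
    }

    // Re-run the whole transaction a bounded number of times before
    // giving up, instead of aborting on the first conflict.
    static void runWithRetry(Txn txn, int maxAttempts) throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                txn.run();
                return;
            } catch (Exception e) {
                if (attempt >= maxAttempts || !isRetryable(e)) {
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Fails twice with a retryable conflict, then succeeds.
        runWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException(
                        "ERROR: could not serialize access due to concurrent update");
            }
        }, 5);
        System.out.println("succeeded on attempt " + calls[0]); // attempt 3
    }
}
```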
[jira] [Commented] (HIVE-11607) Export tables broken for data > 32 MB
[ https://issues.apache.org/jira/browse/HIVE-11607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703938#comment-14703938 ] Ashutosh Chauhan commented on HIVE-11607: - cc: [~spena] Export tables broken for data > 32 MB - Key: HIVE-11607 URL: https://issues.apache.org/jira/browse/HIVE-11607 Project: Hive Issue Type: Bug Components: Import/Export Affects Versions: 1.0.0, 1.2.0, 1.1.0 Reporter: Ashutosh Chauhan Broken for both the hadoop-1 and hadoop-2 lines -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-6099) Multi insert does not work properly with distinct count
[ https://issues.apache.org/jira/browse/HIVE-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere reassigned HIVE-6099: Assignee: Jason Dere (was: Ashutosh Chauhan) Multi insert does not work properly with distinct count --- Key: HIVE-6099 URL: https://issues.apache.org/jira/browse/HIVE-6099 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0, 1.0.0 Reporter: Pavan Gadam Manohar Assignee: Jason Dere Labels: TODOC1.2, count, distinct, insert, multi-insert Fix For: 1.2.0 Attachments: HIVE-6099.1.patch, HIVE-6099.2.patch, HIVE-6099.3.patch, HIVE-6099.4.patch, HIVE-6099.patch, explain_hive_0.10.0.txt, with_disabled.txt, with_enabled.txt Need 2 rows to reproduce this bug. Here are the steps. Step 1) Create a table Table_A CREATE EXTERNAL TABLE Table_A ( user string , type int ) PARTITIONED BY (dt string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS RCFILE LOCATION '/hive/path/Table_A'; Step 2) Scenario: Let us say user tommy belongs to both user types 111 and 123. Insert 2 records into the table created above. select * from Table_A; hive> select * from table_a; OK tommy 123 2013-12-02 tommy 111 2013-12-02 Step 3) Create 2 destination tables to simulate multi-insert.
CREATE EXTERNAL TABLE dest_Table_A ( p_date string , Distinct_Users int , Type111Users int , Type123Users int ) PARTITIONED BY (dt string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS RCFILE LOCATION '/hive/path/dest_Table_A'; CREATE EXTERNAL TABLE dest_Table_B ( p_date string , Distinct_Users int , Type111Users int , Type123Users int ) PARTITIONED BY (dt string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS RCFILE LOCATION '/hive/path/dest_Table_B'; Step 4) Multi insert statement from Table_A a INSERT OVERWRITE TABLE dest_Table_A PARTITION(dt='2013-12-02') select a.dt ,count(distinct a.user) as AllDist ,count(distinct case when a.type = 111 then a.user else null end) as Type111User ,count(distinct case when a.type != 111 then a.user else null end) as Type123User group by a.dt INSERT OVERWRITE TABLE dest_Table_B PARTITION(dt='2013-12-02') select a.dt ,count(distinct a.user) as AllDist ,count(distinct case when a.type = 111 then a.user else null end) as Type111User ,count(distinct case when a.type != 111 then a.user else null end) as Type123User group by a.dt ; Step 5) Verify results. hive> select * from dest_table_a; OK 2013-12-02 2 1 1 2013-12-02 Time taken: 0.116 seconds hive> select * from dest_table_b; OK 2013-12-02 2 1 1 2013-12-02 Time taken: 0.13 seconds Conclusion: Hive gives a count of 2 for distinct users although there is only one distinct user. After trying many datasets, we observed that Hive is computing Type111Users + Type123Users = DistinctUsers, which is wrong. hive> select count(distinct a.user) from table_a a; Gives: Total MapReduce CPU Time Spent: 4 seconds 350 msec OK 1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11600) Hive Parser to Support multi col in clause (x,y..) in ((..),..., ())
[ https://issues.apache.org/jira/browse/HIVE-11600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11600: --- Attachment: HIVE-11600.02.patch Hive Parser to Support multi col in clause (x,y..) in ((..),..., ()) Key: HIVE-11600 URL: https://issues.apache.org/jira/browse/HIVE-11600 Project: Hive Issue Type: New Feature Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11600.01.patch, HIVE-11600.02.patch Current hive only support single column in clause, e.g., {code}select * from src where col0 in (v1,v2,v3);{code} We want it to support {code}select * from src where (col0,col1+3) in ((col0+v1,v2),(v3,v4-col1));{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11595) refactor ORC footer reading to make it usable from outside
[ https://issues.apache.org/jira/browse/HIVE-11595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11595: Attachment: HIVE-11595.01.patch Some more changes on top refactor ORC footer reading to make it usable from outside -- Key: HIVE-11595 URL: https://issues.apache.org/jira/browse/HIVE-11595 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-10595.patch, HIVE-11595.01.patch If the ORC footer is read from cache, we want to parse it without having the reader, opening a file, etc. I thought it would be as simple as protobuf's parseFrom on the bytes, but apparently there's a bunch of stuff going on there. It needs to be accessible via something like parseFrom(ByteBuffer), or similar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
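Parsing a cached footer means handing the parser a bounded view of bytes already in memory, with no reader or file handle involved. A toy sketch of the slicing part using a plain ByteBuffer (a hypothetical one-byte length suffix stands in for ORC's postscript; the real layout and the protobuf parseFrom call are more involved):

```java
import java.nio.ByteBuffer;

public class FooterSlice {
    // Toy layout: [payload...][footer bytes][1-byte footer length].
    // Slicing the cached tail yields exactly the footer bytes, which a
    // parser (e.g. protobuf) could then consume without opening the file.
    static ByteBuffer footerOf(ByteBuffer tail) {
        int len = tail.get(tail.limit() - 1) & 0xFF;
        ByteBuffer view = tail.duplicate();
        view.position(tail.limit() - 1 - len);
        view.limit(tail.limit() - 1);
        return view.slice();
    }

    public static void main(String[] args) {
        byte[] cached = {9, 9, 1, 2, 3, 3}; // payload 9,9; footer 1,2,3; len 3
        ByteBuffer footer = footerOf(ByteBuffer.wrap(cached));
        System.out.println(footer.remaining()); // prints 3
    }
}
```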
[jira] [Updated] (HIVE-11553) use basic file metadata cache in ETLSplitStrategy-related paths
[ https://issues.apache.org/jira/browse/HIVE-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11553: Description: NO PRECOMMIT TESTS use basic file metadata cache in ETLSplitStrategy-related paths --- Key: HIVE-11553 URL: https://issues.apache.org/jira/browse/HIVE-11553 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: hbase-metastore-branch NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11502) Map side aggregation is extremely slow
[ https://issues.apache.org/jira/browse/HIVE-11502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704081#comment-14704081 ] Yongzhi Chen commented on HIVE-11502: - [~xuefuz], GroupBy's aggregate hashmap uses ListKeyWrapper as the key, so it uses ListKeyWrapper's hashcode. The HashMap does not directly use DoubleWritable's hashcode, so we can adjust the hash computation in between. And it is safe too: ListKeyWrapper is only used by GroupBy, so it is only used internally to Hive. Map side aggregation is extremely slow -- Key: HIVE-11502 URL: https://issues.apache.org/jira/browse/HIVE-11502 Project: Hive Issue Type: Bug Components: Logical Optimizer, Physical Optimizer Affects Versions: 1.2.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-11502.1.patch, HIVE-11502.2.patch, HIVE-11502.3.patch For a query like the following: {noformat} create table tbl2 as select col1, max(col2) as col2 from tbl1 group by col1; {noformat} If the group-by column has many different values (for example 40) and is of type double, the map-side aggregation is very slow. I ran the query, which took more than 3 hours; after 3 hours, I had to kill the query. The same query can finish in 7 seconds if I turn off map-side aggregation with: {noformat} set hive.map.aggr = false; {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
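The slowdown pattern described here is consistent with a degenerate hash for doubles: truncating `Double.doubleToLongBits` to an int keeps only the low mantissa bits, which are all zero for small whole numbers, so every such key lands in one bucket and the hashmap degrades into a linked list. The sketch below shows the failure mode and the standard fix (XOR-folding the two halves, as `java.lang.Double.hashCode` does); it illustrates the class of problem, not necessarily the exact HIVE-11502 patch:

```java
public class DoubleHash {
    // Degenerate hash: for whole-number doubles like 1.0, 2.0, 3.0 the low
    // 32 bits of the IEEE-754 encoding are all zero, so this returns 0 for
    // every one of them -- a single bucket for the whole key space.
    static int truncatingHash(double d) {
        return (int) Double.doubleToLongBits(d);
    }

    // Standard fix: fold the high half back in so the exponent and high
    // mantissa bits participate (same mixing as java.lang.Double.hashCode).
    static int foldedHash(double d) {
        long bits = Double.doubleToLongBits(d);
        return (int) (bits ^ (bits >>> 32));
    }

    public static void main(String[] args) {
        for (double d = 1.0; d <= 4.0; d++) {
            System.out.println(d + " -> " + truncatingHash(d)
                    + " / " + foldedHash(d));
        }
    }
}
```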
[jira] [Updated] (HIVE-11553) use basic file metadata cache in ETLSplitStrategy-related paths
[ https://issues.apache.org/jira/browse/HIVE-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11553: Attachment: HIVE-11553.patch First patch. Not sure if it works :) I will probably combine all the other patches together next week and test it on a cluster... [~gopalv] fyi use basic file metadata cache in ETLSplitStrategy-related paths --- Key: HIVE-11553 URL: https://issues.apache.org/jira/browse/HIVE-11553 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: hbase-metastore-branch Attachments: HIVE-11553.patch NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10144) [LLAP] merge brought in file blocking github sync
[ https://issues.apache.org/jira/browse/HIVE-10144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704070#comment-14704070 ] Kai Sasaki commented on HIVE-10144: --- At master branch the history might be rewritten. So it's okay to push master branch to GitHub server. Yes, I'll be happy with that. And of course if there is something I can help you, I'll do that. Thank you. [LLAP] merge brought in file blocking github sync - Key: HIVE-10144 URL: https://issues.apache.org/jira/browse/HIVE-10144 Project: Hive Issue Type: Bug Components: Build Infrastructure Reporter: Szehon Ho Assignee: Gunther Hagleitner r1669718 brought in a file that is not in source control on llap branch: [http://svn.apache.org/repos/asf/hive/branches/llap/itests/thirdparty/|http://svn.apache.org/repos/asf/hive/branches/llap/itests/thirdparty/] It is a file downloaded during test build and should not be in source control. It is actually blocking the github sync as its too large. See INFRA-9360 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11584) Update committer list
[ https://issues.apache.org/jira/browse/HIVE-11584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-11584: Attachment: HIVE-11584.1.patch Sorry, I forgot to add [~pxiong] to the list. Please help me check the information. Thank you! Update committer list - Key: HIVE-11584 URL: https://issues.apache.org/jira/browse/HIVE-11584 Project: Hive Issue Type: Bug Reporter: Dmitry Tolpeko Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-11584.1.patch, HIVE-11584.patch Please update committer list: Name: Dmitry Tolpeko Apache ID: dmtolpeko Organization: EPAM (www.epam.com) Thank you, Dmitry -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-11584) Update committer list
[ https://issues.apache.org/jira/browse/HIVE-11584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu reassigned HIVE-11584: --- Assignee: Ferdinand Xu Update committer list - Key: HIVE-11584 URL: https://issues.apache.org/jira/browse/HIVE-11584 Project: Hive Issue Type: Bug Reporter: Dmitry Tolpeko Assignee: Ferdinand Xu Priority: Minor Please update committer list: Name: Dmitry Tolpeko Apache ID: dmtolpeko Organization: EPAM (www.epam.com) Thank you, Dmitry -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11596) nvl(x, y) throws NPE if type x and type y doesn't match, rather than throwing the meaningful error
[ https://issues.apache.org/jira/browse/HIVE-11596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702595#comment-14702595 ] Hive QA commented on HIVE-11596: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751096/HIVE-11596.patch {color:red}ERROR:{color} -1 due to 36 failed/errored test(s), 9370 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_char_pad_convert_fail0 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_char_pad_convert_fail1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_char_pad_convert_fail3 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_add_months_error_1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_add_months_error_2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_array_contains_wrong1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_array_contains_wrong2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_coalesce org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_concat_ws_wrong2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_concat_ws_wrong3 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_elt_wrong_type org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_field_wrong_type org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_format_number_wrong4 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_format_number_wrong5 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_format_number_wrong6 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_format_number_wrong7 
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_greatest_error_2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_greatest_error_3 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_if_not_bool org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_instr_wrong_type org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_last_day_error_1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_last_day_error_2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_locate_wrong_type org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_map_keys_arg_type org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_map_values_arg_type org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_next_day_error_1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_next_day_error_2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_printf_wrong2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_printf_wrong3 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_printf_wrong4 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_size_wrong_type org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_sort_array_wrong2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_sort_array_wrong3 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_trunc_error1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_trunc_error2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_when_type_wrong {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5005/testReport Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5005/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5005/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 36 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12751096 - PreCommit-HIVE-TRUNK-Build nvl(x, y) throws NPE if type x and type y doesn't match, rather than throwing the meaningful error -- Key: HIVE-11596 URL: https://issues.apache.org/jira/browse/HIVE-11596 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 2.0.0
[jira] [Commented] (HIVE-11597) [CBO new return path] Handling of strings of zero-length
[ https://issues.apache.org/jira/browse/HIVE-11597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702596#comment-14702596 ] Jesus Camacho Rodriguez commented on HIVE-11597: +1 pending QA run. There is a style problem with indentation; [~ashutoshc], please correct it in the final patch. Thanks [CBO new return path] Handling of strings of zero-length Key: HIVE-11597 URL: https://issues.apache.org/jira/browse/HIVE-11597 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-11597.patch Exposed by load_dyn_part14.q -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11602) Support Struct with different field types in query
[ https://issues.apache.org/jira/browse/HIVE-11602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11602: --- Attachment: HIVE-11602.patch Support Struct with different field types in query -- Key: HIVE-11602 URL: https://issues.apache.org/jira/browse/HIVE-11602 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11602.patch Table: {code} create table journal(`journal id1` string) partitioned by (`journal id2` string); {code} Query: {code} explain select * from journal where struct(`journal id1`, `journal id2`) IN (struct('2013-1000-0133878664', '3'), struct('2013-1000-0133878695', 1)); {code} Exception: {code} 15/08/18 14:52:55 [main]: ERROR ql.Driver: FAILED: SemanticException [Error 10014]: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type! Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)} org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type! 
Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)} at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1196) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:195) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:148) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10595) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10551) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10519) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2681) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2662) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8841) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9713) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9607) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10093) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:417) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1069) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1131) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11602) Support Struct with different field types in query
[ https://issues.apache.org/jira/browse/HIVE-11602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702790#comment-14702790 ] Jesus Camacho Rodriguez commented on HIVE-11602: [~hsubramaniyan], could you take a look? Thanks Support Struct with different field types in query -- Key: HIVE-11602 URL: https://issues.apache.org/jira/browse/HIVE-11602 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11602.patch Table: {code} create table journal(`journal id1` string) partitioned by (`journal id2` string); {code} Query: {code} explain select * from journal where struct(`journal id1`, `journal id2`) IN (struct('2013-1000-0133878664', '3'), struct('2013-1000-0133878695', 1)); {code} Exception: {code} 15/08/18 14:52:55 [main]: ERROR ql.Driver: FAILED: SemanticException [Error 10014]: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type! Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)} org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type! 
Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)} at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1196) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:195) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:148) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10595) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10551) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10519) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2681) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2662) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8841) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9713) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9607) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10093) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:417) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1069) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1131) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
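Until this is fixed, one possible workaround (a sketch, assuming the second field was meant to be compared as a string) is to make the struct fields type-consistent by hand, quoting the integer literal so every struct in the IN list has the same field types:

{code}
explain select * from journal
where struct(`journal id1`, `journal id2`)
  IN (struct('2013-1000-0133878664', '3'), struct('2013-1000-0133878695', '1'));
{code}

With all fields of matching types, the IN type check should no longer reject the predicate.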
[jira] [Commented] (HIVE-10697) ObjectInspectorConvertors#UnionConvertor does a faulty conversion
[ https://issues.apache.org/jira/browse/HIVE-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702798#comment-14702798 ] Hive QA commented on HIVE-10697: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751109/HIVE-10697.2.patch.txt {color:green}SUCCESS:{color} +1 9370 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5007/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5007/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5007/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12751109 - PreCommit-HIVE-TRUNK-Build ObjectInspectorConvertors#UnionConvertor does a faulty conversion - Key: HIVE-10697 URL: https://issues.apache.org/jira/browse/HIVE-10697 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-10697.1.patch.txt, HIVE-10697.2.patch.txt Currently the UnionConvertor in the ObjectInspectorConvertors class has an issue with the convert method where it attempts to convert the objectinspector itself instead of converting the field.[1]. This should be changed to convert the field itself. 
This could result in a ClassCastException as shown below: {code} Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazy.objectinspector.LazyUnionObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.lazy.LazyString at org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyStringObjectInspector.getPrimitiveWritableObject(LazyStringObjectInspector.java:51) at org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$TextConverter.convert(PrimitiveObjectInspectorConverter.java:391) at org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$TextConverter.convert(PrimitiveObjectInspectorConverter.java:338) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$UnionConverter.convert(ObjectInspectorConverters.java:456) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:395) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$MapConverter.convert(ObjectInspectorConverters.java:539) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:395) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:154) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:127) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:518) ... 9 more {code} [1] https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorConverters.java#L466 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11573) PointLookupOptimizer can be pessimistic at a low nDV
[ https://issues.apache.org/jira/browse/HIVE-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11573: --- Attachment: HIVE-11573.1.patch PointLookupOptimizer can be pessimistic at a low nDV Key: HIVE-11573 URL: https://issues.apache.org/jira/browse/HIVE-11573 Project: Hive Issue Type: Bug Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Gopal V Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11573.1.patch The PointLookupOptimizer can turn off some of the optimizations due to its use of tuple IN() clauses. Limit the application of the optimizer for very low nDV cases and extract the sub-clause as a pre-condition during runtime, to trigger the simple column predicate index lookups. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
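For illustration, the runtime pre-condition extraction described above could look like the following sketch (table and column names are hypothetical, and the actual rewrite in the patch may differ):

{code}
-- tuple IN() form produced by the PointLookupOptimizer
select * from t
where struct(key, val) IN (struct(1, 'a'), struct(2, 'b'));

-- same predicate with an extracted per-column pre-condition, which
-- simple predicate pushdown / index lookups can still make use of
select * from t
where key IN (1, 2)   -- extracted pre-condition on a single column
  and struct(key, val) IN (struct(1, 'a'), struct(2, 'b'));
{code}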
[jira] [Commented] (HIVE-11594) Analyze Table For Columns cannot handle columns with embedded spaces
[ https://issues.apache.org/jira/browse/HIVE-11594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702693#comment-14702693 ] Hive QA commented on HIVE-11594: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751103/HIVE-11594.2.patch {color:green}SUCCESS:{color} +1 9371 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5006/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5006/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5006/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12751103 - PreCommit-HIVE-TRUNK-Build Analyze Table For Columns cannot handle columns with embedded spaces Key: HIVE-11594 URL: https://issues.apache.org/jira/browse/HIVE-11594 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-11594.1.patch, HIVE-11594.2.patch {code} create temporary table events(`user id` bigint, `user name` string); explain analyze table events compute statistics for columns `user id`; FAILED: SemanticException [Error 30009]: Encountered parse error while parsing rewritten query {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11573) PointLookupOptimizer can be pessimistic at a low nDV
[ https://issues.apache.org/jira/browse/HIVE-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11573: --- Attachment: HIVE-11573.2.patch PointLookupOptimizer can be pessimistic at a low nDV Key: HIVE-11573 URL: https://issues.apache.org/jira/browse/HIVE-11573 Project: Hive Issue Type: Bug Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Gopal V Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11573.1.patch, HIVE-11573.2.patch The PointLookupOptimizer can turn off some of the optimizations due to its use of tuple IN() clauses. Limit the application of the optimizer for very low nDV cases and extract the sub-clause as a pre-condition during runtime, to trigger the simple column predicate index lookups. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11584) Update committer list
[ https://issues.apache.org/jira/browse/HIVE-11584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702732#comment-14702732 ] Dmitry Tolpeko commented on HIVE-11584: --- +1 Thank you, Ferdinand! Update committer list - Key: HIVE-11584 URL: https://issues.apache.org/jira/browse/HIVE-11584 Project: Hive Issue Type: Bug Reporter: Dmitry Tolpeko Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-11584.1.patch, HIVE-11584.patch Please update committer list: Name: Dmitry Tolpeko Apache ID: dmtolpeko Organization: EPAM (www.epam.com) Thank you, Dmitry -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11602) Support Struct with different field types in query
[ https://issues.apache.org/jira/browse/HIVE-11602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11602: --- Affects Version/s: 2.0.0 Support Struct with different field types in query -- Key: HIVE-11602 URL: https://issues.apache.org/jira/browse/HIVE-11602 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Table: {code} create table journal(`journal id1` string) partitioned by (`journal id2` string); {code} Query: {code} explain select * from journal where struct(`journal id1`, `journal id2`) IN (struct('2013-1000-0133878664', '3'), struct('2013-1000-0133878695', 1)); {code} Exception: {code} 15/08/18 14:52:55 [main]: ERROR ql.Driver: FAILED: SemanticException [Error 10014]: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type! Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)} org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type! 
Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)} at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1196) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:195) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:148) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10595) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10551) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10519) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2681) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2662) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8841) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9713) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9607) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10093) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:417) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1069) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1131) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11572) Datanucleus loads Log4j1.x Logger from AppClassLoader
[ https://issues.apache.org/jira/browse/HIVE-11572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702719#comment-14702719 ] Gopal V commented on HIVE-11572: [~prasanth_j]: the patch LGTM, +1 except for the classpath reordering. A better solution is to reorder a few lines down, where the HIVE_CLASSPATH is loaded before the HADOOP_CLASSPATH. {code} # also pass hive classpath to hadoop if [ "$HIVE_CLASSPATH" != "" ]; then export HADOOP_CLASSPATH=${HIVE_CLASSPATH}:${HADOOP_CLASSPATH}; fi {code} This will fix the DN issue, but lets Tez load its own slf4j configs as-is (when Tez switches to log4j2, we'll have consistency there). Datanucleus loads Log4j1.x Logger from AppClassLoader - Key: HIVE-11572 URL: https://issues.apache.org/jira/browse/HIVE-11572 Project: Hive Issue Type: Sub-task Components: Logging Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Fix For: 2.0.0 Attachments: HIVE-11572.patch As part of HIVE-11304, we moved from Log4j1.x to Log4j2. But DataNucleus log messages get logged to the console when launching the Hive CLI. The reason is that DataNucleus tries to load the Log4j1.x Logger by traversing its class loader. Although we use the log4j-1.2-api bridge, we are loading the log4j-1.2.16 jar that was pulled in by ZooKeeper. We should make sure that there is no log4j-1.2.16 in the DataNucleus classloader hierarchy (classpath). The DataNucleus logger has this: {code} NucleusLogger.class.getClassLoader().loadClass("org.apache.log4j.Logger"); loggerClass = org.datanucleus.util.Log4JLogger.class; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11573) PointLookupOptimizer can be pessimistic at a low nDV
[ https://issues.apache.org/jira/browse/HIVE-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11573: --- Attachment: HIVE-11594.2.patch PointLookupOptimizer can be pessimistic at a low nDV Key: HIVE-11573 URL: https://issues.apache.org/jira/browse/HIVE-11573 Project: Hive Issue Type: Bug Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Gopal V Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11573.1.patch The PointLookupOptimizer can turn off some of the optimizations due to its use of tuple IN() clauses. Limit the application of the optimizer for very low nDV cases and extract the sub-clause as a pre-condition during runtime, to trigger the simple column predicate index lookups. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11573) PointLookupOptimizer can be pessimistic at a low nDV
[ https://issues.apache.org/jira/browse/HIVE-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11573: --- Attachment: (was: HIVE-11594.2.patch) PointLookupOptimizer can be pessimistic at a low nDV Key: HIVE-11573 URL: https://issues.apache.org/jira/browse/HIVE-11573 Project: Hive Issue Type: Bug Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Gopal V Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11573.1.patch The PointLookupOptimizer can turn off some of the optimizations due to its use of tuple IN() clauses. Limit the application of the optimizer for very low nDV cases and extract the sub-clause as a pre-condition during runtime, to trigger the simple column predicate index lookups. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11600) Hive Parser to Support multi col in clause (x,y..) in ((..),..., ())
[ https://issues.apache.org/jira/browse/HIVE-11600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704228#comment-14704228 ] Hive QA commented on HIVE-11600: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751345/HIVE-11600.02.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9369 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_udf1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_varchar_udf1 org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5017/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5017/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5017/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12751345 - PreCommit-HIVE-TRUNK-Build Hive Parser to Support multi col in clause (x,y..) in ((..),..., ()) Key: HIVE-11600 URL: https://issues.apache.org/jira/browse/HIVE-11600 Project: Hive Issue Type: New Feature Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11600.01.patch, HIVE-11600.02.patch Currently, Hive only supports a single column in the IN clause, e.g., {code}select * from src where col0 in (v1,v2,v3);{code} We want it to support {code}select * from src where (col0,col1+3) in ((col0+v1,v2),(v3,v4-col1));{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
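Without parser support for the multi-column form, the same predicate can today only be expressed by expanding it into an OR of equality conjunctions; a sketch of the equivalent rewrite of the desired query (same hypothetical columns and values as in the description):

{code}
select * from src
where (col0 = col0 + v1 and col1 + 3 = v2)
   or (col0 = v3 and col1 + 3 = v4 - col1);
{code}

The proposed syntax would let the parser produce this semantics directly from the tuple IN form.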
[jira] [Commented] (HIVE-11602) Support Struct with different field types in query
[ https://issues.apache.org/jira/browse/HIVE-11602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704286#comment-14704286 ] Hive QA commented on HIVE-11602: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751349/HIVE-11602.01.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9370 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadata_only_queries_with_filters {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5018/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5018/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5018/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12751349 - PreCommit-HIVE-TRUNK-Build Support Struct with different field types in query -- Key: HIVE-11602 URL: https://issues.apache.org/jira/browse/HIVE-11602 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11602.01.patch, HIVE-11602.patch Table: {code} create table journal(`journal id1` string) partitioned by (`journal id2` string); {code} Query: {code} explain select * from journal where struct(`journal id1`, `journal id2`) IN (struct('2013-1000-0133878664', '3'), struct('2013-1000-0133878695', 1)); {code} Exception: {code} 15/08/18 14:52:55 [main]: ERROR ql.Driver: FAILED: SemanticException [Error 10014]: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type! Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)} org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type! 
Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)} at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1196) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:195) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:148) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10595) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10551) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10519) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2681) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2662) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8841) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9713) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9607) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10093) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:417) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1069) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1131) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996) at
[jira] [Assigned] (HIVE-10622) Hive doc error: 'from' is a keyword, when use it as a column name throw error.
[ https://issues.apache.org/jira/browse/HIVE-10622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz reassigned HIVE-10622: - Assignee: Lefty Leverenz Hive doc error: 'from' is a keyword, when use it as a column name throw error. -- Key: HIVE-10622 URL: https://issues.apache.org/jira/browse/HIVE-10622 Project: Hive Issue Type: Bug Components: Documentation Affects Versions: 1.1.1 Reporter: Anne Yu Assignee: Lefty Leverenz In https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML, using from as a column name in CREATE TABLE throws an error. {code} CREATE TABLE pageviews (userid VARCHAR(64), link STRING, from STRING) PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS STORED AS ORC; Error: Error while compiling statement: FAILED: ParseException line 1:57 cannot recognize input near 'from' 'STRING' ')' in column specification (state=42000,code=4) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
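For the doc fix, the example should parse once the reserved word is backtick-quoted (assuming hive.support.quoted.identifiers is left at its default of column); a sketch of the corrected statement:

{code}
CREATE TABLE pageviews (userid VARCHAR(64), link STRING, `from` STRING)
  PARTITIONED BY (datestamp STRING)
  CLUSTERED BY (userid) INTO 256 BUCKETS
  STORED AS ORC;
{code}

Alternatively, the wiki example could simply use a non-reserved column name.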
[jira] [Commented] (HIVE-11605) Incorrect results with bucket map join in tez.
[ https://issues.apache.org/jira/browse/HIVE-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704257#comment-14704257 ] Gopal V commented on HIVE-11605: The fixed hashCode fix is critical. LGTM - +1. [~vikram.dixit]: I've written a small program that generates a near infinite number of test-cases for bucket map-joins to find out if the right cases are triggered. https://gist.github.com/t3rmin4t0r/3185edb18efa29188796 I've annotated each query with the optimal plan according to my algorithm (assuming all types are identical). Incorrect results with bucket map join in tez. -- Key: HIVE-11605 URL: https://issues.apache.org/jira/browse/HIVE-11605 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.0, 1.2.0, 1.0.1 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Critical Attachments: HIVE-11605.1.patch In some cases, we aggressively try to convert to a bucket map join and this ends up producing incorrect results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-11609) Capability to add a filter to hbase scan via composite key doesn't work
[ https://issues.apache.org/jira/browse/HIVE-11609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swarnim Kulkarni reassigned HIVE-11609: --- Assignee: Swarnim Kulkarni Capability to add a filter to hbase scan via composite key doesn't work --- Key: HIVE-11609 URL: https://issues.apache.org/jira/browse/HIVE-11609 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni It seems like the capability to add a filter to an HBase scan, which was added as part of HIVE-6411, doesn't work. This is primarily because in HiveHBaseInputFormat the filter is added in getSplits instead of getRecordReader. This works fine for start and stop keys, but not for a filter, because a filter is only respected when an actual scan is performed. This is also related to the initial refactoring that was done as part of HIVE-3420. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11595) refactor ORC footer reading to make it usable from outside
[ https://issues.apache.org/jira/browse/HIVE-11595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704289#comment-14704289 ] Hive QA commented on HIVE-11595: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751377/HIVE-11595.01.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5019/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5019/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5019/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-5019/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + 
[[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at ab03dc9 HIVE-11502: Map side aggregation is extremely slow (Yongzhi Chen, reviewed by Chao Sun) + git clean -f -d + git checkout master Already on 'master' + git reset --hard origin/master HEAD is now at ab03dc9 HIVE-11502: Map side aggregation is extremely slow (Yongzhi Chen, reviewed by Chao Sun) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12751377 - PreCommit-HIVE-TRUNK-Build refactor ORC footer reading to make it usable from outside -- Key: HIVE-11595 URL: https://issues.apache.org/jira/browse/HIVE-11595 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-10595.patch, HIVE-11595.01.patch If ORC footer is read from cache, we want to parse it without having the reader, opening a file, etc. I thought it would be as simple as protobuf parseFrom bytes, but apparently there's bunch of stuff going on there. It needs to be accessible via something like parseFrom(ByteBuffer), or similar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)