[jira] [Updated] (HIVE-8498) Insert into table misses some rows when vectorization is enabled

2014-10-30 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-8498:
---
Status: Patch Available  (was: Open)

> Insert into table misses some rows when vectorization is enabled
> 
>
> Key: HIVE-8498
> URL: https://issues.apache.org/jira/browse/HIVE-8498
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 0.13.1, 0.14.0
>Reporter: Prasanth J
>Assignee: Jitendra Nath Pandey
>Priority: Critical
>  Labels: vectorization
> Attachments: HIVE-8498.01.patch, HIVE-8498.02.patch, 
> HIVE-8498.3.patch, HIVE-8498.4.patch
>
>
>  Following is a small reproducible case for the issue
> create table orc1
>   stored as orc
>   tblproperties("orc.compress"="ZLIB")
>   as
> select rn
> from
> (
>   select cast(1 as int) as rn from src limit 1
>   union all
>   select cast(100 as int) as rn from src limit 1
>   union all
>   select cast(1 as int) as rn from src limit 1
> ) t;
> create table orc_rn1 (rn int);
> create table orc_rn2 (rn int);
> create table orc_rn3 (rn int);
> // These inserts should produce 3 rows but only 1 row is produced
> from orc1 a
> insert overwrite table orc_rn1 select a.* where a.rn < 100
> insert overwrite table orc_rn2 select a.* where a.rn >= 100 and a.rn < 1000
> insert overwrite table orc_rn3 select a.* where a.rn >= 1000;
> select * from orc_rn1
> union all
> select * from orc_rn2
> union all
> select * from orc_rn3;
> The expected output of the query is
> 1
> 100
> 1
> But with vectorization enabled we get
> 1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8498) Insert into table misses some rows when vectorization is enabled

2014-10-30 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-8498:
---
Attachment: HIVE-8498.4.patch

The updated patch addresses Gopal's comments and also fixes the test failures.

> Insert into table misses some rows when vectorization is enabled
> 
>
> Key: HIVE-8498
> URL: https://issues.apache.org/jira/browse/HIVE-8498
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Prasanth J
>Assignee: Jitendra Nath Pandey
>Priority: Critical
>  Labels: vectorization
> Attachments: HIVE-8498.01.patch, HIVE-8498.02.patch, 
> HIVE-8498.3.patch, HIVE-8498.4.patch
>
>
>  Following is a small reproducible case for the issue
> create table orc1
>   stored as orc
>   tblproperties("orc.compress"="ZLIB")
>   as
> select rn
> from
> (
>   select cast(1 as int) as rn from src limit 1
>   union all
>   select cast(100 as int) as rn from src limit 1
>   union all
>   select cast(1 as int) as rn from src limit 1
> ) t;
> create table orc_rn1 (rn int);
> create table orc_rn2 (rn int);
> create table orc_rn3 (rn int);
> // These inserts should produce 3 rows but only 1 row is produced
> from orc1 a
> insert overwrite table orc_rn1 select a.* where a.rn < 100
> insert overwrite table orc_rn2 select a.* where a.rn >= 100 and a.rn < 1000
> insert overwrite table orc_rn3 select a.* where a.rn >= 1000;
> select * from orc_rn1
> union all
> select * from orc_rn2
> union all
> select * from orc_rn3;
> The expected output of the query is
> 1
> 100
> 1
> But with vectorization enabled we get
> 1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8498) Insert into table misses some rows when vectorization is enabled

2014-10-30 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-8498:
---
Status: Open  (was: Patch Available)

> Insert into table misses some rows when vectorization is enabled
> 
>
> Key: HIVE-8498
> URL: https://issues.apache.org/jira/browse/HIVE-8498
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 0.13.1, 0.14.0
>Reporter: Prasanth J
>Assignee: Jitendra Nath Pandey
>Priority: Critical
>  Labels: vectorization
> Attachments: HIVE-8498.01.patch, HIVE-8498.02.patch, HIVE-8498.3.patch
>
>
>  Following is a small reproducible case for the issue
> create table orc1
>   stored as orc
>   tblproperties("orc.compress"="ZLIB")
>   as
> select rn
> from
> (
>   select cast(1 as int) as rn from src limit 1
>   union all
>   select cast(100 as int) as rn from src limit 1
>   union all
>   select cast(1 as int) as rn from src limit 1
> ) t;
> create table orc_rn1 (rn int);
> create table orc_rn2 (rn int);
> create table orc_rn3 (rn int);
> // These inserts should produce 3 rows but only 1 row is produced
> from orc1 a
> insert overwrite table orc_rn1 select a.* where a.rn < 100
> insert overwrite table orc_rn2 select a.* where a.rn >= 100 and a.rn < 1000
> insert overwrite table orc_rn3 select a.* where a.rn >= 1000;
> select * from orc_rn1
> union all
> select * from orc_rn2
> union all
> select * from orc_rn3;
> The expected output of the query is
> 1
> 100
> 1
> But with vectorization enabled we get
> 1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8646) Hive class loading failure when executing Hive action via oozie workflows

2014-10-30 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8646:
-
Priority: Critical  (was: Major)

> Hive class loading  failure when executing Hive action via oozie workflows
> --
>
> Key: HIVE-8646
> URL: https://issues.apache.org/jira/browse/HIVE-8646
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment: Hadoop 2.6.0 + Hive 0.14 + Oozie 4.1
>Reporter: Venkat Ranganathan
>Priority: Critical
> Attachments: HIVE-8646.1.patch.txt
>
>
> When running Hive actions with Oozie we hit this issue sometimes. What is 
> interesting is that we have all the necessary jars in the classpath (or at 
> least they are expected to be localized).
> This static initialization block was introduced by HIVE-3925.
> ==
> Exception in thread "main" java.lang.ExceptionInInitializerError
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:270)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:214)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: java.lang.TypeNotPresentException: Type 
> org.apache.hadoop.hive.metastore.api.FieldSchema not present
>   at 
> sun.reflect.generics.factory.CoreReflectionFactory.makeNamedType(CoreReflectionFactory.java:117)
>   at 
> sun.reflect.generics.visitor.Reifier.visitClassTypeSignature(Reifier.java:125)
>   at 
> sun.reflect.generics.tree.ClassTypeSignature.accept(ClassTypeSignature.java:49)
>   at 
> sun.reflect.generics.visitor.Reifier.reifyTypeArguments(Reifier.java:68)
>   at 
> sun.reflect.generics.visitor.Reifier.visitClassTypeSignature(Reifier.java:138)
>   at 
> sun.reflect.generics.tree.ClassTypeSignature.accept(ClassTypeSignature.java:49)
>   at 
> sun.reflect.generics.repository.MethodRepository.getReturnType(MethodRepository.java:68)
>   at java.lang.reflect.Method.getGenericReturnType(Method.java:245)
>   at 
> java.beans.FeatureDescriptor.getReturnType(FeatureDescriptor.java:370)
>   at java.beans.Introspector.getTargetEventInfo(Introspector.java:996)
>   at java.beans.Introspector.getBeanInfo(Introspector.java:417)
>   at java.beans.Introspector.getBeanInfo(Introspector.java:163)
>   at 
> org.apache.hadoop.hive.ql.exec.PTFUtils.makeTransient(PTFUtils.java:267)
>   at org.apache.hadoop.hive.ql.exec.Task.<clinit>(Task.java:53)
>   ... 4 more
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.metastore.api.FieldSchema
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:270)
>   at 
> sun.reflect.generics.factory.CoreReflectionFactory.makeNamedType(CoreReflectionFactory.java:114)
>   ... 17 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8646) Hive class loading failure when executing Hive action via oozie workflows

2014-10-30 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191487#comment-14191487
 ] 

Gunther Hagleitner commented on HIVE-8646:
--

I think we should just go with Navis' patch for now. Since he removed the 
FieldSchema reference, there's a good chance this resolves the Oozie issue. 
[~venkatnrangan] can test once it's in. The problem with "it's just a boolean" 
is that we clone tasks a lot, and there might be weird side effects of removing 
the block. +1 for trunk and branch.
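
A hypothetical, self-contained sketch of the failure mode in the stack trace
above: a static initializer that runs bean introspection over its own class.
If any generic method signature references a type missing from the classpath,
class loading itself fails with ExceptionInInitializerError. The class and
method names below are invented, not Hive's.

{code}
import java.beans.Introspector;
import java.util.List;

class Holder {
  static {
    try {
      // Introspection reflects over every method; resolving a generic
      // signature that names a missing class throws TypeNotPresentException,
      // which surfaces as ExceptionInInitializerError when Holder is loaded.
      Introspector.getBeanInfo(Holder.class);
    } catch (Exception e) {
      throw new ExceptionInInitializerError(e);
    }
  }

  // If this signature referenced a metastore type absent from the classpath
  // (e.g. FieldSchema in the trace above), the static block would fail at
  // class-load time even though the method is never called.
  public List<String> getSchemas() { return null; }
}
{code}

Doing such work lazily, or dropping the problematic reference as the patch
does, keeps class loading from being poisoned by an incomplete classpath.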

> Hive class loading  failure when executing Hive action via oozie workflows
> --
>
> Key: HIVE-8646
> URL: https://issues.apache.org/jira/browse/HIVE-8646
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment: Hadoop 2.6.0 + Hive 0.14 + Oozie 4.1
>Reporter: Venkat Ranganathan
> Attachments: HIVE-8646.1.patch.txt
>
>
> When running Hive actions with Oozie we hit this issue sometimes. What is 
> interesting is that we have all the necessary jars in the classpath (or at 
> least they are expected to be localized).
> This static initialization block was introduced by HIVE-3925.
> ==
> Exception in thread "main" java.lang.ExceptionInInitializerError
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:270)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:214)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: java.lang.TypeNotPresentException: Type 
> org.apache.hadoop.hive.metastore.api.FieldSchema not present
>   at 
> sun.reflect.generics.factory.CoreReflectionFactory.makeNamedType(CoreReflectionFactory.java:117)
>   at 
> sun.reflect.generics.visitor.Reifier.visitClassTypeSignature(Reifier.java:125)
>   at 
> sun.reflect.generics.tree.ClassTypeSignature.accept(ClassTypeSignature.java:49)
>   at 
> sun.reflect.generics.visitor.Reifier.reifyTypeArguments(Reifier.java:68)
>   at 
> sun.reflect.generics.visitor.Reifier.visitClassTypeSignature(Reifier.java:138)
>   at 
> sun.reflect.generics.tree.ClassTypeSignature.accept(ClassTypeSignature.java:49)
>   at 
> sun.reflect.generics.repository.MethodRepository.getReturnType(MethodRepository.java:68)
>   at java.lang.reflect.Method.getGenericReturnType(Method.java:245)
>   at 
> java.beans.FeatureDescriptor.getReturnType(FeatureDescriptor.java:370)
>   at java.beans.Introspector.getTargetEventInfo(Introspector.java:996)
>   at java.beans.Introspector.getBeanInfo(Introspector.java:417)
>   at java.beans.Introspector.getBeanInfo(Introspector.java:163)
>   at 
> org.apache.hadoop.hive.ql.exec.PTFUtils.makeTransient(PTFUtils.java:267)
>   at org.apache.hadoop.hive.ql.exec.Task.<clinit>(Task.java:53)
>   ... 4 more
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.metastore.api.FieldSchema
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:270)
>   at 
> sun.reflect.generics.factory.CoreReflectionFactory.makeNamedType(CoreReflectionFactory.java:114)
>   ... 17 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8392) HiveServer2 Operation.close fails on windows

2014-10-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191485#comment-14191485
 ] 

Hive QA commented on HIVE-8392:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678296/HIVE-8392.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6608 tests executed
*Failed tests:*
{noformat}
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1566/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1566/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1566/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678296 - PreCommit-HIVE-TRUNK-Build

> HiveServer2 Operation.close fails on windows
> 
>
> Key: HIVE-8392
> URL: https://issues.apache.org/jira/browse/HIVE-8392
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.14.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 0.14.0
>
> Attachments: HIVE-8392.1.patch
>
>
> {code}
> java.io.IOException: Unable to delete file: 
> C:\Users\HADOOP~1.ONP\AppData\Local\Temp\hadoop\operation_logs\ac7d4f51-d9b9-4189-b248-6e8d5e3102af\4b1f1153-5c0c-4741-8f53-1f1b6ed9b190
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279)
>   at 
> org.apache.hive.service.cli.operation.OperationLog$LogFile.remove(OperationLog.java:131)
>   at 
> org.apache.hive.service.cli.operation.OperationLog.close(OperationLog.java:95)
>   at 
> org.apache.hive.service.cli.operation.Operation.cleanupOperationLog(Operation.java:268)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.close(SQLOperation.java:307)
>   at 
> org.apache.hive.service.cli.operation.OperationManager.closeOperation(OperationManager.java:215)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.closeOperation(HiveSessionImpl.java:640)
>   at 
> org.apache.hive.service.cli.CLIService.closeOperation(CLIService.java:392)
>   at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.CloseOperation(ThriftCLIService.java:573)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$CloseOperation.getResult(TCLIService.java:1513)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$CloseOperation.getResult(TCLIService.java:1498)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> On Windows, close needs to be called before delete.
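
A minimal sketch of the close-before-delete pattern, assuming the operation
log keeps an open writer to the file; the class below is hypothetical, not
the actual OperationLog code.

{code}
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

final class LogFileHandle {
  private final File file;
  private BufferedWriter out;

  LogFileHandle(File file) throws IOException {
    this.file = file;
    this.out = new BufferedWriter(new FileWriter(file));
  }

  void remove() throws IOException {
    if (out != null) {
      out.close();   // on Windows an open handle makes delete fail
      out = null;
    }
    if (!file.delete()) {
      throw new IOException("Unable to delete file: " + file);
    }
  }
}
{code}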



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-3187) support ISO-2012 timestamp literals

2014-10-30 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191471#comment-14191471
 ] 

Jason Dere commented on HIVE-3187:
--

+1

> support ISO-2012 timestamp literals
> ---
>
> Key: HIVE-3187
> URL: https://issues.apache.org/jira/browse/HIVE-3187
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.9.0
>Reporter: N Campbell
>Assignee: Navis
> Attachments: HIVE-3187.1.patch.txt, HIVE-3187.2.patch.txt, 
> HIVE-3187.3.patch.txt
>
>
> Enable the JDBC driver/Hive SQL engine to accept JDBC canonical or ISO-SQL 
> 20xx Timestamp literals
> i.e.
> select 1 from cert.tversion tversion where timestamp '1989-01-01 
> 10:20:30.0' <> timestamp '2000-12-31 12:15:30.12300'
> instead of
> unix_timestamp('.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8677) TPC-DS Q51 : fails with "init not supported" exception in GenericUDAFStreamingEvaluator.init

2014-10-30 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191443#comment-14191443
 ] 

Gunther Hagleitner commented on HIVE-8677:
--

The problem is that with container reuse the PTFOperator gets initialized 
multiple times. With the streaming support we now update the PTFDesc in a way 
that can't be repeated: we replace a GenericUDAFEvaluator with a wrapper class, 
GenericUDAFStreamingEvaluator. The second time we call init, it throws an 
exception.

The fix is simple: just forward the init call to the wrapped class. The 
wrapping does not happen twice, and everything else seems to get initialized 
correctly the second time around. I tested it on a few queries - seems ok.
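
A minimal sketch of the forwarding idea, assuming the wrapper keeps a
reference to the evaluator it wraps; the class name is hypothetical and this
is not the actual patch.

{code}
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;

abstract class ForwardingStreamingEvaluator extends GenericUDAFEvaluator {
  protected final GenericUDAFEvaluator wrappedEval;

  ForwardingStreamingEvaluator(GenericUDAFEvaluator wrappedEval) {
    this.wrappedEval = wrappedEval;
  }

  @Override
  public ObjectInspector init(Mode m, ObjectInspector[] parameters)
      throws HiveException {
    // Container reuse may initialize the operator tree more than once;
    // forwarding lets a second init succeed instead of throwing.
    return wrappedEval.init(m, parameters);
  }
}
{code}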

[~rhbutani] can you take a look? [~gopalv] if you have some time for review 
that'd be great too.

> TPC-DS Q51 : fails with "init not supported" exception in 
> GenericUDAFStreamingEvaluator.init
> 
>
> Key: HIVE-8677
> URL: https://issues.apache.org/jira/browse/HIVE-8677
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Mostafa Mokhtar
>Assignee: Gunther Hagleitner
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8677.1.patch
>
>
> TPC-DS Q51 fails with the exception below 
> {code}
> , TaskAttempt 3 failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: java.lang.RuntimeException: Reduce operator 
> initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: Reduce operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:146)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
>   ... 13 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: : init not 
> supported
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDAFStreamingEvaluator.init(GenericUDAFStreamingEvaluator.java:70)
>   at 
> org.apache.hadoop.hive.ql.plan.PTFDeserializer.setupWdwFnEvaluator(PTFDeserializer.java:209)
>   at 
> org.apache.hadoop.hive.ql.plan.PTFDeserializer.initializeWindowing(PTFDeserializer.java:130)
>   at 
> org.apache.hadoop.hive.ql.plan.PTFDeserializer.initializePTFChain(PTFDeserializer.java:94)
>   at 
> org.apache.hadoop.hive.ql.exec.PTFOperator.reconstructQueryDef(PTFOperator.java:144)
>   at 
> org.apache.hadoop.hive.ql.exec.PTFOperator.initializeOp(PTFOperator.java:74)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425)
>   at 
> org.apache.hadoop.hive.ql.exec.ExtractOperator.initializeOp(ExtractOperator.java:40)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:116)
>   ... 14 more
> {code}
> Query
> {code}
> set hive.cbo.enable=true;
> set hive.stats.fetch.column.stats=true;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.tez.auto.reducer.parallelism=true;
> set hive.tez.exec.print.summary=true;
> set hive.auto.convert.join.noconditionaltask.size=128000;
> set hive.exec.reducers.bytes.per.reducer=1;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager;
> set hive.support.concurrency=false;

[jira] [Updated] (HIVE-8677) TPC-DS Q51 : fails with "init not supported" exception in GenericUDAFStreamingEvaluator.init

2014-10-30 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8677:
-
Status: Patch Available  (was: Open)

> TPC-DS Q51 : fails with "init not supported" exception in 
> GenericUDAFStreamingEvaluator.init
> 
>
> Key: HIVE-8677
> URL: https://issues.apache.org/jira/browse/HIVE-8677
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Mostafa Mokhtar
>Assignee: Gunther Hagleitner
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8677.1.patch
>
>
> TPC-DS Q51 fails with the exception below 
> {code}
> , TaskAttempt 3 failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: java.lang.RuntimeException: Reduce operator 
> initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: Reduce operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:146)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
>   ... 13 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: : init not 
> supported
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDAFStreamingEvaluator.init(GenericUDAFStreamingEvaluator.java:70)
>   at 
> org.apache.hadoop.hive.ql.plan.PTFDeserializer.setupWdwFnEvaluator(PTFDeserializer.java:209)
>   at 
> org.apache.hadoop.hive.ql.plan.PTFDeserializer.initializeWindowing(PTFDeserializer.java:130)
>   at 
> org.apache.hadoop.hive.ql.plan.PTFDeserializer.initializePTFChain(PTFDeserializer.java:94)
>   at 
> org.apache.hadoop.hive.ql.exec.PTFOperator.reconstructQueryDef(PTFOperator.java:144)
>   at 
> org.apache.hadoop.hive.ql.exec.PTFOperator.initializeOp(PTFOperator.java:74)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425)
>   at 
> org.apache.hadoop.hive.ql.exec.ExtractOperator.initializeOp(ExtractOperator.java:40)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:116)
>   ... 14 more
> {code}
> Query
> {code}
> set hive.cbo.enable=true;
> set hive.stats.fetch.column.stats=true;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.tez.auto.reducer.parallelism=true;
> set hive.tez.exec.print.summary=true;
> set hive.auto.convert.join.noconditionaltask.size=128000;
> set hive.exec.reducers.bytes.per.reducer=1;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager;
> set hive.support.concurrency=false;
>  
> WITH web_v1 as (
> select
>   ws_item_sk item_sk, d_date, sum(ws_sales_price),
>   sum(sum(ws_sales_price))
>   over (partition by ws_item_sk order by d_date rows between unbounded 
> preceding and current row) cume_sales
> from web_sales
> ,date_dim
> where ws_sold_date_sk=d_date_sk
>   and d_month_seq between 1193 and 1193+11
>   and ws_item_sk is not NULL
> group by ws_item_sk, d_date),
> store_v1 as (
> select
>   ss_item_sk item_sk, d_date, sum(ss_sales_price),
>   sum(sum(ss_sales_price))
>   over (partition by ss_item_sk order by d_date rows between unbounded 
> preceding and current row) cume_sales

[jira] [Updated] (HIVE-7930) enable vectorization_short_regress.q, vector_string_concat.q [Spark Branch]

2014-10-30 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-7930:
---
Status: Patch Available  (was: Open)

> enable vectorization_short_regress.q,  vector_string_concat.q [Spark Branch]
> 
>
> Key: HIVE-7930
> URL: https://issues.apache.org/jira/browse/HIVE-7930
> Project: Hive
>  Issue Type: Bug
>Affects Versions: spark-branch
>Reporter: Chinna Rao Lalam
>Assignee: Chinna Rao Lalam
> Fix For: spark-branch
>
> Attachments: HIVE-7930-spark.patch
>
>
> {quote}
> vector_string_concat.q
> vectorization_short_regress.q
> {quote}
> These queries currently execute as normal (non-vectorized) queries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7930) enable vectorization_short_regress.q, vector_string_concat.q [Spark Branch]

2014-10-30 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-7930:
---
Attachment: HIVE-7930-spark.patch

vector_string_concat.q is already enabled. The patch has been updated to 
enable vectorization_short_regress.q.

> enable vectorization_short_regress.q,  vector_string_concat.q [Spark Branch]
> 
>
> Key: HIVE-7930
> URL: https://issues.apache.org/jira/browse/HIVE-7930
> Project: Hive
>  Issue Type: Bug
>Affects Versions: spark-branch
>Reporter: Chinna Rao Lalam
>Assignee: Chinna Rao Lalam
> Fix For: spark-branch
>
> Attachments: HIVE-7930-spark.patch
>
>
> {quote}
> vector_string_concat.q
> vectorization_short_regress.q
> {quote}
> These queries currently execute as normal (non-vectorized) queries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8677) TPC-DS Q51 : fails with "init not supported" exception in GenericUDAFStreamingEvaluator.init

2014-10-30 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8677:
-
Attachment: HIVE-8677.1.patch

> TPC-DS Q51 : fails with "init not supported" exception in 
> GenericUDAFStreamingEvaluator.init
> 
>
> Key: HIVE-8677
> URL: https://issues.apache.org/jira/browse/HIVE-8677
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Mostafa Mokhtar
>Assignee: Gunther Hagleitner
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8677.1.patch
>
>
> TPC-DS Q51 fails with the exception below 
> {code}
> , TaskAttempt 3 failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: java.lang.RuntimeException: Reduce operator 
> initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: Reduce operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:146)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
>   ... 13 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: : init not 
> supported
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDAFStreamingEvaluator.init(GenericUDAFStreamingEvaluator.java:70)
>   at 
> org.apache.hadoop.hive.ql.plan.PTFDeserializer.setupWdwFnEvaluator(PTFDeserializer.java:209)
>   at 
> org.apache.hadoop.hive.ql.plan.PTFDeserializer.initializeWindowing(PTFDeserializer.java:130)
>   at 
> org.apache.hadoop.hive.ql.plan.PTFDeserializer.initializePTFChain(PTFDeserializer.java:94)
>   at 
> org.apache.hadoop.hive.ql.exec.PTFOperator.reconstructQueryDef(PTFOperator.java:144)
>   at 
> org.apache.hadoop.hive.ql.exec.PTFOperator.initializeOp(PTFOperator.java:74)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425)
>   at 
> org.apache.hadoop.hive.ql.exec.ExtractOperator.initializeOp(ExtractOperator.java:40)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:116)
>   ... 14 more
> {code}
> Query
> {code}
> set hive.cbo.enable=true;
> set hive.stats.fetch.column.stats=true;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.tez.auto.reducer.parallelism=true;
> set hive.tez.exec.print.summary=true;
> set hive.auto.convert.join.noconditionaltask.size=128000;
> set hive.exec.reducers.bytes.per.reducer=1;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager;
> set hive.support.concurrency=false;
>  
> WITH web_v1 as (
> select
>   ws_item_sk item_sk, d_date, sum(ws_sales_price),
>   sum(sum(ws_sales_price))
>   over (partition by ws_item_sk order by d_date rows between unbounded 
> preceding and current row) cume_sales
> from web_sales
> ,date_dim
> where ws_sold_date_sk=d_date_sk
>   and d_month_seq between 1193 and 1193+11
>   and ws_item_sk is not NULL
> group by ws_item_sk, d_date),
> store_v1 as (
> select
>   ss_item_sk item_sk, d_date, sum(ss_sales_price),
>   sum(sum(ss_sales_price))
>   over (partition by ss_item_sk order by d_date rows between unbounded 
> preceding and current row) cume_sales
> from

[jira] [Commented] (HIVE-8498) Insert into table misses some rows when vectorization is enabled

2014-10-30 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191412#comment-14191412
 ] 

Jitendra Nath Pandey commented on HIVE-8498:


bq. The local copy of the vector needs to hold onto the selectedInUse as well.
Good catch, I will update the patch.

> Insert into table misses some rows when vectorization is enabled
> 
>
> Key: HIVE-8498
> URL: https://issues.apache.org/jira/browse/HIVE-8498
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Prasanth J
>Assignee: Jitendra Nath Pandey
>Priority: Critical
>  Labels: vectorization
> Attachments: HIVE-8498.01.patch, HIVE-8498.02.patch, HIVE-8498.3.patch
>
>
>  Following is a small reproducible case for the issue
> create table orc1
>   stored as orc
>   tblproperties("orc.compress"="ZLIB")
>   as
> select rn
> from
> (
>   select cast(1 as int) as rn from src limit 1
>   union all
>   select cast(100 as int) as rn from src limit 1
>   union all
>   select cast(1 as int) as rn from src limit 1
> ) t;
> create table orc_rn1 (rn int);
> create table orc_rn2 (rn int);
> create table orc_rn3 (rn int);
> // These inserts should produce 3 rows but only 1 row is produced
> from orc1 a
> insert overwrite table orc_rn1 select a.* where a.rn < 100
> insert overwrite table orc_rn2 select a.* where a.rn >= 100 and a.rn < 1000
> insert overwrite table orc_rn3 select a.* where a.rn >= 1000;
> select * from orc_rn1
> union all
> select * from orc_rn2
> union all
> select * from orc_rn3;
> The expected output of the query is
> 1
> 100
> 1
> But with vectorization enabled we get
> 1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8498) Insert into table misses some rows when vectorization is enabled

2014-10-30 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191407#comment-14191407
 ] 

Gopal V commented on HIVE-8498:
---

The local copy of the vector needs to hold onto the selectedInUse as well.
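
To illustrate the class of bug being described, here is a hypothetical sketch
(not the actual operator code) of saving and restoring a batch's selection
state; copying selected[] while forgetting the selectedInUse flag is exactly
the kind of omission that silently drops rows.

{code}
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;

final class BatchStateSnapshot {
  private int size;
  private int[] selected;
  private boolean selectedInUse;   // forgetting this field is the bug

  void save(VectorizedRowBatch batch) {
    size = batch.size;
    selected = batch.selected.clone();
    selectedInUse = batch.selectedInUse;
  }

  void restore(VectorizedRowBatch batch) {
    batch.size = size;
    System.arraycopy(selected, 0, batch.selected, 0, selected.length);
    batch.selectedInUse = selectedInUse;   // must travel with selected[]
  }
}
{code}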

> Insert into table misses some rows when vectorization is enabled
> 
>
> Key: HIVE-8498
> URL: https://issues.apache.org/jira/browse/HIVE-8498
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Prasanth J
>Assignee: Jitendra Nath Pandey
>Priority: Critical
>  Labels: vectorization
> Attachments: HIVE-8498.01.patch, HIVE-8498.02.patch, HIVE-8498.3.patch
>
>
>  Following is a small reproducible case for the issue
> create table orc1
>   stored as orc
>   tblproperties("orc.compress"="ZLIB")
>   as
> select rn
> from
> (
>   select cast(1 as int) as rn from src limit 1
>   union all
>   select cast(100 as int) as rn from src limit 1
>   union all
>   select cast(1 as int) as rn from src limit 1
> ) t;
> create table orc_rn1 (rn int);
> create table orc_rn2 (rn int);
> create table orc_rn3 (rn int);
> // These inserts should produce 3 rows but only 1 row is produced
> from orc1 a
> insert overwrite table orc_rn1 select a.* where a.rn < 100
> insert overwrite table orc_rn2 select a.* where a.rn >= 100 and a.rn < 1000
> insert overwrite table orc_rn3 select a.* where a.rn >= 1000;
> select * from orc_rn1
> union all
> select * from orc_rn2
> union all
> select * from orc_rn3;
> The expected output of the query is
> 1
> 100
> 1
> But with vectorization enabled we get
> 1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 26854: HIVE-2573 Create per-session function registry

2014-10-30 Thread Navis Ryu


> On Oct. 31, 2014, 1:32 a.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java, line 1594
> > 
> >
> > Are these changes meant for this Jira?
> > 
> > Hmm, I think I see why - have the persistent function lookups done 
> > during static initialization now caused the Hive class to not be 
> > instantiable during runtime? Could you try moving the persistent function 
> > lookups out of static initialization, and into a method, which gets called 
> > (but only initialized once) during SessionState.start()? Would that take 
> > care of the issue?

Hive.class is an access point to the metastore, which should not be referenced 
from runtime classes like Table, Task, etc. These changes are for that. 
Decoupling the persistent function registration code from the static 
initializer seems like a good idea. I'll try that.
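
As a sketch of the decoupling being discussed, and assuming a once-per-process
guard is acceptable, the lookup could move out of the static initializer into
a method called from session startup; all names here are hypothetical.

{code}
import java.util.concurrent.atomic.AtomicBoolean;

final class PersistentFunctionLoader {
  private static final AtomicBoolean loaded = new AtomicBoolean(false);

  // Called from session startup instead of a static { } block, so merely
  // loading a class no longer requires the metastore to be reachable.
  static void ensureLoaded() {
    if (loaded.compareAndSet(false, true)) {
      registerPersistentFunctionsFromMetastore();   // hypothetical helper
    }
  }

  private static void registerPersistentFunctionsFromMetastore() {
    // ... query the metastore and register each persistent function ...
  }
}
{code}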


> On Oct. 31, 2014, 1:32 a.m., Jason Dere wrote:
> > service/src/test/org/apache/hadoop/hive/service/TestHiveServerSessions.java,
> >  line 134
> > 
> >
> > What's the issue here?

I tried but couldn't make MR run in hadoop-2. Any ideas?


> On Oct. 31, 2014, 1:32 a.m., Jason Dere wrote:
> > itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestJdbcWithMiniKdc.java,
> >  line 176
> > 
> >
> > Is this change relevant to this Jira? Or is this just a general fix to 
> > TestJdbcWithMiniKdc.testNegativeTokenAuth, which I have noticed to be 
> > failing consistently in the precommit tests?

This is my bad from HIVE-8186. Changing the error message will make this test 
pass, but HIVE-8481 would be a better place to do that. I'll remove this part.


> On Oct. 31, 2014, 1:32 a.m., Jason Dere wrote:
> > metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java,
> >  line 144
> > 
> >
> > Are the changes in this file meant for this Jira?

This is called from the static initializer in the Hive class. ObjectStore 
always returns a collection, but that's not true in this class. It could be 
fixed by checking for null, but it seemed better to fix this for consistency.


> On Oct. 31, 2014, 1:32 a.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java, line 399
> > 
> >
> > Was this line meant to be in here?

Yes, my bad. It's a remnant of futile attempts to run TestHiveServerSessions 
in hadoop-2. It will be removed.


- Navis


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26854/#review59281
---


On Oct. 30, 2014, 11:41 p.m., Jason Dere wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26854/
> ---
> 
> (Updated Oct. 30, 2014, 11:41 p.m.)
> 
> 
> Review request for hive, Navis Ryu and Thejas Nair.
> 
> 
> Bugs: HIVE-2573
> https://issues.apache.org/jira/browse/HIVE-2573
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Small updates to Navis' changes:
> - session registry doesn't lookup metastore for UDFs
> - my feedback from Navis' original patch
> - metastore udfs should not be considered native. This allows them to be 
> added/removed from registry
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/JavaUtils.java 9aa917c 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7d8e5bc 
>   contrib/src/test/results/clientnegative/invalid_row_sequence.q.out 8f3c0b3 
>   
> itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestJdbcWithMiniKdc.java
>  6647ce5 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
>  88b0791 
>   ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 9ac540e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/CommonFunctionInfo.java 93c15c0 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionInfo.java 074255b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java e43a328 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionTask.java 569c125 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Registry.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 5bdeb92 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java efecb05 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 4e3df75 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java b900627 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java 13277a9 
>   ql/src/java/org/

[jira] [Created] (HIVE-8682) Enable table statistic collection on counter for CTAS query[Spark Branch]

2014-10-30 Thread Chengxiang Li (JIRA)
Chengxiang Li created HIVE-8682:
---

 Summary: Enable table statistic collection on counter for CTAS 
query[Spark Branch]
 Key: HIVE-8682
 URL: https://issues.apache.org/jira/browse/HIVE-8682
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li


A CREATE TABLE AS SELECT query loads data into the newly created table, so we 
should enable counter-based table statistic collection for it as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8498) Insert into table misses some rows when vectorization is enabled

2014-10-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191381#comment-14191381
 ] 

Hive QA commented on HIVE-8498:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678293/HIVE-8498.3.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 6609 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_shufflejoin
org.apache.hadoop.hive.ql.exec.vector.TestVectorFilterOperator.testBasicFilterOperator
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1565/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1565/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1565/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678293 - PreCommit-HIVE-TRUNK-Build

> Insert into table misses some rows when vectorization is enabled
> 
>
> Key: HIVE-8498
> URL: https://issues.apache.org/jira/browse/HIVE-8498
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Prasanth J
>Assignee: Jitendra Nath Pandey
>Priority: Critical
>  Labels: vectorization
> Attachments: HIVE-8498.01.patch, HIVE-8498.02.patch, HIVE-8498.3.patch
>
>
>  Following is a small reproducible case for the issue
> create table orc1
>   stored as orc
>   tblproperties("orc.compress"="ZLIB")
>   as
> select rn
> from
> (
>   select cast(1 as int) as rn from src limit 1
>   union all
>   select cast(100 as int) as rn from src limit 1
>   union all
>   select cast(1 as int) as rn from src limit 1
> ) t;
> create table orc_rn1 (rn int);
> create table orc_rn2 (rn int);
> create table orc_rn3 (rn int);
> // These inserts should produce 3 rows but only 1 row is produced
> from orc1 a
> insert overwrite table orc_rn1 select a.* where a.rn < 100
> insert overwrite table orc_rn2 select a.* where a.rn >= 100 and a.rn < 1000
> insert overwrite table orc_rn3 select a.* where a.rn >= 1000;
> select * from orc_rn1
> union all
> select * from orc_rn2
> union all
> select * from orc_rn3;
> The expected output of the query is
> 1
> 100
> 1
> But with vectorization enabled we get
> 1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8202) Support SMB Join for Hive on Spark [Spark Branch]

2014-10-30 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191380#comment-14191380
 ] 

Szehon Ho commented on HIVE-8202:
-

Thanks, Xuefu

> Support SMB Join for Hive on Spark [Spark Branch]
> -
>
> Key: HIVE-8202
> URL: https://issues.apache.org/jira/browse/HIVE-8202
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Szehon Ho
> Fix For: spark-branch
>
> Attachments: HIVE-8202.1-spark.patch, HIVE-8202.2-spark.patch, 
> HIVE-8202.3-spark.patch, HIVE-8202.4-spark.patch, HIVE-8202.5-spark.patch, 
> HIVE-8202.6-spark.patch, HIVE-8202.7-spark.patch, HIVE-8202.8-spark.patch, 
> HIVE-8202.9-spark.patch, Hive on Spark SMB Join.docx, Hive on Spark SMB 
> Join.pdf
>
>
> SMB joins are used wherever the tables are sorted and bucketed. It's a 
> map-side join. The join boils down to just merging the already sorted tables, 
> allowing this operation to be faster than an ordinary map-join.
> The task is to research and support the conversion from regular SMB join to 
> SMB map join for the Spark execution engine.
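
For readers unfamiliar with the technique, a toy single-pass merge over two
inputs pre-sorted on the join key looks roughly like this (purely
illustrative, not Hive code):

{code}
import java.util.ArrayList;
import java.util.List;

final class SortMergeJoinSketch {
  /** Inner join of two ascending key lists; emits one key per matched pair. */
  static List<Integer> join(int[] left, int[] right) {
    List<Integer> out = new ArrayList<>();
    int i = 0, j = 0;
    while (i < left.length && j < right.length) {
      if (left[i] < right[j]) i++;            // advance the smaller side
      else if (left[i] > right[j]) j++;
      else {
        int key = left[i], i0 = i, j0 = j;
        while (i < left.length && left[i] == key) i++;   // group duplicates
        while (j < right.length && right[j] == key) j++;
        for (int a = i0; a < i; a++)                     // emit cross product
          for (int b = j0; b < j; b++) out.add(key);
      }
    }
    return out;   // no shuffle, no hash table: just a merge
  }
}
{code}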



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 26854: HIVE-2573 Create per-session function registry

2014-10-30 Thread Navis Ryu


> On Oct. 23, 2014, 9:50 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/Registry.java, line 465
> > 
> >
> > There is no longer a way to query the metastore for UDFs apart from the 
> > static initialization. So if one CLI user creates a permanent UDF, another 
> > user on CLI, or HS2, will not be able to use that new UDF if the 2nd CLI or 
> > HS2 was initialized before this UDF was created.
> 
> Navis Ryu wrote:
> Permanent functions (persistent function seemed better name, imho) are 
> registered to system registry, which is shared to all clients. So if one user 
> creates new permanent function, it's shared to all clients. The time a user 
> accesses the function, the class is loaded with required resources and 
> registered to session registry as a temporary function.
> 
> Jason Dere wrote:
> So this would work if all clients are using hiveserver2, because all 
> clients in this scenario would share the same system registry.
> But if one or more clients are using the Hive CLI, any persistent UDFs 
> created/dropped by this CLI client would not be reflected in the other 
> clients (or HS2), since it's a different process/system registry.

I missed this message. Yes, it will behave as you've described. But is it a 
common case to expect things done in a separate VM to be reflected in others? 
Should dropping a persistent function from a JDBC client be reflected to other 
CLI clients?
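
A minimal sketch of the two-level lookup being described: a per-session
registry backed by a process-wide system registry (names hypothetical). It
also shows why changes made in one VM are invisible to another: the shared
map exists per process.

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class FunctionRegistries {
  // Shared by all sessions in this VM only; a separate CLI process
  // has its own copy, which is the visibility issue discussed above.
  static final Map<String, Object> SYSTEM = new ConcurrentHashMap<>();

  static final class SessionRegistry {
    private final Map<String, Object> local = new ConcurrentHashMap<>();

    Object lookup(String name) {
      Object fn = local.get(name);
      return fn != null ? fn : SYSTEM.get(name);   // fall back to shared
    }

    void registerTemporary(String name, Object fn) {
      local.put(name, fn);
    }
  }
}
{code}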


- Navis


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26854/#review57952
---


On Oct. 30, 2014, 11:41 p.m., Jason Dere wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26854/
> ---
> 
> (Updated Oct. 30, 2014, 11:41 p.m.)
> 
> 
> Review request for hive, Navis Ryu and Thejas Nair.
> 
> 
> Bugs: HIVE-2573
> https://issues.apache.org/jira/browse/HIVE-2573
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Small updates to Navis' changes:
> - session registry doesn't lookup metastore for UDFs
> - my feedback from Navis' original patch
> - metastore udfs should not be considered native. This allows them to be 
> added/removed from registry
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/JavaUtils.java 9aa917c 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7d8e5bc 
>   contrib/src/test/results/clientnegative/invalid_row_sequence.q.out 8f3c0b3 
>   
> itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestJdbcWithMiniKdc.java
>  6647ce5 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
>  88b0791 
>   ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 9ac540e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/CommonFunctionInfo.java 93c15c0 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionInfo.java 074255b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java e43a328 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionTask.java 569c125 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Registry.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 5bdeb92 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java efecb05 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 4e3df75 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java b900627 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java 13277a9 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 211ab6c 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/IndexUtils.java e2768ff 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/SqlFunctionConverter.java
>  793f117 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java 
> 1796b7b 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java 
> 22e5b47 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/IndexUpdater.java 2b239ab 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionConf.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java af633cb 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestMacroSemanticAnalyzer.java 
> 46f8052 
>   ql/src/test/queries/clientnegative/drop_native_udf.q ae047bb 
>   ql/src/test/results/clientnegative/create_function_nonexistent_class.q.out 
> c7405ed 
>   ql/src/test/results/clientnegative/create_function_nonudf_class.q.out 
> d0dd50a 
>   ql/src/test/results/clientnegative/drop_native_udf.q.out 9f0eaa5 
>   ql/src/test/results/clientnegative/udf_nonexistent_resource.q.out e184787 
>   service/src/test/org/apache/hadoop/hive/service/Test

Re: Review Request 27367: support ISO-2012 timestamp literals

2014-10-30 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27367/
---

(Updated Oct. 31, 2014, 5:03 a.m.)


Review request for hive.


Changes
---

Added more tests (shamelessly copied from partition_date.q).
Currently, direct SQL cannot be applied (I tried, but casting to timestamp is 
not supported in some databases: MySQL, etc.).


Bugs: HIVE-3187
https://issues.apache.org/jira/browse/HIVE-3187


Repository: hive-git


Description
---

Enable the JDBC driver/Hive SQL engine to accept JDBC canonical or ISO-SQL 20xx 
Timestamp literals

i.e.
select 1 from cert.tversion tversion where timestamp '1989-01-01 
10:20:30.0' <> timestamp '2000-12-31 12:15:30.12300'

instead of
unix_timestamp('.)


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g c903e8f 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 13d5255 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java e065983 
  ql/src/test/queries/clientnegative/timestamp_literal.q PRE-CREATION 
  ql/src/test/queries/clientpositive/partition_timestamp.q PRE-CREATION 
  ql/src/test/queries/clientpositive/partition_timestamp2.q PRE-CREATION 
  ql/src/test/queries/clientpositive/timestamp_literal.q PRE-CREATION 
  ql/src/test/results/clientnegative/date_literal2.q.out 82f6425 
  ql/src/test/results/clientnegative/date_literal3.q.out 82f6425 
  ql/src/test/results/clientnegative/illegal_partition_type4.q.out e388086 
  ql/src/test/results/clientnegative/timestamp_literal.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/partition_timestamp.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/partition_timestamp2.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/timestamp_literal.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/27367/diff/


Testing
---


Thanks,

Navis Ryu



[jira] [Updated] (HIVE-3187) support ISO-2012 timestamp literals

2014-10-30 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3187:

Attachment: HIVE-3187.3.patch.txt

Added more test cases

> support ISO-2012 timestamp literals
> ---
>
> Key: HIVE-3187
> URL: https://issues.apache.org/jira/browse/HIVE-3187
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.9.0
>Reporter: N Campbell
>Assignee: Navis
> Attachments: HIVE-3187.1.patch.txt, HIVE-3187.2.patch.txt, 
> HIVE-3187.3.patch.txt
>
>
> Enable the JDBC driver/Hive SQL engine to accept JDBC canonical or ISO-SQL 
> 20xx Timestamp literals
> i.e.
> select 1 from cert.tversion tversion where timestamp '1989-01-01 
> 10:20:30.0' <> timestamp '2000-12-31 12:15:30.12300'
> instead of
> unix_timestamp('.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-8639) Convert SMBJoin to MapJoin [Spark Branch]

2014-10-30 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam reassigned HIVE-8639:
--

Assignee: Chinna Rao Lalam

> Convert SMBJoin to MapJoin [Spark Branch]
> -
>
> Key: HIVE-8639
> URL: https://issues.apache.org/jira/browse/HIVE-8639
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Szehon Ho
>Assignee: Chinna Rao Lalam
>
> HIVE-8202 supports auto-conversion of SMB join.  However, if the tables are 
> partitioned, there could be a slowdown, as each mapper would need to get a 
> very small chunk of a partition which has a single key. Thus, in some 
> scenarios it's beneficial to convert SMB join to map join.
> The task is to research and support the conversion from SMB join to map join 
> for the Spark execution engine.  See the MapReduce equivalent in 
> SortMergeJoinResolver.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 27415: CBO: Column names are missing from join expression in Map join with CBO enabled

2014-10-30 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27415/
---

Review request for hive and Ashutosh Chauhan.


Repository: hive-git


Description
---

On Hive 14, when CBO is enabled, the column names are missing from the join 
expression. Rather than using the external names "key" and "value", internal 
names such as "_col0" or "_col1" are used. For a map join with more than two 
tables it is very hard to figure out the actual join order. In this patch, I 
am going to address this issue not only for joins but also for all the other 
operators. It will also be addressed not only in the CLI but also in the Tez 
environment.


The basic idea for transforming an internal name into an external name is to
(1) use snapshotLogicalPlanForExplain() to make a snapshot of the logical plan 
after logical optimization;
(2) for each operator in the explain task, call the prepareCBOExplain function 
for each operatorDesc:
(2.1) each operator uses ''helpGetStartOp'' to map to a logical operator (the 
start point) in the LogicalPlan;
(2.2) from the start point, each operatorDesc uses ''findExternalName'' to 
track its external name, as sketched below.
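
A hypothetical sketch of step (2.2), assuming each operator contributes a map
from its internal column names to the names used by its parent; the helper
below simply follows that chain back toward the table scan.

{code}
import java.util.List;
import java.util.Map;

final class ColumnNameResolver {
  /** chain: one map per operator, child column name -> name in its parent. */
  static String findExternalName(String col, List<Map<String, String>> chain) {
    for (Map<String, String> mapping : chain) {
      String parent = mapping.get(col);
      if (parent == null) break;   // no further mapping; keep current name
      col = parent;                // follow the chain toward the source table
    }
    return col;                    // e.g. "_col0" resolved to "key"
  }
}
{code}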


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java e238ff1 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/AbstractSMBJoinProc.java 
c9e8086 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
bedc3ac 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 46dcfaf 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java
 1a4fcbf 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/LocalMapJoinProcFactory.java
 9076d48 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 8215c26 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 9c944b6 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java 23fbbe1 
  ql/src/java/org/apache/hadoop/hive/ql/plan/AbstractOperatorDesc.java 8410664 
  ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java 8b25c2b 
  ql/src/java/org/apache/hadoop/hive/ql/plan/FilterDesc.java 5856743 
  ql/src/java/org/apache/hadoop/hive/ql/plan/GroupByDesc.java 7a0b0da 
  ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableSinkDesc.java 1e0eb6b 
  ql/src/java/org/apache/hadoop/hive/ql/plan/JoinDesc.java 0e2c6ee 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapJoinDesc.java d43bd60 
  ql/src/java/org/apache/hadoop/hive/ql/plan/OperatorDesc.java c8c9570 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 57beb69 
  ql/src/java/org/apache/hadoop/hive/ql/plan/SelectDesc.java fa6b548 
  ql/src/test/queries/clientpositive/explainColTest_1.q PRE-CREATION 
  ql/src/test/queries/clientpositive/explainColTest_2.q PRE-CREATION 
  ql/src/test/results/clientpositive/explainColTest_1.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/explainColTest_2.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/explainColTest_1.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/explainColTest_2.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/27415/diff/


Testing
---


Thanks,

pengcheng xiong



[jira] [Commented] (HIVE-8681) CBO: Column names are missing from join expression in Map join with CBO enabled

2014-10-30 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191326#comment-14191326
 ] 

Pengcheng Xiong commented on HIVE-8681:
---

[~jpullokkaran] and [~ashutoshc], could you please take a look? Thanks. 
[~mmokhtar], could you please apply the patch and try Q75 in TPC-DS? I heard 
that it is the most complicated query for explain. Thanks again.

> CBO: Column names are missing from join expression in Map join with CBO 
> enabled
> ---
>
> Key: HIVE-8681
> URL: https://issues.apache.org/jira/browse/HIVE-8681
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-8681.1.patch
>
>
> On Hive 14, when CBO is enabled, the column names are missing from the join 
> expression: rather than the external names "key" and "value", internal names 
> such as "_col0" or "_col1" are used. For a map join with more than two 
> tables it is very hard to figure out the actual join order. In this patch, I am 
> going to address this issue not only for joins but also for all the other 
> operators. It will also be addressed not only in the CLI but also in the Tez 
> environment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8681) CBO: Column names are missing from join expression in Map join with CBO enabled

2014-10-30 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8681:
--
Status: Patch Available  (was: Open)

> CBO: Column names are missing from join expression in Map join with CBO 
> enabled
> ---
>
> Key: HIVE-8681
> URL: https://issues.apache.org/jira/browse/HIVE-8681
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-8681.1.patch
>
>
> On Hive 14, when CBO is enabled, the column names are missing from the join 
> expression: rather than the external names "key" and "value", internal names 
> such as "_col0" or "_col1" are used. For a map join with more than two 
> tables it is very hard to figure out the actual join order. In this patch, I am 
> going to address this issue not only for joins but also for all the other 
> operators. It will also be addressed not only in the CLI but also in the Tez 
> environment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8681) CBO: Column names are missing from join expression in Map join with CBO enabled

2014-10-30 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8681:
--
Attachment: HIVE-8681.1.patch

The basic idea for transforming internal names to external names is to
(1) use snapshotLogicalPlanForExplain() to take a snapshot of the logical plan 
after logical optimization
(2) for each operator in the explain task, call the prepareCBOExplain function 
for each operatorDesc
(2.1) each operator uses ''helpGetStartOp'' to map to a logical operator (the 
start point) in the LogicalPlan
(2.2) from the start point, each operatorDesc uses ''findExternalName'' to track 
its external name

> CBO: Column names are missing from join expression in Map join with CBO 
> enabled
> ---
>
> Key: HIVE-8681
> URL: https://issues.apache.org/jira/browse/HIVE-8681
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-8681.1.patch
>
>
> On Hive 14, when CBO is enabled, the column names are missing from the join 
> expression: rather than the external names "key" and "value", internal names 
> such as "_col0" or "_col1" are used. For a map join with more than two 
> tables it is very hard to figure out the actual join order. In this patch, I am 
> going to address this issue not only for joins but also for all the other 
> operators. It will also be addressed not only in the CLI but also in the Tez 
> environment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8671) Overflow in estimate row count and data size with fetch column stats

2014-10-30 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191285#comment-14191285
 ] 

Mostafa Mokhtar commented on HIVE-8671:
---

For the code below, the query had the following inputs:
totalInputFileSize = 9223372036854775341
bytesPerReducer = 1

9223372036854775341 + 1 -> Overflow.
{code}
public static int estimateReducers(long totalInputFileSize, long bytesPerReducer,
    int maxReducers, boolean powersOfTwo) {

  int reducers = (int) ((totalInputFileSize + bytesPerReducer - 1) / bytesPerReducer);
  reducers = Math.max(1, reducers);
  reducers = Math.min(maxReducers, reducers);
{code}

I recommend changing it to
{code}
int reducers = (int) (Math.max(totalInputFileSize, bytesPerReducer) / bytesPerReducer);
{code}
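
Note that with the inputs above the long addition itself does not wrap; the 
damage comes from the narrowing (int) cast, which truncates the huge quotient 
to a negative value that Math.max then clamps to 1 (with a larger 
bytesPerReducer the addition can wrap as well). A minimal sketch of an 
overflow-safe alternative (a sketch, not the committed fix; it assumes 
totalInputFileSize >= 0 and bytesPerReducer > 0):

{code}
// Overflow-safe ceiling division: 1 + (a - 1) / b == ceil(a / b) for a > 0,
// with no overflowing addition; clamp in long before narrowing to int.
long ceil = totalInputFileSize == 0 ? 1 : 1 + (totalInputFileSize - 1) / bytesPerReducer;
int reducers = (int) Math.min((long) maxReducers, Math.max(1L, ceil));
{code}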

> Overflow in estimate row count and data size with fetch column stats
> 
>
> Key: HIVE-8671
> URL: https://issues.apache.org/jira/browse/HIVE-8671
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 0.14.0
>Reporter: Mostafa Mokhtar
>Assignee: Prasanth J
>Priority: Critical
> Fix For: 0.14.0
>
>
> Overflow in row counts and data size for several TPC-DS queries.
> Interestingly the operators which have overflow end up running with a small 
> parallelism.
> For instance Reducer 2 has an overflow but it only runs with parallelism of 2.
> {code}
>Reducer 2 
> Reduce Operator Tree:
>   Group By Operator
> aggregations: sum(VALUE._col0)
> keys: KEY._col0 (type: string), KEY._col1 (type: string), 
> KEY._col2 (type: string), KEY._col3 (type: string), KEY._col4 (type: float)
> mode: mergepartial
> outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
> Statistics: Num rows: 9223372036854775807 Data size: 
> 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
>   key expressions: _col3 (type: string), _col3 (type: string)
>   sort order: ++
>   Map-reduce partition columns: _col3 (type: string)
>   Statistics: Num rows: 9223372036854775807 Data size: 
> 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE
>   value expressions: _col0 (type: string), _col1 (type: 
> string), _col2 (type: string), _col3 (type: string), _col4 (type: float), 
> _col5 (type: double)
> Execution mode: vectorized
> {code}
> {code}
> VERTEX   TOTAL_TASKSDURATION_SECONDS CPU_TIME_MILLIS 
> INPUT_RECORDS   OUTPUT_RECORDS 
> Map 1 62   26.41   1,779,510   
> 211,978,502   60,628,390
> Map 5  14.28   6,950   
> 138,098  138,098
> Map 6  12.44   3,910
> 31   31
> Reducer 2  2   22.69  61,320
> 60,628,390   69,182
> Reducer 3  12.63   3,910
> 69,182  100
> Reducer 4  11.01   1,180   
> 100  100
> {code}
> Query
> {code}
> explain  
> select  i_item_desc 
>   ,i_category 
>   ,i_class 
>   ,i_current_price
>   ,i_item_id
>   ,sum(ws_ext_sales_price) as itemrevenue 
>   ,sum(ws_ext_sales_price)*100/sum(sum(ws_ext_sales_price)) over
>   (partition by i_class) as revenueratio
> from  
>   web_sales
>   ,item 
>   ,date_dim
> where 
>   web_sales.ws_item_sk = item.i_item_sk 
>   and item.i_category in ('Jewelry', 'Sports', 'Books')
>   and web_sales.ws_sold_date_sk = date_dim.d_date_sk
>   and date_dim.d_date between '2001-01-12' and '2001-02-11'
> group by 
>   i_item_id
> ,i_item_desc 
> ,i_category
> ,i_class
> ,i_current_price
> order by 
>   i_category
> ,i_class
> ,i_item_id
> ,i_item_desc
> ,revenueratio
> limit 100
> {code}
> Explain 
> {code}
> STAGE PLANS:
>   Stage: Stage-1
> Tez
>   Edges:
> Map 1 <- Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE)
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
> Reducer 4 <- Reducer 3 (SIMPLE_EDGE)
>   DagName: mmokhtar_20141019164343_854cb757-01bd-40cb-843e-9ada7c5e6f38:1
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: web_sales
>   filterExpr: ws_item_sk is not null (type: boolean)
>   Statistics: Num rows: 215946

[jira] [Commented] (HIVE-8646) Hive class loading failure when executing Hive action via oozie workflows

2014-10-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191282#comment-14191282
 ] 

Hive QA commented on HIVE-8646:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678152/HIVE-8646.1.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6608 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_transform_acid
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1564/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1564/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1564/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678152 - PreCommit-HIVE-TRUNK-Build

> Hive class loading  failure when executing Hive action via oozie workflows
> --
>
> Key: HIVE-8646
> URL: https://issues.apache.org/jira/browse/HIVE-8646
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment: Hadoop 2.6.0 + Hive 0.14 + Oozie 4.1
>Reporter: Venkat Ranganathan
> Attachments: HIVE-8646.1.patch.txt
>
>
> When running Hive actions with Oozie, we hit this issue sometimes. What is 
> interesting is that we have all the necessary jars in the classpath (or at 
> least they are expected to be localized).
> This static initialization block was introduced by HIVE-3925.
> ==
> Exception in thread "main" java.lang.ExceptionInInitializerError
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:270)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:214)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: java.lang.TypeNotPresentException: Type 
> org.apache.hadoop.hive.metastore.api.FieldSchema not present
>   at 
> sun.reflect.generics.factory.CoreReflectionFactory.makeNamedType(CoreReflectionFactory.java:117)
>   at 
> sun.reflect.generics.visitor.Reifier.visitClassTypeSignature(Reifier.java:125)
>   at 
> sun.reflect.generics.tree.ClassTypeSignature.accept(ClassTypeSignature.java:49)
>   at 
> sun.reflect.generics.visitor.Reifier.reifyTypeArguments(Reifier.java:68)
>   at 
> sun.reflect.generics.visitor.Reifier.visitClassTypeSignature(Reifier.java:138)
>   at 
> sun.reflect.generics.tree.ClassTypeSignature.accept(ClassTypeSignature.java:49)
>   at 
> sun.reflect.generics.repository.MethodRepository.getReturnType(MethodRepository.java:68)
>   at java.lang.reflect.Method.getGenericReturnType(Method.java:245)
>   at 
> java.beans.FeatureDescriptor.getReturnType(FeatureDescriptor.java:370)
>   at java.beans.Introspector.getTargetEventInfo(Introspector.java:996)
>   at java.beans.Introspector.getBeanInfo(Introspector.java:417)
>   at java.beans.Introspector.getBeanInfo(Introspector.java:163)
>   at 
> org.apache.hadoop.hive.ql.exec.PTFUtils.makeTransient(PTFUtils.java:267)
>   at org.apache.hadoop.hive.ql.exec.Task.(Task.java:53)
>   ... 4 more
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.metastore.api.FieldSchema
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:270)
>   at 
> sun.reflect.generics.factory.CoreReflectionFactory.makeNamedType(CoreReflectionFactory.java:114)
>   ... 17 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8671) Overflow in estimate row count and data size with fetch column stats

2014-10-30 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191274#comment-14191274
 ] 

Mostafa Mokhtar commented on HIVE-8671:
---

This is where the bug is:

Since we hit an overflow before, the data size is set to Long.MAX_VALUE; then, 
when we add 1 to that, it overflows and reducers ends up being 1.

{code}
public static int estimateReducers(long totalInputFileSize, long bytesPerReducer,
    int maxReducers, boolean powersOfTwo) {

  int reducers = (int) ((totalInputFileSize + bytesPerReducer - 1) / bytesPerReducer);
  reducers = Math.max(1, reducers);
  reducers = Math.min(maxReducers, reducers);
{code}
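
A hypothetical demo of this failure mode, with the input values taken from the 
earlier comment (a sketch, not code from the patch):

{code}
long totalInputFileSize = 9223372036854775341L; // from the comment above
long bytesPerReducer = 1L;
int reducers = (int) ((totalInputFileSize + bytesPerReducer - 1) / bytesPerReducer);
System.out.println(reducers);              // -467: the narrowing cast truncates the long
System.out.println(Math.max(1, reducers)); // 1: the clamp then hides the overflow
{code}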

> Overflow in estimate row count and data size with fetch column stats
> 
>
> Key: HIVE-8671
> URL: https://issues.apache.org/jira/browse/HIVE-8671
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 0.14.0
>Reporter: Mostafa Mokhtar
>Assignee: Prasanth J
>Priority: Critical
> Fix For: 0.14.0
>
>
> Overflow in row counts and data size for several TPC-DS queries.
> Interestingly the operators which have overflow end up running with a small 
> parallelism.
> For instance Reducer 2 has an overflow but it only runs with parallelism of 2.
> {code}
>Reducer 2 
> Reduce Operator Tree:
>   Group By Operator
> aggregations: sum(VALUE._col0)
> keys: KEY._col0 (type: string), KEY._col1 (type: string), 
> KEY._col2 (type: string), KEY._col3 (type: string), KEY._col4 (type: float)
> mode: mergepartial
> outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
> Statistics: Num rows: 9223372036854775807 Data size: 
> 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
>   key expressions: _col3 (type: string), _col3 (type: string)
>   sort order: ++
>   Map-reduce partition columns: _col3 (type: string)
>   Statistics: Num rows: 9223372036854775807 Data size: 
> 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE
>   value expressions: _col0 (type: string), _col1 (type: 
> string), _col2 (type: string), _col3 (type: string), _col4 (type: float), 
> _col5 (type: double)
> Execution mode: vectorized
> {code}
> {code}
> VERTEX   TOTAL_TASKSDURATION_SECONDS CPU_TIME_MILLIS 
> INPUT_RECORDS   OUTPUT_RECORDS 
> Map 1 62   26.41   1,779,510   
> 211,978,502   60,628,390
> Map 5  14.28   6,950   
> 138,098  138,098
> Map 6  12.44   3,910
> 31   31
> Reducer 2  2   22.69  61,320
> 60,628,390   69,182
> Reducer 3  12.63   3,910
> 69,182  100
> Reducer 4  11.01   1,180   
> 100  100
> {code}
> Query
> {code}
> explain  
> select  i_item_desc 
>   ,i_category 
>   ,i_class 
>   ,i_current_price
>   ,i_item_id
>   ,sum(ws_ext_sales_price) as itemrevenue 
>   ,sum(ws_ext_sales_price)*100/sum(sum(ws_ext_sales_price)) over
>   (partition by i_class) as revenueratio
> from  
>   web_sales
>   ,item 
>   ,date_dim
> where 
>   web_sales.ws_item_sk = item.i_item_sk 
>   and item.i_category in ('Jewelry', 'Sports', 'Books')
>   and web_sales.ws_sold_date_sk = date_dim.d_date_sk
>   and date_dim.d_date between '2001-01-12' and '2001-02-11'
> group by 
>   i_item_id
> ,i_item_desc 
> ,i_category
> ,i_class
> ,i_current_price
> order by 
>   i_category
> ,i_class
> ,i_item_id
> ,i_item_desc
> ,revenueratio
> limit 100
> {code}
> Explain 
> {code}
> STAGE PLANS:
>   Stage: Stage-1
> Tez
>   Edges:
> Map 1 <- Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE)
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
> Reducer 4 <- Reducer 3 (SIMPLE_EDGE)
>   DagName: mmokhtar_20141019164343_854cb757-01bd-40cb-843e-9ada7c5e6f38:1
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: web_sales
>   filterExpr: ws_item_sk is not null (type: boolean)
>   Statistics: Num rows: 21594638446 Data size: 2850189889652 
> Basic stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> pr

[jira] [Commented] (HIVE-8675) Increase thrift server protocol test coverage

2014-10-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191218#comment-14191218
 ] 

Hive QA commented on HIVE-8675:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678268/HIVE-8675.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6609 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1563/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1563/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1563/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678268 - PreCommit-HIVE-TRUNK-Build

> Increase thrift server protocol test coverage
> -
>
> Key: HIVE-8675
> URL: https://issues.apache.org/jira/browse/HIVE-8675
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Fix For: 0.14.0
>
> Attachments: HIVE-8675.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8681) CBO: Column names are missing from join expression in Map join with CBO enabled

2014-10-30 Thread Pengcheng Xiong (JIRA)
Pengcheng Xiong created HIVE-8681:
-

 Summary: CBO: Column names are missing from join expression in Map 
join with CBO enabled
 Key: HIVE-8681
 URL: https://issues.apache.org/jira/browse/HIVE-8681
 Project: Hive
  Issue Type: Improvement
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong


On Hive 14, when CBO is enabled, the column names are missing from the join 
expression: rather than the external names "key" and "value", internal names 
such as "_col0" or "_col1" are used. For a map join with more than two 
tables it is very hard to figure out the actual join order. In this patch, I am 
going to address this issue not only for joins but also for all the other 
operators. It will also be addressed not only in the CLI but also in the Tez 
environment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8202) Support SMB Join for Hive on Spark [Spark Branch]

2014-10-30 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-8202:
--
   Resolution: Fixed
Fix Version/s: spark-branch
   Status: Resolved  (was: Patch Available)

Patch committed to Spark branch. Thanks to Szehon for the contribution.
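
For readers new to SMB joins, a self-contained toy of the merge step described 
in the issue below (simplified to unique integer keys; the real join handles 
duplicate keys, buckets, and full rows):

{code}
// Toy of the merge an SMB join performs over two already-sorted inputs:
// advance the smaller side, emit when keys match.
public class SortedMergeJoinToy {
  static void join(int[] left, int[] right) {
    int i = 0, j = 0;
    while (i < left.length && j < right.length) {
      if (left[i] == right[j]) {        // keys match: emit joined row
        System.out.println(left[i] + " joins " + right[j]);
        i++; j++;
      } else if (left[i] < right[j]) {  // advance the smaller side
        i++;
      } else {
        j++;
      }
    }
  }

  public static void main(String[] args) {
    join(new int[]{1, 3, 5, 7}, new int[]{3, 4, 5, 8}); // emits matches for 3 and 5
  }
}
{code}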

> Support SMB Join for Hive on Spark [Spark Branch]
> -
>
> Key: HIVE-8202
> URL: https://issues.apache.org/jira/browse/HIVE-8202
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Szehon Ho
> Fix For: spark-branch
>
> Attachments: HIVE-8202.1-spark.patch, HIVE-8202.2-spark.patch, 
> HIVE-8202.3-spark.patch, HIVE-8202.4-spark.patch, HIVE-8202.5-spark.patch, 
> HIVE-8202.6-spark.patch, HIVE-8202.7-spark.patch, HIVE-8202.8-spark.patch, 
> HIVE-8202.9-spark.patch, Hive on Spark SMB Join.docx, Hive on Spark SMB 
> Join.pdf
>
>
> SMB joins are used wherever the tables are sorted and bucketed. It's a 
> map-side join. The join boils down to just merging the already sorted tables, 
> allowing this operation to be faster than an ordinary map-join.
> The task is to research and support the conversion from regular SMB join to 
> SMB map join for Spark execution engine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8202) Support SMB Join for Hive on Spark [Spark Branch]

2014-10-30 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191207#comment-14191207
 ] 

Xuefu Zhang commented on HIVE-8202:
---

+1

> Support SMB Join for Hive on Spark [Spark Branch]
> -
>
> Key: HIVE-8202
> URL: https://issues.apache.org/jira/browse/HIVE-8202
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Szehon Ho
> Attachments: HIVE-8202.1-spark.patch, HIVE-8202.2-spark.patch, 
> HIVE-8202.3-spark.patch, HIVE-8202.4-spark.patch, HIVE-8202.5-spark.patch, 
> HIVE-8202.6-spark.patch, HIVE-8202.7-spark.patch, HIVE-8202.8-spark.patch, 
> HIVE-8202.9-spark.patch, Hive on Spark SMB Join.docx, Hive on Spark SMB 
> Join.pdf
>
>
> SMB joins are used wherever the tables are sorted and bucketed. It's a 
> map-side join. The join boils down to just merging the already sorted tables, 
> allowing this operation to be faster than an ordinary map-join.
> The task is to research and support the conversion from regular SMB join to 
> SMB map join for Spark execution engine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8674) Fix tests after merge [Spark Branch]

2014-10-30 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191198#comment-14191198
 ] 

Xuefu Zhang commented on HIVE-8674:
---

It looks like parallel.q is a merge issue.

> Fix tests after merge [Spark Branch]
> 
>
> Key: HIVE-8674
> URL: https://issues.apache.org/jira/browse/HIVE-8674
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-8674.1-spark.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8202) Support SMB Join for Hive on Spark [Spark Branch]

2014-10-30 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191189#comment-14191189
 ] 

Xuefu Zhang commented on HIVE-8202:
---

The parallel.q test failure appears to be caused by the latest merge from trunk; 
it's not related to this patch.

> Support SMB Join for Hive on Spark [Spark Branch]
> -
>
> Key: HIVE-8202
> URL: https://issues.apache.org/jira/browse/HIVE-8202
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Szehon Ho
> Attachments: HIVE-8202.1-spark.patch, HIVE-8202.2-spark.patch, 
> HIVE-8202.3-spark.patch, HIVE-8202.4-spark.patch, HIVE-8202.5-spark.patch, 
> HIVE-8202.6-spark.patch, HIVE-8202.7-spark.patch, HIVE-8202.8-spark.patch, 
> HIVE-8202.9-spark.patch, Hive on Spark SMB Join.docx, Hive on Spark SMB 
> Join.pdf
>
>
> SMB joins are used wherever the tables are sorted and bucketed. It's a 
> map-side join. The join boils down to just merging the already sorted tables, 
> allowing this operation to be faster than an ordinary map-join.
> The task is to research and support the conversion from regular SMB join to 
> SMB map join for Spark execution engine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8663) Fetching Vectorization scratch column map in Reduce-Side stopped working

2014-10-30 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191188#comment-14191188
 ] 

Gunther Hagleitner commented on HIVE-8663:
--

+1 for .14

> Fetching Vectorization scratch column map in Reduce-Side stopped working
> -
>
> Key: HIVE-8663
> URL: https://issues.apache.org/jira/browse/HIVE-8663
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 0.14.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8663.01.patch, HIVE-8663.02.patch
>
>
> Recent changes (somewhere) caused scratch column types to not be fetched on 
> the reduce side.
> The fix switches to using the scratch column types from VectorizationContext.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8550) Hive cannot load data into partitioned table with Unicode key

2014-10-30 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8550:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to branch and trunk. Thanks [~xiaobingo] and [~jdere]!

> Hive cannot load data into partitioned table with Unicode key
> -
>
> Key: HIVE-8550
> URL: https://issues.apache.org/jira/browse/HIVE-8550
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment: Windows
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>Priority: Critical
> Attachments: CreatePartitionedTable.hql, HIVE-8550.0.14.1.patch, 
> HIVE-8550.1.patch, HIVE-8550.2.patch, HIVE-8550.3.patch, 
> LoadIntoPartitionedTable.hql, partitioned.txt
>
>
> Steps to reproduce:
> 1) Copy the file partitioned.txt to the root folder of your HDFS root dir. 
> Copy the two hql files to your local directory.
> 2) Open Hive CLI.
> 3) Run:
> hive> source <path to CreatePartitionedTable.hql>;
> 4) Run:
> hive> source <path to LoadIntoPartitionedTable.hql>;
> The following error will be shown:
> hive> source C:\Scripts\partition\LoadIntoPartitionedTable.hql;
> Loading data to table default.mypartitioned partition (tag=䶵)
> Failed with exception null
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8202) Support SMB Join for Hive on Spark [Spark Branch]

2014-10-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191178#comment-14191178
 ] 

Hive QA commented on HIVE-8202:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678366/HIVE-8202.9-spark.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7095 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/294/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/294/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-294/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678366 - PreCommit-HIVE-SPARK-Build

> Support SMB Join for Hive on Spark [Spark Branch]
> -
>
> Key: HIVE-8202
> URL: https://issues.apache.org/jira/browse/HIVE-8202
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Szehon Ho
> Attachments: HIVE-8202.1-spark.patch, HIVE-8202.2-spark.patch, 
> HIVE-8202.3-spark.patch, HIVE-8202.4-spark.patch, HIVE-8202.5-spark.patch, 
> HIVE-8202.6-spark.patch, HIVE-8202.7-spark.patch, HIVE-8202.8-spark.patch, 
> HIVE-8202.9-spark.patch, Hive on Spark SMB Join.docx, Hive on Spark SMB 
> Join.pdf
>
>
> SMB joins are used wherever the tables are sorted and bucketed. It's a 
> map-side join. The join boils down to just merging the already sorted tables, 
> allowing this operation to be faster than an ordinary map-join.
> The task is to research and support the conversion from regular SMB join to 
> SMB map join for Spark execution engine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 26854: HIVE-2573 Create per-session function registry

2014-10-30 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26854/#review59281
---


I still have the concern from one of the previous review comments that there is 
no longer a way to query for persistent UDFs after static initialization, 
since this can potentially cause issues if one or more clients are using Hive 
CLI.


ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java


Are these changes meant for this Jira?

Hmm, I think I see why - have the persistent function lookups done during 
static initialization now caused the Hive class to not be instantiable during 
runtime? Could you try moving the persistent function lookups out of static 
initialization, and into a method, which gets called (but only initialized 
once) during SessionState.start()? Would that take care of the issue?
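
A self-contained sketch of that suggestion (class and method names here are 
hypothetical, not from the patch): defer the lookup into a guard that 
SessionState.start() could call, so it runs at most once per JVM instead of 
during class loading.

{code}
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: move the persistent-function lookup out of the static
// initializer into a method that runs the lookup at most once.
public class PersistentFunctionLoader {
  private static final AtomicBoolean loaded = new AtomicBoolean(false);

  public static void ensureLoaded(Runnable metastoreLookup) {
    if (loaded.compareAndSet(false, true)) {
      metastoreLookup.run(); // e.g. the persistent-UDF registration
    }
  }
}
{code}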



service/src/test/org/apache/hadoop/hive/service/TestHiveServerSessions.java


What's the issue here?



itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestJdbcWithMiniKdc.java


Is this change relevant to this Jira? Or is this just a general fix to 
TestJdbcWithMiniKdc.testNegativeTokenAuth, which I have noticed to be failing 
consistently in the precommit tests?



metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java


Are the changes in this file meant for this Jira?



ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java


Was this line meant to be in here?


- Jason Dere


On Oct. 30, 2014, 11:41 p.m., Jason Dere wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26854/
> ---
> 
> (Updated Oct. 30, 2014, 11:41 p.m.)
> 
> 
> Review request for hive, Navis Ryu and Thejas Nair.
> 
> 
> Bugs: HIVE-2573
> https://issues.apache.org/jira/browse/HIVE-2573
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Small updates to Navis' changes:
> - the session registry doesn't look up the metastore for UDFs
> - my feedback from Navis' original patch
> - metastore UDFs should not be considered native. This allows them to be 
> added/removed from the registry
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/JavaUtils.java 9aa917c 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7d8e5bc 
>   contrib/src/test/results/clientnegative/invalid_row_sequence.q.out 8f3c0b3 
>   
> itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestJdbcWithMiniKdc.java
>  6647ce5 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
>  88b0791 
>   ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 9ac540e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/CommonFunctionInfo.java 93c15c0 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionInfo.java 074255b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java e43a328 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionTask.java 569c125 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Registry.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 5bdeb92 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java efecb05 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 4e3df75 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java b900627 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java 13277a9 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 211ab6c 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/IndexUtils.java e2768ff 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/SqlFunctionConverter.java
>  793f117 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java 
> 1796b7b 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java 
> 22e5b47 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/IndexUpdater.java 2b239ab 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionConf.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java af633cb 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestMacroSemanticAnalyzer.java 
> 46f8052 
>   ql/src/test/queries/clientnegative/drop_native_udf.q ae047bb 
>   ql/src/test/results/clientnegative/create_function_nonexistent_class.q.out 
> c7405ed 
>   ql/src/test/results/clientnegative/create_function_nonudf_class.q.out 
> d0dd50a 
>   ql/src/test/results/clientnegative/drop_native_udf.q.out 9f0eaa5 
>   ql/src/test/results

[jira] [Commented] (HIVE-8636) CBO: split cbo_correctness test

2014-10-30 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191168#comment-14191168
 ] 

Ashutosh Chauhan commented on HIVE-8636:


We need to divvy up cbo_correctness.q for sure. Currently it's cruel to ask 
someone to debug it if this test fails. I am fine if the division happens either 
on an operator basis or some other basis. [~jpullokkaran], any suggestion for 
the division?

> CBO: split cbo_correctness test
> ---
>
> Key: HIVE-8636
> URL: https://issues.apache.org/jira/browse/HIVE-8636
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-8636.patch
>
>
> The CBO correctness test is extremely annoying: it runs forever; if anything 
> fails, it's hard to debug due to the volume of logs from all the stuff; and it 
> doesn't run further, so if multiple things fail they can only be discovered 
> one by one. Also, SORT_QUERY_RESULTS cannot be used, because some queries 
> presumably rely on ordering.
> It should be split into separate tests; the numbers in there now may be good 
> as boundaries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8667) CBO: optimize_nullscan - some TableScans for nullscans appeared after CBO

2014-10-30 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191165#comment-14191165
 ] 

Sergey Shelukhin commented on HIVE-8667:


I think the optimize_nullscan .q.out for non-Tez was committed by accident; I've 
just restored it.
So it happens in both.

> CBO: optimize_nullscan - some TableScans for nullscans appeared after CBO
> -
>
> Key: HIVE-8667
> URL: https://issues.apache.org/jira/browse/HIVE-8667
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.15.0
>
>
> Looks like some rewriting by CBO prevents nullscans from being optimized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8636) CBO: split cbo_correctness test

2014-10-30 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191147#comment-14191147
 ] 

Laljo John Pullokkaran commented on HIVE-8636:
--

CBO correctness tests are organized by increasing complexity, adding in 
operators.
So I am not sure division by operator is sane.

> CBO: split cbo_correctness test
> ---
>
> Key: HIVE-8636
> URL: https://issues.apache.org/jira/browse/HIVE-8636
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-8636.patch
>
>
> The CBO correctness test is extremely annoying: it runs forever; if anything 
> fails, it's hard to debug due to the volume of logs from all the stuff; and it 
> doesn't run further, so if multiple things fail they can only be discovered 
> one by one. Also, SORT_QUERY_RESULTS cannot be used, because some queries 
> presumably rely on ordering.
> It should be split into separate tests; the numbers in there now may be good 
> as boundaries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8667) CBO: optimize_nullscan - some TableScans for nullscans appeared after CBO

2014-10-30 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191140#comment-14191140
 ] 

Ashutosh Chauhan commented on HIVE-8667:


 [~hagleitn] Seems like this is only an issue on Tez. On the last run of 
HIVE-8395 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1556/testReport/
 this test failed only for MiniTezCliDriver and passed for CliDriver. This is 
surprising, since the optimization itself is not aware of the execution engine, 
but it is possible, since on Tez we have a different set of physical 
optimizations than on MR. However, I am not able to repro this. After enabling 
hive.cbo.enable, the only diffs I get are:
{code}
diff --git a/ql/src/test/results/clientpositive/tez/optimize_nullscan.q.out 
b/ql/src/test/results/clientpositive/tez/optimize_nullscan.q.out
index c45f0db..4228cec 100644
--- a/ql/src/test/results/clientpositive/tez/optimize_nullscan.q.out
+++ b/ql/src/test/results/clientpositive/tez/optimize_nullscan.q.out
@@ -1815,9 +1815,9 @@ STAGE PLANS:
   value expressions: key (type: string)
   auto parallelism: true
 Path -> Alias:
-  -mr-10003default.src{} [s2]
+  -mr-10002default.src{} [s2]
 Path -> Partition:
-  -mr-10003default.src{} 
+  -mr-10002default.src{} 
 Partition
   base file name: src
   input format: 
org.apache.hadoop.hive.ql.io.OneNullRowInputFormat
@@ -1862,7 +1862,7 @@ STAGE PLANS:
 name: default.src
   name: default.src
 Truncated Path -> Alias:
-  -mr-10003default.src{} [s2]
+  -mr-10002default.src{} [s2]
 Map 3 
 Map Operator Tree:
 TableScan
@@ -1882,9 +1882,9 @@ STAGE PLANS:
   value expressions: key (type: string)
   auto parallelism: true
 Path -> Alias:
-  -mr-10002default.src{} [s1]
+  -mr-10003default.src{} [s1]
 Path -> Partition:
-  -mr-10002default.src{} 
+  -mr-10003default.src{} 
{code}

[~sershe] This optimizer doesn't care whether a join is present in the query or 
not. It looks for the TS->FIL pattern with FIL being "where false". So the 
final plan as you have printed it after CBO is exactly to its liking. In Hive, 
PPD makes sure we get that pattern.
What's different on Tez is really intriguing. I will dig more.
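
For context, a simplified sketch of the TS->FIL match (an assumption-laden 
sketch, not the actual null-scan optimizer code):

{code}
import org.apache.hadoop.hive.ql.exec.FilterOperator;
import org.apache.hadoop.hive.ql.exec.Operator;
import org.apache.hadoop.hive.ql.exec.TableScanOperator;
import org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc;
import org.apache.hadoop.hive.ql.plan.ExprNodeDesc;

// Sketch: a TableScan feeding a Filter whose predicate is the constant false;
// on a match the scan gets rewritten to use OneNullRowInputFormat, as seen in
// the explain diff above.
public class NullScanPatternSketch {
  static boolean matchesNullScan(FilterOperator fil) {
    ExprNodeDesc pred = fil.getConf().getPredicate();
    boolean constantFalse = pred instanceof ExprNodeConstantDesc
        && Boolean.FALSE.equals(((ExprNodeConstantDesc) pred).getValue());
    Operator<?> parent = fil.getParentOperators().get(0);
    return constantFalse && parent instanceof TableScanOperator;
  }
}
{code}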

> CBO: optimize_nullscan - some TableScans for nullscans appeared after CBO
> -
>
> Key: HIVE-8667
> URL: https://issues.apache.org/jira/browse/HIVE-8667
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.15.0
>
>
> Looks like some rewriting by CBO prevents nullscans from being optimized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27401: HIVE-8636 CBO: split cbo_correctness test

2014-10-30 Thread John Pullokkaran

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27401/#review59290
---



itests/src/test/resources/testconfiguration.properties


Why don't we put all CBO subqueries into a single test?


- John Pullokkaran


On Oct. 30, 2014, 11:18 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27401/
> ---
> 
> (Updated Oct. 30, 2014, 11:18 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and John Pullokkaran.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See JIRA
> 
> 
> Diffs
> -
> 
>   data/scripts/q_test_cleanup.sql 8ec0f9f 
>   data/scripts/q_test_init.sql 7484f0c 
>   itests/src/test/resources/testconfiguration.properties 2c84a36 
>   pom.xml bd74830 
>   ql/src/test/queries/clientpositive/cbo_correctness.q bb328f6 
>   ql/src/test/queries/clientpositive/cbo_gby.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_gby_empty.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_join.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_limit.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_semijoin.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_simple_select.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_stats.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_subq_exists.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_subq_in.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_subq_not_in.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_udf_udaf.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_union.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_views.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_windowing.q PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_correctness.q.out d98cb5b 
>   ql/src/test/results/clientpositive/cbo_gby.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_gby_empty.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_join.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_limit.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_semijoin.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_simple_select.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_stats.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_subq_exists.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_subq_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_subq_not_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_udf_udaf.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_union.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_views.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_windowing.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_correctness.q.out 2103d08 
>   ql/src/test/results/clientpositive/tez/cbo_gby.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_gby_empty.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_join.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_limit.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_semijoin.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_simple_select.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_stats.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_subq_exists.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_subq_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_subq_not_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_udf_udaf.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_union.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_views.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_windowing.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/27401/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



[jira] [Commented] (HIVE-8313) Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator

2014-10-30 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191138#comment-14191138
 ] 

Navis commented on HIVE-8313:
-

Good. Let's see the result of the test.

> Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator
> ---
>
> Key: HIVE-8313
> URL: https://issues.apache.org/jira/browse/HIVE-8313
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0, 0.13.0, 0.14.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-8313.1.patch, HIVE-8313.2.patch
>
>
> Consider the following query:
> {code:sql}
> SELECT foo, bar, goo, id
> FROM myTable
> WHERE id IN ( 'A', 'B', 'C', 'D', ... , 'ZZ' );
> {code}
> One finds that when the IN clause has several thousand elements (and the 
> table has several million rows), the query above takes orders-of-magnitude 
> longer to run on Hive 0.12 than say Hive 0.10.
> I have a possibly incomplete fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8313) Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator

2014-10-30 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-8313:
---
Status: Patch Available  (was: Open)

> Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator
> ---
>
> Key: HIVE-8313
> URL: https://issues.apache.org/jira/browse/HIVE-8313
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.13.0, 0.12.0, 0.14.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-8313.1.patch, HIVE-8313.2.patch
>
>
> Consider the following query:
> {code:sql}
> SELECT foo, bar, goo, id
> FROM myTable
> WHERE id IN ( 'A', 'B', 'C', 'D', ... , 'ZZ' );
> {code}
> One finds that when the IN clause has several thousand elements (and the 
> table has several million rows), the query above takes orders-of-magnitude 
> longer to run on Hive 0.12 than say Hive 0.10.
> I have a possibly incomplete fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8313) Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator

2014-10-30 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-8313:
---
Status: Open  (was: Patch Available)

Cancelling the old patch.

> Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator
> ---
>
> Key: HIVE-8313
> URL: https://issues.apache.org/jira/browse/HIVE-8313
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.13.0, 0.12.0, 0.14.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-8313.1.patch, HIVE-8313.2.patch
>
>
> Consider the following query:
> {code:sql}
> SELECT foo, bar, goo, id
> FROM myTable
> WHERE id IN ( 'A', 'B', 'C', 'D', ... , 'ZZ' );
> {code}
> One finds that when the IN clause has several thousand elements (and the 
> table has several million rows), the query above takes orders-of-magnitude 
> longer to run on Hive 0.12 than say Hive 0.10.
> I have a possibly incomplete fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8313) Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator

2014-10-30 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-8313:
---
Attachment: HIVE-8313.2.patch

Something along these lines?

> Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator
> ---
>
> Key: HIVE-8313
> URL: https://issues.apache.org/jira/browse/HIVE-8313
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0, 0.13.0, 0.14.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-8313.1.patch, HIVE-8313.2.patch
>
>
> Consider the following query:
> {code:sql}
> SELECT foo, bar, goo, id
> FROM myTable
> WHERE id IN ( 'A', 'B', 'C', 'D', ... , 'ZZ' );
> {code}
> One finds that when the IN clause has several thousand elements (and the 
> table has several million rows), the query above takes orders-of-magnitude 
> longer to run on Hive 0.12 than say Hive 0.10.
> I have a possibly incomplete fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8313) Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator

2014-10-30 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191132#comment-14191132
 ] 

Navis commented on HIVE-8313:
-

Yes, if it's accessed on a per-row basis, it would be better to minimize its 
footprint.

> Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator
> ---
>
> Key: HIVE-8313
> URL: https://issues.apache.org/jira/browse/HIVE-8313
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0, 0.13.0, 0.14.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-8313.1.patch
>
>
> Consider the following query:
> {code:sql}
> SELECT foo, bar, goo, id
> FROM myTable
> WHERE id IN ( 'A', 'B', 'C', 'D', ... , 'ZZ' );
> {code}
> One finds that when the IN clause has several thousand elements (and the 
> table has several million rows), the query above takes orders-of-magnitude 
> longer to run on Hive 0.12 than say Hive 0.10.
> I have a possibly incomplete fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-6770) disallow change of column name when table's serde is parquet serde

2014-10-30 Thread Tongjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tongjie Chen resolved HIVE-6770.

Resolution: Fixed

This is not necessary now.
https://issues.apache.org/jira/browse/HIVE-6938

> disallow change of column name when table's serde is parquet serde
> --
>
> Key: HIVE-6770
> URL: https://issues.apache.org/jira/browse/HIVE-6770
> Project: Hive
>  Issue Type: Task
>Affects Versions: 0.13.0
>Reporter: Tongjie Chen
>
> Hive columns are index-based, hence changing a column name does not affect the 
> result.
> However, Parquet columns are name-based, so changing a column name will result 
> in incorrect results.
> Until the rename issue is addressed in Parquet, Hive should disallow column 
> renames if the Parquet serde is used. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8313) Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator

2014-10-30 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191124#comment-14191124
 ] 

Mithun Radhakrishnan commented on HIVE-8313:


Hello, [~navis]. Thanks for reviewing. (Apologies for getting to this so late.)

I suppose I could change {{childrenNeedingPrepare}} to an array, but the size 
wouldn't be known until the end of {{initialize()}}. Would you recommend that I 
create a temp-list and convert that to an array?
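
A minimal sketch of that temp-list approach (variable names taken from the 
comment; the surrounding initialize() code is assumed):

{code}
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator;

// Collect into a temp list during initialize(), then convert once, when the
// final size is known.
List<ExprNodeEvaluator> prepareList = new ArrayList<ExprNodeEvaluator>();
// ... populated while walking the children in initialize() ...
ExprNodeEvaluator[] childrenNeedingPrepare =
    prepareList.toArray(new ExprNodeEvaluator[prepareList.size()]);
{code}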

> Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator
> ---
>
> Key: HIVE-8313
> URL: https://issues.apache.org/jira/browse/HIVE-8313
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0, 0.13.0, 0.14.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-8313.1.patch
>
>
> Consider the following query:
> {code:sql}
> SELECT foo, bar, goo, id
> FROM myTable
> WHERE id IN ( 'A', 'B', 'C', 'D', ... , 'ZZ' );
> {code}
> One finds that when the IN clause has several thousand elements (and the 
> table has several million rows), the query above takes orders-of-magnitude 
> longer to run on Hive 0.12 than say Hive 0.10.
> I have a possibly incomplete fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8550) Hive cannot load data into partitioned table with Unicode key

2014-10-30 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HIVE-8550:

Attachment: HIVE-8550.0.14.1.patch

Made a patch for 0.14. [~hagleitn], can you get that into 0.14? Thanks!

> Hive cannot load data into partitioned table with Unicode key
> -
>
> Key: HIVE-8550
> URL: https://issues.apache.org/jira/browse/HIVE-8550
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment: Windows
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>Priority: Critical
> Attachments: CreatePartitionedTable.hql, HIVE-8550.0.14.1.patch, 
> HIVE-8550.1.patch, HIVE-8550.2.patch, HIVE-8550.3.patch, 
> LoadIntoPartitionedTable.hql, partitioned.txt
>
>
> Steps to reproduce:
> 1) Copy the file partitioned.txt to the root folder of your HDFS root dir. 
> Copy the two hql files to your local directory.
> 2) Open Hive CLI.
> 3) Run:
> hive> source <path to CreatePartitionedTable.hql>;
> 4) Run:
> hive> source <path to LoadIntoPartitionedTable.hql>;
> The following error will be shown:
> hive> source C:\Scripts\partition\LoadIntoPartitionedTable.hql;
> Loading data to table default.mypartitioned partition (tag=䶵)
> Failed with exception null
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8636) CBO: split cbo_correctness test

2014-10-30 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191118#comment-14191118
 ] 

Ashutosh Chauhan commented on HIVE-8636:


Minor comments on RB

> CBO: split cbo_correctness test
> ---
>
> Key: HIVE-8636
> URL: https://issues.apache.org/jira/browse/HIVE-8636
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-8636.patch
>
>
> The CBO correctness test is extremely annoying: it runs forever; if anything 
> fails, it's hard to debug due to the volume of logs from all the stuff; and it 
> doesn't run further, so if multiple things fail they can only be discovered 
> one by one. Also, SORT_QUERY_RESULTS cannot be used, because some queries 
> presumably rely on ordering.
> It should be split into separate tests; the numbers in there now may be good 
> as boundaries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27401: HIVE-8636 CBO: split cbo_correctness test

2014-10-30 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27401/#review59288
---



data/scripts/q_test_init.sql


"if exists" missed



data/scripts/q_test_init.sql


Any reason for this? We have hive.stats.dbclass = fs at the top of the file; 
that should be sufficient.



data/scripts/q_test_init.sql


you can just say analyze table cbo_t1 compute statistics;



data/scripts/q_test_init.sql


similarly here analyze table cbo_t1 compute statistics for columns;


- Ashutosh Chauhan


On Oct. 30, 2014, 11:18 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27401/
> ---
> 
> (Updated Oct. 30, 2014, 11:18 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and John Pullokkaran.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See JIRA
> 
> 
> Diffs
> -
> 
>   data/scripts/q_test_cleanup.sql 8ec0f9f 
>   data/scripts/q_test_init.sql 7484f0c 
>   itests/src/test/resources/testconfiguration.properties 2c84a36 
>   pom.xml bd74830 
>   ql/src/test/queries/clientpositive/cbo_correctness.q bb328f6 
>   ql/src/test/queries/clientpositive/cbo_gby.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_gby_empty.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_join.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_limit.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_semijoin.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_simple_select.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_stats.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_subq_exists.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_subq_in.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_subq_not_in.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_udf_udaf.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_union.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_views.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_windowing.q PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_correctness.q.out d98cb5b 
>   ql/src/test/results/clientpositive/cbo_gby.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_gby_empty.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_join.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_limit.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_semijoin.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_simple_select.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_stats.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_subq_exists.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_subq_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_subq_not_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_udf_udaf.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_union.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_views.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_windowing.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_correctness.q.out 2103d08 
>   ql/src/test/results/clientpositive/tez/cbo_gby.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_gby_empty.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_join.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_limit.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_semijoin.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_simple_select.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_stats.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_subq_exists.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_subq_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_subq_not_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_udf_udaf.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_union.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_views.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_windowing.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/27401/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



[jira] [Commented] (HIVE-8668) mssql sql script has carriage returns

2014-10-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1419#comment-1419
 ] 

Hive QA commented on HIVE-8668:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678263/HIVE-8668.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 6608 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_transform_acid
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1562/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1562/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1562/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678263 - PreCommit-HIVE-TRUNK-Build

> mssql sql script has carriage returns
> -
>
> Key: HIVE-8668
> URL: https://issues.apache.org/jira/browse/HIVE-8668
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Blocker
> Fix For: 0.14.0
>
> Attachments: HIVE-8668.patch, HIVE-8668.patch
>
>
> This is breaking patches generated by {{svn merge}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8202) Support SMB Join for Hive on Spark [Spark Branch]

2014-10-30 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-8202:

Attachment: HIVE-8202.9-spark.patch

Getting closer, fixing more test failures.

> Support SMB Join for Hive on Spark [Spark Branch]
> -
>
> Key: HIVE-8202
> URL: https://issues.apache.org/jira/browse/HIVE-8202
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Szehon Ho
> Attachments: HIVE-8202.1-spark.patch, HIVE-8202.2-spark.patch, 
> HIVE-8202.3-spark.patch, HIVE-8202.4-spark.patch, HIVE-8202.5-spark.patch, 
> HIVE-8202.6-spark.patch, HIVE-8202.7-spark.patch, HIVE-8202.8-spark.patch, 
> HIVE-8202.9-spark.patch, Hive on Spark SMB Join.docx, Hive on Spark SMB 
> Join.pdf
>
>
> SMB joins are used wherever the tables are sorted and bucketed. It's a 
> map-side join: the join boils down to just merging the already-sorted tables, 
> which makes the operation faster than an ordinary map-join.
> The task is to research and support the conversion from a regular SMB join to 
> an SMB map join for the Spark execution engine.
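
As a toy illustration of the core merge idea only (not Hive's actual SMB join code; it ignores bucketing and duplicate join keys):

{code:java}
// Toy sketch, not Hive's SMB implementation: with both inputs sorted on the
// join key, a single forward pass joins them.
class SortMergeJoinSketch {
  public static void main(String[] args) {
    int[] left  = {1, 3, 5, 7};
    int[] right = {3, 4, 5, 8};
    int i = 0, j = 0;
    while (i < left.length && j < right.length) {
      if (left[i] < right[j]) {
        i++;                       // advance the side with the smaller key
      } else if (left[i] > right[j]) {
        j++;
      } else {
        System.out.println("match: " + left[i]);   // keys equal: emit joined row
        i++;
        j++;
      }
    }
  }
}
{code}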



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8648) numRows cannot be set by user

2014-10-30 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191097#comment-14191097
 ] 

Ashutosh Chauhan commented on HIVE-8648:


Yeah, that's correct. If you put up a patch, I'm happy to review it.

> numRows cannot be set by user
> -
>
> Key: HIVE-8648
> URL: https://issues.apache.org/jira/browse/HIVE-8648
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0, 0.14.0, 0.13.1
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-8648.1.patch
>
>
> Since HIVE-3777 users who want to set the number of rows for a table, must do 
> as follows:
> {noformat}
> alter table ... set tblproperties ('numRows' = '12345', 
> 'STATS_GENERATED_VIA_STATS_TASK' = 'true');
> {noformat}
> Which is strange because (1) users can know the numbers of rows and (2) the 
> stat is not generated by a stats task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8663) Fetching Vectorization scratch column map in Reduce-Side stop working

2014-10-30 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191094#comment-14191094
 ] 

Ashutosh Chauhan commented on HIVE-8663:


+1

> Fetching Vectorization scratch column map in Reduce-Side stop working
> -
>
> Key: HIVE-8663
> URL: https://issues.apache.org/jira/browse/HIVE-8663
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 0.14.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8663.01.patch, HIVE-8663.02.patch
>
>
> Recent changes (somewhere) caused scratch column types to not be fetched on 
> reduce-side.
> Switching to use scratch column types from VectorizationContext.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8394) HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss.

2014-10-30 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-8394:
---
Status: Patch Available  (was: Open)

> HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss.
> -
>
> Key: HIVE-8394
> URL: https://issues.apache.org/jira/browse/HIVE-8394
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.13.1, 0.12.0, 0.14.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>Priority: Critical
> Attachments: HIVE-8394.1.patch, HIVE-8394.2.patch
>
>
> We've found situations in production where Pig queries using {{HCatStorer}}, 
> dynamic partitioning and {{opt.multiquery=true}} produce partitions in the 
> output table, but the corresponding directories have no data files (in spite 
> of Pig reporting non-zero records written to HDFS). I don't yet have a 
> distilled test-case for this.
> Here's the code from FileOutputCommitterContainer after HIVE-7803:
> {code:java|title=FileOutputCommitterContainer.java|borderStyle=dashed|titleBGColor=#F7D6C1|bgColor=#CE}
>   @Override
>   public void commitTask(TaskAttemptContext context) throws IOException {
> String jobInfoStr = 
> context.getConfiguration().get(FileRecordWriterContainer.DYN_JOBINFO);
> if (!dynamicPartitioningUsed) {
>  //See HCATALOG-499
>   FileOutputFormatContainer.setWorkOutputPath(context);
>   
> getBaseOutputCommitter().commitTask(HCatMapRedUtil.createTaskAttemptContext(context));
> } else if (jobInfoStr != null) {
>   ArrayList<String> jobInfoList = 
> (ArrayList<String>) HCatUtil.deserialize(jobInfoStr);
>   org.apache.hadoop.mapred.TaskAttemptContext currTaskContext = 
> HCatMapRedUtil.createTaskAttemptContext(context);
>   for (String jobStr : jobInfoList) {
>   OutputJobInfo localJobInfo = 
> (OutputJobInfo)HCatUtil.deserialize(jobStr);
>   FileOutputCommitter committer = new FileOutputCommitter(new 
> Path(localJobInfo.getLocation()), currTaskContext);
>   committer.commitTask(currTaskContext);
>   }
> }
>   }
> {code}
> The serialized jobInfoList can't be retrieved, and hence the commit never 
> completes. This is because Pig's MapReducePOStoreImpl deliberately clones 
> both the TaskAttemptContext and the contained Configuration instance, thus 
> separating the Configuration instances passed to 
> {{FileOutputCommitterContainer::commitTask()}} and 
> {{FileRecordWriterContainer::close()}}. Anything set by the RecordWriter is 
> unavailable to the Committer.
> One approach would have been to store state in the FileOutputFormatContainer. 
> But that won't work since this is constructed via reflection in 
> HCatOutputFormat (itself constructed via reflection by PigOutputFormat via 
> HCatStorer). There's no guarantee that the instance is preserved.
> My only recourse seems to be to use a Singleton to store shared state. I'm 
> loath to indulge in this brand of shenanigans. (Statics and container-reuse 
> in Tez might not play well together, for instance.) It might work if we're 
> careful about tearing down the singleton.
> Any other ideas? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8394) HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss.

2014-10-30 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-8394:
---
Attachment: HIVE-8394.2.patch

Here's the updated patch. It handles the singleton-cleanup more completely. And 
instead of keying on only the TaskAttemptID, the solution now uses the 
commit-path as part of the key. This should handle multiple Pig outputs for the 
same attempt. (Thanks, [~cdrome].)
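
Roughly, the shared state looks like this (an illustrative sketch only; {{DynamicCommitState}} and its method names are hypothetical, not the identifiers in the patch):

{code:java}
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: static state keyed on (task attempt, commit path)
// survives Pig's cloning of the TaskAttemptContext/Configuration, and the
// commit path distinguishes multiple Pig outputs from the same task attempt.
public final class DynamicCommitState {
  private static final ConcurrentHashMap<String, List<String>> STATE =
      new ConcurrentHashMap<String, List<String>>();

  private DynamicCommitState() {}

  private static String key(String taskAttemptId, String commitPath) {
    return taskAttemptId + "#" + commitPath;
  }

  public static void register(String taskAttemptId, String commitPath,
                              List<String> jobInfoList) {
    STATE.put(key(taskAttemptId, commitPath), jobInfoList);
  }

  public static List<String> retrieve(String taskAttemptId, String commitPath) {
    return STATE.get(key(taskAttemptId, commitPath));
  }

  public static void discard(String taskAttemptId, String commitPath) {
    // Tear-down matters: stale entries would otherwise leak across
    // container reuse (e.g. in Tez).
    STATE.remove(key(taskAttemptId, commitPath));
  }
}
{code}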

> HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss.
> -
>
> Key: HIVE-8394
> URL: https://issues.apache.org/jira/browse/HIVE-8394
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.12.0, 0.14.0, 0.13.1
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>Priority: Critical
> Attachments: HIVE-8394.1.patch, HIVE-8394.2.patch
>
>
> We've found situations in production where Pig queries using {{HCatStorer}}, 
> dynamic partitioning and {{opt.multiquery=true}} produce partitions in the 
> output table, but the corresponding directories have no data files (in spite 
> of Pig reporting non-zero records written to HDFS). I don't yet have a 
> distilled test-case for this.
> Here's the code from FileOutputCommitterContainer after HIVE-7803:
> {code:java|title=FileOutputCommitterContainer.java|borderStyle=dashed|titleBGColor=#F7D6C1|bgColor=#CE}
>   @Override
>   public void commitTask(TaskAttemptContext context) throws IOException {
> String jobInfoStr = 
> context.getConfiguration().get(FileRecordWriterContainer.DYN_JOBINFO);
> if (!dynamicPartitioningUsed) {
>  //See HCATALOG-499
>   FileOutputFormatContainer.setWorkOutputPath(context);
>   
> getBaseOutputCommitter().commitTask(HCatMapRedUtil.createTaskAttemptContext(context));
> } else if (jobInfoStr != null) {
>   ArrayList<String> jobInfoList = 
> (ArrayList<String>) HCatUtil.deserialize(jobInfoStr);
>   org.apache.hadoop.mapred.TaskAttemptContext currTaskContext = 
> HCatMapRedUtil.createTaskAttemptContext(context);
>   for (String jobStr : jobInfoList) {
>   OutputJobInfo localJobInfo = 
> (OutputJobInfo)HCatUtil.deserialize(jobStr);
>   FileOutputCommitter committer = new FileOutputCommitter(new 
> Path(localJobInfo.getLocation()), currTaskContext);
>   committer.commitTask(currTaskContext);
>   }
> }
>   }
> {code}
> The serialized jobInfoList can't be retrieved, and hence the commit never 
> completes. This is because Pig's MapReducePOStoreImpl deliberately clones 
> both the TaskAttemptContext and the contained Configuration instance, thus 
> separating the Configuration instances passed to 
> {{FileOutputCommitterContainer::commitTask()}} and 
> {{FileRecordWriterContainer::close()}}. Anything set by the RecordWriter is 
> unavailable to the Committer.
> One approach would have been to store state in the FileOutputFormatContainer. 
> But that won't work since this is constructed via reflection in 
> HCatOutputFormat (itself constructed via reflection by PigOutputFormat via 
> HCatStorer). There's no guarantee that the instance is preserved.
> My only recourse seems to be to use a Singleton to store shared state. I'm 
> loath to indulge in this brand of shenanigans. (Statics and container-reuse 
> in Tez might not play well together, for instance.) It might work if we're 
> careful about tearing down the singleton.
> Any other ideas? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8550) Hive cannot load data into partitioned table with Unicode key

2014-10-30 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191083#comment-14191083
 ] 

Xiaobing Zhou commented on HIVE-8550:
-

Confirmed two concerns from [~jdere]:
1. A legacy system (varchar-typed column) with non-Unicode partition data in 
the DB: after running the nvarchar upgrade, the patch works well for Unicode 
partitioned-table creation and data loading.
2. Ran some sample queries on the PARTITIONS table to make sure the index is 
still working after the upgrade.

For case 1, the upgrade went smoothly. For case 2, I ran this query on MSSQL:
{noformat}
select * from dbo.PARTITIONS where part_name = 'ds=2008-04-08/hr=11' and TBL_ID 
= 7;
{noformat}
The execution plan includes an index seek on [PARTITIONS].[UNIQUEPARTITION], 
which is a nonclustered index on part_name and TBL_ID. This is good evidence 
that the index is working well after the upgrade.

So it's safe to commit.

> Hive cannot load data into partitioned table with Unicode key
> -
>
> Key: HIVE-8550
> URL: https://issues.apache.org/jira/browse/HIVE-8550
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment: Windows
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>Priority: Critical
> Attachments: CreatePartitionedTable.hql, HIVE-8550.1.patch, 
> HIVE-8550.2.patch, HIVE-8550.3.patch, LoadIntoPartitionedTable.hql, 
> partitioned.txt
>
>
> Steps to reproduce:
> 1) Copy the file partitioned.txt to the root folder of your HDFS root dir. 
> Copy the two hql files to your local directory.
> 2) Open Hive CLI.
> 3) Run:
> hive> source CreatePartitionedTable.hql;
> 4) Run:
> hive> source LoadIntoPartitionedTable.hql;
> The following error will be shown:
> hive> source C:\Scripts\partition\LoadIntoPartitionedTable.hql;
> Loading data to table default.mypartitioned partition (tag=䶵)
> Failed with exception null
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8680) Set Max Message for Binary Thrift endpoints

2014-10-30 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8680:
---
Attachment: HIVE-8680.patch

> Set Max Message for Binary Thrift endpoints
> ---
>
> Key: HIVE-8680
> URL: https://issues.apache.org/jira/browse/HIVE-8680
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-8680.patch
>
>
> Thrift has a configuration option to restrict incoming message size. If we 
> configure it, we'll stop OOM'ing when someone sends us an HTTP request.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8680) Set Max Message for Binary Thrift endpoints

2014-10-30 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8680:
---
Status: Patch Available  (was: Open)

> Set Max Message for Binary Thrift endpoints
> ---
>
> Key: HIVE-8680
> URL: https://issues.apache.org/jira/browse/HIVE-8680
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-8680.patch
>
>
> Thrift has a configuration option to restrict incoming message size. If we 
> configure it, we'll stop OOM'ing when someone sends us an HTTP request.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8680) Set Max Message for Binary Thrift endpoints

2014-10-30 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8680:
---
Description: Thrift has a configuration option to restrict incoming message 
size. If we configure it, we'll stop OOM'ing when someone sends us an HTTP 
request.
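
For illustration, the cap might be wired up like this (a hedged sketch; the exact {{TBinaryProtocol.Factory}} constructor varies across libthrift releases, so treat the signature below as an assumption):

{code:java}
import org.apache.thrift.protocol.TBinaryProtocol;

// Sketch: bound the bytes the server will read for strings/containers so a
// stray request (e.g. HTTP sent to the binary port) can't force a huge
// allocation. Assumes a libthrift version whose TBinaryProtocol.Factory
// takes (strictRead, strictWrite, stringLengthLimit, containerLengthLimit).
public class BoundedProtocolFactorySketch {
  public static TBinaryProtocol.Factory create(long maxMessageSize) {
    return new TBinaryProtocol.Factory(true, true, maxMessageSize, maxMessageSize);
  }
}
{code}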

> Set Max Message for Binary Thrift endpoints
> ---
>
> Key: HIVE-8680
> URL: https://issues.apache.org/jira/browse/HIVE-8680
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
>
> Thrift has a configuration option to restrict incoming message size. If we 
> configure it, we'll stop OOM'ing when someone sends us an HTTP request.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-8680) Set Max Message for Binary Thrift endpoints

2014-10-30 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland reassigned HIVE-8680:
--

Assignee: Brock Noland

> Set Max Message for Binary Thrift endpoints
> ---
>
> Key: HIVE-8680
> URL: https://issues.apache.org/jira/browse/HIVE-8680
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8680) Set Max Message for Binary Thrift endpoints

2014-10-30 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8680:
---
Summary: Set Max Message for Binary Thrift endpoints  (was: Configure Max 
Message for Binary Thrift endpoints)

> Set Max Message for Binary Thrift endpoints
> ---
>
> Key: HIVE-8680
> URL: https://issues.apache.org/jira/browse/HIVE-8680
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8680) Configure Max Message for Binary Thrift endpoints

2014-10-30 Thread Brock Noland (JIRA)
Brock Noland created HIVE-8680:
--

 Summary: Configure Max Message for Binary Thrift endpoints
 Key: HIVE-8680
 URL: https://issues.apache.org/jira/browse/HIVE-8680
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8202) Support SMB Join for Hive on Spark [Spark Branch]

2014-10-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191079#comment-14191079
 ] 

Hive QA commented on HIVE-8202:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678336/HIVE-8202.8-spark.patch

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 7095 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_script_pipe
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoin
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoin_noskew
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_20
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_21
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_tez_join_tests
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/293/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/293/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-293/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678336 - PreCommit-HIVE-SPARK-Build

> Support SMB Join for Hive on Spark [Spark Branch]
> -
>
> Key: HIVE-8202
> URL: https://issues.apache.org/jira/browse/HIVE-8202
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Szehon Ho
> Attachments: HIVE-8202.1-spark.patch, HIVE-8202.2-spark.patch, 
> HIVE-8202.3-spark.patch, HIVE-8202.4-spark.patch, HIVE-8202.5-spark.patch, 
> HIVE-8202.6-spark.patch, HIVE-8202.7-spark.patch, HIVE-8202.8-spark.patch, 
> Hive on Spark SMB Join.docx, Hive on Spark SMB Join.pdf
>
>
> SMB joins are used wherever the tables are sorted and bucketed. It's a 
> map-side join. The join boils down to just merging the already sorted tables, 
> allowing this operation to be faster than an ordinary map-join.
> The task is to research and support the conversion from regular SMB join to 
> SMB map join for Spark execution engine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-10-30 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191049#comment-14191049
 ] 

Sergio Peña commented on HIVE-8359:
---

I submitted the patch to code review.
https://reviews.apache.org/r/27404/

> Map containing null values are not correctly written in Parquet files
> -
>
> Key: HIVE-8359
> URL: https://issues.apache.org/jira/browse/HIVE-8359
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.1
>Reporter: Frédéric TERRAZZONI
>Assignee: Sergio Peña
> Attachments: HIVE-8359.1.patch, map_null_val.avro
>
>
> Tried to write a map column to a Parquet file. The table should 
> contain:
> {code}
> {"key3":"val3","key4":null}
> {"key3":"val3","key4":null}
> {"key1":null,"key2":"val2"}
> {"key3":"val3","key4":null}
> {"key3":"val3","key4":null}
> {code}
> ... and when you do a query like {code}SELECT * from mytable{code},
> we can see that the table is corrupted:
> {code}
> {"key3":"val3"}
> {"key4":"val3"}
> {"key3":"val2"}
> {"key4":"val3"}
> {"key1":"val3"}
> {code}
> I've not been able to read the Parquet file in our software afterwards, and 
> consequently I suspect it to be corrupted. 
> For those who are interested, I generated this Parquet table from an Avro 
> file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8679) jdwp error when debugging Hive with MapredLocalTask

2014-10-30 Thread Chao (JIRA)
Chao created HIVE-8679:
--

 Summary: jdwp error when debugging Hive with MapredLocalTask
 Key: HIVE-8679
 URL: https://issues.apache.org/jira/browse/HIVE-8679
 Project: Hive
  Issue Type: Bug
Reporter: Chao


When debugging Hive with the {{--debug}} option, Hive will fail when starting a 
local task, and give the following error message:

{noformat}
Error occurred during initialization of VM
ERROR: Cannot load this JVM TI agent twice, check your java command line for 
duplicate jdwp options.

agent library failed to init: jdwp
Execution failed with exit status: 1
Obtaining error information

Task failed!
Task ID:
  Stage-4

Logs:

/tmp/chao/hive.log
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 27404: HIVE-8359 Map containing null values are not correctly written in Parquet files

2014-10-30 Thread Sergio Pena

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27404/
---

Review request for hive.


Bugs: HIVE-8359
https://issues.apache.org/jira/browse/HIVE-8359


Repository: hive-git


Description
---

The patch changes the way the DataWritableWriter class writes an array of 
elements to the Parquet record. 
It wraps each array element in its own startGroup/endGroup block so that 
Parquet can detect null values in those optional fields.
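
The idea, very roughly (an illustrative sketch, not the patch itself; the field names follow Parquet's key/value map convention, and newer Parquet releases relocate these classes to {{org.apache.parquet.io.api}}):

{code:java}
import parquet.io.api.Binary;
import parquet.io.api.RecordConsumer;

// Sketch only, not DataWritableWriter itself: each entry gets its own wrapper
// group, so a null value is representable as a group whose optional "value"
// field was simply never written.
class MapEntryWriterSketch {
  void writeMapEntry(RecordConsumer consumer, String key, String value) {
    consumer.startGroup();                        // one wrapper group per entry
    consumer.startField("key", 0);
    consumer.addBinary(Binary.fromString(key));
    consumer.endField("key", 0);
    if (value != null) {                          // optional field: skip on null
      consumer.startField("value", 1);
      consumer.addBinary(Binary.fromString(value));
      consumer.endField("value", 1);
    }
    consumer.endGroup();
  }
}
{code}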


Diffs
-

  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 
c7078efe27482df0a11dd68ac068da27dbcf51b3 
  ql/src/test/queries/clientpositive/parquet_map_null.q PRE-CREATION 
  ql/src/test/results/clientpositive/parquet_map_null.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/27404/diff/


Testing
---


Thanks,

Sergio Pena



Re: Review Request 26854: HIVE-2573 Create per-session function registry

2014-10-30 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26854/
---

(Updated Oct. 30, 2014, 11:41 p.m.)


Review request for hive, Navis Ryu and Thejas Nair.


Changes
---

Updating with HIVE-2573.11.patch.txt from Navis


Bugs: HIVE-2573
https://issues.apache.org/jira/browse/HIVE-2573


Repository: hive-git


Description
---

Small updates to Navis' changes:
- the session registry doesn't look up the metastore for UDFs
- my feedback from Navis' original patch
- metastore UDFs should not be considered native, which allows them to be 
added/removed from the registry (rough sketch of the lookup order below)
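
A minimal sketch of that lookup order, assuming a session-first fallback (the names and shape here are illustrative, not the patch's actual classes; {{Object}} stands in for Hive's {{FunctionInfo}}):

{code:java}
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: session-first lookup with fallback to the shared
// system registry; Object stands in for Hive's FunctionInfo.
class SessionRegistrySketch {
  private final Map<String, Object> systemFns;   // shared, native functions
  private final Map<String, Object> sessionFns =
      new ConcurrentHashMap<String, Object>();

  SessionRegistrySketch(Map<String, Object> systemFns) {
    this.systemFns = systemFns;
  }

  Object lookup(String name) {
    String key = name.toLowerCase(Locale.ROOT);
    Object fn = sessionFns.get(key);
    return fn != null ? fn : systemFns.get(key);   // session wins over system
  }

  // Non-native (e.g. metastore-backed) functions live in the session map, so
  // they can be added and removed without touching the shared registry.
  void register(String name, Object fn) {
    sessionFns.put(name.toLowerCase(Locale.ROOT), fn);
  }

  void unregister(String name) {
    sessionFns.remove(name.toLowerCase(Locale.ROOT));
  }
}
{code}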


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/common/JavaUtils.java 9aa917c 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7d8e5bc 
  contrib/src/test/results/clientnegative/invalid_row_sequence.q.out 8f3c0b3 
  
itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestJdbcWithMiniKdc.java
 6647ce5 
  
metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
 88b0791 
  ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 9ac540e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/CommonFunctionInfo.java 93c15c0 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionInfo.java 074255b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java e43a328 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionTask.java 569c125 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Registry.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 5bdeb92 
  ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java efecb05 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 4e3df75 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java b900627 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java 13277a9 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 211ab6c 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/IndexUtils.java e2768ff 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/SqlFunctionConverter.java
 793f117 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java 
1796b7b 
  ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java 
22e5b47 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IndexUpdater.java 2b239ab 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionConf.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java af633cb 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestMacroSemanticAnalyzer.java 
46f8052 
  ql/src/test/queries/clientnegative/drop_native_udf.q ae047bb 
  ql/src/test/results/clientnegative/create_function_nonexistent_class.q.out 
c7405ed 
  ql/src/test/results/clientnegative/create_function_nonudf_class.q.out d0dd50a 
  ql/src/test/results/clientnegative/drop_native_udf.q.out 9f0eaa5 
  ql/src/test/results/clientnegative/udf_nonexistent_resource.q.out e184787 
  service/src/test/org/apache/hadoop/hive/service/TestHiveServerSessions.java 
fd38907 

Diff: https://reviews.apache.org/r/26854/diff/


Testing
---


Thanks,

Jason Dere



Re: Review Request 27367: support ISO-2012 timestamp literals

2014-10-30 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27367/#review59279
---


I think this looks good. Would you be able to add a test using timestamp 
literals as a partition column? Just to make sure we don't get a repeat of 
HIVE-4928 for timestamp.

- Jason Dere


On Oct. 30, 2014, 12:52 a.m., Navis Ryu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27367/
> ---
> 
> (Updated Oct. 30, 2014, 12:52 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-3187
> https://issues.apache.org/jira/browse/HIVE-3187
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Enable the JDBC driver/Hive SQL engine to accept JDBC canonical or ISO-SQL 
> 20xx Timestamp literals
> 
> ie.
> select 1 from cert.tversion tversion where timestamp '1989-01-01 
> 10:20:30.0' <> timestamp '2000-12-31 12:15:30.12300'
> 
> instead of
> unix_timestamp('...')
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g c903e8f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 13d5255 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java 
> e065983 
>   ql/src/test/queries/clientnegative/timestamp_literal.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/timestamp_literal.q PRE-CREATION 
>   ql/src/test/results/clientnegative/date_literal2.q.out 82f6425 
>   ql/src/test/results/clientnegative/date_literal3.q.out 82f6425 
>   ql/src/test/results/clientnegative/illegal_partition_type4.q.out e388086 
>   ql/src/test/results/clientnegative/timestamp_literal.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/timestamp_literal.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/27367/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Navis Ryu
> 
>



[jira] [Updated] (HIVE-8546) Handle "add archive scripts.tar.gz" in Tez

2014-10-30 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-8546:
--
Status: Patch Available  (was: Open)

> Handle "add archive scripts.tar.gz" in Tez
> --
>
> Key: HIVE-8546
> URL: https://issues.apache.org/jira/browse/HIVE-8546
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 0.14.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8546.1.patch, HIVE-8546.2.patch, HIVE-8546.3.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8546) Handle "add archive scripts.tar.gz" in Tez

2014-10-30 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191041#comment-14191041
 ] 

Gopal V commented on HIVE-8546:
---

[~sseth]: Pattern changed to Archive now.

> Handle "add archive scripts.tar.gz" in Tez
> --
>
> Key: HIVE-8546
> URL: https://issues.apache.org/jira/browse/HIVE-8546
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 0.14.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8546.1.patch, HIVE-8546.2.patch, HIVE-8546.3.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8546) Handle "add archive scripts.tar.gz" in Tez

2014-10-30 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-8546:
--
Status: Open  (was: Patch Available)

> Handle "add archive scripts.tar.gz" in Tez
> --
>
> Key: HIVE-8546
> URL: https://issues.apache.org/jira/browse/HIVE-8546
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 0.14.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8546.1.patch, HIVE-8546.2.patch, HIVE-8546.3.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-10-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191038#comment-14191038
 ] 

Hive QA commented on HIVE-8359:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678229/HIVE-8359.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6609 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_transform_acid
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1561/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1561/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1561/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678229 - PreCommit-HIVE-TRUNK-Build

> Map containing null values are not correctly written in Parquet files
> -
>
> Key: HIVE-8359
> URL: https://issues.apache.org/jira/browse/HIVE-8359
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.1
>Reporter: Frédéric TERRAZZONI
>Assignee: Sergio Peña
> Attachments: HIVE-8359.1.patch, map_null_val.avro
>
>
> Tried to write a map column to a Parquet file. The table should 
> contain:
> {code}
> {"key3":"val3","key4":null}
> {"key3":"val3","key4":null}
> {"key1":null,"key2":"val2"}
> {"key3":"val3","key4":null}
> {"key3":"val3","key4":null}
> {code}
> ... and when you do a query like {code}SELECT * from mytable{code},
> we can see that the table is corrupted:
> {code}
> {"key3":"val3"}
> {"key4":"val3"}
> {"key3":"val2"}
> {"key4":"val3"}
> {"key1":"val3"}
> {code}
> I've not been able to read the Parquet file in our software afterwards, and 
> consequently I suspect it to be corrupted. 
> For those who are interested, I generated this Parquet table from an Avro 
> file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8546) Handle "add archive scripts.tar.gz" in Tez

2014-10-30 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-8546:
--
Attachment: HIVE-8546.3.patch

> Handle "add archive scripts.tar.gz" in Tez
> --
>
> Key: HIVE-8546
> URL: https://issues.apache.org/jira/browse/HIVE-8546
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 0.14.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8546.1.patch, HIVE-8546.2.patch, HIVE-8546.3.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8395) CBO: enable by default

2014-10-30 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-8395:
---
Attachment: HIVE-8395.19.patch

Most recent out-file updates for parquet_ctas and optimize_nullscan, plus 
removal of all the now-committed fixes from the patch.
Presumably only one issue remains, join_filters, which is covered in 
CALCITE-448.


> CBO: enable by default
> --
>
> Key: HIVE-8395
> URL: https://issues.apache.org/jira/browse/HIVE-8395
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.15.0
>
> Attachments: HIVE-8395.01.patch, HIVE-8395.02.patch, 
> HIVE-8395.03.patch, HIVE-8395.04.patch, HIVE-8395.05.patch, 
> HIVE-8395.06.patch, HIVE-8395.07.patch, HIVE-8395.08.patch, 
> HIVE-8395.09.patch, HIVE-8395.10.patch, HIVE-8395.11.patch, 
> HIVE-8395.12.patch, HIVE-8395.12.patch, HIVE-8395.13.patch, 
> HIVE-8395.13.patch, HIVE-8395.14.patch, HIVE-8395.15.patch, 
> HIVE-8395.16.patch, HIVE-8395.17.patch, HIVE-8395.18.patch, 
> HIVE-8395.18.patch, HIVE-8395.19.patch, HIVE-8395.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8636) CBO: split cbo_correctness test

2014-10-30 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-8636:
---
Attachment: HIVE-8636.patch

[~jpullokkaran], can you review? The change moves the tables into the shared 
test tables, renames t1..3 to cbo_t1..3, and splits the test.
RB at https://reviews.apache.org/r/27401/

> CBO: split cbo_correctness test
> ---
>
> Key: HIVE-8636
> URL: https://issues.apache.org/jira/browse/HIVE-8636
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-8636.patch
>
>
> The CBO correctness test is extremely annoying: it runs forever, and if 
> anything fails it is hard to debug due to the volume of logs from all the 
> stuff. It also stops at the first failure, so if multiple things fail they can 
> only be discovered one by one. SORT_QUERY_RESULTS cannot be used either, 
> because some queries presumably depend on sort order.
> It should be split into separate tests; the numbers in there now may be good 
> boundaries for the split.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8636) CBO: split cbo_correctness test

2014-10-30 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-8636:
---
Status: Patch Available  (was: Open)

> CBO: split cbo_correctness test
> ---
>
> Key: HIVE-8636
> URL: https://issues.apache.org/jira/browse/HIVE-8636
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-8636.patch
>
>
> The CBO correctness test is extremely annoying: it runs forever, and if 
> anything fails it is hard to debug due to the volume of logs from all the 
> stuff. It also stops at the first failure, so if multiple things fail they can 
> only be discovered one by one. SORT_QUERY_RESULTS cannot be used either, 
> because some queries presumably depend on sort order.
> It should be split into separate tests; the numbers in there now may be good 
> boundaries for the split.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27401: HIVE-8636 CBO: split cbo_correctness test

2014-10-30 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27401/
---

(Updated Oct. 30, 2014, 11:18 p.m.)


Review request for hive, Ashutosh Chauhan and John Pullokkaran.


Repository: hive-git


Description
---

See JIRA


Diffs (updated)
-

  data/scripts/q_test_cleanup.sql 8ec0f9f 
  data/scripts/q_test_init.sql 7484f0c 
  itests/src/test/resources/testconfiguration.properties 2c84a36 
  pom.xml bd74830 
  ql/src/test/queries/clientpositive/cbo_correctness.q bb328f6 
  ql/src/test/queries/clientpositive/cbo_gby.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_gby_empty.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_join.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_limit.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_semijoin.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_simple_select.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_stats.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_subq_exists.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_subq_in.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_subq_not_in.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_udf_udaf.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_union.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_views.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_windowing.q PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_correctness.q.out d98cb5b 
  ql/src/test/results/clientpositive/cbo_gby.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_gby_empty.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_join.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_limit.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_semijoin.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_simple_select.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_stats.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_subq_exists.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_subq_in.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_subq_not_in.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_udf_udaf.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_union.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_views.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_windowing.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_correctness.q.out 2103d08 
  ql/src/test/results/clientpositive/tez/cbo_gby.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_gby_empty.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_join.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_limit.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_semijoin.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_simple_select.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_stats.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_subq_exists.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_subq_in.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_subq_not_in.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_udf_udaf.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_union.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_views.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_windowing.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/27401/diff/


Testing
---


Thanks,

Sergey Shelukhin



[jira] [Updated] (HIVE-8678) Pig fails to correctly load DATE fields using HCatalog

2014-10-30 Thread Michael McLellan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McLellan updated HIVE-8678:
---
Description: 
On:
Hadoop 2.5.0-cdh5.2.0 
Pig 0.12.0-cdh5.2.0
Hive 0.13.1-cdh5.2.0

When using pig -useHCatalog to load a Hive table that has a DATE field and 
then trying to DUMP the field, the following error occurs:

{code}
2014-10-30 22:58:05,469 [main] ERROR 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - 
org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
converting read value to tuple
at 
org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
at 
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
at 
org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at 
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
java.sql.Date
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420)
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457)
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375)
at 
org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)

2014-10-30 22:58:05,469 [main] ERROR 
org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting 
read value to tuple
{code}

It seems to be occurring here: 
https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java#L433

and that it should be:
{code}Date d = Date.valueOf((String) o);{code}
instead of
{code}Date d = (Date) o;{code}
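
For illustration, a defensive version could look like this (a sketch only; {{toSqlDate}} is a hypothetical helper, not existing HCatalog code):

{code:java}
import java.sql.Date;

// Illustrative sketch, not the actual PigHCatUtil code: accept the value
// whether it arrives as a java.sql.Date or as its String form.
// Date.valueOf expects the "yyyy-mm-dd" format.
class DateFieldSketch {
  static Date toSqlDate(Object o) {
    if (o instanceof Date) {
      return (Date) o;
    }
    return Date.valueOf(o.toString());
  }
}
{code}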

  was:
On:
Hadoop 2.5.0-cdh5.2.0 
Pig 0.12.0-cdh5.2.0
Hive 0.13.1-cdh5.2.0

When using pig -useHCatalog to load a Hive table that has a DATE field, when 
trying to DUMP the field, the following error occurs:

{code}
2014-10-30 22:58:05,469 [main] ERROR 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - 
org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
converting read value to tuple
at 
org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
at 
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
at 
org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at 
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
java.sql.Date
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420)
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457)
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375)
at 
org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)

2014-10-30 22:58:05,469 [main] ERROR 
org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting 
read value to tuple
{code}

It seems to be occuring here: 
https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-pig-ada

[jira] [Updated] (HIVE-8678) Pig fails to correctly load DATE fields using HCatalog

2014-10-30 Thread Michael McLellan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McLellan updated HIVE-8678:
---
Description: 
Using:
Hadoop 2.5.0-cdh5.2.0 
Pig 0.12.0-cdh5.2.0
Hive 0.13.1-cdh5.2.0

When using pig -useHCatalog to load a Hive table that has a DATE field and 
then trying to DUMP the field, the following error occurs:

{code}
2014-10-30 22:58:05,469 [main] ERROR 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - 
org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
converting read value to tuple
at 
org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
at 
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
at 
org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at 
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
java.sql.Date
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420)
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457)
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375)
at 
org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)

2014-10-30 22:58:05,469 [main] ERROR 
org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting 
read value to tuple
{code}

It seems to be occurring here: 
https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java#L433

and that it should be:
{code}Date d = Date.valueOf((String) o);{code}
instead of
{code}Date d = (Date) o;{code}

  was:
On:
Hadoop 2.5.0-cdh5.2.0 
Pig 0.12.0-cdh5.2.0
Hive 0.13.1-cdh5.2.0

When using pig -useHCatalog to load a Hive table that has a DATE field, when 
trying to DUMP the field, the following error occurs:

{code}
2014-10-30 22:58:05,469 [main] ERROR 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - 
org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
converting read value to tuple
at 
org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
at 
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
at 
org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at 
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
java.sql.Date
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420)
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457)
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375)
at 
org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)

2014-10-30 22:58:05,469 [main] ERROR 
org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting 
read value to tuple
{code}

It seems to be occuring here: 
https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-pig-

Re: Review Request 27401: HIVE-8636 CBO: split cbo_correctness test

2014-10-30 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27401/#review59275
---



metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java


will remove


- Sergey Shelukhin


On Oct. 30, 2014, 11:15 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27401/
> ---
> 
> (Updated Oct. 30, 2014, 11:15 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and John Pullokkaran.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See JIRA
> 
> 
> Diffs
> -
> 
>   data/scripts/q_test_cleanup.sql 8ec0f9f 
>   data/scripts/q_test_init.sql 7484f0c 
>   itests/src/test/resources/testconfiguration.properties 2c84a36 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
> e3240ca 
>   pom.xml bd74830 
>   ql/src/test/queries/clientpositive/cbo_correctness.q bb328f6 
>   ql/src/test/queries/clientpositive/cbo_gby.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_gby_empty.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_join.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_limit.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_semijoin.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_simple_select.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_stats.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_subq_exists.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_subq_in.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_subq_not_in.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_udf_udaf.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_union.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_views.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cbo_windowing.q PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_correctness.q.out d98cb5b 
>   ql/src/test/results/clientpositive/cbo_gby.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_gby_empty.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_join.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_limit.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_semijoin.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_simple_select.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_stats.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_subq_exists.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_subq_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_subq_not_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_udf_udaf.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_union.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_views.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_windowing.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_correctness.q.out 2103d08 
>   ql/src/test/results/clientpositive/tez/cbo_gby.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_gby_empty.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_join.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_limit.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_semijoin.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_simple_select.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_stats.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_subq_exists.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_subq_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_subq_not_in.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_udf_udaf.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_union.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_views.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/cbo_windowing.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/27401/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



[jira] [Commented] (HIVE-8648) numRows cannot be set by user

2014-10-30 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191014#comment-14191014
 ] 

Brock Noland commented on HIVE-8648:


[~ashutoshc] it looks like we need to pass the fact that we are not adding or 
deleting data from {{HiveAlterHandler}} to {{MetastoreUtils}}. Is that correct? 
If so, I can do that.
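
As a rough sketch of that idea (all names below are hypothetical; the real 
classes are {{HiveAlterHandler}} and {{MetastoreUtils}}, and actual Hive 
signatures differ):
{code}
import java.util.Map;

// Hypothetical illustration only, not the real Hive call chain. The point:
// alter paths that neither add nor delete data should pass that fact along,
// so a user-supplied 'numRows' is kept instead of being discarded.
class AlterHandlerSketch {
  void alterTable(Map<String, String> tblProps, boolean dataChanged) {
    updateStats(tblProps, dataChanged);
  }

  void updateStats(Map<String, String> tblProps, boolean dataChanged) {
    if (!dataChanged) {
      return;                    // data unchanged: trust user-supplied stats
    }
    tblProps.remove("numRows");  // data changed: the old row count is stale
  }
}
{code}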

> numRows cannot be set by user
> -
>
> Key: HIVE-8648
> URL: https://issues.apache.org/jira/browse/HIVE-8648
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0, 0.14.0, 0.13.1
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-8648.1.patch
>
>
> Since HIVE-3777, users who want to set the number of rows for a table must do 
> as follows:
> {noformat}
> alter table ... set tblproperties ('numRows' = '12345', 
> 'STATS_GENERATED_VIA_STATS_TASK' = 'true');
> {noformat}
> This is strange because (1) users may legitimately know the number of rows and 
> (2) the stat is not actually generated by a stats task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8579) Guaranteed NPE in DDLSemanticAnalyzer

2014-10-30 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-8579:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk/branch-0.14

> Guaranteed NPE in DDLSemanticAnalyzer
> -
>
> Key: HIVE-8579
> URL: https://issues.apache.org/jira/browse/HIVE-8579
> Project: Hive
>  Issue Type: Bug
>Reporter: Lars Francke
>Assignee: Jason Dere
> Fix For: 0.14.0
>
> Attachments: HIVE-8579.1.patch, HIVE-8579.1.patch
>
>
> This was added by [~jdere] in HIVE-8411. I don't fully understand the code 
> (i.e. what it means when desc is null) but I'm sure, Jason, you can fix it 
> without much trouble?
> {code}
> if (desc == null || 
> !AlterTableDesc.doesAlterTableTypeSupportPartialPartitionSpec(desc.getOp())) {
>   throw new SemanticException( 
> ErrorMsg.ALTER_TABLE_TYPE_PARTIAL_PARTITION_SPEC_NO_SUPPORTED, 
> desc.getOp().name());
> } else if (!conf.getBoolVar(HiveConf.ConfVars.DYNAMICPARTITIONING)) {
>   throw new SemanticException(ErrorMsg.DYNAMIC_PARTITION_DISABLED);
> }
> {code}
> You check for whether {{desc}} is null but then use it to do {{desc.getOp()}}.
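> A null-safe rearrangement, as a sketch (assuming the intent is to name the 
> op only when {{desc}} is non-null; the committed fix may differ):
> {code}
> // Sketch: test desc on its own first, and only dereference it afterwards.
> if (desc == null) {
>   throw new SemanticException(
>       ErrorMsg.ALTER_TABLE_TYPE_PARTIAL_PARTITION_SPEC_NO_SUPPORTED);
> } else if (!AlterTableDesc.doesAlterTableTypeSupportPartialPartitionSpec(desc.getOp())) {
>   throw new SemanticException(
>       ErrorMsg.ALTER_TABLE_TYPE_PARTIAL_PARTITION_SPEC_NO_SUPPORTED,
>       desc.getOp().name());
> } else if (!conf.getBoolVar(HiveConf.ConfVars.DYNAMICPARTITIONING)) {
>   throw new SemanticException(ErrorMsg.DYNAMIC_PARTITION_DISABLED);
> }
> {code}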



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27401: HIVE-8636 CBO: split cbo_correctness test

2014-10-30 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27401/
---

(Updated Oct. 30, 2014, 11:15 p.m.)


Review request for hive, Ashutosh Chauhan and John Pullokkaran.


Repository: hive-git


Description
---

See JIRA


Diffs
-

  data/scripts/q_test_cleanup.sql 8ec0f9f 
  data/scripts/q_test_init.sql 7484f0c 
  itests/src/test/resources/testconfiguration.properties 2c84a36 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
e3240ca 
  pom.xml bd74830 
  ql/src/test/queries/clientpositive/cbo_correctness.q bb328f6 
  ql/src/test/queries/clientpositive/cbo_gby.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_gby_empty.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_join.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_limit.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_semijoin.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_simple_select.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_stats.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_subq_exists.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_subq_in.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_subq_not_in.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_udf_udaf.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_union.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_views.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_windowing.q PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_correctness.q.out d98cb5b 
  ql/src/test/results/clientpositive/cbo_gby.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_gby_empty.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_join.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_limit.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_semijoin.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_simple_select.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_stats.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_subq_exists.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_subq_in.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_subq_not_in.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_udf_udaf.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_union.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_views.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_windowing.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_correctness.q.out 2103d08 
  ql/src/test/results/clientpositive/tez/cbo_gby.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_gby_empty.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_join.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_limit.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_semijoin.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_simple_select.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_stats.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_subq_exists.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_subq_in.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_subq_not_in.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_udf_udaf.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_union.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_views.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_windowing.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/27401/diff/


Testing
---


Thanks,

Sergey Shelukhin



[jira] [Updated] (HIVE-8678) Pig fails to correctly load DATE fields using HCatalog

2014-10-30 Thread Michael McLellan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McLellan updated HIVE-8678:
---
Description: 
On:
Hadoop 2.5.0-cdh5.2.0 
Pig 0.12.0-cdh5.2.0
Hive 0.13.1-cdh5.2.0

When using pig -useHCatalog to load a Hive table that has a DATE field and then 
trying to DUMP that field, the following error occurs:

{code}
2014-10-30 22:58:05,469 [main] ERROR 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - 
org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
converting read value to tuple
at 
org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
at 
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
at 
org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at 
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
java.sql.Date
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420)
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457)
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375)
at 
org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)

2014-10-30 22:58:05,469 [main] ERROR 
org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting 
read value to tuple
{code}

It seems to be occurring here: 
https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java#L433

It seems like this should be:
{code}Date d = Date.valueOf((String) o);{code} 
instead of 
{code}Date d = (Date) o;{code}

  was:
On:
Hadoop 2.5.0-cdh5.2.0 
Pig 0.12.0-cdh5.2.0
Hive 0.13.1-cdh5.2.0

When using pig -useHCatalog to load a Hive table that has a DATE field and then 
trying to DUMP that field, the following error occurs:

{code}
2014-10-30 22:58:05,469 [main] ERROR 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - 
org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
converting read value to tuple
at 
org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
at 
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
at 
org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at 
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
java.sql.Date
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420)
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457)
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375)
at 
org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)

2014-10-30 22:58:05,469 [main] ERROR 
org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting 
read value to tuple
{code}

It seems to be occurring here: 
https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-

[jira] [Updated] (HIVE-8678) Pig fails to correctly load DATE fields using HCatalog

2014-10-30 Thread Michael McLellan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McLellan updated HIVE-8678:
---
Description: 
On:
Hadoop 2.5.0-cdh5.2.0 
Pig 0.12.0-cdh5.2.0
Hive 0.13.1-cdh5.2.0

When using pig -useHCatalog to load a Hive table that has a DATE field and then 
trying to DUMP that field, the following error occurs:

{code}
2014-10-30 22:58:05,469 [main] ERROR 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - 
org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
converting read value to tuple
at 
org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
at 
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
at 
org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at 
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
java.sql.Date
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420)
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457)
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375)
at 
org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)

2014-10-30 22:58:05,469 [main] ERROR 
org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting 
read value to tuple
{code}

It seems to be occurring here: 
https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java#L433

It seems like this should be:
Date d = Date.valueOf((String) o); 
instead of 
Date d = (Date) o;

  was:
On:
Hadoop 2.5.0-cdh5.2.0 
Pig 0.12.0-cdh5.2.0
Hive 0.13.1-cdh5.2.0

When using pig -useHCatalog to load a Hive table that has a DATE field and then 
trying to DUMP that field, the following error occurs:

2014-10-30 22:58:05,469 [main] ERROR 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - 
org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
converting read value to tuple
at 
org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
at 
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
at 
org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at 
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
java.sql.Date
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420)
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457)
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375)
at 
org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)

2014-10-30 22:58:05,469 [main] ERROR 
org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting 
read value to tuple

It seems to be occurring here: 
https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/h

Re: Review Request 27401: HIVE-8636 CBO: split cbo_correctness test

2014-10-30 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27401/
---

(Updated Oct. 30, 2014, 11:15 p.m.)


Review request for hive, Ashutosh Chauhan and John Pullokkaran.


Repository: hive-git


Description
---

See JIRA


Diffs
-

  data/scripts/q_test_cleanup.sql 8ec0f9f 
  data/scripts/q_test_init.sql 7484f0c 
  itests/src/test/resources/testconfiguration.properties 2c84a36 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
e3240ca 
  pom.xml bd74830 
  ql/src/test/queries/clientpositive/cbo_correctness.q bb328f6 
  ql/src/test/queries/clientpositive/cbo_gby.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_gby_empty.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_join.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_limit.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_semijoin.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_simple_select.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_stats.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_subq_exists.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_subq_in.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_subq_not_in.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_udf_udaf.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_union.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_views.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_windowing.q PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_correctness.q.out d98cb5b 
  ql/src/test/results/clientpositive/cbo_gby.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_gby_empty.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_join.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_limit.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_semijoin.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_simple_select.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_stats.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_subq_exists.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_subq_in.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_subq_not_in.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_udf_udaf.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_union.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_views.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_windowing.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_correctness.q.out 2103d08 
  ql/src/test/results/clientpositive/tez/cbo_gby.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_gby_empty.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_join.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_limit.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_semijoin.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_simple_select.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_stats.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_subq_exists.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_subq_in.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_subq_not_in.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_udf_udaf.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_union.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_views.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_windowing.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/27401/diff/


Testing
---


Thanks,

Sergey Shelukhin



Review Request 27401: HIVE-8636 CBO: split cbo_correctness test

2014-10-30 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27401/
---

Review request for hive, Ashutosh Chauhan and John Pullokkaran.


Repository: hive-git


Description
---

See JIRA


Diffs
-

  data/scripts/q_test_cleanup.sql 8ec0f9f 
  data/scripts/q_test_init.sql 7484f0c 
  itests/src/test/resources/testconfiguration.properties 2c84a36 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
e3240ca 
  pom.xml bd74830 
  ql/src/test/queries/clientpositive/cbo_correctness.q bb328f6 
  ql/src/test/queries/clientpositive/cbo_gby.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_gby_empty.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_join.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_limit.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_semijoin.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_simple_select.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_stats.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_subq_exists.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_subq_in.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_subq_not_in.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_udf_udaf.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_union.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_views.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_windowing.q PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_correctness.q.out d98cb5b 
  ql/src/test/results/clientpositive/cbo_gby.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_gby_empty.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_join.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_limit.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_semijoin.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_simple_select.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_stats.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_subq_exists.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_subq_in.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_subq_not_in.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_udf_udaf.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_union.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_views.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_windowing.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_correctness.q.out 2103d08 
  ql/src/test/results/clientpositive/tez/cbo_gby.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_gby_empty.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_join.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_limit.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_semijoin.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_simple_select.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_stats.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_subq_exists.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_subq_in.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_subq_not_in.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_udf_udaf.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_union.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_views.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/cbo_windowing.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/27401/diff/


Testing
---


Thanks,

Sergey Shelukhin



[jira] [Created] (HIVE-8678) Pig fails to correctly load DATE fields using HCatalog

2014-10-30 Thread Michael McLellan (JIRA)
Michael McLellan created HIVE-8678:
--

 Summary: Pig fails to correctly load DATE fields using HCatalog
 Key: HIVE-8678
 URL: https://issues.apache.org/jira/browse/HIVE-8678
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.1
Reporter: Michael McLellan


On:
Hadoop 2.5.0-cdh5.2.0 
Pig 0.12.0-cdh5.2.0
Hive 0.13.1-cdh5.2.0

When using pig -useHCatalog to load a Hive table that has a DATE field and then 
trying to DUMP that field, the following error occurs:

2014-10-30 22:58:05,469 [main] ERROR 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - 
org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
converting read value to tuple
at 
org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
at 
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
at 
org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at 
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
java.sql.Date
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420)
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457)
at 
org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375)
at 
org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)

2014-10-30 22:58:05,469 [main] ERROR 
org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting 
read value to tuple

It seems to be occurring here: 
https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java#L433

It seems like this should be:
Date d = Date.valueOf((String) o); 
instead of 
Date d = (Date) o;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8663) Fetching Vectorization scratch column map in Reduce-Side stop working

2014-10-30 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14190995#comment-14190995
 ] 

Matt McCline commented on HIVE-8663:


Thank you, Ashutosh, for your review comments. I have created a new patch 
incorporating the review changes.

> Fetching Vectorization scratch column map in Reduce-Side stop working
> -
>
> Key: HIVE-8663
> URL: https://issues.apache.org/jira/browse/HIVE-8663
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 0.14.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8663.01.patch, HIVE-8663.02.patch
>
>
> Recent changes (origin unclear) caused scratch column types not to be fetched 
> on the reduce side.
> The fix switches to using the scratch column types from VectorizationContext.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8663) Fetching Vectorization scratch column map in Reduce-Side stop working

2014-10-30 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8663:
---
Attachment: HIVE-8663.02.patch

> Fetching Vectorization scratch column map in Reduce-Side stop working
> -
>
> Key: HIVE-8663
> URL: https://issues.apache.org/jira/browse/HIVE-8663
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 0.14.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8663.01.patch, HIVE-8663.02.patch
>
>
> Recent changes (origin unclear) caused scratch column types not to be fetched 
> on the reduce side.
> The fix switches to using the scratch column types from VectorizationContext.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8663) Fetching Vectorization scratch column map in Reduce-Side stop working

2014-10-30 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8663:
---
Status: Patch Available  (was: In Progress)

> Fetching Vectorization scratch column map in Reduce-Side stop working
> -
>
> Key: HIVE-8663
> URL: https://issues.apache.org/jira/browse/HIVE-8663
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 0.14.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8663.01.patch, HIVE-8663.02.patch
>
>
> Recent changes (origin unclear) caused scratch column types not to be fetched 
> on the reduce side.
> The fix switches to using the scratch column types from VectorizationContext.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8663) Fetching Vectorization scratch column map in Reduce-Side stop working

2014-10-30 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8663:
---
Status: In Progress  (was: Patch Available)

> Fetching Vectorization scratch column map in Reduce-Side stop working
> -
>
> Key: HIVE-8663
> URL: https://issues.apache.org/jira/browse/HIVE-8663
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 0.14.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8663.01.patch
>
>
> Recent changes (origin unclear) caused scratch column types not to be fetched 
> on the reduce side.
> The fix switches to using the scratch column types from VectorizationContext.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8498) Insert into table misses some rows when vectorization is enabled

2014-10-30 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14190982#comment-14190982
 ] 

Matt McCline commented on HIVE-8498:


This follows a similar theme to VectorSelectOperator restoring the projection 
arrays afterward.

+1 (non-binding).
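
For reference, a simplified sketch of that save/restore pattern (not the 
actual patch; the batch fields are real VectorizedRowBatch members, but the 
surrounding operator code is boiled down):
{code}
// Sketch: overwrite the batch's projection for this operator's children,
// then restore the original projection so siblings see the batch unchanged.
int[] savedColumns = batch.projectedColumns;
int savedSize = batch.projectionSize;
try {
  batch.projectedColumns = outputProjection;      // this operator's columns
  batch.projectionSize = outputProjection.length;
  forward(batch, outputObjInspector);             // children consume it
} finally {
  batch.projectedColumns = savedColumns;          // put the projection back
  batch.projectionSize = savedSize;
}
{code}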

> Insert into table misses some rows when vectorization is enabled
> 
>
> Key: HIVE-8498
> URL: https://issues.apache.org/jira/browse/HIVE-8498
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Prasanth J
>Assignee: Jitendra Nath Pandey
>Priority: Critical
>  Labels: vectorization
> Attachments: HIVE-8498.01.patch, HIVE-8498.02.patch, HIVE-8498.3.patch
>
>
>  Following is a small reproducible case for the issue
> create table orc1
>   stored as orc
>   tblproperties("orc.compress"="ZLIB")
>   as
> select rn
> from
> (
>   select cast(1 as int) as rn from src limit 1
>   union all
>   select cast(100 as int) as rn from src limit 1
>   union all
>   select cast(1 as int) as rn from src limit 1
> ) t;
> create table orc_rn1 (rn int);
> create table orc_rn2 (rn int);
> create table orc_rn3 (rn int);
> // These inserts should produce 3 rows but only 1 row is produced
> from orc1 a
> insert overwrite table orc_rn1 select a.* where a.rn < 100
> insert overwrite table orc_rn2 select a.* where a.rn >= 100 and a.rn < 1000
> insert overwrite table orc_rn3 select a.* where a.rn >= 1000;
> select * from orc_rn1
> union all
> select * from orc_rn2
> union all
> select * from orc_rn3;
> The expected output of the query is
> 1
> 100
> 1
> But with vectorization enabled we get
> 1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8665) Fix misc unit tests on Windows

2014-10-30 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14190980#comment-14190980
 ] 

Thejas M Nair commented on HIVE-8665:
-

+1

> Fix misc unit tests on Windows
> --
>
> Key: HIVE-8665
> URL: https://issues.apache.org/jira/browse/HIVE-8665
> Project: Hive
>  Issue Type: Bug
>  Components: Windows
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-8665.1.patch
>
>
> Several JUnit tests fail on Windows for miscellaneous reasons (path issues, 
> resources that need to be closed before a file can be deleted, etc.).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27369: HIVE-8665 Fix misc unit tests on Windows

2014-10-30 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27369/#review59273
---

Ship it!


Ship It!

- Thejas Nair


On Oct. 30, 2014, 2:01 a.m., Jason Dere wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27369/
> ---
> 
> (Updated Oct. 30, 2014, 2:01 a.m.)
> 
> 
> Review request for hive and Thejas Nair.
> 
> 
> Bugs: HIVE-8665
> https://issues.apache.org/jira/browse/HIVE-8665
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> misc test fixes to allow tests to pass on Windows
> 
> 
> Diffs
> -
> 
>   common/src/test/org/apache/hadoop/hive/conf/TestHiveLogging.java 80f3a12 
>   common/src/test/resources/hive-exec-log4j-test.properties 839a9ca 
>   common/src/test/resources/hive-log4j-test.properties 51acda2 
>   
> hcatalog/core/src/test/java/org/apache/hive/hcatalog/cli/TestUseDatabase.java 
> 8868623 
>   
> hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatLoaderStorer.java
>  7162584 
>   hcatalog/webhcat/java-client/pom.xml ebef9f1 
>   
> hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java
>  bc90ffe 
>   
> itests/hive-unit-hadoop2/src/test/java/org/apache/hadoop/hive/ql/security/TestPasswordWithCredentialProvider.java
>  f9b698e 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
>  95f1c39 
>   itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 
> 6a18b9a 
>   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestTypeInfoToSchema.java 
> c6b5cb6 
> 
> Diff: https://reviews.apache.org/r/27369/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jason Dere
> 
>


