[jira] [Work started] (HIVE-8128) Improve Parquet Vectorization

2015-02-13 Thread Dong Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-8128 started by Dong Chen.
---
> Improve Parquet Vectorization
> -
>
> Key: HIVE-8128
> URL: https://issues.apache.org/jira/browse/HIVE-8128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Brock Noland
>Assignee: Dong Chen
>
> What we'll want to do is finish the vectorization work (e.g. 
> VectorizedOrcSerde) that was partially done in HIVE-5998.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8128) Improve Parquet Vectorization

2015-02-13 Thread Dong Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319723#comment-14319723
 ] 

Dong Chen commented on HIVE-8128:
-

Will start from a POC based on the new vectorized Parquet API at 
https://github.com/zhenxiao/incubator-parquet-mr/pull/1

> Improve Parquet Vectorization
> -
>
> Key: HIVE-8128
> URL: https://issues.apache.org/jira/browse/HIVE-8128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Brock Noland
>Assignee: Dong Chen
>
> What we'll want to do is finish the vectorization work (e.g. 
> VectorizedOrcSerde) that was partially done in HIVE-5998.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-9635) LLAP: I'm the decider

2015-02-13 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner resolved HIVE-9635.
--
Resolution: Fixed

Committed to branch.

> LLAP: I'm the decider
> -
>
> Key: HIVE-9635
> URL: https://issues.apache.org/jira/browse/HIVE-9635
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-9635.1.patch, HIVE-9635.2.patch
>
>
> https://www.youtube.com/watch?v=r8VbzrZ9yHQ
> Physical optimizer to choose what to run inside/outside LLAP. It first tests 
> whether user code has to be shipped, then whether the specific query fragment 
> is suitable to run.
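
A minimal sketch of that two-step check (illustrative only; QueryFragment and 
its methods are hypothetical names, not the committed optimizer's API):

{code}
// Illustrative sketch; QueryFragment stands in for the planner's fragment
// representation and is not Hive's actual API.
interface QueryFragment {
  boolean requiresUserCode();   // e.g. custom UDFs/SerDes that must be shipped
  boolean isSuitableForLlap();  // e.g. vectorizable, supported input formats
}

static boolean runInsideLlap(QueryFragment fragment) {
  // Test 1: if user code would have to be shipped into the daemon, run outside.
  if (fragment.requiresUserCode()) {
    return false;
  }
  // Test 2: only fragments the daemon can execute well run inside LLAP.
  return fragment.isSuitableForLlap();
}
{code}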



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9425) External Function Jar files are not available for Driver when running with yarn-cluster mode [Spark Branch]

2015-02-13 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9425:
-
Attachment: HIVE-9425.1-spark.patch

Upload an initial patch on behalf of Chengxiang.
[~zhos] and [~xhao1], please help verify whether this solves your problems. 
Thanks!

> External Function Jar files are not available for Driver when running with 
> yarn-cluster mode [Spark Branch]
> ---
>
> Key: HIVE-9425
> URL: https://issues.apache.org/jira/browse/HIVE-9425
> Project: Hive
>  Issue Type: Sub-task
>  Components: spark-branch
>Reporter: Xiaomin Zhang
>Assignee: Rui Li
> Attachments: HIVE-9425.1-spark.patch
>
>
> 15/01/20 00:27:31 INFO cluster.YarnClusterScheduler: 
> YarnClusterScheduler.postStartHook done
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
> (java.io.FileNotFoundException: hive-exec-0.15.0-SNAPSHOT.jar (No such file 
> or directory)), was the --addJars option used?
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
> (java.io.FileNotFoundException: opennlp-maxent-3.0.3.jar (No such file or 
> directory)), was the --addJars option used?
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
> (java.io.FileNotFoundException: bigbenchqueriesmr.jar (No such file or 
> directory)), was the --addJars option used?
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
> (java.io.FileNotFoundException: opennlp-tools-1.5.3.jar (No such file or 
> directory)), was the --addJars option used?
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
> (java.io.FileNotFoundException: jcl-over-slf4j-1.7.5.jar (No such file or 
> directory)), was the --addJars option used?
> 15/01/20 00:27:31 INFO client.RemoteDriver: Received job request 
> fef081b0-5408-4804-9531-d131fdd628e6
> 15/01/20 00:27:31 INFO Configuration.deprecation: mapred.max.split.size is 
> deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
> 15/01/20 00:27:31 INFO Configuration.deprecation: mapred.min.split.size is 
> deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
> 15/01/20 00:27:31 INFO client.RemoteDriver: Failed to run job 
> fef081b0-5408-4804-9531-d131fdd628e6
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: de.bankmark.bigbench.queries.q10.SentimentUDF
> Serialization trace:
> genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
> conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> invertedWorkGraph (org.apache.hadoop.hive.ql.plan.SparkWork)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> It seems the additional Jar files are not uploaded to the DistributedCache, 
> so the Driver cannot access them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9425) External Function Jar files are not available for Driver when running with yarn-cluster mode [Spark Branch]

2015-02-13 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9425:
-
Description: 
{noformat}
15/01/20 00:27:31 INFO cluster.YarnClusterScheduler: 
YarnClusterScheduler.postStartHook done
15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
(java.io.FileNotFoundException: hive-exec-0.15.0-SNAPSHOT.jar (No such file or 
directory)), was the --addJars option used?
15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
(java.io.FileNotFoundException: opennlp-maxent-3.0.3.jar (No such file or 
directory)), was the --addJars option used?
15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
(java.io.FileNotFoundException: bigbenchqueriesmr.jar (No such file or 
directory)), was the --addJars option used?
15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
(java.io.FileNotFoundException: opennlp-tools-1.5.3.jar (No such file or 
directory)), was the --addJars option used?
15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
(java.io.FileNotFoundException: jcl-over-slf4j-1.7.5.jar (No such file or 
directory)), was the --addJars option used?
15/01/20 00:27:31 INFO client.RemoteDriver: Received job request 
fef081b0-5408-4804-9531-d131fdd628e6
15/01/20 00:27:31 INFO Configuration.deprecation: mapred.max.split.size is 
deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
15/01/20 00:27:31 INFO Configuration.deprecation: mapred.min.split.size is 
deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
15/01/20 00:27:31 INFO client.RemoteDriver: Failed to run job 
fef081b0-5408-4804-9531-d131fdd628e6
org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: 
de.bankmark.bigbench.queries.q10.SentimentUDF
Serialization trace:
genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
invertedWorkGraph (org.apache.hadoop.hive.ql.plan.SparkWork)
at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
{noformat}

It seems the additional Jar files are not uploaded to the DistributedCache, so 
the Driver cannot access them.


  was:
15/01/20 00:27:31 INFO cluster.YarnClusterScheduler: 
YarnClusterScheduler.postStartHook done
15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
(java.io.FileNotFoundException: hive-exec-0.15.0-SNAPSHOT.jar (No such file or 
directory)), was the --addJars option used?
15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
(java.io.FileNotFoundException: opennlp-maxent-3.0.3.jar (No such file or 
directory)), was the --addJars option used?
15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
(java.io.FileNotFoundException: bigbenchqueriesmr.jar (No such file or 
directory)), was the --addJars option used?
15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
(java.io.FileNotFoundException: opennlp-tools-1.5.3.jar (No such file or 
directory)), was the --addJars option used?
15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
(java.io.FileNotFoundException: jcl-over-slf4j-1.7.5.jar (No such file or 
directory)), was the --addJars option used?
15/01/20 00:27:31 INFO client.RemoteDriver: Received job request 
fef081b0-5408-4804-9531-d131fdd628e6
15/01/20 00:27:31 INFO Configuration.deprecation: mapred.max.split.size is 
deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
15/01/20 00:27:31 INFO Configuration.deprecation: mapred.min.split.size is 
deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
15/01/20 00:27:31 INFO client.RemoteDriver: Failed to run job 
fef081b0-5408-4804-9531-d131fdd628e6
org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: 
de.bankmark.bigbench.queries.q10.SentimentUDF
Serialization trace:
genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
invertedWorkGraph (org.apache.hadoop.hive.ql.plan.SparkWork)
at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)

It seems the additional Jar files are not uploaded to the DistributedCache, so 
the Driver cannot access them.



> External Function Jar files are not available for Driver when running with 
> 

[jira] [Updated] (HIVE-9425) External Function Jar files are not available for Driver when running with yarn-cluster mode [Spark Branch]

2015-02-13 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9425:
-
Status: Patch Available  (was: Open)

> External Function Jar files are not available for Driver when running with 
> yarn-cluster mode [Spark Branch]
> ---
>
> Key: HIVE-9425
> URL: https://issues.apache.org/jira/browse/HIVE-9425
> Project: Hive
>  Issue Type: Sub-task
>  Components: spark-branch
>Reporter: Xiaomin Zhang
>Assignee: Rui Li
> Attachments: HIVE-9425.1-spark.patch
>
>
> {noformat}
> 15/01/20 00:27:31 INFO cluster.YarnClusterScheduler: 
> YarnClusterScheduler.postStartHook done
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
> (java.io.FileNotFoundException: hive-exec-0.15.0-SNAPSHOT.jar (No such file 
> or directory)), was the --addJars option used?
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
> (java.io.FileNotFoundException: opennlp-maxent-3.0.3.jar (No such file or 
> directory)), was the --addJars option used?
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
> (java.io.FileNotFoundException: bigbenchqueriesmr.jar (No such file or 
> directory)), was the --addJars option used?
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
> (java.io.FileNotFoundException: opennlp-tools-1.5.3.jar (No such file or 
> directory)), was the --addJars option used?
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
> (java.io.FileNotFoundException: jcl-over-slf4j-1.7.5.jar (No such file or 
> directory)), was the --addJars option used?
> 15/01/20 00:27:31 INFO client.RemoteDriver: Received job request 
> fef081b0-5408-4804-9531-d131fdd628e6
> 15/01/20 00:27:31 INFO Configuration.deprecation: mapred.max.split.size is 
> deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
> 15/01/20 00:27:31 INFO Configuration.deprecation: mapred.min.split.size is 
> deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
> 15/01/20 00:27:31 INFO client.RemoteDriver: Failed to run job 
> fef081b0-5408-4804-9531-d131fdd628e6
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: de.bankmark.bigbench.queries.q10.SentimentUDF
> Serialization trace:
> genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
> conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> invertedWorkGraph (org.apache.hadoop.hive.ql.plan.SparkWork)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> {noformat}
> It seems the additional Jar files are not uploaded to the DistributedCache, 
> so the Driver cannot access them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9425) Add jar/file doesn't work with yarn-cluster mode [Spark Branch]

2015-02-13 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9425:
-
Summary: Add jar/file doesn't work with yarn-cluster mode [Spark Branch]  
(was: External Function Jar files are not available for Driver when running 
with yarn-cluster mode [Spark Branch])

> Add jar/file doesn't work with yarn-cluster mode [Spark Branch]
> ---
>
> Key: HIVE-9425
> URL: https://issues.apache.org/jira/browse/HIVE-9425
> Project: Hive
>  Issue Type: Sub-task
>  Components: spark-branch
>Reporter: Xiaomin Zhang
>Assignee: Rui Li
> Attachments: HIVE-9425.1-spark.patch
>
>
> {noformat}
> 15/01/20 00:27:31 INFO cluster.YarnClusterScheduler: 
> YarnClusterScheduler.postStartHook done
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
> (java.io.FileNotFoundException: hive-exec-0.15.0-SNAPSHOT.jar (No such file 
> or directory)), was the --addJars option used?
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
> (java.io.FileNotFoundException: opennlp-maxent-3.0.3.jar (No such file or 
> directory)), was the --addJars option used?
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
> (java.io.FileNotFoundException: bigbenchqueriesmr.jar (No such file or 
> directory)), was the --addJars option used?
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
> (java.io.FileNotFoundException: opennlp-tools-1.5.3.jar (No such file or 
> directory)), was the --addJars option used?
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
> (java.io.FileNotFoundException: jcl-over-slf4j-1.7.5.jar (No such file or 
> directory)), was the --addJars option used?
> 15/01/20 00:27:31 INFO client.RemoteDriver: Received job request 
> fef081b0-5408-4804-9531-d131fdd628e6
> 15/01/20 00:27:31 INFO Configuration.deprecation: mapred.max.split.size is 
> deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
> 15/01/20 00:27:31 INFO Configuration.deprecation: mapred.min.split.size is 
> deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
> 15/01/20 00:27:31 INFO client.RemoteDriver: Failed to run job 
> fef081b0-5408-4804-9531-d131fdd628e6
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: de.bankmark.bigbench.queries.q10.SentimentUDF
> Serialization trace:
> genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
> conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> invertedWorkGraph (org.apache.hadoop.hive.ql.plan.SparkWork)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> {noformat}
> It seems the additional Jar files are not uploaded to the DistributedCache, 
> so the Driver cannot access them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9680) GlobalLimitOptimizer is not checking filters correctly

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319756#comment-14319756
 ] 

Hive QA commented on HIVE-9680:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698595/HIVE-9680.1.patch.txt

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7542 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2789/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2789/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2789/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698595 - PreCommit-HIVE-TRUNK-Build

> GlobalLimitOptimizer is not checking filters correctly 
> ---
>
> Key: HIVE-9680
> URL: https://issues.apache.org/jira/browse/HIVE-9680
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9680.1.patch.txt
>
>
> Some predicates may not be included in opToPartPruner.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-2573) Create per-session function registry

2015-02-13 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-2573:
-
Labels: TODOC1.2  (was: )

> Create per-session function registry 
> -
>
> Key: HIVE-2573
> URL: https://issues.apache.org/jira/browse/HIVE-2573
> Project: Hive
>  Issue Type: Improvement
>  Components: Server Infrastructure
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
>  Labels: TODOC1.2
> Fix For: 1.2.0
>
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, 
> HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, 
> HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, 
> HIVE-2573.15.patch.txt, HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, 
> HIVE-2573.4.patch.txt, HIVE-2573.5.patch, HIVE-2573.6.patch, 
> HIVE-2573.7.patch, HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt
>
>
> Currently the function registry is a shared resource and could be overridden 
> by other users when using HiveServer. Providing a per-session function 
> registry would prevent this.
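
A minimal sketch of the lookup order a per-session registry enables 
(illustrative only, not Hive's actual Registry code):

{code}
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.hive.ql.exec.FunctionInfo;

// Sketch: session-local functions shadow the shared system registry, so one
// HiveServer session's CREATE/DROP FUNCTION cannot affect other sessions.
class SessionFunctionRegistry {
  private final Map<String, FunctionInfo> sessionRegistry =
      new HashMap<String, FunctionInfo>();
  private final Map<String, FunctionInfo> systemRegistry;

  SessionFunctionRegistry(Map<String, FunctionInfo> systemRegistry) {
    this.systemRegistry = systemRegistry;
  }

  FunctionInfo resolveFunction(String name) {
    FunctionInfo fn = sessionRegistry.get(name); // per-session overrides first
    return (fn != null) ? fn : systemRegistry.get(name); // shared built-ins
  }
}
{code}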



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9561) SHUFFLE_SORT should only be used for order by query [Spark Branch]

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319776#comment-14319776
 ] 

Hive QA commented on HIVE-9561:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698669/HIVE-9561.3-spark.patch

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 7471 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby7_noskew_multi_single_reducer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_multi_single_reducer3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel_join0
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_covar_samp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union4
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union4
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/724/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/724/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-724/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698669 - PreCommit-HIVE-SPARK-Build

> SHUFFLE_SORT should only be used for order by query [Spark Branch]
> --
>
> Key: HIVE-9561
> URL: https://issues.apache.org/jira/browse/HIVE-9561
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-9561.1-spark.patch, HIVE-9561.2-spark.patch, 
> HIVE-9561.3-spark.patch
>
>
> The {{sortByKey}} shuffle launches probe jobs. Such jobs can hurt performance 
> and are difficult to control, so we should limit the use of {{sortByKey}} to 
> order by queries only.
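
For reference, a sketch of the trade-off in Spark's Java API (illustrative, 
not Hive's planner code; rdd and numPartitions are assumed to be in scope):

{code}
import org.apache.hadoop.io.BytesWritable;
import org.apache.spark.HashPartitioner;
import org.apache.spark.api.java.JavaPairRDD;

// sortByKey builds a RangePartitioner by sampling the input first (that is
// the probe job), so it should be reserved for queries that need a total
// order, i.e. ORDER BY.
JavaPairRDD<BytesWritable, BytesWritable> totalOrder = rdd.sortByKey();

// When no total order is needed (e.g. SORT BY / CLUSTER BY), a hash
// repartition with a per-partition sort avoids the sampling pass.
JavaPairRDD<BytesWritable, BytesWritable> clustered =
    rdd.repartitionAndSortWithinPartitions(new HashPartitioner(numPartitions));
{code}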



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-2573) Create per-session function registry

2015-02-13 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319779#comment-14319779
 ] 

Lefty Leverenz commented on HIVE-2573:
--

Doc note:  This adds "Function" to the description of 
*hive.exec.drop.ignorenonexistent* in 1.2.0, so the wiki needs to be updated 
(with version information).  By the way, HIVE-3781 added "Index" to the 
description in 1.1.0.

* [Configuration Properties -- hive.exec.drop.ignorenonexistent | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.exec.drop.ignorenonexistent]

What other documentation does this need?  Should there be a release note?

> Create per-session function registry 
> -
>
> Key: HIVE-2573
> URL: https://issues.apache.org/jira/browse/HIVE-2573
> Project: Hive
>  Issue Type: Improvement
>  Components: Server Infrastructure
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
>  Labels: TODOC1.2
> Fix For: 1.2.0
>
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, 
> HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, 
> HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, 
> HIVE-2573.15.patch.txt, HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, 
> HIVE-2573.4.patch.txt, HIVE-2573.5.patch, HIVE-2573.6.patch, 
> HIVE-2573.7.patch, HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt
>
>
> Currently the function registry is a shared resource and could be overridden 
> by other users when using HiveServer. Providing a per-session function 
> registry would prevent this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9667) Disable ORC bloom filters for ORC v11 output-format

2015-02-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-9667:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks [~gopalv] for the patch!

> Disable ORC bloom filters for ORC v11 output-format
> ---
>
> Key: HIVE-9667
> URL: https://issues.apache.org/jira/browse/HIVE-9667
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.2.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: HIVE-9667.1.patch
>
>
> ORC column bloom filters should only be written if the file format is 0.12+.
> The older format should not write out the metadata streams for bloom filters.
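
A minimal sketch of the intended version gate (illustrative, not the committed 
patch; 'version' is assumed writer state):

{code}
import org.apache.hadoop.hive.ql.io.orc.OrcFile;

// Bloom filter metadata streams are only meaningful to ORC 0.12+ readers,
// so v11 output must not emit them.
boolean writeBloomFilters = (version != OrcFile.Version.V_0_11);
{code}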



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9684) Incorrect disk range computation in ORC because of optional stream kind

2015-02-13 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-9684:
---

 Summary: Incorrect disk range computation in ORC because of 
optional stream kind
 Key: HIVE-9684
 URL: https://issues.apache.org/jira/browse/HIVE-9684
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.0.1
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
Priority: Critical


HIVE-9593 changed all required fields in the ORC protobuf message to optional 
fields. But the DiskRange computation and stream creation code assumes the 
stream kind exists everywhere. This leads to incorrect calculation of disk 
ranges, resulting in out-of-range exceptions. The proper fix is to check 
whether the stream kind exists using stream.hasKind() before adding the stream 
to the disk range computation.
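
A minimal sketch of that guard (illustrative only; stripeFooter is assumed in 
scope and addToDiskRanges is a hypothetical helper standing in for the real 
bookkeeping):

{code}
import org.apache.hadoop.hive.ql.io.orc.OrcProto;

// Only streams whose kind is present participate in disk-range planning;
// the kind is now an optional protobuf field and may be absent.
for (OrcProto.Stream stream : stripeFooter.getStreamsList()) {
  if (stream.hasKind()) {
    addToDiskRanges(stream.getKind(), stream.getLength());
  }
}
{code}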



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9684) Incorrect disk range computation in ORC because of optional stream kind

2015-02-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-9684:

Attachment: HIVE-9684.branch-1.0.patch

> Incorrect disk range computation in ORC because of optional stream kind
> ---
>
> Key: HIVE-9684
> URL: https://issues.apache.org/jira/browse/HIVE-9684
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.0.1
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-9684.branch-1.0.patch
>
>
> HIVE-9593 changed all required fields in the ORC protobuf message to optional 
> fields. But the DiskRange computation and stream creation code assumes the 
> stream kind exists everywhere. This leads to incorrect calculation of disk 
> ranges, resulting in out-of-range exceptions. The proper fix is to check 
> whether the stream kind exists using stream.hasKind() before adding the 
> stream to the disk range computation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9684) Incorrect disk range computation in ORC because of optional stream kind

2015-02-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-9684:

Attachment: HIVE-9684.branch-1.1.patch

> Incorrect disk range computation in ORC because of optional stream kind
> ---
>
> Key: HIVE-9684
> URL: https://issues.apache.org/jira/browse/HIVE-9684
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.0.1
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-9684.branch-1.0.patch, HIVE-9684.branch-1.1.patch
>
>
> HIVE-9593 changed all required fields in the ORC protobuf message to optional 
> fields. But the DiskRange computation and stream creation code assumes the 
> stream kind exists everywhere. This leads to incorrect calculation of disk 
> ranges, resulting in out-of-range exceptions. The proper fix is to check 
> whether the stream kind exists using stream.hasKind() before adding the 
> stream to the disk range computation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9638) Drop Index does not check Index or Table exist or not

2015-02-13 Thread Chinna Rao Lalam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319806#comment-14319806
 ] 

Chinna Rao Lalam commented on HIVE-9638:


Hi,

In Hive 0.7.0 or later, DROP INDEX returns an error if the index doesn't 
exist, unless IF EXISTS is specified or the configuration variable 
hive.exec.drop.ignorenonexistent is set to true.

> Drop Index does not check Index or Table exist or not
> --
>
> Key: HIVE-9638
> URL: https://issues.apache.org/jira/browse/HIVE-9638
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 0.11.0, 0.13.0, 0.14.0, 1.0.0
>Reporter: Will Du
>
> DROP INDEX index_name ON table_name;
> The statement will always be successful no matter whether the index_name or 
> table_name exists.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9684) Incorrect disk range computation in ORC because of optional stream kind

2015-02-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-9684:

Affects Version/s: (was: 1.2.0)

> Incorrect disk range computation in ORC because of optional stream kind
> ---
>
> Key: HIVE-9684
> URL: https://issues.apache.org/jira/browse/HIVE-9684
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.0.0, 1.1.0, 1.0.1
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-9684.branch-1.0.patch, HIVE-9684.branch-1.1.patch
>
>
> HIVE-9593 changed all required fields in the ORC protobuf message to optional 
> fields. But the DiskRange computation and stream creation code assumes the 
> stream kind exists everywhere. This leads to incorrect calculation of disk 
> ranges, resulting in out-of-range exceptions. The proper fix is to check 
> whether the stream kind exists using stream.hasKind() before adding the 
> stream to the disk range computation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9684) Incorrect disk range computation in ORC because of optional stream kind

2015-02-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-9684:

Status: Patch Available  (was: Open)

> Incorrect disk range computation in ORC because of optional stream kind
> ---
>
> Key: HIVE-9684
> URL: https://issues.apache.org/jira/browse/HIVE-9684
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.0.0, 1.1.0, 1.0.1
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-9684.branch-1.0.patch, HIVE-9684.branch-1.1.patch
>
>
> HIVE-9593 changed all required fields in the ORC protobuf message to optional 
> fields. But the DiskRange computation and stream creation code assumes the 
> stream kind exists everywhere. This leads to incorrect calculation of disk 
> ranges, resulting in out-of-range exceptions. The proper fix is to check 
> whether the stream kind exists using stream.hasKind() before adding the 
> stream to the disk range computation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9684) Incorrect disk range computation in ORC because of optional stream kind

2015-02-13 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319809#comment-14319809
 ] 

Prasanth Jayachandran commented on HIVE-9684:
-

[~gopalv]/[~owen.omalley] Can someone review this patch?

> Incorrect disk range computation in ORC because of optional stream kind
> ---
>
> Key: HIVE-9684
> URL: https://issues.apache.org/jira/browse/HIVE-9684
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.0.0, 1.1.0, 1.0.1
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-9684.branch-1.0.patch, HIVE-9684.branch-1.1.patch
>
>
> HIVE-9593 changed all required fields in the ORC protobuf message to optional 
> fields. But the DiskRange computation and stream creation code assumes the 
> stream kind exists everywhere. This leads to incorrect calculation of disk 
> ranges, resulting in out-of-range exceptions. The proper fix is to check 
> whether the stream kind exists using stream.hasKind() before adding the 
> stream to the disk range computation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9645) Constant folding case NULL equality

2015-02-13 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-9645:
---
Status: Open  (was: Patch Available)

> Constant folding case NULL equality
> ---
>
> Key: HIVE-9645
> URL: https://issues.apache.org/jira/browse/HIVE-9645
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.2.0
>Reporter: Gopal V
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-9645.1.patch, HIVE-9645.patch
>
>
> The Hive logical optimizer does not follow the null-scan codepath when 
> encountering a NULL = 1 predicate;
> NULL = 1 is not evaluated as false in the constant propagation implementation.
> {code}
> hive> explain select count(1) from store_sales where null=1;
> ...
>  TableScan
>   alias: store_sales
>   filterExpr: (null = 1) (type: boolean)
>   Statistics: Num rows: 550076554 Data size: 49570324480 
> Basic stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (null = 1) (type: boolean)
> Statistics: Num rows: 275038277 Data size: 0 Basic stats: 
> PARTIAL Column stats: COMPLETE
> {code}
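
A minimal sketch of the folding rule this calls for (the helper names are 
hypothetical, not Hive's actual ConstantPropagate code): an equality against a 
NULL literal can never evaluate to true, so the predicate can be folded to a 
FALSE constant, which lets the null-scan codepath kick in.

{code}
import org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc;
import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;

// If either operand of '=' is a NULL constant, fold the predicate to FALSE.
// isEqualityOp and hasNullConstantOperand are hypothetical helpers.
if (isEqualityOp(expr) && hasNullConstantOperand(expr)) {
  return new ExprNodeConstantDesc(TypeInfoFactory.booleanTypeInfo, Boolean.FALSE);
}
{code}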



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9645) Constant folding case NULL equality

2015-02-13 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-9645:
---
Status: Patch Available  (was: Open)

> Constant folding case NULL equality
> ---
>
> Key: HIVE-9645
> URL: https://issues.apache.org/jira/browse/HIVE-9645
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.2.0
>Reporter: Gopal V
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-9645.1.patch, HIVE-9645.patch
>
>
> The Hive logical optimizer does not follow the null-scan codepath when 
> encountering a NULL = 1 predicate;
> NULL = 1 is not evaluated as false in the constant propagation implementation.
> {code}
> hive> explain select count(1) from store_sales where null=1;
> ...
>  TableScan
>   alias: store_sales
>   filterExpr: (null = 1) (type: boolean)
>   Statistics: Num rows: 550076554 Data size: 49570324480 
> Basic stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (null = 1) (type: boolean)
> Statistics: Num rows: 275038277 Data size: 0 Basic stats: 
> PARTIAL Column stats: COMPLETE
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9645) Constant folding case NULL equality

2015-02-13 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-9645:
---
Attachment: HIVE-9645.1.patch

Fixed test cases.

> Constant folding case NULL equality
> ---
>
> Key: HIVE-9645
> URL: https://issues.apache.org/jira/browse/HIVE-9645
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.2.0
>Reporter: Gopal V
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-9645.1.patch, HIVE-9645.patch
>
>
> The Hive logical optimizer does not follow the null-scan codepath when 
> encountering a NULL = 1 predicate;
> NULL = 1 is not evaluated as false in the constant propagation implementation.
> {code}
> hive> explain select count(1) from store_sales where null=1;
> ...
>  TableScan
>   alias: store_sales
>   filterExpr: (null = 1) (type: boolean)
>   Statistics: Num rows: 550076554 Data size: 49570324480 
> Basic stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (null = 1) (type: boolean)
> Statistics: Num rows: 275038277 Data size: 0 Basic stats: 
> PARTIAL Column stats: COMPLETE
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9655) Dynamic partition table insertion error

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319822#comment-14319822
 ] 

Hive QA commented on HIVE-9655:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698598/HIVE-9655.2.patch

{color:green}SUCCESS:{color} +1 7543 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2790/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2790/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2790/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698598 - PreCommit-HIVE-TRUNK-Build

> Dynamic partition table insertion error
> ---
>
> Key: HIVE-9655
> URL: https://issues.apache.org/jira/browse/HIVE-9655
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.1
>Reporter: Chao
>Assignee: Chao
> Attachments: HIVE-9655.1.patch, HIVE-9655.2.patch
>
>
> We have these two tables:
> {code}
> create table t1 (c1 bigint, c2 string);
> CREATE TABLE t2 (c1 int, c2 string)
> PARTITIONED BY (p1 string);
> load data local inpath 'data' into table t1;
> load data local inpath 'data' into table t1;
> load data local inpath 'data' into table t1;
> load data local inpath 'data' into table t1;
> load data local inpath 'data' into table t1;
> {code}
> But when trying to insert into table t2 from t1:
> {code}
> SET hive.exec.dynamic.partition.mode=nonstrict;
> insert overwrite table t2 partition(p1) select *,c1 as p1 from t1 distribute 
> by p1;
> {code}
> The query failed with the following exception:
> {noformat}
> 2015-02-11 12:50:52,756 ERROR [LocalJobRunner Map Task Executor #0]: 
> mr.ExecMapper (ExecMapper.java:map(178)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"c1":1,"c2":"one"}
>   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:503)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.RuntimeException: cannot find field _col2 from [0:_col0, 1:_col1]
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:397)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:493)
>   ... 10 more
> Caused by: java.lang.RuntimeException: cannot find field _col2 from [0:_col0, 
> 1:_col1]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:410)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:147)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:55)
>   at org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:954)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:325)
>   ... 16 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7759) document hive cli authorization behavior when SQL std auth is enabled

2015-02-13 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-7759:
-
Labels:   (was: TODOC14)

> document hive cli authorization behavior when SQL std auth is enabled
> -
>
> Key: HIVE-7759
> URL: https://issues.apache.org/jira/browse/HIVE-7759
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 0.13.0, 0.14.0, 0.13.1
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
>
> There should be a section in the SQL standard auth doc that highlights how 
> hive-cli behaves with SQL standard authorization turned on.
> Changes in HIVE-7533 and HIVE-7209 should be documented as part of it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9425) Add jar/file doesn't work with yarn-cluster mode [Spark Branch]

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319919#comment-14319919
 ] 

Hive QA commented on HIVE-9425:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698673/HIVE-9425.1-spark.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7471 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/725/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/725/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-725/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698673 - PreCommit-HIVE-SPARK-Build

> Add jar/file doesn't work with yarn-cluster mode [Spark Branch]
> ---
>
> Key: HIVE-9425
> URL: https://issues.apache.org/jira/browse/HIVE-9425
> Project: Hive
>  Issue Type: Sub-task
>  Components: spark-branch
>Reporter: Xiaomin Zhang
>Assignee: Rui Li
> Attachments: HIVE-9425.1-spark.patch
>
>
> {noformat}
> 15/01/20 00:27:31 INFO cluster.YarnClusterScheduler: 
> YarnClusterScheduler.postStartHook done
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
> (java.io.FileNotFoundException: hive-exec-0.15.0-SNAPSHOT.jar (No such file 
> or directory)), was the --addJars option used?
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
> (java.io.FileNotFoundException: opennlp-maxent-3.0.3.jar (No such file or 
> directory)), was the --addJars option used?
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
> (java.io.FileNotFoundException: bigbenchqueriesmr.jar (No such file or 
> directory)), was the --addJars option used?
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
> (java.io.FileNotFoundException: opennlp-tools-1.5.3.jar (No such file or 
> directory)), was the --addJars option used?
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
> (java.io.FileNotFoundException: jcl-over-slf4j-1.7.5.jar (No such file or 
> directory)), was the --addJars option used?
> 15/01/20 00:27:31 INFO client.RemoteDriver: Received job request 
> fef081b0-5408-4804-9531-d131fdd628e6
> 15/01/20 00:27:31 INFO Configuration.deprecation: mapred.max.split.size is 
> deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
> 15/01/20 00:27:31 INFO Configuration.deprecation: mapred.min.split.size is 
> deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
> 15/01/20 00:27:31 INFO client.RemoteDriver: Failed to run job 
> fef081b0-5408-4804-9531-d131fdd628e6
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: de.bankmark.bigbench.queries.q10.SentimentUDF
> Serialization trace:
> genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
> conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> invertedWorkGraph (org.apache.hadoop.hive.ql.plan.SparkWork)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> {noformat}
> It seems the additional Jar files are not uploaded to the DistributedCache, 
> so the Driver cannot access them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9666) Improve some qtests

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319979#comment-14319979
 ] 

Hive QA commented on HIVE-9666:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698602/HIVE-9666.2.patch

{color:green}SUCCESS:{color} +1 7542 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2791/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2791/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2791/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698602 - PreCommit-HIVE-TRUNK-Build

> Improve some qtests
> ---
>
> Key: HIVE-9666
> URL: https://issues.apache.org/jira/browse/HIVE-9666
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
> Attachments: HIVE-9666.1.patch, HIVE-9666.2.patch
>
>
> {code}
> groupby7_noskew_multi_single_reducer.q
> groupby_multi_single_reducer3.q
> parallel_join0.q
> union3.q
> union4.q
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9263) Implement controllable exit code in beeline

2015-02-13 Thread Georg Zigldrum (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319993#comment-14319993
 ] 

Georg Zigldrum commented on HIVE-9263:
--

Beeline should report a non-zero exit code if the query results in an error. 
Currently the exit code is 0, so a shell script cannot detect the error.

Example:
{noformat}
# cat test.sh
#!/bin/bash
beeline -e "show tables in abc"
echo "Return Code - $?"
{noformat}

This will give a return code of 0 despite the error:

{noformat}
> sh test.sh
Error: Error while compiling statement: FAILED: SemanticException [Error 
10072]: Database does not exist: abc (state=42000,code=10072) 
Return Code - 0
{noformat}

> Implement controllable exit code in beeline
> ---
>
> Key: HIVE-9263
> URL: https://issues.apache.org/jira/browse/HIVE-9263
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Reporter: Johndee Burks
>Priority: Minor
>
> It would be nice if beeline implemented something like SQLPlus WHENEVER to 
> control exit codes. This would be useful when performing beeline actions 
> through a shell script. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9683) Hive metastore thrift client connections hang indefinitely

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320264#comment-14320264
 ] 

Hive QA commented on HIVE-9683:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698617/HIVE-9683.1.patch

{color:green}SUCCESS:{color} +1 7542 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2792/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2792/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2792/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698617 - PreCommit-HIVE-TRUNK-Build

> Hive metastore thrift client connections hang indefinitely
> --
>
> Key: HIVE-9683
> URL: https://issues.apache.org/jira/browse/HIVE-9683
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.0.0, 1.0.1
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Minor
> Fix For: 1.0.1
>
> Attachments: HIVE-9683.1.patch
>
>
> THRIFT-2788 fixed network-partition problems that affect Thrift client 
> connections.
> Since hive-1.0 is on thrift-0.9.0, which is affected by the bug, a workaround 
> can be applied to prevent indefinite connection hangs during net-splits.
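
A minimal sketch of the direction such a workaround can take (an assumption, 
not necessarily the committed patch; host, port, and timeoutMs are assumed 
caller-supplied):

{code}
import org.apache.thrift.transport.TSocket;

// An explicit socket timeout turns an indefinite read hang during a network
// partition into a recoverable TTransportException.
TSocket socket = new TSocket(host, port, timeoutMs);
{code}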



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9605) Remove parquet nested objects from wrapper writable objects

2015-02-13 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320321#comment-14320321
 ] 

Sergio Peña commented on HIVE-9605:
---

This test passes in the 'parquet' branch. The patch required the HIVE-9333 
patch in order to run correctly.

> Remove parquet nested objects from wrapper writable objects
> ---
>
> Key: HIVE-9605
> URL: https://issues.apache.org/jira/browse/HIVE-9605
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 0.14.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-9605.3.patch, HIVE-9605.4.patch
>
>
> Parquet nested types use an extra wrapper object (ArrayWritable) around map 
> and list elements. This extra object is not needed and causes unnecessary 
> memory allocations.
> An example of code is on HiveCollectionConverter.java:
> {noformat}
> public void end() {
> parent.set(index, wrapList(new ArrayWritable(
> Writable.class, list.toArray(new Writable[list.size()];
> }
> {noformat}
> This object is later unwrapped on AbstractParquetMapInspector, i.e.:
> {noformat}
> final Writable[] mapContainer = ((ArrayWritable) data).get();
> final Writable[] mapArray = ((ArrayWritable) mapContainer[0]).get();
> for (final Writable obj : mapArray) {
>   ...
> }
> {noformat}
> We should get rid of this wrapper object to save time and memory.
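
A sketch of the intended shape after the change (illustrative, not the 
committed code): with the extra wrapper removed, the elements become the 
top-level entries and the second unwrapping level disappears.

{code}
// Hypothetical post-change inspector code: one unwrap instead of two.
final Writable[] mapArray = ((ArrayWritable) data).get();
for (final Writable entry : mapArray) {
  // ... inspect each map entry directly ...
}
{code}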



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9685) CLIService should create SessionState after logging into kerberos

2015-02-13 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9685:
---
Description: 
{noformat}
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]
at 
com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
at 
org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
at 
org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
at 
org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
at 
org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
at 
org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at 
org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:409)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:230)
at 
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
at 
org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1483)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:64)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
at 
org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2841)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2860)
at 
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:453)
at 
org.apache.hive.service.cli.CLIService.applyAuthorizationConfigPolicy(CLIService.java:123)
at org.apache.hive.service.cli.CLIService.init(CLIService.java:81)
at 
org.apache.hive.service.CompositeService.init(CompositeService.java:59)
at org.apache.hive.service.server.HiveServer2.init(HiveServer2.java:92)
at 
org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:309)
at 
org.apache.hive.service.server.HiveServer2.access$400(HiveServer2.java:68)
at 
org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:523)
at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:396)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{noformat}

> CLIService should create SessionState after logging into kerberos
> -
>
> Key: HIVE-9685
> URL: https://issues.apache.org/jira/browse/HIVE-9685
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Brock Noland
>Assignee: Brock Noland
>
> {noformat}
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
> at 
> org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
> at 
> org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
> at 
> org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
> at java.security.AccessController.

[jira] [Created] (HIVE-9685) CLIService should create SessionState after logging into kerberos

2015-02-13 Thread Brock Noland (JIRA)
Brock Noland created HIVE-9685:
--

 Summary: CLIService should create SessionState after logging into 
kerberos
 Key: HIVE-9685
 URL: https://issues.apache.org/jira/browse/HIVE-9685
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.1.0
Reporter: Brock Noland
Assignee: Brock Noland






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9685) CLIService should create SessionState after logging into kerberos

2015-02-13 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9685:
---
Attachment: HIVE-9685.patch

> CLIService should create SessionState after logging into kerberos
> -
>
> Key: HIVE-9685
> URL: https://issues.apache.org/jira/browse/HIVE-9685
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-9685.patch
>
>
> {noformat}
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
> at 
> org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
> at 
> org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
> at 
> org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:409)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:230)
> at 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1483)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:64)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2841)
> at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2860)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:453)
> at 
> org.apache.hive.service.cli.CLIService.applyAuthorizationConfigPolicy(CLIService.java:123)
> at org.apache.hive.service.cli.CLIService.init(CLIService.java:81)
> at 
> org.apache.hive.service.CompositeService.init(CompositeService.java:59)
> at 
> org.apache.hive.service.server.HiveServer2.init(HiveServer2.java:92)
> at 
> org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:309)
> at 
> org.apache.hive.service.server.HiveServer2.access$400(HiveServer2.java:68)
> at 
> org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:523)
> at 
> org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:396)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
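The fix direction the summary suggests is to reorder initialization so the Kerberos login happens before any SessionState (and hence metastore client) is created. A minimal sketch of that ordering, assuming the principal and keytab come from configuration (names here are illustrative, not taken from the patch):

{code}
import java.io.IOException;

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.ql.session.SessionState;
import org.apache.hadoop.security.UserGroupInformation;

public class SecureCliInitSketch {
  static void initWithKerberos(HiveConf conf, String principal, String keytab)
      throws IOException {
    if (UserGroupInformation.isSecurityEnabled()) {
      // Log in first so the metastore SASL handshake below can find a TGT.
      UserGroupInformation.loginUserFromKeytab(principal, keytab);
    }
    // Only now create the SessionState, which may open a metastore connection.
    SessionState.start(new SessionState(conf));
  }
}
{code}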



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7787) Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError

2015-02-13 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320356#comment-14320356
 ] 

Arup Malakar commented on HIVE-7787:


I tried release 1.0 and still have the same problem, so I am going to reopen the 
JIRA. I will resubmit the patch when I get time.

> Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError
> -
>
> Key: HIVE-7787
> URL: https://issues.apache.org/jira/browse/HIVE-7787
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema, Thrift API
>Affects Versions: 0.12.0, 0.13.0, 0.12.1, 0.14.0, 0.13.1
> Environment: Hive 0.12 CDH 5.1.0, Hadoop 2.3.0 CDH 5.1.0
>Reporter: Raymond Lau
>Assignee: Arup Malakar
>Priority: Minor
> Attachments: HIVE-7787.trunk.1.patch
>
>
> When reading a Parquet file whose original Thrift schema contains a struct 
> with an enum, the following error occurs (full stack trace below): 
> {code}
>  java.lang.NoSuchFieldError: DECIMAL.
> {code} 
> Example Thrift Schema:
> {code}
> enum MyEnumType {
> EnumOne,
> EnumTwo,
> EnumThree
> }
> struct MyStruct {
> 1: optional MyEnumType myEnumType;
> 2: optional string field2;
> 3: optional string field3;
> }
> struct outerStruct {
> 1: optional list<MyStruct> myStructs
> }
> {code}
> Hive Table:
> {code}
> CREATE EXTERNAL TABLE mytable (
>   mystructs array<struct<myenumtype:string, field2:string, field3:string>>
> )
> ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
> STORED AS
> INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
> OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'
> ; 
> {code}
> Error Stack trace:
> {code}
> Java stack trace for Hive 0.12:
> Caused by: java.lang.NoSuchFieldError: DECIMAL
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter.getNewConverter(ETypeConverter.java:146)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:31)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.ArrayWritableGroupConverter.<init>(ArrayWritableGroupConverter.java:45)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:34)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:64)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:47)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:36)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:64)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:40)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableRecordConverter.<init>(DataWritableRecordConverter.java:32)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.prepareForRead(DataWritableReadSupport.java:128)
>   at 
> parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:142)
>   at 
> parquet.hadoop.ParquetRecordReader.initializeInternalReader(ParquetRecordReader.java:118)
>   at 
> parquet.hadoop.ParquetRecordReader.initialize(ParquetRecordReader.java:107)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:92)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:66)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:65)
>   ... 16 more
> {code}
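A NoSuchFieldError on an enum constant such as DECIMAL usually means an older copy of the class shadows a newer one on the classpath. A generic diagnostic sketch (not part of any patch here) prints which jar each suspect class is loaded from:

{code}
// Generic classpath check; pass fully qualified class names as arguments, e.g.
// org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter from the trace.
public class WhichJar {
  public static void main(String[] args) throws ClassNotFoundException {
    for (String name : args) {
      Class<?> c = Class.forName(name);
      System.out.println(name + " -> "
          + c.getProtectionDomain().getCodeSource().getLocation());
    }
  }
}
{code}

Running it with the converter class names from the stack trace shows whether they resolve to the bundled parquet-hive jar or to hive-exec.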



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HIVE-7787) Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError

2015-02-13 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar reopened HIVE-7787:


> Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError
> -
>
> Key: HIVE-7787
> URL: https://issues.apache.org/jira/browse/HIVE-7787
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema, Thrift API
>Affects Versions: 0.12.0, 0.13.0, 0.12.1, 0.14.0, 0.13.1
> Environment: Hive 0.12 CDH 5.1.0, Hadoop 2.3.0 CDH 5.1.0
>Reporter: Raymond Lau
>Assignee: Arup Malakar
>Priority: Minor
> Attachments: HIVE-7787.trunk.1.patch
>
>
> When reading a Parquet file whose original Thrift schema contains a struct 
> with an enum, the following error occurs (full stack trace below): 
> {code}
>  java.lang.NoSuchFieldError: DECIMAL.
> {code} 
> Example Thrift Schema:
> {code}
> enum MyEnumType {
> EnumOne,
> EnumTwo,
> EnumThree
> }
> struct MyStruct {
> 1: optional MyEnumType myEnumType;
> 2: optional string field2;
> 3: optional string field3;
> }
> struct outerStruct {
> 1: optional list<MyStruct> myStructs
> }
> {code}
> Hive Table:
> {code}
> CREATE EXTERNAL TABLE mytable (
>   mystructs array<struct<myenumtype:string, field2:string, field3:string>>
> )
> ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
> STORED AS
> INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
> OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'
> ; 
> {code}
> Error Stack trace:
> {code}
> Java stack trace for Hive 0.12:
> Caused by: java.lang.NoSuchFieldError: DECIMAL
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter.getNewConverter(ETypeConverter.java:146)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:31)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.ArrayWritableGroupConverter.<init>(ArrayWritableGroupConverter.java:45)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:34)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:64)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:47)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:36)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:64)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:40)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableRecordConverter.<init>(DataWritableRecordConverter.java:32)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.prepareForRead(DataWritableReadSupport.java:128)
>   at 
> parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:142)
>   at 
> parquet.hadoop.ParquetRecordReader.initializeInternalReader(ParquetRecordReader.java:118)
>   at 
> parquet.hadoop.ParquetRecordReader.initialize(ParquetRecordReader.java:107)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:92)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:66)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:65)
>   ... 16 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9686) HiveMetastore.logAuditEvent can be used before sasl server is started

2015-02-13 Thread Brock Noland (JIRA)
Brock Noland created HIVE-9686:
--

 Summary: HiveMetastore.logAuditEvent can be used before sasl 
server is started
 Key: HIVE-9686
 URL: https://issues.apache.org/jira/browse/HIVE-9686
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland


Metastore listeners can use logAudit before the SASL server is started, 
resulting in an NPE.
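A minimal sketch of the kind of guard that would avoid the NPE (the class, field, and logging choices here are assumptions, not the actual fix):

{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class AuditLoggerSketch {
  private static final Logger AUDIT =
      LoggerFactory.getLogger("hive.metastore.audit");
  // Assumed stand-in for state that is only set once the SASL server is up.
  private static volatile String serverAddress;

  public static void logAuditEvent(String cmd) {
    if (cmd == null || serverAddress == null) {
      return; // listener fired before startup finished; skip instead of NPE
    }
    AUDIT.info("cmd={} addr={}", cmd, serverAddress);
  }

  public static void markStarted(String address) {
    serverAddress = address;
  }
}
{code}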



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9138) Add some explain to PTF operator

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320373#comment-14320373
 ] 

Hive QA commented on HIVE-9138:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698640/HIVE-9138.5.patch.txt

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7535 tests executed
*Failed tests:*
{noformat}
TestSparkClient - did not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2793/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2793/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2793/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698640 - PreCommit-HIVE-TRUNK-Build

> Add some explain to PTF operator
> 
>
> Key: HIVE-9138
> URL: https://issues.apache.org/jira/browse/HIVE-9138
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, 
> HIVE-9138.3.patch.txt, HIVE-9138.4.patch.txt, HIVE-9138.5.patch.txt
>
>
> PTFOperator does not output anything in the explain statement, making it hard 
> to understand its internal workings. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Attachment: HIVE-6617.15.patch

rebase the patch due to recent commit on trunk

> Reduce ambiguity in grammar
> ---
>
> Key: HIVE-6617
> URL: https://issues.apache.org/jira/browse/HIVE-6617
> Project: Hive
>  Issue Type: Task
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
> HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
> HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
> HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
> HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch
>
>
> CLEAR LIBRARY CACHE
> As of today, antlr reports 214 warnings. Need to bring down this number, 
> ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Status: Open  (was: Patch Available)

> Reduce ambiguity in grammar
> ---
>
> Key: HIVE-6617
> URL: https://issues.apache.org/jira/browse/HIVE-6617
> Project: Hive
>  Issue Type: Task
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
> HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
> HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
> HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
> HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch
>
>
> CLEAR LIBRARY CACHE
> As of today, antlr reports 214 warnings. Need to bring down this number, 
> ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Status: Patch Available  (was: Open)

> Reduce ambiguity in grammar
> ---
>
> Key: HIVE-6617
> URL: https://issues.apache.org/jira/browse/HIVE-6617
> Project: Hive
>  Issue Type: Task
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
> HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
> HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
> HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
> HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch
>
>
> CLEAR LIBRARY CACHE
> As of today, antlr reports 214 warnings. Need to bring down this number, 
> ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 30281: HIVE-9333: Move parquet serialize implementation to DataWritableWriter to improve write speeds

2015-02-13 Thread Sergio Pena


> On Feb. 11, 2015, 11:40 p.m., Ryan Blue wrote:
> >

Thanks Ryan for your comments.

I will add these changes in another JIRA, as this one was already merged. I did 
not add a comment on the JIRA while waiting for the merge.


- Sergio


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30281/#review72053
---


On Feb. 11, 2015, 11:19 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30281/
> ---
> 
> (Updated Feb. 11, 2015, 11:19 p.m.)
> 
> 
> Review request for hive, Ryan Blue, cheng xu, and Dong Chen.
> 
> 
> Bugs: HIVE-9333
> https://issues.apache.org/jira/browse/HIVE-9333
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This patch moves the ParquetHiveSerDe.serialize() implementation to 
> DataWritableWriter class in order to save time in materializing data on 
> serialize().
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
>  ea4109d358f7c48d1e2042e5da299475de4a0a29 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 
> 9199127735533f9a324c5ef456786dda10766c46 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriteSupport.java
>  060b1b722d32f3b2f88304a1a73eb249e150294b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java
>  1d83bf31a3dbcbaa68b3e75a72cec2ec67e7faa5 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java
>  e52c4bc0b869b3e60cb4bfa9e11a09a0d605ac28 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestDataWritableWriter.java 
> a693aff18516d133abf0aae4847d3fe00b9f1c96 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetOutputFormat.java
>  667d3671547190d363107019cd9a2d105d26d336 
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 
> 007a665529857bcec612f638a157aa5043562a15 
>   serde/src/java/org/apache/hadoop/hive/serde2/io/ParquetHiveRecord.java 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/30281/diff/
> 
> 
> Testing
> ---
> 
> The tests run were the following:
> 
> 1. JMH (Java microbenchmark)
> 
> This benchmark called parquet serialize/write methods using text writable 
> objects. 
> 
> Class.method                 Before Change (ops/s)   After Change (ops/s)   Result
> ----------------------------------------------------------------------------------
> ParquetHiveSerDe.serialize   19,113                  249,528                19x speed increase
> DataWritableWriter.write     5,033                   5,201                  3.34% speed increase
> 
> 
> 2. Write 20 million rows (~1GB file) from Text to Parquet
> 
> I wrote a ~1 GB file in Textfile format, then converted it to Parquet format 
> using the following
> statement: CREATE TABLE parquet STORED AS parquet AS SELECT * FROM text;
> 
> Time (s) it took to write the whole file BEFORE changes: 93.758 s
> Time (s) it took to write the whole file AFTER changes: 83.903 s
> 
> This is a 10% speed increase.
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>
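For readers following along, the core idea under review is that serialize() stops materializing an ArrayWritable tree and instead hands the raw row plus its inspector to the writer, which walks them at write time. A rough sketch of that pairing (the class shape is an assumption, not the exact ParquetHiveRecord in the diff):

{code}
import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;

// The writer later iterates the inspector's fields and writes each value
// straight to Parquet, skipping the intermediate Writable tree entirely.
public class ParquetHiveRecordSketch {
  public final Object row;                      // untouched Hive row object
  public final StructObjectInspector inspector; // how to read its fields

  public ParquetHiveRecordSketch(Object row, StructObjectInspector inspector) {
    this.row = row;
    this.inspector = inspector;
  }
}
{code}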



Re: Review Request 30281: HIVE-9333: Move parquet serialize implementation to DataWritableWriter to improve write speeds

2015-02-13 Thread Ryan Blue

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30281/#review72398
---

Ship it!


Ship It!

- Ryan Blue


On Feb. 11, 2015, 3:19 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30281/
> ---
> 
> (Updated Feb. 11, 2015, 3:19 p.m.)
> 
> 
> Review request for hive, Ryan Blue, cheng xu, and Dong Chen.
> 
> 
> Bugs: HIVE-9333
> https://issues.apache.org/jira/browse/HIVE-9333
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This patch moves the ParquetHiveSerDe.serialize() implementation to 
> DataWritableWriter class in order to save time in materializing data on 
> serialize().
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
>  ea4109d358f7c48d1e2042e5da299475de4a0a29 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 
> 9199127735533f9a324c5ef456786dda10766c46 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriteSupport.java
>  060b1b722d32f3b2f88304a1a73eb249e150294b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java
>  1d83bf31a3dbcbaa68b3e75a72cec2ec67e7faa5 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java
>  e52c4bc0b869b3e60cb4bfa9e11a09a0d605ac28 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestDataWritableWriter.java 
> a693aff18516d133abf0aae4847d3fe00b9f1c96 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetOutputFormat.java
>  667d3671547190d363107019cd9a2d105d26d336 
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 
> 007a665529857bcec612f638a157aa5043562a15 
>   serde/src/java/org/apache/hadoop/hive/serde2/io/ParquetHiveRecord.java 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/30281/diff/
> 
> 
> Testing
> ---
> 
> The tests run were the following:
> 
> 1. JMH (Java microbenchmark)
> 
> This benchmark called parquet serialize/write methods using text writable 
> objects. 
> 
> Class.method                 Before Change (ops/s)   After Change (ops/s)   Result
> ----------------------------------------------------------------------------------
> ParquetHiveSerDe.serialize   19,113                  249,528                19x speed increase
> DataWritableWriter.write     5,033                   5,201                  3.34% speed increase
> 
> 
> 2. Write 20 million rows (~1GB file) from Text to Parquet
> 
> I wrote a ~1 GB file in Textfile format, then converted it to Parquet format 
> using the following
> statement: CREATE TABLE parquet STORED AS parquet AS SELECT * FROM text;
> 
> Time (s) it took to write the whole file BEFORE changes: 93.758 s
> Time (s) it took to write the whole file AFTER changes: 83.903 s
> 
> This is a 10% speed increase.
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>



[jira] [Updated] (HIVE-9673) Set operationhandle in ATS entities for lookups

2015-02-13 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-9673:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

committed to trunk. thanks [~thejas]!

> Set operationhandle in ATS entities for lookups
> ---
>
> Key: HIVE-9673
> URL: https://issues.apache.org/jira/browse/HIVE-9673
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-9673.1.patch, HIVE-9673.2.patch
>
>
> Yarn App Timeline Server (ATS) users can find their query using hive query-id.
> However, query id is available only through the logs at the moment.
> Thrift api users such as Hue have another unique id for queries, which the 
> operation handle contains 
> (TExecuteStatementResp.TOperationHandle.THandleIdentifier.guid). Adding the 
> operationhandle guid to ATS will enable such thrift users to get information 
> from ATS for the queries that they have spawned.
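A sketch of how the guid could be attached to an ATS entity (the type and filter names are assumptions; the actual patch may differ):

{code}
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;

public class AtsEntitySketch {
  public static TimelineEntity build(String queryId, String operationGuid) {
    TimelineEntity entity = new TimelineEntity();
    entity.setEntityId(queryId);
    entity.setEntityType("HIVE_QUERY_ID"); // assumed type name
    // Primary filters are indexed by ATS, so clients can look the query up
    // by the operation handle guid they already have.
    entity.addPrimaryFilter("operationId", operationGuid);
    return entity;
  }
}
{code}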



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9673) Set operationhandle in ATS entities for lookups

2015-02-13 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-9673:
-
Issue Type: Improvement  (was: Bug)

> Set operationhandle in ATS entities for lookups
> ---
>
> Key: HIVE-9673
> URL: https://issues.apache.org/jira/browse/HIVE-9673
> Project: Hive
>  Issue Type: Improvement
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 1.2.0
>
> Attachments: HIVE-9673.1.patch, HIVE-9673.2.patch
>
>
> Yarn App Timeline Server (ATS) users can find their query using hive query-id.
> However, query id is available only through the logs at the moment.
> Thrift api users such as Hue have another unique id for queries, which the 
> operation handle contains 
> (TExecuteStatementResp.TOperationHandle.THandleIdentifier.guid). Adding the 
> operationhandle guid to ATS will enable such thrift users to get information 
> from ATS for the queries that they have spawned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9673) Set operationhandle in ATS entities for lookups

2015-02-13 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-9673:
-
Fix Version/s: 1.2.0

> Set operationhandle in ATS entities for lookups
> ---
>
> Key: HIVE-9673
> URL: https://issues.apache.org/jira/browse/HIVE-9673
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 1.2.0
>
> Attachments: HIVE-9673.1.patch, HIVE-9673.2.patch
>
>
> Yarn App Timeline Server (ATS) users can find their query using hive query-id.
> However, query id is available only through the logs at the moment.
> Thrift api users such as Hue have another unique id for queries, which the 
> operation handle contains 
> (TExecuteStatementResp.TOperationHandle.THandleIdentifier.guid). Adding the 
> operationhandle guid to ATS will enable such thrift users to get information 
> from ATS for the queries that they have spawned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9645) Constant folding case NULL equality

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320484#comment-14320484
 ] 

Hive QA commented on HIVE-9645:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698690/HIVE-9645.1.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 7542 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing_navfn
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_vectorization_ppd
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2794/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2794/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2794/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698690 - PreCommit-HIVE-TRUNK-Build

> Constant folding case NULL equality
> ---
>
> Key: HIVE-9645
> URL: https://issues.apache.org/jira/browse/HIVE-9645
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.2.0
>Reporter: Gopal V
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-9645.1.patch, HIVE-9645.patch
>
>
> The Hive logical optimizer does not follow the null-scan codepath when 
> encountering a NULL = 1 predicate;
> NULL = 1 is not evaluated as false in the constant propagation implementation.
> {code}
> hive> explain select count(1) from store_sales where null=1;
> ...
>  TableScan
>   alias: store_sales
>   filterExpr: (null = 1) (type: boolean)
>   Statistics: Num rows: 550076554 Data size: 49570324480 
> Basic stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (null = 1) (type: boolean)
> Statistics: Num rows: 275038277 Data size: 0 Basic stats: 
> PARTIAL Column stats: COMPLETE
> {code}
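A sketch of the folding rule being asked for (the Hive expression classes are real, but the method is illustrative and assumes it is only applied to the '=' node):

{code}
import org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc;
import org.apache.hadoop.hive.ql.plan.ExprNodeDesc;
import org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc;
import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;

public class NullEqualityFoldingSketch {
  // If either side of an equality is a NULL literal, the comparison can never
  // evaluate to TRUE, so for filtering purposes it folds to FALSE and the
  // null-scan codepath can kick in.
  static ExprNodeDesc foldNullEquality(ExprNodeGenericFuncDesc equalsExpr) {
    for (ExprNodeDesc child : equalsExpr.getChildren()) {
      if (child instanceof ExprNodeConstantDesc
          && ((ExprNodeConstantDesc) child).getValue() == null) {
        return new ExprNodeConstantDesc(
            TypeInfoFactory.booleanTypeInfo, Boolean.FALSE);
      }
    }
    return equalsExpr; // nothing to fold
  }
}
{code}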



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


can you review HIVE-9617 UDF from_utc_timestamp throws NPE ...

2015-02-13 Thread Alexander Pivovarov
UDF from_utc_timestamp throws NPE if the second argument is null

https://issues.apache.org/jira/browse/HIVE-9617
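The guard the fix needs is plain null propagation. A hypothetical helper showing the rule on plain Java types (the real UDF operates on Hive writables):

{code}
import java.util.TimeZone;

public final class FromUtcTimestampGuard {
  private FromUtcTimestampGuard() {}

  // NULL in, NULL out: never dereference a null timezone argument.
  public static Long fromUtc(Long utcMillis, String timeZoneId) {
    if (utcMillis == null || timeZoneId == null) {
      return null;
    }
    TimeZone tz = TimeZone.getTimeZone(timeZoneId);
    return utcMillis + tz.getOffset(utcMillis);
  }
}
{code}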


[jira] [Commented] (HIVE-9683) Hive metastore thrift client connections hang indefinitely

2015-02-13 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320486#comment-14320486
 ] 

Gunther Hagleitner commented on HIVE-9683:
--

[~vikram.dixit] ok for 1.0 branch?

> Hive metastore thrift client connections hang indefinitely
> --
>
> Key: HIVE-9683
> URL: https://issues.apache.org/jira/browse/HIVE-9683
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.0.0, 1.0.1
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Minor
> Fix For: 1.0.1
>
> Attachments: HIVE-9683.1.patch
>
>
> THRIFT-2788 fixed network-partition problems that affect Thrift client 
> connections.
> Since hive-1.0 is on thrift-0.9.0, which is affected by the bug, a workaround 
> can be applied to prevent indefinite connection hangs during net-splits.
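The workaround amounts to opening the metastore socket with a finite timeout so a half-open connection eventually fails instead of blocking forever. A sketch of that shape (the timeout value and where it comes from are assumptions):

{code}
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class TimeoutTransportSketch {
  public static TTransport open(String host, int port, int timeoutMs) {
    // thrift-0.9.0's TSocket accepts a socket timeout; 0 means wait forever,
    // which is exactly the indefinite hang this issue works around.
    return new TSocket(host, port, timeoutMs);
  }
}
{code}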



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9607) Remove unnecessary attach-jdbc-driver execution from package/pom.xml

2015-02-13 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320495#comment-14320495
 ] 

Alexander Pivovarov commented on HIVE-9607:
---

[~xuefuz] Can you commit it?

> Remove unnecessary attach-jdbc-driver execution from package/pom.xml
> 
>
> Key: HIVE-9607
> URL: https://issues.apache.org/jira/browse/HIVE-9607
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-9607.1.patch
>
>
> Looks like the build-helper-maven-plugin block, which has the 
> attach-jdbc-driver execution, is not needed in package/pom.xml.
> package/pom.xml has a maven-dependency-plugin which already copies 
> hive-jdbc-standalone to project.build.directory.
> I removed the build-helper-maven-plugin block and rebuilt Hive;
> hive-jdbc-standalone.jar is still placed in project.build.directory.
> {code}
> $ mvn clean install -Phadoop-2 -Pdist -DskipTests
> $ find . -name "apache-hive*jdbc.jar" -exec ls -la {} \;
> 16844023 Feb  6 17:45 ./packaging/target/apache-hive-1.2.0-SNAPSHOT-jdbc.jar
> $ find . -name "hive-jdbc*standalone.jar" -exec ls -la {} \;
> 16844023 Feb  6 17:45 
> ./packaging/target/apache-hive-1.2.0-SNAPSHOT-bin/apache-hive-1.2.0-SNAPSHOT-bin/lib/hive-jdbc-1.2.0-SNAPSHOT-standalone.jar
> 16844023 Feb  6 17:45 ./jdbc/target/hive-jdbc-1.2.0-SNAPSHOT-standalone.jar
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9683) Hive metastore thrift client connections hang indefinitely

2015-02-13 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320492#comment-14320492
 ] 

Vikram Dixit K commented on HIVE-9683:
--

+1 for 1.0 branch.

> Hive metastore thrift client connections hang indefinitely
> --
>
> Key: HIVE-9683
> URL: https://issues.apache.org/jira/browse/HIVE-9683
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.0.0, 1.0.1
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Minor
> Fix For: 1.0.1
>
> Attachments: HIVE-9683.1.patch
>
>
> THRIFT-2788 fixed network-partition problems that affect Thrift client 
> connections.
> Since hive-1.0 is on thrift-0.9.0, which is affected by the bug, a workaround 
> can be applied to prevent indefinite connection hangs during net-splits.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9481) allow column list specification in INSERT statement

2015-02-13 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320522#comment-14320522
 ] 

Eugene Koifman commented on HIVE-9481:
--

Committed to trunk.  Thanks [~alangates] for the review

> allow column list specification in INSERT statement
> ---
>
> Key: HIVE-9481
> URL: https://issues.apache.org/jira/browse/HIVE-9481
> Project: Hive
>  Issue Type: Bug
>  Components: Parser, Query Processor, SQL
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-9481.2.patch, HIVE-9481.4.patch, HIVE-9481.5.patch, 
> HIVE-9481.6.patch, HIVE-9481.patch
>
>
> Given a table FOO(a int, b int, c int), ANSI SQL supports insert into 
> FOO(c,b) select x,y from T.  The expectation is that 'x' is written to column 
> 'c' and 'y' is written to column 'b' and 'a' is set to NULL, assuming column 'a' 
> is NULLABLE.
> Hive does not support this.  In Hive one has to ensure that the data 
> producing statement has a schema that matches target table schema.
> Since Hive doesn't support DEFAULT value for columns in CREATE TABLE, when 
> target schema is explicitly provided, missing columns will be set to NULL if 
> they are NULLABLE, otherwise an error will be raised.
> If/when DEFAULT clause is supported, this can be enhanced to set default 
> value rather than NULL.
> Thus, given {noformat}
> create table source (a int, b int);
> create table target (x int, y int, z int);
> create table target2 (x int, y int, z int);
> {noformat}
> {noformat}insert into target(y,z) select * from source;{noformat}
> will mean 
> {noformat}insert into target select null as x, a, b from source;{noformat}
> and 
> {noformat}insert into target(z,y) select * from source;{noformat}
> will mean 
> {noformat}insert into target select null as x, b, a from source;{noformat}
> Also,
> {noformat}
> from source 
>   insert into target(y,z) select null as x, * 
>   insert into target2(y,z) select null as x, source.*;
> {noformat}
> and for partitioned tables, given
> {noformat}
> Given:
> CREATE TABLE pageviews (userid VARCHAR(64), link STRING, "from" STRING)
>   PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS 
> STORED AS ORC;
> INSERT INTO TABLE pageviews PARTITION (datestamp = '2014-09-23')(userid,link) 
>  
>VALUES ('jsmith', 'mail.com');
> {noformat}
> And dynamic partitioning
> {noformat}
> INSERT INTO TABLE pageviews PARTITION (datestamp)(userid,datestamp,link) 
> VALUES ('jsmith', '2014-09-23', 'mail.com');
> {noformat}
> In all cases, the schema specification contains columns of the target table 
> which are matched by position to the values produced by VALUES clause/SELECT 
> statement.  If the producer side provides values for a dynamic partition 
> column, the column should be in the specified schema.  Static partition 
> values are part of the partition spec and thus are not produced by the 
> producer and should not be part of the schema specification.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9481) allow column list specification in INSERT statement

2015-02-13 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-9481:
-
Fix Version/s: 1.2.0

> allow column list specification in INSERT statement
> ---
>
> Key: HIVE-9481
> URL: https://issues.apache.org/jira/browse/HIVE-9481
> Project: Hive
>  Issue Type: Bug
>  Components: Parser, Query Processor, SQL
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 1.2.0
>
> Attachments: HIVE-9481.2.patch, HIVE-9481.4.patch, HIVE-9481.5.patch, 
> HIVE-9481.6.patch, HIVE-9481.patch
>
>
> Given a table FOO(a int, b int, c int), ANSI SQL supports insert into 
> FOO(c,b) select x,y from T.  The expectation is that 'x' is written to column 
> 'c' and 'y' is written to column 'b' and 'a' is set to NULL, assuming column 'a' 
> is NULLABLE.
> Hive does not support this.  In Hive one has to ensure that the data 
> producing statement has a schema that matches target table schema.
> Since Hive doesn't support DEFAULT value for columns in CREATE TABLE, when 
> target schema is explicitly provided, missing columns will be set to NULL if 
> they are NULLABLE, otherwise an error will be raised.
> If/when DEFAULT clause is supported, this can be enhanced to set default 
> value rather than NULL.
> Thus, given {noformat}
> create table source (a int, b int);
> create table target (x int, y int, z int);
> create table target2 (x int, y int, z int);
> {noformat}
> {noformat}insert into target(y,z) select * from source;{noformat}
> will mean 
> {noformat}insert into target select null as x, a, b from source;{noformat}
> and 
> {noformat}insert into target(z,y) select * from source;{noformat}
> will mean 
> {noformat}insert into target select null as x, b, a from source;{noformat}
> Also,
> {noformat}
> from source 
>   insert into target(y,z) select null as x, * 
>   insert into target2(y,z) select null as x, source.*;
> {noformat}
> and for partitioned tables, given
> {noformat}
> Given:
> CREATE TABLE pageviews (userid VARCHAR(64), link STRING, "from" STRING)
>   PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS 
> STORED AS ORC;
> INSERT INTO TABLE pageviews PARTITION (datestamp = '2014-09-23')(userid,link) 
>  
>VALUES ('jsmith', 'mail.com');
> {noformat}
> And dynamic partitioning
> {noformat}
> INSERT INTO TABLE pageviews PARTITION (datestamp)(userid,datestamp,link) 
> VALUES ('jsmith', '2014-09-23', 'mail.com');
> {noformat}
> In all cases, the schema specification contains columns of the target table 
> which are matched by position to the values produced by VALUES clause/SELECT 
> statement.  If the producer side provides values for a dynamic partition 
> column, the column should be in the specified schema.  Static partition 
> values are part of the partition spec and thus are not produced by the 
> producer and should not be part of the schema specification.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9481) allow column list specification in INSERT statement

2015-02-13 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-9481:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> allow column list specification in INSERT statement
> ---
>
> Key: HIVE-9481
> URL: https://issues.apache.org/jira/browse/HIVE-9481
> Project: Hive
>  Issue Type: Bug
>  Components: Parser, Query Processor, SQL
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-9481.2.patch, HIVE-9481.4.patch, HIVE-9481.5.patch, 
> HIVE-9481.6.patch, HIVE-9481.patch
>
>
> Given a table FOO(a int, b int, c int), ANSI SQL supports insert into 
> FOO(c,b) select x,y from T.  The expectation is that 'x' is written to column 
> 'c' and 'y' is written to column 'b' and 'a' is set to NULL, assuming column 'a' 
> is NULLABLE.
> Hive does not support this.  In Hive one has to ensure that the data 
> producing statement has a schema that matches target table schema.
> Since Hive doesn't support DEFAULT value for columns in CREATE TABLE, when 
> target schema is explicitly provided, missing columns will be set to NULL if 
> they are NULLABLE, otherwise an error will be raised.
> If/when DEFAULT clause is supported, this can be enhanced to set default 
> value rather than NULL.
> Thus, given {noformat}
> create table source (a int, b int);
> create table target (x int, y int, z int);
> create table target2 (x int, y int, z int);
> {noformat}
> {noformat}insert into target(y,z) select * from source;{noformat}
> will mean 
> {noformat}insert into target select null as x, a, b from source;{noformat}
> and 
> {noformat}insert into target(z,y) select * from source;{noformat}
> will mean 
> {noformat}insert into target select null as x, b, a from source;{noformat}
> Also,
> {noformat}
> from source 
>   insert into target(y,z) select null as x, * 
>   insert into target2(y,z) select null as x, source.*;
> {noformat}
> and for partitioned tables, given
> {noformat}
> Given:
> CREATE TABLE pageviews (userid VARCHAR(64), link STRING, "from" STRING)
>   PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS 
> STORED AS ORC;
> INSERT INTO TABLE pageviews PARTITION (datestamp = '2014-09-23')(userid,link) 
>  
>VALUES ('jsmith', 'mail.com');
> {noformat}
> And dynamic partitioning
> {noformat}
> INSERT INTO TABLE pageviews PARTITION (datestamp)(userid,datestamp,link) 
> VALUES ('jsmith', '2014-09-23', 'mail.com');
> {noformat}
> In all cases, the schema specification contains columns of the target table 
> which are matched by position to the values produced by VALUES clause/SELECT 
> statement.  If the producer side provides values for a dynamic partition 
> column, the column should be in the specified schema.  Static partition 
> values are part of the partition spec and thus are not produced by the 
> producer and should not be part of the schema specification.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9605) Remove parquet nested objects from wrapper writable objects

2015-02-13 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9605:
---
   Resolution: Fixed
Fix Version/s: parquet-branch
   Status: Resolved  (was: Patch Available)

Committed to branch!

> Remove parquet nested objects from wrapper writable objects
> ---
>
> Key: HIVE-9605
> URL: https://issues.apache.org/jira/browse/HIVE-9605
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 0.14.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Fix For: parquet-branch
>
> Attachments: HIVE-9605.3.patch, HIVE-9605.4.patch
>
>
> Parquet nested types use an extra wrapper object (ArrayWritable) around map 
> and list elements. This extra object is not needed and causes unnecessary 
> memory allocations.
> An example of the code is in HiveCollectionConverter.java:
> {noformat}
> public void end() {
>   parent.set(index, wrapList(new ArrayWritable(
>       Writable.class, list.toArray(new Writable[list.size()]))));
> }
> {noformat}
> This object is later unwrapped in AbstractParquetMapInspector, i.e.:
> {noformat}
> final Writable[] mapContainer = ((ArrayWritable) data).get();
> final Writable[] mapArray = ((ArrayWritable) mapContainer[0]).get();
> for (final Writable obj : mapArray) {
>   ...
> }
> {noformat}
> We should get rid of this wrapper object to save time and memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9350) Add ability for HiveAuthorizer implementations to filter out results of 'show tables', 'show databases'

2015-02-13 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-9350:

Attachment: HIVE-9350.5.patch

Fix the ClassNotFoundException at runtime from PerfLogger.


> Add ability for HiveAuthorizer implementations to filter out results of 'show 
> tables', 'show databases'
> ---
>
> Key: HIVE-9350
> URL: https://issues.apache.org/jira/browse/HIVE-9350
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
>  Labels: TODOC1.2
> Fix For: 1.2.0
>
> Attachments: HIVE-9350.1.patch, HIVE-9350.2.patch, HIVE-9350.3.patch, 
> HIVE-9350.4.patch, HIVE-9350.5.patch
>
>
> It should be possible for HiveAuthorizer implementations to control if a user 
> is able to see a table or database in results of 'show tables' and 'show 
> databases' respectively.
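The diff adds an AuthorizationMetaStoreFilterHook for this. A toy standalone sketch of the filtering shape (the method signature mirrors the hook's intent but is an assumption, and the deny rule is a placeholder; the real patch delegates the decision to the configured HiveAuthorizer):

{code}
import java.util.ArrayList;
import java.util.List;

public class DenyByPrefixFilterSketch {
  // Only the names returned here show up in 'show tables' output.
  public List<String> filterTableNames(String dbName, List<String> tables) {
    List<String> visible = new ArrayList<String>();
    for (String table : tables) {
      if (!table.startsWith("secret_")) { // toy policy for illustration
        visible.add(table);
      }
    }
    return visible;
  }
}
{code}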



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 30575: HIVE-9350 : Add ability for HiveAuthorizer implementations to filter out results of 'show tables', 'show databases'

2015-02-13 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30575/
---

(Updated Feb. 13, 2015, 7 p.m.)


Review request for hive and Jason Dere.


Changes
---

Fix the ClassNotFoundException at runtime from PerfLogger.


Bugs: HIVE-9350
https://issues.apache.org/jira/browse/HIVE-9350


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-9350


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 90bcc49 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestFilterHooks.java
 cceac93 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerShowFilters.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/DefaultMetaStoreFilterHookImpl.java
 b723484 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreFilterHook.java 
51f63ad 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/AuthorizationMetaStoreFilterHook.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAccessControlException.java
 d877686 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizationValidator.java
 5a5b3d5 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizer.java
 1f1eba2 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizerImpl.java
 e615049 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveV1Authorizer.java
 ac1cc47 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/DummyHiveAuthorizationValidator.java
 cabc22a 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAuthorizationValidator.java
 0e093b0 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java d4e5562 
  service/src/java/org/apache/hive/service/cli/CLIService.java 883bf9b 

Diff: https://reviews.apache.org/r/30575/diff/


Testing
---

New unit tests.


Thanks,

Thejas Nair



[jira] [Commented] (HIVE-9350) Add ability for HiveAuthorizer implementations to filter out results of 'show tables', 'show databases'

2015-02-13 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320572#comment-14320572
 ] 

Thejas M Nair commented on HIVE-9350:
-

Updated review board, but it also shows other changes from trunk as part of the 
diff. Here is the real change in updated patch - 
https://github.com/thejasmn/hive/commit/b35795441195825218cc32bda814ea7a9369435f


> Add ability for HiveAuthorizer implementations to filter out results of 'show 
> tables', 'show databases'
> ---
>
> Key: HIVE-9350
> URL: https://issues.apache.org/jira/browse/HIVE-9350
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
>  Labels: TODOC1.2
> Fix For: 1.2.0
>
> Attachments: HIVE-9350.1.patch, HIVE-9350.2.patch, HIVE-9350.3.patch, 
> HIVE-9350.4.patch, HIVE-9350.5.patch
>
>
> It should be possible for HiveAuthorizer implementations to control if a user 
> is able to see a table or database in results of 'show tables' and 'show 
> databases' respectively.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9666) Improve some qtests

2015-02-13 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320593#comment-14320593
 ] 

Xuefu Zhang commented on HIVE-9666:
---

+1 to patch #2 also.

> Improve some qtests
> ---
>
> Key: HIVE-9666
> URL: https://issues.apache.org/jira/browse/HIVE-9666
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
> Attachments: HIVE-9666.1.patch, HIVE-9666.2.patch
>
>
> {code}
> groupby7_noskew_multi_single_reducer.q
> groupby_multi_single_reducer3.q
> parallel_join0.q
> union3.q
> union4.q
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9350) Add ability for HiveAuthorizer implementations to filter out results of 'show tables', 'show databases'

2015-02-13 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320597#comment-14320597
 ] 

Jason Dere commented on HIVE-9350:
--

+1

> Add ability for HiveAuthorizer implementations to filter out results of 'show 
> tables', 'show databases'
> ---
>
> Key: HIVE-9350
> URL: https://issues.apache.org/jira/browse/HIVE-9350
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
>  Labels: TODOC1.2
> Fix For: 1.2.0
>
> Attachments: HIVE-9350.1.patch, HIVE-9350.2.patch, HIVE-9350.3.patch, 
> HIVE-9350.4.patch, HIVE-9350.5.patch
>
>
> It should be possible for HiveAuthorizer implementations to control if a user 
> is able to see a table or database in results of 'show tables' and 'show 
> databases' respectively.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6069) Improve error message in GenericUDFRound

2015-02-13 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-6069:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks [~apivovarov]!

> Improve error message in GenericUDFRound
> 
>
> Key: HIVE-6069
> URL: https://issues.apache.org/jira/browse/HIVE-6069
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 1.0.0
>Reporter: Xuefu Zhang
>Assignee: Alexander Pivovarov
>Priority: Trivial
> Attachments: HIVE-6069.1.patch
>
>
> Suggested in HIVE-6039 review board.
> https://reviews.apache.org/r/16329/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6069) Improve error message in GenericUDFRound

2015-02-13 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-6069:
-
Affects Version/s: 1.0.0

> Improve error message in GenericUDFRound
> 
>
> Key: HIVE-6069
> URL: https://issues.apache.org/jira/browse/HIVE-6069
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 1.0.0
>Reporter: Xuefu Zhang
>Assignee: Alexander Pivovarov
>Priority: Trivial
> Fix For: 1.2.0
>
> Attachments: HIVE-6069.1.patch
>
>
> Suggested in HIVE-6039 review board.
> https://reviews.apache.org/r/16329/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


review: HIVE-9619 Uninitialized read of numBitVectors in NumDistinctValueEstimator

2015-02-13 Thread Alexander Pivovarov
Hi Everyone

Can anyone review it?

https://issues.apache.org/jira/browse/HIVE-9619

https://reviews.apache.org/r/30789/diff/#


[jira] [Updated] (HIVE-6069) Improve error message in GenericUDFRound

2015-02-13 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-6069:
-
Fix Version/s: 1.2.0

> Improve error message in GenericUDFRound
> 
>
> Key: HIVE-6069
> URL: https://issues.apache.org/jira/browse/HIVE-6069
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 1.0.0
>Reporter: Xuefu Zhang
>Assignee: Alexander Pivovarov
>Priority: Trivial
> Fix For: 1.2.0
>
> Attachments: HIVE-6069.1.patch
>
>
> Suggested in HIVE-6039 review board.
> https://reviews.apache.org/r/16329/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Status: Patch Available  (was: Open)

> Reduce ambiguity in grammar
> ---
>
> Key: HIVE-6617
> URL: https://issues.apache.org/jira/browse/HIVE-6617
> Project: Hive
>  Issue Type: Task
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
> HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
> HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
> HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
> HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch
>
>
> CLEAR LIBRARY CACHE
> As of today, antlr reports 214 warnings. Need to bring down this number, 
> ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Attachment: (was: HIVE-6617.15.patch)

> Reduce ambiguity in grammar
> ---
>
> Key: HIVE-6617
> URL: https://issues.apache.org/jira/browse/HIVE-6617
> Project: Hive
>  Issue Type: Task
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
> HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
> HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
> HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
> HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch
>
>
> CLEAR LIBRARY CACHE
> As of today, antlr reports 214 warnings. Need to bring down this number, 
> ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Attachment: HIVE-6617.15.patch

> Reduce ambiguity in grammar
> ---
>
> Key: HIVE-6617
> URL: https://issues.apache.org/jira/browse/HIVE-6617
> Project: Hive
>  Issue Type: Task
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
> HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
> HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
> HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
> HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch
>
>
> CLEAR LIBRARY CACHE
> As of today, antlr reports 214 warnings. Need to bring down this number, 
> ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Status: Open  (was: Patch Available)

> Reduce ambiguity in grammar
> ---
>
> Key: HIVE-6617
> URL: https://issues.apache.org/jira/browse/HIVE-6617
> Project: Hive
>  Issue Type: Task
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
> HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
> HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
> HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
> HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch
>
>
> CLEAR LIBRARY CACHE
> As of today, antlr reports 214 warnings. Need to bring down this number, 
> ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9684) Incorrect disk range computation in ORC because of optional stream kind

2015-02-13 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320625#comment-14320625
 ] 

Prasanth Jayachandran commented on HIVE-9684:
-

Attached trunk patch as well.

> Incorrect disk range computation in ORC because of optional stream kind
> ---
>
> Key: HIVE-9684
> URL: https://issues.apache.org/jira/browse/HIVE-9684
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.0.0, 1.1.0, 1.0.1
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-9684.1.patch, HIVE-9684.branch-1.0.patch, 
> HIVE-9684.branch-1.1.patch
>
>
> HIVE-9593 changed all required fields in the ORC protobuf messages to optional 
> fields. But the DiskRange computation and stream creation code assumes the 
> stream kind exists everywhere. This leads to incorrect calculation of disk 
> ranges, resulting in out-of-range exceptions. The proper fix is to check 
> whether the stream kind exists, using stream.hasKind(), before adding the 
> stream to the disk range computation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9684) Incorrect disk range computation in ORC because of optional stream kind

2015-02-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-9684:

Attachment: HIVE-9684.1.patch

The issue does not happen in trunk. But the check is required for forward 
compatibility.

> Incorrect disk range computation in ORC because of optional stream kind
> ---
>
> Key: HIVE-9684
> URL: https://issues.apache.org/jira/browse/HIVE-9684
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.0.0, 1.1.0, 1.0.1
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-9684.1.patch, HIVE-9684.branch-1.0.patch, 
> HIVE-9684.branch-1.1.patch
>
>
> HIVE-9593 changed all required fields in the ORC protobuf messages to optional 
> fields. But the DiskRange computation and stream creation code assumes the 
> stream kind exists everywhere. This leads to incorrect calculation of disk 
> ranges, resulting in out-of-range exceptions. The proper fix is to check 
> whether the stream kind exists, using stream.hasKind(), before adding the 
> stream to the disk range computation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9596) move standard getDisplayString impl to GenericUDF

2015-02-13 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-9596:
-
   Resolution: Fixed
Fix Version/s: 1.2.0
   Status: Resolved  (was: Patch Available)

Thanks for cleaning that up, I've committed to trunk.

> move standard getDisplayString impl to GenericUDF
> -
>
> Key: HIVE-9596
> URL: https://issues.apache.org/jira/browse/HIVE-9596
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: HIVE-9596.1.patch, HIVE-9596.2.patch, HIVE-9596.3.patch, 
> HIVE-9596.4.patch
>
>
> 54 GenericUDF-derived classes have very similar getDisplayString implementations 
> that return "fname(child1, child2, childn)".
> instr() and locate() have bugs in their implementations (no comma between children).
> Instead of having 54 implementations of the same method, it is better to move a 
> standard implementation to the base class (see the sketch after the file list).
> affected UDF classes:
> {code}
> contrib/src/java/org/apache/hadoop/hive/contrib/genericudf/example/GenericUDFDBOutput.java
> itests/util/src/main/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFEvaluateNPE.java
> itests/util/src/main/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTestGetJavaBoolean.java
> itests/util/src/main/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTestGetJavaString.java
> itests/util/src/main/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTestTranslate.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/AbstractGenericUDFEWAHBitmapBop.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/AbstractGenericUDFReflect.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAbs.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAddMonths.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArray.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAssertTrue.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNumeric.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBasePad.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseTrim.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCoalesce.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFConcat.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFConcatWS.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDate.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateAdd.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateDiff.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateSub.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDecode.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFEWAHBitmapEmpty.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFElt.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFEncode.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFField.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFloorCeilBase.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFormatNumber.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFGreatest.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFHash.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFIf.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFInFile.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFInitCap.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFInstr.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLastDay.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLeadLag.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLocate.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLower.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMacro.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMapKeys.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMapValues.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFNamedStruct.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFPower.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFPrintf.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRound.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSentences.java
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSize.java
> ql/src/java/org/apache/
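A minimal standalone sketch of the shared base-class implementation described 
above (the class and method names here are illustrative, not Hive's actual API; 
Hive's real method lives on GenericUDF and takes the children's display strings):

{code}
public class DisplayStringDemo {
  // One shared display-string builder instead of 54 per-UDF copies.
  static String getStandardDisplayString(String name, String[] children) {
    StringBuilder sb = new StringBuilder(name).append('(');
    for (int i = 0; i < children.length; i++) {
      if (i > 0) {
        sb.append(", ");  // the separator instr()/locate() were missing
      }
      sb.append(children[i]);
    }
    return sb.append(')').toString();
  }

  public static void main(String[] args) {
    // prints: locate(substr, str, start)
    System.out.println(getStandardDisplayString("locate",
        new String[] {"substr", "str", "start"}));
  }
}
{code}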

[jira] [Updated] (HIVE-9686) HiveMetastore.logAuditEvent can be used before sasl server is started

2015-02-13 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9686:
---
Attachment: HIVE-9686.patch

> HiveMetastore.logAuditEvent can be used before sasl server is started
> -
>
> Key: HIVE-9686
> URL: https://issues.apache.org/jira/browse/HIVE-9686
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-9686.patch
>
>
> Metastore listeners can use logAudit before the sasl server is started 
> resulting in an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9685) CLIService should create SessionState after logging into kerberos

2015-02-13 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9685:
---
Status: Patch Available  (was: Open)

> CLIService should create SessionState after logging into kerberos
> -
>
> Key: HIVE-9685
> URL: https://issues.apache.org/jira/browse/HIVE-9685
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-9685.patch
>
>
> {noformat}
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
> at 
> org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
> at 
> org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
> at 
> org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:409)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:230)
> at 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.(SessionHiveMetaStoreClient.java:74)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1483)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:64)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2841)
> at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2860)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:453)
> at 
> org.apache.hive.service.cli.CLIService.applyAuthorizationConfigPolicy(CLIService.java:123)
> at org.apache.hive.service.cli.CLIService.init(CLIService.java:81)
> at 
> org.apache.hive.service.CompositeService.init(CompositeService.java:59)
> at 
> org.apache.hive.service.server.HiveServer2.init(HiveServer2.java:92)
> at 
> org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:309)
> at 
> org.apache.hive.service.server.HiveServer2.access$400(HiveServer2.java:68)
> at 
> org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:523)
> at 
> org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:396)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9686) HiveMetastore.logAuditEvent can be used before sasl server is started

2015-02-13 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9686:
---
Affects Version/s: 1.0.0
   Status: Patch Available  (was: Open)

> HiveMetastore.logAuditEvent can be used before sasl server is started
> -
>
> Key: HIVE-9686
> URL: https://issues.apache.org/jira/browse/HIVE-9686
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-9686.patch
>
>
> Metastore listeners can use logAudit before the sasl server is started 
> resulting in an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9523) when columns on which tables are partitioned are used in the join condition same join optimizations as for bucketed tables should be applied

2015-02-13 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-9523:
-
Labels: gsoc2015  (was: )

> when columns on which tables are partitioned are used in the join condition 
> same join optimizations as for bucketed tables should be applied
> 
>
> Key: HIVE-9523
> URL: https://issues.apache.org/jira/browse/HIVE-9523
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer, Physical Optimizer, SQL
>Affects Versions: 0.13.0, 0.14.0, 0.13.1
>Reporter: Maciek Kocon
>  Labels: gsoc2015
>
> For JOIN conditions where the partitioning columns are used:
> ⋮ 
> FROM TabA JOIN TabB
>ON TabA.partCol1 = TabB.partCol1
>AND TabA.partCol2 = TabB.partCol2
> the optimizer could/should choose to treat it the same way as with bucketed 
> tables: ⋮ 
> FROM TabC
>   JOIN TabD
>  ON TabC.clusteredByCol1 = TabD.clusteredByCol1
>AND TabC.clusteredByCol2 = TabD.clusteredByCol2
> and use either Bucket Map Join or, better, the Sort Merge Bucket Map Join.
> This is based on the fact that, just as buckets translate to separate files, 
> partitions essentially provide the same mapping.
> When data locality is known, the optimizer could focus on joining only the 
> corresponding partitions rather than the whole data sets.
> #side notes:
> ⦿ Currently Table DDL Syntax where Partitioning and Bucketing defined at the 
> same time is allowed:
> CREATE TABLE
>  ⋮
> PARTITIONED BY(…) CLUSTERED BY(…) INTO … BUCKETS;
> But in this case the optimizer never chooses to use Bucket Map Join or Sort Merge 
> Bucket Map Join, which defeats the purpose of creating BUCKETed tables in such 
> scenarios. Should that be raised as a separate BUG?
> ⦿ Currently partitioning and bucketing are two separate things but serve the same 
> purpose - shouldn't the concepts be merged (explicit/implicit partitions)?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9687) Blink DB style approximate querying in hive

2015-02-13 Thread Vikram Dixit K (JIRA)
Vikram Dixit K created HIVE-9687:


 Summary: Blink DB style approximate querying in hive
 Key: HIVE-9687
 URL: https://issues.apache.org/jira/browse/HIVE-9687
 Project: Hive
  Issue Type: New Feature
Reporter: Vikram Dixit K


http://www.cs.berkeley.edu/~sameerag/blinkdb_eurosys13.pdf

There are various pieces here that need to be thought through and implemented, 
e.g. offline sampling, a run-time sample-selection module, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320654#comment-14320654
 ] 

Hive QA commented on HIVE-6617:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698765/HIVE-6617.15.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7541 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_vectorization_ppd
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_select_charliteral
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2795/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2795/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2795/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698765 - PreCommit-HIVE-TRUNK-Build

> Reduce ambiguity in grammar
> ---
>
> Key: HIVE-6617
> URL: https://issues.apache.org/jira/browse/HIVE-6617
> Project: Hive
>  Issue Type: Task
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
> HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
> HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
> HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
> HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch
>
>
> CLEAR LIBRARY CACHE
> As of today, antlr reports 214 warnings. Need to bring down this number, 
> ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9684) Incorrect disk range computation in ORC because of optional stream kind

2015-02-13 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320674#comment-14320674
 ] 

Gopal V commented on HIVE-9684:
---

LGTM +1.

This needs the extra condition because unknown enum values default to the first 
enum constant (PRESENT).
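A hedged sketch of that guard (planDiskRange() is a hypothetical helper; the 
real logic lives in Hive's ORC reader, and OrcProto is the protobuf-generated 
class):

{code}
import org.apache.hadoop.hive.ql.io.orc.OrcProto;

// Plan disk ranges only for streams whose kind field is present. A stream
// written by a newer writer with a kind unknown to this reader must still
// have its length consumed so later stream offsets stay correct.
static long planStreams(OrcProto.StripeFooter stripeFooter, long offset) {
  for (OrcProto.Stream stream : stripeFooter.getStreamsList()) {
    if (stream.hasKind()) {
      planDiskRange(stream, offset);  // hypothetical planning callback
    }
    offset += stream.getLength();     // always advance past the stream bytes
  }
  return offset;
}
{code}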


> Incorrect disk range computation in ORC because of optional stream kind
> ---
>
> Key: HIVE-9684
> URL: https://issues.apache.org/jira/browse/HIVE-9684
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.0.0, 1.1.0, 1.0.1
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-9684.1.patch, HIVE-9684.branch-1.0.patch, 
> HIVE-9684.branch-1.1.patch
>
>
> HIVE-9593 changed all required fields in the ORC protobuf messages to optional 
> fields. But the DiskRange computation and stream creation code assumes the 
> stream kind exists everywhere. This leads to incorrect calculation of disk 
> ranges, resulting in out-of-range exceptions. The proper fix is to check 
> whether the stream kind exists, using stream.hasKind(), before adding the 
> stream to the disk range computation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9688) Support SAMPLE operator in hive

2015-02-13 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-9688:
---

 Summary: Support SAMPLE operator in hive
 Key: HIVE-9688
 URL: https://issues.apache.org/jira/browse/HIVE-9688
 Project: Hive
  Issue Type: New Feature
Reporter: Prasanth Jayachandran


Hive needs a SAMPLE operator to support parallel order-by, skew joins, and 
count + distinct optimizations. Random, reservoir, and stratified sampling 
should cover most of the cases; a reservoir-sampling sketch follows.
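As one concrete illustration, a self-contained sketch of reservoir sampling 
(Algorithm R), which keeps a uniform sample of k rows from a stream of unknown 
length in a single pass (the class name is illustrative):

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class ReservoirSampleDemo {
  // Keep a uniform random sample of k items from a one-pass stream.
  static <T> List<T> sample(Iterable<T> rows, int k, Random rnd) {
    List<T> reservoir = new ArrayList<>(k);
    long seen = 0;
    for (T row : rows) {
      seen++;
      if (reservoir.size() < k) {
        reservoir.add(row);                        // fill the reservoir first
      } else {
        long j = (long) (rnd.nextDouble() * seen); // uniform in [0, seen)
        if (j < k) {
          reservoir.set((int) j, row);             // replace with prob k/seen
        }
      }
    }
    return reservoir;
  }

  public static void main(String[] args) {
    List<Integer> rows = new ArrayList<>();
    for (int i = 0; i < 1000; i++) rows.add(i);
    System.out.println(sample(rows, 5, new Random(42)));
  }
}
{code}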



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9686) HiveMetastore.logAuditEvent can be used before sasl server is started

2015-02-13 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320691#comment-14320691
 ] 

Xuefu Zhang commented on HIVE-9686:
---

+1

> HiveMetastore.logAuditEvent can be used before sasl server is started
> -
>
> Key: HIVE-9686
> URL: https://issues.apache.org/jira/browse/HIVE-9686
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-9686.patch
>
>
> Metastore listeners can use logAudit before the sasl server is started 
> resulting in an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9689) Store distinct value estimator's bit vectors in metastore

2015-02-13 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-9689:
---

 Summary: Store distinct value estimator's bit vectors in metastore
 Key: HIVE-9689
 URL: https://issues.apache.org/jira/browse/HIVE-9689
 Project: Hive
  Issue Type: New Feature
Reporter: Prasanth Jayachandran


Hive currently uses the PCSA (Probabilistic Counting and Stochastic Averaging) 
algorithm to determine distinct cardinality. The NDV value computed by the 
UDF is stored in the metastore instead of the actual bit vectors. This makes it 
impossible to estimate the overall NDV across all partitions (or a selected 
subset of partitions). We should ideally store the bit vectors in the metastore 
and do server-side merging of the bit vectors (a merge sketch follows). We could 
also replace the current PCSA algorithm with HyperLogLog if space is a constraint. 
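A hedged sketch of why server-side merging works: FM/PCSA-style bit vectors 
combine by bitwise OR, so per-partition vectors could be merged into a 
table-level estimate without rescanning data. java.util.BitSet stands in here 
for Hive's internal representation:

{code}
import java.util.BitSet;

public class BitVectorMergeDemo {
  // OR together two arrays of PCSA/FM bit vectors; the result is the sketch
  // that a single scan over both partitions would have produced.
  static BitSet[] merge(BitSet[] a, BitSet[] b) {
    BitSet[] merged = new BitSet[a.length];
    for (int i = 0; i < a.length; i++) {
      merged[i] = (BitSet) a[i].clone();
      merged[i].or(b[i]);  // union of the hashed observations
    }
    return merged;
  }

  public static void main(String[] args) {
    BitSet p1 = new BitSet(); p1.set(0); p1.set(3);
    BitSet p2 = new BitSet(); p2.set(1); p2.set(3);
    System.out.println(merge(new BitSet[] {p1}, new BitSet[] {p2})[0]); // {0, 1, 3}
  }
}
{code}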



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9685) CLIService should create SessionState after logging into kerberos

2015-02-13 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320720#comment-14320720
 ] 

Xuefu Zhang commented on HIVE-9685:
---

+1

> CLIService should create SessionState after logging into kerberos
> -
>
> Key: HIVE-9685
> URL: https://issues.apache.org/jira/browse/HIVE-9685
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-9685.patch
>
>
> {noformat}
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
> at 
> org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
> at 
> org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
> at 
> org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:409)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:230)
> at 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.(SessionHiveMetaStoreClient.java:74)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1483)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:64)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2841)
> at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2860)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:453)
> at 
> org.apache.hive.service.cli.CLIService.applyAuthorizationConfigPolicy(CLIService.java:123)
> at org.apache.hive.service.cli.CLIService.init(CLIService.java:81)
> at 
> org.apache.hive.service.CompositeService.init(CompositeService.java:59)
> at 
> org.apache.hive.service.server.HiveServer2.init(HiveServer2.java:92)
> at 
> org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:309)
> at 
> org.apache.hive.service.server.HiveServer2.access$400(HiveServer2.java:68)
> at 
> org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:523)
> at 
> org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:396)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320792#comment-14320792
 ] 

Hive QA commented on HIVE-6617:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698792/HIVE-6617.15.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7549 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_vectorization_ppd
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_select_charliteral
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2796/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2796/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2796/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698792 - PreCommit-HIVE-TRUNK-Build

> Reduce ambiguity in grammar
> ---
>
> Key: HIVE-6617
> URL: https://issues.apache.org/jira/browse/HIVE-6617
> Project: Hive
>  Issue Type: Task
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
> HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
> HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
> HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
> HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch
>
>
> CLEAR LIBRARY CACHE
> As of today, antlr reports 214 warnings. Need to bring down this number, 
> ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9690) Allow non-numeric arithmetic operations

2015-02-13 Thread Jason Dere (JIRA)
Jason Dere created HIVE-9690:


 Summary: Allow non-numeric arithmetic operations
 Key: HIVE-9690
 URL: https://issues.apache.org/jira/browse/HIVE-9690
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Jason Dere
Assignee: Jason Dere


Some refactoring for HIVE-5021. The current arithmetic UDFs are specialized for 
numeric types, and trying to change the logic in the existing UDFs looks a bit 
complicated. A less intrusive fix would be to create the date-time/interval 
arithmetic UDFs as a separate UDF class, and to make the plus/minus UDFs act as 
a wrapper that invokes the numeric or interval arithmetic UDF depending on the 
args (see the dispatch sketch below).
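A hedged sketch of the wrapper dispatch. GenericUDFOPPlus is Hive's existing 
numeric plus; GenericUDFIntervalPlus and isDateTimeOrInterval() are hypothetical 
names standing in for whatever this refactoring introduces:

{code}
// The plus entry point inspects argument categories once in initialize()
// and delegates everything else to the chosen UDF.
public ObjectInspector initialize(ObjectInspector[] args) throws UDFArgumentException {
  if (isDateTimeOrInterval(args[0]) || isDateTimeOrInterval(args[1])) {
    delegate = new GenericUDFIntervalPlus();  // date-time/interval arithmetic
  } else {
    delegate = new GenericUDFOPPlus();        // existing numeric arithmetic
  }
  return delegate.initialize(args);
}

public Object evaluate(DeferredObject[] args) throws HiveException {
  return delegate.evaluate(args);             // pure pass-through after init
}
{code}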



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Attachment: HIVE-6617.16.patch

update more golden files

> Reduce ambiguity in grammar
> ---
>
> Key: HIVE-6617
> URL: https://issues.apache.org/jira/browse/HIVE-6617
> Project: Hive
>  Issue Type: Task
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
> HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
> HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
> HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
> HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, 
> HIVE-6617.15.patch, HIVE-6617.16.patch
>
>
> CLEAR LIBRARY CACHE
> As of today, antlr reports 214 warnings. Need to bring down this number, 
> ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Status: Open  (was: Patch Available)

> Reduce ambiguity in grammar
> ---
>
> Key: HIVE-6617
> URL: https://issues.apache.org/jira/browse/HIVE-6617
> Project: Hive
>  Issue Type: Task
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
> HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
> HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
> HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
> HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, 
> HIVE-6617.15.patch, HIVE-6617.16.patch
>
>
> CLEAR LIBRARY CACHE
> As of today, antlr reports 214 warnings. Need to bring down this number, 
> ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Status: Patch Available  (was: Open)

> Reduce ambiguity in grammar
> ---
>
> Key: HIVE-6617
> URL: https://issues.apache.org/jira/browse/HIVE-6617
> Project: Hive
>  Issue Type: Task
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
> HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
> HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
> HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
> HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, 
> HIVE-6617.15.patch, HIVE-6617.16.patch
>
>
> CLEAR LIBRARY CACHE
> As of today, antlr reports 214 warnings. Need to bring down this number, 
> ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9691) Include a few more files include the source tarball

2015-02-13 Thread Brock Noland (JIRA)
Brock Noland created HIVE-9691:
--

 Summary: Include a few more files include the source tarball
 Key: HIVE-9691
 URL: https://issues.apache.org/jira/browse/HIVE-9691
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Brock Noland
 Fix For: 1.1.0






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9691) Include a few more files include the source tarball

2015-02-13 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9691:
---
Attachment: HIVE-9691.patch

> Include a few more files include the source tarball
> ---
>
> Key: HIVE-9691
> URL: https://issues.apache.org/jira/browse/HIVE-9691
> Project: Hive
>  Issue Type: Improvement
>Reporter: Brock Noland
>Assignee: Brock Noland
> Fix For: 1.1.0
>
> Attachments: HIVE-9691.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9691) Include a few more files include the source tarball

2015-02-13 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9691:
---
Status: Patch Available  (was: Open)

> Include a few more files include the source tarball
> ---
>
> Key: HIVE-9691
> URL: https://issues.apache.org/jira/browse/HIVE-9691
> Project: Hive
>  Issue Type: Improvement
>Reporter: Brock Noland
>Assignee: Brock Noland
> Fix For: 1.1.0
>
> Attachments: HIVE-9691.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9691) Include a few more files include the source tarball

2015-02-13 Thread Chao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320881#comment-14320881
 ] 

Chao commented on HIVE-9691:


+1

> Include a few more files include the source tarball
> ---
>
> Key: HIVE-9691
> URL: https://issues.apache.org/jira/browse/HIVE-9691
> Project: Hive
>  Issue Type: Improvement
>Reporter: Brock Noland
>Assignee: Brock Noland
> Fix For: 1.1.0
>
> Attachments: HIVE-9691.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9659) 'Error while trying to create table container' occurs during hive query case execution when hive.optimize.skewjoin set to 'true' [Spark Branch]

2015-02-13 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320906#comment-14320906
 ] 

Jimmy Xiang commented on HIVE-9659:
---

How big is the data set?  Does it work with a small data set?

> 'Error while trying to create table container' occurs during hive query case 
> execution when hive.optimize.skewjoin set to 'true' [Spark Branch]
> ---
>
> Key: HIVE-9659
> URL: https://issues.apache.org/jira/browse/HIVE-9659
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xin Hao
>
> We found that 'Error while trying to create table container' occurs during 
> Big-Bench Q12 case execution when hive.optimize.skewjoin is set to 'true'.
> If hive.optimize.skewjoin is set to 'false', the case passes.
> How to reproduce:
> 1. set hive.optimize.skewjoin=true;
> 2. Run BigBench case Q12 and it will fail. 
> Check the executor log (e.g. /usr/lib/spark/work/app-/2/stderr) and you 
> will find the error 'Error while trying to create table container' in the log, 
> and also a NullPointerException near the end of the log.
> (a) Detail error message for 'Error while trying to create table container':
> {noformat}
> 15/02/12 01:29:49 ERROR SparkMapRecordHandler: Error processing row: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to 
> create table container
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to 
> create table container
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:118)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:193)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:219)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1051)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:486)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:141)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:47)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:27)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:98)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at 
> org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:217)
>   at 
> org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error while 
> trying to create table container
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.load(MapJoinTableContainerSerDe.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:115)
>   ... 21 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error, not a 
> directory: 
> hdfs://bhx1:8020/tmp/hive/root/d22ef465-bff5-4edb-a822-0a9f1c25b66c/hive_2015-02-12_01-28-10_008_6897031694580088767-1/-mr-10009/HashTable-Stage-6/MapJoin-mapfile01--.hashtable
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.load(MapJoinTableContainerSerDe.java:106)
>   ... 22 more
> 15/02/12 01:29:49 INFO SparkRecordHandler: maximum memory = 40939028480
> 15/02/12 01:29:49 INFO PerfLogger:  from=org.apache.hadoop.hive.ql.exec.spark.SparkRecordHandler>
> {noformat}
> (b) Detail error message for NullPointerException:

[jira] [Commented] (HIVE-9684) Incorrect disk range computation in ORC because of optional stream kind

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320930#comment-14320930
 ] 

Hive QA commented on HIVE-9684:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698799/HIVE-9684.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7548 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbort
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2797/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2797/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2797/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698799 - PreCommit-HIVE-TRUNK-Build

> Incorrect disk range computation in ORC because of optional stream kind
> ---
>
> Key: HIVE-9684
> URL: https://issues.apache.org/jira/browse/HIVE-9684
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.0.0, 1.1.0, 1.0.1
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-9684.1.patch, HIVE-9684.branch-1.0.patch, 
> HIVE-9684.branch-1.1.patch
>
>
> HIVE-9593 changed all required fields in the ORC protobuf messages to optional 
> fields. But the DiskRange computation and stream creation code assumes the 
> stream kind exists everywhere. This leads to incorrect calculation of disk 
> ranges, resulting in out-of-range exceptions. The proper fix is to check 
> whether the stream kind exists, using stream.hasKind(), before adding the 
> stream to the disk range computation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9692) Allocate only parquet selected columns in HiveStructConverter class

2015-02-13 Thread JIRA
Sergio Peña created HIVE-9692:
-

 Summary: Allocate only parquet selected columns in 
HiveStructConverter class
 Key: HIVE-9692
 URL: https://issues.apache.org/jira/browse/HIVE-9692
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergio Peña
Assignee: Sergio Peña


HiveStructConverter is the class where Hive converts Parquet objects to Hive 
writable objects that are later parsed by object inspectors. This class 
allocates as many writable objects as there are columns in the file schema.

{noformat}
public HiveStructConverter(final GroupType requestedSchema, final GroupType 
tableSchema, Map<String, String> metadata) {
...
this.writables = new Writable[fileSchema.getFieldCount()];
...
}
{noformat}

The full array is always allocated even if we only select a few columns. Say we 
select 2 columns from a table of 50 columns: 50 objects are allocated, only 2 
are used, and 48 are wasted.

We should be able to allocate only the requested number of columns in order to 
save memory (see the sketch below).
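A hedged sketch of the proposed change, mirroring the snippet above (the real 
constructor has more context; Map<String, String> is assumed for the metadata 
type):

{code}
public HiveStructConverter(final GroupType requestedSchema, final GroupType 
    tableSchema, Map<String, String> metadata) {
  ...
  // Size the array from the projected schema, not the full file schema, so a
  // 2-column SELECT over a 50-column table allocates 2 slots instead of 50.
  this.writables = new Writable[requestedSchema.getFieldCount()];
  ...
}
{code}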



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HIVE-9692) Allocate only parquet selected columns in HiveStructConverter class

2015-02-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-9692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-9692 started by Sergio Peña.
-
> Allocate only parquet selected columns in HiveStructConverter class
> ---
>
> Key: HIVE-9692
> URL: https://issues.apache.org/jira/browse/HIVE-9692
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>
> HiveStructConverter is the class where Hive converts Parquet objects to Hive 
> writable objects that are later parsed by object inspectors. This class 
> allocates as many writable objects as there are columns in the file schema.
> {noformat}
> public HiveStructConverter(final GroupType requestedSchema, final GroupType 
> tableSchema, Map<String, String> metadata) {
> ...
> this.writables = new Writable[fileSchema.getFieldCount()];
> ...
> }
> {noformat}
> The full array is always allocated even if we only select a few columns. Say 
> we select 2 columns from a table of 50 columns: 50 objects are allocated, 
> only 2 are used, and 48 are wasted.
> We should be able to allocate only the requested number of columns in order 
> to save memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9693) Introduce a stats cache for metastore

2015-02-13 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-9693:
--

 Summary: Introduce a stats cache for metastore
 Key: HIVE-9693
 URL: https://issues.apache.org/jira/browse/HIVE-9693
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9693) Introduce a stats cache for metastore

2015-02-13 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-9693:
---
Issue Type: Sub-task  (was: Bug)
Parent: HIVE-9452

> Introduce a stats cache for metastore
> -
>
> Key: HIVE-9693
> URL: https://issues.apache.org/jira/browse/HIVE-9693
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9693) Introduce a stats cache for HBase metastore

2015-02-13 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-9693:
---
Summary: Introduce a stats cache for HBase metastore  (was: Introduce a 
stats cache for metastore)

> Introduce a stats cache for HBase metastore
> ---
>
> Key: HIVE-9693
> URL: https://issues.apache.org/jira/browse/HIVE-9693
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9693) Introduce a stats cache for HBase metastore [hbase-metastore branch]

2015-02-13 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-9693:
---
Summary: Introduce a stats cache for HBase metastore  [hbase-metastore 
branch]  (was: Introduce a stats cache for HBase metastore)

> Introduce a stats cache for HBase metastore  [hbase-metastore branch]
> -
>
> Key: HIVE-9693
> URL: https://issues.apache.org/jira/browse/HIVE-9693
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9659) 'Error while trying to create table container' occurs during hive query case execution when hive.optimize.skewjoin set to 'true' [Spark Branch]

2015-02-13 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321013#comment-14321013
 ] 

Jimmy Xiang commented on HIVE-9659:
---

I can reproduce this issue with a tiny data set.

> 'Error while trying to create table container' occurs during hive query case 
> execution when hive.optimize.skewjoin set to 'true' [Spark Branch]
> ---
>
> Key: HIVE-9659
> URL: https://issues.apache.org/jira/browse/HIVE-9659
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xin Hao
>
> We found that 'Error while trying to create table container' occurs during 
> Big-Bench Q12 case execution when hive.optimize.skewjoin is set to 'true'.
> If hive.optimize.skewjoin is set to 'false', the case passes.
> How to reproduce:
> 1. set hive.optimize.skewjoin=true;
> 2. Run BigBench case Q12 and it will fail. 
> Check the executor log (e.g. /usr/lib/spark/work/app-/2/stderr) and you 
> will find the error 'Error while trying to create table container' in the log, 
> and also a NullPointerException near the end of the log.
> (a) Detail error message for 'Error while trying to create table container':
> {noformat}
> 15/02/12 01:29:49 ERROR SparkMapRecordHandler: Error processing row: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to 
> create table container
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to 
> create table container
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:118)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:193)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:219)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1051)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:486)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:141)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:47)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:27)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:98)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at 
> org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:217)
>   at 
> org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error while 
> trying to create table container
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.load(MapJoinTableContainerSerDe.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:115)
>   ... 21 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error, not a 
> directory: 
> hdfs://bhx1:8020/tmp/hive/root/d22ef465-bff5-4edb-a822-0a9f1c25b66c/hive_2015-02-12_01-28-10_008_6897031694580088767-1/-mr-10009/HashTable-Stage-6/MapJoin-mapfile01--.hashtable
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.load(MapJoinTableContainerSerDe.java:106)
>   ... 22 more
> 15/02/12 01:29:49 INFO SparkRecordHandler: maximum memory = 40939028480
> 15/02/12 01:29:49 INFO PerfLogger:  from=org.apache.hadoop.hive.ql.exec.spark.SparkRecordHandler>
> {noformat}
> (b) Detail error message for NullPointerException:
> {noformat}

[jira] [Updated] (HIVE-9675) Support START TRANSACTION/COMMIT/ROLLBACK commands

2015-02-13 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-9675:
-
Summary: Support START TRANSACTION/COMMIT/ROLLBACK commands  (was: Support 
BEGIN/COMMIT/ROLLBACK commands)

> Support START TRANSACTION/COMMIT/ROLLBACK commands
> --
>
> Key: HIVE-9675
> URL: https://issues.apache.org/jira/browse/HIVE-9675
> Project: Hive
>  Issue Type: Bug
>  Components: SQL, Transactions
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> Hive 0.14 added support for insert/update/delete statements with ACID 
> semantics.  Hive 0.14 only supports auto-commit mode.  We need to add support 
> for START TRANSACTION/COMMIT/ROLLBACK commands so that the user can demarcate 
> transaction boundaries.
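A hedged sketch of the intended usage once the commands exist (JDBC against 
HiveServer2; the connection URL and table are illustrative, and the syntax is 
what this issue proposes, not what Hive 0.14 supports today):

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class TxnDemo {
  public static void main(String[] args) throws Exception {
    try (Connection con = DriverManager.getConnection(
             "jdbc:hive2://localhost:10000/default");
         Statement st = con.createStatement()) {
      st.execute("START TRANSACTION");
      st.execute("UPDATE acid_tbl SET val = val + 1 WHERE id = 7");
      st.execute("COMMIT");  // or ROLLBACK to discard the update
    }
  }
}
{code}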



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9675) Support BEGIN/COMMIT/ROLLBACK commands

2015-02-13 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-9675:
-
Description: Hive 0.14 added support for insert/update/delete statements 
with ACID semantics.  Hive 0.14 only supports auto-commit mode.  We need to add 
support for START TRANSACTION/COMMIT/ROLLBACK commands so that the user can 
demarcate transaction boundaries.  (was: Hive 0.14 added support for 
insert/update/delete statements with ACID semantics.  Hive 0.14 only supports 
auto-commit mode.  We need to add support for BEGIN/COMMIT/ROLLBACK commands so 
that the user can demarcate transaction boundaries.)

> Support BEGIN/COMMIT/ROLLBACK commands
> --
>
> Key: HIVE-9675
> URL: https://issues.apache.org/jira/browse/HIVE-9675
> Project: Hive
>  Issue Type: Bug
>  Components: SQL, Transactions
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> Hive 0.14 added support for insert/update/delete statements with ACID 
> semantics.  Hive 0.14 only supports auto-commit mode.  We need to add support 
> for START TRANSACTION/COMMIT/ROLLBACK commands so that the user can demarcate 
> transaction boundaries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

