Review Request 30208: HIVE-9449 Push YARN configuration to Spark while deploying Spark on YARN [Spark Branch]

2015-01-22 Thread chengxiang li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30208/
---

Review request for hive and Xuefu Zhang.


Bugs: HIVE-9449
https://issues.apache.org/jira/browse/HIVE-9449


Repository: hive-git


Description
---

Currently we only push Spark configuration and RSC configuration to Spark when 
launching the Spark cluster; in Spark on YARN mode, Spark needs extra YARN 
configuration to launch the cluster. Besides this, to support dynamic 
configuration changes, we need to recreate the SparkSession whenever the RSC or 
YARN configuration is updated, as those settings may also influence the Spark 
cluster deployment.
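
As a rough sketch of the first part, the launch path could forward YARN 
properties alongside the Spark ones. The snippet below is a minimal 
illustration of that idea only; the "yarn." prefix check, class name, and 
method shape are assumptions, not the actual HiveSparkClientFactory change.

{code}
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.hive.conf.HiveConf;

public class YarnConfForwardSketch {
  // Copy Spark and YARN properties from HiveConf into the config map
  // handed to the Spark client when the cluster is launched.
  public static Map<String, String> collectSparkConf(HiveConf hiveConf) {
    Map<String, String> sparkConf = new HashMap<String, String>();
    for (Map.Entry<String, String> entry : hiveConf) {
      String name = entry.getKey();
      // Treating "yarn." as the relevant key prefix is an assumption.
      if (name.startsWith("spark.") || name.startsWith("yarn.")) {
        sparkConf.put(name, entry.getValue());
      }
    }
    return sparkConf;
  }
}
{code}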


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d4d98d7 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 
9dc6c47 

Diff: https://reviews.apache.org/r/30208/diff/


Testing
---


Thanks,

chengxiang li



[jira] [Updated] (HIVE-9449) Push YARN configuration to Spark while deploying Spark on YARN [Spark Branch]

2015-01-22 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-9449:

Attachment: HIVE-9449.1-spark.patch

> Push YARN configuration to Spark while deploying Spark on YARN [Spark Branch]
> 
>
> Key: HIVE-9449
> URL: https://issues.apache.org/jira/browse/HIVE-9449
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
> Attachments: HIVE-9449.1-spark.patch
>
>
> Currently we only push Spark configuration and RSC configuration to Spark 
> when launching the Spark cluster; in Spark on YARN mode, Spark needs extra 
> YARN configuration to launch the cluster. Besides this, to support dynamic 
> configuration changes, we need to recreate the SparkSession whenever the RSC 
> or YARN configuration is updated, as those settings may also influence the 
> Spark cluster deployment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9449) Push YARN configuration to Spark while deploying Spark on YARN [Spark Branch]

2015-01-22 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-9449:

Status: Patch Available  (was: Open)

> Push YARN configuration to Spark while deploying Spark on YARN [Spark Branch]
> 
>
> Key: HIVE-9449
> URL: https://issues.apache.org/jira/browse/HIVE-9449
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
> Attachments: HIVE-9449.1-spark.patch
>
>
> Currently we only push Spark configuration and RSC configuration to Spark 
> when launching the Spark cluster; in Spark on YARN mode, Spark needs extra 
> YARN configuration to launch the cluster. Besides this, to support dynamic 
> configuration changes, we need to recreate the SparkSession whenever the RSC 
> or YARN configuration is updated, as those settings may also influence the 
> Spark cluster deployment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9449) Push YARN configuration to Spark while deploying Spark on YARN [Spark Branch]

2015-01-22 Thread Chengxiang Li (JIRA)
Chengxiang Li created HIVE-9449:
---

 Summary: Push YARN configuration to Spark while deploying Spark on 
YARN [Spark Branch]
 Key: HIVE-9449
 URL: https://issues.apache.org/jira/browse/HIVE-9449
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li


Currently we only push Spark configuration and RSC configuration to Spark when 
launching the Spark cluster; in Spark on YARN mode, Spark needs extra YARN 
configuration to launch the cluster. Besides this, to support dynamic 
configuration changes, we need to recreate the SparkSession whenever the RSC or 
YARN configuration is updated, as those settings may also influence the Spark 
cluster deployment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9434) Shim the method Path.getPathWithoutSchemeAndAuthority

2015-01-22 Thread Dong Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288918#comment-14288918
 ] 

Dong Chen commented on HIVE-9434:
-

Failed tests seem unrelated.

> Shim the method Path.getPathWithoutSchemeAndAuthority
> -
>
> Key: HIVE-9434
> URL: https://issues.apache.org/jira/browse/HIVE-9434
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 0.15.0
>Reporter: Brock Noland
>Assignee: Dong Chen
> Fix For: 0.15.0
>
> Attachments: HIVE-9434.patch
>
>
> Since Hadoop 1 does not have the method 
> {{Path.getPathWithoutSchemeAndAuthority}}, we need to shim it out.
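
A shim here simply re-implements the missing method for Hadoop 1. Below is a 
minimal sketch of what such a method could look like, mirroring the Hadoop 2 
behavior; the class placement and exact body are illustrative, not the actual 
patch.

{code}
import java.net.URI;
import org.apache.hadoop.fs.Path;

public class PathShimSketch {
  // Strip the scheme (e.g. "hdfs") and authority (host:port) from a
  // Path, keeping only the path component, as Hadoop 2's
  // Path.getPathWithoutSchemeAndAuthority does.
  public static Path getPathWithoutSchemeAndAuthority(Path path) {
    URI uri = path.toUri();
    if (uri.getScheme() == null && uri.getAuthority() == null) {
      return path;  // already scheme- and authority-free
    }
    return new Path(uri.getPath());
  }
}
{code}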



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9410) ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]

2015-01-22 Thread Reynold Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin updated HIVE-9410:
--
Description: 
We have a Hive query case with a UDF defined (e.g. BigBench cases Q10, Q18, 
etc.). It passes in default Hive (on MR) mode, but fails in Hive on Spark mode 
(both Standalone and Yarn-Client). 

Although we use 'add jar .jar;' to add the UDF jar explicitly, the issue 
still exists. 

BTW, if we put the UDF jar into the $HIVE_HOME/lib dir, the case passes.

The detailed error message is below (NOTE: 
de.bankmark.bigbench.queries.q10.SentimentUDF is the UDF contained in jar 
bigbenchqueriesmr.jar, and we have added a command like 'add jar 
/location/to/bigbenchqueriesmr.jar;' to the .sql explicitly):

{code}
INFO  [pool-1-thread-1]: client.RemoteDriver (RemoteDriver.java:call(316)) - 
Failed to run job 8dd120cb-1a4d-4d1c-ba31-61eac648c27d
org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: 
de.bankmark.bigbench.queries.q10.SentimentUDF
Serialization trace:
genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
childOperators (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
childOperators (org.apache.hadoop.hive.ql.exec.FilterOperator)
childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
right (org.apache.commons.lang3.tuple.ImmutablePair)
edgeProperties (org.apache.hadoop.hive.ql.plan.SparkWork)
at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
...
Caused by: java.lang.ClassNotFoundException: 
de.bankmark.bigbench.queries.q10.SentimentUDF
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
... 55 more
{code}

  was:
We have a Hive query case with a UDF defined (e.g. BigBench cases Q10, Q18, 
etc.). It passes in default Hive (on MR) mode, but fails in Hive on Spark mode 
(both Standalone and Yarn-Client). 

Although we use 'add jar .jar;' to add the UDF jar explicitly, the issue 
still exists. 

BTW, if we put the UDF jar into the $HIVE_HOME/lib dir, the case passes.

The detailed error message is below (NOTE: 
de.bankmark.bigbench.queries.q10.SentimentUDF is the UDF contained in jar 
bigbenchqueriesmr.jar, and we have added a command like 'add jar 
/location/to/bigbenchqueriesmr.jar;' to the .sql explicitly):

INFO  [pool-1-thread-1]: client.RemoteDriver (RemoteDriver.java:call(316)) - 
Failed to run job 8dd120cb-1a4d-4d1c-ba31-61eac648c27d
org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: 
de.bankmark.bigbench.queries.q10.SentimentUDF
Serialization trace:
gener

[jira] [Commented] (HIVE-9327) CBO (Calcite Return Path): Removing Row Resolvers from ParseContext

2015-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288911#comment-14288911
 ] 

Hive QA commented on HIVE-9327:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12694013/HIVE-9327.05.patch

{color:red}ERROR:{color} -1 due to 73 failed/errored test(s), 7347 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_excludeHadoop20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_multi
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join27
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_simple_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_subq_in
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_column_access_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_logical
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadataOnlyOptimizer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadataonly1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_gby2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_outer_join5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_vc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rcfile_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_views
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_table_access_keys_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union24
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union28
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union30
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_null
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_6_subq
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_simple_select
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_subq_in
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_filter_join_breaktask
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_filter_join_breaktask2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mrr
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_mapjoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMinimrCliDriver.tes

Re: Creating a branch for hbase metastore work

2015-01-22 Thread Nick Dimiduk
+1

On Thursday, January 22, 2015, Brock Noland  wrote:

> +1
>
> On Thu, Jan 22, 2015 at 8:19 PM, Alan Gates  > wrote:
> > I've been working on a prototype of using HBase to store Hive's metadata.
> > Basically I've built a new implementation of RawStore that writes to
> HBase
> > rather than DataNucleus.  I want to see if I can build something that
> has a
> > much more straightforward schema than DN and that is much faster.
> >
> > I'd like to get this out in public so others can look at it and
> contribute,
> > but it's nowhere near ready for prime time.  So I propose to create a
> branch
> > and put the code there.  Any objections?
> >
> > Alan.
> >
>
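
As a concrete illustration of the schema idea in Alan's note: metadata objects 
can live in HBase rows keyed directly by name, so fetching a table is a point 
get rather than a join across normalized relational tables. The standalone 
sketch below uses the HBase 0.98-era client API; the HBase table layout, 
names, and serialization are assumptions for illustration, not the actual 
prototype (which implements Hive's RawStore interface).

{code}
import java.io.IOException;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseMetadataSketch {
  private static final byte[] CF = Bytes.toBytes("catalog");
  private static final byte[] COL = Bytes.toBytes("table");
  private final HTableInterface htable;

  public HBaseMetadataSketch(HTableInterface htable) {
    this.htable = htable;
  }

  // One row per Hive table, keyed by "dbName.tableName"; the serialized
  // Table object lives in a single cell.
  public void putTable(String db, String table, byte[] serialized)
      throws IOException {
    Put put = new Put(Bytes.toBytes(db + "." + table));
    put.add(CF, COL, serialized);
    htable.put(put);
  }

  public byte[] getTable(String db, String table) throws IOException {
    Get get = new Get(Bytes.toBytes(db + "." + table));
    return htable.get(get).getValue(CF, COL);
  }
}
{code}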


Re: Review Request 30107: HIVE-9410, ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]

2015-01-22 Thread chengxiang li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30107/
---

(Updated Jan. 23, 2015, 6:44 a.m.)


Review request for hive and Xuefu Zhang.


Bugs: HIVE-9410
https://issues.apache.org/jira/browse/HIVE-9410


Repository: hive-git


Description
---

The RemoteDriver does not contain added jars in its classpath, so it fails to 
deserialize SparkWork with a ClassNotFoundException. For Hive on MR, when a jar 
is added through the Hive CLI, Hive adds it to the CLI classpath (through the 
thread context classloader) and to the distributed cache as well. Compared to 
Hive on MR, Hive on Spark has an extra RemoteDriver component, so we should add 
the added jars to its classpath as well.
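
A minimal sketch of that idea follows: install a URLClassLoader over the 
current thread context classloader so the deserializer can resolve UDF classes 
from the added jars. The class and method names here are illustrative, not the 
actual SparkClientUtilities API introduced by this patch.

{code}
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

public final class AddedJarLoaderSketch {
  private AddedJarLoaderSketch() {}

  // Wrap the current context classloader with one that also searches
  // the added jar URLs, then make it the context classloader, so
  // deserialization of SparkWork can find the UDF classes.
  public static void addToClassPath(List<URL> jarUrls) {
    ClassLoader parent = Thread.currentThread().getContextClassLoader();
    URLClassLoader loader =
        new URLClassLoader(jarUrls.toArray(new URL[0]), parent);
    Thread.currentThread().setContextClassLoader(loader);
  }
}
{code}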


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties 6340d1c 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 9d9f4e6 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java 
a4a166a 
  ql/src/test/queries/clientpositive/lateral_view_explode2.q PRE-CREATION 
  ql/src/test/results/clientpositive/lateral_view_explode2.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/spark/lateral_view_explode2.q.out 
PRE-CREATION 
  spark-client/src/main/java/org/apache/hive/spark/client/JobContext.java 
00aa4ec 
  spark-client/src/main/java/org/apache/hive/spark/client/JobContextImpl.java 
1eb3ff2 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 
5f9be65 
  
spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/30107/diff/


Testing
---


Thanks,

chengxiang li



[jira] [Updated] (HIVE-9410) ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]

2015-01-22 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-9410:

Attachment: (was: HIVE-9410.4-spark.patch)

> ClassNotFoundException occurs during hive query case execution with UDF 
> defined [Spark Branch]
> --
>
> Key: HIVE-9410
> URL: https://issues.apache.org/jira/browse/HIVE-9410
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
> Environment: CentOS 6.5
> JDK1.7
>Reporter: Xin Hao
>Assignee: Chengxiang Li
> Attachments: HIVE-9410.1-spark.patch, HIVE-9410.2-spark.patch, 
> HIVE-9410.3-spark.patch
>
>
> We have a Hive query case with a UDF defined (e.g. BigBench cases Q10, Q18, 
> etc.). It passes in default Hive (on MR) mode, but fails in Hive on Spark 
> mode (both Standalone and Yarn-Client). 
> Although we use 'add jar .jar;' to add the UDF jar explicitly, the issue 
> still exists. 
> BTW, if we put the UDF jar into the $HIVE_HOME/lib dir, the case passes.
> The detailed error message is below (NOTE: 
> de.bankmark.bigbench.queries.q10.SentimentUDF is the UDF contained in 
> jar bigbenchqueriesmr.jar, and we have added a command like 'add jar 
> /location/to/bigbenchqueriesmr.jar;' to the .sql explicitly):
> INFO  [pool-1-thread-1]: client.RemoteDriver (RemoteDriver.java:call(316)) - 
> Failed to run job 8dd120cb-1a4d-4d1c-ba31-61eac648c27d
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: de.bankmark.bigbench.queries.q10.SentimentUDF
> Serialization trace:
> genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
> conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.FilterOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> right (org.apache.commons.lang3.tuple.ImmutablePair)
> edgeProperties (org.apache.hadoop.hive.ql.plan.SparkWork)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> ...
> Caused by: java.lang.ClassNotFoundException: 
> de.bankmark.bigbench.queries.q10.SentimentUDF
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:270)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
> ... 55 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9410) ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]

2015-01-22 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-9410:

Attachment: HIVE-9410.4-spark.patch

> ClassNotFoundException occurs during hive query case execution with UDF 
> defined [Spark Branch]
> --
>
> Key: HIVE-9410
> URL: https://issues.apache.org/jira/browse/HIVE-9410
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
> Environment: CentOS 6.5
> JDK1.7
>Reporter: Xin Hao
>Assignee: Chengxiang Li
> Attachments: HIVE-9410.1-spark.patch, HIVE-9410.2-spark.patch, 
> HIVE-9410.3-spark.patch, HIVE-9410.4-spark.patch
>
>
> We have a Hive query case with a UDF defined (e.g. BigBench cases Q10, Q18, 
> etc.). It passes in default Hive (on MR) mode, but fails in Hive on Spark 
> mode (both Standalone and Yarn-Client). 
> Although we use 'add jar .jar;' to add the UDF jar explicitly, the issue 
> still exists. 
> BTW, if we put the UDF jar into the $HIVE_HOME/lib dir, the case passes.
> The detailed error message is below (NOTE: 
> de.bankmark.bigbench.queries.q10.SentimentUDF is the UDF contained in 
> jar bigbenchqueriesmr.jar, and we have added a command like 'add jar 
> /location/to/bigbenchqueriesmr.jar;' to the .sql explicitly):
> INFO  [pool-1-thread-1]: client.RemoteDriver (RemoteDriver.java:call(316)) - 
> Failed to run job 8dd120cb-1a4d-4d1c-ba31-61eac648c27d
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: de.bankmark.bigbench.queries.q10.SentimentUDF
> Serialization trace:
> genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
> conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.FilterOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> right (org.apache.commons.lang3.tuple.ImmutablePair)
> edgeProperties (org.apache.hadoop.hive.ql.plan.SparkWork)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> ...
> Caused by: java.lang.ClassNotFoundException: 
> de.bankmark.bigbench.queries.q10.SentimentUDF
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:270)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
> ... 55 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 30107: HIVE-9410, ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]

2015-01-22 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30107/#review69357
---

Ship it!


Ship It!

- Xuefu Zhang


On Jan. 23, 2015, 6:37 a.m., chengxiang li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30107/
> ---
> 
> (Updated Jan. 23, 2015, 6:37 a.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-9410
> https://issues.apache.org/jira/browse/HIVE-9410
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The RemoteDriver does not contain added jars in its classpath, so it fails 
> to deserialize SparkWork with a ClassNotFoundException. For Hive on MR, when 
> a jar is added through the Hive CLI, Hive adds it to the CLI classpath 
> (through the thread context classloader) and to the distributed cache as 
> well. Compared to Hive on MR, Hive on Spark has an extra RemoteDriver 
> component, so we should add the added jars to its classpath as well.
> 
> 
> Diffs
> -
> 
>   itests/src/test/resources/testconfiguration.properties 6340d1c 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 9d9f4e6 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java 
> a4a166a 
>   ql/src/test/queries/clientpositive/lateral_view_explode2.q PRE-CREATION 
>   ql/src/test/results/clientpositive/lateral_view_explode2.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/spark/lateral_view_explode2.q.out 
> PRE-CREATION 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContext.java 
> 00aa4ec 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContextImpl.java 
> 1eb3ff2 
>   
> spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 
> 5f9be65 
>   
> spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/30107/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> chengxiang li
> 
>



[jira] [Commented] (HIVE-9410) ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]

2015-01-22 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288864#comment-14288864
 ] 

Xuefu Zhang commented on HIVE-9410:
---

+1

> ClassNotFoundException occurs during hive query case execution with UDF 
> defined [Spark Branch]
> --
>
> Key: HIVE-9410
> URL: https://issues.apache.org/jira/browse/HIVE-9410
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
> Environment: CentOS 6.5
> JDK1.7
>Reporter: Xin Hao
>Assignee: Chengxiang Li
> Attachments: HIVE-9410.1-spark.patch, HIVE-9410.2-spark.patch, 
> HIVE-9410.3-spark.patch
>
>
> We have a Hive query case with a UDF defined (e.g. BigBench cases Q10, Q18, 
> etc.). It passes in default Hive (on MR) mode, but fails in Hive on Spark 
> mode (both Standalone and Yarn-Client). 
> Although we use 'add jar .jar;' to add the UDF jar explicitly, the issue 
> still exists. 
> BTW, if we put the UDF jar into the $HIVE_HOME/lib dir, the case passes.
> The detailed error message is below (NOTE: 
> de.bankmark.bigbench.queries.q10.SentimentUDF is the UDF contained in 
> jar bigbenchqueriesmr.jar, and we have added a command like 'add jar 
> /location/to/bigbenchqueriesmr.jar;' to the .sql explicitly):
> INFO  [pool-1-thread-1]: client.RemoteDriver (RemoteDriver.java:call(316)) - 
> Failed to run job 8dd120cb-1a4d-4d1c-ba31-61eac648c27d
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: de.bankmark.bigbench.queries.q10.SentimentUDF
> Serialization trace:
> genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
> conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.FilterOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> right (org.apache.commons.lang3.tuple.ImmutablePair)
> edgeProperties (org.apache.hadoop.hive.ql.plan.SparkWork)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> ...
> Caused by: java.lang.ClassNotFoundException: 
> de.bankmark.bigbench.queries.q10.SentimentUDF
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:270)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
> ... 55 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 30107: HIVE-9410, ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]

2015-01-22 Thread chengxiang li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30107/
---

(Updated Jan. 23, 2015, 6:37 a.m.)


Review request for hive and Xuefu Zhang.


Changes
---

Add more comments and fix what Xuefu mentioned before.


Bugs: HIVE-9410
https://issues.apache.org/jira/browse/HIVE-9410


Repository: hive-git


Description
---

The RemoteDriver does not contain added jars in its classpath, so it fails to 
deserialize SparkWork with a ClassNotFoundException. For Hive on MR, when a jar 
is added through the Hive CLI, Hive adds it to the CLI classpath (through the 
thread context classloader) and to the distributed cache as well. Compared to 
Hive on MR, Hive on Spark has an extra RemoteDriver component, so we should add 
the added jars to its classpath as well.


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties 6340d1c 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 9d9f4e6 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java 
a4a166a 
  ql/src/test/queries/clientpositive/lateral_view_explode2.q PRE-CREATION 
  ql/src/test/results/clientpositive/lateral_view_explode2.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/spark/lateral_view_explode2.q.out 
PRE-CREATION 
  spark-client/src/main/java/org/apache/hive/spark/client/JobContext.java 
00aa4ec 
  spark-client/src/main/java/org/apache/hive/spark/client/JobContextImpl.java 
1eb3ff2 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 
5f9be65 
  
spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/30107/diff/


Testing
---


Thanks,

chengxiang li



[jira] [Updated] (HIVE-9410) ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]

2015-01-22 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-9410:

Attachment: HIVE-9410.4-spark.patch

> ClassNotFoundException occurs during hive query case execution with UDF 
> defined [Spark Branch]
> --
>
> Key: HIVE-9410
> URL: https://issues.apache.org/jira/browse/HIVE-9410
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
> Environment: CentOS 6.5
> JDK1.7
>Reporter: Xin Hao
>Assignee: Chengxiang Li
> Attachments: HIVE-9410.1-spark.patch, HIVE-9410.2-spark.patch, 
> HIVE-9410.3-spark.patch, HIVE-9410.4-spark.patch
>
>
> We have a Hive query case with a UDF defined (e.g. BigBench cases Q10, Q18, 
> etc.). It passes in default Hive (on MR) mode, but fails in Hive on Spark 
> mode (both Standalone and Yarn-Client). 
> Although we use 'add jar .jar;' to add the UDF jar explicitly, the issue 
> still exists. 
> BTW, if we put the UDF jar into the $HIVE_HOME/lib dir, the case passes.
> The detailed error message is below (NOTE: 
> de.bankmark.bigbench.queries.q10.SentimentUDF is the UDF contained in 
> jar bigbenchqueriesmr.jar, and we have added a command like 'add jar 
> /location/to/bigbenchqueriesmr.jar;' to the .sql explicitly):
> INFO  [pool-1-thread-1]: client.RemoteDriver (RemoteDriver.java:call(316)) - 
> Failed to run job 8dd120cb-1a4d-4d1c-ba31-61eac648c27d
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: de.bankmark.bigbench.queries.q10.SentimentUDF
> Serialization trace:
> genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
> conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.FilterOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> right (org.apache.commons.lang3.tuple.ImmutablePair)
> edgeProperties (org.apache.hadoop.hive.ql.plan.SparkWork)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> ...
> Caused by: java.lang.ClassNotFoundException: 
> de.bankmark.bigbench.queries.q10.SentimentUDF
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:270)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
> ... 55 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9410) ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]

2015-01-22 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288855#comment-14288855
 ] 

Xuefu Zhang commented on HIVE-9410:
---

Yes. We need to resolve this w/o SPARK-5377. Please address all comments in an 
updated patch.

> ClassNotFoundException occurs during hive query case execution with UDF 
> defined [Spark Branch]
> --
>
> Key: HIVE-9410
> URL: https://issues.apache.org/jira/browse/HIVE-9410
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
> Environment: CentOS 6.5
> JDK1.7
>Reporter: Xin Hao
>Assignee: Chengxiang Li
> Attachments: HIVE-9410.1-spark.patch, HIVE-9410.2-spark.patch, 
> HIVE-9410.3-spark.patch
>
>
> We have a Hive query case with a UDF defined (e.g. BigBench cases Q10, Q18, 
> etc.). It passes in default Hive (on MR) mode, but fails in Hive on Spark 
> mode (both Standalone and Yarn-Client). 
> Although we use 'add jar .jar;' to add the UDF jar explicitly, the issue 
> still exists. 
> BTW, if we put the UDF jar into the $HIVE_HOME/lib dir, the case passes.
> The detailed error message is below (NOTE: 
> de.bankmark.bigbench.queries.q10.SentimentUDF is the UDF contained in 
> jar bigbenchqueriesmr.jar, and we have added a command like 'add jar 
> /location/to/bigbenchqueriesmr.jar;' to the .sql explicitly):
> INFO  [pool-1-thread-1]: client.RemoteDriver (RemoteDriver.java:call(316)) - 
> Failed to run job 8dd120cb-1a4d-4d1c-ba31-61eac648c27d
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: de.bankmark.bigbench.queries.q10.SentimentUDF
> Serialization trace:
> genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
> conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.FilterOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> right (org.apache.commons.lang3.tuple.ImmutablePair)
> edgeProperties (org.apache.hadoop.hive.ql.plan.SparkWork)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> ...
> Caused by: java.lang.ClassNotFoundException: 
> de.bankmark.bigbench.queries.q10.SentimentUDF
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:270)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
> ... 55 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-9441) Remove call to deprecated Calcite method

2015-01-22 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9441:
---
Assignee: Julian Hyde  (was: Ashutosh Chauhan)

> Remove call to deprecated Calcite method
> 
>
> Key: HIVE-9441
> URL: https://issues.apache.org/jira/browse/HIVE-9441
> Project: Hive
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Julian Hyde
>Priority: Minor
> Fix For: 0.15.0
>
> Attachments: HIVE-9441.1.patch
>
>
> The method RexLiteral.byteValue() was deprecated and will be removed in 
> Calcite 1.0. The attached patch replaces it with a non-deprecated alternative.
> As soon as the patch is committed I will push to Apache Nexus a new 
> calcite-1.0.0-snapshot that will be very close to the proposed calcite-1.0 
> release.
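
The JIRA does not spell out the replacement. One plausible shape, assuming the 
numeric literal's value is held as a BigDecimal (Calcite's usual 
representation), is sketched below; this is an illustrative guess, not the 
actual patch.

{code}
import java.math.BigDecimal;
import org.apache.calcite.rex.RexLiteral;

public class ByteValueSketch {
  // Deprecated and removed in Calcite 1.0: literal.byteValue().
  // A plausible non-deprecated alternative, assuming a numeric literal
  // whose value is stored as a BigDecimal:
  public static byte byteValue(RexLiteral literal) {
    return ((BigDecimal) literal.getValue()).byteValue();
  }
}
{code}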



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9441) Remove call to deprecated Calcite method

2015-01-22 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9441:
---
   Resolution: Fixed
Fix Version/s: 0.15.0
   Status: Resolved  (was: Patch Available)

Thank you Julian for the patch and Ashutosh for the review! I have committed 
this to trunk.

> Remove call to deprecated Calcite method
> 
>
> Key: HIVE-9441
> URL: https://issues.apache.org/jira/browse/HIVE-9441
> Project: Hive
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Ashutosh Chauhan
>Priority: Minor
> Fix For: 0.15.0
>
> Attachments: HIVE-9441.1.patch
>
>
> The method RexLiteral.byteValue() was deprecated and will be removed in 
> Calcite 1.0. The attached patch replaces it with a non-deprecated alternative.
> As soon as the patch is committed I will push to Apache Nexus a new 
> calcite-1.0.0-snapshot that will be very close to the proposed calcite-1.0 
> release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9410) ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]

2015-01-22 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288848#comment-14288848
 ] 

Chengxiang Li commented on HIVE-9410:
-

As ser/deser between the Hive driver and the remote Spark context happens 
outside Spark, we still need this fix even if SPARK-5377 is resolved.

> ClassNotFoundException occurs during hive query case execution with UDF 
> defined [Spark Branch]
> --
>
> Key: HIVE-9410
> URL: https://issues.apache.org/jira/browse/HIVE-9410
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
> Environment: CentOS 6.5
> JDK1.7
>Reporter: Xin Hao
>Assignee: Chengxiang Li
> Attachments: HIVE-9410.1-spark.patch, HIVE-9410.2-spark.patch, 
> HIVE-9410.3-spark.patch
>
>
> We have a Hive query case with a UDF defined (e.g. BigBench cases Q10, Q18, 
> etc.). It passes in default Hive (on MR) mode, but fails in Hive on Spark 
> mode (both Standalone and Yarn-Client). 
> Although we use 'add jar .jar;' to add the UDF jar explicitly, the issue 
> still exists. 
> BTW, if we put the UDF jar into the $HIVE_HOME/lib dir, the case passes.
> The detailed error message is below (NOTE: 
> de.bankmark.bigbench.queries.q10.SentimentUDF is the UDF contained in 
> jar bigbenchqueriesmr.jar, and we have added a command like 'add jar 
> /location/to/bigbenchqueriesmr.jar;' to the .sql explicitly):
> INFO  [pool-1-thread-1]: client.RemoteDriver (RemoteDriver.java:call(316)) - 
> Failed to run job 8dd120cb-1a4d-4d1c-ba31-61eac648c27d
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: de.bankmark.bigbench.queries.q10.SentimentUDF
> Serialization trace:
> genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
> conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.FilterOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> right (org.apache.commons.lang3.tuple.ImmutablePair)
> edgeProperties (org.apache.hadoop.hive.ql.plan.SparkWork)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> ...
> Caused by: java.lang.ClassNotFoundException: 
> de.bankmark.bigbench.queries.q10.SentimentUDF
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:270)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
> ... 55 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Re: Created branch 1.0

2015-01-22 Thread Brock Noland
Hi Alan,

I agree with Xuefu and what was suggested in your statement. I was
thinking we'd release the next release as 0.15 and then later there
would be 1.0 off trunk (e.g. what would have been 0.16) and thus be a
superset (minus anything we intentionally remove).

As I have said several times, I'd like to release more often so I feel
we could even start the 1.0 work shortly after the 0.15 release. For
my part, I do agree with some earlier contributor/user sentiment that
it would be good to have some basic public API defined for 1.0. I
don't think that will be too hard as it's more or less obvious what
our public API is today.

Hope this seems reasonable.

Cheers,
Brock

On Thu, Jan 22, 2015 at 2:31 PM, Xuefu Zhang  wrote:
> Hi Thejas/Alan,
>
> From all the arguments, I think there was an assumption that the proposed
> 1.0 release will be imminent and 0.15 will happen far after that. Based on
> that assumption, 0.15 will become 1.1, which is greater in scope than 1.0.
> However, this assumption may not be true. The confusion will be significant
> if 0.15 is released early as 0.15 before 0.14.1 is released as 1.0.
>
> Another concern is that the proposed release of 1.0 is a subset of
> Hive's functionality, and for major releases users expect major
> improvements in functionality as well as stability. Mutating from the 0.14.1
> release seems to fall short of that expectation.
>
> Having said that, I'd think it makes more sense to release 0.15 as 0.15,
> and later we release 1.0 as the major release that supersedes any previous
> releases. That will fulfill the expectations of a major release.
>
> Thanks,
> Xuefu
>
> On Thu, Jan 22, 2015 at 12:12 PM, Alan Gates  wrote:
>
>> I had one clarifying question for Brock and Xuefu.  Was your proposal to
>> still call the branch from trunk you are planning in a few days 0.15 (and
>> hence release it as 0.15) and have 1.0 be a later release?  Or did you want
>> to call what is now 0.15 1.0?  If you wanted 1.0 to be post 0.15, are you
>> ok with stipulating that the next release from trunk after 0.15 (what would
>> have been 0.16) is 1.0?
>>
>> Alan.
>>
>>   Thejas Nair 
>>  January 22, 2015 at 12:04
>> Brock, Xuefu,
>>
>> We seem to have trouble reaching to a consensus here. (Please see my
>> arguments why I don't see this causing confusions, and let me know if
>> it changes your opinion).
>> How should we move forward ? Do you think we need to go through a
>> formal vote regarding the release plan as per hive by-laws ?
>>
>>
>>   Thejas Nair 
>>  January 22, 2015 at 10:38
>> I don't see any reasons for confusion.
>> From a user perspective, 1.0 is going to have a super set of changes of
>> 0.14.
>> 1.1 (based on planned 0.15 release) will have a super set of changes in
>> 1.0 .
>>
>>
>>   Xuefu Zhang 
>>  January 21, 2015 at 22:47
>> I strongly believe that the concept of 1.0 out of a branch as proposed is
>> creating the greatest confusion in the community. If for any reason that
>> 1.0 cannot be cut from the trunk, that means that we are not ready and so
>> shall wait until so before considering such a release. Thus, I'd -1 on this
>> proposal.
>>
>> Thanks,
>> Xuefu
>>
>>
>>   Gopal V 
>>  January 21, 2015 at 22:29
>> On 1/21/15, 7:09 PM, Brock Noland wrote:
>>
>> To be clear, I strongly feel creating 1.0 from 0.14 will be confusing. In
>> fact it's already created confusion amongst folks on this list.
>> Furthermore
>> 1.0 should be created from trunk and be a superset of previous releases.
>>
>>
>> I don't think there is any confusion over that - 1.0 is a long-term
>> maintenance release which is going to be a super-set of all *critical fixes* made
>> from here on (emphasis).
>>
>> In fact, a long-term support release should be released off an actively
>> updated maintenance branch that has been baked in, and never from the
>> trunk.
>>
>> Those who have followed the earlier mails would realize that the most
>> important "feature" about this branch is to stick to only long term
>> maintenance - which in effect is adopting HBase's successful idea.
>>
>> That is just plain solid engineering.
>>
>> Anyway, it would be in the best interests of the larger community, to find
>> out who else finds that approach confusing.
>>
>> Brock, I'm not sure whether you are confused or whether you think other
>> people will be confused (and if so, why?).
>>
>> Cheers,
>> Gopal
>>
>> On Wed, Jan 21, 2015 at 6:05 PM, Vikram Dixit K 
>> 
>> wrote:
>>
>> @Brock,
>>
>> I created this branch from 0.14, based on the email
>> thread discussing 1.0,
>>
>> http://search-hadoop.com/m/8er9YGX8g2
>>
>> where you said you agreed with the suggestion from Enis of HBase that
>> we should base 1.0 on a stable version rather than making it a
>> feature release.
>>
>> @Lefty,
>>
>> You are right in that branch 0.14 has been made 1.0. You are also right
>> that 0.15 would be 1.1.0 and we should capture that.
>>
>> Regards
>> Vikram.
>>
>> On Wed, Jan 21, 2015 at 5:13

[jira] [Commented] (HIVE-9431) CBO (Calcite Return Path): Removing AST from ParseContext

2015-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288826#comment-14288826
 ] 

Hive QA commented on HIVE-9431:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12693841/HIVE-9431.patch

{color:red}ERROR:{color} -1 due to 64 failed/errored test(s), 7347 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_excludeHadoop20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_multi
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join27
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_simple_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_subq_in
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_column_access_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_logical
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadataOnlyOptimizer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadataonly1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_gby2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_outer_join5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_vc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rcfile_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_views
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_table_access_keys_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union24
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union28
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union30
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_null
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_6_subq
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_simple_select
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_subq_in
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_filter_join_breaktask
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_filter_join_breaktask2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mrr
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_mapjoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_constpr

Re: Creating a branch for hbase metastore work

2015-01-22 Thread Brock Noland
+1

On Thu, Jan 22, 2015 at 8:19 PM, Alan Gates  wrote:
> I've been working on a prototype of using HBase to store Hive's metadata.
> Basically I've built a new implementation of RawStore that writes to HBase
> rather than DataNucleus.  I want to see if I can build something that has a
> much more straightforward schema than DN and that is much faster.
>
> I'd like to get this out in public so others can look at it and contribute,
> but it's nowhere near ready for prime time.  So I propose to create a branch
> and put the code there.  Any objections?
>
> Alan.
>


[jira] [Commented] (HIVE-9436) RetryingMetaStoreClient does not retry JDOExceptions

2015-01-22 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288800#comment-14288800
 ] 

Thejas M Nair commented on HIVE-9436:
-

[~hsubramaniyan] That is a good point regarding the two places where exception 
checks and retries happen. IMO, the metastore client should ideally retry only 
issues related to communication with the metastore server. HMSHandler is the one 
that should retry on temporary errors such as JDOException. As [~sushanth] said, 
we have the real cause available there and can do an instanceof check. It seems 
like a simpler and cleaner change to do the check there. [~sushanth] What do you 
think?
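
For illustration, a sketch of the instanceof check being discussed (the 
placement and variable names are assumptions, not the actual patch):

{code}
// Hypothetical snippet inside HMSHandler's retry logic, where the real
// cause is available, so no message-regex matching is needed.
Throwable cause = e.getCause();
if (cause instanceof javax.jdo.JDOException) {
  // transient persistence-layer error: fall through and retry
} else {
  throw e;
}
{code}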


> RetryingMetaStoreClient does not retry JDOExceptions
> 
>
> Key: HIVE-9436
> URL: https://issues.apache.org/jira/browse/HIVE-9436
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-9436.2.patch, HIVE-9436.patch
>
>
> RetryingMetaStoreClient has a bug in the following bit of code:
> {code}
> } else if ((e.getCause() instanceof MetaException) &&
> e.getCause().getMessage().matches("JDO[a-zA-Z]*Exception")) {
>   caughtException = (MetaException) e.getCause();
> } else {
>   throw e.getCause();
> }
> {code}
> The bug here is that Java's String.matches matches the entire string against 
> the regex, and thus the match will fail if the message contains anything 
> before or after JDO[a-zA-Z]\*Exception. The solution, however, is very 
> simple: we should match .\*JDO[a-zA-Z]\*Exception.\* instead.
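
For illustration, a minimal demo of the full-string semantics of String.matches 
(the message text below is hypothetical):

{code}
String msg = "javax.jdo.JDODataStoreException: Communications link failure";

// matches() anchors the pattern to the whole string, so this is false:
boolean buggy = msg.matches("JDO[a-zA-Z]*Exception");

// Wrapping the pattern with .* lets it match anywhere in the message:
boolean fixed = msg.matches(".*JDO[a-zA-Z]*Exception.*");

System.out.println(buggy + " " + fixed); // prints: false true
{code}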



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9410) ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]

2015-01-22 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288797#comment-14288797
 ] 

Chengxiang Li commented on HIVE-9410:
-

Yes, Spark would address this issue more properly; I've created SPARK-5377 for 
this. About the unit test, udf_example_add.q is not suitable to verify this 
issue, as Hive does not need to load the UDF class during SparkWork 
serialization. I would try to enable some UDTF unit tests for this.

> ClassNotFoundException occurs during hive query case execution with UDF 
> defined [Spark Branch]
> --
>
> Key: HIVE-9410
> URL: https://issues.apache.org/jira/browse/HIVE-9410
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
> Environment: CentOS 6.5
> JDK1.7
>Reporter: Xin Hao
>Assignee: Chengxiang Li
> Attachments: HIVE-9410.1-spark.patch, HIVE-9410.2-spark.patch, 
> HIVE-9410.3-spark.patch
>
>
> We have a Hive query case with a UDF defined (e.g. BigBench cases Q10, Q18, 
> etc.). It passes in default Hive (on MR) mode, but fails in Hive 
> on Spark mode (both Standalone and Yarn-Client). 
> Although we use 'add jar .jar;' to add the UDF jar explicitly, the issue 
> still exists. 
> BTW, if we put the UDF jar into the $HIVE_HOME/lib dir, the case passes.
> The detailed error message is below (NOTE: 
> de.bankmark.bigbench.queries.q10.SentimentUDF is the UDF contained in the 
> jar bigbenchqueriesmr.jar, and we have added a command like 'add jar 
> /location/to/bigbenchqueriesmr.jar;' to the .sql explicitly)
> INFO  [pool-1-thread-1]: client.RemoteDriver (RemoteDriver.java:call(316)) - 
> Failed to run job 8dd120cb-1a4d-4d1c-ba31-61eac648c27d
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: de.bankmark.bigbench.queries.q10.SentimentUDF
> Serialization trace:
> genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
> conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.FilterOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> right (org.apache.commons.lang3.tuple.ImmutablePair)
> edgeProperties (org.apache.hadoop.hive.ql.plan.SparkWork)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> ...
> Caused by: java.lang.ClassNotFoundException: 
> de.bankmark.bigbench.queries.q10.SentimentUDF
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.ja

Re: Review Request 29800: Apply ColumnPrunning for noop PTFs

2015-01-22 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29800/
---

(Updated Jan. 23, 2015, 5:41 a.m.)


Review request for hive.


Changes
---

Updated with full diff


Bugs: HIVE-9341
https://issues.apache.org/jira/browse/HIVE-9341


Repository: hive-git


Description
---

Currently, PTF disables CP (column pruning) optimization, which can impose a 
huge burden. For example,
{noformat}
select p_mfgr, p_name, p_size,
rank() over (partition by p_mfgr order by p_name) as r,
dense_rank() over (partition by p_mfgr order by p_name) as dr,
sum(p_retailprice) over (partition by p_mfgr order by p_name rows between 
unbounded preceding and current row) as s1
from noop(on part 
  partition by p_mfgr
  order by p_name
  );

STAGE PLANS:
  Stage: Stage-1
Map Reduce
  Map Operator Tree:
  TableScan
alias: part
Statistics: Num rows: 26 Data size: 3147 Basic stats: COMPLETE 
Column stats: NONE
Reduce Output Operator
  key expressions: p_mfgr (type: string), p_name (type: string)
  sort order: ++
  Map-reduce partition columns: p_mfgr (type: string)
  Statistics: Num rows: 26 Data size: 3147 Basic stats: COMPLETE 
Column stats: NONE
  value expressions: p_partkey (type: int), p_name (type: string), 
p_mfgr (type: string), p_brand (type: string), p_type (type: string), p_size 
(type: int), p_container (type: string), p_retailprice (type: double), 
p_comment (type: string), BLOCK__OFFSET__INSIDE__FILE (type: bigint), 
INPUT__FILE__NAME (type: string), ROW__ID (type: 
struct)
...
{noformat}

There should be a generic way to discern referenced columns, but until then we 
know CP can be safely applied to noop functions.


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties d08651b 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 479af32 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java 
abf32f1 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkSkewJoinProcFactory.java
 fe698ef 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicatePushDown.java ee7328e 
  ql/src/test/queries/clientpositive/ptf.q 56eef0a 
  ql/src/test/queries/clientpositive/ptf_streaming.q 04b674c 
  ql/src/test/results/clientpositive/ptf.q.out 9196b94 
  ql/src/test/results/clientpositive/ptf_streaming.q.out ef7ae88 
  ql/src/test/results/clientpositive/spark/ptf.q.out 9196b94 
  ql/src/test/results/clientpositive/spark/ptf_streaming.q.out ef7ae88 
  ql/src/test/results/clientpositive/spark/vectorized_ptf.q.out f3b61ce 
  ql/src/test/results/clientpositive/tez/ptf.q.out 9196b94 
  ql/src/test/results/clientpositive/tez/ptf_streaming.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/vectorized_ptf.q.out 928c9f0 
  ql/src/test/results/clientpositive/vectorized_ptf.q.out 7fdd1d8 

Diff: https://reviews.apache.org/r/29800/diff/


Testing
---


Thanks,

Navis Ryu



[jira] [Updated] (HIVE-9228) Problem with subquery using windowing functions

2015-01-22 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9228:

Status: Patch Available  (was: Open)

> Problem with subquery using windowing functions
> ---
>
> Key: HIVE-9228
> URL: https://issues.apache.org/jira/browse/HIVE-9228
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 0.13.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-9228.1.patch.txt, create_table_tab1.sql, tab1.csv
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> The following query with window functions fails. The inner query works 
> fine.
> select col1, col2, col3 from (select col1, col2, col3, count(case when col4=1 
> then 1 end) over (partition by col1, col2) as col5, row_number() over 
> (partition by col1, col2 order by col4) as col6 from tab1) t;
> Hive generates an execution plan with 2 jobs. 
> 1. The first job basically calculates the window function for col5.  
> 2. The second job calculates the window function for col6 and produces the 
> output.
> The plan says the first job outputs the columns (col1, col2, col3, col4) to a 
> tmp file, since only these columns are used in the later stage. However, the 
> PTF operator for the first job outputs (_wcol0, col1, col2, col3, col4), with 
> _wcol0 as the result of the window function even though it's not used. 
> In the second job, the map operator still reads the 4 columns (col1, col2, 
> col3, col4) from the temp file according to the plan. That causes the 
> exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9228) Problem with subquery using windowing functions

2015-01-22 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9228:

Attachment: HIVE-9228.1.patch.txt

CP problem

> Problem with subquery using windowing functions
> ---
>
> Key: HIVE-9228
> URL: https://issues.apache.org/jira/browse/HIVE-9228
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 0.13.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-9228.1.patch.txt, create_table_tab1.sql, tab1.csv
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> The following query with window functions fails. The inner query works 
> fine.
> select col1, col2, col3 from (select col1, col2, col3, count(case when col4=1 
> then 1 end) over (partition by col1, col2) as col5, row_number() over 
> (partition by col1, col2 order by col4) as col6 from tab1) t;
> Hive generates an execution plan with 2 jobs. 
> 1. The first job basically calculates the window function for col5.  
> 2. The second job calculates the window function for col6 and produces the 
> output.
> The plan says the first job outputs the columns (col1, col2, col3, col4) to a 
> tmp file, since only these columns are used in the later stage. However, the 
> PTF operator for the first job outputs (_wcol0, col1, col2, col3, col4), with 
> _wcol0 as the result of the window function even though it's not used. 
> In the second job, the map operator still reads the 4 columns (col1, col2, 
> col3, col4) from the temp file according to the plan. That causes the 
> exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9410) ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]

2015-01-22 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288763#comment-14288763
 ] 

Xuefu Zhang commented on HIVE-9410:
---

Also, comments in the critical code section would help others understand.

> ClassNotFoundException occurs during hive query case execution with UDF 
> defined [Spark Branch]
> --
>
> Key: HIVE-9410
> URL: https://issues.apache.org/jira/browse/HIVE-9410
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
> Environment: CentOS 6.5
> JDK1.7
>Reporter: Xin Hao
>Assignee: Chengxiang Li
> Attachments: HIVE-9410.1-spark.patch, HIVE-9410.2-spark.patch, 
> HIVE-9410.3-spark.patch
>
>
> We have a Hive query case with a UDF defined (e.g. BigBench cases Q10, Q18, 
> etc.). It passes in default Hive (on MR) mode, but fails in Hive 
> on Spark mode (both Standalone and Yarn-Client). 
> Although we use 'add jar .jar;' to add the UDF jar explicitly, the issue 
> still exists. 
> BTW, if we put the UDF jar into the $HIVE_HOME/lib dir, the case passes.
> The detailed error message is below (NOTE: 
> de.bankmark.bigbench.queries.q10.SentimentUDF is the UDF contained in the 
> jar bigbenchqueriesmr.jar, and we have added a command like 'add jar 
> /location/to/bigbenchqueriesmr.jar;' to the .sql explicitly)
> INFO  [pool-1-thread-1]: client.RemoteDriver (RemoteDriver.java:call(316)) - 
> Failed to run job 8dd120cb-1a4d-4d1c-ba31-61eac648c27d
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: de.bankmark.bigbench.queries.q10.SentimentUDF
> Serialization trace:
> genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
> conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.FilterOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> right (org.apache.commons.lang3.tuple.ImmutablePair)
> edgeProperties (org.apache.hadoop.hive.ql.plan.SparkWork)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> ...
> Caused by: java.lang.ClassNotFoundException: 
> de.bankmark.bigbench.queries.q10.SentimentUDF
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:270)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
> ... 55 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9020) When dropping external tables, Hive should not verify whether user has access to the data.

2015-01-22 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288752#comment-14288752
 ] 

Thejas M Nair commented on HIVE-9020:
-

Will you also be able to include a unit test for the patch?


> When dropping external tables, Hive should not verify whether user has access 
> to the data. 
> ---
>
> Key: HIVE-9020
> URL: https://issues.apache.org/jira/browse/HIVE-9020
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1
>Reporter: Anant Nag
> Attachments: dropExternal.patch
>
>
> When dropping tables, Hive verifies whether the user has access to the data 
> on HDFS, and fails if the user doesn't have access. This makes sense for 
> internal tables, since the data has to be deleted when dropping them, but for 
> external tables Hive should not check for data access. 
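
A minimal sketch of the intended behavior (checkHdfsAccess is a hypothetical 
hook, not taken from the attached patch):

{code}
// Only enforce the HDFS data-access check when dropping a managed
// (internal) table, since only that path actually deletes the data.
if (table.getTableType() == TableType.MANAGED_TABLE) {
  checkHdfsAccess(table.getDataLocation());
}
{code}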



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9020) When dropping external tables, Hive should not verify whether user has access to the data.

2015-01-22 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288750#comment-14288750
 ] 

Thejas M Nair commented on HIVE-9020:
-

[~nntnag17] Thanks for reporting the issue and submitting the patch!
Is this issue seen when you set 
hive.metastore.authorization.storage.checks=true?


> When dropping external tables, Hive should not verify whether user has access 
> to the data. 
> ---
>
> Key: HIVE-9020
> URL: https://issues.apache.org/jira/browse/HIVE-9020
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1
>Reporter: Anant Nag
> Attachments: dropExternal.patch
>
>
> When dropping tables, Hive verifies whether the user has access to the data 
> on HDFS, and fails if the user doesn't have access. This makes sense for 
> internal tables, since the data has to be deleted when dropping them, but for 
> external tables Hive should not check for data access. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9443) ORC PPD - fix fuzzy case evaluation of IS_NULL

2015-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288731#comment-14288731
 ] 

Hive QA commented on HIVE-9443:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12693832/HIVE-9443.1.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7347 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_predicate_pushdown
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_mapjoin_addjar
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testIsNullWithNullInStats
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2485/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2485/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2485/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12693832 - PreCommit-HIVE-TRUNK-Build

> ORC PPD - fix fuzzy case evaluation of IS_NULL
> --
>
> Key: HIVE-9443
> URL: https://issues.apache.org/jira/browse/HIVE-9443
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.15.0
>Reporter: Gopal V
>Assignee: Gopal V
> Fix For: 0.15.0
>
> Attachments: HIVE-9443.1.patch
>
>
> ORC PPD evaluates IS_NULL incorrectly for the fuzzy case of some-nulls.
> The code flow in effect should be:
> {code}
> if (min == null) {
>   // all values are null
>   return YES;
> } else if (hasNull) {
>   // some values are null
>   return YES_NO; // maybe
> } else {
>   // no values are null
>   return NO;
> }
> {code}
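
For illustration, the three stats cases as unit-test-style assertions (the 
stats/evaluateIsNull helpers are hypothetical, mirroring what 
TestRecordReaderImpl exercises):

{code}
// stats(min, hasNull) is a hypothetical helper that builds column statistics.
assertEquals(TruthValue.YES,    evaluateIsNull(stats(null, true)));  // all nulls
assertEquals(TruthValue.YES_NO, evaluateIsNull(stats("a", true)));   // some nulls
assertEquals(TruthValue.NO,     evaluateIsNull(stats("a", false)));  // no nulls
{code}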



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9410) ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]

2015-01-22 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288708#comment-14288708
 ] 

Xuefu Zhang commented on HIVE-9410:
---

Also, can we enhance ./ql/src/test/queries/clientpositive/udf_example_add.q to 
automate the test for this?

> ClassNotFoundException occurs during hive query case execution with UDF 
> defined [Spark Branch]
> --
>
> Key: HIVE-9410
> URL: https://issues.apache.org/jira/browse/HIVE-9410
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
> Environment: CentOS 6.5
> JDK1.7
>Reporter: Xin Hao
>Assignee: Chengxiang Li
> Attachments: HIVE-9410.1-spark.patch, HIVE-9410.2-spark.patch, 
> HIVE-9410.3-spark.patch
>
>
> We have a Hive query case with a UDF defined (e.g. BigBench cases Q10, Q18, 
> etc.). It passes in default Hive (on MR) mode, but fails in Hive 
> on Spark mode (both Standalone and Yarn-Client). 
> Although we use 'add jar .jar;' to add the UDF jar explicitly, the issue 
> still exists. 
> BTW, if we put the UDF jar into the $HIVE_HOME/lib dir, the case passes.
> The detailed error message is below (NOTE: 
> de.bankmark.bigbench.queries.q10.SentimentUDF is the UDF contained in the 
> jar bigbenchqueriesmr.jar, and we have added a command like 'add jar 
> /location/to/bigbenchqueriesmr.jar;' to the .sql explicitly)
> INFO  [pool-1-thread-1]: client.RemoteDriver (RemoteDriver.java:call(316)) - 
> Failed to run job 8dd120cb-1a4d-4d1c-ba31-61eac648c27d
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: de.bankmark.bigbench.queries.q10.SentimentUDF
> Serialization trace:
> genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
> conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.FilterOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> right (org.apache.commons.lang3.tuple.ImmutablePair)
> edgeProperties (org.apache.hadoop.hive.ql.plan.SparkWork)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> ...
> Caused by: java.lang.ClassNotFoundException: 
> de.bankmark.bigbench.queries.q10.SentimentUDF
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:270)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
> ... 55 more



--
This message was sent by Atl

[jira] [Commented] (HIVE-9410) ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]

2015-01-22 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288707#comment-14288707
 ] 

Xuefu Zhang commented on HIVE-9410:
---

I think this makes sense. [~chengxiang li], could you create a SPARK JIRA for 
this? The approach we take here seems too hacky.

> ClassNotFoundException occurs during hive query case execution with UDF 
> defined [Spark Branch]
> --
>
> Key: HIVE-9410
> URL: https://issues.apache.org/jira/browse/HIVE-9410
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
> Environment: CentOS 6.5
> JDK1.7
>Reporter: Xin Hao
>Assignee: Chengxiang Li
> Attachments: HIVE-9410.1-spark.patch, HIVE-9410.2-spark.patch, 
> HIVE-9410.3-spark.patch
>
>
> We have a Hive query case with a UDF defined (e.g. BigBench cases Q10, Q18, 
> etc.). It passes in default Hive (on MR) mode, but fails in Hive 
> on Spark mode (both Standalone and Yarn-Client). 
> Although we use 'add jar .jar;' to add the UDF jar explicitly, the issue 
> still exists. 
> BTW, if we put the UDF jar into the $HIVE_HOME/lib dir, the case passes.
> The detailed error message is below (NOTE: 
> de.bankmark.bigbench.queries.q10.SentimentUDF is the UDF contained in the 
> jar bigbenchqueriesmr.jar, and we have added a command like 'add jar 
> /location/to/bigbenchqueriesmr.jar;' to the .sql explicitly)
> INFO  [pool-1-thread-1]: client.RemoteDriver (RemoteDriver.java:call(316)) - 
> Failed to run job 8dd120cb-1a4d-4d1c-ba31-61eac648c27d
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: de.bankmark.bigbench.queries.q10.SentimentUDF
> Serialization trace:
> genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
> conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.FilterOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> right (org.apache.commons.lang3.tuple.ImmutablePair)
> edgeProperties (org.apache.hadoop.hive.ql.plan.SparkWork)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> ...
> Caused by: java.lang.ClassNotFoundException: 
> de.bankmark.bigbench.queries.q10.SentimentUDF
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:270)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
> ... 55 more



--
This message 

Re: Review Request 30107: HIVE-9410, ClassNotFoundException occurs during hive query case execution with UDF defined[Spark Branch]

2015-01-22 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30107/#review69344
---



ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java


Could we add a check, something like:

if (hive.execution.engine == "spark") {
  try {
    ...
  }
}

The code as it is might make other people frown.
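
In concrete terms, such a guard might look like this (a sketch only; the conf 
variable and logger are assumed to be in scope):

{code}
// Only touch the classloader when running on the Spark execution engine.
String engine = HiveConf.getVar(conf, HiveConf.ConfVars.HIVE_EXECUTION_ENGINE);
if ("spark".equalsIgnoreCase(engine)) {
  try {
    // ... add the localized jars to the thread context classloader ...
  } catch (Exception e) {
    LOG.warn("Failed to add jars to the classloader", e);
  }
}
{code}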


- Xuefu Zhang


On Jan. 22, 2015, 9:23 a.m., chengxiang li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30107/
> ---
> 
> (Updated Jan. 22, 2015, 9:23 a.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-9410
> https://issues.apache.org/jira/browse/HIVE-9410
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The RemoteDriver does not contain added jars in its classpath, so it would 
> fail to deserialize SparkWork due to a ClassNotFoundException. For Hive on MR, 
> when a jar is added through the Hive CLI, Hive adds the jar to the CLI 
> classpath (through the thread context classloader) and to the distributed 
> cache as well. Compared to Hive on MR, Hive on Spark has an extra RemoteDriver 
> component, so we should add added jars to its classpath as well.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d7cb111 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java 
> 30a00a7 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContext.java 
> 00aa4ec 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContextImpl.java 
> 1eb3ff2 
>   
> spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 
> 5f9be65 
>   
> spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/30107/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> chengxiang li
> 
>
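
For context, a minimal sketch of the classloader change this describes (an 
illustration under assumed names, not the actual SparkClientUtilities code):

{code}
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

public final class AddedJarLoader {
  // Chain a new URLClassLoader in front of the current context loader so
  // that subsequent Kryo deserialization can resolve the added UDF classes.
  public static void addToClassPath(List<String> localJarPaths) throws Exception {
    URL[] urls = new URL[localJarPaths.size()];
    for (int i = 0; i < localJarPaths.size(); i++) {
      urls[i] = new File(localJarPaths.get(i)).toURI().toURL();
    }
    ClassLoader current = Thread.currentThread().getContextClassLoader();
    Thread.currentThread().setContextClassLoader(new URLClassLoader(urls, current));
  }
}
{code}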



[jira] [Commented] (HIVE-9370) SparkJobMonitor timeout as sortByKey would launch extra Spark job before original job get submitted [Spark Branch]

2015-01-22 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288698#comment-14288698
 ] 

Xuefu Zhang commented on HIVE-9370:
---

I think asking the user to log out and log in again would be fine. That way, the 
user will have a new session.

> SparkJobMonitor timeout as sortByKey would launch extra Spark job before 
> original job get submitted [Spark Branch]
> --
>
> Key: HIVE-9370
> URL: https://issues.apache.org/jira/browse/HIVE-9370
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: yuyun.chen
>Assignee: Chengxiang Li
> Fix For: spark-branch
>
> Attachments: HIVE-9370.1-spark.patch
>
>
> We enabled Hive on Spark and ran BigBench Query 8, then got the following 
> exception:
> 2015-01-14 11:43:46,057 INFO  [main]: impl.RemoteSparkJobStatus 
> (RemoteSparkJobStatus.java:getSparkJobInfo(143)) - Job hasn't been submitted 
> after 30s. Aborting it.
> 2015-01-14 11:43:46,061 INFO  [main]: impl.RemoteSparkJobStatus 
> (RemoteSparkJobStatus.java:getSparkJobInfo(143)) - Job hasn't been submitted 
> after 30s. Aborting it.
> 2015-01-14 11:43:46,061 ERROR [main]: status.SparkJobMonitor 
> (SessionState.java:printError(839)) - Status: Failed
> 2015-01-14 11:43:46,062 INFO  [main]: log.PerfLogger 
> (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG start=1421206996052 
> end=1421207026062 duration=30010 
> from=org.apache.hadoop.hive.ql.exec.spark.status.SparkJobMonitor>
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) - 15/01/14 11:43:46 INFO RemoteDriver: Failed 
> to run job 0a9a7782-0e0b-4561-8468-959a6d8df0a3
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) - java.lang.InterruptedException
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at java.lang.Object.wait(Native 
> Method)
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> java.lang.Object.wait(Object.java:503)
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.scheduler.JobWaiter.awaitResult(JobWaiter.scala:73)
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:514)
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.SparkContext.runJob(SparkContext.scala:1282)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.SparkContext.runJob(SparkContext.scala:1300)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.SparkContext.runJob(SparkContext.scala:1314)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.SparkContext.runJob(SparkContext.scala:1328)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.rdd.RDD.collect(RDD.scala:780)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.RangePartitioner$.sketch(Partitioner.scala:262)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.RangePartitioner.<init>(Partitioner.scala:124)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.rdd.OrderedRDDFunctions.sortByKey(OrderedRDDFunctions.scala:63)
> 2015-01-14 11:43:46,073 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.api.java.JavaPairRDD.sortByKey(JavaPairRDD.scala:894)
> 2015-01-14 11:43:46,073 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.api.java.JavaPairRDD.sortByKey(JavaPairRDD.scala:864)
> 2015-01-14 11:43:46,073 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.hadoop.hive.ql.exec.spark.SortByShuffler.shuffle(SortByShuffler.java:48)
> 2015-01-14 11:43:46,073 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.hadoop.hive.ql.exec.s

Re: Review Request 30107: HIVE-9410, ClassNotFoundException occurs during hive query case execution with UDF defined[Spark Branch]

2015-01-22 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30107/#review69341
---



ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java


Should we also check addedJars.isEmpty() to be consistent with other places?


- Xuefu Zhang


On Jan. 22, 2015, 9:23 a.m., chengxiang li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30107/
> ---
> 
> (Updated Jan. 22, 2015, 9:23 a.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-9410
> https://issues.apache.org/jira/browse/HIVE-9410
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The RemoteDriver does not contain added jars in its classpath, so it would 
> fail to deserialize SparkWork due to a ClassNotFoundException. For Hive on MR, 
> when a jar is added through the Hive CLI, Hive adds the jar to the CLI 
> classpath (through the thread context classloader) and to the distributed 
> cache as well. Compared to Hive on MR, Hive on Spark has an extra RemoteDriver 
> component, so we should add added jars to its classpath as well.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d7cb111 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java 
> 30a00a7 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContext.java 
> 00aa4ec 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContextImpl.java 
> 1eb3ff2 
>   
> spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 
> 5f9be65 
>   
> spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/30107/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> chengxiang li
> 
>



Re: Review Request 30107: HIVE-9410, ClassNotFoundException occurs during hive query case execution with UDF defined[Spark Branch]

2015-01-22 Thread chengxiang li


> On Jan. 23, 2015, 3:02 a.m., chengxiang li wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java, line 371
> > 
> >
> > #3 This would be executed in an Akka thread: get the extra jar paths from 
> > JobConf, and add them to the current thread's classloader.
> 
> Xuefu Zhang wrote:
> Which thread is referred to as the Akka thread?

Inside the Spark driver, SparkContext submits Spark jobs to the DAGScheduler 
through Akka messages instead of direct invocation; Akka holds a thread pool to 
handle those messages.


- chengxiang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30107/#review69336
---


On Jan. 22, 2015, 9:23 a.m., chengxiang li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30107/
> ---
> 
> (Updated Jan. 22, 2015, 9:23 a.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-9410
> https://issues.apache.org/jira/browse/HIVE-9410
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The RemoteDriver does not contain added jars in its classpath, so it would 
> fail to deserialize SparkWork due to a ClassNotFoundException. For Hive on MR, 
> when a jar is added through the Hive CLI, Hive adds the jar to the CLI 
> classpath (through the thread context classloader) and to the distributed 
> cache as well. Compared to Hive on MR, Hive on Spark has an extra RemoteDriver 
> component, so we should add added jars to its classpath as well.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d7cb111 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java 
> 30a00a7 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContext.java 
> 00aa4ec 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContextImpl.java 
> 1eb3ff2 
>   
> spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 
> 5f9be65 
>   
> spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/30107/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> chengxiang li
> 
>



Re: Review Request 30107: HIVE-9410, ClassNotFoundException occurs during hive query case execution with UDF defined[Spark Branch]

2015-01-22 Thread Xuefu Zhang


> On Jan. 23, 2015, 3:02 a.m., chengxiang li wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java, line 371
> > 
> >
> > #3 This would be executed in an Akka thread: get the extra jar paths from 
> > JobConf, and add them to the current thread's classloader.

Which thread is referred to as the Akka thread?


- Xuefu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30107/#review69336
---


On Jan. 22, 2015, 9:23 a.m., chengxiang li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30107/
> ---
> 
> (Updated Jan. 22, 2015, 9:23 a.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-9410
> https://issues.apache.org/jira/browse/HIVE-9410
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The RemoteDriver does not contain added jars in its classpath, so it would 
> fail to deserialize SparkWork due to a ClassNotFoundException. For Hive on MR, 
> when a jar is added through the Hive CLI, Hive adds the jar to the CLI 
> classpath (through the thread context classloader) and to the distributed 
> cache as well. Compared to Hive on MR, Hive on Spark has an extra RemoteDriver 
> component, so we should add added jars to its classpath as well.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d7cb111 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java 
> 30a00a7 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContext.java 
> 00aa4ec 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContextImpl.java 
> 1eb3ff2 
>   
> spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 
> 5f9be65 
>   
> spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/30107/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> chengxiang li
> 
>



Re: Review Request 30107: HIVE-9410, ClassNotFoundException occurs during hive query case execution with UDF defined[Spark Branch]

2015-01-22 Thread Xuefu Zhang


> On Jan. 23, 2015, 2:05 a.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java,
> >  line 220
> > 
> >
> > So, this is the code that adds the jars to the classpath of the remote 
> > driver?
> > 
> > I'm wondering why these jars are necessary in order to deserialize 
> > SparkWork.
> 
> chengxiang li wrote:
> Same as previous comments: SparkWork contains MapWork/ReduceWork, which 
> contain the operator tree, and UTFFOperator needs to load the added jar's 
> class.

Sorry, but which operator? UTFFOperator? I couldn't find it in the Hive source.


- Xuefu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30107/#review69329
---


On Jan. 22, 2015, 9:23 a.m., chengxiang li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30107/
> ---
> 
> (Updated Jan. 22, 2015, 9:23 a.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-9410
> https://issues.apache.org/jira/browse/HIVE-9410
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The RemoteDriver does not contain added jars in its classpath, so it would 
> fail to deserialize SparkWork due to a ClassNotFoundException. For Hive on MR, 
> when a jar is added through the Hive CLI, Hive adds the jar to the CLI 
> classpath (through the thread context classloader) and to the distributed 
> cache as well. Compared to Hive on MR, Hive on Spark has an extra RemoteDriver 
> component, so we should add added jars to its classpath as well.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d7cb111 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java 
> 30a00a7 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContext.java 
> 00aa4ec 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContextImpl.java 
> 1eb3ff2 
>   
> spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 
> 5f9be65 
>   
> spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/30107/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> chengxiang li
> 
>



Re: Review Request 30107: HIVE-9410, ClassNotFoundException occurs during hive query case execution with UDF defined[Spark Branch]

2015-01-22 Thread chengxiang li


> On Jan. 23, 2015, 2:05 a.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java,
> >  line 220
> > 
> >
> > So, this is the code that adds the jars to the classpath of the remote 
> > driver?
> > 
> > I'm wondering why these jars are necessary in order to deserialize 
> > SparkWork.
> 
> chengxiang li wrote:
> Same as previous comments: SparkWork contains MapWork/ReduceWork, which 
> contain the operator tree, and UTFFOperator needs to load the added jar's 
> class.
> 
> Xuefu Zhang wrote:
> Sorry, but which operator? UTFFOperator? I couldn't find it in the Hive source.

Sorry, as you can see from the error log in the JIRA, the extra class from the 
added jar is referenced by the UDTFOperator:

org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: 
de.bankmark.bigbench.queries.q10.SentimentUDF
Serialization trace:
genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
childOperators (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
childOperators (org.apache.hadoop.hive.ql.exec.FilterOperator)
childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)


- chengxiang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30107/#review69329
---


On Jan. 22, 2015, 9:23 a.m., chengxiang li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30107/
> ---
> 
> (Updated Jan. 22, 2015, 9:23 a.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-9410
> https://issues.apache.org/jira/browse/HIVE-9410
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The RemoteDriver does not contain added jars in its classpath, so it would 
> fail to deserialize SparkWork due to a ClassNotFoundException. For Hive on MR, 
> when a jar is added through the Hive CLI, Hive adds the jar to the CLI 
> classpath (through the thread context classloader) and to the distributed 
> cache as well. Compared to Hive on MR, Hive on Spark has an extra RemoteDriver 
> component, so we should add added jars to its classpath as well.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d7cb111 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java 
> 30a00a7 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContext.java 
> 00aa4ec 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContextImpl.java 
> 1eb3ff2 
>   
> spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 
> 5f9be65 
>   
> spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/30107/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> chengxiang li
> 
>



[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-01-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Status: Patch Available  (was: Open)

> Reduce ambiguity in grammar
> ---
>
> Key: HIVE-6617
> URL: https://issues.apache.org/jira/browse/HIVE-6617
> Project: Hive
>  Issue Type: Task
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
> HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
> HIVE-6617.06.patch, HIVE-6617.07.patch
>
>
> As of today, antlr reports 214 warnings. Need to bring down this number, 
> ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-01-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Attachment: HIVE-6617.07.patch

Added back some keywords (e.g., default) which are reserved in SQL:2011.

> Reduce ambiguity in grammar
> ---
>
> Key: HIVE-6617
> URL: https://issues.apache.org/jira/browse/HIVE-6617
> Project: Hive
>  Issue Type: Task
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
> HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
> HIVE-6617.06.patch, HIVE-6617.07.patch
>
>
> As of today, antlr reports 214 warnings. Need to bring down this number, 
> ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9410) ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]

2015-01-22 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288680#comment-14288680
 ] 

Rui Li commented on HIVE-9410:
--

Shall we ask Spark for an API to add jars to the driver's classpath?

> ClassNotFoundException occurs during hive query case execution with UDF 
> defined [Spark Branch]
> --
>
> Key: HIVE-9410
> URL: https://issues.apache.org/jira/browse/HIVE-9410
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
> Environment: CentOS 6.5
> JDK1.7
>Reporter: Xin Hao
>Assignee: Chengxiang Li
> Attachments: HIVE-9410.1-spark.patch, HIVE-9410.2-spark.patch, 
> HIVE-9410.3-spark.patch
>
>
> We have hive query cases with a UDF defined (i.e. BigBench cases Q10, Q18 
> etc.). They pass in default Hive (on MR) mode, but fail in Hive 
> on Spark mode (both Standalone and Yarn-Client). 
> Although we use 'add jar .jar;' to add the UDF jar explicitly, the issue 
> still exists. 
> BTW, if we put the UDF jar into the $HIVE_HOME/lib dir, the cases pass.
> The detailed error message is below (NOTE: 
> de.bankmark.bigbench.queries.q10.SentimentUDF is the UDF contained in the 
> jar bigbenchqueriesmr.jar, and we have added a command like 'add jar 
> /location/to/bigbenchqueriesmr.jar;' to the .sql explicitly)
> INFO  [pool-1-thread-1]: client.RemoteDriver (RemoteDriver.java:call(316)) - 
> Failed to run job 8dd120cb-1a4d-4d1c-ba31-61eac648c27d
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: de.bankmark.bigbench.queries.q10.SentimentUDF
> Serialization trace:
> genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
> conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.FilterOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> right (org.apache.commons.lang3.tuple.ImmutablePair)
> edgeProperties (org.apache.hadoop.hive.ql.plan.SparkWork)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> ...
> Caused by: java.lang.ClassNotFoundException: 
> de.bankmark.bigbench.queries.q10.SentimentUDF
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:270)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
> ... 55 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9370) SparkJobMonitor timeout as sortByKey would launch extra Spark job before original job get submitted [Spark Branch]

2015-01-22 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288684#comment-14288684
 ] 

Chengxiang Li commented on HIVE-9370:
-

RSC has a timeout at the Netty level, so if the remote Spark context does not 
respond at the Netty level, we would get the exception. One issue is that the 
SparkSession is still alive: the user could still submit queries, but they would 
fail to execute as the RPC channel is already closed, so the user needs to restart 
the Hive CLI or use a tricky way to get a new remote Spark context, like updating 
the Spark configuration.
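
For example, from the Hive CLI, changing any spark.* property is enough to force 
a fresh remote Spark context on the next query (a workaround rather than a fix; 
the property and value below are arbitrary):

    -- touching a spark.* setting invalidates the cached SparkSession,
    -- so the next query launches a new remote Spark context
    set spark.executor.memory=2g;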

> SparkJobMonitor timeout as sortByKey would launch extra Spark job before 
> original job get submitted [Spark Branch]
> --
>
> Key: HIVE-9370
> URL: https://issues.apache.org/jira/browse/HIVE-9370
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: yuyun.chen
>Assignee: Chengxiang Li
> Fix For: spark-branch
>
> Attachments: HIVE-9370.1-spark.patch
>
>
> enable hive on spark and run BigBench Query 8 then got the following 
> exception:
> 2015-01-14 11:43:46,057 INFO  [main]: impl.RemoteSparkJobStatus 
> (RemoteSparkJobStatus.java:getSparkJobInfo(143)) - Job hasn't been submitted 
> after 30s. Aborting it.
> 2015-01-14 11:43:46,061 INFO  [main]: impl.RemoteSparkJobStatus 
> (RemoteSparkJobStatus.java:getSparkJobInfo(143)) - Job hasn't been submitted 
> after 30s. Aborting it.
> 2015-01-14 11:43:46,061 ERROR [main]: status.SparkJobMonitor 
> (SessionState.java:printError(839)) - Status: Failed
> 2015-01-14 11:43:46,062 INFO  [main]: log.PerfLogger 
> (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG start=1421206996052 
> end=1421207026062 duration=30010 
> from=org.apache.hadoop.hive.ql.exec.spark.status.SparkJobMonitor>
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) - 15/01/14 11:43:46 INFO RemoteDriver: Failed 
> to run job 0a9a7782-0e0b-4561-8468-959a6d8df0a3
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) - java.lang.InterruptedException
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at java.lang.Object.wait(Native 
> Method)
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> java.lang.Object.wait(Object.java:503)
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.scheduler.JobWaiter.awaitResult(JobWaiter.scala:73)
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:514)
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.SparkContext.runJob(SparkContext.scala:1282)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.SparkContext.runJob(SparkContext.scala:1300)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.SparkContext.runJob(SparkContext.scala:1314)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.SparkContext.runJob(SparkContext.scala:1328)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.rdd.RDD.collect(RDD.scala:780)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.RangePartitioner$.sketch(Partitioner.scala:262)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.RangePartitioner.<init>(Partitioner.scala:124)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.rdd.OrderedRDDFunctions.sortByKey(OrderedRDDFunctions.scala:63)
> 2015-01-14 11:43:46,073 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.api.java.JavaPairRDD.sortByKey(JavaPairRDD.scala:894)
> 2015-01-14 11:43:46,073 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(436)) -at 
> org.apache.spark.api.java.JavaPairRDD.sortByKey(JavaPairRDD.scala:864)
> 2015-01-14 11:43:46,073 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkCli

[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-01-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Status: Open  (was: Patch Available)

> Reduce ambiguity in grammar
> ---
>
> Key: HIVE-6617
> URL: https://issues.apache.org/jira/browse/HIVE-6617
> Project: Hive
>  Issue Type: Task
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
> HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
> HIVE-6617.06.patch, HIVE-6617.07.patch
>
>
> As of today, antlr reports 214 warnings. Need to bring down this number, 
> ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9434) Shim the method Path.getPathWithoutSchemeAndAuthority

2015-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288671#comment-14288671
 ] 

Hive QA commented on HIVE-9434:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12693830/HIVE-9434.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7347 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2484/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2484/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2484/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12693830 - PreCommit-HIVE-TRUNK-Build

> Shim the method Path.getPathWithoutSchemeAndAuthority
> -
>
> Key: HIVE-9434
> URL: https://issues.apache.org/jira/browse/HIVE-9434
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 0.15.0
>Reporter: Brock Noland
>Assignee: Dong Chen
> Fix For: 0.15.0
>
> Attachments: HIVE-9434.patch
>
>
> Since Hadoop 1 does not have the method 
> {{Path.getPathWithoutSchemeAndAuthority}} we need to shim it out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9425) External Function Jar files are not available for Driver when running with yarn-cluster mode [Spark Branch]

2015-01-22 Thread Xin Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288666#comment-14288666
 ] 

Xin Hao commented on HIVE-9425:
---

Double-checked with the Big-Bench Q1 case (which includes the HQL ‘ADD FILE 
${env:BIG_BENCH_QUERIES_DIR}/Resources/bigbenchqueriesmr.jar;’), and it failed 
with the latest code on the Spark branch.

Error message in hive log:

2015-01-23 10:19:21,205 INFO  [main]: exec.Task 
(SessionState.java:printInfo(852)) -   set hive.exec.reducers.max=<number>
2015-01-23 10:19:21,205 INFO  [main]: exec.Task 
(SessionState.java:printInfo(852)) - In order to set a constant number of 
reducers:
2015-01-23 10:19:21,206 INFO  [main]: exec.Task 
(SessionState.java:printInfo(852)) -   set mapreduce.job.reduces=<number>
2015-01-23 10:19:21,208 INFO  [main]: log.PerfLogger 
(PerfLogger.java:PerfLogBegin(121)) - 
2015-01-23 10:19:21,278 INFO  [main]: ql.Context 
(Context.java:getMRScratchDir(328)) - New scratch dir is 
hdfs://bhx1:8020/tmp/hive/root/0357a036-8988-489b-85cf-329023a567c7/hive_2015-01-23_10-18-27_797_5566502876180681874-1
2015-01-23 10:19:21,432 WARN  [RPC-Handler-3]: rpc.RpcDispatcher 
(RpcDispatcher.java:handleError(142)) - Received error 
message:java.io.FileNotFoundException: 
/HiveOnSpark/Big-Bench/engines/hive/queries/Resources/bigbenchqueriesmr.jar (No 
such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:146)
at 
org.spark-project.guava.common.io.Files$FileByteSource.openStream(Files.java:124)
at 
org.spark-project.guava.common.io.Files$FileByteSource.openStream(Files.java:114)
at 
org.spark-project.guava.common.io.ByteSource.copyTo(ByteSource.java:202)
at org.spark-project.guava.common.io.Files.copy(Files.java:436)
at org.apache.spark.HttpFileServer.addFileToDir(HttpFileServer.scala:72)
at org.apache.spark.HttpFileServer.addFile(HttpFileServer.scala:55)
at org.apache.spark.SparkContext.addFile(SparkContext.scala:961)
at 
org.apache.spark.api.java.JavaSparkContext.addFile(JavaSparkContext.scala:646)
at 
org.apache.hive.spark.client.SparkClientImpl$AddFileJob.call(SparkClientImpl.java:553)
at 
org.apache.hive.spark.client.RemoteDriver$DriverProtocol.handle(RemoteDriver.java:305)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hive.spark.client.rpc.RpcDispatcher.handleCall(RpcDispatcher.java:120)
at 
org.apache.hive.spark.client.rpc.RpcDispatcher.channelRead0(RpcDispatcher.java:79)
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
at 
io.netty.handler.codec.ByteToMessageCodec.channelRead(ByteToMessageCodec.java:108)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
at 
io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
at 
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
at 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at 
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
at java.lang.Thread.run(Thread.java:745)
.
2015-01-23 10:19:21,606 INFO  [main]: log.PerfLogger 
(PerfLogger.java:PerfLogEnd(148)) - 

Creating a branch for hbase metastore work

2015-01-22 Thread Alan Gates
I've been working on a prototype of using HBase to store Hive's 
metadata.  Basically I've built a new implementation of RawStore that 
writes to HBase rather than DataNucleus.  I want to see if I can build 
something that has a much more straightforward schema than DN and that 
is much faster.


I'd like to get this out in public so others can look at it and 
contribute, but it's nowhere near ready for prime time.  So I propose to 
create a branch and put the code there.  Any objections?


Alan.



Re: Review Request 30107: HIVE-9410, ClassNotFoundException occurs during hive query case execution with UDF defined[Spark Branch]

2015-01-22 Thread chengxiang li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30107/#review69336
---



ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java


#3: This is executed in an Akka thread; it gets the extra jar paths from the 
JobConf and adds them to the current thread's classloader.



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java


#2: This job is executed in the RemoteDriver thread pool; it gets the extra jar 
paths from the JobContext, adds them to the current thread's classloader, and 
sets them in the JobConf.



spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java


#1: Adds the extra jar paths to the JobContext; this job is executed in the 
Netty connection thread.
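
Taken together, the three hooks boil down to something like the following helper 
(an illustrative sketch only; the class name is hypothetical, and the real logic 
lives in the new SparkClientUtilities file in the diff):

    import java.io.File;
    import java.net.URL;
    import java.net.URLClassLoader;
    import java.util.List;

    public final class AddedJarsHelper {  // hypothetical name
        // Rewrap the current thread's context classloader so it also sees
        // the added jars; their classes become loadable from this thread.
        public static void addToClassPath(List<String> jarPaths) throws Exception {
            URL[] urls = new URL[jarPaths.size()];
            for (int i = 0; i < jarPaths.size(); i++) {
                urls[i] = new File(jarPaths.get(i)).toURI().toURL();
            }
            ClassLoader current = Thread.currentThread().getContextClassLoader();
            Thread.currentThread().setContextClassLoader(
                new URLClassLoader(urls, current));
        }
    }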


- chengxiang li


On Jan. 22, 2015, 9:23 a.m., chengxiang li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30107/
> ---
> 
> (Updated Jan. 22, 2015, 9:23 a.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-9410
> https://issues.apache.org/jira/browse/HIVE-9410
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The RemoteDriver does not contain the added jars in its classpath, so it would 
> fail to deserialize SparkWork due to a ClassNotFoundException. For Hive on MR, 
> when a jar is added through the Hive CLI, Hive adds the jar to the CLI classpath 
> (through the thread context classloader) and adds it to the distributed cache as 
> well. Compared to Hive on MR, Hive on Spark has an extra RemoteDriver component, 
> so we should add the added jars to its classpath as well.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d7cb111 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java 
> 30a00a7 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContext.java 
> 00aa4ec 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContextImpl.java 
> 1eb3ff2 
>   
> spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 
> 5f9be65 
>   
> spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/30107/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> chengxiang li
> 
>



Re: Review Request 30107: HIVE-9410, ClassNotFoundException occurs during hive query case execution with UDF defined[Spark Branch]

2015-01-22 Thread chengxiang li


> On Jan. 23, 2015, 2:05 a.m., Xuefu Zhang wrote:
> > I'm wondering what's the story for the Hive CLI. The Hive CLI can add jars 
> > from the local file system. Would this work for Hive on Spark?

The Hive CLI adds jars to the classpath dynamically, the same way this patch does 
for the RemoteDriver: it updates the thread context classloader with the added 
jar paths included. For Hive on Spark, the Hive CLI stays the same; the issue is 
that the RemoteDriver does not add these added jars to its classpath, so the 
ClassNotFound error comes out when the RemoteDriver side needs the related class.


> On Jan. 23, 2015, 2:05 a.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java, line 367
> > 
> >
> > Callers of getBaseWork() will add the jars to the classpath. Why is this 
> > necessary? Who are the callers? Any side effects?

The reason we need to do this is that getBaseWork() generates 
MapWork/ReduceWork, which contain Hive operators inside, and UDTFOperator, which 
references the added jar's class, needs to be loaded. To load the added jar 
dynamically, we need to reset the thread context classloader. As mentioned in 
the previous change summary, unlike the Hive CLI, there are 2 threads on the 
RemoteDriver side that may need to load the added jar, and for the Akka thread 
there is no proper cut-in point to add jars to the classpath.
The side-effect is that many Hive CLI threads may have to check whether to 
update their classloader unnecessarily.
Another possible solution is to update the system classloader for the 
RemoteDriver dynamically, which must be done in a quite hacky way, such as:

    // Requires: java.io.IOException, java.lang.reflect.Method,
    // java.net.URL, java.net.URLClassLoader
    URL u = new File("/path/to/added.jar").toURI().toURL();  // jar to add
    URLClassLoader sysloader = (URLClassLoader) ClassLoader.getSystemClassLoader();
    Class<?> sysclass = URLClassLoader.class;

    try {
        // addURL() is protected, so reflection is the only way in
        Method method = sysclass.getDeclaredMethod("addURL", URL.class);
        method.setAccessible(true);
        method.invoke(sysloader, u);
    } catch (Throwable t) {
        t.printStackTrace();
        throw new IOException("Error, could not add URL to system classloader");
    }

Which one do you prefer?


> On Jan. 23, 2015, 2:05 a.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java,
> >  line 220
> > 
> >
> > So, this is the code that adds the jars to the classpath of the remote 
> > driver?
> > 
> > I'm wondering why these jars are necessary in order to deserialize 
> > SparkWork.

As in my previous comments, SparkWork contains MapWork/ReduceWork, which contain 
the operator tree; UTFFOperator needs to load the added jar's class.


- chengxiang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30107/#review69329
---


On Jan. 22, 2015, 9:23 a.m., chengxiang li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30107/
> ---
> 
> (Updated Jan. 22, 2015, 9:23 a.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-9410
> https://issues.apache.org/jira/browse/HIVE-9410
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The RemoteDriver does not contain the added jars in its classpath, so it would 
> fail to deserialize SparkWork due to a ClassNotFoundException. For Hive on MR, 
> when a jar is added through the Hive CLI, Hive adds the jar to the CLI classpath 
> (through the thread context classloader) and adds it to the distributed cache as 
> well. Compared to Hive on MR, Hive on Spark has an extra RemoteDriver component, 
> so we should add the added jars to its classpath as well.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d7cb111 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java 
> 30a00a7 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContext.java 
> 00aa4ec 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContextImpl.java 
> 1eb3ff2 
>   
> spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 
> 5f9be65 
>   
> spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/30107/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> chengxiang li
> 
>



[jira] [Commented] (HIVE-9436) RetryingMetaStoreClient does not retry JDOExceptions

2015-01-22 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288623#comment-14288623
 ] 

Sushanth Sowmyan commented on HIVE-9436:


Hari, good point - I had a look at RetryingHMSHandler, and the code is rather 
similar. There seem to be two main differences though: first, on the 
HMSHandler side, the exact cause chain of exceptions is still available, and 
we can compare using "instanceof", whereas on the client side we have only the 
first-level exception object, and all internal objects were serialized, so we 
have to look at the messages instead. Secondly, the client has Thrift-related 
exceptions which are acceptable as well.

Still, possibly, we should undertake to refactor this into MetastoreUtils with 
a parameter which asks whether to look strictly at types or to look inside the 
messages as well, call that from both sides, and also pass in a list of 
acceptable exception types. In that scenario, this would be usable all over the 
place where we need to retry. I would, however, like to tackle the refactor as 
an improvement rather than in a bug-fix JIRA, if that's okay.
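
To make the contrast concrete, a schematic sketch of the two situations (not 
the actual Hive code; JDOException is javax.jdo.JDOException):
{code}
// Server side (RetryingHMSHandler): the cause chain is intact in-process,
// so the retry decision can be made on exception types.
if (e.getCause() instanceof JDOException) {
  // retry
}

// Client side (RetryingMetaStoreClient): the cause crossed a Thrift
// boundary, so only its message text survives; the decision has to be
// made on the message instead.
if (e.getCause() instanceof MetaException
    && e.getCause().getMessage().matches(".*JDO[a-zA-Z]*Exception.*")) {
  // retry
}
{code}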

> RetryingMetaStoreClient does not retry JDOExceptions
> 
>
> Key: HIVE-9436
> URL: https://issues.apache.org/jira/browse/HIVE-9436
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-9436.2.patch, HIVE-9436.patch
>
>
> RetryingMetaStoreClient has a bug in the following bit of code:
> {code}
> } else if ((e.getCause() instanceof MetaException) &&
> e.getCause().getMessage().matches("JDO[a-zA-Z]*Exception")) {
>   caughtException = (MetaException) e.getCause();
> } else {
>   throw e.getCause();
> }
> {code}
> The bug here is that Java's String.matches matches the entire string against 
> the regex, and thus the match will fail if the message contains anything before 
> or after JDO[a-zA-Z]\*Exception. The solution, however, is very simple: we 
> should match .\*JDO[a-zA-Z]\*Exception.\*



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9428) LocalSparkJobStatus may return failed job as successful [Spark Branch]

2015-01-22 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288619#comment-14288619
 ] 

Rui Li commented on HIVE-9428:
--

And some examples of when we need to check the future instead of SparkJobInfo:
[~chengxiang li] mentioned that if a Spark job's resource data is empty, the 
job may not be submitted at all, so we won't get a SparkJobInfo.
I also noticed that if something goes wrong during submission, we won't get a 
SparkJobInfo either.

> LocalSparkJobStatus may return failed job as successful [Spark Branch]
> --
>
> Key: HIVE-9428
> URL: https://issues.apache.org/jira/browse/HIVE-9428
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
> Attachments: HIVE-9428.1-spark.patch, HIVE-9428.2-spark.patch
>
>
> The Future being done doesn't necessarily mean the job is successful. We should 
> rely on SparkJobInfo to get the job status whenever it's available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9152) Dynamic Partition Pruning [Spark Branch]

2015-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288622#comment-14288622
 ] 

Hive QA commented on HIVE-9152:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12694045/HIVE-9152.1-spark.patch

{color:red}ERROR:{color} -1 due to 35 failed/errored test(s), 7356 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_spark_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join32
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_smb_mapjoin_14
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_13
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_6
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_7
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_9
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_nullsafe
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_parquet_join
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_10
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_11
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_12
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_13
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_14
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_15
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_16
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_17
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_25
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_6
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_7
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_8
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/675/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/675/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-675/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 35 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12694045 - PreCommit-HIVE-SPARK-Build

> Dynamic Partition Pruning [Spark Branch]
> 
>
> Key: HIVE-9152
> URL: https://issues.apache.org/jira/browse/HIVE-9152
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Brock Noland
>Assignee: Chao
> Attachments: HIVE-9152.1-spark.patch
>
>
> Tez implemented dynamic partition pruning in HIVE-7826. This is a nice 
> optimization and we should implement the same in HOS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9436) RetryingMetaStoreClient does not retry JDOExceptions

2015-01-22 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-9436:
---
Attachment: HIVE-9436.2.patch

Updated patch to multi-line match
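
A small illustration of why the multi-line form matters (the exact regex in the 
patch may differ; this only demonstrates Java's matching semantics):
{code}
String msg = "MetaException(message:javax.jdo.JDODataStoreException: ...\n"
           + "NestedThrowables: ...)";
msg.matches("JDO[a-zA-Z]*Exception");          // false: matches() anchors the whole string
msg.matches(".*JDO[a-zA-Z]*Exception.*");      // false: '.' does not cross the newline
msg.matches("(?s).*JDO[a-zA-Z]*Exception.*");  // true: (?s) lets '.' match line terminators
{code}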

> RetryingMetaStoreClient does not retry JDOExceptions
> 
>
> Key: HIVE-9436
> URL: https://issues.apache.org/jira/browse/HIVE-9436
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-9436.2.patch, HIVE-9436.patch
>
>
> RetryingMetaStoreClient has a bug in the following bit of code:
> {code}
> } else if ((e.getCause() instanceof MetaException) &&
> e.getCause().getMessage().matches("JDO[a-zA-Z]*Exception")) {
>   caughtException = (MetaException) e.getCause();
> } else {
>   throw e.getCause();
> }
> {code}
> The bug here is that Java's String.matches matches the entire string against 
> the regex, and thus the match will fail if the message contains anything before 
> or after JDO[a-zA-Z]\*Exception. The solution, however, is very simple: we 
> should match .\*JDO[a-zA-Z]\*Exception.\*



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9436) RetryingMetaStoreClient does not retry JDOExceptions

2015-01-22 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-9436:
---
Status: Patch Available  (was: Open)

> RetryingMetaStoreClient does not retry JDOExceptions
> 
>
> Key: HIVE-9436
> URL: https://issues.apache.org/jira/browse/HIVE-9436
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1, 0.14.0
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-9436.2.patch, HIVE-9436.patch
>
>
> RetryingMetaStoreClient has a bug in the following bit of code:
> {code}
> } else if ((e.getCause() instanceof MetaException) &&
> e.getCause().getMessage().matches("JDO[a-zA-Z]*Exception")) {
>   caughtException = (MetaException) e.getCause();
> } else {
>   throw e.getCause();
> }
> {code}
> The bug here is that Java's String.matches matches the entire string against 
> the regex, and thus the match will fail if the message contains anything before 
> or after JDO[a-zA-Z]\*Exception. The solution, however, is very simple: we 
> should match .\*JDO[a-zA-Z]\*Exception.\*



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9428) LocalSparkJobStatus may return failed job as successful [Spark Branch]

2015-01-22 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288605#comment-14288605
 ] 

Rui Li commented on HIVE-9428:
--

[~xuefuz] - thanks for the review!
1. We have to check Future.isDone() first; otherwise Future.get() may block us.
2. Yeah, I thought about logging it. My only concern is that the exception may 
have been logged already, e.g. Spark logs all exceptions during job submission, 
so it may be redundant. I'd like to know your opinion.
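
A minimal sketch of the order of checks being discussed (a hypothetical method, 
not the actual patch; SparkJobInfo and JobExecutionStatus are Spark's public 
status types):
{code}
// Prefer the job's own status; fall back to the Future only when no
// SparkJobInfo exists (e.g., the job was never actually submitted).
JobExecutionStatus getState(Future<?> future, SparkJobInfo info) {
  if (info != null) {
    return info.status();
  }
  if (future.isDone()) {    // check isDone() first: get() would block
    try {
      future.get();         // surfaces any submission failure
      return JobExecutionStatus.SUCCEEDED;
    } catch (Exception e) {
      return JobExecutionStatus.FAILED;
    }
  }
  return JobExecutionStatus.RUNNING;
}
{code}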

> LocalSparkJobStatus may return failed job as successful [Spark Branch]
> --
>
> Key: HIVE-9428
> URL: https://issues.apache.org/jira/browse/HIVE-9428
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
> Attachments: HIVE-9428.1-spark.patch, HIVE-9428.2-spark.patch
>
>
> The Future being done doesn't necessarily mean the job is successful. We should 
> rely on SparkJobInfo to get the job status whenever it's available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 30107: HIVE-9410, ClassNotFoundException occurs during hive query case execution with UDF defined[Spark Branch]

2015-01-22 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30107/#review69329
---


I'm wondering what's the story for the Hive CLI. The Hive CLI can add jars from 
the local file system. Would this work for Hive on Spark?


ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java


Callers of getBaseWork() will add the jars to the classpath. Why is this 
necessary? Who are the callers? Any side effects?



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java


So, this is the code that adds the jars to the classpath of the remote 
driver?

I'm wondering why these jars are necessary in order to deserialize 
SparkWork.


- Xuefu Zhang


On Jan. 22, 2015, 9:23 a.m., chengxiang li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30107/
> ---
> 
> (Updated Jan. 22, 2015, 9:23 a.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-9410
> https://issues.apache.org/jira/browse/HIVE-9410
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The RemoteDriver does not contain the added jars in its classpath, so it would 
> fail to deserialize SparkWork due to a ClassNotFoundException. For Hive on MR, 
> when a jar is added through the Hive CLI, Hive adds the jar to the CLI classpath 
> (through the thread context classloader) and adds it to the distributed cache as 
> well. Compared to Hive on MR, Hive on Spark has an extra RemoteDriver component, 
> so we should add the added jars to its classpath as well.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d7cb111 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java 
> 30a00a7 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContext.java 
> 00aa4ec 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContextImpl.java 
> 1eb3ff2 
>   
> spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 
> 5f9be65 
>   
> spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/30107/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> chengxiang li
> 
>



[jira] [Commented] (HIVE-9253) MetaStore server should support timeout for long running requests

2015-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288583#comment-14288583
 ] 

Hive QA commented on HIVE-9253:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12693809/HIVE-9253.3.patch

{color:green}SUCCESS:{color} +1 7351 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2483/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2483/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2483/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12693809 - PreCommit-HIVE-TRUNK-Build

> MetaStore server should support timeout for long running requests
> -
>
> Key: HIVE-9253
> URL: https://issues.apache.org/jira/browse/HIVE-9253
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Dong Chen
>Assignee: Dong Chen
> Attachments: HIVE-9253.1.patch, HIVE-9253.2.patch, HIVE-9253.2.patch, 
> HIVE-9253.3.patch, HIVE-9253.patch
>
>
> In the description of HIVE-7195, one issue is that MetaStore client timeout 
> is quite dumb. The client will timeout and the server has no idea the client 
> is gone.
> The server should support timeout when the request from client runs a long 
> time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9447) Metastore: inefficient Oracle query for removing unused column descriptors when add/drop table/partition

2015-01-22 Thread Selina Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Selina Zhang updated HIVE-9447:
---
Affects Version/s: 0.14.0
   Status: Patch Available  (was: In Progress)

> Metastore: inefficient Oracle query for removing unused column descriptors 
> when add/drop table/partition
> 
>
> Key: HIVE-9447
> URL: https://issues.apache.org/jira/browse/HIVE-9447
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 0.14.0
>Reporter: Selina Zhang
>Assignee: Selina Zhang
> Attachments: HIVE-9447.1.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> Metastore needs to remove unused column descriptors when partitions or tables 
> are dropped/added. To query for the unused column descriptors, the current 
> implementation utilizes DataNucleus' range function, which basically equals 
> LIMIT syntax. However, Oracle does not support LIMIT, so the query is converted to  
> {quote}
> SQL> SELECT * FROM (SELECT subq.*,ROWNUM rn FROM (SELECT
> 'org.apache.hadoop.hive.metastore.model.MStorageDescriptor' AS
> NUCLEUS_TYPE,A0.INPUT_FORMAT,A0.IS_COMPRESSED,A0.IS_STOREDASSUBDIRECTORIES,A0.LOCATION,
> A0.NUM_BUCKETS,A0.OUTPUT_FORMAT,A0.SD_ID FROM drhcat.SDS A0 
> WHERE A0.CD_ID = ? ) subq ) WHERE  rn <= 1;
> {quote}
> Given that CD_ID is not very selective, this query may have to access a large 
> number of rows (depending on how many partitions the table has; millions of rows 
> in our case). Metastore may become unresponsive because of this. 
> Since Metastore only needs to know whether the specific CD_ID is referenced in 
> the SDS table, it does not need to access the whole row. We can use 
> {quote}
> select count(1) from SDS where SDS.CD_ID=?
> {quote}
> CD_ID is an indexed column, so the above query will do an index range scan, 
> which is faster. 
> For other DBs that support LIMIT syntax, such as MySQL, this problem does not 
> exist; however, the new query does not hurt.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9447) Metastore: inefficient Oracle query for removing unused column descriptors when add/drop table/partition

2015-01-22 Thread Selina Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Selina Zhang updated HIVE-9447:
---
Attachment: HIVE-9447.1.patch

> Metastore: inefficient Oracle query for removing unused column descriptors 
> when add/drop table/partition
> 
>
> Key: HIVE-9447
> URL: https://issues.apache.org/jira/browse/HIVE-9447
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Selina Zhang
>Assignee: Selina Zhang
> Attachments: HIVE-9447.1.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> Metastore needs to remove unused column descriptors when partitions or tables 
> are dropped/added. To query for the unused column descriptors, the current 
> implementation utilizes DataNucleus' range function, which basically equals 
> LIMIT syntax. However, Oracle does not support LIMIT, so the query is converted to  
> {quote}
> SQL> SELECT * FROM (SELECT subq.*,ROWNUM rn FROM (SELECT
> 'org.apache.hadoop.hive.metastore.model.MStorageDescriptor' AS
> NUCLEUS_TYPE,A0.INPUT_FORMAT,A0.IS_COMPRESSED,A0.IS_STOREDASSUBDIRECTORIES,A0.LOCATION,
> A0.NUM_BUCKETS,A0.OUTPUT_FORMAT,A0.SD_ID FROM drhcat.SDS A0 
> WHERE A0.CD_ID = ? ) subq ) WHERE  rn <= 1;
> {quote}
> Given that CD_ID is not very selective, this query may have to access a large 
> number of rows (depending on how many partitions the table has; millions of rows 
> in our case). Metastore may become unresponsive because of this. 
> Since Metastore only needs to know whether the specific CD_ID is referenced in 
> the SDS table, it does not need to access the whole row. We can use 
> {quote}
> select count(1) from SDS where SDS.CD_ID=?
> {quote}
> CD_ID is an indexed column, so the above query will do an index range scan, 
> which is faster. 
> For other DBs that support LIMIT syntax, such as MySQL, this problem does not 
> exist; however, the new query does not hurt.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HIVE-9447) Metastore: inefficient Oracle query for removing unused column descriptors when add/drop table/partition

2015-01-22 Thread Selina Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-9447 started by Selina Zhang.
--
> Metastore: inefficient Oracle query for removing unused column descriptors 
> when add/drop table/partition
> 
>
> Key: HIVE-9447
> URL: https://issues.apache.org/jira/browse/HIVE-9447
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Selina Zhang
>Assignee: Selina Zhang
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> Metastore needs to remove unused column descriptors when partitions or tables 
> are dropped/added. To query for the unused column descriptors, the current 
> implementation utilizes DataNucleus' range function, which basically equals 
> LIMIT syntax. However, Oracle does not support LIMIT, so the query is converted to  
> {quote}
> SQL> SELECT * FROM (SELECT subq.*,ROWNUM rn FROM (SELECT
> 'org.apache.hadoop.hive.metastore.model.MStorageDescriptor' AS
> NUCLEUS_TYPE,A0.INPUT_FORMAT,A0.IS_COMPRESSED,A0.IS_STOREDASSUBDIRECTORIES,A0.LOCATION,
> A0.NUM_BUCKETS,A0.OUTPUT_FORMAT,A0.SD_ID FROM drhcat.SDS A0 
> WHERE A0.CD_ID = ? ) subq ) WHERE  rn <= 1;
> {quote}
> Given that CD_ID is not very selective, this query may have to access a large 
> number of rows (depending on how many partitions the table has; millions of rows 
> in our case). Metastore may become unresponsive because of this. 
> Since Metastore only needs to know whether the specific CD_ID is referenced in 
> the SDS table, it does not need to access the whole row. We can use 
> {quote}
> select count(1) from SDS where SDS.CD_ID=?
> {quote}
> CD_ID is an indexed column, so the above query will do an index range scan, 
> which is faster. 
> For other DBs that support LIMIT syntax, such as MySQL, this problem does not 
> exist; however, the new query does not hurt.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9448) Merge spark to trunk 1/23/15

2015-01-22 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-9448:

Status: Patch Available  (was: Open)

> Merge spark to trunk 1/23/15
> 
>
> Key: HIVE-9448
> URL: https://issues.apache.org/jira/browse/HIVE-9448
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 0.15.0
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-9448.patch
>
>
> Merging latest spark changes to trunk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9448) Merge spark to trunk 1/23/15

2015-01-22 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-9448:

Attachment: HIVE-9448.patch

> Merge spark to trunk 1/23/15
> 
>
> Key: HIVE-9448
> URL: https://issues.apache.org/jira/browse/HIVE-9448
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 0.15.0
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-9448.patch
>
>
> Merging latest spark changes to trunk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9410) ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]

2015-01-22 Thread Xin Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288554#comment-14288554
 ] 

Xin Hao commented on HIVE-9410:
---

Chengxiang, I used patch HIVE-9410.3-spark.patch to validate those four 
Big-Bench cases (Q10, Q18, Q19, Q27). They pass in both Spark Standalone 
and Yarn-Client modes. Thanks.

> ClassNotFoundException occurs during hive query case execution with UDF 
> defined [Spark Branch]
> --
>
> Key: HIVE-9410
> URL: https://issues.apache.org/jira/browse/HIVE-9410
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
> Environment: CentOS 6.5
> JDK1.7
>Reporter: Xin Hao
>Assignee: Chengxiang Li
> Attachments: HIVE-9410.1-spark.patch, HIVE-9410.2-spark.patch, 
> HIVE-9410.3-spark.patch
>
>
> We have hive query cases with a UDF defined (i.e. BigBench cases Q10, Q18 
> etc.). They pass in default Hive (on MR) mode, but fail in Hive 
> on Spark mode (both Standalone and Yarn-Client). 
> Although we use 'add jar .jar;' to add the UDF jar explicitly, the issue 
> still exists. 
> BTW, if we put the UDF jar into the $HIVE_HOME/lib dir, the cases pass.
> The detailed error message is below (NOTE: 
> de.bankmark.bigbench.queries.q10.SentimentUDF is the UDF contained in the 
> jar bigbenchqueriesmr.jar, and we have added a command like 'add jar 
> /location/to/bigbenchqueriesmr.jar;' to the .sql explicitly)
> INFO  [pool-1-thread-1]: client.RemoteDriver (RemoteDriver.java:call(316)) - 
> Failed to run job 8dd120cb-1a4d-4d1c-ba31-61eac648c27d
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: de.bankmark.bigbench.queries.q10.SentimentUDF
> Serialization trace:
> genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
> conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.FilterOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> right (org.apache.commons.lang3.tuple.ImmutablePair)
> edgeProperties (org.apache.hadoop.hive.ql.plan.SparkWork)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> ...
> Caused by: java.lang.ClassNotFoundException: 
> de.bankmark.bigbench.queries.q10.SentimentUDF
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:270)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolve

[jira] [Created] (HIVE-9448) Merge spark to trunk 1/23/15

2015-01-22 Thread Szehon Ho (JIRA)
Szehon Ho created HIVE-9448:
---

 Summary: Merge spark to trunk 1/23/15
 Key: HIVE-9448
 URL: https://issues.apache.org/jira/browse/HIVE-9448
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 0.15.0
Reporter: Szehon Ho
Assignee: Szehon Ho


Merging latest spark changes to trunk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9436) RetryingMetaStoreClient does not retry JDOExceptions

2015-01-22 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288534#comment-14288534
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-9436:
-

A minor comment, since I remember looking at the RetryingHMSHandler code for 
catching Nucleus exceptions a while back. I think RetryingHMSHandler::invoke 
and RetryingMetaStoreClient::invoke should catch similar exceptions 
(except perhaps TException or something that is client-specific or 
HMS-handler-specific). Is it possible to factor out the catch part and have it 
called from both RetryingMetaStoreClient::invoke and RetryingHMSHandler::invoke 
to avoid any redundancy or missing cases? If this is not possible, manually 
comparing these two functions for the exception types handled might help in 
resolving any similar missing exception-handling cases. Another point: comparing 
the exception messages against a JDOException regex looks a bit hairy to me. 
Please correct me if this understanding is wrong.

Thanks
Hari

> RetryingMetaStoreClient does not retry JDOExceptions
> 
>
> Key: HIVE-9436
> URL: https://issues.apache.org/jira/browse/HIVE-9436
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-9436.patch
>
>
> RetryingMetaStoreClient has a bug in the following bit of code:
> {code}
> } else if ((e.getCause() instanceof MetaException) &&
> e.getCause().getMessage().matches("JDO[a-zA-Z]*Exception")) {
>   caughtException = (MetaException) e.getCause();
> } else {
>   throw e.getCause();
> }
> {code}
> The bug here is that Java's String.matches matches the entire string against 
> the regex, so the match will fail if the message contains anything before or 
> after JDO[a-zA-Z]\*Exception. The solution, however, is very simple: we 
> should match .\*JDO[a-zA-Z]\*Exception.\* instead.
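
To make the failure mode concrete, here is a minimal standalone Java sketch 
(not part of the attached patch) showing how String.matches anchors the 
pattern to the whole string:

{code}
public class MatchesDemo {
  public static void main(String[] args) {
    String msg = "Got exception: JDODataStoreException (lock wait timeout)";

    // String.matches implicitly anchors the pattern at both ends,
    // so any surrounding text makes the match fail.
    System.out.println(msg.matches("JDO[a-zA-Z]*Exception"));     // false

    // Wrapping the pattern in .* on both sides fixes the check.
    System.out.println(msg.matches(".*JDO[a-zA-Z]*Exception.*")); // true
  }
}
{code}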



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9378) Spark qfile tests should reuse RSC [Spark Branch]

2015-01-22 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-9378:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to Spark branch. Thanks, Jimmy.

> Spark qfile tests should reuse RSC [Spark Branch]
> -
>
> Key: HIVE-9378
> URL: https://issues.apache.org/jira/browse/HIVE-9378
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: spark-branch
>
> Attachments: HIVE-9378.1-spark.patch, HIVE-9378.2-spark.patch, 
> HIVE-9378.3-spark.patch, HIVE-9378.4-spark.patch
>
>
> Run several qfile tests and use jps to monitor the Java processes. You will 
> find that several SparkSubmitDriverBootstrapper processes are created (not at 
> the same time, of course). It seems to me that we create an RSC for each 
> qfile, then terminate it when that qfile test is done. The RSC does not seem 
> to be shared among qfiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9378) Spark qfile tests should reuse RSC [Spark Branch]

2015-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288524#comment-14288524
 ] 

Hive QA commented on HIVE-9378:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12694032/HIVE-9378.4-spark.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7355 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/674/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/674/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-674/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12694032 - PreCommit-HIVE-SPARK-Build

> Spark qfile tests should reuse RSC [Spark Branch]
> -
>
> Key: HIVE-9378
> URL: https://issues.apache.org/jira/browse/HIVE-9378
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: spark-branch
>
> Attachments: HIVE-9378.1-spark.patch, HIVE-9378.2-spark.patch, 
> HIVE-9378.3-spark.patch, HIVE-9378.4-spark.patch
>
>
> Run several qfile tests and use jps to monitor the Java processes. You will 
> find that several SparkSubmitDriverBootstrapper processes are created (not at 
> the same time, of course). It seems to me that we create an RSC for each 
> qfile, then terminate it when that qfile test is done. The RSC does not seem 
> to be shared among qfiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6970) HCatInputFormat in hive-hcatalog-core 0.13 does not compatible with hadoop2

2015-01-22 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-6970:

Assignee: (was: Navis)

> HCatInputFormat in hive-hcatalog-core 0.13 does not compatible with hadoop2
> ---
>
> Key: HIVE-6970
> URL: https://issues.apache.org/jira/browse/HIVE-6970
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.13.0
>Reporter: Chao Shi
> Attachments: HIVE-6970.1.patch.txt, HIVE-6970.2.patch.txt
>
>
> I'm developing HCatalog support for Crunch (CRUNCH-340). Crunch supports both 
> Hadoop 1 and 2, by specifying the profile -Phadoop-1 or -Phadoop-2. When I 
> ran my program with Hadoop 2, I got the following error. I think this was 
> intended to be fixed by HIVE-4460.
> {code}
> Exception in thread "Thread-4" java.lang.IncompatibleClassChangeError: Found 
> interface org.apache.hadoop.mapreduce.JobContext, but class was expected
> at 
> org.apache.hive.hcatalog.mapreduce.HCatBaseInputFormat.getSplits(HCatBaseInputFormat.java:102)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:491)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:508)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
> at 
> org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJob.submit(CrunchControlledJob.java:331)
> at 
> org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.startReadyJobs(CrunchJobControl.java:201)
> at 
> org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.pollJobStatusAndStartNewOnes(CrunchJobControl.java:235)
> at 
> org.apache.crunch.impl.mr.exec.MRExecutor.monitorLoop(MRExecutor.java:112)
> at 
> org.apache.crunch.impl.mr.exec.MRExecutor.access$000(MRExecutor.java:55)
> at org.apache.crunch.impl.mr.exec.MRExecutor$1.run(MRExecutor.java:83)
> at java.lang.Thread.run(Thread.java:744)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9447) Metastore: inefficient Oracle query for removing unused column descriptors when add/drop table/partition

2015-01-22 Thread Selina Zhang (JIRA)
Selina Zhang created HIVE-9447:
--

 Summary: Metastore: inefficient Oracle query for removing unused 
column descriptors when add/drop table/partition
 Key: HIVE-9447
 URL: https://issues.apache.org/jira/browse/HIVE-9447
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Selina Zhang
Assignee: Selina Zhang


The Metastore needs to remove unused column descriptors when partitions or 
tables are added or dropped. To find an unused column descriptor, the current 
implementation uses DataNucleus' range function, which is essentially LIMIT 
syntax. However, Oracle does not support LIMIT, so the query is converted to 
{quote}
SQL> SELECT * FROM (SELECT subq.*,ROWNUM rn FROM (SELECT
'org.apache.hadoop.hive.metastore.model.MStorageDescriptor' AS
NUCLEUS_TYPE,A0.INPUT_FORMAT,A0.IS_COMPRESSED,A0.IS_STOREDASSUBDIRECTORIES,A0.LOCATION,
A0.NUM_BUCKETS,A0.OUTPUT_FORMAT,A0.SD_ID FROM drhcat.SDS A0 
WHERE A0.CD_ID = ? ) subq ) WHERE  rn <= 1;
{quote}
Given that CD_ID is not very selective, this query may have to access a large 
number of rows (depending on how many partitions the table has; millions of 
rows in our case). The Metastore may become unresponsive because of this. 

Since the Metastore only needs to know whether the specific CD_ID is referenced 
in the SDS table, and does not need to access the whole row, we can use 
{quote}
select count(1) from SDS where SDS.CD_ID=?
{quote}
CD_ID is an indexed column, so the above query will do an index range scan, 
which is faster. 

For other DBs that support LIMIT syntax, such as MySQL, this problem does not 
exist. However, the new query does not hurt them either.
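
For illustration only, a minimal JDBC sketch of the proposed existence check 
(this is an assumption about usage, not the actual DataNucleus-based metastore 
code; the connection is hypothetical):

{code}
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class ColumnDescriptorCheck {
  /**
   * Returns true if any storage descriptor still references the given
   * column descriptor. count(1) lets the database answer with an index
   * range scan on CD_ID instead of fetching whole rows.
   */
  static boolean isReferenced(Connection conn, long cdId) throws SQLException {
    try (PreparedStatement ps =
             conn.prepareStatement("SELECT COUNT(1) FROM SDS WHERE CD_ID = ?")) {
      ps.setLong(1, cdId);
      try (ResultSet rs = ps.executeQuery()) {
        return rs.next() && rs.getLong(1) > 0;
      }
    }
  }
}
{code}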







--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9152) Dynamic Partition Pruning [Spark Branch]

2015-01-22 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-9152:
---
Attachment: HIVE-9152.1-spark.patch

> Dynamic Partition Pruning [Spark Branch]
> 
>
> Key: HIVE-9152
> URL: https://issues.apache.org/jira/browse/HIVE-9152
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Brock Noland
>Assignee: Chao
> Attachments: HIVE-9152.1-spark.patch
>
>
> Tez implemented dynamic partition pruning in HIVE-7826. This is a nice 
> optimization and we should implement the same in HOS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9441) Remove call to deprecated Calcite method

2015-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288492#comment-14288492
 ] 

Hive QA commented on HIVE-9441:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12693795/HIVE-9441.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7347 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2482/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2482/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2482/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12693795 - PreCommit-HIVE-TRUNK-Build

> Remove call to deprecated Calcite method
> 
>
> Key: HIVE-9441
> URL: https://issues.apache.org/jira/browse/HIVE-9441
> Project: Hive
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Ashutosh Chauhan
>Priority: Minor
> Attachments: HIVE-9441.1.patch
>
>
> The method RexLiteral.byteValue() was deprecated and will be remove in 
> Calcite 1.0. The attached patch replaces it with a non-deprecated alternative.
> As soon as the patch is committed I will push to apache nexus a new 
> calcite-1.0.0-snapshot that will be very close to proposed calcite-1.0 
> release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9436) RetryingMetaStoreClient does not retry JDOExceptions

2015-01-22 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-9436:
---
Status: Open  (was: Patch Available)

Canceling patch to update one more thing - this needs to be a multiline match, 
from the looks of it.
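
For illustration (a standalone sketch, not the patch itself): matches() anchors 
the pattern at both ends, and '.' does not cross line terminators by default, 
so a multiline message needs the DOTALL flag, e.g. via an embedded (?s):

{code}
public class MultilineMatchDemo {
  public static void main(String[] args) {
    String msg = "prefix\nJDODataStoreException: lock wait timeout\nsuffix";

    // Fails: matches() anchors the pattern to the whole string.
    System.out.println(msg.matches("JDO[a-zA-Z]*Exception"));          // false

    // Still fails: '.' does not match the newlines around the match.
    System.out.println(msg.matches(".*JDO[a-zA-Z]*Exception.*"));      // false

    // (?s) enables DOTALL so '.' also matches line terminators.
    System.out.println(msg.matches("(?s).*JDO[a-zA-Z]*Exception.*"));  // true
  }
}
{code}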

> RetryingMetaStoreClient does not retry JDOExceptions
> 
>
> Key: HIVE-9436
> URL: https://issues.apache.org/jira/browse/HIVE-9436
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1, 0.14.0
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-9436.patch
>
>
> RetryingMetaStoreClient has a bug in the following bit of code:
> {code}
> } else if ((e.getCause() instanceof MetaException) &&
> e.getCause().getMessage().matches("JDO[a-zA-Z]*Exception")) {
>   caughtException = (MetaException) e.getCause();
> } else {
>   throw e.getCause();
> }
> {code}
> The bug here is that Java's String.matches matches the entire string against 
> the regex, so the match will fail if the message contains anything before or 
> after JDO[a-zA-Z]\*Exception. The solution, however, is very simple: we 
> should match .\*JDO[a-zA-Z]\*Exception.\* instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Created branch 1.0

2015-01-22 Thread Thejas Nair
On Thu, Jan 22, 2015 at 2:49 PM, Edward Capriolo  wrote:
> If we do a 1.0.0 release there is no problem with us later releasing 14.1
> or a 14.2. I think everyone would understand that the 14.2 being released
> after 1.0.0 would likely have back ported features. Releasing a 15.0 after
> 1.0 would not make as much sense as we probably do not want two active
> branches.
>

I assume you inadvertently missed the "0." before 14 and 15.
I think everybody agrees that releasing a 0.15 which is a superset of the
changes in 1.0 would be a bad idea, and that 0.15 should become 1.1.
The 1.0 branch would be used to release any further maintenance fixes if
necessary (e.g. 1.0.x) (i.e. it is a 1.0.x release branch, not a 1.x.x
branch).


> I understand why this happens, but I think it is a bit hokey. Would anyone
> be happy if Hive depended on
> "git-edwardcapriolo-coolutils-0.0.0-snapshot-5"? So if we are having the
> "serious 1.0 talk", we should avoid things like this.

I agree we should not have releases depending on snapshots. This is no
longer an issue in 0.14 branch.



[jira] [Updated] (HIVE-9378) Spark qfile tests should reuse RSC [Spark Branch]

2015-01-22 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-9378:
--
Attachment: HIVE-9378.4-spark.patch

Right. Attached v4, rebased to the latest Spark branch.

> Spark qfile tests should reuse RSC [Spark Branch]
> -
>
> Key: HIVE-9378
> URL: https://issues.apache.org/jira/browse/HIVE-9378
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: spark-branch
>
> Attachments: HIVE-9378.1-spark.patch, HIVE-9378.2-spark.patch, 
> HIVE-9378.3-spark.patch, HIVE-9378.4-spark.patch
>
>
> Run several qfile tests and use jps to monitor the Java processes. You will 
> find that several SparkSubmitDriverBootstrapper processes are created (not at 
> the same time, of course). It seems to me that we create an RSC for each 
> qfile, then terminate it when that qfile test is done. The RSC does not seem 
> to be shared among qfiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Created branch 1.0

2015-01-22 Thread Thejas Nair
On Thu, Jan 22, 2015 at 12:31 PM, Xuefu Zhang  wrote:
> Hi Thejas/Alan,
>
> From all the arguments, I think there was an assumption that the proposed
> 1.0 release will be imminent and 0.15 will happen far after that. Based on
> that assumption, 0.15 will become 1.1, which is greater in scope than 1.0.
> However, this assumption may not be true. The confusion will be significant
> if 0.15 is released early as 0.15 before 0.14.1 is released as 1.0.

Yes, the assumption is that 1.0 will be out very soon, before the 0.15
line is ready, and that 0.15 can become 1.1.
Do you think that assumption won't hold true? (In previous emails in
this thread, I talk about the reasons why this assumption is reliable.)
I agree that it does not make sense to release 1.0 as proposed in this
thread if that assumption does not hold true.

> Another concern is that the proposed release of 1.0 is a subset of
> Hive's functionality, and for major releases users are expecting major
> improvement in functionality as well as stability. Mutating from the 0.14.1
> release seems to fall short of that expectation.

Every release of Hive has been a subset of the tip of trunk (we branch
for release while trunk moves ahead), and a superset of the changes of
every previous release. So every release so far has had a subset of the
functionality of the Hive trunk branch (if that is what you are referring
to).
With 1.0 proposed from the 0.14 maintenance branch, this still
holds true. (Again, this is with the assumption you called out about
the timeline of 1.0 and 0.15.)



[jira] [Commented] (HIVE-9378) Spark qfile tests should reuse RSC [Spark Branch]

2015-01-22 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288393#comment-14288393
 ] 

Xuefu Zhang commented on HIVE-9378:
---

Looks like the patch needs to be rebased.

> Spark qfile tests should reuse RSC [Spark Branch]
> -
>
> Key: HIVE-9378
> URL: https://issues.apache.org/jira/browse/HIVE-9378
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: spark-branch
>
> Attachments: HIVE-9378.1-spark.patch, HIVE-9378.2-spark.patch, 
> HIVE-9378.3-spark.patch
>
>
> Run several qfile tests and use jps to monitor the Java processes. You will 
> find that several SparkSubmitDriverBootstrapper processes are created (not at 
> the same time, of course). It seems to me that we create an RSC for each 
> qfile, then terminate it when that qfile test is done. The RSC does not seem 
> to be shared among qfiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9392) JoinStatsRule miscalculates join cardinality as incorrect NDV is used due to column names having duplicated fqColumnName

2015-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288383#comment-14288383
 ] 

Hive QA commented on HIVE-9392:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12693780/HIVE-9392.1.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 7346 tests executed
*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union20
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_annotate_stats_join
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2481/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2481/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2481/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12693780 - PreCommit-HIVE-TRUNK-Build

> JoinStatsRule miscalculates join cardinality as incorrect NDV is used due to 
> column names having duplicated fqColumnName
> 
>
> Key: HIVE-9392
> URL: https://issues.apache.org/jira/browse/HIVE-9392
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 0.14.0
>Reporter: Mostafa Mokhtar
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Fix For: 0.15.0
>
> Attachments: HIVE-9392.1.patch
>
>
> In JoinStatsRule.process the join column statistics are stored in the HashMap 
> joinedColStats. The key used, which is the ColStatistics.fqColName, is 
> duplicated between join columns in the same vertex; as a result, distinctVals 
> ends up having duplicated values, which negatively affects the join 
> cardinality estimation.
> The duplicate keys are usually named KEY.reducesinkkey0.
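
As a rough standalone illustration of the failure mode (hypothetical NDV 
numbers, not the actual JoinStatsRule code): when two join columns in the same 
vertex share the same fqColName, the second put overwrites the first, so both 
later lookups return the same NDV and distinctVals carries duplicated values:

{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DuplicateKeyDemo {
  public static void main(String[] args) {
    Map<String, Long> joinedColStats = new HashMap<>();

    // Both join columns resolve to the same fully qualified name,
    // so the second put silently replaces the first entry.
    joinedColStats.put("KEY.reducesinkkey0", 1_000L); // NDV of column A
    joinedColStats.put("KEY.reducesinkkey0", 5L);     // NDV of column B clobbers A's

    // Collecting per-column NDVs now yields the same value twice.
    List<Long> distinctVals = new ArrayList<>();
    distinctVals.add(joinedColStats.get("KEY.reducesinkkey0")); // meant to be A's
    distinctVals.add(joinedColStats.get("KEY.reducesinkkey0")); // meant to be B's
    System.out.println(distinctVals); // [5, 5] -- duplicates skew the estimate
  }
}
{code}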



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9416) Get rid of Extract Operator

2015-01-22 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-9416:
---
Status: Patch Available  (was: Open)

> Get rid of Extract Operator
> ---
>
> Key: HIVE-9416
> URL: https://issues.apache.org/jira/browse/HIVE-9416
> Project: Hive
>  Issue Type: Task
>  Components: Query Processor
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-9416.1.patch, HIVE-9416.2.patch, HIVE-9416.patch
>
>
> {{Extract Operator}} has been there for legacy reasons. But there is no 
> functionality it provides which can't be provided by {{Select Operator}}. 
> Instead of having two operators, one being a subset of the other, we should 
> just get rid of {{Extract}} and simplify our codebase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9416) Get rid of Extract Operator

2015-01-22 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-9416:
---
Attachment: HIVE-9416.2.patch

Fixed optimizers except SortedDynPartitionOptimizer.

> Get rid of Extract Operator
> ---
>
> Key: HIVE-9416
> URL: https://issues.apache.org/jira/browse/HIVE-9416
> Project: Hive
>  Issue Type: Task
>  Components: Query Processor
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-9416.1.patch, HIVE-9416.2.patch, HIVE-9416.patch
>
>
> {{Extract Operator}} has been there for legacy reasons. But there is no 
> functionality it provides which can't be provided by {{Select Operator}}. 
> Instead of having two operators, one being a subset of the other, we should 
> just get rid of {{Extract}} and simplify our codebase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9416) Get rid of Extract Operator

2015-01-22 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-9416:
---
Status: Open  (was: Patch Available)

> Get rid of Extract Operator
> ---
>
> Key: HIVE-9416
> URL: https://issues.apache.org/jira/browse/HIVE-9416
> Project: Hive
>  Issue Type: Task
>  Components: Query Processor
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-9416.1.patch, HIVE-9416.patch
>
>
> {{Extract Operator}} has been there for legacy reasons. But there is no 
> functionality it provides which can't be provided by {{Select Operator}}. 
> Instead of having two operators, one being a subset of the other, we should 
> just get rid of {{Extract}} and simplify our codebase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9208) MetaStore DB schema inconsistent for MS SQL Server in use of varchar/nvarchar

2015-01-22 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288370#comment-14288370
 ] 

Eugene Koifman commented on HIVE-9208:
--

I was asking whether you are sure those fields represent partition names. If 
yes, they should be nvarchar.
As for the length, it seems odd that we store the exact same value in fields 
of different sizes... but perhaps this is beyond the scope of this bug.
[~sushanth], could you look at this patch? Do you know anything about the 
tables that seem to store partition values but whose field length is not 767, 
as it is in most cases?

> MetaStore DB schema inconsistent for MS SQL Server in use of varchar/nvarchar
> -
>
> Key: HIVE-9208
> URL: https://issues.apache.org/jira/browse/HIVE-9208
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Xiaobing Zhou
> Attachments: HIVE-9208.1.patch, HIVE-9208.2.patch
>
>
> hive-schema-0.15.0.mssql.sql has PARTITIONS.PART_NAME as NVARCHAR but 
> COMPLETED_TXN_COMPONENTS.CTC_PARTITON, COMPACTION_QUEUE.CQ_PARTITION, 
> HIVE_LOCKS.HL_PARTITION, TXN_COMPONENTS.TC_PARTITION all use VARCHAR.  This 
> cannot be right since they all store the same value.
> The same is true of hive-schema-0.14.0.mssql.sql and the two corresponding 
> hive-txn-schema-... files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Created branch 1.0

2015-01-22 Thread Edward Capriolo
If we do a 1.0.0 release there is no problem with us later releasing 14.1
or a 14.2. I think everyone would understand that the 14.2 being released
after 1.0.0 would likely have back ported features. Releasing a 15.0 after
1.0 would not make as much sense as we probably do not want two active
branches.

What I am most concerned about is that we have been doing Hive releases
against snapshots of Apache and other projects. Releases should not depend on
snapshots. If the other project is not mature enough to have a release, we
should probably not be depending on it.

I understand why this happens, but I think it is a bit hokey. Would anyone
be happy if Hive depended on
"git-edwardcapriolo-coolutils-0.0.0-snapshot-5"? So if we are having the
"serious 1.0 talk", we should avoid things like this.



On Thu, Jan 22, 2015 at 3:31 PM, Xuefu Zhang  wrote:

> Hi Thejas/Alan,
>
> From all the arguments, I think there was an assumption that the proposed
> 1.0 release will be imminent and 0.15 will happen far after that. Based on
> that assumption, 0.15 will become 1.1, which is greater in scope than 1.0.
> However, this assumption may not be true. The confusion will be significant
> if 0.15 is released early as 0.15 before 0.14.1 is released as 1.0.
>
> Another concern is that the proposed release of 1.0 is a subset of
> Hive's functionality, and for major releases users are expecting major
> improvement in functionality as well as stability. Mutating from the 0.14.1
> release seems to fall short of that expectation.
>
> Having said that, I'd think it makes more sense to release 0.15 as 0.15,
> and later we release 1.0 as the major release that supersedes any previous
> releases. That will fulfill the expectations of a major release.
>
> Thanks,
> Xuefu
>
> On Thu, Jan 22, 2015 at 12:12 PM, Alan Gates 
> wrote:
>
> > I had one clarifying question for Brock and Xuefu.  Was your proposal to
> > still call the branch from trunk you are planning in a few days 0.15 (and
> > hence release it as 0.15) and have 1.0 be a later release?  Or did you
> want
> > to call what is now 0.15 1.0?  If you wanted 1.0 to be post 0.15, are you
> > ok with stipulating that the next release from trunk after 0.15 (what
> would
> > have been 0.16) is 1.0?
> >
> > Alan.
> >
> >   Thejas Nair 
> >  January 22, 2015 at 12:04
> > Brock, Xuefu,
> >
> > We seem to have trouble reaching to a consensus here. (Please see my
> > arguments why I don't see this causing confusions, and let me know if
> > it changes your opinion).
> > How should we move forward ? Do you think we need to go through a
> > formal vote regarding the release plan as per hive by-laws ?
> >
> >
> >   Thejas Nair 
> >  January 22, 2015 at 10:38
> > I don't see any reasons for confusion.
> > From a user perspective, 1.0 is going to have a super set of changes of
> > 0.14.
> > 1.1 (based on planned 0.15 release) will have a super set of changes in
> > 1.0 .
> >
> >
> >   Xuefu Zhang 
> >  January 21, 2015 at 22:47
> > I strongly believe that the concept of 1.0 out of a branch as proposed is
> > creating the greatest confusion in the community. If for any reason that
> > 1.0 cannot be cut from the trunk, that means that we are not ready and so
> > shall wait until so before considering such a release. Thus, I'd -1 on
> this
> > proposal.
> >
> > Thanks,
> > Xuefu
> >
> >
> >   Gopal V 
> >  January 21, 2015 at 22:29
> > On 1/21/15, 7:09 PM, Brock Noland wrote:
> >
> > To be clear, I strongly feel creating 1.0 from 0.14 will be confusing. In
> > fact it's already created confusion amongst folks on this list.
> > Furthermore,
> > 1.0 should be created from trunk and be a superset of previous releases.
> >
> >
> > I don't think there is any confusion over that - 1.0 is a long-term
> > maintenance which is going to be a super-set of all *critical fixes* made
> > from here on (emphasis).
> >
> > In fact, a long-term support release should be released off an actively
> > updated maintenance branch, which has been baked-in and never from the
> > trunk.
> >
> > Those who have followed the earlier mails would realize that the most
> > important "feature" about this branch is to stick to only long term
> > maintenance - which in effect is adopting HBase's successful idea.
> >
> > That is just plain solid engineering.
> >
> > Anyway, it would be in the best interests of the larger community, to
> find
> > out who else finds that approach confusing.
> >
> > Brock, I'm not sure whether you are confused or whether you think other
> > people will be confused (and if so, why?).
> >
> > Cheers,
> > Gopal
> >
> > On Wed, Jan 21, 2015 at 6:05 PM, Vikram Dixit K 
> > 
> > wrote:
> >
> > @Brock,
> >
> > I created this branch from 0.14. I created this branch based on the email
> > thread discussing 1.0,
> >
> > http://search-hadoop.com/m/8er9YGX8g2
> >
> > where you had said you agreed with the suggestion from Enis from HBase
> who
> > said that we should base 1.0 on a stable version rather than making it a
> > feature r

[jira] [Commented] (HIVE-9378) Spark qfile tests should reuse RSC [Spark Branch]

2015-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288364#comment-14288364
 ] 

Hive QA commented on HIVE-9378:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12693960/HIVE-9378.3-spark.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7351 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map_skew
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/673/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/673/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-673/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12693960 - PreCommit-HIVE-SPARK-Build

> Spark qfile tests should reuse RSC [Spark Branch]
> -
>
> Key: HIVE-9378
> URL: https://issues.apache.org/jira/browse/HIVE-9378
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: spark-branch
>
> Attachments: HIVE-9378.1-spark.patch, HIVE-9378.2-spark.patch, 
> HIVE-9378.3-spark.patch
>
>
> Run several qfile tests and use jps to monitor the Java processes. You will 
> find that several SparkSubmitDriverBootstrapper processes are created (not at 
> the same time, of course). It seems to me that we create an RSC for each 
> qfile, then terminate it when that qfile test is done. The RSC does not seem 
> to be shared among qfiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9434) Shim the method Path.getPathWithoutSchemeAndAuthority

2015-01-22 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288356#comment-14288356
 ] 

Brock Noland commented on HIVE-9434:


+1

> Shim the method Path.getPathWithoutSchemeAndAuthority
> -
>
> Key: HIVE-9434
> URL: https://issues.apache.org/jira/browse/HIVE-9434
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 0.15.0
>Reporter: Brock Noland
>Assignee: Dong Chen
> Fix For: 0.15.0
>
> Attachments: HIVE-9434.patch
>
>
> Since Hadoop 1 does not have the method 
> {{Path.getPathWithoutSchemeAndAuthority}} we need to shim it out.
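
For reference, a minimal sketch of what such a shim could look like (an 
assumption about the approach, not necessarily the attached patch): stripping 
the scheme and authority amounts to rebuilding the Path from the URI's path 
component.

{code}
import org.apache.hadoop.fs.Path;

public class PathShims {
  /**
   * Stand-in for Hadoop 2's Path.getPathWithoutSchemeAndAuthority,
   * implementable against Hadoop 1: keep only the path component of
   * the underlying URI, dropping any scheme://authority prefix.
   */
  public static Path getPathWithoutSchemeAndAuthority(Path path) {
    return path.toUri().getScheme() == null
        ? path                              // already scheme-less
        : new Path(path.toUri().getPath()); // e.g. hdfs://nn:8020/a/b -> /a/b
  }
}
{code}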



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9327) CBO (Calcite Return Path): Removing Row Resolvers from ParseContext

2015-01-22 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9327:
--
Attachment: HIVE-9327.05.patch

New patch with updated golden files.

> CBO (Calcite Return Path): Removing Row Resolvers from ParseContext
> ---
>
> Key: HIVE-9327
> URL: https://issues.apache.org/jira/browse/HIVE-9327
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 0.15.0
>
> Attachments: HIVE-9327.01.patch, HIVE-9327.02.patch, 
> HIVE-9327.03.patch, HIVE-9327.04.patch, HIVE-9327.05.patch, HIVE-9327.patch
>
>
> ParseContext includes a map of Operator to RowResolver (OpParseContext). It 
> would be ideal to remove this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9327) CBO (Calcite Return Path): Removing Row Resolvers from ParseContext

2015-01-22 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9327:
--
Status: Patch Available  (was: Open)

> CBO (Calcite Return Path): Removing Row Resolvers from ParseContext
> ---
>
> Key: HIVE-9327
> URL: https://issues.apache.org/jira/browse/HIVE-9327
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 0.15.0
>
> Attachments: HIVE-9327.01.patch, HIVE-9327.02.patch, 
> HIVE-9327.03.patch, HIVE-9327.04.patch, HIVE-9327.patch
>
>
> ParseContext includes a map of Operator to RowResolver (OpParseContext). It 
> would be ideal to remove this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9327) CBO (Calcite Return Path): Removing Row Resolvers from ParseContext

2015-01-22 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9327:
--
Status: Open  (was: Patch Available)

> CBO (Calcite Return Path): Removing Row Resolvers from ParseContext
> ---
>
> Key: HIVE-9327
> URL: https://issues.apache.org/jira/browse/HIVE-9327
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 0.15.0
>
> Attachments: HIVE-9327.01.patch, HIVE-9327.02.patch, 
> HIVE-9327.03.patch, HIVE-9327.04.patch, HIVE-9327.patch
>
>
> ParseContext includes a map of Operator to RowResolver (OpParseContext). It 
> would be ideal to remove this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9327) CBO (Calcite Return Path): Removing Row Resolvers from ParseContext

2015-01-22 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9327:
--
Attachment: (was: HIVE-9327.05.patch)

> CBO (Calcite Return Path): Removing Row Resolvers from ParseContext
> ---
>
> Key: HIVE-9327
> URL: https://issues.apache.org/jira/browse/HIVE-9327
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 0.15.0
>
> Attachments: HIVE-9327.01.patch, HIVE-9327.02.patch, 
> HIVE-9327.03.patch, HIVE-9327.04.patch, HIVE-9327.patch
>
>
> ParseContext includes a map of Operator to RowResolver (OpParseContext). It 
> would be ideal to remove this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9026) Re-enable remaining tests after HIVE-8970 [Spark Branch]

2015-01-22 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-9026:

   Resolution: Fixed
Fix Version/s: spark-branch
   Status: Resolved  (was: Patch Available)

Committed to spark, thanks Chao for contribution.

> Re-enable remaining tests after HIVE-8970 [Spark Branch]
> 
>
> Key: HIVE-9026
> URL: https://issues.apache.org/jira/browse/HIVE-9026
> Project: Hive
>  Issue Type: Sub-task
>  Components: spark-branch
>Affects Versions: spark-branch
>Reporter: Chao
>Assignee: Chao
> Fix For: spark-branch
>
> Attachments: HIVE-9026.1-spark.patch
>
>
> In HIVE-8970, we disabled several tests which seem to be related to a bug 
> upstream. I filed HIVE-9025 to track it.
> {noformat}
> join38.q
> join_literals.q
> join_nullsafe.q
> subquery_in.q
> ppd_multi_insert.q
> {noformat}
> We need to re-enable these tests after HIVE-9025 is resolved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9446) JDBC DatabaseMetadata.getColumns() does not work for temporary tables

2015-01-22 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288286#comment-14288286
 ] 

Thejas M Nair commented on HIVE-9446:
-

+1

> JDBC DatabaseMetadata.getColumns() does not work for temporary tables
> -
>
> Key: HIVE-9446
> URL: https://issues.apache.org/jira/browse/HIVE-9446
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 0.14.0
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-9446.1.patch
>
>
> After creating a temporary table, calling DatabaseMetaData.getColumns() hits 
> the error "UnknownTableException(message:default.tmp_07 table not found)"



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9208) MetaStore DB schema inconsistent for MS SQL Server in use of varchar/nvarchar

2015-01-22 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288275#comment-14288275
 ] 

Xiaobing Zhou commented on HIVE-9208:
-

They are partition names with multi-byte support.

> MetaStore DB schema inconsistent for MS SQL Server in use of varchar/nvarchar
> -
>
> Key: HIVE-9208
> URL: https://issues.apache.org/jira/browse/HIVE-9208
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Xiaobing Zhou
> Attachments: HIVE-9208.1.patch, HIVE-9208.2.patch
>
>
> hive-schema-0.15.0.mssql.sql has PARTITIONS.PART_NAME as NVARCHAR but 
> COMPLETED_TXN_COMPONENTS.CTC_PARTITON, COMPACTION_QUEUE.CQ_PARTITION, 
> HIVE_LOCKS.HL_PARTITION, TXN_COMPONENTS.TC_PARTITION all use VARCHAR.  This 
> cannot be right since they all store the same value.
> The same is true of hive-schema-0.14.0.mssql.sql and the two corresponding 
> hive-txn-schema-... files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9208) MetaStore DB schema inconsistent for MS SQL Server in use of varchar/nvarchar

2015-01-22 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288271#comment-14288271
 ] 

Xiaobing Zhou commented on HIVE-9208:
-

I don't think we need to change the length, right?

> MetaStore DB schema inconsistent for MS SQL Server in use of varchar/nvarchar
> -
>
> Key: HIVE-9208
> URL: https://issues.apache.org/jira/browse/HIVE-9208
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Xiaobing Zhou
> Attachments: HIVE-9208.1.patch, HIVE-9208.2.patch
>
>
> hive-schema-0.15.0.mssql.sql has PARTITIONS.PART_NAME as NVARCHAR but 
> COMPLETED_TXN_COMPONENTS.CTC_PARTITON, COMPACTION_QUEUE.CQ_PARTITION, 
> HIVE_LOCKS.HL_PARTITION, TXN_COMPONENTS.TC_PARTITION all use VARCHAR.  This 
> cannot be right since they all store the same value.
> The same is true of hive-schema-0.14.0.mssql.sql and the two corresponding 
> hive-txn-schema-... files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9026) Re-enable remaining tests after HIVE-8970 [Spark Branch]

2015-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288257#comment-14288257
 ] 

Hive QA commented on HIVE-9026:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12693957/HIVE-9026.1-spark.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7355 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/672/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/672/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-672/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12693957 - PreCommit-HIVE-SPARK-Build

> Re-enable remaining tests after HIVE-8970 [Spark Branch]
> 
>
> Key: HIVE-9026
> URL: https://issues.apache.org/jira/browse/HIVE-9026
> Project: Hive
>  Issue Type: Sub-task
>  Components: spark-branch
>Affects Versions: spark-branch
>Reporter: Chao
>Assignee: Chao
> Attachments: HIVE-9026.1-spark.patch
>
>
> In HIVE-8970, we disabled several tests which seem to be related to a bug 
> upstream. I filed HIVE-9025 to track it.
> {noformat}
> join38.q
> join_literals.q
> join_nullsafe.q
> subquery_in.q
> ppd_multi_insert.q
> {noformat}
> We need to re-enable these tests after HIVE-9025 is resolved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9416) Get rid of Extract Operator

2015-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288252#comment-14288252
 ] 

Hive QA commented on HIVE-9416:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12693778/HIVE-9416.1.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2480/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2480/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2480/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-2480/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 
'metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreEventListener.java'
Reverted 
'metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java'
Reverted 
'metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java'
Reverted 
'metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java'
Reverted 'metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py'
Reverted 'metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py'
Reverted 
'metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote'
Reverted 'metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp'
Reverted 'metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp'
Reverted 'metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h'
Reverted 'metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h'
Reverted 
'metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp'
Reverted 'metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb'
Reverted 'metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb'
Reverted 
'metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java'
Reverted 'metastore/src/gen/thrift/gen-php/metastore/ThriftHiveMetastore.php'
Reverted 'metastore/src/gen/thrift/gen-php/metastore/Types.php'
Reverted 'metastore/if/hive_metastore.thrift'
Reverted 
'itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/TestDbNotificationListener.java'
Reverted 
'hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HCatConstants.java'
Reverted 
'hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java'
Reverted 
'hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/NotificationListener.java'
Reverted 
'hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/messaging/HCatEventMessage.java'
Reverted 
'hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/messaging/json/JSONMessageFactory.java'
Reverted 
'hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/messaging/MessageFactory.java'
++ awk '{print $2}'
++ egrep -v '^X|^Performing status on external'
++ svn status --no-ignore
+ rm -rf target datanucleus.log ant/target shims/target shims/0.20S/target 
shims/0.23/target shims/aggregator/target shims/common/target 
shims/scheduler/target packaging/target hbase-handler/target testutils/target 
jdbc/target metastore/target 
metastore/src/java/org/apache/hadoop/hive/metastore/events/InsertEvent.java 
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/FireEventRequest.java
 
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/EventRequestType.java
 itests/target itests/thirdparty itests

[jira] [Commented] (HIVE-9271) Add ability for client to request metastore to fire an event

2015-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288250#comment-14288250
 ] 

Hive QA commented on HIVE-9271:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12693773/HIVE-9271.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7349 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2479/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2479/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2479/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12693773 - PreCommit-HIVE-TRUNK-Build

> Add ability for client to request metastore to fire an event
> 
>
> Key: HIVE-9271
> URL: https://issues.apache.org/jira/browse/HIVE-9271
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 0.15.0
>
> Attachments: HIVE-9271.patch
>
>
> Currently all events in Hive are fired by the metastore.  However, there are 
> events that only the client fully understands, such as DML operations.  There 
> should be a way for the client to request the metastore to fire a particular 
> event.
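
As a purely hypothetical sketch of what such a client-side hook might look 
like (all names below are illustrative only, not the interface the patch 
defines):

{code}
// Hypothetical API sketch; the actual patch may define a different interface.
public interface EventFiringClient {

  enum EventRequestType { INSERT, UPDATE, DELETE }

  /**
   * Ask the metastore to fire an event on the client's behalf, e.g. after
   * a DML operation that only the client fully understands.
   */
  void fireListenerEvent(String dbName, String tableName,
                         EventRequestType type) throws Exception;
}
{code}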



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9039) Support Union Distinct

2015-01-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-9039:
--
Attachment: HIVE-9039.20.patch

Address [~jpullokkaran]'s comments.

> Support Union Distinct
> --
>
> Key: HIVE-9039
> URL: https://issues.apache.org/jira/browse/HIVE-9039
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, 
> HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, 
> HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, 
> HIVE-9039.09.patch, HIVE-9039.10.patch, HIVE-9039.11.patch, 
> HIVE-9039.12.patch, HIVE-9039.13.patch, HIVE-9039.14.patch, 
> HIVE-9039.15.patch, HIVE-9039.16.patch, HIVE-9039.17.patch, 
> HIVE-9039.18.patch, HIVE-9039.19.patch, HIVE-9039.20.patch
>
>
> The current version (Hive 0.14) does not support union (i.e. union distinct); 
> it only supports union all. In this patch, we try to add this new feature by 
> rewriting union distinct to union all followed by a group by.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9039) Support Union Distinct

2015-01-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-9039:
--
Status: Patch Available  (was: Open)

> Support Union Distinct
> --
>
> Key: HIVE-9039
> URL: https://issues.apache.org/jira/browse/HIVE-9039
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, 
> HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, 
> HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, 
> HIVE-9039.09.patch, HIVE-9039.10.patch, HIVE-9039.11.patch, 
> HIVE-9039.12.patch, HIVE-9039.13.patch, HIVE-9039.14.patch, 
> HIVE-9039.15.patch, HIVE-9039.16.patch, HIVE-9039.17.patch, 
> HIVE-9039.18.patch, HIVE-9039.19.patch, HIVE-9039.20.patch
>
>
> The current version (Hive 0.14) does not support union (i.e. union distinct); 
> it only supports union all. In this patch, we try to add this new feature by 
> rewriting union distinct to union all followed by a group by.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9039) Support Union Distinct

2015-01-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-9039:
--
Status: Open  (was: Patch Available)

> Support Union Distinct
> --
>
> Key: HIVE-9039
> URL: https://issues.apache.org/jira/browse/HIVE-9039
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, 
> HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, 
> HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, 
> HIVE-9039.09.patch, HIVE-9039.10.patch, HIVE-9039.11.patch, 
> HIVE-9039.12.patch, HIVE-9039.13.patch, HIVE-9039.14.patch, 
> HIVE-9039.15.patch, HIVE-9039.16.patch, HIVE-9039.17.patch, 
> HIVE-9039.18.patch, HIVE-9039.19.patch, HIVE-9039.20.patch
>
>
> The current version (Hive 0.14) does not support union (i.e. union distinct); 
> it only supports union all. In this patch, we try to add this new feature by 
> rewriting union distinct to union all followed by a group by.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 28797: Support Union Distinct

2015-01-22 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28797/
---

(Updated Jan. 22, 2015, 9:16 p.m.)


Review request for hive and John Pullokkaran.


Changes
---

remove spaces, refactor the code


Repository: hive-git


Description
---

The current version (Hive 0.14) does not support union (i.e. union distinct); it 
only supports union all. In this patch, we try to add this new feature by 
rewriting union distinct to union all followed by a group by.
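
To illustrate the semantics of the rewrite outside HiveQL (a hedged analogy in 
plain Java, not the patch itself): union distinct is union all followed by a 
group by over the full row, i.e. concatenation followed by key-collapsing:

{code}
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class UnionRewriteDemo {
  public static void main(String[] args) {
    List<String> left  = List.of("a", "b", "b");
    List<String> right = List.of("b", "c");

    // UNION ALL: plain concatenation, duplicates preserved.
    List<String> unionAll =
        Stream.concat(left.stream(), right.stream()).collect(Collectors.toList());

    // UNION DISTINCT = UNION ALL + GROUP BY on all columns: grouping on the
    // whole row collapses each set of duplicates to a single representative.
    List<String> unionDistinct = unionAll.stream()
        .collect(Collectors.groupingBy(row -> row)) // GROUP BY <all columns>
        .keySet().stream().sorted().collect(Collectors.toList());

    System.out.println(unionAll);      // [a, b, b, b, c]
    System.out.println(unionDistinct); // [a, b, c]
  }
}
{code}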


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties d08651b 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/ASTConverter.java
 95ad9e0 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g 9c7603c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g c960a6b 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 4364f28 
  ql/src/test/queries/clientnegative/unionClusterBy.q PRE-CREATION 
  ql/src/test/queries/clientnegative/unionDistributeBy.q PRE-CREATION 
  ql/src/test/queries/clientnegative/unionLimit.q PRE-CREATION 
  ql/src/test/queries/clientnegative/unionOrderBy.q PRE-CREATION 
  ql/src/test/queries/clientnegative/unionSortBy.q PRE-CREATION 
  ql/src/test/queries/clientpositive/cbo_union.q e9508c5 
  ql/src/test/queries/clientpositive/explode_null.q 76e4535 
  ql/src/test/queries/clientpositive/input25.q e48368f 
  ql/src/test/queries/clientpositive/input26.q 642a7db 
  ql/src/test/queries/clientpositive/load_dyn_part14.q c34c3bf 
  ql/src/test/queries/clientpositive/metadataOnlyOptimizer.q a26ef1a 
  ql/src/test/queries/clientpositive/script_env_var1.q 381c5dc 
  ql/src/test/queries/clientpositive/script_env_var2.q 5f10812 
  ql/src/test/queries/clientpositive/union3.q d402cb0 
  ql/src/test/queries/clientpositive/unionDistinct_1.q PRE-CREATION 
  ql/src/test/queries/clientpositive/unionDistinct_2.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_null.q 64e1672 
  ql/src/test/queries/clientpositive/union_remove_25.q c6c09e1 
  ql/src/test/queries/clientpositive/union_top_level.q 946473a 
  ql/src/test/queries/clientpositive/vector_multi_insert.q 77404e9 
  ql/src/test/results/clientnegative/unionClusterBy.q.out PRE-CREATION 
  ql/src/test/results/clientnegative/unionDistributeBy.q.out PRE-CREATION 
  ql/src/test/results/clientnegative/unionLimit.q.out PRE-CREATION 
  ql/src/test/results/clientnegative/unionOrderBy.q.out PRE-CREATION 
  ql/src/test/results/clientnegative/unionSortBy.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/ba_table_union.q.out 706a537 
  ql/src/test/results/clientpositive/cbo_union.q.out 1fd88ec 
  ql/src/test/results/clientpositive/char_union1.q.out bdc4a1d 
  ql/src/test/results/clientpositive/explain_logical.q.out 2e73a89 
  ql/src/test/results/clientpositive/explode_null.q.out db71c69 
  ql/src/test/results/clientpositive/groupby_sort_1_23.q.out dd450cb 
  ql/src/test/results/clientpositive/groupby_sort_skew_1_23.q.out 2f08999 
  ql/src/test/results/clientpositive/input25.q.out 141a576 
  ql/src/test/results/clientpositive/input26.q.out 66d3bd2 
  ql/src/test/results/clientpositive/input_part7.q.out 6094f9c 
  ql/src/test/results/clientpositive/join34.q.out a20e49f 
  ql/src/test/results/clientpositive/join35.q.out 937539c 
  ql/src/test/results/clientpositive/load_dyn_part14.q.out a9dde4d 
  ql/src/test/results/clientpositive/merge4.q.out 121b724 
  ql/src/test/results/clientpositive/metadataOnlyOptimizer.q.out 1fcbc0a 
  ql/src/test/results/clientpositive/optimize_nullscan.q.out 4eb498e 
  ql/src/test/results/clientpositive/script_env_var1.q.out 8e1075a 
  ql/src/test/results/clientpositive/script_env_var2.q.out 89f3606 
  ql/src/test/results/clientpositive/spark/groupby_sort_1_23.q.out 569501f 
  ql/src/test/results/clientpositive/spark/groupby_sort_skew_1_23.q.out 6e66697 
  ql/src/test/results/clientpositive/spark/join34.q.out c337093 
  ql/src/test/results/clientpositive/spark/join35.q.out 2b217c1 
  ql/src/test/results/clientpositive/spark/load_dyn_part14.q.out 1f9985f 
  ql/src/test/results/clientpositive/spark/optimize_nullscan.q.out 3a8efcf 
  ql/src/test/results/clientpositive/spark/script_env_var1.q.out 8e1075a 
  ql/src/test/results/clientpositive/spark/script_env_var2.q.out 89f3606 
  ql/src/test/results/clientpositive/spark/union3.q.out 1e79c34 
  ql/src/test/results/clientpositive/spark/union_null.q.out 4574a2e 
  ql/src/test/results/clientpositive/spark/union_ppr.q.out 3e1a4b8 
  ql/src/test/results/clientpositive/spark/union_remove_25.q.out d36a246 
  ql/src/test/results/clientpositive/tez/cbo_union.q.out 1fd88ec 
  ql/src/test/results/clientpositive/tez/optimize_nullscan.q.out da456c7 
  ql/src/test/results/clientpositive/tez/script_env_var1.q.out 8e1075a 
  ql/src/test/results/clientpositive/tez/script_env_var2.q.out 89f3606 
  ql/src/test/results/clie
