[jira] [Updated] (HIVE-13860) Fix more json related JDK8 test failures

2016-05-26 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-13860:
---
Attachment: HIVE-13860-java8.patch

> Fix more json related JDK8 test failures
> 
>
> Key: HIVE-13860
> URL: https://issues.apache.org/jira/browse/HIVE-13860
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-13860-java8.patch, HIVE-13860-java8.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13737) incorrect count when multiple inserts with union all

2016-05-26 Thread frank luo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302324#comment-15302324
 ] 

frank luo commented on HIVE-13737:
--

Turning stats off gives the correct result. So is it a bug in StatsOptimizer?

On Thu, May 26, 2016 at 11:04 AM, Ashutosh Chauhan (JIRA) 



> incorrect count when multiple inserts with union all
> 
>
> Key: HIVE-13737
> URL: https://issues.apache.org/jira/browse/HIVE-13737
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
> Environment: hdp 2.3.4.7 on Red Hat 6
>Reporter: Frank Luo
>Priority: Critical
>
> Here is a test case to illustrate the issue. It seems MR works fine but Tez 
> is having the problem. 
> CREATE TABLE test(col1   STRING);
> CREATE TABLE src (col1 string);
> insert into table src values ('a');
> INSERT into TABLE test
> select * from (
>SELECT * from src
>UNION ALL
>SELECT * from src) x;
> -- do it one more time
> INSERT INTO TABLE test
>SELECT * from src
>UNION ALL
>SELECT * from src;
> --below gives correct result
> SELECT * FROM TEST;
> --count is incorrect. It might give either '1' or '2', but I am expecting '4'
> SELECT count (*) FROM test;





[jira] [Commented] (HIVE-6683) Beeline does not accept comments at end of line

2016-05-26 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302312#comment-15302312
 ] 

Sergio Peña commented on HIVE-6683:
---

Thanks [~Tauruzzz]. Could you create a JIRA and include the information necessary to reproduce it, along with the affected versions?

> Beeline does not accept comments at end of line
> ---
>
> Key: HIVE-6683
> URL: https://issues.apache.org/jira/browse/HIVE-6683
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.10.0
>Reporter: Jeremy Beard
>Assignee: Sergio Peña
>  Labels: TODOC15
> Fix For: 1.1.0
>
> Attachments: HIVE-6683.1.patch, HIVE-6683.1.patch
>
>
> Beeline fails to read queries where lines have comments at the end. This 
> works in the embedded Hive CLI.
> Example:
> SELECT
> 1 -- this is a comment about this value
> FROM
> table;
> Error: Error while processing statement: FAILED: ParseException line 1:36 
> mismatched input '' expecting FROM near '1' in from clause 
> (state=42000,code=4)





[jira] [Commented] (HIVE-13737) incorrect count when multiple inserts with union all

2016-05-26 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302310#comment-15302310
 ] 

Ashutosh Chauhan commented on HIVE-13737:
-

StatsOptimizer is throwing the query off. Try turning it off:
{code}
set hive.compute.query.using.stats=false;
{code}
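Applied to the reporter's test case, the workaround looks like this (table name taken from the example in the description; illustrative only):
{code}
set hive.compute.query.using.stats=false;
-- with stats-based answering disabled, the count comes from an actual scan:
SELECT count(*) FROM test;
{code}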

> incorrect count when multiple inserts with union all
> 
>
> Key: HIVE-13737
> URL: https://issues.apache.org/jira/browse/HIVE-13737
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
> Environment: hdp 2.3.4.7 on Red Hat 6
>Reporter: Frank Luo
>Priority: Critical
>
> Here is a test case to illustrate the issue. It seems MR works fine but Tez 
> is having the problem. 
> CREATE TABLE test(col1   STRING);
> CREATE TABLE src (col1 string);
> insert into table src values ('a');
> INSERT into TABLE test
> select * from (
>SELECT * from src
>UNION ALL
>SELECT * from src) x;
> -- do it one more time
> INSERT INTO TABLE test
>SELECT * from src
>UNION ALL
>SELECT * from src;
> --below gives correct result
> SELECT * FROM TEST;
> --count is incorrect. It might give either '1' or '2', but I am expecting '4'
> SELECT count (*) FROM test;





[jira] [Commented] (HIVE-13826) Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER

2016-05-26 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302301#comment-15302301
 ] 

Ashutosh Chauhan commented on HIVE-13826:
-

branch-2.1 has already been cut; master is on 2.2 now. If you want this to be in the 2.1 release, it needs to be committed to branch-2.1 specifically.

> Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER
> 
>
> Key: HIVE-13826
> URL: https://issues.apache.org/jira/browse/HIVE-13826
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13826.01.patch, HIVE-13826.02.patch
>
>
> GenericUDFBetween doesn't vectorize with VectorUDFAdaptor when used as FILTER 
> (i.e. as single item for WHERE).





[jira] [Commented] (HIVE-13737) incorrect count when multiple inserts with union all

2016-05-26 Thread frank luo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302294#comment-15302294
 ] 

frank luo commented on HIVE-13737:
--

hive> explain INSERT INTO TABLE test SELECT * from src UNION ALL SELECT * from src;
OK
Plan not optimized by CBO.

Vertex dependency in root stage
Map 1 <- Union 2 (CONTAINS)
Map 3 <- Union 2 (CONTAINS)

Stage-4
   Stats-Aggr Operator
      Stage-0
         Move Operator
            table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","name:":"jluo.test","input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}
            Stage-2
               Dependency Collection{}
               Stage-1
                  Union 2
                  |<-Map 1 [CONTAINS]
                  |  File Output Operator [FS_6]
                  |     compressed:false
                  |     Statistics:Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
                  |     table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","name:":"jluo.test","input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}
                  |     Select Operator [SEL_1]
                  |        outputColumnNames:["_col0"]
                  |        Statistics:Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
                  |        TableScan [TS_0]
                  |           alias:src
                  |           Statistics:Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
                  |<-Map 3 [CONTAINS]
                     File Output Operator [FS_6]
                        compressed:false
                        Statistics:Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
                        table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","name:":"jluo.test","input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}
                        Select Operator [SEL_3]
                           outputColumnNames:["_col0"]
                           Statistics:Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
                           TableScan [TS_2]
                              alias:src
                              Statistics:Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
Stage-3
   Stats-Aggr Operator
      Please refer to the previous Stage-0

Time taken: 0.088 seconds, Fetched: 41 row(s)

hive> explain SELECT count(*)  FROM test;
OK
Plan not optimized by CBO.

Stage-0
   Fetch Operator
  limit:1

Time taken: 0.037 seconds, Fetched: 6 row(s)


> incorrect count when multiple inserts with union all
> 
>
> Key: HIVE-13737
> URL: https://issues.apache.org/jira/browse/HIVE-13737
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
> Environment: hdp 2.3.4.7 on Red Hat 6
>Reporter: Frank Luo
>Priority: Critical
>
> Here is a test case to illustrate the issue. It seems MR works fine but Tez 
> is having the problem. 
> CREATE TABLE test(col1   STRING);
> CREATE TABLE src (col1 string);
> insert into table src values ('a');
> INSERT into TABLE test
> select * from (
>SELECT * from src
>UNION ALL
>SELECT * from src) x;
> -- do it one more time
> INSERT INTO TABLE test
>SELECT * from src
>UNION ALL
>SELECT * from src;
> --below gives correct result
> SELECT * FROM TEST;
> --count is incorrect. It might give either '1' or '2', but I am expecting '4'
> SELECT count (*) FROM test;





[jira] [Commented] (HIVE-13737) incorrect count when multiple inserts with union all

2016-05-26 Thread frank luo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302296#comment-15302296
 ] 

frank luo commented on HIVE-13737:
--

Forgot to mention that I was using Tez as the Hive execution engine.

> incorrect count when multiple inserts with union all
> 
>
> Key: HIVE-13737
> URL: https://issues.apache.org/jira/browse/HIVE-13737
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
> Environment: hdp 2.3.4.7 on Red Hat 6
>Reporter: Frank Luo
>Priority: Critical
>
> Here is a test case to illustrate the issue. It seems MR works fine but Tez 
> is having the problem. 
> CREATE TABLE test(col1   STRING);
> CREATE TABLE src (col1 string);
> insert into table src values ('a');
> INSERT into TABLE test
> select * from (
>SELECT * from src
>UNION ALL
>SELECT * from src) x;
> -- do it one more time
> INSERT INTO TABLE test
>SELECT * from src
>UNION ALL
>SELECT * from src;
> --below gives correct result
> SELECT * FROM TEST;
> --count is incorrect. It might give either '1' or '2', but I am expecting '4'
> SELECT count (*) FROM test;





[jira] [Commented] (HIVE-13826) Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER

2016-05-26 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302286#comment-15302286
 ] 

Matt McCline commented on HIVE-13826:
-

Also committed to branch-1.

> Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER
> 
>
> Key: HIVE-13826
> URL: https://issues.apache.org/jira/browse/HIVE-13826
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13826.01.patch, HIVE-13826.02.patch
>
>
> GenericUDFBetween doesn't vectorize with VectorUDFAdaptor when used as FILTER 
> (i.e. as single item for WHERE).





[jira] [Updated] (HIVE-13826) Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER

2016-05-26 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13826:

Fix Version/s: 1.3.0

> Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER
> 
>
> Key: HIVE-13826
> URL: https://issues.apache.org/jira/browse/HIVE-13826
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13826.01.patch, HIVE-13826.02.patch
>
>
> GenericUDFBetween doesn't vectorize with VectorUDFAdaptor when used as FILTER 
> (i.e. as single item for WHERE).





[jira] [Commented] (HIVE-13826) Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER

2016-05-26 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302264#comment-15302264
 ] 

Ashutosh Chauhan commented on HIVE-13826:
-

[~mmccline] Would you also like to commit this to branch-2.1?

> Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER
> 
>
> Key: HIVE-13826
> URL: https://issues.apache.org/jira/browse/HIVE-13826
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.1.0
>
> Attachments: HIVE-13826.01.patch, HIVE-13826.02.patch
>
>
> GenericUDFBetween doesn't vectorize with VectorUDFAdaptor when used as FILTER 
> (i.e. as single item for WHERE).





[jira] [Commented] (HIVE-13737) incorrect count when multiple inserts with union all

2016-05-26 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302261#comment-15302261
 ] 

Ashutosh Chauhan commented on HIVE-13737:
-

Can you add explain output?

> incorrect count when multiple inserts with union all
> 
>
> Key: HIVE-13737
> URL: https://issues.apache.org/jira/browse/HIVE-13737
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
> Environment: hdp 2.3.4.7 on Red Hat 6
>Reporter: Frank Luo
>Priority: Critical
>
> Here is a test case to illustrate the issue. It seems MR works fine but Tez 
> is having the problem. 
> CREATE TABLE test(col1   STRING);
> CREATE TABLE src (col1 string);
> insert into table src values ('a');
> INSERT into TABLE test
> select * from (
>SELECT * from src
>UNION ALL
>SELECT * from src) x;
> -- do it one more time
> INSERT INTO TABLE test
>SELECT * from src
>UNION ALL
>SELECT * from src;
> --below gives correct result
> SELECT * FROM TEST;
> --count is incorrect. It might give either '1' or '2', but I am expecting '4'
> SELECT count (*) FROM test;





[jira] [Commented] (HIVE-13850) File name conflict when have multiple INSERT INTO queries running in parallel

2016-05-26 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302260#comment-15302260
 ] 

Ashutosh Chauhan commented on HIVE-13850:
-

Whatever name you choose, you will always be susceptible to a [TOCTTOU issue|https://en.wikipedia.org/wiki/Time_of_check_to_time_of_use], since the name is chosen by a different process (the Hive CLI) than the one doing the renames (the NameNode). Until HDFS adds a merge API (HDFS-9763), the best way to handle this scenario is to turn on locking: https://cwiki.apache.org/confluence/display/Hive/Locking
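As a sketch of that workaround, enabling the ZooKeeper-based lock manager looks roughly like the following (property names are from the Hive wiki; the quorum hosts are placeholders for your cluster):
{code}
set hive.support.concurrency=true;
set hive.lock.manager=org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager;
set hive.zookeeper.quorum=zk1:2181,zk2:2181,zk3:2181;  -- placeholder hosts
{code}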


> File name conflict when have multiple INSERT INTO queries running in parallel
> -
>
> Key: HIVE-13850
> URL: https://issues.apache.org/jira/browse/HIVE-13850
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Bing Li
>Assignee: Bing Li
> Attachments: HIVE-13850-1.2.1.patch
>
>
> We have an application which connects to HiveServer2 via JDBC.
> In the application, it executes "INSERT INTO" queries against the same table.
> If many users run the application at the same time, some of the INSERTs can fail.
> The root cause is that Hive.checkPaths() uses the following loop to check whether the file already exists; if multiple inserts run in parallel, it leads to a conflict:
> for (int counter = 1; fs.exists(itemDest) || destExists(result, itemDest); counter++) {
>   itemDest = new Path(destf, name + ("_copy_" + counter) + filetype);
> }
> The Error Message
> ===
> In the hive log:
> org.apache.hadoop.hive.ql.metadata.HiveException: copyFiles: error while moving files!!! Cannot move hdfs://node:8020/apps/hive/warehouse/metadata.db/scalding_stats/.hive-staging_hive_2016-05-10_18-46-23_642_2056172497900766879-3321/-ext-1/00_0 to hdfs://node:8020/apps/hive/warehouse/metadata.db/scalding_stats/00_0_copy_9014
> at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:2719)
> at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1645)
> In the hadoop log:
> WARN hdfs.StateChange (FSDirRenameOp.java:unprotectedRenameTo(174)) - DIR* FSDirectory.unprotectedRenameTo: failed to rename /apps/hive/warehouse/metadata.db/scalding_stats/.hive-staging_hive_2016-05-10_18-46-23_642_2056172497900766879-3321/-ext-1/00_0 to /apps/hive/warehouse/metadata.db/scalding_stats/00_0_copy_9014 because destination exists





[jira] [Comment Edited] (HIVE-13301) Hive on spark local mode broken

2016-05-26 Thread Teng Qiu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302255#comment-15302255
 ] 

Teng Qiu edited comment on HIVE-13301 at 5/26/16 3:31 PM:
--

Hi, I ran into the same issue; I could not simply remove the calcite-avatica jar. I replaced the calcite-avatica jar with this one:
https://repo1.maven.org/maven2/org/apache/calcite/avatica/avatica/1.7.1/avatica-1.7.1.jar

Then it works, but I am not sure what the difference is between the avatica jar and the calcite-avatica jar... it is really a very ugly hack.



was (Author: chutium):
Hi, i run into the same issue, can not simply remove calcite-avatica jar... i 
replaced calcite-avatica jar with this one 
https://repo1.maven.org/maven2/org/apache/calcite/avatica/avatica/1.7.1/avatica-1.7.1.jar

then it works.


> Hive on spark local mode broken
> ---
>
> Key: HIVE-13301
> URL: https://issues.apache.org/jira/browse/HIVE-13301
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Szehon Ho
>
> Was trying to run hive-on-spark local mode (set spark.master=local), and 
> found it is not working due to jackson-databind conflict with spark's version.
> {noformat}
> 16/03/17 13:55:43 [f2e832af-82fc-426b-b0b1-ad201210cef4 main]: INFO 
> exec.SerializationUtilities: Serializing MapWork using kryo
> java.lang.NoSuchMethodError: 
> com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
>   at 
> com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.(ScalaNumberDeserializersModule.scala:49)
>   at 
> com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.(ScalaNumberDeserializersModule.scala)
>   at 
> com.fasterxml.jackson.module.scala.deser.ScalaNumberDeserializersModule$class.$init$(ScalaNumberDeserializersModule.scala:61)
>   at 
> com.fasterxml.jackson.module.scala.DefaultScalaModule.(DefaultScalaModule.scala:19)
>   at 
> com.fasterxml.jackson.module.scala.DefaultScalaModule$.(DefaultScalaModule.scala:35)
>   at 
> com.fasterxml.jackson.module.scala.DefaultScalaModule$.(DefaultScalaModule.scala)
>   at 
> org.apache.spark.rdd.RDDOperationScope$.(RDDOperationScope.scala:81)
>   at 
> org.apache.spark.rdd.RDDOperationScope$.(RDDOperationScope.scala)
>   at org.apache.spark.SparkContext.withScope(SparkContext.scala:714)
>   at org.apache.spark.SparkContext.hadoopRDD(SparkContext.scala:991)
>   at 
> org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:419)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateMapInput(SparkPlanGenerator.java:205)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateParentTran(SparkPlanGenerator.java:145)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:117)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.LocalHiveSparkClient.execute(LocalHiveSparkClient.java:130)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.submit(SparkSessionImpl.java:71)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:94)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:172)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:101)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1838)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1579)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1353)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1124)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1112)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:779)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:718)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {noformat}
> Seems conflicting version of this jackson-databind class is being brought in 
> via calcite-avatica.jar.





[jira] [Commented] (HIVE-13301) Hive on spark local mode broken

2016-05-26 Thread Teng Qiu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302255#comment-15302255
 ] 

Teng Qiu commented on HIVE-13301:
-

Hi, i run into the same issue, can not simply remove calcite-avatica jar... i 
replaced calcite-avatica jar with this one 
https://repo1.maven.org/maven2/org/apache/calcite/avatica/avatica/1.7.1/avatica-1.7.1.jar

then it works.


> Hive on spark local mode broken
> ---
>
> Key: HIVE-13301
> URL: https://issues.apache.org/jira/browse/HIVE-13301
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Szehon Ho
>
> Was trying to run hive-on-spark local mode (set spark.master=local), and 
> found it is not working due to jackson-databind conflict with spark's version.
> {noformat}
> 16/03/17 13:55:43 [f2e832af-82fc-426b-b0b1-ad201210cef4 main]: INFO 
> exec.SerializationUtilities: Serializing MapWork using kryo
> java.lang.NoSuchMethodError: 
> com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
>   at 
> com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.(ScalaNumberDeserializersModule.scala:49)
>   at 
> com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.(ScalaNumberDeserializersModule.scala)
>   at 
> com.fasterxml.jackson.module.scala.deser.ScalaNumberDeserializersModule$class.$init$(ScalaNumberDeserializersModule.scala:61)
>   at 
> com.fasterxml.jackson.module.scala.DefaultScalaModule.(DefaultScalaModule.scala:19)
>   at 
> com.fasterxml.jackson.module.scala.DefaultScalaModule$.(DefaultScalaModule.scala:35)
>   at 
> com.fasterxml.jackson.module.scala.DefaultScalaModule$.(DefaultScalaModule.scala)
>   at 
> org.apache.spark.rdd.RDDOperationScope$.(RDDOperationScope.scala:81)
>   at 
> org.apache.spark.rdd.RDDOperationScope$.(RDDOperationScope.scala)
>   at org.apache.spark.SparkContext.withScope(SparkContext.scala:714)
>   at org.apache.spark.SparkContext.hadoopRDD(SparkContext.scala:991)
>   at 
> org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:419)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateMapInput(SparkPlanGenerator.java:205)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateParentTran(SparkPlanGenerator.java:145)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:117)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.LocalHiveSparkClient.execute(LocalHiveSparkClient.java:130)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.submit(SparkSessionImpl.java:71)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:94)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:172)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:101)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1838)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1579)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1353)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1124)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1112)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:779)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:718)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {noformat}
> Seems conflicting version of this jackson-databind class is being brought in 
> via calcite-avatica.jar.





[jira] [Commented] (HIVE-13737) incorrect count when multiple inserts with union all

2016-05-26 Thread frank luo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302252#comment-15302252
 ] 

frank luo commented on HIVE-13737:
--

It always gives the wrong result.

> incorrect count when multiple inserts with union all
> 
>
> Key: HIVE-13737
> URL: https://issues.apache.org/jira/browse/HIVE-13737
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
> Environment: hdp 2.3.4.7 on Red Hat 6
>Reporter: Frank Luo
>Priority: Critical
>
> Here is a test case to illustrate the issue. It seems MR works fine but Tez 
> is having the problem. 
> CREATE TABLE test(col1   STRING);
> CREATE TABLE src (col1 string);
> insert into table src values ('a');
> INSERT into TABLE test
> select * from (
>SELECT * from src
>UNION ALL
>SELECT * from src) x;
> -- do it one more time
> INSERT INTO TABLE test
>SELECT * from src
>UNION ALL
>SELECT * from src;
> --below gives correct result
> SELECT * FROM TEST;
> --count is incorrect. It might give either '1' or '2', but I am expecting '4'
> SELECT count (*) FROM test;





[jira] [Updated] (HIVE-13826) Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER

2016-05-26 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13826:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER
> 
>
> Key: HIVE-13826
> URL: https://issues.apache.org/jira/browse/HIVE-13826
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.1.0
>
> Attachments: HIVE-13826.01.patch, HIVE-13826.02.patch
>
>
> GenericUDFBetween doesn't vectorize with VectorUDFAdaptor when used as FILTER 
> (i.e. as single item for WHERE).





[jira] [Commented] (HIVE-13826) Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER

2016-05-26 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302248#comment-15302248
 ] 

Matt McCline commented on HIVE-13826:
-

Committed to master.

> Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER
> 
>
> Key: HIVE-13826
> URL: https://issues.apache.org/jira/browse/HIVE-13826
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.1.0
>
> Attachments: HIVE-13826.01.patch, HIVE-13826.02.patch
>
>
> GenericUDFBetween doesn't vectorize with VectorUDFAdaptor when used as FILTER 
> (i.e. as single item for WHERE).





[jira] [Commented] (HIVE-13826) Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER

2016-05-26 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302236#comment-15302236
 ] 

Matt McCline commented on HIVE-13826:
-

None of the test failures are related.

> Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER
> 
>
> Key: HIVE-13826
> URL: https://issues.apache.org/jira/browse/HIVE-13826
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-13826.01.patch, HIVE-13826.02.patch
>
>
> GenericUDFBetween doesn't vectorize with VectorUDFAdaptor when used as FILTER 
> (i.e. as single item for WHERE).





[jira] [Updated] (HIVE-13862) org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter falls back to ORM

2016-05-26 Thread Rajat Khandelwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajat Khandelwal updated HIVE-13862:

Attachment: HIVE-13862.patch

Simple enough patch. 
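For context, the failure mode and one defensive way to extract the number can be sketched as below. This is an illustrative sketch only, not necessarily what the attached patch does; the class and method names here are invented. Some DataNucleus query paths hand back a result collection (e.g. ForwardQueryResult) where a bare Number is expected, which triggers the ClassCastException shown in the description.

```java
import java.util.Arrays;
import java.util.Collection;

// Illustrative sketch: unwrap a possible single-row result collection
// before casting to Number, instead of casting the collection itself.
public class SqlIntExtractor {

    static long extractSqlLong(Object value) {
        Object v = value;
        if (v instanceof Collection) {
            Collection<?> c = (Collection<?>) v;
            v = c.isEmpty() ? 0L : c.iterator().next(); // unwrap single-row result
        }
        return ((Number) v).longValue();
    }

    public static void main(String[] args) {
        System.out.println(extractSqlLong(42));               // plain Number
        System.out.println(extractSqlLong(Arrays.asList(7))); // wrapped result
    }
}
```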

> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter
>  falls back to ORM 
> ---
>
> Key: HIVE-13862
> URL: https://issues.apache.org/jira/browse/HIVE-13862
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Amareshwari Sriramadasu
>Assignee: Rajat Khandelwal
> Fix For: 2.1.0
>
> Attachments: HIVE-13862.patch
>
>
> We are seeing the following exception, and calls fall back to ORM, which makes them costly:
> {noformat}
>  WARN  org.apache.hadoop.hive.metastore.ObjectStore - Direct SQL failed, 
> falling back to ORM
> java.lang.ClassCastException: 
> org.datanucleus.store.rdbms.query.ForwardQueryResult cannot be cast to 
> java.lang.Number
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.extractSqlInt(MetaStoreDirectSql.java:892)
>  ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:855)
>  ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter(MetaStoreDirectSql.java:405)
>  ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2763)
>  ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2755)
>  ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2606)
>  ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilterInternal(ObjectStore.java:2770)
>  [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilter(ObjectStore.java:2746)
>  [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> {noformat}
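The ClassCastException above suggests the direct-SQL path casts the query-result object itself, rather than the scalar it contains, to java.lang.Number. A minimal sketch of that failure mode and a defensive extraction, assuming the DataNucleus result object behaves like a java.util.List — the class and method below are illustrative, not the actual MetaStoreDirectSql code:

```java
import java.util.Arrays;
import java.util.List;

public class SqlIntExtraction {
    // Hypothetical version of an extractSqlInt-style helper: unwrap a
    // collection-shaped query result (e.g. ForwardQueryResult) before the
    // numeric cast, instead of casting the result object directly.
    public static int extractSqlInt(Object queryResult) {
        Object value = queryResult;
        if (value instanceof List) {
            value = ((List<?>) value).get(0);   // take the single row's value
        }
        return ((Number) value).intValue();     // now a safe cast on a scalar
    }

    public static void main(String[] args) {
        // A bare Number works, and so does a one-element result list
        System.out.println(extractSqlInt(42L));                 // 42
        System.out.println(extractSqlInt(Arrays.asList(7)));    // 7
    }
}
```

Unwrapping the collection before the cast is one plausible shape for a fix; the actual patch attached to this issue may differ.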



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13850) File name conflict when have multiple INSERT INTO queries running in parallel

2016-05-26 Thread Bing Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bing Li updated HIVE-13850:
---
Description: 
We have an application which connects to HiveServer2 via JDBC.
The application executes "INSERT INTO" queries against the same table.

If many users run the application at the same time, some of the INSERTs can 
fail.

The root cause is that Hive.checkPaths() uses the following loop to check 
whether the destination file already exists. With multiple inserts running in 
parallel, this check-then-act sequence leads to a name conflict.

for (int counter = 1; fs.exists(itemDest) || destExists(result, itemDest); counter++) {
  itemDest = new Path(destf, name + ("_copy_" + counter) + filetype);
}
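The check-then-act loop above can be sketched in isolation to show the race: two sessions that both run the existence check before either performs its rename will pick the same _copy_N destination, and the second rename then fails. Everything below is a hypothetical illustration — nextCopyName and the file names are made up, not Hive code:

```java
import java.util.HashSet;
import java.util.Set;

public class CopyNameRace {
    // Mimics the fs.exists() loop: find the first free "_copy_N" name.
    // The name is chosen without reserving it -- that is the race window.
    public static String nextCopyName(Set<String> existing, String base) {
        int counter = 1;
        String candidate = base + "_copy_" + counter;
        while (existing.contains(candidate)) {
            candidate = base + "_copy_" + (++counter);
        }
        return candidate;
    }

    public static void main(String[] args) {
        Set<String> files = new HashSet<>();
        files.add("000000_0_copy_1");
        // Two concurrent sessions run the same check before either renames:
        String a = nextCopyName(files, "000000_0");
        String b = nextCopyName(files, "000000_0");
        // Both pick the identical destination, so one rename must fail
        System.out.println(a + " vs " + b);  // 000000_0_copy_2 vs 000000_0_copy_2
    }
}
```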


The Error Message
===
In hive log,
org.apache.hadoop.hive.ql.metadata.HiveException: copyFiles: error while moving files!!! Cannot move hdfs://node:8020/apps/hive/warehouse/metadata.db/scalding_stats/.hive-staging_hive_2016-05-10_18-46-23_642_2056172497900766879-3321/-ext-1/00_0 to hdfs://node:8020/apps/hive/warehouse/metadata.db/scalding_stats/00_0_copy_9014
at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:2719)
at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1645)


In hadoop log, 
WARN  hdfs.StateChange (FSDirRenameOp.java:unprotectedRenameTo(174)) - DIR* FSDirectory.unprotectedRenameTo: failed to rename /apps/hive/warehouse/metadata.db/scalding_stats/.hive-staging_hive_2016-05-10_18-46-23_642_2056172497900766879-3321/-ext-1/00_0 to /apps/hive/warehouse/metadata.db/scalding_stats/00_0_copy_9014 because destination exists

  was:
We have an application which connect to HiveServer2 via JDBC.
In the application, it executes "INSERT INTO" query to the same table.

If there are a lot of users running the application at the same time. Some of 
the INSERT could fail.

In hive log,
org.apache.hadoop.hive.ql.metadata.HiveException: copyFiles: error while moving files!!! Cannot move hdfs://node:8020/apps/hive/warehouse/metadata.db/scalding_stats/.hive-staging_hive_2016-05-10_18-46-23_642_2056172497900766879-3321/-ext-1/00_0 to hdfs://node:8020/apps/hive/warehouse/metadata.db/scalding_stats/00_0_copy_9014
at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:2719)
at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1645)


In hadoop log, 
WARN  hdfs.StateChange (FSDirRenameOp.java:unprotectedRenameTo(174)) - DIR* FSDirectory.unprotectedRenameTo: failed to rename /apps/hive/warehouse/metadata.db/scalding_stats/.hive-staging_hive_2016-05-10_18-46-23_642_2056172497900766879-3321/-ext-1/00_0 to /apps/hive/warehouse/metadata.db/scalding_stats/00_0_copy_9014 because destination exists


> File name conflict when have multiple INSERT INTO queries running in parallel
> -
>
> Key: HIVE-13850
> URL: https://issues.apache.org/jira/browse/HIVE-13850
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Bing Li
>Assignee: Bing Li
>
> We have an application which connects to HiveServer2 via JDBC.
> The application executes "INSERT INTO" queries against the same table.
> If many users run the application at the same time, some of the INSERTs can 
> fail.
> The root cause is that Hive.checkPaths() uses the following loop to check 
> whether the destination file already exists. With multiple inserts running 
> in parallel, this check-then-act sequence leads to a name conflict.
> for (int counter = 1; fs.exists(itemDest) || destExists(result, itemDest); counter++) {
>   itemDest = new Path(destf, name + ("_copy_" + counter) + filetype);
> }
> The Error Message
> ===
> In hive log,
> org.apache.hadoop.hive.ql.metadata.HiveException: copyFiles: error while moving files!!! Cannot move hdfs://node:8020/apps/hive/warehouse/metadata.db/scalding_stats/.hive-staging_hive_2016-05-10_18-46-23_642_2056172497900766879-3321/-ext-1/00_0 to 
> hdfs://node:8020/apps/hive

[jira] [Commented] (HIVE-13808) Use constant expressions to backtrack when we create ReduceSink

2016-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302152#comment-15302152
 ] 

Hive QA commented on HIVE-13808:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12805948/HIVE-13808.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 130 failed/errored test(s), 10039 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-schema_evol_text_nonvec_mapwork_table.q-vector_decimal_trailing.q-subquery_in.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-smb_cache.q-transform_ppr2.q-vector_outer_join0.q-and-5-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-tez_union_group_by.q-vector_auto_smb_mapjoin_14.q-union_fast_stats.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_distinct_2.q-tez_joins_explain.q-cte_mat_1.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_grouping_sets.q-update_all_partitioned.q-cte_5.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-skewjoin_union_remove_2.q-timestamp_null.q-union32.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-smb_mapjoin_4.q-groupby8_map.q-groupby4_map.q-and-12-more - 
did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cp_sel
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization_acid
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_resolution
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_case
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ptf_matchpath
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_quotedid_basic
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_reducesink_dedup
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_18
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_date_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_round_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_interval_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_interval_arithmetic
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing_gby2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge9
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge_diff_fs
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join4
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_ptf_matchpath
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_coalesce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_date_1

[jira] [Updated] (HIVE-13816) Infer constants directly when we create semijoin

2016-05-26 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13816:
---
Target Version/s:   (was: 2.1.0)

> Infer constants directly when we create semijoin
> 
>
> Key: HIVE-13816
> URL: https://issues.apache.org/jira/browse/HIVE-13816
> Project: Hive
>  Issue Type: Sub-task
>  Components: Parser
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Follow-up on HIVE-13068.
> When we create a left semijoin, we could infer the constants from the SEL 
> below when we create the GB to remove duplicates on the right hand side.
> Ex. ql/src/test/results/clientpositive/constprog_semijoin.q.out
> {noformat}
> explain select table1.id, table1.val, table1.val1 from table1 left semi join 
> table3 on table1.dimid = table3.id and table3.id = 100 where table1.dimid  = 
> 100;
> {noformat}
> Plan:
> {noformat}
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Map Reduce
>   Map Operator Tree:
>   TableScan
> alias: table1
> Statistics: Num rows: 10 Data size: 200 Basic stats: COMPLETE 
> Column stats: NONE
> Filter Operator
>   predicate: (((dimid = 100) = true) and (dimid = 100)) (type: 
> boolean)
>   Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: id (type: int), val (type: string), val1 (type: 
> string)
> outputColumnNames: _col0, _col1, _col2
> Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator
>   key expressions: 100 (type: int), true (type: boolean)
>   sort order: ++
>   Map-reduce partition columns: 100 (type: int), true (type: 
> boolean)
>   Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE 
> Column stats: NONE
>   value expressions: _col0 (type: int), _col1 (type: string), 
> _col2 (type: string)
>   TableScan
> alias: table3
> Statistics: Num rows: 5 Data size: 15 Basic stats: COMPLETE 
> Column stats: NONE
> Filter Operator
>   predicate: (((id = 100) = true) and (id = 100)) (type: boolean)
>   Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: 100 (type: int), true (type: boolean)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE 
> Column stats: NONE
> Group By Operator
>   keys: _col0 (type: int), _col1 (type: boolean)
>   mode: hash
>   outputColumnNames: _col0, _col1
>   Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE 
> Column stats: NONE
>   Reduce Output Operator
> key expressions: _col0 (type: int), _col1 (type: boolean)
> sort order: ++
> Map-reduce partition columns: _col0 (type: int), _col1 
> (type: boolean)
> Statistics: Num rows: 1 Data size: 3 Basic stats: 
> COMPLETE Column stats: NONE
>   Reduce Operator Tree:
> Join Operator
>   condition map:
>Left Semi Join 0 to 1
>   keys:
> 0 100 (type: int), true (type: boolean)
> 1 _col0 (type: int), _col1 (type: boolean)
>   outputColumnNames: _col0, _col1, _col2
>   Statistics: Num rows: 2 Data size: 44 Basic stats: COMPLETE Column 
> stats: NONE
>   File Output Operator
> compressed: false
> Statistics: Num rows: 2 Data size: 44 Basic stats: COMPLETE 
> Column stats: NONE
> table:
> input format: org.apache.hadoop.mapred.SequenceFileInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
> serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   limit: -1
>   Processor Tree:
> ListSink
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13811) Constant not removed in index_auto_unused.q.out

2016-05-26 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13811:
---
Target Version/s:   (was: 2.1.0)

> Constant not removed in index_auto_unused.q.out
> ---
>
> Key: HIVE-13811
> URL: https://issues.apache.org/jira/browse/HIVE-13811
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Follow-up on HIVE-13068.
> In test file ql/src/test/results/clientpositive/index_auto_unused.q.out.
> After HIVE-13068 goes in, the following filter is not folded after 
> PartitionPruning is done:
> {{filterExpr: ((ds = '2008-04-09') and (12.0 = 12.0) and (UDFToDouble(key) < 
> 10.0)) (type: boolean)}}
> Further, SimpleFetchOptimizer got disabled.
> All this needs further investigation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13815) Improve logic to infer false predicates

2016-05-26 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13815:
---
Target Version/s:   (was: 2.1.0)

> Improve logic to infer false predicates
> ---
>
> Key: HIVE-13815
> URL: https://issues.apache.org/jira/browse/HIVE-13815
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Follow-up/extension of the work done in HIVE-13068.
> Ex.
> ql/src/test/results/clientpositive/annotate_stats_filter.q.out
> {{predicate: ((year = 2001) and (state = 'OH') and (state = 'FL')) (type: 
> boolean)}} -> {{false}}
> ql/src/test/results/clientpositive/cbo_rp_join1.q.out
> {{predicate: ((_col0 = _col1) and (_col1 = 40) and (_col0 = 40)) (type: 
> boolean)}} -> {{predicate: ((_col1 = 40) and (_col0 = 40)) (type: boolean)}}
> ql/src/test/results/clientpositive/constprog_semijoin.q.out 
> {{predicate: (((id = 100) = true) and (id <> 100)) (type: boolean)}} -> 
> {{false}}
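The first and third examples above amount to detecting that one column is equated to two different constants within a conjunction. A toy sketch of that inference, outside Hive — the class, method, and string-pair encoding are illustrative only:

```java
import java.util.HashMap;
import java.util.Map;

public class FalsePredicate {
    // Each entry is {column, constant}. Returns false if the conjunction of
    // all the equalities can never hold, i.e. some column is bound to two
    // different constants.
    public static boolean satisfiable(String[][] equalities) {
        Map<String, String> bound = new HashMap<>();
        for (String[] eq : equalities) {
            String prev = bound.putIfAbsent(eq[0], eq[1]);
            if (prev != null && !prev.equals(eq[1])) {
                return false;   // e.g. state = 'OH' AND state = 'FL'
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(satisfiable(new String[][] {
            {"year", "2001"}, {"state", "OH"}, {"state", "FL"}}));  // false
    }
}
```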



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13804) Propagate constant expressions through insert

2016-05-26 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13804:
---
Target Version/s:   (was: 2.1.0)

> Propagate constant expressions through insert
> -
>
> Key: HIVE-13804
> URL: https://issues.apache.org/jira/browse/HIVE-13804
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Follow-up of HIVE-13068.
> The problem is that CBO optimizes the select part of the query and the 
> insert part is attached afterwards; after HIVE-13068, ConstantPropagate in 
> Hive no longer kicks in because CBO already optimized the plan, so we may 
> miss the opportunity to propagate constants to the top of the plan.
> Ex. ql/src/test/results/clientpositive/cp_sel.q.out
> {noformat}
> insert overwrite table testpartbucket partition(ds,hr) select 
> key,value,'hello' as ds, 'world' as hr from srcpart where hr=11;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13805) Extend HiveSortLimitPullUpConstantsRule to pull up constants even when SortLimit is the root of the plan

2016-05-26 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13805:
---
Target Version/s:   (was: 2.1.0)

> Extend HiveSortLimitPullUpConstantsRule to pull up constants even when 
> SortLimit is the root of the plan
> 
>
> Key: HIVE-13805
> URL: https://issues.apache.org/jira/browse/HIVE-13805
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Follow-up of HIVE-13068.
> This is a limitation of the original HiveSortLimitPullUpConstantsRule.
> Currently, the Calcite rule does not pull up constants when the Sort/Limit 
> operator is at the top of the operator tree, as this was causing Hive's 
> limit-related optimizations not to kick in.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13863) Improve AnnotateWithStatistics with support for cartesian product

2016-05-26 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13863:
---
Status: Patch Available  (was: In Progress)

> Improve AnnotateWithStatistics with support for cartesian product
> -
>
> Key: HIVE-13863
> URL: https://issues.apache.org/jira/browse/HIVE-13863
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13863.patch
>
>
> Currently, cartesian product stats are not inferred correctly from the 
> cardinality of the inputs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13149) Remove some unnecessary HMS connections from HS2

2016-05-26 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302103#comment-15302103
 ] 

Aihua Xu commented on HIVE-13149:
-

Target version has been removed. This is an improvement, so it is not necessary for 2.1.0.

> Remove some unnecessary HMS connections from HS2 
> -
>
> Key: HIVE-13149
> URL: https://issues.apache.org/jira/browse/HIVE-13149
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 2.2.0
>
> Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, 
> HIVE-13149.3.patch, HIVE-13149.4.patch, HIVE-13149.5.patch, 
> HIVE-13149.6.patch, HIVE-13149.7.patch, HIVE-13149.8.patch
>
>
> In the SessionState class, we currently always try to get an HMS connection 
> in {{start(SessionState startSs, boolean isAsync, LogHelper console)}}, 
> regardless of whether the connection will be used later.
> When SessionState is accessed by the tasks in TaskRunner.java, a new HMS 
> connection is established for each Task thread, even though most tasks, 
> apart from a few like StatsTask, don't need to access HMS. If HiveServer2 
> is configured to run queries in parallel and a query involves many tasks, 
> the connections are created but left unused.
> {noformat}
>   @Override
>   public void run() {
> runner = Thread.currentThread();
> try {
>   OperationLog.setCurrentOperationLog(operationLog);
>   SessionState.start(ss);
>   runSequential();
> {noformat}
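One way to avoid the unused connections described above is to defer metastore client creation until a task actually asks for it. A hypothetical lazy-holder sketch — MetaClient stands in for Hive's IMetaStoreClient, and this is not the actual SessionState code:

```java
public class LazySession {
    interface MetaClient { }          // placeholder for IMetaStoreClient

    private MetaClient client;        // stays null until first use
    public int connections = 0;       // exposed only for the demo

    // Connect on demand: tasks that never touch HMS open no connection.
    public synchronized MetaClient getMetaClient() {
        if (client == null) {
            connections++;            // stands in for opening a connection
            client = new MetaClient() { };
        }
        return client;
    }

    public static void main(String[] args) {
        LazySession ss = new LazySession();
        System.out.println(ss.connections);   // 0 -- nothing opened eagerly
        ss.getMetaClient();
        ss.getMetaClient();
        System.out.println(ss.connections);   // 1 -- reused on the second call
    }
}
```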



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13149) Remove some unnecessary HMS connections from HS2

2016-05-26 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-13149:

Target Version/s:   (was: 2.1.0)

> Remove some unnecessary HMS connections from HS2 
> -
>
> Key: HIVE-13149
> URL: https://issues.apache.org/jira/browse/HIVE-13149
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 2.2.0
>
> Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, 
> HIVE-13149.3.patch, HIVE-13149.4.patch, HIVE-13149.5.patch, 
> HIVE-13149.6.patch, HIVE-13149.7.patch, HIVE-13149.8.patch
>
>
> In the SessionState class, we currently always try to get an HMS connection 
> in {{start(SessionState startSs, boolean isAsync, LogHelper console)}}, 
> regardless of whether the connection will be used later.
> When SessionState is accessed by the tasks in TaskRunner.java, a new HMS 
> connection is established for each Task thread, even though most tasks, 
> apart from a few like StatsTask, don't need to access HMS. If HiveServer2 
> is configured to run queries in parallel and a query involves many tasks, 
> the connections are created but left unused.
> {noformat}
>   @Override
>   public void run() {
> runner = Thread.currentThread();
> try {
>   OperationLog.setCurrentOperationLog(operationLog);
>   SessionState.start(ss);
>   runSequential();
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13746) Data duplication when insert overwrite

2016-05-26 Thread Chinna Rao Lalam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302091#comment-15302091
 ] 

Chinna Rao Lalam commented on HIVE-13746:
-

Hi Bill Wailliam,

Can you give the exact scenario? I have tried the queries below on master and 
it is working fine.

{code}

drop table sample1;
create table sample1(a STRING, b int) partitioned by (partitionid string) ROW 
FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '/home/chinna/install/data/file1.txt' INTO TABLE sample1 
partition (partitionid = "one");


drop table sample2;
create table sample2(a STRING, b int) partitioned by (partitionid string) ROW 
FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '/home/chinna/install/data/file2.txt' INTO TABLE sample2 
partition (partitionid = "one");


INSERT OVERWRITE TABLE sample2 PARTITION (partitionid = 'one') select a,b from 
sample1;

{code}

> Data duplication when insert overwrite 
> ---
>
> Key: HIVE-13746
> URL: https://issues.apache.org/jira/browse/HIVE-13746
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Bill Wailliam
>Priority: Critical
>
> Data duplication when insert overwrite. The old data cannot be deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13149) Remove some unnecessary HMS connections from HS2

2016-05-26 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-13149:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks Jimmy for reviewing.

> Remove some unnecessary HMS connections from HS2 
> -
>
> Key: HIVE-13149
> URL: https://issues.apache.org/jira/browse/HIVE-13149
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 2.2.0
>
> Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, 
> HIVE-13149.3.patch, HIVE-13149.4.patch, HIVE-13149.5.patch, 
> HIVE-13149.6.patch, HIVE-13149.7.patch, HIVE-13149.8.patch
>
>
> In the SessionState class, we currently always try to get an HMS connection 
> in {{start(SessionState startSs, boolean isAsync, LogHelper console)}}, 
> regardless of whether the connection will be used later.
> When SessionState is accessed by the tasks in TaskRunner.java, a new HMS 
> connection is established for each Task thread, even though most tasks, 
> apart from a few like StatsTask, don't need to access HMS. If HiveServer2 
> is configured to run queries in parallel and a query involves many tasks, 
> the connections are created but left unused.
> {noformat}
>   @Override
>   public void run() {
> runner = Thread.currentThread();
> try {
>   OperationLog.setCurrentOperationLog(operationLog);
>   SessionState.start(ss);
>   runSequential();
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13792) Show create table should not show stats info in the table properties

2016-05-26 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302063#comment-15302063
 ] 

Aihua Xu commented on HIVE-13792:
-

The tests are not related.

[~ctang.ma] Can you help review the code?

> Show create table should not show stats info in the table properties
> 
>
> Key: HIVE-13792
> URL: https://issues.apache.org/jira/browse/HIVE-13792
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Planning
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13792.1.patch, HIVE-13792.2.patch, 
> HIVE-13792.3.patch
>
>
> From the 
> org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries 
> failure, we are printing table stats among the SHOW CREATE TABLE 
> parameters. This info should be skipped, since it would be incorrect if you 
> copied these properties to create a new table.
> {noformat}
> PREHOOK: query: SHOW CREATE TABLE hbase_table_1_like
> PREHOOK: type: SHOW_CREATETABLE
> PREHOOK: Input: default@hbase_table_1_like
> POSTHOOK: query: SHOW CREATE TABLE hbase_table_1_like
> POSTHOOK: type: SHOW_CREATETABLE
> POSTHOOK: Input: default@hbase_table_1_like
> CREATE EXTERNAL TABLE `hbase_table_1_like`(
>   `key` int COMMENT 'It is a column key',
>   `value` string COMMENT 'It is the column string value')
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.hbase.HBaseSerDe'
> STORED BY
>   'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
>   'hbase.columns.mapping'='cf:string',
>   'serialization.format'='1')
> TBLPROPERTIES (
>   'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}',
>   'hbase.table.name'='hbase_table_0',
>   'numFiles'='0',
>   'numRows'='0',
>   'rawDataSize'='0',
>   'totalSize'='0',
> {noformat}
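Skipping the stats entries when printing TBLPROPERTIES could look like the sketch below. The key list mirrors the stats shown in the output above, but treating it as the filter — and the class and method names — is an assumption, not Hive's actual implementation:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class TblPropsFilter {
    // Stats keys visible in the SHOW CREATE TABLE output above (assumed list).
    public static final Set<String> STATS_KEYS = new HashSet<>(Arrays.asList(
        "COLUMN_STATS_ACCURATE", "numFiles", "numRows", "rawDataSize", "totalSize"));

    // Returns a copy of the table properties with the stats entries removed,
    // preserving the order of the remaining entries.
    public static Map<String, String> withoutStats(Map<String, String> props) {
        Map<String, String> out = new LinkedHashMap<>(props);
        out.keySet().removeAll(STATS_KEYS);
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("COLUMN_STATS_ACCURATE", "{\"BASIC_STATS\":\"true\"}");
        props.put("hbase.table.name", "hbase_table_0");
        props.put("numRows", "0");
        System.out.println(withoutStats(props));  // {hbase.table.name=hbase_table_0}
    }
}
```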



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13862) org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter falls back to ORM

2016-05-26 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302046#comment-15302046
 ] 

Jesus Camacho Rodriguez commented on HIVE-13862:


[~amareshwari], thanks for letting me know. I plan to create the first RC 
beginning next week; there should be time to get it in.

> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter
>  falls back to ORM 
> ---
>
> Key: HIVE-13862
> URL: https://issues.apache.org/jira/browse/HIVE-13862
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Amareshwari Sriramadasu
>Assignee: Rajat Khandelwal
> Fix For: 2.1.0
>
>
> We are seeing the following exception, and calls fall back to ORM, which 
> makes them costly:
> {noformat}
>  WARN  org.apache.hadoop.hive.metastore.ObjectStore - Direct SQL failed, 
> falling back to ORM
> java.lang.ClassCastException: 
> org.datanucleus.store.rdbms.query.ForwardQueryResult cannot be cast to 
> java.lang.Number
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.extractSqlInt(MetaStoreDirectSql.java:892)
>  ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:855)
>  ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter(MetaStoreDirectSql.java:405)
>  ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2763)
>  ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2755)
>  ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2606)
>  ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilterInternal(ObjectStore.java:2770)
>  [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilter(ObjectStore.java:2746)
>  [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13862) org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter falls back to ORM

2016-05-26 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302044#comment-15302044
 ] 

Amareshwari Sriramadasu commented on HIVE-13862:


[~jcamachorodriguez], we would like to get this into the 2.1.0 release. 

> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter
>  falls back to ORM 
> ---
>
> Key: HIVE-13862
> URL: https://issues.apache.org/jira/browse/HIVE-13862
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Amareshwari Sriramadasu
>Assignee: Rajat Khandelwal
> Fix For: 2.1.0
>
>
> We are seeing the following exception, and calls fall back to ORM, which 
> makes them costly:
> {noformat}
>  WARN  org.apache.hadoop.hive.metastore.ObjectStore - Direct SQL failed, 
> falling back to ORM
> java.lang.ClassCastException: 
> org.datanucleus.store.rdbms.query.ForwardQueryResult cannot be cast to 
> java.lang.Number
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.extractSqlInt(MetaStoreDirectSql.java:892)
>  ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:855)
>  ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter(MetaStoreDirectSql.java:405)
>  ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2763)
>  ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2755)
>  ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2606)
>  ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilterInternal(ObjectStore.java:2770)
>  [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilter(ObjectStore.java:2746)
>  [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> {noformat}
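The ClassCastException indicates that the raw query result object (a ForwardQueryResult) reached a spot where a plain Number was expected. A minimal sketch of the kind of defensive unwrapping that would avoid the ORM fallback (the class and method below are illustrative, not the actual Hive fix):

```java
import java.util.Collection;

// Sketch: unwrap a JDO query result that may arrive as a one-element
// collection (e.g. a ForwardQueryResult) before interpreting it as a Number.
public class ExtractSqlInt {
    static int extractSqlInt(Object field) {
        if (field instanceof Collection) {
            // Take the single element of a one-row aggregate result.
            field = ((Collection<?>) field).iterator().next();
        }
        return ((Number) field).intValue();
    }
}
```

With such a guard, both a bare numeric value and a one-element result collection yield the integer instead of throwing and triggering the fallback.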



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13860) Fix more json related JDK8 test failures

2016-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301970#comment-15301970
 ] 

Hive QA commented on HIVE-13860:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12806318/HIVE-13860-java8.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build-JAVA8/10/testReport
Console output: 
http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build-JAVA8/10/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-JAVA8-10/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.8.0_25 ]]
+ export JAVA_HOME=/usr/java/jdk1.8.0_25
+ JAVA_HOME=/usr/java/jdk1.8.0_25
+ export 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-MASTER-Build-JAVA8-10/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z java8 ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 76130a9 HIVE-13269: Simplify comparison expressions using column 
stats (Jesus Camacho Rodriguez, reviewed by Ashutosh Chauhan)
+ git clean -f -d
+ git checkout java8
Switched to branch 'java8'
Your branch is behind 'origin/java8' by 1 commit, and can be fast-forwarded.
+ git reset --hard origin/java8
HEAD is now at 4cbc10e HIVE-13409: Fix JDK8 test failures related to 
COLUMN_STATS_ACCURATE (Mohit Sabharwal, reviewed by Sergio Pena)
+ git merge --ff-only origin/java8
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12806318 - PreCommit-HIVE-MASTER-Build-JAVA8

> Fix more json related JDK8 test failures
> 
>
> Key: HIVE-13860
> URL: https://issues.apache.org/jira/browse/HIVE-13860
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-13860-java8.patch
>
>






[jira] [Commented] (HIVE-13826) Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER

2016-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301969#comment-15301969
 ] 

Hive QA commented on HIVE-13826:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12806242/HIVE-13826.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 76 failed/errored test(s), 10016 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-auto_join1.q-schema_evol_text_vec_mapwork_part_all_complex.q-vector_complex_join.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-auto_join30.q-vector_decimal_10_0.q-acid_globallimit.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-constprog_dpp.q-dynamic_partition_pruning.q-vectorization_10.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-enforce_order.q-vector_partition_diff_num_cols.q-unionDistinct_1.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-explainuser_4.q-update_after_multiple_inserts.q-mapreduce2.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-groupby2.q-tez_dynpart_hashjoin_1.q-custom_input_output_format.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-parallel_join1.q-escape_distributeby1.q-auto_sortmerge_join_7.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkClient - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge9
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge_diff_fs
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join4
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join0
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join27
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cbo_simple_select
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby5_noskew
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_complex_types
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_map_ppr_multi_distinct
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_input13
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_input18
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_gby3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_mixed
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ppd_outer_join4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats9
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union13
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union19
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_ppr
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_16
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_17
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_math_funcs
org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testPreemptionQueueComparator
org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec

[jira] [Commented] (HIVE-13831) Error pushing predicates to HBase storage handler

2016-05-26 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301941#comment-15301941
 ] 

Jesus Camacho Rodriguez commented on HIVE-13831:


Failures are not related to this patch.

> Error pushing predicates to HBase storage handler
> -
>
> Key: HIVE-13831
> URL: https://issues.apache.org/jira/browse/HIVE-13831
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13831.patch
>
>
> Discovered while working on HIVE-13693.
> There is an error in the predicates that we can push to HBaseStorageHandler. 
> In particular, range predicates of the shape {{(bounded, open)}} and {{(open, 
> bounded)}} over long or int columns get pushed and return wrong results.
> The problem has to do with the storage order for keys in HBase. Keys are 
> sorted lexicographically. Since the byte representation of negative values 
> comes after the positive values, open range predicates need special handling 
> that we do not have right now.
> Thus, for instance, when we push the predicate {{key > 2}}, we return all 
> records with column _key_ greater than 2, plus the records with negative 
> values for the column _key_. This problem does not get exposed if a filter is 
> kept in the Hive operator tree, but we should not assume the latter.
> This fix avoids pushing this kind of predicates to the storage handler, 
> returning them in the _residual_ part of the predicate that cannot be pushed. 
> In the future, special handling might be added to support this kind of 
> predicates.
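The ordering problem can be demonstrated with plain Java, independent of HBase (a minimal sketch; compareKeys mimics HBase's unsigned lexicographic row-key comparison):

```java
import java.nio.ByteBuffer;

// Why open ranges over signed numeric keys break: HBase compares row keys as
// unsigned bytes, so the big-endian encoding of a negative long sorts AFTER
// the encoding of every non-negative long.
public class SignedKeyOrder {
    static byte[] encode(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    // Lexicographic comparison over unsigned byte values, as HBase does.
    static int compareKeys(byte[] a, byte[] b) {
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // -1 encodes as 0xFF..FF and sorts after 2 (0x00..02).
        System.out.println(compareKeys(encode(-1L), encode(2L)) > 0); // prints true
    }
}
```

Because encode(-1) compares greater than encode(2), a scan that naively starts at encode(2) for {{key > 2}} also picks up negative keys, which is exactly the wrong result described above.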





[jira] [Commented] (HIVE-13844) Invalid index handler in org.apache.hadoop.hive.ql.index.HiveIndex class

2016-05-26 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301930#comment-15301930
 ] 

Jesus Camacho Rodriguez commented on HIVE-13844:


+1

> Invalid index handler in org.apache.hadoop.hive.ql.index.HiveIndex class
> 
>
> Key: HIVE-13844
> URL: https://issues.apache.org/jira/browse/HIVE-13844
> Project: Hive
>  Issue Type: Bug
>  Components: Indexing
>Affects Versions: 2.0.0
>Reporter: Svetozar Ivanov
>Priority: Minor
> Attachments: HIVE-13844.patch
>
>
> Class org.apache.hadoop.hive.ql.index.HiveIndex has an invalid handler name 
> 'org.apache.hadoop.hive.ql.AggregateIndexHandler'. The actual FQ class name 
> is 'org.apache.hadoop.hive.ql.index.AggregateIndexHandler'.
> {code}
>   public static enum IndexType {
> AGGREGATE_TABLE("aggregate", 
> "org.apache.hadoop.hive.ql.AggregateIndexHandler"),
> COMPACT_SUMMARY_TABLE("compact", 
> "org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler"),
> 
> BITMAP_TABLE("bitmap","org.apache.hadoop.hive.ql.index.bitmap.BitmapIndexHandler");
> private IndexType(String indexType, String className) {
>   indexTypeName = indexType;
>   this.handlerClsName = className;
> }
> private final String indexTypeName;
> private final String handlerClsName;
> public String getName() {
>   return indexTypeName;
> }
> public String getHandlerClsName() {
>   return handlerClsName;
> }
>   }
>   
> {code}
> Because of the above, statements like 'SHOW INDEXES ON MY_TABLE' do not 
> work; we get a java.lang.NullPointerException.
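Assuming the fix is simply correcting the registered class name to include the index package (as the description implies), the enum would read roughly as follows (a sketch, not the attached patch):

```java
// Sketch of the presumed one-line fix: the handler name for AGGREGATE_TABLE
// must include the 'index' package, matching the other two constants.
public enum IndexType {
    AGGREGATE_TABLE("aggregate",
        "org.apache.hadoop.hive.ql.index.AggregateIndexHandler"), // was ...ql.AggregateIndexHandler
    COMPACT_SUMMARY_TABLE("compact",
        "org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler"),
    BITMAP_TABLE("bitmap",
        "org.apache.hadoop.hive.ql.index.bitmap.BitmapIndexHandler");

    private final String indexTypeName;
    private final String handlerClsName;

    IndexType(String indexType, String className) {
        this.indexTypeName = indexType;
        this.handlerClsName = className;
    }

    public String getName() { return indexTypeName; }
    public String getHandlerClsName() { return handlerClsName; }
}
```

Note that the other two constants already carry the index package; only AGGREGATE_TABLE was registered under a non-existent name, so its handler lookup fails and surfaces later as an NPE.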





[jira] [Updated] (HIVE-13861) Fix up nullability issue that might be created by pull up constants rules

2016-05-26 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13861:
---
Attachment: HIVE-13861.patch

> Fix up nullability issue that might be created by pull up constants rules
> -
>
> Key: HIVE-13861
> URL: https://issues.apache.org/jira/browse/HIVE-13861
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13861.patch
>
>
> When we pull up constants through Union or Sort operators, we might end up 
> rewriting the original expression into an expression whose schema has 
> different nullability properties for some of its columns.
> This results in an AssertionError of the following kind:
> {noformat}
> ...
> org.apache.hive.service.cli.HiveSQLException: Error running query: 
> java.lang.AssertionError: Internal error: Cannot add expression of different 
> type to set:
> ...
> {noformat}





[jira] [Updated] (HIVE-13840) Orc split generation is reading file footers twice

2016-05-26 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13840:
-
Attachment: HIVE-13840.2.patch

In the updated patch:
1) Another file system call in split generation is avoided by specifying the max 
length in the reader. If the max length is not specified, the ORC reader will 
issue fs.getFileStatus(path) to find the length of the file.
2) Added file system stats to MockFS, which is used in the newly added test case.

fyi.. [~rajesh.balamohan],[~ashutoshc]

[~owen.omalley] Can you please review the patch?

> Orc split generation is reading file footers twice
> --
>
> Key: HIVE-13840
> URL: https://issues.apache.org/jira/browse/HIVE-13840
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-13840.1.patch, HIVE-13840.2.patch
>
>
> Recent refactorings to move ORC out introduced a regression in split 
> generation: ORC file footers are now read twice during split generation.





[jira] [Work started] (HIVE-13861) Fix up nullability issue that might be created by pull up constants rules

2016-05-26 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-13861 started by Jesus Camacho Rodriguez.
--
> Fix up nullability issue that might be created by pull up constants rules
> -
>
> Key: HIVE-13861
> URL: https://issues.apache.org/jira/browse/HIVE-13861
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> When we pull up constants through Union or Sort operators, we might end up 
> rewriting the original expression into an expression whose schema has 
> different nullability properties for some of its columns.
> This results in an AssertionError of the following kind:
> {noformat}
> ...
> org.apache.hive.service.cli.HiveSQLException: Error running query: 
> java.lang.AssertionError: Internal error: Cannot add expression of different 
> type to set:
> ...
> {noformat}





[jira] [Updated] (HIVE-13861) Fix up nullability issue that might be created by pull up constants rules

2016-05-26 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13861:
---
Status: Patch Available  (was: In Progress)

> Fix up nullability issue that might be created by pull up constants rules
> -
>
> Key: HIVE-13861
> URL: https://issues.apache.org/jira/browse/HIVE-13861
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> When we pull up constants through Union or Sort operators, we might end up 
> rewriting the original expression into an expression whose schema has 
> different nullability properties for some of its columns.
> This results in an AssertionError of the following kind:
> {noformat}
> ...
> org.apache.hive.service.cli.HiveSQLException: Error running query: 
> java.lang.AssertionError: Internal error: Cannot add expression of different 
> type to set:
> ...
> {noformat}





[jira] [Commented] (HIVE-13721) HPL/SQL COPY FROM FTP Statement: lack of DIR option leads to NPE

2016-05-26 Thread Dmitry Tolpeko (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301886#comment-15301886
 ] 

Dmitry Tolpeko commented on HIVE-13721:
---

Small fix; I included it in the patch for HIVE-13540.

> HPL/SQL COPY FROM FTP Statement: lack of DIR option leads to NPE
> 
>
> Key: HIVE-13721
> URL: https://issues.apache.org/jira/browse/HIVE-13721
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Carter Shanklin
>
> The docs (http://www.hplsql.org/copy-from-ftp) suggest DIR is optional. When 
> I left it out in:
> {code}
> copy from ftp hdp250.example.com user 'vagrant' pwd 'vagrant'  files 
> 'sampledata.csv' to /tmp overwrite
> {code}
> I got:
> {code}
> Ln:2 Connected to ftp: hdp250.example.com (29 ms)
> Ln:2 Retrieving directory listing
>   Listing the current working FTP directory
> Ln:2 Files to copy: 45 bytes, 1 file, 0 subdirectories scanned (27 ms)
> Exception in thread "main" java.lang.NullPointerException
>   at org.apache.hive.hplsql.Ftp.getTargetFileName(Ftp.java:342)
>   at org.apache.hive.hplsql.Ftp.run(Ftp.java:149)
>   at org.apache.hive.hplsql.Ftp.copyFiles(Ftp.java:121)
>   at org.apache.hive.hplsql.Ftp.run(Ftp.java:91)
>   at org.apache.hive.hplsql.Exec.visitCopy_from_ftp_stmt(Exec.java:1292)
>   at org.apache.hive.hplsql.Exec.visitCopy_from_ftp_stmt(Exec.java:52)
>   at 
> org.apache.hive.hplsql.HplsqlParser$Copy_from_ftp_stmtContext.accept(HplsqlParser.java:11956)
>   at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
>   at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:994)
>   at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:52)
>   at 
> org.apache.hive.hplsql.HplsqlParser$StmtContext.accept(HplsqlParser.java:1012)
>   at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
>   at 
> org.apache.hive.hplsql.HplsqlBaseVisitor.visitBlock(HplsqlBaseVisitor.java:28)
>   at 
> org.apache.hive.hplsql.HplsqlParser$BlockContext.accept(HplsqlParser.java:446)
>   at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
>   at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:901)
>   at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:52)
>   at 
> org.apache.hive.hplsql.HplsqlParser$ProgramContext.accept(HplsqlParser.java:389)
>   at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:42)
>   at org.apache.hive.hplsql.Exec.run(Exec.java:760)
>   at org.apache.hive.hplsql.Exec.run(Exec.java:736)
>   at org.apache.hive.hplsql.Hplsql.main(Hplsql.java:23)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {code}
> Traceback leads to:
> {code}
>   /**
>* Get the target file relative path and name
>*/
>   String getTargetFileName(String file) {
> int len = dir.length();
> return targetDir + file.substring(len);
>   }
> {code}
> in Ftp.java
> When I added DIR '/' this worked.
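A minimal sketch of a null-safe variant of getTargetFileName, assuming {{dir}} is null when the optional DIR clause is omitted (the field names mirror Ftp.java but are reproduced here as assumptions, not the committed fix):

```java
// Sketch (not Hive code): guard the missing DIR option instead of
// unconditionally calling dir.length(), which throws the NPE above.
public class FtpTargetName {
    String dir;                // null when the DIR clause is omitted
    String targetDir = "/tmp/";

    String getTargetFileName(String file) {
        // With no DIR, strip nothing from the remote file name.
        int len = (dir == null) ? 0 : dir.length();
        return targetDir + file.substring(len);
    }
}
```

With {{dir == null}} the whole remote file name is appended to targetDir, avoiding the NullPointerException.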





[jira] [Updated] (HIVE-13540) Casts to numeric types don't seem to work in hplsql

2016-05-26 Thread Dmitry Tolpeko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Tolpeko updated HIVE-13540:
--
Affects Version/s: 2.2.0
   Status: Patch Available  (was: Open)

Patch submitted.

> Casts to numeric types don't seem to work in hplsql
> ---
>
> Key: HIVE-13540
> URL: https://issues.apache.org/jira/browse/HIVE-13540
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Affects Versions: 2.2.0
>Reporter: Carter Shanklin
>Assignee: Dmitry Tolpeko
> Attachments: HIVE-13540.1.patch
>
>
> Maybe I'm doing this wrong, but it seems to be broken.
> Casts to string types seem to work fine, but not numbers.
> This code:
> {code}
> temp_int = CAST('1' AS int);
> print temp_int
> temp_float   = CAST('1.2' AS float);
> print temp_float
> temp_double  = CAST('1.2' AS double);
> print temp_double
> temp_decimal = CAST('1.2' AS decimal(10, 4));
> print temp_decimal
> temp_string = CAST('1.2' AS string);
> print temp_string
> {code}
> Produces this output:
> {code}
> [vagrant@hdp250 hplsql]$ hplsql -f temp2.hplsql
> which: no hbase in 
> (/usr/lib64/qt-3.3/bin:/usr/lib/jvm/java/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/opt/puppetlabs/bin:/usr/local/share/jmeter/bin:/home/vagrant/bin)
> WARNING: Use "yarn jar" to launch YARN applications.
> null
> null
> null
> null
> 1.2
> {code}
> The software I'm using is not anything released but is pretty close to the 
> trunk, 2 weeks old at most.





[jira] [Updated] (HIVE-13540) Casts to numeric types don't seem to work in hplsql

2016-05-26 Thread Dmitry Tolpeko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Tolpeko updated HIVE-13540:
--
Attachment: HIVE-13540.1.patch

> Casts to numeric types don't seem to work in hplsql
> ---
>
> Key: HIVE-13540
> URL: https://issues.apache.org/jira/browse/HIVE-13540
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Carter Shanklin
>Assignee: Dmitry Tolpeko
> Attachments: HIVE-13540.1.patch
>
>
> Maybe I'm doing this wrong, but it seems to be broken.
> Casts to string types seem to work fine, but not numbers.
> This code:
> {code}
> temp_int = CAST('1' AS int);
> print temp_int
> temp_float   = CAST('1.2' AS float);
> print temp_float
> temp_double  = CAST('1.2' AS double);
> print temp_double
> temp_decimal = CAST('1.2' AS decimal(10, 4));
> print temp_decimal
> temp_string = CAST('1.2' AS string);
> print temp_string
> {code}
> Produces this output:
> {code}
> [vagrant@hdp250 hplsql]$ hplsql -f temp2.hplsql
> which: no hbase in 
> (/usr/lib64/qt-3.3/bin:/usr/lib/jvm/java/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/opt/puppetlabs/bin:/usr/local/share/jmeter/bin:/home/vagrant/bin)
> WARNING: Use "yarn jar" to launch YARN applications.
> null
> null
> null
> null
> 1.2
> {code}
> The software I'm using is not anything released but is pretty close to the 
> trunk, 2 weeks old at most.





[jira] [Commented] (HIVE-13792) Show create table should not show stats info in the table properties

2016-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301789#comment-15301789
 ] 

Hive QA commented on HIVE-13792:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12805910/HIVE-13792.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 79 failed/errored test(s), 10047 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-auto_join1.q-schema_evol_text_vec_mapwork_part_all_complex.q-vector_complex_join.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-groupby2.q-tez_dynpart_hashjoin_1.q-custom_input_output_format.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-update_orig_table.q-union2.q-bucket4.q-and-12-more - did 
not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_grouping_sets.q-update_all_partitioned.q-cte_5.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vectorized_parquet.q-insert_values_non_partitioned.q-schema_evol_orc_nonvec_mapwork_part.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge9
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge_diff_fs
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join4
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join5
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_minimr_broken_pipe
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join26
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join11
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join18
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_cond_pushdown_4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_load_dyn_part9
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union15
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union7
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_top_level
org.apache.hadoop.hive.llap.daemon.impl.comparator.TestFirstInFirstOutComparator.testWaitQueueComparatorParallelism
org.apache.hadoop.hive.llap.daemon.impl.comparator.TestFirstInFirstOutComparator.testWaitQueueComparatorWithinDagPriority
org.apache.hadoop.hive.llap.daemon.impl.comparator.TestShortestJobFirstComparator.testWaitQueueComparatorWithinDagPriority
org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskCommunicator.testFinishableStateUpdateFailure
org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote.org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote
org.apache.hadoop.hive.metastore.TestFilterHooks.org.apache.hadoop.hive.metastore.TestFilterHooks
org.apache.hadoop.hive.metastore.TestHiveMetaStoreGetMetaConf.org.apache.hadoop.hive.metastore.TestHiveMetaStoreGetMetaConf

[jira] [Updated] (HIVE-13269) Simplify comparison expressions using column stats

2016-05-26 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13269:
---
   Resolution: Fixed
Fix Version/s: 2.1.0
   Status: Resolved  (was: Patch Available)

Pushed to master, branch 2.1. Thanks for reviewing [~ashutoshc]!

> Simplify comparison expressions using column stats
> --
>
> Key: HIVE-13269
> URL: https://issues.apache.org/jira/browse/HIVE-13269
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.1.0
>
> Attachments: HIVE-13269.01.patch, HIVE-13269.02.patch, 
> HIVE-13269.03.patch, HIVE-13269.04.patch, HIVE-13269.patch, HIVE-13269.patch
>
>






[jira] [Updated] (HIVE-13818) Fast Vector MapJoin not enhanced to use sortOrder when handling BinarySortable keys for Small Table?

2016-05-26 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13818:

Attachment: vector_bug.q.out

> Fast Vector MapJoin not enhanced to use sortOrder when handling 
> BinarySortable keys for Small Table?
> 
>
> Key: HIVE-13818
> URL: https://issues.apache.org/jira/browse/HIVE-13818
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-13818.01.patch, HIVE-13818.02.patch, vector_bug.q, 
> vector_bug.q.out
>
>
> Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not 
> this issue according to Gopal/Rajesh/Nita.





[jira] [Updated] (HIVE-13818) Fast Vector MapJoin not enhanced to use sortOrder when handling BinarySortable keys for Small Table?

2016-05-26 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13818:

Attachment: vector_bug.q

> Fast Vector MapJoin not enhanced to use sortOrder when handling 
> BinarySortable keys for Small Table?
> 
>
> Key: HIVE-13818
> URL: https://issues.apache.org/jira/browse/HIVE-13818
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-13818.01.patch, HIVE-13818.02.patch, vector_bug.q
>
>
> Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not 
> this issue according to Gopal/Rajesh/Nita.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13818) Fast Vector MapJoin not enhanced to use sortOrder when handling BinarySortable keys for Small Table?

2016-05-26 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301770#comment-15301770
 ] 

Matt McCline commented on HIVE-13818:
-

[~gopalv] Thank you very much for working on a repro.

I've attached vector_bug.q and its Tez output.  The bug isn't triggered.  Can 
you see what I did wrong?  Thanks

> Fast Vector MapJoin not enhanced to use sortOrder when handling 
> BinarySortable keys for Small Table?
> 
>
> Key: HIVE-13818
> URL: https://issues.apache.org/jira/browse/HIVE-13818
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-13818.01.patch, HIVE-13818.02.patch
>
>
> Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not 
> this issue according to Gopal/Rajesh/Nita.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13860) Fix more json related JDK8 test failures

2016-05-26 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-13860:
---
Status: Patch Available  (was: Open)

> Fix more json related JDK8 test failures
> 
>
> Key: HIVE-13860
> URL: https://issues.apache.org/jira/browse/HIVE-13860
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-13860-java8.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13860) Fix more json related JDK8 test failures

2016-05-26 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-13860:
---
Attachment: HIVE-13860-java8.patch

> Fix more json related JDK8 test failures
> 
>
> Key: HIVE-13860
> URL: https://issues.apache.org/jira/browse/HIVE-13860
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-13860-java8.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13729) FileSystem$Cache leaks in FileUtils.checkFileAccessWithImpersonation

2016-05-26 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13729:
---
Fix Version/s: (was: 2.1.0)
   2.2.0

> FileSystem$Cache leaks in FileUtils.checkFileAccessWithImpersonation
> 
>
> Key: HIVE-13729
> URL: https://issues.apache.org/jira/browse/HIVE-13729
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 1.3.0, 2.2.0
>
> Attachments: HIVE-13729.1.patch, HIVE-13729.2.patch
>
>
> FileSystem.closeAllForUGI was not invoked after checkFileAccess. This results 
> in a leak in FileSystem$Cache and eventually an OOM in HS2.
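The fix pattern can be sketched in a few self-contained lines (a toy per-user cache with made-up names, not the actual Hadoop FileSystem API): handles opened under an impersonated UGI are cached per user, so the access check must release them in a finally block or the static cache grows by one entry per distinct user.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (hypothetical names) of the leak described above: a static
// per-user cache of filesystem handles, analogous to Hadoop's
// FileSystem$Cache keyed by UGI. Without an explicit close after each
// impersonated access check, every distinct user adds an entry that is
// never reclaimed.
class UgiCacheSketch {
    static final Map<String, Object> CACHE = new HashMap<>();

    static Object getFileSystemAs(String user) {
        // One cached handle per user, mirroring FileSystem.get() under doAs().
        return CACHE.computeIfAbsent(user, u -> new Object());
    }

    static void closeAllForUser(String user) {
        // The missing step: analogous to FileSystem.closeAllForUGI(ugi).
        CACHE.remove(user);
    }

    static boolean checkFileAccessAs(String user) {
        try {
            Object fs = getFileSystemAs(user);
            return fs != null; // stand-in for the real permission check
        } finally {
            closeAllForUser(user); // release, or the cache leaks per user
        }
    }
}
```

With the finally block in place, the cache is empty again after each check, no matter how many distinct users are impersonated.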



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12279) Testcase to verify session temporary files are removed after HIVE-11768

2016-05-26 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12279:
---
Fix Version/s: (was: 2.1.0)
   2.2.0

> Testcase to verify session temporary files are removed after HIVE-11768
> ---
>
> Key: HIVE-12279
> URL: https://issues.apache.org/jira/browse/HIVE-12279
> Project: Hive
>  Issue Type: Test
>  Components: HiveServer2, Test
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 1.3.0, 2.2.0
>
> Attachments: HIVE-12279.1.patch
>
>
> We need to make sure HS2 session temporary files are removed after session 
> ends.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12279) Testcase to verify session temporary files are removed after HIVE-11768

2016-05-26 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301690#comment-15301690
 ] 

Jesus Camacho Rodriguez commented on HIVE-12279:


[~daijy], could you push to branch-2.1 too? Master is version 2.2.0 now (I 
updated the fix version accordingly). Thanks

> Testcase to verify session temporary files are removed after HIVE-11768
> ---
>
> Key: HIVE-12279
> URL: https://issues.apache.org/jira/browse/HIVE-12279
> Project: Hive
>  Issue Type: Test
>  Components: HiveServer2, Test
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-12279.1.patch
>
>
> We need to make sure HS2 session temporary files are removed after session 
> ends.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13513) cleardanglingscratchdir does not work in some version of HDFS

2016-05-26 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13513:
---
Fix Version/s: (was: 2.1.0)
   2.2.0

> cleardanglingscratchdir does not work in some version of HDFS
> -
>
> Key: HIVE-13513
> URL: https://issues.apache.org/jira/browse/HIVE-13513
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 1.3.0, 2.2.0
>
> Attachments: HIVE-13513.1.patch, HIVE-13513.2.patch
>
>
> On some Hadoop versions, we keep getting a "lease recovery" message when we 
> check the scratchdir by opening it for append:
> {code}
> Failed to APPEND_FILE xxx for DFSClient_NONMAPREDUCE_785768631_1 on 10.0.0.18 
> because lease recovery is in progress. Try again later.
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2917)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2677)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:2984)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2953)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:655)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:421)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2133)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2131)
> {code}
> and
> {code}
> 16/04/14 04:51:56 ERROR hdfs.DFSClient: Failed to close inode 18963
> java.io.IOException: Failed to replace a bad datanode on the existing 
> pipeline due to no more good datanodes being available to try. (Nodes: 
> current=[DatanodeInfoWithStorage[10.0.0.12:30010,DS-b355ac2a-a23a-418a-af9b-4c1b4e26afe8,DISK]],
>  
> original=[DatanodeInfoWithStorage[10.0.0.12:30010,DS-b355ac2a-a23a-418a-af9b-4c1b4e26afe8,DISK]]).
>  The current failed datanode replacement policy is DEFAULT, and a client may 
> configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:951)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1017)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1165)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:470)
> {code}
> The root cause is not clear. However, if we remove the hsync call from 
> SessionState, everything works as expected. Attaching a patch that removes 
> the hsync call for now.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13513) cleardanglingscratchdir does not work in some version of HDFS

2016-05-26 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301686#comment-15301686
 ] 

Jesus Camacho Rodriguez commented on HIVE-13513:


[~daijy], could you push to branch-2.1 too? Master is version 2.2.0 now (I 
updated the fix version accordingly). Thanks

> cleardanglingscratchdir does not work in some version of HDFS
> -
>
> Key: HIVE-13513
> URL: https://issues.apache.org/jira/browse/HIVE-13513
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 1.3.0, 2.2.0
>
> Attachments: HIVE-13513.1.patch, HIVE-13513.2.patch
>
>
> On some Hadoop versions, we keep getting a "lease recovery" message when we 
> check the scratchdir by opening it for append:
> {code}
> Failed to APPEND_FILE xxx for DFSClient_NONMAPREDUCE_785768631_1 on 10.0.0.18 
> because lease recovery is in progress. Try again later.
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2917)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2677)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:2984)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2953)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:655)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:421)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2133)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2131)
> {code}
> and
> {code}
> 16/04/14 04:51:56 ERROR hdfs.DFSClient: Failed to close inode 18963
> java.io.IOException: Failed to replace a bad datanode on the existing 
> pipeline due to no more good datanodes being available to try. (Nodes: 
> current=[DatanodeInfoWithStorage[10.0.0.12:30010,DS-b355ac2a-a23a-418a-af9b-4c1b4e26afe8,DISK]],
>  
> original=[DatanodeInfoWithStorage[10.0.0.12:30010,DS-b355ac2a-a23a-418a-af9b-4c1b4e26afe8,DISK]]).
>  The current failed datanode replacement policy is DEFAULT, and a client may 
> configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:951)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1017)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1165)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:470)
> {code}
> The root cause is not clear. However, if we remove the hsync call from 
> SessionState, everything works as expected. Attaching a patch that removes 
> the hsync call for now.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13551) Make cleardanglingscratchdir work on Windows

2016-05-26 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301684#comment-15301684
 ] 

Jesus Camacho Rodriguez commented on HIVE-13551:


[~daijy], could you push to branch-2.1 too? Master is version 2.2.0 now (I 
updated the fix version accordingly). Thanks

> Make cleardanglingscratchdir work on Windows
> 
>
> Key: HIVE-13551
> URL: https://issues.apache.org/jira/browse/HIVE-13551
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 1.3.0, 2.2.0
>
> Attachments: HIVE-13551.1.patch, HIVE-13551.2.patch
>
>
> Saw a couple of issues when running cleardanglingscratchdir on Windows, 
> including:
> 1. dfs.support.append is set to false on Azure clusters, so an alternative 
> is needed when append is disabled
> 2. fixes for the cmd scripts
> 3. fixes for the unit tests on Windows



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13551) Make cleardanglingscratchdir work on Windows

2016-05-26 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13551:
---
Fix Version/s: (was: 2.1.0)
   2.2.0

> Make cleardanglingscratchdir work on Windows
> 
>
> Key: HIVE-13551
> URL: https://issues.apache.org/jira/browse/HIVE-13551
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 1.3.0, 2.2.0
>
> Attachments: HIVE-13551.1.patch, HIVE-13551.2.patch
>
>
> Saw a couple of issues when running cleardanglingscratchdir on Windows, 
> including:
> 1. dfs.support.append is set to false on Azure clusters, so an alternative 
> is needed when append is disabled
> 2. fixes for the cmd scripts
> 3. fixes for the unit tests on Windows



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13354) Add ability to specify Compaction options per table and per request

2016-05-26 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13354:
-
Attachment: HIVE-13354.3.patch

Thanks Eugene.
1. I split the first-round compaction worker job for ttp2 and ttp1 into explicit 
steps, so that we can compare the value read from tblproperties (2048) against 
the default value (1024).
2. Added symbolic constants in Initiator and CompactorMR.

> Add ability to specify Compaction options per table and per request
> ---
>
> Key: HIVE-13354
> URL: https://issues.apache.org/jira/browse/HIVE-13354
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>  Labels: TODOC2.1
> Attachments: HIVE-13354.1.patch, 
> HIVE-13354.1.withoutSchemaChange.patch, HIVE-13354.2.patch, HIVE-13354.3.patch
>
>
> Currently there are a few options that determine when automatic compaction is 
> triggered.  They are specified once for the entire warehouse.
> This doesn't make sense - some tables may be more important and need to be 
> compacted more often.
> We should allow specifying these on a per-table basis.
> Also, compaction is an MR job launched from within the metastore.  There is 
> currently no way to control job parameters (like memory, for example) except 
> to specify them in hive-site.xml for the metastore, which makes them 
> site-wide.
> We should add a way to specify these per table (perhaps even per compaction 
> if launched via ALTER TABLE).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13818) Fast Vector MapJoin not enhanced to use sortOrder when handling BinarySortable keys for Small Table?

2016-05-26 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301635#comment-15301635
 ] 

Gopal V commented on HIVE-13818:


Here's the smallest scenario which triggers the issue right now.

{code}
create temporary table x (a int) stored as orc;
create temporary table y (b int) stored as orc;
insert into x values(1);
insert into y values(1);
select count(1) from x, y where a = b;

Caused by: java.io.EOFException
at 
org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54)
at 
org.apache.hadoop.hive.serde2.binarysortable.fast.BinarySortableDeserializeRead.readCheckNull(BinarySortableDeserializeRead.java:182)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastLongHashTable.putRow(VectorMapJoinFastLongHashTable.java:81)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.putRow(VectorMapJoinFastTableContainer.java:181)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:98)
{code}

To test the theory, I tried with

{code}
create temporary table x1 (a bigint) stored as orc;
create temporary table y1 (b bigint) stored as orc;
insert into x1 values(1);
insert into y1 values(1);
select count(1) from x1, y1 where a = b;

OK
1
Time taken: 1.532 seconds, Fetched: 1 row(s)
{code}

> Fast Vector MapJoin not enhanced to use sortOrder when handling 
> BinarySortable keys for Small Table?
> 
>
> Key: HIVE-13818
> URL: https://issues.apache.org/jira/browse/HIVE-13818
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-13818.01.patch, HIVE-13818.02.patch
>
>
> Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not 
> this issue according to Gopal/Rajesh/Nita.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13376) HoS emits too many logs with application state

2016-05-26 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301613#comment-15301613
 ] 

Rui Li commented on HIVE-13376:
---

Hi [~xuefuz], I just briefly looked at the code. Although there are switches to 
control whether to log the app state, they are not exposed to users via 
configuration. So in order to disable the logging, we either need a log level 
higher than INFO, or we can disable 
{{spark.yarn.submit.waitAppCompletion}} (which only works for yarn-cluster). 
Otherwise we need the interval to avoid the verbose state logs. Let me know if 
there's another way to achieve this.
Related code in {{Client.scala}}:
{code}
  def monitorApplication(
  appId: ApplicationId,
  returnOnRunning: Boolean = false,
  logApplicationReport: Boolean = true): (YarnApplicationState, 
FinalApplicationStatus) = {
val interval = sparkConf.getLong("spark.yarn.report.interval", 1000)
var lastState: YarnApplicationState = null
while (true) {
  Thread.sleep(interval)
  val report: ApplicationReport =
try {
  getApplicationReport(appId)
} catch {
  case e: ApplicationNotFoundException =>
logError(s"Application $appId not found.")
return (YarnApplicationState.KILLED, FinalApplicationStatus.KILLED)
  case NonFatal(e) =>
logError(s"Failed to contact YARN for application $appId.", e)
return (YarnApplicationState.FAILED, FinalApplicationStatus.FAILED)
}
  val state = report.getYarnApplicationState

  if (logApplicationReport) {
logInfo(s"Application report for $appId (state: $state)")

// If DEBUG is enabled, log report details every iteration
// Otherwise, log them every time the application changes state
if (log.isDebugEnabled) {
  logDebug(formatReportDetails(report))
} else if (lastState != state) {
  logInfo(formatReportDetails(report))
}
  }

  if (lastState != state) {
state match {
  case YarnApplicationState.RUNNING =>
reportLauncherState(SparkAppHandle.State.RUNNING)
  case YarnApplicationState.FINISHED =>
reportLauncherState(SparkAppHandle.State.FINISHED)
  case YarnApplicationState.FAILED =>
reportLauncherState(SparkAppHandle.State.FAILED)
  case YarnApplicationState.KILLED =>
reportLauncherState(SparkAppHandle.State.KILLED)
  case _ =>
}
  }

  if (state == YarnApplicationState.FINISHED ||
state == YarnApplicationState.FAILED ||
state == YarnApplicationState.KILLED) {
cleanupStagingDir(appId)
return (state, report.getFinalApplicationStatus)
  }

  if (returnOnRunning && state == YarnApplicationState.RUNNING) {
return (state, report.getFinalApplicationStatus)
  }

  lastState = state
}

// Never reached, but keeps compiler happy
throw new SparkException("While loop is depleted! This should never 
happen...")
  }
{code}
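The `lastState != state` logic above boils down to a simple throttle. A minimal sketch (hypothetical class, not Spark's Client) of emitting a report line only when the YARN state changes:

```java
// Sketch of the state-change throttle discussed above (hypothetical class,
// not Spark's Client): report the application state only when it differs
// from the previous poll, so a long-running RUNNING app produces one log
// line instead of one per polling interval.
class StateChangeThrottle {
    private String lastState;

    /** Returns true when this poll's report should be logged. */
    boolean shouldLog(String state) {
        boolean changed = !state.equals(lastState);
        lastState = state;
        return changed;
    }
}
```

With a poll sequence of ACCEPTED, RUNNING, RUNNING, RUNNING, FINISHED this emits three log lines instead of five, regardless of how small the interval is.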

> HoS emits too many logs with application state
> --
>
> Key: HIVE-13376
> URL: https://issues.apache.org/jira/browse/HIVE-13376
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Fix For: 2.1.0
>
> Attachments: HIVE-13376.2.patch, HIVE-13376.patch
>
>
> The logs get flooded with something like:
> > Mar 28, 3:12:21.851 PMINFO
> > org.apache.hive.spark.client.SparkClientImpl
> > [stderr-redir-1]: 16/03/28 15:12:21 INFO yarn.Client: Application report 
> > for application_1458679386200_0161 (state: RUNNING)
> > Mar 28, 3:12:21.912 PMINFO
> > org.apache.hive.spark.client.SparkClientImpl
> > [stderr-redir-1]: 16/03/28 15:12:21 INFO yarn.Client: Application report 
> > for application_1458679386200_0149 (state: RUNNING)
> > Mar 28, 3:12:22.853 PMINFO
> > org.apache.hive.spark.client.SparkClientImpl
> > [stderr-redir-1]: 16/03/28 15:12:22 INFO yarn.Client: Application report 
> > for application_1458679386200_0161 (state: RUNNING)
> > Mar 28, 3:12:22.913 PMINFO
> > org.apache.hive.spark.client.SparkClientImpl
> > [stderr-redir-1]: 16/03/28 15:12:22 INFO yarn.Client: Application report 
> > for application_1458679386200_0149 (state: RUNNING)
> > Mar 28, 3:12:23.855 PMINFO
> > org.apache.hive.spark.client.SparkClientImpl
> > [stderr-redir-1]: 16/03/28 15:12:23 INFO yarn.Client: Application report 
> > for application_1458679386200_0161 (state: RUNNING)
> While this is good information, it is a bit much.
> Seems like SparkJobMonitor hard-codes its interval to 1 second.  It should be 
> higher and perhaps made configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-13518) Hive on Tez: Shuffle joins do not choose the right 'big' table.

2016-05-26 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-13518:
--
Attachment: HIVE-13518.3.patch

> Hive on Tez: Shuffle joins do not choose the right 'big' table.
> ---
>
> Key: HIVE-13518
> URL: https://issues.apache.org/jira/browse/HIVE-13518
> Project: Hive
>  Issue Type: Bug
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-13518.1.patch, HIVE-13518.2.patch, 
> HIVE-13518.3.patch
>
>
> Currently the big table is always assumed to be at position 0 but this isn't 
> efficient for some queries as the big table at position 1 could have a lot 
> more keys/skew. We already have a mechanism of choosing the big table that 
> can be leveraged to make the right choice.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13729) FileSystem$Cache leaks in FileUtils.checkFileAccessWithImpersonation

2016-05-26 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301601#comment-15301601
 ] 

Lefty Leverenz commented on HIVE-13729:
---

[~daijy], branch-2.1 was cut this morning so now master is for 2.2.0.

If you want this patch to go in release 2.1.0, you'll have to commit it to 
branch-2.1.  The same goes for your other commits today:  HIVE-12279, 
HIVE-13513, and HIVE-13551.

> FileSystem$Cache leaks in FileUtils.checkFileAccessWithImpersonation
> 
>
> Key: HIVE-13729
> URL: https://issues.apache.org/jira/browse/HIVE-13729
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13729.1.patch, HIVE-13729.2.patch
>
>
> FileSystem.closeAllForUGI was not invoked after checkFileAccess. This results 
> in a leak in FileSystem$Cache and eventually an OOM in HS2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13818) Fast Vector MapJoin not enhanced to use sortOrder when handling BinarySortable keys for Small Table?

2016-05-26 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301587#comment-15301587
 ] 

Gopal V commented on HIVE-13818:


TPC-DS query 55 is the one that failed - I will try to get a smaller repro for 
this, but it looks like it should fail in a two-row test case (or at least 
produce incorrect results).

The larger-range join keys in hive-testbench were upgraded to BigInt sometime 
during the 100Tb testing, so you might need to undo some schema changes there 
to repro this - otherwise you might not be testing the Integer:Integer join 
scenario.

I'm repeating this from memory from debugging it a couple of nights ago (I 
might build a repro tomorrow).

{code}
if (keyBinarySortableDeserializeRead.readCheckNull()) {
  return;
}

long key = VectorMapJoinFastLongHashUtil.deserializeLongKey(
keyBinarySortableDeserializeRead, hashTableKeyType);
{code}

As explained above, it looks like BinarySortableSerDe handles Long and Integer 
differently, so just because the Join op says LongLongInner, the Long 
deserializer cannot be used for joins involving integers.

This is *not* an issue with LazyBinary; only BinarySortable encodes Longs and 
Integers differently. In all the runs I could manage, the join worked whenever 
I cast up to bigint.

The problem seems to be that readCheckNull() does not know the actual 
hashTableKeyType here; it reads a Long out of an encoded Int and runs out of 
bytes (i.e., fewer than the 8 it needs).

From readCheckNull(), this is where it goes into the deep end.

{code}
/*
 * We have a field and are positioned to it.  Read it.
 */
...
case INT:
  {
final boolean invert = columnSortOrderIsDesc[fieldIndex];
int v = inputByteBuffer.read(invert) ^ 0x80;
for (int i = 0; i < 3; i++) {
  v = (v << 8) + (inputByteBuffer.read(invert) & 0xff);
}
currentInt = v;
  }
  break;
case LONG:
  {
final boolean invert = columnSortOrderIsDesc[fieldIndex];
long v = inputByteBuffer.read(invert) ^ 0x80;
for (int i = 0; i < 7; i++) {
  v = (v << 8) + (inputByteBuffer.read(invert) & 0xff);
}
currentLong = v;
  }
  break;
{code}

The integer:integer join case hits the 2nd case expression there and throws an 
EOF. 

Changing all joins to Long:Long allows me to run queries successfully.
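The size mismatch can be reproduced outside Hive in a few lines (a simplified re-implementation of the encoding for illustration, not the actual BinarySortable classes): an int serializes to 4 bytes with the sign bit flipped, so a reader that assumes a long pulls 8 bytes and runs off the end of the buffer, just like the EOFException above.

```java
import java.io.EOFException;

// Simplified sketch of BinarySortable's integer encoding (not the real
// Hive classes): big-endian bytes with the sign bit flipped so byte-wise
// comparison matches numeric order. An INT key occupies only 4 bytes.
class BinarySortableSketch {

    static byte[] encodeInt(int v) {
        return new byte[] {
            (byte) ((v >>> 24) ^ 0x80), // flip sign bit of the top byte
            (byte) (v >>> 16),
            (byte) (v >>> 8),
            (byte) v,
        };
    }

    // Decode assuming the buffer holds an INT (4 bytes).
    static int decodeAsInt(byte[] buf) {
        int v = (buf[0] & 0xff) ^ 0x80;
        for (int i = 1; i < 4; i++) {
            v = (v << 8) + (buf[i] & 0xff);
        }
        return v;
    }

    // Decode assuming the buffer holds a LONG (8 bytes) -- the bug scenario:
    // with only a 4-byte encoded int available, the 8-byte read hits EOF,
    // mirroring InputByteBuffer.read() in the stack trace above.
    static long decodeAsLong(byte[] buf) throws EOFException {
        if (buf.length < 8) {
            throw new EOFException("needed 8 bytes, have only " + buf.length);
        }
        long v = (buf[0] & 0xffL) ^ 0x80;
        for (int i = 1; i < 8; i++) {
            v = (v << 8) + (buf[i] & 0xff);
        }
        return v;
    }
}
```

Casting both keys to bigint avoids the mismatch because the encoded key then really is 8 bytes, which matches the behavior observed above.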

> Fast Vector MapJoin not enhanced to use sortOrder when handling 
> BinarySortable keys for Small Table?
> 
>
> Key: HIVE-13818
> URL: https://issues.apache.org/jira/browse/HIVE-13818
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-13818.01.patch, HIVE-13818.02.patch
>
>
> Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not 
> this issue according to Gopal/Rajesh/Nita.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

