[jira] [Commented] (HIVE-15314) ThriftCLIService should LOG.error rather than LOG.warn when Exception occurs

2016-11-29 Thread Fei Hui (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707875#comment-15707875
 ] 

Fei Hui commented on HIVE-15314:


The failed test cases are not related to this patch.

> ThriftCLIService should LOG.error rather than LOG.warn when Exception occurs
> 
>
> Key: HIVE-15314
> URL: https://issues.apache.org/jira/browse/HIVE-15314
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HIVE-15314.1.patch
>
>
> When an exception is caught, a critical error has occurred, and the log 
> message is "Error executing statement", "Error getting type info", etc.
> So we should use LOG.error, which will better alert users.
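To illustrate the change being proposed, here is a minimal java.util.logging sketch; ThriftCLIService actually uses an SLF4J logger, and the class and method names below are purely illustrative, not Hive's code:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class CliErrorLogging {
    private static final Logger LOG = Logger.getLogger(CliErrorLogging.class.getName());

    // The point of the issue: statement-execution failures are critical,
    // so they belong at error severity (SEVERE here), not at warning.
    static Level severityForExecutionFailure() {
        return Level.SEVERE;
    }

    public static void main(String[] args) {
        try {
            throw new RuntimeException("boom");
        } catch (Exception e) {
            // Before: LOG.warning(...) -- easy to miss in production logs.
            // After: log at error severity with the exception attached.
            LOG.log(severityForExecutionFailure(), "Error executing statement: ", e);
        }
    }
}
```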



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14804) HPLSQL multiple db connection does not switch back to Hive

2016-11-29 Thread Fei Hui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui updated HIVE-14804:
---
Status: Patch Available  (was: Open)

> HPLSQL multiple db connection does not switch back to Hive
> --
>
> Key: HIVE-14804
> URL: https://issues.apache.org/jira/browse/HIVE-14804
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Dmitry Kozlov
>Assignee: Fei Hui
>Priority: Blocker
> Attachments: HIVE-14804.1-branch-2.0.patch, 
> HIVE-14804.1-branch-2.1.patch
>
>
> I have a problem with a multi-database connection. I have 3 environments that 
> I would like to connect to in my HPL/SQL code: Hive, DB2, and MySQL. As soon 
> as I map any table from either DB2 or MySQL, my code stops recognizing Hive 
> tables. It starts to treat them as tables from whichever database (DB2 or 
> MySQL) was mapped last. This means your example 
> http://www.hplsql.org/map-object works only one way, from Hive to MySQL, and 
> it is not possible to go back to Hive.
> Here is a simple piece of code.
> declare cnt int;
> begin
> /*
> PRINT 'Start MySQL';
> MAP OBJECT tbls TO hive.TBLS AT mysqlconn;
> select count(*)
> into cnt
> from tbls;
> PRINT cnt;
> PRINT 'Start Db2';
> MAP OBJECT exch TO DBDEV2.TEST_EXCHANGE AT db2conn;
> select count(1) 
> into cnt
> from exch;
> PRINT cnt;*/
> PRINT 'Check Hive';
> SELECT count(1) 
> into cnt
> FROM dev.test_sqoop;
> PRINT cnt;
> end;
> It has three blocks: one selects from MySQL, the second from DB2, and the 
> third from a Hive ORC table.
> When the first two blocks are commented out, block 3 works. See below.
> Check Hive
> 16/09/20 18:08:08 INFO jdbc.Utils: Supplied authorities: localhost:1
> 16/09/20 18:08:08 INFO jdbc.Utils: Resolved authority: localhost:1
> 16/09/20 18:08:08 INFO jdbc.HiveConnection: Will try to open client transport 
> with JDBC Uri: jdbc:hive2://localhost:1
> Open connection: jdbc:hive2://localhost:1 (497 ms)
> Starting query
> Query executed successfully (177 ms)
> 82
> When I try to uncomment any of those blocks, block 3 stops working. For 
> example, if I uncomment block 1 I get the output below. It now assumes that 
> dev.test_sqoop is a MySQL table, contrary to your example.
> Start MySQL
> Open connection: jdbc:mysql://10.11.12.144:3306/hive (489 ms)
> Starting query
> Query executed successfully (4 ms)
> 539
> Check Hive
> Starting query
> Unhandled exception in HPL/SQL
> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Table 
> 'dev.test_sqoop' doesn't exist
> If I comment out the second block, it starts to assume that dev.test_sqoop is 
> a DB2 table. See below. So switching between DB2 and MySQL works; however, 
> the Hive table still does not work.
> Start MySQL
> Open connection: jdbc:mysql://10.11.12.144:3306/hive (485 ms)
> Starting query
> Query executed successfully (5 ms)
> 539
> Start Db2
> Open connection: jdbc:db2://10.11.12.141:5/WM (227 ms)
> Starting query
> Query executed successfully (48 ms)
> 0
> Check Hive
> Starting query
> Unhandled exception in HPL/SQL
> com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-204, 
> SQLSTATE=42704, SQLERRMC=DEV.TEST_SQOOP, DRIVER=4.16.53
> Could you please provide your feedback on this finding? In addition, I would 
> like to check whether it would be possible, once a DB2 table is properly 
> mapped, to insert into it records selected from Hive with one statement. 
> Please explain.
> Looking forward to hearing from you soon.
> Regards,
> Dmitry Kozlov
> Daisy Intelligence   
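The behavior described above suggests the object-map lookup falls through to the most recently used connection instead of the default one. A minimal sketch of the intended lookup, with hypothetical class and connection names (not HPL/SQL's actual internals):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: an unmapped table should resolve to the default
// (Hive) connection, not to whichever connection was used last.
public class ObjectMap {
    private final Map<String, String> tableToConn = new HashMap<>();
    private final String defaultConn;

    ObjectMap(String defaultConn) {
        this.defaultConn = defaultConn;
    }

    // Record a MAP OBJECT ... AT <conn> declaration.
    void map(String table, String conn) {
        tableToConn.put(table.toUpperCase(), conn);
    }

    // Fall back to the default connection for tables that were never mapped.
    String connFor(String table) {
        return tableToConn.getOrDefault(table.toUpperCase(), defaultConn);
    }
}
```

With this lookup, `dev.test_sqoop` still routes to the Hive connection even after `tbls` has been mapped to `mysqlconn`.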





[jira] [Updated] (HIVE-14804) HPLSQL multiple db connection does not switch back to Hive

2016-11-29 Thread Fei Hui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui updated HIVE-14804:
---
Attachment: HIVE-14804.1-branch-2.1.patch
HIVE-14804.1-branch-2.0.patch

Patches for branch-2.0 & branch-2.1.
The master branch has already fixed this problem.

> HPLSQL multiple db connection does not switch back to Hive
> --
>
> Key: HIVE-14804
> URL: https://issues.apache.org/jira/browse/HIVE-14804
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Dmitry Kozlov
>Assignee: Fei Hui
>Priority: Blocker
> Attachments: HIVE-14804.1-branch-2.0.patch, 
> HIVE-14804.1-branch-2.1.patch
>





[jira] [Assigned] (HIVE-14804) HPLSQL multiple db connection does not switch back to Hive

2016-11-29 Thread Fei Hui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui reassigned HIVE-14804:
--

Assignee: Fei Hui  (was: Dmitry Tolpeko)

> HPLSQL multiple db connection does not switch back to Hive
> --
>
> Key: HIVE-14804
> URL: https://issues.apache.org/jira/browse/HIVE-14804
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Dmitry Kozlov
>Assignee: Fei Hui
>Priority: Blocker





[jira] [Updated] (HIVE-15112) Implement Parquet vectorization reader for Struct type

2016-11-29 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-15112:

Attachment: HIVE-15112.patch

> Implement Parquet vectorization reader for Struct type
> --
>
> Key: HIVE-15112
> URL: https://issues.apache.org/jira/browse/HIVE-15112
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-15112.patch
>
>
> Like HIVE-14815, we need to support the Parquet vectorized reader for the struct type.





[jira] [Commented] (HIVE-15057) Support other types of operators (other than SELECT)

2016-11-29 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707830#comment-15707830
 ] 

Ferdinand Xu commented on HIVE-15057:
-

+1 pending on the test

> Support other types of operators (other than SELECT)
> 
>
> Key: HIVE-15057
> URL: https://issues.apache.org/jira/browse/HIVE-15057
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer, Physical Optimizer
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-15057.wip.patch
>
>
> Currently only SELECT operators are supported for nested column pruning. We 
> should add support for other types of operators so the optimization can work 
> for complex queries.





[jira] [Commented] (HIVE-15314) ThriftCLIService should LOG.error rather than LOG.warn when Exception occurs

2016-11-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707815#comment-15707815
 ] 

Hive QA commented on HIVE-15314:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840986/HIVE-15314.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10718 tests 
executed
*Failed tests:*
{noformat}
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=114)

[join39.q,bucketsortoptimize_insert_7.q,vector_distinct_2.q,join11.q,union13.q,dynamic_rdd_cache.q,auto_sortmerge_join_16.q,windowing.q,union_remove_3.q,skewjoinopt7.q,stats7.q,annotate_stats_join.q,multi_insert_lateral_view.q,ptf_streaming.q,join_1to1.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=128)

[union_remove_15.q,bucket_map_join_tez1.q,groupby7_noskew.q,bucketmapjoin1.q,subquery_multiinsert.q,auto_join8.q,auto_join6.q,groupby2_map_skew.q,lateral_view_explode2.q,join28.q,load_dyn_part1.q,skewjoinopt17.q,skewjoin_union_remove_1.q,union_remove_20.q,bucketmapjoin5.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=133)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=91)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2338/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2338/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2338/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840986 - PreCommit-HIVE-Build

> ThriftCLIService should LOG.error rather than LOG.warn when Exception occurs
> 
>
> Key: HIVE-15314
> URL: https://issues.apache.org/jira/browse/HIVE-15314
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HIVE-15314.1.patch
>





[jira] [Updated] (HIVE-15313) Add export spark.yarn.archive or spark.yarn.jars variable in Hive on Spark document

2016-11-29 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated HIVE-15313:

Description: 
According to the 
[wiki|https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started],
 queries were run in HOS16 and HOS20 in yarn mode.
The following table shows the difference in query time between HOS16 and HOS20.
||Version||Total time||Time for jobs||Time for preparing jobs||
|Spark16|51|39|12|
|Spark20|54|40|14|

HOS20 spends more time (2 secs) on preparing jobs than HOS16. After reviewing 
the Spark source code, we found that the following causes this:
[Client#distribute|https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L546]:
in Spark 2.0, if Spark cannot find spark.yarn.archive or spark.yarn.jars in 
its configuration file, it first copies all jars in $SPARK_HOME/jars to a 
tmp directory and uploads that directory to the distributed cache. Compare 
[spark16|https://github.com/apache/spark/blob/branch-1.6/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L1145]:
in Spark 1.6, it searches for spark-assembly*.jar and uploads that single jar 
to the distributed cache.

In Spark 2.0, it spends 2 more seconds copying all jars in $SPARK_HOME/jars to 
a tmp directory if we don't set "spark.yarn.archive" or "spark.yarn.jars".

We can accelerate the startup of Hive on Spark 2.0 by setting 
"spark.yarn.archive" or "spark.yarn.jars":
set "spark.yarn.archive":
{code}
$ zip -j spark-archive.zip $SPARK_HOME/jars/*
$ hadoop fs -copyFromLocal spark-archive.zip
$ echo "spark.yarn.archive=hdfs://xxx:8020/spark-archive.zip" >> conf/spark-defaults.conf
{code}
set "spark.yarn.jars":
{code}
$ hadoop fs -mkdir spark-2.0.0-bin-hadoop
$ hadoop fs -copyFromLocal $SPARK_HOME/jars/* spark-2.0.0-bin-hadoop
$ echo "spark.yarn.jars=hdfs://xxx:8020/spark-2.0.0-bin-hadoop/*" >> conf/spark-defaults.conf
{code}

Suggest adding this part to the wiki.

performance.improvement.after.set.spark.yarn.archive.PNG shows the detailed 
performance improvement after setting spark.yarn.archive for small queries.






> Add export spark.yarn.archive or spark.yarn.jars variable in Hive on Spark 
> document
> ---
>
> Key: HIVE-15313
> URL: https://issues.apache.org/jira/browse/HIVE-15313
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Priority: Minor
> Attachments: performance.improvement.after.set.spark.yarn.archive.PNG
>

[jira] [Updated] (HIVE-15313) Add export spark.yarn.archive or spark.yarn.jars variable in Hive on Spark document

2016-11-29 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated HIVE-15313:

Attachment: performance.improvement.after.set.spark.yarn.archive.PNG

performance.improvement.after.set.spark.yarn.archive.PNG shows the detailed 
performance improvement after setting spark.yarn.archive for small queries.

> Add export spark.yarn.archive or spark.yarn.jars variable in Hive on Spark 
> document
> ---
>
> Key: HIVE-15313
> URL: https://issues.apache.org/jira/browse/HIVE-15313
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Priority: Minor
> Attachments: performance.improvement.after.set.spark.yarn.archive.PNG
>





[jira] [Commented] (HIVE-15285) err info for itests mvn building is not correct

2016-11-29 Thread Fei Hui (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707661#comment-15707661
 ] 

Fei Hui commented on HIVE-15285:


many thanks

> err info for itests mvn building is not correct
> ---
>
> Key: HIVE-15285
> URL: https://issues.apache.org/jira/browse/HIVE-15285
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HIVE-15285.1.patch
>
>
> When I build itests, I see the following error message:
> {noformat}
> [exec] cp: cannot stat 
> `./target/../../..//data/conf/spark/log4j2.properties': No such file or 
> directory
> {noformat}
> But the real reason is a Spark download error. The message above confuses 
> users; it is not the root cause.





[jira] [Updated] (HIVE-15240) Updating/Altering stats in metastore can be expensive in S3

2016-11-29 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-15240:
-
Labels: 2.2.0  (was: )

> Updating/Altering stats in metastore can be expensive in S3
> ---
>
> Key: HIVE-15240
> URL: https://issues.apache.org/jira/browse/HIVE-15240
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
>  Labels: 2.2.0
> Fix For: 2.2.0
>
> Attachments: HIVE-15240.1.patch, HIVE-15240.2.patch, 
> HIVE-15240.3.patch, HIVE-15240.5.patch, HIVE-15240.6.patch
>
>
> https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java#L630
> https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java#L367
> If there are 100 partitions, it iterates over every partition to determine 
> its location, taking up a significant amount of time.
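As a sketch of the kind of optimization involved (this is not the actual patch, and the names are hypothetical): partitions located under the table's base path can be classified in one in-memory pass, instead of one filesystem or metastore round trip per partition:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: derive partition locations from already-loaded
// metadata in a single pass, so only partitions stored OUTSIDE the table's
// base path need individual handling.
public class PartitionLocationCheck {
    // Returns the partitions whose location is not under the table base path.
    static List<String> externallyLocated(String tableBase, Map<String, String> partLocations) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : partLocations.entrySet()) {
            if (!e.getValue().startsWith(tableBase)) {
                out.add(e.getKey());
            }
        }
        return out;
    }
}
```

On an object store such as S3, avoiding per-partition round trips matters because each metadata call carries high latency.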





[jira] [Updated] (HIVE-15240) Updating/Altering stats in metastore can be expensive in S3

2016-11-29 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-15240:
-
Affects Version/s: 2.2.0

> Updating/Altering stats in metastore can be expensive in S3
> ---
>
> Key: HIVE-15240
> URL: https://issues.apache.org/jira/browse/HIVE-15240
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
>  Labels: 2.2.0
> Fix For: 2.2.0
>
> Attachments: HIVE-15240.1.patch, HIVE-15240.2.patch, 
> HIVE-15240.3.patch, HIVE-15240.5.patch, HIVE-15240.6.patch
>





[jira] [Updated] (HIVE-15240) Updating/Altering stats in metastore can be expensive in S3

2016-11-29 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-15240:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks [~rajesh.balamohan] for the patch!

> Updating/Altering stats in metastore can be expensive in S3
> ---
>
> Key: HIVE-15240
> URL: https://issues.apache.org/jira/browse/HIVE-15240
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
>  Labels: 2.2.0
> Fix For: 2.2.0
>
> Attachments: HIVE-15240.1.patch, HIVE-15240.2.patch, 
> HIVE-15240.3.patch, HIVE-15240.5.patch, HIVE-15240.6.patch
>





[jira] [Updated] (HIVE-15250) Reuse partitions info generated in MoveTask to its subscribers (StatsTask)

2016-11-29 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-15250:
-
  Resolution: Fixed
   Fix Version/s: 2.2.0
Target Version/s: 2.2.0
  Status: Resolved  (was: Patch Available)

Committed to master. Thanks [~rajesh.balamohan] for the patch!

> Reuse partitions info generated in MoveTask to its subscribers (StatsTask)
>
> -
>
> Key: HIVE-15250
> URL: https://issues.apache.org/jira/browse/HIVE-15250
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-15250.1.patch, HIVE-15250.2.patch, 
> HIVE-15250.3.patch
>
>
> When dynamic partitions are enabled, {{StatsTask}} loads partition 
> information by querying the metastore. In cases like {{insert overwrite 
> table}}, this can be an expensive operation depending on the number of 
> partitions involved (e.g., in TPC-DS, populating the web_returns table would 
> incur 2184 DB calls just for this function).
> It would be good to pass the partition information generated in MoveTask on 
> to its subscribers to reduce the number of DB calls.
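The handoff can be sketched as follows; the class and field names are hypothetical, not Hive's actual task API:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: the producer task hands its computed partition list
// directly to the subscriber, so the subscriber makes zero extra DB calls.
public class TaskHandoff {
    static class MoveResult {
        final List<String> partitions;
        MoveResult(List<String> partitions) {
            this.partitions = partitions;
        }
    }

    // Stand-in for StatsTask: consumes the list MoveTask already computed
    // instead of re-querying the metastore for it.
    static int statsTask(MoveResult fromMoveTask) {
        return fromMoveTask.partitions.size();
    }

    public static void main(String[] args) {
        MoveResult r = new MoveResult(Arrays.asList("ds=2016-11-28", "ds=2016-11-29"));
        System.out.println(statsTask(r));
    }
}
```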





[jira] [Updated] (HIVE-15250) Reuse partitions info generated in MoveTask to its subscribers (StatsTask)

2016-11-29 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-15250:
-
Affects Version/s: 2.2.0

> Reuse partitions info generated in MoveTask to its subscribers (StatsTask)
>
> -
>
> Key: HIVE-15250
> URL: https://issues.apache.org/jira/browse/HIVE-15250
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-15250.1.patch, HIVE-15250.2.patch, 
> HIVE-15250.3.patch
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15285) err info for itests mvn building is not correct

2016-11-29 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707645#comment-15707645
 ] 

Prasanth Jayachandran commented on HIVE-15285:
--

cc [~spena][~jxiang][~xuefuz]

> err info for itests mvn building is not correct
> ---
>
> Key: HIVE-15285
> URL: https://issues.apache.org/jira/browse/HIVE-15285
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HIVE-15285.1.patch
>
>
> When I build itests, I see this error:
> {noformat}
> [exec] cp: cannot stat 
> `./target/../../..//data/conf/spark/log4j2.properties': No such file or 
> directory
> {noformat}
> But the real cause is a Spark download error. The message above confuses 
> users; it is not the root cause.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15285) err info for itests mvn building is not correct

2016-11-29 Thread Fei Hui (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707637#comment-15707637
 ] 

Fei Hui commented on HIVE-15285:


Hi [~prasanth_j],
I don't know who can review this.
Could you please give suggestions and review it?
Thanks

> err info for itests mvn building is not correct
> ---
>
> Key: HIVE-15285
> URL: https://issues.apache.org/jira/browse/HIVE-15285
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HIVE-15285.1.patch
>
>
> When I build itests, I see this error:
> {noformat}
> [exec] cp: cannot stat 
> `./target/../../..//data/conf/spark/log4j2.properties': No such file or 
> directory
> {noformat}
> But the real cause is a Spark download error. The message above confuses 
> users; it is not the root cause.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15314) ThriftCLIService should LOG.error rather than LOG.warn when Exception occurs

2016-11-29 Thread Fei Hui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui updated HIVE-15314:
---
Status: Patch Available  (was: Open)

> ThriftCLIService should LOG.error rather than LOG.warn when Exception occurs
> 
>
> Key: HIVE-15314
> URL: https://issues.apache.org/jira/browse/HIVE-15314
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HIVE-15314.1.patch
>
>
> When an exception is caught, a critical error has occurred, 
> and the message in the log is "Error executing statement", "Error getting 
> type info", etc.
> So we should use LOG.error, which will better alert users.
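A minimal sketch of the proposed change is below. It uses java.util.logging so it is self-contained (Hive itself uses SLF4J, where the call would be {{LOG.error(...)}} instead of {{LOG.warn(...)}}); the class and method names are illustrative, not the actual ThriftCLIService code:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative only: a request handler that logs an operation-aborting
// exception at ERROR (SEVERE in java.util.logging) instead of WARN,
// so the failure stands out in the server log.
public class ErrorLogging {
    private static final Logger LOG = Logger.getLogger(ErrorLogging.class.getName());

    static String executeStatement(String sql) {
        try {
            if (sql == null) {
                throw new IllegalArgumentException("statement is null");
            }
            return "OK";
        } catch (IllegalArgumentException e) {
            // Before: LOG.log(Level.WARNING, "Error executing statement", e);
            // After: the operation failed, so log at ERROR/SEVERE.
            LOG.log(Level.SEVERE, "Error executing statement: " + e.getMessage(), e);
            return "ERROR";
        }
    }

    public static void main(String[] args) {
        System.out.println(executeStatement("SELECT 1")); // prints OK
        System.out.println(executeStatement(null));       // logs at SEVERE, prints ERROR
    }
}
```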



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15314) ThriftCLIService should LOG.error rather than LOG.warn when Exception occurs

2016-11-29 Thread Fei Hui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui updated HIVE-15314:
---
Attachment: HIVE-15314.1.patch

patch uploaded

> ThriftCLIService should LOG.error rather than LOG.warn when Exception occurs
> 
>
> Key: HIVE-15314
> URL: https://issues.apache.org/jira/browse/HIVE-15314
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HIVE-15314.1.patch
>
>
> When an exception is caught, a critical error has occurred, 
> and the message in the log is "Error executing statement", "Error getting 
> type info", etc.
> So we should use LOG.error, which will better alert users.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-11-29 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707619#comment-15707619
 ] 

Rui Li commented on HIVE-15239:
---

Pinging [~xuefuz]

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch
>
>
> env: hive on spark engine
> reproduce step:
> {code}
> create table a1(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> create table a2(KEHHAO string, START_DT string) partitioned by (END_DT 
> string);
> alter table a1 add partition(END_DT='20161020');
> alter table a1 add partition(END_DT='20161021');
> insert into table a1 partition(END_DT='20161020') 
> values('2000721360','20161001');
> SELECT T1.KEHHAO,COUNT(1) FROM ( 
> SELECT KEHHAO FROM a1 T 
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> UNION ALL 
> SELECT KEHHAO FROM a2 T
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND 
> T.END_DT-1 
> ) T1 
> GROUP BY T1.KEHHAO 
> HAVING COUNT(1)>1; 
> +-+--+--+
> |  t1.kehhao  | _c1  |
> +-+--+--+
> | 2000721360  | 2|
> +-+--+--+
> {code}
> the result should be no records



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-15306) Change NOTICE file to account for JSON license components

2016-11-29 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-15306:


Assignee: Prasanth Jayachandran  (was: Jesus Camacho Rodriguez)

> Change NOTICE file to account for JSON license components
> -
>
> Key: HIVE-15306
> URL: https://issues.apache.org/jira/browse/HIVE-15306
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 2.1.1
>Reporter: Jesus Camacho Rodriguez
>Assignee: Prasanth Jayachandran
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-15306.patch
>
>
> NO PRECOMMIT TESTS
> As per email discussion in 
> http://mail-archives.apache.org/mod_mbox/www-legal-discuss/201611.mbox/%3C0CE2E8C9-D9B7-404D-93EF-A1F8B07189BF%40apache.org%3E
>  .
> Notice that the temporary exclusion period for JSON license components is 
> extended till April 30, 2017.
> {quote}
> At that point in time, ANY and ALL usage
> of these JSON licensed artifacts are DISALLOWED. You must
> either find a suitably licensed replacement, or do without.
> There will be NO exceptions.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15312) reduce logging in certain places

2016-11-29 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707558#comment-15707558
 ] 

Prasanth Jayachandran commented on HIVE-15312:
--

+1, pending tests. Looks like compilation is breaking.

> reduce logging in certain places
> 
>
> Key: HIVE-15312
> URL: https://issues.apache.org/jira/browse/HIVE-15312
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15312.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15313) Add export spark.yarn.archive or spark.yarn.jars variable in Hive on Spark document

2016-11-29 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated HIVE-15313:

Description: 
According to the 
[wiki|https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started],
 I ran queries on HOS16 and HOS20 in yarn mode.
The following table shows the difference in query time between HOS16 and HOS20.
||Version||Total time||Time for Jobs||Time for preparing jobs||
|Spark16|51|39|12|
|Spark20|54|40|14|

HOS20 spends more time (2 secs) on preparing jobs than HOS16. After reviewing 
the Spark source code, I found the cause in 
[Client#distribute|https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L546]:
 in Spark 2.0, if Spark cannot find spark.yarn.archive or spark.yarn.jars in 
the Spark configuration file, it first copies all jars in $SPARK_HOME/jars to a 
tmp directory and uploads that directory to the distributed cache. Compare 
[spark16|https://github.com/apache/spark/blob/branch-1.6/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L1145]:
 in Spark 1.6, it searches for spark-assembly*.jar and uploads it to the 
distributed cache.

So Spark 2.0 spends 2 more seconds copying all jars in $SPARK_HOME/jars to a 
tmp directory if we don't set "spark.yarn.archive" or "spark.yarn.jars".

We can accelerate the startup of Hive on Spark 2.0 by setting 
"spark.yarn.archive" or "spark.yarn.jars":
set "spark.yarn.archive":
{code}
$ zip spark-archive.zip $SPARK_HOME/jars/*
$ hadoop fs -copyFromLocal spark-archive.zip 
$ echo "spark.yarn.archive=hdfs://xxx:8020/spark-archive.zip" >> 
conf/spark-defaults.conf
{code}
set "spark.yarn.jars":
{code}
$ hadoop fs -mkdir spark-2.0.0-bin-hadoop
$ hadoop fs -copyFromLocal $SPARK_HOME/jars/* spark-2.0.0-bin-hadoop
$ echo "spark.yarn.jars=hdfs://xxx:8020/spark-2.0.0-bin-hadoop/*" >> 
conf/spark-defaults.conf
{code}

I suggest adding this part to the wiki.





  was:
According to 
[wiki|https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started],
 run queries in HOS16 and HOS20 in yarn mode.
Following table shows the difference in query time between HOS16 and HOS20.
||Version||Total time||Time for Jobs||Time for preparing jobs||
|Spark16|51|39|12|
|Spark20|54|40|14| 

 HOS20 spends more time(2 secs) on preparing jobs than HOS16. After reviewing 
the source code of spark, found that following point causes this:
code:[Client#distribute|https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L546],
 In spark20, if spark cannot find spark.yarn.archive and spark.yarn.jars in 
spark configuration file, it will first copy all jars in $SPARK_HOME/jars to a 
tmp directory and upload the tmp directory to distribute cache. Comparing 
[spark16|https://github.com/apache/spark/blob/branch-1.6/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L1145],
 
In spark16, it will find spark-assembly*.jar and upload it to distribute cache.

In spark20, it spends 2 more seconds to copy all jars in $SPARK_HOME/jar to a 
tmp directory.

We can accelerate the startup of Hive on Spark 2.0 by setting 
"spark.yarn.archive" or "spark.yarn.jars":
set "spark.yarn.archive":
{code}
$ zip spark-archive.zip $SPARK_HOME/jars/*
$ hadoop fs -copyFromLocal spark-archive.zip 
$ echo "spark.yarn.archive=hdfs://xxx:8020/spark-archive.zip" >> 
conf/spark-defaults.conf
{code}
set "spark.yarn.jars":
{code}
$ hadoop fs -mkdir spark-2.0.0-bin-hadoop
$ hadoop fs -copyFromLocal $SPARK_HOME/jars/* spark-2.0.0-bin-hadoop
$ echo "spark.yarn.jars=hdfs://xxx:8020/spark-2.0.0-bin-hadoop/*" >> 
conf/spark-defaults.conf
{code}

Suggest to add this part in wiki.






> Add export spark.yarn.archive or spark.yarn.jars variable in Hive on Spark 
> document
> ---
>
> Key: HIVE-15313
> URL: https://issues.apache.org/jira/browse/HIVE-15313
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Priority: Minor
>
> According to 
> [wiki|https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started],
>  run queries in HOS16 and HOS20 in yarn mode.
> Following table shows the difference in query time between HOS16 and HOS20.
> ||Version||Total time||Time for Jobs||Time for preparing jobs||
> |Spark16|51|39|12|
> |Spark20|54|40|14| 
>  HOS20 spends more time (2 secs) on preparing jobs than HOS16. After reviewing 
> the Spark source code, I found that the following causes this:
> code:[Client#distribute|https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L546],
>  In spark20, if spark cannot find spark.yarn.archive and spark.yarn.jars in 
> spark configuration file, it will first copy all jars in $SPARK_HOME/jars t

[jira] [Commented] (HIVE-15202) Concurrent compactions for the same partition may generate malformed folder structure

2016-11-29 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707463#comment-15707463
 ] 

Rui Li commented on HIVE-15202:
---

Hi [~ekoifman], I have one question. Suppose we have a compaction in 
READY_FOR_CLEANING state, then we enqueue another compaction on the same 
partition. Is it possible that the cleaner removes the files that are supposed 
to be compacted by the 2nd compaction?

> Concurrent compactions for the same partition may generate malformed folder 
> structure
> -
>
> Key: HIVE-15202
> URL: https://issues.apache.org/jira/browse/HIVE-15202
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Rui Li
>Assignee: Eugene Koifman
> Attachments: HIVE-15202.01.patch, HIVE-15202.02.patch, 
> HIVE-15202.03.patch
>
>
> If two compactions run concurrently on a single partition, it may generate 
> folder structure like this: (nested base dir)
> {noformat}
> drwxr-xr-x   - root supergroup  0 2016-11-14 22:23 
> /user/hive/warehouse/test/z=1/base_007/base_007
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_0
> -rw-r--r--   3 root supergroup611 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_1
> -rw-r--r--   3 root supergroup614 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_2
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_3
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_4
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_5
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_6
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_7
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_8
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_9
> {noformat}
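A hypothetical check for the malformed layout shown above (a base directory nested inside another base directory) could look like the sketch below. This is not Hive's actual validation logic; the class name and directory-name convention are assumptions for illustration:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: detect a "base_N" directory that contains another
// "base_N" directory, the malformed structure concurrent compactions produce.
public class NestedBaseCheck {
    static boolean isNestedBase(Path dir) throws IOException {
        if (!dir.getFileName().toString().startsWith("base_")) {
            return false;
        }
        try (DirectoryStream<Path> children = Files.newDirectoryStream(dir)) {
            for (Path child : children) {
                if (Files.isDirectory(child)
                        && child.getFileName().toString().startsWith("base_")) {
                    return true;  // nested base dir => malformed layout
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        Path partition = Files.createTempDirectory("partition");
        Path base = Files.createDirectory(partition.resolve("base_0000007"));
        Files.createDirectory(base.resolve("base_0000007")); // simulate the bug
        System.out.println(isNestedBase(base)); // prints true
    }
}
```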



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15302) Relax the requirement that HoS needs Spark built w/o Hive

2016-11-29 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707443#comment-15707443
 ] 

Rui Li commented on HIVE-15302:
---

Thanks for your suggestions, Marcelo. I'll use spark.yarn.jars instead.

The "download tarball from somewhere" approach is for the tests, and we have 
HIVE-14735 to move that to Maven. What I'm trying to solve here is how to avoid 
the conflict at runtime (i.e. a user running SQL with HoS). I plan to find the 
needed jars in the Spark installed on the cluster and upload them to HDFS. Then 
we don't have to require that the Spark on the cluster be built w/o Hive.

> Relax the requirement that HoS needs Spark built w/o Hive
> -
>
> Key: HIVE-15302
> URL: https://issues.apache.org/jira/browse/HIVE-15302
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rui Li
>Assignee: Rui Li
>
> This requirement becomes more and more unacceptable as SparkSQL becomes 
> widely adopted. Let's use this JIRA to find out how we can relax the 
> limitation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15202) Concurrent compactions for the same partition may generate malformed folder structure

2016-11-29 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707431#comment-15707431
 ] 

Eugene Koifman commented on HIVE-15202:
---

explainanalyze_2 is flaky per HIVE-15084; the rest have age > 1.
[~wzheng], could you review please?

> Concurrent compactions for the same partition may generate malformed folder 
> structure
> -
>
> Key: HIVE-15202
> URL: https://issues.apache.org/jira/browse/HIVE-15202
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Rui Li
>Assignee: Eugene Koifman
> Attachments: HIVE-15202.01.patch, HIVE-15202.02.patch, 
> HIVE-15202.03.patch
>
>
> If two compactions run concurrently on a single partition, it may generate 
> folder structure like this: (nested base dir)
> {noformat}
> drwxr-xr-x   - root supergroup  0 2016-11-14 22:23 
> /user/hive/warehouse/test/z=1/base_007/base_007
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_0
> -rw-r--r--   3 root supergroup611 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_1
> -rw-r--r--   3 root supergroup614 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_2
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_3
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_4
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_5
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_6
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_7
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_8
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_9
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-15259) The deserialization time of HOS20 is longer than what in HOS16

2016-11-29 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel resolved HIVE-15259.
-
Resolution: Not A Bug

> The deserialization time of HOS20 is longer than what in  HOS16
> ---
>
> Key: HIVE-15259
> URL: https://issues.apache.org/jira/browse/HIVE-15259
> Project: Hive
>  Issue Type: Improvement
>Reporter: liyunzhang_intel
> Attachments: Deserialization_HOS16.PNG, Deserialization_HOS20.PNG
>
>
> Deploy Hive on Spark on Spark 1.6 and Spark 2.0.
> Run a query: with the latest code (Spark 2.0), the deserialization time of a 
> task is 4 sec, while with Spark 1.6 it is 1 sec. The details are in the 
> attached pictures.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15259) The deserialization time of HOS20 is longer than what in HOS16

2016-11-29 Thread liyunzhang_intel (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707422#comment-15707422
 ] 

liyunzhang_intel commented on HIVE-15259:
-

[~lirui]: An update on this JIRA:
The serializer and deserializer of HOS is Kryo, and Kryo uses caching. So if 
HOS20 runs before HOS16, the deserialization time of HOS20 is longer than 
HOS16's. There is no obvious difference in serialization or deserialization 
between HOS16 and HOS20, so I am closing this JIRA.

> The deserialization time of HOS20 is longer than what in  HOS16
> ---
>
> Key: HIVE-15259
> URL: https://issues.apache.org/jira/browse/HIVE-15259
> Project: Hive
>  Issue Type: Improvement
>Reporter: liyunzhang_intel
> Attachments: Deserialization_HOS16.PNG, Deserialization_HOS20.PNG
>
>
> Deploy Hive on Spark on Spark 1.6 and Spark 2.0.
> Run a query: with the latest code (Spark 2.0), the deserialization time of a 
> task is 4 sec, while with Spark 1.6 it is 1 sec. The details are in the 
> attached pictures.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15302) Relax the requirement that HoS needs Spark built w/o Hive

2016-11-29 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707398#comment-15707398
 ] 

Marcelo Vanzin commented on HIVE-15302:
---

You don't need to use the archive. You can use {{spark.yarn.jars}}, for 
example; that doesn't require an archive. I don't remember the exact 
requirements for the archive; it's documented in Spark's documentation.

The recommended approach is either HDFS, or having it on every node and using a 
"local:" URI to tell Spark not to upload anything.

I'm hoping that you'll be using maven to package the needed Spark dependencies 
instead of the current "download tarball from somewhere" approach.

> Relax the requirement that HoS needs Spark built w/o Hive
> -
>
> Key: HIVE-15302
> URL: https://issues.apache.org/jira/browse/HIVE-15302
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rui Li
>Assignee: Rui Li
>
> This requirement becomes more and more unacceptable as SparkSQL becomes 
> widely adopted. Let's use this JIRA to find out how we can relax the 
> limitation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15302) Relax the requirement that HoS needs Spark built w/o Hive

2016-11-29 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707362#comment-15707362
 ] 

Rui Li commented on HIVE-15302:
---

To clarify, the method here only works for yarn-cluster mode. For yarn-client, 
the driver runs on the client side, and it will suffer conflicts if Spark pulls 
in Hive libs.

> Relax the requirement that HoS needs Spark built w/o Hive
> -
>
> Key: HIVE-15302
> URL: https://issues.apache.org/jira/browse/HIVE-15302
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rui Li
>Assignee: Rui Li
>
> This requirement becomes more and more unacceptable as SparkSQL becomes 
> widely adopted. Let's use this JIRA to find out how we can relax the 
> limitation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15302) Relax the requirement that HoS needs Spark built w/o Hive

2016-11-29 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707353#comment-15707353
 ] 

Rui Li commented on HIVE-15302:
---

Yeah, my plan is to put the jars on HDFS. For example, if the user doesn't 
specify spark.yarn.archive or spark.yarn.jars, we can find the needed jars in 
spark.home and upload them to HDFS, under our session's tmp dir.
I'm actually not very clear about the difference between spark.yarn.archive and 
spark.yarn.jars. In my test I just put all the jars in a folder on HDFS, 
pointed spark.yarn.archive at that folder, and it worked. I guess the usage of 
spark.yarn.jars should be similar.

> Relax the requirement that HoS needs Spark built w/o Hive
> -
>
> Key: HIVE-15302
> URL: https://issues.apache.org/jira/browse/HIVE-15302
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rui Li
>Assignee: Rui Li
>
> This requirement becomes more and more unacceptable as SparkSQL becomes 
> widely adopted. Let's use this JIRA to find out how we can relax the 
> limitation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15302) Relax the requirement that HoS needs Spark built w/o Hive

2016-11-29 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707343#comment-15707343
 ] 

Rui Li commented on HIVE-15302:
---

Hi [~vanzin], the potential conflicts introduced by transitive deps have always 
been there. My understanding is {{spark.yarn.archive}} gives us a chance to 
exclude unneeded jars as much as possible, right?

I have two more questions:
1. How should the Spark jars be archived when using {{spark.yarn.archive}}? It 
worked when I put the jars in a folder, but it didn't work when I tarred or 
zipped that folder.
2. I think the recommended config is to put the archive on HDFS? But if I 
provide a local path, does it require that all the NMs have the same archive in 
their local FS?

> Relax the requirement that HoS needs Spark built w/o Hive
> -
>
> Key: HIVE-15302
> URL: https://issues.apache.org/jira/browse/HIVE-15302
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rui Li
>Assignee: Rui Li
>
> This requirement becomes more and more unacceptable as SparkSQL becomes 
> widely adopted. Let's use this JIRA to find out how we can relax the 
> limitation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15312) reduce logging in certain places

2016-11-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707283#comment-15707283
 ] 

Hive QA commented on HIVE-15312:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840975/HIVE-15312.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2337/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2337/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2337/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2016-11-30 02:29:20.952
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-2337/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2016-11-30 02:29:20.954
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 4f2d1f5 HIVE-15124. Fix OrcInputFormat to use reader's schema 
for include boolean
+ git clean -f -d
Removing 
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/CompactionResponse.java
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 4f2d1f5 HIVE-15124. Fix OrcInputFormat to use reader's schema 
for include boolean
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2016-11-30 02:29:22.169
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
Going to apply patch with: patch -p0
patching file 
llap-server/src/java/org/apache/hadoop/hive/llap/shufflehandler/ShuffleHandler.java
patching file 
llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
patching file ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
patching file ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
patching file ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java
patching file ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java
+ [[ maven == \m\a\v\e\n ]]
+ rm -rf /data/hiveptest/working/maven/org/apache/hive
+ mvn -B clean install -DskipTests -T 4 -q 
-Dmaven.repo.local=/data/hiveptest/working/maven
ANTLR Parser Generator  Version 3.5.2
Output file 
/data/hiveptest/working/apache-github-source-source/metastore/target/generated-sources/antlr3/org/apache/hadoop/hive/metastore/parser/FilterParser.java
 does not exist: must build 
/data/hiveptest/working/apache-github-source-source/metastore/src/java/org/apache/hadoop/hive/metastore/parser/Filter.g
org/apache/hadoop/hive/metastore/parser/Filter.g
DataNucleus Enhancer (version 4.1.6) for API "JDO"
DataNucleus Enhancer : Classpath
>>  /usr/share/maven/boot/plexus-classworlds-2.x.jar
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MDatabase
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MFieldSchema
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MType
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MTable
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MConstraint
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MSerDeInfo
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MOrder
ENHANCED (Persistable) : 
org.apache.hadoop.hive.metastore.model.MColumnDescriptor
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MStringList
ENHANCED (Persistable) : 
org.apache.hadoop.hive.metastore.model.MStorageDescriptor
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MPartition
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.M

[jira] [Commented] (HIVE-15202) Concurrent compactions for the same partition may generate malformed folder structure

2016-11-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707279#comment-15707279
 ] 

Hive QA commented on HIVE-15202:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840974/HIVE-15202.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10749 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
 (batchId=133)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=133)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=91)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2336/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2336/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2336/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840974 - PreCommit-HIVE-Build

> Concurrent compactions for the same partition may generate malformed folder 
> structure
> -
>
> Key: HIVE-15202
> URL: https://issues.apache.org/jira/browse/HIVE-15202
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Rui Li
>Assignee: Eugene Koifman
> Attachments: HIVE-15202.01.patch, HIVE-15202.02.patch, 
> HIVE-15202.03.patch
>
>
> If two compactions run concurrently on a single partition, it may generate 
> folder structure like this: (nested base dir)
> {noformat}
> drwxr-xr-x   - root supergroup  0 2016-11-14 22:23 
> /user/hive/warehouse/test/z=1/base_007/base_007
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_0
> -rw-r--r--   3 root supergroup611 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_1
> -rw-r--r--   3 root supergroup614 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_2
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_3
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_4
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_5
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_6
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_7
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_8
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_9
> {noformat}
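The malformed layout above (a `base_N` directory containing another directory of the same name) can be detected with a simple directory check. The sketch below is purely illustrative, using hypothetical names on a local filesystem rather than Hive's actual compactor code or HDFS:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: detect the malformed "base_N/base_N" nesting shown above.
// NestedBaseCheck is a hypothetical helper, not code from Hive's compactor.
class NestedBaseCheck {
    // A base directory is malformed if it contains a directory of its own name.
    static boolean hasNestedBase(Path baseDir) {
        return Files.isDirectory(baseDir.resolve(baseDir.getFileName()));
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("warehouse");
        Path base = Files.createDirectories(tmp.resolve("z=1").resolve("base_007"));
        System.out.println(hasNestedBase(base)); // false: no nesting yet
        Files.createDirectory(base.resolve("base_007"));
        System.out.println(hasNestedBase(base)); // true: nested base detected
    }
}
```

In the real fix the check would run against HDFS paths via Hadoop's FileSystem API; the structure of the check is the same.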



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-15296) AM may lose task failures and not reschedule when scheduling to LLAP

2016-11-29 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-15296:
---

Assignee: Sergey Shelukhin  (was: Siddharth Seth)

> AM may lose task failures and not reschedule when scheduling to LLAP
> 
>
> Key: HIVE-15296
> URL: https://issues.apache.org/jira/browse/HIVE-15296
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> First attempt and failure detection:
> {noformat}
> 2016-11-18 20:20:01,980 [INFO] [TaskSchedulerEventHandlerThread] 
> |tezplugins.LlapTaskSchedulerService|: Received allocateRequest. 
> task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
> capability=, hosts=[3n01]
> 2016-11-18 20:20:01,982 [INFO] [LlapScheduler] 
> |tezplugins.LlapTaskSchedulerService|: Assigned task 
> TaskInfo{task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
> startTime=0, containerId=null, assignedInstance=null, uniqueId=55, 
> localityDelayTimeout=9223372036854775807} to container 
> container_1_2622_01_56 on node=DynamicServiceInstance 
> [alive=true, host=3n01:15001 with resources=, 
> shufflePort=15551, servicesAddress=http://3n01:15002, mgmtPort=15004]
> 2016-11-18 20:20:01,982 [INFO] [LlapScheduler] 
> |tezplugins.LlapTaskSchedulerService|: ScheduleResult for Task: 
> TaskInfo{task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
> startTime=10550817928, containerId=container_1_2622_01_56, 
> assignedInstance=DynamicServiceInstance [alive=true, host=3n01:15001 with 
> resources=, shufflePort=15551, 
> servicesAddress=http://3n01:15002, mgmtPort=15004], uniqueId=55, 
> localityDelayTimeout=9223372036854775807} = SCHEDULED
> 2016-11-18 20:20:03,427 [INFO] [Dispatcher thread {Central}] 
> |impl.TaskAttemptImpl|: TaskAttempt: 
> [attempt_1478967587833_2622_1_06_31_0] started. Is using containerId: 
> [container_1_2622_01_56] on NM: [3n01:15001]
> 2016-11-18 20:20:03,427 [INFO] [Dispatcher thread {Central}] 
> |history.HistoryEventHandler|: 
> [HISTORY][DAG:dag_1478967587833_2622_1][Event:TASK_ATTEMPT_STARTED]: 
> vertexName=Map 1, taskAttemptId=attempt_1478967587833_2622_1_06_31_0, 
> startTime=1479500403427, containerId=container_1_2622_01_56, 
> nodeId=3n01:15001
> 2016-11-18 20:20:03,430 [INFO] [TaskCommunicator # 1] 
> |tezplugins.LlapTaskCommunicator|: Successfully launched task: 
> attempt_1478967587833_2622_1_06_31_0
> 2016-11-18 20:20:03,434 [INFO] [IPC Server handler 11 on 43092] 
> |impl.TaskImpl|: TaskAttempt:attempt_1478967587833_2622_1_06_31_0 sent 
> events: (0-1).
> 2016-11-18 20:20:03,434 [INFO] [IPC Server handler 11 on 43092] 
> |impl.VertexImpl|: Sending attempt_1478967587833_2622_1_06_31_0 24 events 
> [0,24) total 24 vertex_1478967587833_2622_1_06 [Map 1]
> 2016-11-18 20:25:43,249 [INFO] [Dispatcher thread {Central}] 
> |history.HistoryEventHandler|: 
> [HISTORY][DAG:dag_1478967587833_2622_1][Event:TASK_ATTEMPT_FINISHED]: 
> vertexName=Map 1, taskAttemptId=attempt_1478967587833_2622_1_06_31_0, 
> creationTime=1479500401929, allocationTime=1479500403426, 
> startTime=1479500403427, finishTime=1479500743249, timeTaken=339822, 
> status=FAILED, taskFailureType=NON_FATAL, errorEnum=TASK_HEARTBEAT_ERROR, 
> diagnostics=AttemptID:attempt_1478967587833_2622_1_06_31_0 Timed out 
> after 300 secs, nodeHttpAddress=http://3n01:15002, counters=Counters: 1, 
> org.apache.tez.common.counters.DAGCounter, DATA_LOCAL_TASKS=1
> 2016-11-18 20:25:43,255 [INFO] [TaskSchedulerEventHandlerThread] 
> |tezplugins.LlapTaskSchedulerService|: Processing de-allocate request for 
> task=attempt_1478967587833_2622_1_06_31_0, state=ASSIGNED, endReason=OTHER
> 2016-11-18 20:25:43,259 [INFO] [Dispatcher thread {Central}] 
> |node.AMNodeImpl|: Attempt failed on node: 3n01:15001 TA: 
> attempt_1478967587833_2622_1_06_31_0 failed: true container: 
> container_1_2622_01_56 numFailedTAs: 7
> 2016-11-18 20:25:43,262 [INFO] [Dispatcher thread {Central}] 
> |impl.VertexImpl|: Source task attempt completed for vertex: 
> vertex_1478967587833_2622_1_07 [Reducer 2] attempt: 
> attempt_1478967587833_2622_1_06_31_0 with state: FAILED vertexState: 
> RUNNING
> {noformat}
> Second attempt:
> {noformat}
> 2016-11-18 20:25:43,267 [INFO] [TaskSchedulerEventHandlerThread] 
> |tezplugins.LlapTaskSchedulerService|: Received allocateRequest. 
> task=attempt_1478967587833_2622_1_06_31_1, priority=64, 
> capability=, hosts=null
> 2016-11-18 20:25:43,297 [INFO] [LlapScheduler] 
> |tezplugins.LlapTaskSchedulerService|: ScheduleResult for Task: 
> TaskInfo{task=attempt_1478967587833_2622_1_06_31_1, prio

[jira] [Updated] (HIVE-15312) reduce logging in certain places

2016-11-29 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15312:

Status: Patch Available  (was: Open)

> reduce logging in certain places
> 
>
> Key: HIVE-15312
> URL: https://issues.apache.org/jira/browse/HIVE-15312
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15312.patch
>
>







[jira] [Updated] (HIVE-15312) reduce logging in certain places

2016-11-29 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15312:

Attachment: HIVE-15312.patch

[~prasanth_j] can you take a look?

> reduce logging in certain places
> 
>
> Key: HIVE-15312
> URL: https://issues.apache.org/jira/browse/HIVE-15312
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15312.patch
>
>






[jira] [Commented] (HIVE-15115) Flaky test: TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]

2016-11-29 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707207#comment-15707207
 ] 

Prasanth Jayachandran commented on HIVE-15115:
--

Would it be possible to confirm whether both operating systems are using the 
same timezone? My guess is that the difference in file size is because of the 
difference in timezone. ORC stores the timezone id in string format in the file 
footer, so if the q.out files are generated in different timezones, the file 
sizes will differ. Can you confirm whether that's the case? To view the 
timezone information, you can run "hive --orcfiledump -t " and see what 
timezone gets printed on OSX and centos.

The other possibility is the ordering of rows. Generating ORC files with 
different row orderings will produce different file sizes because of run-length 
encoding. We usually avoid such flakiness by explicitly adding an ORDER BY to 
the INSERT or CTAS query, something like "INSERT OVERWRITE TABLE orctable 
SELECT * FROM src ORDER BY key". This avoids the issue of file sizes changing 
with the encoding (applicable to Parquet as well). 

> Flaky test: TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
> --
>
> Key: HIVE-15115
> URL: https://issues.apache.org/jira/browse/HIVE-15115
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Fix For: 2.2.0
>
> Attachments: HIVE-15115.patch
>
>
> This test was identified as flaky before, it seems it turned flaky again.
> Earlier Jira:
> [HIVE-14976|https://issues.apache.org/jira/browse/HIVE-14976]
> New flaky runs:
> https://builds.apache.org/job/PreCommit-HIVE-Build/1931/testReport
> https://builds.apache.org/job/PreCommit-HIVE-Build/1930/testReport
> {code}
> 516c516
> < totalSize   3220
> ---
> > totalSize   3224
> 569c569
> < totalSize   3220
> ---
> > totalSize   3224
> 634c634
> < totalSize   4577
> ---
> > totalSize   4581
> {code}





[jira] [Commented] (HIVE-15202) Concurrent compactions for the same partition may generate malformed folder structure

2016-11-29 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707200#comment-15707200
 ] 

Eugene Koifman commented on HIVE-15202:
---

Yes, that's a better way.  Made this change in patch 3

> Concurrent compactions for the same partition may generate malformed folder 
> structure
> -
>
> Key: HIVE-15202
> URL: https://issues.apache.org/jira/browse/HIVE-15202
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Rui Li
>Assignee: Eugene Koifman
> Attachments: HIVE-15202.01.patch, HIVE-15202.02.patch, 
> HIVE-15202.03.patch
>
>
> If two compactions run concurrently on a single partition, it may generate 
> folder structure like this: (nested base dir)
> {noformat}
> drwxr-xr-x   - root supergroup  0 2016-11-14 22:23 
> /user/hive/warehouse/test/z=1/base_007/base_007
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_0
> -rw-r--r--   3 root supergroup611 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_1
> -rw-r--r--   3 root supergroup614 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_2
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_3
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_4
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_5
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_6
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_7
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_8
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_9
> {noformat}





[jira] [Updated] (HIVE-15202) Concurrent compactions for the same partition may generate malformed folder structure

2016-11-29 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-15202:
--
Status: Patch Available  (was: Open)

> Concurrent compactions for the same partition may generate malformed folder 
> structure
> -
>
> Key: HIVE-15202
> URL: https://issues.apache.org/jira/browse/HIVE-15202
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Rui Li
>Assignee: Eugene Koifman
> Attachments: HIVE-15202.01.patch, HIVE-15202.02.patch, 
> HIVE-15202.03.patch
>
>
> If two compactions run concurrently on a single partition, it may generate 
> folder structure like this: (nested base dir)
> {noformat}
> drwxr-xr-x   - root supergroup  0 2016-11-14 22:23 
> /user/hive/warehouse/test/z=1/base_007/base_007
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_0
> -rw-r--r--   3 root supergroup611 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_1
> -rw-r--r--   3 root supergroup614 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_2
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_3
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_4
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_5
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_6
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_7
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_8
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_9
> {noformat}





[jira] [Updated] (HIVE-15202) Concurrent compactions for the same partition may generate malformed folder structure

2016-11-29 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-15202:
--
Attachment: HIVE-15202.03.patch

> Concurrent compactions for the same partition may generate malformed folder 
> structure
> -
>
> Key: HIVE-15202
> URL: https://issues.apache.org/jira/browse/HIVE-15202
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Rui Li
>Assignee: Eugene Koifman
> Attachments: HIVE-15202.01.patch, HIVE-15202.02.patch, 
> HIVE-15202.03.patch
>
>
> If two compactions run concurrently on a single partition, it may generate 
> folder structure like this: (nested base dir)
> {noformat}
> drwxr-xr-x   - root supergroup  0 2016-11-14 22:23 
> /user/hive/warehouse/test/z=1/base_007/base_007
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_0
> -rw-r--r--   3 root supergroup611 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_1
> -rw-r--r--   3 root supergroup614 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_2
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_3
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_4
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_5
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_6
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_7
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_8
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_9
> {noformat}





[jira] [Updated] (HIVE-15202) Concurrent compactions for the same partition may generate malformed folder structure

2016-11-29 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-15202:
--
Status: Open  (was: Patch Available)

> Concurrent compactions for the same partition may generate malformed folder 
> structure
> -
>
> Key: HIVE-15202
> URL: https://issues.apache.org/jira/browse/HIVE-15202
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Rui Li
>Assignee: Eugene Koifman
> Attachments: HIVE-15202.01.patch, HIVE-15202.02.patch
>
>
> If two compactions run concurrently on a single partition, it may generate 
> folder structure like this: (nested base dir)
> {noformat}
> drwxr-xr-x   - root supergroup  0 2016-11-14 22:23 
> /user/hive/warehouse/test/z=1/base_007/base_007
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_0
> -rw-r--r--   3 root supergroup611 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_1
> -rw-r--r--   3 root supergroup614 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_2
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_3
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_4
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_5
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_6
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_7
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_8
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_9
> {noformat}





[jira] [Updated] (HIVE-15202) Concurrent compactions for the same partition may generate malformed folder structure

2016-11-29 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-15202:
--
Attachment: HIVE-15202.02.patch

> Concurrent compactions for the same partition may generate malformed folder 
> structure
> -
>
> Key: HIVE-15202
> URL: https://issues.apache.org/jira/browse/HIVE-15202
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Rui Li
>Assignee: Eugene Koifman
> Attachments: HIVE-15202.01.patch, HIVE-15202.02.patch
>
>
> If two compactions run concurrently on a single partition, it may generate 
> folder structure like this: (nested base dir)
> {noformat}
> drwxr-xr-x   - root supergroup  0 2016-11-14 22:23 
> /user/hive/warehouse/test/z=1/base_007/base_007
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_0
> -rw-r--r--   3 root supergroup611 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_1
> -rw-r--r--   3 root supergroup614 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_2
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_3
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_4
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_5
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_6
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_7
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_8
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_9
> {noformat}





[jira] [Commented] (HIVE-15311) Analyze column stats should skip non-primitive column types

2016-11-29 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707174#comment-15707174
 ] 

Ashutosh Chauhan commented on HIVE-15311:
-

+1

> Analyze column stats should skip non-primitive column types
> ---
>
> Key: HIVE-15311
> URL: https://issues.apache.org/jira/browse/HIVE-15311
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15311.01.patch
>
>






[jira] [Commented] (HIVE-15311) Analyze column stats should skip non-primitive column types

2016-11-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707169#comment-15707169
 ] 

Hive QA commented on HIVE-15311:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840958/HIVE-15311.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10749 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
 (batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[columnstats_tbllvl_complex_type]
 (batchId=84)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2334/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2334/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2334/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840958 - PreCommit-HIVE-Build

> Analyze column stats should skip non-primitive column types
> ---
>
> Key: HIVE-15311
> URL: https://issues.apache.org/jira/browse/HIVE-15311
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15311.01.patch
>
>






[jira] [Commented] (HIVE-14453) refactor physical writing of ORC data and metadata to FS from the logical writers

2016-11-29 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707160#comment-15707160
 ] 

Sergey Shelukhin commented on HIVE-14453:
-

Different kinds of streams are not distinguished for data streams, only for 
index streams.
Protobufs are necessary for two reasons:
1) Only the physical writer knows certain things that are put in them, e.g. 
file offsets.
2) Alternative writers do not have to serialize protobufs as bytes (or at all).

An example implementation that uses ORC as the cache storage format for text 
tables is in the blocked JIRA, which now has a WIP patch.

I think it's actually no more brittle than any other separation of concerns: it 
separates the physical file on disk and its organization from the logic of 
writing the data and metadata. That is good even if it reduces flexibility 
somewhat, since it avoids monolithic dependencies between these two unrelated 
things.
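The separation being discussed can be sketched roughly as follows. The interface and class names below are hypothetical illustrations of the idea (a logical writer handing encoded blocks to a pluggable physical layer), not Hive's actual ORC APIs:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical sketch: the logical writer produces encoded stream blocks, and
// a pluggable PhysicalWriter decides where they go (HDFS, a write-through
// cache, etc.). Names are illustrative, not Hive's real interfaces.
interface PhysicalWriter {
    void writeStream(String name, byte[] block) throws IOException;
    long position(); // only the physical layer knows real file offsets
}

// A trivial in-memory backend, standing in for an alternative destination
// such as a cache that never materializes an HDFS stream.
class InMemoryPhysicalWriter implements PhysicalWriter {
    private final ByteArrayOutputStream buf = new ByteArrayOutputStream();

    @Override
    public void writeStream(String name, byte[] block) {
        buf.write(block, 0, block.length);
    }

    @Override
    public long position() {
        return buf.size();
    }
}

class LogicalWriterDemo {
    public static void main(String[] args) throws IOException {
        InMemoryPhysicalWriter pw = new InMemoryPhysicalWriter();
        // The logical side only encodes data; real offsets come back from the
        // physical layer, mirroring reason (1) above.
        pw.writeStream("data", new byte[]{1, 2, 3});
        System.out.println("offset after write = " + pw.position());
    }
}
```

Swapping `InMemoryPhysicalWriter` for an HDFS-backed implementation changes nothing in the logical writer, which is the decoupling the comment argues for.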


> refactor physical writing of ORC data and metadata to FS from the logical 
> writers
> -
>
> Key: HIVE-14453
> URL: https://issues.apache.org/jira/browse/HIVE-14453
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14453.01.patch, HIVE-14453.02.patch, 
> HIVE-14453.03.patch, HIVE-14453.patch
>
>
> ORC data doesn't have to go directly into an HDFS stream via buffers, it can 
> go somewhere else (e.g. a write-thru cache, or an addressable system that 
> doesn't require the stream blocks to be held in memory before writing them 
> all together).
> To that effect, it would be nice to abstract the data block/metadata 
> structure creating from the physical file concerns.





[jira] [Commented] (HIVE-15308) Create ACID table failed intermittently: due to Postgres (SQLState=25P02, ErrorCode=0)

2016-11-29 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707133#comment-15707133
 ] 

Eugene Koifman commented on HIVE-15308:
---

TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] is listed as flaky 
in HIVE-14936; the other failures have age > 1.

[~wzheng] could you review please?

> Create ACID table failed intermittently: due to Postgres (SQLState=25P02, 
> ErrorCode=0)
> --
>
> Key: HIVE-15308
> URL: https://issues.apache.org/jira/browse/HIVE-15308
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.1.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-15308.01.patch
>
>
> If 2 concurrent calls to MutexApi.acquireLock() happen with the same "key" 
> value and there is no row in AUX_TABLE for that value yet (i.e. both are 
> attempting to insert it), Postgres aborts the transaction that gets the 
> duplicate-key error, and no more statements can be executed on that 
> transaction. (This is different from the way most DBs behave.)
> {noformat}
> Caused by: MetaException(message:Unable to lock 'CheckLock' due to: ERROR: 
> current transaction is aborted, commands ignored until end of transaction 
> block (SQLState=25P02, ErrorCode=0); org.postgresql.util.PSQLException: 
> ERROR: current transaction is aborted, commands ignored until end of 
> transaction block
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2198)
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1927)
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
>   at 
> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:561)
>   at 
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:405)
>   at 
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:285)
>   at 
> com.jolbox.bonecp.StatementHandle.executeQuery(StatementHandle.java:464)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.acquireLock(TxnHandler.java:3250)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLock(TxnHandler.java:2319)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLockWithRetry(TxnHandler.java:1022)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:794)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:5941)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
>   at com.sun.proxy.$Proxy30.lock(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.lock(HiveMetaStoreClient.java:2109)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154)
>   at com.sun.proxy.$Proxy31.lock(Unknown Source)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2259)
>   at com.sun.proxy.$Proxy31.lock(Unknown Source)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$SynchronizedMetaStoreClient.lock(DbTxnManager.java:745)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.DbLockManager.lock(DbLockManager.java:103)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:341)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocksWithHeartbeatDelay(DbTxnManager.java:357)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:

[jira] [Updated] (HIVE-15309) Miscellaneous logging clean up

2016-11-29 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-15309:
--
Status: Patch Available  (was: Open)

> Miscellaneous logging clean up
> --
>
> Key: HIVE-15309
> URL: https://issues.apache.org/jira/browse/HIVE-15309
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-15309.01.patch
>
>
> OrcAcidUtils.getLastFlushLength() should check for file existence first. 
> Currently it causes unnecessary/confusing logging:
> {noformat}
> org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File 
> does not exist: /domains/adl/rrslog/data_history/rrslog/r\
> rslog/hot/server_date=2016-08-19/delta_0005913_0005913/bucket_00023_flush_length
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1860)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1831)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1744)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:693)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolSe\
> rverSideTranslatorPB.java:373)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientName\
> nodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
> at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1552)
> at org.apache.hadoop.ipc.Client.call(Client.java:1496)
> at org.apache.hadoop.ipc.Client.call(Client.java:1396)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
> at com.sun.proxy.$Proxy14.getBlockLocations(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB\
> .java:270)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:278)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:194)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:176)
> at com.sun.proxy.$Proxy15.getBlockLocations(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1236)
> at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1223)
> at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1211)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:309)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:274)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:266)
> at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1536)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:330)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:326)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:326)
> at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:782)
> at 
> org.apache.hadoop.hive.ql.io.orc.

[jira] [Updated] (HIVE-15309) Miscellaneous logging clean up

2016-11-29 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-15309:
--
Attachment: HIVE-15309.01.patch

[~wzheng] could you review please
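
The fix direction in the description below — probe for the side file before opening it — might look like the sketch here, with `java.nio` standing in for the HDFS `FileSystem` API. The method name, the `-1` sentinel, and the single-record read are assumptions for illustration, not the committed patch:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;

public class FlushLengthReader {
    /**
     * Returns the recorded flush length, or -1 if the side file is absent.
     * Probing with exists() first avoids the FileNotFoundException (and the
     * resulting log noise) that opening a missing file would produce.
     */
    public static long getLastFlushLength(Path sideFile) throws IOException {
        if (!Files.exists(sideFile)) {
            return -1L; // no side file means nothing was flushed yet
        }
        byte[] bytes = Files.readAllBytes(sideFile);
        // Simplification: read one 8-byte record (a real _flush_length file
        // appends one long per flush; the last record would be authoritative).
        return ByteBuffer.wrap(bytes).getLong();
    }
}
```

The existence check trades one extra namenode round trip for never surfacing the spurious exception in the logs.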

> Miscellaneous logging clean up
> --
>
> Key: HIVE-15309
> URL: https://issues.apache.org/jira/browse/HIVE-15309
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-15309.01.patch
>
>
> OrcAcidUtils.getLastFlushLength() should check for file existence first.  
> Currently causes unnecessary/confusing logging:
> (FileNotFoundException stack trace identical to the one quoted in the previous update, omitted)

[jira] [Commented] (HIVE-14453) refactor physical writing of ORC data and metadata to FS from the logical writers

2016-11-29 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707113#comment-15707113
 ] 

Owen O'Malley commented on HIVE-14453:
--

The PhysicalWriter interface seems brittle in the face of changes. In 
particular, calling out the different kinds of streams as different methods 
seems problematic.

What is the goal of this API? Can we implement this at the byte level rather 
than passing protobufs around?
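
A byte-oriented shape for such an interface might look like the sketch below. The names `PhysicalSink` and `InMemorySink` are invented for this illustration and are not the API under review:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

/**
 * A byte-level writing abstraction: the logical ORC writer hands over
 * opaque byte ranges, instead of passing typed protobuf objects around.
 * One implementation targets an HDFS stream; others could target a
 * write-through cache or any other addressable store.
 */
interface PhysicalSink {
    /** Append an opaque run of bytes (stream data, metadata, footer...). */
    void append(ByteBuffer bytes) throws IOException;
    /** Current position, so the logical writer can record offsets. */
    long position();
    void close() throws IOException;
}

/** Minimal in-memory implementation, standing in for a cache target. */
final class InMemorySink implements PhysicalSink {
    private final ByteArrayOutputStream buf = new ByteArrayOutputStream();
    public void append(ByteBuffer bytes) {
        while (bytes.hasRemaining()) buf.write(bytes.get());
    }
    public long position() { return buf.size(); }
    public void close() { /* nothing to flush in memory */ }
    byte[] contents() { return buf.toByteArray(); }
}
```

Keeping the contract at the byte level means adding a new kind of stream does not change the interface; only the logical writer has to know what the bytes mean.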

> refactor physical writing of ORC data and metadata to FS from the logical 
> writers
> -
>
> Key: HIVE-14453
> URL: https://issues.apache.org/jira/browse/HIVE-14453
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14453.01.patch, HIVE-14453.02.patch, 
> HIVE-14453.03.patch, HIVE-14453.patch
>
>
> ORC data doesn't have to go directly into an HDFS stream via buffers, it can 
> go somewhere else (e.g. a write-thru cache, or an addressable system that 
> doesn't require the stream blocks to be held in memory before writing them 
> all together).
> To that effect, it would be nice to abstract the data block/metadata 
> structure creating from the physical file concerns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15297) Hive should not split semicolon within quoted string literals

2016-11-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707065#comment-15707065
 ] 

Hive QA commented on HIVE-15297:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840932/HIVE-15297.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 125 failed/errored test(s), 10749 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[allcolref_in_udf] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter5] (batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_table_null_partition]
 (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] 
(batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_filter] 
(batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_groupby2] 
(batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_groupby] 
(batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_reordering_values]
 (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin11] 
(batchId=64)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_annotate_stats_groupby]
 (batchId=76)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[char_cast] (batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnstats_partlvl_dp] 
(batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[compustat_avro] 
(batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view_translate] 
(batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[database_drop] 
(batchId=54)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[escape_comments] 
(batchId=68)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[infer_bucket_sort] 
(batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[infer_bucket_sort_convert_join]
 (batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[infer_bucket_sort_grouping_operators]
 (batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[infer_bucket_sort_multi_insert]
 (batchId=76)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input19] (batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_into4] 
(batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_overwrite_directory]
 (batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_2] 
(batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_query_multiskew_1]
 (batchId=41)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_query_oneskew_1]
 (batchId=54)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_query_oneskew_2]
 (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_ends_with_nulls] 
(batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_windowing1] 
(batchId=40)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_windowing2] 
(batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[query_with_semi] 
(batchId=76)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample5] (batchId=51)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[serde_opencsv] 
(batchId=67)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[specialChar] (batchId=21)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_bin] (batchId=41)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_conv] (batchId=21)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_hex] (batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_inline] (batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_sentences] 
(batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_sort_array_by] 
(batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[varchar_cast] 
(batchId=53)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_4] 
(batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_include_no_sel] 
(batchId=4)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_varchar_4] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_math_funcs] 
(batchI

[jira] [Commented] (HIVE-15112) Implement Parquet vectorization reader for Struct type

2016-11-29 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707031#comment-15707031
 ] 

Ferdinand Xu commented on HIVE-15112:
-

>   Does that mean even if we have fully-implemented vectorized reader for 
> complex types Hive will still use the row-by-row engine? 

Yes, it still uses the row-by-row engine AFAIK, and I found no tracking JIRA under 
the original umbrella ticket HIVE-4160. Hi [~jnp], do you have any details 
about compound type support in vectorization? Thank you!
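
For illustration of what a vectorized struct read buys over row-by-row processing: a struct column is materialized as one child vector per field, filled a whole batch at a time. The classes below are stand-ins invented for this sketch (Hive's real types live in `org.apache.hadoop.hive.ql.exec.vector`):

```java
/** Stand-in for a LongColumnVector (sketch only). */
final class LongColumn {
    final long[] vector;
    LongColumn(int batchSize) { vector = new long[batchSize]; }
}

/** Stand-in for a StructColumnVector: one child vector per struct field. */
final class StructColumn {
    final LongColumn[] fields; // e.g. struct<id:bigint,score:bigint>
    StructColumn(int numFields, int batchSize) {
        fields = new LongColumn[numFields];
        for (int f = 0; f < numFields; f++) fields[f] = new LongColumn(batchSize);
    }
}

public class StructBatchDemo {
    /**
     * A vectorized reader fills each child column with a tight per-field
     * loop over the whole batch, instead of assembling one struct object
     * per row as a row-by-row reader would.
     */
    public static StructColumn readBatch(long[][] columnarSource, int batchSize) {
        StructColumn out = new StructColumn(columnarSource.length, batchSize);
        for (int f = 0; f < columnarSource.length; f++) { // one pass per field
            System.arraycopy(columnarSource[f], 0, out.fields[f].vector, 0, batchSize);
        }
        return out;
    }
}
```

Because Parquet already stores each struct field as its own column chunk, the per-field copy maps directly onto the file layout.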

> Implement Parquet vectorization reader for Struct type
> --
>
> Key: HIVE-15112
> URL: https://issues.apache.org/jira/browse/HIVE-15112
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
>
> Like HIVE-14815, we need to support the Parquet vectorized reader for struct type.





[jira] [Updated] (HIVE-15311) Analyze column stats should skip non-primitive column types

2016-11-29 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15311:
---
Status: Patch Available  (was: Open)

> Analyze column stats should skip non-primitive column types
> ---
>
> Key: HIVE-15311
> URL: https://issues.apache.org/jira/browse/HIVE-15311
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15311.01.patch
>
>






[jira] [Updated] (HIVE-15311) Analyze column stats should skip non-primitive column types

2016-11-29 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15311:
---
Attachment: HIVE-15311.01.patch

> Analyze column stats should skip non-primitive column types
> ---
>
> Key: HIVE-15311
> URL: https://issues.apache.org/jira/browse/HIVE-15311
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15311.01.patch
>
>






[jira] [Commented] (HIVE-15308) Create ACID table failed intermittently: due to Postgres (SQLState=25P02, ErrorCode=0)

2016-11-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706972#comment-15706972
 ] 

Hive QA commented on HIVE-15308:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840929/HIVE-15308.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10746 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
 (batchId=133)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=133)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2332/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2332/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2332/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840929 - PreCommit-HIVE-Build
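
The Postgres behavior named in this issue — after any statement error the whole transaction is aborted, and every further statement fails with SQLState 25P02 until rollback — can be modeled without a database. Everything below is a toy invented for the sketch, not Hive's TxnHandler fix:

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of a Postgres-style connection: one error poisons the txn. */
final class ToyPgTxn {
    private static final String ABORTED_MSG =
        "current transaction is aborted, commands ignored until end of transaction block";
    private final Set<String> keys = new HashSet<>();
    private boolean aborted = false;

    /** INSERT key; a duplicate raises an error AND aborts the transaction. */
    void insert(String key) {
        if (aborted) throw new IllegalStateException(ABORTED_MSG);
        if (!keys.add(key)) {
            aborted = true; // unlike most DBs, the txn is now unusable
            throw new IllegalStateException("duplicate key: " + key);
        }
    }

    boolean query(String key) {
        if (aborted) throw new IllegalStateException(ABORTED_MSG);
        return keys.contains(key);
    }

    /** Only ROLLBACK revives the session (and discards the txn's work). */
    void rollback() { aborted = false; keys.clear(); }
}
```

This is why two concurrent acquireLock() calls racing to insert the same AUX_TABLE key need either a retry on a fresh transaction or a savepoint around the insert when running on Postgres.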

> Create ACID table failed intermittently: due to Postgres (SQLState=25P02, 
> ErrorCode=0)
> --
>
> Key: HIVE-15308
> URL: https://issues.apache.org/jira/browse/HIVE-15308
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.1.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-15308.01.patch
>
>
> if 2 concurrent calls to MutexApi.acquireLock() happen with the same "key" 
> value and there is no row in AUX_TABLE for that value yet (i.e. both are 
> attempting to insert it) Postgres kills the txn which gets the Duplicate Key 
> error and no more statements can be executed on this txn.
> (This is different from the way most DBs behave).
> {noformat}
> Caused by: MetaException(message:Unable to lock 'CheckLock' due to: ERROR: 
> current transaction is aborted, commands ignored until end of transaction 
> block (SQLState=25P02, ErrorCode=0); org.postgresql.util.PSQLException: 
> ERROR: current transaction is aborted, commands ignored until end of 
> transaction block
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2198)
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1927)
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
>   at 
> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:561)
>   at 
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:405)
>   at 
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:285)
>   at 
> com.jolbox.bonecp.StatementHandle.executeQuery(StatementHandle.java:464)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.acquireLock(TxnHandler.java:3250)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLock(TxnHandler.java:2319)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLockWithRetry(TxnHandler.java:1022)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:794)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:5941)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
>   at com.sun.proxy.$Proxy30.lock(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.lock(HiveMetaStoreClient.java:2109)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeM

[jira] [Commented] (HIVE-15202) Concurrent compactions for the same partition may generate malformed folder structure

2016-11-29 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706906#comment-15706906
 ] 

Wei Zheng commented on HIVE-15202:
--

Patch looks good, although I think an alternative way to define 
CompactionResponse would be {id, state, isNew (boolean)}.

+1

> Concurrent compactions for the same partition may generate malformed folder 
> structure
> -
>
> Key: HIVE-15202
> URL: https://issues.apache.org/jira/browse/HIVE-15202
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Rui Li
>Assignee: Eugene Koifman
> Attachments: HIVE-15202.01.patch
>
>
> If two compactions run concurrently on a single partition, it may generate a 
> folder structure like this (nested base dir):
> {noformat}
> drwxr-xr-x   - root supergroup  0 2016-11-14 22:23 
> /user/hive/warehouse/test/z=1/base_007/base_007
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_0
> -rw-r--r--   3 root supergroup611 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_1
> -rw-r--r--   3 root supergroup614 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_2
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_3
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_4
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_5
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_6
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_7
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_8
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_9
> {noformat}





[jira] [Updated] (HIVE-15124) Fix OrcInputFormat to use reader's schema for include boolean array

2016-11-29 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-15124:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

I just committed this. Thanks for the review, Prasanth.

> Fix OrcInputFormat to use reader's schema for include boolean array
> ---
>
> Key: HIVE-15124
> URL: https://issues.apache.org/jira/browse/HIVE-15124
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 1.2.1
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.2.0
>
> Attachments: HIVE-15124.patch, HIVE-15124.patch, HIVE-15124.patch, 
> HIVE-15124.patch
>
>
> Currently, the OrcInputFormat uses the file's schema rather than the reader's 
> schema. This means that SchemaEvolution fails with an 
> ArrayIndexOutOfBoundsException if a partition has a different schema than the 
> table.





[jira] [Commented] (HIVE-14453) refactor physical writing of ORC data and metadata to FS from the logical writers

2016-11-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706859#comment-15706859
 ] 

Hive QA commented on HIVE-14453:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840923/HIVE-14453.03.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10746 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=133)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] 
(batchId=91)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2331/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2331/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2331/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840923 - PreCommit-HIVE-Build

> refactor physical writing of ORC data and metadata to FS from the logical 
> writers
> -
>
> Key: HIVE-14453
> URL: https://issues.apache.org/jira/browse/HIVE-14453
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14453.01.patch, HIVE-14453.02.patch, 
> HIVE-14453.03.patch, HIVE-14453.patch
>
>
> ORC data doesn't have to go directly into an HDFS stream via buffers, it can 
> go somewhere else (e.g. a write-thru cache, or an addressable system that 
> doesn't require the stream blocks to be held in memory before writing them 
> all together).
> To that effect, it would be nice to abstract the data block/metadata 
> structure creating from the physical file concerns.





[jira] [Updated] (HIVE-15035) Clean up Hive licenses for binary distribution

2016-11-29 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-15035:
--
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Patch committed

> Clean up Hive licenses for binary distribution
> --
>
> Key: HIVE-15035
> URL: https://issues.apache.org/jira/browse/HIVE-15035
> Project: Hive
>  Issue Type: Bug
>  Components: distribution
>Affects Versions: 2.1.0
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 2.2.0
>
> Attachments: HIVE-15035.2.patch, HIVE-15035.3.patch, HIVE-15035.patch
>
>
> Hive's current LICENSE file contains information not needed for the source 
> distribution.  For the binary distribution we are missing many license files 
> as a number of jars included in Hive come with various licenses.  This all 
> needs to be cleaned up.





[jira] [Assigned] (HIVE-15310) CTAS fails when target location contains multiple directories levels that don't exist

2016-11-29 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned HIVE-15310:
--

Assignee: Vihang Karajgaonkar

> CTAS fails when target location contains multiple directories levels that 
> don't exist
> -
>
> Key: HIVE-15310
> URL: https://issues.apache.org/jira/browse/HIVE-15310
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sahil Takiar
>Assignee: Vihang Karajgaonkar
>
> The second query below fails if the {{/tmp/}} directory is empty:
> {code}
> create table test1 (id int) location '/tmp/test1/one/two/';
> create table test2 location '/tmp/test2/one/two' as select * from test1;
> {code}
> The stacktrace is:
> {code}
> Error: org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask. Unable to move source 
> file:/var/folders/jb/350gyf853s91hk5xk06jyw2wgp/T/stakiar/f6c5c246-3209-4d76-aaf4-1ac91436f59a/hive_2016-11-29_14-02-33_047_219358498634208469-1/-mr-10002
>  to destination /tmp/test2/one/two
>   at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:393)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:250)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:340)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:353)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move 
> source 
> file:/var/folders/jb/350gyf853s91hk5xk06jyw2wgp/T/stakiar/f6c5c246-3209-4d76-aaf4-1ac91436f59a/hive_2016-11-29_14-02-33_047_219358498634208469-1/-mr-10002
>  to destination /tmp/test2/one/two
>   at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:104)
>   at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:263)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2166)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1822)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1510)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1221)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1216)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:248)
>   ... 11 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move 
> source 
> file:/var/folders/jb/350gyf853s91hk5xk06jyw2wgp/T/stakiar/f6c5c246-3209-4d76-aaf4-1ac91436f59a/hive_2016-11-29_14-02-33_047_219358498634208469-1/-mr-10002
>  to destination /tmp/test2/one/two
>   at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:3119)
>   at 
> org.apache.hadoop.hive.ql.exec.MoveTask.moveFileInDfs(MoveTask.java:120)
>   at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:97)
>   ... 20 more
> Caused by: java.io.FileNotFoundException: File /tmp/test2/one does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:609)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:822)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:599)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
>   at 
> org.apache.hadoop.hive.io.HdfsUtils$HadoopFileStatus.&lt;init&gt;(HdfsUtils.java:178)
>   at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:3029)
>   ... 22 more (state=08S01,code=1)
> {code}
> The second query works if the target location is simply {{/tmp/test2/}}
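The FileNotFoundException above shows the move failing because the destination's parent directories ({{/tmp/test2/one}}) were never created. A minimal sketch of the missing step, written against java.nio rather than Hive's MoveTask/Hadoop FileSystem APIs, so the class and method names here are illustrative only:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Illustrative sketch only: mirrors what a fix in the move path would need
// to do, but uses java.nio instead of Hadoop's FileSystem API.
class MoveWithParents {
    // Move 'src' to 'dst', first creating any missing parent levels of
    // 'dst' -- the step whose absence causes the FileNotFoundException.
    static void move(Path src, Path dst) throws IOException {
        Path parent = dst.getParent();
        if (parent != null && !Files.exists(parent)) {
            Files.createDirectories(parent); // creates all missing levels
        }
        Files.move(src, dst, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

With this ordering, a destination like {{/tmp/test2/one/two}} works even when {{/tmp/test2}} is empty or absent.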



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15232) Add notification events for functions and indexes

2016-11-29 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706769#comment-15706769
 ] 

Vaibhav Gumashta commented on HIVE-15232:
-

Thanks [~stakiar]!

> Add notification events for functions and indexes
> -
>
> Key: HIVE-15232
> URL: https://issues.apache.org/jira/browse/HIVE-15232
> Project: Hive
>  Issue Type: Improvement
>  Components: repl
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Fix For: 2.2.0
>
> Attachments: HIVE-15232.1.patch, HIVE-15232.2.patch, 
> HIVE-15232.2.patch, HIVE-15232.2.patch
>
>
> Create/Drop Function and Create/Drop/Alter Index should also generate 
> metastore notification events.





[jira] [Issue Comment Deleted] (HIVE-15232) Add notification events for functions and indexes

2016-11-29 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-15232:

Comment: was deleted

(was: Thanks [~stakiar]!)

> Add notification events for functions and indexes
> -
>
> Key: HIVE-15232
> URL: https://issues.apache.org/jira/browse/HIVE-15232
> Project: Hive
>  Issue Type: Improvement
>  Components: repl
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Fix For: 2.2.0
>
> Attachments: HIVE-15232.1.patch, HIVE-15232.2.patch, 
> HIVE-15232.2.patch, HIVE-15232.2.patch
>
>
> Create/Drop Function and Create/Drop/Alter Index should also generate 
> metastore notification events.





[jira] [Commented] (HIVE-15232) Add notification events for functions and indexes

2016-11-29 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706772#comment-15706772
 ] 

Vaibhav Gumashta commented on HIVE-15232:
-

Thanks [~mohitsabharwal]!

> Add notification events for functions and indexes
> -
>
> Key: HIVE-15232
> URL: https://issues.apache.org/jira/browse/HIVE-15232
> Project: Hive
>  Issue Type: Improvement
>  Components: repl
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Fix For: 2.2.0
>
> Attachments: HIVE-15232.1.patch, HIVE-15232.2.patch, 
> HIVE-15232.2.patch, HIVE-15232.2.patch
>
>
> Create/Drop Function and Create/Drop/Alter Index should also generate 
> metastore notification events.





[jira] [Commented] (HIVE-15074) Schematool provides a way to detect invalid entries in VERSION table

2016-11-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706741#comment-15706741
 ] 

Hive QA commented on HIVE-15074:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840920/HIVE-15074.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10747 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
 (batchId=133)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=133)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=91)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2330/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2330/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2330/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840920 - PreCommit-HIVE-Build

> Schematool provides a way to detect invalid entries in VERSION table
> 
>
> Key: HIVE-15074
> URL: https://issues.apache.org/jira/browse/HIVE-15074
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Yongzhi Chen
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-15074.1.patch, HIVE-15074.patch
>
>
> For an unknown reason, we have seen a customer's HMS fail to start because 
> there were multiple entries in the HMS VERSION table. Schematool should 
> provide a way to validate the HMS db and offer warning and fix options for 
> this kind of issue. 
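The validation the description asks for reduces to checking that the VERSION table holds exactly one row. A hypothetical sketch of that check over already-fetched rows (the class and method names are made up for illustration, not schematool's actual API):

```java
import java.util.List;

// Hypothetical sketch of the core schematool validation: a healthy HMS
// schema has exactly one row in the VERSION table.
class VersionCheck {
    // Returns a warning message, or null when the VERSION table is healthy.
    static String validateVersionRows(List<String> schemaVersions) {
        if (schemaVersions.isEmpty()) {
            return "VERSION table is empty; run schematool -initSchema";
        }
        if (schemaVersions.size() > 1) {
            return "VERSION table has " + schemaVersions.size()
                 + " rows; expected 1 -- remove the invalid entries";
        }
        return null; // exactly one row: OK
    }
}
```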





[jira] [Commented] (HIVE-15035) Clean up Hive licenses for binary distribution

2016-11-29 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706704#comment-15706704
 ] 

Owen O'Malley commented on HIVE-15035:
--

+1 This looks good and is an improvement over the current ones.

> Clean up Hive licenses for binary distribution
> --
>
> Key: HIVE-15035
> URL: https://issues.apache.org/jira/browse/HIVE-15035
> Project: Hive
>  Issue Type: Bug
>  Components: distribution
>Affects Versions: 2.1.0
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-15035.2.patch, HIVE-15035.3.patch, HIVE-15035.patch
>
>
> Hive's current LICENSE file contains information not needed for the source 
> distribution.  For the binary distribution we are missing many license files 
> as a number of jars included in Hive come with various licenses.  This all 
> needs to be cleaned up.





[jira] [Updated] (HIVE-15310) CTAS fails when target location contains multiple directories levels that don't exist

2016-11-29 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-15310:

Description: 
The second query below fails, if the {{/tmp/}} directory is empty:

{code}
create table test1 (id int) location '/tmp/test1/one/two/';
create table test2 location '/tmp/test2/one/two' as select * from test1;
{code}

The stacktrace is:

{code}
Error: org.apache.hive.service.cli.HiveSQLException: Error while processing 
statement: FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.MoveTask. Unable to move source 
file:/var/folders/jb/350gyf853s91hk5xk06jyw2wgp/T/stakiar/f6c5c246-3209-4d76-aaf4-1ac91436f59a/hive_2016-11-29_14-02-33_047_219358498634208469-1/-mr-10002
 to destination /tmp/test2/one/two
at 
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:393)
at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:250)
at 
org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:340)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:353)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move 
source 
file:/var/folders/jb/350gyf853s91hk5xk06jyw2wgp/T/stakiar/f6c5c246-3209-4d76-aaf4-1ac91436f59a/hive_2016-11-29_14-02-33_047_219358498634208469-1/-mr-10002
 to destination /tmp/test2/one/two
at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:104)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:263)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2166)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1822)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1510)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1221)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1216)
at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:248)
... 11 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move 
source 
file:/var/folders/jb/350gyf853s91hk5xk06jyw2wgp/T/stakiar/f6c5c246-3209-4d76-aaf4-1ac91436f59a/hive_2016-11-29_14-02-33_047_219358498634208469-1/-mr-10002
 to destination /tmp/test2/one/two
at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:3119)
at 
org.apache.hadoop.hive.ql.exec.MoveTask.moveFileInDfs(MoveTask.java:120)
at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:97)
... 20 more
Caused by: java.io.FileNotFoundException: File /tmp/test2/one does not exist
at 
org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:609)
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:822)
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:599)
at 
org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
at 
org.apache.hadoop.hive.io.HdfsUtils$HadoopFileStatus.&lt;init&gt;(HdfsUtils.java:178)
at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:3029)
... 22 more (state=08S01,code=1)
{code}

The second query works if the target location is simply {{/tmp/test2/}}

  was:
The following query fails, if the {{/tmp/}} directory is empty:

{code}
create table test1 (id int) location '/tmp/test1/one/two/';
create table test2 location '/tmp/test2/one/two' as select * from test1;
{code}

The stacktrace is:

{code}
Error: org.apache.hive.service.cli.HiveSQLException: Error while processing 
statement: FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.MoveTask. Unable to move source 
file:/var/folders/jb/350gyf853s91hk5xk06jyw2wgp/T/stakiar/f6c5c246-3209-4d76-aaf4-1ac91436f59a/hive_2016-11-29_14-02-33_047_219358498634208469-1/-mr-10002
 to destination /tmp/test2/one/two
at 
org.apache.hive.service.cli.ope

[jira] [Commented] (HIVE-15074) Schematool provides a way to detect invalid entries in VERSION table

2016-11-29 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706688#comment-15706688
 ] 

Chaoyu Tang commented on HIVE-15074:


The failed tests are not related to this patch.

> Schematool provides a way to detect invalid entries in VERSION table
> 
>
> Key: HIVE-15074
> URL: https://issues.apache.org/jira/browse/HIVE-15074
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Yongzhi Chen
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-15074.1.patch, HIVE-15074.patch
>
>
> For an unknown reason, we have seen a customer's HMS fail to start because 
> there were multiple entries in the HMS VERSION table. Schematool should 
> provide a way to validate the HMS db and offer warning and fix options for 
> this kind of issue. 





[jira] [Updated] (HIVE-15297) Hive should not split semicolon within quoted string literals

2016-11-29 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15297:
---
Attachment: HIVE-15297.02.patch

> Hive should not split semicolon within quoted string literals
> -
>
> Key: HIVE-15297
> URL: https://issues.apache.org/jira/browse/HIVE-15297
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15297.01.patch, HIVE-15297.02.patch
>
>
> String literals in a query cannot contain reserved symbols. The same set of 
> queries works fine in MySQL and PostgreSQL. 
> {code}
> hive> CREATE TABLE ts(s varchar(550));
> OK
> Time taken: 0.075 seconds
> hive> INSERT INTO ts VALUES ('Mozilla/5.0 (iPhone; CPU iPhone OS 5_0');
> MismatchedTokenException(14!=326)
>   at 
> org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:617)
>   at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valueRowConstructor(HiveParser_FromClauseParser.java:7271)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valuesTableConstructor(HiveParser_FromClauseParser.java:7370)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valuesClause(HiveParser_FromClauseParser.java:7510)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.valuesClause(HiveParser.java:51854)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:45432)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:44578)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:8)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1694)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1176)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:204)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:402)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:326)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1169)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1288)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1095)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1083)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> FAILED: ParseException line 1:31 mismatched input '/' expecting ) near 
> 'Mozilla' in value row constructor
> hive>
> {code}
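The failure happens because the CLI splits its input on every ';' before the parser ever sees it, so a semicolon inside a quoted literal truncates the statement. A quote-aware splitter treats ';' as a terminator only outside single- or double-quoted literals. A simplified sketch of that idea (not Hive's actual implementation; it ignores escape sequences and comments):

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of quote-aware statement splitting; a real fix must
// also handle escape sequences and comments, which this omits.
class StatementSplitter {
    static List<String> split(String line) {
        List<String> stmts = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        char quote = 0;                      // 0 = not inside a literal
        for (char c : line.toCharArray()) {
            if (quote != 0) {                // inside a quoted literal
                cur.append(c);
                if (c == quote) quote = 0;   // literal closed
            } else if (c == '\'' || c == '"') {
                quote = c;                   // literal opened
                cur.append(c);
            } else if (c == ';') {           // terminator only outside quotes
                stmts.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        if (cur.length() > 0) stmts.add(cur.toString());
        return stmts;
    }
}
```

With this, the INSERT above stays a single statement even though its literal contains a semicolon.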





[jira] [Updated] (HIVE-15297) Hive should not split semicolon within quoted string literals

2016-11-29 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15297:
---
Status: Patch Available  (was: Open)

> Hive should not split semicolon within quoted string literals
> -
>
> Key: HIVE-15297
> URL: https://issues.apache.org/jira/browse/HIVE-15297
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15297.01.patch, HIVE-15297.02.patch
>
>
> String literals in a query cannot contain reserved symbols. The same set of 
> queries works fine in MySQL and PostgreSQL. 
> {code}
> hive> CREATE TABLE ts(s varchar(550));
> OK
> Time taken: 0.075 seconds
> hive> INSERT INTO ts VALUES ('Mozilla/5.0 (iPhone; CPU iPhone OS 5_0');
> MismatchedTokenException(14!=326)
>   at 
> org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:617)
>   at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valueRowConstructor(HiveParser_FromClauseParser.java:7271)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valuesTableConstructor(HiveParser_FromClauseParser.java:7370)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valuesClause(HiveParser_FromClauseParser.java:7510)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.valuesClause(HiveParser.java:51854)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:45432)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:44578)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:8)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1694)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1176)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:204)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:402)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:326)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1169)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1288)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1095)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1083)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> FAILED: ParseException line 1:31 mismatched input '/' expecting ) near 
> 'Mozilla' in value row constructor
> hive>
> {code}





[jira] [Updated] (HIVE-15297) Hive should not split semicolon within quoted string literals

2016-11-29 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15297:
---
Status: Open  (was: Patch Available)

> Hive should not split semicolon within quoted string literals
> -
>
> Key: HIVE-15297
> URL: https://issues.apache.org/jira/browse/HIVE-15297
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15297.01.patch, HIVE-15297.02.patch
>
>
> String literals in a query cannot contain reserved symbols. The same set of 
> queries works fine in MySQL and PostgreSQL. 
> {code}
> hive> CREATE TABLE ts(s varchar(550));
> OK
> Time taken: 0.075 seconds
> hive> INSERT INTO ts VALUES ('Mozilla/5.0 (iPhone; CPU iPhone OS 5_0');
> MismatchedTokenException(14!=326)
>   at 
> org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:617)
>   at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valueRowConstructor(HiveParser_FromClauseParser.java:7271)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valuesTableConstructor(HiveParser_FromClauseParser.java:7370)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valuesClause(HiveParser_FromClauseParser.java:7510)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.valuesClause(HiveParser.java:51854)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:45432)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:44578)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:8)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1694)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1176)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:204)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:402)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:326)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1169)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1288)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1095)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1083)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> FAILED: ParseException line 1:31 mismatched input '/' expecting ) near 
> 'Mozilla' in value row constructor
> hive>
> {code}





[jira] [Commented] (HIVE-15074) Schematool provides a way to detect invalid entries in VERSION table

2016-11-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706625#comment-15706625
 ] 

Hive QA commented on HIVE-15074:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840920/HIVE-15074.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10747 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=133)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2329/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2329/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2329/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840920 - PreCommit-HIVE-Build

> Schematool provides a way to detect invalid entries in VERSION table
> 
>
> Key: HIVE-15074
> URL: https://issues.apache.org/jira/browse/HIVE-15074
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Yongzhi Chen
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-15074.1.patch, HIVE-15074.patch
>
>
> For an unknown reason, we have seen a customer's HMS fail to start because 
> there were multiple entries in the HMS VERSION table. Schematool should 
> provide a way to validate the HMS db and offer warning and fix options for 
> this kind of issue. 





[jira] [Commented] (HIVE-15308) Create ACID table failed intermittently: due to Postgres (SQLState=25P02, ErrorCode=0)

2016-11-29 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706616#comment-15706616
 ] 

Eugene Koifman commented on HIVE-15308:
---

https://www.postgresql.org/message-id/flat/3D9BF154E1B0B6D07D04B1F0CD4F5C0486%40ntk-mail2k3.nortak.com#3d9bf154e1b0b6d07d04b1f0cd4f5c0...@ntk-mail2k3.nortak.com
 is one of many places discussing this particular Postgres behavior

> Create ACID table failed intermittently: due to Postgres (SQLState=25P02, 
> ErrorCode=0)
> --
>
> Key: HIVE-15308
> URL: https://issues.apache.org/jira/browse/HIVE-15308
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.1.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-15308.01.patch
>
>
> If 2 concurrent calls to MutexApi.acquireLock() happen with the same "key" 
> value and there is no row in AUX_TABLE for that value yet (i.e. both are 
> attempting to insert it), Postgres aborts the txn that gets the Duplicate Key 
> error, and no more statements can be executed on that txn.
> (This is different from how most DBs behave.)
> {noformat}
> Caused by: MetaException(message:Unable to lock 'CheckLock' due to: ERROR: 
> current transaction is aborted, commands ignored until end of transaction 
> block (SQLState=25P02, ErrorCode=0); org.postgresql.util.PSQLException: 
> ERROR: current transaction is aborted, commands ignored until end of 
> transaction block
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2198)
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1927)
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
>   at 
> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:561)
>   at 
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:405)
>   at 
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:285)
>   at 
> com.jolbox.bonecp.StatementHandle.executeQuery(StatementHandle.java:464)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.acquireLock(TxnHandler.java:3250)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLock(TxnHandler.java:2319)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLockWithRetry(TxnHandler.java:1022)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:794)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:5941)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
>   at com.sun.proxy.$Proxy30.lock(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.lock(HiveMetaStoreClient.java:2109)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154)
>   at com.sun.proxy.$Proxy31.lock(Unknown Source)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2259)
>   at com.sun.proxy.$Proxy31.lock(Unknown Source)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$SynchronizedMetaStoreClient.lock(DbTxnManager.java:745)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.DbLockManager.lock(DbLockManager.java:103)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:341)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocksWithHeartbeatDelay(DbTxnManager.java:357)
>   at 

[jira] [Updated] (HIVE-15308) Create ACID table failed intermittently: due to Postgres (SQLState=25P02, ErrorCode=0)

2016-11-29 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-15308:
--
Status: Patch Available  (was: Open)

> Create ACID table failed intermittently: due to Postgres (SQLState=25P02, 
> ErrorCode=0)
> --
>
> Key: HIVE-15308
> URL: https://issues.apache.org/jira/browse/HIVE-15308
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.1.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-15308.01.patch
>
>
> If two concurrent calls to MutexApi.acquireLock() happen with the same "key" 
> value and there is no row in AUX_TABLE for that value yet (i.e. both are 
> attempting to insert it), Postgres aborts the txn that gets the Duplicate 
> Key error, and no more statements can be executed in that txn.
> (This is different from the way most DBs behave.)
> {noformat}
> Caused by: MetaException(message:Unable to lock 'CheckLock' due to: ERROR: 
> current transaction is aborted, commands ignored until end of transaction 
> block (SQLState=25P02, ErrorCode=0); org.postgresql.util.PSQLException: 
> ERROR: current transaction is aborted, commands ignored until end of 
> transaction block
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2198)
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1927)
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
>   at 
> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:561)
>   at 
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:405)
>   at 
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:285)
>   at 
> com.jolbox.bonecp.StatementHandle.executeQuery(StatementHandle.java:464)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.acquireLock(TxnHandler.java:3250)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLock(TxnHandler.java:2319)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLockWithRetry(TxnHandler.java:1022)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:794)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:5941)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
>   at com.sun.proxy.$Proxy30.lock(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.lock(HiveMetaStoreClient.java:2109)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154)
>   at com.sun.proxy.$Proxy31.lock(Unknown Source)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2259)
>   at com.sun.proxy.$Proxy31.lock(Unknown Source)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$SynchronizedMetaStoreClient.lock(DbTxnManager.java:745)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.DbLockManager.lock(DbLockManager.java:103)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:341)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocksWithHeartbeatDelay(DbTxnManager.java:357)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:167)
>   at 
> org.apache.hadoop.hive.ql.Driver.acquireLocksAndOpenTxn(Driver.java:985)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1321)
>  
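
The race described above — two sessions inserting the same AUX_TABLE key — is usually handled by attempting the insert and falling back to reading the existing row; on Postgres the failed INSERT additionally has to run under a SAVEPOINT (or in its own transaction), since the duplicate-key error aborts the surrounding transaction. A minimal in-memory sketch of the try-insert-then-read pattern (names are illustrative, not Hive's actual TxnHandler code):

```java
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for AUX_TABLE: acquire a named mutex row, inserting it if absent.
// With a real Postgres backend, the INSERT would need a SAVEPOINT so that a
// duplicate-key failure does not abort the surrounding transaction.
public class MutexSketch {
    private static final ConcurrentHashMap<String, Boolean> AUX_TABLE =
            new ConcurrentHashMap<>();

    public static String acquireLock(String key) {
        // putIfAbsent mimics "insert if missing, otherwise keep existing row":
        // it returns null when this caller inserted, non-null when the row
        // already existed (the case that trips Postgres in the stack trace).
        Boolean previous = AUX_TABLE.putIfAbsent(key, Boolean.TRUE);
        return previous == null ? "inserted" : "existing";
    }

    public static void main(String[] args) {
        System.out.println(acquireLock("CheckLock"));
        System.out.println(acquireLock("CheckLock"));
    }
}
```

The second call must succeed by observing the first caller's row rather than erroring out, which is exactly what the aborted-transaction behavior prevents on Postgres without a savepoint.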

[jira] [Updated] (HIVE-15308) Create ACID table failed intermittently: due to Postgres (SQLState=25P02, ErrorCode=0)

2016-11-29 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-15308:
--
Attachment: HIVE-15308.01.patch

> Create ACID table failed intermittently: due to Postgres (SQLState=25P02, 
> ErrorCode=0)
> --
>
> Key: HIVE-15308
> URL: https://issues.apache.org/jira/browse/HIVE-15308
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.1.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-15308.01.patch
>
>
> If two concurrent calls to MutexApi.acquireLock() happen with the same "key" 
> value and there is no row in AUX_TABLE for that value yet (i.e. both are 
> attempting to insert it), Postgres aborts the txn that gets the Duplicate 
> Key error, and no more statements can be executed in that txn.
> (This is different from the way most DBs behave.)
> {noformat}
> Caused by: MetaException(message:Unable to lock 'CheckLock' due to: ERROR: 
> current transaction is aborted, commands ignored until end of transaction 
> block (SQLState=25P02, ErrorCode=0); org.postgresql.util.PSQLException: 
> ERROR: current transaction is aborted, commands ignored until end of 
> transaction block
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2198)
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1927)
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
>   at 
> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:561)
>   at 
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:405)
>   at 
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:285)
>   at 
> com.jolbox.bonecp.StatementHandle.executeQuery(StatementHandle.java:464)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.acquireLock(TxnHandler.java:3250)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLock(TxnHandler.java:2319)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLockWithRetry(TxnHandler.java:1022)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:794)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:5941)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
>   at com.sun.proxy.$Proxy30.lock(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.lock(HiveMetaStoreClient.java:2109)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154)
>   at com.sun.proxy.$Proxy31.lock(Unknown Source)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2259)
>   at com.sun.proxy.$Proxy31.lock(Unknown Source)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$SynchronizedMetaStoreClient.lock(DbTxnManager.java:745)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.DbLockManager.lock(DbLockManager.java:103)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:341)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocksWithHeartbeatDelay(DbTxnManager.java:357)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:167)
>   at 
> org.apache.hadoop.hive.ql.Driver.acquireLocksAndOpenTxn(Driver.java:985)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1321)
>   at o

[jira] [Commented] (HIVE-15202) Concurrent compactions for the same partition may generate malformed folder structure

2016-11-29 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706591#comment-15706591
 ] 

Eugene Koifman commented on HIVE-15202:
---

All failures have age > 1.

[~wzheng], could you review please?

> Concurrent compactions for the same partition may generate malformed folder 
> structure
> -
>
> Key: HIVE-15202
> URL: https://issues.apache.org/jira/browse/HIVE-15202
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Rui Li
>Assignee: Eugene Koifman
> Attachments: HIVE-15202.01.patch
>
>
> If two compactions run concurrently on a single partition, they may generate 
> a folder structure like this (nested base dir):
> {noformat}
> drwxr-xr-x   - root supergroup  0 2016-11-14 22:23 
> /user/hive/warehouse/test/z=1/base_007/base_007
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_0
> -rw-r--r--   3 root supergroup611 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_1
> -rw-r--r--   3 root supergroup614 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_2
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_3
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_4
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_5
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_6
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_7
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_8
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_9
> {noformat}
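
The corruption in the listing above can be detected mechanically: a base_N directory should never contain another base_ directory. A hypothetical checker over a path string (illustrative only, not the fix in the attached patch) might look like:

```java
// Detects the nested-base-directory layout shown above: a "base_" path
// component immediately followed by another "base_" component.
public class NestedBaseCheck {
    public static boolean isNestedBase(String path) {
        String[] parts = path.split("/");
        for (int i = 1; i < parts.length; i++) {
            if (parts[i].startsWith("base_") && parts[i - 1].startsWith("base_")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isNestedBase(
                "/user/hive/warehouse/test/z=1/base_0000007/base_0000007"));
    }
}
```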



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14453) refactor physical writing of ORC data and metadata to FS from the logical writers

2016-11-29 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14453:

Attachment: HIVE-14453.03.patch

Renamed the method, rebased

> refactor physical writing of ORC data and metadata to FS from the logical 
> writers
> -
>
> Key: HIVE-14453
> URL: https://issues.apache.org/jira/browse/HIVE-14453
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14453.01.patch, HIVE-14453.02.patch, 
> HIVE-14453.03.patch, HIVE-14453.patch
>
>
> ORC data doesn't have to go directly into an HDFS stream via buffers; it can 
> go somewhere else (e.g. a write-through cache, or an addressable system that 
> doesn't require the stream blocks to be held in memory before writing them 
> all together).
> To that effect, it would be nice to abstract the creation of the data 
> block/metadata structures from the physical file concerns.
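
The separation proposed here can be sketched as an interface for the physical sink that the logical writer targets; the names below are illustrative, not the actual ORC writer API.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// Illustrative split between logical stream assembly and the physical sink.
interface PhysicalSink {
    void writeBlock(ByteBuffer data);
    long bytesWritten();
}

// One possible sink: buffer everything in memory. An HDFS output stream or a
// write-through cache would be alternative implementations of the same
// interface, invisible to the logical writer.
class InMemorySink implements PhysicalSink {
    private final ByteArrayOutputStream buf = new ByteArrayOutputStream();

    @Override
    public void writeBlock(ByteBuffer data) {
        byte[] bytes = new byte[data.remaining()];
        data.get(bytes);
        buf.write(bytes, 0, bytes.length);
    }

    @Override
    public long bytesWritten() {
        return buf.size();
    }
}

public class PhysicalSinkDemo {
    public static void main(String[] args) {
        PhysicalSink sink = new InMemorySink();
        sink.writeBlock(ByteBuffer.wrap(new byte[] {1, 2, 3}));
        System.out.println(sink.bytesWritten());
    }
}
```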





[jira] [Commented] (HIVE-15202) Concurrent compactions for the same partition may generate malformed folder structure

2016-11-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706501#comment-15706501
 ] 

Hive QA commented on HIVE-15202:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840914/HIVE-15202.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10747 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=133)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2328/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2328/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2328/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840914 - PreCommit-HIVE-Build

> Concurrent compactions for the same partition may generate malformed folder 
> structure
> -
>
> Key: HIVE-15202
> URL: https://issues.apache.org/jira/browse/HIVE-15202
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Rui Li
>Assignee: Eugene Koifman
> Attachments: HIVE-15202.01.patch
>
>
> If two compactions run concurrently on a single partition, they may generate 
> a folder structure like this (nested base dir):
> {noformat}
> drwxr-xr-x   - root supergroup  0 2016-11-14 22:23 
> /user/hive/warehouse/test/z=1/base_007/base_007
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_0
> -rw-r--r--   3 root supergroup611 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_1
> -rw-r--r--   3 root supergroup614 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_2
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_3
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_4
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_5
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_6
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_7
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_8
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_9
> {noformat}





[jira] [Updated] (HIVE-15276) CLIs spell "substitution" as "subsitution" and "auxiliary" as "auxillary"

2016-11-29 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-15276:
--
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Patch committed.  Thanks Grant.

> CLIs spell "substitution" as "subsitution" and "auxiliary" as "auxillary"
> -
>
> Key: HIVE-15276
> URL: https://issues.apache.org/jira/browse/HIVE-15276
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 1.1.0
>Reporter: Grant Sohn
>Assignee: Grant Sohn
>Priority: Trivial
> Fix For: 2.2.0
>
> Attachments: HIVE-15276.1.patch, HIVE-15276.2.patch, 
> HIVE-15276.3.patch, HIVE-15276.4.patch, HIVE-15276.5.patch
>
>
> Found some obvious spelling typos in the CLI help.





[jira] [Updated] (HIVE-15308) Create ACID table failed intermittently: due to Postgres (SQLState=25P02, ErrorCode=0)

2016-11-29 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-15308:
--
Description: 
If two concurrent calls to MutexApi.acquireLock() happen with the same "key" 
value and there is no row in AUX_TABLE for that value yet (i.e. both are 
attempting to insert it), Postgres aborts the txn that gets the Duplicate Key 
error, and no more statements can be executed in that txn.
(This is different from the way most DBs behave.)

{noformat}
Caused by: MetaException(message:Unable to lock 'CheckLock' due to: ERROR: 
current transaction is aborted, commands ignored until end of transaction block 
(SQLState=25P02, ErrorCode=0); org.postgresql.util.PSQLException: ERROR: 
current transaction is aborted, commands ignored until end of transaction block
at 
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2198)
at 
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1927)
at 
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
at 
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:561)
at 
org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:405)
at 
org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:285)
at 
com.jolbox.bonecp.StatementHandle.executeQuery(StatementHandle.java:464)
at 
org.apache.hadoop.hive.metastore.txn.TxnHandler.acquireLock(TxnHandler.java:3250)
at 
org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLock(TxnHandler.java:2319)
at 
org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLockWithRetry(TxnHandler.java:1022)
at 
org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:794)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:5941)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
at com.sun.proxy.$Proxy30.lock(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.lock(HiveMetaStoreClient.java:2109)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154)
at com.sun.proxy.$Proxy31.lock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2259)
at com.sun.proxy.$Proxy31.lock(Unknown Source)
at 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$SynchronizedMetaStoreClient.lock(DbTxnManager.java:745)
at 
org.apache.hadoop.hive.ql.lockmgr.DbLockManager.lock(DbLockManager.java:103)
at 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:341)
at 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocksWithHeartbeatDelay(DbTxnManager.java:357)
at 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:167)
at 
org.apache.hadoop.hive.ql.Driver.acquireLocksAndOpenTxn(Driver.java:985)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1321)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1095)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1088)
at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:242)
at 
org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:334)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInforma

[jira] [Updated] (HIVE-15074) Schematool provides a way to detect invalid entries in VERSION table

2016-11-29 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-15074:
---
Attachment: HIVE-15074.1.patch

> Schematool provides a way to detect invalid entries in VERSION table
> 
>
> Key: HIVE-15074
> URL: https://issues.apache.org/jira/browse/HIVE-15074
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Yongzhi Chen
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-15074.1.patch, HIVE-15074.patch
>
>
> For some unknown reason, we have seen a customer's HMS fail to start because 
> there are multiple entries in their HMS VERSION table. Schematool should 
> provide a way to validate the HMS db and offer warning and fix options for 
> this kind of issue.
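
The validation requested here reduces to checking that the VERSION table holds exactly one row. A sketch of that check over an already-fetched row list (a hypothetical helper, not schematool's actual code):

```java
import java.util.List;

// The metastore VERSION table must contain exactly one row; an empty table or
// multiple rows makes HMS startup ambiguous and should be reported.
public class VersionCheck {
    public static String validate(List<String> versionRows) {
        if (versionRows.isEmpty()) {
            return "ERROR: VERSION table is empty";
        }
        if (versionRows.size() > 1) {
            return "ERROR: multiple entries in VERSION table: " + versionRows;
        }
        return "OK: schema version " + versionRows.get(0);
    }

    public static void main(String[] args) {
        System.out.println(validate(java.util.Arrays.asList("2.1.0")));
    }
}
```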





[jira] [Updated] (HIVE-15074) Schematool provides a way to detect invalid entries in VERSION table

2016-11-29 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-15074:
---
Attachment: (was: HIVE-15074.1.patch)

> Schematool provides a way to detect invalid entries in VERSION table
> 
>
> Key: HIVE-15074
> URL: https://issues.apache.org/jira/browse/HIVE-15074
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Yongzhi Chen
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-15074.1.patch, HIVE-15074.patch
>
>
> For some unknown reason, we have seen a customer's HMS fail to start because 
> there are multiple entries in their HMS VERSION table. Schematool should 
> provide a way to validate the HMS db and offer warning and fix options for 
> this kind of issue.





[jira] [Resolved] (HIVE-15306) Change NOTICE file to account for JSON license components

2016-11-29 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez resolved HIVE-15306.

Resolution: Fixed

Pushed to master, branch-2.1. Thanks for reviewing [~thejas]

> Change NOTICE file to account for JSON license components
> -
>
> Key: HIVE-15306
> URL: https://issues.apache.org/jira/browse/HIVE-15306
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 2.1.1
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-15306.patch
>
>
> NO PRECOMMIT TESTS
> As per email discussion in 
> http://mail-archives.apache.org/mod_mbox/www-legal-discuss/201611.mbox/%3C0CE2E8C9-D9B7-404D-93EF-A1F8B07189BF%40apache.org%3E
>  .
> Note that the temporary exclusion period for JSON license components is 
> extended until April 30, 2017.
> {quote}
> At that point in time, ANY and ALL usage
> of these JSON licensed artifacts are DISALLOWED. You must
> either find a suitably licensed replacement, or do without.
> There will be NO exceptions.
> {quote}





[jira] [Updated] (HIVE-15306) Change NOTICE file to account for JSON license components

2016-11-29 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15306:
---
Fix Version/s: 2.2.0

> Change NOTICE file to account for JSON license components
> -
>
> Key: HIVE-15306
> URL: https://issues.apache.org/jira/browse/HIVE-15306
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 2.1.1
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-15306.patch
>
>
> NO PRECOMMIT TESTS
> As per email discussion in 
> http://mail-archives.apache.org/mod_mbox/www-legal-discuss/201611.mbox/%3C0CE2E8C9-D9B7-404D-93EF-A1F8B07189BF%40apache.org%3E
>  .
> Note that the temporary exclusion period for JSON license components is 
> extended until April 30, 2017.
> {quote}
> At that point in time, ANY and ALL usage
> of these JSON licensed artifacts are DISALLOWED. You must
> either find a suitably licensed replacement, or do without.
> There will be NO exceptions.
> {quote}





[jira] [Updated] (HIVE-15074) Schematool provides a way to detect invalid entries in VERSION table

2016-11-29 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-15074:
---
Attachment: HIVE-15074.1.patch

Provided a new patch based on [~ngangam]'s and [~aihuaxu]'s comments, though I 
am not totally convinced that we should continue the validations once the 
version has already been detected as invalid :-(
[~ngangam] & [~aihuaxu], could you review the patch? Thanks

> Schematool provides a way to detect invalid entries in VERSION table
> 
>
> Key: HIVE-15074
> URL: https://issues.apache.org/jira/browse/HIVE-15074
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Yongzhi Chen
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-15074.1.patch, HIVE-15074.patch
>
>
> For some unknown reason, we have seen a customer's HMS fail to start because 
> there are multiple entries in their HMS VERSION table. Schematool should 
> provide a way to validate the HMS db and offer warning and fix options for 
> this kind of issue.





[jira] [Commented] (HIVE-14453) refactor physical writing of ORC data and metadata to FS from the logical writers

2016-11-29 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706355#comment-15706355
 ] 

Sergey Shelukhin commented on HIVE-14453:
-

[~prasanth_j] Writer/WriterImpl already exists, so I wanted to distinguish it 
from that. 

> refactor physical writing of ORC data and metadata to FS from the logical 
> writers
> -
>
> Key: HIVE-14453
> URL: https://issues.apache.org/jira/browse/HIVE-14453
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14453.01.patch, HIVE-14453.02.patch, 
> HIVE-14453.patch
>
>
> ORC data doesn't have to go directly into an HDFS stream via buffers; it can 
> go somewhere else (e.g. a write-through cache, or an addressable system that 
> doesn't require the stream blocks to be held in memory before writing them 
> all together).
> To that effect, it would be nice to abstract the creation of the data 
> block/metadata structures from the physical file concerns.





[jira] [Commented] (HIVE-15279) map join dummy operators are not set up correctly in certain cases with merge join

2016-11-29 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706344#comment-15706344
 ] 

Sergey Shelukhin commented on HIVE-15279:
-

Test failures are known. [~hagleitn] ping?

> map join dummy operators are not set up correctly in certain cases with merge 
> join
> --
>
> Key: HIVE-15279
> URL: https://issues.apache.org/jira/browse/HIVE-15279
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15279.01.patch, HIVE-15279.02.patch, 
> HIVE-15279.patch
>
>
> As a result, MapJoin is not initialized and there's an NPE later.
> Tez-specific.





[jira] [Updated] (HIVE-15202) Concurrent compactions for the same partition may generate malformed folder structure

2016-11-29 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-15202:
--
Status: Patch Available  (was: Open)

> Concurrent compactions for the same partition may generate malformed folder 
> structure
> -
>
> Key: HIVE-15202
> URL: https://issues.apache.org/jira/browse/HIVE-15202
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Rui Li
>Assignee: Eugene Koifman
> Attachments: HIVE-15202.01.patch
>
>
> If two compactions run concurrently on a single partition, they may generate 
> a folder structure like this (nested base dir):
> {noformat}
> drwxr-xr-x   - root supergroup  0 2016-11-14 22:23 
> /user/hive/warehouse/test/z=1/base_007/base_007
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_0
> -rw-r--r--   3 root supergroup611 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_1
> -rw-r--r--   3 root supergroup614 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_2
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_3
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_4
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_5
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_6
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_7
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_8
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_9
> {noformat}





[jira] [Updated] (HIVE-15202) Concurrent compactions for the same partition may generate malformed folder structure

2016-11-29 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-15202:
--
Attachment: HIVE-15202.01.patch



[jira] [Issue Comment Deleted] (HIVE-15306) Change NOTICE file to account for JSON license components

2016-11-29 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15306:
---
Comment: was deleted

(was: NO PRECOMMIT TESTS)

> Change NOTICE file to account for JSON license components
> -
>
> Key: HIVE-15306
> URL: https://issues.apache.org/jira/browse/HIVE-15306
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 2.1.1
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.1.1
>
> Attachments: HIVE-15306.patch
>
>
> NO PRECOMMIT TESTS
> As per email discussion in 
> http://mail-archives.apache.org/mod_mbox/www-legal-discuss/201611.mbox/%3C0CE2E8C9-D9B7-404D-93EF-A1F8B07189BF%40apache.org%3E
>  .
> Notice that the temporary exclusion period for JSON license components is 
> extended till April 30, 2017.
> {quote}
> At that point in time, ANY and ALL usage
> of these JSON licensed artifacts are DISALLOWED. You must
> either find a suitably licensed replacement, or do without.
> There will be NO exceptions.
> {quote}





[jira] [Commented] (HIVE-15306) Change NOTICE file to account for JSON license components

2016-11-29 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706334#comment-15706334
 ] 

Thejas M Nair commented on HIVE-15306:
--

+1



[jira] [Updated] (HIVE-15306) Change NOTICE file to account for JSON license components

2016-11-29 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15306:
---
Fix Version/s: 2.1.1



[jira] [Updated] (HIVE-15306) Change NOTICE file to account for JSON license components

2016-11-29 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15306:
---
Attachment: HIVE-15306.patch



[jira] [Commented] (HIVE-15306) Change NOTICE file to account for JSON license components

2016-11-29 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706328#comment-15706328
 ] 

Jesus Camacho Rodriguez commented on HIVE-15306:


[~thejas], could you review it? Thanks



[jira] [Commented] (HIVE-15306) Change NOTICE file to account for JSON license components

2016-11-29 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706326#comment-15706326
 ] 

Jesus Camacho Rodriguez commented on HIVE-15306:


NO PRECOMMIT TESTS



[jira] [Updated] (HIVE-15306) Change NOTICE file to account for JSON license components

2016-11-29 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15306:
---
Description: 
NO PRECOMMIT TESTS

As per email discussion in 
http://mail-archives.apache.org/mod_mbox/www-legal-discuss/201611.mbox/%3C0CE2E8C9-D9B7-404D-93EF-A1F8B07189BF%40apache.org%3E
 .

Notice that the temporary exclusion period for JSON license components is 
extended till April 30, 2017.
{quote}
At that point in time, ANY and ALL usage
of these JSON licensed artifacts are DISALLOWED. You must
either find a suitably licensed replacement, or do without.
There will be NO exceptions.
{quote}

  was:
As per email discussion in 
http://mail-archives.apache.org/mod_mbox/www-legal-discuss/201611.mbox/%3C0CE2E8C9-D9B7-404D-93EF-A1F8B07189BF%40apache.org%3E
 .

Notice that the temporary exclusion period for JSON license components is 
extended till April 30, 2017.
{quote}
At that point in time, ANY and ALL usage
of these JSON licensed artifacts are DISALLOWED. You must
either find a suitably licensed replacement, or do without.
There will be NO exceptions.
{quote}




[jira] [Commented] (HIVE-15112) Implement Parquet vectorization reader for Struct type

2016-11-29 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706248#comment-15706248
 ] 

Chao Sun commented on HIVE-15112:
-

Thanks for the reply [~Ferd]. Does that mean that even with a fully implemented 
vectorized reader for complex types, Hive will still use the row-by-row engine? 
That doesn't sound good. Do you know whether there's a JIRA to track the issue?

> Implement Parquet vectorization reader for Struct type
> --
>
> Key: HIVE-15112
> URL: https://issues.apache.org/jira/browse/HIVE-15112
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
>
> Like HIVE-14815, we need to support a Parquet vectorized reader for the 
> struct type.





[jira] [Updated] (HIVE-15306) Change NOTICE file to account for JSON license components

2016-11-29 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15306:
---
Issue Type: Bug  (was: Improvement)



[jira] [Commented] (HIVE-15291) Comparison of timestamp fails if only date part is provided.

2016-11-29 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706200#comment-15706200
 ] 

Jason Dere commented on HIVE-15291:
---

Can you add a .q test so we can see this fix working in a SQL statement (such 
as the example you give in the description)? It would be good to have a test 
confirming that your use case is fixed.

> Comparison of timestamp fails if only date part is provided. 
> -
>
> Key: HIVE-15291
> URL: https://issues.apache.org/jira/browse/HIVE-15291
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, UDF
>Affects Versions: 2.1.0
>Reporter: Dhiraj Kumar
>Assignee: Dhiraj Kumar
> Attachments: HIVE-15291.1.patch, HIVE-15291.2.patch, 
> HIVE-15291.3.patch
>
>
> Summary: If a query compares two timestamps and one of them is provided in 
> "YYYY-MM-DD" format, with the time part omitted, it returns an incorrect 
> result. 
> Steps to reproduce : 
> 1. Start a hive-cli. 
> 2. Fire up the query -> select cast("2016-12-31 12:00:00" as timestamp) > 
> "2016-12-30";
> 3. Expected result : true
> 4. Actual result : NULL
> Detailed description: 
> If two primitives of different types need to be compared, a common comparator 
> type is chosen. Prior to 2.1, the common type Text was chosen to compare a 
> Timestamp type and a Text type. 
> In version 2.1, the common type Timestamp is chosen instead. This leads to 
> converting the Text value ("YYYY-MM-DD") into java.sql.Timestamp, which 
> throws an exception saying the input is not in the proper format. The 
> exception is suppressed and a null is returned. 
> Code below from org.apache.hadoop.hive.ql.exec.FunctionRegistry
> {code:java}
> if (pgA == PrimitiveGrouping.STRING_GROUP && pgB == 
> PrimitiveGrouping.DATE_GROUP) {
>   return b;
> }
> // date/timestamp is higher precedence than String_GROUP
> if (pgB == PrimitiveGrouping.STRING_GROUP && pgA == 
> PrimitiveGrouping.DATE_GROUP) {
>   return a;
> }
> {code}
> The bug was introduced in  
> [HIVE-13381|https://issues.apache.org/jira/browse/HIVE-13381]
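The underlying parse failure can be reproduced in plain Java, outside Hive: `java.sql.Timestamp.valueOf` accepts only the `yyyy-mm-dd hh:mm:ss[.f...]` form and rejects a date-only string. A minimal sketch (this is standard JDK behavior, not Hive code):

```java
import java.sql.Timestamp;

public class TimestampCastDemo {
    public static void main(String[] args) {
        // A full timestamp literal parses fine.
        Timestamp full = Timestamp.valueOf("2016-12-31 12:00:00");
        System.out.println(full); // prints "2016-12-31 12:00:00.0"

        // A date-only literal is rejected by java.sql.Timestamp,
        // which requires the yyyy-mm-dd hh:mm:ss[.f...] form.
        try {
            Timestamp.valueOf("2016-12-30");
        } catch (IllegalArgumentException e) {
            // Hive suppresses this exception and returns NULL,
            // which is why the comparison yields NULL instead of true.
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```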





[jira] [Commented] (HIVE-15232) Add notification events for functions and indexes

2016-11-29 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706116#comment-15706116
 ] 

Mohit Sabharwal commented on HIVE-15232:


[~vgumashta], was on vacation so couldn't respond earlier. You're right, this 
patch should not remove unit testing 
for the older config. Created HIVE-15305 to fix this. Thanks!

> Add notification events for functions and indexes
> -
>
> Key: HIVE-15232
> URL: https://issues.apache.org/jira/browse/HIVE-15232
> Project: Hive
>  Issue Type: Improvement
>  Components: repl
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Fix For: 2.2.0
>
> Attachments: HIVE-15232.1.patch, HIVE-15232.2.patch, 
> HIVE-15232.2.patch, HIVE-15232.2.patch
>
>
> Create/Drop Function and Create/Drop/Alter Index should also generate 
> metastore notification events.





[jira] [Commented] (HIVE-15291) Comparison of timestamp fails if only date part is provided.

2016-11-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706046#comment-15706046
 ] 

Hive QA commented on HIVE-15291:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840893/HIVE-15291.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10747 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=133)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=91)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2327/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2327/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2327/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840893 - PreCommit-HIVE-Build


