date:20170824

[jira] [Updated] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-24 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17100:

Status: Open  (was: Patch Available)

> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch, HIVE-17100.02.patch, 
> HIVE-17100.03.patch, HIVE-17100.04.patch, HIVE-17100.05.patch, 
> HIVE-17100.06.patch, HIVE-17100.07.patch
>
>
> It is necessary to log the progress of the replication tasks in a structured 
> manner as follows.
> *+Bootstrap Dump:+*
> * At the start of bootstrap dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (BOOTSTRAP)
> * (Estimated) Total number of tables/views to dump
> * (Estimated) Total number of functions to dump.
> * Dump Start Time{color}
> * After each table dump, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table dump end time
> * Table dump progress. Format is Table sequence no/(Estimated) Total number 
> of tables and views.{color}
> * After each function dump, will add a log as follows
> {color:#59afe1}* Function Name
> * Function dump end time
> * Function dump progress. Format is Function sequence no/(Estimated) Total 
> number of functions.{color}
> * After completion of all dumps, will add a log as follows to consolidate the 
> dump.
> {color:#59afe1}* Database Name.
> * Dump Type (BOOTSTRAP).
> * Dump End Time.
> * (Actual) Total number of tables/views dumped.
> * (Actual) Total number of functions dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The actual and estimated number of tables/functions may not match if 
> any table/function is dropped while the dump is in progress.
> *+Bootstrap Load:+*
> * At the start of bootstrap load, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump directory
> * Load Type (BOOTSTRAP)
> * Total number of tables/views to load
> * Total number of functions to load.
> * Load Start Time{color}
> * After each table load, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table load completion time
> * Table load progress. Format is Table sequence no/Total number of tables and 
> views.{color}
> * After each function load, will add a log as follows
> {color:#59afe1}* Function Name
> * Function load completion time
> * Function load progress. Format is Function sequence no/Total number of 
> functions.{color}
> * After completion of all loads, will add a log as follows to consolidate the 
> load.
> {color:#59afe1}* Database Name.
> * Load Type (BOOTSTRAP).
> * Load End Time.
> * Total number of tables/views loaded.
> * Total number of functions loaded.
> * Last Repl ID of the loaded database.{color}
> *+Incremental Dump:+*
> * At the start of database dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (INCREMENTAL)
> * (Estimated) Total number of events to dump.
> * Dump Start Time{color}
> * After each event dump, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
> * Event dump end time
> * Event dump progress. Format is Event sequence no/ (Estimated) Total number 
> of events.{color}
> * After completion of all event dumps, will add a log as follows.
> {color:#59afe1}* Database Name.
> * Dump Type (INCREMENTAL).
> * Dump End Time.
> * (Actual) Total number of events dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The estimated number of events can differ widely from the actual 
> number, as the number of events is not known upfront; it is known only after 
> reading from the metastore NotificationEvents table.
> *+Incremental Load:+*
> * At the start of incremental load, will add one log with below details.
> {color:#59afe1}* Target Database Name 
> * Dump directory
> * Load Type (INCREMENTAL)
> * Total number of events to load
> * Load Start Time{color}
> * After each event load, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
> * Event load end time
> * Event load progress. Format is Event sequence no/ Total number of 
> events.{color}
> * After completion of all event loads, will add a log as follows to 
> consolidate the load.
> {color:#59afe1}* Target Database Name.
> * Load Type (INCREMENTAL).
> * Load End Time.
> * Total number of events loaded.
> * Last Repl ID of the loaded database.{color}
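
The per-table progress record described above could be sketched as follows. This is only an illustration of the "sequence no/(Estimated) total" format: the function name `table_dump_log`, the `REPL_TABLE_DUMP` prefix, the JSON layout and the field names are all assumptions made for this sketch, not what the patch actually emits.

```python
import json
from datetime import datetime, timezone


def table_dump_log(table_name, table_type, sequence_no, estimated_total, end_time=None):
    """Build one structured progress line for a dumped table or view (hypothetical format)."""
    end_time = end_time or datetime.now(timezone.utc)
    record = {
        "tableName": table_name,
        "tableType": table_type,  # TABLE, VIEW or MATERIALIZED_VIEW
        "dumpEndTime": end_time.isoformat(),
        # Progress is "table sequence no / (estimated) total number of tables and views".
        "progress": f"{sequence_no}/{estimated_total}",
    }
    # The prefix is an invented marker that makes the line easy to grep for.
    return "REPL_TABLE_DUMP " + json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    print(table_dump_log("t1", "TABLE", 3, 10))
```

Because the total is only an estimate, a consumer of such lines should not assume that the last sequence number equals the total.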

[jira] [Work started] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-24 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17100 started by Sankar Hariappan.
---
> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch, HIVE-17100.02.patch, 
> HIVE-17100.03.patch, HIVE-17100.04.patch, HIVE-17100.05.patch, 
> HIVE-17100.06.patch, HIVE-17100.07.patch
>
>
> It is necessary to log the progress of the replication tasks in a structured 
> manner as follows.
> *+Bootstrap Dump:+*
> * At the start of bootstrap dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (BOOTSTRAP)
> * (Estimated) Total number of tables/views to dump
> * (Estimated) Total number of functions to dump.
> * Dump Start Time{color}
> * After each table dump, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table dump end time
> * Table dump progress. Format is Table sequence no/(Estimated) Total number 
> of tables and views.{color}
> * After each function dump, will add a log as follows
> {color:#59afe1}* Function Name
> * Function dump end time
> * Function dump progress. Format is Function sequence no/(Estimated) Total 
> number of functions.{color}
> * After completion of all dumps, will add a log as follows to consolidate the 
> dump.
> {color:#59afe1}* Database Name.
> * Dump Type (BOOTSTRAP).
> * Dump End Time.
> * (Actual) Total number of tables/views dumped.
> * (Actual) Total number of functions dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The actual and estimated number of tables/functions may not match if 
> any table/function is dropped while the dump is in progress.
> *+Bootstrap Load:+*
> * At the start of bootstrap load, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump directory
> * Load Type (BOOTSTRAP)
> * Total number of tables/views to load
> * Total number of functions to load.
> * Load Start Time{color}
> * After each table load, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table load completion time
> * Table load progress. Format is Table sequence no/Total number of tables and 
> views.{color}
> * After each function load, will add a log as follows
> {color:#59afe1}* Function Name
> * Function load completion time
> * Function load progress. Format is Function sequence no/Total number of 
> functions.{color}
> * After completion of all loads, will add a log as follows to consolidate the 
> load.
> {color:#59afe1}* Database Name.
> * Load Type (BOOTSTRAP).
> * Load End Time.
> * Total number of tables/views loaded.
> * Total number of functions loaded.
> * Last Repl ID of the loaded database.{color}
> *+Incremental Dump:+*
> * At the start of database dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (INCREMENTAL)
> * (Estimated) Total number of events to dump.
> * Dump Start Time{color}
> * After each event dump, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
> * Event dump end time
> * Event dump progress. Format is Event sequence no/ (Estimated) Total number 
> of events.{color}
> * After completion of all event dumps, will add a log as follows.
> {color:#59afe1}* Database Name.
> * Dump Type (INCREMENTAL).
> * Dump End Time.
> * (Actual) Total number of events dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The estimated number of events can differ widely from the actual 
> number, as the number of events is not known upfront; it is known only after 
> reading from the metastore NotificationEvents table.
> *+Incremental Load:+*
> * At the start of incremental load, will add one log with below details.
> {color:#59afe1}* Target Database Name 
> * Dump directory
> * Load Type (INCREMENTAL)
> * Total number of events to load
> * Load Start Time{color}
> * After each event load, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
> * Event load end time
> * Event load progress. Format is Event sequence no/ Total number of 
> events.{color}
> * After completion of all event loads, will add a log as follows to 
> consolidate the load.
> {color:#59afe1}* Target Database Name.
> * Load Type (INCREMENTAL).
> * Load End Time.
> * Total number of events loaded.
> * Last Repl ID of the loaded database.{color}




[jira] [Updated] (HIVE-17183) Disable rename operations during bootstrap dump

2017-08-24 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17183:

Status: Open  (was: Patch Available)

> Disable rename operations during bootstrap dump
> ---
>
> Key: HIVE-17183
> URL: https://issues.apache.org/jira/browse/HIVE-17183
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17183.01.patch, HIVE-17183.02.patch, 
> HIVE-17183.03.patch
>
>
> Currently, a bootstrap dump can lead to data loss if any rename happens 
> while the dump is in progress. 
> *Scenario:*
> - Fetch table names (T1 and T2)
> - Dump table T1
> - Renaming table T2 to T3 generates a RENAME event
> - Dumping table T2 is a no-op, as the table no longer exists.
> - After load, the target has only T1.
> - Applying the RENAME event fails, as T2 doesn’t exist in the target.
> Supporting renames can be done in the next phase of development, as it needs 
> a proper design to keep track of renamed tables/partitions. 
> So, for the time being, we shall disable rename operations while a bootstrap 
> dump is in progress to avoid any inconsistent state.
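
The scenario above can be reproduced with a small in-memory simulation. This is a toy model, not Hive code: tables are plain dictionary entries, and the helper names `bootstrap_dump` and `load` are invented for the illustration.

```python
def bootstrap_dump(source, rename_during_dump=None):
    """Dump tables by name; optionally apply one rename right after the first table."""
    names = sorted(source)  # table names are fetched once, up front
    dumped, events = {}, []
    for i, name in enumerate(names):
        if name in source:  # dumping a table that no longer exists is a no-op
            dumped[name] = source[name]
        if i == 0 and rename_during_dump:
            old, new = rename_during_dump
            source[new] = source.pop(old)
            events.append(("RENAME", old, new))
    return dumped, events


def load(dumped, events):
    """Load the dump, then replay events; fail if a renamed table is missing."""
    target = dict(dumped)
    for kind, old, new in events:
        if old not in target:
            raise KeyError(f"cannot apply {kind}: {old} does not exist in target")
        target[new] = target.pop(old)
    return target


if __name__ == "__main__":
    source = {"T1": ["row-a"], "T2": ["row-b"]}
    dumped, events = bootstrap_dump(source, rename_during_dump=("T2", "T3"))
    print(dumped)  # only T1 was dumped
    try:
        load(dumped, events)
    except KeyError as err:
        print("load failed:", err)
```

In this model the data of T2/T3 never reaches the target, which is the data loss the issue describes; rejecting renames while the dump runs avoids the situation entirely.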



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Work started] (HIVE-17183) Disable rename operations during bootstrap dump

2017-08-24 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17183 started by Sankar Hariappan.
---
> Disable rename operations during bootstrap dump
> ---
>
> Key: HIVE-17183
> URL: https://issues.apache.org/jira/browse/HIVE-17183
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17183.01.patch, HIVE-17183.02.patch, 
> HIVE-17183.03.patch
>
>
> Currently, a bootstrap dump can lead to data loss if any rename happens 
> while the dump is in progress. 
> *Scenario:*
> - Fetch table names (T1 and T2)
> - Dump table T1
> - Renaming table T2 to T3 generates a RENAME event
> - Dumping table T2 is a no-op, as the table no longer exists.
> - After load, the target has only T1.
> - Applying the RENAME event fails, as T2 doesn’t exist in the target.
> Supporting renames can be done in the next phase of development, as it needs 
> a proper design to keep track of renamed tables/partitions. 
> So, for the time being, we shall disable rename operations while a bootstrap 
> dump is in progress to avoid any inconsistent state.




[jira] [Commented] (HIVE-17373) Upgrade some dependency versions

2017-08-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141259#comment-16141259
 ] 

Hive QA commented on HIVE-17373:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883602/HIVE-17373.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10993 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_windowing2] 
(batchId=10)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6532/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6532/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6532/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883602 - PreCommit-HIVE-Build

> Upgrade some dependency versions
> 
>
> Key: HIVE-17373
> URL: https://issues.apache.org/jira/browse/HIVE-17373
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17373.1.patch, HIVE-17373.2.patch
>
>
> Upgrade some libraries including log4j to 2.8.2, accumulo to 1.8.1 and 
> commons-httpclient to 3.1. 




[jira] [Commented] (HIVE-17205) add functional support

2017-08-24 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141209#comment-16141209
 ] 

Eugene Koifman commented on HIVE-17205:
---

patch16 - no related failures


> add functional support
> --
>
> Key: HIVE-17205
> URL: https://issues.apache.org/jira/browse/HIVE-17205
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17205.01.patch, HIVE-17205.02.patch, 
> HIVE-17205.03.patch, HIVE-17205.09.patch, HIVE-17205.10.patch, 
> HIVE-17205.11.patch, HIVE-17205.12.patch, HIVE-17205.13.patch, 
> HIVE-17205.14.patch, HIVE-17205.15.patch, HIVE-17205.16.patch
>
>
> make sure unbucketed tables can be marked transactional=true
> make insert/update/delete/compaction work




[jira] [Commented] (HIVE-17205) add functional support

2017-08-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141207#comment-16141207
 ] 

Hive QA commented on HIVE-17205:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883600/HIVE-17205.16.patch

{color:green}SUCCESS:{color} +1 due to 11 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11004 tests 
executed
*Failed tests:*
{noformat}
TestTxnCommandsBase - did not produce a TEST-*.xml file (likely timed out) 
(batchId=280)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=234)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testConnection (batchId=240)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testNegativeTokenAuth 
(batchId=240)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testProxyAuth (batchId=240)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6531/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6531/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6531/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883600 - PreCommit-HIVE-Build

> add functional support
> --
>
> Key: HIVE-17205
> URL: https://issues.apache.org/jira/browse/HIVE-17205
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17205.01.patch, HIVE-17205.02.patch, 
> HIVE-17205.03.patch, HIVE-17205.09.patch, HIVE-17205.10.patch, 
> HIVE-17205.11.patch, HIVE-17205.12.patch, HIVE-17205.13.patch, 
> HIVE-17205.14.patch, HIVE-17205.15.patch, HIVE-17205.16.patch
>
>
> make sure unbucketed tables can be marked transactional=true
> make insert/update/delete/compaction work




[jira] [Commented] (HIVE-17297) allow AM to use LLAP guaranteed tasks

2017-08-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141128#comment-16141128
 ] 

Hive QA commented on HIVE-17297:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883591/HIVE-17297.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 11009 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteTinyint 
(batchId=183)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6530/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6530/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6530/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883591 - PreCommit-HIVE-Build

> allow AM to use LLAP guaranteed tasks
> -
>
> Key: HIVE-17297
> URL: https://issues.apache.org/jira/browse/HIVE-17297
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17297.01.nogen.patch, HIVE-17297.01.patch, 
> HIVE-17297.01.patch, HIVE-17297.01.patch
>
>





[jira] [Commented] (HIVE-17330) refactor TezSessionPoolManager to separate its multiple functions

2017-08-24 Thread Gunther Hagleitner (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141122#comment-16141122
 ] 

Gunther Hagleitner commented on HIVE-17330:
---

+1 thanks for cleaning this up.

> refactor TezSessionPoolManager to separate its multiple functions
> -
>
> Key: HIVE-17330
> URL: https://issues.apache.org/jira/browse/HIVE-17330
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17330.01.patch, HIVE-17330.02.patch, 
> HIVE-17330.patch
>
>
> TezSessionPoolManager would retain things specific to current Hive session 
> management. 
> The session pool itself, as well as expiration tracking, the pool session 
> implementation, and some config validation can be separated out and made 
> independent from the pool.




[jira] [Updated] (HIVE-17330) refactor TezSessionPoolManager to separate its multiple functions

2017-08-24 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17330:

Attachment: HIVE-17330.02.patch

Rebasing the patch

> refactor TezSessionPoolManager to separate its multiple functions
> -
>
> Key: HIVE-17330
> URL: https://issues.apache.org/jira/browse/HIVE-17330
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17330.01.patch, HIVE-17330.02.patch, 
> HIVE-17330.patch
>
>
> TezSessionPoolManager would retain things specific to current Hive session 
> management. 
> The session pool itself, as well as expiration tracking, the pool session 
> implementation, and some config validation can be separated out and made 
> independent from the pool.




[jira] [Updated] (HIVE-17380) refactor LlapProtocolClientProxy to be usable with other protocols

2017-08-24 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17380:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the review!

> refactor LlapProtocolClientProxy to be usable with other protocols
> --
>
> Key: HIVE-17380
> URL: https://issues.apache.org/jira/browse/HIVE-17380
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0
>
> Attachments: HIVE-17380.patch, HIVE-17380.patch
>
>
> This basically moves a bunch of code into a generic async PB RPC proxy, in 
> llap-common for now. Moving to common would require one to move LlapNodeId; 
> that can be done later.
> The only logic change is that the concurrent hash map, whose entries never 
> expire, is replaced by a Guava cache. A path to shut down a proxy is added, 
> but does nothing.



[jira] [Commented] (HIVE-17380) refactor LlapProtocolClientProxy to be usable with other protocols

2017-08-24 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141099#comment-16141099
 ] 

Sergey Shelukhin commented on HIVE-17380:
-

This is going to be used for the AM LLAP plugin endpoint client. At least for 
now, the one-at-a-time thing is not as critical for that, but there is a bunch 
of other generally useful stuff in this. Thanks for the review!

> refactor LlapProtocolClientProxy to be usable with other protocols
> --
>
> Key: HIVE-17380
> URL: https://issues.apache.org/jira/browse/HIVE-17380
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17380.patch, HIVE-17380.patch
>
>
> This basically moves a bunch of code into a generic async PB RPC proxy, in 
> llap-common for now. Moving to common would require one to move LlapNodeId; 
> that can be done later.
> The only logic change is that the concurrent hash map, whose entries never 
> expire, is replaced by a Guava cache. A path to shut down a proxy is added, 
> but does nothing.




[jira] [Updated] (HIVE-17360) Tez session reopen appears to use a wrong conf object

2017-08-24 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17360:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the review!

> Tez session reopen appears to use a wrong conf object
> -
>
> Key: HIVE-17360
> URL: https://issues.apache.org/jira/browse/HIVE-17360
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0
>
> Attachments: HIVE-17360.01.patch, HIVE-17360.02.patch, 
> HIVE-17360.03.patch, HIVE-17360.patch
>
>





[jira] [Assigned] (HIVE-17387) implement Tez AM registry in Hive

2017-08-24 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-17387:
---


> implement Tez AM registry in Hive
> -
>
> Key: HIVE-17387
> URL: https://issues.apache.org/jira/browse/HIVE-17387
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Necessary for HS2 HA, to transfer AMs between HS2s, etc.
> Helpful for workload management.




[jira] [Comment Edited] (HIVE-17386) support LLAP workload management in HS2 (low level only)

2017-08-24 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141074#comment-16141074
 ] 

Sergey Shelukhin edited comment on HIVE-17386 at 8/25/17 2:17 AM:
--

An initial patch on top of HIVE-17297. Also includes 2 refactoring jiras, for 
LlapProtocolClientProxy and TezSessionPoolManager, that should be committed 
separately.
Needs tests; a lot of this patch is plumbing, so I'm not yet sure about the 
testing strategy for everything.


was (Author: sershe):
An initial patch on top of HIVE-17297. Also includes 2 refactoring jiras, for 
LlapProtocolClientProxy and TezSessionPoolManager, that should be committed 
separately.

> support LLAP workload management in HS2 (low level only)
> 
>
> Key: HIVE-17386
> URL: https://issues.apache.org/jira/browse/HIVE-17386
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17386.patch
>
>
> This makes use of HIVE-17297 and creates building blocks for workload 
> management policies, etc.
> For now, there are no policies - a single yarn queue is designated for all 
> LLAP query AMs, and the capacity is distributed equally.
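
As a rough illustration of "the capacity is distributed equally", the sketch below splits an integer capacity across a list of sessions. How Hive actually measures capacity and handles remainders is not stated in the issue; the function name `distribute_equally` and the rule that earlier sessions absorb the remainder are assumptions of this example.

```python
def distribute_equally(total_capacity, session_ids):
    """Split an integer capacity across sessions; earlier sessions absorb the remainder."""
    if not session_ids:
        return {}
    share, remainder = divmod(total_capacity, len(session_ids))
    return {
        sid: share + (1 if i < remainder else 0)
        for i, sid in enumerate(session_ids)
    }


if __name__ == "__main__":
    print(distribute_equally(10, ["am-1", "am-2", "am-3"]))
```

The shares always sum to the total, so no capacity is left unassigned when the division is not exact.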




[jira] [Updated] (HIVE-17386) support LLAP workload management in HS2 (low level only)

2017-08-24 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17386:

Attachment: HIVE-17386.patch

An initial patch on top of HIVE-17297. Also includes 2 refactoring jiras, for 
LlapProtocolClientProxy and TezSessionPoolManager, that should be committed 
separately.

> support LLAP workload management in HS2 (low level only)
> 
>
> Key: HIVE-17386
> URL: https://issues.apache.org/jira/browse/HIVE-17386
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17386.patch
>
>
> This makes use of HIVE-17297 and creates building blocks for workload 
> management policies, etc.
> For now, there are no policies - a single yarn queue is designated for all 
> LLAP query AMs, and the capacity is distributed equally.




[jira] [Updated] (HIVE-17386) support LLAP workload management in HS2 (low level only)

2017-08-24 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17386:

Status: Patch Available  (was: Open)

> support LLAP workload management in HS2 (low level only)
> 
>
> Key: HIVE-17386
> URL: https://issues.apache.org/jira/browse/HIVE-17386
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17386.patch
>
>
> This makes use of HIVE-17297 and creates building blocks for workload 
> management policies, etc.
> For now, there are no policies - a single yarn queue is designated for all 
> LLAP query AMs, and the capacity is distributed equally.




[jira] [Assigned] (HIVE-17386) support LLAP workload management in HS2 (low level only)

2017-08-24 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-17386:
---


> support LLAP workload management in HS2 (low level only)
> 
>
> Key: HIVE-17386
> URL: https://issues.apache.org/jira/browse/HIVE-17386
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> This makes use of HIVE-17297 and creates building blocks for workload 
> management policies, etc.
> For now, there are no policies - a single yarn queue is designated for all 
> LLAP query AMs, and the capacity is distributed equally.




[jira] [Commented] (HIVE-17373) Upgrade some dependency versions

2017-08-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141047#comment-16141047
 ] 

Hive QA commented on HIVE-17373:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883602/HIVE-17373.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10994 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=46)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6528/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6528/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6528/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883602 - PreCommit-HIVE-Build

> Upgrade some dependency versions
> 
>
> Key: HIVE-17373
> URL: https://issues.apache.org/jira/browse/HIVE-17373
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17373.1.patch, HIVE-17373.2.patch
>
>
> Upgrade some libraries including log4j to 2.8.2, accumulo to 1.8.1 and 
> commons-httpclient to 3.1. 




[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-08-24 Thread Rui Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15104:
--
Attachment: HIVE-15104.5.patch

> Hive on Spark generate more shuffle data than hive on mr
> 
>
> Key: HIVE-15104
> URL: https://issues.apache.org/jira/browse/HIVE-15104
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.1
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15104.1.patch, HIVE-15104.2.patch, 
> HIVE-15104.3.patch, HIVE-15104.4.patch, HIVE-15104.5.patch, 
> HIVE-15104.5.patch, TPC-H 100G.xlsx
>
>
> The same SQL, running on the Spark and MR engines, generates different sizes 
> of shuffle data.
> I think this is because Hive on MR serializes only part of HiveKey, whereas Hive 
> on Spark, which uses Kryo, serializes the full HiveKey object.  
> What is your opinion?




[jira] [Updated] (HIVE-17341) DbTxnManger.startHeartbeat() - randomize initial delay

2017-08-24 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17341:
--
Status: Patch Available  (was: Open)

> DbTxnManger.startHeartbeat() - randomize initial delay
> --
>
> Key: HIVE-17341
> URL: https://issues.apache.org/jira/browse/HIVE-17341
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17341.01.patch
>
>
> This sets up a fixed delay for all heartbeats.  If many queries land on the 
> server at the same time,
> they will wake up and start heartbeating at the same time, causing a bottleneck.
> Add some random element to the heartbeat delay.
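
A minimal sketch of the idea in the description above, not the code in the attached patch: each heartbeater picks its first delay at random within one heartbeat interval, so tasks created at the same moment spread out instead of firing together. The class name `JitteredHeartbeat`, the method `initialDelayMillis`, and the 200 ms interval are all assumptions made for illustration.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not taken from DbTxnManager.
final class JitteredHeartbeat {
    // Picks an initial delay uniformly from [0, intervalMillis) so that
    // heartbeaters started together do not all wake up at the same time.
    static long initialDelayMillis(long intervalMillis) {
        return ThreadLocalRandom.current().nextLong(intervalMillis);
    }

    public static void main(String[] args) throws InterruptedException {
        long intervalMillis = 200; // illustrative value only
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(1);
        pool.scheduleAtFixedRate(
            () -> System.out.println("heartbeat"),
            initialDelayMillis(intervalMillis), // randomized first delay
            intervalMillis,                     // fixed period afterwards
            TimeUnit.MILLISECONDS);
        Thread.sleep(500);
        pool.shutdownNow();
    }
}
```

Only the first delay is randomized here; the period stays fixed, which keeps the heartbeat rate per query unchanged.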




[jira] [Updated] (HIVE-17341) DbTxnManger.startHeartbeat() - randomize initial delay

2017-08-24 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17341:
--
Attachment: HIVE-17341.01.patch

[~wzheng] could you review please


> DbTxnManger.startHeartbeat() - randomize initial delay
> --
>
> Key: HIVE-17341
> URL: https://issues.apache.org/jira/browse/HIVE-17341
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17341.01.patch
>
>
> This sets up a fixed delay for all heartbeats.  If many queries land on the 
> server at the same time,
> they will wake up and start heartbeating at the same time, causing a bottleneck.
> Add some random element to the heartbeat delay.




[jira] [Commented] (HIVE-17380) refactor LlapProtocolClientProxy to be usable with other protocols

2017-08-24 Thread Siddharth Seth (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141017#comment-16141017
 ] 

Siddharth Seth commented on HIVE-17380:
---

This mainly makes the one per node a little more generic? Is there another 
protocol this needs to be used with?

I was, at some point, planning to delink from protocols etc. Essentially 1 
thread per any single entity.

+1 for the patch.

> refactor LlapProtocolClientProxy to be usable with other protocols
> --
>
> Key: HIVE-17380
> URL: https://issues.apache.org/jira/browse/HIVE-17380
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17380.patch, HIVE-17380.patch
>
>
> This basically moves a bunch of code into a generic async PB RPC proxy, in 
> llap-common for now. Moving to common would require one to move LlapNodeId, 
> that can be done later.
> The only logic change is that concurrent hash map, that never expires, is 
> replaced by Guava cache. A path to shut down a proxy is added, but does 
> nothing.




[jira] [Commented] (HIVE-17385) Fix incremental repl error for non-native tables

2017-08-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140974#comment-16140974
 ] 

Hive QA commented on HIVE-17385:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883582/HIVE-17385.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 11001 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_date_funcs] 
(batchId=73)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.ql.parse.TestExport.shouldExportImportATemporaryTable 
(batchId=218)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6527/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6527/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6527/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883582 - PreCommit-HIVE-Build

> Fix incremental repl error for non-native tables
> 
>
> Key: HIVE-17385
> URL: https://issues.apache.org/jira/browse/HIVE-17385
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17385.1.patch, HIVE-17385.2.patch, 
> HIVE-17385.3.patch
>
>
> See the error below with incremental replication for non-native (storage handler 
> based) tables. The bug is that we are not checking whether a table should be 
> dumped/exported during incremental dump.
> 2017-08-02T12:31:48,195 ERROR [HiveServer2-Background-Pool: Thread-8078]: 
> exec.DDLTask (DDLTask.java:failed(632)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> MetaException(message:LOCATION may not be specified for HBase.)




[jira] [Updated] (HIVE-17367) IMPORT table doesn't load from data dump if a metadata-only dump was already imported.

2017-08-24 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17367:

Status: Patch Available  (was: Open)

> IMPORT table doesn't load from data dump if a metadata-only dump was already 
> imported.
> --
>
> Key: HIVE-17367
> URL: https://issues.apache.org/jira/browse/HIVE-17367
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Import/Export, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17367.01.patch
>
>
> Repl v1 creates a set of EXPORT/IMPORT commands to replicate modified data 
> (as per events) across clusters.
> For instance, let's say, insert generates 2 events such as
> ALTER_TABLE (ID: 10)
> INSERT (ID: 11)
> Each event generates a set of EXPORT and IMPORT commands.
> ALTER_TABLE event generates metadata only export/import
> INSERT generates metadata+data export/import.
> As Hive always dumps the latest copy of the table during export, it sets the 
> latest notification event ID as the table's current state. So, in this example, 
> the import of metadata by the ALTER_TABLE event sets the current state of the 
> table to 11.
> Now, when we try to import the data dumped by the INSERT event, it is a no-op 
> because the table's current state (11) is equal to the dump state (11), which 
> in turn means the data never gets replicated to the target cluster.
> So, it is necessary to allow overwrite of a table/partition if its current 
> state equals the dump state.
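
The comparison described above can be sketched as follows. This is an illustration under assumed names (`ReplStateCheck`, `allowStrict`, `allowRelaxed`), not Hive's actual replication check: a strict "dump must be newer" test rejects the INSERT event's data dump in the example, while the relaxed test proposed in the description accepts it.

```java
// Hypothetical names; the real check in Hive's import path may look different.
final class ReplStateCheck {
    // Strict rule: apply a dump only if it is newer than the table's state.
    static boolean allowStrict(long currentState, long dumpState) {
        return currentState < dumpState;
    }

    // Relaxed rule from the description: equal states may also overwrite.
    static boolean allowRelaxed(long currentState, long dumpState) {
        return currentState <= dumpState;
    }

    public static void main(String[] args) {
        long current = 11; // set by the metadata-only import (ALTER_TABLE event)
        long dump = 11;    // state carried by the INSERT event's data dump
        System.out.println("strict:  " + allowStrict(current, dump));  // false
        System.out.println("relaxed: " + allowRelaxed(current, dump)); // true
    }
}
```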




[jira] [Updated] (HIVE-17367) IMPORT table doesn't load from data dump if a metadata-only dump was already imported.

2017-08-24 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17367:

Attachment: HIVE-17367.01.patch

Added 01.patch with changes to set last repl ID only for data export. Metadata 
only export will set the current event ID.


> IMPORT table doesn't load from data dump if a metadata-only dump was already 
> imported.
> --
>
> Key: HIVE-17367
> URL: https://issues.apache.org/jira/browse/HIVE-17367
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Import/Export, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17367.01.patch
>
>
> Repl v1 creates a set of EXPORT/IMPORT commands to replicate modified data 
> (as per events) across clusters.
> For instance, let's say, insert generates 2 events such as
> ALTER_TABLE (ID: 10)
> INSERT (ID: 11)
> Each event generates a set of EXPORT and IMPORT commands.
> ALTER_TABLE event generates metadata only export/import
> INSERT generates metadata+data export/import.
> As Hive always dumps the latest copy of the table during export, it sets the 
> latest notification event ID as the table's current state. So, in this example, 
> the import of metadata by the ALTER_TABLE event sets the current state of the 
> table to 11.
> Now, when we try to import the data dumped by the INSERT event, it is a no-op 
> because the table's current state (11) is equal to the dump state (11), which 
> in turn means the data never gets replicated to the target cluster.
> So, it is necessary to allow overwrite of a table/partition if its current 
> state equals the dump state.




[jira] [Commented] (HIVE-17214) check/fix conversion of non-acid to acid

2017-08-24 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140922#comment-16140922
 ] 

Eugene Koifman commented on HIVE-17214:
---

non-acid to acid conversion works for normal data layouts but see 
_TestAcidOnTez.testNonStandardConversion02_

> check/fix conversion of non-acid to acid
> 
>
> Key: HIVE-17214
> URL: https://issues.apache.org/jira/browse/HIVE-17214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> bucketed tables have stricter rules for file layout on disk - bucket files 
> are direct children of a partition directory.
> for un-bucketed tables I'm not sure there are any rules
> for example, CTAS with Tez + Union operator creates 1 directory for each leg 
> of the union
> Supposedly Hive can read table by picking all files recursively.  
> Can it also write (other than CTAS example above) arbitrarily?
> Does it mean Acid write can also write anywhere?
> Figure out what can be supported and how the existing layout can be checked.  
> Examining a full "ls -l -R" for a large table could be expensive. 




[jira] [Commented] (HIVE-15899) check CTAS over acid table

2017-08-24 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140919#comment-16140919
 ] 

Eugene Koifman commented on HIVE-15899:
---

HIVE-17205 makes CTAS work with unbucketed acid tables, but it could be optimized.
In HIVE-17205, CTAS creates an acid table but writes files w/o ROW__IDs.  Then 
on read they are treated like a non-acid to acid conversion.

> check CTAS over acid table 
> ---
>
> Key: HIVE-15899
> URL: https://issues.apache.org/jira/browse/HIVE-15899
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> need to add a test to check if create table as works correctly with acid 
> tables




[jira] [Resolved] (HIVE-17215) Streaming Ingest API writing unbucketed tables

2017-08-24 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman resolved HIVE-17215.
---
   Resolution: Fixed
Fix Version/s: 3.0.0

fix included in HIVE-17205

> Streaming Ingest API writing unbucketed tables
> --
>
> Key: HIVE-17215
> URL: https://issues.apache.org/jira/browse/HIVE-17215
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 3.0.0
>
>
> Currently the API expects the target table to be bucketed.
> It creates 1 writer per bucket per connection/partition.
> The simplest is to allow the API to create a single writer for unbucketed 
> tables.  
> If this doesn't provide enough write throughput, the client can create 
> another connection.
> Could add a parameter to the API to specify writer parallelism for unbucketed 
> tables.  If it's set to 2 for example, the writer will write delta_x_y_ 
> and delta_x_y_1 using statementId.  Maybe as a followup.




[jira] [Commented] (HIVE-17340) TxnHandler.checkLock() - reduce number of SQL statements

2017-08-24 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140913#comment-16140913
 ] 

Eugene Koifman commented on HIVE-17340:
---

@Wei Zheng could you review please

> TxnHandler.checkLock() - reduce number of SQL statements
> 
>
> Key: HIVE-17340
> URL: https://issues.apache.org/jira/browse/HIVE-17340
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17340.03.patch
>
>
> This calls acquire(Connection dbConn, Statement stmt, long extLockId, 
> LockInfo lockInfo)
> for each lock in the same DB transaction - 1 Update stmt per acquire().
> There is no reason all of them cannot be sent in 1 statement if all the locks 
> are granted
> With a lot of partitions this can be a perf issue
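
As a rough sketch of the batching idea above, assuming every lock in the request is granted: one UPDATE with an IN list replaces one UPDATE per lock. The table name `locks`, the columns `state`, `ext_id` and `int_id`, and the class `BatchedLockUpdate` are hypothetical and are not the metastore schema or the code in the attached patch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical table and column names, used only for illustration.
final class BatchedLockUpdate {
    // Builds a single UPDATE covering all internal lock ids of one external
    // lock, instead of issuing one UPDATE statement per acquire() call.
    static String buildAcquireSql(long extLockId, List<Long> intLockIds) {
        if (intLockIds.isEmpty()) {
            throw new IllegalArgumentException("no locks to acquire");
        }
        String ids = intLockIds.stream()
            .map(String::valueOf)
            .collect(Collectors.joining(","));
        return "UPDATE locks SET state = 'acquired'"
            + " WHERE ext_id = " + extLockId
            + " AND int_id IN (" + ids + ")";
    }

    public static void main(String[] args) {
        System.out.println(buildAcquireSql(42L, Arrays.asList(1L, 2L, 3L)));
    }
}
```

Only numeric ids are concatenated into the statement here. Very long IN lists may hit database-specific limits, so a real implementation would probably need to split them into chunks.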




[jira] [Comment Edited] (HIVE-17340) TxnHandler.checkLock() - reduce number of SQL statements

2017-08-24 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140913#comment-16140913
 ] 

Eugene Koifman edited comment on HIVE-17340 at 8/24/17 11:18 PM:
-

[~wzheng] could you review please


was (Author: ekoifman):
@Wei Zheng could you review please

> TxnHandler.checkLock() - reduce number of SQL statements
> 
>
> Key: HIVE-17340
> URL: https://issues.apache.org/jira/browse/HIVE-17340
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17340.03.patch
>
>
> This calls acquire(Connection dbConn, Statement stmt, long extLockId, 
> LockInfo lockInfo)
> for each lock in the same DB transaction - 1 Update stmt per acquire().
> There is no reason all of them cannot be sent in 1 statement if all the locks 
> are granted
> With a lot of partitions this can be a perf issue




[jira] [Commented] (HIVE-17216) Additional qtests for HoS DPP

2017-08-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140908#comment-16140908
 ] 

Hive QA commented on HIVE-17216:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883580/HIVE-17216.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 11001 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6526/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6526/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6526/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883580 - PreCommit-HIVE-Build

> Additional qtests for HoS DPP
> -
>
> Key: HIVE-17216
> URL: https://issues.apache.org/jira/browse/HIVE-17216
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17216.1.patch, HIVE-17216.2.patch
>
>
> There are a few queries that we can add to the HoS DPP tests to increase 
> coverage. There are a few query patterns that the current tests don't cover.




[jira] [Updated] (HIVE-17340) TxnHandler.checkLock() - reduce number of SQL statements

2017-08-24 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17340:
--
Attachment: HIVE-17340.03.patch

> TxnHandler.checkLock() - reduce number of SQL statements
> 
>
> Key: HIVE-17340
> URL: https://issues.apache.org/jira/browse/HIVE-17340
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17340.03.patch
>
>
> This calls acquire(Connection dbConn, Statement stmt, long extLockId, 
> LockInfo lockInfo)
> for each lock in the same DB transaction - 1 Update stmt per acquire().
> There is no reason all of them cannot be sent in 1 statement if all the locks 
> are granted
> With a lot of partitions this can be a perf issue




[jira] [Updated] (HIVE-17340) TxnHandler.checkLock() - reduce number of SQL statements

2017-08-24 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17340:
--
Status: Patch Available  (was: Open)

> TxnHandler.checkLock() - reduce number of SQL statements
> 
>
> Key: HIVE-17340
> URL: https://issues.apache.org/jira/browse/HIVE-17340
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17340.03.patch
>
>
> This calls acquire(Connection dbConn, Statement stmt, long extLockId, 
> LockInfo lockInfo)
> for each lock in the same DB transaction - 1 Update stmt per acquire().
> There is no reason all of them cannot be sent in 1 statement if all the locks 
> are granted
> With a lot of partitions this can be a perf issue




[jira] [Work stopped] (HIVE-17367) IMPORT table doesn't load from data dump if a metadata-only dump was already imported.

2017-08-24 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17367 stopped by Sankar Hariappan.
---
> IMPORT table doesn't load from data dump if a metadata-only dump was already 
> imported.
> --
>
> Key: HIVE-17367
> URL: https://issues.apache.org/jira/browse/HIVE-17367
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Import/Export, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
>
> Repl v1 creates a set of EXPORT/IMPORT commands to replicate modified data 
> (as per events) across clusters.
> For instance, let's say, insert generates 2 events such as
> ALTER_TABLE (ID: 10)
> INSERT (ID: 11)
> Each event generates a set of EXPORT and IMPORT commands.
> ALTER_TABLE event generates metadata only export/import
> INSERT generates metadata+data export/import.
> As Hive always dumps the latest copy of the table during export, it sets the 
> latest notification event ID as the table's current state. So, in this example, 
> the import of metadata by the ALTER_TABLE event sets the current state of the 
> table to 11.
> Now, when we try to import the data dumped by the INSERT event, it is a no-op 
> because the table's current state (11) is equal to the dump state (11), which 
> in turn means the data never gets replicated to the target cluster.
> So, it is necessary to allow overwrite of a table/partition if its current 
> state equals the dump state.




[jira] [Commented] (HIVE-17377) SharedWorkOptimizer might not iterate through TS operators deterministically

2017-08-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140821#comment-16140821
 ] 

Hive QA commented on HIVE-17377:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883559/HIVE-17377.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10999 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6525/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6525/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6525/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883559 - PreCommit-HIVE-Build

> SharedWorkOptimizer might not iterate through TS operators deterministically
> 
>
> Key: HIVE-17377
> URL: https://issues.apache.org/jira/browse/HIVE-17377
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 3.0.0
>
> Attachments: HIVE-17377.01.patch, HIVE-17377.patch
>
>
> Given the same query, multiple executions might yield different 
> reutilization results since the iteration order over TS operators is not 
> guaranteed.
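
One common way to get a stable iteration order, shown here only as a sketch and not as the change made in the attached patch, is to collect the operators in an insertion-ordered map rather than a plain hash-based one. The class `DeterministicOrder` and the operator names are made up for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example; the operator names are placeholders.
final class DeterministicOrder {
    // LinkedHashMap iterates in insertion order, so the visit order is the
    // same on every run for the same input, unlike a HashMap whose order
    // depends on hash codes and table size.
    static List<String> visitOrder(List<String> operators) {
        Map<String, Integer> seen = new LinkedHashMap<>();
        for (String op : operators) {
            seen.putIfAbsent(op, seen.size()); // duplicates keep their first slot
        }
        return new ArrayList<>(seen.keySet());
    }

    public static void main(String[] args) {
        List<String> ops = new ArrayList<>();
        ops.add("TS_3");
        ops.add("TS_0");
        ops.add("TS_7");
        ops.add("TS_0");
        System.out.println(visitOrder(ops)); // [TS_3, TS_0, TS_7]
    }
}
```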




[jira] [Updated] (HIVE-16614) Support "set local time zone" statement

2017-08-24 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-16614:
---
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master, thanks for reviewing [~ashutoshc]!

A few notes:
* This fix changes 'TIMESTAMP WITH TIME ZONE' to 'TIMESTAMP WITH LOCAL TIME 
ZONE' type. To be compliant with SQL standard, 'TIMESTAMP WITH TIME ZONE' 
should store the time-zone displacement within the data value, however our 
current type does not. Since we might aim at having 'TIMESTAMP WITH TIME ZONE' 
conforming to the standard in the longer term, plus there has not been any 
release since 'TIMESTAMP WITH TIME ZONE' was introduced, we make this change to 
cause less confusion for end users.
* FWIW, Postgres implementation of 'TIMESTAMP WITH TIME ZONE' does not conform 
to the standard and its semantics are equivalent to 'TIMESTAMP WITH LOCAL TIME 
ZONE'.
* Other RDBMSs also have both 'TIMESTAMP WITH LOCAL TIME ZONE' and 'TIMESTAMP 
WITH TIME ZONE' types.
* Finally, I will create a follow-up to fix 'TIMESTAMP' type semantics so they 
are not dependent on the system timezone anymore. While 'TIMESTAMP' type will 
be compliant with SQL semantics, I will also take care that we are backwards 
compatible and old 'TIMESTAMP' typed values continue being accessible in the 
same way. (More in the issue to come)
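
The distinction drawn in the notes above can be illustrated with `java.time`, as a loose analogy rather than a description of Hive's implementation: an `OffsetDateTime` carries the displacement inside the value (the standard's 'TIMESTAMP WITH TIME ZONE'), while an `Instant` keeps only the point in time and is rendered in whatever zone the session uses (closer to 'TIMESTAMP WITH LOCAL TIME ZONE'). The class `TimestampSemantics` and its method are assumptions for the example.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical illustration; not Hive's type implementation.
final class TimestampSemantics {
    // "WITH LOCAL TIME ZONE"-like: only the instant is stored, and the
    // zone used for display comes from the session.
    static ZonedDateTime renderInSessionZone(Instant stored, ZoneId sessionZone) {
        return stored.atZone(sessionZone);
    }

    public static void main(String[] args) {
        // "WITH TIME ZONE"-like: the displacement is part of the value.
        OffsetDateTime withZone = OffsetDateTime.parse("2017-08-24T12:00:00-08:00");
        System.out.println(withZone.getOffset());

        // Converting to an Instant drops the original displacement.
        Instant stored = withZone.toInstant();
        System.out.println(renderInSessionZone(stored, ZoneId.of("UTC")));
        System.out.println(renderInSessionZone(stored, ZoneId.of("Asia/Tokyo")));
    }
}
```

Both renderings describe the same moment; only the session zone changes how it is displayed.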

> Support "set local time zone" statement
> ---
>
> Key: HIVE-16614
> URL: https://issues.apache.org/jira/browse/HIVE-16614
> Project: Hive
>  Issue Type: Improvement
>Reporter: Carter Shanklin
>Assignee: Jesus Camacho Rodriguez
> Fix For: 3.0.0
>
> Attachments: HIVE-16614.01.patch, HIVE-16614.02.patch, 
> HIVE-16614.03.patch, HIVE-16614.04.patch, HIVE-16614.05.patch, 
> HIVE-16614.patch
>
>
> HIVE-14412 introduces a timezone-aware timestamp.
> SQL has a concept of default time zone displacements, which are transparently 
> applied when converting between timezone-unaware types and timezone-aware 
> types and, in Hive's case, are also used to shift a timezone aware type to a 
> different time zone, depending on configuration.
> SQL also provides that the default time zone displacement be settable at a 
> session level, so that clients can access a database simultaneously from 
> different time zones and see time values in their own time zone.
> Currently the time zone displacement is fixed and is set based on the system 
> time zone where the Hive client runs (HiveServer2 or Hive CLI). It will be 
> more convenient for users if they have the ability to set their time zone of 
> choice.
> SQL defines "set time zone" with 2 ways of specifying the time zone, first 
> using an interval and second using the special keyword LOCAL.
> Examples:
>   • set time zone '-8:00';
>   • set time zone LOCAL;
> LOCAL means to set the current default time zone displacement to the 
> session's original default time zone displacement.
> Reference: SQL:2011 section 19.4
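
The session behaviour described above can be modelled in a few lines. This is a sketch under assumed names (`SessionTimeZone`, `setDisplacementHours`, `setLocal`), not Hive's session code: the session remembers its original default displacement, an explicit setting overrides it, and LOCAL restores the original.

```java
import java.time.ZoneOffset;

// Hypothetical model of a per-session time zone displacement.
final class SessionTimeZone {
    private final ZoneOffset originalDefault; // fixed when the session starts
    private ZoneOffset current;

    SessionTimeZone(ZoneOffset originalDefault) {
        this.originalDefault = originalDefault;
        this.current = originalDefault;
    }

    // Corresponds to: set time zone '-8:00' (whole hours only in this sketch)
    void setDisplacementHours(int hours) {
        current = ZoneOffset.ofHours(hours);
    }

    // Corresponds to: set time zone LOCAL
    void setLocal() {
        current = originalDefault;
    }

    ZoneOffset current() {
        return current;
    }

    public static void main(String[] args) {
        SessionTimeZone session = new SessionTimeZone(ZoneOffset.ofHours(2));
        session.setDisplacementHours(-8);
        System.out.println(session.current());
        session.setLocal();
        System.out.println(session.current());
    }
}
```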




[jira] [Updated] (HIVE-17373) Upgrade some dependency versions

2017-08-24 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-17373:

Attachment: HIVE-17373.2.patch

> Upgrade some dependency versions
> 
>
> Key: HIVE-17373
> URL: https://issues.apache.org/jira/browse/HIVE-17373
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17373.1.patch, HIVE-17373.2.patch
>
>
> Upgrade some libraries including log4j to 2.8.2, accumulo to 1.8.1 and 
> commons-httpclient to 3.1. 




[jira] [Updated] (HIVE-17377) SharedWorkOptimizer might not iterate through TS operators deterministically

2017-08-24 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17377:
---
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master, thanks for reviewing [~ashutoshc]!

> SharedWorkOptimizer might not iterate through TS operators deterministically
> 
>
> Key: HIVE-17377
> URL: https://issues.apache.org/jira/browse/HIVE-17377
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 3.0.0
>
> Attachments: HIVE-17377.01.patch, HIVE-17377.patch
>
>
> Given the same query, multiple executions might yield different 
> reutilization results since the iteration order over TS operators is not 
> guaranteed.




[jira] [Updated] (HIVE-17205) add functional support

2017-08-24 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17205:
--
Attachment: HIVE-17205.16.patch

[~wzheng] could you review please

> add functional support
> --
>
> Key: HIVE-17205
> URL: https://issues.apache.org/jira/browse/HIVE-17205
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17205.01.patch, HIVE-17205.02.patch, 
> HIVE-17205.03.patch, HIVE-17205.09.patch, HIVE-17205.10.patch, 
> HIVE-17205.11.patch, HIVE-17205.12.patch, HIVE-17205.13.patch, 
> HIVE-17205.14.patch, HIVE-17205.15.patch, HIVE-17205.16.patch
>
>
> make sure unbucketed tables can be marked transactional=true
> make insert/update/delete/compaction work




[jira] [Commented] (HIVE-16614) Support "set local time zone" statement

2017-08-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140675#comment-16140675
 ] 

Hive QA commented on HIVE-16614:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883554/HIVE-16614.05.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 11001 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6524/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6524/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6524/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883554 - PreCommit-HIVE-Build

> Support "set local time zone" statement
> ---
>
> Key: HIVE-16614
> URL: https://issues.apache.org/jira/browse/HIVE-16614
> Project: Hive
>  Issue Type: Improvement
>Reporter: Carter Shanklin
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16614.01.patch, HIVE-16614.02.patch, 
> HIVE-16614.03.patch, HIVE-16614.04.patch, HIVE-16614.05.patch, 
> HIVE-16614.patch
>
>
> HIVE-14412 introduces a timezone-aware timestamp.
> SQL has a concept of default time zone displacements, which are transparently 
> applied when converting between timezone-unaware types and timezone-aware 
> types and, in Hive's case, are also used to shift a timezone-aware type to a 
> different time zone, depending on configuration.
> SQL also provides that the default time zone displacement be settable at a 
> session level, so that clients can access a database simultaneously from 
> different time zones and see time values in their own time zone.
> Currently the time zone displacement is fixed and is set based on the system 
> time zone where the Hive client runs (HiveServer2 or Hive CLI). It will be 
> more convenient for users if they have the ability to set their time zone of 
> choice.
> SQL defines "set time zone" with 2 ways of specifying the time zone, first 
> using an interval and second using the special keyword LOCAL.
> Examples:
>   • set time zone '-8:00';
>   • set time zone LOCAL;
> LOCAL means to set the current default time zone displacement to the 
> session's original default time zone displacement.
> Reference: SQL:2011 section 19.4
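
The effect of a session-level default time zone displacement can be sketched in Java with {{java.time}}. This is only an illustration under stated assumptions, not Hive's implementation: the class {{SessionTimeZoneSketch}} and its methods are hypothetical, the displacement is written in the zero-padded form {{-08:00}} that {{ZoneOffset.of}} accepts, and LOCAL is modeled as resetting to the displacement the session started with.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

// Hypothetical model of a session that holds a default time zone displacement.
class SessionTimeZoneSketch {
    private final ZoneOffset originalOffset; // displacement the session started with
    private ZoneOffset currentOffset;        // displacement currently in effect

    SessionTimeZoneSketch(ZoneOffset originalOffset) {
        this.originalOffset = originalOffset;
        this.currentOffset = originalOffset;
    }

    // Models: set time zone '<displacement>';
    void setTimeZone(String displacement) {
        currentOffset = ZoneOffset.of(displacement);
    }

    // Models: set time zone LOCAL;
    void setTimeZoneLocal() {
        currentOffset = originalOffset;
    }

    // Renders a timezone-aware instant in the session's current displacement.
    OffsetDateTime render(Instant instant) {
        return instant.atOffset(currentOffset);
    }

    public static void main(String[] args) {
        SessionTimeZoneSketch session = new SessionTimeZoneSketch(ZoneOffset.UTC);
        Instant t = Instant.parse("2017-08-24T12:00:00Z");
        System.out.println(session.render(t)); // original displacement
        session.setTimeZone("-08:00");
        System.out.println(session.render(t)); // same instant, shifted for display
        session.setTimeZoneLocal();
        System.out.println(session.render(t)); // back to the original displacement
    }
}
```

The instant itself never changes; only the displacement used to present it does, which is why two clients in different zones can read the same stored value.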



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Commented] (HIVE-16886) HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently

2017-08-24 Thread Alexander Kolbasov (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140668#comment-16140668
 ] 

Alexander Kolbasov commented on HIVE-16886:
---

[~anishek] I am not saying there would be a deadlock; it all depends on 
specifics. The notification id is updated as part of a larger transaction 
where lock ordering may not be that easy to enforce, but it may be a non-issue 
here.

> HMS log notifications may have duplicated event IDs if multiple HMS are 
> running concurrently
> 
>
> Key: HIVE-16886
> URL: https://issues.apache.org/jira/browse/HIVE-16886
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Metastore
>Reporter: Sergio Peña
>Assignee: anishek
> Attachments: datastore-identity-holes.diff, HIVE-16886.1.patch
>
>
> When running multiple Hive Metastore servers and DB notifications are 
> enabled, I could see that notifications can be persisted with a duplicated 
> event ID. 
> This does not happen when running multiple threads in a single HMS node due 
> to the locking acquired on the DbNotificationsLog class, but multiple HMS 
> could cause conflicts.
> The issue is in the ObjectStore#addNotificationEvent() method. The event ID 
> fetched from the datastore is used for the new notification, incremented in 
> the server itself, then persisted or updated back to the datastore. If 2 
> servers read the same ID, then these 2 servers write a new notification with 
> the same ID.
> The event ID is neither unique nor a primary key.
> Here's a test case using the TestObjectStore class that confirms this issue:
> {noformat}
> @Test
> public void testConcurrentAddNotifications() throws ExecutionException,
>     InterruptedException {
>   final int NUM_THREADS = 2;
>   // countIn: each worker signals it is ready; countOut: releases all workers at once
>   CountDownLatch countIn = new CountDownLatch(NUM_THREADS);
>   CountDownLatch countOut = new CountDownLatch(1);
>   HiveConf conf = new HiveConf();
>   conf.setVar(HiveConf.ConfVars.METASTORE_EXPRESSION_PROXY_CLASS,
>       MockPartitionExpressionProxy.class.getName());
>   ExecutorService executorService = Executors.newFixedThreadPool(NUM_THREADS);
>   FutureTask<Void> tasks[] = new FutureTask[NUM_THREADS];
>   for (int i = 0; i < NUM_THREADS; ++i) {
>     final int n = i;
>     tasks[i] = new FutureTask<Void>(new Callable<Void>() {
>       @Override
>       public Void call() throws Exception {
>         ObjectStore store = new ObjectStore();
>         store.setConf(conf);
>         NotificationEvent dbEvent =
>             new NotificationEvent(0, 0,
>                 EventMessage.EventType.CREATE_DATABASE.toString(), "CREATE DATABASE DB" + n);
>         System.out.println("ADDING NOTIFICATION");
>         countIn.countDown();
>         countOut.await();
>         store.addNotificationEvent(dbEvent);
>         System.out.println("FINISH NOTIFICATION");
>         return null;
>       }
>     });
>     executorService.execute(tasks[i]);
>   }
>   countIn.await();
>   countOut.countDown();
>   for (int i = 0; i < NUM_THREADS; ++i) {
>     tasks[i].get();
>   }
>   NotificationEventResponse eventResponse =
>       objectStore.getNextNotification(new NotificationEventRequest());
>   Assert.assertEquals(2, eventResponse.getEventsSize());
>   Assert.assertEquals(1, eventResponse.getEvents().get(0).getEventId());
>   // This fails because the next notification has an event ID = 1
>   Assert.assertEquals(2, eventResponse.getEvents().get(1).getEventId());
> }
> {noformat}
> The last assertion fails because the second event has an ID of 1 instead of the expected 2.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Commented] (HIVE-16886) HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently

2017-08-24 Thread anishek (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140650#comment-16140650
 ] 

anishek commented on HIVE-16886:


[~akolb] the {{select for update}} will be on {{notification_sequence}}, and 
since it's an *exclusive* lock, why would there be deadlocks?
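
The argument can be illustrated with an in-memory analogy. This is a sketch under stated assumptions, not the metastore code: the class {{NotificationSequenceSketch}} is hypothetical, and a {{ReentrantLock}} stands in for the exclusive row lock that {{select for update}} would take on {{notification_sequence}}. Every caller acquires that single lock before reading and incrementing the counter, so no two callers can observe the same ID. In general a deadlock needs at least two locks acquired in conflicting orders, which this sketch does not model.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical in-memory stand-in for a sequence row guarded by an exclusive lock.
class NotificationSequenceSketch {
    private final ReentrantLock rowLock = new ReentrantLock(); // plays the role of the row lock
    private long nextEventId = 1;

    // Read, increment, and write back while holding the lock, so the steps are atomic.
    long allocateEventId() {
        rowLock.lock();
        try {
            long id = nextEventId;
            nextEventId = id + 1;
            return id;
        } finally {
            rowLock.unlock();
        }
    }

    // Runs several concurrent allocators and returns how many distinct IDs were handed out.
    static int countDistinctIds(int threads, int idsPerThread) {
        NotificationSequenceSketch sequence = new NotificationSequenceSketch();
        Set<Long> seen = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (int t = 0; t < threads; t++) {
                pool.execute(() -> {
                    for (int i = 0; i < idsPerThread; i++) {
                        seen.add(sequence.allocateEventId());
                    }
                });
            }
            pool.shutdown();
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // With the lock in place, every allocation yields a distinct ID.
        System.out.println(countDistinctIds(4, 1000));
    }
}
```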

> HMS log notifications may have duplicated event IDs if multiple HMS are 
> running concurrently
> 
>
> Key: HIVE-16886
> URL: https://issues.apache.org/jira/browse/HIVE-16886
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Metastore
>Reporter: Sergio Peña
>Assignee: anishek
> Attachments: datastore-identity-holes.diff, HIVE-16886.1.patch
>
>
> When running multiple Hive Metastore servers and DB notifications are 
> enabled, I could see that notifications can be persisted with a duplicated 
> event ID. 
> This does not happen when running multiple threads in a single HMS node due 
> to the locking acquired on the DbNotificationsLog class, but multiple HMS 
> could cause conflicts.
> The issue is in the ObjectStore#addNotificationEvent() method. The event ID 
> fetched from the datastore is used for the new notification, incremented in 
> the server itself, then persisted or updated back to the datastore. If 2 
> servers read the same ID, then these 2 servers write a new notification with 
> the same ID.
> The event ID is neither unique nor a primary key.
> Here's a test case using the TestObjectStore class that confirms this issue:
> {noformat}
> @Test
> public void testConcurrentAddNotifications() throws ExecutionException,
>     InterruptedException {
>   final int NUM_THREADS = 2;
>   // countIn: each worker signals it is ready; countOut: releases all workers at once
>   CountDownLatch countIn = new CountDownLatch(NUM_THREADS);
>   CountDownLatch countOut = new CountDownLatch(1);
>   HiveConf conf = new HiveConf();
>   conf.setVar(HiveConf.ConfVars.METASTORE_EXPRESSION_PROXY_CLASS,
>       MockPartitionExpressionProxy.class.getName());
>   ExecutorService executorService = Executors.newFixedThreadPool(NUM_THREADS);
>   FutureTask<Void> tasks[] = new FutureTask[NUM_THREADS];
>   for (int i = 0; i < NUM_THREADS; ++i) {
>     final int n = i;
>     tasks[i] = new FutureTask<Void>(new Callable<Void>() {
>       @Override
>       public Void call() throws Exception {
>         ObjectStore store = new ObjectStore();
>         store.setConf(conf);
>         NotificationEvent dbEvent =
>             new NotificationEvent(0, 0,
>                 EventMessage.EventType.CREATE_DATABASE.toString(), "CREATE DATABASE DB" + n);
>         System.out.println("ADDING NOTIFICATION");
>         countIn.countDown();
>         countOut.await();
>         store.addNotificationEvent(dbEvent);
>         System.out.println("FINISH NOTIFICATION");
>         return null;
>       }
>     });
>     executorService.execute(tasks[i]);
>   }
>   countIn.await();
>   countOut.countDown();
>   for (int i = 0; i < NUM_THREADS; ++i) {
>     tasks[i].get();
>   }
>   NotificationEventResponse eventResponse =
>       objectStore.getNextNotification(new NotificationEventRequest());
>   Assert.assertEquals(2, eventResponse.getEventsSize());
>   Assert.assertEquals(1, eventResponse.getEvents().get(0).getEventId());
>   // This fails because the next notification has an event ID = 1
>   Assert.assertEquals(2, eventResponse.getEvents().get(1).getEventId());
> }
> {noformat}
> The last assertion fails because the second event has an ID of 1 instead of the expected 2.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Commented] (HIVE-10349) overflow in stats

2017-08-24 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140612#comment-16140612
 ] 

Sergey Shelukhin commented on HIVE-10349:
-

I do not recall... [~prasanth_j] do you remember if this has been fixed 
somewhere? I think it was.

> overflow in stats
> -
>
> Key: HIVE-10349
> URL: https://issues.apache.org/jira/browse/HIVE-10349
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Prasanth Jayachandran
>
> Discovered while running q17 in LLAP.
> {noformat}
> Reducer 2 
> Execution mode: llap
> Reduce Operator Tree:
>   Merge Join Operator
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 _col28 (type: int), _col27 (type: int)
>   1 cs_bill_customer_sk (type: int), cs_item_sk (type: int)
> outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, 
> _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82
> Statistics: Num rows: 1047651367827495040 Data size: 
> 9223372036854775807 Basic stats: COMPLETE Column stats: PARTIAL
> Map Join Operator
>   condition map:
>Inner Join 0 to 1
>   keys:
> 0 _col22 (type: int)
> 1 d_date_sk (type: int)
>   outputColumnNames: _col1, _col2, _col6, _col8, _col9, 
> _col22, _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, 
> _col82, _col86
>   input vertices:
> 1 Map 7
>   Statistics: Num rows: 1152416529588199552 Data size: 
> 9223372036854775807 Basic stats: COMPLETE Column stats: NONE
> {noformat}
> Data size overflows and the row count also looks wrong. I wonder if this is 
> why it generates 1009 reducers for this stage on 6 machines.
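
The data size in the plan, 9223372036854775807, is exactly {{Long.MAX_VALUE}}, which suggests the estimate hit the limit of a 64-bit signed value. A minimal sketch of clamping a product of non-negative statistics follows, contrasted with plain multiplication that wraps silently. {{StatsOverflowSketch}} and {{saturatingMultiply}} are hypothetical names for illustration, not the planner's actual code, and the row size of 100 bytes is made up.

```java
// Hypothetical helper showing clamped versus wrapping multiplication of statistics.
class StatsOverflowSketch {
    // Multiplies two non-negative statistics, clamping at Long.MAX_VALUE instead of wrapping.
    static long saturatingMultiply(long a, long b) {
        if (a < 0 || b < 0) {
            throw new IllegalArgumentException("statistics must be non-negative");
        }
        try {
            return Math.multiplyExact(a, b);
        } catch (ArithmeticException overflow) {
            return Long.MAX_VALUE;
        }
    }

    public static void main(String[] args) {
        long numRows = 1047651367827495040L; // row count taken from the plan in the description
        long assumedRowSize = 100L;          // made-up average row size in bytes
        System.out.println(numRows * assumedRowSize);                     // wraps silently
        System.out.println(saturatingMultiply(numRows, assumedRowSize)); // clamped
    }
}
```

Clamping keeps the estimate monotonic, but downstream decisions such as reducer counts still see an implausibly large number, so the inputs to the estimate remain worth checking.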



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Updated] (HIVE-17297) allow AM to use LLAP guaranteed tasks

2017-08-24 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17297:

Attachment: HIVE-17297.01.patch

It tried to apply the nogen patch. Attaching the same full patch again; I can 
apply it just fine.

> allow AM to use LLAP guaranteed tasks
> -
>
> Key: HIVE-17297
> URL: https://issues.apache.org/jira/browse/HIVE-17297
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17297.01.nogen.patch, HIVE-17297.01.patch, 
> HIVE-17297.01.patch, HIVE-17297.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Updated] (HIVE-17307) Change the metastore to not use the metrics code in hive/common

2017-08-24 Thread Alan Gates (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17307:
--
Status: Patch Available  (was: Open)

Third time's a charm, I hope.  Regenerated the patch again.

> Change the metastore to not use the metrics code in hive/common
> ---
>
> Key: HIVE-17307
> URL: https://issues.apache.org/jira/browse/HIVE-17307
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17307.2.patch, HIVE-17307.3.patch, HIVE-17307.patch
>
>
> As we move code into the standalone metastore module, it cannot use the 
> metrics in hive-common.  We could copy the current Metrics interface or we 
> could change the metastore code to directly use codahale metrics.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Updated] (HIVE-17307) Change the metastore to not use the metrics code in hive/common

2017-08-24 Thread Alan Gates (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17307:
--
Attachment: HIVE-17307.3.patch

> Change the metastore to not use the metrics code in hive/common
> ---
>
> Key: HIVE-17307
> URL: https://issues.apache.org/jira/browse/HIVE-17307
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17307.2.patch, HIVE-17307.3.patch, HIVE-17307.patch
>
>
> As we move code into the standalone metastore module, it cannot use the 
> metrics in hive-common.  We could copy the current Metrics interface or we 
> could change the metastore code to directly use codahale metrics.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Updated] (HIVE-17307) Change the metastore to not use the metrics code in hive/common

2017-08-24 Thread Alan Gates (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17307:
--
Status: Open  (was: Patch Available)

> Change the metastore to not use the metrics code in hive/common
> ---
>
> Key: HIVE-17307
> URL: https://issues.apache.org/jira/browse/HIVE-17307
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17307.2.patch, HIVE-17307.patch
>
>
> As we move code into the standalone metastore module, it cannot use the 
> metrics in hive-common.  We could copy the current Metrics interface or we 
> could change the metastore code to directly use codahale metrics.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Commented] (HIVE-17205) add functional support

2017-08-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140587#comment-16140587
 ] 

Hive QA commented on HIVE-17205:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883541/HIVE-17205.15.patch

{color:green}SUCCESS:{color} +1 due to 11 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 11004 tests 
executed
*Failed tests:*
{noformat}
TestTxnCommandsBase - did not produce a TEST-*.xml file (likely timed out) 
(batchId=280)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=234)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion02 
(batchId=215)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6523/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6523/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6523/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883541 - PreCommit-HIVE-Build

> add functional support
> --
>
> Key: HIVE-17205
> URL: https://issues.apache.org/jira/browse/HIVE-17205
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17205.01.patch, HIVE-17205.02.patch, 
> HIVE-17205.03.patch, HIVE-17205.09.patch, HIVE-17205.10.patch, 
> HIVE-17205.11.patch, HIVE-17205.12.patch, HIVE-17205.13.patch, 
> HIVE-17205.14.patch, HIVE-17205.15.patch
>
>
> make sure unbucketed tables can be marked transactional=true
> make insert/update/delete/compaction work



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Updated] (HIVE-17373) Upgrade some dependency versions

2017-08-24 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-17373:

Attachment: (was: HIVE-17373.1.patch)

> Upgrade some dependency versions
> 
>
> Key: HIVE-17373
> URL: https://issues.apache.org/jira/browse/HIVE-17373
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17373.1.patch
>
>
> Upgrade some libraries including log4j to 2.8.2, accumulo to 1.8.1 and 
> commons-httpclient to 3.1. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Updated] (HIVE-17373) Upgrade some dependency versions

2017-08-24 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-17373:

Attachment: HIVE-17373.1.patch

> Upgrade some dependency versions
> 
>
> Key: HIVE-17373
> URL: https://issues.apache.org/jira/browse/HIVE-17373
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17373.1.patch
>
>
> Upgrade some libraries including log4j to 2.8.2, accumulo to 1.8.1 and 
> commons-httpclient to 3.1. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Commented] (HIVE-16949) Leak of threads from Get-Input-Paths thread pool when more than 1 used in query

2017-08-24 Thread Vihang Karajgaonkar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140563#comment-16140563
 ] 

Vihang Karajgaonkar commented on HIVE-16949:


LGTM, left a comment on RB.

> Leak of threads from Get-Input-Paths thread pool when more than 1 used in 
> query
> ---
>
> Key: HIVE-16949
> URL: https://issues.apache.org/jira/browse/HIVE-16949
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Birger Brunswiek
>Assignee: Sahil Takiar
> Attachments: HIVE-16949.1.patch
>
>
> The commit 
> [20210de|https://github.com/apache/hive/commit/20210dec94148c9b529132b1545df3dd7be083c3]
>  which was part of HIVE-15546 [introduced a thread 
> pool|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L3109]
>  which is not shut down upon completion of its threads. This leads to a leak 
> of threads for each query which uses more than 1 partition. They are not 
> removed automatically. When queries spanning multiple partitions are made, the 
> number of threads increases and is never reduced. On my machine hiveserver2 
> starts to get slower and slower once 10k threads are reached.
> Thread pools only shut down automatically in special circumstances (see 
> [documentation section 
> _Finalization_|https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html]).
>  This is not currently the case for the Get-Input-Paths thread pool. I would 
> add a _pool.shutdown()_ in a finally block just before returning the result 
> to make sure the threads are really shut down.
> My current workaround is to set {{hive.exec.input.listing.max.threads = 1}}. 
> This prevents the thread pool from being spawned 
> [\[1\]|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L2118]
>  
> [\[2\]|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L3107].
> The same issue probably also applies to the [Get-Input-Summary thread 
> pool|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L2193].



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Commented] (HIVE-16949) Leak of threads from Get-Input-Paths thread pool when more than 1 used in query

2017-08-24 Thread Sahil Takiar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140535#comment-16140535
 ] 

Sahil Takiar commented on HIVE-16949:
-

[~vihangk1], [~asherman], [~spena] could someone take a look? Most of the code 
changes are refactoring to make the code more testable. I added a call to 
{{ExecutorService#shutdown}} after all threads have been submitted, and then a 
call to {{ExecutorService#shutdownNow}} in a {{finally}} block.
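
The shutdown pattern described above can be sketched as follows. This is not the patch itself: {{InputPathsPoolSketch}} and {{listAll}} are hypothetical stand-ins, and the per-path work is a placeholder. {{shutdown}} is called once every task has been submitted, so the pool accepts no further work and its threads exit when the queue drains; {{shutdownNow}} in the {{finally}} block releases the workers even if collecting a result fails.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of a per-call thread pool that is always shut down.
class InputPathsPoolSketch {
    static List<String> listAll(List<String> paths, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String path : paths) {
                // Placeholder for the real per-path work.
                futures.add(pool.submit(() -> "listed:" + path));
            }
            // No more tasks will be submitted; already queued tasks still run.
            pool.shutdown();
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            // Safety net: interrupts any worker still running if we exit early.
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(listAll(Arrays.asList("part=1", "part=2", "part=3"), 2));
    }
}
```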

> Leak of threads from Get-Input-Paths thread pool when more than 1 used in 
> query
> ---
>
> Key: HIVE-16949
> URL: https://issues.apache.org/jira/browse/HIVE-16949
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Birger Brunswiek
>Assignee: Sahil Takiar
> Attachments: HIVE-16949.1.patch
>
>
> The commit 
> [20210de|https://github.com/apache/hive/commit/20210dec94148c9b529132b1545df3dd7be083c3]
>  which was part of HIVE-15546 [introduced a thread 
> pool|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L3109]
>  which is not shut down upon completion of its threads. This leads to a leak 
> of threads for each query which uses more than 1 partition. They are not 
> removed automatically. When queries spanning multiple partitions are made, the 
> number of threads increases and is never reduced. On my machine hiveserver2 
> starts to get slower and slower once 10k threads are reached.
> Thread pools only shut down automatically in special circumstances (see 
> [documentation section 
> _Finalization_|https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html]).
>  This is not currently the case for the Get-Input-Paths thread pool. I would 
> add a _pool.shutdown()_ in a finally block just before returning the result 
> to make sure the threads are really shut down.
> My current workaround is to set {{hive.exec.input.listing.max.threads = 1}}. 
> This prevents the thread pool from being spawned 
> [\[1\]|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L2118]
>  
> [\[2\]|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L3107].
> The same issue probably also applies to the [Get-Input-Summary thread 
> pool|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L2193].



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Updated] (HIVE-17385) Fix incremental repl error for non-native tables

2017-08-24 Thread Tao Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17385:
--
Status: Patch Available  (was: Open)

> Fix incremental repl error for non-native tables
> 
>
> Key: HIVE-17385
> URL: https://issues.apache.org/jira/browse/HIVE-17385
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17385.1.patch, HIVE-17385.2.patch, 
> HIVE-17385.3.patch
>
>
> See the error below with incremental replication for non-native (storage handler 
> based) tables. The bug is that we are not checking whether a table should be 
> dumped/exported during incremental dump.
> 2017-08-02T12:31:48,195 ERROR [HiveServer2-Background-Pool: Thread-8078]: 
> exec.DDLTask (DDLTask.java:failed(632)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> MetaException(message:LOCATION may not be specified for HBase.)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Updated] (HIVE-17385) Fix incremental repl error for non-native tables

2017-08-24 Thread Tao Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17385:
--
Attachment: HIVE-17385.3.patch

> Fix incremental repl error for non-native tables
> 
>
> Key: HIVE-17385
> URL: https://issues.apache.org/jira/browse/HIVE-17385
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17385.1.patch, HIVE-17385.2.patch, 
> HIVE-17385.3.patch
>
>
> See the error below with incremental replication for non-native (storage handler 
> based) tables. The bug is that we are not checking whether a table should be 
> dumped/exported during incremental dump.
> 2017-08-02T12:31:48,195 ERROR [HiveServer2-Background-Pool: Thread-8078]: 
> exec.DDLTask (DDLTask.java:failed(632)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> MetaException(message:LOCATION may not be specified for HBase.)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Updated] (HIVE-17216) Additional qtests for HoS DPP

2017-08-24 Thread Sahil Takiar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17216:

Attachment: HIVE-17216.2.patch

Re-basing patch.

> Additional qtests for HoS DPP
> -
>
> Key: HIVE-17216
> URL: https://issues.apache.org/jira/browse/HIVE-17216
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17216.1.patch, HIVE-17216.2.patch
>
>
> There are a few queries that we can add to the HoS DPP tests to increase 
> coverage. There are a few query patterns that the current tests don't cover.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Commented] (HIVE-17382) Change startsWith relation introduced in HIVE-17316

2017-08-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140494#comment-16140494
 ] 

Hive QA commented on HIVE-17382:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883527/HIVE-17382.01.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 22 failed/errored test(s), 11000 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[drop_with_concurrency]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[escape_comments] 
(batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[mapjoin2] 
(batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[select_dummy_source] 
(batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_10] 
(batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_11] 
(batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_12] 
(batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_13] 
(batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_16] 
(batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_1] 
(batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_2] 
(batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_3] 
(batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_7] 
(batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[udf_unix_timestamp] 
(batchId=240)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hadoop.hive.conf.TestHiveConfRestrictList.testMultipleRestrictions 
(batchId=250)
org.apache.hadoop.hive.metastore.datasource.TestDataSourceProviderFactory.testBoneCPConfigCannotBeSet
 (batchId=201)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6522/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6522/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6522/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 22 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883527 - PreCommit-HIVE-Build

> Change startsWith relation introduced in HIVE-17316
> ---
>
> Key: HIVE-17382
> URL: https://issues.apache.org/jira/browse/HIVE-17382
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Fix For: 3.0.0
>
> Attachments: HIVE-17382.01.patch
>
>
> In HiveConf, the check should be whether the new name starts with a 
> restricted/hidden variable prefix, and not vice versa.
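
The direction of the check matters, as the following sketch shows. {{RestrictListSketch}} and the prefix {{example.secret}} are hypothetical and not taken from HiveConf; the code only contrasts {{name.startsWith(prefix)}} with the reversed {{prefix.startsWith(name)}}.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the two possible startsWith directions.
class RestrictListSketch {
    // Intended relation: a name is restricted when it begins with a restricted prefix.
    static boolean isRestricted(String name, List<String> restrictedPrefixes) {
        for (String prefix : restrictedPrefixes) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    // Reversed relation: matches names that are themselves a prefix of a restricted entry.
    static boolean isRestrictedReversed(String name, List<String> restrictedPrefixes) {
        for (String prefix : restrictedPrefixes) {
            if (prefix.startsWith(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> prefixes = Arrays.asList("example.secret");
        // A variable under the restricted prefix: caught only by the intended check.
        System.out.println(isRestricted("example.secret.password", prefixes));
        System.out.println(isRestrictedReversed("example.secret.password", prefixes));
        // A shorter, unrelated name: wrongly caught by the reversed check.
        System.out.println(isRestricted("example", prefixes));
        System.out.println(isRestrictedReversed("example", prefixes));
    }
}
```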



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Commented] (HIVE-17385) Fix incremental repl error for non-native tables

2017-08-24 Thread Tao Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140446#comment-16140446
 ] 

Tao Li commented on HIVE-17385:
---

[~daijy] Can you please take a look at this change?

> Fix incremental repl error for non-native tables
> 
>
> Key: HIVE-17385
> URL: https://issues.apache.org/jira/browse/HIVE-17385
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17385.1.patch, HIVE-17385.2.patch
>
>
> See the error below with incremental replication for non-native (storage handler 
> based) tables. The bug is that we are not checking whether a table should be 
> dumped/exported during incremental dump.
> 2017-08-02T12:31:48,195 ERROR [HiveServer2-Background-Pool: Thread-8078]: 
> exec.DDLTask (DDLTask.java:failed(632)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> MetaException(message:LOCATION may not be specified for HBase.)
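
A rough sketch of the kind of check the description calls for, under stated assumptions: tables are modeled as plain maps of table properties, and a table is treated as non-native when it carries a `storage_handler` property. The class and method names are hypothetical, and this is not the code in the attached patches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative only: a plain-Java model, not Hive's replication code.
final class DumpFilter {
    // Assumed table property key that marks a storage-handler-based table.
    static final String STORAGE_HANDLER_KEY = "storage_handler";

    // A table is considered non-native when a storage handler is set.
    static boolean isNonNative(Map<String, String> tableParams) {
        String handler = tableParams.get(STORAGE_HANDLER_KEY);
        return handler != null && !handler.isEmpty();
    }

    // Returns the names of tables that should be dumped, skipping
    // non-native tables, in the iteration order of the input map.
    static List<String> tablesToDump(Map<String, Map<String, String>> tables) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> e : tables.entrySet()) {
            if (!isNonNative(e.getValue())) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

Applying the same filter in both the bootstrap and the incremental dump paths would keep an HBase-backed table from reaching the load side, where the LOCATION clause is rejected.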




[jira] [Updated] (HIVE-17385) Fix incremental repl error for non-native tables

2017-08-24 Thread Tao Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17385:
--
Attachment: HIVE-17385.2.patch

> Fix incremental repl error for non-native tables
> 
>
> Key: HIVE-17385
> URL: https://issues.apache.org/jira/browse/HIVE-17385
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17385.1.patch, HIVE-17385.2.patch
>
>
> See the error below with incremental replication for non-native (storage 
> handler based) tables. The bug is that we do not check whether a table 
> should be dumped/exported during incremental dump.
> 2017-08-02T12:31:48,195 ERROR [HiveServer2-Background-Pool: Thread-8078]: 
> exec.DDLTask (DDLTask.java:failed(632)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> MetaException(message:LOCATION may not be specified for HBase.)




[jira] [Updated] (HIVE-17385) Fix incremental repl error for non-native tables

2017-08-24 Thread Tao Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17385:
--
Attachment: HIVE-17385.1.patch

> Fix incremental repl error for non-native tables
> 
>
> Key: HIVE-17385
> URL: https://issues.apache.org/jira/browse/HIVE-17385
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17385.1.patch
>
>
> See the error below with incremental replication for non-native (storage 
> handler based) tables. The bug is that we do not check whether a table 
> should be dumped/exported during incremental dump.
> 2017-08-02T12:31:48,195 ERROR [HiveServer2-Background-Pool: Thread-8078]: 
> exec.DDLTask (DDLTask.java:failed(632)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> MetaException(message:LOCATION may not be specified for HBase.)




[jira] [Commented] (HIVE-17361) Support LOAD DATA for transactional tables

2017-08-24 Thread Wei Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140387#comment-16140387
 ] 

Wei Zheng commented on HIVE-17361:
--

[~ekoifman] Can you take a look please?

> Support LOAD DATA for transactional tables
> --
>
> Key: HIVE-17361
> URL: https://issues.apache.org/jira/browse/HIVE-17361
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-17361.1.patch
>
>
> LOAD DATA has not been supported for transactional tables since ACID was 
> introduced. We need to close this gap between ACID tables and regular Hive 
> tables.




[jira] [Commented] (HIVE-17318) Make Hikari CP configurable using hive properties in hive-site.xml

2017-08-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140357#comment-16140357
 ] 

Hive QA commented on HIVE-17318:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883517/HIVE-17318.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 11003 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges
 (batchId=221)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6521/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6521/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6521/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883517 - PreCommit-HIVE-Build

> Make Hikari CP configurable using hive properties in hive-site.xml
> --
>
> Key: HIVE-17318
> URL: https://issues.apache.org/jira/browse/HIVE-17318
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-17318.01.patch, HIVE-17318.02.patch
>
>





[jira] [Commented] (HIVE-17377) SharedWorkOptimizer might not iterate through TS operators deterministically

2017-08-24 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140352#comment-16140352
 ] 

Ashutosh Chauhan commented on HIVE-17377:
-

+1

> SharedWorkOptimizer might not iterate through TS operators deterministically
> 
>
> Key: HIVE-17377
> URL: https://issues.apache.org/jira/browse/HIVE-17377
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17377.01.patch, HIVE-17377.patch
>
>
> Given the same query, multiple executions might yield different 
> reutilization results, since the iteration order over TS operators is not 
> guaranteed.
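
One generic way to make such an iteration deterministic is to impose an explicit order before looping, instead of relying on the order of a hash-based collection. The sketch below is an assumption for illustration: operators are modeled as plain string ids, the class and method names are hypothetical, and it does not reflect what the attached patches actually do.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Illustrative only: not the SharedWorkOptimizer implementation.
final class DeterministicOrder {
    // Copies the ids into a list and sorts them lexicographically, so the
    // result does not depend on the iteration order of the input set.
    static List<String> stableOrder(Set<String> operatorIds) {
        List<String> ordered = new ArrayList<>(operatorIds);
        Collections.sort(ordered);
        return ordered;
    }
}
```

Lexicographic sorting places `TS_10` before `TS_2`, which is still deterministic; an insertion-ordered collection such as `LinkedHashSet` is an alternative when the original discovery order matters.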




[jira] [Commented] (HIVE-17241) Change metastore classes to not use the shims

2017-08-24 Thread Vihang Karajgaonkar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140344#comment-16140344
 ] 

Vihang Karajgaonkar commented on HIVE-17241:


[~alangates] I have created HIVE-17371 to fix that. I can take that up.

> Change metastore classes to not use the shims
> -
>
> Key: HIVE-17241
> URL: https://issues.apache.org/jira/browse/HIVE-17241
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 3.0.0
>
> Attachments: HIVE-17241.2.patch, HIVE-17241.patch
>
>
> As part of moving the metastore into a standalone package, it will no longer 
> have access to the shims.  This means we need to either copy them or access 
> the underlying Hadoop operations directly.




[jira] [Updated] (HIVE-17385) Fix incremental repl error for non-native tables

2017-08-24 Thread Tao Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17385:
--
Description: 
See the error below with incremental replication for non-native (storage 
handler based) tables. The bug is that we do not check whether a table 
should be dumped/exported during incremental dump.

2017-08-02T12:31:48,195 ERROR [HiveServer2-Background-Pool: Thread-8078]: 
exec.DDLTask (DDLTask.java:failed(632)) - 
org.apache.hadoop.hive.ql.metadata.HiveException: 
MetaException(message:LOCATION may not be specified for HBase.)



  was:
See below error with incremental replication for non-native (storage handler 
based) tables.

2017-08-02T12:31:48,195 ERROR [HiveServer2-Background-Pool: Thread-8078]: 
exec.DDLTask (DDLTask.java:failed(632)) - 
org.apache.hadoop.hive.ql.metadata.HiveException: 
MetaException(message:LOCATION may not be specified for HBase.)


> Fix incremental repl error for non-native tables
> 
>
> Key: HIVE-17385
> URL: https://issues.apache.org/jira/browse/HIVE-17385
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
>
> See the error below with incremental replication for non-native (storage 
> handler based) tables. The bug is that we do not check whether a table 
> should be dumped/exported during incremental dump.
> 2017-08-02T12:31:48,195 ERROR [HiveServer2-Background-Pool: Thread-8078]: 
> exec.DDLTask (DDLTask.java:failed(632)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> MetaException(message:LOCATION may not be specified for HBase.)




[jira] [Updated] (HIVE-17385) Fix incremental repl error for non-native tables

2017-08-24 Thread Tao Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17385:
--
Description: 
See below error with incremental replication for non-native (storage handler 
based) tables.

2017-08-02T12:31:48,195 ERROR [HiveServer2-Background-Pool: Thread-8078]: 
exec.DDLTask (DDLTask.java:failed(632)) - 
org.apache.hadoop.hive.ql.metadata.HiveException: 
MetaException(message:LOCATION may not be specified for HBase.)

  was:See below error for 


> Fix incremental repl error for non-native tables
> 
>
> Key: HIVE-17385
> URL: https://issues.apache.org/jira/browse/HIVE-17385
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
>
> See below error with incremental replication for non-native (storage handler 
> based) tables.
> 2017-08-02T12:31:48,195 ERROR [HiveServer2-Background-Pool: Thread-8078]: 
> exec.DDLTask (DDLTask.java:failed(632)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> MetaException(message:LOCATION may not be specified for HBase.)




[jira] [Updated] (HIVE-17385) Fix incremental repl error for non-native tables

2017-08-24 Thread Tao Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17385:
--
Summary: Fix incremental repl error for non-native tables  (was: Fix repl 
load error for non-native tables)

> Fix incremental repl error for non-native tables
> 
>
> Key: HIVE-17385
> URL: https://issues.apache.org/jira/browse/HIVE-17385
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
>
> See below error for 




[jira] [Updated] (HIVE-17385) Fix incremental repl error for non-native tables

2017-08-24 Thread Tao Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17385:
--
Component/s: repl

> Fix incremental repl error for non-native tables
> 
>
> Key: HIVE-17385
> URL: https://issues.apache.org/jira/browse/HIVE-17385
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
>
> See below error for 




[jira] [Updated] (HIVE-17385) Fix repl load error for non-native tables

2017-08-24 Thread Tao Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17385:
--
Description: See below error for 

> Fix repl load error for non-native tables
> -
>
> Key: HIVE-17385
> URL: https://issues.apache.org/jira/browse/HIVE-17385
> Project: Hive
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
>
> See below error for 




[jira] [Assigned] (HIVE-17385) Fix repl load error for non-native tables

2017-08-24 Thread Tao Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li reassigned HIVE-17385:
-


> Fix repl load error for non-native tables
> -
>
> Key: HIVE-17385
> URL: https://issues.apache.org/jira/browse/HIVE-17385
> Project: Hive
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
>





[jira] [Commented] (HIVE-17377) SharedWorkOptimizer might not iterate through TS operators deterministically

2017-08-24 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140317#comment-16140317
 ] 

Jesus Camacho Rodriguez commented on HIVE-17377:


[~ashutoshc], could you take a look? Thanks

> SharedWorkOptimizer might not iterate through TS operators deterministically
> 
>
> Key: HIVE-17377
> URL: https://issues.apache.org/jira/browse/HIVE-17377
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17377.01.patch, HIVE-17377.patch
>
>
> Given the same query, multiple executions might yield different 
> reutilization results, since the iteration order over TS operators is not 
> guaranteed.




[jira] [Updated] (HIVE-17377) SharedWorkOptimizer might not iterate through TS operators deterministically

2017-08-24 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17377:
---
Attachment: HIVE-17377.01.patch

> SharedWorkOptimizer might not iterate through TS operators deterministically
> 
>
> Key: HIVE-17377
> URL: https://issues.apache.org/jira/browse/HIVE-17377
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17377.01.patch, HIVE-17377.patch
>
>
> Given the same query, multiple executions might yield different 
> reutilization results, since the iteration order over TS operators is not 
> guaranteed.




[jira] [Updated] (HIVE-17372) update druid dependency to druid 0.10.1

2017-08-24 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-17372:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Slim!

> update druid dependency to druid 0.10.1
> ---
>
> Key: HIVE-17372
> URL: https://issues.apache.org/jira/browse/HIVE-17372
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Fix For: 3.0.0
>
> Attachments: HIVE-17372.patch
>
>
> Update to the most recent Druid version, to be released August 23.




[jira] [Commented] (HIVE-16614) Support "set local time zone" statement

2017-08-24 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140294#comment-16140294
 ] 

Ashutosh Chauhan commented on HIVE-16614:
-

+1 pending tests

> Support "set local time zone" statement
> ---
>
> Key: HIVE-16614
> URL: https://issues.apache.org/jira/browse/HIVE-16614
> Project: Hive
>  Issue Type: Improvement
>Reporter: Carter Shanklin
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16614.01.patch, HIVE-16614.02.patch, 
> HIVE-16614.03.patch, HIVE-16614.04.patch, HIVE-16614.05.patch, 
> HIVE-16614.patch
>
>
> HIVE-14412 introduces a timezone-aware timestamp.
> SQL has a concept of default time zone displacements, which are transparently 
> applied when converting between timezone-unaware types and timezone-aware 
> types and, in Hive's case, are also used to shift a timezone aware type to a 
> different time zone, depending on configuration.
> SQL also provides that the default time zone displacement be settable at a 
> session level, so that clients can access a database simultaneously from 
> different time zones and see time values in their own time zone.
> Currently the time zone displacement is fixed and is set based on the system 
> time zone where the Hive client runs (HiveServer2 or Hive CLI). It will be 
> more convenient for users if they have the ability to set their time zone of 
> choice.
> SQL defines "set time zone" with 2 ways of specifying the time zone, first 
> using an interval and second using the special keyword LOCAL.
> Examples:
>   • set time zone '-8:00';
>   • set time zone LOCAL;
> LOCAL means to set the current default time zone displacement to the 
> session's original default time zone displacement.
> Reference: SQL:2011 section 19.4
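
The session-level behavior described above can be sketched with `java.time`. This is a minimal model under stated assumptions, not Hive's implementation: the `SessionTimeZone` class and its method names are hypothetical, and the interval string is parsed by hand in the `'-8:00'` form used in the examples.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Illustrative only: models a per-session default time zone displacement.
final class SessionTimeZone {
    private final ZoneId original; // the session's original default
    private ZoneId current;        // the displacement currently in effect

    SessionTimeZone(ZoneId original) {
        this.original = original;
        this.current = original;
    }

    // Parses a displacement such as "-8:00" or "+5:30" into a ZoneOffset.
    static ZoneOffset parseDisplacement(String interval) {
        String s = interval.trim();
        int sign = 1;
        if (s.startsWith("-")) {
            sign = -1;
            s = s.substring(1);
        } else if (s.startsWith("+")) {
            s = s.substring(1);
        }
        String[] parts = s.split(":");
        int hours = Integer.parseInt(parts[0]);
        int minutes = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        // Hours and minutes must carry the same sign.
        return ZoneOffset.ofHoursMinutes(sign * hours, sign * minutes);
    }

    // Models: set time zone '<interval>';
    void setDisplacement(String interval) {
        current = parseDisplacement(interval);
    }

    // Models: set time zone LOCAL; (restore the session's original default)
    void setLocal() {
        current = original;
    }

    // Renders an instant as a local date-time in the current displacement.
    LocalDateTime render(Instant instant) {
        return LocalDateTime.ofInstant(instant, current);
    }
}
```

Two sessions that hold different displacements render the same stored instant as different local times, which is the per-client view the description asks for.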




[jira] [Updated] (HIVE-16614) Support "set local time zone" statement

2017-08-24 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-16614:
---
Attachment: HIVE-16614.05.patch

> Support "set local time zone" statement
> ---
>
> Key: HIVE-16614
> URL: https://issues.apache.org/jira/browse/HIVE-16614
> Project: Hive
>  Issue Type: Improvement
>Reporter: Carter Shanklin
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16614.01.patch, HIVE-16614.02.patch, 
> HIVE-16614.03.patch, HIVE-16614.04.patch, HIVE-16614.05.patch, 
> HIVE-16614.patch
>
>
> HIVE-14412 introduces a timezone-aware timestamp.
> SQL has a concept of default time zone displacements, which are transparently 
> applied when converting between timezone-unaware types and timezone-aware 
> types and, in Hive's case, are also used to shift a timezone aware type to a 
> different time zone, depending on configuration.
> SQL also provides that the default time zone displacement be settable at a 
> session level, so that clients can access a database simultaneously from 
> different time zones and see time values in their own time zone.
> Currently the time zone displacement is fixed and is set based on the system 
> time zone where the Hive client runs (HiveServer2 or Hive CLI). It will be 
> more convenient for users if they have the ability to set their time zone of 
> choice.
> SQL defines "set time zone" with 2 ways of specifying the time zone, first 
> using an interval and second using the special keyword LOCAL.
> Examples:
>   • set time zone '-8:00';
>   • set time zone LOCAL;
> LOCAL means to set the current default time zone displacement to the 
> session's original default time zone displacement.
> Reference: SQL:2011 section 19.4




[jira] [Commented] (HIVE-13989) Extended ACLs are not handled according to specification

2017-08-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-13989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140257#comment-16140257
 ] 

Hive QA commented on HIVE-13989:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883493/HIVE-13989.4-branch-2.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10606 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explaindenpendencydiffengs]
 (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=142)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=144)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[explaindenpendencydiffengs]
 (batchId=115)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_ptf] 
(batchId=125)
org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=176)
org.apache.hive.jdbc.TestJdbcDriver2.testYarnATSGuid (batchId=222)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6520/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6520/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6520/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883493 - PreCommit-HIVE-Build

> Extended ACLs are not handled according to specification
> 
>
> Key: HIVE-13989
> URL: https://issues.apache.org/jira/browse/HIVE-13989
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Chris Drome
>Assignee: Chris Drome
> Attachments: HIVE-13989.1-branch-1.patch, HIVE-13989.1.patch, 
> HIVE-13989.4-branch-2.2.patch, HIVE-13989.4-branch-2.patch, 
> HIVE-13989-branch-1.patch, HIVE-13989-branch-2.2.patch, 
> HIVE-13989-branch-2.2.patch, HIVE-13989-branch-2.2.patch
>
>
> Hive takes two approaches to working with extended ACLs depending on whether 
> data is being produced via a Hive query or HCatalog APIs. A Hive query will 
> run an FsShell command to recursively set the extended ACLs for a directory 
> sub-tree. HCatalog APIs will attempt to build up the directory sub-tree 
> programmatically and run some code to set the ACLs to match the parent 
> directory.
> Some incorrect assumptions were made when implementing the extended ACLs 
> support. Refer to https://issues.apache.org/jira/browse/HDFS-4685 for the 
> design documents of extended ACLs in HDFS. These documents model the 
> implementation after the POSIX implementation on Linux, which can be found at 
> http://www.vanemery.com/Linux/ACL/POSIX_ACL_on_Linux.html.
> The code for setting extended ACLs via HCatalog APIs is found in 
> HdfsUtils.java:
> {code}
> if (aclEnabled) {
>   aclStatus = sourceStatus.getAclStatus();
>   if (aclStatus != null) {
>     LOG.trace(aclStatus.toString());
>     aclEntries = aclStatus.getEntries();
>     removeBaseAclEntries(aclEntries);
>     // the ACL APIs also expect the traditional user/group/other permissions in the form of ACL entries
>     aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.USER, sourcePerm.getUserAction()));
>     aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.GROUP, sourcePerm.getGroupAction()));
>     aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.OTHER, sourcePerm.getOtherAction()));
>   }
> }
> {code}
> We found that DEFAULT extended ACL rules were not being inherited properly by 
> the directory sub-tree, so the above code is incomplete because it 
> effectively drops the DEFAULT rules. The second problem is with the call to 
> {{sourcePerm.getGroupAction()}}, which is incorrect in the case of extended 
> ACLs. When extended ACLs are used the GROUP permission is replaced with the 
> extended ACL mask. So the above code will apply the wrong permissions to the 
> GROUP. Instead the correct GROUP permissions now need to be pulled from the 
> AclEntry as returned by {{getAclStatus().getEntries()}}. See the 
> implementation of the new method {{getDefaultAclEntries}} for details.
> Similar issues exist with the HCatalog API. None of th

[jira] [Commented] (HIVE-17381) When we enable Parquet Writer Version V2, hive throws an exception: Unsupported encoding: DELTA_BYTE_ARRAY.

2017-08-24 Thread Ferdinand Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140217#comment-16140217
 ] 

Ferdinand Xu commented on HIVE-17381:
-

V1 is just using plain encoding. To support V2, we need to implement the new 
encoding as well.

> When we enable Parquet Writer Version V2, hive throws an exception: 
> Unsupported encoding: DELTA_BYTE_ARRAY.
> ---
>
> Key: HIVE-17381
> URL: https://issues.apache.org/jira/browse/HIVE-17381
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ke Jia
>
> When we set "hive.vectorized.execution.enabled=true" and 
> "parquet.writer.version=v2" simultaneously, Hive throws the following 
> exception:
> Caused by: java.io.IOException: java.io.IOException: 
> java.lang.UnsupportedOperationException: Unsupported encoding: 
> DELTA_BYTE_ARRAY
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:232)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:254)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:208)
>   at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
>   at 
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>   at 
> scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:30)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:83)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
>   at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
>   at org.apache.spark.scheduler.Task.run(Task.scala:86)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: java.lang.UnsupportedOperationException: 
> Unsupported encoding: DELTA_BYTE_ARRAY
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>   ... 16 more




[jira] [Comment Edited] (HIVE-17381) When we enable Parquet Writer Version V2, hive throws an exception: Unsupported encoding: DELTA_BYTE_ARRAY.

2017-08-24 Thread Ferdinand Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140217#comment-16140217
 ] 

Ferdinand Xu edited comment on HIVE-17381 at 8/24/17 3:55 PM:
--

V1 is just using plain encoding + dictionary encoding. To support V2, we need 
to implement the new encoding as well.


was (Author: ferd):
V1 is just using plain encoding. To support V2, we need to implement the new 
encoding as well.

> When we enable Parquet Writer Version V2, hive throws an exception: 
> Unsupported encoding: DELTA_BYTE_ARRAY.
> ---
>
> Key: HIVE-17381
> URL: https://issues.apache.org/jira/browse/HIVE-17381
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ke Jia
>
> When we set "hive.vectorized.execution.enabled=true" and 
> "parquet.writer.version=v2" simultaneously, Hive throws the following 
> exception:
> Caused by: java.io.IOException: java.io.IOException: 
> java.lang.UnsupportedOperationException: Unsupported encoding: 
> DELTA_BYTE_ARRAY
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:232)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:254)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:208)
>   at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
>   at 
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>   at 
> scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:30)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:83)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
>   at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
>   at org.apache.spark.scheduler.Task.run(Task.scala:86)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: java.lang.UnsupportedOperationException: 
> Unsupported encoding: DELTA_BYTE_ARRAY
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>   ... 16 more



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Updated] (HIVE-17381) When we enable Parquet Writer Version V2, hive throws an exception: Unsupported encoding: DELTA_BYTE_ARRAY.

2017-08-24 Thread Ferdinand Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-17381:

Issue Type: Sub-task  (was: Bug)
Parent: HIVE-14826

> When we enable Parquet Writer Version V2, hive throws an exception: 
> Unsupported encoding: DELTA_BYTE_ARRAY.
> ---
>
> Key: HIVE-17381
> URL: https://issues.apache.org/jira/browse/HIVE-17381
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ke Jia
>
> When we set "hive.vectorized.execution.enabled=true" and 
> "parquet.writer.version=v2" simultaneously, Hive throws the following 
> exception:
> Caused by: java.io.IOException: java.io.IOException: 
> java.lang.UnsupportedOperationException: Unsupported encoding: 
> DELTA_BYTE_ARRAY
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:232)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:254)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:208)
>   at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
>   at 
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>   at 
> scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:30)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:83)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
>   at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
>   at org.apache.spark.scheduler.Task.run(Task.scala:86)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: java.lang.UnsupportedOperationException: 
> Unsupported encoding: DELTA_BYTE_ARRAY
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>   ... 16 more




[jira] [Updated] (HIVE-17369) We should turn off the TestHcatClient tests until HIVE-16908 is solved

2017-08-24 Thread Peter Vary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-17369:
--
Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Since we pushed an intermediate solution on HIVE-16908 that runs the tests, 
even if those tests are not yet the best ones, I am closing this jira.

> We should turn off the TestHcatClient tests until HIVE-16908 is solved
> --
>
> Key: HIVE-17369
> URL: https://issues.apache.org/jira/browse/HIVE-17369
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog, Test
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-17369.patch
>
>
> [~sbeeram], [~mithun], [~thejas]: These tests are failing all the time. Do 
> you mind if we disable them until HIVE-16908 is solved?
> Thanks,
> Peter




[jira] [Updated] (HIVE-16908) Failures in TestHcatClient due to HIVE-16844

2017-08-24 Thread Peter Vary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-16908:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed [~mithun]'s patch to master.
Thanks [~mithun] for the patch!

And thanks to [~sbeeram] for not accepting an "it runs" solution as the final one. 
I created the follow-up Jira for you, HIVE-17384: 
Launching metastore in a different process for tests

Thanks,
Peter

> Failures in TestHcatClient due to HIVE-16844
> 
>
> Key: HIVE-16908
> URL: https://issues.apache.org/jira/browse/HIVE-16908
> Project: Hive
>  Issue Type: Bug
>Reporter: Sunitha Beeram
>Assignee: Mithun Radhakrishnan
> Fix For: 3.0.0
>
> Attachments: HIVE-16908.1.patch, HIVE-16908.2.patch, 
> HIVE-16908.3.patch, HIVE-16908.4.patch
>
>
> Some of the tests in TestHCatClient.java, for example:
> {noformat}
> org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
>  (batchId=177)
> org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
>  (batchId=177)
> org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
> (batchId=177)
> {noformat}
> are failing due to HIVE-16844. HIVE-16844 fixes a connection leak when a new 
> configuration object is set on the ObjectStore. TestHCatClient fires up a 
> second metastore thread with a different conf object, which results in the 
> PersistenceManagerFactory being closed, and hence the tests fail. 
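
The failure mode described above can be illustrated with a deliberately simplified model. This is a sketch in Python, not the ObjectStore code: the class names `Factory` and `Store` and the method `set_conf` are hypothetical, and the only assumption taken from the description is that one factory is shared per process and is closed when a different configuration object arrives.

```python
class Factory:
    """Hypothetical stand-in for a process-wide connection factory."""

    def __init__(self, conf):
        self.conf = conf
        self.closed = False

    def close(self):
        self.closed = True


class Store:
    """Hypothetical store that shares one factory across all instances."""

    _factory = None  # shared by every caller in the same process

    @classmethod
    def set_conf(cls, conf):
        # Closing the old factory on a configuration change avoids a leak,
        # but it also invalidates the factory for anyone still using it.
        if cls._factory is not None and cls._factory.conf != conf:
            cls._factory.close()
            cls._factory = None
        if cls._factory is None:
            cls._factory = Factory(conf)
        return cls._factory


first = Store.set_conf({"port": 1})   # first metastore thread
second = Store.set_conf({"port": 2})  # second thread, different conf
print(first.closed, second.closed)
```

In this model the first caller is left holding a closed factory as soon as the second caller sets a different configuration, which is the kind of interaction the description attributes to the second metastore thread in TestHCatClient.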




[jira] [Commented] (HIVE-17139) Conditional expressions optimization: skip the expression evaluation if the condition is not satisfied for vectorization engine.

2017-08-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140178#comment-16140178
 ] 

Hive QA commented on HIVE-17139:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879301/HIVE-17139.4.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10999 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_coalesce_3]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_grouping_sets_grouping]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpRetryOnServerIdleTimeout 
(batchId=228)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6519/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6519/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6519/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879301 - PreCommit-HIVE-Build

> Conditional expressions optimization: skip the expression evaluation if the 
> condition is not satisfied for vectorization engine.
> 
>
> Key: HIVE-17139
> URL: https://issues.apache.org/jira/browse/HIVE-17139
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ke Jia
>Assignee: Ke Jia
> Attachments: HIVE-17139.1.patch, HIVE-17139.2.patch, 
> HIVE-17139.3.patch, HIVE-17139.4.patch
>
>
> The execution of CASE WHEN and IF statements in Hive vectorization is not 
> optimal: the current implementation evaluates all of the conditional and else 
> expressions. The optimized approach is to update the selected array of the 
> batch parameter after the conditional expression is executed, so that the 
> else expression is evaluated only for the selected rows instead of all rows.
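
A rough sketch of the idea described above, written in Python rather than Hive's Java. The function name `eval_if_else` and the index lists are hypothetical and only illustrate the "selected rows" technique; this is not the code in the attached patches.

```python
from typing import Callable, List


def eval_if_else(
    rows: List[int],
    cond: Callable[[int], bool],
    then_expr: Callable[[int], int],
    else_expr: Callable[[int], int],
) -> List[int]:
    """Evaluate IF(cond, then, else) over a batch, running each branch
    only on the rows that the condition selected for it."""
    out = [0] * len(rows)
    # Evaluate the condition once per row.
    mask = [cond(v) for v in rows]
    # Split the batch into two "selected" index lists.
    then_selected = [i for i, m in enumerate(mask) if m]
    else_selected = [i for i, m in enumerate(mask) if not m]
    for i in then_selected:
        out[i] = then_expr(rows[i])
    for i in else_selected:
        out[i] = else_expr(rows[i])
    return out
```

With this shape, a batch of four rows in which two satisfy the condition calls `then_expr` twice and `else_expr` twice, rather than evaluating both expressions for all four rows.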




[jira] [Assigned] (HIVE-17384) Launching metastore in a different process for tests

2017-08-24 Thread Peter Vary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reassigned HIVE-17384:
-


> Launching metastore in a different process for tests
> 
>
> Key: HIVE-17384
> URL: https://issues.apache.org/jira/browse/HIVE-17384
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Testing Infrastructure
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Sunitha Beeram
>
> During HIVE-16908 [~sbeeram] identified that in tests it would be good to be 
> able to have multiple Metastore instances in one test. The problem is that it 
> is not possible to have two Metastore instances in the same JVM. We need to 
> find a solution for that.
> Assigning it to [~sbeeram], since she is already working on this.




[jira] [Commented] (HIVE-17375) stddev_samp,var_samp standard compliance

2017-08-24 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140143#comment-16140143
 ] 

Ashutosh Chauhan commented on HIVE-17375:
-

Seems like the vectorization tests need to be updated as well.

> stddev_samp,var_samp standard compliance
> 
>
> Key: HIVE-17375
> URL: https://issues.apache.org/jira/browse/HIVE-17375
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Minor
> Attachments: HIVE-17375.1.patch
>
>
> These two UDAFs return 0 when there is only one element; however, the 
> standard requires NULL to be returned.
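
For illustration, the behavior the description asks for can be sketched in Python, with `None` standing in for SQL NULL. The function names mirror the UDAF names, but this is not Hive's implementation.

```python
import math
from typing import Optional, Sequence


def var_samp(values: Sequence[float]) -> Optional[float]:
    """Sample variance; None (SQL NULL) when fewer than two values exist."""
    n = len(values)
    if n < 2:
        # The divisor n - 1 would be zero, so the result is undefined.
        return None
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def stddev_samp(values: Sequence[float]) -> Optional[float]:
    """Sample standard deviation; None whenever var_samp is None."""
    variance = var_samp(values)
    return None if variance is None else math.sqrt(variance)
```

Returning NULL rather than 0 for a single element reflects that the sample variance divides by n - 1 and is therefore undefined for n = 1.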




[jira] [Commented] (HIVE-17372) update druid dependency to druid 0.10.1

2017-08-24 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140136#comment-16140136
 ] 

Ashutosh Chauhan commented on HIVE-17372:
-

+1

> update druid dependency to druid 0.10.1
> ---
>
> Key: HIVE-17372
> URL: https://issues.apache.org/jira/browse/HIVE-17372
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-17372.patch
>
>
> Update to most recent druid version to be released August 23.




[jira] [Commented] (HIVE-17332) NullPointer exception when processing query

2017-08-24 Thread Zoltan Haindrich (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140137#comment-16140137
 ] 

Zoltan Haindrich commented on HIVE-17332:
-

Eventually I was able to reproduce this with hdp-2.4.5.0, but not yet using 
a Hive integration test... it seems like something that has been fixed since then.

> NullPointer exception when processing query
> ---
>
> Key: HIVE-17332
> URL: https://issues.apache.org/jira/browse/HIVE-17332
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Lukas Waldmann
>
> Hive query:
> {code}
> select count(*) from (select * from EXM_BASE_DATA, (select max(snapshot) 
> max_snapshot from EXM_BASE_DATA) s0 where snapshot == max_snapshot) t;
> {code}
> finishes with a NullPointerException
> while 
> {code}
> select * from EXM_BASE_DATA, (select max(snapshot) max_snapshot from 
> EXM_BASE_DATA) s0 where snapshot == max_snapshot
> {code}
> is executed without error




[jira] [Commented] (HIVE-12791) Truncated table stats should return 0 as datasize

2017-08-24 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140110#comment-16140110
 ] 

BELUGA BEHR commented on HIVE-12791:


I believe this is still an issue. Please confirm and re-open [~pxiong]

> Truncated table stats should return 0 as datasize
> -
>
> Key: HIVE-12791
> URL: https://issues.apache.org/jira/browse/HIVE-12791
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> {code}
> create table s as select * from src;
> truncate table s;
> hive> explain select * from s;
> OK
> STAGE DEPENDENCIES:
>   Stage-0 is a root stage
> STAGE PLANS:
>   Stage: Stage-0
> Fetch Operator
>   limit: -1
>   Processor Tree:
> TableScan
>   alias: s
>   Statistics: Num rows: 29 Data size: 5812 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: key (type: string), value (type: string)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 29 Data size: 5812 Basic stats: COMPLETE 
> Column stats: NONE
> ListSink
> Time taken: 0.048 seconds, Fetched: 17 row(s)
> {code}
> should be 
> {code}
> Num rows: 1 Data size: 0
> {code}




[jira] [Updated] (HIVE-17205) add functional support

2017-08-24 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17205:
--
Attachment: HIVE-17205.15.patch

> add functional support
> --
>
> Key: HIVE-17205
> URL: https://issues.apache.org/jira/browse/HIVE-17205
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17205.01.patch, HIVE-17205.02.patch, 
> HIVE-17205.03.patch, HIVE-17205.09.patch, HIVE-17205.10.patch, 
> HIVE-17205.11.patch, HIVE-17205.12.patch, HIVE-17205.13.patch, 
> HIVE-17205.14.patch, HIVE-17205.15.patch
>
>
> make sure unbucketed tables can be marked transactional=true
> make insert/update/delete/compaction work




[jira] [Commented] (HIVE-17205) add functional support

2017-08-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140080#comment-16140080
 ] 

Hive QA commented on HIVE-17205:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883468/HIVE-17205.14.patch

{color:green}SUCCESS:{color} +1 due to 11 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11004 tests 
executed
*Failed tests:*
{noformat}
TestTxnCommandsBase - did not produce a TEST-*.xml file (likely timed out) 
(batchId=280)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=234)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion02 
(batchId=215)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6518/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6518/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6518/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883468 - PreCommit-HIVE-Build

> add functional support
> --
>
> Key: HIVE-17205
> URL: https://issues.apache.org/jira/browse/HIVE-17205
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17205.01.patch, HIVE-17205.02.patch, 
> HIVE-17205.03.patch, HIVE-17205.09.patch, HIVE-17205.10.patch, 
> HIVE-17205.11.patch, HIVE-17205.12.patch, HIVE-17205.13.patch, 
> HIVE-17205.14.patch
>
>
> make sure unbucketed tables can be marked transactional=true
> make insert/update/delete/compaction work




[jira] [Commented] (HIVE-17381) When we enable Parquet Writer Version V2, hive throws an exception: Unsupported encoding: DELTA_BYTE_ARRAY.

2017-08-24 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140051#comment-16140051
 ] 

Xuefu Zhang commented on HIVE-17381:


fyi: [~Ferd], [~pgolash]

> When we enable Parquet Writer Version V2, hive throws an exception: 
> Unsupported encoding: DELTA_BYTE_ARRAY.
> ---
>
> Key: HIVE-17381
> URL: https://issues.apache.org/jira/browse/HIVE-17381
> Project: Hive
>  Issue Type: Bug
>Reporter: Ke Jia
>
> When we set "hive.vectorized.execution.enabled=true" and 
> "parquet.writer.version=v2" simultaneously, Hive throws the following 
> exception:
> Caused by: java.io.IOException: java.io.IOException: 
> java.lang.UnsupportedOperationException: Unsupported encoding: 
> DELTA_BYTE_ARRAY
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:232)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:254)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:208)
>   at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
>   at 
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>   at 
> scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:30)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:83)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
>   at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
>   at org.apache.spark.scheduler.Task.run(Task.scala:86)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: java.lang.UnsupportedOperationException: 
> Unsupported encoding: DELTA_BYTE_ARRAY
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>   ... 16 more




[jira] [Commented] (HIVE-17380) refactor LlapProtocolClientProxy to be usable with other protocols

2017-08-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140021#comment-16140021
 ] 

Hive QA commented on HIVE-17380:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883461/HIVE-17380.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10999 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cte_1] (batchId=82)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6517/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6517/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6517/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883461 - PreCommit-HIVE-Build

> refactor LlapProtocolClientProxy to be usable with other protocols
> --
>
> Key: HIVE-17380
> URL: https://issues.apache.org/jira/browse/HIVE-17380
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17380.patch, HIVE-17380.patch
>
>
> This basically moves a bunch of code into a generic async PB RPC proxy, in 
> llap-common for now. Moving to common would require one to move LlapNodeId, 
> that can be done later.
> The only logic change is that concurrent hash map, that never expires, is 
> replaced by Guava cache. A path to shut down a proxy is added, but does 
> nothing.
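
The difference between a map that never expires and an expiring cache can be sketched as follows. This is a Python illustration only: the class `ExpiringCache`, its expire-after-access policy and its parameters are assumptions made for the example, not the Guava configuration used in the patch, and the sketch is not thread-safe.

```python
import time
from typing import Callable, Dict, Generic, Tuple, TypeVar

K = TypeVar("K")
V = TypeVar("V")


class ExpiringCache(Generic[K, V]):
    """Hypothetical cache whose entries expire after a period without access."""

    def __init__(
        self,
        ttl_seconds: float,
        loader: Callable[[K], V],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._loader = loader
        self._clock = clock
        self._entries: Dict[K, Tuple[V, float]] = {}

    def get(self, key: K) -> V:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            # Still fresh: refresh the access time and reuse the value.
            self._entries[key] = (entry[0], now)
            return entry[0]
        # Missing or expired: build a new value (for example, a new proxy).
        value = self._loader(key)
        self._entries[key] = (value, now)
        return value
```

A plain dictionary would keep every proxy it ever created; here an entry that has not been used for `ttl_seconds` is rebuilt on the next request.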




[jira] [Commented] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139959#comment-16139959
 ] 

Hive QA commented on HIVE-17100:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883460/HIVE-17100.07.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10993 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver
 (batchId=242)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6516/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6516/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6516/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883460 - PreCommit-HIVE-Build

> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch, HIVE-17100.02.patch, 
> HIVE-17100.03.patch, HIVE-17100.04.patch, HIVE-17100.05.patch, 
> HIVE-17100.06.patch, HIVE-17100.07.patch
>
>
> It is necessary to log the progress of the replication tasks in a structured 
> manner as follows.
> *+Bootstrap Dump:+*
> * At the start of bootstrap dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (BOOTSTRAP)
> * (Estimated) Total number of tables/views to dump
> * (Estimated) Total number of functions to dump.
> * Dump Start Time{color}
> * After each table dump, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table dump end time
> * Table dump progress. Format is Table sequence no/(Estimated) Total number 
> of tables and views.{color}
> * After each function dump, will add a log as follows
> {color:#59afe1}* Function Name
> * Function dump end time
> * Function dump progress. Format is Function sequence no/(Estimated) Total 
> number of functions.{color}
> * After completion of all dumps, will add a log as follows to consolidate the 
> dump.
> {color:#59afe1}* Database Name.
> * Dump Type (BOOTSTRAP).
> * Dump End Time.
> * (Actual) Total number of tables/views dumped.
> * (Actual) Total number of functions dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The actual and estimated number of tables/functions may not match if 
> any table/function is dropped while the dump is in progress.
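
As an illustration of the per-table log entry listed above, here is a minimal Python sketch. The JSON layout, the field names and the function `table_dump_log` are hypothetical; the issue description only lists which pieces of information should be logged, not their format.

```python
import json
from datetime import datetime, timezone

TABLE_TYPES = ("TABLE", "VIEW", "MATERIALIZED_VIEW")


def table_dump_log(
    table_name: str,
    table_type: str,
    seq_no: int,
    estimated_total: int,
    end_time: datetime,
) -> str:
    """Build one structured log line for a finished table dump."""
    if table_type not in TABLE_TYPES:
        raise ValueError("unknown table type: %s" % table_type)
    record = {
        "tableName": table_name,
        "tableType": table_type,
        "dumpEndTime": int(end_time.timestamp()),
        # Progress format from the description: sequence no / estimated total.
        "dumpProgress": "%d/%d" % (seq_no, estimated_total),
    }
    return json.dumps(record, sort_keys=True)


line = table_dump_log(
    "t1", "TABLE", 3, 10, datetime(2017, 8, 24, tzinfo=timezone.utc)
)
print(line)
```

Because the total is only an estimate, the final sequence number may differ from it if tables are dropped while the dump is running, as the note above points out.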
> *+Bootstrap Load:+*
> * At the start of bootstrap load, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump directory
> * Load Type (BOOTSTRAP)
> * Total number of tables/views to load
> * Total number of functions to load.
> * Load Start Time{color}
> * After each table load, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table load completion time
> * Table load progress. Format is Table sequence no/Total number of tables and 
> views.{color}
> * After each function load, will add a log as follows
> {color:#59afe1}* Function Name
> * Function load completion time
> * Function load progress. Format is Function sequence no/Total number of 
> functions.{color}
> * After completion of all loads, will add a log as follows to consolidate the 
> load.
> {color:#59afe1}* Database Name.
> * Load Type (BOOTSTRAP).
> * Load End Time.
> * Total number of tables/views loaded.
> * Total number of functions loaded.
> * Last Repl ID of the loaded database.{color}
> *+Incremental Dump:

[jira] [Updated] (HIVE-17382) Change startsWith relation introduced in HIVE-17316

2017-08-24 Thread Barna Zsombor Klara (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-17382:
---
Status: Patch Available  (was: Open)

> Change startsWith relation introduced in HIVE-17316
> ---
>
> Key: HIVE-17382
> URL: https://issues.apache.org/jira/browse/HIVE-17382
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Fix For: 3.0.0
>
> Attachments: HIVE-17382.01.patch
>
>
> In HiveConf, the check should be whether the new name starts with a 
> restricted/hidden variable prefix, and not vice versa.
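
The direction of the check matters, as a small Python sketch shows. The helper names and the prefix `example.restricted` are hypothetical and are not actual HiveConf code or settings.

```python
from typing import Iterable


def is_restricted(name: str, prefixes: Iterable[str]) -> bool:
    """Intended check: the variable name starts with a restricted prefix."""
    return any(name.startswith(prefix) for prefix in prefixes)


def is_restricted_reversed(name: str, prefixes: Iterable[str]) -> bool:
    """Reversed check: a restricted prefix starts with the variable name."""
    return any(prefix.startswith(name) for prefix in prefixes)


prefixes = ["example.restricted"]
print(is_restricted("example.restricted.password", prefixes))
print(is_restricted_reversed("example.restricted.password", prefixes))
```

The reversed form misses `example.restricted.password`, which should be covered by the prefix, and wrongly matches a shorter name such as `example`.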




[jira] [Updated] (HIVE-17382) Change startsWith relation introduced in HIVE-17316

2017-08-24 Thread Barna Zsombor Klara (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-17382:
---
Attachment: HIVE-17382.01.patch

> Change startsWith relation introduced in HIVE-17316
> ---
>
> Key: HIVE-17382
> URL: https://issues.apache.org/jira/browse/HIVE-17382
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Fix For: 3.0.0
>
> Attachments: HIVE-17382.01.patch
>
>
> In HiveConf, the check should be whether the new name starts with a 
> restricted/hidden variable prefix, and not vice versa.




[jira] [Commented] (HIVE-16823) "ArrayIndexOutOfBoundsException" in spark_vectorized_dynamic_partition_pruning.q

2017-08-24 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139902#comment-16139902
 ] 

Rui Li commented on HIVE-16823:
---

I have a simpler query to reproduce the issue, and it happens on Tez as well:
{noformat}
set hive.cbo.enable=false;
select count(*) from (select key from src group by key) s where s.key='98';
{noformat}

> "ArrayIndexOutOfBoundsException" in 
> spark_vectorized_dynamic_partition_pruning.q
> 
>
> Key: HIVE-16823
> URL: https://issues.apache.org/jira/browse/HIVE-16823
> Project: Hive
>  Issue Type: Bug
>Reporter: Jianguo Tian
>Assignee: liyunzhang_intel
> Attachments: explain.spark, explain.tez, HIVE-16823.1.patch, 
> HIVE-16823.patch
>
>
> spark_vectorized_dynamic_partition_pruning.q
> {code}
> set hive.optimize.ppd=true;
> set hive.ppd.remove.duplicatefilters=true;
> set hive.spark.dynamic.partition.pruning=true;
> set hive.optimize.metadataonly=false;
> set hive.optimize.index.filter=true;
> set hive.vectorized.execution.enabled=true;
> set hive.strict.checks.cartesian.product=false;
> -- parent is reduce tasks
> select count(*) from srcpart join (select ds as ds, ds as `date` from srcpart 
> group by ds) s on (srcpart.ds = s.ds) where s.`date` = '2008-04-08';
> {code}
> The exceptions are as follows:
> {code}
> 2017-06-05T09:20:31,468 ERROR [Executor task launch worker-0] 
> spark.SparkReduceRecordHandler: Fatal error: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while processing 
> vector batch (tag=0) Column vector types: 0:BYTES, 1:BYTES
> ["2008-04-08", "2008-04-08"]
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while processing 
> vector batch (tag=0) Column vector types: 0:BYTES, 1:BYTES
> ["2008-04-08", "2008-04-08"]
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processVectors(SparkReduceRecordHandler.java:413)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:301)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:54)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:85)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42) 
> ~[scala-library-2.11.8.jar:?]
>   at scala.collection.Iterator$class.foreach(Iterator.scala:893) 
> ~[scala-library-2.11.8.jar:?]
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1336) 
> ~[scala-library-2.11.8.jar:?]
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127)
>  ~[spark-core_2.11-2.0.0.jar:2.0.0]
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127)
>  ~[spark-core_2.11-2.0.0.jar:2.0.0]
>   at 
> org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974) 
> ~[spark-core_2.11-2.0.0.jar:2.0.0]
>   at 
> org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974) 
> ~[spark-core_2.11-2.0.0.jar:2.0.0]
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70) 
> ~[spark-core_2.11-2.0.0.jar:2.0.0]
>   at org.apache.spark.scheduler.Task.run(Task.scala:85) 
> ~[spark-core_2.11-2.0.0.jar:2.0.0]
>   at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274) 
> ~[spark-core_2.11-2.0.0.jar:2.0.0]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [?:1.8.0_112]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [?:1.8.0_112]
>   at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorGroupKeyHelper.copyGroupKey(VectorGroupKeyHelper.java:107)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeReduceMergePartial.doProcessBatch(VectorGroupByOperator.java:832)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeBase.processBatch(VectorGroupByOperator.java:179)
>  ~[hive-exec-3.0.0-SN

[jira] [Commented] (HIVE-17380) refactor LlapProtocolClientProxy to be usable with other protocols

2017-08-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139891#comment-16139891
 ] 

Hive QA commented on HIVE-17380:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883461/HIVE-17380.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 10999 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpRetryOnServerIdleTimeout 
(batchId=228)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testConnection (batchId=241)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testIsValid (batchId=241)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testIsValidNeg (batchId=241)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testNegativeProxyAuth 
(batchId=241)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testNegativeTokenAuth 
(batchId=241)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testProxyAuth (batchId=241)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testTokenAuth (batchId=241)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6515/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6515/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6515/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883461 - PreCommit-HIVE-Build

> refactor LlapProtocolClientProxy to be usable with other protocols
> --
>
> Key: HIVE-17380
> URL: https://issues.apache.org/jira/browse/HIVE-17380
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17380.patch, HIVE-17380.patch
>
>
> This basically moves a bunch of code into a generic async PB RPC proxy, in 
> llap-common for now. Moving it to common would require moving LlapNodeId as 
> well, which can be done later.
> The only logic change is that the concurrent hash map, whose entries never 
> expire, is replaced by a Guava cache. A path to shut down a proxy is added, 
> but it does nothing.




[jira] [Commented] (HIVE-16949) Leak of threads from Get-Input-Paths thread pool when more than 1 used in query

2017-08-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139848#comment-16139848
 ] 

Hive QA commented on HIVE-16949:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883454/HIVE-16949.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11003 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_windowing2] 
(batchId=10)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6514/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6514/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6514/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883454 - PreCommit-HIVE-Build

> Leak of threads from Get-Input-Paths thread pool when more than 1 used in 
> query
> ---
>
> Key: HIVE-16949
> URL: https://issues.apache.org/jira/browse/HIVE-16949
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Birger Brunswiek
>Assignee: Sahil Takiar
> Attachments: HIVE-16949.1.patch
>
>
> The commit 
> [20210de|https://github.com/apache/hive/commit/20210dec94148c9b529132b1545df3dd7be083c3]
>  which was part of HIVE-15546 [introduced a thread 
> pool|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L3109]
>  which is not shut down upon completion of its threads. This leads to a leak 
> of threads for each query which uses more than 1 partition. They are not 
> removed automatically. When queries spanning multiple partitions are run, the 
> number of threads increases and is never reduced. On my machine hiveserver2 
> starts to get slower and slower once 10k threads are reached.
> Thread pools only shut down automatically in special circumstances (see 
> [documentation section 
> _Finalization_|https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html]).
>  This is not currently the case for the Get-Input-Paths thread pool. I would 
> add a _pool.shutdown()_ in a finally block just before returning the result 
> to make sure the threads are really shut down.
> My current workaround is to set {{hive.exec.input.listing.max.threads = 1}}. 
> This prevents the thread pool from being spawned 
> [\[1\]|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L2118]
>  
> [\[2\]|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L3107].
> The same issue probably also applies to the [Get-Input-Summary thread 
> pool|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L2193].
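
The fix proposed in the description (calling `shutdown()` in a finally block) can be sketched as follows. This is a minimal illustration using `java.util.concurrent` and is not the code in `Utilities.java`: the class name, the `listPaths` method, and the `/warehouse/` path format are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PoolShutdownSketch {
    // Hypothetical stand-in for resolving input paths in parallel.
    public static List<String> listPaths(List<String> partitions, int numThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String partition : partitions) {
                // Each task only builds a string here; real work would list files.
                futures.add(pool.submit(() -> "/warehouse/" + partition));
            }
            List<String> result = new ArrayList<>();
            for (Future<String> future : futures) {
                result.add(future.get());
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            // Without this call the worker threads stay alive after the method returns.
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> parts = new ArrayList<>();
        parts.add("ds=2008-04-08");
        parts.add("ds=2008-04-09");
        System.out.println(listPaths(parts, 2));
    }
}
```

Because the call sits in the finally block, the pool is shut down whether the method returns normally or a task fails.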




[jira] [Assigned] (HIVE-17355) Casting to Decimal along with UNION ALL gives inconsistent results

2017-08-24 Thread Aditya Allamraju (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Allamraju reassigned HIVE-17355:
---

Assignee: (was: Aditya Allamraju)

> Casting to Decimal along with UNION ALL gives inconsistent results
> -
>
> Key: HIVE-17355
> URL: https://issues.apache.org/jira/browse/HIVE-17355
> Project: Hive
>  Issue Type: Bug
>  Components: Parser, UDF
>Affects Versions: 2.1.0, 2.1.1
> Environment: CentOS 7.2
>Reporter: Aditya Allamraju
>
> Extra trailing zeros are added when running "union all" on the tables 
> containing decimal data types.
> *Version:* Hive 2.1
> *Steps to repro:-*
> {code:java}
> 1) CREATE TABLE `decisample`(
>   `a` decimal(8,2),
>   `b` int,
>   `c` decimal(5,2))
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'maprfs:/user/hive/warehouse/decisample'
> 2) CREATE TABLE `decisample3`(
>   `a` decimal(8,2),
>   `b` int,
>   `c` decimal(5,2))
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'maprfs:/user/hive/warehouse/decisample3'
> 3) hive> select * from decisample3;
> OK
> 1.00    2       3.00
> 7.00    8       9.00
> 4) hive> select * from decisample;
> OK
> 4.00    5       6.00
> 5) query:- 
> select a1.a, '' as a1b,'' as a1c from decisample a1 union all select 
> a2.a,a2.b,a2.c from decisample3 a2;
> o/p:-
> OK
> 4.00            NULL
> 1.00    2       3.00
> 7.00    8       9.00
> Time taken: 87.993 seconds, Fetched: 3 row(s)
> 6) select a2.a,a2.b,a2.c from decisample3 a2 union all select a1.a, '' as 
> a1b,'' as a1c from decisample a1;
> o/p:-
> 4.00
> 1.00    2       3
> 7.00    8       9
> {code}
> Step 5 yields 18 trailing zeros, whereas the step 6 query yields no 
> trailing zeros.
> Observation:
> 1. Hive runs the UNION ALL after ensuring the SELECTs are semantically the 
> same (equal number of columns and same datatypes). To do this, it 
> implicitly type casts the values where required.
> From the explain plan, the type casting is not consistent when done in 2 
> different ways:
> a)  select-1  UNION ALL select-2 (Query-5 in above comment)
> vs
> b) select-2 UNION ALL select-1   (Query-6 in above comment)
> Showing only the "expressions" part of the execution plans
> Query-5:
> 
> {code:java}
> ..
> ..
> Map Operator Tree:
>   TableScan
> alias: a1
> Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE 
> Column stats: NONE
> Select Operator
>   expressions: a (type: decimal(8,2)), '' (type: string), null 
> (type: decimal(38,18))
>   outputColumnNames: _col0, _col1, _col2
> ..
> ..
> TableScan
> alias: a2
> Statistics: Num rows: 2 Data size: 22 Basic stats: COMPLETE 
> Column stats: NONE
> Select Operator
>   expressions: a (type: decimal(8,2)), UDFToString(b) (type: 
> string), CAST( c AS decimal(38,18)) (type: decimal(38,18))
> {code}
> Query-6:
> 
> {code:java}
> ..
> ..
> Map Operator Tree:
>   TableScan
> alias: a2
> Statistics: Num rows: 2 Data size: 22 Basic stats: COMPLETE 
> Column stats: NONE
> Select Operator
>   expressions: a (type: decimal(8,2)), UDFToString(b) (type: 
> string), UDFToString(c) (type: string)
> ..
> ..
> TableScan
> alias: a1
> Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE 
> Column stats: NONE
> Select Operator
>   expressions: a (type: decimal(8,2)), '' (type: string), '' 
> (type: string)
> ..
> ..
> {code}
> Attaching the execution plans for both queries for reference.
> 2. The 18 zeros in query-5 above appear to come from casting NULL to Decimal.
> By default, the precision and scale are taken as (38,18) in Hive. This 
> could be the reason for the 18 zeros.
> 3. This repeats every time implicit type casting happens on EMPTY 
> strings.
> If excluding a few columns in one of the SELECT statements is absolutely 
> necessary, then the only workaround is to explicitly type cast the empty 
> strings to the same datatypes as the other SELECT statement, which includes 
> the columns.
> For ex:
> Q1:
> select a,b,c from decisample3
> union all
> select a,cast(' ' as int),cast(' ' as decimal) from decisample;
> Q2:
> select a,cast(' ' as int),cast(' ' as decimal) from decisample
> union all
> select a,b,c from decisample
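
The effect of scale 18 described in observation 2 can be illustrated with `java.math.BigDecimal`. This is only a sketch of how a fixed scale shows up when a value is printed; it does not use Hive's decimal implementation, and the class and method names are hypothetical.

```java
import java.math.BigDecimal;

public class DecimalScaleSketch {
    // Render a decimal literal with the given scale (number of fractional digits).
    public static String atScale(String value, int scale) {
        // Increasing the scale never requires rounding, so setScale(int) is safe here.
        return new BigDecimal(value).setScale(scale).toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(atScale("3.00", 2));  // 3.00
        System.out.println(atScale("3.00", 18)); // 3.000000000000000000
    }
}
```

A value stored as decimal(5,2) prints with two fractional digits, while the same value carried at scale 18 prints with 18. That matches the step 5 behaviour the reporter describes for the column cast to decimal(38,18).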

[jira] [Assigned] (HIVE-17382) Change startsWith relation introduced in HIVE-17316

2017-08-24 Thread Barna Zsombor Klara (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara reassigned HIVE-17382:
--


> Change startsWith relation introduced in HIVE-17316
> ---
>
> Key: HIVE-17382
> URL: https://issues.apache.org/jira/browse/HIVE-17382
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Fix For: 3.0.0
>
>
> In HiveConf, the check should test whether the new name starts with a 
> restricted/hidden variable prefix, not vice versa.




[jira] [Updated] (HIVE-17318) Make Hikari CP configurable using hive properties in hive-site.xml

2017-08-24 Thread Barna Zsombor Klara (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-17318:
---
Attachment: HIVE-17318.02.patch

Updated the patch based on review board comments.

> Make Hikari CP configurable using hive properties in hive-site.xml
> --
>
> Key: HIVE-17318
> URL: https://issues.apache.org/jira/browse/HIVE-17318
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-17318.01.patch, HIVE-17318.02.patch
>
>




