[jira] [Commented] (HIVE-16530) Improve execution logs for REPL commands

2017-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996211#comment-15996211
 ] 

Hive QA commented on HIVE-16530:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866326/HIVE-16530.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10649 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testConnection (batchId=235)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testIsValid (batchId=235)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testIsValidNeg (batchId=235)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testNegativeProxyAuth 
(batchId=235)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testNegativeTokenAuth 
(batchId=235)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testProxyAuth (batchId=235)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testTokenAuth (batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5038/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5038/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5038/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866326 - PreCommit-HIVE-Build

> Improve execution logs for REPL commands
> 
>
> Key: HIVE-16530
> URL: https://issues.apache.org/jira/browse/HIVE-16530
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Attachments: HIVE-16530.01.patch
>
>
> This is the log format proposed for Hive REPL query logs.
> For the bootstrap case:
> Hive will log a message for each object as it is being bootstrapped, in the 
> following sequence:
> - Tables first (views are tables for this purpose), one at a time, including 
> partitions (depth first), followed by functions and constraints
> - The ordering is based on the ordering of the listStatus API of HDFS
> - For each object, a message is logged at the beginning of its replication
> - Every partition bootstrapped is followed by a message giving the number of 
> partitions bootstrapped so far (for the table) and the partition name
> - A message is logged at the end of the bootstrap of each object
> For the incremental case:
> - The DB name, event ID, and event type will be part of the log header (for 
> debugging/troubleshooting)
> - For every event replicated, the log will include the current event ID and 
> the total number of events to replicate.
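As a rough illustration of the proposal above, the emitted log lines might look like the following sketch. The message wording, prefixes, and helper names here are hypothetical, not taken from the patch:

```python
def bootstrap_table_logs(db, table, partitions):
    """Illustrative REPL bootstrap log lines for one table:
    a start message, one progress message per partition (running count
    for this table plus the partition name), and an end message."""
    lines = [f"REPL::START: bootstrapping table {db}.{table}"]
    for i, part in enumerate(partitions, start=1):
        lines.append(
            f"REPL::PARTITION: {db}.{table} partition ({part}) "
            f"[{i}/{len(partitions)}]")
    lines.append(f"REPL::END: bootstrapped table {db}.{table}")
    return lines

def incremental_event_log(db, event_id, event_type, current, total):
    """Illustrative incremental log line: DB name, event ID, and event type
    in the header, plus progress (current event / total events)."""
    return f"REPL::EVENT: db={db} id={event_id} type={event_type} [{current}/{total}]"
```

This is only a sketch of the shape of the proposed messages; the actual format is defined by the attached patch.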



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-1010) Implement INFORMATION_SCHEMA in Hive

2017-05-03 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996208#comment-15996208
 ] 

Gunther Hagleitner commented on HIVE-1010:
--

Thanks [~thejas]. I believe .13 addresses all your feedback. Could you take 
another look?

> Implement INFORMATION_SCHEMA in Hive
> 
>
> Key: HIVE-1010
> URL: https://issues.apache.org/jira/browse/HIVE-1010
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore, Query Processor, Server Infrastructure
>Reporter: Jeff Hammerbacher
>Assignee: Gunther Hagleitner
> Attachments: HIVE-1010.10.patch, HIVE-1010.11.patch, 
> HIVE-1010.12.patch, HIVE-1010.13.patch, HIVE-1010.7.patch, HIVE-1010.8.patch, 
> HIVE-1010.9.patch
>
>
> INFORMATION_SCHEMA is part of the SQL92 standard and would be useful to 
> implement using our metastore.





[jira] [Updated] (HIVE-1010) Implement INFORMATION_SCHEMA in Hive

2017-05-03 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-1010:
-
Attachment: HIVE-1010.13.patch

> Implement INFORMATION_SCHEMA in Hive
> 
>
> Key: HIVE-1010
> URL: https://issues.apache.org/jira/browse/HIVE-1010
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore, Query Processor, Server Infrastructure
>Reporter: Jeff Hammerbacher
>Assignee: Gunther Hagleitner
> Attachments: HIVE-1010.10.patch, HIVE-1010.11.patch, 
> HIVE-1010.12.patch, HIVE-1010.13.patch, HIVE-1010.7.patch, HIVE-1010.8.patch, 
> HIVE-1010.9.patch
>
>


[jira] [Commented] (HIVE-16552) Limit the number of tasks a Spark job may contain

2017-05-03 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996190#comment-15996190
 ] 

Rui Li commented on HIVE-16552:
---

+1

> Limit the number of tasks a Spark job may contain
> -
>
> Key: HIVE-16552
> URL: https://issues.apache.org/jira/browse/HIVE-16552
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 1.0.0, 2.0.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-16552.1.patch, HIVE-16552.2.patch, 
> HIVE-16552.3.patch, HIVE-16552.4.patch, HIVE-16552.patch
>
>
> It's commonly desirable to block bad, big queries that take a lot of YARN 
> resources. One approach, similar to mapreduce.job.max.map in MapReduce, is to 
> stop a query that invokes a Spark job containing too many tasks. The 
> proposal here is to introduce hive.spark.job.max.tasks with a default value 
> of -1 (no limit), which an admin can set to block queries that trigger too 
> many Spark tasks.
> Please note that this control knob applies to a single Spark job, though 
> one query can trigger multiple Spark jobs (such as in the case of 
> map-join). Nevertheless, the proposed approach is still helpful.
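The guard described above can be sketched as follows. The property name hive.spark.job.max.tasks and the -1 default are from the proposal; the helper name and error text are illustrative, not from the patch:

```python
def check_spark_job_tasks(num_tasks, max_tasks=-1):
    """Sketch of the proposed guard: with hive.spark.job.max.tasks = -1
    (the default) nothing is blocked; a non-negative limit fails the query
    when the Spark job would contain more tasks than allowed."""
    if max_tasks >= 0 and num_tasks > max_tasks:
        raise RuntimeError(
            f"Spark job contains {num_tasks} tasks, exceeding "
            f"hive.spark.job.max.tasks={max_tasks}; query cancelled")
    return True
```

Because the check is per Spark job, a map-join query that triggers several smaller jobs could still exceed the limit in aggregate, which matches the caveat in the description.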





[jira] [Commented] (HIVE-16552) Limit the number of tasks a Spark job may contain

2017-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996176#comment-15996176
 ] 

Hive QA commented on HIVE-16552:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866319/HIVE-16552.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10650 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=155)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5037/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5037/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5037/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866319 - PreCommit-HIVE-Build

> Limit the number of tasks a Spark job may contain
> -
>
> Key: HIVE-16552
> URL: https://issues.apache.org/jira/browse/HIVE-16552
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 1.0.0, 2.0.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-16552.1.patch, HIVE-16552.2.patch, 
> HIVE-16552.3.patch, HIVE-16552.4.patch, HIVE-16552.patch
>
>


[jira] [Updated] (HIVE-16530) Improve execution logs for REPL commands

2017-05-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16530:

Attachment: (was: Bootstrap_ReplDump_Console_Log.png)

> Improve execution logs for REPL commands
> 
>
> Key: HIVE-16530
> URL: https://issues.apache.org/jira/browse/HIVE-16530
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Attachments: HIVE-16530.01.patch
>
>


[jira] [Updated] (HIVE-16530) Improve execution logs for REPL commands

2017-05-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16530:

Attachment: Bootstrap_ReplDump_Console_Log.png

> Improve execution logs for REPL commands
> 
>
> Key: HIVE-16530
> URL: https://issues.apache.org/jira/browse/HIVE-16530
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Attachments: Bootstrap_ReplDump_Console_Log.png, HIVE-16530.01.patch
>
>


[jira] [Updated] (HIVE-16530) Improve execution logs for REPL commands

2017-05-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16530:

Attachment: HIVE-16530.01.patch

> Improve execution logs for REPL commands
> 
>
> Key: HIVE-16530
> URL: https://issues.apache.org/jira/browse/HIVE-16530
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Attachments: HIVE-16530.01.patch
>
>


[jira] [Updated] (HIVE-16530) Improve execution logs for REPL commands

2017-05-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16530:

Status: Patch Available  (was: Open)

> Improve execution logs for REPL commands
> 
>
> Key: HIVE-16530
> URL: https://issues.apache.org/jira/browse/HIVE-16530
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Attachments: HIVE-16530.01.patch
>
>


[jira] [Updated] (HIVE-16530) Improve execution logs for REPL commands

2017-05-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16530:

Status: Open  (was: Patch Available)

> Improve execution logs for REPL commands
> 
>
> Key: HIVE-16530
> URL: https://issues.apache.org/jira/browse/HIVE-16530
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
>


[jira] [Updated] (HIVE-16530) Improve execution logs for REPL commands

2017-05-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16530:

Attachment: (was: HIVE-16530.01.patch)

> Improve execution logs for REPL commands
> 
>
> Key: HIVE-16530
> URL: https://issues.apache.org/jira/browse/HIVE-16530
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
>


[jira] [Updated] (HIVE-16530) Improve execution logs for REPL commands

2017-05-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16530:

Attachment: (was: Incremental_ReplLoad_Console_Log.png)

> Improve execution logs for REPL commands
> 
>
> Key: HIVE-16530
> URL: https://issues.apache.org/jira/browse/HIVE-16530
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Attachments: HIVE-16530.01.patch
>
>


[jira] [Updated] (HIVE-16530) Improve execution logs for REPL commands

2017-05-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16530:

Attachment: (was: Bootstrap_ReplLoad_Console_Log.png)

> Improve execution logs for REPL commands
> 
>
> Key: HIVE-16530
> URL: https://issues.apache.org/jira/browse/HIVE-16530
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Attachments: HIVE-16530.01.patch
>
>


[jira] [Updated] (HIVE-16530) Improve execution logs for REPL commands

2017-05-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16530:

Attachment: (was: Bootstrap_ReplDump_Console_Log.png)

> Improve execution logs for REPL commands
> 
>
> Key: HIVE-16530
> URL: https://issues.apache.org/jira/browse/HIVE-16530
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Attachments: HIVE-16530.01.patch
>
>


[jira] [Updated] (HIVE-16530) Improve execution logs for REPL commands

2017-05-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16530:

Labels: DR replication  (was: )

> Improve execution logs for REPL commands
> 
>
> Key: HIVE-16530
> URL: https://issues.apache.org/jira/browse/HIVE-16530
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Attachments: HIVE-16530.01.patch
>
>


[jira] [Updated] (HIVE-16530) Improve execution logs for REPL commands

2017-05-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16530:

Attachment: (was: Incremental_ReplDump_Console_Log.png)

> Improve execution logs for REPL commands
> 
>
> Key: HIVE-16530
> URL: https://issues.apache.org/jira/browse/HIVE-16530
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Attachments: HIVE-16530.01.patch
>
>


[jira] [Commented] (HIVE-16513) width_bucket issues

2017-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996149#comment-15996149
 ] 

Hive QA commented on HIVE-16513:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866311/HIVE-16513.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10649 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5036/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5036/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5036/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866311 - PreCommit-HIVE-Build

> width_bucket issues
> ---
>
> Key: HIVE-16513
> URL: https://issues.apache.org/jira/browse/HIVE-16513
> Project: Hive
>  Issue Type: Bug
>Reporter: Carter Shanklin
>Assignee: Sahil Takiar
> Attachments: HIVE-16513.1.patch, HIVE-16513.2.patch, 
> HIVE-16513.3.patch, HIVE-16513.4.patch
>
>
> width_bucket was recently added with HIVE-15982. This ticket notes a few 
> issues.
> Usability issue:
> width_bucket currently accepts only integral numeric types; decimals, floats, 
> and doubles are not supported.
> Runtime failures: This query will cause a runtime divide-by-zero in the 
> reduce stage.
> select width_bucket(c1, 0, c1*2, 10) from e011_01 group by c1;
> The divide-by-zero seems to trigger any time I use a group-by. Here's another 
> example (that actually requires the group-by):
> select width_bucket(c1, 0, max(c1), 10) from e011_01 group by c1;
> Advanced Usage Issues:
> Suppose you have a table e011_01 as follows:
> create table e011_01 (c1 integer, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> Compile-time problems:
> You cannot use simple case expressions, searched case expressions or grouping 
> sets. These queries fail:
> select width_bucket(5, c2, case c1 when 1 then c1 * 2 else c1 * 3 end, 10) 
> from e011_01;
> select width_bucket(5, c2, case when c1 < 2 then c1 * 2 else c1 * 3 end, 10) 
> from e011_01;
> select width_bucket(5, c2, max(c1)*10, cast(grouping(c1, c2)*20+1 as 
> integer)) from e011_02 group by cube(c1, c2);
> I'll admit the grouping one is pretty contrived but the case ones seem 
> straightforward, valid, and it's strange that they don't work. Similar 
> queries work with other UDFs like sum. Why wouldn't they "just work"? Maybe 
> [~ashutoshc] can lend some perspective on that?
> Interestingly, you can use window functions in width_bucket, example:
> select width_bucket(rank() over (order by c2), 0, 10, 10) from e011_01;
> works just fine. Hopefully we can get to a place where people implementing 
> functions like this don't need to think about value expression support but we 
> don't seem to be there yet.
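The divide-by-zero reported above is consistent with the SQL-style width_bucket semantics. The following sketch is illustrative Python, not Hive's implementation; it shows how a naive evaluation computes the bucket width from (max_val - min_val), which becomes zero in width_bucket(c1, 0, c1*2, 10) whenever c1 is 0:

```python
import math

def width_bucket(expr, min_val, max_val, num_buckets):
    """Equal-width histogram bucket over [min_val, max_val): bucket 0 for
    underflow, num_buckets + 1 for overflow. Illustrative sketch only."""
    # Dividing by (max_val - min_val) is where min_val == max_val blows up,
    # e.g. c1 = 0 in width_bucket(c1, 0, c1*2, 10).
    bucket = math.floor((expr - min_val) * num_buckets / (max_val - min_val)) + 1
    return max(0, min(bucket, num_buckets + 1))
```

Under this reading, the group-by failures may simply be the rows where the min and max arguments coincide; the searched/simple case-expression compile failures are a separate argument-handling issue.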





[jira] [Updated] (HIVE-16552) Limit the number of tasks a Spark job may contain

2017-05-03 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-16552:
---
Attachment: HIVE-16552.4.patch

Updated with Rui's suggestion.

> Limit the number of tasks a Spark job may contain
> -
>
> Key: HIVE-16552
> URL: https://issues.apache.org/jira/browse/HIVE-16552
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 1.0.0, 2.0.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-16552.1.patch, HIVE-16552.2.patch, 
> HIVE-16552.3.patch, HIVE-16552.4.patch, HIVE-16552.patch
>
>
> It's commonly desirable to block bad, oversized queries that take a lot of 
> YARN resources. One approach, similar to mapreduce.job.max.map in MapReduce, 
> is to stop a query that invokes a Spark job containing too many tasks. The 
> proposal here is to introduce hive.spark.job.max.tasks with a default value 
> of -1 (no limit), which an admin can set to block queries that trigger too 
> many Spark tasks.
> Please note that this control knob applies to a single Spark job, though one 
> query can trigger multiple Spark jobs (such as in the case of map-join). 
> Nevertheless, the proposed approach is still helpful.
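A minimal sketch of the guard the proposal describes, under stated assumptions: the function name and threshold below are made up for illustration; only the property name hive.spark.job.max.tasks and its -1 default come from the proposal itself.

```python
def check_task_limit(num_tasks: int, max_tasks: int = -1) -> None:
    """Sketch of the proposed behavior: max_tasks = -1 (the proposed default)
    means no limit; otherwise fail a Spark job that exceeds the cap.
    (Illustrative only -- not the actual patch code.)"""
    if max_tasks >= 0 and num_tasks > max_tasks:
        raise RuntimeError(
            f"Spark job has {num_tasks} tasks, exceeding "
            f"hive.spark.job.max.tasks={max_tasks}")

# With the default (-1), nothing is blocked:
check_task_limit(100_000)
```

With a cap set, e.g. `check_task_limit(100_000, max_tasks=5000)`, the oversized job would be rejected before consuming cluster resources.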





[jira] [Commented] (HIVE-16578) Semijoin Hints should use column name, if provided for partition key check

2017-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15996106#comment-15996106
 ] 

Hive QA commented on HIVE-16578:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866307/HIVE-16578.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10648 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge_incompat2] 
(batchId=78)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_semijoin_user_level]
 (batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_2]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_3]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5035/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5035/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5035/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866307 - PreCommit-HIVE-Build

> Semijoin Hints should use column name, if provided for partition key check
> --
>
> Key: HIVE-16578
> URL: https://issues.apache.org/jira/browse/HIVE-16578
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-16578.1.patch, HIVE-16578.2.patch
>
>
> Current logic does not verify the column name provided in the hint against 
> the column on which the runtime filtering branch will originate from.





[jira] [Commented] (HIVE-12157) Support unicode for table/column names

2017-05-03 Thread hefuhua (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15996080#comment-15996080
 ] 

hefuhua commented on HIVE-12157:


[#Pengcheng Xiong], in fact [#Sergey Shelukhin] and [#richard du] meant 
supporting unicode for table names and column names. In my case, I plan to 
support unicode aliases in the select clause only, so I think the scripts in 
the metastore needn't be changed?

>  Support unicode for table/column names
> ---
>
> Key: HIVE-12157
> URL: https://issues.apache.org/jira/browse/HIVE-12157
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Affects Versions: 1.2.1
>Reporter: richard du
>Assignee: hefuhua
>Priority: Minor
> Attachments: HIVE-12157.01.patch, HIVE-12157.02.patch, 
> HIVE-12157.patch
>
>
> Parser will throw exception when I use alias:
> hive> desc test;
> OK
> a   int 
> b   string  
> Time taken: 0.135 seconds, Fetched: 2 row(s)
> hive> select a as 行1 from test limit 10;
> NoViableAltException(302@[134:7: ( ( ( KW_AS )? identifier ) | ( KW_AS LPAREN 
> identifier ( COMMA identifier )* RPAREN ) )?])
> at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
> at org.antlr.runtime.DFA.predict(DFA.java:116)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectItem(HiveParser_SelectClauseParser.java:2915)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectList(HiveParser_SelectClauseParser.java:1373)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:1128)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:45827)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:41495)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:41402)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:40413)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:40283)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1590)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1109)
> at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:202)
> at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:396)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> FAILED: ParseException line 1:13 cannot recognize input near 'as' '1' 'from' 
> in selection target





[jira] [Assigned] (HIVE-16582) HashTableLoader should log info about the input, rows, size etc.

2017-05-03 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-16582:



> HashTableLoader should log info about the input, rows, size etc.
> 
>
> Key: HIVE-16582
> URL: https://issues.apache.org/jira/browse/HIVE-16582
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Minor
>
> Will be useful to log the following info during hash table loading
> - input name
> - number of rows 
> - estimated data size (LLAP tracks this)
> - object cache key





[jira] [Updated] (HIVE-16581) a bug in HIVE-16523

2017-05-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16581:

   Resolution: Fixed
Fix Version/s: 2.4.0
   3.0.0
   Status: Resolved  (was: Patch Available)

Committed to branches. Thanks for the review!

>  a bug in HIVE-16523
> 
>
> Key: HIVE-16581
> URL: https://issues.apache.org/jira/browse/HIVE-16581
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-16581.patch, HIVE-16581.patch
>
>
> A bug





[jira] [Commented] (HIVE-16207) Add support for Complex Types in Fast SerDe

2017-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15996051#comment-15996051
 ] 

Hive QA commented on HIVE-16207:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866293/HIVE-16207.1.patch

{color:green}SUCCESS:{color} +1 due to 7 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10650 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[drop_with_concurrency]
 (batchId=234)
org.apache.hadoop.hive.cli.TestHBaseCliDriver.org.apache.hadoop.hive.cli.TestHBaseCliDriver
 (batchId=91)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] 
(batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part_all_primitive]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_table]
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_vector_dynpart_hashjoin_1]
 (batchId=156)
org.apache.hadoop.hive.ql.exec.vector.TestVectorSerDeRow.testVectorBinarySortableDeserializeRow
 (batchId=269)
org.apache.hadoop.hive.ql.exec.vector.TestVectorSerDeRow.testVectorLazySimpleDeserializeRow
 (batchId=269)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5034/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5034/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5034/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866293 - PreCommit-HIVE-Build

> Add support for Complex Types in Fast SerDe
> ---
>
> Key: HIVE-16207
> URL: https://issues.apache.org/jira/browse/HIVE-16207
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Teddy Choi
>Priority: Critical
> Attachments: HIVE-16207.1.patch, HIVE-16207.1.patch.zip, partial.patch
>
>
> Add complex type support to the Fast SerDe classes. This is needed to fully 
> support complex types in vectorization.





[jira] [Updated] (HIVE-16581) a bug in HIVE-16523

2017-05-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16581:

Attachment: HIVE-16581.patch

With the test

>  a bug in HIVE-16523
> 
>
> Key: HIVE-16581
> URL: https://issues.apache.org/jira/browse/HIVE-16581
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16581.patch, HIVE-16581.patch
>
>
> A bug





[jira] [Updated] (HIVE-16581) a bug in HIVE-16523

2017-05-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16581:

Status: Patch Available  (was: Open)

>  a bug in HIVE-16523
> 
>
> Key: HIVE-16581
> URL: https://issues.apache.org/jira/browse/HIVE-16581
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16581.patch, HIVE-16581.patch
>
>
> A bug





[jira] [Commented] (HIVE-16047) Shouldn't try to get KeyProvider unless encryption is enabled

2017-05-03 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15996040#comment-15996040
 ] 

Rui Li commented on HIVE-16047:
---

[~spena] thanks for letting me know. Since we're rolling RCs for Hive 2.3, I 
don't think 2.3 can use the new API. HDFS should quash the log, so once users 
upgrade Hadoop the issue should go away, right?

> Shouldn't try to get KeyProvider unless encryption is enabled
> -
>
> Key: HIVE-16047
> URL: https://issues.apache.org/jira/browse/HIVE-16047
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-16047.1.patch, HIVE-16047.2.patch
>
>
> Found lots of following errors in HS2 log:
> {noformat}
> hdfs.KeyProviderCache: Could not find uri with key 
> [dfs.encryption.key.provider.uri] to create a keyProvider !!
> {noformat}
> Similar to HDFS-7931





[jira] [Updated] (HIVE-16513) width_bucket issues

2017-05-03 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16513:

Attachment: HIVE-16513.4.patch

> width_bucket issues
> ---
>
> Key: HIVE-16513
> URL: https://issues.apache.org/jira/browse/HIVE-16513
> Project: Hive
>  Issue Type: Bug
>Reporter: Carter Shanklin
>Assignee: Sahil Takiar
> Attachments: HIVE-16513.1.patch, HIVE-16513.2.patch, 
> HIVE-16513.3.patch, HIVE-16513.4.patch
>
>
> width_bucket was recently added with HIVE-15982. This ticket notes a few 
> issues.
> Usability issue:
> Currently only accepts integral numeric types. Decimals, floats and doubles 
> are not supported.
> Runtime failures: This query will cause a runtime divide-by-zero in the 
> reduce stage.
> select width_bucket(c1, 0, c1*2, 10) from e011_01 group by c1;
> The divide-by-zero seems to trigger any time I use a group-by. Here's another 
> example (that actually requires the group-by):
> select width_bucket(c1, 0, max(c1), 10) from e011_01 group by c1;
> Advanced Usage Issues:
> Suppose you have a table e011_01 as follows:
> create table e011_01 (c1 integer, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> Compile-time problems:
> You cannot use simple case expressions, searched case expressions or grouping 
> sets. These queries fail:
> select width_bucket(5, c2, case c1 when 1 then c1 * 2 else c1 * 3 end, 10) 
> from e011_01;
> select width_bucket(5, c2, case when c1 < 2 then c1 * 2 else c1 * 3 end, 10) 
> from e011_01;
> select width_bucket(5, c2, max(c1)*10, cast(grouping(c1, c2)*20+1 as 
> integer)) from e011_02 group by cube(c1, c2);
> I'll admit the grouping one is pretty contrived, but the case ones seem 
> straightforward and valid, and it's strange that they don't work. Similar 
> queries work with other UDFs like sum. Why wouldn't they "just work"? Maybe 
> [~ashutoshc] can lend some perspective on that?
> Interestingly, you can use window functions in width_bucket; for example:
> select width_bucket(rank() over (order by c2), 0, 10, 10) from e011_01;
> works just fine. Hopefully we can get to a place where people implementing 
> functions like this don't need to think about value expression support, but 
> we don't seem to be there yet.
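The runtime failures above are easier to reason about with the bucketing arithmetic written out. The following is a hedged sketch of the standard SQL width_bucket semantics for an ascending range, not Hive's actual implementation; it also shows the degenerate-range case that divides by zero unless guarded, which is one plausible source of the reported divide-by-zero (the group-by failures may well have a different cause).

```python
import math

def width_bucket(v, min_val, max_val, num_buckets):
    """Standard SQL width_bucket semantics for an ascending range (sketch):
    values below min_val land in bucket 0, values at or above max_val in
    bucket num_buckets + 1, everything else in an equi-width bucket
    1..num_buckets."""
    if max_val == min_val:
        # Degenerate range: the bucket-width divisor is zero unless guarded.
        raise ZeroDivisionError("width_bucket range is empty")
    if v < min_val:
        return 0
    if v >= max_val:
        return num_buckets + 1
    return int(math.floor(num_buckets * (v - min_val) / (max_val - min_val))) + 1
```

For instance, `width_bucket(5, 0, 10, 10)` yields bucket 6 under these semantics, and a query like `width_bucket(c1, 0, c1*2, 10)` hits the degenerate range whenever c1 is 0.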





[jira] [Commented] (HIVE-16469) Parquet timestamp table property is not always taken into account

2017-05-03 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15996026#comment-15996026
 ] 

Ferdinand Xu commented on HIVE-16469:
-

Thanks [~zsombor.klara] for the patch. Left a few comments on RB.

> Parquet timestamp table property is not always taken into account
> -
>
> Key: HIVE-16469
> URL: https://issues.apache.org/jira/browse/HIVE-16469
> Project: Hive
>  Issue Type: Bug
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-16469.01.patch, HIVE-16469.02.patch, 
> HIVE-16469.03.patch, HIVE-16469.04.patch
>
>
> The parquet timestamp timezone property is currently copied over into the 
> JobConf in the FetchOperator, but this may be too late for some execution 
> paths.
> We should:
> 1 - copy the property over earlier
> 2 - set the default value on the JobConf if no property is set, and fail in 
> the ParquetRecordReader if the property is missing from the JobConf
> We should add extra validations for the cases when:
> - the property was not set by accident on the JobConf (unexpected execution 
> path)
> - an incorrect/invalid timezone id is being set on the table





[jira] [Commented] (HIVE-16552) Limit the number of tasks a Spark job may contain

2017-05-03 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15996025#comment-15996025
 ] 

Rui Li commented on HIVE-16552:
---

[~xuefuz] thanks for the update. I think you can exclude the new test from 
TestNegativeCliDriver because it's only for Spark. You should be able to do it 
in:
{code}
  public static class NegativeCliConfig extends AbstractCliConfig {
public NegativeCliConfig() {
  super(CoreNegativeCliDriver.class);
  try {
setQueryDir("ql/src/test/queries/clientnegative");

excludesFrom(testConfigProps, "minimr.query.negative.files");
excludeQuery("authorization_uri_import.q");

setResultsDir("ql/src/test/results/clientnegative");
setLogDir("itests/qtest/target/qfile-results/clientnegative");

setInitScript("q_test_init.sql");
setCleanupScript("q_test_cleanup.sql");

setHiveConfDir("");
setClusterType(MiniClusterType.none);
  } catch (Exception e) {
throw new RuntimeException("can't construct cliconfig", e);
  }
}
  }
{code}

> Limit the number of tasks a Spark job may contain
> -
>
> Key: HIVE-16552
> URL: https://issues.apache.org/jira/browse/HIVE-16552
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 1.0.0, 2.0.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-16552.1.patch, HIVE-16552.2.patch, 
> HIVE-16552.3.patch, HIVE-16552.patch
>
>
> It's commonly desirable to block bad, oversized queries that take a lot of 
> YARN resources. One approach, similar to mapreduce.job.max.map in MapReduce, 
> is to stop a query that invokes a Spark job containing too many tasks. The 
> proposal here is to introduce hive.spark.job.max.tasks with a default value 
> of -1 (no limit), which an admin can set to block queries that trigger too 
> many Spark tasks.
> Please note that this control knob applies to a single Spark job, though one 
> query can trigger multiple Spark jobs (such as in the case of map-join). 
> Nevertheless, the proposed approach is still helpful.





[jira] [Commented] (HIVE-16581) a bug in HIVE-16523

2017-05-03 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15996020#comment-15996020
 ] 

Gopal V commented on HIVE-16581:


LGTM - +1.

>  a bug in HIVE-16523
> 
>
> Key: HIVE-16581
> URL: https://issues.apache.org/jira/browse/HIVE-16581
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16581.patch
>
>
> A bug





[jira] [Updated] (HIVE-16581) a bug in HIVE-16523

2017-05-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16581:

Attachment: HIVE-16581.patch

[~gopalv] can you take a look? Need to write a test probably, will add one

>  a bug in HIVE-16523
> 
>
> Key: HIVE-16581
> URL: https://issues.apache.org/jira/browse/HIVE-16581
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16581.patch
>
>
> A bug





[jira] [Updated] (HIVE-16581) a bug in HIVE-16523

2017-05-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16581:

Reporter: Sergey Shelukhin  (was: Gopal V)

>  a bug in HIVE-16523
> 
>
> Key: HIVE-16581
> URL: https://issues.apache.org/jira/browse/HIVE-16581
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> A bug





[jira] [Updated] (HIVE-16581) a bug in HIVE-16523

2017-05-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16581:

Summary:  a bug in HIVE-16523  (was: improve upon HIVE-16523 II)

>  a bug in HIVE-16523
> 
>
> Key: HIVE-16581
> URL: https://issues.apache.org/jira/browse/HIVE-16581
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>
> A bug





[jira] [Assigned] (HIVE-16581) improve upon HIVE-16523 II

2017-05-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-16581:
---


> improve upon HIVE-16523 II
> --
>
> Key: HIVE-16581
> URL: https://issues.apache.org/jira/browse/HIVE-16581
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>
> Some things could be faster.





[jira] [Updated] (HIVE-16581) improve upon HIVE-16523 II

2017-05-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16581:

Description: A bug  (was: Some things could be faster.)

> improve upon HIVE-16523 II
> --
>
> Key: HIVE-16581
> URL: https://issues.apache.org/jira/browse/HIVE-16581
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>
> A bug





[jira] [Updated] (HIVE-16578) Semijoin Hints should use column name, if provided for partition key check

2017-05-03 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-16578:
--
Attachment: HIVE-16578.2.patch

Implemented review comments.
Added a new test.

> Semijoin Hints should use column name, if provided for partition key check
> --
>
> Key: HIVE-16578
> URL: https://issues.apache.org/jira/browse/HIVE-16578
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-16578.1.patch, HIVE-16578.2.patch
>
>
> Current logic does not verify the column name provided in the hint against 
> the column on which the runtime filtering branch will originate from.





[jira] [Commented] (HIVE-16579) CachedStore: improvements to partition col stats caching

2017-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15996000#comment-15996000
 ] 

Hive QA commented on HIVE-16579:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866287/HIVE-16579.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 10648 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5033/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5033/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5033/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866287 - PreCommit-HIVE-Build

> CachedStore: improvements to partition col stats caching
> 
>
> Key: HIVE-16579
> URL: https://issues.apache.org/jira/browse/HIVE-16579
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-16579.1.patch
>
>
> 1. Update stats cache when partitions/table is dropped.
> 2. Update cached partition col stats in the background cache update thread. 





[jira] [Commented] (HIVE-16530) Improve execution logs for REPL commands

2017-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995957#comment-15995957
 ] 

Hive QA commented on HIVE-16530:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866281/Incremental_ReplLoad_Console_Log.png

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5032/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5032/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5032/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-05-04 00:26:41.513
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-5032/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-05-04 00:26:41.515
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 740779f HIVE-15229: 'like any' and 'like all' operators in hive 
(Simanchal Das via Carl Steinbach)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 740779f HIVE-15229: 'like any' and 'like all' operators in hive 
(Simanchal Das via Carl Steinbach)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-05-04 00:26:43.510
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
patch:  Only garbage was found in the patch input.
patch:  Only garbage was found in the patch input.
patch:  Only garbage was found in the patch input.
fatal: unrecognized input
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866281 - PreCommit-HIVE-Build

> Improve execution logs for REPL commands
> 
>
> Key: HIVE-16530
> URL: https://issues.apache.org/jira/browse/HIVE-16530
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
> Attachments: Bootstrap_ReplDump_Console_Log.png, 
> Bootstrap_ReplLoad_Console_Log.png, HIVE-16530.01.patch, 
> Incremental_ReplDump_Console_Log.png, Incremental_ReplLoad_Console_Log.png
>
>
> This is the log format that is being proposed for Hive Repl query logs
> For bootstrap case:
> Hive will log a message for each object as it is bootstrapped, in the 
> following sequence:
> - Tables first (views are tables for this purpose), including their 
> partitions (depth first), followed by functions and constraints
> - The ordering is based on the ordering of the listStatus API of HDFS
> - For each object, a message is logged at the beginning of its replication
> - Every bootstrapped partition is followed by a message giving the number of 
> partitions bootstrapped so far (for the table) and the partition name
> - A message is logged at the end of the bootstrap of each object
> Incremental case:
> - The DB name, event ID, and event type will be part of the log header (for 
> debugging/troubleshooting)
> - For every event replicated, the log will include the current event ID and 
> the total number of events to replicate.
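The incremental-case bookkeeping above can be sketched as follows. This is purely illustrative: the function name, field order, and message format are assumptions, not the actual patch output.

```python
def repl_progress_message(db_name: str, event_id: int,
                          event_type: str, current: int, total: int) -> str:
    """Illustrative only: combine the header fields (DB name, event ID, event
    type) and the progress counters the proposal says each replicated event
    should log. The real log format may differ."""
    return (f"REPL::{db_name} event {event_id} ({event_type}): "
            f"{current}/{total} events replicated")

print(repl_progress_message("sales", 1042, "ADD_PARTITION", 7, 120))
```

Each replicated event would then produce one such progress line, making it easy to see how far an incremental load has advanced.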





[jira] [Commented] (HIVE-16572) Rename a partition should not drop its column stats

2017-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995953#comment-15995953
 ] 

Hive QA commented on HIVE-16572:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866271/HIVE-16572.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10648 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[rename_external_partition_location]
 (batchId=46)
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[external_table_ppd] 
(batchId=90)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5031/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5031/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5031/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866271 - PreCommit-HIVE-Build

> Rename a partition should not drop its column stats
> ---
>
> Key: HIVE-16572
> URL: https://issues.apache.org/jira/browse/HIVE-16572
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16572.patch
>
>
> The column stats for the table sample_pt partition (dummy=1) is as following:
> {code}
> hive> describe formatted sample_pt partition (dummy=1) code;
> OK
> # col_name  data_type  min  max  num_nulls  distinct_count  avg_col_len  max_col_len  num_trues  num_falses  comment
> code        string               0          303             6.985        7                                   from deserializer
> Time taken: 0.259 seconds, Fetched: 3 row(s)
> {code}
> But when this partition is renamed, say
> alter table sample_pt partition (dummy=1) rename to partition (dummy=11);
> The COLUMN_STATS flags in the partition description remain "true", but the 
> column stats are actually all deleted.
> {code}
> hive> describe formatted sample_pt partition (dummy=11);
> OK
> # col_name  data_type  comment 
>
> code  string  
> description   string  
> salaryint 
> total_emp int 
>
> # Partition Information
> # col_name  data_type  comment 
>
> dummy int 
>
> # Detailed Partition Information   
> Partition Value:  [11] 
> Database: default  
> Table:sample_pt
> CreateTime:   Thu Mar 30 23:03:59 EDT 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=11 
>  
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   numFiles  1   
>   numRows 200 
>   rawDataSize 10228   
>   totalSize   10428   
>   transient_lastDdlTime   1490929439  
>
> # Storage Information  
> SerDe Library:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe  
>  
> InputFormat:  org.apache.hadoop.mapred.TextInputFormat 
> OutputFormat: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat   
> Compressed:   No   
> Num Buckets:  -1   
> Bucket Columns:   []   
> Sort Columns: []

[jira] [Updated] (HIVE-16207) Add support for Complex Types in Fast SerDe

2017-05-03 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-16207:
--
Status: Patch Available  (was: Open)

> Add support for Complex Types in Fast SerDe
> ---
>
> Key: HIVE-16207
> URL: https://issues.apache.org/jira/browse/HIVE-16207
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Teddy Choi
>Priority: Critical
> Attachments: HIVE-16207.1.patch, HIVE-16207.1.patch.zip, partial.patch
>
>
> Add complex type support to Fast SerDe classes.  This is needed for fully 
> supporting complex types in Vectorization.





[jira] [Updated] (HIVE-16207) Add support for Complex Types in Fast SerDe

2017-05-03 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-16207:
--
Attachment: HIVE-16207.1.patch

> Add support for Complex Types in Fast SerDe
> ---
>
> Key: HIVE-16207
> URL: https://issues.apache.org/jira/browse/HIVE-16207
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Teddy Choi
>Priority: Critical
> Attachments: HIVE-16207.1.patch, HIVE-16207.1.patch.zip, partial.patch
>
>
> Add complex type support to Fast SerDe classes.  This is needed for fully 
> supporting complex types in Vectorization.





[jira] [Commented] (HIVE-16578) Semijoin Hints should use column name, if provided for partition key check

2017-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995904#comment-15995904
 ] 

Hive QA commented on HIVE-16578:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866262/HIVE-16578.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10648 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_semijoin_user_level]
 (batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[semijoin_hint]
 (batchId=147)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5030/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5030/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5030/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866262 - PreCommit-HIVE-Build

> Semijoin Hints should use column name, if provided for partition key check
> --
>
> Key: HIVE-16578
> URL: https://issues.apache.org/jira/browse/HIVE-16578
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-16578.1.patch
>
>
> The current logic does not verify the column name provided in the hint 
> against the column from which the runtime filtering branch will originate.





[jira] [Commented] (HIVE-16047) Shouldn't try to get KeyProvider unless encryption is enabled

2017-05-03 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995860#comment-15995860
 ] 

Lei (Eddy) Xu commented on HIVE-16047:
--

[~spena], HDFS-11687 introduces a {{HdfsAdmin#getKeyProvider()}} API for Hive. 
It returns {{null}} if encryption is not enabled. It is close to being 
committed (by EOD or tomorrow).

> Shouldn't try to get KeyProvider unless encryption is enabled
> -
>
> Key: HIVE-16047
> URL: https://issues.apache.org/jira/browse/HIVE-16047
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-16047.1.patch, HIVE-16047.2.patch
>
>
> Found lots of following errors in HS2 log:
> {noformat}
> hdfs.KeyProviderCache: Could not find uri with key 
> [dfs.encryption.key.provider.uri] to create a keyProvider !!
> {noformat}
> Similar to HDFS-7931
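A minimal sketch of the guard the issue title describes: fetch a KeyProvider only when an encryption key provider URI is actually configured, which avoids the noisy warning quoted above. This is illustrative Python, not Hive's actual code; `create_provider` is a hypothetical stand-in for the real lookup.

```python
def create_provider(uri):
    # Hypothetical stand-in for the real KeyProvider lookup.
    return {"uri": uri}

def get_key_provider(conf):
    # Guard clause: skip the lookup entirely when no encryption key
    # provider URI is configured (the key name from the log above).
    uri = conf.get("dfs.encryption.key.provider.uri")
    if not uri:
        return None  # encryption not enabled; nothing to fetch
    return create_provider(uri)
```

With an empty configuration the function simply returns `None` instead of attempting (and logging a failure for) a provider lookup.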





[jira] [Updated] (HIVE-16579) CachedStore: improvements to partition col stats caching

2017-05-03 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-16579:

Status: Patch Available  (was: Open)

> CachedStore: improvements to partition col stats caching
> 
>
> Key: HIVE-16579
> URL: https://issues.apache.org/jira/browse/HIVE-16579
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-16579.1.patch
>
>
> 1. Update stats cache when partitions/table is dropped.
> 2. Update cached partition col stats in the background cache update thread. 





[jira] [Updated] (HIVE-16579) CachedStore: improvements to partition col stats caching

2017-05-03 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-16579:

Attachment: HIVE-16579.1.patch

> CachedStore: improvements to partition col stats caching
> 
>
> Key: HIVE-16579
> URL: https://issues.apache.org/jira/browse/HIVE-16579
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-16579.1.patch
>
>
> 1. Update stats cache when partitions/table is dropped.
> 2. Update cached partition col stats in the background cache update thread. 





[jira] [Assigned] (HIVE-16580) CachedStore: Cache column stats for unpartitioned tables

2017-05-03 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta reassigned HIVE-16580:
---


> CachedStore: Cache column stats for unpartitioned tables
> 
>
> Key: HIVE-16580
> URL: https://issues.apache.org/jira/browse/HIVE-16580
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>






[jira] [Updated] (HIVE-16579) CachedStore: improvements to partition col stats caching

2017-05-03 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-16579:

Description: 
1. Update stats cache when partitions/table is dropped.
2. Update cached partition col stats in the background cache update thread. 

> CachedStore: improvements to partition col stats caching
> 
>
> Key: HIVE-16579
> URL: https://issues.apache.org/jira/browse/HIVE-16579
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>
> 1. Update stats cache when partitions/table is dropped.
> 2. Update cached partition col stats in the background cache update thread. 





[jira] [Assigned] (HIVE-16579) CachedStore: improvements to partition col stats caching

2017-05-03 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta reassigned HIVE-16579:
---


> CachedStore: improvements to partition col stats caching
> 
>
> Key: HIVE-16579
> URL: https://issues.apache.org/jira/browse/HIVE-16579
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>






[jira] [Commented] (HIVE-16577) Syntax error in the metastore init scripts for mssql

2017-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995821#comment-15995821
 ] 

Hive QA commented on HIVE-16577:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866256/HIVE-16577.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10637 tests 
executed
*Failed tests:*
{noformat}
TestHs2Hooks - did not produce a TEST-*.xml file (likely timed out) 
(batchId=214)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_join30]
 (batchId=148)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5029/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5029/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5029/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866256 - PreCommit-HIVE-Build

> Syntax error in the metastore init scripts for mssql
> 
>
> Key: HIVE-16577
> URL: https://issues.apache.org/jira/browse/HIVE-16577
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0, 2.3.0, 3.0.0, 2.4.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Blocker
> Attachments: HIVE-16577.01.patch
>
>
> HIVE-10562 introduced a new column to the {{NOTIFICATION_LOG}} table. The 
> mssql init scripts which were modified have a syntax error, and they fail to 
> initialize the metastore schema from 2.2.0 onwards.





[jira] [Commented] (HIVE-11609) Capability to add a filter to hbase scan via composite key doesn't work

2017-05-03 Thread Swarnim Kulkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995810#comment-15995810
 ] 

Swarnim Kulkarni commented on HIVE-11609:
-

Hey [~zsombor.klara]. Please feel free to take a stab if you have time. 
Unfortunately I am very occupied for the next few weeks but can definitely 
help answer any questions.

> Capability to add a filter to hbase scan via composite key doesn't work
> ---
>
> Key: HIVE-11609
> URL: https://issues.apache.org/jira/browse/HIVE-11609
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Reporter: Swarnim Kulkarni
>Assignee: Swarnim Kulkarni
> Attachments: HIVE-11609.1.patch.txt, HIVE-11609.2.patch.txt, 
> HIVE-11609.3.patch.txt, HIVE-11609.4.patch.txt, HIVE-11609.5.patch, 
> HIVE-11609.6.patch.txt, HIVE-11609.7.patch.txt
>
>
> It seems like the capability to add a filter to an HBase scan, which was 
> added as part of HIVE-6411, doesn't work. This is primarily because in 
> HiveHBaseInputFormat, the filter is added in getSplits instead of 
> getRecordReader. This works fine for start and stop keys, but not for a 
> filter, because a filter is respected only when an actual scan is performed. 
> This is also related to the initial refactoring that was done as part of 
> HIVE-3420.





[jira] [Updated] (HIVE-6147) Support avro data stored in HBase columns

2017-05-03 Thread Swarnim Kulkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swarnim Kulkarni updated HIVE-6147:
---
Labels:   (was: TODOC14)

Docs look good. Removing the label.

> Support avro data stored in HBase columns
> -
>
> Key: HIVE-6147
> URL: https://issues.apache.org/jira/browse/HIVE-6147
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Swarnim Kulkarni
>Assignee: Swarnim Kulkarni
> Fix For: 0.14.0
>
> Attachments: HIVE-6147.1.patch.txt, HIVE-6147.2.patch.txt, 
> HIVE-6147.3.patch.txt, HIVE-6147.3.patch.txt, HIVE-6147.4.patch.txt, 
> HIVE-6147.5.patch.txt, HIVE-6147.6.patch.txt
>
>
> Presently, the HBase Hive integration supports querying only primitive data 
> types in columns. It would be nice to be able to store and query Avro objects 
> in HBase columns by making them visible as structs to Hive. This will allow 
> Hive to perform ad hoc analysis of HBase data which can be deeply structured.





[jira] [Work stopped] (HIVE-16530) Improve execution logs for REPL commands

2017-05-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-16530 stopped by Sankar Hariappan.
---
> Improve execution logs for REPL commands
> 
>
> Key: HIVE-16530
> URL: https://issues.apache.org/jira/browse/HIVE-16530
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
> Attachments: Bootstrap_ReplDump_Console_Log.png, 
> Bootstrap_ReplLoad_Console_Log.png, HIVE-16530.01.patch, 
> Incremental_ReplDump_Console_Log.png, Incremental_ReplLoad_Console_Log.png
>
>
> This is the log format that is being proposed for Hive Repl query logs
> For bootstrap case:
> Hive will log a message for each object as it is bootstrapped, in the 
> following sequence:
> - Tables first (views are treated as tables for this purpose), one at a 
> time, including partitions (depth first), followed by functions and 
> constraints.
> - The ordering is based on the ordering of the HDFS listStatus API.
> - For each object, a message is logged at the beginning of its replication.
> - Each bootstrapped partition is followed by a message giving the number of 
> partitions bootstrapped so far (for the table) and the partition name.
> - A message is logged at the end of each object's bootstrap.
> Incremental case:
> - The DB name, event ID, and event type will be part of the log header (for 
> debugging/troubleshooting).
> - For every event replicated, the log will report the current event ID and 
> the total number of events to replicate.
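As a rough illustration of the proposed incremental format, a log line per replicated event carrying the header fields and the progress counters might look like the sketch below. This is illustrative Python; `format_event_logs` and the exact message layout are hypothetical, not the patch's actual output.

```python
def format_event_logs(db_name, events):
    # One line per replicated event: DB name, event id, and event type in
    # the header, plus "current event of total" progress information.
    total = len(events)
    return [
        "REPL::[%s][%s][%s] replicated event %d of %d"
        % (db_name, event_id, event_type, i, total)
        for i, (event_id, event_type) in enumerate(events, start=1)
    ]

for line in format_event_logs("default",
                              [(101, "CREATE_TABLE"), (102, "ADD_PARTITION")]):
    print(line)
```

Each line is self-describing, so a single grep on the event ID is enough to locate every log entry for one event during troubleshooting.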





[jira] [Updated] (HIVE-16530) Improve execution logs for REPL commands

2017-05-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16530:

Status: Patch Available  (was: Open)

> Improve execution logs for REPL commands
> 
>
> Key: HIVE-16530
> URL: https://issues.apache.org/jira/browse/HIVE-16530
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
> Attachments: Bootstrap_ReplDump_Console_Log.png, 
> Bootstrap_ReplLoad_Console_Log.png, HIVE-16530.01.patch, 
> Incremental_ReplDump_Console_Log.png, Incremental_ReplLoad_Console_Log.png
>
>
> This is the log format that is being proposed for Hive Repl query logs
> For bootstrap case:
> Hive will log a message for each object as it is bootstrapped, in the 
> following sequence:
> - Tables first (views are treated as tables for this purpose), one at a 
> time, including partitions (depth first), followed by functions and 
> constraints.
> - The ordering is based on the ordering of the HDFS listStatus API.
> - For each object, a message is logged at the beginning of its replication.
> - Each bootstrapped partition is followed by a message giving the number of 
> partitions bootstrapped so far (for the table) and the partition name.
> - A message is logged at the end of each object's bootstrap.
> Incremental case:
> - The DB name, event ID, and event type will be part of the log header (for 
> debugging/troubleshooting).
> - For every event replicated, the log will report the current event ID and 
> the total number of events to replicate.





[jira] [Comment Edited] (HIVE-16530) Improve execution logs for REPL commands

2017-05-03 Thread Sankar Hariappan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995768#comment-15995768
 ] 

Sankar Hariappan edited comment on HIVE-16530 at 5/3/17 10:04 PM:
--

Added 01.patch:
- Added progress logs for REPL DUMP and REPL LOAD.
- Logs are at the table level for bootstrap and at the event level for 
incremental.
- Object-level logging is not included; it will be taken up as a follow-up task.
- No new tests were added, as the logs appear on the console; console output is 
attached here for review.

Request [~sushanth], [~thejas] to please review the patch.


was (Author: sankarh):
Added 01.patch:
- Added progress logs for REPL DUMP and REPL LOAD.
- Logs are at the table level for bootstrap and at the event level for 
incremental.
- Object-level logging is not included; it will be taken up as a follow-up task.
- No new tests were added, as the logs appear on the console; console output is 
attached here for review.

> Improve execution logs for REPL commands
> 
>
> Key: HIVE-16530
> URL: https://issues.apache.org/jira/browse/HIVE-16530
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
> Attachments: Bootstrap_ReplDump_Console_Log.png, 
> Bootstrap_ReplLoad_Console_Log.png, HIVE-16530.01.patch, 
> Incremental_ReplDump_Console_Log.png, Incremental_ReplLoad_Console_Log.png
>
>
> This is the log format that is being proposed for Hive Repl query logs
> For bootstrap case:
> Hive will log a message for each object as it is bootstrapped, in the 
> following sequence:
> - Tables first (views are treated as tables for this purpose), one at a 
> time, including partitions (depth first), followed by functions and 
> constraints.
> - The ordering is based on the ordering of the HDFS listStatus API.
> - For each object, a message is logged at the beginning of its replication.
> - Each bootstrapped partition is followed by a message giving the number of 
> partitions bootstrapped so far (for the table) and the partition name.
> - A message is logged at the end of each object's bootstrap.
> Incremental case:
> - The DB name, event ID, and event type will be part of the log header (for 
> debugging/troubleshooting).
> - For every event replicated, the log will report the current event ID and 
> the total number of events to replicate.





[jira] [Updated] (HIVE-16530) Improve execution logs for REPL commands

2017-05-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16530:

Attachment: Incremental_ReplLoad_Console_Log.png
Incremental_ReplDump_Console_Log.png
Bootstrap_ReplLoad_Console_Log.png
Bootstrap_ReplDump_Console_Log.png
HIVE-16530.01.patch

Added 01.patch:
- Added progress logs for REPL DUMP and REPL LOAD.
- Logs are at the table level for bootstrap and at the event level for 
incremental.
- Object-level logging is not included; it will be taken up as a follow-up task.
- No new tests were added, as the logs appear on the console; console output is 
attached here for review.

> Improve execution logs for REPL commands
> 
>
> Key: HIVE-16530
> URL: https://issues.apache.org/jira/browse/HIVE-16530
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
> Attachments: Bootstrap_ReplDump_Console_Log.png, 
> Bootstrap_ReplLoad_Console_Log.png, HIVE-16530.01.patch, 
> Incremental_ReplDump_Console_Log.png, Incremental_ReplLoad_Console_Log.png
>
>
> This is the log format that is being proposed for Hive Repl query logs
> For bootstrap case:
> Hive will log a message for each object as it is bootstrapped, in the 
> following sequence:
> - Tables first (views are treated as tables for this purpose), one at a 
> time, including partitions (depth first), followed by functions and 
> constraints.
> - The ordering is based on the ordering of the HDFS listStatus API.
> - For each object, a message is logged at the beginning of its replication.
> - Each bootstrapped partition is followed by a message giving the number of 
> partitions bootstrapped so far (for the table) and the partition name.
> - A message is logged at the end of each object's bootstrap.
> Incremental case:
> - The DB name, event ID, and event type will be part of the log header (for 
> debugging/troubleshooting).
> - For every event replicated, the log will report the current event ID and 
> the total number of events to replicate.





[jira] [Commented] (HIVE-16530) Improve execution logs for REPL commands

2017-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995756#comment-15995756
 ] 

ASF GitHub Bot commented on HIVE-16530:
---

GitHub user sankarh opened a pull request:

https://github.com/apache/hive/pull/176

HIVE-16530: Improve execution logs for REPL commands



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sankarh/hive HIVE-16530

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/176.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #176


commit d9a5e0aaf317cfdd60d1f2b62840e3f03d0fd1ac
Author: Sankar Hariappan 
Date:   2017-05-03T21:54:41Z

HIVE-16530: Improve execution logs for REPL commands




> Improve execution logs for REPL commands
> 
>
> Key: HIVE-16530
> URL: https://issues.apache.org/jira/browse/HIVE-16530
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>
> This is the log format that is being proposed for Hive Repl query logs
> For bootstrap case:
> Hive will log a message for each object as it is bootstrapped, in the 
> following sequence:
> - Tables first (views are treated as tables for this purpose), one at a 
> time, including partitions (depth first), followed by functions and 
> constraints.
> - The ordering is based on the ordering of the HDFS listStatus API.
> - For each object, a message is logged at the beginning of its replication.
> - Each bootstrapped partition is followed by a message giving the number of 
> partitions bootstrapped so far (for the table) and the partition name.
> - A message is logged at the end of each object's bootstrap.
> Incremental case:
> - The DB name, event ID, and event type will be part of the log header (for 
> debugging/troubleshooting).
> - For every event replicated, the log will report the current event ID and 
> the total number of events to replicate.





[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive

2017-05-03 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-15229:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks Simanchal!

> 'like any' and 'like all' operators in hive
> ---
>
> Key: HIVE-15229
> URL: https://issues.apache.org/jira/browse/HIVE-15229
> Project: Hive
>  Issue Type: New Feature
>  Components: Operators
>Reporter: Simanchal Das
>Assignee: Simanchal Das
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-15229.1.patch, HIVE-15229.2.patch, 
> HIVE-15229.3.patch, HIVE-15229.4.patch, HIVE-15229.5.patch, HIVE-15229.6.patch
>
>
> In Teradata, the 'like any' and 'like all' operators are mostly used when 
> matching a text field against a number of patterns.
> 'like any' and 'like all' are equivalent to multiple like conditions, as in 
> the examples below.
> {noformat}
> --like any
> select col1 from table1 where col2 like any ('%accountant%', '%accounting%', 
> '%retail%', '%bank%', '%insurance%');
> --Can be written using multiple like condition 
> select col1 from table1 where col2 like '%accountant%' or col2 like 
> '%accounting%' or col2 like '%retail%' or col2 like '%bank%' or col2 like 
> '%insurance%' ;
> --like all
> select col1 from table1 where col2 like all ('%accountant%', '%accounting%', 
> '%retail%', '%bank%', '%insurance%');
> --Can be written using multiple like operator 
> select col1 from table1 where col2 like '%accountant%' and col2 like 
> '%accounting%' and col2 like '%retail%' and col2 like '%bank%' and col2 like 
> '%insurance%' ;
> {noformat}
> Problem statement:
> Nowadays many data warehouse projects are being migrated from Teradata to 
> Hive, and data engineers and business analysts regularly look for these two 
> operators.
> If we introduce these two operators in Hive, many scripts can be migrated 
> smoothly instead of converting these operators to multiple like conditions.
> Result:
> 1. The 'LIKE ANY' operator returns true if a text (column value) matches any 
> pattern.
> 2. The 'LIKE ALL' operator returns true if a text (column value) matches all 
> patterns.
> 3. 'LIKE ANY' and 'LIKE ALL' return NULL not only if the expression on the 
> left-hand side is NULL, but also if one of the patterns in the list is NULL.
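The semantics above, including the NULL rule exactly as the description states it, can be sketched in a few lines of Python. This is an illustrative model (with `None` standing in for SQL NULL), not Hive's implementation; the helper names are hypothetical.

```python
import re

def _like_regex(pattern):
    # Translate a SQL LIKE pattern into an anchored regex:
    # '%' matches any run of characters, '_' matches a single character.
    return "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )

def _like(text, pattern):
    return re.fullmatch(_like_regex(pattern), text, re.DOTALL) is not None

def like_any(text, patterns):
    # Per rule 3 above: NULL (None) on the left side or in the pattern
    # list yields NULL.
    if text is None or any(p is None for p in patterns):
        return None
    return any(_like(text, p) for p in patterns)

def like_all(text, patterns):
    if text is None or any(p is None for p in patterns):
        return None
    return all(_like(text, p) for p in patterns)
```

For example, `like_any('senior accountant', ['%accountant%', '%retail%'])` is `True`, while `like_all('retail', ['%retail%', '%bank%'])` is `False`.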





[jira] [Commented] (HIVE-16556) Modify schematool scripts to initialize and create METASTORE_DB_PROPERTIES table

2017-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995745#comment-15995745
 ] 

Hive QA commented on HIVE-16556:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866252/HIVE-16556.04.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10640 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbasestats] 
(batchId=91)
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5028/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5028/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5028/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866252 - PreCommit-HIVE-Build

> Modify schematool scripts to initialize and create METASTORE_DB_PROPERTIES 
> table
> 
>
> Key: HIVE-16556
> URL: https://issues.apache.org/jira/browse/HIVE-16556
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16556.01.patch, HIVE-16556.02.patch, 
> HIVE-16556.03.patch, HIVE-16556.04.patch
>
>
> Sub-task to modify schematool and its related changes so that the new table 
> is added to the schema when schematool initializes or upgrades it.





[jira] [Updated] (HIVE-16572) Rename a partition should not drop its column stats

2017-05-03 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16572:
---
Status: Patch Available  (was: Open)

> Rename a partition should not drop its column stats
> ---
>
> Key: HIVE-16572
> URL: https://issues.apache.org/jira/browse/HIVE-16572
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16572.patch
>
>
> The column stats for the table sample_pt partition (dummy=1) is as following:
> {code}
> hive> describe formatted sample_pt partition (dummy=1) code;
> OK
> # col_name  data_type  min  max  num_nulls  distinct_count  avg_col_len  max_col_len  num_trues  num_falses  comment
> code        string               0          303             6.985        7                                   from deserializer
> Time taken: 0.259 seconds, Fetched: 3 row(s)
> {code}
> But when this partition is renamed, e.g.
> alter table sample_pt partition (dummy=1) rename to partition (dummy=11);
> the COLUMN_STATS flags in the partition description still read true, but the 
> column stats have actually all been deleted.
> {code}
> hive> describe formatted sample_pt partition (dummy=11);
> OK
> # col_name  data_type  comment 
>
> code  string  
> description   string  
> salaryint 
> total_emp int 
>
> # Partition Information
> # col_name  data_type  comment 
>
> dummy int 
>
> # Detailed Partition Information   
> Partition Value:  [11] 
> Database: default  
> Table:sample_pt
> CreateTime:   Thu Mar 30 23:03:59 EDT 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=11 
>  
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   numFiles1   
>   numRows 200 
>   rawDataSize 10228   
>   totalSize   10428   
>   transient_lastDdlTime   1490929439  
>
> # Storage Information  
> SerDe Library:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe  
>  
> InputFormat:  org.apache.hadoop.mapred.TextInputFormat 
> OutputFormat: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat   
> Compressed:   No   
> Num Buckets:  -1   
> Bucket Columns:   []   
> Sort Columns: []   
> Storage Desc Params:   
>   serialization.format1   
> Time taken: 6.783 seconds, Fetched: 37 row(s)
> ===
> hive> describe formatted sample_pt partition (dummy=11) code;
> OK
> # col_name  data_type  comment 
>  
>   
>  
> code  string  from deserializer   
>  
> Time taken: 9.429 seconds, Fetched: 3 row(s)
> {code}
> The column stats should not be dropped when a partition is renamed.





[jira] [Updated] (HIVE-16572) Rename a partition should not drop its column stats

2017-05-03 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16572:
---
Attachment: HIVE-16572.patch

The patch does the following:
1. Keep the partition column stats when a partition is renamed.
2. Refactor the partition-renaming logic: the partition directory is now moved 
before the HMS transaction commits, since that makes it easier to revert the 
data move if the rename fails.
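The reordering in (2) can be sketched as follows. This is a hypothetical outline of the commit ordering only; none of these method names come from the actual Hive source.

```java
// Hypothetical sketch of "move data first, commit metastore second";
// all method bodies are stand-ins, not Hive code.
public class RenamePartitionSketch {

    static boolean moveDirectory(String src, String dest) {
        // stand-in for the filesystem move of the partition directory
        return true;
    }

    static void revertMove(String src, String dest) {
        // stand-in for moving the data back to its original location
    }

    static void commitHmsTransaction() {
        // stand-in for committing the metastore change; with the patch,
        // the committed partition carries its column stats forward
    }

    static void renamePartition(String oldPath, String newPath) {
        // 1. Move the partition data first.
        if (!moveDirectory(oldPath, newPath)) {
            throw new RuntimeException("data move failed; metastore untouched");
        }
        try {
            // 2. Only then commit the HMS transaction.
            commitHmsTransaction();
        } catch (RuntimeException e) {
            // 3. Reverting is easy: the metastore has not committed yet,
            //    so undoing the rename is just moving the data back.
            revertMove(newPath, oldPath);
            throw e;
        }
    }
}
```

If the order were reversed (commit first, move second), a failed data move would leave committed metadata pointing at a directory that does not exist.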

> Rename a partition should not drop its column stats
> ---
>
> Key: HIVE-16572
> URL: https://issues.apache.org/jira/browse/HIVE-16572
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16572.patch
>
>





[jira] [Commented] (HIVE-16576) Fix encoding of intervals when fetching select query candidates from druid

2017-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995651#comment-15995651
 ] 

Hive QA commented on HIVE-16576:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866249/HIVE-16576.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10638 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] 
(batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_join30]
 (batchId=148)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5027/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5027/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5027/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866249 - PreCommit-HIVE-Build

> Fix encoding of intervals when fetching select query candidates from druid
> --
>
> Key: HIVE-16576
> URL: https://issues.apache.org/jira/browse/HIVE-16576
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
> Attachments: HIVE-16576.patch
>
>
> Debug logs on HIVE side - 
> {code}
> 2017-05-03T23:49:00,672 DEBUG [HttpClient-Netty-Worker-0] 
> client.NettyHttpClient: [GET 
> http://localhost:8082/druid/v2/datasources/cmv_basetable_druid/candidates?intervals=1900-01-01T00:00:00.000+05:53:20/3000-01-01T00:00:00.000+05:30]
>  Got response: 500 Server Error
> {code}
> Druid exception stack trace - 
> {code}
> 2017-05-03T18:56:58,928 WARN [qtp1651318806-158] 
> org.eclipse.jetty.servlet.ServletHandler - 
> /druid/v2/datasources/cmv_basetable_druid/candidates
> java.lang.IllegalArgumentException: Invalid format: ""1900-01-01T00:00:00.000 
> 05:53:20"
>   at 
> org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:899)
>  ~[joda-time-2.8.2.jar:2.8.2]
>   at 
> org.joda.time.convert.StringConverter.setInto(StringConverter.java:212) 
> ~[joda-time-2.8.2.jar:2.8.2]
>   at org.joda.time.base.BaseInterval.<init>(BaseInterval.java:200) 
> ~[joda-time-2.8.2.jar:2.8.2]
>   at org.joda.time.Interval.<init>(Interval.java:193) 
> ~[joda-time-2.8.2.jar:2.8.2]
>   at org.joda.time.Interval.parse(Interval.java:69) 
> ~[joda-time-2.8.2.jar:2.8.2]
>   at 
> io.druid.server.ClientInfoResource.getQueryTargets(ClientInfoResource.java:320)
>  ~[classes/:?]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_92]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_92]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_92]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_92]
> {code}
> Note that the intervals sent as part of the HTTP request URL are not 
> encoded properly when a non-UTC timezone is used.
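The failure can be reproduced outside Hive: an unencoded `+` in a query string is decoded as a space by the server, which is exactly the invalid timestamp `1900-01-01T00:00:00.000 05:53:20` in the stack trace above. A minimal sketch (not the actual patch) of percent-encoding the interval before building the request URL:

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class IntervalEncoding {
    public static void main(String[] args) {
        // The interval string from the debug log above; the '+' in the
        // timezone offsets is the problem character.
        String interval =
            "1900-01-01T00:00:00.000+05:53:20/3000-01-01T00:00:00.000+05:30";

        // Sent raw, an HTTP server decodes '+' in the query string as a
        // space, yielding "1900-01-01T00:00:00.000 05:53:20" - the invalid
        // timestamp Druid rejects.
        String encoded = URLEncoder.encode(interval, StandardCharsets.UTF_8);
        System.out.println(encoded); // '+' becomes %2B, '/' becomes %2F

        // Percent-encoded, the interval survives the round trip intact.
        System.out.println(
            URLDecoder.decode(encoded, StandardCharsets.UTF_8).equals(interval));
    }
}
```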





[jira] [Commented] (HIVE-16577) Syntax error in the metastore init scripts for mssql

2017-05-03 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995639#comment-15995639
 ] 

Vihang Karajgaonkar commented on HIVE-16577:


[~sushanth] Can you please review?

> Syntax error in the metastore init scripts for mssql
> 
>
> Key: HIVE-16577
> URL: https://issues.apache.org/jira/browse/HIVE-16577
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0, 2.3.0, 3.0.0, 2.4.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Blocker
> Attachments: HIVE-16577.01.patch
>
>
> HIVE-10562 introduced a new column to {{NOTIFICATION_LOG}} table. The mssql 
> init scripts which were modified have a syntax error and they fail to 
> initialize metastore schema from 2.2.0 onwards.





[jira] [Updated] (HIVE-15795) Support Accumulo Index Tables in Hive Accumulo Connector

2017-05-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15795:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed the addendum to master. Thanks for the patch(es)!

> Support Accumulo Index Tables in Hive Accumulo Connector
> 
>
> Key: HIVE-15795
> URL: https://issues.apache.org/jira/browse/HIVE-15795
> Project: Hive
>  Issue Type: Improvement
>  Components: Accumulo Storage Handler
>Reporter: Mike Fagan
>Assignee: Mike Fagan
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-15795.1.patch, HIVE-15795.2.patch, 
> HIVE-15795.3.patch
>
>
> Ability to specify an Accumulo index table for an Accumulo-backed Hive table. 
> This would greatly improve performance for non-rowid query predicates.





[jira] [Updated] (HIVE-16485) Enable outputName for RS operator in explain formatted

2017-05-03 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16485:
---
Status: Patch Available  (was: Open)

> Enable outputName for RS operator in explain formatted
> --
>
> Key: HIVE-16485
> URL: https://issues.apache.org/jira/browse/HIVE-16485
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16485.01.patch, HIVE-16485.02.patch, 
> HIVE-16485.03.patch, HIVE-16485.04.patch, HIVE-16485-disableMasking, plan, 
> query
>
>






[jira] [Commented] (HIVE-16578) Semijoin Hints should use column name, if provided for partition key check

2017-05-03 Thread Deepak Jaiswal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995569#comment-15995569
 ] 

Deepak Jaiswal commented on HIVE-16578:
---

Review board link:

https://reviews.apache.org/r/58973

> Semijoin Hints should use column name, if provided for partition key check
> --
>
> Key: HIVE-16578
> URL: https://issues.apache.org/jira/browse/HIVE-16578
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-16578.1.patch
>
>
> The current logic does not verify the column name provided in the hint 
> against the column from which the runtime filtering branch will originate.





[jira] [Commented] (HIVE-16558) In the hiveserver2.jsp page, when you click Drilldown to view the details of the Closed Queries, the Chinese show garbled

2017-05-03 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995562#comment-15995562
 ] 

Xuefu Zhang commented on HIVE-16558:


+1

> In the hiveserver2.jsp page, when you click Drilldown to view the details of 
> the Closed Queries, the Chinese show garbled
> -
>
> Key: HIVE-16558
> URL: https://issues.apache.org/jira/browse/HIVE-16558
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
> Fix For: 3.0.0
>
> Attachments: HIVE-16558.1.patch
>
>
> In QueryProfileImpl.jamon, we see the following settings:
> 
> 
>   
> 
> HiveServer2
> 
> 
> 
> 
> 
>   
> So we should set the response encoding to UTF-8, which avoids garbled 
> Chinese (or other languages). Please check it!
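The fix amounts to declaring UTF-8 in the response's Content-Type. A minimal, self-contained illustration of why the declared charset matters; the strings here are illustrative, and the actual fix lives in QueryProfileImpl.jamon, not in this hypothetical class:

```java
import java.nio.charset.StandardCharsets;

public class Utf8ResponseDemo {
    public static void main(String[] args) {
        // A Chinese query string as it might appear in the Closed Queries
        // drill-down page, serialized as UTF-8 bytes on the wire.
        byte[] utf8Bytes = "\u67e5\u8be2".getBytes(StandardCharsets.UTF_8);

        // Decoded with the declared charset, the text survives intact.
        String good = new String(utf8Bytes, StandardCharsets.UTF_8);

        // Decoded with a single-byte fallback such as ISO-8859-1 (what a
        // browser may assume when no charset is declared), every UTF-8
        // byte becomes a separate garbage character.
        String bad = new String(utf8Bytes, StandardCharsets.ISO_8859_1);

        System.out.println(good.equals("\u67e5\u8be2")); // true
        System.out.println(bad.equals("\u67e5\u8be2"));  // false
    }
}
```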





[jira] [Updated] (HIVE-16578) Semijoin Hints should use column name, if provided for partition key check

2017-05-03 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-16578:
--
Attachment: HIVE-16578.1.patch

Initial patch.
Some refactoring of code.

> Semijoin Hints should use column name, if provided for partition key check
> --
>
> Key: HIVE-16578
> URL: https://issues.apache.org/jira/browse/HIVE-16578
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-16578.1.patch
>
>
> The current logic does not verify the column name provided in the hint 
> against the column from which the runtime filtering branch will originate.





[jira] [Updated] (HIVE-16578) Semijoin Hints should use column name, if provided for partition key check

2017-05-03 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-16578:
--
Status: Patch Available  (was: In Progress)

> Semijoin Hints should use column name, if provided for partition key check
> --
>
> Key: HIVE-16578
> URL: https://issues.apache.org/jira/browse/HIVE-16578
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>
> The current logic does not verify the column name provided in the hint 
> against the column from which the runtime filtering branch will originate.





[jira] [Assigned] (HIVE-16578) Semijoin Hints should use column name, if provided for partition key check

2017-05-03 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal reassigned HIVE-16578:
-


> Semijoin Hints should use column name, if provided for partition key check
> --
>
> Key: HIVE-16578
> URL: https://issues.apache.org/jira/browse/HIVE-16578
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>
> The current logic does not verify the column name provided in the hint 
> against the column from which the runtime filtering branch will originate.





[jira] [Commented] (HIVE-16552) Limit the number of tasks a Spark job may contain

2017-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995535#comment-15995535
 ] 

Hive QA commented on HIVE-16552:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866237/HIVE-16552.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10640 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] 
(batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[spark_job_max_tasks]
 (batchId=87)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5026/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5026/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5026/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866237 - PreCommit-HIVE-Build

> Limit the number of tasks a Spark job may contain
> -
>
> Key: HIVE-16552
> URL: https://issues.apache.org/jira/browse/HIVE-16552
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 1.0.0, 2.0.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-16552.1.patch, HIVE-16552.2.patch, 
> HIVE-16552.3.patch, HIVE-16552.patch
>
>
> It's commonly desirable to block bad, big queries that take a lot of YARN 
> resources. One approach, similar to mapreduce.job.max.map in MapReduce, is to 
> stop a query that invokes a Spark job containing too many tasks. The 
> proposal here is to introduce hive.spark.job.max.tasks with a default value 
> of -1 (no limit), which an admin can set to block queries that trigger too 
> many Spark tasks.
> Please note that this control knob applies to a single Spark job, though one 
> query can trigger multiple Spark jobs (such as in the case of a map-join). 
> Nevertheless, the proposed approach is still helpful.
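A minimal sketch of the proposed guard. The `checkTaskLimit` method and its signature are hypothetical, but the semantics of hive.spark.job.max.tasks (default -1 = no limit) follow the description above:

```java
// Hypothetical guard for the proposed hive.spark.job.max.tasks knob;
// not the actual Hive implementation.
public class SparkTaskLimit {

    static void checkTaskLimit(int numTasks, int maxTasks) {
        // maxTasks < 0 means the knob is unset (default -1): never block.
        if (maxTasks >= 0 && numTasks > maxTasks) {
            throw new IllegalStateException("Spark job has " + numTasks
                + " tasks, exceeding hive.spark.job.max.tasks=" + maxTasks);
        }
    }

    public static void main(String[] args) {
        checkTaskLimit(50_000, -1);  // default: unlimited, passes
        checkTaskLimit(500, 10_000); // under the limit, passes
        try {
            checkTaskLimit(50_000, 10_000); // over the limit: query blocked
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```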





[jira] [Commented] (HIVE-15349) Create a jenkins job to run HMS schema testing on a regular basis.

2017-05-03 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995512#comment-15995512
 ] 

Vihang Karajgaonkar commented on HIVE-15349:


Hi [~ngangam], I can help get this back up.

> Create a jenkins job to run HMS schema testing on a regular basis.
> --
>
> Key: HIVE-15349
> URL: https://issues.apache.org/jira/browse/HIVE-15349
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>
> Create a Jenkins job, to be run on a regular basis, that runs the schema 
> upgrade tests.





[jira] [Updated] (HIVE-16577) Syntax error in the metastore init scripts for mssql

2017-05-03 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16577:
---
Attachment: HIVE-16577.01.patch

Attaching the patch. Tested manually that the schema init script now works fine 
for 2.3.0 and 3.0.0.

> Syntax error in the metastore init scripts for mssql
> 
>
> Key: HIVE-16577
> URL: https://issues.apache.org/jira/browse/HIVE-16577
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0, 2.3.0, 3.0.0, 2.4.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Blocker
> Attachments: HIVE-16577.01.patch
>
>
> HIVE-10562 introduced a new column to {{NOTIFICATION_LOG}} table. The mssql 
> init scripts which were modified have a syntax error and they fail to 
> initialize metastore schema from 2.2.0 onwards.





[jira] [Updated] (HIVE-16577) Syntax error in the metastore init scripts for mssql

2017-05-03 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16577:
---
Status: Patch Available  (was: Open)

> Syntax error in the metastore init scripts for mssql
> 
>
> Key: HIVE-16577
> URL: https://issues.apache.org/jira/browse/HIVE-16577
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0, 2.3.0, 3.0.0, 2.4.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Blocker
> Attachments: HIVE-16577.01.patch
>
>
> HIVE-10562 introduced a new column to {{NOTIFICATION_LOG}} table. The mssql 
> init scripts which were modified have a syntax error and they fail to 
> initialize metastore schema from 2.2.0 onwards.





[jira] [Commented] (HIVE-16501) Add rej/orig to .gitignore ; remove *.orig files

2017-05-03 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995496#comment-15995496
 ] 

Zoltan Haindrich commented on HIVE-16501:
-

[~ekoifman] If you haven't added a file to git, the code will still compile 
fine, and since the worktree may already be crowded with stray files, it will 
even pass the compile test. You suggest setting gitignore for myself, but that 
doesn't really address the original issue. Anyway: if you would like to revert 
this change, I'm not against it - but these .orig and .rej files will come 
back later again...

I've played with it a bit, and the following might come in handy: setting 
{{--index}} for git apply makes it apply the patch directly to the git index, 
which seems very useful:
{code}
git apply --index HIVE-15224.3-branch-1.patch
# or enable git-level conflict markers; this will appear like a regular merge
# problem - but marked at the git level as well
git apply -3 p0.patch
{code}
Both of these will never miss an added file, and {{-3}} will mark a conflicted 
file; since this is now a git merge conflict, {{git mergetool}} can be used to 
resolve it.

> Add rej/orig to .gitignore ; remove *.orig files
> 
>
> Key: HIVE-16501
> URL: https://issues.apache.org/jira/browse/HIVE-16501
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Fix For: 3.0.0
>
> Attachments: HIVE-16501.1.patch
>
>
> Sometimes git reject/orig files make their way into the repo...
> it would be better to just ignore them :)





[jira] [Assigned] (HIVE-16577) Syntax error in the metastore init scripts for mssql

2017-05-03 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned HIVE-16577:
--


> Syntax error in the metastore init scripts for mssql
> 
>
> Key: HIVE-16577
> URL: https://issues.apache.org/jira/browse/HIVE-16577
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0, 2.3.0, 3.0.0, 2.4.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Blocker
>
> HIVE-10562 introduced a new column to {{NOTIFICATION_LOG}} table. The mssql 
> init scripts which were modified have a syntax error and they fail to 
> initialize metastore schema from 2.2.0 onwards.





[jira] [Updated] (HIVE-16556) Modify schematool scripts to initialize and create METASTORE_DB_PROPERTIES table

2017-05-03 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16556:
---
Attachment: HIVE-16556.04.patch

Manually tested against a SQL Server database. Fixed the scripts for mssql.

> Modify schematool scripts to initialize and create METASTORE_DB_PROPERTIES 
> table
> 
>
> Key: HIVE-16556
> URL: https://issues.apache.org/jira/browse/HIVE-16556
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16556.01.patch, HIVE-16556.02.patch, 
> HIVE-16556.03.patch, HIVE-16556.04.patch
>
>
> sub-task to modify schema tool and its related changes so that the new table 
> is added to the schema when schematool initializes or upgrades the schema.





[jira] [Commented] (HIVE-16576) Fix encoding of intervals when fetching select query candidates from druid

2017-05-03 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995475#comment-15995475
 ] 

Jesus Camacho Rodriguez commented on HIVE-16576:


+1 (pending tests)

> Fix encoding of intervals when fetching select query candidates from druid
> --
>
> Key: HIVE-16576
> URL: https://issues.apache.org/jira/browse/HIVE-16576
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
> Attachments: HIVE-16576.patch
>
>





[jira] [Commented] (HIVE-16576) Fix encoding of intervals when fetching select query candidates from druid

2017-05-03 Thread Nishant Bangarwa (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995470#comment-15995470
 ] 

Nishant Bangarwa commented on HIVE-16576:
-

[~jcamachorodriguez] please review. 

> Fix encoding of intervals when fetching select query candidates from druid
> --
>
> Key: HIVE-16576
> URL: https://issues.apache.org/jira/browse/HIVE-16576
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
> Attachments: HIVE-16576.patch
>
>





[jira] [Updated] (HIVE-16576) Fix encoding of intervals when fetching select query candidates from druid

2017-05-03 Thread Nishant Bangarwa (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-16576:

Status: Patch Available  (was: Open)

> Fix encoding of intervals when fetching select query candidates from druid
> --
>
> Key: HIVE-16576
> URL: https://issues.apache.org/jira/browse/HIVE-16576
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
> Attachments: HIVE-16576.patch
>
>
> Debug logs on HIVE side - 
> {code}
> 2017-05-03T23:49:00,672 DEBUG [HttpClient-Netty-Worker-0] 
> client.NettyHttpClient: [GET 
> http://localhost:8082/druid/v2/datasources/cmv_basetable_druid/candidates?intervals=1900-01-01T00:00:00.000+05:53:20/3000-01-01T00:00:00.000+05:30]
>  Got response: 500 Server Error
> {code}
> Druid exception stack trace - 
> {code}
> 2017-05-03T18:56:58,928 WARN [qtp1651318806-158] 
> org.eclipse.jetty.servlet.ServletHandler - 
> /druid/v2/datasources/cmv_basetable_druid/candidates
> java.lang.IllegalArgumentException: Invalid format: ""1900-01-01T00:00:00.000 
> 05:53:20"
>   at 
> org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:899)
>  ~[joda-time-2.8.2.jar:2.8.2]
>   at 
> org.joda.time.convert.StringConverter.setInto(StringConverter.java:212) 
> ~[joda-time-2.8.2.jar:2.8.2]
>   at org.joda.time.base.BaseInterval.(BaseInterval.java:200) 
> ~[joda-time-2.8.2.jar:2.8.2]
>   at org.joda.time.Interval.(Interval.java:193) 
> ~[joda-time-2.8.2.jar:2.8.2]
>   at org.joda.time.Interval.parse(Interval.java:69) 
> ~[joda-time-2.8.2.jar:2.8.2]
>   at 
> io.druid.server.ClientInfoResource.getQueryTargets(ClientInfoResource.java:320)
>  ~[classes/:?]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_92]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_92]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_92]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_92]
> {code}
> Note that intervals being sent as part of the HTTP request URL are not 
> encoded properly when not using UTC timezone.  
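The failure mode above can be reproduced without Druid: inside a URL query string an unescaped `+` is decoded as a space, which is exactly why Druid receives `1900-01-01T00:00:00.000 05:53:20`. A minimal sketch using only the JDK (the interval value is taken from the log above; class name is illustrative, and `URLEncoder` applies form encoding, which escapes more characters than a query value strictly needs, but it shows the point):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class IntervalEncodingSketch {
    public static void main(String[] args) throws Exception {
        // Interval string from the debug log above; the '+' in the timezone
        // offset is what Druid's servlet decodes as a space.
        String interval =
            "1900-01-01T00:00:00.000+05:53:20/3000-01-01T00:00:00.000+05:30";
        // Percent-encoding turns '+' into %2B, so the server-side decode
        // yields the original string instead of one containing spaces.
        String encoded = URLEncoder.encode(interval, StandardCharsets.UTF_8.name());
        System.out.println(encoded);  // no raw '+' survives encoding
    }
}
```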



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16576) Fix encoding of intervals when fetching select query candidates from druid

2017-05-03 Thread Nishant Bangarwa (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-16576:

Attachment: HIVE-16576.patch

> Fix encoding of intervals when fetching select query candidates from druid
> --
>
> Key: HIVE-16576
> URL: https://issues.apache.org/jira/browse/HIVE-16576
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
> Attachments: HIVE-16576.patch
>
>
> Debug logs on HIVE side - 
> {code}
> 2017-05-03T23:49:00,672 DEBUG [HttpClient-Netty-Worker-0] 
> client.NettyHttpClient: [GET 
> http://localhost:8082/druid/v2/datasources/cmv_basetable_druid/candidates?intervals=1900-01-01T00:00:00.000+05:53:20/3000-01-01T00:00:00.000+05:30]
>  Got response: 500 Server Error
> {code}
> Druid exception stack trace - 
> {code}
> 2017-05-03T18:56:58,928 WARN [qtp1651318806-158] 
> org.eclipse.jetty.servlet.ServletHandler - 
> /druid/v2/datasources/cmv_basetable_druid/candidates
> java.lang.IllegalArgumentException: Invalid format: "1900-01-01T00:00:00.000 
> 05:53:20"
>   at 
> org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:899)
>  ~[joda-time-2.8.2.jar:2.8.2]
>   at 
> org.joda.time.convert.StringConverter.setInto(StringConverter.java:212) 
> ~[joda-time-2.8.2.jar:2.8.2]
>   at org.joda.time.base.BaseInterval.<init>(BaseInterval.java:200) 
> ~[joda-time-2.8.2.jar:2.8.2]
>   at org.joda.time.Interval.<init>(Interval.java:193) 
> ~[joda-time-2.8.2.jar:2.8.2]
>   at org.joda.time.Interval.parse(Interval.java:69) 
> ~[joda-time-2.8.2.jar:2.8.2]
>   at 
> io.druid.server.ClientInfoResource.getQueryTargets(ClientInfoResource.java:320)
>  ~[classes/:?]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_92]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_92]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_92]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_92]
> {code}
> Note that intervals being sent as part of the HTTP request URL are not 
> encoded properly when not using UTC timezone.  





[jira] [Updated] (HIVE-16576) Fix encoding of intervals when fetching select query candidates from druid

2017-05-03 Thread Nishant Bangarwa (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-16576:

Description: 
Debug logs on HIVE side - 
{code}
2017-05-03T23:49:00,672 DEBUG [HttpClient-Netty-Worker-0] 
client.NettyHttpClient: [GET 
http://localhost:8082/druid/v2/datasources/cmv_basetable_druid/candidates?intervals=1900-01-01T00:00:00.000+05:53:20/3000-01-01T00:00:00.000+05:30]
 Got response: 500 Server Error
{code}

Druid exception stack trace - 
{code}
2017-05-03T18:56:58,928 WARN [qtp1651318806-158] 
org.eclipse.jetty.servlet.ServletHandler - 
/druid/v2/datasources/cmv_basetable_druid/candidates
java.lang.IllegalArgumentException: Invalid format: "1900-01-01T00:00:00.000 
05:53:20"
at 
org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:899)
 ~[joda-time-2.8.2.jar:2.8.2]
at 
org.joda.time.convert.StringConverter.setInto(StringConverter.java:212) 
~[joda-time-2.8.2.jar:2.8.2]
at org.joda.time.base.BaseInterval.<init>(BaseInterval.java:200) 
~[joda-time-2.8.2.jar:2.8.2]
at org.joda.time.Interval.<init>(Interval.java:193) 
~[joda-time-2.8.2.jar:2.8.2]
at org.joda.time.Interval.parse(Interval.java:69) 
~[joda-time-2.8.2.jar:2.8.2]
at 
io.druid.server.ClientInfoResource.getQueryTargets(ClientInfoResource.java:320) 
~[classes/:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_92]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_92]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_92]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_92]
{code}

Note that intervals being sent as part of the HTTP request URL are not encoded 
properly when not using UTC timezone.  

  was:
Debug logs on HIVE side - 
{code}
2017-05-03T23:49:00,672 DEBUG [HttpClient-Netty-Worker-0] 
client.NettyHttpClient: [GET 
http://localhost:8082/druid/v2/datasources/cmv_basetable_druid/candidates?intervals=1900-01-01T00:00:00.000+05:53:20/3000-01-01T00:00:00.000+05:30]
 Got response: 500 Server Error
{code}

Druid exception stack trace - 
{code}
2017-05-03T18:56:58,928 WARN [qtp1651318806-158] 
org.eclipse.jetty.servlet.ServletHandler - 
/druid/v2/datasources/cmv_basetable_druid/candidates
java.lang.IllegalArgumentException: Invalid format: "1900-01-01T00:00:00.000 
05:53:20"
at 
org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:899)
 ~[joda-time-2.8.2.jar:2.8.2]
at 
org.joda.time.convert.StringConverter.setInto(StringConverter.java:212) 
~[joda-time-2.8.2.jar:2.8.2]
at org.joda.time.base.BaseInterval.<init>(BaseInterval.java:200) 
~[joda-time-2.8.2.jar:2.8.2]
at org.joda.time.Interval.<init>(Interval.java:193) 
~[joda-time-2.8.2.jar:2.8.2]
at org.joda.time.Interval.parse(Interval.java:69) 
~[joda-time-2.8.2.jar:2.8.2]
at 
io.druid.server.ClientInfoResource.getQueryTargets(ClientInfoResource.java:320) 
~[classes/:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_92]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_92]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_92]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_92]
{code}

Note that intervals being sent as part of the HTTP request URL are not encoded 
properly. 


> Fix encoding of intervals when fetching select query candidates from druid
> --
>
> Key: HIVE-16576
> URL: https://issues.apache.org/jira/browse/HIVE-16576
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>
> Debug logs on HIVE side - 
> {code}
> 2017-05-03T23:49:00,672 DEBUG [HttpClient-Netty-Worker-0] 
> client.NettyHttpClient: [GET 
> http://localhost:8082/druid/v2/datasources/cmv_basetable_druid/candidates?intervals=1900-01-01T00:00:00.000+05:53:20/3000-01-01T00:00:00.000+05:30]
>  Got response: 500 Server Error
> {code}
> Druid exception stack trace - 
> {code}
> 2017-05-03T18:56:58,928 WARN [qtp1651318806-158] 
> org.eclipse.jetty.servlet.ServletHandler - 
> /druid/v2/datasources/cmv_basetable_druid/candidates
> java.lang.IllegalArgumentException: Invalid format: "1900-01-01T00:00:00.000 
> 05:53:20"
>   at 
> org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:899)
>  ~[joda-time-2.8.2.jar:2.8.2]
>   at 
> org.joda.time.convert.StringConverter.setInto(StringConverter.java:212) 
> ~[joda-time-2.8.2.jar:2.8.2]
>   at org.joda.time.base.BaseInterval.<init>(BaseInterval.java:200) 
> 

[jira] [Commented] (HIVE-16575) Support for 'UNIQUE' and 'NOT NULL' constraints

2017-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995454#comment-15995454
 ] 

Hive QA commented on HIVE-16575:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866229/HIVE-16575.01.patch

{color:green}SUCCESS:{color} +1 due to 17 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10556 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] 
(batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[create_with_multi_pk_constraint]
 (batchId=87)
org.apache.hadoop.hive.ql.parse.TestHiveDecimalParse.testDecimalType7 
(batchId=259)
org.apache.hive.jdbc.TestJdbcDriver2.org.apache.hive.jdbc.TestJdbcDriver2 
(batchId=221)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5025/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5025/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5025/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866229 - PreCommit-HIVE-Build

> Support for 'UNIQUE' and 'NOT NULL' constraints
> ---
>
> Key: HIVE-16575
> URL: https://issues.apache.org/jira/browse/HIVE-16575
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Logical Optimizer, Parser
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16575.01.patch, HIVE-16575.patch
>
>
> Follow-up on HIVE-13076.
> This issue adds support for SQL 'UNIQUE' and 'NOT NULL' constraints when we 
> create a table / alter a table 
> (https://www.postgresql.org/docs/9.6/static/sql-createtable.html).
> As with PK and FK constraints, currently we do not enforce them; thus, the 
> constraints need to use the DISABLE option, but they will be stored and can 
> be enabled for rewriting/optimization using RELY.
> This patch also adds support for inlining the constraints next to the column 
> type definition, i.e., 'column constraints'.
> Some examples of the extension to the syntax included in the patch:
> {code:sql}
> CREATE TABLE table3 (x string NOT NULL DISABLE, PRIMARY KEY (x) DISABLE, 
> CONSTRAINT fk1 FOREIGN KEY (x) REFERENCES table2(a) DISABLE); 
> CREATE TABLE table4 (x string CONSTRAINT nn4_1 NOT NULL DISABLE, y string 
> CONSTRAINT nn4_2 NOT NULL DISABLE, UNIQUE (x) DISABLE, CONSTRAINT fk2 FOREIGN 
> KEY (x) REFERENCES table2(a) DISABLE, 
> CONSTRAINT fk3 FOREIGN KEY (y) REFERENCES table2(a) DISABLE);
> CREATE TABLE table12 (a STRING CONSTRAINT nn12_1 NOT NULL DISABLE NORELY, b 
> STRING);
> CREATE TABLE table13 (a STRING NOT NULL DISABLE RELY, b STRING);
> CREATE TABLE table14 (a STRING CONSTRAINT nn14_1 NOT NULL DISABLE RELY, b 
> STRING);
> CREATE TABLE table15 (a STRING REFERENCES table4(x) DISABLE, b STRING);
> CREATE TABLE table16 (a STRING CONSTRAINT nn16_1 REFERENCES table4(x) DISABLE 
> RELY, b STRING);
> ALTER TABLE table16 CHANGE a a STRING REFERENCES table4(x) DISABLE NOVALIDATE;
> ALTER TABLE table12 CHANGE COLUMN b b STRING CONSTRAINT nn12_2 NOT NULL 
> DISABLE NOVALIDATE;
> ALTER TABLE table13 CHANGE b b STRING NOT NULL DISABLE NOVALIDATE;
> {code}





[jira] [Assigned] (HIVE-16576) Fix encoding of intervals when fetching select query candidates from druid

2017-05-03 Thread Nishant Bangarwa (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa reassigned HIVE-16576:
---


> Fix encoding of intervals when fetching select query candidates from druid
> --
>
> Key: HIVE-16576
> URL: https://issues.apache.org/jira/browse/HIVE-16576
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>
> Debug logs on HIVE side - 
> {code}
> 2017-05-03T23:49:00,672 DEBUG [HttpClient-Netty-Worker-0] 
> client.NettyHttpClient: [GET 
> http://localhost:8082/druid/v2/datasources/cmv_basetable_druid/candidates?intervals=1900-01-01T00:00:00.000+05:53:20/3000-01-01T00:00:00.000+05:30]
>  Got response: 500 Server Error
> {code}
> Druid exception stack trace - 
> {code}
> 2017-05-03T18:56:58,928 WARN [qtp1651318806-158] 
> org.eclipse.jetty.servlet.ServletHandler - 
> /druid/v2/datasources/cmv_basetable_druid/candidates
> java.lang.IllegalArgumentException: Invalid format: "1900-01-01T00:00:00.000 
> 05:53:20"
>   at 
> org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:899)
>  ~[joda-time-2.8.2.jar:2.8.2]
>   at 
> org.joda.time.convert.StringConverter.setInto(StringConverter.java:212) 
> ~[joda-time-2.8.2.jar:2.8.2]
>   at org.joda.time.base.BaseInterval.<init>(BaseInterval.java:200) 
> ~[joda-time-2.8.2.jar:2.8.2]
>   at org.joda.time.Interval.<init>(Interval.java:193) 
> ~[joda-time-2.8.2.jar:2.8.2]
>   at org.joda.time.Interval.parse(Interval.java:69) 
> ~[joda-time-2.8.2.jar:2.8.2]
>   at 
> io.druid.server.ClientInfoResource.getQueryTargets(ClientInfoResource.java:320)
>  ~[classes/:?]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_92]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_92]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_92]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_92]
> {code}
> Note that intervals being sent as part of the HTTP request URL are not 
> encoded properly. 





[jira] [Updated] (HIVE-16550) Semijoin Hints should be able to skip the optimization if needed.

2017-05-03 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-16550:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master

> Semijoin Hints should be able to skip the optimization if needed.
> -
>
> Key: HIVE-16550
> URL: https://issues.apache.org/jira/browse/HIVE-16550
> Project: Hive
>  Issue Type: Improvement
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Fix For: 3.0.0
>
> Attachments: HIVE-16550.1.patch, HIVE-16550.2.patch, 
> HIVE-16550.3.patch
>
>
> Currently semi join hints are designed to enforce a particular semi join; 
> however, they should also be able to skip the optimization altogether in a 
> query using hints.





[jira] [Updated] (HIVE-16552) Limit the number of tasks a Spark job may contain

2017-05-03 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-16552:
---
Attachment: HIVE-16552.3.patch

Patch #3 has the following updates:
1. Added a negative test
2. Kept the log messages in SparkTask as they contain additional info
3. Undid some changes in the way of calculating total tasks only once.

[~lirui] could you please take another look? Thanks.

> Limit the number of tasks a Spark job may contain
> -
>
> Key: HIVE-16552
> URL: https://issues.apache.org/jira/browse/HIVE-16552
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 1.0.0, 2.0.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-16552.1.patch, HIVE-16552.2.patch, 
> HIVE-16552.3.patch, HIVE-16552.patch
>
>
> It's commonly desirable to block bad and big queries that take a lot of YARN 
> resources. One approach, similar to mapreduce.job.max.map in MapReduce, is to 
> stop a query that invokes a Spark job that contains too many tasks. The 
> proposal here is to introduce hive.spark.job.max.tasks with a default value 
> of -1 (no limit), which an admin can set to block queries that trigger too 
> many spark tasks.
> Please note that this control knob applies to a spark job, though it's 
> possible that one query can trigger multiple Spark jobs (such as in the case 
> of map-join). Nevertheless, the proposed approach is still helpful.
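The proposed knob boils down to a simple guard before job submission. A hypothetical sketch (class and method names below are illustrative, not the actual Hive code):

```java
public class SparkTaskLimitSketch {
    // Enforce the proposed hive.spark.job.max.tasks semantics:
    // -1 (the proposed default) means "no limit"; any non-negative limit
    // rejects a Spark job whose total task count exceeds it.
    static void enforceLimit(int totalTasks, int maxTasks) {
        if (maxTasks >= 0 && totalTasks > maxTasks) {
            throw new IllegalStateException(
                "Spark job requires " + totalTasks
                + " tasks, exceeding hive.spark.job.max.tasks=" + maxTasks);
        }
    }

    public static void main(String[] args) {
        enforceLimit(500, -1);     // no limit configured: passes
        enforceLimit(500, 10000);  // under the limit: passes
        System.out.println("ok");
    }
}
```

An admin would then set the property cluster-wide to block runaway queries, while the default leaves existing behavior unchanged.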





[jira] [Updated] (HIVE-16575) Support for 'UNIQUE' and 'NOT NULL' constraints

2017-05-03 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-16575:
---
Attachment: HIVE-16575.01.patch

Fixed ptest failures.

> Support for 'UNIQUE' and 'NOT NULL' constraints
> ---
>
> Key: HIVE-16575
> URL: https://issues.apache.org/jira/browse/HIVE-16575
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Logical Optimizer, Parser
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16575.01.patch, HIVE-16575.patch
>
>
> Follow-up on HIVE-13076.
> This issue adds support for SQL 'UNIQUE' and 'NOT NULL' constraints when we 
> create a table / alter a table 
> (https://www.postgresql.org/docs/9.6/static/sql-createtable.html).
> As with PK and FK constraints, currently we do not enforce them; thus, the 
> constraints need to use the DISABLE option, but they will be stored and can 
> be enabled for rewriting/optimization using RELY.
> This patch also adds support for inlining the constraints next to the column 
> type definition, i.e., 'column constraints'.
> Some examples of the extension to the syntax included in the patch:
> {code:sql}
> CREATE TABLE table3 (x string NOT NULL DISABLE, PRIMARY KEY (x) DISABLE, 
> CONSTRAINT fk1 FOREIGN KEY (x) REFERENCES table2(a) DISABLE); 
> CREATE TABLE table4 (x string CONSTRAINT nn4_1 NOT NULL DISABLE, y string 
> CONSTRAINT nn4_2 NOT NULL DISABLE, UNIQUE (x) DISABLE, CONSTRAINT fk2 FOREIGN 
> KEY (x) REFERENCES table2(a) DISABLE, 
> CONSTRAINT fk3 FOREIGN KEY (y) REFERENCES table2(a) DISABLE);
> CREATE TABLE table12 (a STRING CONSTRAINT nn12_1 NOT NULL DISABLE NORELY, b 
> STRING);
> CREATE TABLE table13 (a STRING NOT NULL DISABLE RELY, b STRING);
> CREATE TABLE table14 (a STRING CONSTRAINT nn14_1 NOT NULL DISABLE RELY, b 
> STRING);
> CREATE TABLE table15 (a STRING REFERENCES table4(x) DISABLE, b STRING);
> CREATE TABLE table16 (a STRING CONSTRAINT nn16_1 REFERENCES table4(x) DISABLE 
> RELY, b STRING);
> ALTER TABLE table16 CHANGE a a STRING REFERENCES table4(x) DISABLE NOVALIDATE;
> ALTER TABLE table12 CHANGE COLUMN b b STRING CONSTRAINT nn12_2 NOT NULL 
> DISABLE NOVALIDATE;
> ALTER TABLE table13 CHANGE b b STRING NOT NULL DISABLE NOVALIDATE;
> {code}





[jira] [Commented] (HIVE-12157) Support unicode for table/column names

2017-05-03 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995282#comment-15995282
 ] 

Pengcheng Xiong commented on HIVE-12157:


[~a092cc], thanks for your patch. However, as we previously discussed, you may 
also need to change the script in the metastore as you can see from the 
previous comments from [~sershe] and [~richarddu].

>  Support unicode for table/column names
> ---
>
> Key: HIVE-12157
> URL: https://issues.apache.org/jira/browse/HIVE-12157
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Affects Versions: 1.2.1
>Reporter: richard du
>Assignee: hefuhua
>Priority: Minor
> Attachments: HIVE-12157.01.patch, HIVE-12157.02.patch, 
> HIVE-12157.patch
>
>
> Parser will throw exception when I use alias:
> hive> desc test;
> OK
> a   int 
> b   string  
> Time taken: 0.135 seconds, Fetched: 2 row(s)
> hive> select a as 行1 from test limit 10;
> NoViableAltException(302@[134:7: ( ( ( KW_AS )? identifier ) | ( KW_AS LPAREN 
> identifier ( COMMA identifier )* RPAREN ) )?])
> at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
> at org.antlr.runtime.DFA.predict(DFA.java:116)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectItem(HiveParser_SelectClauseParser.java:2915)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectList(HiveParser_SelectClauseParser.java:1373)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:1128)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:45827)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:41495)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:41402)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:40413)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:40283)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1590)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1109)
> at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:202)
> at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:396)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> FAILED: ParseException line 1:13 cannot recognize input near 'as' '1' 'from' 
> in selection target





[jira] [Commented] (HIVE-16501) Add rej/orig to .gitignore ; remove *.orig files

2017-05-03 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995238#comment-15995238
 ] 

Eugene Koifman commented on HIVE-16501:
---

https://git-scm.com/docs/gitignore seems to indicate that you can add 
additional ignores in your own environment - but I don't see how the same can be 
used to "remove" items from a checked-in .gitignore in the src tree.

> Add rej/orig to .gitignore ; remove *.orig files
> 
>
> Key: HIVE-16501
> URL: https://issues.apache.org/jira/browse/HIVE-16501
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Fix For: 3.0.0
>
> Attachments: HIVE-16501.1.patch
>
>
> sometimes git reject/orig files made their way into the repo...
> it would be better to just ignore them :)





[jira] [Commented] (HIVE-16575) Support for 'UNIQUE' and 'NOT NULL' constraints

2017-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995203#comment-15995203
 ] 

Hive QA commented on HIVE-16575:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866198/HIVE-16575.patch

{color:green}SUCCESS:{color} +1 due to 7 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 10557 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] 
(batchId=225)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_with_constraints] 
(batchId=64)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[alter_table_constraint_duplicate_pk]
 (batchId=88)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[alter_table_constraint_invalid_fk_col1]
 (batchId=87)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[alter_table_constraint_invalid_fk_col2]
 (batchId=88)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[alter_table_constraint_invalid_fk_tbl1]
 (batchId=88)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[alter_table_constraint_invalid_fk_tbl2]
 (batchId=88)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[alter_table_constraint_invalid_pk_tbl]
 (batchId=87)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[create_with_constraints_enable]
 (batchId=88)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[create_with_fk_constraint_2]
 (batchId=87)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_invalid_constraint1]
 (batchId=87)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_invalid_constraint2]
 (batchId=87)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_invalid_constraint3]
 (batchId=88)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_invalid_constraint4]
 (batchId=88)
org.apache.hadoop.hive.ql.TestErrorMsg.testUniqueErrorCode (batchId=252)
org.apache.hadoop.hive.ql.parse.TestHiveDecimalParse.testDecimalType7 
(batchId=259)
org.apache.hive.hcatalog.pig.TestRCFileHCatStorer.testMultiPartColsInData 
(batchId=178)
org.apache.hive.hcatalog.pig.TestRCFileHCatStorer.testWriteDecimalX 
(batchId=178)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteChar (batchId=178)
org.apache.hive.jdbc.TestJdbcDriver2.org.apache.hive.jdbc.TestJdbcDriver2 
(batchId=221)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5024/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5024/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5024/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 21 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866198 - PreCommit-HIVE-Build

> Support for 'UNIQUE' and 'NOT NULL' constraints
> ---
>
> Key: HIVE-16575
> URL: https://issues.apache.org/jira/browse/HIVE-16575
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Logical Optimizer, Parser
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16575.patch
>
>
> Follow-up on HIVE-13076.
> This issue adds support for SQL 'UNIQUE' and 'NOT NULL' constraints when we 
> create a table / alter a table 
> (https://www.postgresql.org/docs/9.6/static/sql-createtable.html).
> As with PK and FK constraints, currently we do not enforce them; thus, the 
> constraints need to use the DISABLE option, but they will be stored and can 
> be enabled for rewriting/optimization using RELY.
> This patch also adds support for inlining the constraints next to the column 
> type definition, i.e., 'column constraints'.
> Some examples of the extension to the syntax included in the patch:
> {code:sql}
> CREATE TABLE table3 (x string NOT NULL DISABLE, PRIMARY KEY (x) DISABLE, 
> CONSTRAINT fk1 FOREIGN KEY (x) REFERENCES table2(a) DISABLE); 
> CREATE TABLE table4 (x string CONSTRAINT nn4_1 NOT NULL DISABLE, y string 
> CONSTRAINT nn4_2 NOT NULL DISABLE, UNIQUE (x) DISABLE, CONSTRAINT fk2 FOREIGN 
> KEY (x) REFERENCES table2(a) DISABLE, 
> CONSTRAINT fk3 FOREIGN KEY (y) REFERENCES table2(a) DISABLE);
> CREATE TABLE table12 (a STRING CONSTRAINT nn12_1 NOT NULL DISABLE NORELY, b 
> STRING);
> CREATE TABLE table13 (a STRING NOT NULL DISABLE RELY, b STRING);
> CREATE TABLE table14 (a STRING CONSTRAINT 

[jira] [Commented] (HIVE-16474) Upgrade Druid version to 0.10

2017-05-03 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995191#comment-15995191
 ] 

slim bouguerra commented on HIVE-16474:
---

Can you please create a GitHub pull request?

> Upgrade Druid version to 0.10
> -
>
> Key: HIVE-16474
> URL: https://issues.apache.org/jira/browse/HIVE-16474
> Project: Hive
>  Issue Type: Task
>  Components: Druid integration
>Reporter: Ashutosh Chauhan
>Assignee: Nishant Bangarwa
> Attachments: HIVE-16474.01.patch, HIVE-16474.patch
>
>
> Druid 0.10 is out. We shall upgrade to it to take advantage of improvements 
> it brings.





[jira] [Commented] (HIVE-16474) Upgrade Druid version to 0.10

2017-05-03 Thread Nishant Bangarwa (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995145#comment-15995145
 ] 

Nishant Bangarwa commented on HIVE-16474:
-

[~jcamachorodriguez][~bslim] please review. 

> Upgrade Druid version to 0.10
> -
>
> Key: HIVE-16474
> URL: https://issues.apache.org/jira/browse/HIVE-16474
> Project: Hive
>  Issue Type: Task
>  Components: Druid integration
>Reporter: Ashutosh Chauhan
>Assignee: Nishant Bangarwa
> Attachments: HIVE-16474.01.patch, HIVE-16474.patch
>
>
> Druid 0.10 is out. We shall upgrade to it to take advantage of improvements 
> it brings.





[jira] [Commented] (HIVE-16469) Parquet timestamp table property is not always taken into account

2017-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995128#comment-15995128
 ] 

Hive QA commented on HIVE-16469:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866169/HIVE-16469.04.patch

{color:green}SUCCESS:{color} +1 due to 7 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10642 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] 
(batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5023/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5023/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5023/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866169 - PreCommit-HIVE-Build

> Parquet timestamp table property is not always taken into account
> -
>
> Key: HIVE-16469
> URL: https://issues.apache.org/jira/browse/HIVE-16469
> Project: Hive
>  Issue Type: Bug
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-16469.01.patch, HIVE-16469.02.patch, 
> HIVE-16469.03.patch, HIVE-16469.04.patch
>
>
> The parquet timestamp timezone property is currently copied over into the 
> JobConf in the FetchOperator, but this may be too late for some execution 
> paths.
> We should:
> 1 - copy the property over earlier
> 2 - set the default value on the JobConf if no property is set, and fail in 
> the ParquetRecordReader if the property is missing from the JobConf
> We should add extra validations for the cases when:
> - the property was not set by accident on the JobConf (unexpected execution 
> path)
> - an incorrect/invalid timezone id is being set on the table
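The two-step fix described above (set a default early, fail fast in the reader if the property is missing) can be sketched as follows. This is an illustrative stand-in only: a plain `Map` plays the role of Hadoop's `JobConf`, and the property key and method names are hypothetical, not the actual Hive/Parquet identifiers.

```java
import java.util.HashMap;
import java.util.Map;

public class TimezonePropertySketch {
    // Hypothetical key standing in for the Parquet timestamp-timezone property.
    static final String TZ_PROP = "example.parquet.timestamp.timezone";
    static final String DEFAULT_TZ = "UTC";

    // Step 1 (job setup, done early): default the property if it was never set.
    static void applyDefault(Map<String, String> conf) {
        conf.putIfAbsent(TZ_PROP, DEFAULT_TZ);
    }

    // Step 2 (record reader): fail fast if the property is absent, which
    // signals an execution path that skipped the setup step entirely.
    static String requireTimezone(Map<String, String> conf) {
        String tz = conf.get(TZ_PROP);
        if (tz == null) {
            throw new IllegalStateException("timezone property missing from conf");
        }
        return tz;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        applyDefault(conf);
        System.out.println(requireTimezone(conf)); // prints "UTC"
    }
}
```

The point of the split is that the reader never silently picks its own fallback: a missing key at read time is treated as a bug in the setup path, not as "use the default".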





[jira] [Commented] (HIVE-16469) Parquet timestamp table property is not always taken into account

2017-05-03 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-16469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995121#comment-15995121
 ] 

Sergio Peña commented on HIVE-16469:


[~Ferd] Do you still remember hive/parquet? Would you help [~zsombor.klara] 
by providing a second pair of eyes to review his code?

> Parquet timestamp table property is not always taken into account
> -
>
> Key: HIVE-16469
> URL: https://issues.apache.org/jira/browse/HIVE-16469
> Project: Hive
>  Issue Type: Bug
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-16469.01.patch, HIVE-16469.02.patch, 
> HIVE-16469.03.patch, HIVE-16469.04.patch
>
>
> The parquet timestamp timezone property is currently copied over into the 
> JobConf in the FetchOperator, but this may be too late for some execution 
> paths.
> We should:
> 1 - copy the property over earlier
> 2 - set the default value on the JobConf if no property is set, and fail in 
> the ParquetRecordReader if the property is missing from the JobConf
> We should add extra validations for the cases when:
> - the property was not set by accident on the JobConf (unexpected execution 
> path)
> - an incorrect/invalid timezone id is being set on the table





[jira] [Commented] (HIVE-16047) Shouldn't try to get KeyProvider unless encryption is enabled

2017-05-03 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-16047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995112#comment-15995112
 ] 

Sergio Peña commented on HIVE-16047:


Thank you so much [~lirui]. I noticed the patch wasn't reverted on branch-2, so 
I reverted it there as well. For info about branch-2.2: the community decided to 
base branch-2.2 on branch-2.1 (not master), which is why it was not included 
in that branch.

[~andrew.wang] When do you think we would have this new API on HDFS?

> Shouldn't try to get KeyProvider unless encryption is enabled
> -
>
> Key: HIVE-16047
> URL: https://issues.apache.org/jira/browse/HIVE-16047
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-16047.1.patch, HIVE-16047.2.patch
>
>
> Found lots of following errors in HS2 log:
> {noformat}
> hdfs.KeyProviderCache: Could not find uri with key 
> [dfs.encryption.key.provider.uri] to create a keyProvider !!
> {noformat}
> Similar to HDFS-7931
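The fix the issue title describes, checking a cheap configuration flag before attempting the expensive KeyProvider lookup, can be sketched as below. This is an illustrative stand-in only: the flag name, the counter, and the lookup method are hypothetical, not the real HDFS/Hive APIs.

```java
import java.util.HashMap;
import java.util.Map;

public class KeyProviderGuardSketch {
    // Hypothetical flag standing in for an "is encryption enabled" setting.
    static final String ENCRYPTION_ENABLED = "example.encryption.enabled";

    static int lookupCount = 0; // counts how often the expensive path runs

    // Stand-in for the costly KeyProvider resolution that logs an error
    // whenever no provider URI is configured.
    static Object expensiveKeyProviderLookup() {
        lookupCount++;
        return null; // pretend no provider was found
    }

    // Only attempt the lookup when encryption is actually enabled; otherwise
    // skip it, avoiding both the cost and the noisy "could not find uri" log.
    static Object maybeGetKeyProvider(Map<String, String> conf) {
        if (!Boolean.parseBoolean(conf.getOrDefault(ENCRYPTION_ENABLED, "false"))) {
            return null;
        }
        return expensiveKeyProviderLookup();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        maybeGetKeyProvider(conf);             // encryption off: lookup skipped
        conf.put(ENCRYPTION_ENABLED, "true");
        maybeGetKeyProvider(conf);             // encryption on: one lookup
        System.out.println(lookupCount);       // prints 1
    }
}
```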





[jira] [Updated] (HIVE-16575) Support for 'UNIQUE' and 'NOT NULL' constraints

2017-05-03 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-16575:
---
Attachment: HIVE-16575.patch

> Support for 'UNIQUE' and 'NOT NULL' constraints
> ---
>
> Key: HIVE-16575
> URL: https://issues.apache.org/jira/browse/HIVE-16575
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Logical Optimizer, Parser
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16575.patch
>
>
> Follow-up on HIVE-13076.
> This issue adds support for SQL 'UNIQUE' and 'NOT NULL' constraints when 
> creating or altering a table 
> (https://www.postgresql.org/docs/9.6/static/sql-createtable.html).
> As with PK and FK constraints, currently we do not enforce them; thus, the 
> constraints need to use the DISABLE option, but they will be stored and can 
> be enabled for rewriting/optimization using RELY.
> This patch also adds support for inlining the constraints next to the column 
> type definition, i.e., 'column constraints'.
> Some examples of the extension to the syntax included in the patch:
> {code:sql}
> CREATE TABLE table3 (x string NOT NULL DISABLE, PRIMARY KEY (x) DISABLE, 
> CONSTRAINT fk1 FOREIGN KEY (x) REFERENCES table2(a) DISABLE); 
> CREATE TABLE table4 (x string CONSTRAINT nn4_1 NOT NULL DISABLE, y string 
> CONSTRAINT nn4_2 NOT NULL DISABLE, UNIQUE (x) DISABLE, CONSTRAINT fk2 FOREIGN 
> KEY (x) REFERENCES table2(a) DISABLE, 
> CONSTRAINT fk3 FOREIGN KEY (y) REFERENCES table2(a) DISABLE);
> CREATE TABLE table12 (a STRING CONSTRAINT nn12_1 NOT NULL DISABLE NORELY, b 
> STRING);
> CREATE TABLE table13 (a STRING NOT NULL DISABLE RELY, b STRING);
> CREATE TABLE table14 (a STRING CONSTRAINT nn14_1 NOT NULL DISABLE RELY, b 
> STRING);
> CREATE TABLE table15 (a STRING REFERENCES table4(x) DISABLE, b STRING);
> CREATE TABLE table16 (a STRING CONSTRAINT nn16_1 REFERENCES table4(x) DISABLE 
> RELY, b STRING);
> ALTER TABLE table16 CHANGE a a STRING REFERENCES table4(x) DISABLE NOVALIDATE;
> ALTER TABLE table12 CHANGE COLUMN b b STRING CONSTRAINT nn12_2 NOT NULL 
> DISABLE NOVALIDATE;
> ALTER TABLE table13 CHANGE b b STRING NOT NULL DISABLE NOVALIDATE;
> {code}





[jira] [Work started] (HIVE-16575) Support for 'UNIQUE' and 'NOT NULL' constraints

2017-05-03 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-16575 started by Jesus Camacho Rodriguez.
--
> Support for 'UNIQUE' and 'NOT NULL' constraints
> ---
>
> Key: HIVE-16575
> URL: https://issues.apache.org/jira/browse/HIVE-16575
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Logical Optimizer, Parser
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Follow-up on HIVE-13076.
> This issue adds support for SQL 'UNIQUE' and 'NOT NULL' constraints when 
> creating or altering a table 
> (https://www.postgresql.org/docs/9.6/static/sql-createtable.html).
> As with PK and FK constraints, currently we do not enforce them; thus, the 
> constraints need to use the DISABLE option, but they will be stored and can 
> be enabled for rewriting/optimization using RELY.
> This patch also adds support for inlining the constraints next to the column 
> type definition, i.e., 'column constraints'.
> Some examples of the extension to the syntax included in the patch:
> {code:sql}
> CREATE TABLE table3 (x string NOT NULL DISABLE, PRIMARY KEY (x) DISABLE, 
> CONSTRAINT fk1 FOREIGN KEY (x) REFERENCES table2(a) DISABLE); 
> CREATE TABLE table4 (x string CONSTRAINT nn4_1 NOT NULL DISABLE, y string 
> CONSTRAINT nn4_2 NOT NULL DISABLE, UNIQUE (x) DISABLE, CONSTRAINT fk2 FOREIGN 
> KEY (x) REFERENCES table2(a) DISABLE, 
> CONSTRAINT fk3 FOREIGN KEY (y) REFERENCES table2(a) DISABLE);
> CREATE TABLE table12 (a STRING CONSTRAINT nn12_1 NOT NULL DISABLE NORELY, b 
> STRING);
> CREATE TABLE table13 (a STRING NOT NULL DISABLE RELY, b STRING);
> CREATE TABLE table14 (a STRING CONSTRAINT nn14_1 NOT NULL DISABLE RELY, b 
> STRING);
> CREATE TABLE table15 (a STRING REFERENCES table4(x) DISABLE, b STRING);
> CREATE TABLE table16 (a STRING CONSTRAINT nn16_1 REFERENCES table4(x) DISABLE 
> RELY, b STRING);
> ALTER TABLE table16 CHANGE a a STRING REFERENCES table4(x) DISABLE NOVALIDATE;
> ALTER TABLE table12 CHANGE COLUMN b b STRING CONSTRAINT nn12_2 NOT NULL 
> DISABLE NOVALIDATE;
> ALTER TABLE table13 CHANGE b b STRING NOT NULL DISABLE NOVALIDATE;
> {code}





[jira] [Updated] (HIVE-16575) Support for 'UNIQUE' and 'NOT NULL' constraints

2017-05-03 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-16575:
---
Status: Patch Available  (was: In Progress)

> Support for 'UNIQUE' and 'NOT NULL' constraints
> ---
>
> Key: HIVE-16575
> URL: https://issues.apache.org/jira/browse/HIVE-16575
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Logical Optimizer, Parser
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Follow-up on HIVE-13076.
> This issue adds support for SQL 'UNIQUE' and 'NOT NULL' constraints when 
> creating or altering a table 
> (https://www.postgresql.org/docs/9.6/static/sql-createtable.html).
> As with PK and FK constraints, currently we do not enforce them; thus, the 
> constraints need to use the DISABLE option, but they will be stored and can 
> be enabled for rewriting/optimization using RELY.
> This patch also adds support for inlining the constraints next to the column 
> type definition, i.e., 'column constraints'.
> Some examples of the extension to the syntax included in the patch:
> {code:sql}
> CREATE TABLE table3 (x string NOT NULL DISABLE, PRIMARY KEY (x) DISABLE, 
> CONSTRAINT fk1 FOREIGN KEY (x) REFERENCES table2(a) DISABLE); 
> CREATE TABLE table4 (x string CONSTRAINT nn4_1 NOT NULL DISABLE, y string 
> CONSTRAINT nn4_2 NOT NULL DISABLE, UNIQUE (x) DISABLE, CONSTRAINT fk2 FOREIGN 
> KEY (x) REFERENCES table2(a) DISABLE, 
> CONSTRAINT fk3 FOREIGN KEY (y) REFERENCES table2(a) DISABLE);
> CREATE TABLE table12 (a STRING CONSTRAINT nn12_1 NOT NULL DISABLE NORELY, b 
> STRING);
> CREATE TABLE table13 (a STRING NOT NULL DISABLE RELY, b STRING);
> CREATE TABLE table14 (a STRING CONSTRAINT nn14_1 NOT NULL DISABLE RELY, b 
> STRING);
> CREATE TABLE table15 (a STRING REFERENCES table4(x) DISABLE, b STRING);
> CREATE TABLE table16 (a STRING CONSTRAINT nn16_1 REFERENCES table4(x) DISABLE 
> RELY, b STRING);
> ALTER TABLE table16 CHANGE a a STRING REFERENCES table4(x) DISABLE NOVALIDATE;
> ALTER TABLE table12 CHANGE COLUMN b b STRING CONSTRAINT nn12_2 NOT NULL 
> DISABLE NOVALIDATE;
> ALTER TABLE table13 CHANGE b b STRING NOT NULL DISABLE NOVALIDATE;
> {code}





[jira] [Commented] (HIVE-16575) Support for 'UNIQUE' and 'NOT NULL' constraints

2017-05-03 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995071#comment-15995071
 ] 

Jesus Camacho Rodriguez commented on HIVE-16575:


[~ashutoshc], I have created a PR in https://github.com/apache/hive/pull/175 . 
Could you take a look?

One thing I observed is that we do not currently require _foreign keys_ to 
reference _primary keys_ or _unique keys_ (unlike PostgreSQL and others); 
I will tackle that extension in a follow-up.

> Support for 'UNIQUE' and 'NOT NULL' constraints
> ---
>
> Key: HIVE-16575
> URL: https://issues.apache.org/jira/browse/HIVE-16575
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Logical Optimizer, Parser
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Follow-up on HIVE-13076.
> This issue adds support for SQL 'UNIQUE' and 'NOT NULL' constraints when 
> creating or altering a table 
> (https://www.postgresql.org/docs/9.6/static/sql-createtable.html).
> As with PK and FK constraints, currently we do not enforce them; thus, the 
> constraints need to use the DISABLE option, but they will be stored and can 
> be enabled for rewriting/optimization using RELY.
> This patch also adds support for inlining the constraints next to the column 
> type definition, i.e., 'column constraints'.
> Some examples of the extension to the syntax included in the patch:
> {code:sql}
> CREATE TABLE table3 (x string NOT NULL DISABLE, PRIMARY KEY (x) DISABLE, 
> CONSTRAINT fk1 FOREIGN KEY (x) REFERENCES table2(a) DISABLE); 
> CREATE TABLE table4 (x string CONSTRAINT nn4_1 NOT NULL DISABLE, y string 
> CONSTRAINT nn4_2 NOT NULL DISABLE, UNIQUE (x) DISABLE, CONSTRAINT fk2 FOREIGN 
> KEY (x) REFERENCES table2(a) DISABLE, 
> CONSTRAINT fk3 FOREIGN KEY (y) REFERENCES table2(a) DISABLE);
> CREATE TABLE table12 (a STRING CONSTRAINT nn12_1 NOT NULL DISABLE NORELY, b 
> STRING);
> CREATE TABLE table13 (a STRING NOT NULL DISABLE RELY, b STRING);
> CREATE TABLE table14 (a STRING CONSTRAINT nn14_1 NOT NULL DISABLE RELY, b 
> STRING);
> CREATE TABLE table15 (a STRING REFERENCES table4(x) DISABLE, b STRING);
> CREATE TABLE table16 (a STRING CONSTRAINT nn16_1 REFERENCES table4(x) DISABLE 
> RELY, b STRING);
> ALTER TABLE table16 CHANGE a a STRING REFERENCES table4(x) DISABLE NOVALIDATE;
> ALTER TABLE table12 CHANGE COLUMN b b STRING CONSTRAINT nn12_2 NOT NULL 
> DISABLE NOVALIDATE;
> ALTER TABLE table13 CHANGE b b STRING NOT NULL DISABLE NOVALIDATE;
> {code}





[jira] [Commented] (HIVE-16575) Support for 'UNIQUE' and 'NOT NULL' constraints

2017-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995065#comment-15995065
 ] 

ASF GitHub Bot commented on HIVE-16575:
---

GitHub user jcamachor opened a pull request:

https://github.com/apache/hive/pull/175

HIVE-16575: Support for 'UNIQUE' and 'NOT NULL' constraints



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jcamachor/hive not_null

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/175.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #175


commit 686ed38fdcf5b67f9748631795dbeaf2a8a2b692
Author: Jesus Camacho Rodriguez 
Date:   2017-05-03T09:09:49Z

HIVE-16575: Support for 'UNIQUE' and 'NOT NULL' constraints




> Support for 'UNIQUE' and 'NOT NULL' constraints
> ---
>
> Key: HIVE-16575
> URL: https://issues.apache.org/jira/browse/HIVE-16575
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Logical Optimizer, Parser
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Follow-up on HIVE-13076.
> This issue adds support for SQL 'UNIQUE' and 'NOT NULL' constraints when 
> creating or altering a table 
> (https://www.postgresql.org/docs/9.6/static/sql-createtable.html).
> As with PK and FK constraints, currently we do not enforce them; thus, the 
> constraints need to use the DISABLE option, but they will be stored and can 
> be enabled for rewriting/optimization using RELY.
> This patch also adds support for inlining the constraints next to the column 
> type definition, i.e., 'column constraints'.
> Some examples of the extension to the syntax included in the patch:
> {code:sql}
> CREATE TABLE table3 (x string NOT NULL DISABLE, PRIMARY KEY (x) DISABLE, 
> CONSTRAINT fk1 FOREIGN KEY (x) REFERENCES table2(a) DISABLE); 
> CREATE TABLE table4 (x string CONSTRAINT nn4_1 NOT NULL DISABLE, y string 
> CONSTRAINT nn4_2 NOT NULL DISABLE, UNIQUE (x) DISABLE, CONSTRAINT fk2 FOREIGN 
> KEY (x) REFERENCES table2(a) DISABLE, 
> CONSTRAINT fk3 FOREIGN KEY (y) REFERENCES table2(a) DISABLE);
> CREATE TABLE table12 (a STRING CONSTRAINT nn12_1 NOT NULL DISABLE NORELY, b 
> STRING);
> CREATE TABLE table13 (a STRING NOT NULL DISABLE RELY, b STRING);
> CREATE TABLE table14 (a STRING CONSTRAINT nn14_1 NOT NULL DISABLE RELY, b 
> STRING);
> CREATE TABLE table15 (a STRING REFERENCES table4(x) DISABLE, b STRING);
> CREATE TABLE table16 (a STRING CONSTRAINT nn16_1 REFERENCES table4(x) DISABLE 
> RELY, b STRING);
> ALTER TABLE table16 CHANGE a a STRING REFERENCES table4(x) DISABLE NOVALIDATE;
> ALTER TABLE table12 CHANGE COLUMN b b STRING CONSTRAINT nn12_2 NOT NULL 
> DISABLE NOVALIDATE;
> ALTER TABLE table13 CHANGE b b STRING NOT NULL DISABLE NOVALIDATE;
> {code}





[jira] [Updated] (HIVE-16575) Support for 'UNIQUE' and 'NOT NULL' constraints

2017-05-03 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-16575:
---
Description: 
Follow-up on HIVE-13076.

This issue adds support for SQL 'UNIQUE' and 'NOT NULL' constraints when 
creating or altering a table 
(https://www.postgresql.org/docs/9.6/static/sql-createtable.html).

As with PK and FK constraints, currently we do not enforce them; thus, the 
constraints need to use the DISABLE option, but they will be stored and can be 
enabled for rewriting/optimization using RELY.

This patch also adds support for inlining the constraints next to the column 
type definition, i.e., 'column constraints'.

Some examples of the extension to the syntax included in the patch:
{code:sql}
CREATE TABLE table3 (x string NOT NULL DISABLE, PRIMARY KEY (x) DISABLE, 
CONSTRAINT fk1 FOREIGN KEY (x) REFERENCES table2(a) DISABLE); 
CREATE TABLE table4 (x string CONSTRAINT nn4_1 NOT NULL DISABLE, y string 
CONSTRAINT nn4_2 NOT NULL DISABLE, UNIQUE (x) DISABLE, CONSTRAINT fk2 FOREIGN 
KEY (x) REFERENCES table2(a) DISABLE, 
CONSTRAINT fk3 FOREIGN KEY (y) REFERENCES table2(a) DISABLE);
CREATE TABLE table12 (a STRING CONSTRAINT nn12_1 NOT NULL DISABLE NORELY, b 
STRING);
CREATE TABLE table13 (a STRING NOT NULL DISABLE RELY, b STRING);
CREATE TABLE table14 (a STRING CONSTRAINT nn14_1 NOT NULL DISABLE RELY, b 
STRING);
CREATE TABLE table15 (a STRING REFERENCES table4(x) DISABLE, b STRING);
CREATE TABLE table16 (a STRING CONSTRAINT nn16_1 REFERENCES table4(x) DISABLE 
RELY, b STRING);
ALTER TABLE table16 CHANGE a a STRING REFERENCES table4(x) DISABLE NOVALIDATE;
ALTER TABLE table12 CHANGE COLUMN b b STRING CONSTRAINT nn12_2 NOT NULL DISABLE 
NOVALIDATE;
ALTER TABLE table13 CHANGE b b STRING NOT NULL DISABLE NOVALIDATE;
{code}

  was:
Follow-up on HIVE-13076.

This issue adds support for SQL 'UNIQUE' and 'NOT NULL' constraints when 
creating or altering a table 
(https://www.postgresql.org/docs/9.6/static/sql-createtable.html).

As with PK and FK constraints, currently we do not enforce them; thus, the 
constraints need to use the DISABLE option, but they will be stored and can be 
enabled for rewriting/optimization using RELY.

This patch also adds support for inlining the constraints next to the column 
type definition, i.e., 'column constraints'.

Some examples of the extension to the syntax included in the patch:
{code:sql}
CREATE TABLE table3 (x string NOT NULL DISABLE, PRIMARY KEY (x) DISABLE, 
CONSTRAINT fk1 FOREIGN KEY (x) REFERENCES table2(a) DISABLE); 
CREATE TABLE table4 (x string CONSTRAINT nn4_1 NOT NULL DISABLE, y string 
CONSTRAINT nn4_2 NOT NULL DISABLE, UNIQUE (x) DISABLE, CONSTRAINT fk2 FOREIGN 
KEY (x) REFERENCES table2(a) DISABLE, 
CONSTRAINT fk3 FOREIGN KEY (y) REFERENCES table2(a) DISABLE);
CREATE TABLE table12 (a STRING CONSTRAINT nn12_1 NOT NULL DISABLE NORELY, b 
STRING);
CREATE TABLE table13 (a STRING NOT NULL DISABLE RELY, b STRING);
CREATE TABLE table14 (a STRING CONSTRAINT nn14_1 NOT NULL DISABLE RELY, b 
STRING);
CREATE TABLE table15 (a STRING REFERENCES table4(x) DISABLE, b STRING);
CREATE TABLE table16 (a STRING CONSTRAINT nn16_1 REFERENCES table4(x) DISABLE 
RELY, b STRING);
{code}


> Support for 'UNIQUE' and 'NOT NULL' constraints
> ---
>
> Key: HIVE-16575
> URL: https://issues.apache.org/jira/browse/HIVE-16575
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Logical Optimizer, Parser
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Follow-up on HIVE-13076.
> This issue adds support for SQL 'UNIQUE' and 'NOT NULL' constraints when 
> creating or altering a table 
> (https://www.postgresql.org/docs/9.6/static/sql-createtable.html).
> As with PK and FK constraints, currently we do not enforce them; thus, the 
> constraints need to use the DISABLE option, but they will be stored and can 
> be enabled for rewriting/optimization using RELY.
> This patch also adds support for inlining the constraints next to the column 
> type definition, i.e., 'column constraints'.
> Some examples of the extension to the syntax included in the patch:
> {code:sql}
> CREATE TABLE table3 (x string NOT NULL DISABLE, PRIMARY KEY (x) DISABLE, 
> CONSTRAINT fk1 FOREIGN KEY (x) REFERENCES table2(a) DISABLE); 
> CREATE TABLE table4 (x string CONSTRAINT nn4_1 NOT NULL DISABLE, y string 
> CONSTRAINT nn4_2 NOT NULL DISABLE, UNIQUE (x) DISABLE, CONSTRAINT fk2 FOREIGN 
> KEY (x) REFERENCES table2(a) DISABLE, 
> CONSTRAINT fk3 FOREIGN KEY (y) REFERENCES table2(a) DISABLE);
> CREATE TABLE table12 (a STRING CONSTRAINT nn12_1 NOT NULL DISABLE NORELY, b 
> STRING);
> CREATE TABLE table13 (a STRING NOT NULL DISABLE RELY, b STRING);
> CREATE TABLE table14 (a STRING CONSTRAINT nn14_1 NOT NULL DISABLE RELY, b 
> 
