[jira] [Commented] (HIVE-21886) REPL - With table list - Handle rename events during replace policy

2019-06-28 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875379#comment-16875379
 ] 

Hive QA commented on HIVE-21886:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12973227/HIVE-21886.05.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 16359 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17796/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17796/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17796/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12973227 - PreCommit-HIVE-Build

> REPL - With table list - Handle rename events during replace policy
> ---
>
> Key: HIVE-21886
> URL: https://issues.apache.org/jira/browse/HIVE-21886
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, Replication, pull-request-available
> Attachments: HIVE-21886.01.patch, HIVE-21886.02.patch, 
> HIVE-21886.03.patch, HIVE-21886.04.patch, HIVE-21886.04.patch, 
> HIVE-21886.05.patch
>
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>
> If rename events are dumped and replayed while a replace policy is being 
> executed, each table name must be checked for inclusion in both the old and 
> the new policy.
>  1. Create a list of tables to be bootstrapped.
>  2. During handling of alter table, if the alter type is rename:
>      1. If the old table name is present in the list of tables to be 
> bootstrapped, remove it.
>      2. If the new table name matches the new policy, add it to the list 
> of tables to be bootstrapped.
>      3. If the old table does not match the old policy, drop it, even if the 
> table is not present at the target.
>  3. During handling of drop table:
>      1. If the table is in the list of tables to be bootstrapped, remove 
> it and ignore the event.
>  4. During handling of any other event:
>      1. If the table is in the list of tables to be bootstrapped, ignore 
> the event.
>      2. If the new policy does not match the table name, ignore the event.
>  
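The four steps above can be sketched as a small state machine around the bootstrap list. This is only an illustration under stated assumptions, not the code from the attached patch: the class `BootstrapListSketch`, its method names, and the use of `Predicate<String>` to stand in for the old and new policies are all hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical sketch: class and method names are illustrative, not Hive APIs. */
final class BootstrapListSketch {
    private final Predicate<String> oldPolicy; // does a table name match the old policy?
    private final Predicate<String> newPolicy; // does a table name match the new policy?
    // Step 1: the list of tables to be bootstrapped.
    private final Set<String> toBootstrap = new LinkedHashSet<>();

    BootstrapListSketch(Predicate<String> oldPolicy, Predicate<String> newPolicy) {
        this.oldPolicy = oldPolicy;
        this.newPolicy = newPolicy;
    }

    /** Step 2 (rename). Returns true when the old table should be dropped at the target. */
    boolean onRename(String oldName, String newName) {
        toBootstrap.remove(oldName);     // 2.1: the old name no longer needs a bootstrap
        if (newPolicy.test(newName)) {   // 2.2: the new name is covered by the new policy
            toBootstrap.add(newName);
        }
        return !oldPolicy.test(oldName); // 2.3: condition as worded in the description
    }

    /** Step 3 (drop). Returns true when the drop event should be ignored. */
    boolean onDrop(String table) {
        return toBootstrap.remove(table);
    }

    /** Step 4 (any other event). Returns true when the event should be ignored. */
    boolean shouldIgnore(String table) {
        return toBootstrap.contains(table) || !newPolicy.test(table);
    }

    /** Read access to the pending bootstrap list. */
    Set<String> pending() {
        return toBootstrap;
    }
}
```

A caller would build the sketch with two predicates (for example, name-prefix checks), feed it events in log order, and bootstrap whatever remains in `pending()` once the events are exhausted.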
> Rename handling during replace policy
>  # Old name not matching old policy – The old table will not be present at 
> the target cluster. The table will not be returned by get-all-table.
>  ## Old name not matching new policy
>  ### New name not matching old policy
>  #### New name not matching new policy
>  * Ignore the event; nothing needs to be done.
>  #### New name matching new policy
>  * The table will be returned by get-all-table. The replace policy handler 
> will bootstrap this table, as it matches the new policy and does not match 
> the old policy.
>  * All future events will be ignored by the check added as part of replace 
> policy handling.
>  * All events with the old table name will be ignored anyway, as the old 
> name does not match the new policy.
>  ### New name matching old policy
>  #### New name not matching new policy
>  * As the new name does not match the new policy, the table need not be 
> replicated.
>  * As the old name does not match the new policy, the rename events will 
> be ignored.
>  * So nothing needs to be done for this scenario.
>  #### New name matching new policy
>  * As the new name matches both the old and the new policy, the replace 
> handler will not bootstrap the table.
>  * Add the table to the list of tables to be bootstrapped.
>  * Ignore all events with the new name.
>  * If there is a drop event for the table (with the new name), remove 
> the table from the list of tables to be bootstrapped.
>  * In case of a rename event (double rename):
>  ** If the new name satisfies the table pattern, add the new name to the 
> list of tables to be bootstrapped and remove the old name from the list.
>  ** If the new name does not satisfy the pattern, just remove the table 
> name from the list of tables to be bootstrapped.
>  ## Old name matching new policy – As per the replace policy handler, which 
> checks based on the old table, the table should be bootstrapped and the 
> event should be ignored. But the rename handler should decide based on the 
> new name. The old table name will no

[jira] [Commented] (HIVE-21886) REPL - With table list - Handle rename events during replace policy

2019-06-28 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875369#comment-16875369
 ] 

Hive QA commented on HIVE-21886:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 52s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 37s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  2s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 58s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 40s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 21s{color} | {color:blue} testutils/ptest2 in master has 24 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 36s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 31s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 14s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 32m 56s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-17796/dev-support/hive-personality.sh |
| git revision | master / b2a265a |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-17796/yetus/patch-asflicense-problems.txt |
| modules | C: ql itests/hive-unit testutils/ptest2 U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-17796/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.




[jira] [Updated] (HIVE-21886) REPL - With table list - Handle rename events during replace policy

2019-06-28 Thread mahesh kumar behera (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-21886:
---
Status: Patch Available  (was: Open)


[jira] [Updated] (HIVE-21886) REPL - With table list - Handle rename events during replace policy

2019-06-28 Thread mahesh kumar behera (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-21886:
---
Status: Open  (was: Patch Available)


[jira] [Updated] (HIVE-21886) REPL - With table list - Handle rename events during replace policy

2019-06-28 Thread mahesh kumar behera (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-21886:
---
Attachment: HIVE-21886.05.patch

> REPL - With table list - Handle rename events during replace policy
> ---
>
> Key: HIVE-21886
> URL: https://issues.apache.org/jira/browse/HIVE-21886
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, Replication, pull-request-available
> Attachments: HIVE-21886.01.patch, HIVE-21886.02.patch, 
> HIVE-21886.03.patch, HIVE-21886.04.patch, HIVE-21886.04.patch, 
> HIVE-21886.05.patch
>
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>
> If rename events are dumped and replayed while a replace policy is being 
> executed, each table name must be checked for inclusion in both the old and 
> the new policy.
>  1. Create a list of tables to be bootstrapped.
>  2. During handling of alter table, if the alter type is rename:
>      1. If the old table name is present in the list of tables to be 
> bootstrapped, remove it.
>      2. If the new table name matches the new policy, add it to the list 
> of tables to be bootstrapped.
>      3. If the old table does not match the old policy, drop it, even if the 
> table is not present at the target.
>  3. During handling of drop table:
>      1. If the table is in the list of tables to be bootstrapped, remove 
> it and ignore the event.
>  4. During handling of any other event:
>      1. If the table is in the list of tables to be bootstrapped, ignore 
> the event.
>      2. If the new policy does not match the table name, ignore the event.
>  
> Rename handling during replace policy
>  # Old name not matching old policy – The old table will not be present at 
> the target cluster. The table will not be returned by get-all-table.
>  ## Old name not matching new policy
>  ### New name not matching old policy
>  #### New name not matching new policy
>  * Ignore the event; nothing needs to be done.
>  #### New name matching new policy
>  * The table will be returned by get-all-table. The replace policy handler 
> will bootstrap this table, as it matches the new policy and does not match 
> the old policy.
>  * All future events will be ignored by the check added as part of replace 
> policy handling.
>  * All events with the old table name will be ignored anyway, as the old 
> name does not match the new policy.
>  ### New name matching old policy
>  #### New name not matching new policy
>  * As the new name does not match the new policy, the table need not be 
> replicated.
>  * As the old name does not match the new policy, the rename events will 
> be ignored.
>  * So nothing needs to be done for this scenario.
>  #### New name matching new policy
>  * As the new name matches both the old and the new policy, the replace 
> handler will not bootstrap the table.
>  * Add the table to the list of tables to be bootstrapped.
>  * Ignore all events with the new name.
>  * If there is a drop event for the table (with the new name), remove 
> the table from the list of tables to be bootstrapped.
>  * In case of a rename event (double rename):
>  ** If the new name satisfies the table pattern, add the new name to the 
> list of tables to be bootstrapped and remove the old name from the list.
>  ** If the new name does not satisfy the pattern, just remove the table 
> name from the list of tables to be bootstrapped.
>  ## Old name matching new policy – As per the replace policy handler, which 
> checks based on the old table, the table should be bootstrapped and the 
> event should be ignored. But the rename handler should decide based on the 
> new name. The old table name will not be returned by get-all-table, so the 
> replace handler will not do anything for the old table.
>  ### New name not matching old policy
>  #### New name not matching new policy
>  * As the old table is not present at the target and the new name does not 
> match the new policy, ignore the event.
>  * No need to add the table to the list of tables to be bootstrapped.
>  * All subsequent events will be ignored, as the new name does not match 
> the new policy.
>  #### New name matching new policy
>  * As the new name does not match the old policy but matches the new policy, 
> the table will be bootstrapped by the replace policy handler. So the rename 
> event need not add this table to the list of tables to be bootstrapped.
>  * All future events will be ignored by the replace policy handler.
>  * For a rename event (double rename):
>  ** If there is a rename, the table (with intermittent new

[jira] [Commented] (HIVE-21637) Synchronized metastore cache

2019-06-28 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875340#comment-16875340
 ] 

Hive QA commented on HIVE-21637:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 59s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m  0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 30s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 26s{color} | {color:blue} storage-api in master has 48 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 34s{color} | {color:blue} standalone-metastore/metastore-common in master has 31 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 12s{color} | {color:blue} standalone-metastore/metastore-server in master has 179 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  3s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 29s{color} | {color:blue} beeline in master has 44 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 29s{color} | {color:blue} hcatalog/server-extensions in master has 3 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 32s{color} | {color:blue} standalone-metastore/metastore-tools/metastore-benchmarks in master has 3 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 42s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 51s{color} | {color:blue} itests/util in master has 44 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 38s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m  4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m  4s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 11s{color} | {color:red} storage-api: The patch generated 1 new + 5 unchanged - 0 fixed = 6 total (was 5) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 18s{color} | {color:red} standalone-metastore/metastore-common: The patch generated 9 new + 498 unchanged - 2 fixed = 507 total (was 500) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 46s{color} | {color:red} standalone-metastore/metastore-server: The patch generated 160 new + 2193 unchanged - 65 fixed = 2353 total (was 2258) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 53s{color} | {color:red} ql: The patch generated 25 new + 962 unchanged - 10 fixed = 987 total (was 972) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 12s{color} | {color:red} standalone-metastore/metastore-tools/tools-common: The patch generated 5 new + 31 unchanged - 0 fixed = 36 total (was 31) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 11s{color} | {color:red} itests/hcatalog-unit: The patch generated 2 new + 24 unchanged - 3 fixed = 26 total (was 27) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 19s{color} | {color:red} itests/hive-unit: The patch generated 3 new + 163 unchanged - 1 fixed = 166 total (was 164) {color} |
| {color:red}-1{color} | {color:red} checkstyle

[jira] [Updated] (HIVE-21637) Synchronized metastore cache

2019-06-28 Thread Daniel Dai (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-21637:
--
Attachment: HIVE-21637.13.patch

> Synchronized metastore cache
> 
>
> Key: HIVE-21637
> URL: https://issues.apache.org/jira/browse/HIVE-21637
> Project: Hive
>  Issue Type: New Feature
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
> Attachments: HIVE-21637-1.patch, HIVE-21637.10.patch, 
> HIVE-21637.11.patch, HIVE-21637.12.patch, HIVE-21637.13.patch, 
> HIVE-21637.2.patch, HIVE-21637.3.patch, HIVE-21637.4.patch, 
> HIVE-21637.5.patch, HIVE-21637.6.patch, HIVE-21637.7.patch, 
> HIVE-21637.8.patch, HIVE-21637.9.patch
>
>
> Currently, HMS has a cache implemented by CachedStore. The cache is updated 
> asynchronously, so in an HMS HA setting we can only get eventual consistency. 
> In this Jira, we try to make it synchronized.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21637) Synchronized metastore cache

2019-06-28 Thread Daniel Dai (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-21637:
--
Attachment: (was: HIVE-21637.13.patch)

> Synchronized metastore cache
> 
>
> Key: HIVE-21637
> URL: https://issues.apache.org/jira/browse/HIVE-21637
> Project: Hive
>  Issue Type: New Feature
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
> Attachments: HIVE-21637-1.patch, HIVE-21637.10.patch, 
> HIVE-21637.11.patch, HIVE-21637.12.patch, HIVE-21637.2.patch, 
> HIVE-21637.3.patch, HIVE-21637.4.patch, HIVE-21637.5.patch, 
> HIVE-21637.6.patch, HIVE-21637.7.patch, HIVE-21637.8.patch, HIVE-21637.9.patch
>
>
> Currently, HMS has a cache implemented by CachedStore. The cache is updated 
> asynchronously, so in an HMS HA setting we can only get eventual consistency. 
> In this Jira, we try to make it synchronized.





[jira] [Updated] (HIVE-21637) Synchronized metastore cache

2019-06-28 Thread Daniel Dai (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-21637:
--
Attachment: HIVE-21637.13.patch

> Synchronized metastore cache
> 
>
> Key: HIVE-21637
> URL: https://issues.apache.org/jira/browse/HIVE-21637
> Project: Hive
>  Issue Type: New Feature
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
> Attachments: HIVE-21637-1.patch, HIVE-21637.10.patch, 
> HIVE-21637.11.patch, HIVE-21637.12.patch, HIVE-21637.13.patch, 
> HIVE-21637.2.patch, HIVE-21637.3.patch, HIVE-21637.4.patch, 
> HIVE-21637.5.patch, HIVE-21637.6.patch, HIVE-21637.7.patch, 
> HIVE-21637.8.patch, HIVE-21637.9.patch
>
>
> Currently, HMS has a cache implemented by CachedStore. The cache is 
> asynchronous, so in an HMS HA setting we can only get eventual consistency. In 
> this Jira, we try to make it synchronized.





[jira] [Updated] (HIVE-21927) HiveServer Web UI: Setting the HttpOnly option in the cookies

2019-06-28 Thread Daniel Dai (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-21927:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

> HiveServer Web UI: Setting the HttpOnly option in the cookies
> -
>
> Key: HIVE-21927
> URL: https://issues.apache.org/jira/browse/HIVE-21927
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.1
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21927.01.patch, HIVE-21927.patch
>
>
> The intent of this JIRA is to introduce the HttpOnly option in the cookie.
> Cookie before the change:
> {code:java}
> hdp32b  FALSE   /   FALSE   0   JSESSIONID  8dkibwayfnrc4y4hvpu3vh74
> {code}
> Cookie after the change:
> {code:java}
> #HttpOnly_hdp32b  FALSE   /   FALSE   0   JSESSIONID  e1npdkbo3inj1xnd6gdc6ihws
> {code}
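
The two cookie lines above appear to be in the cookie-jar file format written by tools such as curl, where an HttpOnly cookie is marked by a `#HttpOnly_` prefix on the domain field. As a minimal sketch under that assumption, the following Python snippet checks whether a jar line carries the flag. The names `JarCookie` and `parse_jar_line` are illustrative and not part of Hive or the patch.

```python
from typing import NamedTuple

# Prefix that marks an HttpOnly cookie in a cookie-jar line (assumed format).
HTTPONLY_PREFIX = "#HttpOnly_"


class JarCookie(NamedTuple):
    domain: str
    name: str
    value: str
    http_only: bool


def parse_jar_line(line: str) -> JarCookie:
    """Parse one whitespace-separated cookie-jar line with seven fields."""
    http_only = line.startswith(HTTPONLY_PREFIX)
    if http_only:
        line = line[len(HTTPONLY_PREFIX):]
    # Fields: domain, include-subdomains, path, secure, expiry, name, value
    domain, _subdomains, _path, _secure, _expiry, name, value = line.split()
    return JarCookie(domain, name, value, http_only)


before = parse_jar_line("hdp32b\tFALSE\t/\tFALSE\t0\tJSESSIONID\t8dkibwayfnrc4y4hvpu3vh74")
after = parse_jar_line("#HttpOnly_hdp32b\tFALSE\t/\tFALSE\t0\tJSESSIONID\te1npdkbo3inj1xnd6gdc6ihws")
print(before.http_only, after.http_only)
```

A check like this could be used to confirm the flag after deploying the patch, by saving the Web UI session cookie to a jar file and parsing the JSESSIONID line.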





[jira] [Commented] (HIVE-21927) HiveServer Web UI: Setting the HttpOnly option in the cookies

2019-06-28 Thread Daniel Dai (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875257#comment-16875257
 ] 

Daniel Dai commented on HIVE-21927:
---

+1. Patch pushed to master. Thanks Rajkumar!

> HiveServer Web UI: Setting the HttpOnly option in the cookies
> -
>
> Key: HIVE-21927
> URL: https://issues.apache.org/jira/browse/HIVE-21927
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.1
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-21927.01.patch, HIVE-21927.patch
>
>
> The intent of this JIRA is to introduce the HttpOnly option in the cookie.
> Cookie before the change:
> {code:java}
> hdp32b  FALSE   /   FALSE   0   JSESSIONID  8dkibwayfnrc4y4hvpu3vh74
> {code}
> Cookie after the change:
> {code:java}
> #HttpOnly_hdp32b  FALSE   /   FALSE   0   JSESSIONID  e1npdkbo3inj1xnd6gdc6ihws
> {code}





[jira] [Comment Edited] (HIVE-21935) Hive Vectorization : degraded performance with vectorize UDF

2019-06-28 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875255#comment-16875255
 ] 

Gopal V edited comment on HIVE-21935 at 6/28/19 9:53 PM:
-

Actually, the execution is buffering 1024 rows: the index is not evaluated 
until all 1024 split calls are queued up.
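
The buffering described in the comment above can be illustrated with a small Python sketch. This is not Hive's implementation: `split_udf`, `row_mode`, and `batch_mode` are hypothetical stand-ins that only contrast evaluating the split and the list index row by row with evaluating the split for a whole 1024-row batch before the index is applied.

```python
BATCH_SIZE = 1024  # rows per batch, as stated in the comment above


def split_udf(value: str) -> list:
    """Stand-in for the custom split UDF (hypothetical)."""
    return value.split(",")


def row_mode(rows, index):
    """Evaluate split and index one row at a time."""
    return [split_udf(row)[index] for row in rows]


def batch_mode(rows, index, batch_size=BATCH_SIZE):
    """Evaluate split for a whole batch, then apply the index."""
    results, peak_buffered = [], 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # Child expression first: every split result in the batch is materialised.
        buffered = [split_udf(row) for row in batch]
        peak_buffered = max(peak_buffered, len(buffered))
        # Parent expression second: the list index runs over the buffered results.
        results.extend(parts[index] for parts in buffered)
    return results, peak_buffered


rows = [f"a{i},b{i},c{i}" for i in range(2500)]
batched, peak = batch_mode(rows, 1)
print(batched == row_mode(rows, 1), peak)
```

Both modes return the same values. The difference is that the batch variant holds up to 1024 intermediate lists at once, which is the kind of per-batch cost the stack traces in this issue may be showing.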


was (Author: gopalv):
The loop looks like it is constant folding the value and executing the UDF to 
do that.

> Hive Vectorization : degraded performance with vectorize UDF  
> --
>
> Key: HIVE-21935
> URL: https://issues.apache.org/jira/browse/HIVE-21935
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 3.1.1
> Environment: Hive-3, JDK-8
>Reporter: Rajkumar Singh
>Priority: Major
>  Labels: performance
> Attachments: CustomSplit-1.0-SNAPSHOT.jar
>
>
> With vectorization turned on and hive.vectorized.adaptor.usage.mode=all, we 
> were seeing severe performance degradation. Looking at the task jstacks, it 
> seems the task is running the code that vectorizes the UDF and is stuck in a loop.
> {code:java}
> jstack -l 14954 | grep 0x3af0 -A20
> "TezChild" #15 daemon prio=5 os_prio=0 tid=0x7f157538d800 nid=0x3af0 
> runnable [0x7f1547581000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:573)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.ListIndexColScalar.evaluate(ListIndexColScalar.java:59)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:889)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> [yarn@hdp32b ~]$ jstack -l 14954 | grep 0x3af0 -A20
> "TezChild" #15 daemon prio=5 os_prio=0 tid=0x7f157538d800 nid=0x3af0 
> runnable [0x7f1547581000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.ensureSize(BytesColumnVector.java:554)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:570)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.ListIndexColScalar.evaluate(ListIndexColScalar.java:59)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:889)
>   at 
> org.apache.hadoop.hive.ql

[jira] [Commented] (HIVE-21935) Hive Vectorization : degraded performance with vectorize UDF

2019-06-28 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875255#comment-16875255
 ] 

Gopal V commented on HIVE-21935:


The loop looks like it is constant folding the value and executing the UDF to 
do that.

> Hive Vectorization : degraded performance with vectorize UDF  
> --
>
> Key: HIVE-21935
> URL: https://issues.apache.org/jira/browse/HIVE-21935
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 3.1.1
> Environment: Hive-3, JDK-8
>Reporter: Rajkumar Singh
>Priority: Major
>  Labels: performance
> Attachments: CustomSplit-1.0-SNAPSHOT.jar
>
>
> With vectorization turned on and hive.vectorized.adaptor.usage.mode=all, we 
> were seeing severe performance degradation. Looking at the task jstacks, it 
> seems the task is running the code that vectorizes the UDF and is stuck in a loop.
> {code:java}
> jstack -l 14954 | grep 0x3af0 -A20
> "TezChild" #15 daemon prio=5 os_prio=0 tid=0x7f157538d800 nid=0x3af0 
> runnable [0x7f1547581000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:573)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.ListIndexColScalar.evaluate(ListIndexColScalar.java:59)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:889)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> [yarn@hdp32b ~]$ jstack -l 14954 | grep 0x3af0 -A20
> "TezChild" #15 daemon prio=5 os_prio=0 tid=0x7f157538d800 nid=0x3af0 
> runnable [0x7f1547581000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.ensureSize(BytesColumnVector.java:554)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:570)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.ListIndexColScalar.evaluate(ListIndexColScalar.java:59)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:889)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
>   at 
> org.apache.had

[jira] [Updated] (HIVE-21935) Hive Vectorization : degraded performance issue with vectorize UDF

2019-06-28 Thread Rajkumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh updated HIVE-21935:
--
Summary: Hive Vectorization : degraded performance issue with vectorize UDF 
   (was: Hive Vectorization : Server performance issue with vectorize UDF  )

> Hive Vectorization : degraded performance issue with vectorize UDF  
> 
>
> Key: HIVE-21935
> URL: https://issues.apache.org/jira/browse/HIVE-21935
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 3.1.1
> Environment: Hive-3, JDK-8
>Reporter: Rajkumar Singh
>Priority: Major
>  Labels: performance
> Attachments: CustomSplit-1.0-SNAPSHOT.jar
>
>
> With vectorization turned on and hive.vectorized.adaptor.usage.mode=all, we 
> were seeing severe performance degradation. Looking at the task jstacks, it 
> seems the task is running the code that vectorizes the UDF and is stuck in a loop.
> {code:java}
> jstack -l 14954 | grep 0x3af0 -A20
> "TezChild" #15 daemon prio=5 os_prio=0 tid=0x7f157538d800 nid=0x3af0 
> runnable [0x7f1547581000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:573)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.ListIndexColScalar.evaluate(ListIndexColScalar.java:59)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:889)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> [yarn@hdp32b ~]$ jstack -l 14954 | grep 0x3af0 -A20
> "TezChild" #15 daemon prio=5 os_prio=0 tid=0x7f157538d800 nid=0x3af0 
> runnable [0x7f1547581000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.ensureSize(BytesColumnVector.java:554)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:570)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.ListIndexColScalar.evaluate(ListIndexColScalar.java:59)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:889)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource

[jira] [Updated] (HIVE-21935) Hive Vectorization : degraded performance with vectorize UDF

2019-06-28 Thread Rajkumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh updated HIVE-21935:
--
Summary: Hive Vectorization : degraded performance with vectorize UDF
(was: Hive Vectorization : degraded performance issue with vectorize UDF  )

> Hive Vectorization : degraded performance with vectorize UDF  
> --
>
> Key: HIVE-21935
> URL: https://issues.apache.org/jira/browse/HIVE-21935
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 3.1.1
> Environment: Hive-3, JDK-8
>Reporter: Rajkumar Singh
>Priority: Major
>  Labels: performance
> Attachments: CustomSplit-1.0-SNAPSHOT.jar
>
>
> With vectorization turned on and hive.vectorized.adaptor.usage.mode=all, we 
> were seeing severe performance degradation. Looking at the task jstacks, it 
> seems the task is running the code that vectorizes the UDF and is stuck in a loop.
> {code:java}
> jstack -l 14954 | grep 0x3af0 -A20
> "TezChild" #15 daemon prio=5 os_prio=0 tid=0x7f157538d800 nid=0x3af0 
> runnable [0x7f1547581000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:573)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.ListIndexColScalar.evaluate(ListIndexColScalar.java:59)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:889)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> [yarn@hdp32b ~]$ jstack -l 14954 | grep 0x3af0 -A20
> "TezChild" #15 daemon prio=5 os_prio=0 tid=0x7f157538d800 nid=0x3af0 
> runnable [0x7f1547581000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.ensureSize(BytesColumnVector.java:554)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:570)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.ListIndexColScalar.evaluate(ListIndexColScalar.java:59)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:889)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
> 

[jira] [Updated] (HIVE-21935) Hive Vectorization : Server performance issue with vectorize UDF

2019-06-28 Thread Rajkumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh updated HIVE-21935:
--
Attachment: CustomSplit-1.0-SNAPSHOT.jar

> Hive Vectorization : Server performance issue with vectorize UDF  
> --
>
> Key: HIVE-21935
> URL: https://issues.apache.org/jira/browse/HIVE-21935
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 3.1.1
> Environment: Hive-3, JDK-8
>Reporter: Rajkumar Singh
>Priority: Major
> Attachments: CustomSplit-1.0-SNAPSHOT.jar
>
>
> With vectorization turned on and hive.vectorized.adaptor.usage.mode=all, we 
> were seeing severe performance degradation. Looking at the task jstacks, it 
> seems the task is running the code that vectorizes the UDF and is stuck in a loop.
> {code:java}
> jstack -l 14954 | grep 0x3af0 -A20
> "TezChild" #15 daemon prio=5 os_prio=0 tid=0x7f157538d800 nid=0x3af0 
> runnable [0x7f1547581000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:573)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.ListIndexColScalar.evaluate(ListIndexColScalar.java:59)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:889)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> [yarn@hdp32b ~]$ jstack -l 14954 | grep 0x3af0 -A20
> "TezChild" #15 daemon prio=5 os_prio=0 tid=0x7f157538d800 nid=0x3af0 
> runnable [0x7f1547581000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.ensureSize(BytesColumnVector.java:554)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:570)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.ListIndexColScalar.evaluate(ListIndexColScalar.java:59)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:889)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>   at 
> org.apache.hadoop.hive.ql.e

[jira] [Updated] (HIVE-21935) Hive Vectorization : Server performance issue with vectorize UDF

2019-06-28 Thread Rajkumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh updated HIVE-21935:
--
Labels: performance  (was: )

> Hive Vectorization : Server performance issue with vectorize UDF  
> --
>
> Key: HIVE-21935
> URL: https://issues.apache.org/jira/browse/HIVE-21935
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 3.1.1
> Environment: Hive-3, JDK-8
>Reporter: Rajkumar Singh
>Priority: Major
>  Labels: performance
> Attachments: CustomSplit-1.0-SNAPSHOT.jar
>
>
> With vectorization turned on and hive.vectorized.adaptor.usage.mode=all, we 
> were seeing severe performance degradation. Looking at the task jstacks, it 
> seems the task is running the code that vectorizes the UDF and is stuck in a loop.
> {code:java}
> jstack -l 14954 | grep 0x3af0 -A20
> "TezChild" #15 daemon prio=5 os_prio=0 tid=0x7f157538d800 nid=0x3af0 
> runnable [0x7f1547581000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:573)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.ListIndexColScalar.evaluate(ListIndexColScalar.java:59)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:889)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> [yarn@hdp32b ~]$ jstack -l 14954 | grep 0x3af0 -A20
> "TezChild" #15 daemon prio=5 os_prio=0 tid=0x7f157538d800 nid=0x3af0 
> runnable [0x7f1547581000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.ensureSize(BytesColumnVector.java:554)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:570)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.ListIndexColScalar.evaluate(ListIndexColScalar.java:59)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:889)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>   at 
> org.

[jira] [Comment Edited] (HIVE-21848) Table property name definition between ORC and Parquet encrytion

2019-06-28 Thread Xinli Shang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875247#comment-16875247
 ] 

Xinli Shang edited comment on HIVE-21848 at 6/28/19 9:39 PM:
-

Hi [~owen.omalley], yes, I looked at HadoopShims.java earlier. I still 
remember you had a very smart workaround to avoid two round trips to 
generate/encrypt a working key from KMS. It cut the traffic in half. 

For the nested column questions above, I generally agree that the approach 
makes sense. There are only a few corner cases that we need to discuss.

For the example above "name: struct", if the 
table properties have the following entry, "encrypt.columns" = 
"pii:name;other_category:name.first", what do we do? Should we throw an 
exception? Or do we just ignore "other_category:name.first" and let the parent 
override it? 

Do we allow some leaf columns to be excluded from encryption if their parent 
is specified to be encrypted? I guess people will raise that feature request 
later, once this is rolled out. 

With that said, I am not objecting to the proposal; these are just some 
thoughts on corner cases. 
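
To make the first corner case concrete, here is a minimal Python sketch that parses an "encrypt.columns" value and reports parent/child entries that fall under different categories. The entry format (`category:column[,column...]` separated by `;`) is an assumption based only on the example string above, and the helper names `parse_encrypt_columns` and `find_nested_conflicts` are hypothetical; this is not Hive, ORC, or Parquet code.

```python
from typing import Dict, List, Tuple


def parse_encrypt_columns(value: str) -> Dict[str, str]:
    """Map each column path to its category, e.g. {"name": "pii"}.

    Assumed format: "category:col[,col...];category:col[,col...]".
    """
    mapping: Dict[str, str] = {}
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        category, _, columns = entry.partition(":")
        for column in columns.split(","):
            column = column.strip()
            if column:
                mapping[column] = category.strip()
    return mapping


def find_nested_conflicts(mapping: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (parent, child) pairs listed under different categories."""
    conflicts: List[Tuple[str, str]] = []
    for child, child_category in mapping.items():
        parts = child.split(".")
        # Check every proper prefix of the child path, e.g. "name" for "name.first".
        for i in range(1, len(parts)):
            parent = ".".join(parts[:i])
            if parent in mapping and mapping[parent] != child_category:
                conflicts.append((parent, child))
    return conflicts


mapping = parse_encrypt_columns("pii:name;other_category:name.first")
conflicts = find_nested_conflicts(mapping)
print(conflicts)
```

Whether such a conflict should raise an exception or be resolved in favor of the parent is exactly the open question; the sketch only detects the situation.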

 


was (Author: sha...@uber.com):
Hi [~owen.omalley], yes, I looked at the HadoopShims.java earlier. I still 
remember you had a super smart workaround to avoid two round trips to get 
generate/encrypt a working key from KMS. It reduced half of the traffic. 

For the nested column questions above, I generally agree that makes sense. 
There are only a few corner cases that we need to discuss.

For the example above "name: struct", if we see the 
table properties have the following entry, "encrypt.columns" = 
"pii:name;other_category:name.first", what do we do? Should we through 
exception? Or we just ignore "other_category:name.first" to let parent to 
override it? 

Do we allow exclusion of some leaf columns not to be encrypted, if their parent 
is specified to be encrypted? I guess people will raise the feature request 
later when it is roll out. 

With that said, I am not objecting the proposal but just some thoughts on 
corner cases. 

 

> Table property name definition between ORC and Parquet encrytion
> 
>
> Key: HIVE-21848
> URL: https://issues.apache.org/jira/browse/HIVE-21848
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Xinli Shang
>Assignee: Xinli Shang
>Priority: Major
> Fix For: 3.0.0
>
>
> The goal of this Jira is to define a unified superset of table property names 
> that can be used for both Parquet and ORC column encryption. No code 
> change is needed for this Jira.
> *Background:*
> ORC-14 and Parquet-1178 introduced column encryption to ORC and Parquet. 
> Table properties can be used to configure the encryption, e.g. which columns 
> are sensitive, which master key to use, which algorithm, etc. It is important 
> that both Parquet and ORC can use unified names.
> According to the slides at 
> [https://www.slideshare.net/oom65/fine-grain-access-control-for-big-data-orc-column-encryption-137308692],
>  ORC uses table properties like orc.encrypt.pii and orc.encrypt.credit. The 
> Parquet community is still discussing several ways to configure encryption; 
> table properties are one of the options, but there is no detailed 
> design of the table property names yet.
> So it is a good time for the two communities to discuss a unified superset of 
> table property names.
> *Proposal:*
> Several encryption properties need to be specified for a 
> table. Here is the list. It is the superset of Parquet and ORC, so some of 
> the properties might not apply to both.
>  # PII columns, including nested columns
>  # Column key metadata, master key metadata
>  # Encryption algorithm; for example, Parquet supports AES_GCM and AES_CTR. 
> ORC might support AES_CTR.
>  # Footer encryption - Parquet allows the footer to be encrypted or plaintext
>  # Footer key metadata
> Here is the table properties proposal.  
> |*Table Property Name*|*Value*|*Notes*|
> |encrypt_algorithm|aes_ctr, aes_gcm|The algorithm to be used for encryption.|
> |encrypt_footer_plaintext|true, false|Parquet supports plaintext and encrypted footers. By default, the footer is encrypted.|
> |encrypt_footer_key_metadata|base64 string of footer key metadata|It is up to the KMS to define what key metadata is. The metadata should have enough information for the KMS to figure out the corresponding key.|
> |encrypt_col_xxx|base64 string of column key metadata|‘xxx’ is the column name, for example ‘address.zipcode’. It is up to the KMS to define what key metadata is. The metadata should have enough information for the KMS to figure out the corresponding key.|
>  
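
As an illustration of the proposed property names in the table above, the following Python sketch assembles them into a plain dictionary. The function name `build_encryption_properties` and the key metadata byte strings are hypothetical placeholders (the proposal leaves the content of key metadata to the KMS); nothing here is an existing Hive, ORC, or Parquet API.

```python
import base64
from typing import Dict

# Algorithm values taken from the proposal table.
ALLOWED_ALGORITHMS = {"aes_ctr", "aes_gcm"}


def build_encryption_properties(
    algorithm: str,
    footer_plaintext: bool,
    footer_key_metadata: bytes,
    column_key_metadata: Dict[str, bytes],
) -> Dict[str, str]:
    """Build a table-property dict using the proposed names."""
    if algorithm not in ALLOWED_ALGORITHMS:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    props = {
        "encrypt_algorithm": algorithm,
        "encrypt_footer_plaintext": "true" if footer_plaintext else "false",
        "encrypt_footer_key_metadata": base64.b64encode(footer_key_metadata).decode("ascii"),
    }
    # One "encrypt_col_<column>" property per encrypted column (nested paths allowed).
    for column, metadata in column_key_metadata.items():
        props["encrypt_col_" + column] = base64.b64encode(metadata).decode("ascii")
    return props


props = build_encryption_properties(
    algorithm="aes_gcm",
    footer_plaintext=False,
    footer_key_metadata=b"footer-key-id",            # placeholder metadata
    column_key_metadata={"address.zipcode": b"zip-key-id"},  # placeholder metadata
)
for name, value in sorted(props.items()):
    print(name, "=", value)
```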




[jira] [Commented] (HIVE-21848) Table property name definition between ORC and Parquet encrytion

2019-06-28 Thread Xinli Shang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875247#comment-16875247
 ] 

Xinli Shang commented on HIVE-21848:


Hi [~owen.omalley], yes, I looked at HadoopShims.java earlier. I still 
remember you had a very smart workaround to avoid two round trips to 
generate/encrypt a working key from KMS. It cut the traffic in half. 

For the nested column questions above, I generally agree that the approach 
makes sense. There are only a few corner cases that we need to discuss.

For the example above "name: struct", if the 
table properties have the following entry, "encrypt.columns" = 
"pii:name;other_category:name.first", what do we do? Should we throw an 
exception? Or do we just ignore "other_category:name.first" and let the parent 
override it? 

Do we allow some leaf columns to be excluded from encryption if their parent 
is specified to be encrypted? I guess people will raise that feature request 
later, once this is rolled out. 

With that said, I am not objecting to the proposal; these are just some 
thoughts on corner cases. 

 

> Table property name definition between ORC and Parquet encrytion
> 
>
> Key: HIVE-21848
> URL: https://issues.apache.org/jira/browse/HIVE-21848
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Xinli Shang
>Assignee: Xinli Shang
>Priority: Major
> Fix For: 3.0.0
>
>
> The goal of this Jira is to define a unified superset of table property names 
> that can be used for both Parquet and ORC column encryption. No code 
> change is needed for this Jira.
> *Background:*
> ORC-14 and Parquet-1178 introduced column encryption to ORC and Parquet. 
> Table properties can be used to configure the encryption, e.g. which columns 
> are sensitive, which master key to use, which algorithm, etc. It is important 
> that both Parquet and ORC can use unified names.
> According to the slides at 
> [https://www.slideshare.net/oom65/fine-grain-access-control-for-big-data-orc-column-encryption-137308692],
>  ORC uses table properties like orc.encrypt.pii and orc.encrypt.credit. The 
> Parquet community is still discussing several ways to configure encryption; 
> table properties are one of the options, but there is no detailed 
> design of the table property names yet.
> So it is a good time for the two communities to discuss a unified superset of 
> table property names.
> *Proposal:*
> Several encryption properties need to be specified for a 
> table. Here is the list. It is the superset of Parquet and ORC, so some of 
> the properties might not apply to both.
>  # PII columns, including nested columns
>  # Column key metadata, master key metadata
>  # Encryption algorithm; for example, Parquet supports AES_GCM and AES_CTR. 
> ORC might support AES_CTR.
>  # Footer encryption - Parquet allows the footer to be encrypted or plaintext
>  # Footer key metadata
> Here is the table properties proposal.  
> |*Table Property Name*|*Value*|*Notes*|
> |encrypt_algorithm|aes_ctr, aes_gcm|The algorithm to be used for encryption.|
> |encrypt_footer_plaintext|true, false|Parquet supports plaintext and encrypted footers. By default, the footer is encrypted.|
> |encrypt_footer_key_metadata|base64 string of footer key metadata|It is up to the KMS to define what key metadata is. The metadata should have enough information for the KMS to figure out the corresponding key.|
> |encrypt_col_xxx|base64 string of column key metadata|‘xxx’ is the column name, for example ‘address.zipcode’. It is up to the KMS to define what key metadata is. The metadata should have enough information for the KMS to figure out the corresponding key.|
>  





[jira] [Updated] (HIVE-21867) Sort semijoin conditions to accelerate query processing

2019-06-28 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-21867:
---
Affects Version/s: 4.0.0
   Status: In Progress  (was: Patch Available)

Pushed to master, thanks for reviewing [~vgarg]!

> Sort semijoin conditions to accelerate query processing
> ---
>
> Key: HIVE-21867
> URL: https://issues.apache.org/jira/browse/HIVE-21867
> Project: Hive
>  Issue Type: New Feature
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21867.02.patch, HIVE-21867.03.patch, 
> HIVE-21867.04.patch, HIVE-21867.05.patch, HIVE-21867.05.patch, 
> HIVE-21867.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The problem was tackled for CBO in HIVE-21857. Semijoin filters are 
> introduced later in the planning phase. Follow similar approach to sort them, 
> trying to accelerate filter evaluation.





[jira] [Work logged] (HIVE-21867) Sort semijoin conditions to accelerate query processing

2019-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21867?focusedWorklogId=269511&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-269511
 ]

ASF GitHub Bot logged work on HIVE-21867:
-

Author: ASF GitHub Bot
Created on: 28/Jun/19 20:27
Start Date: 28/Jun/19 20:27
Worklog Time Spent: 10m 
  Work Description: asfgit commented on pull request #687: HIVE-21867
URL: https://github.com/apache/hive/pull/687
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 269511)
Time Spent: 1h 40m  (was: 1.5h)

> Sort semijoin conditions to accelerate query processing
> ---
>
> Key: HIVE-21867
> URL: https://issues.apache.org/jira/browse/HIVE-21867
> Project: Hive
>  Issue Type: New Feature
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21867.02.patch, HIVE-21867.03.patch, 
> HIVE-21867.04.patch, HIVE-21867.05.patch, HIVE-21867.05.patch, 
> HIVE-21867.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The problem was tackled for CBO in HIVE-21857. Semijoin filters are 
> introduced later in the planning phase. Follow similar approach to sort them, 
> trying to accelerate filter evaluation.





[jira] [Resolved] (HIVE-21867) Sort semijoin conditions to accelerate query processing

2019-06-28 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez resolved HIVE-21867.

   Resolution: Fixed
Fix Version/s: 4.0.0

> Sort semijoin conditions to accelerate query processing
> ---
>
> Key: HIVE-21867
> URL: https://issues.apache.org/jira/browse/HIVE-21867
> Project: Hive
>  Issue Type: New Feature
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21867.02.patch, HIVE-21867.03.patch, 
> HIVE-21867.04.patch, HIVE-21867.05.patch, HIVE-21867.05.patch, 
> HIVE-21867.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The problem was tackled for CBO in HIVE-21857. Semijoin filters are 
> introduced later in the planning phase. Follow similar approach to sort them, 
> trying to accelerate filter evaluation.





[jira] [Updated] (HIVE-21743) day( ) gives wrong day from the date in Apache Hive 3.1 server

2019-06-28 Thread Adarshdeep Cheema (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adarshdeep Cheema updated HIVE-21743:
-
Due Date: (was: 23/May/19)

> day( ) gives wrong day from the date in Apache Hive 3.1 server
> --
>
> Key: HIVE-21743
> URL: https://issues.apache.org/jira/browse/HIVE-21743
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.1
> Environment: Server: Apache Hive 3.1 
> Driver hive-jdbc-3.1.0.3.1.0.0-78
>Reporter: Adarshdeep Cheema
>Priority: Critical
>
> Using an Apache Hive 3.1 server, run the following SQL. It returns 3 
> instead of 1:
> SELECT
>  (day( DATE '0001-01-01'))
> FROM
>  `table`
> Please note that this does not happen with an Apache Hive 2.1 server.
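
The report above does not say why `day()` returns 3. One possible explanation, offered here purely as an assumption and not confirmed anywhere in this thread, is a calendar mismatch: in year 1 the Julian calendar is two days ahead of the proleptic Gregorian calendar, so the day labelled 0001-01-01 in the proleptic Gregorian calendar is labelled January 3 in the Julian calendar. The Python sketch below only illustrates that two-day offset with standard Julian Day Number arithmetic; it does not reproduce Hive's code path, and the helper name `julian_calendar_date` is hypothetical.

```python
from datetime import date
from typing import Tuple

# Offset between Python's proleptic Gregorian ordinal and the Julian Day Number:
# date(2000, 1, 1).toordinal() == 730120 and its JDN is 2451545.
ORDINAL_TO_JDN = 1721425


def julian_calendar_date(jdn: int) -> Tuple[int, int, int]:
    """Convert a Julian Day Number to (year, month, day) in the Julian calendar."""
    f = jdn + 1401
    e = 4 * f + 3
    g = (e % 1461) // 4
    h = 5 * g + 2
    day = (h % 153) // 5 + 1
    month = (h // 153 + 2) % 12 + 1
    year = e // 1461 - 4716 + (12 + 2 - month) // 12
    return year, month, day


# Python's date type uses the proleptic Gregorian calendar.
gregorian = date(1, 1, 1)
jdn = gregorian.toordinal() + ORDINAL_TO_JDN
print(julian_calendar_date(jdn))  # (1, 1, 3)
```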





[jira] [Commented] (HIVE-21921) Support for correlated quantified predicates

2019-06-28 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875186#comment-16875186
 ] 

Vineet Garg commented on HIVE-21921:


[~jcamachorodriguez] [~ashutoshc] Can you take a look? 
https://github.com/apache/hive/pull/693

> Support for correlated quantified predicates
> 
>
> Key: HIVE-21921
> URL: https://issues.apache.org/jira/browse/HIVE-21921
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21921.1.patch, HIVE-21921.2.patch, 
> HIVE-21921.3.patch, HIVE-21921.4.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-21921) Support for correlated quantified predicates

2019-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21921?focusedWorklogId=269485&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-269485
 ]

ASF GitHub Bot logged work on HIVE-21921:
-

Author: ASF GitHub Bot
Created on: 28/Jun/19 19:42
Start Date: 28/Jun/19 19:42
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on pull request #693: 
HIVE-21921: Support for correlated quantified predicates
URL: https://github.com/apache/hive/pull/693
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 269485)
Time Spent: 10m
Remaining Estimate: 0h

> Support for correlated quantified predicates
> 
>
> Key: HIVE-21921
> URL: https://issues.apache.org/jira/browse/HIVE-21921
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21921.1.patch, HIVE-21921.2.patch, 
> HIVE-21921.3.patch, HIVE-21921.4.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-21921) Support for correlated quantified predicates

2019-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-21921:
--
Labels: pull-request-available  (was: )

> Support for correlated quantified predicates
> 
>
> Key: HIVE-21921
> URL: https://issues.apache.org/jira/browse/HIVE-21921
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21921.1.patch, HIVE-21921.2.patch, 
> HIVE-21921.3.patch, HIVE-21921.4.patch
>
>






[jira] [Commented] (HIVE-21867) Sort semijoin conditions to accelerate query processing

2019-06-28 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875183#comment-16875183
 ] 

Hive QA commented on HIVE-21867:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12973179/HIVE-21867.05.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 16357 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17793/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17793/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17793/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12973179 - PreCommit-HIVE-Build

> Sort semijoin conditions to accelerate query processing
> ---
>
> Key: HIVE-21867
> URL: https://issues.apache.org/jira/browse/HIVE-21867
> Project: Hive
>  Issue Type: New Feature
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21867.02.patch, HIVE-21867.03.patch, 
> HIVE-21867.04.patch, HIVE-21867.05.patch, HIVE-21867.05.patch, 
> HIVE-21867.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The problem was tackled for CBO in HIVE-21857. Semijoin filters are 
> introduced later in the planning phase. Follow similar approach to sort them, 
> trying to accelerate filter evaluation.





[jira] [Updated] (HIVE-21934) Materialized view on top of Druid not pushing everything

2019-06-28 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-21934:
---
Component/s: Materialized views
 Druid integration

> Materialized view on top of Druid not pushing everything
> 
>
> Key: HIVE-21934
> URL: https://issues.apache.org/jira/browse/HIVE-21934
> Project: Hive
>  Issue Type: Improvement
>  Components: Druid integration, Materialized views
>Reporter: slim bouguerra
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> The title is not very informative, but the examples hopefully are.
> This is the plan with the materialized view:
> {code}
> explain SELECT MONTH(`dates_n1`.`__time`) AS `mn___time_ok`,
> CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT) AS `qr___time_ok`,
> SUM(1) AS `sum_number_of_records_ok`,
> YEAR(`dates_n1`.`__time`) AS `yr___time_ok`
> FROM `mv_ssb_100_scale`.`lineorder_n0` `lineorder_n0`
> JOIN `mv_ssb_100_scale`.`dates_n1` `dates_n1` ON 
> (`lineorder_n0`.`lo_orderdate` = `dates_n1`.`d_datekey`)
> JOIN `mv_ssb_100_scale`.`customer_n1` `customer_n1` ON 
> (`lineorder_n0`.`lo_custkey` = `customer_n1`.`c_custkey`)
> JOIN `mv_ssb_100_scale`.`supplier_n0` `supplier_n0` ON 
> (`lineorder_n0`.`lo_suppkey` = `supplier_n0`.`s_suppkey`)
> JOIN `mv_ssb_100_scale`.`ssb_part_n0` `ssb_part_n0` ON 
> (`lineorder_n0`.`lo_partkey` = `ssb_part_n0`.`p_partkey`)
> GROUP BY MONTH(`dates_n1`.`__time`),
> CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT),
> YEAR(`dates_n1`.`__time`)
> INFO : Starting task [Stage-3:EXPLAIN] in serial mode
> INFO : Completed executing 
> command(queryId=sbouguerra_20190627113101_1493ee87-0288-4e30-b53c-0ee729ce3977);
>  Time taken: 0.005 seconds
> INFO : OK
> ++
> | Explain |
> ++
> | Plan optimized by CBO. |
> | |
> | Vertex dependency in root stage |
> | Reducer 2 <- Map 1 (SIMPLE_EDGE) |
> | |
> | Stage-0 |
> | Fetch Operator |
> | limit:-1 |
> | Stage-1 |
> | Reducer 2 vectorized, llap |
> | File Output Operator [FS_13] |
> | Select Operator [SEL_12] (rows=300018951 width=38) |
> | Output:["_col0","_col1","_col2","_col3"] |
> | Group By Operator [GBY_11] (rows=300018951 width=38) |
> | 
> Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(VALUE._col0)"],keys:KEY._col0,
>  KEY._col1, KEY._col2 |
> | <-Map 1 [SIMPLE_EDGE] vectorized, llap |
> | SHUFFLE [RS_10] |
> | PartitionCols:_col0, _col1, _col2 |
> | Group By Operator [GBY_9] (rows=600037902 width=38) |
> | 
> Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(1)"],keys:_col0, 
> _col1, _col2 |
> | Select Operator [SEL_8] (rows=600037902 width=38) |
> | Output:["_col0","_col1","_col2"] |
> | TableScan [TS_0] (rows=600037902 width=38) |
> | 
> mv_ssb_100_scale@ssb_mv_druid_100,ssb_mv_druid_100,Tbl:COMPLETE,Col:NONE,Output:["vc"],properties:\{"druid.fieldNames":"vc","druid.fieldTypes":"timestamp","druid.query.json":"{\"queryType\":\"scan\",\"dataSource\":\"mv_ssb_100_scale.ssb_mv_druid_100\",\"intervals\":[\"1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z\"],\"virtualColumns\":[{\"type\":\"expression\",\"name\":\"vc\",\"expression\":\"\\\"__time\\\"\",\"outputType\":\"LONG\"}],\"columns\":[\"vc\"],\"resultFormat\":\"compactedList\"}","druid.query.type":"scan"}
>  |
> | |
> ++
>  
> {code}
> If I use a simple Druid table without the MV:
> {code}
> explain SELECT MONTH(`__time`) AS `mn___time_ok`,
> CAST((MONTH(`__time`) - 1) / 3 + 1 AS BIGINT) AS `qr___time_ok`,
> SUM(1) AS `sum_number_of_records_ok`,
> YEAR(`__time`) AS `yr___time_ok`
> FROM `druid_ssb.ssb_druid_100`
> GROUP BY MONTH(`__time`),
> CAST((MONTH(`__time`) - 1) / 3 + 1 AS BIGINT),
> YEAR(`__time`);
> {code}
> {code}
> ++
> | Explain |
> ++
> | Plan optimized by CBO. |
> | |
> | Stage-0 |
> | Fetch Operator |
> | limit:-1 |
> | Select Operator [SEL_1] |
> | Output:["_col0","_col1","_col2","_col3"] |
> | TableScan [TS_0] |
> | 
> Output:["extract_month","vc","$f3","extract_year"],properties:\{"druid.fieldNames":"extract_month,vc,extract_year,$f3","druid.fieldTypes":"int,bigint,int,bigint","druid.query.json":"{\"queryType\":\"groupBy\",\"dataSource\":\"druid_ssb.ssb_druid_100\",\"granularity\":\"all\",\"dimensions\":[{\"type\":\"extraction\",\"dimension\":\"__time\",\"outputName\":\"extract_month\",\"extractionFn\":{\"type\":\"timeFormat\",\"format\":\"M\",\"timeZone\":\"America/New_York\",\"locale\":\"en-US\"}},\{\"type\":\"default\",\"dimension\":\"vc\",\"outputName\":\"vc\",\"outputType\":\"LONG\"},\{\"type\":\"extraction\",\"dimension\":\"__time\",\"outputName\"

[jira] [Updated] (HIVE-21934) Materialized view on top of Druid not pushing everything

2019-06-28 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-21934:
--
Summary: Materialized view on top of Druid not pushing everything  (was: 
Materialized view on top of Druid not pushing every thing)

> Materialized view on top of Druid not pushing everything
> 
>
> Key: HIVE-21934
> URL: https://issues.apache.org/jira/browse/HIVE-21934
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> The title is not very informative, but the examples hopefully are.
> This is the plan with the materialized view:
> {code}
> explain SELECT MONTH(`dates_n1`.`__time`) AS `mn___time_ok`,
> CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT) AS `qr___time_ok`,
> SUM(1) AS `sum_number_of_records_ok`,
> YEAR(`dates_n1`.`__time`) AS `yr___time_ok`
> FROM `mv_ssb_100_scale`.`lineorder_n0` `lineorder_n0`
> JOIN `mv_ssb_100_scale`.`dates_n1` `dates_n1` ON 
> (`lineorder_n0`.`lo_orderdate` = `dates_n1`.`d_datekey`)
> JOIN `mv_ssb_100_scale`.`customer_n1` `customer_n1` ON 
> (`lineorder_n0`.`lo_custkey` = `customer_n1`.`c_custkey`)
> JOIN `mv_ssb_100_scale`.`supplier_n0` `supplier_n0` ON 
> (`lineorder_n0`.`lo_suppkey` = `supplier_n0`.`s_suppkey`)
> JOIN `mv_ssb_100_scale`.`ssb_part_n0` `ssb_part_n0` ON 
> (`lineorder_n0`.`lo_partkey` = `ssb_part_n0`.`p_partkey`)
> GROUP BY MONTH(`dates_n1`.`__time`),
> CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT),
> YEAR(`dates_n1`.`__time`)
> INFO : Starting task [Stage-3:EXPLAIN] in serial mode
> INFO : Completed executing 
> command(queryId=sbouguerra_20190627113101_1493ee87-0288-4e30-b53c-0ee729ce3977);
>  Time taken: 0.005 seconds
> INFO : OK
> ++
> | Explain |
> ++
> | Plan optimized by CBO. |
> | |
> | Vertex dependency in root stage |
> | Reducer 2 <- Map 1 (SIMPLE_EDGE) |
> | |
> | Stage-0 |
> | Fetch Operator |
> | limit:-1 |
> | Stage-1 |
> | Reducer 2 vectorized, llap |
> | File Output Operator [FS_13] |
> | Select Operator [SEL_12] (rows=300018951 width=38) |
> | Output:["_col0","_col1","_col2","_col3"] |
> | Group By Operator [GBY_11] (rows=300018951 width=38) |
> | 
> Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(VALUE._col0)"],keys:KEY._col0,
>  KEY._col1, KEY._col2 |
> | <-Map 1 [SIMPLE_EDGE] vectorized, llap |
> | SHUFFLE [RS_10] |
> | PartitionCols:_col0, _col1, _col2 |
> | Group By Operator [GBY_9] (rows=600037902 width=38) |
> | 
> Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(1)"],keys:_col0, 
> _col1, _col2 |
> | Select Operator [SEL_8] (rows=600037902 width=38) |
> | Output:["_col0","_col1","_col2"] |
> | TableScan [TS_0] (rows=600037902 width=38) |
> | 
> mv_ssb_100_scale@ssb_mv_druid_100,ssb_mv_druid_100,Tbl:COMPLETE,Col:NONE,Output:["vc"],properties:\{"druid.fieldNames":"vc","druid.fieldTypes":"timestamp","druid.query.json":"{\"queryType\":\"scan\",\"dataSource\":\"mv_ssb_100_scale.ssb_mv_druid_100\",\"intervals\":[\"1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z\"],\"virtualColumns\":[{\"type\":\"expression\",\"name\":\"vc\",\"expression\":\"\\\"__time\\\"\",\"outputType\":\"LONG\"}],\"columns\":[\"vc\"],\"resultFormat\":\"compactedList\"}","druid.query.type":"scan"}
>  |
> | |
> ++
>  
> {code}
> If I use a simple Druid table without the MV:
> {code}
> explain SELECT MONTH(`__time`) AS `mn___time_ok`,
> CAST((MONTH(`__time`) - 1) / 3 + 1 AS BIGINT) AS `qr___time_ok`,
> SUM(1) AS `sum_number_of_records_ok`,
> YEAR(`__time`) AS `yr___time_ok`
> FROM `druid_ssb.ssb_druid_100`
> GROUP BY MONTH(`__time`),
> CAST((MONTH(`__time`) - 1) / 3 + 1 AS BIGINT),
> YEAR(`__time`);
> {code}
> {code}
> ++
> | Explain |
> ++
> | Plan optimized by CBO. |
> | |
> | Stage-0 |
> | Fetch Operator |
> | limit:-1 |
> | Select Operator [SEL_1] |
> | Output:["_col0","_col1","_col2","_col3"] |
> | TableScan [TS_0] |
> | 
> Output:["extract_month","vc","$f3","extract_year"],properties:\{"druid.fieldNames":"extract_month,vc,extract_year,$f3","druid.fieldTypes":"int,bigint,int,bigint","druid.query.json":"{\"queryType\":\"groupBy\",\"dataSource\":\"druid_ssb.ssb_druid_100\",\"granularity\":\"all\",\"dimensions\":[{\"type\":\"extraction\",\"dimension\":\"__time\",\"outputName\":\"extract_month\",\"extractionFn\":{\"type\":\"timeFormat\",\"format\":\"M\",\"timeZone\":\"America/New_York\",\"locale\":\"en-US\"}},\{\"type\":\"default\",\"dimension\":\"vc\",\"outputName\":\"vc\",\"outputType\":\"LONG\"},\{\"type\":\"extraction\",\"dimension\":\"__time\",\"outputName\":\"extract_ye

[jira] [Commented] (HIVE-21934) Materialized view on top of Druid not pushing every thing

2019-06-28 Thread slim bouguerra (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875141#comment-16875141
 ] 

slim bouguerra commented on HIVE-21934:
---

source tables

{code}

CREATE TABLE `lineorder_n0`(
 `lo_orderkey` bigint, 
 `lo_linenumber` int, 
 `lo_custkey` bigint not null disable rely,
 `lo_partkey` bigint not null disable rely,
 `lo_suppkey` bigint not null disable rely,
 `lo_orderdate` bigint not null disable rely,
 `lo_ordpriority` string, 
 `lo_shippriority` string, 
 `lo_quantity` double, 
 `lo_extendedprice` double, 
 `lo_ordtotalprice` double, 
 `lo_discount` double, 
 `lo_revenue` double, 
 `lo_supplycost` double, 
 `lo_tax` double, 
 `lo_commitdate` bigint, 
 `lo_shipmode` string,
 primary key (`lo_orderkey`) disable rely,
 constraint fk21 foreign key (`lo_custkey`) references 
`customer_n1`(`c_custkey`) disable rely,
 constraint fk22 foreign key (`lo_orderdate`) references 
`dates_n1`(`d_datekey`) disable rely,
 constraint fk23 foreign key (`lo_partkey`) references 
`ssb_part_n0`(`p_partkey`) disable rely,
 constraint fk24 foreign key (`lo_suppkey`) references 
`supplier_n0`(`s_suppkey`) disable rely)
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

{code}

 

{code}

CREATE TABLE `dates_n1`(
 `d_datekey` bigint, 
 `__time` timestamp,
 `d_date` string, 
 `d_dayofweek` string, 
 `d_month` string, 
 `d_year` int, 
 `d_yearmonthnum` int, 
 `d_yearmonth` string, 
 `d_daynuminweek` int,
 `d_daynuminmonth` int,
 `d_daynuminyear` int,
 `d_monthnuminyear` int,
 `d_weeknuminyear` int,
 `d_sellingseason` string,
 `d_lastdayinweekfl` int,
 `d_lastdayinmonthfl` int,
 `d_holidayfl` int,
 `d_weekdayfl` int,
 primary key (`d_datekey`) disable rely
)
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

{code}

> Materialized view on top of Druid not pushing every thing
> -
>
> Key: HIVE-21934
> URL: https://issues.apache.org/jira/browse/HIVE-21934
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> The title is not very informative, but the examples hopefully are.
> This is the plan with the materialized view:
> {code}
> explain SELECT MONTH(`dates_n1`.`__time`) AS `mn___time_ok`,
> CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT) AS `qr___time_ok`,
> SUM(1) AS `sum_number_of_records_ok`,
> YEAR(`dates_n1`.`__time`) AS `yr___time_ok`
> FROM `mv_ssb_100_scale`.`lineorder_n0` `lineorder_n0`
> JOIN `mv_ssb_100_scale`.`dates_n1` `dates_n1` ON 
> (`lineorder_n0`.`lo_orderdate` = `dates_n1`.`d_datekey`)
> JOIN `mv_ssb_100_scale`.`customer_n1` `customer_n1` ON 
> (`lineorder_n0`.`lo_custkey` = `customer_n1`.`c_custkey`)
> JOIN `mv_ssb_100_scale`.`supplier_n0` `supplier_n0` ON 
> (`lineorder_n0`.`lo_suppkey` = `supplier_n0`.`s_suppkey`)
> JOIN `mv_ssb_100_scale`.`ssb_part_n0` `ssb_part_n0` ON 
> (`lineorder_n0`.`lo_partkey` = `ssb_part_n0`.`p_partkey`)
> GROUP BY MONTH(`dates_n1`.`__time`),
> CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT),
> YEAR(`dates_n1`.`__time`)
> INFO : Starting task [Stage-3:EXPLAIN] in serial mode
> INFO : Completed executing 
> command(queryId=sbouguerra_20190627113101_1493ee87-0288-4e30-b53c-0ee729ce3977);
>  Time taken: 0.005 seconds
> INFO : OK
> ++
> | Explain |
> ++
> | Plan optimized by CBO. |
> | |
> | Vertex dependency in root stage |
> | Reducer 2 <- Map 1 (SIMPLE_EDGE) |
> | |
> | Stage-0 |
> | Fetch Operator |
> | limit:-1 |
> | Stage-1 |
> | Reducer 2 vectorized, llap |
> | File Output Operator [FS_13] |
> | Select Operator [SEL_12] (rows=300018951 width=38) |
> | Output:["_col0","_col1","_col2","_col3"] |
> | Group By Operator [GBY_11] (rows=300018951 width=38) |
> | 
> Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(VALUE._col0)"],keys:KEY._col0,
>  KEY._col1, KEY._col2 |
> | <-Map 1 [SIMPLE_EDGE] vectorized, llap |
> | SHUFFLE [RS_10] |
> | PartitionCols:_col0, _col1, _col2 |
> | Group By Operator [GBY_9] (rows=600037902 width=38) |
> | 
> Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(1)"],keys:_col0, 
> _col1, _col2 |
> | Select Operator [SEL_8] (rows=600037902 width=38) |
> | Output:["_col0","_col1","_col2"] |
> | TableScan [TS_0] (rows=600037902 width=38) |
> | 
> mv_ssb_100_scale@ssb_mv_druid_100,ssb_mv_druid_100,Tbl:COMPLETE,Col:NONE,Output:["vc"],properties:\{"druid.fieldNames":"vc","druid.fieldTypes":"timestamp","druid.query.json":"{\"queryType\":\"scan\",\"dataSource\":\"mv_ssb_100_scale.ssb_mv_druid_100\",\"intervals\":[\"1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z\"],\"virtualColumns\":[{\"type\":\"expression\",\"name\":\"vc\",\"expression\":\"\\\"__time\\\"\",\"outputType\":\"LONG\"}],\"columns\":[\"vc\"

[jira] [Commented] (HIVE-21934) Materialized view on top of Druid not pushing every thing

2019-06-28 Thread slim bouguerra (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875140#comment-16875140
 ] 

slim bouguerra commented on HIVE-21934:
---

view definition

{code}


CREATE MATERIALIZED VIEW `ssb_mv_druid_100`
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES (
 "druid.segment.granularity" = "MONTH",
 "druid.query.granularity" = "DAY")
AS
SELECT
 `__time` as `__time` ,
 cast(c_city as string) c_city,
 cast(c_nation as string) c_nation,
 cast(c_region as string) c_region,
 c_mktsegment as c_mktsegment,
 cast(d_weeknuminyear as string) d_weeknuminyear,
 cast(d_year as string) d_year,
 cast(d_yearmonth as string) d_yearmonth,
 cast(d_yearmonthnum as string) d_yearmonthnum,
 cast(p_brand1 as string) p_brand1,
 cast(p_category as string) p_category,
 cast(p_mfgr as string) p_mfgr,
 p_type,
 s_name,
 cast(s_city as string) s_city,
 cast(s_nation as string) s_nation,
 cast(s_region as string) s_region,
 cast(`lo_ordpriority` as string) lo_ordpriority, 
 cast(`lo_shippriority` as string) lo_shippriority, 
 `d_sellingseason`,
 `lo_shipmode`, 
 lo_revenue,
 lo_supplycost ,
 lo_discount ,
 `lo_quantity`, 
 `lo_extendedprice`, 
 `lo_ordtotalprice`, 
 lo_extendedprice * lo_discount discounted_price,
 lo_revenue - lo_supplycost net_revenue
FROM
 customer_n1, dates_n1, lineorder_n1, ssb_part_n0, supplier_n0
where
 lo_orderdate = d_datekey
 and lo_partkey = p_partkey
 and lo_suppkey = s_suppkey
 and lo_custkey = c_custkey;

{code}

> Materialized view on top of Druid not pushing every thing
> -
>
> Key: HIVE-21934
> URL: https://issues.apache.org/jira/browse/HIVE-21934
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> The title is not very informative, but the examples hopefully are.
> This is the plan with the view:
> {code}
> explain SELECT MONTH(`dates_n1`.`__time`) AS `mn___time_ok`,
> CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT) AS `qr___time_ok`,
> SUM(1) AS `sum_number_of_records_ok`,
> YEAR(`dates_n1`.`__time`) AS `yr___time_ok`
> FROM `mv_ssb_100_scale`.`lineorder_n0` `lineorder_n0`
> JOIN `mv_ssb_100_scale`.`dates_n1` `dates_n1` ON 
> (`lineorder_n0`.`lo_orderdate` = `dates_n1`.`d_datekey`)
> JOIN `mv_ssb_100_scale`.`customer_n1` `customer_n1` ON 
> (`lineorder_n0`.`lo_custkey` = `customer_n1`.`c_custkey`)
> JOIN `mv_ssb_100_scale`.`supplier_n0` `supplier_n0` ON 
> (`lineorder_n0`.`lo_suppkey` = `supplier_n0`.`s_suppkey`)
> JOIN `mv_ssb_100_scale`.`ssb_part_n0` `ssb_part_n0` ON 
> (`lineorder_n0`.`lo_partkey` = `ssb_part_n0`.`p_partkey`)
> GROUP BY MONTH(`dates_n1`.`__time`),
> CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT),
> YEAR(`dates_n1`.`__time`)
> INFO : Starting task [Stage-3:EXPLAIN] in serial mode
> INFO : Completed executing 
> command(queryId=sbouguerra_20190627113101_1493ee87-0288-4e30-b53c-0ee729ce3977);
>  Time taken: 0.005 seconds
> INFO : OK
> ++
> | Explain |
> ++
> | Plan optimized by CBO. |
> | |
> | Vertex dependency in root stage |
> | Reducer 2 <- Map 1 (SIMPLE_EDGE) |
> | |
> | Stage-0 |
> | Fetch Operator |
> | limit:-1 |
> | Stage-1 |
> | Reducer 2 vectorized, llap |
> | File Output Operator [FS_13] |
> | Select Operator [SEL_12] (rows=300018951 width=38) |
> | Output:["_col0","_col1","_col2","_col3"] |
> | Group By Operator [GBY_11] (rows=300018951 width=38) |
> | 
> Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(VALUE._col0)"],keys:KEY._col0,
>  KEY._col1, KEY._col2 |
> | <-Map 1 [SIMPLE_EDGE] vectorized, llap |
> | SHUFFLE [RS_10] |
> | PartitionCols:_col0, _col1, _col2 |
> | Group By Operator [GBY_9] (rows=600037902 width=38) |
> | 
> Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(1)"],keys:_col0, 
> _col1, _col2 |
> | Select Operator [SEL_8] (rows=600037902 width=38) |
> | Output:["_col0","_col1","_col2"] |
> | TableScan [TS_0] (rows=600037902 width=38) |
> | 
> mv_ssb_100_scale@ssb_mv_druid_100,ssb_mv_druid_100,Tbl:COMPLETE,Col:NONE,Output:["vc"],properties:\{"druid.fieldNames":"vc","druid.fieldTypes":"timestamp","druid.query.json":"{\"queryType\":\"scan\",\"dataSource\":\"mv_ssb_100_scale.ssb_mv_druid_100\",\"intervals\":[\"1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z\"],\"virtualColumns\":[{\"type\":\"expression\",\"name\":\"vc\",\"expression\":\"\\\"__time\\\"\",\"outputType\":\"LONG\"}],\"columns\":[\"vc\"],\"resultFormat\":\"compactedList\"}","druid.query.type":"scan"}
>  |
> | |
> ++
>  
> {code}
> If I use a simple Druid table without the MV:
> {code}
> explain SELECT MONTH(`__time`) AS `mn___time_ok`,
> CAST((MONTH(`__time`

[jira] [Assigned] (HIVE-21934) Materialized view on top of Druid not pushing every thing

2019-06-28 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra reassigned HIVE-21934:
-


> Materialized view on top of Druid not pushing every thing
> -
>
> Key: HIVE-21934
> URL: https://issues.apache.org/jira/browse/HIVE-21934
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> The title is not very informative, but the examples hopefully are.
> This is the plan with the view:
> {code}
> explain SELECT MONTH(`dates_n1`.`__time`) AS `mn___time_ok`,
> CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT) AS `qr___time_ok`,
> SUM(1) AS `sum_number_of_records_ok`,
> YEAR(`dates_n1`.`__time`) AS `yr___time_ok`
> FROM `mv_ssb_100_scale`.`lineorder_n0` `lineorder_n0`
> JOIN `mv_ssb_100_scale`.`dates_n1` `dates_n1` ON 
> (`lineorder_n0`.`lo_orderdate` = `dates_n1`.`d_datekey`)
> JOIN `mv_ssb_100_scale`.`customer_n1` `customer_n1` ON 
> (`lineorder_n0`.`lo_custkey` = `customer_n1`.`c_custkey`)
> JOIN `mv_ssb_100_scale`.`supplier_n0` `supplier_n0` ON 
> (`lineorder_n0`.`lo_suppkey` = `supplier_n0`.`s_suppkey`)
> JOIN `mv_ssb_100_scale`.`ssb_part_n0` `ssb_part_n0` ON 
> (`lineorder_n0`.`lo_partkey` = `ssb_part_n0`.`p_partkey`)
> GROUP BY MONTH(`dates_n1`.`__time`),
> CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT),
> YEAR(`dates_n1`.`__time`)
> INFO : Starting task [Stage-3:EXPLAIN] in serial mode
> INFO : Completed executing 
> command(queryId=sbouguerra_20190627113101_1493ee87-0288-4e30-b53c-0ee729ce3977);
>  Time taken: 0.005 seconds
> INFO : OK
> ++
> | Explain |
> ++
> | Plan optimized by CBO. |
> | |
> | Vertex dependency in root stage |
> | Reducer 2 <- Map 1 (SIMPLE_EDGE) |
> | |
> | Stage-0 |
> | Fetch Operator |
> | limit:-1 |
> | Stage-1 |
> | Reducer 2 vectorized, llap |
> | File Output Operator [FS_13] |
> | Select Operator [SEL_12] (rows=300018951 width=38) |
> | Output:["_col0","_col1","_col2","_col3"] |
> | Group By Operator [GBY_11] (rows=300018951 width=38) |
> | 
> Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(VALUE._col0)"],keys:KEY._col0,
>  KEY._col1, KEY._col2 |
> | <-Map 1 [SIMPLE_EDGE] vectorized, llap |
> | SHUFFLE [RS_10] |
> | PartitionCols:_col0, _col1, _col2 |
> | Group By Operator [GBY_9] (rows=600037902 width=38) |
> | 
> Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(1)"],keys:_col0, 
> _col1, _col2 |
> | Select Operator [SEL_8] (rows=600037902 width=38) |
> | Output:["_col0","_col1","_col2"] |
> | TableScan [TS_0] (rows=600037902 width=38) |
> | 
> mv_ssb_100_scale@ssb_mv_druid_100,ssb_mv_druid_100,Tbl:COMPLETE,Col:NONE,Output:["vc"],properties:\{"druid.fieldNames":"vc","druid.fieldTypes":"timestamp","druid.query.json":"{\"queryType\":\"scan\",\"dataSource\":\"mv_ssb_100_scale.ssb_mv_druid_100\",\"intervals\":[\"1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z\"],\"virtualColumns\":[{\"type\":\"expression\",\"name\":\"vc\",\"expression\":\"\\\"__time\\\"\",\"outputType\":\"LONG\"}],\"columns\":[\"vc\"],\"resultFormat\":\"compactedList\"}","druid.query.type":"scan"}
>  |
> | |
> ++
>  
> {code}
> If I use a simple Druid table without the MV:
> {code}
> explain SELECT MONTH(`__time`) AS `mn___time_ok`,
> CAST((MONTH(`__time`) - 1) / 3 + 1 AS BIGINT) AS `qr___time_ok`,
> SUM(1) AS `sum_number_of_records_ok`,
> YEAR(`__time`) AS `yr___time_ok`
> FROM `druid_ssb.ssb_druid_100`
> GROUP BY MONTH(`__time`),
> CAST((MONTH(`__time`) - 1) / 3 + 1 AS BIGINT),
> YEAR(`__time`);
> {code}
> {code}
> ++
> | Explain |
> ++
> | Plan optimized by CBO. |
> | |
> | Stage-0 |
> | Fetch Operator |
> | limit:-1 |
> | Select Operator [SEL_1] |
> | Output:["_col0","_col1","_col2","_col3"] |
> | TableScan [TS_0] |
> | 
> Output:["extract_month","vc","$f3","extract_year"],properties:\{"druid.fieldNames":"extract_month,vc,extract_year,$f3","druid.fieldTypes":"int,bigint,int,bigint","druid.query.json":"{\"queryType\":\"groupBy\",\"dataSource\":\"druid_ssb.ssb_druid_100\",\"granularity\":\"all\",\"dimensions\":[{\"type\":\"extraction\",\"dimension\":\"__time\",\"outputName\":\"extract_month\",\"extractionFn\":{\"type\":\"timeFormat\",\"format\":\"M\",\"timeZone\":\"America/New_York\",\"locale\":\"en-US\"}},\{\"type\":\"default\",\"dimension\":\"vc\",\"outputName\":\"vc\",\"outputType\":\"LONG\"},\{\"type\":\"extraction\",\"dimension\":\"__time\",\"outputName\":\"extract_year\",\"extractionFn\":{\"type\":\"timeFormat\",\"format\":\"\",\"timeZone\":\"America/New_York\",\"locale\":\"en-US\"}}],\"v

[jira] [Commented] (HIVE-21867) Sort semijoin conditions to accelerate query processing

2019-06-28 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875139#comment-16875139
 ] 

Hive QA commented on HIVE-21867:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 44s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 58s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  0s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  5s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 41s{color} | {color:red} ql: The patch generated 1 new + 155 unchanged - 0 fixed = 156 total (was 155) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 46s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-17793/dev-support/hive-personality.sh |
| git revision | master / 21177ef |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-17793/yetus/diff-checkstyle-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-17793/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Sort semijoin conditions to accelerate query processing
> ---
>
> Key: HIVE-21867
> URL: https://issues.apache.org/jira/browse/HIVE-21867
> Project: Hive
>  Issue Type: New Feature
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21867.02.patch, HIVE-21867.03.patch, 
> HIVE-21867.04.patch, HIVE-21867.05.patch, HIVE-21867.05.patch, 
> HIVE-21867.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The problem was tackled for CBO in HIVE-21857. Semijoin filters are 
> introduced later in the planning phase. Follow a similar approach to sort 
> them, in order to accelerate filter evaluation.
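
To illustrate why the order of conditions in a conjunctive filter matters, here is a minimal sketch. It does not reproduce the HIVE-21867 patch: the class, the condition names, the selectivity and cost figures, and the rank formula (selectivity - 1) / cost are all assumptions chosen for illustration. The rank is a common textbook heuristic for predicate ordering, which places conditions that reject many rows cheaply ahead of the rest.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; none of these names come from the Hive code base.
class ConditionOrderingSketch {

    // One conjunct of an AND-ed filter, with illustrative estimates.
    static final class Condition {
        final String name;
        final double selectivity; // estimated fraction of rows that pass, in [0, 1]
        final double cost;        // estimated relative cost per row, must be > 0

        Condition(String name, double selectivity, double cost) {
            this.name = name;
            this.selectivity = selectivity;
            this.cost = cost;
        }

        // A more negative rank means more rows rejected per unit of cost.
        double rank() {
            return (selectivity - 1.0) / cost;
        }
    }

    // Returns a new list with the most cost-effective conditions first.
    static List<Condition> sortForEvaluation(List<Condition> conditions) {
        List<Condition> sorted = new ArrayList<>(conditions);
        sorted.sort(Comparator.comparingDouble(Condition::rank));
        return sorted;
    }

    // Convenience for printing: the condition names in evaluation order.
    static List<String> orderedNames(List<Condition> conditions) {
        List<String> names = new ArrayList<>();
        for (Condition c : sortForEvaluation(conditions)) {
            names.add(c.name);
        }
        return names;
    }

    public static void main(String[] args) {
        // The estimates below are made up for the example.
        List<Condition> conditions = Arrays.asList(
            new Condition("expensive_udf", 0.9, 20.0),
            new Condition("bloom_probe", 0.1, 5.0),
            new Condition("range_check", 0.5, 1.0));
        // prints [range_check, bloom_probe, expensive_udf]
        System.out.println(orderedNames(conditions));
    }
}
```

With these made-up estimates, the cheap range check runs first even though the bloom probe is more selective, because it rejects more rows per unit of cost.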



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21932) IndexOutOfRangeException in FileChksumIterator

2019-06-28 Thread Vihang Karajgaonkar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-21932:
---
       Resolution: Fixed
    Fix Version/s: 3.2.0
                   4.0.0
           Status: Resolved  (was: Patch Available)

Patch merged in master and branch-3

> IndexOutOfRangeException in FileChksumIterator
> --
>
> Key: HIVE-21932
> URL: https://issues.apache.org/jira/browse/HIVE-21932
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-21932.01.patch
>
>
> According to the definition of {{InsertEventRequestData}} in 
> {{hive_metastore.thrift}}, {{filesAddedChecksum}} is an optional field. But 
> the FileChksumIterator does not handle it correctly when a client fires an 
> insert event which does not have file checksums. The issue is that the 
> {{InsertEvent}} class initializes the fileChecksums list to an empty ArrayList, 
> so the following check will never come into play
> {noformat}
> result = ReplChangeManager.encodeFileUri(files.get(i), chksums != null ? 
> chksums.get(i) : null,
> subDirs != null ? subDirs.get(i) : null);
> {noformat}
> The chksums check in the line above should also include a 
> {{!chksums.isEmpty()}} check.
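
The guard described above can be sketched as follows. This is not the merged patch: the class and the `encode` helper are hypothetical stand-ins (the real `ReplChangeManager.encodeFileUri` format is not reproduced), and only the null-plus-empty check mirrors what the description proposes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the iterator logic; not the actual Hive class.
class ChecksumGuardSketch {

    // Returns values.get(i), or null when the list is absent or empty.
    // The isEmpty() check is the addition proposed in the description.
    static String elementOrNull(List<String> values, int i) {
        return (values != null && !values.isEmpty()) ? values.get(i) : null;
    }

    // Hypothetical encoder; it only joins its arguments for demonstration.
    static String encode(String file, String checksum, String subDir) {
        return file + "|" + checksum + "|" + subDir;
    }

    static String encodeAt(List<String> files, List<String> chksums,
                           List<String> subDirs, int i) {
        return encode(files.get(i), elementOrNull(chksums, i), elementOrNull(subDirs, i));
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("f0", "f1");
        // An empty (non-null) checksum list, as initialized by the event class.
        List<String> noChecksums = new ArrayList<>();
        // Without the isEmpty() check, chksums.get(1) would throw here.
        System.out.println(encodeAt(files, noChecksums, null, 1)); // prints f1|null|null
    }
}
```

A null-only check passes for the empty list and then fails on `get(i)`, which is the out-of-range failure named in the issue title.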





[jira] [Commented] (HIVE-21578) Introduce SQL:2016 formats FM, FX, and nested strings

2019-06-28 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875115#comment-16875115
 ] 

Hive QA commented on HIVE-21578:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12973172/HIVE-21578.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 16359 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17792/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17792/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17792/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12973172 - PreCommit-HIVE-Build

> Introduce SQL:2016 formats FM, FX, and nested strings
> -
>
> Key: HIVE-21578
> URL: https://issues.apache.org/jira/browse/HIVE-21578
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-21578.01.patch, HIVE-21578.02.patch
>
>
> Enable Hive to parse the following datetime formats when any combination or 
> subset of these or previously implemented formats is provided in one string. 
>  * "text" (nested strings)
>  * FM
>  * FX
> [Definitions 
> here|https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/edit]





[jira] [Commented] (HIVE-21867) Sort semijoin conditions to accelerate query processing

2019-06-28 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875099#comment-16875099
 ] 

Vineet Garg commented on HIVE-21867:


+1 LGTM

> Sort semijoin conditions to accelerate query processing
> ---
>
> Key: HIVE-21867
> URL: https://issues.apache.org/jira/browse/HIVE-21867
> Project: Hive
>  Issue Type: New Feature
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21867.02.patch, HIVE-21867.03.patch, 
> HIVE-21867.04.patch, HIVE-21867.05.patch, HIVE-21867.05.patch, 
> HIVE-21867.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The problem was tackled for CBO in HIVE-21857. Semijoin filters are 
> introduced later in the planning phase. Follow a similar approach to sort 
> them, in order to accelerate filter evaluation.





[jira] [Updated] (HIVE-21932) IndexOutOfRangeException in FileChksumIterator

2019-06-28 Thread Vihang Karajgaonkar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-21932:
---
Description: 
According to the definition of {{InsertEventRequestData}} in 
{{hive_metastore.thrift}}, {{filesAddedChecksum}} is an optional field. But 
the FileChksumIterator does not handle it correctly when a client fires an 
insert event which does not have file checksums. The issue is that the 
{{InsertEvent}} class initializes the fileChecksums list to an empty ArrayList, 
so the following check will never come into play

{noformat}
result = ReplChangeManager.encodeFileUri(files.get(i), chksums != null ? 
chksums.get(i) : null,
subDirs != null ? subDirs.get(i) : null);
{noformat}

The chksums check in the line above should also include a 
{{!chksums.isEmpty()}} check.

  was:
According to the definition of {{InsertEventRequestData}} in 
{{hive_metastore.thrift}}, {{filesAddedChecksum}} is an optional field. But 
the FileChksumIterator does not handle it correctly when a client fires an 
insert event which does not have file checksums. The issue is that the 
{{InsertEvent}} class initializes the fileChecksums list to an empty ArrayList, 
to the following check will never come into play

{noformat}
result = ReplChangeManager.encodeFileUri(files.get(i), chksums != null ? 
chksums.get(i) : null,
subDirs != null ? subDirs.get(i) : null);
{noformat}

The chksums check in the line above should also include a 
{{!chksums.isEmpty()}} check.


> IndexOutOfRangeException in FileChksumIterator
> --
>
> Key: HIVE-21932
> URL: https://issues.apache.org/jira/browse/HIVE-21932
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-21932.01.patch
>
>
> According to the definition of {{InsertEventRequestData}} in 
> {{hive_metastore.thrift}}, {{filesAddedChecksum}} is an optional field. But 
> the FileChksumIterator does not handle it correctly when a client fires an 
> insert event which does not have file checksums. The issue is that the 
> {{InsertEvent}} class initializes the fileChecksums list to an empty ArrayList, 
> so the following check will never come into play
> {noformat}
> result = ReplChangeManager.encodeFileUri(files.get(i), chksums != null ? 
> chksums.get(i) : null,
> subDirs != null ? subDirs.get(i) : null);
> {noformat}
> The chksums check in the line above should also include a 
> {{!chksums.isEmpty()}} check.





[jira] [Updated] (HIVE-21932) IndexOutOfRangeException in FileChksumIterator

2019-06-28 Thread Vihang Karajgaonkar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-21932:
---
Summary: IndexOutOfRangeException in FileChksumIterator  (was: 
IndexOutOfRangeExeption in FileChksumIterator)

> IndexOutOfRangeException in FileChksumIterator
> --
>
> Key: HIVE-21932
> URL: https://issues.apache.org/jira/browse/HIVE-21932
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-21932.01.patch
>
>
> According to the definition of {{InsertEventRequestData}} in 
> {{hive_metastore.thrift}}, {{filesAddedChecksum}} is an optional field. But 
> the FileChksumIterator does not handle it correctly when a client fires an 
> insert event which does not have file checksums. The issue is that the 
> {{InsertEvent}} class initializes the fileChecksums list to an empty ArrayList, 
> so the following check will never come into play
> {noformat}
> result = ReplChangeManager.encodeFileUri(files.get(i), chksums != null ? 
> chksums.get(i) : null,
> subDirs != null ? subDirs.get(i) : null);
> {noformat}
> The chksums check in the line above should also include a 
> {{!chksums.isEmpty()}} check.





[jira] [Commented] (HIVE-21578) Introduce SQL:2016 formats FM, FX, and nested strings

2019-06-28 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875076#comment-16875076
 ] 

Hive QA commented on HIVE-21578:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 44s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 50s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 34s{color} | {color:blue} common in master has 62 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  1s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 13s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 27s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 12s{color} | {color:red} common: The patch generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 28m 36s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-17792/dev-support/hive-personality.sh |
| git revision | master / 21177ef |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-17792/yetus/diff-checkstyle-common.txt |
| modules | C: common ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-17792/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Introduce SQL:2016 formats FM, FX, and nested strings
> -
>
> Key: HIVE-21578
> URL: https://issues.apache.org/jira/browse/HIVE-21578
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-21578.01.patch, HIVE-21578.02.patch
>
>
> Enable Hive to parse the following datetime formats when any combination or 
> subset of these or previously implemented formats is provided in one string. 
>  * "text" (nested strings)
>  * FM
>  * FX
> [Definitions 
> here|https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/edit]





[jira] [Updated] (HIVE-21888) Set hive.parquet.timestamp.skip.conversion default to true

2019-06-28 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-21888:
---
       Resolution: Fixed
    Fix Version/s: 3.2.0
                   4.0.0
           Status: Resolved  (was: Patch Available)

Pushed to master, branch-3. Thanks [~klcopp]

> Set hive.parquet.timestamp.skip.conversion default to true
> --
>
> Key: HIVE-21888
> URL: https://issues.apache.org/jira/browse/HIVE-21888
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-21888.02.patch, HIVE-21888.patch
>
>






[jira] [Commented] (HIVE-21910) Multiple target location generation in HostAffinitySplitLocationProvider

2019-06-28 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875053#comment-16875053
 ] 

Hive QA commented on HIVE-21910:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12973170/HIVE-21910.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 16346 tests 
executed
*Failed tests:*
{noformat}
TestMiniLlapCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=156)

[intersect_all.q,unionDistinct_1.q,table_nonprintable.q,orc_llap_counters1.q,mm_cttas.q,whroot_external1.q,global_limit.q,cte_2.q,rcfile_createas1.q,dynamic_partition_pruning_2.q,intersect_merge.q,results_cache_diff_fs.q,cttl.q,parallel_colstats.q,load_hdfs_file_with_space_in_the_name.q]
org.apache.hadoop.hive.ql.TestWarehouseExternalDir.org.apache.hadoop.hive.ql.TestWarehouseExternalDir
 (batchId=255)
org.apache.hadoop.hive.ql.TestWarehouseExternalDir.testExternalDefaultPaths 
(batchId=255)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17791/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17791/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17791/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12973170 - PreCommit-HIVE-Build

> Multiple target location generation in HostAffinitySplitLocationProvider
> 
>
> Key: HIVE-21910
> URL: https://issues.apache.org/jira/browse/HIVE-21910
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21910.2.patch, HIVE-21910.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> HostAffinitySplitLocationProvider needs to generate multiple target 
> locations, so that deterministic fallback nodes are available in case the 
> target node is disabled.
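
As a rough illustration of deterministic fallback locations, here is a minimal sketch. It is not the HIVE-21910 patch and does not reproduce how HostAffinitySplitLocationProvider actually hashes splits: the class, the method, the use of a string key, and the "primary index plus the next hosts in list order" scheme are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the actual Hive location provider.
class FallbackLocationsSketch {

    // Returns up to `count` distinct hosts for a split key. The first entry is
    // the primary target; the following entries are fallbacks in a fixed order,
    // so the same key and host list always yield the same result.
    static List<String> locationsFor(String splitKey, List<String> hosts, int count) {
        List<String> result = new ArrayList<>();
        if (hosts.isEmpty()) {
            return result;
        }
        int n = hosts.size();
        // floorMod keeps the index non-negative even for negative hash codes.
        int start = Math.floorMod(splitKey.hashCode(), n);
        for (int k = 0; k < Math.min(count, n); k++) {
            result.add(hosts.get((start + k) % n));
        }
        return result;
    }

    public static void main(String[] args) {
        // Host names and the split key are made up for the example.
        List<String> hosts = Arrays.asList("node-a", "node-b", "node-c");
        System.out.println(locationsFor("warehouse/t1/file_0:0", hosts, 2));
    }
}
```

If the primary node is disabled, a scheduler can move on to the second entry, and every caller computing locations for the same split arrives at the same fallback.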





[jira] [Commented] (HIVE-21910) Multiple target location generation in HostAffinitySplitLocationProvider

2019-06-28 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875012#comment-16875012
 ] 

Hive QA commented on HIVE-21910:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 45s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  9s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 34s{color} | {color:blue} common in master has 62 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 26s{color} | {color:blue} llap-tez in master has 17 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  3s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 26s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 39s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 41s{color} | {color:red} ql: The patch generated 5 new + 41 unchanged - 1 fixed = 46 total (was 42) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 12s{color} | {color:red} ql generated 1 new + 2252 unchanged - 1 fixed = 2253 total (was 2253) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 31m 30s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Null passed for non-null parameter of new java.util.HashSet(Collection) in new org.apache.hadoop.hive.ql.exec.tez.HostAffinitySplitLocationProvider(List, boolean, int)  Method invoked at HostAffinitySplitLocationProvider.java:[line 68] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-17791/dev-support/hive-personality.sh |
| git revision | master / 5b46790 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-17791/yetus/diff-checkstyle-ql.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-17791/yetus/new-findbugs-ql.html |
| modules | C: common llap-tez ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-17791/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.
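
For context on the FindBugs item reported above: `new java.util.HashSet(Collection)` throws a `NullPointerException` when it is handed a null collection. One common way to guard a constructor against that is sketched below. This is only an illustration and not the fix applied in the patch; the class `LocationSet`, its field, and its parameter name are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class, used only to illustrate a null-safe defensive copy.
class LocationSet {
  private final Set<String> locations;

  // Treats a null list as "no known locations" instead of passing null to HashSet.
  public LocationSet(List<String> knownLocations) {
    this.locations = new HashSet<>(
        knownLocations == null ? Collections.<String>emptyList() : knownLocations);
  }

  // Read-only view of the copied locations.
  public Set<String> getLocations() {
    return Collections.unmodifiableSet(locations);
  }

  public static void main(String[] args) {
    System.out.println(new LocationSet(null).getLocations());
    System.out.println(new LocationSet(Arrays.asList("host1", "host2")).getLocations());
  }
}
```

Whether a null argument should be tolerated or rejected up front (for example with `Objects.requireNonNull`) is a design decision for the patch author; the sketch only shows the tolerant variant.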



> Multiple target location generation in HostAffinitySplitLocationProvider
> 
>
> Key: HIVE-21910
> URL: https://issues.apache.org/jira/browse/HIVE-21910
> Project: H

[jira] [Commented] (HIVE-21933) Remove unused methods from Utilities

2019-06-28 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874988#comment-16874988
 ] 

Hive QA commented on HIVE-21933:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12973167/HIVE-21933.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 16357 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17790/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17790/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17790/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12973167 - PreCommit-HIVE-Build

> Remove unused methods from Utilities
> 
>
> Key: HIVE-21933
> URL: https://issues.apache.org/jira/browse/HIVE-21933
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Ivan Suller
>Assignee: Ivan Suller
>Priority: Trivial
> Attachments: HIVE-21933.1.patch
>
>
> Over the years it seems org.apache.hadoop.hive.ql.exec.Utilities collected 
> many methods which are not used anymore. Removing them is the right thing to 
> do.





[jira] [Updated] (HIVE-21867) Sort semijoin conditions to accelerate query processing

2019-06-28 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-21867:
---
Attachment: HIVE-21867.05.patch

> Sort semijoin conditions to accelerate query processing
> ---
>
> Key: HIVE-21867
> URL: https://issues.apache.org/jira/browse/HIVE-21867
> Project: Hive
>  Issue Type: New Feature
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21867.02.patch, HIVE-21867.03.patch, 
> HIVE-21867.04.patch, HIVE-21867.05.patch, HIVE-21867.05.patch, 
> HIVE-21867.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The problem was tackled for CBO in HIVE-21857. Semijoin filters are 
> introduced later in the planning phase. Follow similar approach to sort them, 
> trying to accelerate filter evaluation.





[jira] [Commented] (HIVE-21933) Remove unused methods from Utilities

2019-06-28 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874960#comment-16874960
 ] 

Hive QA commented on HIVE-21933:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 42s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 58s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  0s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 40s{color} | {color:green} ql: The patch generated 0 new + 130 unchanged - 4 fixed = 130 total (was 134) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  2s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 48s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-17790/dev-support/hive-personality.sh |
| git revision | master / 5b46790 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-17790/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Remove unused methods from Utilities
> 
>
> Key: HIVE-21933
> URL: https://issues.apache.org/jira/browse/HIVE-21933
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Ivan Suller
>Assignee: Ivan Suller
>Priority: Trivial
> Attachments: HIVE-21933.1.patch
>
>
> Over the years it seems org.apache.hadoop.hive.ql.exec.Utilities collected 
> many methods which are not used anymore. Removing them is the right thing to 
> do.





[jira] [Updated] (HIVE-21578) Introduce SQL:2016 formats FM, FX, and nested strings

2019-06-28 Thread Karen Coppage (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-21578:
-
Status: Open  (was: Patch Available)

> Introduce SQL:2016 formats FM, FX, and nested strings
> -
>
> Key: HIVE-21578
> URL: https://issues.apache.org/jira/browse/HIVE-21578
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-21578.01.patch, HIVE-21578.02.patch
>
>
> Enable Hive to parse the following datetime formats when any combination or 
> subset of these or previously implemented formats is provided in one string. 
>  * "text" (nested strings)
>  * FM
>  * FX
> [Definitions 
> here|https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/edit]





[jira] [Updated] (HIVE-21578) Introduce SQL:2016 formats FM, FX, and nested strings

2019-06-28 Thread Karen Coppage (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-21578:
-
Attachment: HIVE-21578.02.patch
Status: Patch Available  (was: Open)

> Introduce SQL:2016 formats FM, FX, and nested strings
> -
>
> Key: HIVE-21578
> URL: https://issues.apache.org/jira/browse/HIVE-21578
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-21578.01.patch, HIVE-21578.02.patch
>
>
> Enable Hive to parse the following datetime formats when any combination or 
> subset of these or previously implemented formats is provided in one string. 
>  * "text" (nested strings)
>  * FM
>  * FX
> [Definitions 
> here|https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/edit]





[jira] [Commented] (HIVE-21578) Introduce SQL:2016 formats FM, FX, and nested strings

2019-06-28 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874925#comment-16874925
 ] 

Hive QA commented on HIVE-21578:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12973164/HIVE-21578.01.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 16359 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17789/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17789/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17789/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12973164 - PreCommit-HIVE-Build

> Introduce SQL:2016 formats FM, FX, and nested strings
> -
>
> Key: HIVE-21578
> URL: https://issues.apache.org/jira/browse/HIVE-21578
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-21578.01.patch
>
>
> Enable Hive to parse the following datetime formats when any combination or 
> subset of these or previously implemented formats is provided in one string. 
>  * "text" (nested strings)
>  * FM
>  * FX
> [Definitions 
> here|https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/edit]





[jira] [Updated] (HIVE-18735) Create table like loses transactional attribute

2019-06-28 Thread Marta Kuczora (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-18735:
-
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

> Create table like loses transactional attribute
> ---
>
> Key: HIVE-18735
> URL: https://issues.apache.org/jira/browse/HIVE-18735
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Eugene Koifman
>Assignee: Laszlo Pinter
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-18735.01.patch, HIVE-18735.02.patch, 
> HIVE-18735.03.patch, HIVE-18735.04.patch, HIVE-18735.05.patch, 
> HIVE-18735.06.patch
>
>
> {noformat}
> create table T1(a int, b int) clustered by (a) into 2 buckets stored as orc 
> TBLPROPERTIES ('transactional'='true');
> create table T like T1;
> show create table T ;
> CREATE TABLE `T`(
>   `a` int,
>   `b` int)
> CLUSTERED BY (
>   a)
> INTO 2 BUCKETS
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> LOCATION
>  
> 'file:/Users/ekoifman/IdeaProjects/hive/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands-1518813536099/warehouse/t'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1518813564')
> {noformat}
> Specifying props explicitly does work 
> {noformat}
> create table T1(a int, b int) clustered by (a) into 2 buckets stored as orc 
> TBLPROPERTIES ('transactional'='true');
> create table T like T1 TBLPROPERTIES ('transactional'='true');
> show create table T ;
> CREATE TABLE `T`(
>   `a` int,
>   `b` int)
> CLUSTERED BY (
>   a)
> INTO 2 BUCKETS
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> LOCATION
>   
> 'file:/Users/ekoifman/IdeaProjects/hive/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands-1518814098564/warehouse/t'
> TBLPROPERTIES (
>   'transactional'='true',
>   'transactional_properties'='default',
>   'transient_lastDdlTime'='1518814111')
> {noformat}





[jira] [Commented] (HIVE-18735) Create table like loses transactional attribute

2019-06-28 Thread Marta Kuczora (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874923#comment-16874923
 ] 

Marta Kuczora commented on HIVE-18735:
--

Pushed to master. (Got +1 on Review Board on Wednesday.)
Thanks a lot [~lpinter] for the patch.


> Create table like loses transactional attribute
> ---
>
> Key: HIVE-18735
> URL: https://issues.apache.org/jira/browse/HIVE-18735
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Eugene Koifman
>Assignee: Laszlo Pinter
>Priority: Major
> Attachments: HIVE-18735.01.patch, HIVE-18735.02.patch, 
> HIVE-18735.03.patch, HIVE-18735.04.patch, HIVE-18735.05.patch, 
> HIVE-18735.06.patch
>
>
> {noformat}
> create table T1(a int, b int) clustered by (a) into 2 buckets stored as orc 
> TBLPROPERTIES ('transactional'='true');
> create table T like T1;
> show create table T ;
> CREATE TABLE `T`(
>   `a` int,
>   `b` int)
> CLUSTERED BY (
>   a)
> INTO 2 BUCKETS
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> LOCATION
>  
> 'file:/Users/ekoifman/IdeaProjects/hive/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands-1518813536099/warehouse/t'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1518813564')
> {noformat}
> Specifying props explicitly does work 
> {noformat}
> create table T1(a int, b int) clustered by (a) into 2 buckets stored as orc 
> TBLPROPERTIES ('transactional'='true');
> create table T like T1 TBLPROPERTIES ('transactional'='true');
> show create table T ;
> CREATE TABLE `T`(
>   `a` int,
>   `b` int)
> CLUSTERED BY (
>   a)
> INTO 2 BUCKETS
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> LOCATION
>   
> 'file:/Users/ekoifman/IdeaProjects/hive/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands-1518814098564/warehouse/t'
> TBLPROPERTIES (
>   'transactional'='true',
>   'transactional_properties'='default',
>   'transient_lastDdlTime'='1518814111')
> {noformat}





[jira] [Updated] (HIVE-21910) Multiple target location generation in HostAffinitySplitLocationProvider

2019-06-28 Thread Peter Vary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-21910:
--
Attachment: HIVE-21910.2.patch

> Multiple target location generation in HostAffinitySplitLocationProvider
> 
>
> Key: HIVE-21910
> URL: https://issues.apache.org/jira/browse/HIVE-21910
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21910.2.patch, HIVE-21910.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> We need to generate multiple target locations by 
> HostAffinitySplitLocationProvider, so we will have deterministic fallback 
> nodes in case the target node is disabled





[jira] [Work logged] (HIVE-21910) Multiple target location generation in HostAffinitySplitLocationProvider

2019-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21910?focusedWorklogId=269233&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-269233
 ]

ASF GitHub Bot logged work on HIVE-21910:
-

Author: ASF GitHub Bot
Created on: 28/Jun/19 12:48
Start Date: 28/Jun/19 12:48
Worklog Time Spent: 10m 
  Work Description: pvary commented on pull request #690: HIVE-21910: 
Multiple target location generation in HostAffinitySplitLocationProvider
URL: https://github.com/apache/hive/pull/690#discussion_r298578751
 
 

 ##
 File path: 
llap-tez/src/test/org/apache/hadoop/hive/llap/tezplugins/TestLlapTaskSchedulerService.java
 ##
 @@ -946,6 +946,161 @@ public void testForcedLocalityUnknownHost() throws 
IOException, InterruptedExcep
 }
   }
 
+  @Test(timeout = 1)
 
 Review comment:
   All of the tests in this file have a 10s timeout, so I decided to keep it for 
the sake of consistency.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 269233)
Time Spent: 1h 40m  (was: 1.5h)

> Multiple target location generation in HostAffinitySplitLocationProvider
> 
>
> Key: HIVE-21910
> URL: https://issues.apache.org/jira/browse/HIVE-21910
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21910.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> We need to generate multiple target locations by 
> HostAffinitySplitLocationProvider, so we will have deterministic fallback 
> nodes in case the target node is disabled





[jira] [Updated] (HIVE-21933) Remove unused methods from Utilities

2019-06-28 Thread Ivan Suller (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Suller updated HIVE-21933:
---
Attachment: HIVE-21933.1.patch

> Remove unused methods from Utilities
> 
>
> Key: HIVE-21933
> URL: https://issues.apache.org/jira/browse/HIVE-21933
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Ivan Suller
>Priority: Trivial
> Attachments: HIVE-21933.1.patch
>
>
> Over the years it seems org.apache.hadoop.hive.ql.exec.Utilities collected 
> many methods which are not used anymore. Removing them is the right thing to 
> do.





[jira] [Updated] (HIVE-21933) Remove unused methods from Utilities

2019-06-28 Thread Ivan Suller (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Suller updated HIVE-21933:
---
Status: Patch Available  (was: Open)

> Remove unused methods from Utilities
> 
>
> Key: HIVE-21933
> URL: https://issues.apache.org/jira/browse/HIVE-21933
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Ivan Suller
>Assignee: Ivan Suller
>Priority: Trivial
> Attachments: HIVE-21933.1.patch
>
>
> Over the years it seems org.apache.hadoop.hive.ql.exec.Utilities collected 
> many methods which are not used anymore. Removing them is the right thing to 
> do.





[jira] [Assigned] (HIVE-21933) Remove unused methods from Utilities

2019-06-28 Thread Ivan Suller (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Suller reassigned HIVE-21933:
--

Assignee: Ivan Suller

> Remove unused methods from Utilities
> 
>
> Key: HIVE-21933
> URL: https://issues.apache.org/jira/browse/HIVE-21933
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Ivan Suller
>Assignee: Ivan Suller
>Priority: Trivial
> Attachments: HIVE-21933.1.patch
>
>
> Over the years it seems org.apache.hadoop.hive.ql.exec.Utilities collected 
> many methods which are not used anymore. Removing them is the right thing to 
> do.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21578) Introduce SQL:2016 formats FM, FX, and nested strings

2019-06-28 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874898#comment-16874898
 ] 

Hive QA commented on HIVE-21578:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 54s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m  2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 22s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 54s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 37s{color} | {color:blue} common in master has 62 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 58s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 15s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 25s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 12s{color} | {color:red} common: The patch generated 14 new + 0 unchanged - 0 fixed = 14 total (was 0) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 29m 50s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-17789/dev-support/hive-personality.sh |
| git revision | master / 57c4217 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-17789/yetus/diff-checkstyle-common.txt |
| modules | C: common ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-17789/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Introduce SQL:2016 formats FM, FX, and nested strings
> -
>
> Key: HIVE-21578
> URL: https://issues.apache.org/jira/browse/HIVE-21578
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-21578.01.patch
>
>
> Enable Hive to parse the following datetime formats when any combination or 
> subset of these or previously implemented formats is provided in one string. 
>  * "text" (nested strings)
>  * FM
>  * FX
> [Definitions 
> here|https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/edit]





[jira] [Work logged] (HIVE-21910) Multiple target location generation in HostAffinitySplitLocationProvider

2019-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21910?focusedWorklogId=269225&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-269225
 ]

ASF GitHub Bot logged work on HIVE-21910:
-

Author: ASF GitHub Bot
Created on: 28/Jun/19 12:31
Start Date: 28/Jun/19 12:31
Worklog Time Spent: 10m 
  Work Description: pvary commented on pull request #690: HIVE-21910: 
Multiple target location generation in HostAffinitySplitLocationProvider
URL: https://github.com/apache/hive/pull/690#discussion_r298573326
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HostAffinitySplitLocationProvider.java
 ##
 @@ -72,11 +78,17 @@ public HostAffinitySplitLocationProvider(List knownLocations) {
 FileSplit fsplit = (FileSplit) split;
 String splitDesc = "Split at " + fsplit.getPath() + " with offset= " + fsplit.getStart()
 + ", length=" + fsplit.getLength();
-List preferredLocations = preferLocations(fsplit);
-String location =
-preferredLocations.get(determineLocation(preferredLocations, fsplit.getPath().toString(),
-fsplit.getStart(), splitDesc));
-return (location != null) ? new String[] { location } : null;
+List preferredLocations = new ArrayList<>(preferLocations(fsplit));
+List finalLocations = new ArrayList<>(numberOfLocations);
+// Generate new preferred locations until we need more, or we do not have any preferred
+// location left
+while (finalLocations.size() < numberOfLocations && preferredLocations.size() > 0) {
+  String nextLocation = preferredLocations.get(determineLocation(preferredLocations,
+  fsplit.getPath().toString(), fsplit.getStart(), splitDesc));
+  finalLocations.add(nextLocation);
+  preferredLocations.remove(nextLocation);
 
 Review comment:
   I did some measurements for the split generation with this code:
   `
   @Test (timeout = 500)
   public void testOrcSplitsBasic() throws IOException {
 HostAffinitySplitLocationProvider locationProvider = new HostAffinitySplitLocationProvider(executorLocations, true, 1);
   
 InputSplit os1 = createMockFileSplit(true, "path1", 0, 1000, new String[] {locations.get(0), locations.get(1), locations.get(2), locations.get(3)});
   
 long start = System.nanoTime();
 for(int i=0;i<10;i++) {
   locationProvider.getLocations(os1);
 }
 LOG.error("TIME: " + (System.nanoTime()-start)/100);
   }
   `
   
   I got the following results:
   Original code (~6100ms for 100k requests):
   - 5859
   - 6511
   - 6813
   - 5721
   - 5663
   
   New code with 1 location (~5823ms for 100k requests):
   - 5877
   - 5621
   - 5613
   - 5883
   - 6120
   
   New code with 2 locations (~6579ms for 100k requests):
   - 6433
   - 6825
   - 6574
   - 6444
   - 6621
   
   I do not see why the new code should be faster, so this probably means high 
variation in the data. Generating 2 locations instead of 1 seems like a 10% 
overhead. Since this is 0.006ms per request, this seems reasonable to me.
   
   What is your opinion?
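
For reference, the averages quoted in the comment above can be recomputed from the five samples listed for each run. The sketch below only illustrates that arithmetic; the class and method names are made up for this example and are not part of the patch. By this calculation the two-location run comes out about 13% slower than the one-location run and about 8% slower than the original code, which brackets the rough 10% estimate.

```java
import java.util.stream.LongStream;

// Illustrative only: recomputes the averages of the timings listed in the review comment.
class SplitTimingSummary {

    /** Arithmetic mean of the reported samples (NaN if no samples are given). */
    static double mean(long... samples) {
        return LongStream.of(samples).average().orElse(Double.NaN);
    }

    /** Relative overhead of candidate over baseline, in percent. */
    static double overheadPercent(double baseline, double candidate) {
        return (candidate / baseline - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        double original = mean(5859, 6511, 6813, 5721, 5663);
        double oneLocation = mean(5877, 5621, 5613, 5883, 6120);
        double twoLocations = mean(6433, 6825, 6574, 6444, 6621);

        System.out.printf("original=%.1f one=%.1f two=%.1f%n",
            original, oneLocation, twoLocations);
        System.out.printf("two vs one: %.1f%%, two vs original: %.1f%%%n",
            overheadPercent(oneLocation, twoLocations),
            overheadPercent(original, twoLocations));
    }
}
```

With only five samples per run and a spread of several hundred units inside each run, the difference between the original code and the one-location run is well within the noise, which is consistent with the remark about high variation.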
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 269225)
Time Spent: 1.5h  (was: 1h 20m)

> Multiple target location generation in HostAffinitySplitLocationProvider
> 
>
> Key: HIVE-21910
> URL: https://issues.apache.org/jira/browse/HIVE-21910
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21910.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> We need to generate multiple target locations by 
> HostAffinitySplitLocationProvider, so we will have deterministic fallback 
> nodes in case the target node is disabled
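
The behaviour requested here can be sketched as: pick one location deterministically, remove it from the candidates, and repeat until enough fallback locations have been collected. The code below is a simplified, self-contained illustration and not the code from the patch. `pickIndex` is a hypothetical stand-in for the provider's `determineLocation`, and hashing with `Objects.hash` is an assumption made only to keep the example runnable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Illustrative sketch of generating several deterministic target locations for one split.
class FallbackLocationSketch {

    /** Hypothetical stand-in for determineLocation: maps (path, offset) to an index in [0, size). */
    static int pickIndex(int size, String path, long offset) {
        return Math.floorMod(Objects.hash(path, offset), size);
    }

    /** Returns up to numberOfLocations distinct locations, in a stable order for a given split. */
    static List<String> chooseLocations(List<String> candidates, String path, long offset,
                                        int numberOfLocations) {
        if (numberOfLocations < 1) {
            throw new IllegalArgumentException("numberOfLocations must be at least 1");
        }
        // Copy the candidates so that removing chosen entries does not touch the caller's list.
        List<String> remaining = new ArrayList<>(candidates);
        List<String> chosen = new ArrayList<>(numberOfLocations);
        // Keep picking until we have enough locations or no candidates are left.
        while (chosen.size() < numberOfLocations && !remaining.isEmpty()) {
            int index = pickIndex(remaining.size(), path, offset);
            chosen.add(remaining.remove(index));
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("node1", "node2", "node3", "node4");
        System.out.println(chooseLocations(nodes, "path1", 0L, 2));
    }
}
```

Because every pick depends only on the split and on the candidates still remaining, the first entry is the same one a single-location call would return, and the later entries act as the deterministic fallbacks.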



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21930) WINDOW COUNT DISTINCT return wrong value with PARTITION BY

2019-06-28 Thread Igor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor updated HIVE-21930:

Description: 
count(distinct a) over (partition by b) returns the wrong result. For example (T is a 
CTE here):
{code:java}
select p, day, ts
, row_number() OVER (PARTITION BY phone ORDER BY ts ASC) as line_number
, count(1) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING AND 
UNBOUNDED FOLLOWING) as lines
, count(distinct day) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING 
AND UNBOUNDED FOLLOWING) as days
FROM T{code}
The WINDOW specification doesn't affect the results: they are equally wrong with and 
without the window.

count(1) and count(distinct day) return the same result. Count distinct is 
wrong.

 

After I added size(collect_set(day) OVER (PARTITION BY phone)) as days2, 
count(distinct) returns the correct result.

The following query returns a non-empty result:
{code:java}
select A.*, B.days, B. from (
select p, day, ts
, row_number() OVER (PARTITION BY phone ORDER BY ts ASC) as line_number
, count(1) OVER (PARTITION BY p ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED 
FOLLOWING) as lines
, count(distinct day) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING 
AND UNBOUNDED FOLLOWING) as days
, size(collect_set(day) OVER (PARTITION BY phone)) as days2
, dense_rank() over (partition by phone order by day) + dense_rank() over 
(partition by phone order by day desc) - 1 as days3
FROM T ) as A 
join (
select p, day, ts
, row_number() OVER (PARTITION BY phone ORDER BY ts ASC) as line_number
, count(1) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING AND 
UNBOUNDED FOLLOWING) as lines
, count(distinct day) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING 
AND UNBOUNDED FOLLOWING) as days
FROM T
) as B on A.p=B.p and A.line_number=B.line_number
where A.days!=B.days
order by A.p, A.line_number
{code}
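
The days3 column in the query above relies on an identity that holds for non-null values: within a partition, the ascending dense rank of a value plus its descending dense rank, minus 1, equals the number of distinct values. The following is a minimal sketch of that identity in plain Java, outside Hive. The class and method names are hypothetical and the sample dates are invented for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.TreeSet;

// Illustrative check of: dense_rank() asc + dense_rank() desc - 1 == count(distinct ...)
class DistinctCountViaDenseRank {

    /** Dense rank in ascending order: number of distinct values <= value. */
    static int denseRankAsc(List<String> partition, String value) {
        return new TreeSet<>(partition).headSet(value, true).size();
    }

    /** Dense rank in descending order: number of distinct values >= value. */
    static int denseRankDesc(List<String> partition, String value) {
        return new TreeSet<>(partition).tailSet(value, true).size();
    }

    /** Distinct count derived from the two dense ranks; value must occur in the partition. */
    static int distinctCount(List<String> partition, String value) {
        return denseRankAsc(partition, value) + denseRankDesc(partition, value) - 1;
    }

    public static void main(String[] args) {
        // One partition (one phone) with repeated days; sample data is made up.
        List<String> days = Arrays.asList("2019-06-01", "2019-06-01", "2019-06-02", "2019-06-03");
        for (String day : days) {
            System.out.println(day + " -> " + distinctCount(days, day)
                + " (expected " + new HashSet<>(days).size() + ")");
        }
    }
}
```

The value itself is counted by both ranks, which is why 1 is subtracted. Every row of the partition yields the same total, so the expression can stand in for a windowed distinct count.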
 

  was:
count(distinct a) over (partition by b) returns the wrong result. For example:
{code:java}
select p, day, ts
, row_number() OVER (PARTITION BY phone ORDER BY ts ASC) as line_number
, count(1) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING AND 
UNBOUNDED FOLLOWING) as lines
, count(distinct day) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING 
AND UNBOUNDED FOLLOWING) as days
FROM T{code}
The WINDOW specification doesn't affect the results: they are equally wrong with and 
without the window.

count(1) and count(distinct day) return the same result. Count distinct is 
wrong.

 

After I added size(collect_set(day) OVER (PARTITION BY phone)) as days2, 
count(distinct) returns the correct result.

The following query returns a non-empty result:
{code:java}
select A.*, B.days, B. from (
select p, day, ts
, row_number() OVER (PARTITION BY phone ORDER BY ts ASC) as line_number
, count(1) OVER (PARTITION BY p ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED 
FOLLOWING) as lines
, count(distinct day) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING 
AND UNBOUNDED FOLLOWING) as days
, size(collect_set(day) OVER (PARTITION BY phone)) as days2
, dense_rank() over (partition by phone order by day) + dense_rank() over 
(partition by phone order by day desc) - 1 as days3
FROM T ) as A 
join (
select p, day, ts
, row_number() OVER (PARTITION BY phone ORDER BY ts ASC) as line_number
, count(1) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING AND 
UNBOUNDED FOLLOWING) as lines
, count(distinct day) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING 
AND UNBOUNDED FOLLOWING) as days
FROM T
) as B on A.p=B.p and A.line_number=B.line_number
where A.days!=B.days
order by A.p, A.line_number
{code}
 


> WINDOW COUNT DISTINCT return wrong value with PARTITION BY
> --
>
> Key: HIVE-21930
> URL: https://issues.apache.org/jira/browse/HIVE-21930
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 3.1.0
> Environment: Beeline version 3.1.0.3.0.1.0-187 by Apache Hive
>Reporter: Igor
>Priority: Major
>  Labels: distinct, window_funcion
>
> count(distinct a) over (partition by b) returns the wrong result. For example (T is a 
> CTE here):
> {code:java}
> select p, day, ts
> , row_number() OVER (PARTITION BY phone ORDER BY ts ASC) as line_number
> , count(1) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING AND 
> UNBOUNDED FOLLOWING) as lines
> , count(distinct day) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED 
> PRECEDING AND UNBOUNDED FOLLOWING) as days
> FROM T{code}
> The WINDOW specification doesn't affect the results: they are equally wrong with and 
> without the window.
> count(1) and count(distinct day) return the same result. Count distinct is 
> wrong.
>  
> After I added size(collect_set(day) OVER (PARTITION BY phone)) as days2, 
> count(distinct) returns the correct result.
> The following query returns a non-empty result:
> {code:java}
> select A.*, B.days, B. from 

[jira] [Updated] (HIVE-21578) Introduce SQL:2016 formats FM, FX, and nested strings

2019-06-28 Thread Karen Coppage (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-21578:
-
Attachment: HIVE-21578.01.patch
Status: Patch Available  (was: Open)

> Introduce SQL:2016 formats FM, FX, and nested strings
> -
>
> Key: HIVE-21578
> URL: https://issues.apache.org/jira/browse/HIVE-21578
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-21578.01.patch
>
>
> Enable Hive to parse the following datetime formats when any combination or 
> subset of these or previously implemented formats is provided in one string. 
>  * "text" (nested strings)
>  * FM
>  * FX
> [Definitions 
> here|https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/edit]





[jira] [Commented] (HIVE-21880) Enable flaky test TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites.

2019-06-28 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874869#comment-16874869
 ] 

Hive QA commented on HIVE-21880:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12973146/HIVE-21880.03.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 16360 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.llap.cache.TestBuddyAllocator.testMTT[2] (batchId=350)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitions
 (batchId=275)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitionsUnionAll
 (batchId=275)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomNonExistent
 (batchId=275)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighBytesRead 
(batchId=275)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes
 (batchId=275)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerSlowQueryElapsedTime
 (batchId=275)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerSlowQueryExecutionTime
 (batchId=275)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17788/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17788/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17788/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12973146 - PreCommit-HIVE-Build

> Enable flaky test 
> TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites.
> ---
>
> Key: HIVE-21880
> URL: https://issues.apache.org/jira/browse/HIVE-21880
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21880.01.patch, HIVE-21880.02.patch, 
> HIVE-21880.03.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Need to enable 
> TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites
>  which is disabled as it is flaky and randomly fails with the error below.
> {code}
> Error Message
> Notification events are missing in the meta store.
> Stacktrace
> java.lang.IllegalStateException: Notification events are missing in the meta 
> store.
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getNextNotification(HiveMetaStoreClient.java:3246)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212)
>   at com.sun.proxy.$Proxy58.getNextNotification(Unknown Source)
>   at 
> org.apache.hadoop.hive.ql.metadata.events.EventUtils$MSClientNotificationFetcher.getNextNotificationEvents(EventUtils.java:107)
>   at 
> org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.fetchNextBatch(EventUtils.java:159)
>   at 
> org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.hasNext(EventUtils.java:189)
>   at 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:231)
>   at 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:121)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2709)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2361)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2028)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1788)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1782)
>   at 
> org.apache.hadoop.hive.ql.reexec.Re

[jira] [Work logged] (HIVE-21910) Multiple target location generation in HostAffinitySplitLocationProvider

2019-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21910?focusedWorklogId=269203&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-269203
 ]

ASF GitHub Bot logged work on HIVE-21910:
-

Author: ASF GitHub Bot
Created on: 28/Jun/19 11:32
Start Date: 28/Jun/19 11:32
Worklog Time Spent: 10m 
  Work Description: pvary commented on pull request #690: HIVE-21910: 
Multiple target location generation in HostAffinitySplitLocationProvider
URL: https://github.com/apache/hive/pull/690#discussion_r298557620
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HostAffinitySplitLocationProvider.java
 ##
 @@ -52,13 +52,19 @@
 
   private final List locations;
   private final Set locationSet;
+  private final int numberOfLocations;
 
-  public HostAffinitySplitLocationProvider(List knownLocations) {
+  public HostAffinitySplitLocationProvider(List knownLocations, int 
numberOfLocations) {
 Preconditions.checkState(knownLocations != null && 
!knownLocations.isEmpty(),
 HostAffinitySplitLocationProvider.class.getName() +
 " needs at least 1 location to function");
+Preconditions.checkArgument(numberOfLocations >= 0,
 
 Review comment:
   Yeah, that was left over from an earlier idea.
   Set to 1.
 



Issue Time Tracking
---

Worklog Id: (was: 269203)
Time Spent: 1h 20m  (was: 1h 10m)

> Multiple target location generation in HostAffinitySplitLocationProvider
> 
>
> Key: HIVE-21910
> URL: https://issues.apache.org/jira/browse/HIVE-21910
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21910.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> We need to generate multiple target locations by 
> HostAffinitySplitLocationProvider, so we will have deterministic fallback 
> nodes in case the target node is disabled





[jira] [Work logged] (HIVE-21910) Multiple target location generation in HostAffinitySplitLocationProvider

2019-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21910?focusedWorklogId=269201&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-269201
 ]

ASF GitHub Bot logged work on HIVE-21910:
-

Author: ASF GitHub Bot
Created on: 28/Jun/19 11:27
Start Date: 28/Jun/19 11:27
Worklog Time Spent: 10m 
  Work Description: pvary commented on pull request #690: HIVE-21910: 
Multiple target location generation in HostAffinitySplitLocationProvider
URL: https://github.com/apache/hive/pull/690#discussion_r298556443
 
 

 ##
 File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
 ##
 @@ -4440,6 +4440,12 @@ private static void 
populateLlapDaemonVarsSet(Set llapDaemonVarsSetLocal
 "preferring one of the locations provided by the split itself. If 
there is no llap daemon " +
 "running on any of those locations (or on the cloud), fall back to a 
cache affinity to" +
 " an LLAP node. This is effective only if hive.execution.mode is 
llap."),
+
LLAP_CLIENT_CONSISTENT_SPLITS_NUMBER("hive.llap.client.consistent.splits.number",
 1,
+"The number of the preferred locations to generate if 
hive.llap.client.consistent.splits\n" +
 
 Review comment:
   Done
 



Issue Time Tracking
---

Worklog Id: (was: 269201)
Time Spent: 1h 10m  (was: 1h)

> Multiple target location generation in HostAffinitySplitLocationProvider
> 
>
> Key: HIVE-21910
> URL: https://issues.apache.org/jira/browse/HIVE-21910
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21910.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> We need to generate multiple target locations by 
> HostAffinitySplitLocationProvider, so we will have deterministic fallback 
> nodes in case the target node is disabled





[jira] [Work logged] (HIVE-21910) Multiple target location generation in HostAffinitySplitLocationProvider

2019-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21910?focusedWorklogId=269199&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-269199
 ]

ASF GitHub Bot logged work on HIVE-21910:
-

Author: ASF GitHub Bot
Created on: 28/Jun/19 11:26
Start Date: 28/Jun/19 11:26
Worklog Time Spent: 10m 
  Work Description: pvary commented on pull request #690: HIVE-21910: 
Multiple target location generation in HostAffinitySplitLocationProvider
URL: https://github.com/apache/hive/pull/690#discussion_r298556011
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HostAffinitySplitLocationProvider.java
 ##
 @@ -72,11 +78,17 @@ public HostAffinitySplitLocationProvider(List 
knownLocations) {
 FileSplit fsplit = (FileSplit) split;
 String splitDesc = "Split at " + fsplit.getPath() + " with offset= " + 
fsplit.getStart()
 + ", length=" + fsplit.getLength();
-List preferredLocations = preferLocations(fsplit);
-String location =
-preferredLocations.get(determineLocation(preferredLocations, 
fsplit.getPath().toString(),
-fsplit.getStart(), splitDesc));
-return (location != null) ? new String[] { location } : null;
+List preferredLocations = new ArrayList<>(preferLocations(fsplit));
 
 Review comment:
   We might want to keep it configurable.
   Until the cluster reaches the threshold where the most heavily loaded node starts 
to struggle for resources, keeping the tasks aligned with the HDFS location still 
makes sense.
   What do you think?
 



Issue Time Tracking
---

Worklog Id: (was: 269199)
Time Spent: 1h  (was: 50m)

> Multiple target location generation in HostAffinitySplitLocationProvider
> 
>
> Key: HIVE-21910
> URL: https://issues.apache.org/jira/browse/HIVE-21910
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21910.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> We need to generate multiple target locations by 
> HostAffinitySplitLocationProvider, so we will have deterministic fallback 
> nodes in case the target node is disabled





[jira] [Commented] (HIVE-21880) Enable flaky test TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites.

2019-06-28 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874862#comment-16874862
 ] 

Hive QA commented on HIVE-21880:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
54s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
55s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
35s{color} | {color:blue} standalone-metastore/metastore-common in master has 
31 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 
12s{color} | {color:blue} standalone-metastore/metastore-server in master has 
179 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
3s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
27s{color} | {color:blue} hcatalog/server-extensions in master has 3 extant 
Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
39s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m  
7s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
30s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
12s{color} | {color:red} itests/hcatalog-unit: The patch generated 81 new + 0 
unchanged - 0 fixed = 81 total (was 0) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
17s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 32 
unchanged - 0 fixed = 33 total (was 32) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  9m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m  
3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m 31s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-17788/dev-support/hive-personality.sh
 |
| git revision | master / 57c4217 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17788/yetus/diff-checkstyle-itests_hcatalog-unit.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17788/yetus/diff-checkstyle-itests_hive-unit.txt
 |
| modules | C: standalone-metastore/metastore-common 
standalone-metastore/metastore-server ql hcatalog/server-extensions 
itests/hcatalog-unit itests/hive-unit U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17788/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Enable flaky test 
> TestRep

[jira] [Commented] (HIVE-21923) Vectorized MapJoin may miss results when only the join key is selected

2019-06-28 Thread Zoltan Haindrich (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874832#comment-16874832
 ] 

Zoltan Haindrich commented on HIVE-21923:
-

This issue was introduced in HIVE-18908.

> Vectorized MapJoin may miss results when only the join key is selected
> --
>
> Key: HIVE-21923
> URL: https://issues.apache.org/jira/browse/HIVE-21923
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-21923.01.patch
>
>
> HIVE-21189 introduced some result set changes
> in ql/src/test/results/clientpositive/llap/hybridgrace_hashjoin_2.q.out
> https://github.com/apache/hive/commit/5799398450c17d06e8ef144ce835a8524f5abec9#diff-56b3ab96b6c90fdbebe2c4f84e8595afL500





[jira] [Commented] (HIVE-21867) Sort semijoin conditions to accelerate query processing

2019-06-28 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874830#comment-16874830
 ] 

Hive QA commented on HIVE-21867:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12973143/HIVE-21867.05.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 16325 tests 
executed
*Failed tests:*
{noformat}
TestDataSourceProviderFactory - did not produce a TEST-*.xml file (likely timed 
out) (batchId=232)
TestObjectStore - did not produce a TEST-*.xml file (likely timed out) 
(batchId=232)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17787/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17787/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17787/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12973143 - PreCommit-HIVE-Build

> Sort semijoin conditions to accelerate query processing
> ---
>
> Key: HIVE-21867
> URL: https://issues.apache.org/jira/browse/HIVE-21867
> Project: Hive
>  Issue Type: New Feature
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21867.02.patch, HIVE-21867.03.patch, 
> HIVE-21867.04.patch, HIVE-21867.05.patch, HIVE-21867.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The problem was tackled for CBO in HIVE-21857. Semijoin filters are 
> introduced later in the planning phase. Follow similar approach to sort them, 
> trying to accelerate filter evaluation.
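
The general idea of ordering conjunctive filter conditions can be sketched as below. The issue text does not say which sort key is used, so this example assumes an estimated selectivity per condition (the fraction of rows expected to pass) and evaluates the most selective condition first. All names here are hypothetical, and this is not Hive's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

// Illustrative sketch: order AND-ed filter conditions so that rows are rejected as early as possible.
class FilterOrderingSketch {

    /** A filter condition with an assumed selectivity estimate (fraction of rows that pass). */
    static final class Condition {
        final String name;
        final double estimatedSelectivity;
        final IntPredicate test;

        Condition(String name, double estimatedSelectivity, IntPredicate test) {
            this.name = name;
            this.estimatedSelectivity = estimatedSelectivity;
            this.test = test;
        }
    }

    /** Returns a copy sorted so that the condition expected to pass the fewest rows comes first. */
    static List<Condition> sortBySelectivity(List<Condition> conditions) {
        List<Condition> sorted = new ArrayList<>(conditions);
        sorted.sort(Comparator.comparingDouble((Condition c) -> c.estimatedSelectivity));
        return sorted;
    }

    /** Evaluates the conditions in order and stops at the first one that fails. */
    static boolean passesAll(List<Condition> ordered, int row) {
        for (Condition c : ordered) {
            if (!c.test.test(row)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Condition> conditions = Arrays.asList(
            new Condition("wide", 0.9, r -> r >= 0),
            new Condition("narrow", 0.1, r -> r % 10 == 0));
        for (Condition c : sortBySelectivity(conditions)) {
            System.out.println(c.name);
        }
    }
}
```

Sorting does not change which rows pass. It only changes how many conditions are evaluated on average before a row is rejected.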





[jira] [Commented] (HIVE-21867) Sort semijoin conditions to accelerate query processing

2019-06-28 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874802#comment-16874802
 ] 

Hive QA commented on HIVE-21867:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
49s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
4s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
41s{color} | {color:red} ql: The patch generated 1 new + 155 unchanged - 0 
fixed = 156 total (was 155) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m  9s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-17787/dev-support/hive-personality.sh
 |
| git revision | master / 57c4217 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17787/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17787/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Sort semijoin conditions to accelerate query processing
> ---
>
> Key: HIVE-21867
> URL: https://issues.apache.org/jira/browse/HIVE-21867
> Project: Hive
>  Issue Type: New Feature
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21867.02.patch, HIVE-21867.03.patch, 
> HIVE-21867.04.patch, HIVE-21867.05.patch, HIVE-21867.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The problem was tackled for CBO in HIVE-21857. Semijoin filters are 
> introduced later in the planning phase. Follow similar approach to sort them, 
> trying to accelerate filter evaluation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21880) Enable flaky test TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites.

2019-06-28 Thread Ashutosh Bapat (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-21880:
--
Attachment: HIVE-21880.03.patch
Status: Patch Available  (was: In Progress)

The failed tests are passing for me locally. Re-submitting .02 patch as .03 to 
trigger ptests.

> Enable flaky test 
> TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites.
> ---
>
> Key: HIVE-21880
> URL: https://issues.apache.org/jira/browse/HIVE-21880
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21880.01.patch, HIVE-21880.02.patch, 
> HIVE-21880.03.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Need to enable 
> TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites,
>  which is disabled because it is flaky and randomly fails with the error below.
> {code}
> Error Message
> Notification events are missing in the meta store.
> Stacktrace
> java.lang.IllegalStateException: Notification events are missing in the meta 
> store.
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getNextNotification(HiveMetaStoreClient.java:3246)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212)
>   at com.sun.proxy.$Proxy58.getNextNotification(Unknown Source)
>   at 
> org.apache.hadoop.hive.ql.metadata.events.EventUtils$MSClientNotificationFetcher.getNextNotificationEvents(EventUtils.java:107)
>   at 
> org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.fetchNextBatch(EventUtils.java:159)
>   at 
> org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.hasNext(EventUtils.java:189)
>   at 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:231)
>   at 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:121)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2709)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2361)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2028)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1788)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1782)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223)
>   at 
> org.apache.hadoop.hive.ql.parse.WarehouseInstance.run(WarehouseInstance.java:227)
>   at 
> org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:282)
>   at 
> org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:265)
>   at 
> org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:289)
>   at 
> org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites(TestReplicationScenariosAcidTablesBootstrap.java:328)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>  

[jira] [Updated] (HIVE-21880) Enable flaky test TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites.

2019-06-28 Thread Ashutosh Bapat (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-21880:
--
Status: In Progress  (was: Patch Available)

> Enable flaky test 
> TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites.
> ---
>
> Key: HIVE-21880
> URL: https://issues.apache.org/jira/browse/HIVE-21880
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21880.01.patch, HIVE-21880.02.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Need to enable 
> TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites,
>  which is disabled because it is flaky and randomly fails with the error below.
> {code}
> Error Message
> Notification events are missing in the meta store.
> Stacktrace
> java.lang.IllegalStateException: Notification events are missing in the meta 
> store.
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getNextNotification(HiveMetaStoreClient.java:3246)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212)
>   at com.sun.proxy.$Proxy58.getNextNotification(Unknown Source)
>   at 
> org.apache.hadoop.hive.ql.metadata.events.EventUtils$MSClientNotificationFetcher.getNextNotificationEvents(EventUtils.java:107)
>   at 
> org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.fetchNextBatch(EventUtils.java:159)
>   at 
> org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.hasNext(EventUtils.java:189)
>   at 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:231)
>   at 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:121)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2709)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2361)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2028)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1788)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1782)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223)
>   at 
> org.apache.hadoop.hive.ql.parse.WarehouseInstance.run(WarehouseInstance.java:227)
>   at 
> org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:282)
>   at 
> org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:265)
>   at 
> org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:289)
>   at 
> org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites(TestReplicationScenariosAcidTablesBootstrap.java:328)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runn

[jira] [Commented] (HIVE-21886) REPL - With table list - Handle rename events during replace policy

2019-06-28 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874779#comment-16874779
 ] 

Sankar Hariappan commented on HIVE-21886:
-

+1, patch LGTM, pending tests.

> REPL - With table list - Handle rename events during replace policy
> ---
>
> Key: HIVE-21886
> URL: https://issues.apache.org/jira/browse/HIVE-21886
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, Replication, pull-request-available
> Attachments: HIVE-21886.01.patch, HIVE-21886.02.patch, 
> HIVE-21886.03.patch, HIVE-21886.04.patch, HIVE-21886.04.patch
>
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>
> If rename events are dumped and replayed while a replace policy is being 
> executed, the handling needs to check, for each table name, whether it is 
> included in both the old and the new policy.
>  1. Create a list of tables to be bootstrapped. 
>  2. During handling of alter table, if the alter type is rename:
>       1. If the old table name is present in the list of tables to be 
> bootstrapped, remove it.
>       2. If the new table name matches the new policy, add it to the list 
> of tables to be bootstrapped.
>       3. If the old table does not match the old policy, drop it, even if the 
> table is not present at the target.
>  3. During handling of drop table:
>       1. If the table is in the list of tables to be bootstrapped, 
> remove it and ignore the event.
>  4. During handling of other events:
>       1. If the table is in the list of tables to be bootstrapped, 
> ignore the event.
>       2. If the new policy does not match the table name, ignore the 
> event.
>  
> Rename handling during replace policy
>  # Old name not matching old policy – The old table will not be there at the 
> target cluster. The table will not be returned by get-all-table.
>  ## Old name is not matching new policy
>  ### New name not matching old policy
>   New name not matching new policy
>  * Ignore the event, no need to do anything.
>   New name matching new policy
>  * The table will be returned by get-all-table. Replace policy handler 
> will bootstrap this table as its matching new policy and not matching old 
> policy.
>  * All the future events will be ignored as part of check added by 
> replace policy handling.
>  * All the event with old table name will anyways be ignored as the old 
> name is not matching the new policy.
>  ### New name matching old policy
>   New name not matching new policy
>  * As the new name is not matching the new policy, the table need not be 
> replicated.
>  * As the old name is not matching the new policy, the rename events will 
> be ignored.
>  * So nothing to be done for this scenario.
>   New name matching new policy
>  * As the new name is matching both old and new policy, replace handler 
> will not bootstrap the table.
>  * Add the table to the list of tables to be bootstrapped.
>  * Ignore all the events with new name.
>  * If there is a drop event for the table (with the new name), then remove 
> the table from the list of tables to be bootstrapped.
>  * In case of a rename event (double rename):
>  ** If the new name satisfies the table pattern, then add the new name to 
> the list of tables to be bootstrapped and remove the old name from the list 
> of tables to be bootstrapped.
>  ** If the new name does not satisfy the table pattern, then just remove the 
> table name from the list of tables to be bootstrapped.
>  ## Old name is matching new policy – As per the replace policy handler, which 
> checks based on the old table, the table should be bootstrapped and the event 
> should be ignored. But the rename handler should decide based on the new name. 
> The old table name will not be returned by get-all-table, so the replace 
> handler will not do anything for the old table.
>  ### New name not matching old policy
>   New name not matching new policy
>  * As the old table is not present at the target and the new name does not 
> match the new policy, ignore the event.
>  * No need to add the table to the list of tables to be bootstrapped.
>  * All the subsequent events will be ignored as the new name is not 
> matching the new policy.
>   New name matching new policy
>  * As the new name is not matching old policy but matching new policy, 
> the table will be bootstrapped by replace policy handler. So rename event 
> need not add this table to list of table to be bootstrapped.
>  * All the future events will be ignored by replace policy handler.
>  * For rename event (double rename)
>  ** If there is a rename, the table
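
The numbered steps 1 to 4 in the HIVE-21886 description above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the attached patches: the class, its methods, and the use of `Predicate<String>` to stand in for the old and new replication policies are all hypothetical, and the behaviour for a drop event on a table that is not in the bootstrap list is a guess, since the description does not cover it.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

// Illustrative only: every name in this class is hypothetical, not Hive's.
class RenameDuringReplacePolicySketch {
    private final Predicate<String> oldPolicy; // true if a table name matches the old policy
    private final Predicate<String> newPolicy; // true if a table name matches the new policy
    // Step 1: the list of tables to be bootstrapped.
    private final Set<String> tablesToBootstrap = new HashSet<>();
    // Tables for which a drop should be issued at the target (step 2.3).
    private final Set<String> tablesToDrop = new HashSet<>();

    RenameDuringReplacePolicySketch(Predicate<String> oldPolicy, Predicate<String> newPolicy) {
        this.oldPolicy = oldPolicy;
        this.newPolicy = newPolicy;
    }

    // Step 2: alter table event whose alter type is rename.
    void onRename(String oldName, String newName) {
        tablesToBootstrap.remove(oldName);        // 2.1
        if (newPolicy.test(newName)) {            // 2.2
            tablesToBootstrap.add(newName);
        }
        if (!oldPolicy.test(oldName)) {           // 2.3
            tablesToDrop.add(oldName);
        }
    }

    // Step 3: drop table event. Returns true if the event should be replayed.
    boolean onDrop(String table) {
        if (tablesToBootstrap.remove(table)) {    // 3.1: remove and ignore the event
            return false;
        }
        // Assumption: the description says nothing about this case; replay the drop.
        return true;
    }

    // Step 4: any other event. Returns true if the event should be replayed.
    boolean onOtherEvent(String table) {
        if (tablesToBootstrap.contains(table)) {  // 4.1
            return false;
        }
        return newPolicy.test(table);             // 4.2
    }

    Set<String> tablesToBootstrap() {
        return tablesToBootstrap;
    }

    Set<String> tablesToDrop() {
        return tablesToDrop;
    }

    public static void main(String[] args) {
        // Hypothetical policies: old policy covers names starting with "a", new policy "b".
        RenameDuringReplacePolicySketch handler = new RenameDuringReplacePolicySketch(
                name -> name.startsWith("a"), name -> name.startsWith("b"));
        handler.onRename("x1", "b1");
        System.out.println("bootstrap: " + handler.tablesToBootstrap());
        System.out.println("replay other event on b1: " + handler.onOtherEvent("b1"));
    }
}
```

The sketch follows only the short numbered list; the case-by-case analysis under "Rename handling during replace policy" refines some of these decisions and is not reflected here.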

[jira] [Commented] (HIVE-21888) Set hive.parquet.timestamp.skip.conversion default to true

2019-06-28 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874771#comment-16874771
 ] 

Jesus Camacho Rodriguez commented on HIVE-21888:


+1

> Set hive.parquet.timestamp.skip.conversion default to true
> --
>
> Key: HIVE-21888
> URL: https://issues.apache.org/jira/browse/HIVE-21888
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-21888.02.patch, HIVE-21888.patch
>
>






[jira] [Updated] (HIVE-21867) Sort semijoin conditions to accelerate query processing

2019-06-28 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-21867:
---
Attachment: HIVE-21867.05.patch

> Sort semijoin conditions to accelerate query processing
> ---
>
> Key: HIVE-21867
> URL: https://issues.apache.org/jira/browse/HIVE-21867
> Project: Hive
>  Issue Type: New Feature
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21867.02.patch, HIVE-21867.03.patch, 
> HIVE-21867.04.patch, HIVE-21867.05.patch, HIVE-21867.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The problem was tackled for CBO in HIVE-21857. Semijoin filters are 
> introduced later in the planning phase. Follow similar approach to sort them, 
> trying to accelerate filter evaluation.





[jira] [Commented] (HIVE-21637) Synchronized metastore cache

2019-06-28 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874736#comment-16874736
 ] 

Hive QA commented on HIVE-21637:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
36s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
34s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
27s{color} | {color:blue} storage-api in master has 48 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
31s{color} | {color:blue} standalone-metastore/metastore-common in master has 
31 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 
13s{color} | {color:blue} standalone-metastore/metastore-server in master has 
179 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
12s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
32s{color} | {color:blue} beeline in master has 44 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
30s{color} | {color:blue} hcatalog/server-extensions in master has 3 extant 
Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
25s{color} | {color:blue} 
standalone-metastore/metastore-tools/metastore-benchmarks in master has 3 
extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
43s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
49s{color} | {color:blue} itests/util in master has 44 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 
15s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
32s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
10s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
12s{color} | {color:red} storage-api: The patch generated 1 new + 5 unchanged - 
0 fixed = 6 total (was 5) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
17s{color} | {color:red} standalone-metastore/metastore-common: The patch 
generated 9 new + 498 unchanged - 2 fixed = 507 total (was 500) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
48s{color} | {color:red} standalone-metastore/metastore-server: The patch 
generated 160 new + 2193 unchanged - 65 fixed = 2353 total (was 2258) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
54s{color} | {color:red} ql: The patch generated 10 new + 970 unchanged - 2 
fixed = 980 total (was 972) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
11s{color} | {color:red} standalone-metastore/metastore-tools/tools-common: The 
patch generated 5 new + 31 unchanged - 0 fixed = 36 total (was 31) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
12s{color} | {color:red} itests/hcatalog-unit: The patch generated 2 new + 24 
unchanged - 3 fixed = 26 total (was 27) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
18s{color} | {color:red} itests/hive-unit: The patch generated 3 new + 163 
unchanged - 1 fixed = 166 total (was 164) {color} |
| {color:red}-1{color} | {color:red} checkstyle 

[jira] [Commented] (HIVE-21932) IndexOutOfRangeExeption in FileChksumIterator

2019-06-28 Thread anishek (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874732#comment-16874732
 ] 

anishek commented on HIVE-21932:


+1

> IndexOutOfRangeExeption in FileChksumIterator
> -
>
> Key: HIVE-21932
> URL: https://issues.apache.org/jira/browse/HIVE-21932
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-21932.01.patch
>
>
> According to the definition of {{InsertEventRequestData}} in 
> {{hive_metastore.thrift}}, {{filesAddedChecksum}} is an optional field. But 
> the FileChksumIterator does not handle it correctly when a client fires an 
> insert event which does not have file checksums. The issue is that the 
> {{InsertEvent}} class initializes the fileChecksums list to an empty ArrayList, 
> so the null check in the following code will never come into play:
> {noformat}
> result = ReplChangeManager.encodeFileUri(files.get(i), chksums != null ? 
> chksums.get(i) : null,
> subDirs != null ? subDirs.get(i) : null);
> {noformat}
> The chksums check in the code above should include a {{!chksums.isEmpty()}} 
> check as well.
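
A minimal sketch of the guard suggested in the HIVE-21932 description above, assuming the checksum list can be either null or empty when the client sent no checksums. The class name `ChecksumGuardSketch`, the method `encodeAll`, and the `encode` helper are hypothetical; `encode` is only a stand-in for `ReplChangeManager.encodeFileUri`, whose real output format is not reproduced here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative only: names here are hypothetical and not taken from the patch.
class ChecksumGuardSketch {
    // Stand-in for ReplChangeManager.encodeFileUri; the real encoding differs.
    static String encode(String file, String checksum, String subDir) {
        return file + "#" + checksum + "#" + subDir;
    }

    static List<String> encodeAll(List<String> files, List<String> chksums, List<String> subDirs) {
        // Treat an empty checksum list the same way as a missing one, so that
        // chksums.get(i) is never called on an empty list.
        boolean hasChksums = chksums != null && !chksums.isEmpty();
        List<String> result = new ArrayList<>();
        for (int i = 0; i < files.size(); i++) {
            result.add(encode(files.get(i),
                    hasChksums ? chksums.get(i) : null,
                    subDirs != null ? subDirs.get(i) : null));
        }
        return result;
    }

    public static void main(String[] args) {
        // An insert event with two files and an empty (not null) checksum list.
        List<String> files = Arrays.asList("f1", "f2");
        System.out.println(encodeAll(files, Collections.<String>emptyList(), null));
    }
}
```

With only the original null check, the call in `main` would throw on `chksums.get(0)`, because the list is non-null but empty.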





[jira] [Commented] (HIVE-21762) REPL DUMP to support new format for replication policy input to take included tables list.

2019-06-28 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874729#comment-16874729
 ] 

Sankar Hariappan commented on HIVE-21762:
-

Updated Apache Wiki page for this issue,
https://cwiki.apache.org/confluence/display/Hive/HiveReplicationv2Development

> REPL DUMP to support new format for replication policy input to take included 
> tables list.
> --
>
> Key: HIVE-21762
> URL: https://issues.apache.org/jira/browse/HIVE-21762
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR, Replication, pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21762.01.patch, HIVE-21762.02.patch, 
> HIVE-21762.03.patch, HIVE-21762.04.patch, HIVE-21762.05.patch, 
> HIVE-21762.06.patch, HIVE-21762.07.patch
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> - REPL DUMP syntax:
> {code}
> REPL DUMP  [FROM  WITH ;
> {code}
> - The new format for the replication policy has 3 parts, all separated by a 
> dot (.). 
> 1. First part is the DB name.
> 2. Second part is the included list: comma-separated table names/regex within 
> square brackets[].  If the square brackets are not there, then it is treated 
> as single table replication, which skips DB level events.
> 3. Third part is the excluded list: comma-separated table names/regex within 
> square brackets[].
> {code}
>  -- Full DB replication which is currently supported
> .['.*?']  -- Full DB replication
> .[] -- Replicate just functions and not include any tables.
> .['t1', 't2']  -- DB replication with static list of tables t1 and 
> t2 included.
> .['t1*', 't2', '*t3'].['t100', '5t3', 't4'] -- DB replication with 
> all tables having prefix t1 or suffix t3, plus table t2, excluding t100 
> (which has the prefix t1), 5t3 (which has the suffix t3) and t4.
> {code}
> - Need to support regular expression of any format. 
> - A table is included in dump only if it matches the regular expressions in 
> included list and doesn't match the excluded list.
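
The inclusion rule in the last bullet of the HIVE-21762 description above can be sketched as follows. This assumes each list entry is a `java.util.regex` pattern that must match the whole table name; that is an assumption, and under it the wildcard-style entries from the description, such as 't1*' and '*t3', would have to be written as `t1.*` and `.*t3`. The class name `TableListPolicySketch` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative only: not the parser or matcher used by REPL DUMP.
class TableListPolicySketch {
    private final List<Pattern> included = new ArrayList<>();
    private final List<Pattern> excluded = new ArrayList<>();

    TableListPolicySketch(List<String> includedRegexes, List<String> excludedRegexes) {
        for (String regex : includedRegexes) {
            included.add(Pattern.compile(regex));
        }
        for (String regex : excludedRegexes) {
            excluded.add(Pattern.compile(regex));
        }
    }

    // True if the whole table name matches at least one of the given patterns.
    private static boolean matchesAny(List<Pattern> patterns, String table) {
        for (Pattern pattern : patterns) {
            if (pattern.matcher(table).matches()) {
                return true;
            }
        }
        return false;
    }

    // A table is dumped only if it matches the included list and not the excluded list.
    boolean isTableIncluded(String table) {
        return matchesAny(included, table) && !matchesAny(excluded, table);
    }

    public static void main(String[] args) {
        TableListPolicySketch policy = new TableListPolicySketch(
                Arrays.asList("t1.*", "t2", ".*t3"),
                Arrays.asList("t100", "5t3", "t4"));
        for (String table : Arrays.asList("t1", "t100", "t2", "5t3", "at3", "t4")) {
            System.out.println(table + " included: " + policy.isTableIncluded(table));
        }
    }
}
```

An empty included list matches nothing in this sketch, which lines up with the `[]` example that replicates only functions and no tables.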


