[jira] [Commented] (HIVE-16934) Transform COUNT(x) into COUNT() when x is not nullable

2017-06-23 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16061729#comment-16061729
 ] 

Ashutosh Chauhan commented on HIVE-16934:
-

+1

> Transform COUNT(x) into COUNT() when x is not nullable
> --
>
> Key: HIVE-16934
> URL: https://issues.apache.org/jira/browse/HIVE-16934
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16934.01.patch, HIVE-16934.02.patch, 
> HIVE-16934.03.patch, HIVE-16934.patch
>
>
> Add a rule to simplify COUNT aggregation function if possible, removing 
> expressions that cannot be nullable from its parameters.
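
The equivalence this rule relies on can be checked on a small table. The snippet below is only an illustrative sketch: it uses SQLite through Python's standard library rather than Hive, and the table `t` and its columns are hypothetical. It shows that COUNT(x) matches COUNT(*) when x is declared NOT NULL, while COUNT over a nullable column can differ.

```python
import sqlite3

# In-memory SQLite database; stands in for Hive purely to illustrate SQL COUNT semantics.
conn = sqlite3.connect(":memory:")

# Hypothetical table: x is declared NOT NULL, y is nullable.
conn.execute("CREATE TABLE t (x INTEGER NOT NULL, y INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 10), (2, None), (3, 30)],
)

# COUNT(col) skips NULL values; COUNT(*) counts rows.
count_star, count_x, count_y = conn.execute(
    "SELECT COUNT(*), COUNT(x), COUNT(y) FROM t"
).fetchone()

print(count_star, count_x, count_y)
conn.close()
```

Because x can never be NULL, no row is skipped by COUNT(x), so replacing it with COUNT(*) cannot change the result. The same replacement would be wrong for y.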



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16934) Transform COUNT(x) into COUNT() when x is not nullable

2017-06-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16061409#comment-16061409
 ] 

Hive QA commented on HIVE-16934:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12874292/HIVE-16934.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10845 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=233)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] (batchId=233)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] (batchId=233)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication (batchId=217)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication (batchId=217)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS (batchId=217)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=178)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5752/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5752/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5752/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12874292 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-16934) Transform COUNT(x) into COUNT() when x is not nullable

2017-06-23 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16061297#comment-16061297
 ] 

Jesus Camacho Rodriguez commented on HIVE-16934:


[~ashutoshc], I regenerated the latest q file changes and created an RB at: 
https://reviews.apache.org/r/60395/

I read your mind :)

> Transform COUNT(x) into COUNT() when x is not nullable
> --
>
> Key: HIVE-16934
> URL: https://issues.apache.org/jira/browse/HIVE-16934
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16934.01.patch, HIVE-16934.02.patch, 
> HIVE-16934.patch
>
>
> Add a rule to simplify COUNT aggregation function if possible, removing 
> expressions that cannot be nullable from its parameters.





[jira] [Commented] (HIVE-16934) Transform COUNT(x) into COUNT() when x is not nullable

2017-06-23 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16061294#comment-16061294
 ] 

Ashutosh Chauhan commented on HIVE-16934:
-

Can you create an RB for this? I want to take a closer look at some of the plan changes.



[jira] [Commented] (HIVE-16934) Transform COUNT(x) into COUNT() when x is not nullable

2017-06-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16061205#comment-16061205
 ] 

Hive QA commented on HIVE-16934:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12874270/HIVE-16934.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 10845 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions] (batchId=241)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reduce_deduplicate_extended2] (batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[subquery_in_having] (batchId=55)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_partition_pruning] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[multiMapJoin2] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_in] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_multi] (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar] (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_views] (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_dynamic_partition_pruning] (batchId=151)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=233)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] (batchId=233)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query83] (batchId=233)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] (batchId=233)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_in] (batchId=127)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication (batchId=217)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication (batchId=217)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS (batchId=217)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=178)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5749/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5749/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5749/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 25 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12874270 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-16934) Transform COUNT(x) into COUNT() when x is not nullable

2017-06-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059437#comment-16059437
 ] 

Hive QA commented on HIVE-16934:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12874082/HIVE-16934.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 10846 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=238)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=142)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=233)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] (batchId=233)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=233)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] (batchId=233)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_1_23] (batchId=134)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_skew_1_23] (batchId=104)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join35] (batchId=127)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union24] (batchId=125)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication (batchId=217)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication (batchId=217)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS (batchId=217)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=178)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5731/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5731/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5731/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 19 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12874082 - PreCommit-HIVE-Build

> Transform COUNT(x) into COUNT() when x is not nullable
> --
>
> Key: HIVE-16934
> URL: https://issues.apache.org/jira/browse/HIVE-16934
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16934.01.patch, HIVE-16934.patch
>
>
> Add a rule to simplify COUNT aggregation function if possible, removing 
> expressions that cannot be nullable from its parameters.





[jira] [Commented] (HIVE-16934) Transform COUNT(x) into COUNT() when x is not nullable

2017-06-22 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059024#comment-16059024
 ] 

Jesus Camacho Rodriguez commented on HIVE-16934:


[~vgarg], there are multiple benefits that I can think of. First, on the 
execution side we will not have to access or evaluate any expression when 
calculating the COUNT. Further, by removing expressions that are referenced by 
the aggregate call, we might be able to prune more columns in the operator 
plan. Another benefit is that some aggregate calls would no longer be 
computed twice, e.g., {{COUNT(x)}} and {{COUNT(y)}} if _x_ and _y_ are not 
nullable, since both simplify to the same call. Finally, as a side effect, we 
might be able to recognize more equivalent expressions in materialized view 
(MV) rewriting or SharedWorkOptimizer, and push more computation to Druid, 
since currently Druid is only capable of executing {{count(*)}}.
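
The deduplication benefit described above can be sketched with a toy model. This is not the code from the patch and not Hive's or Calcite's API. `AggCall`, `simplify_counts`, and the nullability map are hypothetical names introduced only for illustration, and columns with unknown nullability are conservatively treated as nullable.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class AggCall:
    """Toy aggregate call (hypothetical). arg=None models COUNT(*)."""
    func: str
    arg: Optional[str]


def simplify_counts(
    calls: List[AggCall], nullable: Dict[str, bool]
) -> Tuple[List[AggCall], List[int]]:
    """Rewrite COUNT(col) to COUNT(*) when col is known to be non-nullable,
    then merge identical calls.

    Returns (unique_calls, mapping), where mapping[i] is the index in
    unique_calls that now serves the i-th original call.
    """
    unique: List[AggCall] = []
    mapping: List[int] = []
    for call in calls:
        # Unknown columns default to nullable, so they are never rewritten.
        if (
            call.func == "COUNT"
            and call.arg is not None
            and not nullable.get(call.arg, True)
        ):
            call = AggCall("COUNT", None)
        if call not in unique:
            unique.append(call)
        mapping.append(unique.index(call))
    return unique, mapping


calls = [
    AggCall("COUNT", "x"),
    AggCall("COUNT", "y"),
    AggCall("COUNT", "z"),
    AggCall("SUM", "x"),
]
nullable = {"x": False, "y": False, "z": True}

unique, mapping = simplify_counts(calls, nullable)
print(unique)
print(mapping)
```

In this sketch COUNT(x) and COUNT(y) collapse into a single COUNT(*), so only three aggregates need to be computed instead of four. COUNT(z) is left alone because z may contain NULLs, and SUM(x) still references x, so that column could not be pruned in this particular plan.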

> Transform COUNT(x) into COUNT() when x is not nullable
> --
>
> Key: HIVE-16934
> URL: https://issues.apache.org/jira/browse/HIVE-16934
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16934.patch
>
>
> Add a rule to simplify COUNT aggregation function if possible, removing 
> expressions that cannot be nullable from its parameters.





[jira] [Commented] (HIVE-16934) Transform COUNT(x) into COUNT() when x is not nullable

2017-06-21 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16058245#comment-16058245
 ] 

Vineet Garg commented on HIVE-16934:


Hi [~jcamachorodriguez], why do we do this rewrite? How does rewriting 
count(a) to count(*) help?
