[jira] [Commented] (HIVE-21344) CBO: Materialized view registry is not used for Calcite planner

2019-02-27 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780214#comment-16780214
 ] 

Jesus Camacho Rodriguez commented on HIVE-21344:


[~gopalv], {{getAllValidMaterializedViews}} should get the plans from the 
registry, but it still requires a call to metastore to verify that the 
materializations still exist, whether they are outdated or not, etc. That logic 
is in {{getValidMaterializedViews}}. Is that what you are seeing?
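
For illustration only, a rough sketch of that flow with placeholder interfaces (not the actual {{HiveMaterializedViewsRegistry}} or metastore client API): plans come from the local registry, but each one is still validated against the metastore.

{code:java}
import java.util.List;
import java.util.stream.Collectors;

// Sketch only: registry supplies the pre-built plans, metastore verifies them.
// The interfaces below are placeholders, not Hive classes.
public class RegistryBackedMvLookup {
  interface Materialization { String name(); }
  interface Registry { List<Materialization> plansFor(List<String> tablesUsed); }
  interface Metastore { boolean stillExistsAndFresh(Materialization m); }

  static List<Materialization> getValidMaterializedViews(
      Registry registry, Metastore metastore, List<String> tablesUsed) {
    // The registry avoids re-parsing view definitions, but existence and
    // staleness checks still round-trip to the metastore.
    return registry.plansFor(tablesUsed).stream()
        .filter(metastore::stillExistsAndFresh)
        .collect(Collectors.toList());
  }
}
{code}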

> CBO: Materialized view registry is not used for Calcite planner
> ---
>
> Key: HIVE-21344
> URL: https://issues.apache.org/jira/browse/HIVE-21344
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Priority: Major
> Attachments: calcite-planner-after-fix.svg.zip, mv-get-from-remote.png
>
>
> {code}
> // This is not a rebuild, we retrieve all the materializations. In turn, we do not need
> // to force the materialization contents to be up-to-date, as this is not a rebuild, and
> // we apply the user parameters (HIVE_MATERIALIZED_VIEW_REWRITING_TIME_WINDOW) instead.
> materializations = db.getAllValidMaterializedViews(getTablesUsed(basePlan), false, getTxnMgr());
> }
> {code}
> !mv-get-from-remote.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21343) CBO: CalcitePlanner debug logging is expensive and costly

2019-02-27 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780203#comment-16780203
 ] 

Jesus Camacho Rodriguez commented on HIVE-21343:


Fix for this is part of the patch in HIVE-18920.

> CBO: CalcitePlanner debug logging is expensive and costly
> -
>
> Key: HIVE-21343
> URL: https://issues.apache.org/jira/browse/HIVE-21343
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Priority: Major
> Attachments: Reloptutil-toString.png, 
> calcite-planner-after-fix.svg.zip
>
>
> {code}
>   //Remove subquery
>   LOG.debug("Plan before removing subquery:\n" + RelOptUtil.toString(calciteGenPlan));
>   calciteGenPlan = hepPlan(calciteGenPlan, false, mdProvider.getMetadataProvider(), null,
>       new HiveSubQueryRemoveRule(conf));
>   LOG.debug("Plan just after removing subquery:\n" + RelOptUtil.toString(calciteGenPlan));
>   calciteGenPlan = HiveRelDecorrelator.decorrelateQuery(calciteGenPlan);
>   LOG.debug("Plan after decorrelation:\n" + RelOptUtil.toString(calciteGenPlan));
> {code}
> The LOG.debug() consumes more CPU than the actual planner steps.
>  !Reloptutil-toString.png! 
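
A common mitigation, sketched below with SLF4J and Calcite's {{RelOptUtil}} (this is only an illustration, not the patch in HIVE-18920), is to build the plan string only when debug logging is actually enabled:

{code:java}
import org.apache.calcite.plan.RelOptUtil;
import org.apache.calcite.rel.RelNode;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Sketch only: guard the expensive RelOptUtil.toString() call so the plan
// string is never built unless debug logging is enabled.
public class GuardedPlanLogging {
  private static final Logger LOG = LoggerFactory.getLogger(GuardedPlanLogging.class);

  static void logPlan(String stage, RelNode plan) {
    if (LOG.isDebugEnabled()) {
      // RelOptUtil.toString() walks the whole RelNode tree; only pay for it on demand.
      LOG.debug("Plan {}:\n{}", stage, RelOptUtil.toString(plan));
    }
  }
}
{code}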



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21343) CBO: CalcitePlanner debug logging is expensive and costly

2019-02-27 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-21343:
--

Assignee: Jesus Camacho Rodriguez

> CBO: CalcitePlanner debug logging is expensive and costly
> -
>
> Key: HIVE-21343
> URL: https://issues.apache.org/jira/browse/HIVE-21343
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: Reloptutil-toString.png, 
> calcite-planner-after-fix.svg.zip
>
>
> {code}
>   //Remove subquery
>   LOG.debug("Plan before removing subquery:\n" + RelOptUtil.toString(calciteGenPlan));
>   calciteGenPlan = hepPlan(calciteGenPlan, false, mdProvider.getMetadataProvider(), null,
>       new HiveSubQueryRemoveRule(conf));
>   LOG.debug("Plan just after removing subquery:\n" + RelOptUtil.toString(calciteGenPlan));
>   calciteGenPlan = HiveRelDecorrelator.decorrelateQuery(calciteGenPlan);
>   LOG.debug("Plan after decorrelation:\n" + RelOptUtil.toString(calciteGenPlan));
> {code}
> The LOG.debug() consumes more CPU than the actual planner steps.
>  !Reloptutil-toString.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21343) CBO: CalcitePlanner debug logging is expensive and costly

2019-02-27 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-21343:
---
Attachment: calcite-planner-after-fix.svg.zip

> CBO: CalcitePlanner debug logging is expensive and costly
> -
>
> Key: HIVE-21343
> URL: https://issues.apache.org/jira/browse/HIVE-21343
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Priority: Major
> Attachments: Reloptutil-toString.png, 
> calcite-planner-after-fix.svg.zip
>
>
> {code}
>   //Remove subquery
>   LOG.debug("Plan before removing subquery:\n" + RelOptUtil.toString(calciteGenPlan));
>   calciteGenPlan = hepPlan(calciteGenPlan, false, mdProvider.getMetadataProvider(), null,
>       new HiveSubQueryRemoveRule(conf));
>   LOG.debug("Plan just after removing subquery:\n" + RelOptUtil.toString(calciteGenPlan));
>   calciteGenPlan = HiveRelDecorrelator.decorrelateQuery(calciteGenPlan);
>   LOG.debug("Plan after decorrelation:\n" + RelOptUtil.toString(calciteGenPlan));
> {code}
> The LOG.debug() consumes more CPU than the actual planner steps.
>  !Reloptutil-toString.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21344) CBO: Materialized view registry is not used for Calcite planner

2019-02-27 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-21344:
---
Attachment: calcite-planner-after-fix.svg.zip

> CBO: Materialized view registry is not used for Calcite planner
> ---
>
> Key: HIVE-21344
> URL: https://issues.apache.org/jira/browse/HIVE-21344
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Priority: Major
> Attachments: calcite-planner-after-fix.svg.zip, mv-get-from-remote.png
>
>
> {code}
> // This is not a rebuild, we retrieve all the materializations. In turn, we do not need
> // to force the materialization contents to be up-to-date, as this is not a rebuild, and
> // we apply the user parameters (HIVE_MATERIALIZED_VIEW_REWRITING_TIME_WINDOW) instead.
> materializations = db.getAllValidMaterializedViews(getTablesUsed(basePlan), false, getTxnMgr());
> }
> {code}
> !mv-get-from-remote.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20656) Sensible defaults: Map aggregation memory configs are too aggressive

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780187#comment-16780187
 ] 

Hive QA commented on HIVE-20656:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
39s{color} | {color:blue} common in master has 63 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 14m 54s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16285/dev-support/hive-personality.sh
 |
| git revision | master / 7246b13 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: common U: common |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16285/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Sensible defaults: Map aggregation memory configs are too aggressive
> 
>
> Key: HIVE-20656
> URL: https://issues.apache.org/jira/browse/HIVE-20656
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20656.1.patch
>
>
> The defaults for the following configs seem to be too aggressive. In Java 
> this can easily lead to repeated full GC pauses during which little memory can 
> be reclaimed.
> {code:java}
> HIVEMAPAGGRHASHMEMORY("hive.map.aggr.hash.percentmemory", (float) 0.99,
>     "Portion of total memory to be used by map-side group aggregation hash table"),
> HIVEMAPAGGRMEMORYTHRESHOLD("hive.map.aggr.hash.force.flush.memory.threshold", (float) 0.9,
>     "The max memory to be used by map-side group aggregation hash table.\n" +
>     "If the memory usage is higher than this number, force to flush data"),{code}
>  
> We can be a little more conservative with these configs to avoid getting into 
> GC pauses.
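
For illustration, lower values could be set explicitly on a {{HiveConf}}; the 0.5 and 0.7 below are assumed example values, not the defaults chosen by the attached patch.

{code:java}
import org.apache.hadoop.hive.conf.HiveConf;

// Sketch only: override the aggressive defaults on a session's HiveConf.
// The 0.5 / 0.7 values are illustrative assumptions.
public class ConservativeMapAggrSettings {
  public static void main(String[] args) {
    HiveConf conf = new HiveConf();
    conf.set("hive.map.aggr.hash.percentmemory", "0.5");                 // default 0.99
    conf.set("hive.map.aggr.hash.force.flush.memory.threshold", "0.7");  // default 0.9
    System.out.println(conf.get("hive.map.aggr.hash.percentmemory"));
  }
}
{code}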



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21001) Upgrade to calcite-1.18

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780185#comment-16780185
 ] 

Hive QA commented on HIVE-21001:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
45s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m  
8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
33s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
31s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
37s{color} | {color:blue} accumulo-handler in master has 21 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
37s{color} | {color:blue} hbase-handler in master has 15 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  9m 
40s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
46s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  9m  
5s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
51s{color} | {color:red} ql: The patch generated 5 new + 290 unchanged - 29 
fixed = 295 total (was 319) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  2m 
18s{color} | {color:red} root: The patch generated 5 new + 290 unchanged - 29 
fixed = 295 total (was 319) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 18m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 89m  9s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  xml  compile  findbugs  
checkstyle  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16284/dev-support/hive-personality.sh
 |
| git revision | master / 38c20ba |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16284/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16284/yetus/diff-checkstyle-root.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16284/yetus/whitespace-eol.txt
 |
| modules | C: ql accumulo-handler hbase-handler . U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16284/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Upgrade to calcite-1.18
> ---
>
> Key: HIVE-21001
> URL: https://issues.apache.org/jira/browse/HIVE-21001
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
> 

[jira] [Commented] (HIVE-21001) Upgrade to calcite-1.18

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780170#comment-16780170
 ] 

Hive QA commented on HIVE-21001:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12960456/HIVE-21001.41.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 40 failed/errored test(s), 15805 tests 
executed
*Failed tests:*
{noformat}
TestMiniLlapCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=155)

[intersect_all.q,unionDistinct_1.q,table_nonprintable.q,orc_llap_counters1.q,mm_cttas.q,whroot_external1.q,global_limit.q,cte_2.q,rcfile_createas1.q,dynamic_partition_pruning_2.q,intersect_merge.q,results_cache_diff_fs.q,cttl.q,parallel_colstats.q,load_hdfs_file_with_space_in_the_name.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ambiguitycheck] 
(batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[constant_prop_3] 
(batchId=47)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[semijoin6] 
(batchId=182)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query11] 
(batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query14] 
(batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query17] 
(batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query25] 
(batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query29] 
(batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query31] 
(batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query47] 
(batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query4] 
(batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query50] 
(batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query57] 
(batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query58] 
(batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query64] 
(batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query74] 
(batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query75] 
(batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query85] 
(batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query11]
 (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query14]
 (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query17]
 (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query25]
 (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query29]
 (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query31]
 (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query47]
 (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query4]
 (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query50]
 (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query57]
 (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query58]
 (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query64]
 (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query72]
 (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query74]
 (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query75]
 (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query85]
 (batchId=275)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testAllowedCommands
 (batchId=265)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testAuthZFailureLlapCachePurge
 (batchId=265)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testAuthorization1
 (batchId=265)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
 (batchId=265)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testConfigWhiteList
 (batchId=265)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16284/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16284/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16284/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase

[jira] [Updated] (HIVE-21329) Custom Tez runtime unordered output buffer size depending on operator pipeline

2019-02-27 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-21329:
---
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master.

> Custom Tez runtime unordered output buffer size depending on operator pipeline
> --
>
> Key: HIVE-21329
> URL: https://issues.apache.org/jira/browse/HIVE-21329
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21329.01.patch, HIVE-21329.patch, HIVE-21329.patch
>
>
> For instance, if we have a reduce sink operator with no keys followed by a 
> Group By (merge partial), we can decrease the output buffer size since we 
> will only produce a single row.
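
A rough sketch of the idea, with placeholder method and constant names rather than Hive's actual Tez configuration code:

{code:java}
// Sketch only: choose the Tez unordered output buffer size from the operator
// pipeline. All names here are placeholders.
public class UnorderedOutputBufferSize {
  private static final int DEFAULT_BUFFER_MB = 100;
  private static final int SINGLE_ROW_BUFFER_MB = 1;

  static int chooseBufferSizeMb(boolean reduceSinkHasKeys, boolean followedByMergePartialGroupBy) {
    // A keyless reduce sink feeding a merge-partial GROUP BY emits a single
    // aggregated row, so a large serialization buffer is wasted memory.
    if (!reduceSinkHasKeys && followedByMergePartialGroupBy) {
      return SINGLE_ROW_BUFFER_MB;
    }
    return DEFAULT_BUFFER_MB;
  }

  public static void main(String[] args) {
    System.out.println(chooseBufferSizeMb(false, true));  // prints 1
    System.out.println(chooseBufferSizeMb(true, false));  // prints 100
  }
}
{code}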



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21342) Analyze compute stats for column leave behind staging dir on hdfs

2019-02-27 Thread Rajkumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh updated HIVE-21342:
--
Attachment: HIVE-21342.patch
Status: Patch Available  (was: In Progress)

> Analyze compute stats for column leave behind staging dir on hdfs
> -
>
> Key: HIVE-21342
> URL: https://issues.apache.org/jira/browse/HIVE-21342
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
> Environment: hive-3.1
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-21342.patch
>
>
> Staging dir cleanup does not happen for "analyze table .. compute statistics 
> for columns"; this leaves a stale directory on HDFS.
> The problem seems to be with ColumnStatsSemanticAnalyzer, which does not have 
> HDFS cleanup set for the context.
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnStatsSemanticAnalyzer.java#L310
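
A sketch of the shape of the fix, using placeholder types and a hypothetical setter rather than the real {{org.apache.hadoop.hive.ql.Context}} API:

{code:java}
// Sketch only: the gist is to flag the compilation context so its HDFS
// staging/scratch directory is removed once the rewritten stats query
// finishes. The setter name is hypothetical.
public class StagingDirCleanupSketch {
  static class CompilationContext {
    private boolean hdfsCleanup;
    void setHdfsCleanup(boolean cleanup) { this.hdfsCleanup = cleanup; }  // hypothetical setter
    boolean isHdfsCleanup() { return hdfsCleanup; }
  }

  public static void main(String[] args) {
    CompilationContext ctx = new CompilationContext();
    ctx.setHdfsCleanup(true);  // the flag ColumnStatsSemanticAnalyzer reportedly never sets
    System.out.println("cleanup staging dir on close: " + ctx.isHdfsCleanup());
  }
}
{code}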



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21332) Cache Purge command does purge the in-use buffer.

2019-02-27 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780155#comment-16780155
 ] 

Jesus Camacho Rodriguez commented on HIVE-21332:


This issue has been merged with the wrong JIRA number (HIVE-21333). Also, there 
seem to be both a commit and a merge; not sure why.

> Cache Purge command does purge the in-use buffer.
> -
>
> Key: HIVE-21332
> URL: https://issues.apache.org/jira/browse/HIVE-21332
> Project: Hive
>  Issue Type: Bug
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-21332.patch
>
>
> The Cache Purge command is purging what it is not supposed to evict.
> This can lead to an unrecoverable state.
> {code} 
> TaskAttempt 3 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1545278897356_0093_27_00_01_3:java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
> java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 32768; at 0 out of 1 (entire cache is fragmented and 
> locked, or an internal issue)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
>  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>  at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>  at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>  at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 32768; at 0 out of 1 (entire cache is fragmented and 
> locked, or an internal issue)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>  ... 15 more
> Caused by: java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 32768; at 0 out of 1 (entire cache is fragmented and 
> locked, or an internal issue)
>  at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>  at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>  at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
>  at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
>  at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
>  ... 17 more
> Caused by: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 32768; at 0 out of 1 (entire cache is fragmented and 
> locked, or an internal issue)
>  at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:513)
>  at 
> 

[jira] [Assigned] (HIVE-21342) Analyze compute stats for column leave behind staging dir on hdfs

2019-02-27 Thread Rajkumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh reassigned HIVE-21342:
-


> Analyze compute stats for column leave behind staging dir on hdfs
> -
>
> Key: HIVE-21342
> URL: https://issues.apache.org/jira/browse/HIVE-21342
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
> Environment: hive-3.1
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
>
> Staging dir cleanup does not happen for "analyze table .. compute statistics 
> for columns"; this leaves a stale directory on HDFS.
> The problem seems to be with ColumnStatsSemanticAnalyzer, which does not have 
> HDFS cleanup set for the context.
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnStatsSemanticAnalyzer.java#L310



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work started] (HIVE-21342) Analyze compute stats for column leave behind staging dir on hdfs

2019-02-27 Thread Rajkumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-21342 started by Rajkumar Singh.
-
> Analyze compute stats for column leave behind staging dir on hdfs
> -
>
> Key: HIVE-21342
> URL: https://issues.apache.org/jira/browse/HIVE-21342
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
> Environment: hive-3.1
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
>
> Staging dir cleanup does not happen for "analyze table .. compute statistics 
> for columns"; this leaves a stale directory on HDFS.
> The problem seems to be with ColumnStatsSemanticAnalyzer, which does not have 
> HDFS cleanup set for the context.
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnStatsSemanticAnalyzer.java#L310



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780111#comment-16780111
 ] 

Hive QA commented on HIVE-21279:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12960477/HIVE-21279.11.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 15782 tests 
executed
*Failed tests:*
{noformat}
TestDataSourceProviderFactory - did not produce a TEST-*.xml file (likely timed 
out) (batchId=230)
TestObjectStore - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestPartitionProjectionEvaluator - did not produce a TEST-*.xml file (likely 
timed out) (batchId=230)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16282/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16282/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16282/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12960477 - PreCommit-HIVE-Build

> Avoid moving/rename operation in FileSink op for SELECT queries
> ---
>
> Key: HIVE-21279
> URL: https://issues.apache.org/jira/browse/HIVE-21279
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21279.1.patch, HIVE-21279.10.patch, 
> HIVE-21279.11.patch, HIVE-21279.2.patch, HIVE-21279.3.patch, 
> HIVE-21279.4.patch, HIVE-21279.5.patch, HIVE-21279.6.patch, 
> HIVE-21279.7.patch, HIVE-21279.8.patch, HIVE-21279.9.patch
>
>
> Currently, at the end of a job, the FileSink operator moves/renames the temp 
> directory to another directory from which FetchTask fetches results. This is 
> done to avoid fetching potentially partial/invalid files written by 
> failed/runaway tasks. This operation is expensive for cloud storage. It could 
> be avoided if FetchTask were passed the set of files to read instead of the 
> whole directory.
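
A minimal sketch of the alternative described above, with placeholder names rather than Hive's FileSinkOperator/FetchTask classes:

{code:java}
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Sketch only: hand the fetch side an explicit list of committed files instead
// of renaming the whole temp directory.
public class FetchFromFileList {
  static List<String> committedFiles(List<String> filesInTmpDir, List<String> committedTaskIds) {
    // Only files written by committed task attempts are listed for the fetch;
    // partial files from failed or runaway attempts are never returned, so no
    // directory rename is needed.
    return filesInTmpDir.stream()
        .filter(f -> committedTaskIds.stream().anyMatch(f::contains))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> files = Arrays.asList("tmp/000000_0_attempt_1", "tmp/000001_0_attempt_0");
    System.out.println(committedFiles(files, Arrays.asList("attempt_1")));  // [tmp/000000_0_attempt_1]
  }
}
{code}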



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780112#comment-16780112
 ] 

Hive QA commented on HIVE-21279:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12960477/HIVE-21279.11.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16283/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16283/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16283/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12960477/HIVE-21279.11.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12960477 - PreCommit-HIVE-Build

> Avoid moving/rename operation in FileSink op for SELECT queries
> ---
>
> Key: HIVE-21279
> URL: https://issues.apache.org/jira/browse/HIVE-21279
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21279.1.patch, HIVE-21279.10.patch, 
> HIVE-21279.11.patch, HIVE-21279.2.patch, HIVE-21279.3.patch, 
> HIVE-21279.4.patch, HIVE-21279.5.patch, HIVE-21279.6.patch, 
> HIVE-21279.7.patch, HIVE-21279.8.patch, HIVE-21279.9.patch
>
>
> Currently, at the end of a job, the FileSink operator moves/renames the temp 
> directory to another directory from which FetchTask fetches results. This is 
> done to avoid fetching potentially partial/invalid files written by 
> failed/runaway tasks. This operation is expensive for cloud storage. It could 
> be avoided if FetchTask were passed the set of files to read instead of the 
> whole directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780098#comment-16780098
 ] 

Hive QA commented on HIVE-21279:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
53s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 5s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
29s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
33s{color} | {color:blue} hcatalog/streaming in master has 11 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
24s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
51s{color} | {color:green} ql: The patch generated 0 new + 780 unchanged - 9 
fixed = 780 total (was 789) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} The patch streaming passed checkstyle {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 32m 26s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16282/dev-support/hive-personality.sh
 |
| git revision | master / 38c20ba |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql hcatalog/streaming U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16282/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Avoid moving/rename operation in FileSink op for SELECT queries
> ---
>
> Key: HIVE-21279
> URL: https://issues.apache.org/jira/browse/HIVE-21279
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21279.1.patch, HIVE-21279.10.patch, 
> HIVE-21279.11.patch, HIVE-21279.2.patch, HIVE-21279.3.patch, 
> HIVE-21279.4.patch, HIVE-21279.5.patch, HIVE-21279.6.patch, 
> HIVE-21279.7.patch, HIVE-21279.8.patch, HIVE-21279.9.patch
>
>
> Currently, at the end of a job, the FileSink operator moves/renames the temp 
> directory to another directory from which FetchTask fetches results. This is 
> done to avoid fetching potentially partial/invalid files written by 
> failed/runaway tasks. This operation is expensive for cloud storage. It could be 

[jira] [Updated] (HIVE-21340) CBO: Prune non-key columns feeding into a SemiJoin

2019-02-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21340:
---
Attachment: HIVE-21340.1.patch

> CBO: Prune non-key columns feeding into a SemiJoin
> --
>
> Key: HIVE-21340
> URL: https://issues.apache.org/jira/browse/HIVE-21340
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21340.1.patch
>
>
> {code}
> explain cbo 
> with ss as 
> (select count(1), ss_item_sk, ss_ticket_number from 
> store_sales group by ss_item_sk, ss_ticket_number 
> having count(1) > 1) 
> select count(1) from item where i_item_sk IN (select ss_item_sk from ss);
> {code}
> Notice the {{HiveProject(ss_item_sk=[$0], ss_ticket_number=[$1], $f2=[$2])}}. 
> Only ss_item_sk is relevant for the HiveSemiJoin.
> {code}
> CBO PLAN:
> HiveAggregate(group=[{}], agg#0=[count()])
>   HiveSemiJoin(condition=[=($0, $1)], joinType=[inner])
>     HiveProject(i_item_sk=[$0])
>       HiveFilter(condition=[IS NOT NULL($0)])
>         HiveTableScan(table=[[tpcds_copy_orc_partitioned_1, item]], table:alias=[item])
>     HiveProject(ss_item_sk=[$0], ss_ticket_number=[$1], $f2=[$2])
>       HiveFilter(condition=[>($2, 1)])
>         HiveAggregate(group=[{1, 8}], agg#0=[count()])
>           HiveFilter(condition=[IS NOT NULL($1)])
>             HiveTableScan(table=[[tpcds_copy_orc_partitioned_1, store_sales]], table:alias=[store_sales])
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21340) CBO: Prune non-key columns feeding into a SemiJoin

2019-02-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21340:
---
Status: Patch Available  (was: Open)

> CBO: Prune non-key columns feeding into a SemiJoin
> --
>
> Key: HIVE-21340
> URL: https://issues.apache.org/jira/browse/HIVE-21340
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21340.1.patch
>
>
> {code}
> explain cbo 
> with ss as 
> (select count(1), ss_item_sk, ss_ticket_number from 
> store_sales group by ss_item_sk, ss_ticket_number 
> having count(1) > 1) 
> select count(1) from item where i_item_sk IN (select ss_item_sk from ss);
> {code}
> Notice the {{HiveProject(ss_item_sk=[$0], ss_ticket_number=[$1], $f2=[$2])}}. 
> Only ss_item_sk is relevant for the HiveSemiJoin.
> {code}
> CBO PLAN:
> HiveAggregate(group=[{}], agg#0=[count()])
>   HiveSemiJoin(condition=[=($0, $1)], joinType=[inner])
>     HiveProject(i_item_sk=[$0])
>       HiveFilter(condition=[IS NOT NULL($0)])
>         HiveTableScan(table=[[tpcds_copy_orc_partitioned_1, item]], table:alias=[item])
>     HiveProject(ss_item_sk=[$0], ss_ticket_number=[$1], $f2=[$2])
>       HiveFilter(condition=[>($2, 1)])
>         HiveAggregate(group=[{1, 8}], agg#0=[count()])
>           HiveFilter(condition=[IS NOT NULL($1)])
>             HiveTableScan(table=[[tpcds_copy_orc_partitioned_1, store_sales]], table:alias=[store_sales])
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21329) Custom Tez runtime unordered output buffer size depending on operator pipeline

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780069#comment-16780069
 ] 

Hive QA commented on HIVE-21329:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12960443/HIVE-21329.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15820 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16281/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16281/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16281/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12960443 - PreCommit-HIVE-Build

> Custom Tez runtime unordered output buffer size depending on operator pipeline
> --
>
> Key: HIVE-21329
> URL: https://issues.apache.org/jira/browse/HIVE-21329
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21329.01.patch, HIVE-21329.patch, HIVE-21329.patch
>
>
> For instance, if we have a reduce sink operator with no keys followed by a 
> Group By (merge partial), we can decrease the output buffer size since we 
> will only produce a single row.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21329) Custom Tez runtime unordered output buffer size depending on operator pipeline

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780049#comment-16780049
 ] 

Hive QA commented on HIVE-21329:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
2s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 5s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
41s{color} | {color:blue} common in master has 63 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
36s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
26s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
31s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
17s{color} | {color:red} common: The patch generated 2 new + 428 unchanged - 0 
fixed = 430 total (was 428) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 32m 54s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16281/dev-support/hive-personality.sh
 |
| git revision | master / 38c20ba |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16281/yetus/diff-checkstyle-common.txt
 |
| modules | C: common ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16281/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Custom Tez runtime unordered output buffer size depending on operator pipeline
> --
>
> Key: HIVE-21329
> URL: https://issues.apache.org/jira/browse/HIVE-21329
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21329.01.patch, HIVE-21329.patch, HIVE-21329.patch
>
>
> For instance, if we have a reduce sink operator with no keys followed by a 
> Group By (merge partial), we can decrease the output buffer size since we 
> will only produce a single row.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21312) FSStatsAggregator::connect is slow

2019-02-27 Thread Rajesh Balamohan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780047#comment-16780047
 ] 

Rajesh Balamohan commented on HIVE-21312:
-

The Kryo readObject class type was missing earlier, causing the failures. Fixed it 
in the recently uploaded patch, with no test failures.
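
For reference, generic Kryo usage showing the class-type argument in question; this is illustrative only, not the FSStatsAggregator patch itself.

{code:java}
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.HashMap;

// Sketch only: readObject needs the concrete class of the serialized value.
public class KryoReadObjectExample {
  public static void main(String[] args) {
    Kryo kryo = new Kryo();
    kryo.setRegistrationRequired(false);  // keep the example self-contained

    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (Output out = new Output(bytes)) {
      HashMap<String, Long> stats = new HashMap<>();
      stats.put("numRows", 42L);
      kryo.writeObject(out, stats);
    }

    try (Input in = new Input(new ByteArrayInputStream(bytes.toByteArray()))) {
      // The class argument is the piece that was reportedly missing.
      HashMap<?, ?> readBack = kryo.readObject(in, HashMap.class);
      System.out.println(readBack);  // {numRows=42}
    }
  }
}
{code}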

> FSStatsAggregator::connect is slow
> --
>
> Key: HIVE-21312
> URL: https://issues.apache.org/jira/browse/HIVE-21312
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Trivial
> Attachments: HIVE-21312.1.patch, HIVE-21312.2.patch, 
> HIVE-21312.3.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18920) CBO: Initialize the Janino providers ahead of 1st query

2019-02-27 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-18920:
---
Attachment: HIVE-18920.01.patch

> CBO: Initialize the Janino providers ahead of 1st query
> ---
>
> Key: HIVE-18920
> URL: https://issues.apache.org/jira/browse/HIVE-18920
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-18920.01.patch, HIVE-18920.patch
>
>
> Hive Calcite metadata providers are compiled when the 1st query comes in.
> If a second query arrives before the 1st one has built a metadata provider, 
> it will also try to do the same thing, because the cache is not populated yet.
> With 1024 concurrent users, it takes 6 minutes for the 1st query to finish 
> fighting all the other queries which are trying to load that cache.
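
A minimal sketch of the warm-up idea, with a placeholder provider type rather than the actual Calcite/Janino classes:

{code:java}
import java.util.concurrent.atomic.AtomicReference;

// Sketch only: warm an expensive, shared provider once at startup so the first
// N concurrent queries don't all try to build it. "MetadataProvider" is a
// placeholder, not Hive's actual class.
public class ProviderWarmup {
  interface MetadataProvider {}

  private static final AtomicReference<MetadataProvider> CACHE = new AtomicReference<>();

  // Called from service startup, before the first query is admitted.
  static void warmUp() {
    CACHE.compareAndSet(null, compileProvider());
  }

  static MetadataProvider get() {
    MetadataProvider p = CACHE.get();
    return p != null ? p : compileProvider();  // fallback if warm-up was skipped
  }

  private static MetadataProvider compileProvider() {
    // Stand-in for the minutes-long Janino compilation described in the issue.
    return new MetadataProvider() {};
  }
}
{code}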



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21316) Comparison of varchar column and string literal should happen in varchar

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780033#comment-16780033
 ] 

Hive QA commented on HIVE-21316:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
56s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
24s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 6s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
37s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  9m 
11s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
45s{color} | {color:red} ql: The patch generated 4 new + 183 unchanged - 0 
fixed = 187 total (was 183) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  2m 
19s{color} | {color:red} root: The patch generated 4 new + 358 unchanged - 0 
fixed = 362 total (was 358) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch 9 line(s) with tabs. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 
48s{color} | {color:red} ql generated 1 new + 2251 unchanged - 0 fixed = 2252 
total (was 2251) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  9m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 71m 45s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  
org.apache.hadoop.hive.ql.optimizer.calcite.translator.RexNodeConverter$MXNlsString
 doesn't override org.apache.calcite.util.NlsString.equals(Object)  At 
RexNodeConverter.java:At RexNodeConverter.java:[line 1] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16280/dev-support/hive-personality.sh
 |
| git revision | master / 38c20ba |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16280/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16280/yetus/diff-checkstyle-root.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16280/yetus/whitespace-tabs.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16280/yetus/new-findbugs-ql.html
 |
| modules | C: ql . itests U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16280/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Comparison of varchar column and string literal should happen in varchar
> -
>
> Key: HIVE-21316
> URL: https://issues.apache.org/jira/browse/HIVE-21316
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: 

[jira] [Commented] (HIVE-21316) Comparision of varchar column and string literal should happen in varchar

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780031#comment-16780031
 ] 

Hive QA commented on HIVE-21316:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12960433/HIVE-21316.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15821 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16280/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16280/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16280/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12960433 - PreCommit-HIVE-Build

> Comparision of varchar column and string literal should happen in varchar
> -
>
> Key: HIVE-21316
> URL: https://issues.apache.org/jira/browse/HIVE-21316
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-21316.01.patch, HIVE-21316.02.patch
>
>
> this is most probably the root cause behind HIVE-21310 as well



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18920) CBO: Initialize the Janino providers ahead of 1st query

2019-02-27 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-18920:
---
Status: Patch Available  (was: In Progress)

> CBO: Initialize the Janino providers ahead of 1st query
> ---
>
> Key: HIVE-18920
> URL: https://issues.apache.org/jira/browse/HIVE-18920
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-18920.patch
>
>
> Hive Calcite metadata providers are compiled when the 1st query comes in.
> If a second query arrives before the 1st one has built a metadata provider, 
> it will also try to do the same thing, because the cache is not populated yet.
> With 1024 concurrent users, it takes 6 minutes for the 1st query to finish 
> fighting all the other queries which are trying to load that cache.
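
A minimal sketch of the idea, for illustration only (the class and method names below are hypothetical, not the contents of HIVE-18920.patch): build the expensive Janino-compiled provider once, eagerly, and let every session reuse the same result instead of racing to populate the cache.

{code:java}
// Illustration only: warm the metadata provider before the first query arrives,
// so 1024 concurrent sessions share one compilation instead of 1024.
import java.util.concurrent.CompletableFuture;

final class MetadataProviderWarmup {
  // Kicked off once at server startup (hypothetical hook).
  private static final CompletableFuture<Object> PROVIDER =
      CompletableFuture.supplyAsync(MetadataProviderWarmup::buildProvider);

  static Object get() {
    // Every query blocks on (or immediately reuses) the same instance.
    return PROVIDER.join();
  }

  private static Object buildProvider() {
    // Placeholder for the expensive Janino-compiled metadata provider build.
    return new Object();
  }
}
{code}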



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18920) CBO: Initialize the Janino providers ahead of 1st query

2019-02-27 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-18920:
---
Attachment: HIVE-18920.patch

> CBO: Initialize the Janino providers ahead of 1st query
> ---
>
> Key: HIVE-18920
> URL: https://issues.apache.org/jira/browse/HIVE-18920
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-18920.patch
>
>
> Hive Calcite metadata providers are compiled when the 1st query comes in.
> If a second query arrives before the 1st one has built a metadata provider, 
> it will also try to do the same thing, because the cache is not populated yet.
> With 1024 concurrent users, it takes 6 minutes for the 1st query to finish 
> fighting all the other queries which are trying to load that cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21340) CBO: Prune non-key columns feeding into a SemiJoin

2019-02-27 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780025#comment-16780025
 ] 

Vineet Garg commented on HIVE-21340:


The problem is with HiveSemiJoinRule. Column pruning is occurring; e.g., the plan 
just before HiveSemiJoinRule fires is:

{code:sql}
HiveAggregate(group=[{}], agg#0=[count()])
  HiveJoin(condition=[=($0, $1)], joinType=[inner], algorithm=[none], cost=[not 
available])
HiveProject(i_item_sk=[$0])
  HiveFilter(condition=[IS NOT NULL($0)])
HiveTableScan(table=[[perf, item]], table:alias=[item])
HiveAggregate(group=[{0}])
  HiveFilter(condition=[>($2, 1)])
HiveAggregate(group=[{2, 9}], agg#0=[count()])
  HiveFilter(condition=[IS NOT NULL($2)])
HiveTableScan(table=[[perf, store_sales]], 
table:alias=[store_sales])

{code}

HiveSemiJoinRule rewrites the HiveJoin + HiveAggregate into a HiveSemiJoin. It 
does not introduce a HiveProject as a replacement for the HiveAggregate; as a 
result, the schema changes to whatever the HiveAggregate's input is (the 
HiveFilter in this case).
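
For illustration, a minimal sketch of one possible shape of the fix, assuming Calcite's RelBuilder API (this is not the actual HiveSemiJoinRule change): prune the right input down to the join key before emitting the semi-join, so the dropped HiveAggregate's single-column schema is preserved.

{code:java}
// Hypothetical sketch: project only the key column of the right input
// (the aggregate's input) before building the semi-join.
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.tools.RelBuilder;

final class SemiJoinKeyPruningSketch {
  static RelNode build(RelBuilder b, RelNode left, int leftKeyOrdinal,
      RelNode rightInput, int rightKeyOrdinal) {
    b.push(left);
    b.push(rightInput);
    // Keep only the key column, mirroring the schema the removed
    // HiveAggregate used to expose (group=[{0}] above).
    b.project(b.field(rightKeyOrdinal));
    // Build the condition against the pruned right schema (its only column, 0).
    b.semiJoin(b.equals(b.field(2, 0, leftKeyOrdinal), b.field(2, 1, 0)));
    return b.build();
  }
}
{code}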

> CBO: Prune non-key columns feeding into a SemiJoin
> --
>
> Key: HIVE-21340
> URL: https://issues.apache.org/jira/browse/HIVE-21340
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Vineet Garg
>Priority: Major
>
> {code}
> explain cbo 
> with ss as 
> (select count(1), ss_item_sk, ss_ticket_number from 
> store_sales group by ss_item_sk, ss_ticket_number 
> having count(1) > 1) 
> select count(1) from item where i_item_sk IN (select ss_item_sk from ss);
> {code}
> Notice the {{HiveProject(ss_item_sk=[$0], ss_ticket_number=[$1], $f2=[$2])}} 
> Only ss_item_sk is relevant for the HiveSemiJoin
> {code}
> CBO PLAN:
> HiveAggregate(group=[{}], agg#0=[count()])
>   HiveSemiJoin(condition=[=($0, $1)], joinType=[inner])
> HiveProject(i_item_sk=[$0])
>   HiveFilter(condition=[IS NOT NULL($0)])
> HiveTableScan(table=[[tpcds_copy_orc_partitioned_1, item]], 
> table:alias=[item])
> HiveProject(ss_item_sk=[$0], ss_ticket_number=[$1], $f2=[$2])
>   HiveFilter(condition=[>($2, 1)])
> HiveAggregate(group=[{1, 8}], agg#0=[count()])
>   HiveFilter(condition=[IS NOT NULL($1)])
> HiveTableScan(table=[[tpcds_copy_orc_partitioned_1, 
> store_sales]], table:alias=[store_sales])
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21339) LLAP: Cache hit also initializes an FS object

2019-02-27 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21339:
-
Attachment: HIVE-21339.2.patch

> LLAP: Cache hit also initializes an FS object 
> --
>
> Key: HIVE-21339
> URL: https://issues.apache.org/jira/browse/HIVE-21339
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21339.1.patch, HIVE-21339.2.patch, 
> llap-cache-fs-get.png
>
>
> https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java#L214
> {code}
> // 1. Get file metadata from cache, or create the reader and read it.
> // Don't cache the filesystem object for now; Tez closes it and FS cache 
> will fix all that
> fs = split.getPath().getFileSystem(jobConf);
> fileKey = determineFileId(fs, split,
> HiveConf.getBoolVar(daemonConf, 
> ConfVars.LLAP_CACHE_ALLOW_SYNTHETIC_FILEID),
> HiveConf.getBoolVar(daemonConf, 
> ConfVars.LLAP_CACHE_DEFAULT_FS_FILE_ID),
> !HiveConf.getBoolVar(daemonConf, ConfVars.LLAP_IO_USE_FILEID_PATH)
> );
> {code}
>  !llap-cache-fs-get.png! 
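
For illustration only (not the HIVE-21339 patch; the cache and metadata types here are placeholders), a sketch of deferring the FileSystem lookup behind a Supplier so a cache hit never calls getFileSystem():

{code:java}
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

final class LazyFsMetadataLookup {
  private final Map<Path, Object> metadataCache = new ConcurrentHashMap<>();

  Object getFileMetadata(Path path, Configuration conf) {
    Supplier<FileSystem> lazyFs = () -> {
      try {
        return path.getFileSystem(conf);   // only reached on a cache miss
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    };
    // On a cache hit the lambda is never invoked, so no FS object is created.
    return metadataCache.computeIfAbsent(path, p -> readFromStorage(p, lazyFs.get()));
  }

  private Object readFromStorage(Path p, FileSystem fs) {
    return new Object();   // placeholder for the real ORC footer/metadata read
  }
}
{code}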



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21339) LLAP: Cache hit also initializes an FS object

2019-02-27 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21339:
-
Attachment: HIVE-21339.1.patch

> LLAP: Cache hit also initializes an FS object 
> --
>
> Key: HIVE-21339
> URL: https://issues.apache.org/jira/browse/HIVE-21339
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21339.1.patch, llap-cache-fs-get.png
>
>
> https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java#L214
> {code}
> // 1. Get file metadata from cache, or create the reader and read it.
> // Don't cache the filesystem object for now; Tez closes it and FS cache 
> will fix all that
> fs = split.getPath().getFileSystem(jobConf);
> fileKey = determineFileId(fs, split,
> HiveConf.getBoolVar(daemonConf, 
> ConfVars.LLAP_CACHE_ALLOW_SYNTHETIC_FILEID),
> HiveConf.getBoolVar(daemonConf, 
> ConfVars.LLAP_CACHE_DEFAULT_FS_FILE_ID),
> !HiveConf.getBoolVar(daemonConf, ConfVars.LLAP_IO_USE_FILEID_PATH)
> );
> {code}
>  !llap-cache-fs-get.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21339) LLAP: Cache hit also initializes an FS object

2019-02-27 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21339:
-
Status: Patch Available  (was: Open)

> LLAP: Cache hit also initializes an FS object 
> --
>
> Key: HIVE-21339
> URL: https://issues.apache.org/jira/browse/HIVE-21339
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21339.1.patch, llap-cache-fs-get.png
>
>
> https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java#L214
> {code}
> // 1. Get file metadata from cache, or create the reader and read it.
> // Don't cache the filesystem object for now; Tez closes it and FS cache 
> will fix all that
> fs = split.getPath().getFileSystem(jobConf);
> fileKey = determineFileId(fs, split,
> HiveConf.getBoolVar(daemonConf, 
> ConfVars.LLAP_CACHE_ALLOW_SYNTHETIC_FILEID),
> HiveConf.getBoolVar(daemonConf, 
> ConfVars.LLAP_CACHE_DEFAULT_FS_FILE_ID),
> !HiveConf.getBoolVar(daemonConf, ConfVars.LLAP_IO_USE_FILEID_PATH)
> );
> {code}
>  !llap-cache-fs-get.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21339) LLAP: Cache hit also initializes an FS object

2019-02-27 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21339:
-
Attachment: (was: HIVE-21339.1.patch)

> LLAP: Cache hit also initializes an FS object 
> --
>
> Key: HIVE-21339
> URL: https://issues.apache.org/jira/browse/HIVE-21339
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21339.1.patch, llap-cache-fs-get.png
>
>
> https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java#L214
> {code}
> // 1. Get file metadata from cache, or create the reader and read it.
> // Don't cache the filesystem object for now; Tez closes it and FS cache 
> will fix all that
> fs = split.getPath().getFileSystem(jobConf);
> fileKey = determineFileId(fs, split,
> HiveConf.getBoolVar(daemonConf, 
> ConfVars.LLAP_CACHE_ALLOW_SYNTHETIC_FILEID),
> HiveConf.getBoolVar(daemonConf, 
> ConfVars.LLAP_CACHE_DEFAULT_FS_FILE_ID),
> !HiveConf.getBoolVar(daemonConf, ConfVars.LLAP_IO_USE_FILEID_PATH)
> );
> {code}
>  !llap-cache-fs-get.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779979#comment-16779979
 ] 

Hive QA commented on HIVE-21292:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12960427/HIVE-21292.17.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15820 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16279/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16279/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16279/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12960427 - PreCommit-HIVE-Build

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch, 
> HIVE-21292.06.patch, HIVE-21292.07.patch, HIVE-21292.08.patch, 
> HIVE-21292.09.patch, HIVE-21292.10.patch, HIVE-21292.11.patch, 
> HIVE-21292.12.patch, HIVE-21292.13.patch, HIVE-21292.14.patch, 
> HIVE-21292.15.patch, HIVE-21292.15.patch, HIVE-21292.16.patch, 
> HIVE-21292.17.patch
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these in order to have everything cut into more 
> handleable classes under the package  org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim time when there are two DDLTask and DDLWork classes in the 
> code base the new ones in the new package are called DDLTask2 and DDLWork2 
> thus avoiding the usage of fully qualified class names where both the old and 
> the new classes are in use.
> Step #1: extract all the database related operations from the old DDLTask, 
> and move them under the new package. Also create the new internal framework.
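
A hypothetical sketch of what such a framework could look like (the names are illustrative, not the actual HIVE-21292 classes): one immutable desc per operation, one operation class per desc, and a task that only dispatches.

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class DdlFrameworkSketch {
  interface DdlDesc { }                       // immutable request, one per operation
  interface DdlOperation { int execute(); }   // one class per operation

  // The task stays agnostic to operations: it only looks up a factory by desc type.
  private static final Map<Class<? extends DdlDesc>, Function<DdlDesc, DdlOperation>>
      REGISTRY = new ConcurrentHashMap<>();

  static void register(Class<? extends DdlDesc> descType,
      Function<DdlDesc, DdlOperation> factory) {
    REGISTRY.put(descType, factory);
  }

  static int run(DdlDesc desc) {
    return REGISTRY.get(desc.getClass()).apply(desc).execute();
  }
}
{code}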



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21338) Remove order by and limit for aggregates

2019-02-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21338:
---
Status: Patch Available  (was: Open)

> Remove order by and limit for aggregates
> 
>
> Key: HIVE-21338
> URL: https://issues.apache.org/jira/browse/HIVE-21338
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21338.1.patch
>
>
> If a query is guaranteed to produce at most one row, LIMIT and ORDER BY can 
> be removed. This saves an unnecessary vertex for LIMIT/ORDER BY.
> {code:sql}
> explain select count(*) cs from store_sales where ss_ext_sales_price > 100.00 
> order by cs limit 100
> {code}
> {code}
> STAGE PLANS:
>   Stage: Stage-1
> Tez
>   DagId: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Edges:
> Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
>   DagName: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Vertices:
> Map 1
> Map Operator Tree:
> TableScan
>   alias: store_sales
>   filterExpr: (ss_ext_sales_price > 100) (type: boolean)
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Filter Operator
> predicate: (ss_ext_sales_price > 100) (type: boolean)
> Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
> Select Operator
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Group By Operator
> aggregations: count()
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   sort order:
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col0 (type: bigint)
> Execution mode: vectorized
> Reducer 2
> Execution mode: vectorized
> Reduce Operator Tree:
>   Group By Operator
> aggregations: count(VALUE._col0)
> mode: mergepartial
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator
>   key expressions: _col0 (type: bigint)
>   sort order: +
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   TopN Hash Memory Usage: 0.1
> Reducer 3
> Execution mode: vectorized
> Reduce Operator Tree:
>   Select Operator
> expressions: KEY.reducesinkkey0 (type: bigint)
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Limit
>   Number of rows: 100
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   File Output Operator
> compressed: false
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> table:
> input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
> serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   limit: 100
>   Processor Tree:
> ListSink
> {code}
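
A hypothetical sketch of the optimization (assuming Calcite's metadata API; not necessarily how HIVE-21338.1.patch implements it): drop a Sort/Limit whose input is statically known to produce at most one row.

{code:java}
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rel.core.Sort;
import org.apache.calcite.rel.metadata.RelMetadataQuery;

final class RemoveSortOverSingleRowSketch {
  static RelNode apply(Sort sort) {
    RelMetadataQuery mq = sort.getCluster().getMetadataQuery();
    Double maxRows = mq.getMaxRowCount(sort.getInput());
    // An ungrouped aggregate such as count(*) reports a max row count of 1,
    // so ORDER BY / LIMIT above it cannot change the result and can be dropped.
    return (maxRows != null && maxRows <= 1.0) ? sort.getInput() : sort;
  }
}
{code}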



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21338) Remove order by and limit for aggregates

2019-02-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21338:
---
Attachment: HIVE-21338.1.patch

> Remove order by and limit for aggregates
> 
>
> Key: HIVE-21338
> URL: https://issues.apache.org/jira/browse/HIVE-21338
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21338.1.patch
>
>
> If a query is guaranteed to produce at most one row, LIMIT and ORDER BY can 
> be removed. This saves an unnecessary vertex for LIMIT/ORDER BY.
> {code:sql}
> explain select count(*) cs from store_sales where ss_ext_sales_price > 100.00 
> order by cs limit 100
> {code}
> {code}
> STAGE PLANS:
>   Stage: Stage-1
> Tez
>   DagId: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Edges:
> Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
>   DagName: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Vertices:
> Map 1
> Map Operator Tree:
> TableScan
>   alias: store_sales
>   filterExpr: (ss_ext_sales_price > 100) (type: boolean)
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Filter Operator
> predicate: (ss_ext_sales_price > 100) (type: boolean)
> Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
> Select Operator
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Group By Operator
> aggregations: count()
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   sort order:
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col0 (type: bigint)
> Execution mode: vectorized
> Reducer 2
> Execution mode: vectorized
> Reduce Operator Tree:
>   Group By Operator
> aggregations: count(VALUE._col0)
> mode: mergepartial
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator
>   key expressions: _col0 (type: bigint)
>   sort order: +
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   TopN Hash Memory Usage: 0.1
> Reducer 3
> Execution mode: vectorized
> Reduce Operator Tree:
>   Select Operator
> expressions: KEY.reducesinkkey0 (type: bigint)
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Limit
>   Number of rows: 100
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   File Output Operator
> compressed: false
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> table:
> input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
> serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   limit: 100
>   Processor Tree:
> ListSink
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21338) Remove order by and limit for aggregates

2019-02-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21338:
---
Attachment: HIVE-21338.1.patch

> Remove order by and limit for aggregates
> 
>
> Key: HIVE-21338
> URL: https://issues.apache.org/jira/browse/HIVE-21338
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21338.1.patch
>
>
> If a query is guaranteed to produce at most one row, LIMIT and ORDER BY can 
> be removed. This saves an unnecessary vertex for LIMIT/ORDER BY.
> {code:sql}
> explain select count(*) cs from store_sales where ss_ext_sales_price > 100.00 
> order by cs limit 100
> {code}
> {code}
> STAGE PLANS:
>   Stage: Stage-1
> Tez
>   DagId: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Edges:
> Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
>   DagName: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Vertices:
> Map 1
> Map Operator Tree:
> TableScan
>   alias: store_sales
>   filterExpr: (ss_ext_sales_price > 100) (type: boolean)
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Filter Operator
> predicate: (ss_ext_sales_price > 100) (type: boolean)
> Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
> Select Operator
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Group By Operator
> aggregations: count()
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   sort order:
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col0 (type: bigint)
> Execution mode: vectorized
> Reducer 2
> Execution mode: vectorized
> Reduce Operator Tree:
>   Group By Operator
> aggregations: count(VALUE._col0)
> mode: mergepartial
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator
>   key expressions: _col0 (type: bigint)
>   sort order: +
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   TopN Hash Memory Usage: 0.1
> Reducer 3
> Execution mode: vectorized
> Reduce Operator Tree:
>   Select Operator
> expressions: KEY.reducesinkkey0 (type: bigint)
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Limit
>   Number of rows: 100
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   File Output Operator
> compressed: false
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> table:
> input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
> serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   limit: 100
>   Processor Tree:
> ListSink
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21338) Remove order by and limit for aggregates

2019-02-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21338:
---
Attachment: (was: HIVE-21338.1.patch)

> Remove order by and limit for aggregates
> 
>
> Key: HIVE-21338
> URL: https://issues.apache.org/jira/browse/HIVE-21338
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21338.1.patch
>
>
> If a query is guaranteed to produce at most one row, LIMIT and ORDER BY can 
> be removed. This saves an unnecessary vertex for LIMIT/ORDER BY.
> {code:sql}
> explain select count(*) cs from store_sales where ss_ext_sales_price > 100.00 
> order by cs limit 100
> {code}
> {code}
> STAGE PLANS:
>   Stage: Stage-1
> Tez
>   DagId: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Edges:
> Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
>   DagName: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Vertices:
> Map 1
> Map Operator Tree:
> TableScan
>   alias: store_sales
>   filterExpr: (ss_ext_sales_price > 100) (type: boolean)
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Filter Operator
> predicate: (ss_ext_sales_price > 100) (type: boolean)
> Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
> Select Operator
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Group By Operator
> aggregations: count()
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   sort order:
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col0 (type: bigint)
> Execution mode: vectorized
> Reducer 2
> Execution mode: vectorized
> Reduce Operator Tree:
>   Group By Operator
> aggregations: count(VALUE._col0)
> mode: mergepartial
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator
>   key expressions: _col0 (type: bigint)
>   sort order: +
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   TopN Hash Memory Usage: 0.1
> Reducer 3
> Execution mode: vectorized
> Reduce Operator Tree:
>   Select Operator
> expressions: KEY.reducesinkkey0 (type: bigint)
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Limit
>   Number of rows: 100
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   File Output Operator
> compressed: false
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> table:
> input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
> serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   limit: 100
>   Processor Tree:
> ListSink
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779965#comment-16779965
 ] 

Hive QA commented on HIVE-21292:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
48s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
27s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
44s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
43s{color} | {color:blue} hcatalog/core in master has 29 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
49s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
54s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
32s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
50s{color} | {color:green} ql: The patch generated 0 new + 468 unchanged - 25 
fixed = 468 total (was 493) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} hcatalog/core: The patch generated 0 new + 40 
unchanged - 2 fixed = 40 total (was 42) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} The patch hive-unit passed checkstyle {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 40m 39s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16279/dev-support/hive-personality.sh
 |
| git revision | master / 38c20ba |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql hcatalog/core itests/hive-unit U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16279/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, 

[jira] [Commented] (HIVE-21210) CombineHiveInputFormat Thread Pool Sizing

2019-02-27 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779954#comment-16779954
 ] 

BELUGA BEHR commented on HIVE-21210:


[~zchovan] Can I get your thoughts on this one? :)

> CombineHiveInputFormat Thread Pool Sizing
> -
>
> Key: HIVE-21210
> URL: https://issues.apache.org/jira/browse/HIVE-21210
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0, 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: HIVE-21210.1.patch, HIVE-21210.2.patch, 
> HIVE-21210.3.patch, HIVE-21210.4.patch, HIVE-21210.5.patch, 
> HIVE-21210.6.patch, HIVE-21210.7.patch, HIVE-21210.8.patch
>
>
> Threadpools.
> Hive uses threadpools in several different places, and each implementation is 
> a little different and requires different configuration. I think that Hive 
> needs to rein in and standardize the way that threadpools are used, and 
> threadpools should scale automatically without manual configuration. At any 
> given time, there are many hundreds of threads running in HS2 as the 
> number of simultaneous connections increases, and they surely cause contention 
> with one another.
> Here is an example:
> {code:java|title=CombineHiveInputFormat.java}
>   // max number of threads we can use to check non-combinable paths
>   private static final int MAX_CHECK_NONCOMBINABLE_THREAD_NUM = 50;
>   private static final int DEFAULT_NUM_PATH_PER_THREAD = 100;
> {code}
> When building the splits for an MR job, there are up to 50 threads running per 
> query, and there is not much scaling here; it is simply a 1 thread : 100 files 
> ratio. This implies that to process 5000 files there are 50 threads, and beyond 
> that, 50 threads are still used. Many Hive jobs these days involve more than 
> 5000 files, so this does not scale well at larger sizes.
> This is not configurable (even manually), it doesn't change when the hardware 
> specs increase, and 50 threads seems like a lot when a service must support 
> up to 80 connections:
> [https://www.cloudera.com/documentation/enterprise/5/latest/topics/admin_hive_tuning.html]
> Not to mention, I have never seen a scenario where HS2 is running on a host 
> all by itself and has the entire system dedicated to it. Therefore it should 
> be more friendly and spin up fewer threads.
> I am attaching a patch here that provides a few features:
>  * Common module that produces an {{ExecutorService}} which caps the number of 
> threads it spins up at the number of processors the host has. Keep in mind that 
> a class may submit as many work units ({{Callables}}) as it would like, but 
> the number of threads in the pool is capped.
>  * Common module for partitioning work. That is, allow for a generic 
> framework for dividing work into partitions (i.e. batches)
>  * Modify {{CombineHiveInputFormat}} to take advantage of both modules, 
> performing the same duties in a more object-oriented way than is currently implemented
>  * Add a partitioning (batching) implementation that enforces partitioning of 
> a {{Collection}} based on the natural log of the {{Collection}} size so that 
> it scales more slowly than a simple 1:100 ratio.
>  * Simplify unit test code for {{CombineHiveInputFormat}}
> My hope is to introduce these tools to {{CombineHiveInputFormat}} and then to 
> drop them into other places.  One of the things I will introduce here is a 
> "direct thread" {{ExecutorService}} so that even if there is a configuration 
> for a thread pool to be disabled, it will still use an {{ExecutorService}}, so 
> that the project can avoid logic like "if this function is serviced by a 
> thread pool, use an {{ExecutorService}} (and remember to close it later!), 
> otherwise, create a single thread", and so that things like [HIVE-16949] can be 
> avoided in the future.  Everything will just use an {{ExecutorService}}.
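
For illustration of the two ideas above (not the attached patch; the partitioning formula is one reading of the "natural log" description): a pool capped at the host's processor count, and a batch count that grows with ln(n) instead of a fixed 1:100 file-to-thread ratio.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ScalingPoolSketch {
  // Cap the pool at the number of processors instead of a hard-coded 50.
  static ExecutorService newProcessorCappedPool() {
    return Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
  }

  // Split items into roughly ln(n) batches, so 5000 files yield ~9 work units
  // rather than 50.
  static <T> List<List<T>> logPartitions(List<T> items) {
    int n = items.size();
    int batches = Math.max(1, (int) Math.ceil(Math.log(Math.max(n, 2))));
    int batchSize = Math.max(1, (int) Math.ceil((double) n / batches));
    List<List<T>> result = new ArrayList<>();
    for (int i = 0; i < n; i += batchSize) {
      result.add(items.subList(i, Math.min(n, i + batchSize)));
    }
    return result;
  }
}
{code}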



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21341) Sensible defaults : hive.server2.idle.operation.timeout and hive.server2.idle.session.timeout are too high

2019-02-27 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21341:

Attachment: HIVE-21341.patch

> Sensible defaults : hive.server2.idle.operation.timeout and 
> hive.server2.idle.session.timeout are too high
> --
>
> Key: HIVE-21341
> URL: https://issues.apache.org/jira/browse/HIVE-21341
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, HiveServer2
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-21341.patch
>
>
> Defaults are too high, which results in extra resources being held too long 
> in HS2 memory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21341) Sensible defaults : hive.server2.idle.operation.timeout and hive.server2.idle.session.timeout are too high

2019-02-27 Thread Thejas M Nair (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779949#comment-16779949
 ] 

Thejas M Nair commented on HIVE-21341:
--

hive.server2.session.check.interval can be set to 5m. Otherwise, with the current 
setting in the patch, the timeout might happen only after 2:29 min in some 
cases.
+1 pending that change and tests


> Sensible defaults : hive.server2.idle.operation.timeout and 
> hive.server2.idle.session.timeout are too high
> --
>
> Key: HIVE-21341
> URL: https://issues.apache.org/jira/browse/HIVE-21341
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, HiveServer2
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-21341.patch
>
>
> Defaults are too high, which results in extra resources being held too long 
> in HS2 memory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21336) HMS Index PCS_STATS_IDX too long for Oracle when NLS_LENGTH_SEMANTICS=char

2019-02-27 Thread Yongzhi Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779948#comment-16779948
 ] 

Yongzhi Chen commented on HIVE-21336:
-

Could it cause an inconsistency if the user has NLS_LENGTH_SEMANTICS defaulting to 
CHAR and does the following?
1. Upgrade one HMS from lower version to 4.0
2. Freshly install a new HMS 4.0

> HMS Index PCS_STATS_IDX too long for Oracle when NLS_LENGTH_SEMANTICS=char
> --
>
> Key: HIVE-21336
> URL: https://issues.apache.org/jira/browse/HIVE-21336
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
> Attachments: HIVE-21336.patch
>
>
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME) 
> Error: ORA-01450: maximum key length (6398) exceeded (state=72000,code=1450) 
> Customer tried the same DDL in SQLDevloper, and got the same error. This 
> could be a result of combination of DB level settings like the db_block_size, 
> limiting the maximum key length, as per below doc: 
> http://www.dba-oracle.com/t_ora_01450_maximum_key_length_exceeded.htm 
> Also {{NLS_LENGTH_SEMANTICS}} is by default BYTE, but users can set this at 
> the session level to CHAR, thus reducing the max size of the index length. We 
> have increased the size of the COLUMN_NAME from 128 to 767 (used to be at 
> 1000) and TABLE_NAME from 128 to 256. This was done by setting: 
> {code} 
> CREATE TABLE PART_COL_STATS ( 
> CS_ID NUMBER NOT NULL, 
> DB_NAME VARCHAR2(128) NOT NULL, 
> TABLE_NAME VARCHAR2(256) NOT NULL, 
> PARTITION_NAME VARCHAR2(767) NOT NULL, 
> COLUMN_NAME VARCHAR2(767) NOT NULL,  
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> {code} 
> Reproducer: 
> {code} 
> SQL*Plus: Release 11.2.0.2.0 Production on Wed Feb 27 11:02:16 2019 Copyright 
> (c) 1982, 2011, Oracle. All rights reserved. 
> Connected to: Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit 
> Production 
> SQL> select * from v$nls_parameters where parameter = 'NLS_LENGTH_SEMANTICS'; 
> PARAMETER 
>  
> VALUE 
>  
> NLS_LENGTH_SEMANTICS 
> BYTE 
> SQL> alter session set NLS_LENGTH_SEMANTICS=CHAR; Session altered. 
> SQL> commit; Commit complete. 
> SQL> select * from v$nls_parameters where parameter = 'NLS_LENGTH_SEMANTICS'; 
> PARAMETER 
>  
> VALUE 
>  
> NLS_LENGTH_SEMANTICS 
> CHAR 
> SQL> CREATE TABLE PART_COL_STATS (CS_ID NUMBER NOT NULL, DB_NAME 
> VARCHAR2(128) NOT NULL, TABLE_NAME VARCHAR2(256) NOT NULL, PARTITION_NAME 
> VARCHAR2(767) NOT NULL, COLUMN_NAME VARCHAR2(767) NOT NULL); 
> Table created. 
> SQL> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME) 
> * ERROR at line 1: ORA-01450: maximum key length (6398) exceeded 
> SQL> alter session set NLS_LENGTH_SEMANTICS=BYTE; 
> Session altered. 
> SQL> commit; 
> Commit complete. 
> SQL> drop table PART_COL_STATS; 
> Table dropped. 
> SQL> commit; 
> Commit complete. 
> SQL> CREATE TABLE PART_COL_STATS (CS_ID NUMBER NOT NULL, DB_NAME 
> VARCHAR2(128) NOT NULL, TABLE_NAME VARCHAR2(256) NOT NULL, PARTITION_NAME 
> VARCHAR2(767) NOT NULL, COLUMN_NAME VARCHAR2(767) NOT NULL); 
> Table created. 
> SQL> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> Index created. 
> SQL> commit; 
> Commit complete. 
> SQL> 
> {code}
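
One possible mitigation, sketched here for illustration (not necessarily what HIVE-21336.patch does): declare the indexed columns with explicit BYTE length semantics, so a session-level NLS_LENGTH_SEMANTICS=CHAR cannot inflate the key past the ORA-01450 limit.

{code:sql}
-- Hypothetical DDL sketch: pin the length semantics per column.
CREATE TABLE PART_COL_STATS (
  CS_ID NUMBER NOT NULL,
  DB_NAME VARCHAR2(128 BYTE) NOT NULL,
  TABLE_NAME VARCHAR2(256 BYTE) NOT NULL,
  PARTITION_NAME VARCHAR2(767 BYTE) NOT NULL,
  COLUMN_NAME VARCHAR2(767 BYTE) NOT NULL);
CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS
  (DB_NAME, TABLE_NAME, COLUMN_NAME, PARTITION_NAME);
{code}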



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21341) Sensible defaults : hive.server2.idle.operation.timeout and hive.server2.idle.session.timeout are too high

2019-02-27 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779946#comment-16779946
 ] 

Ashutosh Chauhan commented on HIVE-21341:
-

[~thejas] Can you please review? Is there any other companion configs we shall 
change too?

> Sensible defaults : hive.server2.idle.operation.timeout and 
> hive.server2.idle.session.timeout are too high
> --
>
> Key: HIVE-21341
> URL: https://issues.apache.org/jira/browse/HIVE-21341
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, HiveServer2
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-21341.patch
>
>
> Defaults are too high, which results in extra resources being held too long 
> in HS2 memory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21341) Sensible defaults : hive.server2.idle.operation.timeout and hive.server2.idle.session.timeout are too high

2019-02-27 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21341:

Status: Patch Available  (was: Open)

> Sensible defaults : hive.server2.idle.operation.timeout and 
> hive.server2.idle.session.timeout are too high
> --
>
> Key: HIVE-21341
> URL: https://issues.apache.org/jira/browse/HIVE-21341
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, HiveServer2
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-21341.patch
>
>
> Defaults are too high, which results in extra resources being held too long 
> in HS2 memory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21341) Sensible defaults : hive.server2.idle.operation.timeout and hive.server2.idle.session.timeout are too high

2019-02-27 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reassigned HIVE-21341:
---


> Sensible defaults : hive.server2.idle.operation.timeout and 
> hive.server2.idle.session.timeout are too high
> --
>
> Key: HIVE-21341
> URL: https://issues.apache.org/jira/browse/HIVE-21341
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, HiveServer2
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
>
> Defaults are too high, which results in extra resources being held too long 
> in HS2 memory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21340) CBO: Prune non-key columns feeding into a SemiJoin

2019-02-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-21340:
--

Assignee: Vineet Garg

> CBO: Prune non-key columns feeding into a SemiJoin
> --
>
> Key: HIVE-21340
> URL: https://issues.apache.org/jira/browse/HIVE-21340
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Vineet Garg
>Priority: Major
>
> {code}
> explain cbo 
> with ss as 
> (select count(1), ss_item_sk, ss_ticket_number from 
> store_sales group by ss_item_sk, ss_ticket_number 
> having count(1) > 1) 
> select count(1) from item where i_item_sk IN (select ss_item_sk from ss);
> {code}
> Notice the {{HiveProject(ss_item_sk=[$0], ss_ticket_number=[$1], $f2=[$2])}} 
> Only ss_item_sk is relevant for the HiveSemiJoin
> {code}
> CBO PLAN:
> HiveAggregate(group=[{}], agg#0=[count()])
>   HiveSemiJoin(condition=[=($0, $1)], joinType=[inner])
> HiveProject(i_item_sk=[$0])
>   HiveFilter(condition=[IS NOT NULL($0)])
> HiveTableScan(table=[[tpcds_copy_orc_partitioned_1, item]], 
> table:alias=[item])
> HiveProject(ss_item_sk=[$0], ss_ticket_number=[$1], $f2=[$2])
>   HiveFilter(condition=[>($2, 1)])
> HiveAggregate(group=[{1, 8}], agg#0=[count()])
>   HiveFilter(condition=[IS NOT NULL($1)])
> HiveTableScan(table=[[tpcds_copy_orc_partitioned_1, 
> store_sales]], table:alias=[store_sales])
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21231) HiveJoinAddNotNullRule support for range predicates

2019-02-27 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-21231:
--
Attachment: HIVE-21231.01.patch

> HiveJoinAddNotNullRule support for range predicates
> ---
>
> Key: HIVE-21231
> URL: https://issues.apache.org/jira/browse/HIVE-21231
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: newbie
> Attachments: HIVE-21231.01.patch
>
>
> For instance, given the following query:
> {code:sql}
> SELECT t0.col0, t0.col1
> FROM
>   (
> SELECT col0, col1 FROM tab
>   ) AS t0
>   INNER JOIN
>   (
> SELECT col0, col1 FROM tab
>   ) AS t1
> ON t0.col0 < t1.col0 AND t0.col1 > t1.col1
> {code}
> we could still infer that col0 and col1 cannot be null for any of the inputs. 
> Currently we do not.
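
For illustration, the rewrite the rule could produce for the query above (the inferred IS NOT NULL predicates are valid because < and > reject NULLs):

{code:sql}
SELECT t0.col0, t0.col1
FROM
  (
    SELECT col0, col1 FROM tab
    WHERE col0 IS NOT NULL AND col1 IS NOT NULL
  ) AS t0
  INNER JOIN
  (
    SELECT col0, col1 FROM tab
    WHERE col0 IS NOT NULL AND col1 IS NOT NULL
  ) AS t1
ON t0.col0 < t1.col0 AND t0.col1 > t1.col1
{code}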



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21339) LLAP: Cache hit also initializes an FS object

2019-02-27 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21339:
-
Attachment: HIVE-21339.1.patch

> LLAP: Cache hit also initializes an FS object 
> --
>
> Key: HIVE-21339
> URL: https://issues.apache.org/jira/browse/HIVE-21339
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21339.1.patch, llap-cache-fs-get.png
>
>
> https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java#L214
> {code}
> // 1. Get file metadata from cache, or create the reader and read it.
> // Don't cache the filesystem object for now; Tez closes it and FS cache 
> will fix all that
> fs = split.getPath().getFileSystem(jobConf);
> fileKey = determineFileId(fs, split,
> HiveConf.getBoolVar(daemonConf, 
> ConfVars.LLAP_CACHE_ALLOW_SYNTHETIC_FILEID),
> HiveConf.getBoolVar(daemonConf, 
> ConfVars.LLAP_CACHE_DEFAULT_FS_FILE_ID),
> !HiveConf.getBoolVar(daemonConf, ConfVars.LLAP_IO_USE_FILEID_PATH)
> );
> {code}
>  !llap-cache-fs-get.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16924) Support distinct in presence of Group By

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779930#comment-16779930
 ] 

Hive QA commented on HIVE-16924:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12960413/HIVE-16924.13.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 15820 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.mapreduce.TestHCatPartitioned.testHCatPartitionedTable[3]
 (batchId=209)
org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testCancelRenewTokenFlow 
(batchId=276)
org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testConnection 
(batchId=276)
org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testIsValid (batchId=276)
org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testIsValidNeg 
(batchId=276)
org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testNegativeProxyAuth 
(batchId=276)
org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testNegativeTokenAuth 
(batchId=276)
org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testNoKrbSASLTokenAuthNeg 
(batchId=276)
org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testNonKrbSASLAuth 
(batchId=276)
org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testNonKrbSASLFullNameAuth
 (batchId=276)
org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testProxyAuth 
(batchId=276)
org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testRenewDelegationToken 
(batchId=276)
org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testTokenAuth 
(batchId=276)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16278/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16278/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16278/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12960413 - PreCommit-HIVE-Build

> Support distinct in presence of Group By 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch, 
> HIVE-16924.03.patch, HIVE-16924.04.patch, HIVE-16924.05.patch, 
> HIVE-16924.06.patch, HIVE-16924.07.patch, HIVE-16924.08.patch, 
> HIVE-16924.09.patch, HIVE-16924.10.patch, HIVE-16924.11.patch, 
> HIVE-16924.12.patch, HIVE-16924.13.patch
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get: 
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'
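
Until that is supported, an equivalent formulation that should already be accepted is to wrap the aggregate in a subquery and apply DISTINCT outside it, for example:

{code:sql}
SELECT DISTINCT c1, cnt
FROM (SELECT c1, count(*) AS cnt FROM e011_01 GROUP BY c1) t;
{code}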



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16924) Support distinct in presence of Group By

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779934#comment-16779934
 ] 

Hive QA commented on HIVE-16924:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
5s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
51s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
19s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
41s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  9m 
30s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
30s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
52s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
52s{color} | {color:red} ql: The patch generated 8 new + 639 unchanged - 13 
fixed = 647 total (was 652) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  2m 
25s{color} | {color:red} root: The patch generated 8 new + 647 unchanged - 13 
fixed = 655 total (was 660) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 5 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
58s{color} | {color:green} ql generated 0 new + 2249 unchanged - 2 fixed = 2249 
total (was 2251) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 12m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 77m 22s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16278/dev-support/hive-personality.sh
 |
| git revision | master / 38c20ba |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16278/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16278/yetus/diff-checkstyle-root.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16278/yetus/whitespace-eol.txt
 |
| modules | C: ql . U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16278/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Support distinct in presence of Group By 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch, 
> HIVE-16924.03.patch, HIVE-16924.04.patch, HIVE-16924.05.patch, 
> HIVE-16924.06.patch, HIVE-16924.07.patch, HIVE-16924.08.patch, 
> HIVE-16924.09.patch, HIVE-16924.10.patch, HIVE-16924.11.patch, 
> HIVE-16924.12.patch, 

[jira] [Updated] (HIVE-21231) HiveJoinAddNotNullRule support for range predicates

2019-02-27 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-21231:
--
Status: Patch Available  (was: Open)

> HiveJoinAddNotNullRule support for range predicates
> ---
>
> Key: HIVE-21231
> URL: https://issues.apache.org/jira/browse/HIVE-21231
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: newbie
> Attachments: HIVE-21231.01.patch
>
>
> For instance, given the following query:
> {code:sql}
> SELECT t0.col0, t0.col1
> FROM
>   (
> SELECT col0, col1 FROM tab
>   ) AS t0
>   INNER JOIN
>   (
> SELECT col0, col1 FROM tab
>   ) AS t1
> ON t0.col0 < t1.col0 AND t0.col1 > t1.col1
> {code}
> we could still infer that col0 and col1 cannot be null for any of the inputs. 
> Currently we do not.
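For illustration, the inference the rule could make here is adding IS NOT NULL filters on both join inputs, since a row with a NULL col0 or col1 can never satisfy the range conditions. A hedged sketch of the logically equivalent query (illustrative only, not the planner's actual output):

{code:sql}
SELECT t0.col0, t0.col1
FROM
  (SELECT col0, col1 FROM tab WHERE col0 IS NOT NULL AND col1 IS NOT NULL) AS t0
  INNER JOIN
  (SELECT col0, col1 FROM tab WHERE col0 IS NOT NULL AND col1 IS NOT NULL) AS t1
  ON t0.col0 < t1.col0 AND t0.col1 > t1.col1
{code}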



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries

2019-02-27 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779933#comment-16779933
 ] 

Ashutosh Chauhan commented on HIVE-21279:
-

+1 pending tests.

> Avoid moving/rename operation in FileSink op for SELECT queries
> ---
>
> Key: HIVE-21279
> URL: https://issues.apache.org/jira/browse/HIVE-21279
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21279.1.patch, HIVE-21279.10.patch, 
> HIVE-21279.11.patch, HIVE-21279.2.patch, HIVE-21279.3.patch, 
> HIVE-21279.4.patch, HIVE-21279.5.patch, HIVE-21279.6.patch, 
> HIVE-21279.7.patch, HIVE-21279.8.patch, HIVE-21279.9.patch
>
>
> Currently, at the end of a job, the FileSink operator moves/renames the temp 
> directory to another directory from which FetchTask fetches the result. This is 
> done to avoid fetching potentially partial/invalid files written by failed/runaway 
> tasks. This operation is expensive for cloud storage. It could be avoided if 
> FetchTask were passed the set of files to read from instead of the whole directory.
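As a rough illustration of the idea (hypothetical names, not the API in the patch): instead of renaming the staging directory, the writer could record the files it successfully committed and hand that list to the fetch side, which then reads only those paths.

{code:java}
// Hedged sketch: pass an explicit file list instead of a whole directory.
// FileManifest and its methods are hypothetical, not Hive classes.
import java.util.ArrayList;
import java.util.List;

class FileManifest {
  private final List<String> committedFiles = new ArrayList<>();

  // Writer side: record each file that was fully written and closed.
  void addCommittedFile(String path) {
    committedFiles.add(path);
  }

  // Fetch side: read only the committed files, ignoring stray files left
  // behind by failed or runaway task attempts in the same directory.
  List<String> filesToRead() {
    return committedFiles;
  }
}
{code}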



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21339) LLAP: Cache hit also initializes an FS object

2019-02-27 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779932#comment-16779932
 ] 

Prasanth Jayachandran commented on HIVE-21339:
--

[~gopalv] can you please review?

> LLAP: Cache hit also initializes an FS object 
> --
>
> Key: HIVE-21339
> URL: https://issues.apache.org/jira/browse/HIVE-21339
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21339.1.patch, llap-cache-fs-get.png
>
>
> https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java#L214
> {code}
> // 1. Get file metadata from cache, or create the reader and read it.
> // Don't cache the filesystem object for now; Tez closes it and FS cache 
> will fix all that
> fs = split.getPath().getFileSystem(jobConf);
> fileKey = determineFileId(fs, split,
> HiveConf.getBoolVar(daemonConf, 
> ConfVars.LLAP_CACHE_ALLOW_SYNTHETIC_FILEID),
> HiveConf.getBoolVar(daemonConf, 
> ConfVars.LLAP_CACHE_DEFAULT_FS_FILE_ID),
> !HiveConf.getBoolVar(daemonConf, ConfVars.LLAP_IO_USE_FILEID_PATH)
> );
> {code}
>  !llap-cache-fs-get.png! 
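The gist of the fix direction (a hedged sketch only; the actual patch may structure this differently) is to defer the FileSystem lookup until it is actually needed, i.e. only on a cache miss, for example by creating it lazily:

{code:java}
// Hedged sketch: defer FileSystem creation until a cache miss actually needs it.
// cacheLookup(...) and readFromStorage(...) are hypothetical placeholders.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class LazyFsExample {
  private FileSystem fs; // created at most once, and only when required

  private FileSystem fs(Path path, Configuration conf) throws IOException {
    if (fs == null) {
      fs = path.getFileSystem(conf); // the expensive call, moved off the cache-hit path
    }
    return fs;
  }

  Object read(Path path, Configuration conf, Object fileKey) throws IOException {
    Object cached = cacheLookup(fileKey);
    if (cached != null) {
      return cached;                               // cache hit: no FS object initialized
    }
    return readFromStorage(fs(path, conf), path);  // cache miss: initialize lazily
  }

  private Object cacheLookup(Object key) { return null; }               // placeholder
  private Object readFromStorage(FileSystem f, Path p) { return null; } // placeholder
}
{code}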



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21339) LLAP: Cache hit also initializes an FS object

2019-02-27 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-21339:


Assignee: Prasanth Jayachandran

> LLAP: Cache hit also initializes an FS object 
> --
>
> Key: HIVE-21339
> URL: https://issues.apache.org/jira/browse/HIVE-21339
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: llap-cache-fs-get.png
>
>
> https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java#L214
> {code}
> // 1. Get file metadata from cache, or create the reader and read it.
> // Don't cache the filesystem object for now; Tez closes it and FS cache 
> will fix all that
> fs = split.getPath().getFileSystem(jobConf);
> fileKey = determineFileId(fs, split,
> HiveConf.getBoolVar(daemonConf, 
> ConfVars.LLAP_CACHE_ALLOW_SYNTHETIC_FILEID),
> HiveConf.getBoolVar(daemonConf, 
> ConfVars.LLAP_CACHE_DEFAULT_FS_FILE_ID),
> !HiveConf.getBoolVar(daemonConf, ConfVars.LLAP_IO_USE_FILEID_PATH)
> );
> {code}
>  !llap-cache-fs-get.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-16924) Support distinct in presence of Group By

2019-02-27 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-16924:
--
Attachment: HIVE-16924.14.patch

> Support distinct in presence of Group By 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch, 
> HIVE-16924.03.patch, HIVE-16924.04.patch, HIVE-16924.05.patch, 
> HIVE-16924.06.patch, HIVE-16924.07.patch, HIVE-16924.08.patch, 
> HIVE-16924.09.patch, HIVE-16924.10.patch, HIVE-16924.11.patch, 
> HIVE-16924.12.patch, HIVE-16924.13.patch, HIVE-16924.14.patch
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get : 
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-16924) Support distinct in presence of Group By

2019-02-27 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-16924:
--
Status: Patch Available  (was: Open)

> Support distinct in presence of Group By 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch, 
> HIVE-16924.03.patch, HIVE-16924.04.patch, HIVE-16924.05.patch, 
> HIVE-16924.06.patch, HIVE-16924.07.patch, HIVE-16924.08.patch, 
> HIVE-16924.09.patch, HIVE-16924.10.patch, HIVE-16924.11.patch, 
> HIVE-16924.12.patch, HIVE-16924.13.patch, HIVE-16924.14.patch
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get : 
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-16924) Support distinct in presence of Group By

2019-02-27 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-16924:
--
Status: Open  (was: Patch Available)

> Support distinct in presence of Group By 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch, 
> HIVE-16924.03.patch, HIVE-16924.04.patch, HIVE-16924.05.patch, 
> HIVE-16924.06.patch, HIVE-16924.07.patch, HIVE-16924.08.patch, 
> HIVE-16924.09.patch, HIVE-16924.10.patch, HIVE-16924.11.patch, 
> HIVE-16924.12.patch, HIVE-16924.13.patch, HIVE-16924.14.patch
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get : 
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21336) HMS Index PCS_STATS_IDX too long for Oracle when NLS_LENGTH_SEMANTICS=char

2019-02-27 Thread Naveen Gangam (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-21336:
-
Status: Patch Available  (was: Open)

> HMS Index PCS_STATS_IDX too long for Oracle when NLS_LENGTH_SEMANTICS=char
> --
>
> Key: HIVE-21336
> URL: https://issues.apache.org/jira/browse/HIVE-21336
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
> Attachments: HIVE-21336.patch
>
>
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME) 
> Error: ORA-01450: maximum key length (6398) exceeded (state=72000,code=1450) 
> The customer tried the same DDL in SQL Developer and got the same error. This 
> could be a result of a combination of DB-level settings like the db_block_size 
> limiting the maximum key length, as per the doc below: 
> http://www.dba-oracle.com/t_ora_01450_maximum_key_length_exceeded.htm 
> Also, {{NLS_LENGTH_SEMANTICS}} is BYTE by default, but users can set it at 
> the session level to CHAR, thus reducing the maximum index key length. We 
> have increased the size of COLUMN_NAME from 128 to 767 (used to be at 
> 1000) and TABLE_NAME from 128 to 256. This was done by setting:
> {code} 
> CREATE TABLE PART_COL_STATS ( 
> CS_ID NUMBER NOT NULL, 
> DB_NAME VARCHAR2(128) NOT NULL, 
> TABLE_NAME VARCHAR2(256) NOT NULL, 
> PARTITION_NAME VARCHAR2(767) NOT NULL, 
> COLUMN_NAME VARCHAR2(767) NOT NULL,  
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> {code} 
> Reproducer: 
> {code} 
> SQL*Plus: Release 11.2.0.2.0 Production on Wed Feb 27 11:02:16 2019 Copyright 
> (c) 1982, 2011, Oracle. All rights reserved. 
> Connected to: Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit 
> Production 
> SQL> select * from v$nls_parameters where parameter = 'NLS_LENGTH_SEMANTICS'; 
> PARAMETER 
>  
> VALUE 
>  
> NLS_LENGTH_SEMANTICS 
> BYTE 
> SQL> alter session set NLS_LENGTH_SEMANTICS=CHAR; Session altered. 
> SQL> commit; Commit complete. 
> SQL> select * from v$nls_parameters where parameter = 'NLS_LENGTH_SEMANTICS'; 
> PARAMETER 
>  
> VALUE 
>  
> NLS_LENGTH_SEMANTICS 
> CHAR 
> SQL> CREATE TABLE PART_COL_STATS (CS_ID NUMBER NOT NULL, DB_NAME 
> VARCHAR2(128) NOT NULL, TABLE_NAME VARCHAR2(256) NOT NULL, PARTITION_NAME 
> VARCHAR2(767) NOT NULL, COLUMN_NAME VARCHAR2(767) NOT NULL); 
> Table created. 
> SQL> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME) 
> * ERROR at line 1: ORA-01450: maximum key length (6398) exceeded 
> SQL> alter session set NLS_LENGTH_SEMANTICS=BYTE; 
> Session altered. 
> SQL> commit; 
> Commit complete. 
> SQL> drop table PART_COL_STATS; 
> Table dropped. 
> SQL> commit; 
> Commit complete. 
> SQL> CREATE TABLE PART_COL_STATS (CS_ID NUMBER NOT NULL, DB_NAME 
> VARCHAR2(128) NOT NULL, TABLE_NAME VARCHAR2(256) NOT NULL, PARTITION_NAME 
> VARCHAR2(767) NOT NULL, COLUMN_NAME VARCHAR2(767) NOT NULL); 
> Table created. 
> SQL> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> Index created. 
> SQL> commit; 
> Commit complete. 
> SQL> 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21337) HMS Metadata migration from Postgres/Derby to other DBs fail

2019-02-27 Thread Naveen Gangam (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-21337:
-
Attachment: HIVE-21337.patch
Status: Patch Available  (was: Open)

I am just fixing the schemas to be consistent going forward. I don't think the 
upgrade path will be clean if we add {{ALTER TABLE .. }} to the upgrade script 
to reduce the existing 4000 length to 256. If there are rows that contain 
values of > 256 chars for this column, the behavior across different databases 
differs: some truncate the value while some fail.

> HMS Metadata migration from Postgres/Derby to other DBs fail
> 
>
> Key: HIVE-21337
> URL: https://issues.apache.org/jira/browse/HIVE-21337
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Attachments: HIVE-21337.patch
>
>
> A customer was recently migrating the HMS metastore from Postgres to Oracle. 
> During import of the [exported] data from the Postgres-backed HMS metastore, 
> failures are seen because COLUMNS_V2.COMMENT is 4000 bytes long whereas Oracle 
> and other schemas define it to be 256 bytes.
> This inconsistency in the schema makes the migration cumbersome and manual. 
> This jira makes this column's length consistent across all databases.
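A hedged sketch of what "consistent" means here (illustrative only; the table and column names below are simplified, and the actual metastore scripts are defined by the patch): the same 256-character limit is declared in every dialect's schema script.

{code:sql}
-- Hedged sketch, not the actual metastore scripts.
-- Oracle flavor:
CREATE TABLE COLUMNS_V2_EXAMPLE (
  CD_ID       NUMBER NOT NULL,
  COL_COMMENT VARCHAR2(256)
);
-- Postgres flavor, same length instead of 4000:
CREATE TABLE COLUMNS_V2_EXAMPLE_PG (
  CD_ID       BIGINT NOT NULL,
  COL_COMMENT VARCHAR(256)
);
{code}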



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21294) Vectorization: 1-reducer Shuffle can skip the object hash functions

2019-02-27 Thread Teddy Choi (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779908#comment-16779908
 ] 

Teddy Choi commented on HIVE-21294:
---

[~gopalv], I fixed the differences in murmur_hash_migration.q.out. 
TestObjectStore failures seem unrelated. I tested them on my laptop and there 
were no errors. Will it be okay to push it to master?

> Vectorization: 1-reducer Shuffle can skip the object hash functions
> ---
>
> Key: HIVE-21294
> URL: https://issues.apache.org/jira/browse/HIVE-21294
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Gopal V
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21294.2.patch, HIVE-21294.3.patch, 
> HIVE-21294.4.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> VectorReduceSinkObjectHashOperator can skip the object hashing entirely if 
> the reducer count = 1.
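The optimization is conceptually simple: with a single reducer every row lands in the same partition, so computing a distribution hash per row is wasted work. A hedged sketch of the idea (not the actual VectorReduceSinkObjectHashOperator code):

{code:java}
// Hedged sketch: skip per-row object hashing when there is only one reducer.
class ReduceSinkPartitionExample {
  private final int numReducers;

  ReduceSinkPartitionExample(int numReducers) {
    this.numReducers = numReducers;
  }

  int partitionFor(Object[] partitionKeys) {
    if (numReducers == 1) {
      return 0; // only one possible target, no need to hash the keys at all
    }
    // multi-reducer case: hash the keys and map onto the reducer range
    int hash = java.util.Arrays.hashCode(partitionKeys);
    return (hash & Integer.MAX_VALUE) % numReducers;
  }
}
{code}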



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries

2019-02-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21279:
---
Status: Patch Available  (was: Open)

> Avoid moving/rename operation in FileSink op for SELECT queries
> ---
>
> Key: HIVE-21279
> URL: https://issues.apache.org/jira/browse/HIVE-21279
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21279.1.patch, HIVE-21279.10.patch, 
> HIVE-21279.11.patch, HIVE-21279.2.patch, HIVE-21279.3.patch, 
> HIVE-21279.4.patch, HIVE-21279.5.patch, HIVE-21279.6.patch, 
> HIVE-21279.7.patch, HIVE-21279.8.patch, HIVE-21279.9.patch
>
>
> Currently, at the end of a job, the FileSink operator moves/renames the temp 
> directory to another directory from which FetchTask fetches the result. This is 
> done to avoid fetching potentially partial/invalid files written by failed/runaway 
> tasks. This operation is expensive for cloud storage. It could be avoided if 
> FetchTask were passed the set of files to read from instead of the whole directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries

2019-02-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21279:
---
Attachment: HIVE-21279.11.patch

> Avoid moving/rename operation in FileSink op for SELECT queries
> ---
>
> Key: HIVE-21279
> URL: https://issues.apache.org/jira/browse/HIVE-21279
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21279.1.patch, HIVE-21279.10.patch, 
> HIVE-21279.11.patch, HIVE-21279.2.patch, HIVE-21279.3.patch, 
> HIVE-21279.4.patch, HIVE-21279.5.patch, HIVE-21279.6.patch, 
> HIVE-21279.7.patch, HIVE-21279.8.patch, HIVE-21279.9.patch
>
>
> Currently, at the end of a job, the FileSink operator moves/renames the temp 
> directory to another directory from which FetchTask fetches the result. This is 
> done to avoid fetching potentially partial/invalid files written by failed/runaway 
> tasks. This operation is expensive for cloud storage. It could be avoided if 
> FetchTask were passed the set of files to read from instead of the whole directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries

2019-02-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21279:
---
Status: Open  (was: Patch Available)

> Avoid moving/rename operation in FileSink op for SELECT queries
> ---
>
> Key: HIVE-21279
> URL: https://issues.apache.org/jira/browse/HIVE-21279
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21279.1.patch, HIVE-21279.10.patch, 
> HIVE-21279.11.patch, HIVE-21279.2.patch, HIVE-21279.3.patch, 
> HIVE-21279.4.patch, HIVE-21279.5.patch, HIVE-21279.6.patch, 
> HIVE-21279.7.patch, HIVE-21279.8.patch, HIVE-21279.9.patch
>
>
> Currently, at the end of a job, the FileSink operator moves/renames the temp 
> directory to another directory from which FetchTask fetches the result. This is 
> done to avoid fetching potentially partial/invalid files written by failed/runaway 
> tasks. This operation is expensive for cloud storage. It could be avoided if 
> FetchTask were passed the set of files to read from instead of the whole directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21333) [trivial] Fix argument order in TestDateWritableV2#setupDateStrings

2019-02-27 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21333?focusedWorklogId=205439=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-205439
 ]

ASF GitHub Bot logged work on HIVE-21333:
-

Author: ASF GitHub Bot
Created on: 27/Feb/19 22:48
Start Date: 27/Feb/19 22:48
Worklog Time Spent: 10m 
  Work Description: b-slim commented on pull request #554: [HIVE-21333] 
Purge the non locked buffers instead of locked ones
URL: https://github.com/apache/hive/pull/554
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 205439)
Time Spent: 20m  (was: 10m)

> [trivial] Fix argument order in TestDateWritableV2#setupDateStrings
> ---
>
> Key: HIVE-21333
> URL: https://issues.apache.org/jira/browse/HIVE-21333
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21333.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Calendar#add(int field, int amount) is given the parameters (1, 
> Calendar.DAY_OF_YEAR), which I presume is backwards, especially since this 
> method is called 365 times.
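In other words, the call passes the amount where the field is expected; the fix is just to swap the arguments so each iteration advances the date by one day. A minimal, self-contained illustration (not the test's actual code):

{code:java}
import java.util.Calendar;
import java.util.GregorianCalendar;

class CalendarAddExample {
  public static void main(String[] args) {
    Calendar calendar = new GregorianCalendar(2019, Calendar.JANUARY, 1);
    // Buggy order: add(1, Calendar.DAY_OF_YEAR) treats DAY_OF_YEAR (== 6) as the
    // amount and 1 (== Calendar.YEAR) as the field, advancing by 6 years per call.
    // Correct order: field first, then amount.
    calendar.add(Calendar.DAY_OF_YEAR, 1); // advance by exactly one day
    System.out.println(calendar.getTime());
  }
}
{code}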



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HIVE-20364) Update default for hive.map.aggr.hash.min.reduction

2019-02-27 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-20364.
-
Resolution: Fixed

Taken care of in HIVE-20656

> Update default for hive.map.aggr.hash.min.reduction
> ---
>
> Key: HIVE-20364
> URL: https://issues.apache.org/jira/browse/HIVE-20364
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Nita Dembla
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20364.patch
>
>
> Default value is 0.5. Let's update it to 0.99.
> In the average case it's a trade-off between CPU and network. Erring on the side 
> of CPU is better since the perf loss caused by the network is usually larger.
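Until the default changes, the same behavior can be tried per session (a minimal illustration; with 0.99 the effect is roughly that map-side aggregation stays enabled unless hashing reduces the row count by less than about 1%):

{code:sql}
-- session-level override matching the proposed default
set hive.map.aggr.hash.min.reduction=0.99;
{code}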



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21332) Cache Purge command does purge the in-use buffer.

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779844#comment-16779844
 ] 

Hive QA commented on HIVE-21332:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12960405/HIVE-21332.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15820 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16277/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16277/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16277/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12960405 - PreCommit-HIVE-Build

> Cache Purge command does purge the in-use buffer.
> -
>
> Key: HIVE-21332
> URL: https://issues.apache.org/jira/browse/HIVE-21332
> Project: Hive
>  Issue Type: Bug
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-21332.patch
>
>
> The Cache Purge command is purging what it is not supposed to evict.
> This can lead to an unrecoverable state.
> {code} 
> TaskAttempt 3 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1545278897356_0093_27_00_01_3:java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
> java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 32768; at 0 out of 1 (entire cache is fragmented and 
> locked, or an internal issue)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
>  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>  at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>  at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>  at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 32768; at 0 out of 1 (entire cache is fragmented and 
> locked, or an internal issue)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>  ... 15 more
> Caused by: java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 32768; at 0 out of 1 (entire cache is fragmented and 
> locked, or an internal issue)
>  at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>  at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>  at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
>  at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  at 
> 
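Judging from the linked pull request title ("Purge the non locked buffers instead of locked ones"), the intended behavior is that an explicit purge should only evict buffers that are not currently pinned by a reader. A hedged sketch of that check (hypothetical names, not the LLAP cache classes):

{code:java}
// Hedged sketch: purge must skip buffers that are still in use (locked/pinned).
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class PurgeExample {
  static class Buffer {
    final AtomicInteger refCount = new AtomicInteger(0); // > 0 while a reader holds it
    long sizeBytes;
  }

  long purge(List<Buffer> buffers) {
    long evictedBytes = 0;
    for (Iterator<Buffer> it = buffers.iterator(); it.hasNext(); ) {
      Buffer b = it.next();
      if (b.refCount.get() > 0) {
        continue; // in-use buffer: leave it alone, evicting it corrupts readers
      }
      it.remove();            // only unlocked buffers are actually evicted
      evictedBytes += b.sizeBytes;
    }
    return evictedBytes;
  }
}
{code}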

[jira] [Updated] (HIVE-20656) Sensible defaults: Map aggregation memory configs are too aggressive

2019-02-27 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-20656:
---
Summary: Sensible defaults: Map aggregation memory configs are too 
aggressive  (was: Map aggregation memory configs are too aggressive)

> Sensible defaults: Map aggregation memory configs are too aggressive
> 
>
> Key: HIVE-20656
> URL: https://issues.apache.org/jira/browse/HIVE-20656
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20656.1.patch
>
>
> The defaults for the following configs seem to be too aggressive. In Java 
> this can easily lead to several full GC pauses whose memory cannot be 
> reclaimed.
> {code:java}
> HIVEMAPAGGRHASHMEMORY("hive.map.aggr.hash.percentmemory", (float) 0.99,
> "Portion of total memory to be used by map-side group aggregation hash 
> table"),
> HIVEMAPAGGRMEMORYTHRESHOLD("hive.map.aggr.hash.force.flush.memory.threshold", 
> (float) 0.9,
> "The max memory to be used by map-side group aggregation hash table.\n" +
> "If the memory usage is higher than this number, force to flush 
> data"),{code}
>  
> We can be a little bit conservative for these configs to avoid getting into GC 
> pauses.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HIVE-20364) Update default for hive.map.aggr.hash.min.reduction

2019-02-27 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reopened HIVE-20364:
-

> Update default for hive.map.aggr.hash.min.reduction
> ---
>
> Key: HIVE-20364
> URL: https://issues.apache.org/jira/browse/HIVE-20364
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Nita Dembla
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20364.patch
>
>
> Default value is 0.5. Let's update it to 0.99.
> In the average case it's a trade-off between CPU and network. Erring on the side 
> of CPU is better since the perf loss caused by the network is usually larger.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20656) Sensible defaults: Map aggregation memory configs are too aggressive

2019-02-27 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779837#comment-16779837
 ] 

Gopal V commented on HIVE-20656:


This patch already does that now, I think we can just wipe over that.

> Sensible defaults: Map aggregation memory configs are too aggressive
> 
>
> Key: HIVE-20656
> URL: https://issues.apache.org/jira/browse/HIVE-20656
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20656.1.patch
>
>
> The defaults for the following configs seem to be too aggressive. In Java 
> this can easily lead to several full GC pauses whose memory cannot be 
> reclaimed.
> {code:java}
> HIVEMAPAGGRHASHMEMORY("hive.map.aggr.hash.percentmemory", (float) 0.99,
> "Portion of total memory to be used by map-side group aggregation hash 
> table"),
> HIVEMAPAGGRMEMORYTHRESHOLD("hive.map.aggr.hash.force.flush.memory.threshold", 
> (float) 0.9,
> "The max memory to be used by map-side group aggregation hash table.\n" +
> "If the memory usage is higher than this number, force to flush 
> data"),{code}
>  
> We can be a little bit conservative for these configs to avoid getting into GC 
> pauses.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20656) Sensible defaults: Map aggregation memory configs are too aggressive

2019-02-27 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779835#comment-16779835
 ] 

Ashutosh Chauhan commented on HIVE-20656:
-

Sorry about that. I will reopen HIVE-20364 and update the correct config there.

> Sensible defaults: Map aggregation memory configs are too aggressive
> 
>
> Key: HIVE-20656
> URL: https://issues.apache.org/jira/browse/HIVE-20656
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20656.1.patch
>
>
> The defaults for the following configs seem to be too aggressive. In Java 
> this can easily lead to several full GC pauses whose memory cannot be 
> reclaimed.
> {code:java}
> HIVEMAPAGGRHASHMEMORY("hive.map.aggr.hash.percentmemory", (float) 0.99,
> "Portion of total memory to be used by map-side group aggregation hash 
> table"),
> HIVEMAPAGGRMEMORYTHRESHOLD("hive.map.aggr.hash.force.flush.memory.threshold", 
> (float) 0.9,
> "The max memory to be used by map-side group aggregation hash table.\n" +
> "If the memory usage is higher than this number, force to flush 
> data"),{code}
>  
> We can be a little bit conservative for these configs to avoid getting into GC 
> pauses.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-20656) Sensible defaults: Map aggregation memory configs are too aggressive

2019-02-27 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779830#comment-16779830
 ] 

Gopal V edited comment on HIVE-20656 at 2/27/19 10:27 PM:
--

LGTM - +1 tests pending

-Maybe do the thing that HIVE-20364 was actually supposed to do?-


was (Author: gopalv):
LGTM - +1 tests pending

Maybe do the thing that HIVE-20364 was actually supposed to do?

> Sensible defaults: Map aggregation memory configs are too aggressive
> 
>
> Key: HIVE-20656
> URL: https://issues.apache.org/jira/browse/HIVE-20656
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20656.1.patch
>
>
> The defaults for the following configs seem to be too aggressive. In Java 
> this can easily lead to several full GC pauses whose memory cannot be 
> reclaimed.
> {code:java}
> HIVEMAPAGGRHASHMEMORY("hive.map.aggr.hash.percentmemory", (float) 0.99,
> "Portion of total memory to be used by map-side group aggregation hash 
> table"),
> HIVEMAPAGGRMEMORYTHRESHOLD("hive.map.aggr.hash.force.flush.memory.threshold", 
> (float) 0.9,
> "The max memory to be used by map-side group aggregation hash table.\n" +
> "If the memory usage is higher than this number, force to flush 
> data"),{code}
>  
> We can be a little bit conservative for these configs to avoid getting into GC 
> pauses.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20656) Sensible defaults: Map aggregation memory configs are too aggressive

2019-02-27 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779830#comment-16779830
 ] 

Gopal V commented on HIVE-20656:


LGTM - +1 tests pending

Maybe do the thing that HIVE-20364 was actually supposed to do?

> Sensible defaults: Map aggregation memory configs are too aggressive
> 
>
> Key: HIVE-20656
> URL: https://issues.apache.org/jira/browse/HIVE-20656
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20656.1.patch
>
>
> The defaults for the following configs seem to be too aggressive. In Java 
> this can easily lead to several full GC pauses whose memory cannot be 
> reclaimed.
> {code:java}
> HIVEMAPAGGRHASHMEMORY("hive.map.aggr.hash.percentmemory", (float) 0.99,
> "Portion of total memory to be used by map-side group aggregation hash 
> table"),
> HIVEMAPAGGRMEMORYTHRESHOLD("hive.map.aggr.hash.force.flush.memory.threshold", 
> (float) 0.9,
> "The max memory to be used by map-side group aggregation hash table.\n" +
> "If the memory usage is higher than this number, force to flush 
> data"),{code}
>  
> We can be a little bit conservative for these configs to avoid getting into GC 
> pauses.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21332) Cache Purge command does purge the in-use buffer.

2019-02-27 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779824#comment-16779824
 ] 

Ashutosh Chauhan commented on HIVE-21332:
-

+1

> Cache Purge command does purge the in-use buffer.
> -
>
> Key: HIVE-21332
> URL: https://issues.apache.org/jira/browse/HIVE-21332
> Project: Hive
>  Issue Type: Bug
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-21332.patch
>
>
> The Cache Purge command is purging what it is not supposed to evict.
> This can lead to an unrecoverable state.
> {code} 
> TaskAttempt 3 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1545278897356_0093_27_00_01_3:java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
> java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 32768; at 0 out of 1 (entire cache is fragmented and 
> locked, or an internal issue)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
>  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>  at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>  at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>  at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 32768; at 0 out of 1 (entire cache is fragmented and 
> locked, or an internal issue)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>  ... 15 more
> Caused by: java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 32768; at 0 out of 1 (entire cache is fragmented and 
> locked, or an internal issue)
>  at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>  at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>  at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
>  at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
>  at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
>  ... 17 more
> Caused by: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 32768; at 0 out of 1 (entire cache is fragmented and 
> locked, or an internal issue)
>  at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:513)
>  at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:407)
>  at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:266)
>  at 
> 

[jira] [Assigned] (HIVE-20656) Map aggregation memory configs are too aggressive

2019-02-27 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-20656:


Assignee: Prasanth Jayachandran

> Map aggregation memory configs are too aggressive
> -
>
> Key: HIVE-20656
> URL: https://issues.apache.org/jira/browse/HIVE-20656
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20656.1.patch
>
>
> The defaults for the following configs seem to be too aggressive. In Java 
> this can easily lead to several full GC pauses whose memory cannot be 
> reclaimed.
> {code:java}
> HIVEMAPAGGRHASHMEMORY("hive.map.aggr.hash.percentmemory", (float) 0.99,
> "Portion of total memory to be used by map-side group aggregation hash 
> table"),
> HIVEMAPAGGRMEMORYTHRESHOLD("hive.map.aggr.hash.force.flush.memory.threshold", 
> (float) 0.9,
> "The max memory to be used by map-side group aggregation hash table.\n" +
> "If the memory usage is higher than this number, force to flush 
> data"),{code}
>  
> We can be a little bit conservative for these configs to avoid getting into GC 
> pauses.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20656) Map aggregation memory configs are too aggressive

2019-02-27 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20656:
-
Attachment: HIVE-20656.1.patch

> Map aggregation memory configs are too aggressive
> -
>
> Key: HIVE-20656
> URL: https://issues.apache.org/jira/browse/HIVE-20656
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20656.1.patch
>
>
> The defaults for the following configs seem to be too aggressive. In Java 
> this can easily lead to several full GC pauses whose memory cannot be 
> reclaimed.
> {code:java}
> HIVEMAPAGGRHASHMEMORY("hive.map.aggr.hash.percentmemory", (float) 0.99,
> "Portion of total memory to be used by map-side group aggregation hash 
> table"),
> HIVEMAPAGGRMEMORYTHRESHOLD("hive.map.aggr.hash.force.flush.memory.threshold", 
> (float) 0.9,
> "The max memory to be used by map-side group aggregation hash table.\n" +
> "If the memory usage is higher than this number, force to flush 
> data"),{code}
>  
> We can be a little bit conservative for these configs to avoid getting into GC 
> pauses.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20656) Map aggregation memory configs are too aggressive

2019-02-27 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20656:
-
Status: Patch Available  (was: Open)

> Map aggregation memory configs are too aggressive
> -
>
> Key: HIVE-20656
> URL: https://issues.apache.org/jira/browse/HIVE-20656
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20656.1.patch
>
>
> The defaults for the following configs seem to be too aggressive. In Java 
> this can easily lead to several full GC pauses whose memory cannot be 
> reclaimed.
> {code:java}
> HIVEMAPAGGRHASHMEMORY("hive.map.aggr.hash.percentmemory", (float) 0.99,
> "Portion of total memory to be used by map-side group aggregation hash 
> table"),
> HIVEMAPAGGRMEMORYTHRESHOLD("hive.map.aggr.hash.force.flush.memory.threshold", 
> (float) 0.9,
> "The max memory to be used by map-side group aggregation hash table.\n" +
> "If the memory usage is higher than this number, force to flush 
> data"),{code}
>  
> We can be a little bit conservative for these configs to avoid getting into GC 
> pauses.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20656) Map aggregation memory configs are too aggressive

2019-02-27 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779828#comment-16779828
 ] 

Gopal V commented on HIVE-20656:


HIVE-20364 changed the wrong config - the comment says 

{code}
Update default for hive.map.aggr.hash.min.reduction
{code}

the actual config changed was

{code}
-HIVEMAPAGGRHASHMEMORY("hive.map.aggr.hash.percentmemory", (float) 0.5,
+HIVEMAPAGGRHASHMEMORY("hive.map.aggr.hash.percentmemory", (float) 0.99,
{code}

The mem thresholds are good at 0.5 - the GC pauses start to really trouble us 
at 80%, the tez buffers are approx ~30%, so 50% is a good enough high watermark.

> Map aggregation memory configs are too aggressive
> -
>
> Key: HIVE-20656
> URL: https://issues.apache.org/jira/browse/HIVE-20656
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Priority: Major
>
> The defaults for the following configs seem to be too aggressive. In Java 
> this can easily lead to several full GC pauses whose memory cannot be 
> reclaimed.
> {code:java}
> HIVEMAPAGGRHASHMEMORY("hive.map.aggr.hash.percentmemory", (float) 0.99,
> "Portion of total memory to be used by map-side group aggregation hash 
> table"),
> HIVEMAPAGGRMEMORYTHRESHOLD("hive.map.aggr.hash.force.flush.memory.threshold", 
> (float) 0.9,
> "The max memory to be used by map-side group aggregation hash table.\n" +
> "If the memory usage is higher than this number, force to flush 
> data"),{code}
>  
> We can be a little bit conservative for these configs to avoid getting into GC 
> pauses.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21298) Move Hive Schema Tool classes to their own package to have cleaner structure

2019-02-27 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21298:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Miklos!

> Move Hive Schema Tool classes to their own package to have  cleaner structure
> -
>
> Key: HIVE-21298
> URL: https://issues.apache.org/jira/browse/HIVE-21298
> Project: Hive
>  Issue Type: Improvement
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21298.01.patch, HIVE-21298.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20436) Lock Manager scalability - linear

2019-02-27 Thread Vaibhav Gumashta (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779825#comment-16779825
 ] 

Vaibhav Gumashta commented on HIVE-20436:
-

Look at notes in {{TxnHandler.checkLock}}

> Lock Manager scalability - linear
> -
>
> Key: HIVE-20436
> URL: https://issues.apache.org/jira/browse/HIVE-20436
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
>
> Hive TransactionManager currently has a mix of lock-based and optimistic 
> concurrency management techniques (which at times overlap).
> For inserts with Dynamic Partitions that represent an update/merge, it acquires 
> locks on each existing partition, which can flood the metastore DB.
> We need to clean up the logical model and the implementation.
> This will be an umbrella Jira for this work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21332) Cache Purge command does purge the in-use buffer.

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779801#comment-16779801
 ] 

Hive QA commented on HIVE-21332:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
51s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
52s{color} | {color:blue} llap-server in master has 79 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} llap-server: The patch generated 0 new + 63 
unchanged - 3 fixed = 63 total (was 66) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 15m 52s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16277/dev-support/hive-personality.sh
 |
| git revision | master / 474a19d |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: llap-server U: llap-server |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16277/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Cache Purge command does purge the in-use buffer.
> -
>
> Key: HIVE-21332
> URL: https://issues.apache.org/jira/browse/HIVE-21332
> Project: Hive
>  Issue Type: Bug
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-21332.patch
>
>
> The Cache Purge command is purging what it is not supposed to evict.
> This can lead to an unrecoverable state.
> {code} 
> TaskAttempt 3 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1545278897356_0093_27_00_01_3:java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
> java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 32768; at 0 out of 1 (entire cache is fragmented and 
> locked, or an internal issue)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
>  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>  at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> 

[jira] [Updated] (HIVE-21001) Upgrade to calcite-1.18

2019-02-27 Thread Zoltan Haindrich (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-21001:

Attachment: HIVE-21001.41.patch

> Upgrade to calcite-1.18
> ---
>
> Key: HIVE-21001
> URL: https://issues.apache.org/jira/browse/HIVE-21001
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-21001.01.patch, HIVE-21001.01.patch, 
> HIVE-21001.02.patch, HIVE-21001.03.patch, HIVE-21001.04.patch, 
> HIVE-21001.05.patch, HIVE-21001.06.patch, HIVE-21001.06.patch, 
> HIVE-21001.07.patch, HIVE-21001.08.patch, HIVE-21001.08.patch, 
> HIVE-21001.08.patch, HIVE-21001.09.patch, HIVE-21001.09.patch, 
> HIVE-21001.09.patch, HIVE-21001.10.patch, HIVE-21001.11.patch, 
> HIVE-21001.12.patch, HIVE-21001.13.patch, HIVE-21001.15.patch, 
> HIVE-21001.16.patch, HIVE-21001.17.patch, HIVE-21001.18.patch, 
> HIVE-21001.18.patch, HIVE-21001.19.patch, HIVE-21001.20.patch, 
> HIVE-21001.21.patch, HIVE-21001.22.patch, HIVE-21001.22.patch, 
> HIVE-21001.22.patch, HIVE-21001.23.patch, HIVE-21001.24.patch, 
> HIVE-21001.26.patch, HIVE-21001.26.patch, HIVE-21001.26.patch, 
> HIVE-21001.26.patch, HIVE-21001.26.patch, HIVE-21001.27.patch, 
> HIVE-21001.28.patch, HIVE-21001.29.patch, HIVE-21001.29.patch, 
> HIVE-21001.30.patch, HIVE-21001.31.patch, HIVE-21001.32.patch, 
> HIVE-21001.34.patch, HIVE-21001.35.patch, HIVE-21001.36.patch, 
> HIVE-21001.37.patch, HIVE-21001.38.patch, HIVE-21001.39.patch, 
> HIVE-21001.40.patch, HIVE-21001.41.patch
>
>
> XLEAR LIBRARY CACHE 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21338) Remove order by and limit for aggregates

2019-02-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-21338:
--


> Remove order by and limit for aggregates
> 
>
> Key: HIVE-21338
> URL: https://issues.apache.org/jira/browse/HIVE-21338
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> If a query is guaranteed to produce at most one row, LIMIT and ORDER BY could 
> be removed. This saves an unnecessary vertex for the LIMIT/ORDER BY (see the 
> sketch after the plan below).
> {code:sql}
> explain select count(*) cs from store_sales where ss_ext_sales_price > 100.00 
> order by cs limit 100
> {code}
> {code}
> STAGE PLANS:
>   Stage: Stage-1
> Tez
>   DagId: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Edges:
> Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
>   DagName: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Vertices:
> Map 1
> Map Operator Tree:
> TableScan
>   alias: store_sales
>   filterExpr: (ss_ext_sales_price > 100) (type: boolean)
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Filter Operator
> predicate: (ss_ext_sales_price > 100) (type: boolean)
> Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
> Select Operator
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Group By Operator
> aggregations: count()
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   sort order:
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col0 (type: bigint)
> Execution mode: vectorized
> Reducer 2
> Execution mode: vectorized
> Reduce Operator Tree:
>   Group By Operator
> aggregations: count(VALUE._col0)
> mode: mergepartial
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator
>   key expressions: _col0 (type: bigint)
>   sort order: +
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   TopN Hash Memory Usage: 0.1
> Reducer 3
> Execution mode: vectorized
> Reduce Operator Tree:
>   Select Operator
> expressions: KEY.reducesinkkey0 (type: bigint)
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Limit
>   Number of rows: 100
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   File Output Operator
> compressed: false
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> table:
> input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
> serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   limit: 100
>   Processor Tree:
> ListSink
> {code}
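
To make the proposed rewrite concrete, here is a minimal Calcite-style rule sketch that drops a Sort (ORDER BY/LIMIT) whose input is known to produce at most one row. The class name and the exact guards are illustrative assumptions, not the rule that was eventually committed for this issue.

{code:java}
import org.apache.calcite.plan.RelOptRule;
import org.apache.calcite.plan.RelOptRuleCall;
import org.apache.calcite.rel.core.Sort;
import org.apache.calcite.rel.metadata.RelMetadataQuery;
import org.apache.calcite.rex.RexLiteral;

/** Drops an ORDER BY / LIMIT over an input that can emit at most one row. */
public final class SingleRowSortRemoveRule extends RelOptRule {

  public SingleRowSortRemoveRule() {
    super(operand(Sort.class, any()), "SingleRowSortRemoveRule");
  }

  @Override
  public void onMatch(RelOptRuleCall call) {
    final Sort sort = call.rel(0);
    final RelMetadataQuery mq = call.getMetadataQuery();
    final Double maxRows = mq.getMaxRowCount(sort.getInput());
    // Safe only when the input emits at most one row and the Sort carries no
    // OFFSET and no LIMIT 0 that could still drop that row.
    if (maxRows != null && maxRows <= 1D
        && sort.offset == null
        && (sort.fetch == null || RexLiteral.intValue(sort.fetch) >= 1)) {
      call.transformTo(sort.getInput());
    }
  }
}
{code}

Calcite's max-row-count metadata reports 1 for an aggregate without grouping keys, which is exactly the {{count(*)}} case in the plan above, so the sort/limit stage (Reducer 3) could be dropped.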



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries

2019-02-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21279:
---
Attachment: HIVE-21279.10.patch

> Avoid moving/rename operation in FileSink op for SELECT queries
> ---
>
> Key: HIVE-21279
> URL: https://issues.apache.org/jira/browse/HIVE-21279
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21279.1.patch, HIVE-21279.10.patch, 
> HIVE-21279.2.patch, HIVE-21279.3.patch, HIVE-21279.4.patch, 
> HIVE-21279.5.patch, HIVE-21279.6.patch, HIVE-21279.7.patch, 
> HIVE-21279.8.patch, HIVE-21279.9.patch
>
>
> Currently, at the end of a job the FileSink operator moves/renames the temp 
> directory to another directory from which FetchTask fetches the result. This is 
> done to avoid fetching potentially partial/invalid files written by 
> failed/runaway tasks. This operation is expensive for cloud storage. It could be 
> avoided if FetchTask were passed the set of files to read from instead of the 
> whole directory.
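
A minimal sketch of the idea, assuming the committer records which task attempts succeeded; the class, method, and file-naming convention below are hypothetical and are not the API introduced by the attached patches.

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class CommittedFileLister {

  /**
   * Collects the files written by successful task attempts so the fetch side
   * can read exactly these paths instead of a renamed directory.
   *
   * @param fs                filesystem holding the query scratch directory
   * @param tmpDir            temp output directory the FileSink wrote into
   * @param committedAttempts attempt ids known to have succeeded (assumed to be
   *                          tracked elsewhere, e.g. by the task committer)
   */
  public static List<Path> listCommittedFiles(FileSystem fs, Path tmpDir,
      Set<String> committedAttempts) throws IOException {
    List<Path> result = new ArrayList<>();
    for (FileStatus status : fs.listStatus(tmpDir)) {
      // File names are assumed to embed the attempt id that produced them, so
      // partial files left behind by failed/runaway attempts are simply skipped.
      String name = status.getPath().getName();
      if (committedAttempts.stream().anyMatch(name::contains)) {
        result.add(status.getPath());
      }
    }
    return result;
  }
}
{code}

With an explicit file list handed to FetchTask, the rename of the whole temp directory, which amounts to a copy on many object stores, is no longer on the critical path of a simple SELECT.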



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21198) Introduce a database object reference class

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779781#comment-16779781
 ] 

Hive QA commented on HIVE-21198:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12960403/HIVE-21198.5.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1024 failed/errored test(s), 15822 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=267)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[escape_comments] 
(batchId=275)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_1] 
(batchId=275)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas_blobstore_to_blobstore]
 (batchId=278)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas_blobstore_to_hdfs]
 (batchId=278)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas_hdfs_to_blobstore]
 (batchId=278)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_addpartition_blobstore_to_blobstore]
 (batchId=278)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_addpartition_blobstore_to_local]
 (batchId=278)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_addpartition_blobstore_to_warehouse]
 (batchId=278)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_addpartition_local_to_blobstore]
 (batchId=278)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_blobstore]
 (batchId=278)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_blobstore_nonpart]
 (batchId=278)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_local]
 (batchId=278)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_warehouse]
 (batchId=278)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_warehouse_nonpart]
 (batchId=278)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_local_to_blobstore]
 (batchId=278)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_stats2] (batchId=51)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_stats4] (batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_stats5] (batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] 
(batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[allcolref_in_udf] 
(batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter1] (batchId=93)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter2] (batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter3] (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter4] (batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alterColumnStatsPart] 
(batchId=93)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alterColumnStats] 
(batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_char1] (batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_file_format] 
(batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_stats] 
(batchId=65)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_numbuckets_partitioned_table2_h23]
 (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_numbuckets_partitioned_table_h23]
 (batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_change_col]
 (batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_clusterby_sortby]
 (batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_coltype] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_format_loc]
 (batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_onto_nocurrent_db]
 (batchId=95)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_update_status]
 (batchId=97)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_rename_table] 
(batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_skewed_table] 
(batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_add_partition]
 (batchId=19)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_cascade] 
(batchId=96)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_column_stats]
 (batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_location] 
(batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_not_sorted] 
(batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_serde2] 
(batchId=28)

[jira] [Updated] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries

2019-02-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21279:
---
Status: Patch Available  (was: Open)

> Avoid moving/rename operation in FileSink op for SELECT queries
> ---
>
> Key: HIVE-21279
> URL: https://issues.apache.org/jira/browse/HIVE-21279
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21279.1.patch, HIVE-21279.10.patch, 
> HIVE-21279.2.patch, HIVE-21279.3.patch, HIVE-21279.4.patch, 
> HIVE-21279.5.patch, HIVE-21279.6.patch, HIVE-21279.7.patch, 
> HIVE-21279.8.patch, HIVE-21279.9.patch
>
>
> Currently, at the end of a job the FileSink operator moves/renames the temp 
> directory to another directory from which FetchTask fetches the result. This is 
> done to avoid fetching potentially partial/invalid files written by 
> failed/runaway tasks. This operation is expensive for cloud storage. It could be 
> avoided if FetchTask were passed the set of files to read from instead of the 
> whole directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries

2019-02-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21279:
---
Status: Open  (was: Patch Available)

> Avoid moving/rename operation in FileSink op for SELECT queries
> ---
>
> Key: HIVE-21279
> URL: https://issues.apache.org/jira/browse/HIVE-21279
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21279.1.patch, HIVE-21279.10.patch, 
> HIVE-21279.2.patch, HIVE-21279.3.patch, HIVE-21279.4.patch, 
> HIVE-21279.5.patch, HIVE-21279.6.patch, HIVE-21279.7.patch, 
> HIVE-21279.8.patch, HIVE-21279.9.patch
>
>
> Currently, at the end of a job the FileSink operator moves/renames the temp 
> directory to another directory from which FetchTask fetches the result. This is 
> done to avoid fetching potentially partial/invalid files written by 
> failed/runaway tasks. This operation is expensive for cloud storage. It could be 
> avoided if FetchTask were passed the set of files to read from instead of the 
> whole directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries

2019-02-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21279:
---
Attachment: (was: HIVE-21279.10.patch)

> Avoid moving/rename operation in FileSink op for SELECT queries
> ---
>
> Key: HIVE-21279
> URL: https://issues.apache.org/jira/browse/HIVE-21279
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21279.1.patch, HIVE-21279.10.patch, 
> HIVE-21279.2.patch, HIVE-21279.3.patch, HIVE-21279.4.patch, 
> HIVE-21279.5.patch, HIVE-21279.6.patch, HIVE-21279.7.patch, 
> HIVE-21279.8.patch, HIVE-21279.9.patch
>
>
> Currently, at the end of a job the FileSink operator moves/renames the temp 
> directory to another directory from which FetchTask fetches the result. This is 
> done to avoid fetching potentially partial/invalid files written by 
> failed/runaway tasks. This operation is expensive for cloud storage. It could be 
> avoided if FetchTask were passed the set of files to read from instead of the 
> whole directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21198) Introduce a database object reference class

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779753#comment-16779753
 ] 

Hive QA commented on HIVE-21198:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
51s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
37s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
36s{color} | {color:blue} storage-api in master has 48 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  5m 
14s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
47s{color} | {color:blue} hcatalog/core in master has 29 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
43s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
45s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} storage-api: The patch generated 0 new + 4 unchanged 
- 1 fixed = 4 total (was 5) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
10s{color} | {color:red} ql: The patch generated 13 new + 2113 unchanged - 136 
fixed = 2126 total (was 2249) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} The patch core passed checkstyle {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
41s{color} | {color:green} storage-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
9s{color} | {color:green} ql generated 0 new + 2249 unchanged - 2 fixed = 2249 
total (was 2251) {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
56s{color} | {color:green} core in the patch passed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  1m 
10s{color} | {color:red} ql generated 1 new + 99 unchanged - 1 fixed = 100 
total (was 100) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
19s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 39m 19s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16276/dev-support/hive-personality.sh
 |
| git revision | master / 474a19d |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16276/yetus/diff-checkstyle-ql.txt
 |
| javadoc | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16276/yetus/diff-javadoc-javadoc-ql.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16276/yetus/patch-asflicense-problems.txt
 |
| modules | C: storage-api ql hcatalog/core U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16276/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.

[jira] [Updated] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries

2019-02-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21279:
---
Attachment: HIVE-21279.10.patch

> Avoid moving/rename operation in FileSink op for SELECT queries
> ---
>
> Key: HIVE-21279
> URL: https://issues.apache.org/jira/browse/HIVE-21279
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21279.1.patch, HIVE-21279.10.patch, 
> HIVE-21279.2.patch, HIVE-21279.3.patch, HIVE-21279.4.patch, 
> HIVE-21279.5.patch, HIVE-21279.6.patch, HIVE-21279.7.patch, 
> HIVE-21279.8.patch, HIVE-21279.9.patch
>
>
> Currently, at the end of a job the FileSink operator moves/renames the temp 
> directory to another directory from which FetchTask fetches the result. This is 
> done to avoid fetching potentially partial/invalid files written by 
> failed/runaway tasks. This operation is expensive for cloud storage. It could be 
> avoided if FetchTask were passed the set of files to read from instead of the 
> whole directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries

2019-02-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21279:
---
Status: Patch Available  (was: Open)

> Avoid moving/rename operation in FileSink op for SELECT queries
> ---
>
> Key: HIVE-21279
> URL: https://issues.apache.org/jira/browse/HIVE-21279
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21279.1.patch, HIVE-21279.10.patch, 
> HIVE-21279.2.patch, HIVE-21279.3.patch, HIVE-21279.4.patch, 
> HIVE-21279.5.patch, HIVE-21279.6.patch, HIVE-21279.7.patch, 
> HIVE-21279.8.patch, HIVE-21279.9.patch
>
>
> Currently, at the end of a job the FileSink operator moves/renames the temp 
> directory to another directory from which FetchTask fetches the result. This is 
> done to avoid fetching potentially partial/invalid files written by 
> failed/runaway tasks. This operation is expensive for cloud storage. It could be 
> avoided if FetchTask were passed the set of files to read from instead of the 
> whole directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21329) Custom Tez runtime unordered output buffer size depending on operator pipeline

2019-02-27 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-21329:
---
Attachment: HIVE-21329.01.patch

> Custom Tez runtime unordered output buffer size depending on operator pipeline
> --
>
> Key: HIVE-21329
> URL: https://issues.apache.org/jira/browse/HIVE-21329
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21329.01.patch, HIVE-21329.patch, HIVE-21329.patch
>
>
> For instance, if we have a reduce sink operator with no keys followed by a 
> Group By (merge partial), we can decrease the output buffer size since we 
> will only produce a single row.
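
A hedged sketch of what a per-edge override could look like; the helper class and the single-row check are assumptions, while {{tez.runtime.unordered.output.buffer.size-mb}} is the standard Tez runtime setting for the unordered output buffer.

{code:java}
import org.apache.hadoop.conf.Configuration;

public final class UnorderedBufferSizing {

  private static final String UNORDERED_BUFFER_MB =
      "tez.runtime.unordered.output.buffer.size-mb";

  /**
   * @param edgeConf     per-edge Tez runtime configuration
   * @param singleRowOut true when the pipeline feeding this edge is a keyless
   *                     ReduceSink followed by a merge-partial GroupBy, i.e. it
   *                     emits a single row (the caller performs this check)
   */
  public static void tuneUnorderedBuffer(Configuration edgeConf, boolean singleRowOut) {
    if (singleRowOut) {
      // A single aggregation row fits comfortably in a 1 MB buffer, so there is
      // no point in reserving a much larger default buffer for this edge.
      edgeConf.setInt(UNORDERED_BUFFER_MB, 1);
    }
  }
}
{code}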



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21337) HMS Metadata migration from Postgres/Derby to other DBs fail

2019-02-27 Thread Naveen Gangam (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam reassigned HIVE-21337:



> HMS Metadata migration from Postgres/Derby to other DBs fail
> 
>
> Key: HIVE-21337
> URL: https://issues.apache.org/jira/browse/HIVE-21337
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
>
> A customer was recently migrating the HMS metastore from Postgres to Oracle. 
> During import of the [exported] data from the Postgres HMS metastore, 
> failures are seen because COLUMNS_V2.COMMENT is 4000 bytes long there, whereas 
> the Oracle and other schemas define it to be 256 bytes.
> This inconsistency in the schema makes the migration cumbersome and manual. 
> This jira makes this column's length consistent across all databases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21336) HMS Index PCS_STATS_IDX too long for Oracle when NLS_LENGTH_SEMANTICS=char

2019-02-27 Thread Naveen Gangam (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-21336:
-
Attachment: HIVE-21336.patch

> HMS Index PCS_STATS_IDX too long for Oracle when NLS_LENGTH_SEMANTICS=char
> --
>
> Key: HIVE-21336
> URL: https://issues.apache.org/jira/browse/HIVE-21336
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
> Attachments: HIVE-21336.patch
>
>
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME) 
> Error: ORA-01450: maximum key length (6398) exceeded (state=72000,code=1450) 
> Customer tried the same DDL in SQL Developer, and got the same error. This 
> could be a result of a combination of DB-level settings like the db_block_size 
> limiting the maximum key length, as per the doc below: 
> http://www.dba-oracle.com/t_ora_01450_maximum_key_length_exceeded.htm 
> Also, {{NLS_LENGTH_SEMANTICS}} is BYTE by default, but users can set it at 
> the session level to CHAR, thus reducing the max size of the index length. We 
> have increased the size of COLUMN_NAME from 128 to 767 (used to be at 
> 1000) and TABLE_NAME from 128 to 256. This was done by setting: 
> {code} 
> CREATE TABLE PART_COL_STATS ( 
> CS_ID NUMBER NOT NULL, 
> DB_NAME VARCHAR2(128) NOT NULL, 
> TABLE_NAME VARCHAR2(256) NOT NULL, 
> PARTITION_NAME VARCHAR2(767) NOT NULL, 
> COLUMN_NAME VARCHAR2(767) NOT NULL,  
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> {code} 
> Reproducer: 
> {code} 
> SQL*Plus: Release 11.2.0.2.0 Production on Wed Feb 27 11:02:16 2019 Copyright 
> (c) 1982, 2011, Oracle. All rights reserved. 
> Connected to: Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit 
> Production 
> SQL> select * from v$nls_parameters where parameter = 'NLS_LENGTH_SEMANTICS'; 
> PARAMETER 
>  
> VALUE 
>  
> NLS_LENGTH_SEMANTICS 
> BYTE 
> SQL> alter session set NLS_LENGTH_SEMANTICS=CHAR; Session altered. 
> SQL> commit; Commit complete. 
> SQL> select * from v$nls_parameters where parameter = 'NLS_LENGTH_SEMANTICS'; 
> PARAMETER 
>  
> VALUE 
>  
> NLS_LENGTH_SEMANTICS 
> CHAR 
> SQL> CREATE TABLE PART_COL_STATS (CS_ID NUMBER NOT NULL, DB_NAME 
> VARCHAR2(128) NOT NULL, TABLE_NAME VARCHAR2(256) NOT NULL, PARTITION_NAME 
> VARCHAR2(767) NOT NULL, COLUMN_NAME VARCHAR2(767) NOT NULL); 
> Table created. 
> SQL> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME) 
> * ERROR at line 1: ORA-01450: maximum key length (6398) exceeded 
> SQL> alter session set NLS_LENGTH_SEMANTICS=BYTE; 
> Session altered. 
> SQL> commit; 
> Commit complete. 
> SQL> drop table PART_COL_STATS; 
> Table dropped. 
> SQL> commit; 
> Commit complete. 
> SQL> CREATE TABLE PART_COL_STATS (CS_ID NUMBER NOT NULL, DB_NAME 
> VARCHAR2(128) NOT NULL, TABLE_NAME VARCHAR2(256) NOT NULL, PARTITION_NAME 
> VARCHAR2(767) NOT NULL, COLUMN_NAME VARCHAR2(767) NOT NULL); 
> Table created. 
> SQL> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> Index created. 
> SQL> commit; 
> Commit complete. 
> SQL> 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21336) HMS Index PCS_STATS_IDX too long for Oracle when NLS_LENGTH_SEMANTICS=char

2019-02-27 Thread Naveen Gangam (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779732#comment-16779732
 ] 

Naveen Gangam commented on HIVE-21336:
--

[~ychena] [~vihangk1] Could you please review? Thanks

> HMS Index PCS_STATS_IDX too long for Oracle when NLS_LENGTH_SEMANTICS=char
> --
>
> Key: HIVE-21336
> URL: https://issues.apache.org/jira/browse/HIVE-21336
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
> Attachments: HIVE-21336.patch
>
>
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME) 
> Error: ORA-01450: maximum key length (6398) exceeded (state=72000,code=1450) 
> Customer tried the same DDL in SQL Developer, and got the same error. This 
> could be a result of a combination of DB-level settings like the db_block_size 
> limiting the maximum key length, as per the doc below: 
> http://www.dba-oracle.com/t_ora_01450_maximum_key_length_exceeded.htm 
> Also, {{NLS_LENGTH_SEMANTICS}} is BYTE by default, but users can set it at 
> the session level to CHAR, thus reducing the max size of the index length. We 
> have increased the size of COLUMN_NAME from 128 to 767 (used to be at 
> 1000) and TABLE_NAME from 128 to 256. This was done by setting: 
> {code} 
> CREATE TABLE PART_COL_STATS ( 
> CS_ID NUMBER NOT NULL, 
> DB_NAME VARCHAR2(128) NOT NULL, 
> TABLE_NAME VARCHAR2(256) NOT NULL, 
> PARTITION_NAME VARCHAR2(767) NOT NULL, 
> COLUMN_NAME VARCHAR2(767) NOT NULL,  
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> {code} 
> Reproducer: 
> {code} 
> SQL*Plus: Release 11.2.0.2.0 Production on Wed Feb 27 11:02:16 2019 Copyright 
> (c) 1982, 2011, Oracle. All rights reserved. 
> Connected to: Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit 
> Production 
> SQL> select * from v$nls_parameters where parameter = 'NLS_LENGTH_SEMANTICS'; 
> PARAMETER 
>  
> VALUE 
>  
> NLS_LENGTH_SEMANTICS 
> BYTE 
> SQL> alter session set NLS_LENGTH_SEMANTICS=CHAR; Session altered. 
> SQL> commit; Commit complete. 
> SQL> select * from v$nls_parameters where parameter = 'NLS_LENGTH_SEMANTICS'; 
> PARAMETER 
>  
> VALUE 
>  
> NLS_LENGTH_SEMANTICS 
> CHAR 
> SQL> CREATE TABLE PART_COL_STATS (CS_ID NUMBER NOT NULL, DB_NAME 
> VARCHAR2(128) NOT NULL, TABLE_NAME VARCHAR2(256) NOT NULL, PARTITION_NAME 
> VARCHAR2(767) NOT NULL, COLUMN_NAME VARCHAR2(767) NOT NULL); 
> Table created. 
> SQL> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME) 
> * ERROR at line 1: ORA-01450: maximum key length (6398) exceeded 
> SQL> alter session set NLS_LENGTH_SEMANTICS=BYTE; 
> Session altered. 
> SQL> commit; 
> Commit complete. 
> SQL> drop table PART_COL_STATS; 
> Table dropped. 
> SQL> commit; 
> Commit complete. 
> SQL> CREATE TABLE PART_COL_STATS (CS_ID NUMBER NOT NULL, DB_NAME 
> VARCHAR2(128) NOT NULL, TABLE_NAME VARCHAR2(256) NOT NULL, PARTITION_NAME 
> VARCHAR2(767) NOT NULL, COLUMN_NAME VARCHAR2(767) NOT NULL); 
> Table created. 
> SQL> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> Index created. 
> SQL> commit; 
> Commit complete. 
> SQL> 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21329) Custom Tez runtime unordered output buffer size depending on operator pipeline

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779725#comment-16779725
 ] 

Hive QA commented on HIVE-21329:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12960398/HIVE-21329.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16275/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16275/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16275/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2019-02-27 20:00:05.677
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-16275/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2019-02-27 20:00:05.681
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   66533e2..474a19d  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 66533e2 HIVE-21297: Replace all occurences of new Long, Boolean, 
Double etc with the corresponding .valueOf (Ivan Suller via )
+ git clean -f -d
Removing standalone-metastore/metastore-server/src/gen/
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at 474a19d HIVE-21320 : get_fields() and get_tables_by_type() are 
not protected by HMS server access control (Na Li, reviewed by Peter Vary)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2019-02-27 20:00:07.152
+ rm -rf ../yetus_PreCommit-HIVE-Build-16275
+ mkdir ../yetus_PreCommit-HIVE-Build-16275
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-16275
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-16275/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java: does not 
exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java: does not 
exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java: does not 
exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/plan/TezEdgeProperty.java: does 
not exist in index
Going to apply patch with: git apply -p1
+ [[ maven == \m\a\v\e\n ]]
+ rm -rf /data/hiveptest/working/maven/org/apache/hive
+ mvn -B clean install -DskipTests -T 4 -q 
-Dmaven.repo.local=/data/hiveptest/working/maven
protoc-jar: executing: [/tmp/protoc1938844176056961807.exe, --version]
libprotoc 2.5.0
protoc-jar: executing: [/tmp/protoc1938844176056961807.exe, 
-I/data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-common/src/main/protobuf/org/apache/hadoop/hive/metastore,
 
--java_out=/data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-common/target/generated-sources,
 
/data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-common/src/main/protobuf/org/apache/hadoop/hive/metastore/metastore.proto]
ANTLR Parser Generator  Version 3.5.2
protoc-jar: executing: [/tmp/protoc78224008702710153.exe, --version]
libprotoc 2.5.0
ANTLR Parser Generator  Version 3.5.2
Output file 
/data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-server/target/generated-sources/org/apache/hadoop/hive/metastore/parser/FilterParser.java
 does not exist: must build 
/data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/parser/Filter.g

[jira] [Commented] (HIVE-21312) FSStatsAggregator::connect is slow

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779720#comment-16779720
 ] 

Hive QA commented on HIVE-21312:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12960387/HIVE-21312.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15819 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16274/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16274/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16274/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12960387 - PreCommit-HIVE-Build

> FSStatsAggregator::connect is slow
> --
>
> Key: HIVE-21312
> URL: https://issues.apache.org/jira/browse/HIVE-21312
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Trivial
> Attachments: HIVE-21312.1.patch, HIVE-21312.2.patch, 
> HIVE-21312.3.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21316) Comparision of varchar column and string literal should happen in varchar

2019-02-27 Thread Zoltan Haindrich (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-21316:

Attachment: HIVE-21316.02.patch

> Comparision of varchar column and string literal should happen in varchar
> -
>
> Key: HIVE-21316
> URL: https://issues.apache.org/jira/browse/HIVE-21316
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-21316.01.patch, HIVE-21316.02.patch
>
>
> this is most probably the root cause behind HIVE-21310 as well



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21336) HMS Index PCS_STATS_IDX too long for Oracle when NLS_LENGTH_SEMANTICS=char

2019-02-27 Thread Naveen Gangam (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam reassigned HIVE-21336:



> HMS Index PCS_STATS_IDX too long for Oracle when NLS_LENGTH_SEMANTICS=char
> --
>
> Key: HIVE-21336
> URL: https://issues.apache.org/jira/browse/HIVE-21336
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
>
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME) 
> Error: ORA-01450: maximum key length (6398) exceeded (state=72000,code=1450) 
> Customer tried the same DDL in SQL Developer, and got the same error. This 
> could be a result of a combination of DB-level settings like the db_block_size 
> limiting the maximum key length, as per the doc below: 
> http://www.dba-oracle.com/t_ora_01450_maximum_key_length_exceeded.htm 
> Also, {{NLS_LENGTH_SEMANTICS}} is BYTE by default, but users can set it at 
> the session level to CHAR, thus reducing the max size of the index length. We 
> have increased the size of COLUMN_NAME from 128 to 767 (used to be at 
> 1000) and TABLE_NAME from 128 to 256. This was done by setting: 
> {code} 
> CREATE TABLE PART_COL_STATS ( 
> CS_ID NUMBER NOT NULL, 
> DB_NAME VARCHAR2(128) NOT NULL, 
> TABLE_NAME VARCHAR2(256) NOT NULL, 
> PARTITION_NAME VARCHAR2(767) NOT NULL, 
> COLUMN_NAME VARCHAR2(767) NOT NULL,  
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> {code} 
> Reproducer: 
> {code} 
> SQL*Plus: Release 11.2.0.2.0 Production on Wed Feb 27 11:02:16 2019 Copyright 
> (c) 1982, 2011, Oracle. All rights reserved. 
> Connected to: Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit 
> Production 
> SQL> select * from v$nls_parameters where parameter = 'NLS_LENGTH_SEMANTICS'; 
> PARAMETER 
>  
> VALUE 
>  
> NLS_LENGTH_SEMANTICS 
> BYTE 
> SQL> alter session set NLS_LENGTH_SEMANTICS=CHAR; Session altered. 
> SQL> commit; Commit complete. 
> SQL> select * from v$nls_parameters where parameter = 'NLS_LENGTH_SEMANTICS'; 
> PARAMETER 
>  
> VALUE 
>  
> NLS_LENGTH_SEMANTICS 
> CHAR 
> SQL> CREATE TABLE PART_COL_STATS (CS_ID NUMBER NOT NULL, DB_NAME 
> VARCHAR2(128) NOT NULL, TABLE_NAME VARCHAR2(256) NOT NULL, PARTITION_NAME 
> VARCHAR2(767) NOT NULL, COLUMN_NAME VARCHAR2(767) NOT NULL); 
> Table created. 
> SQL> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME) 
> * ERROR at line 1: ORA-01450: maximum key length (6398) exceeded 
> SQL> alter session set NLS_LENGTH_SEMANTICS=BYTE; 
> Session altered. 
> SQL> commit; 
> Commit complete. 
> SQL> drop table PART_COL_STATS; 
> Table dropped. 
> SQL> commit; 
> Commit complete. 
> SQL> CREATE TABLE PART_COL_STATS (CS_ID NUMBER NOT NULL, DB_NAME 
> VARCHAR2(128) NOT NULL, TABLE_NAME VARCHAR2(256) NOT NULL, PARTITION_NAME 
> VARCHAR2(767) NOT NULL, COLUMN_NAME VARCHAR2(767) NOT NULL); 
> Table created. 
> SQL> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> Index created. 
> SQL> commit; 
> Commit complete. 
> SQL> 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21336) HMS Index PCS_STATS_IDX too long for Oracle when NLS_LENGTH_SEMANTICS=char

2019-02-27 Thread Naveen Gangam (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-21336:
-
Description: 
CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
(DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME) 
Error: ORA-01450: maximum key length (6398) exceeded (state=72000,code=1450) 

Customer tried the same DDL in SQL Developer, and got the same error. This could 
be a result of a combination of DB-level settings like the db_block_size 
limiting the maximum key length, as per the doc below: 
http://www.dba-oracle.com/t_ora_01450_maximum_key_length_exceeded.htm 

Also, {{NLS_LENGTH_SEMANTICS}} is BYTE by default, but users can set it at the 
session level to CHAR, thus reducing the max size of the index length. We have 
increased the size of COLUMN_NAME from 128 to 767 (used to be at 1000) and 
TABLE_NAME from 128 to 256. This was done by setting: 

{code} 
CREATE TABLE PART_COL_STATS ( 
CS_ID NUMBER NOT NULL, 
DB_NAME VARCHAR2(128) NOT NULL, 
TABLE_NAME VARCHAR2(256) NOT NULL, 
PARTITION_NAME VARCHAR2(767) NOT NULL, 
COLUMN_NAME VARCHAR2(767) NOT NULL,  

CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
(DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
{code} 

Reproducer: 

{code} 
SQL*Plus: Release 11.2.0.2.0 Production on Wed Feb 27 11:02:16 2019 Copyright 
(c) 1982, 2011, Oracle. All rights reserved. 
Connected to: Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit 
Production 

SQL> select * from v$nls_parameters where parameter = 'NLS_LENGTH_SEMANTICS'; 
PARAMETER 
 
VALUE 
 
NLS_LENGTH_SEMANTICS 
BYTE 

SQL> alter session set NLS_LENGTH_SEMANTICS=CHAR; Session altered. 

SQL> commit; Commit complete. 

SQL> select * from v$nls_parameters where parameter = 'NLS_LENGTH_SEMANTICS'; 
PARAMETER 
 
VALUE 
 
NLS_LENGTH_SEMANTICS 
CHAR 

SQL> CREATE TABLE PART_COL_STATS (CS_ID NUMBER NOT NULL, DB_NAME VARCHAR2(128) 
NOT NULL, TABLE_NAME VARCHAR2(256) NOT NULL, PARTITION_NAME VARCHAR2(767) NOT 
NULL, COLUMN_NAME VARCHAR2(767) NOT NULL); 
Table created. 

SQL> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
(DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 

CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
(DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME) 
* ERROR at line 1: ORA-01450: maximum key length (6398) exceeded 

SQL> alter session set NLS_LENGTH_SEMANTICS=BYTE; 
Session altered. 

SQL> commit; 
Commit complete. 

SQL> drop table PART_COL_STATS; 
Table dropped. 

SQL> commit; 
Commit complete. 

SQL> CREATE TABLE PART_COL_STATS (CS_ID NUMBER NOT NULL, DB_NAME VARCHAR2(128) 
NOT NULL, TABLE_NAME VARCHAR2(256) NOT NULL, PARTITION_NAME VARCHAR2(767) NOT 
NULL, COLUMN_NAME VARCHAR2(767) NOT NULL); 
Table created. 

SQL> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
(DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
Index created. 

SQL> commit; 
Commit complete. 

SQL> 
{code}

  was:
CREATE INDEX PCS_STATS_IDX ON PAR T_COL_STATS 
(DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME) 
Error: ORA-01450: maximum key length (6398) exceeded (state=72000,code=1450) 

Customer tried the same DDL in SQLDevloper, and got the same error. This could 
be a result of combination of DB level settings like the db_block_size, 
limiting the maximum key length, as per below doc: 
http://www.dba-oracle.com/t_ora_01450_maximum_key_length_exceeded.htm 

Also {{NLS_LENGTH_SEMANTICS}} is by default BYTE, but users can set this at the 
session level to CHAR, thus reducing the max size of the index length. We have 
increased the size of the COLUMN_NAME from 128 to 767 (used to be at 1000) and 
TABLE_NAME from 128 to 256. This by setting 

{code} 
CREATE TABLE PART_COL_STATS ( 
CS_ID NUMBER NOT NULL, 
DB_NAME VARCHAR2(128) NOT NULL, 
TABLE_NAME VARCHAR2(256) NOT NULL, 
PARTITION_NAME VARCHAR2(767) NOT NULL, 
COLUMN_NAME VARCHAR2(767) NOT NULL,  

CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
(DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
{code} 

Reproducer: 

{code} 
SQL*Plus: Release 11.2.0.2.0 Production on Wed Feb 27 11:02:16 2019 Copyright 
(c) 1982, 2011, Oracle. All rights reserved. 
Connected to: Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit 
Production 

SQL> select * from v$nls_parameters where parameter = 'NLS_LENGTH_SEMANTICS'; 
PARAMETER 
 
VALUE 
 
NLS_LENGTH_SEMANTICS 
BYTE 

SQL> alter session set NLS_LENGTH_SEMANTICS=CHAR; Session altered. 

SQL> commit; Commit complete. 

SQL> select * from v$nls_parameters where parameter = 'NLS_LENGTH_SEMANTICS'; 
PARAMETER 

[jira] [Commented] (HIVE-21312) FSStatsAggregator::connect is slow

2019-02-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779677#comment-16779677
 ] 

Hive QA commented on HIVE-21312:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
46s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
38s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
47s{color} | {color:red} ql: The patch generated 5 new + 8 unchanged - 6 fixed 
= 13 total (was 14) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 28m 52s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16274/dev-support/hive-personality.sh
 |
| git revision | master / 474a19d |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16274/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16274/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> FSStatsAggregator::connect is slow
> --
>
> Key: HIVE-21312
> URL: https://issues.apache.org/jira/browse/HIVE-21312
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Trivial
> Attachments: HIVE-21312.1.patch, HIVE-21312.2.patch, 
> HIVE-21312.3.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-27 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-21292:
--
Status: Open  (was: Patch Available)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch, 
> HIVE-21292.06.patch, HIVE-21292.07.patch, HIVE-21292.08.patch, 
> HIVE-21292.09.patch, HIVE-21292.10.patch, HIVE-21292.11.patch, 
> HIVE-21292.12.patch, HIVE-21292.13.patch, HIVE-21292.14.patch, 
> HIVE-21292.15.patch, HIVE-21292.15.patch, HIVE-21292.16.patch, 
> HIVE-21292.17.patch
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these in order to have everything cut into more 
> manageable classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim, while there are two DDLTask and DDLWork classes in the code 
> base, the new ones in the new package are called DDLTask2 and DDLWork2, thus 
> avoiding the usage of fully qualified class names where both the old and the 
> new classes are in use.
> Step #1: extract all the database related operations from the old DDLTask 
> and move them under the new package. Also create the new internal framework.
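
A self-contained sketch of the internal framework described above (one immutable desc per operation, one operation class per desc, and a dispatcher that stays agnostic to the concrete operations). Apart from DDLTask2 and DDLDesc, which the description names, every identifier here is an illustrative assumption rather than the shape of the actual patch.

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

interface DDLDesc { }                      // immutable request object

interface DDLOperation {                   // one implementation per DDL statement
  int execute() throws Exception;
}

final class CreateDatabaseDesc implements DDLDesc {
  final String name;
  CreateDatabaseDesc(String name) { this.name = name; }
}

final class CreateDatabaseOperation implements DDLOperation {
  private final CreateDatabaseDesc desc;
  CreateDatabaseOperation(CreateDatabaseDesc desc) { this.desc = desc; }
  @Override public int execute() {
    System.out.println("CREATE DATABASE " + desc.name);  // stand-in for the metastore call
    return 0;
  }
}

// The task itself only dispatches; it knows nothing about concrete operations.
final class DDLTask2 {
  private final Map<Class<? extends DDLDesc>, Function<DDLDesc, DDLOperation>> registry =
      new HashMap<>();

  <T extends DDLDesc> void register(Class<T> descClass, Function<DDLDesc, DDLOperation> factory) {
    registry.put(descClass, factory);
  }

  int execute(DDLDesc desc) throws Exception {
    return registry.get(desc.getClass()).apply(desc).execute();
  }
}
{code}

Registration would look like {{task.register(CreateDatabaseDesc.class, d -> new CreateDatabaseOperation((CreateDatabaseDesc) d))}}, keeping each operation group in its own package without DDLTask2 ever referring to it directly.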



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

