[jira] [Updated] (HIVE-17483) HS2 kill command to kill queries using query id

2017-09-17 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-17483:
--
Attachment: HIVE-17483.2.patch

> HS2 kill command to kill queries using query id
> ---
>
> Key: HIVE-17483
> URL: https://issues.apache.org/jira/browse/HIVE-17483
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Teddy Choi
> Attachments: HIVE-17483.1.patch, HIVE-17483.2.patch, 
> HIVE-17483.2.patch
>
>
> For administrators, it is important to be able to kill queries if required. 
> Currently, there is no clean way to do it.
> It would help to have a "kill query <queryid>" command that can be run using 
> odbc/jdbc against a HiveServer2 instance, to kill the query with that queryid 
> running in that instance.
> Authorization will have to be done to ensure that the user that is invoking 
> the API is allowed to perform this action.
> In case of SQL std authorization, this would require admin role.
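
A hedged sketch of how an administrator might invoke such a command over JDBC once it exists. The connection string, credentials, and query id below are illustrative, and the statement syntax is only the one proposed in this issue:

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class KillQuerySketch {
  public static void main(String[] args) throws Exception {
    // Illustrative host, user, and query id; substitute real values.
    try (Connection conn = DriverManager.getConnection(
             "jdbc:hive2://hs2-host:10000/default", "admin", "");
         Statement stmt = conn.createStatement()) {
      // Kill the query with the given query id running on this HS2 instance
      // (syntax as proposed in this issue).
      stmt.execute("kill query 'hive_20170918051919_a1b2c3d4'");
    }
  }
}
{code}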





[jira] [Commented] (HIVE-17426) Execution framework in hive to run tasks in parallel

2017-09-17 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169619#comment-16169619
 ] 

Lefty Leverenz commented on HIVE-17426:
---

Thanks for changing the description of *hive.repl.approx.max.load.tasks*, 
[~anishek].

> Execution framework in hive to run tasks in parallel
> 
>
> Key: HIVE-17426
> URL: https://issues.apache.org/jira/browse/HIVE-17426
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
> Attachments: HIVE-17426.0.patch, HIVE-17426.1.patch, 
> HIVE-17426.2.patch, HIVE-17426.3.patch, HIVE-17426.4.patch, HIVE-17426.5.patch
>
>
> The execution framework currently only runs MR / Spark tasks in parallel 
> when {{set hive.exec.parallel=true}}.
> Allow other types of tasks to run in parallel as well, to support replication 
> scenarios in Hive.
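
For context, the existing switch is a plain configuration flag; a minimal sketch (standard {{HiveConf}} usage) of enabling it programmatically:

{code:java}
import org.apache.hadoop.hive.conf.HiveConf;

public class ParallelExecSketch {
  public static void main(String[] args) {
    HiveConf conf = new HiveConf();
    // Today this only parallelizes MR/Spark tasks; the proposal here extends
    // the framework so other task types (e.g. replication tasks) can run in
    // parallel as well.
    conf.set("hive.exec.parallel", "true");
    conf.set("hive.exec.parallel.thread.number", "8");
    System.out.println(conf.get("hive.exec.parallel"));
  }
}
{code}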





[jira] [Commented] (HIVE-17521) Improve defaults for few runtime configs

2017-09-17 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169599#comment-16169599
 ] 

Lefty Leverenz commented on HIVE-17521:
---

Doc note:  This changes default values for 
*hive.auto.convert.join.hashtable.max.entries*, 
*hive.hashtable.key.count.adjustment*, and *hive.tez.bloom.filter.factor* so 
the wiki needs to be updated (with version information).

* [hive.auto.convert.join.hashtable.max.entries | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.auto.convert.join.hashtable.max.entries]
 -- not documented yet, created by HIVE-12492 (this link will work after it is 
documented)
* [hive.hashtable.key.count.adjustment | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.hashtable.key.count.adjustment]
* [hive.tez.bloom.filter.factor | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.tez.bloom.filter.factor]

Added a TODOC3.0 label.

> Improve defaults for few runtime configs
> 
>
> Key: HIVE-17521
> URL: https://issues.apache.org/jira/browse/HIVE-17521
> Project: Hive
>  Issue Type: Task
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Minor
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17521.2.patch, HIVE-17521.patch
>
>






[jira] [Comment Edited] (HIVE-12492) MapJoin: 4 million unique integers seems to be a probe plateau

2017-09-17 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15887496#comment-15887496
 ] 

Lefty Leverenz edited comment on HIVE-12492 at 9/18/17 5:19 AM:


Doc note:  This adds *hive.auto.convert.join.hashtable.max.entries* to 
HiveConf.java, so it needs to be documented in the wiki.

* [Configuration Properties -- Query and DDL Execution | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution]

Added a TODOC2.2 label.

Edit (7/Mar/17):  HIVE-16137 changes the default value to 40,000,000 so that's 
what should be documented in the wiki.

Typo alert:  In the parameter description, "does not take affect" should be 
"does not take effect."  This can be corrected in the wiki.

Edit (18/Sep/17):  HIVE-17276 fixes the typo in the parameter description and 
HIVE-17521 changes the default value to 21,000,000 (both in release 3.0.0).


was (Author: le...@hortonworks.com):
Doc note:  This adds *hive.auto.convert.join.hashtable.max.entries* to 
HiveConf.java, so it needs to be documented in the wiki.

* [Configuration Properties -- Query and DDL Execution | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution]

Added a TODOC2.2 label.

Edit (7/Mar/17):  HIVE-16137 changes the default value to 40,000,000 so that's 
what should be documented in the wiki.

Typo alert:  In the parameter description, "does not take affect" should be 
"does not take effect."  This can be corrected in the wiki.

> MapJoin: 4 million unique integers seems to be a probe plateau
> --
>
> Key: HIVE-12492
> URL: https://issues.apache.org/jira/browse/HIVE-12492
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-12492.01.patch, HIVE-12492.02.patch, 
> HIVE-12492.patch
>
>
> After 4 million keys, the map-join implementation seems to suffer from a 
> performance degradation. 
> The hashtable build & probe time makes this very inefficient, even if the 
> data is very compact (i.e. 2 ints).
> Falling back onto the shuffle join or bucket map-join is useful after 2^22 
> (= 4,194,304) items.
> (Note: this fixes a statsutil issue - due to the extra clone() in the column 
> stats path)
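
A minimal sketch (standard {{HiveConf}} usage) of capping automatic map-join conversion at the observed plateau, using the property this issue introduces:

{code:java}
import org.apache.hadoop.hive.conf.HiveConf;

public class MapJoinCapSketch {
  public static void main(String[] args) {
    HiveConf conf = new HiveConf();
    // Stop auto-converting to map-join past the ~4M unique-key probe plateau (2^22).
    conf.set("hive.auto.convert.join.hashtable.max.entries", "4194304");
  }
}
{code}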





[jira] [Updated] (HIVE-17521) Improve defaults for few runtime configs

2017-09-17 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-17521:
--
Labels: TODOC3.0  (was: )

> Improve defaults for few runtime configs
> 
>
> Key: HIVE-17521
> URL: https://issues.apache.org/jira/browse/HIVE-17521
> Project: Hive
>  Issue Type: Task
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Minor
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17521.2.patch, HIVE-17521.patch
>
>






[jira] [Commented] (HIVE-13989) Extended ACLs are not handled according to specification

2017-09-17 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169583#comment-16169583
 ] 

Lefty Leverenz commented on HIVE-13989:
---

[~vgumashta], this jira is marked as fixed in 2.3.0 and 2.2.1 but you committed 
it to branch-2 and branch-2.2 so I think it should say fixed in 2.4.0 and 2.2.1.

> Extended ACLs are not handled according to specification
> 
>
> Key: HIVE-13989
> URL: https://issues.apache.org/jira/browse/HIVE-13989
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Chris Drome
>Assignee: Chris Drome
> Fix For: 2.3.0, 2.2.1
>
> Attachments: HIVE-13989.1-branch-1.patch, HIVE-13989.1.patch, 
> HIVE-13989.4-branch-2.2.patch, HIVE-13989.4-branch-2.patch, 
> HIVE-13989-branch-1.patch, HIVE-13989-branch-2.2.patch, 
> HIVE-13989-branch-2.2.patch, HIVE-13989-branch-2.2.patch
>
>
> Hive takes two approaches to working with extended ACLs depending on whether 
> data is being produced via a Hive query or HCatalog APIs. A Hive query will 
> run an FsShell command to recursively set the extended ACLs for a directory 
> sub-tree. HCatalog APIs will attempt to build up the directory sub-tree 
> programmatically and runs some code to set the ACLs to match the parent 
> directory.
> Some incorrect assumptions were made when implementing the extended ACLs 
> support. Refer to https://issues.apache.org/jira/browse/HDFS-4685 for the 
> design documents of extended ACLs in HDFS. These documents model the 
> implementation after the POSIX implementation on Linux, which can be found at 
> http://www.vanemery.com/Linux/ACL/POSIX_ACL_on_Linux.html.
> The code for setting extended ACLs via HCatalog APIs is found in 
> HdfsUtils.java:
> {code}
> if (aclEnabled) {
>   aclStatus = sourceStatus.getAclStatus();
>   if (aclStatus != null) {
>     LOG.trace(aclStatus.toString());
>     aclEntries = aclStatus.getEntries();
>     removeBaseAclEntries(aclEntries);
>     // the ACL APIs also expect the traditional user/group/other permissions in the form of ACL entries
>     aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.USER, sourcePerm.getUserAction()));
>     aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.GROUP, sourcePerm.getGroupAction()));
>     aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.OTHER, sourcePerm.getOtherAction()));
>   }
> }
> {code}
> We found that DEFAULT extended ACL rules were not being inherited properly by 
> the directory sub-tree, so the above code is incomplete because it 
> effectively drops the DEFAULT rules. The second problem is with the call to 
> {{sourcePerm.getGroupAction()}}, which is incorrect in the case of extended 
> ACLs. When extended ACLs are used the GROUP permission is replaced with the 
> extended ACL mask. So the above code will apply the wrong permissions to the 
> GROUP. Instead the correct GROUP permissions now need to be pulled from the 
> AclEntry as returned by {{getAclStatus().getEntries()}}. See the 
> implementation of the new method {{getDefaultAclEntries}} for details.
> Similar issues exist with the HCatalog API. None of the APIs account for 
> setting extended ACLs on the directory sub-tree. The changes to the HCatalog 
> API allow the extended ACLs to be passed into the required methods, similar to 
> how basic permissions are passed in. When building the directory sub-tree, the 
> extended ACLs of the table directory are inherited by all sub-directories, 
> including the DEFAULT rules.
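
A hedged sketch of the two corrections described above (the helper names here are illustrative; the actual patch's {{getDefaultAclEntries}} differs in detail):

{code:java}
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.fs.permission.AclEntry;
import org.apache.hadoop.fs.permission.AclEntryScope;
import org.apache.hadoop.fs.permission.AclEntryType;
import org.apache.hadoop.fs.permission.AclStatus;
import org.apache.hadoop.fs.permission.FsAction;

public class ExtendedAclSketch {
  // Keep the parent's DEFAULT entries so the directory sub-tree inherits them.
  static List<AclEntry> defaultEntries(AclStatus parent) {
    List<AclEntry> defaults = new ArrayList<>();
    for (AclEntry e : parent.getEntries()) {
      if (e.getScope() == AclEntryScope.DEFAULT) {
        defaults.add(e);
      }
    }
    return defaults;
  }

  // With an extended ACL, FsPermission's group bits hold the mask, so the real
  // GROUP permission must be read from the unnamed ACCESS group entry instead.
  static FsAction realGroupAction(AclStatus parent, FsAction fallback) {
    for (AclEntry e : parent.getEntries()) {
      if (e.getScope() == AclEntryScope.ACCESS
          && e.getType() == AclEntryType.GROUP
          && e.getName() == null) {
        return e.getPermission();
      }
    }
    return fallback; // no extended ACL present
  }
}
{code}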
> Replicating the problem:
> Create a table to write data into (I will use acl_test as the destination and 
> words_text as the source) and set the ACLs as follows:
> {noformat}
> $ hdfs dfs -setfacl -m default:user::rwx,default:group::r-x,default:mask::rwx,default:user:hdfs:rwx,group::r-x,user:hdfs:rwx /user/cdrome/hive/acl_test
> $ hdfs dfs -ls -d /user/cdrome/hive/acl_test
> drwxrwx---+  - cdrome hdfs  0 2016-07-13 20:36 /user/cdrome/hive/acl_test
> $ hdfs dfs -getfacl -R /user/cdrome/hive/acl_test
> # file: /user/cdrome/hive/acl_test
> # owner: cdrome
> # group: hdfs
> user::rwx
> user:hdfs:rwx
> group::r-x
> mask::rwx
> other::---
> default:user::rwx
> default:user:hdfs:rwx
> default:group::r-x
> default:mask::rwx
> default:other::---
> {noformat}
> Note that the basic GROUP permission is set to {{rwx}} after setting the 
> ACLs. The ACLs explicitly set the DEFAULT rules and a rule specifically for 
> the {{hdfs}} user.
> Run the following query to populate the table:
> {noformat}
> insert into acl_test partition (dt='a', ds='b') select a, b from words_text 
> where dt = 'c';
> {noformat}
> Note that words_text only has a single partition key.
> Now 

[jira] [Updated] (HIVE-17139) Conditional expressions optimization: skip the expression evaluation if the condition is not satisfied for vectorization engine.

2017-09-17 Thread Ke Jia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ke Jia updated HIVE-17139:
--
Attachment: HIVE-17139.10.patch

> Conditional expressions optimization: skip the expression evaluation if the 
> condition is not satisfied for vectorization engine.
> 
>
> Key: HIVE-17139
> URL: https://issues.apache.org/jira/browse/HIVE-17139
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ke Jia
>Assignee: Ke Jia
> Attachments: HIVE-17139.10.patch, HIVE-17139.1.patch, 
> HIVE-17139.2.patch, HIVE-17139.3.patch, HIVE-17139.4.patch, 
> HIVE-17139.5.patch, HIVE-17139.6.patch, HIVE-17139.7.patch, 
> HIVE-17139.8.patch, HIVE-17139.9.patch
>
>
> The CASE WHEN and IF statement execution for Hive vectorization is not 
> optimal: in the current implementation, all of the conditional and else 
> expressions are evaluated. The optimized approach is to update the selected 
> array of the batch parameter after the conditional expression is executed, so 
> that the else expression only processes the selected rows instead of all of them.
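
A minimal sketch of the selected-array idea (illustrative, not the patch itself), using the vectorization batch fields {{selected}}, {{selectedInUse}}, and {{size}}:

{code:java}
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;

public class SelectedArraySketch {
  // Collect into 'out' the rows where the boolean condition column is false, so
  // the else expression can be evaluated over those rows only; the caller swaps
  // 'out' into batch.selected, runs the else expression, then restores the batch.
  static int selectWhereFalse(VectorizedRowBatch batch, LongColumnVector cond, int[] out) {
    int n = 0;
    if (batch.selectedInUse) {
      for (int j = 0; j < batch.size; j++) {
        int row = batch.selected[j];
        if (cond.vector[row] == 0) {
          out[n++] = row;
        }
      }
    } else {
      for (int row = 0; row < batch.size; row++) {
        if (cond.vector[row] == 0) {
          out[n++] = row;
        }
      }
    }
    return n;
  }
}
{code}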





[jira] [Commented] (HIVE-17375) stddev_samp,var_samp standard compliance

2017-09-17 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169580#comment-16169580
 ] 

Lefty Leverenz commented on HIVE-17375:
---

Thanks [~kgyrtkirk], I agree that this doesn't need to be documented.

> stddev_samp,var_samp standard compliance
> 
>
> Key: HIVE-17375
> URL: https://issues.apache.org/jira/browse/HIVE-17375
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-17375.1.patch, HIVE-17375.2.patch
>
>
> these two UDAFs return 0 in the case of only one element - however, the 
> standard requires NULL to be returned





[jira] [Updated] (HIVE-17496) Bootstrap repl is not cleaning up staging dirs

2017-09-17 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17496:
--
Status: Open  (was: Patch Available)

> Bootstrap repl is not cleaning up staging dirs
> --
>
> Key: HIVE-17496
> URL: https://issues.apache.org/jira/browse/HIVE-17496
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17496.1.patch, HIVE-17496.2.patch, 
> HIVE-17496.3.patch, HIVE-17496.4.patch, HIVE-17496.5.patch
>
>
> This will put more pressure on the HDFS file limit.





[jira] [Assigned] (HIVE-17548) ThriftCliService reports inaccurate the number of current sessions in the log message

2017-09-17 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang reassigned HIVE-17548:
--

Assignee: Xuefu Zhang

> ThriftCliService reports inaccurate the number of current sessions in the log 
> message
> -
>
> Key: HIVE-17548
> URL: https://issues.apache.org/jira/browse/HIVE-17548
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>
> Currently ThriftCliService uses an atomic integer to keep track of the number 
> of currently open sessions. It reports it through the following two log 
> messages:
> {code}
> 2017-09-18 04:14:31,722 INFO [HiveServer2-Handler-Pool: Thread-729979]: org.apache.hive.service.cli.thrift.ThriftCLIService: Opened a session: SessionHandle [99ec30d7-5c44-4a45-a8d6-0f0e7ecf4879], current sessions: 345
> 2017-09-18 04:14:41,926 INFO [HiveServer2-Handler-Pool: Thread-717542]: org.apache.hive.service.cli.thrift.ThriftCLIService: Closed session: SessionHandle [f38f7890-cba4-459c-872e-4c261b897e00], current sessions: 344
> {code}
> This assumes that all sessions are closed or opened through the Thrift API. 
> This assumption isn't correct because sessions may be closed by the server, 
> such as in the case of a timeout. Therefore, such log messages tend to 
> over-report the number of open sessions.
> In order to accurately report the number of outstanding sessions, the session 
> manager should be consulted instead.
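
A minimal sketch of the proposed direction; the wiring below is assumed for illustration, and the session manager's count accessor is the one assumed here:

{code:java}
import org.apache.hive.service.cli.SessionHandle;
import org.apache.hive.service.cli.session.SessionManager;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class SessionCountLogSketch {
  private static final Logger LOG = LoggerFactory.getLogger(SessionCountLogSketch.class);

  // Ask the session manager, which also observes server-side closes such as
  // timeouts, rather than a Thrift-handler-local atomic counter.
  static void logOpened(SessionManager sessionManager, SessionHandle handle) {
    LOG.info("Opened a session: {}, current sessions: {}",
        handle, sessionManager.getOpenSessionCount());
  }
}
{code}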





[jira] [Updated] (HIVE-17496) Bootstrap repl is not cleaning up staging dirs

2017-09-17 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17496:
--
Attachment: (was: HIVE-17496.5.patch)

> Bootstrap repl is not cleaning up staging dirs
> --
>
> Key: HIVE-17496
> URL: https://issues.apache.org/jira/browse/HIVE-17496
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17496.1.patch, HIVE-17496.2.patch, 
> HIVE-17496.3.patch, HIVE-17496.4.patch, HIVE-17496.5.patch
>
>
> This will put more pressure on the HDFS file limit.





[jira] [Updated] (HIVE-17496) Bootstrap repl is not cleaning up staging dirs

2017-09-17 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17496:
--
Attachment: HIVE-17496.5.patch

> Bootstrap repl is not cleaning up staging dirs
> --
>
> Key: HIVE-17496
> URL: https://issues.apache.org/jira/browse/HIVE-17496
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17496.1.patch, HIVE-17496.2.patch, 
> HIVE-17496.3.patch, HIVE-17496.4.patch, HIVE-17496.5.patch
>
>
> This will put more pressure on the HDFS file limit.





[jira] [Updated] (HIVE-17496) Bootstrap repl is not cleaning up staging dirs

2017-09-17 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17496:
--
Status: Patch Available  (was: Open)

> Bootstrap repl is not cleaning up staging dirs
> --
>
> Key: HIVE-17496
> URL: https://issues.apache.org/jira/browse/HIVE-17496
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17496.1.patch, HIVE-17496.2.patch, 
> HIVE-17496.3.patch, HIVE-17496.4.patch, HIVE-17496.5.patch
>
>
> This will put more pressure on the HDFS file limit.





[jira] [Commented] (HIVE-17542) Make HoS CombineEquivalentWorkResolver Configurable

2017-09-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169564#comment-16169564
 ] 

Hive QA commented on HIVE-17542:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887577/HIVE-17542.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 11042 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udaf_context_ngrams] (batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_mask_hash] (batchId=28)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=100)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2] (batchId=89)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=234)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 (batchId=215)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6858/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6858/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6858/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12887577 - PreCommit-HIVE-Build

> Make HoS CombineEquivalentWorkResolver Configurable
> ---
>
> Key: HIVE-17542
> URL: https://issues.apache.org/jira/browse/HIVE-17542
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer, Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17542.1.patch, HIVE-17542.2.patch
>
>
> The {{CombineEquivalentWorkResolver}} is run by default. We should make it 
> configurable so that users can disable it in case there are any issues. We 
> can enable it by default to preserve backwards compatibility.
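
A hedged sketch of what the guard could look like; the property key below is hypothetical, for illustration only, and the patch defines the real one:

{code:java}
import org.apache.hadoop.conf.Configuration;

public class ResolverGuardSketch {
  // Hypothetical key, for illustration only; not the actual config name.
  static final String COMBINE_KEY = "hive.spark.combine.equivalent.work";

  // Default true keeps the resolver enabled, preserving current behavior.
  static boolean combineEquivalentWorkEnabled(Configuration conf) {
    return conf.getBoolean(COMBINE_KEY, true);
  }
}
{code}

The Spark physical optimizer would then add {{CombineEquivalentWorkResolver}} to its resolver chain only when this returns true.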





[jira] [Updated] (HIVE-17496) Bootstrap repl is not cleaning up staging dirs

2017-09-17 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17496:
--
Status: Patch Available  (was: Open)

> Bootstrap repl is not cleaning up staging dirs
> --
>
> Key: HIVE-17496
> URL: https://issues.apache.org/jira/browse/HIVE-17496
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17496.1.patch, HIVE-17496.2.patch, 
> HIVE-17496.3.patch, HIVE-17496.4.patch, HIVE-17496.5.patch
>
>
> This will put more pressure on the HDFS file limit.





[jira] [Updated] (HIVE-17496) Bootstrap repl is not cleaning up staging dirs

2017-09-17 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17496:
--
Status: Open  (was: Patch Available)

> Bootstrap repl is not cleaning up staging dirs
> --
>
> Key: HIVE-17496
> URL: https://issues.apache.org/jira/browse/HIVE-17496
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17496.1.patch, HIVE-17496.2.patch, 
> HIVE-17496.3.patch, HIVE-17496.4.patch, HIVE-17496.5.patch
>
>
> This will put more pressure on the HDFS file limit.





[jira] [Updated] (HIVE-17496) Bootstrap repl is not cleaning up staging dirs

2017-09-17 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17496:
--
Attachment: HIVE-17496.5.patch

> Bootstrap repl is not cleaning up staging dirs
> --
>
> Key: HIVE-17496
> URL: https://issues.apache.org/jira/browse/HIVE-17496
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17496.1.patch, HIVE-17496.2.patch, 
> HIVE-17496.3.patch, HIVE-17496.4.patch, HIVE-17496.5.patch
>
>
> This will put more pressure on the HDFS file limit.





[jira] [Updated] (HIVE-17483) HS2 kill command to kill queries using query id

2017-09-17 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-17483:
--
Attachment: HIVE-17483.2.patch

Removed unnecessary files

> HS2 kill command to kill queries using query id
> ---
>
> Key: HIVE-17483
> URL: https://issues.apache.org/jira/browse/HIVE-17483
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Teddy Choi
> Attachments: HIVE-17483.1.patch, HIVE-17483.2.patch
>
>
> For administrators, it is important to be able to kill queries if required. 
> Currently, there is no clean way to do it.
> It would help to have a "kill query <queryid>" command that can be run using 
> odbc/jdbc against a HiveServer2 instance, to kill the query with that queryid 
> running in that instance.
> Authorization will have to be done to ensure that the user that is invoking 
> the API is allowed to perform this action.
> In case of SQL std authorization, this would require admin role.





[jira] [Commented] (HIVE-15212) merge branch into master

2017-09-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169531#comment-16169531
 ] 

Hive QA commented on HIVE-15212:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887575/HIVE-15212.14.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6857/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6857/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6857/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-09-18 02:53:21.912
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-6857/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-09-18 02:53:21.915
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at a51ae9c HIVE-17203: Add InterfaceAudience and InterfaceStability annotations for HCat APIs (Sahil Takiar, reviewed by Aihua Xu)
+ git clean -f -d
Removing standalone-metastore/src/gen/org/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at a51ae9c HIVE-17203: Add InterfaceAudience and InterfaceStability annotations for HCat APIs (Sahil Takiar, reviewed by Aihua Xu)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-09-18 02:53:27.049
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
error: metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreThread.java: No such file or directory
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12887575 - PreCommit-HIVE-Build

> merge branch into master
> 
>
> Key: HIVE-15212
> URL: https://issues.apache.org/jira/browse/HIVE-15212
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15212.01.patch, HIVE-15212.02.patch, 
> HIVE-15212.03.patch, HIVE-15212.04.patch, HIVE-15212.05.patch, 
> HIVE-15212.06.patch, HIVE-15212.07.patch, HIVE-15212.08.patch, 
> HIVE-15212.09.patch, HIVE-15212.10.patch, HIVE-15212.11.patch, 
> HIVE-15212.12.patch, HIVE-15212.12.patch, HIVE-15212.13.patch, 
> HIVE-15212.13.patch, HIVE-15212.14.patch
>
>






[jira] [Commented] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable

2017-09-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169530#comment-16169530
 ] 

Hive QA commented on HIVE-17545:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887576/HIVE-17545.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 11041 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_mask_hash] (batchId=28)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=156)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2] (batchId=89)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=234)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 (batchId=215)
org.apache.hive.jdbc.TestJdbcDriver2.testSelectExecAsync2 (batchId=225)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6856/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6856/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6856/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12887576 - PreCommit-HIVE-Build

> Make HoS RDD Cacheing Optimization Configurable
> ---
>
> Key: HIVE-17545
> URL: https://issues.apache.org/jira/browse/HIVE-17545
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer, Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17545.1.patch, HIVE-17545.2.patch
>
>
> The RDD caching optimization added in HIVE-10550 is enabled by default. We 
> should make it configurable in case users want to disable it. We can leave it 
> on by default to preserve backwards compatibility.





[jira] [Commented] (HIVE-17344) LocalCache element memory usage is not calculated properly.

2017-09-17 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169524#comment-16169524
 ] 

Lefty Leverenz commented on HIVE-17344:
---

[~kgyrtkirk], please add fix version 3.0.0 to this issue.  Thanks.

> LocalCache element memory usage is not calculated properly.
> ---
>
> Key: HIVE-17344
> URL: https://issues.apache.org/jira/browse/HIVE-17344
> Project: Hive
>  Issue Type: Bug
>Reporter: Janos Gub
>Assignee: Janos Gub
> Attachments: HIVE-17344.2.patch, HIVE-17344.patch
>
>
> The ORC footer cache has a calculation of memory usage:
> {code:java}
> public int getMemoryUsage() {
>   return bb.remaining() + 100; // 100 is for 2 longs, BB and java overheads (semi-arbitrary).
> }
> {code}
> ByteBuffer.remaining() returns the remaining space in the byte buffer, thus 
> allowing this cache to hold MAXWEIGHT/100 elements of arbitrary size. I think 
> the correct solution would be bb.capacity().
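
A minimal sketch of the suggested fix, same shape as the snippet above, just charging the full buffer:

{code:java}
import java.nio.ByteBuffer;

public class FooterWeightSketch {
  private final ByteBuffer bb;

  FooterWeightSketch(ByteBuffer bb) {
    this.bb = bb;
  }

  // capacity() charges the whole backing buffer, so a fully-read buffer still
  // weighs its true size against MAXWEIGHT instead of the flat ~100 bytes.
  public int getMemoryUsage() {
    return bb.capacity() + 100; // 100 for 2 longs, BB and java overheads (semi-arbitrary)
  }
}
{code}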





[jira] [Commented] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169496#comment-16169496
 ] 

Hive QA commented on HIVE-17465:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887574/HIVE-17465.7.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 11041 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_mask_hash] (batchId=28)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1] (batchId=170)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2] (batchId=89)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=234)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 (batchId=215)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpRetryOnServerIdleTimeout (batchId=227)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6855/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6855/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6855/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12887574 - PreCommit-HIVE-Build

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, 
> HIVE-17465.6.patch, HIVE-17465.7.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
>     TableScan
>       alias: date_dim
>       filterExpr: (d_year = 2001) (type: boolean)
>       Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>       Filter Operator
>         predicate: (d_year = 2001) (type: boolean)
>         Statistics: Num rows: 363 Data size: 4356 Basic stats: COMPLETE Column stats: COMPLETE
>
> Map 1
>     Map Operator Tree:
>         TableScan
>           alias: date_dim
>           filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: boolean)
>           Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>           Filter Operator
>             predicate: ((d_year = 2001) and (d_moy = 9)) (type: boolean)
>             Statistics: Num rows: 363 Data size: 5808 Basic stats: COMPLETE Column stats: COMPLETE
>
> Map 1
>     Map Operator Tree:
>         TableScan
>           alias: date_dim
>           filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 21)) (type: boolean)
>           Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>           Filter Operator
>             predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 21)) (type: boolean)
>             Statistics: Num rows: 363 Data size: 7260 Basic stats: COMPLETE Column stats: COMPLETE
> {code}
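
For a sense of what progressive reduction should produce here, a back-of-the-envelope check; the NDV figures below are assumptions for illustration (roughly 201 years, 12 months, 31 days in TPC-DS date_dim), assuming independent columns:

{code:java}
public class SelectivitySketch {
  public static void main(String[] args) {
    double rows = 73049;
    double afterYear = rows / 201.0;      // d_year = 2001      -> ~363 (matches the plans)
    double afterMonth = afterYear / 12.0; // ... and d_moy = 9  -> ~30 expected
    double afterDay = afterMonth / 31.0;  // ... and d_dom = 21 -> ~1 expected
    System.out.printf("%.0f -> %.0f -> %.0f -> %.0f%n", rows, afterYear, afterMonth, afterDay);
    // The plans above instead keep 363 rows for all three filters.
  }
}
{code}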





[jira] [Commented] (HIVE-16886) HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently

2017-09-17 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169484#comment-16169484
 ] 

Lefty Leverenz commented on HIVE-16886:
---

Doc note:  This adds *hive.notification.sequence.lock.max.retries* and 
*hive.notification.sequence.lock.retry.sleep.interval* to HiveConf.java, so 
they need to be documented in the wiki.

Apparently they belong in the metastore section.

* [Configuration Properties -- MetaStore | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-MetaStore]

Added a TODOC3.0 label.

> HMS log notifications may have duplicated event IDs if multiple HMS are 
> running concurrently
> 
>
> Key: HIVE-16886
> URL: https://issues.apache.org/jira/browse/HIVE-16886
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Metastore
>Reporter: Sergio Peña
>Assignee: anishek
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: datastore-identity-holes.diff, HIVE-16886.1.patch, 
> HIVE-16886.2.patch, HIVE-16886.3.patch, HIVE-16886.4.patch, 
> HIVE-16886.5.patch, HIVE-16886.6.patch, HIVE-16886.7.patch, HIVE-16886.8.patch
>
>
> When running multiple Hive Metastore servers and DB notifications are 
> enabled, I could see that notifications can be persisted with a duplicated 
> event ID. 
> This does not happen when running multiple threads in a single HMS node due 
> to the locking acquired on the DbNotificationsLog class, but multiple HMS 
> could cause conflicts.
> The issue is in the ObjectStore#addNotificationEvent() method. The event ID 
> fetched from the datastore is used for the new notification, incremented in 
> the server itself, then persisted or updated back to the datastore. If 2 
> servers read the same ID, then these 2 servers write a new notification with 
> the same ID.
> The event ID is neither unique nor a primary key.
> Here's a test case using the TestObjectStore class that confirms this issue:
> {noformat}
> @Test
> public void testConcurrentAddNotifications() throws ExecutionException, InterruptedException {
>   final int NUM_THREADS = 2;
>   CountDownLatch countIn = new CountDownLatch(NUM_THREADS);
>   CountDownLatch countOut = new CountDownLatch(1);
>   HiveConf conf = new HiveConf();
>   conf.setVar(HiveConf.ConfVars.METASTORE_EXPRESSION_PROXY_CLASS, MockPartitionExpressionProxy.class.getName());
>   ExecutorService executorService = Executors.newFixedThreadPool(NUM_THREADS);
>   FutureTask tasks[] = new FutureTask[NUM_THREADS];
>   for (int i = 0; i < NUM_THREADS; ++i) {
>     final int n = i;
>     tasks[i] = new FutureTask(new Callable() {
>       @Override
>       public Void call() throws Exception {
>         ObjectStore store = new ObjectStore();
>         store.setConf(conf);
>         NotificationEvent dbEvent =
>             new NotificationEvent(0, 0, EventMessage.EventType.CREATE_DATABASE.toString(), "CREATE DATABASE DB" + n);
>         System.out.println("ADDING NOTIFICATION");
>         countIn.countDown();
>         countOut.await();
>         store.addNotificationEvent(dbEvent);
>         System.out.println("FINISH NOTIFICATION");
>         return null;
>       }
>     });
>     executorService.execute(tasks[i]);
>   }
>   countIn.await();
>   countOut.countDown();
>   for (int i = 0; i < NUM_THREADS; ++i) {
>     tasks[i].get();
>   }
>   NotificationEventResponse eventResponse = objectStore.getNextNotification(new NotificationEventRequest());
>   Assert.assertEquals(2, eventResponse.getEventsSize());
>   Assert.assertEquals(1, eventResponse.getEvents().get(0).getEventId());
>   // This fails because the next notification has an event ID = 1
>   Assert.assertEquals(2, eventResponse.getEvents().get(1).getEventId());
> }
> {noformat}
> The last assertion fails expecting an event ID 1 instead of 2. 
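
A hedged sketch of a retry-based approach, tuned by the two parameters named in the doc note above; the sequence storage and locking below are simplified stand-ins, not the actual ObjectStore code:

{code:java}
import java.util.concurrent.atomic.AtomicLong;

public class SequenceRetrySketch {
  private final AtomicLong sequence = new AtomicLong(); // stand-in for the DB sequence row

  // Hypothetical stand-in: the real code would lock the notification sequence
  // row (e.g. SELECT ... FOR UPDATE), then read, increment, and persist the
  // next event ID in one transaction so two HMS instances cannot read the same value.
  private long lockAndIncrement() {
    return sequence.incrementAndGet();
  }

  // Retry allocation under contention, tuned by
  // hive.notification.sequence.lock.max.retries and
  // hive.notification.sequence.lock.retry.sleep.interval.
  long allocateEventId(int maxRetries, long sleepIntervalMs) throws InterruptedException {
    for (int attempt = 1; attempt <= maxRetries; attempt++) {
      try {
        return lockAndIncrement();
      } catch (RuntimeException lockContention) {
        Thread.sleep(sleepIntervalMs); // back off and retry on lock contention
      }
    }
    throw new IllegalStateException("Could not acquire the notification sequence lock");
  }
}
{code}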





[jira] [Updated] (HIVE-17203) Add InterfaceAudience and InterfaceStability annotations for HCat APIs

2017-09-17 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17203:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master.

> Add InterfaceAudience and InterfaceStability annotations for HCat APIs
> --
>
> Key: HIVE-17203
> URL: https://issues.apache.org/jira/browse/HIVE-17203
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 3.0.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17203.1.patch
>
>






[jira] [Updated] (HIVE-17542) Make HoS CombineEquivalentWorkResolver Configurable

2017-09-17 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17542:

Attachment: HIVE-17542.2.patch

> Make HoS CombineEquivalentWorkResolver Configurable
> ---
>
> Key: HIVE-17542
> URL: https://issues.apache.org/jira/browse/HIVE-17542
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer, Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17542.1.patch, HIVE-17542.2.patch
>
>
> The {{CombineEquivalentWorkResolver}} is run by default. We should make it 
> configurable so that users can disable it in case there are any issues. We 
> can enable it by default to preserve backwards compatibility.





[jira] [Updated] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable

2017-09-17 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17545:

Attachment: HIVE-17545.2.patch

> Make HoS RDD Cacheing Optimization Configurable
> ---
>
> Key: HIVE-17545
> URL: https://issues.apache.org/jira/browse/HIVE-17545
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer, Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17545.1.patch, HIVE-17545.2.patch
>
>
> The RDD caching optimization added in HIVE-10550 is enabled by default. We 
> should make it configurable in case users want to disable it. We can leave it 
> on by default to preserve backwards compatibility.





[jira] [Updated] (HIVE-15212) merge branch into master

2017-09-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15212:

Status: Patch Available  (was: Open)

> merge branch into master
> 
>
> Key: HIVE-15212
> URL: https://issues.apache.org/jira/browse/HIVE-15212
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15212.01.patch, HIVE-15212.02.patch, 
> HIVE-15212.03.patch, HIVE-15212.04.patch, HIVE-15212.05.patch, 
> HIVE-15212.06.patch, HIVE-15212.07.patch, HIVE-15212.08.patch, 
> HIVE-15212.09.patch, HIVE-15212.10.patch, HIVE-15212.11.patch, 
> HIVE-15212.12.patch, HIVE-15212.12.patch, HIVE-15212.13.patch, 
> HIVE-15212.13.patch, HIVE-15212.14.patch
>
>






[jira] [Updated] (HIVE-15212) merge branch into master

2017-09-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15212:

Attachment: HIVE-15212.14.patch

HiveQA runs just keep disappearing... again

> merge branch into master
> 
>
> Key: HIVE-15212
> URL: https://issues.apache.org/jira/browse/HIVE-15212
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15212.01.patch, HIVE-15212.02.patch, 
> HIVE-15212.03.patch, HIVE-15212.04.patch, HIVE-15212.05.patch, 
> HIVE-15212.06.patch, HIVE-15212.07.patch, HIVE-15212.08.patch, 
> HIVE-15212.09.patch, HIVE-15212.10.patch, HIVE-15212.11.patch, 
> HIVE-15212.12.patch, HIVE-15212.12.patch, HIVE-15212.13.patch, 
> HIVE-15212.13.patch, HIVE-15212.14.patch
>
>






[jira] [Updated] (HIVE-15212) merge branch into master

2017-09-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15212:

Status: Open  (was: Patch Available)

> merge branch into master
> 
>
> Key: HIVE-15212
> URL: https://issues.apache.org/jira/browse/HIVE-15212
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15212.01.patch, HIVE-15212.02.patch, 
> HIVE-15212.03.patch, HIVE-15212.04.patch, HIVE-15212.05.patch, 
> HIVE-15212.06.patch, HIVE-15212.07.patch, HIVE-15212.08.patch, 
> HIVE-15212.09.patch, HIVE-15212.10.patch, HIVE-15212.11.patch, 
> HIVE-15212.12.patch, HIVE-15212.12.patch, HIVE-15212.13.patch, 
> HIVE-15212.13.patch
>
>






[jira] [Updated] (HIVE-16886) HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently

2017-09-17 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-16886:
--
Labels: TODOC3.0  (was: )

> HMS log notifications may have duplicated event IDs if multiple HMS are 
> running concurrently
> 
>
> Key: HIVE-16886
> URL: https://issues.apache.org/jira/browse/HIVE-16886
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Metastore
>Reporter: Sergio Peña
>Assignee: anishek
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: datastore-identity-holes.diff, HIVE-16886.1.patch, 
> HIVE-16886.2.patch, HIVE-16886.3.patch, HIVE-16886.4.patch, 
> HIVE-16886.5.patch, HIVE-16886.6.patch, HIVE-16886.7.patch, HIVE-16886.8.patch
>
>
> When running multiple Hive Metastore servers and DB notifications are 
> enabled, I could see that notifications can be persisted with a duplicated 
> event ID. 
> This does not happen when running multiple threads in a single HMS node due 
> to the locking acquired on the DbNotificationsLog class, but multiple HMS 
> could cause conflicts.
> The issue is in the ObjectStore#addNotificationEvent() method. The event ID 
> fetched from the datastore is used for the new notification, incremented in 
> the server itself, then persisted or updated back to the datastore. If 2 
> servers read the same ID, then these 2 servers write a new notification with 
> the same ID.
> The event ID is neither unique nor a primary key.
> Here's a test case using the TestObjectStore class that confirms this issue:
> {noformat}
> @Test
> public void testConcurrentAddNotifications() throws ExecutionException, InterruptedException {
>   final int NUM_THREADS = 2;
>   CountDownLatch countIn = new CountDownLatch(NUM_THREADS);
>   CountDownLatch countOut = new CountDownLatch(1);
>   HiveConf conf = new HiveConf();
>   conf.setVar(HiveConf.ConfVars.METASTORE_EXPRESSION_PROXY_CLASS, MockPartitionExpressionProxy.class.getName());
>   ExecutorService executorService = Executors.newFixedThreadPool(NUM_THREADS);
>   FutureTask tasks[] = new FutureTask[NUM_THREADS];
>   for (int i = 0; i < NUM_THREADS; ++i) {
>     final int n = i;
>     tasks[i] = new FutureTask(new Callable() {
>       @Override
>       public Void call() throws Exception {
>         ObjectStore store = new ObjectStore();
>         store.setConf(conf);
>         NotificationEvent dbEvent =
>             new NotificationEvent(0, 0, EventMessage.EventType.CREATE_DATABASE.toString(), "CREATE DATABASE DB" + n);
>         System.out.println("ADDING NOTIFICATION");
>         countIn.countDown();
>         countOut.await();
>         store.addNotificationEvent(dbEvent);
>         System.out.println("FINISH NOTIFICATION");
>         return null;
>       }
>     });
>     executorService.execute(tasks[i]);
>   }
>   countIn.await();
>   countOut.countDown();
>   for (int i = 0; i < NUM_THREADS; ++i) {
>     tasks[i].get();
>   }
>   NotificationEventResponse eventResponse = objectStore.getNextNotification(new NotificationEventRequest());
>   Assert.assertEquals(2, eventResponse.getEventsSize());
>   Assert.assertEquals(1, eventResponse.getEvents().get(0).getEventId());
>   // This fails because the next notification has an event ID = 1
>   Assert.assertEquals(2, eventResponse.getEvents().get(1).getEventId());
> }
> {noformat}
> The last assertion fails expecting an event ID 1 instead of 2. 





[jira] [Comment Edited] (HIVE-14309) Fix naming of classes in orc module to not conflict with standalone orc

2017-09-17 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047439#comment-16047439
 ] 

Lefty Leverenz edited comment on HIVE-14309 at 9/18/17 12:08 AM:
-

[~owen.omalley], you committed this to branch-2.2 on May 31, so please update 
the status and fix version.

See commit fd1188a6a79c6da868d463c1e5db50c017c3b1a2.

Edit 17/Sep/17:  [~owen.omalley], nudge.


was (Author: le...@hortonworks.com):
[~owen.omalley], you committed this to branch-2.2 on May 31, so please update 
the status and fix version.

See commit fd1188a6a79c6da868d463c1e5db50c017c3b1a2.

> Fix naming of classes in orc module to not conflict with standalone orc
> ---
>
> Key: HIVE-14309
> URL: https://issues.apache.org/jira/browse/HIVE-14309
> Project: Hive
>  Issue Type: Bug
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>
> The current Hive 2.0 and 2.1 releases have classes in the org.apache.orc 
> namespace that clash with the ORC project's classes. From Hive 2.2 onward, 
> the classes will only be in ORC, but we'll reduce classpath 
> issues if we rename the classes to org.apache.hive.orc.
> I've looked at a set of projects (pig, spark, oozie, flume, & storm) and 
> can't find any uses of Hive's versions of the org.apache.orc classes, so I 
> believe this is a safe change that will reduce the integration problems 
> downstream.





[jira] [Commented] (HIVE-17451) Cannot read decimal from avro file created with HIVE

2017-09-17 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169474#comment-16169474
 ] 

Matt McCline commented on HIVE-17451:
-

Seems like avro-tools.jar tojson isn't converting the binary (physical type) to 
decimal (logical type).

> Cannot read decimal from avro file created with HIVE
> 
>
> Key: HIVE-17451
> URL: https://issues.apache.org/jira/browse/HIVE-17451
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: liviu
>Assignee: Ganesh Tripathi
>Priority: Blocker
>
> Hi,
> When we export decimal data from a hive managed table to a hive avro external 
> table (as bytes with decimal logicalType), the value in the avro file cannot be 
> read with any other tools (e.g. avro-tools, spark, datastage...)
> _+Scenario:+_
> *create a hive managed table and insert a decimal record:*
> {code:java}
> create table test_decimal (col1 decimal(20,2));
> insert into table test_decimal values (3.12);
> {code}
> *create avro schema /tmp/test_decimal.avsc with below content:*
> {code:java}
> {
>   "type" : "record",
>   "name" : "decimal_test_avro",
>   "fields" : [ {
> "name" : "col1",
> "type" : [ "null", {
>   "type" : "bytes",
>   "logicalType" : "decimal",
>   "precision" : 20,
>   "scale" : 2
> } ],
> "default" : null,
> "columnName" : "col1",
> "sqlType" : "2"
>   }],
>   "tableName" : "decimal_test_avro"
> }
> {code}
> *create a hive external table stored as avro:*
> {code:java}
> create external table test_decimal_avro
> STORED AS AVRO
> LOCATION '/tmp/test_decimal'
> TBLPROPERTIES (
>   'avro.schema.url'='/tmp/test_decimal.avsc',
>   'orc.compress'='SNAPPY');
> {code}
> *insert data in avro external table from hive managed table:*
> {code:java}
> set hive.exec.compress.output=true;
> set hive.exec.compress.intermediate=true;
> set avro.output.codec=snappy; 
> insert overwrite table test_decimal_avro select * from test_decimal;
> {code}
> *successfully reading data from hive avro table through hive cli:*
> {code:java}
> select * from test_decimal_avro;
> OK
> 3.12
> {code}
> *avro schema from avro created file is ok:*
> {code:java}
> hadoop jar /avro-tools.jar getschema /tmp/test_decimal/00_0
> {
>   "type" : "record",
>   "name" : "decimal_test_avro",
>   "fields" : [ {
> "name" : "col1",
> "type" : [ "null", {
>   "type" : "bytes",
>   "logicalType" : "decimal",
>   "precision" : 20,
>   "scale" : 2
> } ],
> "default" : null,
> "columnName" : "col1",
> "sqlType" : "2"
>   } ],
>   "tableName" : "decimal_test_avro"
> }
> {code}
> *read data from avro file with avro-tools {color:#d04437}error{color}, got 
> {color:#d04437}"\u00018"{color} value instead of the correct one:*
> {code:java}
> hadoop jar avro-tools.jar tojson /tmp/test_decimal/00_0
> {"col1":{"bytes":"\u00018"}}
> {code}
> *Reading data into a spark dataframe also fails: got {color:#d04437}[01 38]{color} 
> and{color:#d04437} 8{color} when converted to string, instead of the correct 
> "3.12" value:*
> {code:java}
> val df = sql.read.avro("/tmp/test_decimal")
> df: org.apache.spark.sql.DataFrame = [col1: binary]
> scala> df.show()
> +---+
> |   col1|
> +---+
> |[01 38]|
> +---+
> scala> df.withColumn("col2", 'col1.cast("String")).select("col2").show()
> ++
> |col2|
> ++
> |  8|
> ++
> {code}
> Is this a Hive bug, or is there anything else I can do in order to get correct 
> values in the avro file created by Hive?
> Thanks,





[jira] [Commented] (HIVE-17387) implement Tez AM registry in Hive

2017-09-17 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169473#comment-16169473
 ] 

Lefty Leverenz commented on HIVE-17387:
---

Doc note:  This adds three configuration parameters to HiveConf.java 
(*hive.llap.task.scheduler.am.registry*, 
*hive.llap.task.scheduler.am.registry.principal*, and 
*hive.llap.task.scheduler.am.registry.keytab.file*) so they need to be 
documented in the wiki.

* [Configuration Properties -- LLAP | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-LLAP]

Added a TODOC3.0 label.

> implement Tez AM registry in Hive
> -
>
> Key: HIVE-17387
> URL: https://issues.apache.org/jira/browse/HIVE-17387
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17387.01.patch, HIVE-17387.02.patch, 
> HIVE-17387.patch
>
>
> Necessary for HS2 HA, to transfer AMs between HS2s, etc.
> Helpful for workload management.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17387) implement Tez AM registry in Hive

2017-09-17 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-17387:
--
Labels: TODOC3.0  (was: )

> implement Tez AM registry in Hive
> -
>
> Key: HIVE-17387
> URL: https://issues.apache.org/jira/browse/HIVE-17387
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17387.01.patch, HIVE-17387.02.patch, 
> HIVE-17387.patch
>
>
> Necessary for HS2 HA, to transfer AMs between HS2s, etc.
> Helpful for workload management.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17451) Cannot read decimal from avro file created with HIVE

2017-09-17 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169472#comment-16169472
 ] 

Matt McCline commented on HIVE-17451:
-

I believe Avro stores the decimal as the serialization of a Java BigInteger.  
That is, we call BigInteger's toByteArray function, which returns a byte array 
in two's-complement representation.  Why do we use BigInteger?  Because 
decimal's max precision of 38 digits cannot be stored in a Java 64-bit signed 
long; 128 bits are needed.  I believe Spark represents decimals with precision > 18 
as Java BigDecimal.  BigDecimal.unscaledValue() returns a BigInteger.

So a logicalType of decimal stored as bytes makes sense.

Does specifying [col1: decimal(20,2)] in the DataFrame schema not work?
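
For illustration, a minimal decode sketch (an assumption-laden sketch, not Hive's or Spark's code): it treats the bytes as the two's-complement unscaled value and takes the scale (2) from the Avro schema, exactly as described above.

{code:java}
// Minimal sketch, assuming the bytes hold the two's-complement unscaled value
// and the scale (2) comes from the Avro schema, as described above.
import java.math.{BigDecimal, BigInteger}

val bytes = Array[Byte](0x01, 0x38)                  // the [01 38] from the report
val value = new BigDecimal(new BigInteger(bytes), 2) // unscaled 312 at scale 2
println(value)                                       // prints 3.12
{code}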

> Cannot read decimal from avro file created with HIVE
> 
>
> Key: HIVE-17451
> URL: https://issues.apache.org/jira/browse/HIVE-17451
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: liviu
>Assignee: Ganesh Tripathi
>Priority: Blocker
>
> Hi,
> When we export decimal data from a Hive managed table to a Hive Avro external 
> table (as bytes with decimal logicalType), the value in the Avro file cannot be 
> read by any other tool (e.g. avro-tools, Spark, DataStage).
> _+Scenario:+_
> *create a Hive managed table and insert a decimal record:*
> {code:java}
> create table test_decimal (col1 decimal(20,2));
> insert into table test_decimal values (3.12);
> {code}
> *create avro schema /tmp/test_decimal.avsc with below content:*
> {code:java}
> {
>   "type" : "record",
>   "name" : "decimal_test_avro",
>   "fields" : [ {
> "name" : "col1",
> "type" : [ "null", {
>   "type" : "bytes",
>   "logicalType" : "decimal",
>   "precision" : 20,
>   "scale" : 2
> } ],
> "default" : null,
> "columnName" : "col1",
> "sqlType" : "2"
>   }],
>   "tableName" : "decimal_test_avro"
> }
> {code}
> *create a Hive external table stored as Avro:*
> {code:java}
> create external table test_decimal_avro
> STORED AS AVRO
> LOCATION '/tmp/test_decimal'
> TBLPROPERTIES (
>   'avro.schema.url'='/tmp/test_decimal.avsc',
>   'orc.compress'='SNAPPY');
> {code}
> *insert data into the Avro external table from the Hive managed table:*
> {code:java}
> set hive.exec.compress.output=true;
> set hive.exec.compress.intermediate=true;
> set avro.output.codec=snappy; 
> insert overwrite table test_decimal_avro select * from test_decimal;
> {code}
> *successfully reading data from the Hive Avro table through the Hive CLI:*
> {code:java}
> select * from test_decimal_avro;
> OK
> 3.12
> {code}
> *the Avro schema from the created Avro file is OK:*
> {code:java}
> hadoop jar /avro-tools.jar getschema /tmp/test_decimal/00_0
> {
>   "type" : "record",
>   "name" : "decimal_test_avro",
>   "fields" : [ {
> "name" : "col1",
> "type" : [ "null", {
>   "type" : "bytes",
>   "logicalType" : "decimal",
>   "precision" : 20,
>   "scale" : 2
> } ],
> "default" : null,
> "columnName" : "col1",
> "sqlType" : "2"
>   } ],
>   "tableName" : "decimal_test_avro"
> }
> {code}
> *reading data from the Avro file with avro-tools gives an {color:#d04437}error{color}: got 
> the {color:#d04437}"\u00018"{color} value instead of the correct one:*
> {code:java}
> hadoop jar avro-tools.jar tojson /tmp/test_decimal/00_0
> {"col1":{"bytes":"\u00018"}}
> {code}
> *Reading data into a Spark DataFrame also gives an error: got {color:#d04437}[01 38]{color}, 
> and {color:#d04437}8{color} when converted to string, instead of the correct 
> "3.12" value:*
> {code:java}
> val df = sql.read.avro("/tmp/test_decimal")
> df: org.apache.spark.sql.DataFrame = [col1: binary]
> scala> df.show()
> +---+
> |   col1|
> +---+
> |[01 38]|
> +---+
> scala> df.withColumn("col2", 'col1.cast("String")).select("col2").show()
> ++
> |col2|
> ++
> |  8|
> ++
> {code}
> Is this a Hive bug, or is there anything else I can do to get correct 
> values in the Avro file created by Hive?
> Thanks,



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-17 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169469#comment-16169469
 ] 

Vineet Garg commented on HIVE-17465:


[~ashutoshc] can you take a look at the updated review?

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, 
> HIVE-17465.6.patch, HIVE-17465.7.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_year = 2001) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_year = 2001) (type: boolean)
> Statistics: Num rows: 363 Data size: 4356 Basic stats: 
> COMPLETE Column stats: COMPLETE
>  
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
> Statistics: Num rows: 363 Data size: 5808 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
> Statistics: Num rows: 363 Data size: 7260 Basic stats: 
> COMPLETE Column stats: COMPLETE
> {code}
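
For illustration, a hedged back-of-the-envelope version of the progressive estimation the report expects; the NDVs below are assumptions, not values taken from the plan.

{code:java}
// Hedged sketch (assumed NDVs, not Hive's actual code): with column stats an
// equality predicate is typically costed as selectivity 1/NDV, and AND-ed
// predicates should multiply instead of reusing the first estimate.
val rows    = 73049.0   // date_dim row count from the plans above
val ndvYear = 201.0     // assumed NDV of d_year
val ndvMoy  = 12.0      // months per year
val ndvDom  = 31.0      // days per month

val afterYear = rows / ndvYear      // ~363, matches the plan
val afterMoy  = afterYear / ndvMoy  // ~30: expected, but the plan keeps 363
val afterDom  = afterMoy / ndvDom   // ~1:  expected, but the plan keeps 363
println(f"$afterYear%.0f, $afterMoy%.0f, $afterDom%.0f")
{code}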



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-17 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17465:
---
Attachment: HIVE-17465.7.patch

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, 
> HIVE-17465.6.patch, HIVE-17465.7.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_year = 2001) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_year = 2001) (type: boolean)
> Statistics: Num rows: 363 Data size: 4356 Basic stats: 
> COMPLETE Column stats: COMPLETE
>  
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
> Statistics: Num rows: 363 Data size: 5808 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
> Statistics: Num rows: 363 Data size: 7260 Basic stats: 
> COMPLETE Column stats: COMPLETE
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-17 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17465:
---
Status: Patch Available  (was: Open)

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, 
> HIVE-17465.6.patch, HIVE-17465.7.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_year = 2001) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_year = 2001) (type: boolean)
> Statistics: Num rows: 363 Data size: 4356 Basic stats: 
> COMPLETE Column stats: COMPLETE
>  
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
> Statistics: Num rows: 363 Data size: 5808 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
> Statistics: Num rows: 363 Data size: 7260 Basic stats: 
> COMPLETE Column stats: COMPLETE
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-17 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17465:
---
Status: Open  (was: Patch Available)

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, 
> HIVE-17465.6.patch, HIVE-17465.7.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_year = 2001) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_year = 2001) (type: boolean)
> Statistics: Num rows: 363 Data size: 4356 Basic stats: 
> COMPLETE Column stats: COMPLETE
>  
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
> Statistics: Num rows: 363 Data size: 5808 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
> Statistics: Num rows: 363 Data size: 7260 Basic stats: 
> COMPLETE Column stats: COMPLETE
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17451) Cannot read decimal from avro file created with HIVE

2017-09-17 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169433#comment-16169433
 ] 

Matt McCline edited comment on HIVE-17451 at 9/17/17 11:38 PM:
---

Hexadecimal 0x138 = decimal 312 (unscaled),
and the hexadecimal bytes 01 and 38 are a non-printable control character \001 and 
ASCII "8", respectively.


was (Author: mmccline):
Hexadecimal 0x138 = decimal 312 
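
A possible Spark-side workaround sketch, assuming spark-avro surfaces the column as binary as shown in the report; the UDF and the new column name are illustrative, not an existing API.

{code:java}
// Hedged workaround sketch: decode the binary column back to a decimal in
// Spark, assuming the bytes are the two's-complement unscaled value and the
// scale (2) is known from the Avro schema. `df` is the DataFrame from the
// report; "col1_dec" is an illustrative name.
import org.apache.spark.sql.functions.udf

val toDecimal = udf { (b: Array[Byte]) =>
  new java.math.BigDecimal(new java.math.BigInteger(b), 2)
}

val decoded = df.withColumn("col1_dec", toDecimal('col1))
// decoded.show() should print 3.12 instead of [01 38]
{code}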

> Cannot read decimal from avro file created with HIVE
> 
>
> Key: HIVE-17451
> URL: https://issues.apache.org/jira/browse/HIVE-17451
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: liviu
>Assignee: Ganesh Tripathi
>Priority: Blocker
>
> Hi,
> When we export decimal data from a Hive managed table to a Hive Avro external 
> table (as bytes with decimal logicalType), the value in the Avro file cannot be 
> read by any other tool (e.g. avro-tools, Spark, DataStage).
> _+Scenario:+_
> *create a Hive managed table and insert a decimal record:*
> {code:java}
> create table test_decimal (col1 decimal(20,2));
> insert into table test_decimal values (3.12);
> {code}
> *create avro schema /tmp/test_decimal.avsc with below content:*
> {code:java}
> {
>   "type" : "record",
>   "name" : "decimal_test_avro",
>   "fields" : [ {
> "name" : "col1",
> "type" : [ "null", {
>   "type" : "bytes",
>   "logicalType" : "decimal",
>   "precision" : 20,
>   "scale" : 2
> } ],
> "default" : null,
> "columnName" : "col1",
> "sqlType" : "2"
>   }],
>   "tableName" : "decimal_test_avro"
> }
> {code}
> *create a Hive external table stored as Avro:*
> {code:java}
> create external table test_decimal_avro
> STORED AS AVRO
> LOCATION '/tmp/test_decimal'
> TBLPROPERTIES (
>   'avro.schema.url'='/tmp/test_decimal.avsc',
>   'orc.compress'='SNAPPY');
> {code}
> *insert data into the Avro external table from the Hive managed table:*
> {code:java}
> set hive.exec.compress.output=true;
> set hive.exec.compress.intermediate=true;
> set avro.output.codec=snappy; 
> insert overwrite table test_decimal_avro select * from test_decimal;
> {code}
> *successfully reading data from the Hive Avro table through the Hive CLI:*
> {code:java}
> select * from test_decimal_avro;
> OK
> 3.12
> {code}
> *the Avro schema from the created Avro file is OK:*
> {code:java}
> hadoop jar /avro-tools.jar getschema /tmp/test_decimal/00_0
> {
>   "type" : "record",
>   "name" : "decimal_test_avro",
>   "fields" : [ {
> "name" : "col1",
> "type" : [ "null", {
>   "type" : "bytes",
>   "logicalType" : "decimal",
>   "precision" : 20,
>   "scale" : 2
> } ],
> "default" : null,
> "columnName" : "col1",
> "sqlType" : "2"
>   } ],
>   "tableName" : "decimal_test_avro"
> }
> {code}
> *reading data from the Avro file with avro-tools gives an {color:#d04437}error{color}: got 
> the {color:#d04437}"\u00018"{color} value instead of the correct one:*
> {code:java}
> hadoop jar avro-tools.jar tojson /tmp/test_decimal/00_0
> {"col1":{"bytes":"\u00018"}}
> {code}
> *Reading data into a Spark DataFrame also gives an error: got {color:#d04437}[01 38]{color}, 
> and {color:#d04437}8{color} when converted to string, instead of the correct 
> "3.12" value:*
> {code:java}
> val df = sql.read.avro("/tmp/test_decimal")
> df: org.apache.spark.sql.DataFrame = [col1: binary]
> scala> df.show()
> +---+
> |   col1|
> +---+
> |[01 38]|
> +---+
> scala> df.withColumn("col2", 'col1.cast("String")).select("col2").show()
> ++
> |col2|
> ++
> |  8|
> ++
> {code}
> Is this a Hive bug, or is there anything else I can do to get correct 
> values in the Avro file created by Hive?
> Thanks,



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17422) Skip non-native/temporary tables for all major table/partition related scenarios

2017-09-17 Thread Tao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169445#comment-16169445
 ] 

Tao Li commented on HIVE-17422:
---

Test result looks good.

> Skip non-native/temporary tables for all major table/partition related 
> scenarios
> 
>
> Key: HIVE-17422
> URL: https://issues.apache.org/jira/browse/HIVE-17422
> Project: Hive
>  Issue Type: Improvement
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17422.1.patch, HIVE-17422.2.patch, 
> HIVE-17422.3.patch, HIVE-17422.4.patch
>
>
> Currently, during an incremental dump, non-native/temporary table info is 
> partially dumped into the metadata file and only ignored later by the repl load. 
> We can optimize this by moving the check (whether the table should be exported 
> or not) earlier, so that we don't save any info to the dump file for such 
> tables. CreateTableHandler already has this optimization, so we just need to 
> apply similar logic to the other scenarios.
> The change is to apply the EximUtil.shouldExportTable check to all scenarios 
> (e.g. alter table) that call into the common dump method.
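
As an illustration of the early-exit shape described above, a minimal sketch with hypothetical names (not the actual Hive classes):

{code:java}
// Hedged sketch with illustrative names (not the actual Hive API): apply the
// export check before writing anything, so non-native/temporary tables leave
// no partial metadata in the dump at all.
case class Table(name: String, isNative: Boolean, isTemporary: Boolean)

def shouldExportTable(t: Table): Boolean = t.isNative && !t.isTemporary

def dumpTable(t: Table, write: Table => Unit): Unit =
  if (shouldExportTable(t)) write(t)  // the up-front check replaces the later repl-load filter

// dumpTable(Table("tmp_t", isNative = false, isTemporary = true), println)  // no-op
{code}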



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17275) Auto-merge fails on writes of UNION ALL output to ORC file with dynamic partitioning

2017-09-17 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169442#comment-16169442
 ] 

Lefty Leverenz commented on HIVE-17275:
---

This issue needs fix versions.  (Will it also be committed to branch-2.3?)

> Auto-merge fails on writes of UNION ALL output to ORC file with dynamic 
> partitioning
> 
>
> Key: HIVE-17275
> URL: https://issues.apache.org/jira/browse/HIVE-17275
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.2.0
>Reporter: Chris Drome
>Assignee: Chris Drome
> Attachments: HIVE-17275.2-branch-2.2.patch, 
> HIVE-17275.2-branch-2.patch, HIVE-17275.2.patch, HIVE-17275-branch-2.2.patch, 
> HIVE-17275-branch-2.patch, HIVE-17275.patch
>
>
> If dynamic partitioning is used to write the output of UNION or UNION ALL 
> queries into ORC files with hive.merge.tezfiles=true, the merge step fails as 
> follows:
> {noformat}
> 2017-08-08T11:27:19,958 ERROR [e7b1f06d-d632-408a-9dff-f7ae042cd25a main] 
> SessionState: Vertex failed, vertexName=File Merge, 
> vertexId=vertex_1502216690354_0001_33_00, diagnostics=[Task failed, 
> taskId=task_1502216690354_0001_33_00_00, diagnostics=[TaskAttempt 0 
> failed, info=[Error: Error while running task ( failure ) : 
> attempt_1502216690354_0001_33_00_00_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: Multiple partitions for one merge mapper: 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/1
>  NOT EQUAL TO 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/2
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
> Multiple partitions for one merge mapper: 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/1
>  NOT EQUAL TO 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/2
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:225)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.run(MergeFileRecordProcessor.java:154)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185)
>   ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: Multiple partitions for one merge mapper: 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/1
>  NOT EQUAL TO 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/2
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.processKeyValuePairs(OrcFileMergeOperator.java:169)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.process(OrcFileMergeOperator.java:72)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:216)
>   ... 16 more
> Caused by: java.io.IOException: 

[jira] [Commented] (HIVE-17422) Skip non-native/temporary tables for all major table/partition related scenarios

2017-09-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169438#comment-16169438
 ] 

Hive QA commented on HIVE-17422:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887565/HIVE-17422.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 11041 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_mask_hash] 
(batchId=28)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2]
 (batchId=89)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=234)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=234)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=215)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpRetryOnServerIdleTimeout 
(batchId=227)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6854/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6854/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6854/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12887565 - PreCommit-HIVE-Build

> Skip non-native/temporary tables for all major table/partition related 
> scenarios
> 
>
> Key: HIVE-17422
> URL: https://issues.apache.org/jira/browse/HIVE-17422
> Project: Hive
>  Issue Type: Improvement
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17422.1.patch, HIVE-17422.2.patch, 
> HIVE-17422.3.patch, HIVE-17422.4.patch
>
>
> Currently, during an incremental dump, non-native/temporary table info is 
> partially dumped into the metadata file and only ignored later by the repl load. 
> We can optimize this by moving the check (whether the table should be exported 
> or not) earlier, so that we don't save any info to the dump file for such 
> tables. CreateTableHandler already has this optimization, so we just need to 
> apply similar logic to the other scenarios.
> The change is to apply the EximUtil.shouldExportTable check to all scenarios 
> (e.g. alter table) that call into the common dump method.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17522) cleanup old 'repl dump' dirs

2017-09-17 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169435#comment-16169435
 ] 

Lefty Leverenz commented on HIVE-17522:
---

Doc note:  This adds *hive.repl.dumpdir.clean.freq* and *hive.repl.dumpdir.ttl* 
to HiveConf.java, so they need to be documented in the wiki for release 3.0.0.

* [Configuration Properties -- Replication | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Replication]

Added a TODOC3.0 label.

Question:  What does TTL mean?

> cleanup old 'repl dump' dirs
> 
>
> Key: HIVE-17522
> URL: https://issues.apache.org/jira/browse/HIVE-17522
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17522.1.patch
>
>
> We want to clean up the old dump dirs to save space and reduce scan time when 
> needed.
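
For illustration, a minimal sketch of such a TTL-based (time-to-live) cleanup with assumed defaults; the two new properties above would drive the schedule and the TTL.

{code:java}
// Hedged sketch (illustrative, not the actual patch): on a fixed schedule,
// delete dump dirs whose modification time is older than the TTL.
import java.io.File
import java.util.concurrent.{Executors, TimeUnit}

def cleanOldDumpDirs(root: File, ttlMs: Long): Unit = {
  val cutoff = System.currentTimeMillis() - ttlMs
  Option(root.listFiles()).getOrElse(Array.empty[File])
    .filter(d => d.isDirectory && d.lastModified() < cutoff)
    .foreach(d => println(s"would delete ${d.getPath}"))  // recurse-delete in practice
}

val scheduler = Executors.newSingleThreadScheduledExecutor()
scheduler.scheduleAtFixedRate(
  () => cleanOldDumpDirs(new File("/tmp/repl/dump"), TimeUnit.DAYS.toMillis(7)),
  0L, 1L, TimeUnit.HOURS)  // frequency = 1 hour, TTL = 7 days (assumed values)
{code}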



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17451) Cannot read decimal from avro file created with HIVE

2017-09-17 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169433#comment-16169433
 ] 

Matt McCline commented on HIVE-17451:
-

Hexadecimal 0x138 = decimal 312 

> Cannot read decimal from avro file created with HIVE
> 
>
> Key: HIVE-17451
> URL: https://issues.apache.org/jira/browse/HIVE-17451
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: liviu
>Assignee: Ganesh Tripathi
>Priority: Blocker
>
> Hi,
> When we export decimal data from a Hive managed table to a Hive Avro external 
> table (as bytes with decimal logicalType), the value in the Avro file cannot be 
> read by any other tool (e.g. avro-tools, Spark, DataStage).
> _+Scenario:+_
> *create a Hive managed table and insert a decimal record:*
> {code:java}
> create table test_decimal (col1 decimal(20,2));
> insert into table test_decimal values (3.12);
> {code}
> *create avro schema /tmp/test_decimal.avsc with below content:*
> {code:java}
> {
>   "type" : "record",
>   "name" : "decimal_test_avro",
>   "fields" : [ {
> "name" : "col1",
> "type" : [ "null", {
>   "type" : "bytes",
>   "logicalType" : "decimal",
>   "precision" : 20,
>   "scale" : 2
> } ],
> "default" : null,
> "columnName" : "col1",
> "sqlType" : "2"
>   }],
>   "tableName" : "decimal_test_avro"
> }
> {code}
> *create a Hive external table stored as Avro:*
> {code:java}
> create external table test_decimal_avro
> STORED AS AVRO
> LOCATION '/tmp/test_decimal'
> TBLPROPERTIES (
>   'avro.schema.url'='/tmp/test_decimal.avsc',
>   'orc.compress'='SNAPPY');
> {code}
> *insert data into the Avro external table from the Hive managed table:*
> {code:java}
> set hive.exec.compress.output=true;
> set hive.exec.compress.intermediate=true;
> set avro.output.codec=snappy; 
> insert overwrite table test_decimal_avro select * from test_decimal;
> {code}
> *successfully reading data from the Hive Avro table through the Hive CLI:*
> {code:java}
> select * from test_decimal_avro;
> OK
> 3.12
> {code}
> *the Avro schema from the created Avro file is OK:*
> {code:java}
> hadoop jar /avro-tools.jar getschema /tmp/test_decimal/00_0
> {
>   "type" : "record",
>   "name" : "decimal_test_avro",
>   "fields" : [ {
> "name" : "col1",
> "type" : [ "null", {
>   "type" : "bytes",
>   "logicalType" : "decimal",
>   "precision" : 20,
>   "scale" : 2
> } ],
> "default" : null,
> "columnName" : "col1",
> "sqlType" : "2"
>   } ],
>   "tableName" : "decimal_test_avro"
> }
> {code}
> *reading data from the Avro file with avro-tools gives an {color:#d04437}error{color}: got 
> the {color:#d04437}"\u00018"{color} value instead of the correct one:*
> {code:java}
> hadoop jar avro-tools.jar tojson /tmp/test_decimal/00_0
> {"col1":{"bytes":"\u00018"}}
> {code}
> *Reading data into a Spark DataFrame also gives an error: got {color:#d04437}[01 38]{color}, 
> and {color:#d04437}8{color} when converted to string, instead of the correct 
> "3.12" value:*
> {code:java}
> val df = sql.read.avro("/tmp/test_decimal")
> df: org.apache.spark.sql.DataFrame = [col1: binary]
> scala> df.show()
> +---+
> |   col1|
> +---+
> |[01 38]|
> +---+
> scala> df.withColumn("col2", 'col1.cast("String")).select("col2").show()
> ++
> |col2|
> ++
> |  8|
> ++
> {code}
> Is this a Hive bug, or is there anything else I can do to get correct 
> values in the Avro file created by Hive?
> Thanks,



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17522) cleanup old 'repl dump' dirs

2017-09-17 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-17522:
--
Labels: TODOC3.0  (was: )

> cleanup old 'repl dump' dirs
> 
>
> Key: HIVE-17522
> URL: https://issues.apache.org/jira/browse/HIVE-17522
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17522.1.patch
>
>
> We want to clean up the old dump dirs to save space and reduce scan time when 
> needed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17422) Skip non-native/temporary tables for all major table/partition related scenarios

2017-09-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169407#comment-16169407
 ] 

Hive QA commented on HIVE-17422:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887565/HIVE-17422.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 11041 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_mask_hash] 
(batchId=28)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2]
 (batchId=89)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=234)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=215)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpRetryOnServerIdleTimeout 
(batchId=227)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6853/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6853/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6853/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12887565 - PreCommit-HIVE-Build

> Skip non-native/temporary tables for all major table/partition related 
> scenarios
> 
>
> Key: HIVE-17422
> URL: https://issues.apache.org/jira/browse/HIVE-17422
> Project: Hive
>  Issue Type: Improvement
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17422.1.patch, HIVE-17422.2.patch, 
> HIVE-17422.3.patch, HIVE-17422.4.patch
>
>
> Currently, during an incremental dump, non-native/temporary table info is 
> partially dumped into the metadata file and only ignored later by the repl load. 
> We can optimize this by moving the check (whether the table should be exported 
> or not) earlier, so that we don't save any info to the dump file for such 
> tables. CreateTableHandler already has this optimization, so we just need to 
> apply similar logic to the other scenarios.
> The change is to apply the EximUtil.shouldExportTable check to all scenarios 
> (e.g. alter table) that call into the common dump method.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17493) Improve PKFK cardinality estimation in Physical planning

2017-09-17 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17493:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master

> Improve PKFK cardinality estimation in Physical planning
> 
>
> Key: HIVE-17493
> URL: https://issues.apache.org/jira/browse/HIVE-17493
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17493.1.patch, HIVE-17493.2.patch, 
> HIVE-17493.3.patch, HIVE-17493.4.patch
>
>
> Cardinality estimation of a join, after a PK-FK relation has been ascertained, 
> could be improved if the parent of the join operator is a LEFT OUTER or RIGHT 
> OUTER join.
> Currently, estimation is done by computing the reduction of rows that occurred on 
> the PK side, then multiplying that reduction into the FK-side row count. This 
> estimation of the reduction currently doesn't distinguish between INNER and 
> OUTER joins; it could be improved to handle outer joins better (see the sketch 
> below).
> TPC-DS query45 is impacted by this.
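
A hedged sketch of the estimation described above, with illustrative row counts (not Hive's actual code):

{code:java}
// The PK-side reduction is applied to the FK-side row count; an outer join
// preserves unmatched FK rows, so reusing the INNER-join reduction
// overestimates the shrinkage. Numbers are illustrative.
val pkRowsBefore = 73049.0    // PK side before its filters
val pkRowsAfter  = 363.0      // PK side after its filters
val fkRows       = 2880404.0  // FK side row count

val reduction    = pkRowsAfter / pkRowsBefore
val innerEst     = fkRows * reduction  // reasonable for an INNER join
val leftOuterEst = fkRows              // a LEFT OUTER join keeps every FK row
println(f"inner=$innerEst%.0f, leftOuter=$leftOuterEst%.0f")
{code}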



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17422) Skip non-native/temporary tables for all major table/partition related scenarios

2017-09-17 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17422:
--
Attachment: HIVE-17422.4.patch

> Skip non-native/temporary tables for all major table/partition related 
> scenarios
> 
>
> Key: HIVE-17422
> URL: https://issues.apache.org/jira/browse/HIVE-17422
> Project: Hive
>  Issue Type: Improvement
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17422.1.patch, HIVE-17422.2.patch, 
> HIVE-17422.3.patch, HIVE-17422.4.patch
>
>
> Currently, during an incremental dump, non-native/temporary table info is 
> partially dumped into the metadata file and only ignored later by the repl load. 
> We can optimize this by moving the check (whether the table should be exported 
> or not) earlier, so that we don't save any info to the dump file for such 
> tables. CreateTableHandler already has this optimization, so we just need to 
> apply similar logic to the other scenarios.
> The change is to apply the EximUtil.shouldExportTable check to all scenarios 
> (e.g. alter table) that call into the common dump method.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17422) Skip non-native/temporary tables for all major table/partition related scenarios

2017-09-17 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17422:
--
Status: Patch Available  (was: Open)

> Skip non-native/temporary tables for all major table/partition related 
> scenarios
> 
>
> Key: HIVE-17422
> URL: https://issues.apache.org/jira/browse/HIVE-17422
> Project: Hive
>  Issue Type: Improvement
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17422.1.patch, HIVE-17422.2.patch, 
> HIVE-17422.3.patch, HIVE-17422.4.patch
>
>
> Currently, during an incremental dump, non-native/temporary table info is 
> partially dumped into the metadata file and only ignored later by the repl load. 
> We can optimize this by moving the check (whether the table should be exported 
> or not) earlier, so that we don't save any info to the dump file for such 
> tables. CreateTableHandler already has this optimization, so we just need to 
> apply similar logic to the other scenarios.
> The change is to apply the EximUtil.shouldExportTable check to all scenarios 
> (e.g. alter table) that call into the common dump method.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17422) Skip non-native/temporary tables for all major table/partition related scenarios

2017-09-17 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17422:
--
Attachment: (was: HIVE-17422.4.patch)

> Skip non-native/temporary tables for all major table/partition related 
> scenarios
> 
>
> Key: HIVE-17422
> URL: https://issues.apache.org/jira/browse/HIVE-17422
> Project: Hive
>  Issue Type: Improvement
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17422.1.patch, HIVE-17422.2.patch, 
> HIVE-17422.3.patch
>
>
> Currently, during an incremental dump, non-native/temporary table info is 
> partially dumped into the metadata file and only ignored later by the repl load. 
> We can optimize this by moving the check (whether the table should be exported 
> or not) earlier, so that we don't save any info to the dump file for such 
> tables. CreateTableHandler already has this optimization, so we just need to 
> apply similar logic to the other scenarios.
> The change is to apply the EximUtil.shouldExportTable check to all scenarios 
> (e.g. alter table) that call into the common dump method.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17422) Skip non-native/temporary tables for all major table/partition related scenarios

2017-09-17 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17422:
--
Status: Open  (was: Patch Available)

> Skip non-native/temporary tables for all major table/partition related 
> scenarios
> 
>
> Key: HIVE-17422
> URL: https://issues.apache.org/jira/browse/HIVE-17422
> Project: Hive
>  Issue Type: Improvement
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17422.1.patch, HIVE-17422.2.patch, 
> HIVE-17422.3.patch
>
>
> Currently, during an incremental dump, non-native/temporary table info is 
> partially dumped into the metadata file and only ignored later by the repl load. 
> We can optimize this by moving the check (whether the table should be exported 
> or not) earlier, so that we don't save any info to the dump file for such 
> tables. CreateTableHandler already has this optimization, so we just need to 
> apply similar logic to the other scenarios.
> The change is to apply the EximUtil.shouldExportTable check to all scenarios 
> (e.g. alter table) that call into the common dump method.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17422) Skip non-native/temporary tables for all major table/partition related scenarios

2017-09-17 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17422:
--
Status: Patch Available  (was: Open)

> Skip non-native/temporary tables for all major table/partition related 
> scenarios
> 
>
> Key: HIVE-17422
> URL: https://issues.apache.org/jira/browse/HIVE-17422
> Project: Hive
>  Issue Type: Improvement
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17422.1.patch, HIVE-17422.2.patch, 
> HIVE-17422.3.patch, HIVE-17422.4.patch
>
>
> Currently, during an incremental dump, non-native/temporary table info is 
> partially dumped into the metadata file and only ignored later by the repl load. 
> We can optimize this by moving the check (whether the table should be exported 
> or not) earlier, so that we don't save any info to the dump file for such 
> tables. CreateTableHandler already has this optimization, so we just need to 
> apply similar logic to the other scenarios.
> The change is to apply the EximUtil.shouldExportTable check to all scenarios 
> (e.g. alter table) that call into the common dump method.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17422) Skip non-native/temporary tables for all major table/partition related scenarios

2017-09-17 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17422:
--
Attachment: HIVE-17422.4.patch

> Skip non-native/temporary tables for all major table/partition related 
> scenarios
> 
>
> Key: HIVE-17422
> URL: https://issues.apache.org/jira/browse/HIVE-17422
> Project: Hive
>  Issue Type: Improvement
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17422.1.patch, HIVE-17422.2.patch, 
> HIVE-17422.3.patch, HIVE-17422.4.patch
>
>
> Currently, during an incremental dump, non-native/temporary table info is 
> partially dumped into the metadata file and only ignored later by the repl load. 
> We can optimize this by moving the check (whether the table should be exported 
> or not) earlier, so that we don't save any info to the dump file for such 
> tables. CreateTableHandler already has this optimization, so we just need to 
> apply similar logic to the other scenarios.
> The change is to apply the EximUtil.shouldExportTable check to all scenarios 
> (e.g. alter table) that call into the common dump method.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17422) Skip non-native/temporary tables for all major table/partition related scenarios

2017-09-17 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17422:
--
Status: Open  (was: Patch Available)

> Skip non-native/temporary tables for all major table/partition related 
> scenarios
> 
>
> Key: HIVE-17422
> URL: https://issues.apache.org/jira/browse/HIVE-17422
> Project: Hive
>  Issue Type: Improvement
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17422.1.patch, HIVE-17422.2.patch, 
> HIVE-17422.3.patch, HIVE-17422.4.patch
>
>
> Currently, during an incremental dump, non-native/temporary table info is 
> partially dumped into the metadata file and only ignored later by the repl load. 
> We can optimize this by moving the check (whether the table should be exported 
> or not) earlier, so that we don't save any info to the dump file for such 
> tables. CreateTableHandler already has this optimization, so we just need to 
> apply similar logic to the other scenarios.
> The change is to apply the EximUtil.shouldExportTable check to all scenarios 
> (e.g. alter table) that call into the common dump method.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-6590) Hive does not work properly with boolean partition columns (wrong results and inserts to incorrect HDFS path)

2017-09-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169319#comment-16169319
 ] 

Hive QA commented on HIVE-6590:
---



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12885121/HIVE-6590.5.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 11037 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_gby2_map_multi_distinct]
 (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_mask_hash] 
(batchId=28)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2]
 (batchId=89)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver
 (batchId=241)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=215)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6851/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6851/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6851/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12885121 - PreCommit-HIVE-Build

> Hive does not work properly with boolean partition columns (wrong results and 
> inserts to incorrect HDFS path)
> -
>
> Key: HIVE-6590
> URL: https://issues.apache.org/jira/browse/HIVE-6590
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema, Metastore
>Affects Versions: 0.10.0
>Reporter: Lenni Kuff
>Assignee: Zoltan Haindrich
> Attachments: HIVE-6590.1.patch, HIVE-6590.2.patch, HIVE-6590.3.patch, 
> HIVE-6590.4.patch, HIVE-6590.5.patch, HIVE-6590.5.patch
>
>
> Hive does not work properly with boolean partition columns. Queries return 
> wrong results and also insert to incorrect HDFS paths.
> {code}
> create table bool_part(int_col int) partitioned by(bool_col boolean);
> # This works, creating 3 unique partitions!
> ALTER TABLE bool_part ADD PARTITION (bool_col=FALSE);
> ALTER TABLE bool_part ADD PARTITION (bool_col=false);
> ALTER TABLE bool_part ADD PARTITION (bool_col=False);
> {code}
> The first problem is that Hive cannot filter on a boolean partition key column. 
> "select * from bool_part" returns the correct results, but if you apply a 
> filter on the boolean partition key column, Hive won't return any results.
> The second problem is that Hive seems to just call "toString()" on the 
> boolean literal value. This means you can end up with multiple partitions 
> (FALSE, false, FaLSE, etc.) mapping to the same logical value. For example, 
> you can add three partitions in Hive for the same logical value "false" by doing:
> ALTER TABLE bool_part ADD PARTITION (bool_col=FALSE) -> 
> /test-warehouse/bool_part/bool_col=FALSE/
> ALTER TABLE bool_part ADD PARTITION (bool_col=false) -> 
> /test-warehouse/bool_part/bool_col=false/
> ALTER TABLE bool_part ADD PARTITION (bool_col=False) -> 
> /test-warehouse/bool_part/bool_col=False/
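
A minimal sketch of the normalization the report implies is missing (illustrative code, not the actual fix):

{code:java}
// Hedged sketch: parse the partition literal as a typed boolean and render
// one canonical string, so FALSE/false/False all map to a single HDFS path.
def canonicalBooleanPartitionValue(literal: String): String =
  literal.trim.toLowerCase match {
    case "true"  => "true"
    case "false" => "false"
    case other   => throw new IllegalArgumentException(s"not a boolean literal: $other")
  }

// canonicalBooleanPartitionValue("FALSE") == canonicalBooleanPartitionValue("false")  // same path
{code}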



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16355) Service: embedded mode should only be available if service is loaded onto the classpath

2017-09-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169295#comment-16169295
 ] 

Hive QA commented on HIVE-16355:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12885123/HIVE-16355.5.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 11035 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_mask_hash] 
(batchId=28)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2]
 (batchId=89)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=234)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver
 (batchId=241)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=215)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpRetryOnServerIdleTimeout 
(batchId=227)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6850/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6850/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6850/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12885123 - PreCommit-HIVE-Build

> Service: embedded mode should only be available if service is loaded onto the 
> classpath
> ---
>
> Key: HIVE-16355
> URL: https://issues.apache.org/jira/browse/HIVE-16355
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore, Server Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-16355.1.patch, HIVE-16355.2.patch, 
> HIVE-16355.2.patch, HIVE-16355.3.patch, HIVE-16355.4.patch, 
> HIVE-16355.4.patch, HIVE-16355.5.patch
>
>
> I would like to relax the hard reference to 
> {{EmbeddedThriftBinaryCLIService}} so that it is only used when the {{service}} 
> module is loaded onto the classpath.
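
A minimal sketch of such classpath-conditional loading; the package path is assumed for illustration, and the real class may need constructor arguments.

{code:java}
// Hedged sketch: look the class up reflectively and fall back gracefully
// when the service module is absent from the classpath.
def embeddedServiceIfPresent(): Option[AnyRef] =
  try {
    val cls = Class.forName(
      "org.apache.hive.service.cli.thrift.EmbeddedThriftBinaryCLIService")  // package assumed
    Some(cls.getDeclaredConstructor().newInstance().asInstanceOf[AnyRef])
  } catch {
    case _: ClassNotFoundException => None  // service module not on the classpath
  }
{code}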



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17519) Transpose column stats display

2017-09-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169270#comment-16169270
 ] 

Hive QA commented on HIVE-17519:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887328/HIVE-17519.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 392 failed/errored test(s), 11041 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[escape_comments] 
(batchId=239)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_12] 
(batchId=239)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_13] 
(batchId=239)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_addpartition_blobstore_to_blobstore]
 (batchId=242)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_addpartition_blobstore_to_local]
 (batchId=242)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_addpartition_blobstore_to_warehouse]
 (batchId=242)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_addpartition_local_to_blobstore]
 (batchId=242)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_blobstore]
 (batchId=242)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_local]
 (batchId=242)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_warehouse]
 (batchId=242)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_local_to_blobstore]
 (batchId=242)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] 
(batchId=51)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter2] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter3] (batchId=20)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter5] (batchId=40)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alterColumnStatsPart] 
(batchId=82)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alterColumnStats] 
(batchId=53)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_file_format] 
(batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_2] 
(batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_2_orc] 
(batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_stats] 
(batchId=58)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_numbuckets_partitioned_table2_h23]
 (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_numbuckets_partitioned_table_h23]
 (batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_change_col]
 (batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_clusterby_sortby]
 (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_coltype] 
(batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_format_loc]
 (batchId=51)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_onto_nocurrent_db]
 (batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_update_status]
 (batchId=86)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_skewed_table] 
(batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_add_partition]
 (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_cascade] 
(batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_column_stats]
 (batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_not_sorted] 
(batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_serde2] 
(batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_serde] 
(batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_stats_status]
 (batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_update_status]
 (batchId=76)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_update_status_disable_bitvector]
 (batchId=76)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_view_as_select] 
(batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_table_null_partition]
 (batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_date] 
(batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_index] 
(batchId=13)

[jira] [Commented] (HIVE-17541) Move testing related methods from MetaStoreUtils to some testing related utility

2017-09-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16169262#comment-16169262
 ] 

Hive QA commented on HIVE-17541:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887348/HIVE-17541.01.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6848/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6848/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6848/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-09-17 10:40:23.402
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-6848/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-09-17 10:40:23.404
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at f772ba3 HIVE-17013: Delete request with a subquery based on 
select over a view (Eugene Koifman, reviewed by Ashutosh Chauhan)
+ git clean -f -d
Removing ql/src/java/org/apache/hadoop/hive/ql/Context.java.orig
Removing 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/QueryPlanPostProcessor.java
Removing standalone-metastore/src/gen/org/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at f772ba3 HIVE-17013: Delete request with a subquery based on 
select over a view (Eugene Koifman, reviewed by Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-09-17 10:40:28.744
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
Going to apply patch with: patch -p0
patching file 
hcatalog/core/src/test/java/org/apache/hive/hcatalog/cli/TestPermsGrp.java
patching file 
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMultiOutputFormat.java
patching file 
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatPartitionPublish.java
patching file hcatalog/webhcat/java-client/pom.xml
patching file 
hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java
patching file 
hcatalog/webhcat/svr/src/test/java/org/apache/hive/hcatalog/templeton/TestWebHCatE2e.java
patching file 
itests/hive-unit-hadoop2/src/test/java/org/apache/hadoop/hive/metastore/security/TestHadoopAuthBridge23.java
patching file 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/AbstractTestAuthorizationApiAuthorizer.java
patching file 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestFilterHooks.java
patching file 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreWithEnvironmentContext.java
patching file 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMarkPartitionRemote.java
patching file 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreAuthorization.java
patching file 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreEndFunctionListener.java
patching file 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java
patching file 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreEventListenerOnlyOnCommit.java
patching file 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreInitListener.java
patching file 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreListenersError.java
patching file 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreMetrics.java
patching 

[jira] [Commented] (HIVE-15899) check CTAS over acid table

2017-09-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16169243#comment-16169243
 ] 

Hive QA commented on HIVE-15899:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887533/HIVE-15899.10.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 11048 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_mask_hash] 
(batchId=28)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2]
 (batchId=89)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=234)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=234)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion02 
(batchId=215)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6847/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6847/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6847/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12887533 - PreCommit-HIVE-Build

> check CTAS over acid table 
> ---
>
> Key: HIVE-15899
> URL: https://issues.apache.org/jira/browse/HIVE-15899
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-15899.01.patch, HIVE-15899.02.patch, 
> HIVE-15899.03.patch, HIVE-15899.04.patch, HIVE-15899.05.patch, 
> HIVE-15899.07.patch, HIVE-15899.08.patch, HIVE-15899.09.patch, 
> HIVE-15899.10.patch
>
>
> Need to add a test to check whether CREATE TABLE AS SELECT (CTAS) works 
> correctly with ACID tables.
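
To make the scenario concrete, the kind of check such a test needs can be
sketched end-to-end over JDBC (rather than Hive's qtest harness). The
connection URL, credentials, and table names below are assumptions:
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Rough standalone check, not the actual test added by the patch.
public class CtasOverAcidCheck {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver"); // older drivers need this
    try (Connection conn = DriverManager.getConnection(
             "jdbc:hive2://localhost:10000/default", "hive", "");
         Statement st = conn.createStatement()) {
      // Full-ACID source table: ORC storage plus transactional=true
      // (assumes the cluster is configured with an ACID transaction manager).
      st.execute("create table acid_src (a int, b string) stored as orc "
          + "tblproperties ('transactional'='true')");
      st.execute("insert into acid_src values (1, 'x'), (2, 'y')");
      // The CTAS under test: it should read a consistent snapshot of acid_src.
      st.execute("create table ctas_tgt as select a, b from acid_src");
      try (ResultSet rs = st.executeQuery("select count(*) from ctas_tgt")) {
        rs.next();
        if (rs.getLong(1) != 2) {
          throw new AssertionError("CTAS copied an unexpected number of rows");
        }
      }
    }
  }
}
{code}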



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17496) Bootstrap repl is not cleaning up staging dirs

2017-09-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16169223#comment-16169223
 ] 

Hive QA commented on HIVE-17496:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887403/HIVE-17496.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 11042 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join_literals] 
(batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_mask_hash] 
(batchId=28)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2]
 (batchId=89)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=215)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testDeleteStagingDir 
(batchId=218)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6846/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6846/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6846/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12887403 - PreCommit-HIVE-Build

> Bootstrap repl is not cleaning up staging dirs
> --
>
> Key: HIVE-17496
> URL: https://issues.apache.org/jira/browse/HIVE-17496
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17496.1.patch, HIVE-17496.2.patch, 
> HIVE-17496.3.patch, HIVE-17496.4.patch
>
>
> Leftover staging directories accumulate over repeated bootstraps, putting 
> more pressure on the HDFS file count limit.
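
The fix amounts to a recursive delete of the staging path once the bootstrap
dump/load completes. A best-effort sketch using the standard Hadoop FileSystem
API; the class and method names are assumptions, not the actual patch:
{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical cleanup helper, not the actual HIVE-17496 change.
public final class StagingDirCleaner {
  /** Best-effort recursive delete of a replication staging directory. */
  public static void deleteStagingDir(Configuration conf, Path stagingDir) {
    try {
      FileSystem fs = stagingDir.getFileSystem(conf);
      if (fs.exists(stagingDir)) {
        fs.delete(stagingDir, true); // true = recursive
      }
    } catch (IOException e) {
      // Cleanup is best-effort: log and continue rather than fail the repl job.
      System.err.println("Could not delete " + stagingDir + ": " + e.getMessage());
    }
  }
}
{code}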



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17422) Skip non-native/temporary tables for all major table/partition related scenarios

2017-09-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16169206#comment-16169206
 ] 

Hive QA commented on HIVE-17422:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887446/HIVE-17422.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 11040 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_mask_hash] 
(batchId=28)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2]
 (batchId=89)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=234)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=234)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=215)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testAlters 
(batchId=218)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testIncrementalLoadWithVariableLengthEventId
 (batchId=218)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testIncrementalRepeatEventOnExistingObject
 (batchId=218)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testIncrementalRepeatEventOnMissingObject
 (batchId=218)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testRenameTableWithCM 
(batchId=218)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testStatus 
(batchId=218)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testTruncateTable 
(batchId=218)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testTruncateWithCM 
(batchId=218)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testViewsReplication 
(batchId=218)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6845/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6845/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6845/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 20 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12887446 - PreCommit-HIVE-Build

> Skip non-native/temporary tables for all major table/partition related 
> scenarios
> 
>
> Key: HIVE-17422
> URL: https://issues.apache.org/jira/browse/HIVE-17422
> Project: Hive
>  Issue Type: Improvement
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17422.1.patch, HIVE-17422.2.patch, 
> HIVE-17422.3.patch
>
>
> Currently, during an incremental dump, non-native/temporary table info is 
> partially dumped into the metadata file and then ignored by the repl load. 
> We can optimize this by moving the check (whether the table should be 
> exported or not) earlier, so that no info at all is saved to the dump file 
> for such tables. CreateTableHandler already has this optimization, so we 
> just need to apply similar logic to the other scenarios.
> The change is to apply the EximUtil.shouldExportTable check to all scenarios 
> (e.g. alter table) that call into the common dump method.
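
Roughly, the proposed guard looks like the sketch below. Only
EximUtil.shouldExportTable is named by this issue; its signature here, and the
surrounding handler and dump method names, are assumptions:
{code:java}
import org.apache.hadoop.hive.ql.metadata.Table;
import org.apache.hadoop.hive.ql.parse.EximUtil;

// Hypothetical shape of an event handler's dump path, not the actual patch.
abstract class EventDumpSketch {
  void dumpTableEvent(Table table) throws Exception {
    // Skip non-native/temporary tables up front instead of writing partial
    // metadata that repl load would only discard later.
    if (!EximUtil.shouldExportTable(table)) {
      return;
    }
    writeTableMetadataAndData(table); // the common dump path
  }

  abstract void writeTableMetadataAndData(Table table) throws Exception;
}
{code}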



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)