[jira] [Commented] (HIVE-17528) Add more q-tests for Hive-on-Spark with Parquet vectorized reader

2017-11-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247089#comment-16247089
 ] 

Hive QA commented on HIVE-17528:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12897012/HIVE-17528.2.patch

{color:green}SUCCESS:{color} +1 due to 29 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 38 failed/errored test(s), 11430 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[add_part_exist] 
(batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter1] (batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter2] (batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter3] (batchId=21)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter4] (batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter5] (batchId=41)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_index] (batchId=21)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_rename_partition] 
(batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_9] 
(batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_show_grant]
 (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cte_5] (batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cte_mat_4] (batchId=6)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cte_mat_5] (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] 
(batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_table_json] 
(batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[drop_table_with_index] 
(batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_creation] 
(batchId=87)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input2] (batchId=87)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input3] (batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_read_backward_compatible_files]
 (batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[rename_column] 
(batchId=21)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[show_tables] (batchId=19)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[temp_table_truncate] 
(batchId=25)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_5] 
(batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_mat_4] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_mat_5] 
(batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[jdbc_handler]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[temp_table] 
(batchId=169)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=95)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=113)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[temp_table] 
(batchId=144)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=208)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testApplyPlanQpChanges 
(batchId=283)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=225)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7761/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7761/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7761/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 38 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12897012 - PreCommit-HIVE-Build

> Add more q-tests for Hive-on-Spark with Parquet vectorized reader
> -
>
> Key: HIVE-17528
> URL: https://issues.apache.org/jira/browse/HIVE-17528
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
> Attachments: HIVE-17528.1.patch, HIVE-17528.2.patch, HIVE-17528.patch
>
>
> 

[jira] [Commented] (HIVE-17976) HoS: don't set output collector if there's no data to process

2017-11-09 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247079#comment-16247079
 ] 

Xuefu Zhang commented on HIVE-17976:


[~lirui] Thanks for working on this. I will take a look.

> HoS: don't set output collector if there's no data to process
> -
>
> Key: HIVE-17976
> URL: https://issues.apache.org/jira/browse/HIVE-17976
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
> Attachments: HIVE-17976.1.patch, HIVE-17976.2.patch
>
>
> MR doesn't set an output collector if no row is processed, i.e. 
> {{ExecMapper::map}} is never called. Let's investigate whether Spark should 
> do the same.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17528) Add more q-tests for Hive-on-Spark with Parquet vectorized reader

2017-11-09 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-17528:

Attachment: HIVE-17528.2.patch

> Add more q-tests for Hive-on-Spark with Parquet vectorized reader
> -
>
> Key: HIVE-17528
> URL: https://issues.apache.org/jira/browse/HIVE-17528
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
> Attachments: HIVE-17528.1.patch, HIVE-17528.2.patch, HIVE-17528.patch
>
>
> Most of the vectorization related q-tests operate on ORC tables using Tez. It 
> would be good to add more coverage on a different combination of engine and 
> file-format. We can model existing q-tests using parquet tables and run it 
> using TestSparkCliDriver





[jira] [Commented] (HIVE-18039) Use ConcurrentHashMap for CachedStore

2017-11-09 Thread Alexander Kolbasov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246961#comment-16246961
 ] 

Alexander Kolbasov commented on HIVE-18039:
---

It is also worth considering 3-rd party Cache implementations - e.g. Caffeine 
(https://github.com/ben-manes/caffeine).

> Use ConcurrentHashMap for CachedStore
> -
>
> Key: HIVE-18039
> URL: https://issues.apache.org/jira/browse/HIVE-18039
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Alexander Kolbasov
>
> SharedCache, used by CachedStore, uses a single big lock to synchronize all 
> access. This looks like overkill; it should be possible to use a 
> ConcurrentHashMap instead. It also makes sense to move deepCopy() operations 
> outside the lock to reduce lock hold times.
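As a rough illustration of the proposal (a hypothetical sketch, not Hive's actual SharedCache/CachedStore API; the key and value types are made up): reads and writes go through a ConcurrentHashMap, and the deep copies happen on the caller's thread, outside any cache-wide lock:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: replace a cache-wide lock with a ConcurrentHashMap
// and perform deep copies outside any lock.
public class Main {
    private static final Map<String, int[]> cache = new ConcurrentHashMap<>();

    static void put(String key, int[] value) {
        cache.put(key, value.clone()); // copy before insert, outside any global lock
    }

    static int[] get(String key) {
        int[] v = cache.get(key);            // lock-striped read; no single big lock
        return v == null ? null : v.clone(); // deep copy after the lookup returns
    }

    public static void main(String[] args) {
        put("db1.tbl1", new int[]{1, 2, 3});
        int[] copy = get("db1.tbl1");
        copy[0] = 99; // mutating the returned copy must not corrupt the cache
        System.out.println(get("db1.tbl1")[0]); // prints 1
    }
}
```

The point of the clone-outside-the-lock pattern is that ConcurrentHashMap only locks internally for the duration of the map operation itself, so expensive copies never extend lock hold times.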





[jira] [Assigned] (HIVE-18029) user mapping - support proper usernames with doAs = false

2017-11-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-18029:
---

Assignee: Sergey Shelukhin

> user mapping - support proper usernames with doAs = false
> -
>
> Key: HIVE-18029
> URL: https://issues.apache.org/jira/browse/HIVE-18029
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Right now, what happens on an unsecured cluster with doAs=false (not sure 
> which one is to blame; didn't look into it, maybe both) is {noformat}
> 2017-11-08T21:39:49,404  INFO [HiveServer2-Background-Pool: Thread-205] 
> tez.WorkloadManagerFederation: Getting a WM session for anonymous
> {noformat}





[jira] [Updated] (HIVE-18002) add group support for pool mappings

2017-11-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18002:

Status: Patch Available  (was: Open)

[~prasanth_j] can you take a look?

> add group support for pool mappings
> ---
>
> Key: HIVE-18002
> URL: https://issues.apache.org/jira/browse/HIVE-18002
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18002.patch
>
>






[jira] [Updated] (HIVE-18002) add group support for pool mappings

2017-11-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18002:

Attachment: HIVE-18002.patch

A simple patch. UGI.getGroups apparently doesn't need to be called on a real 
UGI to be useful; it only takes the user name to fetch groups based on config.
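A minimal sketch of what group-based pool mapping could look like (hypothetical names and lookup tables, not Hive's workload-manager classes): resolve a user-specific mapping first, then fall back to the user's groups, which, per the comment above, can be fetched from the user name alone:

```java
import java.util.*;

// Hypothetical sketch of pool-mapping resolution with group support.
// The maps below stand in for WM config and the group source; they are
// illustrative only, not Hive's actual resource-plan classes.
public class Main {
    static String resolvePool(String user,
                              Map<String, String> userMappings,
                              Map<String, String> groupMappings,
                              Map<String, List<String>> groupsByUser) {
        String pool = userMappings.get(user);
        if (pool != null) return pool;          // user mapping wins
        for (String g : groupsByUser.getOrDefault(user, List.of())) {
            pool = groupMappings.get(g);        // first matching group mapping
            if (pool != null) return pool;
        }
        return "default";                        // fall back to the default pool
    }

    public static void main(String[] args) {
        Map<String, String> users = Map.of("alice", "etl");
        Map<String, String> groups = Map.of("analysts", "bi");
        Map<String, List<String>> membership = Map.of("bob", List.of("analysts"));
        System.out.println(resolvePool("alice", users, groups, membership)); // etl
        System.out.println(resolvePool("bob", users, groups, membership));   // bi
        System.out.println(resolvePool("carol", users, groups, membership)); // default
    }
}
```

The precedence order (user mapping before group mapping) is an assumption for the sketch, not something stated in the patch.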

> add group support for pool mappings
> ---
>
> Key: HIVE-18002
> URL: https://issues.apache.org/jira/browse/HIVE-18002
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18002.patch
>
>






[jira] [Updated] (HIVE-18026) Hive webhcat principal configuration optimization

2017-11-09 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin updated HIVE-18026:
-
Status: Patch Available  (was: Open)

> Hive webhcat principal configuration optimization
> -
>
> Key: HIVE-18026
> URL: https://issues.apache.org/jira/browse/HIVE-18026
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
> Attachments: HIVE-18026.1.patch
>
>
> Hive webhcat principal configuration optimization: when you configure
> {noformat}
> <property>
>   <name>templeton.kerberos.principal</name>
>   <value>HTTP/_HOST@</value>
> </property>
> {noformat}
> the '_HOST' should be replaced by the specific host name.





[jira] [Updated] (HIVE-18026) Hive webhcat principal configuration optimization

2017-11-09 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin updated HIVE-18026:
-
Status: Open  (was: Patch Available)

> Hive webhcat principal configuration optimization
> -
>
> Key: HIVE-18026
> URL: https://issues.apache.org/jira/browse/HIVE-18026
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
> Attachments: HIVE-18026.1.patch
>
>
> Hive webhcat principal configuration optimization: when you configure
> {noformat}
> <property>
>   <name>templeton.kerberos.principal</name>
>   <value>HTTP/_HOST@</value>
> </property>
> {noformat}
> the '_HOST' should be replaced by the specific host name.





[jira] [Updated] (HIVE-17528) Add more q-tests for Hive-on-Spark with Parquet vectorized reader

2017-11-09 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-17528:

Attachment: (was: HIVE-17528.2.patch)

> Add more q-tests for Hive-on-Spark with Parquet vectorized reader
> -
>
> Key: HIVE-17528
> URL: https://issues.apache.org/jira/browse/HIVE-17528
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
> Attachments: HIVE-17528.1.patch, HIVE-17528.patch
>
>
> Most of the vectorization related q-tests operate on ORC tables using Tez. It 
> would be good to add more coverage on a different combination of engine and 
> file-format. We can model existing q-tests using parquet tables and run it 
> using TestSparkCliDriver





[jira] [Updated] (HIVE-17931) Implement Parquet vectorization reader for Array type

2017-11-09 Thread Colin Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Ma updated HIVE-17931:

Attachment: HIVE-17931.004.patch

> Implement Parquet vectorization reader for Array type
> -
>
> Key: HIVE-17931
> URL: https://issues.apache.org/jira/browse/HIVE-17931
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Colin Ma
>Assignee: Colin Ma
> Attachments: HIVE-17931.001.patch, HIVE-17931.002.patch, 
> HIVE-17931.003.patch, HIVE-17931.004.patch
>
>
> The Parquet vectorized reader doesn't support the array type; supporting it 
> should improve performance for queries over array columns.





[jira] [Commented] (HIVE-17528) Add more q-tests for Hive-on-Spark with Parquet vectorized reader

2017-11-09 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246885#comment-16246885
 ] 

Ferdinand Xu commented on HIVE-17528:
-

Hi [~vihangk1], a Review Board request has been created for review. 

> Add more q-tests for Hive-on-Spark with Parquet vectorized reader
> -
>
> Key: HIVE-17528
> URL: https://issues.apache.org/jira/browse/HIVE-17528
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
> Attachments: HIVE-17528.1.patch, HIVE-17528.2.patch, HIVE-17528.patch
>
>
> Most of the vectorization related q-tests operate on ORC tables using Tez. It 
> would be good to add more coverage on a different combination of engine and 
> file-format. We can model existing q-tests using parquet tables and run it 
> using TestSparkCliDriver





[jira] [Updated] (HIVE-17528) Add more q-tests for Hive-on-Spark with Parquet vectorized reader

2017-11-09 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-17528:

Attachment: HIVE-17528.2.patch

> Add more q-tests for Hive-on-Spark with Parquet vectorized reader
> -
>
> Key: HIVE-17528
> URL: https://issues.apache.org/jira/browse/HIVE-17528
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
> Attachments: HIVE-17528.1.patch, HIVE-17528.2.patch, HIVE-17528.patch
>
>
> Most of the vectorization related q-tests operate on ORC tables using Tez. It 
> would be good to add more coverage on a different combination of engine and 
> file-format. We can model existing q-tests using parquet tables and run it 
> using TestSparkCliDriver





[jira] [Updated] (HIVE-16075) MetaStore needs to reinitialize log4j to allow log specific settings via hiveconf take effect

2017-11-09 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16075:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Yunfei!

> MetaStore needs to reinitialize log4j to allow log specific settings via 
> hiveconf take effect 
> --
>
> Key: HIVE-16075
> URL: https://issues.apache.org/jira/browse/HIVE-16075
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: yunfei liu
>Assignee: yunfei liu
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16075.1.patch
>
>
> When I start the Hive metastore with the command:
> {quote}
> {{hive --service metastore -hiveconf hive.log.file=hivemetastore.log 
> -hiveconf hive.log.dir=/home/yun/hive/log}}
> {quote}
> The two log parameters won't take effect because, after the metastore copies 
> hiveconf parameters into Java system properties, it doesn't reinitialize 
> log4j.
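The failure mode described above can be sketched in plain Java (illustrative only; initLogging stands in for Hive's actual log4j setup, and the field stands in for a configured appender path): a value captured at first initialization stays stale until initialization runs again after the system properties are set.

```java
// Hypothetical sketch: logging config captured at init time goes stale
// if overrides are copied into system properties afterwards without a
// re-initialization. Not Hive code; names are illustrative.
public class Main {
    static String logDir = "uninitialized";

    static void initLogging() { // stands in for log4j initialization
        logDir = System.getProperty("hive.log.dir", "/tmp/default");
    }

    public static void main(String[] args) {
        initLogging();                                            // initial init
        System.setProperty("hive.log.dir", "/home/yun/hive/log"); // -hiveconf copy
        System.out.println(logDir);                               // stale value
        initLogging();                                            // the missing re-init
        System.out.println(logDir);                               // override now visible
    }
}
```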





[jira] [Updated] (HIVE-18038) org.apache.hadoop.hive.ql.session.OperationLog - Review

2017-11-09 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-18038:
---
Attachment: HIVE-18038.1.patch

> org.apache.hadoop.hive.ql.session.OperationLog - Review
> ---
>
> Key: HIVE-18038
> URL: https://issues.apache.org/jira/browse/HIVE-18038
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-18038.1.patch
>
>
> Simplifications, improve readability





[jira] [Updated] (HIVE-18038) org.apache.hadoop.hive.ql.session.OperationLog - Review

2017-11-09 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-18038:
---
Status: Patch Available  (was: Open)

> org.apache.hadoop.hive.ql.session.OperationLog - Review
> ---
>
> Key: HIVE-18038
> URL: https://issues.apache.org/jira/browse/HIVE-18038
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-18038.1.patch
>
>
> Simplifications, improve readability





[jira] [Commented] (HIVE-17931) Implement Parquet vectorization reader for Array type

2017-11-09 Thread Colin Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246862#comment-16246862
 ] 

Colin Ma commented on HIVE-17931:
-

[~vihangk1], I'm afraid complex types are not fully supported; I hit the 
problem because of the following code:
[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java#L3261]
List and Map are not supported, so the MapWork can't be vectorized with 
these types.

> Implement Parquet vectorization reader for Array type
> -
>
> Key: HIVE-17931
> URL: https://issues.apache.org/jira/browse/HIVE-17931
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Colin Ma
>Assignee: Colin Ma
> Attachments: HIVE-17931.001.patch, HIVE-17931.002.patch, 
> HIVE-17931.003.patch
>
>
> The Parquet vectorized reader doesn't support the array type; supporting it 
> should improve performance for queries over array columns.





[jira] [Commented] (HIVE-18008) Add optimization rule to remove gby from right side of left semi-join

2017-11-09 Thread Andrew Sherman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246853#comment-16246853
 ] 

Andrew Sherman commented on HIVE-18008:
---

Thanks

> Add optimization rule to remove gby from right side of left semi-join
> -
>
> Key: HIVE-18008
> URL: https://issues.apache.org/jira/browse/HIVE-18008
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-18008.1.patch, HIVE-18008.2.patch
>
>
> A group-by (on the same keys as the semi-join) on the right side of a left 
> semi-join is unnecessary and can be removed. We see this pattern in 
> subqueries with an explicit distinct keyword, e.g.
> {code:sql}
> explain select * from src b where b.key in (select distinct key from src a 
> where a.value = b.value)
> {code}





[jira] [Commented] (HIVE-17193) HoS: don't combine map works that are targets of different DPPs

2017-11-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246852#comment-16246852
 ] 

Hive QA commented on HIVE-17193:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12896851/HIVE-17193.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 11371 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] 
(batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_text] (batchId=73)
org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.org.apache.hadoop.hive.cli.TestContribNegativeCliDriver
 (batchId=240)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=173)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=173)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=102)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testApplyPlanQpChanges 
(batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testReopen (batchId=281)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7742/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7742/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7742/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 17 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12896851 - PreCommit-HIVE-Build

> HoS: don't combine map works that are targets of different DPPs
> ---
>
> Key: HIVE-17193
> URL: https://issues.apache.org/jira/browse/HIVE-17193
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-17193.1.patch
>
>
> Suppose {{srcpart}} is partitioned by {{ds}}. The following query can trigger 
> the issue:
> {code}
> explain
> select * from
>   (select srcpart.ds,srcpart.key from srcpart join src on srcpart.ds=src.key) 
> a
> join
>   (select srcpart.ds,srcpart.key from srcpart join src on 
> srcpart.ds=src.value) b
> on a.key=b.key;
> {code}





[jira] [Commented] (HIVE-17138) FileSinkOperator/Compactor doesn't create empty files for acid path

2017-11-09 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246850#comment-16246850
 ] 

Eugene Koifman commented on HIVE-17138:
---

Check usages of AcidUtils.getAcidState(), specifically when ignoreEmptyFiles=true.

> FileSinkOperator/Compactor doesn't create empty files for acid path
> ---
>
> Key: HIVE-17138
> URL: https://issues.apache.org/jira/browse/HIVE-17138
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> For bucketed tables, FileSinkOperator is expected (in some cases) to produce 
> a specific number of files even if they are empty.
> FileSinkOperator.closeOp(boolean abort) has logic to create files even if 
> empty.
> This doesn't properly work for the Acid path. For Insert, the 
> OrcRecordUpdater(s) is set up in createBucketForFileIdx(), which creates the 
> actual bucketN file (as of HIVE-14007, it does so regardless of whether the 
> RecordUpdater sees any rows). This causes empty (i.e. ORC metadata only) 
> bucket files to be created for multiFileSpray=true if a particular 
> FileSinkOperator.process() sees at least 1 row. For example,
> {noformat}
> create table fourbuckets (a int, b int) clustered by (a) into 4 buckets 
> stored as orc TBLPROPERTIES ('transactional'='true');
> insert into fourbuckets values(0,1),(1,1);
> with mapreduce.job.reduces = 1 or 2 
> {noformat}
> For the Update/Delete path, the OrcRecordWriter is created lazily when the 
> first row that needs to land there is seen. Thus it never creates empty 
> buckets, no matter what the value of _skipFiles_ in closeOp(boolean).
> Once Split Update does the split early (in the operator pipeline), only the 
> Insert path will matter, since base and delta are the only files that split 
> computation, etc. looks at. delete_delta is only for Acid internals, so 
> there is never any reason to create empty files there.
> Also make sure to close RecordUpdaters in FileSinkOperator.abortWriters()





[jira] [Resolved] (HIVE-18008) Add optimization rule to remove gby from right side of left semi-join

2017-11-09 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg resolved HIVE-18008.

Resolution: Fixed

> Add optimization rule to remove gby from right side of left semi-join
> -
>
> Key: HIVE-18008
> URL: https://issues.apache.org/jira/browse/HIVE-18008
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-18008.1.patch, HIVE-18008.2.patch
>
>
> A group-by (on the same keys as the semi-join) on the right side of a left 
> semi-join is unnecessary and can be removed. We see this pattern in 
> subqueries with an explicit distinct keyword, e.g.
> {code:sql}
> explain select * from src b where b.key in (select distinct key from src a 
> where a.value = b.value)
> {code}





[jira] [Commented] (HIVE-18008) Add optimization rule to remove gby from right side of left semi-join

2017-11-09 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246839#comment-16246839
 ] 

Vineet Garg commented on HIVE-18008:


Should be fixed now

> Add optimization rule to remove gby from right side of left semi-join
> -
>
> Key: HIVE-18008
> URL: https://issues.apache.org/jira/browse/HIVE-18008
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-18008.1.patch, HIVE-18008.2.patch
>
>
> A group-by (on the same keys as the semi-join) on the right side of a left 
> semi-join is unnecessary and can be removed. We see this pattern in 
> subqueries with an explicit distinct keyword, e.g.
> {code:sql}
> explain select * from src b where b.key in (select distinct key from src a 
> where a.value = b.value)
> {code}





[jira] [Updated] (HIVE-17935) Turn on hive.optimize.sort.dynamic.partition by default

2017-11-09 Thread Andrew Sherman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman updated HIVE-17935:
--
Attachment: HIVE-17935.4.patch

more test changes

> Turn on hive.optimize.sort.dynamic.partition by default
> ---
>
> Key: HIVE-17935
> URL: https://issues.apache.org/jira/browse/HIVE-17935
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Attachments: HIVE-17935.1.patch, HIVE-17935.2.patch, 
> HIVE-17935.3.patch, HIVE-17935.4.patch
>
>
> The config option hive.optimize.sort.dynamic.partition is an optimization for 
> Hive’s dynamic partitioning feature. It was originally implemented in 
> [HIVE-6455|https://issues.apache.org/jira/browse/HIVE-6455]. With this 
> optimization, the dynamic partition columns and bucketing columns (in case of 
> bucketed tables) are sorted before being fed to the reducers. Since the 
> partitioning and bucketing columns are sorted, each reducer can keep only one 
> record writer open at any time, thereby reducing the memory pressure on the 
> reducers. There were some early problems with this optimization and it was 
> disabled by default in HiveConf in 
> [HIVE-8151|https://issues.apache.org/jira/browse/HIVE-8151]. Since then, 
> setting hive.optimize.sort.dynamic.partition=true has been used to solve 
> problems where dynamic partitioning produces (1) too many small files on 
> HDFS, which is bad for the cluster and can increase overhead for future Hive 
> queries over those partitions, and (2) OOM issues in the map tasks because 
> they try to simultaneously write to 100 different files. 
> It now seems that the feature is probably mature enough that it can be 
> enabled by default.
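The one-writer-at-a-time effect described above can be illustrated with a small, self-contained sketch (not Hive code; the record writers are simulated as a set of open partition keys):

```java
import java.util.*;

// Illustrative sketch: with rows sorted by partition key, a writer can be
// closed as soon as the key changes, so at most one writer is ever open;
// with unsorted input, one writer per distinct partition stays open.
public class Main {
    static int maxOpenWriters(List<String> partitionKeys, boolean sorted) {
        List<String> rows = new ArrayList<>(partitionKeys);
        if (sorted) Collections.sort(rows); // what the optimization does pre-reducer
        Set<String> open = new HashSet<>();
        int max = 0;
        String prev = null;
        for (String key : rows) {
            if (sorted && prev != null && !prev.equals(key)) {
                open.clear(); // previous partition is finished; close its writer
            }
            open.add(key);
            max = Math.max(max, open.size());
            prev = key;
        }
        return max;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("p3", "p1", "p2", "p1", "p3", "p2");
        System.out.println(maxOpenWriters(keys, false)); // 3: one writer per partition
        System.out.println(maxOpenWriters(keys, true));  // 1: sorted input
    }
}
```

The memory saving scales with the number of dynamic partitions: with hundreds of partitions per reducer, hundreds of open ORC writers collapse to one.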





[jira] [Reopened] (HIVE-18008) Add optimization rule to remove gby from right side of left semi-join

2017-11-09 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reopened HIVE-18008:


> Add optimization rule to remove gby from right side of left semi-join
> -
>
> Key: HIVE-18008
> URL: https://issues.apache.org/jira/browse/HIVE-18008
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-18008.1.patch, HIVE-18008.2.patch
>
>
> A group-by (on the same keys as the semi-join) on the right side of a left 
> semi-join is unnecessary and can be removed. We see this pattern in 
> subqueries with an explicit distinct keyword, e.g.
> {code:sql}
> explain select * from src b where b.key in (select distinct key from src a 
> where a.value = b.value)
> {code}





[jira] [Commented] (HIVE-18008) Add optimization rule to remove gby from right side of left semi-join

2017-11-09 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246825#comment-16246825
 ] 

Vineet Garg commented on HIVE-18008:


Let me revert this, sorry about the inconvenience. 

> Add optimization rule to remove gby from right side of left semi-join
> -
>
> Key: HIVE-18008
> URL: https://issues.apache.org/jira/browse/HIVE-18008
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-18008.1.patch, HIVE-18008.2.patch
>
>
> Group by (on same keys as semi join) as right side of Left semi join is 
> unnecessary and could be removed. We see this pattern in subqueries with 
> explicit distinct keyword e.g.
> {code:sql}
> explain select * from src b where b.key in (select distinct key from src a 
> where a.value = b.value)
> {code}





[jira] [Commented] (HIVE-18008) Add optimization rule to remove gby from right side of left semi-join

2017-11-09 Thread Andrew Sherman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246824#comment-16246824
 ] 

Andrew Sherman commented on HIVE-18008:
---

I see the same break, just FYI, not piling on  :-)
I see the new file in the patch, but it is not in the commit:

{noformat}
[~/git/asf/hive]$ git show --name-only ff3b327d322b04916e019fcec75d3fbd48e26bae
commit ff3b327d322b04916e019fcec75d3fbd48e26bae (HEAD -> master, origin/master, 
origin/HEAD)
Author: Vineet Garg 
Date:   Thu Nov 9 15:54:11 2017 -0800

HIVE-18008 : Add optimization rule to remove gby from right side of left 
semi-join (Vineet Garg, reviewed by Ashutosh Chauhan)

ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java
ql/src/test/queries/clientpositive/subquery_in.q
ql/src/test/results/clientpositive/llap/subquery_in.q.out
ql/src/test/results/clientpositive/spark/subquery_in.q.out
ql/src/test/results/clientpositive/subquery_unqualcolumnrefs.q.out
{noformat}

> Add optimization rule to remove gby from right side of left semi-join
> -
>
> Key: HIVE-18008
> URL: https://issues.apache.org/jira/browse/HIVE-18008
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-18008.1.patch, HIVE-18008.2.patch
>
>
> Group by (on same keys as semi join) as right side of Left semi join is 
> unnecessary and could be removed. We see this pattern in subqueries with 
> explicit distinct keyword e.g.
> {code:sql}
> explain select * from src b where b.key in (select distinct key from src a 
> where a.value = b.value)
> {code}





[jira] [Commented] (HIVE-18008) Add optimization rule to remove gby from right side of left semi-join

2017-11-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246823#comment-16246823
 ] 

Sergey Shelukhin commented on HIVE-18008:
-

Appears to have broken the build:
{noformat}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.6.1:compile (default-compile) 
on project hive-exec: Compilation failure
[ERROR] 
/Users/sergey/git/hivegit/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java:[208,57]
 cannot find symbol
[ERROR] symbol:   class HiveRemoveGBYSemiJoinRule
[ERROR] location: package org.apache.hadoop.hive.ql.optimizer.calcite.rules
[ERROR] -> [Help 1]
{noformat}

> Add optimization rule to remove gby from right side of left semi-join
> -
>
> Key: HIVE-18008
> URL: https://issues.apache.org/jira/browse/HIVE-18008
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-18008.1.patch, HIVE-18008.2.patch
>
>
> Group by (on same keys as semi join) as right side of Left semi join is 
> unnecessary and could be removed. We see this pattern in subqueries with 
> explicit distinct keyword e.g.
> {code:sql}
> explain select * from src b where b.key in (select distinct key from src a 
> where a.value = b.value)
> {code}





[jira] [Updated] (HIVE-18008) Add optimization rule to remove gby from right side of left semi-join

2017-11-09 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-18008:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master, thanks for taking a look [~ashutoshc]

> Add optimization rule to remove gby from right side of left semi-join
> -
>
> Key: HIVE-18008
> URL: https://issues.apache.org/jira/browse/HIVE-18008
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-18008.1.patch, HIVE-18008.2.patch
>
>
> Group by (on same keys as semi join) as right side of Left semi join is 
> unnecessary and could be removed. We see this pattern in subqueries with 
> explicit distinct keyword e.g.
> {code:sql}
> explain select * from src b where b.key in (select distinct key from src a 
> where a.value = b.value)
> {code}





[jira] [Updated] (HIVE-18008) Add optimization rule to remove gby from right side of left semi-join

2017-11-09 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-18008:
---
Status: Patch Available  (was: Open)

> Add optimization rule to remove gby from right side of left semi-join
> -
>
> Key: HIVE-18008
> URL: https://issues.apache.org/jira/browse/HIVE-18008
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-18008.1.patch, HIVE-18008.2.patch
>
>
> Group by (on same keys as semi join) as right side of Left semi join is 
> unnecessary and could be removed. We see this pattern in subqueries with 
> explicit distinct keyword e.g.
> {code:sql}
> explain select * from src b where b.key in (select distinct key from src a 
> where a.value = b.value)
> {code}





[jira] [Updated] (HIVE-18008) Add optimization rule to remove gby from right side of left semi-join

2017-11-09 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-18008:
---
Attachment: HIVE-18008.2.patch

> Add optimization rule to remove gby from right side of left semi-join
> -
>
> Key: HIVE-18008
> URL: https://issues.apache.org/jira/browse/HIVE-18008
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-18008.1.patch, HIVE-18008.2.patch
>
>
> Group by (on same keys as semi join) as right side of Left semi join is 
> unnecessary and could be removed. We see this pattern in subqueries with 
> explicit distinct keyword e.g.
> {code:sql}
> explain select * from src b where b.key in (select distinct key from src a 
> where a.value = b.value)
> {code}





[jira] [Updated] (HIVE-18008) Add optimization rule to remove gby from right side of left semi-join

2017-11-09 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-18008:
---
Status: Open  (was: Patch Available)

> Add optimization rule to remove gby from right side of left semi-join
> -
>
> Key: HIVE-18008
> URL: https://issues.apache.org/jira/browse/HIVE-18008
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-18008.1.patch, HIVE-18008.2.patch
>
>
> Group by (on same keys as semi join) as right side of Left semi join is 
> unnecessary and could be removed. We see this pattern in subqueries with 
> explicit distinct keyword e.g.
> {code:sql}
> explain select * from src b where b.key in (select distinct key from src a 
> where a.value = b.value)
> {code}





[jira] [Commented] (HIVE-17856) MM tables - IOW is not ACID compliant

2017-11-09 Thread Steve Yeom (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246762#comment-16246762
 ] 

Steve Yeom commented on HIVE-17856:
---

Right now I am working through the mm_all.q failures from the pre-commit tests 
in detail. I think the run I did a couple of days ago (which was successful) was 
not actually correct. 

> MM tables - IOW is not ACID compliant
> -
>
> Key: HIVE-17856
> URL: https://issues.apache.org/jira/browse/HIVE-17856
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Steve Yeom
>  Labels: mm-gap-1
> Attachments: HIVE-17856.1.patch, HIVE-17856.2.patch, 
> HIVE-17856.3.patch, HIVE-17856.4.patch, HIVE-17856.5.patch
>
>
> The following tests were removed from mm_all during "integration"... I should 
> have never allowed such a manner of integration.
> MM logic should have been kept intact until ACID logic could catch up. Alas, 
> here we are.
> {noformat}
> drop table iow0_mm;
> create table iow0_mm(key int) tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow0_mm select key from intermediate;
> insert into table iow0_mm select key + 1 from intermediate;
> select * from iow0_mm order by key;
> insert overwrite table iow0_mm select key + 2 from intermediate;
> select * from iow0_mm order by key;
> drop table iow0_mm;
> drop table iow1_mm; 
> create table iow1_mm(key int) partitioned by (key2 int)  
> tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow1_mm partition (key2)
> select key as k1, key from intermediate union all select key as k1, key from 
> intermediate;
> insert into table iow1_mm partition (key2)
> select key + 1 as k1, key from intermediate union all select key as k1, key 
> from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key from intermediate union all select key + 4 as k1, 
> key from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key + 3 from intermediate union all select key + 2 as 
> k1, key + 2 from intermediate;
> select * from iow1_mm order by key, key2;
> drop table iow1_mm;
> {noformat}
> {noformat}
> drop table simple_mm;
> create table simple_mm(key int) stored as orc tblproperties 
> ("transactional"="true", "transactional_properties"="insert_only");
> insert into table simple_mm select key from intermediate;
> -insert overwrite table simple_mm select key from intermediate;
> {noformat}





[jira] [Commented] (HIVE-17856) MM tables - IOW is not ACID compliant

2017-11-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246757#comment-16246757
 ] 

Sergey Shelukhin commented on HIVE-17856:
-

Left some comments on RB, mostly minor. Looks like mm_all had a result change 
the last time it ran... let's see for the recent patch.

> MM tables - IOW is not ACID compliant
> -
>
> Key: HIVE-17856
> URL: https://issues.apache.org/jira/browse/HIVE-17856
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Steve Yeom
>  Labels: mm-gap-1
> Attachments: HIVE-17856.1.patch, HIVE-17856.2.patch, 
> HIVE-17856.3.patch, HIVE-17856.4.patch, HIVE-17856.5.patch
>
>
> The following tests were removed from mm_all during "integration"... I should 
> have never allowed such a manner of integration.
> MM logic should have been kept intact until ACID logic could catch up. Alas, 
> here we are.
> {noformat}
> drop table iow0_mm;
> create table iow0_mm(key int) tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow0_mm select key from intermediate;
> insert into table iow0_mm select key + 1 from intermediate;
> select * from iow0_mm order by key;
> insert overwrite table iow0_mm select key + 2 from intermediate;
> select * from iow0_mm order by key;
> drop table iow0_mm;
> drop table iow1_mm; 
> create table iow1_mm(key int) partitioned by (key2 int)  
> tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow1_mm partition (key2)
> select key as k1, key from intermediate union all select key as k1, key from 
> intermediate;
> insert into table iow1_mm partition (key2)
> select key + 1 as k1, key from intermediate union all select key as k1, key 
> from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key from intermediate union all select key + 4 as k1, 
> key from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key + 3 from intermediate union all select key + 2 as 
> k1, key + 2 from intermediate;
> select * from iow1_mm order by key, key2;
> drop table iow1_mm;
> {noformat}
> {noformat}
> drop table simple_mm;
> create table simple_mm(key int) stored as orc tblproperties 
> ("transactional"="true", "transactional_properties"="insert_only");
> insert into table simple_mm select key from intermediate;
> -insert overwrite table simple_mm select key from intermediate;
> {noformat}





[jira] [Assigned] (HIVE-16406) Remove unwanted interning when creating PartitionDesc

2017-11-09 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan reassigned HIVE-16406:
---

Assignee: Rajesh Balamohan

> Remove unwanted interning when creating PartitionDesc
> -
>
> Key: HIVE-16406
> URL: https://issues.apache.org/jira/browse/HIVE-16406
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-16406.1.patch, HIVE-16406.2.patch, 
> HIVE-16406.3.patch, HIVE-16406.profiler.png
>
>
> {{PartitionDesc::getTableDesc}} interns all table description properties by 
> default. But the table description properties are already interned and need 
> not be interned again. 
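
A minimal standalone sketch of why re-interning is wasted work (hypothetical example, not Hive code): calling {{intern()}} on a string that is already the canonical copy just performs a lookup and returns the identical object, so the second pass costs CPU without saving any memory.

{code:java}
public class InternDemo {
    public static void main(String[] args) {
        // First intern() canonicalizes the freshly allocated string.
        String first = new String("hbase.table.name").intern();
        // A second intern() on an already-canonical string is a no-op lookup:
        // it returns the identical object, so re-interning saves nothing.
        String second = first.intern();
        System.out.println(first == second); // prints "true"
    }
}
{code}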





[jira] [Updated] (HIVE-16406) Remove unwanted interning when creating PartitionDesc

2017-11-09 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-16406:

Attachment: HIVE-16406.3.patch

It is relevant [~ashutoshc]. I am attaching the rebased patch. 

> Remove unwanted interning when creating PartitionDesc
> -
>
> Key: HIVE-16406
> URL: https://issues.apache.org/jira/browse/HIVE-16406
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-16406.1.patch, HIVE-16406.2.patch, 
> HIVE-16406.3.patch, HIVE-16406.profiler.png
>
>
> {{PartitionDesc::getTableDesc}} interns all table description properties by 
> default. But the table description properties are already interned and need 
> not be interned again. 





[jira] [Commented] (HIVE-16406) Remove unwanted interning when creating PartitionDesc

2017-11-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246691#comment-16246691
 ] 

Ashutosh Chauhan commented on HIVE-16406:
-

[~rajesh.balamohan] Is this still relevant? If so, can you please rebase.

> Remove unwanted interning when creating PartitionDesc
> -
>
> Key: HIVE-16406
> URL: https://issues.apache.org/jira/browse/HIVE-16406
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-16406.1.patch, HIVE-16406.2.patch, 
> HIVE-16406.profiler.png
>
>
> {{PartitionDesc::getTableDesc}} interns all table description properties by 
> default. But the table description properties are already interned and need 
> not be interned again. 





[jira] [Updated] (HIVE-17904) handle internal Tez AM restart in registry and WM

2017-11-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17904:

Attachment: HIVE-17904.02.patch

A relatively trivial rebase

> handle internal Tez AM restart in registry and WM
> -
>
> Key: HIVE-17904
> URL: https://issues.apache.org/jira/browse/HIVE-17904
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17904.01.patch, HIVE-17904.02.patch, 
> HIVE-17904.patch, HIVE-17904.patch
>
>
> To be done after the plan update patch is committed. The current code doesn't 
> account for it very well; the registry may have races, and an event needs to 
> be added to WM when an AM resets, at least to make sure we discard the update 
> errors that pertain to the old AM. 





[jira] [Updated] (HIVE-17076) typo in itests/src/test/resources/testconfiguration.properties

2017-11-09 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-17076:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Eugene!

> typo in itests/src/test/resources/testconfiguration.properties
> --
>
> Key: HIVE-17076
> URL: https://issues.apache.org/jira/browse/HIVE-17076
> Project: Hive
>  Issue Type: Bug
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 3.0.0
>
> Attachments: HIVE-17076.01.patch
>
>
> it has 
> {noformat}
> minillap.shared.query.files=insert_into1.q,\
>   insert_into2.q,\
>   insert_values_orig_table.,\
>   llapdecider.q,\
> {noformat}
>  "insert_values_orig_table.,\" is a typo which causes these to be run with 
> TestCliDriver
> Note that there are 2 .q files that start with insert_values_orig_table
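
For reference, the corrected entries would presumably look like the following (assuming the two files are insert_values_orig_table.q and insert_values_orig_table_use_metadata.q, the latter of which appears in test output elsewhere in this thread):

{noformat}
minillap.shared.query.files=insert_into1.q,\
  insert_into2.q,\
  insert_values_orig_table.q,\
  insert_values_orig_table_use_metadata.q,\
  llapdecider.q,\
{noformat}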





[jira] [Assigned] (HIVE-18037) Migrate Slider LLAP package to YARN Service framework for Hadoop 3.x

2017-11-09 Thread Gour Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gour Saha reassigned HIVE-18037:


Assignee: Gour Saha

I have tested a few of the changes required to make the Slider LLAP package work 
with YARN Service in Hadoop 3.x. I will assign this jira to myself and provide a 
patch.

> Migrate Slider LLAP package to YARN Service framework for Hadoop 3.x
> 
>
> Key: HIVE-18037
> URL: https://issues.apache.org/jira/browse/HIVE-18037
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gour Saha
>Assignee: Gour Saha
> Fix For: 3.0.0
>
>
> Apache Slider has been migrated to Hadoop-3.x and is referred to as YARN 
> Service (YARN-4692). Most of the classic Slider features are now going to be 
> supported in a first-class manner by core YARN. It includes several new 
> features like a RESTful API. Command line equivalents of classic Slider are 
> supported by YARN Service as well.
> This jira will take care of all changes required to Slider LLAP packaging and 
> scripts to make it work against Hadoop 3.x.





[jira] [Commented] (HIVE-16129) log Tez DAG ID in places

2017-11-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246716#comment-16246716
 ] 

Ashutosh Chauhan commented on HIVE-16129:
-

This can go in now that we are on Tez 0.9.

> log Tez DAG ID in places
> 
>
> Key: HIVE-16129
> URL: https://issues.apache.org/jira/browse/HIVE-16129
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16129.01.patch, HIVE-16129.patch
>
>
> After TEZ-3550, we should be able to log Tez DAG ID early to have 
> queryId-dagId mapping when debugging





[jira] [Updated] (HIVE-17906) use kill query mechanics to kill queries in WM

2017-11-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17906:

Attachment: HIVE-17906.03.patch

> use kill query mechanics to kill queries in WM
> --
>
> Key: HIVE-17906
> URL: https://issues.apache.org/jira/browse/HIVE-17906
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17906.01.patch, HIVE-17906.02.patch, 
> HIVE-17906.03.patch, HIVE-17906.03.patch, HIVE-17906.patch
>
>
> Right now it just closes the session (see HIVE-17841). The sessions would 
> need to be reused after the kill, or closed after the kill if the total QP 
> has decreased





[jira] [Commented] (HIVE-17098) Race condition in Hbase tables

2017-11-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246670#comment-16246670
 ] 

Ashutosh Chauhan commented on HIVE-17098:
-

[~osayankin] There are also changes in getSplit() to do doAs; is that necessary?

> Race condition in Hbase tables
> --
>
> Key: HIVE-17098
> URL: https://issues.apache.org/jira/browse/HIVE-17098
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 2.1.1
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-17098.1.patch
>
>
> These steps simulate our customer's production environment.
> *STEP 1. Create test tables*
> {code}
> CREATE TABLE for_loading(
>   key int, 
>   value string,
>   age int,
>   salary decimal (10,2)
> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
> {code}
> {code}
> CREATE TABLE test_1(
>   key int, 
>   value string,
>   age int,
>   salary decimal (10,2)
> )
> ROW FORMAT SERDE 
>   'org.apache.hadoop.hive.hbase.HBaseSerDe' 
> STORED BY 
>   'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
> WITH SERDEPROPERTIES ( 
>   'hbase.columns.mapping'=':key, cf1:value, cf1:age, cf1:salary', 
>   'serialization.format'='1')
> TBLPROPERTIES (
>   'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}', 
>   'hbase.table.name'='test_1', 
>   'numFiles'='0', 
>   'numRows'='0', 
>   'rawDataSize'='0', 
>   'totalSize'='0', 
>   'transient_lastDdlTime'='1495769316');
> {code}
> {code}
> CREATE TABLE test_2(
>   key int, 
>   value string,
>   age int,
>   salary decimal (10,2)
> )
> ROW FORMAT SERDE 
>   'org.apache.hadoop.hive.hbase.HBaseSerDe' 
> STORED BY 
>   'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
> WITH SERDEPROPERTIES ( 
>   'hbase.columns.mapping'=':key, cf1:value, cf1:age, cf1:salary', 
>   'serialization.format'='1')
> TBLPROPERTIES (
>   'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}', 
>   'hbase.table.name'='test_2', 
>   'numFiles'='0', 
>   'numRows'='0', 
>   'rawDataSize'='0', 
>   'totalSize'='0', 
>   'transient_lastDdlTime'='1495769316');
> {code}
> *STEP 2. Create test data*
> {code}
> import java.io.IOException;
> import java.math.BigDecimal;
> import java.nio.charset.Charset;
> import java.nio.file.Files;
> import java.nio.file.Path;
> import java.nio.file.Paths;
> import java.nio.file.StandardOpenOption;
> import java.util.ArrayList;
> import java.util.Arrays;
> import java.util.List;
> import java.util.Random;
> import static java.lang.String.format;
> public class Generator {
> private static List<String> lines = new ArrayList<>();
> private static List<String> name = Arrays.asList("Brian", "John", 
> "Rodger", "Max", "Freddie", "Albert", "Fedor", "Lev", "Niccolo");
> private static List<BigDecimal> salary = new ArrayList<>();
> public static void main(String[] args) {
> generateData(Integer.parseInt(args[0]), args[1]);
> }
> public static void generateData(int rowNumber, String file) {
> double minValue = 2.55;
> double maxValue = 1000.03;
> Random random = new Random();
> for (int i = 1; i <= rowNumber; i++) {
> lines.add(
> i + "," +
> name.get(random.nextInt(name.size())) + "," +
> (random.nextInt(62) + 18) + "," +
> format("%.2f", (minValue + (maxValue - minValue) * 
> random.nextDouble())));
> }
> Path path = Paths.get(file);
> try {
> Files.write(path, lines, Charset.forName("UTF-8"), 
> StandardOpenOption.CREATE, StandardOpenOption.APPEND);
> } catch (IOException e) {
> e.printStackTrace();
> }
> }
> }
> {code}
> {code}
> javac Generator.java
> java Generator 300 dataset.csv
> hadoop fs -put dataset.csv /
> {code}
> *STEP 3. Upload test data*
> {code}
> load data local inpath '/home/myuser/dataset.csv' into table for_loading;
> {code}
> {code}
> from for_loading
> insert into table test_1
> select key,value,age,salary;
> {code}
> {code}
> from for_loading
> insert into table test_2
> select key,value,age,salary;
> {code}
> *STEP 4. Run test queries*
> Run in 5 parallel terminals for table {{test_1}}
> {code}
> for i in {1..500}; do beeline -u "jdbc:hive2://localhost:1/default 
> testuser1" -e "select * from test_1 limit 10;" 1>/dev/null; done
> {code}
> Run in 5 parallel terminals for table {{test_2}}
> {code}
> for i in {1..500}; do beeline -u "jdbc:hive2://localhost:1/default 
> testuser2" -e "select * from test_2 limit 10;" 1>/dev/null; done
> {code}
> *EXPECTED RESULT:*
> All queries are OK.
> *ACTUAL RESULT*
> {code}
> org.apache.hive.service.cli.HiveSQLException: java.io.IOException: 
> java.lang.IllegalStateException: The input format instance has not been 
> properly initialized. Ensure you 

[jira] [Updated] (HIVE-17906) use kill query mechanics to kill queries in WM

2017-11-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17906:

Attachment: HIVE-17906.03.patch

Rebased the patch based on recent changes to master.

> use kill query mechanics to kill queries in WM
> --
>
> Key: HIVE-17906
> URL: https://issues.apache.org/jira/browse/HIVE-17906
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17906.01.patch, HIVE-17906.02.patch, 
> HIVE-17906.03.patch, HIVE-17906.patch
>
>
> Right now it just closes the session (see HIVE-17841). The sessions would 
> need to be reused after the kill, or closed after the kill if the total QP 
> has decreased





[jira] [Updated] (HIVE-17376) Upgrade snappy version to 1.1.4

2017-11-09 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-17376:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks [~ashutoshc] for reviewing. 

> Upgrade snappy version to 1.1.4
> ---
>
> Key: HIVE-17376
> URL: https://issues.apache.org/jira/browse/HIVE-17376
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 3.0.0
>
> Attachments: HIVE-17376.1.patch
>
>
> Upgrade the snappy java version to 1.1.4. The older version has some issues 
> like memory leak (https://github.com/xerial/snappy-java/issues/91).





[jira] [Updated] (HIVE-16855) org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements

2017-11-09 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-16855:
---
Status: Patch Available  (was: Open)

> org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements
> --
>
> Key: HIVE-16855
> URL: https://issues.apache.org/jira/browse/HIVE-16855
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16855.1.patch, HIVE-16855.2.patch
>
>
> # Improve (Simplify) Logging
> # Remove custom buffer size for {{BufferedInputStream}} and instead rely on 
> JVM default which is often larger these days (8192)
> # Simplify looping logic





[jira] [Commented] (HIVE-16855) org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements

2017-11-09 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246657#comment-16246657
 ] 

BELUGA BEHR commented on HIVE-16855:


The only thing I can't figure out here is:

{code:java}
  private void loadDirectly(MapJoinTableContainer[] mapJoinTables, String 
inputFileName)
  throws Exception {
...
MapJoinTableContainer[] tables = sink.getMapJoinTables();
for (int i = 0; i < sink.getNumParent(); i++) {
  if (sink.getParentOperators().get(i) != null) {
mapJoinTables[i] = tables[i];
  }
}

Arrays.fill(tables, null);
{code}

Why is the 'tables' array being filled with NULL values?  It is poor 
encapsulation that the sink's internal array is being manipulated outside of 
the sink code itself, and there is no comment explaining why it is done.  The 
array may be NULL'ed out to allow for GC, but all of the object references are 
copied into 'mapJoinTables', which is an argument to the method, so the objects 
remain reachable and no GC can occur anyway.  I've just removed this call.
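
A hypothetical standalone sketch of the reasoning above (not Hive code): once the references have been copied into another array, nulling the source array's slots does not make the objects collectable.

{code:java}
import java.util.Arrays;

public class FillDemo {
    public static void main(String[] args) {
        Object[] tables = { new Object(), new Object() };
        Object[] mapJoinTables = new Object[tables.length];
        // Copy the references out, as loadDirectly() copies into mapJoinTables.
        for (int i = 0; i < tables.length; i++) {
            mapJoinTables[i] = tables[i];
        }
        // Nulling the source slots clears only this array's references;
        // the objects stay strongly reachable through mapJoinTables,
        // so this fill() does not enable any garbage collection.
        Arrays.fill(tables, null);
        System.out.println(tables[0] == null);        // prints "true"
        System.out.println(mapJoinTables[0] != null); // prints "true"
    }
}
{code}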

> org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements
> --
>
> Key: HIVE-16855
> URL: https://issues.apache.org/jira/browse/HIVE-16855
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16855.1.patch, HIVE-16855.2.patch
>
>
> # Improve (Simplify) Logging
> # Remove custom buffer size for {{BufferedInputStream}} and instead rely on 
> JVM default which is often larger these days (8192)
> # Simplify looping logic





[jira] [Updated] (HIVE-16855) org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements

2017-11-09 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-16855:
---
Attachment: HIVE-16855.2.patch

> org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements
> --
>
> Key: HIVE-16855
> URL: https://issues.apache.org/jira/browse/HIVE-16855
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16855.1.patch, HIVE-16855.2.patch
>
>
> # Improve (Simplify) Logging
> # Remove custom buffer size for {{BufferedInputStream}} and instead rely on 
> JVM default which is often larger these days (8192)
> # Simplify looping logic





[jira] [Updated] (HIVE-16855) org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements

2017-11-09 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-16855:
---
Status: Open  (was: Patch Available)

> org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements
> --
>
> Key: HIVE-16855
> URL: https://issues.apache.org/jira/browse/HIVE-16855
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16855.1.patch, HIVE-16855.2.patch
>
>
> # Improve (Simplify) Logging
> # Remove custom buffer size for {{BufferedInputStream}} and instead rely on 
> JVM default which is often larger these days (8192)
> # Simplify looping logic





[jira] [Commented] (HIVE-17995) Run checkstyle on standalone-metastore module with proper configuration

2017-11-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246612#comment-16246612
 ] 

Hive QA commented on HIVE-17995:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12896848/HIVE-17995.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 11374 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] 
(batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery 
(batchId=233)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7741/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7741/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7741/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12896848 - PreCommit-HIVE-Build

> Run checkstyle on standalone-metastore module with proper configuration
> ---
>
> Key: HIVE-17995
> URL: https://issues.apache.org/jira/browse/HIVE-17995
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Adam Szita
>Assignee: Adam Szita
> Attachments: HIVE-17995.0.patch, HIVE-17995.1.patch
>
>
> Maven module standalone-metastore is obviously not connected to the Hive root 
> pom; therefore, if someone (or an automated Yetus check) runs {{mvn 
> checkstyle}}, it will not consider Hive-specific checkstyle settings (e.g. it 
> validates row lengths against 80, not 100).
> We need to make sure the standalone-metastore pom has the proper checkstyle 
> configuration.





[jira] [Updated] (HIVE-18028) fix WM based on cluster smoke test; add logging

2017-11-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18028:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master; thanks for the review!

> fix WM based on cluster smoke test; add logging
> ---
>
> Key: HIVE-18028
> URL: https://issues.apache.org/jira/browse/HIVE-18028
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0
>
> Attachments: HIVE-18028.patch
>
>






[jira] [Commented] (HIVE-18008) Add optimization rule to remove gby from right side of left semi-join

2017-11-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246592#comment-16246592
 ] 

Ashutosh Chauhan commented on HIVE-18008:
-

Scratch that. Patch LGTM.
+1

> Add optimization rule to remove gby from right side of left semi-join
> -
>
> Key: HIVE-18008
> URL: https://issues.apache.org/jira/browse/HIVE-18008
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-18008.1.patch
>
>
> Group by (on same keys as semi join) as right side of Left semi join is 
> unnecessary and could be removed. We see this pattern in subqueries with 
> explicit distinct keyword e.g.
> {code:sql}
> explain select * from src b where b.key in (select distinct key from src a 
> where a.value = b.value)
> {code}
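The intuition behind this rule can be sketched outside Hive: a left semi-join only tests key membership on the right side, so deduplicating the right side first (the explicit {{distinct}}/group-by) cannot change the result. An illustrative plain-Java model (not Hive's optimizer code; names are invented for the example):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SemiJoinDistinctSketch {
    // Left semi-join: keep left rows whose key appears anywhere on the
    // right. Only membership matters, so right-side duplicates are moot.
    static List<Integer> semiJoin(List<Integer> left, List<Integer> right) {
        Set<Integer> keys = new HashSet<>(right); // membership test only
        List<Integer> out = new ArrayList<>();
        for (Integer row : left) {
            if (keys.contains(row)) {
                out.add(row);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> left = Arrays.asList(1, 2, 3, 4);
        List<Integer> right = Arrays.asList(2, 2, 4, 4, 4); // duplicates
        List<Integer> deduped = Arrays.asList(2, 4);          // "distinct"

        // Same result with or without deduplicating the right side,
        // which is why the group-by under the semi-join is removable.
        assert semiJoin(left, right).equals(semiJoin(left, deduped));
        assert semiJoin(left, right).equals(Arrays.asList(2, 4));
    }
}
```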





[jira] [Updated] (HIVE-18028) fix WM based on cluster smoke test; add logging

2017-11-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18028:

Summary: fix WM based on cluster smoke test; add logging  (was: fix stuff 
and add logging - part 1)

> fix WM based on cluster smoke test; add logging
> ---
>
> Key: HIVE-18028
> URL: https://issues.apache.org/jira/browse/HIVE-18028
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18028.patch
>
>






[jira] [Updated] (HIVE-17856) MM tables - IOW is not ACID compliant

2017-11-09 Thread Steve Yeom (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Yeom updated HIVE-17856:
--
Attachment: HIVE-17856.5.patch

> MM tables - IOW is not ACID compliant
> -
>
> Key: HIVE-17856
> URL: https://issues.apache.org/jira/browse/HIVE-17856
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Steve Yeom
>  Labels: mm-gap-1
> Attachments: HIVE-17856.1.patch, HIVE-17856.2.patch, 
> HIVE-17856.3.patch, HIVE-17856.4.patch, HIVE-17856.5.patch
>
>
> The following tests were removed from mm_all during "integration"... I should 
> never have allowed such a manner of integration.
> MM logic should have been kept intact until ACID logic could catch up. Alas, 
> here we are.
> {noformat}
> drop table iow0_mm;
> create table iow0_mm(key int) tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow0_mm select key from intermediate;
> insert into table iow0_mm select key + 1 from intermediate;
> select * from iow0_mm order by key;
> insert overwrite table iow0_mm select key + 2 from intermediate;
> select * from iow0_mm order by key;
> drop table iow0_mm;
> drop table iow1_mm; 
> create table iow1_mm(key int) partitioned by (key2 int)  
> tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow1_mm partition (key2)
> select key as k1, key from intermediate union all select key as k1, key from 
> intermediate;
> insert into table iow1_mm partition (key2)
> select key + 1 as k1, key from intermediate union all select key as k1, key 
> from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key from intermediate union all select key + 4 as k1, 
> key from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key + 3 from intermediate union all select key + 2 as 
> k1, key + 2 from intermediate;
> select * from iow1_mm order by key, key2;
> drop table iow1_mm;
> {noformat}
> {noformat}
> drop table simple_mm;
> create table simple_mm(key int) stored as orc tblproperties 
> ("transactional"="true", "transactional_properties"="insert_only");
> insert into table simple_mm select key from intermediate;
> -insert overwrite table simple_mm select key from intermediate;
> {noformat}





[jira] [Commented] (HIVE-18029) user mapping - support proper usernames with doAs = false

2017-11-09 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246362#comment-16246362
 ] 

Thejas M Nair commented on HIVE-18029:
--

The user name in the session is the end user name. In non-Kerberos mode, beeline 
passes "anonymous" if you didn't specify a username.
Use beeline -n <username> to specify a particular user.


> user mapping - support proper usernames with doAs = false
> -
>
> Key: HIVE-18029
> URL: https://issues.apache.org/jira/browse/HIVE-18029
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>
> Right now, what happens on an unsecure cluster with doAs=false (not sure which 
> component is to blame - didn't look into it, maybe both) is {noformat}
> 2017-11-08T21:39:49,404  INFO [HiveServer2-Background-Pool: Thread-205] 
> tez.WorkloadManagerFederation: Getting a WM session for anonymous
> {noformat}





[jira] [Commented] (HIVE-18030) HCatalog can't be used with Pig on Spark

2017-11-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246354#comment-16246354
 ] 

Hive QA commented on HIVE-18030:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12896846/HIVE-18030.0.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 11359 tests 
executed
*Failed tests:*
{noformat}
TestMiniLlapCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=146)

[intersect_all.q,unionDistinct_1.q,orc_ppd_schema_evol_3a.q,table_nonprintable.q,tez_union_dynamic_partition.q,tez_union_dynamic_partition_2.q,temp_table_external.q,global_limit.q,llap_udf.q,schemeAuthority.q,cte_2.q,rcfile_createas1.q,dynamic_partition_pruning_2.q,intersect_merge.q,parallel_colstats.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] 
(batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=102)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag[3]
 (batchId=187)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7740/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7740/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7740/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12896846 - PreCommit-HIVE-Build

> HCatalog can't be used with Pig on Spark
> 
>
> Key: HIVE-18030
> URL: https://issues.apache.org/jira/browse/HIVE-18030
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: Adam Szita
>Assignee: Adam Szita
> Attachments: HIVE-18030.0.patch
>
>
> When using Pig on Spark in cluster mode, all queries containing HCatalog 
> access are failing:
> {code}
> 2017-11-03 12:39:19,268 [dispatcher-event-loop-19] INFO  
> org.apache.spark.storage.BlockManagerInfo - Added broadcast_6_piece0 in 
> memory on <>:<> (size: 83.0 KB, free: 408.5 
> MB)
> 2017-11-03 12:39:19,277 [task-result-getter-0] WARN  
> org.apache.spark.scheduler.TaskSetManager - Lost task 0.0 in stage 0.0 (TID 
> 0, <>, executor 2): java.lang.NullPointerException
>   at org.apache.hadoop.security.Credentials.addAll(Credentials.java:401)
>   at org.apache.hadoop.security.Credentials.addAll(Credentials.java:388)
>   at 
> org.apache.hive.hcatalog.pig.HCatLoader.setLocation(HCatLoader.java:128)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.mergeSplitSpecificConf(PigInputFormat.java:147)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat$RecordReaderFactory.<init>(PigInputFormat.java:115)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.running.PigInputFormatSpark$SparkRecordReaderFactory.<init>(PigInputFormatSpark.java:126)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.running.PigInputFormatSpark.createRecordReader(PigInputFormatSpark.java:70)
>   at 
> org.apache.spark.rdd.NewHadoopRDD$$anon$1.liftedTree1$1(NewHadoopRDD.scala:180)
>   at 
> org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:179)
>   at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:134)
>   at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:69)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)

[jira] [Commented] (HIVE-18027) Explore the option of using AM's application id or attempt id for AM recovery

2017-11-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246341#comment-16246341
 ] 

Sergey Shelukhin commented on HIVE-18027:
-

Update: Tez creates a new app attempt when doing the recovery. We could expose 
that from AMs into the registry.

> Explore the option of using AM's application id or attempt id for AM recovery
> -
>
> Key: HIVE-18027
> URL: https://issues.apache.org/jira/browse/HIVE-18027
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>
> HIVE-17904 uses sequence numbers generated by ZK to identify new AMs. Instead, 
> use the application id, the attempt id, or some ID that the AM generated during 
> its recovery. 





[jira] [Updated] (HIVE-14069) update curator version to 2.10.0

2017-11-09 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-14069:
--
Attachment: HIVE-14069.4.patch

Updating the curator version to 2.12.0, and the shade options on a couple of 
modules, to make sure all curator deps are shaded.

> update curator version to 2.10.0 
> -
>
> Key: HIVE-14069
> URL: https://issues.apache.org/jira/browse/HIVE-14069
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Metastore
>Reporter: Thejas M Nair
>Assignee: Jason Dere
> Attachments: HIVE-14069.1.patch, HIVE-14069.2.patch, 
> HIVE-14069.3.patch, HIVE-14069.4.patch
>
>
> curator-2.10.0 has several bug fixes over current version (2.6.0), updating 
> would help improve stability.





[jira] [Updated] (HIVE-14069) update curator version to 2.12.0

2017-11-09 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-14069:
--
Summary: update curator version to 2.12.0   (was: update curator version to 
2.10.0 )

> update curator version to 2.12.0 
> -
>
> Key: HIVE-14069
> URL: https://issues.apache.org/jira/browse/HIVE-14069
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Metastore
>Reporter: Thejas M Nair
>Assignee: Jason Dere
> Attachments: HIVE-14069.1.patch, HIVE-14069.2.patch, 
> HIVE-14069.3.patch, HIVE-14069.4.patch
>
>
> curator-2.10.0 has several bug fixes over current version (2.6.0), updating 
> would help improve stability.





[jira] [Commented] (HIVE-18029) user mapping - support proper usernames with doAs = false

2017-11-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246334#comment-16246334
 ] 

Sergey Shelukhin commented on HIVE-18029:
-

[~thejas] do you know if this is intended? Unsecure cluster, doAs=false, the 
username that is passed thru the session and stored in session state is 
"anonymous". I'm connecting via beeline.

> user mapping - support proper usernames with doAs = false
> -
>
> Key: HIVE-18029
> URL: https://issues.apache.org/jira/browse/HIVE-18029
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>
> Right now, what happens on an unsecure cluster with doAs=false (not sure which 
> component is to blame - didn't look into it, maybe both) is {noformat}
> 2017-11-08T21:39:49,404  INFO [HiveServer2-Background-Pool: Thread-205] 
> tez.WorkloadManagerFederation: Getting a WM session for anonymous
> {noformat}





[jira] [Assigned] (HIVE-16756) Vectorization: LongColModuloLongColumn throws "java.lang.ArithmeticException: / by zero"

2017-11-09 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned HIVE-16756:
--

Assignee: Vihang Karajgaonkar  (was: Matt McCline)

> Vectorization: LongColModuloLongColumn throws "java.lang.ArithmeticException: 
> / by zero"
> 
>
> Key: HIVE-16756
> URL: https://issues.apache.org/jira/browse/HIVE-16756
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.0
>Reporter: Matt McCline
>Assignee: Vihang Karajgaonkar
>Priority: Critical
>
> vectorization_div0.q needs to add testing for the long data type.
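The digest doesn't show the fix, but the guard a vectorized long-modulo expression needs can be sketched in plain Java (illustrative only; this is not Hive's actual LongColModuloLongColumn code, and the NULL-marking convention is an assumption):

```java
public class LongColModuloSketch {
    // Evaluates dividend[i] % divisor[i] per row, but marks rows with a
    // zero divisor as NULL instead of evaluating v % 0, which would throw
    // ArithmeticException ("/ by zero") and abort the whole batch.
    static long[] moduloWithZeroGuard(long[] dividend, long[] divisor,
                                      boolean[] isNull) {
        long[] out = new long[dividend.length];
        for (int i = 0; i < dividend.length; i++) {
            if (divisor[i] == 0) {
                isNull[i] = true; // SQL-style: x % 0 yields NULL
                out[i] = 0;       // value is ignored when isNull is set
            } else {
                out[i] = dividend[i] % divisor[i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        boolean[] isNull = new boolean[3];
        long[] out = moduloWithZeroGuard(new long[] {7, 9, 5},
                                         new long[] {2, 0, 3}, isNull);
        assert out[0] == 1 && out[2] == 2;
        assert isNull[1]; // row with zero divisor marked NULL, no exception
    }
}
```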





[jira] [Updated] (HIVE-16756) Vectorization: LongColModuloLongColumn throws "java.lang.ArithmeticException: / by zero"

2017-11-09 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16756:
---
Affects Version/s: 2.3.0

> Vectorization: LongColModuloLongColumn throws "java.lang.ArithmeticException: 
> / by zero"
> 
>
> Key: HIVE-16756
> URL: https://issues.apache.org/jira/browse/HIVE-16756
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.0
>Reporter: Matt McCline
>Assignee: Vihang Karajgaonkar
>Priority: Critical
>
> vectorization_div0.q needs to add testing for the long data type.





[jira] [Commented] (HIVE-16756) Vectorization: LongColModuloLongColumn throws "java.lang.ArithmeticException: / by zero"

2017-11-09 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246306#comment-16246306
 ] 

Vihang Karajgaonkar commented on HIVE-16756:


Thanks [~mmccline] Let me take a look. Reassigning to myself.

> Vectorization: LongColModuloLongColumn throws "java.lang.ArithmeticException: 
> / by zero"
> 
>
> Key: HIVE-16756
> URL: https://issues.apache.org/jira/browse/HIVE-16756
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> vectorization_div0.q needs to add testing for the long data type.





[jira] [Commented] (HIVE-17528) Add more q-tests for Hive-on-Spark with Parquet vectorized reader

2017-11-09 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246300#comment-16246300
 ] 

Vihang Karajgaonkar commented on HIVE-17528:


Thanks [~Ferd] for the patch. Can you please post a Review Board link? The 
patch is large and hard to review without RB.

> Add more q-tests for Hive-on-Spark with Parquet vectorized reader
> -
>
> Key: HIVE-17528
> URL: https://issues.apache.org/jira/browse/HIVE-17528
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
> Attachments: HIVE-17528.1.patch, HIVE-17528.patch
>
>
> Most of the vectorization related q-tests operate on ORC tables using Tez. It 
> would be good to add more coverage on a different combination of engine and 
> file-format. We can model existing q-tests using parquet tables and run it 
> using TestSparkCliDriver





[jira] [Commented] (HIVE-17931) Implement Parquet vectorization reader for Array type

2017-11-09 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246298#comment-16246298
 ] 

Vihang Karajgaonkar commented on HIVE-17931:


I thought vectorization support for complex types is available on master based 
on what I see here 
https://github.com/apache/hive/blob/master/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L2891

Is that not true?

> Implement Parquet vectorization reader for Array type
> -
>
> Key: HIVE-17931
> URL: https://issues.apache.org/jira/browse/HIVE-17931
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Colin Ma
>Assignee: Colin Ma
> Attachments: HIVE-17931.001.patch, HIVE-17931.002.patch, 
> HIVE-17931.003.patch
>
>
> The Parquet vectorized reader can't support the array type; it should be 
> supported to improve performance for queries involving array types. 
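For context on what "vectorizing" an array column involves: vectorized readers typically represent a list column as one flat child array plus per-row offset/length pairs, which is the general layout behind Hive's ListColumnVector. The sketch below is a self-contained illustration of that layout, not Hive's actual class:

```java
public class ArrayColumnSketch {
    long[] child;  // flattened element values for all rows
    int[] offsets; // start index of each row's array within 'child'
    int[] lengths; // element count of each row's array

    // Fetch element 'idx' of the array in row 'row'.
    long elementAt(int row, int idx) {
        return child[offsets[row] + idx];
    }

    public static void main(String[] args) {
        // Three rows holding the arrays [1,2], [3], [4,5].
        ArrayColumnSketch v = new ArrayColumnSketch();
        v.child = new long[] {1, 2, 3, 4, 5};
        v.offsets = new int[] {0, 2, 3};
        v.lengths = new int[] {2, 1, 2};
        assert v.elementAt(0, 1) == 2;
        assert v.elementAt(2, 0) == 4;
    }
}
```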





[jira] [Commented] (HIVE-18012) fix ct_noperm_loc test

2017-11-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246204#comment-16246204
 ] 

Hive QA commented on HIVE-18012:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12896826/HIVE-18012.001.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 11371 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] 
(batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=102)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query39] 
(batchId=245)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7739/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7739/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7739/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12896826 - PreCommit-HIVE-Build

> fix ct_noperm_loc test
> --
>
> Key: HIVE-18012
> URL: https://issues.apache.org/jira/browse/HIVE-18012
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Akira Ajisaka
> Attachments: HIVE-18012.001.patch
>
>
> The goal of the test is to check that Hive doesn't let user1 create a 
> table with a location under an unowned path.
> I've bisected this test to be broken by 
> 5250ef450430fcdeed0a2cb7a770f48647987cd3 (HIVE-12408).
> The original exception (which has since been masked by that sole line) was:
> {code}
> FAILED: HiveAccessControlException Permission denied: Principal [name=user1, 
> type=USER] does not have following privileges for operation CREATETABLE 
> [[OBJECT OWNERSHIP] on Object [type=DFS_URI, 
> name=hdfs://localhost:35753/tmp/ct_noperm_loc_foo0]]
> {code}
> The current SemanticException shouldn't be accepted, because it's unrelated 
> to the test's goal.





[jira] [Commented] (HIVE-17856) MM tables - IOW is not ACID compliant

2017-11-09 Thread Steve Yeom (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246192#comment-16246192
 ] 

Steve Yeom commented on HIVE-17856:
---

FYI, I am actively working on the pre-commit test failures.

> MM tables - IOW is not ACID compliant
> -
>
> Key: HIVE-17856
> URL: https://issues.apache.org/jira/browse/HIVE-17856
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Steve Yeom
>  Labels: mm-gap-1
> Attachments: HIVE-17856.1.patch, HIVE-17856.2.patch, 
> HIVE-17856.3.patch, HIVE-17856.4.patch
>
>
> The following tests were removed from mm_all during "integration"... I should 
> never have allowed such a manner of integration.
> MM logic should have been kept intact until ACID logic could catch up. Alas, 
> here we are.
> {noformat}
> drop table iow0_mm;
> create table iow0_mm(key int) tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow0_mm select key from intermediate;
> insert into table iow0_mm select key + 1 from intermediate;
> select * from iow0_mm order by key;
> insert overwrite table iow0_mm select key + 2 from intermediate;
> select * from iow0_mm order by key;
> drop table iow0_mm;
> drop table iow1_mm; 
> create table iow1_mm(key int) partitioned by (key2 int)  
> tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow1_mm partition (key2)
> select key as k1, key from intermediate union all select key as k1, key from 
> intermediate;
> insert into table iow1_mm partition (key2)
> select key + 1 as k1, key from intermediate union all select key as k1, key 
> from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key from intermediate union all select key + 4 as k1, 
> key from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key + 3 from intermediate union all select key + 2 as 
> k1, key + 2 from intermediate;
> select * from iow1_mm order by key, key2;
> drop table iow1_mm;
> {noformat}
> {noformat}
> drop table simple_mm;
> create table simple_mm(key int) stored as orc tblproperties 
> ("transactional"="true", "transactional_properties"="insert_only");
> insert into table simple_mm select key from intermediate;
> -insert overwrite table simple_mm select key from intermediate;
> {noformat}





[jira] [Updated] (HIVE-16495) ColumnStats merge should consider the accuracy of the current stats

2017-11-09 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-16495:

Resolution: Duplicate
Status: Resolved  (was: Patch Available)

fixed in HIVE-16827

> ColumnStats merge should consider the accuracy of the current stats
> ---
>
> Key: HIVE-16495
> URL: https://issues.apache.org/jira/browse/HIVE-16495
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Zoltan Haindrich
> Attachments: HIVE-16495.01.patch, HIVE-16495.02.patch, 
> HIVE-16495.03.patch, HIVE-16495.04.patch
>
>






[jira] [Updated] (HIVE-17971) Possible misuse of getDataSizeFromColumnStats

2017-11-09 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-17971:

Issue Type: Sub-task  (was: Bug)
Parent: HIVE-11160

> Possible misuse of getDataSizeFromColumnStats
> -
>
> Key: HIVE-17971
> URL: https://issues.apache.org/jira/browse/HIVE-17971
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Zoltan Haindrich
>
> The same method's return value is both
> [used as 
> {{stats.setDataSize()}}|https://github.com/apache/hive/blob/10aa33072316a23ab7e21dd8d9d78a3ae1664b9a/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L512]
> and
> [used as 
> {{stats.addToDataSize()}}|https://github.com/apache/hive/blob/10aa33072316a23ab7e21dd8d9d78a3ae1664b9a/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L498]
> which seems odd...





[jira] [Updated] (HIVE-18005) Improve size estimation for array() to be not 0

2017-11-09 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-18005:

Issue Type: Sub-task  (was: Bug)
Parent: HIVE-11160

> Improve size estimation for array() to be not 0
> ---
>
> Key: HIVE-18005
> URL: https://issues.apache.org/jira/browse/HIVE-18005
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Zoltan Haindrich
>
> This happens only when the array is not from a column and the array contains 
> no column references:
> {code}
> EXPLAIN
> SELECT sort_array(array("b", "d", "c", "a")),array("1","2") FROM t
> ...
>  Statistics: Num rows: 1 Data size: 0 Basic stats: COMPLETE 
> Column stats: COMPLETE
>  ListSink
> {code}





[jira] [Commented] (HIVE-6590) Hive does not work properly with boolean partition columns (wrong results and inserts to incorrect HDFS path)

2017-11-09 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246186#comment-16246186
 ] 

Zoltan Haindrich commented on HIVE-6590:


[~ashutoshc] I've filed the followup a long time ago; and fortunately, there's 
already a patch to fix that in HIVE-15939 :)

> Hive does not work properly with boolean partition columns (wrong results and 
> inserts to incorrect HDFS path)
> -
>
> Key: HIVE-6590
> URL: https://issues.apache.org/jira/browse/HIVE-6590
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema, Metastore
>Affects Versions: 0.10.0
>Reporter: Lenni Kuff
>Assignee: Zoltan Haindrich
> Fix For: 3.0.0
>
> Attachments: HIVE-6590.1.patch, HIVE-6590.2.patch, HIVE-6590.3.patch, 
> HIVE-6590.4.patch, HIVE-6590.5.patch, HIVE-6590.5.patch
>
>
> Hive does not work properly with boolean partition columns. Queries return 
> wrong results and also insert to incorrect HDFS paths.
> {code}
> create table bool_part(int_col int) partitioned by(bool_col boolean);
> # This works, creating 3 unique partitions!
> ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE);
> ALTER TABLE bool_table ADD PARTITION (bool_col=false);
> ALTER TABLE bool_table ADD PARTITION (bool_col=False);
> {code}
> The first problem is that Hive cannot filter on a bool partition key column. 
> "select * from bool_part" returns the correct results, but if you apply a 
> filter on the bool partition key column hive won't return any results.
> The second problem is that Hive seems to just call "toString()" on the 
> boolean literal value. This means you can end up with multiple partitions 
> (FALSE, false, FaLSE, etc) mapping to the literal value 'FALSE'. For example, 
> you can add three partitions in Hive for the same logical value "false" by doing:
> ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE) -> 
> /test-warehouse/bool_table/bool_col=FALSE/
> ALTER TABLE bool_table ADD PARTITION (bool_col=false) -> 
> /test-warehouse/bool_table/bool_col=false/
> ALTER TABLE bool_table ADD PARTITION (bool_col=False) -> 
> /test-warehouse/bool_table/bool_col=False/
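One possible fix, sketched below: canonicalize the boolean partition literal to a single spelling before deriving the HDFS path, so FALSE, false, and False all map to the same partition directory. This is a hypothetical helper for illustration, not the HIVE-6590 patch itself.

```java
// Canonicalize a boolean partition literal so path derivation is case-stable.
public class BoolPartitionPath {
    static String partitionPath(String table, String col, String literal) {
        // Boolean.parseBoolean is case-insensitive; toString yields "true"/"false"
        String canonical = Boolean.toString(Boolean.parseBoolean(literal));
        return "/test-warehouse/" + table + "/" + col + "=" + canonical + "/";
    }

    public static void main(String[] args) {
        System.out.println(partitionPath("bool_table", "bool_col", "FALSE"));
        System.out.println(partitionPath("bool_table", "bool_col", "False"));
    }
}
```

Both calls print the same {{.../bool_col=false/}} path, collapsing the three spellings into one partition.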





[jira] [Commented] (HIVE-15939) Make cast expressions comply more to sql2011

2017-11-09 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246182#comment-16246182
 ] 

Zoltan Haindrich commented on HIVE-15939:
-

now that HIVE-6590 is in; this could go in as well - IIRC the patch is already 
good
+1 pending tests

> Make cast expressions comply more to sql2011
> 
>
> Key: HIVE-15939
> URL: https://issues.apache.org/jira/browse/HIVE-15939
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Zoltan Haindrich
>Assignee: Teddy Choi
> Attachments: HIVE-15939.1.patch, HIVE-15939.2.patch, 
> HIVE-15939.3.patch, HIVE-15939.4.patch, HIVE-15939.5.patch
>
>
> in HIVE-6590 Jason has uncovered the fact that UDFToBoolean treats all 
> non-empty strings as true.
> It would be great to have the cast expressions closer to the standard...at 
> least when there is an expected behaviour from the user;
> like {{cast('false' as boolean)}} should be false.
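A SQL:2011-leaning string-to-boolean cast, sketched below, accepts only 'true'/'false' (trimmed, case-insensitive) and rejects everything else — unlike UDFToBoolean's "any non-empty string is true" behavior. This is a hypothetical helper illustrating the standard's semantics, not Hive's implementation.

```java
// Strict string->boolean cast in the spirit of SQL:2011.
public class SqlBooleanCast {
    static Boolean castToBoolean(String s) {
        String t = s.trim().toLowerCase();
        if (t.equals("true"))  return Boolean.TRUE;
        if (t.equals("false")) return Boolean.FALSE;
        // the standard raises a data exception for any other literal
        throw new IllegalArgumentException("invalid boolean literal: " + s);
    }

    public static void main(String[] args) {
        System.out.println(castToBoolean("false"));  // false, not true
        System.out.println(castToBoolean(" TRUE ")); // whitespace and case tolerated
    }
}
```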





[jira] [Updated] (HIVE-18009) Multiple lateral view query is slow on hive on spark

2017-11-09 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-18009:

Attachment: HIVE-18009.3.patch

patch-3: update the test result for TestCliDriver and address the feedback from 
code review.

> Multiple lateral view query is slow on hive on spark
> 
>
> Key: HIVE-18009
> URL: https://issues.apache.org/jira/browse/HIVE-18009
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-18009.1.patch, HIVE-18009.2.patch, 
> HIVE-18009.3.patch
>
>
> When running a query with multiple lateral views, HoS is busy with 
> compilation. GenSparkUtils has an inefficient implementation of 
> getChildOperator when we have a diamond hierarchy in the operator tree (lateral 
> view in this case), since the same node may be visited multiple times.
> {noformat}
> at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:442)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   ... (the frame at GenSparkUtils.java:438 repeats dozens more times)
> {noformat}
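The memoization fix implied by the description above can be sketched as follows: track visited nodes so a diamond-shaped operator DAG is walked once per node instead of once per path (which grows exponentially with stacked diamonds). This is an illustrative sketch, not the actual HIVE-18009 patch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Visited-set traversal of a DAG built from stacked diamonds.
public class DiamondWalk {
    static class Op {
        final String name;
        final List<Op> children = new ArrayList<>();
        Op(String name) { this.name = name; }
    }

    static int visitCount = 0;

    static void walk(Op op, Set<Op> visited) {
        if (!visited.add(op)) return; // already reached via another branch
        visitCount++;
        for (Op c : op.children) walk(c, visited);
    }

    public static void main(String[] args) {
        // Build 10 stacked diamonds: each level forks into (l, r) and rejoins at j.
        Op top = new Op("ts");
        Op cur = top;
        int nodes = 1;
        for (int i = 0; i < 10; i++) {
            Op l = new Op("l" + i), r = new Op("r" + i), j = new Op("j" + i);
            cur.children.add(l); cur.children.add(r);
            l.children.add(j);   r.children.add(j);
            cur = j; nodes += 3;
        }
        walk(top, new HashSet<>());
        System.out.println(visitCount + " visits for " + nodes + " nodes");
    }
}
```

Without the visited set, the same DAG would require 2^10 path traversals; with it, each node is visited exactly once.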

[jira] [Updated] (HIVE-17911) org.apache.hadoop.hive.metastore.ObjectStore - Tune Up

2017-11-09 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-17911:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Beluga!

> org.apache.hadoop.hive.metastore.ObjectStore - Tune Up
> --
>
> Key: HIVE-17911
> URL: https://issues.apache.org/jira/browse/HIVE-17911
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-17911.3.patch, HIVE-17911.4.patch, 
> HIVE-17911.5.patch, HIVE-17911.6.patch
>
>
> # Remove unused variables
> # Add logging parameterization
> # Use CollectionUtils.isEmpty/isNotEmpty to simplify and unify collection 
> empty check (and always use null check)
> # Minor tweaks





[jira] [Comment Edited] (HIVE-17964) HoS: some spark configs doesn't require re-creating a session

2017-11-09 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246136#comment-16246136
 ] 

Xuefu Zhang edited comment on HIVE-17964 at 11/9/17 5:47 PM:
-

{quote}
I think renaming a bunch of configs is not very user friendly. Maybe we should 
differentiate these configs in our code.
{quote}
+1. Probably we can add a new param that lists the params that require a 
session refresh.


was (Author: xuefuz):
{quote}
I think renaming a bunch of configs is not very user friendly. Maybe we should 
differentiate these configs in our code.
{quote}
Probably we can add a new param that lists the params that require a session 
refresh.

> HoS: some spark configs doesn't require re-creating a session
> -
>
> Key: HIVE-17964
> URL: https://issues.apache.org/jira/browse/HIVE-17964
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
>
> I guess the {{hive.spark.}} configs were initially intended for the RSC. 
> Therefore when they're changed, we'll re-create the session for them to take 
> effect. There're some configs not related to RSC that also start with 
> {{hive.spark.}}. We'd better rename them so that we don't unnecessarily 
> re-create sessions, which is usually time consuming.





[jira] [Commented] (HIVE-17964) HoS: some spark configs doesn't require re-creating a session

2017-11-09 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246136#comment-16246136
 ] 

Xuefu Zhang commented on HIVE-17964:


{quote}
I think renaming a bunch of configs is not very user friendly. Maybe we should 
differentiate these configs in our code.
{quote}
Probably we can add a new param that lists the params that require a session 
refresh.

> HoS: some spark configs doesn't require re-creating a session
> -
>
> Key: HIVE-17964
> URL: https://issues.apache.org/jira/browse/HIVE-17964
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
>
> I guess the {{hive.spark.}} configs were initially intended for the RSC. 
> Therefore when they're changed, we'll re-create the session for them to take 
> effect. There're some configs not related to RSC that also start with 
> {{hive.spark.}}. We'd better rename them so that we don't unnecessarily 
> re-create sessions, which is usually time consuming.
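The suggestion above — an explicit list of config names that force re-creating the Spark session — could look like the sketch below. The entries in the set are made-up examples; the real list would have to be curated from the actual RSC-related configs.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Only refresh the session when a changed key is on the curated list.
public class SessionRefreshCheck {
    static final Set<String> REFRESH_CONFIGS = new HashSet<>(Arrays.asList(
        "hive.spark.client.connect.timeout",          // illustrative entries only
        "hive.spark.client.server.connect.timeout"));

    static boolean requiresSessionRefresh(Set<String> changedKeys) {
        for (String k : changedKeys) {
            if (REFRESH_CONFIGS.contains(k)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(requiresSessionRefresh(
            Collections.singleton("hive.spark.client.connect.timeout")));
        System.out.println(requiresSessionRefresh(
            Collections.singleton("hive.spark.some.local.knob")));
    }
}
```

This avoids renaming existing {{hive.spark.}} configs while still skipping an expensive session re-creation for configs that don't touch the RSC.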





[jira] [Commented] (HIVE-17911) org.apache.hadoop.hive.metastore.ObjectStore - Tune Up

2017-11-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246124#comment-16246124
 ] 

Ashutosh Chauhan commented on HIVE-17911:
-

+1

> org.apache.hadoop.hive.metastore.ObjectStore - Tune Up
> --
>
> Key: HIVE-17911
> URL: https://issues.apache.org/jira/browse/HIVE-17911
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-17911.3.patch, HIVE-17911.4.patch, 
> HIVE-17911.5.patch, HIVE-17911.6.patch
>
>
> # Remove unused variables
> # Add logging parameterization
> # Use CollectionUtils.isEmpty/isNotEmpty to simplify and unify collection 
> empty check (and always use null check)
> # Minor tweaks





[jira] [Commented] (HIVE-17942) HiveAlterHandler not using conf from HMS Handler

2017-11-09 Thread Janaki Lahorani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246104#comment-16246104
 ] 

Janaki Lahorani commented on HIVE-17942:


@akolb Changing AlterHandler to be not configurable is a significant change - 
potentially backward incompatible.  There can be other implementations that can 
be used other than HiveAlterHandler.
https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L485
Also, though this can be considered as a good code cleanup, it is in theory 
cosmetic and not needed for this bug fix.
And, the story about the way HMS Handler works with DDL handlers is a design 
detail that is common to all DDL handlers, hence doesn't necessarily belong in 
any DDL handler, but rather in a design document or possibly in HMS Handler 
implementation, which again IMHO is outside the scope of this bug fix.

There is an example in the test.

Configuration changes within a connection are to be made using
set <key>=<value>
If the configuration change is to be effective in HMS, it is to be prepended 
with metaconf:
set metaconf:<key>=<value>
Example:
set metaconf:hive.metastore.disallow.incompatible.col.type.changes=false


> HiveAlterHandler not using conf from HMS Handler
> 
>
> Key: HIVE-17942
> URL: https://issues.apache.org/jira/browse/HIVE-17942
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Fix For: 3.0.0
>
> Attachments: HIVE-17942.1.patch, HIVE-17942.2.patch, 
> HIVE-17942.3.patch, HIVE-17942.4.patch, HIVE-17942.5.patch
>
>
> When HiveAlterHandler looks for conf, it is not getting the one from thread 
> local.  So, local changes are not visible.
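The thread-local lookup described above can be sketched with a minimal stand-in: per-connection configuration is read from a {{ThreadLocal}} rather than a field cached at handler construction, so {{set metaconf:...}} changes made on the connection's thread become visible. The map-based conf below is an assumption for illustration, not HMS's actual Configuration plumbing.

```java
import java.util.HashMap;
import java.util.Map;

// Per-thread config visibility: each thread sees its own overrides.
public class ThreadLocalConfDemo {
    static final ThreadLocal<Map<String, String>> CONF =
        ThreadLocal.withInitial(HashMap::new);

    static String get(String key, String dflt) {
        return CONF.get().getOrDefault(key, dflt);
    }

    public static void main(String[] args) throws Exception {
        // this thread's "connection" overrides the default
        CONF.get().put("hive.metastore.disallow.incompatible.col.type.changes", "false");
        System.out.println(get("hive.metastore.disallow.incompatible.col.type.changes", "true"));
        // another thread (another connection) still sees the default
        Thread t = new Thread(() -> System.out.println(
            get("hive.metastore.disallow.incompatible.col.type.changes", "true")));
        t.start();
        t.join();
    }
}
```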





[jira] [Comment Edited] (HIVE-17942) HiveAlterHandler not using conf from HMS Handler

2017-11-09 Thread Janaki Lahorani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246104#comment-16246104
 ] 

Janaki Lahorani edited comment on HIVE-17942 at 11/9/17 5:26 PM:
-

[~akolb] Changing AlterHandler to be not configurable is a significant change - 
potentially backward incompatible.  There can be other implementations that can 
be used other than HiveAlterHandler.
https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L485
Also, though this can be considered as a good code cleanup, it is in theory 
cosmetic and not needed for this bug fix.
And, the story about the way HMS Handler works with DDL handlers is a design 
detail that is common to all DDL handlers, hence doesn't necessarily belong in 
any DDL handler, but rather in a design document or possibly in HMS Handler 
implementation, which again IMHO is outside the scope of this bug fix.

There is an example in the test.

Configuration changes within a connection are to be made using
set <key>=<value>
If the configuration change is to be effective in HMS, it is to be prepended 
with metaconf:
set metaconf:<key>=<value>
Example:
set metaconf:hive.metastore.disallow.incompatible.col.type.changes=false



was (Author: janulatha):
@akolb Changing AlterHandler to be not configurable is a significant change - 
potentially backward incompatible.  There can be other implementations that can 
be used other than HiveAlterHandler.
https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L485
Also, though this can be considered as a good code cleanup, it is in theory 
cosmetic and not needed for this bug fix.
And, the story about the way HMS Handler works with DDL handlers is a design 
detail that is common to all DDL handlers, hence doesn't necessarily belong in 
any DDL handler, but rather in a design document or possibly in HMS Handler 
implementation, which again IMHO is outside the scope of this bug fix.

There is an example in the test.

Configuration changes within a connection are to be made using
set <key>=<value>
If the configuration change is to be effective in HMS, it is to be prepended 
with metaconf:
set metaconf:<key>=<value>
Example:
set metaconf:hive.metastore.disallow.incompatible.col.type.changes=false


> HiveAlterHandler not using conf from HMS Handler
> 
>
> Key: HIVE-17942
> URL: https://issues.apache.org/jira/browse/HIVE-17942
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Fix For: 3.0.0
>
> Attachments: HIVE-17942.1.patch, HIVE-17942.2.patch, 
> HIVE-17942.3.patch, HIVE-17942.4.patch, HIVE-17942.5.patch
>
>
> When HiveAlterHandler looks for conf, it is not getting the one from thread 
> local.  So, local changes are not visible.





[jira] [Updated] (HIVE-17934) Merging Statistics are promoted to COMPLETE (most of the time)

2017-11-09 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-17934:

Attachment: HIVE-17934.06.patch

#6) accepted all q.out-s because they are improvements or caused by known 
caveats

> Merging Statistics are promoted to COMPLETE (most of the time)
> --
>
> Key: HIVE-17934
> URL: https://issues.apache.org/jira/browse/HIVE-17934
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17934.01.patch, HIVE-17934.02.patch, 
> HIVE-17934.03.patch, HIVE-17934.04.patch, HIVE-17934.05.patch, 
> HIVE-17934.06.patch, HIVE-17934.06wip01.patch
>
>
> in case multiple partition statistics are merged the STATS state is computed 
> based on the datasize and rowcount;
> the merge may hide away non-existent stats in case there are other partitions 
> or operators which do contribute to the datasize and the rowcount.
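The state-demotion idea can be sketched as follows: derive the merged state from the per-partition states rather than from the summed sizes, so one partition with missing stats demotes the merge even when the other partitions contribute enough bytes and rows to look "complete". The enum and merge rule below are an illustrative assumption, not Hive's Statistics.State logic verbatim.

```java
import java.util.Arrays;
import java.util.List;

// Merge rule: COMPLETE only if every input is COMPLETE; NONE only if all are NONE.
public class StatsMerge {
    enum State { COMPLETE, PARTIAL, NONE }

    static State mergeStates(List<State> parts) {
        if (parts.stream().allMatch(s -> s == State.COMPLETE)) return State.COMPLETE;
        if (parts.stream().allMatch(s -> s == State.NONE))     return State.NONE;
        return State.PARTIAL;
    }

    public static void main(String[] args) {
        System.out.println(mergeStates(Arrays.asList(State.COMPLETE, State.COMPLETE)));
        // a partition with no stats demotes the merge, regardless of sizes
        System.out.println(mergeStates(Arrays.asList(State.COMPLETE, State.NONE)));
    }
}
```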





[jira] [Commented] (HIVE-6590) Hive does not work properly with boolean partition columns (wrong results and inserts to incorrect HDFS path)

2017-11-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246091#comment-16246091
 ] 

Ashutosh Chauhan commented on HIVE-6590:


[~kgyrtkirk] I think we shall be consistent and change UDFToBoolean also along 
similar lines. Can you take that up in a follow-up?

> Hive does not work properly with boolean partition columns (wrong results and 
> inserts to incorrect HDFS path)
> -
>
> Key: HIVE-6590
> URL: https://issues.apache.org/jira/browse/HIVE-6590
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema, Metastore
>Affects Versions: 0.10.0
>Reporter: Lenni Kuff
>Assignee: Zoltan Haindrich
> Fix For: 3.0.0
>
> Attachments: HIVE-6590.1.patch, HIVE-6590.2.patch, HIVE-6590.3.patch, 
> HIVE-6590.4.patch, HIVE-6590.5.patch, HIVE-6590.5.patch
>
>
> Hive does not work properly with boolean partition columns. Queries return 
> wrong results and also insert to incorrect HDFS paths.
> {code}
> create table bool_part(int_col int) partitioned by(bool_col boolean);
> # This works, creating 3 unique partitions!
> ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE);
> ALTER TABLE bool_table ADD PARTITION (bool_col=false);
> ALTER TABLE bool_table ADD PARTITION (bool_col=False);
> {code}
> The first problem is that Hive cannot filter on a bool partition key column. 
> "select * from bool_part" returns the correct results, but if you apply a 
> filter on the bool partition key column hive won't return any results.
> The second problem is that Hive seems to just call "toString()" on the 
> boolean literal value. This means you can end up with multiple partitions 
> (FALSE, false, FaLSE, etc) mapping to the literal value 'FALSE'. For example, 
> you can add three partitions in Hive for the same logical value "false" by doing:
> ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE) -> 
> /test-warehouse/bool_table/bool_col=FALSE/
> ALTER TABLE bool_table ADD PARTITION (bool_col=false) -> 
> /test-warehouse/bool_table/bool_col=false/
> ALTER TABLE bool_table ADD PARTITION (bool_col=False) -> 
> /test-warehouse/bool_table/bool_col=False/





[jira] [Commented] (HIVE-18030) HCatalog can't be used with Pig on Spark

2017-11-09 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246089#comment-16246089
 ] 

Xuefu Zhang commented on HIVE-18030:


Patch looks fine. However, I'm wondering if it would be better to set 
mapred.task.id in Pig on Spark, as Hive on Spark sets it already.

> HCatalog can't be used with Pig on Spark
> 
>
> Key: HIVE-18030
> URL: https://issues.apache.org/jira/browse/HIVE-18030
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: Adam Szita
>Assignee: Adam Szita
> Attachments: HIVE-18030.0.patch
>
>
> When using Pig on Spark in cluster mode, all queries containing HCatalog 
> access are failing:
> {code}
> 2017-11-03 12:39:19,268 [dispatcher-event-loop-19] INFO  
> org.apache.spark.storage.BlockManagerInfo - Added broadcast_6_piece0 in 
> memory on <>:<> (size: 83.0 KB, free: 408.5 
> MB)
> 2017-11-03 12:39:19,277 [task-result-getter-0] WARN  
> org.apache.spark.scheduler.TaskSetManager - Lost task 0.0 in stage 0.0 (TID 
> 0, <>, executor 2): java.lang.NullPointerException
>   at org.apache.hadoop.security.Credentials.addAll(Credentials.java:401)
>   at org.apache.hadoop.security.Credentials.addAll(Credentials.java:388)
>   at 
> org.apache.hive.hcatalog.pig.HCatLoader.setLocation(HCatLoader.java:128)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.mergeSplitSpecificConf(PigInputFormat.java:147)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat$RecordReaderFactory.<init>(PigInputFormat.java:115)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.running.PigInputFormatSpark$SparkRecordReaderFactory.<init>(PigInputFormatSpark.java:126)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.running.PigInputFormatSpark.createRecordReader(PigInputFormatSpark.java:70)
>   at 
> org.apache.spark.rdd.NewHadoopRDD$$anon$1.liftedTree1$1(NewHadoopRDD.scala:180)
>   at 
> org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:179)
>   at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:134)
>   at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:69)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
>   at org.apache.spark.scheduler.Task.run(Task.scala:108)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}





[jira] [Updated] (HIVE-6590) Hive does not work properly with boolean partition columns (wrong results and inserts to incorrect HDFS path)

2017-11-09 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6590:
---
   Resolution: Fixed
 Hadoop Flags: Incompatible change
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Zoltan!

> Hive does not work properly with boolean partition columns (wrong results and 
> inserts to incorrect HDFS path)
> -
>
> Key: HIVE-6590
> URL: https://issues.apache.org/jira/browse/HIVE-6590
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema, Metastore
>Affects Versions: 0.10.0
>Reporter: Lenni Kuff
>Assignee: Zoltan Haindrich
> Fix For: 3.0.0
>
> Attachments: HIVE-6590.1.patch, HIVE-6590.2.patch, HIVE-6590.3.patch, 
> HIVE-6590.4.patch, HIVE-6590.5.patch, HIVE-6590.5.patch
>
>
> Hive does not work properly with boolean partition columns. Queries return 
> wrong results and also insert to incorrect HDFS paths.
> {code}
> create table bool_part(int_col int) partitioned by(bool_col boolean);
> # This works, creating 3 unique partitions!
> ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE);
> ALTER TABLE bool_table ADD PARTITION (bool_col=false);
> ALTER TABLE bool_table ADD PARTITION (bool_col=False);
> {code}
> The first problem is that Hive cannot filter on a bool partition key column. 
> "select * from bool_part" returns the correct results, but if you apply a 
> filter on the bool partition key column hive won't return any results.
> The second problem is that Hive seems to just call "toString()" on the 
> boolean literal value. This means you can end up with multiple partitions 
> (FALSE, false, FaLSE, etc) mapping to the literal value 'FALSE'. For example, 
> you can add three partitions in Hive for the same logical value "false" by doing:
> ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE) -> 
> /test-warehouse/bool_table/bool_col=FALSE/
> ALTER TABLE bool_table ADD PARTITION (bool_col=false) -> 
> /test-warehouse/bool_table/bool_col=false/
> ALTER TABLE bool_table ADD PARTITION (bool_col=False) -> 
> /test-warehouse/bool_table/bool_col=False/





[jira] [Commented] (HIVE-17931) Implement Parquet vectorization reader for Array type

2017-11-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246076#comment-16246076
 ] 

Hive QA commented on HIVE-17931:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12896820/HIVE-17931.003.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7738/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7738/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7738/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-11-09 17:08:46.927
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-7738/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-11-09 17:08:46.931
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 812d757 HIVE-18010 : Update hbase version (Ashutosh Chauhan via 
Zoltan Haindrich)
+ git clean -f -d
Removing ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommandsForMmTable.java
Removing ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommandsForOrcMmTable.java
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 812d757 HIVE-18010 : Update hbase version (Ashutosh Chauhan via 
Zoltan Haindrich)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-11-09 17:08:52.332
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
Going to apply patch with: patch -p1
patching file 
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/BaseVectorizedColumnReader.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedListColumnReader.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedParquetRecordReader.java
Hunk #2 succeeded at 495 (offset -14 lines).
patching file 
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedPrimitiveColumnReader.java
patching file 
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestVectorizedColumnReader.java
Hunk #1 succeeded at 133 with fuzz 2 (offset 24 lines).
patching file 
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestVectorizedDictionaryEncodingColumnReader.java
patching file 
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/VectorizedColumnReaderTestBase.java
Hunk #2 succeeded at 116 (offset 1 line).
Hunk #3 succeeded at 243 (offset 5 lines).
Hunk #4 succeeded at 306 (offset 5 lines).
Hunk #5 succeeded at 570 (offset 5 lines).
+ [[ maven == \m\a\v\e\n ]]
+ rm -rf /data/hiveptest/working/maven/org/apache/hive
+ mvn -B clean install -DskipTests -T 4 -q -Dmaven.repo.local=/data/hiveptest/working/maven
protoc-jar: protoc version: 250, detected platform: linux/amd64
protoc-jar: executing: [/tmp/protoc7386819152450114916.exe, -I/data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore, --java_out=/data/hiveptest/working/apache-github-source-source/standalone-metastore/target/generated-sources, /data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore/metastore.proto]
ANTLR Parser Generator  Version 3.5.2
Output file /data/hiveptest/working/apache-github-source-source/standalone-metastore/target/generated-sources/org/apache/hadoop/hive/metastore/parser/FilterParser.java does not exist: must build /data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/parser/Filter.g

[jira] [Commented] (HIVE-17856) MM tables - IOW is not ACID compliant

2017-11-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246071#comment-16246071
 ] 

Hive QA commented on HIVE-17856:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12896813/HIVE-17856.4.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 11382 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] (batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] (batchId=76)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_all] (batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_loaddata] (batchId=45)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_all] (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dp_counter_mm] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=156)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc] (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=111)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query64] (batchId=243)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=206)
org.apache.hadoop.hive.ql.TestTxnCommands.testNonAcidToAcidConversion01 (batchId=288)
org.apache.hadoop.hive.ql.TestTxnCommands.testTimeOutReaper (batchId=288)
org.apache.hadoop.hive.ql.TestTxnCommandsForMmTable.testInsertOverwriteWithDynamicPartition (batchId=254)
org.apache.hadoop.hive.ql.TestTxnCommandsForOrcMmTable.testInsertOverwriteWithDynamicPartition (batchId=272)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testCTAS (batchId=276)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testInsertToAcidWithUnionRemove (batchId=276)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testNoBuckets (batchId=276)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testNonAcidToAcidVectorzied (batchId=276)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testToAcidConversion02 (batchId=276)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testToAcidConversionMultiBucket (batchId=276)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7737/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7737/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7737/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12896813 - PreCommit-HIVE-Build

> MM tables - IOW is not ACID compliant
> -
>
> Key: HIVE-17856
> URL: https://issues.apache.org/jira/browse/HIVE-17856
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Steve Yeom
>  Labels: mm-gap-1
> Attachments: HIVE-17856.1.patch, HIVE-17856.2.patch, 
> HIVE-17856.3.patch, HIVE-17856.4.patch
>
>
> The following tests were removed from mm_all during "integration"... I should have never allowed such manner of integration.
> MM logic should have been kept intact until ACID logic could catch up. Alas, here we are.
> {noformat}
> drop table iow0_mm;
> create table iow0_mm(key int) tblproperties("transactional"="true", "transactional_properties"="insert_only");
> insert overwrite table iow0_mm select key from intermediate;
> insert into table iow0_mm select key + 1 from intermediate;
> select * from iow0_mm order by key;
> insert overwrite table iow0_mm select key + 2 from intermediate;
> select * from iow0_mm order by key;
> drop table iow0_mm;
> drop table iow1_mm;
> create table iow1_mm(key int) partitioned by (key2 int) tblproperties("transactional"="true", "transactional_properties"="insert_only");
> insert overwrite table iow1_mm partition (key2)
> select key as k1, key from intermediate union all select key as k1, key from intermediate;
> insert into table iow1_mm partition (key2)
> select key + 1 as k1, key from intermediate union all select key as k1, key from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite 

[jira] [Updated] (HIVE-17942) HiveAlterHandler not using conf from HMS Handler

2017-11-09 Thread Janaki Lahorani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani updated HIVE-17942:
---
Attachment: HIVE-17942.5.patch

Added comments to HiveAlterHandler.java and TestHiveMetaStoreAlterColumnPar.

> HiveAlterHandler not using conf from HMS Handler
> 
>
> Key: HIVE-17942
> URL: https://issues.apache.org/jira/browse/HIVE-17942
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Fix For: 3.0.0
>
> Attachments: HIVE-17942.1.patch, HIVE-17942.2.patch, 
> HIVE-17942.3.patch, HIVE-17942.4.patch, HIVE-17942.5.patch
>
>
> When HiveAlterHandler looks for conf, it is not getting the one from thread 
> local.  So, local changes are not visible.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17472) Drop-partition for multi-level partition fails, if data does not exist.

2017-11-09 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17472:

Fix Version/s: 2.3.2

> Drop-partition for multi-level partition fails, if data does not exist.
> ---
>
> Key: HIVE-17472
> URL: https://issues.apache.org/jira/browse/HIVE-17472
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Chris Drome
> Fix For: 3.0.0, 2.4.0, 2.2.1, 2.3.2
>
> Attachments: HIVE-17472.1.patch, HIVE-17472.2-branch-2.patch, 
> HIVE-17472.2.patch, HIVE-17472.3-branch-2.2.patch, 
> HIVE-17472.3-branch-2.patch, HIVE-17472.3.patch
>
>
> Raising this on behalf of [~cdrome] and [~selinazh]. 
> Here's how to reproduce the problem:
> {code:sql}
> CREATE TABLE foobar ( foo STRING, bar STRING ) PARTITIONED BY ( dt STRING, 
> region STRING ) STORED AS RCFILE LOCATION '/tmp/foobar';
> ALTER TABLE foobar ADD PARTITION ( dt='1', region='A' ) ;
> dfs -rm -R -skipTrash /tmp/foobar/dt=1;
> ALTER TABLE foobar DROP PARTITION ( dt='1' );
> {code}
> This causes a client-side error as follows:
> {code}
> 15/02/26 23:08:32 ERROR exec.DDLTask: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unknown error. Please check 
> logs.
> {code}





[jira] [Updated] (HIVE-17891) HIVE-13076 uses create table if not exists for the postgres script

2017-11-09 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17891:

Fix Version/s: 2.3.2

> HIVE-13076 uses create table if not exists for the postgres script
> --
>
> Key: HIVE-17891
> URL: https://issues.apache.org/jira/browse/HIVE-17891
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Fix For: 3.0.0, 2.4.0, 2.3.2
>
> Attachments: HIVE-17891.01.patch, HIVE-17891.02.patch, 
> HIVE-17891.03.patch, HIVE-17891.04.patch
>
>
> HIVE-13076 adds a new table to the schema but the patch script uses {{CREATE 
> TABLE IF NOT EXISTS}} syntax to add the new table. The issue is that the {{IF 
> NOT EXISTS}} clause is only available from Postgres 9.1 onwards, so the 
> script will fail for older versions of Postgres.





[jira] [Updated] (HIVE-17640) Comparison of date return null if time part is provided in string.

2017-11-09 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17640:

Fix Version/s: 2.3.2

> Comparison of date return null if time part is provided in string.
> --
>
> Key: HIVE-17640
> URL: https://issues.apache.org/jira/browse/HIVE-17640
> Project: Hive
>  Issue Type: Bug
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Fix For: 2.4.0, 2.3.2
>
> Attachments: HIVE-17640.01-branch-2.patch
>
>
> Reproduce:
> select '2017-01-01 00:00:00' < current_date;
> INFO  : OK
> ...
> 1 row selected (18.324 seconds)
> ...
>  NULL





[jira] [Updated] (HIVE-17189) Fix backwards incompatibility in HiveMetaStoreClient

2017-11-09 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17189:

Fix Version/s: 2.3.2

> Fix backwards incompatibility in HiveMetaStoreClient
> 
>
> Key: HIVE-17189
> URL: https://issues.apache.org/jira/browse/HIVE-17189
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.1.1
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Fix For: 3.0.0, 2.4.0, 2.3.2
>
> Attachments: HIVE-17189.01.patch, HIVE-17189.02.patch
>
>
> HIVE-12730 adds the ability to edit the basic stats using {{alter table}} and 
> {{alter partition}} commands. However, it changes the signature of @public 
> interface of MetastoreClient and removes some methods which breaks backwards 
> compatibility. This can be fixed easily by re-introducing the removed methods 
> and making them call into the newly added method 
> {{alter_table_with_environment_context}}.





[jira] [Updated] (HIVE-16991) HiveMetaStoreClient needs a 2-arg constructor for backwards compatibility

2017-11-09 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16991:

Fix Version/s: 2.3.2

> HiveMetaStoreClient needs a 2-arg constructor for backwards compatibility
> -
>
> Key: HIVE-16991
> URL: https://issues.apache.org/jira/browse/HIVE-16991
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Fix For: 3.0.0, 2.4.0, 2.3.2
>
> Attachments: HIVE-16991.1.patch
>
>
> Some client code that is not easy to change uses a 2-arg constructor on 
> HiveMetaStoreClient.
> It is trivial and safe to add this constructor:
> {noformat}
> public HiveMetaStoreClient(HiveConf conf, HiveMetaHookLoader hookLoader) 
> throws MetaException {
> this(conf, hookLoader, true);
> }
> {noformat}





[jira] [Updated] (HIVE-16487) Serious Zookeeper exception is logged when a race condition happens

2017-11-09 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16487:

Fix Version/s: 2.3.2

> Serious Zookeeper exception is logged when a race condition happens
> ---
>
> Key: HIVE-16487
> URL: https://issues.apache.org/jira/browse/HIVE-16487
> Project: Hive
>  Issue Type: Bug
>  Components: Locking
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Fix For: 3.0.0, 2.4.0, 2.3.2
>
> Attachments: HIVE-16487.02.patch, HIVE-16487.patch
>
>
> A customer started to see this in the logs, but happily everything was 
> working as intended:
> {code}
> 2017-03-30 12:01:59,446 ERROR ZooKeeperHiveLockManager: 
> [HiveServer2-Background-Pool: Thread-620]: Serious Zookeeper exception: 
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for /hive_zookeeper_namespace//LOCK-SHARED-
> {code}
> This was happening because of a race condition between lock releasing and lock acquiring: the thread releasing the lock removes the parent ZK node just after the thread acquiring the lock has made sure that the parent node exists.
> Since this can happen without any real problem, I plan to add NODEEXISTS and NONODE as transient ZooKeeper exceptions, so the users are not confused.
> Also, the original author of ZooKeeperHiveLockManager may have planned to handle different ZooKeeperExceptions differently, and the code is hard to understand. See the {{continue}} and the {{break}}: the {{break}} only breaks the switch, not the loop, which IMHO is not intuitive:
> {code}
> do {
>   try {
> [..]
> ret = lockPrimitive(key, mode, keepAlive, parentCreated, 
>   } catch (Exception e1) {
> if (e1 instanceof KeeperException) {
>   KeeperException e = (KeeperException) e1;
>   switch (e.code()) {
>   case CONNECTIONLOSS:
>   case OPERATIONTIMEOUT:
> LOG.debug("Possibly transient ZooKeeper exception: ", e);
> continue;
>   default:
> LOG.error("Serious Zookeeper exception: ", e);
> break;
>   }
> }
> [..]
>   }
> } while (tryNum < numRetriesForLock);
> {code}
> If we do not want to try again in case of a "Serious Zookeeper exception:", 
> then we should add a label to the do loop, and break it in the switch.
> If we do want to try regardless of the type of the ZK exception, then we 
> should just change the {{continue;}} to {{break;}} and move the lines part of 
> the code which did not run in case of {{continue}} to the {{default}} switch, 
> so it is easier to understand the code.
> Any suggestions or ideas [~ctang.ma] or [~szehon]?
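The switch/loop semantics described above can be demonstrated in isolation. This is a self-contained sketch, not code from ZooKeeperHiveLockManager: the names (`retryLoop`, `classify`, the `Kind` enum) are illustrative only. It shows that a bare `break` inside a `switch` exits only the switch, while a labeled `break` exits the enclosing loop as the comment proposes.

```java
public class LabeledBreakDemo {
    enum Kind { TRANSIENT, SERIOUS }

    // Stand-in for classifying a KeeperException code: first two tries are transient.
    static Kind classify(int tryNum) {
        return tryNum < 2 ? Kind.TRANSIENT : Kind.SERIOUS;
    }

    // Returns how many iterations ran before the labeled break fired.
    static int run() {
        int attempts = 0;
        retryLoop:
        for (int tryNum = 0; tryNum < 5; tryNum++) {
            attempts++;
            switch (classify(tryNum)) {
                case TRANSIENT:
                    continue;            // retry: jumps to the next loop iteration
                default:
                    break retryLoop;     // exits the for loop, not just the switch
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

With a bare `break` in the `default` branch, the loop would run all five iterations; the label makes the "serious" case terminate retries immediately after the third attempt.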





[jira] [Updated] (HIVE-17184) Unexpected new line in beeline output when running with -f option

2017-11-09 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17184:

Fix Version/s: 2.3.2

> Unexpected new line in beeline output when running with -f option
> -
>
> Key: HIVE-17184
> URL: https://issues.apache.org/jira/browse/HIVE-17184
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Fix For: 3.0.0, 2.4.0, 2.3.2
>
> Attachments: HIVE-17184.01.patch
>
>
> When running in -f mode on BeeLine I see an extra new line getting added at 
> the end of the results.
> {noformat}
> vihang-MBP:bin vihang$ beeline -f /tmp/query.sql 2>/dev/null
> +--+---+
> | test.id  | test.val  |
> +--+---+
> | 1| one   |
> | 2| two   |
> | 1| three |
> +--+---+
> vihang-MBP:bin vihang$ beeline -e "select * from test;" 2>/dev/null
> +--+---+
> | test.id  | test.val  |
> +--+---+
> | 1| one   |
> | 2| two   |
> | 1| three |
> +--+---+
> vihang-MBP:bin vihang$
> {noformat}





[jira] [Updated] (HIVE-16930) HoS should verify the value of Kerberos principal and keytab file before adding them to spark-submit command parameters

2017-11-09 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16930:

Fix Version/s: 2.3.2

> HoS should verify the value of Kerberos principal and keytab file before 
> adding them to spark-submit command parameters
> ---
>
> Key: HIVE-16930
> URL: https://issues.apache.org/jira/browse/HIVE-16930
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Fix For: 3.0.0, 2.4.0, 2.3.2
>
> Attachments: HIVE-16930.1.patch
>
>
> When Kerberos is enabled, Hive CLI fails to run Hive on Spark queries:
> {noformat}
> >hive -e "set hive.execution.engine=spark; create table if not exists test(a 
> >int); select count(*) from test" --hiveconf hive.root.logger=INFO,console > 
> >/var/tmp/hive_log.txt > /var/tmp/hive_log_2.txt 
> 17/06/16 16:13:13 [main]: ERROR client.SparkClientImpl: Error while waiting for client to connect. 
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel client 'a5de85d1-6933-43e7-986f-5f8e5c001b5f'. Error: Child process exited before connecting back with error log Error: Cannot load main class from JAR file:/tmp/spark-submit.7196051517706529285.properties 
> Run with --help for usage help or --verbose for debug output 
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) 
> at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:107) 
> at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80) 
> at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:100) 
> at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:96) 
> at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:66) 
> at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62) 
> at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114) 
> at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:111) 
> at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:97) 
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) 
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) 
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1972) 
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1685) 
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1421) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1205) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1195) 
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:220) 
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172) 
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:383) 
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:318) 
> at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:720) 
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:693) 
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:628) 
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
> at java.lang.reflect.Method.invoke(Method.java:606) 
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221) 
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136) 
> Caused by: java.lang.RuntimeException: Cancel client 'a5de85d1-6933-43e7-986f-5f8e5c001b5f'. Error: Child process exited before connecting back with error log Error: Cannot load main class from JAR file:/tmp/spark-submit.7196051517706529285.properties 
> Run with --help for usage help or --verbose for debug output 
> at org.apache.hive.spark.client.rpc.RpcServer.cancelClient(RpcServer.java:179) 
> at org.apache.hive.spark.client.SparkClientImpl$3.run(SparkClientImpl.java:490) 
> at java.lang.Thread.run(Thread.java:745) 
> 17/06/16 16:13:13 [Driver]: WARN client.SparkClientImpl: Child process exited with code 1 
> {noformat} 
> In the log, below message shows up:
> 

[jira] [Updated] (HIVE-17150) CREATE INDEX execute HMS out-of-transaction listener calls inside a transaction

2017-11-09 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17150:

Fix Version/s: 2.3.2

> CREATE INDEX execute HMS out-of-transaction listener calls inside a 
> transaction
> ---
>
> Key: HIVE-17150
> URL: https://issues.apache.org/jira/browse/HIVE-17150
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Fix For: 3.0.0, 2.4.0, 2.3.2
>
> Attachments: HIVE-17150.1.patch, HIVE-17150.2.patch
>
>
> The problem with CREATE INDEX is that it calls a CREATE TABLE operation 
> inside the same CREATE INDEX transaction. During listener calls, there are 
> some listeners that should run in an out-of-transaction context, for 
> instance, Sentry blocks the HMS operation until the DB log notification is 
> processed, but if the transaction has not finished, then the 
> out-of-transaction listener will block forever (or until a read-time out 
> happens).
> A fix would be to add a parameter to the out-of-transaction listener that 
> alerts the listener if HMS is in an active transaction. If so, then it is up 
> to the listener plugin to return immediately and avoid blocking the HMS 
> operation.
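The fix idea above can be sketched as a listener that receives an in-transaction flag and bails out early. The interface and names here are hypothetical stand-ins for illustration, not the actual MetaStoreEventListener API:

```java
public class ListenerFlagDemo {
    // Hypothetical listener contract carrying the proposed transaction flag.
    interface OutOfTransactionListener {
        // Returns true when the event was fully processed.
        boolean onEvent(String event, boolean inActiveTransaction);
    }

    // A plugin that avoids the deadlock: if HMS is still inside the CREATE INDEX
    // transaction, return immediately instead of waiting on the DB log
    // notification that cannot be committed yet.
    static final OutOfTransactionListener BLOCK_AVOIDING = (event, inTxn) -> {
        if (inTxn) {
            return false;   // bail out: blocking here would wait on an open txn
        }
        // ... safe to wait for the notification to be processed ...
        return true;
    };

    public static void main(String[] args) {
        System.out.println(BLOCK_AVOIDING.onEvent("CREATE_TABLE", true));
        System.out.println(BLOCK_AVOIDING.onEvent("CREATE_TABLE", false));
    }
}
```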





[jira] [Updated] (HIVE-16213) ObjectStore can leak Queries when rollbackTransaction throws an exception

2017-11-09 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16213:

Fix Version/s: 2.3.2

> ObjectStore can leak Queries when rollbackTransaction throws an exception
> -
>
> Key: HIVE-16213
> URL: https://issues.apache.org/jira/browse/HIVE-16213
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Alexander Kolbasov
>Assignee: Vihang Karajgaonkar
> Fix For: 3.0.0, 2.4.0, 2.3.2
>
> Attachments: HIVE-16213.01.patch, HIVE-16213.02.patch, 
> HIVE-16213.03.patch, HIVE-16213.04.patch, HIVE-16213.05.patch, 
> HIVE-16213.06.patch, HIVE-16213.07.patch, HIVE-16213.08.patch
>
>
> In ObjectStore.java there are a few places with the code similar to:
> {code}
> Query query = null;
> try {
>   openTransaction();
>   query = pm.newQuery(Something.class);
>   ...
>   commited = commitTransaction();
> } finally {
>   if (!commited) {
> rollbackTransaction();
>   }
>   if (query != null) {
> query.closeAll();
>   }
> }
> {code}
> The problem is that rollbackTransaction() may throw an exception in which 
> case query.closeAll() wouldn't be executed. 
> The fix would be to wrap rollbackTransaction in its own try-catch block.
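A minimal, runnable sketch of that fix follows. The `Query` class and `rollbackTransaction()` are stand-ins for the real ObjectStore/DataNucleus types; the point is only the control flow: once the rollback is wrapped in its own try/catch, an exception there can no longer skip `query.closeAll()`.

```java
public class RollbackGuardDemo {
    static class Query {
        boolean closed = false;
        void closeAll() { closed = true; }
    }

    static void rollbackTransaction() {
        throw new RuntimeException("rollback failed");  // simulate the bad case
    }

    // Returns true when the query got closed despite the rollback failure.
    static boolean run() {
        Query query = new Query();
        boolean committed = false;
        try {
            // ... transactional work that fails before commitTransaction() ...
        } finally {
            if (!committed) {
                try {
                    rollbackTransaction();
                } catch (RuntimeException e) {
                    // log and continue, so the cleanup below still runs
                }
            }
            if (query != null) {
                query.closeAll();       // now guaranteed to execute
            }
        }
        return query.closed;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Without the inner try/catch, the exception from `rollbackTransaction()` would propagate out of the `finally` block and the query would leak.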





[jira] [Updated] (HIVE-17169) Avoid extra call to KeyProvider::getMetadata()

2017-11-09 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17169:

Fix Version/s: 2.3.2

> Avoid extra call to KeyProvider::getMetadata()
> --
>
> Key: HIVE-17169
> URL: https://issues.apache.org/jira/browse/HIVE-17169
> Project: Hive
>  Issue Type: Bug
>  Components: Shims
>Affects Versions: 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Fix For: 3.0.0, 2.4.0, 2.3.2
>
> Attachments: HIVE-17169.1-branch-2.patch, HIVE-17169.1.patch
>
>
> Here's the code from {{Hadoop23Shims}}:
> {code:title=Hadoop23Shims.java|borderStyle=solid}
> @Override
> public int comparePathKeyStrength(Path path1, Path path2) throws 
> IOException {
>   EncryptionZone zone1, zone2;
>   zone1 = hdfsAdmin.getEncryptionZoneForPath(path1);
>   zone2 = hdfsAdmin.getEncryptionZoneForPath(path2);
>   if (zone1 == null && zone2 == null) {
> return 0;
>   } else if (zone1 == null) {
> return -1;
>   } else if (zone2 == null) {
> return 1;
>   }
>   return compareKeyStrength(zone1.getKeyName(), zone2.getKeyName());
> }
> private int compareKeyStrength(String keyname1, String keyname2) throws 
> IOException {
>   KeyProvider.Metadata meta1, meta2;
>   if (keyProvider == null) {
> throw new IOException("HDFS security key provider is not configured 
> on your server.");
>   }
>   meta1 = keyProvider.getMetadata(keyname1);
>   meta2 = keyProvider.getMetadata(keyname2);
>   if (meta1.getBitLength() < meta2.getBitLength()) {
> return -1;
>   } else if (meta1.getBitLength() == meta2.getBitLength()) {
> return 0;
>   } else {
> return 1;
>   }
> }
>   }
> {code}
> It turns out that {{EncryptionZone}} already has the cipher's bit-length 
> stored in a member variable. One shouldn't need an additional name-node call 
> ({{KeyProvider::getMetadata()}}) only to fetch it again.
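The suggested optimization can be sketched as below. `EncryptionZone` here is a stand-in record, not the real HDFS class, and the assumption (per the report above) is that the zone object already carries the cipher's bit length, so the comparison needs no `KeyProvider.getMetadata()` round trip:

```java
public class KeyStrengthDemo {
    // Hypothetical stand-in for the HDFS EncryptionZone, carrying the bit length.
    static class EncryptionZone {
        final String keyName;
        final int bitLength;
        EncryptionZone(String keyName, int bitLength) {
            this.keyName = keyName;
            this.bitLength = bitLength;
        }
    }

    // Same contract as comparePathKeyStrength: -1/0/1 ordering, null = unencrypted.
    static int compare(EncryptionZone zone1, EncryptionZone zone2) {
        if (zone1 == null && zone2 == null) return 0;
        if (zone1 == null) return -1;
        if (zone2 == null) return 1;
        // Compare strength directly from the zone, no extra name-node call.
        return Integer.compare(zone1.bitLength, zone2.bitLength);
    }

    public static void main(String[] args) {
        EncryptionZone weak = new EncryptionZone("k1", 128);
        EncryptionZone strong = new EncryptionZone("k2", 256);
        System.out.println(compare(weak, strong));
        System.out.println(compare(null, strong));
        System.out.println(compare(null, null));
    }
}
```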



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16646) Alias in transform ... as clause shouldn't be case sensitive

2017-11-09 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16646:

Fix Version/s: 2.3.2

> Alias in transform ... as clause shouldn't be case sensitive
> 
>
> Key: HIVE-16646
> URL: https://issues.apache.org/jira/browse/HIVE-16646
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Fix For: 3.0.0, 2.4.0, 2.3.2
>
> Attachments: HIVE-16646.1.patch, HIVE-16646.2.patch
>
>
> Create a table like below:
> {code:sql}
> CREATE TABLE hive_bug(col1 string);
> {code}
> Run below query in Hive:
> {code}
> from hive_bug select transform(col1) using '/bin/cat' as ( string);
> {code}
> The result would be:
> {noformat}
> 0: jdbc:hive2://localhost:1> from hive_bug select transform(col1) using 
> '/bin/cat' as ( string);
> ..
> INFO  : OK
> +---+--+
> |   |
> +---+--+
> +---+--+
> {noformat}
> The output column name is ** instead of the lowercase .





[jira] [Updated] (HIVE-15761) ObjectStore.getNextNotification could return an empty NotificationEventResponse causing TProtocolException

2017-11-09 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-15761:

Fix Version/s: 2.3.2

> ObjectStore.getNextNotification could return an empty 
> NotificationEventResponse causing TProtocolException 
> ---
>
> Key: HIVE-15761
> URL: https://issues.apache.org/jira/browse/HIVE-15761
> Project: Hive
>  Issue Type: Bug
>Reporter: Hao Hao
>Assignee: Sergio Peña
> Fix For: 3.0.0, 2.4.0, 2.3.2
>
> Attachments: HIVE-15761.1.patch
>
>
> If there are no new events greater than the requested event, ObjectStore.getNextNotification will return an empty NotificationEventResponse, and the client side will get the following exception:
> {noformat} [ERROR - org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:295)] Thrift error occurred during processing of message.
> org.apache.thrift.protocol.TProtocolException: Required field 'events' is unset! Struct:NotificationEventResponse(events:null)
>   at org.apache.hadoop.hive.metastore.api.NotificationEventResponse.validate(NotificationEventResponse.java:310)
>   at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_next_notification_result.validate(ThriftHiveMetastore.java)
>   at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_next_notification_result$get_next_notification_resultStandardScheme.write(ThriftHiveMetastore.java)
>   at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_next_notification_result$get_next_notification_resultStandardScheme.write(ThriftHiveMetastore.java)
>   at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_next_notification_result.write(ThriftHiveMetastore.java)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
>   at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745){noformat}
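The failure mode above can be reproduced in miniature: Thrift's generated `validate()` rejects a required field left null, so the server must populate `events` with an empty list rather than leave it unset. The class below is a hand-written stand-in mimicking the generated NotificationEventResponse, not the real Thrift type:

```java
import java.util.ArrayList;
import java.util.List;

public class RequiredFieldDemo {
    static class NotificationEventResponse {
        List<String> events;            // "required" in the Thrift IDL

        // Mirrors the check a Thrift-generated validate() performs on required fields.
        void validate() {
            if (events == null) {
                throw new IllegalStateException("Required field 'events' is unset!");
            }
        }
    }

    // Returns true when a response built with the given list passes validation.
    static boolean validates(List<String> events) {
        NotificationEventResponse resp = new NotificationEventResponse();
        resp.events = events;
        try {
            resp.validate();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(validates(null));              // buggy path: field unset
        System.out.println(validates(new ArrayList<>())); // fixed path: empty but set
    }
}
```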





[jira] [Commented] (HIVE-17942) HiveAlterHandler not using conf from HMS Handler

2017-11-09 Thread Alexander Kolbasov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246000#comment-16246000
 ] 

Alexander Kolbasov commented on HIVE-17942:
---

[~janulatha] Thanks for the explanation. Can you provide some example showing 
how HMSHandler configuration may be customized by a user?

Looking at the fix and at your comment - the most important part is that 
HMSHandler's configuration may be different from general server configuration 
and that's the one we should use during alterTable. The fact that it is 
thread-local is interesting, but this is really an implementation detail of 
HMSHandler.

It would be good to explain this in the story in the code comments.

Another question that I asked - since now the server configuration isn't used 
by HiveALterHandle, should it still be Configurable?  What is the value of 
setConf()/getConf() calls in the new code?

> HiveAlterHandler not using conf from HMS Handler
> 
>
> Key: HIVE-17942
> URL: https://issues.apache.org/jira/browse/HIVE-17942
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Fix For: 3.0.0
>
> Attachments: HIVE-17942.1.patch, HIVE-17942.2.patch, 
> HIVE-17942.3.patch, HIVE-17942.4.patch
>
>
> When HiveAlterHandler looks for conf, it is not getting the one from thread 
> local.  So, local changes are not visible.





[jira] [Comment Edited] (HIVE-17942) HiveAlterHandler not using conf from HMS Handler

2017-11-09 Thread Alexander Kolbasov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246000#comment-16246000
 ] 

Alexander Kolbasov edited comment on HIVE-17942 at 11/9/17 4:46 PM:


[~janulatha] Thanks for the explanation. Can you provide some example showing 
how HMSHandler configuration may be customized by a user?

Looking at the fix and at your comment - the most important part is that 
HMSHandler's configuration may be different from general server configuration 
and that's the one we should use during alterTable. The fact that it is 
thread-local is interesting, but this is really an implementation detail of 
HMSHandler.

It would be good to explain this in the story in the code comments.

Another question that I asked - since now the server configuration isn't used 
by HiveAlterHandle, should it still be Configurable?  What is the value of 
setConf()/getConf() calls in the new code?


was (Author: akolb):
[~janulatha] Thanks for the explanation. Can you provide some example showing 
how HMSHandler configuration may be customized by a user?

Looking at the fix and at your comment - the most important part is that 
HMSHandler's configuration may be different from general server configuration 
and that's the one we should use during alterTable. The fact that it is 
thread-local is interesting, but this is really an implementation detail of 
HMSHandler.

It would be good to explain this in the story in the code comments.

Another question that I asked - since now the server configuration isn't used 
by HiveALterHandle, should it still be Configurable?  What is the value of 
setConf()/getConf() calls in the new code?

> HiveAlterHandler not using conf from HMS Handler
> 
>
> Key: HIVE-17942
> URL: https://issues.apache.org/jira/browse/HIVE-17942
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Fix For: 3.0.0
>
> Attachments: HIVE-17942.1.patch, HIVE-17942.2.patch, 
> HIVE-17942.3.patch, HIVE-17942.4.patch
>
>
> When HiveAlterHandler looks for conf, it is not getting the one from thread 
> local.  So, local changes are not visible.




