[jira] [Updated] (HIVE-20841) LLAP: Make dynamic ports configurable

2019-02-06 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20841:
-
Attachment: HIVE-20841.2.patch

> LLAP: Make dynamic ports configurable
> -
>
> Key: HIVE-20841
> URL: https://issues.apache.org/jira/browse/HIVE-20841
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-20841.1.patch, HIVE-20841.2.patch
>
>
> Some ports in the LLAP -> Tez interaction code are assigned dynamically. Provide an
> option to make them configurable so that they can be added to iptables rules
> in some environments.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21222) ACID: When there are no delete deltas skip finding min max keys

2019-02-06 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16762238#comment-16762238
 ] 

Prasanth Jayachandran commented on HIVE-21222:
--

Fixes test failures.

> ACID: When there are no delete deltas skip finding min max keys
> ---
>
> Key: HIVE-21222
> URL: https://issues.apache.org/jira/browse/HIVE-21222
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21222.1.patch, HIVE-21222.2.patch
>
>
> We create an ORC reader in VectorizedOrcAcidRowBatchReader.findMinMaxKeys
> (which reads a 16K footer) even when no delete deltas exist.
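
The intended short-circuit can be sketched as follows. This is a minimal illustration in plain Java, not the actual patch: KeyInterval, the findMinMaxKeys signature, and the footerReader supplier are hypothetical stand-ins for the real Hive classes. The only point is that the expensive footer read is never triggered when the list of delete deltas is empty.

```java
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for a min/max row key range; null bounds mean "unbounded". */
final class KeyInterval {
    final Long minKey;
    final Long maxKey;

    KeyInterval(Long minKey, Long maxKey) {
        this.minKey = minKey;
        this.maxKey = maxKey;
    }
}

final class MinMaxKeySketch {
    /**
     * Returns an unbounded interval without invoking footerReader when there
     * are no delete deltas; otherwise asks footerReader (the expensive step,
     * standing in for opening an ORC reader) for the real interval.
     */
    static KeyInterval findMinMaxKeys(List<String> deleteDeltaDirs,
                                      Supplier<KeyInterval> footerReader) {
        if (deleteDeltaDirs == null || deleteDeltaDirs.isEmpty()) {
            // No delete events to filter in this sketch, so skip the read.
            return new KeyInterval(null, null);
        }
        return footerReader.get();
    }
}
```

Passing the reader as a Supplier keeps the costly step lazy, so a caller can verify that it was never invoked for the empty case.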





[jira] [Updated] (HIVE-21222) ACID: When there are no delete deltas skip finding min max keys

2019-02-06 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21222:
-
Attachment: HIVE-21222.2.patch

> ACID: When there are no delete deltas skip finding min max keys
> ---
>
> Key: HIVE-21222
> URL: https://issues.apache.org/jira/browse/HIVE-21222
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21222.1.patch, HIVE-21222.2.patch
>
>
> We create an ORC reader in VectorizedOrcAcidRowBatchReader.findMinMaxKeys
> (which reads a 16K footer) even when no delete deltas exist.





[jira] [Updated] (HIVE-21009) LDAP - Specify binddn for ldap-search

2019-02-06 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21009:
-
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Thanks [~mcginnda] for the contribution! Committed patch to master. 

> LDAP - Specify binddn for ldap-search
> -
>
> Key: HIVE-21009
> URL: https://issues.apache.org/jira/browse/HIVE-21009
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0, 2.1.1, 2.2.0, 2.3.0, 2.3.1, 2.3.2
>Reporter: Thomas Uhren
>Assignee: David McGinnis
>Priority: Major
>  Labels: features, newbie, security
> Fix For: 4.0.0
>
> Attachments: 
> 0001-HIVE-21009-Adding-ability-for-user-to-set-bind-user-.patch, 
> HIVE-21009.01.patch, HIVE-21009.02.patch, HIVE-21009.03.patch, 
> HIVE-21009.04.patch, HIVE-21009.05.patch, HIVE-21009.06.patch, 
> HIVE-21009.07.patch, HIVE-21009.patch
>
>
> When user accounts cannot do an LDAP search, there is currently no way of 
> specifying a custom binddn to use for the ldap-search.
> So I'm missing something like this:
> {code}
> hive.server2.authentication.ldap.bindn=cn=ldapuser,ou=user,dc=example
> hive.server2.authentication.ldap.bindnpw=password
> {code}





[jira] [Updated] (HIVE-21103) PartitionManagementTask should not modify DN configs to avoid closing persistence manager

2019-02-06 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21103:
-
Attachment: HIVE-21103.3.patch

> PartitionManagementTask should not modify DN configs to avoid closing 
> persistence manager
> -
>
> Key: HIVE-21103
> URL: https://issues.apache.org/jira/browse/HIVE-21103
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-21103.1.patch, HIVE-21103.2.patch, 
> HIVE-21103.3.patch
>
>
> HIVE-20707 added automatic partition management, which uses thread pools to
> run MSCK REPAIR in parallel. It also modifies the DataNucleus connection pool
> size to avoid an explosion of connections to the backend database. However, the
> object store closes the persistence manager when it detects a change in the
> DataNucleus or JDO configs. So if HS2 tries to connect to the metastore while
> PartitionManagementTask is running, HS2 gets a persistence manager closed
> exception.
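
The failure mode described above comes down to comparing two config snapshots. The following is a rough sketch in plain Java, not the object store's code: the class and method names are hypothetical, and the "datanucleus." and "javax.jdo." key prefixes are assumed for illustration. It shows why changing a DataNucleus setting in a task-local copy of the config would register as a config change, while changing an unrelated key would not.

```java
import java.util.Map;
import java.util.TreeMap;

final class PersistenceConfigSketch {
    /** Keeps only the keys assumed to affect the persistence layer. */
    static Map<String, String> persistenceProps(Map<String, String> conf) {
        Map<String, String> out = new TreeMap<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            String key = e.getKey();
            if (key.startsWith("datanucleus.") || key.startsWith("javax.jdo.")) {
                out.put(key, e.getValue());
            }
        }
        return out;
    }

    /** True if the persistence-related subset differs between the two configs. */
    static boolean persistenceConfigChanged(Map<String, String> current,
                                            Map<String, String> incoming) {
        return !persistenceProps(current).equals(persistenceProps(incoming));
    }
}
```

Under this model, a background task that leaves the persistence-related keys untouched never trips the check, which is the direction the issue title points to.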





[jira] [Commented] (HIVE-21009) LDAP - Specify binddn for ldap-search

2019-02-06 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16762201#comment-16762201
 ] 

Prasanth Jayachandran commented on HIVE-21009:
--

[~mcginnda] can you please upload a patch generated with git format-patch so
that the commit is properly attributed to your email?

> LDAP - Specify binddn for ldap-search
> -
>
> Key: HIVE-21009
> URL: https://issues.apache.org/jira/browse/HIVE-21009
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0, 2.1.1, 2.2.0, 2.3.0, 2.3.1, 2.3.2
>Reporter: Thomas Uhren
>Assignee: David McGinnis
>Priority: Major
>  Labels: features, newbie, security
> Attachments: HIVE-21009.01.patch, HIVE-21009.02.patch, 
> HIVE-21009.03.patch, HIVE-21009.04.patch, HIVE-21009.05.patch, 
> HIVE-21009.06.patch, HIVE-21009.07.patch, HIVE-21009.patch
>
>
> When user accounts cannot do an LDAP search, there is currently no way of 
> specifying a custom binddn to use for the ldap-search.
> So I'm missing something like this:
> {code}
> hive.server2.authentication.ldap.bindn=cn=ldapuser,ou=user,dc=example
> hive.server2.authentication.ldap.bindnpw=password
> {code}





[jira] [Comment Edited] (HIVE-21009) LDAP - Specify binddn for ldap-search

2019-02-06 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16761990#comment-16761990
 ] 

Prasanth Jayachandran edited comment on HIVE-21009 at 2/6/19 6:10 PM:
--

Yeah, makes sense. If it is not related to this patch (or caused by the patch),
then we don't have to handle it in this ticket. The test run looks clean; I
will go ahead and commit the patch shortly.


was (Author: prasanth_j):
Yeah, makes sense. If it is not related to this patch, then we don't have to
handle it in this ticket. The test run looks clean; I will go ahead and commit
the patch shortly.

> LDAP - Specify binddn for ldap-search
> -
>
> Key: HIVE-21009
> URL: https://issues.apache.org/jira/browse/HIVE-21009
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0, 2.1.1, 2.2.0, 2.3.0, 2.3.1, 2.3.2
>Reporter: Thomas Uhren
>Assignee: David McGinnis
>Priority: Major
>  Labels: features, newbie, security
> Attachments: HIVE-21009.01.patch, HIVE-21009.02.patch, 
> HIVE-21009.03.patch, HIVE-21009.04.patch, HIVE-21009.05.patch, 
> HIVE-21009.06.patch, HIVE-21009.07.patch, HIVE-21009.patch
>
>
> When user accounts cannot do an LDAP search, there is currently no way of 
> specifying a custom binddn to use for the ldap-search.
> So I'm missing something like this:
> {code}
> hive.server2.authentication.ldap.bindn=cn=ldapuser,ou=user,dc=example
> hive.server2.authentication.ldap.bindnpw=password
> {code}





[jira] [Commented] (HIVE-21009) LDAP - Specify binddn for ldap-search

2019-02-06 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16761990#comment-16761990
 ] 

Prasanth Jayachandran commented on HIVE-21009:
--

Yeah, makes sense. If it is not related to this patch, then we don't have to
handle it in this ticket. The test run looks clean; I will go ahead and commit
the patch shortly.

> LDAP - Specify binddn for ldap-search
> -
>
> Key: HIVE-21009
> URL: https://issues.apache.org/jira/browse/HIVE-21009
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0, 2.1.1, 2.2.0, 2.3.0, 2.3.1, 2.3.2
>Reporter: Thomas Uhren
>Assignee: David McGinnis
>Priority: Major
>  Labels: features, newbie, security
> Attachments: HIVE-21009.01.patch, HIVE-21009.02.patch, 
> HIVE-21009.03.patch, HIVE-21009.04.patch, HIVE-21009.05.patch, 
> HIVE-21009.06.patch, HIVE-21009.07.patch, HIVE-21009.patch
>
>
> When user accounts cannot do an LDAP search, there is currently no way of 
> specifying a custom binddn to use for the ldap-search.
> So I'm missing something like this:
> {code}
> hive.server2.authentication.ldap.bindn=cn=ldapuser,ou=user,dc=example
> hive.server2.authentication.ldap.bindnpw=password
> {code}





[jira] [Commented] (HIVE-21009) LDAP - Specify binddn for ldap-search

2019-02-06 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16761959#comment-16761959
 ] 

Prasanth Jayachandran commented on HIVE-21009:
--

You may have to add an Apache RAT exclusion for the jceks file to
[https://github.com/apache/hive/blob/master/pom.xml#L1350-L1353] to avoid the
asflicense issue.

 

> LDAP - Specify binddn for ldap-search
> -
>
> Key: HIVE-21009
> URL: https://issues.apache.org/jira/browse/HIVE-21009
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0, 2.1.1, 2.2.0, 2.3.0, 2.3.1, 2.3.2
>Reporter: Thomas Uhren
>Assignee: David McGinnis
>Priority: Major
>  Labels: features, newbie, security
> Attachments: HIVE-21009.01.patch, HIVE-21009.02.patch, 
> HIVE-21009.03.patch, HIVE-21009.04.patch, HIVE-21009.05.patch, 
> HIVE-21009.06.patch, HIVE-21009.07.patch, HIVE-21009.patch
>
>
> When user accounts cannot do an LDAP search, there is currently no way of 
> specifying a custom binddn to use for the ldap-search.
> So I'm missing something like this:
> {code}
> hive.server2.authentication.ldap.bindn=cn=ldapuser,ou=user,dc=example
> hive.server2.authentication.ldap.bindnpw=password
> {code}





[jira] [Commented] (HIVE-21222) ACID: When there are no delete deltas skip finding min max keys

2019-02-05 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16761322#comment-16761322
 ] 

Prasanth Jayachandran commented on HIVE-21222:
--

[~ekoifman] could you please review this small patch?

 

> ACID: When there are no delete deltas skip finding min max keys
> ---
>
> Key: HIVE-21222
> URL: https://issues.apache.org/jira/browse/HIVE-21222
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21222.1.patch
>
>
> We create an ORC reader in VectorizedOrcAcidRowBatchReader.findMinMaxKeys
> (which reads a 16K footer) even when no delete deltas exist.





[jira] [Updated] (HIVE-21222) ACID: When there are no delete deltas skip finding min max keys

2019-02-05 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21222:
-
Status: Patch Available  (was: Open)

> ACID: When there are no delete deltas skip finding min max keys
> ---
>
> Key: HIVE-21222
> URL: https://issues.apache.org/jira/browse/HIVE-21222
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21222.1.patch
>
>
> We create an ORC reader in VectorizedOrcAcidRowBatchReader.findMinMaxKeys
> (which reads a 16K footer) even when no delete deltas exist.





[jira] [Updated] (HIVE-21222) ACID: When there are no delete deltas skip finding min max keys

2019-02-05 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21222:
-
Attachment: HIVE-21222.1.patch

> ACID: When there are no delete deltas skip finding min max keys
> ---
>
> Key: HIVE-21222
> URL: https://issues.apache.org/jira/browse/HIVE-21222
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21222.1.patch
>
>
> We create an ORC reader in VectorizedOrcAcidRowBatchReader.findMinMaxKeys
> (which reads a 16K footer) even when no delete deltas exist.





[jira] [Assigned] (HIVE-21222) ACID: When there are no delete deltas skip finding min max keys

2019-02-05 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-21222:



> ACID: When there are no delete deltas skip finding min max keys
> ---
>
> Key: HIVE-21222
> URL: https://issues.apache.org/jira/browse/HIVE-21222
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>
> We create an ORC reader in VectorizedOrcAcidRowBatchReader.findMinMaxKeys
> (which reads a 16K footer) even when no delete deltas exist.





[jira] [Updated] (HIVE-21212) LLAP: shuffle port config uses internal configuration

2019-02-04 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21212:
-
Affects Version/s: (was: 3.2.0)

> LLAP: shuffle port config uses internal configuration
> -
>
> Key: HIVE-21212
> URL: https://issues.apache.org/jira/browse/HIVE-21212
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21212.1.patch
>
>
> LlapDaemon main() reads the daemon configuration, but for the shuffle port it
> reads an internal config instead of hive.llap.daemon.yarn.shuffle.port:
> [https://github.com/apache/hive/blob/c8eb03affa2533f4827cf6497e7c9873bc9520a7/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/LlapDaemon.java#L535]
>  





[jira] [Updated] (HIVE-21212) LLAP: shuffle port config uses internal configuration

2019-02-04 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21212:
-
Status: Patch Available  (was: Open)

> LLAP: shuffle port config uses internal configuration
> -
>
> Key: HIVE-21212
> URL: https://issues.apache.org/jira/browse/HIVE-21212
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21212.1.patch
>
>
> LlapDaemon main() reads the daemon configuration, but for the shuffle port it
> reads an internal config instead of hive.llap.daemon.yarn.shuffle.port:
> [https://github.com/apache/hive/blob/c8eb03affa2533f4827cf6497e7c9873bc9520a7/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/LlapDaemon.java#L535]
>  





[jira] [Commented] (HIVE-21212) LLAP: shuffle port config uses internal configuration

2019-02-04 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760376#comment-16760376
 ] 

Prasanth Jayachandran commented on HIVE-21212:
--

[~gopalv] can you please review this one-liner?

> LLAP: shuffle port config uses internal configuration
> -
>
> Key: HIVE-21212
> URL: https://issues.apache.org/jira/browse/HIVE-21212
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21212.1.patch
>
>
> LlapDaemon main() reads the daemon configuration, but for the shuffle port it
> reads an internal config instead of hive.llap.daemon.yarn.shuffle.port:
> [https://github.com/apache/hive/blob/c8eb03affa2533f4827cf6497e7c9873bc9520a7/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/LlapDaemon.java#L535]
>  





[jira] [Updated] (HIVE-21212) LLAP: shuffle port config uses internal configuration

2019-02-04 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21212:
-
Attachment: HIVE-21212.1.patch

> LLAP: shuffle port config uses internal configuration
> -
>
> Key: HIVE-21212
> URL: https://issues.apache.org/jira/browse/HIVE-21212
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21212.1.patch
>
>
> LlapDaemon main() reads the daemon configuration, but for the shuffle port it
> reads an internal config instead of hive.llap.daemon.yarn.shuffle.port:
> [https://github.com/apache/hive/blob/c8eb03affa2533f4827cf6497e7c9873bc9520a7/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/LlapDaemon.java#L535]
>  





[jira] [Updated] (HIVE-21212) LLAP: shuffle port config uses internal configuration

2019-02-04 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21212:
-
Target Version/s: 4.0.0  (was: 4.0.0, 3.2.0)

> LLAP: shuffle port config uses internal configuration
> -
>
> Key: HIVE-21212
> URL: https://issues.apache.org/jira/browse/HIVE-21212
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21212.1.patch
>
>
> LlapDaemon main() reads the daemon configuration, but for the shuffle port it
> reads an internal config instead of hive.llap.daemon.yarn.shuffle.port:
> [https://github.com/apache/hive/blob/c8eb03affa2533f4827cf6497e7c9873bc9520a7/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/LlapDaemon.java#L535]
>  





[jira] [Commented] (HIVE-21009) LDAP - Specify binddn for ldap-search

2019-02-04 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760340#comment-16760340
 ] 

Prasanth Jayachandran commented on HIVE-21009:
--

Thanks for the updated patch, [~mcginnda]. Still +1. Will get it committed
after the pre-commit test results come in.

> LDAP - Specify binddn for ldap-search
> -
>
> Key: HIVE-21009
> URL: https://issues.apache.org/jira/browse/HIVE-21009
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0, 2.1.1, 2.2.0, 2.3.0, 2.3.1, 2.3.2
>Reporter: Thomas Uhren
>Assignee: David McGinnis
>Priority: Major
> Attachments: HIVE-21009.01.patch, HIVE-21009.02.patch, 
> HIVE-21009.patch
>
>
> When user accounts cannot do an LDAP search, there is currently no way of 
> specifying a custom binddn to use for the ldap-search.
> So I'm missing something like this:
> {code}
> hive.server2.authentication.ldap.bindn=cn=ldapuser,ou=user,dc=example
> hive.server2.authentication.ldap.bindnpw=password
> {code}





[jira] [Commented] (HIVE-21009) LDAP - Specify binddn for ldap-search

2019-02-04 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760321#comment-16760321
 ] 

Prasanth Jayachandran commented on HIVE-21009:
--

nit: Hadoop's Configuration.getPassword() seems to iterate over the credential
providers and fall back to the config, which seems similar to what you are
doing, doesn't it?
[https://hadoop.apache.org/docs/r2.6.4/api/org/apache/hadoop/conf/Configuration.html#getPassword(java.lang.String)]
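
The lookup order described in this comment can be sketched without any Hadoop dependency. This is an illustration only, not Hadoop's implementation: CredentialSource and PasswordLookupSketch are hypothetical names, and the plain map stands in for the clear-text config. Sources are consulted in order, and the config value is used only if none of them knows the alias.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a credential provider such as a keystore file. */
interface CredentialSource {
    /** Returns the secret for the alias, or null if this source does not have it. */
    char[] lookup(String alias);
}

final class PasswordLookupSketch {
    /**
     * Tries each credential source in order, then falls back to the
     * clear-text config value. Returns null if the alias is unknown everywhere.
     */
    static char[] getPassword(String alias,
                              List<CredentialSource> sources,
                              Map<String, String> clearTextConf) {
        for (CredentialSource source : sources) {
            char[] secret = source.lookup(alias);
            if (secret != null) {
                return secret;
            }
        }
        String fallback = clearTextConf.get(alias);
        return fallback == null ? null : fallback.toCharArray();
    }
}
```

Returning char[] rather than String mirrors the usual reason for such APIs: the caller can overwrite the array once the secret is no longer needed.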

 

> LDAP - Specify binddn for ldap-search
> -
>
> Key: HIVE-21009
> URL: https://issues.apache.org/jira/browse/HIVE-21009
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0, 2.1.1, 2.2.0, 2.3.0, 2.3.1, 2.3.2
>Reporter: Thomas Uhren
>Assignee: David McGinnis
>Priority: Major
> Attachments: HIVE-21009.01.patch, HIVE-21009.patch
>
>
> When user accounts cannot do an LDAP search, there is currently no way of 
> specifying a custom binddn to use for the ldap-search.
> So I'm missing something like this:
> {code}
> hive.server2.authentication.ldap.bindn=cn=ldapuser,ou=user,dc=example
> hive.server2.authentication.ldap.bindnpw=password
> {code}





[jira] [Commented] (HIVE-21009) LDAP - Specify binddn for ldap-search

2019-02-04 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760314#comment-16760314
 ] 

Prasanth Jayachandran commented on HIVE-21009:
--

lgtm, +1

> LDAP - Specify binddn for ldap-search
> -
>
> Key: HIVE-21009
> URL: https://issues.apache.org/jira/browse/HIVE-21009
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0, 2.1.1, 2.2.0, 2.3.0, 2.3.1, 2.3.2
>Reporter: Thomas Uhren
>Assignee: David McGinnis
>Priority: Major
> Attachments: HIVE-21009.01.patch, HIVE-21009.patch
>
>
> When user accounts cannot do an LDAP search, there is currently no way of 
> specifying a custom binddn to use for the ldap-search.
> So I'm missing something like this:
> {code}
> hive.server2.authentication.ldap.bindn=cn=ldapuser,ou=user,dc=example
> hive.server2.authentication.ldap.bindnpw=password
> {code}





[jira] [Commented] (HIVE-21009) LDAP - Specify binddn for ldap-search

2019-02-04 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760300#comment-16760300
 ] 

Prasanth Jayachandran commented on HIVE-21009:
--

To get the password, using conf.getPassword() is more secure, as it reads
through Hadoop's credential provider (which could be a jceks file).

> LDAP - Specify binddn for ldap-search
> -
>
> Key: HIVE-21009
> URL: https://issues.apache.org/jira/browse/HIVE-21009
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0, 2.1.1, 2.2.0, 2.3.0, 2.3.1, 2.3.2
>Reporter: Thomas Uhren
>Assignee: David McGinnis
>Priority: Major
> Attachments: HIVE-21009.patch
>
>
> When user accounts cannot do an LDAP search, there is currently no way of 
> specifying a custom binddn to use for the ldap-search.
> So I'm missing something like this:
> {code}
> hive.server2.authentication.ldap.bindn=cn=ldapuser,ou=user,dc=example
> hive.server2.authentication.ldap.bindnpw=password
> {code}





[jira] [Assigned] (HIVE-21212) LLAP: shuffle port config uses internal configuration

2019-02-04 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-21212:



> LLAP: shuffle port config uses internal configuration
> -
>
> Key: HIVE-21212
> URL: https://issues.apache.org/jira/browse/HIVE-21212
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>
> LlapDaemon main() reads the daemon configuration, but for the shuffle port it
> reads an internal config instead of hive.llap.daemon.yarn.shuffle.port:
> [https://github.com/apache/hive/blob/c8eb03affa2533f4827cf6497e7c9873bc9520a7/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/LlapDaemon.java#L535]
>  





[jira] [Commented] (HIVE-21177) Optimize AcidUtils.getLogicalLength()

2019-01-30 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756510#comment-16756510
 ] 

Prasanth Jayachandran commented on HIVE-21177:
--

Looks like only the path is used inside ParsedDeltaLight, so can this
fs.getFileStatus() call be avoided? That would be one less fs operation.

> Optimize AcidUtils.getLogicalLength()
> -
>
> Key: HIVE-21177
> URL: https://issues.apache.org/jira/browse/HIVE-21177
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-21177.01.patch, HIVE-21177.02.patch
>
>
> {{AcidUtils.getLogicalLength()}} tries to look for the side file
> ({{OrcAcidUtils.getSideFile()}}) on the file system even when the file couldn't
> possibly be there, e.g. when the path is delta_x_x or base_x.  It could only
> be there in delta_x_y, x != y.
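
The naming rule in the description can be turned into a cheap check that runs before any file system call. The sketch below is plain Java under stated assumptions: SideFileSketch and sideFileMayExist are hypothetical names, and directory names are assumed to look like delta_x_y or base_x with decimal ids, as written above. Real directory names may carry extra suffixes, so anything the patterns do not recognize is treated as "may exist" and left to the file system lookup.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SideFileSketch {
    // Assumed shapes of the directory names: delta_<x>_<y> and base_<x>.
    private static final Pattern DELTA = Pattern.compile("delta_(\\d{1,18})_(\\d{1,18})");
    private static final Pattern BASE = Pattern.compile("base_\\d{1,18}");

    /**
     * Returns false for base_x and delta_x_x, where the description says a
     * side file cannot be present. Returns true for delta_x_y with x != y,
     * and also for unrecognized names, so the caller still checks the disk.
     */
    static boolean sideFileMayExist(String dirName) {
        if (BASE.matcher(dirName).matches()) {
            return false;
        }
        Matcher m = DELTA.matcher(dirName);
        if (!m.matches()) {
            return true; // unknown shape: do not skip the lookup
        }
        long x = Long.parseLong(m.group(1));
        long y = Long.parseLong(m.group(2));
        return x != y;
    }
}
```

Comparing the ids numerically rather than as strings means zero-padded names such as delta_0000003_0000003 are still recognized as x == y.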





[jira] [Commented] (HIVE-21177) Optimize AcidUtils.getLogicalLength()

2019-01-29 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755782#comment-16755782
 ] 

Prasanth Jayachandran commented on HIVE-21177:
--

Any reason why #readOps jumped for the test case
testACIDReaderFooterSerializeWithDeltas (for 2 of the asserts)? It would be
good to list those 2 new calls in the comment for reference.

> Optimize AcidUtils.getLogicalLength()
> -
>
> Key: HIVE-21177
> URL: https://issues.apache.org/jira/browse/HIVE-21177
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-21177.01.patch, HIVE-21177.02.patch
>
>
> {{AcidUtils.getLogicalLength()}} tries to look for the side file
> ({{OrcAcidUtils.getSideFile()}}) on the file system even when the file couldn't
> possibly be there, e.g. when the path is delta_x_x or base_x.  It could only
> be there in delta_x_y, x != y.





[jira] [Commented] (HIVE-20707) Automatic partition management

2019-01-16 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744562#comment-16744562
 ] 

Prasanth Jayachandran commented on HIVE-20707:
--

[~jcamachorodriguez] This patch brings in breaking changes, mostly for external
tables (auto partition discovery is enabled by default). That's why I did not
backport the patch to branch-3. I am guessing the fixes are primarily related
to the catalog name missing in metastore APIs? If that's the case, we could pick
only the changes to ObjectStore.java and NonCatCallsWithCatalog.java, which are
pretty small. What do you think?

> Automatic partition management
> --
>
> Key: HIVE-20707
> URL: https://issues.apache.org/jira/browse/HIVE-20707
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20702.3.patch, HIVE-20707-branch-3.patch, 
> HIVE-20707.1.patch, HIVE-20707.2.patch, HIVE-20707.4.patch, 
> HIVE-20707.5.patch, HIVE-20707.6.patch, HIVE-20707.6.patch, HIVE-20707.7.patch
>
>
> Currently, to add partitions for external tables to the metastore, the MSCK
> REPAIR command has to be executed manually. To avoid this manual step, a
> table property can be set on external tables, based on which a background
> metastore thread syncs partitions periodically. A partition retention period
> can also be specified for a table; any partition whose age exceeds the
> retention period is dropped automatically.





[jira] [Updated] (HIVE-21103) PartitionManagementTask should not modify DN configs to avoid closing persistence manager

2019-01-08 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21103:
-
Attachment: HIVE-21103.2.patch

> PartitionManagementTask should not modify DN configs to avoid closing 
> persistence manager
> -
>
> Key: HIVE-21103
> URL: https://issues.apache.org/jira/browse/HIVE-21103
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-21103.1.patch, HIVE-21103.2.patch
>
>
> HIVE-20707 added automatic partition management, which uses thread pools to
> run MSCK REPAIR in parallel. It also modifies the DataNucleus connection pool
> size to avoid an explosion of connections to the backend database. However, the
> object store closes the persistence manager when it detects a change in the
> DataNucleus or JDO configs. So if HS2 tries to connect to the metastore while
> PartitionManagementTask is running, HS2 gets a persistence manager closed
> exception.





[jira] [Commented] (HIVE-21103) PartitionManagementTask should not modify DN configs to avoid closing persistence manager

2019-01-08 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737507#comment-16737507
 ] 

Prasanth Jayachandran commented on HIVE-21103:
--

Thanks [~sankarh] for finding the root cause of this issue! Since you have
more context on this issue, can you please review this change? It just removes
the DN config change in PartitionManagementTask.

> PartitionManagementTask should not modify DN configs to avoid closing 
> persistence manager
> -
>
> Key: HIVE-21103
> URL: https://issues.apache.org/jira/browse/HIVE-21103
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-21103.1.patch
>
>
> HIVE-20707 added automatic partition management, which uses thread pools to 
> run msck repair in parallel. It also modifies the datanucleus connection pool 
> size to avoid an explosion of connections to the backend database. But the 
> object store closes the persistence manager when it detects a change in 
> datanucleus or jdo configs. So if HS2 tries to connect to the metastore while 
> PartitionManagementTask is running, HS2 will get a persistence manager closed 
> exception. 





[jira] [Updated] (HIVE-21103) PartitionManagementTask should not modify DN configs to avoid closing persistence manager

2019-01-08 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21103:
-
Status: Patch Available  (was: Open)

> PartitionManagementTask should not modify DN configs to avoid closing 
> persistence manager
> -
>
> Key: HIVE-21103
> URL: https://issues.apache.org/jira/browse/HIVE-21103
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-21103.1.patch
>
>
> HIVE-20707 added automatic partition management, which uses thread pools to 
> run msck repair in parallel. It also modifies the datanucleus connection pool 
> size to avoid an explosion of connections to the backend database. But the 
> object store closes the persistence manager when it detects a change in 
> datanucleus or jdo configs. So if HS2 tries to connect to the metastore while 
> PartitionManagementTask is running, HS2 will get a persistence manager closed 
> exception. 





[jira] [Updated] (HIVE-21103) PartitionManagementTask should not modify DN configs to avoid closing persistence manager

2019-01-08 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21103:
-
Attachment: HIVE-21103.1.patch

> PartitionManagementTask should not modify DN configs to avoid closing 
> persistence manager
> -
>
> Key: HIVE-21103
> URL: https://issues.apache.org/jira/browse/HIVE-21103
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-21103.1.patch
>
>
> HIVE-20707 added automatic partition management, which uses thread pools to 
> run msck repair in parallel. It also modifies the datanucleus connection pool 
> size to avoid an explosion of connections to the backend database. But the 
> object store closes the persistence manager when it detects a change in 
> datanucleus or jdo configs. So if HS2 tries to connect to the metastore while 
> PartitionManagementTask is running, HS2 will get a persistence manager closed 
> exception. 





[jira] [Assigned] (HIVE-21103) PartitionManagementTask should not modify DN configs to avoid closing persistence manager

2019-01-08 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-21103:



> PartitionManagementTask should not modify DN configs to avoid closing 
> persistence manager
> -
>
> Key: HIVE-21103
> URL: https://issues.apache.org/jira/browse/HIVE-21103
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
>
> HIVE-20707 added automatic partition management, which uses thread pools to 
> run msck repair in parallel. It also modifies the datanucleus connection pool 
> size to avoid an explosion of connections to the backend database. But the 
> object store closes the persistence manager when it detects a change in 
> datanucleus or jdo configs. So if HS2 tries to connect to the metastore while 
> PartitionManagementTask is running, HS2 will get a persistence manager closed 
> exception. 





[jira] [Commented] (HIVE-21040) msck does unnecessary file listing at last level of directory tree

2018-12-19 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725484#comment-16725484
 ] 

Prasanth Jayachandran commented on HIVE-21040:
--

+1, pending tests.

> msck does unnecessary file listing at last level of directory tree
> --
>
> Key: HIVE-21040
> URL: https://issues.apache.org/jira/browse/HIVE-21040
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-21040.01.patch, HIVE-21040.02.patch
>
>
> Here is the code snippet which is run by {{msck}} to list directories
> {noformat}
> final Path currentPath = pd.p;
> final int currentDepth = pd.depth;
> FileStatus[] fileStatuses = fs.listStatus(currentPath, FileUtils.HIDDEN_FILES_PATH_FILTER);
> // found no files under a sub-directory under table base path; it is possible that the table
> // is empty and hence there are no partition sub-directories created under base path
> if (fileStatuses.length == 0 && currentDepth > 0 && currentDepth < partColNames.size()) {
>   // since maxDepth is not yet reached, we are missing partition
>   // columns in currentPath
>   logOrThrowExceptionWithMsg(
>       "MSCK is missing partition columns under " + currentPath.toString());
> } else {
>   // found files under currentPath add them to the queue if it is a directory
>   for (FileStatus fileStatus : fileStatuses) {
>     if (!fileStatus.isDirectory() && currentDepth < partColNames.size()) {
>       // found a file at depth which is less than number of partition keys
>       logOrThrowExceptionWithMsg(
>           "MSCK finds a file rather than a directory when it searches for "
>               + fileStatus.getPath().toString());
>     } else if (fileStatus.isDirectory() && currentDepth < partColNames.size()) {
>       // found a sub-directory at a depth less than number of partition keys
>       // validate if the partition directory name matches with the corresponding
>       // partition colName at currentDepth
>       Path nextPath = fileStatus.getPath();
>       String[] parts = nextPath.getName().split("=");
>       if (parts.length != 2) {
>         logOrThrowExceptionWithMsg("Invalid partition name " + nextPath);
>       } else if (!parts[0].equalsIgnoreCase(partColNames.get(currentDepth))) {
>         logOrThrowExceptionWithMsg(
>             "Unexpected partition key " + parts[0] + " found at " + nextPath);
>       } else {
>         // add sub-directory to the work queue if maxDepth is not yet reached
>         pendingPaths.add(new PathDepthInfo(nextPath, currentDepth + 1));
>       }
>     }
>   }
>   if (currentDepth == partColNames.size()) {
>     return currentPath;
>   }
> }
> {noformat}
> You can see that when the {{currentDepth}} is at the {{maxDepth}} it still does 
> an unnecessary listing of the files. We can improve this call by checking the 
> currentDepth and bailing out early.
> This can improve the performance of the msck command significantly, especially 
> when there are a lot of files in each partition on remote filesystems like S3 
> or ADLS.
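The early bail-out proposed above can be sketched as follows. This is a simplified illustration, not the attached patch: it replaces the Hadoop FileSystem with an in-memory map (the hypothetical `tree` parameter), omits the partition-name validation, and counts listings so the saving is visible. The depth check runs before the listing, so directories at the last level are never listed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Map;

final class MsckDepthSketch {
  /** A path paired with its depth below the table root. */
  static final class PathDepth {
    final String path;
    final int depth;

    PathDepth(String path, int depth) {
      this.path = path;
      this.depth = depth;
    }
  }

  private final Map<String, List<String>> tree; // in-memory stand-in for a filesystem
  private int listCalls = 0;

  MsckDepthSketch(Map<String, List<String>> tree) {
    this.tree = tree;
  }

  /** Stand-in for a directory listing; every call is counted. */
  private List<String> list(String path) {
    listCalls++;
    List<String> children = tree.get(path);
    return children == null ? Collections.<String>emptyList() : children;
  }

  /** Returns the paths found at maxDepth without listing their contents. */
  List<String> findPartitionPaths(String root, int maxDepth) {
    List<String> result = new ArrayList<>();
    Deque<PathDepth> pending = new ArrayDeque<>();
    pending.add(new PathDepth(root, 0));
    while (!pending.isEmpty()) {
      PathDepth pd = pending.poll();
      if (pd.depth == maxDepth) {
        result.add(pd.path); // bail out early: no listing at the last level
        continue;
      }
      for (String child : list(pd.path)) {
        pending.add(new PathDepth(child, pd.depth + 1));
      }
    }
    return result;
  }

  int listCalls() {
    return listCalls;
  }
}
```

With two partition columns, the walk lists the table root and the first-level directories only; the data files inside each leaf partition directory are never enumerated, which is where the cost on S3 or ADLS would otherwise come from.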





[jira] [Commented] (HIVE-21040) msck does unnecessary file listing at last level of directory tree

2018-12-18 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724382#comment-16724382
 ] 

Prasanth Jayachandran commented on HIVE-21040:
--

[~vihangk1] You can use a mock FS, similar to 
[https://github.com/apache/hive/blob/master/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java#L794],
 to get fs statistics. The test case at the above link does something similar to 
assert that only a few listStatus calls are made. 

> msck does unnecessary file listing at last level of directory tree
> --
>
> Key: HIVE-21040
> URL: https://issues.apache.org/jira/browse/HIVE-21040
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-21040.01.patch
>
>
> Here is the code snippet which is run by {{msck}} to list directories
> {noformat}
> final Path currentPath = pd.p;
> final int currentDepth = pd.depth;
> FileStatus[] fileStatuses = fs.listStatus(currentPath, FileUtils.HIDDEN_FILES_PATH_FILTER);
> // found no files under a sub-directory under table base path; it is possible that the table
> // is empty and hence there are no partition sub-directories created under base path
> if (fileStatuses.length == 0 && currentDepth > 0 && currentDepth < partColNames.size()) {
>   // since maxDepth is not yet reached, we are missing partition
>   // columns in currentPath
>   logOrThrowExceptionWithMsg(
>       "MSCK is missing partition columns under " + currentPath.toString());
> } else {
>   // found files under currentPath add them to the queue if it is a directory
>   for (FileStatus fileStatus : fileStatuses) {
>     if (!fileStatus.isDirectory() && currentDepth < partColNames.size()) {
>       // found a file at depth which is less than number of partition keys
>       logOrThrowExceptionWithMsg(
>           "MSCK finds a file rather than a directory when it searches for "
>               + fileStatus.getPath().toString());
>     } else if (fileStatus.isDirectory() && currentDepth < partColNames.size()) {
>       // found a sub-directory at a depth less than number of partition keys
>       // validate if the partition directory name matches with the corresponding
>       // partition colName at currentDepth
>       Path nextPath = fileStatus.getPath();
>       String[] parts = nextPath.getName().split("=");
>       if (parts.length != 2) {
>         logOrThrowExceptionWithMsg("Invalid partition name " + nextPath);
>       } else if (!parts[0].equalsIgnoreCase(partColNames.get(currentDepth))) {
>         logOrThrowExceptionWithMsg(
>             "Unexpected partition key " + parts[0] + " found at " + nextPath);
>       } else {
>         // add sub-directory to the work queue if maxDepth is not yet reached
>         pendingPaths.add(new PathDepthInfo(nextPath, currentDepth + 1));
>       }
>     }
>   }
>   if (currentDepth == partColNames.size()) {
>     return currentPath;
>   }
> }
> {noformat}
> You can see that when the {{currentDepth}} is at the {{maxDepth}} it still does 
> an unnecessary listing of the files. We can improve this call by checking the 
> currentDepth and bailing out early.
> This can improve the performance of the msck command significantly, especially 
> when there are a lot of files in each partition on remote filesystems like S3 
> or ADLS.





[jira] [Commented] (HIVE-21040) msck does unnecessary file listing at last level of directory tree

2018-12-18 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724383#comment-16724383
 ] 

Prasanth Jayachandran commented on HIVE-21040:
--

looks good otherwise. 

> msck does unnecessary file listing at last level of directory tree
> --
>
> Key: HIVE-21040
> URL: https://issues.apache.org/jira/browse/HIVE-21040
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-21040.01.patch
>
>
> Here is the code snippet which is run by {{msck}} to list directories
> {noformat}
> final Path currentPath = pd.p;
> final int currentDepth = pd.depth;
> FileStatus[] fileStatuses = fs.listStatus(currentPath, FileUtils.HIDDEN_FILES_PATH_FILTER);
> // found no files under a sub-directory under table base path; it is possible that the table
> // is empty and hence there are no partition sub-directories created under base path
> if (fileStatuses.length == 0 && currentDepth > 0 && currentDepth < partColNames.size()) {
>   // since maxDepth is not yet reached, we are missing partition
>   // columns in currentPath
>   logOrThrowExceptionWithMsg(
>       "MSCK is missing partition columns under " + currentPath.toString());
> } else {
>   // found files under currentPath add them to the queue if it is a directory
>   for (FileStatus fileStatus : fileStatuses) {
>     if (!fileStatus.isDirectory() && currentDepth < partColNames.size()) {
>       // found a file at depth which is less than number of partition keys
>       logOrThrowExceptionWithMsg(
>           "MSCK finds a file rather than a directory when it searches for "
>               + fileStatus.getPath().toString());
>     } else if (fileStatus.isDirectory() && currentDepth < partColNames.size()) {
>       // found a sub-directory at a depth less than number of partition keys
>       // validate if the partition directory name matches with the corresponding
>       // partition colName at currentDepth
>       Path nextPath = fileStatus.getPath();
>       String[] parts = nextPath.getName().split("=");
>       if (parts.length != 2) {
>         logOrThrowExceptionWithMsg("Invalid partition name " + nextPath);
>       } else if (!parts[0].equalsIgnoreCase(partColNames.get(currentDepth))) {
>         logOrThrowExceptionWithMsg(
>             "Unexpected partition key " + parts[0] + " found at " + nextPath);
>       } else {
>         // add sub-directory to the work queue if maxDepth is not yet reached
>         pendingPaths.add(new PathDepthInfo(nextPath, currentDepth + 1));
>       }
>     }
>   }
>   if (currentDepth == partColNames.size()) {
>     return currentPath;
>   }
> }
> {noformat}
> You can see that when the {{currentDepth}} is at the {{maxDepth}} it still does 
> an unnecessary listing of the files. We can improve this call by checking the 
> currentDepth and bailing out early.
> This can improve the performance of the msck command significantly, especially 
> when there are a lot of files in each partition on remote filesystems like S3 
> or ADLS.





[jira] [Updated] (HIVE-20785) Wrong key name in the JDBC DatabaseMetaData.getPrimaryKeys method

2018-12-17 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20785:
-
   Resolution: Fixed
Fix Version/s: 3.2.0
   4.0.0
   Status: Resolved  (was: Patch Available)

Committed to branch-3 and master. Thanks [~ggrossetie] for the contribution!

> Wrong key name in the JDBC DatabaseMetaData.getPrimaryKeys method
> -
>
> Key: HIVE-20785
> URL: https://issues.apache.org/jira/browse/HIVE-20785
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 3.1.0
>Reporter: Guillaume Grossetie
>Assignee: Guillaume Grossetie
>Priority: Major
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-20785.1.patch, patch.patch
>
>
> According to the documentation (1) the key should be {{KEY_SEQ}}, not {{KEQ_SEQ}}.
> Pull request available: https://github.com/apache/hive/pull/440
>  
> (1) 
> [https://docs.oracle.com/javase/8/docs/api/java/sql/DatabaseMetaData.html#getPrimaryKeys-java.lang.String-java.lang.String-java.lang.String-]
>  
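A small check along the following lines could catch a misspelled column label such as the one reported here. It is a sketch only: the class `PrimaryKeyColumnsSketch` and its helper names are hypothetical, while the expected labels are the ones the java.sql.DatabaseMetaData.getPrimaryKeys documentation lists.

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class PrimaryKeyColumnsSketch {
  /** Column labels documented for DatabaseMetaData.getPrimaryKeys. */
  static final List<String> EXPECTED = Arrays.asList(
      "TABLE_CAT", "TABLE_SCHEM", "TABLE_NAME", "COLUMN_NAME", "KEY_SEQ", "PK_NAME");

  /** Reads the column labels of a result set, e.g. one returned by getPrimaryKeys. */
  static List<String> labelsOf(ResultSet rs) throws SQLException {
    ResultSetMetaData md = rs.getMetaData();
    List<String> labels = new ArrayList<>();
    for (int i = 1; i <= md.getColumnCount(); i++) { // JDBC columns are 1-based
      labels.add(md.getColumnLabel(i));
    }
    return labels;
  }

  /** Returns the documented labels that are absent from the actual ones (case-insensitive). */
  static List<String> missing(List<String> actualLabels) {
    List<String> upper = new ArrayList<>();
    for (String label : actualLabels) {
      upper.add(label.toUpperCase(Locale.ROOT));
    }
    List<String> result = new ArrayList<>();
    for (String expected : EXPECTED) {
      if (!upper.contains(expected)) {
        result.add(expected);
      }
    }
    return result;
  }
}
```

A driver test could call `missing(labelsOf(metaData.getPrimaryKeys(null, null, "some_table")))` and assert that the result is empty; the table name there is a placeholder.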





[jira] [Updated] (HIVE-20785) Wrong key name in the JDBC DatabaseMetaData.getPrimaryKeys method

2018-12-17 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20785:
-
Attachment: HIVE-20785.1.patch

> Wrong key name in the JDBC DatabaseMetaData.getPrimaryKeys method
> -
>
> Key: HIVE-20785
> URL: https://issues.apache.org/jira/browse/HIVE-20785
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 3.1.0
>Reporter: Guillaume Grossetie
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20785.1.patch, patch.patch
>
>
> According to the documentation (1) the key should be {{KEY_SEQ}}, not {{KEQ_SEQ}}.
> Pull request available: https://github.com/apache/hive/pull/440
>  
> (1) 
> [https://docs.oracle.com/javase/8/docs/api/java/sql/DatabaseMetaData.html#getPrimaryKeys-java.lang.String-java.lang.String-java.lang.String-]
>  





[jira] [Commented] (HIVE-20785) Wrong key name in the JDBC DatabaseMetaData.getPrimaryKeys method

2018-12-17 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16723465#comment-16723465
 ] 

Prasanth Jayachandran commented on HIVE-20785:
--

{code:java}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce] 
(batchId=61)
org.apache.hadoop.hive.common.metrics.metrics2.TestCodahaleMetrics.testFileReporting
 (batchId=282){code}
These 2 test failures seem to be unrelated. Giving the patch another try to 
see if the tests are flaky. 

> Wrong key name in the JDBC DatabaseMetaData.getPrimaryKeys method
> -
>
> Key: HIVE-20785
> URL: https://issues.apache.org/jira/browse/HIVE-20785
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 3.1.0
>Reporter: Guillaume Grossetie
>Assignee: Guillaume Grossetie
>Priority: Major
> Attachments: HIVE-20785.1.patch, patch.patch
>
>
> According to the documentation (1) the key should be {{KEY_SEQ}}, not {{KEQ_SEQ}}.
> Pull request available: https://github.com/apache/hive/pull/440
>  
> (1) 
> [https://docs.oracle.com/javase/8/docs/api/java/sql/DatabaseMetaData.html#getPrimaryKeys-java.lang.String-java.lang.String-java.lang.String-]
>  





[jira] [Assigned] (HIVE-20785) Wrong key name in the JDBC DatabaseMetaData.getPrimaryKeys method

2018-12-17 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-20785:


Assignee: Prasanth Jayachandran  (was: Guillaume Grossetie)

> Wrong key name in the JDBC DatabaseMetaData.getPrimaryKeys method
> -
>
> Key: HIVE-20785
> URL: https://issues.apache.org/jira/browse/HIVE-20785
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 3.1.0
>Reporter: Guillaume Grossetie
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20785.1.patch, patch.patch
>
>
> According to the documentation (1) the key should be {{KEY_SEQ}}, not {{KEQ_SEQ}}.
> Pull request available: https://github.com/apache/hive/pull/440
>  
> (1) 
> [https://docs.oracle.com/javase/8/docs/api/java/sql/DatabaseMetaData.html#getPrimaryKeys-java.lang.String-java.lang.String-java.lang.String-]
>  





[jira] [Assigned] (HIVE-20785) Wrong key name in the JDBC DatabaseMetaData.getPrimaryKeys method

2018-12-17 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-20785:


Assignee: Guillaume Grossetie  (was: Prasanth Jayachandran)

> Wrong key name in the JDBC DatabaseMetaData.getPrimaryKeys method
> -
>
> Key: HIVE-20785
> URL: https://issues.apache.org/jira/browse/HIVE-20785
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 3.1.0
>Reporter: Guillaume Grossetie
>Assignee: Guillaume Grossetie
>Priority: Major
> Attachments: HIVE-20785.1.patch, patch.patch
>
>
> According to the documentation (1) the key should be {{KEY_SEQ}}, not {{KEQ_SEQ}}.
> Pull request available: https://github.com/apache/hive/pull/440
>  
> (1) 
> [https://docs.oracle.com/javase/8/docs/api/java/sql/DatabaseMetaData.html#getPrimaryKeys-java.lang.String-java.lang.String-java.lang.String-]
>  





[jira] [Updated] (HIVE-20979) Fix memory leak in hive streaming

2018-12-10 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20979:
-
   Resolution: Fixed
Fix Version/s: 3.1.1
   4.0.0
   Status: Resolved  (was: Patch Available)

Committed to branch-3 and master. Thanks [~ShubhamChaurasia] for the 
contribution. 

> Fix memory leak in hive streaming
> -
>
> Key: HIVE-20979
> URL: https://issues.apache.org/jira/browse/HIVE-20979
> Project: Hive
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 3.1.1
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.1.1
>
> Attachments: HIVE-20979.1.patch, HIVE-20979.1.patch, 
> HIVE-20979.2.patch, HIVE-20979.3.patch, HIVE-20979.4.patch, HIVE-20979.5.patch
>
>
> 1) HiveStreamingConnection.Builder#init() adds a shutdown hook handler via 
> {{ShutdownHookManager.addShutdownHook}}, but it is never removed, which causes 
> all the handlers to accumulate and hence a memory leak.
> 2) AbstractRecordWriter creates an instance of FileSystem but does not close 
> it, which in turn causes a leak due to accumulation in FileSystem$Cache#map.
>  
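The first leak above follows a common pattern: a hook is registered per connection and never unregistered, so the JVM keeps a reference to each one until exit. A minimal sketch of the remedy, using the JDK's own Runtime shutdown hooks rather than Hadoop's ShutdownHookManager (a deliberate substitution to keep the example self-contained), looks like this. The class name `HookedConnectionSketch` is hypothetical.

```java
final class HookedConnectionSketch implements AutoCloseable {
  private final Thread hook;
  private boolean closed = false;

  HookedConnectionSketch() {
    // Registered so resources are released even if the caller never calls close().
    hook = new Thread(this::releaseResources, "connection-shutdown-hook");
    Runtime.getRuntime().addShutdownHook(hook);
  }

  private void releaseResources() {
    // Placeholder for flushing writers, closing file handles, and so on.
  }

  Thread hook() {
    return hook;
  }

  @Override
  public void close() {
    if (closed) {
      return; // idempotent: a second close() does nothing
    }
    closed = true;
    releaseResources();
    // Without this call every connection ever created stays reachable via its hook.
    Runtime.getRuntime().removeShutdownHook(hook);
  }
}
```

Used with try-with-resources, the hook is removed as soon as the connection goes out of scope. The same pattern of pairing acquisition with release applies to the second leak, where the FileSystem instance is never closed.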





[jira] [Commented] (HIVE-20979) Fix memory leak in hive streaming

2018-11-28 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16701973#comment-16701973
 ] 

Prasanth Jayachandran commented on HIVE-20979:
--

+1, pending tests. [~ShubhamChaurasia] could you please re-upload the patch (bumping 
the version) until you get a green test run (a requirement for commit)?

> Fix memory leak in hive streaming
> -
>
> Key: HIVE-20979
> URL: https://issues.apache.org/jira/browse/HIVE-20979
> Project: Hive
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 3.1.1
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Critical
>  Labels: pull-request-available
> Attachments: HIVE-20979.1.patch, HIVE-20979.1.patch
>
>
> 1) HiveStreamingConnection.Builder#init() adds a shutdown hook handler via 
> {{ShutdownHookManager.addShutdownHook}}, but it is never removed, which causes 
> all the handlers to accumulate and hence a memory leak.
> 2) AbstractRecordWriter creates an instance of FileSystem but does not close 
> it, which in turn causes a leak due to accumulation in FileSystem$Cache#map.
>  





[jira] [Commented] (HIVE-20935) Upload of llap package tarball fails in EC2 causing LLAP service start failure

2018-11-16 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16690066#comment-16690066
 ] 

Prasanth Jayachandran commented on HIVE-20935:
--

+1

> Upload of llap package tarball fails in EC2 causing LLAP service start failure
> --
>
> Key: HIVE-20935
> URL: https://issues.apache.org/jira/browse/HIVE-20935
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.1.1
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
> Attachments: HIVE-20935.01.patch
>
>
> Even though package dir is defined as below (with a / at the end) -
> {code}
> LLAP_PACKAGE_DIR = ".yarn/package/LLAP/";
> {code}
> copyLocalFileToHdfs API fails to create the dir hierarchy of 
> .yarn/package/LLAP/ first and then copy the file under it. It instead uploads 
> the file under .yarn/package with name LLAP.
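One way to avoid depending on the trailing slash is to build the destination explicitly as directory plus file name and to create the directory hierarchy before copying. The sketch below shows that idea with java.nio.file on a local filesystem; it is an assumption-laden illustration, not the HDFS code from the patch, and `PackageDirSketch`, `destinationFor`, and `copyInto` are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

final class PackageDirSketch {
  static final String LLAP_PACKAGE_DIR = ".yarn/package/LLAP/";

  /** Destination is always <packageDir>/<file name>, with or without a trailing slash. */
  static Path destinationFor(String packageDir, Path localFile) {
    return Paths.get(packageDir).resolve(localFile.getFileName());
  }

  /** Creates the directory hierarchy first, then copies the file under it. */
  static Path copyInto(String packageDir, Path localFile) throws IOException {
    Path dest = destinationFor(packageDir, localFile);
    Files.createDirectories(dest.getParent()); // e.g. .yarn/package/LLAP
    return Files.copy(localFile, dest, StandardCopyOption.REPLACE_EXISTING);
  }
}
```

With this shape the tarball ends up under the LLAP directory instead of being written as a file named LLAP.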





[jira] [Updated] (HIVE-20899) Keytab URI for LLAP YARN Service is restrictive to support HDFS only

2018-11-12 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20899:
-
   Resolution: Fixed
Fix Version/s: 3.2.0
   4.0.0
   Status: Resolved  (was: Patch Available)

Committed to master and branch-3. Thanks [~gsaha] for the patch!

> Keytab URI for LLAP YARN Service is restrictive to support HDFS only
> 
>
> Key: HIVE-20899
> URL: https://issues.apache.org/jira/browse/HIVE-20899
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.1.1
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-20899.01.patch, HIVE-20899.02.patch, 
> HIVE-20899.03.patch, HIVE-20899.04.patch
>
>
> llap-server/src/main/resources/package.py restricts the keytab URI to support 
> HDFS only and hence fails for other FileSystem API conforming FSs like s3a, 
> wasb, gs, etc.
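The relaxed validation described above could look roughly like the following. The real check lives in package.py and is not shown in this thread, so this Java sketch is only an assumption about its shape: accept any URI that carries a scheme and a path, instead of requiring the hdfs scheme. `KeytabUriSketch` and `isAcceptableKeytabUri` are hypothetical names.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class KeytabUriSketch {
  /** Accepts any absolute URI with a path (hdfs, s3a, wasb, gs, ...), not hdfs only. */
  static boolean isAcceptableKeytabUri(String value) {
    try {
      URI uri = new URI(value);
      String path = uri.getPath();
      return uri.getScheme() != null && path != null && !path.isEmpty();
    } catch (URISyntaxException e) {
      return false; // not parseable as a URI at all
    }
  }
}
```

Whether the target filesystem is actually reachable is left to the FileSystem implementation at run time; the check only stops rejecting valid non-HDFS locations up front.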





[jira] [Assigned] (HIVE-20899) Keytab URI for LLAP YARN Service is restrictive to support HDFS only

2018-11-12 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-20899:


Assignee: Prasanth Jayachandran  (was: Gour Saha)

> Keytab URI for LLAP YARN Service is restrictive to support HDFS only
> 
>
> Key: HIVE-20899
> URL: https://issues.apache.org/jira/browse/HIVE-20899
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.1.1
>Reporter: Gour Saha
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20899.01.patch, HIVE-20899.02.patch, 
> HIVE-20899.03.patch, HIVE-20899.04.patch
>
>
> llap-server/src/main/resources/package.py restricts the keytab URI to support 
> HDFS only and hence fails for other FileSystem API conforming FSs like s3a, 
> wasb, gs, etc.





[jira] [Commented] (HIVE-20899) Keytab URI for LLAP YARN Service is restrictive to support HDFS only

2018-11-12 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16684105#comment-16684105
 ] 

Prasanth Jayachandran commented on HIVE-20899:
--

Yes. The failures are unrelated. But we need a green run to commit as per the 
new rules. 

> Keytab URI for LLAP YARN Service is restrictive to support HDFS only
> 
>
> Key: HIVE-20899
> URL: https://issues.apache.org/jira/browse/HIVE-20899
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.1.1
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
> Attachments: HIVE-20899.01.patch, HIVE-20899.02.patch, 
> HIVE-20899.03.patch, HIVE-20899.04.patch
>
>
> llap-server/src/main/resources/package.py restricts the keytab URI to support 
> HDFS only and hence fails for other FileSystem API conforming FSs like s3a, 
> wasb, gs, etc.





[jira] [Updated] (HIVE-20899) Keytab URI for LLAP YARN Service is restrictive to support HDFS only

2018-11-12 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20899:
-
Attachment: HIVE-20899.04.patch

> Keytab URI for LLAP YARN Service is restrictive to support HDFS only
> 
>
> Key: HIVE-20899
> URL: https://issues.apache.org/jira/browse/HIVE-20899
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.1.1
>Reporter: Gour Saha
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20899.01.patch, HIVE-20899.02.patch, 
> HIVE-20899.03.patch, HIVE-20899.04.patch
>
>
> llap-server/src/main/resources/package.py restricts the keytab URI to support 
> HDFS only and hence fails for other FileSystem API conforming FSs like s3a, 
> wasb, gs, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20899) Keytab URI for LLAP YARN Service is restrictive to support HDFS only

2018-11-12 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-20899:


Assignee: Gour Saha  (was: Prasanth Jayachandran)

> Keytab URI for LLAP YARN Service is restrictive to support HDFS only
> 
>
> Key: HIVE-20899
> URL: https://issues.apache.org/jira/browse/HIVE-20899
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.1.1
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
> Attachments: HIVE-20899.01.patch, HIVE-20899.02.patch, 
> HIVE-20899.03.patch, HIVE-20899.04.patch
>
>
> llap-server/src/main/resources/package.py restricts the keytab URI to support 
> HDFS only and hence fails for other FileSystem API conforming FSs like s3a, 
> wasb, gs, etc.





[jira] [Assigned] (HIVE-20899) Keytab URI for LLAP YARN Service is restrictive to support HDFS only

2018-11-12 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-20899:


Assignee: Gour Saha  (was: Prasanth Jayachandran)

> Keytab URI for LLAP YARN Service is restrictive to support HDFS only
> 
>
> Key: HIVE-20899
> URL: https://issues.apache.org/jira/browse/HIVE-20899
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.1.1
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
> Attachments: HIVE-20899.01.patch, HIVE-20899.02.patch, 
> HIVE-20899.03.patch
>
>
> llap-server/src/main/resources/package.py restricts the keytab URI to support 
> HDFS only and hence fails for other FileSystem API conforming FSs like s3a, 
> wasb, gs, etc.





[jira] [Updated] (HIVE-20899) Keytab URI for LLAP YARN Service is restrictive to support HDFS only

2018-11-12 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20899:
-
Attachment: HIVE-20899.03.patch

> Keytab URI for LLAP YARN Service is restrictive to support HDFS only
> 
>
> Key: HIVE-20899
> URL: https://issues.apache.org/jira/browse/HIVE-20899
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.1.1
>Reporter: Gour Saha
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20899.01.patch, HIVE-20899.02.patch, 
> HIVE-20899.03.patch
>
>
> llap-server/src/main/resources/package.py restricts the keytab URI to support 
> HDFS only and hence fails for other FileSystem API conforming FSs like s3a, 
> wasb, gs, etc.





[jira] [Assigned] (HIVE-20899) Keytab URI for LLAP YARN Service is restrictive to support HDFS only

2018-11-12 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-20899:


Assignee: Prasanth Jayachandran  (was: Gour Saha)

> Keytab URI for LLAP YARN Service is restrictive to support HDFS only
> 
>
> Key: HIVE-20899
> URL: https://issues.apache.org/jira/browse/HIVE-20899
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.1.1
>Reporter: Gour Saha
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20899.01.patch, HIVE-20899.02.patch, 
> HIVE-20899.03.patch
>
>
> llap-server/src/main/resources/package.py restricts the keytab URI to support 
> HDFS only and hence fails for other FileSystem API conforming FSs like s3a, 
> wasb, gs, etc.





[jira] [Updated] (HIVE-20899) Keytab URI for LLAP YARN Service is restrictive to support HDFS only

2018-11-11 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20899:
-
Attachment: HIVE-20899.02.patch

> Keytab URI for LLAP YARN Service is restrictive to support HDFS only
> 
>
> Key: HIVE-20899
> URL: https://issues.apache.org/jira/browse/HIVE-20899
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.1.1
>Reporter: Gour Saha
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20899.01.patch, HIVE-20899.02.patch
>
>
> llap-server/src/main/resources/package.py restricts the keytab URI to support 
> HDFS only and hence fails for other FileSystem API conforming FSs like s3a, 
> wasb, gs, etc.





[jira] [Commented] (HIVE-20899) Keytab URI for LLAP YARN Service is restrictive to support HDFS only

2018-11-11 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683318#comment-16683318
 ] 

Prasanth Jayachandran commented on HIVE-20899:
--

+1

> Keytab URI for LLAP YARN Service is restrictive to support HDFS only
> 
>
> Key: HIVE-20899
> URL: https://issues.apache.org/jira/browse/HIVE-20899
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.1.1
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
> Attachments: HIVE-20899.01.patch
>
>
> llap-server/src/main/resources/package.py restricts the keytab URI to support 
> HDFS only and hence fails for other FileSystem API conforming FSs like s3a, 
> wasb, gs, etc.





[jira] [Commented] (HIVE-20899) Keytab URI for LLAP YARN Service is restrictive to support HDFS only

2018-11-11 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683320#comment-16683320
 ] 

Prasanth Jayachandran commented on HIVE-20899:
--

resubmitted patch for green run

> Keytab URI for LLAP YARN Service is restrictive to support HDFS only
> 
>
> Key: HIVE-20899
> URL: https://issues.apache.org/jira/browse/HIVE-20899
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.1.1
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
> Attachments: HIVE-20899.01.patch, HIVE-20899.02.patch
>
>
> llap-server/src/main/resources/package.py restricts the keytab URI to support 
> HDFS only and hence fails for other FileSystem API conforming FSs like s3a, 
> wasb, gs, etc.





[jira] [Assigned] (HIVE-20899) Keytab URI for LLAP YARN Service is restrictive to support HDFS only

2018-11-11 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-20899:


Assignee: Gour Saha  (was: Prasanth Jayachandran)

> Keytab URI for LLAP YARN Service is restrictive to support HDFS only
> 
>
> Key: HIVE-20899
> URL: https://issues.apache.org/jira/browse/HIVE-20899
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.1.1
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
> Attachments: HIVE-20899.01.patch, HIVE-20899.02.patch
>
>
> llap-server/src/main/resources/package.py restricts the keytab URI to support 
> HDFS only and hence fails for other FileSystem API conforming FSs like s3a, 
> wasb, gs, etc.





[jira] [Assigned] (HIVE-20899) Keytab URI for LLAP YARN Service is restrictive to support HDFS only

2018-11-11 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-20899:


Assignee: Prasanth Jayachandran  (was: Gour Saha)

> Keytab URI for LLAP YARN Service is restrictive to support HDFS only
> 
>
> Key: HIVE-20899
> URL: https://issues.apache.org/jira/browse/HIVE-20899
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.1.1
>Reporter: Gour Saha
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20899.01.patch
>
>
> llap-server/src/main/resources/package.py restricts the keytab URI to support 
> HDFS only and hence fails for other FileSystem API conforming FSs like s3a, 
> wasb, gs, etc.





[jira] [Commented] (HIVE-20484) Disable Block Cache By Default With HBase SerDe

2018-11-08 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680520#comment-16680520
 ] 

Prasanth Jayachandran commented on HIVE-20484:
--

+1

> Disable Block Cache By Default With HBase SerDe
> ---
>
> Key: HIVE-20484
> URL: https://issues.apache.org/jira/browse/HIVE-20484
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Affects Versions: 1.2.3, 2.4.0, 4.0.0, 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: HIVE-20484.1.patch, HIVE-20484.2.patch
>
>
> {quote}
> Scan instances can be set to use the block cache in the RegionServer via the 
> setCacheBlocks method. For input Scans to MapReduce jobs, this should be 
> false. 
> https://hbase.apache.org/book.html#perf.hbase.client.blockcache
> {quote}
> However, from the Hive code, we can see that this is not the case.
> {code}
> public static final String HBASE_SCAN_CACHEBLOCKS = "hbase.scan.cacheblock";
> ...
> String scanCacheBlocks = 
> tableProperties.getProperty(HBaseSerDe.HBASE_SCAN_CACHEBLOCKS);
> if (scanCacheBlocks != null) {
>   jobProperties.put(HBaseSerDe.HBASE_SCAN_CACHEBLOCKS, scanCacheBlocks);
> }
> ...
> String scanCacheBlocks = jobConf.get(HBaseSerDe.HBASE_SCAN_CACHEBLOCKS);
> if (scanCacheBlocks != null) {
>   scan.setCacheBlocks(Boolean.parseBoolean(scanCacheBlocks));
> }
> {code}
> In the Hive code, we can see that if {{hbase.scan.cacheblock}} is not 
> specified in the {{SERDEPROPERTIES}} then {{setCacheBlocks}} is not called 
> and the default value of the HBase {{Scan}} class is used.
> {code:java|title=Scan.java}
>   /**
>    * Set whether blocks should be cached for this Scan.
>    *
>    * This is true by default.  When true, default settings of the table and
>    * family are used (this will never override caching blocks if the block
>    * cache is disabled for that family or entirely).
>    *
>    * @param cacheBlocks if false, default settings are overridden and blocks
>    * will not be cached
>    */
>   public Scan setCacheBlocks(boolean cacheBlocks) {
>     this.cacheBlocks = cacheBlocks;
>     return this;
>   }
> {code}
> Hive is doing full scans of the table with MapReduce/Spark and therefore, 
> according to the HBase docs, the default behavior here should be that blocks 
> are not cached.  Hive should set this value to "false" by default unless the 
> table {{SERDEPROPERTIES}} override this.
> {code:sql}
> -- Commands for HBase
> -- create 'test', 't'
> CREATE EXTERNAL TABLE test(value map<string,string>, row_key string) 
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
> "hbase.columns.mapping" = "t:,:key",
> "hbase.scan.cacheblock" = "false"
> );
> {code}
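
The defaulting behavior requested above can be sketched as follows. This is a minimal illustration, not the committed patch: the class name ScanCacheBlocksDefault and the helper resolveCacheBlocks are hypothetical, and java.util.Properties stands in for the table SERDEPROPERTIES. Only the property name hbase.scan.cacheblock comes from the issue text.

```java
import java.util.Properties;

class ScanCacheBlocksDefault {
    // Property name as quoted in the issue description.
    static final String HBASE_SCAN_CACHEBLOCKS = "hbase.scan.cacheblock";

    /**
     * Hypothetical helper: decides the value that would be passed to
     * Scan.setCacheBlocks. Falls back to false when the table properties
     * do not set the key, instead of leaving the HBase default (true).
     */
    static boolean resolveCacheBlocks(Properties tableProperties) {
        String value = tableProperties.getProperty(HBASE_SCAN_CACHEBLOCKS, "false");
        return Boolean.parseBoolean(value);
    }

    public static void main(String[] args) {
        Properties unset = new Properties();
        Properties enabled = new Properties();
        enabled.setProperty(HBASE_SCAN_CACHEBLOCKS, "true");

        System.out.println("unset   -> " + resolveCacheBlocks(unset));
        System.out.println("enabled -> " + resolveCacheBlocks(enabled));
    }
}
```

With this shape, setCacheBlocks would always be called, and a table that wants caching has to opt in explicitly through its properties.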





[jira] [Commented] (HIVE-20161) Do Not Print StackTraces to STDERR in ParseDriver

2018-11-08 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680513#comment-16680513
 ] 

Prasanth Jayachandran commented on HIVE-20161:
--

+1

> Do Not Print StackTraces to STDERR in ParseDriver
> -
>
> Key: HIVE-20161
> URL: https://issues.apache.org/jira/browse/HIVE-20161
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20161.1.patch
>
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java
> {code}
> // Do not print stack trace to STDERR - remove this, just throw the
> // HiveException
> } catch (Exception e) {
>   e.printStackTrace();
>   throw new HiveException(e);
> }
> ...
> // Do not log and throw.  log *or* throw.  In this case, just throw. Remove
> // logging.
> // Remove explicit 'return' call. No need for it.
>   try {
>     skewJoinKeyContext.endGroup();
>   } catch (IOException e) {
>     LOG.error(e.getMessage(), e);
>     throw new HiveException(e);
>   }
>   return;
> {code}
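
A minimal sketch of the "throw, do not also print or log" pattern described above. It is an illustration only: SketchException is a stand-in for HiveException so the example compiles without Hive, and the Group interface and closeGroup method are hypothetical names.

```java
import java.io.IOException;

class WrapAndThrowSketch {
    // Stand-in for HiveException; keeps the original exception as the cause.
    static class SketchException extends Exception {
        SketchException(Throwable cause) {
            super(cause);
        }
    }

    // Hypothetical stand-in for the object whose endGroup() may fail.
    interface Group {
        void endGroup() throws IOException;
    }

    static void closeGroup(Group group) throws SketchException {
        try {
            group.endGroup();
        } catch (IOException e) {
            // No printStackTrace() and no LOG.error() here: the cause travels
            // with the wrapper, and the caller decides how to report it.
            throw new SketchException(e);
        }
        // No trailing 'return;' is needed in a void method.
    }

    public static void main(String[] args) {
        try {
            closeGroup(() -> { throw new IOException("simulated failure"); });
        } catch (SketchException e) {
            System.out.println("caller reports: " + e.getCause().getMessage());
        }
    }
}
```

Because the wrapper keeps the cause, nothing is lost by dropping the local print or log call; the error is reported once, by whichever layer finally handles it.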





[jira] [Commented] (HIVE-20160) Do Not Print StackTraces to STDERR in OperatorFactory

2018-11-08 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680514#comment-16680514
 ] 

Prasanth Jayachandran commented on HIVE-20160:
--

+1

> Do Not Print StackTraces to STDERR in OperatorFactory
> -
>
> Key: HIVE-20160
> URL: https://issues.apache.org/jira/browse/HIVE-20160
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20160.1.patch
>
>
> https://github.com/apache/hive/blob/ac6b2a3fb195916e22b2e5f465add2ffbcdc7430/ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java#L158
> {code}
> } catch (Exception e) {
>   e.printStackTrace();
>   throw new HiveException(...
> {code}
> Do not print the stack trace. The error is already wrapped in a 
> HiveException. Let the code that catches this exception print the error to a 
> logger instead of dumping it here to STDERR. There are several instances of 
> this in the class.





[jira] [Commented] (HIVE-20223) SmallTableCache.java SLF4J Parameterized Logging

2018-11-08 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680512#comment-16680512
 ] 

Prasanth Jayachandran commented on HIVE-20223:
--

+1

> SmallTableCache.java SLF4J Parameterized Logging
> 
>
> Key: HIVE-20223
> URL: https://issues.apache.org/jira/browse/HIVE-20223
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
>  Labels: newbie, noob
> Attachments: HIVE-20223.1.patch
>
>
> {code:java|title=org/apache/hadoop/hive/ql/exec/spark/SmallTableCache.java}
> if (LOG.isDebugEnabled()) {
>   LOG.debug("Cleaned up small table cache for query " + queryId);
> }
> if (tableContainerMap.putIfAbsent(path, tableContainer) == null
>     && LOG.isDebugEnabled()) {
>   LOG.debug("Cached small table file " + path + " for query " + queryId);
> }
> if (tableContainer != null && LOG.isDebugEnabled()) {
>   LOG.debug("Loaded small table file " + path + " from cache for query "
>       + queryId);
> }
> {code}
>  
> Remove {{isDebugEnabled}} and replace with parameterized logging.
> https://www.slf4j.org/faq.html#logging_performance
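
With SLF4J, the first statement above would become LOG.debug("Cleaned up small table cache for query {}", queryId); with no surrounding isDebugEnabled() check. The sketch below shows why that is cheap when debug logging is off. It does not use SLF4J itself: MiniLogger is a hypothetical stand-in that models only the deferred formatting, and CountingId is a hypothetical helper that counts toString() calls.

```java
import java.util.regex.Matcher;

class ParameterizedLoggingSketch {
    // Hypothetical stand-in for a logger with {} placeholders.
    static class MiniLogger {
        final boolean debugEnabled;
        final StringBuilder out = new StringBuilder();

        MiniLogger(boolean debugEnabled) {
            this.debugEnabled = debugEnabled;
        }

        void debug(String format, Object... args) {
            if (!debugEnabled) {
                return; // arguments are never converted to strings
            }
            String message = format;
            for (Object arg : args) {
                // Replace the next {} placeholder with the argument's text.
                message = message.replaceFirst("\\{\\}",
                        Matcher.quoteReplacement(String.valueOf(arg)));
            }
            out.append(message).append('\n');
        }
    }

    // Counts toString() calls so the cost of building the message is visible.
    static class CountingId {
        int conversions = 0;

        @Override
        public String toString() {
            conversions++;
            return "query-1";
        }
    }

    public static void main(String[] args) {
        MiniLogger log = new MiniLogger(false);
        CountingId queryId = new CountingId();
        log.debug("Cleaned up small table cache for query {}", queryId);
        System.out.println("toString calls with debug off: " + queryId.conversions);
    }
}
```

With string concatenation the message would be built before the logger is even consulted, which is the reason the original code needed the explicit guard.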





[jira] [Comment Edited] (HIVE-20892) Benchmark XXhash for 64 bit hashing function instead of Murmum hash

2018-11-08 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680441#comment-16680441
 ] 

Prasanth Jayachandran edited comment on HIVE-20892 at 11/8/18 9:40 PM:
---

[https://github.com/prasanthj/hasher] has a perf comparison of some of the 
non-cryptographic hashing algorithms. 

Murmur2 performs slightly better than Murmur3, but Murmur3 was chosen for the 
bloom filter and HLL in Hive for the reason given in 
[https://github.com/apache/hive/blob/master/storage-api/src/java/org/apache/hive/common/util/BloomFilter.java#L37-L40].


was (Author: prasanth_j):
[https://github.com/prasanthj/hasher]

Murmur2 performs slightly better than Murmur3, but Murmur3 was chosen for the 
bloom filter and HLL in Hive for the reason given in 
[https://github.com/apache/hive/blob/master/storage-api/src/java/org/apache/hive/common/util/BloomFilter.java#L37-L40].

> Benchmark XXhash for 64 bit hashing function instead of Murmum hash
> ---
>
> Key: HIVE-20892
> URL: https://issues.apache.org/jira/browse/HIVE-20892
> Project: Hive
>  Issue Type: Sub-task
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>
> https://cyan4973.github.io/xxHash/
> FYI, this is used by a lot of other MPP systems ...





[jira] [Commented] (HIVE-20831) Add Session ID to Operation Logging

2018-11-08 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680510#comment-16680510
 ] 

Prasanth Jayachandran commented on HIVE-20831:
--

+1

> Add Session ID to Operation Logging
> ---
>
> Key: HIVE-20831
> URL: https://issues.apache.org/jira/browse/HIVE-20831
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.1.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
>  Labels: newbie, noob
> Attachments: HIVE-20831.1.patch
>
>
> {code:java|title=OperationManager.java}
> LOG.info("Adding operation: " + operation.getHandle());
> {code}
> Please add additional logging to explicitly state which Hive session this 
> operation is being added to.
> https://github.com/apache/hive/blob/3963c729fabf90009cb67d277d40fe5913936358/service/src/java/org/apache/hive/service/cli/operation/OperationManager.java#L201





[jira] [Commented] (HIVE-20892) Benchmark XXhash for 64 bit hashing function instead of Murmum hash

2018-11-08 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680441#comment-16680441
 ] 

Prasanth Jayachandran commented on HIVE-20892:
--

[https://github.com/prasanthj/hasher]

Murmur2 performs slightly better than Murmur3, but Murmur3 was chosen for the 
bloom filter and HLL in Hive for the reason given in 
[https://github.com/apache/hive/blob/master/storage-api/src/java/org/apache/hive/common/util/BloomFilter.java#L37-L40].

> Benchmark XXhash for 64 bit hashing function instead of Murmum hash
> ---
>
> Key: HIVE-20892
> URL: https://issues.apache.org/jira/browse/HIVE-20892
> Project: Hive
>  Issue Type: Sub-task
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>
> https://cyan4973.github.io/xxHash/
> FYI, this is used by a lot of other MPP systems ...





[jira] [Updated] (HIVE-20707) Automatic partition management

2018-11-06 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20707:
-
Attachment: HIVE-20707-branch-3.patch

> Automatic partition management
> --
>
> Key: HIVE-20707
> URL: https://issues.apache.org/jira/browse/HIVE-20707
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20702.3.patch, HIVE-20707-branch-3.patch, 
> HIVE-20707.1.patch, HIVE-20707.2.patch, HIVE-20707.4.patch, 
> HIVE-20707.5.patch, HIVE-20707.6.patch, HIVE-20707.6.patch, HIVE-20707.7.patch
>
>
> Currently, adding partitions of external tables to the metastore requires 
> running the MSCK REPAIR command manually. To avoid this manual step, a table 
> property can be set on an external table so that a background metastore 
> thread syncs its partitions periodically. Tables can also specify a partition 
> retention period. Any partition whose age exceeds the retention period will 
> be dropped automatically.
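
The retention rule in the last two sentences can be sketched as below. This is only an illustration of the rule, not the metastore implementation: the class and method names are hypothetical, and measuring a partition's age from its creation time is an assumption, since the description does not say how age is computed.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PartitionRetentionSketch {
    /**
     * Returns the names of partitions whose age (now minus creation time,
     * an assumption of this sketch) strictly exceeds the retention period.
     */
    static List<String> expiredPartitions(Map<String, Instant> createTimes,
                                          Duration retention, Instant now) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Instant> entry : createTimes.entrySet()) {
            Duration age = Duration.between(entry.getValue(), now);
            if (age.compareTo(retention) > 0) {
                expired.add(entry.getKey());
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2018-11-06T00:00:00Z");
        Map<String, Instant> partitions = new LinkedHashMap<>();
        partitions.put("ds=2018-10-01", Instant.parse("2018-10-01T00:00:00Z"));
        partitions.put("ds=2018-11-05", Instant.parse("2018-11-05T00:00:00Z"));

        // With a 7-day retention period only the October partition qualifies.
        System.out.println(expiredPartitions(partitions, Duration.ofDays(7), now));
    }
}
```

A background task would run a check like this periodically and drop whatever it returns; passing the current time in as a parameter keeps the rule easy to test.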





[jira] [Assigned] (HIVE-20876) Use tez provided AM registry client for external sessions

2018-11-06 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-20876:



> Use tez provided AM registry client for external sessions
> -
>
> Key: HIVE-20876
> URL: https://issues.apache.org/jira/browse/HIVE-20876
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>
> Continuation of HIVE-20547: replace the Hive-side AM registry for external 
> sessions with the one provided by Tez. 





[jira] [Updated] (HIVE-20841) LLAP: Make dynamic ports configurable

2018-10-30 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20841:
-
Status: Patch Available  (was: Open)

> LLAP: Make dynamic ports configurable
> -
>
> Key: HIVE-20841
> URL: https://issues.apache.org/jira/browse/HIVE-20841
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-20841.1.patch
>
>
> Some ports in the LLAP -> Tez interaction code are assigned dynamically. 
> Provide an option to make them configurable so that they can be added to 
> iptables rules in some environments. 





[jira] [Commented] (HIVE-20841) LLAP: Make dynamic ports configurable

2018-10-30 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669383#comment-16669383
 ] 

Prasanth Jayachandran commented on HIVE-20841:
--

[~sershe] can you please take a look? small patch

> LLAP: Make dynamic ports configurable
> -
>
> Key: HIVE-20841
> URL: https://issues.apache.org/jira/browse/HIVE-20841
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-20841.1.patch
>
>
> Some ports in the LLAP -> Tez interaction code are assigned dynamically. 
> Provide an option to make them configurable so that they can be added to 
> iptables rules in some environments. 





[jira] [Updated] (HIVE-20841) LLAP: Make dynamic ports configurable

2018-10-30 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20841:
-
Attachment: HIVE-20841.1.patch

> LLAP: Make dynamic ports configurable
> -
>
> Key: HIVE-20841
> URL: https://issues.apache.org/jira/browse/HIVE-20841
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-20841.1.patch
>
>
> Some ports in the LLAP -> Tez interaction code are assigned dynamically. 
> Provide an option to make them configurable so that they can be added to 
> iptables rules in some environments. 





[jira] [Assigned] (HIVE-20841) LLAP: Make dynamic ports configurable

2018-10-30 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-20841:



> LLAP: Make dynamic ports configurable
> -
>
> Key: HIVE-20841
> URL: https://issues.apache.org/jira/browse/HIVE-20841
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Fix For: 4.0.0, 3.2.0
>
>
> Some ports in the LLAP -> Tez interaction code are assigned dynamically. 
> Provide an option to make them configurable so that they can be added to 
> iptables rules in some environments. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20707) Automatic partition management

2018-10-29 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20707:
-
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks Jason for the review!

> Automatic partition management
> --
>
> Key: HIVE-20707
> URL: https://issues.apache.org/jira/browse/HIVE-20707
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20702.3.patch, HIVE-20707.1.patch, 
> HIVE-20707.2.patch, HIVE-20707.4.patch, HIVE-20707.5.patch, 
> HIVE-20707.6.patch, HIVE-20707.6.patch, HIVE-20707.7.patch
>
>
> Currently, adding partitions of external tables to the metastore requires 
> running the MSCK REPAIR command manually. To avoid this manual step, a table 
> property can be set on an external table so that a background metastore 
> thread syncs its partitions periodically. Tables can also specify a partition 
> retention period. Any partition whose age exceeds the retention period will 
> be dropped automatically.





[jira] [Commented] (HIVE-20707) Automatic partition management

2018-10-29 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1889#comment-1889
 ] 

Prasanth Jayachandran commented on HIVE-20707:
--

I uploaded the wrong patch earlier. The partition management task is now a 
remote-metastore-only task, which fixed the Druid test failures.

> Automatic partition management
> --
>
> Key: HIVE-20707
> URL: https://issues.apache.org/jira/browse/HIVE-20707
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20702.3.patch, HIVE-20707.1.patch, 
> HIVE-20707.2.patch, HIVE-20707.4.patch, HIVE-20707.5.patch, 
> HIVE-20707.6.patch, HIVE-20707.6.patch, HIVE-20707.7.patch
>
>
> Currently, adding partitions of external tables to the metastore requires 
> running the MSCK REPAIR command manually. To avoid this manual step, a table 
> property can be set on an external table so that a background metastore 
> thread syncs its partitions periodically. Tables can also specify a partition 
> retention period. Any partition whose age exceeds the retention period will 
> be dropped automatically.





[jira] [Commented] (HIVE-20707) Automatic partition management

2018-10-29 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1891#comment-1891
 ] 

Prasanth Jayachandran commented on HIVE-20707:
--

[~jdere] can you please take another look?

> Automatic partition management
> --
>
> Key: HIVE-20707
> URL: https://issues.apache.org/jira/browse/HIVE-20707
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20702.3.patch, HIVE-20707.1.patch, 
> HIVE-20707.2.patch, HIVE-20707.4.patch, HIVE-20707.5.patch, 
> HIVE-20707.6.patch, HIVE-20707.6.patch, HIVE-20707.7.patch
>
>
> Currently, adding partitions of external tables to the metastore requires 
> running the MSCK REPAIR command manually. To avoid this manual step, a table 
> property can be set on an external table so that a background metastore 
> thread syncs its partitions periodically. Tables can also specify a partition 
> retention period. Any partition whose age exceeds the retention period will 
> be dropped automatically.





[jira] [Updated] (HIVE-20707) Automatic partition management

2018-10-29 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20707:
-
Attachment: HIVE-20707.6.patch

> Automatic partition management
> --
>
> Key: HIVE-20707
> URL: https://issues.apache.org/jira/browse/HIVE-20707
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20702.3.patch, HIVE-20707.1.patch, 
> HIVE-20707.2.patch, HIVE-20707.4.patch, HIVE-20707.5.patch, 
> HIVE-20707.6.patch, HIVE-20707.6.patch, HIVE-20707.7.patch
>
>
> Currently, adding partitions of external tables to the metastore requires 
> running the MSCK REPAIR command manually. To avoid this manual step, a table 
> property can be set on an external table so that a background metastore 
> thread syncs its partitions periodically. Tables can also specify a partition 
> retention period. Any partition whose age exceeds the retention period will 
> be dropped automatically.





[jira] [Updated] (HIVE-20707) Automatic partition management

2018-10-29 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20707:
-
Attachment: HIVE-20707.7.patch

> Automatic partition management
> --
>
> Key: HIVE-20707
> URL: https://issues.apache.org/jira/browse/HIVE-20707
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20702.3.patch, HIVE-20707.1.patch, 
> HIVE-20707.2.patch, HIVE-20707.4.patch, HIVE-20707.5.patch, 
> HIVE-20707.6.patch, HIVE-20707.6.patch, HIVE-20707.7.patch
>
>
> Currently, adding partitions of external tables to the metastore requires 
> running the MSCK REPAIR command manually. To avoid this manual step, a table 
> property can be set on an external table so that a background metastore 
> thread syncs its partitions periodically. Tables can also specify a partition 
> retention period. Any partition whose age exceeds the retention period will 
> be dropped automatically.





[jira] [Commented] (HIVE-20793) add RP namespacing to workload management

2018-10-26 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665756#comment-16665756
 ] 

Prasanth Jayachandran commented on HIVE-20793:
--

+1, pending tests

> add RP namespacing to workload management
> -
>
> Key: HIVE-20793
> URL: https://issues.apache.org/jira/browse/HIVE-20793
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20793.01.nogen.patch, HIVE-20793.01.patch, 
> HIVE-20793.02.nogen.patch, HIVE-20793.02.patch, HIVE-20793.nogen.patch, 
> HIVE-20793.patch
>
>
> The idea is to be able to use the same warehouse for multiple clusters in the 
> cloud use cases. This scenario is not currently supported by WM.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20793) add RP namespacing to workload management

2018-10-26 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665579#comment-16665579
 ] 

Prasanth Jayachandran commented on HIVE-20793:
--

some minor comments. mostly looks good. 

> add RP namespacing to workload management
> -
>
> Key: HIVE-20793
> URL: https://issues.apache.org/jira/browse/HIVE-20793
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20793.01.nogen.patch, HIVE-20793.01.patch, 
> HIVE-20793.nogen.patch, HIVE-20793.patch
>
>
> The idea is to be able to use the same warehouse for multiple clusters in the 
> cloud use cases. This scenario is not currently supported by WM.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20538) Allow to store a key value together with a transaction.

2018-10-22 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20538:
-
Component/s: Streaming

> Allow to store a key value together with a transaction.
> ---
>
> Key: HIVE-20538
> URL: https://issues.apache.org/jira/browse/HIVE-20538
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore, Streaming, Transactions
>Affects Versions: 4.0.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20538.1.patch, HIVE-20538.1.patch, 
> HIVE-20538.2.patch, HIVE-20538.3.patch, HIVE-20538.4.patch, 
> HIVE-20538.5.patch, HIVE-20538.6.patch, HIVE-20538.7.patch, HIVE-20538.8.patch
>
>
> This can be useful, for example, to know whether a transaction has already happened.





[jira] [Updated] (HIVE-20538) Allow to store a key value together with a transaction.

2018-10-22 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20538:
-
Affects Version/s: 4.0.0

> Allow to store a key value together with a transaction.
> ---
>
> Key: HIVE-20538
> URL: https://issues.apache.org/jira/browse/HIVE-20538
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore, Streaming, Transactions
>Affects Versions: 4.0.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20538.1.patch, HIVE-20538.1.patch, 
> HIVE-20538.2.patch, HIVE-20538.3.patch, HIVE-20538.4.patch, 
> HIVE-20538.5.patch, HIVE-20538.6.patch, HIVE-20538.7.patch, HIVE-20538.8.patch
>
>
> This can be useful for example to know if a transaction has already happened.





[jira] [Updated] (HIVE-20701) Allow HiveStreaming to receive a key value to commit atomically together with the transaction

2018-10-22 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20701:
-
Affects Version/s: 4.0.0

> Allow HiveStreaming to receive a key value to commit atomically together with 
> the transaction
> -
>
> Key: HIVE-20701
> URL: https://issues.apache.org/jira/browse/HIVE-20701
> Project: Hive
>  Issue Type: Improvement
>  Components: Streaming
>Affects Versions: 4.0.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
>  Labels: Streaming, Transaction
> Fix For: 4.0.0
>
> Attachments: HIVE-20701.1.patch, HIVE-20701.2.patch, 
> HIVE-20701.3.patch, HIVE-20701.4.patch, HIVE-20701.5.patch, 
> HIVE-20701.6.patch, HIVE-20701.7.patch
>
>
> Following up with HIVE-20538 it'd be nice to be able to use this feature with 
> hive streaming





[jira] [Updated] (HIVE-20701) Allow HiveStreaming to receive a key value to commit atomically together with the transaction

2018-10-22 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20701:
-
Labels: Streaming  (was: )

> Allow HiveStreaming to receive a key value to commit atomically together with 
> the transaction
> -
>
> Key: HIVE-20701
> URL: https://issues.apache.org/jira/browse/HIVE-20701
> Project: Hive
>  Issue Type: Improvement
>  Components: Streaming
>Affects Versions: 4.0.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
>  Labels: Streaming, Transaction
> Fix For: 4.0.0
>
> Attachments: HIVE-20701.1.patch, HIVE-20701.2.patch, 
> HIVE-20701.3.patch, HIVE-20701.4.patch, HIVE-20701.5.patch, 
> HIVE-20701.6.patch, HIVE-20701.7.patch
>
>
> Following up with HIVE-20538 it'd be nice to be able to use this feature with 
> hive streaming





[jira] [Updated] (HIVE-20701) Allow HiveStreaming to receive a key value to commit atomically together with the transaction

2018-10-22 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20701:
-
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Thanks for the patch [~jmarhuen]! Committed patch to master. 

> Allow HiveStreaming to receive a key value to commit atomically together with 
> the transaction
> -
>
> Key: HIVE-20701
> URL: https://issues.apache.org/jira/browse/HIVE-20701
> Project: Hive
>  Issue Type: Improvement
>  Components: Streaming
>Affects Versions: 4.0.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
>  Labels: Streaming, Transaction
> Fix For: 4.0.0
>
> Attachments: HIVE-20701.1.patch, HIVE-20701.2.patch, 
> HIVE-20701.3.patch, HIVE-20701.4.patch, HIVE-20701.5.patch, 
> HIVE-20701.6.patch, HIVE-20701.7.patch
>
>
> Following up with HIVE-20538 it'd be nice to be able to use this feature with 
> hive streaming





[jira] [Updated] (HIVE-20701) Allow HiveStreaming to receive a key value to commit atomically together with the transaction

2018-10-22 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20701:
-
Component/s: Streaming

> Allow HiveStreaming to receive a key value to commit atomically together with 
> the transaction
> -
>
> Key: HIVE-20701
> URL: https://issues.apache.org/jira/browse/HIVE-20701
> Project: Hive
>  Issue Type: Improvement
>  Components: Streaming
>Affects Versions: 4.0.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
>  Labels: Streaming, Transaction
> Fix For: 4.0.0
>
> Attachments: HIVE-20701.1.patch, HIVE-20701.2.patch, 
> HIVE-20701.3.patch, HIVE-20701.4.patch, HIVE-20701.5.patch, 
> HIVE-20701.6.patch, HIVE-20701.7.patch
>
>
> Following up with HIVE-20538 it'd be nice to be able to use this feature with 
> hive streaming





[jira] [Updated] (HIVE-20701) Allow HiveStreaming to receive a key value to commit atomically together with the transaction

2018-10-22 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20701:
-
Labels: Streaming Transaction  (was: Streaming)

> Allow HiveStreaming to receive a key value to commit atomically together with 
> the transaction
> -
>
> Key: HIVE-20701
> URL: https://issues.apache.org/jira/browse/HIVE-20701
> Project: Hive
>  Issue Type: Improvement
>  Components: Streaming
>Affects Versions: 4.0.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
>  Labels: Streaming, Transaction
> Fix For: 4.0.0
>
> Attachments: HIVE-20701.1.patch, HIVE-20701.2.patch, 
> HIVE-20701.3.patch, HIVE-20701.4.patch, HIVE-20701.5.patch, 
> HIVE-20701.6.patch, HIVE-20701.7.patch
>
>
> Following up with HIVE-20538 it'd be nice to be able to use this feature with 
> hive streaming





[jira] [Commented] (HIVE-20701) Allow HiveStreaming to receive a key value to commit atomically together with the transaction

2018-10-22 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659685#comment-16659685
 ] 

Prasanth Jayachandran commented on HIVE-20701:
--

+1

> Allow HiveStreaming to receive a key value to commit atomically together with 
> the transaction
> -
>
> Key: HIVE-20701
> URL: https://issues.apache.org/jira/browse/HIVE-20701
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
> Attachments: HIVE-20701.1.patch, HIVE-20701.2.patch, 
> HIVE-20701.3.patch, HIVE-20701.4.patch, HIVE-20701.5.patch, 
> HIVE-20701.6.patch, HIVE-20701.7.patch
>
>
> Following up with HIVE-20538 it'd be nice to be able to use this feature with 
> hive streaming





[jira] [Commented] (HIVE-20772) record per-task CPU counters in LLAP

2018-10-18 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655838#comment-16655838
 ] 

Prasanth Jayachandran commented on HIVE-20772:
--

Oh yeah, makes sense :)

+1

> record per-task CPU counters in LLAP
> 
>
> Key: HIVE-20772
> URL: https://issues.apache.org/jira/browse/HIVE-20772
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20772.patch
>
>






[jira] [Commented] (HIVE-20772) record per-task CPU counters in LLAP

2018-10-18 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655829#comment-16655829
 ] 

Prasanth Jayachandran commented on HIVE-20772:
--

tezCounters.findCounter(LlapExecutorCounters.EXECUTOR_CPU_NS).increment(cpuTime);

Should this be set instead of increment? If we are reusing the executor thread 
(I hope we are not), increment will give the aggregate, whereas set will give 
the value for just that thread. 

 

Looks good otherwise. 
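
To make the increment vs. set distinction above concrete, here is a minimal sketch. SimpleCounter is a hypothetical stand-in written for this illustration, not the Tez counter class, and the CPU times are made-up values.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a counter; NOT the actual Tez counter API.
final class SimpleCounter {
    private final AtomicLong value = new AtomicLong();

    void increment(long delta) { value.addAndGet(delta); }
    void setValue(long newValue) { value.set(newValue); }
    long getValue() { return value.get(); }
}

final class CounterSemanticsDemo {
    // Reuses one counter across tasks and increments: the result is the sum.
    static long recordWithIncrement(long... cpuTimesNs) {
        SimpleCounter counter = new SimpleCounter();
        for (long t : cpuTimesNs) {
            counter.increment(t);
        }
        return counter.getValue();
    }

    // Reuses one counter across tasks and sets: the result is the last value only.
    static long recordWithSet(long... cpuTimesNs) {
        SimpleCounter counter = new SimpleCounter();
        for (long t : cpuTimesNs) {
            counter.setValue(t);
        }
        return counter.getValue();
    }

    public static void main(String[] args) {
        System.out.println(recordWithIncrement(100L, 250L)); // aggregate of both tasks
        System.out.println(recordWithSet(100L, 250L));       // value of the last task only
    }
}
```

If the counter object is fresh for every task, the two approaches give the same number; they only diverge when a counter is shared or reused.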

> record per-task CPU counters in LLAP
> 
>
> Key: HIVE-20772
> URL: https://issues.apache.org/jira/browse/HIVE-20772
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20772.patch
>
>






[jira] [Commented] (HIVE-20703) Put dynamic sort partition optimization under cost based decision

2018-10-18 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655729#comment-16655729
 ] 

Prasanth Jayachandran commented on HIVE-20703:
--

lgtm, +1

> Put dynamic sort partition optimization under cost based decision
> -
>
> Key: HIVE-20703
> URL: https://issues.apache.org/jira/browse/HIVE-20703
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20703.1.patch, HIVE-20703.10.patch, 
> HIVE-20703.2.patch, HIVE-20703.3.patch, HIVE-20703.4.patch, 
> HIVE-20703.5.patch, HIVE-20703.6.patch, HIVE-20703.7.patch, 
> HIVE-20703.8.patch, HIVE-20703.9.patch
>
>






[jira] [Commented] (HIVE-20707) Automatic partition management

2018-10-18 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655714#comment-16655714
 ] 

Prasanth Jayachandran commented on HIVE-20707:
--

The Druid test failure looks strange. It passes when run locally. 

> Automatic partition management
> --
>
> Key: HIVE-20707
> URL: https://issues.apache.org/jira/browse/HIVE-20707
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20702.3.patch, HIVE-20707.1.patch, 
> HIVE-20707.2.patch, HIVE-20707.4.patch, HIVE-20707.5.patch, HIVE-20707.6.patch
>
>
> Currently, adding partitions for external tables to the metastore requires 
> running the MSCK REPAIR command manually. To avoid this manual step, a table 
> property can be set on external tables so that a background metastore thread 
> syncs partitions periodically. Tables can also specify a partition retention 
> period. Any partition whose age exceeds the retention period will be dropped 
> automatically.
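
The retention rule in the description above can be sketched as follows. This is only an illustration under stated assumptions: PartitionRetention and isExpired are hypothetical names, not taken from the patch, and a partition's age is assumed to be measured from its creation time.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper; the names are illustrative and not from HIVE-20707.
final class PartitionRetention {
    // A partition is expired when its age (now - createdAt) strictly exceeds
    // the retention period.
    static boolean isExpired(Instant createdAt, Duration retention, Instant now) {
        Duration age = Duration.between(createdAt, now);
        return age.compareTo(retention) > 0;
    }

    public static void main(String[] args) {
        Instant created = Instant.parse("2018-10-01T00:00:00Z");
        Instant now = Instant.parse("2018-10-18T00:00:00Z");
        System.out.println(isExpired(created, Duration.ofDays(7), now));
        System.out.println(isExpired(created, Duration.ofDays(30), now));
    }
}
```

A background thread would apply such a check to each partition of a table that has a retention period configured, dropping those for which it returns true.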





[jira] [Updated] (HIVE-20707) Automatic partition management

2018-10-18 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20707:
-
Attachment: HIVE-20707.6.patch

> Automatic partition management
> --
>
> Key: HIVE-20707
> URL: https://issues.apache.org/jira/browse/HIVE-20707
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20702.3.patch, HIVE-20707.1.patch, 
> HIVE-20707.2.patch, HIVE-20707.4.patch, HIVE-20707.5.patch, HIVE-20707.6.patch
>
>
> Currently, adding partitions for external tables to the metastore requires 
> running the MSCK REPAIR command manually. To avoid this manual step, a table 
> property can be set on external tables so that a background metastore thread 
> syncs partitions periodically. Tables can also specify a partition retention 
> period. Any partition whose age exceeds the retention period will be dropped 
> automatically.





[jira] [Commented] (HIVE-20701) Allow HiveStreaming to receive a key value to commit atomically together with the transaction

2018-10-18 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655686#comment-16655686
 ] 

Prasanth Jayachandran commented on HIVE-20701:
--

nit: long tableId = conn.getMSC().getTable(conn.getTable().getDbName(), 
conn.getTable().getTableName()).getId();

This tableId does not change within a streaming connection, right? Maybe this 
can be a variable in the streaming connection or transaction batch, since we 
already have the table object (this also avoids an additional metastore call 
during commit).

Looks good otherwise. +1
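
A minimal sketch of the caching idea in the nit above. CachedTableId is a hypothetical class written for this illustration and is not part of the Hive streaming API; the LongSupplier stands in for the metastore lookup.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch; the names are illustrative, not Hive's streaming API.
final class CachedTableId {
    private final LongSupplier lookup; // stands in for the metastore round trip
    private boolean resolved;
    private long tableId;

    CachedTableId(LongSupplier lookup) {
        this.lookup = lookup;
    }

    // Performs the lookup at most once; later calls reuse the stored id.
    synchronized long get() {
        if (!resolved) {
            tableId = lookup.getAsLong();
            resolved = true;
        }
        return tableId;
    }
}
```

With the id held for the lifetime of the connection (or transaction batch), each commit can reuse it rather than issuing another metastore call.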

> Allow HiveStreaming to receive a key value to commit atomically together with 
> the transaction
> -
>
> Key: HIVE-20701
> URL: https://issues.apache.org/jira/browse/HIVE-20701
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
> Attachments: HIVE-20701.1.patch, HIVE-20701.2.patch, 
> HIVE-20701.3.patch, HIVE-20701.4.patch, HIVE-20701.5.patch
>
>
> Following up with HIVE-20538 it'd be nice to be able to use this feature with 
> hive streaming





[jira] [Commented] (HIVE-20703) Put dynamic sort partition optimization under cost based decision

2018-10-18 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655654#comment-16655654
 ] 

Prasanth Jayachandran commented on HIVE-20703:
--

{quote}During compilation getMaxMemoryAvailable() returns zero, perhaps this is 
set during execution?
{quote}
Yes, you are right. This is set in RecordProcessor init, which happens during 
execution. Maybe you can copy the MemoryInfo class from HIVE-20713 to avoid 
duplication. I can rebase HIVE-20713 after this patch is committed. 

> Put dynamic sort partition optimization under cost based decision
> -
>
> Key: HIVE-20703
> URL: https://issues.apache.org/jira/browse/HIVE-20703
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20703.1.patch, HIVE-20703.2.patch, 
> HIVE-20703.3.patch, HIVE-20703.4.patch, HIVE-20703.5.patch, 
> HIVE-20703.6.patch, HIVE-20703.7.patch, HIVE-20703.8.patch, HIVE-20703.9.patch
>
>






[jira] [Commented] (HIVE-20703) Put dynamic sort partition optimization under cost based decision

2018-10-17 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654280#comment-16654280
 ] 

Prasanth Jayachandran commented on HIVE-20703:
--

long executorMem = 40L can be replaced by 
OperatorDesc.getMaxMemoryAvailable(), which gives the max memory available per 
container (in the case of Tez) or per executor (in the case of LLAP). 

> Put dynamic sort partition optimization under cost based decision
> -
>
> Key: HIVE-20703
> URL: https://issues.apache.org/jira/browse/HIVE-20703
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20703.1.patch, HIVE-20703.2.patch, 
> HIVE-20703.3.patch, HIVE-20703.4.patch, HIVE-20703.5.patch, 
> HIVE-20703.6.patch, HIVE-20703.7.patch, HIVE-20703.8.patch, HIVE-20703.9.patch
>
>






[jira] [Commented] (HIVE-20763) Add google cloud storage (gs) to the exim uri schema whitelist

2018-10-17 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654093#comment-16654093
 ] 

Prasanth Jayachandran commented on HIVE-20763:
--

+1

> Add google cloud storage (gs) to the exim uri schema whitelist
> --
>
> Key: HIVE-20763
> URL: https://issues.apache.org/jira/browse/HIVE-20763
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20763.01.patch
>
>
> import/export is enabled for s3a by default. Ideally this list should include 
> other cloud storage options. This Jira adds Google Storage to the list.





[jira] [Commented] (HIVE-20707) Automatic partition management

2018-10-17 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654066#comment-16654066
 ] 

Prasanth Jayachandran commented on HIVE-20707:
--

Only msck_repair_drop.q seems to be relevant. Added sorting to stabilize the 
output. 

> Automatic partition management
> --
>
> Key: HIVE-20707
> URL: https://issues.apache.org/jira/browse/HIVE-20707
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20702.3.patch, HIVE-20707.1.patch, 
> HIVE-20707.2.patch, HIVE-20707.4.patch, HIVE-20707.5.patch
>
>
> Currently, adding partitions for external tables to the metastore requires 
> running the MSCK REPAIR command manually. To avoid this manual step, a table 
> property can be set on external tables so that a background metastore thread 
> syncs partitions periodically. Tables can also specify a partition retention 
> period. Any partition whose age exceeds the retention period will be dropped 
> automatically.





[jira] [Updated] (HIVE-20707) Automatic partition management

2018-10-17 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20707:
-
Attachment: HIVE-20707.5.patch

> Automatic partition management
> --
>
> Key: HIVE-20707
> URL: https://issues.apache.org/jira/browse/HIVE-20707
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20702.3.patch, HIVE-20707.1.patch, 
> HIVE-20707.2.patch, HIVE-20707.4.patch, HIVE-20707.5.patch
>
>
> Currently, adding partitions for external tables to the metastore requires 
> running the MSCK REPAIR command manually. To avoid this manual step, a table 
> property can be set on external tables so that a background metastore thread 
> syncs partitions periodically. Tables can also specify a partition retention 
> period. Any partition whose age exceeds the retention period will be dropped 
> automatically.





[jira] [Commented] (HIVE-20756) Disable SARG leaf creation for date column until ORC-135

2018-10-16 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16652267#comment-16652267
 ] 

Prasanth Jayachandran commented on HIVE-20756:
--

branch-1 has different type conversions for PPD, so this may still be 
applicable. The DATE/timestamp PPD is subject to timezone issues anyway, so it 
is safer to disable it.

+1

> Disable SARG leaf creation for date column until ORC-135
> 
>
> Key: HIVE-20756
> URL: https://issues.apache.org/jira/browse/HIVE-20756
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Chiran Ravani
>Assignee: Chiran Ravani
>Priority: Major
> Attachments: HIVE-20756.1.patch
>
>
> Until ORC-135 is committed and orc version is updated in hive, disable SARG 
> creation for date columns in hive.





[jira] [Updated] (HIVE-20707) Automatic partition management

2018-10-16 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20707:
-
Attachment: HIVE-20707.4.patch

> Automatic partition management
> --
>
> Key: HIVE-20707
> URL: https://issues.apache.org/jira/browse/HIVE-20707
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20702.3.patch, HIVE-20707.1.patch, 
> HIVE-20707.2.patch, HIVE-20707.4.patch
>
>
> Currently, adding partitions for external tables to the metastore requires 
> running the MSCK REPAIR command manually. To avoid this manual step, a table 
> property can be set on external tables so that a background metastore thread 
> syncs partitions periodically. Tables can also specify a partition retention 
> period. Any partition whose age exceeds the retention period will be dropped 
> automatically.





[jira] [Updated] (HIVE-20713) Use percentage for join conversion size thresholds

2018-10-15 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20713:
-
Attachment: HIVE-20713.2.patch

> Use percentage for join conversion size thresholds
> --
>
> Key: HIVE-20713
> URL: https://issues.apache.org/jira/browse/HIVE-20713
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20713.1.patch, HIVE-20713.2.patch
>
>
> Many places in join conversion rely on absolute byte sizes (mapjoin, dynamic 
> hashjoin, etc.). When container sizes change, these join conversion thresholds 
> have to be retuned for the new container size. Instead, make the join 
> conversion byte sizes a percentage/fraction of the container size.
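
As a rough sketch of the proposal above: instead of a fixed byte threshold, derive it from the container size. JoinThresholds and thresholdBytes are hypothetical names used only for illustration; no actual Hive configuration keys are implied.

```java
// Hypothetical helper; the names are illustrative and not from the patch.
final class JoinThresholds {
    // Returns the join conversion threshold in bytes as a fraction of the
    // container memory. The fraction must lie in [0.0, 1.0].
    static long thresholdBytes(long containerBytes, double fraction) {
        if (containerBytes < 0 || fraction < 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("invalid container size or fraction");
        }
        return (long) (containerBytes * fraction);
    }

    public static void main(String[] args) {
        long fourGb = 4L << 30;
        long eightGb = 8L << 30;
        // The same fraction scales with the container, so no retuning is needed.
        System.out.println(thresholdBytes(fourGb, 0.25));
        System.out.println(thresholdBytes(eightGb, 0.25));
    }
}
```

Doubling the container size doubles the threshold automatically, which is the point of expressing it as a fraction.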





[jira] [Commented] (HIVE-20707) Automatic partition management

2018-10-15 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650995#comment-16650995
 ] 

Prasanth Jayachandran commented on HIVE-20707:
--

Fixes some test failures. Druid tests are not running on my local machine; I am 
still debugging to get them to run and regenerate the golden files. 

> Automatic partition management
> --
>
> Key: HIVE-20707
> URL: https://issues.apache.org/jira/browse/HIVE-20707
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20702.3.patch, HIVE-20707.1.patch, 
> HIVE-20707.2.patch
>
>
> Currently, adding partitions for external tables to the metastore requires 
> running the MSCK REPAIR command manually. To avoid this manual step, a table 
> property can be set on external tables so that a background metastore thread 
> syncs partitions periodically. Tables can also specify a partition retention 
> period. Any partition whose age exceeds the retention period will be dropped 
> automatically.





[jira] [Updated] (HIVE-20707) Automatic partition management

2018-10-15 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20707:
-
Attachment: HIVE-20702.3.patch

> Automatic partition management
> --
>
> Key: HIVE-20707
> URL: https://issues.apache.org/jira/browse/HIVE-20707
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20702.3.patch, HIVE-20707.1.patch, 
> HIVE-20707.2.patch
>
>
> Currently, adding partitions for external tables to the metastore requires 
> running the MSCK REPAIR command manually. To avoid this manual step, a table 
> property can be set on external tables so that a background metastore thread 
> syncs partitions periodically. Tables can also specify a partition retention 
> period. Any partition whose age exceeds the retention period will be dropped 
> automatically.





[jira] [Updated] (HIVE-20649) LLAP aware memory manager for Orc writers

2018-10-14 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20649:
-
   Resolution: Fixed
Fix Version/s: 3.2.0
   4.0.0
   Status: Resolved  (was: Patch Available)

Committed to branch-3 and master. 

> LLAP aware memory manager for Orc writers
> -
>
> Key: HIVE-20649
> URL: https://issues.apache.org/jira/browse/HIVE-20649
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-20649.1.patch, HIVE-20649.2.patch, 
> HIVE-20649.3.patch, HIVE-20649.4.patch, HIVE-20649.5.patch, HIVE-20649.6.patch
>
>
> The ORC writer has its own memory manager that assumes memory usage or 
> available memory based on the JVM heap (MemoryMX bean). This works in the Tez 
> container mode execution model but not in LLAP, where container sizes (and 
> Xmx) are typically high and there are multiple executors per LLAP daemon. This 
> custom memory manager should be aware of memory bounds per executor.




