[jira] [Commented] (HIVE-20992) Split the config "hive.metastore.dbaccess.ssl.properties" into more meaningful configs

2018-12-15 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16722389#comment-16722389
 ] 

Hive QA commented on HIVE-20992:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12951915/HIVE-20992.6.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15728 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15340/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15340/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15340/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12951915 - PreCommit-HIVE-Build

> Split the config "hive.metastore.dbaccess.ssl.properties" into more 
> meaningful configs
> --
>
> Key: HIVE-20992
> URL: https://issues.apache.org/jira/browse/HIVE-20992
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Security, Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Morio Ramdenbourg
>Assignee: Morio Ramdenbourg
>Priority: Minor
> Attachments: HIVE-20992.2.patch, HIVE-20992.3.patch, 
> HIVE-20992.4.patch, HIVE-20992.5.patch, HIVE-20992.6.patch, HIVE-20992.patch
>
>
> HIVE-13044 brought in the ability to enable TLS encryption from the HMS 
> Service to the HMSDB by configuring the following two properties:
>  # _javax.jdo.option.ConnectionURL_: JDBC connect string for a JDBC 
> metastore. To use SSL to encrypt/authenticate the connection, provide 
> database-specific SSL flag in the connection URL. (E.g. 
> "jdbc:postgresql://myhost/db?ssl=true")
>  # _hive.metastore.dbaccess.ssl.properties_: Comma-separated SSL properties 
> for metastore to access database when JDO connection URL enables SSL access. (E.g. 
> javax.net.ssl.trustStore=/tmp/truststore,javax.net.ssl.trustStorePassword=pwd)
> However, the latter configuration option is opaque and poses some problems. 
> The most glaring is that it accepts _any_ 
> [java.lang.System|https://docs.oracle.com/javase/7/docs/api/java/lang/System.html]
>  system property, whether it is 
> [TLS-related|https://docs.oracle.com/javase/8/docs/technotes/guides/security/jsse/JSSERefGuide.html#InstallationAndCustomization]
>  or not. This can cause some unintended side-effects for other components of 
> the HMS, especially if it overrides an already-set system property. If the 
> user truly wishes to add an unrelated Java property, setting it statically 
> using the "-D" option of the _java_ command is more appropriate. Secondly, 
> the truststore password is stored in plain text. We should add Hadoop Shims 
> back to the HMS to prevent exposing these passwords, but this effort can be 
> done after this ticket.
> I propose we deprecate _hive.metastore.dbaccess.ssl.properties_, and add the 
> following properties:
>  * *_hive.metastore.dbaccess.ssl.use.SSL (metastore.dbaccess.ssl.use.SSL)_*
>  ** Set this to true to use SSL/TLS encryption from the HMS Service to 
> the HMS backend store
>  ** Default: false
>  * _*hive.metastore.dbaccess.ssl.truststore.path 
> (metastore.dbaccess.ssl.truststore.path)*_
>  ** Truststore location
>  ** Directly maps to _javax.net.ssl.trustStore_ System property
>  ** Default: None
>  ** E.g. _/tmp/truststore_
>  * *_hive.metastore.dbaccess.ssl.truststore.password 
> (metastore.dbaccess.ssl.truststore.password)_*
>  ** Truststore password
>  ** Directly maps to _javax.net.ssl.trustStorePassword_ System property
>  ** Default: None
>  ** E.g. _password_
>  * *_hive.metastore.dbaccess.ssl.truststore.type 
> (metastore.dbaccess.ssl.truststore.type)_*
>  ** Truststore type
>  ** Directly maps to _javax.net.ssl.trustStoreType_ System property
>  ** Default: JKS
>  ** E.g. _pkcs12_
> We should guide the user towards an easier TLS configuration experience. This 
> is the minimum configuration necessary to configure TLS to the HMSDB. If we 
> need other options, such as the keystore location/password for 
> dual-authentication, then we can add those on afterwards.
> Also, document these changes - 
> [javax.jdo.option.ConnectionURL|https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-javax.jdo.option.ConnectionURL]
>  does not have up-to-date documentation, and these new parameters will need 
> documentation as well.
> Note "TLS" refers to both SSL and TLS. TLS is simply the 

[jira] [Commented] (HIVE-20992) Split the config "hive.metastore.dbaccess.ssl.properties" into more meaningful configs

2018-12-15 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16722386#comment-16722386
 ] 

Hive QA commented on HIVE-20992:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 49s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 13s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 16s{color} | {color:blue} standalone-metastore/metastore-common in master has 29 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  3s{color} | {color:blue} standalone-metastore/metastore-server in master has 188 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  6s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  6s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 21m 46s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15340/dev-support/hive-personality.sh |
| git revision | master / 4e41560 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: standalone-metastore/metastore-common standalone-metastore/metastore-server U: standalone-metastore |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15340/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Split the config "hive.metastore.dbaccess.ssl.properties" into more 
> meaningful configs
> --
>
> Key: HIVE-20992
> URL: https://issues.apache.org/jira/browse/HIVE-20992
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Security, Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Morio Ramdenbourg
>Assignee: Morio Ramdenbourg
>Priority: Minor
> Attachments: HIVE-20992.2.patch, HIVE-20992.3.patch, 
> HIVE-20992.4.patch, HIVE-20992.5.patch, HIVE-20992.6.patch, HIVE-20992.patch
>
>
> HIVE-13044 brought in the ability to enable TLS encryption from the HMS 
> Service to the HMSDB by configuring the following two properties:
>  # _javax.jdo.option.ConnectionURL_: JDBC connect string for a JDBC 
> metastore. To use SSL to encrypt/authenticate the connection, provide 
> database-specific SSL flag in the connection URL. (E.g. 
> "jdbc:postgresql://myhost/db?ssl=true")
>  # 

[jira] [Commented] (HIVE-18661) CachedStore: Use metastore notification log events to update cache

2018-12-15 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16722379#comment-16722379
 ] 

Hive QA commented on HIVE-18661:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12951914/HIVE-18661.09.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 15727 tests executed
*Failed tests:*
{noformat}
TestLocationQueries - did not produce a TEST-*.xml file (likely timed out) (batchId=250)
org.apache.hadoop.hive.metastore.TestObjectStore.catalogs (batchId=229)
org.apache.hadoop.hive.metastore.TestObjectStore.testDatabaseOps (batchId=229)
org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSQLDropParitionsCleanup (batchId=229)
org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSQLDropPartitionsCacheCrossSession (batchId=229)
org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSqlErrorMetrics (batchId=229)
org.apache.hadoop.hive.metastore.TestObjectStore.testMasterKeyOps (batchId=229)
org.apache.hadoop.hive.metastore.TestObjectStore.testMaxEventResponse (batchId=229)
org.apache.hadoop.hive.metastore.TestObjectStore.testPartitionOps (batchId=229)
org.apache.hadoop.hive.metastore.TestObjectStore.testQueryCloseOnError (batchId=229)
org.apache.hadoop.hive.metastore.TestObjectStore.testRoleOps (batchId=229)
org.apache.hadoop.hive.metastore.TestObjectStore.testTableOps (batchId=229)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitions (batchId=261)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitionsUnionAll (batchId=261)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomNonExistent (batchId=261)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighBytesRead (batchId=261)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=261)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerSlowQueryElapsedTime (batchId=261)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerSlowQueryExecutionTime (batchId=261)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15339/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15339/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15339/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 19 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12951914 - PreCommit-HIVE-Build

> CachedStore: Use metastore notification log events to update cache
> --
>
> Key: HIVE-18661
> URL: https://issues.apache.org/jira/browse/HIVE-18661
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vaibhav Gumashta
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-18661.02.patch, HIVE-18661.03.patch, 
> HIVE-18661.04.patch, HIVE-18661.05.patch, HIVE-18661.06.patch, 
> HIVE-18661.07.patch, HIVE-18661.08.patch, HIVE-18661.09.patch
>
>
> Currently, a background thread refreshes the entire cache, which is 
> inefficient. Metadata updates are already captured in the NOTIFICATION_LOG 
> table, which the Replication work also uses. We should have the background 
> thread apply these notifications to update the cache incrementally.
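
The incremental approach described above could look roughly like the sketch below. Everything in it is hypothetical and for illustration only: the class, the event types, and the string payload are not the CachedStore or NOTIFICATION_LOG API. It assumes events arrive ordered by ascending id and remembers the last id applied, so a later poll only touches entries that changed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Hive code: a cache that applies notification
// events incrementally instead of being rebuilt from scratch.
final class NotificationCacheSketch {
    enum Type { CREATE_TABLE, ALTER_TABLE, DROP_TABLE }

    // Simplified stand-in for one row of a notification log.
    static final class Event {
        final long id;
        final Type type;
        final String table;
        final String payload;

        Event(long id, Type type, String table, String payload) {
            this.id = id;
            this.type = type;
            this.table = table;
            this.payload = payload;
        }
    }

    private final Map<String, String> tables = new HashMap<>();
    private long lastEventId = 0;

    // Applies events newer than lastEventId (assumed sorted by ascending id)
    // and returns how many were applied.
    int apply(List<Event> events) {
        int applied = 0;
        for (Event e : events) {
            if (e.id <= lastEventId) {
                continue; // already reflected in the cache
            }
            switch (e.type) {
                case CREATE_TABLE:
                case ALTER_TABLE:
                    tables.put(e.table, e.payload);
                    break;
                case DROP_TABLE:
                    tables.remove(e.table);
                    break;
            }
            lastEventId = e.id;
            applied++;
        }
        return applied;
    }

    String get(String table) {
        return tables.get(table);
    }

    long lastEventId() {
        return lastEventId;
    }

    public static void main(String[] args) {
        NotificationCacheSketch cache = new NotificationCacheSketch();
        List<Event> batch = new ArrayList<>();
        batch.add(new Event(1, Type.CREATE_TABLE, "t1", "v1"));
        batch.add(new Event(2, Type.ALTER_TABLE, "t1", "v2"));
        System.out.println("applied: " + cache.apply(batch));
        System.out.println("t1 -> " + cache.get("t1"));
    }
}
```

Tracking the last applied id makes a repeated poll idempotent, which is the property a background thread would need if it re-reads overlapping ranges of the log.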



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20992) Split the config "hive.metastore.dbaccess.ssl.properties" into more meaningful configs

2018-12-15 Thread Morio Ramdenbourg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Morio Ramdenbourg updated HIVE-20992:
-
Attachment: HIVE-20992.6.patch
Status: Patch Available  (was: In Progress)

> Split the config "hive.metastore.dbaccess.ssl.properties" into more 
> meaningful configs
> --
>
> Key: HIVE-20992
> URL: https://issues.apache.org/jira/browse/HIVE-20992
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Security, Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Morio Ramdenbourg
>Assignee: Morio Ramdenbourg
>Priority: Minor
> Attachments: HIVE-20992.2.patch, HIVE-20992.3.patch, 
> HIVE-20992.4.patch, HIVE-20992.5.patch, HIVE-20992.6.patch, HIVE-20992.patch
>
>
> HIVE-13044 brought in the ability to enable TLS encryption from the HMS 
> Service to the HMSDB by configuring the following two properties:
>  # _javax.jdo.option.ConnectionURL_: JDBC connect string for a JDBC 
> metastore. To use SSL to encrypt/authenticate the connection, provide 
> database-specific SSL flag in the connection URL. (E.g. 
> "jdbc:postgresql://myhost/db?ssl=true")
>  # _hive.metastore.dbaccess.ssl.properties_: Comma-separated SSL properties 
> for metastore to access database when JDO connection URL enables SSL access. (E.g. 
> javax.net.ssl.trustStore=/tmp/truststore,javax.net.ssl.trustStorePassword=pwd)
> However, the latter configuration option is opaque and poses some problems. 
> The most glaring is that it accepts _any_ 
> [java.lang.System|https://docs.oracle.com/javase/7/docs/api/java/lang/System.html]
>  system property, whether it is 
> [TLS-related|https://docs.oracle.com/javase/8/docs/technotes/guides/security/jsse/JSSERefGuide.html#InstallationAndCustomization]
>  or not. This can cause some unintended side-effects for other components of 
> the HMS, especially if it overrides an already-set system property. If the 
> user truly wishes to add an unrelated Java property, setting it statically 
> using the "-D" option of the _java_ command is more appropriate. Secondly, 
> the truststore password is stored in plain text. We should add Hadoop Shims 
> back to the HMS to prevent exposing these passwords, but this effort can be 
> done after this ticket.
> I propose we deprecate _hive.metastore.dbaccess.ssl.properties_, and add the 
> following properties:
>  * *_hive.metastore.dbaccess.ssl.use.SSL (metastore.dbaccess.ssl.use.SSL)_*
>  ** Set this to true to use SSL/TLS encryption from the HMS Service to 
> the HMS backend store
>  ** Default: false
>  * _*hive.metastore.dbaccess.ssl.truststore.path 
> (metastore.dbaccess.ssl.truststore.path)*_
>  ** Truststore location
>  ** Directly maps to _javax.net.ssl.trustStore_ System property
>  ** Default: None
>  ** E.g. _/tmp/truststore_
>  * *_hive.metastore.dbaccess.ssl.truststore.password 
> (metastore.dbaccess.ssl.truststore.password)_*
>  ** Truststore password
>  ** Directly maps to _javax.net.ssl.trustStorePassword_ System property
>  ** Default: None
>  ** E.g. _password_
>  * *_hive.metastore.dbaccess.ssl.truststore.type 
> (metastore.dbaccess.ssl.truststore.type)_*
>  ** Truststore type
>  ** Directly maps to _javax.net.ssl.trustStoreType_ System property
>  ** Default: JKS
>  ** E.g. _pkcs12_
> We should guide the user towards an easier TLS configuration experience. This 
> is the minimum configuration necessary to configure TLS to the HMSDB. If we 
> need other options, such as the keystore location/password for 
> dual-authentication, then we can add those on afterwards.
> Also, document these changes - 
> [javax.jdo.option.ConnectionURL|https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-javax.jdo.option.ConnectionURL]
>  does not have up-to-date documentation, and these new parameters will need 
> documentation as well.
> Note "TLS" refers to both SSL and TLS. TLS is simply the successor of SSL.
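
As a rough illustration of the proposal above, the sketch below translates the four proposed properties into the JSSE system properties the description says they map to. The helper class and its method are hypothetical and are not taken from the patch; only the property names, the JKS default, and the javax.net.ssl.* targets come from the description. It returns the mapping rather than calling System.setProperty, so that nothing outside the TLS-related keys can be touched.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not Hive code: maps the proposed metastore
// properties onto the JSSE system properties named in the description.
final class DbAccessSslSketch {
    static final String USE_SSL = "metastore.dbaccess.ssl.use.SSL";
    static final String TRUSTSTORE_PATH = "metastore.dbaccess.ssl.truststore.path";
    static final String TRUSTSTORE_PASSWORD = "metastore.dbaccess.ssl.truststore.password";
    static final String TRUSTSTORE_TYPE = "metastore.dbaccess.ssl.truststore.type";

    // Returns the system properties to set; empty when SSL is disabled.
    static Map<String, String> toSystemProperties(Map<String, String> conf) {
        Map<String, String> out = new LinkedHashMap<>();
        if (!Boolean.parseBoolean(conf.getOrDefault(USE_SSL, "false"))) {
            return out; // default is false, so nothing is set
        }
        String path = conf.get(TRUSTSTORE_PATH);
        if (path != null && !path.isEmpty()) {
            out.put("javax.net.ssl.trustStore", path);
        }
        String password = conf.get(TRUSTSTORE_PASSWORD);
        if (password != null && !password.isEmpty()) {
            out.put("javax.net.ssl.trustStorePassword", password);
        }
        // The description lists JKS as the default truststore type.
        out.put("javax.net.ssl.trustStoreType", conf.getOrDefault(TRUSTSTORE_TYPE, "JKS"));
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put(USE_SSL, "true");
        conf.put(TRUSTSTORE_PATH, "/tmp/truststore");
        conf.put(TRUSTSTORE_TYPE, "pkcs12");
        // Print only the keys so no secret value is echoed.
        System.out.println(toSystemProperties(conf).keySet());
    }
}
```

Because only a fixed set of keys is ever produced, an unrelated system property cannot be overridden by accident, which is the main concern raised against the single comma-separated option.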





[jira] [Commented] (HIVE-18661) CachedStore: Use metastore notification log events to update cache

2018-12-15 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16722372#comment-16722372
 ] 

Hive QA commented on HIVE-18661:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  4s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 25s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 14s{color} | {color:blue} standalone-metastore/metastore-common in master has 29 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  4s{color} | {color:blue} standalone-metastore/metastore-server in master has 188 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 37s{color} | {color:blue} ql in master has 2310 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 25s{color} | {color:blue} hcatalog/server-extensions in master has 3 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 35s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 43s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m  5s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 37s{color} | {color:red} ql: The patch generated 4 new + 3 unchanged - 0 fixed = 7 total (was 3) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 16s{color} | {color:red} itests/hive-unit: The patch generated 20 new + 86 unchanged - 0 fixed = 106 total (was 86) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 14s{color} | {color:red} standalone-metastore/metastore-server generated 1 new + 188 unchanged - 0 fixed = 189 total (was 188) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 17s{color} | {color:red} standalone-metastore_metastore-server generated 1 new + 48 unchanged - 0 fixed = 49 total (was 48) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 44m 47s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:standalone-metastore/metastore-server |
|  |  Load of known null value in org.apache.hadoop.hive.metastore.cache.CachedStore.get_aggr_stats_for(String, String, String, List, List, String)  At CachedStore.java:in org.apache.hadoop.hive.metastore.cache.CachedStore.get_aggr_stats_for(String, String, String, List, List, String)  At CachedStore.java:[line ] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15339/dev-support/hive-personality.sh |
| git revision | master / 4e41560 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 

[jira] [Updated] (HIVE-20992) Split the config "hive.metastore.dbaccess.ssl.properties" into more meaningful configs

2018-12-15 Thread Morio Ramdenbourg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Morio Ramdenbourg updated HIVE-20992:
-
Status: In Progress  (was: Patch Available)

> Split the config "hive.metastore.dbaccess.ssl.properties" into more 
> meaningful configs
> --
>
> Key: HIVE-20992
> URL: https://issues.apache.org/jira/browse/HIVE-20992
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Security, Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Morio Ramdenbourg
>Assignee: Morio Ramdenbourg
>Priority: Minor
> Attachments: HIVE-20992.2.patch, HIVE-20992.3.patch, 
> HIVE-20992.4.patch, HIVE-20992.5.patch, HIVE-20992.patch
>
>
> HIVE-13044 brought in the ability to enable TLS encryption from the HMS 
> Service to the HMSDB by configuring the following two properties:
>  # _javax.jdo.option.ConnectionURL_: JDBC connect string for a JDBC 
> metastore. To use SSL to encrypt/authenticate the connection, provide 
> database-specific SSL flag in the connection URL. (E.g. 
> "jdbc:postgresql://myhost/db?ssl=true")
>  # _hive.metastore.dbaccess.ssl.properties_: Comma-separated SSL properties 
> for metastore to access database when JDO connection URL enables SSL access. (E.g. 
> javax.net.ssl.trustStore=/tmp/truststore,javax.net.ssl.trustStorePassword=pwd)
> However, the latter configuration option is opaque and poses some problems. 
> The most glaring is that it accepts _any_ 
> [java.lang.System|https://docs.oracle.com/javase/7/docs/api/java/lang/System.html]
>  system property, whether it is 
> [TLS-related|https://docs.oracle.com/javase/8/docs/technotes/guides/security/jsse/JSSERefGuide.html#InstallationAndCustomization]
>  or not. This can cause some unintended side-effects for other components of 
> the HMS, especially if it overrides an already-set system property. If the 
> user truly wishes to add an unrelated Java property, setting it statically 
> using the "-D" option of the _java_ command is more appropriate. Secondly, 
> the truststore password is stored in plain text. We should add Hadoop Shims 
> back to the HMS to prevent exposing these passwords, but this effort can be 
> done after this ticket.
> I propose we deprecate _hive.metastore.dbaccess.ssl.properties_, and add the 
> following properties:
>  * *_hive.metastore.dbaccess.ssl.use.SSL (metastore.dbaccess.ssl.use.SSL)_*
>  ** Set this to true to use SSL/TLS encryption from the HMS Service to 
> the HMS backend store
>  ** Default: false
>  * _*hive.metastore.dbaccess.ssl.truststore.path 
> (metastore.dbaccess.ssl.truststore.path)*_
>  ** Truststore location
>  ** Directly maps to _javax.net.ssl.trustStore_ System property
>  ** Default: None
>  ** E.g. _/tmp/truststore_
>  * *_hive.metastore.dbaccess.ssl.truststore.password 
> (metastore.dbaccess.ssl.truststore.password)_*
>  ** Truststore password
>  ** Directly maps to _javax.net.ssl.trustStorePassword_ System property
>  ** Default: None
>  ** E.g. _password_
>  * *_hive.metastore.dbaccess.ssl.truststore.type 
> (metastore.dbaccess.ssl.truststore.type)_*
>  ** Truststore type
>  ** Directly maps to _javax.net.ssl.trustStoreType_ System property
>  ** Default: JKS
>  ** E.g. _pkcs12_
> We should guide the user towards an easier TLS configuration experience. This 
> is the minimum configuration necessary to configure TLS to the HMSDB. If we 
> need other options, such as the keystore location/password for 
> dual-authentication, then we can add those on afterwards.
> Also, document these changes - 
> [javax.jdo.option.ConnectionURL|https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-javax.jdo.option.ConnectionURL]
>  does not have up-to-date documentation, and these new parameters will need 
> documentation as well.
> Note "TLS" refers to both SSL and TLS. TLS is simply the successor of SSL.





[jira] [Updated] (HIVE-18661) CachedStore: Use metastore notification log events to update cache

2018-12-15 Thread mahesh kumar behera (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-18661:
---
Attachment: (was: HIVE-18661.09.patch)

> CachedStore: Use metastore notification log events to update cache
> --
>
> Key: HIVE-18661
> URL: https://issues.apache.org/jira/browse/HIVE-18661
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vaibhav Gumashta
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-18661.02.patch, HIVE-18661.03.patch, 
> HIVE-18661.04.patch, HIVE-18661.05.patch, HIVE-18661.06.patch, 
> HIVE-18661.07.patch, HIVE-18661.08.patch
>
>
> Currently, a background thread refreshes the entire cache, which is 
> inefficient. Metadata updates are already captured in the NOTIFICATION_LOG 
> table, which the Replication work also uses. We should have the background 
> thread apply these notifications to update the cache incrementally.





[jira] [Updated] (HIVE-18661) CachedStore: Use metastore notification log events to update cache

2018-12-15 Thread mahesh kumar behera (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-18661:
---
Attachment: HIVE-18661.09.patch

> CachedStore: Use metastore notification log events to update cache
> --
>
> Key: HIVE-18661
> URL: https://issues.apache.org/jira/browse/HIVE-18661
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vaibhav Gumashta
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-18661.02.patch, HIVE-18661.03.patch, 
> HIVE-18661.04.patch, HIVE-18661.05.patch, HIVE-18661.06.patch, 
> HIVE-18661.07.patch, HIVE-18661.08.patch, HIVE-18661.09.patch
>
>
> Currently, a background thread refreshes the entire cache, which is 
> inefficient. Metadata updates are already captured in the NOTIFICATION_LOG 
> table, which the Replication work also uses. We should have the background 
> thread apply these notifications to update the cache incrementally.





[jira] [Updated] (HIVE-18661) CachedStore: Use metastore notification log events to update cache

2018-12-15 Thread mahesh kumar behera (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-18661:
---
Status: Patch Available  (was: Open)

> CachedStore: Use metastore notification log events to update cache
> --
>
> Key: HIVE-18661
> URL: https://issues.apache.org/jira/browse/HIVE-18661
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vaibhav Gumashta
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-18661.02.patch, HIVE-18661.03.patch, 
> HIVE-18661.04.patch, HIVE-18661.05.patch, HIVE-18661.06.patch, 
> HIVE-18661.07.patch, HIVE-18661.08.patch, HIVE-18661.09.patch
>
>
> Currently, a background thread refreshes the entire cache, which is 
> inefficient. Metadata updates are already captured in the NOTIFICATION_LOG 
> table, which the Replication work also uses. We should have the background 
> thread apply these notifications to update the cache incrementally.





[jira] [Updated] (HIVE-18661) CachedStore: Use metastore notification log events to update cache

2018-12-15 Thread mahesh kumar behera (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-18661:
---
Status: Open  (was: Patch Available)

> CachedStore: Use metastore notification log events to update cache
> --
>
> Key: HIVE-18661
> URL: https://issues.apache.org/jira/browse/HIVE-18661
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vaibhav Gumashta
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-18661.02.patch, HIVE-18661.03.patch, 
> HIVE-18661.04.patch, HIVE-18661.05.patch, HIVE-18661.06.patch, 
> HIVE-18661.07.patch, HIVE-18661.08.patch
>
>
> Currently, a background thread refreshes the entire cache, which is 
> inefficient. Metadata updates are already captured in the NOTIFICATION_LOG 
> table, which the Replication work also uses. We should have the background 
> thread apply these notifications to update the cache incrementally.





[jira] [Updated] (HIVE-21047) Read the HMS backend database password and truststore password during PersistenceManagerFactory initialization time

2018-12-15 Thread Morio Ramdenbourg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Morio Ramdenbourg updated HIVE-21047:
-
Description: 
This was pointed out by [~vihangk1] as part of the review for 
[HIVE-20992|https://issues.apache.org/jira/browse/HIVE-20992].

As part of the redaction of the _javax.jdo.option.ConnectionPassword_ and 
_metastore.dbaccess.ssl.truststore.password_ properties, they both use the 
[Hadoop Credential Provider 
API|https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html]
 to prevent the passwords from being stored in plain text.

However, these are both being read in 
[setConf()|https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L198-L247]
 in ObjectStore, thereby calling the expensive decrypt during every new 
database connection initialization despite these values almost never changing.

We should instead move these reads into the PersistenceManagerFactory 
[initPMF()|https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PersistenceManagerProvider.java#L227-L273]
 method and cache their values so they are only read once when the HMS starts.

  was:
This was pointed out by [~vihangk1] as part of the review for 
[HIVE-20992|https://issues.apache.org/jira/browse/HIVE-20992].

As part of the redaction of the _javax.jdo.option.ConnectionPassword_ and 
_metastore.dbaccess.ssl.truststore.password_ properties, they both use the 
[Hadoop Credential Provider 
API|https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html]
 to prevent the passwords from being stored in plain text.

However, these are both being read in during every new database connection 
initialization in 
[setConf()|https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L198-L247]
 in ObjectStore, thereby calling the expensive decrypt every time despite these 
values almost never changing.

We should instead move these reads into the PersistenceManagerFactory 
[initPMF()|https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PersistenceManagerProvider.java#L227-L273]
 method and cache their values so they are only read once when the HMS starts.


> Read the HMS backend database password and truststore password during 
> PersistenceManagerFactory initialization time
> ---
>
> Key: HIVE-21047
> URL: https://issues.apache.org/jira/browse/HIVE-21047
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Morio Ramdenbourg
>Priority: Major
>
> This was pointed out by [~vihangk1] as part of the review for 
> [HIVE-20992|https://issues.apache.org/jira/browse/HIVE-20992].
> As part of the redaction of the _javax.jdo.option.ConnectionPassword_ and 
> _metastore.dbaccess.ssl.truststore.password_ properties, they both use the 
> [Hadoop Credential Provider 
> API|https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html]
>  to prevent the passwords from being stored in plain text.
> However, these are both being read in 
> [setConf()|https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L198-L247]
>  in ObjectStore, thereby calling the expensive decrypt during every new 
> database connection initialization despite these values almost never changing.
> We should instead move these reads into the PersistenceManagerFactory 
> [initPMF()|https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PersistenceManagerProvider.java#L227-L273]
>  method and cache their values so they are only read once when the HMS starts.
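
The "read once and cache" idea in the description can be sketched as below. This is a minimal illustration under stated assumptions, not the code in ObjectStore or PersistenceManagerProvider: `CachedSecret` is a hypothetical helper, and the `Supplier` stands in for whatever expensive credential-provider lookup actually decrypts the password.

```java
import java.util.function.Supplier;

/** Hypothetical helper: runs an expensive secret lookup at most once and reuses the result. */
final class CachedSecret {
    private final Supplier<String> loader; // stands in for the expensive decrypt
    private volatile String value;         // null until the first successful load

    CachedSecret(Supplier<String> loader) {
        this.loader = loader;
    }

    /** Returns the cached secret, loading it on first use (double-checked locking). */
    String get() {
        String v = value;
        if (v == null) {
            synchronized (this) {
                v = value;
                if (v == null) {
                    v = loader.get();
                    value = v; // a null result is not cached, so it is retried next call
                }
            }
        }
        return v;
    }
}
```

With one such holder per password, created when the PersistenceManagerFactory is initialized, later connection setups would reuse the decrypted value rather than decrypting again.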





[jira] [Commented] (HIVE-16255) Support percentile_cont / percentile_disc

2018-12-15 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16722341#comment-16722341
 ] 

Hive QA commented on HIVE-16255:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12951912/HIVE-16255.02.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15754 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parallel_orderby] (batchId=58)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[show_functions] (batchId=79)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15338/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15338/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15338/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12951912 - PreCommit-HIVE-Build

> Support percentile_cont / percentile_disc
> -
>
> Key: HIVE-16255
> URL: https://issues.apache.org/jira/browse/HIVE-16255
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Carter Shanklin
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-16255.01.patch, HIVE-16255.02.patch
>
>
> Way back in HIVE-259, a percentile function was added that provides a subset 
> of the standard percentile_cont aggregate function.
> The SQL standard provides some additional options and also a percentile_disc 
> aggregate function with different rules. In the standard you specify an 
> ordering with an arbitrary value expression, and the results are drawn from 
> this value expression. These aggregate functions should also be usable as 
> analytic functions (i.e. support the over clause). The current percentile 
> function can already be used with an over clause.
> The rough outline of how this works is:
> percentile_cont(number) within group (order by expression) [ over(window spec) ]
> percentile_disc(number) within group (order by expression) [ over(window spec) ]
> The value of number should be between 0 and 1. The value expression is 
> evaluated for each row of the group, nulls are discarded, and the remaining 
> rows are ordered. The result is then computed as follows:
> - If PERCENTILE_CONT is specified, by considering the pair of consecutive 
> rows that are indicated by the argument, treated as a fraction of the total 
> number of rows in the group, and interpolating the value of the value 
> expression evaluated for these rows.
> - If PERCENTILE_DISC is specified, by treating the group as a window 
> partition of the CUME_DIST window function, using the specified ordering of 
> the value expression as the window ordering, and returning the first value 
> expression whose cumulative distribution value is greater than or equal to 
> the argument.
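
The two rules above can be sketched in plain Java as follows. This is not the GenericUDAFPercentileCont code from the patch: `PercentileSketch` and its methods are hypothetical, ascending order is assumed, and the interpolation position `p * (n - 1)` is an assumption based on the usual reading of the rule rather than something stated in the issue.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of percentile_cont / percentile_disc over one group of values. */
final class PercentileSketch {

    /** Discards nulls and orders the remaining values ascending. */
    private static List<Double> sortedNonNull(List<Double> values) {
        List<Double> out = new ArrayList<>();
        for (Double v : values) {
            if (v != null) {
                out.add(v);
            }
        }
        Collections.sort(out);
        return out;
    }

    private static void checkFraction(double p) {
        if (!(p >= 0.0 && p <= 1.0)) {
            throw new IllegalArgumentException("number must be between 0 and 1: " + p);
        }
    }

    /** Interpolates between the two consecutive rows around position p * (n - 1). */
    static Double percentileCont(double p, List<Double> values) {
        checkFraction(p);
        List<Double> s = sortedNonNull(values);
        if (s.isEmpty()) {
            return null; // no non-null rows in the group
        }
        double pos = p * (s.size() - 1);
        int lo = (int) Math.floor(pos);
        int hi = (int) Math.ceil(pos);
        double a = s.get(lo);
        double b = s.get(hi);
        return a + (pos - lo) * (b - a);
    }

    /** Returns the first ordered value whose cumulative distribution is >= p. */
    static Double percentileDisc(double p, List<Double> values) {
        checkFraction(p);
        List<Double> s = sortedNonNull(values);
        if (s.isEmpty()) {
            return null;
        }
        int n = s.size();
        for (int i = 0; i < n; i++) {
            // (i + 1) / n is a lower bound on CUME_DIST of row i; the first row
            // that reaches p carries the same value CUME_DIST would select.
            if ((double) (i + 1) / n >= p) {
                return s.get(i);
            }
        }
        return s.get(n - 1); // not reached for p <= 1; kept as a safe fallback
    }
}
```

For the values 1, 2, 3, 4 and an argument of 0.5, the continuous variant interpolates between the second and third rows, while the discrete variant returns an actual row value from the group.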





[jira] [Commented] (HIVE-16255) Support percentile_cont / percentile_disc

2018-12-15 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16722333#comment-16722333
 ] 

Hive QA commented on HIVE-16255:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 38s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 51s{color} | {color:blue} ql in master has 2310 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 56s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 38s{color} | {color:red} ql: The patch generated 48 new + 72 unchanged - 0 fixed = 120 total (was 72) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m  1s{color} | {color:red} ql generated 3 new + 2310 unchanged - 0 fixed = 2313 total (was 2310) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 13s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 40s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  org.apache.hadoop.hive.ql.udf.generic.GenericUDAFPercentileCont$DoubleComparator implements Comparator but not Serializable  At GenericUDAFPercentileCont.java:Serializable  At GenericUDAFPercentileCont.java:[lines 109-114] |
|  |  org.apache.hadoop.hive.ql.udf.generic.GenericUDAFPercentileCont$LongComparator implements Comparator but not Serializable  At GenericUDAFPercentileCont.java:Serializable  At GenericUDAFPercentileCont.java:[lines 101-105] |
|  |  org.apache.hadoop.hive.ql.udf.generic.GenericUDAFPercentileCont$PercentileContEvaluator.terminatePartial(GenericUDAFEvaluator$AggregationBuffer) may expose internal representation by returning GenericUDAFPercentileCont$PercentileContEvaluator.partialResult  At GenericUDAFPercentileCont.java:by returning GenericUDAFPercentileCont$PercentileContEvaluator.partialResult  At GenericUDAFPercentileCont.java:[line 186] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15338/dev-support/hive-personality.sh |
| git revision | master / 4e41560 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15338/yetus/diff-checkstyle-ql.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-15338/yetus/new-findbugs-ql.html |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-15338/yetus/patch-asflicense-problems.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15338/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Support percentile_cont / percentile_disc
> -
>
> Key: HIVE-16255
> URL: https://issues.apache.org/jira/browse/HIVE-16255
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Carter Shanklin
>Assignee: Laszlo Bodor
>

[jira] [Updated] (HIVE-16255) Support percentile_cont / percentile_disc

2018-12-15 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-16255:

Attachment: HIVE-16255.02.patch






[jira] [Updated] (HIVE-16255) Support percentile_cont / percentile_disc

2018-12-15 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-16255:

Status: Patch Available  (was: Open)






[jira] [Commented] (HIVE-19968) UDF exception is not throw out

2018-12-15 Thread Laszlo Bodor (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16722173#comment-16722173
 ] 

Laszlo Bodor commented on HIVE-19968:
-

[~kgyrtkirk] : 04.patch is okay from my side, and it achieved a green run

> UDF exception is not throw out
> --
>
> Key: HIVE-19968
> URL: https://issues.apache.org/jira/browse/HIVE-19968
> Project: Hive
>  Issue Type: Bug
>Reporter: sandflee
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-19968.01.patch, HIVE-19968.02.patch, 
> HIVE-19968.03.patch, HIVE-19968.04.patch, hive-udf.png
>
>
> UDF initialization fails and throws an exception, but Hive catches it and 
> does nothing, so the application succeeds even though no data is generated.
> {code}
> GenericUDFReflect.java#evaluate()
> try {
>   o = null;
>   o = ReflectionUtils.newInstance(c, null);
> } catch (Exception e) {
>   // ignored
> }
> {code}
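
One way to avoid the silent failure shown above is to rethrow with context instead of ignoring the exception. The sketch below is not the attached patch: `ReflectiveFactory` is a hypothetical helper, it uses plain JDK reflection rather than `ReflectionUtils`, and `IllegalStateException` is only a placeholder for whatever exception type Hive would actually raise from `evaluate()`.

```java
/** Hypothetical helper: instantiate a class reflectively and surface any failure. */
final class ReflectiveFactory {

    private ReflectiveFactory() {
    }

    /** Creates an instance via the no-arg constructor, or fails loudly with the cause attached. */
    static Object newInstanceOrThrow(Class<?> c) {
        try {
            return c.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // Propagate instead of swallowing, so the caller (and the job) sees the failure.
            throw new IllegalStateException("Cannot instantiate " + c.getName(), e);
        }
    }
}
```

Keeping the original exception as the cause preserves the stack trace that explains why initialization failed.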





[jira] [Commented] (HIVE-20079) Populate more accurate rawDataSize for parquet format

2018-12-15 Thread Antal Sinkovits (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16722137#comment-16722137
 ] 

Antal Sinkovits commented on HIVE-20079:


Hi, 
I'm planning to work on this next week. Let me know if there are any concerns 
regarding it. Thanks.

> Populate more accurate rawDataSize for parquet format
> -
>
> Key: HIVE-20079
> URL: https://issues.apache.org/jira/browse/HIVE-20079
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Priority: Major
> Attachments: HIVE-20079.1.patch, HIVE-20079.2.patch
>
>
> Run the following queries and you will see that rawDataSize for the table is 
> incorrectly reported as 4 (that is, the number of fields). We need to 
> populate the correct data size so that data can be split properly.
> {noformat}
> SET hive.stats.autogather=true;
> CREATE TABLE parquet_stats (id int,str string) STORED AS PARQUET;
> INSERT INTO parquet_stats values(0, 'this is string 0'), (1, 'string 1');
> DESC FORMATTED parquet_stats;
> {noformat}
> {noformat}
> Table Parameters:
>   COLUMN_STATS_ACCURATE   true
>   numFiles                1
>   numRows                 2
>   rawDataSize             4
>   totalSize               373
>   transient_lastDdlTime   1530660523
> {noformat}


