[jira] [Commented] (HIVE-19376) Statistics: switch to 10bit HLL by default for Hive

2018-05-01 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16459955#comment-16459955
 ] 

Prasanth Jayachandran commented on HIVE-19376:
--

+1

> Statistics: switch to 10bit HLL by default for Hive
> ---
>
> Key: HIVE-19376
> URL: https://issues.apache.org/jira/browse/HIVE-19376
> Project: Hive
> Issue Type: Sub-task
> Components: Statistics
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Major
> Attachments: HIVE-19376.1.patch
>
>
> This reduces the memory usage for the metastore cache and the size of 
> bit-vectors in the DB by 16x.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19376) Statistics: switch to 10bit HLL by default for Hive

2018-05-01 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16460138#comment-16460138
 ] 

Ashutosh Chauhan commented on HIVE-19376:
-

This is just the in-memory representation. Will the metastore still store 
16bit HLL vectors? Should that also be changed?



[jira] [Commented] (HIVE-19376) Statistics: switch to 10bit HLL by default for Hive

2018-05-01 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16460146#comment-16460146
 ] 

Gopal V commented on HIVE-19376:


This change already switches the metastore storage to 1 KB instead of 16 KB.

The parent task right above this JIRA is the one that adds backwards 
compatibility for existing metastores that store the 14bit data.
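
As a rough check of the 16x figure and the 1kb versus 16kb sizes mentioned 
above, the sketch below computes the register count and the textbook 
HyperLogLog error estimate (about 1.04 / sqrt(m)) for 14 and 10 index bits. 
It assumes one byte per register, which is an illustration and not 
necessarily how Hive serializes its bit-vectors; the function names are 
hypothetical and are not part of Hive.

```python
import math


def hll_registers(p: int) -> int:
    """Number of registers in a HyperLogLog sketch with p index bits."""
    return 1 << p


def hll_relative_error(p: int) -> float:
    """Textbook relative standard error estimate: 1.04 / sqrt(m)."""
    return 1.04 / math.sqrt(hll_registers(p))


def dense_size_bytes(p: int, bytes_per_register: int = 1) -> int:
    """Size of a dense sketch.

    Assumption: one byte per register. Real encodings may pack
    registers more tightly or use a sparse form.
    """
    return hll_registers(p) * bytes_per_register


for p in (14, 10):
    print(
        f"p={p}: registers={hll_registers(p)}, "
        f"size={dense_size_bytes(p)} bytes, "
        f"error~{hll_relative_error(p):.4%}"
    )

# Ratio between the 14-bit and 10-bit dense sizes.
reduction = dense_size_bytes(14) // dense_size_bytes(10)
print(f"size reduction: {reduction}x")
```

Under these assumptions the 10-bit sketch is 16 times smaller, and the 
estimated relative error rises from roughly 0.81% to about 3.25%.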



[jira] [Commented] (HIVE-19376) Statistics: switch to 10bit HLL by default for Hive

2018-05-01 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16460157#comment-16460157
 ] 

Ashutosh Chauhan commented on HIVE-19376:
-

I see. Sounds good.



[jira] [Commented] (HIVE-19376) Statistics: switch to 10bit HLL by default for Hive

2018-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16462946#comment-16462946
 ] 

Hive QA commented on HIVE-19376:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  0s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  8s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 14m 52s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-10659/dev-support/hive-personality.sh |
| git revision | master / cc52e9b |
| Default Java | 1.8.0_111 |
| modules | C: standalone-metastore U: standalone-metastore |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-10659/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.





[jira] [Commented] (HIVE-19376) Statistics: switch to 10bit HLL by default for Hive

2018-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463000#comment-16463000
 ] 

Hive QA commented on HIVE-19376:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12921424/HIVE-19376.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 186 failed/errored test(s), 14316 tests executed
*Failed tests:*
{noformat}
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed out) (batchId=247)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed out) (batchId=247)
TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed out) (batchId=217)
TestSequenceFileReadWrite - did not produce a TEST-*.xml file (likely timed out) (batchId=247)
TestTxnExIm - did not produce a TEST-*.xml file (likely timed out) (batchId=286)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_13] (batchId=253)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_2] (batchId=86)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_9] (batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bitvector] (batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_1] (batchId=68)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_2] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin_negative3] (batchId=29)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[compute_stats_date] (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[confirm_initial_tbl_stats] (batchId=31)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cross_join_merge] (batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_table] (batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] (batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_skew_1_23] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[hll] (batchId=90)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_mapjoin] (batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_13] (batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sort_merge_join_desc_7] (batchId=27)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=178)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_semijoin_user_level] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llapdecider] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_no_buckets] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[autoColumnStats_2] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join1] (batchId=173)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join21] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join29] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join30] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_6] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_groupby] (batchId=176)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez2] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[check_constraint] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer1] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer2] (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer3] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer6] (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cross_join] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction] (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_sw] (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainanalyze_2] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCli

[jira] [Commented] (HIVE-19376) Statistics: switch to 10bit HLL by default for Hive

2018-06-07 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505746#comment-16505746
 ] 

Ashutosh Chauhan commented on HIVE-19376:
-

Now that HIVE-18079 is in, can you please rebase this patch?



[jira] [Commented] (HIVE-19376) Statistics: switch to 10bit HLL by default for Hive

2018-06-08 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505855#comment-16505855
 ] 

Prasanth Jayachandran commented on HIVE-19376:
--

I don't think this is required anymore, as HIVE-18079 already switched to 
10 bit HLL for column stats NDV.
