[jira] [Updated] (HIVE-22162) MVs are not using ACID tables.

2019-09-02 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-22162:
--
Status: Patch Available  (was: Open)

> MVs are not using ACID tables.
> --
>
> Key: HIVE-22162
> URL: https://issues.apache.org/jira/browse/HIVE-22162
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views
>Affects Versions: 3.1.2
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22162.1.patch, HIVE-22162.2.patch, 
> HIVE-22162.3.patch
>
>
> {code}
> SET hive.support.concurrency=true;
> SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> SET metastore.strict.managed.tables=true;
> SET hive.default.fileformat=textfile;
> SET hive.default.fileformat.managed=orc;
> SET metastore.create.as.acid=true;
> CREATE TABLE cmv_basetable_n4 (a int, b varchar(256), c decimal(10,2));
> INSERT INTO cmv_basetable_n4 VALUES (1, 'alfred', 10.30),(2, 'bob', 3.14),(2, 
> 'bonnie', 172342.2),(3, 'calvin', 978.76),(3, 'charlie', 9.8);
> CREATE MATERIALIZED VIEW cmv_mat_view_n4 disable rewrite
> AS SELECT a, b, c FROM cmv_basetable_n4;
> DESCRIBE FORMATTED cmv_mat_view_n4;
> {code}
> {code}
> POSTHOOK: query: DESCRIBE FORMATTED cmv_mat_view_n4
> ...
> Table Type:   MATERIALIZED_VIEW
> Table Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"a\":\"true\",\"b\":\"true\",\"c\":\"true\"}}
>   bucketing_version   2   
>   numFiles    1   
>   numRows 5   
>   rawDataSize 1025
>   totalSize   509   
> {code}
> Missing table parameter
> {code}
> transaction = true
> {code}
> cc.: [~ashutoshc], [~gopalv], [~jcamachorodriguez]
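
The check the report describes can be sketched in Java. This is only an illustration, not code from the attached patches: `AcidParamCheck` and `isTransactional` are hypothetical names, and the sketch assumes the parameter the report calls "transaction" is Hive's `transactional` table property. The map mirrors the Table Parameters printed by DESCRIBE FORMATTED above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Hive.
class AcidParamCheck {
    // Assumption: an ACID table carries the table parameter transactional=true.
    static boolean isTransactional(Map<String, String> tableParams) {
        return "true".equalsIgnoreCase(tableParams.get("transactional"));
    }

    public static void main(String[] args) {
        // Table Parameters as reported for cmv_mat_view_n4.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("bucketing_version", "2");
        params.put("numFiles", "1");
        params.put("numRows", "5");
        params.put("rawDataSize", "1025");
        params.put("totalSize", "509");
        System.out.println(isTransactional(params)); // false: the reported symptom

        // What the report expects to see on the materialized view.
        params.put("transactional", "true");
        System.out.println(isTransactional(params)); // true
    }
}
```

For the parameters listed in the report the check comes back false, which is the symptom being described.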



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (HIVE-22162) MVs are not using ACID tables.

2019-09-02 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-22162:
--
Status: Open  (was: Patch Available)



[jira] [Updated] (HIVE-22162) MVs are not using ACID tables.

2019-09-02 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-22162:
--
Attachment: HIVE-22162.3.patch



[jira] [Commented] (HIVE-21930) WINDOW COUNT DISTINCT return wrong value with PARTITION BY

2019-09-02 Thread Igor (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920717#comment-16920717
 ] 

Igor commented on HIVE-21930:
-

Any updates on this?

> WINDOW COUNT DISTINCT return wrong value with PARTITION BY
> --
>
> Key: HIVE-21930
> URL: https://issues.apache.org/jira/browse/HIVE-21930
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 3.1.0
> Environment: Beeline version 3.1.0.3.0.1.0-187 by Apache Hive
>Reporter: Igor
>Priority: Major
>  Labels: distinct, window_funcion
>
> count(distinct a) over (partition by b) returns a wrong result. For example (T is 
> a CTE here):
> {code:java}
> select p, day, ts
> , row_number() OVER (PARTITION BY phone ORDER BY ts ASC) as line_number
> , count(1) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING AND 
> UNBOUNDED FOLLOWING) as lines
> , count(distinct day) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED 
> PRECEDING AND UNBOUNDED FOLLOWING) as days
> FROM T{code}
> The WINDOW specification doesn't affect the results: they are equally wrong with 
> and without a window.
> count(1) and count(distinct day) return the same result. The count distinct value 
> is wrong.
>  
> After I added size(collect_set(day) OVER (PARTITION BY phone)) as days2, 
> count(distinct day) returned the correct result.
> The following query returns a non-empty result:
> {code:java}
> select A.*, B.days, B. from (
> select p, day, ts
> , row_number() OVER (PARTITION BY phone ORDER BY ts ASC) as line_number
> , count(1) OVER (PARTITION BY p ROWS BETWEEN UNBOUNDED PRECEDING AND 
> UNBOUNDED FOLLOWING) as lines
> , count(distinct day) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED 
> PRECEDING AND UNBOUNDED FOLLOWING) as days
> , size(collect_set(day) OVER (PARTITION BY phone)) as days2
> , dense_rank() over (partition by phone order by day) + dense_rank() over 
> (partition by phone order by day desc) - 1 as days3
> FROM T ) as A 
> join (
> select p, day, ts
> , row_number() OVER (PARTITION BY phone ORDER BY ts ASC) as line_number
> , count(1) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING AND 
> UNBOUNDED FOLLOWING) as lines
> , count(distinct day) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED 
> PRECEDING AND UNBOUNDED FOLLOWING) as days
> FROM T
> ) as B on A.p=B.p and A.line_number=B.line_number
> where A.days!=B.days
> order by A.p, A.line_number
> {code}
>  
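
To make the expected values concrete, here is a small Java sketch of what the `days` and `days3` columns should evaluate to. It is an illustration under assumptions, not Hive code: `DistinctPerPartition`, its methods and the sample rows are all hypothetical. `distinctDays` computes the value count(distinct day) should return per phone, and `denseRankSum` reproduces the dense_rank() ascending + dense_rank() descending - 1 expression from the query, which should give the same number.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical reference computation; rows are {phone, day} pairs.
class DistinctPerPartition {
    // Expected count(distinct day) OVER (PARTITION BY phone), keyed by phone.
    static Map<String, Integer> distinctDays(List<String[]> rows) {
        Map<String, Set<String>> seen = new TreeMap<>();
        for (String[] row : rows) {
            seen.computeIfAbsent(row[0], k -> new TreeSet<>()).add(row[1]);
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : seen.entrySet()) {
            counts.put(e.getKey(), e.getValue().size());
        }
        return counts;
    }

    // dense_rank ascending + dense_rank descending - 1 for one (phone, day) row.
    // The day is assumed to occur in that phone's partition.
    static int denseRankSum(List<String[]> rows, String phone, String day) {
        TreeSet<String> days = new TreeSet<>();
        for (String[] row : rows) {
            if (row[0].equals(phone)) {
                days.add(row[1]);
            }
        }
        int asc = days.headSet(day, true).size();  // distinct days <= day
        int desc = days.tailSet(day, true).size(); // distinct days >= day
        return asc + desc - 1;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"A", "2019-01-01"},
            new String[] {"A", "2019-01-01"},
            new String[] {"A", "2019-01-02"},
            new String[] {"B", "2019-01-03"});
        System.out.println(distinctDays(rows));                    // {A=2, B=1}
        System.out.println(denseRankSum(rows, "A", "2019-01-01")); // 2
    }
}
```

In this sample, phone A has three rows but two distinct days, so count(1) and count(distinct day) should differ for it; equal values would match the behaviour reported above.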





[jira] [Updated] (HIVE-21737) Upgrade Avro to version 1.9.1

2019-09-02 Thread Fokko Driesprong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong updated HIVE-21737:

Summary: Upgrade Avro to version 1.9.1  (was: Upgrade Avro to version 1.9.0)

> Upgrade Avro to version 1.9.1
> -
>
> Key: HIVE-21737
> URL: https://issues.apache.org/jira/browse/HIVE-21737
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Ismaël Mejía
>Assignee: Fokko Driesprong
>Priority: Minor
>  Labels: pull-request-available
> Attachments: 0001-HIVE-21737-Bump-Apache-Avro-to-1.9.0.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Avro 1.9.0 was released recently. It brings a lot of fixes including a leaner 
> version of Avro without Jackson in the public API. Worth the update.





[jira] [Updated] (HIVE-21737) Upgrade Avro to version 1.9.1

2019-09-02 Thread Fokko Driesprong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong updated HIVE-21737:

Attachment: (was: 0001-HIVE-21737-Bump-Apache-Avro-to-1.9.0.patch)

> Upgrade Avro to version 1.9.1
> -
>
> Key: HIVE-21737
> URL: https://issues.apache.org/jira/browse/HIVE-21737
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Ismaël Mejía
>Assignee: Fokko Driesprong
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Avro 1.9.0 was released recently. It brings a lot of fixes including a leaner 
> version of Avro without Jackson in the public API. Worth the update.





[jira] [Updated] (HIVE-22149) Metastore: Unify codahale metrics.log json structure between hiveserver2 and metastore services

2019-09-02 Thread Laszlo Bodor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-22149:

Attachment: metrics_metastore.log
metrics_hiveserver2.log

> Metastore: Unify codahale metrics.log json structure between hiveserver2 and 
> metastore services
> ---
>
> Key: HIVE-22149
> URL: https://issues.apache.org/jira/browse/HIVE-22149
> Project: Hive
>  Issue Type: Bug
>Reporter: Laszlo Bodor
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: metrics_hiveserver2.log, metrics_metastore.log
>
>
> While fixing HIVE-22140 I found some really annoying differences between the 
> codahale metric file structures of hiveserver2 and metastore, e.g.
> open_connections: can be found in "counters" for hs2, but in "gauges" for ms
> threads count: it's a proper "threads.count" for hs2, but a really ambiguous 
> "count" for ms
> So I realized that the "memory." and "threads." prefixes are completely absent 
> from the ms metrics file, which is misleading
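
The kind of comparison described here can be sketched in Java. This is a hypothetical illustration, not the attached patch and not the Codahale API: `MetricsLayoutDiff` and its methods are made-up names, each metrics file is modelled as a map from section name to the metric names under it, and placing "threads.count" and "count" under "gauges" is an assumption, since the description does not say which section they are in.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical comparison of two metrics layouts (section -> metric names).
class MetricsLayoutDiff {
    // Inverts a layout into metric name -> section.
    static Map<String, String> sectionOf(Map<String, Set<String>> layout) {
        Map<String, String> byName = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : layout.entrySet()) {
            for (String name : e.getValue()) {
                byName.put(name, e.getKey());
            }
        }
        return byName;
    }

    // Metrics present in both layouts but filed under different sections.
    static List<String> sectionMismatches(Map<String, Set<String>> a,
                                          Map<String, Set<String>> b) {
        Map<String, String> inB = sectionOf(b);
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : sectionOf(a).entrySet()) {
            String other = inB.get(e.getKey());
            if (other != null && !other.equals(e.getValue())) {
                out.add(e.getKey());
            }
        }
        return out;
    }

    // Metric names that appear in layout a but nowhere in layout b.
    static List<String> missingFrom(Map<String, Set<String>> a,
                                    Map<String, Set<String>> b) {
        Map<String, String> inB = sectionOf(b);
        List<String> out = new ArrayList<>();
        for (String name : sectionOf(a).keySet()) {
            if (!inB.containsKey(name)) {
                out.add(name);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> hs2 = new TreeMap<>();
        hs2.put("counters", new TreeSet<>(Arrays.asList("open_connections")));
        hs2.put("gauges", new TreeSet<>(Arrays.asList("threads.count"))); // section assumed

        Map<String, Set<String>> ms = new TreeMap<>();
        ms.put("gauges", new TreeSet<>(Arrays.asList("open_connections", "count")));

        System.out.println(sectionMismatches(hs2, ms)); // [open_connections]
        System.out.println(missingFrom(hs2, ms));       // [threads.count]
    }
}
```

With the two examples from the description, open_connections shows up as a section mismatch and threads.count as a name with no counterpart in the metastore file.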





[jira] [Updated] (HIVE-21737) Upgrade Avro to version 1.9.1

2019-09-02 Thread Fokko Driesprong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong updated HIVE-21737:

Attachment: 0001-HIVE-21737-Bump-Apache-Avro-to-1.9.1.patch

> Upgrade Avro to version 1.9.1
> -
>
> Key: HIVE-21737
> URL: https://issues.apache.org/jira/browse/HIVE-21737
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Ismaël Mejía
>Assignee: Fokko Driesprong
>Priority: Minor
>  Labels: pull-request-available
> Attachments: 0001-HIVE-21737-Bump-Apache-Avro-to-1.9.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Avro 1.9.0 was released recently. It brings a lot of fixes including a leaner 
> version of Avro without Jackson in the public API. Worth the update.





[jira] [Updated] (HIVE-22149) Metastore: Unify codahale metrics.log json structure between hiveserver2 and metastore services

2019-09-02 Thread Laszlo Bodor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-22149:

Attachment: HIVE-22149.01.patch

> Metastore: Unify codahale metrics.log json structure between hiveserver2 and 
> metastore services
> ---
>
> Key: HIVE-22149
> URL: https://issues.apache.org/jira/browse/HIVE-22149
> Project: Hive
>  Issue Type: Bug
>Reporter: Laszlo Bodor
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-22149.01.patch, metrics_hiveserver2.log, 
> metrics_metastore.log
>
>
> While fixing HIVE-22140 I found some really annoying differences between the 
> codahale metric file structures of hiveserver2 and metastore, e.g.
> open_connections: can be found in "counters" for hs2, but in "gauges" for ms
> threads count: it's a proper "threads.count" for hs2, but a really ambiguous 
> "count" for ms
> So I realized that the "memory." and "threads." prefixes are completely absent 
> from the ms metrics file, which is misleading





[jira] [Updated] (HIVE-22149) Metastore: Unify codahale metrics.log json structure between hiveserver2 and metastore services

2019-09-02 Thread Laszlo Bodor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-22149:

Status: Patch Available  (was: Open)



[jira] [Commented] (HIVE-22162) MVs are not using ACID tables.

2019-09-02 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920840#comment-16920840
 ] 

Hive QA commented on HIVE-22162:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m  7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 41s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 38s{color} | {color:blue} ql in master has 2248 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 57s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 46s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-18435/dev-support/hive-personality.sh |
| git revision | master / 04397e5 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-18435/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.




[jira] [Updated] (HIVE-22149) Metastore: Unify codahale metrics.log json structure between hiveserver2 and metastore services

2019-09-02 Thread Laszlo Bodor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-22149:

Attachment: (was: HIVE-22149.01.patch)



[jira] [Updated] (HIVE-22149) Metastore: Unify codahale metrics.log json structure between hiveserver2 and metastore services

2019-09-02 Thread Laszlo Bodor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-22149:

Attachment: HIVE-22149.01.patch



[jira] [Commented] (HIVE-22162) MVs are not using ACID tables.

2019-09-02 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920880#comment-16920880
 ] 

Hive QA commented on HIVE-22162:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12979110/HIVE-22162.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 16746 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.llap.security.TestLlapSignerImpl.testSigning (batchId=361)
org.apache.hadoop.hive.ql.TestTxnCommands.testMergeOnTezEdges (batchId=351)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18435/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18435/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18435/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12979110 - PreCommit-HIVE-Build



[jira] [Work logged] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-09-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20683?focusedWorklogId=305285&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-305285
 ]

ASF GitHub Bot logged work on HIVE-20683:
-

Author: ASF GitHub Bot
Created on: 02/Sep/19 17:02
Start Date: 02/Sep/19 17:02
Worklog Time Spent: 10m 
  Work Description: nishantmonu51 commented on pull request #723: 
[HIVE-20683] Add the Ability to push Dynamic Between and Bloom filters to Druid
URL: https://github.com/apache/hive/pull/723#discussion_r320020777
 
 

 ##
 File path: 
druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandlerUtils.java
 ##
 @@ -894,4 +944,257 @@ public static IndexSpec getIndexSpec(Configuration jc) {
 ImmutableList aggregatorFactories = 
aggregatorFactoryBuilder.build();
 return Pair.of(dimensions, aggregatorFactories.toArray(new 
AggregatorFactory[0]));
   }
+
+  // Druid only supports String,Long,Float,Double selectors
+  private static Set druidSupportedTypeInfos = 
ImmutableSet.of(
+  TypeInfoFactory.stringTypeInfo, TypeInfoFactory.charTypeInfo,
+  TypeInfoFactory.varcharTypeInfo, TypeInfoFactory.byteTypeInfo,
+  TypeInfoFactory.intTypeInfo, TypeInfoFactory.longTypeInfo,
+  TypeInfoFactory.shortTypeInfo, TypeInfoFactory.doubleTypeInfo
+  );
+
+  private static Set stringTypeInfos = ImmutableSet.of(
+  TypeInfoFactory.stringTypeInfo,
+  TypeInfoFactory.charTypeInfo, TypeInfoFactory.varcharTypeInfo
+  );
+
+
+  public static org.apache.druid.query.Query 
addDynamicFilters(org.apache.druid.query.Query query,
+  ExprNodeGenericFuncDesc filterExpr, Configuration conf, boolean 
resolveDynamicValues
+  ) {
+List virtualColumns = Arrays
+.asList(getVirtualColumns(query).getVirtualColumns());
+org.apache.druid.query.Query rv = query;
+DimFilter joinReductionFilter = toDruidFilter(filterExpr, conf, 
virtualColumns,
+resolveDynamicValues
+);
+if(joinReductionFilter != null) {
+  String type = query.getType();
+  DimFilter filter = new AndDimFilter(joinReductionFilter, 
query.getFilter());
+  switch (type) {
+  case org.apache.druid.query.Query.TIMESERIES:
+rv = Druids.TimeseriesQueryBuilder.copy((TimeseriesQuery) query)
+.filters(filter)
+.virtualColumns(VirtualColumns.create(virtualColumns))
+.build();
+break;
+  case org.apache.druid.query.Query.TOPN:
+rv = new TopNQueryBuilder((TopNQuery) query)
+.filters(filter)
+.virtualColumns(VirtualColumns.create(virtualColumns))
+.build();
+break;
+  case org.apache.druid.query.Query.GROUP_BY:
+rv = new GroupByQuery.Builder((GroupByQuery) query)
+.setDimFilter(filter)
+.setVirtualColumns(VirtualColumns.create(virtualColumns))
+.build();
+break;
+  case org.apache.druid.query.Query.SCAN:
+rv = ScanQuery.ScanQueryBuilder.copy((ScanQuery) query)
+.filters(filter)
+.virtualColumns(VirtualColumns.create(virtualColumns))
+.build();
+break;
+  case org.apache.druid.query.Query.SELECT:
+rv = Druids.SelectQueryBuilder.copy((SelectQuery) query)
+.filters(filter)
+.virtualColumns(VirtualColumns.create(virtualColumns))
+.build();
+break;
+  default:
+throw new UnsupportedOperationException("Unsupported Query type " + 
type);
+  }
+}
+return rv;
+  }
+
+  private static DimFilter toDruidFilter(ExprNodeDesc filterExpr, 
Configuration configuration,
+  List virtualColumns, boolean resolveDynamicValues
+  ) {
+if(filterExpr == null) {
+  return null;
+}
+Class genericUDFClass = 
getGenericUDFClassFromExprDesc(filterExpr);
+if(FunctionRegistry.isOpAnd(filterExpr)) {
+  Iterator iterator = filterExpr.getChildren().iterator();
+  List delegates = Lists.newArrayList();
+  while (iterator.hasNext()) {
+DimFilter filter = toDruidFilter(iterator.next(), configuration, 
virtualColumns,
+resolveDynamicValues
+);
+if(filter != null) {
+  delegates.add(filter);
+}
+  }
+  if(delegates != null && !delegates.isEmpty()) {
+return new AndDimFilter(delegates);
+  }
+}
+if(FunctionRegistry.isOpOr(filterExpr)) {
+  Iterator iterator = filterExpr.getChildren().iterator();
+  List delegates = Lists.newArrayList();
+  while (iterator.hasNext()) {
+DimFilter filter = toDruidFilter(iterator.next(), configuration, 
virtualColumns,
+resolveDynamicValues
+);
+if(filter != null) {
+  delegates.add(filter);
+}
+  }
+  if(delegates != null) {
+return new OrDimFilter(delegates);
+  }
+} else if(GenericUDFBetween.class ==

[jira] [Work logged] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-09-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20683?focusedWorklogId=305284&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-305284
 ]

ASF GitHub Bot logged work on HIVE-20683:
-

Author: ASF GitHub Bot
Created on: 02/Sep/19 17:02
Start Date: 02/Sep/19 17:02
Worklog Time Spent: 10m 
  Work Description: nishantmonu51 commented on pull request #723: 
[HIVE-20683] Add the Ability to push Dynamic Between and Bloom filters to Druid
URL: https://github.com/apache/hive/pull/723#discussion_r320020745
 
 

 ##
 File path: 
druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandlerUtils.java
 ##
 @@ -894,4 +944,257 @@ public static IndexSpec getIndexSpec(Configuration jc) {
 ImmutableList aggregatorFactories = 
aggregatorFactoryBuilder.build();
 return Pair.of(dimensions, aggregatorFactories.toArray(new 
AggregatorFactory[0]));
   }
+
+  // Druid only supports String,Long,Float,Double selectors
+  private static Set druidSupportedTypeInfos = 
ImmutableSet.of(
+  TypeInfoFactory.stringTypeInfo, TypeInfoFactory.charTypeInfo,
+  TypeInfoFactory.varcharTypeInfo, TypeInfoFactory.byteTypeInfo,
+  TypeInfoFactory.intTypeInfo, TypeInfoFactory.longTypeInfo,
+  TypeInfoFactory.shortTypeInfo, TypeInfoFactory.doubleTypeInfo
+  );
+
+  private static Set stringTypeInfos = ImmutableSet.of(
+  TypeInfoFactory.stringTypeInfo,
+  TypeInfoFactory.charTypeInfo, TypeInfoFactory.varcharTypeInfo
+  );
+
+
+  public static org.apache.druid.query.Query 
addDynamicFilters(org.apache.druid.query.Query query,
+  ExprNodeGenericFuncDesc filterExpr, Configuration conf, boolean 
resolveDynamicValues
+  ) {
+List virtualColumns = Arrays
+.asList(getVirtualColumns(query).getVirtualColumns());
+org.apache.druid.query.Query rv = query;
+DimFilter joinReductionFilter = toDruidFilter(filterExpr, conf, 
virtualColumns,
+resolveDynamicValues
+);
+if(joinReductionFilter != null) {
+  String type = query.getType();
+  DimFilter filter = new AndDimFilter(joinReductionFilter, 
query.getFilter());
+  switch (type) {
+  case org.apache.druid.query.Query.TIMESERIES:
+rv = Druids.TimeseriesQueryBuilder.copy((TimeseriesQuery) query)
+.filters(filter)
+.virtualColumns(VirtualColumns.create(virtualColumns))
+.build();
+break;
+  case org.apache.druid.query.Query.TOPN:
+rv = new TopNQueryBuilder((TopNQuery) query)
+.filters(filter)
+.virtualColumns(VirtualColumns.create(virtualColumns))
+.build();
+break;
+  case org.apache.druid.query.Query.GROUP_BY:
+rv = new GroupByQuery.Builder((GroupByQuery) query)
+.setDimFilter(filter)
+.setVirtualColumns(VirtualColumns.create(virtualColumns))
+.build();
+break;
+  case org.apache.druid.query.Query.SCAN:
+rv = ScanQuery.ScanQueryBuilder.copy((ScanQuery) query)
+.filters(filter)
+.virtualColumns(VirtualColumns.create(virtualColumns))
+.build();
+break;
+  case org.apache.druid.query.Query.SELECT:
+rv = Druids.SelectQueryBuilder.copy((SelectQuery) query)
+.filters(filter)
+.virtualColumns(VirtualColumns.create(virtualColumns))
+.build();
+break;
+  default:
+throw new UnsupportedOperationException("Unsupported Query type " + type);
+  }
+}
+return rv;
+  }
+
+  private static DimFilter toDruidFilter(ExprNodeDesc filterExpr, 
Configuration configuration,
 
 Review comment:
   added
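
The hunk above ANDs the dynamic join-reduction filter with the query's existing filter and then rebuilds the query for its type. A minimal sketch of the combination step follows, using hypothetical stand-in types (`Filter`, `Leaf`, `And`) rather than Druid's `DimFilter` classes. The null handling shown here is an assumption for illustration; whether the real `AndDimFilter` accepts a null child when the query has no filter is not stated in this thread.

```java
import java.util.ArrayList;
import java.util.List;

public class DynamicFilterSketch {
  /** Hypothetical stand-in for a Druid DimFilter; not the real Druid interface. */
  public interface Filter {
    String describe();
  }

  /** Leaf filter that only carries a textual description. */
  public static final class Leaf implements Filter {
    private final String text;

    public Leaf(String text) {
      this.text = text;
    }

    @Override
    public String describe() {
      return text;
    }
  }

  /** Conjunction of child filters (stand-in for an AND filter). */
  public static final class And implements Filter {
    private final List<Filter> children;

    public And(List<Filter> children) {
      this.children = children;
    }

    @Override
    public String describe() {
      List<String> parts = new ArrayList<>();
      for (Filter child : children) {
        parts.add(child.describe());
      }
      return "(" + String.join(" AND ", parts) + ")";
    }
  }

  /** Combines the dynamic filter with the existing one, skipping whichever is null. */
  public static Filter combine(Filter dynamic, Filter existing) {
    if (dynamic == null) {
      return existing;
    }
    if (existing == null) {
      return dynamic;
    }
    List<Filter> both = new ArrayList<>();
    both.add(dynamic);
    both.add(existing);
    return new And(both);
  }

  public static void main(String[] args) {
    Filter dynamic = new Leaf("k BETWEEN 1 AND 9");
    Filter existing = new Leaf("country = 'US'");
    System.out.println(combine(dynamic, existing).describe());
    System.out.println(combine(dynamic, null).describe());
  }
}
```

Returning the non-null side unchanged avoids wrapping a single filter in a one-element conjunction; this is a design choice of the sketch, not something taken from the patch.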
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 305284)
Time Spent: 1h 20m  (was: 1h 10m)


[jira] [Work logged] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-09-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20683?focusedWorklogId=305286&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-305286
 ]

ASF GitHub Bot logged work on HIVE-20683:
-

Author: ASF GitHub Bot
Created on: 02/Sep/19 17:03
Start Date: 02/Sep/19 17:03
Worklog Time Spent: 10m 
  Work Description: nishantmonu51 commented on pull request #723: 
[HIVE-20683] Add the Ability to push Dynamic Between and Bloom filters to Druid
URL: https://github.com/apache/hive/pull/723#discussion_r320020948
 
 

 ##
 File path: 
druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandlerUtils.java
 ##
 @@ -894,4 +944,257 @@ public static IndexSpec getIndexSpec(Configuration jc) {
 ImmutableList aggregatorFactories = 
aggregatorFactoryBuilder.build();
 return Pair.of(dimensions, aggregatorFactories.toArray(new 
AggregatorFactory[0]));
   }
+
+  // Druid only supports String,Long,Float,Double selectors
+  private static Set druidSupportedTypeInfos = 
ImmutableSet.of(
+  TypeInfoFactory.stringTypeInfo, TypeInfoFactory.charTypeInfo,
+  TypeInfoFactory.varcharTypeInfo, TypeInfoFactory.byteTypeInfo,
+  TypeInfoFactory.intTypeInfo, TypeInfoFactory.longTypeInfo,
+  TypeInfoFactory.shortTypeInfo, TypeInfoFactory.doubleTypeInfo
+  );
+
+  private static Set stringTypeInfos = ImmutableSet.of(
+  TypeInfoFactory.stringTypeInfo,
+  TypeInfoFactory.charTypeInfo, TypeInfoFactory.varcharTypeInfo
+  );
+
+
+  public static org.apache.druid.query.Query addDynamicFilters(org.apache.druid.query.Query query,
 
 Review comment:
   Not sure about it yet, since most of the logic is similar to that for query filters.
   We can probably move this into DruidQueryUtils at some point, in a separate PR.
   
 



Issue Time Tracking
---

Worklog Id: (was: 305286)
Time Spent: 1h 40m  (was: 1.5h)

> Add the Ability to push Dynamic Between and Bloom filters to Druid
> --
>
> Key: HIVE-20683
> URL: https://issues.apache.org/jira/browse/HIVE-20683
> Project: Hive
>  Issue Type: New Feature
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20683.1.patch, HIVE-20683.2.patch, 
> HIVE-20683.3.patch, HIVE-20683.4.patch, HIVE-20683.5.patch, 
> HIVE-20683.6.patch, HIVE-20683.8.patch, HIVE-20683.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> For optimizing joins, Hive generates BETWEEN filter with min-max and BLOOM 
> filter for filtering one side of semi-join.
> Druid 0.13.0 will have support for Bloom filters (Added via 
> https://github.com/apache/incubator-druid/pull/6222)
> Implementation details - 
> # Hive generates and passes the filters as part of 'filterExpr' in TableScan. 
> # DruidQueryBasedRecordReader gets this filter passed as part of the conf. 
> # During execution phase, before sending the query to druid in 
> DruidQueryBasedRecordReader we will deserialize this filter, translate it 
> into a DruidDimFilter and add it to existing DruidQuery.  Tez executor 
> already ensures that when we start reading results from the record reader, 
> all the dynamic values are initialized. 
> # Explaining a druid query also prints the query sent to druid as 
> {{druid.json.query}}. We also need to make sure to update the druid query 
> with the filters. During explain we do not have the actual values for the 
> dynamic values, so instead of values we will print the dynamic expression 
> itself as part of druid query. 
> Note:- This work needs druid to be updated to version 0.13.0
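
The description above says Hive generates a min-max BETWEEN filter plus a Bloom filter to prune one side of a semi-join. A minimal sketch of that idea in plain Java follows. It does not use Hive or Druid classes; the names (`Bloom`, `DynamicFilter`, `build`) and the hash mixing are assumptions made for illustration only.

```java
import java.util.BitSet;

public class SemiJoinReductionSketch {
  /** Tiny Bloom filter over long keys; the hash mixing is illustrative only. */
  public static final class Bloom {
    private final BitSet bits;
    private final int size;

    public Bloom(int size) {
      this.size = size;
      this.bits = new BitSet(size);
    }

    private int slot(long key, int seed) {
      long h = key * 0x9E3779B97F4A7C15L + seed * 0xC2B2AE3D27D4EB4FL;
      h ^= (h >>> 32);
      return (int) Math.floorMod(h, (long) size);
    }

    public void add(long key) {
      bits.set(slot(key, 1));
      bits.set(slot(key, 2));
    }

    /** May return true for keys never added, but never false for added keys. */
    public boolean mightContain(long key) {
      return bits.get(slot(key, 1)) && bits.get(slot(key, 2));
    }
  }

  /** Min-max range check combined with a Bloom membership check. */
  public static final class DynamicFilter {
    public final long min;
    public final long max;
    public final Bloom bloom;

    public DynamicFilter(long min, long max, Bloom bloom) {
      this.min = min;
      this.max = max;
      this.bloom = bloom;
    }

    /** A row survives only if its key is in [min, max] and may be in the Bloom filter. */
    public boolean test(long key) {
      return key >= min && key <= max && bloom.mightContain(key);
    }
  }

  /** Builds the filter from the join keys of the small (build) side. */
  public static DynamicFilter build(long[] keys) {
    long min = Long.MAX_VALUE;
    long max = Long.MIN_VALUE;
    Bloom bloom = new Bloom(1024);
    for (long key : keys) {
      min = Math.min(min, key);
      max = Math.max(max, key);
      bloom.add(key);
    }
    return new DynamicFilter(min, max, bloom);
  }

  public static void main(String[] args) {
    DynamicFilter filter = build(new long[] {10L, 20L, 30L});
    System.out.println(filter.test(20L)); // a key from the build side always passes
    System.out.println(filter.test(40L)); // outside [10, 30], always rejected
  }
}
```

The range check cheaply rejects keys outside the build side's bounds, and the Bloom check prunes most of the remaining non-matching keys. False positives are possible, so the join itself still has to verify matches.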



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-09-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20683?focusedWorklogId=305287&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-305287
 ]

ASF GitHub Bot logged work on HIVE-20683:
-

Author: ASF GitHub Bot
Created on: 02/Sep/19 17:03
Start Date: 02/Sep/19 17:03
Worklog Time Spent: 10m 
  Work Description: nishantmonu51 commented on pull request #723: 
[HIVE-20683] Add the Ability to push Dynamic Between and Bloom filters to Druid
URL: https://github.com/apache/hive/pull/723#discussion_r320020963
 
 

 ##
 File path: ql/src/test/results/clientpositive/druid/druidmini_joins.q.out
 ##
 @@ -223,8 +228,8 @@ GROUP BY `tbl1`.`username`
 POSTHOOK: type: QUERY
 
 Review comment:
   added
 



Issue Time Tracking
---

Worklog Id: (was: 305287)
Time Spent: 1h 50m  (was: 1h 40m)



[jira] [Work logged] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-09-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20683?focusedWorklogId=305288&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-305288
 ]

ASF GitHub Bot logged work on HIVE-20683:
-

Author: ASF GitHub Bot
Created on: 02/Sep/19 17:04
Start Date: 02/Sep/19 17:04
Worklog Time Spent: 10m 
  Work Description: nishantmonu51 commented on pull request #723: 
[HIVE-20683] Add the Ability to push Dynamic Between and Bloom filters to Druid
URL: https://github.com/apache/hive/pull/723#discussion_r320021144
 
 

 ##
 File path: ql/src/test/results/clientpositive/druid/druidmini_expressions.q.out
 ##
 @@ -1868,9 +1868,9 @@ POSTHOOK: query: SELECT DATE_ADD(cast(`__time` as date), 
CAST((cdouble / 1000) A
 POSTHOOK: type: QUERY
 POSTHOOK: Input: default@druid_table_alltypesorc
 POSTHOOK: Output: hdfs://### HDFS PATH ###
-1969-02-26 1970-11-04
 
 Review comment:
   Checked it out: the result of DATE_ADD(cast(`__time` as date), CAST((cdouble / 1000) 
changed because the cdouble value changed due to the change in rollup.
   Added more columns to this query in this PR to make this clearer.
 



Issue Time Tracking
---

Worklog Id: (was: 305288)
Time Spent: 2h  (was: 1h 50m)



[jira] [Work logged] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-09-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20683?focusedWorklogId=305289&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-305289
 ]

ASF GitHub Bot logged work on HIVE-20683:
-

Author: ASF GitHub Bot
Created on: 02/Sep/19 17:04
Start Date: 02/Sep/19 17:04
Worklog Time Spent: 10m 
  Work Description: nishantmonu51 commented on pull request #723: 
[HIVE-20683] Add the Ability to push Dynamic Between and Bloom filters to Druid
URL: https://github.com/apache/hive/pull/723#discussion_r320021168
 
 

 ##
 File path: ql/src/test/results/clientpositive/druid/druidmini_expressions.q.out
 ##
 @@ -1868,9 +1868,9 @@ POSTHOOK: query: SELECT DATE_ADD(cast(`__time` as date), 
CAST((cdouble / 1000) A
 POSTHOOK: type: QUERY
 POSTHOOK: Input: default@druid_table_alltypesorc
 POSTHOOK: Output: hdfs://### HDFS PATH ###
-1969-02-26 1970-11-04
-1969-03-19 1970-10-14
-1969-11-13 1970-02-17
+1969-12-15 1970-01-16
+1969-12-15 1970-01-16
+1969-12-15 1970-01-16
 
 Review comment:
   Checked it out: the result of DATE_ADD(cast(__time as date), CAST((cdouble / 1000) 
changed because the cdouble value changed due to the change in rollup.
   Added more columns to this query in this PR to make this clearer.
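
The date pairs in the hunk above are consistent with a fixed base date shifted by a day offset in both directions. A small worked check with `java.time` follows. The base date 1969-12-31, the second output column being a matching date subtraction, and the integer offsets (-308, -287, -48 before the change, -16 after) are all assumptions inferred from the printed rows, not values stated in the review.

```java
import java.time.LocalDate;

public class RollupDateShiftSketch {
  /** Mirrors a date-add: shift the base date forward by the given number of days. */
  public static LocalDate dateAdd(LocalDate base, int days) {
    return base.plusDays(days);
  }

  /** Mirrors a date-subtract: shift the base date backward by the given number of days. */
  public static LocalDate dateSub(LocalDate base, int days) {
    return base.minusDays(days);
  }

  public static void main(String[] args) {
    // Assumed value of cast(__time as date); not stated in the thread.
    LocalDate base = LocalDate.of(1969, 12, 31);

    // Assumed day offsets before the rollup change.
    int[] before = {-308, -287, -48};
    for (int days : before) {
      System.out.println(dateAdd(base, days) + " " + dateSub(base, days));
    }
    // prints:
    // 1969-02-26 1970-11-04
    // 1969-03-19 1970-10-14
    // 1969-11-13 1970-02-17

    // Assumed day offset after the rollup change.
    int after = -16;
    System.out.println(dateAdd(base, after) + " " + dateSub(base, after));
    // prints: 1969-12-15 1970-01-16
  }
}
```

Under these assumptions, a different rolled-up cdouble value changes only the day offset, which is enough to explain why every output row moved.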
 



Issue Time Tracking
---

Worklog Id: (was: 305289)
Time Spent: 2h 10m  (was: 2h)



[jira] [Work logged] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-09-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20683?focusedWorklogId=305290&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-305290
 ]

ASF GitHub Bot logged work on HIVE-20683:
-

Author: ASF GitHub Bot
Created on: 02/Sep/19 17:05
Start Date: 02/Sep/19 17:05
Worklog Time Spent: 10m 
  Work Description: nishantmonu51 commented on issue #723: [HIVE-20683] Add 
the Ability to push Dynamic Between and Bloom filters to Druid
URL: https://github.com/apache/hive/pull/723#issuecomment-527206460
 
 
   @b-slim : Updated based on your comments, please check. 
 



Issue Time Tracking
---

Worklog Id: (was: 305290)
Time Spent: 2h 20m  (was: 2h 10m)



[jira] [Updated] (HIVE-22162) MVs are not using ACID tables.

2019-09-02 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-22162:
--
Attachment: HIVE-22162.4.patch

> MVs are not using ACID tables.
> --
>
> Key: HIVE-22162
> URL: https://issues.apache.org/jira/browse/HIVE-22162
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views
>Affects Versions: 3.1.2
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22162.1.patch, HIVE-22162.2.patch, 
> HIVE-22162.3.patch, HIVE-22162.4.patch
>
>
> {code}
> SET hive.support.concurrency=true;
> SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> SET metastore.strict.managed.tables=true;
> SET hive.default.fileformat=textfile;
> SET hive.default.fileformat.managed=orc;
> SET metastore.create.as.acid=true;
> CREATE TABLE cmv_basetable_n4 (a int, b varchar(256), c decimal(10,2));
> INSERT INTO cmv_basetable_n4 VALUES (1, 'alfred', 10.30),(2, 'bob', 3.14),(2, 
> 'bonnie', 172342.2),(3, 'calvin', 978.76),(3, 'charlie', 9.8);
> CREATE MATERIALIZED VIEW cmv_mat_view_n4 disable rewrite
> AS SELECT a, b, c FROM cmv_basetable_n4;
> DESCRIBE FORMATTED cmv_mat_view_n4;
> {code}
> {code}
> POSTHOOK: query: DESCRIBE FORMATTED cmv_mat_view_n4
> ...
> Table Type:   MATERIALIZED_VIEW
> Table Parameters:  
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"a\":\"true\",\"b\":\"true\",\"c\":\"true\"}}
>   bucketing_version       2
>   numFiles                1
>   numRows                 5
>   rawDataSize             1025
>   totalSize               509
> {code}
> Missing table parameter:
> {code}
> transactional = true
> {code}
> cc.: [~ashutoshc], [~gopalv], [~jcamachorodriguez]





[jira] [Updated] (HIVE-22162) MVs are not using ACID tables.

2019-09-02 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-22162:
--
Status: Open  (was: Patch Available)



[jira] [Updated] (HIVE-22162) MVs are not using ACID tables.

2019-09-02 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-22162:
--
Status: Patch Available  (was: Open)



[jira] [Commented] (HIVE-22162) MVs are not using ACID tables.

2019-09-02 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921199#comment-16921199
 ] 

Hive QA commented on HIVE-22162:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 39s{color} | {color:blue} ql in master has 2248 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 36s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-18436/dev-support/hive-personality.sh |
| git revision | master / 04397e5 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-18436/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> MVs are not using ACID tables.
> --
>
> Key: HIVE-22162
> URL: https://issues.apache.org/jira/browse/HIVE-22162
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views
>Affects Versions: 3.1.2
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22162.1.patch, HIVE-22162.2.patch, 
> HIVE-22162.3.patch, HIVE-22162.4.patch
>
>
> {code}
> SET hive.support.concurrency=true;
> SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> SET metastore.strict.managed.tables=true;
> SET hive.default.fileformat=textfile;
> SET hive.default.fileformat.managed=orc;
> SET metastore.create.as.acid=true;
> CREATE TABLE cmv_basetable_n4 (a int, b varchar(256), c decimal(10,2));
> INSERT INTO cmv_basetable_n4 VALUES (1, 'alfred', 10.30),(2, 'bob', 3.14),(2, 
> 'bonnie', 172342.2),(3, 'calvin', 978.76),(3, 'charlie', 9.8);
> CREATE MATERIALIZED VIEW cmv_mat_view_n4 disable rewrite
> AS SELECT a, b, c FROM cmv_basetable_n4;
> DESCRIBE FORMATTED cmv_mat_view_n4;
> {code}
> {code}
> POSTHOOK: query: DESCRIBE FORMATTED cmv_mat_view_n4
> ...
> Table Type:   MATERIALIZED_VIEW
> Table Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"a\":\"true\",\"b\":\"true\",\"c\":\"true\"}}
>   bucketing_version   2   
>   numFiles1   
>   numRows 5   
>   rawDataSize 1025
>   totalSize