[jira] [Assigned] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor reassigned HIVE-20910:
---

Assignee: Vineet Garg  (was: Laszlo Bodor)

> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20910.1.patch, HIVE-20910.2.patch, 
> HIVE-20910.3.patch, HIVE-20910.4.patch
>
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:308)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSimpleEvent(OrcRecordUpdater.java:423)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSplitUpdateEvent(OrcRecordUpdater.java:431)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.insert(OrcRecordUpdater.java:483)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:995)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:490)
>   ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-20910:

Attachment: (was: HIVE-20910.04.patch)

> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-20910.1.patch, HIVE-20910.2.patch, 
> HIVE-20910.3.patch, HIVE-20910.4.patch
>
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:308)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSimpleEvent(OrcRecordUpdater.java:423)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSplitUpdateEvent(OrcRecordUpdater.java:431)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.insert(OrcRecordUpdater.java:483)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:995)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:490)
>   ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-20910:

Attachment: HIVE-20910.04.patch

> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-20910.1.patch, HIVE-20910.2.patch, 
> HIVE-20910.3.patch, HIVE-20910.4.patch
>
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:308)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSimpleEvent(OrcRecordUpdater.java:423)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSplitUpdateEvent(OrcRecordUpdater.java:431)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.insert(OrcRecordUpdater.java:483)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:995)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:490)
>   ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-20910:

Attachment: HIVE-20910.4.patch

> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-20910.1.patch, HIVE-20910.2.patch, 
> HIVE-20910.3.patch, HIVE-20910.4.patch
>
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:308)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSimpleEvent(OrcRecordUpdater.java:423)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSplitUpdateEvent(OrcRecordUpdater.java:431)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.insert(OrcRecordUpdater.java:483)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:995)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:490)
>   ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor reassigned HIVE-20910:
---

Assignee: Laszlo Bodor  (was: Vineet Garg)

> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-20910.1.patch, HIVE-20910.2.patch, 
> HIVE-20910.3.patch
>
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:308)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSimpleEvent(OrcRecordUpdater.java:423)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSplitUpdateEvent(OrcRecordUpdater.java:431)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.insert(OrcRecordUpdater.java:483)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:995)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:490)
>   ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20911) External Table Replication for Hive

2018-11-13 Thread anishek (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek reassigned HIVE-20911:
--

Assignee: anishek

> External Table Replication for Hive
> ---
>
> Key: HIVE-20911
> URL: https://issues.apache.org/jira/browse/HIVE-20911
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Critical
> Fix For: 4.0.0
>
>
> External tables are not replicated currently as part of hive replication. As 
> part of this jira we want to enable that.
> Approach:
> * Target cluster will have a top level base directory config that will be 
> used to copy all data relevant to external tables. This will be provided via 
> the *with* clause in the *repl load* command. This base path will be prefixed 
> to the path of the same external table on source cluster.
> * Since changes to directories of an external table can happen without Hive 
> knowing about them, we cannot capture the relevant events whenever new data is 
> added or removed; we will have to copy the data from the source path to the 
> target path for external tables every time we run incremental replication.
> ** this will require the incremental *repl dump* to create an additional 
> file *\_external\_tables\_info* with data in the following form 
> {code}
> OperationType,tableName,base64Encoded(tableDataLocation)
> {code}
> where OperationType can be one of (ADD, REMOVE); a minimal encoding sketch 
> follows this description
> ** *repl load* will look up all the external tables on target and remove 
> tables listed with REMOVE type in the above file.
> ** For the remaining tables it will create tasks for the corresponding paths 
> from source to target along with the existing tasks for incremental load.
> * New external tables will be created, with data copied as part of the regular 
> tasks during incremental load, applying the base directory prefix
> * Bootstrap will also create / copy these external tables as part of its 
> regular workflow, applying the base directory prefix
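
A minimal sketch (not part of the original issue) of how one line of the proposed *\_external\_tables\_info* file could be written and read, assuming the comma-separated, Base64-encoded layout shown above; the class and method names are hypothetical illustrations, not Hive's actual implementation:

{code:java}
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper mirroring the proposed line layout:
//   OperationType,tableName,base64Encoded(tableDataLocation)
public class ExternalTableInfoLine {

  enum OperationType { ADD, REMOVE }

  // Build one line for the incremental dump file.
  static String write(OperationType op, String tableName, String dataLocation) {
    String encodedLocation = Base64.getEncoder()
        .encodeToString(dataLocation.getBytes(StandardCharsets.UTF_8));
    return op + "," + tableName + "," + encodedLocation;
  }

  // Parse one line back into [operationType, tableName, decodedLocation].
  static String[] parse(String line) {
    String[] parts = line.split(",", 3);
    parts[2] = new String(Base64.getDecoder().decode(parts[2]), StandardCharsets.UTF_8);
    return parts;
  }
}
{code}

Base64-encoding the location keeps commas and other special characters in table paths from breaking the single-line format, which is presumably why only the path field is encoded.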



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20911) External Table Replication for Hive

2018-11-13 Thread anishek (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-20911:
---
Description: 
External tables are not replicated currently as part of hive replication. As 
part of this jira we want to enable that.

Approach:
* Target cluster will have a top level base directory config that will be used 
to copy all data relevant to external tables. This will be provided via the 
*with* clause in the *repl load* command. This base path will be prefixed to 
the path of the same external table on source cluster.
* Since changes to directories of an external table can happen without Hive 
knowing about them, we cannot capture the relevant events whenever new data is 
added or removed; we will have to copy the data from the source path to the 
target path for external tables every time we run incremental replication.
** this will require the incremental *repl dump* to create an additional file 
*\_external\_tables\_info* with data in the following form 
{code}
OperationType,tableName,base64Encoded(tableDataLocation)
{code}
where OperationType can be one of (ADD, REMOVE)
** *repl load* will look up all the external tables on target and remove tables 
listed with REMOVE type in the above file.
** For the remaining tables it will create tasks for the corresponding paths 
from source to target along with the existing tasks for incremental load.
* New external tables will be created, with data copied as part of the regular 
tasks during incremental load, applying the base directory prefix
* Bootstrap will also create / copy these external tables as part of its 
regular workflow, applying the base directory prefix

  was:
External tables are not replicated currently as part of hive replication. As 
part of this jira we want to enable that.

Approach:
* Target cluster will have a top level base directory config that will be used 
to copy all data relevant to external tables. This will be provided via the 
*with* clause in the *repl load* command. This base path will be prefixed to 
the path of the same external table on source cluster.
* Since changes to directories of an external table can happen without Hive 
knowing about them, we cannot capture the relevant events whenever new data is 
added or removed; we will have to copy the data from the source path to the 
target path for external tables every time we run incremental replication.
** this will require the incremental *repl dump* to create an additional file 
*\_external_\tables\_info* with data in the following form 
{code}
OperationType,tableName,base64Encoded(tableDataLocation)
{code}
where OperationType can be one of (ADD, REMOVE)
** *repl load* will look up all the external tables on target and remove tables 
listed with REMOVE type in the above file.
** For the remaining tables it will create tasks for the corresponding paths 
from source to target along with the existing tasks for incremental load.
* New external tables will be created, with data copied as part of the regular 
tasks during incremental load, applying the base directory prefix
* Bootstrap will also create / copy these external tables as part of its 
regular workflow, applying the base directory prefix


> External Table Replication for Hive
> ---
>
> Key: HIVE-20911
> URL: https://issues.apache.org/jira/browse/HIVE-20911
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: anishek
>Priority: Critical
> Fix For: 4.0.0
>
>
> External tables are not replicated currently as part of hive replication. As 
> part of this jira we want to enable that.
> Approach:
> * Target cluster will have a top level base directory config that will be 
> used to copy all data relevant to external tables. This will be provided via 
> the *with* clause in the *repl load* command. This base path will be prefixed 
> to the path of the same external table on source cluster.
> * Since changes to directories of an external table can happen without Hive 
> knowing about them, we cannot capture the relevant events whenever new data is 
> added or removed; we will have to copy the data from the source path to the 
> target path for external tables every time we run incremental replication.
> ** this will require the incremental *repl dump* to create an additional 
> file *\_external\_tables\_info* with data in the following form 
> {code}
> OperationType,tableName,base64Encoded(tableDataLocation)
> {code}
> where OperationType can be one of (ADD, REMOVE)
> ** *repl load* will look up all the external tables on target and remove 
> tables listed with REMOVE type in the above file.
> ** For the remaining tables it will create tasks for the corresponding paths 
> from source to target along with the existing tasks for incremental load.
> * New external tables will be created, with data copied as part of the regular 
> tasks during incremental load, applying the base directory prefix
> * Bootstrap will also create / copy these external tables as part of its 
> regular workflow, applying the base directory prefix

[jira] [Commented] (HIVE-20822) Improvements to push computation to JDBC from Calcite

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686146#comment-16686146
 ] 

Hive QA commented on HIVE-20822:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12948069/HIVE-20822.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 15540 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[external_jdbc_table2]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[external_jdbc_table]
 (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[external_jdbc_table_perf]
 (batchId=181)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14922/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14922/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14922/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12948069 - PreCommit-HIVE-Build

> Improvements to push computation to JDBC from Calcite
> -
>
> Key: HIVE-20822
> URL: https://issues.apache.org/jira/browse/HIVE-20822
> Project: Hive
>  Issue Type: Improvement
>  Components: StorageHandler
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20822.01.patch, HIVE-20822.02.patch, 
> HIVE-20822.02.patch, HIVE-20822.03.patch, HIVE-20822.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20822) Improvements to push computation to JDBC from Calcite

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686122#comment-16686122
 ] 

Hive QA commented on HIVE-20822:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
36s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
43s{color} | {color:blue} ql in master has 2317 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
38s{color} | {color:red} ql: The patch generated 6 new + 185 unchanged - 0 
fixed = 191 total (was 185) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 555 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
7s{color} | {color:red} The patch 142 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 46s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14922/dev-support/hive-personality.sh
 |
| git revision | master / d813b48 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14922/yetus/diff-checkstyle-ql.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14922/yetus/whitespace-eol.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14922/yetus/whitespace-tabs.txt
 |
| modules | C: itests ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14922/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> Improvements to push computation to JDBC from Calcite
> -
>
> Key: HIVE-20822
> URL: https://issues.apache.org/jira/browse/HIVE-20822
> Project: Hive
>  Issue Type: Improvement
>  Components: StorageHandler
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20822.01.patch, HIVE-20822.02.patch, 
> HIVE-20822.02.patch, HIVE-20822.03.patch, HIVE-20822.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686110#comment-16686110
 ] 

Hive QA commented on HIVE-20910:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12948067/HIVE-20910.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 15539 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.metastore.client.TestForeignKey.addNoSuchCatalog[Remote] 
(batchId=223)
org.apache.hadoop.hive.metastore.client.TestForeignKey.addNoSuchDb[Remote] 
(batchId=223)
org.apache.hadoop.hive.metastore.client.TestForeignKey.addNoSuchTable[Remote] 
(batchId=223)
org.apache.hadoop.hive.metastore.client.TestForeignKey.createGetDrop2Column[Remote]
 (batchId=223)
org.apache.hadoop.hive.metastore.client.TestForeignKey.createGetDrop[Remote] 
(batchId=223)
org.apache.hadoop.hive.metastore.client.TestForeignKey.createTableWithConstraintsInOtherCatalog[Remote]
 (batchId=223)
org.apache.hadoop.hive.metastore.client.TestForeignKey.createTableWithConstraints[Remote]
 (batchId=223)
org.apache.hadoop.hive.metastore.client.TestForeignKey.foreignKeyAcrossCatalogs[Remote]
 (batchId=223)
org.apache.hadoop.hive.metastore.client.TestForeignKey.inOtherCatalog[Remote] 
(batchId=223)
org.apache.hadoop.hive.metastore.client.TestForeignKey.noSuchPk[Remote] 
(batchId=223)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14921/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14921/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14921/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12948067 - PreCommit-HIVE-Build

> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20910.1.patch, HIVE-20910.2.patch, 
> HIVE-20910.3.patch
>
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:308)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSimpleEvent(OrcRecordUpdater.java:423)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSplitUpdateEvent(OrcRecordUpdater.java:431)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.insert(OrcRecordUpdater.java:483)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:995)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:490)
>   ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686094#comment-16686094
 ] 

Hive QA commented on HIVE-20910:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
45s{color} | {color:blue} ql in master has 2317 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 24s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14921/dev-support/hive-personality.sh
 |
| git revision | master / d813b48 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14921/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20910.1.patch, HIVE-20910.2.patch, 
> HIVE-20910.3.patch
>
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.Writer

[jira] [Commented] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-13 Thread Ashutosh Bapat (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686087#comment-16686087
 ] 

Ashutosh Bapat commented on HIVE-20794:
---

[~maheshk114], [~daijy], can you please review these changes?

> Use Zookeeper for metastore service discovery
> -
>
> Key: HIVE-20794
> URL: https://issues.apache.org/jira/browse/HIVE-20794
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20794.01
>
>
> Right now, multiple metastore services can be specified in the 
> hive.metastore.uris configuration, but that list is static and cannot be 
> modified dynamically. Use Zookeeper for dynamic service discovery of the 
> metastore.
> h3. Improve ZooKeeperHiveHelper class (suggestions for name welcome)
> The Zookeeper related code (for service discovery) accesses Zookeeper 
> parameters directly from HiveConf. The class is changed so that it could be 
> used for both HiveServer2 and Metastore server and works with both the 
> configurations. Following methods from HiveServer2 are now moved into 
> ZooKeeperHiveHelper. # startZookeeperClient # addServerInstanceToZooKeeper # 
> removeServerInstanceFromZooKeeper
> h3. HiveMetaStore conf changes
>  # THRIFT_URIS (hive.metastore.uris) can also be used to specify ZooKeeper 
> quorum. When THRIFT_SERVICE_DISCOVERY_MODE 
> (hive.metastore.service.discovery.mode) is set to "zookeeper" the URIs are 
> used as ZooKeeper quorum. When it's set to be empty, the URIs are used to 
> locate the metastore directly.
>  # Here's a list of HiveServer2's parameters and their proposed metastore conf 
> counterparts. It looks odd that the metastore-related configurations do not 
> have their macros start with METASTORE, but start with THRIFT. I have just 
> followed the naming convention used for other parameters.
> #* HIVE_SERVER2_ZOOKEEPER_NAMESPACE - THRIFT_ZOOKEEPER_NAMESPACE 
> (hive.metastore.zookeeper.namespace)
> #* HIVE_ZOOKEEPER_CLIENT_PORT - THRIFT_ZOOKEEPER_CLIENT_PORT 
> (hive.metastore.zookeeper.client.port)
> #* HIVE_ZOOKEEPER_CONNECTION_TIMEOUT - THRIFT_ZOOKEEPER_CONNECTION_TIMEOUT - 
> (hive.metastore.zookeeper.connection.timeout)
> #* HIVE_ZOOKEEPER_CONNECTION_MAX_RETRIES - 
> THRIFT_ZOOKEEPER_CONNECTION_MAX_RETRIES 
> (hive.metastore.zookeeper.connection.max.retries)
> #* HIVE_ZOOKEEPER_CONNECTION_BASESLEEPTIME - 
> THRIFT_ZOOKEEPER_CONNECTION_BASESLEEPTIME 
> (hive.metastore.zookeeper.connection.basesleeptime)
> Following Hive ZK configurations seem to be related to managing locks and 
> seem irrelevant for MS ZK.
> # HIVE_ZOOKEEPER_SESSION_TIMEOUT
> # HIVE_ZOOKEEPER_CLEAN_EXTRA_NODES
> Since there is no configuration to be published, 
> HIVE_ZOOKEEPER_PUBLISH_CONFIGS does not have a THRIFT counterpart.
> h3. HiveMetaStore class changes
> # startMetaStore should also register the instance with Zookeeper, when 
> configured.
> # When shutting a metastore server down it should deregister itself from 
> Zookeeper, when configured.
> # These changes use the refactored code described above.
> h3. HiveMetaStoreClient class changes
> When the service discovery mode is zookeeper, we fetch the metastore URIs from 
> the specified ZooKeeper and treat those as if they were specified in 
> THRIFT_URIS, i.e. use the existing mechanisms to choose a metastore server to 
> connect to and establish a connection.
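
As a rough illustration (not from the issue) of the client-side lookup described above, a minimal Apache Curator sketch that lists the metastore instances registered under a ZooKeeper namespace and joins them into a THRIFT_URIS-style string. The quorum string, the namespace path, and the assumption that each instance publishes its host:port as a child znode name are illustrative only:

{code:java}
import java.util.List;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class MetastoreUriDiscoverySketch {
  public static void main(String[] args) throws Exception {
    // Assumed values: ZK quorum taken from hive.metastore.uris and the
    // namespace from hive.metastore.zookeeper.namespace.
    String zkQuorum = "zk1:2181,zk2:2181,zk3:2181";
    String namespace = "/hivemetastore";

    CuratorFramework client = CuratorFrameworkFactory.newClient(
        zkQuorum, new ExponentialBackoffRetry(1000, 3));
    client.start();
    try {
      // Each registered metastore instance is assumed to appear as one child
      // znode whose name carries its host:port.
      List<String> instances = client.getChildren().forPath(namespace);
      System.out.println("Discovered metastore URIs: " + String.join(",", instances));
    } finally {
      client.close();
    }
  }
}
{code}

The client would then treat the joined list the same way as a statically configured hive.metastore.uris value when choosing a server to connect to.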



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-13 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20842:
---
Status: Open  (was: Patch Available)

> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.10.patch, 
> HIVE-20842.11.patch, HIVE-20842.2.patch, HIVE-20842.3.patch, 
> HIVE-20842.4.patch, HIVE-20842.5.patch, HIVE-20842.6.patch, 
> HIVE-20842.7.patch, HIVE-20842.8.patch, HIVE-20842.9.patch
>
>
> HIVE-20660 introduced better estimation for the group by operator, but the 
> logic did not account for partial and full group by separately.
> For a partial group by, parallelism (i.e. the number of tasks) should be taken 
> into account.
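
A small sketch (not Hive's actual estimator) of the distinction being described, under the assumption that a full group by can emit at most NDV rows overall while a partial, map-side group by can emit up to NDV rows per task:

{code:java}
// Illustrative caps on group-by output cardinality.
public class GroupByEstimateSketch {

  // Full group by: output is bounded by the number of distinct keys (NDV).
  static long fullGroupByMaxRows(long inputRows, long ndv) {
    return Math.min(inputRows, ndv);
  }

  // Partial group by: each parallel task may emit up to NDV rows, so the
  // number of tasks has to be taken into account.
  static long partialGroupByMaxRows(long inputRows, long ndv, int numTasks) {
    return Math.min(inputRows, ndv * (long) numTasks);
  }

  public static void main(String[] args) {
    // Example: 10M input rows, 1,000 distinct keys, 50 parallel tasks.
    System.out.println(fullGroupByMaxRows(10_000_000L, 1_000L));        // 1000
    System.out.println(partialGroupByMaxRows(10_000_000L, 1_000L, 50)); // 50000
  }
}
{code}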



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-13 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20842:
---
Status: Patch Available  (was: Open)

> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.10.patch, 
> HIVE-20842.11.patch, HIVE-20842.2.patch, HIVE-20842.3.patch, 
> HIVE-20842.4.patch, HIVE-20842.5.patch, HIVE-20842.6.patch, 
> HIVE-20842.7.patch, HIVE-20842.8.patch, HIVE-20842.9.patch
>
>
> HIVE-20660 introduced better estimation for the group by operator, but the 
> logic did not account for partial and full group by separately.
> For a partial group by, parallelism (i.e. the number of tasks) should be taken 
> into account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686080#comment-16686080
 ] 

Hive QA commented on HIVE-20794:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12948062/HIVE-20794.01

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 15584 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestActivePassiveHA.testActivePassiveHA (batchId=258)
org.apache.hive.jdbc.TestActivePassiveHA.testClientConnectionsOnFailover 
(batchId=258)
org.apache.hive.jdbc.TestActivePassiveHA.testConnectionActivePassiveHAServiceDiscovery
 (batchId=258)
org.apache.hive.jdbc.TestActivePassiveHA.testManualFailover (batchId=258)
org.apache.hive.jdbc.TestActivePassiveHA.testManualFailoverUnauthorized 
(batchId=258)
org.apache.hive.jdbc.TestActivePassiveHA.testNoConnectionOnPassive (batchId=258)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14920/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14920/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14920/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12948062 - PreCommit-HIVE-Build

> Use Zookeeper for metastore service discovery
> -
>
> Key: HIVE-20794
> URL: https://issues.apache.org/jira/browse/HIVE-20794
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20794.01
>
>
> Right now, multiple metastore services can be specified in the 
> hive.metastore.uris configuration, but that list is static and cannot be 
> modified dynamically. Use Zookeeper for dynamic service discovery of the 
> metastore.
> h3. Improve ZooKeeperHiveHelper class (suggestions for name welcome)
> The Zookeeper related code (for service discovery) accesses Zookeeper 
> parameters directly from HiveConf. The class is changed so that it could be 
> used for both HiveServer2 and Metastore server and works with both the 
> configurations. Following methods from HiveServer2 are now moved into 
> ZooKeeperHiveHelper. # startZookeeperClient # addServerInstanceToZooKeeper # 
> removeServerInstanceFromZooKeeper
> h3. HiveMetaStore conf changes
>  # THRIFT_URIS (hive.metastore.uris) can also be used to specify ZooKeeper 
> quorum. When THRIFT_SERVICE_DISCOVERY_MODE 
> (hive.metastore.service.discovery.mode) is set to "zookeeper" the URIs are 
> used as ZooKeeper quorum. When it's set to be empty, the URIs are used to 
> locate the metastore directly.
>  # Here's a list of HiveServer2's parameters and their proposed metastore conf 
> counterparts. It looks odd that the metastore-related configurations do not 
> have their macros start with METASTORE, but start with THRIFT. I have just 
> followed the naming convention used for other parameters.
> #* HIVE_SERVER2_ZOOKEEPER_NAMESPACE - THRIFT_ZOOKEEPER_NAMESPACE 
> (hive.metastore.zookeeper.namespace)
> #* HIVE_ZOOKEEPER_CLIENT_PORT - THRIFT_ZOOKEEPER_CLIENT_PORT 
> (hive.metastore.zookeeper.client.port)
> #* HIVE_ZOOKEEPER_CONNECTION_TIMEOUT - THRIFT_ZOOKEEPER_CONNECTION_TIMEOUT - 
> (hive.metastore.zookeeper.connection.timeout)
> #* HIVE_ZOOKEEPER_CONNECTION_MAX_RETRIES - 
> THRIFT_ZOOKEEPER_CONNECTION_MAX_RETRIES 
> (hive.metastore.zookeeper.connection.max.retries)
> #* HIVE_ZOOKEEPER_CONNECTION_BASESLEEPTIME - 
> THRIFT_ZOOKEEPER_CONNECTION_BASESLEEPTIME 
> (hive.metastore.zookeeper.connection.basesleeptime)
> Following Hive ZK configurations seem to be related to managing locks and 
> seem irrelevant for MS ZK.
> # HIVE_ZOOKEEPER_SESSION_TIMEOUT
> # HIVE_ZOOKEEPER_CLEAN_EXTRA_NODES
> Since there is no configuration to be published, 
> HIVE_ZOOKEEPER_PUBLISH_CONFIGS does not have a THRIFT counterpart.
> h3. HiveMetaStore class changes
> # startMetaStore should also register the instance with Zookeeper, when 
> configured.
> # When shutting a metastore server down it should deregister itself from 
> Zookeeper, when configured.
> # These changes use the refactored code described above.
> h3. HiveMetaStoreClient class changes
> When the service discovery mode is zookeeper, we fetch the metastore URIs from 
> the specified ZooKeeper and treat those as if they were specified in 
> THRIFT_URIS, i.e. use the existing mechanisms to choose a metastore server to 
> connect to and establish a connection.

[jira] [Commented] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686074#comment-16686074
 ] 

Hive QA commented on HIVE-20794:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
50s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
30s{color} | {color:blue} common in master has 65 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
14s{color} | {color:blue} standalone-metastore/metastore-common in master has 
29 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
35s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
43s{color} | {color:blue} itests/util in master has 48 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
45s{color} | {color:blue} ql in master has 2317 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
36s{color} | {color:blue} service in master has 48 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
59s{color} | {color:blue} standalone-metastore/metastore-server in master has 
185 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 
21s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 
33s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
15s{color} | {color:red} common: The patch generated 16 new + 426 unchanged - 0 
fixed = 442 total (was 426) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
12s{color} | {color:red} itests/util: The patch generated 1 new + 17 unchanged 
- 0 fixed = 18 total (was 17) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
36s{color} | {color:red} ql: The patch generated 2 new + 17 unchanged - 4 fixed 
= 19 total (was 21) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
12s{color} | {color:red} service: The patch generated 3 new + 35 unchanged - 0 
fixed = 38 total (was 35) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
43s{color} | {color:red} service generated 1 new + 48 unchanged - 0 fixed = 49 
total (was 48) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 
13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 56m 26s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:service |
|  |  Inconsistent synchronization of 
org.apache.hive.service.server.HiveServer2.zooKeeperHelper; locked 66% of time  
Unsynchronized access at HiveServer2.java:6

[jira] [Updated] (HIVE-20822) Improvements to push computation to JDBC from Calcite

2018-11-13 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20822:
---
Attachment: HIVE-20822.03.patch

> Improvements to push computation to JDBC from Calcite
> -
>
> Key: HIVE-20822
> URL: https://issues.apache.org/jira/browse/HIVE-20822
> Project: Hive
>  Issue Type: Improvement
>  Components: StorageHandler
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20822.01.patch, HIVE-20822.02.patch, 
> HIVE-20822.02.patch, HIVE-20822.03.patch, HIVE-20822.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20910:
---
Status: Patch Available  (was: Open)

Another attempt to get a clean run.

> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20910.1.patch, HIVE-20910.2.patch, 
> HIVE-20910.3.patch
>
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:308)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSimpleEvent(OrcRecordUpdater.java:423)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSplitUpdateEvent(OrcRecordUpdater.java:431)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.insert(OrcRecordUpdater.java:483)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:995)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:490)
>   ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20910:
---
Attachment: HIVE-20910.3.patch

> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20910.1.patch, HIVE-20910.2.patch, 
> HIVE-20910.3.patch
>
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:308)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSimpleEvent(OrcRecordUpdater.java:423)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSplitUpdateEvent(OrcRecordUpdater.java:431)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.insert(OrcRecordUpdater.java:483)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:995)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:490)
>   ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20910:
---
Status: Open  (was: Patch Available)

> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20910.1.patch, HIVE-20910.2.patch
>
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:308)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSimpleEvent(OrcRecordUpdater.java:423)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSplitUpdateEvent(OrcRecordUpdater.java:431)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.insert(OrcRecordUpdater.java:483)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:995)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:490)
>   ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-13 Thread Ashutosh Bapat (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-20794:
--
Attachment: HIVE-20794.01
Status: Patch Available  (was: Open)

> Use Zookeeper for metastore service discovery
> -
>
> Key: HIVE-20794
> URL: https://issues.apache.org/jira/browse/HIVE-20794
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20794.01
>
>
> Right now, multiple metastore services can be specified in 
> hive.metastore.uris configuration, but that list is static and can not be 
> modified dynamically. Use Zookeeper for dynamic service discovery of 
> metastore.
> h3. Improve ZooKeeperHiveHelper class (suggestions for name welcome)
> The Zookeeper related code (for service discovery) accesses Zookeeper 
> parameters directly from HiveConf. The class is changed so that it could be 
> used for both HiveServer2 and Metastore server and works with both the 
> configurations. Following methods from HiveServer2 are now moved into 
> ZooKeeperHiveHelper. # startZookeeperClient # addServerInstanceToZooKeeper # 
> removeServerInstanceFromZooKeeper
> h3. HiveMetaStore conf changes
>  # THRIFT_URIS (hive.metastore.uris) can also be used to specify ZooKeeper 
> quorum. When THRIFT_SERVICE_DISCOVERY_MODE 
> (hive.metastore.service.discovery.mode) is set to "zookeeper" the URIs are 
> used as ZooKeeper quorum. When it's set to be empty, the URIs are used to 
> locate the metastore directly.
>  # Here's list of Hiveserver2's parameters and their proposed metastore conf 
> counterparts. It looks odd that the Metastore related configurations do not 
> have their macros start with METASTORE, but start with THRIFT. I have just 
> followed naming convention used for other parameters.
> #* HIVE_SERVER2_ZOOKEEPER_NAMESPACE - THRIFT_ZOOKEEPER_NAMESPACE 
> (hive.metastore.zookeeper.namespace)
> #* HIVE_ZOOKEEPER_CLIENT_PORT - THRIFT_ZOOKEEPER_CLIENT_PORT 
> (hive.metastore.zookeeper.client.port)
> #* HIVE_ZOOKEEPER_CONNECTION_TIMEOUT - THRIFT_ZOOKEEPER_CONNECTION_TIMEOUT - 
> (hive.metastore.zookeeper.connection.timeout)
> #* HIVE_ZOOKEEPER_CONNECTION_MAX_RETRIES - 
> THRIFT_ZOOKEEPER_CONNECTION_MAX_RETRIES 
> (hive.metastore.zookeeper.connection.max.retries)
> #* HIVE_ZOOKEEPER_CONNECTION_BASESLEEPTIME - 
> THRIFT_ZOOKEEPER_CONNECTION_BASESLEEPTIME 
> (hive.metastore.zookeeper.connection.basesleeptime)
> Following Hive ZK configurations seem to be related to managing locks and 
> seem irrelevant for MS ZK.
> # HIVE_ZOOKEEPER_SESSION_TIMEOUT
> # HIVE_ZOOKEEPER_CLEAN_EXTRA_NODES
> Since there is no configuration to be published, 
> HIVE_ZOOKEEPER_PUBLISH_CONFIGS does not have a THRIFT counterpart.
> h3. HiveMetaStore class changes
> # startMetaStore should also register the instance with Zookeeper, when 
> configured.
> # When shutting a metastore server down it should deregister itself from 
> Zookeeper, when configured.
> # These changes use the refactored code described above.
> h3. HiveMetaStoreClient class changes
> When service discovery mode is zookeeper, we fetch the metastore URIs from 
> the specified ZooKeeper and treat those as if they were specified in 
> THRIFT_URIS i.e. use the existing mechanisms to choose a metastore server to 
> connect to and establish a connection.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-20794:
--
Labels: pull-request-available  (was: )

> Use Zookeeper for metastore service discovery
> -
>
> Key: HIVE-20794
> URL: https://issues.apache.org/jira/browse/HIVE-20794
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20794.01
>
>
> Right now, multiple metastore services can be specified in 
> hive.metastore.uris configuration, but that list is static and can not be 
> modified dynamically. Use Zookeeper for dynamic service discovery of 
> metastore.
> h3. Improve ZooKeeperHiveHelper class (suggestions for name welcome)
> The Zookeeper related code (for service discovery) accesses Zookeeper 
> parameters directly from HiveConf. The class is changed so that it could be 
> used for both HiveServer2 and Metastore server and works with both the 
> configurations. Following methods from HiveServer2 are now moved into 
> ZooKeeperHiveHelper. # startZookeeperClient # addServerInstanceToZooKeeper # 
> removeServerInstanceFromZooKeeper
> h3. HiveMetaStore conf changes
>  # THRIFT_URIS (hive.metastore.uris) can also be used to specify ZooKeeper 
> quorum. When THRIFT_SERVICE_DISCOVERY_MODE 
> (hive.metastore.service.discovery.mode) is set to "zookeeper" the URIs are 
> used as ZooKeeper quorum. When it's set to be empty, the URIs are used to 
> locate the metastore directly.
>  # Here's list of Hiveserver2's parameters and their proposed metastore conf 
> counterparts. It looks odd that the Metastore related configurations do not 
> have their macros start with METASTORE, but start with THRIFT. I have just 
> followed naming convention used for other parameters.
> #* HIVE_SERVER2_ZOOKEEPER_NAMESPACE - THRIFT_ZOOKEEPER_NAMESPACE 
> (hive.metastore.zookeeper.namespace)
> #* HIVE_ZOOKEEPER_CLIENT_PORT - THRIFT_ZOOKEEPER_CLIENT_PORT 
> (hive.metastore.zookeeper.client.port)
> #* HIVE_ZOOKEEPER_CONNECTION_TIMEOUT - THRIFT_ZOOKEEPER_CONNECTION_TIMEOUT - 
> (hive.metastore.zookeeper.connection.timeout)
> #* HIVE_ZOOKEEPER_CONNECTION_MAX_RETRIES - 
> THRIFT_ZOOKEEPER_CONNECTION_MAX_RETRIES 
> (hive.metastore.zookeeper.connection.max.retries)
> #* HIVE_ZOOKEEPER_CONNECTION_BASESLEEPTIME - 
> THRIFT_ZOOKEEPER_CONNECTION_BASESLEEPTIME 
> (hive.metastore.zookeeper.connection.basesleeptime)
> Following Hive ZK configurations seem to be related to managing locks and 
> seem irrelevant for MS ZK.
> # HIVE_ZOOKEEPER_SESSION_TIMEOUT
> # HIVE_ZOOKEEPER_CLEAN_EXTRA_NODES
> Since there is no configuration to be published, 
> HIVE_ZOOKEEPER_PUBLISH_CONFIGS does not have a THRIFT counterpart.
> h3. HiveMetaStore class changes
> # startMetaStore should also register the instance with Zookeeper, when 
> configured.
> # When shutting a metastore server down it should deregister itself from 
> Zookeeper, when configured.
> # These changes use the refactored code described above.
> h3. HiveMetaStoreClient class changes
> When service discovery mode is zookeeper, we fetch the metastore URIs from 
> the specified ZooKeeper and treat those as if they were specified in 
> THRIFT_URIS i.e. use the existing mechanisms to choose a metastore server to 
> connect to and establish a connection.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686032#comment-16686032
 ] 

ASF GitHub Bot commented on HIVE-20794:
---

GitHub user ashutosh-bapat opened a pull request:

https://github.com/apache/hive/pull/487

Hive20794

Find more details about the changes in HIVE-20794.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ashutosh-bapat/hive hive20794

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/487.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #487


commit 383e8be33934d078bad2e8fe1233cc0f3c6119ed
Author: Ashutosh Bapat 
Date:   2018-10-26T08:22:04Z

HIVE-20794: Use Zookeeper for dynamic service discovery of metastore.

The patch also adds new ZooKeeper configurations for metastore. We reuse 
THRIFT_URIs to specify
ZooKeeper quorum and have another configuration by name 
THRIFT_SERVICE_DISCOVERY_MODE to specify
what method to use for dynamic service discovery.

Ashutosh Bapat

commit a38e2e8c9fdc85cd809a1aac9d16ed1d204117bb
Author: Ashutosh Bapat 
Date:   2018-11-13T09:05:03Z

HIVE-20794: Refactor existing code for supporting metastore dynamic 
discovery using Zookeeper

Extract the code in HiveServer2 dealing with ZooKeeper into a 
ZooKeeperHiveHelper class so that
it can be used by MetaStore server as well. This also moves the 
ZooKeeperHiveHelper.java into a
location common to both HiveServer2 and MetaStore code.

Ashutosh Bapat




> Use Zookeeper for metastore service discovery
> -
>
> Key: HIVE-20794
> URL: https://issues.apache.org/jira/browse/HIVE-20794
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20794.01
>
>
> Right now, multiple metastore services can be specified in 
> hive.metastore.uris configuration, but that list is static and can not be 
> modified dynamically. Use Zookeeper for dynamic service discovery of 
> metastore.
> h3. Improve ZooKeeperHiveHelper class (suggestions for name welcome)
> The Zookeeper related code (for service discovery) accesses Zookeeper 
> parameters directly from HiveConf. The class is changed so that it could be 
> used for both HiveServer2 and Metastore server and works with both the 
> configurations. Following methods from HiveServer2 are now moved into 
> ZooKeeperHiveHelper. # startZookeeperClient # addServerInstanceToZooKeeper # 
> removeServerInstanceFromZooKeeper
> h3. HiveMetaStore conf changes
>  # THRIFT_URIS (hive.metastore.uris) can also be used to specify ZooKeeper 
> quorum. When THRIFT_SERVICE_DISCOVERY_MODE 
> (hive.metastore.service.discovery.mode) is set to "zookeeper" the URIs are 
> used as ZooKeeper quorum. When it's set to be empty, the URIs are used to 
> locate the metastore directly.
>  # Here's list of Hiveserver2's parameters and their proposed metastore conf 
> counterparts. It looks odd that the Metastore related configurations do not 
> have their macros start with METASTORE, but start with THRIFT. I have just 
> followed naming convention used for other parameters.
> #* HIVE_SERVER2_ZOOKEEPER_NAMESPACE - THRIFT_ZOOKEEPER_NAMESPACE 
> (hive.metastore.zookeeper.namespace)
> #* HIVE_ZOOKEEPER_CLIENT_PORT - THRIFT_ZOOKEEPER_CLIENT_PORT 
> (hive.metastore.zookeeper.client.port)
> #* HIVE_ZOOKEEPER_CONNECTION_TIMEOUT - THRIFT_ZOOKEEPER_CONNECTION_TIMEOUT - 
> (hive.metastore.zookeeper.connection.timeout)
> #* HIVE_ZOOKEEPER_CONNECTION_MAX_RETRIES - 
> THRIFT_ZOOKEEPER_CONNECTION_MAX_RETRIES 
> (hive.metastore.zookeeper.connection.max.retries)
> #* HIVE_ZOOKEEPER_CONNECTION_BASESLEEPTIME - 
> THRIFT_ZOOKEEPER_CONNECTION_BASESLEEPTIME 
> (hive.metastore.zookeeper.connection.basesleeptime)
> Following Hive ZK configurations seem to be related to managing locks and 
> seem irrelevant for MS ZK.
> # HIVE_ZOOKEEPER_SESSION_TIMEOUT
> # HIVE_ZOOKEEPER_CLEAN_EXTRA_NODES
> Since there is no configuration to be published, 
> HIVE_ZOOKEEPER_PUBLISH_CONFIGS does not have a THRIFT counterpart.
> h3. HiveMetaStore class changes
> # startMetaStore should also register the instance with Zookeeper, when 
> configured.
> # When shutting a metastore server down it should deregister itself from 
> Zookeeper, when configured.
> # These changes use the refactored code described above.
> h3. HiveMetaStoreClient class changes
> When service discovery mode is zookeeper, we fetch the metastore URIs from 
> the specified ZooKeeper and treat those as if they were specified in 
> THRIFT_URIS i.e. use the existing mechanisms to choose a metastore server to 
> connect to and establish a connection.

[jira] [Updated] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-13 Thread Ashutosh Bapat (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-20794:
--
Attachment: (was: HIVE-20794.01)

> Use Zookeeper for metastore service discovery
> -
>
> Key: HIVE-20794
> URL: https://issues.apache.org/jira/browse/HIVE-20794
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>
> Right now, multiple metastore services can be specified in 
> hive.metastore.uris configuration, but that list is static and can not be 
> modified dynamically. Use Zookeeper for dynamic service discovery of 
> metastore.
> h3. Improve ZooKeeperHiveHelper class (suggestions for name welcome)
> The Zookeeper related code (for service discovery) accesses Zookeeper 
> parameters directly from HiveConf. The class is changed so that it could be 
> used for both HiveServer2 and Metastore server and works with both the 
> configurations. Following methods from HiveServer2 are now moved into 
> ZooKeeperHiveHelper. # startZookeeperClient # addServerInstanceToZooKeeper # 
> removeServerInstanceFromZooKeeper
> h3. HiveMetaStore conf changes
>  # THRIFT_URIS (hive.metastore.uris) can also be used to specify ZooKeeper 
> quorum. When THRIFT_SERVICE_DISCOVERY_MODE 
> (hive.metastore.service.discovery.mode) is set to "zookeeper" the URIs are 
> used as ZooKeeper quorum. When it's set to be empty, the URIs are used to 
> locate the metastore directly.
>  # Here's list of Hiveserver2's parameters and their proposed metastore conf 
> counterparts. It looks odd that the Metastore related configurations do not 
> have their macros start with METASTORE, but start with THRIFT. I have just 
> followed naming convention used for other parameters.
> #* HIVE_SERVER2_ZOOKEEPER_NAMESPACE - THRIFT_ZOOKEEPER_NAMESPACE 
> (hive.metastore.zookeeper.namespace)
> #* HIVE_ZOOKEEPER_CLIENT_PORT - THRIFT_ZOOKEEPER_CLIENT_PORT 
> (hive.metastore.zookeeper.client.port)
> #* HIVE_ZOOKEEPER_CONNECTION_TIMEOUT - THRIFT_ZOOKEEPER_CONNECTION_TIMEOUT - 
> (hive.metastore.zookeeper.connection.timeout)
> #* HIVE_ZOOKEEPER_CONNECTION_MAX_RETRIES - 
> THRIFT_ZOOKEEPER_CONNECTION_MAX_RETRIES 
> (hive.metastore.zookeeper.connection.max.retries)
> #* HIVE_ZOOKEEPER_CONNECTION_BASESLEEPTIME - 
> THRIFT_ZOOKEEPER_CONNECTION_BASESLEEPTIME 
> (hive.metastore.zookeeper.connection.basesleeptime)
> Following Hive ZK configurations seem to be related to managing locks and 
> seem irrelevant for MS ZK.
> # HIVE_ZOOKEEPER_SESSION_TIMEOUT
> # HIVE_ZOOKEEPER_CLEAN_EXTRA_NODES
> Since there is no configuration to be published, 
> HIVE_ZOOKEEPER_PUBLISH_CONFIGS does not have a THRIFT counterpart.
> h3. HiveMetaStore class changes
> # startMetaStore should also register the instance with Zookeeper, when 
> configured.
> # When shutting a metastore server down it should deregister itself from 
> Zookeeper, when configured.
> # These changes use the refactored code described above.
> h3. HiveMetaStoreClient class changes
> When service discovery mode is zookeeper, we fetch the metastore URIs from 
> the specified ZooKeeper and treat those as if they were specified in 
> THRIFT_URIS i.e. use the existing mechanisms to choose a metastore server to 
> connect to and establish a connection.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-13 Thread Ashutosh Bapat (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-20794:
--
Attachment: HIVE-20794.01

> Use Zookeeper for metastore service discovery
> -
>
> Key: HIVE-20794
> URL: https://issues.apache.org/jira/browse/HIVE-20794
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
> Attachments: HIVE-20794.01
>
>
> Right now, multiple metastore services can be specified in 
> hive.metastore.uris configuration, but that list is static and can not be 
> modified dynamically. Use Zookeeper for dynamic service discovery of 
> metastore.
> h3. Improve ZooKeeperHiveHelper class (suggestions for name welcome)
> The Zookeeper related code (for service discovery) accesses Zookeeper 
> parameters directly from HiveConf. The class is changed so that it could be 
> used for both HiveServer2 and Metastore server and works with both the 
> configurations. Following methods from HiveServer2 are now moved into 
> ZooKeeperHiveHelper. # startZookeeperClient # addServerInstanceToZooKeeper # 
> removeServerInstanceFromZooKeeper
> h3. HiveMetaStore conf changes
>  # THRIFT_URIS (hive.metastore.uris) can also be used to specify ZooKeeper 
> quorum. When THRIFT_SERVICE_DISCOVERY_MODE 
> (hive.metastore.service.discovery.mode) is set to "zookeeper" the URIs are 
> used as ZooKeeper quorum. When it's set to be empty, the URIs are used to 
> locate the metastore directly.
>  # Here's list of Hiveserver2's parameters and their proposed metastore conf 
> counterparts. It looks odd that the Metastore related configurations do not 
> have their macros start with METASTORE, but start with THRIFT. I have just 
> followed naming convention used for other parameters.
> #* HIVE_SERVER2_ZOOKEEPER_NAMESPACE - THRIFT_ZOOKEEPER_NAMESPACE 
> (hive.metastore.zookeeper.namespace)
> #* HIVE_ZOOKEEPER_CLIENT_PORT - THRIFT_ZOOKEEPER_CLIENT_PORT 
> (hive.metastore.zookeeper.client.port)
> #* HIVE_ZOOKEEPER_CONNECTION_TIMEOUT - THRIFT_ZOOKEEPER_CONNECTION_TIMEOUT - 
> (hive.metastore.zookeeper.connection.timeout)
> #* HIVE_ZOOKEEPER_CONNECTION_MAX_RETRIES - 
> THRIFT_ZOOKEEPER_CONNECTION_MAX_RETRIES 
> (hive.metastore.zookeeper.connection.max.retries)
> #* HIVE_ZOOKEEPER_CONNECTION_BASESLEEPTIME - 
> THRIFT_ZOOKEEPER_CONNECTION_BASESLEEPTIME 
> (hive.metastore.zookeeper.connection.basesleeptime)
> Following Hive ZK configurations seem to be related to managing locks and 
> seem irrelevant for MS ZK.
> # HIVE_ZOOKEEPER_SESSION_TIMEOUT
> # HIVE_ZOOKEEPER_CLEAN_EXTRA_NODES
> Since there is no configuration to be published, 
> HIVE_ZOOKEEPER_PUBLISH_CONFIGS does not have a THRIFT counterpart.
> h3. HiveMetaStore class changes
> # startMetaStore should also register the instance with Zookeeper, when 
> configured.
> # When shutting a metastore server down it should deregister itself from 
> Zookeeper, when configured.
> # These changes use the refactored code described above.
> h3. HiveMetaStoreClient class changes
> When service discovery mode is zookeeper, we fetch the metastore URIs from 
> the specified ZooKeeper and treat those as if they were specified in 
> THRIFT_URIS i.e. use the existing mechanisms to choose a metastore server to 
> connect to and establish a connection.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-13 Thread Ashutosh Bapat (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-20794:
--
Description: 
Right now, multiple metastore services can be specified in the hive.metastore.uris 
configuration, but that list is static and cannot be modified dynamically. Use 
ZooKeeper for dynamic service discovery of the metastore.
h3. Improve ZooKeeperHiveHelper class (suggestions for a name welcome)

The ZooKeeper-related code (for service discovery) accesses ZooKeeper 
parameters directly from HiveConf. The class is changed so that it can be 
used for both HiveServer2 and the Metastore server and works with both 
configurations. The following methods from HiveServer2 are now moved into 
ZooKeeperHiveHelper:
# startZookeeperClient
# addServerInstanceToZooKeeper
# removeServerInstanceFromZooKeeper
h3. HiveMetaStore conf changes
# THRIFT_URIS (hive.metastore.uris) can also be used to specify a ZooKeeper 
quorum. When THRIFT_SERVICE_DISCOVERY_MODE 
(hive.metastore.service.discovery.mode) is set to "zookeeper", the URIs are used 
as the ZooKeeper quorum. When it is left empty, the URIs are used to locate the 
metastore directly.
# Here is the list of HiveServer2's parameters and their proposed metastore conf 
counterparts. It looks odd that the metastore-related configurations do not 
have their macros start with METASTORE, but with THRIFT; I have just followed 
the naming convention used for the other parameters.
#* HIVE_SERVER2_ZOOKEEPER_NAMESPACE - THRIFT_ZOOKEEPER_NAMESPACE 
(hive.metastore.zookeeper.namespace)
#* HIVE_ZOOKEEPER_CLIENT_PORT - THRIFT_ZOOKEEPER_CLIENT_PORT 
(hive.metastore.zookeeper.client.port)
#* HIVE_ZOOKEEPER_CONNECTION_TIMEOUT - THRIFT_ZOOKEEPER_CONNECTION_TIMEOUT 
(hive.metastore.zookeeper.connection.timeout)
#* HIVE_ZOOKEEPER_CONNECTION_MAX_RETRIES - 
THRIFT_ZOOKEEPER_CONNECTION_MAX_RETRIES 
(hive.metastore.zookeeper.connection.max.retries)
#* HIVE_ZOOKEEPER_CONNECTION_BASESLEEPTIME - 
THRIFT_ZOOKEEPER_CONNECTION_BASESLEEPTIME 
(hive.metastore.zookeeper.connection.basesleeptime)

The following Hive ZK configurations seem to be related to managing locks and 
seem irrelevant for the metastore's ZK usage:
# HIVE_ZOOKEEPER_SESSION_TIMEOUT
# HIVE_ZOOKEEPER_CLEAN_EXTRA_NODES

Since there is no configuration to be published, HIVE_ZOOKEEPER_PUBLISH_CONFIGS 
does not have a THRIFT counterpart.
h3. HiveMetaStore class changes
# startMetaStore should also register the instance with ZooKeeper, when 
configured.
# When shutting a metastore server down, it should deregister itself from 
ZooKeeper, when configured.
# These changes use the refactored code described above.

h3. HiveMetaStoreClient class changes
When the service discovery mode is zookeeper, we fetch the metastore URIs from the 
specified ZooKeeper and treat those as if they were specified in THRIFT_URIS, 
i.e. we use the existing mechanisms to choose a metastore server to connect to and 
establish a connection.

  was:Right now, multiple metastore services can be specified in 
hive.metastore.uris configuration, but that list is static and can not be 
modified dynamically. Use Zookeeper for dynamic service discovery of metastore.
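
For reference, here is a minimal sketch of what the proposed metastore-side settings 
could look like in hive-site.xml, following the property-name mapping listed in the 
description above; the quorum hosts and the namespace value are placeholders, not 
values taken from the patch:

{noformat}
<!-- With discovery mode "zookeeper", the URIs below are read as the ZooKeeper quorum. -->
<property>
  <name>hive.metastore.service.discovery.mode</name>
  <value>zookeeper</value>
</property>
<property>
  <name>hive.metastore.uris</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
</property>
<property>
  <name>hive.metastore.zookeeper.namespace</name>
  <value>hivemetastore</value>
</property>
{noformat}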


> Use Zookeeper for metastore service discovery
> -
>
> Key: HIVE-20794
> URL: https://issues.apache.org/jira/browse/HIVE-20794
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>
> Right now, multiple metastore services can be specified in 
> hive.metastore.uris configuration, but that list is static and can not be 
> modified dynamically. Use Zookeeper for dynamic service discovery of 
> metastore.
> h3. Improve ZooKeeperHiveHelper class (suggestions for name welcome)
> The Zookeeper related code (for service discovery) accesses Zookeeper 
> parameters directly from HiveConf. The class is changed so that it could be 
> used for both HiveServer2 and Metastore server and works with both the 
> configurations. Following methods from HiveServer2 are now moved into 
> ZooKeeperHiveHelper. # startZookeeperClient # addServerInstanceToZooKeeper # 
> removeServerInstanceFromZooKeeper
> h3. HiveMetaStore conf changes
>  # THRIFT_URIS (hive.metastore.uris) can also be used to specify ZooKeeper 
> quorum. When THRIFT_SERVICE_DISCOVERY_MODE 
> (hive.metastore.service.discovery.mode) is set to "zookeeper" the URIs are 
> used as ZooKeeper quorum. When it's set to be empty, the URIs are used to 
> locate the metastore directly.
>  # Here's list of Hiveserver2's parameters and their proposed metastore conf 
> counterparts. It looks odd that the Metastore related configurations do not 
> have their macros start with METASTORE, but start with THRIFT. I have just 
> followed naming convention used for other parameters.
> #* HIVE_SERVER2_ZOOKEEPER_NAMESPACE - THRIFT_ZOOKEEPER_N

[jira] [Commented] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685999#comment-16685999
 ] 

Hive QA commented on HIVE-20910:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12948040/HIVE-20910.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15539 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce] 
(batchId=61)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14919/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14919/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14919/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12948040 - PreCommit-HIVE-Build

> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20910.1.patch, HIVE-20910.2.patch
>
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:308)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSimpleEvent(OrcRecordUpdater.java:423)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSplitUpdateEvent(OrcRecordUpdater.java:431)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.insert(OrcRecordUpdater.java:483)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:995)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:490)
>   ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-11-13 Thread Sahil Takiar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-20512:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master.

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch, 
> HIVE-20512.6.patch, HIVE-20512.7.patch, HIVE-20512.8.patch, 
> HIVE-20512.9.patch, HIVE-20512.91.patch, HIVE-20512.92.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
>   // A very simple counter to keep track of number of rows processed by the
>   // reducer. It dumps
>   // every 1 million times, and quickly before that
>   if (currentThreshold >= 1000000) {
>     return currentThreshold + 1000000;
>   }
>   return 10 * currentThreshold;
> }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.
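
A minimal, self-contained sketch of the interval-based alternative described above. 
It is illustrative only: the class and method names are made up here, and the 
attached patches may implement the idea differently (for example, via a scheduled 
background thread).

{code:java}
import java.util.concurrent.TimeUnit;

/** Logs the processed-record count and used memory at most once per time interval. */
public class IntervalRecordLogger {

  private final long intervalNanos;
  private long lastLogNanos = System.nanoTime();
  private long processedRows = 0L;

  public IntervalRecordLogger(long interval, TimeUnit unit) {
    this.intervalNanos = unit.toNanos(interval);
  }

  /** Call once per processed record; emits at most one log line per interval. */
  public void recordProcessed() {
    processedRows++;
    long now = System.nanoTime();
    if (now - lastLogNanos >= intervalNanos) {
      long usedBytes = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
      System.out.println("processed " + processedRows + " rows, used memory = " + usedBytes + " bytes");
      lastLogNanos = now;
    }
  }

  // Tiny usage example: log at most every 5 seconds while processing rows.
  public static void main(String[] args) {
    IntervalRecordLogger logger = new IntervalRecordLogger(5, TimeUnit.SECONDS);
    for (long i = 0; i < 100_000_000L; i++) {
      logger.recordProcessed();
    }
  }
}
{code}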



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685970#comment-16685970
 ] 

Hive QA commented on HIVE-20910:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
44s{color} | {color:blue} ql in master has 2316 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14919/dev-support/hive-personality.sh
 |
| git revision | master / ccbc5c3 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14919/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20910.1.patch, HIVE-20910.2.patch
>
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.

[jira] [Commented] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685948#comment-16685948
 ] 

Hive QA commented on HIVE-20842:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12948034/HIVE-20842.11.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15539 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_memcheck] 
(batchId=45)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14918/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14918/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14918/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12948034 - PreCommit-HIVE-Build

> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.10.patch, 
> HIVE-20842.11.patch, HIVE-20842.2.patch, HIVE-20842.3.patch, 
> HIVE-20842.4.patch, HIVE-20842.5.patch, HIVE-20842.6.patch, 
> HIVE-20842.7.patch, HIVE-20842.8.patch, HIVE-20842.9.patch
>
>
> HIVE-20660 introduced better estimation for group by operator. But the logic 
> did not account for Partial and Full group by separately.
> For partial group by, parallelism (i.e. the number of tasks) should be taken 
> into account.
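
As a rough illustration of the distinction (not the exact formula from the patch), 
the two cases could be estimated along these lines:

{noformat}
Full (final) group by:       output rows ~ min(inputRows, NDV(grouping keys))
Partial (map-side) group by: output rows ~ min(inputRows, numTasks * NDV(grouping keys))
{noformat}

With a partial group by, each task can emit up to NDV distinct groups of its own, so 
ignoring parallelism underestimates the output size.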



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685922#comment-16685922
 ] 

Hive QA commented on HIVE-20842:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
43s{color} | {color:blue} ql in master has 2316 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} ql: The patch generated 0 new + 37 unchanged - 1 
fixed = 37 total (was 38) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14918/dev-support/hive-personality.sh
 |
| git revision | master / ccbc5c3 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14918/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.10.patch, 
> HIVE-20842.11.patch, HIVE-20842.2.patch, HIVE-20842.3.patch, 
> HIVE-20842.4.patch, HIVE-20842.5.patch, HIVE-20842.6.patch, 
> HIVE-20842.7.patch, HIVE-20842.8.patch, HIVE-20842.9.patch
>
>
> HIVE-20660 introduced better estimation for group by operator. But the logic 
> did not account for Partial and Full group by separately.
> For partial group by, parallelism (i.e. the number of tasks) should be taken 
> into account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20910:
---
Status: Patch Available  (was: Open)

> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20910.1.patch, HIVE-20910.2.patch
>
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:308)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSimpleEvent(OrcRecordUpdater.java:423)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSplitUpdateEvent(OrcRecordUpdater.java:431)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.insert(OrcRecordUpdater.java:483)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:995)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:490)
>   ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20910:
---
Attachment: HIVE-20910.2.patch

> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20910.1.patch, HIVE-20910.2.patch
>
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:308)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSimpleEvent(OrcRecordUpdater.java:423)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSplitUpdateEvent(OrcRecordUpdater.java:431)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.insert(OrcRecordUpdater.java:483)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:995)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:490)
>   ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20910:
---
Status: Open  (was: Patch Available)

> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20910.1.patch, HIVE-20910.2.patch
>
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:308)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSimpleEvent(OrcRecordUpdater.java:423)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSplitUpdateEvent(OrcRecordUpdater.java:431)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.insert(OrcRecordUpdater.java:483)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:995)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:490)
>   ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685902#comment-16685902
 ] 

Vineet Garg commented on HIVE-20910:


[~prasanth_j] Would you mind taking a look? 
The failure was due to the removal of the 'CAST' that was part of the SELECT after 
the RS: SDPO was removing the RS + SEL pair (introduced by bucketing) to introduce a 
new RS + SEL, and as a result the CAST ended up being removed.
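
Until the fix is in, one possible workaround is to disable the dynamic partition sort 
optimization for the affected insert, assuming the setting below is available in the 
build being used:

{code:sql}
-- Illustrative workaround only; re-enable the optimization once HIVE-20910 is fixed.
SET hive.optimize.sort.dynamic.partition=false;
-- then re-run the INSERT from the repro
{code}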

> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20910.1.patch
>
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:308)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSimpleEvent(OrcRecordUpdater.java:423)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSplitUpdateEvent(OrcRecordUpdater.java:431)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.insert(OrcRecordUpdater.java:483)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:995)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:490)
>   ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685899#comment-16685899
 ] 

Hive QA commented on HIVE-20910:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12948030/HIVE-20910.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15534 tests 
executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=195)

[druidmini_dynamic_partition.q,druidmini_test_ts.q,druidmini_expressions.q,druidmini_test_alter.q,druidmini_test_insert.q]
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14917/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14917/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14917/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12948030 - PreCommit-HIVE-Build

> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20910.1.patch
>
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:308)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSimpleEvent(OrcRecordUpdater.java:423)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSplitUpdateEvent(OrcRecordUpdater.java:431)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.insert(OrcRecordUpdater.java:483)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:995)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:490)
>   ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685862#comment-16685862
 ] 

Hive QA commented on HIVE-20910:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
48s{color} | {color:blue} ql in master has 2316 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m  3s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14917/dev-support/hive-personality.sh
 |
| git revision | master / ccbc5c3 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14917/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20910.1.patch
>
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:308)
>   at

[jira] [Updated] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-13 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20842:
---
Status: Patch Available  (was: Open)

> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.10.patch, 
> HIVE-20842.11.patch, HIVE-20842.2.patch, HIVE-20842.3.patch, 
> HIVE-20842.4.patch, HIVE-20842.5.patch, HIVE-20842.6.patch, 
> HIVE-20842.7.patch, HIVE-20842.8.patch, HIVE-20842.9.patch
>
>
> HIVE-20660 introduced better estimation for the group by operator, but the 
> logic did not account for partial and full group by separately.
> For partial group by, parallelism (i.e. the number of tasks) should be taken 
> into account.
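
A rough worked example of the distinction (illustrative numbers, not taken from the patch): with 1,000,000 input rows, NDV(k) = 10 and 100 map tasks, the map-side (partial) group by can emit up to 100 * 10 = 1,000 rows, since every mapper may see all 10 distinct keys, while the reduce-side (full) group by is still bounded by NDV(k) = 10. An estimate that ignores the task count therefore undercounts the partial aggregation output.
{code:sql}
-- Hypothetical table: the plan contains both a map-side (partial) and a
-- reduce-side (final) Group By Operator, whose row estimates should differ as above.
CREATE TABLE t (k int, v int);
EXPLAIN SELECT k, count(v) FROM t GROUP BY k;
{code}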



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-13 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20842:
---
Attachment: HIVE-20842.11.patch

> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.10.patch, 
> HIVE-20842.11.patch, HIVE-20842.2.patch, HIVE-20842.3.patch, 
> HIVE-20842.4.patch, HIVE-20842.5.patch, HIVE-20842.6.patch, 
> HIVE-20842.7.patch, HIVE-20842.8.patch, HIVE-20842.9.patch
>
>
> HIVE-20660 introduced better estimation for the group by operator, but the 
> logic did not account for partial and full group by separately.
> For partial group by, parallelism (i.e. the number of tasks) should be taken 
> into account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-13 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20842:
---
Status: Open  (was: Patch Available)

> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.10.patch, 
> HIVE-20842.11.patch, HIVE-20842.2.patch, HIVE-20842.3.patch, 
> HIVE-20842.4.patch, HIVE-20842.5.patch, HIVE-20842.6.patch, 
> HIVE-20842.7.patch, HIVE-20842.8.patch, HIVE-20842.9.patch
>
>
> HIVE-20660 introduced better estimation for the group by operator, but the 
> logic did not account for partial and full group by separately.
> For partial group by, parallelism (i.e. the number of tasks) should be taken 
> into account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20900) serde2.JsonSerDe no longer supports timestamp.formats

2018-11-13 Thread Jason Dere (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685830#comment-16685830
 ] 

Jason Dere commented on HIVE-20900:
---

Looks like TIMESTAMP WITH LOCAL TIME ZONE never implemented the 
timestamp.formats support:
{noformat}
Caused by: org.apache.hadoop.hive.serde2.SerDeException: struct field __time: 
Could not convert from string to map type timestamp with local time zone
at 
org.apache.hadoop.hive.serde2.json.HiveJsonStructReader.parseStruct(HiveJsonStructReader.java:203)
at 
org.apache.hadoop.hive.serde2.json.HiveJsonStructReader.parseDispatcher(HiveJsonStructReader.java:117)
at 
org.apache.hadoop.hive.serde2.json.HiveJsonStructReader.parseInternal(HiveJsonStructReader.java:100)
... 55 more
Caused by: java.io.IOException: Could not convert from string to map type 
timestamp with local time zone
at 
org.apache.hadoop.hive.serde2.json.HiveJsonStructReader.parsePrimitiveValue(HiveJsonStructReader.java:370)
at 
org.apache.hadoop.hive.serde2.json.HiveJsonStructReader.getObjectOfCorrespondingPrimitiveType(HiveJsonStructReader.java:322)
at 
org.apache.hadoop.hive.serde2.json.HiveJsonStructReader.parsePrimitive(HiveJsonStructReader.java:308)
at 
org.apache.hadoop.hive.serde2.json.HiveJsonStructReader.parseDispatcher(HiveJsonStructReader.java:113)
at 
org.apache.hadoop.hive.serde2.json.HiveJsonStructReader.parseStruct(HiveJsonStructReader.java:201)
... 57 more
{noformat}
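
For context, this is roughly how the property is used with a plain TIMESTAMP column (a sketch of the pre-HIVE-18545 behaviour; the exact pattern list is illustrative):
{code:sql}
-- Hypothetical table: "timestamp.formats" lists extra patterns the SerDe should try
-- when parsing the ts field, in addition to the default timestamp format.
CREATE TABLE json_ts_test (ts timestamp, msg string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.JsonSerDe'
WITH SERDEPROPERTIES ("timestamp.formats" = "millis,yyyy-MM-dd'T'HH:mm:ss");
{code}
The stack trace above suggests TIMESTAMP WITH LOCAL TIME ZONE columns fail before any such format list is consulted, so that type would need separate handling.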

> serde2.JsonSerDe no longer supports timestamp.formats
> -
>
> Key: HIVE-20900
> URL: https://issues.apache.org/jira/browse/HIVE-20900
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-20900.1.patch, HIVE-20900.2.patch
>
>
> Looks like HIVE-18545 broke this.
> Also json_serde_tsformat.q only tested the hcat version of JsonSerde, and the 
> format in that test used the ISO timestamp format which apparently is now 
> parsed by the default timestamp parsing, so the test was too simple.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20787) MapJoinBytesTableContainer dummyRow case doesn't handle reuse

2018-11-13 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-20787:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Committed to master; thanks for the review

> MapJoinBytesTableContainer dummyRow case doesn't handle reuse
> -
>
> Key: HIVE-20787
> URL: https://issues.apache.org/jira/browse/HIVE-20787
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20787.01.patch, HIVE-20787.02.patch, 
> HIVE-20787.patch
>
>
> Discovered while investigating a (probably) unrelated issue.
> MapJoinBytesTableContainer was not intended to be reused, but it looks like 
> some code might reuse it. If that happens, the dummyRow case will not work 
> correctly (dummyRow is cleared on first(), so another call to first() will 
> behave differently).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20910:
---
Attachment: HIVE-20910.1.patch

> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20910.1.patch
>
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:308)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSimpleEvent(OrcRecordUpdater.java:423)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSplitUpdateEvent(OrcRecordUpdater.java:431)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.insert(OrcRecordUpdater.java:483)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:995)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:490)
>   ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20910:
---
Status: Patch Available  (was: Open)

> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20910.1.patch
>
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:308)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSimpleEvent(OrcRecordUpdater.java:423)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSplitUpdateEvent(OrcRecordUpdater.java:431)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.insert(OrcRecordUpdater.java:483)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:995)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:490)
>   ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20910) Insert in bucketed table fails due to dynamic partition sort optimization

2018-11-13 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-20910:
--


> Insert in bucketed table fails due to dynamic partition sort optimization
> -
>
> Key: HIVE-20910
> URL: https://issues.apache.org/jira/browse/HIVE-20910
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> *Repro*
> {code:sql}
> CREATE TABLE tbl_lbodor_1 (ca_address_sk int, ca_address_id string, 
> ca_street_number string, ca_street_name string,
> ca_street_type string, ca_suite_number string, ca_city string, ca_county 
> string, ca_state string,
> ca_zip string, ca_country string, ca_gmt_offset decimal(5,2))
> PARTITIONED BY (ca_location_type string)
> CLUSTERED BY (ca_state) INTO 50 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> INSERT INTO TABLE tbl_lbodor_1 PARTITION (ca_location_type) VALUES (, 
> 'DLFB', '126',
> 'Highland Park', 'Court', 'Suite E', 'San Jose', 'King George 
> County', 'VA', '28003', 'United States',
> '-5', 'single family');
> {code}
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:184)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:237)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:308)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSimpleEvent(OrcRecordUpdater.java:423)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.addSplitUpdateEvent(OrcRecordUpdater.java:431)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.insert(OrcRecordUpdater.java:483)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:995)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:490)
>   ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20730) Do delete event filtering even if hive.acid.index is not there

2018-11-13 Thread Saurabh Seth (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685782#comment-16685782
 ] 

Saurabh Seth commented on HIVE-20730:
-

[~ekoifman] this is ready to be committed now. I've added the test case I was 
working on.

> Do delete event filtering even if hive.acid.index is not there
> --
>
> Key: HIVE-20730
> URL: https://issues.apache.org/jira/browse/HIVE-20730
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Saurabh Seth
>Priority: Major
> Attachments: HIVE-20730.2.patch, HIVE-20730.3.patch, HIVE-20730.patch
>
>
> Since HIVE-16812, {{VectorizedOrcAcidRowBatchReader}} filters delete events 
> based on the min/max ROW__ID in the split, which relies on {{hive.acid.index}} 
> being present in the ORC footer.
> There is no way to generate {{hive.acid.index}} from a plain query as in 
> HIVE-20699, so we need to make sure that we generate a SARG into 
> delete_delta/bucket_x based on stripe stats even if the index is missing.
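
A minimal sketch of the scenario (assumed table name, for illustration only): deletes on a full ACID table produce delete_delta/bucket_N files, and it is the read path below that should prune their delete events against the split's ROW__ID range even when {{hive.acid.index}} is absent from the ORC footer.
{code:sql}
CREATE TABLE acid_t (id int, val string)
CLUSTERED BY (id) INTO 2 BUCKETS
STORED AS ORC TBLPROPERTIES ('transactional'='true');
INSERT INTO acid_t VALUES (1, 'a'), (2, 'b');
-- the DELETE writes a delete_delta/bucket_N file under the table directory
DELETE FROM acid_t WHERE id = 1;
-- delete-event filtering happens on this read
SELECT * FROM acid_t;
{code}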



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20905) querying streaming table fails with out of memory exception

2018-11-13 Thread Thejas M Nair (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-20905:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> querying streaming table fails with out of memory exception
> ---
>
> Key: HIVE-20905
> URL: https://issues.apache.org/jira/browse/HIVE-20905
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20905.01.patch, HIVE-20905.02.patch, 
> HIVE-20905.03.patch
>
>
> The streaming app was run for 24 hours, after which it went down due to an 
> authentication issue. The table was accessible 12 hours into the run; however, 
> querying the table currently fails with an OOM exception.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20900) serde2.JsonSerDe no longer supports timestamp.formats

2018-11-13 Thread slim bouguerra (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685735#comment-16685735
 ] 

slim bouguerra commented on HIVE-20900:
---

This test 

org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[kafka_storage_handler]

might be failing because of an expected-results mismatch caused by the bug fix.

Can you please re-run the test and update the results?

> serde2.JsonSerDe no longer supports timestamp.formats
> -
>
> Key: HIVE-20900
> URL: https://issues.apache.org/jira/browse/HIVE-20900
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-20900.1.patch, HIVE-20900.2.patch
>
>
> Looks like HIVE-18545 broke this.
> Also json_serde_tsformat.q only tested the hcat version of JsonSerde, and the 
> format in that test used the ISO timestamp format which apparently is now 
> parsed by the default timestamp parsing, so the test was too simple.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20905) querying streaming table fails with out of memory exception

2018-11-13 Thread Thejas M Nair (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685730#comment-16685730
 ] 

Thejas M Nair commented on HIVE-20905:
--

+1


> querying streaming table fails with out of memory exception
> ---
>
> Key: HIVE-20905
> URL: https://issues.apache.org/jira/browse/HIVE-20905
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20905.01.patch, HIVE-20905.02.patch, 
> HIVE-20905.03.patch
>
>
> The streaming app was run for 24 hours, after which it went down due to an 
> authentication issue. The table was accessible 12 hours into the run; however, 
> querying the table currently fails with an OOM exception.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20900) serde2.JsonSerDe no longer supports timestamp.formats

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685725#comment-16685725
 ] 

Hive QA commented on HIVE-20900:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12948011/HIVE-20900.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15540 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[kafka_storage_handler]
 (batchId=194)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14916/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14916/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14916/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12948011 - PreCommit-HIVE-Build

> serde2.JsonSerDe no longer supports timestamp.formats
> -
>
> Key: HIVE-20900
> URL: https://issues.apache.org/jira/browse/HIVE-20900
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-20900.1.patch, HIVE-20900.2.patch
>
>
> Looks like HIVE-18545 broke this.
> Also json_serde_tsformat.q only tested the hcat version of JsonSerde, and the 
> format in that test used the ISO timestamp format which apparently is now 
> parsed by the default timestamp parsing, so the test was too simple.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20900) serde2.JsonSerDe no longer supports timestamp.formats

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685717#comment-16685717
 ] 

Hive QA commented on HIVE-20900:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m  
7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 8s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
36s{color} | {color:blue} serde in master has 198 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
46s{color} | {color:blue} ql in master has 2316 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
25s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} serde: The patch generated 0 new + 5 unchanged - 1 
fixed = 5 total (was 6) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} root: The patch generated 0 new + 5 unchanged - 1 
fixed = 5 total (was 6) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} The patch ql passed checkstyle {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
45s{color} | {color:red} serde generated 1 new + 197 unchanged - 1 fixed = 198 
total (was 198) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 56m 45s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:serde |
|  |  Found reliance on default encoding in 
org.apache.hadoop.hive.serde2.json.HiveJsonStructReader.parsePrimitiveValue(String,
 PrimitiveObjectInspector):in 
org.apache.hadoop.hive.serde2.json.HiveJsonStructReader.parsePrimitiveValue(String,
 PrimitiveObjectInspector): String.getBytes()  At 
HiveJsonStructReader.java:[line 353] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14916/dev-support/hive-personality.sh
 |
| git revision | master / 52f94b8 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14916/yetus/new-findbugs-serde.html
 |
| modules | C: serde . ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14916/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> serde2.JsonSerDe no longer supports timestamp.formats
> -
>
> Key: HIVE-20900
> URL: https://issues.apache.org/jira/browse/HIVE-20900
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/

[jira] [Assigned] (HIVE-20304) When hive.optimize.skewjoin and hive.auto.convert.join are both set to true, and the execution engine is mr, same stage may launch twice due to the wrong generated plan

2018-11-13 Thread Yongzhi Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen reassigned HIVE-20304:
---

Assignee: Hui Huang  (was: Yongzhi Chen)

> When hive.optimize.skewjoin and hive.auto.convert.join are both set to true, 
> and the execution engine is mr, same stage may launch twice due to the wrong 
> generated plan
> 
>
> Key: HIVE-20304
> URL: https://issues.apache.org/jira/browse/HIVE-20304
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 1.2.1, 2.3.3
>Reporter: Hui Huang
>Assignee: Hui Huang
>Priority: Major
> Fix For: 1.2.1, 4.0.0
>
> Attachments: HIVE-20304.1.patch, HIVE-20304.2.patch, HIVE-20304.patch
>
>
> When hive.optimize.skewjoin and hive.auto.convert.join are both set to true, 
> and the execution engine is set to mr, the same stage of a query may launch 
> twice due to a wrongly generated plan. If hive.exec.parallel is also true, both 
> launches of that stage will run at the same time and the job will fail because 
> the first one to complete clears the map.xml/reduce.xml files stored in HDFS.
> Use the following SQL to reproduce the issue:
> {code:java}
> CREATE TABLE `tbl1`(
>   `fence` string);
> CREATE TABLE `tbl2`(
>   `order_id` string,
>   `phone` string,
>   `search_id` string
> )
> PARTITIONED BY (
>   `dt` string);
> CREATE TABLE `tbl3`(
>   `order_id` string,
>   `platform` string)
> PARTITIONED BY (
>   `dt` string);
> CREATE TABLE `tbl4`(
>   `groupname` string,
>   `phone` string)
> PARTITIONED BY (
>   `dt` string);
> CREATE TABLE `tbl5`(
>   `search_id` string,
>   `fence` string)
> PARTITIONED BY (
>   `dt` string);
> SET hive.exec.parallel = TRUE;
> SET hive.auto.convert.join = TRUE;
> SET hive.optimize.skewjoin = TRUE;
> SELECT dt,
>platform,
>groupname,
>count(1) as cnt
> FROM
> (SELECT dt,
> platform,
> groupname
>  FROM
>  (SELECT fence
>   FROM tbl1)ta
>JOIN
>(SELECT a0.dt,
>a1.platform,
>a2.groupname,
>a3.fence
> FROM
> (SELECT dt,
> order_id,
> phone,
> search_id
>  FROM tbl2
>  WHERE dt =20180703 )a0
>   JOIN
>   (SELECT order_id,
>   platform,
>   dt
>FROM tbl3
>WHERE dt =20180703 )a1 ON a0.order_id = a1.order_id
>   INNER JOIN
>   (SELECT groupname,
>   phone,
>   dt
>FROM tbl4
>WHERE dt =20180703 )a2 ON a0.phone = a2.phone
>   LEFT JOIN
>   (SELECT search_id,
>   fence,
>   dt
>FROM tbl5
>WHERE dt =20180703)a3 ON a0.search_id = a3.search_id)t0 ON 
> ta.fence = t0.fence)t11
> GROUP BY dt,
>  platform,
>  groupname;
> DROP TABLE tbl1;
> DROP TABLE tbl2;
> DROP TABLE tbl3;
> DROP TABLE tbl4;
> DROP TABLE tbl5;
> {code}
> We will get some error message like this:
> Examining task ID: task_1531284442065_3637_m_00 (and more) from job 
> job_1531284442065_3637
> Task with the most failures(4):
> 
> Task ID:
>  task_1531284442065_3637_m_00
> URL:
>  
> [http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1531284442065_3637&tipid=task_1531284442065_3637_m_00]
> 
> Diagnostic Messages for this Task:
>  File does not exist: 
> hdfs://test/tmp/hive-hadoop/hadoop/fe5efa94-abb1-420f-b6ba-ec782e7b79ad/hive_2018-08-03_17-00-17_707_592882314975289971-5/-mr-10045/757eb1f7-7a37-4a7e-abc0-4a3b8b06510c/reduce.xml
>  java.io.FileNotFoundException: File does not exist: 
> hdfs://test/tmp/hive-hadoop/hadoop/fe5efa94-abb1-420f-b6ba-ec782e7b79ad/hive_2018-08-03_17-00-17_707_592882314975289971-5/-mr-10045/757eb1f7-7a37-4a7e-abc0-4a3b8b06510c/reduce.xml
> Looking into the plan by executing EXPLAIN, I found that Stage-4 and 
> Stage-5 can be reached from multiple root tasks.
> {code:java}
> Explain
> STAGE DEPENDENCIES:
>   Stage-21 is a root stage , consists of Stage-34, Stage-5
>   Stage-34 has a backup stage: Stage-5
>   Stage-20 depends on stages: Stage-34
>   Stage-17 depends on stages: Stage-5, Stage-18, Stage-20 , consists of 
> Stage-32, Stage-33, Stage-1
>   Stage-32 has a backup stage: Stage-1
>   Stage-15 depends on stages: Stage-32
>   Stage-10 depends on stages: Stage-1, Stage-15, Stage-16 , consists of 
> Stage-31, Stage-2
>   Stage-31
>   S

[jira] [Updated] (HIVE-20304) When hive.optimize.skewjoin and hive.auto.convert.join are both set to true, and the execution engine is mr, same stage may launch twice due to the wrong generated plan

2018-11-13 Thread Yongzhi Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-20304:

Fix Version/s: 4.0.0
   Status: Patch Available  (was: Open)

> When hive.optimize.skewjoin and hive.auto.convert.join are both set to true, 
> and the execution engine is mr, same stage may launch twice due to the wrong 
> generated plan
> 
>
> Key: HIVE-20304
> URL: https://issues.apache.org/jira/browse/HIVE-20304
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 1.2.1, 2.3.3
>Reporter: Hui Huang
>Assignee: Yongzhi Chen
>Priority: Major
> Fix For: 4.0.0, 1.2.1
>
> Attachments: HIVE-20304.1.patch, HIVE-20304.2.patch, HIVE-20304.patch
>
>
> When hive.optimize.skewjoin and hive.auto.convert.join are both set to true, 
> and the execution engine is set to mr, the same stage of a query may launch 
> twice due to a wrongly generated plan. If hive.exec.parallel is also true, both 
> launches of that stage will run at the same time and the job will fail because 
> the first one to complete clears the map.xml/reduce.xml files stored in HDFS.
> Use the following SQL to reproduce the issue:
> {code:java}
> CREATE TABLE `tbl1`(
>   `fence` string);
> CREATE TABLE `tbl2`(
>   `order_id` string,
>   `phone` string,
>   `search_id` string
> )
> PARTITIONED BY (
>   `dt` string);
> CREATE TABLE `tbl3`(
>   `order_id` string,
>   `platform` string)
> PARTITIONED BY (
>   `dt` string);
> CREATE TABLE `tbl4`(
>   `groupname` string,
>   `phone` string)
> PARTITIONED BY (
>   `dt` string);
> CREATE TABLE `tbl5`(
>   `search_id` string,
>   `fence` string)
> PARTITIONED BY (
>   `dt` string);
> SET hive.exec.parallel = TRUE;
> SET hive.auto.convert.join = TRUE;
> SET hive.optimize.skewjoin = TRUE;
> SELECT dt,
>platform,
>groupname,
>count(1) as cnt
> FROM
> (SELECT dt,
> platform,
> groupname
>  FROM
>  (SELECT fence
>   FROM tbl1)ta
>JOIN
>(SELECT a0.dt,
>a1.platform,
>a2.groupname,
>a3.fence
> FROM
> (SELECT dt,
> order_id,
> phone,
> search_id
>  FROM tbl2
>  WHERE dt =20180703 )a0
>   JOIN
>   (SELECT order_id,
>   platform,
>   dt
>FROM tbl3
>WHERE dt =20180703 )a1 ON a0.order_id = a1.order_id
>   INNER JOIN
>   (SELECT groupname,
>   phone,
>   dt
>FROM tbl4
>WHERE dt =20180703 )a2 ON a0.phone = a2.phone
>   LEFT JOIN
>   (SELECT search_id,
>   fence,
>   dt
>FROM tbl5
>WHERE dt =20180703)a3 ON a0.search_id = a3.search_id)t0 ON 
> ta.fence = t0.fence)t11
> GROUP BY dt,
>  platform,
>  groupname;
> DROP TABLE tbl1;
> DROP TABLE tbl2;
> DROP TABLE tbl3;
> DROP TABLE tbl4;
> DROP TABLE tbl5;
> {code}
> We will get some error message like this:
> Examining task ID: task_1531284442065_3637_m_00 (and more) from job 
> job_1531284442065_3637
> Task with the most failures(4):
> 
> Task ID:
>  task_1531284442065_3637_m_00
> URL:
>  
> [http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1531284442065_3637&tipid=task_1531284442065_3637_m_00]
> 
> Diagnostic Messages for this Task:
>  File does not exist: 
> hdfs://test/tmp/hive-hadoop/hadoop/fe5efa94-abb1-420f-b6ba-ec782e7b79ad/hive_2018-08-03_17-00-17_707_592882314975289971-5/-mr-10045/757eb1f7-7a37-4a7e-abc0-4a3b8b06510c/reduce.xml
>  java.io.FileNotFoundException: File does not exist: 
> hdfs://test/tmp/hive-hadoop/hadoop/fe5efa94-abb1-420f-b6ba-ec782e7b79ad/hive_2018-08-03_17-00-17_707_592882314975289971-5/-mr-10045/757eb1f7-7a37-4a7e-abc0-4a3b8b06510c/reduce.xml
> Looking into the plan by executing EXPLAIN, I found that Stage-4 and 
> Stage-5 can be reached from multiple root tasks.
> {code:java}
> Explain
> STAGE DEPENDENCIES:
>   Stage-21 is a root stage , consists of Stage-34, Stage-5
>   Stage-34 has a backup stage: Stage-5
>   Stage-20 depends on stages: Stage-34
>   Stage-17 depends on stages: Stage-5, Stage-18, Stage-20 , consists of 
> Stage-32, Stage-33, Stage-1
>   Stage-32 has a backup stage: Stage-1
>   Stage-15 depends on stages: Stage-32
>   Stage-10 depends on stages: Stage-1, Stage-15, Stage-16 , consists of 
> Stage-31, St

[jira] [Assigned] (HIVE-20304) When hive.optimize.skewjoin and hive.auto.convert.join are both set to true, and the execution engine is mr, same stage may launch twice due to the wrong generated plan

2018-11-13 Thread Yongzhi Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen reassigned HIVE-20304:
---

Assignee: Yongzhi Chen  (was: Hui Huang)

> When hive.optimize.skewjoin and hive.auto.convert.join are both set to true, 
> and the execution engine is mr, same stage may launch twice due to the wrong 
> generated plan
> 
>
> Key: HIVE-20304
> URL: https://issues.apache.org/jira/browse/HIVE-20304
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 1.2.1, 2.3.3
>Reporter: Hui Huang
>Assignee: Yongzhi Chen
>Priority: Major
> Fix For: 1.2.1, 4.0.0
>
> Attachments: HIVE-20304.1.patch, HIVE-20304.2.patch, HIVE-20304.patch
>
>
> When hive.optimize.skewjoin and hive.auto.convert.join are both set to true, 
> and the execution engine is set to mr, the same stage of a query may launch 
> twice due to a wrongly generated plan. If hive.exec.parallel is also true, both 
> launches of that stage will run at the same time and the job will fail because 
> the first one to complete clears the map.xml/reduce.xml files stored in HDFS.
> Use the following SQL to reproduce the issue:
> {code:java}
> CREATE TABLE `tbl1`(
>   `fence` string);
> CREATE TABLE `tbl2`(
>   `order_id` string,
>   `phone` string,
>   `search_id` string
> )
> PARTITIONED BY (
>   `dt` string);
> CREATE TABLE `tbl3`(
>   `order_id` string,
>   `platform` string)
> PARTITIONED BY (
>   `dt` string);
> CREATE TABLE `tbl4`(
>   `groupname` string,
>   `phone` string)
> PARTITIONED BY (
>   `dt` string);
> CREATE TABLE `tbl5`(
>   `search_id` string,
>   `fence` string)
> PARTITIONED BY (
>   `dt` string);
> SET hive.exec.parallel = TRUE;
> SET hive.auto.convert.join = TRUE;
> SET hive.optimize.skewjoin = TRUE;
> SELECT dt,
>platform,
>groupname,
>count(1) as cnt
> FROM
> (SELECT dt,
> platform,
> groupname
>  FROM
>  (SELECT fence
>   FROM tbl1)ta
>JOIN
>(SELECT a0.dt,
>a1.platform,
>a2.groupname,
>a3.fence
> FROM
> (SELECT dt,
> order_id,
> phone,
> search_id
>  FROM tbl2
>  WHERE dt =20180703 )a0
>   JOIN
>   (SELECT order_id,
>   platform,
>   dt
>FROM tbl3
>WHERE dt =20180703 )a1 ON a0.order_id = a1.order_id
>   INNER JOIN
>   (SELECT groupname,
>   phone,
>   dt
>FROM tbl4
>WHERE dt =20180703 )a2 ON a0.phone = a2.phone
>   LEFT JOIN
>   (SELECT search_id,
>   fence,
>   dt
>FROM tbl5
>WHERE dt =20180703)a3 ON a0.search_id = a3.search_id)t0 ON 
> ta.fence = t0.fence)t11
> GROUP BY dt,
>  platform,
>  groupname;
> DROP TABLE tbl1;
> DROP TABLE tbl2;
> DROP TABLE tbl3;
> DROP TABLE tbl4;
> DROP TABLE tbl5;
> {code}
> We will get some error message like this:
> Examining task ID: task_1531284442065_3637_m_00 (and more) from job 
> job_1531284442065_3637
> Task with the most failures(4):
> 
> Task ID:
>  task_1531284442065_3637_m_00
> URL:
>  
> [http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1531284442065_3637&tipid=task_1531284442065_3637_m_00]
> 
> Diagnostic Messages for this Task:
>  File does not exist: 
> hdfs://test/tmp/hive-hadoop/hadoop/fe5efa94-abb1-420f-b6ba-ec782e7b79ad/hive_2018-08-03_17-00-17_707_592882314975289971-5/-mr-10045/757eb1f7-7a37-4a7e-abc0-4a3b8b06510c/reduce.xml
>  java.io.FileNotFoundException: File does not exist: 
> hdfs://test/tmp/hive-hadoop/hadoop/fe5efa94-abb1-420f-b6ba-ec782e7b79ad/hive_2018-08-03_17-00-17_707_592882314975289971-5/-mr-10045/757eb1f7-7a37-4a7e-abc0-4a3b8b06510c/reduce.xml
> Looking into the plan by executing EXPLAIN, I found that Stage-4 and 
> Stage-5 can be reached from multiple root tasks.
> {code:java}
> Explain
> STAGE DEPENDENCIES:
>   Stage-21 is a root stage , consists of Stage-34, Stage-5
>   Stage-34 has a backup stage: Stage-5
>   Stage-20 depends on stages: Stage-34
>   Stage-17 depends on stages: Stage-5, Stage-18, Stage-20 , consists of 
> Stage-32, Stage-33, Stage-1
>   Stage-32 has a backup stage: Stage-1
>   Stage-15 depends on stages: Stage-32
>   Stage-10 depends on stages: Stage-1, Stage-15, Stage-16 , consists of 
> Stage-31, Stage-2
>   Stage-31
> 

[jira] [Updated] (HIVE-20304) When hive.optimize.skewjoin and hive.auto.convert.join are both set to true, and the execution engine is mr, same stage may launch twice due to the wrong generated plan

2018-11-13 Thread Yongzhi Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-20304:

Status: Open  (was: Patch Available)

> When hive.optimize.skewjoin and hive.auto.convert.join are both set to true, 
> and the execution engine is mr, same stage may launch twice due to the wrong 
> generated plan
> 
>
> Key: HIVE-20304
> URL: https://issues.apache.org/jira/browse/HIVE-20304
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 1.2.1, 2.3.3
>Reporter: Hui Huang
>Assignee: Hui Huang
>Priority: Major
> Fix For: 1.2.1
>
> Attachments: HIVE-20304.1.patch, HIVE-20304.2.patch, HIVE-20304.patch
>
>
> When hive.optimize.skewjoin and hive.auto.convert.join are both set to true, 
> and the execution engine is set to mr, the same stage of a query may launch 
> twice due to a wrongly generated plan. If hive.exec.parallel is also true, both 
> launches of that stage will run at the same time and the job will fail because 
> the first one to complete clears the map.xml/reduce.xml files stored in HDFS.
> Use the following SQL to reproduce the issue:
> {code:java}
> CREATE TABLE `tbl1`(
>   `fence` string);
> CREATE TABLE `tbl2`(
>   `order_id` string,
>   `phone` string,
>   `search_id` string
> )
> PARTITIONED BY (
>   `dt` string);
> CREATE TABLE `tbl3`(
>   `order_id` string,
>   `platform` string)
> PARTITIONED BY (
>   `dt` string);
> CREATE TABLE `tbl4`(
>   `groupname` string,
>   `phone` string)
> PARTITIONED BY (
>   `dt` string);
> CREATE TABLE `tbl5`(
>   `search_id` string,
>   `fence` string)
> PARTITIONED BY (
>   `dt` string);
> SET hive.exec.parallel = TRUE;
> SET hive.auto.convert.join = TRUE;
> SET hive.optimize.skewjoin = TRUE;
> SELECT dt,
>platform,
>groupname,
>count(1) as cnt
> FROM
> (SELECT dt,
> platform,
> groupname
>  FROM
>  (SELECT fence
>   FROM tbl1)ta
>JOIN
>(SELECT a0.dt,
>a1.platform,
>a2.groupname,
>a3.fence
> FROM
> (SELECT dt,
> order_id,
> phone,
> search_id
>  FROM tbl2
>  WHERE dt =20180703 )a0
>   JOIN
>   (SELECT order_id,
>   platform,
>   dt
>FROM tbl3
>WHERE dt =20180703 )a1 ON a0.order_id = a1.order_id
>   INNER JOIN
>   (SELECT groupname,
>   phone,
>   dt
>FROM tbl4
>WHERE dt =20180703 )a2 ON a0.phone = a2.phone
>   LEFT JOIN
>   (SELECT search_id,
>   fence,
>   dt
>FROM tbl5
>WHERE dt =20180703)a3 ON a0.search_id = a3.search_id)t0 ON 
> ta.fence = t0.fence)t11
> GROUP BY dt,
>  platform,
>  groupname;
> DROP TABLE tbl1;
> DROP TABLE tbl2;
> DROP TABLE tbl3;
> DROP TABLE tbl4;
> DROP TABLE tbl5;
> {code}
> We will get some error message like this:
> Examining task ID: task_1531284442065_3637_m_00 (and more) from job 
> job_1531284442065_3637
> Task with the most failures(4):
> 
> Task ID:
>  task_1531284442065_3637_m_00
> URL:
>  
> [http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1531284442065_3637&tipid=task_1531284442065_3637_m_00]
> 
> Diagnostic Messages for this Task:
>  File does not exist: 
> hdfs://test/tmp/hive-hadoop/hadoop/fe5efa94-abb1-420f-b6ba-ec782e7b79ad/hive_2018-08-03_17-00-17_707_592882314975289971-5/-mr-10045/757eb1f7-7a37-4a7e-abc0-4a3b8b06510c/reduce.xml
>  java.io.FileNotFoundException: File does not exist: 
> hdfs://test/tmp/hive-hadoop/hadoop/fe5efa94-abb1-420f-b6ba-ec782e7b79ad/hive_2018-08-03_17-00-17_707_592882314975289971-5/-mr-10045/757eb1f7-7a37-4a7e-abc0-4a3b8b06510c/reduce.xml
> Looking into the plan by executing EXPLAIN, I found that Stage-4 and 
> Stage-5 can be reached from multiple root tasks.
> {code:java}
> Explain
> STAGE DEPENDENCIES:
>   Stage-21 is a root stage , consists of Stage-34, Stage-5
>   Stage-34 has a backup stage: Stage-5
>   Stage-20 depends on stages: Stage-34
>   Stage-17 depends on stages: Stage-5, Stage-18, Stage-20 , consists of 
> Stage-32, Stage-33, Stage-1
>   Stage-32 has a backup stage: Stage-1
>   Stage-15 depends on stages: Stage-32
>   Stage-10 depends on stages: Stage-1, Stage-15, Stage-16 , consists of 
> Stage-31, Stage-2
>   Stage-31
>   Stage-9 depends on 

[jira] [Commented] (HIVE-19026) Configurable serde for druid kafka indexing

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685648#comment-16685648
 ] 

Hive QA commented on HIVE-19026:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12948001/HIVE-19026.8.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15542 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14915/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14915/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14915/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12948001 - PreCommit-HIVE-Build

> Configurable serde for druid kafka indexing 
> 
>
> Key: HIVE-19026
> URL: https://issues.apache.org/jira/browse/HIVE-19026
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19026.1.patch, HIVE-19026.2.patch, 
> HIVE-19026.3.patch, HIVE-19026.4.patch, HIVE-19026.5.patch, 
> HIVE-19026.6.patch, HIVE-19026.7.patch, HIVE-19026.8.patch, HIVE-19026.patch
>
>
> https://issues.apache.org/jira/browse/HIVE-18976 introduces support for 
> setting up the Druid Kafka indexing service. 
> Input serialization should be configurable. For now we can say we only 
> support JSON, but there should be a mechanism to support other formats. 
> Perhaps we can make use of Hive's SerDe library, e.g. LazySimpleSerDe.
> Also add support for ingesting the timestamp column when the timestamp 
> column name in the input is not `__time`. 
> e.g. 
> CREATE EXTERNAL TABLE druid_kafka_test_avro(__time timestamp , other 
> columns...)
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES (
>  "druid.timestamp.column" = "myinputColumnTimestamp"
>   other ppts 
>  ) 
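
For illustration, a fuller sketch of what such a DDL could look like (everything except "druid.timestamp.column" and the storage handler class is a hypothetical placeholder on my side, not something this ticket defines):
{code:sql}
-- Property names other than druid.timestamp.column are hypothetical placeholders.
CREATE EXTERNAL TABLE druid_kafka_test_json (`__time` timestamp, `page` string, `added` int)
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES (
  "kafka.bootstrap.servers" = "localhost:9092",
  "kafka.topic" = "wiki-events",
  "druid.timestamp.column" = "myinputColumnTimestamp",
  "druid.kafka.ingestion.serde" = "org.apache.hadoop.hive.serde2.JsonSerDe"
);
{code}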



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19026) Configurable serde for druid kafka indexing

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685646#comment-16685646
 ] 

Hive QA commented on HIVE-19026:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
32s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
35s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
28s{color} | {color:blue} common in master has 65 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
25s{color} | {color:blue} druid-handler in master has 4 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
18s{color} | {color:blue} itests/qtest-druid in master has 6 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
42s{color} | {color:blue} itests/util in master has 48 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
48s{color} | {color:blue} ql in master has 2316 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
51s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
10s{color} | {color:red} druid-handler: The patch generated 39 new + 11 
unchanged - 2 fixed = 50 total (was 13) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m  
3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 65m 34s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  
xml  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14915/dev-support/hive-personality.sh
 |
| git revision | master / 1ceb4eb |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14915/yetus/diff-checkstyle-druid-handler.txt
 |
| modules | C: common . druid-handler itests itests/qtest-druid itests/util ql 
U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14915/yetus.txt |
| Powered by | Apache Yetus  http://yetus.apache.org |


This message was automatically generated.



> Configurable serde for druid kafka indexing 
> 
>
> Key: HIVE-19026
> URL: https://issues.apache.org/jira/browse/HIVE-19026
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Prio

[jira] [Updated] (HIVE-20676) HiveServer2: PrivilegeSynchronizer is not set to daemon status

2018-11-13 Thread Thejas M Nair (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-20676:
-
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

> HiveServer2: PrivilegeSynchronizer is not set to daemon status
> --
>
> Key: HIVE-20676
> URL: https://issues.apache.org/jira/browse/HIVE-20676
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20676.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20900) serde2.JsonSerDe no longer supports timestamp.formats

2018-11-13 Thread Jason Dere (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-20900:
--
Attachment: HIVE-20900.2.patch

> serde2.JsonSerDe no longer supports timestamp.formats
> -
>
> Key: HIVE-20900
> URL: https://issues.apache.org/jira/browse/HIVE-20900
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-20900.1.patch, HIVE-20900.2.patch
>
>
> Looks like HIVE-18545 broke this.
> Also, json_serde_tsformat.q only tested the hcat version of JsonSerDe, and the 
> format in that test used the ISO timestamp format, which is apparently now 
> handled by the default timestamp parsing, so the test was too simple.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20900) serde2.JsonSerDe no longer supports timestamp.formats

2018-11-13 Thread Jason Dere (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685577#comment-16685577
 ] 

Jason Dere commented on HIVE-20900:
---

[~kgyrtkirk] how about this fix? Primitive value parsing will go through the 
same logic whether it is the hcat or the serde2 version of JsonSerDe.
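For context, here is a minimal sketch of the kind of table the timestamp.formats property is meant to support; the table and column names and the sample pattern are illustrative only, and the serde class is the serde2 JsonSerDe named in the summary.

{code:sql}
-- Illustrative sketch: custom timestamp patterns are supplied through the
-- timestamp.formats serde property discussed in this issue.
CREATE TABLE json_ts_test (id INT, ts TIMESTAMP)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.JsonSerDe'
WITH SERDEPROPERTIES ("timestamp.formats" = "millis,yyyy-MM-dd'T'HH:mm:ss");
{code}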

> serde2.JsonSerDe no longer supports timestamp.formats
> -
>
> Key: HIVE-20900
> URL: https://issues.apache.org/jira/browse/HIVE-20900
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-20900.1.patch, HIVE-20900.2.patch
>
>
> Looks like HIVE-18545 broke this.
> Also, json_serde_tsformat.q only tested the hcat version of JsonSerDe, and the 
> format in that test used the ISO timestamp format, which is apparently now 
> handled by the default timestamp parsing, so the test was too simple.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20891) Call alter_partition in batch when dynamically loading partitions

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685574#comment-16685574
 ] 

Hive QA commented on HIVE-20891:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12947990/HIVE-20891.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 41 failed/errored test(s), 15539 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[buckets] 
(batchId=275)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions]
 (batchId=275)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions]
 (batchId=275)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=275)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=275)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=275)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[orc_buckets] 
(batchId=275)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[orc_nonstd_partitions_loc]
 (batchId=275)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[parquet_buckets]
 (batchId=275)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[parquet_nonstd_partitions_loc]
 (batchId=275)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[rcfile_buckets] 
(batchId=275)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[rcfile_nonstd_partitions_loc]
 (batchId=275)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_coltype] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_3] 
(batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dynamic_partition_insert]
 (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[infer_bucket_sort_dyn_part]
 (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_6] 
(batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_7] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[lock3] (batchId=58)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[lock4] (batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[merge_dynamic_partition] 
(batchId=45)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[multi_insert_partitioned]
 (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge_incompat2] 
(batchId=90)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_decode_name] 
(batchId=87)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_special_char] 
(batchId=48)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] 
(batchId=182)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dp_counter_non_mm]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_opt_vectorization]
 (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization2]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_partitioned]
 (batchId=181)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[orc_merge7] 
(batchId=181)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[orc_merge_incompat2]
 (batchId=181)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_input_counters]
 (batchId=179)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge7]
 (batchId=192)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge_incompat2]
 (batchId=192)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testNoBucketsDP (batchId=307)
org.apache.hadoop.hive.ql.TestTxnNoBucketsVectorized.testNoBucketsDP 
(batchId=307)
org.apache.hadoop.hive.ql.util.TestUpgradeTool.testPostUpgrade (batchId=283)
org.apache.hive.hcatalog.listener.TestDbNotificationListener.sqlInsertPartition 
(batchId=266)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitions
 (batchId=258)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14914/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14914/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14914/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Exec

[jira] [Commented] (HIVE-19701) getDelegationTokenFromMetaStore doesn't need to be synchronized

2018-11-13 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685534#comment-16685534
 ] 

Sankar Hariappan commented on HIVE-19701:
-

Thanks [~thejas] for the review!
01.patch is committed to master!

> getDelegationTokenFromMetaStore doesn't need to be synchronized
> ---
>
> Key: HIVE-19701
> URL: https://issues.apache.org/jira/browse/HIVE-19701
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Thejas M Nair
>Assignee: Sankar Hariappan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19701.01.patch
>
>
> CLIService.getDelegationTokenFromMetaStore just invokes the metastore API via 
> a thread-local Hive object, so it doesn't have to be synchronized.
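A minimal sketch of the shape of the change being discussed, assuming the method only touches thread-confined state; the class and helper names below are illustrative and are not the actual CLIService code.

{code:java}
// Illustrative sketch only: each thread works on its own ThreadLocal value, so
// a method-level lock adds contention without protecting any shared state.
public class TokenServiceSketch {
  private static final ThreadLocal<StringBuilder> SCRATCH =
      ThreadLocal.withInitial(StringBuilder::new);

  // Before: synchronized serializes all callers on this instance.
  public synchronized String getTokenSynchronized(String owner, String renewer) {
    return buildToken(owner, renewer);
  }

  // After: no lock needed, because only thread-local state is touched.
  public String getToken(String owner, String renewer) {
    return buildToken(owner, renewer);
  }

  private String buildToken(String owner, String renewer) {
    StringBuilder sb = SCRATCH.get();
    sb.setLength(0);
    return sb.append(owner).append('/').append(renewer).toString();
  }
}
{code}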



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19701) getDelegationTokenFromMetaStore doesn't need to be synchronized

2018-11-13 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-19701:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

> getDelegationTokenFromMetaStore doesn't need to be synchronized
> ---
>
> Key: HIVE-19701
> URL: https://issues.apache.org/jira/browse/HIVE-19701
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Thejas M Nair
>Assignee: Sankar Hariappan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19701.01.patch
>
>
> CLIService.getDelegationTokenFromMetaStore just invokes the metastore API via 
> a thread-local Hive object, so it doesn't have to be synchronized.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19701) getDelegationTokenFromMetaStore doesn't need to be synchronized

2018-11-13 Thread Thejas M Nair (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685530#comment-16685530
 ] 

Thejas M Nair commented on HIVE-19701:
--

+1


> getDelegationTokenFromMetaStore doesn't need to be synchronized
> ---
>
> Key: HIVE-19701
> URL: https://issues.apache.org/jira/browse/HIVE-19701
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Thejas M Nair
>Assignee: Sankar Hariappan
>Priority: Major
> Attachments: HIVE-19701.01.patch
>
>
> CLIService.getDelegationTokenFromMetaStore just invokes the metastore API via 
> a thread-local Hive object, so it doesn't have to be synchronized.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19026) Configurable serde for druid kafka indexing

2018-11-13 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19026:

Attachment: HIVE-19026.8.patch

> Configurable serde for druid kafka indexing 
> 
>
> Key: HIVE-19026
> URL: https://issues.apache.org/jira/browse/HIVE-19026
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19026.1.patch, HIVE-19026.2.patch, 
> HIVE-19026.3.patch, HIVE-19026.4.patch, HIVE-19026.5.patch, 
> HIVE-19026.6.patch, HIVE-19026.7.patch, HIVE-19026.8.patch, HIVE-19026.patch
>
>
> https://issues.apache.org/jira/browse/HIVE-18976 introduces support for 
> setting up druid kafka-indexing service. 
> Input serialization should be configurable. For now we can say we only 
> support JSON, but there should be a mechanism to support other formats. 
> Perhaps we can make use of Hive's serde library, e.g. LazySimpleSerDe.
> Also, add support to ingest the timestamp column when its name in the input 
> is not `__time`. 
> e.g. 
> CREATE EXTERNAL TABLE druid_kafka_test_avro(__time timestamp , other 
> columns...)
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES (
>  "druid.timestamp.column" = "myinputColumnTimestamp"
>   other ppts 
>  ) 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20891) Call alter_partition in batch when dynamically loading partitions

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685514#comment-16685514
 ] 

Hive QA commented on HIVE-20891:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
43s{color} | {color:blue} ql in master has 2316 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
39s{color} | {color:red} ql: The patch generated 1 new + 229 unchanged - 0 
fixed = 230 total (was 229) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 24s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14914/dev-support/hive-personality.sh
 |
| git revision | master / 99d25f0 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14914/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14914/yetus.txt |
| Powered by | Apache Yetus  http://yetus.apache.org |


This message was automatically generated.



> Call alter_partition in batch when dynamically loading partitions
> -
>
> Key: HIVE-20891
> URL: https://issues.apache.org/jira/browse/HIVE-20891
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Laszlo Pinter
>Assignee: Laszlo Pinter
>Priority: Major
> Attachments: HIVE-20891.01.patch, HIVE-20891.02.patch
>
>
> When dynamically loading partitions, setStatsPropAndAlterPartition() is 
> called for each partition one by one, resulting in unnecessary calls to the 
> metastore client. This whole logic can be changed to a single batched call. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20888) TxnHandler: sort() called on immutable lists

2018-11-13 Thread Igor Kryvenko (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685490#comment-16685490
 ] 

Igor Kryvenko commented on HIVE-20888:
--

The failed test is not related. 

> TxnHandler: sort() called on immutable lists
> 
>
> Key: HIVE-20888
> URL: https://issues.apache.org/jira/browse/HIVE-20888
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Igor Kryvenko
>Priority: Major
> Attachments: HIVE-20888.01.patch
>
>
> {code}
> } else {
>   assert (!rqst.isSetSrcTxnToWriteIdList());
>   assert (rqst.isSetTxnIds());
>   txnIds = rqst.getTxnIds();
> }
> Collections.sort(txnIds); //easier to read logs and for assumption 
> done in replication flow
> {code}
> when the input comes from
> {code}
>   @Override
>   public long allocateTableWriteId(long txnId, String dbName, String 
> tableName) throws TException {
> return allocateTableWriteIdsBatch(Collections.singletonList(txnId), 
> dbName, tableName).get(0).getWriteId();
>   }
> {code}
> {code}
> java.lang.UnsupportedOperationException: null
> at java.util.AbstractList.set(AbstractList.java:132) ~[?:1.8.0]
> at java.util.AbstractList$ListItr.set(AbstractList.java:426) ~[?:1.8.0]
> at java.util.Collections.sort(Collections.java:170) ~[?:1.8.0]
> at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.allocateTableWriteIds(TxnHandler.java:1523)
>  ~[hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.allocate_table_write_ids(HiveMetaStore.java:7349)
>  ~[hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20888) TxnHandler: sort() called on immutable lists

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685483#comment-16685483
 ] 

Hive QA commented on HIVE-20888:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12947987/HIVE-20888.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15534 tests 
executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=195)

[druidmini_dynamic_partition.q,druidmini_test_ts.q,druidmini_expressions.q,druidmini_test_alter.q,druidmini_test_insert.q]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] 
(batchId=182)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14913/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14913/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14913/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12947987 - PreCommit-HIVE-Build

> TxnHandler: sort() called on immutable lists
> 
>
> Key: HIVE-20888
> URL: https://issues.apache.org/jira/browse/HIVE-20888
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Igor Kryvenko
>Priority: Major
> Attachments: HIVE-20888.01.patch
>
>
> {code}
> } else {
>   assert (!rqst.isSetSrcTxnToWriteIdList());
>   assert (rqst.isSetTxnIds());
>   txnIds = rqst.getTxnIds();
> }
> Collections.sort(txnIds); //easier to read logs and for assumption 
> done in replication flow
> {code}
> when the input comes from
> {code}
>   @Override
>   public long allocateTableWriteId(long txnId, String dbName, String 
> tableName) throws TException {
> return allocateTableWriteIdsBatch(Collections.singletonList(txnId), 
> dbName, tableName).get(0).getWriteId();
>   }
> {code}
> {code}
> java.lang.UnsupportedOperationException: null
> at java.util.AbstractList.set(AbstractList.java:132) ~[?:1.8.0]
> at java.util.AbstractList$ListItr.set(AbstractList.java:426) ~[?:1.8.0]
> at java.util.Collections.sort(Collections.java:170) ~[?:1.8.0]
> at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.allocateTableWriteIds(TxnHandler.java:1523)
>  ~[hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.allocate_table_write_ids(HiveMetaStore.java:7349)
>  ~[hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20909) Just "MSCK" should throw SemanticException

2018-11-13 Thread Naveen Gangam (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam reassigned HIVE-20909:



> Just "MSCK" should throw SemanticException
> --
>
> Key: HIVE-20909
> URL: https://issues.apache.org/jira/browse/HIVE-20909
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
>
> Per documentation, the syntax for MSCK command is 
> {{MSCK [REPAIR] TABLE table_name [ADD/DROP/SYNC PARTITIONS];}}
> So just submitting "MSCK" should throw a SemanticException like it does for 
> other queries with incorrect syntax. But instead it appears to be attempting 
> to do something.
> $ hive --hiveconf hive.root.logger=INFO,console -e "msck;"
> 2018-11-08T15:21:25,016  INFO [main] SessionState: 
> 2018-11-08T15:21:26,203  INFO [main] session.SessionState: Created HDFS 
> directory: /tmp/hive/hive/b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78
> 2018-11-08T15:21:26,222  INFO [main] session.SessionState: Created local 
> directory: /tmp/root/b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78
> 2018-11-08T15:21:26,229  INFO [main] session.SessionState: Created HDFS 
> directory: /tmp/hive/hive/b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78/_tmp_space.db
> 2018-11-08T15:21:26,244  INFO [main] conf.HiveConf: Using the default value 
> passed in for log id: b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78
> 2018-11-08T15:21:26,246  INFO [main] session.SessionState: Updating thread 
> name to b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78 main
> 2018-11-08T15:21:26,246  INFO [b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78 main] 
> conf.HiveConf: Using the default value passed in for log id: 
> b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78
> 2018-11-08T15:21:26,548  INFO [b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78 main] 
> ql.Driver: Compiling 
> command(queryId=root_20181108152126_3babeb6f-8396-4ef3-8f85-2cbf12ebe9c1): 
> msck
> 2018-11-08T15:21:28,140  INFO [b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78 main] 
> hive.metastore: Trying to connect to metastore with URI 
> thrift://nightly61x-1.vpc.cloudera.com:9083
> 2018-11-08T15:21:28,184  INFO [b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78 main] 
> hive.metastore: Opened a connection to metastore, current connections: 1
> 2018-11-08T15:21:28,185  INFO [b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78 main] 
> hive.metastore: Connected to metastore.
> FAILED: SemanticException empty table creation??
> 2018-11-08T15:21:28,339 ERROR [b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78 main] 
> ql.Driver: FAILED: SemanticException empty table creation??
> org.apache.hadoop.hive.ql.parse.SemanticException: empty table creation??
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.getTable(BaseSemanticAnalyzer.java:1670)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.getTable(BaseSemanticAnalyzer.java:1652)
>   at 
> org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeMetastoreCheck(DDLSemanticAnalyzer.java:3118)
>   at 
> org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:414)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:600)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1414)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1543)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1332)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1321)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:342)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:802)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:774)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:701)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:313)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:227)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: empty table 
> creation??
>   at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1273)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1234)
>   at 
> org.apache.hadoop.

[jira] [Commented] (HIVE-20888) TxnHandler: sort() called on immutable lists

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685433#comment-16685433
 ] 

Hive QA commented on HIVE-20888:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
51s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 7s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  
3s{color} | {color:blue} standalone-metastore/metastore-server in master has 
185 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 43s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14913/dev-support/hive-personality.sh
 |
| git revision | master / 99d25f0 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: standalone-metastore/metastore-server U: 
standalone-metastore/metastore-server |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14913/yetus.txt |
| Powered by | Apache Yetus  http://yetus.apache.org |


This message was automatically generated.



> TxnHandler: sort() called on immutable lists
> 
>
> Key: HIVE-20888
> URL: https://issues.apache.org/jira/browse/HIVE-20888
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Igor Kryvenko
>Priority: Major
> Attachments: HIVE-20888.01.patch
>
>
> {code}
> } else {
>   assert (!rqst.isSetSrcTxnToWriteIdList());
>   assert (rqst.isSetTxnIds());
>   txnIds = rqst.getTxnIds();
> }
> Collections.sort(txnIds); //easier to read logs and for assumption 
> done in replication flow
> {code}
> when the input comes from
> {code}
>   @Override
>   public long allocateTableWriteId(long txnId, String dbName, String 
> tableName) throws TException {
> return allocateTableWriteIdsBatch(Collections.singletonList(txnId), 
> dbName, tableName).get(0).getWriteId();
>   }
> {code}
> {code}
> java.lang.UnsupportedOperationException: null
> at java.util.AbstractList.set(AbstractList.java:132) ~[?:1.8.0]
> at java.util.AbstractList$ListItr.set(AbstractList.java:426) ~[?:1.8.0]
> at java.util.Collections.sort(Collections.java:170) ~[?:1.8.0]
> at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.allocateTableWriteIds(TxnHandler.java:1523)
>  ~[hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.allocate_table_write_ids(HiveMetaStore.java:7349)
>  ~[hive-standalo

[jira] [Commented] (HIVE-20760) Reducing memory overhead due to multiple HiveConfs

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685397#comment-16685397
 ] 

Hive QA commented on HIVE-20760:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12947965/HIVE-20760.5.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 392 failed/errored test(s), 15434 tests 
executed
*Failed tests:*
{noformat}
TestBeeLineWithArgs - did not produce a TEST-*.xml file (likely timed out) 
(batchId=253)
TestBeelinePasswordOption - did not produce a TEST-*.xml file (likely timed 
out) (batchId=253)
TestEmbeddedThriftBinaryCLIService - did not produce a TEST-*.xml file (likely 
timed out) (batchId=254)
TestHiveSessionImpl - did not produce a TEST-*.xml file (likely timed out) 
(batchId=254)
TestSchemaTool - did not produce a TEST-*.xml file (likely timed out) 
(batchId=253)
TestThriftCLIServiceWithBinary - did not produce a TEST-*.xml file (likely 
timed out) (batchId=254)
TestThriftCLIServiceWithHttp - did not produce a TEST-*.xml file (likely timed 
out) (batchId=254)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=272)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[explain_outputs] 
(batchId=272)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[mapjoin2] 
(batchId=272)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[select_dummy_source] 
(batchId=272)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_10] 
(batchId=272)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_11] 
(batchId=272)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_12] 
(batchId=272)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_13] 
(batchId=272)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_16] 
(batchId=272)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_1] 
(batchId=272)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_2] 
(batchId=272)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_3] 
(batchId=272)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_7] 
(batchId=272)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[udf_unix_timestamp] 
(batchId=272)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[auto_sortmerge_join_16]
 (batchId=190)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucket4] 
(batchId=191)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucket5] 
(batchId=192)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucket6] 
(batchId=189)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketmapjoin6]
 (batchId=190)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketmapjoin7]
 (batchId=189)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[constprog_partitioner]
 (batchId=191)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[constprog_semijoin]
 (batchId=191)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[disable_merge_for_bucketing]
 (batchId=192)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[dynamic_rdd_cache]
 (batchId=190)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[empty_dir_in_table]
 (batchId=189)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[external_table_with_space_in_location_path]
 (batchId=191)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[file_with_header_footer]
 (batchId=192)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[gen_udf_example_add10]
 (batchId=190)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[import_exported_table]
 (batchId=191)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[infer_bucket_sort_bucketed_table]
 (batchId=192)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[infer_bucket_sort_map_operators]
 (batchId=191)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[infer_bucket_sort_merge]
 (batchId=191)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[infer_bucket_sort_num_buckets]
 (batchId=190)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[input16_cc]
 (batchId=192)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[insert_overwrite_directory2]
 (batchId=191)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[leftsemijoin_mr]
 (batchId=189)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[list_bucket_dml_10]
 (batchId=189)
org.apache.hadoop.hive.cli.TestMin

[jira] [Updated] (HIVE-20891) Call alter_partition in batch when dynamically loading partitions

2018-11-13 Thread Laszlo Pinter (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Pinter updated HIVE-20891:
-
Attachment: HIVE-20891.02.patch

> Call alter_partition in batch when dynamically loading partitions
> -
>
> Key: HIVE-20891
> URL: https://issues.apache.org/jira/browse/HIVE-20891
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Laszlo Pinter
>Assignee: Laszlo Pinter
>Priority: Major
> Attachments: HIVE-20891.01.patch, HIVE-20891.02.patch
>
>
> When dynamically loading partitions, setStatsPropAndAlterPartition() is 
> called for each partition one by one, resulting in unnecessary calls to the 
> metastore client. This whole logic can be changed to a single batched call. 
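A hedged sketch of the batching described above, assuming the IMetaStoreClient batch call alter_partitions(db, table, partitions) is available; the wrapper class and method names are illustrative and do not reflect the actual loader code.

{code:java}
// Illustrative sketch: collect the updated Partition objects and send them to
// the metastore in a single alter_partitions call instead of one RPC each.
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hive.metastore.IMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Partition;
import org.apache.thrift.TException;

public class BatchedPartitionAlterSketch {

  // Before (conceptually): one metastore round trip per partition.
  public static void alterOneByOne(IMetaStoreClient msc, String db, String table,
      List<Partition> updated) throws TException {
    for (Partition p : updated) {
      msc.alter_partition(db, table, p);
    }
  }

  // After: one batched round trip covering all updated partitions.
  public static void alterInBatch(IMetaStoreClient msc, String db, String table,
      List<Partition> updated) throws TException {
    msc.alter_partitions(db, table, new ArrayList<>(updated));
  }
}
{code}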



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20905) querying streaming table fails with out of memory exception

2018-11-13 Thread mahesh kumar behera (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685349#comment-16685349
 ] 

mahesh kumar behera commented on HIVE-20905:


[~thejas]

Can you please commit it?

> querying streaming table fails with out of memory exception
> ---
>
> Key: HIVE-20905
> URL: https://issues.apache.org/jira/browse/HIVE-20905
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20905.01.patch, HIVE-20905.02.patch, 
> HIVE-20905.03.patch
>
>
> The streaming app was run for 24 hrs, after which it went down due to an 
> authentication issue. The table was accessible for the first 12 hrs of the run; 
> however, querying the table currently fails with an OOM exception.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20703) Put dynamic sort partition optimization under cost based decision

2018-11-13 Thread Yongzhi Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685318#comment-16685318
 ] 

Yongzhi Chen commented on HIVE-20703:
-

[~vgarg], this optimizer is needed for Spark, so we'd better add it for Spark as well. 
Thanks

> Put dynamic sort partition optimization under cost based decision
> -
>
> Key: HIVE-20703
> URL: https://issues.apache.org/jira/browse/HIVE-20703
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20703.1.patch, HIVE-20703.10.patch, 
> HIVE-20703.11.patch, HIVE-20703.12.patch, HIVE-20703.2.patch, 
> HIVE-20703.3.patch, HIVE-20703.4.patch, HIVE-20703.5.patch, 
> HIVE-20703.6.patch, HIVE-20703.7.patch, HIVE-20703.8.patch, HIVE-20703.9.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19701) getDelegationTokenFromMetaStore doesn't need to be synchronized

2018-11-13 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685316#comment-16685316
 ] 

Sankar Hariappan commented on HIVE-19701:
-

[~thejas]
Can you please review the patch?

> getDelegationTokenFromMetaStore doesn't need to be synchronized
> ---
>
> Key: HIVE-19701
> URL: https://issues.apache.org/jira/browse/HIVE-19701
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Thejas M Nair
>Assignee: Sankar Hariappan
>Priority: Major
> Attachments: HIVE-19701.01.patch
>
>
> CLIService.getDelegationTokenFromMetaStore just invokes the metastore API via 
> a thread-local Hive object, so it doesn't have to be synchronized.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20760) Reducing memory overhead due to multiple HiveConfs

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685315#comment-16685315
 ] 

Hive QA commented on HIVE-20760:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
32s{color} | {color:blue} common in master has 65 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
11s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
38s{color} | {color:red} common generated 3 new + 65 unchanged - 0 fixed = 68 
total (was 65) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 11m  0s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:common |
|  |  org.apache.hadoop.hive.common.HiveConfProperties.clone() does not call 
super.clone()  At HiveConfProperties.java: At HiveConfProperties.java:[lines 
247-258] |
|  |  Inconsistent synchronization of 
org.apache.hadoop.hive.common.HiveConfProperties.interned; locked 61% of time  
Unsynchronized access at HiveConfProperties.java:61% of time  Unsynchronized 
access at HiveConfProperties.java:[line 83] |
|  |  org.apache.hadoop.hive.common.HiveConfProperties.getProperty(String, 
String) is unsynchronized, 
org.apache.hadoop.hive.common.HiveConfProperties.setProperty(String, String) is 
synchronized  At HiveConfProperties.java:String) is synchronized  At 
HiveConfProperties.java:[lines 98-105] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14912/dev-support/hive-personality.sh
 |
| git revision | master / 99d25f0 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14912/yetus/new-findbugs-common.html
 |
| modules | C: common U: common |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14912/yetus.txt |
| Powered by | Apache Yetus  http://yetus.apache.org |


This message was automatically generated.



> Reducing memory overhead due to multiple HiveConfs
> --
>
> Key: HIVE-20760
> URL: https://issues.apache.org/jira/browse/HIVE-20760
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Barnabas Maidics
>Assignee: Barnabas Maidics
>Priority: Major
> Attachments: HIVE-20760-1.patch, HIVE-20760-2.patch, 
> HIVE-20760-3.patch, HIVE-20760.4.patch, HIVE-20760.5.patch, HIVE-20760.patch, 
> hiveconf_interned.html, hiveconf_original.html
>
>
> The issue is that every Hive task has to load its own version of 
> {{HiveConf}}. When running with a large number of cores per executor (HoS), 
> there is a signific

[jira] [Commented] (HIVE-19701) getDelegationTokenFromMetaStore doesn't need to be synchronized

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685290#comment-16685290
 ] 

Hive QA commented on HIVE-19701:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12947955/HIVE-19701.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15539 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14911/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14911/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14911/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12947955 - PreCommit-HIVE-Build

> getDelegationTokenFromMetaStore doesn't need to be synchronized
> ---
>
> Key: HIVE-19701
> URL: https://issues.apache.org/jira/browse/HIVE-19701
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Thejas M Nair
>Assignee: Sankar Hariappan
>Priority: Major
> Attachments: HIVE-19701.01.patch
>
>
> CLIService.getDelegationTokenFromMetaStore just invokes the metastore API via 
> a thread-local Hive object, so it doesn't have to be synchronized.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20888) TxnHandler: sort() called on immutable lists

2018-11-13 Thread Igor Kryvenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Kryvenko updated HIVE-20888:
-
Status: Patch Available  (was: In Progress)

> TxnHandler: sort() called on immutable lists
> 
>
> Key: HIVE-20888
> URL: https://issues.apache.org/jira/browse/HIVE-20888
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Igor Kryvenko
>Priority: Major
> Attachments: HIVE-20888.01.patch
>
>
> {code}
> } else {
>   assert (!rqst.isSetSrcTxnToWriteIdList());
>   assert (rqst.isSetTxnIds());
>   txnIds = rqst.getTxnIds();
> }
> Collections.sort(txnIds); //easier to read logs and for assumption 
> done in replication flow
> {code}
> when the input comes from
> {code}
>   @Override
>   public long allocateTableWriteId(long txnId, String dbName, String 
> tableName) throws TException {
> return allocateTableWriteIdsBatch(Collections.singletonList(txnId), 
> dbName, tableName).get(0).getWriteId();
>   }
> {code}
> {code}
> java.lang.UnsupportedOperationException: null
> at java.util.AbstractList.set(AbstractList.java:132) ~[?:1.8.0]
> at java.util.AbstractList$ListItr.set(AbstractList.java:426) ~[?:1.8.0]
> at java.util.Collections.sort(Collections.java:170) ~[?:1.8.0]
> at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.allocateTableWriteIds(TxnHandler.java:1523)
>  ~[hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.allocate_table_write_ids(HiveMetaStore.java:7349)
>  ~[hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20888) TxnHandler: sort() called on immutable lists

2018-11-13 Thread Igor Kryvenko (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685229#comment-16685229
 ] 

Igor Kryvenko commented on HIVE-20888:
--

After an investigation, the easiest approach to fix this is just to add the check 
that Gopal proposed, since we can't differentiate whether we are handling a 
singleton list or a list of txn ids. 
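To make the proposed check concrete, here is a minimal, self-contained sketch; TxnHandler itself is not shown, and the guard simply avoids calling sort() on the immutable list returned by Collections.singletonList().

{code:java}
// Illustrative sketch: Collections.singletonList() returns an immutable list,
// so sorting it in place throws UnsupportedOperationException. A one-element
// list is already sorted, so the guard skips the sort; sorting a mutable copy
// is an alternative.
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SortGuardSketch {
  public static void main(String[] args) {
    List<Long> txnIds = Collections.singletonList(42L);

    // Guarded sort: only sort when there is more than one element.
    if (txnIds.size() > 1) {
      Collections.sort(txnIds);
    }

    // Alternative: always sort a mutable copy of the incoming list.
    List<Long> sortedCopy = new ArrayList<>(txnIds);
    Collections.sort(sortedCopy);

    System.out.println(sortedCopy);
  }
}
{code}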

> TxnHandler: sort() called on immutable lists
> 
>
> Key: HIVE-20888
> URL: https://issues.apache.org/jira/browse/HIVE-20888
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Igor Kryvenko
>Priority: Major
> Attachments: HIVE-20888.01.patch
>
>
> {code}
> } else {
>   assert (!rqst.isSetSrcTxnToWriteIdList());
>   assert (rqst.isSetTxnIds());
>   txnIds = rqst.getTxnIds();
> }
> Collections.sort(txnIds); //easier to read logs and for assumption 
> done in replication flow
> {code}
> when the input comes from
> {code}
>   @Override
>   public long allocateTableWriteId(long txnId, String dbName, String 
> tableName) throws TException {
> return allocateTableWriteIdsBatch(Collections.singletonList(txnId), 
> dbName, tableName).get(0).getWriteId();
>   }
> {code}
> {code}
> java.lang.UnsupportedOperationException: null
> at java.util.AbstractList.set(AbstractList.java:132) ~[?:1.8.0]
> at java.util.AbstractList$ListItr.set(AbstractList.java:426) ~[?:1.8.0]
> at java.util.Collections.sort(Collections.java:170) ~[?:1.8.0]
> at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.allocateTableWriteIds(TxnHandler.java:1523)
>  ~[hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.allocate_table_write_ids(HiveMetaStore.java:7349)
>  ~[hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20888) TxnHandler: sort() called on immutable lists

2018-11-13 Thread Igor Kryvenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Kryvenko updated HIVE-20888:
-
Attachment: HIVE-20888.01.patch

> TxnHandler: sort() called on immutable lists
> 
>
> Key: HIVE-20888
> URL: https://issues.apache.org/jira/browse/HIVE-20888
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Igor Kryvenko
>Priority: Major
> Attachments: HIVE-20888.01.patch
>
>
> {code}
> } else {
>   assert (!rqst.isSetSrcTxnToWriteIdList());
>   assert (rqst.isSetTxnIds());
>   txnIds = rqst.getTxnIds();
> }
> Collections.sort(txnIds); //easier to read logs and for assumption 
> done in replication flow
> {code}
> when the input comes from
> {code}
>   @Override
>   public long allocateTableWriteId(long txnId, String dbName, String 
> tableName) throws TException {
> return allocateTableWriteIdsBatch(Collections.singletonList(txnId), 
> dbName, tableName).get(0).getWriteId();
>   }
> {code}
> {code}
> java.lang.UnsupportedOperationException: null
> at java.util.AbstractList.set(AbstractList.java:132) ~[?:1.8.0]
> at java.util.AbstractList$ListItr.set(AbstractList.java:426) ~[?:1.8.0]
> at java.util.Collections.sort(Collections.java:170) ~[?:1.8.0]
> at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.allocateTableWriteIds(TxnHandler.java:1523)
>  ~[hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.allocate_table_write_ids(HiveMetaStore.java:7349)
>  ~[hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19701) getDelegationTokenFromMetaStore doesn't need to be synchronized

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685203#comment-16685203
 ] 

Hive QA commented on HIVE-19701:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
22s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
36s{color} | {color:blue} service in master has 48 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} service: The patch generated 0 new + 14 unchanged - 
1 fixed = 14 total (was 15) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 11m 27s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14911/dev-support/hive-personality.sh
 |
| git revision | master / 99d25f0 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: service U: service |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14911/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> getDelegationTokenFromMetaStore doesn't need to be synchronized
> ---
>
> Key: HIVE-19701
> URL: https://issues.apache.org/jira/browse/HIVE-19701
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Thejas M Nair
>Assignee: Sankar Hariappan
>Priority: Major
> Attachments: HIVE-19701.01.patch
>
>
> CLIService.getDelegationTokenFromMetaStore just invokes metastore api via 
> thread local Hive object. So, it doesn't have to be synchronized.
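For context, a small, self-contained illustration (plain Java, not the actual HiveServer2 code) of the underlying point: a method that only touches a per-thread object has no shared state for {{synchronized}} to protect.

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public final class ThreadLocalNoSyncExample {

  // Analogous situation: the call only uses a per-thread "client" object,
  // so there is no shared mutable state for synchronized to protect.
  private static final ThreadLocal<String> CLIENT =
      ThreadLocal.withInitial(() -> "client-of-" + Thread.currentThread().getName());

  // No 'synchronized' needed: each thread reads only its own CLIENT instance.
  static String getToken(String owner) {
    return "token(" + owner + ") via " + CLIENT.get();
  }

  public static void main(String[] args) throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    for (int i = 0; i < 4; i++) {
      final String owner = "user" + i;
      pool.submit(() -> System.out.println(getToken(owner)));
    }
    pool.shutdown();
    pool.awaitTermination(10, TimeUnit.SECONDS);
  }
}
{code}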



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19026) Configurable serde for druid kafka indexing

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685171#comment-16685171
 ] 

Hive QA commented on HIVE-19026:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12947952/HIVE-19026.7.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15542 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit]
 (batchId=171)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14910/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14910/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14910/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12947952 - PreCommit-HIVE-Build

> Configurable serde for druid kafka indexing 
> 
>
> Key: HIVE-19026
> URL: https://issues.apache.org/jira/browse/HIVE-19026
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19026.1.patch, HIVE-19026.2.patch, 
> HIVE-19026.3.patch, HIVE-19026.4.patch, HIVE-19026.5.patch, 
> HIVE-19026.6.patch, HIVE-19026.7.patch, HIVE-19026.patch
>
>
> https://issues.apache.org/jira/browse/HIVE-18976 introduces support for 
> setting up druid kafka-indexing service. 
> Input serialization should be configurable. For now we can say we only 
> support json, but there should be a mechanism to support other formats. 
> Perhaps we can make use of Hive's serde library, like LazySimpleSerde etc.
> Also add support to ingest the timestamp column when the timestamp column 
> name in the input is not `__time`. 
> e.g. 
> CREATE EXTERNAL TABLE druid_kafka_test_avro(__time timestamp , other 
> columns...)
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES (
>  "druid.timestamp.column" = "myinputColumnTimestamp"
>   other ppts 
>  ) 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work started] (HIVE-20330) HCatLoader cannot handle multiple InputJobInfo objects for a job with multiple inputs

2018-11-13 Thread Adam Szita (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-20330 started by Adam Szita.
-
> HCatLoader cannot handle multiple InputJobInfo objects for a job with 
> multiple inputs
> -
>
> Key: HIVE-20330
> URL: https://issues.apache.org/jira/browse/HIVE-20330
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: Adam Szita
>Assignee: Adam Szita
>Priority: Major
>
> While running performance tests on Pig (0.12 and 0.17) we've observed a huge 
> performance drop in a workload that has multiple inputs from HCatLoader.
> The reason is that for a particular MR job with multiple Hive tables as 
> input, Pig calls {{setLocation}} on each {{LoaderFunc (HCatLoader)}} instance 
> but only one table's information (InputJobInfo instance) gets tracked in the 
> JobConf. (This is under config key {{HCatConstants.HCAT_KEY_JOB_INFO}}).
> Any such call overwrites preexisting values, and thus only the last table's 
> information will be considered when Pig calls {{getStatistics}} to calculate 
> and estimate the required reducer count.
> In cases when there are 2 input tables, 256GB and 1MB in size respectively, 
> Pig will query the size information from HCat for both of them, but it will 
> either see 1MB+1MB=2MB or 256GB+256GB=0.5TB depending on input order in the 
> execution plan's DAG.
> It should of course see 256.00097GB in total and use 257 reducers by default 
> accordingly.
> In unlucky cases this will be seen as 2MB and 1 reducer will have to struggle 
> with the actual 256.00097GB...
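A minimal sketch of the overwrite-vs-accumulate behaviour described above, using a plain map in place of the JobConf; the key value and method names here are assumptions for illustration, not the actual HCatalog code.

{code:java}
import java.util.HashMap;
import java.util.Map;

public final class MultiInputJobInfoSketch {

  // Stand-in for the JobConf: a single string-valued property map.
  private static final Map<String, String> conf = new HashMap<>();
  private static final String HCAT_KEY_JOB_INFO = "hcat.job.info";   // assumed key name

  // Current behaviour described above: each setLocation() call overwrites the key,
  // so only the last table's InputJobInfo is visible to getStatistics().
  static void setLocationOverwriting(String serializedInputJobInfo) {
    conf.put(HCAT_KEY_JOB_INFO, serializedInputJobInfo);
  }

  // One possible direction for a fix (sketch only): accumulate one entry per input
  // table so statistics can be summed over all of them.
  static void setLocationAccumulating(String serializedInputJobInfo) {
    conf.merge(HCAT_KEY_JOB_INFO, serializedInputJobInfo, (old, add) -> old + "," + add);
  }

  public static void main(String[] args) {
    setLocationOverwriting("InputJobInfo(table_256GB)");
    setLocationOverwriting("InputJobInfo(table_1MB)");
    System.out.println(conf.get(HCAT_KEY_JOB_INFO));   // only the 1MB table survives

    conf.clear();
    setLocationAccumulating("InputJobInfo(table_256GB)");
    setLocationAccumulating("InputJobInfo(table_1MB)");
    System.out.println(conf.get(HCAT_KEY_JOB_INFO));   // both tables visible to getStatistics()
  }
}
{code}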



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19026) Configurable serde for druid kafka indexing

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685170#comment-16685170
 ] 

Hive QA commented on HIVE-19026:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
49s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
37s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
32s{color} | {color:blue} common in master has 65 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
24s{color} | {color:blue} druid-handler in master has 4 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
18s{color} | {color:blue} itests/qtest-druid in master has 6 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
41s{color} | {color:blue} itests/util in master has 48 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
42s{color} | {color:blue} ql in master has 2316 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
49s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
53s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m  
9s{color} | {color:red} druid-handler: The patch generated 39 new + 11 
unchanged - 2 fixed = 50 total (was 13) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
11s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 64m 51s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  
xml  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14910/dev-support/hive-personality.sh
 |
| git revision | master / 99d25f0 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14910/yetus/diff-checkstyle-druid-handler.txt
 |
| modules | C: common . druid-handler itests itests/qtest-druid itests/util ql 
U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14910/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Configurable serde for druid kafka indexing 
> 
>
> Key: HIVE-19026
> URL: https://issues.apache.org/jira/browse/HIVE-19026
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Prio

[jira] [Commented] (HIVE-20905) querying streaming table fails with out of memory exception

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685096#comment-16685096
 ] 

Hive QA commented on HIVE-20905:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12947947/HIVE-20905.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15539 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14909/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14909/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14909/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12947947 - PreCommit-HIVE-Build

> querying streaming table fails with out of memory exception
> ---
>
> Key: HIVE-20905
> URL: https://issues.apache.org/jira/browse/HIVE-20905
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20905.01.patch, HIVE-20905.02.patch, 
> HIVE-20905.03.patch
>
>
> The streaming app was run for 24 hrs, after which it went down due to an 
> authentication issue. The table was accessible 12 hrs into the run; however, 
> querying the table currently fails with an OOM exception.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20682) Async query execution can potentially fail if shared sessionHive is closed by master thread.

2018-11-13 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20682:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

> Async query execution can potentially fail if shared sessionHive is closed by 
> master thread.
> 
>
> Key: HIVE-20682
> URL: https://issues.apache.org/jira/browse/HIVE-20682
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20682.01.patch, HIVE-20682.02.patch, 
> HIVE-20682.03.patch, HIVE-20682.04.patch, HIVE-20682.05.patch, 
> HIVE-20682.06.patch
>
>
> *Problem description:*
> The master thread initializes the *sessionHive* object in the *HiveSessionImpl* 
> class when we open a new session for a client connection, and by default all 
> queries from this connection share the same sessionHive object. 
> If the master thread executes a *synchronous* query, it closes the 
> sessionHive object (referred to via the thread local hiveDb) if 
> {{Hive.isCompatible}} returns false and sets a new Hive object in the thread 
> local hiveDb, but doesn't change the sessionHive object in the session. Whereas 
> *asynchronous* query execution via async threads never closes the sessionHive 
> object; it just creates a new one if needed and sets it as their thread 
> local hiveDb.
> So, the problem can happen when an *asynchronous* query being 
> executed by async threads refers to the sessionHive object and the master thread 
> receives a *synchronous* query that closes the same sessionHive object. 
> Also, each query execution overwrites the thread local hiveDb object with the 
> sessionHive object, which potentially leaks a metastore connection if the 
> previous synchronous query execution re-created the Hive object.
> *Possible Fix:*
> The *sessionHive* object could be shared by multiple threads, and so it 
> shouldn't be allowed to be closed by any query execution thread when it 
> re-creates the Hive object due to changes in Hive configuration. But the Hive 
> objects created by query execution threads should be closed when the thread 
> exits.
> So, it is proposed to have an *isAllowClose* flag (default: *true*) in the Hive 
> object, which should be set to *false* for *sessionHive*; that object would be 
> forcefully closed when the session is closed or released.
> Also, when we reset the *sessionHive* object with a new one due to changes in 
> *sessionConf*, the old one should be closed when no async thread is referring 
> to it. This can be done using the "*finalize*" method of the Hive object, where we 
> can close the HMS connection when the Hive object is garbage collected.
> cc [~pvary]
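A minimal sketch of the proposed *isAllowClose* behaviour, assuming names from the description above (not the committed patch):

{code:java}
public class HiveClientSketch implements AutoCloseable {

  private boolean isAllowClose = true;   // default: query-thread Hive objects may be closed

  public void setAllowClose(boolean allowClose) {
    this.isAllowClose = allowClose;      // set to false for the shared sessionHive
  }

  @Override
  public void close() {
    if (!isAllowClose) {
      return;                            // shared sessionHive: ignore close() from query threads
    }
    closeMetastoreConnection();
  }

  // Called when the session itself is closed or released.
  public void forceClose() {
    closeMetastoreConnection();
  }

  private void closeMetastoreConnection() {
    System.out.println("HMS connection closed");
  }

  public static void main(String[] args) {
    HiveClientSketch sessionHive = new HiveClientSketch();
    sessionHive.setAllowClose(false);
    sessionHive.close();       // ignored: the object is shared across threads
    sessionHive.forceClose();  // prints "HMS connection closed"
  }
}
{code}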



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20905) querying streaming table fails with out of memory exception

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685056#comment-16685056
 ] 

Hive QA commented on HIVE-20905:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
33s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
24s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
39s{color} | {color:blue} ql in master has 2316 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  
2s{color} | {color:blue} standalone-metastore/metastore-server in master has 
185 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
12s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 51s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14909/dev-support/hive-personality.sh
 |
| git revision | master / af40170 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql standalone-metastore/metastore-server U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14909/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> querying streaming table fails with out of memory exception
> ---
>
> Key: HIVE-20905
> URL: https://issues.apache.org/jira/browse/HIVE-20905
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20905.01.patch, HIVE-20905.02.patch, 
> HIVE-20905.03.patch
>
>
> The streaming app was run for 24 hrs, after which it went down due to an 
> authentication issue. The table was accessible 12 hrs into the run; however, 
> querying the table currently fails with an OOM exception.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20682) Async query execution can potentially fail if shared sessionHive is closed by master thread.

2018-11-13 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685041#comment-16685041
 ] 

Sankar Hariappan commented on HIVE-20682:
-

Thanks [~anishek] for the review!
The 06.patch is committed to master!

> Async query execution can potentially fail if shared sessionHive is closed by 
> master thread.
> 
>
> Key: HIVE-20682
> URL: https://issues.apache.org/jira/browse/HIVE-20682
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20682.01.patch, HIVE-20682.02.patch, 
> HIVE-20682.03.patch, HIVE-20682.04.patch, HIVE-20682.05.patch, 
> HIVE-20682.06.patch
>
>
> *Problem description:*
> The master thread initializes the *sessionHive* object in the *HiveSessionImpl* 
> class when we open a new session for a client connection, and by default all 
> queries from this connection share the same sessionHive object. 
> If the master thread executes a *synchronous* query, it closes the 
> sessionHive object (referred to via the thread local hiveDb) if 
> {{Hive.isCompatible}} returns false and sets a new Hive object in the thread 
> local hiveDb, but doesn't change the sessionHive object in the session. Whereas 
> *asynchronous* query execution via async threads never closes the sessionHive 
> object; it just creates a new one if needed and sets it as their thread 
> local hiveDb.
> So, the problem can happen when an *asynchronous* query being 
> executed by async threads refers to the sessionHive object and the master thread 
> receives a *synchronous* query that closes the same sessionHive object. 
> Also, each query execution overwrites the thread local hiveDb object with the 
> sessionHive object, which potentially leaks a metastore connection if the 
> previous synchronous query execution re-created the Hive object.
> *Possible Fix:*
> The *sessionHive* object could be shared by multiple threads, and so it 
> shouldn't be allowed to be closed by any query execution thread when it 
> re-creates the Hive object due to changes in Hive configuration. But the Hive 
> objects created by query execution threads should be closed when the thread 
> exits.
> So, it is proposed to have an *isAllowClose* flag (default: *true*) in the Hive 
> object, which should be set to *false* for *sessionHive*; that object would be 
> forcefully closed when the session is closed or released.
> Also, when we reset the *sessionHive* object with a new one due to changes in 
> *sessionConf*, the old one should be closed when no async thread is referring 
> to it. This can be done using the "*finalize*" method of the Hive object, where we 
> can close the HMS connection when the Hive object is garbage collected.
> cc [~pvary]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18661) CachedStore: Use metastore notification log events to update cache

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685020#comment-16685020
 ] 

Hive QA commented on HIVE-18661:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12947940/HIVE-18661.07.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14908/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14908/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14908/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12947940/HIVE-18661.07.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12947940 - PreCommit-HIVE-Build

> CachedStore: Use metastore notification log events to update cache
> --
>
> Key: HIVE-18661
> URL: https://issues.apache.org/jira/browse/HIVE-18661
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vaibhav Gumashta
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-18661.02.patch, HIVE-18661.03.patch, 
> HIVE-18661.04.patch, HIVE-18661.05.patch, HIVE-18661.06.patch, 
> HIVE-18661.07.patch
>
>
> Currently, a background thread updates the entire cache, which is pretty 
> inefficient. We capture metadata updates in the NOTIFICATION_LOG table, 
> which is already used by the Replication work. We should have the background 
> thread apply these notifications to incrementally update the cache.
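A minimal, self-contained sketch of the incremental approach, with a hard-coded event source standing in for the NOTIFICATION_LOG table (names and event handling are assumptions, not the actual CachedStore code):

{code:java}
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public final class IncrementalCacheUpdaterSketch {

  /** Minimal stand-in for a row of the metastore NOTIFICATION_LOG table. */
  static final class NotificationEvent {
    final long eventId;
    final String eventType;
    final String tableName;
    NotificationEvent(long eventId, String eventType, String tableName) {
      this.eventId = eventId;
      this.eventType = eventType;
      this.tableName = tableName;
    }
  }

  private final Map<String, String> tableCache = new ConcurrentHashMap<>();
  private long lastProcessedEventId = 0;

  // Hypothetical event source; in Hive this would read NOTIFICATION_LOG via the
  // metastore client rather than returning a hard-coded list.
  private List<NotificationEvent> fetchEventsAfter(long eventId) {
    return Collections.singletonList(new NotificationEvent(eventId + 1, "ALTER_TABLE", "t1"));
  }

  /** One pass of the background thread: apply only new events instead of rebuilding the cache. */
  void refreshIncrementally() {
    for (NotificationEvent e : fetchEventsAfter(lastProcessedEventId)) {
      if ("CREATE_TABLE".equals(e.eventType) || "ALTER_TABLE".equals(e.eventType)) {
        tableCache.put(e.tableName, "refreshed@" + e.eventId);
      } else if ("DROP_TABLE".equals(e.eventType)) {
        tableCache.remove(e.tableName);
      } // other event types are ignored in this sketch
      lastProcessedEventId = e.eventId;
    }
  }

  public static void main(String[] args) {
    IncrementalCacheUpdaterSketch updater = new IncrementalCacheUpdaterSketch();
    updater.refreshIncrementally();
    System.out.println(updater.tableCache);   // {t1=refreshed@1}
  }
}
{code}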



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18661) CachedStore: Use metastore notification log events to update cache

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685019#comment-16685019
 ] 

Hive QA commented on HIVE-18661:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12947940/HIVE-18661.07.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 15523 tests 
executed
*Failed tests:*
{noformat}
TestAlterTableMetadata - did not produce a TEST-*.xml file (likely timed out) 
(batchId=250)
TestCopyUtils - did not produce a TEST-*.xml file (likely timed out) 
(batchId=250)
TestLocationQueries - did not produce a TEST-*.xml file (likely timed out) 
(batchId=250)
TestReplAcidTablesWithJsonMessage - did not produce a TEST-*.xml file (likely 
timed out) (batchId=250)
TestReplIncrementalLoadAcidTablesWithJsonMessage - did not produce a TEST-*.xml 
file (likely timed out) (batchId=250)
TestSemanticAnalyzerHookLoading - did not produce a TEST-*.xml file (likely 
timed out) (batchId=250)
org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testComplexQuery (batchId=258)
org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testDataTypes (batchId=258)
org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testEscapedStrings (batchId=258)
org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testKillQuery (batchId=258)
org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testLlapInputFormatEndToEnd 
(batchId=258)
org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testNonAsciiStrings (batchId=258)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14907/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14907/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14907/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12947940 - PreCommit-HIVE-Build

> CachedStore: Use metastore notification log events to update cache
> --
>
> Key: HIVE-18661
> URL: https://issues.apache.org/jira/browse/HIVE-18661
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vaibhav Gumashta
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-18661.02.patch, HIVE-18661.03.patch, 
> HIVE-18661.04.patch, HIVE-18661.05.patch, HIVE-18661.06.patch, 
> HIVE-18661.07.patch
>
>
> Currently, a background thread updates the entire cache, which is pretty 
> inefficient. We capture metadata updates in the NOTIFICATION_LOG table, 
> which is already used by the Replication work. We should have the background 
> thread apply these notifications to incrementally update the cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20682) Async query execution can potentially fail if shared sessionHive is closed by master thread.

2018-11-13 Thread anishek (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685014#comment-16685014
 ] 

anishek commented on HIVE-20682:


+1


> Async query execution can potentially fail if shared sessionHive is closed by 
> master thread.
> 
>
> Key: HIVE-20682
> URL: https://issues.apache.org/jira/browse/HIVE-20682
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20682.01.patch, HIVE-20682.02.patch, 
> HIVE-20682.03.patch, HIVE-20682.04.patch, HIVE-20682.05.patch, 
> HIVE-20682.06.patch
>
>
> *Problem description:*
> The master thread initializes the *sessionHive* object in the *HiveSessionImpl* 
> class when we open a new session for a client connection, and by default all 
> queries from this connection share the same sessionHive object. 
> If the master thread executes a *synchronous* query, it closes the 
> sessionHive object (referred to via the thread local hiveDb) if 
> {{Hive.isCompatible}} returns false and sets a new Hive object in the thread 
> local hiveDb, but doesn't change the sessionHive object in the session. Whereas 
> *asynchronous* query execution via async threads never closes the sessionHive 
> object; it just creates a new one if needed and sets it as their thread 
> local hiveDb.
> So, the problem can happen when an *asynchronous* query being 
> executed by async threads refers to the sessionHive object and the master thread 
> receives a *synchronous* query that closes the same sessionHive object. 
> Also, each query execution overwrites the thread local hiveDb object with the 
> sessionHive object, which potentially leaks a metastore connection if the 
> previous synchronous query execution re-created the Hive object.
> *Possible Fix:*
> The *sessionHive* object could be shared by multiple threads, and so it 
> shouldn't be allowed to be closed by any query execution thread when it 
> re-creates the Hive object due to changes in Hive configuration. But the Hive 
> objects created by query execution threads should be closed when the thread 
> exits.
> So, it is proposed to have an *isAllowClose* flag (default: *true*) in the Hive 
> object, which should be set to *false* for *sessionHive*; that object would be 
> forcefully closed when the session is closed or released.
> Also, when we reset the *sessionHive* object with a new one due to changes in 
> *sessionConf*, the old one should be closed when no async thread is referring 
> to it. This can be done using the "*finalize*" method of the Hive object, where we 
> can close the HMS connection when the Hive object is garbage collected.
> cc [~pvary]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20760) Reducing memory overhead due to multiple HiveConfs

2018-11-13 Thread Barnabas Maidics (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barnabas Maidics updated HIVE-20760:

Attachment: HIVE-20760.5.patch
Status: Patch Available  (was: Open)

> Reducing memory overhead due to multiple HiveConfs
> --
>
> Key: HIVE-20760
> URL: https://issues.apache.org/jira/browse/HIVE-20760
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Barnabas Maidics
>Assignee: Barnabas Maidics
>Priority: Major
> Attachments: HIVE-20760-1.patch, HIVE-20760-2.patch, 
> HIVE-20760-3.patch, HIVE-20760.4.patch, HIVE-20760.5.patch, HIVE-20760.patch, 
> hiveconf_interned.html, hiveconf_original.html
>
>
> The issue is that every Hive task has to load its own version of 
> {{HiveConf}}. When running with a large number of cores per executor (HoS), 
> there is a significant (~10%) amount of memory wasted due to this 
> duplication. 
> I looked into the problem and found a way to reduce the overhead caused by 
> the multiple HiveConf objects.
> I've created an implementation of Properties, somewhat similar to 
> CopyOnFirstWriteProperties. CopyOnFirstWriteProperties can't be used to solve 
> this problem, because it drops the interned Properties right after we add a 
> new property.
> So my implementation looks like this:
>  * When we create a new HiveConf from an existing one (copy constructor), we 
> change the properties object stored by HiveConf to the new Properties 
> implementation (HiveConfProperties). We have 2 possible ways to do this. 
> Either we change the visibility of the properties field in the ancestor class 
> (Configuration which comes from hadoop) to protected, or a simpler way is to 
> just change the type using reflection.
>  * HiveConfProperties instantly interns the given properties. After this, 
> every time we add a new property to HiveConf, we add it to an additional 
> Properties object. This way if we create multiple HiveConf with the same base 
> properties, they will use the same Properties object but each session/task 
> can add its own unique properties.
>  * Getting a property from HiveConfProperties would look like this: (I stored 
> the non-interned properties in super class)
>                 String property=super.getProperty(key);
>                 if (property == null) property= interned.getProperty(key);
>                 return property;
> Running some tests showed that the interning works (with 50 connections to 
> HiveServer2, heapdumps created after sessions are created for queries): 
> Overall memory:
>          original: 34,599K              interned: 20,582K
> Retained memory of HiveConfs:
>         original: 16,366K               interned: 10,804K
> I've attached the JXray reports of the heap dumps.
> What are your thoughts about this solution? 
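A minimal, self-contained sketch of the HiveConfProperties idea described above (class name and behaviour assumed from the description, not the actual patch):

{code:java}
import java.util.Properties;

/**
 * Shared, interned base properties plus a small per-copy overlay for
 * session/task-specific keys, so copies don't duplicate the base Properties.
 */
public class HiveConfPropertiesSketch extends Properties {

  private final Properties interned;   // shared, read-only base properties

  public HiveConfPropertiesSketch(Properties interned) {
    this.interned = interned;          // not copied: every HiveConf copy shares this object
  }

  @Override
  public String getProperty(String key) {
    String property = super.getProperty(key);   // overlay first (keys added after the copy)
    if (property == null) {
      property = interned.getProperty(key);     // fall back to the shared base
    }
    return property;
  }

  public static void main(String[] args) {
    Properties base = new Properties();
    base.setProperty("hive.execution.engine", "spark");

    HiveConfPropertiesSketch copy = new HiveConfPropertiesSketch(base);
    copy.setProperty("hive.session.id", "abc-123");                 // goes to the overlay only

    System.out.println(copy.getProperty("hive.execution.engine")); // spark (from shared base)
    System.out.println(copy.getProperty("hive.session.id"));       // abc-123 (from overlay)
    System.out.println(base.getProperty("hive.session.id"));       // null (base untouched)
  }
}
{code}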



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20760) Reducing memory overhead due to multiple HiveConfs

2018-11-13 Thread Barnabas Maidics (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barnabas Maidics updated HIVE-20760:

Status: Open  (was: Patch Available)

> Reducing memory overhead due to multiple HiveConfs
> --
>
> Key: HIVE-20760
> URL: https://issues.apache.org/jira/browse/HIVE-20760
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Barnabas Maidics
>Assignee: Barnabas Maidics
>Priority: Major
> Attachments: HIVE-20760-1.patch, HIVE-20760-2.patch, 
> HIVE-20760-3.patch, HIVE-20760.4.patch, HIVE-20760.patch, 
> hiveconf_interned.html, hiveconf_original.html
>
>
> The issue is that every Hive task has to load its own version of 
> {{HiveConf}}. When running with a large number of cores per executor (HoS), 
> there is a significant (~10%) amount of memory wasted due to this 
> duplication. 
> I looked into the problem and found a way to reduce the overhead caused by 
> the multiple HiveConf objects.
> I've created an implementation of Properties, somewhat similar to 
> CopyOnFirstWriteProperties. CopyOnFirstWriteProperties can't be used to solve 
> this problem, because it drops the interned Properties right after we add a 
> new property.
> So my implementation looks like this:
>  * When we create a new HiveConf from an existing one (copy constructor), we 
> change the properties object stored by HiveConf to the new Properties 
> implementation (HiveConfProperties). We have 2 possible ways to do this. 
> Either we change the visibility of the properties field in the ancestor class 
> (Configuration which comes from hadoop) to protected, or a simpler way is to 
> just change the type using reflection.
>  * HiveConfProperties instantly interns the given properties. After this, 
> every time we add a new property to HiveConf, we add it to an additional 
> Properties object. This way if we create multiple HiveConf with the same base 
> properties, they will use the same Properties object but each session/task 
> can add its own unique properties.
>  * Getting a property from HiveConfProperties would look like this: (I stored 
> the non-interned properties in super class)
>                 String property=super.getProperty(key);
>                 if (property == null) property= interned.getProperty(key);
>                 return property;
> Running some tests showed that the interning works (with 50 connections to 
> HiveServer2, heapdumps created after sessions are created for queries): 
> Overall memory:
>          original: 34,599K              interned: 20,582K
> Retained memory of HiveConfs:
>         original: 16,366K               interned: 10,804K
> I've attached the JXray reports of the heap dumps.
> What are your thoughts about this solution? 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18661) CachedStore: Use metastore notification log events to update cache

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16684962#comment-16684962
 ] 

Hive QA commented on HIVE-18661:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
12s{color} | {color:blue} standalone-metastore/metastore-common in master has 
29 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
37s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  
6s{color} | {color:blue} standalone-metastore/metastore-server in master has 
185 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
29s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
38s{color} | {color:red} hive-unit in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
39s{color} | {color:red} hive-unit in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 38s{color} 
| {color:red} hive-unit in the patch failed. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
15s{color} | {color:red} itests/hive-unit: The patch generated 11 new + 0 
unchanged - 0 fixed = 11 total (was 0) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 3 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
35s{color} | {color:red} hive-unit in the patch failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m  
9s{color} | {color:red} standalone-metastore/metastore-server generated 1 new + 
185 unchanged - 0 fixed = 186 total (was 185) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 35s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:standalone-metastore/metastore-server |
|  |  Load of known null value in 
org.apache.hadoop.hive.metastore.cache.CachedStore.get_aggr_stats_for(String, 
String, String, List, List, String)  At CachedStore.java:in 
org.apache.hadoop.hive.metastore.cache.CachedStore.get_aggr_stats_for(String, 
String, String, List, List, String)  At CachedStore.java:[line 2128] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14907/dev-support/hive-personality.sh
 |
| git revision | master / af40170 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| mvninstall | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14907/yetus/patch-mvninstall-itests_hive-unit.txt
 |
| compile | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14907/yetus/patch-compile-itests_hive-unit.txt
 |
| javac | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14907/yetus/patch-compile-itests_hive-unit.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14907/yetus/diff-checkstyle-itests_h

[jira] [Updated] (HIVE-19701) getDelegationTokenFromMetaStore doesn't need to be synchronized

2018-11-13 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-19701:

Status: Patch Available  (was: Open)

> getDelegationTokenFromMetaStore doesn't need to be synchronized
> ---
>
> Key: HIVE-19701
> URL: https://issues.apache.org/jira/browse/HIVE-19701
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Thejas M Nair
>Assignee: Sankar Hariappan
>Priority: Major
> Attachments: HIVE-19701.01.patch
>
>
> CLIService.getDelegationTokenFromMetaStore just invokes metastore api via 
> thread local Hive object. So, it doesn't have to be synchronized.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16684929#comment-16684929
 ] 

Hive QA commented on HIVE-20512:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12947932/HIVE-20512.92.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15539 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14906/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14906/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14906/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12947932 - PreCommit-HIVE-Build

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch, 
> HIVE-20512.6.patch, HIVE-20512.7.patch, HIVE-20512.8.patch, 
> HIVE-20512.9.patch, HIVE-20512.91.patch, HIVE-20512.92.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
> // A very simple counter to keep track of number of rows processed by the
> // reducer. It dumps
> // every 1 million times, and quickly before that
> if (currentThreshold >= 100) {
>   return currentThreshold + 100;
> }
> return 10 * currentThreshold;
>   }
> {code}
> The issue is that, after a while, the 10x increase factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.
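A minimal sketch of interval-based progress logging as suggested above (names and interval are assumptions, not the actual patch):

{code:java}
import java.util.concurrent.TimeUnit;

public final class IntervalLoggerSketch {

  private final long logIntervalNanos;
  private long rowCount = 0;
  private long lastLogTime = System.nanoTime();

  IntervalLoggerSketch(long logInterval, TimeUnit unit) {
    this.logIntervalNanos = unit.toNanos(logInterval);
  }

  // Log progress at most once per interval, regardless of how many rows arrive,
  // so a slow or hung task still produces periodic log lines.
  void processRow(Object row) {
    rowCount++;
    long now = System.nanoTime();
    if (now - lastLogTime >= logIntervalNanos) {
      long usedMem = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
      System.out.println("processed " + rowCount + " rows, used memory = " + usedMem + " bytes");
      lastLogTime = now;
    }
  }

  public static void main(String[] args) {
    // A 100 ms interval keeps the demo short; a real handler would use something
    // closer to 30 seconds.
    IntervalLoggerSketch handler = new IntervalLoggerSketch(100, TimeUnit.MILLISECONDS);
    for (int i = 0; i < 50_000_000; i++) {
      handler.processRow(i);
    }
  }
}
{code}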



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19701) getDelegationTokenFromMetaStore doesn't need to be synchronized

2018-11-13 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan reassigned HIVE-19701:
---

Assignee: Sankar Hariappan

> getDelegationTokenFromMetaStore doesn't need to be synchronized
> ---
>
> Key: HIVE-19701
> URL: https://issues.apache.org/jira/browse/HIVE-19701
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Sankar Hariappan
>Priority: Major
>
> or so it seems



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19701) getDelegationTokenFromMetaStore doesn't need to be synchronized

2018-11-13 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-19701:

Description: CLIService.getDelegationTokenFromMetaStore just invokes 
metastore api via thread local Hive object. So, it doesn't have to be 
synchronized.  (was: or so it seems)

> getDelegationTokenFromMetaStore doesn't need to be synchronized
> ---
>
> Key: HIVE-19701
> URL: https://issues.apache.org/jira/browse/HIVE-19701
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Sankar Hariappan
>Priority: Major
>
> CLIService.getDelegationTokenFromMetaStore just invokes metastore api via 
> thread local Hive object. So, it doesn't have to be synchronized.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19701) getDelegationTokenFromMetaStore doesn't need to be synchronized

2018-11-13 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-19701:

Component/s: HiveServer2

> getDelegationTokenFromMetaStore doesn't need to be synchronized
> ---
>
> Key: HIVE-19701
> URL: https://issues.apache.org/jira/browse/HIVE-19701
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Thejas M Nair
>Assignee: Sankar Hariappan
>Priority: Major
> Attachments: HIVE-19701.01.patch
>
>
> CLIService.getDelegationTokenFromMetaStore just invokes metastore api via 
> thread local Hive object. So, it doesn't have to be synchronized.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19701) getDelegationTokenFromMetaStore doesn't need to be synchronized

2018-11-13 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-19701:

Attachment: HIVE-19701.01.patch

> getDelegationTokenFromMetaStore doesn't need to be synchronized
> ---
>
> Key: HIVE-19701
> URL: https://issues.apache.org/jira/browse/HIVE-19701
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Thejas M Nair
>Assignee: Sankar Hariappan
>Priority: Major
> Attachments: HIVE-19701.01.patch
>
>
> CLIService.getDelegationTokenFromMetaStore just invokes metastore api via 
> thread local Hive object. So, it doesn't have to be synchronized.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

