[jira] [Commented] (HIVE-17315) Make the DataSource used by the DataNucleus in the HMS configurable using Hive properties

2017-08-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126852#comment-16126852
 ] 

Thejas M Nair commented on HIVE-17315:
--

bq. Even if the current properties are safe, I would be afraid of what could be 
added by future versions of the connection pool implementations, since we would 
be exposing these settings the moment we upgrade the library.
Other than the password, I don't expect any other confidential property to be 
used by a connection pool.
Hiding properties by default would make debugging issues harder. This 
is also counter to the practice we follow with other types of configs such as 
HDFS, YARN, Tez, Spark, etc. (i.e., we don't hide those properties by default).
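
To make the trade-off above concrete, here is a minimal sketch of masking only password-like keys when a configuration is displayed, leaving every other pool property visible for debugging. The property names and the `redact_for_display` helper are hypothetical illustrations, not Hive's implementation.

```python
# Assumption: a key is treated as sensitive if it contains "password".
SENSITIVE_MARKERS = ("password",)


def redact_for_display(props, markers=SENSITIVE_MARKERS):
    """Return a copy of props with the values of sensitive keys masked."""
    redacted = {}
    for key, value in props.items():
        is_sensitive = any(marker in key.lower() for marker in markers)
        redacted[key] = "***" if is_sensitive else value
    return redacted


if __name__ == "__main__":
    # Hypothetical, illustrative property names.
    sample = {
        "hikari.maximumPoolSize": "10",
        "hikari.password": "secret",
    }
    for key, value in sorted(redact_for_display(sample).items()):
        print(f"{key}={value}")
```

With this approach only the password value is hidden, so a newly introduced pool setting stays visible unless it matches a marker.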


> Make the DataSource used by the DataNucleus in the HMS configurable using 
> Hive properties
> -
>
> Key: HIVE-17315
> URL: https://issues.apache.org/jira/browse/HIVE-17315
> Project: Hive
> Issue Type: New Feature
> Affects Versions: 3.0.0
> Reporter: Barna Zsombor Klara
> Assignee: Barna Zsombor Klara
>
> Currently we may use several connection pool implementations in the backend 
> (HikariCP, DBCP, BoneCP), but these can only be configured using proprietary XML 
> files and not through hive-site.xml like DataNucleus.
> We should make them configurable just like DataNucleus, by allowing Hive 
> properties prefixed with hikari, dbcp, or bonecp to be set in hive-site.xml. 
> However, since these configurations may contain sensitive information 
> (passwords), these properties should not be displayable or manually settable.
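
The prefix-based configuration proposed in the description could look roughly like the sketch below: collect every property whose name starts with the chosen pool's prefix and pass the remainder of the name through to the pool. The `pool_properties` helper and the sample keys are assumptions made for illustration; the actual patch may resolve properties differently.

```python
# Prefixes named in the issue description.
POOL_PREFIXES = ("hikari", "dbcp", "bonecp")


def pool_properties(hive_props, pool):
    """Collect properties named '<pool>.<key>' and strip the prefix."""
    if pool not in POOL_PREFIXES:
        raise ValueError(f"unknown connection pool: {pool}")
    prefix = pool + "."
    return {
        key[len(prefix):]: value
        for key, value in hive_props.items()
        if key.startswith(prefix)
    }


if __name__ == "__main__":
    # Hypothetical contents of a parsed hive-site.xml.
    props = {
        "hikari.maximumPoolSize": "10",
        "dbcp.maxTotal": "8",
        "javax.jdo.option.ConnectionURL": "jdbc:derby:",
    }
    print(pool_properties(props, "hikari"))
```

Properties that belong to another pool or to DataNucleus itself are ignored, so switching the pool type does not require removing the other pools' settings.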



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-14 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17100:

Attachment: HIVE-17100.02.patch

> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, repl
> Affects Versions: 2.1.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch, HIVE-17100.02.patch
>
>
> It is necessary to log the progress of the replication tasks in a structured 
> manner as follows.
> *+Bootstrap Dump:+*
> * At the start of bootstrap dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (BOOTSTRAP)
> * (Estimated) Total number of tables/views to dump
> * (Estimated) Total number of functions to dump.
> * Dump Start Time{color}
> * After each table dump, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table dump end time
> * Table dump progress. Format is Table sequence no/(Estimated) Total number 
> of tables and views.{color}
> * After each function dump, will add a log as follows
> {color:#59afe1}* Function Name
> * Function dump end time
> * Function dump progress. Format is Function sequence no/(Estimated) Total 
> number of functions.{color}
> * After completion of all dumps, will add a log as follows to consolidate the 
> dump.
> {color:#59afe1}* Database Name.
> * Dump Type (BOOTSTRAP).
> * Dump End Time.
> * (Actual) Total number of tables/views dumped.
> * (Actual) Total number of functions dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The actual and estimated number of tables/functions may not match if 
> any table/function is dropped while the dump is in progress.
> *+Bootstrap Load:+*
> * At the start of bootstrap load, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump directory
> * Load Type (BOOTSTRAP)
> * Total number of tables/views to load
> * Total number of functions to load.
> * Load Start Time{color}
> * After each table load, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table load completion time
> * Table load progress. Format is Table sequence no/Total number of tables and 
> views.{color}
> * After each function load, will add a log as follows
> {color:#59afe1}* Function Name
> * Function load completion time
> * Function load progress. Format is Function sequence no/Total number of 
> functions.{color}
> * After completion of all loads, will add a log as follows to consolidate the 
> load.
> {color:#59afe1}* Database Name.
> * Load Type (BOOTSTRAP).
> * Load End Time.
> * Total number of tables/views loaded.
> * Total number of functions loaded.
> * Last Repl ID of the loaded database.{color}
> *+Incremental Dump:+*
> * At the start of database dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (INCREMENTAL)
> * (Estimated) Total number of events to dump.
> * Dump Start Time{color}
> * After each event dump, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
> * Event dump end time
> * Event dump progress. Format is Event sequence no/ (Estimated) Total number 
> of events.{color}
> * After completion of all event dumps, will add a log as follows.
> {color:#59afe1}* Database Name.
> * Dump Type (INCREMENTAL).
> * Dump End Time.
> * (Actual) Total number of events dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The estimated number of events can differ greatly from the actual 
> number, as the number of events is not known up front until we read from the 
> metastore NotificationEvents table.
> *+Incremental Load:+*
> * At the start of incremental load, will add one log with below details.
> {color:#59afe1}* Target Database Name 
> * Dump directory
> * Load Type (INCREMENTAL)
> * Total number of events to load
> * Load Start Time{color}
> * After each event load, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
> * Event load end time
> * Event load progress. Format is Event sequence no/ Total number of 
> events.{color}
> * After completion of all event loads, will add a log as follows to 
> consolidate the load.
> {color:#59afe1}* Target Database Name.
> * Load Type (INCREMENTAL).
> * Load End Time.
> * Total number of events loaded.
> * Last Repl ID of the loaded database.{color}
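
As an illustration of the per-table entry described above, the sketch below builds one structured log line for a finished table dump, with progress in the "sequence no/total" format. The JSON encoding, the field names, and the `table_dump_log` helper are assumptions; the description lists the fields but does not fix a serialization.

```python
import json
from datetime import datetime, timezone


def progress(sequence_no, total):
    """Format progress as 'sequence no/total'."""
    return f"{sequence_no}/{total}"


def table_dump_log(table_name, table_type, sequence_no, estimated_total,
                   end_time=None):
    """Build one log line for a finished table dump (JSON is an assumption)."""
    end_time = end_time or datetime.now(timezone.utc)
    record = {
        "tableName": table_name,
        "tableType": table_type,  # TABLE, VIEW or MATERIALIZED_VIEW
        "dumpEndTime": end_time.isoformat(),
        "dumpProgress": progress(sequence_no, estimated_total),
    }
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    print(table_dump_log("t1", "TABLE", 3, 10))
```

Because the total is only an estimate during a dump, the sequence number may end below or above it if tables are dropped or created while the dump runs.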





[jira] [Updated] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-14 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17100:

Attachment: (was: HIVE-17100.02.patch)

> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, repl
> Affects Versions: 2.1.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch
>
>


[jira] [Updated] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-14 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17100:

Status: Patch Available  (was: Open)

> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, repl
> Affects Versions: 2.1.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch, HIVE-17100.02.patch
>
>


[jira] [Updated] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-14 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17100:

Attachment: HIVE-17100.02.patch

> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, repl
> Affects Versions: 2.1.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch
>
>


[jira] [Updated] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-14 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17100:

Attachment: (was: HIVE-17100.02.patch)

> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, repl
> Affects Versions: 2.1.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch
>
>


[jira] [Updated] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-14 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17100:

Status: Open  (was: Patch Available)

> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, repl
> Affects Versions: 2.1.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch, HIVE-17100.02.patch
>
>


[jira] [Updated] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-14 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17100:

Attachment: HIVE-17100.02.patch

> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, repl
> Affects Versions: 2.1.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch, HIVE-17100.02.patch
>
>
> It is necessary to log the progress of the replication tasks in a structured 
> manner as follows.
> *+Bootstrap Dump:+*
> * At the start of bootstrap dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (BOOTSTRAP)
> * (Estimated) Total number of tables/views to dump
> * (Estimated) Total number of functions to dump.
> * Dump Start Time{color}
> * After each table dump, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table dump end time
> * Table dump progress. Format is Table sequence no/(Estimated) Total number 
> of tables and views.{color}
> * After each function dump, will add a log as follows
> {color:#59afe1}* Function Name
> * Function dump end time
> * Function dump progress. Format is Function sequence no/(Estimated) Total 
> number of functions.{color}
> * After completion of all dumps, will add a log as follows to consolidate the 
> dump.
> {color:#59afe1}* Database Name.
> * Dump Type (BOOTSTRAP).
> * Dump End Time.
> * (Actual) Total number of tables/views dumped.
> * (Actual) Total number of functions dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The actual and estimated number of tables/functions may not match if 
> any table/function is dropped while the dump is in progress.
> *+Bootstrap Load:+*
> * At the start of bootstrap load, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump directory
> * Load Type (BOOTSTRAP)
> * Total number of tables/views to load
> * Total number of functions to load.
> * Load Start Time{color}
> * After each table load, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table load completion time
> * Table load progress. Format is Table sequence no/Total number of tables and 
> views.{color}
> * After each function load, will add a log as follows
> {color:#59afe1}* Function Name
> * Function load completion time
> * Function load progress. Format is Function sequence no/Total number of 
> functions.{color}
> * After completion of all loads, will add a log as follows to consolidate the 
> load.
> {color:#59afe1}* Database Name.
> * Load Type (BOOTSTRAP).
> * Load End Time.
> * Total number of tables/views loaded.
> * Total number of functions loaded.
> * Last Repl ID of the loaded database.{color}
> *+Incremental Dump:+*
> * At the start of database dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (INCREMENTAL)
> * (Estimated) Total number of events to dump.
> * Dump Start Time{color}
> * After each event dump, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
> * Event dump end time
> * Event dump progress. Format is Event sequence no/ (Estimated) Total number 
> of events.{color}
> * After completion of all event dumps, will add a log as follows.
> {color:#59afe1}* Database Name.
> * Dump Type (INCREMENTAL).
> * Dump End Time.
> * (Actual) Total number of events dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The estimated number of events can differ greatly from the actual 
> number, as the number of events is not known until we read from the 
> metastore NotificationEvents table.
> *+Incremental Load:+*
> * At the start of incremental load, will add one log with below details.
> {color:#59afe1}* Target Database Name 
> * Dump directory
> * Load Type (INCREMENTAL)
> * Total number of events to load
> * Load Start Time{color}
> * After each event load, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
> * Event load end time
> * Event load progress. Format is Event sequence no/ Total number of 
> events.{color}
> * After completion of all event loads, will add a log as follows to 
> consolidate the load.
> {color:#59afe1}* Target Database Name.
> * Load Type (INCREMENTAL).
> * Load End Time.
> * Total number of events loaded.
> * Last Repl ID of the loaded database.{color}
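
As a rough illustration of the "after each table dump" entry described above, here is a minimal sketch of one log message held in its own class and rendered as a single structured line. The class name, field names, the TABLE_DUMP prefix and the JSON-style layout are all assumptions made for this example; they are not taken from the attached patch.

```java
/**
 * Hypothetical message for the "after each table dump" log entry.
 * All names and the output layout are illustrative assumptions.
 */
final class TableDumpLogSketch {
    private final String tableName;
    private final String tableType;     // TABLE, VIEW or MATERIALIZED_VIEW
    private final long dumpEndTimeSec;  // table dump end time, epoch seconds
    private final long sequenceNo;      // position of this table in the dump
    private final long estimatedTotal;  // (estimated) total tables and views

    TableDumpLogSketch(String tableName, String tableType, long dumpEndTimeSec,
                       long sequenceNo, long estimatedTotal) {
        this.tableName = tableName;
        this.tableType = tableType;
        this.dumpEndTimeSec = dumpEndTimeSec;
        this.sequenceNo = sequenceNo;
        this.estimatedTotal = estimatedTotal;
    }

    /** Progress in the "sequence no/(estimated) total" format from the description. */
    String progress() {
        return sequenceNo + "/" + estimatedTotal;
    }

    /** Renders the entry as one line; names are not escaped in this sketch. */
    String toLogLine() {
        return "TABLE_DUMP {\"tableName\":\"" + tableName
            + "\",\"tableType\":\"" + tableType
            + "\",\"dumpEndTime\":" + dumpEndTimeSec
            + ",\"progress\":\"" + progress() + "\"}";
    }

    public static void main(String[] args) {
        System.out.println(
            new TableDumpLogSketch("sales", "TABLE", 1502690400L, 3, 10).toLogLine());
    }
}
```

Keeping one class per entry type means each of the start, per-object and end entries listed above can carry exactly its own fields, at the cost of a few more small classes.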





[jira] [Updated] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-14 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17100:

Attachment: (was: HIVE-17100.02.patch)

> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch, HIVE-17100.02.patch
>
>


[jira] [Updated] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-14 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17100:

Status: Patch Available  (was: Open)

> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch, HIVE-17100.02.patch
>
>


[jira] [Updated] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-14 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17100:

Attachment: HIVE-17100.02.patch

Added 02.patch with a fix for the build failure.

> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch, HIVE-17100.02.patch
>
>

[jira] [Updated] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-14 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17100:

Status: Open  (was: Patch Available)

> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch
>
>


[jira] [Comment Edited] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-14 Thread Sankar Hariappan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126592#comment-16126592
 ] 

Sankar Hariappan edited comment on HIVE-17100 at 8/15/17 5:59 AM:
--

Added 01.patch with the following changes.
- Added a Repl logger for bootstrap dump/load and incremental dump/load.
- Serialized the log messages, with a separate class for each log message.
- Added a new task type, REPL_STATE_LOG, to log the replication progress once a 
table/function/event is loaded successfully. Tried to use it as a 
DependencyCollectionTask wherever possible.
- Logged to both the console and the log file.
- Added all logs mentioned in the description.
- Fixed a bug in the dependency method in ReplLoadTask, which added the dependent 
to all tasks instead of only to the leaf nodes.
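
To illustrate the last point, here is a minimal sketch of attaching a dependent task only to the leaf nodes of a task graph. Node is a stand-in type invented for this example, not Hive's Task class, and the method is not the code from the patch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class LeafDependencySketch {
    /** Minimal stand-in for a task in a DAG (hypothetical, not Hive's Task). */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Attaches the dependent to every leaf reachable from the roots, and to nothing else. */
    static void addDependentToLeaves(List<Node> roots, Node dependent) {
        Set<Node> visited = new HashSet<>();
        List<Node> leaves = new ArrayList<>();
        for (Node root : roots) {
            collectLeaves(root, visited, leaves);
        }
        // Attach only after the traversal, so the graph is not modified while walking it.
        for (Node leaf : leaves) {
            leaf.children.add(dependent);
        }
    }

    private static void collectLeaves(Node node, Set<Node> visited, List<Node> leaves) {
        if (!visited.add(node)) {
            return; // already seen through another parent
        }
        if (node.children.isEmpty()) {
            leaves.add(node);
            return;
        }
        for (Node child : node.children) {
            collectLeaves(child, visited, leaves);
        }
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        a.children.add(b);
        Node log = new Node("stateLog");
        addDependentToLeaves(List.of(a), log);
        System.out.println(b.children.get(0).name + " attached under " + b.name);
    }
}
```

The visited set keeps a leaf that is shared by several parents from receiving the dependent more than once.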



was (Author: sankarh):
Added 01.patch with below changes.
- Added Repl logger for bootstrap dump/load and incremental dump/load.
- Serialized the log messages with separate classes for each log message.
- Logged both in console and log file.
- Added all logs as mentioned in description.


> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch
>
>

[jira] [Commented] (HIVE-17181) HCatOutputFormat should expose complete output-schema (including partition-keys) for dynamic-partitioning MR jobs

2017-08-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126836#comment-16126836
 ] 

Thejas M Nair commented on HIVE-17181:
--

+1 to latest patch
Please go ahead and commit


> HCatOutputFormat should expose complete output-schema (including 
> partition-keys) for dynamic-partitioning MR jobs
> -
>
> Key: HIVE-17181
> URL: https://issues.apache.org/jira/browse/HIVE-17181
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17181.1.patch, HIVE-17181.2.patch, 
> HIVE-17181.3.patch, HIVE-17181.branch-2.patch
>
>
> Map/Reduce jobs that use HCatalog APIs to write to Hive tables using Dynamic 
> partitioning are expected to call the following API methods:
> # {{HCatOutputFormat.setOutput()}} to indicate which table/partitions to 
> write to. This call populates the {{OutputJobInfo}} with details fetched from 
> the Metastore.
> # {{HCatOutputFormat.setSchema()}} to indicate the output-schema for the data 
> being written.
> It is a common mistake to invoke {{HCatOutputFormat.setSchema()}} as follows:
> {code:java}
> HCatOutputFormat.setSchema(conf, HCatOutputFormat.getTableSchema(conf));
> {code}
> Unfortunately, {{getTableSchema()}} returns only the record-schema, not the 
> entire table's schema. We'll need a better API for use in M/R jobs to get the 
> complete table-schema.
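
The gap between the record-schema and the complete table-schema can be sketched with plain column-name lists. This is only an illustration: HCatalog represents schemas with its own types rather than strings, and the helper below, along with the choice to place partition keys after the data columns, is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FullSchemaSketch {
    /**
     * Returns the data columns followed by the partition keys
     * (the ordering is an assumption of this sketch).
     */
    static List<String> completeSchema(List<String> recordColumns, List<String> partitionKeys) {
        List<String> all = new ArrayList<>(recordColumns);
        for (String key : partitionKeys) {
            if (!all.contains(key)) { // do not duplicate a key that is already listed
                all.add(key);
            }
        }
        return all;
    }

    public static void main(String[] args) {
        List<String> record = Arrays.asList("id", "amount");
        List<String> partitions = Arrays.asList("dt", "region");
        // With dynamic partitioning, each row must also carry its partition values.
        System.out.println(completeSchema(record, partitions));
    }
}
```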





[jira] [Commented] (HIVE-17218) Canonical-ize hostnames for Hive metastore, and HS2 servers.

2017-08-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126835#comment-16126835
 ] 

Thejas M Nair commented on HIVE-17218:
--

Thanks, can you please commit?


> Canonical-ize hostnames for Hive metastore, and HS2 servers.
> 
>
> Key: HIVE-17218
> URL: https://issues.apache.org/jira/browse/HIVE-17218
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore, Security
>Affects Versions: 1.2.2, 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17218.1.patch
>
>
> Currently, the {{HiveMetastoreClient}} and {{HiveConnection}} do not 
> canonical-ize the hostnames of the metastore/HS2 servers. In deployments 
> where there are multiple such servers behind a VIP, this causes a number of 
> inconveniences:
> # The client-side configuration (e.g. {{hive.metastore.uris}} in 
> {{hive-site.xml}}) needs to specify the VIP's hostname, and cannot use a 
> simplified CNAME, in the thrift URL. If the 
> {{hive.metastore.kerberos.principal}} is specified using {{_HOST}}, one sees 
> GSS failures as follows:
> {noformat}
> hive --hiveconf hive.metastore.kerberos.principal=hive/_h...@grid.myth.net 
> --hiveconf 
> hive.metastore.uris="thrift://simplified-hcat-cname.grid.myth.net:56789"
> ...
> Exception in thread "main" java.lang.RuntimeException: 
> java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
> at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:542)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> ...
> {noformat}
> This is because {{_HOST}} is filled in with the CNAME, and not the 
> canonicalized name.
> # Oozie workflows that use HCat {{}} have to always use the VIP 
> hostname, and can't use {{_HOST}}-based service principals, if the CNAME 
> differs from the VIP name.
> If the client-code simply canonical-ized the hostnames, it would enable the 
> use of both simplified CNAMEs, and _HOST in service principals.
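
The effect of canonicalization can be sketched in plain Java. This is a minimal
illustration under stated assumptions, not the code from HIVE-17218.1.patch:
the class CanonicalHostSketch, its two methods, and the EXAMPLE.COM realm are
hypothetical, and the sketch assumes the intended behaviour is to resolve the
configured host through java.net.InetAddress and substitute the result for
_HOST.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;

final class CanonicalHostSketch {

    // Resolve a possibly aliased host name (for example a CNAME) to the
    // canonical name reported by the resolver. If the lookup fails, fall
    // back to the name as given.
    static String canonicalize(String host) {
        try {
            return InetAddress.getByName(host).getCanonicalHostName();
        } catch (UnknownHostException e) {
            return host;
        }
    }

    // Replace the _HOST placeholder in a principal pattern with the
    // lower-cased canonical host name.
    static String resolvePrincipal(String principalPattern, String host) {
        String canonical = canonicalize(host).toLowerCase(Locale.ROOT);
        return principalPattern.replace("_HOST", canonical);
    }

    public static void main(String[] args) {
        // The output depends on the local resolver, so none is shown here.
        System.out.println(resolvePrincipal("hive/_HOST@EXAMPLE.COM", "localhost"));
    }
}
```

With a CNAME in the thrift URL, a client following this approach would
presumably end up with the principal of the host the CNAME points to, which is
what the description asks for.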





[jira] [Commented] (HIVE-17181) HCatOutputFormat should expose complete output-schema (including partition-keys) for dynamic-partitioning MR jobs

2017-08-14 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126779#comment-16126779
 ] 

Mithun Radhakrishnan commented on HIVE-17181:
-

Hey, [~thejas]. Does the latest version of this patch look better? The test 
failures seem once again to be those being handled in HIVE-16908 and HIVE-15058.

> HCatOutputFormat should expose complete output-schema (including 
> partition-keys) for dynamic-partitioning MR jobs
> -
>
> Key: HIVE-17181
> URL: https://issues.apache.org/jira/browse/HIVE-17181
> Project: Hive
> Issue Type: Bug
> Components: HCatalog
> Affects Versions: 2.2.0, 3.0.0
> Reporter: Mithun Radhakrishnan
> Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17181.1.patch, HIVE-17181.2.patch, 
> HIVE-17181.3.patch, HIVE-17181.branch-2.patch
>
>
> Map/Reduce jobs that use HCatalog APIs to write to Hive tables using Dynamic 
> partitioning are expected to call the following API methods:
> # {{HCatOutputFormat.setOutput()}} to indicate which table/partitions to 
> write to. This call populates the {{OutputJobInfo}} with details fetched from 
> the Metastore.
> # {{HCatOutputFormat.setSchema()}} to indicate the output-schema for the data 
> being written.
> It is a common mistake to invoke {{HCatOutputFormat.setSchema()}} as follows:
> {code:java}
> HCatOutputFormat.setSchema(conf, HCatOutputFormat.getTableSchema(conf));
> {code}
> Unfortunately, {{getTableSchema()}} returns only the record-schema, not the 
> entire table's schema. We'll need a better API for use in M/R jobs to get the 
> complete table-schema.
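
The gap between the record-schema and the complete table-schema can be shown
with a small stand-alone sketch. It deliberately avoids the HCatalog API:
column names are plain strings, and the class CompleteSchemaSketch and its
method completeSchema are hypothetical names used only for illustration. The
assumption is that the complete output-schema is the record columns followed by
the partition keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CompleteSchemaSketch {

    // Build the complete schema: the record (data) columns first, then any
    // partition keys that are not already present.
    static List<String> completeSchema(List<String> recordColumns,
                                       List<String> partitionKeys) {
        List<String> all = new ArrayList<>(recordColumns);
        for (String key : partitionKeys) {
            if (!all.contains(key)) {
                all.add(key);
            }
        }
        return all;
    }

    public static void main(String[] args) {
        List<String> record = Arrays.asList("id", "name");
        List<String> partitions = Arrays.asList("dt", "region");
        // The record-schema alone omits dt and region, which a
        // dynamic-partitioning job needs in its output schema.
        System.out.println(completeSchema(record, partitions));
    }
}
```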





[jira] [Commented] (HIVE-11548) HCatLoader should support predicate pushdown.

2017-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126774#comment-16126774
 ] 

Hive QA commented on HIVE-11548:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12881855/HIVE-11548.5.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 11008 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=241)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=244)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=244)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=244)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6395/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6395/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6395/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12881855 - PreCommit-HIVE-Build

> HCatLoader should support predicate pushdown.
> -
>
> Key: HIVE-11548
> URL: https://issues.apache.org/jira/browse/HIVE-11548
> Project: Hive
> Issue Type: New Feature
> Components: HCatalog
> Affects Versions: 3.0.0
> Reporter: Mithun Radhakrishnan
> Assignee: Mithun Radhakrishnan
> Attachments: HIVE-11548.1.patch, HIVE-11548.2.patch, 
> HIVE-11548.3.patch, HIVE-11548.4.patch, HIVE-11548.5.patch
>
>
> When one uses {{HCatInputFormat}}/{{HCatLoader}} to read from file-formats 
> that support predicate pushdown (such as ORC, with 
> {{hive.optimize.index.filter=true}}), one sees that the predicates aren't 
> actually pushed down into the storage layer.
> The forthcoming patch should allow for filter-pushdown, if any of the 
> partitions being scanned with {{HCatLoader}} support the functionality. The 
> patch should technically allow the same for users of {{HCatInputFormat}}, but 
> I don't currently have a neat interface to build a compound 
> predicate-expression. Will add this separately, if required.
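
For readers unfamiliar with the term, the sketch below illustrates what pushing
a predicate into the storage layer buys. It is a toy model, not ORC or HCatalog
code: the Stripe class, its min/max statistics, and stripesRead are
hypothetical, and the only assumption is that the file format keeps per-block
minimum and maximum values that a reader can test before decoding the block.

```java
import java.util.Arrays;
import java.util.List;

final class PushdownSketch {

    // A block of rows plus the min/max statistics kept for it.
    static final class Stripe {
        final long min;
        final long max;
        final long[] values;

        Stripe(long... values) {
            this.values = values;
            this.min = Arrays.stream(values).min().orElse(Long.MAX_VALUE);
            this.max = Arrays.stream(values).max().orElse(Long.MIN_VALUE);
        }
    }

    // Count how many stripes must be read to evaluate "column == target".
    // With pushdown, stripes whose [min, max] range excludes the target
    // are skipped without being read.
    static int stripesRead(List<Stripe> stripes, long target, boolean pushdown) {
        int read = 0;
        for (Stripe s : stripes) {
            if (!pushdown || (s.min <= target && target <= s.max)) {
                read++;
            }
        }
        return read;
    }

    public static void main(String[] args) {
        List<Stripe> stripes = Arrays.asList(
                new Stripe(1, 5, 9), new Stripe(10, 15, 19), new Stripe(20, 25, 29));
        System.out.println(stripesRead(stripes, 15, false)); // prints 3
        System.out.println(stripesRead(stripes, 15, true));  // prints 1
    }
}
```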





[jira] [Commented] (HIVE-17308) Improvement in join cardinality estimation

2017-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126747#comment-16126747
 ] 

Hive QA commented on HIVE-17308:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12881854/HIVE-17308.6.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 30 failed/errored test(s), 11004 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_join] 
(batchId=51)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_semijoin_user_level]
 (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] 
(batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer1]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_max_hashtable]
 (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[skewjoin] 
(batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_exists]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_in]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_multi]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_notin]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_select]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_views]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[annotate_stats_join]
 (batchId=123)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6394/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6394/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6394/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 30 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12881854 - PreCommit-HIVE-Build

> Improvement in join cardinality estimation
> --
>
> Key: HIVE-17308
> URL: https://issues.apache.org/jira/browse/HIVE-17308
> Project: Hive
> Issue Type: Improvement
> Components: Query Planning
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Attachments: HIVE-17308.1.patch, HIVE-17308.2.patch, 
> HIVE-17308.3.patch, HIVE-17308.4.patch, HIVE-17308.5.patch, HIVE-17308.6.patch
>
>
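> Currently during logical planning join cardinality is estimated assuming no
> correlation among join keys (This estimation is done using exponential
> backoff). Physical planning, on the other hand, considers correlation for
> multiple keys and uses a different estimation. We should consider correlation
> during logical planning as well.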

[jira] [Commented] (HIVE-17291) Set the number of executors based on config if client does not provide information

2017-08-14 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126744#comment-16126744
 ] 

Xuefu Zhang commented on HIVE-17291:


[~lirui], that makes sense.

> Set the number of executors based on config if client does not provide 
> information
> --
>
> Key: HIVE-17291
> URL: https://issues.apache.org/jira/browse/HIVE-17291
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Affects Versions: 3.0.0
> Reporter: Peter Vary
> Assignee: Peter Vary
> Attachments: HIVE-17291.1.patch
>
>
> When calculating the memory and cores, if the client does not provide this
> information we should try to use the values provided by the default
> configuration. This can happen on startup, when
> {{spark.dynamicAllocation.enabled}} is not enabled.





[jira] [Commented] (HIVE-17291) Set the number of executors based on config if client does not provide information

2017-08-14 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126738#comment-16126738
 ] 

Rui Li commented on HIVE-17291:
---

[~pvary],
bq. Logic suggests that in this case we will request new executors based on
the new settings, and we are in the middle of the query test file, so Rui Li's
magic does not apply here
I think it's because when we get a spark session, we'll set it to the 
SessionState:
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java#L127
In QTestUtil, we override the {{setSparkSession}} method, so each time we 
create a new spark session, we'll wait for it to init. Currently, we have only 
one executor, and we wait until numCores > 1. So we don't have any flakiness. 
With your change in HIVE-17292, we'll need to wait until we have 4 cores.

[~xuefuz], your concerns about dynamic allocation are valid. I think the
current implementation only uses "size per reducer" when dynamic allocation is
enabled. For static allocation, as I mentioned, the available executors/cores
can only give us more reducers than needed (as decided by "size per reducer").
As long as we have a reasonable {{hive.exec.reducers.bytes.per.reducer}}, we
won't underestimate the number of reducers. So I think we're good with the
current logic. Does that make sense?
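
The argument above can be restated as a small sketch. It is only a reading of
this comment, not Hive's actual reducer-estimation code: the class
ReducerEstimateSketch, its method names, and the use of a plain core count are
assumptions made for illustration. The size-based count acts as a floor, and
under static allocation the available cores can only raise it.

```java
final class ReducerEstimateSketch {

    // Reducers needed so that each handles at most bytesPerReducer bytes
    // (ceiling division, never fewer than one).
    static int bySize(long inputBytes, long bytesPerReducer) {
        return (int) Math.max(1, (inputBytes + bytesPerReducer - 1) / bytesPerReducer);
    }

    // With dynamic allocation only the size-based count is used. With static
    // allocation the available cores may increase the count but never lower
    // it below the size-based floor.
    static int estimate(long inputBytes, long bytesPerReducer,
                        int availableCores, boolean dynamicAllocation) {
        int sizeBased = bySize(inputBytes, bytesPerReducer);
        return dynamicAllocation ? sizeBased : Math.max(sizeBased, availableCores);
    }

    public static void main(String[] args) {
        System.out.println(estimate(1000, 256, 8, true));  // prints 4
        System.out.println(estimate(1000, 256, 8, false)); // prints 8
        System.out.println(estimate(1000, 256, 2, false)); // prints 4
    }
}
```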

> Set the number of executors based on config if client does not provide 
> information
> --
>
> Key: HIVE-17291
> URL: https://issues.apache.org/jira/browse/HIVE-17291
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Affects Versions: 3.0.0
> Reporter: Peter Vary
> Assignee: Peter Vary
> Attachments: HIVE-17291.1.patch
>
>
> When calculating the memory and cores, if the client does not provide this
> information we should try to use the values provided by the default
> configuration. This can happen on startup, when
> {{spark.dynamicAllocation.enabled}} is not enabled.





[jira] [Commented] (HIVE-17218) Canonical-ize hostnames for Hive metastore, and HS2 servers.

2017-08-14 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126708#comment-16126708
 ] 

Mithun Radhakrishnan commented on HIVE-17218:
-

I have verified that the tests aren't failing because of this patch. HIVE-16908 
tracks the {{HCatClient}} failures. The others look like they're handled in 
HIVE-15058.

> Canonical-ize hostnames for Hive metastore, and HS2 servers.
> 
>
> Key: HIVE-17218
> URL: https://issues.apache.org/jira/browse/HIVE-17218
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2, Metastore, Security
> Affects Versions: 1.2.2, 2.2.0, 3.0.0
> Reporter: Mithun Radhakrishnan
> Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17218.1.patch
>
>
> Currently, the {{HiveMetastoreClient}} and {{HiveConnection}} do not 
> canonical-ize the hostnames of the metastore/HS2 servers. In deployments 
> where there are multiple such servers behind a VIP, this causes a number of 
> inconveniences:
> # The client-side configuration (e.g. {{hive.metastore.uris}} in 
> {{hive-site.xml}}) needs to specify the VIP's hostname, and cannot use a 
> simplified CNAME, in the thrift URL. If the 
> {{hive.metastore.kerberos.principal}} is specified using {{_HOST}}, one sees 
> GSS failures as follows:
> {noformat}
> hive --hiveconf hive.metastore.kerberos.principal=hive/_h...@grid.myth.net 
> --hiveconf 
> hive.metastore.uris="thrift://simplified-hcat-cname.grid.myth.net:56789"
> ...
> Exception in thread "main" java.lang.RuntimeException: 
> java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
> at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:542)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> ...
> {noformat}
> This is because {{_HOST}} is filled in with the CNAME, and not the 
> canonicalized name.
> # Oozie workflows that use HCat {{}} have to always use the VIP 
> hostname, and can't use {{_HOST}}-based service principals, if the CNAME 
> differs from the VIP name.
> If the client-code simply canonical-ized the hostnames, it would enable the 
> use of both simplified CNAMEs, and _HOST in service principals.





[jira] [Updated] (HIVE-11548) HCatLoader should support predicate pushdown.

2017-08-14 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-11548:

Status: Patch Available  (was: Open)

> HCatLoader should support predicate pushdown.
> -
>
> Key: HIVE-11548
> URL: https://issues.apache.org/jira/browse/HIVE-11548
> Project: Hive
> Issue Type: New Feature
> Components: HCatalog
> Affects Versions: 3.0.0
> Reporter: Mithun Radhakrishnan
> Assignee: Mithun Radhakrishnan
> Attachments: HIVE-11548.1.patch, HIVE-11548.2.patch, 
> HIVE-11548.3.patch, HIVE-11548.4.patch, HIVE-11548.5.patch
>
>
> When one uses {{HCatInputFormat}}/{{HCatLoader}} to read from file-formats 
> that support predicate pushdown (such as ORC, with 
> {{hive.optimize.index.filter=true}}), one sees that the predicates aren't 
> actually pushed down into the storage layer.
> The forthcoming patch should allow for filter-pushdown, if any of the 
> partitions being scanned with {{HCatLoader}} support the functionality. The 
> patch should technically allow the same for users of {{HCatInputFormat}}, but 
> I don't currently have a neat interface to build a compound 
> predicate-expression. Will add this separately, if required.





[jira] [Updated] (HIVE-11548) HCatLoader should support predicate pushdown.

2017-08-14 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-11548:

Attachment: HIVE-11548.5.patch

Rebased, again.

> HCatLoader should support predicate pushdown.
> -
>
> Key: HIVE-11548
> URL: https://issues.apache.org/jira/browse/HIVE-11548
> Project: Hive
> Issue Type: New Feature
> Components: HCatalog
> Affects Versions: 3.0.0
> Reporter: Mithun Radhakrishnan
> Assignee: Mithun Radhakrishnan
> Attachments: HIVE-11548.1.patch, HIVE-11548.2.patch, 
> HIVE-11548.3.patch, HIVE-11548.4.patch, HIVE-11548.5.patch
>
>
> When one uses {{HCatInputFormat}}/{{HCatLoader}} to read from file-formats 
> that support predicate pushdown (such as ORC, with 
> {{hive.optimize.index.filter=true}}), one sees that the predicates aren't 
> actually pushed down into the storage layer.
> The forthcoming patch should allow for filter-pushdown, if any of the 
> partitions being scanned with {{HCatLoader}} support the functionality. The 
> patch should technically allow the same for users of {{HCatInputFormat}}, but 
> I don't currently have a neat interface to build a compound 
> predicate-expression. Will add this separately, if required.





[jira] [Updated] (HIVE-17308) Improvement in join cardinality estimation

2017-08-14 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17308:
---
Attachment: HIVE-17308.6.patch

> Improvement in join cardinality estimation
> --
>
> Key: HIVE-17308
> URL: https://issues.apache.org/jira/browse/HIVE-17308
> Project: Hive
> Issue Type: Improvement
> Components: Query Planning
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Attachments: HIVE-17308.1.patch, HIVE-17308.2.patch, 
> HIVE-17308.3.patch, HIVE-17308.4.patch, HIVE-17308.5.patch, HIVE-17308.6.patch
>
>
> Currently during logical planning join cardinality is estimated assuming no
> correlation among join keys (This estimation is done using exponential
> backoff). Physical planning, on the other hand, considers correlation for
> multiple keys and uses a different estimation. We should consider correlation
> during logical planning as well.
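
The description names exponential backoff without spelling it out. One common
form, taken here as an assumption rather than as Hive's actual planner code, is
to sort the per-key distinct-value counts (NDVs) in descending order and damp
them with exponents 1, 1/2, 1/4 and so on before multiplying. The class
JoinCardinalitySketch and its methods are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class JoinCardinalitySketch {

    // Combine per-key NDVs with exponential backoff: the largest NDV counts
    // in full, the next with exponent 1/2, the next with 1/4, and so on.
    static double backoffNdv(List<Double> keyNdvs) {
        List<Double> sorted = new ArrayList<>(keyNdvs);
        sorted.sort(Collections.reverseOrder());
        double product = 1.0;
        for (int i = 0; i < sorted.size(); i++) {
            product *= Math.pow(sorted.get(i), Math.pow(0.5, i));
        }
        return product;
    }

    // Estimated join output rows: the cross product divided by the damped NDV.
    static double estimateJoinRows(double leftRows, double rightRows,
                                   List<Double> keyNdvs) {
        return leftRows * rightRows / backoffNdv(keyNdvs);
    }

    public static void main(String[] args) {
        List<Double> ndvs = Arrays.asList(100.0, 10000.0);
        System.out.println(backoffNdv(ndvs));
        System.out.println(estimateJoinRows(1_000_000, 1_000_000, ndvs));
    }
}
```

Under this form, two keys with NDVs 10000 and 100 combine to roughly
10000 * sqrt(100) = 100000, which sits between treating the keys as fully
independent (1000000) and fully correlated (10000).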





[jira] [Updated] (HIVE-17308) Improvement in join cardinality estimation

2017-08-14 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17308:
---
Status: Patch Available  (was: Open)

> Improvement in join cardinality estimation
> --
>
> Key: HIVE-17308
> URL: https://issues.apache.org/jira/browse/HIVE-17308
> Project: Hive
> Issue Type: Improvement
> Components: Query Planning
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Attachments: HIVE-17308.1.patch, HIVE-17308.2.patch, 
> HIVE-17308.3.patch, HIVE-17308.4.patch, HIVE-17308.5.patch, HIVE-17308.6.patch
>
>
> Currently during logical planning join cardinality is estimated assuming no
> correlation among join keys (This estimation is done using exponential
> backoff). Physical planning, on the other hand, considers correlation for
> multiple keys and uses a different estimation. We should consider correlation
> during logical planning as well.





[jira] [Updated] (HIVE-17308) Improvement in join cardinality estimation

2017-08-14 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17308:
---
Status: Open  (was: Patch Available)

> Improvement in join cardinality estimation
> --
>
> Key: HIVE-17308
> URL: https://issues.apache.org/jira/browse/HIVE-17308
> Project: Hive
> Issue Type: Improvement
> Components: Query Planning
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Attachments: HIVE-17308.1.patch, HIVE-17308.2.patch, 
> HIVE-17308.3.patch, HIVE-17308.4.patch, HIVE-17308.5.patch, HIVE-17308.6.patch
>
>
> Currently during logical planning join cardinality is estimated assuming no
> correlation among join keys (This estimation is done using exponential
> backoff). Physical planning, on the other hand, considers correlation for
> multiple keys and uses a different estimation. We should consider correlation
> during logical planning as well.





[jira] [Commented] (HIVE-17297) allow AM to use LLAP guaranteed tasks

2017-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126690#comment-16126690
 ] 

Hive QA commented on HIVE-17297:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12881850/HIVE-17297.only.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6393/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6393/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6393/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-08-15 00:59:48.520
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-6393/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-08-15 00:59:48.523
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 06d9a6b HIVE-16873: Remove Thread Cache From Logging (BELUGA 
BEHR reviewed by Aihua Xu)
+ git clean -f -d
Removing 
itests/hive-unit-hadoop2/src/test/java/org/apache/hadoop/hive/metastore/
Removing 
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/security/
Removing 
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/HdfsUtils.java
Removing 
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/SecurityUtils.java
Removing standalone-metastore/src/main/java/org/apache/hadoop/security/
Removing 
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/utils/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 06d9a6b HIVE-16873: Remove Thread Cache From Logging (BELUGA 
BEHR reviewed by Aihua Xu)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-08-15 00:59:55.661
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: 
llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/LlapProtocolServerImpl.java:47
error: 
llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/LlapProtocolServerImpl.java:
 patch does not apply
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12881850 - PreCommit-HIVE-Build

> allow AM to use LLAP guaranteed tasks
> -
>
> Key: HIVE-17297
> URL: https://issues.apache.org/jira/browse/HIVE-17297
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-17297.only.patch, HIVE-17297.patch
>
>






[jira] [Commented] (HIVE-17241) Change metastore classes to not use the shims

2017-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126688#comment-16126688
 ] 

Hive QA commented on HIVE-17241:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12881840/HIVE-17241.2.patch

{color:green}SUCCESS:{color} +1 due to 24 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 11017 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_wise_fileformat6]
 (batchId=7)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge1]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge3]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge_diff_fs]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[quotedid_smb]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_combine_equivalent_work]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_use_op_stats]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[truncate_column_buckets]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6392/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6392/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6392/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 21 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12881840 - PreCommit-HIVE-Build

> Change metastore classes to not use the shims
> -
>
> Key: HIVE-17241
> URL: https://issues.apache.org/jira/browse/HIVE-17241
> Project: Hive
> Issue Type: Sub-task
> Components: Metastore
> Reporter: Alan Gates
> Assignee: Alan Gates
> Attachments: HIVE-17241.2.patch, HIVE-17241.patch
>
>
> As part of moving the metastore into a standalone package, it will no longer 
> have access to the shims.  This means we need to either copy them or access 
> the underlying Hadoop operations directly.





[jira] [Commented] (HIVE-17089) make acid 2.0 the default

2017-08-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126684#comment-16126684
 ] 

Sergey Shelukhin commented on HIVE-17089:
-

Posted some comments on RB

> make acid 2.0 the default
> -
>
> Key: HIVE-17089
> URL: https://issues.apache.org/jira/browse/HIVE-17089
> Project: Hive
> Issue Type: New Feature
> Components: Transactions
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Attachments: HIVE-17089.01.patch, HIVE-17089.03.patch, 
> HIVE-17089.05.patch, HIVE-17089.06.patch, HIVE-17089.07.patch, 
> HIVE-17089.10.patch, HIVE-17089.10.patch, HIVE-17089.11.patch, 
> HIVE-17089.12.patch, HIVE-17089.13.patch, HIVE-17089.14.patch
>
>
> acid 2.0 is introduced in HIVE-14035.  It replaces Update events with a 
> combination of Delete + Insert events.  This now makes U=D+I the default (and 
> only) supported acid table type in Hive 3.0.  
> The expectation for upgrade is that Major compaction has to be run on all 
> acid tables in the existing Hive cluster and that no new writes to these 
> tables take place after the start of compaction (we need to add a mechanism to 
> put a table in read-only mode; this way it can still be read while it's 
> being compacted).  Then the upgrade to Hive 3.0 can take place.
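
As a rough illustration of the U=D+I idea described above, the sketch below models an update as a delete event for the old row followed by an insert event for the new row. All names here (AcidUpdateSplit, Event, Op, update) are hypothetical and chosen only for this example; they are not Hive classes, and real acid events carry a full RecordIdentifier rather than a single long.

```java
import java.util.ArrayList;
import java.util.List;

public class AcidUpdateSplit {
    // Only two event kinds remain once updates are split: there is no UPDATE op.
    public enum Op { INSERT, DELETE }

    // Hypothetical, simplified event: a single long stands in for the row identifier.
    public static final class Event {
        public final Op op;
        public final long rowId;
        public final String value;

        public Event(Op op, long rowId, String value) {
            this.op = op;
            this.rowId = rowId;
            this.value = value;
        }
    }

    // Express "update row oldRowId to newValue" as a delete of the old row
    // followed by an insert of a new row with a new identifier.
    public static List<Event> update(long oldRowId, long newRowId, String newValue) {
        List<Event> events = new ArrayList<>();
        events.add(new Event(Op.DELETE, oldRowId, null));
        events.add(new Event(Op.INSERT, newRowId, newValue));
        return events;
    }
}
```

Under this model a delta never needs a dedicated update event, which is the property later issues such as HIVE-17320 rely on.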



[jira] [Commented] (HIVE-17297) allow AM to use LLAP guaranteed tasks

2017-08-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126674#comment-16126674
 ] 

Sergey Shelukhin commented on HIVE-17297:
-

Looks like RB has removed the option to post a patch with a base patch from 
both commandline and web UI. So, this won't really be on RB until the previous 
patch is committed. cc [~sseth]

> allow AM to use LLAP guaranteed tasks
> -
>
> Key: HIVE-17297
> URL: https://issues.apache.org/jira/browse/HIVE-17297
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17297.only.patch, HIVE-17297.patch
>
>




[jira] [Updated] (HIVE-17297) allow AM to use LLAP guaranteed tasks

2017-08-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17297:

Status: Patch Available  (was: Open)

> allow AM to use LLAP guaranteed tasks
> -
>
> Key: HIVE-17297
> URL: https://issues.apache.org/jira/browse/HIVE-17297
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17297.only.patch, HIVE-17297.patch
>
>




[jira] [Updated] (HIVE-17297) allow AM to use LLAP guaranteed tasks

2017-08-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17297:

Attachment: HIVE-17297.only.patch
HIVE-17297.patch

The patch on top of the previous one; and another version with both included, 
for HiveQA

> allow AM to use LLAP guaranteed tasks
> -
>
> Key: HIVE-17297
> URL: https://issues.apache.org/jira/browse/HIVE-17297
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17297.only.patch, HIVE-17297.patch
>
>




[jira] [Updated] (HIVE-17297) allow AM to use LLAP guaranteed tasks

2017-08-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17297:

Attachment: (was: HIVE-17297.patch)

> allow AM to use LLAP guaranteed tasks
> -
>
> Key: HIVE-17297
> URL: https://issues.apache.org/jira/browse/HIVE-17297
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17297.only.patch, HIVE-17297.patch
>
>




[jira] [Assigned] (HIVE-17320) OrcRawRecordMerger.discoverKeyBounds logic can be simplified

2017-08-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-17320:
-


> OrcRawRecordMerger.discoverKeyBounds logic can be simplified
> 
>
> Key: HIVE-17320
> URL: https://issues.apache.org/jira/browse/HIVE-17320
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 3.0.0
>
>
> With HIVE-17089 we never have any insert events in the deltas.
> So if we know the min/max key for every split of the base, we can use them to 
> filter delete events, since all files are sorted by RecordIdentifier.
> So we should be able to create a SARG for all delete deltas.
> The code can be simplified since the min/max key now never has to be null.
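
The filtering idea above can be sketched as follows: because the delete events are sorted by key, the events relevant to one base split form a contiguous range that can be located with two binary searches. This is a minimal sketch under stated assumptions, not the OrcRawRecordMerger code: the class and method names are hypothetical, and a plain long stands in for RecordIdentifier.

```java
import java.util.Arrays;

public class DeleteEventFilter {
    // Index of the first element >= key in a sorted array (array length if none).
    static int lowerBound(long[] sorted, long key) {
        int lo = 0;
        int hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] < key) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Keep only the delete keys inside the inclusive range [minKey, maxKey].
    // Both bounds are always present, mirroring the "never null" simplification.
    public static long[] filter(long[] sortedDeleteKeys, long minKey, long maxKey) {
        int from = lowerBound(sortedDeleteKeys, minKey);
        // Guard against overflow when maxKey is already the largest long.
        int to = (maxKey == Long.MAX_VALUE)
                ? sortedDeleteKeys.length
                : lowerBound(sortedDeleteKeys, maxKey + 1);
        return Arrays.copyOfRange(sortedDeleteKeys, from, Math.max(from, to));
    }
}
```

For example, filtering the keys 1, 3, 5, 7, 9 with bounds 3 and 7 keeps 3, 5, 7, and a split whose bounds lie entirely outside the delete keys gets an empty result.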



[jira] [Commented] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126643#comment-16126643
 ] 

Hive QA commented on HIVE-17100:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12881838/HIVE-17100.01.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6391/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6391/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6391/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-08-14 23:56:04.494
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-6391/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-08-14 23:56:04.497
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 06d9a6b HIVE-16873: Remove Thread Cache From Logging (BELUGA 
BEHR reviewed by Aihua Xu)
+ git clean -f -d
Removing common/src/java/org/apache/hadoop/hive/common/ndv/fm/FMSketch.java
Removing metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/cache/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 06d9a6b HIVE-16873: Remove Thread Cache From Logging (BELUGA 
BEHR reviewed by Aihua Xu)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-08-14 23:56:11.579
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
Going to apply patch with: patch -p1
patching file metastore/src/java/org/apache/hadoop/hive/metastore/TableType.java
patching file ql/if/queryplan.thrift
patching file ql/src/gen/thrift/gen-cpp/queryplan_types.cpp
patching file ql/src/gen/thrift/gen-cpp/queryplan_types.h
patching file 
ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/StageType.java
patching file ql/src/gen/thrift/gen-php/Types.php
patching file ql/src/gen/thrift/gen-py/queryplan/ttypes.py
patching file ql/src/gen/thrift/gen-rb/queryplan_types.rb
patching file ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java
patching file ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplStateLogTask.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplStateLogWork.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/ReplLoadTask.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/ReplLoadWork.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/events/filesystem/BootstrapEventsIterator.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/events/filesystem/DatabaseEventsIterator.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/LoadFunction.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java
patching file ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSpec.java
patching file ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/Utils.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/AbstractMessageHandler.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/CreateFunctionHandler.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/log/logger/BootstrapDumpLogger.java
patching file 
ql/src/java/org/apache/had

[jira] [Commented] (HIVE-17286) Avoid expensive String serialization/deserialization for bitvectors

2017-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126639#comment-16126639
 ] 

Hive QA commented on HIVE-17286:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12881830/HIVE-17286.04.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 47 failed/errored test(s), 11004 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_update_status]
 (batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_column_stats]
 (batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_update_status]
 (batchId=76)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_3] 
(batchId=53)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_decimal] 
(batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_decimal_native] 
(batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bitvector] (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[colstats_all_nulls] 
(batchId=6)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[column_names_with_leading_and_trailing_spaces]
 (batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[confirm_initial_tbl_stats]
 (batchId=29)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_stats] 
(batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_table] 
(batchId=41)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[extrapolate_part_stats_full]
 (batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[extrapolate_part_stats_partial]
 (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[fm-sketch] (batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[hll] (batchId=83)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_coltype_literals]
 (batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[rename_external_partition_location]
 (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[rename_table_update_column_stats]
 (batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats_only_null] 
(batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[temp_table_display_colstats_tbllvl]
 (batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[tunable_ndv] (batchId=43)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[autoColumnStats_2]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[column_names_with_leading_and_trailing_spaces]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[extrapolate_part_stats_partial_ndv]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_only_null]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hadoop.hive.common.ndv.fm.TestFMSketchSerialization.testSerDe 
(batchId=250)
org.apache.hadoop.hive.metastore.TestHiveMetaStoreStatsMerge.testStatsMerge 
(batchId=210)
org.apache.hadoop.hive.metastore.cache.TestCachedStore.testAggrStatsRepeatedRead
 (batchId=201)
org.apache.hadoop.hive.metastore.cache.TestCachedStore.testPartitionAggrStats 
(batchId=201)
org.apache.hadoop.hive.metastore.cache.TestCachedStore.testPartitionAggrStatsBitVector
 (batchId=201)
org.apache.hive.hcatalog.api.TestHCa

[jira] [Updated] (HIVE-17277) HiveMetastoreClient Log name is wrong

2017-08-14 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17277:
--
Status: Patch Available  (was: Open)

> HiveMetastoreClient Log name is wrong
> -
>
> Key: HIVE-17277
> URL: https://issues.apache.org/jira/browse/HIVE-17277
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Zac Zhou
>Assignee: Zac Zhou
>Priority: Minor
> Attachments: HIVE-17277.patch
>
>
> The name of the Log for HiveMetastoreClient is "hive.metastore". This is 
> confusing for users tracing Hive logs.
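
To illustrate why a shared logger name is harder to trace, the sketch below contrasts a logger registered under the generic name "hive.metastore" with one named after its owning class, which is one common convention. This is only an illustration: it uses java.util.logging so that it is self-contained, the class MetastoreClientSketch is hypothetical, and it does not claim to show what the attached patch does.

```java
import java.util.logging.Logger;

public class LoggerNaming {
    // Hypothetical stand-in for a client class; not the real HiveMetaStoreClient.
    public static class MetastoreClientSketch {
        // Generic name: log lines cannot be attributed to a specific class.
        public static final Logger SHARED = Logger.getLogger("hive.metastore");

        // Class-based name: log lines point back to the class that emitted them.
        public static final Logger OWN =
                Logger.getLogger(MetastoreClientSketch.class.getName());
    }
}
```

With the class-based name, log configuration and filtering can target this one class instead of everything that happens to log under the shared name.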



[jira] [Updated] (HIVE-17277) HiveMetastoreClient Log name is wrong

2017-08-14 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17277:
--
Status: Open  (was: Patch Available)

Cancelling and resubmitting the patch in an attempt to get the tests to run.

> HiveMetastoreClient Log name is wrong
> -
>
> Key: HIVE-17277
> URL: https://issues.apache.org/jira/browse/HIVE-17277
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Zac Zhou
>Assignee: Zac Zhou
>Priority: Minor
> Attachments: HIVE-17277.patch
>
>
> The name of the Log for HiveMetastoreClient is "hive.metastore". This is 
> confusing for users tracing Hive logs.



[jira] [Commented] (HIVE-17277) HiveMetastoreClient Log name is wrong

2017-08-14 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126611#comment-16126611
 ] 

Alan Gates commented on HIVE-17277:
---

+1

> HiveMetastoreClient Log name is wrong
> -
>
> Key: HIVE-17277
> URL: https://issues.apache.org/jira/browse/HIVE-17277
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Zac Zhou
>Assignee: Zac Zhou
>Priority: Minor
> Attachments: HIVE-17277.patch
>
>
> The name of the Log for HiveMetastoreClient is "hive.metastore". This is 
> confusing for users tracing Hive logs.



[jira] [Updated] (HIVE-17241) Change metastore classes to not use the shims

2017-08-14 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17241:
--
Attachment: HIVE-17241.2.patch

New version of the patch that fixes failing unit tests.

> Change metastore classes to not use the shims
> -
>
> Key: HIVE-17241
> URL: https://issues.apache.org/jira/browse/HIVE-17241
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17241.2.patch, HIVE-17241.patch
>
>
> As part of moving the metastore into a standalone package, it will no longer 
> have access to the shims.  This means we need to either copy them or access 
> the underlying Hadoop operations directly.



[jira] [Updated] (HIVE-17241) Change metastore classes to not use the shims

2017-08-14 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17241:
--
Status: Patch Available  (was: Open)

> Change metastore classes to not use the shims
> -
>
> Key: HIVE-17241
> URL: https://issues.apache.org/jira/browse/HIVE-17241
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17241.2.patch, HIVE-17241.patch
>
>
> As part of moving the metastore into a standalone package, it will no longer 
> have access to the shims.  This means we need to either copy them or access 
> the underlying Hadoop operations directly.



[jira] [Commented] (HIVE-17006) LLAP: Parquet caching

2017-08-14 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126595#comment-16126595
 ] 

Gunther Hagleitner commented on HIVE-17006:
---

Can you create follow-ups for the major TODOs? a) Uncopify 
ParquetMetadataCacheImpl, BB put for FileMetadataCache, Parquet impl in LlapIo 
impl...

It seems the error handling in LlapCacheAwareFs should be done before commit; 
otherwise, could that leave a leak?

> LLAP: Parquet caching
> -
>
> Key: HIVE-17006
> URL: https://issues.apache.org/jira/browse/HIVE-17006
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17006.01.patch, HIVE-17006.02.patch, 
> HIVE-17006.patch, HIVE-17006.WIP.patch
>
>
> There are multiple options to do Parquet caching in LLAP:
> 1) Full elevator (too intrusive for now).
> 2) Page based cache like ORC (requires some changes to Parquet or 
> copy-pasted).
> 3) Cache disk data on column chunk level as is.
> Given that Parquet reads at column chunk granularity, (2) is not as useful as 
> for ORC, but still a good idea. I messaged the dev list about it but didn't 
> get a response, we may follow up later.
> For now, do (3). 



[jira] [Updated] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-14 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17100:

Status: Patch Available  (was: Open)

> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch
>
>
> It is necessary to log the progress of the replication tasks in a structured 
> manner as follows.
> *+Bootstrap Dump:+*
> * At the start of bootstrap dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (BOOTSTRAP)
> * (Estimated) Total number of tables/views to dump
> * (Estimated) Total number of functions to dump.
> * Dump Start Time{color}
> * After each table dump, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table dump end time
> * Table dump progress. Format is Table sequence no/(Estimated) Total number 
> of tables and views.{color}
> * After each function dump, will add a log as follows
> {color:#59afe1}* Function Name
> * Function dump end time
> * Function dump progress. Format is Function sequence no/(Estimated) Total 
> number of functions.{color}
> * After completion of all dumps, will add a log as follows to consolidate the 
> dump.
> {color:#59afe1}* Database Name.
> * Dump Type (BOOTSTRAP).
> * Dump End Time.
> * (Actual) Total number of tables/views dumped.
> * (Actual) Total number of functions dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The actual and estimated number of tables/functions may not match if 
> any table/function is dropped while the dump is in progress.
> *+Bootstrap Load:+*
> * At the start of bootstrap load, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump directory
> * Load Type (BOOTSTRAP)
> * Total number of tables/views to load
> * Total number of functions to load.
> * Load Start Time{color}
> * After each table load, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table load completion time
> * Table load progress. Format is Table sequence no/Total number of tables and 
> views.{color}
> * After each function load, will add a log as follows
> {color:#59afe1}* Function Name
> * Function load completion time
> * Function load progress. Format is Function sequence no/Total number of 
> functions.{color}
> * After completion of all loads, will add a log as follows to consolidate the 
> load.
> {color:#59afe1}* Database Name.
> * Load Type (BOOTSTRAP).
> * Load End Time.
> * Total number of tables/views loaded.
> * Total number of functions loaded.
> * Last Repl ID of the loaded database.{color}
> *+Incremental Dump:+*
> * At the start of database dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (INCREMENTAL)
> * (Estimated) Total number of events to dump.
> * Dump Start Time{color}
> * After each event dump, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
> * Event dump end time
> * Event dump progress. Format is Event sequence no/ (Estimated) Total number 
> of events.{color}
> * After completion of all event dumps, will add a log as follows.
> {color:#59afe1}* Database Name.
> * Dump Type (INCREMENTAL).
> * Dump End Time.
> * (Actual) Total number of events dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The estimated number of events can be very inaccurate compared with the actual 
> number, as we don’t know the number of events up front until we read from the 
> metastore NotificationEvents table.
> *+Incremental Load:+*
> * At the start of incremental load, will add one log with below details.
> {color:#59afe1}* Target Database Name 
> * Dump directory
> * Load Type (INCREMENTAL)
> * Total number of events to load
> * Load Start Time{color}
> * After each event load, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
> * Event load end time
> * Event load progress. Format is Event sequence no/ Total number of 
> events.{color}
> * After completion of all event loads, will add a log as follows to 
> consolidate the load.
> {color:#59afe1}* Target Database Name.
> * Load Type (INCREMENTAL).
> * Load End Time.
> * Total number of events loaded.
> * Last Repl ID of the loaded database.{color}



[jira] [Updated] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-14 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17100:

Attachment: HIVE-17100.01.patch

Added 01.patch with below changes.
- Added Repl logger for bootstrap dump/load and incremental dump/load.
- Serialized the log messages with separate classes for each log message.
- Logged both in console and log file.
- Added all logs as mentioned in description.
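
As an illustration of "separate classes for each log message", the sketch below shows what one such message class might look like for the "after each table dump" log from the description, including the "sequence no/(Estimated) total" progress format. It is a guess at the shape, not the code in 01.patch: the class name, field names, and JSON keys are all hypothetical, and the hand-rolled serialization does no string escaping.

```java
public class ReplLogSketch {
    public enum TableType { TABLE, VIEW, MATERIALIZED_VIEW }

    // Hypothetical message logged after each table dump during a bootstrap dump.
    public static final class TableDumpEnd {
        private final String tableName;
        private final TableType tableType;
        private final long dumpEndTimeSec;
        private final long tableSeqNo;
        private final long estimatedTotalTables;

        public TableDumpEnd(String tableName, TableType tableType, long dumpEndTimeSec,
                            long tableSeqNo, long estimatedTotalTables) {
            this.tableName = tableName;
            this.tableType = tableType;
            this.dumpEndTimeSec = dumpEndTimeSec;
            this.tableSeqNo = tableSeqNo;
            this.estimatedTotalTables = estimatedTotalTables;
        }

        // Progress in the "sequence no/(Estimated) total" format from the description.
        public String progress() {
            return tableSeqNo + "/" + estimatedTotalTables;
        }

        // Minimal JSON-style rendering; assumes names need no escaping.
        public String serialize() {
            return "{\"tableName\":\"" + tableName + "\""
                    + ",\"tableType\":\"" + tableType + "\""
                    + ",\"tablesDumpProgress\":\"" + progress() + "\""
                    + ",\"dumpTime\":" + dumpEndTimeSec + "}";
        }
    }
}
```

Keeping one class per message type means each log line has a fixed set of fields, which makes the operation log easier to parse than free-form text.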


> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch
>
>
> It is necessary to log the progress of the replication tasks in a structured 
> manner as follows.
> *+Bootstrap Dump:+*
> * At the start of bootstrap dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (BOOTSTRAP)
> * (Estimated) Total number of tables/views to dump
> * (Estimated) Total number of functions to dump.
> * Dump Start Time{color}
> * After each table dump, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table dump end time
> * Table dump progress. Format is Table sequence no/(Estimated) Total number 
> of tables and views.{color}
> * After each function dump, will add a log as follows
> {color:#59afe1}* Function Name
> * Function dump end time
> * Function dump progress. Format is Function sequence no/(Estimated) Total 
> number of functions.{color}
> * After completion of all dumps, will add a log as follows to consolidate the 
> dump.
> {color:#59afe1}* Database Name.
> * Dump Type (BOOTSTRAP).
> * Dump End Time.
> * (Actual) Total number of tables/views dumped.
> * (Actual) Total number of functions dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The actual and estimated number of tables/functions may not match if 
> any table/function is dropped while the dump is in progress.
> *+Bootstrap Load:+*
> * At the start of bootstrap load, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump directory
> * Load Type (BOOTSTRAP)
> * Total number of tables/views to load
> * Total number of functions to load.
> * Load Start Time{color}
> * After each table load, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table load completion time
> * Table load progress. Format is Table sequence no/Total number of tables and 
> views.{color}
> * After each function load, will add a log as follows
> {color:#59afe1}* Function Name
> * Function load completion time
> * Function load progress. Format is Function sequence no/Total number of 
> functions.{color}
> * After completion of all loads, will add a log as follows to consolidate the 
> load.
> {color:#59afe1}* Database Name.
> * Load Type (BOOTSTRAP).
> * Load End Time.
> * Total number of tables/views loaded.
> * Total number of functions loaded.
> * Last Repl ID of the loaded database.{color}
> *+Incremental Dump:+*
> * At the start of database dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (INCREMENTAL)
> * (Estimated) Total number of events to dump.
> * Dump Start Time{color}
> * After each event dump, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
> * Event dump end time
> * Event dump progress. Format is Event sequence no/ (Estimated) Total number 
> of events.{color}
> * After completion of all event dumps, will add a log as follows.
> {color:#59afe1}* Database Name.
> * Dump Type (INCREMENTAL).
> * Dump End Time.
> * (Actual) Total number of events dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The estimated number of events can be very inaccurate compared with the actual 
> number, as we don’t know the number of events up front until we read from the 
> metastore NotificationEvents table.
> *+Incremental Load:+*
> * At the start of incremental load, will add one log with below details.
> {color:#59afe1}* Target Database Name 
> * Dump directory
> * Load Type (INCREMENTAL)
> * Total number of events to load
> * Load Start Time{color}
> * After each event load, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
> * Event load end time
> * Event load progress. Format is Event sequence no/ Total number of 
> events.{color}
> * After completion of all event loads, will add a log as follows to 
> consolidate the load.
> {color:#59afe1}* Target Databa

[jira] [Work stopped] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-14 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17100 stopped by Sankar Hariappan.
---
> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
>
> It is necessary to log the progress of the replication tasks in a structured 
> manner as follows.
> *+Bootstrap Dump:+*
> * At the start of bootstrap dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (BOOTSTRAP)
> * (Estimated) Total number of tables/views to dump
> * (Estimated) Total number of functions to dump.
> * Dump Start Time{color}
> * After each table dump, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table dump end time
> * Table dump progress. Format is Table sequence no/(Estimated) Total number 
> of tables and views.{color}
> * After each function dump, will add a log as follows
> {color:#59afe1}* Function Name
> * Function dump end time
> * Function dump progress. Format is Function sequence no/(Estimated) Total 
> number of functions.{color}
> * After completion of all dumps, will add a log as follows to consolidate the 
> dump.
> {color:#59afe1}* Database Name.
> * Dump Type (BOOTSTRAP).
> * Dump End Time.
> * (Actual) Total number of tables/views dumped.
> * (Actual) Total number of functions dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The actual and estimated number of tables/functions may not match if 
> any table/function is dropped while the dump is in progress.
> *+Bootstrap Load:+*
> * At the start of bootstrap load, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump directory
> * Load Type (BOOTSTRAP)
> * Total number of tables/views to load
> * Total number of functions to load.
> * Load Start Time{color}
> * After each table load, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table load completion time
> * Table load progress. Format is Table sequence no/Total number of tables and 
> views.{color}
> * After each function load, will add a log as follows
> {color:#59afe1}* Function Name
> * Function load completion time
> * Function load progress. Format is Function sequence no/Total number of 
> functions.{color}
> * After completion of all loads, will add a log as follows to consolidate the 
> load.
> {color:#59afe1}* Database Name.
> * Load Type (BOOTSTRAP).
> * Load End Time.
> * Total number of tables/views loaded.
> * Total number of functions loaded.
> * Last Repl ID of the loaded database.{color}
> *+Incremental Dump:+*
> * At the start of database dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (INCREMENTAL)
> * (Estimated) Total number of events to dump.
> * Dump Start Time{color}
> * After each event dump, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
> * Event dump end time
> * Event dump progress. Format is Event sequence no/ (Estimated) Total number 
> of events.{color}
> * After completion of all event dumps, will add a log as follows.
> {color:#59afe1}* Database Name.
> * Dump Type (INCREMENTAL).
> * Dump End Time.
> * (Actual) Total number of events dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The estimated number of events can differ widely from the actual 
> number, as the event count is not known upfront until we read from the 
> metastore NotificationEvents table.
> *+Incremental Load:+*
> * At the start of incremental load, will add one log with below details.
> {color:#59afe1}* Target Database Name 
> * Dump directory
> * Load Type (INCREMENTAL)
> * Total number of events to load
> * Load Start Time{color}
> * After each event load, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
> * Event load end time
> * Event load progress. Format is Event sequence no/ Total number of 
> events.{color}
> * After completion of all event loads, will add a log as follows to 
> consolidate the load.
> {color:#59afe1}* Target Database Name.
> * Load Type (INCREMENTAL).
> * Load End Time.
> * Total number of events loaded.
> * Last Repl ID of the loaded database.{color}
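
The per-table progress entry described above could be rendered along the following lines. This is only a minimal sketch: the class name, the `REPL::` tag and the field names are assumptions made for illustration, and the format actually chosen in the patch may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class ReplProgressLog {
    // Renders one entry as "REPL::<TAG>: {key=value, ...}", keeping fields in insertion order.
    static String format(String tag, Map<String, String> fields) {
        StringJoiner joiner = new StringJoiner(", ", "{", "}");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            joiner.add(e.getKey() + "=" + e.getValue());
        }
        return "REPL::" + tag + ": " + joiner;
    }

    // Per-table entry: name, type, end time, and progress as "sequence no/estimated total".
    // All field names here are hypothetical.
    static String tableDump(String name, String type, long endTime, int seqNo, int estimatedTotal) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("tableName", name);
        fields.put("tableType", type);
        fields.put("dumpEndTime", Long.toString(endTime));
        fields.put("tablesDumpProgress", seqNo + "/" + estimatedTotal);
        return format("TABLE_DUMP", fields);
    }

    public static void main(String[] args) {
        System.out.println(tableDump("sales", "TABLE", 1502746084L, 3, 10));
    }
}
```

Keeping the progress as a single "n/total" field mirrors the description; the same helper could serve the function, event and summary entries by changing the tag and fields.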





[jira] [Commented] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126586#comment-16126586
 ] 

ASF GitHub Bot commented on HIVE-17100:
---

GitHub user sankarh opened a pull request:

https://github.com/apache/hive/pull/231

HIVE-17100: Improve HS2 operation logs for REPL commands.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sankarh/hive HIVE-17100

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/231.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #231


commit 2e7de9cff5093aceab124917f23b505fb0330d7b
Author: Sankar Hariappan 
Date:   2017-07-24T06:44:34Z

HIVE-17100: Improve HS2 operation logs for REPL commands.




> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
>

[jira] [Commented] (HIVE-17006) LLAP: Parquet caching

2017-08-14 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126585#comment-16126585
 ] 

Gunther Hagleitner commented on HIVE-17006:
---

call it AUTHORITY then? To be less confusing? (I'm guessing empty isn't an 
option here?)

> LLAP: Parquet caching
> -
>
> Key: HIVE-17006
> URL: https://issues.apache.org/jira/browse/HIVE-17006
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17006.01.patch, HIVE-17006.02.patch, 
> HIVE-17006.patch, HIVE-17006.WIP.patch
>
>
> There are multiple options to do Parquet caching in LLAP:
> 1) Full elevator (too intrusive for now).
> 2) Page based cache like ORC (requires some changes to Parquet or 
> copy-pasted).
> 3) Cache disk data on column chunk level as is.
> Given that Parquet reads at column chunk granularity, (2) is not as useful as 
> for ORC, but still a good idea. I messaged the dev list about it but didn't 
> get a response, we may follow up later.
> For now, do (3). 





[jira] [Commented] (HIVE-17006) LLAP: Parquet caching

2017-08-14 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126584#comment-16126584
 ] 

Gunther Hagleitner commented on HIVE-17006:
---

About the fix in HDFSUtils to handle the case where the shim doesn't return a 
file id: is that specific to this patch? It seems like a problem even without 
Parquet.



[jira] [Commented] (HIVE-17006) LLAP: Parquet caching

2017-08-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126577#comment-16126577
 ] 

Sergey Shelukhin commented on HIVE-17006:
-

I have to put something as an authority for the URI. This is as good as 
anything.
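
To illustrate the point about the authority, here is a small sketch using `java.net.URI`. The scheme value `mockcache` is made up for the example, since the real constant from the patch is not shown in this thread, and the patch itself may build its URIs through a different class.

```java
import java.net.URI;

public class AuthorityDemo {
    // Hypothetical scheme value, used only for this illustration.
    static final String SCHEME = "mockcache";

    // SCHEME + "://" + SCHEME + path: the second SCHEME becomes the URI authority.
    static URI withAuthority(String path) {
        return URI.create(SCHEME + "://" + SCHEME + path);
    }

    // SCHEME + "://" + path: the authority component is left empty.
    static URI withoutAuthority(String path) {
        return URI.create(SCHEME + "://" + path);
    }

    public static void main(String[] args) {
        URI a = withAuthority("/warehouse/t/file.parquet");
        URI b = withoutAuthority("/warehouse/t/file.parquet");
        System.out.println(a + " authority=" + a.getAuthority());
        System.out.println(b + " authority=" + b.getAuthority());
    }
}
```

With `java.net.URI` the empty form parses but reports no authority at all, so code that expects a non-null authority would need some placeholder, which is what reusing the scheme provides.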



[jira] [Commented] (HIVE-17006) LLAP: Parquet caching

2017-08-14 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126557#comment-16126557
 ] 

Gunther Hagleitner commented on HIVE-17006:
---

A follow-up would be good for counters.

Why do you set the URI to SCHEME + "://" + SCHEME? What's the second SCHEME for?



[jira] [Updated] (HIVE-17286) Avoid expensive String serialization/deserialization for bitvectors

2017-08-14 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17286:
---
Attachment: HIVE-17286.04.patch

> Avoid expensive String serialization/deserialization for bitvectors
> ---
>
> Key: HIVE-17286
> URL: https://issues.apache.org/jira/browse/HIVE-17286
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17286.01.patch, HIVE-17286.02.patch, 
> HIVE-17286.03.patch, HIVE-17286.04.patch, HIVE-17286.patch
>
>






[jira] [Commented] (HIVE-17315) Make the DataSource used by the DataNucleus in the HMS configurable using Hive properties

2017-08-14 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126484#comment-16126484
 ] 

Eugene Koifman commented on HIVE-17315:
---

it may be useful to support dbcp..  As Thejas said there 
are different pools in use and it may be useful to configure some of the 
properties differently

> Make the DataSource used by the DataNucleus in the HMS configurable using 
> Hive properties
> -
>
> Key: HIVE-17315
> URL: https://issues.apache.org/jira/browse/HIVE-17315
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>
> Currently we may use several connection pool implementations in the backend 
> (hikari, dbCp, boneCp) but these can only be configured using proprietary xml 
> files and not through hive-site.xml like DataNucleus.
> We should make them configurable just like DataNucleus, by allowing Hive 
> properties prefixed with hikari, dbcp, bonecp to be set in the hive-site.xml. 
> However, since these configurations may contain sensitive information 
> (passwords), these properties should not be displayable or manually settable.





[jira] [Commented] (HIVE-13817) Allow DNS CNAME ALIAS Resolution from apache hive beeline JDBC URL to allow for failover

2017-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126428#comment-16126428
 ] 

Hive QA commented on HIVE-13817:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12805512/HIVE-13817.3.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6389/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6389/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6389/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-08-14 21:28:04.644
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-6389/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-08-14 21:28:04.647
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   4f042cc..06d9a6b  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 4f042cc HIVE-17260: Typo: exception has been created and lost in 
the ThriftJDBCBinarySerDe (Oleg Danilov via Peter Vary)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at 06d9a6b HIVE-16873: Remove Thread Cache From Logging (BELUGA 
BEHR reviewed by Aihua Xu)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-08-14 21:28:10.753
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java:100
error: jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java: patch does not 
apply
error: patch failed: jdbc/src/java/org/apache/hive/jdbc/Utils.java:118
error: jdbc/src/java/org/apache/hive/jdbc/Utils.java: patch does not apply
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12805512 - PreCommit-HIVE-Build

> Allow DNS CNAME ALIAS Resolution from apache hive beeline JDBC URL to allow 
> for failover
> 
>
> Key: HIVE-13817
> URL: https://issues.apache.org/jira/browse/HIVE-13817
> Project: Hive
>  Issue Type: New Feature
>  Components: Beeline
>Affects Versions: 1.2.1
>Reporter: Vijay Singh
> Attachments: HIVE-13817.1.patch, HIVE-13817.2.patch, 
> HIVE-13817.3.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently, in the case of BDR clusters, DNS CNAME alias based connections fail, 
> because _HOST resolves to the exact endpoint specified in the connection string, 
> and that may not be the intended SPN for Kerberos based on reverse DNS lookup. 
> Consequently, this JIRA proposes that a client-specific setting be used to 
> resolve _HOST from the CNAME DNS alias to the A record entry on the fly in beeline.
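
The substitution proposed above might look roughly like this. It is a sketch under assumptions: the class and method names are invented, the resolver is passed in as a function so the logic can be exercised without DNS, and `InetAddress.getCanonicalHostName()` is used in `main` as one plausible way to obtain a canonical name (it may not follow a CNAME chain in every environment). The attached patch may take a different approach.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.function.UnaryOperator;

public class HostPlaceholder {
    // Replaces the _HOST placeholder in a Kerberos principal with the canonical
    // name of the host taken from the JDBC URL.
    static String resolvePrincipal(String principal, String urlHost,
                                   UnaryOperator<String> canonicalize) {
        return principal.replace("_HOST", canonicalize.apply(urlHost));
    }

    public static void main(String[] args) {
        // DNS-backed resolver; falls back to the given name if the lookup fails.
        UnaryOperator<String> dns = host -> {
            try {
                return InetAddress.getByName(host).getCanonicalHostName();
            } catch (UnknownHostException e) {
                return host;
            }
        };
        System.out.println(resolvePrincipal("hive/_HOST@EXAMPLE.COM", "localhost", dns));
    }
}
```

Making the resolver a parameter also gives a natural place for the client-specific setting: when it is off, the identity function keeps today's behaviour.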





[jira] [Commented] (HIVE-17272) when hive.vectorized.execution.enabled is true, query on empty partitioned table fails with NPE

2017-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126421#comment-16126421
 ] 

Hive QA commented on HIVE-17272:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12881815/HIVE-17272.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 11004 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] 
(batchId=99)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6388/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6388/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6388/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12881815 - PreCommit-HIVE-Build

> when hive.vectorized.execution.enabled is true, query on empty partitioned 
> table fails with NPE
> ---
>
> Key: HIVE-17272
> URL: https://issues.apache.org/jira/browse/HIVE-17272
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17272.1.patch, HIVE-17272.2.patch
>
>
> {noformat}
> set hive.vectorized.execution.enabled=true;
> CREATE TABLE `tab`(`x` int) PARTITIONED BY ( `y` int) stored as parquet;
> select * from tab t1 join tab t2 where t1.x=t2.x;
> {noformat}
> The query fails with the following exception.
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.createAndInitPartitionContext(VectorMapOperator.java:386)
>  ~[hive-exec-2.3.0.jar:2.3.0]
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.internalSetChildren(VectorMapOperator.java:559)
>  ~[hive-exec-2.3.0.jar:2.3.0]
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setChildren(VectorMapOperator.java:474)
>  ~[hive-exec-2.3.0.jar:2.3.0]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:106) 
> ~[hive-exec-2.3.0.jar:2.3.0]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_101]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> ~[hadoop-common-2.6.0.jar:?]
> at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34) 
> ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8

[jira] [Commented] (HIVE-17315) Make the DataSource used by the DataNucleus in the HMS configurable using Hive properties

2017-08-14 Thread Barna Zsombor Klara (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126381#comment-16126381
 ] 

Barna Zsombor Klara commented on HIVE-17315:


[~thejas]
Yes, you are correct. We currently only set the connection pool size property 
on the datasource, but at least for BoneCp I think (though I haven't tested it 
myself) that other BoneCp specific properties can also be set by adding a 
bonecp-config.xml to the classpath.
Look at the constructors for the config object for reference:
[BoneCPConfig|http://grepcode.com/file/repo1.maven.org/maven2/com.jolbox/bonecp/0.7.1.RELEASE/com/jolbox/bonecp/BoneCPConfig.java#BoneCPConfig.%3Cinit%3E%28%29]
I think that if the xml configuration file is present it will be loaded by 
BoneCp, so it also could be loaded by the BoneCp datasource backing the 
DataNucleus PersistenceManagerFactory.
HikariConfig is using a java system property "hikaricp.configurationFile" that 
can point to a properties file with default values, and I don't know about 
dbcp, but I assume they also have something like this.
I'm not sure if any of these "backdoor configurations" are used by anyone, but 
it should be possible.
And yes, you are also correct about my intention. I would like to be able to set 
any property supported by bonecp (or hikari, or dbcp) on the underlying 
datasource, not just the connection pool size. And while you are right that 
password was a bad example, I would still like to have these properties hidden. 
Even if the current properties are safe, I would be afraid of what could be 
added by future versions of the connection pool implementations, since we would 
be exposing these settings the moment we upgrade the library. I don't want to 
add every property explicitly to HiveConf, I would like to propagate anything 
prefixed with bonecp/hikari/dbcp to the DataSource implementation and let it 
use/ignore it.

[~ekoifman]
Exactly. I want to be able to set bonecp/hikari/dbcp properties on the 
datasource. For example in TxnHandler the getConnectionTimeoutMs for every 
implementation and partition count for bonecp is hardcoded. We should be able 
to set these values in the hive-site.xml with a property like 
bonecp.partitionCount or hikari.getConnectionTimeoutMs.
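
The prefix-based propagation described above can be sketched as follows. A plain `Map` stands in for the Hive configuration, and the class and method names are hypothetical; the point is only that entries with the pool's prefix are copied, with the prefix stripped, and handed to the DataSource to use or ignore.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

public class PoolPropertyFilter {
    // Copies every entry whose key starts with "<prefix>." into a Properties object,
    // stripping the prefix so the pool sees its own property names.
    static Properties forPool(Map<String, String> conf, String prefix) {
        String dotted = prefix + ".";
        Properties props = new Properties();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            if (e.getKey().startsWith(dotted)) {
                props.setProperty(e.getKey().substring(dotted.length()), e.getValue());
            }
        }
        return props;
    }

    public static void main(String[] args) {
        // Stand-in for values read from hive-site.xml; the keys are illustrative.
        Map<String, String> conf = new TreeMap<>();
        conf.put("bonecp.partitionCount", "2");
        conf.put("hikari.maximumPoolSize", "10");
        conf.put("hive.metastore.uris", "thrift://localhost:9083");
        System.out.println(forPool(conf, "bonecp"));
    }
}
```

Nothing is validated here, which matches the intent of not listing every pool property in HiveConf; unknown keys are left for the pool implementation to reject or ignore.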


[jira] [Commented] (HIVE-16886) HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently

2017-08-14 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126371#comment-16126371
 ] 

Sergio Peña commented on HIVE-16886:


I read in the DataNucleus docs that those datastore-identity objects may be 
accessed by something like {{Object id = pm.getObjectId(obj);}}. Do you know 
if that would work? Can we read the NL_ID and add it to the object that is 
returned to the client?

http://www.datanucleus.org/products/datanucleus/jdo/datastore_identity.html

> HMS log notifications may have duplicated event IDs if multiple HMS are 
> running concurrently
> 
>
> Key: HIVE-16886
> URL: https://issues.apache.org/jira/browse/HIVE-16886
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Metastore
>Reporter: Sergio Peña
>Assignee: anishek
>
> When running multiple Hive Metastore servers and DB notifications are 
> enabled, I could see that notifications can be persisted with a duplicated 
> event ID. 
> This does not happen when running multiple threads in a single HMS node due 
> to the locking acquired on the DbNotificationsLog class, but multiple HMS 
> could cause conflicts.
> The issue is in the ObjectStore#addNotificationEvent() method. The event ID 
> fetched from the datastore is used for the new notification, incremented in 
> the server itself, then persisted or updated back to the datastore. If 2 
> servers read the same ID, then these 2 servers write a new notification with 
> the same ID.
> The event ID is not unique nor a primary key.
> Here's a test case using the TestObjectStore class that confirms this issue:
> {noformat}
> @Test
>   public void testConcurrentAddNotifications() throws ExecutionException, 
> InterruptedException {
> final int NUM_THREADS = 2;
> CountDownLatch countIn = new CountDownLatch(NUM_THREADS);
> CountDownLatch countOut = new CountDownLatch(1);
> HiveConf conf = new HiveConf();
> conf.setVar(HiveConf.ConfVars.METASTORE_EXPRESSION_PROXY_CLASS, 
> MockPartitionExpressionProxy.class.getName());
> ExecutorService executorService = 
> Executors.newFixedThreadPool(NUM_THREADS);
> FutureTask<Void> tasks[] = new FutureTask[NUM_THREADS];
> for (int i = 0; i < NUM_THREADS; ++i) {
>   final int n = i;
>   tasks[i] = new FutureTask<Void>(new Callable<Void>() {
> @Override
> public Void call() throws Exception {
>   ObjectStore store = new ObjectStore();
>   store.setConf(conf);
>   NotificationEvent dbEvent =
>   new NotificationEvent(0, 0, 
> EventMessage.EventType.CREATE_DATABASE.toString(), "CREATE DATABASE DB" + n);
>   System.out.println("ADDING NOTIFICATION");
>   countIn.countDown();
>   countOut.await();
>   store.addNotificationEvent(dbEvent);
>   System.out.println("FINISH NOTIFICATION");
>   return null;
> }
>   });
>   executorService.execute(tasks[i]);
> }
> countIn.await();
> countOut.countDown();
> for (int i = 0; i < NUM_THREADS; ++i) {
>   tasks[i].get();
> }
> NotificationEventResponse eventResponse = 
> objectStore.getNextNotification(new NotificationEventRequest());
> Assert.assertEquals(2, eventResponse.getEventsSize());
> Assert.assertEquals(1, eventResponse.getEvents().get(0).getEventId());
> // This fails because the next notification has an event ID = 1
> Assert.assertEquals(2, eventResponse.getEvents().get(1).getEventId());
>   }
> {noformat}
> The last assertion fails because the second event has ID 1 instead of the expected 2. 
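
The interleaving behind the duplicate can be replayed deterministically without threads. The sketch below is not Hive code: a static field stands in for the stored "next event ID", and `AtomicLong` stands in for whatever atomic mechanism the database could offer (a sequence, a row lock, a unique constraint). It is not the fix chosen for this issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class EventIdRace {
    // Stand-in for the single "next event ID" value kept in the datastore.
    static long storedId = 0;

    // Replays the unlucky ordering: both servers read before either writes back.
    static List<Long> readThenWrite() {
        storedId = 0;
        long seenByA = storedId;        // server A reads 0
        long seenByB = storedId;        // server B reads 0 before A has written
        List<Long> assigned = new ArrayList<>();
        storedId = seenByA + 1;         // A increments locally and writes 1
        assigned.add(storedId);
        storedId = seenByB + 1;         // B does the same and also writes 1
        assigned.add(storedId);
        return assigned;
    }

    // Same two writers, but read and increment happen as one atomic step.
    static List<Long> atomicIncrement() {
        AtomicLong sequence = new AtomicLong(0);
        List<Long> assigned = new ArrayList<>();
        assigned.add(sequence.incrementAndGet());
        assigned.add(sequence.incrementAndGet());
        return assigned;
    }

    public static void main(String[] args) {
        System.out.println("read-then-write: " + readThenWrite());
        System.out.println("atomic:          " + atomicIncrement());
    }
}
```

The first method yields the same ID twice, which is the situation the failing assertion in the test case exposes; the second yields distinct IDs because no writer can observe a stale value.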





[jira] [Updated] (HIVE-16873) Remove Thread Cache From Logging

2017-08-14 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-16873:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks [~belugabehr] for the work.

> Remove Thread Cache From Logging
> 
>
> Key: HIVE-16873
> URL: https://issues.apache.org/jira/browse/HIVE-16873
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16873.1.patch, HIVE-16873.2.patch, 
> HIVE-16873.3.patch
>
>
> In {{org.apache.hadoop.hive.metastore.HiveMetaStore}} we have a {{Formatter}} 
> class (and its buffer) tied to every thread.
> This {{Formatter}} is for logging purposes. I would suggest that we simply 
> let the logging framework itself handle these kinds of details and ditch 
> the buffer per thread.
> {code}
> public static final String AUDIT_FORMAT =
> "ugi=%s\t" + // ugi
> "ip=%s\t" + // remote IP
> "cmd=%s\t"; // command
> public static final Logger auditLog = LoggerFactory.getLogger(
> HiveMetaStore.class.getName() + ".audit");
> private static final ThreadLocal<Formatter> auditFormatter =
> new ThreadLocal<Formatter>() {
>   @Override
>   protected Formatter initialValue() {
> return new Formatter(new StringBuilder(AUDIT_FORMAT.length() * 
> 4));
>   }
> };
> ...
> private static final void logAuditEvent(String cmd) {
>   final Formatter fmt = auditFormatter.get();
>   ((StringBuilder) fmt.out()).setLength(0);
>   String address = getIPAddress();
>   if (address == null) {
> address = "unknown-ip-addr";
>   }
>   auditLog.info(fmt.format(AUDIT_FORMAT, ugi.getUserName(),
>   address, cmd).toString());
> }
> {code}
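
As a rough illustration of the suggestion above (this is not the committed patch), the per-thread {{Formatter}} can be dropped by building the audit line at the call site. The sketch uses only the JDK; the class name AuditMessage and its format method are hypothetical. With SLF4J, a similar result would usually come from parameterized logging such as auditLog.info("ugi={}\tip={}\tcmd={}", user, ip, cmd).

```java
// Hypothetical helper, not part of HiveMetaStore: formats one audit line
// without keeping a Formatter or StringBuilder per thread.
final class AuditMessage {
    static final String AUDIT_FORMAT = "ugi=%s\tip=%s\tcmd=%s\t";

    private AuditMessage() {}

    static String format(String user, String address, String cmd) {
        // Same fallback as the original snippet when the IP is unknown.
        String ip = (address == null) ? "unknown-ip-addr" : address;
        return String.format(AUDIT_FORMAT, user, ip, cmd);
    }
}
```

The trade-off is a short-lived allocation per call instead of a buffer retained for the lifetime of every thread.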





[jira] [Commented] (HIVE-13817) Allow DNS CNAME ALIAS Resolution from apache hive beeline JDBC URL to allow for failover

2017-08-14 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126359#comment-16126359
 ] 

Aihua Xu commented on HIVE-13817:
-

[~SINGHVJD] The patch makes sense to me. Can you rebase and attach a new patch, 
since it has been a while? Thanks.

> Allow DNS CNAME ALIAS Resolution from apache hive beeline JDBC URL to allow 
> for failover
> 
>
> Key: HIVE-13817
> URL: https://issues.apache.org/jira/browse/HIVE-13817
> Project: Hive
>  Issue Type: New Feature
>  Components: Beeline
>Affects Versions: 1.2.1
>Reporter: Vijay Singh
> Attachments: HIVE-13817.1.patch, HIVE-13817.2.patch, 
> HIVE-13817.3.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently, in the case of BDR clusters, connections based on a DNS CNAME alias 
> fail, because _HOST resolves to the exact endpoint specified in the connection 
> string, which may not be the SPN that Kerberos expects based on reverse DNS 
> lookup. Consequently, this JIRA proposes that a client-specific setting be used 
> to resolve _HOST from the CNAME DNS alias to the A record entry on the fly in 
> Beeline.
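
A minimal sketch of the idea described above, assuming a client-side flag that decides whether _HOST is expanded with the canonical host name rather than the alias from the URL. The class HostPlaceholder, the resolveCanonical flag, and the injected canonicalizer are all hypothetical and are not taken from the attached patches.

```java
import java.util.function.UnaryOperator;

// Hypothetical sketch: expands _HOST in a Kerberos principal using either the
// host from the JDBC URL or a canonical name derived from it.
final class HostPlaceholder {
    private HostPlaceholder() {}

    static String expand(String principal, String urlHost,
                         boolean resolveCanonical,
                         UnaryOperator<String> canonicalizer) {
        // The canonicalizer is injected so the lookup can be swapped or faked.
        String host = resolveCanonical ? canonicalizer.apply(urlHost) : urlHost;
        return principal.replace("_HOST", host);
    }
}
```

One possible canonicalizer is InetAddress.getByName(host).getCanonicalHostName(), wrapped to handle UnknownHostException; whether that yields the intended A record name depends on the DNS setup.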





[jira] [Updated] (HIVE-17272) when hive.vectorized.execution.enabled is true, query on empty partitioned table fails with NPE

2017-08-14 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-17272:

Attachment: HIVE-17272.2.patch

> when hive.vectorized.execution.enabled is true, query on empty partitioned 
> table fails with NPE
> ---
>
> Key: HIVE-17272
> URL: https://issues.apache.org/jira/browse/HIVE-17272
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17272.1.patch, HIVE-17272.2.patch
>
>
> {noformat}
> set hive.vectorized.execution.enabled=true;
> CREATE TABLE `tab`(`x` int) PARTITIONED BY ( `y` int) stored as parquet;
> select * from tab t1 join tab t2 where t1.x=t2.x;
> {noformat}
> The query fails with the following exception.
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.createAndInitPartitionContext(VectorMapOperator.java:386)
>  ~[hive-exec-2.3.0.jar:2.3.0]
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.internalSetChildren(VectorMapOperator.java:559)
>  ~[hive-exec-2.3.0.jar:2.3.0]
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setChildren(VectorMapOperator.java:474)
>  ~[hive-exec-2.3.0.jar:2.3.0]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:106) 
> ~[hive-exec-2.3.0.jar:2.3.0]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_101]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> ~[hadoop-common-2.6.0.jar:?]
> at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34) 
> ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_101]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> ~[hadoop-common-2.6.0.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:413) 
> ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332) 
> ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268)
>  ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[?:1.8.0_101]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[?:1.8.0_101]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[?:1.8.0_101]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  ~[?:1.8.0_101]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_101]
> {noformat}





[jira] [Commented] (HIVE-17308) Improvement in join cardinality estimation

2017-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126319#comment-16126319
 ] 

Hive QA commented on HIVE-17308:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12881806/HIVE-17308.5.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 10998 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver
 (batchId=242)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6387/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6387/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6387/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12881806 - PreCommit-HIVE-Build

> Improvement in join cardinality estimation
> --
>
> Key: HIVE-17308
> URL: https://issues.apache.org/jira/browse/HIVE-17308
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17308.1.patch, HIVE-17308.2.patch, 
> HIVE-17308.3.patch, HIVE-17308.4.patch, HIVE-17308.5.patch
>
>
> Currently, during logical planning, join cardinality is estimated assuming no 
> correlation among join keys (this estimation is done using exponential 
> backoff). Physical planning, on the other hand, considers correlation for 
> multiple keys and uses a different estimation. We should consider correlation 
> during logical planning as well.
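
For readers unfamiliar with the terms in the description, here is a generic sketch of two common ways to combine per-key selectivities. It is only an illustration under textbook assumptions, not Hive's estimator, and the class JoinSelectivity is hypothetical. The independence rule multiplies all selectivities; exponential backoff applies the most selective key fully and damps each further key with exponents 1/2, 1/4, and so on.

```java
import java.util.Arrays;

// Generic illustration only; this is not Hive's estimator.
final class JoinSelectivity {
    private JoinSelectivity() {}

    // Independence assumption: multiply all per-key selectivities.
    static double independent(double[] selectivities) {
        double result = 1.0;
        for (double s : selectivities) {
            result *= s;
        }
        return result;
    }

    // Exponential backoff: sort ascending (most selective first) and damp
    // each later key with exponents 1, 1/2, 1/4, ...
    static double exponentialBackoff(double[] selectivities) {
        double[] sorted = selectivities.clone();
        Arrays.sort(sorted);
        double result = 1.0;
        double exponent = 1.0;
        for (double s : sorted) {
            result *= Math.pow(s, exponent);
            exponent /= 2.0;
        }
        return result;
    }
}
```

For selectivities 0.1 and 0.5, the independence rule gives 0.05, while backoff gives 0.1 times the square root of 0.5, a larger (less aggressive) estimate.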





[jira] [Commented] (HIVE-17291) Set the number of executors based on config if client does not provide information

2017-08-14 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126284#comment-16126284
 ] 

Xuefu Zhang commented on HIVE-17291:


Did you confirm that the configuration is updated and a new session is launched? 
It might be possible that the configuration is updated via another code path or 
the configuration update is ignored by the test. Every qtest file was executed 
in a single session in the past, and later this was changed to sharing a session 
across multiple qtest files. I'm not sure if there is also some magic.

> Set the number of executors based on config if client does not provide 
> information
> --
>
> Key: HIVE-17291
> URL: https://issues.apache.org/jira/browse/HIVE-17291
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-17291.1.patch
>
>
> When calculating the memory and cores and the client does not provide 
> information we should try to use the one provided by default. This can happen 
> on startup, when {{spark.dynamicAllocation.enabled}} is not enabled





[jira] [Commented] (HIVE-17291) Set the number of executors based on config if client does not provide information

2017-08-14 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126257#comment-16126257
 ] 

Peter Vary commented on HIVE-17291:
---

For testing, this is exactly what I did in HIVE-17292.3.patch. I am not sure why 
we do not have the flakiness in this case when the configuration is changed, 
and the Spark session is killed and restarted. Logic suggests that in this 
case we will request new executors based on the new settings, and we are in the 
middle of the query test file, so [~lirui]'s magic does not apply here.
For example:
{code:title=spark_dynamic_partition_pruning_2.q}
EXPLAIN SELECT d1.label, count(*), sum(agg.amount) 
[..]

set hive.spark.dynamic.partition.pruning.max.data.size=1;   <-- I think new 
session is started here

EXPLAIN SELECT d1.label, count(*), sum(agg.amount)
[..]
{code}

> Set the number of executors based on config if client does not provide 
> information
> --
>
> Key: HIVE-17291
> URL: https://issues.apache.org/jira/browse/HIVE-17291
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-17291.1.patch
>
>
> When calculating the memory and cores and the client does not provide 
> information we should try to use the one provided by default. This can happen 
> on startup, when {{spark.dynamicAllocation.enabled}} is not enabled





[jira] [Comment Edited] (HIVE-17315) Make the DataSource used by the DataNucleus in the HMS configurable using Hive properties

2017-08-14 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126248#comment-16126248
 ] 

Eugene Koifman edited comment on HIVE-17315 at 8/14/17 7:11 PM:


What is the intent here?  Is it to support something like "bonecp." in hive-site.xml so that Hive sets "some string that 
bonecp recognizes" on the data source? (so Hive doesn't interpret "some string 
that bonecp recognizes" in any way)



was (Author: ekoifman):
What is the intent here?  Is it to support something like "bonecp." in hive-site.xml so that Hive sets "some string that 
bonecp recognizes" on the data source?


> Make the DataSource used by the DataNucleus in the HMS configurable using 
> Hive properties
> -
>
> Key: HIVE-17315
> URL: https://issues.apache.org/jira/browse/HIVE-17315
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>
> Currently we may use several connection pool implementations in the backend 
> (hikari, dbCp, boneCp) but these can only be configured using proprietary xml 
> files and not through hive-site.xml like DataNucleus.
> We should make them configurable just like DataNucleus, by allowing Hive 
> properties prefixed by hikari, dbcp, bonecp to be set in the hive-site.xml. 
> However, since these configurations may contain sensitive information 
> (passwords), these properties should not be displayable or manually settable.
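
The prefix-based pass-through described above might look roughly like the following. This is a sketch under assumptions, not the eventual Hive implementation: the class PoolProperties and both methods are hypothetical, and masking keys that contain "password" is just one possible way to keep sensitive values out of displayed configuration.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of prefix-based pass-through; not Hive's implementation.
final class PoolProperties {
    private PoolProperties() {}

    // Keeps entries whose key starts with "<prefix>." and strips the prefix,
    // so the pool only sees names it recognizes.
    static Map<String, String> forPool(Map<String, String> hiveConf, String prefix) {
        String dotted = prefix + ".";
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : hiveConf.entrySet()) {
            if (e.getKey().startsWith(dotted)) {
                result.put(e.getKey().substring(dotted.length()), e.getValue());
            }
        }
        return result;
    }

    // Masks values whose key mentions "password" before they are displayed.
    static Map<String, String> redacted(Map<String, String> props) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            boolean sensitive =
                e.getKey().toLowerCase(Locale.ROOT).contains("password");
            result.put(e.getKey(), sensitive ? "***" : e.getValue());
        }
        return result;
    }
}
```

With this shape Hive does not interpret the pool-specific names at all; it only routes them by prefix, and the masking step is applied solely when values are shown to a user.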





[jira] [Commented] (HIVE-17315) Make the DataSource used by the DataNucleus in the HMS configurable using Hive properties

2017-08-14 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126248#comment-16126248
 ] 

Eugene Koifman commented on HIVE-17315:
---

What is the intent here?  Is it to support something like "bonecp." in hive-site.xml so that Hive sets "some string that 
bonecp recognizes" on the data source?


> Make the DataSource used by the DataNucleus in the HMS configurable using 
> Hive properties
> -
>
> Key: HIVE-17315
> URL: https://issues.apache.org/jira/browse/HIVE-17315
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>
> Currently we may use several connection pool implementations in the backend 
> (hikari, dbCp, boneCp) but these can only be configured using proprietary xml 
> files and not through hive-site.xml like DataNucleus.
> We should make them configurable just like DataNucleus, by allowing Hive 
> properties prefixed by hikari, dbcp, bonecp to be set in the hive-site.xml. 
> However, since these configurations may contain sensitive information 
> (passwords), these properties should not be displayable or manually settable.





[jira] [Commented] (HIVE-17291) Set the number of executors based on config if client does not provide information

2017-08-14 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126242#comment-16126242
 ] 

Xuefu Zhang commented on HIVE-17291:


Getting a value greater than 0 doesn't necessarily mean that all requested 
executors are up unless only one executor is requested. For testing, you might 
alter Rui's logic such that it waits until all expected cores are returned.
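
A rough sketch of the "wait until all expected cores are returned" idea, for test use only. The helper ExecutorWait and its parameters are hypothetical; how the available core count is actually obtained from the Spark session is left to the supplier passed in.

```java
import java.util.function.IntSupplier;

// Hypothetical test helper: polls a core-count source until the expected
// number of cores is reported or the timeout elapses.
final class ExecutorWait {
    private ExecutorWait() {}

    static boolean awaitCores(IntSupplier availableCores, int expectedCores,
                              long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (availableCores.getAsInt() < expectedCores) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out before all cores showed up
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }
}
```

Comparing against the expected count, rather than against zero, avoids treating a partially started set of executors as the final state.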

> Set the number of executors based on config if client does not provide 
> information
> --
>
> Key: HIVE-17291
> URL: https://issues.apache.org/jira/browse/HIVE-17291
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-17291.1.patch
>
>
> When calculating the memory and cores and the client does not provide 
> information we should try to use the one provided by default. This can happen 
> on startup, when {{spark.dynamicAllocation.enabled}} is not enabled



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17308) Improvement in join cardinality estimation

2017-08-14 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17308:
---
Status: Patch Available  (was: Open)

> Improvement in join cardinality estimation
> --
>
> Key: HIVE-17308
> URL: https://issues.apache.org/jira/browse/HIVE-17308
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17308.1.patch, HIVE-17308.2.patch, 
> HIVE-17308.3.patch, HIVE-17308.4.patch, HIVE-17308.5.patch
>
>
> Currently, during logical planning, join cardinality is estimated assuming no 
> correlation among join keys (this estimation is done using exponential 
> backoff). Physical planning, on the other hand, considers correlation for 
> multiple keys and uses a different estimation. We should consider correlation 
> during logical planning as well.





[jira] [Updated] (HIVE-17308) Improvement in join cardinality estimation

2017-08-14 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17308:
---
Status: Open  (was: Patch Available)

> Improvement in join cardinality estimation
> --
>
> Key: HIVE-17308
> URL: https://issues.apache.org/jira/browse/HIVE-17308
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17308.1.patch, HIVE-17308.2.patch, 
> HIVE-17308.3.patch, HIVE-17308.4.patch, HIVE-17308.5.patch
>
>
> Currently, during logical planning, join cardinality is estimated assuming no 
> correlation among join keys (this estimation is done using exponential 
> backoff). Physical planning, on the other hand, considers correlation for 
> multiple keys and uses a different estimation. We should consider correlation 
> during logical planning as well.





[jira] [Updated] (HIVE-17308) Improvement in join cardinality estimation

2017-08-14 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17308:
---
Attachment: HIVE-17308.5.patch

> Improvement in join cardinality estimation
> --
>
> Key: HIVE-17308
> URL: https://issues.apache.org/jira/browse/HIVE-17308
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17308.1.patch, HIVE-17308.2.patch, 
> HIVE-17308.3.patch, HIVE-17308.4.patch, HIVE-17308.5.patch
>
>
> Currently, during logical planning, join cardinality is estimated assuming no 
> correlation among join keys (this estimation is done using exponential 
> backoff). Physical planning, on the other hand, considers correlation for 
> multiple keys and uses a different estimation. We should consider correlation 
> during logical planning as well.





[jira] [Commented] (HIVE-17291) Set the number of executors based on config if client does not provide information

2017-08-14 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126224#comment-16126224
 ] 

Peter Vary commented on HIVE-17291:
---

To clarify: I see the flakiness appear only when running the HoS tests against 
*real clusters*. I am still not sure why I do not see it with [~lirui]'s magic. 
When running the tests against an appropriately configured HoS cluster I am 
using the TestBeeLineDriver to run the query tests, and I see the flakiness 
there. The BeeLineDriver could not use the QTestUtil magic, since it runs the 
query against a real HS2 instance with the out of the box Sessions. So the 
current state is not that bad!

Quick question: Do we have a way to determine when info about the available 
executors is reliable? The current implementation assumes that if we have a 
value which is bigger than 0, then it is reliable. But it still might be 
invalid if we requested multiple executors, but not yet received all.

Thanks,
Peter

> Set the number of executors based on config if client does not provide 
> information
> --
>
> Key: HIVE-17291
> URL: https://issues.apache.org/jira/browse/HIVE-17291
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-17291.1.patch
>
>
> When calculating the memory and cores and the client does not provide 
> information we should try to use the one provided by default. This can happen 
> on startup, when {{spark.dynamicAllocation.enabled}} is not enabled





[jira] [Commented] (HIVE-17315) Make the DataSource used by the DataNucleus in the HMS configurable using Hive properties

2017-08-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126208#comment-16126208
 ] 

Thejas M Nair commented on HIVE-17315:
--

[~zsombor.klara]
+1 I think this would be useful for TxnHandler.java also which doesn't use 
Datanucleus. (cc [~ekoifman] [~anishek])

bq. However since these configurations may contain sensitive information 
(passwords) these properties should not be displayable or manually settable.
javax.jdo.option.ConnectionPassword is currently used to supply the password 
for metastore. Why do we need to use connection pool specific properties to 
pass in the password ? 

bq. these can only be configured using proprietary xml files and not through 
hive-site.xml like DataNucleus.
Just to clarify the properties that can be controlled via 
"datanucleus.connectionPool.*" settings are configurable through hive-site.xml. 
But I assume there are some additional connection pool properties which are not 
supported through that means, which you are trying to support. Is that right ?


> Make the DataSource used by the DataNucleus in the HMS configurable using 
> Hive properties
> -
>
> Key: HIVE-17315
> URL: https://issues.apache.org/jira/browse/HIVE-17315
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>
> Currently we may use several connection pool implementations in the backend 
> (hikari, dbCp, boneCp) but these can only be configured using proprietary xml 
> files and not through hive-site.xml like DataNucleus.
> We should make them configurable just like DataNucleus, by allowing Hive 
> properties prefixed by hikari, dbcp, bonecp to be set in the hive-site.xml. 
> However, since these configurations may contain sensitive information 
> (passwords), these properties should not be displayable or manually settable.





[jira] [Commented] (HIVE-17300) WebUI query plan graphs

2017-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126192#comment-16126192
 ] 

Hive QA commented on HIVE-17300:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12881784/HIVE-17300.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 11004 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[udaf_example_avg] 
(batchId=234)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6386/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6386/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6386/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12881784 - PreCommit-HIVE-Build

> WebUI query plan graphs
> ---
>
> Key: HIVE-17300
> URL: https://issues.apache.org/jira/browse/HIVE-17300
> Project: Hive
>  Issue Type: Improvement
>  Components: Web UI
>Reporter: Karen Coppage
>Assignee: Karen Coppage
> Attachments: complete_success.png, full_mapred_stats.png, 
> graph_with_mapred_stats.png, HIVE-17300.patch, last_stage_error.png, 
> last_stage_running.png, non_mapred_task_selected.png
>
>
> Hi all,
> I’m working on a feature of the Hive WebUI Query Plan tab that would provide 
> the option to display the query plan as a nice graph (scroll down for 
> screenshots). If you click on one of the graph’s stages, the plan for that 
> stage appears as text below. 
> Stages are color-coded if they have a status (Success, Error, Running), and 
> the rest are grayed out. Coloring is based on status already available in the 
> WebUI, under the Stages tab.
> There is an additional option to display stats for MapReduce tasks. This 
> includes the job’s ID, tracking URL (where the logs are found), and mapper 
> and reducer numbers/progress, among other info. 
> The library I’m using for the graph is called vis.js (http://visjs.org/). It 
> has an Apache license, and the only necessary file to be included from this 
> library is about 700 KB.
> I tried to keep server-side changes minimal, and graph generation is taken 
> care of by the client. Plans with more than a given number of stages 
> (default: 25) won't be displayed in order to preserve resources.
> I’d love to hear any and all input from the community about this feature: do 
> you think it’s useful, and is there anything important I’m missing?
> Thanks,
> Karen Coppage





[jira] [Commented] (HIVE-17006) LLAP: Parquet caching

2017-08-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126148#comment-16126148
 ] 

Sergey Shelukhin commented on HIVE-17006:
-

Fragment-level IO counters. They are maintained in IO elevator and this doesn't 
use the elevator, only cache directly. Maybe this needs a follow-up JIRA.

> LLAP: Parquet caching
> -
>
> Key: HIVE-17006
> URL: https://issues.apache.org/jira/browse/HIVE-17006
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17006.01.patch, HIVE-17006.02.patch, 
> HIVE-17006.patch, HIVE-17006.WIP.patch
>
>
> There are multiple options to do Parquet caching in LLAP:
> 1) Full elevator (too intrusive for now).
> 2) Page based cache like ORC (requires some changes to Parquet or 
> copy-pasted).
> 3) Cache disk data on column chunk level as is.
> Given that Parquet reads at column chunk granularity, (2) is not as useful as 
> for ORC, but still a good idea. I messaged the dev list about it but didn't 
> get a response, we may follow up later.
> For now, do (3). 





[jira] [Commented] (HIVE-17291) Set the number of executors based on config if client does not provide information

2017-08-14 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126125#comment-16126125
 ] 

Xuefu Zhang commented on HIVE-17291:


For test result stability, I'm fine with whatever solution. It would be 
interesting to find out why this is an issue even with Rui's magic in QTestUtil.

As to production, most likely dynamic allocation is enabled for sharing 
resources among users. The problem I see with dynamically determining 
parallelism with available executors is the dynamic nature of the executors. 
Typically, each user or user group has a queue, which is specified when a query 
is submitted. The capacity of the queue is decided in YARN. Further, the 
availability of the yarn containers is not guaranteed, and this is apparent 
when the cluster is busy. In this mode, using available executors can 
underestimate the number of reducers needed. This also happens when Spark 
client is initially launched. In addition, with dynamic allocation, 
minExecutors (usually 0) and initialExecutors (usually a small number) are not 
guaranteed either, and maxExecutors (usually a large number) only puts an upper 
limit. In a production env, users are less likely to override what the admin sets 
as default. Given such uncertainty, I don't think we should determine 
parallelism based on available executors. I'd propose that we use "size per 
reducer" to decide the number of reducers, which might be further constrained 
under what maxExecutors allows.

Static allocation is mostly useful for benchmarking and is less likely to be 
used in a multi-tenant env. Even in this mode, {{spark.executor.instances}} is 
not guaranteed. However, once the client gets an executor, it never gives it 
up. Thus, it's useful to determine the number of reducers using available 
executors. This comes with a catch: the first query may run while the 
executors are still starting. For this case, I think it should be okay to use 
{{spark.executor.instances}}. In short, with static allocation, we can use 
available executors to determine reducer parallelism and fall back to 
{{spark.executor.instances}} when info about available executors is not 
available.
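
To make the proposal concrete, here is a minimal sketch of the two rules. It 
is not Hive code: the class, method and parameter names are made up for 
illustration, and how maxExecutors translates into a reducer cap is left to 
the caller as an assumption.

```java
// Hypothetical helper, not part of Hive or Spark.
final class ReducerParallelism {
    private ReducerParallelism() {}

    // Dynamic allocation: size-based estimate, capped by a limit that the
    // caller derives from maxExecutors (that derivation is assumed here).
    static int reducersBySize(long totalInputBytes, long bytesPerReducer, int maxReducers) {
        if (totalInputBytes < 0 || bytesPerReducer <= 0 || maxReducers <= 0) {
            throw new IllegalArgumentException("sizes and cap must be positive");
        }
        // Ceiling division: one reducer per started chunk of bytesPerReducer.
        long bySize = (totalInputBytes + bytesPerReducer - 1) / bytesPerReducer;
        // Always at least one reducer, never more than the cap.
        return (int) Math.max(1L, Math.min(bySize, (long) maxReducers));
    }

    // Static allocation: prefer the executors actually available and fall
    // back to the configured spark.executor.instances when none are reported.
    static int executorsForStaticAllocation(int availableExecutors, int configuredInstances) {
        return availableExecutors > 0 ? availableExecutors : configuredInstances;
    }
}
```

With a 256-byte "size per reducer", 1000 bytes of input gives 4 reducers, and 
the cap only matters once the size-based estimate exceeds it.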

Any thoughts?

> Set the number of executors based on config if client does not provide 
> information
> --
>
> Key: HIVE-17291
> URL: https://issues.apache.org/jira/browse/HIVE-17291
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-17291.1.patch
>
>
> When calculating the memory and cores and the client does not provide 
> information we should try to use the one provided by default. This can happen 
> on startup, when {{spark.dynamicAllocation.enabled}} is not enabled





[jira] [Updated] (HIVE-17241) Change metastore classes to not use the shims

2017-08-14 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17241:
--
Status: Open  (was: Patch Available)

The failures of TestSSLWithMiniKdc and TestJdbcWithMiniHS2 look genuine.

> Change metastore classes to not use the shims
> -
>
> Key: HIVE-17241
> URL: https://issues.apache.org/jira/browse/HIVE-17241
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17241.patch
>
>
> As part of moving the metastore into a standalone package, it will no longer 
> have access to the shims.  This means we need to either copy them or access 
> the underlying Hadoop operations directly.





[jira] [Commented] (HIVE-17303) Missmatch between roaring bitmap library used by druid and the one coming from tez

2017-08-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126033#comment-16126033
 ] 

Ashutosh Chauhan commented on HIVE-17303:
-

+1

> Missmatch between roaring bitmap library used by druid and the one coming 
> from tez
> --
>
> Key: HIVE-17303
> URL: https://issues.apache.org/jira/browse/HIVE-17303
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-17303.patch
>
>
> {code} 
>  
> Caused by: java.util.concurrent.ExecutionException: 
> java.lang.NoSuchMethodError: 
> org.roaringbitmap.buffer.MutableRoaringBitmap.runOptimize()Z
>   at 
> org.apache.hive.druid.com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
>   at 
> org.apache.hive.druid.com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
>   at 
> org.apache.hive.druid.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
>   at 
> org.apache.hadoop.hive.druid.io.DruidRecordWriter.pushSegments(DruidRecordWriter.java:165)
>   ... 25 more
> Caused by: java.lang.NoSuchMethodError: 
> org.roaringbitmap.buffer.MutableRoaringBitmap.runOptimize()Z
>   at 
> org.apache.hive.druid.com.metamx.collections.bitmap.WrappedRoaringBitmap.toImmutableBitmap(WrappedRoaringBitmap.java:65)
>   at 
> org.apache.hive.druid.com.metamx.collections.bitmap.RoaringBitmapFactory.makeImmutableBitmap(RoaringBitmapFactory.java:88)
>   at 
> org.apache.hive.druid.io.druid.segment.StringDimensionMergerV9.writeIndexes(StringDimensionMergerV9.java:348)
>   at 
> org.apache.hive.druid.io.druid.segment.IndexMergerV9.makeIndexFiles(IndexMergerV9.java:218)
>   at 
> org.apache.hive.druid.io.druid.segment.IndexMerger.merge(IndexMerger.java:438)
>   at 
> org.apache.hive.druid.io.druid.segment.IndexMerger.persist(IndexMerger.java:186)
>   at 
> org.apache.hive.druid.io.druid.segment.IndexMerger.persist(IndexMerger.java:152)
>   at 
> org.apache.hive.druid.io.druid.segment.realtime.appenderator.AppenderatorImpl.persistHydrant(AppenderatorImpl.java:996)
>   at 
> org.apache.hive.druid.io.druid.segment.realtime.appenderator.AppenderatorImpl.access$200(AppenderatorImpl.java:93)
>   at 
> org.apache.hive.druid.io.druid.segment.realtime.appenderator.AppenderatorImpl$2.doCall(AppenderatorImpl.java:385)
>   at 
> org.apache.hive.druid.io.druid.common.guava.ThreadRenamingCallable.call(ThreadRenamingCallable.java:44)
>   ... 4 more
> ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 
> killedTasks:89, Vertex vertex_1502470020457_0005_12_05 [Reducer 2] 
> killed/failed due to:OWN_TASK_FAILURE]DAG did not succeed due to 
> VERTEX_FAILURE. failedVertices:1 killedVertices:0 (state=08S01,code=2)
> {code}





[jira] [Commented] (HIVE-17300) WebUI query plan graphs

2017-08-14 Thread Karen Coppage (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126002#comment-16126002
 ] 

Karen Coppage commented on HIVE-17300:
--

Done, thanks very much for the reminder, [~xuefuz]!

> WebUI query plan graphs
> ---
>
> Key: HIVE-17300
> URL: https://issues.apache.org/jira/browse/HIVE-17300
> Project: Hive
>  Issue Type: Improvement
>  Components: Web UI
>Reporter: Karen Coppage
>Assignee: Karen Coppage
> Attachments: complete_success.png, full_mapred_stats.png, 
> graph_with_mapred_stats.png, HIVE-17300.patch, last_stage_error.png, 
> last_stage_running.png, non_mapred_task_selected.png
>
>
> Hi all,
> I’m working on a feature of the Hive WebUI Query Plan tab that would provide 
> the option to display the query plan as a nice graph (scroll down for 
> screenshots). If you click on one of the graph’s stages, the plan for that 
> stage appears as text below. 
> Stages are color-coded if they have a status (Success, Error, Running), and 
> the rest are grayed out. Coloring is based on status already available in the 
> WebUI, under the Stages tab.
> There is an additional option to display stats for MapReduce tasks. This 
> includes the job’s ID, tracking URL (where the logs are found), and mapper 
> and reducer numbers/progress, among other info. 
> The library I’m using for the graph is called vis.js (http://visjs.org/). It 
> has an Apache license, and the only necessary file to be included from this 
> library is about 700 KB.
> I tried to keep server-side changes minimal, and graph generation is taken 
> care of by the client. Plans with more than a given number of stages 
> (default: 25) won't be displayed in order to preserve resources.
> I’d love to hear any and all input from the community about this feature: do 
> you think it’s useful, and is there anything important I’m missing?
> Thanks,
> Karen Coppage





[jira] [Updated] (HIVE-17300) WebUI query plan graphs

2017-08-14 Thread Karen Coppage (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-17300:
-
Status: Patch Available  (was: Open)



[jira] [Commented] (HIVE-17300) WebUI query plan graphs

2017-08-14 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125996#comment-16125996
 ] 

Xuefu Zhang commented on HIVE-17300:


[~klcopp], Thanks for your patch. Could you please change the status to "Patch 
available" so that tests can run on your patch? Thanks.



[jira] [Commented] (HIVE-17006) LLAP: Parquet caching

2017-08-14 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125994#comment-16125994
 ] 

Gunther Hagleitner commented on HIVE-17006:
---

{quote}
  // TODO: we currently pass null counters because this doesn't use LlapRecordReader.
  //       Create counters for non-elevator-using fragments also?
{quote}

What counters are those?

> LLAP: Parquet caching
> -
>
> Key: HIVE-17006
> URL: https://issues.apache.org/jira/browse/HIVE-17006
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17006.01.patch, HIVE-17006.02.patch, 
> HIVE-17006.patch, HIVE-17006.WIP.patch
>
>
> There are multiple options to do Parquet caching in LLAP:
> 1) Full elevator (too intrusive for now).
> 2) Page based cache like ORC (requires some changes to Parquet or 
> copy-pasted).
> 3) Cache disk data on column chunk level as is.
> Given that Parquet reads at column chunk granularity, (2) is not as useful as 
> for ORC, but still a good idea. I messaged the dev list about it but didn't 
> get a response, we may follow up later.
> For now, do (3). 





[jira] [Commented] (HIVE-17309) alter partition onto a table not in current database throw InvalidOperationException

2017-08-14 Thread Wang Haihua (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125944#comment-16125944
 ] 

Wang Haihua commented on HIVE-17309:


The failed tests seem unrelated to this patch. cc [~pxiong] 
[~alangates], thanks in advance for the review.

> alter partition onto a table not in current database throw 
> InvalidOperationException
> 
>
> Key: HIVE-17309
> URL: https://issues.apache.org/jira/browse/HIVE-17309
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.2, 2.1.1, 2.2.0
>Reporter: Wang Haihua
>Assignee: Wang Haihua
> Attachments: HIVE-17309.1.patch
>
>
> When altering a partition of a table that is not in the current database, an 
> InvalidOperationException is thrown.
> SQL example:
> {code}
> use default;
> ALTER TABLE anotherdb.test_table_for_alter_partition_nocurrentdb 
> partition(ds='haihua001') CHANGE COLUMN a a_new BOOLEAN;
> {code}
> The following code in {{DDLTask.java}} looks like the problem: it does not pass 
> the database-qualified table name when {{db.alterPartitions}} is called.
> {code}
>   if (allPartitions == null) {
>     db.alterTable(alterTbl.getOldName(), tbl, alterTbl.getIsCascade(), alterTbl.getEnvironmentContext());
>   } else {
>     db.alterPartitions(tbl.getTableName(), allPartitions, alterTbl.getEnvironmentContext());
>   }
> {code}
> stacktrace:
> {code}
> 2017-07-19T11:06:39,639  INFO [main] metastore.HiveMetaStore: New partition 
> values:[2017-07-14]
> 2017-07-19T11:06:39,654 ERROR [main] metastore.RetryingHMSHandler: 
> InvalidOperationException(message:alter is not possible)
> at 
> org.apache.hadoop.hive.metastore.HiveAlterHandler.alterPartitions(HiveAlterHandler.java:526)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_partitions_with_environment_context(HiveMetaStore.java:3560)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
> at 
> com.sun.proxy.$Proxy21.alter_partitions_with_environment_context(Unknown 
> Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_partitions(HiveMetaStoreClient.java:1486)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154)
> at com.sun.proxy.$Proxy22.alter_partitions(Unknown Source)
> at org.apache.hadoop.hive.ql.metadata.Hive.alterPartitions(Hive.java:712)
> at org.apache.hadoop.hive.ql.exec.DDLTask.alterTable(DDLTask.java:3338)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:368)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2166)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1837)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1713)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1543)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1174)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1164)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
> at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.j
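
The qualified-name problem reported in HIVE-17309 can be illustrated with a 
small self-contained sketch. This is not Hive's API: the class and both 
methods are hypothetical, and the resolution rule (an unqualified name is 
looked up in the current database) is an assumption based on the description.

```java
// Hypothetical illustration, not Hive code.
final class QualifiedTableName {
    private QualifiedTableName() {}

    // Builds a database-qualified name such as "anotherdb.test_table".
    static String qualify(String dbName, String tableName) {
        return dbName + "." + tableName;
    }

    // Assumed lookup rule: a name without a dot is resolved against the
    // current database; a name with a dot is taken as already qualified.
    static String resolve(String currentDb, String tableName) {
        return tableName.contains(".") ? tableName : qualify(currentDb, tableName);
    }
}
```

With {{use default;}} in effect, resolving the bare table name lands in 
{{default}}, while resolving the qualified name keeps {{anotherdb}}, which is 
why passing only the bare name can target the wrong table.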

[jira] [Commented] (HIVE-17292) Change TestMiniSparkOnYarnCliDriver test configuration to use the configured cores

2017-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125940#comment-16125940
 ] 

Hive QA commented on HIVE-17292:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12881771/HIVE-17292.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 11004 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6385/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6385/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6385/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12881771 - PreCommit-HIVE-Build

> Change TestMiniSparkOnYarnCliDriver test configuration to use the configured 
> cores
> --
>
> Key: HIVE-17292
> URL: https://issues.apache.org/jira/browse/HIVE-17292
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark, Test
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-17292.1.patch, HIVE-17292.2.patch, 
> HIVE-17292.3.patch
>
>
> Currently the {{hive-site.xml}} for the {{TestMiniSparkOnYarnCliDriver}} test 
> defines 2 cores and 2 executors, but only 1 is used, because the MiniCluster 
> does not allow the creation of the 3rd container.
> The FairScheduler uses 1GB increments for memory, but the containers would 
> like to use only 512MB. We should change the FairScheduler configuration to 
> use only the requested 512MB.
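
The effect of the allocation increment can be checked with a little 
arithmetic. The sketch below is only an illustration: the helper names are 
made up, and the 2048MB of cluster memory is an assumed figure, not a value 
taken from the test configuration.

```java
// Hypothetical arithmetic helper, not YARN code.
final class ContainerSizing {
    private ContainerSizing() {}

    // Rounds a memory request up to the next multiple of the increment.
    static int roundUpToIncrement(int requestedMb, int incrementMb) {
        return ((requestedMb + incrementMb - 1) / incrementMb) * incrementMb;
    }

    // Number of equally sized containers that fit into the cluster memory
    // once each request has been rounded up.
    static int containersThatFit(int clusterMb, int requestedMb, int incrementMb) {
        return clusterMb / roundUpToIncrement(requestedMb, incrementMb);
    }
}
```

Under that assumption, a 512MB request is rounded up to 1024MB with a 1GB 
increment, so only 2 containers fit; with a 512MB increment 4 containers fit, 
which leaves room for a 3rd one.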





[jira] [Updated] (HIVE-17300) WebUI query plan graphs

2017-08-14 Thread Karen Coppage (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-17300:
-
Attachment: HIVE-17300.patch



[jira] [Commented] (HIVE-17291) Set the number of executors based on config if client does not provide information

2017-08-14 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125842#comment-16125842
 ] 

Peter Vary commented on HIVE-17291:
---

Thanks [~lirui], helpful as always!

What I do not get is why we do not see flakiness in the 
TestMiniSparkOnYarnCliDriver tests when we change a Spark configuration value 
and the Spark session is invalidated. Until now I thought that in this case new 
containers were created by YARN, and that we would have a race condition when 
checking {{hiveSparkClient.getExecutorCount()}} again. But obviously this is 
not the case, since I do not see any flakiness in the test results in HIVE-17292.

Thanks,
Peter



[jira] [Updated] (HIVE-17292) Change TestMiniSparkOnYarnCliDriver test configuration to use the configured cores

2017-08-14 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-17292:
--
Attachment: HIVE-17292.3.patch

Changed the test so that we expect a different number of executors on YARN and in standalone mode.



[jira] [Updated] (HIVE-17306) Support MySQL InnoDB Cluster

2017-08-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17306:
--
Component/s: Transactions

> Support MySQL InnoDB Cluster
> 
>
> Key: HIVE-17306
> URL: https://issues.apache.org/jira/browse/HIVE-17306
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Shawn Weeks
>Priority: Minor
>
> To support high availability of the Hive Metastore, a highly available 
> database is required. To support the MySQL InnoDB Cluster it looks like we're 
> just missing a couple primary keys as we were already using InnoDB tables for 
> the metastore. It looks like it's primarily the transaction tables that don't 
> have primary keys like TXN_COMPONENTS and COMPLETED_TXN_COMPONENTS. The 
> primary keys can be surrogate sequences if there really is no unique 
> identifier in these tables.





[jira] [Work started] (HIVE-17316) Use regular expressions for the hidden configuration variables

2017-08-14 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17316 started by Barna Zsombor Klara.
--
> Use regular expressions for the hidden configuration variables
> --
>
> Key: HIVE-17316
> URL: https://issues.apache.org/jira/browse/HIVE-17316
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>
> Currently HiveConf variables which should not be displayed to the user need 
> to be enumerated. We should enhance this so that regular expressions can be 
> set, and any variable matching one of them is hidden.
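
A minimal sketch of the idea, assuming the hidden list becomes a list of 
regular expressions that must match the whole variable name. The class name, 
the patterns and the property names used with it are illustrative only and 
are not taken from HiveConf.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical matcher, not the HiveConf implementation.
final class HiddenConfigMatcher {
    private final List<Pattern> patterns;

    HiddenConfigMatcher(List<String> regexes) {
        // Compile once so that each lookup only runs the matchers.
        this.patterns = regexes.stream().map(Pattern::compile).collect(Collectors.toList());
    }

    // A variable is hidden when any pattern matches its full name.
    boolean isHidden(String variableName) {
        for (Pattern p : patterns) {
            if (p.matcher(variableName).matches()) {
                return true;
            }
        }
        return false;
    }
}
```

For example, with the patterns {{hikari\..*}} and {{.*\.password}}, a name 
such as hikari.maximumPoolSize or dbcp.password would be hidden without being 
enumerated, while hive.exec.parallel would stay visible.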





[jira] [Assigned] (HIVE-17316) Use regular expressions for the hidden configuration variables

2017-08-14 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara reassigned HIVE-17316:
--




[jira] [Assigned] (HIVE-17315) Make the DataSource used by the DataNucleus in the HMS configurable using Hive properties

2017-08-14 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara reassigned HIVE-17315:
--




[jira] [Commented] (HIVE-17292) Change TestMiniSparkOnYarnCliDriver test configuration to use the configured cores

2017-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125780#comment-16125780
 ] 

Hive QA commented on HIVE-17292:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12881744/HIVE-17292.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 51 failed/errored test(s), 10413 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=101)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=102)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=103)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=104)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=105)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=106)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=107)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=108)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=109)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=110)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=111)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=112)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=113)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=114)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=115)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=116)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=117)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=118)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=120)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=121)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=122)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=123)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=124)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=125)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=126)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=127)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=128)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=129)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=130)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=131)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=132)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliD

[jira] [Updated] (HIVE-17292) Change TestMiniSparkOnYarnCliDriver test configuration to use the configured cores

2017-08-14 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-17292:
--
Attachment: HIVE-17292.2.patch

Moved the config change to the Shim, as suggested by [~lirui].
Also updated QTestUtil so that it waits until all of the executors are 
ready.
There might still be a problem if the configuration is changed with a set 
command inside the test file. We will see from the results.

Updated the necessary q.out files.

> Change TestMiniSparkOnYarnCliDriver test configuration to use the configured 
> cores
> --
>
> Key: HIVE-17292
> URL: https://issues.apache.org/jira/browse/HIVE-17292
> Project: Hive
> Issue Type: Sub-task
> Components: Spark, Test
> Affects Versions: 3.0.0
> Reporter: Peter Vary
> Assignee: Peter Vary
> Attachments: HIVE-17292.1.patch, HIVE-17292.2.patch
>
>
> Currently the {{hive-site.xml}} for the {{TestMiniSparkOnYarnCliDriver}} test 
> defines 2 cores and 2 executors, but only 1 is used, because the MiniCluster 
> does not allow the creation of the 3rd container.
> The FairScheduler uses 1GB increments for memory, but the containers would 
> like to use only 512MB. We should change the FairScheduler configuration to 
> use only the requested 512MB.
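
The rounding effect described in the issue can be illustrated with a small sketch. It assumes the scheduler normalizes each container request up to the next multiple of its memory increment; the class and method names are hypothetical, and only the 1GB and 512MB figures come from the issue text.

```java
// Hypothetical illustration; not taken from the YARN or Hive sources.
final class IncrementRoundingSketch {

  /** Rounds a request up to the next multiple of the increment (both in MB). */
  static int roundUpToIncrement(int requestedMb, int incrementMb) {
    if (requestedMb <= 0 || incrementMb <= 0) {
      throw new IllegalArgumentException("sizes must be positive");
    }
    // Integer ceiling division, then scale back to MB.
    return ((requestedMb + incrementMb - 1) / incrementMb) * incrementMb;
  }

  public static void main(String[] args) {
    // With a 1GB increment, a 512MB request occupies a full 1024MB.
    System.out.println(roundUpToIncrement(512, 1024));
    // With a 512MB increment, the same request occupies only 512MB.
    System.out.println(roundUpToIncrement(512, 512));
  }
}
```

Under this assumption, halving the increment lets twice as many 512MB containers fit into the same amount of cluster memory.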



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17313) Potentially possible 'case fall through' in the ObjectInspectorConverters

2017-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125627#comment-16125627
 ] 

Hive QA commented on HIVE-17313:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12881721/HIVE-17313.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 11004 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteDate2 (batchId=183)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6383/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6383/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6383/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12881721 - PreCommit-HIVE-Build

> Potentially possible 'case fall through' in the ObjectInspectorConverters
> -
>
> Key: HIVE-17313
> URL: https://issues.apache.org/jira/browse/HIVE-17313
> Project: Hive
> Issue Type: Bug
> Reporter: Oleg Danilov
> Assignee: Oleg Danilov
> Priority: Trivial
> Attachments: HIVE-17313.patch
>
>
> Lines 103-110:
> {code:java}
> case STRING:
>   if (outputOI instanceof WritableStringObjectInspector) {
>     return new PrimitiveObjectInspectorConverter.TextConverter(
>         inputOI);
>   } else if (outputOI instanceof JavaStringObjectInspector) {
>     return new PrimitiveObjectInspectorConverter.StringConverter(
>         inputOI);
>   }
> case CHAR:
> {code}
> De facto, it should work correctly, since outputOI is either an instance of 
> WritableStringObjectInspector or JavaStringObjectInspector, but it would be 
> better to rewrite this case to avoid a possible fall-through.
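
One way to remove the implicit fall-through is sketched here with stand-in types rather than the Hive classes. The enums and the returned converter names are hypothetical, and throwing on an unexpected inspector is an assumption, not necessarily what the attached patch does.

```java
// Hypothetical stand-ins; the real code works with ObjectInspector instances.
final class FallThroughSketch {
  enum Category { STRING, CHAR }
  enum OutputKind { WRITABLE_STRING, JAVA_STRING, OTHER }

  static String converterFor(Category category, OutputKind output) {
    switch (category) {
      case STRING:
        if (output == OutputKind.WRITABLE_STRING) {
          return "TextConverter";
        } else if (output == OutputKind.JAVA_STRING) {
          return "StringConverter";
        }
        // Explicit exit: without it, control would continue into case CHAR.
        throw new IllegalStateException(
            "Unexpected output inspector for STRING: " + output);
      case CHAR:
        return "CharConverter";
      default:
        throw new IllegalStateException("Unsupported category: " + category);
    }
  }
}
```

With the explicit `throw`, an unexpected inspector for STRING fails loudly instead of silently receiving the CHAR converter.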





[jira] [Updated] (HIVE-17260) Typo: exception has been created and lost in the ThriftJDBCBinarySerDe

2017-08-14 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-17260:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master.
Thanks [~olegd]!

> Typo: exception has been created and lost in the ThriftJDBCBinarySerDe
> --
>
> Key: HIVE-17260
> URL: https://issues.apache.org/jira/browse/HIVE-17260
> Project: Hive
> Issue Type: Bug
> Reporter: Oleg Danilov
> Assignee: Oleg Danilov
> Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-17260.patch
>
>
> Line 100:
> {code:java}
> } catch (Exception e) {
>   new SerDeException(e);
> }
> {code}
> Seems like it should be thrown there :-)
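
The difference between constructing an exception and throwing it can be shown with a minimal sketch. `SketchSerDeException` is a hypothetical stand-in for Hive's `SerDeException`, and the parsing methods exist only for illustration.

```java
final class LostExceptionSketch {

  /** Hypothetical stand-in for Hive's SerDeException. */
  static class SketchSerDeException extends Exception {
    SketchSerDeException(Throwable cause) {
      super(cause);
    }
  }

  /** Mirrors the reported pattern: the exception object is built and dropped. */
  static int parseLost(String s) {
    try {
      return Integer.parseInt(s);
    } catch (Exception e) {
      new SketchSerDeException(e); // created but never thrown
    }
    return -1; // execution silently continues here
  }

  /** The presumably intended behaviour: the wrapped exception is thrown. */
  static int parseThrown(String s) throws SketchSerDeException {
    try {
      return Integer.parseInt(s);
    } catch (Exception e) {
      throw new SketchSerDeException(e);
    }
  }
}
```

The first method compiles without complaint, which is why this kind of typo tends to go unnoticed until a failure is swallowed at runtime.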





[jira] [Assigned] (HIVE-17260) Typo: exception has been created and lost in the ThriftJDBCBinarySerDe

2017-08-14 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reassigned HIVE-17260:
-

Assignee: Oleg Danilov

> Typo: exception has been created and lost in the ThriftJDBCBinarySerDe
> --
>
> Key: HIVE-17260
> URL: https://issues.apache.org/jira/browse/HIVE-17260
> Project: Hive
> Issue Type: Bug
> Reporter: Oleg Danilov
> Assignee: Oleg Danilov
> Priority: Minor
> Attachments: HIVE-17260.patch
>
>
> Line 100:
> {code:java}
> } catch (Exception e) {
>   new SerDeException(e);
> }
> {code}
> Seems like it should be thrown there :-)





[jira] [Updated] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-14 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17100:

Description: 
It is necessary to log the progress of the replication tasks in a structured 
manner as follows.
*+Bootstrap Dump:+*
* At the start of bootstrap dump, will add one log with below details.
{color:#59afe1}* Database Name
* Dump Type (BOOTSTRAP)
* (Estimated) Total number of tables/views to dump
* (Estimated) Total number of functions to dump.
* Dump Start Time{color}
* After each table dump, will add a log as follows
{color:#59afe1}* Table/View Name
* Type (TABLE/VIEW/MATERIALIZED_VIEW)
* Table dump end time
* Table dump progress. Format is Table sequence no/(Estimated) Total number of 
tables and views.{color}
* After each function dump, will add a log as follows
{color:#59afe1}* Function Name
* Function dump end time
* Function dump progress. Format is Function sequence no/(Estimated) Total 
number of functions.{color}
* After completion of all dumps, will add a log as follows to consolidate the 
dump.
{color:#59afe1}* Database Name.
* Dump Type (BOOTSTRAP).
* Dump End Time.
* (Actual) Total number of tables/views dumped.
* (Actual) Total number of functions dumped.
* Dump Directory.
* Last Repl ID of the dump.{color}
*Note:* The actual and estimated number of tables/functions may not match if 
any table/function is dropped while the dump is in progress.

*+Bootstrap Load:+*
* At the start of bootstrap load, will add one log with below details.
{color:#59afe1}* Database Name
* Dump directory
* Load Type (BOOTSTRAP)
* Total number of tables/views to load
* Total number of functions to load.
* Load Start Time{color}
* After each table load, will add a log as follows
{color:#59afe1}* Table/View Name
* Type (TABLE/VIEW/MATERIALIZED_VIEW)
* Table load completion time
* Table load progress. Format is Table sequence no/Total number of tables and 
views.{color}
* After each function load, will add a log as follows
{color:#59afe1}* Function Name
* Function load completion time
* Function load progress. Format is Function sequence no/Total number of 
functions.{color}
* After completion of all loads, will add a log as follows to consolidate the 
load.
{color:#59afe1}* Database Name.
* Load Type (BOOTSTRAP).
* Load End Time.
* Total number of tables/views loaded.
* Total number of functions loaded.
* Last Repl ID of the loaded database.{color}

*+Incremental Dump:+*
* At the start of database dump, will add one log with below details.
{color:#59afe1}* Database Name
* Dump Type (INCREMENTAL)
* (Estimated) Total number of events to dump.
* Dump Start Time{color}
* After each event dump, will add a log as follows
{color:#59afe1}* Event ID
* Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
* Event dump end time
* Event dump progress. Format is Event sequence no/ (Estimated) Total number of 
events.{color}
* After completion of all event dumps, will add a log as follows.
{color:#59afe1}* Database Name.
* Dump Type (INCREMENTAL).
* Dump End Time.
* (Actual) Total number of events dumped.
* Dump Directory.
* Last Repl ID of the dump.{color}
*Note:* The estimated number of events can differ greatly from the actual 
number, as we don’t know the number of events up front until we read from the 
metastore NotificationEvents table.

*+Incremental Load:+*
* At the start of incremental load, will add one log with below details.
{color:#59afe1}* Target Database Name 
* Dump directory
* Load Type (INCREMENTAL)
* Total number of events to load
* Load Start Time{color}
* After each event load, will add a log as follows
{color:#59afe1}* Event ID
* Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
* Event load end time
* Event load progress. Format is Event sequence no/ Total number of 
events.{color}
* After completion of all event loads, will add a log as follows to consolidate 
the load.
{color:#59afe1}* Target Database Name.
* Load Type (INCREMENTAL).
* Load End Time.
* Total number of events loaded.
* Last Repl ID of the loaded database.{color}
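
The "sequence no/total" progress format described above can be sketched as follows. This is only an illustration: the class, method and field names are hypothetical, and the key=value layout is an assumption, since the description lists the fields but not the log format.

```java
// Hypothetical sketch of a per-table dump log line; not the Hive implementation.
final class ReplProgressSketch {

  /** Formats progress as "sequence no/total", as the description specifies. */
  static String progress(long sequenceNo, long total) {
    return sequenceNo + "/" + total;
  }

  /** Builds one log line per dumped table; field names are assumed. */
  static String tableDumpLog(String tableName, String tableType,
                             long dumpEndTime, long sequenceNo,
                             long estimatedTotal) {
    return String.format(
        "tableName=%s, tableType=%s, dumpEndTime=%d, dumpProgress=%s",
        tableName, tableType, dumpEndTime,
        progress(sequenceNo, estimatedTotal));
  }

  public static void main(String[] args) {
    System.out.println(tableDumpLog("t1", "TABLE", 1502700000L, 1, 5));
  }
}
```

The same `progress` helper would serve the function and event logs, since they share the sequence/total format.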

  was:
It is necessary to log the progress of the replication tasks in a structured 
manner as follows.
*+Bootstrap Dump:+*
* At the start of bootstrap dump, will add one log with below details.
{color:#59afe1}* Database Name
* Dump Type (BOOTSTRAP)
* (Estimated) Total number of tables/views to dump
* (Estimated) Total number of functions to dump.
* Dump Start Time{color}
* After each table dump, will add a log as follows
{color:#59afe1}* Table/View Name
* Type (TABLE/VIEW/MATERIALIZED_VIEW)
* Table dump end time
* Table dump progress. Format is Table sequence no/(Estimated) Total number of 
tables and views.{color}
* After each function dump, will add a log as follows
{color:#59afe1}* Function Name
* Function dump end time
* Function dump progress. Format is Function sequence no/(Estimated) Total 
number of functions.{color}
* After completion of all dumps, will 

[jira] [Commented] (HIVE-17311) Numeric overflow in the HiveConf

2017-08-14 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125581#comment-16125581
 ] 

Peter Vary commented on HIVE-17311:
---

LGTM +1

> Numeric overflow in the HiveConf
> 
>
> Key: HIVE-17311
> URL: https://issues.apache.org/jira/browse/HIVE-17311
> Project: Hive
> Issue Type: Bug
> Reporter: Oleg Danilov
> Assignee: Oleg Danilov
> Priority: Minor
> Attachments: HIVE-17311.patch
>
>
> multiplierFor() method contains a typo, which causes wrong parsing of the 
> rare suffixes ('tb' & 'pb').
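
The issue text does not show the typo itself. The sketch below assumes, based on the issue title, that the multiplier for large suffixes was computed with `int` literals; the class and method names are hypothetical.

```java
// Hypothetical illustration of int overflow in a unit multiplier.
final class MultiplierOverflowSketch {

  /** All operands are int, so the product wraps around before widening. */
  static long tbWithIntArithmetic() {
    return 1024 * 1024 * 1024 * 1024;
  }

  /** The first operand is long, so the whole product is computed as long. */
  static long tbWithLongArithmetic() {
    return 1024L * 1024 * 1024 * 1024;
  }

  public static void main(String[] args) {
    System.out.println(tbWithIntArithmetic());  // wrapped value, not 2^40
    System.out.println(tbWithLongArithmetic()); // 2^40
  }
}
```

Suffixes up to 'gb' stay within the `int` range, which would explain why only the rare 'tb' and 'pb' suffixes are affected.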





[jira] [Commented] (HIVE-17311) Numeric overflow in the HiveConf

2017-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125568#comment-16125568
 ] 

Hive QA commented on HIVE-17311:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12881707/HIVE-17311.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 11005 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpRetryOnServerIdleTimeout 
(batchId=228)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6382/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6382/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6382/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12881707 - PreCommit-HIVE-Build

> Numeric overflow in the HiveConf
> 
>
> Key: HIVE-17311
> URL: https://issues.apache.org/jira/browse/HIVE-17311
> Project: Hive
> Issue Type: Bug
> Reporter: Oleg Danilov
> Assignee: Oleg Danilov
> Priority: Minor
> Attachments: HIVE-17311.patch
>
>
> multiplierFor() method contains a typo, which causes wrong parsing of the 
> rare suffixes ('tb' & 'pb').





[jira] [Updated] (HIVE-17313) Potentially possible 'case fall through' in the ObjectInspectorConverters

2017-08-14 Thread Oleg Danilov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleg Danilov updated HIVE-17313:

Status: Patch Available  (was: Open)

> Potentially possible 'case fall through' in the ObjectInspectorConverters
> -
>
> Key: HIVE-17313
> URL: https://issues.apache.org/jira/browse/HIVE-17313
> Project: Hive
> Issue Type: Bug
> Reporter: Oleg Danilov
> Assignee: Oleg Danilov
> Priority: Trivial
> Attachments: HIVE-17313.patch
>
>
> Lines 103-110:
> {code:java}
> case STRING:
>   if (outputOI instanceof WritableStringObjectInspector) {
>     return new PrimitiveObjectInspectorConverter.TextConverter(
>         inputOI);
>   } else if (outputOI instanceof JavaStringObjectInspector) {
>     return new PrimitiveObjectInspectorConverter.StringConverter(
>         inputOI);
>   }
> case CHAR:
> {code}
> De facto, it should work correctly, since outputOI is either an instance of 
> WritableStringObjectInspector or JavaStringObjectInspector, but it would be 
> better to rewrite this case to avoid a possible fall-through.





[jira] [Updated] (HIVE-17313) Potentially possible 'case fall through' in the ObjectInspectorConverters

2017-08-14 Thread Oleg Danilov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleg Danilov updated HIVE-17313:

Attachment: HIVE-17313.patch

> Potentially possible 'case fall through' in the ObjectInspectorConverters
> -
>
> Key: HIVE-17313
> URL: https://issues.apache.org/jira/browse/HIVE-17313
> Project: Hive
> Issue Type: Bug
> Reporter: Oleg Danilov
> Assignee: Oleg Danilov
> Priority: Trivial
> Attachments: HIVE-17313.patch
>
>
> Lines 103-110:
> {code:java}
> case STRING:
>   if (outputOI instanceof WritableStringObjectInspector) {
>     return new PrimitiveObjectInspectorConverter.TextConverter(
>         inputOI);
>   } else if (outputOI instanceof JavaStringObjectInspector) {
>     return new PrimitiveObjectInspectorConverter.StringConverter(
>         inputOI);
>   }
> case CHAR:
> {code}
> De facto, it should work correctly, since outputOI is either an instance of 
> WritableStringObjectInspector or JavaStringObjectInspector, but it would be 
> better to rewrite this case to avoid a possible fall-through.





[jira] [Commented] (HIVE-17313) Potentially possible 'case fall through' in the ObjectInspectorConverters

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125563#comment-16125563
 ] 

ASF GitHub Bot commented on HIVE-17313:
---

GitHub user dosoft opened a pull request:

https://github.com/apache/hive/pull/230

HIVE-17313: Fixed 'case fall-through'



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dosoft/hive HIVE-17313

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/230.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #230


commit 2f322ea6d6a9b5754c0c9b698395d8b3e94563f7
Author: Oleg Danilov 
Date:   2017-08-14T11:28:47Z

HIVE-17313: Fixed 'case fall-through'




> Potentially possible 'case fall through' in the ObjectInspectorConverters
> -
>
> Key: HIVE-17313
> URL: https://issues.apache.org/jira/browse/HIVE-17313
> Project: Hive
> Issue Type: Bug
> Reporter: Oleg Danilov
> Assignee: Oleg Danilov
> Priority: Trivial
>
> Lines 103-110:
> {code:java}
> case STRING:
>   if (outputOI instanceof WritableStringObjectInspector) {
>     return new PrimitiveObjectInspectorConverter.TextConverter(
>         inputOI);
>   } else if (outputOI instanceof JavaStringObjectInspector) {
>     return new PrimitiveObjectInspectorConverter.StringConverter(
>         inputOI);
>   }
> case CHAR:
> {code}
> De facto, it should work correctly, since outputOI is either an instance of 
> WritableStringObjectInspector or JavaStringObjectInspector, but it would be 
> better to rewrite this case to avoid a possible fall-through.





[jira] [Assigned] (HIVE-17313) Potentially possible 'case fall through' in the ObjectInspectorConverters

2017-08-14 Thread Oleg Danilov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleg Danilov reassigned HIVE-17313:
---

Assignee: Oleg Danilov

> Potentially possible 'case fall through' in the ObjectInspectorConverters
> -
>
> Key: HIVE-17313
> URL: https://issues.apache.org/jira/browse/HIVE-17313
> Project: Hive
> Issue Type: Bug
> Reporter: Oleg Danilov
> Assignee: Oleg Danilov
> Priority: Trivial
>
> Lines 103-110:
> {code:java}
> case STRING:
>   if (outputOI instanceof WritableStringObjectInspector) {
>     return new PrimitiveObjectInspectorConverter.TextConverter(
>         inputOI);
>   } else if (outputOI instanceof JavaStringObjectInspector) {
>     return new PrimitiveObjectInspectorConverter.StringConverter(
>         inputOI);
>   }
> case CHAR:
> {code}
> De facto, it should work correctly, since outputOI is either an instance of 
> WritableStringObjectInspector or JavaStringObjectInspector, but it would be 
> better to rewrite this case to avoid a possible fall-through.




