[jira] [Commented] (HIVE-26564) Separate query live operation log and historical operation log

2022-10-07 Thread Sun Chao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17614157#comment-17614157
 ] 

Sun Chao commented on HIVE-26564:
-

Thanks [~zabetak]. Yes, please create follow-up JIRAs for any further comments, 
and sorry I forgot to close this one :).

> Separate query live operation log and historical operation log
> --
>
> Key: HIVE-26564
> URL: https://issues.apache.org/jira/browse/HIVE-26564
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2
> Affects Versions: 4.0.0-alpha-2
> Reporter: Yi Zhang
> Assignee: Yi Zhang
> Priority: Minor
> Labels: pull-request-available
> Fix For: 4.0.0, 4.0.0-alpha-2
>
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> HIVE-24802 added OperationLogManager to support historical operation logs.
> When HIVE_SERVER2_HISTORIC_OPERATION_LOG_ENABLED=true,
> OperationLogManager.createOperationLog creates the operation log inside the
> historical operation log directory. This is confusing, because at the
> session level SessionManager and HiveSession still use the original
> operation log session directory.
> The proposed change separates a live query's operation log from the
> historical operation logs. When an operation closes,
> OperationLogManager.closeOperation is called to move the operation log from
> the session directory to the historical log directory. OperationLogManager
> is then responsible only for cleaning up historical operation logs.
> This change also makes historical logs easier to manage. For example, a
> user who wants to persist historical logs can easily tell live and
> historical operation logs apart.
>  
> Before this change, if HIVE_SERVER2_HISTORIC_OPERATION_LOG_ENABLED=true, the
> operation log layout is as follows: operation_logs_historic holds the
> operation logs of both live and historical queries.
> ```
> /tmp/hive/
> ├── operation_logs
> └── operation_logs_historic
>     └── hs2hostname_startupTimestamp
>         ├── session_id_1
>         │   ├── hive_query_id_1
>         │   ├── hive_query_id_2
>         │   └── hive_query_id_3
>         ├── session_id_2
>         │   ├── hive_query_id_4
>         │   └── hive_query_id_5
>         ├── session_id_3
>         │   └── hive_query_id_6
>         └── session_id_4
>             ├── hive_query_id_7
>             └── hive_query_id_8
> ```
> After this change, the operation logs of live queries and of historical
> queries are kept in separate directories:
> /tmp/hive
> ├── operation_logs
> │   ├── session_id_1
> │   │   ├── hive_query_id_2
> │   │   └── hive_query_id_3
> │   └── session_id_4
> │       └── hive_query_id_8
> └── operation_logs_historic
>     └── hs2hostname_startupTimestamp
>         ├── session_id_1
>         │   └── hive_query_id_1
>         ├── session_id_2
>         │   ├── hive_query_id_4
>         │   └── hive_query_id_5
>         ├── session_id_3
>         │   └── hive_query_id_6
>         └── session_id_4
>             └── hive_query_id_7
>  
>  
>  
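
The move-on-close behaviour described above can be illustrated with a minimal Java sketch. This is not the code from the Hive patch: the class name `OperationLogLayoutSketch`, the method `moveToHistoric`, and the use of a temporary directory are assumptions made for illustration, and only the directory names are taken from the example layout in the description.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration only; not the actual OperationLogManager code.
public class OperationLogLayoutSketch {

    /**
     * Moves one query's log from the live session directory to the historic
     * tree, keeping the same session_id/query_id layout.
     */
    static Path moveToHistoric(Path liveRoot, Path historicRoot,
                               String sessionId, String queryId) throws IOException {
        Path live = liveRoot.resolve(sessionId).resolve(queryId);
        Path historicSessionDir = historicRoot.resolve(sessionId);
        // The historic session directory may not exist yet.
        Files.createDirectories(historicSessionDir);
        return Files.move(live, historicSessionDir.resolve(queryId),
                StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // A temp directory stands in for /tmp/hive in the example layout.
        Path base = Files.createTempDirectory("hive-oplog-sketch");
        Path liveRoot = base.resolve("operation_logs");
        Path historicRoot = base.resolve("operation_logs_historic")
                .resolve("hs2hostname_startupTimestamp");

        // A running query writes its log under the live session directory.
        Path sessionDir = Files.createDirectories(liveRoot.resolve("session_id_1"));
        Files.writeString(sessionDir.resolve("hive_query_id_1"), "query log line\n");

        // When the operation closes, its log is moved to the historic tree.
        Path moved = moveToHistoric(liveRoot, historicRoot,
                "session_id_1", "hive_query_id_1");

        System.out.println("historic copy exists: " + Files.exists(moved));
        System.out.println("live copy exists: "
                + Files.exists(sessionDir.resolve("hive_query_id_1")));
    }
}
```

In this sketch the live directory only ever holds logs of operations that are still open, so a cleanup or archival job could work on the historic tree alone.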



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-26564) Separate query live operation log and historical operation log

2022-10-07 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17614058#comment-17614058
 ] 

Stamatis Zampetakis commented on HIVE-26564:


I see that the change was already merged, so for potential code improvements I 
will probably open a follow-up JIRA.

Thanks [~sunchao] for merging the change, [~yigress] for the PR, and [~dengzh] 
for the review. In the future, please remember to mark the JIRA ticket as 
resolved and always set the fix version; otherwise the release notes will be 
broken.

Fixed in 
https://github.com/apache/hive/commit/6efbcae38d6ef201eab6c5a4e425ac771b9cec12.








[jira] [Commented] (HIVE-26564) Separate query live operation log and historical operation log

2022-10-06 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17613727#comment-17613727
 ] 

Stamatis Zampetakis commented on HIVE-26564:


[~yigress] I started looking into the PR; probably I will finalize the first 
round of comments by tomorrow.






[jira] [Commented] (HIVE-26564) Separate query live operation log and historical operation log

2022-09-28 Thread Yi Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610790#comment-17610790
 ] 

Yi Zhang commented on HIVE-26564:
-

[~zabetak] I updated the description with example layouts before and after the 
change.






[jira] [Commented] (HIVE-26564) Separate query live operation log and historical operation log

2022-09-28 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610439#comment-17610439
 ] 

Stamatis Zampetakis commented on HIVE-26564:


[~yigress] Can you please add some examples to the description showing the 
structure of the log directories before and after the proposed changes? The 
current description is oriented towards developers and people familiar with 
operation logging, but JIRA tickets also become part of the release notes, so 
it helps to provide some examples more tailored to end users.



