[jira] [Commented] (HIVE-26564) Separate query live operation log and historical operation log
[ https://issues.apache.org/jira/browse/HIVE-26564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17614157#comment-17614157 ]

Sun Chao commented on HIVE-26564:
---------------------------------

Thanks [~zabetak]. Yes, please create follow-up JIRAs for any further comments, and sorry I forgot to close this one :).

> Separate query live operation log and historical operation log
> --------------------------------------------------------------
>
>                 Key: HIVE-26564
>                 URL: https://issues.apache.org/jira/browse/HIVE-26564
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>    Affects Versions: 4.0.0-alpha-2
>            Reporter: Yi Zhang
>            Assignee: Yi Zhang
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 4.0.0, 4.0.0-alpha-2
>
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> HIVE-24802 added OperationLogManager to support historical operation logs.
> OperationLogManager.createOperationLog creates the operation log inside the
> historical operation log dir if
> HIVE_SERVER2_HISTORIC_OPERATION_LOG_ENABLED=true. This is confusing, since at
> the session level, SessionManager and HiveSession use the original operation
> log session directory.
>
> The proposed change is to separate a live query's operation log from the
> historical operation log. Upon operation close,
> OperationLogManager.closeOperation is called to move the operation log from
> the session directory to the historical log dir. OperationLogManager is only
> responsible for cleaning up historical operation logs.
>
> This change also makes it easier to manage historical logs. For example, a
> user may want to persist historical logs, and it becomes easier to
> differentiate live and historical operation logs.
>
> Before this change, if HIVE_SERVER2_HISTORIC_OPERATION_LOG_ENABLED=true, the
> operation log layout is as follows: operation_logs_historic holds the
> operation logs of both live and historical queries.
> ```
> /tmp/hive/
> ├── operation_logs
> └── operation_logs_historic
>     └── hs2hostname_startupTimestamp
>         ├── session_id_1
>         │   ├── hive_query_id_1
>         │   ├── hive_query_id_2
>         │   └── hive_query_id_3
>         ├── session_id_2
>         │   ├── hive_query_id_4
>         │   └── hive_query_id_5
>         ├── session_id_3
>         │   └── hive_query_id_6
>         └── session_id_4
>             ├── hive_query_id_7
>             └── hive_query_id_8
> ```
> After this change, the live queries' operation logs and the historical ones
> are kept in separate directories:
>
> /tmp/hive
> ├── operation_logs
> │   ├── session_id_1
> │   │   ├── hive_query_id_2
> │   │   └── hive_query_id_3
> │   └── session_id_4
> │       └── hive_query_id_8
> └── operation_logs_historic
>     └── hs2hostname_startupTimestamp
>         ├── session_id_1
>         │   └── hive_query_id_1
>         ├── session_id_2
>         │   ├── hive_query_id_4
>         │   └── hive_query_id_5
>         ├── session_id_3
>         │   └── hive_query_id_6
>         └── session_id_4
>             └── hive_query_id_7

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
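
The move-on-close behaviour described in the issue can be illustrated with a small, self-contained Java sketch. This is not Hive's OperationLogManager code: the class name `OperationLogMoveSketch`, the method `moveToHistoric`, and its parameters are hypothetical, and the directory names are simply borrowed from the example layouts in the description. It only shows the idea of relocating a closed query's log from the live session directory into the historic tree, assuming one log file per query id.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Illustrative only; not Hive's OperationLogManager. All names are hypothetical. */
public class OperationLogMoveSketch {

    /**
     * Moves one query's log out of the live session directory and into
     * historicRoot/hs2Instance/sessionId/queryId, mirroring the "after" layout.
     */
    public static Path moveToHistoric(Path liveRoot, Path historicRoot, String hs2Instance,
                                      String sessionId, String queryId) throws IOException {
        Path liveLog = liveRoot.resolve(sessionId).resolve(queryId);
        Path historicSessionDir = historicRoot.resolve(hs2Instance).resolve(sessionId);
        // The historic session directory may not exist yet for this session.
        Files.createDirectories(historicSessionDir);
        return Files.move(liveLog, historicSessionDir.resolve(queryId),
                StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway layout under a temp dir instead of touching /tmp/hive.
        Path base = Files.createTempDirectory("hive-oplog-sketch");
        Path liveRoot = base.resolve("operation_logs");
        Path historicRoot = base.resolve("operation_logs_historic");
        Path sessionDir = Files.createDirectories(liveRoot.resolve("session_id_1"));
        Files.writeString(sessionDir.resolve("hive_query_id_1"), "query log line\n");

        // Simulate "operation close": the finished query's log leaves the live tree.
        Path moved = moveToHistoric(liveRoot, historicRoot,
                "hs2hostname_startupTimestamp", "session_id_1", "hive_query_id_1");
        System.out.println("moved to " + base.relativize(moved));
    }
}
```

With this split, anything still under `operation_logs` belongs to a running query, so a cleanup or archival job only needs to look at `operation_logs_historic`.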
[jira] [Commented] (HIVE-26564) Separate query live operation log and historical operation log
[ https://issues.apache.org/jira/browse/HIVE-26564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17614058#comment-17614058 ]

Stamatis Zampetakis commented on HIVE-26564:
--------------------------------------------

I see that the change was already merged so for potential code improvements I will probably open a follow up JIRA.

Thanks for merging the change [~sunchao], [~yigress] for the PR, and [~dengzh] for the review. In the future please remember to mark the JIRA ticket as resolved and always set the fix version otherwise the release notes will be broken.

Fixed in https://github.com/apache/hive/commit/6efbcae38d6ef201eab6c5a4e425ac771b9cec12.
[jira] [Commented] (HIVE-26564) Separate query live operation log and historical operation log
[ https://issues.apache.org/jira/browse/HIVE-26564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17613727#comment-17613727 ]

Stamatis Zampetakis commented on HIVE-26564:
--------------------------------------------

[~yigress] I started looking into the PR; probably I will finalize the first round of comments by tomorrow.
[jira] [Commented] (HIVE-26564) Separate query live operation log and historical operation log
[ https://issues.apache.org/jira/browse/HIVE-26564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17610790#comment-17610790 ]

Yi Zhang commented on HIVE-26564:
---------------------------------

[~zabetak] updated the description with example layouts before and after the change.
[jira] [Commented] (HIVE-26564) Separate query live operation log and historical operation log
[ https://issues.apache.org/jira/browse/HIVE-26564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17610439#comment-17610439 ]

Stamatis Zampetakis commented on HIVE-26564:
--------------------------------------------

[~yigress] Can you please add some examples in the description with the sample structure of the log directories before and after the proposed changes. The current description is oriented towards developers and people familiar with operation logging but JIRA tickets become also part of the release notes so it helps to provide some examples more tailored to end-users.