[ 
https://issues.apache.org/jira/browse/HIVE-26564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Zhang updated HIVE-26564:
----------------------------
    Description: 
HIVE-24802 added OperationLogManager to support historical operation logs. 

OperationLogManager.createOperationLog creates the operation log inside the 
historical operation log dir if HIVE_SERVER2_HISTORIC_OPERATION_LOG_ENABLED=true. 
This is confusing, since at the session level SessionManager and HiveSession 
still use the original operation log session directory.

The proposed change is to separate a live query's operation log from the 
historical operation log. Upon operation close, OperationLogManager.closeOperation 
is called to move the operation log from the session directory to the historical 
log dir. OperationLogManager is then only responsible for cleaning up historical 
operation logs.
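
As a rough illustration of the move-on-close step described above, here is a minimal sketch in plain java.nio. It is not the code from the patch: the class OperationLogMover, the method moveToHistoric, and its parameters are hypothetical names chosen for this example, and historicRoot is assumed to already point at the per-server directory (operation_logs_historic/hs2hostname_startupTimestamp).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper for illustration; not Hive's OperationLogManager. */
final class OperationLogMover {
    private OperationLogMover() {}

    /**
     * Moves liveRoot/sessionId/queryId to historicRoot/sessionId/queryId
     * and returns the new location of the log file.
     */
    static Path moveToHistoric(Path liveRoot, Path historicRoot,
                               String sessionId, String queryId) throws IOException {
        Path source = liveRoot.resolve(sessionId).resolve(queryId);
        Path targetDir = historicRoot.resolve(sessionId);
        // Create the per-session historic directory on first use.
        Files.createDirectories(targetDir);
        // Replace any stale file left under the same query id.
        return Files.move(source, targetDir.resolve(queryId),
                StandardCopyOption.REPLACE_EXISTING);
    }
}
```

With this split, anything still under the live directory belongs to a running query, and a cleanup task only ever needs to scan the historic directory.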

This change also makes historical logs easier to manage. For example, a user 
who wants to persist historical logs can more easily tell live and historical 
operation logs apart.

 

Before this change, if HIVE_SERVER2_HISTORIC_OPERATION_LOG_ENABLED=true, the 
operation log layout is as follows: operation_logs_historic holds the operation 
logs of both live and historic queries.

```
/tmp/hive/
├── operation_logs
└── operation_logs_historic
    └── hs2hostname_startupTimestamp
        ├── session_id_1
        │   ├── hive_query_id_1
        │   ├── hive_query_id_2
        │   └── hive_query_id_3
        ├── session_id_2
        │   ├── hive_query_id_4
        │   └── hive_query_id_5
        ├── session_id_3
        │   └── hive_query_id_6
        └── session_id_4
            ├── hive_query_id_7
            └── hive_query_id_8
```

After this change, the operation logs of live queries are under <operation_logs> 
and historical ones are under <operation_logs_historic>:

/tmp/hive
├── operation_logs
│   ├── session_id_1
│   │   ├── hive_query_id_2
│   │   └── hive_query_id_3
│   └── session_id_4
│       └── hive_query_id_8
└── operation_logs_historic
    └── hs2hostname_startupTimestamp
        ├── session_id_1
        │   └── hive_query_id_1
        ├── session_id_2
        │   ├── hive_query_id_4
        │   └── hive_query_id_5
        ├── session_id_3
        │   └── hive_query_id_6
        └── session_id_4
            └── hive_query_id_7

 

 

 


> Separate query live operation log and historical operation log
> --------------------------------------------------------------
>
>                 Key: HIVE-26564
>                 URL: https://issues.apache.org/jira/browse/HIVE-26564
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>    Affects Versions: 4.0.0-alpha-2
>            Reporter: Yi Zhang
>            Assignee: Yi Zhang
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
