[jira] [Commented] (HIVE-26145) Disable notification cleaner if interval is zero

2022-05-10 Thread Janos Kovacs (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534522#comment-17534522
 ] 

Janos Kovacs commented on HIVE-26145:

OK, let's wait and see whether there is any other input.

As a side note: one of my ops colleagues pointed out that nobody who knows the 
cleaner well enough to change its configuration would set the interval to zero 
and expect a continuous run, because the notification log cleaner and the txn 
write notification cleaner both run in the same call. A continuous run would 
either 1) hardly complete within a sub-second cycle or 2) kill Hive's 
performance: even with hive.txn.readonly.enabled turned on, a generic use case 
generates notifications, so there would be a continuous lock on the backend DB 
(although I have another ticket to improve the indexes on these tables so that 
no table scan locks them in the RDBMS). Considering also that tools depend on 
the notification log's TTL, running the cleaner that frequently is even less 
common. 

> Disable notification cleaner if interval is zero
> 
>
> Key: HIVE-26145
> URL: https://issues.apache.org/jira/browse/HIVE-26145
> Project: Hive
> Issue Type: Sub-task
> Components: Metastore
> Reporter: Janos Kovacs
> Assignee: Janos Kovacs
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Many of the housekeeping/background tasks can be turned off when multiple 
> instances are running in parallel. 
> Some are controlled via the housekeeping node configuration, others are not 
> started if their frequency is set to zero.
> The DB-Notification cleaner unfortunately doesn't have this functionality, 
> which makes all instances race for the lock on the backend HMS database. 
> The goal is to add a change that allows turning the cleaner off when 
> multiple instances are running (i.e., to bind it to the housekeeping 
> instance).  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HIVE-26145) Disable notification cleaner if interval is zero

2022-05-10 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534398#comment-17534398
 ] 

Stamatis Zampetakis commented on HIVE-26145:


Thanks for taking the time to address my comments [~kovjanos].

I am afraid that the changes proposed in the PR make the code more difficult to 
follow and there is a risk of breaking something. 

Personally, I would leave things as they are i.e., use 
{{metastore.event.db.listener.clean.startup.wait.interval}} for turning off the 
cleaner. A way to make things more formal would be to update the description of 
{{metastore.event.db.listener.clean.startup.wait.interval}} property to 
indicate that setting it to a big value somewhat disables the cleaner.

Still, this is a personal (and subjective) preference, so if others prefer 
another of the options that [~kovjanos] outlined previously, I am not going to 
stand in the way.



[jira] [Commented] (HIVE-26145) Disable notification cleaner if interval is zero

2022-05-06 Thread Janos Kovacs (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17532750#comment-17532750
 ] 

Janos Kovacs commented on HIVE-26145:

[~zabetak] I agree that bumping the interval up is a workaround (that is what 
we are using now as well). The proposed change is just a more elegant way to 
control it. The logic is copied from the TASK_THREADS_ALWAYS components, 
although those mostly have a .frequency property; 
metastore.acidmetrics.check.interval is the only one that has .interval and 
uses the same 'disable if equal to zero' convention. 
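
For illustration only (this is not the code from the PR): a minimal sketch of the 'disable if equal to zero' convention. The class and method names are hypothetical, and rejecting negative values is an assumption made for the sketch.

```java
import java.util.Optional;

// Hypothetical helper, not taken from the Hive code base.
final class CleanerStartup {
    private CleanerStartup() {}

    /**
     * Interprets the configured interval using the "zero means disabled" convention.
     * Returns the interval to use, or empty when the cleaner should not be started.
     */
    static Optional<Long> effectiveIntervalMillis(long configuredMillis) {
        if (configuredMillis < 0) {
            // Assumption for this sketch: negative values are a configuration error.
            throw new IllegalArgumentException("interval must not be negative: " + configuredMillis);
        }
        return configuredMillis == 0 ? Optional.empty() : Optional.of(configuredMillis);
    }

    public static void main(String[] args) {
        // Zero: the caller would skip starting the cleaner thread.
        System.out.println("start with 0 ms: " + effectiveIntervalMillis(0).isPresent());
        // Any positive value: the caller would start the cleaner with that interval.
        System.out.println("start with 7200000 ms: " + effectiveIntervalMillis(7_200_000L).isPresent());
    }
}
```

The caller only starts the task when the returned value is present, so a housekeeping instance keeps a positive interval and every other instance sets zero.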
 
I see the following options going forward:
- keep the code as-is and recommend using a large interval value to 'turn off' the task
- keep using metastore.event.db.listener.clean.interval and extend its 
description in MetastoreConf.java to state that a zero value turns the task off
- introduce a new configuration property explicitly intended to enable or disable 
the task, like metastore.event.db.listener.clean.enabled, but that would not be 
consistent with the other housekeeping tasks
- rename the property to .frequency to avoid the misunderstanding, but that 
would not be backward compatible.

Which one should we go forward with?



[jira] [Commented] (HIVE-26145) Disable notification cleaner if interval is zero

2022-04-25 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17527445#comment-17527445
 ] 

Stamatis Zampetakis commented on HIVE-26145:


Looking at the available configuration properties and their description:
* metastore.event.db.listener.clean.interval
* metastore.event.db.listener.clean.startup.wait.interval
it's not very intuitive which one should be used to disable the cleaner and how.

Setting the {{clean.interval}} to zero could mean that you want the cleaner to 
run ASAP without sleeping at all, rather than deactivating it completely. On the 
other hand, if you set the {{startup.wait.interval}} to 100*365 you can be sure 
that the cleaner will not run in the next 100 years, which I think is as good 
as being disabled.
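
To make the two readings concrete, here is a rough sketch built on the JDK's ScheduledExecutorService. It does not reproduce Hive's cleaner thread: the class name, the no-op task and the millisecond values are assumptions for illustration, and 100*365 is read as days. With a very large startup wait the task never fires while the process is observed, and the JDK scheduler itself rejects a zero interval instead of treating it as "run continuously".

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; it does not reproduce Hive's cleaner thread.
final class StartupWaitDemo {
    private StartupWaitDemo() {}

    /**
     * Schedules a no-op "cleaner" with the given startup wait and interval and
     * returns how many times it ran while it was observed for observeMillis.
     */
    static int runsObserved(long startupWaitMillis, long intervalMillis, long observeMillis) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        try {
            // scheduleWithFixedDelay throws IllegalArgumentException when the delay is <= 0.
            pool.scheduleWithFixedDelay(
                runs::incrementAndGet, startupWaitMillis, intervalMillis, TimeUnit.MILLISECONDS);
            Thread.sleep(observeMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return runs.get();
    }

    public static void main(String[] args) {
        // Assumption: the startup wait of 100*365 is expressed in days.
        long hundredYears = TimeUnit.DAYS.toMillis(100L * 365);
        System.out.println("runs with a 100-year startup wait: " + runsObserved(hundredYears, 1_000, 200));
        System.out.println("ran without a startup wait: " + (runsObserved(0, 20, 200) > 0));
    }
}
```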



[jira] [Commented] (HIVE-26145) Disable notification cleaner if interval is zero

2022-04-16 Thread Janos Kovacs (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17523138#comment-17523138
 ] 

Janos Kovacs commented on HIVE-26145:

The critical one was only on TXN_WRITE_NOTIFICATION_LOG; it seems 
NOTIFICATION_LOG has a proper index.
{noformat}
MariaDB [(none)]> select distinct blocking_trx_id from (
    SELECT r.trx_id waiting_trx_id, r.trx_mysql_thread_id waiting_thread,
           r.trx_query waiting_query, b.trx_id blocking_trx_id,
           b.trx_mysql_thread_id blocking_thread, b.trx_query blocking_query
    FROM information_schema.innodb_lock_waits w
    INNER JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id
    INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id
  ) as a where a.blocking_query like '%TXN_WRITE_NOTIFICATION_LOG%';
+-----------------+
| blocking_trx_id |
+-----------------+
| 3086363831      |
| 3086389504      |
| 3086363966      |
| 3086366531      |
+-----------------+
{noformat}
Increasing the startup delay, for example to days or weeks, is a workaround 
rather than a proper solution for disabling the task on certain instances.
As part of the workaround we also increased the TTL on the other 4 instances: 
if the startup delay turns out not to be long enough, those instances find no 
records to clean up and therefore do not execute DELETE statements.

A no-run-on-zero-frequency setting would be a more proper way to turn off the 
cleaners. 
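
A small sketch of why the longer TTL keeps the other instances from deleting anything, assuming the cleaner removes rows whose event time is older than "now minus TTL". The helper name, the comparison direction and the TTL values (24 hours versus 7 days) are assumptions for illustration, not Hive's actual query or our settings.

```java
import java.util.List;

// Hypothetical helper; the comparison direction is an assumption, not Hive's query.
final class TtlCutoff {
    private TtlCutoff() {}

    /** Counts events whose timestamp is older than now minus the TTL (all values in seconds). */
    static long countExpired(List<Long> eventTimesSec, long nowSec, long ttlSec) {
        long cutoff = nowSec - ttlSec;
        return eventTimesSec.stream().filter(t -> t < cutoff).count();
    }

    public static void main(String[] args) {
        long now = 1_000_000L;
        long hour = 3_600L;
        // Three events that are 1, 30 and 50 hours old.
        List<Long> events = List.of(now - hour, now - 30 * hour, now - 50 * hour);
        // Housekeeping instance with the shorter TTL sees rows to delete.
        System.out.println("expired with 24h TTL: " + countExpired(events, now, 24 * hour));
        // Instances with the longer TTL see nothing to delete.
        System.out.println("expired with 7d TTL: " + countExpired(events, now, 7 * 24 * hour));
    }
}
```

As long as the housekeeping instance with the shorter TTL keeps up, the rows are gone long before they reach the longer cutoff used by the other instances.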

 



[jira] [Commented] (HIVE-26145) Disable notification cleaner if interval is zero

2022-04-16 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17523119#comment-17523119
 ] 

Stamatis Zampetakis commented on HIVE-26145:


Which lock is the one to blame here?

What happens if you set the 
{{metastore.event.db.listener.clean.startup.wait.interval}} to a very big 
value? Wouldn't this be somewhat equivalent to disabling the cleaner?
