Hello there,

I'd like to reopen a previously discussed topic: disabling the WAL during RocksDB recovery.
The WAL is clearly not needed during recovery: it is never read in that process, so there is no need to write it. AFAIK the last change touching the WAL during recovery was an attempt to remove it, which was later reverted (https://issues.apache.org/jira/browse/FLINK-8922). If I interpret the comments in the ticket correctly, the outcome was that a) the WAL was kept during recovery, and b) it's unknown why removing the WAL causes a segfault. The ticket also shows that keeping the WAL carries a significant performance penalty, so getting rid of it would be a very nice performance improvement.

Would it be worth creating a new JIRA ticket, at least as a reminder that the WAL should be removed?

I'm planning to add an experimental flag that disables the WAL in the environment where I run Flink, and to try it out. If the flag is configurable, the WAL can always be re-enabled if removing it causes issues.

Thoughts?

Regards,
Juha