https://bugs.kde.org/show_bug.cgi?id=356357
Kai Krakow <k...@kaishome.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |k...@kaishome.de

--- Comment #14 from Kai Krakow <k...@kaishome.de> ---
I added some patches before finding this bug. My findings are that disabling read-ahead on the database helps somewhat in low-memory situations, but the biggest problem is fsync: that call will actually sync the whole filesystem and not just the database file, and doing that constantly is toxic to performance. It's as simple as that.

Here are the links: https://bugs.kde.org/show_bug.cgi?id=404057 and https://github.com/kakra/baloo/commits/fixes/bko-404057. Some of these patches may not be needed at all, and some optimize for corner cases. But we should really turn off fsync at the very least.

If you don't want to disable fsync, then LMDB is probably the wrong tool for the job. You'd then need some append-only database with garbage collection (LMDB already acts a lot like this). I'm pretty sure LMDB is actually a bad choice for baloo, if, and only if, you expect it to be the only software needing to do IO. But after some research, I think LMDB is not the wrong tool, so we need to adjust how Baloo uses it.

The LMDB developers say that it is safe to use without fsync on any current Linux filesystem (it can lose transactions but it won't corrupt the database). It is not safe to use on some hypothetical filesystems (there it could corrupt). Can we please at least let users decide and allow them to shoot themselves in the foot? Maybe with a config option or an environment variable?

Baloo already has some sort of recovery: if it fails to open the database, it simply purges and recreates it. Maybe it could somehow detect corruption during use and act similarly? I'm not sure whether LMDB functions return errors in that case or simply crash. In the first case, it should be easy.
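A minimal sketch of what such an opt-in could look like, assuming a hypothetical environment variable named `BALOO_DB_NOSYNC` (this is not an existing Baloo setting, and the function name is illustrative too). It only decides whether the user asked for syncing to be turned off; the default stays durable.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical variable name, chosen for illustration only.
static const char *const kNoSyncEnvVar = "BALOO_DB_NOSYNC";

// Returns true only when the user explicitly opted out of syncing.
// Unset, empty, or "0" keeps the default (syncing) behaviour.
bool userDisabledSync()
{
    const char *value = std::getenv(kNoSyncEnvVar);
    if (value == nullptr || *value == '\0') {
        return false;
    }
    return std::strcmp(value, "0") != 0;
}
```

The result would presumably be used where the LMDB environment is opened, by adding `MDB_NOSYNC` to the flags passed to `mdb_env_open()` when the function returns true.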
I also like the time-based approach much more than the count-based one: Linux already flushes data after no more than 30 seconds, so why not just use the same interval?

Regarding fsync: I'm not sure whether LMDB uses fsync or fdatasync, or whether this is even a choice. The developers say in their documentation that it's fsync; the strace by Riku shows fdatasync. Whichever is used, it's a problem: you cannot expect users to use the software if it totally destroys their user experience.

Baloo should be designed around the idea that corruption can occur, and luckily it's easy to recover from: just rebuild the database. So the proposed solution really comes down to one question: how do we properly detect database corruption?
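The time-based idea could be sketched roughly like this. The class name and its interface are made up for illustration and are not taken from the Baloo sources; the clock is passed in so the policy can be exercised without actually waiting.

```cpp
#include <chrono>

// Decides when pending writes should be committed, based on elapsed
// time rather than on the number of queued documents.
class TimedCommitPolicy
{
public:
    using Clock = std::chrono::steady_clock;

    explicit TimedCommitPolicy(std::chrono::seconds interval,
                               Clock::time_point start = Clock::now())
        : m_interval(interval), m_lastCommit(start) {}

    // True once at least one full interval has passed since the last commit.
    bool shouldCommit(Clock::time_point now = Clock::now()) const
    {
        return now - m_lastCommit >= m_interval;
    }

    // Call after a successful commit to restart the interval.
    void markCommitted(Clock::time_point now = Clock::now())
    {
        m_lastCommit = now;
    }

private:
    std::chrono::seconds m_interval;
    Clock::time_point m_lastCommit;
};
```

With an interval of `std::chrono::seconds(30)`, the indexer would check `shouldCommit()` after each processed document and commit the open transaction only when it returns true.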