On 2024-02-23, at 10:26:49, Stefan Monnier <monn...@iro.umontreal.ca> wrote:
> > [13:37:48]cthulhu:/var/log/journal☠ journalctl |dd of=/dev/zero bs=1M
> > 0+15442 records in
> > 0+15442 records out
> > 63643115 bytes (64 MB, 61 MiB) copied, 5,47791 s, 11,6 MB/s
> > du -h /var/log/journal/
> > 337M /var/log/journal/44cf6f547971fc33309d1e9e000002e7
> > 337M /var/log/journal/
> >
> > (I raised a bug about it 8 years ago:
> > https://github.com/systemd/systemd/issues/2460 )
>
> Oh, that bug report is quite interesting, thanks.
> Makes one wonder why they don't use naive append-only "plain text"
> logs (tho with appropriate delimiters (maybe some kind of CSV) to make
> searches more reliable than with old-style plain text logs)?
>
> What are the advantages of journald's representation?
> I mean, to justify the slow search and large disk space usage, there
> is presumably some upside for some use cases.

The advantage of the binary format is that you have separate fields for program, level, etc., so it is easier to filter by exactly what you want. "systemctl status servicename" just asks the log subsystem for exactly what it needs.

But journald logs aren't built for querying speed; there is, for example, no index (in memory or on disk) that tells you where the last log entries from an app are, so you get pathological cases like the one in the ticket.

The problem is that you mostly do two types of querying on logs:

* get me all/most of what happened in the last x lines/hours
* get me everything recent (or just everything, period) that this particular service logged

which kinda rules out "simple" tricks like "just hash over the service name so at worst you have to search only 1/64th of the files", as they would make the "get me everything" case slower. But maybe not by all that much.

The size issue... it's an appendable binary format, which means any metadata systemd adds can't be deduplicated by just having a join against a table with the service info; it needs to be repeated on every single line, and I'm guessing that's where the bloat comes from.
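To make that deduplication point concrete, here's a rough sketch (my own illustration with made-up table and column names, not journald's actual schema) of how a relational store keeps per-service metadata in one row and joins it back at query time, instead of repeating it on every log line:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- service metadata stored once, not repeated per log line
    CREATE TABLE service (
        id   INTEGER PRIMARY KEY,
        name TEXT,
        unit TEXT,
        uid  INTEGER
    );
    -- each log line carries only a small foreign key
    CREATE TABLE log (
        ts         INTEGER,
        service_id INTEGER REFERENCES service(id),
        level      INTEGER,
        message    TEXT
    );
""")
con.execute("INSERT INTO service VALUES (1, 'nginx', 'nginx.service', 33)")
con.executemany(
    "INSERT INTO log VALUES (?, 1, 6, ?)",
    [(1708678609 + i, f"GET / 200 request {i}") for i in range(3)],
)

# the "get me everything this service logged" query is a plain join;
# the repeated metadata (unit name, uid, ...) costs one row, not one copy per entry
rows = con.execute("""
    SELECT log.ts, service.unit, log.message
    FROM log JOIN service ON log.service_id = service.id
    WHERE service.name = 'nginx'
    ORDER BY log.ts
""").fetchall()
for ts, unit, msg in rows:
    print(ts, unit, msg)
```

An append-only file format can't do this kind of normalization cheaply, since every record has to be self-describing for the file to stay recoverable by simple appends.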
I rambled about SQLite as it is kinda the closest comparison for an embedded binary format that can be searched, and I think being able to run SQL queries (SQLite now even has full-text search) against your local logs with no database daemon running would be a killer feature on some of the smaller devices.

> argument against Sqlite based on the size of Sqlite,

I'd argue that's not really technically relevant, as in most cases you'd have some other app in the system using SQLite for *something* and would have the library installed anyway. And the space savings get wasted just on how verbose the journald format is.

> but what are the
> advantages of journald's representation compared to a naive one?

In short: queryability without text parsing. That's about it. Which *is* a great feature, do not get me wrong, I use it all the time, but the slowdown caused by its unfortunate implementation details also shows up pretty often when I use it, as the most common use case is "something broke, log in on a server that has been running on automation for the last 3 months and see what is happening".

It does have some features around FSS (signing logs), so for folks that need that there aren't many alternatives, as it would be hard to build a similar feature on a database.

But on the flip side, the inability to separate the log into log rotation groups is *nasty*. You basically can't have a verbose service in the system, as that will rotate away the old logs of every other service. And looking for those logs is also the worst case:

[17:57:56]cthulhu:~☠ echo 3 > /proc/sys/vm/drop_caches
[17:58:14]cthulhu:~☠ time systemctl status nginx |grep Notice
Notice: journal has been rotated since unit was started, output may be incomplete.
real	0m1,645s
user	0m0,000s
sys	0m0,094s

[17:58:19]cthulhu:~☠ echo 3 > /proc/sys/vm/drop_caches
[17:58:28]cthulhu:~☠ time sqlite /tmp/log_indexed.sqlite 'select * from log where source = "nginx";' >/dev/null

real	0m0,374s
user	0m0,000s
sys	0m0,009s

I guess another upside of an appendable log is that in case of a crash it's easier to recover than an embedded database...

...or so I would say if I hadn't seen journald throw away binary logs with errors after a crash.

> Stefan
>
> PS: FWIW, the only time I thought about the design of a log format,
> I was thinking along the lines of reducing redundancy, i.e. keeping
> a kind of "suffix tree" of past messages so as to replace the many
> repeated chunks of text with references to past log entries.
> Then I realized that I could get the same result much more simply by
> gzipping the files, which would naturally be done as part of
> log rotation 🙂

The one-service-per-file approach is honestly good enough for most stuff. A PITA to get chronological order though; every approach really has some drawbacks and benefits.

But honestly? Send it to a central logger and do your worrying there if you have the chance, especially since with some parsing there, there are some amazing productivity gains to be had. It took us a bunch of work to get every firewall log parsed into the right columns, but now answering questions like "at which firewall is this traffic blocked?" is a breeze.

Of course, that's not really an option for smaller stuff.

-- 
Mariusz Gronczewski (XANi) <xani...@gmail.com>
GnuPG: 0xEA8ACE64
https://devrandom.eu